Muckrack Analysis of 15M AI Citations Finds 25% Trace to Journalism

PR software company Muckrack analyzed 15 million citations across ChatGPT, Claude, Gemini, and Perplexity and found that roughly one in four cited sources originates from journalism.
Reuters ranks first globally among cited publications, followed by Forbes; in the UK, The Guardian leads, with specialist title Homes and Gardens ranking second.
Former Business Insider chief Henry Blodget is the most cited individual journalist worldwide across the four AI systems studied.
Muckrack used the findings to launch an “AI visibility” rating feature that tracks how frequently journalists and outlets appear in AI-generated responses, classified across three tiers.

What Happened

PR software company Muckrack released findings from an analysis of 15 million citations generated by four major AI chatbot systems (ChatGPT, Claude, Gemini, and Perplexity), showing that approximately one in four cited sources traces back to journalism. The findings were first reported by Press Gazette and subsequently covered by The Decoder. Muckrack sent millions of queries to all four services and tracked how often specific journalists and outlets appeared as linked sources in the generated responses. The company used the findings to launch a new platform feature it calls “AI visibility,” which rates journalists and publications on how frequently they are cited across the four AI systems.

Why It Matters

Spanning 15 million citations across four widely used AI platforms, the dataset is one of the more systematic public attempts to quantify journalism’s role in AI-generated citations. It arrives as news publishers continue to negotiate or litigate over the use of their content in AI outputs: The New York Times filed a copyright lawsuit against OpenAI and Microsoft in December 2023, and licensing negotiations between major publishers and AI developers have accelerated since. A parallel Muckrack analysis of Google’s AI Overviews found that Facebook and Reddit rank among the most cited sources in AI-generated search summaries, complicating the argument that journalism holds unique standing in these citation ecosystems.

Technical Details

Muckrack’s methodology involved issuing millions of structured queries to ChatGPT, Claude, Gemini, and Perplexity, then logging the publications and individual journalists that appeared as cited sources in the responses. Globally, Reuters ranked first among publications, followed in order by Forbes, The Guardian, the Financial Times, and CNBC. Among individual journalists, Henry Blodget, former chief of Business Insider, was the most cited worldwide across all four systems. In the UK market, specialist lifestyle title Homes and Gardens ranked second, behind only The Guardian, a result suggesting that domain-specific outlets can achieve disproportionate citation frequency in subject-area queries where general news publications are less dominant. Muckrack sorted all results into three tiers of AI visibility so that journalists and PR professionals can benchmark citation performance.
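
Muckrack has not published its query design or classification rules, so the logging-and-counting step can only be illustrated under assumptions. The sketch below is a hypothetical reconstruction, not the company's method: it tallies cited domains across a handful of invented responses and computes the share that falls in an assumed list of journalism domains. The names `JOURNALISM_DOMAINS`, `tally_citations`, and `journalism_share`, along with the sample URLs, are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: a hand-picked set of domains treated as journalism.
# The real classification used in the study is not public.
JOURNALISM_DOMAINS = {"reuters.com", "forbes.com", "theguardian.com", "ft.com", "cnbc.com"}


def domain_of(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def tally_citations(responses):
    """Count cited domains; `responses` holds one list of cited URLs per AI answer."""
    counts = Counter()
    for cited_urls in responses:
        for url in cited_urls:
            counts[domain_of(url)] += 1
    return counts


def journalism_share(counts) -> float:
    """Fraction of all citations that point at an assumed journalism domain."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    journalism = sum(n for d, n in counts.items() if d in JOURNALISM_DOMAINS)
    return journalism / total


# Invented sample data: two responses, four citations in total.
sample_responses = [
    ["https://www.reuters.com/a", "https://example.org/x"],
    ["https://reddit.com/r/y", "https://en.wikipedia.org/wiki/Z"],
]
sample_counts = tally_citations(sample_responses)
sample_share = journalism_share(sample_counts)
```

In this invented sample, one of the four citations points at an assumed journalism domain. Any real replication would hinge on how outlets are classified and how queries are sampled, which are the details the company has not released.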

Who’s Affected

News publishers and trade outlets have a direct material interest in these findings, as AI citation patterns affect referral traffic volumes, brand visibility, and the leverage they can claim in licensing negotiations with AI developers. PR professionals—Muckrack’s primary customer base—can now use the AI visibility metric to assess and potentially improve the AI citation footprint of the media personalities and brands they represent. AI developers operating these systems face continued scrutiny over how citations are selected and surfaced, and whether the citation frequency of journalism in responses creates any copyright or compensation obligations distinct from those raised by training data use.

What’s Next

Muckrack has integrated the citation data into its live platform as the AI visibility feature, with journalist and publication rankings now accessible to subscribers. No public methodology paper or raw data release has been announced by the company, limiting independent verification of the query design and sampling approach. The findings are likely to surface in ongoing licensing discussions between news organizations and AI developers as both sides attempt to quantify the scale of journalism’s contribution to AI-generated outputs.
