When someone asks ChatGPT or Perplexity a question, the model picks a handful of sources, summarizes them, and shows the user a small number of links. If you’re cited, you exist. If you’re not, the user never sees your page.
The instinct is to treat “AI search” as one channel and optimize for it the way we used to optimize for Google. The data says the opposite. The four major chatbots use four different retrieval backends, surface four different sets of sources, and cite four wildly different volumes of links per answer. Optimizing for one isn’t optimizing for all of them.
This post pulls together the largest public studies of AI citation behavior (30M+ citations from Profound, 78.6M queries from Ahrefs, 200,000 AI Overviews from Semrush, and the Princeton GEO benchmark) into one picture: who cites what, how much overlap there really is, and which on-page edits actually move the needle.
| Engine | Backend | Avg citations / answer | Top source domain | Google top-10 URL overlap |
|---|---|---|---|---|
| ChatGPT | Bing index | 7.92 | Wikipedia (16.3%) | ~6.5% |
| Claude | Brave Search | 5.67 | NYT / Atlantic / NPR cluster | n/a (different backend) |
| Gemini / AIO | Google index (query fan-out) | ~13.3 | Wikipedia (11.2%) | 38% (down from 76%) |
| Perplexity | Sonar (own pipeline) | 21.87 | Reddit (6.6%) | 82% |
Backend, citation volume, top source, and Google overlap for the four major chatbots.
ChatGPT’s search is wired into Bing. OpenAI’s own crawler documentation lists OAI-SearchBot as the bot that surfaces sites in ChatGPT search, and Seer Interactive’s controlled study found that 87% of SearchGPT citations match Bing’s top results while only 56% match Google’s. If you’re not in Bing’s index, ChatGPT can’t cite you. And because OpenAI’s crawlers don’t execute JavaScript, anything rendered client-side is invisible.
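Both conditions in the paragraph above (crawler access and content that exists without JavaScript) can be spot-checked offline. The following is a minimal sketch, not OpenAI's own tooling: the robots.txt content, the HTML snippet, and the function names `bot_allowed` and `visible_without_js` are hypothetical, and only the `OAI-SearchBot` user agent comes from the text. It uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def bot_allowed(robots_txt: str, url: str, user_agent: str = "OAI-SearchBot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def visible_without_js(raw_html: str, key_phrases: list[str]) -> dict[str, bool]:
    """Report which phrases appear in the server-sent HTML, before any script runs."""
    lowered = raw_html.lower()
    return {phrase: phrase.lower() in lowered for phrase in key_phrases}


# Hypothetical robots.txt: the search bot is blocked from /private/ only.
ROBOTS = """\
User-agent: OAI-SearchBot
Disallow: /private/

User-agent: *
Allow: /
"""

# Hypothetical server response: the pricing copy is injected client-side,
# so it is absent from the raw HTML a non-rendering crawler receives.
RAW_HTML = "<html><body><h1>Acme CRM</h1><div id='pricing-root'></div></body></html>"

print(bot_allowed(ROBOTS, "https://example.com/blog/post"))
print(bot_allowed(ROBOTS, "https://example.com/private/draft"))
print(visible_without_js(RAW_HTML, ["Acme CRM", "per seat"]))
```

In practice you would feed these functions your live robots.txt and the HTML returned by a plain HTTP fetch (no headless browser). A phrase that shows up in the browser but not in that raw response is a candidate for server-side rendering.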
The source mix is brutally concentrated. Ahrefs analyzed ChatGPT’s top 1,000 cited pages and found that 67% live in categories most marketers can’t reach, including Wikipedia (29.7%), brand homepages (23.8%), and app store listings (6.6%). Wikipedia alone is roughly 16% of ChatGPT citations across Ahrefs’ 78.6M-query study.
The most actionable finding from SE Ranking’s 216,000-page analysis: the number of referring domains is the single strongest predictor of ChatGPT citations. Sites with 350,000+ referring domains averaged 8.4 citations; sites under 2,500 averaged 1.6. About 28% of ChatGPT-cited pages had zero organic search visibility, so brand presence and link graphs can earn citations even without rankings.
One more pattern worth flagging: ChatGPT cites competitors much more aggressively than Google does. Zenith’s study of 80 high-intent questions found ChatGPT references competitor sites 28.8% of the time vs. Google’s 17.7% — an 11-point swing. If you’re a category leader, that’s great. If you’re a challenger, that’s an opportunity.
Anthropic doesn’t publicly name its search partner, but the evidence is direct: Brave Search was added to Anthropic’s subprocessor list in March 2025, and engineers spotted a BraveSearchParams parameter inside Claude’s web_search tool. Profound’s narrow 3-query test found 86.7% citation overlap with Brave’s top results: too small a sample to be more than directional, but consistent with the partnership.
Claude cites the fewest sources per answer (5.67) and biases toward premium, slow-moving editorial brands. The 5W AI Platform Citation Source Index calls out The New York Times, The Atlantic, The New Yorker, and The Economist as Claude’s anchor sources. Muck Rack’s analysis found Claude cites NPR, Yahoo Finance, Variety, and CNN at ≥ 2x the rate of competing platforms, and its top-100 outlets average roughly half the audience size of ChatGPT’s top-100 — meaning more niche / trade publications.
Claude also has the longest freshness window of any chatbot. Per Muck Rack, Claude is roughly 3x more likely than ChatGPT to cite content that’s 2–4 weeks old, and only 36% of its journalism citations come from the last 12 months (vs. ChatGPT’s 56%). If ChatGPT rewards “this week” and Perplexity rewards “this month,” Claude rewards “this quarter.”
Google grounds Gemini against its own search index. The mechanism is “query fan-out” (Google docs): the user’s query is rewritten into multiple sub-queries, each is run through Google Search, and the citations are pulled across all sub-results. That’s the structural reason cited URLs increasingly don’t match the original query’s top 10.
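A toy model makes the decoupling easier to see. The sketch below is an illustration of the fan-out idea as described above, not Google's implementation, which is not public. The index, the rewrite step, and every URL are hypothetical.

```python
from typing import Callable


def fan_out_citations(
    query: str,
    rewrite: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    top_k: int = 10,
) -> list[str]:
    """Union the top_k results of every sub-query, keeping first-seen order."""
    cited: list[str] = []
    for sub_query in rewrite(query):
        for url in search(sub_query)[:top_k]:
            if url not in cited:
                cited.append(url)
    return cited


def share_in_top(cited: list[str], top_results: list[str]) -> float:
    """Fraction of cited URLs that also rank for the original query."""
    if not cited:
        return 0.0
    top = set(top_results)
    return sum(url in top for url in cited) / len(cited)


# Hypothetical search index: query -> ranked URLs.
TOY_INDEX = {
    "best crm": ["a.example/crm", "b.example/crm"],
    "crm pricing comparison": ["c.example/pricing", "a.example/crm"],
    "crm for small business": ["d.example/smb", "e.example/guide"],
}


def toy_search(q: str) -> list[str]:
    return TOY_INDEX.get(q, [])


def toy_rewrite(q: str) -> list[str]:
    # Stand-in for the model's query rewriting step.
    return ["crm pricing comparison", "crm for small business"]


cited = fan_out_citations("best crm", toy_rewrite, toy_search)
print(cited)
print(share_in_top(cited, toy_search("best crm")))
```

In this toy run only one of the four cited URLs ranks for the original query, because three of them were retrieved for sub-queries the user never typed.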
How dramatic? Ahrefs’ March 2026 study of 863,000 SERPs found that only 38% of AI Overview citations come from the top 10 organic results, down from 76% less than a year earlier. BrightEdge pegged the number even lower at 16.7%. About 37% of cited URLs don’t appear in the top 100 for the original query at all.
The top sources are dominated by Google-owned properties and aggregators. The Digital Bloom’s analysis of 36 million AI Overviews put Wikipedia at 11.2%, YouTube at 9.5%, blog.google at 6.0%, Reddit at 5.8%, and google.com at 5.6%. The study puts Google’s own properties at 22.8% of AIO citations overall. Reddit citations in AIO grew roughly 450% between March and June 2025 after Google’s data licensing deal with Reddit.
Gemini chat is its own animal. Yext’s 6.8M-citation study found that 52.15% of Gemini chat citations come from brand-owned websites — the highest of any major engine. And the two Google surfaces don’t even agree with each other: AI Overviews and AI Mode share only ~13.7% of citations on the same queries, and AIO content shifts ~70% of the time when the same query is repeated.
Perplexity’s Sonar pipeline does live retrieval on every query, and Sonar Pro runs multiple sub-searches per question. The result: 21.87 citations per response on average (Profound), nearly 3x ChatGPT’s 7.92 and nearly 4x Claude’s 5.67. Perplexity is also the engine closest to Google: Semrush measured 91% domain overlap and 82% URL overlap with Google’s top 10. If your URL ranks in Google’s top 10, there’s a good chance Perplexity is already citing it.
Reddit dominates the top of the source list (6.6% of all citations, 46.7% of the top-10 share), with YouTube, Gartner, Yelp, LinkedIn, and Forbes filling out the next tier. For news, BBC leads at 3.2% followed by Yahoo and 163.com, per the arXiv news citation study of 366,000 citations — and Perplexity cites 1,430 unique news domains vs. Google’s 881 and OpenAI’s 707. It’s the broadest news footprint of any AI engine.
Industry skew matters more on Perplexity than anywhere else: BrightEdge found Perplexity-Google overlap ranges from 82% in healthcare to 27% in restaurants. If you operate in a regulated category (health, finance, legal), Perplexity behaves a lot like a stricter, more authority-weighted Google. In ecommerce or hospitality, it leans hard on UGC.

Source: Profound, 30M+ citations across the four engines, Aug 2024 – Jun 2025.
A Perplexity answer pulls from roughly 2.8x as many sources as a ChatGPT answer and almost 4x as many as a Claude answer. From a content strategy perspective, that means Perplexity gives you many more shots on goal per query, while Claude is a winner-take-most game.
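The multiples follow directly from the per-answer averages quoted in the comparison table. A quick check, using only those four figures (the Gemini / AIO value is approximate in the source):

```python
# Average citations per answer, as quoted in the comparison table.
AVG_CITATIONS = {
    "ChatGPT": 7.92,
    "Claude": 5.67,
    "Gemini / AIO": 13.3,  # approximate in the source (~13.3)
    "Perplexity": 21.87,
}


def ratio(engine: str, baseline: str) -> float:
    """How many times more citations per answer engine shows than baseline."""
    return AVG_CITATIONS[engine] / AVG_CITATIONS[baseline]


for baseline in ("ChatGPT", "Claude", "Gemini / AIO"):
    print(f"Perplexity vs {baseline}: {ratio('Perplexity', baseline):.2f}x")
```

Perplexity comes out at about 2.76x ChatGPT and about 3.86x Claude.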

Top cited domains by engine. Sources: Ahrefs (78.6M queries, ChatGPT), Ahrefs/The Digital Bloom (36M AIOs), Profound (30M+ citations).
Wikipedia is the only domain that lands top-2 on both ChatGPT and AI Overviews. Reddit lands top-2 on AI Overviews and is the single largest source on Perplexity. YouTube is structurally important on AI Overviews and Perplexity, mostly absent on ChatGPT.
On branded queries the picture shifts. Omniscient Digital’s study of 23,387 citations across five major AI engines found that 57% of branded citations come from reviews and social proof (review sites, listicles, press), 17% from directories, and only ~4.5% from the brand’s own About / FAQ / Home pages. Your homepage isn’t where you win branded search inside an LLM; third-party validation is.

URL overlap with Google’s top 10 organic. Sources: Semrush (5,000 keywords / 150k citations), Ahrefs (863K SERPs, Mar 2026).
Perplexity tracks Google tightly; ChatGPT barely tracks Google at all. AI Overviews used to look a lot like Google’s top 10 — the same Ahrefs methodology measured 76% URL overlap in mid-2025 — but the query fan-out behavior has been steadily decoupling it. The practical implication: if you’re only measuring SERP rank, you’re measuring one out of four AI surfaces well, two of them poorly, and ChatGPT not at all.
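If you want to reproduce this kind of overlap figure for your own keywords, the arithmetic is simple once you have a list of cited URLs and a list of top-10 organic URLs. The sketch below is one plausible way to compute it, not the methodology of Semrush or Ahrefs. The normalization rules (lowercasing, dropping `www.` and trailing slashes) are assumptions, and all URLs are made up.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Lowercased host without a leading 'www.'."""
    return urlsplit(url.lower()).netloc.removeprefix("www.")


def normalize(url: str) -> str:
    """Crude URL key: host plus path, ignoring scheme, query, and trailing slash.

    Lowercasing the path is a simplification; some sites have case-sensitive paths.
    """
    return domain_of(url) + urlsplit(url.lower()).path.rstrip("/")


def url_overlap(cited: list[str], top10: list[str]) -> float:
    """Share of cited URLs that are also in the organic top 10."""
    top = {normalize(u) for u in top10}
    return sum(normalize(u) in top for u in cited) / len(cited)


def domain_overlap(cited: list[str], top10: list[str]) -> float:
    """Share of cited URLs whose domain appears anywhere in the organic top 10."""
    top = {domain_of(u) for u in top10}
    return sum(domain_of(u) in top for u in cited) / len(cited)


# Hypothetical citations from one AI answer and a (shortened) organic result list.
CITED = [
    "https://www.example.com/guide/",
    "https://reddit.com/r/seo/comments/1",
    "https://docs.example.org/faq",
    "https://example.com/pricing",
]
TOP10 = [
    "https://example.com/guide",
    "https://example.com/blog",
    "https://reddit.com/r/seo/comments/2",
]

print(url_overlap(CITED, TOP10))
print(domain_overlap(CITED, TOP10))
```

Domain overlap is always at least as high as URL overlap, which is why the two Semrush figures for Perplexity (91% and 82%) differ.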
The engines disagree on sources. They mostly agree on what kinds of content earn citations. Five patterns show up in every large-sample study we reviewed.

Visibility lift by tactic. Source: Aggarwal et al., GEO paper, KDD 2024. GEO-bench: 10,000 queries across 25 domains.
The Princeton-led Generative Engine Optimization paper ran nine on-page edits across a 10,000-query benchmark. The biggest wins: adding direct quotations (+41% visibility), adding original statistics (+37%), and citing sources inline (+30%). Keyword stuffing produced essentially zero lift. Those three edits — quotes, stats, citations — are also the easiest to apply to existing content.
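A rough way to triage existing pages for those three edits is to count how many quotations, statistics, and inline citations they already contain. The sketch below is a heuristic audit with hand-rolled regular expressions, not the GEO paper's evaluation method. The patterns and the sample text are assumptions and will miss or over-count edge cases.

```python
import re

# Heuristic patterns (assumptions, tune for your own content):
QUOTE = re.compile(r'["“][^"”]{20,}["”]')          # quoted span of 20+ characters
STAT = re.compile(r'\d[\d,.]*\s?(?:%|percent\b)')  # a number followed by % / percent
CITATION = re.compile(
    r'\([^()]*\b(?:19|20)\d{2}\b[^()]*\)'  # parenthetical containing a year
    r'|\[\d+\]'                            # numeric reference like [1]
)


def geo_signals(text: str) -> dict[str, int]:
    """Count quotation, statistic, and inline-citation candidates in page text."""
    return {
        "quotations": len(QUOTE.findall(text)),
        "statistics": len(STAT.findall(text)),
        "citations": len(CITATION.findall(text)),
    }


# Made-up sample paragraph for illustration only.
TEXT = (
    'Our survey of 1,200 buyers found 43% switched vendors last year '
    '(Example Research, 2025). "Price was the deciding factor in most of '
    'those moves," said one analyst. See [1] for the method.'
)

print(geo_signals(TEXT))
```

Pages that score zero on all three counts are the obvious first candidates for adding a quote, an original statistic, or a sourced claim.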
Listicles get cited at roughly a 25% rate vs. 11% for opinion pieces. Pages with tables get cited approximately 2.5x more often than unstructured pages of comparable length. The bias is the same across every engine: chatbots extract claims, and extractable claims live in tables, lists, comparisons, and "Best of" structures.
Note that this is about visible structure, not schema markup. Search Engine Roundtable confirmed both ChatGPT and Perplexity read structured data “as if it was just being read like any other page of text.” Semantic clarity matters; JSON-LD doesn’t move the needle on its own.
Ahrefs analyzed 75,000 brands and found that brand mention frequency correlates with AI Overview presence at roughly 3x the strength of backlinks. Branded anchor text (r=0.527) and branded search volume (r=0.334) are stronger predictors than Domain Rating. For LLMs, being a familiar entity in a corpus is worth more than being a target of links.
For branded queries, third-party validation is the source. The Omniscient study cited above puts brand-owned About/FAQ/Home at 4.5% of branded citations — less than a tenth of what reviews and social proof contribute. Get on G2, Capterra, Trustpilot, PCMag, Wirecutter, and category-specific roundups before you spend weeks rewriting your About page.
AI-cited URLs are on average ~25% newer than Google-organic URLs (Ahrefs, ~17M citations). But the curves differ: ChatGPT and Perplexity reward content updated in the last 30 days; Claude is happy with the last quarter; AI Overviews tolerates the last year. The practical rule: refresh your top revenue pages quarterly, and your top thought-leadership pages whenever something material changes.
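Those windows translate into a simple staleness check per page. The sketch below is a rule of thumb built on the windows above, with "last quarter" taken as 90 days and "last year" as 365 days. Both conversions, and the function name `stale_for`, are assumptions.

```python
from datetime import date

# Freshness windows in days, as read from the paragraph above.
# 90 and 365 are assumed stand-ins for "last quarter" and "last year".
FRESHNESS_WINDOW_DAYS = {
    "ChatGPT": 30,
    "Perplexity": 30,
    "Claude": 90,
    "AI Overviews": 365,
}


def stale_for(last_updated: date, today: date) -> list[str]:
    """Engines whose freshness window the page has already fallen out of."""
    age_days = (today - last_updated).days
    return [
        engine
        for engine, window in FRESHNESS_WINDOW_DAYS.items()
        if age_days > window
    ]


# Hypothetical page last touched 50 days before the audit date.
print(stale_for(date(2026, 1, 10), date(2026, 3, 1)))
```

A page updated 50 days ago is already outside the 30-day window for ChatGPT and Perplexity while still inside the assumed windows for Claude and AI Overviews.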
Treat each chatbot as a separate channel. Measure ChatGPT, Claude, Gemini, and Perplexity independently. A single “AI visibility score” hides the fact that they almost don’t overlap.
Earn third-party mentions before you optimize your own pages. Brand mention frequency is the strongest single predictor across engines. Get into listicles, get analyst coverage, seed honest review threads.
Add original stats and direct quotes to your top pages. This is the highest-ROI on-page edit per the GEO benchmark — and it works on every engine.
Take Wikipedia, Reddit, and YouTube seriously. Wikipedia for category and brand entities. Reddit for product-comparison and recommendation queries. YouTube for how-to and explainer queries. These three are unavoidable.
Refresh quarterly. Top commercial pages, top blog posts, datasheets, comparison pages. ChatGPT and Perplexity will reward you within days.
Stop fixating on the #1 organic ranking. Semrush found fewer than half of AI Overviews include the top organic result. Ranking 2–10 still earns citations. Ranking off page one still earns ChatGPT citations if your brand has presence.
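The first point in this playbook, measuring each chatbot separately, is easy to demonstrate with numbers. The sketch below uses invented results for four prompts per engine. The data and the function name `citation_rates` are hypothetical, and how you collect the cited-or-not flags is up to your own tooling.

```python
from statistics import mean


def citation_rates(results: dict[str, list[bool]]) -> dict[str, float]:
    """Per-engine share of prompts in which the brand was cited."""
    return {engine: sum(hits) / len(hits) for engine, hits in results.items()}


# Invented outcomes: True means the brand was cited for that prompt.
RESULTS = {
    "ChatGPT": [False, False, False, False],
    "Claude": [False, True, False, False],
    "Gemini": [True, True, False, False],
    "Perplexity": [True, True, True, True],
}

rates = citation_rates(RESULTS)
blended = mean(rates.values())

print(rates)
print(blended)
```

The blended score of about 0.44 looks like a middling result, yet it sits on top of a 0% citation rate on one engine and 100% on another, which is exactly the spread a single number hides.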
The hard part of all this is measurement. Citation patterns shift week to week (Reddit’s ChatGPT share collapsed from 60% to 10% in five weeks; Gemini’s citation rate dropped 23 percentage points in two weeks earlier this year). A one-off audit is out of date almost immediately.
QuickSEO runs your prompts across ChatGPT, Claude, Gemini, and Perplexity every week so you can see who’s being cited, who’s being recommended, and how it changes over time. If you just want a snapshot first, the free AI visibility audit checks your brand across all four chatbots in a couple of minutes — no signup, no card.
Citation behavior is volatile. The numbers in this post come from the largest publicly available studies (Profound’s 30M+ citations, Ahrefs’ 78.6M queries, Semrush’s 200,000 AI Overviews, the GEO benchmark’s 10,000 queries) and from studies with smaller but cleanly scoped samples (Seer’s 500-citation SearchGPT analysis, Yext’s 6.8M citations, Omniscient’s 23k citations). Read the trend, not the single number.
Some engine-specific notes worth flagging: the Profound 86.7% Claude-Brave overlap is a 3-query analysis and should be treated as directional. The 5W index aggregates six studies without per-engine breakdowns. Muck Rack’s underlying sample for Claude isn’t disclosed. Where studies disagreed materially (e.g. AIO citation rates), we noted both numbers.