Will the results change if you extend the observation window?

Quite possibly. AI referral traffic is a young channel, sensitive to UI updates, answer-engine logic changes, and shifts in user demographics. A 30-day snapshot reflects 'a pattern observed at this point in time,' not stability over time. We plan to keep observing this with 90- and 180-day data to see whether the same pattern continues.

Are there caveats specific to GA4 measurement of AI referrals?

Yes. AI referral traffic includes (1) cases that send no referrer, (2) cases recorded as direct, and (3) cases where sessionSource resolves to "(not set)". Our dataset contained "(not set)" records, and we cannot rule out that those are AI traffic. There is a gap between 'AI traffic we could attribute' and 'true AI traffic volume,' which should be taken as a baseline caveat.

Do AI Referral Sources Differ in Article Genre?

Tracking 30 days of AI referral traffic in our own GA4 property surfaced a clear separation of article genres by source. This is a fair write-up of what we saw and why it should not be casually generalized.

9 min read · 2026-05-18

"AI-referral landing-page genre" means the type of page users arrive at via AI services (ChatGPT, Gemini, Perplexity, etc.). In 30 days of GA4 data from a separately operated web-developer publication we run, the genres of the landing pages were clearly separated by AI source. That said, this is a snapshot from one site over a short window and cannot be treated as an industry-wide pattern.

Summary

What We Observed

We withhold the concrete identity of each genre label (what A through K stand for) to protect the competitive position of the separately operated observation source. We share only the AI × genre correspondence and its percentages.

1. ChatGPT referrals concentrated in a single genre (Genre A) at approximately 64%, with the runner-up (Genre B) at ~28%. The top two genres account for more than 90% of attributed traffic.
2. Gemini referrals concentrated in a different single genre (Genre E) at ~58%, with the remainder distributed across several genres (A, F, C, G).
3. OpenAI Search referrals concentrated in yet another genre (Genre H) at ~71%, the highest single-genre concentration of the four sources.
4. Perplexity referrals fell into only three genres (I, J, K). None of the dominant genres for the other AI sources (A–H) received any Perplexity traffic.
5. Pages that received traffic from both ChatGPT and Gemini were limited to two genres (A and C). Outside that, AI sources did not overlap on the same pages.
6. Genre distributions were clearly separated by AI source. Whether this reflects AI-side preference or a projection of our own site composition cannot be determined from this data alone.

Methodology

Measurement Conditions (Disclosed at Reproducible Granularity)

The measurement target is the GA4 property of a separately operated web-developer / frontend-focused publication we own. This is a different site from the present blog (seo.codequest.work). We do not disclose traffic volume or specific URLs, but we disclose the measurement method and classification axis at a reproducible granularity.

GA4 Query Conditions

Period: 30 days (Apr 18, 2026 – May 17, 2026)
Dimensions: sessionSource × landingPagePlusQueryString
Metrics: totalUsers, sessions, eventCount
Filter: sessionSource ∈ { chatgpt.com, openai.com, openai, perplexity.ai, gemini.google.com, gemini.com, claude.ai, copilot.microsoft.com, you.com, phind.com }
Sources with no traffic in window: claude.ai, copilot.microsoft.com, you.com, phind.com
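
For readers who want to run the same conditions programmatically, here is a minimal sketch of a request body for the GA4 Data API `runReport` method. It only builds the JSON payload; the property ID, authentication, and the HTTP call are left out, and the function name `build_run_report_body` is ours for illustration. We are not claiming this is how the original report was pulled (the same conditions can be set up in the GA4 UI).

```python
import json
from datetime import date

# Source values copied from the filter listed above.
AI_SOURCES = [
    "chatgpt.com", "openai.com", "openai", "perplexity.ai",
    "gemini.google.com", "gemini.com", "claude.ai",
    "copilot.microsoft.com", "you.com", "phind.com",
]


def build_run_report_body(start: date, end: date) -> dict:
    """Build a runReport payload: two dimensions, three metrics, one in-list filter."""
    return {
        "dateRanges": [
            {"startDate": start.isoformat(), "endDate": end.isoformat()}
        ],
        "dimensions": [
            {"name": "sessionSource"},
            {"name": "landingPagePlusQueryString"},
        ],
        "metrics": [
            {"name": "totalUsers"},
            {"name": "sessions"},
            {"name": "eventCount"},
        ],
        # Keep only rows whose sessionSource is one of the listed AI sources.
        "dimensionFilter": {
            "filter": {
                "fieldName": "sessionSource",
                "inListFilter": {"values": AI_SOURCES},
            }
        },
    }


body = build_run_report_body(date(2026, 4, 18), date(2026, 5, 17))
print(json.dumps(body, indent=2))
```

Swap the dates for your own window; the rest of the payload mirrors the dimensions, metrics, and filter disclosed above.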

Each landing page was hand-classified, based on URL pattern and page title, into exactly one of 11 genres. Concrete genre identities are withheld to protect the competitive position of the observation source; in this article they are referenced as A through K.
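
The classification in this report was done by hand. If you want a repeatable version on your own site, one option is a small rule table where the first matching rule wins, which keeps each page in a single genre. The sketch below is an assumption-laden illustration: the URL patterns, the title keyword, and the `classify` helper are all hypothetical and are not the rules used on the observed site.

```python
import re
from typing import Optional

# Hypothetical rules: (genre label, URL regex or None, lowercase title keyword or None).
# Order matters: the first rule that matches decides the genre.
RULES = [
    ("A", re.compile(r"^/section-a/"), None),
    ("B", re.compile(r"^/section-b/"), None),
    ("C", None, "topic c"),
]


def classify(path: str, title: str) -> Optional[str]:
    """Return exactly one genre label for a landing page, or None if no rule matches."""
    lowered = title.lower()
    for genre, url_pattern, keyword in RULES:
        if url_pattern is not None and url_pattern.search(path):
            return genre
        if keyword is not None and keyword in lowered:
            return genre
    return None  # leave unclassified pages out of the base rather than guessing


print(classify("/section-a/some-page", "Any title"))
print(classify("/blog/post-1", "An intro to Topic C"))
print(classify("/about", "About us"))
```

Pages that return `None` are best reviewed by hand, since a silent default genre would distort the distribution.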

Observation

Genre Distribution by AI Source

Using only landings with identifiable genre as the base, we calculated each AI source's distribution. Treat the percentages as approximate composition shares.
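
The calculation described above can be sketched as follows. This is an illustration under assumptions: the report does not state which metric formed the base, so the sketch uses sessions; the function name `genre_shares` and the sample rows are made up and are not data from the observed site.

```python
from collections import Counter, defaultdict


def genre_shares(rows):
    """rows: iterable of (source, genre_or_None, sessions).

    Returns {source: {genre: percent}} using only rows with an identifiable genre.
    """
    totals = defaultdict(Counter)
    for source, genre, sessions in rows:
        if genre is None:  # landing with no identifiable genre: excluded from the base
            continue
        totals[source][genre] += sessions

    shares = {}
    for source, counts in totals.items():
        base = sum(counts.values())
        if base == 0:
            continue
        shares[source] = {
            genre: round(100 * n / base, 1) for genre, n in counts.most_common()
        }
    return shares


# Illustrative rows only (not the observed data).
rows = [
    ("example-ai", "A", 30),
    ("example-ai", "B", 10),
    ("example-ai", None, 10),
]
print(genre_shares(rows))
```

Because unclassified landings are dropped from the base, the shares describe the composition of classified traffic only, which is why they should be read as approximate.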

Genre labels are anonymized as letters A–K. Identical letters refer to the same genre, so a letter appearing under multiple AI sources means both sources sent traffic to that genre (e.g., Genre A appears under both ChatGPT and Gemini). The concrete identity of each genre is withheld to protect the competitive position of the observation source site.

ChatGPT

Highly concentrated in a single genre (Genre A). The top two genres together account for more than 90% of attributed traffic.

Genre A: 64%
Genre B: 28%
Genre C: (percentage not shown)
Genre D: (percentage not shown)

Gemini

A different single genre (Genre E) holds the majority. ChatGPT's dominant genre (A) also appears as the second largest, but not at the level of concentration ChatGPT shows.

Genre E: 58%
Genre A: 17%
Genre F: (percentage not shown)
Genre C: (percentage not shown)
Genre G: (percentage not shown)

OpenAI Search

Yet another single genre (Genre H) dominates. Although both come from OpenAI, ChatGPT and OpenAI Search concentrate on clearly different genres.

Genre H: 71%
Genre E: 14%
Genre D: 14%

Perplexity

Falls into three genres (I, J, K) that no other AI source touched in this window. Zero Perplexity traffic landed on the dominant genres of the other sources (A–H).

Genre I: 60%
Genre J: 20%
Genre K: 20%

Cross-AI Observation

Pages receiving traffic from both ChatGPT and Gemini were limited to two genres (A and C). Across other AI pairs, traffic to the same pages barely overlapped. This suggests separation by AI source occurs not only at the genre level but also at the individual page level.
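
A page-level overlap check like the one described above can be sketched with set intersections. The helper name `shared_pages`, the source names, and the paths are placeholders for illustration, not values from the observed site.

```python
from itertools import combinations


def shared_pages(pages_by_source):
    """Map each pair of sources to the set of landing pages both sent traffic to."""
    overlap = {}
    for a, b in combinations(sorted(pages_by_source), 2):
        overlap[(a, b)] = pages_by_source[a] & pages_by_source[b]
    return overlap


# Placeholder data: landing pages seen per source.
pages_by_source = {
    "source-x": {"/p1", "/p2"},
    "source-y": {"/p2", "/p3"},
    "source-z": {"/p4"},
}

for pair, pages in shared_pages(pages_by_source).items():
    print(pair, sorted(pages))
```

An empty set for a pair means the two sources never landed on the same page in the window, which is the page-level separation referred to above.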

Interpretive Hypotheses

Why Might Genres Separate by AI? (Hypotheses)

Several hypotheses could plausibly underlie the observed pattern. None can be verified by this data alone. This section organizes the hypotheses — it is not a conclusion.

Training-data / answer-logic difference

Each AI is trained on different corpora and uses different citation logic. Models stronger at code, or stronger at tool recommendation, may surface different domains. Verifying this requires cross-tabulation across multiple sites.

UI flow difference

Gemini's interface emphasizes Workspace integration and feature-discovery flows, while Perplexity foregrounds research and comparison use cases. UI design differences may indirectly steer which genre users land on.

Use-case / user-base difference

ChatGPT may be used as a coding companion, Perplexity for preliminary research, and Gemini as a productivity tool inside the Google Workspace context. The scenarios where each AI is invoked differ, and so do the article genres that get pulled in.

Site-composition projection (the critical counter-hypothesis)

If the observed site already over-indexes on certain genres, the pattern may simply project the site's own genre distribution — not reflect any AI-side tendency. In that case the observation says more about the site than about the AIs.
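
One rough way to probe this counter-hypothesis on your own data is to divide each source's genre share by the site's own genre share. This ratio is our suggestion, not something computed in this report, and "site share" is an assumption you have to define (for example, share of published pages or share of all sessions per genre). The names and numbers below are illustrative only.

```python
def lift_vs_site(ai_share, site_share):
    """Ratio of an AI source's genre share to the site's own genre share.

    Values near 1.0 are consistent with the source simply mirroring the site;
    values far from 1.0 hint at a skew worth investigating further.
    """
    return {
        genre: round(ai_share[genre] / site_share[genre], 2)
        for genre in ai_share
        if site_share.get(genre)  # skip genres with no (or zero) site share
    }


# Illustrative percentages only.
ai_share = {"X": 60.0, "Y": 40.0}
site_share = {"X": 20.0, "Y": 80.0}
print(lift_vs_site(ai_share, site_share))
```

With small session counts the ratio is noisy, so it can flag candidates for a closer look but cannot reject the mirroring hypothesis on its own.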

Limitations

What This Observation Cannot Claim (Critical Section)

This article shares a lab-style observation; it is not industry guidance. To prevent over-generalization, the limits of the data are stated explicitly.

Single-site data

The data is from a single GA4 property we operate. Site composition, target audience, and category mix are specific to that site and do not represent the broader industry.

Short-window snapshot

A 30-day window is short for discussing AI referral behavior. Results may shift with UI updates, answer-engine logic changes, and seasonal factors; stability is not guaranteed.

Site-composition bias is not separated

We cannot rule out that 'Gemini = majority on Genre E' merely reflects an over-representation of Genre-E-like articles on the site. Identifying an interaction between AI source and genre requires cross-validation across multiple sites with different compositions.

GA4 measurement gap

AI traffic can fail to send referrers, be recorded as direct, or resolve to a (not set) sessionSource. Our dataset includes (not set) records, so attributed AI traffic likely differs from true AI traffic.
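
To make the gap concrete, the sketch below brackets true AI traffic between what was attributed and what could be hiding in the ambiguous buckets. It assumes direct traffic appears as `(direct)` in sessionSource and that no AI traffic is misfiled under other named sources; the function name and the numbers are illustrative, not figures from our dataset.

```python
def attribution_bounds(rows, ai_sources):
    """rows: iterable of (sessionSource, sessions) pairs covering all traffic.

    lower_bound: sessions attributed to a known AI source.
    upper_bound: lower_bound plus every session recorded as (direct) or (not set).
    """
    attributed = 0
    ambiguous = 0
    for source, sessions in rows:
        if source in ai_sources:
            attributed += sessions
        elif source in ("(direct)", "(not set)"):
            ambiguous += sessions
    return {"lower_bound": attributed, "upper_bound": attributed + ambiguous}


# Illustrative numbers only.
rows = [
    ("chatgpt.com", 40),
    ("(direct)", 100),
    ("(not set)", 10),
    ("google", 500),
]
bounds = attribution_bounds(rows, {"chatgpt.com", "perplexity.ai"})
print(bounds)
```

The upper bound is deliberately loose, since most direct traffic is not AI traffic; its purpose is to show how wide the uncertainty can be, not to estimate the true figure.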

Do not paraphrase as 'AI preference'

The observed skew must not be paraphrased as 'AI prefers genre X.' We intentionally avoid 'prefer' style language throughout this article. Observation and preference live at different layers.

Counter-hypothesis: noise or composition mirroring

The observed pattern could be short-window noise or a straightforward mirror of site composition. Until these counter-hypotheses can be rejected with data, the pattern should not be used as a basis for decisions.

If This Pattern Generalizes (Conditional Practical Notes)

Conditioned on 'if this pattern generalizes,' here are directions for content design in the AI era. Because this report withholds the concrete identity of genres, we do not prescribe genre-specific moves. Instead, we outline how to run the same query on your own GA4 and build your own genre distribution.

AI source	Observed concentration shape	What to verify on your own site
ChatGPT	60%+ concentrated in a single genre (top two ≥ 90%)	Check whether your site also concentrates on top genres. If so, prioritize auditing the structured-data type and heading structure of those top-genre articles.
Gemini	Majority share in a different genre than ChatGPT	A different set of pages may grow. Check whether your dominant Gemini genre matches your dominant ChatGPT genre to see whether improvement targets overlap.
OpenAI Search	Highest single-genre concentration (~71%)	Check whether extreme single-genre concentration appears on your site too. If so, prioritize direct-answer blocks and FAQPage structured data on those articles.
Perplexity	Spread across a few genres. Zero overlap with other AIs' top genres	Verify whether the genres dominant under other AIs are zero under Perplexity. If so, design your internal linking under the assumption that Perplexity traffic flows through a different set of articles.

To repeat: these are directions 'suggested by observation,' not guaranteed plays. If your own GA4 shows the opposite pattern, disregard the table above.
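
To compare your own distribution against the "concentration shape" column, a small helper can reduce a genre distribution to its top-one share, top-two share, and genre count. The helper name `concentration_shape` is ours; the inputs below are the percentages stated in this article for the two sources where every genre has a stated value.

```python
def concentration_shape(shares):
    """Summarize a {genre: percent} distribution as top-1 share, top-2 share, genre count."""
    ranked = sorted(shares.values(), reverse=True)
    if not ranked:
        return {"top1": 0, "top2": 0, "genres": 0}
    return {"top1": ranked[0], "top2": sum(ranked[:2]), "genres": len(ranked)}


# Percentages as reported in this article.
openai_search = {"H": 71, "E": 14, "D": 14}
perplexity = {"I": 60, "J": 20, "K": 20}

print(concentration_shape(openai_search))
print(concentration_shape(perplexity))
```

Feed it the output of your own GA4 genre distribution and compare the three numbers per source before reading anything into the table.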

Next Actions to Strengthen the Observation

1. Extend the observation window to 90 and 180 days to test stability over time.
2. Run the same methodology on multiple sites with different compositions (content-led, tool-led, corporate) to separate site-composition bias.
3. Cross-tabulate landing genres against structured-data types (HowTo, SoftwareApplication, FAQPage, Article, etc.).
4. Include direct traffic and (not set) referrers in the base, and estimate true AI traffic from timestamps, UA, and landing patterns.
5. Run the same measurement on this site (seo.codequest.work) and log AI referral patterns for the content-SEO-tool industry as a separate sample.

Summary: Read This as a Shared Observation

Genre distributions of AI referral landings clearly separated by source. That much is a factual observation.

But there is no basis yet to declare 'AIs prefer that genre.' We cannot reject the site-composition-mirroring hypothesis or the short-window-noise hypothesis. This article deliberately ships the observation alongside hypotheses and limits, without generalizing to the industry. In a fast-moving AI ecosystem, sharing observations honestly is itself the contribution.

Written by

今井政和

SEO Director / Frontend Developer

SEO Director with 20+ years of web industry experience. Creator of CodeQuest.work SEO and the official WordPress plugin "ORECTIC SEO CHECK." Author of a book on web strategy inspired by Edo-era merchant principles.

@imai_director

FAQ

Can the observed AI-genre patterns be reproduced on other sites?

That has not been established yet. The observation comes from GA4 data on a single site we operate separately, so the site's own content mix is baked in as a bias. For example, 'Gemini traffic skewing toward tool articles' may reflect the fact that our site has many tool articles, not an AI-side tendency. Without verification across multiple sites and longer time spans, this cannot be treated as an industry-wide pattern.

How did you identify ChatGPT, Gemini, Perplexity, and OpenAI Search in GA4?

We filtered the GA4 sessionSource dimension against ten values: chatgpt.com, openai.com, openai, gemini.google.com, gemini.com, perplexity.ai, claude.ai, copilot.microsoft.com, you.com, phind.com. The observation window was 30 days, dimensions were sessionSource × landingPagePlusQueryString, and metrics were totalUsers / sessions / eventCount. No traffic from claude.ai, copilot.microsoft.com, you.com, or phind.com was observed during the period.

Can 'genres separated by observation' be paraphrased as 'AI prefers this genre'?

It should not be. This report sticks to language like 'observed,' 'suggested,' and 'pattern that appears to exist,' and avoids asserting AI-side preference. The reason: we cannot separate whether the observed pattern is driven by (1) AI citation behavior, (2) AI UI flows, (3) user habits, or (4) a projection of our own site's article mix. Using 'prefers' at this stage would let the claim exceed what the data can support.

Why don't you publish specific article titles or raw session counts?

Out of consideration for the competitive position of the separately operated site, we omit information that would identify specific landing pages or traffic volumes. The purpose of this report is to share the pattern (AI sources separating by genre) and the methodology, not to disclose that site's traffic. To preserve reproducibility, we do disclose the measurement method, dimensions, and time window.

How can this observation be applied in practice?

Under the condition 'if this pattern generalizes,' two directions are worth considering: (1) structure llms.txt sections with AI distribution in mind (separate tool / implementation / troubleshooting groupings), and (2) align article structured data with 'types that tend to be cited per AI' (HowTo / Product / FAQPage and so on). But generalization is premature at this point — verify on your own GA4 first before acting.
