Age of Generative Search Article Set: 3 of 3
Opening Scene
The Shift Begins
In early 2025, a travel brand asked a simple question: “Why does Gemini keep recommending our competitor, even when we outrank them on Google?”
The SEO team checked everything. Technical health: fine. Backlinks: strong. Content depth: excellent. But when they asked Gemini directly where it sourced information, the answers hinted at something unexpected. The competitor had been included in the model's training data. The travel brand, despite dominating the SERPs, had not.
And the second blow arrived minutes later. Perplexity, running a retrieval-based model, did surface the travel brand… but only one specific article, because the rest of their content wasn't structured, crawlable, or semantically clear enough to be parsed.
Two AI models. Two types of memory. Two wildly different visibility outcomes.
This is the quiet reality shaping search today. AI does not “know the internet.” It knows what it remembers, and what it can retrieve.
The brands winning generative visibility are the ones optimising for both.
The Insight
What's Really Happening
For years, marketers assumed AI behaved like Google: it crawled, indexed, ranked, and surfaced content.
But large language models operate very differently. They draw on two distinct layers:
- Pre-training, the “model memory layer”
- Retrieval, the “real-time lookup layer”
Both determine whether your content appears in AI answers. Both follow different rules. And both must be optimised deliberately.
Your visibility depends on whether the machine can:
- remember you (training)
- find you (retrieval)
- understand you (semantic parsing)
- trust you (factual grounding)
Traditional SEO never had to deal with this duality. GEO and AEO must.
How Training Works
What AI “Remembers”
When a model like GPT-4o, Gemini 2.0, or Claude 3.5 is trained, it ingests trillions of tokens from:
- open web text
- licensed datasets (Reddit, Stack Overflow, news archives)
- curated documents
- academic repositories
- publisher partnerships
- structured knowledge sources (Wikipedia, Wikidata, schema corpora)
During this phase, the model builds a probabilistic understanding of the world: what entities are, how topics relate, which sources appear authoritative, and what patterns constitute trustworthy information.
Brands that appear in this layer have a structural advantage: AI does not need to “look you up” to include you. The model already knows you.
This is why early inclusion matters. It mirrors early link-building in the 2000s: foundations laid early tend to calcify.
If your content was not part of the training data:
- the model may not identify your brand as a distinct entity
- it may hallucinate or misrepresent your information
- it may default to competitors with clearer entity footprints
- you may struggle to appear in answers even with perfect SEO
This is why entity clarity, not keyword density, is becoming the new currency of visibility.
How Retrieval Works
What AI “Looks Up”
Modern AI systems layer retrieval on top of training to ensure accuracy and freshness. This is where GEO and AEO have direct influence.
Retrieval draws from:
- indexable websites
- structured databases
- live search APIs
- proprietary RAG pipelines
- citations surfaced from model memory
- curated knowledge stores and embeddings
The retrieval process is governed by:
- crawlability: can the system access the page?
- structure: is the page machine-readable?
- semantic scoring: does the content match the query clearly?
- evidence certainty: are facts explicit?
- chunk quality: can meaning be extracted in 150–350 token segments?
This is how Perplexity, Microsoft Copilot, and ChatGPT with search produce citations.
And this is where many brands fail.
Common reasons include:
- over-designed pages with weak semantic markup
- ambiguous entities
- content that looks visually rich but structurally empty
- duplicated or contradictory definitions
- poor schema, or schema that doesn't match visible content
- paywalled or blocked sections that break discoverability
Retrieval, unlike training, is brutally literal. If the machine cannot extract meaning cleanly, it moves on.
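To make the chunking and semantic scoring steps above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular AI product works: real systems use model tokenizers and learned embeddings, whereas this sketch approximates tokens with words and scores chunks with a simple bag-of-words cosine similarity. The function names and the 250-token window with 50-token overlap are hypothetical choices within the 150–350 range mentioned above.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; a crude stand-in for a model tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def chunk_tokens(tokens: list[str], size: int = 250, overlap: int = 50) -> list[list[str]]:
    """Split tokens into overlapping windows of at most `size` tokens."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the page
    return chunks


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_chunks(page_text: str, query: str, size: int = 250, overlap: int = 50):
    """Return (score, chunk_text) pairs, best match first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(Counter(chunk), query_vec), " ".join(chunk))
        for chunk in chunk_tokens(tokenize(page_text), size, overlap)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical page copy: only the first chunk answers the query directly.
page = (
    "Our hotel in Edinburgh sits near the castle. "
    "Pricing tables follow below with seasonal rates."
)
ranked = rank_chunks(page, "hotel in Edinburgh", size=8, overlap=0)
for score, text in ranked:
    print(round(score, 2), text)
```

The point of the sketch is the failure mode: a chunk that never states the entity or the answer in plain words scores close to zero, however polished the page looks.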
The Strategic Shift
Why This Matters for Business
For leaders, the implications are profound.
1. SEO alone cannot secure AI visibility
You might rank #1 in Google but appear nowhere in AI answers. Ranking and retrieval are not the same process.
2. GEO demands entity engineering, not just optimisation
AI must understand what your brand is, how it connects to other entities, and what problems it solves.
This requires:
- structured definitions
- stable naming conventions
- schema consistency
- factual clarity
- internal linking that reinforces identity
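As a rough illustration of what a structured definition with stable naming can look like, the sketch below builds a schema.org Organization description as JSON-LD. The brand name, URLs, and description are invented placeholders, and the helper function is hypothetical; the property names (name, url, description, sameAs) are standard schema.org Organization properties.

```python
import json


def organization_jsonld(name: str, url: str, description: str, same_as: list[str]) -> dict:
    """Build a schema.org Organization description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,                # use exactly the same name everywhere
        "url": url,
        "description": description,  # one stable, factual definition
        "sameAs": list(same_as),     # other profiles describing the same entity
    }


# Placeholder values for illustration only.
entity = organization_jsonld(
    name="Example Travel Co",
    url="https://www.example.com/",
    description="Example Travel Co organises walking tours and boutique stays in Edinburgh.",
    same_as=["https://www.example.org/profiles/example-travel-co"],
)

# The output would normally sit in a <script type="application/ld+json"> tag.
print(json.dumps(entity, indent=2))
```

Generating the block from one source of truth, rather than hand-editing it page by page, is one way to keep the definition identical across the site.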
3. Training is slow, retrieval is instant, and both shape your future
Training sets the long-term baseline. Retrieval fills the gaps. If you're absent from both, AI has nothing authoritative to use.
4. Visibility becomes a strategic asset
Being included in model memory influences product recommendations, brand comparisons, travel advice, financial guidance, health queries, and B2B category definitions.
Brands absent from training and retrieval layers risk becoming invisible, even if their marketing is strong.
The Human Dimension
Reframing the Relationship
Your audience is no longer “searching” in the traditional sense. They are conversing.
They ask:
- “What's the best CRM for a small business?”
- “Where should I stay in Edinburgh?”
- “Which laptop should I buy under £1,500?”
The AI delivers a narrative, a recommendation, a shortlist, a decision-making pathway.
Your brand is either part of that narrative, or not.
Users aren't browsing. They're accepting. The AI acts like a trusted adviser, filtering complexity. When it includes your content, the relationship begins before the customer reaches your site. When it doesn't, you never enter the conversation.
This is the new discovery frontier: the private conversation between your customer and their AI.
Optimising for Training and Retrieval
The Dual Playbook
Brands must treat AI visibility as a two-sided optimisation challenge.
1. Optimise for Training (Long-Term Authority)
Make your brand part of the model's structural understanding through:
- publicly accessible, high-authority pages
- consistent entity definitions
- domain-level clarity
- contributions to open data ecosystems
- structured, factual cornerstone content
- evergreen thought leadership
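"Publicly accessible" is something you can check rather than assume. The sketch below uses Python's standard urllib.robotparser to test whether a robots.txt file lets a given crawler fetch a page. The robots.txt content and page URL are invented, and GPTBot and PerplexityBot appear only as example user-agent tokens; check each vendor's current documentation for the names their crawlers actually use.

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt: str, page_url: str, agents: list[str]) -> dict[str, bool]:
    """Report, per user agent, whether robots.txt permits fetching page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network call
    return {agent: parser.can_fetch(agent, page_url) for agent in agents}


# Hypothetical robots.txt: one AI crawler is blocked site-wide,
# everyone else is only kept out of /account/.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /account/
"""

result = allowed_agents(
    SAMPLE_ROBOTS,
    "https://www.example.com/guides/edinburgh",
    ["GPTBot", "PerplexityBot"],
)
print(result)
```

A site can rank well in conventional search while a rule like the first one quietly keeps its pages away from a crawler that feeds an AI system.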
2. Optimise for Retrieval (Real-Time Inclusion)
Enable the model to “look you up” effectively through:
- semantic HTML
- schema markup aligned with visible content
- question-led page structures
- explicit definitions
- clear citations and data callouts
- removal of ambiguous or contradictory phrasing
- AI-readable layouts (FAQs, summaries, scannable sections)
This dual playbook transforms your site from a marketing asset into a knowledge asset, one that machines can synthesise and reuse.
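One item in the retrieval list, schema markup aligned with visible content, lends itself to an automated check. The following is a minimal sketch using only Python's standard library: it pulls JSON-LD out of a page, collects the text outside script and style tags, and flags schema fields whose values never appear in that text. The class, function, and sample page are hypothetical, and exact substring matching is a deliberately strict simplification.

```python
import json
from html.parser import HTMLParser


class PageScan(HTMLParser):
    """Collects JSON-LD blocks and text outside <script>/<style> tags."""

    def __init__(self):
        super().__init__()
        self.jsonld_blocks = []
        self.visible_text = []
        self._jsonld_buf = None  # list while inside a JSON-LD script, else None
        self._skip_depth = 0     # inside a non-JSON-LD <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._jsonld_buf = []
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "script" and self._jsonld_buf is not None:
            self.jsonld_blocks.append("".join(self._jsonld_buf))
            self._jsonld_buf = None
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._jsonld_buf is not None:
            self._jsonld_buf.append(data)
        elif not self._skip_depth and data.strip():
            self.visible_text.append(data.strip())


def schema_mismatches(html: str, fields=("name", "description")) -> list[str]:
    """List schema fields whose string values are missing from the page text."""
    scan = PageScan()
    scan.feed(html)
    page_text = " ".join(" ".join(scan.visible_text).split()).lower()
    problems = []
    for raw in scan.jsonld_blocks:
        data = json.loads(raw)
        if not isinstance(data, dict):
            continue  # this sketch only handles a single top-level object
        for field in fields:
            value = data.get(field)
            if isinstance(value, str) and " ".join(value.split()).lower() not in page_text:
                problems.append(f"{field}: {value!r} not found in visible text")
    return problems


# Hypothetical page: the schema describes safaris, the copy describes Edinburgh.
SAMPLE = """
<html><head>
<script type="application/ld+json">
{"@type": "Organization", "name": "Example Travel Co", "description": "Luxury safaris in Kenya."}
</script>
</head><body>
<h1>Example Travel Co</h1>
<p>Walking tours and boutique stays in Edinburgh.</p>
</body></html>
"""

mismatches = schema_mismatches(SAMPLE)
print(mismatches)
```

In practice a looser comparison (for example, shared key terms rather than exact phrases) would produce fewer false alarms; the strict version is used here because it is easy to reason about.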
The Takeaway
What Happens Next
Search is no longer just a list of links. It is a process of interpretation.
AI models:
- learn what they can
- retrieve what they trust
- synthesise what is clear
- recommend what they understand
Your job is to ensure your brand sits confidently at the intersection of all four.
The next era of visibility won't be won through rankings. It will be won through understanding.
AI will amplify the brands it can interpret, and forget the ones it can't.



