Research Synthesis

The Reality Gap: Public Guidance vs. Engineering Reality

Google's public communications serve two purposes: genuine guidance to help webmasters build better sites, and strategic misdirection to prevent gaming. The gap between the two is not always deception — but the systematic denial of click-based ranking, despite NavBoost being "perhaps the central way web ranking has improved for 15 years," represents the clearest case of deliberate public misdirection. This page maps the gap with specific quotes, specific API modules, and specific sworn testimony.

Compiled by Aviel Fahl

Key Findings

The 2024 Google API leak exposed 2,596 modules containing 14,014 attributes across Google's ranking infrastructure. The DOJ antitrust trial produced sworn testimony from senior Google engineers confirming click data as a core ranking signal. Together, these sources contradict at least eight specific public statements made by Google spokespeople over the past decade. Google denied using click data for ranking while maintaining NavBoost, a 13-month click-signal table confirmed under oath as one of their most important ranking signals. They denied having a domain authority metric while a field called siteAuthority exists in their CompressedQualitySignals module. They denied a sandbox for new sites while a hostAge attribute is explicitly described as used "to sandbox fresh spam in serving time."

At a Glance

2,596 API modules leaked (May 2024). 14,014 ranking attributes documented. 8+ public statements directly contradicted. A 13-month NavBoost click-data window, confirmed in sworn testimony.

The Gap Exists


In May 2024, internal Google API documentation — the Content API Warehouse, 2,500+ pages of Java-based protobuf specs — was published to a public GitHub repository by an automated bot called yoshi-code-bot. Rand Fishkin at SparkToro and Mike King at iPullRank published independent analyses on May 27, 2024.

Google's response was a non-denial: "We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information." They did not claim the documentation was fabricated, inaccurate, or from a non-production system. They declined to specify which elements are accurate, invalid, currently active, or weighted.

Separately, the DOJ v. Google antitrust trial in October 2023 produced sworn testimony from senior engineers including Pandu Nayak (VP of Search) and Eric Lehman (Distinguished Engineer). Court documents stated that "learning from user feedback is perhaps the central way web ranking has improved for 15 years."

The table below maps the core contradictions. Each row pairs a specific public statement with the specific internal evidence that contradicts it.

Sources: SparkToro, iPullRank, Hobo Web (May 2024); DOJ v. Google trial documents (October 2023); SERoundtable, Search Engine Land (various dates)

Topic: Click data
Public statement: Gary Illyes (2019): "Dwell time, CTR, whatever Fishkin's new theory is, those are generally made up crap."
Internal evidence: NavBoost: goodClicks, badClicks, lastLongestClicks, unicornClicks in QualityNavboostCrapsCrapsClickSignals. Confirmed under oath by Pandu Nayak as "one of the most important ranking signals."

Topic: Domain authority
Public statement: John Mueller (various): Google does not have a site-wide authority score.
Internal evidence: A siteAuthority field exists in CompressedQualitySignals. Q* combines it with PageRank and content quality.

Topic: Chrome data
Public statement: Google (various): "We don't use Chrome browsing data for ranking purposes."
Internal evidence: chromeInTotal in QualityNsrNsrData. chrome_trans_clicks in click processing. P* (the Popularity signal) is powered by Chrome visit data.

Topic: Sandbox
Public statement: Google (various): denied the existence of a sandbox for new sites.
Internal evidence: hostAge in PerDocData, explicitly described as used "to sandbox fresh spam in serving time."

Topic: Link equality
Public statement: Gary Illyes (2024): "We need very few links to rank pages... we've made links less important."
Internal evidence: sourceType on AnchorsAnchorSource classifies links into HIGH_QUALITY / MEDIUM_QUALITY / LOW_QUALITY tiers tied to index storage. PageRank threshold bucketing on the uint16 scale: sources above ~51,000 get premium treatment; everything below ~47,000 is treated as equivalent.

Topic: Authorship
Public statement: Gary Illyes (various): Google is "not using authorship" as a ranking factor.
Internal evidence: An isAuthor boolean and authorReputationScore appear in the documentation. The Author Vectors patent (US 10,599,770) enables stylometric authorship detection.

Topic: Penguin real-time
Public statement: Gary Illyes (2016): Penguin 4.0 is "real-time, part of core algorithm."
Internal evidence: A penguinLastUpdate timestamp suggests batch processing. penguinPenalty and penguinTooManySources operate as discrete flags.

Topic: User engagement
Public statement: John Mueller (various): Google does not use "engagement" as a factor for ranking.
Internal evidence: NavBoost operates on a rolling 13-month window of engagement data. The CRAPS module segments clicks by country, device, language, and metro area.

Interpreting the leak

The API documentation is confirmed authentic, but Google correctly notes it is "out-of-context." We cannot determine which attributes are actively weighted, deprecated-but-populated, or experimental. Attributes marked deprecated may still be populated; unmarked attributes may be experimental. The system architecture and module relationships are the most durable findings. Treat specific attribute names as confirmed-to-exist, not confirmed-to-be-active.

Click Signals and User Behavior


This is the widest gap between public guidance and internal reality. For over a decade, multiple Google spokespeople denied or minimized the role of click data in rankings. The leaked documentation and sworn testimony prove click data is central to the ranking system.

Sources: Statements via SERoundtable and SEJ (various dates); systems via iPullRank, Hobo Web, and DOJ trial testimony (2023-2024)

What they said: "Dwell time, CTR... those are generally made up crap. Search is much more simple than people think."
Who said it: Gary Illyes, Reddit AMA, January 2019
What the systems show: NavBoost tracks goodClicks (satisfaction), badClicks (pogo-sticking), lastLongestClicks (session-ending satisfaction), and unicornClicks (high-trust user behavior).

What they said: "We're not using such metrics" (dwell time, time on page).
Who said it: Martin Splitt, 2019
What the systems show: The CRAPS module (Click-Related Active Promotion Signals) segments click data by country, device, language, and metro area. A page can rank differently on mobile in Brazil vs. desktop in Germany.

What they said: Google does not use "engagement" as a factor for ranking.
Who said it: John Mueller, various
What the systems show: NavBoost was confirmed under oath by Pandu Nayak (VP of Search) and Eric Lehman (Distinguished Engineer) as one of the most important ranking signals. It uses a rolling 13-month window.

NavBoost is not a machine learning model. Eric Lehman described it under oath as "essentially a large spreadsheet" storing which URLs were clicked and how often for each query. "Long clicks" (user stays) are positive; "short clicks" (quick returns) are negative. The system also tracks aging buckets — evaluating click performance separately by content age to detect whether a page's engagement is improving or decaying over time.
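Lehman's "large spreadsheet" description suggests something like the toy sketch below. The field names echo the leaked attributes, but the 10-second short-click cutoff and the scoring formula are invented here purely for illustration:

```python
from collections import defaultdict

class ClickRecord:
    """Hypothetical per-(query, url) row in the NavBoost 'spreadsheet'."""
    def __init__(self):
        self.good_clicks = 0          # satisfied visits (user stayed)
        self.bad_clicks = 0           # pogo-sticking (quick return to SERP)
        self.last_longest_clicks = 0  # session-ending satisfaction

table = defaultdict(ClickRecord)      # keyed by (query, url)

def observe(query, url, dwell_seconds, ended_session):
    """Record one click. The 10s short-click cutoff is an assumption."""
    rec = table[(query, url)]
    if dwell_seconds < 10:            # "short click": negative signal
        rec.bad_clicks += 1
    else:                             # "long click": positive signal
        rec.good_clicks += 1
        if ended_session:
            rec.last_longest_clicks += 1

def boost(query, url):
    """Invented scoring: satisfied-click fraction, with extra credit
    for clicks that ended the search session entirely."""
    rec = table[(query, url)]
    total = rec.good_clicks + rec.bad_clicks
    if total == 0:
        return 0.0
    return (rec.good_clicks + rec.last_longest_clicks) / (2 * total)
```

The point is architectural: no model inference at serve time, just a lookup keyed by query and URL, which is exactly why Lehman could call it a spreadsheet.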

Click data is processed through the CompressedQualitySignals module as crapsNewUrlSignals (URL-level), crapsNewHostSignals (host-level), and crapsNewPatternSignals (pattern-level). Google maintains both squashed (production) and unsquashed (experimentation) versions of this data. Click squashing is statistical compression that prevents any single dominant signal from overwhelming rankings: it is the primary defense against CTR manipulation while clicks remain a core input.
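The exact squashing function is not documented. A saturating curve like the hypothetical one below captures the stated intent: more clicks help, but no raw click count can dominate the signal.

```python
def squash(clicks: int, half_weight: float = 100.0) -> float:
    """Hypothetical click squashing. Output grows with clicks but
    saturates toward 1.0, so a million manipulated clicks is worth
    only marginally more than a few thousand organic ones.
    The functional form and half_weight are assumptions."""
    return clicks / (clicks + half_weight)
```

With half_weight=100, the curve reaches 0.5 at 100 clicks and is nearly flat past 10,000, which is the manipulation-resistance property the document describes.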

The trial also revealed Google's ranking distills to two top-level signals: Quality (Q*) and Popularity (P*). The Popularity signal is directly powered by Chrome visit data and user interaction signals from NavBoost — contradicting the public statement that Chrome browsing data is not used for ranking.

Sources: 2024 Google API leak, HexDocs raw documentation

Signal: chromeInTotal
Location: QualityNsrNsrData
Function: Site-level aggregate of Chrome browser views

Signal: chrome_trans_clicks
Location: Click processing pipeline
Function: Chrome transition clicks, used to identify most-visited URLs and generate Sitelinks

Why this matters for practitioners

Click signals are not something you can directly optimize. But understanding that they exist changes strategic priorities. Pages that generate satisfied clicks (long dwell, no SERP return) accumulate NavBoost signals that compound over time. This is the quantitative mechanism behind "make content that genuinely satisfies the query" — not because it is vague aspirational advice, but because a 13-month table of click satisfaction data is directly re-ranking results.


Links and PageRank


Google's public messaging on links has evolved from "one of the top three ranking factors" to "we need very few links to rank pages." The leaked documentation tells a different story: Google maintains at least 11 distinct PageRank variants and a deeply granular anchor text processing pipeline.

Sources: Statements via SERoundtable and Page One Power (various); systems via iPullRank, On-Page.ai, and Hobo Web (May 2024)

What they said: "Links are important, but people overestimate their importance. Not top 3, hasn't been for some time."
Who said it: Gary Illyes, Pubcon, September 2023
What the systems show: 11+ active PageRank variants in PerDocData: pagerank, pagerank0/1/2, homepagePagerankNs, pagerankNs, rawPagerank, crawlerPageRank, IndyRank, ScaledIndyRank, site_pr, setiPagerankWeight.

What they said: "We need very few links to rank pages... we've made links less important."
Who said it: Gary Illyes, SERP Conf, March 2024
What the systems show: sourceType on AnchorsAnchorSource classifies linking pages into three quality tiers (HIGH_QUALITY, MEDIUM_QUALITY, LOW_QUALITY) tied directly to index storage tier. Links from Base-tier pages carry full signal; Landfills-tier links carry minimal signal.

What they said: "I would recommend avoiding link building."
Who said it: John Mueller, 2021
What the systems show: PageRank threshold bucketing: topPrOnsiteAnchorCount / topPrOffdomainAnchorCount treat links from sources above 51,000 PageRank (on the uint16 scale) as qualitatively different from everything below 47,000. Binary premium, not gradual.
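The bucketing can be made concrete with a short sketch. The ~51,000 and ~47,000 cutoffs come from the leak analysis of the uint16-scaled PageRank values; the tier names and the function itself are invented for illustration:

```python
def link_value_bucket(source_pagerank: int) -> str:
    """Illustrative PageRank threshold bucketing. PageRank is reportedly
    stored as a uint16 (0-65535): sources above ~51,000 get premium
    treatment, while everything below ~47,000 is treated as roughly
    equivalent. Tier names here are hypothetical."""
    if not 0 <= source_pagerank <= 65535:
        raise ValueError("PageRank is a uint16 value (0-65535)")
    if source_pagerank > 51000:
        return "premium"
    if source_pagerank >= 47000:
        return "intermediate"
    return "baseline"
```

The notable implication is the flat bottom bucket: a link from a 10,000-PageRank page and one from a 46,000-PageRank page land in the same tier.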

The anchor text processing is far more granular than public guidance suggests. Google maintains completely separate scoring pipelines for internal vs. external links: normalizedScoreFromOffdomain vs. normalizedScoreFromOnsite, with independent counts, volume denominators, and aggregate scores. Redirected links and fragment links are each scored separately. Anchors are deduplicated within source organizations to prevent a single company from flooding anchor signals.

The system also analyzes full context around each anchor: fullLeftContext and fullRightContext capture all terms preceding and following the anchor text. Content position matters: inbodyTargetLink vs. outlinksTargetLink differentiates main content links from sidebar and footer links. And onsiteProminence measures page importance within its own site — but it is not a theoretical PageRank calculation. It is computed by propagating simulated traffic from the homepage and pages with high search-click volume. This is a user-behavior-informed simulation seeded from empirically validated entry points. The implication: Google says clicks don't matter for ranking, but clicks literally seed the simulation that determines internal page importance.
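A minimal sketch of onsiteProminence as described: simulated traffic seeded from the homepage and high-click entry pages, spread along internal links. Only the seeding idea comes from the documentation; the damping factor, iteration count, and update rule below are assumptions.

```python
def onsite_prominence(links, seeds, iterations=20, damping=0.85):
    """links: {page: [internally linked pages]}.
    seeds: {page: initial traffic share} for empirically validated
    entry points (homepage, pages with high search-click volume).
    Simulated visitors start at seeds and follow internal links;
    prominence is the steady-state visit mass per page."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    score = {p: seeds.get(p, 0.0) for p in pages}
    for _ in range(iterations):
        nxt = {p: (1 - damping) * seeds.get(p, 0.0) for p in pages}
        for page, outs in links.items():
            if not outs:
                continue
            share = damping * score[page] / len(outs)
            for target in outs:
                nxt[target] += share
        score = nxt
    return score
```

Structurally this is personalized PageRank with the personalization vector set from real entry points, which is what makes it "user-behavior-informed" rather than purely theoretical.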

New pages inherit homepage authority (homepagePagerankNs) until acquiring individual PageRank. The Nearest Seeds variant seeds calculations from known-trusted pages rather than distributing uniformly — "distance from a known good source" is how Q* measures authority. Homepages themselves are classified into a four-tier trust system via homePageInfo: FULLY_TRUSTED, PARTIALLY_TRUSTED, NOT_TRUSTED, and NOT_HOMEPAGE. Every page on the domain inherits from this tier — a site-wide authority gate by another name, despite Google's denials of a "domain authority" concept.

The practical takeaway

Links have not become less important. What has changed is which links carry signal. The three-tier index tiering system means links from pages with no clicks (Landfills tier) carry minimal value, while links from high-engagement pages (Base tier) carry full value. "We need very few links" is technically correct if those few links are from Base-tier, high-PageRank pages. It is misleading if interpreted as "links barely matter."

Content Quality and Site Authority


Google has publicly stated — repeatedly and through multiple spokespeople — that there is no domain authority metric. The leaked documentation shows a concrete field called siteAuthority inside CompressedQualitySignals, the pre-computed quality gatekeeper module that can disqualify a page before query-time ranking even begins.

Sources: 2024 Google API leak analysis: iPullRank, Hobo Web, SE Ranking

Signal: siteAuthority
Module: CompressedQualitySignals
Function: Combines content quality, click data, and link profile into a site-level authority score

Signal: pandaDemotion
Module: CompressedQualitySignals
Function: Site-wide quality penalty. Operates as algorithmic debt: a ceiling no page-level optimization can overcome

Signal: contentEffort
Module: PerDocData
Function: LLM-based effort estimation for article pages. Quantifies human labor, originality, and resources invested

Signal: OriginalContentScore
Module: PerDocData
Function: Score from 0-512 measuring content uniqueness at the page level

Signal: siteFocusScore
Module: QualityNsrNsrData
Function: How dedicated a site is to a single topic (specialist vs. generalist)

Signal: siteRadius
Module: QualityNsrNsrData
Function: How much an individual page deviates from the site's central theme

Signal: authorityPromotion
Module: CompressedQualitySignals
Function: Boost signal (inverse of demotion). Explicit positive authority weighting

The contentEffort attribute is particularly notable. It is described as an "LLM-based effort estimation for article pages" that quantifies human labor, originality, and resources invested in creating content. Contributing factors include unique images, videos, embedded tools, in-depth content, original data, and linguistic complexity. This is the closest algorithmic proxy for the Experience dimension of E-E-A-T.

Meanwhile, Google's internal quality system uses codenames that reveal a layered evaluation pipeline: chard acts as the initial content classifier (and triggers more rigorous E-E-A-T evaluation if it classifies a page as YMYL), rhubarb measures the quality differential between a specific URL and its parent site, and tofu predicts site-level quality based on content patterns.

Danny Sullivan stated in 2024 that Google does not have a system that says "this is a brand, let's rank it higher." The internal evidence suggests something more nuanced: Copia and Firefly monitor content velocity (the ratio of URLs generated against substantive articles produced), and directFrac measures the fraction of direct traffic to a site — a brand signal. High direct traffic may boost quality scoring. The system does not explicitly favor "brands," but the signals it measures (direct traffic, click satisfaction, authority accumulation) structurally advantage established brands.

Google publicly states they do not penalize AI-generated or automated content — only "scaled content abuse" regardless of production method. Internally, Patent US9767157B2 (N-gram Quality, granted 2017) builds phrase models from known-quality sites using 2-gram through 5-gram frequency patterns. Each phrase is measured by its relative frequency across a site's pages. Template output that produces unnatural phrase distributions — repeated boilerplate, identical sentence structures, shallow variable substitution — gets flagged as low quality. The policy is method-agnostic; the detection system is not.
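A toy version of the patent's idea, using 2-grams only. The 80% sharing threshold and the scoring are invented for illustration; the patent describes richer 2-gram through 5-gram frequency models built from known-quality sites.

```python
from collections import Counter

def bigram_repetition_score(pages):
    """Fraction of distinct bigrams shared across most pages of a site.
    Templated output (shallow variable substitution over boilerplate)
    reuses nearly all of its phrasing page-to-page; genuinely written
    pages share far less. Threshold (80% of pages) is an assumption."""
    page_bigrams = []
    for text in pages:
        words = text.lower().split()
        page_bigrams.append({(a, b) for a, b in zip(words, words[1:])})
    counts = Counter(bg for bgs in page_bigrams for bg in bgs)
    if not counts:
        return 0.0
    shared = sum(1 for c in counts.values() if c >= len(pages) * 0.8)
    return shared / len(counts)
```

This is why the policy can be method-agnostic while the detection is not: an LLM producing varied, information-dense pages scores like human writing, while a mail-merge template scores like spam regardless of who filled it in.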

The Q* quality metric compounds this. Sites scoring below 0.4 on the 0-1 scale are ineligible for rich results — Featured Snippets, People Also Ask, and other SERP features — regardless of their structured data implementation. Google publicly encourages sites to implement structured data for rich result eligibility. Internally, a hard quality gate prevents low-scoring sites from ever appearing in them.
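As a sketch, the reported gate reduces to a simple conjunction. The 0.4 threshold comes from the leak/trial reporting; the function itself is illustrative, not a documented API:

```python
def rich_result_eligible(q_star: float, has_structured_data: bool) -> bool:
    """Hypothetical eligibility gate for rich results. Structured data
    is necessary but not sufficient: a reported hard gate at Q* 0.4
    (on the 0-1 scale) blocks low-scoring sites regardless of markup."""
    return has_structured_data and q_star >= 0.4
```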

E-E-A-T: Marketing Label vs. Engineering Pipeline


Danny Sullivan and Gary Illyes have both repeatedly stated that E-E-A-T is not a ranking factor or score. Sullivan: "E-E-A-T is not a direct ranking factor. It's a concept from the Quality Rater Guidelines." Illyes: Google does not have an internal E-A-T score. Mueller: "There's no single ranking factor you can point to and say it's the deciding factor."

This is technically true and practically misleading. E-E-A-T is indeed not a single signal. It is a marketing label for 80+ independent algorithmic features evaluated at three levels: document, domain, and originator entity.

The DOJ trial revealed the actual engineering pipeline: Quality Raters evaluate pages using E-E-A-T criteria from the Quality Rater Guidelines. Those evaluations become training data for Google's RankEmbed and RankEmbedBERT models. The models learn to predict rater-like quality scores at scale. E-E-A-T is not a direct ranking factor — it is the training objective for the quality scoring model.
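RankEmbed's architecture is not public. A toy regression makes the pipeline concrete anyway: rater-style labels go in, a model that scores unrated pages comes out. The features, training procedure, and every number below are entirely hypothetical.

```python
def train_quality_model(examples, lr=0.01, epochs=500):
    """examples: list of (feature_vector, rater_score in [0, 1]).
    Toy linear model fitted by SGD to predict rater-style quality
    scores. Stands in for RankEmbed, whose real design is unknown."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for feats, label in examples:
            pred = bias + sum(w * f for w, f in zip(weights, feats))
            err = pred - label
            bias -= lr * err
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
    return weights, bias

def predict(model, feats):
    weights, bias = model
    return bias + sum(w * f for w, f in zip(weights, feats))
```

The structural point survives the simplification: raters never score the live index, yet every page gets a rater-like score, because the raters' judgments are the training objective.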

Sources: Statements from multiple Google spokespeople (various dates); pipeline via DOJ trial, API leak, Google patents

What they say: "E-E-A-T is not a ranking factor — it's a concept."
What the pipeline does: Quality Raters evaluate pages using E-E-A-T criteria. Those evaluations train RankEmbed models. The models produce quality scores used in ranking. E-E-A-T is the training signal, not a direct input.

What they say: "There is no E-A-T score."
What the pipeline does: Q* aggregates site/document quality. CompressedQualitySignals contains siteAuthority, pandaDemotion, authorityPromotion. The relationship: E-E-A-T is the goal, Q* is the system, Site_Quality is the score.

What they say: "Authorship is not a ranking factor."
What the pipeline does: An isAuthor boolean and authorReputationScore appear in the API docs. The Author Topic Authority patent (US8458196B1) accumulates per-topic authority scores. The Author Vectors patent (US 10,599,770) enables stylometric detection.

What they say: "Structured data doesn't directly rank."
What the pipeline does: bylineDateConfidence scores byline date accuracy. The Entity-Based Ranking patent (US10235423B2) computes composite scores from knowledge graph entity metrics. Being a recognized entity is a direct ranking input.

Google's only white paper confirming differential E-A-T weighting is "How Google Fights Disinformation" (2019), which states: "Where our algorithms detect that a user's query relates to a YMYL topic, we will give more weight in our ranking systems to factors like our understanding of the authoritativeness, expertise, or trustworthiness of the pages."

The API leak adds a mechanism: the internal classifier chard determines whether a page is YMYL. If chard classifies a page as YMYL, it triggers more rigorous E-E-A-T evaluation. This is the algorithmic gate the white paper describes.

The distinction that matters

"E-E-A-T is not a ranking factor" is a true statement about direct measurement. It is a misleading statement about practical impact. The training objective for Google's quality models is directly derived from E-E-A-T criteria. Saying E-E-A-T is not a ranking factor is like saying "customer satisfaction is not a revenue driver" because it does not appear as a line item on the income statement. It is the input to the system that produces the output.

Freshness and Document History


Google's public guidance on freshness is relatively straightforward: update your content, keep it current, dates matter. The internal systems reveal a significantly more sophisticated evaluation pipeline that distinguishes cosmetic edits from genuine content improvement.

Sources: 2024 Google API leak: iPullRank, Hobo Web, Cyrus Shepard / Zyppy analysis

Signal: lastSignificantUpdate
What it does: Timestamp of the last substantive revision, not the last edit
Why it matters: Explains why updating dates without content changes stopped working circa 2023 (per Cyrus Shepard / Zyppy)

Signal: freshByDocFp
What it does: Document fingerprinting that detects whether actual content changed vs. just timestamps
Why it matters: Cosmetic date changes are detected and ignored

Signal: bylineDateConfidence
What it does: Confidence score for byline date accuracy
Why it matters: Contradictory dates in structured data vs. the visible page degrade the freshness signal

Signal: freshnessDuration
What it does: How long content retains its freshness boost after publication or update
Why it matters: Freshness is a decaying signal, not a permanent state

Signal: syntacticDate / semanticDate
What it does: Date extracted from the URL/title vs. date estimated from the content
Why it matters: Google cross-references multiple date signals to detect manipulation

Google stores only the last 20 versions of a document (via urlHistory / CrawlerChangerateUrlHistory). This contradicts any implication of full history awareness and means the window of observable content evolution is finite.

The Content Freshness Scoring patent (US8549014B2) tracks the age distribution of content within a document — how much is old vs. recently added. It also detects unnatural link acquisition patterns (sudden spikes) as spam signals. Genuine updates that change the age distribution of information on the page improve freshness scores. Date changes alone do not.
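freshnessDuration implies a decay curve. The real curve is undocumented; an exponential with an assumed 90-day half-life illustrates the behavior of a boost that is strong at a substantive update and fades to nothing.

```python
def freshness_boost(days_since_significant_update: float,
                    half_life_days: float = 90.0) -> float:
    """Hypothetical freshness decay: full strength (1.0) at the moment
    of a substantive revision (lastSignificantUpdate), halving every
    half_life_days. Both the exponential form and the 90-day half-life
    are assumptions for illustration."""
    return 0.5 ** (days_since_significant_update / half_life_days)
```

The key property is that only lastSignificantUpdate resets the clock: a cosmetic date change leaves days_since_significant_update, and therefore the boost, unchanged.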

The Sandbox That Doesn't Exist


Google has publicly denied the existence of a "sandbox" for new sites — a deliberate trust-building period where new domains are held back from full ranking potential. The leaked documentation directly contradicts this.

Sources: Statements from various Google spokespeople; systems via the 2024 API leak (Hobo Web, iPullRank)

What they said: Google has no sandbox for new sites.
What the documentation shows: hostAge in PerDocData is explicitly described as used "to sandbox fresh spam in serving time." It represents the earliest firstseen date of all pages on a host/domain.

What they said: No special treatment for new domains.
What the documentation shows: RegistrationInfo.createdDate / expiredDate: domain creation and expiration timestamps are stored as ranking-accessible attributes.

What they said: Content quality determines ranking from day one.
What the documentation shows: NavBoost requires a 13-month rolling window of click data. A new domain has zero accumulated click signals, creating an inherent structural disadvantage against sites with established engagement history.

The sandbox effect is real but more nuanced than the SEO community originally theorized. It is not a single deliberate penalty. It is the combined effect of multiple systems that inherently disadvantage new domains: zero NavBoost history (no click signals to boost with), no accumulated backlink signals in the source tier system, no siteAuthority accumulation in CompressedQualitySignals, and the explicit hostAge sandbox flag for new hosts.

For programmatic SEO builds on new domains, this has direct implications. The first 13 months are structurally disadvantaged regardless of content quality. The evidence-builder strategy — winning achievable queries first to accumulate NavBoost signals, then competing for harder queries — is not just good practice. It is the only viable path given how these systems actually work.

What This Means for Practitioners


The gap between public guidance and engineering reality does not mean Google's public advice is worthless. Much of it is directionally correct: create genuinely useful content, build real authority, maintain technical health. The problem is that the advice is incomplete, and the incompleteness creates misallocation of resources.

Sources: Synthesis of the API leak, DOJ testimony, and public statements, cross-referenced

Public guidance: "Create helpful content"
What it misses: Helpfulness is measured by click satisfaction (NavBoost), content effort (the contentEffort LLM scorer), and pairwise comparison against competitors — not by a single "helpfulness" metric. Notably, Google quietly dropped its "written by people, for people" language from its guidelines, acknowledging that AI-generated content is not inherently penalized if it satisfies these signals.
Practical correction: Optimize for satisfied clicks and genuine content depth. Pairwise quality means your content only needs to beat the specific competition, not achieve abstract quality.

Public guidance: "Links aren't that important"
What it misses: 11+ PageRank variants, a three-tier link quality system, and separate scoring for internal vs. external links. Links from Base-tier pages carry full signal; Landfills-tier links carry minimal signal.
Practical correction: A few high-quality links from high-engagement pages outperform many links from low-tier pages. Focus on earning links from sites that themselves receive traffic.

Public guidance: "E-E-A-T is not a ranking factor"
What it misses: E-E-A-T criteria are the training objective for RankEmbed quality models: 80+ independent features across document, domain, and entity levels.
Practical correction: Build the entity (author and brand recognition across platforms) and the evidence (contentEffort inputs: original data, unique images, linguistic complexity). The cosmetic signals (author bios, "About" pages) matter far less than the structural ones.

Public guidance: "Keep content fresh"
What it misses: Google distinguishes substantive updates from cosmetic edits via document fingerprinting (freshByDocFp) and byline date cross-referencing (bylineDateConfidence).
Practical correction: Date changes without content changes are detected and ignored. Genuine content improvement — adding new data, updating outdated information — triggers lastSignificantUpdate.

Public guidance: "No sandbox for new sites"
What it misses: hostAge explicitly sandboxes new hosts. NavBoost requires 13 months of click history. New domains start with zero authority signals.
Practical correction: New domains face a structural disadvantage for 12-13 months. Plan for it: target achievable queries first, build NavBoost signals through genuine engagement, and accumulate entity trust before competing on high-difficulty queries.

Mike King's counter-perspective after analyzing the leak remains the most grounded summary: despite the revelations, the fundamental practice remains unchanged — "build websites and content that people want to visit, spend time on, and link to." The leak does not change what to do. It changes why it works and how much to invest in each signal.

The operating principle

Treat Google's public guidance as directionally correct but strategically incomplete. When a spokesperson says "don't worry about X," check whether the leaked documentation contains modules measuring X. When they say "focus on quality," ask which of the 80+ quality signals you are specifically losing the pairwise comparison on. The gap is not a reason for cynicism. It is a competitive advantage for practitioners who read the engineering documentation, not just the blog posts.