Research Synthesis

The Reality Gap: Public Guidance vs. Engineering Reality

Google's public communications serve two purposes: genuine guidance to help webmasters build better sites, and strategic misdirection to prevent gaming. The gap between the two is not always deception — but the systematic denial of click-based ranking, despite NavBoost being "perhaps the central way web ranking has improved for 15 years," represents the clearest case of deliberate public misdirection. This page maps the gap with specific quotes, specific API modules, and specific sworn testimony.

Compiled by Aviel Fahl

Key Findings

The 2024 Google API leak exposed 2,596 modules containing 14,014 attributes across Google's ranking infrastructure. The DOJ antitrust trial produced sworn testimony from senior Google engineers confirming click data as a core ranking signal. Together, these sources contradict at least eight specific public statements made by Google spokespeople over the past decade. Google denied using click data for ranking while maintaining NavBoost, a 13-month click-signal table confirmed under oath as one of their most important ranking signals. They denied having a domain authority metric while a field called siteAuthority exists in their CompressedQualitySignals module. They denied a sandbox for new sites while a hostAge attribute is explicitly described as used "to sandbox fresh spam in serving time."

At a Glance

2,596 API modules leaked (May 2024). 14,014 ranking attributes documented. 8+ public statements directly contradicted. A 13-month NavBoost click-data window, confirmed in sworn testimony.

The Gap Exists


In May 2024, internal Google API documentation — the Content API Warehouse, 2,500+ pages of Java-based protobuf specs — was published to a public GitHub repository by an automated bot called yoshi-code-bot. Rand Fishkin at SparkToro and Mike King at iPullRank published independent analyses on May 27, 2024.

Google's response was a non-denial: "We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information." They did not claim the documentation was fabricated, inaccurate, or from a non-production system. They declined to specify which elements are accurate, invalid, currently active, or weighted.

Separately, the DOJ v. Google antitrust trial in October 2023 produced sworn testimony from senior engineers including Pandu Nayak (VP of Search) and Eric Lehman (Distinguished Engineer). Court documents stated that "learning from user feedback is perhaps the central way web ranking has improved for 15 years."

The table below maps the core contradictions. Each row pairs a specific public statement with the specific internal evidence that contradicts it.

Sources: SparkToro, iPullRank, Hobo Web (May 2024); DOJ v. Google trial documents (October 2023); SERoundtable, Search Engine Land (various dates)

Topic: Click data
Public statement: Gary Illyes (2019): "Dwell time, CTR, whatever Fishkin's new theory is, those are generally made up crap."
Internal evidence: NavBoost: goodClicks, badClicks, lastLongestClicks, unicornClicks in QualityNavboostCrapsCrapsClickSignals. Confirmed under oath by Pandu Nayak as "one of the most important ranking signals."

Topic: Domain authority
Public statement: John Mueller (various): Google does not have a site-wide authority score.
Internal evidence: A siteAuthority field exists in CompressedQualitySignals. Q* combines it with PageRank and content quality.

Topic: Chrome data
Public statement: Google (various): "We don't use Chrome browsing data for ranking purposes."
Internal evidence: chromeInTotal in QualityNsrNsrData. chrome_trans_clicks in click processing. P* (the Popularity signal) is powered by Chrome visit data.

Topic: Sandbox
Public statement: Google (various): denied the existence of a sandbox for new sites.
Internal evidence: hostAge in PerDocData, explicitly described as used "to sandbox fresh spam in serving time."

Topic: Link equality
Public statement: Gary Illyes (2024): "We need very few links to rank pages... we've made links less important."
Internal evidence: sourceType on AnchorsAnchorSource classifies links into HIGH_QUALITY / MEDIUM_QUALITY / LOW_QUALITY tiers tied to index storage. PageRank threshold bucketing on the uint16 scale: sources above ~51,000 get premium treatment; everything below ~47,000 is treated as equivalent.

Topic: Authorship
Public statement: Gary Illyes (various): Google is "not using authorship" as a ranking factor.
Internal evidence: An isAuthor boolean and authorReputationScore appear in the documentation. The Author Vectors patent (US 10,599,770) enables stylometric authorship detection.

Topic: Penguin real-time
Public statement: Gary Illyes (2016): Penguin 4.0 is "real-time, part of core algorithm."
Internal evidence: A penguinLastUpdate timestamp suggests batch processing. penguinPenalty and penguinTooManySources operate as discrete flags.

Topic: User engagement
Public statement: John Mueller (various): Google does not use "engagement" as a factor for ranking.
Internal evidence: NavBoost operates on a rolling 13-month window of engagement data. The CRAPS module segments clicks by country, device, language, and metro area.

Interpreting the leak

The API documentation is confirmed authentic, but Google correctly notes it is "out-of-context." We cannot determine which attributes are actively weighted, deprecated-but-populated, or experimental. Attributes marked deprecated may still be populated; unmarked attributes may be experimental. The system architecture and module relationships are the most durable findings. Treat specific attribute names as confirmed-to-exist, not confirmed-to-be-active.

Click Signals and User Behavior


This is the widest gap between public guidance and internal reality. For over a decade, multiple Google spokespeople denied or minimized the role of click data in rankings. The leaked documentation and sworn testimony prove click data is central to the ranking system.

Sources: Statements via SERoundtable and SEJ (various dates); systems via iPullRank, Hobo Web, and DOJ trial testimony (2023-2024)

What they said: "Dwell time, CTR... those are generally made up crap. Search is much more simple than people think."
Who said it: Gary Illyes, Reddit AMA, January 2019
What the systems show: NavBoost tracks goodClicks (satisfaction), badClicks (pogo-sticking), lastLongestClicks (session-ending satisfaction), and unicornClicks (high-trust user behavior).

What they said: "We're not using such metrics" (dwell time, time on page).
Who said it: Martin Splitt, 2019
What the systems show: The CRAPS module (Click-Related Active Promotion Signals) segments click data by country, device, language, and metro area. A page can rank differently on mobile in Brazil vs. desktop in Germany.

What they said: Google does not use "engagement" as a factor for ranking.
Who said it: John Mueller, various
What the systems show: NavBoost was confirmed under oath by Pandu Nayak (VP of Search) and Eric Lehman (Distinguished Engineer) as one of the most important ranking signals. It uses a rolling 13-month window.

NavBoost is not a machine learning model. Eric Lehman described it under oath as "essentially a large spreadsheet" storing which URLs were clicked and how often for each query. "Long clicks" (user stays) are positive; "short clicks" (quick returns) are negative. The system also tracks aging buckets — evaluating click performance separately by content age to detect whether a page's engagement is improving or decaying over time.
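Lehman's "large spreadsheet" description suggests something like the toy sketch below. The field names echo the leaked attributes, but the 10-second short-click cutoff and the scoring formula are invented here purely for illustration:

```python
from collections import defaultdict

class ClickRecord:
    """Hypothetical per-(query, url) row in the NavBoost 'spreadsheet'."""
    def __init__(self):
        self.good_clicks = 0          # satisfied visits (user stayed)
        self.bad_clicks = 0           # pogo-sticking (quick return to SERP)
        self.last_longest_clicks = 0  # session-ending satisfaction

table = defaultdict(ClickRecord)      # keyed by (query, url)

def observe(query, url, dwell_seconds, ended_session):
    """Record one click. The 10s short-click cutoff is an assumption."""
    rec = table[(query, url)]
    if dwell_seconds < 10:            # "short click": negative signal
        rec.bad_clicks += 1
    else:                             # "long click": positive signal
        rec.good_clicks += 1
        if ended_session:
            rec.last_longest_clicks += 1

def boost(query, url):
    """Invented scoring: satisfied-click fraction, with extra credit
    for clicks that ended the search session entirely."""
    rec = table[(query, url)]
    total = rec.good_clicks + rec.bad_clicks
    if total == 0:
        return 0.0
    return (rec.good_clicks + rec.last_longest_clicks) / (2 * total)
```

The point is architectural: no model inference at serve time, just a lookup keyed by query and URL, which is exactly why Lehman could call it a spreadsheet.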

Click data is processed through the CompressedQualitySignals module as crapsNewUrlSignals (URL-level), crapsNewHostSignals (host-level), and crapsNewPatternSignals (pattern-level). Google maintains both squashed (production) and unsquashed (experimentation) versions of this data. Click squashing is statistical compression that prevents any single dominant signal from overwhelming rankings: it is the primary defense against CTR manipulation while clicks remain a core input.
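The exact squashing function is not documented. A saturating curve like the hypothetical one below captures the stated intent: more clicks help, but no raw click count can dominate the signal.

```python
def squash(clicks: int, half_weight: float = 100.0) -> float:
    """Hypothetical click squashing. Output grows with clicks but
    saturates toward 1.0, so a million manipulated clicks is worth
    only marginally more than a few thousand organic ones.
    The functional form and half_weight are assumptions."""
    return clicks / (clicks + half_weight)
```

With half_weight=100, the curve reaches 0.5 at 100 clicks and is nearly flat past 10,000, which is the manipulation-resistance property the document describes.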

The trial also revealed Google's ranking distills to two top-level signals: Quality (Q*) and Popularity (P*). The Popularity signal is directly powered by Chrome visit data and user interaction signals from NavBoost — contradicting the public statement that Chrome browsing data is not used for ranking.

Sources: 2024 Google API leak, HexDocs raw documentation

Signal: chromeInTotal
Location: QualityNsrNsrData
Function: Site-level aggregate of Chrome browser views

Signal: chrome_trans_clicks
Location: Click processing pipeline
Function: Chrome transition clicks, used to identify most-visited URLs and generate Sitelinks

Why this matters for practitioners

Click signals are not something you can directly optimize. But understanding that they exist changes strategic priorities. Pages that generate satisfied clicks (long dwell, no SERP return) accumulate NavBoost signals that compound over time. This is the quantitative mechanism behind "make content that genuinely satisfies the query" — not because it is vague aspirational advice, but because a 13-month table of click satisfaction data is directly re-ranking results.


Links and PageRank


Google's public messaging on links has evolved from "one of the top three ranking factors" to "we need very few links to rank pages." The leaked documentation tells a different story: Google maintains at least 11 distinct PageRank variants and a deeply granular anchor text processing pipeline.

Sources: Statements via SERoundtable and Page One Power (various); systems via iPullRank, On-Page.ai, and Hobo Web (May 2024)

What they said: "Links are important, but people overestimate their importance. Not top 3, hasn't been for some time."
Who said it: Gary Illyes, Pubcon, September 2023
What the systems show: 11+ active PageRank variants in PerDocData: pagerank, pagerank0/1/2, homepagePagerankNs, pagerankNs, rawPagerank, crawlerPageRank, IndyRank, ScaledIndyRank, site_pr, setiPagerankWeight.

What they said: "We need very few links to rank pages... we've made links less important."
Who said it: Gary Illyes, SERP Conf, March 2024
What the systems show: sourceType on AnchorsAnchorSource classifies linking pages into three quality tiers (HIGH_QUALITY, MEDIUM_QUALITY, LOW_QUALITY) tied directly to index storage tier. Links from Base-tier pages carry full signal; Landfills-tier links carry minimal signal.

What they said: "I would recommend avoiding link building."
Who said it: John Mueller, 2021
What the systems show: PageRank threshold bucketing: topPrOnsiteAnchorCount / topPrOffdomainAnchorCount treat links from sources above 51,000 PageRank (on the uint16 scale) as qualitatively different from everything below 47,000. Binary premium, not gradual.
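The bucketing can be made concrete with a short sketch. The ~51,000 and ~47,000 cutoffs come from the leak analysis of the uint16-scaled PageRank values; the tier names and the function itself are invented for illustration:

```python
def link_value_bucket(source_pagerank: int) -> str:
    """Illustrative PageRank threshold bucketing. PageRank is reportedly
    stored as a uint16 (0-65535): sources above ~51,000 get premium
    treatment, while everything below ~47,000 is treated as roughly
    equivalent. Tier names here are hypothetical."""
    if not 0 <= source_pagerank <= 65535:
        raise ValueError("PageRank is a uint16 value (0-65535)")
    if source_pagerank > 51000:
        return "premium"
    if source_pagerank >= 47000:
        return "intermediate"
    return "baseline"
```

The notable implication is the flat bottom bucket: a link from a 10,000-PageRank page and one from a 46,000-PageRank page land in the same tier.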

The anchor text processing is far more granular than public guidance suggests. Google maintains completely separate scoring pipelines for internal vs. external links: normalizedScoreFromOffdomain vs. normalizedScoreFromOnsite, with independent counts, volume denominators, and aggregate scores. Redirected links and fragment links are each scored separately. Anchors are deduplicated within source organizations to prevent a single company from flooding anchor signals.

The system also analyzes full context around each anchor: fullLeftContext and fullRightContext capture all terms preceding and following the anchor text. Content position matters: inbodyTargetLink vs. outlinksTargetLink differentiates main content links from sidebar and footer links. And onsiteProminence measures page importance within its own site — but it is not a theoretical PageRank calculation. It is computed by propagating simulated traffic from the homepage and pages with high search-click volume. This is a user-behavior-informed simulation seeded from empirically validated entry points. The implication: Google says clicks don't matter for ranking, but clicks literally seed the simulation that determines internal page importance.
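A minimal sketch of onsiteProminence as described: simulated traffic seeded from the homepage and high-click entry pages, spread along internal links. Only the seeding idea comes from the documentation; the damping factor, iteration count, and update rule below are assumptions.

```python
def onsite_prominence(links, seeds, iterations=20, damping=0.85):
    """links: {page: [internally linked pages]}.
    seeds: {page: initial traffic share} for empirically validated
    entry points (homepage, pages with high search-click volume).
    Simulated visitors start at seeds and follow internal links;
    prominence is the steady-state visit mass per page."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    score = {p: seeds.get(p, 0.0) for p in pages}
    for _ in range(iterations):
        nxt = {p: (1 - damping) * seeds.get(p, 0.0) for p in pages}
        for page, outs in links.items():
            if not outs:
                continue
            share = damping * score[page] / len(outs)
            for target in outs:
                nxt[target] += share
        score = nxt
    return score
```

Structurally this is personalized PageRank with the personalization vector set from real entry points, which is what makes it "user-behavior-informed" rather than purely theoretical.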

New pages inherit homepage authority (homepagePagerankNs) until acquiring individual PageRank. The Nearest Seeds variant seeds calculations from known-trusted pages rather than distributing uniformly — "distance from a known good source" is how Q* measures authority. Homepages themselves are classified into a four-tier trust system via homePageInfo: FULLY_TRUSTED, PARTIALLY_TRUSTED, NOT_TRUSTED, and NOT_HOMEPAGE. Every page on the domain inherits from this tier — a site-wide authority gate by another name, despite Google's denials of a "domain authority" concept.

The practical takeaway

Links have not become less important. What has changed is which links carry signal. The three-tier index tiering system means links from pages with no clicks (Landfills tier) carry minimal value, while links from high-engagement pages (Base tier) carry full value. "We need very few links" is technically correct if those few links are from Base-tier, high-PageRank pages. It is misleading if interpreted as "links barely matter."

Content Quality and Site Authority


Google has publicly stated — repeatedly and through multiple spokespeople — that there is no domain authority metric. The leaked documentation shows a concrete field called siteAuthority inside CompressedQualitySignals, the pre-computed quality gatekeeper module that can disqualify a page before query-time ranking even begins.

Sources: 2024 Google API leak analysis: iPullRank, Hobo Web, SE Ranking

Signal: siteAuthority
Module: CompressedQualitySignals
Function: Combines content quality, click data, and link profile into a site-level authority score

Signal: pandaDemotion
Module: CompressedQualitySignals
Function: Site-wide quality penalty. Operates as algorithmic debt: a ceiling no page-level optimization can overcome

Signal: contentEffort
Module: PerDocData
Function: LLM-based effort estimation for article pages. Quantifies human labor, originality, and resources invested

Signal: OriginalContentScore
Module: PerDocData
Function: Score from 0-512 measuring content uniqueness at the page level

Signal: siteFocusScore
Module: QualityNsrNsrData
Function: How dedicated a site is to a single topic (specialist vs. generalist)

Signal: siteRadius
Module: QualityNsrNsrData
Function: How much an individual page deviates from the site's central theme

Signal: authorityPromotion
Module: CompressedQualitySignals
Function: Boost signal (inverse of demotion). Explicit positive authority weighting

The contentEffort attribute is particularly notable. It is described as an "LLM-based effort estimation for article pages" that quantifies human labor, originality, and resources invested in creating content. Contributing factors include unique images, videos, embedded tools, in-depth content, original data, and linguistic complexity. This is the closest algorithmic proxy for the Experience dimension of E-E-A-T.

Meanwhile, Google's internal quality system uses codenames that reveal a layered evaluation pipeline: chard acts as the initial content classifier (and triggers more rigorous E-E-A-T evaluation if it classifies a page as YMYL), rhubarb measures the quality differential between a specific URL and its parent site, and tofu predicts site-level quality based on content patterns.

Danny Sullivan stated in 2024 that Google does not have a system that says "this is a brand, let's rank it higher." The internal evidence suggests something more nuanced: Copia and Firefly monitor content velocity (the ratio of URLs generated against substantive articles produced), and directFrac measures the fraction of direct traffic to a site — a brand signal. High direct traffic may boost quality scoring. The system does not explicitly favor "brands," but the signals it measures (direct traffic, click satisfaction, authority accumulation) structurally advantage established brands.

Google publicly states they do not penalize AI-generated or automated content — only "scaled content abuse" regardless of production method. Internally, Patent US9767157B2 (N-gram Quality, granted 2017) builds phrase models from known-quality sites using 2-gram through 5-gram frequency patterns. Each phrase is measured by its relative frequency across a site's pages. Template output that produces unnatural phrase distributions — repeated boilerplate, identical sentence structures, shallow variable substitution — gets flagged as low quality. The policy is method-agnostic; the detection system is not.
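A toy version of the patent's idea, using 2-grams only. The 80% sharing threshold and the scoring are invented for illustration; the patent describes richer 2-gram through 5-gram frequency models built from known-quality sites.

```python
from collections import Counter

def bigram_repetition_score(pages):
    """Fraction of distinct bigrams shared across most pages of a site.
    Templated output (shallow variable substitution over boilerplate)
    reuses nearly all of its phrasing page-to-page; genuinely written
    pages share far less. Threshold (80% of pages) is an assumption."""
    page_bigrams = []
    for text in pages:
        words = text.lower().split()
        page_bigrams.append({(a, b) for a, b in zip(words, words[1:])})
    counts = Counter(bg for bgs in page_bigrams for bg in bgs)
    if not counts:
        return 0.0
    shared = sum(1 for c in counts.values() if c >= len(pages) * 0.8)
    return shared / len(counts)
```

This is why the policy can be method-agnostic while the detection is not: an LLM producing varied, information-dense pages scores like human writing, while a mail-merge template scores like spam regardless of who filled it in.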

The Q* quality metric compounds this. Sites scoring below 0.4 on the 0-1 scale are ineligible for rich results — Featured Snippets, People Also Ask, and other SERP features — regardless of their structured data implementation. Google publicly encourages sites to implement structured data for rich result eligibility. Internally, a hard quality gate prevents low-scoring sites from ever appearing in them.
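As a sketch, the reported gate reduces to a simple conjunction. The 0.4 threshold comes from the leak/trial reporting; the function itself is illustrative, not a documented API:

```python
def rich_result_eligible(q_star: float, has_structured_data: bool) -> bool:
    """Hypothetical eligibility gate for rich results. Structured data
    is necessary but not sufficient: a reported hard gate at Q* 0.4
    (on the 0-1 scale) blocks low-scoring sites regardless of markup."""
    return has_structured_data and q_star >= 0.4
```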

E-E-A-T: Marketing Label vs. Engineering Pipeline


Danny Sullivan and Gary Illyes have both repeatedly stated that E-E-A-T is not a ranking factor or score. Sullivan: "E-E-A-T is not a direct ranking factor. It's a concept from the Quality Rater Guidelines." Illyes: Google does not have an internal E-A-T score. Mueller: "There's no single ranking factor you can point to and say it's the deciding factor."

This is technically true and practically misleading. E-E-A-T is indeed not a single signal. It is a marketing label for 80+ independent algorithmic features evaluated at three levels: document, domain, and originator entity.

The DOJ trial revealed the actual engineering pipeline: Quality Raters evaluate pages using E-E-A-T criteria from the Quality Rater Guidelines. Those evaluations become training data for Google's RankEmbed and RankEmbedBERT models. The models learn to predict rater-like quality scores at scale. E-E-A-T is not a direct ranking factor — it is the training objective for the quality scoring model.
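RankEmbed's architecture is not public. A toy regression makes the pipeline concrete anyway: rater-style labels go in, a model that scores unrated pages comes out. The features, training procedure, and every number below are entirely hypothetical.

```python
def train_quality_model(examples, lr=0.01, epochs=500):
    """examples: list of (feature_vector, rater_score in [0, 1]).
    Toy linear model fitted by SGD to predict rater-style quality
    scores. Stands in for RankEmbed, whose real design is unknown."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for feats, label in examples:
            pred = bias + sum(w * f for w, f in zip(weights, feats))
            err = pred - label
            bias -= lr * err
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
    return weights, bias

def predict(model, feats):
    weights, bias = model
    return bias + sum(w * f for w, f in zip(weights, feats))
```

The structural point survives the simplification: raters never score the live index, yet every page gets a rater-like score, because the raters' judgments are the training objective.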

Sources: Statements from multiple Google spokespeople (various dates); pipeline via DOJ trial, API leak, Google patents

What they say: "E-E-A-T is not a ranking factor — it's a concept."
What the pipeline does: Quality Raters evaluate pages using E-E-A-T criteria. Those evaluations train RankEmbed models. The models produce quality scores used in ranking. E-E-A-T is the training signal, not a direct input.

What they say: "There is no E-A-T score."
What the pipeline does: Q* aggregates site/document quality. CompressedQualitySignals contains siteAuthority, pandaDemotion, authorityPromotion. The relationship: E-E-A-T is the goal, Q* is the system, Site_Quality is the score.

What they say: "Authorship is not a ranking factor."
What the pipeline does: An isAuthor boolean and authorReputationScore appear in the API docs. The Author Topic Authority patent (US8458196B1) accumulates per-topic authority scores. The Author Vectors patent (US 10,599,770) enables stylometric detection.

What they say: "Structured data doesn't directly rank."
What the pipeline does: bylineDateConfidence scores byline date accuracy. The Entity-Based Ranking patent (US10235423B2) computes composite scores from knowledge graph entity metrics. Being a recognized entity is a direct ranking input.

Google's only white paper confirming differential E-A-T weighting is "How Google Fights Disinformation" (2019), which states: "Where our algorithms detect that a user's query relates to a YMYL topic, we will give more weight in our ranking systems to factors like our understanding of the authoritativeness, expertise, or trustworthiness of the pages."

The API leak adds a mechanism: the internal classifier chard determines whether a page is YMYL. If chard classifies a page as YMYL, it triggers more rigorous E-E-A-T evaluation. This is the algorithmic gate the white paper describes.

The distinction that matters

"E-E-A-T is not a ranking factor" is a true statement about direct measurement. It is a misleading statement about practical impact. The training objective for Google's quality models is directly derived from E-E-A-T criteria. Saying E-E-A-T is not a ranking factor is like saying "customer satisfaction is not a revenue driver" because it does not appear as a line item on the income statement. It is the input to the system that produces the output.

Freshness and Document History


Google's public guidance on freshness is relatively straightforward: update your content, keep it current, dates matter. The internal systems reveal a significantly more sophisticated evaluation pipeline that distinguishes cosmetic edits from genuine content improvement.

Sources: 2024 Google API leak: iPullRank, Hobo Web, Cyrus Shepard / Zyppy analysis

Signal: lastSignificantUpdate
What it does: Timestamp of the last substantive revision, not the last edit
Why it matters: Explains why updating dates without content changes stopped working circa 2023 (per Cyrus Shepard / Zyppy)

Signal: freshByDocFp
What it does: Document fingerprinting that detects whether actual content changed vs. just timestamps
Why it matters: Cosmetic date changes are detected and ignored

Signal: bylineDateConfidence
What it does: Confidence score for byline date accuracy
Why it matters: Contradictory dates in structured data vs. the visible page degrade the freshness signal

Signal: freshnessDuration
What it does: How long content retains its freshness boost after publication or update
Why it matters: Freshness is a decaying signal, not a permanent state

Signal: syntacticDate / semanticDate
What it does: Date extracted from the URL/title vs. date estimated from the content
Why it matters: Google cross-references multiple date signals to detect manipulation

Google stores only the last 20 versions of a document (via urlHistory / CrawlerChangerateUrlHistory). This contradicts any implication of full history awareness and means the window of observable content evolution is finite.

The Content Freshness Scoring patent (US8549014B2) tracks the age distribution of content within a document — how much is old vs. recently added. It also detects unnatural link acquisition patterns (sudden spikes) as spam signals. Genuine updates that change the age distribution of information on the page improve freshness scores. Date changes alone do not.
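freshnessDuration implies a decay curve. The real curve is undocumented; an exponential with an assumed 90-day half-life illustrates the behavior of a boost that is strong at a substantive update and fades to nothing.

```python
def freshness_boost(days_since_significant_update: float,
                    half_life_days: float = 90.0) -> float:
    """Hypothetical freshness decay: full strength (1.0) at the moment
    of a substantive revision (lastSignificantUpdate), halving every
    half_life_days. Both the exponential form and the 90-day half-life
    are assumptions for illustration."""
    return 0.5 ** (days_since_significant_update / half_life_days)
```

The key property is that only lastSignificantUpdate resets the clock: a cosmetic date change leaves days_since_significant_update, and therefore the boost, unchanged.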

The Sandbox That Doesn't Exist


Google has publicly denied the existence of a "sandbox" for new sites — a deliberate trust-building period where new domains are held back from full ranking potential. The leaked documentation directly contradicts this.

Sources: Statements from various Google spokespeople; systems via the 2024 API leak (Hobo Web, iPullRank)

What they said: Google has no sandbox for new sites.
What the documentation shows: hostAge in PerDocData is explicitly described as used "to sandbox fresh spam in serving time." It represents the earliest firstseen date of all pages on a host/domain.

What they said: No special treatment for new domains.
What the documentation shows: RegistrationInfo.createdDate / expiredDate: domain creation and expiration timestamps are stored as ranking-accessible attributes.

What they said: Content quality determines ranking from day one.
What the documentation shows: NavBoost requires a 13-month rolling window of click data. A new domain has zero accumulated click signals, creating an inherent structural disadvantage against sites with established engagement history.

The sandbox effect is real but more nuanced than the SEO community originally theorized. It is not a single deliberate penalty. It is the combined effect of multiple systems that inherently disadvantage new domains: zero NavBoost history (no click signals to boost with), no accumulated backlink signals in the source tier system, no siteAuthority accumulation in CompressedQualitySignals, and the explicit hostAge sandbox flag for new hosts.

For programmatic SEO builds on new domains, this has direct implications. The first 13 months are structurally disadvantaged regardless of content quality. The evidence-builder strategy — winning achievable queries first to accumulate NavBoost signals, then competing for harder queries — is not just good practice. It is the only viable path given how these systems actually work.

What This Means for Practitioners


The gap between public guidance and engineering reality does not mean Google's public advice is worthless. Much of it is directionally correct: create genuinely useful content, build real authority, maintain technical health. The problem is that the advice is incomplete, and the incompleteness creates misallocation of resources.

Sources: Synthesis of the API leak, DOJ testimony, and public statements, cross-referenced

Public guidance: "Create helpful content"
What it misses: Helpfulness is measured by click satisfaction (NavBoost), content effort (the contentEffort LLM scorer), and pairwise comparison against competitors — not by a single "helpfulness" metric. Notably, Google quietly dropped its "written by people, for people" language from its guidelines, acknowledging that AI-generated content is not inherently penalized if it satisfies these signals.
Practical correction: Optimize for satisfied clicks and genuine content depth. Pairwise quality means your content only needs to beat the specific competition, not achieve abstract quality.

Public guidance: "Links aren't that important"
What it misses: 11+ PageRank variants, a three-tier link quality system, and separate scoring for internal vs. external links. Links from Base-tier pages carry full signal; Landfills-tier links carry minimal signal.
Practical correction: A few high-quality links from high-engagement pages outperform many links from low-tier pages. Focus on earning links from sites that themselves receive traffic.

Public guidance: "E-E-A-T is not a ranking factor"
What it misses: E-E-A-T criteria are the training objective for RankEmbed quality models: 80+ independent features across document, domain, and entity levels.
Practical correction: Build the entity (author and brand recognition across platforms) and the evidence (contentEffort inputs: original data, unique images, linguistic complexity). The cosmetic signals (author bios, "About" pages) matter far less than the structural ones.

Public guidance: "Keep content fresh"
What it misses: Google distinguishes substantive updates from cosmetic edits via document fingerprinting (freshByDocFp) and byline date cross-referencing (bylineDateConfidence).
Practical correction: Date changes without content changes are detected and ignored. Genuine content improvement — adding new data, updating outdated information — triggers lastSignificantUpdate.

Public guidance: "No sandbox for new sites"
What it misses: hostAge explicitly sandboxes new hosts. NavBoost requires 13 months of click history. New domains start with zero authority signals.
Practical correction: New domains face a structural disadvantage for 12-13 months. Plan for it: target achievable queries first, build NavBoost signals through genuine engagement, and accumulate entity trust before competing on high-difficulty queries.

Mike King's counter-perspective after analyzing the leak remains the most grounded summary: despite the revelations, the fundamental practice remains unchanged — "build websites and content that people want to visit, spend time on, and link to." The leak does not change what to do. It changes why it works and how much to invest in each signal.

The operating principle

Treat Google's public guidance as directionally correct but strategically incomplete. When a spokesperson says "don't worry about X," check whether the leaked documentation contains modules measuring X. When they say "focus on quality," ask which of the 80+ quality signals you are specifically losing the pairwise comparison on. The gap is not a reason for cynicism. It is a competitive advantage for practitioners who read the engineering documentation, not just the blog posts.