
The Search Engineering Dictionary

The vocabulary of SEO has evolved from "keywords and links" to embeddings, retrieval, and agents. These are the concepts you need to know to navigate the LLM era of discovery.

Retrieval & Discovery Concepts

Large Language Model (LLM)#
A neural-network model (e.g., GPT-5, Gemini) trained on massive text corpora and capable of predicting tokens, answering questions, and following instructions.
Token#
The smallest unit an LLM processes (roughly a short word, word fragment, or punctuation mark). Costs, context limits, and output length are all measured in tokens.
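A quick way to see tokenization in action, using OpenAI's open-source tiktoken library (one tokenizer among many; other model families count differently):

```python
import tiktoken

# cl100k_base is the encoding used by many recent OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Costs are measured in tokens, not words.")
print(len(tokens))  # token count; usually higher than the word count
```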
Chunking / Text Splitting#
The process of splitting longer content into smaller, coherent segments before creating embeddings. Effective chunking follows natural content boundaries (e.g., headings and paragraphs) so each vector corresponds to a complete idea that AI systems can reliably retrieve as context.
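A minimal sketch of heading-aware chunking for Markdown content; real pipelines typically also cap chunk length in tokens and overlap adjacent chunks:

```python
import re

def chunk_by_heading(markdown: str) -> list[str]:
    """Split at Markdown headings so each chunk covers one coherent idea."""
    # Zero-width split: break immediately before lines starting with #, ##, or ###.
    parts = re.split(r"(?m)^(?=#{1,3} )", markdown)
    return [p.strip() for p in parts if p.strip()]

doc = "# Rates\nOur APY is 3.30%.\n\n## Eligibility\nOpen to US residents."
for chunk in chunk_by_heading(doc):
    print(repr(chunk))
```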
Embedding / Vector Embedding#
A fixed-length list of numbers that captures a text's meaning, so that similar texts sit close together in a high-dimensional vector space.
Vector Store / Vector DB#
A database (e.g., Pinecone, Supabase, Elasticsearch KNN) optimized for storing embeddings and running “nearest-vector” queries.
Cosine Similarity#
A measure of how closely two embedding vectors point in the same direction: 1 means near-identical meaning, 0 means unrelated. Our goal is to structure your answer so it sits semantically close to the user's question.
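In code, the comparison is a one-line formula. A toy example with 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query     = np.array([0.9, 0.1, 0.0])  # toy "embedding" of a user question
on_topic  = np.array([0.8, 0.2, 0.1])  # your well-structured answer
off_topic = np.array([0.0, 0.1, 0.9])  # an unrelated page

print(cosine_similarity(query, on_topic))   # high: close to 1.0
print(cosine_similarity(query, off_topic))  # low: close to 0.0
```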
Retrieval-Augmented Generation (RAG)#
A workflow that retrieves documents (often via a vector store) and feeds them into the LLM so it cites those facts instead of hallucinating.
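A minimal sketch of the workflow; `vector_store` and `llm` here are hypothetical stand-ins for your actual retrieval and generation clients:

```python
def answer_with_rag(question: str, vector_store, llm, k: int = 3) -> str:
    docs = vector_store.search(question, top_k=k)       # 1. retrieve
    context = "\n\n".join(d.text for d in docs)         # 2. assemble context
    prompt = (
        "Answer using ONLY the sources below, and cite them.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return llm.generate(prompt)                         # 3. grounded generation
```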
Retrieval Weighting#
The scoring logic (e.g., similarity × recency × authority) used to rank documents returned to the LLM during RAG.
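One illustrative weighting function; the 0.6/0.4 and 0.5/0.5 weights and the 90-day half-life are arbitrary examples, not a known production formula:

```python
import math
import time

def retrieval_score(similarity: float, published_at: float, authority: float,
                    half_life_days: float = 90.0) -> float:
    """Blend cosine similarity with freshness decay and a 0-1 authority score."""
    age_days = (time.time() - published_at) / 86_400
    recency = math.exp(-age_days / half_life_days)  # exponential freshness decay
    return similarity * (0.6 + 0.4 * recency) * (0.5 + 0.5 * authority)
```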
Prompt / System Message#
Instructions prepended to the user prompt that set tone, policy, or formatting rules for an LLM conversation.
Function Calling#
When an LLM outputs a JSON payload instructing software to run a tool (e.g., calculateSavingsRate) mid-conversation.
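The payload itself is ordinary JSON. A sketch of the common shape, reusing the calculateSavingsRate example (exact field names vary by provider):

```python
import json

# What the model emits mid-conversation:
tool_call = {
    "name": "calculateSavingsRate",
    "arguments": json.dumps({"principal": 10_000, "apy": 0.033, "years": 5}),
}

# Your software parses the arguments, runs the tool, and feeds the
# result back to the model so the conversation can continue.
args = json.loads(tool_call["arguments"])
print(round(args["principal"] * (1 + args["apy"]) ** args["years"], 2))
```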
Hallucination#
An LLM answer that sounds plausible but is factually wrong because the model filled gaps with guesswork.
Knowledge-Graph Triple#
A fact stored as (subject → predicate → object), e.g., “Citibank → offersRate → 3.30% APY.”
Digital Signature / Provenance (C2PA)#
Cryptographic metadata proving who created a file and whether it has been altered, boosting trust in LLM pipelines.
Fine-tuning / Supervised Fine-tuning (SFT)#
Further training a pre-trained LLM on a smaller, task-specific dataset to adopt domain language, style, or private knowledge.
Reinforcement Learning from Human Feedback (RLHF)#
A training loop that uses human ratings of model outputs as rewards, aligning the LLM with helpful, harmless responses.
Temperature#
A generation parameter (0-2) controlling randomness: lower = deterministic, higher = creative.
Context Window#
The maximum token count an LLM can “remember” per interaction (prompt + response); governs chunk sizing in RAG.
Multimodal LLM#
A model that both consumes and produces text plus other media (images, audio, video), enabling richer search and UX.

S — Structured Signals

Schema.org Markup#
JSON-LD tags that declare entities (products, authors, FAQs, etc.) in a format search engines and LLM scrapers can parse.
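A minimal FAQPage example, built as a Python dict and serialized to JSON-LD ready for a <script type="application/ld+json"> tag:

```python
import json

faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "What is your savings APY?",
        "acceptedAnswer": {"@type": "Answer", "text": "3.30% APY."},
    }],
}
print(json.dumps(faq, indent=2))  # paste the output into your page's <head>
```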
Sitemap / RSS / Atom Feed#
XML files listing URLs and timestamps so crawlers and RAG sync jobs know what's new or updated.
Canonical Tag#
An HTML link that identifies the master version of a page, preventing duplicate URLs from diluting authority.
Robots.txt / Robots Meta#
Rules that allow or block crawling and indexing, managing crawl budget and LLM ingestion cost.
Persistent ID (GUID / GTIN / ISBN)#
A globally unique identifier letting different datasets resolve the same entity without confusion.
Data Contract#
An explicit schema (often versioned) that defines the fields an API or feed will always supply.
Content Delivery Network (CDN)#
A globally distributed cache that serves APIs and structured data closer to users and crawlers, boosting speed and uptime.
Schema Markup Validator#
Tools (e.g., Google Rich Results Test) that check JSON-LD syntax and completeness, lowering Schema Error Rate.
API Versioning#
Labeling API releases (v1, v2…) so consumers—including LLMs—can rely on stable fields while you evolve the contract.

A — Authority & Authenticity

Brand Mention#
Your name in trusted publications without a link; modern ranking systems treat these as “implied links.”
Author Markup / Bylines#
Schema fields or HTML blocks that tie content to a real person, allowing knowledge graphs to attribute expertise.
Third-Party Review#
User or expert ratings hosted off-site (e.g., Trustpilot) that act as external validation of quality.
NAP Consistency#
Exact match of Name, Address, Phone across directories—critical for local SEO and entity disambiguation.
Citation Velocity#
The rate at which new referring domains cite your content; spikes often precede ranking gains.
Knowledge Graph#
A search-engine database of entities and their relationships; accurate representation fast-tracks authority in LLM answers.
Entity Linking / Disambiguation#
Associating a text mention with the correct entity ID (e.g., Apple-fruit vs. Apple-Inc.), preventing authority leaks to competitors.
Citation (in LLM output)#
An inline reference the LLM attaches to a statement—often a URL—that lets users verify the fact.

G — Generative Presence

AI Overview (AIO)#
Google / Bing chat-style summary boxes that answer queries directly, usually citing 2-5 URLs.
Answer Snippet Engineering#
Crafting concise, self-contained paragraphs or bulleted answers so AIOs and voice assistants can quote you verbatim.
Content Licensing#
Deals (e.g., Reddit ↔ OpenAI) that feed your data into model training or private RAG stores.
Embedding Feed#
A scheduled job pushing new content as embeddings to a vector store, ensuring freshness for RAG assistants.
Zero-Shot Retrieval#
The ability of a retriever to surface your page for queries the model never saw during training.
Prompt Engineering#
The craft of writing clear, constrained prompts (plus examples) to steer LLM outputs toward desired style and accuracy.
AI Agent / Autonomous Agent#
An LLM-powered system that chains tools and decisions to complete multi-step tasks with minimal human input.
Zero-/Few-/Multi-Shot Prompting#
Supplying zero, a few, or many examples in the prompt to guide model reasoning and relevance.
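A few-shot prompt in the chat-message format most LLM APIs share; the worked question/answer pair steers format and tone before the real query arrives:

```python
messages = [
    {"role": "system", "content": "Answer in one sentence, number first."},
    # Few-shot example: one worked question/answer pair.
    {"role": "user", "content": "What is a good emergency fund?"},
    {"role": "assistant", "content": "3-6 months of expenses is the common benchmark."},
    # The real query; the model imitates the pattern above.
    {"role": "user", "content": "How much should I save each month?"},
]
```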

E — Experience Depth

Core Web Vitals (LCP, INP, CLS)#
Google's performance metrics for load, interactivity, and visual stability; key to ranking and user happiness.
Interactive Tool / Calculator#
On-page widgets or callable APIs that let users compute something instantly, boosting engagement.
Engagement Signal#
Behavioral metrics like dwell time, scroll depth, and repeat visits; can feed back into RLHF loops.
User Feedback Loop#
Thumbs-up/down or rating data on AI answers used to fine-tune future ranking or generation behavior.
Accessibility & Semantic HTML (WCAG)#
Proper headings, ARIA roles, alt text, and WCAG compliance so humans and multimodal agents can parse content.
Time on Task#
Measures how long users take to complete a goal, or the share who succeed; a finer-grained signal than generic dwell time.
User-Generated Content (UGC)#
Reviews, comments, or forum posts that add fresh, authentic signals of experience.

Measurement & Ops

Schema Error Rate#
The percentage of URLs whose JSON-LD fails validation; high error rates hide facts from crawlers and embeddings.
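The arithmetic is simple; what matters is tracking it continuously:

```python
def schema_error_rate(failing_urls: int, total_urls: int) -> float:
    """Share of crawled URLs whose JSON-LD fails validation."""
    return failing_urls / total_urls if total_urls else 0.0

print(f"{schema_error_rate(12, 480):.1%}")  # 12 failing pages of 480 -> 2.5%
```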
API Uptime#
The percentage of successful (200) responses from your data endpoints; downtime means missing citations.
Embedding Sync Lag#
The time between publishing content and its appearance in your vector store, measured in minutes or hours.
AIO Impression Share#
The fraction of queries where your URL is cited inside AI Overviews versus total AIO results in your niche.
Tool Usage Rate#
The percentage of chat sessions where an agent invokes your calculator/API—a proxy for “Experience Depth.”
Cost per Token#
What you pay an LLM provider for each input/output token; crucial for budgeting large-scale RAG or generation.
Hallucination Rate#
The share of answers where the LLM asserts unverified or false information; must be monitored.
API Latency#
The round-trip time of an API call; high latency degrades real-time chat or agent experiences.

Fan-Out & Related Concepts

Query Fan-Out#
The technique by which Google's AI Overviews break a single query into many narrower sub-queries, run them in parallel, and stitch the results together to reduce hallucination and improve depth.
Sub-Query#
A derivative search created during fan-out (e.g., “cost of living in Brooklyn” from “Is NYC affordable?”); each feeds fresh documents into retrieval.
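A sketch of the fan-out pattern across sub-queries; `decompose` and `retrieve` are hypothetical stand-ins for the model-driven decomposition and per-sub-query retrieval steps:

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(query: str, decompose, retrieve) -> list:
    # e.g., "Is NYC affordable?" -> ["cost of living in Brooklyn", ...]
    sub_queries = decompose(query)
    with ThreadPoolExecutor() as pool:
        results = pool.map(retrieve, sub_queries)  # run sub-queries in parallel
    # Stitch: merge results, dropping documents retrieved more than once.
    seen, merged = set(), []
    for docs in results:
        for doc in docs:
            if doc.url not in seen:
                seen.add(doc.url)
                merged.append(doc)
    return merged
```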

From Definitions to Deployment

The landscape is shifting, and vocabulary is just the start. I help organizations bridge the gap between traditional technical SEO and the generative future.

Book a Strategic Diagnostic