Passage Rank, Retrieval Augmented Generation, and The Future of SEO

Or: How I Learned to Stop Worrying and Love the Robot

I have come to believe that understanding Retrieval Augmented Generation (RAG), structured data such as Schema.org, and the way that LLMs retrieve and store data is critical to the future of SEO.

I was introduced to the concept of Retrieval Augmented Generation at Pubcon in early 2023 in a presentation from Mike King (iPullRank). Mike wrote it up in a Search Engine Land article as well: “How Search Generative Experience works and why retrieval-augmented generation is our future.”

Ever since that presentation I have been trying to wrap my brain around the importance of structured data, AI, and retrieval augmented generation for SEO.

This weekend I am attending a conference called SEO Spring Training.

Unlike more mainstream conferences, this one is almost entirely about SEO, and audience members are expected to be familiar with SEO concepts. That lets speakers skip the preamble and jump right in.

Many of the speakers are SEOs that I follow online. The topics have included both technical and content SEO and are at a much higher level than the conferences I traditionally attend.

They are also a bit edgier, so some of the sessions I have attended are well outside the SEO conference mainstream and might be considered a little bit gray (black as midnight in one case).

So far none of the sessions have specifically addressed my interest in how we as SEOs can continue to work in the new search paradigm where the top of the search engine results page (SERP) is either the output of LLMs, or strongly informed by AI.

A number of the presentations at SEO Spring Training have, however, given me ideas that sharpen my thinking about how to create content for search driven by retrieval augmented generation.

Just as SEOs freaked out when rich snippets were added to the search engine results pages, many are proclaiming the “Death of SEO” yet again, this time because LLMs and retrieval augmented generation (RAG) produce the first answers a user might see.

Unlike the Chicken Littles, I believe that this is still a winnable game. It is therefore incumbent on SEO practitioners to learn as much as they can about AI, Retrieval Augmented Generation, LLMs, and structured data such as Schema.org.

With that in mind, I decided to ask Google Gemini what it thought.

My question:

Can you please explain the concept of “passage rank” and include examples? I’m trying to understand how best to structure content so that it is most likely to be surfaced in retrieval augmented generation (RAG).

Gemini’s Answer:

Passage ranking is a system introduced by Google in 2021 that analyzes individual sections of a webpage, rather than just the entire page itself. This helps Google understand the specific topic or question a particular passage is addressing.

Here’s how it works:

  • Traditionally, Google ranked entire webpages based on keywords and overall relevance to a search query.
  • With passage ranking, Google’s AI can now analyze a webpage and identify specific sections (passages) that directly answer a user’s search query.

Example:

Let’s say you search for “how to change a tire.” In the past, Google might have surfaced webpages that talk about general car maintenance, even if only a small portion of the page discussed changing a tire.

With passage ranking, Google can identify the specific passage within that webpage that talks about tire changing and rank that passage higher in the search results. This gives you a more relevant and direct answer to your search query.

How to structure content for RAG (Retrieval Augmented Generation):

While RAG is a complex system, understanding passage ranking can help with structuring your content. Here are some tips:

  • Clear and concise headings: Use clear headings and subheadings to break down your content into distinct sections. This helps Google identify the topics covered in each passage.
  • Topical focus: Keep each passage focused on a single topic or subtopic. This makes it easier for Google to understand the relevance of each passage to a specific search query.
  • Direct answers: When answering questions, aim to provide clear and concise answers within a passage. This increases the chance of your passage being surfaced for RAG.

By structuring your content with clear sections and focusing on specific topics, you can increase the chances of your content being relevant for passage ranking and potentially used in RAG for generating informative answers.
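To make the heading and topical-focus tips concrete, here is a minimal Python sketch. It is a toy illustration only and not how Google or any real RAG system works: it assumes headings are lines starting with "#", and it scores each passage by simple word overlap with the query. The function names and the sample page are made up for the example.

```python
import re


def split_into_passages(text):
    """Split text into (heading, body) passages.

    Assumption: a heading is any line that starts with '#'.
    """
    passages = []
    heading, body = None, []
    for line in text.splitlines():
        if line.startswith("#"):
            # A new heading closes the passage collected so far.
            if heading is not None or body:
                passages.append((heading, " ".join(body).strip()))
            heading = line.lstrip("#").strip()
            body = []
        elif line.strip():
            body.append(line.strip())
    if heading is not None or body:
        passages.append((heading, " ".join(body).strip()))
    return passages


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, passage_text):
    """Toy relevance: fraction of query words found in the passage."""
    query_words = tokenize(query)
    if not query_words:
        return 0.0
    return len(query_words & tokenize(passage_text)) / len(query_words)


def best_passage(query, text):
    """Return the (heading, body) passage with the highest toy score."""
    passages = split_into_passages(text)
    return max(passages, key=lambda p: score(query, f"{p[0] or ''} {p[1]}"))


page = """# Car maintenance basics
Check your oil every month and keep the fluids topped up.

# How to change a tire
Loosen the lug nuts, jack up the car, swap the tire, and tighten the nuts.
"""

heading, body = best_passage("how to change a tire", page)
print(heading)
print(body)
```

The point of the sketch is the shape of the problem: a page that keeps one topic under each heading splits cleanly into passages, and the tire passage can be picked out without the rest of the page diluting it.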

Conclusion, for now...

It really is a lot like winning the rich snippet.

Make sure your paragraphs are focused on a single topic, and answer the question in a concise way.

At SEO Spring Training, Simon Cornelius talked about two concepts that brought it home for me.

One was the idea that each heading (h1, h2) should have a “Topic Sentence.” We should all remember the topic sentence from high school English class.

The other was the concept of a “Content Window”: the idea that search engines can only understand passages of 30-50 words. I haven’t fact-checked this, but it tracks with other concepts that have been out there for a while.
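If you want to audit your own pages against that idea, a rough Python sketch might look like the following. The 30-50 word range is the unverified figure from the talk, not a documented search engine limit, and the function names and sample sections are hypothetical.

```python
def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def check_content_window(sections, low=30, high=50):
    """Report whether each section's opening paragraph fits the window.

    sections: list of (heading, opening_paragraph) pairs.
    low/high: assumed window bounds in words (unverified 30-50 figure).
    Returns a list of (heading, word_count, fits_window) tuples.
    """
    report = []
    for heading, paragraph in sections:
        n = word_count(paragraph)
        report.append((heading, n, low <= n <= high))
    return report


sections = [
    ("How to change a tire",
     "To change a tire, loosen the lug nuts, raise the car with a jack, "
     "remove the flat, mount the spare, and tighten the nuts in a star "
     "pattern before lowering the car back to the ground."),
    ("Why tires go flat", "Punctures."),
]

for heading, n, fits in check_content_window(sections):
    status = "fits" if fits else "outside window"
    print(f"{heading}: {n} words ({status})")
```

Treat the output as a prompt to reread a paragraph, not as a pass or fail grade.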

So, if you want to win in an AI, LLM, RAG world, think like a search engine, and go back to school.

And, learn to love the robot.
