Recursive Retrieval for RAG: Implementation With LlamaIndex

Learn how to implement recursive retrieval in RAG systems with LlamaIndex to improve the accuracy and relevance of retrieved information, especially for large document collections.

Updated Nov 13, 2024 · 8 min read

In many RAG applications, the retrieval process is fairly simple. Documents are typically split into chunks, converted into embeddings, and stored in a vector database. When a query comes in, the system returns the top-k chunks, ranked by how similar their embeddings are to the query's embedding.
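To make the baseline concrete, here is a minimal sketch of top-k retrieval by cosine similarity. The three-dimensional vectors, the chunk ids, and the helper names `cosine_similarity` and `top_k` are all made up for illustration; a real system would get its embeddings from an embedding model and would query a vector database instead of a Python dict.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, chunk_embeddings, k=2):
    """Return the k (chunk_id, score) pairs most similar to the query, best first."""
    scored = [
        (chunk_id, cosine_similarity(query_embedding, embedding))
        for chunk_id, embedding in chunk_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, not produced by any real model
chunk_embeddings = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.8, 0.6, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.1, 0.0]

for chunk_id, score in top_k(query, chunk_embeddings, k=2):
    print(chunk_id, round(score, 3))
```

Every chunk competes with every other chunk here, regardless of which document it came from. That is the property recursive retrieval changes.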

This approach has drawbacks, though, especially with large collections. Individual chunks can be ambiguous out of context, and the system may not always pull the most relevant information, which leads to less accurate results.

Recursive retrieval is designed to improve retrieval accuracy by exploiting the structure of the documents. Instead of retrieving chunks directly, it first retrieves relevant summaries and then drills down into the corresponding chunks, which makes the results more relevant.

In this article, we explain recursive retrieval and show you how to implement it step by step with LlamaIndex.

What Is Recursive Retrieval?

Instead of just embedding raw document chunks and retrieving them by similarity, recursive retrieval first embeds summaries of the documents and links each summary to the chunks of the full document. When a query comes in, the system first retrieves the relevant summaries and then searches within the chunks linked to them.

This gives the retrieval system more context before it returns the final chunks, so it can find relevant information more reliably.
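The two-stage idea can be sketched in a few lines of plain Python. This is only an illustration under strong simplifying assumptions: `overlap_score` counts shared words as a crude stand-in for embedding similarity, and the documents, summaries, and chunks below are invented placeholders rather than real data.

```python
def overlap_score(query, text):
    """Count shared lowercase words; a crude stand-in for embedding similarity."""
    return len(set(query.lower().split()) & set(text.lower().split()))


# Hypothetical documents: each has one summary linked to its own chunks
documents = {
    "doc_kids": {
        "summary": "reflections on becoming a parent and having kids",
        "chunks": [
            "before kids I was afraid of having kids",
            "happy moments after kids outnumber those before",
        ],
    },
    "doc_money": {
        "summary": "how people lose money and time through bad investments and fake work",
        "chunks": [
            "investing bypasses the alarms that stop overspending",
            "fake work is the most dangerous way to lose time",
        ],
    },
}


def two_stage_retrieve(query, documents, top_chunks=1):
    """Stage 1 picks a document by its summary; stage 2 ranks only that document's chunks."""
    best_doc = max(
        documents, key=lambda doc_id: overlap_score(query, documents[doc_id]["summary"])
    )
    ranked = sorted(
        documents[best_doc]["chunks"],
        key=lambda chunk: overlap_score(query, chunk),
        reverse=True,
    )
    return best_doc, ranked[:top_chunks]


print(two_stage_retrieve("should I have kids", documents))
```

Because stage 2 only looks inside the document selected in stage 1, chunks from unrelated documents can no longer crowd out the relevant ones.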

Recursive Retrieval Implementation With LlamaIndex

In this section, we walk you through implementing recursive retrieval with LlamaIndex step by step, from loading the documents to running queries with the recursive retriever.

Step 1: Load and prepare the documents

First, we load the documents with SimpleDirectoryReader. Each document gets a title and metadata (e.g., its category) to make later filtering easier. The loaded documents are then stored in a dictionary keyed by title so they are easy to access.

from llama_index.core import SimpleDirectoryReader

# Document titles and metadata
article_titles = ["How to Do Great Work", "Having Kids", "How to Lose Time and Money"]
article_metadatas = {
    "How to Do Great Work": {
        "category": "self-help",
    },
    "Having Kids": {
        "category": "self-help",
    },
    "How to Lose Time and Money": {
        "category": "self-help",
    },
}

# Load documents and update with metadata
docs_dict = {}
for title in article_titles:
    doc = SimpleDirectoryReader(
        input_files=[f"llamaindex-data/{title}.txt"]
    ).load_data()[0]
    doc.metadata.update(article_metadatas[title])
    docs_dict[title] = doc

docs_dict

For readability, I've truncated the output below:

{'How to Do Great Work': Document(id_='e26a2fcc-77d2-43e8-968b-f893944907dc', embedding=None, metadata={'file_path': 'llamaindex-data/How to Do Great Work.txt', 'file_name': 'How to Do Great Work.txt', 'file_type': 'text/plain', 'file_size': 59399, 'creation_date': '2024-09-18', 'last_modified_date': '2024-09-18', 'category': 'self-help'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='July 2023\\n\\nIf you collected lists of techniques for doing great work in a lot of different fields, what would the intersection look like? I decided to find out by making it.\\n\\nPartly my goal was to create a guide that could be used by someone working in any field. But I was also curious about the shape of the intersection. And one thing this exercise shows is that it does have a definite shape; it\\'s not just a point labelled "work hard."\\n\\nThe following recipe assumes you\\'re very ambitious.\\n\\n\\n\\n\\n\\nThe first step is to decide what to work on. The work you choose needs to have three qualities: it has to be something you have a natural aptitude for, that you have a deep interest in, and that offers scope to do great work.\\n\\nIn practice you don\\'t have to worry much about the third criterion. Ambitious people are if anything already too conservative about it. So all you need to do is find something you have an aptitude for and great interest in. [1]\\n\\nThat sounds straightforward, but it\\'s often quite difficult. When you\\'re young you don\\'t know what you\\'re good at or what different kinds of work are like. Some kinds of work you end up doing may not even exist yet. So while some people know what they want to do at 14, most have to figure it out.\\n\\nThe way to figure out what to work on is by working. 
If you\\'re not sure what to work on, guess. But pick something and get going. You\\'ll probably guess wrong some of the time, but that\\'s fine. It\\'s good to know about multiple things; some of the biggest discoveries come from noticing connections between different fields.\\n\\n
…
(truncated)

Step 2: Set up the LLM and chunking

Next, we initialize the large language model (LLM) with OpenAI's GPT-4o mini and configure a sentence splitter to break the documents into smaller chunks for embedding. We also set up a callback manager to trace the process.

from llama_index.llms.openai import OpenAI
from llama_index.core.callbacks import LlamaDebugHandler, CallbackManager
from llama_index.core.node_parser import SentenceSplitter

# Initialize LLM and chunk splitter
llm = OpenAI(model="gpt-4o-mini")
callback_manager = CallbackManager([LlamaDebugHandler()])
splitter = SentenceSplitter(chunk_size=256)
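
To give a feel for what sentence-aware chunking does, here is a toy splitter written from scratch. It is not how SentenceSplitter is implemented: the function name `split_into_chunks` and the `max_words` parameter are made up for this sketch, it counts words rather than tokens, and it has no overlap between chunks, whereas SentenceSplitter measures chunk_size in tokens and supports a chunk_overlap setting.

```python
import re


def split_into_chunks(text, max_words=16):
    """Pack whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words still becomes its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the budget
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "The way to figure out what to work on is by working. "
    "If you're not sure what to work on, guess. "
    "But pick something and get going."
)
for chunk in split_into_chunks(sample, max_words=16):
    print(repr(chunk))
```

Keeping sentences intact means a chunk never ends mid-thought, which tends to make the embedded text easier to match against a query.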

Step 3: Build vector indexes and generate summaries

For each document, we build a vector index that later lets us retrieve relevant chunks by similarity. We also have the LLM generate a summary of each document and store each summary as an IndexNode whose index_id is the document title.

from pathlib import Path

from llama_index.core import VectorStoreIndex, SummaryIndex
from llama_index.core.schema import IndexNode

# Define top-level nodes and vector retrievers
nodes = []
vector_query_engines = {}
vector_retrievers = {}

for title in article_titles:
    # build vector index
    vector_index = VectorStoreIndex.from_documents(
        [docs_dict[title]],
        transformations=[splitter],
        callback_manager=callback_manager,
    )
    
    # define query engines
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    vector_query_engines[title] = vector_query_engine
    vector_retrievers[title] = vector_index.as_retriever(similarity_top_k=3)
    # save summaries
    out_path = Path("summaries") / f"{title}.txt"
    if not out_path.exists():
        # use LLM-generated summary
        summary_index = SummaryIndex.from_documents(
            [docs_dict[title]], callback_manager=callback_manager
        )
        summarizer = summary_index.as_query_engine(
            response_mode="tree_summarize", llm=llm
        )
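        # Top-level await works in a notebook; in a plain script,
        # call the synchronous summarizer.query(...) instead
        response = await summarizer.aquery(
            f"Give me a summary of {title}"
        )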
        article_summary = response.response
        Path("summaries").mkdir(exist_ok=True)
        with open(out_path, "w") as fp:
            fp.write(article_summary)
    else:
        with open(out_path, "r") as fp:
            article_summary = fp.read()
    print(f"**Summary for {title}: {article_summary}")
    node = IndexNode(text=article_summary, index_id=title)
    nodes.append(node)

**********
Trace: index_construction
**********
**Summary for How to Do Great Work: The essence of doing great work revolves around a few key principles. First, it's crucial to choose a field that aligns with your natural aptitudes and deep interests, as this will drive your motivation and creativity. Engaging in your own projects and maintaining a sense of excited curiosity are vital for discovering new ideas and making significant contributions.
Learning enough to reach the frontiers of knowledge in your chosen field allows you to identify gaps and explore them, often leading to innovative breakthroughs. Hard work is essential, but it should be fueled by genuine interest rather than mere diligence. Consistency and the willingness to embrace challenges, including the risk of failure, are important for growth and discovery.
Collaboration with high-quality colleagues can enhance your work, as they can provide insights and encouragement. Maintaining morale is also crucial; a positive mindset can help you navigate setbacks and keep you focused on your goals.
Ultimately, curiosity serves as the driving force behind great work, guiding you through the process of exploration and discovery. By nurturing your curiosity and being open to new experiences, you can uncover unique opportunities and make meaningful contributions in your field.
**********
Trace: index_construction
**********
**Summary for Having Kids: The piece reflects on the author's transformation in perspective regarding parenthood. Initially apprehensive about having children, viewing parents as uncool and burdensome, the author experiences a profound shift after becoming a parent. The arrival of their first child triggers protective instincts and a newfound appreciation for children, leading to genuine joy in parenting moments that were previously overlooked. 
The author acknowledges the challenges of parenthood, such as reduced productivity and ambition, as well as the necessity of adapting to a child's schedule. Despite these challenges, the author finds that the happiness and meaningful moments shared with children far outweigh the difficulties. The narrative emphasizes that while parenting can be demanding, it also brings unexpected joy and fulfillment, ultimately leading to a richer life experience.
**********
Trace: index_construction
**********
**Summary for How to Lose Time and Money: The piece discusses the author's reflections on wealth and time management after selling a startup. It emphasizes that losing wealth often stems from poor investments rather than excessive spending, as the latter triggers alarms in our minds. The author highlights the need to develop new awareness to avoid bad investments, which can be less obvious than overspending on luxuries. Similarly, when it comes to time, the most significant loss occurs not through leisure activities but through engaging in unproductive work that feels legitimate, like managing emails. The author argues that modern complexities require us to recognize and avoid these deceptive traps that mimic productive behavior but ultimately lead to wasted time.

As you can see, we now have three nodes, each holding the summary of one document. We also have vector_retrievers, which maps each document title to a retriever over that document's chunk vectors.

print(nodes)
print('------')
print(vector_retrievers)

[IndexNode(id_='406d9927-c9e2-486f-9fc5-111efefc1649', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The essence of doing great work revolves around a few key principles. First, it's crucial to choose a field that aligns with your natural aptitudes and deep interests, as this will drive your motivation and creativity. Engaging in your own projects and maintaining a sense of excited curiosity are vital for discovering new ideas and making significant contributions.\\n\\nLearning enough to reach the frontiers of knowledge in your chosen field allows you to identify gaps and explore them, often leading to innovative breakthroughs. Hard work is essential, but it should be fueled by genuine interest rather than mere diligence. Consistency and the willingness to embrace challenges, including the risk of failure, are important for growth and discovery.\\n\\nCollaboration with high-quality colleagues can enhance your work, as they can provide insights and encouragement. Maintaining morale is also crucial; a positive mindset can help you navigate setbacks and keep you focused on your goals.\\n\\nUltimately, curiosity serves as the driving force behind great work, guiding you through the process of exploration and discovery. By nurturing your curiosity and being open to new experiences, you can uncover unique opportunities and make meaningful contributions in your field.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='How to Do Great Work', obj=None),
 IndexNode(id_='8007fdd2-6617-4a76-95d7-79efef0700e7', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The piece reflects on the author's transformation in perspective regarding parenthood. Initially apprehensive about having children, viewing parents as uncool and burdensome, the author experiences a profound shift after becoming a parent. The arrival of their first child triggers protective instincts and a newfound appreciation for children, leading to genuine joy in parenting moments that were previously overlooked. \\n\\nThe author acknowledges the challenges of parenthood, such as reduced productivity and ambition, as well as the necessity of adapting to a child's schedule. Despite these challenges, the author finds that the happiness and meaningful moments shared with children far outweigh the difficulties. The narrative emphasizes that while parenting can be demanding, it also brings unexpected joy and fulfillment, ultimately leading to a richer life experience.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='Having Kids', obj=None),
 IndexNode(id_='7e4dd169-eb28-4b2f-8a1a-ca1c5b85ac30', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The piece discusses the author's reflections on wealth and time management after selling a startup. It emphasizes that losing wealth often stems from poor investments rather than excessive spending, as the latter triggers alarms in our minds. The author highlights the need to develop new awareness to avoid bad investments, which can be less obvious than overspending on luxuries. Similarly, when it comes to time, the most significant loss occurs not through leisure activities but through engaging in unproductive work that feels legitimate, like managing emails. The author argues that modern complexities require us to recognize and avoid these deceptive traps that mimic productive behavior but ultimately lead to wasted time.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='How to Lose Time and Money', obj=None)]
 ------
 {'How to Do Great Work': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x330afeeb0>,
 'Having Kids': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x33129c7c0>,
 'How to Lose Time and Money': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x32e8929a0>}

Step 4: Build a top-level vector index

Once we have the summaries (nodes), we can build a top-level vector index and a retriever (top_vector_retriever). This index uses the summaries as the entry point for retrieval: it helps us find the most relevant summary (only the single best match here, since similarity_top_k=1) before we look at the detailed documents.

# Build top-level vector index from summary nodes
top_vector_index = VectorStoreIndex(
    nodes, transformations=[splitter], callback_manager=callback_manager
)

# Set up a retriever for the top-level summaries
top_vector_retriever = top_vector_index.as_retriever(similarity_top_k=1)
top_vector_retriever

<llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x32db715b0>

Step 5: Set up recursive retrieval

Now that we have both the top-level retriever and the per-document retrievers, we can set up the recursive retriever. The system first retrieves the relevant summaries and then searches the chunks of the matching documents by relevance.

from llama_index.core.retrievers import RecursiveRetriever

# Combine top-level retriever with individual document retrievers
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": top_vector_retriever, **vector_retrievers},
    verbose=True,
)
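
The routing behind this setup can be pictured with a small stand-alone sketch. It is a simplified mental model, not LlamaIndex's implementation: the retrievers are plain functions that return fixed results, nodes are plain dicts, and the names `make_static_retriever` and `follow_references` are invented for the example. The real RecursiveRetriever does more than this.

```python
def make_static_retriever(nodes):
    """Build a fake retriever that ignores the query and returns fixed nodes."""
    def retrieve(query):
        return nodes
    return retrieve


def follow_references(query, retriever_dict, root_id):
    """Start at root_id; whenever a node points to another retriever, descend into it."""
    results = []
    for node in retriever_dict[root_id](query):
        target = node.get("index_id")
        if target is not None and target in retriever_dict:
            # Summary node: follow its link to the per-document retriever
            results.extend(follow_references(query, retriever_dict, target))
        else:
            # Plain text node: this is a final result
            results.append(node["text"])
    return results


# Hypothetical data mirroring the structure above: "vector" is the root id,
# and the summary node's index_id matches a key in the retriever dict
retriever_dict = {
    "vector": make_static_retriever(
        [{"text": "summary of Having Kids", "index_id": "Having Kids"}]
    ),
    "Having Kids": make_static_retriever(
        [{"text": "chunk 1"}, {"text": "chunk 2"}]
    ),
}

print(follow_references("should I have kids?", retriever_dict, "vector"))
```

This is why each summary's index_id has to match a key in retriever_dict: the id is the link that tells the retriever where to look next.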

Step 6: Run recursive retrieval queries

Finally, we can use our recursive retriever to run a few sample queries.

# Run recursive retriever on sample queries
result = recursive_retriever.retrieve("should I have kids?")
for res in result:
    print(res.node.get_content())

Retrieving with query id None: should I have kids?
Retrieved node with id, entering: Having Kids
Retrieving with query id Having Kids: should I have kids?
Retrieving text node: Do you have so little to spare?
And while having kids may be warping my present judgement, it hasn't overwritten my memory. I remember perfectly well what life was like before. Well enough to miss some things a lot, like the ability to take off for some other country at a moment's notice. That was so great. Why did I never do that?
See what I did there? The fact is, most of the freedom I had before kids, I never used. I paid for it in loneliness, but I never used it.
I had plenty of happy times before I had kids. But if I count up happy moments, not just potential happiness but actual happy moments, there are more after kids than before. Now I practically have it on tap, almost any bedtime.
People's experiences as parents vary a lot, and I know I've been lucky. But I think the worries I had before having kids must be pretty common, and judging by other parents' faces when they see their kids, so must the happiness that kids bring.
Retrieving text node: December 2019
Before I had kids, I was afraid of having kids. Up to that point I felt about kids the way the young Augustine felt about living virtuously. I'd have been sad to think I'd never have children. But did I want them now? No.
If I had kids, I'd become a parent, and parents, as I'd known since I was a kid, were uncool. They were dull and responsible and had no fun. And while it's not surprising that kids would believe that, to be honest I hadn't seen much as an adult to change my mind. Whenever I'd noticed parents with kids, the kids seemed to be terrors, and the parents pathetic harried creatures, even when they prevailed.
When people had babies, I congratulated them enthusiastically, because that seemed to be what one did. But I didn't feel it at all. "Better you than me," I was thinking.
Now when people have babies I congratulate them enthusiastically and I mean it. Especially the first one. I feel like they just got the best gift in the world.
Retrieving text node: Which meant I had to finish or I'd be taking away their trip to Africa. Maybe if I'm really lucky such tricks could put me net ahead. But the wind is there, no question.
On the other hand, what kind of wimpy ambition do you have if it won't survive having kids? Do you have so little to spare?
And while having kids may be warping my present judgement, it hasn't overwritten my memory. I remember perfectly well what life was like before. Well enough to miss some things a lot, like the ability to take off for some other country at a moment's notice. That was so great. Why did I never do that?
See what I did there? The fact is, most of the freedom I had before kids, I never used. I paid for it in loneliness, but I never used it.
I had plenty of happy times before I had kids. But if I count up happy moments, not just potential happiness but actual happy moments, there are more after kids than before. Now I practically have it on tap, almost any bedtime.

result = recursive_retriever.retrieve("How to buy more time?")
for res in result:
    print(res.node.get_content())

Retrieving with query id None: How to buy more time?
Retrieved node with id, entering: How to Lose Time and Money
Retrieving with query id How to Lose Time and Money: How to buy more time?
Retrieving text node: Which is why people trying to sell you expensive things say "it's an investment."
The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince. I'd start to feel uncomfortable after sitting on a sofa watching TV for 2 hours, let alone a whole day.
Retrieving text node: The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince. I'd start to feel uncomfortable after sitting on a sofa watching TV for 2 hours, let alone a whole day.
And yet I've definitely had days when I might as well have sat in front of a TV all day — days at the end of which, if I asked myself what I got done that day, the answer would have been: basically, nothing.
Retrieving text node: Investing bypasses those alarms. You're not spending the money; you're just moving it from one asset to another. Which is why people trying to sell you expensive things say "it's an investment."
The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince.

Conclusion

By using document summaries and hierarchies, recursive retrieval makes the retrieved chunks more relevant, even with large datasets. For organizations that process large volumes of data, recursive retrieval is a dependable way to build more accurate retrieval systems.


Author

Ryan Ong

Ryan is a lead data scientist specializing in building AI applications with LLMs. He is a PhD candidate in natural language processing and knowledge graphs at Imperial College London, where he also completed his master's degree in computer science. Outside of data science, he writes a weekly Substack newsletter, The Limitless Playbook, where he shares one actionable idea from the world's top thinkers and occasionally writes about core AI concepts.
