Recursive Retrieval for RAG: Implementation With LlamaIndex

Learn how to implement recursive retrieval in RAG systems using LlamaIndex to improve the accuracy and relevance of retrieved information, especially for large document collections.

Updated Nov 13, 2024 · 8 min read

In many RAG applications, the retrieval process is quite simple. Documents are split into chunks, converted into embeddings, and stored in a vector database. When a query comes in, the system retrieves the top-k chunks ranked by embedding similarity.
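To make the baseline concrete, here is a minimal sketch of top-k retrieval by cosine similarity. It is plain Python, not LlamaIndex code: the three-dimensional embeddings, the chunk ids, and the helper names cosine_similarity and top_k_chunks are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunk_embeddings, k=2):
    """Return the ids of the k chunks most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), chunk_id)
        for chunk_id, embedding in chunk_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]


# Hypothetical embeddings; a real system would get these from an embedding model.
chunk_embeddings = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.7, 0.7, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
query_embedding = [1.0, 0.1, 0.0]

top_chunks = top_k_chunks(query_embedding, chunk_embeddings, k=2)
print(top_chunks)
```

Every chunk competes in the same flat ranking here, regardless of which document it came from.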

This approach has some drawbacks, however, especially with large collections. Individual chunks can be ambiguous out of context, and the system may not always retrieve the most relevant information, which leads to less accurate results.

Recursive retrieval was developed to improve retrieval accuracy by exploiting the structure of the documents. Instead of retrieving chunks directly, it first retrieves relevant summaries and then drills down to the corresponding chunks, which makes the final retrieved results more relevant.

In this article, we'll explain recursive retrieval and walk through a step-by-step implementation using LlamaIndex.

What Is Recursive Retrieval?

Instead of just embedding raw document chunks and retrieving them by similarity, recursive retrieval first embeds summaries of the documents and links each summary to the chunks of the full document. When a query is made, the system first retrieves the relevant summaries, then drills down into the linked chunks to find the related pieces of information.

This gives the retrieval system more context before it returns the final chunks, which helps it find the relevant information.
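The two-stage idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the corpus is invented, and a simple word-overlap count stands in for embedding similarity. The names overlap_score, best_match, and recursive_retrieve are illustrative, not part of any library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def overlap_score(query, text):
    """Toy relevance score: shared word count (a stand-in for embedding similarity)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(text))
    return sum(shared.values())


def best_match(query, candidates):
    """Return the key of the candidate text that scores highest for the query."""
    return max(candidates, key=lambda key: overlap_score(query, candidates[key]))


# Hypothetical corpus: one summary per document, linked to that document's chunks.
summaries = {
    "cooking": "A guide to cooking pasta and making sauce at home.",
    "cycling": "Advice on cycling, bike maintenance and riding safely.",
}
chunks = {
    "cooking": ["Boil the pasta in salted water.", "Simmer the tomato sauce slowly."],
    "cycling": ["Check the bike tires before riding.", "Oil the bike chain every month."],
}


def recursive_retrieve(query):
    # Stage 1: choose the document whose summary best matches the query.
    doc_id = best_match(query, summaries)
    # Stage 2: search only inside the chunks linked to that summary.
    doc_chunks = dict(enumerate(chunks[doc_id]))
    chunk_index = best_match(query, doc_chunks)
    return doc_id, doc_chunks[chunk_index]


doc_id, chunk = recursive_retrieve("How do I oil my bike chain?")
print(doc_id, "->", chunk)
```

The second stage never sees chunks from the other document, which is the property the LlamaIndex implementation below relies on as well.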

Implementing Recursive Retrieval Using LlamaIndex

In this section, we'll walk step by step through implementing recursive retrieval with LlamaIndex, from loading the documents to running queries with the recursive retriever.

Step 1: Load and prepare the documents

First, we load the documents with SimpleDirectoryReader. Each document is identified by its title and given metadata (such as its category) to make later filtering easier. The loaded documents are stored in a dictionary keyed by title for easy access.

from llama_index.core import SimpleDirectoryReader

# Document titles and metadata
article_titles = ["How to Do Great Work", "Having Kids", "How to Lose Time and Money"]
article_metadatas = {
    "How to Do Great Work": {
        "category": "self-help",
    },
    "Having Kids": {
        "category": "self-help",
    },
    "How to Lose Time and Money": {
        "category": "self-help",
    },
}

# Load documents and update with metadata
docs_dict = {}
for title in article_titles:
    doc = SimpleDirectoryReader(
        input_files=[f"llamaindex-data/{title}.txt"]
    ).load_data()[0]
    doc.metadata.update(article_metadatas[title])
    docs_dict[title] = doc

docs_dict

For readability, I'll truncate the output below:

{'How to Do Great Work': Document(id_='e26a2fcc-77d2-43e8-968b-f893944907dc', embedding=None, metadata={'file_path': 'llamaindex-data/How to Do Great Work.txt', 'file_name': 'How to Do Great Work.txt', 'file_type': 'text/plain', 'file_size': 59399, 'creation_date': '2024-09-18', 'last_modified_date': '2024-09-18', 'category': 'self-help'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='July 2023\\n\\nIf you collected lists of techniques for doing great work in a lot of different fields, what would the intersection look like? I decided to find out by making it.\\n\\nPartly my goal was to create a guide that could be used by someone working in any field. But I was also curious about the shape of the intersection. And one thing this exercise shows is that it does have a definite shape; it\\'s not just a point labelled "work hard."\\n\\nThe following recipe assumes you\\'re very ambitious.\\n\\n\\n\\n\\n\\nThe first step is to decide what to work on. The work you choose needs to have three qualities: it has to be something you have a natural aptitude for, that you have a deep interest in, and that offers scope to do great work.\\n\\nIn practice you don\\'t have to worry much about the third criterion. Ambitious people are if anything already too conservative about it. So all you need to do is find something you have an aptitude for and great interest in. [1]\\n\\nThat sounds straightforward, but it\\'s often quite difficult. When you\\'re young you don\\'t know what you\\'re good at or what different kinds of work are like. Some kinds of work you end up doing may not even exist yet. So while some people know what they want to do at 14, most have to figure it out.\\n\\nThe way to figure out what to work on is by working. 
If you\\'re not sure what to work on, guess. But pick something and get going. You\\'ll probably guess wrong some of the time, but that\\'s fine. It\\'s good to know about multiple things; some of the biggest discoveries come from noticing connections between different fields.\\n\\n
…
(truncated)

Step 2: Set up the LLM and chunking

Next, we initialize the large language model (LLM) with OpenAI's GPT-4o mini and configure a sentence splitter that breaks the documents into smaller chunks for embedding. We also set up a callback manager to trace the process.

from llama_index.llms.openai import OpenAI
from llama_index.core.callbacks import LlamaDebugHandler, CallbackManager
from llama_index.core.node_parser import SentenceSplitter

# Initialize LLM and chunk splitter
llm = OpenAI("gpt-4o-mini")
callback_manager = CallbackManager([LlamaDebugHandler()])
splitter = SentenceSplitter(chunk_size=256)
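To give an intuition for what sentence-based chunking does, here is a simplified sketch in plain Python. It is not SentenceSplitter's actual algorithm: the real splitter counts tokens rather than words and can overlap consecutive chunks (its chunk_overlap parameter). The function split_into_chunks and its max_words parameter are hypothetical.

```python
import re


def split_into_chunks(text, max_words=12):
    """Greedily pack whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words becomes its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n_words = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the limit.
        if current and current_len + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "The first step is to decide what to work on. "
    "Pick something and get going. "
    "You will probably guess wrong some of the time. "
    "That is fine."
)
chunks = split_into_chunks(sample, max_words=12)
for chunk in chunks:
    print(repr(chunk))
```

Keeping sentences intact means each chunk stays readable on its own, at the cost of slightly uneven chunk sizes.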

Step 3: Build vector indices and generate summaries

For each document, we build a vector index that lets us retrieve relevant chunks later based on similarity. We also have the LLM generate a summary of each document, and we store each summary in an IndexNode whose index_id is the document title.

from pathlib import Path

from llama_index.core import VectorStoreIndex, SummaryIndex
from llama_index.core.schema import IndexNode

# Define top-level nodes and vector retrievers
nodes = []
vector_query_engines = {}
vector_retrievers = {}

for title in article_titles:
    # build vector index
    vector_index = VectorStoreIndex.from_documents(
        [docs_dict[title]],
        transformations=[splitter],
        callback_manager=callback_manager,
    )
    
    # define query engines
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    vector_query_engines[title] = vector_query_engine
    vector_retrievers[title] = vector_index.as_retriever(similarity_top_k=3)
    # save summaries
    out_path = Path("summaries") / f"{title}.txt"
    if not out_path.exists():
        # use LLM-generated summary
        summary_index = SummaryIndex.from_documents(
            [docs_dict[title]], callback_manager=callback_manager
        )
        summarizer = summary_index.as_query_engine(
            response_mode="tree_summarize", llm=llm
        )
        # Top-level await works in a notebook; in a script, run this inside an async function
        response = await summarizer.aquery(
            f"Give me a summary of {title}"
        )
        article_summary = response.response
        Path("summaries").mkdir(exist_ok=True)
        with open(out_path, "w") as fp:
            fp.write(article_summary)
    else:
        with open(out_path, "r") as fp:
            article_summary = fp.read()
    print(f"**Summary for {title}: {article_summary}")
    node = IndexNode(text=article_summary, index_id=title)
    nodes.append(node)

**********
Trace: index_construction
**********
**Summary for How to Do Great Work: The essence of doing great work revolves around a few key principles. First, it's crucial to choose a field that aligns with your natural aptitudes and deep interests, as this will drive your motivation and creativity. Engaging in your own projects and maintaining a sense of excited curiosity are vital for discovering new ideas and making significant contributions.
Learning enough to reach the frontiers of knowledge in your chosen field allows you to identify gaps and explore them, often leading to innovative breakthroughs. Hard work is essential, but it should be fueled by genuine interest rather than mere diligence. Consistency and the willingness to embrace challenges, including the risk of failure, are important for growth and discovery.
Collaboration with high-quality colleagues can enhance your work, as they can provide insights and encouragement. Maintaining morale is also crucial; a positive mindset can help you navigate setbacks and keep you focused on your goals.
Ultimately, curiosity serves as the driving force behind great work, guiding you through the process of exploration and discovery. By nurturing your curiosity and being open to new experiences, you can uncover unique opportunities and make meaningful contributions in your field.
**********
Trace: index_construction
**********
**Summary for Having Kids: The piece reflects on the author's transformation in perspective regarding parenthood. Initially apprehensive about having children, viewing parents as uncool and burdensome, the author experiences a profound shift after becoming a parent. The arrival of their first child triggers protective instincts and a newfound appreciation for children, leading to genuine joy in parenting moments that were previously overlooked. 
The author acknowledges the challenges of parenthood, such as reduced productivity and ambition, as well as the necessity of adapting to a child's schedule. Despite these challenges, the author finds that the happiness and meaningful moments shared with children far outweigh the difficulties. The narrative emphasizes that while parenting can be demanding, it also brings unexpected joy and fulfillment, ultimately leading to a richer life experience.
**********
Trace: index_construction
**********
**Summary for How to Lose Time and Money: The piece discusses the author's reflections on wealth and time management after selling a startup. It emphasizes that losing wealth often stems from poor investments rather than excessive spending, as the latter triggers alarms in our minds. The author highlights the need to develop new awareness to avoid bad investments, which can be less obvious than overspending on luxuries. Similarly, when it comes to time, the most significant loss occurs not through leisure activities but through engaging in unproductive work that feels legitimate, like managing emails. The author argues that modern complexities require us to recognize and avoid these deceptive traps that mimic productive behavior but ultimately lead to wasted time.
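The caching pattern used in the loop above (generate a summary once, then reuse the saved file on later runs) can be factored into a small helper. This is a sketch, not code from the article: load_or_create_summary is a hypothetical name, and fake_summarizer stands in for the LLM call so the example runs without an API key.

```python
import tempfile
from pathlib import Path


def load_or_create_summary(out_path, make_summary):
    """Return the cached summary at out_path, creating it with make_summary() if missing."""
    out_path = Path(out_path)
    if out_path.exists():
        return out_path.read_text(encoding="utf-8")
    summary = make_summary()
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(summary, encoding="utf-8")
    return summary


calls = []


def fake_summarizer():
    """Placeholder for the LLM summarization call; records each invocation."""
    calls.append(1)
    return "A short placeholder summary."


with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "summaries" / "Having Kids.txt"
    first = load_or_create_summary(path, fake_summarizer)   # generates and saves
    second = load_or_create_summary(path, fake_summarizer)  # reads the cached file

print(first == second, len(calls))
```

Because the summarizer is only called when the file is missing, rerunning the notebook does not trigger new LLM calls for documents that were already summarized.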

As you can see, we now have three nodes, each holding the summary of one document. We also have vector_retrievers, which maps each document title to a retriever over that document's chunks.

print(nodes)
print('------')
print(vector_retrievers)

[IndexNode(id_='406d9927-c9e2-486f-9fc5-111efefc1649', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The essence of doing great work revolves around a few key principles. First, it's crucial to choose a field that aligns with your natural aptitudes and deep interests, as this will drive your motivation and creativity. Engaging in your own projects and maintaining a sense of excited curiosity are vital for discovering new ideas and making significant contributions.\\n\\nLearning enough to reach the frontiers of knowledge in your chosen field allows you to identify gaps and explore them, often leading to innovative breakthroughs. Hard work is essential, but it should be fueled by genuine interest rather than mere diligence. Consistency and the willingness to embrace challenges, including the risk of failure, are important for growth and discovery.\\n\\nCollaboration with high-quality colleagues can enhance your work, as they can provide insights and encouragement. Maintaining morale is also crucial; a positive mindset can help you navigate setbacks and keep you focused on your goals.\\n\\nUltimately, curiosity serves as the driving force behind great work, guiding you through the process of exploration and discovery. By nurturing your curiosity and being open to new experiences, you can uncover unique opportunities and make meaningful contributions in your field.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='How to Do Great Work', obj=None),
 IndexNode(id_='8007fdd2-6617-4a76-95d7-79efef0700e7', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The piece reflects on the author's transformation in perspective regarding parenthood. Initially apprehensive about having children, viewing parents as uncool and burdensome, the author experiences a profound shift after becoming a parent. The arrival of their first child triggers protective instincts and a newfound appreciation for children, leading to genuine joy in parenting moments that were previously overlooked. \\n\\nThe author acknowledges the challenges of parenthood, such as reduced productivity and ambition, as well as the necessity of adapting to a child's schedule. Despite these challenges, the author finds that the happiness and meaningful moments shared with children far outweigh the difficulties. The narrative emphasizes that while parenting can be demanding, it also brings unexpected joy and fulfillment, ultimately leading to a richer life experience.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='Having Kids', obj=None),
 IndexNode(id_='7e4dd169-eb28-4b2f-8a1a-ca1c5b85ac30', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The piece discusses the author's reflections on wealth and time management after selling a startup. It emphasizes that losing wealth often stems from poor investments rather than excessive spending, as the latter triggers alarms in our minds. The author highlights the need to develop new awareness to avoid bad investments, which can be less obvious than overspending on luxuries. Similarly, when it comes to time, the most significant loss occurs not through leisure activities but through engaging in unproductive work that feels legitimate, like managing emails. The author argues that modern complexities require us to recognize and avoid these deceptive traps that mimic productive behavior but ultimately lead to wasted time.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='How to Lose Time and Money', obj=None)]
 ------
 {'How to Do Great Work': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x330afeeb0>,
 'Having Kids': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x33129c7c0>,
 'How to Lose Time and Money': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x32e8929a0>}

Step 4: Build a top-level vector index

Once we have the summary nodes (nodes), we can build a top-level vector index and its retriever (top_vector_retriever). This index uses the summaries as the entry point of the retrieval process: it finds the most relevant summary before we look into the detailed documents. With similarity_top_k=1, only the single best-matching summary is followed.

# Build top-level vector index from summary nodes
top_vector_index = VectorStoreIndex(
    nodes, transformations=[splitter], callback_manager=callback_manager
)

# Set up a retriever for the top-level summaries
top_vector_retriever = top_vector_index.as_retriever(similarity_top_k=1)
top_vector_retriever

<llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x32db715b0>

Step 5: Set up recursive retrieval

Now that we have the top-level retriever and the individual document retrievers, we can set up the recursive retriever. It first retrieves the relevant summaries, then drills down into the chunks of the matching documents. The index_id of each summary node matches a key in vector_retrievers, which is how a retrieved summary is linked to its document's retriever.

from llama_index.core.retrievers import RecursiveRetriever

# Combine top-level retriever with individual document retrievers
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": top_vector_retriever, **vector_retrievers},
    verbose=True,
)
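As a mental model for how the keys of retriever_dict are used, the following plain-Python sketch follows references from a root retriever down to text nodes. It is not LlamaIndex's implementation: ToyNode, follow_references, and the hard-coded retrievers are hypothetical stand-ins for IndexNode, RecursiveRetriever, and the vector retrievers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNode:
    text: str
    index_id: Optional[str] = None  # when set, the node points to another retriever


def follow_references(query, retriever_dict, root_id):
    """Query the root retriever, then recurse into any retriever a result points to."""
    results = []
    for node in retriever_dict[root_id](query):
        if node.index_id is not None:
            # A summary node: descend into the retriever registered under its index_id.
            results.extend(follow_references(query, retriever_dict, node.index_id))
        else:
            # A plain text node: this is a final result.
            results.append(node)
    return results


def top_level(query):
    """Pretend summary retriever: returns one summary node based on a keyword."""
    if "kids" in query.lower():
        return [ToyNode("Summary of Having Kids", index_id="Having Kids")]
    return [ToyNode("Summary of How to Lose Time and Money",
                    index_id="How to Lose Time and Money")]


retriever_dict = {
    "vector": top_level,
    "Having Kids": lambda query: [ToyNode("chunk about parenthood 1"),
                                  ToyNode("chunk about parenthood 2")],
    "How to Lose Time and Money": lambda query: [ToyNode("chunk about fake work")],
}

found = follow_references("should I have kids?", retriever_dict, "vector")
texts = [node.text for node in found]
print(texts)
```

The summary node itself never appears in the output; it only decides which document retriever is queried next.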

Step 6: Run recursive retrieval queries

Finally, we're ready to run a few sample queries with our recursive retriever.

# Run recursive retriever on sample queries
result = recursive_retriever.retrieve("should I have kids?")
for res in result:
    print(res.node.get_content())

Retrieving with query id None: should I have kids?
Retrieved node with id, entering: Having Kids
Retrieving with query id Having Kids: should I have kids?
Retrieving text node: Do you have so little to spare?
And while having kids may be warping my present judgement, it hasn't overwritten my memory. I remember perfectly well what life was like before. Well enough to miss some things a lot, like the ability to take off for some other country at a moment's notice. That was so great. Why did I never do that?
See what I did there? The fact is, most of the freedom I had before kids, I never used. I paid for it in loneliness, but I never used it.
I had plenty of happy times before I had kids. But if I count up happy moments, not just potential happiness but actual happy moments, there are more after kids than before. Now I practically have it on tap, almost any bedtime.
People's experiences as parents vary a lot, and I know I've been lucky. But I think the worries I had before having kids must be pretty common, and judging by other parents' faces when they see their kids, so must the happiness that kids bring.
Retrieving text node: December 2019
Before I had kids, I was afraid of having kids. Up to that point I felt about kids the way the young Augustine felt about living virtuously. I'd have been sad to think I'd never have children. But did I want them now? No.
If I had kids, I'd become a parent, and parents, as I'd known since I was a kid, were uncool. They were dull and responsible and had no fun. And while it's not surprising that kids would believe that, to be honest I hadn't seen much as an adult to change my mind. Whenever I'd noticed parents with kids, the kids seemed to be terrors, and the parents pathetic harried creatures, even when they prevailed.
When people had babies, I congratulated them enthusiastically, because that seemed to be what one did. But I didn't feel it at all. "Better you than me," I was thinking.
Now when people have babies I congratulate them enthusiastically and I mean it. Especially the first one. I feel like they just got the best gift in the world.
Retrieving text node: Which meant I had to finish or I'd be taking away their trip to Africa. Maybe if I'm really lucky such tricks could put me net ahead. But the wind is there, no question.
On the other hand, what kind of wimpy ambition do you have if it won't survive having kids? Do you have so little to spare?
And while having kids may be warping my present judgement, it hasn't overwritten my memory. I remember perfectly well what life was like before. Well enough to miss some things a lot, like the ability to take off for some other country at a moment's notice. That was so great. Why did I never do that?
See what I did there? The fact is, most of the freedom I had before kids, I never used. I paid for it in loneliness, but I never used it.
I had plenty of happy times before I had kids. But if I count up happy moments, not just potential happiness but actual happy moments, there are more after kids than before. Now I practically have it on tap, almost any bedtime.

result = recursive_retriever.retrieve("How to buy more time?")
for res in result:
    print(res.node.get_content())

Retrieving with query id None: How to buy more time?
Retrieved node with id, entering: How to Lose Time and Money
Retrieving with query id How to Lose Time and Money: How to buy more time?
Retrieving text node: Which is why people trying to sell you expensive things say "it's an investment."
The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince. I'd start to feel uncomfortable after sitting on a sofa watching TV for 2 hours, let alone a whole day.
Retrieving text node: The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince. I'd start to feel uncomfortable after sitting on a sofa watching TV for 2 hours, let alone a whole day.
And yet I've definitely had days when I might as well have sat in front of a TV all day — days at the end of which, if I asked myself what I got done that day, the answer would have been: basically, nothing.
Retrieving text node: Investing bypasses those alarms. You're not spending the money; you're just moving it from one asset to another. Which is why people trying to sell you expensive things say "it's an investment."
The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince.

Conclusion

By using document summaries and hierarchies, recursive retrieval makes the retrieved chunks more relevant, even for large datasets. For organizations handling large volumes of data, it is a practical way to build more accurate retrieval systems.


Author

Ryan Ong

Ryan is a lead data scientist specialising in building AI applications using LLMs. He is a PhD candidate in Natural Language Processing and Knowledge Graphs at Imperial College London, where he also completed his Master's degree in Computer Science. Outside of data science, he writes a weekly Substack newsletter, The Limitless Playbook, where he shares one actionable idea from the world's top thinkers and occasionally writes about core AI concepts.
