Recursive Retrieval for RAG: Implementation with LlamaIndex

Learn how to implement recursive retrieval in RAG systems using LlamaIndex to improve the accuracy and relevance of retrieved information, especially for large document collections.
Updated Nov 13, 2024  · 8 min read

In many RAG applications, the retrieval process is fairly straightforward. Typically, documents are split into chunks, converted into embeddings, and stored in a vector database. When a query comes in, the system retrieves the top-k documents based on the similarity of their embeddings.
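
For reference, a minimal sketch of this standard, non-recursive flow with LlamaIndex might look like the following (the directory name and sample query are illustrative, and an OpenAI API key is assumed for the default embeddings):

# Minimal sketch of the standard (non-recursive) RAG retrieval flow
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("llamaindex-data").load_data()  # load raw documents
index = VectorStoreIndex.from_documents(documents)                # chunk, embed, and store
retriever = index.as_retriever(similarity_top_k=3)                # top-k by embedding similarity
chunks = retriever.retrieve("should I have kids?")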

However, this approach has some drawbacks, especially with large collections. Individual chunks can be unclear on their own, and the system cannot always pull the most relevant information, which leads to less accurate results.

Recursive retrieval was developed to improve retrieval accuracy by making use of document structure. Instead of retrieving chunks directly, it first retrieves the relevant summaries and then drills down into the corresponding chunks, making the final retrieval results more relevant.

In this article, we will explain recursive retrieval and walk you through implementing it step by step using LlamaIndex.

What is recursive retrieval?

Instead of embedding raw document chunks and retrieving them by similarity, recursive retrieval works by first embedding summaries of the documents and linking them to the chunks of the full documents. When a query is made, the system first retrieves the relevant summaries and then drills down to find the related chunks of information.

This approach gives the retrieval system more context before it returns the final chunks, which helps it find the relevant information more reliably.
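
Before the full walkthrough below, here is a minimal, self-contained sketch of the linking pattern (the document text, summary text, and the "doc-1" id are illustrative; an OpenAI API key is assumed for embeddings):

from llama_index.core import Document, VectorStoreIndex
from llama_index.core.schema import IndexNode
from llama_index.core.retrievers import RecursiveRetriever

doc = Document(text="Full document text goes here ...")

# Chunk-level retriever over the full document
chunk_retriever = VectorStoreIndex.from_documents([doc]).as_retriever()

# Summary node whose index_id points to the chunk-level retriever
summary_node = IndexNode(text="A short summary of the document", index_id="doc-1")

# Top-level index over summaries; recursion follows index_id into the chunk retriever
top_retriever = VectorStoreIndex([summary_node]).as_retriever()
retriever = RecursiveRetriever(
    "root",
    retriever_dict={"root": top_retriever, "doc-1": chunk_retriever},
)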

Implementing recursive retrieval with LlamaIndex

In this section, we will walk you step by step through implementing recursive retrieval with LlamaIndex, from loading the documents to running queries with recursive retrieval.

Step 1: Load and prepare the documents

First, we load the documents into the system using SimpleDirectoryReader. Each document is given a title and metadata (such as its category) to make filtering easier later. The loaded documents are stored in a dictionary for easy access.

from llama_index.core import SimpleDirectoryReader

# Document titles and metadata
article_titles = ["How to Do Great Work", "Having Kids", "How to Lose Time and Money"]
article_metadatas = {
    "How to Do Great Work": {
        "category": "self-help",
    },
    "Having Kids": {
        "category": "self-help",
    },
    "How to Lose Time and Money": {
        "category": "self-help",
    },
}

# Load documents and update with metadata
docs_dict = {}
for title in article_titles:
    doc = SimpleDirectoryReader(
        input_files=[f"llamaindex-data/{title}.txt"]
    ).load_data()[0]
    doc.metadata.update(article_metadatas[title])
    docs_dict[title] = doc

docs_dict

For readability, I'll truncate the output below:

{'How to Do Great Work': Document(id_='e26a2fcc-77d2-43e8-968b-f893944907dc', embedding=None, metadata={'file_path': 'llamaindex-data/How to Do Great Work.txt', 'file_name': 'How to Do Great Work.txt', 'file_type': 'text/plain', 'file_size': 59399, 'creation_date': '2024-09-18', 'last_modified_date': '2024-09-18', 'category': 'self-help'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='July 2023\\n\\nIf you collected lists of techniques for doing great work in a lot of different fields, what would the intersection look like? I decided to find out by making it.\\n\\nPartly my goal was to create a guide that could be used by someone working in any field. But I was also curious about the shape of the intersection. And one thing this exercise shows is that it does have a definite shape; it\\'s not just a point labelled "work hard."\\n\\nThe following recipe assumes you\\'re very ambitious.\\n\\n\\n\\n\\n\\nThe first step is to decide what to work on. The work you choose needs to have three qualities: it has to be something you have a natural aptitude for, that you have a deep interest in, and that offers scope to do great work.\\n\\nIn practice you don\\'t have to worry much about the third criterion. Ambitious people are if anything already too conservative about it. So all you need to do is find something you have an aptitude for and great interest in. [1]\\n\\nThat sounds straightforward, but it\\'s often quite difficult. When you\\'re young you don\\'t know what you\\'re good at or what different kinds of work are like. Some kinds of work you end up doing may not even exist yet. So while some people know what they want to do at 14, most have to figure it out.\\n\\nThe way to figure out what to work on is by working. If you\\'re not sure what to work on, guess. But pick something and get going. You\\'ll probably guess wrong some of the time, but that\\'s fine. It\\'s good to know about multiple things; some of the biggest discoveries come from noticing connections between different fields.\\n\\n
…
(truncated)

Step 2: Set up the LLM and chunking

Next, we initialize the large language model (LLM) using OpenAI's GPT-4o Mini and set up a sentence splitter to break the documents into smaller chunks for embedding. We also create a callback manager to trace the process.

from llama_index.llms.openai import OpenAI
from llama_index.core.callbacks import LlamaDebugHandler, CallbackManager
from llama_index.core.node_parser import SentenceSplitter

# Initialize LLM and chunk splitter
llm = OpenAI("gpt-4o-mini")
callback_manager = CallbackManager([LlamaDebugHandler()])
splitter = SentenceSplitter(chunk_size=256)
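
The snippet above assumes an OpenAI API key is already available in the environment; one common way to provide it (not shown in the original article) is sketched below. Making the LLM the global default via Settings is optional here, since we pass llm explicitly in later steps.

import os

# Hypothetical placeholder; use your own key or export it in your shell instead
os.environ["OPENAI_API_KEY"] = "sk-..."

# Optional: make this LLM the default for LlamaIndex components (recent versions)
from llama_index.core import Settings
Settings.llm = llm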

Step 3: Build vector indexes and generate summaries

For each document, we build a vector index, which later lets us retrieve relevant document chunks based on similarity. The LLM generates a summary of each document, and these summaries are stored as IndexNode objects.

from pathlib import Path

from llama_index.core import VectorStoreIndex, SummaryIndex
from llama_index.core.schema import IndexNode

# Define top-level nodes and vector retrievers
nodes = []
vector_query_engines = {}
vector_retrievers = {}

for title in article_titles:
    # build vector index
    vector_index = VectorStoreIndex.from_documents(
        [docs_dict[title]],
        transformations=[splitter],
        callback_manager=callback_manager,
    )
    
    # define query engines
    vector_query_engine = vector_index.as_query_engine(llm=llm)
    vector_query_engines[title] = vector_query_engine
    vector_retrievers[title] = vector_index.as_retriever(similarity_top_k=3)
    # save summaries
    out_path = Path("summaries") / f"{title}.txt"
    if not out_path.exists():
        # use LLM-generated summary
        summary_index = SummaryIndex.from_documents(
            [docs_dict[title]], callback_manager=callback_manager
        )
        summarizer = summary_index.as_query_engine(
            response_mode="tree_summarize", llm=llm
        )
        response = await summarizer.aquery(
            f"Give me a summary of {title}"
        )
        article_summary = response.response
        Path("summaries").mkdir(exist_ok=True)
        with open(out_path, "w") as fp:
            fp.write(article_summary)
    else:
        with open(out_path, "r") as fp:
            article_summary = fp.read()
    print(f"**Summary for {title}: {article_summary}")
    node = IndexNode(text=article_summary, index_id=title)
    nodes.append(node)
**********
Trace: index_construction
**********
**Summary for How to Do Great Work: The essence of doing great work revolves around a few key principles. First, it's crucial to choose a field that aligns with your natural aptitudes and deep interests, as this will drive your motivation and creativity. Engaging in your own projects and maintaining a sense of excited curiosity are vital for discovering new ideas and making significant contributions.
Learning enough to reach the frontiers of knowledge in your chosen field allows you to identify gaps and explore them, often leading to innovative breakthroughs. Hard work is essential, but it should be fueled by genuine interest rather than mere diligence. Consistency and the willingness to embrace challenges, including the risk of failure, are important for growth and discovery.
Collaboration with high-quality colleagues can enhance your work, as they can provide insights and encouragement. Maintaining morale is also crucial; a positive mindset can help you navigate setbacks and keep you focused on your goals.
Ultimately, curiosity serves as the driving force behind great work, guiding you through the process of exploration and discovery. By nurturing your curiosity and being open to new experiences, you can uncover unique opportunities and make meaningful contributions in your field.
**********
Trace: index_construction
**********
**Summary for Having Kids: The piece reflects on the author's transformation in perspective regarding parenthood. Initially apprehensive about having children, viewing parents as uncool and burdensome, the author experiences a profound shift after becoming a parent. The arrival of their first child triggers protective instincts and a newfound appreciation for children, leading to genuine joy in parenting moments that were previously overlooked. 
The author acknowledges the challenges of parenthood, such as reduced productivity and ambition, as well as the necessity of adapting to a child's schedule. Despite these challenges, the author finds that the happiness and meaningful moments shared with children far outweigh the difficulties. The narrative emphasizes that while parenting can be demanding, it also brings unexpected joy and fulfillment, ultimately leading to a richer life experience.
**********
Trace: index_construction
**********
**Summary for How to Lose Time and Money: The piece discusses the author's reflections on wealth and time management after selling a startup. It emphasizes that losing wealth often stems from poor investments rather than excessive spending, as the latter triggers alarms in our minds. The author highlights the need to develop new awareness to avoid bad investments, which can be less obvious than overspending on luxuries. Similarly, when it comes to time, the most significant loss occurs not through leisure activities but through engaging in unproductive work that feels legitimate, like managing emails. The author argues that modern complexities require us to recognize and avoid these deceptive traps that mimic productive behavior but ultimately lead to wasted time.
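
A note on the summary generation above: the awaited aquery call only works as written in an environment that supports top-level await, such as a Jupyter notebook. In a plain Python script, the synchronous query method should be an equivalent drop-in replacement inside the loop:

# Synchronous alternative to "await summarizer.aquery(...)" for plain scripts
response = summarizer.query(f"Give me a summary of {title}")
article_summary = response.response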

As you can see, we now have three nodes, each representing a summary of one of the documents. We also have vector_retrievers, which stores a chunk-level retriever for each document.

print(nodes)
print('------')
print(vector_retrievers)
[IndexNode(id_='406d9927-c9e2-486f-9fc5-111efefc1649', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The essence of doing great work revolves around a few key principles. First, it's crucial to choose a field that aligns with your natural aptitudes and deep interests, as this will drive your motivation and creativity. Engaging in your own projects and maintaining a sense of excited curiosity are vital for discovering new ideas and making significant contributions.\\n\\nLearning enough to reach the frontiers of knowledge in your chosen field allows you to identify gaps and explore them, often leading to innovative breakthroughs. Hard work is essential, but it should be fueled by genuine interest rather than mere diligence. Consistency and the willingness to embrace challenges, including the risk of failure, are important for growth and discovery.\\n\\nCollaboration with high-quality colleagues can enhance your work, as they can provide insights and encouragement. Maintaining morale is also crucial; a positive mindset can help you navigate setbacks and keep you focused on your goals.\\n\\nUltimately, curiosity serves as the driving force behind great work, guiding you through the process of exploration and discovery. By nurturing your curiosity and being open to new experiences, you can uncover unique opportunities and make meaningful contributions in your field.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='How to Do Great Work', obj=None),
 IndexNode(id_='8007fdd2-6617-4a76-95d7-79efef0700e7', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The piece reflects on the author's transformation in perspective regarding parenthood. Initially apprehensive about having children, viewing parents as uncool and burdensome, the author experiences a profound shift after becoming a parent. The arrival of their first child triggers protective instincts and a newfound appreciation for children, leading to genuine joy in parenting moments that were previously overlooked. \\n\\nThe author acknowledges the challenges of parenthood, such as reduced productivity and ambition, as well as the necessity of adapting to a child's schedule. Despite these challenges, the author finds that the happiness and meaningful moments shared with children far outweigh the difficulties. The narrative emphasizes that while parenting can be demanding, it also brings unexpected joy and fulfillment, ultimately leading to a richer life experience.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='Having Kids', obj=None),
 IndexNode(id_='7e4dd169-eb28-4b2f-8a1a-ca1c5b85ac30', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="The piece discusses the author's reflections on wealth and time management after selling a startup. It emphasizes that losing wealth often stems from poor investments rather than excessive spending, as the latter triggers alarms in our minds. The author highlights the need to develop new awareness to avoid bad investments, which can be less obvious than overspending on luxuries. Similarly, when it comes to time, the most significant loss occurs not through leisure activities but through engaging in unproductive work that feels legitimate, like managing emails. The author argues that modern complexities require us to recognize and avoid these deceptive traps that mimic productive behavior but ultimately lead to wasted time.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n', index_id='How to Lose Time and Money', obj=None)]
 ------
 {'How to Do Great Work': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x330afeeb0>,
 'Having Kids': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x33129c7c0>,
 'How to Lose Time and Money': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x32e8929a0>}

Step 4: Build a top-level vector index

Once we have the summary nodes (nodes), we can create a top-level vector index and retriever (top_vector_retriever). This index uses the summaries to start the retrieval process, helping us find the most relevant summaries before looking at the detailed document chunks.

# Build top-level vector index from summary nodes
top_vector_index = VectorStoreIndex(
    nodes, transformations=[splitter], callback_manager=callback_manager
)

# Set up a retriever for the top-level summaries
top_vector_retriever = top_vector_index.as_retriever(similarity_top_k=1)
top_vector_retriever
<llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x32db715b0>
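
As a quick sanity check (not part of the original walkthrough), you can query this summary-level retriever on its own to see which summary it would hand off to the next stage:

# Inspect the first stage in isolation: which summary matches the question?
summary_hits = top_vector_retriever.retrieve("should I have kids?")
for hit in summary_hits:
    print(hit.score, hit.node.get_content()[:120])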

Step 5: Set up recursive retrieval

Now that we have the top-level retriever and the per-document retrievers, we can set up the recursive retriever. This configuration lets the system fetch the relevant summaries first and then dive into the specific document chunks based on their relevance.

from llama_index.core.retrievers import RecursiveRetriever

# Combine top-level retriever with individual document retrievers
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": top_vector_retriever, **vector_retrievers},
    verbose=True,
)

Step 6: Run recursive retrieval queries

Finally, we are ready to use our recursive retriever to run some sample queries.

# Run recursive retriever on sample queries
result = recursive_retriever.retrieve("should I have kids?")
for res in result:
    print(res.node.get_content())
Retrieving with query id None: should I have kids?
Retrieved node with id, entering: Having Kids
Retrieving with query id Having Kids: should I have kids?
Retrieving text node: Do you have so little to spare?
And while having kids may be warping my present judgement, it hasn't overwritten my memory. I remember perfectly well what life was like before. Well enough to miss some things a lot, like the ability to take off for some other country at a moment's notice. That was so great. Why did I never do that?
See what I did there? The fact is, most of the freedom I had before kids, I never used. I paid for it in loneliness, but I never used it.
I had plenty of happy times before I had kids. But if I count up happy moments, not just potential happiness but actual happy moments, there are more after kids than before. Now I practically have it on tap, almost any bedtime.
People's experiences as parents vary a lot, and I know I've been lucky. But I think the worries I had before having kids must be pretty common, and judging by other parents' faces when they see their kids, so must the happiness that kids bring.
Retrieving text node: December 2019
Before I had kids, I was afraid of having kids. Up to that point I felt about kids the way the young Augustine felt about living virtuously. I'd have been sad to think I'd never have children. But did I want them now? No.
If I had kids, I'd become a parent, and parents, as I'd known since I was a kid, were uncool. They were dull and responsible and had no fun. And while it's not surprising that kids would believe that, to be honest I hadn't seen much as an adult to change my mind. Whenever I'd noticed parents with kids, the kids seemed to be terrors, and the parents pathetic harried creatures, even when they prevailed.
When people had babies, I congratulated them enthusiastically, because that seemed to be what one did. But I didn't feel it at all. "Better you than me," I was thinking.
Now when people have babies I congratulate them enthusiastically and I mean it. Especially the first one. I feel like they just got the best gift in the world.
Retrieving text node: Which meant I had to finish or I'd be taking away their trip to Africa. Maybe if I'm really lucky such tricks could put me net ahead. But the wind is there, no question.
On the other hand, what kind of wimpy ambition do you have if it won't survive having kids? Do you have so little to spare?
And while having kids may be warping my present judgement, it hasn't overwritten my memory. I remember perfectly well what life was like before. Well enough to miss some things a lot, like the ability to take off for some other country at a moment's notice. That was so great. Why did I never do that?
See what I did there? The fact is, most of the freedom I had before kids, I never used. I paid for it in loneliness, but I never used it.
I had plenty of happy times before I had kids. But if I count up happy moments, not just potential happiness but actual happy moments, there are more after kids than before. Now I practically have it on tap, almost any bedtime.
result = recursive_retriever.retrieve("How to buy more time?")
for res in result:
    print(res.node.get_content())
Retrieving with query id None: How to buy more time?
Retrieved node with id, entering: How to Lose Time and Money
Retrieving with query id How to Lose Time and Money: How to buy more time?
Retrieving text node: Which is why people trying to sell you expensive things say "it's an investment."
The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince. I'd start to feel uncomfortable after sitting on a sofa watching TV for 2 hours, let alone a whole day.
Retrieving text node: The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince. I'd start to feel uncomfortable after sitting on a sofa watching TV for 2 hours, let alone a whole day.
And yet I've definitely had days when I might as well have sat in front of a TV all day — days at the end of which, if I asked myself what I got done that day, the answer would have been: basically, nothing.
Retrieving text node: Investing bypasses those alarms. You're not spending the money; you're just moving it from one asset to another. Which is why people trying to sell you expensive things say "it's an investment."
The solution is to develop new alarms. This can be a tricky business, because while the alarms that prevent you from overspending are so basic that they may even be in our DNA, the ones that prevent you from making bad investments have to be learned, and are sometimes fairly counterintuitive.
A few days ago I realized something surprising: the situation with time is much the same as with money. The most dangerous way to lose time is not to spend it having fun, but to spend it doing fake work. When you spend time having fun, you know you're being self-indulgent. Alarms start to go off fairly quickly. If I woke up one morning and sat down on the sofa and watched TV all day, I'd feel like something was terribly wrong. Just thinking about it makes me wince.
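
The retriever returns raw chunks. If you want the LLM to synthesize an answer from those chunks instead, one option (a sketch, not part of the original walkthrough) is to wrap the recursive retriever in a query engine, reusing the llm from Step 2:

from llama_index.core.query_engine import RetrieverQueryEngine

# Synthesize an answer from the chunks returned by the recursive retriever
query_engine = RetrieverQueryEngine.from_args(recursive_retriever, llm=llm)
response = query_engine.query("should I have kids?")
print(response.response)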

Conclusion

By using document summaries and hierarchies, recursive retrieval makes the retrieved chunks more relevant, even when working with large datasets. For organizations handling large volumes of data, recursive retrieval is a reliable way to build more accurate retrieval systems.

To learn more about RAG techniques, I recommend these blogs:


Author
Ryan Ong

Ryan is a lead data scientist specializing in building AI applications with LLMs. He is a PhD candidate in Natural Language Processing and Knowledge Graphs at Imperial College London, where he also completed his master's degree in Computer Science. Outside of data science, he writes a weekly Substack newsletter, The Limitless Playbook, where he shares one actionable idea from the world's top thinkers and occasionally writes about core AI concepts.
