RankGPT as a Re-Ranking Agent for RAG (Tutorial)

RankGPT is a method that uses LLMs such as ChatGPT to reorder the documents retrieved in RAG systems, improving output quality by prioritizing the most relevant information.

Updated Aug 30, 2024 · 8 min read

Retrieval-augmented generation (RAG) is a technique that makes large language models (LLMs) more capable and accurate by letting them draw on external information when generating text.

The big challenge, however, is picking the right documents or passages out of a huge collection of data.

RankGPT tackles this problem by improving the re-ranking step in RAG pipelines. It uses the deep language understanding of LLMs to better assess and (re)rank which information is most relevant.

In this article, we introduce RankGPT and show how you can integrate it into your RAG applications.

Understanding Retrieval-Augmented Generation (RAG)

Retrieval-augmented generation (RAG) is a method that combines LLMs with information retrieval systems. This means that when an LLM is asked to generate text, it can pull in relevant information from external sources, making its responses more accurate and better informed.

RAG consists of two main components, the retriever and the generator, plus an optional component, the reranker:

Retriever - The retriever's job is to find relevant documents or text segments in a large document collection based on the user's query. It uses algorithms such as BM25 to rank documents by relevance.
Reranker (optional) - The reranker takes the initial set of retrieved documents and reorders them so that the most relevant ones are at the top. This helps filter out less useful information and focus on what matters.
Generator - The generator is the LLM that uses the retrieved documents to produce the final output. With access to relevant external data, it can produce more accurate answers.
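
To make the division of labor concrete, here is a minimal sketch of the three stages wired together. Everything in it is illustrative: the function names, the term-overlap retriever (a stand-in for BM25), the overlap-density reranker (a stand-in for RankGPT), and the prompt-building generator (a stand-in for an LLM call) are assumptions made for this example.

```python
from collections import Counter

def tokenize(text):
    # Lowercase and strip basic punctuation; a deliberately crude tokenizer
    return [t.strip(".,?!").lower() for t in text.split()]

def retrieve(query, corpus, top_k=3):
    """Toy retriever: score each document by term overlap with the query."""
    q_terms = Counter(tokenize(query))
    scored = []
    for doc in corpus:
        d_terms = Counter(tokenize(doc))
        score = sum(min(q_terms[t], d_terms[t]) for t in q_terms)
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def rerank(query, docs):
    """Placeholder reranker: orders by the share of document terms found in the query."""
    q_terms = set(tokenize(query))

    def density(doc):
        terms = tokenize(doc)
        return sum(t in q_terms for t in terms) / len(terms)

    return sorted(docs, key=density, reverse=True)

def generate(query, docs):
    """Placeholder generator: builds the prompt an LLM would receive."""
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Masks reduce the spread of respiratory viruses.",
    "BM25 is a ranking function used by search engines.",
    "Universal masks adoption lowers the spread of COVID-19 in simulations of many regions.",
]
query = "Do masks reduce the spread of COVID-19?"
candidates = retrieve(query, corpus)
prompt = generate(query, rerank(query, candidates))
print(prompt)
```

In a real pipeline only the body of `rerank` would change: the retriever still produces a candidate list, and the generator still consumes whatever order the reranker returns.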

The Role and Benefits of RankGPT in RAG

RankGPT uses LLMs to assess the relevance of retrieved documents or text segments, making sure the most important ones are at the top. With RankGPT, the generator in the RAG pipeline receives higher-quality inputs, which leads to more accurate responses.

Improved relevance and performance

RankGPT goes beyond simple keyword matching because it takes into account the deeper meaning and context of queries and documents. This lets it pass more precise information to the LLM, identifying the most relevant content based on what it actually means, not just on keywords.

When using GPT-4 with zero-shot instructional permutation generation, RankGPT outperforms leading supervised systems on several benchmarks, including TREC, BEIR, and Mr.TyDi.

Efficient and cost-effective distillation

RankGPT uses permutation distillation to transfer the ranking capabilities of large models such as GPT-4 to smaller, specialized models.

These smaller models retain strong performance while being much more efficient. For example, a distilled 440M model outperformed a supervised 3B model on the BEIR benchmark, cutting computational cost while achieving better results.
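
One way to picture permutation distillation is as a pairwise objective: for every pair of passages the teacher has ordered, the student is penalized when its scores disagree with that order. The sketch below assumes a RankNet-style logistic loss and uses a hypothetical `ranknet_loss` helper; it illustrates the idea only and is not the training code from the RankGPT repository.

```python
import math

def ranknet_loss(student_scores, teacher_permutation):
    """Pairwise logistic loss over all pairs ordered by the teacher.

    student_scores: one relevance score per passage (floats).
    teacher_permutation: passage indices, most relevant first.
    """
    loss = 0.0
    for pos, better in enumerate(teacher_permutation):
        for worse in teacher_permutation[pos + 1:]:
            margin = student_scores[better] - student_scores[worse]
            # log(1 + exp(-margin)) is small when the student agrees with the teacher
            loss += math.log1p(math.exp(-margin))
    return loss

teacher = [0, 2, 1]  # teacher says passage 0 > passage 2 > passage 1
agrees = ranknet_loss([3.0, 0.5, 1.5], teacher)
disagrees = ranknet_loss([0.5, 3.0, 1.5], teacher)
print(agrees < disagrees)
```

A student whose scores follow the teacher's order gets a lower loss than one that inverts it, which is the signal a smaller model would be trained on.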

Handling new and unseen information

RankGPT includes the NovelEval test set to check robustness and address concerns about data contamination. This test set evaluates a model's ability to rank passages based on recent information it has not seen before.

GPT-4 achieved state-of-the-art performance on this test, demonstrating its ability to handle new and unseen queries effectively.

RankGPT Benchmark Performance

RankGPT (gpt-4) outperforms all other models on TREC and BEIR, with an average nDCG@10 score of 53.68, as reported in the results table of the source paper. It achieved the best results on the BEIR datasets, beating strong supervised models such as monoT5 (3B) and Cohere Rerank-v2. Even with gpt-3.5-turbo, RankGPT achieves competitive scores, which shows that it is a highly effective reranker.

Source: Weiwei Sun et al., 2023
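
For readers unfamiliar with the metric, nDCG@10 rewards rankings that place highly relevant passages near the top and normalizes the score against the best possible ordering. The sketch below is one common formulation, assuming graded relevance labels and the exponential gain `2^rel - 1`; other variants use the raw label as the gain. It returns a value between 0 and 1, whereas benchmark tables usually report it multiplied by 100.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )

def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels of the passages, in the order the reranker returned them
print(ndcg_at_k([3, 2, 0, 1]))
print(ndcg_at_k([3, 2, 1, 0]))
```

The second call uses an already ideal ordering, so it scores 1.0; the first swaps the last two passages and scores slightly lower.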

RankGPT (gpt-4) also performs well on the Mr.TyDi datasets, leading with an average nDCG@10 score of 62.93 and beating both BM25 and mmarcoCE. It consistently outperforms BM25 and even beats mmarcoCE in many languages, especially Indonesian and Swahili.

Overall, RankGPT achieved the best score in many languages, such as Bengali, Indonesian, and Japanese, and fell slightly behind mmarcoCE in only a few cases.

Source: Weiwei Sun et al., 2023

Finally, RankGPT was tested on the NovelEval dataset, which measures how well a model can rank passages based on recent, unseen information. RankGPT (gpt-4) achieved the highest score on all evaluation metrics (nDCG@1, nDCG@5, and nDCG@10), notably an nDCG@10 score of 90.45. It outperformed other strong models such as monoT5 (3B) and monoBERT (340M), which highlights its strength as a reranker.

Source: Weiwei Sun et al., 2023

Across all of these benchmark results, RankGPT (gpt-4) consistently outperforms the other methods, whether supervised or unsupervised, which demonstrates its strong re-ranking capability.

Implementing RankGPT in RAG Pipelines

Here is how we can integrate RankGPT into a RAG pipeline.

Step 1: Clone the RankGPT repository

First, you need to clone the RankGPT repository. Run the following command in your terminal:

git clone https://github.com/sunnweiwei/RankGPT

Step 2: Set up your environment

Navigate to the RankGPT directory and install the required packages. You may want to create a virtual environment first, then install the packages listed in the provided requirements.txt:

pip install -r requirements.txt

Step 3: Implementing RankGPT

Here we use the simple example query and retrieved documents provided by the original RankGPT repository.

item = {
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}
    ]
}

You can use the provided permutation pipeline to easily re-rank the retrieved documents with RankGPT.

from rank_gpt import permutation_pipeline
new_item = permutation_pipeline(
    item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo',
    api_key='Your OPENAI Key!'
)
print(new_item)

The result is the following new document order:

{
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}
    ]
}
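
Once the hits are re-ranked, the next step in a RAG pipeline is to hand the top passages to the generator. The sketch below shows one possible way to do that. The `build_prompt` helper, its `top_k` and `max_chars` parameters, and the prompt wording are assumptions for illustration, not part of RankGPT; only the `query`/`hits`/`content` structure comes from the example above.

```python
def build_prompt(item, top_k=2, max_chars=300):
    """Assemble a generator prompt from the top-k re-ranked passages."""
    # Truncate each passage so the prompt stays within a modest length
    passages = [hit["content"][:max_chars] for hit in item["hits"][:top_k]]
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the passages below.\n\n"
        f"{context}\n\nQuestion: {item['query']}"
    )

# A small stand-in with the same shape as the re-ranked item
reranked = {
    "query": "How much impact do masks have?",
    "hits": [
        {"content": "Passage A"},
        {"content": "Passage B"},
        {"content": "Passage C"},
    ],
}
prompt = build_prompt(reranked)
print(prompt)
```

Because only the first `top_k` hits are used, the quality of the ordering directly determines which evidence the generator sees.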

Step-by-step instructional permutation generation

For a more step-by-step implementation of the permutation pipeline, you can interact with RankGPT directly to create and process permutation instructions, as follows:

from rank_gpt import (
    create_permutation_instruction,
    run_llm,
    receive_permutation
)
# Create permutation generation instruction
messages = create_permutation_instruction(
    item=item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo'
)

[{'role': 'system',
  'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'},
 {'role': 'user',
  'content': 'I will provide you with 3 passages, each indicated by number identifier []. \\nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'},
 {'role': 'assistant', 'content': 'Okay, please provide the passages.'},
 {'role': 'user',
  'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
 {'role': 'assistant', 'content': 'Received passage [1].'},
 {'role': 'user',
  'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
 {'role': 'assistant', 'content': 'Received passage [2].'},
 {'role': 'user',
  'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
 {'role': 'assistant', 'content': 'Received passage [3].'},
 {'role': 'user',
  'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \\nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]

# Get ChatGPT predicted permutation
permutation = run_llm(
    messages,
    api_key='Your OPENAI Key!',
    model_name='gpt-3.5-turbo'
)

'[1] > [3] > [2]'

# Use permutation to re-rank the passage
item = receive_permutation(
    item,
    permutation,
    rank_start=0,
    rank_end=3
)

{'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
 'hits': [{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
  {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
  {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}]}
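
The last step above turns a string such as '[1] > [3] > [2]' back into a reordered list of hits. The sketch below shows roughly what that involves. The helpers `parse_permutation` and `apply_permutation` are hypothetical names written for this example and may differ from how `receive_permutation` is implemented; in particular, the handling of duplicate or missing identifiers is an assumption.

```python
import re

def parse_permutation(text, num_passages):
    """Turn a ranking string like '[1] > [3] > [2]' into zero-based indices."""
    order = []
    for match in re.findall(r"\[(\d+)\]", text):
        idx = int(match) - 1
        # Ignore identifiers that are out of range or repeated
        if 0 <= idx < num_passages and idx not in order:
            order.append(idx)
    # Append anything the model omitted, keeping the original relative order
    order += [i for i in range(num_passages) if i not in order]
    return order

def apply_permutation(hits, order):
    """Reorder the hits according to the parsed indices."""
    return [hits[i] for i in order]

hits = [{"content": "A"}, {"content": "B"}, {"content": "C"}]
order = parse_permutation("[1] > [3] > [2]", len(hits))
print(order)  # [0, 2, 1]
print(apply_permutation(hits, order))
```

Being defensive here matters because the LLM's reply is free text: it can repeat an identifier, skip one, or mention a number that does not exist.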

Sliding Window Strategy (SWA) for RankGPT

If you need to rank more documents than the model can handle at once, use a sliding window strategy. Here is how to apply one to re-rank the documents:

from rank_gpt import sliding_windows
api_key = "Your OPENAI Key"
new_item = sliding_windows(
    item,
    rank_start=0,
    rank_end=3,
    window_size=2,
    step=1,
    model_name='gpt-3.5-turbo',
    api_key=api_key
)
print(new_item)

In this example, the sliding window has a size of 2 and a step size of 1, which means it processes two documents at a time and shifts by one document for the next ranking pass.
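
To see which spans such a window visits, here is a minimal sketch. It assumes the window moves from the end of the list toward the front, as described in the RankGPT paper, so that strong candidates near the bottom can move up across passes. The names `window_spans` and `sliding_rerank` are hypothetical, and a plain descending sort stands in for the LLM call on each window.

```python
def window_spans(rank_start, rank_end, window_size, step):
    """Yield (start, end) spans, moving from the back of the list to the front."""
    end = rank_end
    start = rank_end - window_size
    while start >= rank_start:
        yield (start, end)
        end -= step
        start -= step

def sliding_rerank(scores, window_size, step):
    """Re-rank window by window; sorting is a stand-in for the LLM ranking step."""
    items = list(scores)
    for start, end in window_spans(0, len(items), window_size, step):
        items[start:end] = sorted(items[start:end], reverse=True)
    return items

# Spans visited for the settings used above: rank 0..3, window 2, step 1
print(list(window_spans(0, 3, 2, 1)))
print(sliding_rerank([0.2, 0.9, 0.5, 0.7], window_size=2, step=1))
```

A single back-to-front pass reliably carries the best item to the top but does not fully sort the list, which is usually acceptable because the generator only consumes the first few passages. This simplified version also skips any leading items that do not fit a full window.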

Conclusion

By using LLMs to better assess the relevance of information, RankGPT improves the accuracy of ranking and re-ranking content.

This addresses common problems such as keeping content on point, improving efficiency, and reducing the likelihood of generating misleading information.

Overall, RankGPT helps build more reliable and accurate RAG applications.
