RankGPT as a Re-Ranking Agent for RAG (Tutorial)

RankGPT is a method that uses LLMs such as ChatGPT to re-rank the documents retrieved in RAG systems, improving output quality by prioritizing the most relevant information.

Updated Aug 30, 2024 · 8 min read

Retrieval-augmented generation (RAG) is a technique that makes large language models (LLMs) more accurate by letting them draw on external information when generating text.

The main challenge, however, is choosing the right documents or passages from a very large collection of data.

RankGPT addresses this problem by improving the re-ranking step in RAG pipelines. It uses the language understanding capabilities of LLMs to better assess and (re)rank which pieces of information are most relevant.

In this article, we introduce RankGPT and show how you can integrate it into your RAG applications.

Understanding Retrieval Augmented Generation (RAG)

RAG (retrieval augmented generation) is a method that combines LLMs with information retrieval systems. When an LLM is asked to generate text, it can pull relevant information from external sources, which makes its answers more accurate and better informed.

RAG consists of two main components, the retriever and the generator, plus an optional component, the reranker:

Retriever - The retriever finds relevant documents or text segments in a large document collection based on the user's query. It uses algorithms such as BM25 to rank documents by relevance.
Reranker (optional) - The reranker takes the initial set of retrieved documents and reorders them so that the most relevant ones are at the top. This helps filter out less useful information and focus on what matters.
Generator - The generator is the LLM that uses the retrieved documents to produce the final output. With access to relevant external data, it can produce more accurate answers.
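
To make the three roles concrete, here is a minimal, self-contained sketch of how they fit together. Everything in it is hypothetical and for illustration only: `retrieve` uses plain term overlap as a stand-in for BM25, `rerank` is any callable you plug in (RankGPT would sit here), and `generate` is a placeholder for an LLM call.

```python
from typing import Callable, List, Optional


def retrieve(query: str, corpus: List[str], top_k: int = 3) -> List[str]:
    """Toy retriever: rank documents by shared lowercase terms (stand-in for BM25)."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc) for doc in corpus]
    # sort() is stable, so ties keep their original corpus order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query: str, docs: List[str]) -> str:
    """Concatenate the (re)ranked documents into a context block for the generator."""
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    return f"Context:\n{context}\n\nQuestion: {query}"


def rag_answer(
    query: str,
    corpus: List[str],
    rerank: Optional[Callable[[str, List[str]], List[str]]] = None,
    generate: Optional[Callable[[str], str]] = None,
    top_k: int = 3,
) -> str:
    docs = retrieve(query, corpus, top_k)      # 1. retriever
    if rerank is not None:
        docs = rerank(query, docs)             # 2. optional reranker
    prompt = build_prompt(query, docs)
    # 3. generator (hypothetical LLM call); return the prompt if none is given
    return generate(prompt) if generate is not None else prompt
```

Because the reranker is just a function from `(query, docs)` to a reordered list, it can be swapped in or out without touching the retriever or the generator.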

The role and benefits of RankGPT in RAG

RankGPT uses LLMs to assess the relevance of retrieved documents or text segments and places the most important ones at the top. With RankGPT, the generator in a RAG pipeline receives higher-quality inputs, which leads to more accurate answers.

Improved relevance and performance

RankGPT goes beyond simple keyword matching by taking into account the meaning and context of queries and documents. This lets it pass more precise information to the LLM, identifying the most relevant content based on what it actually says rather than on keyword overlap alone.

Using GPT-4 with zero-shot instructional permutation generation, RankGPT outperforms leading supervised systems on several benchmarks, including TREC, BEIR, and Mr.TyDi.

Efficient and cost-effective distillation

RankGPT uses permutation distillation to transfer the ranking capabilities of large models such as GPT-4 to smaller, specialized models.

These smaller models retain strong performance and are much more efficient. For example, a distilled 440M model outperformed a 3B supervised model on the BEIR benchmark, significantly reducing computational cost while achieving better results.
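
One way to picture permutation distillation is as a pairwise ranking loss: the teacher (for example GPT-4) supplies an ordering, and the student is penalized whenever it scores a lower-ranked passage above a higher-ranked one. The sketch below uses a RankNet-style loss as an assumption about how such training can be set up; it is not the repository's training code, and `ranknet_loss` is a hypothetical helper.

```python
import math
from typing import Sequence


def ranknet_loss(student_scores: Sequence[float], teacher_ranks: Sequence[int]) -> float:
    """RankNet-style pairwise loss against a teacher permutation.

    teacher_ranks[i] is the 1-based position the teacher gave passage i
    (1 = most relevant). For every pair where the teacher prefers i over j,
    add log(1 + exp(s_j - s_i)), which is small when the student agrees.
    """
    if len(student_scores) != len(teacher_ranks):
        raise ValueError("scores and ranks must have the same length")
    loss = 0.0
    n = len(student_scores)
    for i in range(n):
        for j in range(n):
            if teacher_ranks[i] < teacher_ranks[j]:
                loss += math.log1p(math.exp(student_scores[j] - student_scores[i]))
    return loss
```

A student whose scores follow the teacher's order gets a lower loss than one whose scores are inverted, which is the signal a smaller model would be trained on.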

Handling new and unknown information

RankGPT includes the NovelEval test set to check robustness and address data contamination concerns. This set evaluates a model's ability to rank passages based on recent, previously unknown information.

GPT-4 achieved state-of-the-art performance on this test, demonstrating that it can handle new, previously unseen queries effectively.

RankGPT benchmark performance

RankGPT (gpt-4) outperforms all other models on TREC and BEIR, with an average nDCG@10 of 53.68, as reported in the results table from the source cited below. It achieved the best results on the BEIR datasets, outperforming strong supervised models such as monoT5 (3B) and Cohere Rerank-v2. Even with gpt-3.5-turbo, RankGPT scores competitively, which shows that it is a highly effective reranker.

Source: Weiwei Sun et al., 2023

RankGPT (gpt-4) also performs well on the Mr.TyDi datasets, leading with an average nDCG@10 of 62.93 and outperforming both BM25 and mmarcoCE. It consistently beats BM25 and, in many languages, mmarcoCE as well, especially in Indonesian and Swahili.

Overall, RankGPT achieved the best score in many languages, such as Bengali, Indonesian, and Japanese, with only a few cases where it fell slightly behind mmarcoCE.

Source: Weiwei Sun et al., 2023

Finally, RankGPT was tested on the NovelEval dataset, which measures a model's ability to rank passages based on recent, previously unknown information. RankGPT (gpt-4) achieved the highest score on all evaluation metrics (nDCG@1, nDCG@5, and nDCG@10), including an nDCG@10 of 90.45. It outperformed other strong models such as monoT5 (3B) and monoBERT (340M), which underlines its strength as a reranker.

Source: Weiwei Sun et al., 2023

Across these benchmarks, RankGPT (gpt-4) leads on average over the other methods, supervised or unsupervised, demonstrating its strong re-ranking ability.
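
All of the scores above are nDCG@k values. As a reminder of what that metric measures, here is a small sketch using the common exponential-gain formulation. The function names are hypothetical, and benchmark numbers are normally produced with standard evaluation tooling rather than hand-written code like this.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results (exponential gain)."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """nDCG@k: DCG of the given order divided by the DCG of the ideal order.

    `relevances` holds the graded relevance of each result, in ranked order.
    The ideal order here is computed from the same list of judged results.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already places the most relevant passages first scores 1.0; pushing relevant passages further down lowers the score, which is why a better reranker raises nDCG@10.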

Implementing RankGPT in RAG pipelines

Here is how we can integrate RankGPT into a RAG pipeline.

Step 1: Clone the RankGPT repository

First, clone the RankGPT repository. Run the following command in your terminal:

git clone https://github.com/sunnweiwei/RankGPT

Step 2: Set up your environment

Navigate to the RankGPT directory and install the required packages. You may want to create a virtual environment first and then install the packages listed in the provided requirements.txt:

pip install -r requirements.txt

Step 3: Implementing RankGPT

Here, we use the simple example query and retrieved documents provided in the original RankGPT repository.

item = {
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}
    ]
}
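
In a real pipeline the passages come from your retriever rather than being typed by hand. A minimal sketch for wrapping them in the same structure as the example above is shown here; `build_item` is a hypothetical helper, not part of the RankGPT repository.

```python
from typing import Dict, List


def build_item(query: str, passages: List[str]) -> Dict[str, object]:
    """Wrap a query and retrieved passages in the {'query', 'hits'} layout used above."""
    return {
        "query": query,
        # one dict per passage, in retrieval order, with the text under 'content'
        "hits": [{"content": passage} for passage in passages],
    }
```

The order of `passages` is the initial ranking, so pass them in the order your retriever returned them.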

You can use the provided permutation pipeline to re-rank the retrieved documents with RankGPT.

from rank_gpt import permutation_pipeline
new_item = permutation_pipeline(
    item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo',
    api_key='Your OPENAI Key!'
)
print(new_item)

This produces the following new document order:

{
    'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
    'hits': [
        {'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
        {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
        {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}
    ]
}
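
The call above passes the API key as a literal placeholder string. In practice you will probably not want the key in source code. Here is a small sketch that reads it from an environment variable instead; the variable name `OPENAI_API_KEY` and the helper `get_openai_api_key` are assumptions for illustration.

```python
import os


def get_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError if the variable is unset or empty, so the failure
    happens before any request is sent.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The returned value can then be passed as the `api_key` argument, for example `api_key=get_openai_api_key()`.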

Step-by-step instructions for permutation generation

For finer control over the permutation pipeline, you can call RankGPT's functions directly to create and process permutation instructions, as follows:

from rank_gpt import (
    create_permutation_instruction,
    run_llm,
    receive_permutation
)
# Create the permutation generation instruction
messages = create_permutation_instruction(
    item=item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo'
)

The resulting list of chat messages looks like this:

[{'role': 'system',
  'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'},
 {'role': 'user',
  'content': 'I will provide you with 3 passages, each indicated by number identifier []. \nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'},
 {'role': 'assistant', 'content': 'Okay, please provide the passages.'},
 {'role': 'user',
  'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
 {'role': 'assistant', 'content': 'Received passage [1].'},
 {'role': 'user',
  'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
 {'role': 'assistant', 'content': 'Received passage [2].'},
 {'role': 'user',
  'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
 {'role': 'assistant', 'content': 'Received passage [3].'},
 {'role': 'user',
  'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]

# Get the permutation predicted by ChatGPT
permutation = run_llm(
    messages,
    api_key='Your OPENAI Key!',
    model_name='gpt-3.5-turbo'
)

The model returns the permutation as a string:

'[1] > [3] > [2]'

# Use the permutation to re-rank the passages
item = receive_permutation(
    item,
    permutation,
    rank_start=0,
    rank_end=3
)

The re-ranked item is:

{'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
 'hits': [{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
  {'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
  {'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}]}
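
To see what the last step does conceptually, the sketch below parses a permutation string such as '[1] > [3] > [2]' and applies it to a list of hits. It is an illustration under stated assumptions, not the library's implementation: `parse_permutation` and `apply_permutation` are hypothetical names, and the handling of duplicates and missing identifiers is one reasonable choice.

```python
import re
from typing import List


def parse_permutation(text: str, num_passages: int) -> List[int]:
    """Turn '[1] > [3] > [2]' into zero-based indices [0, 2, 1].

    Out-of-range and repeated identifiers are ignored; any passage the model
    forgot to mention is appended at the end in its original order.
    """
    order: List[int] = []
    for token in re.findall(r"\[(\d+)\]", text):
        index = int(token) - 1
        if 0 <= index < num_passages and index not in order:
            order.append(index)
    missing = [i for i in range(num_passages) if i not in order]
    return order + missing


def apply_permutation(hits: List[dict], order: List[int], rank_start: int = 0) -> List[dict]:
    """Reorder hits[rank_start : rank_start + len(order)] and leave the rest untouched."""
    window = hits[rank_start:rank_start + len(order)]
    reordered = [window[i] for i in order]
    return hits[:rank_start] + reordered + hits[rank_start + len(order):]
```

Being defensive about the model's output matters here, because an LLM may occasionally repeat or omit an identifier even when instructed to return only the ranking.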

Sliding window strategy (SWA) for RankGPT

If you need to rank more documents than the model can process at once, use a sliding window strategy. Here is how to apply it to re-rank the documents:

from rank_gpt import sliding_windows
api_key = "Your OPENAI Key"
new_item = sliding_windows(
    item,
    rank_start=0,
    rank_end=3,
    window_size=2,
    step=1,
    model_name='gpt-3.5-turbo',
    api_key=api_key
)
print(new_item)

In this example, the sliding window has a size of 2 and a step of 1, which means it processes two documents at a time and advances by one document for the next ranking pass.
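
The following sketch simulates the window movement with a stand-in ranker so you can see which slices get re-ranked. It assumes the window starts at the end of the list and slides toward the front, and it replaces the LLM call with any `rank_window` callable; `sliding_window_rerank` is a hypothetical name, not the repository's function.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def sliding_window_rerank(
    passages: List[T],
    rank_window: Callable[[List[T]], List[T]],
    window_size: int,
    step: int,
) -> List[T]:
    """Re-rank a list in overlapping windows, moving from the back to the front."""
    if window_size < 1 or step < 1:
        raise ValueError("window_size and step must be at least 1")
    ranked = list(passages)  # work on a copy
    end = len(ranked)
    start = max(0, end - window_size)
    while True:
        # re-rank only the current window (an LLM call in the real pipeline)
        ranked[start:end] = rank_window(ranked[start:end])
        if start == 0:
            break
        end -= step
        start = max(0, start - step)
    return ranked


# Stand-in ranker: higher number means more relevant
by_score = lambda window: sorted(window, reverse=True)
print(sliding_window_rerank([1, 2, 3], by_score, window_size=2, step=1))  # [3, 1, 2]
```

With a window of 2 and a step of 1, the best passage moves to the top in a single pass, but the remaining order is not guaranteed to be fully sorted. Larger windows or additional passes trade more LLM calls for a more complete ordering.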

Conclusion

By using LLMs to better assess the relevance of information, RankGPT improves the accuracy of ranking and re-ranking content.

This addresses common problems in RAG, such as keeping the retrieved content on point, improving efficiency, and reducing the likelihood of generating misleading information.

Overall, RankGPT contributes to building more reliable and accurate RAG applications.
