RankGPT as a re-ranking agent for RAG (Tutorial)
Retrieval augmented generation (RAG) is a technique that makes large language models (LLMs) more capable and more accurate by letting them draw on external information when generating text.
The main challenge, however, is picking the right documents or passages out of a very large collection of data.
RankGPT tackles this problem by improving the re-ranking step in RAG pipelines. It uses the deep language understanding of LLMs to better assess and (re)rank which pieces of information are most relevant.
In this article, we introduce RankGPT and show how you can integrate it into your RAG applications.
Understanding Retrieval Augmented Generation (RAG)
RAG (Retrieval Augmented Generation) is a method that combines LLMs with information retrieval systems. When an LLM is asked to generate text, it can pull relevant information from external sources, which makes its answers more accurate and better informed.
RAG consists of two main components, the retriever and the generator, plus one optional component, the reranker:
- Retriever - The retriever's job is to find relevant documents or text segments in a large document collection based on the user's query. It uses algorithms such as BM25 to rank documents by relevance.
- Reranker (optional) - The reranker takes the initial set of retrieved documents and reorders them so that the most relevant ones are at the top. This helps filter out less useful information and focus on what matters.
- Generator - The generator is the LLM that uses the retrieved documents to produce the final output. With access to relevant external data, it can produce more accurate answers.
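The three components above can be sketched as plain functions. This is a toy illustration only: the term-overlap retriever, the `score_fn` relevance scorer, and the prompt template are all hypothetical stand-ins (a real pipeline would use BM25 or a vector index, an LLM-based reranker such as RankGPT, and an actual LLM call for generation).

```python
from typing import Callable, List


def retrieve(query: str, corpus: List[str], top_k: int = 3) -> List[str]:
    """Toy retriever: rank documents by how many query terms they share."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def rerank(query: str, docs: List[str],
           score_fn: Callable[[str, str], float]) -> List[str]:
    """Reorder retrieved documents using a (hypothetical) relevance scorer."""
    return sorted(docs, key=lambda doc: score_fn(query, doc), reverse=True)


def build_prompt(query: str, docs: List[str]) -> str:
    """Assemble the text that would be sent to the generator LLM."""
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    return f"Answer using the context below.\n\n{context}\n\nQuestion: {query}"


corpus = ["masks reduce virus spread", "bm25 ranks documents", "cats sleep a lot"]
query = "do masks reduce spread"
hits = retrieve(query, corpus, top_k=2)
# Stand-in scorer: word overlap again; an LLM reranker would replace this.
ranked = rerank(query, hits, lambda q, d: len(set(q.split()) & set(d.split())))
prompt = build_prompt(query, ranked)
```

Keeping the reranker behind a simple scoring callback makes it easy to swap in a stronger model later without touching the retriever or the generator.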
The role and benefits of RankGPT in RAG
RankGPT uses LLMs to assess the relevance of retrieved documents or text segments and places the most important ones at the top. With RankGPT, the generator in the RAG pipeline receives higher-quality inputs, which leads to more accurate answers.
Improved relevance and performance
RankGPT goes beyond simple keyword matching by interpreting the meaning and context of both queries and documents. This lets it pass more accurate information to the LLM, selecting content based on what it actually says rather than on keywords alone.
Using GPT-4 with zero-shot instructional permutation generation, RankGPT outperforms leading supervised systems on several benchmarks, including TREC, BEIR, and Mr.TyDi.
Efficient and cost-effective distillation
RankGPT uses permutation distillation to transfer the ranking capabilities of large models, such as GPT-4, to smaller, specialized models.
These smaller models retain high performance and are much more efficient. For example, a distilled 440M model outperformed a supervised 3B model on the BEIR benchmark, significantly reducing computational cost while achieving better results.
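To give a feel for what learning from a teacher's ordering can look like, here is a minimal sketch of a pairwise, RankNet-style loss. Using this particular loss is an assumption made for illustration; the text above does not specify the training objective, and the function name `ranknet_loss` is hypothetical.

```python
import math
from typing import List


def ranknet_loss(student_scores: List[float], teacher_order: List[int]) -> float:
    """Pairwise loss over a teacher permutation.

    teacher_order lists passage indices from most to least relevant.
    The student is penalized whenever it scores a passage the teacher
    ranked lower above a passage the teacher ranked higher.
    """
    loss = 0.0
    for pos, better in enumerate(teacher_order):
        for worse in teacher_order[pos + 1:]:
            margin = student_scores[better] - student_scores[worse]
            # log(1 + exp(-margin)): small when the student agrees with the teacher.
            loss += math.log1p(math.exp(-margin))
    return loss


teacher_order = [0, 2, 1]  # e.g. the permutation "[1] > [3] > [2]", zero-based
agreeing = ranknet_loss([3.0, 1.0, 2.0], teacher_order)
disagreeing = ranknet_loss([1.0, 3.0, 2.0], teacher_order)
```

A student whose scores follow the teacher's order gets a lower loss than one that inverts it, which is the signal a small model would be trained on.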
Handling new and unseen information
The RankGPT work includes the NovelEval test set to check robustness and address data contamination concerns. This set evaluates a model's ability to rank passages based on recent information it has not seen before.
GPT-4 achieved state-of-the-art performance on this test, showing that it can handle new and unseen queries effectively.
RankGPT benchmark performance
RankGPT (gpt-4) outperforms all other models on TREC and BEIR, with an average nDCG@10 score of 53.68, according to the results reported in the source paper. It achieved the best results on the BEIR datasets, beating strong supervised models such as monoT5 (3B) and Cohere Rerank-v2. Even with gpt-3.5-turbo, RankGPT scores competitively, which shows that it is a highly effective reranker.
Source: Weiwei Sun et al., 2023
RankGPT (gpt-4) also performs well on the Mr.TyDi datasets, leading with an average nDCG@10 score of 62.93. It consistently beats BM25 and also beats mmarcoCE in many languages, especially Indonesian and Swahili.
Overall, RankGPT obtained the best score in many languages, such as Bengali, Indonesian, and Japanese, with only a few cases where it fell slightly behind mmarcoCE.
Source: Weiwei Sun et al., 2023
Finally, RankGPT was tested on the NovelEval dataset, which measures a model's ability to rank passages based on recent, unseen information. RankGPT (gpt-4) achieved the highest score on all evaluation metrics (nDCG@1, nDCG@5, and nDCG@10), including an nDCG@10 score of 90.45. It outperformed other strong models such as monoT5 (3B) and monoBERT (340M), which underlines its strength as a reranker.
Source: Weiwei Sun et al., 2023
Across all benchmark results, RankGPT (gpt-4) consistently outperforms other methods, supervised or unsupervised, demonstrating its superior re-ranking ability.
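The scores in this section are nDCG@k values, which appear to be reported on a 0 to 100 scale. As a rough sketch of how the metric works, the snippet below computes nDCG@k from a list of graded relevance labels in ranked order. It assumes the exponential-gain variant of DCG; evaluation tools differ in their gain convention, so this is not necessarily the exact formula behind the reported numbers.

```python
import math
from typing import List


def dcg_at_k(relevances: List[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (exponential gain)."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: List[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


perfect = ndcg_at_k([3, 2, 1])   # most relevant passage ranked first
reversed_ = ndcg_at_k([1, 2, 3])  # most relevant passage ranked last
```

A reranker improves nDCG when it moves highly relevant passages toward the top, because gains at early positions are discounted the least.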
Implementing RankGPT in RAG pipelines
Here is how we can integrate RankGPT into a RAG pipeline.
Step 1: Clone the RankGPT repository
First, you need to clone the RankGPT repository. Run the following command in your terminal:
git clone https://github.com/sunnweiwei/RankGPT
Step 2: Set up your environment
Navigate to the RankGPT directory and install the required packages. You may want to create a virtual environment first, then install the packages listed in the provided requirements.txt file:
pip install -r requirements.txt
Step 3: Implementing RankGPT
Here we use the simple example query and retrieved documents provided in the original RankGPT repository.
item = {
'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
'hits': [
{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
{'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
{'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'}
]
}
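In a real pipeline, the `hits` list would come from your retriever rather than being typed by hand. A minimal sketch, assuming your retriever returns plain strings; `build_item` is a hypothetical helper, not part of the RankGPT repository, and it only mirrors the `query` / `hits` / `content` layout shown above.

```python
from typing import Dict, List


def build_item(query: str, passages: List[str]) -> Dict:
    """Wrap a query and retrieved passages in the dict layout used above."""
    return {
        "query": query,
        "hits": [{"content": text} for text in passages],
    }


example_item = build_item(
    "How much impact do masks have on preventing the spread of the COVID-19?",
    ["Title: first passage Content: ...", "Title: second passage Content: ..."],
)
```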
You can use the provided permutation pipeline to rank the retrieved documents with RankGPT in a single call.
from rank_gpt import permutation_pipeline

new_item = permutation_pipeline(
    item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo',
    api_key='Your OPENAI Key!'
)
print(new_item)
This produces the following new document order:
{
'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
'hits': [
{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
{'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
{'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}
]
}
Step-by-step instructions for permutation generation
For a more detailed look at the permutation pipeline, you can call RankGPT's building blocks directly to create and process the permutation instructions, as follows:
from rank_gpt import (
    create_permutation_instruction,
    run_llm,
    receive_permutation
)

# Create permutation generation instruction
messages = create_permutation_instruction(
    item=item,
    rank_start=0,
    rank_end=3,
    model_name='gpt-3.5-turbo'
)
[{'role': 'system',
'content': 'You are RankGPT, an intelligent assistant that can rank passages based on their relevancy to the query.'},
{'role': 'user',
'content': 'I will provide you with 3 passages, each indicated by number identifier []. \\nRank the passages based on their relevance to query: How much impact do masks have on preventing the spread of the COVID-19?.'},
{'role': 'assistant', 'content': 'Okay, please provide the passages.'},
{'role': 'user',
'content': '[1] Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
{'role': 'assistant', 'content': 'Received passage [1].'},
{'role': 'user',
'content': '[2] Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'},
{'role': 'assistant', 'content': 'Received passage [2].'},
{'role': 'user',
'content': '[3] Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
{'role': 'assistant', 'content': 'Received passage [3].'},
{'role': 'user',
'content': 'Search Query: How much impact do masks have on preventing the spread of the COVID-19?. \\nRank the 3 passages above based on their relevance to the search query. The passages should be listed in descending order using identifiers. The most relevant passages should be listed first. The output format should be [] > [], e.g., [1] > [2]. Only response the ranking results, do not say any word or explain.'}]
# Get ChatGPT predicted permutation
permutation = run_llm(
    messages,
    api_key='Your OPENAI Key!',
    model_name='gpt-3.5-turbo'
)
'[1] > [3] > [2]'
# Use permutation to re-rank the passage
item = receive_permutation(
    item,
    permutation,
    rank_start=0,
    rank_end=3
)
{'query': 'How much impact do masks have on preventing the spread of the COVID-19?',
'hits': [{'content': 'Title: Universal Masking is Urgent in the COVID-19 Pandemic: SEIR and Agent Based Models, Empirical Validation, Policy Recommendations Content: We present two models for the COVID-19 pandemic predicting the impact of universal face mask wearing upon the spread of the SARS-CoV-2 virus--one employing a stochastic dynamic network based compartmental SEIR (susceptible-exposed-infectious-recovered) approach, and the other employing individual ABM (agent-based modelling) Monte Carlo simulation--indicating (1) significant impact under (near) universal masking when at least 80% of a population is wearing masks, versus minimal impact when only 50% or less of the population is wearing masks, and (2) significant impact when universal masking is adopted early, by Day 50 of a regional outbreak, versus minimal impact when universal masking is adopted late. These effects hold even at the lower filtering rates of homemade masks. To validate these theoretical models, we compare their predictions against a new empirical data set we have collected'},
{'content': 'Title: To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic Content: Face mask use by the general public for limiting the spread of the COVID-19 pandemic is controversial, though increasingly recommended, and the potential of this intervention is not well understood. We develop a compartmental model for assessing the community-wide impact of mask use by the general, asymptomatic public, a portion of which may be asymptomatically infectious. Model simulations, using data relevant to COVID-19 dynamics in the US states of New York and Washington, suggest that broad adoption of even relatively ineffective face masks may meaningfully reduce community transmission of COVID-19 and decrease peak hospitalizations and deaths. Moreover, mask use decreases the effective transmission rate in nearly linear proportion to the product of mask effectiveness (as a fraction of potentially infectious contacts blocked) and coverage rate (as'},
{'content': 'Title: Masking the general population might attenuate COVID-19 outbreaks Content: The effect of masking the general population on a COVID-19 epidemic is estimated by computer simulation using two separate state-of-the-art web-based softwares, one of them calibrated for the SARS-CoV-2 virus. The questions addressed are these: 1. Can mask use by the general population limit the spread of SARS-CoV-2 in a country? 2. What types of masks exist, and how elaborate must a mask be to be effective against COVID-19? 3. Does the mask have to be applied early in an epidemic? 4. A brief general discussion of masks and some possible future research questions regarding masks and SARS-CoV-2. Results are as follows: (1) The results indicate that any type of mask, even simple home-made ones, may be effective. Masks use seems to have an effect in lowering new patients even the protective effect of each mask (here dubbed"one-mask protection") is'}]}
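To make the last step more concrete, the sketch below shows one way a permutation string such as `'[1] > [3] > [2]'` can be turned into a new document order. It is an illustration under stated assumptions, not the library's code: `parse_permutation` and `apply_permutation` are hypothetical names, and the handling of duplicate or missing identifiers is a design choice made here for robustness.

```python
import copy
import re
from typing import Dict, List


def parse_permutation(permutation: str, num_passages: int) -> List[int]:
    """Turn '[1] > [3] > [2]' into zero-based indices [0, 2, 1]."""
    order: List[int] = []
    for token in re.findall(r"\d+", permutation):
        index = int(token) - 1
        # Skip out-of-range identifiers and duplicates the model may emit.
        if 0 <= index < num_passages and index not in order:
            order.append(index)
    # Passages the model left out keep their original relative order at the end.
    missing = [i for i in range(num_passages) if i not in order]
    return order + missing


def apply_permutation(item: Dict, permutation: str) -> Dict:
    """Return a copy of item with its hits reordered by the permutation."""
    order = parse_permutation(permutation, len(item["hits"]))
    reranked = copy.deepcopy(item)
    reranked["hits"] = [copy.deepcopy(item["hits"][i]) for i in order]
    return reranked


demo = {"query": "q", "hits": [{"content": "A"}, {"content": "B"}, {"content": "C"}]}
reordered = apply_permutation(demo, "[1] > [3] > [2]")
```

Tolerating malformed output matters in practice, because an LLM may repeat an identifier or omit one even when instructed to return only the ranking.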
Sliding window strategy (SWA) for RankGPT
If you need to rank more documents than the model can process at once, use a sliding window strategy. Here is how to apply a sliding window to re-rank the documents:
from rank_gpt import sliding_windows

api_key = "Your OPENAI Key"
new_item = sliding_windows(
    item,
    rank_start=0,
    rank_end=3,
    window_size=2,
    step=1,
    model_name='gpt-3.5-turbo',
    api_key=api_key
)
print(new_item)
In this example, the sliding window has a size of 2 and a step size of 1, which means it processes two documents at a time and shifts by one document for the next ranking pass.
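The effect of the window size and step can be sketched by listing which slices of the ranked list each pass would cover. This assumes the windows move from the end of the list toward the top, as described in the RankGPT paper; `window_spans` is a hypothetical helper written for illustration, and the library's own loop may differ in detail.

```python
from typing import List, Tuple


def window_spans(rank_start: int, rank_end: int,
                 window_size: int, step: int) -> List[Tuple[int, int]]:
    """List (start, end) slices, end exclusive, visited back to front."""
    spans = []
    end_pos = rank_end
    start_pos = rank_end - window_size
    while start_pos >= rank_start:
        spans.append((start_pos, end_pos))
        # Shift the window toward the top of the list by `step` positions.
        end_pos -= step
        start_pos -= step
    return spans


small = window_spans(rank_start=0, rank_end=3, window_size=2, step=1)
large = window_spans(rank_start=0, rank_end=100, window_size=20, step=10)
```

Because consecutive windows overlap, a passage that ranks well in one pass can be carried upward into the next window and keep climbing toward the top.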
Conclusion
By using LLMs to better assess the relevance of information, RankGPT improves the accuracy of ranking and re-ranking content.
This addresses common problems: it helps keep the retrieved content on point, improves efficiency, and reduces the likelihood of generating misleading information.
Overall, RankGPT contributes to building more reliable and accurate RAG applications.