You're working for a well-known car manufacturer that is looking at implementing LLMs in its vehicles to provide guidance to drivers. You've been asked to experiment with integrating car manuals with an LLM to create a context-aware chatbot. The hope is that this context-aware LLM can be hooked up to text-to-speech software to read the model's responses aloud.
As a proof of concept, you'll integrate several pages from a car manual that contain car warning messages, their meanings, and the recommended actions. This particular manual, stored as the HTML file mg-zs-warning-messages.html, is from an MG ZS, a compact SUV. Armed with your newfound knowledge of LLMs and LangChain, you'll implement Retrieval Augmented Generation (RAG) to create the context-aware chatbot.
Before you start
In order to complete the project you will need to create a developer account with OpenAI and store your API key as a secure environment variable. Instructions for these steps are outlined below.
Create a developer account with OpenAI
- Go to the API signup page.
- Create your account (you'll need to provide your email address and your phone number).
- Go to the API keys page.
- Create a new secret key.
- Take a copy of it. (If you lose it, delete the key and create a new one.)
Add a payment method
OpenAI sometimes provides free credits for the API, but this can vary depending on geography. You may need to add debit/credit card details.
This project should cost less than 1 US cent with gpt-4o-mini (but you will be charged each time you rerun tasks).
- Go to the Payment Methods page.
- Click Add payment method.
- Fill in your card details.
Add an environment variable with your OpenAI key
- In the workbook, click "Environment" in the top toolbar and select "Environment variables".
- Click "Add" to add an environment variable.
- In the "Name" field, type "OPENAI_API_KEY". In the "Value" field, paste in your secret key.
- Click "Create", then click "Connect" in the pop-up window that appears. Wait 5-10 seconds for the kernel to restart, or restart it manually from the Run menu.
# Run this cell to install the necessary packages
import subprocess
import pkg_resources

def install_if_needed(package, version):
    '''Function to ensure that the libraries used are consistent to avoid errors.'''
    try:
        pkg = pkg_resources.get_distribution(package)
        if pkg.version != version:
            raise pkg_resources.VersionConflict(pkg, version)
    except (pkg_resources.DistributionNotFound, pkg_resources.VersionConflict):
        subprocess.check_call(["pip", "install", f"{package}=={version}"])

install_if_needed("langchain-core", "0.3.18")
install_if_needed("langchain-openai", "0.2.8")
install_if_needed("langchain-community", "0.3.7")
install_if_needed("unstructured", "0.14.4")
install_if_needed("langchain-chroma", "0.1.4")
install_if_needed("langchain-text-splitters", "0.3.2")

# Set your API key to a variable
import os
openai_api_key = os.environ["OPENAI_API_KEY"]
# Import the required packages
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_community.document_loaders import UnstructuredHTMLLoader
from langchain_openai import OpenAIEmbeddings
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma

# Load the HTML as a LangChain document loader
loader = UnstructuredHTMLLoader(file_path="data/mg-zs-warning-messages.html")
car_docs = loader.load()

# Start coding here, use as many cells as you like

# 1. Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
car_docs_split = text_splitter.split_documents(car_docs)
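# (Optional sanity check, not part of the required steps) See how many chunks
# the splitter produced and preview the first one
print(f"Split into {len(car_docs_split)} chunks")
print(car_docs_split[0].page_content[:200])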
# 2. Embed the documents
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
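# (Optional) Quick check that the embedding model responds; embed_query returns a
# fixed-length vector for a single string (this makes one small, inexpensive API call)
print(len(embeddings.embed_query("engine warning light")))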
# 3. Create the vectorstore
vectorstore = Chroma.from_documents(documents=car_docs_split, embedding=embeddings)
# 4. Create the retriever
retriever = vectorstore.as_retriever()
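# (Optional) Sanity-check retrieval before wiring up the full chain; the retriever
# returns the Document chunks most similar to the query
retrieved_docs = retriever.invoke("Gasoline Particulate Filter Full")
print([doc.page_content[:100] for doc in retrieved_docs])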
# 5. Set up the model (using gpt-4o-mini)
llm = ChatOpenAI(model="gpt-4o-mini", openai_api_key=openai_api_key)
# 6. Set up a prompt
template = """You are an expert car assistant.
Answer the user's question based only on the context provided.
If the answer is not in the context, say "I'm sorry, I don't have information about that."
Context:
{context}
Question: {question}
Answer:"""
prompt = ChatPromptTemplate.from_template(template)
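# (Optional) Preview how the prompt renders; the placeholder values below are
# illustrative only, not real manual content
print(prompt.format(context="<retrieved manual excerpts>", question="<driver question>"))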
# 7. Create the RAG chain
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
)
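# (Optional variant, not part of the required solution) Formatting the retrieved
# Documents into plain text before they reach the prompt keeps Document metadata
# out of the context; format_docs is a helper defined here, not a LangChain function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain_formatted = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
)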
# 8. Ask the question
user_question = "The Gasoline Particular Filter Full warning has appeared. What does this mean and what should I do about it?"
answer = rag_chain.invoke(user_question).content
# Display the answer
print(answer)
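# (Optional) A small convenience wrapper (ask_manual is a name chosen here) so
# follow-up questions, for example ones later passed to text-to-speech, reuse the
# same chain; the example question is illustrative, substitute any warning covered
# in the manual
def ask_manual(question):
    """Return the chatbot's answer to a single driver question."""
    return rag_chain.invoke(question).content

print(ask_manual("What does the Engine Coolant Temperature High warning mean and what should I do?"))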