"Car-ing is sharing", an auto dealership company for car sales and rentals, is taking its services to the next level thanks to Large Language Models (LLMs).
As its newly recruited AI and NLP developer, you've been asked to prototype a chatbot app with multiple functionalities that not only assist customers but also support the company's human agents.
The solution should receive textual prompts and use a variety of pre-trained Hugging Face LLMs to handle a range of tasks, e.g. classifying the sentiment of a car review, answering a customer question, and summarizing or translating text.
Before you start
In order to complete the project, you may wish to install some Hugging Face libraries such as transformers and evaluate.
!pip install transformers
!pip install evaluate==0.4.0
!pip install datasets==2.10.0
!pip install sentencepiece==0.1.97
# Reduce transformers logging output to warnings and above
from transformers import logging
logging.set_verbosity(logging.WARNING)

Introduction
To address the tasks outlined by the CTO at "Car-ing is sharing", we follow a structured approach using Python and the Hugging Face transformers library, working through each task step by step.
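Before diving into the individual steps, here is a minimal sketch of how the finished app could route a request to the right pipeline. The build_assistant helper and the dispatch keys are illustrative assumptions, not part of the project specification; the model choices mirror the ones used in Steps 1-4.

from transformers import pipeline

def build_assistant():
    """Illustrative sketch: collect the task pipelines used in the steps below."""
    return {
        "sentiment": pipeline(
            "sentiment-analysis",
            model="distilbert-base-uncased-finetuned-sst-2-english"
        ),
        "translation": pipeline(
            "translation_en_to_es",
            model="Helsinki-NLP/opus-mt-en-es"
        ),
        "qa": pipeline(
            "question-answering",
            model="deepset/minilm-uncased-squad2"
        ),
        "summarization": pipeline(
            "summarization",
            model="facebook/bart-large-cnn"
        ),
    }

# Hypothetical usage: dispatch a prompt to one of the task pipelines
assistant = build_assistant()
print(assistant["sentiment"]("The ride quality is excellent."))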
Step 1: Sentiment Classification
We use a pre-trained sentiment analysis model to classify each review and then compute accuracy and F1 against the labels provided in the dataset.
import pandas as pd
from transformers import pipeline
from sklearn.metrics import accuracy_score, f1_score
# Load the dataset from the data/ directory
df = pd.read_csv('data/car_reviews.csv', sep=';')
# Initialize the sentiment analysis pipeline with a specific model
sentiment_pipeline = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english"
)
# Predict sentiment for each review
predicted_labels = sentiment_pipeline(df['Review'].tolist())
# Map predicted labels to binary labels
predictions = [1 if pred['label'] == 'POSITIVE' else 0 for pred in predicted_labels]
# Extract true labels
true_labels = [1 if label == 'POSITIVE' else 0 for label in df['Class']]
# Calculate accuracy and F1 score
accuracy_result = accuracy_score(true_labels, predictions)
f1_result = f1_score(true_labels, predictions)
print(f"Accuracy: {accuracy_result}")
print(f"F1 Score: {f1_result}")Step 2: Translation and BLEU Score Calculation
Step 2: Translation and BLEU Score Calculation
Next, we translate the first two sentences of the first review from English to Spanish and calculate the BLEU score against the provided reference translations.
from transformers import MarianMTModel, MarianTokenizer
import evaluate
# Load the translation model and tokenizer
model_name = 'Helsinki-NLP/opus-mt-en-es'
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
# Extract the first two sentences of the first review
first_review = df['Review'][0]
sentences = first_review.split('.')[:2]
text_to_translate = '.'.join(sentences) + '.'
# Translate the text
translated = model.generate(**tokenizer(text_to_translate, return_tensors="pt"))
translated_review = tokenizer.decode(translated[0], skip_special_tokens=True)
# Load reference translations from the data/ directory
with open('data/reference_translations.txt', 'r') as file:
    references = [line.strip() for line in file]
# Calculate BLEU score
bleu = evaluate.load("bleu")
bleu_score = bleu.compute(predictions=[translated_review], references=[references])
print(f"Translated Review: {translated_review}")
print(f"BLEU Score: {bleu_score['bleu']}")Step 3: Extractive QA
Step 3: Extractive QA
An extractive question-answering model is used to answer a question about what the customer liked about the brand, using the second review as context.
from transformers import pipeline
# Initialize the QA pipeline with the specified model
qa_pipeline = pipeline(
    "question-answering",
    model="deepset/minilm-uncased-squad2"
)
# Define the question and context
question = "What did he like about the brand?"
context = df['Review'][1]
# Get the answer
answer = qa_pipeline(question=question, context=context)['answer']
print(f"Answer: {answer}")Step 4: Summary
Step 4: Summarization
Finally, we summarize the last review in the dataset.
from transformers import pipeline
# Initialize the summarization pipeline with a specific model
summarizer = pipeline(
    "summarization",
    model="facebook/bart-large-cnn"
)
# Summarize the last review
summarized_text = summarizer(df['Review'][4], max_length=55, min_length=50, do_sample=False)[0]['summary_text']
print(f"Summarized Text: {summarized_text}")Summary of Outputs:
Summary of Outputs:
Sentiment Classification:
Accuracy: 0.8
F1 Score: 0.857
Translation:
Translated Review: "Estoy muy satisfecho con mi Nissan NV SL 2014. Utilizo esta camioneta para mis entregas comerciales y uso personal."
BLEU Score: 0.75
QA:
Answer: "ride quality, reliability"
Summarization:
Summarized Text: "The Nissan Rogue provides an affordable SUV experience with great handling and styling. It has strong engine performance and a smooth ride, though blind spots require extra caution."