Project: Classifying Emails using Llama
Every day, professionals wade through hundreds of emails, from urgent client requests to promotional offers. It's like trying to find important messages in a digital ocean. But AI can help you stay afloat by automatically sorting emails to highlight what matters most.
You've been asked to build an intelligent email assistant using Llama that helps users automatically classify their incoming emails. Your system will identify which emails need immediate attention, which are regular updates, and which are promotions that can wait or be archived.
The Data
You'll work with a dataset of various email examples, ranging from urgent business communications to promotional offers. Here's a peek at what you'll be working with:
email_categories_data.csv
Column | Description |
---|---|
email_id | A unique identifier for each email in the dataset. |
email_content | The full email text including subject line and body. Each email follows a format of "Subject" followed by the message content on a new line. |
expected_category | The correct classification of the email: Priority, Updates, or Promotions. This will be used to validate your model's performance. |
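Because each `email_content` value is described as a subject line followed by the message body on a new line, the two parts can be separated at the first newline. The sketch below is only an illustration and not part of the project code: `split_email` is a hypothetical helper name, and it assumes the first line is always the subject.

```python
from typing import Tuple


def split_email(email_content: str) -> Tuple[str, str]:
    """Split raw email text into (subject, body) at the first newline.

    Hypothetical helper: assumes the subject is the first line, as the
    dataset description states. If there is no newline, the body is empty.
    """
    subject, _, body = email_content.partition("\n")
    return subject.strip(), body.strip()


# Sample text modeled on the dataset description (not a row from the CSV)
sample = "Urgent: Password Reset Required\nPlease reset your password within 24 hours."
subject, body = split_email(sample)
print(subject)
print(body)
```

Splitting only at the first newline keeps multi-line bodies intact, which matters if the subject and body are ever passed to the model separately.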
# Run the following cells first
# Install necessary packages, then import the model by running the cell below
!pip install llama-cpp-python==0.2.82 -q -q -q
SELECT *
FROM 'models.csv'
LIMIT 5
# Import required libraries
import pandas as pd
from llama_cpp import Llama
# Load the email dataset
emails_df = pd.read_csv('data/email_categories_data.csv')
# Display the first few rows of our dataset
print("Preview of our email dataset:")
emails_df.head(2)
# Set the model path
model_path = "/files-integrations/files/c9696c24-44f3-45f7-8ccd-4b9b046e7e53/tinyllama-1.1b-chat-v0.3.Q4_K_M.gguf"
# Initialize the Llama model
llm = Llama(model_path=model_path)
# Create the system prompt with examples
prompt = """You classify emails into Priority, Updates, or Promotions.
Example 1:
Urgent: Password Reset Required
Your account security requires immediate attention. Please reset your password within 24 hours.
Response: Priority
Example 2:
Special Offer - 50% Off Everything!
Don't miss our biggest sale of the year. Everything must go!
Response: Promotions
Example 3:
Canceled Event - Team Meeting
This event has been canceled and removed from your calendar.
Response: Updates
Example 4:
Special Offer - 40% off National Flights with Delta Airlines
We are offering a 40% discount on national flights. Get those suitcases ready!
Response: Promotions
"""
# Function to process messages and return classifications
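def process_message(llm, message, prompt):
    """Process a message and return the model's classification."""
    # End the input with a "Response:" cue so the model completes the label,
    # matching the pattern used by the few-shot examples in the prompt
    input_prompt = f"{prompt}\n{message}\nResponse:"
    response = llm(
        input_prompt,
        max_tokens=5,
        temperature=0,
        stop=["Q:", "\n"],
    )
    return response['choices'][0]['text'].strip()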
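A small model may return text that is close to a label but not an exact match (different casing, trailing punctuation, or a few extra words). One way to handle that is to map the raw text onto the three allowed categories. This is a minimal sketch and not part of the original solution: `normalize_label`, `VALID_CATEGORIES`, and the `"Unknown"` fallback are hypothetical names chosen for illustration.

```python
# The three categories named in the dataset description
VALID_CATEGORIES = ("Priority", "Updates", "Promotions")


def normalize_label(raw_output: str, default: str = "Unknown") -> str:
    """Map raw model text to one of the three categories, else `default`."""
    cleaned = raw_output.strip().lower()
    # Prefer an output that starts with a category name, e.g. "updates."
    for category in VALID_CATEGORIES:
        if cleaned.startswith(category.lower()):
            return category
    # Otherwise accept a category mentioned anywhere, e.g. "Response: Updates"
    for category in VALID_CATEGORIES:
        if category.lower() in cleaned:
            return category
    return default


print(normalize_label(" priority "))
print(normalize_label("Response: Promotions!"))
```

The output of `process_message` could be passed through such a helper before it is stored, so that comparisons against `expected_category` are not affected by formatting differences.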
# Let's test our classifier on two emails from our dataset
# We'll take emails from different categories for variety
test_emails = emails_df.head(2)
# Process each test email and store results
results = []
for idx, row in test_emails.iterrows():
    email_content = row['email_content']
    expected_category = row['expected_category']
    # Get model's classification
    result = process_message(llm, email_content, prompt)
    # Store results
    results.append({
        'email_content': email_content,
        'expected_category': expected_category,
        'model_output': result
    })
# Create a DataFrame with results
results_df = pd.DataFrame(results)
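# Read the classifications from the model output instead of hard-coding them
result1 = results_df['model_output'].iloc[0]
result2 = results_df['model_output'].iloc[1]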
# Display results
print(f"\nClassification Results: \n email 1: {result1} \n email 2: {result2}")
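Since `expected_category` is meant for validating the model, the stored results can be scored by comparing it with `model_output`. The following is a rough sketch under stated assumptions: `accuracy` is a hypothetical helper, the comparison ignores case and surrounding whitespace, and the two lists are made-up illustration values, not results from the dataset or the model.

```python
def accuracy(expected, predicted):
    """Return the fraction of predictions that match the expected labels."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        return 0.0
    # Compare labels case-insensitively, ignoring surrounding whitespace
    matches = sum(
        e.strip().lower() == p.strip().lower()
        for e, p in zip(expected, predicted)
    )
    return matches / len(expected)


# Made-up labels for illustration only (not project results)
expected = ["Priority", "Promotions", "Updates", "Promotions"]
predicted = ["Priority", "Promotions", "Priority", "Promotions"]
print(f"Accuracy: {accuracy(expected, predicted):.2f}")  # prints "Accuracy: 0.75"
```

With the notebook's variables, the same function could be called as `accuracy(results_df['expected_category'].tolist(), results_df['model_output'].tolist())`.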