Food Delivery App
A food delivery app has just hired you as a data analyst. It coordinates orders from different restaurants to customers in New York City. It has been in operation for only a month and needs more visibility into its business.
The founder would like to know what insights you can extract from the data. For example:
- Are there many repeat customers?
- Do repeat customers like to try different cuisines, or do they have favorite restaurant types?
- Is there a relationship between how long it takes to deliver a meal and the customer's rating?
They would also like to know your recommendations based on what you find. What does the data suggest their next steps should be?
Source of dataset.
Introduction
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("food_order.csv")
df
df.customer_id.value_counts(ascending=False).head(200)
print(df.info())
The dataset has 1898 rows and 9 columns. A closer look at each column is still needed to confirm that the data is clean.
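A minimal sketch of what such a cleanliness check could look like: counting missing values per column, fully duplicated rows, and repeated order ids. The `toy` frame and its values are hypothetical stand-ins for `df`, used only so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical toy frame standing in for food_order.csv; values are illustrative only.
toy = pd.DataFrame({
    "order_id": [1, 2, 3, 2],
    "customer_id": [10, 11, 10, 11],
    "restaurant_name": ["A", "B", None, "B"],
    "cost_of_the_order": [12.5, 8.0, 30.0, 8.0],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()

# Rows that are exact copies of an earlier row
duplicate_rows = int(toy.duplicated().sum())

# Order ids that appear more than once (each order should have its own row)
duplicate_order_ids = int(toy["order_id"].duplicated().sum())

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
print("duplicate order ids:", duplicate_order_ids)
```

The same three calls can be run on `df` directly; non-zero counts would point to columns or rows worth inspecting.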
print('There are', df.customer_id.nunique(), 'unique customers - this will be examined in more detail later.',
'As expected, there are', df.order_id.nunique(), "unique order id's - one for each row.")
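As a rough preview of the repeat-customer question, the sketch below counts orders per customer and keeps those with more than one. The `toy` frame is a hypothetical stand-in for `df`, and treating "repeat customer" as "more than one order" is an assumption rather than a definition given in the brief.

```python
import pandas as pd

# Hypothetical toy orders; customer 10 orders three times, the others once.
toy = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "customer_id": [10, 11, 10, 12, 10],
})

# Orders placed by each customer
orders_per_customer = toy["customer_id"].value_counts()

# Assumption: a repeat customer is one with more than one order
repeat_customers = orders_per_customer[orders_per_customer > 1]

# Share of all customers who ordered more than once
repeat_share = len(repeat_customers) / len(orders_per_customer)

print(repeat_customers)
print(f"share of repeat customers: {repeat_share:.2%}")
```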
df.restaurant_name.unique()
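Several of these names are incorrect: some contain encoding debris, and others carry stray annotations such as "(closed)" or "$1.99 Delivery". These should be fixed: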
badNames = ['Big Wong Restaurant \x8c_¤¾Ñ¼', 'Empanada Mama (closed)', 'Chipotle Mexican Grill $1.99 Delivery', "Joe's Shanghai \x8e_À\x8eü£¾÷´", 'Dirty Bird To Go (archived)', 'CafÌ© China']
goodNames = ['Big Wong Restaurant', 'Empanada Mama', 'Chipotle Mexican Grill', "Joe's Shanghai", 'Dirty Bird To Go', 'Cafe China']
for i, name in enumerate(badNames):
    # Get index values of the rows that hold this bad name
    value = df[df['restaurant_name'] == name].index
    # Replace the bad name with the good name at the same list position
    for x in value:
        df.loc[x, 'restaurant_name'] = goodNames[i]
# Standardize names: strip surrounding whitespace; capitalize() uppercases the first letter and lowercases the rest
df['restaurant_name'] = df['restaurant_name'].apply(lambda x : x.strip().capitalize())
# Check the restaurant names again to verify the changes
df.restaurant_name.unique()
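Scanning the full list of names by eye is easy to get wrong, so a pattern-based check can help confirm nothing was missed. The sketch below flags names containing non-ASCII characters or ending in a parenthetical note. The `names` series is a hypothetical sample, and the two patterns are assumptions about what "bad" looks like; a legitimately accented name would also be flagged and would need a manual look.

```python
import pandas as pd

# Hypothetical sample of names; the second entry mimics the encoding debris seen above.
names = pd.Series([
    "Big Wong Restaurant",
    "Caf\u00cc\u00a9 China",
    "Empanada Mama (closed)",
    "Cafe China",
])

# Names containing any character outside the ASCII range
non_ascii = names[names.str.contains(r"[^\x00-\x7F]")]

# Names ending in a parenthetical annotation such as "(closed)"
parenthetical = names[names.str.contains(r"\(.*\)\s*$")]

print(non_ascii.tolist())
print(parenthetical.tolist())
```

Running the same two filters on `df['restaurant_name']` after the fix should return empty results if every bad name was caught.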
df.cuisine_type.unique()
The cuisine types look normal.
df.cost_of_the_order.describe()
The order costs appear reasonable as well.
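One way to make "reasonable" more concrete is to look for non-positive costs and for values far outside the interquartile range. This is a sketch on a hypothetical `costs` series; the 1.5 × IQR rule is a common convention chosen here as an assumption, not a threshold taken from the dataset.

```python
import pandas as pd

# Hypothetical order costs, including one zero and one implausibly large value.
costs = pd.Series([12.5, 8.0, 30.0, 15.0, 22.0, 9.5, 250.0, 0.0])

# Costs that are zero or negative are suspicious for a paid order
non_positive = costs[costs <= 0]

# Interquartile range and the conventional 1.5 * IQR fences
q1, q3 = costs.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside the fences are candidates for a closer look, not automatic errors
outliers = costs[(costs < lower) | (costs > upper)]

print("non-positive costs:", non_positive.tolist())
print("IQR outliers:", outliers.tolist())
```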
df.day_of_the_week.unique()