
INTRODUCTION

This data analysis project delves into Glovo's food delivery operations in the city of Glovalia. Utilizing comprehensive data from one week of both delivered and canceled orders, the project aims to identify key factors influencing delivery times and develop actionable strategies to enhance efficiency.

The dataset contains the following:

  • Store_address_id is the unique id identifying which store the order was for.

  • Courier_id is the id of the last courier the order was assigned to.

  • Vertical takes one of the following values:

  • WALL - Partner means the order was for a store which is our partner (they receive the orders through the partner web-app before the courier arrives there, and we charge a commission to them).

  • WALL - NonPartner means the order was for a store which we have as “Fake” (we have no agreements with this store, so the courier needs to go inside and order as a regular customer and we don’t charge partner commission either).

  • QUIERO means the order was placed through the central “Anything” button in the app, where the customer can ask us to deliver anything that fits into a Glovo backpack. Usually this type of order is assigned to non-partner stores.

  • Transport is the vehicle type used by the courier who was last assigned to the order.

  • Number of assignments shows how many times we needed to assign different couriers to the orders (e.g. if it is equal to 2, it means that the first courier assigned refused to do the order and we needed to find a second courier).

  • Total distance is the total real distance, in km, that the courier travelled to deliver the order.

One of the main KPIs used to evaluate the performance of operations is the percentage of food orders delivered in under 45 minutes. Delivery time is affected by a wide variety of factors and consists of a few major parts:

Time to assign (how much time we need to find an available courier and match an order; heavily impacted by reassignments: once a courier receives an order, he/she is free to refuse it, so we need to spend extra time looking for the next courier).

Time for the courier to start (the time between the moment we assigned the order and the moment the courier started to perform it; heavily impacted by whether the courier had another order ongoing when the next one was assigned).

Time from the moment the courier starts the order until he/she enters the pick-up area (we consider a courier to have entered the pick-up area once he/she is within a 100 m radius of the store).

Waiting time at the pick-up area (the time between the moment the courier enters the pick-up zone and the moment the order is actually collected).

Time to cover the distance from the store to the customer.

Waiting time at the delivery area (we consider a courier to have entered the delivery area once he/she is within a 100 m radius of the customer location, so this is the time between that moment and the actual delivery, which is when the customer signs for the order in-app).
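To make the KPI concrete, here is a minimal sketch of how the six stages above add up to a delivery time and how the under-45-minutes share could be computed from them. The column names and the four toy orders are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Toy data: one row per delivered order, one column per stage, in minutes.
# All column names and values are hypothetical, for illustration only.
orders = pd.DataFrame({
    "time_to_assign":      [2, 10, 3, 1],
    "time_to_start":       [1, 5, 2, 1],
    "time_to_pickup_area": [6, 8, 5, 4],
    "wait_at_pickup":      [8, 15, 6, 5],
    "time_to_customer":    [10, 12, 9, 7],
    "wait_at_delivery":    [3, 4, 2, 2],
})

# Average duration of each stage, computed before the total column is added.
stage_means = orders.mean()

# Total delivery time is the sum of the six stages.
orders["delivery_minutes"] = orders.sum(axis=1)

# KPI: share of orders delivered in strictly under 45 minutes.
under_45_share = (orders["delivery_minutes"] < 45).mean()

print(f"Delivered in under 45 min: {under_45_share:.0%}")
print(stage_means.sort_values(ascending=False))
```

Sorting the stage averages gives a quick first view of where most of the time goes. The sketch uses a strict "under 45" threshold; if the target is read as "45 minutes or less", the comparison becomes `<= 45`.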

GOAL:

Based on the attached data, suggest a 4-week plan to improve the delivery time of food orders in the city of Glovalia so that 85% of food deliveries are completed in under 45 minutes.

Summary of Key Findings:

  • Reassignments Impact: There's a significant increase in average delivery time as the number of reassignments rises.

  • Day and Time Analysis: Certain days (Wednesday, Saturday, Tuesday) and hours (evening times) have higher average delivery times and order volumes.

  • Waiting Times: The most significant waiting time is observed at the pickup points, indicating a potential area for improvement.

  • Transport Mode: Cars have the longest delivery times and should be discouraged. Motorbikes, having shorter delivery times, should be prioritized, especially during peak hours.

  • Distance and delivery time: The data shows some correlation between distance travelled and average delivery time; further investigation is advised.
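The findings above come down to a few group-wise comparisons. The sketch below shows one way such comparisons could be computed with pandas. It uses toy values: `number_of_assignments` and `total_real_distance` match column names used later in this analysis, while `transport` and `delivery_minutes` are assumed names.

```python
import pandas as pd

# Toy data; the values are invented to illustrate the calculations only.
df = pd.DataFrame({
    "number_of_assignments": [1, 1, 2, 2, 3, 1],
    "transport": ["MOTORBIKE", "BICYCLE", "CAR", "MOTORBIKE", "CAR", "BICYCLE"],
    "total_real_distance": [2.0, 1.5, 4.0, 3.0, 5.5, 2.5],
    "delivery_minutes": [28, 30, 50, 40, 65, 33],  # hypothetical column
})

# Average delivery time per number of assignments.
by_assignments = df.groupby("number_of_assignments")["delivery_minutes"].mean()

# Average delivery time per transport type, fastest first.
by_transport = df.groupby("transport")["delivery_minutes"].mean().sort_values()

# Pearson correlation between distance travelled and delivery time.
distance_corr = df["total_real_distance"].corr(df["delivery_minutes"])

print(by_assignments)
print(by_transport)
print(f"Distance vs. delivery time correlation: {distance_corr:.2f}")
```

Group averages like these do not control for other factors. If cars tend to be assigned the longer trips, for instance, part of their longer delivery time may be explained by distance rather than by the vehicle itself.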

4-Week Plan to Improve Food Delivery Times:

TARGET: Reach 85% of total food deliveries in 45 minutes or less.

Week 1: Reducing Reassignments and Optimizing Courier Assignment

Focus: Implement strategies to minimize reassignments, which are significantly increasing delivery times.

Actions:

  • Refine the courier matching algorithm to reduce the likelihood of reassignments.

  • Introduce incentives for couriers to accept and complete orders.

  • Start a performance-based reward system for timely deliveries.

Expected Outcome: Reduced average delivery time by decreasing reassignments.

Week 2: Implementing Dynamic Pricing and Slot Optimization

Focus: Adjust pricing and courier availability during peak hours.

Actions:

  • Introduce dynamic pricing to encourage orders during off-peak hours.

  • Increase courier availability during peak periods (Wednesday, Saturday, Tuesday evenings).

Expected Outcome: More even distribution of orders throughout the week, reducing pressure during peak times.
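As a rough illustration of how peak slots could be identified before adjusting pricing or courier supply, the sketch below ranks weekday and hour combinations by average delivery time and order volume. The `activation_time` and `delivery_minutes` columns are assumed names and the rows are toy data.

```python
import pandas as pd

# Toy data; column names are hypothetical.
df = pd.DataFrame({
    "activation_time": pd.to_datetime([
        "2024-01-02 20:15", "2024-01-02 20:40", "2024-01-03 13:05",
        "2024-01-03 21:10", "2024-01-06 20:30", "2024-01-06 20:55",
    ]),
    "delivery_minutes": [48, 52, 30, 47, 55, 50],
})

# Derive the weekday name and hour of day for each order.
df["weekday"] = df["activation_time"].dt.day_name()
df["hour"] = df["activation_time"].dt.hour

# Order count and average delivery time per (weekday, hour) slot, slowest first.
slots = (
    df.groupby(["weekday", "hour"])["delivery_minutes"]
      .agg(orders="count", avg_minutes="mean")
      .sort_values("avg_minutes", ascending=False)
)
print(slots)
```

Slots that combine a high order count with a high average are the natural first candidates for extra courier capacity.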

Week 3: Encouraging Efficient Modes of Transportation

Focus: Prioritize faster transportation modes.

Actions:

  • Discourage the use of cars for deliveries due to longer delivery times.

  • Promote the use of motorbikes, especially during peak hours.

Expected Outcome: Improved delivery efficiency by leveraging faster transport modes.

Week 4: Enhancing Waiting Time Management

Focus: Reduce waiting times, especially at pickup points.

Actions:

  • Work with partner stores to optimize food preparation times.

  • Implement a real-time notification system for couriers to align their arrival with order readiness.

Expected Outcome: Reduced overall delivery time by minimizing courier waiting periods.
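To decide which partner stores to approach first, one option is to rank stores by their typical pick-up waiting time. The sketch below assumes a hypothetical `wait_at_pickup` column (in minutes) next to `store_address_id`. The minimum-order threshold is an arbitrary illustration and the rows are toy data.

```python
import pandas as pd

# Toy data; "wait_at_pickup" is a hypothetical column name.
df = pd.DataFrame({
    "store_address_id": [101, 101, 101, 202, 202, 303],
    "wait_at_pickup": [12.0, 15.0, 18.0, 4.0, 6.0, 25.0],
})

MIN_ORDERS = 2  # ignore stores with too few orders to judge (arbitrary cut-off)

# Order count and median waiting time per store.
per_store = df.groupby("store_address_id")["wait_at_pickup"].agg(
    orders="count", median_wait="median"
)

# Stores with enough orders, longest typical wait first.
shortlist = (
    per_store[per_store["orders"] >= MIN_ORDERS]
    .sort_values("median_wait", ascending=False)
)
print(shortlist)
```

The median is used rather than the mean so that a single unusually slow order does not dominate a store's ranking.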

Continuous Monitoring and Adjustment

Throughout the 4-week period, it's crucial to:

  • Monitor KPIs like the percentage of deliveries under 45 minutes, courier efficiency, and customer satisfaction.

  • Analyze the cancellation rates and reasons behind any cancellations.

  • Adjust the strategies based on real-time data and feedback.
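A simple way to monitor the main KPI during the plan is to track the daily share of deliveries under 45 minutes against the 85% target. The sketch below uses toy data with assumed column names (`order_date`, `delivery_minutes`).

```python
import pandas as pd

TARGET = 0.85  # 85% of food deliveries under 45 minutes

# Toy data; column names are hypothetical.
df = pd.DataFrame({
    "order_date": pd.to_datetime(["2024-01-01"] * 4 + ["2024-01-02"] * 4),
    "delivery_minutes": [30, 40, 50, 44, 20, 30, 35, 40],
})

# Flag each order, then average the flag per day to get the daily share.
df["under_45"] = df["delivery_minutes"] < 45
daily = df.groupby("order_date")["under_45"].mean()

# Days on which the target was missed.
below_target = daily[daily < TARGET]

print(daily)
print("Days below target:", [d.date().isoformat() for d in below_target.index])
```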

Pilot and Scale

Simulate the effect of implemented changes.

Gradually expand the strategies across the city as they prove effective.
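One rough way to simulate a change before piloting it is a what-if calculation on historical orders. The sketch below is an illustrative assumption, not a result from the dataset. It asks how the KPI would move if reassigned orders had taken no longer to assign than a typical single-assignment order. The `time_to_assign` and `delivery_minutes` columns are hypothetical names and the rows are toy data.

```python
import pandas as pd

# Toy data; "time_to_assign" and "delivery_minutes" are hypothetical columns.
df = pd.DataFrame({
    "number_of_assignments": [1, 1, 2, 3, 1, 2],
    "time_to_assign": [2.0, 3.0, 9.0, 15.0, 2.0, 8.0],
    "delivery_minutes": [30.0, 38.0, 47.0, 58.0, 41.0, 49.0],
})

def under_45_share(minutes: pd.Series) -> float:
    """Share of orders delivered in strictly under 45 minutes."""
    return float((minutes < 45).mean())

baseline = under_45_share(df["delivery_minutes"])

# Typical assignment time when no reassignment was needed.
typical_assign = df.loc[df["number_of_assignments"] == 1, "time_to_assign"].median()

# Minutes saved per order if reassigned orders matched that typical time.
reassigned = df["number_of_assignments"] > 1
saved = (df["time_to_assign"] - typical_assign).clip(lower=0).where(reassigned, 0)

scenario = under_45_share(df["delivery_minutes"] - saved)
print(f"Baseline: {baseline:.0%}, what-if scenario: {scenario:.0%}")
```

This is an optimistic upper bound because it assumes reassignments can be removed entirely and that nothing else changes. A real pilot would be needed to confirm the size of the effect.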

Additional resources:

Delivery time data: link

Exploring the data

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Load your dataset into a DataFrame
data1 = pd.read_excel("DATA_1_BC.xlsx")

# Print the number of rows and columns
print("Number of rows and columns:", data1.shape)

# Remove columns with names starting with 'Unnamed'
data1 = data1.loc[:, ~data1.columns.str.startswith('Unnamed')]

# Print out the first five rows
data1.head()

# Display information about the dataframe 'data1'
data1.info()

1. Checking for missing values

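# Check for missing values in the 'data1' dataframe
data1.isna().sum()

# Fill missing values in the "store_address_id" column with 0 (a placeholder id).
# Assigning the result back avoids fillna(inplace=True) on a column selection,
# which recent pandas versions warn about.
data1["store_address_id"] = data1["store_address_id"].fillna(0)

# Note: the placeholder id 0 is counted as one store in the second figure
print("Number of unique couriers:", data1["courier_id"].nunique())
print("Number of unique stores:", data1["store_address_id"].nunique())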

2. Exploring the canceled deliveries

# Filter the rows in data1 which have a canceled status
not_delivered = data1[data1["final_status"]=="CanceledStatus"]

# Display the filtered dataframe
display(not_delivered)

# Filter the rows in not_delivered that are duplicates
duplicates = not_delivered[not_delivered.duplicated()]

# Report whether there are any duplicates
if duplicates.empty:
    print("No duplicates")
else:
    print("Number of duplicate rows:", len(duplicates))

# Display descriptive statistics for the "number_of_assignments" and "total_real_distance" columns in not_delivered
not_delivered[["number_of_assignments", "total_real_distance"]].describe().round(2)

# Display descriptive statistics for the "number_of_assignments" and "total_real_distance" columns in data1
data1[["number_of_assignments", "total_real_distance"]].describe().round(2)
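The two summaries above can also be compared side by side in a single table, and the cancellation rate can be broken down by number of assignments. The sketch below runs on a small toy frame rather than the real file. The "CanceledStatus" label and the column names follow the code above, while "DeliveredStatus" is an assumed label for completed orders.

```python
import pandas as pd

# Toy stand-in for data1; the values are invented for illustration.
sample = pd.DataFrame({
    "final_status": ["DeliveredStatus", "CanceledStatus", "DeliveredStatus",
                     "CanceledStatus", "DeliveredStatus"],  # "DeliveredStatus" is assumed
    "number_of_assignments": [1, 3, 1, 2, 2],
    "total_real_distance": [2.5, 0.0, 3.1, 1.2, 4.0],
})

# Flag canceled orders once, then reuse the flag.
sample["is_canceled"] = sample["final_status"] == "CanceledStatus"

# Mean assignments and distance for delivered (False) vs. canceled (True) orders.
summary = (
    sample.groupby("is_canceled")[["number_of_assignments", "total_real_distance"]]
    .mean()
    .round(2)
)

# Share of canceled orders for each number of assignments.
cancel_rate = sample.groupby("number_of_assignments")["is_canceled"].mean()

print(summary)
print(cancel_rate)
```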