Skip to content

Solving supply chain logistics problems using SQL

The goal of this project is to use SQL to do step-by-step process to get to a final table that assings routes to every order based on restrictions (e.g. weight restriction, and date restriction).

Database Schema

The database consists of of 7 tables that contain information about products, plants, orders, restrictions, and freight rates that will help us to assign the most suitable route for each order based on restrictions.

Tables Description:

  • Order List: contains the order identifier, the product identifier, unit quantity, and weight.

  • Freight Rates: contains information about the cost, and requirements for the routes done by each carrier.

  • Plant ports: ennumerates the ports available for each plant.

  • VmiCustomers: list of plants that can only be assigned specific customers.

  • Products per Plant: products available on each plant.

  • Wh capacities: the daily capacity of each plant.

  • Wh costs: manufacturing cost of each product.

import pandas as pd
import matplotlib.pyplot as plt

#First, let's load all the tables into our workspace using the pandas library

OrderList = pd.read_csv('OrderList.csv')
PlantPorts = pd.read_csv('PlantPorts.csv')
ProductsPerPlant = pd.read_csv('ProductsPerPlant.csv')
VmiCustomers = pd.read_csv('VmiCustomers.csv')
WhCapacities = pd.read_csv('WhCapacities.csv')
WhCosts = pd.read_csv('WhCosts.csv')
FreightRates = pd.read_csv('FreightRates.csv')
Spinner
DataFrameas
df
variable
--Let's output the OrderList table and leave it here for future reference
SELECT *
FROM OrderList

Different products can be produced on a single plant, but one product one product can't be produced on different plants. Let's group these table by each plant and count the number of products they produce. Also, let's order it from highest to lowest to see which plant has the highest variety.

Spinner
DataFrameas
df2
variable
SELECT order_id, product_id, OrderList.plant_code, product_per_plant_count
FROM OrderList
LEFT JOIN 
		( SELECT plant_code, COUNT(product_id) AS product_per_plant_count
		FROM ProductsPerPlant
		GROUP BY plant_code
		ORDER BY product_per_plant_count DESC) AS products_count
USING(plant_code)

Let's take a look at the busiest routes in one day, in our supply chain.

Spinner
DataFrameas
df1
variable
SELECT origin_port, count(order_id) AS orders_amount_perday
FROM OrderList
GROUP BY origin_port
ORDER BY orders_amount_perday

We can notice with this table that out of 11 ports, just 3 are used for orders.

Let's now take a look at how many orders did each customer do, from highest to lowest

Spinner
DataFrameas
df3
variable
SELECT customer, count(order_id) AS orders_amount
FROM OrderList
GROUP BY customer
ORDER BY orders_amount DESC

We also want to take a look at the total quantity of units the customer ordered, the total weight of the product shipped, the sales for each customer, what was paid to freight for each customer's orders, and the profit margin.

Spinner
DataFrameas
df4
variable
SELECT
	customer, 
	count(order_id) AS orders_amount, 
	SUM(unit_quantity) AS total_units_sold, 
	ROUND(SUM(weight), 2) AS total_weight,
	ROUND(SUM(unit_quantity*cost.cost_unit), 2) AS total_sales
	
FROM OrderList AS orders
INNER JOIN WhCosts AS cost
USING(plant_code)
INNER JOIN FreightRates
GROUP BY customer

ORDER BY orders_amount DESC