Project: Analyzing Motorcycle Part Sales

You're working for a company that sells motorcycle parts, and they've asked for some help in analyzing their sales data!

They operate three warehouses in the area, selling both retail and wholesale. They offer a variety of parts and accept credit cards, cash, and bank transfer as payment methods. However, each payment type incurs a different fee.

The board of directors wants to gain a better understanding of wholesale revenue by product line, and how this varies month-to-month and across warehouses. You have been tasked with calculating net revenue for each product line and grouping results by month and warehouse. The results should be filtered so that only "Wholesale" orders are included.

They have provided you with access to their database, which contains the following table called sales:

Sales

Column	Data type	Description
`order_number`	`VARCHAR`	Unique order number.
`date`	`DATE`	Date of the order, from June to August 2021.
`warehouse`	`VARCHAR`	The warehouse that the order was made from: `North`, `Central`, or `West`.
`client_type`	`VARCHAR`	Whether the order was `Retail` or `Wholesale`.
`product_line`	`VARCHAR`	Type of product ordered.
`quantity`	`INT`	Number of products ordered.
`unit_price`	`FLOAT`	Price per product (dollars).
`total`	`FLOAT`	Total price of the order (dollars).
`payment`	`VARCHAR`	Payment method: `Credit card`, `Transfer`, or `Cash`.
`payment_fee`	`FLOAT`	Percentage of `total` charged as a result of the `payment` method.

Your query output should be presented in the following format:

`product_line`	`month`	`warehouse`	`net_revenue`
product_one	---	---	---
product_one	---	---	---
product_one	---	---	---
product_one	---	---	---
product_one	---	---	---
product_one	---	---	---
product_two	---	---	---
...	...	...	...
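
As a rough cross-check of the calculation described above, the same net revenue logic can be sketched in pandas. The rows below are made-up sample data, not values from the real `sales` table, and the sketch assumes `payment_fee` is stored as a fraction of `total` (for example, 0.03 for 3%).

```python
import pandas as pd

# Hypothetical sample rows; the product line name and amounts are invented.
# Assumption: payment_fee is a fraction of total (0.03 means 3%).
sales = pd.DataFrame({
    "date": pd.to_datetime(["2021-06-01", "2021-06-02", "2021-06-03", "2021-07-01"]),
    "warehouse": ["North", "North", "North", "Central"],
    "client_type": ["Wholesale", "Wholesale", "Retail", "Wholesale"],
    "product_line": ["Engine", "Engine", "Engine", "Engine"],
    "total": [100.0, 200.0, 50.0, 400.0],
    "payment_fee": [0.03, 0.0, 0.03, 0.01],
})

# Keep wholesale orders only, then subtract the payment fee from each order
wholesale = sales[sales["client_type"] == "Wholesale"].copy()
wholesale["net"] = wholesale["total"] * (1 - wholesale["payment_fee"])
wholesale["month"] = wholesale["date"].dt.month_name()

# One row per product line, month, and warehouse
net_revenue = (
    wholesale.groupby(["product_line", "month", "warehouse"], as_index=False)["net"]
    .sum()
    .rename(columns={"net": "net_revenue"})
)
print(net_revenue)
```

In this sample, the two North wholesale orders in June contribute 97 and 200, and the retail order is excluded.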

-- Query result is stored in the DataFrame variable `df`
SELECT * FROM sales;

print(df.info())

-- Query result is stored in the DataFrame variable `revenue_by_product_line`
-- Net revenue per product line, month, and warehouse (wholesale orders only)
WITH revenue_by_product AS (
    SELECT
        product_line,
        EXTRACT(MONTH FROM date) AS month_number,
        warehouse,
        -- Assumes payment_fee is stored as a fraction of total (e.g. 0.03 for 3%)
        SUM(total * (1 - payment_fee)) AS net_revenue
    FROM sales
    WHERE client_type = 'Wholesale'
    GROUP BY product_line, EXTRACT(MONTH FROM date), warehouse
)

SELECT
    product_line,
    -- Translate the month number into its name
    CASE
        WHEN month_number = 1 THEN 'January'
        WHEN month_number = 2 THEN 'February'
        WHEN month_number = 3 THEN 'March'
        WHEN month_number = 4 THEN 'April'
        WHEN month_number = 5 THEN 'May'
        WHEN month_number = 6 THEN 'June'
        WHEN month_number = 7 THEN 'July'
        WHEN month_number = 8 THEN 'August'
        WHEN month_number = 9 THEN 'September'
        WHEN month_number = 10 THEN 'October'
        WHEN month_number = 11 THEN 'November'
        WHEN month_number = 12 THEN 'December'
    END AS month,
    warehouse,
    net_revenue
FROM revenue_by_product
-- Keep rows for the same product line together, as in the requested output format
ORDER BY product_line, month_number, net_revenue DESC;
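
For comparison, the month-number-to-name mapping done by the `CASE` expression can also be sketched on the pandas side with the datetime accessor. The dates below are illustrative only, and the sketch assumes the default (English) locale.

```python
import pandas as pd

# Illustrative dates within the June to August 2021 range of the data
dates = pd.Series(pd.to_datetime(["2021-06-15", "2021-07-04", "2021-08-30"]))

# dt.month_name() returns the full month name for each date
month_names = dates.dt.month_name()
print(month_names.tolist())  # ['June', 'July', 'August']
```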

Best Performing Warehouses

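-- Query result is stored in the DataFrame variable `warehouse_revenue`
-- Net revenue per warehouse, across all client types
SELECT
    warehouse,
    -- Assumes payment_fee is stored as a fraction of total (e.g. 0.03 for 3%)
    SUM(total * (1 - payment_fee)) AS net_revenue
FROM sales
GROUP BY warehouse
ORDER BY net_revenue;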

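-- Query result is stored in the DataFrame variable `best_warehouses`
-- Net revenue per unit sold, by month and warehouse
SELECT
    EXTRACT(MONTH FROM date) AS month,
    warehouse,
    -- Assumes payment_fee is stored as a fraction of total (e.g. 0.03 for 3%)
    SUM(total * (1 - payment_fee)) / SUM(quantity) AS net_revenue
FROM sales
-- Group by month rather than by individual date, to match the selected column
GROUP BY EXTRACT(MONTH FROM date), warehouse
ORDER BY net_revenue DESC;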

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
from babel.numbers import format_currency


def chartBestWarehouses(df):
    # Average the per-unit revenue rows for each warehouse; summing
    # per-unit ratios would not give a per-unit figure
    df = df.groupby('warehouse', as_index=False)['net_revenue'].mean()
    x = df['warehouse']
    y = df['net_revenue']

    # Fixed color per warehouse so charts stay comparable
    colors = {
        'Central': 'xkcd:blue',
        'North': 'xkcd:orange',
        'West': 'xkcd:red'
    }

    plt.ylabel(r'Net Revenue per unit sold ($)')
    plt.bar(x, y, color=[colors[warehouse] for warehouse in x])
    # Format the y-axis ticks as US dollars
    plt.gca().yaxis.set_major_formatter(
        FuncFormatter(lambda value, _: format_currency(value, 'USD', locale='en_US'))
    )

    plt.show()
    print(df)


chartBestWarehouses(best_warehouses)

Best Performing Products

def BestPerformingProducts(df):
    # Total quantity and revenue per product line, largest revenue first
    df = df[['product_line', 'quantity', 'total']]
    df = df.groupby('product_line').agg({
        'quantity': 'sum',
        'total': 'sum'
        }).reset_index().sort_values(by='total', ascending=False)

    color = [
        'xkcd:bright blue',
        'xkcd:magenta',
        'xkcd:emerald',
        'xkcd:violet',
        'xkcd:gold',
        'xkcd:aquamarine'
    ]

    x = df['total']
    y = df['product_line']
    plt.title('Total Revenue by Product')
    # Pie chart of each product line's share of revenue
    plt.pie(x, labels=y, autopct='%1.1f%%', startangle=90, colors=color)

    plt.show()
    print(df.to_string(index=False))


BestPerformingProducts(df)

Best Performing Products (Wholesale)

def WholesaleProducts(df):
    # Keep wholesale orders only, then total quantity and revenue per product line
    df = df.query("client_type == 'Wholesale'")
    df = df[['product_line', 'quantity', 'total']].groupby('product_line').agg({
        'quantity': 'sum',
        'total': 'sum'
    }).reset_index().sort_values(by='total', ascending=False)

    color = [
        'xkcd:bright blue',
        'xkcd:magenta',
        'xkcd:emerald',
        'xkcd:violet',
        'xkcd:gold',
        'xkcd:aquamarine'
    ]
    x = df['total']
    y = df['product_line']
    plt.title('Wholesale Revenue Product Composition')

    # Pie chart of each product line's share of wholesale revenue
    plt.pie(x, labels=y, autopct='%1.1f%%', startangle=90, colors=color)

    plt.show()
    print(df.to_string(index=False))


WholesaleProducts(df)

For wholesale clients, the composition of revenue by product line does not differ noticeably from the overall revenue composition.
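
One way to back this comparison with numbers rather than by eye is to compute each product line's share of revenue for both groups and look at the differences. The sketch below uses invented sample rows and a hypothetical helper, `revenue_shares`; it is not part of the original notebook.

```python
import pandas as pd


def revenue_shares(frame):
    """Return each product line's share of the summed `total` (hypothetical helper)."""
    totals = frame.groupby("product_line")["total"].sum()
    return totals / totals.sum()


# Invented sample rows, for illustration only
sample = pd.DataFrame({
    "client_type": ["Wholesale", "Retail", "Wholesale", "Retail"],
    "product_line": ["Engine", "Engine", "Braking system", "Braking system"],
    "total": [600.0, 400.0, 400.0, 600.0],
})

overall = revenue_shares(sample)
wholesale = revenue_shares(sample[sample["client_type"] == "Wholesale"])

# Side-by-side shares and their difference per product line
comparison = pd.DataFrame({"overall": overall, "wholesale": wholesale})
comparison["difference"] = comparison["wholesale"] - comparison["overall"]
print(comparison)
```

Applied to the real `df`, small values in the `difference` column would support the observation above, while large ones would contradict it.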

Revenue Over Time

def WeeklyTotalRevenue(df):
    # Work on a copy so the caller's DataFrame is left unchanged
    df = df[['date', 'total']].copy()
    df['date'] = pd.to_datetime(df['date'])
    df = df.set_index('date')

    # Sum revenue into weekly bins (weeks ending on Sunday)
    df = df.resample('W').sum()

    x = df.index
    y = df['total']

    plt.figure(figsize=(10, 5))
    plt.plot(x, y, marker='o')
    plt.title('Weekly Total Revenue')
    plt.xlabel('Week')
    plt.ylabel(r'Total Revenue in $ (Retail + Wholesale)')

    plt.show()
    print(df)


WeeklyTotalRevenue(df)
