
You're working for a company that sells motorcycle parts, and they've asked for some help in analyzing their sales data!

They operate three warehouses in the area, selling both retail and wholesale. They offer a variety of parts and accept credit cards, cash, and bank transfer as payment methods. However, each payment type incurs a different fee.

The board of directors wants to gain a better understanding of wholesale revenue by product line, and how this varies month-to-month and across warehouses. You have been tasked with calculating net revenue for each product line and grouping results by month and warehouse. The results should be filtered so that only "Wholesale" orders are included.

They have provided you with access to their database, which contains the following table called sales:

Sales

| Column | Data type | Description |
|---|---|---|
| order_number | VARCHAR | Unique order number. |
| date | DATE | Date of the order, from June to August 2021. |
| warehouse | VARCHAR | The warehouse that the order was made from: North, Central, or West. |
| client_type | VARCHAR | Whether the order was Retail or Wholesale. |
| product_line | VARCHAR | Type of product ordered. |
| quantity | INT | Number of products ordered. |
| unit_price | FLOAT | Price per product (dollars). |
| total | FLOAT | Total price of the order (dollars). |
| payment | VARCHAR | Payment method: Credit card, Transfer, or Cash. |
| payment_fee | FLOAT | Percentage of total charged as a result of the payment method. |
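
To try the steps below outside the provided workspace, a table with the same shape is needed. The following is only a sketch reconstructed from the column descriptions above, assuming PostgreSQL. The table name `sales_sketch` and the primary key are assumptions for illustration, not the company's actual schema.

```sql
-- Hypothetical DDL based on the column descriptions; the real table
-- may use different constraints, lengths, or numeric precision.
CREATE TABLE sales_sketch (
    order_number VARCHAR PRIMARY KEY,  -- described as unique (key is an assumption)
    date         DATE,                 -- June to August 2021
    warehouse    VARCHAR,              -- North, Central, or West
    client_type  VARCHAR,              -- Retail or Wholesale
    product_line VARCHAR,
    quantity     INT,
    unit_price   FLOAT,                -- dollars
    total        FLOAT,                -- dollars
    payment      VARCHAR,              -- Credit card, Transfer, or Cash
    payment_fee  FLOAT                 -- fee tied to the payment method
);
```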

Your query output should be presented in the following format:

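| product_line | month | warehouse | net_revenue |
|---|---|---|---|
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_two | --- | --- | --- |
| ... | ... | ... | ... |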

First: Inspect the provided sales table

-- Preview every row and column of the sales table
SELECT *
FROM sales;

Second: Set up the columns needed

-- List the distinct values of product_line
SELECT DISTINCT product_line
FROM sales;
-- List the distinct values of warehouse
SELECT DISTINCT warehouse
FROM sales;
-- Map each month number to its month name
SELECT DISTINCT
    CASE WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
         WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
         WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
    END AS months
FROM sales;
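
The CASE expression above spells out one branch per month. As a possible shorter alternative, assuming the database is PostgreSQL, `TO_CHAR` can format the month name directly. This is a sketch rather than the method used in the rest of this notebook, and `sample_dates` is a made-up stand-in for the date column of sales.

```sql
-- Sketch only: assumes PostgreSQL, where TO_CHAR formats dates.
-- sample_dates holds made-up dates, not rows from the real table.
WITH sample_dates (date) AS (
    VALUES (DATE '2021-06-01'),
           (DATE '2021-07-15'),
           (DATE '2021-08-31')
)
SELECT DISTINCT
    -- 'Month' prints the full month name; the FM prefix strips the padding spaces
    TO_CHAR(date, 'FMMonth') AS months
FROM sample_dates;
```

Unlike the CASE version, this does not need a new branch if the data later covers more months, but it ties the query to a PostgreSQL-style `TO_CHAR`.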
-- Combine the month-name expression with the other columns needed for monthly sales
SELECT 
    product_line,
    warehouse,
    client_type,
    total,
    payment_fee,
    CASE WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
         WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
         WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
    END AS months
FROM sales
WHERE EXTRACT('month' FROM date) IN (6, 7, 8);
-- CTE for monthly_sales
WITH monthly_sales AS (
    SELECT 
        product_line,
        warehouse,
        client_type,
        total,
        payment_fee,
        CASE WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
             WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
             WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
        END as months
    FROM sales
    WHERE EXTRACT('month' FROM date) IN (6, 7, 8)
)

SELECT * FROM monthly_sales;
-- Compute overall net revenue: total sales minus total payment fees
SELECT SUM(total) - SUM(payment_fee) AS net_revenues
FROM sales;
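
The column description says payment_fee is a percentage of total, while the query above subtracts it as if it were a dollar amount. Which reading is intended is worth confirming before reporting numbers. The sketch below compares the two on made-up rows, assuming PostgreSQL and assuming the fee is stored as a fraction such as 0.03 for 3%. Both `sample_sales` and its values are hypothetical.

```sql
-- Made-up rows for illustration only; not taken from the real sales table.
-- Assumption: payment_fee is stored as a fraction of total (0.03 = 3%).
WITH sample_sales (total, payment_fee) AS (
    VALUES (100.00, 0.03),
           (200.00, 0.01),
           ( 50.00, 0.00)
)
SELECT
    -- Reads the fee column as dollars: 350.00 - 0.04 = 349.96
    SUM(total) - SUM(payment_fee) AS fee_as_dollars,
    -- Reads the fee column as a share of each order: 97 + 198 + 50 = 345.00
    ROUND(SUM(total * (1 - payment_fee)), 2) AS fee_as_fraction
FROM sample_sales;
```

The rest of this notebook keeps the `SUM(total) - SUM(payment_fee)` form. If the fraction reading turns out to be the right one, only the net_revenue expression would need to change.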
-- Add a wholesale_summary CTE that applies the net revenue calculation per product line, month, and warehouse
WITH monthly_sales AS (
    SELECT 
        product_line,
        warehouse,
        client_type,
        total,
        payment_fee,
        CASE WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
             WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
             WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
        END as months
    FROM sales
    WHERE EXTRACT('month' FROM date) IN (6, 7, 8)
),
wholesale_summary AS (
    SELECT 
        product_line,
        months,
        warehouse,
        SUM(total) - SUM(payment_fee) AS net_revenue
    FROM monthly_sales
    WHERE client_type = 'Wholesale'
    GROUP BY product_line, months, warehouse
)

SELECT * FROM wholesale_summary;

Final: Combine all columns and show the final output

WITH monthly_sales AS (
    SELECT 
        product_line,
        warehouse,
        client_type,
        total,
        payment_fee,
        CASE WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
             WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
             WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
        END as months
    FROM sales
    WHERE EXTRACT('month' FROM date) IN (6, 7, 8)
),

wholesale_summary AS (
    SELECT 
        product_line,
        months,
        warehouse,
        SUM(total) - SUM(payment_fee) AS net_revenue
    FROM monthly_sales
    WHERE client_type = 'Wholesale'
    GROUP BY product_line, months, warehouse
)

SELECT 
    product_line,
    months as month,
    warehouse,
    net_revenue
FROM wholesale_summary
ORDER BY product_line, months, net_revenue DESC;
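
The two CTEs make each step easy to inspect, but the same result can likely be produced in a single query by filtering and grouping directly. The following is a sketch under stated assumptions: it assumes PostgreSQL, which allows grouping by an output alias, and it runs on a made-up `sample_sales` CTE so that it is self-contained. Swap `sample_sales` for `sales` to run it on the real data.

```sql
-- Made-up rows for illustration; replace sample_sales with sales for real data.
WITH sample_sales (product_line, date, warehouse, client_type, total, payment_fee) AS (
    VALUES
        ('product_one', DATE '2021-06-03', 'North',   'Wholesale', 1000.00, 0.00),
        ('product_one', DATE '2021-06-20', 'North',   'Retail',     120.00, 0.03),
        ('product_one', DATE '2021-07-11', 'Central', 'Wholesale',  800.00, 0.01),
        ('product_two', DATE '2021-08-05', 'West',    'Wholesale',  500.00, 0.00)
)
SELECT
    product_line,
    CASE WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
         WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
         WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
    END AS month,
    warehouse,
    SUM(total) - SUM(payment_fee) AS net_revenue
FROM sample_sales
WHERE client_type = 'Wholesale'          -- the Retail row is excluded here
GROUP BY product_line, month, warehouse  -- grouping by the output alias (PostgreSQL)
ORDER BY product_line, month, net_revenue DESC;
```

In both versions, ordering by the month name sorts alphabetically, so August comes before July and June. If calendar order is wanted, sorting on the month number instead would be one option.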