You're working for a company that sells motorcycle parts, and they've asked for some help in analyzing their sales data!

They operate three warehouses in the area, selling both retail and wholesale. They offer a variety of parts and accept credit cards, cash, and bank transfers as payment methods. However, each payment type incurs a different fee.

The board of directors wants to gain a better understanding of wholesale revenue by product line, and how this varies month-to-month and across warehouses. You have been tasked with calculating net revenue for each product line and grouping results by month and warehouse. The results should be filtered so that only "Wholesale" orders are included.

They have provided you with access to their database, which contains the following table called sales:

Sales

| Column | Data type | Description |
|---|---|---|
| order_number | VARCHAR | Unique order number. |
| date | DATE | Date of the order, from June to August 2021. |
| warehouse | VARCHAR | The warehouse that the order was made from: North, Central, or West. |
| client_type | VARCHAR | Whether the order was Retail or Wholesale. |
| product_line | VARCHAR | Type of product ordered. |
| quantity | INT | Number of products ordered. |
| unit_price | FLOAT | Price per product (dollars). |
| total | FLOAT | Total price of the order (dollars). |
| payment | VARCHAR | Payment method: Credit card, Transfer, or Cash. |
| payment_fee | FLOAT | Percentage of total charged as a result of the payment method. |

Your query output should be presented in the following format:

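| product_line | month | warehouse | net_revenue |
|---|---|---|---|
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_one | --- | --- | --- |
| product_two | --- | --- | --- |
| ... | ... | ... | ... |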
Output: DataFrame stored as variable revenue_by_product_line
SELECT product_line,
    CASE WHEN EXTRACT('month' from date) = 6 THEN 'June'
        WHEN EXTRACT('month' from date) = 7 THEN 'July'
        WHEN EXTRACT('month' from date) = 8 THEN 'August'
    END as month,
    warehouse,
    SUM(total) - SUM(payment_fee) AS net_revenue
FROM sales
WHERE client_type = 'Wholesale'
GROUP BY product_line, warehouse, month
ORDER BY product_line, month, net_revenue DESC;
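
To make the grouping logic concrete, here is a minimal Python sketch that mirrors the query above on a handful of rows. The sample rows and their values are invented for illustration (they are not taken from the sales table), and the helper name `net_revenue_by_group` is hypothetical. It applies the same formula as the query: the sum of `total` minus the sum of `payment_fee`, for Wholesale orders only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; all values are invented for illustration only.
# Layout: (date, warehouse, client_type, product_line, total, payment_fee)
rows = [
    (date(2021, 6, 1), "North", "Wholesale", "Engine", 1000.0, 0.01),
    (date(2021, 6, 15), "North", "Wholesale", "Engine", 500.0, 0.01),
    (date(2021, 6, 20), "North", "Retail", "Engine", 80.0, 0.03),
    (date(2021, 7, 2), "Central", "Wholesale", "Engine", 750.0, 0.01),
]

# Same month labels as the CASE expression in the query.
MONTH_NAMES = {6: "June", 7: "July", 8: "August"}


def net_revenue_by_group(rows):
    """Group Wholesale rows by (product_line, month, warehouse) and
    return SUM(total) - SUM(payment_fee) per group, as the query does."""
    totals = defaultdict(float)
    fees = defaultdict(float)
    for order_date, warehouse, client_type, product_line, total, fee in rows:
        if client_type != "Wholesale":  # WHERE client_type = 'Wholesale'
            continue
        key = (product_line, MONTH_NAMES[order_date.month], warehouse)
        totals[key] += total
        fees[key] += fee
    return {key: totals[key] - fees[key] for key in totals}


result = net_revenue_by_group(rows)
for key in sorted(result):
    print(key, round(result[key], 2))
```

The Retail row is dropped by the filter, so only two groups remain: one for North in June and one for Central in July.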

Extended Project below

The finance team is exploring ways to reduce transaction costs and improve profitability. They’ve asked you to determine the most profitable payment method for each warehouse in each month. Calculate the net revenue for each payment method, grouped by warehouse and month, and identify the top payment method for each combination.

Output: DataFrame stored as variable df
WITH MonthlyPaymentNetRevenue AS (
    -- 1. Calculate net revenue per payment method, warehouse, and month
    SELECT
        warehouse,
        EXTRACT(MONTH FROM date) AS month_num,
        payment AS payment_method,  -- the payment method is stored in the 'payment' column
        SUM(total) - SUM(payment_fee) AS net_revenue
    FROM
        sales
    GROUP BY
        1, 2, 3
),
RankedPayments AS (
    -- 2. Rank the payment methods based on net revenue within each warehouse and month
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY warehouse, month_num
            ORDER BY net_revenue DESC
        ) AS payment_rank
    FROM
        MonthlyPaymentNetRevenue
)
-- 3. Select only the top-ranked payment method (rank 1) for each group
SELECT
    warehouse,
    CASE month_num
        WHEN 1 THEN 'January'
        WHEN 2 THEN 'February'
        WHEN 3 THEN 'March'
        WHEN 4 THEN 'April'
        WHEN 5 THEN 'May'
        WHEN 6 THEN 'June'
        WHEN 7 THEN 'July'
        WHEN 8 THEN 'August'
        WHEN 9 THEN 'September'
        WHEN 10 THEN 'October'
        WHEN 11 THEN 'November'
        WHEN 12 THEN 'December'
    END AS month_name,
    payment_method AS top_payment_method,
    net_revenue
FROM
    RankedPayments
WHERE
    payment_rank = 1
ORDER BY
    warehouse,
    month_num;
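
One detail worth illustrating: `ROW_NUMBER()` ordered only by `net_revenue DESC` does not say which row wins when two payment methods tie within a warehouse and month. The Python sketch below emulates the "top row per group" step and adds the payment method name as a tie-breaker. That tie-breaker is an assumption added here for determinism, not part of the query above, and the figures and the helper name `top_payment_methods` are hypothetical.

```python
# Hypothetical net revenue, already aggregated per
# (warehouse, month number, payment method). Values are invented.
net_revenue = {
    ("North", 6, "Transfer"): 5000.0,
    ("North", 6, "Credit card"): 1200.0,
    ("North", 6, "Cash"): 300.0,
    ("West", 6, "Transfer"): 900.0,
    ("West", 6, "Credit card"): 900.0,  # deliberate tie with Transfer
}


def top_payment_methods(net_revenue):
    """Return the best (payment, revenue) per (warehouse, month).

    Rows are ordered by revenue descending, then by payment name
    (assumed tie-breaker), and the first row seen per group is kept,
    which mimics filtering on ROW_NUMBER() = 1.
    """
    ordered = sorted(
        net_revenue.items(),
        key=lambda item: (-item[1], item[0][2]),
    )
    best = {}
    for (warehouse, month, payment), revenue in ordered:
        best.setdefault((warehouse, month), (payment, revenue))
    return best


top = top_payment_methods(net_revenue)
for group in sorted(top):
    print(group, top[group])
```

In SQL, the equivalent change would be to extend the window's `ORDER BY` with a second sort key such as `payment_method`.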

The marketing team is planning a targeted campaign and wants to know the most popular product lines for retail and wholesale customers.

They have asked you to find the three most-ordered product lines for each client type.

Output: DataFrame stored as variable df1
WITH ProductLineOrders AS (
    -- 1. Count the total number of orders/transactions for each product line and client type
    SELECT
        client_type,
        product_line,
        COUNT(*) AS total_orders
    FROM
        sales
    WHERE
        client_type IN ('Retail', 'Wholesale') -- Assuming these are the only two client types of interest
    GROUP BY
        1, 2
),
RankedProductLines AS (
    -- 2. Rank the product lines based on order count within each client type
    SELECT
        *,
        RANK() OVER (
            PARTITION BY client_type
            ORDER BY total_orders DESC
        ) AS line_rank
    FROM
        ProductLineOrders
)
-- 3. Select only the top 3 ranked product lines for each client type
SELECT
    client_type,
    product_line,
    total_orders
FROM
    RankedProductLines
WHERE
    line_rank <= 3
ORDER BY
    client_type,
    total_orders DESC;
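
Because the query uses `RANK()`, product lines with equal order counts share a rank, so the filter `line_rank <= 3` can return more than three rows for a client type. The Python sketch below emulates `RANK()` to show this; the product line names, the counts, and the helper name `rank_desc` are hypothetical.

```python
def rank_desc(counts):
    """Emulate SQL RANK() OVER (ORDER BY count DESC).

    Tied counts share the same rank, and the next distinct count
    takes its position in the ordering, which leaves gaps.
    """
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    ranks = {}
    previous_count = None
    current_rank = 0
    for position, (name, count) in enumerate(ordered, start=1):
        if count != previous_count:
            current_rank = position
            previous_count = count
        ranks[name] = current_rank
    return ranks


# Hypothetical order counts for one client type; values are invented.
order_counts = {
    "Line A": 40,
    "Line B": 35,
    "Line C": 30,
    "Line D": 30,  # ties with Line C
    "Line E": 10,
}

ranks = rank_desc(order_counts)
top3 = [name for name, rank in ranks.items() if rank <= 3]
print(ranks)
print(top3)
```

Here Line C and Line D both receive rank 3, so four lines pass the filter and Line E gets rank 5. If exactly three rows per client type are required, `ROW_NUMBER()` with an explicit tie-breaker is one alternative.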