Motoparts
Motoparts is an imaginary company that sells motorcycle parts, and they've asked for some help in analysing their sales data!
They operate three warehouses in Australia, selling both retail and wholesale. They offer a variety of parts and accept credit cards, cash, and bank transfer as payment methods. However, each payment type incurs a different fee.
The board of directors wants to gain a better understanding of wholesale revenue by product line, and how this varies month-to-month and across warehouses. I have been tasked with calculating net revenue for each product line and grouping results by month and warehouse. The results should be filtered so that only "Wholesale"
orders are included.
They have provided me with access to their database, which contains the following table called sales
:
Schema
Column | Data type | Description |
---|---|---|
order_number | VARCHAR | Unique order number. |
date | DATE | Date of the order, from June to August 2021. |
warehouse | VARCHAR | The warehouse that the order was made from— North , Central , or West . |
client_type | VARCHAR | Whether the order was Retail or Wholesale . |
product_line | VARCHAR | Type of product ordered. |
quantity | INT | Number of products ordered. |
unit_price | FLOAT | Price per product (dollars). |
total | FLOAT | Total price of the order (dollars). |
payment | VARCHAR | Payment method—Credit card , Transfer , or Cash . |
payment_fee | FLOAT | Percentage of total charged as a result of the payment method. |
Requirements
They have requested the query output be presented in the following format:
product_line | month | warehouse | net_revenue |
---|---|---|---|
product_one | --- | --- | --- |
product_one | --- | --- | --- |
product_one | --- | --- | --- |
product_one | --- | --- | --- |
product_one | --- | --- | --- |
product_one | --- | --- | --- |
product_two | --- | --- | --- |
... | ... | ... | ... |
Solution
Let's get started.
After assessing the schema for the sales table, and considering the expected output for this query, it is clear that we are only interested in the following columns:
- product_line (unchanged)
- date (converted to mmmm) AS month
- warehouse (unchanged)
- total & payment fee (SUM(total) - SUM(payment_fee)) AS net_revenue
We are only interested in records where:
- client_type is "Wholesale"
The output then needs to be ordered by product_line, month and net_revenue.
Understanding the Data
Lets start simple and run an ordered select all
SELECT * FROM public.sales
ORDER BY product_line, date, total DESC
Interesting...
Some observations on the selected data:
- There are exactly 1000 rows.
- order_number appears to be comprised of the first letter of warehouse followed by a unique id number.
- date appears to be stored as datetime
- The "Credit card" payment method incurs a 3c fee, "Transfer" incurs 1c and "Cash" incurs no fee.
Cleaning
lets cull the records and columns we don't need and work on our custom columns, month and net_revenue, before we start grouping.
SELECT
product_line,
to_char(date, 'Month') AS month,
warehouse,
total - payment_fee AS net_revenue
FROM public.sales
WHERE client_type = 'Wholesale'
ORDER BY product_line, "month", "net_revenue" DESC
Great!
We have the desired records and columns using the requested schema, ordered in the correct sequence.
Final Output: Grouping & Aggregating
Time to GROUP BY and aggregate our net_revenue.
Our output should then be ready to submit to the board of directors.
SELECT
product_line,
to_char(date, 'Month') AS month,
warehouse,
SUM(total) - SUM(payment_fee) AS net_revenue
FROM public.sales
WHERE client_type = 'Wholesale'
GROUP BY product_line, "month", warehouse
ORDER BY product_line, "month", "net_revenue" DESC
Concluding Notes
We now have an indication of the wholesale product line performance by month and warehouse in the requested format, however I have some concerns regarding the readability and usefulness of this output.
It is quite difficult to compare the performance of product lines against eachother using a common dimension such as month or warehouse.
For example; has a particular product line improved or weakened over time? Are there product lines that excel in some warehouses but not others?
Having understood the objectives of the board of directors, it may be worth going above and beyond their original request to visualise wholesale product line performance. These visualisations will provide greater context and supplement their strategic analysis.