Skip to content

Bicycle Sales Database

This workspace is automatically connected to a Microsoft SQL Server database containing two schemas (sales and production), which in turn have various details about the sales and inventory for bike stores (source). I will use this workspace to demonstrate my ability to use SQL. The workspace will be broken up into two parts 1) general exploration and 2) ad hoc reporting simulation.

The first section I will use SQL just to explore the data and pull general numbers and stats from the data. The second part I will have a specific question that a stakeholder may ask, and will provide a query to answer the question.

The Schema layout is shown below

Data Exploration

Looking at the states the stores are located in and the number of orders and total quantity of each state.

Spinner
DataFrameas
df
variable
SELECT s.state
        ,COUNT(DISTINCT s.store_id) AS Stores
        ,COUNT(DISTINCT o.order_id) AS Orders
        ,SUM(i.quantity) AS Quantity
    FROM Sales.stores AS s
    LEFT JOIN Sales.orders AS o
        ON o.store_id = s.store_id
    LEFT JOIN Sales.order_items AS i
        ON i.order_id = o.order_id
    GROUP BY s.state
Spinner
DataFrameas
df
variable
SELECT b.brand_name
        ,SUM(s.quantity) AS TotalStock
        ,SUM(s.quantity * list_price) AS StockValue
    FROM Production.brands AS b
    LEFT JOIN Production.products AS p
        ON p.brand_id = b.brand_id
    LEFT JOIN Production.stocks AS s
        ON s.product_id = p.product_id
    GROUP BY b.brand_name

Ad Hoc Simulation

In this section, I will post a simulated request from a stakeholder, followed by a SQL query to answer that question.

Top Customers

"We're looking at sending out a discount to our top customers as a 'Thank you', and would like to tailor it to them, can you get me a list of our top 10 customers by sales and the brand they use the most?"

Spinner
DataFrameas
df
variable
--Getting the top 10 customers by sales
SELECT TOP 10 c.customer_id
        ,SUM(i.quantity * i.list_price) AS TotalSales
        ,ROW_NUMBER() OVER (ORDER BY SUM(i.quantity * i.list_price) DESC) AS CustRank
    INTO #TopCust
    FROM Sales.customers AS c
    LEFT JOIN Sales.orders AS o
        ON o.customer_id = c.customer_id
    LEFT JOIN Sales.order_items AS i
        ON i.order_id = o.order_id
    GROUP BY c.customer_id
    ORDER BY TotalSales DESC

--Getting the Brands they buy
SELECT t.CustRank
        ,c.customer_id
        ,CONCAT(c.first_name, ' ', c.last_name) AS CustomerName
        ,c.email
        ,b.brand_name
        ,SUM(i.quantity * i.list_price) AS TotalSales
    FROM Sales.customers AS c
    INNER JOIN #TopCust AS t
        ON t.customer_id = c.customer_id
    LEFT JOIN Sales.orders AS o
        ON o.customer_id = c.customer_id
    LEFT JOIN Sales.order_items AS i
        ON i.order_id = o.order_id
    LEFT JOIN Production.products AS p
        ON p.product_id = i.product_id
    LEFT JOIN Production.brands AS b
        ON b.brand_id = p.brand_id
    GROUP BY t.CustRank
        ,c.customer_id
        ,CONCAT(c.first_name, ' ', c.last_name)
        ,c.email
        ,b.brand_name
    ORDER BY CustRank ASC, TotalSales DESC 
        
DROP TABLE #TopCust 

--The resulting table shows the top 10 customers and the brands they buy ordered by total sales

Sufficient Inventory

"We want to make sure our stores are stocked with at least 3 months supply. Please get us a list for each store of their current inventory, the inventory they should have and the difference as a number and percentage, the current monetary value of the stock and estimated cost to reach the needed inventory. Please include brands and product names."

(For this example I will use list_price as both list_price and purchase_price)

Spinner
DataFrameas
df
variable
--Necessary stock

SELECT e.store_id
        ,e.product_id
        ,CEILING(AVG(e.monthly_qty)) AS avg_monthly_qty
        ,CEILING(AVG(e.monthly_qty)) * 3 AS three_month_qty
    INTO #Three_Month_Supply
    FROM (
        --Getting the qty of each product sold in each month in each store
        SELECT s.store_id
                ,DATEADD(DAY, -(DAY(order_date) - 1), order_date) AS order_month
                ,i.product_id
                ,SUM(i.quantity) AS monthly_qty
            FROM Sales.stores AS s
            LEFT JOIN Sales.orders AS o
                ON o.store_id = s.store_id
            LEFT JOIN Sales.order_items AS i
                ON i.order_id = o.order_id
            GROUP BY s.store_id
                    ,i.product_id
                    ,DATEADD(DAY, -(DAY(order_date) - 1), order_date)
        ) AS e
    GROUP BY e.store_id
            ,e.product_id

--Current and Needed Stock

SELECT s.store_id
        ,t.product_id
        ,p.product_name
        ,b.brand_name
        ,t.quantity
        ,m.three_month_qty
        ,CASE WHEN m.three_month_qty - t.quantity > 0 THEN m.three_month_qty - t.quantity
            ELSE 0 END AS stock_needed
        ,t.quantity * p.list_price AS current_value
        ,(CASE WHEN m.three_month_qty - t.quantity > 0 THEN m.three_month_qty - t.quantity ELSE 0 END) * p.list_price AS supply_cost
    FROM Sales.stores AS s
    LEFT JOIN Production.stocks AS t
        ON t.store_id = s.store_id
    LEFT JOIN #Three_Month_Supply AS m
        ON m.store_id = s.store_id
            AND m.product_id = t.product_id
    LEFT JOIN Production.products AS p
        ON p.product_id = t.product_id
    LEFT JOIN Production.brands AS b
        ON b.brand_id = p.brand_id
    ORDER BY store_id
            ,product_id

DROP TABLE #Three_Month_Supply

Past Sales

"I will need a report showing sales of each product by a specific brand for each store in a selected month and the preceeding 6 months. I will need this report for one specific brand at a time"

Spinner
DataFrameas
df
variable
--Setting variable
DECLARE @brand VARCHAR(10);
DECLARE @date DATE; --First day of the month
SET @brand = 'Trek'
SET @date = '2017-11-01'

SELECT s.store_id
        ,s.store_name
        ,b.brand_name
        ,p.product_id
        ,p.product_name
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = DATEADD(MONTH, -6, @date) THEN i.quantity * i.list_price ELSE 0 END) AS 'Six Months Previous'
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = DATEADD(MONTH, -5, @date) THEN i.quantity * i.list_price ELSE 0 END) AS 'Five Months Previous'
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = DATEADD(MONTH, -4, @date) THEN i.quantity * i.list_price ELSE 0 END) AS 'Four Months Previous'
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = DATEADD(MONTH, -3, @date) THEN i.quantity * i.list_price ELSE 0 END) AS 'Three Months Previous'
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = DATEADD(MONTH, -2, @date) THEN i.quantity * i.list_price ELSE 0 END) AS 'Two Months Previous'
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = DATEADD(MONTH, -1, @date) THEN i.quantity * i.list_price ELSE 0 END) AS 'Previous Month'
        ,SUM(CASE WHEN DATEADD(DAY, -(DAY(order_date) - 1), order_date) = @date THEN i.quantity * i.list_price ELSE 0 END) AS 'Selected Month'
    FROM Sales.stores AS s
    LEFT JOIN Sales.orders AS o
        ON o.store_id = s.store_id
    LEFT JOIN Sales.order_items AS i
        ON i.order_id = o.order_id
    LEFT JOIN Production.products AS p
        ON p.product_id = i.product_id
    LEFT JOIN Production.brands AS b
        ON b.brand_id = p.brand_id
    WHERE b.brand_name = @brand
        AND o.order_date >= DATEADD(MONTH, -6, @date)
        AND o.order_date < DATEADD(MONTH, 1, @date)
    GROUP BY s.store_id
        ,s.store_name
        ,b.brand_name
        ,p.product_id
        ,p.product_name
    ORDER BY store_id
        ,product_id