Investigate Coffee Shop Sales
This sample dataset contains sales and transaction data from January through June. I downloaded the dataset from Kaggle, and utilized SQL to answer the following questions about the dataset:
- Do sales change by month?
- What product categories are people buying? Which are the most profitable?
- What product categories are being bought at what times throughout the day?
- Do sales change by the day of the week?
- How is each of the three company store locations performing?
- What are the 10 best-selling products?
- Rank the top selling products in each of the product categories.
Let's see what we can find out.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sales = pd.read_excel('coffee_shop_original.xlsx')SELECT *
FROM sales1. Do sales change by month?
The query below groups the number of transactions per month, as well as the total dollar sales per month.
SELECT
EXTRACT(month FROM transaction_date) AS month,
COUNT(transaction_id) AS transactions,
ROUND(SUM(transaction_qty * unit_price),0) AS sales
FROM sales
GROUP BY month
ORDER BY month2. What product categories are people buying, and which are the most profitable?
The query below groups the number of transactions per product category, as well as total dollars sales by product category, then calculates dollars revenue per transaction.
SELECT
product_category,
COUNT(transaction_id) AS transactions,
ROUND(SUM(transaction_qty * unit_price),0) AS sales,
ROUND(SUM(transaction_qty * unit_price) / COUNT(transaction_id), 2) AS dollars_per_transaction
FROM sales
GROUP BY product_category
ORDER BY sales DESC3. What product categories are people buying and at what times are they buying them?
The query below is similar to the above query, however it takes that one step further. This query identifies total dollar sales per product category, per hour of the business day.
SELECT
EXTRACT(hour FROM CAST(transaction_time AS TIME)) AS hour,
product_category,
COUNT(transaction_id) AS transactions
FROM sales
GROUP BY hour, product_category
ORDER BY transactions DESC4. What days are people buying?
The query below groups transaction counts by the day of week, and also sum of dollar sales by day of week.
SELECT
EXTRACT(DOW FROM transaction_date) AS day_of_week,
COUNT(transaction_id) AS count,
ROUND(SUM(transaction_qty * unit_price),0) AS sales
FROM sales
GROUP BY EXTRACT(DOW FROM transaction_date)
ORDER BY day_of_week5. How is each location performing?
The query below groups transaction count by store location, as well as total dollar sales by store location.
SELECT
store_location,
COUNT(transaction_id) AS transactions,
ROUND(SUM(transaction_qty * unit_price),0) AS sales
FROM sales
GROUP BY store_location
ORDER BY sales DESC6. What are the top 10 best-selling products?
The query below provides a count of the top 10 selling products, in descending order. There is also a 'rank' column, produced by a window function.
SELECT
product_type,
COUNT(product_type) AS count,
RANK() OVER (ORDER BY count DESC) AS rank
FROM sales
GROUP BY product_type
LIMIT 10