SQL Sales Data Exploration
As a business analyst for a fictional online retail company, use the customers, products, and sales data to find the following information for company stakeholders using SQL query techniques. Assume today is the beginning of Q3 in 2024 or July 01, 2024.
- Find the sales transactions made in the last 30 days including customer name, product name, and total amount of the sale. Also find the total number of transactions, number of distinct customers, and total revenue during this period.
- Find the total revenue generated by each product category in 2023 from least to greatest.
- Find all customers who made purchases in 2023 and who are located in the West region so the marketing team can send them coupons for next quarter.
- Display the total number of transactions, total product quantity sold, and total revenue for each customer, in decreasing order by total revenue.
- Find the top 3 customers by total revenue in the year 2024 so they can receive Most Valuable Customer bonus products.
- Rank products by their total sales quantity in 2024 to identify the most popular items sold so far this year.
- Customers have the option of joining the Loyalty Program at any time. Categorize customers into "New", if their sign-up date for the Loyalty program is in the last 6 months or "Existing" otherwise. Where are most new loyalty program members located?
- Find the total transactions for each month in the last 12 months (Q3 2023 through Q2 2024).
- Identify the product categories that generated more than $4000 in revenue during the last 6 months (Q1 & Q2 of 2024).
- Check for any sales where the total amount in the sales data doesn't match the expected value of the sale using quantity x price.
Original datasets "customers", "products", and "sales" were created in Google Sheets for use in this project.
- "customers" data includes: customer_id, customer_name, sales_region, sign_up_date
- "products" data includes: product_id, product_name, category, price
- "sales" data includes: sales_id, customer_id, product_id, sale_date, quantity, total_amount
Question 1
Find the sales transactions made in the last 30 days including customer name, product name, and total amount of the sale. Also find the total number of transactions, number of distinct customers, and total revenue during this period.
Answer
There were 55 sales in the last 30 days to 32 distinct customers for a total revenue of $3320.
SELECT c.customer_name, p.product_name, s.total_amount
FROM 'sales' as s
INNER JOIN 'customers' as c ON s.customer_id = c.customer_id
INNER JOIN 'products' as p ON s.product_id = p.product_id
WHERE DATEDIFF('day', CAST(s.sale_date AS DATE), CAST('2024-07-01' AS DATE) ) BETWEEN 0 and 31SELECT COUNT(*) as total_transactions,
COUNT(DISTINCT customer_name) as distinct_cust,
SUM(total_amount) as total_revenue
FROM (SELECT c.customer_name, p.product_name, s.total_amount
FROM 'sales' as s
INNER JOIN 'customers' as c ON s.customer_id = c.customer_id
INNER JOIN 'products' as p ON s.product_id = p.product_id
WHERE DATEDIFF('day', CAST(s.sale_date AS DATE), CAST('2024-07-01' AS DATE) ) BETWEEN 0 and 31)Question 2
Find the total revenue generated by each product category in 2023 from least to greatest.
Answer
In the last year, Tech products generated $1346, Clothing products generated $2573, Toy products generated $2708, and Food products generated $3016. Tech products are significantly underperforming the other categories.
SELECT p.category, SUM(s.total_amount) as total_revenue
FROM 'sales' as s
INNER JOIN 'products' as p ON s.product_id = p.product_id
WHERE CAST(s.sale_date AS DATE) BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY p.category
ORDER BY SUM(s.total_amount) Question 3
Find all customers who made purchases in 2023 and who are located in the West region, so the marketing team can send them coupons for next quarter.
Answer In the West region, 11 distinct customers made purchases in 2023.
SELECT DISTINCT customer_id, customer_name
FROM 'customers'
INNER JOIN
(SELECT customer_id, sale_date
FROM 'sales'
WHERE YEAR(CAST(sale_date AS DATE))= 2023) USING (customer_id)
WHERE region = 'West'
ORDER BY customer_idQuestion 4
Display the total number of transactions, total product quantity sold, and total revenue for each customer for 2023 & 2024, in decreasing order by total revenue.
Answer
Evageline Ellis is the customer with the highest total revenue of $919 from buying 40 products in 14 separate transactions.
SELECT customer_name, COUNT(id) as total_transactions, SUM(quantity) as total_prod, SUM(total_amount) as total_revenue
FROM sales
LEFT JOIN customers USING (customer_id)
GROUP BY customer_name
ORDER BY SUM(total_amount) DESCNote It is important to note that the last result is a customer entry that is missing the customer's name. Find the customer id and transaction information for this customer.
Answer
Customer with id number 51 is missing their full name in the customers table. They made one transaction, a sale of 1 Product 26 for $22. Upon closer inspection, since there are only 50 customers in the customers dataset, this appears to be a typo and the Customer ID of 51 is incorrect. We would need to further investigate which customer made this purchase and correct the mistake in the dataset.
SELECT customer_name, * FROM sales
FULL JOIN customers USING (customer_id)
WHERE customer_name IS NULLQuestion 5
Find the top 3 customers by total revenue in the first half of 2024 so they can receive Most Valuable Customer bonus products.
Answer Since the sales data only contains transactions through the end of Quarter 2 in 2024, we can filter for just the year 2024 when identifying the top 3 customers.
Customers with id numbers 10, 28, and 25 generated the most total revenue in the year 2024.
SELECT customer_id, SUM(total_amount) as total_revenue
FROM sales
WHERE YEAR(CAST(sale_date as DATE)) = 2024
GROUP BY customer_id
ORDER BY SUM(total_amount) DESC
LIMIT 3Question 6
Rank products by their total sales quantity in 2024 to identify the most popular items sold so far this year.
Answer Product A is the most popular item in 2024, by the end of Q2. It sold 50 times.