Sales Data Analysis - motorcycle parts
# Importing the pandas module
import pandas as pd
# Reading in the sales data
df = pd.read_csv('data/sales_data.csv', parse_dates=['date'])
# Take a look at the first datapoints
df.head()
💾 The data
The sales data has the following fields:
- "date" - The date, from June to August 2021.
- "warehouse" - The company operates three warehouses: North, Central, and West.
- "client_type" - There are two types of customers: Retail and Wholesale.
- "product_line" - Type of products purchased.
- "quantity" - How many items were purchased.
- "unit_price" - Price per item sold.
- "total" - Total sale = quantity * unit_price.
- "payment" - How the client paid: Cash, Credit card, Transfer.
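Because "total" is described as quantity * unit_price, a quick consistency check can confirm the column before it is aggregated. The sketch below is only an illustration: the sample rows are made up (they are not taken from the sales data), `find_total_mismatches` is a hypothetical helper, and the 0.01 tolerance is an assumption meant to absorb rounding.

```python
import pandas as pd


def find_total_mismatches(frame, tolerance=0.01):
    """Return rows where total differs from quantity * unit_price by more than tolerance."""
    expected = frame["quantity"] * frame["unit_price"]
    return frame[(frame["total"] - expected).abs() > tolerance]


# Illustrative rows only (not from data/sales_data.csv)
sample = pd.DataFrame({
    "quantity": [2, 5, 3],
    "unit_price": [10.50, 4.00, 7.25],
    "total": [21.00, 20.00, 25.00],  # last row is deliberately wrong (3 * 7.25 = 21.75)
})

mismatches = find_total_mismatches(sample)
print(mismatches)
```

On the real data, an empty result from `find_total_mismatches(df)` would suggest the "total" column is consistent with the other two fields.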
💪 Challenge
Create a report to answer your colleague's questions. Include:
- What are the total sales for each payment method?
- What is the average unit price for each product line?
- Create plots to visualize findings for questions 1 and 2.
- [Optional] Investigate further (e.g., average purchase value by client type, total purchase value by product line, etc.)
- Summarize your findings.
** SALES DATA ANALYSIS REPORT **
- Total sales for each payment method:
Cash 19199.10
Credit card 110271.57
Transfer 159642.33
Transfer accounts for the highest total sales, followed by Credit card and then Cash.
- The average unit price is highest for engine parts and lowest for the braking system.
- The suspension & traction product line has the highest total purchase value.
- Wholesale clients purchase more than Retail clients.
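To put the payment totals in proportion, each method's share of overall sales can be computed directly from the three figures in the report. This is a minimal sketch that re-enters those totals by hand; the names `total_sales` and `share_pct` are introduced here for illustration only.

```python
import pandas as pd

# Totals copied from the report above
total_sales = pd.Series(
    {"Cash": 19199.10, "Credit card": 110271.57, "Transfer": 159642.33},
    name="total",
)

# Each payment method's share of overall sales, in percent
share_pct = (total_sales / total_sales.sum() * 100).round(1)
print(share_pct.sort_values(ascending=False))
```

In the notebook itself, the same calculation could be applied to the `Total_sales` series produced by the groupby instead of re-entering the numbers.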
# Determine total sales for each payment method.
Total_sales = df.groupby('payment')['total'].sum()
Total_sales
# What is the average unit price for each product line?
Avg_unit_price = df.groupby('product_line')['unit_price'].mean()
Avg_unit_price
# Plot total sales for each payment method
import matplotlib.pyplot as plt
Total_sales.plot(kind='barh')
plt.show()
# Plot average unit price for each product line
Avg_unit_price.plot(kind='barh')
plt.show()
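The charts above are drawn without titles or axis labels, which makes them harder to read in a report. The following is a hedged sketch of one way to label such a chart. It re-enters the payment totals from the report so that it runs on its own, and the title and label wording are assumptions rather than anything specified by the challenge.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Totals copied from the report (re-entered so the example is self-contained)
total_sales = pd.Series(
    {"Cash": 19199.10, "Credit card": 110271.57, "Transfer": 159642.33}
)

fig, ax = plt.subplots(figsize=(6, 3))
# Sort so the largest bar ends up at the top of the horizontal chart
total_sales.sort_values().plot(kind="barh", ax=ax)
ax.set_title("Total sales by payment method")
ax.set_xlabel("Total sales")
ax.set_ylabel("Payment method")
fig.tight_layout()
plt.show()
```

The same pattern should work for the average unit price chart by passing `ax=ax` to `Avg_unit_price.plot(...)` and adjusting the labels.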
# Determine average purchase value by client type
# (mean of 'total' per sale, rather than the summed 'quantity')
Avg_pur_value = df.groupby('client_type')['total'].mean()
Avg_pur_value
# Plot average purchase value by client type
Avg_pur_value.plot(kind='barh')
plt.show()
# Determine total purchase value by product line
Total_client_pur = df.groupby('product_line')['total'].sum().sort_values(ascending=True)
Total_client_pur
# Plot total purchase value by product line
Total_client_pur.plot(kind='barh')
plt.show()