import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('data/sales_data.csv')

total_sales_by_payment_method = df.groupby('payment')['total'].sum()
average_unit_price_eachprod = df.groupby('product_line')['unit_price'].mean()
average_purchase_value_by_cliente = df.groupby('client_type')['total'].mean()
total_purchase_value_by_product_line = df.groupby('product_line')['total'].sum()
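# Sum (not max) so each warehouse is ranked by its total sales rather than its single largest sale.
warehouse_with_more_sales = df.groupby('warehouse')['total'].sum().sort_values(ascending=False)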
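# Total sales per (warehouse, product_line) pair, then the best- and worst-selling line in each warehouse.
sales_by_warehouse_product = df.groupby(['warehouse', 'product_line'], as_index=False)['total'].sum()
top_product_by_warehouse = sales_by_warehouse_product.loc[sales_by_warehouse_product.groupby('warehouse')['total'].idxmax()]
min_sales_product_by_warehouse = sales_by_warehouse_product.loc[sales_by_warehouse_product.groupby('warehouse')['total'].idxmin()]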

Summarize your findings.

What are the total sales for each payment method?

total_sales_by_payment_method.to_frame()

What is the average unit price for each product line?

average_unit_price_eachprod.to_frame()

Create plots to visualize findings for questions 1 and 2.

plt.figure(figsize=(10, 5))
total_sales_by_payment_method = df.groupby('payment')['total'].sum()
total_sales_by_payment_method.plot(kind='barh')
plt.title('Question 1: Total sales by payment method')
plt.xlabel('Total sales')
plt.show()


plt.figure(figsize=(10, 5))
average_unit_price_eachprod = df.groupby('product_line')['unit_price'].mean()
average_unit_price_eachprod.plot(kind='barh')
plt.title('Question 2: Average unit price by product line')
plt.xlabel('Average unit price')
plt.show()

[Optional] Investigate further (e.g., average purchase value by client type, total purchase value by product line, etc.)

Average purchase value by client type:

average_purchase_value_by_cliente.to_frame()

Total purchase value by product line:

total_purchase_value_by_product_line.to_frame().sort_values('total', ascending=False)

Warehouses with the most sales:

warehouse_with_more_sales.to_frame()
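
Top product line by warehouse:

The per-warehouse product results are computed at the top but never displayed. One caveat is that calling `.max()` on a text column such as `product_line` returns the alphabetically last label, not the best seller. Below is a minimal sketch of one way to pick the best-selling line per warehouse. It uses made-up toy data: the warehouse names, product lines, and amounts are assumptions for illustration, not values from sales_data.csv.

```python
import pandas as pd

# Toy data for illustration only (hypothetical values, not from sales_data.csv).
toy = pd.DataFrame({
    "warehouse": ["North", "North", "North", "Central", "Central"],
    "product_line": ["Engine", "Suspension", "Engine", "Engine", "Suspension"],
    "total": [100.0, 250.0, 200.0, 50.0, 75.0],
})

# Naive approach: max() on a string column is an alphabetical maximum.
naive = toy.groupby("warehouse")["product_line"].max()

# Total sales for every (warehouse, product_line) pair, as a flat DataFrame.
sales = toy.groupby(["warehouse", "product_line"], as_index=False)["total"].sum()

# idxmax() returns, per warehouse, the row label of the largest total;
# loc then pulls those full rows, keeping product_line and total together.
top = sales.loc[sales.groupby("warehouse")["total"].idxmax()]

print(naive)
print(top)
```

On this toy data the naive version reports Suspension for North because it sorts last alphabetically, while the summed version reports Engine, which has the larger combined total. Swapping `idxmax()` for `idxmin()` gives the lowest-selling line per warehouse.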