Skip to content
0

Driving Growth and Efficiency: Business Insights and Supply Chain Analysis

from IPython.display import Image

# Replace 'data/JIT_LOGO.jpg' with the correct path to your image
image_path = 'data/JIT_LOGO.jpg'

# Display the image in the notebook
Image(filename=image_path)

Importing Libraries

  • This section of the code is responsible for importing the necessary libraries and modules that will be used throughout the notebook for data visualization and analysis.
import plotly.express as px
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns

Data Cleaning

This section of the code involves data cleaning and preparation steps for the dataset. The key data cleaning tasks performed are as follows:

  1. Loading Data: The code loads three datasets: 'orders_and_shipments.csv', 'inventory.csv', and 'fulfillment.csv', using the pandas library's read_csv() function.

  2. Column Names Cleaning: The column names in the 'orders_and_shipments.csv' dataset are cleaned by removing leading and trailing whitespaces using the str.strip() method.

  3. Data Type Conversion: The 'Order Time' column is converted to the datetime data type using pd.to_datetime(). Additionally, the 'Discount %' column is processed by removing the '%' symbol and converting it to a numeric data type using pd.to_numeric().

  4. Hour Extraction: A new column 'Order Hour' is created by extracting the hour from the 'Order Time' column, and the original 'Order Time' column is dropped using the drop() method.

  5. Numeric Column Validation: The code checks for any non-numeric values in the numeric columns ('int64' and 'float64') of the dataset. If there are any non-numeric values, it prints the column names that contain them.

  6. Display Data Types: After data cleaning, the code displays the data types of each column to ensure proper conversion.

By completing these data cleaning tasks, the dataset is now ready for further analysis and visualization.

data = pd.read_csv("data/orders_and_shipments.csv")
data
# Load the orders and shipments data
orders_shipments_df = pd.read_csv('data/orders_and_shipments.csv')

# Load the inventory data
inventory_df = pd.read_csv('data/inventory.csv')

# Load the fulfillment data
fulfillment_df = pd.read_csv('data/fulfillment.csv')

data.columns = data.columns.str.strip()
print(data.columns)
#Check Column DTypes
print(data.dtypes)

#Check for text strings in numeric columns
numeric_columns = data.select_dtypes(include=['int64', 'float64']).columns
non_numeric_columns = []

for column in numeric_columns:
    if data[column].apply(lambda x: isinstance(x, str)).any():
        non_numeric_columns.append(column)

if len(non_numeric_columns) > 0:
    print("Non-numeric values found in the following columns:")
    print(non_numeric_columns)
else:
    print("All numeric columns contain only numeric values.")
# Convert 'Order Time' column to datetime type
data['Order Time'] = pd.to_datetime(data['Order Time'])

# Extract hour from 'Order Time' column
data['Order Hour'] = data['Order Time'].dt.hour

# Drop the original 'Order Time' column
data = data.drop(columns=['Order Time'])

# Remove '%' symbol from 'Discount %' column
data['Discount %'] = data['Discount %'].str.replace('%', '')

# Convert 'Discount %' column to numeric type
data['Discount %'] = pd.to_numeric(data['Discount %'], errors='coerce')
#Check Column DTypes
print(data.dtypes)

Business Sales and Profit Analysis

Sales Performance by Product Category

  • The code groups the data by 'Product Category' and calculates the sum of 'Gross Sales' and 'Profit' for each category. The results are displayed in a tabular format.
sales_performance = data.groupby('Product Category').agg({'Gross Sales': 'sum', 'Profit': 'sum'})

print(sales_performance)

Business Performance Analysis

  • This part calculates the total 'Gross Sales' and 'Total Profits' by summing up the corresponding columns. It also calculates the 'Sales Growth Rate' by year and displays the results in a tabular format.
# Calculate Gross Sales by summing up the 'Gross Sales' column
gross_sales = data['Gross Sales'].sum()

# Calculate Total Profits by summing up the 'Profit' column
total_profits = data['Profit'].sum()

# Calculate the Sales Growth Rate
# First, group the data by 'Order Year' and calculate the total Gross Sales for each year
yearly_sales = data.groupby('Order Year')['Gross Sales'].sum().reset_index()

# Calculate the percentage change in Gross Sales from one year to the next
yearly_sales['Sales Growth Rate'] = yearly_sales['Gross Sales'].pct_change() * 100

# Display the results
print(f'Gross Sales: ${gross_sales}')
print(f'Total Profits: ${total_profits}')
print('Sales Growth Rate:')
print(yearly_sales[['Order Year', 'Sales Growth Rate']])