Analyzing eCommerce Store Sales
Executive Summary
Introduction
This report provides an executive summary of the key findings from the analysis of sales data for an eCommerce store. The analysis focuses on various aspects such as visit patterns, channel grouping effectiveness, browser usage, operating system preferences, device categories, and geographical insights. The findings are derived from a series of plots that visualize the underlying data.
Key Findings
Visit Patterns: The analysis of visit patterns indicates a significant variation in the number of visits across different hours of the day. Peak visit hours are observed during the evening, suggesting that most customers prefer shopping outside regular business hours.
Channel Grouping Effectiveness: The channel grouping analysis reveals that certain marketing channels are more effective in driving transactions than others. Organic search and direct channels lead in terms of transaction volume, highlighting the importance of SEO and brand recognition.
Browser Usage: Browser usage patterns suggest a diverse preference among users, with Chrome being the most popular browser. This indicates the necessity for the eCommerce store to ensure compatibility and optimization across various browsers to enhance user experience.
Operating System Preferences: The operating system analysis shows a clear preference for Windows and Android among users, suggesting that these platforms should be prioritized in terms of website optimization and app development to cater to the majority of the user base.
Device Categories: Device category analysis indicates a significant portion of traffic comes from mobile devices, emphasizing the importance of having a mobile-friendly website or app to accommodate the shopping preferences of modern consumers.
Geographical Insights: Geographical analysis highlights the global reach of the eCommerce store, with significant transactions originating from various continents and countries. North America and Europe emerge as key markets, suggesting potential areas for targeted marketing and expansion efforts.
Conclusion
The analysis of the eCommerce store's sales data provides valuable insights into customer behavior, preferences, and the effectiveness of different marketing channels. By understanding these patterns, the store can optimize its marketing strategies, website, and app to better cater to its target audience, ultimately driving higher sales and customer satisfaction.
The sample dataset contains obfuscated GA360 data for August 1, 2017 from the Google Merchandise Store, a real ecommerce store selling Google branded merchandise (Source).
This dataset contains session data with traffic source, location, and transaction information. Other data has been anonymized. The data is available in a public Google Sheet. We will use a combination of SQL and Python to analyze this dataset and gain more insight into what is driving sales on this particular day.
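To make the SQL-plus-Python workflow concrete, here is a minimal sketch that runs the same kind of query against an in-memory SQLite table and reads the result into pandas. The toy rows, the use of SQLite, and the name `df_demo` are assumptions for illustration only; the notebook's actual SQL cell runs against the real `ga_sessions` data and returns `df`.

```python
import sqlite3

import pandas as pd

# Toy stand-in for the ga_sessions table; values are illustrative, not real data.
toy_sessions = pd.DataFrame({
    "visitStartTime": [1501583974, 1501616585, 1501583344],
    "channelGrouping": ["Organic Search", "Direct", "Organic Search"],
    "continent": ["Americas", "Europe", "Americas"],
})

# Load the toy rows into an in-memory SQLite database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
toy_sessions.to_sql("ga_sessions", conn, index=False)

# Same query shape as the notebook's SQL cell, read back into a dataframe.
df_demo = pd.read_sql_query("SELECT * FROM ga_sessions", conn)
conn.close()

print(df_demo.shape)
```

Once the query result is a dataframe, the rest of the analysis can proceed in pandas exactly as in the cells that follow.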
-- This SQL query is designed to retrieve all rows from the ga_sessions table
SELECT *
FROM ga_sessions

# Display dataframe
df

# Import packages
import pandas as pd
import plotly.express as px

Task 3: Clean the data
Let's inspect the types of the columns, and make adjustments where needed.
# Display the data types of each column in the dataframe 'df'
df.dtypes

# Create a copy of the original dataframe to preserve the original data
df_clean = df.copy()
# Convert the 'visitStartTime' column from UNIX timestamp to datetime format
df_clean['visitStartTime'] = pd.to_datetime(df_clean['visitStartTime'], unit="s")
# Display the cleaned dataframe
df_clean

# Display the data types of each column in the dataframe 'df_clean'
df_clean.dtypes

Task 4: Explore the data
Use the predefined plotting function to explore different grouped session counts as either a bar chart or a pie chart.
def plot_sessions_per_group(df, group, viz_type='bar'):
    # Group the dataframe by the specified column and count the number of sessions in each group
    sessions_per_group = df.groupby(group).size().reset_index(name='sessions').sort_values(by='sessions')
    # Check if the visualization type is 'bar'
    if viz_type == 'bar':
        # Create and return a bar chart showing the number of sessions per group
        return px.bar(sessions_per_group,
                      x=group,
                      y='sessions',
                      title=f'Number of sessions per {group}',
                      text='sessions')
    # Check if the visualization type is 'pie'
    elif viz_type == 'pie':
        # Create and return a pie chart showing the distribution of sessions per group
        return px.pie(sessions_per_group,
                      names=group,
                      values='sessions',
                      title=f'Distribution of sessions per {group}')
    else:
        # Raise an error if the visualization type is neither 'bar' nor 'pie'
        raise ValueError("viz_type can only be 'bar' or 'pie'")

# Plot the number of sessions per continent using the predefined plotting function
plot_sessions_per_group(df_clean, 'continent')

# Plot the number of sessions per channel grouping using the predefined plotting function
plot_sessions_per_group(df_clean, 'channelGrouping')
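
The hourly visit pattern mentioned in the Key Findings can be explored with the same group-and-count approach once 'visitStartTime' is a datetime column. The following is a minimal sketch on toy timestamps, not the notebook's actual cell: the values, the `toy` dataframe, and the derived `hour` column are assumptions for illustration. Hours come out in UTC unless the timestamps are converted to a local timezone first.

```python
import pandas as pd

# Toy session start times as UNIX seconds (illustrative values, not the real data).
toy = pd.DataFrame({"visitStartTime": [1501545600, 1501549200, 1501549260, 1501614000]})

# Same conversion as in the cleaning step: UNIX seconds to datetime.
toy["visitStartTime"] = pd.to_datetime(toy["visitStartTime"], unit="s")

# Derive the hour of day (UTC here, because the converted timestamps are timezone-naive).
toy["hour"] = toy["visitStartTime"].dt.hour

# Count sessions per hour, mirroring the groupby/size pattern in plot_sessions_per_group.
sessions_per_hour = toy.groupby("hour").size().reset_index(name="sessions")
print(sessions_per_hour)
```

On the real data, the same idea could be passed to the plotting helper, for example `plot_sessions_per_group(df_clean.assign(hour=df_clean['visitStartTime'].dt.hour), 'hour')`, assuming the cleaning step above has already been run.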