Project: Data-Driven Product Management: Conducting a Market Analysis
You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.
The Data
You are provided with a number of CSV files in the "Files/data" folder, which offer international and national-level data on Google Trends keyword searches related to fitness and related products.
workout.csv
Column | Description |
---|---|
'month' | Month when the data was measured. |
'workout_worldwide' | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |
three_keywords.csv
Column | Description |
---|---|
'month' | Month when the data was measured. |
'home_workout_worldwide' | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
'gym_workout_worldwide' | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
'home_gym_worldwide' | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |
workout_geo.csv
Column | Description |
---|---|
'country' | Country where the data was measured. |
'workout_2018_2023' | Index representing the popularity of the keyword 'workout' during the 5-year period. |
three_keywords_geo.csv
Column | Description |
---|---|
'country' | Country where the data was measured. |
'home_workout_2018_2023' | Index representing the popularity of the keyword 'home workout' during the 5-year period. |
'gym_workout_2018_2023' | Index representing the popularity of the keyword 'gym workout' during the 5-year period. |
'home_gym_2018_2023' | Index representing the popularity of the keyword 'home gym' during the 5-year period. |
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the four datasets (adjust the paths if the CSV files are not in the working directory)
workout_df = pd.read_csv("workout.csv")
three_keywords_df = pd.read_csv("three_keywords.csv")
workout_geo = pd.read_csv("workout_geo.csv")
three_keywords_geo_df = pd.read_csv("three_keywords_geo.csv")
workout_df.head()
# Set index to month
workout_df = workout_df.set_index('month')
When was the global search for 'workout' at its peak?
# 'month' is the index, so idxmax() returns the month with the highest search index
workout_peak = workout_df['workout_worldwide'].idxmax()
print(f"Global searches for 'workout' peaked in {workout_peak}")
g = sns.lineplot(x='month',y='workout_worldwide',data=workout_df)
g.set_title("Global search interest in 'workout', 2018-2023", y=1.02)
plt.xticks(rotation=45)
plt.show()
year_str = "2020"
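The year above is typed in by hand after reading the peak month. As a minimal sketch, it could instead be derived from the peak itself. The `peak_year` helper and the toy values below are hypothetical and only for illustration; the sketch assumes the month labels start with a four-digit year (for example `2020-04`), which should be checked against the real file.

```python
import pandas as pd

# Toy stand-in for workout.csv; the values are made up for illustration only.
toy_workout = pd.DataFrame(
    {
        "month": ["2019-12", "2020-03", "2020-04", "2020-05"],
        "workout_worldwide": [60, 85, 100, 90],
    }
).set_index("month")


def peak_year(df: pd.DataFrame, column: str) -> str:
    """Return the four-digit year of the row where `column` is highest."""
    peak_month = df[column].idxmax()  # index label of the first maximum
    return str(peak_month)[:4]        # assumes labels start with 'YYYY'


toy_year = peak_year(toy_workout, "workout_worldwide")
print(toy_year)
```

With the real data, the same call on `workout_df` would tie `year_str` to whatever month `workout_peak` reports, so the two cannot drift apart.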
Of the keywords available, which was the most popular during the COVID-19 pandemic, and which is the most popular now?
three_keywords_df.head()
covid_filter = three_keywords_df[(three_keywords_df['month']>='2020-01')&(three_keywords_df['month']<='2021-12')]
covid_filter[['home_workout_worldwide','gym_workout_worldwide','home_gym_worldwide']].max().sort_values(ascending=False).index[0]
now_filter = three_keywords_df[(three_keywords_df['month']>='2022-01')]
now_filter[['home_workout_worldwide','gym_workout_worldwide','home_gym_worldwide']].max().sort_values(ascending=False).index[0]
peak_covid = 'home_workout_worldwide'
current = 'gym_workout_worldwide'
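The two answers above are copied by hand from the cell outputs. A small helper can produce them directly, using the same logic as the cells (per-keyword maximum within a date window, then the keyword with the largest maximum). This is only a sketch: `top_keyword` and the toy values are hypothetical, and it assumes `month` holds `YYYY-MM` strings so that string comparison matches chronological order.

```python
import pandas as pd

# Toy stand-in for three_keywords.csv; the values are made up for illustration only.
toy_keywords = pd.DataFrame(
    {
        "month": ["2020-03", "2020-04", "2021-06", "2022-05", "2023-01"],
        "home_workout_worldwide": [40, 70, 20, 12, 11],
        "gym_workout_worldwide": [15, 10, 14, 18, 20],
        "home_gym_worldwide": [20, 35, 22, 14, 13],
    }
)

KEYWORD_COLUMNS = [
    "home_workout_worldwide",
    "gym_workout_worldwide",
    "home_gym_worldwide",
]


def top_keyword(df, start, end=None):
    """Return the keyword column with the highest peak between start and end (inclusive)."""
    mask = df["month"] >= start
    if end is not None:
        mask &= df["month"] <= end
    # Peak value per keyword inside the window, then the name of the largest peak
    return df.loc[mask, KEYWORD_COLUMNS].max().idxmax()


toy_peak_covid = top_keyword(toy_keywords, "2020-01", "2021-12")
toy_current = top_keyword(toy_keywords, "2022-01")
print(toy_peak_covid, toy_current)
```

Comparing peaks rewards a single spike; swapping `.max()` for `.mean()` inside the helper would compare average interest over the window instead, which may be the better measure for "most popular now".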
Which country has the highest interest in workouts among the following: United States, Australia, or Japan?
workout_geo.head()
country_filter = ['United States','Australia','Japan']
workout_geo_filtered = workout_geo[workout_geo['country'].isin(country_filter)]
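# Look up the country in the row with the highest 'workout' index
workout_geo_filtered.loc[workout_geo_filtered['workout_2018_2023'].idxmax(), 'country']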
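To see how the three countries compare rather than only which one leads, the filtered rows can be ranked by their index value. The sketch below uses a hypothetical `rank_countries` helper and invented numbers, so its output says nothing about the real Google Trends figures.

```python
import pandas as pd

# Toy stand-in for workout_geo.csv. The numbers are invented for
# illustration and do not reflect the real Google Trends values.
toy_geo = pd.DataFrame(
    {
        "country": ["United States", "Australia", "Japan", "Canada"],
        "workout_2018_2023": [70, 90, 10, 80],
    }
)


def rank_countries(df, countries, column):
    """Return `column` for the chosen countries, highest value first."""
    subset = df[df["country"].isin(countries)]
    return subset.set_index("country")[column].sort_values(ascending=False)


toy_ranking = rank_countries(
    toy_geo, ["United States", "Australia", "Japan"], "workout_2018_2023"
)
print(toy_ranking)

# The first label in the sorted Series is the country with the highest value
toy_top_country = toy_ranking.index[0]
```

Ranking on the numeric column keeps the answer independent of how the country names happen to sort alphabetically.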