Project: Data-Driven Product Management: Conducting a Market Analysis
You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.
The Data
You are provided with a number of CSV files in the "Files/data" folder, which offer international and national-level data on Google Trends keyword searches related to fitness and related products.
workout.csv
Column | Description |
---|---|
'month' | Month when the data was measured. |
'workout_worldwide' | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |
three_keywords.csv
Column | Description |
---|---|
'month' | Month when the data was measured. |
'home_workout_worldwide' | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
'gym_workout_worldwide' | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
'home_gym_worldwide' | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |
workout_geo.csv
Column | Description |
---|---|
'country' | Country where the data was measured. |
'workout_2018_2023' | Index representing the popularity of the keyword 'workout' during the 5-year period. |
three_keywords_geo.csv
Column | Description |
---|---|
'country' | Country where the data was measured. |
'home_workout_2018_2023' | Index representing the popularity of the keyword 'home workout' during the 5-year period. |
'gym_workout_2018_2023' | Index representing the popularity of the keyword 'gym workout' during the 5-year period. |
'home_gym_2018_2023' | Index representing the popularity of the keyword 'home gym' during the 5-year period. |
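Before analysis, it can help to confirm that each file actually contains the columns listed in the tables above. The following is a minimal sketch, not part of the original project: `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names introduced here, and the check runs on a toy frame with a made-up value rather than on the real CSVs.

```python
import pandas as pd

# Column names copied from the data description tables
EXPECTED_COLUMNS = {
    "workout.csv": ["month", "workout_worldwide"],
    "three_keywords.csv": [
        "month",
        "home_workout_worldwide",
        "gym_workout_worldwide",
        "home_gym_worldwide",
    ],
    "workout_geo.csv": ["country", "workout_2018_2023"],
    "three_keywords_geo.csv": [
        "country",
        "home_workout_2018_2023",
        "gym_workout_2018_2023",
        "home_gym_2018_2023",
    ],
}

def missing_columns(df, filename):
    """Return the documented columns that are absent from df (case-sensitive)."""
    return [col for col in EXPECTED_COLUMNS[filename] if col not in df.columns]

# Toy frame standing in for a loaded file (the value is made up);
# with the real data, pass pd.read_csv("data/workout_geo.csv") instead.
toy = pd.DataFrame({"Country": ["Japan"], "workout_2018_2023": [1]})
print(missing_columns(toy, "workout_geo.csv"))  # ['country']
```

Because the comparison is case-sensitive, a header spelled "Country" is reported as a missing 'country' column, which makes casing mismatches between the description and a file easy to spot.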
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
workout = pd.read_csv("data/workout.csv")
# When did global search interest in 'workout' peak?
workout.plot(x="month", y="workout_worldwide", kind="line")
plt.show()
# Peak year, read off the plot above
year_str = "2020"
# Programmatic alternative (assumes 'month' values start with the year, e.g. "2020-04"):
# year_str = workout.loc[workout["workout_worldwide"].idxmax(), "month"].split("-")[0]
# print(year_str)
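The hard-coded year can be cross-checked against the data instead of being read off the plot. This is a minimal sketch: `peak_year` is a hypothetical helper, the 'month' values are assumed to be date-like strings such as "2020-04", and the numbers in `sample` are made up for illustration, not taken from workout.csv.

```python
import pandas as pd

def peak_year(df, value_col="workout_worldwide", date_col="month"):
    """Return the year (as a string) of the row with the highest value."""
    peak_row = df.loc[df[value_col].idxmax()]
    return str(pd.to_datetime(peak_row[date_col]).year)

# Illustrative values only, not the real Google Trends data
sample = pd.DataFrame({
    "month": ["2019-12", "2020-04", "2021-01"],
    "workout_worldwide": [60, 100, 75],
})
print(peak_year(sample))
```

With the real file loaded, `peak_year(workout)` would give a value to compare against `year_str`.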
# Which keyword was most popular during the COVID-19 pandemic, and which is most popular now?
covid_vs_now = pd.read_csv("data/three_keywords.csv")
# Plot against 'month' so the x-axis shows dates rather than row numbers
covid_vs_now.plot(x="month", kind="line")
plt.show()
peak_covid = "home_workout_worldwide"
current = "gym_workout_worldwide"
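The two answers above are read off the plot; the same comparison can be made numerically by averaging each keyword over a time window. This is a sketch under stated assumptions: `top_keyword` is a hypothetical helper, the window boundaries used for "during the pandemic" and "now" are choices made here rather than values from the project, and the numbers in `sample` are invented for illustration.

```python
import pandas as pd

KEYWORDS = ["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]

def top_keyword(df, start, end, cols=KEYWORDS):
    """Return the keyword column with the highest mean index between start and end (inclusive)."""
    months = pd.to_datetime(df["month"])
    in_window = (months >= pd.Timestamp(start)) & (months <= pd.Timestamp(end))
    return df.loc[in_window, cols].mean().idxmax()

# Illustrative values only, not the real Google Trends data
sample = pd.DataFrame({
    "month": ["2020-04", "2020-05", "2023-02", "2023-03"],
    "home_workout_worldwide": [90, 80, 15, 14],
    "gym_workout_worldwide": [10, 12, 20, 22],
    "home_gym_worldwide": [40, 45, 12, 11],
})
print(top_keyword(sample, "2020-03-01", "2020-12-31"))  # assumed pandemic window
print(top_keyword(sample, "2023-01-01", "2023-12-31"))  # assumed "current" window
```

Averaging over a window is only one possible definition of "most popular"; taking the maximum within the window, or the last available month, are equally defensible alternatives.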
# Which of the United States, Australia, or Japan shows the highest interest in workouts?
highest_interest = pd.read_csv("data/workout_geo.csv")
selected = highest_interest[highest_interest["country"].isin(["United States", "Australia", "Japan"])]
selected.plot(kind="bar", x="country", y="workout_2018_2023")
plt.show()
top_country = "United States"
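A ranked table gives the same answer as the bar chart without relying on visual inspection. This is a minimal sketch: `rank_countries` is a hypothetical helper, and the values in `sample` are made up for illustration rather than taken from workout_geo.csv.

```python
import pandas as pd

def rank_countries(df, countries, value_col="workout_2018_2023"):
    """Return value_col for the chosen countries, sorted from highest to lowest."""
    subset = df[df["country"].isin(countries)]
    return subset.set_index("country")[value_col].sort_values(ascending=False)

# Illustrative values only, not the real Google Trends data
sample = pd.DataFrame({
    "country": ["Japan", "United States", "Canada", "Australia"],
    "workout_2018_2023": [5, 100, 90, 80],
})
ranking = rank_countries(sample, ["United States", "Australia", "Japan"])
print(ranking)
print(ranking.index[0])  # country with the highest interest among the three
```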
# Which of the Philippines and Malaysia shows the higher interest in home workouts?
interest_home_workout = pd.read_csv("data/three_keywords_geo.csv")
# The column is referenced as "Country" here, although the data description lists 'country';
# match whichever spelling the file's header actually uses.
is_selected = interest_home_workout["Country"].isin(["Philippines", "Malaysia"])
highest_interest_home_workout = interest_home_workout.loc[is_selected, ["Country", "home_workout_2018_2023"]]
plt.bar(highest_interest_home_workout["Country"], highest_interest_home_workout["home_workout_2018_2023"])
plt.xlabel("Country")
plt.ylabel("Interest Level")
plt.show()
home_workout_geo = "Philippines"
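The code above refers to a "Country" column while the data description lists 'country'. One way to make the comparison robust to either spelling is to normalize the headers first. This is a sketch, assuming nothing about the real file beyond the documented column names: `higher_interest` is a hypothetical helper and the values in `sample` are invented for illustration.

```python
import pandas as pd

def higher_interest(df, countries, value_col="home_workout_2018_2023"):
    """Return the country in `countries` with the largest value in value_col."""
    # Strip and lower-case headers so "Country" and "country" are treated alike
    tidy = df.rename(columns=lambda name: name.strip().lower())
    subset = tidy[tidy["country"].isin(countries)]
    return subset.loc[subset[value_col].idxmax(), "country"]

# Illustrative values only, not the real Google Trends data
sample = pd.DataFrame({
    "Country": ["Philippines", "Malaysia", "Japan"],
    "home_workout_2018_2023": [50, 45, 10],
})
print(higher_interest(sample, ["Philippines", "Malaysia"]))
```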