Data-Driven Product Management: Conducting a Market Analysis
Explore local and global fitness trends to identify product niches. Investigate online interest in gyms, workouts, digital services, and web apps.
In this project, you'll explore local and international markets to find opportunities for your fitness products. You'll use your data manipulation skills to examine data about online interest in home gyms, gym workouts, home workouts, and fitness products, and create visualizations to help guide your product decisions.
You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.
The Data
You are provided with a number of CSV files in the "Files/data" folder, which offer international and national-level data on Google Trends keyword searches for fitness and related products.
workout.csv
Column | Description |
---|---|
'month' | Month when the data was measured. |
'workout_worldwide' | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |
three_keywords.csv
Column | Description |
---|---|
'month' | Month when the data was measured. |
'home_workout_worldwide' | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
'gym_workout_worldwide' | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
'home_gym_worldwide' | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |
workout_geo.csv
Column | Description |
---|---|
'country' | Country where the data was measured. |
'workout_2018_2023' | Index representing the popularity of the keyword 'workout' over the five-year period from 2018 to 2023. |
three_keywords_geo.csv
Column | Description |
---|---|
'country' | Country where the data was measured. |
'home_workout_2018_2023' | Index representing the popularity of the keyword 'home workout' over the five-year period from 2018 to 2023. |
'gym_workout_2018_2023' | Index representing the popularity of the keyword 'gym workout' over the five-year period from 2018 to 2023. |
'home_gym_2018_2023' | Index representing the popularity of the keyword 'home gym' over the five-year period from 2018 to 2023. |
Help the fitness studio explore interest in workouts at a global and national level.
How to approach the project
- Load data on global interest in workouts
- Find the time of peak searches for workout
- Find the most popular keywords for the current year and during covid
- Find the country with the highest interest for workouts
- Find the country in the MESA region with the highest interest in home workouts
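The two country-level steps in the list above both reduce to "find the row with the largest index value, optionally within a shortlist of countries". A minimal sketch of that lookup follows. The helper name `top_country` is hypothetical, the column names come from the data description above, and the sample rows are placeholders rather than real Google Trends values. Which countries belong to the regional shortlist is not stated here, so the caller supplies them.

```python
import pandas as pd


def top_country(df, value_col, countries=None, country_col="country"):
    """Return the country with the highest value in value_col.

    If countries is given, only rows for those countries are considered.
    """
    if countries is not None:
        df = df[df[country_col].isin(countries)]
    # idxmax returns the index label of the largest value (NaN is skipped)
    return df.loc[df[value_col].idxmax(), country_col]


# Placeholder rows for illustration only, not the real Google Trends values
sample_geo = pd.DataFrame({
    "country": ["Country A", "Country B", "Country C"],
    "workout_2018_2023": [40.0, 90.0, 65.0],
})

overall = top_country(sample_geo, "workout_2018_2023")
shortlisted = top_country(
    sample_geo, "workout_2018_2023", countries=["Country A", "Country C"]
)
print(overall, shortlisted)
```

With the real files, the same helper could be called on the DataFrame read from workout_geo.csv, or on three_keywords_geo.csv with value_col set to 'home_workout_2018_2023'.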
# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt
1. Load data on global interest in workouts
# use 2nd variant of function to load dataset
def display_multiple_datasets(file_paths):
    """
    This function takes a list of file paths, reads each CSV file into a DataFrame,
    displays the first few rows, and prints the DataFrame's info for each file.

    Parameters:
    file_paths (list of str): A list of paths to the CSV files.

    Returns:
    None
    """
    for file_path in file_paths:
        print(f"Displaying data for: {file_path}")
        df = pd.read_csv(file_path)
        display(df.head())  # display() is available in Jupyter/IPython sessions
        df.info()  # info() prints its own summary and returns None
        print("\n" + "=" * 50 + "\n")

# Example usage:
file_paths = [
    'data/three_keywords.csv',
    'data/workout_geo.csv',
    'data/three_keywords_geo.csv',
    'data/workout.csv'
]
display_multiple_datasets(file_paths)
# display_multiple_datasets returns None, so load the DataFrame directly
workout = pd.read_csv('data/workout.csv')
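The helper above only displays each file and returns nothing, so its result cannot be reused later in the analysis. One possible variant, sketched here under the assumption that each file name is unique, also hands back the loaded DataFrames. The name `load_datasets` and the choice to key the result by file stem are illustrative, not part of the original project.

```python
from pathlib import Path

import pandas as pd


def load_datasets(file_paths):
    """Read each CSV into a DataFrame and return them keyed by file stem.

    For example, 'data/workout.csv' is stored under the key 'workout'.
    """
    frames = {}
    for file_path in file_paths:
        frames[Path(file_path).stem] = pd.read_csv(file_path)
    return frames
```

Calling `frames = load_datasets(file_paths)` would then make each dataset available by name, for example `frames["workout"]`.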
2. Find the time of peak searches for workout
When did global search interest in 'workout' peak? Save the year of peak interest as a string named year_str in the format "yyyy".
# Find the peak for global 'workout' searches
df_workout = pd.read_csv("data/workout.csv")
plt.plot(df_workout["month"], df_workout["workout_worldwide"])
plt.xticks(rotation=90)
plt.show()
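The plot shows the peak visually, but year_str can also be derived directly from the data. The sketch below is one way to do that. It assumes the 'month' values are date strings that `pd.to_datetime` can parse (for example "2020-04" or "2020-04-01"). The helper name `peak_year` is hypothetical, and the sample values are placeholders, not the real Google Trends numbers.

```python
import pandas as pd


def peak_year(df, date_col="month", value_col="workout_worldwide"):
    """Return the year of the row with the highest value, as a 'yyyy' string."""
    # idxmax gives the index label of the maximum; loc fetches that row
    peak_row = df.loc[df[value_col].idxmax()]
    return str(pd.to_datetime(peak_row[date_col]).year)


# Placeholder values for illustration only, not the real Google Trends numbers
sample = pd.DataFrame({
    "month": ["2019-12", "2020-04", "2021-01"],
    "workout_worldwide": [50, 100, 70],
})

year_str = peak_year(sample)
print(year_str)
```

On the real data, `year_str = peak_year(df_workout)` would apply the same logic to the DataFrame plotted above.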