You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.

The Data

You are provided with several CSV files in the "Files/data" folder. They contain worldwide and country-level Google Trends data for keyword searches on fitness and related products.

workout.csv

| Column | Description |
| --- | --- |
| 'month' | Month when the data was measured. |
| 'workout_worldwide' | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |

three_keywords.csv

| Column | Description |
| --- | --- |
| 'month' | Month when the data was measured. |
| 'home_workout_worldwide' | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
| 'gym_workout_worldwide' | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
| 'home_gym_worldwide' | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |

workout_geo.csv

| Column | Description |
| --- | --- |
| 'country' | Country where the data was measured. |
| 'workout_2018_2023' | Index representing the popularity of the keyword 'workout' during the 5-year period. |

three_keywords_geo.csv

| Column | Description |
| --- | --- |
| 'country' | Country where the data was measured. |
| 'home_workout_2018_2023' | Index representing the popularity of the keyword 'home workout' during the 5-year period. |
| 'gym_workout_2018_2023' | Index representing the popularity of the keyword 'gym workout' during the 5-year period. |
| 'home_gym_2018_2023' | Index representing the popularity of the keyword 'home gym' during the 5-year period. |
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
# Start coding here
# Help the fitness studio explore interest in workouts at a global and national level.

# 1. When was the global search for 'workout' at its peak? Save the year of peak interest
# as a string named year_str in the format "yyyy".

# 2. Of the keywords available, what was the most popular during the covid pandemic, and 
# what is the most popular now? Save your answers as variables called peak_covid and current respectively.

# 3. What country has the highest interest for workouts among the following: United States, 
# Australia, or Japan? Save your answer as top_country.

# 4. You'd be interested in expanding your virtual home workouts offering to either the Philippines 
# or Malaysia. Which of the two countries has the highest interest in home workouts? Identify the 
# country and save it as home_workout_geo.


import pandas as pd
import os
import matplotlib.pyplot as plt

# Import data
folder_path = r"data"
csv_files = [file for file in os.listdir(folder_path) if file.endswith(".csv")]

#%%
dataframes = {}
for file in csv_files:
    file_path = os.path.join(folder_path, file)
    df_name = file.split(".")[0]
    df = pd.read_csv(file_path)
    dataframes[df_name] = df
    print(f"DataFrame name: {df_name}")

# Pull out the four DataFrames by name, replacing missing values with 0
workout = dataframes['workout'].fillna(0)
workout_geo = dataframes['workout_geo'].fillna(0)
three_kw = dataframes['three_keywords'].fillna(0)
three_kw_geo = dataframes['three_keywords_geo'].fillna(0)

#%%
# 1. When was the global search for 'workout' at its peak? Save the year of peak interest as a string named year_str in the format "yyyy".
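workout['year'] = workout['month'].str[:4]
# Take the year of the single month with the highest index as the peak;
# yearly sums would understate any year with incomplete data
year_str = workout.loc[workout['workout_worldwide'].idxmax(), 'year']
# Yearly totals, plotted for context only
workout.groupby('year')['workout_worldwide'].sum().plot(kind='bar')
print(year_str)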

#%%
# 2. Of the keywords available, what was the most popular during the covid pandemic, and what is the most popular now? Save your answers as variables called peak_covid and current respectively.
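# Plot all three keywords over time; the answers below are hard-coded after inspecting this chart
three_kw.plot(kind='line', x='month')
peak_covid = 'home_workout_worldwide'
current = 'gym_workout_worldwide'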

print(peak_covid)
print(current)
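
#%%
# Optional cross-check for question 2: a minimal sketch, not part of the original solution.
# Instead of reading the answer off the chart, it picks the keyword column with the highest
# mean index inside a date window. The helper name top_keyword and the example window
# boundaries are assumptions; it also assumes 'month' values start with "yyyy-mm".
def top_keyword(df, start, end, date_col='month'):
    """Return the column with the highest mean index for start <= yyyy-mm <= end."""
    # Compare on the first 7 characters so "yyyy-mm" and "yyyy-mm-dd" values both work
    months = df[date_col].astype(str).str[:7]
    window = df[(months >= start) & (months <= end)]
    # Average only the numeric keyword columns, then return the name of the largest
    return window.drop(columns=[date_col]).mean(numeric_only=True).idxmax()

# Example usage (the window boundaries are illustrative assumptions):
# top_keyword(three_kw, '2020-03', '2020-12')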

#%%
# 3. What country has the highest interest for workouts among the following: United States, Australia, or Japan? Save your answer as top_country.
workout_geo = workout_geo[workout_geo['country'].isin(['United States', 'Australia', 'Japan'])]
workout_geo.plot(kind='bar', x='country')
top_country = workout_geo.loc[workout_geo['workout_2018_2023'].idxmax(),'country']
print(top_country)
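
#%%
# Optional helper for country comparisons: a minimal sketch, not part of the original solution.
# isin() silently drops names that do not match exactly (for example "USA" vs "United States"),
# so this version fails loudly when a requested country is absent from the data.
# The name top_country_among is hypothetical.
def top_country_among(df, countries, value_col, country_col='country'):
    """Return the country from `countries` with the highest value in `value_col`."""
    subset = df[df[country_col].isin(countries)]
    # Report any requested names that were not found instead of ignoring them
    missing = sorted(set(countries) - set(subset[country_col]))
    if missing:
        raise ValueError(f"Countries not found in data: {missing}")
    return subset.loc[subset[value_col].idxmax(), country_col]

# Example usage:
# top_country_among(workout_geo, ['United States', 'Australia', 'Japan'], 'workout_2018_2023')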

#%%
# 4. You'd be interested in expanding your virtual home workouts offering to either the Philippines or Malaysia. Which of the two countries has the highest interest in home workouts? Identify the country and save it as home_workout_geo.
three_kw_geo = three_kw_geo[three_kw_geo['Country'].isin(['Philippines', 'Malaysia'])]
home_workout_geo = three_kw_geo.loc[three_kw_geo['home_workout_2018_2023'].idxmax(), 'Country']
three_kw_geo.plot(kind='bar', x='Country')
print(home_workout_geo)
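
#%%
# Optional helper: a minimal sketch, not part of the original solution.
# The data description lists the column as 'country', while the code above indexes 'Country'.
# Stripping and lower-casing the headers of a copy makes lookups independent of that difference.
# The name normalize_columns is hypothetical.
def normalize_columns(df):
    """Return a copy of df with stripped, lower-cased column names."""
    out = df.copy()
    out.columns = out.columns.str.strip().str.lower()
    return out

# Example usage:
# geo = normalize_columns(three_kw_geo)
# geo.loc[geo['home_workout_2018_2023'].idxmax(), 'country']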