# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

When was the global search for 'workout' at its peak?

To determine when the global search for 'workout' was at its peak, follow these steps:

  1. Identify the highest value in the 'workout_worldwide' column.
  2. Locate the row(s) in the DataFrame that match this highest value.
  3. Extract the 'month' value from these row(s).
  4. Format the 'month' value to return the result in 'YYYY' format.
import pandas as pd

workout = pd.read_csv("data/workout.csv")
workout

# find the max workout worldwide
# return the year associated with it

max_num = workout['workout_worldwide'].max()
year_str = workout.loc[workout['workout_worldwide'] == max_num, 'month'].values[0][:4]
print('The global search for workout was at its peak in the year ' + year_str + '.')
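
Step 2 above allows for more than one row at the peak, while the snippet keeps only the first match. A minimal, tie-aware sketch follows. The `sample` frame and the `peak_years` helper are hypothetical stand-ins for `data/workout.csv`, and the sketch assumes `month` is stored as a string that begins with a four-digit year.

```python
import pandas as pd

# Hypothetical stand-in for data/workout.csv (values invented for illustration).
sample = pd.DataFrame({
    "month": ["2019-12-01", "2020-04-01", "2020-05-01", "2021-01-01"],
    "workout_worldwide": [60, 100, 85, 100],
})

def peak_years(df: pd.DataFrame) -> list[str]:
    """Return every distinct year (YYYY) in which the column hits its maximum."""
    peak = df["workout_worldwide"].max()
    months = df.loc[df["workout_worldwide"] == peak, "month"]
    # Assumes 'month' starts with a four-digit year, e.g. '2020-04-01'.
    return sorted({str(m)[:4] for m in months})

print(peak_years(sample))
```

If only one row reaches the maximum, the returned list has a single element, which matches the result of the snippet above.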

Of the keywords available, what was the most popular during the COVID pandemic, and what is the most popular now?

To determine which keyword was the most popular during the COVID pandemic and which one is the most popular now, follow these steps:

  1. Most Popular During the COVID Pandemic:

    • Calculate the sum of each keyword column individually.
    • Identify the maximum value among these sums to determine the most popular keyword.
    • Find the keyword column name that corresponds to this maximum sum value.
  2. Most Popular Currently:

    • Extract the last row from the DataFrame.
    • Identify the keyword with the highest popularity in this last row.
    • Return the name of the most searched keyword.
import pandas as pd

three_keywords = pd.read_csv('data/three_keywords.csv')
three_keywords

# Sum each keyword column over the whole dataset and pick the largest total.
# idxmax returns the column name of the maximum; on a tie it returns the first match.

keyword_cols = ['home_workout_worldwide', 'gym_workout_worldwide', 'home_gym_worldwide']

# Determine the most popular keyword during COVID

peak_covid = three_keywords[keyword_cols].sum().idxmax()

print('Of the keywords available, the most popular during the COVID pandemic was ' + peak_covid + '.')

# Check the last row to find the current most popular keyword

current = three_keywords[keyword_cols].iloc[-1].idxmax()

print('The most popular now is ' + current + '.')
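
The first step totals each column over every row in the file, so the answer reflects the whole time range rather than the pandemic alone. If the file also carries a `month` column like `workout.csv` does (an assumption; this notebook does not show it), the totals could be restricted to a date window. The sketch below uses invented sample data, a hypothetical `top_keyword_between` helper, and placeholder window dates.

```python
import pandas as pd

# Hypothetical stand-in for data/three_keywords.csv (values invented).
# Assumption: a 'month' column exists, as it does in workout.csv.
sample = pd.DataFrame({
    "month": ["2019-06-01", "2020-04-01", "2020-11-01", "2022-09-01"],
    "home_workout_worldwide": [10, 90, 40, 12],
    "gym_workout_worldwide": [20, 15, 18, 25],
    "home_gym_worldwide": [12, 60, 50, 14],
})
keyword_cols = ["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]

def top_keyword_between(df, start, end):
    """Return the keyword column with the largest total between start and end (inclusive)."""
    months = pd.to_datetime(df["month"])
    in_window = months.between(pd.Timestamp(start), pd.Timestamp(end))
    return df.loc[in_window, keyword_cols].sum().idxmax()

# Placeholder window; pick the dates that match your own definition of the pandemic.
peak_covid = top_keyword_between(sample, "2020-03-01", "2021-12-31")

# The most recent row stands in for "now".
current = sample[keyword_cols].iloc[-1].idxmax()

print(peak_covid, current)
```

Selecting the keyword columns before taking the last row keeps the row numeric, so `idxmax` never has to compare a date string with a number.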

Which country has the highest interest in workouts among the following: United States, Australia, or Japan?

To determine which country has the highest interest in workouts among the United States, Australia, and Japan, we need to:

  1. Extract the values for each of these countries from the workout_geo DataFrame under the workout_2018_2023 column.
  2. Identify the maximum value among these three countries.
  3. Return the country with the highest value as top_country.
import pandas as pd

workout_geo = pd.read_csv('data/workout_geo.csv')
workout_geo

# Look up the values for the United States, Australia, and Japan,
# find the maximum of the three, and print the matching country.

# .iloc[0] extracts a scalar, so the comparisons below do not rely on one-element arrays.
US = workout_geo.loc[workout_geo['country'] == 'United States', 'workout_2018_2023'].iloc[0]

AUS = workout_geo.loc[workout_geo['country'] == 'Australia', 'workout_2018_2023'].iloc[0]

JPN = workout_geo.loc[workout_geo['country'] == 'Japan', 'workout_2018_2023'].iloc[0]

# Find the highest value (a tie falls through to a later branch)

if US > AUS and US > JPN:
    top_country = 'United States'
elif AUS > US and AUS > JPN:
    top_country = 'Australia'
else:
    top_country = 'Japan'

print('The country that has the highest interest in workouts is ' + top_country + '.')
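
The three steps above can also be written without one branch per country. The following is a sketch under stated assumptions, not the notebook's own method: the `sample` frame holds invented values in place of `data/workout_geo.csv`, and `top_country_among` is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical stand-in for data/workout_geo.csv (values invented).
sample = pd.DataFrame({
    "country": ["United States", "Australia", "Japan", "Canada"],
    "workout_2018_2023": [100.0, 77.0, 1.0, 86.0],
})

def top_country_among(df, countries, value_col="workout_2018_2023"):
    """Return the country from `countries` with the highest value in `value_col`."""
    subset = df[df["country"].isin(countries)]
    # Fail loudly if a requested country is absent instead of silently ignoring it.
    missing = set(countries) - set(subset["country"])
    if missing:
        raise KeyError(f"countries not found: {sorted(missing)}")
    return subset.loc[subset[value_col].idxmax(), "country"]

top_country = top_country_among(sample, ["United States", "Australia", "Japan"])
print(top_country)
```

Because the country list is a parameter, the same helper works for any number of countries.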

Which of the two countries (Philippines or Malaysia) has the higher interest in home workouts?

To determine which country, the Philippines or Malaysia, has the higher interest in home workouts, follow these steps:

  1. Retrieve the popularity values for the keyword 'home_workout_2018_2023' for both the Philippines and Malaysia from the three_keywords_geo DataFrame.
  2. Compare the retrieved values to identify the country with the higher interest.
  3. Return a statement indicating which country has the higher interest in home workouts.
import pandas as pd

three_keywords_geo = pd.read_csv('data/three_keywords_geo.csv')
three_keywords_geo

# .iloc[0] extracts a scalar for each country so the comparison yields a single True/False.
PH = three_keywords_geo.loc[three_keywords_geo['Country'] == 'Philippines', 'home_workout_2018_2023'].iloc[0]

MY = three_keywords_geo.loc[three_keywords_geo['Country'] == 'Malaysia', 'home_workout_2018_2023'].iloc[0]

# If the two values are equal, this reports Malaysia.
home_workout_geo = 'Philippines' if PH > MY else 'Malaysia'

print('The country with the higher interest in home workouts is ' + home_workout_geo + '.')
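
The comparison in step 2 reports Malaysia whenever the two values are equal. A small sketch that reports a tie explicitly is shown below. The `sample` frame contains invented values in place of `data/three_keywords_geo.csv`, `compare_interest` is a hypothetical helper, and the sketch assumes each country appears exactly once in the `Country` column.

```python
import pandas as pd

# Hypothetical stand-in for data/three_keywords_geo.csv (values invented).
sample = pd.DataFrame({
    "Country": ["Philippines", "Malaysia", "Singapore"],
    "home_workout_2018_2023": [52.0, 47.0, 40.0],
})

def compare_interest(df, first, second, value_col="home_workout_2018_2023"):
    """Return the country with the higher value, or 'tie' when both are equal."""
    # Assumes each country appears exactly once, so .loc returns a scalar.
    scores = df.set_index("Country")[value_col]
    a, b = scores.loc[first], scores.loc[second]
    if a == b:
        return "tie"
    return first if a > b else second

home_workout_geo = compare_interest(sample, "Philippines", "Malaysia")
print(home_workout_geo)
```

A country that is missing from the data raises a `KeyError` at the `.loc` lookup, which surfaces a misspelled name early.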