AI Python Zero-to-Hero: Build Your Own Fitness Tracker

In this code-along, we'll learn how to combine AI, data analysis, and Python to explore and visualize fitness data.

The data is available in this workbook as fitness_data.csv. It is synthetic data consisting of the following columns:

Data Dictionary

| Column Name | Description | Additional Context |
| --- | --- | --- |
| date | The specific day of data recording | Allows tracking changes and patterns over the 185-day period |
| steps | Total daily step count | The common goal of 10,000 steps serves as a reference point for daily activity level assessment |
| weight | Body weight measurement (in kg) | Tracked daily to monitor body mass changes over time |
| resting_heart_rate | Heart beats per minute while at complete rest | Lower values typically indicate better cardiovascular fitness |
| sleep_hours | Total daily sleep duration | Includes all sleep phases; adults typically need 7-9 hours per night for optimal health |
| active_minutes | Total time spent in physical activity | Encompasses all activity intensities throughout the day |
| total_calories_burned | Total daily energy expenditure | Combines both resting metabolic rate and activity-based calorie burn |
| fat_burn_minutes | Time in 50-69% of max heart rate zone | Lower intensity zone optimal for building base endurance and metabolizing fat |
| cardio_minutes | Time in 70-84% of max heart rate zone | Moderate to high intensity zone that improves cardiovascular capacity |
| peak_minutes | Time in 85%+ of max heart rate zone | Highest intensity zone, typically reached during interval training or sprints |
| workout_type | Category of exercise performed | Helps analyze the distribution and effectiveness of different activities |
| workout_duration | Length of exercise session in minutes | Used to analyze exercise patterns and time commitment |
| workout_calories | Energy expended during workout | Specifically tracks calories burned during structured exercise sessions |
| workout_avg_hr | Mean heart rate during exercise | Indicates the overall intensity of the workout session |
| workout_max_hr | Highest heart rate during exercise | Shows the point of maximum exertion during the workout |
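
To make the three heart rate zones in the data dictionary concrete, here is a minimal sketch that maps a single heart rate reading to a zone. The function name `heart_rate_zone`, the `below_zones` label, and the maximum heart rate of 190 bpm are assumptions made up for this illustration; they are not part of the dataset. The sketch also assumes each band runs up to the start of the next one (so 69.5% still counts as fat burn).

```python
def heart_rate_zone(heart_rate, max_heart_rate):
    """Return the zone name for one reading, using the data dictionary's bands."""
    pct_of_max = heart_rate / max_heart_rate * 100
    if pct_of_max >= 85:
        return "peak"
    if pct_of_max >= 70:
        return "cardio"
    if pct_of_max >= 50:
        return "fat_burn"
    return "below_zones"  # hypothetical label for readings under 50% of max


# Hypothetical maximum heart rate, chosen only for this example
example_max_hr = 190

for bpm in (90, 120, 150, 170):
    print(bpm, heart_rate_zone(bpm, example_max_hr))
```

The columns fat_burn_minutes, cardio_minutes, and peak_minutes can be read as the number of minutes per day in which readings fell into each of these bands.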

Most fitness tracking apps and devices record variations of the columns above and let you export the data for your own analysis.

Let's dive in!

Task 0: Setup

We're going to use pandas for data analysis and plotly.express for interactive data visualization.

# Import necessary packages
import pandas as pd
import plotly.express as px

Task 1: Reading in the data 📖

Let's read in and display the fitness_data.csv file using pandas.

# Read in the dataset "fitness_data.csv"
fitness_data = pd.read_csv('fitness_data.csv')

# Display the data
fitness_data.head()
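
As an optional variation, `pd.read_csv` can parse date columns while loading through its `parse_dates` parameter. The sketch below uses a tiny in-memory CSV with made-up values (not rows from fitness_data.csv) so it runs without the file; the variable names `sample_csv` and `sample` are placeholders.

```python
import io

import pandas as pd

# Made-up rows standing in for the real file, for illustration only
sample_csv = io.StringIO(
    "date,steps,weight\n"
    "2024-01-01,8500,70.2\n"
    "2024-01-02,10400,70.1\n"
)

# parse_dates asks pandas to convert the listed columns to datetime on load
sample = pd.read_csv(sample_csv, parse_dates=["date"])

print(sample.shape)   # (rows, columns)
print(sample.dtypes)  # the date column should now be a datetime type
```

Checking `.shape` right after loading is a quick way to confirm that the number of rows and columns matches what we expect.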

Task 2: Checking for missing values 🔎

Now that we've read in the data, let's check whether it has any missing values.

# Does the data have any missing values?
missing_values = fitness_data.isna().sum()
missing_values

# Which rows have missing values?
missing_value_locations = fitness_data[fitness_data.isna().any(axis=1)]
missing_value_locations
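
Counts are easier to judge next to the size of the data, so it can help to look at the share of missing values per column as well. This is a minimal sketch on a toy DataFrame with made-up values; `toy`, `missing_share`, and `rows_with_gaps` are names introduced only for this example.

```python
import numpy as np
import pandas as pd

# Toy data with deliberate gaps, for illustration only
toy = pd.DataFrame({
    "steps": [8000, np.nan, 12000, 9500],
    "sleep_hours": [7.5, 6.0, np.nan, np.nan],
})

# isna() gives True/False per cell; the mean of booleans is the share of True
missing_share = toy.isna().mean() * 100
print(missing_share)

# Rows where at least one column is missing
rows_with_gaps = toy[toy.isna().any(axis=1)]
print(len(rows_with_gaps))
```

The same two lines work on `fitness_data` by swapping in its name for `toy`.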

Task 3: Exploring the data 🔎

Let's start exploring our data by calculating summary statistics and confirming that our columns are of the correct data type.

# What are the summary statistics?
summary_stats = fitness_data.describe()
summary_stats

# Check the data types of each column
data_types = fitness_data.dtypes
data_types

# Convert the date column to a datetime format
fitness_data['date'] = pd.to_datetime(fitness_data['date'])
fitness_data.dtypes
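
To see why the datetime conversion matters, here is a small sketch on made-up dates. Once the column holds real datetimes, we can subtract dates and use the `.dt` accessor, neither of which works on plain strings. The `toy` DataFrame and `span_days` variable are placeholders for this example.

```python
import pandas as pd

# Made-up dates stored as text, the way they arrive from a CSV
toy = pd.DataFrame({"date": ["2024-01-01", "2024-01-02", "2024-01-05"]})

# Convert the text column to datetimes
toy["date"] = pd.to_datetime(toy["date"])

# Date arithmetic: days between the first and last record
span_days = (toy["date"].max() - toy["date"].min()).days
print(span_days)

# The .dt accessor exposes date parts such as the weekday name
weekday_names = toy["date"].dt.day_name().tolist()
print(weekday_names)
```

Running the same span calculation on `fitness_data['date']` is one way to check the recording period described in the data dictionary.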

Task 4: Creating useful new columns

Sometimes we want to give columns more descriptive names so they are easier to interpret, or create new columns that measure things we are interested in.

# Rename 'weight' and 'workout_duration' to have more descriptive names
fitness_data = fitness_data.rename(
    columns={'weight': 'body_weight_kg', 'workout_duration': 'workout_duration_minutes'}
)
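
Creating a new column follows the same pattern as any assignment: compute a value from existing columns and store it under a new name. The sketch below uses a toy DataFrame with made-up values. The column names `met_step_goal` and `workout_duration_hours` are hypothetical examples, not columns from the dataset; the 10,000-step threshold comes from the data dictionary.

```python
import pandas as pd

# Toy data with made-up values, for illustration only
toy = pd.DataFrame({
    "steps": [8500, 10400, 12100],
    "workout_duration": [30, 45, 0],
})

# Rename for clarity, mirroring the rename in this task
toy = toy.rename(columns={"workout_duration": "workout_duration_minutes"})

# New boolean column: did the day reach the common 10,000-step goal?
toy["met_step_goal"] = toy["steps"] >= 10_000

# New numeric column: workout length converted from minutes to hours
toy["workout_duration_hours"] = toy["workout_duration_minutes"] / 60

print(toy)
```

Boolean columns like `met_step_goal` are handy later on because their mean gives the share of days on which the condition held.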