AI Python Zero-to-Hero: Build Your Own Fitness Tracker
In this code-along, we'll learn how to combine AI, data analysis, and Python to explore and visualize fitness data.
The data is available in this workbook as fitness_data.csv. It is synthetic data consisting of the following columns:
Data Dictionary
Column Name | Description | Additional Context
---|---|---
date | The specific day of data recording | Allows tracking changes and patterns over the 185-day period |
steps | Total daily step count | The common goal of 10,000 steps serves as a reference point for daily activity level assessment |
weight | Body weight measurement (in kg) | Tracked daily to monitor body mass changes over time |
resting_heart_rate | Heart beats per minute while at complete rest | Lower values typically indicate better cardiovascular fitness |
sleep_hours | Total daily sleep duration | Includes all sleep phases; adults typically need 7-9 hours per night for optimal health |
active_minutes | Total time spent in physical activity | Encompasses all activity intensities throughout the day |
total_calories_burned | Total daily energy expenditure | Combines both resting metabolic rate and activity-based calorie burn |
fat_burn_minutes | Time in 50-69% of max heart rate zone | Lower intensity zone optimal for building base endurance and metabolizing fat |
cardio_minutes | Time in 70-84% of max heart rate zone | Moderate to high intensity zone that improves cardiovascular capacity |
peak_minutes | Time in 85%+ of max heart rate zone | Highest intensity zone, typically reached during interval training or sprints |
workout_type | Category of exercise performed | Helps analyze the distribution and effectiveness of different activities |
workout_duration | Length of exercise session in minutes | Used to analyze exercise patterns and time commitment |
workout_calories | Energy expended during workout | Specifically tracks calories burned during structured exercise sessions |
workout_avg_hr | Mean heart rate during exercise | Indicates the overall intensity of the workout session |
workout_max_hr | Highest heart rate during exercise | Shows the point of maximum exertion during the workout |
Most fitness tracking apps and devices contain variations of the above columns, and allow you to export the data for your own analysis.
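The three heart rate zones in the data dictionary can be sketched as a small helper. This is only an illustration: the function name `heart_rate_zone`, the zone labels, and the choice to treat the zone boundaries as continuous thresholds (50%, 70%, 85%) are assumptions, and the maximum heart rate has to be supplied by you, since the dataset does not include it.

```python
def heart_rate_zone(heart_rate, max_heart_rate):
    """Return a zone label for a heart rate, given a known maximum heart rate.

    Hypothetical helper: thresholds follow the data dictionary
    (50-69% fat burn, 70-84% cardio, 85%+ peak).
    """
    if max_heart_rate <= 0:
        raise ValueError("max_heart_rate must be positive")

    # Express the heart rate as a percentage of the maximum
    pct_of_max = 100 * heart_rate / max_heart_rate

    if pct_of_max >= 85:
        return "peak"
    if pct_of_max >= 70:
        return "cardio"
    if pct_of_max >= 50:
        return "fat_burn"
    return "below_zones"


# Example with an assumed maximum heart rate of 200 bpm
for bpm in (80, 110, 150, 175):
    print(bpm, heart_rate_zone(bpm, 200))
```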
Let's dive in!
Task 0: Setup
We're going to use pandas for data analysis and plotly.express for interactive data visualization.
# Import necessary packages
import pandas as pd
import plotly.express as px
Task 1: Reading in the data 📖
Let's read in and display the fitness_data.csv file using pandas.
# Read in the dataset "fitness_data.csv"
fitness_data = pd.read_csv('fitness_data.csv')
# Display the first five rows of the data
fitness_data.head()
Task 2: Checking for missing values 🔎
Now that we've read in the data, let's check whether it has any missing values.
# Does the data have any missing values?
missing_values = fitness_data.isna().sum()
missing_values
# Which rows have missing values?
missing_value_locations = fitness_data[fitness_data.isna().any(axis=1)]
missing_value_locations
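Raw counts can be hard to judge without knowing how many rows there are, so it can help to look at the share of missing values per column as well. Here is a minimal sketch on a small made-up DataFrame; `toy` and its values are hypothetical stand-ins, not the real fitness_data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for fitness_data (values are made up)
toy = pd.DataFrame({
    "steps": [8000, np.nan, 12000, 9500],
    "sleep_hours": [7.5, 6.0, np.nan, np.nan],
    "workout_type": ["Run", None, "Yoga", "Run"],
})

# isna() gives True/False per cell; the mean of booleans is the fraction missing
missing_pct = toy.isna().mean() * 100

# Keep only columns that have gaps, largest share first
missing_pct = missing_pct[missing_pct > 0].sort_values(ascending=False)
print(missing_pct)
```

The same two lines should work on `fitness_data` by swapping it in for `toy`.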
Task 3: Exploring the data 🔎
Let's start exploring our data by calculating summary statistics and confirming that our columns are of the correct data type.
# What are the summary statistics?
summary_stats = fitness_data.describe()
summary_stats
# Check the data types of each column
data_types = fitness_data.dtypes
data_types
# Convert the date column to a datetime format
fitness_data['date'] = pd.to_datetime(fitness_data['date'])
fitness_data.dtypes
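If a date column contains values that can't be parsed, `pd.to_datetime` raises an error by default. Below is a small sketch of one way to handle that, using `errors="coerce"` to turn unparseable entries into `NaT` so they can be inspected. The `toy` DataFrame and the `day_of_week` column are made up for illustration; whether the real file needs this is an assumption.

```python
import pandas as pd

# Hypothetical toy data with one entry that is not a valid date
toy = pd.DataFrame({"date": ["2024-01-01", "2024-01-02", "not a date"]})

# errors="coerce" converts unparseable values to NaT instead of raising
toy["date"] = pd.to_datetime(toy["date"], errors="coerce")

# Once the column is datetime, the .dt accessor becomes available
toy["day_of_week"] = toy["date"].dt.day_name()

print(toy.dtypes)
print(toy)
```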
Task 4: Creating useful new columns
Sometimes we might want to give columns more descriptive names so they are easier to interpret, or create new columns to measure things we are interested in.
# Rename 'weight' and 'workout_duration' to have more descriptive names
fitness_data.rename(columns={'weight': 'body_weight_kg', 'workout_duration': 'workout_duration_minutes'}, inplace=True)
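Creating a new column works by assigning to a column name that doesn't exist yet. Here is a minimal sketch on made-up data: the `toy` DataFrame, its values, and the new column names `met_step_goal` and `zone_minutes_total` are all hypothetical, chosen only to illustrate the idea with columns from the data dictionary.

```python
import pandas as pd

# Hypothetical toy data using column names from the data dictionary
toy = pd.DataFrame({
    "steps": [8200, 10450, 12030],
    "fat_burn_minutes": [30, 45, 20],
    "cardio_minutes": [10, 15, 25],
    "peak_minutes": [0, 5, 10],
})

# Boolean column: did the day reach the common 10,000-step reference point?
toy["met_step_goal"] = toy["steps"] >= 10_000

# Numeric column: total minutes spent across the three heart rate zones
zone_columns = ["fat_burn_minutes", "cardio_minutes", "peak_minutes"]
toy["zone_minutes_total"] = toy[zone_columns].sum(axis=1)

print(toy)
```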