AI Python Zero-to-Hero: Build Your Own Fitness Tracker
In this code-along, we'll learn how to combine AI, data analysis, and Python to explore and visualize fitness data.
The data is available in this workbook as fitness_data.csv. It is synthetic data consisting of the following columns:
Data Dictionary
| Column Name | Description | Additional Context |
|---|---|---|
| date | The specific day of data recording | Allows tracking changes and patterns over the 185-day period |
| steps | Total daily step count | The common goal of 10,000 steps serves as a reference point for daily activity level assessment |
| weight | Body weight measurement (in kg) | Tracked daily to monitor body mass changes over time |
| resting_heart_rate | Heart beats per minute while at complete rest | Lower values typically indicate better cardiovascular fitness |
| sleep_hours | Total daily sleep duration | Includes all sleep phases; adults typically need 7-9 hours per night for optimal health |
| active_minutes | Total time spent in physical activity | Encompasses all activity intensities throughout the day |
| total_calories_burned | Total daily energy expenditure | Combines both resting metabolic rate and activity-based calorie burn |
| fat_burn_minutes | Time in 50-69% of max heart rate zone | Lower intensity zone optimal for building base endurance and metabolizing fat |
| cardio_minutes | Time in 70-84% of max heart rate zone | Moderate to high intensity zone that improves cardiovascular capacity |
| peak_minutes | Time in 85%+ of max heart rate zone | Highest intensity zone, typically reached during interval training or sprints |
| workout_type | Category of exercise performed | Helps analyze the distribution and effectiveness of different activities |
| workout_duration | Length of exercise session in minutes | Used to analyze exercise patterns and time commitment |
| workout_calories | Energy expended during workout | Specifically tracks calories burned during structured exercise sessions |
| workout_avg_hr | Mean heart rate during exercise | Indicates the overall intensity of the workout session |
| workout_max_hr | Highest heart rate during exercise | Shows the point of maximum exertion during the workout |
Most fitness tracking apps and devices contain variations of the above columns, and allow you to export the data for your own analysis.
Let's dive in!
Task 0: Setup
We're going to use pandas for data analysis and plotly.express for interactive data visualization.
# Import necessary packages
import pandas as pd
import plotly.express as pe
Task 1: Reading in the data
Let's read in and display the fitness_data.csv file using pandas.
# Read in the dataset "fitness_data.csv"
fitness = pd.read_csv("fitness_data.csv")
# Display the data
fitness.head()
Task 2: Checking for missing values
Now that we've read in the data, let's check whether it has any missing values.
# Does the data have any missing values?
fitness.isnull().sum()
# Which rows have missing values?
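One possible way to answer the "which rows" question is to build a boolean mask with `isnull().any(axis=1)` and use it to filter the DataFrame. The sketch below runs on a tiny made-up DataFrame called `sample` (an assumption for illustration, not the real data); in the workbook you would apply the same two steps to `fitness`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data; in the workbook, use the
# `fitness` DataFrame loaded from fitness_data.csv instead.
sample = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "steps": [8500, np.nan, 10200],
    "sleep_hours": [7.5, 6.8, np.nan],
})

# Boolean mask: True for each row that has at least one missing value
has_missing = sample.isnull().any(axis=1)

# Keep only the rows flagged by the mask
rows_with_missing = sample[has_missing]
print(rows_with_missing)
```

Whether to drop or fill such rows afterwards depends on the analysis; this snippet only locates them.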
Task 3: Exploring the data
Let's start exploring our data by calculating summary statistics and confirming that our columns are of the correct data type.
# What are the summary statistics?
# Check the data types of each column
# Convert the date column to a datetime format
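The three prompts above could be filled in roughly as follows. This is a minimal sketch on a small made-up DataFrame named `sample` (an assumption for illustration); in the workbook, the same calls would be made on `fitness`.

```python
import pandas as pd

# Hypothetical stand-in for the real data; in the workbook, use `fitness`.
sample = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "steps": [8500, 9100, 10200],
    "weight": [70.2, 70.0, 69.8],
})

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns
summary = sample.describe()
print(summary)

# Data type of each column; dates read from a CSV usually arrive as plain strings
print(sample.dtypes)

# Convert the date column from strings to pandas datetime values
sample["date"] = pd.to_datetime(sample["date"])
print(sample.dtypes)
```

With the date column stored as datetimes, later steps such as sorting by date or plotting trends over time tend to behave more predictably.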
Task 4: Creating useful new columns
Sometimes we want to give columns more descriptive names so they are easier to interpret, or create new columns to measure things we are interested in.
# Rename 'weight' and 'workout_duration' to have more descriptive names
# Add a new column 'weight_lbs' converting weight from kilograms to pounds (1 kg = 2.20462 lbs)
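Here is one way these two steps might look. The new names `weight_kg` and `workout_duration_min` are assumptions chosen for illustration (the workbook does not specify them), and `sample` is a small made-up stand-in for `fitness`.

```python
import pandas as pd

# Hypothetical stand-in for the real data; in the workbook, use `fitness`.
sample = pd.DataFrame({
    "weight": [70.0, 69.5, 69.8],
    "workout_duration": [45, 30, 60],
})

# Rename the columns so the units are part of the name.
# NOTE: 'weight_kg' and 'workout_duration_min' are illustrative choices.
sample = sample.rename(columns={
    "weight": "weight_kg",
    "workout_duration": "workout_duration_min",
})

# Conversion factor given in the task: 1 kg = 2.20462 lbs
KG_TO_LBS = 2.20462

# New column with body weight expressed in pounds
sample["weight_lbs"] = sample["weight_kg"] * KG_TO_LBS
print(sample)
```

Putting the unit in the column name is a design choice that helps avoid mixing up kilograms and pounds later in the analysis.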