SleepInc: Helping you find better sleep 😴
📖 Background
SleepInc is a sleep health company that recently launched a sleep-tracking app called SleepScope. The app monitors sleep patterns and collects users' self-reported data on lifestyle habits. SleepInc wants to identify the lifestyle, health, and demographic factors that correlate strongly with poor sleep quality.
💾 The data
SleepInc provided an anonymized dataset of sleep and lifestyle metrics for 374 individuals. This dataset contains average values for each person calculated over the past six months.
The dataset includes 13 columns covering sleep duration, quality, disorders, exercise, stress, diet, demographics, and other factors related to sleep health.
Column | Description |
---|---|
Person ID | An identifier for each individual. |
Gender | The gender of the person (Male/Female). |
Age | The age of the person in years. |
Occupation | The occupation or profession of the person. |
Sleep Duration (hours) | The average number of hours the person sleeps per day. |
Quality of Sleep (scale: 1-10) | A subjective rating of the quality of sleep, ranging from 1 to 10. |
Physical Activity Level (minutes/day) | The average number of minutes the person engages in physical activity daily. |
Stress Level (scale: 1-10) | A subjective rating of the stress level experienced by the person, ranging from 1 to 10. |
BMI Category | The BMI category of the person (e.g., Underweight, Normal, Overweight). |
Blood Pressure (systolic/diastolic) | The average blood pressure measurement of the person, indicated as systolic pressure over diastolic pressure. |
Heart Rate (bpm) | The average resting heart rate of the person in beats per minute. |
Daily Steps | The average number of steps the person takes per day. |
Sleep Disorder | The presence or absence of a sleep disorder in the person (None, Insomnia, Sleep Apnea). |
Acknowledgments: Laksika Tharmalingam, Kaggle: https://www.kaggle.com/datasets/uom190346a/sleep-health-and-lifestyle-dataset (this is a fictitious dataset)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
raw_data = pd.read_csv('sleep_health_data.csv')
raw_data
Executive Summary
Findings
Demographics:
- Balanced gender representation with 185 women and 189 men (49.5% female, 50.5% male).
- Age range: women 29-59, men 27-49.
- Similar average sleep duration (7.08 hours) but higher average sleep quality in women (7.16 vs. 6.90 for men).
Sleep Disorders:
- Women are more prone to sleep disorders (106 or 57.3%) than men (52 or 27.5%), especially sleep apnea.
- 41.45% of participants have insomnia or sleep apnea, concentrated in the overweight BMI category.
- Below-average sleep duration and quality are observed in participants outside the normal BMI range.
Occupation and Well-being:
- Accountants, doctors, engineers, and lawyers maintain a healthier BMI than other professions.
- Scientists, sales reps, and software engineers experience the worst sleep and highest stress levels.
- Women generally sleep better than men in similar occupations.
Lifestyle and Health Correlations:
- Moderate physical activity is linked to reduced stress and potentially better sleep.
- Age correlates more weakly with sleep duration than with sleep quality.
- Strong positive correlation between heart rate and stress level.
Key Takeaways:
- Gender, occupation, physical activity, and stress levels likely influence sleep health.
- Further research is needed on age-related sleep patterns.
- Physical activity appears beneficial for stress and sleep.
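The gender and sleep-disorder figures in the summary can be reproduced with a grouped summary and a cross-tabulation. The sketch below runs on a small made-up sample (the `sample` frame and all of its values are hypothetical, not SleepInc data), and the column name `Sleep Duration` is an assumption about how the CSV labels that field. Swapping in `raw_data` should give the real breakdown.

```python
import pandas as pd

# Hypothetical mini-sample; values are illustrative only, not SleepInc data.
sample = pd.DataFrame({
    "Gender": ["Female", "Female", "Female", "Male", "Male", "Male"],
    "Sleep Duration": [7.2, 6.0, 8.1, 7.5, 6.1, 7.7],  # assumed column name
    "Quality of Sleep": [8, 6, 9, 7, 6, 8],
    "Sleep Disorder": ["Sleep Apnea", "Insomnia", "None", "None", "Insomnia", "None"],
})

# Participant count and mean sleep metrics per gender
by_gender = sample.groupby("Gender").agg(
    participants=("Sleep Duration", "count"),
    mean_duration=("Sleep Duration", "mean"),
    mean_quality=("Quality of Sleep", "mean"),
).round(2)

# Share of each gender (in %) that falls into each sleep-disorder category
disorder_share = (
    pd.crosstab(sample["Gender"], sample["Sleep Disorder"], normalize="index") * 100
).round(1)

print(by_gender)
print(disorder_share)
```

Normalizing the cross-tabulation by row (`normalize="index"`) makes the percentages comparable between genders even when the group sizes differ.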
Boost Your Zzzs through these recommendations:
- Like to move it, move it: Prioritize physical activity to manage stress and potentially improve sleep. Aim for at least 30 minutes of moderate-intensity exercise most days. A brisk walk, swimming, or dancing could be your ticket to dreamland.
- Relax and decompress: Wind down with calming activities like reading, taking a warm bath, or deep breathing exercises. Avoid screens for at least an hour before bedtime.
- Stick to a sleep schedule: Go to bed and wake up at roughly the same time each day, even on weekends. This helps regulate your body's natural sleep-wake cycle.
- Create a sleep oasis: Make your bedroom cool, dark, and quiet. Invest in blackout curtains, earplugs, or a white noise machine if needed.
- Eat for sleep: Avoid heavy meals or caffeine close to bedtime. Caffeine may be your love, but sleep is your soulmate. Opt for light snacks like yogurt with fruit or chamomile tea.
- Sunshine is your friend: Get some natural light exposure early in the day to support your internal clock.
- Don't hit snooze: Resist the temptation to hit snooze! It throws off your sleep cycle and makes you groggy.
- Stress less, sleep more: Practice stress management techniques like yoga, meditation, or mindfulness. A calm mind leads to a restful night.
- Listen to your body: If you're still struggling to sleep after trying these tips, talk to your doctor. There could be underlying medical conditions affecting your sleep. Sleep hygiene is not one-size-fits-all.
Bonus Tips for Specific Professions:
- Busy bees (teachers, nurses, sales reps): Schedule short power naps during breaks to recharge. Prioritize relaxation activities outside of work to unwind.
- Brainiacs (scientists, software engineers): Take breaks from the screen and schedule time for physical activity throughout the day. Set boundaries between work and personal life to manage stress. We don't exist to pay bills!
- Ambition in motion (accountants, lawyers, doctors): Maintain a healthy work-life balance. Regular exercise and social connections can combat stress and promote better sleep.
Remember, you aren't lazy; you're practicing for hibernation season. Enjoy your downtime!
# Count the number of distinct values in each column
raw_data.nunique()
# Get the shape (number of rows and columns)
raw_data.shape
# Get the data types of each column
raw_data.dtypes
# Check for missing values
raw_data.isnull().sum()
# Check for duplicate person IDs
duplicates = raw_data[raw_data["Person ID"].duplicated()]
# Print the results
print("Number of duplicate IDs:", len(duplicates))
print("Duplicate IDs:", duplicates["Person ID"].unique())
# Define the replacement value for the BMI label
replacement_value = "Normal"
# Replace "Normal Weight" with "Normal"
raw_data["BMI Category"] = raw_data["BMI Category"].replace("Normal Weight", replacement_value, regex=True)
# Define the replacement value for the occupation label
replacement_value_occupation = "Sales Representative"
# Replace "Salesperson" with "Sales Representative"
raw_data["Occupation"] = raw_data["Occupation"].replace("Salesperson", replacement_value_occupation, regex=True)
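The two replacements above pass `regex=True`, which matches the pattern anywhere inside a value. An alternative that only touches exact matches is a nested mapping passed to `DataFrame.replace`. This is a minimal sketch on a hypothetical four-row sample; `label_fixes` and `cleaned` are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample with the inconsistent labels described above
sample = pd.DataFrame({
    "BMI Category": ["Normal", "Normal Weight", "Overweight", "Obese"],
    "Occupation": ["Nurse", "Salesperson", "Sales Representative", "Doctor"],
})

# Exact-match replacements, keyed by column name: {column: {old: new}}
label_fixes = {
    "BMI Category": {"Normal Weight": "Normal"},
    "Occupation": {"Salesperson": "Sales Representative"},
}
cleaned = sample.replace(label_fixes)

# Confirm that only the intended labels remain
print(cleaned["BMI Category"].value_counts())
print(cleaned["Occupation"].value_counts())
```

Keeping every label fix in one mapping also documents the cleaning rules in a single place.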
# Get basic statistics for numerical columns
raw_data.describe()
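`describe()` summarizes only numeric columns by default, so blood pressure, which is stored as a `systolic/diastolic` string, is left out. One way to include it is to split the string into two integer columns. A minimal sketch on a hypothetical three-row sample, assuming the column is named `Blood Pressure`; the new column names `Systolic` and `Diastolic` are introduced here:

```python
import pandas as pd

# Hypothetical sample; the real column is assumed to be named "Blood Pressure".
sample = pd.DataFrame({"Blood Pressure": ["126/83", "120/80", "140/90"]})

# Split "systolic/diastolic" strings into two integer columns
bp = sample["Blood Pressure"].str.split("/", expand=True).astype(int)
sample["Systolic"] = bp[0]
sample["Diastolic"] = bp[1]

# Both readings now appear in the numeric summary
print(sample.describe())
```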
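# Calculate the correlation matrix for the numeric columns
# (numeric_only=True skips text columns such as Gender, which raise an error in pandas 2.x)
corr_matrix = raw_data.corr(numeric_only=True).round(2)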
# Print the correlation matrix
print("Correlation Matrix:")
print(corr_matrix)
sns.heatmap(corr_matrix)
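The correlation matrix reports coefficients but not whether they are statistically distinguishable from zero. `scipy.stats.pearsonr` returns both the coefficient and a p-value for a single pair of variables. The sketch below uses hypothetical values, and the column names `Stress Level` and `Heart Rate` are assumptions about the CSV headers.

```python
import pandas as pd
from scipy import stats

# Hypothetical values chosen for illustration; not SleepInc data.
sample = pd.DataFrame({
    "Stress Level": [3, 4, 5, 6, 7, 8],
    "Heart Rate": [65, 68, 70, 72, 75, 80],
})

# Pearson correlation coefficient and its two-sided p-value
r, p_value = stats.pearsonr(sample["Stress Level"], sample["Heart Rate"])
print(f"Pearson r = {r:.2f}, p-value = {p_value:.4f}")
```

Because both stress level and sleep quality are ordinal ratings, a rank-based measure such as `stats.spearmanr` may be a reasonable cross-check.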
# Calculate counts using value_counts and reset index
counts = raw_data["BMI Category"].value_counts().reset_index(name="Count")
# Calculate total participants
total_participants = len(raw_data)
# Calculate percentages as a new column
counts["Percentage"] = counts["Count"] / total_participants * 100
counts["Percentage"] = counts["Percentage"].round(2)
# Print the results
print("** Number and Percentage of Participants in each BMI Category:")
print(counts)
# Create data frame to use for pie chart
BMI = pd.DataFrame({
    "Category": ["Normal", "Overweight", "Obese"],
    "Count": [216, 148, 10],
    "Percentage": [57.75, 39.57, 2.67],
})
# Create pie chart
plt.figure(figsize=(8, 8)) # Adjust figure size as needed
plt.pie(BMI["Percentage"], labels=BMI["Category"], autopct="%1.1f%%")
plt.axis("equal") # Equal aspect ratio ensures a circular pie chart
# Customize the chart
plt.title("Percentage of Participants in each BMI Category")
plt.legend(title="BMI Category",loc="upper right",bbox_to_anchor=(1.2, 1))
# Show the pie chart
plt.show()
# Group counts by BMI category and quality of sleep
raw_data.groupby(["BMI Category","Quality of Sleep"]).size().to_frame(name="Count").reset_index()
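To check whether participants outside the normal BMI range sleep less and rate their sleep lower, the group means can be compared against the overall average. This is a sketch on a hypothetical six-row sample (values are illustrative, and `Sleep Duration` is an assumed column name); `diff_from_overall` is a name introduced here.

```python
import pandas as pd

# Hypothetical sample; values are illustrative only, not SleepInc data.
sample = pd.DataFrame({
    "BMI Category": ["Normal", "Normal", "Overweight", "Overweight", "Obese", "Normal"],
    "Sleep Duration": [7.8, 7.4, 6.5, 6.3, 6.0, 7.6],  # assumed column name
    "Quality of Sleep": [8, 8, 6, 6, 5, 9],
})

metrics = ["Sleep Duration", "Quality of Sleep"]

# Mean of each metric per BMI category
by_bmi = sample.groupby("BMI Category")[metrics].mean()

# Positive values are above the overall average, negative values below it
diff_from_overall = (by_bmi - sample[metrics].mean()).round(2)
print(diff_from_overall)
```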