Sleep Health and Lifestyle

This synthetic dataset contains sleep and cardiovascular metrics, as well as lifestyle factors, for close to 400 fictional people.

The workspace is set up with one CSV file, data.csv, with the following columns:

Person ID
Gender
Age
Occupation
Sleep Duration: Average number of hours of sleep per day
Quality of Sleep: A subjective rating on a 1-10 scale
Physical Activity Level: Average number of minutes the person engages in physical activity daily
Stress Level: A subjective rating on a 1-10 scale
BMI Category
Blood Pressure: Indicated as systolic pressure over diastolic pressure
Heart Rate: In beats per minute
Daily Steps
Sleep Disorder: One of None, Insomnia or Sleep Apnea
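Because Blood Pressure is stored as text ("systolic over diastolic"), it usually has to be split into two numeric columns before analysis. The following is a minimal sketch that assumes the two values are separated by a slash; the toy rows and the Systolic and Diastolic column names are made up for illustration and are not part of data.csv.

```python
import pandas as pd

# Hypothetical toy rows for illustration only; the real values live in data.csv.
toy = pd.DataFrame({"Blood Pressure": ["126/83", "125/80", "140/90"]})

# Split "systolic/diastolic" into two integer columns (assumes a "/" separator).
bp = toy["Blood Pressure"].str.split("/", expand=True).astype(int)
toy["Systolic"] = bp[0]   # new column name chosen for this sketch
toy["Diastolic"] = bp[1]  # new column name chosen for this sketch

print(toy)
```

The same two lines should work on the full DataFrame once it is loaded, provided every row follows the same pattern.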

Check out the guiding questions or the scenario described below to get started with this dataset! Feel free to make this workspace yours by adding and removing cells, or editing any of the existing cells.

Source: Kaggle

🌎 Some guiding questions to help you explore this data:

Which factors could contribute to a sleep disorder?
Does an increased physical activity level result in a better quality of sleep?
Does the presence of a sleep disorder affect the subjective sleep quality metric?
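One possible starting point for the last two questions is sketched below: a correlation between activity and sleep quality, and the mean sleep quality per sleep-disorder group. The toy rows are invented for illustration, and a correlation on its own cannot show that activity causes better sleep.

```python
import pandas as pd

# Hypothetical toy rows, made up for illustration; swap in pd.read_csv("data.csv").
toy = pd.DataFrame({
    "Physical Activity Level": [30, 45, 60, 75, 90, 40],
    "Quality of Sleep": [5, 6, 7, 8, 9, 5],
    "Sleep Disorder": ["Insomnia", "None", "None", "None", "Sleep Apnea", "Insomnia"],
})

# Pearson correlation between daily activity minutes and the sleep quality rating.
rho = toy["Physical Activity Level"].corr(toy["Quality of Sleep"])

# Mean subjective sleep quality for each sleep-disorder group.
quality_by_disorder = toy.groupby("Sleep Disorder")["Quality of Sleep"].mean()

print(f"correlation: {rho:.2f}")
print(quality_by_disorder)
```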

📊 Visualization ideas

Boxplot: show the distribution of sleep duration or quality of sleep for each occupation.
Show the link between age and sleep duration with a scatterplot. Consider including information on the sleep disorder.
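Both ideas can be sketched with matplotlib as follows. The toy rows are invented for illustration, and the Agg backend line is only there so the sketch also runs outside a notebook; inside a workspace it can be dropped and plt.show() called instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy rows, made up for illustration; swap in pd.read_csv("data.csv").
toy = pd.DataFrame({
    "Occupation": ["Nurse", "Nurse", "Doctor", "Doctor", "Engineer", "Engineer"],
    "Age": [29, 45, 31, 38, 40, 52],
    "Sleep Duration": [6.1, 6.5, 7.6, 6.9, 7.8, 8.2],
    "Sleep Disorder": ["Sleep Apnea", "Insomnia", "None", "Insomnia", "None", "None"],
})

fig, (ax_box, ax_scatter) = plt.subplots(1, 2, figsize=(12, 5))

# Boxplot: one box of sleep durations per occupation.
labels, data = zip(*[(name, grp.to_numpy())
                     for name, grp in toy.groupby("Occupation")["Sleep Duration"]])
ax_box.boxplot(list(data))
ax_box.set_xticks(range(1, len(labels) + 1))
ax_box.set_xticklabels(labels)
ax_box.set_ylabel("Sleep Duration (hours)")

# Scatterplot: age against sleep duration, one marker series per sleep disorder.
for disorder, grp in toy.groupby("Sleep Disorder"):
    ax_scatter.scatter(grp["Age"], grp["Sleep Duration"], label=disorder)
ax_scatter.set_xlabel("Age")
ax_scatter.set_ylabel("Sleep Duration (hours)")
ax_scatter.legend(title="Sleep Disorder")

fig.tight_layout()
```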

🔍 Scenario: Automatically identify potential sleep disorders

This scenario helps you develop an end-to-end project for your portfolio.

Background: You work for a health insurance company and are tasked with identifying whether a potential client is likely to have a sleep disorder. The company wants to use this information to set the premium the client pays.

Objective: Construct a classifier to predict the presence of a sleep disorder based on the other columns in the dataset.

Check out our Linear Classifiers course (Python) or Supervised Learning course (R) for a quick introduction to building classifiers.
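As a rough starting point for the objective, here is a minimal sketch of a linear classifier built with scikit-learn. The feature selection, the function name fit_sleep_disorder_model, and the toy rows are all assumptions made for illustration; this is not a tuned or validated model.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed feature subset; the dataset offers more columns to experiment with.
NUMERIC = ["Age", "Sleep Duration", "Stress Level"]
CATEGORICAL = ["Occupation"]
TARGET = "Sleep Disorder"


def fit_sleep_disorder_model(df):
    """Fit a logistic regression pipeline (hypothetical helper for this sketch)."""
    # Depending on the pandas version, the literal "None" may be read as a
    # missing value; this sketch assumes missing means "no disorder".
    y = df[TARGET].fillna("None")
    X = df[NUMERIC + CATEGORICAL]
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    model = Pipeline([
        ("prep", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    return model.fit(X, y)


# Hypothetical toy rows, made up for illustration; swap in pd.read_csv("data.csv").
toy = pd.DataFrame({
    "Age": [28, 30, 35, 41, 44, 49, 52, 55, 58],
    "Sleep Duration": [7.8, 7.5, 6.1, 7.7, 6.0, 6.2, 8.0, 6.1, 6.0],
    "Stress Level": [4, 5, 8, 4, 8, 7, 3, 8, 7],
    "Occupation": ["Engineer", "Doctor", "Nurse", "Engineer", "Teacher",
                   "Nurse", "Doctor", "Teacher", "Nurse"],
    "Sleep Disorder": [None, None, "Sleep Apnea", None, "Insomnia",
                       "Sleep Apnea", None, "Insomnia", "Sleep Apnea"],
})

model = fit_sleep_disorder_model(toy)
predictions = model.predict(toy[NUMERIC + CATEGORICAL])
print(predictions)
```

On the real data, the model should be judged on rows it has not seen, for example with train_test_split or cross_val_score from sklearn.model_selection, rather than on its own training rows as in this toy run.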

You can query the pre-loaded CSV files using SQL directly. Here’s a sample query:

Result saved as DataFrame variable: df

SELECT *
FROM 'data.csv'
LIMIT 10

import pandas as pd
import matplotlib.pyplot as plt
sleep_data = pd.read_csv('data.csv')
sleep_data.head()

df['Sleep Duration'] = df['Sleep Duration'].astype(float)

Ready to share your work?

Click "Share" in the upper right corner, copy the link, and share it! You can also easily add this workspace to your DataCamp Portfolio.

Result saved as DataFrame variable: df1

-- Double quotes reference the column; single quotes would make 'Person ID' a string literal
SELECT COUNT("Person ID")
FROM 'data.csv'

# Mean sleep duration per occupation
average_sleep = sleep_data.groupby('Occupation')['Sleep Duration'].mean()

# Sort the results in descending order
average_sleep = average_sleep.sort_values(ascending=False)

# Take the first 5 rows (the variable name says 3, but 5 rows are kept)
top_3_occupations = average_sleep.head(5)

# Print the 5 occupations with the highest mean hours of sleep
print(top_3_occupations)

# Mean stress level per occupation (the variable names are reused from the sleep duration cell)
average_sleep = sleep_data.groupby('Occupation')['Stress Level'].mean()

# Sort the results in descending order
average_sleep = average_sleep.sort_values(ascending=False)

# Take the first 5 rows
top_3_occupations = average_sleep.head(5)

# Print the 5 occupations with the highest mean stress level
print(top_3_occupations)

# Mean quality of sleep per occupation (the variable names are reused from the sleep duration cell)
average_sleep = sleep_data.groupby('Occupation')['Quality of Sleep'].mean()

# Sort the results in descending order
average_sleep = average_sleep.sort_values(ascending=False)

# Take the first 5 rows
top_3_occupations = average_sleep.head(5)

# Print the 5 occupations with the highest mean quality of sleep
print(top_3_occupations)
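The three cells above repeat the same group, sort, and head pattern for different columns. One way to avoid the copy-paste is a small helper, sketched here; the function name top_occupations and the toy rows are hypothetical and only serve the illustration.

```python
import pandas as pd


def top_occupations(df, column, n=5):
    """Return the n occupations with the highest mean of `column`, best first."""
    return df.groupby("Occupation")[column].mean().nlargest(n)


# Hypothetical toy rows, made up for illustration; swap in pd.read_csv("data.csv").
toy = pd.DataFrame({
    "Occupation": ["Nurse", "Nurse", "Doctor", "Doctor", "Engineer"],
    "Sleep Duration": [6.0, 7.0, 8.0, 8.0, 5.0],
})

top_two = top_occupations(toy, "Sleep Duration", n=2)
print(top_two)
```

With the real DataFrame, calls such as top_occupations(sleep_data, 'Stress Level') would replace each repeated cell, and each result can be given its own descriptive name.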

# Occupations and their mean values. Note: top_3_occupations was last assigned
# from 'Quality of Sleep' above, so the bars show mean quality ratings, not hours of sleep.
occupations = top_3_occupations.index
average_sleep_hours = top_3_occupations.values

# Create the bar chart
plt.figure(figsize=(10, 6))
plt.bar(occupations, average_sleep_hours, color='skyblue')
plt.xlabel('Occupation')
plt.ylabel('Mean Quality of Sleep (1-10)')
plt.title('Mean Quality of Sleep for the Top 5 Occupations')
plt.xticks(rotation=45)  # Rotate the x-axis labels for better readability
plt.tight_layout()

# Show the bar chart
plt.show()

# Same chart with value labels. Note: top_3_occupations was last assigned from
# 'Quality of Sleep' above, so the bars show mean quality ratings, not hours of sleep.
occupations = top_3_occupations.index
average_sleep_hours = top_3_occupations.values

# Create the bar chart
plt.figure(figsize=(10, 6))
bars = plt.bar(occupations, average_sleep_hours, color='skyblue')

# Add a value label above each bar
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval, f'{yval:.2f}', ha='center', va='bottom')

plt.xlabel('Occupation')
plt.ylabel('Mean Quality of Sleep (1-10)')
plt.title('Mean Quality of Sleep for the Top 5 Occupations')
plt.xticks(rotation=45)
plt.tight_layout()

# Show the bar chart with the labels
plt.show()
