Sleep Health and Lifestyle

This synthetic dataset contains sleep and cardiovascular metrics as well as lifestyle factors for close to 400 fictitious people.

The workspace is set up with one CSV file, data.csv, with the following columns:

Person ID
Gender
Age
Occupation
Sleep Duration: Average number of hours of sleep per day
Quality of Sleep: A subjective rating on a 1-10 scale
Physical Activity Level: Average number of minutes the person engages in physical activity daily
Stress Level: A subjective rating on a 1-10 scale
BMI Category
Blood Pressure: Indicated as systolic pressure over diastolic pressure
Heart Rate: In beats per minute
Daily Steps
Sleep Disorder: One of None, Insomnia or Sleep Apnea

Source: Kaggle

1. Data Cleaning and basic exploration

This is the first and most important step.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import plotly.graph_objects as go
import plotly.figure_factory as ff
sns.set(palette="deep")

df = pd.read_csv('data.csv')
df.head()

1.1 Information and missing values

df.info()

df.isna().sum()
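
The raw counts from isna().sum() can be easier to compare as a share of rows per column. A minimal sketch on a small made-up frame (the name toy and its values are illustrative and not taken from data.csv):

```python
import numpy as np
import pandas as pd

# Toy frame standing in for data.csv; the values are made up for illustration.
toy = pd.DataFrame({
    "Age": [27, 35, np.nan, 41],
    "Sleep Disorder": ["Insomnia", None, None, "Sleep Apnea"],
})

# Share of missing values per column, in percent.
missing_pct = toy.isna().mean().mul(100).round(1)
print(missing_pct)
```

The same expression should work on df once the file is loaded.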

1.2 Correct spelling

df['Occupation'].unique()

df['Sleep Disorder'].unique()

df['BMI Category'].unique()
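
If the unique() calls surface near-duplicate labels, they can be merged with an explicit mapping. The sketch below assumes, purely for illustration, that "Normal Weight" and "Normal" denote the same category. The toy series and the synonyms mapping are hypothetical and should be adjusted to whatever the real output shows.

```python
import pandas as pd

# Hypothetical labels; replace with the values that unique() actually returns.
bmi = pd.Series(["Normal", "Normal Weight", " Overweight", "Obese"])

# Strip stray whitespace, then fold assumed synonyms into one label.
synonyms = {"Normal Weight": "Normal"}  # assumption, not confirmed by the source
bmi_clean = bmi.str.strip().replace(synonyms)

print(bmi_clean.unique())
```

Keeping the mapping in a dictionary makes each merge decision visible and easy to revise.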

1.3 Warning about blood pressure and heart rate

df['Blood Pressure'].unique()

Regarding Blood Pressure:

Blood pressure is measured using two numbers:

The first number, called systolic blood pressure, measures the pressure in your arteries when your heart beats.

The second number, called diastolic blood pressure, measures the pressure in your arteries when your heart rests between beats.

Depending on the guideline, the threshold for a "normal" blood pressure may differ. I will use the classic 120/80 value, but it is important to note that the dataset does not state clearly whether Blood Pressure was a one-time measurement or was taken several times to establish a diagnosis. Therefore, I think this column should be used carefully.

The same caveat applies to Heart Rate, since a single measurement says very little on its own.

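# Encode Blood Pressure as binary: 0 for the readings treated as normal here, 1 for all others
df['Blood Pressure'] = df['Blood Pressure'].apply(lambda x: 0 if x in ['120/80', '117/76', '118/76', '115/75'] else 1)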

df['Blood Pressure'].dtypes
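
An alternative to listing acceptable readings by hand is to split the "systolic/diastolic" string into two numeric columns and compare them with a threshold. The sketch below works on a toy frame and uses 120/80 as the cutoff, following the text above. The column names Systolic, Diastolic and BP Elevated are introduced here for illustration only. This rule is not guaranteed to match the hand-written list: any reading at or below 120/80 that is missing from that list would be labelled differently.

```python
import pandas as pd

# Toy readings in the "systolic/diastolic" format described above.
toy = pd.DataFrame({"Blood Pressure": ["120/80", "117/76", "130/85", "140/90"]})

# Split on "/" and convert both parts to integers.
parts = toy["Blood Pressure"].str.split("/", expand=True).astype(int)
toy["Systolic"] = parts[0]
toy["Diastolic"] = parts[1]

# 1 when either value exceeds the 120/80 reference, else 0.
toy["BP Elevated"] = ((toy["Systolic"] > 120) | (toy["Diastolic"] > 80)).astype(int)

print(toy)
```

Keeping the numeric columns also leaves room for other thresholds if a different guideline is preferred.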

