SleepInc: Helping you find better sleep 😴

📖 Background

Your client is SleepInc, a sleep health company that recently launched a sleep-tracking app called SleepScope. The app monitors sleep patterns and collects users' self-reported data on lifestyle habits. SleepInc wants to identify lifestyle, health, and demographic factors that strongly correlate with poor sleep quality. They need your help to produce visualizations and a summary of findings for their next board meeting! They need these to be easily digestible for a non-technical audience!

📂 Preview of data

import pandas as pd
raw_data = pd.read_csv('sleep_health_data.csv')
raw_data

📄 Executive Summary

🎯 Aim:

To identify which lifestyle, health, and demographic factors affect sleep quality.

🛠 Method:

Validate all the data
Build a Machine Learning model
Assess how well it predicts sleep quality
Tune it to improve its predictions
Look at the impact of each column (as calculated by the model) on the predictions
Chart the columns that have a significant impact
Double-check whether the Machine Learning model was right
Form a conclusion

🏁 Results:

According to the Machine Learning model and statistics, to get better sleep you should:

Sleep longer (or go to bed earlier)
Avoid stress (try to relax)
Be physically fit: exercise (to have a low Resting Heart Rate)
Take as many steps as possible each day

Older people also tend to sleep better (probably because they are on a pension and no longer have to work).

🤖 Answering the challenge 📊

📕 Part 1

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

from sklearn.metrics import mean_absolute_error, accuracy_score
from sklearn.model_selection import GridSearchCV

First,

we need to encode text columns so that our model will understand them.

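# One-hot encode each text column so the model receives numeric inputs.
for col in ['BMI Category', 'Sleep Disorder', 'BP_category']:
    encoded = pd.get_dummies(raw_data[col])
    # Append the new indicator columns to the original DataFrame.
    raw_data[encoded.columns.to_list()] = encoded.values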
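To illustrate what one-hot encoding produces, here is a minimal sketch on a made-up table. The category values are hypothetical and not taken from the SleepScope data. It also shows one possible safeguard: if two text columns happened to share a label (for example "Normal"), letting pd.get_dummies prefix each new column with its source column name would keep them apart.

```python
import pandas as pd

# Toy stand-in for the real data; these category values are invented.
toy = pd.DataFrame({
    "BMI Category": ["Normal", "Overweight", "Obese", "Normal"],
    "BP_category": ["Normal", "Elevated", "Normal", "High"],
})

# Passing columns=... replaces each listed column with indicator columns
# named "<source column>_<label>", so the two "Normal" labels stay distinct.
# dtype=int gives 0/1 values instead of booleans.
encoded_toy = pd.get_dummies(toy, columns=["BMI Category", "BP_category"], dtype=int)

print(encoded_toy.columns.tolist())
```

Whether prefixes are needed depends on the labels in the actual columns, which are not shown in this section.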

Second,

we need to create training and testing sets. The ML model learns from the training set and is then evaluated on the testing set, which it has not seen before.

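# Features: every non-text column except the target.
X = raw_data.select_dtypes(exclude='object').drop('Quality of Sleep', axis=1)
# Target: the sleep quality score.
y = raw_data['Quality of Sleep'].values

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)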
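As a quick illustration of what test_size=0.2 does, the sketch below splits a made-up array of 10 rows. The arrays and the names X_demo and y_demo are hypothetical. It also checks that passing the same random_state twice gives an identical split, which is one way to keep results repeatable between runs.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 10 rows, 2 feature columns.
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

# Two splits with the same seed.
split_a = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
split_b = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

X_tr, X_te, y_tr, y_te = split_a
print(X_tr.shape, X_te.shape)  # (8, 2) (2, 2): 8 training rows, 2 testing rows

# True when both splits contain exactly the same rows in the same order.
same_split = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same_split)
```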

Third,

we need a baseline model with default settings. Its predictions on the testing set show how accurate it is and give us a reference point for the improvements we make later.
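The notebook's own cell for this step is not shown here, so the following is only a sketch of what a baseline could look like, not the author's code. It uses synthetic data from make_regression so that it runs on its own. The names X_syn, y_syn, baseline, and naive are assumptions. With the real data, X_train, X_test, y_train, and y_test from the previous step would take the place of the synthetic split.

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the SleepScope data).
X_syn, y_syn = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=0)

# Baseline: gradient boosting with default hyperparameters.
baseline = GradientBoostingRegressor(random_state=0)
baseline.fit(X_tr, y_tr)
baseline_mae = mean_absolute_error(y_te, baseline.predict(X_te))

# Reference point: always predict the mean of the training targets.
naive = DummyRegressor(strategy="mean")
naive.fit(X_tr, y_tr)
naive_mae = mean_absolute_error(y_te, naive.predict(X_te))

print(f"Baseline MAE: {baseline_mae:.2f}")
print(f"Naive (mean) MAE: {naive_mae:.2f}")
```

Mean absolute error is the average size of the prediction error, in the same units as the target. Comparing it with the naive predictor gives a sense of how much the untuned model has actually learned.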

