Sleep Health and Lifestyle

    This synthetic dataset contains sleep and cardiovascular metrics, as well as lifestyle factors, for close to 400 fictitious people.

    The workspace is set up with one CSV file, data.csv, with the following columns:

    • Person ID
    • Gender
    • Age
    • Occupation
    • Sleep Duration: Average number of hours of sleep per day
    • Quality of Sleep: A subjective rating on a 1-10 scale
    • Physical Activity Level: Average number of minutes the person engages in physical activity daily
    • Stress Level: A subjective rating on a 1-10 scale
    • BMI Category
    • Blood Pressure: Indicated as systolic pressure over diastolic pressure
    • Heart Rate: In beats per minute
    • Daily Steps
    • Sleep Disorder: One of None, Insomnia or Sleep Apnea

    Source: Kaggle
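
The Blood Pressure column combines two measurements in one text field, so most models cannot use it directly. The sketch below shows one possible way to split it into two numeric columns. It assumes the values are written as `systolic/diastolic` (for example `126/83`); the sample rows and the column names `Systolic` and `Diastolic` are hypothetical, not taken from data.csv.

```python
import pandas as pd

# Hypothetical rows mimicking the assumed "systolic/diastolic" text format
sample = pd.DataFrame({"Blood Pressure": ["126/83", "125/80", "140/90"]})

# Split the text on "/" and convert both parts to integers
bp = sample["Blood Pressure"].str.split("/", expand=True).astype(int)
bp.columns = ["Systolic", "Diastolic"]  # illustrative column names

# Replace the original text column with the two numeric columns
sample = pd.concat([sample.drop(columns="Blood Pressure"), bp], axis=1)
print(sample)
```

The same transformation could be applied to the full data frame before modelling, if the hidden cells do not already handle it.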

    In general, the data shows few outliers across the variables, with the exception of Heart Rate, which has some. Given the limited amount of data and the fact that these values are probably authentic, it does not seem necessary to drop them.
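
The cells that produced this observation are hidden, so the exact outlier check is not known. As a minimal sketch, assuming the common 1.5 × IQR rule that box plots use, outliers in a column such as Heart Rate could be flagged like this. The values below are hypothetical and are not taken from data.csv.

```python
import pandas as pd

# Hypothetical heart-rate values in beats per minute
heart_rate = pd.Series([65, 68, 70, 70, 72, 72, 74, 75, 77, 86], name="Heart Rate")

# Interquartile range of the column
q1 = heart_rate.quantile(0.25)
q3 = heart_rate.quantile(0.75)
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged as outliers
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = heart_rate[(heart_rate < lower) | (heart_rate > upper)]
print(outliers.tolist())
```

Flagging the values rather than dropping them keeps the small dataset intact, which matches the decision described above.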

    It can be observed that there are no null values and that each column's data type matches its contents, so no adjustment is needed.
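
A check of this kind can be sketched as follows. One caveat, stated as an assumption: depending on the pandas version, `read_csv` may interpret the literal text `None` (one of the Sleep Disorder categories) as a missing value. Passing `keep_default_na=False` avoids that. The two-row CSV snippet is hypothetical and stands in for data.csv.

```python
import io
import pandas as pd

# Hypothetical two-row CSV snippet standing in for data.csv
csv_text = "Person ID,Age,Sleep Disorder\n1,27,None\n2,28,Insomnia\n"

# keep_default_na=False stops pandas from converting strings such as "None"
# into missing values; only empty fields are treated as missing here
sample = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=[""])

# Number of missing values per column
null_counts = sample.isna().sum()
print(null_counts)

# Inferred data type of each column
print(sample.dtypes)
```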

    First, we need to separate the data into the target variable and the feature variables. We also need to split the data into training and testing sets.

    from sklearn.model_selection import train_test_split

    # Split the data into features and target variable
    X = sleep_data.drop('Sleep Disorder', axis=1)
    y = sleep_data['Sleep Disorder']
    
    # Split the data into training and testing sets
    # (20% of the rows are held out; random_state makes the split reproducible)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
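
Several of the listed columns (Gender, Occupation, BMI Category) hold text, and scikit-learn estimators expect numeric features. The hidden cells may already encode them; if not, one possible approach is one-hot encoding with `pd.get_dummies`, sketched below. The rows and category values are hypothetical, and `stratify=y` is an optional addition (not in the original split) that keeps class proportions similar in both sets.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical rows using a subset of the dataset's columns
sample = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female",
               "Male", "Female", "Male", "Male", "Female"],
    "Age": [27, 28, 35, 41, 33, 45, 52, 38, 29, 49],
    "BMI Category": ["Normal", "Overweight", "Normal", "Obese", "Normal",
                     "Overweight", "Obese", "Normal", "Overweight", "Normal"],
    "Sleep Disorder": ["None"] * 5 + ["Insomnia"] * 5,
})

X = sample.drop("Sleep Disorder", axis=1)
y = sample["Sleep Disorder"]

# One-hot encode the text columns so that every feature is numeric
X_encoded = pd.get_dummies(X, columns=["Gender", "BMI Category"], drop_first=True)

# Stratified split: each class keeps roughly the same share in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y, test_size=0.2, random_state=42, stratify=y
)
print(X_encoded.columns.tolist())
```

An identifier such as Person ID carries no predictive meaning, so it is usually reasonable to drop it from the features as well.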

    Logistic Regression

    To perform logistic regression on the sleep_data dataset, we can use the LogisticRegression class from the sklearn.linear_model module.

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    
    # Create an instance of the LogisticRegression model
    model = LogisticRegression()
    
    # Fit the model to the training data
    model.fit(X_train, y_train)
    
    # Predict the target variable for the test data
    y_pred_log = model.predict(X_test)
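
The predictions can then be compared with the true labels. The sketch below is self-contained and uses synthetic data from `make_classification` rather than sleep_data, so the numbers it prints say nothing about the real dataset. Scaling the features and raising `max_iter` are assumptions added here because logistic regression can struggle to converge on unscaled inputs; the names with a `_demo` suffix and `log_model` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic three-class data standing in for the encoded sleep_data features
X_demo, y_demo = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_classes=3, random_state=42
)
X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Standardise the features, then fit the logistic regression
log_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
log_model.fit(X_train_demo, y_train_demo)
y_pred_log_demo = log_model.predict(X_test_demo)

# Overall accuracy plus per-class precision, recall and F1
accuracy = accuracy_score(y_test_demo, y_pred_log_demo)
print(f"Accuracy: {accuracy:.2f}")
print(classification_report(y_test_demo, y_pred_log_demo))
```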

    Decision Tree

    To build a decision tree model for the sleep_data dataset, we can use the DecisionTreeClassifier class from the sklearn.tree module.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split
    
    # Create an instance of the DecisionTreeClassifier model
    model = DecisionTreeClassifier()
    
    # Fit the model to the training data
    model.fit(X_train, y_train)
    
    # Predict the target variable for the test data
    y_pred_tree = model.predict(X_test)
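
A decision tree built with default settings can differ between runs and may grow deep enough to memorise the training data. As a hedged sketch on synthetic data (not sleep_data), the example below fixes `random_state` and limits `max_depth`; both settings, and the value 3, are illustrative assumptions rather than choices made in this workspace.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the encoded sleep_data features
X_demo, y_demo = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_classes=3, random_state=42
)
X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# A shallow tree with a fixed seed gives reproducible, easier-to-read splits
tree_model = DecisionTreeClassifier(max_depth=3, random_state=42)
tree_model.fit(X_train_demo, y_train_demo)
y_pred_tree_demo = tree_model.predict(X_test_demo)

tree_accuracy = accuracy_score(y_test_demo, y_pred_tree_demo)
print(f"Accuracy: {tree_accuracy:.2f}")

# Relative contribution of each feature to the tree's splits
importances = tree_model.feature_importances_
print(importances)
```

Unlike logistic regression, a decision tree does not need scaled features, because each split compares one feature against a threshold.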

    Neural Network

    To build a neural network model for the sleep_data dataset, we can use the MLPClassifier class from the sklearn.neural_network module.