Sleep Health and Lifestyle

This synthetic dataset contains sleep and cardiovascular metrics as well as lifestyle factors of close to 400 fictive persons.

The workspace is set up with one CSV file, data.csv, with the following columns:

  • Person ID
  • Gender
  • Age
  • Occupation
  • Sleep Duration: Average number of hours of sleep per day
  • Quality of Sleep: A subjective rating on a 1-10 scale
  • Physical Activity Level: Average number of minutes the person engages in physical activity daily
  • Stress Level: A subjective rating on a 1-10 scale
  • BMI Category
  • Blood Pressure: Indicated as systolic pressure over diastolic pressure
  • Heart Rate: In beats per minute
  • Daily Steps
  • Sleep Disorder: One of None, Insomnia or Sleep Apnea

Source: Kaggle
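
The code cells further down refer to a data frame called sleep_data, which is not defined in the visible cells. A minimal sketch of how it could be loaded, assuming the readr package is available and data.csv sits in the working directory (load_sleep_data is a hypothetical helper name, not part of the original notebook):

```r
library(readr)

# Hypothetical helper: read the CSV and keep the original column names,
# including the ones with spaces such as "Sleep Duration".
load_sleep_data <- function(path = "data.csv") {
  read_csv(path, show_col_types = FALSE)
}

# Only load when the file is actually present in the working directory.
if (file.exists("data.csv")) {
  sleep_data <- load_sleep_data()
}
```

Column names that contain spaces have to be quoted or wrapped in backticks when they are used in dplyr verbs, which is why the cells below rename them first.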

🌎 Questions Being Explored:

  1. Do people experience longer sleep with age based on their Gender?
  2. Do occurrences of sleep disorders rely on either Age or Gender?

Sleep Duration by Age for Men and Women

Do people experience longer sleep with age based on their Gender?

Showcase of ggplot2 and dplyr skill

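library(ggplot2)
library(dplyr)

# sleep_data is assumed to have been loaded from data.csv earlier in the notebook.
# Keep only the columns needed for this question and give them short names.
sleep_age <- sleep_data %>%
	select("Person ID", "Gender", "Age", "Sleep Duration") %>%
	rename(person_id = "Person ID", gender = "Gender", age = "Age", duration = "Sleep Duration")

sleep_age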

avg_male_age <- sleep_age %>%
  filter(gender == "Male") %>%
  summarise(avg_age = mean(age)) %>%
  pull(avg_age)

avg_male_duration <- sleep_age %>%
  filter(gender == "Male") %>%
  summarise(avg_duration = mean(duration)) %>%
  pull(avg_duration)

avg_female_age <- sleep_age %>%
  filter(gender == "Female") %>%
  summarise(avg_age = mean(age)) %>%
  pull(avg_age)

avg_female_duration <- sleep_age %>%
  filter(gender == "Female") %>%
  summarise(avg_duration = mean(duration)) %>%
  pull(avg_duration)

# Base plot shared by the scatter plots below
sleep_age_graph <- sleep_age %>%
  ggplot(aes(x = age, y = duration)) +
  labs(y = "Sleep Duration", x = "Age", linetype = "Trend Lines")

sleep_age_graph +
	geom_point(size = 4, aes(shape = gender, color = gender), alpha = 0.9) +
	scale_shape_manual(values = c(19, 17), name = "Gender",
	                   labels = c("Female", "Male")) +
	scale_color_manual(values = c("red", "navy"), name = "Gender",
	                   labels = c("Female", "Male")) +
	geom_smooth(method = "lm", se = FALSE, color = "black", aes(linetype = gender))
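
The slopes of the two trend lines drawn by geom_smooth(method = "lm") can also be checked numerically. A minimal sketch, assuming a toy stand-in for sleep_age with made-up values (toy_sleep_age and slopes are illustrative names, not from the notebook):

```r
library(dplyr)

# Toy stand-in for sleep_age; the values are made up for illustration only.
toy_sleep_age <- tibble(
  gender   = rep(c("Female", "Male"), each = 4),
  age      = c(30, 40, 50, 60, 28, 32, 36, 40),
  duration = c(6.5, 7.0, 7.5, 8.0, 6.8, 6.9, 7.0, 7.1)
)

# One simple linear fit per gender. The age coefficient is the slope of the
# corresponding trend line, in hours of sleep per year of age.
slopes <- toy_sleep_age %>%
  group_by(gender) %>%
  summarise(slope = coef(lm(duration ~ age))[["age"]], .groups = "drop")

slopes
```

Running the same pipeline on sleep_age would give the actual slopes behind the plot, which makes the comparison between genders less dependent on reading the chart by eye.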

Conclusions

In general, sleep duration tends to increase with age, and the increase is noticeably steeper among Female respondents. However, the sample contains considerably more older Female respondents and more younger Male respondents.

This leads me to believe that the data may not be reliable for this specific question. The discrepancy is explored in the graph below:

sleep_age_graph +
	geom_point(size = 2, color = "gray", aes(shape = gender)) +
	geom_smooth(method = "lm", se = FALSE, color = "gray", aes(linetype = gender)) +
	# annotate() draws each average once; geom_point(aes(...)) with constants would redraw it for every row
	annotate(geom = "point", y = avg_male_duration, x = avg_male_age, color = "navy", size = 5, shape = 2) +
	annotate(geom = "text", label = "Avg Male Response", y = avg_male_duration + 0.1, x = avg_male_age, color = "navy", size = 5) +
	annotate(geom = "point", y = avg_female_duration, x = avg_female_age, color = "red", size = 5, shape = 1) +
	annotate(geom = "text", label = "Avg Female Response", y = avg_female_duration + 0.1, x = avg_female_age, color = "red", size = 5) +
	theme(legend.position = "none")

The graph above highlights the difference in the average age of respondents. The average male respondent is roughly 10 years younger than the average female respondent, so the two trend lines are fitted over different age ranges and the dataset is not necessarily reliable for this comparison. To answer the question properly, my suggestion is to collect more responses from older males and younger females.
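
The per-gender averages used in the plot can also be collected in a single table, together with the number of respondents in each group. A minimal sketch, assuming a toy stand-in for sleep_age with made-up values (toy_sleep_age and gender_summary are illustrative names):

```r
library(dplyr)

# Toy stand-in for sleep_age; the values are made up for illustration only.
toy_sleep_age <- tibble(
  gender   = c("Female", "Female", "Female", "Male", "Male", "Male"),
  age      = c(45, 50, 55, 30, 35, 40),
  duration = c(7.0, 7.5, 8.0, 6.5, 7.0, 7.5)
)

# One row per gender: group size, average age and average sleep duration.
gender_summary <- toy_sleep_age %>%
  group_by(gender) %>%
  summarise(
    n            = n(),
    avg_age      = mean(age),
    avg_duration = mean(duration),
    .groups      = "drop"
  )

gender_summary
```

A table like this shows the age gap between the groups directly, and it replaces four separate filter and summarise pipelines with one grouped summary.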

An alternative option is below:

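sleep_age$age <- as.numeric(as.character(sleep_age$age))

# Youngest respondent overall
sleep_age %>%
	slice_min(age, n = 1, with_ties = FALSE)

# Oldest respondent overall
sleep_age %>%
	slice_max(age, n = 1, with_ties = FALSE)

# Youngest female respondent
sleep_age %>%
	filter(gender == "Female") %>%
	slice_min(age, n = 1, with_ties = FALSE)

# Oldest male respondent
sleep_age %>%
	filter(gender == "Male") %>%
	slice_max(age, n = 1, with_ties = FALSE)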

# Bucket ages into five-year groups, coded 1 to 6.
# With these breaks, ages of 25 or below and ages above 55 become NA.
sleep_age_strat <- sleep_age %>%
	mutate(age_group = cut(age, breaks = c(25, 30, 35, 40, 45, 50, 55), labels = FALSE))

# Average sleep duration per age group and gender
sleep_age_strat <- sleep_age_strat %>%
	select(gender, age_group, duration) %>%
	group_by(age_group, gender) %>%
	summarize(avg_dur = mean(duration), .groups = "drop")

sleep_age_strat

sleep_age_graph_strat <- sleep_age_strat %>%
	ggplot(aes(y = avg_dur, x = age_group, shape = gender)) +
	labs(y = "Sleep Duration", x = "Age Group", linetype = "Trend Lines")

sleep_age_graph_strat +
	geom_point(size = 4, aes(shape = gender, color = gender), alpha = 0.9) +
	scale_shape_manual(values = c(19, 17), name = "Gender",
	                   labels = c("Female", "Male")) +
	scale_color_manual(values = c("red", "navy"), name = "Gender",
	                   labels = c("Female", "Male")) +
	geom_smooth(method = "lm", se = FALSE, color = "black", aes(linetype = gender))

The graph and data transformation above group the ages into five-year buckets. The average sleep duration for each bucket is then calculated and plotted. This indicates that male and female sleep duration increase at roughly the same rate; however, male respondents record a longer sleep duration overall, regardless of age group.
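
How cut() assigns ages to buckets is easy to check on a handful of values. A minimal sketch using the same breaks as the transformation, with a made-up vector of ages chosen to hit the bucket edges (ages, codes and labelled are illustrative names):

```r
# Same breaks as in the bucketing step
breaks <- c(25, 30, 35, 40, 45, 50, 55)

# Made-up ages that sit on or near the bucket edges
ages <- c(25, 27, 30, 31, 55, 59)

# labels = FALSE returns the bucket number; intervals are open on the left
# and closed on the right, so 25 and anything above 55 fall outside.
codes <- cut(ages, breaks = breaks, labels = FALSE)
codes
# NA  1  1  2  6 NA

# Without labels = FALSE the buckets carry readable interval labels instead.
labelled <- cut(ages, breaks = breaks)
levels(labelled)[1]
# "(25,30]"
```

Any respondent whose age lies outside the breaks ends up in an NA group, so it may be worth checking the age range of the data against the breaks before interpreting the bucketed averages.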

The limitation of this approach is that some groups contain very few or no responses for one gender, especially in the outermost groups.
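
One way to see how thin the groups are is to count respondents per age group and gender and to make the empty combinations explicit. A minimal sketch, assuming the tidyr package is available and using a toy data frame with made-up values (toy_strat and cell_counts are illustrative names):

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the bucketed data; the values are made up for illustration only.
toy_strat <- tibble(
  gender    = c("Female", "Female", "Male", "Male", "Male"),
  age_group = c(5, 6, 1, 1, 2)
)

# count() only returns combinations that occur; complete() adds the missing
# age group and gender pairs with a count of zero.
cell_counts <- toy_strat %>%
  count(age_group, gender) %>%
  complete(age_group, gender, fill = list(n = 0L))

cell_counts
```

Buckets with a count of zero or only a handful of respondents contribute little or nothing to the trend lines, so their averages should be read with caution.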

Incidence of Sleep Disorder by Age and Gender

Do occurrences of sleep disorders rely on either Age or Gender?

Showcase of dplyr and ggplot2 skill

# Count respondents for each combination of gender and sleep disorder
disorder_by_gender <- sleep_data %>%
	select(Gender, `Sleep Disorder`, Age) %>%
	rename(gender = Gender, disorder = `Sleep Disorder`) %>%
	count(gender, disorder, name = "count")

dis_by_gender_pct <- disorder_by_gender %>%
  group_by(gender) %>%
  mutate(total_count = sum(count)) %>%
  mutate(pct_by_gender = round(count / total_count *100,0)) %>%
  select(-total_count)

dis_by_gender_pct_m <- dis_by_gender_pct %>%
	select(gender,disorder,pct_by_gender) %>%
	filter(gender == "Male")

dis_by_gender_pct_m

dis_by_gender_pct_f <- dis_by_gender_pct %>%
	select(gender,disorder,pct_by_gender) %>%
	filter(gender == "Female")

dis_by_gender_pct_f
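
The two percentage tables could also be compared side by side in a single chart. This is only one possible way to visualise them, sketched with a toy table of made-up counts (toy_counts, toy_pct and disorder_plot are illustrative names, not from the notebook):

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for disorder_by_gender; the counts are made up for illustration only.
toy_counts <- tibble(
  gender   = c("Female", "Female", "Female", "Male", "Male", "Male"),
  disorder = rep(c("Insomnia", "None", "Sleep Apnea"), 2),
  count    = c(2, 5, 3, 1, 8, 1)
)

# Share of each disorder within its gender, rounded to whole percent.
toy_pct <- toy_counts %>%
  group_by(gender) %>%
  mutate(pct_by_gender = round(count / sum(count) * 100)) %>%
  ungroup()

# Grouped bars: one cluster per gender, one bar per disorder.
disorder_plot <- ggplot(toy_pct, aes(x = gender, y = pct_by_gender, fill = disorder)) +
  geom_col(position = "dodge") +
  labs(x = "Gender", y = "Share within gender (%)", fill = "Sleep Disorder")

disorder_plot
```

Because the percentages are computed within each gender, the bars stay comparable even when one gender has more respondents than the other.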