Health Factors & Hair Loss: Let's Save Our Hair!

Understanding the factors behind hair loss is key to unlocking the mystery of what causes those dreaded thinning spots! By digging into things like genetics, medical conditions, diet, and lifestyle, we can pinpoint what might be contributing to the problem. It's not just about saving hair—it's about empowering people with the knowledge they need to take control and find the right solutions. Plus, knowing how age, stress, and health impact hair loss helps us come up with smarter ways to keep those locks looking their best. So, let’s get to the root of the issue and maybe avoid going bald.

Image from https://www.pickpik.com/hidden-hide-man-young-shaved-bald-head-115492

Executive Summary

Key Findings

  • Ages of those with hair loss ranged from 18 to 50 years old.
  • The average age of an individual with hair loss was approximately 33.6 years old.
  • The age groups with the highest proportion of individuals with hair loss were 28-30 and 32-34.

Medical Conditions Associated with Hair Loss:

  • The most common medical conditions in those with hair loss were:
    1. Alopecia Areata - 13.6%
    2. Androgenetic Alopecia - 12.2%
    3. Psoriasis - 11.1%

Nutritional Deficiencies in Those with Hair Loss:

  • The most common nutritional deficiencies in those with hair loss were:
    1. Vitamin D - 11.4%
    2. Vitamin A - 11.2%
    3. Zinc - 11.2%

Factors Associated with Hair Loss:

  • Stress level was not significantly associated with hair loss.
  • A generalized linear model (GLM) did not identify any factor that was clearly associated with hair loss.

Other Findings:

  • Those with hair loss were commonly using Rogaine, steroids, and antidepressants.
  • Among individuals with alopecia (conditions that cause hair loss), not all reported hair loss.
  • Individuals with alopecia who had hair loss commonly had vitamin D and protein deficiencies; no other contributing factors were significant.

Recommendations

Hair loss is a multifaceted issue influenced by a complex interplay of biological, environmental, and lifestyle factors, requiring a comprehensive and nuanced approach to fully understand its causes and effective treatments. These are the next areas of research to dive into to further understand why hair loss occurs.

  1. Targeted Age Group Analysis: Focus on the age groups 28–30 and 32–34 to explore lifestyle, genetic, and environmental factors contributing to the higher prevalence of hair loss within these demographics.
  2. Nutritional Deficiencies: Conduct longitudinal studies to determine whether deficiencies in Vitamin D, Vitamin A, and Zinc are causative factors for hair loss or secondary effects. Additionally, explore the role of protein deficiencies, particularly in individuals with Alopecia.
  3. Hair Loss in Alopecia Subgroups: Investigate why some individuals with Alopecia do not experience hair loss while others do. This could reveal insights into factors which may cause hair loss.
# Load in the data
library(readr)
data <- suppressMessages(read_csv('data/Predict Hair Fall.csv'))

# Load in required libraries 
library(tidyverse)
library(dplyr)  
library(ggplot2)


#Relabel Hair loss to allow dplyr filter to work properly
data <- dplyr::rename(data, Hair_Loss = `Hair Loss`)
head(data,10)


Clean the data

# Clean the dataset
## Recode the Yes/No columns to binary (1/0); any other value becomes NA
## Replace "No Data" entries with NA
data_update <- data %>%
  mutate(across(c('Genetics', 'Hormonal Changes', 'Poor Hair Care Habits', 'Environmental Factors', 'Smoking', 'Weight Loss'),
                ~ ifelse(. == "Yes", 1, ifelse(. == "No", 0, NA)))) %>%
  mutate(across(c('Genetics', 'Hormonal Changes', 'Poor Hair Care Habits', 'Environmental Factors', 'Smoking', 'Weight Loss', Hair_Loss), as.factor)) %>%
  mutate(across(c('Medical Conditions', 'Medications & Treatments', 'Nutritional Deficiencies'), ~ na_if(., "No Data")))

#data_update will be used later for full models

# Check changes to ensure columns have binary numbers  
head(data_update,10)
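
The recoding step above can be checked on a small toy vector. This is a minimal base R sketch, not the notebook's own code: `yes_no` and `conditions` are made-up values, and the base R indexing stands in for `na_if()`. It shows that anything other than "Yes" or "No" becomes NA, and that comparing the resulting factor with `1` still works, which is what the later `Hair_Loss == 1` filters rely on.

```r
# Toy input (hypothetical values, not taken from the dataset)
yes_no <- c("Yes", "No", "Maybe", NA)

# Same nested ifelse() used in the cleaning step
recoded <- ifelse(yes_no == "Yes", 1, ifelse(yes_no == "No", 0, NA))

# Converting to a factor keeps only the observed values as levels
recoded_factor <- as.factor(recoded)
print(levels(recoded_factor))

# A factor can be compared with a number: the levels are compared as text
is_one <- recoded_factor == 1
print(is_one)

# Base R stand-in for na_if(., "No Data")
conditions <- c("Psoriasis", "No Data", "Eczema")
conditions[conditions == "No Data"] <- NA
print(conditions)
```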

Create subcategory datasets to examine

# Make datasets representing those with and without hair loss
hair_loss_data <- dplyr::filter(data_update, Hair_Loss == 1)
with_hair_data <- dplyr::filter(data_update, Hair_Loss == 0)

#Make dataset with just alopecia individuals
alopecia_data <- dplyr::filter(data_update, `Medical Conditions` %in% c("Alopecia Areata", "Androgenetic Alopecia"))
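
One way to look at how hair loss is distributed within an alopecia subset like this one is a row-wise cross-tabulation. The sketch below uses a small made-up data frame (`toy`, with hypothetical column names `condition` and `hair_loss`) rather than the real data, so the numbers it prints say nothing about the actual results.

```r
# Toy data (hypothetical, for illustration only)
toy <- data.frame(
  condition = c("Alopecia Areata", "Alopecia Areata",
                "Androgenetic Alopecia", "Androgenetic Alopecia", "Psoriasis"),
  hair_loss = factor(c(1, 0, 1, 1, 0), levels = c(0, 1))
)

# Keep only the two alopecia conditions, mirroring the filter above
alopecia_toy <- toy[toy$condition %in% c("Alopecia Areata", "Androgenetic Alopecia"), ]

# Counts of hair loss (0/1) per condition
counts <- table(alopecia_toy$condition, alopecia_toy$hair_loss)

# Row-wise proportions: share with and without hair loss within each condition
row_props <- prop.table(counts, margin = 1)
print(row_props)
```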




Descriptive Statistics and Visualizations

## Average age and Counts

# Calculate average age of person with hair loss
avg_age <- mean(hair_loss_data$Age, na.rm = TRUE)
print(avg_age)

# Find age range of hair loss
print(range(hair_loss_data$Age, na.rm=TRUE))

# Count number of people based on Medical Condition and Nutritional Deficiencies
# Medical Condition Count
MedCond <- hair_loss_data %>%
  group_by(`Medical Conditions`) %>% 
  summarise(Count = n()) %>%
  drop_na()
print(MedCond)

# Nutritional Deficiency Count
Nut_Def <- hair_loss_data %>%
  group_by(`Nutritional Deficiencies`) %>% 
  summarise(Count = n()) %>% 
  drop_na()
print(Nut_Def)

# Medication and Treatments Count
MT <- hair_loss_data %>%
  group_by(`Medications & Treatments`) %>% 
  summarise(Count = n()) %>% 
  drop_na()
print(MT)
#Create histogram displaying the distribution of ages of those with hair loss
ggplot(hair_loss_data, aes( x = Age )) + geom_histogram(color = "white", bins=16) + 
xlab("Age Range Distribution") + ylab("Count")
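# Create age groups in 2-year intervals
# include.lowest = TRUE closes the last interval so that age 50 is kept in [48,50]
age_data <- hair_loss_data %>%
  mutate(AgeGroup = cut(Age, breaks = seq(18, 50, by = 2), right = FALSE, include.lowest = TRUE))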

# Summarize the count of hair loss in each age group
age_group_summary <- age_data %>%
  group_by(AgeGroup) %>%
  summarise(Count = n()) %>%
  mutate(Proportion = Count / sum(Count)) %>%
  drop_na()

print(age_group_summary)
# The proportions across the 2-year age groups are very similar to one another.

# Plot the proportions as a bar chart
ggplot(age_group_summary, aes(x = AgeGroup, y = Proportion, fill = AgeGroup)) + 
  geom_bar(stat = "identity") +
  labs(x = "Age Group", y = "Proportion") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
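
One detail of `cut()` is worth checking for this kind of age grouping. With `right = FALSE` the intervals are closed on the left and open on the right, so a value equal to the top break falls outside the last bin unless `include.lowest = TRUE` is set. The sketch below uses a few made-up ages (`ages` is hypothetical) to show the difference.

```r
# Toy ages, including the upper boundary value 50
ages <- c(18, 33, 49, 50)
breaks <- seq(18, 50, by = 2)

# Left-closed, right-open bins: 50 is not covered by [48,50)
open_top <- cut(ages, breaks = breaks, right = FALSE)
print(open_top)

# include.lowest = TRUE closes the last bin, turning it into [48,50]
closed_top <- cut(ages, breaks = breaks, right = FALSE, include.lowest = TRUE)
print(closed_top)

# Number of ages left without a group in each version
print(c(open = sum(is.na(open_top)), closed = sum(is.na(closed_top))))
```

Any age left as NA would be removed by `drop_na()` after the proportions are computed, so it is worth confirming that no age ends up without a group.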


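# Examine the percentage of hair loss individuals with each medical condition

# Create percent data (missing conditions were already dropped in MedCond,
# so percentages are relative to individuals with a recorded condition)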
data_counts <- MedCond %>%
  mutate(Percentage = (Count / sum(Count)) * 100)
print(data_counts)

# Create the bar graph
ggplot(data_counts, aes(x = reorder(`Medical Conditions`, -Count), y = Percentage, fill = `Medical Conditions`)) +
  geom_bar(stat = "identity") + 
  geom_text(aes(label = paste0(round(Percentage, 1), "%")), vjust = -0.5, size = 3.5) +
  theme(axis.text.x = element_text(angle = 75, vjust = 0.5, hjust = 0.5)) +
  theme(axis.title.x=element_blank())
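
The summary mentions a GLM, but the model code is not shown in this section. As a rough illustration only, the sketch below fits a logistic regression with `glm()` on simulated data. The data frame `sim`, the choice of predictors, and the formula are all assumptions, not the notebook's actual model, and the simulated outcome is random by construction.

```r
set.seed(42)
n <- 500

# Simulated data (hypothetical): binary predictors, age, and a random outcome
sim <- data.frame(
  Genetics  = factor(rbinom(n, 1, 0.5)),
  Smoking   = factor(rbinom(n, 1, 0.5)),
  Age       = sample(18:50, n, replace = TRUE),
  Hair_Loss = factor(rbinom(n, 1, 0.5))
)

# Logistic regression: binomial family with the default logit link
fit <- glm(Hair_Loss ~ Genetics + Smoking + Age, data = sim, family = binomial)

# Coefficient table: estimates, standard errors, z values, and p-values
coef_table <- summary(fit)$coefficients
print(round(coef_table, 3))
```

On the real data, the equivalent call would use `data_update` and the cleaned factor columns; large p-values across all terms would be consistent with the finding that no factor stood out.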








