Hungarian physician Dr. Ignaz Semmelweis worked at the Vienna General Hospital with childbed fever patients. Childbed fever is a deadly disease affecting women who have just given birth, and in the early 1840s as many as 10% of the women giving birth at the Vienna General Hospital died from it. Dr. Semmelweis discovered that the cause was the contaminated hands of the doctors delivering the babies, and on June 1st, 1847, he decreed that everyone should wash their hands. This was an unorthodox and controversial request, because nobody in Vienna knew about bacteria at the time.

You will reanalyze the data that led Semmelweis to discover the importance of handwashing, and examine the impact handwashing had on the hospital.

The data is stored as two CSV files within the data folder.

yearly_deaths_by_clinic.csv contains the number of women who gave birth, and the number who died, at each of the two clinics of the Vienna General Hospital for every year from 1841 to 1846.

Column    Description
year      Year (1841-1846)
births    Number of births
deaths    Number of deaths
clinic    Clinic 1 or clinic 2

monthly_deaths.csv contains monthly data from Clinic 1, the clinic where most deaths occurred.

Column    Description
date      Date (YYYY-MM-DD)
births    Number of births
deaths    Number of deaths

# Imported libraries
library(tidyverse)

# Load the two CSV files
yearly_deaths_by_clinic <- read_csv("data/yearly_deaths_by_clinic.csv")
monthly_deaths <- read_csv("data/monthly_deaths.csv")

# Preview the yearly data
head(yearly_deaths_by_clinic)
glimpse(yearly_deaths_by_clinic)

# Preview the monthly data
head(monthly_deaths)
glimpse(monthly_deaths)

# Add the proportion of deaths per birth to the yearly data
yearly <- yearly_deaths_by_clinic %>%
    mutate(proportion_death = deaths / births)
head(yearly)

# Add the proportion of deaths per birth to the monthly data
monthly <- monthly_deaths %>%
    mutate(proportion_death = deaths / births)
head(monthly)
# Load necessary library for plotting
library(ggplot2)

# Yearly proportion of deaths plot
yearly_plot <- ggplot(yearly, aes(x = year, y = proportion_death, color = clinic)) +
    geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
    labs(title = "Yearly Proportion of Deaths by Clinic",
         x = "Year",
         y = "Proportion of Deaths") +
    theme_minimal()

# Monthly proportion of deaths plot
# Use `monthly`, which carries proportion_death; the raw monthly_deaths table does not
monthly_plot <- ggplot(monthly, aes(x = date, y = proportion_death)) +
    geom_line(linewidth = 1, color = "blue") +
    labs(title = "Monthly Proportion of Deaths",
         x = "Date",
         y = "Proportion of Deaths") +
    theme_minimal()

# Display the plots
yearly_plot
monthly_plot
# Add the handwashing_started column: TRUE from 1 June 1847 onwards.
# proportion_death was already added to `monthly` above, so it is not recalculated here.
monthly <- monthly %>%
    mutate(handwashing_started = date >= as.Date("1847-06-01"))

# Create the plot
ggplot(monthly, aes(x = date, y = proportion_death, color = handwashing_started)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  labs(title = "Monthly Proportion of Deaths with Handwashing Start",
       x = "Date",
       y = "Proportion of Deaths",
       color = "Handwashing Started") +
  theme_minimal()
# Calculate the mean proportion of deaths before and after handwashing
monthly_summary <- monthly %>%
  group_by(handwashing_started) %>%
  summarise(mean_proportion_death = mean(proportion_death, na.rm = TRUE))

# Print the summary
print(monthly_summary)
# Create the bar plot
ggplot(monthly_summary, aes(x = handwashing_started, y = mean_proportion_death, fill = handwashing_started)) +
  geom_col() +
  labs(title = "Mean Proportion of Deaths Before and After Handwashing",
       x = "Handwashing Started",
       y = "Mean Proportion of Deaths") +
  scale_x_discrete(labels = c("Before", "After")) +
  theme_minimal()
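
The two bars above can also be reduced to a single number: the mean proportion of deaths after handwashing started minus the mean before. The block below is a minimal sketch of that subtraction, not part of the original analysis. The helper name handwashing_effect is hypothetical, and the values in toy_summary are made up purely to show the mechanics; they are not the hospital figures.

```r
library(dplyr)

# Hypothetical helper: expects one row per handwashing_started value
# (FALSE = before, TRUE = after) and returns mean(after) - mean(before).
handwashing_effect <- function(summary_tbl) {
  before <- summary_tbl %>%
    filter(!handwashing_started) %>%
    pull(mean_proportion_death)
  after <- summary_tbl %>%
    filter(handwashing_started) %>%
    pull(mean_proportion_death)
  after - before
}

# Made-up illustration values, NOT taken from the Vienna data
toy_summary <- data.frame(
  handwashing_started = c(FALSE, TRUE),
  mean_proportion_death = c(0.10, 0.02)
)

effect <- handwashing_effect(toy_summary)
print(effect)
```

A negative result means the mean proportion of deaths fell after handwashing began. With the real data, the same call would be handwashing_effect(monthly_summary), assuming monthly_summary was built as shown earlier.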