Project: Dr. Semmelweis and the Importance of Handwashing
    Hungarian physician Dr. Ignaz Semmelweis worked at the Vienna General Hospital with childbed fever patients. Childbed fever is a deadly disease affecting women who have just given birth, and in the early 1840s as many as 10% of the women giving birth at the Vienna General Hospital died from it. Dr. Semmelweis discovered that the cause was the contaminated hands of the doctors delivering the babies, and on June 1st, 1847, he decreed that everyone should wash their hands. This was an unorthodox and controversial request, since nobody in Vienna knew about bacteria at the time.

    You will reanalyze the data that made Semmelweis discover the importance of handwashing and its impact on the hospital.

    The data is stored as two CSV files within the data folder.

    yearly_deaths_by_clinic.csv contains the number of births and deaths at the two clinics of the Vienna General Hospital for each year from 1841 to 1846.

    Column    Description
    year      Years (1841-1846)
    births    Number of births
    deaths    Number of deaths
    clinic    Clinic 1 or clinic 2

    monthly_deaths.csv contains data from 'Clinic 1' of the hospital where most deaths occurred.

    Column    Description
    date      Date (YYYY-MM-DD)
    births    Number of births
    deaths    Number of deaths
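    Because the later steps divide deaths by births and compare date against a calendar date, it can help to confirm the column types right after loading. The sketch below is only an illustration under stated assumptions: it writes a tiny made-up CSV in the layout described above to a temporary file and reads it back with base R's read.csv. The values and the toy_ names are hypothetical and are not part of the project data.

```r
# Hypothetical rows in the documented monthly layout (date, births, deaths).
# The numbers are invented for illustration; they are not the hospital records.
toy_path <- tempfile(fileext = ".csv")
writeLines(c("date,births,deaths",
             "1847-05-01,100,10",
             "1847-06-01,120,3"),
           toy_path)

# Ask read.csv to parse `date` as a Date and the counts as integers.
toy_monthly <- read.csv(
  toy_path,
  colClasses = c(date = "Date", births = "integer", deaths = "integer")
)

# Check the schema before doing any arithmetic or date comparisons.
stopifnot(
  identical(names(toy_monthly), c("date", "births", "deaths")),
  inherits(toy_monthly$date, "Date"),
  is.integer(toy_monthly$births),
  is.integer(toy_monthly$deaths),
  all(toy_monthly$deaths <= toy_monthly$births)
)
str(toy_monthly)
```

    With read_csv from the tidyverse, as used in this project, ISO-formatted dates are normally parsed as dates automatically, so the same stopifnot checks could be run directly on monthly after loading.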
    # Imported libraries
    library(tidyverse)
    
    # Load yearly_deaths_by_clinic.csv into a data frame named yearly
    yearly <- read_csv('data/yearly_deaths_by_clinic.csv')
    yearly
    
    # Load monthly_deaths.csv into a data frame named monthly
    monthly <- read_csv("data/monthly_deaths.csv")
    monthly
    # Add a proportion_deaths column (deaths / births) to each data frame
    
    yearly <- yearly %>% 
      mutate(proportion_deaths = deaths / births)
    
    yearly
    
    monthly <- monthly %>% 
      mutate(proportion_deaths = deaths / births)
    
    monthly
    # Line plot of the yearly proportion of deaths, one line per clinic
    
    ggplot(yearly, aes(x = year, y = proportion_deaths, color = clinic)) +
      geom_line()
    
    # Line plot of the monthly proportion of deaths
    ggplot(monthly, aes(date, proportion_deaths)) +
      geom_line() +
      labs(x = "Year", y = "Proportion Deaths")
    # Flag the months on or after the start of mandatory handwashing, then plot
    handwashing_start <- as.Date('1847-06-01')
    
    monthly <- monthly %>%
      mutate(handwashing_started = date >= handwashing_start)
    
    monthly
    
    ggplot(monthly, aes(x = date, y = proportion_deaths, color = handwashing_started)) +
      geom_line()
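    The flag above depends on how >= treats the cutoff date itself. As a small sketch with hypothetical dates (toy_dates is made up and not taken from the data files), the comparison is inclusive, so an observation dated exactly 1847-06-01 is counted as falling in the handwashing period:

```r
handwashing_start <- as.Date("1847-06-01")

# Hypothetical dates around the cutoff (not taken from the data files).
toy_dates <- as.Date(c("1847-05-31", "1847-06-01", "1847-06-02"))

# `>=` is inclusive, so the start date itself is flagged TRUE.
handwashing_started <- toy_dates >= handwashing_start
print(data.frame(date = toy_dates, handwashing_started))
```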
    # Mean monthly proportion of deaths before and after handwashing started
    
    monthly_summary <- monthly %>% 
      group_by(handwashing_started) %>%
      summarize(mean_proportion_deaths = mean(proportion_deaths))
    
    monthly_summary
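    One way to read the summary is as a difference between the two group means. Note that mean(proportion_deaths) is an unweighted mean of monthly proportions, which need not equal the pooled rate sum(deaths) / sum(births) when the number of births varies from month to month. The sketch below shows both calculations in base R on a hypothetical toy data frame; the numbers are invented for illustration and are not the hospital records, and all toy_, mean_ and pooled_ names are assumptions introduced here.

```r
# Hypothetical toy data standing in for `monthly`; the numbers are made up
# for illustration and are NOT the hospital records.
toy_monthly <- data.frame(
  date   = as.Date(c("1847-03-01", "1847-04-01", "1847-05-01",
                     "1847-06-01", "1847-07-01", "1847-08-01")),
  births = c(300, 310, 290, 270, 250, 260),
  deaths = c(30, 31, 29, 5, 3, 4)
)

handwashing_start <- as.Date("1847-06-01")
toy_monthly$proportion_deaths   <- toy_monthly$deaths / toy_monthly$births
toy_monthly$handwashing_started <- toy_monthly$date >= handwashing_start

after <- toy_monthly$handwashing_started

# Unweighted mean of the monthly proportions in each period
mean_before <- mean(toy_monthly$proportion_deaths[!after])
mean_after  <- mean(toy_monthly$proportion_deaths[after])
mean_diff   <- mean_after - mean_before

# Pooled rate: total deaths divided by total births in each period
pooled_before <- sum(toy_monthly$deaths[!after]) / sum(toy_monthly$births[!after])
pooled_after  <- sum(toy_monthly$deaths[after])  / sum(toy_monthly$births[after])

print(c(mean_before = mean_before, mean_after = mean_after, mean_diff = mean_diff))
print(c(pooled_before = pooled_before, pooled_after = pooled_after))
```

    A negative mean_diff means the average monthly proportion of deaths was lower after handwashing started. Applied to the real data, the same arithmetic could be run on the monthly data frame built above.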