Skip to content
Dr. Semmelweis and the Discovery of Handwashing
  • AI Chat
  • Code
  • Report
  • 1. Meet Dr. Ignaz Semmelweis

    This is Dr. Ignaz Semmelweis, a Hungarian physician born in 1818 and active at the Vienna General Hospital. If Dr. Semmelweis looks troubled it's probably because he's thinking about childbed fever: A deadly disease affecting women that just have given birth. He is thinking about it because in the early 1840s at the Vienna General Hospital as many as 10% of the women giving birth die from it. He is thinking about it because he knows the cause of childbed fever: It's the contaminated hands of the doctors delivering the babies. And they won't listen to him and wash their hands!

    In this notebook, we're going to reanalyze the data that made Semmelweis discover the importance of handwashing. Let's start by looking at the data that made Semmelweis realize that something was wrong with the procedures at Vienna General Hospital.

    # Importing modules
    import pandas as pd 
    
    # Read datasets/yearly_deaths_by_clinic.csv into yearly
    yearly = pd.read_csv ('datasets/yearly_deaths_by_clinic.csv')
    
    # Print out yearly
    yearly

    2. The alarming number of deaths

    The table above shows the number of women giving birth at the two clinics at the Vienna General Hospital for the years 1841 to 1846. You'll notice that giving birth was very dangerous; an alarming number of women died as the result of childbirth, most of them from childbed fever.

    We see this more clearly if we look at the proportion of deaths out of the number of women giving birth. Let's zoom in on the proportion of deaths at Clinic 1.

    # Calculating proportion of deaths per no. births
    yearly['proportion_deaths'] = yearly['deaths']/ yearly['births']
    # Extracting Clinic 1 data into clinic_1 and Clinic 2 data into clinic_2
    clinic_1 = yearly[yearly['clinic']== 'clinic 1']
    clinic_2 = yearly[yearly['clinic']== 'clinic 2']
    # Printing out clinic_1
    clinic_1 

    3. Death at the clinics

    If we now plot the proportion of deaths at both Clinic 1 and Clinic 2 we observe a pattern…

    import matplotlib.pyplot as plt
    
    # This makes plots appear in the notebook
    %matplotlib inline
    
    # Ploting yearly proportion of deaths at the two clinics
    ax = clinic_1.plot(x="year", y="proportion_deaths", label="Clinic 1")
    clinic_2.plot(x="year", y="proportion_deaths", label="Clinic 2", ax=ax, ylabel="Proportion deaths")
    plt.show()
    # Read datasets/monthly_deaths.csv into monthly
    monthly = pd.read_csv('datasets/monthly_deaths.csv', parse_dates = ['date'])
    
    # Calculate proportion of deaths per no. births
    monthly['proportion_deaths'] = monthly['deaths'] / monthly['births'] 
    # Print out the first rows in monthly
    monthly.head()
    

    5. The effect of handwashing

    With the data loaded we can now look at the proportion of deaths over time. In the plot below we haven't marked where obligatory handwashing started, but it reduced the proportion of deaths to such a degree it is easily observable.

    # Plot monthly proportion of deaths
    ax = monthly.plot(x = 'date', y = 'proportion_deaths', label = 'Monthly proportion of deaths')
    

    6. The effect of handwashing highlighted

    Starting from the summer of 1847 the proportion of deaths is drastically reduced and, yes, this was when Semmelweis made handwashing obligatory.

    The effect of handwashing is made even more clear if we highlight this in the graph.

    # Date when handwashing was made mandatory
    handwashing_start = pd.to_datetime('1847-06-01')
    
    # Split monthly into before and after handwashing_start
    before_washing = monthly[monthly["date"] < handwashing_start]
    after_washing = monthly[monthly["date"] >= handwashing_start]
    
    # Plot monthly proportion of deaths before and after handwashing
    ax = before_washing.plot(x="date", y="proportion_deaths",
                             label="Before handwashing")
    after_washing.plot(x="date", y="proportion_deaths",
                       label="After handwashing", ax=ax, ylabel="Proportion deaths")
    plt.show()

    7. More handwashing, fewer deaths?

    Again, the graph shows that handwashing had a huge effect. Q: How much did it reduce the monthly proportion of deaths on average?

    # Difference in mean monthly proportion of deaths due to handwashing
    before_proportion = before_washing['proportion_deaths']
    after_proportion = after_washing['proportion_deaths']
    mean_diff = after_proportion.mean() - before_proportion.mean()
    mean_diff

    8. A Bootstrap analysis of Semmelweis handwashing data

    It reduced the proportion of deaths by around 8 percentage points! From 10% on average to just 2%.

    To get a feeling for the uncertainty around how much handwashing reduces mortalities we could look at a confidence interval (here calculated using the bootstrap method).

    # A bootstrap analysis of the reduction of deaths due to handwashing
    boot_mean_diff = []
    for i in range(3000):
        boot_before = before_proportion.sample(frac =1, replace = True )
        boot_after = after_proportion.sample (frac =1, replace = True)
        boot_mean_diff.append( boot_after.mean() - boot_after.mean() )
    
    # Calculating a 95% confidence interval from boot_mean_diff 
    confidence_interval = pd.Series(boot_mean_diff).quantile([0.025,0.975])
    confidence_interval