Hungarian physician Dr. Ignaz Semmelweis worked at the Vienna General Hospital with childbed fever patients. Childbed fever is a deadly disease affecting women who have just given birth, and in the early 1840s, as many as 10% of the women giving birth at the Vienna General Hospital died from it. Dr. Semmelweis discovered that the cause was the contaminated hands of the doctors delivering the babies, and on June 1st, 1847, he decreed that everyone should wash their hands. This was an unorthodox and controversial request, since nobody in Vienna knew about bacteria at the time.
You will reanalyze the data that led Semmelweis to discover the importance of handwashing, and examine its impact on the hospital.
The data is stored as two CSV files within the data folder.
yearly_deaths_by_clinic.csv contains the number of births and deaths at the two clinics of the Vienna General Hospital between the years 1841 and 1846.
| Column | Description |
|---|---|
| year | Years (1841-1846) |
| births | Number of births |
| deaths | Number of deaths |
| clinic | Clinic 1 or clinic 2 |
monthly_deaths.csv contains data from 'Clinic 1' of the hospital where most deaths occurred.
| Column | Description |
|---|---|
| date | Date (YYYY-MM-DD) |
| births | Number of births |
| deaths | Number of deaths |
# Imported libraries
library(tidyverse)
# loading the yearly deaths file
yearly <- read_csv("yearly_deaths_by_clinic.csv")
str(yearly)
Next, we convert the categorical variable clinic to a factor.
yearly$clinic <- factor(yearly$clinic)
str(yearly)
Next, we add a proportion_deaths column to the data frame above using the mutate() function.
yearly <- yearly %>%
  mutate(proportion_deaths = deaths / births)
yearly
Next, we make a line plot of the yearly proportion of deaths.
library(ggplot2)
# Year goes on the x-axis so each clinic's line runs through time
ggplot(yearly, aes(x = year, y = proportion_deaths, color = clinic)) +
  geom_line() +
  labs(title = "Proportion of Deaths by Year",
       x = "Year", y = "Proportion of Deaths")
We can see that the yearly proportion of deaths was highest in 1842, in clinic 1, and lowest in 1845, in clinic 2.
Now let's load the monthly deaths data.
# Load libraries (tidyverse already includes readr)
library(tidyverse)
# Read the monthly deaths file and suppress the column type message
monthly <- read_csv("monthly_deaths.csv", show_col_types = FALSE)
# Display the first few rows of the data frame
head(monthly)
Next, we add a proportion_deaths column, computed for each month, to the data frame above using the mutate() function.
monthly <- monthly %>%
  mutate(proportion_deaths = deaths / births)
monthly
Now we make a line plot of the proportion of deaths by month using the ggplot2 package.
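The monthly plot described here could be sketched as follows. This is a minimal illustration, not the notebook's original cell: the helper name plot_monthly_proportion is hypothetical, and the sketch assumes the data frame has a date column parsed as a Date and the proportion_deaths column computed above.

```r
library(ggplot2)

# Hypothetical helper (not part of the original notebook): draws the
# monthly proportion of deaths as a line plot. Assumes `df` has the
# columns `date` and `proportion_deaths`.
plot_monthly_proportion <- function(df) {
  ggplot(df, aes(x = date, y = proportion_deaths)) +
    geom_line() +
    labs(title = "Proportion of Deaths by Month",
         x = "Date", y = "Proportion of Deaths")
}

# Usage with the notebook's data, guarded so the sketch also runs in a
# directory where the CSV file is not present.
if (file.exists("monthly_deaths.csv")) {
  monthly <- readr::read_csv("monthly_deaths.csv", show_col_types = FALSE)
  monthly$proportion_deaths <- monthly$deaths / monthly$births
  print(plot_monthly_proportion(monthly))
}
```

Placing date on the x-axis keeps the line in chronological order, so any change after the June 1847 handwashing decree would appear as a shift along the time axis.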