Skip to content
0

Reducing the number of high fatality accidents

📖 Background

You work for the road safety team within the department of transport and are looking into how they can reduce the number of major incidents. The safety team classes major incidents as fatal accidents involving 3+ casualties. They are trying to learn more about the characteristics of these major incidents so they can brainstorm interventions that could lower the number of deaths. They have asked for your assistance with answering a number of questions.

💾 The data

The reporting department have been collecting data on every accident that is reported. They've included this along with a lookup file for 2020's accidents.

Published by the department for transport. https://data.gov.uk/dataset/road-accidents-safety-data Contains public sector information licensed under the Open Government Licence v3.0.

library(tidyverse)
accidents <- readr::read_csv('./data/accident-data.csv')
head(accidents)
lookup <- readr::read_csv('./data/road-safety-lookups.csv')
head(lookup)

💪 Competition challenge

Create a report that covers the following:

  1. What time of day and day of the week do most major incidents happen?
  2. Are there any patterns in the time of day/ day of the week when major incidents occur?
  3. What characteristics stand out in major incidents compared with other accidents?
  4. On what areas would you recommend the planning team focus their brainstorming efforts to reduce major incidents?

Exploring day of the week when Major (Serious + Fatal) incidents happen

accidents <- accidents %>% janitor::clean_names()

lookup <- lookup %>% janitor::clean_names()

Using lookup table to add meaning to accident_severity

weekd <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

accidents_severity <- accidents %>% 
  mutate(accident_severity = as.character(accident_severity)) %>% 
  left_join(lookup %>% select(field_name, code_format, label) %>% 
              filter(field_name == 'accident_severity'), by = c("accident_severity" = "code_format")) %>% 
  mutate(accident_severity = label) %>% 
  select(-field_name, -label) %>% 
  
  mutate(day_of_week = as.character(day_of_week)) %>% 
  left_join(lookup %>% select(field_name, code_format, label) %>% 
              filter(field_name == 'day_of_week'), by = c("day_of_week" = "code_format")) %>% 
  
  mutate(day_of_week = label) %>% 
  select(-field_name, -label) %>% 
  
  mutate(accident_major_other = case_when(
    accident_severity %in% c("Serious", "Fatal") ~ "Major",
    TRUE ~ "Other"
  )) %>% 
  
  mutate(day_of_week = forcats::fct_relevel(day_of_week, weekd))
  
  

head(accidents_severity)

Preparing data frame to plot accident count by day of week for accident_severity


accidents_severity_day <- accidents_severity %>% 
  count(accident_major_other, day_of_week, wt = number_of_casualties) %>% 
  mutate(accident_major_other = forcats::fct_relevel(accident_major_other,
                                                     c("Other", "Major")))

Plot of accident count by day of week for accident_severity

ggplot(accidents_severity_day, 
       aes(x = day_of_week, y = n, fill = accident_major_other)) +
  ## draw bars
  geom_bar(stat="identity", 
           position="dodge") +
  
  geom_label(
    aes(label = n), 
    hjust = 0.7, nudge_x = 0,
    size = 2, fontface = "bold", family = "Fira Sans",
    ## turn into white box without outline
    fill = "white", label.size = 0
  ) +
  
  scale_fill_manual(values = c("skyblue", "red")) +
  theme_void() +
  theme(legend.title=element_blank()) +
  theme(axis.text.x = element_text(face="bold", color="#993333", 
                           size=14, angle=90, family = "Fira Sans", hjust = 1),
          axis.text.y = element_text(face="bold", color="#993333", 
                           size=14, angle=45)) +
  labs(title = "Accidents by Day",
       subtitle = "")
  • From the plot above we can observe that the maximum Major accident occur on Friday (combination of Fatal + Serious)

Exploring time of the day when Major (Serious + Fatal) incidents happen

accidents_severity_time <- accidents_severity %>% 
  mutate(time = str_sub(time, 1L, 2L)) %>% 
  count(accident_major_other, time, wt = number_of_casualties) %>% 
  filter(accident_major_other == "Major")
  
  
ggplot(accidents_severity_time, 
       aes(x = as.character(time), y = n, fill = as.character(time))) +
  ## draw bars
  geom_bar(stat="identity", 
           position="dodge") +
  
  geom_label(
    aes(label = n), 
    hjust = 0.7, nudge_x = 0,
    size = 2, fontface = "bold", family = "Fira Sans",
    ## turn into white box without outline
    fill = "white", label.size = 0
  ) +
  
  #scale_y_continuous(labels = scales::percent) +
  #scale_x_continuous(expand = c(.01, .01)) +
  theme_void() +
  theme(legend.title=element_blank()) +
  theme(axis.text.x = element_text(face="bold", color="#993333", 
                           size=14, angle=90, family = "Fira Sans", hjust = 1),
          axis.text.y = element_text(face="bold", color="#993333", 
                           size=14, angle=45)) +
  labs(title = "Accidents by Time",
       subtitle = "")
‌
‌
‌