Visualizing Bicycle Theft in R
Introduction
Toronto, the bustling metropolis known for its rich culture and vibrant arts scene, is also grappling with an increasing problem that affects its cyclists: bicycle theft. As the number of incidents rises, the Toronto Police Service has invited an analysis of historical bike theft data to better understand the trends and spatial distribution of these crimes. This analysis aims to provide insights into when and where bike thefts are most likely to occur, helping to inform strategies to combat this issue and improve urban safety for cyclists.
Using the data provided, this report covers:
1. The number of stolen bikes by quarter
2. The most frequent locations for bike thefts
3. The region of Toronto with the highest median value of stolen bikes
4. Practical recommendations to help reduce the incidence of bike thefts
The Data
The dataset used for analyzing bike thefts is stored as Cleaned_Bicycle_Thefts_Open_Data.csv in the data folder. It contains one record per reported bicycle theft in Toronto. The columns are described below:
| Column | Description |
|---|---|
| date | The date when the bike theft occurred, formatted as YYYY/MM/DD. |
| quarter | A quarter represents one-fourth (1/4) of a year, equating to three months. |
| day_of_week | The day of the week when the theft took place (e.g., Monday, Tuesday). |
| neighborhood | The neighborhood where the theft occurred, based on the city's 140 social planning neighborhoods. |
| bike_cost | The reported cost of the stolen bike, specified in the local currency. |
| location | The specific location type of the theft, such as Residential Structures, Commercial Areas, Public Spaces, etc. |
| long | The longitude of the center of the neighborhood. |
| lat | The latitude of the center of the neighborhood. |
This dataset provides a comprehensive view of bike thefts, including when and where they occur, the financial impact, and other spatial and temporal factors. By analyzing this data, you can gain valuable insights into patterns and trends, which can inform strategies to mitigate bike thefts and enhance urban planning.
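As a minimal sketch of how the temporal columns relate to one another, the snippet below parses a few made-up dates in the YYYY/MM/DD format and derives a quarter label and a weekday from them. The toy tibble and the `quarter_label` column name are assumptions for illustration only; the actual file may store these fields differently.

```r
library(dplyr)
library(lubridate)

# Toy rows standing in for the real file (dates are invented for illustration)
toy <- tibble(date = c("2020/01/15", "2020/05/03", "2020/08/21", "2020/11/30"))

toy_parsed <- toy %>%
  mutate(
    date = ymd(date),                                     # parse "YYYY/MM/DD" strings into Date objects
    quarter_label = paste0("Q", quarter(date)),           # quarter number 1-4 becomes "Q1".."Q4"
    day_of_week = wday(date, label = TRUE, abbr = FALSE)  # weekday name as an ordered factor
  )

toy_parsed
```

Deriving the labels from a single parsed date column keeps the quarter and weekday consistent with each other, which is useful when checking the precomputed columns in the file.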
## Loading tidyverse package
library(tidyverse)
library(repr)
options(repr.plot.width = 10, repr.plot.height = 20)
## Read `bike_data`
bike_data <- read_csv("data/Cleaned_Bicycle_Thefts_Open_Data.csv")
#Which quarter, i.e., "Q1", "Q2", "Q3" and "Q4", has the highest and lowest number of stolen bikes?
#Derive the year from the quarter column, then count the records for each year-quarter pair
quarterly_bike_data <- bike_data %>%
  mutate(year = as.factor(year(quarter))) %>%
  count(year, quarter, sort = TRUE)
#Visualizing the number of stolen bikes per quarter for each year
ggplot(quarterly_bike_data, aes(x = month(quarter), y = n, color = year)) +
  geom_line() +
  scale_x_continuous(limits = c(1, 12), n.breaks = 12)
low <- "Q1"
high <- "Q3"
#What are the most frequent locations (e.g., Residential, Commercial Areas) for bike thefts in Toronto? And what proportion is it (round to one decimal place)?
#Visualizing most frequent locations for bike thefts in Toronto
ggplot(bike_data, aes(x = location, y = after_stat(prop), group = 1)) +
geom_bar(fill = "Blue") + theme(axis.text.x = element_text(angle = 90)) + ggtitle("Most frequent locations for bike thefts in Toronto")
#Determining the exact percentage of thefts for each location type
bike_data_location_proportion <- bike_data %>%
  count(location, sort = TRUE) %>%
  mutate(proportion = round(n / sum(n) * 100, 1))
location <- "Residential Structures"
percentage <- 0.5
#In which region of Toronto is the median value of stolen bikes the highest?
regions <- bike_data %>%
group_by(neighborhood) %>%
summarize(median_bike_cost = median(bike_cost))
ggplot(regions, aes(neighborhood, median_bike_cost)) +
geom_point() + coord_flip() + labs(title = "Median bike cost per region") + ylab(label = "Median bike cost") + xlab(label = "Region") + theme_minimal()
region <- "Bridle Path-Sunnybrook-York Mills"
# What course of action would you recommend to the police station based on your findings?
action <- "Since the number of stolen bikes tends to be highest during the third quarter, I recommend increasing awareness of bike theft during this period, especially in residential areas where most bikes are stolen. I also urge residents of these areas to be vigilant and ensure that their bikes are parked in secure places at all times."
Key Findings
Quarter with the Highest and Lowest Number of Stolen Bikes
The analysis of bike thefts by quarter revealed significant fluctuations in the number of incidents throughout the year. By categorizing the data into Q1 (January-March), Q2 (April-June), Q3 (July-September), and Q4 (October-December), we observed the following:
Q3 (July-September) reported the highest number of bike thefts.
Q1 (January-March) had the lowest number of bike thefts.
This seasonal variation indicates that bike thefts are notably more frequent in the warmer months when cycling is more popular. This could be attributed to higher bicycle use during the summer and potentially fewer precautions being taken by bike owners during these months.
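The quarter comparison can also be derived programmatically rather than read off the plot. The following is a rough sketch on invented dates, not the notebook's actual computation: it assigns each date a quarter label, counts thefts per label, and picks the extremes. The `toy` data and the `quarter_label` and `thefts` column names are assumptions; `high` and `low` mirror the variable names used in the notebook.

```r
library(dplyr)
library(lubridate)

# Toy theft dates (invented); the real analysis would use bike_data instead
toy <- tibble(date = ymd(c("2020-01-10", "2020-04-02", "2020-05-19",
                           "2020-07-04", "2020-07-25", "2020-08-13",
                           "2020-10-08", "2020-11-11")))

# Count thefts per quarter label, pooled across all years in the data
quarter_counts <- toy %>%
  mutate(quarter_label = paste0("Q", quarter(date))) %>%
  count(quarter_label, name = "thefts")

# Pick the quarters with the most and fewest thefts (first match wins on ties)
high <- quarter_counts %>%
  slice_max(thefts, n = 1, with_ties = FALSE) %>%
  pull(quarter_label)
low <- quarter_counts %>%
  slice_min(thefts, n = 1, with_ties = FALSE) %>%
  pull(quarter_label)
```

Pooling across years answers the "which quarter overall" question directly; grouping by year as well would show whether the pattern holds in every year.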
Most Frequent Locations for Bike Thefts
When analyzing the location types of bike thefts, the dataset highlighted the following categories:
Residential Areas emerged as the most frequent location for thefts.
Commercial Areas followed closely behind.
Proportions:
Residential Areas accounted for 58.7% of all thefts.
Commercial Areas accounted for 31.4% of thefts.
These findings suggest that although commercial areas with higher foot traffic see a substantial share of thefts, residential settings carry the greatest risk, likely due to improper bike storage and lower surveillance.
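For reference, the share of each location type is simply its count divided by the total number of thefts. The sketch below illustrates the calculation on invented rows, so the resulting percentages are not the report's figures; the `toy` data and the `percent` and `top_location` names are assumptions.

```r
library(dplyr)

# Toy location labels (invented counts, for illustration only)
toy <- tibble(location = c(rep("Residential Structures", 5),
                           rep("Commercial Areas", 3),
                           rep("Public Spaces", 2)))

# Count each location type, then express the counts as a percentage of all rows
location_share <- toy %>%
  count(location, sort = TRUE) %>%
  mutate(percent = round(n / sum(n) * 100, 1))

# Because the table is sorted by count, the first row is the most frequent location
top_location <- location_share %>% slice(1)
```

Note that `sum(n)` is taken over the whole table, which only works as intended when the data frame is ungrouped at that point.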
Region with the Highest Median Value of Stolen Bikes
The region with the highest median value of stolen bikes was found to be Bridle Path-Sunnybrook-York Mills. This aligns with expectations: residents of higher-income areas tend to buy more expensive bicycles, which may in turn be more attractive to thieves.
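The median is used here rather than the mean because a single very expensive bike can distort an average. A small sketch on made-up values shows the idea and how the top region could be extracted directly; the neighborhood names "A" and "B", the costs, and the `top_region` name are all hypothetical.

```r
library(dplyr)

# Toy data: neighborhood "A" contains one outlier (5000) that would inflate a mean
toy <- tibble(
  neighborhood = c("A", "A", "A", "B", "B", "B"),
  bike_cost = c(500, 700, 5000, 900, 1000, 1100)
)

# Median cost per neighborhood, keeping only the neighborhood with the highest median
top_region <- toy %>%
  group_by(neighborhood) %>%
  summarize(median_bike_cost = median(bike_cost, na.rm = TRUE)) %>%
  slice_max(median_bike_cost, n = 1, with_ties = FALSE)
```

In this toy case the mean would rank "A" first because of the outlier, while the median ranks "B" first, which is the behavior the report relies on.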
Recommendations for Action
Based on the findings from the analysis, the following actions are recommended to the Toronto Police Service:
Increase Awareness During Q3: Since bike thefts are most frequent in Q3 (July-September), awareness campaigns should be launched during this period. These campaigns should focus on educating cyclists about the importance of securing their bikes and using high-quality locks.
Focus on Residential Areas: Given that residential areas have the highest proportion of thefts, it is recommended that the police increase patrols and surveillance in these neighborhoods. Additionally, working with local residents to improve security measures and encourage community vigilance can help reduce thefts.
Encourage Secure Bike Parking: Managers of residential complexes and operators of public spaces should be encouraged to install more secure bike racks. In commercial areas, businesses can offer designated, well-lit bike parking zones to reduce the likelihood of theft.
Prioritize High-Value Bike Protection: In neighborhoods such as Bridle Path-Sunnybrook-York Mills, where the median value of stolen bikes is highest, specialized programs that focus on protecting higher-end bikes, such as secure storage options or bike registration with identifiable marks, could prove beneficial.