Toronto, Ontario 🌆. The Queen City. The 6ix.

Known for its vibrant arts scene, diverse culture, stunning skyline, and bustling neighborhoods, Toronto is a city that never sleeps. However, as with any major urban center, it faces its share of challenges. One growing concern for Torontonians is the rising number of bicycle thefts.

You have been invited to assist the Toronto Police Service by analyzing and visualizing data to uncover patterns in theft activity. Your findings and visual insights will provide crucial information that can help allocate resources more effectively and develop strategies to combat bike thefts, ensuring a safer city for all cyclists.

The Data

The dataset used for analyzing bike thefts is Cleaned_Bicycle_Thefts_Open_Data.csv, located in the data folder. It contains essential information about bicycle theft incidents in Toronto. The columns are described below:

| Column | Description |
| --- | --- |
| date | The date when the bike theft occurred, formatted as YYYY/MM/DD. |
| quarter | A quarter represents one-fourth (1/4) of a year, equating to three months. |
| day_of_week | The day of the week when the theft took place (e.g., Monday, Tuesday). |
| neighborhood | The neighborhood where the theft occurred, based on the city's 140 social planning neighborhoods. |
| bike_cost | The reported cost of the stolen bike, specified in the local currency. |
| location | The specific location type of the theft, such as Residential Structures, Commercial Areas, Public Spaces, etc. |
| long | The longitude of the center of the neighborhood. |
| lat | The latitude of the center of the neighborhood. |

This dataset provides a comprehensive view of bike thefts, including when and where they occur, the financial impact, and other spatial and temporal factors. By analyzing this data, you can gain valuable insights into patterns and trends, which can inform strategies to mitigate bike thefts and enhance urban planning.
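
The column descriptions suggest that quarter and day_of_week can be derived from date. Here is a minimal base R sketch of that relationship. It uses a made-up date, and it assumes the quarter column is labelled "Q1" to "Q4", which the table above does not confirm.

```r
# Illustrative only: this date is made up and is not taken from the dataset.
example_date <- as.Date("2021/07/15", format = "%Y/%m/%d")

# quarters() labels a date as "Q1" to "Q4". Whether the dataset's `quarter`
# column uses the same labels is an assumption.
example_quarter <- quarters(example_date)

# weekdays() returns the day name in the current locale, so the exact text
# depends on the system settings.
example_day <- weekdays(example_date)

print(example_quarter)
print(example_day)
```

Comparing values derived this way against the existing quarter and day_of_week columns is one way to sanity-check the data before plotting.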

## Load the tidyverse package
library(tidyverse)
## Read the CSV file into `bike_data`
bike_data <- read_csv("data/Cleaned_Bicycle_Thefts_Open_Data.csv")
## Take a glance at `bike_data`
head(bike_data)
# Count thefts per quarter and plot the totals
bike_data %>%
  group_by(quarter) %>%
  summarize(total_per_quarter = n()) %>%
  ggplot(aes(x = quarter, y = total_per_quarter)) +
  geom_point() +
  geom_smooth(span = 0.1, se = FALSE) +
  labs(x = "quarter", y = "bikes_stolen")


high <- "Q3"
low <- "Q1"
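
The high and low quarters above were read off the plot. As a hedged alternative, they could be derived from the counts directly. The sketch below runs on a small made-up table, not the real data. The names `toy_thefts`, `high_quarter`, and `low_quarter` are hypothetical and were chosen so they do not overwrite `high` and `low`.

```r
library(dplyr)

# Hypothetical toy data standing in for `bike_data`; the counts are invented.
toy_thefts <- tibble(
  quarter = c(rep("Q1", 1), rep("Q2", 2), rep("Q3", 4), rep("Q4", 3))
)

# One row per quarter with its number of thefts
quarter_counts <- toy_thefts %>%
  count(quarter, name = "total_per_quarter")

# Quarter with the most thefts (ties broken by taking the first row)
high_quarter <- quarter_counts %>%
  slice_max(total_per_quarter, n = 1, with_ties = FALSE) %>%
  pull(quarter)

# Quarter with the fewest thefts
low_quarter <- quarter_counts %>%
  slice_min(total_per_quarter, n = 1, with_ties = FALSE) %>%
  pull(quarter)

print(high_quarter)
print(low_quarter)
```

Swapping `toy_thefts` for `bike_data` would apply the same steps to the real counts.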
# Share of thefts by location type, shown as a pie chart
bike_data %>%
  group_by(location) %>%
  summarise(bikes_stolen_location = n()) %>%
  mutate(
    total_all_location = sum(bikes_stolen_location),
    percentage = bikes_stolen_location / total_all_location
  ) %>%
  ggplot(aes(x = "", y = percentage, fill = location)) +
  geom_col() +
  geom_text(
    aes(label = round(percentage, 2)),
    alpha = 0.5,
    position = position_stack(vjust = 0.5),
    show.legend = FALSE
  ) +
  coord_polar(theta = "y")

location <- "Residential Structures"
percentage <- 0.5
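
The location and percentage above were read off the pie chart labels. A minimal sketch of computing them instead is shown here, again on made-up counts. The location names come from the column description, while `toy_locations`, `top_location`, and `top_share` are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data standing in for `bike_data`; the counts are invented.
toy_locations <- tibble(
  location = c(
    rep("Residential Structures", 5),
    rep("Commercial Areas", 3),
    rep("Public Spaces", 2)
  )
)

# Share of thefts per location type, keeping only the largest share
top_row <- toy_locations %>%
  count(location, name = "bikes_stolen_location") %>%
  mutate(percentage = bikes_stolen_location / sum(bikes_stolen_location)) %>%
  slice_max(percentage, n = 1, with_ties = FALSE)

top_location <- top_row$location
top_share <- round(top_row$percentage, 2)

print(top_location)
print(top_share)
```

Note that `percentage` here is a fraction between 0 and 1, matching the value stored in the notebook, not a number out of 100.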
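# Median bike cost per neighborhood, plotted at each neighborhood's center
g <- bike_data %>%
  group_by(neighborhood) %>%
  summarise(
    long = mean(long),
    lat = mean(lat),
    median_value = median(bike_cost, na.rm = TRUE)
  ) %>%
  ggplot(aes(x = long, y = lat, label = neighborhood, color = median_value)) +
  geom_point(size = 2)

# Add neighborhood labels on top of the points
g +
  geom_text(size = 2)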
region <- "41"
action <- "Bike thefts increase in Q3, so increased caution and surveillance are recommended then. Residential areas account for over 50% of thefts, and people with valuable bikes should be extra cautious in their neighborhoods, as thieves tend to target more valuable bikes."