Los Angeles, California—the City of Angels, Tinseltown, and the Entertainment Capital of the World.

Renowned for its vibrant culture, iconic landmarks, and the glitz of Hollywood, Los Angeles also grapples with the challenges of being one of the most populous cities in the United States. Among these challenges is a significant volume of crime that requires the diligent efforts of the Los Angeles Police Department (LAPD) to maintain public safety.

To better understand and combat these issues, LAPD has provided a dataset containing detailed records of crimes reported across the city. Your task is to explore and analyse this data to uncover actionable insights, helping LAPD allocate resources effectively and respond to criminal activity with precision.

The Data

They have provided you with a single dataset to use. A summary and preview are provided below.

It is a modified version of the original data, which is publicly available from Los Angeles Open Data.

crimes.csv

Column | Description
'DR_NO' | Division of Records Number: Official file number made up of a 2-digit year, area ID, and 5 digits.
'Date Rptd' | Date reported - MM/DD/YYYY.
'DATE OCC' | Date of occurrence - MM/DD/YYYY.
'TIME OCC' | In 24-hour military time.
'AREA NAME' | The 21 Geographic Areas or Patrol Divisions are also given a name designation that references a landmark or the surrounding community that it is responsible for. For example, the 77th Street Division is located at the intersection of South Broadway and 77th Street, serving neighborhoods in South Los Angeles.
'Crm Cd Desc' | Indicates the crime committed.
'Vict Age' | Victim's age in years.
'Vict Sex' | Victim's sex: F: Female, M: Male, X: Unknown.
'Vict Descent' | Victim's descent:
  • A - Other Asian
  • B - Black
  • C - Chinese
  • D - Cambodian
  • F - Filipino
  • G - Guamanian
  • H - Hispanic/Latin/Mexican
  • I - American Indian/Alaskan Native
  • J - Japanese
  • K - Korean
  • L - Laotian
  • O - Other
  • P - Pacific Islander
  • S - Samoan
  • U - Hawaiian
  • V - Vietnamese
  • W - White
  • X - Unknown
  • Z - Asian Indian
'Weapon Desc' | Description of the weapon used (if applicable).
'Status Desc' | Crime status.
'LOCATION' | Street address of the crime.

Objectives:

This project will focus on the following key questions:

  1. Peak Crime Hour: At what hour of the day do crimes occur most frequently? Identifying this will aid in resource planning during high-risk times.
  2. Night Crime Hotspots: Which area experiences the highest number of night crimes, defined as those occurring between 10:00 PM and 3:59 AM? This insight will guide patrol strategies for late hours.
  3. Victim Age Demographics: How are crimes distributed among different age groups? Breaking down incidents by age will help tailor safety initiatives to protect vulnerable populations effectively.

By answering these questions, we aim to paint a clearer picture of criminal activity in Los Angeles, equipping law enforcement with the insights needed to keep the city safe while optimising their response efforts.

Let’s dive into the data and uncover the story behind the numbers.

# Re-run this cell
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
crimes = pd.read_csv("crimes.csv", parse_dates=["Date Rptd", "DATE OCC"], dtype={"TIME OCC": str})
crimes.head()
# Data cleaning: TIME OCC is read as an 'HHMM' string, which is awkward to work with,
# so convert it to datetime (the date part defaults to 1900-01-01 and is ignored)
crimes['TIME OCC'] = pd.to_datetime(crimes['TIME OCC'], format='%H%M')
crimes.head()
crimes['TIME OCC'].dtypes
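
To make the conversion above concrete, here is a minimal sketch on a few made-up 'HHMM' strings (the values are illustrative assumptions, not rows from crimes.csv). It shows that the '%H%M' format splits each string into an hour and a minute, and that pandas fills in a placeholder date.

```python
import pandas as pd

# Toy stand-in for the 'TIME OCC' column (made-up values, kept as strings
# so that leading zeros such as "0005" are preserved)
times = pd.Series(["0005", "1230", "2359"])

# '%H%M' reads two digits of hour followed by two digits of minute
parsed = pd.to_datetime(times, format="%H%M")

# No date is supplied, so the date part is a placeholder (1900-01-01);
# only the time component carries information
hours = parsed.dt.hour.tolist()
minutes = parsed.dt.minute.tolist()

print(hours)    # [0, 12, 23]
print(minutes)  # [5, 30, 59]
```

Reading the column as a string first matters: if pandas inferred it as an integer, "0005" would become 5 and no longer match the four-character format.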

Peak Crime Hour:

At what hour of the day do crimes occur most frequently? Identifying this will aid in resource planning during high-risk times.

# To determine the peak crime hour
peak_crime_hour = crimes['TIME OCC'].dt.hour.value_counts().idxmax()

# Plotting hourly crime frequency
crime_hours = crimes['TIME OCC'].dt.hour.value_counts().sort_index()

plt.figure(figsize= (15, 6))
sns.lineplot(x= crime_hours.index, y= crime_hours)
plt.xlabel('Hour of the day')
plt.ylabel('Crime Frequency')
plt.xticks(range(24)) #Set the x-axis ticks
plt.title('Hourly Crime Frequency')
plt.show()

print(f'From the graph, we can see that the peak crime hour is {peak_crime_hour}:00 (24-hour clock)')
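
The peak hour is an integer on the 24-hour clock, so attaching a fixed "pm" suffix only reads correctly for some hours. As a rough sketch, the snippet below finds the most frequent hour in a made-up series and formats it with a hypothetical helper, `hour_label`, which is not part of pandas or of the original analysis.

```python
import pandas as pd

# Made-up hours of occurrence (illustrative only, not from crimes.csv)
hours = pd.Series([12, 12, 18, 0, 12, 18, 23])

# value_counts() tallies each hour; idxmax() returns the hour with the top count
peak_hour = int(hours.value_counts().idxmax())


def hour_label(hour):
    """Hypothetical helper: turn a 24-hour integer into a 12-hour label."""
    suffix = "AM" if hour < 12 else "PM"
    # hour % 12 is 0 for both midnight and noon, which should display as 12
    return f"{hour % 12 or 12}:00 {suffix}"


print(hour_label(peak_hour))  # 12:00 PM
print(hour_label(0))          # 12:00 AM
print(hour_label(18))         # 6:00 PM
```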

Night Crime Hotspots:

Which area experiences the highest number of night crimes, defined as those occurring between 10:00 PM and 3:59 AM? This insight will guide patrol strategies for late hours.

# Subsetting the dataset to night crimes (10:00 PM to 3:59 AM).
# The window wraps past midnight, so the two hour ranges are combined with OR.
crimes['HOUR OCC'] = crimes['TIME OCC'].dt.hour
night_crimes = crimes.query("`HOUR OCC` >= 22 | `HOUR OCC` < 4")

# Area with the highest number of night crimes
peak_night_crime_location = night_crimes['AREA NAME'].value_counts().idxmax()

# Plotting the graph
night_crime_location = night_crimes['AREA NAME'].value_counts().head(10)
plt.figure(figsize= (15, 8))
sns.barplot(x= night_crime_location.index,
           y= night_crime_location)
plt.xlabel('Los Angeles Police Division')
plt.ylabel('Crime Frequency')
plt.title("Night Crime Hotspots: Top 10 Areas for Late-Night Offences")
plt.xticks(rotation=45, fontsize= 12)
plt.show()
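
Because the night window crosses midnight, a single "between" comparison cannot express it. The sketch below, on a small made-up frame (the area names and hours are assumptions chosen for illustration), shows which hours the OR condition keeps and how the top area falls out of the counts.

```python
import pandas as pd

# Made-up records: only the two columns needed for the filter
toy = pd.DataFrame({
    "AREA NAME": ["Central", "Central", "Hollywood",
                  "Hollywood", "Central", "Hollywood"],
    "HOUR OCC": [22, 3, 4, 21, 0, 23],
})

# 10:00 PM to 3:59 AM means hour 22 or 23, or hour 0 through 3.
# Hour 4 (4:00 AM onward) and hour 21 (9:00 PM) are outside the window.
is_night = (toy["HOUR OCC"] >= 22) | (toy["HOUR OCC"] < 4)
night = toy[is_night]

# Count night crimes per area and pick the area with the most
night_counts = night["AREA NAME"].value_counts()
top_area = night_counts.idxmax()

print(is_night.tolist())  # [True, True, False, False, True, True]
print(top_area)           # Central
```

An equivalent way to write the same mask is `toy["HOUR OCC"].isin([22, 23, 0, 1, 2, 3])`, which some readers find easier to check against the definition.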

Victim Age Demographics:

How are crimes distributed among different age groups? Breaking down incidents by age will help tailor safety initiatives to protect vulnerable populations effectively.

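# Bin victim ages into groups. With right=True each bin includes its upper edge,
# so the first bin is (0, 17] and an age of exactly 0 is left unassigned (NaN).
bins = [0, 17, 25, 34, 44, 54, 64, np.inf]
labels = ["0-17", "18-25", "26-34", "35-44", "45-54", "55-64", "65+"]
crimes['Vict Age Group'] = pd.cut(crimes['Vict Age'], bins=bins, labels=labels, right=True)
victim_ages = crimes['Vict Age Group'].value_counts().sort_index()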

# Plotting the graph
plt.figure(figsize= (15, 8))
sns.barplot(x= victim_ages.index,
           y= victim_ages)
plt.xlabel('Age Groups')
plt.ylabel('Crime Frequency')
plt.title("Crimes Across Generations: Age Demographic Breakdown")
plt.xticks(rotation=45)
plt.show()
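
To check how the bin edges behave, the following sketch applies the same bins and labels to a handful of made-up boundary ages (illustrative values, not taken from the dataset). It also shows `include_lowest=True` as one possible option if ages of 0 should be counted in the first group; whether that is appropriate depends on what an age of 0 means in the data, which the source does not state.

```python
import numpy as np
import pandas as pd

# Made-up ages sitting on or next to the bin edges
ages = pd.Series([0, 17, 18, 25, 26, 64, 65, 99])

bins = [0, 17, 25, 34, 44, 54, 64, np.inf]
labels = ["0-17", "18-25", "26-34", "35-44", "45-54", "55-64", "65+"]

# right=True: intervals are (0, 17], (17, 25], ..., (64, inf]
groups = pd.cut(ages, bins=bins, labels=labels, right=True)

# Same bins, but the first interval also includes its left edge: [0, 17]
groups_with_zero = pd.cut(ages, bins=bins, labels=labels,
                          right=True, include_lowest=True)

print(groups.tolist())
print(groups_with_zero.tolist())
```

In the first result the age 0 comes back as NaN, while 17 lands in "0-17" and 18 in "18-25"; in the second, 0 is assigned to "0-17" as well.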