Analyzing Crime in Los Angeles by Data Sage Analytics
Los Angeles, California—the City of Angels, Tinseltown, and the Entertainment Capital of the World! Known for its sunny weather, iconic palm trees, and sprawling coastline, this bustling metropolis is also the home of Hollywood. But beyond the glamour and bright lights, like any major city, Los Angeles faces the challenges of crime.
Your Mission: Assist the Los Angeles Police Department (LAPD) by analyzing crime data to uncover patterns in criminal behavior. Your insights will play a crucial role in helping the LAPD allocate resources more effectively to combat crime in various neighborhoods.
The Data
The LAPD has provided a dataset that serves as the backbone of this analysis. Below is a summary of the key attributes included in the dataset.
This is a refined version of the original data available from Los Angeles Open Data.
File: crimes.csv
| Column | Description |
|---|---|
| 'DR_NO' | Division of Records Number: A unique file number comprising a 2-digit year, area ID, and 5 additional digits. |
| 'Date Rptd' | Date reported in MM/DD/YYYY format. |
| 'DATE OCC' | Date of occurrence in MM/DD/YYYY format. |
| 'TIME OCC' | Time of occurrence in 24-hour military time format. |
| 'AREA NAME' | Name of the geographical area or patrol division, often named after a local landmark or community. For instance, the 77th Street Division, located at South Broadway and 77th Street, serves neighborhoods in South Los Angeles. |
| 'Crm Cd Desc' | Description of the crime committed. |
| 'Vict Age' | Age of the victim in years. |
| 'Vict Sex' | Sex of the victim: F for Female, M for Male, X for Unknown. |
| 'Vict Descent' | Descent of the victim. |
| 'Weapon Desc' | Description of the weapon used, if applicable. |
| 'Status Desc' | Current status of the crime case. |
| 'LOCATION' | Street address where the crime occurred. |
About the Data Scientist
This analysis, titled "Analyzing Crime in Los Angeles," was conducted by Erick Bryan Cubas, a seasoned Data Scientist and the Founder of Data Sage Analytics. The objective is to identify patterns such as the hours with the highest frequency of crimes, the areas most prone to nighttime crimes, and the distribution of crimes across different victim age groups.
- LinkedIn: Erick Bryan Cubas
- GitHub: Erick Bryan Cubas
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
crimes = pd.read_csv("crimes.csv", parse_dates=["Date Rptd", "DATE OCC"], dtype={"TIME OCC": str})
crimes.head()
Explore the crimes.csv dataset and use your findings to answer the following questions:
1 - Which hour has the highest frequency of crimes? Store as an integer variable called peak_crime_hour:
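Before running this on the full dataset, the hour extraction can be checked on a handful of hand-made values. The following is a minimal sketch, assuming 'TIME OCC' holds zero-padded "HHMM" strings; the toy values and the names toy and toy_peak_hour are invented for illustration and do not come from crimes.csv.

```python
import pandas as pd

# Toy values invented for illustration; not taken from crimes.csv
toy = pd.DataFrame({"TIME OCC": ["0005", "0930", "1200", "1215", "2359"]})

# Integer-divide the HHMM value by 100 to drop the minutes and keep the hour
toy["HOUR OCC"] = toy["TIME OCC"].astype(int) // 100

# value_counts() tallies rows per hour; idxmax() returns the most frequent hour
toy_peak_hour = int(toy["HOUR OCC"].value_counts().idxmax())
print(toy["HOUR OCC"].tolist(), toy_peak_hour)
```

Wrapping the result in int() converts the NumPy integer returned by idxmax() into a plain Python integer.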
crimes['HOUR OCC'] = (crimes['TIME OCC'].astype(int)) // 100
# idxmax() returns a NumPy integer; int() stores it as a plain Python integer
peak_crime_hour = int(crimes['HOUR OCC'].value_counts().idxmax())
print(f'Hour with the highest frequency of crimes: {peak_crime_hour}')
crimes.info()
2 - Which area has the highest frequency of night crimes (crimes committed between 10pm and 3:59am)? Store as a string variable called peak_night_crime_location:
night_crimes = crimes[(crimes['HOUR OCC'] >= 22) | (crimes['HOUR OCC'] < 4)]
night_crimes.info()
number_crimes_location = night_crimes.groupby('AREA NAME')['DR_NO'].nunique()
df_ncl = number_crimes_location.to_frame()
peak_night_crime_location = number_crimes_location.idxmax()
print(f'Area with the highest frequency of nighttime crimes: {peak_night_crime_location}')
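The night-time window used above wraps around midnight, so the filter has to join its two hour conditions with OR; joining them with AND would match no rows at all. The following is a small sketch of that logic on made-up rows. The area names "A" and "B", the report numbers, and the toy_ variables are placeholders for illustration only.

```python
import pandas as pd

# Toy rows invented for illustration; area names are placeholders
toy = pd.DataFrame({
    "DR_NO": [1, 2, 3, 4, 5, 6, 7],
    "AREA NAME": ["A", "A", "B", "B", "B", "A", "B"],
    "HOUR OCC": [22, 3, 23, 0, 12, 1, 4],
})

# 22:00-03:59 spans midnight: an hour qualifies if it is late (>= 22) OR early (< 4)
is_night = (toy["HOUR OCC"] >= 22) | (toy["HOUR OCC"] < 4)
toy_night = toy[is_night]

# Count distinct report numbers per area, then pick the area with the most
toy_counts = toy_night.groupby("AREA NAME")["DR_NO"].nunique()
toy_peak_area = toy_counts.idxmax()
print(toy_counts.to_dict(), toy_peak_area)
```

Hour 4 is deliberately included in the toy data to show that it falls outside the window, since the filter uses a strict `< 4`.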
3 - Identify the number of crimes committed against victims of different age groups. Save as a pandas Series called victim_ages, with age group labels "0-17", "18-25", "26-34", "35-44", "45-54", "55-64", and "65+" as the index and the frequency of crimes as the values.
# Open-ended upper edge so the "65+" group also captures ages above 100
bins = [0, 17, 25, 34, 44, 54, 64, np.inf]
labels = ["0-17", "18-25", "26-34", "35-44", "45-54", "55-64", "65+"]
crimes['AGE GROUP'] = pd.cut(crimes['Vict Age'], bins=bins, labels=labels, right=True)
crimes.info()
victim_ages = crimes.groupby('AGE GROUP', observed=False)['DR_NO'].nunique().sort_index()
df_victim_ages = victim_ages.to_frame()
print(f'Crime frequency by age group:\n{victim_ages}')
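To see how pd.cut assigns ages at the bin boundaries, it can help to run it on a few edge cases. The following is a sketch on invented ages, assuming right-closed bins and an open-ended last edge (np.inf) so that "65+" has no upper limit; the toy_ names are hypothetical.

```python
import numpy as np
import pandas as pd

# Ages invented for illustration, chosen to sit on the bin boundaries
toy_ages = pd.Series([17, 18, 25, 26, 64, 65, 101])

# Assumption: the last edge is np.inf so the "65+" label has no upper limit
bins = [0, 17, 25, 34, 44, 54, 64, np.inf]
labels = ["0-17", "18-25", "26-34", "35-44", "45-54", "55-64", "65+"]

# right=True makes each bin include its right edge, e.g. (17, 25] holds 18 through 25
toy_groups = pd.cut(toy_ages, bins=bins, labels=labels, right=True)

# sort_index() orders the counts by age group rather than by frequency
toy_counts = toy_groups.value_counts().sort_index()
print(toy_counts)
```

With right-closed bins, the first interval is (0, 17], so an age of exactly 0 would not fall into any group unless include_lowest=True is passed to pd.cut.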