Manhattan Trees
Introduction
New York City is one of the most beautiful cities in the world, and Central Park is one of its main attractions. Located in Manhattan, the park is known for its scale and location, an island of nature in the stone jungle. In this study, we will answer the following questions:
- What are the most common tree species in Manhattan?
- Which are the neighborhoods with the most trees?
- A visualization of Manhattan's neighborhoods and tree locations.
- What ten tree species would you recommend the city plant in the future?
Import data
First, we need to import the data we will work with. The tree census data is freely available from the NYC Open Data website.
#Import libraries
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
import contextily as cx
#Import Data from NYC Open Data
trees = pd.read_csv('data/trees.csv')
#Convert DataFrame to GeoDataFrame
df_trees = pd.DataFrame(trees)
gdf_trees = gpd.GeoDataFrame(df_trees, geometry=gpd.points_from_xy(df_trees.longitude, df_trees.latitude))
gdf_trees.crs = 'EPSG:4326'
#Import ShapeFile for Zone Boundaries
neighborhoods = gpd.read_file('data/nta.shp')
manh_neighborhoods = neighborhoods[neighborhoods['boroname'] == "Manhattan"]
Analysis of the most common tree in Manhattan
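As a small illustration of the counting and grouping done in this section, the sketch below runs the same idea on a made-up list of species. The sample data, the helper name `species_shares`, and its `threshold` argument are assumptions for illustration only; they are not part of the census data or of the analysis code that follows.

```python
import pandas as pd

def species_shares(species, threshold=5.0):
    """Return the percentage share per species, lumping small ones into 'Other'.

    Hypothetical helper: species whose share is at or below `threshold`
    percent are summed into a single 'Other' entry.
    """
    # Percentage of all trees for each species, largest first
    shares = species.value_counts(normalize=True) * 100
    major = shares[shares > threshold]
    other = shares[shares <= threshold].sum()
    if other > 0:
        major = pd.concat([major, pd.Series({'Other': other})])
    return major.round(3)

# Made-up sample of 10 trees, not real census data
sample = pd.Series(['Honeylocust'] * 5 + ['Callery pear'] * 3 + ['Ginkgo', 'Sophora'])
shares = species_shares(sample, threshold=15.0)
print(shares)
```

With the real data, the same helper could be called as `species_shares(df_trees.spc_common)`, which would keep only species above 5 percent as separate pie slices.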
#Count the number of trees of each type and convert the result to a DataFrame
#(rename_axis/reset_index gives the same column names in older and newer pandas versions)
df_manhattan_tree_types = df_trees.spc_common.value_counts().rename_axis('Name of Tree').reset_index(name='Number of Trees')
#Keep the default integer index as a column as well
df_manhattan_tree_types['index'] = df_manhattan_tree_types.index
#Find the index of the row with the largest number of trees
indmax = df_manhattan_tree_types['Number of Trees'].idxmax()
#Reorder columns
df_manhattan_tree_types = df_manhattan_tree_types[['index','Name of Tree','Number of Trees']]
#Print out the result
print('Q1: What are the most common tree species in Manhattan?')
print('A1: In Manhattan, the most common tree type is ' + df_manhattan_tree_types['Name of Tree'][indmax] + ' with ' + str(df_manhattan_tree_types['Number of Trees'][indmax]) + ' trees.')
#Add Column with percentage of trees
amount_of_trees = df_manhattan_tree_types['Number of Trees'].sum()
df_manhattan_tree_types['Percentage'] = round(df_manhattan_tree_types['Number of Trees']/amount_of_trees*100,3)
#Create the lists plt_perc_name for tree types and plt_perc for tree percentages
plt_perc_name = []
plt_perc = []
#Fill the lists with tree types that cover more than 5 percent of all trees; the rest are summed into 'Other'
for index in df_manhattan_tree_types['index']:
    if df_manhattan_tree_types['Percentage'][index] > 5:
        plt_perc_name.append(df_manhattan_tree_types['Name of Tree'][index])
        plt_perc.append(df_manhattan_tree_types['Percentage'][index])
plt_perc_name.append('Other')
plt_percc = df_manhattan_tree_types['Percentage']
#Use <= so that a type at exactly 5 percent is not dropped from the chart
plt_perc.append(plt_percc[plt_percc<=5].sum())
#Create a dictionary
plt_fin = {}
#Import to dictionary from lists
plt_fin['Name'] = plt_perc_name
plt_fin['Percentage'] = plt_perc
#DataFrame from dictionary
df_plt_fin = pd.DataFrame(plt_fin)
#Plot a pie chart
fig1, ax1 = plt.subplots()
ax1.pie(df_plt_fin['Percentage'], explode=df_plt_fin['Percentage']/100, labels=df_plt_fin['Name'], startangle=90, autopct='%1.1f%%')
plt.title('Tree type distribution in Manhattan')
plt.show()
#Count the number of trees per neighborhood and convert the result to a DataFrame
#(rename_axis/reset_index gives the same column names in older and newer pandas versions)
df_nta_trees = trees.nta_name.value_counts().rename_axis('Neighborhood').reset_index(name='Number of Trees')
#Find the index of the row with the largest number of trees
indmax = df_nta_trees['Number of Trees'].idxmax()
#Print out the results
print('Q2: Which are the neighborhoods with the most trees?')
print('A2: The ' + df_nta_trees['Neighborhood'][indmax] + ' neighborhood has the largest number of trees (' + str(df_nta_trees['Number of Trees'][indmax]) + ').')
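Since Q2 asks about neighborhoods in the plural, it may help to look at the top few rather than only the maximum. The following is a minimal sketch, assuming an `nta_name` column as in the census data; the sample values and the helper name `top_neighborhoods` are hypothetical.

```python
import pandas as pd

def top_neighborhoods(nta_names, n=3):
    """Return the n neighborhoods with the most trees as a tidy table (hypothetical helper)."""
    # value_counts sorts from most to least frequent, so head(n) keeps the leaders
    counts = nta_names.value_counts().head(n)
    return counts.rename_axis('Neighborhood').reset_index(name='Number of Trees')

# Made-up sample, not real census data: one entry per tree
sample = pd.Series(['A', 'B', 'A', 'C', 'A', 'B', 'D'], name='nta_name')
top = top_neighborhoods(sample, n=2)
print(top)
```

With the real data, `top_neighborhoods(trees.nta_name, n=5)` would list the five leading neighborhoods.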
#QA
print('Q3: A visualization of Manhattan neighborhoods and tree locations.')
print('A3: Visualization shown below')
#Plot a map
title = 'Map of tree distribution in Manhattan'
fig, ax = plt.subplots(figsize=(16, 16))
plt.title(title)
gdf_trees.plot(ax=ax, color = 'green', markersize = 0.1)
manh_neighborhoods.plot(ax=ax, color = 'none', edgecolor='red', linewidth=3)
cx.add_basemap(ax, crs=gdf_trees.crs)
Q4: What ten tree species would you recommend the city plant in the future?
#Filter by tree resilience
opt1 = np.logical_and(df_trees['root_stone'] == 'No',df_trees['root_grate'] == 'No')
opt2 = np.logical_and(df_trees['root_other'] == 'No',df_trees['trunk_wire'] == 'No')
opt3 = np.logical_and(df_trees['trnk_light'] == 'No',df_trees['trnk_other'] == 'No')
opt4 = np.logical_and(df_trees['brch_light'] == 'No',df_trees['brch_shoe'] == 'No')
opt5 = np.logical_and(df_trees['brch_other'] == 'No',df_trees['health'] == 'Good')
opt11 = np.logical_and(opt1, opt2)
opt22 = np.logical_and(opt3, opt4)
opt33 = np.logical_and(opt11,opt22)
opt44 = np.logical_and(opt33,opt5)
one_tree = df_trees[opt44]
#Count the remaining trees by type and convert the result to a DataFrame
#(rename_axis/reset_index gives the same column names in older and newer pandas versions)
df_manhattan_tree_types = one_tree.spc_common.value_counts().rename_axis('Name of Tree').reset_index(name='Number of Trees')
#Show the 10 least common tree types.
print(df_manhattan_tree_types[-10:])
A4: The data were first filtered for tree hardiness: only trees in good health with no recorded root, trunk, or branch problems were kept. The 10 species with the fewest remaining trees, shown in the table above, were then selected.
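The hardiness filter for Q4 can also be written more compactly by listing the problem columns once and requiring all of them to be 'No'. This is a sketch of an equivalent formulation, not the code used in the analysis; the column names are taken from the Q4 filter, while the three-row sample and the names `problem_cols` and `hardy_trees` are made up for illustration.

```python
import pandas as pd

# Problem columns taken from the Q4 filter
problem_cols = ['root_stone', 'root_grate', 'root_other',
                'trunk_wire', 'trnk_light', 'trnk_other',
                'brch_light', 'brch_shoe', 'brch_other']

def hardy_trees(df):
    """Keep trees in good health with 'No' in every problem column (hypothetical helper)."""
    no_problems = df[problem_cols].eq('No').all(axis=1)
    return df[no_problems & (df['health'] == 'Good')]

# Made-up sample rows, not real census data
row_ok = {c: 'No' for c in problem_cols}
row_ok.update(spc_common='Ginkgo', health='Good')
row_bad = dict(row_ok, spc_common='Sophora', root_stone='Yes')
row_fair = dict(row_ok, spc_common='Pin oak', health='Fair')
sample = pd.DataFrame([row_ok, row_bad, row_fair])

hardy = hardy_trees(sample)
print(hardy['spc_common'].tolist())
```

Keeping the column names in one list makes it easier to see which problems are screened out and to add or remove a criterion later.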