Which tree species should the city plant?
📖 Background
You work for a nonprofit organization advising the planning department on ways to improve the quantity and quality of trees in New York City. The urban design team believes tree size (using trunk diameter as a proxy for size) and health are the most desirable characteristics of city trees.
The city would like to learn more about which tree species are the best choice to plant on the streets of Manhattan.
💾 The data
The team has provided access to the 2015 tree census and geographical information on New York City neighborhoods (trees, neighborhoods):
Tree Census
- "tree_id" - Unique id of each tree.
- "tree_dbh" - The diameter of the tree in inches measured at 54 inches above the ground.
- "curb_loc" - Location of the tree bed in relation to the curb. Either along the curb (OnCurb) or offset from the curb (OffsetFromCurb).
- "spc_common" - Common name for the species.
- "status" - Indicates whether the tree is alive or standing dead.
- "health" - Indication of the tree's health (Good, Fair, and Poor).
- "root_stone" - Indicates the presence of a root problem caused by paving stones in the tree bed.
- "root_grate" - Indicates the presence of a root problem caused by metal grates in the tree bed.
- "root_other" - Indicates the presence of other root problems.
- "trunk_wire" - Indicates the presence of a trunk problem caused by wires or rope wrapped around the trunk.
- "trnk_light" - Indicates the presence of a trunk problem caused by lighting installed on the tree.
- "trnk_other" - Indicates the presence of other trunk problems.
- "brch_light" - Indicates the presence of a branch problem caused by lights or wires in the branches.
- "brch_shoe" - Indicates the presence of a branch problem caused by shoes in the branches.
- "brch_other" - Indicates the presence of other branch problems.
- "postcode" - Five-digit zip code where the tree is located.
- "nta" - Neighborhood Tabulation Area (NTA) code from the 2010 US Census for the tree.
- "nta_name" - Neighborhood name.
- "latitude" - Latitude of the tree, in decimal degrees.
- "longitude" - Longitude of the tree, in decimal degrees.
Neighborhoods' geographical information
- "ntacode" - NTA code (matches Tree Census information).
- "ntaname" - Neighborhood name (matches Tree Census information).
- "geometry" - Polygon that defines the neighborhood.
Tree census and neighborhood information from the City of New York NYC Open Data.
Description of the dataset
The "Tree Census" dataset contains information about 64,229 trees located in Manhattan's neighbourhoods. Of these, 1,802 are dead (status = "Dead"). Dead trees have no species information, so they are excluded from the rest of the analysis.
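The exclusion step described above could look roughly like the following pandas sketch. The tiny `trees` table is an invented stand-in for the census (only the column names `status` and `spc_common` and the value "Dead" come from the data description), so treat it as an illustration rather than the exact code used for this report.

```python
import pandas as pd

# Toy stand-in for the tree census; the rows are invented for illustration.
trees = pd.DataFrame({
    "tree_id": [1, 2, 3, 4],
    "status": ["Alive", "Dead", "Alive", "Dead"],
    "spc_common": ["honeylocust", None, "ginkgo", None],
})

# Count dead trees and check that none of them has a species recorded.
dead = trees[trees["status"] == "Dead"]
n_dead = len(dead)
dead_without_species = bool(dead["spc_common"].isna().all())

# Keep everything that is not marked "Dead" for the rest of the analysis.
alive = trees[trees["status"] != "Dead"].reset_index(drop=True)
```

Checking `dead_without_species` before dropping the rows is a cheap way to confirm that no species information is lost by the exclusion.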
According to the geographical dataset, Manhattan has 29 neighbourhoods, but one of them (MN99, "park-cemetery-etc-Manhattan") has no tree records. The mapping of neighbourhood codes to names is given in Appendix 1.
Number of alive trees per neighbourhood
First, the number of alive trees in each neighbourhood was analysed. The results are summarised in the figure below.
As the figure shows, Upper West Side (MN12) has the most trees: 5,723, which is 2.57 times the average number of trees per neighbourhood. Upper East Side-Carnegie Hill (MN40), West Village (MN23) and Central Harlem North-Polo Grounds (MN03) also have notably large numbers of trees.
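As a rough sketch of how the per-neighbourhood counts and the "times the average" figure can be obtained, the snippet below counts rows per `nta` code and divides by the mean count. The neighbourhood codes `A`, `B` and `C` and their counts are hypothetical toy values, not census results.

```python
import pandas as pd

# Toy table of alive trees; "A", "B", "C" are hypothetical neighbourhood codes.
alive = pd.DataFrame({"nta": ["A"] * 5 + ["B"] * 3 + ["C"] * 1})

# Number of alive trees per neighbourhood, largest first.
counts = alive["nta"].value_counts()

# How each neighbourhood compares with the average neighbourhood.
ratio_to_mean = counts / counts.mean()
```

In this toy example the average is 3 trees per neighbourhood, so `A` has 5/3 times the average and `B` sits exactly at it.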
However, the neighbourhoods differ in area, so smaller neighbourhoods can be expected to have fewer trees than larger ones. The plot below shows the relationship between the number of alive trees in a neighbourhood and its area.
The figure supports this hypothesis. For example, Stuyvesant Town-Cooper Village (MN50), which has the fewest alive trees, is also the smallest neighbourhood, while Upper West Side (MN12), which has the most trees, is one of the largest. The linear trendline also shows a positive correlation between a neighbourhood's area and its number of trees.
Therefore, to identify the "greenest" neighbourhoods in Manhattan, it is preferable to compare tree counts in relative terms.
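The area check above can be reproduced in outline with a Pearson correlation and a least-squares line. The sketch below assumes the areas have already been computed in km² from the neighbourhood polygons; the column names `area_km2` and `n_trees`, the codes `N1` to `N4` and all numbers are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical per-neighbourhood summary (toy numbers, not census results).
per_nta = pd.DataFrame({
    "ntacode": ["N1", "N2", "N3", "N4"],
    "area_km2": [1.0, 2.0, 3.0, 4.0],
    "n_trees": [1000, 2100, 2900, 4000],
})

# Pearson correlation between area and number of alive trees.
r = per_nta["area_km2"].corr(per_nta["n_trees"])

# Linear trendline n_trees ~ slope * area_km2 + intercept (least squares).
slope, intercept = np.polyfit(per_nta["area_km2"], per_nta["n_trees"], 1)
```

A positive `r` and a positive `slope` correspond to the upward trendline described in the text.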
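Let's define a new variable, "Density", as the number of trees per km². The density of neighbourhood $i$ is

$$D_i = \frac{N_i}{A_i},$$

where $N_i$ is the number of alive trees in neighbourhood $i$ and $A_i$ is its area in km².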
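A minimal sketch of the density calculation follows: the number of alive trees is divided by the area in km², and the neighbourhoods are then ranked. As before, `n_trees`, `area_km2`, the codes `N1` to `N4` and the values are hypothetical; with real polygons the area would first have to be measured in a projected coordinate system and converted to km².

```python
import pandas as pd

# Hypothetical per-neighbourhood summary (toy numbers, not census results).
per_nta = pd.DataFrame({
    "ntacode": ["N1", "N2", "N3", "N4"],
    "n_trees": [5000, 3000, 1200, 800],
    "area_km2": [5.0, 2.0, 3.0, 0.5],
}).set_index("ntacode")

# Density = number of alive trees per square kilometre.
per_nta["density"] = per_nta["n_trees"] / per_nta["area_km2"]

# Three densest neighbourhoods and the least dense one.
top3 = per_nta.nlargest(3, "density").index.tolist()
lowest = per_nta["density"].idxmin()
```

Note that in this toy example `N4` has the fewest trees but the highest density, which is exactly the effect that ranking by density is meant to capture.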
The figure below presents the tree density in each neighbourhood. By this measure, the three "greenest" neighbourhoods are Upper East Side-Carnegie Hill (MN40), Central Harlem South (MN11) and Upper West Side (MN12).
The lowest density is in Midtown-Midtown South (MN17): 399 trees per km².
Geographically, the "greenest" neighbourhoods are clustered in the centre of Manhattan, around Central Park.