Which tree species should the city plant?

Introduction

We work for a nonprofit organization advising the planning department on ways to improve the quantity and quality of trees in New York City. The urban design team believes tree size (using trunk diameter as a proxy for size) and health are the most desirable characteristics of city trees. The city would like to learn more about which tree species are the best choice to plant on the streets of Manhattan.

The Data

The data consists of two tables: the 2015 tree census and geographical information on New York City neighborhoods. The names and attributes of the two tables are listed below:

Tree Census
  • "tree_id" - Unique id of each tree.
  • "tree_dbh" - The diameter of the tree in inches measured at 54 inches above the ground.
  • "curb_loc" - Location of the tree bed in relation to the curb. Either along the curb (OnCurb) or offset from the curb (OffsetFromCurb).
  • "spc_common" - Common name for the species.
  • "status" - Indicates whether the tree is alive or standing dead.
  • "health" - Indication of the tree's health (Good, Fair, and Poor).
  • "root_stone" - Indicates the presence of a root problem caused by paving stones in the tree bed.
  • "root_grate" - Indicates the presence of a root problem caused by metal grates in the tree bed.
  • "root_other" - Indicates the presence of other root problems.
  • "trunk_wire" - Indicates the presence of a trunk problem caused by wires or rope wrapped around the trunk.
  • "trnk_light" - Indicates the presence of a trunk problem caused by lighting installed on the tree.
  • "trnk_other" - Indicates the presence of other trunk problems.
  • "brch_light" - Indicates the presence of a branch problem caused by lights or wires in the branches.
  • "brch_shoe" - Indicates the presence of a branch problem caused by shoes in the branches.
  • "brch_other" - Indicates the presence of other branch problems.
  • "postcode" - Five-digit zip code where the tree is located.
  • "nta" - Neighborhood Tabulation Area (NTA) code from the 2010 US Census for the tree.
  • "nta_name" - Neighborhood name.
  • "latitude" - Latitude of the tree, in decimal degrees.
  • "longitude" - Longitude of the tree, in decimal degrees.
Neighborhoods' geographical information
  • "ntacode" - NTA code (matches Tree Census information).
  • "ntaname" - Neighborhood name (matches Tree Census information).
  • "geometry" - Polygon that defines the neighborhood.

The two tables share a common attribute: ntacode in the Neighborhoods table and nta in the Tree Census table. We can join the two tables on these attributes.
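The join can be sketched in plain Python as follows. This is a minimal illustration, not the code used for the analysis: the rows are made-up examples, and join_on_nta is a hypothetical helper name. It assumes each table has been loaded as a list of dictionaries keyed by the column names listed above.

```python
# Made-up example rows; the real tables have the columns listed above.
trees = [
    {"tree_id": 1, "spc_common": "honeylocust", "nta": "MN12"},
    {"tree_id": 2, "spc_common": "ginkgo", "nta": "MN12"},
    {"tree_id": 3, "spc_common": "pin oak", "nta": "XX00"},  # no matching neighborhood
]
neighborhoods = [
    {"ntacode": "MN12", "ntaname": "Upper West Side"},
    {"ntacode": "QN00", "ntaname": "Example neighborhood without trees"},
]


def join_on_nta(trees, neighborhoods):
    """Inner join: keep each tree whose nta equals some neighborhood's ntacode."""
    by_code = {n["ntacode"]: n for n in neighborhoods}
    joined = []
    for tree in trees:
        match = by_code.get(tree["nta"])
        if match is not None:
            # Combine the tree's columns with the neighborhood's columns.
            joined.append({**tree, **match})
    return joined


joined = join_on_nta(trees, neighborhoods)
for row in joined:
    print(row["tree_id"], row["spc_common"], row["ntaname"])
```

Trees whose code has no counterpart in the Neighborhoods table are dropped by this inner join. A left join would keep them with empty neighborhood fields instead.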

Data Validation

Trees Table

The table contains 64,229 rows and 20 columns. There are no missing values and no duplicated rows. Details on some of the attributes:

  • There are 128 unique tree species, as expected.
  • The maximum and minimum tree diameters in this data are 318 and 0 inches. Because the diameter is measured at 54 inches above the ground, a value of 0 is accepted as indicating a tree that is too small to measure at that height.
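The checks above (shape, missing values, duplicated rows, number of species, diameter range) could be reproduced with a small helper along these lines. This is a sketch under assumptions: validate is a hypothetical function name, the three rows are made-up examples, and a cell is treated as missing when it is None or an empty string.

```python
def validate(rows):
    """Summarize basic quality checks for a list of row dictionaries."""
    # A cell counts as missing if it is None or an empty string (assumption).
    n_missing = sum(1 for r in rows for v in r.values() if v is None or v == "")

    # A row is a duplicate if an identical row appeared earlier.
    seen = set()
    n_duplicates = 0
    for r in rows:
        fingerprint = tuple(sorted(r.items()))
        if fingerprint in seen:
            n_duplicates += 1
        else:
            seen.add(fingerprint)

    diameters = [r["tree_dbh"] for r in rows]
    return {
        "rows": len(rows),
        "columns": len(rows[0]) if rows else 0,
        "missing": n_missing,
        "duplicates": n_duplicates,
        "species": len({r["spc_common"] for r in rows}),
        "dbh_min": min(diameters),
        "dbh_max": max(diameters),
    }


# Made-up example rows with only three of the twenty columns.
sample = [
    {"tree_id": 1, "tree_dbh": 11, "spc_common": "honeylocust"},
    {"tree_id": 2, "tree_dbh": 0, "spc_common": "ginkgo"},
    {"tree_id": 3, "tree_dbh": 4, "spc_common": "honeylocust"},
]
summary = validate(sample)
print(summary)
```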

Neighborhoods Table

The table contains 195 rows and 7 columns. There are no missing values and no duplicated rows. Details on some of the attributes:

  • The neighborhoods are grouped into five boroughs: Queens, Brooklyn, Bronx, Manhattan, and Staten Island.
  • The NTA codes and neighborhood names match the Tree Census information.

Data Discovery and Visualization

We investigate the data through the following questions:

  • What are the most common tree species in Manhattan?
  • Which are the neighborhoods with the most trees?
  • A visualization of Manhattan's neighborhoods and tree locations.
  • What ten tree species would you recommend the city plant in the future?

What are the most common tree species in Manhattan?

The most common species in Manhattan is honeylocust. To show how the other species compare, we also created a visualization of the top 10 species in Manhattan. As the graph shows, honeylocust is far more numerous than any other species.
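A ranking like this amounts to counting rows per spc_common value and keeping the largest counts. The sketch below uses the standard library's Counter. The rows are made-up examples (their counts are not census figures), and top_species is a hypothetical helper name.

```python
from collections import Counter


def top_species(rows, n=10):
    """Return the n most frequent species as (name, count) pairs."""
    counts = Counter(r["spc_common"] for r in rows)
    return counts.most_common(n)


# Made-up example rows; the counts here are not census figures.
sample = [
    {"spc_common": "honeylocust"},
    {"spc_common": "ginkgo"},
    {"spc_common": "honeylocust"},
    {"spc_common": "pin oak"},
    {"spc_common": "ginkgo"},
    {"spc_common": "honeylocust"},
]
ranking = top_species(sample, n=2)
print(ranking)
```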

Which are the neighborhoods with the most trees?

When we merge the trees and neighborhoods datasets, neighborhoods outside Manhattan have no matching trees. Within Manhattan, the Upper West Side has the highest number of trees, as shown below; we also show the number of trees in 10 other neighborhoods.
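One way to see both effects at once, the ranking and the neighborhoods with no trees, is to count trees per NTA code and look each neighborhood up in that count. The following is a sketch with made-up rows and a hypothetical helper name, trees_per_neighborhood. It is not the code behind the chart.

```python
from collections import Counter


def trees_per_neighborhood(trees, neighborhoods):
    """Return (ntaname, tree count) pairs, largest count first.

    Neighborhoods with no matching tree are kept with a count of 0.
    """
    counts = Counter(t["nta"] for t in trees)
    table = [(n["ntaname"], counts.get(n["ntacode"], 0)) for n in neighborhoods]
    return sorted(table, key=lambda pair: pair[1], reverse=True)


# Made-up example rows.
trees = [{"nta": "MN12"}, {"nta": "MN12"}, {"nta": "MN40"}]
neighborhoods = [
    {"ntacode": "MN40", "ntaname": "Example Manhattan neighborhood"},
    {"ntacode": "MN12", "ntaname": "Upper West Side"},
    {"ntacode": "QN00", "ntaname": "Example neighborhood outside Manhattan"},
]
ranking = trees_per_neighborhood(trees, neighborhoods)
without_trees = [name for name, count in ranking if count == 0]
print(ranking[0], without_trees)
```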

A visualization of Manhattan's neighborhoods and tree locations.

We built the map of Manhattan and the tree locations with Folium and uploaded it separately because of storage restrictions in the DataCamp workspace; the map is available as an HTML page via the link below. We color the trees on the map according to their health: green for "Good", yellow for "Fair", and red for "Poor". We also scale the size of each point by the tree's diameter. Clicking a point shows the tree's status, health, and problems.
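The styling rule can be sketched as a function that turns one tree record into marker settings. The color mapping comes from the description above. The gray fallback, the min_radius and scale values, and the marker_style name are assumptions made for illustration. A dictionary like this could then be handed to a circle marker in a mapping library such as Folium.

```python
HEALTH_COLORS = {"Good": "green", "Fair": "yellow", "Poor": "red"}


def marker_style(tree, min_radius=2.0, scale=0.25):
    """Build marker settings for one tree record.

    min_radius and scale are illustrative values, not those used for the map.
    """
    # Fall back to gray for any health value outside the three categories (assumption).
    color = HEALTH_COLORS.get(tree["health"], "gray")
    # Larger trunk diameter gives a larger point.
    radius = min_radius + scale * tree["tree_dbh"]
    popup = f"status: {tree['status']}, health: {tree['health']}"
    return {
        "location": (tree["latitude"], tree["longitude"]),
        "radius": radius,
        "color": color,
        "popup": popup,
    }


# Made-up example record.
tree = {
    "tree_dbh": 12,
    "status": "Alive",
    "health": "Good",
    "latitude": 40.78,
    "longitude": -73.97,
}
style = marker_style(tree)
print(style)
```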

According to the map, far more trees are in good health than in fair or poor health. The trees along the river appear to be in good health.

Link to the Manhattan and tree location map built with Folium