Manhattan's Urban Forestry Report, 2015
Introduction
The urban design team considers tree size (measured by trunk diameter) and health to be the most desirable characteristics of city trees. To help the planning department improve the quantity and quality of trees in New York City, our organization was asked to provide a data analysis report.
Objectives
The main objective of this report is to profile Manhattan's tree population and species by different attributes using summary statistics, visualizations, and textual explanations. Specifically, it aims to:
1. Describe all censused trees by their spatial and biological characteristics.
2. Map the tree profile of the neighborhoods.
3. Illustrate the biodiversity and biology of the tree species in Manhattan.
4. Determine tree species with the best traits.
Data Used
The following data sets come from the City of New York's NYC Open Data portal.
Trees
A data set based on the "TreesCount!
See the list of variables and their descriptions here.
Neighborhoods
A data set based on the "boundaries of Neighborhood Tabulation Areas as created by the NYC Department of City Planning using whole census tracts from the
See the list of variables and their descriptions here.
Executive Summary
Using the data available and findings of the analyses, the tree population and species of Manhattan, New York City, can be summarized as follows:
- Greater numbers of trees are most likely to be found in neighborhoods with larger land areas.
- The majority of the trees in Manhattan are on-curb, with only a few offset from the curb.
- The majority of the trees in Manhattan are alive and in fair to good health; only a small number are dead or in poor health.
- Specific root, trunk, and branch problems are not a significant concern, although a few trees are affected by paving stones in the tree bed (a kind of root problem) and by other unspecified trunk and branch problems.
- Manhattan has a rich and diverse set of tree species.
- The species recommended for planting along Manhattan's streets are a combination of the borough's highly and moderately abundant species that have shown favorable size and health. Specifically, the top five recommended species are as follows:
  - Siberian elm
  - Willow oak
  - Honeylocust
  - American elm
  - Pin oak
Results & Discussion
Tree Population
Using descriptive and spatial analyses, the following information outlines the location and physical attributes of all Manhattan trees in
Spatial
Tree Locations by Neighborhood:
While trees seem to cover much each of Manhattan's
- *Hudson Yards-Chelsea-Flatiron-Union Square (MN13)
- Upper West Side (MN12)
- *Midtown-Midtown South (MN17)
- Central Harlem North-Polo Grounds (MN03)
- West Village (MN23)
- *SoHo-TriBeCa-Civic Center-Little Italy (MN24)
- East Harlem North (MN34)
- *Lower East Side (MN28)
- Washington Heights South (MN36)
- Washington Heights North (MN35)
# ---------- Packages & Datasets
# Load pre-installed, required packages
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(sf))
suppressPackageStartupMessages(library(geojsonsf))
suppressPackageStartupMessages(library(scales))
# Install & load the 'rwantshue' package for generating random color scheme
suppressWarnings(suppressMessages(install.packages("remotes", quiet=TRUE)))
# No token is hard-coded here: the repository is public, and remotes reads GITHUB_PAT from the environment if one is set
suppressWarnings(suppressMessages(remotes::install_github("hoesler/rwantshue", quiet=TRUE)))
suppressPackageStartupMessages(library(rwantshue))
# Install & load the 'ggfun' package for round rectangle borders and backgrounds in ggplots
suppressWarnings(suppressMessages(install.packages("ggfun", quiet=TRUE)))
suppressPackageStartupMessages(library(ggfun))
# Install & load the 'ggchicklet' package for bar charts with rounded corners
# No token is hard-coded here: the repository is public, and remotes reads GITHUB_PAT from the environment if one is set
suppressWarnings(suppressMessages(remotes::install_github("hrbrmstr/ggchicklet", quiet=TRUE)))
suppressPackageStartupMessages(library(ggchicklet))
# Read the 'trees' data set from the CSV file
trees <- readr::read_csv("data/trees.csv", show_col_types=FALSE) %>%
mutate(spc_common = str_to_sentence(spc_common))
# Read the 'neighborhoods' data set from the SHP file
neighborhoods <- st_read("data/nta.shp", quiet=TRUE) %>%
dplyr::select(boroname, ntacode, ntaname, geometry, shape_area)
# Create a merged data frame for the 'trees' and 'neighborhoods' data sets
merged_trees_and_neighborhoods <- trees %>%
full_join(neighborhoods, by = c("nta"="ntacode", "nta_name"="ntaname"))
# Save the current warning level, then silence warnings for this cell
defaultW <- getOption("warn")
options(warn=-1)
# ---------- Results & Discussion
# ----- Tree Population
# -- Spatial
# Top 10 NTAs in terms of land size
top_nta_area <- neighborhoods %>%
filter(boroname == "Manhattan", ntacode != "MN99") %>%
arrange(desc(shape_area)) %>%
slice(1:10)
# Tree count per neighborhood
nbh_tree_cnts <- merged_trees_and_neighborhoods %>%
filter(boroname == "Manhattan", nta != "MN99") %>%
group_by(nta, nta_name) %>%
summarize(number_of_trees = n(), .groups="keep") %>%
arrange(desc(number_of_trees)) %>%
ungroup() %>%
mutate(proportion = round(number_of_trees/sum(number_of_trees), digits = 4))
# Species richness per neighborhood
nbh_rchns <- trees %>%
filter(!(spc_common=="null")) %>%
group_by(nta, nta_name) %>%
summarize(richness = n_distinct(spc_common), .groups="keep") %>%
arrange(desc(richness)) %>%
ungroup()
# Data for maps
nbhs_map <- nbh_tree_cnts %>%
full_join(neighborhoods, c("nta"="ntacode", "nta_name"="ntaname")) %>%
full_join(nbh_rchns, c("nta", "nta_name")) %>%
mutate(borough = substr(nta, 1, 2),
nta_code_and_name = paste(nta, nta_name, sep=": "),
nta_and_tree_cnt = ifelse(number_of_trees < 1000,
paste(nta, " - ", " ", prettyNum(number_of_trees,big.mark=","), " : ", nta_name, sep=""),
paste(nta, " - ", prettyNum(number_of_trees, big.mark=","), " : ", nta_name, sep="")
),
nta_and_rchns = paste(nta, " - ", prettyNum(richness, big.mark=","),
" : ", nta_name, sep="")
) %>%
st_as_sf %>%
st_transform("+proj=longlat +ellps=intl +no_defs +type=crs")
# Colorize the NTAs
color_scheme <- iwanthue(seed=1234, force_init=TRUE)
nta_colors <- color_scheme$hex(nrow(nbhs_map %>% filter(borough == "MN")))
# Data of tree locations
tree_locs <- trees %>%
st_as_sf(coords = c("longitude", "latitude"), crs=4326) %>%
st_transform("+proj=longlat +ellps=intl +no_defs +type=crs")
# Map of tree locations by neighborhood
tree_locs_map_plot <- ggplot() +
geom_sf(data = nbhs_map,
fill="#E8EAED", color="grey") +
stat_sf_coordinates(data = tree_locs,
aes(color = paste(nta, nta_name, sep=": ")),
size=0.001
) +
stat_sf_coordinates(data = nbhs_map %>% filter(borough=="MN", nta!="MN99"),
color="grey25", size=0.25) +
geom_sf(data = nbhs_map %>% filter(borough=="MN", nta!="MN99"),
color="grey25",
alpha=0.1) +
theme(legend.position = c(0.024, 0.5),
legend.justification=0.0,
legend.key.width = unit(2.5, 'mm'),
legend.key.height = unit(1.8, 'mm'),
legend.direction="vertical",
legend.background= element_roundrect(r = grid::unit(0.02, "snpc"),
fill=alpha("#FFFFFF", 0.90)),
legend.key = element_rect(fill=NA),
legend.text = element_text(margin = margin(r=5, unit="pt"),
color="#65707C",
family="sans serif"),
legend.title = element_text(face="bold",
color="#65707C",
size=8.5,
family="sans serif"),
axis.title = element_text(color="#65707C",
face="bold",
family="sans serif"),
axis.text = element_text(color="#65707C",
size=7,
family="sans serif"),
axis.text.x = element_text(angle=90,
vjust=0.5,
hjust=1),
axis.line = element_line(colour="grey",
linewidth=0.5),
panel.grid.major = element_line(color="grey",
linetype="dashed",
linewidth=0.25),
panel.border = element_rect(color="grey40",
fill=NA),
panel.spacing = unit(2, "lines"),
panel.background = element_roundrect(r = grid::unit(0.001, "snpc"),
fill=alpha("#9CC0F9", 1)),
rect = element_rect(fill = "transparent"),
plot.title = element_text(color="#65707C",
vjust=10,
size=14,
family="sans serif")) +
labs(x="", y="", color=" Code: Name") +
ggtitle("Fig. 1: Map of the Tree Locations by Neighborhood in Manhattan") +
scale_x_continuous(limits = c(-74.25, -73.89),
breaks = seq(-74.25, -73.89, by=0.02)) +
scale_y_continuous(limits = c(40.68, 40.88),
breaks = seq(40.68, 40.88, by=0.02)) +
guides(color = guide_legend(ncol=1,
override.aes = list(shape=15,
size=2.5
))) +
ggrepel::geom_text_repel(data = nbhs_map %>% filter(borough == "MN", nta != "MN99"),
aes(label = nta, geometry = geometry),
stat="sf_coordinates",
min.segment.length=0,
size=2,
label.size=NA,
fontface="bold"
) +
coord_sf(xlim = c(-74.25, -73.89), ylim = c(40.68, 40.88)) +
scale_color_manual(values = nta_colors)
# Restore the saved warning level
options(warn = defaultW)
# ----- For link's image thumbnail
# Install and load the 'patchwork' package
suppressWarnings(suppressMessages(install.packages("patchwork", quiet=TRUE)))
suppressPackageStartupMessages(library(patchwork))
# Install and load the 'png' package
suppressWarnings(suppressMessages(install.packages("png", quiet=TRUE)))
suppressPackageStartupMessages(library(png))
# Create a small placeholder data frame for the base plot
data <- data.frame(x = 1:3,
y = 1:3)
# Read the PNG file
my_image <- readPNG("cover.png", native = TRUE)
# Create a plot and combine with the image
cover_img <- ggplot(data, aes(x, y)) +
geom_point() +
theme_minimal() +
theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.line = element_blank(),
axis.ticks = element_blank()) +
inset_element(p = my_image,
left=-0.1,
bottom=-0.5,
right=1.23,
top=1.5)
cover_img
# Export plot as PNG
ggsave(
plot = tree_locs_map_plot + theme(plot.title = element_text(hjust=1.25)),
filename = "documentation/tree_locs_map_plot.png",
bg = "transparent"
)
Tree Counts by Neighborhood:
In terms of the number of trees, the top ten neighborhoods are:
- *Upper West Side (MN12)
- Upper East Side-Carnegie Hill (MN40)
- *West Village (MN23)
- *Central Harlem North-Polo Grounds (MN03)
- *Hudson Yards-Chelsea-Flatiron-Union Square (MN13)
- *Washington Heights South (MN36)
- Morningside Heights (MN09)
- Central Harlem South (MN11)
- *Washington Heights North (MN35)
- *East Harlem North (MN34)
Seven of these (indicated by *) are also among the ten largest neighborhoods by land area.
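The overlap between the two rankings can be checked directly. The following is a minimal sketch in base R, assuming that the first list in this section gives the ten largest neighborhoods by land area. The vector names are hypothetical, and the NTA codes are copied from the two lists in this section rather than recomputed from the data.

```r
# NTA codes copied from the two lists in this section.
# Assumption: the first list is the ten largest neighborhoods by land area.
largest_by_area <- c("MN13", "MN12", "MN17", "MN03", "MN23",
                     "MN24", "MN34", "MN28", "MN36", "MN35")
most_trees <- c("MN12", "MN40", "MN23", "MN03", "MN13",
                "MN36", "MN09", "MN11", "MN35", "MN34")

# Neighborhoods that appear in both rankings (order follows most_trees)
shared <- intersect(most_trees, largest_by_area)

# Neighborhoods with many trees that are not among the ten largest
only_trees <- setdiff(most_trees, largest_by_area)

print(shared)
print(length(shared))
print(only_trees)
```

With the report's data frames loaded, the same comparison could presumably be made with `top_nta_area$ntacode` and the first ten values of `nbh_tree_cnts$nta` in place of the hand-copied vectors.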
defaultW <- getOption("warn")
options(warn=-1)
# neighborhoods %>%
# st_set_geometry(NULL) %>%
# summarize(total_number_of_neighborhoods_in_the_data_set = n())
# neighborhoods %>%
# st_set_geometry(NULL) %>%
# filter(str_detect(ntacode, "MN")) %>%
# summarize(number_of_neighborhoods_from_manhattan_in_the_data_set = n())
# merged_trees_and_neighborhoods %>%
# group_by(nta) %>%
# summarize(number_of_trees_per_neighborhood = n()) %>%
# summarize(number_of_neighborhoods_from_manhattan_with_trees = n())
# neighborhoods %>%
# st_set_geometry(NULL) %>%
# anti_join(trees, by = c("ntacode" = "nta", "ntaname" = "nta_name")) %>%
# filter(str_detect(ntacode, "MN"))
# Table for Top 10 Tree-Producing Neighborhoods
for_table_nbh_tree_cnts <- nbh_tree_cnts %>%
slice(1:10) %>%
rownames_to_column("rank") %>%
mutate(number_of_trees = prettyNum(number_of_trees, big.mark=","),
percentage = label_percent(accuracy=0.01)(proportion)) %>%
select(-proportion)
# HTML table for the top 10 neighborhoods by tree count
#kable(for_table_nbh_tree_cnts,
# caption = " ",
# label = "tables", format = "html", booktabs = TRUE)
# Order by number of trees
nbhs_map$nta_and_tree_cnt <- factor(
nbhs_map$nta_and_tree_cnt,
levels = nbhs_map$nta_and_tree_cnt,
ordered=TRUE
)
# Map of NTAs' tree counts
nbhs_tree_cnts_map_plot <- ggplot() +
geom_sf(data = nbhs_map %>% filter(borough != "MN" | nta == "MN99"),
fill="#E8EAED", color="grey") +
geom_sf(data = nbhs_map %>% filter(borough == "MN", nta != "MN99"),
aes(fill = number_of_trees,
color = nta_and_tree_cnt
)) +
stat_sf_coordinates(data = nbhs_map %>% filter(nta %in% for_table_nbh_tree_cnts$nta),
color="grey25", size=0.5) +
theme(legend.position = #c(0.7, 0.8),
c(0.369, 0.5),
#c(0.025, 0.5),
legend.justification=0.0,
legend.key.width = unit(2.5, 'mm'),
legend.key.height = unit(1.8, 'mm'),
legend.direction="vertical",
legend.background = element_roundrect(r = grid::unit(0.02, "snpc"),
fill = alpha("#FFFFFF", 0.90)),
legend.key = element_rect(fill=NA),
legend.text = element_text(margin = margin(r=5, unit="pt"),
size=7.9,
color="#65707C",
family="sans serif"),
legend.title = element_text(face="bold",
color="#65707C",
size=8.5,
family="sans serif"),
axis.title = element_text(color="#65707C",
face="bold",
family="sans serif"),
axis.text = element_text(color="#65707C",
size=7,
family="sans serif"),
axis.text.x = element_text(angle=90,
vjust=0.5,
hjust=1),
axis.line = element_line(colour="grey",
linewidth=0.5),
panel.grid.major = element_line(color="grey",
linetype="dashed",
linewidth=0.25),
panel.border = element_rect(color="grey40",
fill=NA),
panel.spacing = unit(2, "lines"),
panel.background = element_roundrect(r = grid::unit(0.001, "snpc"),
fill = alpha("#9CC0F9", 1)),
rect = element_rect(fill = "transparent"),
plot.title = element_text(color="#65707C",
vjust=10,
size=14,
family="sans serif")) +
labs(x="", y="", color=" Code - Number of trees : Name"
) +
ggtitle("Fig. 2: Map of the Number of Trees in Manhattan's Neighborhoods") +
scale_x_continuous(expand = c(0.01, 0),
limits = c(-74.04, -73.64),
breaks = seq(-74.04, -73.64, by=0.02)) +
scale_y_continuous(expand = c(0.01, 0),
limits = c(40.68, 40.88),
breaks = seq(40.68, 40.88, by=0.02)) +
scale_color_manual(values = replicate(28, "grey25")) +
scale_fill_gradient2(low = muted("#499F78"),
high = muted("#216968")) +
ggrepel::geom_text_repel(data = nbhs_map %>% filter(nta %in% for_table_nbh_tree_cnts$nta),
aes(label = nta, geometry = geometry),
stat="sf_coordinates",
min.segment.length=0,
label.size=NA,
alpha=0.5,
fontface="bold"
) +
coord_sf(xlim = c(-74.04, -73.64), ylim = c(40.68, 40.88)) #-74.28, -73.88
# Extract NTA fill colors
color_scheme_2 <- as.data.frame(ggplot_build(nbhs_tree_cnts_map_plot)$data[[2]])$fill
nbhs_tree_cnts_map_plot1 <- nbhs_tree_cnts_map_plot +
guides(fill = "none",
color = guide_legend(ncol=1,
override.aes = list(color = NA,
fill = color_scheme_2,
linewidth=0))
)
#nbh_tree_cnts %>%
# #slice(1:10) %>%
# #rownames_to_column("rank") %>%
# mutate(number_of_trees = prettyNum(number_of_trees, big.mark=","),
# percentage = label_percent(accuracy=0.01)(proportion)) %>%
# select(-proportion)
# Restore the saved warning level
options(warn = defaultW)
nbh_tree_cnts %>%
slice(1:10) %>%
rownames_to_column("rank") %>%
mutate(number_of_trees = prettyNum(number_of_trees, big.mark=","),
percentage = label_percent(accuracy=0.01)(proportion)) %>%
select(-proportion)
# Export plot as PNG
ggsave(
plot = nbhs_tree_cnts_map_plot1 + theme(plot.title = element_text(hjust=1.1)),
filename = "documentation/nbhs_tree_cnts_map_plot.png",
bg = "transparent"
)
Trees by Curb Location:
Majority or
# Tree count per location in relation to curb
number_of_trees_per_curb_loc <- merged_trees_and_neighborhoods %>%
filter(str_detect(nta, "MN") & !(nta == "MN99")) %>%
group_by(curb_loc) %>%
summarize(number_of_trees = n()) %>%
arrange(desc(number_of_trees)) %>%
# Use the filtered Manhattan total as the denominator so the percentages sum to 100%
mutate(percentage = label_percent(accuracy=0.01)(number_of_trees/sum(number_of_trees)))
# HTML Table for Curb Location
#kable(number_of_trees_per_curb_loc,
# caption = "This is the caption.",
# label = "tables", format = "html", booktabs = TRUE)
on_curb_stat <- number_of_trees_per_curb_loc %>%
mutate(proportion = number_of_trees/sum(number_of_trees)) %>%
filter(proportion == max(abs(proportion)))
# Create a proportional stacked bar chart for the curb location
curb_loc_stacked_bar_plot <- ggplot(number_of_trees_per_curb_loc) +
geom_chicklet(aes(x="", y = number_of_trees/sum(number_of_trees),
fill = curb_loc),
radius = grid::unit(0.75, "mm"),
position="stack") +
coord_flip() +
theme(legend.position="right",
legend.justification="top",
legend.direction="vertical",
legend.key.size = unit(0, 'pt'),
#legend.key = element_rect(fill = NA),
legend.text = element_text(margin = margin(r = 4, unit = "pt"),
color = "#65707C",
family="sans serif"),
legend.title = element_text(color = "#65707C",
face="bold",
size = 9,
family="sans serif"),
axis.title.x = element_text(color="#65707C",
face="bold",
family="sans serif"),
axis.title.y = element_blank(),
axis.text = element_blank(),
axis.line = element_blank(),
axis.ticks = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.background = element_blank(),
rect = element_rect(fill = "transparent"),
plot.subtitle = element_text(color="#65707C",
hjust=0.25,
size=10,
family="sans serif"),
plot.title = element_text(color="#65707C",
hjust=-0.15,
size=14,
family="sans serif"),
plot.margin = unit(c(0,1,0,1), "cm")) +
scale_fill_manual(values = c("#875826",
"#10401B")) +
ggtitle("\nFig. 3: Proportional Stacked Bar Graph of Tree Bed Location ",
subtitle=" (in relation to the Curb)\n") +
labs(y="\n% \n(Number of trees)\n", fill="Location: ") +
guides(fill = guide_legend(nrow=2,
reverse=TRUE,
override.aes = list(shape = 15,
size = 4))) +
scale_x_discrete(expand = c(0.01, 0)) +
geom_text(data = on_curb_stat,
aes(label = paste(label_percent(accuracy=0.01)(proportion),
"\n (", prettyNum(number_of_trees,
big.mark=","),")",
sep=""),
x = "",
y = 0.50 * proportion - 0.075),
size=5, color="white", hjust=1)
# Export plot as PNG
ggsave(
plot = curb_loc_stacked_bar_plot,
filename = "documentation/curb_loc_stacked_bar_plot.png",
bg = "transparent"
)
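The share of trees in each curb location depends on which total is used as the denominator. The following is a minimal base R sketch on a hypothetical toy data frame; the values are illustrative, not census figures, and the category labels `OnCurb` and `OffsetFromCurb` are assumed. It shows shares computed against the total of the counted rows, so that they sum to 100%.

```r
# Hypothetical toy records: illustrative only, not taken from the census data.
# The category labels are assumed for this example.
toy_trees <- data.frame(
  curb_loc = c("OnCurb", "OnCurb", "OnCurb", "OffsetFromCurb")
)

# Count trees per curb location
counts <- table(toy_trees$curb_loc)

# Divide each count by the total of the counted rows
shares <- prop.table(counts)

# Express the shares as percentages rounded to two decimals
print(round(100 * shares, 2))
```

Dividing by the row count of a larger table, such as one that also holds rows for neighborhoods outside the filter, would make the shares sum to less than 100%.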