This dataset is only updated annually, and thus far only data from 2013 to 2018 is contained. The full dataset is also somewhat too large for an exercise (2.5M rows), so I suggest to limit yourself to a subset. I have added a file containing the subset of of only building fires (INCIDENT_TYPE_DESC == “111 - Building fire”) for 2013 to 2018 only which yields about 14,000 incidents.
Provide a leaflet map of the highest severity fires (i.e. subset to the highest category in HIGHEST_LEVEL_DESC) contained in the file buiding_fires.csv. Ignore locations that fall outside the five boroughs of New York City. Provide at least three pieces of information on the incident in a popup.
severe_building_fires <- read_csv("building_fires.csv") %>%
mutate(alarm = case_when(
HIGHEST_LEVEL_DESC == "5 - 5th alarm" ~ "severe",
HIGHEST_LEVEL_DESC == "55 - Fifth Alarm" ~ "severe",
TRUE ~ "not severe")) %>%
filter(alarm == "severe") %>%
mutate(durat_hours = (TOTAL_INCIDENT_DURATION/3600)) %>%
mutate(durat_hours = sprintf("%0.1f", durat_hours)) %>%
mutate(fire_spread = case_when(
FIRE_SPREAD_DESC == "1 - Confined to object of origin" ~ "Confined to object",
FIRE_SPREAD_DESC == "2 - Confined to room of origin" ~ "Confined to room",
FIRE_SPREAD_DESC == "3 - Confined to floor of origin" ~ "Confined to floor",
FIRE_SPREAD_DESC == "4 - Confined to building of origin" ~ "Confined to building",
FIRE_SPREAD_DESC == "5 - Beyond building of origin" ~ "Spread beyond building",
TRUE ~ "NA")) %>%
mutate(fire_spread = factor(fire_spread, levels = c("Confined to object",
"Confined to room",
"Confined to floor",
"Confined to building",
"Spread beyond building"))) %>%
mutate(detector = case_when(
DETECTOR_PRESENCE_DESC == "N - None present" ~ "No",
DETECTOR_PRESENCE_DESC == "1 - Present" ~ "Yes",
TRUE ~ "NA"))
|======= | 5%
|=========== | 7%
|============== | 10%
|================== | 13%
|====================== | 15%
|========================= | 18%
|=========================== | 20% 1 MB
|=============================== | 23% 1 MB
|================================== | 26% 1 MB
|====================================== | 28% 1 MB
|========================================= | 31% 1 MB
|============================================= | 34% 1 MB
|================================================ | 36% 1 MB
|==================================================== | 39% 2 MB
|======================================================= | 41% 2 MB
|=========================================================== | 44% 2 MB
|============================================================== | 47% 2 MB
|================================================================== | 49% 2 MB
|===================================================================== | 52% 2 MB
|======================================================================== | 54% 2 MB
|============================================================================ | 57% 3 MB
|=============================================================================== | 60% 3 MB
|=================================================================================== | 62% 3 MB
|====================================================================================== | 65% 3 MB
|========================================================================================== | 67% 3 MB
|============================================================================================= | 70% 3 MB
|================================================================================================= | 73% 3 MB
|==================================================================================================== | 75% 3 MB
|======================================================================================================== | 78% 4 MB
|=========================================================================================================== | 80% 4 MB
|=============================================================================================================== | 83% 4 MB
|================================================================================================================== | 86% 4 MB
|====================================================================================================================== | 88% 4 MB
|========================================================================================================================= | 91% 4 MB
|============================================================================================================================ | 93% 4 MB
|================================================================================================================================ | 96% 5 MB
|=================================================================================================================================== | 99% 5 MB
|=====================================================================================================================================| 100% 5 MB
content <- paste("Units responding:", severe_building_fires$UNITS_ONSCENE,"<br/>",
"Hours to contain:", (severe_building_fires$durat_hours),"<br/>",
"Fire spread:", severe_building_fires$fire_spread,"<br/>")
m1 <- leaflet(severe_building_fires) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data") %>%
addCircles(lng = ~lon, lat = ~lat, radius = 100, color = "orange", fill = "orange", popup = content) %>%
addControl("Severe Building Fires in NYC (2013-2018)", position = "topright")
m1
Start with the previous map. Now, distinguish the markers of the fire locations by PROPERTY_USE_DESC, i.e. what kind of property was affected. If there are too many categories, collapse some categories. Choose an appropriate coloring scheme to map the locations by type of affected property. Add a legend informing the user about the color scheme. Also make sure that the information about the type of affected property is now contained in the popup information. Show this map.
severe_building_fires_2 <- severe_building_fires %>%
mutate(bldg_use = case_when(
PROPERTY_USE_DESC == "429 - Multifamily dwelling" ~ "Apartment or condo",
PROPERTY_USE_DESC == "500 - Mercantile, business, other" ~ "Business",
PROPERTY_USE_DESC == "419 - 1 or 2 family dwelling" ~ "House",
PROPERTY_USE_DESC == "519 - Food and beverage sales, grocery store" ~ "Grocery",
PROPERTY_USE_DESC == "599 - Business office" ~ "Office",
TRUE ~ "Other")) %>%
mutate(bldg_use = factor(bldg_use,
levels = c("Apartment or condo", "Business", "House", "Grocery", "Office", "Other")))
content2 <- paste("Building use:", severe_building_fires_2$bldg_use, "<br/>",
"Units responding:", severe_building_fires_2$UNITS_ONSCENE,"<br/>",
"Hours to contain:", severe_building_fires_2$durat_hours,"<br/>",
"Fire spread:", severe_building_fires_2$fire_spread,"<br/>")
pal = colorFactor("Dark2", domain = severe_building_fires_2$bldg_use)
color_use = pal(severe_building_fires_2$bldg_use)
m2 <- leaflet(severe_building_fires_2) %>%
addTiles('https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data") %>%
addCircles(lng = ~lon, lat = ~lat, radius = 100, color = color_use, fill = color_use, popup = content2) %>%
addLegend(pal = pal, values = severe_building_fires_2$bldg_use, title = "Building use",
position = "topleft") %>%
addControl("Severe Building Fires in NYC (2013-2018)", position = "topright")
m2
Add marker clustering, so that zooming in will reveal the individual locations but the zoomed out map only shows the clusters. Show the map with clusters.
m3 <- m2 %>% addCircleMarkers(lng = ~lon, lat = ~lat, color = color_use, popup = content2,
clusterOptions = markerClusterOptions())
m3
The second data file contains the locations of the 218 firehouses in New York City. Start with the non-clustered map (2b) and now adjust the size of the circle markers by severity (TOTAL_INCIDENT_DURATION or UNITS_ONSCENE seem plausible options). More severe incidents should have larger circles on the map. On the map, also add the locations of the fire houses. Add two layers (“Incidents”, “Firehouses”) that allow the user to select which information to show.
firehouses <- read_csv("FDNY_Firehouse_Listing.csv") %>%
filter(!is.na(Latitude))
m4 <- leaflet() %>%
addTiles('https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data") %>%
addCircleMarkers(data = severe_building_fires_2, lng = ~lon, lat = ~lat, color = color_use, popup = content2,
radius = (severe_building_fires_2$UNITS_ONSCENE/5), group = "Incidents") %>%
addLegend(data = severe_building_fires_2, pal = pal, values = severe_building_fires_2$bldg_use,
title = "Building use", group = "Incidents", position = "topleft") %>%
addCircles(data = firehouses, color = "gray", popup = ~paste("Name: ", FacilityName, "<br>Address: ",
FacilityAddress), group = "Firehouses") %>%
addControl("Severe Building Fires in NYC 2013-2018", position = "topright") %>%
addLayersControl(overlayGroups = c("Incidents","Firehouses"),
options = layersControlOptions(collapsed = FALSE))
m4
We now want to investigate whether the distance of the incident from the nearest firehouse varies across the city.
For all incident locations (independent of severity), identify the nearest firehouse and calculate the distance between the firehouse and the incident location. Provide a scatter plot showing the time until the first engine arrived (the variables INCIDENT_DATE_TIME and ARRIVAL_DATE_TIME) will be helpful.
building_fires <- read_csv("building_fires.csv")
houses.xy <- select(firehouses, Longitude, Latitude)
fires.xy <- select(building_fires, lon, lat)
dist_matrix <- distm(fires.xy, houses.xy, fun = distGeo)
dist_nearest <- apply(dist_matrix, 1, min)
building_fires_2 <- building_fires
building_fires_2['dist_nearest'] <- dist_nearest
building_fires_3 <- building_fires_2 %>%
filter(IM_INCIDENT_KEY != 59533504) %>% # Incident located outside NYC
filter(IM_INCIDENT_KEY != 57703382) %>% # Response time is negative
mutate(dist_nearest_miles = (dist_nearest * 0.0006213712)) %>%
mutate(dist_nearest_feet = (dist_nearest * 3.28084)) %>%
mutate(call_time = mdy_hms(INCIDENT_DATE_TIME)) %>%
mutate(arrival_time = mdy_hms(ARRIVAL_DATE_TIME)) %>%
mutate(call_time_2 = as.POSIXct(call_time, tz = Sys.timezone())) %>%
mutate(arrival_time_2 = as.POSIXct(arrival_time, tz = Sys.timezone())) %>%
mutate(mins_to_resp = difftime(arrival_time_2, call_time_2, units = "mins"))%>%
mutate(mins_to_resp_2 = as.double(mins_to_resp)) %>%
mutate(mins_to_resp_3 = sprintf("%0.1f", mins_to_resp_2))
p1 <- ggplot(data = building_fires_3, aes(x = dist_nearest_miles, y = mins_to_resp_2)) +
geom_point() +
coord_cartesian(xlim = c(0, 3), ylim = c(0, 25)) +
theme_fivethirtyeight() +
geom_smooth(se = FALSE, lwd = 1.5, color = "brown4") +
theme(axis.title = element_text(), legend.position = "none") +
labs(caption = "Source: NYC Open Data 2013-2018") +
xlab("Miles to nearest firehouse") +
ylab("Minutes to respond") +
ggtitle("Response time for building fires in NYC")
p1
Now also visualize the patterns separately for severe and non-severe incidents (use HIGHEST_LEVEL_DESC but feel free to reduce the number of categories). What do you find?
building_fires_4 <- building_fires_3 %>%
mutate(alarm_level = case_when(
HIGHEST_LEVEL_DESC == "0 - Initial alarm" ~ "Initial",
HIGHEST_LEVEL_DESC == "1 - More than initial alarm, less than Signal 7-5" ~ "1st alarm",
HIGHEST_LEVEL_DESC == "11 - First Alarm" ~ "1st alarm",
HIGHEST_LEVEL_DESC == "7 - Signal 7-5" ~ "All hands",
HIGHEST_LEVEL_DESC == "75 - All Hands Working" ~ "All hands",
HIGHEST_LEVEL_DESC == "2 - 2nd alarm" ~ "2nd alarm",
HIGHEST_LEVEL_DESC == "22 - Second Alarm" ~ "2nd alarm",
HIGHEST_LEVEL_DESC == "3 - 3rd alarm" ~ "3rd alarm",
HIGHEST_LEVEL_DESC == "33 - Third Alarm" ~ "3rd alarm",
HIGHEST_LEVEL_DESC == "4 - 4th alarm" ~ "4th alarm",
HIGHEST_LEVEL_DESC == "44 - Fourth Alarm" ~ "4th alarm",
HIGHEST_LEVEL_DESC == "5 - 5th alarm" ~ "5th alarm",
HIGHEST_LEVEL_DESC == "55 - Fifth Alarm" ~ "5th alarm")) %>%
mutate(alarm_level = factor(alarm_level, levels = c("Initial", "1st alarm", "All hands",
"2nd alarm", "3rd alarm", "4th alarm", "5th alarm"))) %>%
drop_na(alarm_level) %>%
mutate(alarm_cat = case_when(
alarm_level == "Initial" ~ "Initial and 1st alarm",
alarm_level == "1st alarm" ~ "Initial and 1st alarm",
alarm_level == "All hands" ~ "All hands",
alarm_level == "2nd alarm" ~ "2nd and 3rd alarm",
alarm_level == "3rd alarm" ~ "2nd and 3rd alarm",
alarm_level == "4th alarm" ~ "4th and 5th alarm",
alarm_level == "5th alarm" ~ "4th and 5th alarm")) %>%
mutate(alarm_cat = factor(alarm_cat, levels = c("Initial and 1st alarm","All hands",
"2nd and 3rd alarm", "4th and 5th alarm"))) %>%
mutate(alarm_sev = case_when(
alarm_level == "Initial" ~ "Low",
alarm_level == "1st alarm" ~ "Low",
alarm_level == "All hands" ~ "Medium",
alarm_level == "2nd alarm" ~ "Medium",
TRUE ~ "High")) %>%
mutate(alarm_sev = factor(alarm_sev, levels = c("Low", "Medium", "High"))) %>%
mutate(durat_hours = (TOTAL_INCIDENT_DURATION/3600)) %>%
mutate(durat_hours = sprintf("%0.1f", durat_hours)) %>%
mutate(fire_spread = case_when(
FIRE_SPREAD_DESC == "1 - Confined to object of origin" ~ "Confined to object",
FIRE_SPREAD_DESC == "2 - Confined to room of origin" ~ "Confined to room",
FIRE_SPREAD_DESC == "3 - Confined to floor of origin" ~ "Confined to floor",
FIRE_SPREAD_DESC == "4 - Confined to building of origin" ~ "Confined to building",
FIRE_SPREAD_DESC == "5 - Beyond building of origin" ~ "Spread beyond building",
TRUE ~ "NA")) %>%
mutate(fire_spread = factor(fire_spread, levels = c("Confined to object",
"Confined to room",
"Confined to floor",
"Confined to building",
"Spread beyond building"))) %>%
mutate(bldg_use = case_when(
PROPERTY_USE_DESC == "429 - Multifamily dwelling" ~ "Apartment or condo",
PROPERTY_USE_DESC == "419 - 1 or 2 family dwelling" ~ "House",
PROPERTY_USE_DESC == "500 - Mercantile, business, other" ~ "Business",
PROPERTY_USE_DESC == "161 - Restaurant or cafeteria" ~ "Restaurant",
TRUE ~ "Other")) %>%
mutate(bldg_use = factor(bldg_use, levels = c("Apartment or condo", "House", "Business",
"Restaurant", "Office", "Other"))) %>%
mutate(bldg_cat = case_when(
bldg_use == "Apartment or condo" ~ "Residential",
bldg_use == "House" ~ "Residential",
bldg_use == "Business" ~ "Commercial",
bldg_use == "Office" ~ "Commercial",
bldg_use == "Restaurant" ~ "Commercial",
TRUE ~ "Other")) %>%
mutate(bldg_cat = factor(bldg_cat, levels = c("Residential", "Commercial", "Other")))
p2 <- ggplot(data = building_fires_4, aes(x = dist_nearest_miles, y = mins_to_resp_2)) +
geom_point() +
coord_cartesian(xlim = c(0, 3), ylim = c(0, 25)) +
theme_fivethirtyeight() +
geom_smooth(se = FALSE, lwd = 1, color = "brown4") +
theme(axis.title = element_text(), legend.position = "none") +
labs(caption = "Source: NYC Open Data 2013-2018") +
xlab("Miles to nearest firehouse") +
ylab("Minutes to respond") +
ggtitle("Response time for building fires in NYC \nby alarm level") +
facet_wrap(~ alarm_cat)
p2
Provide a map visualization of response times. Investigate whether the type of property affected (PROPERTY_USE_DESC) or fire severity (HIGHEST_LEVEL_DESC) play a role here.
pal.1 <- colorBin(palette = "YlOrRd", domain = building_fires_4$mins_to_resp_2, bins = 5,
na.color = NULL, pretty = FALSE, alpha = TRUE)
qpal <- colorQuantile("YlOrRd", building_fires_4$mins_to_resp_2, n = 5)
content3 <- paste("Building use:", building_fires_4$bldg_use, "<br/>",
"Highest alarm:", building_fires_4$alarm_level, "<br/>",
"Units responding:", building_fires_4$UNITS_ONSCENE, "<br/>",
"Minutes to respond:", building_fires_4$mins_to_resp_3, "<br/>",
"Hours to contain:", building_fires_4$durat_hours,"<br/>",
"Fire spread:", building_fires_4$fire_spread,"<br/>")
m5 <- leaflet(building_fires_4) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data 2013-2018") %>%
addCircles(data = subset(building_fires_4, building_fires_4$alarm_sev == "Low"), lng = ~lon, lat = ~lat,
color = ~qpal(mins_to_resp_2), popup = content3, group = "Low severity") %>%
addCircles(data = subset(building_fires_4, building_fires_4$alarm_sev == "Medium"), lng = ~lon, lat = ~lat,
color = ~qpal(mins_to_resp_2), popup = content3, group = "Medium severity") %>%
addCircles(data = subset(building_fires_4, building_fires_4$alarm_sev == "High"), lng = ~lon, lat = ~lat,
color = ~qpal(mins_to_resp_2), popup = content3, group = "High severity") %>%
addLegend(pal = pal.1, values = building_fires_4$mins_to_resp_2, position = "topleft",
title = "Minutes to respond", labFormat = labelFormat(digits = 0)) %>%
addControl("Fire Response Time by Alarm Level", position = "topright") %>%
addLayersControl(overlayGroups = c("Low severity", "Medium severity", "High severity"),
options = layersControlOptions(collapsed = FALSE))
m5
m6 <- leaflet(building_fires_4) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data 2013-2018") %>%
addCircles(data = subset(building_fires_4, building_fires_4$bldg_cat == "Residential"), lng = ~lon, lat = ~lat,
color = ~qpal(mins_to_resp_2), popup = content3, group = "Residential") %>%
addCircles(data = subset(building_fires_4, building_fires_4$bldg_cat == "Commercial"), lng = ~lon, lat = ~lat,
color = ~qpal(mins_to_resp_2), popup = content3, group = "Commercial") %>%
addCircles(data = subset(building_fires_4, building_fires_4$bldg_cat == "Other"), lng = ~lon, lat = ~lat,
color = ~qpal(mins_to_resp_2), popup = content3, group = "Other") %>%
addLegend(pal = pal.1, values = building_fires_4$mins_to_resp_2, position = "topleft",
title = "Minutes to respond", labFormat = labelFormat(digits = 0)) %>%
addControl("Fire Response Time by Building Type", position = "topright") %>%
addLayersControl(overlayGroups = c("Residential", "Commercial", "Other"),
options = layersControlOptions(collapsed = FALSE))
m6
Show a faceted choropleth map indicating how response times have developed over the years. What do you find?
rename_boros <- function(x){gsub("^[0-9] - ", "", x)}
boro_name <- unlist(lapply(building_fires_4$BOROUGH_DESC, rename_boros))
building_fires_5 <- building_fires_4
building_fires_5['boro_name'] <- boro_name
pal.2 <- colorNumeric("YlOrRd", NULL)
m7 <- leaflet(nyc_boros) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data 2013-2018") %>%
addPolygons(stroke = FALSE, smoothFactor = 0.3, fillOpacity = 1,
fillColor = ~pal.2(avg_boro_resp_time)) %>%
addLegend(pal = pal.2, values = ~avg_boro_resp_time, position = "topleft",
title = "Minutes to respond", labFormat = labelFormat(digits = 2)) %>%
addControl("Average Fire Response Time (2013-2018)", position = "topright")
m7
m8 <- leaflet(nyc_boros) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data 2013-2018") %>%
addPolygons(stroke = FALSE, smoothFactor = 0.3, fillOpacity = 1,
fillColor = ~pal.2(boro_resp_time_2013)) %>%
addLegend(pal = pal.2, values = ~boro_resp_time_2013, position = "topleft",
title = "Minutes to respond", labFormat = labelFormat(digits = 2)) %>%
addControl("Average Fire Response Time in 2013", position = "topright")
m8
m9 <- leaflet(nyc_boros) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data 2013-2018") %>%
addPolygons(stroke = FALSE, smoothFactor = 0.3, fillOpacity = 1,
fillColor = ~pal.2(boro_resp_time_2018)) %>%
addLegend(pal = pal.2, values = ~boro_resp_time_2018, position = "topleft",
title = "Minutes to respond", labFormat = labelFormat(digits = 2)) %>%
addControl("Average Fire Response Time in 2018", position = "topright")
m9
m10 <- leaflet(nyc_boros) %>%
addTiles(urlTemplate = 'https://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}{r}.png',
attribution = "Source: NYC Open Data 2013-2018") %>%
addPolygons(stroke = FALSE, smoothFactor = 0.3, fillOpacity = 1,
fillColor = ~pal.2(boro_resp_time_diff)) %>%
addLegend(pal = pal.2, values = ~boro_resp_time_diff, position = "topleft",
title = "Minutes to respond", labFormat = labelFormat(digits = 2)) %>%
addControl("Change in Fire Response Times 2013-2018", position = "topright")
m10
Average response times increased from 2013 to 2018, with the lowest increase in Brooklyn and the highest increase in Queens.