Response time is the key factor in evaluating the performance of fire departments, and it is made up of several components. Fire departments can’t control how much time elapses between the start of a fire and when a call is placed to 911, which makes it critical for them to minimize the time they can control. A number of factors affect the response time for an emergency.
Total response time is made up of three distinct components:
1. Dispatch time: Time elapsed from when a call is received at the 9-1-1 center until units are notified.
2. Turnout time: Time elapsed from when units are notified until they are responding.
3. Travel time: Time elapsed from when units respond until they arrive on the incident scene.
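As a rough illustration of how the three components add up, the sketch below sums hypothetical component times and formats the total as mm:ss. The values and variable names are invented for illustration and are not FDNY figures.

```r
# Hypothetical component times in seconds (illustrative values, not FDNY data)
dispatch_sec <- 45   # call received -> units notified
turnout_sec  <- 60   # units notified -> units responding
travel_sec   <- 180  # units responding -> arrival on scene

# Total response time is the sum of the three components
total_sec <- dispatch_sec + turnout_sec + travel_sec

# Format the total as mm:ss
total_mmss <- sprintf("%d:%02d", total_sec %/% 60, total_sec %% 60)
print(total_mmss)
```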
Most fire departments focus solely on improving their travel time, because it is traditionally accepted that little can be done to improve the other two components. A common misconception is that response time is easily improved by driving faster. That approach rarely has a positive impact; in fact, it can lead to disastrous outcomes.
This project uses FDNY’s response time data to find trends in response times across boroughs, identify (if applicable) new emergency service areas for current fire stations based on travel times, identify areas of concern within service areas, and provide an in-depth comparative analysis of response times.
Response times indicate how well emergency responders are performing their duties. My motivation for this project was the Fire Department’s response time being in the news. New York Times articles related to the Fire Department’s response time are retrieved through the NYT article search web API (using an API key) and converted into a data frame. These articles shed light on the importance of response times.
library(jsonlite)  # provides fromJSON()
# NYT article search endpoint and API key
api_key <- "&api-key=e2ecd4b8ba0d275c1ae6a42a808a991e:5:74859716"
url <- "http://api.nytimes.com/svc/search/v2/articlesearch.json?q=Response-Fire"
# Fetch and parse the JSON response
dat <- fromJSON(paste0(url, api_key))
# Convert the response into a data frame
df <- as.data.frame(dat$response)
# Keep only the columns of interest
dataframe <- data.frame(df$docs.section_name, df$docs.lead_paragraph, df$docs.abstract, df$docs.web_url)
| | df.docs.section_name | df.docs.lead_paragraph | df.docs.abstract | df.docs.web_url |
|---|---|---|---|---|
| 3 | N.Y. / Region | When I see fire trucks responding to incidents, there always seem to be more vehicles and more commotion than I remember from childhood. Why is that? | NA | http://www.nytimes.com/2008/08/31/nyregion/thecity/31fyi.html |
| 6 | N.Y. / Region | The Fire Department’s average response time decreased in 2007 for the second consecutive year. | NA | http://www.nytimes.com/2008/01/07/nyregion/07mbrfs-FIRECALLS.html |
| 7 | New York and Region | Relatives and union members reacted angrily yesterday to Fire Department findings that a floor collapse in which two firefighters died last summer in Brooklyn was partly caused by the failure of city housing officials to maintain the building’s structural integrity. “It boggles my mind that they may have done substandard work,” said one union leader, Capt. Richard Brower of the Uniformed Fire Officers Association, two of whose members died fighting a fire in the building on June 5, 1998. | Relatives and union members react angrily to Fire Dept findings that floor collapse in which Lieut James Blackmore and Capt Scott LaPiedra died last summer in Brooklyn was partly caused by failure of city housing officials to maintain building’s structural integrity; housing officials take issue with fairness of Fire Dept report; photo (M) | http://www.nytimes.com/1999/07/15/nyregion/angry-response-to-report-on-fatal-fire.html |
| 5 | New York and Region; Opinion | The recent article on manpower problems in volunteer fire departments addressed valid concerns; however, reference made to a typical firefighter “racing to the firehouse in his car at life-endangering speed” is incorrect and fosters a negative stereotype. While prompt response is necessary in any emergency, the law does not allow, nor does the volunteer fire service condone, excessive speed when responding to alarms. The flashing blue light displayed on private autos is not intended to warn of high speed, but rather to alert motorists and encourage them to move aside so firefighters may proceed safely through traffic with the least possible delay. THOMAS BELLINGHAM Captain, Sea Cliff Fire Department | NA | http://www.nytimes.com/1981/10/18/nyregion/l-volunteers-response-to-fire-alarms-055283.html |
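Hard-coding an API key in a script makes it easy to leak. One possible alternative, sketched below, reads the key from an environment variable and builds the request URL from its parts. The function name `build_nyt_url` and the variable name `NYT_API_KEY` are assumptions made for this example; only the endpoint and the `q` and `api-key` parameters come from the code above.

```r
# Build an article search URL without embedding the key in the script.
# build_nyt_url() and the NYT_API_KEY variable name are hypothetical.
build_nyt_url <- function(query, key = Sys.getenv("NYT_API_KEY")) {
  base <- "http://api.nytimes.com/svc/search/v2/articlesearch.json"
  paste0(base,
         "?q=", utils::URLencode(query, reserved = TRUE),
         "&api-key=", key)
}

# Placeholder key used only to show the resulting URL
demo_url <- build_nyt_url("Response-Fire", key = "DEMO_KEY")
print(demo_url)
```

The resulting string could then be passed to `fromJSON()` in the same way as the pasted URL above.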
The availability of the required datasets is a concern for this project. I will try to turn the openly available datasets on the NYC open data portal into useful insights and answer the following research questions:
1. Can the boroughs be divided into high, moderate and low risk fire zones based on incident counts, or are fire incidents evenly distributed across all boroughs?
2. Is the average response time equal in all the boroughs?
3. Are the firehouse locations distributed evenly across the boroughs?
4. Does the average response time vary with the number of incidents (check the independence between incident counts and response times)?
# Load packages; suppressPackageStartupMessages() hides the attach and version notices
suppressPackageStartupMessages({
  library(knitr)
  library(plyr)
  library(rgdal)
  library(ggplot2)
  library(sp)
  library(rgeos)
  library(rvest)
  library(stringr)
  library(tidyr)
  library(maps)
  library(choroplethr)
  library(ggthemes)
  library(jsonlite)
})
The data is collected through New York City’s open data portal and Socrata. The data files response and locations are used to analyze fire incidents and the locations of firehouses in the different boroughs. The dataset df_pop_county is included in the choroplethr package. The square mileage figures are taken from a New York State web page.
res <- read.csv("https://raw.githubusercontent.com/gpsingh12/IS-607-MSDA/master/response.csv")
loc <- read.csv("https://raw.githubusercontent.com/gpsingh12/IS-607-MSDA/master/locations.csv", skip=1)
data(df_pop_county)
Response time is recorded in mm:ss format. To analyze it and apply statistical tests, we convert it to seconds and store the result in an additional column of the response dataset. Rows with insufficient information are also removed from the file.
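# Drop rows 505-547, which lack sufficient information
res <- res[-(505:547), ]
# Convert AVERAGERESPONSETIME from "mm:ss" text to seconds
time <- as.character(res$AVERAGERESPONSETIME)
SEC <- sapply(strsplit(time, ":"), function(x) {
  x <- as.numeric(x)
  x[1] * 60 + x[2]
})
res["RESPONSESECONDS"] <- SEC
kable(head(res))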
| YEARMONTH | INCIDENTCLASSIFICATION | INCIDENTBOROUGH | INCIDENTCOUNT | AVERAGERESPONSETIME | RESPONSESECONDS |
|---|---|---|---|---|---|
| 200907 | Structural Fires | Citywide | 1947 | 3:54 | 234 |
| 200907 | Structural Fires | Manhattan | 435 | 4:00 | 240 |
| 200907 | Structural Fires | Bronx | 432 | 3:59 | 239 |
| 200907 | Structural Fires | Staten Island | 90 | 4:34 | 274 |
| 200907 | Structural Fires | Brooklyn | 652 | 3:29 | 209 |
| 200907 | Structural Fires | Queens | 338 | 4:17 | 257 |
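The conversion can be checked on a few values from the table above. The helper below is a minimal sketch of the same idea as a reusable function; the name `mmss_to_seconds` is chosen here for illustration, and returning `NA` for malformed input is an assumption of this sketch rather than behavior of the original code.

```r
# Convert "mm:ss" strings to seconds; malformed entries become NA
mmss_to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- suppressWarnings(as.numeric(p))
    if (length(p) != 2 || anyNA(p)) return(NA_real_)
    p[1] * 60 + p[2]
  }, numeric(1))
}

# Values taken from the AVERAGERESPONSETIME column shown above
secs <- mmss_to_seconds(c("3:54", "4:00", "4:34"))
print(secs)
```

The three results match the RESPONSESECONDS values of 234, 240 and 274 in the table.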
For the locations dataset, we count the number of stations in each of the five boroughs.
# Column 3 holds the borough; tabulate the stations per borough
loc <- loc[, 3]
loc <- as.data.frame(table(loc))
loc
## loc Freq
## 1 Bronx 34
## 2 Brooklyn 66
## 3 Manhattan 48
## 4 Queens 50
## 5 Staten Island 20
Next we look at the incident counts and the average response time for each borough. The incidents are divided into different categories, so we use the "All Fire/Emergency Incidents" classification to get the total count for all incidents.
# One subset per borough, restricted to the all-incidents classification
Man <- subset(res, INCIDENTBOROUGH == "Manhattan" & INCIDENTCLASSIFICATION == "All Fire/Emergency Incidents")
Brk <- subset(res, INCIDENTBOROUGH == "Brooklyn" & INCIDENTCLASSIFICATION == "All Fire/Emergency Incidents")
Bx <- subset(res, INCIDENTBOROUGH == "Bronx" & INCIDENTCLASSIFICATION == "All Fire/Emergency Incidents")
Qns <- subset(res, INCIDENTBOROUGH == "Queens" & INCIDENTCLASSIFICATION == "All Fire/Emergency Incidents")
SI <- subset(res, INCIDENTBOROUGH == "Staten Island" & INCIDENTCLASSIFICATION == "All Fire/Emergency Incidents")
Divide the data into three zones based on the monthly incident count:
High risk: more than 9,000 incidents
Medium risk: between 5,000 and 9,000 incidents
Low risk: fewer than 5,000 incidents
High_risk <- subset(res, INCIDENTCLASSIFICATION=="All Fire/Emergency Incidents" & INCIDENTCOUNT > 9000)
Medium_risk <- subset(res, INCIDENTCLASSIFICATION=="All Fire/Emergency Incidents" & 5000 < INCIDENTCOUNT & INCIDENTCOUNT < 9000)
Low_risk <- subset(res, INCIDENTCLASSIFICATION=="All Fire/Emergency Incidents" & INCIDENTCOUNT < 5000)
M_mean <- mean(Man$RESPONSESECONDS)
Brk_mean <- mean(Brk$RESPONSESECONDS)
Bx_mean <- mean(Bx$RESPONSESECONDS)
Qns_mean <-mean(Qns$RESPONSESECONDS)
SI_mean <- mean(SI$RESPONSESECONDS)
summary(Man$INCIDENTCOUNT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 9189 10020 10560 10470 10800 11630
summary(SI$INCIDENTCOUNT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1993 2104 2238 2324 2350 3555
summary(Brk$INCIDENTCOUNT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10400 10860 11530 11470 11870 12670
summary(Qns$INCIDENTCOUNT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 7453 7670 8337 8316 8888 9272
summary(Bx$INCIDENTCOUNT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 7038 7708 8332 8194 8748 8949
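The per-borough subsets and means above can also be computed in one call. The sketch below uses `tapply()` from base R on a small made-up data frame that borrows the column names of the response dataset; the numbers are illustrative and are not FDNY values.

```r
# Made-up rows using the column names of the response dataset
toy <- data.frame(
  INCIDENTBOROUGH = c("Bronx", "Bronx", "Queens", "Queens"),
  RESPONSESECONDS = c(239, 241, 257, 263)
)

# Mean response time per borough in a single call
borough_means <- tapply(toy$RESPONSESECONDS, toy$INCIDENTBOROUGH, mean)
print(borough_means)
```

On the real data the same call would need the all-incidents filter applied first, as in the subsets above.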
high<- High_risk[,3]
as.data.frame(table(high))
## high Freq
## 1 Bronx 0
## 2 Brooklyn 12
## 3 Citywide 12
## 4 INCIDENTBOROUGH 0
## 5 Manhattan 12
## 6 Queens 2
## 7 Staten Island 0
Med<- Medium_risk[,3]
as.data.frame(table(Med))
## Med Freq
## 1 Bronx 12
## 2 Brooklyn 0
## 3 Citywide 0
## 4 INCIDENTBOROUGH 0
## 5 Manhattan 0
## 6 Queens 10
## 7 Staten Island 0
Low<- Low_risk[,3]
as.data.frame(table(Low))
## Low Freq
## 1 Bronx 0
## 2 Brooklyn 0
## 3 Citywide 0
## 4 INCIDENTBOROUGH 0
## 5 Manhattan 0
## 6 Queens 0
## 7 Staten Island 12
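Based on the frequency tables, Brooklyn and Manhattan are high risk fire zones (both exceed 9,000 incidents in all 12 months), Queens and the Bronx are moderate risk (Queens exceeds 9,000 in only 2 of 12 months, which we treat as outliers), and Staten Island is low risk. Note that the Citywide rows are city totals rather than a borough, so they also fall in the high group.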
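The zone assignment behind these frequency tables can also be sketched with `cut()`, which bins a numeric vector in one step. The example uses the median monthly incident counts reported in the borough summaries above. One difference comes from `cut()` itself rather than from the original code: with its default right-closed intervals, a count of exactly 5,000 or 9,000 is assigned to the lower zone, whereas the strict inequalities in the subsets drop such rows.

```r
# Median monthly incident counts from the borough summaries above
median_count <- c("Staten Island" = 2238, Bronx = 8332, Queens = 8337,
                  Manhattan = 10560, Brooklyn = 11530)

# Bin the counts into the three zones
zone <- cut(median_count,
            breaks = c(-Inf, 5000, 9000, Inf),
            labels = c("Low", "Medium", "High"))

print(data.frame(borough = names(median_count), median_count, zone,
                 row.names = NULL))
```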
summary(High_risk$RESPONSESECONDS)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 238.0 249.8 271.0 266.1 277.0 289.0
summary(Medium_risk$RESPONSESECONDS)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 276.0 282.2 284.5 286.1 289.0 301.0
summary(Low_risk$RESPONSESECONDS)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 273.0 280.0 285.0 291.7 288.2 338.0
We now have summaries for all three zones. The mean response time is highest for the low risk zone (291.7 seconds), followed by the medium (286.1) and high risk (266.1) zones. The high response time in the low risk zone suggests that a higher incident count does not by itself mean a slower response, but we will check this statistically within each zone.
Null Hypothesis: Incident Count and Response times are independent
Alternate Hypothesis: Incident Count and Response times are dependent
chisq.test(High_risk$INCIDENTCOUNT, High_risk$RESPONSESECONDS)
## Warning in chisq.test(High_risk$INCIDENTCOUNT, High_risk$RESPONSESECONDS):
## Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: High_risk$INCIDENTCOUNT and High_risk$RESPONSESECONDS
## X-squared = 874, df = 851, p-value = 0.2848
chisq.test(Medium_risk$INCIDENTCOUNT, Medium_risk$RESPONSESECONDS)
## Warning in chisq.test(Medium_risk$INCIDENTCOUNT, Medium_risk
## $RESPONSESECONDS): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: Medium_risk$INCIDENTCOUNT and Medium_risk$RESPONSESECONDS
## X-squared = 330, df = 315, p-value = 0.2693
chisq.test(Low_risk$INCIDENTCOUNT, Low_risk$RESPONSESECONDS)
## Warning in chisq.test(Low_risk$INCIDENTCOUNT, Low_risk$RESPONSESECONDS):
## Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: Low_risk$INCIDENTCOUNT and Low_risk$RESPONSESECONDS
## X-squared = 108, df = 99, p-value = 0.252
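The p-value in all three cases is greater than 0.05, so we do not have enough evidence to reject the null hypothesis that incident count and response time are independent. The warnings indicate that the chi-squared approximation may be unreliable here, so this result should be read with caution.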
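When `chisq.test()` is called with two vectors, it first cross-tabulates them, treating every distinct value as its own category. The sketch below uses small made-up vectors (not FDNY data) to show that the degrees of freedom equal (distinct x values - 1) × (distinct y values - 1). Read this way, the reported df of 851, 315 and 99 factor as 37 × 23, 21 × 15 and 11 × 9, which would be consistent with almost every observation falling in its own cell. That is the situation the approximation warning refers to.

```r
# Made-up monthly values for illustration (not FDNY data)
counts  <- c(100, 120, 120, 150, 160, 170)
seconds <- c(240, 250, 250, 260, 270, 270)

# chisq.test(x, y) builds this contingency table internally
tab <- table(counts, seconds)
print(dim(tab))  # rows = distinct counts, columns = distinct seconds

# The approximation warning is expected with such sparse cells
chi <- suppressWarnings(chisq.test(counts, seconds))

# Degrees of freedom = (rows - 1) * (columns - 1)
expected_df <- (nrow(tab) - 1) * (ncol(tab) - 1)
print(c(reported = unname(chi$parameter), expected = expected_df))
```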
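# Borough-level summary: station counts from the locations table,
# maximum monthly incident counts from the summaries above, area in square miles
Borough <- c("Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island")
StnCount <- c(34, 66, 48, 50, 20)
IncidentCount <- c(8949, 12670, 11630, 9272, 3555)
Area <- c(57, 96.9, 33.7, 178, 59)
location <- data.frame(Borough, StnCount, IncidentCount, Area)
# Square miles covered per station
location %>%
  mutate(sqMilesCovered = Area / StnCount)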
## Borough StnCount IncidentCount Area sqMilesCovered
## 1 Bronx 34 8949 57.0 1.6764706
## 2 Brooklyn 66 12670 96.9 1.4681818
## 3 Manhattan 48 11630 33.7 0.7020833
## 4 Queens 50 9272 178.0 3.5600000
## 5 Staten Island 20 3555 59.0 2.9500000
Mean_time <- c(Bx_mean, Brk_mean, M_mean, Qns_mean, SI_mean)
loc_mean <- data.frame(Borough, StnCount, Mean_time)
SqMilesCvd <- c(1.67, 1.46, .70, 3.56, 2.95)
cor(loc_mean$Mean_time, SqMilesCvd)
## [1] 0.5271344
# Moderate positive correlation, not a strong one (only 5 boroughs)
cor(loc_mean$Mean_time, loc_mean$StnCount)
## [1] -0.7970305
# Station count and response time are negatively correlated: boroughs with more stations tend to have lower response times.
cor(location$StnCount, location$Area)
## [1] 0.384409
# Weak correlation
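With only five boroughs, a correlation coefficient on its own says little about how reliable the relationship is. As a sketch, `cor.test()` from base R reports a p-value and a confidence interval alongside the estimate. The example reuses the station counts and areas listed above; the variable names `stn_count`, `area_sq_mi` and `ct` are chosen here for illustration.

```r
# Station counts and areas (square miles) for the five boroughs, as listed above
stn_count  <- c(34, 66, 48, 50, 20)
area_sq_mi <- c(57, 96.9, 33.7, 178, 59)

# Pearson correlation test: estimate, p-value and 95% confidence interval
ct <- cor.test(stn_count, area_sq_mi)
print(unname(ct$estimate))  # same value as cor(stn_count, area_sq_mi)
print(ct$p.value)
print(ct$conf.int)
```

The same call could be applied to the mean response times once `loc_mean` has been built.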
# Response time distribution for the three zones
# High = 1, Medium = 2, Low = 3
boxplot(High_risk$RESPONSESECONDS, Medium_risk$RESPONSESECONDS, Low_risk$RESPONSESECONDS)
# Distribution of firehouses across the boroughs
x <- barplot(location$StnCount, main = "Distribution of Firehouses", xlab = "Borough", ylab = "Frequency", col = c("darkblue", "red"), names.arg = location$Borough)
# Mean response time in seconds across the 5 boroughs
barplot(loc_mean$Mean_time, main = "Average Response Time in sec. by Borough", xlab = "Borough", ylab = "Time (in seconds)", col = c("darkblue", "red"), names.arg = loc_mean$Borough)
# Incident counts in the 5 boroughs: the highest counts are shaded dark blue and the lowest light blue.
# FIPS codes for the 5 counties (boroughs) of New York City
nyc_county_fips <- c(36005, 36047, 36061, 36081, 36085)
# Maximum monthly incident count per borough, in the same order as the FIPS codes
value <- c(8949, 12670, 11630, 9272, 3555)
df <- data.frame(region = nyc_county_fips, value)
county_choropleth(df,
  title = "NY City County Fire Zones",
  legend = "Boroughs",
  num_colors = 5,
  county_zoom = nyc_county_fips)
# Population covered by firehouses in the 5 boroughs: the highest population is shaded dark blue and the lowest light blue.
county_choropleth(df_pop_county,
  title = "NY City County Population Estimates",
  legend = "Population",
  num_colors = 5,
  county_zoom = nyc_county_fips)
The analysis addressed the areas of concern. The boroughs were divided into high, moderate and low risk zones. The average response time differed between boroughs. The firehouses were unevenly distributed: stations in Manhattan were the shortest distance apart, compared with Queens and Staten Island (SI). The statistical tests also found no evidence that incident count and response time depend on each other.

Brooklyn had the most fire incidents and Staten Island the fewest. Brooklyn nevertheless has the lowest and most effective response time, although it has the highest incident count and is densely populated, while SI has the highest response time in an emergency. One possible factor is that each fire company in Brooklyn covers far fewer square miles than in Staten Island. The larger number of firehouses in Brooklyn is another possible cause of its low response time. Brooklyn also has a larger population to cover than SI, but this cannot be considered an important factor. Density and high-rise buildings do not appear to affect response time in Brooklyn, while SI, with the smallest population and the fewest fire incidents, received comparatively less effective service, in accordance with a previous analysis.

An in-depth analysis of response in SI could reveal the causes of delays and possible ways to improve response times. Beyond these facts, FDNY firefighters are committed to their service, and FDNY is ranked at the top among US fire departments. Traffic might be another cause of delay. Other factors may also play a role, and this study can provide a foundation for considering them.
References:
http://www.firefighternation.com/article/technology/using-technology-reduce-response-times
http://www.r-bloggers.com/choroplethr-v3-0-0-is-now-on-cran/
http://onlinefiresciencedegree.org/noteworthy-fire-departments/
https://nycplatform.socrata.com/Public-Safety/FDNY-Firehouse-Listing/hc8x-tcnd