Introduction

Abstract

very brief what we modeled and what tools we used and who we did it for.

Motivation & Use Case

Structure fires are an issue that every town and city must grapple with. Fire departments across the United States work hard to save lives and property; however, fires still devastate hundreds of thousands of people a year. This year in particular, apartment fires in Philadelphia and the Bronx underscored the importance of prevention as the country grieved the loss of far too many families, children, and homes.

Guilford County, North Carolina, is less densely populated than Philadelphia and New York, yet structure fires still pose a risk and a challenge. In fact, Guilford County must address the unique challenge of fighting structure fires in a rural context. The extent of Guilford Emergency Services’ (EMS) jurisdiction illustrates this challenge well. Encompassing several municipalities, 19 fire districts, and 24 contracted departments, some of which extend beyond county lines, Guilford EMS faces a complex, fragmented resource allocation process.

Fire districts in Guilford have varying budget levels and staff structures. Some employ mostly full-time career staff, while others are operated mostly by volunteers.

Across the county, the size and scope of each fire district varies, as does the demand for fire service. Each district’s fire budget is largely a function of property tax revenue rather than demand for service, leaving it unclear whether every corner of the county receives the level of service it needs. Additionally, the county is growing. A 2019 report published by Fitch and Associates concluded that Guilford’s “overall call volume is increasing at approximately 4.2%” annually, due to population growth (page 103). At the same time, the volunteer fire workforce that the county has historically relied on is shrinking. According to County EMS Director Jim Albright, the sharp reduction in volunteer firefighters has made it even more challenging for fire departments to perform at the level they would like.

As a result, it is more critical than ever for Guilford EMS to understand where and how much service is needed across the county. Typically, in the emergency services field, solutions to this kind of challenge are outlined in a standard of cover document. This document gives emergency service providers a systematic way to determine resource allocation given the demand, risk, and needs across the county. However, understanding risk is a key prerequisite to outlining a standard of cover. Our goal is to equip our client with a deeper understanding of where there is exposure to risk of structure fires and which parts of the county exhibit greater vulnerability, so that they can make data-driven decisions about how to establish a common standard of cover.

Exploratory Data Analysis

The Data

The complexity of how administrative boundaries interface in the county is at the heart of the disjointed nature of Guilford EMS’ resource allocation process. Consequently, it has been at the forefront of our data exploration phase.

As the emergency services department for the entire county, our client’s role is to distribute resources and provide administrative support to the fire agencies under county authority. In total, Guilford EMS administers 19 fire districts and 24 fire departments that operate according to their own framework and hierarchy. The map below illustrates the various administrative characteristics that fire districts in the county may have. Some of these districts extend beyond county borders and include fire stations located entirely outside of Guilford. In contrast, the interior municipalities of High Point and Greensboro, the most populated parts of the county, manage their own fire response and prevention, separate from the county departments. As a result, the county’s emergency services encompass fire districts either fully or partially located beyond its borders, but do not include the interior cities of Greensboro or High Point. Consequently, our data is somewhat constrained. MAYBE INSERT SOME HINT OR INTRO TO A RECOMMENDATION

We have received rich parcel data for every property within the county proper. We’ve also been given access to violations and inspections data collected by each county fire department, but these cover only commercial occupancies within their respective districts. Because the county boundary and the fire district boundaries do not coincide, the parcel and inspection data sets do not cover the same geographic area. The parcel data is bounded by Guilford’s neat rectangular border, while the inspection data includes properties outside the county. This issue extends to our dependent variable, fire events, which align with the sprawling fire districts.

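## packages used by the map code (skip if already attached elsewhere)
library(sf)        # st_coordinates()
library(mapview)   # mapview()
library(leaflet)   # setView()
library(dplyr)     # %>%

load("./Environments/FiresNasBarsEnv.RData")

## data come from Scripts/NasAndBarCharts.R and the previously loaded environment

## mean x and mean y of the fire events, used to center the map view
cntr_crds <- c(mean(st_coordinates(firecad_fires_simplified)[, 1]),
               mean(st_coordinates(firecad_fires_simplified)[, 2]))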

## fire districts, fires, and parcels



fires_onMap <-
  mapview(firedist_4326, color="palevioletred", lwd=1, alpha=0.7, alpha.regions=0.1, col.regions = "palevioletred", layer.name = "Fire Districts")+
  mapview(allfires4326, cex=3,color="orange", col.regions='orange', alpha.regions=0.7, layer.name = "Structure Fire Events")

fires_onMap <- fires_onMap@map %>% setView(cntr_crds[1], cntr_crds[2], zoom = 10)

fires_onMap

Fire Events on the Map

Fire events in our study area from 2016 to 2021 are shown as orange dots on the map. All fire events are bound by the fire district polygons in pink.

Our initial concern was to preserve as much of the fire event data as we could, because observations of fire are proportionally few. At the same time, we needed to identify features from all of our data sets that are shared among all fire events. Spatially, the only area where all three data sets overlap (fire events, commercial inspections, and parcel data) is within the borders of the county proper. For this reason, we decided to move forward using only data bounded by Guilford County. Additionally, we noticed severe gaps in information around the cities of High Point and Greensboro. We were able to incorporate supplementary data from the City of Greensboro, but we were unable to do the same for High Point. Therefore, our final study area excludes the city of High Point.
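The restriction to county-bounded data described above can be illustrated with a small spatial filter. This is only a sketch under stated assumptions, not the team's actual clipping step: the rectangle `county`, the toy `fires` points, and the planar CRS are all hypothetical stand-ins, and `st_filter()` from the sf package is one of several ways to keep only the events that fall inside a boundary.

```r
library(sf)

# Hypothetical county boundary: a simple rectangle in a planar CRS
county <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(10, 0), c(10, 6), c(0, 6), c(0, 0)))),
  crs = 3857
)

# Hypothetical fire events; ids 4 and 5 fall outside the rectangle
fires <- st_as_sf(
  data.frame(id = 1:5,
             x  = c(1, 5, 9, 11, 4),
             y  = c(1, 3, 5, 2, 8)),
  coords = c("x", "y"),
  crs = 3857
)

# Keep only the events that intersect the county polygon
fires_in_county <- st_filter(fires, county)

fires_in_county$id
```

The same pattern would apply to the inspection records, so that all three data sets share one footprint before they are joined.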

In preparation for modeling, we have further reduced our data. We have only included the last five years of fire incident data (2016-2021) and the last six years of inspection and violation data. Our main source of structure detail, the county parcel data, does not have observations for multiple years, so it will remain complete. However, our data were still limited by incompleteness.

Below are two visualizations showing which candidate features are missing a substantial share of their values. The variables are on the y-axis and the percentage of NA values in the dataset is on the x-axis.

STARTING WITH WHAT WE HAVE ABOUT STRUCTURES

## data come from Scripts/NasAndBarCharts.R 

  gg_miss_var(parcels2_fire_trim%>%
                dplyr::select(-FIRE ),
              show_pct = TRUE)+
  labs(title="Parcel Data",y = "Percent Missing Values")+
  ylim(NA, 100)+
  ourtheme()
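The percentages plotted above can also be tabulated directly, which is handy when deciding which features to drop. The sketch below uses base R only; the toy `parcels` table borrows three column names from the parcel data, but its values are invented purely for illustration.

```r
# Hypothetical stand-in for the parcel table; values are illustrative only
parcels <- data.frame(
  YEAR_BUILT  = c(1950, NA, 1987, 2004, NA),
  TOTAL_UNITS = c(1, 2, NA, 1, 1),
  BUILD_COUNT = c(1, 1, 1, 2, 1)
)

# Share of NA values per column, expressed as a percentage
pct_missing <- colMeans(is.na(parcels)) * 100

# Sort so the most incomplete variables come first
pct_missing <- sort(pct_missing, decreasing = TRUE)

pct_missing
```

A cutoff on `pct_missing` could then serve as a simple, explicit rule for excluding sparse variables, if such a rule is wanted.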

MAYBE TALK ABOUT WHAT WE HAVE FROM OCCUPANCY AND VIOLATIONS & HOW THEY ARE ONLY FOR COMMERCIAL

Exploring Fire Risk

data sources put what this image says into words in the paragraph as well

Our goal is to provide Guilford EMS with a strong spatial understanding of where fire risk lies in the county. Traditional methods for identifying risk employ kernel density, which suggests where future fires will occur based solely on where fires have happened in the past. Our approach is different. While we have incorporated the intelligence that past fires provide by identifying hotspots, we also investigated potential sources of latent risk. Latent risk paints a more detailed picture of fire risk because it incorporates the environmental and social factors that are associated with fire events. It is critical to incorporate that source of intelligence about fire risk, especially in this context, where fires are rare. Less than one percent of all parcels in our study area have experienced a fire. The rarity of fires in Guilford County is not unusual, but it does pose a significant challenge for accurately modeling fire risk.
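To make the kernel density baseline concrete, the sketch below estimates a density surface from past event locations and finds its peak. It is an illustration under stated assumptions rather than the analysis used in this report: the `fires` coordinates are simulated, the 10 by 10 study area is arbitrary, and `kde2d()` from the MASS package is used here with its default bandwidth.

```r
library(MASS)  # provides kde2d()

set.seed(42)

# Hypothetical fire locations: a tight cluster plus scattered background events
fires <- data.frame(
  x = c(rnorm(80, mean = 2, sd = 0.3), runif(20, 0, 10)),
  y = c(rnorm(80, mean = 7, sd = 0.3), runif(20, 0, 10))
)

# Estimate a smooth density surface on a 50 x 50 grid over the study area
dens <- kde2d(fires$x, fires$y, n = 50, lims = c(0, 10, 0, 10))

# Locate the grid cell with the highest estimated density (the hotspot);
# rows of dens$z index x and columns index y
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)[1, ]
hotspot <- c(x = dens$x[peak["row"]], y = dens$y[peak["col"]])

hotspot
```

Because the surface depends only on where past events occurred, it says nothing about a parcel with no nearby fire history, which is the gap that latent risk factors are meant to fill.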

insert large Graphic of % of fire events parcel level

What Characteristics Do Structures that Have Experienced Fire Have in Common?

In addition to investigating fires and where they happen in space, we sought to identify trends in the characteristics of structures that have experienced fire. By identifying shared features among historic fire events, we can apply this information to model exposure to risk.

Fire by Structure Type

Restricting our study area to the official county lines of Guilford meant that we lost informative observations from commercial structures lying outside the county. However, as Director Albright highlighted in our first meeting, the majority of structure fires are residential. We found this to be true in the fire event data set, as shown in the bar chart below.

Both Director Albright’s observation and the data make clear that house fires make up the majority of structure fires in Guilford County. need more words?

## data come from Scripts/NasAndBarCharts.R 

## bars are ordered by frequency, with the most common structure use at the top
ggplot(firecad_fires_simplified %>%
         st_drop_geometry() %>%
         drop_na(StructureUse),
       aes(y = forcats::fct_rev(forcats::fct_infreq(StructureUse)))) +
  geom_bar(fill = "#2d4739", na.rm = TRUE) +
  labs(title = "Fires By Structure Use", y = "Use", x = "Count of Buildings by Type With Fire") +
  ourtheme() +
  theme(axis.text.x = element_text(angle = 45)) +
  theme(text = element_text(size = 15))

Starting with a few numeric variables, we show the distribution of fire events over structure age for all structures built from 1900 onward. Additionally, because we know from Director Albright’s note and our own data exploration that residences are the most common type of structure to experience fire, we compared trends for residences generally with those for one- and two-unit homes.

Fire By Year Built, All Structures 1900 - Present

This plot makes clear that an overall increase in structures after 2000 was met with a rise in structure fires. However, an increase in fires among structures built around 1960 is not matched by an increase in the number of structures built during that time.

plot_ly(parcels2_fire_trim%>%
          filter(YEAR_BUILT>1899),  
        y = ~YEAR_BUILT, 
        color = ~FIRE, 
        type = "violin", 
        spanmode = 'hard',
        box = list(visible = T),
        meanline = list(visible = T),
        colors = c(palette[3], palette[2]))%>%
  layout(yaxis = list(title = "YEAR BUILT"), 
         title = "Fire By Structure Year Built, 1900 - Present")

Fire By Building Count & Unit Count

  plot_ly(parcels2_fire_trim,
          y = ~BUILD_COUNT, 
          color = ~FIRE, 
          type = "violin", 
          spanmode = 'hard',
          meanline = list(visible = T),
          box = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Building Count", range = c(-2, 10)), 
           title = "Fire By Building Count")

  plot_ly(parcels2_fire_trim%>%
            filter(!is.na(TOTAL_UNITS)),
          y = ~TOTAL_UNITS, 
          color = ~FIRE, 
          type = "violin", 
          spanmode = 'hard',
          meanline = list(visible = T),
          box = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Unit Count", range = c(0, 10)), 
           title = "Fire By Unit Count")

Fire By Building Value, Area & Year: Residences of All Kinds

One- and two-unit homes that have not experienced fire have a greater range of value, while the value of homes with fires appears bounded between 70K and 280K. Additionally, the distribution of one- and two-unit homes that have had a fire appears weighted more toward larger homes, as shown in the Fire by Main Heated Area plot. Further, one- and two-unit homes built after 2013 have not had any fires at all.

  plot_ly(parcels2_fire_trim%>%
            filter(str_detect(parcels2_fire_trim$StructureUse, "RESIDENTIAL")), 
          y = ~TOTAL_BUILDING_VALUE_ASSESSED, 
          color = ~FIRE, 
          type = "violin", 
          spanmode = 'hard',
          meanline = list(visible = T),
          box = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Total Building Value Assessed", range = c(0, 2500000)), 
           title = "Fire By Building Value: All Residences")

  plot_ly(parcels2_fire_trim%>%
            filter(str_detect(parcels2_fire_trim$StructureUse, "RESIDENTIAL")), 
          y = ~MAIN_BUILDING_HEATED_AREA, 
          color = ~FIRE, 
          type = "violin",
          spanmode = 'hard',
          box = list(visible = T),
          meanline = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Main Building Heated Area", range = c(0, 9000)),
           title = "Fire By Main Heated Area (size): All Residences")

  plot_ly(parcels2_fire_trim%>%
            filter(str_detect(parcels2_fire_trim$StructureUse, "RESIDENTIAL")), 
          y = ~YEAR_BUILT, 
          color = ~FIRE, 
          type = "violin",
          spanmode = 'hard',
          box = list(visible = T),
          meanline = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Year Built"), title = "Fire By Year Built: All Residences")

Fire By Building Value, Area & Year: 1 & 2 Unit Residences

One- and two-unit residences (that is, single-family homes and duplexes) that have not experienced fire have a greater range of value, while the value of homes with fires appears bounded between 70K and 280K. Additionally, the distribution of one- and two-unit homes that have had a fire appears weighted more toward larger homes, as shown in the Fire by Main Heated Area plot. Further, one- and two-unit homes built after 2013 have not had any fires at all.

  plot_ly(parcels2_fire_trim%>%
            filter(str_detect(parcels2_fire_trim$StructureUse, "RESIDENTIAL"),
                   parcels2_fire_trim$TOTAL_UNITS < 3,
                   parcels2_fire_trim$BUILD_COUNT < 3),
          y = ~TOTAL_BUILDING_VALUE_ASSESSED, 
          color = ~FIRE, 
          type = "violin", 
          spanmode = 'hard',
          meanline = list(visible = T),
          box = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Total Building Value Assessed", range = c(0, 1500000)),
           title = "Fire By Building Value: 1 & 2 Unit Residences")

  plot_ly(parcels2_fire_trim%>%
            filter(str_detect(parcels2_fire_trim$StructureUse, "RESIDENTIAL"), 
                   parcels2_fire_trim$TOTAL_UNITS < 3,
                   parcels2_fire_trim$BUILD_COUNT < 3), 
          y = ~MAIN_BUILDING_HEATED_AREA, 
          color = ~FIRE, 
          type = "violin",
          spanmode = 'hard',
          box = list(visible = T),
          meanline = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Main Building Heated Area", range = c(0, 9000)),
           title = "Fire By Main Heated Area (size): 1 & 2 Unit Residences")

  plot_ly(parcels2_fire_trim%>%
            filter(str_detect(parcels2_fire_trim$StructureUse, "RESIDENTIAL"), 
                   parcels2_fire_trim$TOTAL_UNITS < 3,
                   parcels2_fire_trim$BUILD_COUNT < 3), 
          y = ~YEAR_BUILT, 
          color = ~FIRE, 
          type = "violin",
          spanmode = 'hard', 
          box = list(visible = T),
          meanline = list(visible = T),
          colors = c(palette[3], palette[2]))%>%
    layout(yaxis = list(title = "Year Built"), title = "Fire By Year Built: 1 & 2 Unit Residences")

Feature Engineering

need to say something about the aggregations

have the aggregated features here with the corrplot and such

Fishnet Grid

Modeling

Guilford EMS Team: Modeling & Application

5/10/2022

Introduction

Abstract

Motivation & Use Case

Exploratory Data Analysis

The Data

Exploring Fire Risk

What Characteristics Do Structures that Have Experienced Fire Have in Common?

Fire by Structure Type

Fire By Year Built, All Structures 1900 - Present

Fire By Building Count & Unit Count

Fire By Building Value, Area & Year: Residences of All Kinds

Fire By Building Value, Area & Year: 1 & 2 Unit Residences

Feature Engineering
