## Analysis of Federal Disaster Closeout Durations

Introduction

The Federal Emergency Management Agency (FEMA) maintains the Disaster Declarations Summary, a dataset that summarizes all federally declared disasters. The record goes back to the first federal disaster declaration in 1953, for a tornado in Georgia on May 2. The Disaster Declarations Summary is raw, unedited data, much of which was manually entered from highly variable historical records. This analysis addresses the unique structure and idiosyncrasies of the dataset to present an analysis of disaster durations for various categories of federal disasters. Additionally, regional differences in disaster duration will be explored.

Data Preparation

Loading libraries used

Loading the Disaster Declaration Dataset to an R data.frame
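
The loading code itself is not shown in this section. As a minimal sketch of the step, assuming a local CSV copy of the dataset, the following uses base R in place of the readr call that produced the column specification below. The temporary file and its two rows (copied from the preview in this section) are hypothetical stand-ins for the real file, and the timestamp format in the file is an assumption.

```r
# Hypothetical stand-in for the real CSV: two rows copied from the preview,
# written to a temporary file so that the sketch runs on its own.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "disasterNumber,disasterType,declarationDate,disasterCloseOutDate",
  "1,DR,1953-05-02 00:00:00,1954-06-01 00:00:00",
  "2,DR,1953-05-15 00:00:00,1958-01-01 00:00:00"
), csv_path)

# Read the file into a data.frame, keeping text columns as character.
disaster_df <- read.csv(csv_path, stringsAsFactors = FALSE)

# Parse the date columns as date-times (UTC is an assumption).
date_cols <- c("declarationDate", "disasterCloseOutDate")
disaster_df[date_cols] <- lapply(disaster_df[date_cols], as.POSIXct, tz = "UTC")

head(disaster_df)
```

With readr, the equivalent call would be `read_csv()` on the same path, which parses the date-time columns automatically.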

## Parsed with column specification:
## cols(
##   disasterNumber = col_double(),
##   ihProgramDeclared = col_double(),
##   iaProgramDeclared = col_double(),
##   paProgramDeclared = col_double(),
##   hmProgramDeclared = col_double(),
##   state = col_character(),
##   declarationDate = col_datetime(format = ""),
##   fyDeclared = col_double(),
##   disasterType = col_character(),
##   incidentType = col_character(),
##   title = col_character(),
##   incidentBeginDate = col_datetime(format = ""),
##   incidentEndDate = col_datetime(format = ""),
##   disasterCloseOutDate = col_datetime(format = ""),
##   declaredCountyArea = col_character(),
##   placeCode = col_double(),
##   hash = col_character(),
##   lastRefresh = col_datetime(format = ""),
##   id = col_character()
## )
## # A tibble: 6 x 16
##   disasterNumber disasterType declarationDate     disasterCloseOutDa…
##            <dbl> <chr>        <dttm>              <dttm>             
## 1              1 DR           1953-05-02 00:00:00 1954-06-01 00:00:00
## 2              2 DR           1953-05-15 00:00:00 1958-01-01 00:00:00
## 3              3 DR           1953-05-29 00:00:00 1960-02-01 00:00:00
## 4              6 DR           1953-06-09 00:00:00 1956-03-30 00:00:00
## 5              4 DR           1953-06-02 00:00:00 1956-02-01 00:00:00
## 6              5 DR           1953-06-06 00:00:00 1955-12-01 00:00:00
## # … with 12 more variables: incidentType <chr>, ihProgramDeclared <dbl>,
## #   iaProgramDeclared <dbl>, paProgramDeclared <dbl>, hmProgramDeclared <dbl>,
## #   incidentBeginDate <dttm>, incidentEndDate <dttm>, state <chr>,
## #   declaredCountyArea <chr>, placeCode <dbl>, interval <Interval>,
## #   duration_secs <dbl>
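
The preview lists two derived columns, interval and duration_secs, whose construction is not shown. One way such a duration could be computed is sketched below. It assumes the duration runs from declarationDate to disasterCloseOutDate, which fits the closeout framing of this analysis but is not confirmed by the text, and it uses base R `difftime()` rather than the lubridate interval the original appears to use. The first two rows come from the preview; the third row, with no closeout date, is hypothetical.

```r
# Two rows from the preview plus one hypothetical disaster that is still open.
disaster_df <- data.frame(
  disasterNumber       = c(1, 2, 9999),
  declarationDate      = as.POSIXct(c("1953-05-02", "1953-05-15", "2020-01-01"),
                                    tz = "UTC"),
  disasterCloseOutDate = as.POSIXct(c("1954-06-01", "1958-01-01", NA),
                                    tz = "UTC")
)

# Duration in seconds between declaration and closeout.
# Rows without a closeout date produce NA.
disaster_df$duration_secs <- as.numeric(
  difftime(disaster_df$disasterCloseOutDate,
           disaster_df$declarationDate,
           units = "secs")
)

disaster_df
```

If the duration is instead meant to span incidentBeginDate to incidentEndDate, the same call applies with those two columns substituted.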

Research question

Has disaster closeout time changed significantly over the period in which FEMA has been issuing disaster declarations? Is the change consistent across disaster types and geographical regions?

Cases

What are the cases, and how many are there?

Each case represents a FEMA-declared disaster event. The number of events documented in this data set is given by nrow(disaster_df).

Data collection

Describe the method of data collection.

The data are collected by the Federal Emergency Management Agency (FEMA) and published as part of OpenFEMA, an effort to make data freely available in machine-readable formats.

Type of study

What type of study is this (observational/experiment)?

This is an observational study.

Data Source

The data are collected by OpenFEMA and are available online as the OpenFEMA Dataset: Disaster Declarations Summaries - V2. For this project, the data were uploaded to the author's GitHub to facilitate access.

Response

What is the response variable, and what type is it (numerical/categorical)?

The response variable is disaster duration, a numerical variable calculated as the difference between the ‘incidentBeginDate’ and ‘incidentEndDate’ features.

Explanatory

What is the explanatory variable, and what type is it (numerical/categorical)?

The explanatory variables are categorical: disaster type (‘incidentType’) and geographical region (‘fipsStateCode’ and ‘fipsCountyCode’).

Relevant summary statistics

Summary Statistics for Disaster Durations

## [1] "Summary Statistics given in Days"
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##       0    1312    2500    2592    3563   12107   10017
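
A summary like the one above can be produced by converting seconds to days and calling `summary()`. The sketch below assumes the durations are stored in seconds, as the duration_secs column name suggests; the values are illustrative only and are not taken from the dataset.

```r
# Illustrative durations in seconds; NA marks a record with no usable dates.
duration_secs <- c(0, 34128000, 146361600, 250000000, NA)

# There are 86,400 seconds in a day.
duration_days <- duration_secs / (60 * 60 * 24)

print("Summary Statistics given in Days")
summary(duration_days)
```

`summary()` reports the count of missing values in a separate NA's column, which is how the missing durations appear in the output above.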

Summary Statistics for Duration (Days) by Disaster Type

## # A tibble: 4 x 5
##   disasterType count  mean median   IQR
##   <chr>        <int> <dbl>  <dbl> <dbl>
## 1 DR           36498 3125.   3009 1815 
## 2 EM           13097 1506.   1108 1203 
## 3 FM            1031 1123.   1140  666 
## 4 FS             390 1248.   1058  661.
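
The grouped table above reports count, mean, median and IQR per disaster type. A minimal base R sketch of that kind of summary follows. The data frame is a toy example, and dropping records with missing durations before counting is an assumption; the source does not say whether its counts include them.

```r
# Toy data: disaster type and duration in days (values are illustrative).
toy <- data.frame(
  disasterType  = c("DR", "DR", "DR", "EM", "EM", "FM"),
  duration_days = c(3000, 2500, NA, 1100, 1300, 1140)
)

# Assumption: records with a missing duration are excluded from every statistic.
complete <- toy[!is.na(toy$duration_days), ]

# Split by disaster type and compute one summary row per group.
by_type <- do.call(rbind, lapply(
  split(complete, complete$disasterType),
  function(g) {
    data.frame(
      disasterType = g$disasterType[1],
      count        = nrow(g),
      mean         = mean(g$duration_days),
      median       = median(g$duration_days),
      IQR          = IQR(g$duration_days)
    )
  }
))
rownames(by_type) <- NULL

by_type
```

With dplyr, the same result would come from `group_by()` followed by `summarise()`, which matches the tibble format shown above.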

Boxplot of Disaster Types Summary Stats
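
The plotting code is not shown here. As a hedged sketch of how such a boxplot could be drawn, the following uses base R `boxplot()` with a formula on toy data; the column name duration_days and all values are illustrative assumptions.

```r
# Toy data: duration in days for two disaster types (values are illustrative).
toy <- data.frame(
  disasterType  = c("DR", "DR", "DR", "EM", "EM", "EM"),
  duration_days = c(2500, 3000, 3500, 900, 1100, 1500)
)

# One box per disaster type; the returned list holds the plotted statistics.
bp <- boxplot(
  duration_days ~ disasterType,
  data = toy,
  main = "Duration (Days) by Disaster Type",
  xlab = "Disaster type",
  ylab = "Duration (days)"
)

# Row 3 of bp$stats holds the median of each group.
bp$stats[3, ]
```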