Synopsis

This analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, covering 1950 to November 2011, to determine which weather events are most harmful to population health and the economy. The data are lightly processed to remove unneeded columns, group similar weather event types, and express property and crop damage in USD. The population health indicator is the sum of all injuries and fatalities for a weather event type; the economic damage indicator is the sum of all property and crop damage for that event type. Only the top 10 weather events for each indicator are plotted. The weather events that appear in both top 10 lists are also reported.

Introduction

“This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.” - Reproducible Research project explanation

The questions answered in this analysis are:

  1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
  2. Across the United States, which types of events have the greatest economic consequences?

The NOAA Storm dataset, as downloaded from the Coursera course site, spans the period from 1950 to November 2011. There are fewer recorded events in the earlier years, most likely due to a lack of good records.

Data Processing

Information about the software environment being used for reproducibility

sessionInfo()
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
## 
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
## [1] digest_0.6.4     evaluate_0.5.5   formatR_0.10     htmltools_0.2.4 
## [5] knitr_1.6        rmarkdown_0.2.48 stringr_0.6.2    tools_3.0.2     
## [9] yaml_2.1.11

The commented-out code below should work on most systems to download the required NOAA data and create the CSV file needed for the analysis. It is included for complete reproducibility, with only a URL needed. Skip to the next code chunk for data processing that starts from the unzipped CSV file downloaded from the Coursera web site.

# wd <- readline(prompt = "Enter directory to download NOAA data to: ")
# setwd(wd)
# download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","StormData.csv.bz2",method="curl")
# # read.csv opens and closes the bzfile connection itself, so no close() call is needed
# data <- read.csv(bzfile("StormData.csv.bz2"))
# # row.names=FALSE avoids writing an extra index column into the CSV
# write.csv(data,file="StormData.csv",row.names=FALSE)
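
One way to make the download step safe to re-run is to skip it when the file is already on disk. The sketch below is only an illustration: `fetch_if_missing` is a hypothetical helper name that is not part of the original analysis, and the usage lines are left commented out because they need network access.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` does not exist yet.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(url, dest, mode = "wb")
  }
  dest
}

# Usage (not run here; requires network access):
# path <- fetch_if_missing(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#   "StormData.csv.bz2")
# data <- read.csv(bzfile(path), stringsAsFactors = FALSE)
```

Returning the destination path lets the call be chained straight into the reading step.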

The CSV data file is loaded into R for processing

setwd("~/Desktop/test_repo/RepData_PeerAssessement2")
data <- read.csv("StormData.csv",header=TRUE,stringsAsFactors = FALSE)
head(data)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL
##    EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO         0                                               0
## 2 TORNADO         0                                               0
## 3 TORNADO         0                                               0
## 4 TORNADO         0                                               0
## 5 TORNADO         0                                               0
## 6 TORNADO         0                                               0
##   COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1         NA         0                      14.0   100 3   0          0
## 2         NA         0                       2.0   150 2   0          0
## 3         NA         0                       0.1   123 2   0          0
## 4         NA         0                       0.0   100 2   0          0
## 5         NA         0                       0.0   150 2   0          0
## 6         NA         0                       1.5   177 2   0          0
##   INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1       15    25.0          K       0                                    
## 2        0     2.5          K       0                                    
## 3        2    25.0          K       0                                    
## 4        2     2.5          K       0                                    
## 5        2     2.5          K       0                                    
## 6        6     2.5          K       0                                    
##   LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1     3040      8812       3051       8806              1
## 2     3042      8755          0          0              2
## 3     3340      8742          0          0              3
## 4     3458      8626          0          0              4
## 5     3412      8642          0          0              5
## 6     3450      8748          0          0              6
str(data)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...

Unnecessary columns are removed from the data table to streamline exploration of the data. The columns retained are:

  1. BGN_DATE (Beginning date of event)
  2. EVTYPE (Type of weather event)
  3. FATALITIES
  4. INJURIES
  5. PROPDMG (Property damage - numeric)
  6. PROPDMGEXP (Exponent multiplier for property damage - K (thousands), M (millions), B (billions))
  7. CROPDMG (Crop damage - numeric)
  8. CROPDMGEXP (see PROPDMGEXP)
  9. REMARKS
  10. REFNUM
strip_data <- data[,c("BGN_DATE","EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP","REMARKS","REFNUM")]
rm(data)
str(strip_data)
## 'data.frame':    902297 obs. of  10 variables:
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...

Steps:

  1. Remove unnecessary data and group similar events under the same title
  2. Separate into two data sets - population health (injuries > 0 or fatalities > 0) and economic damage (property damage > 0 or crop damage > 0)
  3. Create the population health indicator as the sum of injuries and fatalities from all events from 1950 to Nov 2011
  4. Create the economic damage indicator as the sum of all crop and property damage from all events from 1950 to Nov 2011
  5. Rank event types by each indicator and plot the top 10 worst weather events for both cases.
# Processing data
proc_data <- strip_data
# Some summarising of event types
proc_data$EVTYPE <- toupper(proc_data$EVTYPE)
original <- c("TSTM","AVALANCE","BLOWING SNOW","BRUSH FIRE","WILD FIRE")
replacement <- c("THUNDERSTORM","AVALANCHE","BLIZZARD","WILDFIRE","WILDFIRE")
for (x in seq_along(original)) {
    proc_data$EVTYPE <- gsub(original[x],replacement[x],proc_data$EVTYPE)
}
proc_data$EVTYPE[grep("TORNADO",proc_data$EVTYPE)] <- "TORNADO"
proc_data$EVTYPE[grep("HURRICANE",proc_data$EVTYPE)] <- "HURRICANE/TYPHOON"
proc_data$EVTYPE[grep("TYPHOON",proc_data$EVTYPE)] <- "HURRICANE/TYPHOON"
proc_data$EVTYPE[grep("FLASH FLOOD",proc_data$EVTYPE)] <- "FLASH FLOOD"
proc_data$EVTYPE[grep("^COASTAL",proc_data$EVTYPE)] <- "COASTAL FLOOD"
proc_data$EVTYPE[grep("^COLD",proc_data$EVTYPE)] <- "COLD/WIND CHILL"
proc_data$EVTYPE[grep("^DROUGHT",proc_data$EVTYPE)] <- "DROUGHT"
proc_data$EVTYPE[grep("^EXTREME COLD",proc_data$EVTYPE)] <- "EXTREME COLD/WIND CHILL"
proc_data$EVTYPE[grep("EXCESSIVE RAINFALL",proc_data$EVTYPE)] <- "HEAVY RAIN"
proc_data$EVTYPE[grep("EXCESSIVE SNOW",proc_data$EVTYPE)] <- "HEAVY SNOW"
proc_data$EVTYPE[grep("EXTREME HEAT",proc_data$EVTYPE)] <- "EXCESSIVE HEAT"
proc_data$EVTYPE[grep("EXTREME WINDCHILL",proc_data$EVTYPE)] <- "EXTREME COLD/WIND CHILL"
proc_data$EVTYPE[grep("FALLING SNOW",proc_data$EVTYPE)] <- "HEAVY SNOW"
proc_data$EVTYPE[grep("^FLOOD",proc_data$EVTYPE)] <- "FLOOD"
proc_data$EVTYPE[grep("^HIGH WIND",proc_data$EVTYPE)] <- "HIGH WIND"
proc_data$EVTYPE[grep("^HEAVY SNOW",proc_data$EVTYPE)] <- "HEAVY SNOW"
proc_data$EVTYPE[grep("^HEAVY SURF",proc_data$EVTYPE)] <- "HIGH SURF"
proc_data$EVTYPE[grep("^LIGHTNING",proc_data$EVTYPE)] <- "LIGHTNING"
proc_data$EVTYPE[grep("RIVER FLOOD",proc_data$EVTYPE)] <- "FLOOD"
proc_data$EVTYPE[grep("^SNOW",proc_data$EVTYPE)] <- "HEAVY SNOW"
proc_data$EVTYPE[grep("^WINTER STORM",proc_data$EVTYPE)] <- "WINTER STORM"
proc_data$EVTYPE[grep("^WINTER WEATHER",proc_data$EVTYPE)] <- "WINTER WEATHER"
proc_data$EVTYPE[grep("^WINTRY",proc_data$EVTYPE)] <- "WINTER WEATHER"
proc_data$EVTYPE[grep("^WIND",proc_data$EVTYPE)] <- "STRONG WIND"
proc_data$EVTYPE[grep("^WILD",proc_data$EVTYPE)] <- "WILDFIRE"
proc_data$EVTYPE[grep("^THUNDERSTORM",proc_data$EVTYPE)] <- "THUNDERSTORM WIND"
proc_data$EVTYPE[grep("^URBAN",proc_data$EVTYPE)] <- "FLOOD"
proc_data$EVTYPE[grep("^WARM WEATHER",proc_data$EVTYPE)] <- "EXCESSIVE HEAT"
proc_data$EVTYPE[grep("^UNSEASONABLY WARM",proc_data$EVTYPE)] <- "EXCESSIVE HEAT"
proc_data$EVTYPE[grep("^UNSEASONABLY COLD",proc_data$EVTYPE)] <- "COLD/WIND CHILL"
proc_data$EVTYPE[grep("^TROPICAL STORM",proc_data$EVTYPE)] <- "TROPICAL STORM"
proc_data$EVTYPE[grep("^TORRENTIAL RAIN",proc_data$EVTYPE)] <- "HEAVY RAIN"
proc_data$EVTYPE[grep("^TIDAL FLOOD",proc_data$EVTYPE)] <- "COASTAL FLOOD"
proc_data$EVTYPE[grep("^THUNDERTORM",proc_data$EVTYPE)] <- "THUNDERSTORM WIND"
proc_data$EVTYPE[grep("^STRONG WIND",proc_data$EVTYPE)] <- "STRONG WIND"
proc_data$EVTYPE[grep("^STORM SURGE",proc_data$EVTYPE)] <- "STORM SURGE/TIDE"
proc_data$EVTYPE[grep("^SMALL HAIL",proc_data$EVTYPE)] <- "HAIL"
proc_data$EVTYPE[grep("^RIP CURRENT",proc_data$EVTYPE)] <- "RIP CURRENT"
proc_data$EVTYPE[grep("^RECORD HEAT",proc_data$EVTYPE)] <- "EXCESSIVE HEAT"
proc_data$EVTYPE[grep("EXCESSIVE HEAT",proc_data$EVTYPE)] <- "EXCESSIVE HEAT"
proc_data$EVTYPE[grep("^LOW TEMPERATURE",proc_data$EVTYPE)] <- "COLD/WIND CHILL"
proc_data$EVTYPE[grep("^RECORD COLD",proc_data$EVTYPE)] <- "EXTREME COLD/WIND CHILL"
proc_data$EVTYPE[grep("^FREEZ",proc_data$EVTYPE)] <- "FROST/FREEZE"
proc_data$EVTYPE[grep("^FROST",proc_data$EVTYPE)] <- "FROST/FREEZE"
proc_data$EVTYPE[grep("^HAIL",proc_data$EVTYPE)] <- "HAIL"
proc_data$EVTYPE[grep("THUNDERSTORM",proc_data$EVTYPE)] <- "THUNDERSTORM WIND"
proc_data$EVTYPE[grep("RAIN",proc_data$EVTYPE)] <- "HEAVY RAIN"
proc_data$EVTYPE[grep("WARM",proc_data$EVTYPE)] <- "EXCESSIVE HEAT"
proc_data$EVTYPE[grep("WIND CHILL",proc_data$EVTYPE)] <- "COLD/WIND CHILL"
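
The sequential pattern replacements above can also be written as an ordered rule table applied in a loop. The following is a minimal sketch on toy labels, not a replacement for the full rule set: the names `rules` and `recode_events` are hypothetical, and the four rules are a simplified subset chosen for illustration.

```r
# Toy event labels (illustrative only, not taken from the NOAA file)
events <- c("TSTM WIND", "tornado f1", "FLASH FLOODING", "RIVER FLOOD", "HAIL")

# Ordered rules: each pattern is applied in turn, so later rules see the
# labels produced by earlier ones, as with a sequence of grep assignments.
rules <- data.frame(
  pattern = c("TORNADO", "FLASH FLOOD", "RIVER FLOOD", "TSTM"),
  label   = c("TORNADO", "FLASH FLOOD", "FLOOD", "THUNDERSTORM WIND"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: upper-case the labels, then overwrite every label
# that matches a rule's pattern with that rule's canonical name.
recode_events <- function(x, rules) {
  x <- toupper(x)
  for (i in seq_len(nrow(rules))) {
    x[grepl(rules$pattern[i], x)] <- rules$label[i]
  }
  x
}

recoded <- recode_events(events, rules)
print(recoded)
```

Keeping the rules in a table makes their order explicit, which matters because a later, broader pattern can overwrite a label assigned by an earlier one.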

# Separate into population health data and economic data
pop_data <- proc_data[proc_data$INJURIES>0|proc_data$FATALITIES>0,2:4]
econ_data <- proc_data[proc_data$CROPDMG>0|proc_data$PROPDMG>0,c(1,2,5,6,7,8)]

# Processing exponents in economic damages
econ_data$PROPDMGEXP <- toupper(econ_data$PROPDMGEXP)
econ_data$CROPDMGEXP <- toupper(econ_data$CROPDMGEXP)
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="H"] <- paste(10^2,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="K"] <- paste(10^3,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="M"] <- paste(10^6,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="B"] <- paste(10^9,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="2"] <- paste(10^2,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="3"] <- paste(10^3,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="4"] <- paste(10^4,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="5"] <- paste(10^5,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="6"] <- paste(10^6,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="7"] <- paste(10^7,sep="")
econ_data$PROPDMGEXP[econ_data$PROPDMGEXP=="8"] <- paste(10^8,sep="")

econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="H"] <- paste(10^2,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="K"] <- paste(10^3,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="M"] <- paste(10^6,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="B"] <- paste(10^9,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="2"] <- paste(10^2,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="3"] <- paste(10^3,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="4"] <- paste(10^4,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="5"] <- paste(10^5,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="6"] <- paste(10^6,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="7"] <- paste(10^7,sep="")
econ_data$CROPDMGEXP[econ_data$CROPDMGEXP=="8"] <- paste(10^8,sep="")

econ_data$PROPDMGEXP <- as.numeric(econ_data$PROPDMGEXP)
## Warning: NAs introduced by coercion
econ_data$PROPDMGEXP[is.na(econ_data$PROPDMGEXP)] <- 0
econ_data$PROPDMGEXP[is.infinite(econ_data$PROPDMGEXP)] <- 0
econ_data$CROPDMGEXP <- as.numeric(econ_data$CROPDMGEXP)
## Warning: NAs introduced by coercion
econ_data$CROPDMGEXP[is.na(econ_data$CROPDMGEXP)] <- 0
unique(econ_data$PROPDMGEXP)
## [1] 1e+03 1e+06 1e+09 0e+00 1e+05 1e+04 1e+02 1e+07
unique(econ_data$CROPDMGEXP)
## [1] 0e+00 1e+06 1e+03 1e+09
econ_data$PROPDMG <- econ_data$PROPDMG*econ_data$PROPDMGEXP
econ_data$CROPDMG <- econ_data$CROPDMG*econ_data$CROPDMGEXP
econ_data <- econ_data[,c(2,3,5)]
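
The exponent handling above can be condensed into a single named lookup vector. This is a sketch under stated assumptions: `exp_lookup` and `to_multiplier` are hypothetical names, the damage values are made up, and every code outside the table (including a blank) is mapped to 0, similar to the way the NA values are zeroed above.

```r
# Named lookup: exponent code -> multiplier
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9,
                "2" = 1e2, "3" = 1e3, "4" = 1e4, "5" = 1e5,
                "6" = 1e6, "7" = 1e7, "8" = 1e8)

# Hypothetical helper: translate exponent codes into numeric multipliers.
to_multiplier <- function(code) {
  m <- unname(exp_lookup[toupper(code)])
  m[is.na(m)] <- 0  # codes missing from the table (including "") become 0
  m
}

# Toy values, for illustration only
dmg  <- c(25, 2.5, 1, 10)
code <- c("K", "m", "B", "")
usd  <- dmg * to_multiplier(code)
print(usd)
```

The same helper could then be applied to both PROPDMGEXP and CROPDMGEXP, which avoids repeating the assignments for each column.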

# Population health ranking
pop_health <- aggregate(. ~ EVTYPE, data = pop_data, sum)
pop_health$Total_Health_Damage <- pop_health$INJURIES + pop_health$FATALITIES
pop_health <- pop_health[order(pop_health$Total_Health_Damage,decreasing=TRUE),]
pop_health <- pop_health[1:10,]

# Economic damage ranking
econ_health <- aggregate(. ~ EVTYPE, data = econ_data, sum)
econ_health$Total_Economic_Damage <- econ_health$PROPDMG + econ_health$CROPDMG
econ_health <- econ_health[order(econ_health$Total_Economic_Damage,decreasing=TRUE),]
econ_health <- econ_health[1:10,]
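
To make the ranking step concrete, the sketch below runs the same aggregate, total, order and truncate sequence on a small made-up table. The data frame `toy` and its values are invented for illustration only.

```r
# Made-up records: several rows per event type
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  INJURIES   = c(15, 2, 6, 0, 1),
  FATALITIES = c(1, 0, 0, 1, 3),
  stringsAsFactors = FALSE
)

# Sum injuries and fatalities per event type
totals <- aggregate(cbind(INJURIES, FATALITIES) ~ EVTYPE, data = toy, FUN = sum)

# Combined indicator, then rank from worst to least harmful
totals$Total <- totals$INJURIES + totals$FATALITIES
ranked <- totals[order(totals$Total, decreasing = TRUE), ]

# head() keeps at most n rows and never pads with NA rows
top2 <- head(ranked, 2)
print(top2)
```

Using `head(x, n)` rather than `x[1:n, ]` is a small safeguard: if fewer than `n` event types exist, the result simply has fewer rows.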

Results

Plotting the top 10 worst weather events across the USA for population health and economic damage

library(ggplot2)
ggplot(data=pop_health, aes(x = EVTYPE,y = Total_Health_Damage))+geom_bar(stat="identity")+xlab("Weather Event")+ylab("Sum of all Injuries and Fatalities for all events from 1950 - Nov 2011")+ggtitle("Top 10 Weather Events (Population Health)")+coord_flip()

[Figure: Top 10 Weather Events (Population Health)]

ggplot(data=econ_health, aes(x = EVTYPE,y = Total_Economic_Damage))+geom_bar(stat="identity")+xlab("Weather Event")+ylab("Sum of Crop and Property Damage for all events from 1950 - Nov 2011 (USD)")+ggtitle("Top 10 Weather Events (Economic Damage)")+coord_flip()

[Figure: Top 10 Weather Events (Economic Damage)]

# collapse (not sep) joins the elements of a single vector into one string
worst_weather_events <- paste(intersect(pop_health$EVTYPE,econ_health$EVTYPE),collapse=", ")
top_pop <- paste(pop_health$EVTYPE,collapse=", ")
top_econ <- paste(econ_health$EVTYPE,collapse=", ")

The top 10 worst weather events for population health (in decreasing order) are TORNADO, THUNDERSTORM WIND, EXCESSIVE HEAT, FLOOD, LIGHTNING, HEAT, FLASH FLOOD, ICE STORM, HIGH WIND, WILDFIRE. The top 10 worst weather events for economic damage (in decreasing order) are FLOOD, HURRICANE/TYPHOON, TORNADO, STORM SURGE/TIDE, FLASH FLOOD, HAIL, DROUGHT, THUNDERSTORM WIND, ICE STORM, WILDFIRE. The weather events in both top 10 lists (in no particular order) are TORNADO, THUNDERSTORM WIND, FLOOD, FLASH FLOOD, ICE STORM, WILDFIRE.

Future Improvements

  1. Use the event date to adjust damages for inflation
  2. Perform exploratory data analysis to remove outliers and anomalous data
  3. Find a better method to identify similar weather event types and group them together
  4. Use the mean instead of the sum
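
As a rough illustration of the last point, the sketch below shows how the ranking can change when the mean damage per recorded event is used instead of the total. It assumes that "mean" refers to the mean per recorded event; the data frame `toy` and its values are invented and not drawn from the NOAA data.

```r
# Made-up damage records: many small hail events, one large hurricane
toy <- data.frame(
  EVTYPE = c("HAIL", "HAIL", "HAIL", "HAIL", "HURRICANE"),
  DAMAGE = c(10, 20, 30, 40, 90),
  stringsAsFactors = FALSE
)

by_sum  <- aggregate(DAMAGE ~ EVTYPE, data = toy, FUN = sum)
by_mean <- aggregate(DAMAGE ~ EVTYPE, data = toy, FUN = mean)

# Event type ranked worst under each indicator
worst_by_sum  <- by_sum$EVTYPE[which.max(by_sum$DAMAGE)]
worst_by_mean <- by_mean$EVTYPE[which.max(by_mean$DAMAGE)]
print(c(sum = worst_by_sum, mean = worst_by_mean))
```

The sum favours frequent event types, while the mean favours rare but severe ones, so the two indicators answer slightly different questions.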