Synopsis: This document examines which types of storm events are most harmful to population health. Harm is measured in two ways: injuries and fatalities. For this analysis, we determined the maximum number of fatalities and injuries in the dataset, then took a subset of the storm data containing every event with more than 100 fatalities or more than 500 injuries. A similar subset (property damage above 4500 or crop damage above 900) was taken to look at economic impact. We then plotted these subsets; the plots are presented below.

##Data Processing

The code in this section subsets the data and then plots both injuries and fatalities:

1. Summarized the maximum fatalities and injuries.
2. Using the maximum and minimum values, took a subset of the data where injuries > 500 or fatalities > 100.
3. Plotted the data.
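The first two steps can be sketched on a small toy data frame before running them on the full dataset. The values here are made up for illustration and the names `toy` and `severe` are hypothetical; only the column names and thresholds come from the analysis itself.

```r
# Toy stand-in for the storm data; the numbers are invented for illustration.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "FLOOD", "HAIL"),
  FATALITIES = c(150, 200, 2, 0),
  INJURIES   = c(900, 0, 600, 10)
)

# Step 1: find the maxima, which guide the choice of thresholds.
max_fatalities <- max(toy$FATALITIES)
max_injuries   <- max(toy$INJURIES)

# Step 2: keep rows that exceed either threshold (logical OR).
severe <- subset(toy, FATALITIES > 100 | INJURIES > 500)
severe
```

Because the condition uses `|`, an event qualifies on fatalities alone or on injuries alone, so the FLOOD row is kept even though it has few fatalities.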

storm <- read.csv("./data/StormData.csv") ##reading in the data
str(storm)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
summary(storm$FATALITIES) ##identify the max/min number of fatalities 583 is max
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   0.0000   0.0000   0.0000   0.0168   0.0000 583.0000
summary(storm$INJURIES) ##1700 max injuries
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
##    0.0000    0.0000    0.0000    0.1557    0.0000 1700.0000
summary(storm$PROPDMG)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.00    0.00   12.06    0.50 5000.00
summary(storm$CROPDMG)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   1.527   0.000 990.000
##Taking subset
storm1 <- subset(storm, FATALITIES > 100 | INJURIES > 500)

storm2 <- storm1[, c("EVTYPE", "INJURIES", "FATALITIES") ]

storm3 <- subset(storm, PROPDMG > 4500 | CROPDMG > 900)

storm4 <- storm3[, c("EVTYPE", "PROPDMG", "CROPDMG")]

##Results

After analyzing the data, we determined that tornadoes are the most harmful event type with respect to population health, followed by heat and then floods.
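The bar charts stack every qualifying event, so the height of each bar is the total for that event type within the subset. A minimal sketch of the same totaling in base R follows; `total_by_event` is a hypothetical helper and the toy values are made up, so the output illustrates the method rather than the real ranking.

```r
# Hypothetical helper: total one numeric column per event type,
# sorted from largest to smallest.
total_by_event <- function(df, value_col) {
  totals <- aggregate(df[[value_col]], by = list(EVTYPE = df$EVTYPE), FUN = sum)
  names(totals)[2] <- value_col
  totals[order(-totals[[value_col]]), ]
}

# Toy subset with invented values; two tornado rows are summed together.
toy <- data.frame(
  EVTYPE   = c("TORNADO", "TORNADO", "HEAT", "FLOOD"),
  INJURIES = c(900, 700, 550, 600)
)

ranked <- total_by_event(toy, "INJURIES")
ranked
```

On the real data, the same helper could be applied to `storm2` with either `"INJURIES"` or `"FATALITIES"` to read the ranking off as a table.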

Regarding impacts on the economy (i.e. crop and property damage), we determined that the most disruptive storm events were floods and flash floods. Wind and tropical-storm-like weather were still significant, but about half as impactful. The figure below highlights this.
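The damage comparison uses the raw `PROPDMG` and `CROPDMG` values. The dataset also has `PROPDMGEXP` and `CROPDMGEXP` columns, which hold codes such as "K". The sketch below shows how the two could be combined, assuming that K, M and B stand for thousands, millions and billions and that any other code means no scaling. That mapping is an assumption to check against the dataset documentation, `exp_multiplier` is a hypothetical helper, and the toy values are made up.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# ASSUMPTION: K = 1e3, M = 1e6, B = 1e9; anything else is treated as 1.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(code)]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows with invented values.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(5, 25, 500),
  PROPDMGEXP = c("B", "M", "K")
)

# Scale the raw damage figure by its magnitude code.
toy$PROPDMG_SCALED <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy
```

In this toy example the HAIL row has the largest raw `PROPDMG` but the smallest scaled value, which shows how the choice of column can change a ranking.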

storm2
library(ggplot2)
plot1 <- ggplot(storm1, aes(EVTYPE, INJURIES, fill=EVTYPE)) 
inj <- plot1 + geom_bar(stat = "identity") + 
   theme_bw() +
    theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust=1)) +
    xlab("Event Type")+
    ylab("Total Injuries")+
    ggtitle("Subset of Major Injuries by Event")
inj  ##shows a bar graph, where tornado has the highest injury total

library(ggplot2)
plot2 <- ggplot(storm1, aes(EVTYPE, FATALITIES, fill=EVTYPE)) 
fat <- plot2 + geom_bar(stat = "identity") + 
   theme_bw() +
    theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust=1)) +
    xlab("Event Type")+
    ylab("Total Fatalities")+
    ggtitle("Subset of Fatalities by Event")
fat  ##tornado also has the highest number of fatalities

library(ggplot2)
plot3 <- ggplot(storm3, aes(EVTYPE, CROPDMG, fill=EVTYPE)) 
cdmg <- plot3 + geom_bar(stat = "identity") + 
   theme_bw() +
    theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust=1)) +
    xlab("Event Type")+
    ylab("Damage")+
    guides(fill = "none")+
    ggtitle("Impact on Crops")
    ##cdmg is not printed here; it is drawn later as part of the combined figure

library(ggplot2)
plot4 <- ggplot(storm3, aes(EVTYPE, PROPDMG, fill=EVTYPE)) 
pdmg <- plot4 + geom_bar(stat = "identity") + 
   theme_bw() +
    theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust=1)) +
    xlab("Event Type")+
    ylab("Damage")+
    guides(fill = "none")+
    ggtitle("Impact on Property")

library(cowplot)
title <- ggdraw() + draw_label("Economic Impacts of Major Storm Events", fontface='bold')
plotmain <- plot_grid(cdmg, pdmg, ncol=2, labels="AUTO")
finalplot <- plot_grid(title, plotmain, nrow=2, rel_heights = c(.2, 1)) ##one relative height per row: title, then plots

finalplot