Synopsis: This document addresses which types of storm events are most harmful to population health. Harm is broken down into two categories: injuries and fatalities. For this analysis, we determined the maximum number of fatalities and injuries in the dataset, then took a subset of the storm data containing every event with more than 100 fatalities or more than 500 injuries. We applied the same approach to economic impact, subsetting events with PROPDMG > 4500 or CROPDMG > 900. We then plotted these subsets; the plots are presented below.
## Data Processing
The code below subsets the data and then plots both injuries and fatalities:

1. Summarized FATALITIES and INJURIES to find their minimum and maximum values.
2. Using those ranges, took a subset of the data where injuries > 500 or fatalities > 100.
3. Plotted the data.
storm <- read.csv("./data/StormData.csv") ##reading in the data
str(storm)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
summary(storm$FATALITIES) ## identify the min/max number of fatalities; the max is 583
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.0168 0.0000 583.0000
summary(storm$INJURIES) ## the max is 1700 injuries
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.1557 0.0000 1700.0000
summary(storm$PROPDMG)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 0.00 0.00 12.06 0.50 5000.00
summary(storm$CROPDMG)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.000 0.000 1.527 0.000 990.000
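The summaries above report the maximum values but not which event produced them. A minimal sketch of looking up that row with `which.max()` follows; the `toy` data frame and its values are made up for illustration and are not taken from StormData.csv.

```r
# Toy rows for illustration only (made-up values, not from StormData.csv)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(12, 40, 3),
  stringsAsFactors = FALSE
)

# which.max() returns the index of the first row holding the maximum value
worst <- toy[which.max(toy$FATALITIES), c("EVTYPE", "FATALITIES")]
worst
```

On the real data, the same pattern would presumably be `storm[which.max(storm$FATALITIES), c("EVTYPE", "FATALITIES")]`.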
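## Taking subsets
## health: events with more than 100 fatalities or more than 500 injuries
storm1 <- subset(storm, FATALITIES > 100 | INJURIES > 500)
storm2 <- storm1[, c("EVTYPE", "INJURIES", "FATALITIES")] ## keep only the columns of interest
## economy: events with PROPDMG > 4500 or CROPDMG > 900
## note: these are raw values; the magnitude codes in PROPDMGEXP/CROPDMGEXP (e.g. "K") are not applied
storm3 <- subset(storm, PROPDMG > 4500 | CROPDMG > 900)
storm4 <- storm3[, c("EVTYPE", "PROPDMG", "CROPDMG")]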
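The damage subset compares PROPDMG and CROPDMG directly, while the `str()` output shows separate PROPDMGEXP and CROPDMGEXP columns holding magnitude codes such as "K". A minimal sketch of scaling damage to dollars before comparing events follows. It is not part of the original analysis: the helper `exp_multiplier` and the `toy` data frame are hypothetical, and it assumes K, M, and B stand for thousands, millions, and billions, with any other code treated as a multiplier of 1.

```r
# Toy rows for illustration only (made-up values, not from StormData.csv)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(5, 25, 100),
  PROPDMGEXP = c("B", "K", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map an exponent code to a numeric multiplier.
# Assumption: K = 1e3, M = 1e6, B = 1e9; any other code is treated as 1.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])
  mult[is.na(mult)] <- 1  # unmatched codes (including "") fall back to 1
  mult
}

# Scale the raw value by its multiplier, then rank events by scaled damage
toy$PROPDMG_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy[order(-toy$PROPDMG_USD), c("EVTYPE", "PROPDMG_USD")]
```

In this toy example the row with the smallest raw PROPDMG (5, code "B") becomes the largest once scaled, which is why a threshold on the raw column can rank events differently than a threshold on dollars.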
## Results
After analyzing the data, we determined that tornadoes are the most harmful with respect to population health, followed by heat, and then floods.
Regarding impacts on the economy (i.e., crop and property damage), we determined that the most disruptive storm events were floods and flash floods; wind and tropical-storm-like weather were still significant, but about half as impactful. The figure below highlights this.
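The rankings above come from a subset of extreme individual events. As a cross-check, one could also total the harm per event type across all rows. A minimal sketch using base R's `aggregate()` follows; the `toy` data frame and its values are made up for illustration, and this step is an assumption about how the comparison could be done, not something the original analysis ran.

```r
# Toy rows for illustration only (made-up values, not from StormData.csv)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(10, 30, 25, 5, 2),
  INJURIES   = c(200, 50, 300, 20, 10),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries within each event type
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank event types from most to fewest total fatalities
totals <- totals[order(-totals$FATALITIES), ]
totals
```

Here HEAT has the single deadliest row, yet TORNADO ranks first once rows are summed, so the per-event and per-type views can disagree.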
storm2
library(ggplot2)
plot1 <- ggplot(storm1, aes(EVTYPE, INJURIES, fill = EVTYPE))
inj <- plot1 + geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust = 1)) +
  xlab("Event Type") +
  ylab("Total Injuries") +
  ggtitle("Subset of Major Injuries by Event")
inj ## shows a bar graph in which tornado has the highest total injuries
plot2 <- ggplot(storm1, aes(EVTYPE, FATALITIES, fill = EVTYPE))
fat <- plot2 + geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust = 1)) +
  xlab("Event Type") +
  ylab("Total Fatalities") +
  ggtitle("Subset of Fatalities by Event")
fat ## tornado also has the highest total fatalities
plot3 <- ggplot(storm3, aes(EVTYPE, CROPDMG, fill = EVTYPE))
cdmg <- plot3 + geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust = 1)) +
  xlab("Event Type") +
  ylab("Damage") +
  guides(fill = "none") +
  ggtitle("Impact on Crops")
## cdmg is not printed here; it is combined with pdmg in the final figure
plot4 <- ggplot(storm3, aes(EVTYPE, PROPDMG, fill = EVTYPE))
pdmg <- plot4 + geom_bar(stat = "identity") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 40, vjust = 1, hjust = 1)) +
  xlab("Event Type") +
  ylab("Damage") +
  guides(fill = "none") +
  ggtitle("Impact on Property")
library(cowplot)
## shared title drawn above the two damage panels
title <- ggdraw() + draw_label("Economic Impacts of Major Storm Events", fontface = "bold")
plotmain <- plot_grid(cdmg, pdmg, ncol = 2, labels = "AUTO")
## two rows (title, panels), so rel_heights takes two values
finalplot <- plot_grid(title, plotmain, nrow = 2, rel_heights = c(.2, 1))
finalplot