This report analyzes which types of weather events across the United States are most harmful to population health, and which types have the greatest economic consequences.
We used the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
Our analysis shows that tornadoes are the most harmful to public health, having caused the highest total number of fatalities and injuries. From an economic perspective, floods and hurricanes/typhoons are the most damaging, with the highest total property damage.
To process the NOAA data, we download it from its source URL, save it locally, and read it into R for further analysis. This portion of the code is cached for efficiency.
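# Source URL and local file name for the compressed storm data
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
fileName <- "StormData.csv.bz2"
# Download the file only if it is not already present locally
if (!file.exists(fileName)) {
  download.file(url, destfile = fileName, method = "curl")
}
# read.csv() reads the bzip2-compressed file directly
data <- read.csv(fileName)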
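As a quick check of the loading step, the sketch below shows that read.csv() can read a bzip2-compressed CSV without decompressing it first. The toy data frame and the temporary file are illustrative assumptions and are not part of the NOAA data.

```r
# Toy data standing in for the storm file (hypothetical values).
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(2, 0))

# Write the toy data to a bzip2-compressed CSV in a temporary location.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and reads the file directly.
roundTrip <- read.csv(tmp)
str(roundTrip)
```

The same mechanism is what lets the report pass StormData.csv.bz2 straight to read.csv().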
First, we evaluated the weather events with the greatest impact on public health. We considered two primary factors: fatalities and injuries. For each event type, we computed both the total and the average number of fatalities and injuries.
The table below summarizes the top 10 weather events, sorted in descending order by total fatalities and then by total injuries. The averages are shown for context. Tornadoes are by far the most harmful to public health, having caused the highest total number of fatalities and injuries over time. Excessive heat ranks second by total fatalities.
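library(dplyr)
# Total and average fatalities and injuries for each event type
# (named healthSummary to avoid masking base R's summary())
healthSummary <- summarise(group_by(data, EVTYPE),
  TotalFatalities = sum(FATALITIES),
  TotalInjuries = sum(INJURIES),
  AvgFatalities = round(mean(FATALITIES), 3),
  AvgInjuries = round(mean(INJURIES), 3)
)
# Sort by total fatalities, breaking ties with the remaining columns
summaryTable <- arrange(healthSummary,
  desc(TotalFatalities),
  desc(TotalInjuries),
  desc(AvgFatalities),
  desc(AvgInjuries)
)
summaryTable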
## Source: local data frame [985 x 5]
##
##            EVTYPE TotalFatalities TotalInjuries AvgFatalities AvgInjuries
## 1         TORNADO            5633         91346         0.093       1.506
## 2  EXCESSIVE HEAT            1903          6525         1.134       3.889
## 3     FLASH FLOOD             978          1777         0.018       0.033
## 4            HEAT             937          2100         1.222       2.738
## 5       LIGHTNING             816          5230         0.052       0.332
## 6       TSTM WIND             504          6957         0.002       0.032
## 7           FLOOD             470          6789         0.019       0.268
## 8     RIP CURRENT             368           232         0.783       0.494
## 9       HIGH WIND             248          1137         0.012       0.056
## 10      AVALANCHE             224           170         0.580       0.440
## ..            ...             ...           ...           ...         ...
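The table reports both totals and averages because the two can rank events differently: tornadoes have the highest total fatalities but a low per-event average, while heat events have a much higher average. The sketch below reproduces that effect on a small made-up data frame (hypothetical values, not NOAA data), using base R's tapply() rather than the dplyr calls used in the report.

```r
# Toy event records (hypothetical values, not NOAA data).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "TORNADO", "TORNADO", "HEAT"),
  FATALITIES = c(3, 0, 1, 0, 2)
)

# Total and average fatalities per event type, using base R.
totals <- tapply(toy$FATALITIES, toy$EVTYPE, sum)
means  <- tapply(toy$FATALITIES, toy$EVTYPE, mean)

# TORNADO leads on the total (4 vs 2); HEAT leads on the per-event average (2 vs 1).
print(totals)
print(means)
```

Frequent events accumulate large totals even when each occurrence is rarely fatal, which is why sorting by totals favors tornadoes.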
Second, we evaluated the economic impact of weather events using the property damage field (PROPDMG) provided in the data.
A caveat is that the data uses a secondary field (PROPDMGEXP) to denote the magnitude of the property damage, where K, M, and B signify thousands, millions, and billions respectively. The R code below creates a field (PROPDMGNUM) that multiplies PROPDMG by the corresponding factor to give a single monetary value. We noticed some data collection issues: not all rows contained one of the specified damage expressions. Our analysis therefore includes only rows where a K, M (or m), or B expression was specified.
# Thousands
kData <- mutate(data[data$PROPDMGEXP == 'K', ], PROPDMGNUM = PROPDMG * 1000)
# Millions (matches both 'M' and 'm')
mData <- mutate(data[data$PROPDMGEXP %in% c('M', 'm'), ], PROPDMGNUM = PROPDMG * 1000000)
# Billions
bData <- mutate(data[data$PROPDMGEXP == 'B', ], PROPDMGNUM = PROPDMG * 1000000000)
# Combine the converted rows; rows with any other expression are dropped
demageData <- rbind(kData, mData, bData)
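The same conversion can also be sketched as a single lookup instead of three filtered copies. The version below is a minimal illustration on made-up rows (hypothetical values); the multiplier vector and the toy data frame are assumptions. Unlike the code above, it upper-cases the expression first, so it would also accept lowercase k and b.

```r
# Multipliers for the documented damage expressions (K, M, B).
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Toy records (hypothetical values); "" and "?" stand in for unspecified expressions.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10, 5),
  PROPDMGEXP = c("K", "m", "B", "", "?"),
  stringsAsFactors = FALSE
)

# Upper-case the expression, then look up its multiplier; unknown codes give NA.
toy$PROPDMGNUM <- toy$PROPDMG * unname(multiplier[toupper(toy$PROPDMGEXP)])

# Keep only rows with a recognized expression, as the analysis does.
toy <- toy[!is.na(toy$PROPDMGNUM), ]
print(toy)
```

For example, a PROPDMG of 2.5 with expression m becomes 2.5 * 1,000,000 = 2,500,000.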
The table below summarizes the top 10 weather events, sorted in descending order by total property damage. Floods and hurricanes/typhoons were the top two, while tornadoes ranked third even though they ranked first for health impact.
demageSummary <- summarise(group_by(demageData, EVTYPE), totalDemage = sum(PROPDMGNUM))
demageSummaryTable <- arrange(demageSummary, desc(totalDemage))
demageSummaryTable
## Source: local data frame [404 x 2]
##
##               EVTYPE  totalDemage
## 1              FLOOD 144657709800
## 2  HURRICANE/TYPHOON  69305840000
## 3            TORNADO  56937160480
## 4        STORM SURGE  43323536000
## 5        FLASH FLOOD  16140811510
## 6               HAIL  15732266720
## 7          HURRICANE  11868319010
## 8     TROPICAL STORM   7703890550
## 9       WINTER STORM   6688497250
## 10         HIGH WIND   5270046260
## ..               ...          ...
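Totals of this size are easier to compare in billions of dollars. As a small worked example, the sketch below rescales the top three totals copied from the table above; the variable names are illustrative.

```r
# Top three totals copied from the table above, in U.S. dollars.
totalDamage <- c(
  "FLOOD"             = 144657709800,
  "HURRICANE/TYPHOON" = 69305840000,
  "TORNADO"           = 56937160480
)

# Express each total in billions of dollars, rounded to one decimal place.
billions <- round(totalDamage / 1e9, 1)
print(billions)
```

On this scale, floods account for roughly 144.7 billion dollars of property damage, about twice the 69.3 billion attributed to hurricanes/typhoons.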