Synopsis

Severe weather events have significant impacts on population health and cause substantial economic damage.
Through analysis of the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, we found that tornadoes have the greatest impact on population health in terms of total fatalities and injuries, while floods have the greatest economic impact in terms of total property and crop damage. In terms of population health, tornadoes are responsible for 5633 fatalities and 91346 injuries, followed by excessive heat, which caused 1903 fatalities and 6525 injuries. In terms of economic impact, floods caused 144.6 billion dollars of property damage and 5.7 billion dollars of crop damage. Hurricanes/typhoons ranked second, with 69.3 billion dollars in property damage and 2.6 billion dollars in crop damage. Drought caused the most crop damage: 14 billion dollars.

Data Processing

The compressed data file is downloaded, decompressed, and read into R.

url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(url = url, destfile = "StormData.csv.bz2",method="curl")
download_time <- date()
#Decompress the bz2 file by running a Linux command from within R
system('bzip2 -d StormData.csv.bz2')
storm<-read.table("StormData.csv",sep=",",header=TRUE,na.strings = "NA")
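The `system()` call above depends on a Linux `bzip2` binary. As a possible alternative, base R can read a bzip2-compressed CSV directly, because `read.csv()` opens the path through `file()`, which detects the compression. The sketch below demonstrates this on a made-up two-row data frame written to a temporary file; the names `toy`, `bz_path`, and `toy_back` are illustrative and not part of the original analysis.

```r
# Sketch: read a bzip2-compressed CSV without a separate decompression step.
# The tiny data frame and temp file below are made up for illustration.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(2, 1))
bz_path <- tempfile(fileext = ".csv.bz2")

# Write the toy data through a bzip2 connection.
con <- bzfile(bz_path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects the compression.
toy_back <- read.csv(bz_path, stringsAsFactors = FALSE)
```

Under the same assumption, the downloaded file could be read with `read.csv("StormData.csv.bz2")`, skipping the decompression step. Note that `read.csv()` uses different quoting and comment defaults than `read.table()`, so the parsed result should be checked against the original.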

Results

The impact of severe weather events is assessed separately in terms of population health and economic damage.

Population health

This section shows the fatalities and injuries caused by severe weather events. In the bar plot, from left to right, weather event types are ranked by the combined number of fatalities and injuries. It is clear that tornadoes and excessive heat are the top two causes of fatalities and injuries.

library(dplyr)
#summarise by event type, calculate total fatalities and injuries
fatality_injury<-summarise(group_by(storm,EVTYPE),fatality=sum(FATALITIES),injury=sum(INJURIES),total=fatality+injury)
library(ggplot2)
#sort by combined total fatalities and injuries
fatality_injury<-arrange(fatality_injury,desc(total))
#Plot top 10 event types
top10<-fatality_injury[1:10,]
library(reshape2)
#Order event types by descending order of total fatalities/injuries
top10$EVTYPE<-factor(top10$EVTYPE,levels=top10$EVTYPE[order(-top10$total)],ordered=T)
top10<-top10[,-ncol(top10)]
top10<-melt(top10,id.var="EVTYPE")
ggplot(data=top10,aes(x=EVTYPE,y=value,fill=variable)) +
  facet_grid(variable ~., scales="free_y") +
  geom_bar(stat="identity") +
  theme(axis.text.x = element_text(angle = -10)) +
  ylab("Number of fatalities/injuries") + xlab("Event type") +
  geom_text(aes(label=value),size=4)

Figure 1 Top 10 weather events sorted by combined fatalities and injuries. Actual numbers are labeled above each bar.
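To make the aggregation behind Figure 1 concrete, here is a minimal base-R sketch of the same sum-then-rank logic on a made-up data frame. The values are invented for illustration and are not taken from the NOAA data; `aggregate()` stands in for the dplyr calls used in the analysis.

```r
# Made-up records for illustration only; these are not NOAA values.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD"),
  FATALITIES = c(3, 5, 1, 2),
  INJURIES   = c(10, 0, 4, 1)
)

# Sum fatalities and injuries within each event type.
by_event <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Combined casualty count, then rank event types from largest to smallest.
by_event$total <- by_event$FATALITIES + by_event$INJURIES
by_event <- by_event[order(-by_event$total), ]
```

In this toy case the two TORNADO rows collapse into one row with a combined total of 18, which places it first in the ranking.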

Economic damage

This section shows the property and crop damage caused by severe weather events. The analysis follows the same steps as in the Population health section. As damage values are reported in different units (e.g., thousand, million or billion dollars), they are all converted to billions of dollars for easy comparison.
In the bar plot, from left to right, weather event types are ranked by the combined value of property and crop damage. It is clear that flood, hurricane/typhoon and storm surge are the top three causes of combined economic damage as well as of property damage alone. On the other hand, drought is the top cause of crop damage.

#Convert the different value units ("K","M") to "G" (billions), i.e., multiply by 1e-6 for K and 1e-3 for M.
#Empty units ("") are converted to 0, as the corresponding value in CROPDMG/PROPDMG is also 0, so the unit does not matter.
#Assume M and m both mean million, and H and h both mean hundred
convert_unit<-data.frame(symbol=c("","K","M","B","m","h","H","0"),factor=c(0,1e-6,1e-3,1,1e-3,1e-7,1e-7,0))
ind<-match(storm$PROPDMGEXP,convert_unit$symbol)
storm$prop_value_factor<-convert_unit$factor[ind]
ind<-match(storm$CROPDMGEXP,convert_unit$symbol)
storm$crop_value_factor<-convert_unit$factor[ind]
damage<-summarise(group_by(storm,EVTYPE),property=sum(PROPDMG*prop_value_factor),crop=sum(CROPDMG*crop_value_factor),total=property+crop)
max_property<-which.max(damage$property)
max_crop<-which.max(damage$crop)
damage<-arrange(damage,desc(total))
top10<-damage[1:10,]
top10$EVTYPE<-factor(top10$EVTYPE,levels=top10$EVTYPE[order(-top10$total)],ordered=T)
top10<-top10[,-ncol(top10)]
top10<-melt(top10,id.var="EVTYPE")
ggplot(data=top10,aes(x=EVTYPE,y=value,fill=variable)) +
  facet_grid(variable ~., scales="free_y") +
  geom_bar(stat="identity") +
  theme(axis.text.x = element_text(angle = -10)) +
  ylab("Value of damages (in billion dollars)") + xlab("Event type") +
  geom_text(aes(label=round(value,digits=2)),size=4)

Figure 2 Top 10 weather events sorted by combined property and crop damages. Actual damage values (in billions) are labeled above each bar.
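One detail of the unit conversion is worth checking: `match()` returns `NA` for any unit symbol that is absent from `convert_unit`, and `sum()` without `na.rm = TRUE` then returns `NA` for the whole event type. This report does not show whether the storm data contain such symbols, so the following is only a sketch on made-up values. The names `to_billions`, `mult`, and `value` are illustrative, and `"?"` stands in for a hypothetical unrecognised symbol.

```r
# Lookup from exponent symbol to a multiplier that yields billions of dollars.
to_billions <- c(K = 1e-6, M = 1e-3, B = 1)

# Made-up damage records; "?" stands in for a symbol missing from the lookup.
toy <- data.frame(
  PROPDMG    = c(250, 1.5, 2, 10),
  PROPDMGEXP = c("K", "M", "B", "?"),
  stringsAsFactors = FALSE
)

# match() returns NA for symbols that are not in the lookup table.
mult <- to_billions[match(toy$PROPDMGEXP, names(to_billions))]
value <- toy$PROPDMG * mult

sum(value)                # NA: one unmatched row makes the whole sum missing
sum(value, na.rm = TRUE)  # sums only the rows whose unit was recognised
```

A quick check such as `table(storm$PROPDMGEXP, useNA = "ifany")` would show which symbols actually occur before deciding how to treat unmatched ones.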