Synopsis

This document analyzes storm data from across the United States to determine which types of storm events have the most negative impact, both economically and on human health. We find that tornadoes are the most damaging on both counts. Flooding and severe winds also cause substantial economic damage and many injuries and fatalities. Heat causes many injuries and deaths but is responsible for comparatively little economic loss.

Data Processing

The storm data is provided to us as a bz2-compressed CSV file. Luckily, R’s read.csv function can read this format directly: it decompresses the file and loads it into a data frame.

data = read.csv("storm_data.csv.bz2")
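
As a quick illustration of this transparent decompression, the following sketch writes a tiny made-up table to a temporary bz2 file and reads it back with read.csv. The toy values, the temporary path, and the names toy and toy_back are assumptions for illustration only; they are not part of the storm data.

```r
# Toy data frame standing in for the storm data (values are made up)
toy = data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                 FATALITIES = c(3, 1))

# Write it as a bz2-compressed CSV in a temporary location
path = tempfile(fileext = ".csv.bz2")
con = bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv opens the file through a connection that detects the
# compression, so no explicit decompression step is needed
toy_back = read.csv(path)
str(toy_back)
```

Running str (or dim) on the real data frame in the same way is a cheap check that the columns used below (EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG) were loaded as expected.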

Results

We first analyze which types of storm events are most harmful with respect to population health. The following code sums the total numbers of fatalities and injuries by event type across the United States. We produce a side-by-side comparison of the two categories as bar graphs showing the leading 10 event types for each. Tornadoes are overwhelmingly the leading cause of both injuries and fatalities. Flooding (either flash or normal) ranks third in both categories, with the other variant also appearing in the top 10. Heat and wind also feature prominently in the top 10 of both categories.

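# Sum fatalities and injuries within each event type
sumByType = by(data[,c("FATALITIES","INJURIES")],data$EVTYPE,colSums)
# Bind the per-type sums into one data frame (one row per event type)
health = data.frame(do.call(rbind, sumByType))
# Rank event types by fatalities, keeping the type names as row names
fatalities = health[order(health$FATALITIES,decreasing=TRUE),]
f = as.matrix(fatalities[,"FATALITIES"])
row.names(f) = row.names(fatalities)
# Rank event types by injuries in the same way
injuries = health[order(health$INJURIES,decreasing=TRUE),]
i = as.matrix(injuries[,"INJURIES"])
row.names(i) = row.names(injuries)
# Two plots side by side, with a wide bottom margin for the rotated labels
par(mfrow=c(1,2))
par(mar=c(9, 3, 2, 2))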

barplot(f[1:10,], col="red", width=2, beside=FALSE, las=2, 
        ylim = c(0,6000), main="Fatalities from Storms")
barplot(i[1:10,], col="orange", width=2, beside=FALSE, las=2,
        main="Injuries from Storms")
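
The same per-type totals can also be computed with aggregate, which returns a data frame directly. The sketch below runs on a small made-up data frame so the result can be checked by hand; the values and the names toy, totals, and top_fatal are assumptions for illustration, not results from the storm data.

```r
# Toy events (made-up numbers) with the same column names as the storm data
toy = data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(5, 1, 2, 4, 0),
  INJURIES   = c(20, 3, 10, 1, 2)
)

# Sum both columns within each event type
totals = aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities and keep at most the top 10 rows
top_fatal = totals[order(totals$FATALITIES, decreasing = TRUE), ]
print(head(top_fatal, 10))
```

The by/do.call approach used above and this aggregate form should give the same sums; aggregate keeps the event type as an ordinary column rather than as row names, which some may find easier to work with.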

We now evaluate which types of weather events have the most negative economic consequences. We sum the total amounts of property damage and crop damage by event type across the United States. The top 10 event types by total damage are plotted as a stacked bar chart, with the two categories stacked in each bar so that their combined damage can be read off easily. As above, tornadoes are by far the worst, with winds and floods also making serious contributions. In contrast to the health results, heat does not contribute as heavily to economic losses.

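# Sum property and crop damage within each event type
sumByType = by(data[,c("PROPDMG","CROPDMG")],data$EVTYPE,colSums)
economic = data.frame(do.call(rbind, sumByType))
# Total damage per event type, used only for ranking
economic$TOTAL = economic$PROPDMG + economic$CROPDMG
economic = economic[order(economic$TOTAL,decreasing=TRUE),]
# Keep the two components as a matrix for the stacked bar plot
eco = as.matrix(economic[,c("PROPDMG","CROPDMG")])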

barplot(t(eco[1:10,])/1000, col=c("blue","green"), width=2, beside=FALSE,
        las=2, ylim=c(0,3500), cex.names = 0.8,
        main="Economic Losses from Storms\n (in thousands)")
legend("topright", legend = c("Property Damage", "Crop Damage"),
       fill=c("blue","green"), cex=0.55)
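
The sums above use the raw PROPDMG and CROPDMG values. If the data set also carries magnitude-code columns for these amounts (assumed here to be named PROPDMGEXP and CROPDMGEXP, with codes such as K, M, and B), the amounts could be scaled before summing. The sketch below shows the idea on made-up rows; the column name, the code-to-multiplier table, and all values are assumptions and should be checked against the data documentation before use.

```r
# Toy rows (made-up values); PROPDMGEXP is an ASSUMED magnitude-code column
toy = data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO"),
  PROPDMG    = c(25, 1.5, 500),
  PROPDMGEXP = c("K", "B", "K")
)

# Hypothetical lookup from magnitude code to multiplier
multiplier = c(K = 1e3, M = 1e6, B = 1e9)

# Scale each amount by its code; codes missing from the lookup yield NA
toy$PROP_USD = toy$PROPDMG * multiplier[toupper(toy$PROPDMGEXP)]

# Sum the scaled amounts within each event type
usd_by_type = tapply(toy$PROP_USD, toy$EVTYPE, sum)
print(sort(usd_by_type, decreasing = TRUE))
```

Scaling of this kind can change the ranking of event types, so it may be worth comparing the scaled and unscaled totals before drawing conclusions from either.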