Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. In this report, we explored the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. The events in the database start in 1950 and end in November 2011. Among the 985 types of storms and weather events in the database, tornadoes are the most harmful to population health and floods have the greatest economic consequences across the US.
First, we loaded the required packages and read the data from the compressed CSV file:
data = read.csv(bzfile("repdata-data-StormData.csv.bz2", "r"))
We then prepared the data for analysis. We wrote a function that computes damage in thousands of dollars from a value and its exponent code (PROPDMG with PROPDMGEXP, or CROPDMG with CROPDMGEXP), and used it to create two new variables, PROPTotal and CROPTotal.
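totalDMG = function(DMG, EXP) {
    # Convert a damage figure to thousands of dollars using its exponent code
    EXP = toupper(EXP)
    total = ifelse(EXP == "B", DMG * (10^6),    # billions
            ifelse(EXP == "M", DMG * (10^3),    # millions
            ifelse(EXP == "K", DMG,             # thousands
            ifelse(EXP == "H", DMG/10,          # hundreds
                   DMG/1000))))                 # any other code: plain dollars
    return(total)
}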
data$PROPTotal = totalDMG(data$PROPDMG, data$PROPDMGEXP)
data$CROPTotal = totalDMG(data$CROPDMG, data$CROPDMGEXP)
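To make the unit conversion concrete, the sketch below applies the same mapping for the codes B, M, K, and H, plus the fallback, to a few made-up rows. The values in `dmg` and `exp` are assumptions for illustration and do not come from the NOAA file, and the lookup table is only an alternative way of writing the nested `ifelse` calls, not the code used in the analysis.

```r
# Toy inputs (hypothetical values, not taken from the NOAA file)
dmg <- c(2.5, 10, 300, 5, 40)
exp <- c("B", "m", "K", "H", "")

# Multipliers that convert each exponent code to thousands of dollars
multiplier <- c(B = 1e6, M = 1e3, K = 1, H = 0.1)

# Look up the multiplier for each row; unknown codes come back as NA
m <- multiplier[toupper(exp)]
m[is.na(m)] <- 1e-3   # any other code is treated as plain dollars

toy_total <- unname(dmg * m)
print(toy_total)
```

For example, 2.5 with code "B" becomes 2,500,000 thousand dollars, while 40 with an empty code becomes 0.04 thousand dollars.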
Next, we computed the total fatalities, injuries, property damage, and crop damage for each event type.
library(lattice)
library(reshape)
result = tapply(data$FATALITIES, data$EVTYPE, sum)
result2 = tapply(data$INJURIES, data$EVTYPE, sum)
result3 = tapply(data$PROPTotal, data$EVTYPE, sum)
result4 = tapply(data$CROPTotal, data$EVTYPE, sum)
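As a small illustration of how `tapply` groups and sums, the sketch below runs the same call on a made-up data frame. The `toy` data frame and its values are hypothetical stand-ins for the storm data.

```r
# Hypothetical mini data set standing in for the storm data
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0)
)

# One sum per event type; the result is named by the grouping variable
toy_sums <- tapply(toy$FATALITIES, toy$EVTYPE, sum)
print(toy_sums)
```

Because the result is named by event type, those names can later serve as row names when the totals are combined into a data frame.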
We combined the four totals into a data frame and ranked the event types in two ways: by fatalities, with ties broken by injuries (health1), and by the combined property and crop damage (health2). In each case we kept the top 10 event types.
data1 = data.frame(FATALITIES = result, INJURIES = result2, PROPDMG = result3,
CROPDMG = result4)
data1 = data1[order(data1$FATALITIES, data1$INJURIES, decreasing = TRUE), ]
health1 = head(data1, 10)
health1$event = factor(rownames(health1), levels = rownames(health1))
meltedhealth1 = melt(health1, id = "event")
data2 = data1[order(data1$PROPDMG, data1$CROPDMG, decreasing = TRUE), ]
data2$DMG = data2$PROPDMG + data2$CROPDMG
data3 = data2[order(data2$DMG, decreasing = TRUE), ]
health2 = head(data3, 10)
health2$event = factor(rownames(health2), levels = rownames(health2))
meltedhealth2 = melt(health2, id = "event")
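The ranking steps above can be sketched on a made-up table. The `toy` values are hypothetical. The sketch shows how `order` breaks ties on the second key, and how building the factor with explicit `levels` keeps the ranked order instead of the default alphabetical one, which is what lets the bar charts display events from most to least severe.

```r
# Hypothetical totals per event type (not real NOAA figures)
toy <- data.frame(
  FATALITIES = c(5, 9, 5, 1),
  INJURIES = c(20, 3, 40, 0),
  row.names = c("FLOOD", "TORNADO", "HEAT", "HAIL")
)

# Rank by fatalities, breaking ties by injuries, largest first
ranked <- toy[order(toy$FATALITIES, toy$INJURIES, decreasing = TRUE), ]
top3 <- head(ranked, 3)

# Explicit levels preserve the ranked order for plotting
top3$event <- factor(rownames(top3), levels = rownames(top3))
print(levels(top3$event))
```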
barchart(value ~ event | variable, data = meltedhealth1[1:20, ], ylab = "",
main = "Top 10 storms and weather events causing death or injuries", scales = list(relation = "free",
x = list(rot = 45)))
As the figure shows, tornadoes are the most harmful to population health, followed by excessive heat and flash floods.
barchart(value/1000 ~ event | variable, data = meltedhealth2[21:40, ], ylab = "Million dollars",
main = "Top 10 weather events causing greatest economic consequences", scales = list(x = list(rot = 45)))
As the figure shows, floods have the greatest economic consequences, followed by hurricanes/typhoons and tornadoes.