Each year, major storms and weather events in the United States cause human and economic damage. The U.S. National Oceanic and Atmospheric Administration (NOAA) compiles information about these events, including their human and economic impact. The database used for this analysis contains NOAA's storm data from 1950 to the end of November 2011.
In this document, human fatalities and injuries are combined into each event type's human health damage. Similarly, property and crop damage are grouped together into its economic damage. We find that the top ten event types cause 88% of the human damage and 91% of the economic damage. Tornadoes are the largest contributor, accounting for about 62% of the human damage and 30% of the economic damage.
We start by downloading the NOAA Storm database, which records major storms and weather events in the United States along with estimates of any fatalities, injuries, property damage, and crop damage.
# Download and unzip data file if not already present
if (!file.exists("StormData.csv")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "storm-data.bz2", method = "curl")
  # Install R.utils (needed for bunzip2) only if it is missing
  if (!requireNamespace("R.utils", quietly = TRUE)) {
    install.packages("R.utils", repos = "http://cran.us.r-project.org")
  }
  library(R.utils)
  bunzip2("storm-data.bz2", "StormData.csv")
}
We then read the database (a CSV file) into a data frame.
storm_data <- read.csv("StormData.csv")
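As an aside on the reading step, `read.csv` can also read a bzip2-compressed file directly, because it opens the path with `file()`, which detects the compression. The sketch below demonstrates this on a small temporary file. The file and its contents are made up for illustration and are not taken from the NOAA database.

```r
# Self-contained illustration on a temporary file; the column names and
# values are made up and are not taken from the NOAA database.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,2", "FLOOD,1"), con)
close(con)

# read.csv() opens the path with file(), which detects bzip2 compression,
# so no separate decompression step is needed here.
demo <- read.csv(tmp)
print(demo)

# Clean up the temporary file.
unlink(tmp)
```

Under this approach the `bunzip2` step would be optional, at the cost of decompressing the file on every read.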
We calculate the total human health damage by summing the fatalities and injuries for each event type and then adding the two sums together.
# Aggregate human issues by event type
human_data <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data=storm_data, sum)
# Add totals for human issues
human_data$TOTAL_HEALTH <- human_data$FATALITIES + human_data$INJURIES
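To make the aggregation step concrete, here is a minimal sketch of the same pattern on a toy data frame. The data frame `toy` and its values are invented for illustration; only the column names mirror the ones used above.

```r
# Toy data frame with made-up values (not from the NOAA database).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 0, 1, 3, 4),
  INJURIES   = c(10, 1, 5, 0, 2)
)

# One row per event type, with each column summed within the group.
toy_health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Combine the two sums into a single total per event type.
toy_health$TOTAL_HEALTH <- toy_health$FATALITIES + toy_health$INJURIES

# Sort by the combined total, largest first.
toy_health <- toy_health[order(-toy_health$TOTAL_HEALTH), ]
print(toy_health)
```

In this toy case TORNADO comes first with a total of 18 (3 fatalities plus 15 injuries), followed by HEAT with 6 and FLOOD with 4.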
We treat property and crop damage as the negative economic consequences of storm events.
The data for property damage and for crop damage are each split into two columns. One column contains a numeric value and the other contains a character code for a power-of-ten multiplier (K, M, or B). We combine the two columns by multiplying the numeric value by the multiplier that corresponds to the code, which gives the damage in US dollars. The multiplier function below maps a code to its multiplier.
# Takes a single exponent code and returns the matching numeric multiplier:
# K (1000), M (1e6), or B (1e9); any other code returns 1. To be used with the
# exponent columns PROPDMGEXP and CROPDMGEXP, along with the PROPDMG
# (property) and CROPDMG (crop) damage columns.
multiplier <- function(exp) {
  if (exp == "K") {
    return(1000)
  } else if (exp == "M") {
    return(1e6)
  } else if (exp == "B") {
    return(1e9)
  }
  return(1)
}
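One caveat when this function is applied to an entire column: R's `if` evaluates a single condition. Given a vector of codes, R versions before 4.2.0 use only the first element (with a warning), and R 4.2.0 and later signal an error, so totals computed this way over a whole column may not reflect each row's own exponent. The sketch below shows one possible vectorized alternative. The name `multiplier_vec` and the upper-casing of codes are assumptions made for this illustration, not part of the original analysis.

```r
# Hypothetical vectorized counterpart to multiplier(); the name and the
# upper-casing step are assumptions made for this sketch.
multiplier_vec <- function(exp) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  # Index the lookup table by code; codes other than K, M, or B give NA.
  mult <- unname(lookup[toupper(as.character(exp))])
  # Fall back to 1 for anything unrecognized, as multiplier() does.
  mult[is.na(mult)] <- 1
  mult
}

# Made-up codes, including an empty string, a lowercase code, and a digit.
example_codes <- c("K", "M", "B", "", "k", "5")
example_mult <- multiplier_vec(example_codes)
print(example_mult)
```

With such a function, an expression like `storm_data$PROPDMG * multiplier_vec(storm_data$PROPDMGEXP)` would apply a separate multiplier to every row. Treating a lowercase code the same as its uppercase form is a design choice of this sketch and would need to be checked against the database documentation.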
We produce the total property damages by combining PROPDMG and PROPDMGEXP.
storm_data$TOTAL_PROPDMG <- storm_data$PROPDMG * multiplier(storm_data$PROPDMGEXP)
We produce the total crop damages by combining CROPDMG and CROPDMGEXP.
storm_data$TOTAL_CROPDMG <- storm_data$CROPDMG * multiplier(storm_data$CROPDMGEXP)
We then aggregate property and crop damage by event type.
economic_data <- aggregate(cbind(TOTAL_PROPDMG, TOTAL_CROPDMG) ~ EVTYPE, data=storm_data, sum)
We then add the two totals to obtain the overall economic damage per event type.
economic_data$TOTAL_ECONOMIC <- economic_data$TOTAL_PROPDMG + economic_data$TOTAL_CROPDMG
We extract the 10 event types with the highest human health damage.
top_health <- head(human_data[order(-human_data$TOTAL_HEALTH),], 10)
top_health
##                EVTYPE FATALITIES INJURIES TOTAL_HEALTH
## 834           TORNADO       5633    91346        96979
## 130    EXCESSIVE HEAT       1903     6525         8428
## 856         TSTM WIND        504     6957         7461
## 170             FLOOD        470     6789         7259
## 464         LIGHTNING        816     5230         6046
## 275              HEAT        937     2100         3037
## 153       FLASH FLOOD        978     1777         2755
## 427         ICE STORM         89     1975         2064
## 760 THUNDERSTORM WIND        133     1488         1621
## 972      WINTER STORM        206     1321         1527
These 10 events account for about 88% of human fatalities and injuries due to storm events.
sum(top_health$TOTAL_HEALTH) / sum(human_data$TOTAL_HEALTH)
## [1] 0.8811868
We plot these top 10 event types. The plot shows that tornadoes are by far the largest cause of human fatalities and injuries due to storm events.
par(mar=c(12, 7, 2, 0), mgp=c(5,2,0))
barplot(top_health$TOTAL_HEALTH, names.arg=top_health$EVTYPE, main="Fatalities and Injuries per Top 10 Event Types", ylab="Death and Injury (Number of people)", las=2)
Tornadoes account for about 62% of all deaths and injuries.
top_health$TOTAL_HEALTH[1] / sum(human_data$TOTAL_HEALTH)
## [1] 0.6229661
Similarly, for the economic damage, we extract the 10 event types with the highest combined property and crop damage.
top_economic <- head(economic_data[order(-economic_data$TOTAL_ECONOMIC),], 10)
top_economic
##                 EVTYPE TOTAL_PROPDMG TOTAL_CROPDMG TOTAL_ECONOMIC
## 834            TORNADO    3212258160     100018.52     3212358179
## 153        FLASH FLOOD    1420124590     179200.46     1420303790
## 856          TSTM WIND    1335965610     109202.60     1336074813
## 170              FLOOD     899938480     168037.88      900106518
## 760  THUNDERSTORM WIND     876844170      66791.45      876910961
## 244               HAIL     688693380     579596.28      689272976
## 464          LIGHTNING     603351780       3580.61      603355361
## 786 THUNDERSTORM WINDS     446293180      18684.93      446311865
## 359          HIGH WIND     324731560      17283.21      324748843
## 972       WINTER STORM     132720590       1978.99      132722569
These 10 events account for about 91% of economic damage due to storm events.
sum(top_economic$TOTAL_ECONOMIC) / sum(economic_data$TOTAL_ECONOMIC)
## [1] 0.9133086
We then plot these top 10 event types. Tornadoes are also the largest cause of economic damage due to storm events.
par(mar=c(12, 7, 2, 0), mgp=c(5,1,0))
barplot(top_economic$TOTAL_ECONOMIC, names.arg=top_economic$EVTYPE, main="Crop and Property Damage per Event Type", ylab="Damage (Dollars)", las=2)
Tornadoes account for about 30% of property and crop damage, roughly half of their 62% share of human fatalities and injuries.
top_economic$TOTAL_ECONOMIC[1] / sum(economic_data$TOTAL_ECONOMIC)
## [1] 0.2950941