This report focuses on the health and economic damage caused by storms across the entire United States.
Tornadoes are the most harmful storm event for population health (as measured by injuries and fatalities). Heat/excessive heat is the largest single cause of fatalities per year. Flooding and thunderstorms both make a major contribution to injury and fatality totals.
Floods and hurricanes cause the greatest economic damage. Tornadoes have also caused substantial property damage, accumulated over a longer recording period, and drought is the leading cause of crop damage.
Lastly, it is important to remember that these results vary greatly by locality. Any prevention/response planning should be based on a finer-grained analysis (e.g. by state) that takes local characteristics into account.
The data for this report come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. You can download the file from:
There is also some documentation of the database available. Here you will find how some of the variables are constructed/defined.
National Weather Service Storm Data Documentation
National Climatic Data Center Storm Events FAQ
The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
Key fields for this analysis: EVTYPE, BGN_DATE, STATE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP
library(knitr)
## Warning: package 'knitr' was built under R version 3.0.3
opts_chunk$set(autodep=TRUE)
dep_auto() # figure out dependencies automatically
opts_chunk$set(message = FALSE, warning = FALSE) # the knitr chunk option is 'warning', not 'warnings'
options("getSymbols.warning4.0"=FALSE)
library(plyr)
## Warning: package 'plyr' was built under R version 3.0.3
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.0.3
invisible(library(quantmod)) # For CPI data
## Loading required package: Defaults
## Loading required package: xts
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.0.3
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Loading required package: TTR
## Version 0.4-0 included new data defaults. See ?getSymbols.
The data processing starts from the .csv.bz2 data file which is downloaded by download_data.R (the date/time of download is also saved in a file).
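The download script itself is not shown in this report. A minimal sketch of what such a step might look like follows, assuming a hypothetical helper `download_once()` and a hypothetical `.downloaded` timestamp file next to the data file; the source URL is left as a parameter because it is not given here.

```r
# Hypothetical helper (not the author's download_data.R): fetch a file once
# and record the date/time of the download in a small text file beside it.
download_once <- function(url, dest, stamp = paste0(dest, ".downloaded")) {
  if (!file.exists(dest)) {
    dir.create(dirname(dest), showWarnings = FALSE, recursive = TRUE)
    # Binary mode keeps a compressed archive such as .csv.bz2 intact
    download.file(url, dest, mode = "wb", quiet = TRUE)
    writeLines(format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z"), stamp)
  }
  invisible(dest)
}
```

Skipping the download when the file already exists keeps the recorded timestamp tied to the copy of the data that was actually analysed.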
The following data preprocessing steps are performed:
1. Read the CSV with stringsAsFactors=FALSE to facilitate text operations on the data.
2. Remove unused fields to save memory.
3. Convert STATE to a factor.
4. Add a Date field and a numeric year field to each observation based on the BGN_DATE field.
5. Convert PROPDMGEXP and CROPDMGEXP to factors and use them to adjust the damage values.
6. Download CPI data and use that to calculate inflation adjusted values for property damage and crop damage. It is stated on page 11, section 2.7 of the documentation that damages are entered in actual dollar amounts.
7. Consolidate some of the event types.
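Step 5 depends on how each exponent code is read. As a sketch only, the mapping can be keyed by the code itself rather than by the position of a factor level, which avoids any dependence on how the levels happen to sort. The helper name `decode_exponent()` is hypothetical, and treating blank or symbolic codes as a multiplier of 1 mirrors the judgment call made in this report.

```r
# Letter codes and the power of ten each is assumed to stand for
exp_power <- c(H = 2, K = 3, M = 6, B = 9)

# Hypothetical helper: turn exponent codes into powers of ten.
# Single digits are used as-is; anything unrecognised ("", "-", "?", "+")
# becomes 0, i.e. a multiplier of 10^0 = 1.
decode_exponent <- function(code) {
  code <- toupper(code)
  digit <- grepl("^[0-9]$", code)
  power <- unname(exp_power[code])          # NA where the code is not a known letter
  power[digit] <- as.numeric(code[digit])
  power[is.na(power)] <- 0
  power
}

damage <- c(25, 2.5, 1, 3, 10)
codes  <- c("K", "m", "B", "", "2")
damage * 10^decode_exponent(codes)
# 25000, 2500000, 1e+09, 3, 1000
```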
The final step of event type consolidation could be an entire project in itself. There were 985 distinct event types in the data file (compared to 42 canonical types described on page 6, section 2.1.1 of the documentation) and a number of the types are combinations of others (e.g. TORNADOES, TSTM WIND, HAIL). I chose to focus on the top 20 event types for each category and calculate the proportion of the totals they explained to validate this approach. In addition, I computed the mean values for each event type to check for low frequency extremes (HURRICANE OPAL is an example). The consolidations chosen can be seen in the code below.
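One way to keep such consolidations readable is a lookup table that maps raw labels to the labels they are merged into. The sketch below assumes a hypothetical helper `consolidate_evtype()` and lists only a few of the merges, for illustration.

```r
# Names are raw labels, values are the consolidated labels
evtype_map <- c(
  "THUNDERSTORM WINDS" = "TSTM WIND",
  "THUNDERSTORM WIND"  = "TSTM WIND",
  "WILD/FOREST FIRE"   = "WILDFIRE",
  "HURRICANE OPAL"     = "HURRICANE/TYPHOON",
  "HURRICANE"          = "HURRICANE/TYPHOON"
)

# Hypothetical helper: replace labels found in the map, leave the rest alone
consolidate_evtype <- function(evtype, map = evtype_map) {
  hit <- evtype %in% names(map)
  evtype[hit] <- unname(map[evtype[hit]])
  evtype
}

raw <- c("THUNDERSTORM WIND", "HAIL", "HURRICANE", "WILD/FOREST FIRE")
consolidate_evtype(raw)
# "TSTM WIND" "HAIL" "HURRICANE/TYPHOON" "WILDFIRE"
```

Adding a merge then means adding one line to the table instead of another assignment statement.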
An important caveat about this data is that there was a major change to the data collected in 1993 (there are 19 years of data after and 43 years of data before this date). Prior to 1993 there were only three event types: TORNADO, HAIL, and TSTM WIND. This clearly biases the totals towards these event types (this can be seen in the graphs below) and must be accounted for in any assessment of relative damage by event type.
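One possible way to account for the unequal coverage is to compare averages per recorded year instead of raw totals. The sketch below uses made-up numbers and a hypothetical helper `per_year_rate()`; counting the distinct years in which a type appears is an assumed proxy for its years of coverage.

```r
# Made-up records for illustration only; not taken from the storm database
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "TORNADO", "TORNADO", "HEAT", "HEAT"),
  year       = c(1990, 1991, 1993, 1994, 1993, 1994),
  FATALITIES = c(10, 10, 10, 10, 15, 15),
  stringsAsFactors = FALSE
)

# Hypothetical helper: total for each event type divided by the number of
# distinct years in which that type was recorded
per_year_rate <- function(df, value = "FATALITIES") {
  vapply(split(df, df$EVTYPE),
         function(g) sum(g[[value]]) / length(unique(g$year)),
         numeric(1))
}

per_year_rate(toy)
# HEAT 15, TORNADO 10
```

In the toy data TORNADO has the larger total (40 against 30), yet HEAT has the higher yearly average, which is the kind of reversal the caveat warns about.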
The data set is large: roughly 400MB of data, with 902297 observations of 37 variables.
filePath <- "./data/repdata%2Fdata%2FStormData.csv.bz2"
stormData <- read.csv(filePath, stringsAsFactors=FALSE)
# Record raw structure to decide what to add/modify later
str(stormData)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
# Remove fields I am not using to conserve memory
stormData <- stormData[,c("EVTYPE", "BGN_DATE", "STATE", "FATALITIES", "INJURIES",
"PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]
# First garbage collect after field removal
gc()
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 493931 13.2 741108 19.8 667722 17.9
## Vcells 390124 3.0 786432 6.0 786061 6.0
# Support switching to more detailed output
debug <- FALSE
# Begin data processing
stormData$STATE <- as.factor(stormData$STATE)
# Convert PROPDMGEXP and CROPDMGEXP to factors and use them to adjust the damage values
# Requires some judgment/care!
stormData$PROPDMGEXP <- as.factor(stormData$PROPDMGEXP)
table(stormData$PROPDMGEXP)
##
## - ? + 0 1 2 3 4 5
## 465934 1 8 5 216 25 13 4 4 28
## 6 7 8 B h H K m M
## 4 5 1 40 1 6 424665 7 11330
propexp <- c(0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 2, 2, 3, 6, 6)
if (debug) cbind(levels(stormData$PROPDMGEXP), propexp)
stormData$PROPDMGADJ <- stormData$PROPDMG * 10^propexp[as.numeric(stormData$PROPDMGEXP)]
stormData$CROPDMGEXP <- as.factor(stormData$CROPDMGEXP)
table(stormData$CROPDMGEXP)
##
## ? 0 2 B k K m M
## 618413 7 19 1 9 21 281832 1 1994
cropexp <- c(0, 0, 0, 2, 9, 3, 3, 6, 6)
if (debug) cbind(levels(stormData$CROPDMGEXP), cropexp)
stormData$CROPDMGADJ <- stormData$CROPDMG * 10^cropexp[as.numeric(stormData$CROPDMGEXP)]
# Take special care with the B exponent: if reading it as billions is a misinterpretation, the error is major
if (debug) {
stormData[stormData$PROPDMGEXP == "B",]
stormData[stormData$CROPDMGEXP == "B",]
}
# Support dealing with dates (only used at year granularity)
stormData$date <- as.Date(stormData$BGN_DATE, format="%m/%d/%Y")
stormData$year <- as.numeric(format(stormData$date, "%Y"))
# Support inflation adjusted damage figures
getSymbols("CPIAUCSL", src='FRED') #Consumer Price Index for All Urban Consumers: All Items
## [1] "CPIAUCSL"
if (debug) tail(CPIAUCSL)
avg.cpi <- apply.yearly(CPIAUCSL, mean)
cf <- avg.cpi/as.numeric(avg.cpi['2014']) # using 2014 as the base year
stormData$PROPDMGADJ <- stormData$PROPDMGADJ / as.numeric(cf[stormData$year-1946]$CPIAUCSL)
stormData$CROPDMGADJ <- stormData$CROPDMGADJ / as.numeric(cf[stormData$year-1946]$CPIAUCSL)
# Should convert EVTYPE to the 42 canonical types first (985 to start)
# Notice TSTM WIND, THUNDERSTORM WINDS and THUNDERSTORM WIND
# Notice WILD/FOREST FIRE and WILDFIRE
# Notice different types of FLOOD
# Notice the presence of combination types like "TORNADOES, TSTM WIND, HAIL"
tab <- table(stormData$EVTYPE)
if (debug) tail(sort(tab), n=50)
if (debug) hist(log10(tab))
# Based on frequency information above and top 20 for different fields below perform
# the following consolidations of event types
stormData[stormData$EVTYPE == "THUNDERSTORM WINDS", "EVTYPE"] = "TSTM WIND"
stormData[stormData$EVTYPE == "THUNDERSTORM WIND", "EVTYPE"] = "TSTM WIND"
stormData[stormData$EVTYPE == "WILD/FOREST FIRE", "EVTYPE"] = "WILDFIRE"
stormData[stormData$EVTYPE == "MARINE THUNDERSTORM WIND", "EVTYPE"] = "MARINE TSTM WIND"
stormData[stormData$EVTYPE == "URBAN/SML STREAM FLD", "EVTYPE"] = "FLOOD"
stormData[stormData$EVTYPE == "URBAN FLOOD", "EVTYPE"] = "FLOOD"
stormData[stormData$EVTYPE == "HURRICANE OPAL", "EVTYPE"] = "HURRICANE/TYPHOON"
stormData[stormData$EVTYPE == "HURRICANE", "EVTYPE"] = "HURRICANE/TYPHOON"
stormData[stormData$EVTYPE == "RIP CURRENTS", "EVTYPE"] = "RIP CURRENT"
stormData[stormData$EVTYPE == "HEAVY RAIN/SEVERE WEATHER", "EVTYPE"] = "HEAVY RAIN"
# Compare post-consolidation
tab <- table(stormData$EVTYPE)
if (debug) tail(sort(tab), n=50)
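The inflation adjustment above selects CPI values by position (`year-1946`), which relies on the CPI series starting in 1947. As a hedged alternative, the factor can be looked up by year name. The CPI values below are rounded placeholders, not the actual CPIAUCSL averages, and `to_base_dollars()` is a hypothetical helper.

```r
# Placeholder yearly CPI averages, keyed by year (illustrative values only)
cpi_by_year <- c("1950" = 24, "1993" = 144, "2014" = 236)

# Hypothetical helper: express an amount from `year` in base-year dollars.
# Years missing from the table yield NA instead of a silently wrong value.
to_base_dollars <- function(amount, year, cpi = cpi_by_year, base = "2014") {
  ratio <- cpi[[base]] / cpi[as.character(year)]
  unname(amount * ratio)
}

to_base_dollars(c(100, 100), c(1950, 2014))
# about 983.33 and exactly 100
```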
From 1950 to 2011, Storm Events resulted in the following total fatalities, injuries, and inflation-adjusted property and crop damage (in 2014 dollars). Because property damage is roughly nine times larger than crop damage, the economic analysis here focuses on it. In some communities a greater focus on crop damage might be appropriate.
totals <- colSums(stormData[,c("FATALITIES", "INJURIES", "PROPDMGADJ", "CROPDMGADJ")])
totals
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## 1.514e+04 1.405e+05 6.239e+11 6.784e+10
Given the total human and economic impact of Storm Events, preparation and response are clearly important. This analysis focuses on identifying which types of events are most harmful to population health and which have the greatest economic consequences.
Top 20 event types for each category and the proportion of the totals they represent:
results <- ddply(stormData[,c("EVTYPE", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE), numcolwise(sum))
row.names(results) <- results$EVTYPE
cols <- 2:5
tail(results[order(results$FATALITIES),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## HEAVY RAIN 98 251 4.878e+09 9.658e+08
## BLIZZARD 101 805 9.565e+08 1.815e+08
## HIGH SURF 101 152 1.006e+08 0.000e+00
## STRONG WIND 103 280 2.094e+08 7.782e+07
## EXTREME COLD/WIND CHILL 125 24 9.401e+06 6.399e+04
## HURRICANE/TYPHOON 126 1322 1.057e+11 7.030e+09
## HEAVY SNOW 127 1021 1.333e+09 2.052e+08
## EXTREME COLD 160 231 1.052e+08 1.858e+09
## HEAT WAVE 172 309 1.661e+07 8.792e+06
## WINTER STORM 206 1321 1.020e+10 4.144e+07
## AVALANCHE 224 170 4.403e+06 0.000e+00
## HIGH WIND 248 1137 6.475e+09 8.161e+08
## FLOOD 498 6868 1.712e+11 7.124e+09
## RIP CURRENT 572 529 2.147e+05 0.000e+00
## TSTM WIND 701 9353 1.304e+10 1.529e+09
## LIGHTNING 816 5230 1.213e+09 1.739e+07
## HEAT 937 2100 2.111e+06 6.543e+08
## FLASH FLOOD 978 1777 2.142e+10 1.775e+09
## EXCESSIVE HEAT 1903 6525 9.854e+06 5.753e+08
## TORNADO 5633 91346 1.626e+11 5.526e+08
tail(results[order(results$INJURIES),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## DENSE FOG 18 342 1.252e+07 0.000e+00
## WINTER WEATHER 33 398 2.297e+07 1.620e+07
## DUST STORM 22 440 6.802e+06 3.593e+06
## RIP CURRENT 572 529 2.147e+05 0.000e+00
## FOG 62 734 1.825e+07 0.000e+00
## BLIZZARD 101 805 9.565e+08 1.815e+08
## HEAVY SNOW 127 1021 1.333e+09 2.052e+08
## HIGH WIND 248 1137 6.475e+09 8.161e+08
## WINTER STORM 206 1321 1.020e+10 4.144e+07
## HURRICANE/TYPHOON 126 1322 1.057e+11 7.030e+09
## HAIL 15 1361 1.971e+10 4.011e+09
## WILDFIRE 87 1456 9.821e+09 5.004e+08
## FLASH FLOOD 978 1777 2.142e+10 1.775e+09
## ICE STORM 89 1975 5.225e+09 7.975e+09
## HEAT 937 2100 2.111e+06 6.543e+08
## LIGHTNING 816 5230 1.213e+09 1.739e+07
## EXCESSIVE HEAT 1903 6525 9.854e+06 5.753e+08
## FLOOD 498 6868 1.712e+11 7.124e+09
## TSTM WIND 701 9353 1.304e+10 1.529e+09
## TORNADO 5633 91346 1.626e+11 5.526e+08
tail(results[order(results$PROPDMGADJ),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## LIGHTNING 816 5230 1.213e+09 1.739e+07
## HEAVY SNOW 127 1021 1.333e+09 2.052e+08
## DROUGHT 0 4 1.361e+09 1.865e+10
## SEVERE THUNDERSTORM 0 0 1.863e+09 3.260e+05
## TORNADOES, TSTM WIND, HAIL 25 0 2.608e+09 4.075e+06
## HEAVY RAIN 98 251 4.878e+09 9.658e+08
## STORM SURGE/TIDE 11 5 5.076e+09 9.299e+05
## ICE STORM 89 1975 5.225e+09 7.975e+09
## HIGH WIND 248 1137 6.475e+09 8.161e+08
## RIVER FLOOD 2 2 8.337e+09 8.197e+09
## WILDFIRE 87 1456 9.821e+09 5.004e+08
## TROPICAL STORM 58 340 1.013e+10 8.678e+08
## WINTER STORM 206 1321 1.020e+10 4.144e+07
## TSTM WIND 701 9353 1.304e+10 1.529e+09
## HAIL 15 1361 1.971e+10 4.011e+09
## FLASH FLOOD 978 1777 2.142e+10 1.775e+09
## STORM SURGE 13 38 5.232e+10 7.507e+03
## HURRICANE/TYPHOON 126 1322 1.057e+11 7.030e+09
## TORNADO 5633 91346 1.626e+11 5.526e+08
## FLOOD 498 6868 1.712e+11 7.124e+09
tail(results[order(results$CROPDMGADJ),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## EXCESSIVE WETNESS 0 0 0.000e+00 2.315e+08
## DAMAGING FREEZE 0 0 1.201e+07 4.164e+08
## WILDFIRE 87 1456 9.821e+09 5.004e+08
## TORNADO 5633 91346 1.626e+11 5.526e+08
## EXCESSIVE HEAT 1903 6525 9.854e+06 5.753e+08
## HEAT 937 2100 2.111e+06 6.543e+08
## FREEZE 1 0 2.909e+05 6.822e+08
## HIGH WIND 248 1137 6.475e+09 8.161e+08
## TROPICAL STORM 58 340 1.013e+10 8.678e+08
## HEAVY RAIN 98 251 4.878e+09 9.658e+08
## FROST/FREEZE 0 0 1.009e+07 1.226e+09
## TSTM WIND 701 9353 1.304e+10 1.529e+09
## FLASH FLOOD 978 1777 2.142e+10 1.775e+09
## EXTREME COLD 160 231 1.052e+08 1.858e+09
## HAIL 15 1361 1.971e+10 4.011e+09
## HURRICANE/TYPHOON 126 1322 1.057e+11 7.030e+09
## FLOOD 498 6868 1.712e+11 7.124e+09
## ICE STORM 89 1975 5.225e+09 7.975e+09
## RIVER FLOOD 2 2 8.337e+09 8.197e+09
## DROUGHT 0 4 1.361e+09 1.865e+10
sum(tail(results[order(results$FATALITIES),cols], n=20)$FATALITIES) / totals[1]
## FATALITIES
## 0.9131
sum(tail(results[order(results$INJURIES),cols], n=20)$INJURIES) / totals[2]
## INJURIES
## 0.9681
sum(tail(results[order(results$PROPDMGADJ),cols], n=20)$PROPDMGADJ) / totals[3]
## PROPDMGADJ
## 0.9849
sum(tail(results[order(results$CROPDMGADJ),cols], n=20)$CROPDMGADJ) / totals[4]
## CROPDMGADJ
## 0.9676
# Focus on top N event types
ntypes <- 7
The top 20 event types for each category represent over 95% of the totals for injuries, property damage and crop damage. The top 20 event types for fatalities represent about 91% of total fatalities.
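The proportion calculation repeated above for each category can be wrapped in a small function. This is a sketch with made-up totals; `top_share()` is a hypothetical name.

```r
# Hypothetical helper: share of the overall total held by the n largest values
top_share <- function(values, n) {
  values <- sort(values, decreasing = TRUE)
  sum(head(values, n)) / sum(values)
}

by_type <- c(TORNADO = 50, HEAT = 30, FLOOD = 15, HAIL = 5)  # made-up totals
top_share(by_type, 2)
# 0.8
```

With the report's data, a call such as `top_share(results$FATALITIES, 20)` would be expected to give the same proportion as the longer expression, since `results` covers every event type.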
Proportion of totals represented by top 7 event types for each category:
sum(tail(results[order(results$FATALITIES),cols], n=ntypes)$FATALITIES) / totals[1]
## FATALITIES
## 0.762
sum(tail(results[order(results$INJURIES),cols], n=ntypes)$INJURIES) / totals[2]
## INJURIES
## 0.8781
sum(tail(results[order(results$PROPDMGADJ),cols], n=ntypes)$PROPDMGADJ) / totals[3]
## PROPDMGADJ
## 0.8751
sum(tail(results[order(results$CROPDMGADJ),cols], n=ntypes)$CROPDMGADJ) / totals[4]
## CROPDMGADJ
## 0.8085
Based on this, the graphics focus on the top 7 event types for each category. This is least justified for fatalities and crop damage, and in general I would recommend consulting the top 20 event type lists for planning purposes. The lists of the top 20 individual events for each category in the Supporting Analysis section provide a useful alternative view (the totals often appear to be driven by a small number of extreme events).
topNfatalities <- ddply(stormData[stormData$EVTYPE %in%
tail(results[order(results$FATALITIES),1], n=ntypes),
c("EVTYPE", "year", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE, year), numcolwise(sum))
qplot(year, FATALITIES, data=topNfatalities, geom="line", color=EVTYPE) +
ggtitle("Fatalities by Year and Storm Event Type") +
xlab("Year") +
ylab("Fatalities")
Here we see the 1993 change mentioned above. Historically, tornadoes have caused the largest number of (documented) fatalities, but there appears to have been a significant improvement from 1975 until 2010. Tornadoes and (excessive) heat are the primary issues here and are notable for how irregularly they vary from year to year.
topNinjuries <- ddply(stormData[stormData$EVTYPE %in%
tail(results[order(results$INJURIES),1], n=ntypes),
c("EVTYPE", "year", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE, year), numcolwise(sum))
qplot(year, INJURIES, data=topNinjuries, geom="line", color=EVTYPE) +
ggtitle("Injuries by Year and Storm Event Type") +
xlab("Year") +
ylab("Injuries")
Tornadoes clearly cause the greatest number of injuries. A single year (1998) accounts for most of the FLOOD injuries. The TSTM WIND total is somewhat overstated relative to the event types first recorded in 1993 because its data cover more years.
topNpropdmgadj <- ddply(stormData[stormData$EVTYPE %in%
tail(results[order(results$PROPDMGADJ),1], n=ntypes),
c("EVTYPE", "year", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE, year), numcolwise(sum))
qplot(year, PROPDMGADJ, data=topNpropdmgadj, geom="line", color=EVTYPE, log="y") +
ggtitle("Property Damage by Year and Storm Event Type") +
xlab("Year") +
ylab("Property Damage (2014 $ on log scale)")
Floods and hurricanes dominate the property damage results, to such an extent that the y axis is log scaled; keep this in mind when comparing the lines. Tornadoes have also made a significant contribution over the years.
A similar plot for crop damage is omitted given our three plot limit.
if (debug) {
topNcropdmgadj <- ddply(stormData[stormData$EVTYPE %in%
tail(results[order(results$CROPDMGADJ),1], n=ntypes),
c("EVTYPE", "year", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE, year), numcolwise(sum))
qplot(year, CROPDMGADJ, data=topNcropdmgadj, geom="line", color=EVTYPE) +
ggtitle("Crop Damage by Year and Storm Event Type") +
xlab("Year") +
ylab("Crop Damage (2014 $)")
}
Drought causes the most crop damage, but flood, ice/hail, and hurricanes all make a large contribution.
The formal part of this assignment ends here. Following are additional analyses used to choose what to present above and/or providing additional perspective on the storm data.
First examine the change in EVTYPE data in 1993.
# Compare before and after 1992 since many of the series appear to start at 1993
tab1 <- table(stormData$EVTYPE[stormData$year <= 1992])
sort(tab1)
##
## TORNADO HAIL TSTM WIND
## 34764 61832 90963
# All we have before 1993 are: TORNADO, HAIL, TSTM WIND
tab2 <- table(stormData$EVTYPE[stormData$year > 1992])
tail(sort(tab2), n=50)
##
## DRY MICROBURST STRONG WINDS EXTREME WINDCHILL
## 186 196 204
## HEAVY SURF/HIGH SURF FREEZING RAIN STORM SURGE
## 228 250 261
## HURRICANE/TYPHOON WIND AVALANCHE
## 271 340 386
## DUST STORM MARINE HAIL FOG
## 427 442 538
## COLD/WIND CHILL SNOW LANDSLIDE
## 539 587 600
## FLOOD/FLASH FLOOD LAKE-EFFECT SNOW COASTAL FLOOD
## 624 636 650
## EXTREME COLD FLASH FLOODING TROPICAL STORM
## 655 682 690
## HIGH SURF HEAT RIP CURRENT
## 725 767 774
## EXTREME COLD/WIND CHILL TSTM WIND/HAIL WINTER WEATHER/MIX
## 1002 1028 1104
## DENSE FOG FROST/FREEZE HIGH WINDS
## 1293 1342 1533
## EXCESSIVE HEAT ICE STORM DROUGHT
## 1678 2006 2488
## BLIZZARD STRONG WIND WATERSPOUT
## 2719 3566 3796
## WILDFIRE FUNNEL CLOUD WINTER WEATHER
## 4218 6839 7026
## WINTER STORM HEAVY RAIN MARINE TSTM WIND
## 11433 11725 11987
## HEAVY SNOW LIGHTNING HIGH WIND
## 15708 15754 20212
## TORNADO FLOOD FLASH FLOOD
## 25888 28967 54277
## HAIL TSTM WIND
## 226829 232383
# Try something simple, just sum up FATALITIES, INJURIES, PROPDMG, CROPDMG by EVTYPE
# totals computed above
results <- ddply(stormData[,c("EVTYPE", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE), numcolwise(sum))
row.names(results) <- results$EVTYPE
if (debug) {
hist(log10(results$FATALITIES))
hist(log10(results$INJURIES))
hist(log10(results$PROPDMGADJ))
hist(log10(results$CROPDMGADJ))
}
cols <- 2:5
if (FALSE) { # This has been moved into Results
tail(results[order(results$FATALITIES),cols], n=20)
tail(results[order(results$INJURIES),cols], n=20)
tail(results[order(results$PROPDMGADJ),cols], n=20)
tail(results[order(results$CROPDMGADJ),cols], n=20)
# Check proportion of results accounted for by top 20 event types
tail(results[order(results$FATALITIES),cols], n=20)$FATALITIES / totals[1]
sum(tail(results[order(results$FATALITIES),cols], n=20)$FATALITIES) / totals[1]
sum(tail(results[order(results$INJURIES),cols], n=20)$INJURIES) / totals[2]
sum(tail(results[order(results$PROPDMGADJ),cols], n=20)$PROPDMGADJ) / totals[3]
sum(tail(results[order(results$CROPDMGADJ),cols], n=20)$CROPDMGADJ) / totals[4]
}
# Look at means to check if any of the low frequency EVTYPEs are outliers and
# should be merged into totals
resultsmean <- ddply(stormData[,c("EVTYPE", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE), numcolwise(mean))
row.names(resultsmean) <- resultsmean$EVTYPE # label rows from the same data frame
tail(resultsmean[order(resultsmean$FATALITIES),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## Hypothermia/Exposure 1.333 0.000 0.000e+00 0
## HEAVY SEAS 1.500 0.000 0.000e+00 0
## TSUNAMI 1.650 6.450 7.803e+06 1098
## HURRICANE OPAL/HIGH WINDS 2.000 0.000 1.630e+08 16300329
## UNSEASONABLY WARM AND DRY 2.231 0.000 0.000e+00 0
## HEAT WAVE 2.324 4.176 2.245e+05 118807
## HEAT WAVES 2.500 0.000 0.000e+00 0
## RIP CURRENTS/HEAVY SURF 2.500 0.000 0.000e+00 0
## ROUGH SEAS 2.667 1.667 0.000e+00 0
## Heavy surf and wind 3.000 0.000 0.000e+00 0
## HIGH WIND AND SEAS 3.000 20.000 7.727e+04 0
## WINTER STORMS 3.333 5.667 2.717e+05 271672
## MARINE MISHAP 3.500 2.500 0.000e+00 0
## HEAT WAVE DROUGHT 4.000 15.000 3.091e+05 77272
## HIGH WIND/SEAS 4.000 0.000 8.150e+05 0
## EXTREME HEAT 4.364 7.045 8.078e+03 351236
## RECORD/EXCESSIVE HEAT 5.667 0.000 0.000e+00 0
## TROPICAL STORM GORDON 8.000 43.000 8.150e+05 815016
## COLD AND SNOW 14.000 0.000 0.000e+00 0
## TORNADOES, TSTM WIND, HAIL 25.000 0.000 2.608e+09 4075082
tail(resultsmean[order(resultsmean$INJURIES),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## HEAT WAVE 2.3243 4.176 224492 118807
## HURRICANE/TYPHOON 0.4649 4.878 390028241 25939536
## EXCESSIVE RAINFALL 0.5000 5.250 0 0
## WATERSPOUT/TORNADO 0.3750 5.250 10156130 0
## TORNADO F2 0.0000 5.333 832905 0
## WINTER STORMS 3.3333 5.667 271672 271672
## TSUNAMI 1.6500 6.450 7802744 1098
## GLAZE 0.2188 6.750 43827 0
## NON-SEVERE WIND DAMAGE 0.0000 7.000 6838 0
## EXTREME HEAT 4.3636 7.045 8078 351236
## WINTER WEATHER MIX 0.0000 11.333 13093 0
## GLAZE/ICE STORM 0.0000 15.000 0 0
## HEAT WAVE DROUGHT 4.0000 15.000 309088 77272
## WINTER STORM HIGH WINDS 1.0000 15.000 97801973 8150164
## SNOW/HIGH WINDS 0.0000 18.000 77272 0
## HIGH WIND AND SEAS 3.0000 20.000 77272 0
## THUNDERSTORMW 0.0000 27.000 193180 0
## WILD FIRES 0.7500 37.500 254324849 0
## TROPICAL STORM GORDON 8.0000 43.000 815016 815016
## Heat Wave 0.0000 70.000 0 0
tail(resultsmean[order(resultsmean$PROPDMGADJ),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## FLASH FLOOD/FLOOD 0.63636 0.00000 1.914e+07 3.899e+04
## WILDFIRES 0.00000 0.00000 1.996e+07 9.659e+04
## Heavy Rain/High Surf 0.00000 0.00000 2.027e+07 2.252e+06
## COASTAL FLOODING/EROSION 0.00000 0.00000 2.167e+07 0.000e+00
## River Flooding 0.00000 0.20000 3.188e+07 8.414e+06
## HIGH WINDS/COLD 0.00000 0.80000 3.415e+07 2.164e+06
## STORM SURGE/TIDE 0.07432 0.03378 3.430e+07 6.283e+03
## RIVER FLOOD 0.01156 0.01156 4.819e+07 4.738e+07
## MAJOR FLOOD 0.00000 0.00000 5.561e+07 0.000e+00
## HURRICANE ERIN 0.85714 0.14286 6.010e+07 3.167e+07
## HURRICANE EMILY 0.00000 1.00000 7.944e+07 0.000e+00
## TYPHOON 0.00000 0.45455 8.005e+07 1.100e+05
## WINTER STORM HIGH WINDS 1.00000 15.00000 9.780e+07 8.150e+06
## HAILSTORM 0.00000 0.00000 1.242e+08 0.000e+00
## SEVERE THUNDERSTORM 0.00000 0.00000 1.433e+08 2.508e+04
## HURRICANE OPAL/HIGH WINDS 2.00000 0.00000 1.630e+08 1.630e+07
## STORM SURGE 0.04981 0.14559 2.005e+08 2.876e+01
## WILD FIRES 0.75000 37.50000 2.543e+08 0.000e+00
## HURRICANE/TYPHOON 0.46494 4.87823 3.900e+08 2.594e+07
## TORNADOES, TSTM WIND, HAIL 25.00000 0.00000 2.608e+09 4.075e+06
tail(resultsmean[order(resultsmean$CROPDMGADJ),cols], n=20)
## FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## DROUGHT 0.00000 0.001608 546829 7496730
## Unseasonable Cold 0.00000 0.000000 0 7656877
## Freeze 0.00000 0.000000 0 7882079
## COOL AND WET 0.00000 0.000000 0 8150164
## WINTER STORM HIGH WINDS 1.00000 15.000000 97801973 8150164
## River Flooding 0.00000 0.200000 31875127 8413556
## TROPICAL STORM JERRY 0.00000 0.000000 2499384 8693509
## FREEZE 0.01351 0.000000 3931 9218558
## SEVERE THUNDERSTORM WINDS 0.00000 0.000000 57051 9454191
## Extreme Cold 1.00000 0.000000 0 15013484
## HURRICANE OPAL/HIGH WINDS 2.00000 0.000000 163003288 16300329
## Damaging Freeze 0.00000 0.000000 0 25620510
## HURRICANE/TYPHOON 0.46494 4.878229 390028241 25939536
## FLOOD/RAIN/WINDS 0.00000 0.000000 0 30644618
## HURRICANE ERIN 0.85714 0.142857 60101052 31671539
## RIVER FLOOD 0.01156 0.011561 48193242 47384063
## Early Frost 0.00000 0.000000 0 63056631
## DAMAGING FREEZE 0.00000 0.000000 2001798 69404508
## COLD AND WET CONDITIONS 0.00000 0.000000 0 107582170
## EXCESSIVE WETNESS 0.00000 0.000000 0 231464669
if (FALSE) { # Most of this moved to results and improved
# Take a look at stats by year
# Note differences in date ranges!
#severe <- ddply(stormData[stormData$EVTYPE %in% c("TORNADO", "FLASH FLOOD", "FLOOD", "LIGHTNING"),
# Now that exponents are properly accounted for these are no longer all most severe
severe <- ddply(stormData[stormData$EVTYPE %in% c("TORNADO", "FLASH FLOOD", "FLOOD", "LIGHTNING", "HEAT", "EXCESSIVE HEAT", "HAIL", "TSTM WIND"),
c("EVTYPE", "year", "FATALITIES", "INJURIES",
"PROPDMGADJ", "CROPDMGADJ")],
.(EVTYPE, year), numcolwise(sum))
qplot(year, FATALITIES, data=severe, geom="line", color=EVTYPE) +
ggtitle("Fatalities by Year and Storm Event Type") +
xlab("Year") +
ylab("Fatalities")
qplot(year, INJURIES, data=severe, geom="line", color=EVTYPE) +
ggtitle("Injuries by Year and Storm Event Type") +
xlab("Year") +
ylab("Injuries")
qplot(year, PROPDMGADJ, data=severe, geom="line", color=EVTYPE) +
ggtitle("Property Damage by Year and Storm Event Type") +
xlab("Year") +
ylab("Property Damage (2014 $)")
qplot(year, CROPDMGADJ, data=severe, geom="line", color=EVTYPE) +
ggtitle("Crop Damage by Year and Storm Event Type") +
xlab("Year") +
ylab("Crop Damage (2014 $)")
}
# Look at top 20 events in each category
columns <- c("EVTYPE", "date", "FATALITIES", "INJURIES", "PROPDMGADJ", "CROPDMGADJ")
tail(stormData[order(stormData$FATALITIES), columns], n=20)
## EVTYPE date FATALITIES INJURIES PROPDMGADJ CROPDMGADJ
## 83578 TORNADO 1957-05-20 37 176 6.474e+06 0
## 78123 TORNADO 1953-12-05 38 270 7.146e+07 0
## 157885 TORNADO 1979-04-10 42 1700 4.506e+08 0
## 362850 EXCESSIVE HEAT 1999-07-18 42 397 0.000e+00 0
## 629242 EXCESSIVE HEAT 2006-08-01 42 0 0.000e+00 0
## 860386 TORNADO 2011-04-27 44 800 1.570e+09 0
## 606363 EXCESSIVE HEAT 2006-07-16 46 18 1.986e+05 575315869
## 598500 EXCESSIVE HEAT 2005-09-21 49 0 0.000e+00 0
## 6370 TORNADO 1952-03-21 50 325 2.094e+07 0
## 78567 TORNADO 1966-03-03 57 504 7.146e+07 0
## 247938 EXTREME HEAT 1995-07-13 57 0 0.000e+00 0
## 230927 EXCESSIVE HEAT 1995-07-01 67 0 0.000e+00 0
## 371112 EXCESSIVE HEAT 1999-07-04 74 135 0.000e+00 0
## 46309 TORNADO 1955-05-25 75 270 1.325e+06 0
## 67884 TORNADO 1953-06-09 90 1228 9.024e+08 0
## 355128 EXCESSIVE HEAT 1999-07-28 99 0 0.000e+00 0
## 148852 TORNADO 1953-05-11 114 597 4.750e+07 0
## 68670 TORNADO 1953-06-08 116 785 9.024e+07 0
## 862634 TORNADO 2011-05-22 158 1150 2.932e+09 0
## 198704 HEAT 1995-07-12 583 0 0.000e+00 0
tail(stormData[order(stormData$INJURIES), columns], n=20)
## EVTYPE date FATALITIES INJURIES PROPDMGADJ
## 344182 FLOOD 1998-10-17 4 500 7.224e+06
## 344289 FLOOD 1998-10-18 0 500 7.224e+06
## 344291 FLOOD 1998-10-18 0 500 7.224e+05
## 78567 TORNADO 1966-03-03 57 504 7.146e+07
## 667233 EXCESSIVE HEAT 2007-08-04 2 519 0.000e+00
## 344163 FLOOD 1998-10-17 0 550 9.535e+07
## 35124 TORNADO 1965-04-11 17 560 1.605e+09
## 148852 TORNADO 1953-05-11 114 597 4.750e+07
## 344158 FLOOD 1998-10-17 11 600 1.156e+07
## 858228 TORNADO 2011-04-27 20 700 7.329e+08
## 344178 FLOOD 1998-10-17 0 750 3.872e+08
## 529351 HURRICANE/TYPHOON 2004-08-13 7 780 6.757e+09
## 68670 TORNADO 1953-06-08 116 785 9.024e+07
## 344159 FLOOD 1998-10-17 2 800 7.224e+07
## 860386 TORNADO 2011-04-27 44 800 1.570e+09
## 116011 TORNADO 1974-04-03 36 1150 5.472e+08
## 862634 TORNADO 2011-05-22 158 1150 2.932e+09
## 67884 TORNADO 1953-06-09 90 1228 9.024e+08
## 223449 ICE STORM 1994-02-08 1 1568 7.727e+07
## 157885 TORNADO 1979-04-10 42 1700 4.506e+08
## CROPDMGADJ
## 344182 144471
## 344289 1444705
## 344291 288941
## 78567 0
## 667233 0
## 344163 332282
## 35124 0
## 148852 0
## 344158 144471
## 858228 0
## 344178 2239293
## 529351 355289858
## 68670 0
## 344159 72235
## 860386 0
## 116011 0
## 862634 0
## 67884 0
## 223449 7727190
## 157885 0
tail(stormData[order(stormData$PROPDMGADJ), columns], n=20)
## EVTYPE date FATALITIES INJURIES PROPDMGADJ
## 525145 HURRICANE/TYPHOON 2004-09-13 0 0 3.117e+09
## 195000 HURRICANE/TYPHOON 1995-10-03 1 0 3.423e+09
## 207175 HEAVY RAIN 1995-05-08 0 0 3.972e+09
## 366694 HURRICANE/TYPHOON 1999-09-15 0 0 4.241e+09
## 739576 STORM SURGE/TIDE 2008-09-12 11 0 4.376e+09
## 298088 FLOOD 1997-04-18 0 0 4.401e+09
## 577683 HURRICANE/TYPHOON 2005-09-23 1 0 4.824e+09
## 529498 HURRICANE/TYPHOON 2004-09-13 7 0 4.987e+09
## 529436 HURRICANE/TYPHOON 2004-09-04 0 0 6.021e+09
## 529351 HURRICANE/TYPHOON 2004-08-13 7 780 6.757e+09
## 443782 TROPICAL STORM 2001-06-05 22 0 6.850e+09
## 581537 HURRICANE/TYPHOON 2005-08-29 15 104 7.092e+09
## 187564 WINTER STORM 1993-03-12 4 0 8.150e+09
## 198389 RIVER FLOOD 1993-08-31 0 0 8.150e+09
## 581533 HURRICANE/TYPHOON 2005-08-28 0 0 8.864e+09
## 569308 HURRICANE/TYPHOON 2005-10-24 5 0 1.206e+10
## 581535 STORM SURGE 2005-08-29 0 0 1.358e+10
## 577675 HURRICANE/TYPHOON 2005-08-28 0 0 2.042e+10
## 577676 STORM SURGE 2005-08-29 0 0 3.775e+10
## 605953 FLOOD 2006-01-01 0 0 1.344e+11
## CROPDMGADJ
## 525145 3.117e+07
## 195000 8.150e+06
## 207175 0.000e+00
## 366694 7.069e+08
## 739576 0.000e+00
## 298088 0.000e+00
## 577683 0.000e+00
## 529498 3.117e+07
## 529436 1.162e+08
## 529351 3.553e+08
## 443782 0.000e+00
## 581537 1.821e+09
## 187564 0.000e+00
## 198389 8.150e+09
## 581533 0.000e+00
## 569308 0.000e+00
## 581535 0.000e+00
## 577675 0.000e+00
## 577676 0.000e+00
## 605953 3.797e+07
tail(stormData[order(stormData$CROPDMGADJ), columns], n=20)
## EVTYPE date FATALITIES INJURIES PROPDMGADJ
## 404049 DROUGHT 2000-08-01 0 0 0.000e+00
## 445697 DROUGHT 2001-12-01 0 0 0.000e+00
## 667177 FLOOD 2007-07-01 0 0 5.679e+03
## 606363 EXCESSIVE HEAT 2006-07-16 46 18 1.986e+05
## 366678 HURRICANE/TYPHOON 1999-09-14 13 0 5.805e+08
## 468052 DROUGHT 2002-12-01 0 0 0.000e+00
## 344463 DROUGHT 1998-12-01 0 0 0.000e+00
## 188633 HEAT 1995-08-20 0 0 0.000e+00
## 384125 FLOOD 2000-10-03 0 0 6.154e+08
## 410175 DROUGHT 2000-11-01 0 0 0.000e+00
## 366694 HURRICANE/TYPHOON 1999-09-15 0 0 4.241e+09
## 371097 DROUGHT 1999-07-01 0 0 0.000e+00
## 337008 DROUGHT 1998-07-06 0 0 0.000e+00
## 422676 DROUGHT 2001-08-01 0 0 0.000e+00
## 199733 DROUGHT 1995-08-01 0 0 0.000e+00
## 312986 EXTREME COLD 1998-12-20 0 0 0.000e+00
## 639347 DROUGHT 2006-01-01 0 0 0.000e+00
## 581537 HURRICANE/TYPHOON 2005-08-29 15 104 7.092e+09
## 211900 ICE STORM 1994-02-09 0 0 7.944e+05
## 198389 RIVER FLOOD 1993-08-31 0 0 8.150e+09
## CROPDMGADJ
## 404049 5.468e+08
## 445697 5.587e+08
## 667177 5.679e+08
## 606363 5.753e+08
## 366678 5.847e+08
## 468052 6.285e+08
## 344463 6.501e+08
## 188633 6.520e+08
## 384125 6.838e+08
## 410175 7.043e+08
## 366694 7.069e+08
## 371097 7.069e+08
## 337008 7.224e+08
## 422676 7.700e+08
## 199733 8.150e+08
## 312986 8.610e+08
## 639347 1.168e+09
## 581537 1.821e+09
## 211900 7.944e+09
## 198389 8.150e+09
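# Notice that the top 2 crop damage events look odd: row 211900 shows 7.944e+05 property vs 7.944e+09 crop damage, and row 198389 shows identical property and crop damage (8.150e+09)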
Look at what proportion of the total effects is accounted for by the top 20 events in each category. The share of economic damage caused by the worst events is dramatic: for both property and crop damage, the top 20 of 90000 storm events account for over 40% of the total (0.4863 and 0.4395 in the output below), compared with about 12% of fatalities and injuries.
sum(tail(stormData[order(stormData$FATALITIES),], n=20)$FATALITIES) / totals[1]
## FATALITIES
## 0.1241
sum(tail(stormData[order(stormData$INJURIES),], n=20)$INJURIES) / totals[2]
## INJURIES
## 0.1156
sum(tail(stormData[order(stormData$PROPDMGADJ),], n=20)$PROPDMGADJ) / totals[3]
## PROPDMGADJ
## 0.4863
sum(tail(stormData[order(stormData$CROPDMGADJ),], n=20)$CROPDMGADJ) / totals[4]
## CROPDMGADJ
## 0.4395
nevents <- 50 # repeat the same calculation for the top 50 events
sum(tail(stormData[order(stormData$FATALITIES),], n=nevents)$FATALITIES) / totals[1]
## FATALITIES
## 0.1811
sum(tail(stormData[order(stormData$INJURIES),], n=nevents)$INJURIES) / totals[2]
## INJURIES
## 0.1951
sum(tail(stormData[order(stormData$PROPDMGADJ),], n=nevents)$PROPDMGADJ) / totals[3]
## PROPDMGADJ
## 0.5783
sum(tail(stormData[order(stormData$CROPDMGADJ),], n=nevents)$CROPDMGADJ) / totals[4]
## CROPDMGADJ
## 0.5839
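The share calculation above is repeated once per column and per cutoff. As a minimal sketch, it could be wrapped in a small helper. The name `top_share` and the toy data frame are hypothetical and not part of the original analysis, and the sketch assumes that `totals` holds plain column sums.

```r
# Hypothetical helper (not in the original analysis): the fraction of a
# column's total that comes from its n largest values.
top_share <- function(x, n = 20) {
  x <- x[!is.na(x)]                               # ignore missing values
  sum(head(sort(x, decreasing = TRUE), n)) / sum(x)
}

# Toy data standing in for stormData; the values are made up for illustration.
toy <- data.frame(
  FATALITIES = c(0, 1, 0, 2, 7),
  PROPDMGADJ = c(10, 0, 5, 1000, 85)
)

# One call covers every outcome column for a given cutoff.
shares <- sapply(toy, top_share, n = 2)
print(shares)
```

With the real data, something like `sapply(stormData[, c("FATALITIES", "INJURIES", "PROPDMGADJ", "CROPDMGADJ")], top_share, n = 50)` should give the same four ratios as the separate calls, provided the column-sum assumption holds.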
Look at total outcomes by year and by state, each sorted in ascending order.
sort(xtabs(FATALITIES ~ year, data=stormData))
## year
## 1981 1972 1980 1962 1963 1951 1954 1983 1977 1976 1960 1986 1961 1978 1992
## 24 27 28 30 31 34 36 37 43 44 46 51 52 53 54
## 1988 1959 1975 1982 1969 1958 1950 1964 1970 1991 1989 1956 1979 1973 1987
## 55 58 60 64 66 67 70 73 73 73 79 83 84 89 89
## 1990 1966 1985 1967 1955 1968 1971 1984 1957 1952 1993 1965 2009 1994 1974
## 95 98 112 114 129 131 159 160 193 230 298 301 333 344 366
## 2004 2007 2010 2003 2001 2005 2000 2008 2002 1953 1996 2006 1997 1998 1999
## 370 421 425 443 469 469 477 488 498 519 542 599 601 687 908
## 2011 1995
## 1002 1491
sort(xtabs(INJURIES ~ year, data=stormData))
## year
## 1951 1958 1963 1962 1950 1954 1959 1960 1977 1981 1983 1986
## 524 535 538 551 659 715 734 737 771 798 816 915
## 1978 1955 1972 1988 1961 1964 1980 1976 1982 1969 2009 1956
## 919 926 976 1030 1087 1148 1157 1195 1276 1311 1354 1355
## 1970 1991 1987 1975 1985 1989 1992 1990 2005 2010 1952 1957
## 1355 1355 1416 1457 1513 1675 1754 1825 1834 1855 1915 1976
## 1966 1967 1993 2007 1973 2004 1968 2008 1996 2001 1971 2000
## 2030 2144 2149 2191 2406 2426 2522 2703 2717 2721 2723 2803
## 1984 2003 1979 2002 2006 1997 1994 1995 1953 1999 1965 1974
## 2858 2931 3014 3155 3368 3800 4161 4480 5131 5148 5197 6824
## 2011 1998
## 7792 11177
sort(xtabs(PROPDMGADJ ~ year, data=stormData))
## year
## 1950 1962 1951 1959 1958 1955 1952
## 1.382e+08 2.564e+08 2.598e+08 2.903e+08 3.442e+08 3.513e+08 3.633e+08
## 1963 1969 1954 1960 1956 1964 1957
## 3.666e+08 4.115e+08 4.288e+08 4.659e+08 5.249e+08 6.693e+08 7.205e+08
## 1971 1972 1961 1968 1977 1976 1970
## 7.369e+08 7.734e+08 9.290e+08 9.586e+08 1.180e+09 1.341e+09 1.342e+09
## 1987 1967 1978 1953 1966 1981 1983
## 1.416e+09 2.293e+09 2.411e+09 2.466e+09 2.475e+09 2.523e+09 3.181e+09
## 1975 1979 1985 1992 1986 1991 1988
## 4.036e+09 4.293e+09 4.300e+09 4.428e+09 4.774e+09 4.850e+09 5.062e+09
## 2002 1980 2009 1994 1982 1990 2007
## 5.369e+09 5.400e+09 5.737e+09 5.947e+09 6.124e+09 6.376e+09 6.575e+09
## 2000 1984 1965 1996 2010 1974 1989
## 7.688e+09 7.908e+09 8.529e+09 9.138e+09 9.985e+09 1.031e+10 1.069e+10
## 1999 2003 2001 1973 1997 1998 2008
## 1.233e+10 1.312e+10 1.334e+10 1.395e+10 1.402e+10 1.671e+10 1.703e+10
## 1995 2011 1993 2004 2005 2006
## 1.704e+10 2.187e+10 2.653e+10 3.160e+10 1.167e+11 1.425e+11
sort(xtabs(CROPDMGADJ ~ year, data=stormData))
## year
## 1950 1951 1952 1953 1954 1955 1956
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## 1957 1958 1959 1960 1961 1962 1963
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## 1964 1965 1966 1967 1968 1969 1970
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## 1971 1972 1973 1974 1975 1976 1977
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## 1978 1979 1980 1981 1982 1983 1984
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## 1985 1986 1987 1988 1989 1990 1991
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## 1992 2009 2011 2003 1997 2004 2002
## 0.000e+00 5.732e+08 6.981e+08 1.463e+09 1.802e+09 1.810e+09 1.847e+09
## 2007 2010 2001 2008 1996 2006 2000
## 1.921e+09 1.928e+09 2.416e+09 2.418e+09 2.835e+09 4.129e+09 4.553e+09
## 1995 2005 1999 1998 1993 1994
## 4.750e+09 4.867e+09 4.993e+09 6.503e+09 9.108e+09 9.225e+09
sort(xtabs(FATALITIES ~ STATE, data=stormData))
## STATE
## LC LE LH LO MH PK PM SL ST XX GM LS PH LM PZ
## 0 0 0 0 0 0 0 0 0 0 1 1 1 4 5
## RI VI AM AN VT ME DE DC NH AS CT HI WY ID MT
## 7 7 10 12 23 25 30 31 32 41 41 44 56 58 58
## SD ND NM AK OR GU WV NE NV PR UT IA MA WA MD
## 61 69 72 74 75 81 92 102 105 115 136 140 140 146 162
## CO MN VA NJ AZ SC KY WI LA GA NY KS IN MI NC
## 163 168 174 180 208 221 239 279 310 327 342 356 391 398 398
## OH OK TN AR CA MS FL MO AL PA TX IL
## 403 458 521 530 550 555 746 754 784 846 1366 1421
sort(xtabs(INJURIES ~ STATE, data=stormData))
## STATE
## GM LC LE LH LO LS PH PK PM SL ST XX
## 0 0 0 0 0 0 0 0 0 0 0 0
## MH LM VI PZ AN AM RI PR VT HI AK AS
## 1 2 2 3 23 30 48 52 71 95 112 164
## ME MT NH OR NV ID DE WV DC NM GU WY
## 177 181 195 225 232 273 338 363 383 385 416 432
## ND WA SD CT AZ CO UT NJ NY NE MD VA
## 608 753 868 897 968 1004 1070 1152 1340 1471 1537 1703
## SC MA MN WI IA LA PA CA NC KS KY MI
## 1786 2121 2282 2309 2892 3215 3223 3278 3415 3449 3480 4586
## IN GA TN AR IL OK FL MS OH AL MO TX
## 4720 5061 5202 5550 5563 5710 5918 6675 7112 8742 8998 17667
sort(xtabs(PROPDMGADJ ~ STATE, data=stormData))
## STATE
## LC LH PH PM ST XX SL
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 1.964e+04
## LE PK PZ LO AN LS LM
## 3.384e+04 3.701e+04 8.249e+04 8.818e+04 3.293e+05 4.674e+05 2.967e+06
## GM AM MH VI RI DC AS
## 4.531e+06 5.981e+06 7.727e+06 5.426e+07 1.414e+08 2.031e+08 2.356e+08
## DE WY NH HI AK ID MT
## 2.577e+08 2.732e+08 2.846e+08 3.049e+08 3.686e+08 3.912e+08 4.570e+08
## ME WA SD UT NV WV GU
## 8.031e+08 9.818e+08 1.009e+09 1.053e+09 1.209e+09 1.331e+09 1.343e+09
## OR VT MD SC MA NM VA
## 1.359e+09 1.635e+09 1.699e+09 1.814e+09 2.183e+09 2.676e+09 2.736e+09
## PR NJ CO WI AZ CT KY
## 3.126e+09 3.862e+09 4.108e+09 4.267e+09 4.515e+09 4.776e+09 4.789e+09
## MI NE NY PA ND TN OK
## 4.801e+09 5.520e+09 6.207e+09 7.281e+09 7.522e+09 7.599e+09 8.320e+09
## MN OH MO AR KS NC IA
## 9.154e+09 9.988e+09 1.011e+10 1.205e+10 1.220e+10 1.324e+10 1.550e+10
## IN GA IL AL TX MS FL
## 1.603e+10 1.931e+10 1.996e+10 3.223e+10 3.509e+10 3.760e+10 5.756e+10
## LA CA
## 7.607e+10 1.463e+11
sort(xtabs(CROPDMGADJ ~ STATE, data=stormData))
## STATE
## AN GM LC LE LH LM LO
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## LS MH PH PK PM PZ RI
## 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
## SL ST XX DC CT AM NH
## 0.000e+00 0.000e+00 0.000e+00 7.224e+03 3.928e+04 5.235e+04 2.412e+05
## AK VI ME MA AS WY UT
## 2.556e+05 2.675e+05 9.950e+05 1.519e+06 3.499e+06 4.124e+06 6.934e+06
## NV HI ID TN VT DE NM
## 9.856e+06 1.089e+07 2.646e+07 3.493e+07 3.704e+07 4.170e+07 4.572e+07
## WV MT OR NJ GU MD SC
## 5.073e+07 7.598e+07 1.288e+08 1.347e+08 1.402e+08 1.597e+08 1.625e+08
## AR SD CO NY AZ MI KY
## 1.910e+08 2.463e+08 2.499e+08 2.960e+08 3.154e+08 3.414e+08 4.034e+08
## MN OH ND VA WA PR PA
## 4.438e+08 5.481e+08 6.969e+08 7.481e+08 8.099e+08 8.585e+08 8.602e+08
## KS MO AL IN WI GA LA
## 8.649e+08 8.660e+08 9.636e+08 9.952e+08 1.199e+09 1.424e+09 1.600e+09
## OK NC NE CA FL IA IL
## 1.670e+09 2.836e+09 2.999e+09 4.857e+09 5.153e+09 6.129e+09 8.859e+09
## TX MS
## 9.460e+09 9.877e+09
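Each `xtabs` call above yields one sorted vector, so comparing a state across outcomes means scanning several printouts. A possible alternative, sketched here on made-up toy data rather than the real `stormData`, is to sum several outcome columns per state in one table with `aggregate`. The `toy` and `by_state` names are illustrative only.

```r
# Toy data standing in for stormData; the values are made up for illustration.
toy <- data.frame(
  STATE      = c("TX", "TX", "FL", "FL", "IL"),
  FATALITIES = c(3, 1, 0, 2, 5),
  INJURIES   = c(10, 0, 4, 1, 7)
)

# Sum both outcome columns within each state in a single call.
by_state <- aggregate(cbind(FATALITIES, INJURIES) ~ STATE, data = toy, FUN = sum)

# Sort ascending by fatalities, mirroring sort(xtabs(...)) above.
by_state <- by_state[order(by_state$FATALITIES), ]
print(by_state)
```

The same pattern should extend to `PROPDMGADJ` and `CROPDMGADJ`, and to `year` in place of `STATE`. Note that the formula interface of `aggregate` drops rows with missing values by default.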