Synopsis

Natural disasters strike the United States every year and cause extensive damage. Using data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, we try to identify the storm and weather event types that are most destructive. For this purpose, we use fatalities and injuries to measure human costs, and property and crop damage to measure economic costs.

Data Processing

First, we load the libraries we will use.

library(R.utils)
library(lubridate)
library(dplyr)

Then we download, decompress, and read the data file.

download.file(url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
              destfile = "stormdata.csv.bz2", method = "curl")
bunzip2(filename = "stormdata.csv.bz2", remove = TRUE, overwrite = TRUE)

storm.df <- read.csv("stormdata.csv", strip.white = TRUE)

Then we convert REMARKS to character, parse BGN_DATE as a date-time, and extract the year of each event.

storm.df$REMARKS <- as.character(storm.df$REMARKS)
storm.df$BGN_DATE <- mdy_hms(storm.df$BGN_DATE)
# lubridate's year() returns the year as a number directly
storm.df$year <- year(storm.df$BGN_DATE)
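
As a quick check of the date handling above, the following sketch parses a single timestamp written in the month/day/year layout that `mdy_hms()` expects. The sample string is an assumption made up for illustration, not a value quoted from the dataset.

```r
library(lubridate)

# Hypothetical sample value in month/day/year hour:minute:second form
sample_date <- "4/18/1950 0:00:00"

# Parse to a date-time (UTC by default), then pull out the numeric year
parsed <- mdy_hms(sample_date)
event_year <- year(parsed)

print(parsed)
print(event_year)
```

If the real BGN_DATE strings used a different layout, `mdy_hms()` would return `NA` with a warning, so checking `sum(is.na(storm.df$BGN_DATE))` after parsing is a cheap safeguard.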

We plot a histogram to see which years have the most recorded events. The number of recorded events rises sharply in the 1990s, which is probably due to better measurement.

hist(storm.df$year, breaks = 60, xlab = "Year", main = "Damaging Weather Events in USA")
abline(v = 1993, col = "red")

The number of recorded events increases after 1993, so we keep only the data from 1993 onward. We also drop the variables we don’t need.

storm1993on.df <- subset(storm.df, year >= 1993)
storm_tbl <- as_tibble(storm1993on.df)  # tbl_df() is deprecated in current dplyr
storm_small_tbl <- select(storm_tbl, contains("year"), 
                          contains("BGN_DATE"), 
                          contains("EVTYPE"), 
                          contains("FATALITIES"), 
                          contains("INJURIES"), 
                          contains("PROPDMG"), 
                          contains("CROPDMG"))

Results

Fatalities

fatalities_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(FATALITIES)) 
colnames(fatalities_tbl) <- c("event", "fatalities")

head(arrange(fatalities_tbl, desc(fatalities)), n = 20)
## Source: local data frame [20 x 2]
## 
##                      event fatalities
## 1           EXCESSIVE HEAT       1903
## 2                  TORNADO       1621
## 3              FLASH FLOOD        978
## 4                     HEAT        937
## 5                LIGHTNING        816
## 6                    FLOOD        470
## 7              RIP CURRENT        368
## 8                HIGH WIND        248
## 9                TSTM WIND        241
## 10               AVALANCHE        224
## 11            WINTER STORM        206
## 12            RIP CURRENTS        204
## 13               HEAT WAVE        172
## 14            EXTREME COLD        160
## 15       THUNDERSTORM WIND        133
## 16              HEAVY SNOW        127
## 17 EXTREME COLD/WIND CHILL        125
## 18             STRONG WIND        103
## 19                BLIZZARD        101
## 20               HIGH SURF        101

Injuries

injuries_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(INJURIES))
colnames(injuries_tbl) <- c("event", "injuries")
head(arrange(injuries_tbl, desc(injuries)), n = 20)
## Source: local data frame [20 x 2]
## 
##                 event injuries
## 1             TORNADO    23310
## 2               FLOOD     6789
## 3      EXCESSIVE HEAT     6525
## 4           LIGHTNING     5230
## 5           TSTM WIND     3631
## 6                HEAT     2100
## 7           ICE STORM     1975
## 8         FLASH FLOOD     1777
## 9   THUNDERSTORM WIND     1488
## 10       WINTER STORM     1321
## 11  HURRICANE/TYPHOON     1275
## 12          HIGH WIND     1137
## 13         HEAVY SNOW     1021
## 14               HAIL      960
## 15           WILDFIRE      911
## 16 THUNDERSTORM WINDS      908
## 17           BLIZZARD      805
## 18                FOG      734
## 19   WILD/FOREST FIRE      545
## 20         DUST STORM      440
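
The tables above list some event types under several spellings (for example RIP CURRENT and RIP CURRENTS, or TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS), which splits their totals across rows. A minimal sketch of one way to merge such labels before grouping is shown below. The helper `normalize_evtype()` is hypothetical and covers only the variants visible in these tables; it is not part of the original analysis.

```r
# Hypothetical helper: collapse a few spelling variants of EVTYPE.
# Only the variants seen in the tables above are handled here.
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))                    # uniform case, no stray spaces
  x <- gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)   # expand the abbreviation
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)  # plural to singular
  x <- sub("^RIP CURRENTS$", "RIP CURRENT", x)              # plural to singular
  x
}

# Made-up labels for illustration
raw_labels <- c("TSTM WIND", "Thunderstorm Winds ", "RIP CURRENTS", "HAIL")
clean_labels <- normalize_evtype(raw_labels)
print(clean_labels)
```

Applied as `storm_small_tbl$EVTYPE <- normalize_evtype(storm_small_tbl$EVTYPE)` before `group_by(EVTYPE)`, this would combine the affected rows. The rankings shown in this report were computed without such a step.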

Property damage

property_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(PROPDMG))
colnames(property_tbl) <- c("event", "property_damage")
head(arrange(property_tbl, desc(property_damage)), n = 20)
## Source: local data frame [20 x 2]
## 
##                   event property_damage
## 1           FLASH FLOOD      1420124.59
## 2               TORNADO      1387757.09
## 3             TSTM WIND      1335965.61
## 4                 FLOOD       899938.48
## 5     THUNDERSTORM WIND       876844.17
## 6                  HAIL       688693.38
## 7             LIGHTNING       603351.78
## 8    THUNDERSTORM WINDS       446293.18
## 9             HIGH WIND       324731.56
## 10         WINTER STORM       132720.59
## 11           HEAVY SNOW       122251.99
## 12             WILDFIRE        84459.34
## 13            ICE STORM        66000.67
## 14          STRONG WIND        62993.81
## 15           HIGH WINDS        55625.00
## 16           HEAVY RAIN        50842.14
## 17       TROPICAL STORM        48423.68
## 18     WILD/FOREST FIRE        39344.95
## 19       FLASH FLOODING        28497.15
## 20 URBAN/SML STREAM FLD        26051.94

Crop damage

crop_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(CROPDMG))
colnames(crop_tbl) <- c("event", "crop_damage")
head(arrange(crop_tbl, desc(crop_damage)), n = 20)
## Source: local data frame [20 x 2]
## 
##                 event crop_damage
## 1                HAIL   579596.28
## 2         FLASH FLOOD   179200.46
## 3               FLOOD   168037.88
## 4           TSTM WIND   109202.60
## 5             TORNADO   100018.52
## 6   THUNDERSTORM WIND    66791.45
## 7             DROUGHT    33898.62
## 8  THUNDERSTORM WINDS    18684.93
## 9           HIGH WIND    17283.21
## 10         HEAVY RAIN    11122.80
## 11       FROST/FREEZE     7034.14
## 12       EXTREME COLD     6121.14
## 13     TROPICAL STORM     5899.12
## 14          HURRICANE     5339.31
## 15     FLASH FLOODING     5126.05
## 16  HURRICANE/TYPHOON     4798.48
## 17           WILDFIRE     4364.20
## 18     TSTM WIND/HAIL     4356.65
## 19   WILD/FOREST FIRE     4189.54
## 20          LIGHTNING     3580.61
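
The property and crop totals above are sums of the raw PROPDMG and CROPDMG figures. In the NOAA data these figures are paired with magnitude columns (PROPDMGEXP and CROPDMGEXP, which the `contains()` calls earlier also select), so a dollar total needs the figure multiplied by its magnitude first. The sketch below assumes the codes K, M and B mean thousand, million and billion and treats every other code as a multiplier of 1. The helper `exp_to_multiplier()` and the toy rows are hypothetical.

```r
# Hypothetical helper: map magnitude codes to numeric multipliers.
# Assumes K = 1e3, M = 1e6, B = 1e9; any other code (blank, digits, symbols) maps to 1.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Toy rows standing in for storm_small_tbl (values are made up for illustration)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "HAIL"),
  PROPDMG    = c(2.5, 10, 500),
  PROPDMGEXP = c("M", "K", ""),
  stringsAsFactors = FALSE
)

# Scale each figure by its magnitude, then total per event type
toy$prop_dollars <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
prop_totals <- aggregate(prop_dollars ~ EVTYPE, data = toy, FUN = sum)
print(prop_totals)
```

Because a single record in the billions outweighs many records in the thousands, rankings based on scaled dollar amounts could differ from the rankings of raw figures shown above.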

Conclusion

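Since 1993, excessive heat, tornadoes, and floods (including flash floods) have caused the most fatalities and injuries. Based on the summed PROPDMG and CROPDMG values, flash floods, tornadoes, and thunderstorm wind (TSTM WIND) rank highest for property damage, and hail ranks highest for crop damage.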