Natural disasters afflict the USA each year and cause extensive damage. Using data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, we try to identify the storm and weather events that are most destructive. For this purpose we use fatalities and injuries to measure the human cost, and property and crop damage to measure the economic cost.
First, we load the libraries we will use.
library(R.utils)
library(lubridate)
library(dplyr)
Then we load the file.
# Download and decompress only if the CSV is not already present.
if (!file.exists("stormdata.csv")) {
  download.file(url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "stormdata.csv.bz2", method = "curl")
  bunzip2(filename = "stormdata.csv.bz2", remove = TRUE, overwrite = TRUE)
}
storm.df <- read.csv("stormdata.csv", strip.white = TRUE)
Then we make certain adjustments to the dataset.
# Keep the free-text remarks as character strings.
storm.df$REMARKS <- as.character(storm.df$REMARKS)
# Parse the begin date and extract the year as a number.
storm.df$BGN_DATE <- mdy_hms(storm.df$BGN_DATE)
storm.df$year <- year(storm.df$BGN_DATE)
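To illustrate the date adjustment on its own, here is a minimal sketch on two made-up strings in the same month/day/year layout that `mdy_hms()` expects. The sample values and the names `sample_dates`, `parsed`, and `sample_years` are assumptions for illustration, not rows from the dataset.

```r
library(lubridate)

# Hypothetical strings in month/day/year hour:minute:second layout.
sample_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# mdy_hms() parses the strings into POSIXct date-times (UTC by default).
parsed <- mdy_hms(sample_dates)

# year() extracts the calendar year as a number.
sample_years <- year(parsed)
print(sample_years)
```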
We plot a histogram to see in which years the most events were recorded. The number of recorded events rises sharply in the 1990s, which probably reflects better measurement and reporting.
hist(storm.df$year, breaks = 60, xlab = "Year", main = "Damaging Weather Events in USA")
abline(v = 1993, col = "red")
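The same comparison can be made numerically by tabulating events per year. The sketch below uses a short made-up vector, `toy_years`, standing in for `storm.df$year`; the values are hypothetical.

```r
# Hypothetical years standing in for storm.df$year.
toy_years <- c(1990, 1992, 1993, 1993, 1994, 1994, 1994, 1995)

# Count of events per year.
events_per_year <- table(toy_years)
print(events_per_year)

# Share of all events that fall in 1993 or later.
share_from_1993 <- mean(toy_years >= 1993)
print(share_from_1993)
```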
The increase is visible from 1993 onward (the red line), so we keep only the records from 1993 on. We also drop the variables we do not need.
storm1993on.df <- subset(storm.df, year >= 1993)
storm_tbl <- as_tibble(storm1993on.df)
# contains() matches substrings, so PROPDMGEXP and CROPDMGEXP are kept as well.
storm_small_tbl <- select(storm_tbl, contains("year"),
                          contains("BGN_DATE"),
                          contains("EVTYPE"),
                          contains("FATALITIES"),
                          contains("INJURIES"),
                          contains("PROPDMG"),
                          contains("CROPDMG"))
View fatalities
fatalities_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(FATALITIES))
colnames(fatalities_tbl) <- c("event", "fatalities")
head(arrange(fatalities_tbl, desc(fatalities)), n = 20)
## Source: local data frame [20 x 2]
##
## event fatalities
## 1 EXCESSIVE HEAT 1903
## 2 TORNADO 1621
## 3 FLASH FLOOD 978
## 4 HEAT 937
## 5 LIGHTNING 816
## 6 FLOOD 470
## 7 RIP CURRENT 368
## 8 HIGH WIND 248
## 9 TSTM WIND 241
## 10 AVALANCHE 224
## 11 WINTER STORM 206
## 12 RIP CURRENTS 204
## 13 HEAT WAVE 172
## 14 EXTREME COLD 160
## 15 THUNDERSTORM WIND 133
## 16 HEAVY SNOW 127
## 17 EXTREME COLD/WIND CHILL 125
## 18 STRONG WIND 103
## 19 BLIZZARD 101
## 20 HIGH SURF 101
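Several labels in this table appear to describe the same phenomenon (for example `RIP CURRENT` and `RIP CURRENTS`, or `TSTM WIND` and `THUNDERSTORM WIND`), so their totals are split across rows. A minimal sketch of merging such labels before summing is given here. The `normalise_evtype` helper and its small mapping are hypothetical, and the toy counts are not taken from the dataset.

```r
library(dplyr)

# Hypothetical helper: upper-case, trim, and collapse a few known variants.
normalise_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- sub("^TSTM WIND$", "THUNDERSTORM WIND", x)
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)
  x <- sub("^RIP CURRENTS$", "RIP CURRENT", x)
  x
}

# Made-up rows for illustration only.
toy <- data.frame(
  EVTYPE = c("RIP CURRENT", "RIP CURRENTS", "TSTM WIND", "Thunderstorm Wind "),
  FATALITIES = c(3, 2, 1, 4),
  stringsAsFactors = FALSE
)

merged <- toy %>%
  mutate(event = normalise_evtype(EVTYPE)) %>%
  group_by(event) %>%
  summarise(fatalities = sum(FATALITIES)) %>%
  arrange(desc(fatalities))
print(merged)
```

A real mapping would need to be built by inspecting the distinct `EVTYPE` values; the three substitutions here only cover the variants visible in the table.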
View injuries
injuries_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(INJURIES))
colnames(injuries_tbl) <- c("event", "injuries")
head(arrange(injuries_tbl, desc(injuries)), n = 20)
## Source: local data frame [20 x 2]
##
## event injuries
## 1 TORNADO 23310
## 2 FLOOD 6789
## 3 EXCESSIVE HEAT 6525
## 4 LIGHTNING 5230
## 5 TSTM WIND 3631
## 6 HEAT 2100
## 7 ICE STORM 1975
## 8 FLASH FLOOD 1777
## 9 THUNDERSTORM WIND 1488
## 10 WINTER STORM 1321
## 11 HURRICANE/TYPHOON 1275
## 12 HIGH WIND 1137
## 13 HEAVY SNOW 1021
## 14 HAIL 960
## 15 WILDFIRE 911
## 16 THUNDERSTORM WINDS 908
## 17 BLIZZARD 805
## 18 FOG 734
## 19 WILD/FOREST FIRE 545
## 20 DUST STORM 440
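To compare events on both human-cost measures at once, the two sums can be computed in a single `summarise()` call. The sketch below runs on made-up rows; adding fatalities and injuries with equal weight into a `casualties` column is only an illustrative assumption, not a method used in this report.

```r
library(dplyr)

# Made-up rows for illustration only.
toy <- data.frame(
  EVTYPE = c("TORNADO", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 5, 0),
  INJURIES = c(30, 10, 4, 6),
  stringsAsFactors = FALSE
)

casualties <- toy %>%
  group_by(EVTYPE) %>%
  summarise(fatalities = sum(FATALITIES), injuries = sum(INJURIES)) %>%
  # Assumption: weight a fatality and an injury equally.
  mutate(casualties = fatalities + injuries) %>%
  arrange(desc(casualties))
print(casualties)
```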
Property damage
property_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(PROPDMG))
colnames(property_tbl) <- c("event", "property_damage")
head(arrange(property_tbl, desc(property_damage)), n = 20)
## Source: local data frame [20 x 2]
##
## event property_damage
## 1 FLASH FLOOD 1420124.59
## 2 TORNADO 1387757.09
## 3 TSTM WIND 1335965.61
## 4 FLOOD 899938.48
## 5 THUNDERSTORM WIND 876844.17
## 6 HAIL 688693.38
## 7 LIGHTNING 603351.78
## 8 THUNDERSTORM WINDS 446293.18
## 9 HIGH WIND 324731.56
## 10 WINTER STORM 132720.59
## 11 HEAVY SNOW 122251.99
## 12 WILDFIRE 84459.34
## 13 ICE STORM 66000.67
## 14 STRONG WIND 62993.81
## 15 HIGH WINDS 55625.00
## 16 HEAVY RAIN 50842.14
## 17 TROPICAL STORM 48423.68
## 18 WILD/FOREST FIRE 39344.95
## 19 FLASH FLOODING 28497.15
## 20 URBAN/SML STREAM FLD 26051.94
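The `PROPDMG` column holds only the leading digits of each damage estimate. The companion `PROPDMGEXP` column, which the `contains("PROPDMG")` selection also keeps, carries the magnitude: `K`, `M`, and `B` denote thousands, millions, and billions of dollars. The totals above sum `PROPDMG` without that multiplier, so they are not dollar amounts. A minimal sketch of the scaling on made-up rows is given here. The `damage_multiplier` helper is hypothetical, and treating every other code (including blanks) as a multiplier of 1 is an assumption.

```r
library(dplyr)

# Hypothetical helper: map a magnitude code to a dollar multiplier.
# Assumption: any code other than K, M, or B counts as 1.
damage_multiplier <- function(exp_code) {
  known <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- unname(known[toupper(as.character(exp_code))])
  m[is.na(m)] <- 1
  m
}

# Made-up rows for illustration only.
toy <- data.frame(
  EVTYPE = c("FLOOD", "FLOOD", "HAIL", "HAIL"),
  PROPDMG = c(2.5, 500, 900, 10),
  PROPDMGEXP = c("B", "K", "K", ""),
  stringsAsFactors = FALSE
)

scaled <- toy %>%
  mutate(property_usd = PROPDMG * damage_multiplier(PROPDMGEXP)) %>%
  group_by(EVTYPE) %>%
  summarise(raw_sum = sum(PROPDMG), property_usd = sum(property_usd)) %>%
  arrange(desc(property_usd))
print(scaled)
```

In this toy data the raw sum ranks HAIL above FLOOD, while the scaled dollar totals rank FLOOD first, which shows how the ranking can change once the multiplier is applied. The same approach would apply to `CROPDMG` and `CROPDMGEXP`.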
Crop damage
crop_tbl <- storm_small_tbl %>% group_by(EVTYPE) %>% summarise(sum(CROPDMG))
colnames(crop_tbl) <- c("event", "crop_damage")
head(arrange(crop_tbl, desc(crop_damage)), n = 20)
## Source: local data frame [20 x 2]
##
## event crop_damage
## 1 HAIL 579596.28
## 2 FLASH FLOOD 179200.46
## 3 FLOOD 168037.88
## 4 TSTM WIND 109202.60
## 5 TORNADO 100018.52
## 6 THUNDERSTORM WIND 66791.45
## 7 DROUGHT 33898.62
## 8 THUNDERSTORM WINDS 18684.93
## 9 HIGH WIND 17283.21
## 10 HEAVY RAIN 11122.80
## 11 FROST/FREEZE 7034.14
## 12 EXTREME COLD 6121.14
## 13 TROPICAL STORM 5899.12
## 14 HURRICANE 5339.31
## 15 FLASH FLOODING 5126.05
## 16 HURRICANE/TYPHOON 4798.48
## 17 WILDFIRE 4364.20
## 18 TSTM WIND/HAIL 4356.65
## 19 WILD/FOREST FIRE 4189.54
## 20 LIGHTNING 3580.61
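Excessive heat, tornadoes, and floods are the most dangerous events for people. By the summed PROPDMG and CROPDMG values, hail, thunderstorm (TSTM) wind, flash floods, and tornadoes have the highest economic impact. These sums do not apply the PROPDMGEXP and CROPDMGEXP magnitude codes, so the economic ranking should be read as provisional.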