Analysis of severe weather events in NOAA Storm Database

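Synopsis

This report uses the NOAA Storm Database to identify which types of severe weather event (EVTYPE) are most harmful to population health and which are most costly to the economy. Tornadoes account for the most fatalities and injuries. Floods cause the most property damage, and droughts cause the most crop damage.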

Data Processing

Load dplyr and read the data into the environment.

library(dplyr)
weather = read.csv("StormData.csv", stringsAsFactors = FALSE)

Dangerous Events with respect to Population Health

For the purposes of our analysis, we will look at the number of injuries and fatalities caused by each type of event (indicated by EVTYPE).

weather_pop = weather %>%
  group_by(EVTYPE) %>%
  summarise(fatalities = sum(FATALITIES), injuries = sum(INJURIES))

# Top 10 event types by fatalities and by injuries
weather_fatalities = head(weather_pop[order(weather_pop$fatalities, decreasing = TRUE), c(1, 2)], 10)
weather_injuries = head(weather_pop[order(weather_pop$injuries, decreasing = TRUE), c(1, 3)], 10)
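
To make the grouping and ranking steps concrete, here is a minimal sketch on a small made-up data frame. The `toy` data and the `toy_pop` and `toy_top` names are hypothetical and are not part of the storm data; only the dplyr calls mirror the analysis above.

```r
library(dplyr)

# Toy records (made up for illustration, not taken from StormData.csv)
toy = data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES = c(10, 0, 5, 2, 1),
  stringsAsFactors = FALSE
)

# One row per event type, with totals across all of its records
toy_pop = toy %>%
  group_by(EVTYPE) %>%
  summarise(fatalities = sum(FATALITIES), injuries = sum(INJURIES))

# Sort by fatalities, largest first, and keep the top 2 rows
toy_top = head(toy_pop[order(toy_pop$fatalities, decreasing = TRUE), c(1, 2)], 2)
toy_top
```

In this toy case TORNADO (3 + 2 = 5 fatalities) ranks ahead of HEAT (4), and FLOOD (1) drops out of the top two.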

The ten event types with the most fatalities:

weather_fatalities
## # A tibble: 10 x 2
##    EVTYPE         fatalities
##    <chr>               <dbl>
##  1 TORNADO              5633
##  2 EXCESSIVE HEAT       1903
##  3 FLASH FLOOD           978
##  4 HEAT                  937
##  5 LIGHTNING             816
##  6 TSTM WIND             504
##  7 FLOOD                 470
##  8 RIP CURRENT           368
##  9 HIGH WIND             248
## 10 AVALANCHE             224

The ten event types with the most injuries:

weather_injuries
## # A tibble: 10 x 2
##    EVTYPE            injuries
##    <chr>                <dbl>
##  1 TORNADO              91346
##  2 TSTM WIND             6957
##  3 FLOOD                 6789
##  4 EXCESSIVE HEAT        6525
##  5 LIGHTNING             5230
##  6 HEAT                  2100
##  7 ICE STORM             1975
##  8 FLASH FLOOD           1777
##  9 THUNDERSTORM WIND     1488
## 10 HAIL                  1361

Most costly events to the economy

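We first have to convert the property damage (PROPDMG) and crop damage (CROPDMG) figures to their actual values using their exponent codes (PROPDMGEXP and CROPDMGEXP): H = 100, K = 1,000, M = 1,000,000 and B = 1,000,000,000, in either upper or lower case. Any other code is given a multiplier of 0, so those records contribute nothing to the totals.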

# Multipliers for the exponent codes; any other code (or a blank) maps to 0
exp_mult = c(H = 100, K = 1e3, M = 1e6, B = 1e9)
weather_adj = weather %>%
  mutate(prop_mod_num = coalesce(unname(exp_mult[toupper(PROPDMGEXP)]), 0),
         crop_mod_num = coalesce(unname(exp_mult[toupper(CROPDMGEXP)]), 0),
         prop_dmg_adj = PROPDMG * prop_mod_num,
         crop_dmg_adj = CROPDMG * crop_mod_num)
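
As a quick check of the conversion arithmetic, the sketch below applies the same code-to-multiplier mapping to a few made-up records using base R only. The helper `exp_to_mult` and the `toy_dmg` data are hypothetical names introduced for illustration; they are not part of the original analysis.

```r
# Hypothetical helper: maps an exponent code to its multiplier.
# Unrecognised codes (including blanks) become 0, as in the analysis above.
exp_to_mult = function(code) {
  mult = c(H = 100, K = 1e3, M = 1e6, B = 1e9)
  out = unname(mult[toupper(code)])
  out[is.na(out)] = 0
  out
}

# Toy records (made up for illustration)
toy_dmg = data.frame(
  PROPDMG = c(25, 2.5, 1.5, 10, 7),
  PROPDMGEXP = c("K", "M", "b", "", "?"),
  stringsAsFactors = FALSE
)

# 25 K -> 25,000; 2.5 M -> 2,500,000; 1.5 b -> 1,500,000,000; "" and "?" -> 0
toy_dmg$prop_dmg_adj = toy_dmg$PROPDMG * exp_to_mult(toy_dmg$PROPDMGEXP)
toy_dmg
```

The last two rows show the consequence of the fallback: damage recorded with a blank or unrecognised code is counted as zero.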

Now that the damage figures are converted, we can group by EVTYPE and sum the damages for each event type.

damages = weather_adj %>%
  group_by(EVTYPE) %>%
  summarise(cost_prop = sum(prop_dmg_adj), cost_crop = sum(crop_dmg_adj))

The ten event types with the highest total property damage (in US dollars):

weather_prop = head(damages[order(damages$cost_prop, decreasing = TRUE), 1:2], 10)
weather_prop
## # A tibble: 10 x 2
##    EVTYPE               cost_prop
##    <chr>                    <dbl>
##  1 FLOOD             144657709800
##  2 HURRICANE/TYPHOON  69305840000
##  3 TORNADO            56937160480
##  4 STORM SURGE        43323536000
##  5 FLASH FLOOD        16140811510
##  6 HAIL               15732267220
##  7 HURRICANE          11868319010
##  8 TROPICAL STORM      7703890550
##  9 WINTER STORM        6688497250
## 10 HIGH WIND           5270046260

The ten event types with the highest total crop damage (in US dollars):

weather_crop = head(damages[order(damages$cost_crop, decreasing = TRUE), c(1, 3)], 10)
weather_crop
## # A tibble: 10 x 2
##    EVTYPE              cost_crop
##    <chr>                   <dbl>
##  1 DROUGHT           13972566000
##  2 FLOOD              5661968450
##  3 RIVER FLOOD        5029459000
##  4 ICE STORM          5022113500
##  5 HAIL               3025954450
##  6 HURRICANE          2741910000
##  7 HURRICANE/TYPHOON  2607872800
##  8 FLASH FLOOD        1421317100
##  9 EXTREME COLD       1292973000
## 10 FROST/FREEZE       1094086000

Results

Plot for fatalities:

library(ggplot2)
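# Order bars from most to fewest fatalities instead of alphabetically
ggplot(weather_fatalities, aes(x = reorder(EVTYPE, -fatalities), y = fatalities)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Fatalities by EVTYPE", x = "EVTYPE", y = "Fatalities")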

Plot for property damage:

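# Order bars from highest to lowest property damage instead of alphabetically
ggplot(weather_prop, aes(x = reorder(EVTYPE, -cost_prop), y = cost_prop)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Property Damage by EVTYPE", x = "EVTYPE", y = "Property damage (USD)")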

Plot for crop damage:

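# Order bars from highest to lowest crop damage instead of alphabetically
ggplot(weather_crop, aes(x = reorder(EVTYPE, -cost_crop), y = cost_crop)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Crop Damage by EVTYPE", x = "EVTYPE", y = "Crop damage (USD)")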

Summary

As the tables and plots show, floods cause the most property damage and droughts cause the most crop damage. Tornadoes cause by far the most injuries and fatalities, yet less property damage than floods. One possible explanation is that evacuation procedures are more effective for floods than for tornadoes, although this data alone cannot confirm it. The difference merits further investigation.