U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database analysis

Author: Bharat S Raj

Synopsis

The goal of this report is to identify the most harmful weather events affecting the United States, in order to inform decisions about future investment in damage planning and management. The report identifies the weather event types with the largest impact on population health (as measured by the number of fatalities and injuries) and the largest economic consequences (as measured by the property damage and crop damage sustained during the event). The data were collected from 1950 to November 2011. The purpose of this analysis is to answer the following two questions:

  • Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
  • Across the United States, which types of events have the greatest economic consequences?

Data Processing

Loading the data

The dataset can be downloaded from this link: Storm Data [46.9 MB].

We assume the data file is in the working directory.

# knitr chunk option for this chunk: cache = TRUE
storm <- read.csv(bzfile("repdata-data-StormData.csv.bz2"), header = TRUE, stringsAsFactors = FALSE)
# convert letter exponents (either case) to powers of ten: K = 3, M = 6, B = 9
storm$PROPDMGEXP[storm$PROPDMGEXP %in% c("K", "k")] <- 3
storm$PROPDMGEXP[storm$PROPDMGEXP %in% c("M", "m")] <- 6
storm$PROPDMGEXP[storm$PROPDMGEXP %in% c("B", "b")] <- 9
storm$CROPDMGEXP[storm$CROPDMGEXP %in% c("K", "k")] <- 3
storm$CROPDMGEXP[storm$CROPDMGEXP %in% c("M", "m")] <- 6
storm$CROPDMGEXP[storm$CROPDMGEXP %in% c("B", "b")] <- 9

# multiply property and crop damage by 10 raised to the power of the exponent;
# exponents that do not parse as numbers (e.g. blank) yield NA, warnings suppressed
storm$PROPDMG <- storm$PROPDMG * 10^suppressWarnings(as.numeric(storm$PROPDMGEXP))
storm$CROPDMG <- storm$CROPDMG * 10^suppressWarnings(as.numeric(storm$CROPDMGEXP))

# compute combined economic damage (property damage + crop damage)
storm$TOTECODMG <- storm$PROPDMG + storm$CROPDMG
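
The exponent conversion above can also be written as a table lookup. The following is a minimal sketch on invented toy data, not the code used for this report: the names `toy`, `exp_lookup`, `power` and `USD` are hypothetical, and unlike the code above it does not handle digit exponents.

```r
# Toy data; the values are invented for illustration only.
toy <- data.frame(
  DMG = c(25, 2.5, 1, 10),
  EXP = c("K", "m", "B", ""),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: letter code -> power of ten.
exp_lookup <- c(K = 3, M = 6, B = 9)

# toupper() folds "k", "m", "b" into upper case before the lookup.
# Codes that are not in the table (here the blank string) give NA.
power <- unname(exp_lookup[toupper(toy$EXP)])

# Damage in USD; rows with an unknown exponent stay NA.
toy$USD <- toy$DMG * 10^power
print(toy)
```

Keeping unknown exponents as NA makes the affected rows easy to count with `sum(is.na(toy$USD))` before deciding how to treat them.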

Results

Across the United States, which types of events are most harmful with respect to population health?

Aggregate data for fatalities

fatalities <- aggregate(FATALITIES ~ EVTYPE, data = storm, FUN = sum)
fatalities <- fatalities[order(fatalities$FATALITIES, decreasing = TRUE), ]
# 5 most harmful causes of fatalities
fatalitiesMax <- fatalities[1:5, ]
print(fatalitiesMax)
##             EVTYPE FATALITIES
## 834        TORNADO       5633
## 130 EXCESSIVE HEAT       1903
## 153    FLASH FLOOD        978
## 275           HEAT        937
## 464      LIGHTNING        816

Aggregate data for injuries

injuries <- aggregate(INJURIES ~ EVTYPE, data = storm, FUN = sum)
injuries <- injuries[order(injuries$INJURIES, decreasing = TRUE), ]
# 5 most harmful causes of injuries
injuriesMax <- injuries[1:5, ]
print(injuriesMax)
##             EVTYPE INJURIES
## 834        TORNADO    91346
## 856      TSTM WIND     6957
## 170          FLOOD     6789
## 130 EXCESSIVE HEAT     6525
## 464      LIGHTNING     5230
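
The two aggregations above can also be combined into a single call by binding both columns on the left-hand side of the formula. This is a small sketch on invented toy data (the data frame `toy` and the result `health` are hypothetical names), shown only to illustrate the pattern.

```r
# Toy data; the counts are invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(3, 1, 2, 0, 2),
  INJURIES   = c(10, 0, 5, 2, 1),
  stringsAsFactors = FALSE
)

# cbind() on the left-hand side sums both columns per event type in one pass.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Sort by fatalities, largest first.
health <- health[order(health$FATALITIES, decreasing = TRUE), ]
print(health)
```

The result keeps fatalities and injuries side by side, which makes it easier to compare the two rankings for the same event type.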

Plotting the 5 most dangerous event types for fatalities and for injuries

The plots in this analysis use the ggplot2 package.

library(ggplot2)
# refer to columns by name inside aes(); the data frame is supplied via `data`
ggplot(data = fatalitiesMax, aes(x = EVTYPE, y = FATALITIES)) +
    geom_bar(colour = "black", fill = "blue", stat = "identity") +
    xlab("Event Type") + ylab("Number of fatalities") +
    ggtitle("Total number of fatalities, 1950 - 2011") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))

ggplot(data = injuriesMax, aes(x = EVTYPE, y = INJURIES)) +
    geom_bar(colour = "black", fill = "blue", stat = "identity") +
    xlab("Event Type") + ylab("Number of injuries") +
    ggtitle("Total number of injuries, 1950 - 2011") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))
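
Because EVTYPE is a character column, ggplot2 places the bars in alphabetical order rather than by size. One possible way to sort the bars from largest to smallest is `reorder()`, sketched below. The data frame is rebuilt from the fatalities table printed earlier so that the example is self-contained, and the plot is drawn only if ggplot2 is installed.

```r
# Values copied from the fatalities table printed earlier in this report.
fatalitiesMax <- data.frame(
  EVTYPE     = c("TORNADO", "EXCESSIVE HEAT", "FLASH FLOOD", "HEAT", "LIGHTNING"),
  FATALITIES = c(5633, 1903, 978, 937, 816),
  stringsAsFactors = FALSE
)

# reorder() returns a factor whose levels are sorted by the second argument;
# negating the counts puts the largest value first.
fatalitiesMax$EVTYPE <- reorder(fatalitiesMax$EVTYPE, -fatalitiesMax$FATALITIES)
print(levels(fatalitiesMax$EVTYPE))

# Draw the plot only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data = fatalitiesMax, aes(x = EVTYPE, y = FATALITIES)) +
    geom_bar(colour = "black", fill = "blue", stat = "identity") +
    xlab("Event Type") + ylab("Number of fatalities") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))
  print(p)
}
```

The same approach would apply to the injuries plot by swapping in `injuriesMax` and `INJURIES`.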

Across the United States, which types of events have the greatest economic consequences?

Aggregate data for property damage

propdmg <- aggregate(PROPDMG ~ EVTYPE, data = storm, FUN = sum)
propdmg <- propdmg[order(propdmg$PROPDMG, decreasing = TRUE), ]
# 5 event types with the largest property damage
propdmgMax <- propdmg[1:5, ]
print(propdmgMax)
##                EVTYPE      PROPDMG
## 62              FLOOD 144657709800
## 179 HURRICANE/TYPHOON  69305840000
## 332           TORNADO  56947380614
## 281       STORM SURGE  43323536000
## 50        FLASH FLOOD  16822673772

Aggregate data for crop damage

cropdmg <- aggregate(CROPDMG ~ EVTYPE, data = storm, FUN = sum)
cropdmg <- cropdmg[order(cropdmg$CROPDMG, decreasing = TRUE), ]
# 5 event types with the largest crop damage
cropdmgMax <- cropdmg[1:5, ]
print(cropdmgMax)
##         EVTYPE     CROPDMG
## 16     DROUGHT 13972566000
## 34       FLOOD  5661968450
## 98 RIVER FLOOD  5029459000
## 85   ICE STORM  5022113500
## 52        HAIL  3025954470

Aggregate total economic damage

ecodmg <- aggregate(TOTECODMG ~ EVTYPE, data = storm, FUN = sum)
ecodmg <- ecodmg[order(ecodmg$TOTECODMG, decreasing = TRUE), ]
# 5 event types with the largest total economic damage
ecodmgMax <- ecodmg[1:5, ]
print(ecodmgMax)
##               EVTYPE    TOTECODMG
## 23             FLOOD 138007444500
## 62 HURRICANE/TYPHOON  29348167800
## 99           TORNADO  16570326363
## 57         HURRICANE  12405268000
## 75       RIVER FLOOD  10108369000
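
The combined figures depend on how missing values are handled: adding NA to a number gives NA, and the formula interface of `aggregate()` drops rows containing NA by default. The toy sketch below (invented numbers, hypothetical column names `TOT` and `TOT0`) shows the effect, together with one possible alternative that treats a missing component as zero. That alternative is an assumption for illustration, not the method used in this report.

```r
# Toy data; one crop damage value is missing.
toy <- data.frame(
  EVTYPE  = c("FLOOD", "FLOOD", "FLOOD"),
  PROPDMG = c(100, 200, 300),
  CROPDMG = c(10, NA, 30),
  stringsAsFactors = FALSE
)

# NA propagates through addition, so the second row's total is NA.
toy$TOT <- toy$PROPDMG + toy$CROPDMG

# aggregate() with a formula omits NA rows before summing.
prop_only <- aggregate(PROPDMG ~ EVTYPE, data = toy, FUN = sum)
combined  <- aggregate(TOT ~ EVTYPE, data = toy, FUN = sum)

# Alternative (assumption): count a missing component as zero.
toy$TOT0  <- rowSums(toy[, c("PROPDMG", "CROPDMG")], na.rm = TRUE)
combined0 <- aggregate(TOT0 ~ EVTYPE, data = toy, FUN = sum)

print(prop_only)   # property damage only
print(combined)    # combined, NA rows dropped
print(combined0)   # combined, missing values counted as zero
```

In this toy case the combined total with NA rows dropped is smaller than the property-only total, even though every damage value is non-negative.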

Plot of total economic damage:

# total economic damage (property + crops)
ggplot(data = ecodmgMax, aes(x = EVTYPE, y = TOTECODMG / 10^9)) +
    geom_bar(colour = "black", fill = "blue", stat = "identity") + xlab("Event Type") +
    ylab("Total damage, bln USD") + ggtitle("Total economic damage 1950 - 2011, billions USD") +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))

Conclusion

Tornadoes caused the greatest number of fatalities (5,633) and injuries (91,346). Excessive heat ranks second for fatalities (1,903). Its 6,525 injuries place it fourth for injuries, behind thunderstorm wind (6,957), which is the second leading cause, and floods (6,789).

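Floods caused the greatest economic damage, 138,007,444,500 USD (property and crop damage combined), followed by hurricanes and typhoons with 29,348,167,800 USD. These combined totals include only records where both the property and the crop damage exponents could be converted to numbers, which is why they are lower than the property-only totals reported above.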