SYNOPSIS

Analysing the effects of natural disasters on the US economy and mortality, based on the NOAA Storm Database.

Millions of people are affected by natural disasters every year, and their impact can be calamitous. From the destruction of buildings to the spread of disease, natural disasters can devastate entire countries overnight. Disasters like tsunamis, earthquakes and typhoons do not just wreak havoc on land; they also disrupt people’s lives and greatly affect the country’s economy. This assignment explores the NOAA Storm Database and answers some basic questions about severe weather events. The code below shows the entire analysis.

LOADING AND PROCESSING DATA

Data taken from the National Weather Service: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
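The loading step assumes the compressed file is already in the working directory. A minimal sketch of one way to fetch it from the URL above only when it is missing; `fetch_storm_data` is a hypothetical helper name, not part of the original analysis:

```r
# Hypothetical helper: download the archive only if it is not already on disk.
fetch_storm_data <- function(url, dest = "StormData.csv.bz2") {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file byte-for-byte intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Usage (not run here, since it needs network access):
# storm_path <- fetch_storm_data(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2")
# storm_data <- read.csv(bzfile(storm_path))
```

Skipping the download when the file exists makes repeated runs of the analysis faster and avoids re-fetching a large archive.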

Loading data

storm_data <- read.csv(bzfile("StormData.csv.bz2"), sep = ",", header = TRUE)
colnames(storm_data)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

Finding relevant columns, we have:

  1. most harmful with respect to population health: EVTYPE FATALITIES INJURIES

  2. events with greatest economic consequences: EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP

Creating subset of data

# create subset of data

event <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", 
           "CROPDMGEXP")
storm_subset <- storm_data[event]

Things to be done to the data before analysing:

  1. replace invalid exponent codes (+, -, ?) with 0 and treat an empty exponent as a multiplier of 1
  2. assign values to the exponent data
  3. calculate damage value

unique(storm_subset$PROPDMGEXP)
##  [1] K M   B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
## Levels:  - ? + 0 1 2 3 4 5 6 7 8 B h H K m M
# Assigning '0' to invalid exponent data
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "+"] <- 0
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "-"] <- 0
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "?"] <- 0

# Assigning values for the property exponent data 
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "K"] <- 1000
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "M"] <- 1e+06
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == ""] <- 1
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "B"] <- 1e+09
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "m"] <- 1e+06
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "0"] <- 1
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "5"] <- 1e+05
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "6"] <- 1e+06
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "4"] <- 10000
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "2"] <- 100
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "3"] <- 1000
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "h"] <- 100
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "7"] <- 1e+07
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "H"] <- 100
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "1"] <- 10
storm_subset$PROPEXP[storm_subset$PROPDMGEXP == "8"] <- 1e+08

# Calculating the property damage value
storm_subset$PROPDMGVAL <- storm_subset$PROPDMG * storm_subset$PROPEXP

# Similarly for crop damage

unique(storm_subset$CROPDMGEXP)
## [1]   M K m B ? 0 k 2
## Levels:  ? 0 2 B k K m M
# Assigning '0' to invalid exponent data
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "?"] <- 0

# Assigning values for the crop exponent data 
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "M"] <- 1e+06
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "K"] <- 1000
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "m"] <- 1e+06
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "B"] <- 1e+09
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "0"] <- 1
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "k"] <- 1000
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == "2"] <- 100
storm_subset$CROPEXP[storm_subset$CROPDMGEXP == ""] <- 1

# Calculating the crop damage value
storm_subset$CROPDMGVAL <- storm_subset$CROPDMG * storm_subset$CROPEXP
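The two exponent-mapping passes above repeat the same pattern line by line. As a sketch only, the same multipliers could be kept in one named lookup vector. `exp_lookup` and `exp_multiplier` are hypothetical names introduced for illustration. Note one assumption: codes are uppercased first, so a lowercase "k" or "b" would map the same way as "K" or "B".

```r
# Hypothetical lookup table reproducing the multipliers assigned above.
exp_lookup <- c(
  "+" = 0, "-" = 0, "?" = 0,                    # invalid codes count as zero
  "H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9,   # letter codes
  "0" = 1, "1" = 1e1, "2" = 1e2, "3" = 1e3, "4" = 1e4,
  "5" = 1e5, "6" = 1e6, "7" = 1e7, "8" = 1e8    # digit codes as powers of ten
)

exp_multiplier <- function(code) {
  # as.character() makes this work for factor and character columns alike
  code <- toupper(as.character(code))
  out <- unname(exp_lookup[code])
  # An empty exponent means the damage figure is taken at face value
  out[code %in% ""] <- 1
  out
}

# Usage on the real data (assumes storm_subset from the steps above):
# storm_subset$PROPDMGVAL <- storm_subset$PROPDMG *
#   exp_multiplier(storm_subset$PROPDMGEXP)
```

Any code missing from the table comes back as `NA`, which makes unexpected exponent values easy to spot before the damage totals are summed.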

# Summing fatalities, injuries and damage values by event type
fatal <- aggregate(FATALITIES ~ EVTYPE, storm_subset, FUN = sum)
injury <- aggregate(INJURIES ~ EVTYPE, storm_subset, FUN = sum)
propdmg <- aggregate(PROPDMGVAL ~ EVTYPE, storm_subset, FUN = sum)
cropdmg <- aggregate(CROPDMGVAL ~ EVTYPE, storm_subset, FUN = sum)
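To show what the formula interface of `aggregate` does in the calls above, here is a small sketch on made-up numbers. The data frame `toy` and its values are invented for illustration and are not taken from the storm database.

```r
# Toy data: invented values, not from the NOAA database.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(3, 1, 2, 4)
)

# One row per event type, with FATALITIES summed within each type
toy_fatal <- aggregate(FATALITIES ~ EVTYPE, toy, FUN = sum)
print(toy_fatal)
```

The two TORNADO rows collapse into a single row with their fatalities added together, which is the same reduction applied to each of the four measures in the analysis.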

ANALYSING DATA

Listing events with highest fatalities

fatal_top7 <- fatal[order(-fatal$FATALITIES), ][1:7, ]

Listing events with highest injuries

injury_top7 <- injury[order(-injury$INJURIES), ][1:7, ]
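The ranking step is the same for every measure: sort descending, then keep the first seven rows. A minimal sketch of that pattern as a reusable function; `top_events` is a hypothetical helper, and the toy values are invented:

```r
# Hypothetical helper: the n rows of df with the largest values in value_col.
top_events <- function(df, value_col, n = 7) {
  ranked <- df[order(df[[value_col]], decreasing = TRUE), ]
  # head() returns fewer rows instead of NA-padded ones when nrow(df) < n
  head(ranked, n)
}

# Toy totals: invented values, not from the NOAA database.
toy_totals <- data.frame(
  EVTYPE     = c("A", "B", "C"),
  FATALITIES = c(2, 9, 5)
)
top2 <- top_events(toy_totals, "FATALITIES", n = 2)
print(top2)
```

With the real aggregates, a call such as `top_events(fatal, "FATALITIES")` would be expected to give the same rows as the `[1:7, ]` indexing used in the analysis.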

Plotting relevant graphs:

par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(fatal_top7$FATALITIES, las = 3, names.arg = fatal_top7$EVTYPE, main = "Events with Highest number of Fatalities", 
        ylab = "Number of fatalities", col = "dark blue")
barplot(injury_top7$INJURIES, las = 3, names.arg = injury_top7$EVTYPE, main = "Events with Highest number of Injuries", 
        ylab = "Number of injuries", col = "dark blue")

Finding events with highest property damage

propdmg_top7 <- propdmg[order(-propdmg$PROPDMGVAL), ][1:7, ]

Finding events with highest crop damage

cropdmg_top7 <- cropdmg[order(-cropdmg$CROPDMGVAL), ][1:7, ]

par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(propdmg_top7$PROPDMGVAL/(10^9), las = 3, names.arg = propdmg_top7$EVTYPE, 
        main = "Events with Highest Property Damage", ylab = "Damage Cost ($ billions)", 
        col = "dark blue")
barplot(cropdmg_top7$CROPDMGVAL/(10^9), las = 3, names.arg = cropdmg_top7$EVTYPE, 
        main = "Events With Highest Crop Damage", ylab = "Damage Cost ($ billions)", 
        col = "dark blue")

RESULTS

As seen from the barplots, tornadoes caused the highest number of both deaths and injuries, so they were the most harmful events with respect to population health. Excessive heat came next for deaths, and thunderstorm wind and flood came next for injuries. For economic impact, flood appears among the top events for both property damage and crop damage. The other leading events are drought and hurricane.
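The analysis ranks property and crop damage separately. As a sketch only, and not part of the original analysis, the two aggregates could be combined into a single total per event type. The data frames and values below are invented for illustration, and `TOTALVAL` is a hypothetical column name.

```r
# Toy aggregates: invented values, not results from the NOAA database.
prop_toy <- data.frame(EVTYPE     = c("FLOOD", "HURRICANE", "TORNADO"),
                       PROPDMGVAL = c(5e9, 3e9, 1e9))
crop_toy <- data.frame(EVTYPE     = c("DROUGHT", "FLOOD"),
                       CROPDMGVAL = c(4e9, 2e9))

# Full outer join so that events present in only one table are kept
combined <- merge(prop_toy, crop_toy, by = "EVTYPE", all = TRUE)

# An event missing from one table contributed no damage of that kind
combined$PROPDMGVAL[is.na(combined$PROPDMGVAL)] <- 0
combined$CROPDMGVAL[is.na(combined$CROPDMGVAL)] <- 0

# Total damage per event type, largest first
combined$TOTALVAL <- combined$PROPDMGVAL + combined$CROPDMGVAL
combined <- combined[order(-combined$TOTALVAL), ]
print(combined)
```

On the real data the same steps would apply to `propdmg` and `cropdmg`, giving one ranking in which an event that is high on both lists rises to the top.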