Introduction

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. It identifies which event types have the greatest economic consequences, measured by crop and property damage, and which are most harmful to population health, measured by fatalities and injuries.

Data Processing

The code below downloads and loads the data.

library(tidyverse)
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Save the file relative to the working directory rather than a machine-specific path
destfile <- "dataset.csv.bz2"
download.file(url, destfile = destfile, mode = "wb") # "wb" keeps the compressed file intact on Windows

# Read data (read.csv decompresses the bz2 file through the connection)
storm_data <- read.csv(bzfile(destfile))
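
Because the compressed file is large, it can help to skip the download when a local copy already exists. The following is a minimal sketch of that pattern, not part of the original analysis: `fetch_once` is a hypothetical helper, the URL in the demonstration is a placeholder that is never contacted, and the small data frame is made up so the example runs without network access.

```r
# Hypothetical helper (an assumption, not from the original analysis):
# download the file only if it is not already present locally.
fetch_once <- function(url, dest) {
  if (!file.exists(dest)) {
    download.file(url, destfile = dest, mode = "wb")
  }
  invisible(dest)
}

# Write a tiny bz2-compressed CSV to a temporary file for the demonstration.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 0)),
          con, row.names = FALSE)
close(con)

# The file already exists, so no download is attempted.
fetch_once("https://example.invalid/placeholder.csv.bz2", tmp)

# read.csv decompresses the file through the bzfile connection.
demo <- read.csv(bzfile(tmp))
print(demo)
```

In the report itself, the same helper could be called with the real `url` and `"dataset.csv.bz2"` before `read.csv`.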

Population Health

The analysis below sums fatalities and injuries by event type and ranks the event types by the combined total. A bar plot of the top five event types is generated for visualization.

# Find which type of event is most harmful to population health
death_inj <- aggregate(INJURIES + FATALITIES ~ EVTYPE, storm_data, sum)
# Sort in descending order to see which event is most harmful
popHealth <- death_inj[order(death_inj$`INJURIES + FATALITIES`, decreasing = TRUE), ]
popHealth[1,] # The first row shows that tornadoes cause the greatest harm to population health
##      EVTYPE INJURIES + FATALITIES
## 834 TORNADO                 96979
colnames(popHealth) <- c('Event_Type', 'Injuries_And_Deaths') # Rename columns
# Plot for population health 
popPlot <- ggplot(popHealth[1:5,], aes(Event_Type, Injuries_And_Deaths)) + 
  geom_bar(stat="identity") +
  theme(text = element_text(size=20),
        axis.text.x = element_text(angle=90, hjust=1)) +
  xlab("Event Type") +
  ylab("Total Injuries and Fatalities") +
  ggtitle("Events Most Harmful to Population Health") +
  theme(plot.title = element_text(hjust = 0.5))
popPlot
Fig. 1: Top 5 Events with Greatest Impact on Population Health
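
To make the aggregation step concrete, here is a minimal sketch of the same formula-based `aggregate` call on a toy data frame. The rows are hypothetical and are not taken from the NOAA database. The sketch only illustrates how `INJURIES + FATALITIES ~ EVTYPE` sums the combined count within each event type.

```r
# Toy data: hypothetical values, not taken from the NOAA database
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 0, 3, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum injuries plus fatalities within each event type
harm <- aggregate(INJURIES + FATALITIES ~ EVTYPE, data = toy, FUN = sum)
names(harm) <- c("Event_Type", "Injuries_And_Deaths")

# Sort from most to least harmful
harm <- harm[order(harm$Injuries_And_Deaths, decreasing = TRUE), ]
print(harm)
```

In this toy example TORNADO ranks first with a combined count of 17, followed by HEAT with 4 and FLOOD with 3.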

Economic Impact

The analysis below uses five variables: event type, crop damage, crop damage exponent, property damage, and property damage exponent. I first examine the unique values in the two exponent columns, then convert each exponent code to a numeric multiplier (for example, "K" becomes 1,000 and "M" becomes 1,000,000). Multiplying each damage figure by its multiplier gives the damage amount, and adding crop and property damage gives the total damage for each record.

A bar plot of the top five event types is generated for visualization.
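
As a small illustration of the exponent conversion, the sketch below applies the same idea to a few hypothetical rows. It uses a named lookup vector rather than the factor recoding in the analysis code, and it covers only the letter codes. Treating every other code as a multiplier of 1 is a simplifying assumption made for this example.

```r
# Hypothetical rows for illustration; values are not from the NOAA database
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "HAIL", "FLOOD"),
  PROPDMG    = c(25, 2.5, 10, 1),
  PROPDMGEXP = c("K", "m", "", "B"),
  stringsAsFactors = FALSE
)

# Named lookup vector: exponent code -> numeric multiplier.
# Assumption for this sketch: codes not listed here (including "") count as 1.
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

code <- toupper(toy$PROPDMGEXP)   # treat "m" the same as "M"
mult <- unname(multiplier[code])  # unmatched codes come back as NA
mult[is.na(mult)] <- 1

toy$PROPDMGTOTAL <- toy$PROPDMG * mult
print(toy)
```

For instance, a record with `PROPDMG` of 25 and exponent "K" becomes 25 × 1,000 = 25,000.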

# Find which types of events have the greatest economic consequences
# Check unique types of magnitude exponents 
unique(storm_data$PROPDMGEXP)
##  [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-" "1" "8"
unique(storm_data$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k" "2"
length(unique(storm_data$PROPDMGEXP)) # There are 19 different values. 
## [1] 19
length(unique(storm_data$CROPDMGEXP)) # There are 9 different values.
## [1] 9
# Create smaller data frame for economics
econ <- storm_data[,c("EVTYPE", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]

# Upper case exponent abbreviations 
econ$PROPDMGEXP <- toupper(econ$PROPDMGEXP)
econ$CROPDMGEXP <- toupper(econ$CROPDMGEXP)

# Map each exponent code to a numeric multiplier (e.g. "K" = 1e3, "5" = 1e5)
labels <- c(1E3, 1E6, 1E0, 1E9, 1E0, 1E0, 1E5, 1E6, 1E0, 1E4, 1E2, 1E3, 1E2, 1E7, 1E0, 1E1, 1E8)
levels <- c("K", "M", "",  "B", "+", "0", "5", "6", "?", "4", "2", "3", "H", "7", "-", "1", "8")
econ$PROPDMGEXP <- factor(econ$PROPDMGEXP, levels = levels, labels = labels)

labels_2 <- c(1E0, 1E6, 1E3, 1E9, 1E0, 1E0, 1E2)
levels_2 <- c("", "M", "K", "B", "?", "0", "2")
econ$CROPDMGEXP <- factor(econ$CROPDMGEXP, levels = levels_2, labels = labels_2)

# Multiply columns to get total economic damage column
econ$PROPDMGTOTAL <- as.numeric(as.character(econ$PROPDMGEXP)) * econ$PROPDMG
econ$CROPDMGTOTAL <- as.numeric(as.character(econ$CROPDMGEXP)) * econ$CROPDMG
econ$DAMAGETOTAL <- econ$PROPDMGTOTAL + econ$CROPDMGTOTAL

# Find which type of event has the greatest economic consequences
damage <- aggregate(DAMAGETOTAL ~ EVTYPE, econ, sum)
# Sort in descending order to see which event is most costly
damage_ordered <- damage[order(damage$DAMAGETOTAL, decreasing = TRUE), ]
damage_ordered[1,] # The first row shows that floods cause the greatest economic damage
##     EVTYPE  DAMAGETOTAL
## 170  FLOOD 150319678257
# Plot for economic impact
econPlot <- ggplot(damage_ordered[1:5,], aes(EVTYPE, DAMAGETOTAL)) + 
  geom_bar(stat="identity") +
  theme(text = element_text(size=20),
        axis.text.x = element_text(angle=90, hjust=1)) +
  xlab("Event Type") +
  ylab("Total Damage (USD)") +
  ggtitle("Events with Greatest Economic Impact") +
  theme(plot.title = element_text(hjust = 0.5))
econPlot 
Fig. 2: Top 5 Events with Greatest Economic Impact

Results

The results above indicate the following: