Synopsis

This analysis examines the impact of severe weather events on population health and the economy. The dataset used, StormData, contains 902,297 rows and 37 columns. We subset the data to the variables that describe health impact (FATALITIES, INJURIES) and economic impact (PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP), grouped by event type (EVTYPE).

The analysis identifies the 10 event types with the greatest impact on each. Ranked by combined fatalities and injuries, the top 10 events are TORNADO, EXCESSIVE HEAT, TSTM WIND, FLOOD, LIGHTNING, HEAT, FLASH FLOOD, ICE STORM, THUNDERSTORM WIND, and WINTER STORM.

Ranked by combined property and crop damage, the top 10 events are FLOOD, HURRICANE/TYPHOON, TORNADO, STORM SURGE, HAIL, FLASH FLOOD, DROUGHT, HURRICANE, RIVER FLOOD, and ICE STORM.

These rankings show which severe weather events do the most harm to health and the economy, which can inform preparedness for such events.

Data Processing

Load the libraries.

library(stringr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)

Load the data.

stormData <- read.csv("repdata_data_StormData.csv")

Remove unnecessary rows and columns.

stormData_subset <- stormData[ ,c('EVTYPE', 'FATALITIES', 'INJURIES', 'PROPDMG', 'PROPDMGEXP', 'CROPDMG', 'CROPDMGEXP')]

# Drop summary and placeholder event types, then keep only rows with casualties or damage.
stormData_subset <- subset(stormData_subset,
                         !(grepl("^Summary", EVTYPE) | EVTYPE %in% c("?", "NONE", "Other")) &
                         (INJURIES > 0 | FATALITIES > 0 | PROPDMG > 0 | CROPDMG > 0))
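To make the row filter easier to follow, here is a minimal sketch on a toy data frame. The rows and the names `toy`, `is_placeholder`, `has_impact`, and `toy_kept` are made up for illustration and are not part of the original analysis.

```r
# Toy data with hypothetical rows, for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "Summary of May 5", "?", "HAIL", "FLOOD"),
  FATALITIES = c(1, 0, 0, 0, 0),
  INJURIES   = c(0, 0, 2, 0, 0),
  PROPDMG    = c(0, 5, 0, 0, 10),
  CROPDMG    = c(0, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Rows whose EVTYPE is a summary line or a placeholder label.
is_placeholder <- grepl("^Summary", toy$EVTYPE) |
  toy$EVTYPE %in% c("?", "NONE", "Other")

# Rows that record at least one casualty or some damage.
has_impact <- toy$FATALITIES > 0 | toy$INJURIES > 0 |
  toy$PROPDMG > 0 | toy$CROPDMG > 0

toy_kept <- toy[!is_placeholder & has_impact, ]
print(toy_kept$EVTYPE)
```

Only the TORNADO and FLOOD rows survive: the summary and "?" rows are dropped by label, and the HAIL row is dropped because it records no casualties or damage.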

Convert the exponent columns to numeric multipliers.

stormData_subset$CROPDMGEXP <- toupper(stormData_subset$CROPDMGEXP)
stormData_subset$PROPDMGEXP <- toupper(stormData_subset$PROPDMGEXP)

stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP %in% c("", "-", "?", "+", "0")] <- 10^0
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "2"] <- 10^2
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "3"] <- 10^3
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "4"] <- 10^4
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "5"] <- 10^5
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "6"] <- 10^6
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "7"] <- 10^7
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "H"] <- 10^2
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "K"] <- 10^3
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "M"] <- 10^6
stormData_subset$PROPDMGEXP[stormData_subset$PROPDMGEXP == "B"] <- 10^9

stormData_subset$CROPDMGEXP[stormData_subset$CROPDMGEXP %in% c("", "?", "0")] <- 10^0
stormData_subset$CROPDMGEXP[stormData_subset$CROPDMGEXP == "K"] <- 10^3
stormData_subset$CROPDMGEXP[stormData_subset$CROPDMGEXP == "M"] <- 10^6
stormData_subset$CROPDMGEXP[stormData_subset$CROPDMGEXP == "B"] <- 10^9
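The same mapping could also be written as a single lookup, which keeps the rules in one place. The sketch below is an alternative, not the code used above: `exp_to_multiplier` is a hypothetical helper, and it assumes that any single digit d stands for 10^d (consistent with how 0 and 2 through 7 are treated above) and that every other unrecognised code means a multiplier of 1.

```r
# Hypothetical helper: map exponent codes to numeric multipliers.
exp_to_multiplier <- function(code) {
  code <- toupper(as.character(code))
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

  # Default: "", "-", "?", "+" and anything unrecognised become 1.
  mult <- rep(1, length(code))

  # Letter codes come from the lookup table.
  is_letter <- code %in% names(lookup)
  mult[is_letter] <- unname(lookup[code[is_letter]])

  # Assumption: a single digit d is read as 10^d.
  is_digit <- grepl("^[0-9]$", code)
  mult[is_digit] <- 10^as.numeric(code[is_digit])

  mult
}

print(exp_to_multiplier(c("K", "m", "", "5", "?", "B")))
```

Applied to a column, this would look like `exp_to_multiplier(stormData_subset$PROPDMGEXP)`, returning a numeric vector directly so no later `as.numeric` conversion is needed.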

Create the property and crop cost columns.

stormData_subset$PROPDMG <- as.numeric(stormData_subset$PROPDMG)
stormData_subset$PROPDMGEXP <- as.numeric(stormData_subset$PROPDMGEXP)
stormData_subset$CROPDMG <- as.numeric(stormData_subset$CROPDMG)
stormData_subset$CROPDMGEXP <- as.numeric(stormData_subset$CROPDMGEXP)

stormData_subset$PROPCOST <- stormData_subset$PROPDMG * stormData_subset$PROPDMGEXP
stormData_subset$CROPCOST <- stormData_subset$CROPDMG * stormData_subset$CROPDMGEXP

Results

1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

To estimate health impact, sum fatalities and injuries for each event type and rank the event types by the combined total.

stormData_subset_HI <- stormData_subset %>% group_by(EVTYPE) %>%
        summarize(FATALITIES = sum(FATALITIES),
                  INJURIES = sum(INJURIES),
                  HEALTHIMP = sum(FATALITIES + INJURIES)) %>% 
        arrange(desc(HEALTHIMP))
top10_HI <- head(stormData_subset_HI, 10)
top10_HI <- top10_HI[, -ncol(top10_HI)]
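Within `summarize()`, later expressions see the summaries created earlier in the same call, so in the code above `HEALTHIMP` is computed from the already-summed `FATALITIES` and `INJURIES`. The result is the same here, but a variant that computes the combined total first, from the row-level values, does not rely on that ordering. A minimal sketch on made-up data (the `toy` rows are hypothetical):

```r
library(dplyr)

# Hypothetical rows, for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 1, 0, 4),
  INJURIES   = c(10, 5, 3, 0),
  stringsAsFactors = FALSE
)

toy_HI <- toy %>%
  group_by(EVTYPE) %>%
  summarize(HEALTHIMP  = sum(FATALITIES + INJURIES),  # uses row-level values
            FATALITIES = sum(FATALITIES),
            INJURIES   = sum(INJURIES)) %>%
  arrange(desc(HEALTHIMP))

print(toy_HI)
```

With this column order, `HEALTHIMP` is the first summary column rather than the last, so dropping it by position would need `[, -2]` or, more robustly, `select(-HEALTHIMP)`.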

Reshape the data to long format.

top10_HI_long <- tidyr::gather(top10_HI, variable, value, -EVTYPE)
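`gather()` still works but is superseded in current tidyr by `pivot_longer()`. As a sketch of the equivalent call, on a hypothetical two-row table (`toy_wide` is made up for illustration):

```r
library(tidyr)

# Hypothetical wide table: one row per event type.
toy_wide <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD"),
  FATALITIES = c(3, 0),
  INJURIES   = c(15, 3),
  stringsAsFactors = FALSE
)

# One row per (event type, measure) pair.
toy_long <- pivot_longer(toy_wide,
                         cols      = -EVTYPE,
                         names_to  = "variable",
                         values_to = "value")

print(toy_long)
```

The two functions return the same rows, although in a different order: `gather()` stacks one source column after another, while `pivot_longer()` keeps the measures for each event type together.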

The top 10 events according to health impact (fatalities and injuries) are TORNADO, EXCESSIVE HEAT, TSTM WIND, FLOOD, LIGHTNING, HEAT, FLASH FLOOD, ICE STORM, THUNDERSTORM WIND, and WINTER STORM. These weather events have had the greatest adverse effects on population health, measured by combined fatalities and injuries.

[Figure: Health Impact]

2. Across the United States, which types of events have the greatest economic consequences?

To estimate economic impact, sum the property cost and crop cost for each event type and rank the event types by the combined total.

stormData_subset_EI <- stormData_subset %>% group_by(EVTYPE) %>%
        summarize(PROPCOST = sum(PROPCOST),
                  CROPCOST = sum(CROPCOST),
                  ECONIMP = sum(PROPCOST + CROPCOST)) %>% 
        arrange(desc(ECONIMP))
top10_EI <- head(stormData_subset_EI, 10)
top10_EI <- top10_EI[, -ncol(top10_EI)]

Reshape the data to long format.

top10_EI_long <- tidyr::gather(top10_EI, variable, value, -EVTYPE)

The top 10 events according to economic impact (property cost and crop cost) are FLOOD, HURRICANE/TYPHOON, TORNADO, STORM SURGE, HAIL, FLASH FLOOD, DROUGHT, HURRICANE, RIVER FLOOD, and ICE STORM. These weather events have had the greatest adverse effects on the economy, measured by combined property damage and crop damage.

[Figure: Economic Impact]
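The plotting code for the figure is not shown in this report. As a sketch of how a stacked bar chart could be drawn from long-format data such as `top10_EI_long`, assuming ggplot2 and using made-up values rather than results from StormData:

```r
library(ggplot2)

# Hypothetical long-format data; the values are invented for illustration.
toy_long <- data.frame(
  EVTYPE   = rep(c("FLOOD", "TORNADO", "HAIL"), times = 2),
  variable = rep(c("PROPCOST", "CROPCOST"), each = 3),
  value    = c(120, 50, 15, 5, 0.4, 3),
  stringsAsFactors = FALSE
)

# Order the bars by total cost (largest first) and stack the two components.
p <- ggplot(toy_long,
            aes(x = reorder(EVTYPE, -value, FUN = sum),
                y = value, fill = variable)) +
  geom_col() +
  labs(title = "Economic Impact", x = "Event type",
       y = "Cost (arbitrary units)", fill = NULL) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```

The same pattern would apply to `top10_HI_long`, with the title and axis label changed to describe fatalities and injuries.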