Data Processing

The first step of the analysis is to download the storm data and load it into R for further processing.

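library(ggplot2)  # needed for the plots in the results sections

url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Download the compressed file only if it is not already present.
# mode = "wb" keeps the binary archive intact on Windows.
if (!file.exists("StormData.csv.bz2")) {
    download.file(url, "StormData.csv.bz2", mode = "wb")
}

# Read the bz2-compressed CSV directly through a bzfile() connection.
raw_data <- read.csv(bzfile("StormData.csv.bz2"))

# Normalize event type labels: upper case, no surrounding whitespace.
raw_data$EVTYPE <- trimws(toupper(raw_data$EVTYPE))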

Health Impact Analysis

Total health impact is defined as the sum of fatalities and injuries for each event type.

health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = raw_data, FUN = sum, na.rm = TRUE)

health$TOTAL_HEALTH <- health$FATALITIES + health$INJURIES

health <- health[order(health$TOTAL_HEALTH, decreasing = TRUE), ]

Results: Population Health Impact

top10 <- health[1:10, ]

ggplot(top10, aes(x = reorder(EVTYPE, TOTAL_HEALTH), y = TOTAL_HEALTH)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Events by Population Health Impact",
    x = "Weather Event Type",
    y = "Total Fatalities + Injuries"
  ) +
  theme_minimal(base_size = 12)

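Economic Impact Analysis

Total economic impact is defined as the sum of property and crop damage for each event type, with each damage value scaled by its exponent code (K for thousands, M for millions, B for billions).

# Vectorized lookup of the dollar multiplier for each exponent code.
# Codes other than K, M, and B (including blanks) are treated as 1.
damage_multiplier <- function(exp) {
    multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
    mult <- unname(multipliers[as.character(exp)])
    mult[is.na(mult)] <- 1
    mult
}

raw_data$PROPDMGEXP <- toupper(raw_data$PROPDMGEXP)
raw_data$CROPDMGEXP <- toupper(raw_data$CROPDMGEXP)

raw_data$PROP_DAMAGE <- raw_data$PROPDMG *
    damage_multiplier(raw_data$PROPDMGEXP)

raw_data$CROP_DAMAGE <- raw_data$CROPDMG *
    damage_multiplier(raw_data$CROPDMGEXP)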

raw_data$TOTAL_DAMAGE <- raw_data$PROP_DAMAGE +
    raw_data$CROP_DAMAGE
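
As a quick illustration of the exponent scaling above, here is a minimal sketch on made-up values. The `toy` data frame and the `scale_damage` helper are hypothetical and are not part of the analysis itself; they only show how a damage figure and its exponent code combine into a dollar amount.

```r
# Hypothetical toy records; the values are made up for illustration only.
toy <- data.frame(
    DMG = c(25, 2.5, 1.2, 10),
    EXP = c("K", "M", "B", ""),
    stringsAsFactors = FALSE
)

# Hypothetical helper: look up the multiplier for each exponent code,
# falling back to 1 for anything other than K, M, or B.
scale_damage <- function(dmg, exp) {
    multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
    mult <- unname(multipliers[toupper(exp)])
    mult[is.na(mult)] <- 1
    dmg * mult
}

toy$DAMAGE_USD <- scale_damage(toy$DMG, toy$EXP)
print(toy)
```

In this sketch, 25 with code K becomes 25,000 dollars, 2.5 with code M becomes 2.5 million dollars, and a blank code leaves the value unchanged.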

damage <- aggregate(
    TOTAL_DAMAGE ~ EVTYPE,
    data = raw_data,
    sum
)

damage <- damage[order(damage$TOTAL_DAMAGE, decreasing = TRUE), ]

Results: Economic Impact Analysis

top10_damage <- damage[1:10, ]
top10_damage$TOTAL_DAMAGE_B <- top10_damage$TOTAL_DAMAGE / 1e9

ggplot(top10_damage,
       aes(x = reorder(EVTYPE, TOTAL_DAMAGE_B),
           y = TOTAL_DAMAGE_B)) +
    geom_col(fill = "darkred") +
    coord_flip() +
    labs(
        title = "Top 10 Weather Events by Economic Impact",
        x = "Weather Event Type",
        y = "Total Property and Crop Damage (Billions USD)"
    ) +
    theme_minimal(base_size = 12)

Conclusion

The analysis of the NOAA Storm Database shows that tornadoes had the greatest overall impact on population health when measured by total fatalities and injuries. Other severe weather events such as excessive heat and flooding also contributed significantly to injuries and deaths across the United States.

The economic analysis showed a somewhat different pattern. Flooding, hurricanes, and storm surge events were associated with the highest levels of property and crop damage. This suggests that weather events that are most dangerous to human health are not always the same events that produce the largest economic losses.

Because the event classifications in the NOAA dataset contain inconsistencies and variations in naming, the results should be interpreted as approximate summaries rather than exact rankings. However, the analysis clearly identifies severe storms, flooding, and tornado-related events as major sources of both human and economic impact.
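
One way to reduce the naming inconsistencies mentioned above would be to collapse common spelling variants before aggregating. The following is only a rough sketch under stated assumptions: the `normalize_evtype` helper is hypothetical, the labels are a small hand-picked illustration instead of an audit of the dataset, and the three rules shown are far from a complete cleaning scheme.

```r
# Illustrative labels only; a hand-picked set, not a full list of EVTYPE values.
evtype <- c("TSTM WIND", "THUNDERSTORM WINDS", "THUNDERSTORM WIND",
            "FLASH FLOODING", "FLASH FLOOD", "TORNADO")

# Hypothetical helper: collapse a few spelling variants into one label.
normalize_evtype <- function(x) {
    x <- trimws(toupper(x))
    # Expand the "TSTM" abbreviation when it appears as a whole word.
    x <- gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)
    # Fold trailing plural and gerund forms into the base label.
    x <- sub("WINDS$", "WIND", x)
    x <- sub("FLOODING$", "FLOOD", x)
    x
}

cleaned <- normalize_evtype(evtype)
print(table(cleaned))
```

Applied to the real data, rules like these would merge counts that are currently split across several labels, so the rankings could shift. Any such mapping is a judgment call and should be reported alongside the results.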