This analysis explores the NOAA Storm Database to assess the impact of severe weather events across the United States. It identifies which types of events are most harmful to population health and which cause the greatest economic damage. Reported fatalities, injuries, property damage, and crop damage are processed and summarized by event type. Tornadoes cause by far the most combined fatalities and injuries, while floods cause the greatest economic damage, followed by hurricanes/typhoons. These findings can help public officials and emergency planners prioritize resources and preparedness efforts. All results are reproducible from the raw data file.
# Load required packages
library(dplyr)
library(ggplot2)
# Load the raw data from the CSV file
storm_data <- read.csv("repdata_data_StormData.csv")
# Inspect structure
str(storm_data)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
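As a side note on loading, `read.csv()` can also read a bzip2-compressed CSV directly, so a compressed copy of the raw file would not need to be extracted first. A minimal sketch, using a tiny temporary file with made-up values in place of the storm data (the file name and contents here are assumptions for illustration only):

```r
# Tiny stand-in data set (hypothetical values, not rows from the storm data)
sample_df <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                        FATALITIES = c(1, 0))

# Write it to a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(sample_df, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and decompresses while reading
read_back <- read.csv(tmp)
print(read_back)

# Clean up the temporary file
unlink(tmp)
```

The same call pattern, `read.csv("<path to the compressed file>")`, would apply to a compressed copy of the storm data.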
# Keep only the columns needed for the health and economic analyses
storm_data <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES,
         PROPDMG, PROPDMGEXP,
         CROPDMG, CROPDMGEXP)
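# Map a damage exponent code to its numeric multiplier:
# K/k = thousands, M/m = millions, B/b = billions.
# Any other code (including a blank) is treated as a multiplier of 1.
convert_exp <- function(exp) {
  ifelse(exp %in% c("K", "k"), 1e3,
         ifelse(exp %in% c("M", "m"), 1e6,
                ifelse(exp %in% c("B", "b"), 1e9, 1)))
}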
storm_data$PROPDMG_MULT <- convert_exp(storm_data$PROPDMGEXP)
storm_data$CROPDMG_MULT <- convert_exp(storm_data$CROPDMGEXP)
storm_data$PROPDMG_TOTAL <- storm_data$PROPDMG * storm_data$PROPDMG_MULT
storm_data$CROPDMG_TOTAL <- storm_data$CROPDMG * storm_data$CROPDMG_MULT
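To make the conversion concrete, the sketch below applies the same mapping to a few hand-picked codes. The sample vectors are illustrative and not taken from the data. Note that any code other than K, M, or B (including a blank) falls through to a multiplier of 1, so the amount is used as recorded.

```r
# Same mapping as convert_exp() above, repeated so the example is self-contained
convert_exp <- function(exp) {
  ifelse(exp %in% c("K", "k"), 1e3,
         ifelse(exp %in% c("M", "m"), 1e6,
                ifelse(exp %in% c("B", "b"), 1e9, 1)))
}

# Illustrative inputs (hypothetical, not rows from the storm data)
codes   <- c("K", "m", "B", "", "5")
amounts <- c(25, 2.5, 1.2, 10, 3)

# Multiplier per code, then the damage amount expressed in dollars
multipliers <- convert_exp(codes)
dollars     <- amounts * multipliers

print(data.frame(code = codes, amount = amounts,
                 multiplier = multipliers, dollars = dollars))
```

In this sketch the first three rows scale to 25,000, 2,500,000, and 1,200,000,000, while the blank and "5" codes leave the amounts unchanged.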
# Total fatalities and injuries per event type, ranked by their combined count
health_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Fatalities = sum(FATALITIES, na.rm = TRUE),
    Injuries = sum(INJURIES, na.rm = TRUE)
  ) %>%
  mutate(Total_Health = Fatalities + Injuries) %>%
  arrange(desc(Total_Health))
# Combined property and crop damage per event type, ranked from largest to smallest
economic_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Economic_Damage = sum(PROPDMG_TOTAL + CROPDMG_TOTAL, na.rm = TRUE)
  ) %>%
  arrange(desc(Economic_Damage))
top_health <- head(health_impact, 10)
ggplot(top_health, aes(x = reorder(EVTYPE, Total_Health),
                       y = Total_Health)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events Harmful to Population Health",
       x = "Event Type",
       y = "Total Fatalities and Injuries")
Figure 1: Tornadoes and excessive heat events are responsible for the highest combined number of fatalities and injuries, making them the most harmful to population health.
top_economic <- head(economic_impact, 10)
ggplot(top_economic, aes(x = reorder(EVTYPE, Economic_Damage),
                         y = Economic_Damage / 1e9)) +
  geom_col(fill = "darkred") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage",
       x = "Event Type",
       y = "Economic Damage (Billion USD)")
Figure 2: Floods, hurricanes/typhoons, tornadoes, and storm surges account for the largest economic losses, measured as combined property and crop damage.
head(health_impact, 5)
## # A tibble: 5 × 4
## EVTYPE Fatalities Injuries Total_Health
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 5633 91346 96979
## 2 EXCESSIVE HEAT 1903 6525 8428
## 3 TSTM WIND 504 6957 7461
## 4 FLOOD 470 6789 7259
## 5 LIGHTNING 816 5230 6046
head(economic_impact, 5)
## # A tibble: 5 × 2
## EVTYPE Economic_Damage
## <chr> <dbl>
## 1 FLOOD 150319678257
## 2 HURRICANE/TYPHOON 71913712800
## 3 TORNADO 57352114049.
## 4 STORM SURGE 43323541000
## 5 HAIL 18758221521.
The analysis shows that tornadoes are by far the most dangerous events for population health, with about 97,000 combined fatalities and injuries, more than eleven times the total for excessive heat, the next most harmful event type. In contrast, floods have the greatest economic impact, at roughly 150 billion USD in combined property and crop damage, followed by hurricanes/typhoons at about 72 billion USD. These findings highlight the importance of targeted disaster preparedness strategies. Understanding both human and economic consequences allows decision-makers to allocate resources more effectively and mitigate future risks.
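To put the rankings in proportion, the gap between the two leading event types in each table can be computed directly. A minimal sketch, with the totals copied by hand from the `head(health_impact, 5)` and `head(economic_impact, 5)` output above (the variable names are introduced here for illustration):

```r
# Totals copied from the printed summary tables
total_health    <- c("TORNADO" = 96979, "EXCESSIVE HEAT" = 8428)
economic_damage <- c("FLOOD" = 150319678257, "HURRICANE/TYPHOON" = 71913712800)

# Ratio of the top-ranked event type to the runner-up in each table
health_ratio <- unname(total_health["TORNADO"] / total_health["EXCESSIVE HEAT"])
damage_ratio <- unname(economic_damage["FLOOD"] / economic_damage["HURRICANE/TYPHOON"])

print(round(health_ratio, 1))  # 11.5
print(round(damage_ratio, 1))  # 2.1
```

Tornadoes account for roughly 11.5 times the health impact of excessive heat, whereas floods lead hurricanes/typhoons in economic damage by a factor of about 2.1.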