This report analyzes data from the NOAA Storm Database to determine which types of severe weather events are most harmful to population health and which have the greatest economic consequences in the United States. The dataset includes records from 1950 through November 2011. Population health impact is measured by total fatalities and injuries, while economic impact is measured by total property and crop damage. The raw data were cleaned and transformed within this document to ensure full reproducibility. Event types were aggregated and ranked based on total impact. The results indicate that tornadoes are the most harmful events with respect to population health, while floods and hurricane-related events generate the largest economic losses. These findings may assist public officials in prioritizing disaster preparedness and mitigation strategies.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
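The data are loaded from a local copy of the Storm Data CSV file. No preprocessing was performed outside this document. The absolute path below is specific to the machine on which the report was built and must be adjusted before the analysis can be rerun elsewhere.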
storm <- read.csv("C:\\Users\\kroberts\\OneDrive - DOI\\Documents\\Coursera\\repdata_data_StormData.csv")
str(storm)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
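Because the absolute path above exists on only one machine, a more portable variant might read the file from a path relative to the project. The following is a minimal sketch under stated assumptions, not the method used in this report: `load_storm` is a hypothetical helper, and the demo writes a tiny stand-in file rather than the real dataset. It relies on `read.csv()` opening files through a connection that detects bzip2 compression, so the same call should work for a `.csv` or a `.csv.bz2` file.

```r
# Hypothetical helper (not part of the original report): read a storm CSV
# from a given path and fail early with a clear message if it is missing.
load_storm <- function(path) {
  if (!file.exists(path)) {
    stop("Storm data file not found: ", path)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Demo with a tiny stand-in file (made-up rows, not the real dataset).
demo_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(demo_path, "w")
writeLines(c("EVTYPE,FATALITIES,INJURIES",
             "TORNADO,0,15",
             "FLOOD,1,0"), con)
close(con)

demo <- load_storm(demo_path)
str(demo)
```

In the report itself, the call would take the relative location of the downloaded file in place of `demo_path`.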
The dataset includes property damage (PROPDMG) and crop damage (CROPDMG) values, along with exponent variables (PROPDMGEXP and CROPDMGEXP) that indicate magnitude (e.g., H = hundreds, K = thousands, M = millions, B = billions).
To calculate total economic damage, the exponent codes must be converted into numeric multipliers. The codes are first converted to uppercase for consistency. Any blank or unrecognized code is given a multiplier of 1, so the damage value is used as recorded.
storm$PROPDMGEXP <- toupper(storm$PROPDMGEXP)
storm$CROPDMGEXP <- toupper(storm$CROPDMGEXP)
# Map an exponent code to its numeric multiplier; any other code maps to 1.
exp_to_multiplier <- function(x) {
  ifelse(x == "H", 1e2,
  ifelse(x == "K", 1e3,
  ifelse(x == "M", 1e6,
  ifelse(x == "B", 1e9, 1))))
}
storm$PROP_MULT <- exp_to_multiplier(storm$PROPDMGEXP)
storm$CROP_MULT <- exp_to_multiplier(storm$CROPDMGEXP)
storm$PROP_TOTAL <- storm$PROPDMG * storm$PROP_MULT
storm$CROP_TOTAL <- storm$CROPDMG * storm$CROP_MULT
storm$TOTAL_DAMAGE <- storm$PROP_TOTAL + storm$CROP_TOTAL
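The conversion above can also be written as a table lookup, which some readers find easier to extend than nested `ifelse()` calls. This is a sketch of an alternative, not the code used in this report: `multipliers` and `exp_to_multiplier_lookup` are hypothetical names, and the sample values mirror the first few rows shown in the `str()` output (a PROPDMG of 25 with exponent "K" represents 25 × 1,000 = 25,000 dollars).

```r
# Named lookup vector: exponent code -> numeric multiplier.
# Only the codes handled in the report (H, K, M, B) are listed.
multipliers <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical alternative to the nested ifelse() version.
exp_to_multiplier_lookup <- function(x) {
  m <- unname(multipliers[toupper(x)])
  m[is.na(m)] <- 1  # blank or unrecognized codes leave the value unscaled
  m
}

# Worked example: damage amounts with their exponent codes.
dmg   <- c(25, 2.5, 25)
codes <- c("K", "K", "")
dmg * exp_to_multiplier_lookup(codes)
```

As in the original function, a blank exponent leaves the amount unchanged, so the last value stays at 25.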
To evaluate overall harm to population health, fatalities and injuries are combined into a single variable.
storm$TOTAL_HEALTH <- storm$FATALITIES + storm$INJURIES
The data are grouped by event type (EVTYPE) and summed to determine total health and economic impact for each category.
health_summary <- storm %>%
  group_by(EVTYPE) %>%
  summarise(Total_Health = sum(TOTAL_HEALTH, na.rm = TRUE)) %>%
  arrange(desc(Total_Health))
damage_summary <- storm %>%
  group_by(EVTYPE) %>%
  summarise(Total_Damage = sum(TOTAL_DAMAGE, na.rm = TRUE)) %>%
  arrange(desc(Total_Damage))
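Grouping on EVTYPE treats every distinct spelling as its own category, so the totals depend on how consistently the labels were recorded. The sketch below uses made-up toy data to show the effect and one possible cleaning step. It assumes labels differ only in letter case or surrounding whitespace, and whether the real labels need this cleaning is not established in this report. It is written in base R so that it runs without extra packages.

```r
# Toy data: the same events recorded under inconsistent labels (made-up values).
toy <- data.frame(
  EVTYPE = c("TORNADO", "tornado", " TORNADO", "FLOOD", "Flood"),
  TOTAL_HEALTH = c(10, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Without cleaning, each spelling forms its own group (five groups here).
raw_totals <- aggregate(TOTAL_HEALTH ~ EVTYPE, data = toy, FUN = sum)

# Assumed cleaning step: trim whitespace and upper-case the labels.
toy$EVTYPE_CLEAN <- toupper(trimws(toy$EVTYPE))
clean_totals <- aggregate(TOTAL_HEALTH ~ EVTYPE_CLEAN, data = toy, FUN = sum)

# Rank from highest to lowest total, as arrange(desc(...)) does in the report.
clean_totals <- clean_totals[order(-clean_totals$TOTAL_HEALTH), ]
clean_totals
```

In the dplyr pipelines above, an equivalent step would be to apply `toupper(trimws(EVTYPE))` before `group_by(EVTYPE)`.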
The following figure displays the top 10 event types ranked by total fatalities and injuries across the United States.
top_health <- head(health_summary, 10)
ggplot(top_health, aes(x = reorder(EVTYPE, Total_Health), y = Total_Health)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Total Health Impact",
       x = "Event Type",
       y = "Total Fatalities and Injuries")
Figure 1: Tornadoes rank highest in total population health impact, followed by excessive heat, floods, and thunderstorm wind events. This indicates that tornadoes pose the greatest overall threat to human life and safety among recorded weather events.
The next figure shows the top 10 event types ranked by total property and crop damage.
top_damage <- head(damage_summary, 10)
ggplot(top_damage, aes(x = reorder(EVTYPE, Total_Damage), y = Total_Damage)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage",
       x = "Event Type",
       y = "Total Property and Crop Damage (USD)")
Figure 2: Floods and hurricane-related events produce the greatest economic losses, substantially exceeding most other weather categories. These events tend to cause widespread infrastructure and agricultural damage, resulting in large financial consequences.
This analysis indicates that tornadoes are the most harmful weather events in terms of population health impact, while floods and hurricanes produce the largest economic losses. Although many types of severe weather events occur across the United States, a relatively small number of categories account for the majority of casualties and financial losses. Understanding these patterns is critical for emergency management planning and resource allocation.
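The claim that a few categories account for most of the total can be checked with a cumulative share. The sketch below uses made-up totals to show the calculation. The names `totals`, `cum_share`, and `n_needed` are illustrative and do not appear in the report.

```r
# Made-up totals per event type, already sorted in descending order.
totals <- c(A = 60, B = 20, C = 10, D = 6, E = 4)

# Share of the overall total contributed by each type, and the running share.
share     <- totals / sum(totals)
cum_share <- cumsum(share)

# Number of top-ranked categories needed to exceed half of the overall total.
n_needed <- unname(which(cum_share > 0.5)[1])

round(cum_share, 2)
n_needed
```

Applied to this report, `totals` would be replaced by `damage_summary$Total_Damage` or `health_summary$Total_Health`, both of which are already sorted in descending order.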