Severe weather events pose a significant risk to both public health and economic stability, frequently resulting in fatalities, injuries, and substantial property damage. Given the magnitude of these impacts, it is essential to evaluate systematically which types of events pose the greatest threats. This study analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, a comprehensive record of major storms and weather events across the United States. The database records the timing, geographic distribution, and consequences of such events, including estimates of fatalities, injuries, and property and crop damage. The analysis addresses two research questions: which event types are most harmful to population health, and which have the greatest economic consequences nationwide. The data set is analyzed in R using the dplyr, ggplot2, and gridExtra packages.
Our analysis begins with loading the data set and conducting an initial examination of its contents.
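library(dplyr)      # %>%, mutate(), group_by(), summarize(), arrange()
library(ggplot2)    # bar charts
library(gridExtra)  # grid.arrange() for side-by-side plots
data <- read.csv("repdata_data_StormData.csv.bz2", header = TRUE,
                 na.strings = c("NA", ""), stringsAsFactors = FALSE)
str(data)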
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr NA NA NA NA ...
## $ BGN_LOCATI: chr NA NA NA NA ...
## $ END_DATE : chr NA NA NA NA ...
## $ END_TIME : chr NA NA NA NA ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr NA NA NA NA ...
## $ END_LOCATI: chr NA NA NA NA ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr NA NA NA NA ...
## $ WFO : chr NA NA NA NA ...
## $ STATEOFFIC: chr NA NA NA NA ...
## $ ZONENAMES : chr NA NA NA NA ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr NA NA NA NA ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
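The listing shows that the exponent columns are character vectors containing missing values (CROPDMGEXP is NA in the first rows). Before converting damage figures, it may help to tabulate which codes actually occur. The following is a minimal sketch on a made-up vector; `exp_codes` and `known` are illustrative names, and the values are not taken from the database.

```r
# Illustrative exponent codes; in the real analysis this would be
# data$PROPDMGEXP or data$CROPDMGEXP.
exp_codes <- c("K", "K", NA, "M", "B", NA, "K")

# Count each code, including missing values.
counts <- table(exp_codes, useNA = "ifany")
print(counts)

# Share of records whose code is not one of H, K, M, B (NA counts as unhandled).
known <- c("H", "K", "M", "B")
unhandled <- mean(!(exp_codes %in% known))
print(unhandled)
```

A large unhandled share would suggest checking how missing or unexpected codes should be treated before summing damage.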
The variables PROPDMGEXP and CROPDMGEXP encode the magnitude (hundreds, thousands, millions, or billions) of the values in PROPDMG and CROPDMG. Following the guidelines in the corresponding documentation, each damage figure is scaled by its exponent, and the resulting property and crop damage are summed to derive the totaldmg variable, which represents the overall economic impact.
data <- data %>%
  # Scale property damage by its exponent code; other codes and NA yield NA
  mutate(totalpropdmg = ifelse(PROPDMGEXP == "H", PROPDMG * 1e2,
                        ifelse(PROPDMGEXP == "K", PROPDMG * 1e3,
                        ifelse(PROPDMGEXP == "M", PROPDMG * 1e6,
                        ifelse(PROPDMGEXP == "B", PROPDMG * 1e9, NA))))) %>%
  # Same conversion for crop damage, using the crop columns throughout
  mutate(totalcropdmg = ifelse(CROPDMGEXP == "H", CROPDMG * 1e2,
                        ifelse(CROPDMGEXP == "K", CROPDMG * 1e3,
                        ifelse(CROPDMGEXP == "M", CROPDMG * 1e6,
                        ifelse(CROPDMGEXP == "B", CROPDMG * 1e9, NA))))) %>%
  # The total is NA whenever either component is NA
  mutate(totaldmg = totalpropdmg + totalcropdmg)
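The nested conditions can also be written as a lookup from exponent code to multiplier. The sketch below uses base R on a made-up data frame; `storms`, `multipliers`, and `to_dollars` are hypothetical names, and treating a missing or unrecognized code as zero damage is an assumption of this sketch, not something the analysis above does.

```r
# Toy records standing in for a few storm events (illustrative values only).
storms <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 3),
  PROPDMGEXP = c("K", "M", NA, "B"),
  CROPDMG    = c(0, 10, 5, 0),
  CROPDMGEXP = c(NA, "K", "H", NA),
  stringsAsFactors = FALSE
)

# Multipliers for the four exponent codes handled in the analysis.
multipliers <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale an amount by its exponent code.
# Assumption: a missing or unrecognized code contributes 0 instead of NA.
to_dollars <- function(amount, code) {
  mult <- unname(multipliers[code])   # NA where the code is missing or unknown
  ifelse(is.na(mult), 0, amount * mult)
}

storms$totalpropdmg <- to_dollars(storms$PROPDMG, storms$PROPDMGEXP)
storms$totalcropdmg <- to_dollars(storms$CROPDMG, storms$CROPDMGEXP)
storms$totaldmg     <- storms$totalpropdmg + storms$totalcropdmg

print(storms[, c("totalpropdmg", "totalcropdmg", "totaldmg")])
```

Under this convention, a record with property damage but no crop exponent still contributes its property damage to the total. With NA propagation, such a record has an NA total and is excluded from any sum computed with `na.rm = TRUE`. Which behavior is appropriate depends on how the documentation defines blank exponents.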
To assess which event types pose the greatest threat to population health, we examine their impact in terms of fatalities and injuries. Specifically, we sum fatalities and injuries over all records of each event type and report the five event types with the highest total fatalities and, separately, the five with the highest total injuries.
top_n <- 5
# Five event types with the highest total fatalities
top_fatalities <- data %>%
  group_by(EVTYPE) %>%
  summarize(fatalities = sum(FATALITIES, na.rm = TRUE)) %>%
  arrange(desc(fatalities)) %>%
  head(top_n)
g1 <- ggplot(top_fatalities, aes(x = EVTYPE, y = fatalities)) +
  geom_bar(stat = "identity") +
  xlab("Event Type") + ylab("Total number of fatalities") +
  # Keep bars in descending order instead of alphabetical order
  scale_x_discrete(limits = top_fatalities$EVTYPE) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
# Five event types with the highest total injuries
top_injuries <- data %>%
  group_by(EVTYPE) %>%
  summarize(injuries = sum(INJURIES, na.rm = TRUE)) %>%
  arrange(desc(injuries)) %>%
  head(top_n)
g2 <- ggplot(top_injuries, aes(x = EVTYPE, y = injuries)) +
  geom_bar(stat = "identity") +
  xlab("Event Type") + ylab("Total number of injuries") +
  scale_x_discrete(limits = top_injuries$EVTYPE) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
# Show both charts side by side
grid.arrange(g1, g2, ncol = 2)
The results indicate that tornadoes account for the largest share of fatalities, followed by excessive heat and flash floods. A similar analysis is conducted for injuries. In this case, tornadoes again emerge as the most harmful event type, with thunderstorm winds ranking second and floods ranking third.
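Because the two rankings agree on the first position but diverge afterwards, it can be useful to view both metrics in one table. The following is a minimal base R sketch on made-up records; `events`, `health`, `by_fatalities`, and `by_injuries` are illustrative names, and the counts are not taken from the database.

```r
# Toy event records (illustrative values only).
events <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HEAT"),
  FATALITIES = c(3, 1, 2, 2, 0, 1),
  INJURIES   = c(10, 2, 5, 0, 3, 1),
  stringsAsFactors = FALSE
)

# Sum both health outcomes per event type in a single step.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                    data = events, FUN = sum)

# Rank the same table by each metric.
by_fatalities <- health[order(-health$FATALITIES), ]
by_injuries   <- health[order(-health$INJURIES), ]

print(by_fatalities)
print(by_injuries)
```

Keeping both columns in one summary makes it easy to see where an event type ranks high on one outcome but low on the other.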
To evaluate which event types have the greatest economic impact, we sum total property and crop damage over all records of each event type. The analysis focuses on the five event types with the highest total damage.
top_n <- 5
# Five event types with the highest combined property and crop damage
top_dmg <- data %>%
  group_by(EVTYPE) %>%
  summarize(totaldmg = sum(totaldmg, na.rm = TRUE)) %>%
  arrange(desc(totaldmg)) %>%
  head(top_n)
g <- ggplot(top_dmg, aes(x = EVTYPE, y = totaldmg)) +
  geom_bar(stat = "identity") +
  xlab("Event Type") + ylab("Total damage (USD)") +
  scale_x_discrete(limits = top_dmg$EVTYPE) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
print(g)
Regarding economic consequences, floods cause the greatest financial losses, followed by hurricanes/typhoons (a single event type in the data), with tornadoes in third position.
Our analysis indicates that tornadoes represent the most significant threat to population health, causing the highest number of both fatalities and injuries. Other severe events, such as excessive heat, flash floods, floods, and thunderstorm winds, also contribute substantially to adverse health outcomes. From an economic perspective, floods and hurricanes/typhoons are the primary drivers of financial losses, with tornadoes ranking third. These findings highlight the differing impacts of natural hazards on human health and economic stability, and the need for targeted risk mitigation and preparedness strategies.