This analysis examines the consequences of severe weather events across the United States for population health and the economy, using the NOAA Storm Database. Measured by combined fatalities and injuries, tornadoes are by far the most harmful event type, followed by excessive heat and thunderstorm winds (recorded as TSTM WIND). Measured by the summed property and crop damage figures reported in the database, tornadoes, flash floods, and thunderstorm winds rank highest. These results indicate where preparedness and mitigation efforts by policymakers, emergency responders, and communities are likely to matter most.
The analysis begins by loading the raw, bzip2-compressed CSV file containing the NOAA Storm Database into R. Base R's read.csv() is used for this task; it reads the .bz2 file directly, so no separate decompression step is needed.
# Load the dataset (read.csv decompresses the .bz2 file transparently)
storm_data <- read.csv("repdata_data_StormData.csv.bz2")
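The transparent decompression can be checked on a small scale. The following is a minimal sketch, assuming a made-up two-row data frame (`toy`) and a temporary file path; neither is part of the NOAA data.

```r
# Toy data standing in for the storm database (values are made up)
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(1, 0))

# Write the toy data to a bzip2-compressed CSV in a temporary location
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression itself, so the path is passed as-is
toy_back <- read.csv(path, stringsAsFactors = FALSE)
print(toy_back)
```

The same call pattern applies to the full storm file; only the path changes.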
Once the dataset is loaded, its summary statistics and structure are examined to understand its contents.
# Summary of the dataset
summary(storm_data)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE
## Min. : 1.0 Length:902297 Length:902297 Length:902297
## 1st Qu.:19.0 Class :character Class :character Class :character
## Median :30.0 Mode :character Mode :character Mode :character
## Mean :31.2
## 3rd Qu.:45.0
## Max. :95.0
##
## COUNTY COUNTYNAME STATE EVTYPE
## Min. : 0.0 Length:902297 Length:902297 Length:902297
## 1st Qu.: 31.0 Class :character Class :character Class :character
## Median : 75.0 Mode :character Mode :character Mode :character
## Mean :100.6
## 3rd Qu.:131.0
## Max. :873.0
##
## BGN_RANGE BGN_AZI BGN_LOCATI END_DATE
## Min. : 0.000 Length:902297 Length:902297 Length:902297
## 1st Qu.: 0.000 Class :character Class :character Class :character
## Median : 0.000 Mode :character Mode :character Mode :character
## Mean : 1.484
## 3rd Qu.: 1.000
## Max. :3749.000
##
## END_TIME COUNTY_END COUNTYENDN END_RANGE
## Length:902297 Min. :0 Mode:logical Min. : 0.0000
## Class :character 1st Qu.:0 NA's:902297 1st Qu.: 0.0000
## Mode :character Median :0 Median : 0.0000
## Mean :0 Mean : 0.9862
## 3rd Qu.:0 3rd Qu.: 0.0000
## Max. :0 Max. :925.0000
##
## END_AZI END_LOCATI LENGTH WIDTH
## Length:902297 Length:902297 Min. : 0.0000 Min. : 0.000
## Class :character Class :character 1st Qu.: 0.0000 1st Qu.: 0.000
## Mode :character Mode :character Median : 0.0000 Median : 0.000
## Mean : 0.2301 Mean : 7.503
## 3rd Qu.: 0.0000 3rd Qu.: 0.000
## Max. :2315.0000 Max. :4400.000
##
## F MAG FATALITIES INJURIES
## Min. :0.0 Min. : 0.0 Min. : 0.0000 Min. : 0.0000
## 1st Qu.:0.0 1st Qu.: 0.0 1st Qu.: 0.0000 1st Qu.: 0.0000
## Median :1.0 Median : 50.0 Median : 0.0000 Median : 0.0000
## Mean :0.9 Mean : 46.9 Mean : 0.0168 Mean : 0.1557
## 3rd Qu.:1.0 3rd Qu.: 75.0 3rd Qu.: 0.0000 3rd Qu.: 0.0000
## Max. :5.0 Max. :22000.0 Max. :583.0000 Max. :1700.0000
## NA's :843563
## PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## Min. : 0.00 Length:902297 Min. : 0.000 Length:902297
## 1st Qu.: 0.00 Class :character 1st Qu.: 0.000 Class :character
## Median : 0.00 Mode :character Median : 0.000 Mode :character
## Mean : 12.06 Mean : 1.527
## 3rd Qu.: 0.50 3rd Qu.: 0.000
## Max. :5000.00 Max. :990.000
##
## WFO STATEOFFIC ZONENAMES LATITUDE
## Length:902297 Length:902297 Length:902297 Min. : 0
## Class :character Class :character Class :character 1st Qu.:2802
## Mode :character Mode :character Mode :character Median :3540
## Mean :2875
## 3rd Qu.:4019
## Max. :9706
## NA's :47
## LONGITUDE LATITUDE_E LONGITUDE_ REMARKS
## Min. :-14451 Min. : 0 Min. :-14455 Length:902297
## 1st Qu.: 7247 1st Qu.: 0 1st Qu.: 0 Class :character
## Median : 8707 Median : 0 Median : 0 Mode :character
## Mean : 6940 Mean :1452 Mean : 3509
## 3rd Qu.: 9605 3rd Qu.:3549 3rd Qu.: 8735
## Max. : 17124 Max. :9706 Max. :106220
## NA's :40
## REFNUM
## Min. : 1
## 1st Qu.:225575
## Median :451149
## Mean :451149
## 3rd Qu.:676723
## Max. :902297
##
# Check the structure of the dataset
str(storm_data)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
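Beyond summary() and str(), two quick checks can be helpful before aggregating: how many distinct event labels exist, and whether the outcome columns contain missing values. The sketch below runs them on a hypothetical toy data frame (`toy_storms`) that only borrows the column names from the storm data; the values are invented for illustration.

```r
# Hypothetical toy data with the same column names as the storm data
toy_storms <- data.frame(
  EVTYPE = c("TORNADO", "TSTM WIND", "TORNADO", "HAIL"),
  FATALITIES = c(1, 0, 2, NA),
  INJURIES = c(15, 0, 3, 0),
  stringsAsFactors = FALSE
)

# Number of distinct event labels
n_types <- length(unique(toy_storms$EVTYPE))

# Count of missing values in each outcome column
na_counts <- colSums(is.na(toy_storms[, c("FATALITIES", "INJURIES")]))

print(n_types)
print(na_counts)
```

On the real data, the same two expressions can be applied to storm_data directly.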
The variables relevant to the analysis (EVTYPE, FATALITIES, INJURIES, PROPDMG, and CROPDMG) are selected by subsetting the data frame to these columns.
# Subset the data
selected_data <- storm_data[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")]
The results below summarize the impact of severe weather events in the United States on population health and on the economy.
Harm to population health is measured as the sum of fatalities and injuries per event type. Tornadoes are by far the leading cause, followed by excessive heat and thunderstorm winds (recorded as TSTM WIND).
library(ggplot2)
# Summarize total fatalities and injuries by event type
health_summary <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = selected_data, sum)
# Calculate total health impact
health_summary$total_damage <- health_summary$FATALITIES + health_summary$INJURIES
# Sort by total fatalities and injuries
health_summary <- health_summary[order(health_summary$total_damage, decreasing = TRUE), ]
# Create a bar plot for population health impact
ggplot(health_summary[1:10, ], aes(x = reorder(EVTYPE, -total_damage), y = total_damage)) +
  geom_col(fill = "blue") +
  labs(title = "",
       x = "Event Type",
       y = "Total Fatalities + Injuries") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 1. Top 10 Severe Weather Events: Population Health Impact
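The aggregate-then-sort pattern behind this ranking can be traced by hand on a few rows. The sketch below assumes a made-up data frame (`toy`) with invented counts; only the column names come from the storm data.

```r
# Made-up records: two tornado rows, one hail row, one flood row
toy <- data.frame(
  EVTYPE = c("TORNADO", "HAIL", "TORNADO", "FLOOD"),
  FATALITIES = c(2, 0, 1, 4),
  INJURIES = c(10, 1, 5, 0),
  stringsAsFactors = FALSE
)

# Sum both outcome columns within each event type
toy_summary <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Combine the two outcomes and rank event types by the combined total
toy_summary$total <- toy_summary$FATALITIES + toy_summary$INJURIES
toy_summary <- toy_summary[order(toy_summary$total, decreasing = TRUE), ]

# head() returns at most n rows, so it is safe when fewer than n types exist
print(head(toy_summary, 2))
```

Note that the formula interface of aggregate() drops rows containing NA in any of the listed columns by default, which matters if the outcome columns are incomplete.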
Regarding economic consequences, tornadoes account for the largest combined property and crop damage, followed by flash floods and thunderstorm winds (recorded as TSTM WIND). Note that these totals sum the raw PROPDMG and CROPDMG values; the magnitude codes in PROPDMGEXP and CROPDMGEXP are not applied, so the ranking reflects the reported figures rather than dollar amounts.
# Summarize total property and crop damages by event type
economic_summary <- aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE, data = selected_data, sum)
# Calculate total economic damage
economic_summary$total_damage <- economic_summary$PROPDMG + economic_summary$CROPDMG
# Sort by total economic damage
economic_summary <- economic_summary[order(economic_summary$total_damage, decreasing = TRUE), ]
# Create a bar plot for economic consequences
ggplot(economic_summary[1:10, ], aes(x = reorder(EVTYPE, -total_damage), y = total_damage / 1000)) +
  geom_col(fill = "red") +
  labs(title = "",
       x = "Event Type",
       y = "Summed PROPDMG + CROPDMG (thousands, exponents not applied)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 2. Top 10 Severe Weather Events: Economic Consequences
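The dataset also carries PROPDMGEXP and CROPDMGEXP columns (visible in the str() output above) that encode the magnitude of each damage figure; they are not part of selected_data and would need to be added to the subset before use. The following is a minimal sketch of how such codes could be turned into dollar amounts, not the method used for the figures in this report. The helper `exp_multiplier` is hypothetical, and the mapping (H = 1e2, K = 1e3, M = 1e6, B = 1e9, anything else treated as 1) is an assumption that should be checked against the NOAA documentation.

```r
# Hypothetical helper: map an exponent code to a dollar multiplier.
# Assumed mapping: H = 1e2, K = 1e3, M = 1e6, B = 1e9; other codes count as 1.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(code)]   # unmatched or empty codes yield NA
  mult[is.na(mult)] <- 1          # fall back to a multiplier of 1
  unname(mult)
}

# Made-up rows illustrating three different codes
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "HAIL"),
  PROPDMG = c(25, 2.5, 10),
  PROPDMGEXP = c("K", "B", ""),
  stringsAsFactors = FALSE
)

# Property damage in dollars under the assumed mapping
toy$PROP_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

Because a single code in the billions outweighs many entries in the thousands, applying the multipliers can change the ranking relative to the raw sums.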