---
title: "NOAA Storm Data Assignment"
output: html_document
---
The data for this analysis were obtained from the NOAA Storm Database. The original data file is a compressed comma-separated file with a .csv.bz2 extension. The analysis starts from the raw compressed file, loads the data into R, and selects the variables needed to answer the public health and economic impact questions.
options(scipen = 999)
file_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file_name <- "StormData.csv.bz2"
if (!file.exists(file_name)) {
  download.file(file_url, file_name, method = "auto")
}
storm_data <- read.csv(bzfile(file_name), stringsAsFactors = FALSE)
dim(storm_data)
## [1] 902297 37
names(storm_data)[1:20]
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
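The loading step relies on `bzfile()` to decompress the file as it is read. As a minimal sketch of that round trip, the example below writes a tiny made-up data frame to a temporary `.csv.bz2` file and reads it back. The names `toy`, `toy_path`, and `toy_back` are hypothetical and the values are invented for illustration; they are not taken from the NOAA file.

```r
# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD"),
  FATALITIES = c(3, 1),
  stringsAsFactors = FALSE
)

# Write the toy data through a bzip2-compressing connection
toy_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(toy_path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read it back the same way the report reads the storm data
toy_back <- read.csv(bzfile(toy_path), stringsAsFactors = FALSE)
dim(toy_back)
```

Reading through a connection means the compressed file never has to be unpacked on disk first.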
The variables needed for the analysis include event type, fatalities, injuries, property damage, crop damage, and the damage exponent variables. For the public health question, fatalities and injuries were combined into one total health-impact measure.
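# Keep only the variables needed for the two research questions
storm_selected <- storm_data[, c("EVTYPE",
                                 "FATALITIES",
                                 "INJURIES",
                                 "PROPDMG",
                                 "PROPDMGEXP",
                                 "CROPDMG",
                                 "CROPDMGEXP")]
# Normalize event-type labels to upper case so case variants group together
storm_selected$EVTYPE <- toupper(storm_selected$EVTYPE)
# Combined health impact: fatalities plus injuries per event
storm_selected$health_damage <- storm_selected$FATALITIES + storm_selected$INJURIES
# Total health impact per event type, sorted from highest to lowest
health_summary <- aggregate(health_damage ~ EVTYPE,
                            data = storm_selected,
                            sum)
health_summary <- health_summary[order(-health_summary$health_damage), ]
top_health <- head(health_summary, 10)
top_health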
##                EVTYPE health_damage
## 758           TORNADO         96979
## 116    EXCESSIVE HEAT          8428
## 779         TSTM WIND          7461
## 154             FLOOD          7259
## 418         LIGHTNING          6046
## 243              HEAT          3037
## 138       FLASH FLOOD          2755
## 387         ICE STORM          2064
## 685 THUNDERSTORM WIND          1621
## 888      WINTER STORM          1527
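The normalize, aggregate, and rank pattern used for the health summary can be checked by hand on a few rows. The sketch below uses a made-up data frame (`toy` and `toy_summary` are hypothetical names, and the counts are invented) to show how mixed-case labels collapse into one group before the totals are ranked.

```r
# Hypothetical toy events with mixed-case labels; values are invented
toy <- data.frame(
  EVTYPE     = c("Tornado", "FLOOD", "tornado", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 0, 3, 0),
  INJURIES   = c(10, 0, 5, 1, 2),
  stringsAsFactors = FALSE
)

# Same steps as the report: upper-case the labels, then add the two counts
toy$EVTYPE <- toupper(toy$EVTYPE)
toy$health_damage <- toy$FATALITIES + toy$INJURIES

# Sum per event type and sort from highest to lowest
toy_summary <- aggregate(health_damage ~ EVTYPE, data = toy, FUN = sum)
toy_summary <- toy_summary[order(-toy_summary$health_damage), ]
toy_summary
```

Here "Tornado" and "tornado" end up in a single TORNADO group with a total of 17, ahead of HEAT (4) and FLOOD (3). Upper-casing does not merge labels that differ in spelling, which is why TSTM WIND and THUNDERSTORM WIND remain separate rows in the table above.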
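For the economic impact question, property damage and crop damage were converted into dollar values using their corresponding exponent variables (PROPDMGEXP and CROPDMGEXP). A letter code scales the reported amount by hundreds (H), thousands (K), millions (M), or billions (B). A digit code n is treated as a multiplier of 10 to the power n, and a blank is treated as a multiplier of 1. Any other code receives a multiplier of 0, so those records contribute nothing to the damage totals.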
convert_exponent <- function(exp) {
  exp <- toupper(as.character(exp))
  # Default of 0: unrecognized codes contribute no damage
  multiplier <- rep(0, length(exp))
  # Blank means the amount is already in dollars
  multiplier[exp == ""] <- 1
  # Digit codes are treated as powers of ten
  multiplier[exp == "0"] <- 1
  multiplier[exp == "1"] <- 10
  multiplier[exp == "2"] <- 100
  multiplier[exp == "3"] <- 1000
  multiplier[exp == "4"] <- 10000
  multiplier[exp == "5"] <- 100000
  multiplier[exp == "6"] <- 1000000
  multiplier[exp == "7"] <- 10000000
  multiplier[exp == "8"] <- 100000000
  multiplier[exp == "9"] <- 1000000000
  # Letter codes: hundreds, thousands, millions, billions
  multiplier[exp == "H"] <- 100
  multiplier[exp == "K"] <- 1000
  multiplier[exp == "M"] <- 1000000
  multiplier[exp == "B"] <- 1000000000
  multiplier
}
storm_selected$property_damage_value <- storm_selected$PROPDMG *
  convert_exponent(storm_selected$PROPDMGEXP)
storm_selected$crop_damage_value <- storm_selected$CROPDMG *
  convert_exponent(storm_selected$CROPDMGEXP)
storm_selected$economic_damage <- storm_selected$property_damage_value +
  storm_selected$crop_damage_value
economic_summary <- aggregate(economic_damage ~ EVTYPE,
                              data = storm_selected,
                              sum)
economic_summary <- economic_summary[order(-economic_summary$economic_damage), ]
top_economic <- head(economic_summary, 10)
top_economic
##                EVTYPE economic_damage
## 154             FLOOD    150319678257
## 372 HURRICANE/TYPHOON     71913712800
## 758           TORNADO     57362333886
## 599       STORM SURGE     43323541000
## 212              HAIL     18761221986
## 138       FLASH FLOOD     18243991078
## 84            DROUGHT     15018672000
## 363         HURRICANE     14610229010
## 529       RIVER FLOOD     10148404500
## 387         ICE STORM      8967041360
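The exponent conversion behind these totals can be verified on a few hand-made values. The sketch below is a compact lookup-table variant written for illustration, not the function used in the report. The name `exponent_multiplier` and the toy values are hypothetical, and the rule that unrecognized codes map to 0 is carried over from the conversion described earlier.

```r
# Hypothetical helper: lookup-table variant of the exponent conversion
exponent_multiplier <- function(exp) {
  exp <- toupper(as.character(exp))
  letter_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  # Named lookup returns NA wherever no letter code matches
  multiplier <- unname(letter_lookup[exp])
  # Digit codes are treated as powers of ten
  is_digit <- exp %in% as.character(0:9)
  multiplier[is_digit] <- 10^as.numeric(exp[is_digit])
  # Blank means the amount is already in dollars
  multiplier[exp == ""] <- 1
  # Anything still unmatched contributes nothing
  multiplier[is.na(multiplier)] <- 0
  multiplier
}

# Invented amounts paired with a mix of valid and invalid codes
toy <- data.frame(
  damage = c(25, 2.5, 1.2, 10, 5, 3),
  exp    = c("K", "M", "b", "", "?", "2"),
  stringsAsFactors = FALSE
)
toy$dollars <- toy$damage * exponent_multiplier(toy$exp)
toy
```

In this toy table, 25 with code K becomes 25,000 dollars, 2.5 with M becomes 2.5 million, the lower-case b is read as billions, the blank leaves 10 unchanged, the digit code 2 turns 3 into 300, and the "?" row drops to 0.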
The first research question asks which types of weather events are most harmful to population health. Population health impact was measured by combining total fatalities and injuries for each event type. The event types with the highest combined fatalities and injuries are shown below.
barplot(top_health$health_damage,
        names.arg = top_health$EVTYPE,
        las = 2,
        main = "Top 10 Weather Events by Population Health Impact",
        ylab = "Total Fatalities and Injuries",
        cex.names = 0.7)
Based on the results, tornadoes have by far the greatest overall impact on population health. They account for 96,979 combined fatalities and injuries in the NOAA storm database, more than eleven times the total for excessive heat (8,428), the next-highest event type.
The second research question asks which types of weather events have the greatest economic consequences. Economic impact was measured by combining property damage and crop damage after converting the damage exponent variables into dollar-value multipliers.
barplot(top_economic$economic_damage / 1000000000,
        names.arg = top_economic$EVTYPE,
        las = 2,
        main = "Top 10 Weather Events by Economic Damage",
        ylab = "Total Economic Damage in Billions of Dollars",
        cex.names = 0.7)
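Based on the results, floods have the greatest economic consequences, with about $150 billion in combined property and crop damage. Hurricanes/typhoons (about $72 billion) and tornadoes (about $57 billion) follow, and storm surge, flash floods, and river floods also appear in the top ten.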
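Both bar charts rotate the axis labels with `las = 2`, which places long event names such as HURRICANE/TYPHOON in the bottom margin. The sketch below assumes the default margins may be too small for those labels and widens the bottom margin before plotting. The margin values and the name `toy_damage` are illustrative assumptions, and the three amounts are the top three totals from the table above rounded to billions of dollars.

```r
# Top three totals from the economic table, rounded to billions of dollars
toy_damage <- c(FLOOD = 150.3, "HURRICANE/TYPHOON" = 71.9, TORNADO = 57.4)

# Null PDF device so the sketch runs without writing a file;
# drop this line (and dev.off below) inside a knitted report
pdf(NULL)

# Assumed margin sizes (bottom, left, top, right); tune to the label length
old_par <- par(mar = c(9, 5, 4, 2) + 0.1)
bar_midpoints <- barplot(toy_damage,
                         las = 2,
                         cex.names = 0.7,
                         main = "Economic Damage, Top Three Event Types",
                         ylab = "Billions of Dollars")
par(old_par)  # restore the previous margins
invisible(dev.off())
```

`barplot()` returns the bar midpoints, which can be useful if value labels are later added with `text()`.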
This analysis shows that the most harmful weather events depend on whether the concern is population health or economic loss. Tornadoes are the most harmful event type for population health when fatalities and injuries are combined. Floods are the most economically damaging event type when property and crop damage are combined, and related categories such as flash floods and river floods also rank in the top ten. These findings suggest that emergency preparedness planning should consider both human health effects and financial losses when prioritizing severe weather risks.