This report analyzes the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database to identify which types of severe weather events are most harmful to population health and which cause the greatest economic losses. Using data from 1950–2011, we aggregated fatalities, injuries, and property and crop damage by event type. Tornadoes caused the highest combined fatalities and injuries, while floods caused the largest economic damage.
library(dplyr)
library(ggplot2)
library(lattice)
This chunk of code reads and processes the data from a CSV file. We first read the file into a variable called fsdata, then use the pipe operator (%>%) with select() to keep only the columns needed for our two questions, since the full table is very large. The result is stored back in fsdata, overwriting the original data.
Next, we replace NA values with 0 and keep only the rows in which at least one of FATALITIES, INJURIES, PROPDMG, or CROPDMG is greater than 0.
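Economic damage was computed by converting the exponent columns (PROPDMGEXP and CROPDMGEXP) to numeric multipliers (K = 1,000; M = 1,000,000; B = 1,000,000,000), matched case-insensitively; any other exponent value is treated as a multiplier of 1. Each damage amount was multiplied by its multiplier, and the property and crop costs were added to give a total cost. Finally, we grouped the data by event type to obtain total health and economic impacts.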
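# Read the raw storm data
fsdata <- read.csv("fStormData.csv")
# Keep only the columns needed for the health and economic questions
fsdata <- fsdata %>%
  select(EVTYPE, FATALITIES, INJURIES,
         PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
# Replace missing values with 0
fsdata[is.na(fsdata)] <- 0
# Keep rows that report at least one fatality, injury, or damage amount
fsdata <- fsdata %>%
  filter(FATALITIES > 0 | INJURIES > 0 | PROPDMG > 0 | CROPDMG > 0)
# Map exponent codes to multipliers (case-insensitive); other codes map to 1
exp_to_num <- function(x) {
  x <- toupper(x)
  ifelse(x == "K", 1e3,
         ifelse(x == "M", 1e6,
                ifelse(x == "B", 1e9, 1)))
}
# Compute property, crop, and total cost
fsdata <- fsdata %>%
  mutate(
    prop_cost  = PROPDMG * exp_to_num(PROPDMGEXP),
    crop_cost  = CROPDMG * exp_to_num(CROPDMGEXP),
    total_cost = prop_cost + crop_cost
  )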
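To make the preprocessing steps above easier to follow, here is a minimal sketch that traces them on a tiny hand-made data frame. The `toy` records and their values are hypothetical and are not taken from the NOAA file; the sketch uses base R only (no dplyr) so it can be run on its own, and it assumes the same K/M/B mapping described in the text.

```r
# Hypothetical toy records; the values are made up for illustration only
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL", "FLOOD"),
  FATALITIES = c(1, 0, 0, NA),
  INJURIES   = c(3, 0, 0, 0),
  PROPDMG    = c(25, 2.5, 0, 1.2),
  PROPDMGEXP = c("K", "m", "", "B"),
  CROPDMG    = c(0, 10, 0, NA),
  CROPDMGEXP = c("", "K", "", ""),
  stringsAsFactors = FALSE
)

# Step 1: replace missing values with 0
toy[is.na(toy)] <- 0

# Step 2: keep rows where at least one measure is positive
# (the all-zero HAIL row is dropped)
keep <- toy$FATALITIES > 0 | toy$INJURIES > 0 |
  toy$PROPDMG > 0 | toy$CROPDMG > 0
toy <- toy[keep, ]

# Step 3: same mapping as in the report: K, M, B (any case), otherwise 1
exp_to_num <- function(x) {
  x <- toupper(x)
  ifelse(x == "K", 1e3,
         ifelse(x == "M", 1e6,
                ifelse(x == "B", 1e9, 1)))
}

toy$prop_cost  <- toy$PROPDMG * exp_to_num(toy$PROPDMGEXP)
toy$crop_cost  <- toy$CROPDMG * exp_to_num(toy$CROPDMGEXP)
toy$total_cost <- toy$prop_cost + toy$crop_cost
# total_cost is now 25000, 2510000, 1200000000

# Step 4: total cost per event type (base R stand-in for group_by + summarise)
by_type <- aggregate(total_cost ~ EVTYPE, data = toy, FUN = sum)
print(by_type)
```

Note that a single record carrying a "B" exponent dominates the FLOOD total in this toy data, which shows how sensitive the per-event sums are to the exponent codes.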
health_summary <- fsdata %>%
  group_by(EVTYPE) %>%
  summarise(
    total_fatalities = sum(FATALITIES),
    total_injuries   = sum(INJURIES),
    total_harm       = total_fatalities + total_injuries,
    .groups = "drop"
  ) %>%
  arrange(desc(total_harm))
head(health_summary, 10)
## # A tibble: 10 × 4
##    EVTYPE            total_fatalities total_injuries total_harm
##    <chr>                        <dbl>          <dbl>      <dbl>
##  1 TORNADO                       5633          91346      96979
##  2 EXCESSIVE HEAT                1903           6525       8428
##  3 TSTM WIND                      504           6957       7461
##  4 FLOOD                          470           6789       7259
##  5 LIGHTNING                      816           5230       6046
##  6 HEAT                           937           2100       3037
##  7 FLASH FLOOD                    978           1777       2755
##  8 ICE STORM                       89           1975       2064
##  9 THUNDERSTORM WIND              133           1488       1621
## 10 WINTER STORM                   206           1321       1527
top_health <- head(health_summary, 10)
ggplot(top_health, aes(x = reorder(EVTYPE, total_harm), y = total_harm)) +
  geom_col(fill = "tomato") +
  coord_flip() +
  labs(x = "Event type", y = "Total fatalities + injuries",
       title = "Top 10 Weather Events Harmful to Population Health")
econ_summary <- fsdata %>%
  group_by(EVTYPE) %>%
  summarise(total_econ_damage = sum(total_cost),
            .groups = "drop") %>%
  arrange(desc(total_econ_damage))
head(econ_summary, 10)
## # A tibble: 10 × 2
##    EVTYPE            total_econ_damage
##    <chr>                         <dbl>
##  1 FLOOD                 150319678257
##  2 HURRICANE/TYPHOON      71913712800
##  3 TORNADO                57352114049.
##  4 STORM SURGE            43323541000
##  5 HAIL                   18758221521.
##  6 FLASH FLOOD            17562129167.
##  7 DROUGHT                15018672000
##  8 HURRICANE              14610229010
##  9 RIVER FLOOD            10148404500
## 10 ICE STORM               8967041360
top_econ <- head(econ_summary, 10)
ggplot(top_econ, aes(x = reorder(EVTYPE, total_econ_damage),
                     y = total_econ_damage / 1e9)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(x = "Event type",
       y = "Total economic damage (billion USD)",
       title = "Top 10 Weather Events by Economic Consequences")