Synopsis

This analysis investigates the impact of different weather events in the United States using NOAA storm data. The focus is on both economic damage (property and crop losses) and population health impacts (injuries and fatalities). At the national level the data are separated only by event type (EVTYPE); state-level results are also provided. The top events are visualized by total damage and by total casualties, and a third, two-panel figure compares the top 5 most harmful event types per state by damage and by health impact. All data processing and analysis were done in R, the raw data were loaded directly from the CSV file, and no preprocessing was performed outside of this document. The findings indicate that nationally, tornadoes cause the most injuries and fatalities and also account for the largest summed property damage, while hail leads crop damage among the top 5 events; the state-level results show regional differences. This information may help inform policy decisions regarding disaster preparedness and mitigation.

Data Processing:

This step loads the required libraries and the CSV file containing the data for analysis. No further processing is done because the file is already laid out as a table that can be analysed directly.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(patchwork)
## Warning: package 'patchwork' was built under R version 4.5.1
file <- read.csv("repdata_data_StormData.csv", header = TRUE)

National Results:

By taking the top 5 causes of economic damage and of human casualties, we can review the most impactful natural disasters.

# Sum raw damage figures per event type and keep the 5 largest totals
property_damage_data <- file %>% group_by(EVTYPE) %>%
  summarise(total_damage = sum(PROPDMG + CROPDMG, na.rm = TRUE),
            property_damage = sum(PROPDMG, na.rm = TRUE),
            crop_damage = sum(CROPDMG, na.rm = TRUE)) %>%
  slice_max(order_by = total_damage, n = 5, with_ties = FALSE)

# Sum fatalities and injuries per event type and keep the 5 largest totals
population_data <- file %>% group_by(EVTYPE) %>%
  summarise(total_casualties = sum(FATALITIES + INJURIES, na.rm = TRUE),
            total_fatalities = sum(FATALITIES, na.rm = TRUE),
            total_injuries = sum(INJURIES, na.rm = TRUE)) %>%
  slice_max(order_by = total_casualties, n = 5, with_ties = FALSE)
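Because the national totals are grouped only by EVTYPE, two spellings of the same event would be counted as separate event types. The following is a minimal sketch of normalizing the labels before grouping, assuming the labels differ only in letter case or surrounding whitespace. The toy table and its numbers are made up for illustration and are not taken from the NOAA file.

```r
library(dplyr)

# Made-up toy rows; EVTYPE deliberately mixes case and stray spaces
toy <- data.frame(
  EVTYPE = c("TORNADO", " tornado", "Hail", "HAIL ", "FLOOD"),
  INJURIES = c(10, 5, 1, 2, 3),
  FATALITIES = c(1, 0, 0, 0, 1)
)

toy_clean <- toy %>%
  # Assumption: trimming whitespace and upper-casing is enough to merge variants
  mutate(EVTYPE = toupper(trimws(EVTYPE))) %>%
  group_by(EVTYPE) %>%
  summarise(total_casualties = sum(FATALITIES + INJURIES, na.rm = TRUE)) %>%
  slice_max(order_by = total_casualties, n = 5, with_ties = FALSE)

print(toy_clean)
```

Variants that differ by more than case or spacing would need an explicit mapping, which this sketch does not attempt.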

Greatest crop damage, greatest property damage and greatest total damage among the top 5 events: interestingly, hail is the number one cause of crop damage even though tornadoes are number one for both overall and property damage. The figures are sums of the raw PROPDMG and CROPDMG values.

property_damage_data[which.max(property_damage_data$crop_damage),]
## # A tibble: 1 × 4
##   EVTYPE total_damage property_damage crop_damage
##   <chr>         <dbl>           <dbl>       <dbl>
## 1 HAIL       1268290.         688693.     579596.
property_damage_data[which.max(property_damage_data$property_damage),]
## # A tibble: 1 × 4
##   EVTYPE  total_damage property_damage crop_damage
##   <chr>          <dbl>           <dbl>       <dbl>
## 1 TORNADO     3312277.        3212258.     100019.
property_damage_data[which.max(property_damage_data$total_damage),]
## # A tibble: 1 × 4
##   EVTYPE  total_damage property_damage crop_damage
##   <chr>          <dbl>           <dbl>       <dbl>
## 1 TORNADO     3312277.        3212258.     100019.
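The damage totals above add up the raw PROPDMG and CROPDMG values. The NOAA file also has PROPDMGEXP and CROPDMGEXP columns holding magnitude codes such as K, M and B, so the raw sums are not dollar amounts. A minimal sketch of scaling each value before summing is given below on a made-up toy table. The helper `exp_to_multiplier` is hypothetical, and treating every code other than H, K, M and B as a multiplier of 1 is an assumption.

```r
library(dplyr)

# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: only H, K, M and B are interpreted; anything else counts as 1.
exp_to_multiplier <- function(code) {
  code <- toupper(as.character(code))
  case_when(
    code == "H" ~ 1e2,
    code == "K" ~ 1e3,
    code == "M" ~ 1e6,
    code == "B" ~ 1e9,
    TRUE ~ 1
  )
}

# Made-up toy rows, not taken from the NOAA file
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "TORNADO", "HAIL"),
  PROPDMG = c(5, 250, 100, 10),
  PROPDMGEXP = c("B", "K", "K", "M"),
  CROPDMG = c(0, 50, 0, 2),
  CROPDMGEXP = c("", "K", "", "M")
)

toy_damage <- toy %>%
  mutate(property_usd = PROPDMG * exp_to_multiplier(PROPDMGEXP),
         crop_usd = CROPDMG * exp_to_multiplier(CROPDMGEXP)) %>%
  group_by(EVTYPE) %>%
  summarise(total_usd = sum(property_usd + crop_usd, na.rm = TRUE)) %>%
  slice_max(order_by = total_usd, n = 5, with_ties = FALSE)

print(toy_damage)
```

In the toy table the raw sums rank TORNADO first, while the scaled sums rank FLOOD first, so the choice between raw and scaled values can change the ordering of events.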

Greatest cause of injuries, fatalities and total human casualties: tornadoes make a clean sweep, which marks them as a very deadly disaster.

population_data[which.max(population_data$total_casualties),]
## # A tibble: 1 × 4
##   EVTYPE  total_casualties total_fatalities total_injuries
##   <chr>              <dbl>            <dbl>          <dbl>
## 1 TORNADO            96979             5633          91346
population_data[which.max(population_data$total_fatalities),]
## # A tibble: 1 × 4
##   EVTYPE  total_casualties total_fatalities total_injuries
##   <chr>              <dbl>            <dbl>          <dbl>
## 1 TORNADO            96979             5633          91346
population_data[which.max(population_data$total_injuries),]
## # A tibble: 1 × 4
##   EVTYPE  total_casualties total_fatalities total_injuries
##   <chr>              <dbl>            <dbl>          <dbl>
## 1 TORNADO            96979             5633          91346

Graph for 5 largest sources of economic damage:

d1<-ggplot(property_damage_data,aes(x=EVTYPE))+geom_point(aes(y = total_damage, color = "Total Damage")) + 
  geom_point(aes(y = crop_damage, color = "Crop Damage")) +
  geom_point(aes(y = property_damage, color = "Property Damage")) +
  labs(x="Natural Disaster", y="Damage Cost",title="Damage for the top 5 natural disasters")
d1
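The damage plot uses one point layer per measure, each with a hand-written colour label. An alternative is to reshape the summary table to long format so that a single layer takes its colour from the data. This sketch assumes the tidyr package is available; the toy table stands in for property_damage_data and its numbers are made up.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for property_damage_data; the numbers are made up
toy_damage <- data.frame(
  EVTYPE = c("TORNADO", "HAIL", "FLOOD"),
  total_damage = c(300, 120, 90),
  property_damage = c(280, 70, 60),
  crop_damage = c(20, 50, 30)
)

# One row per event type and damage measure
toy_long <- toy_damage %>%
  pivot_longer(cols = c(total_damage, property_damage, crop_damage),
               names_to = "measure", values_to = "damage")

# A single layer; the legend entries come from the measure column
d_long <- ggplot(toy_long, aes(x = EVTYPE, y = damage, color = measure)) +
  geom_point() +
  labs(x = "Natural Disaster", y = "Damage Cost", color = "Measure")
```

Because the legend is generated from the column names, a layer cannot end up with the wrong label.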

Graph for 5 largest sources of human casualties:

p1<-ggplot(population_data,aes(x=EVTYPE))+geom_point(aes(y = (total_casualties), color = "Total Casualties")) + 
  geom_point(aes(y = (total_fatalities), color = "Total Fatalities")) +
  geom_point(aes(y = (total_injuries), color = "Total Injuries")) + 
  labs(x="Natural Disaster", y="Number of given casualty type",title="Casualties for the top 5 natural disasters")
p1

State Results:

State and municipal governments need a more local view than the national totals provide. Grouping by STATE as well as EVTYPE gives the five most harmful event types in each state:

state_population_data <- file %>%
  group_by(STATE, EVTYPE) %>%
  summarise(total_casualties = sum(FATALITIES + INJURIES, na.rm = TRUE),.groups = "drop") %>%
  filter(total_casualties != 0) %>% group_by(STATE) %>% slice_max(order_by = total_casualties, n = 5, with_ties = FALSE)

state_damage_data<-file%>%group_by(STATE, EVTYPE)%>%
  summarise(total_damage = sum(PROPDMG + CROPDMG, na.rm = TRUE),.groups = "drop") %>%
  filter(total_damage != 0) %>% group_by(STATE) %>% slice_max(order_by = total_damage, n = 5, with_ties = FALSE)

g1<-ggplot(state_damage_data,aes(x=STATE,y=total_damage,color=EVTYPE))+geom_point() + 
  labs(x="State", y="Damage Cost",title="Top 5 Damage Per State")
g2<-ggplot(state_population_data,aes(x=STATE,y=total_casualties,color=EVTYPE))+geom_point() + 
  labs(x="State", y="Total Casualties",title="Top 5 Casualties Per State")

combined_plot<-g1+g2
print(combined_plot)
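With up to five event types for every state on one axis, the combined plot can be hard to read. One possible summary, sketched here on a made-up toy table, keeps only the single most harmful event type per state and then counts how often each event type leads. The object names `top_event_per_state` and `leading_events` are illustrative.

```r
library(dplyr)

# Made-up toy rows, not taken from the NOAA file
toy <- data.frame(
  STATE = c("TX", "TX", "TX", "FL", "FL", "OK"),
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HURRICANE", "LIGHTNING", "TORNADO"),
  FATALITIES = c(2, 1, 0, 5, 1, 3),
  INJURIES = c(20, 0, 10, 30, 2, 40)
)

# Same grouping as the state-level code, but keeping 1 event type per state
top_event_per_state <- toy %>%
  group_by(STATE, EVTYPE) %>%
  summarise(total_casualties = sum(FATALITIES + INJURIES, na.rm = TRUE),
            .groups = "drop") %>%
  group_by(STATE) %>%
  slice_max(order_by = total_casualties, n = 1, with_ties = FALSE) %>%
  ungroup()

# How many states each event type leads
leading_events <- top_event_per_state %>% count(EVTYPE, sort = TRUE)

print(leading_events)
```

The same pattern applies to damage by swapping the casualty columns for PROPDMG and CROPDMG.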