This analysis uses NOAA storm data to examine the impact of different weather events in the United States. It covers both economic damage (property and crop losses) and population health impacts (injuries and fatalities). At the national level, the data are grouped only by event type (EVTYPE); state-level summaries are also provided. The top five events are plotted by total damage and by total casualties, and a third, two-panel figure compares the five most harmful events per state by damage and by health impact. All data processing and analysis were done in R, with the raw data loaded directly from the CSV file; no preprocessing was performed outside of this document. The findings indicate that, nationally, tornadoes cause the most injuries and fatalities and also account for the largest recorded property damage, while hail accounts for the largest crop damage; the state-level view shows regional differences. This information may help inform policy decisions regarding disaster preparedness and mitigation.
This step loads the required libraries and the CSV file containing the data for analysis. No further processing is done at this stage, since the file already stores the data in a tabular layout.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(patchwork)
## Warning: package 'patchwork' was built under R version 4.5.1
file<-read.csv("repdata_data_StormData.csv",header=T)
By taking the top 5 causes of economic damage and of human casualties, we can review the most impactful natural disasters.
# Top 5 event types by summed damage (slice_max already returns rows in descending order)
property_damage_data <- file %>% group_by(EVTYPE) %>%
summarise(total_damage = sum(PROPDMG + CROPDMG, na.rm = TRUE), property_damage = sum(PROPDMG, na.rm = TRUE), crop_damage = sum(CROPDMG, na.rm = TRUE)) %>%
slice_max(order_by = total_damage, n = 5, with_ties = FALSE)
# Top 5 event types by summed casualties (fatalities plus injuries)
population_data <- file %>% group_by(EVTYPE) %>%
summarise(total_casualties = sum(FATALITIES + INJURIES, na.rm = TRUE), total_fatalities = sum(FATALITIES, na.rm = TRUE), total_injuries = sum(INJURIES, na.rm = TRUE)) %>%
slice_max(order_by = total_casualties, n = 5, with_ties = FALSE)
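One caveat on the damage totals: in the NOAA storm data, PROPDMG and CROPDMG hold a base figure, and the companion columns PROPDMGEXP and CROPDMGEXP hold a magnitude code (K for thousands, M for millions, B for billions). The sums above add the base figures as they are, so they are not dollar amounts. The following is a minimal sketch of converting to dollars before summing. It assumes that only the K, M and B codes are meaningful and treats every other code as a multiplier of 1. The helper `exp_multiplier` and the toy data frame are hypothetical and are not part of the analysis above.

```r
library(dplyr)

# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: only K, M and B (any case) are interpreted; anything else counts as 1.
exp_multiplier <- function(code) {
  code <- toupper(trimws(as.character(code)))
  case_when(
    code == "K" ~ 1e3,
    code == "M" ~ 1e6,
    code == "B" ~ 1e9,
    TRUE ~ 1
  )
}

# Toy rows standing in for the storm data (values are made up for illustration).
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "TORNADO"),
  PROPDMG = c(2, 500, 25),
  PROPDMGEXP = c("B", "K", "M"),
  CROPDMG = c(0, 10, 0),
  CROPDMGEXP = c("", "K", ""),
  stringsAsFactors = FALSE
)

# Scale each row to dollars, then total per event type.
toy_damage <- toy %>%
  mutate(
    property_usd = PROPDMG * exp_multiplier(PROPDMGEXP),
    crop_usd = CROPDMG * exp_multiplier(CROPDMGEXP)
  ) %>%
  group_by(EVTYPE) %>%
  summarise(total_usd = sum(property_usd + crop_usd, na.rm = TRUE)) %>%
  arrange(desc(total_usd))
```

With a conversion like this the ranking can change, since a single event coded in billions outweighs many events coded in thousands.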
Greatest crop damage, greatest property damage, and greatest total damage among the top five events: interestingly, hail is the leading cause of crop damage even though tornadoes rank first for both property damage and total damage.
property_damage_data[which.max(property_damage_data$crop_damage),]
## # A tibble: 1 × 4
## EVTYPE total_damage property_damage crop_damage
## <chr> <dbl> <dbl> <dbl>
## 1 HAIL 1268290. 688693. 579596.
property_damage_data[which.max(property_damage_data$property_damage),]
## # A tibble: 1 × 4
## EVTYPE total_damage property_damage crop_damage
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 3312277. 3212258. 100019.
property_damage_data[which.max(property_damage_data$total_damage),]
## # A tibble: 1 × 4
## EVTYPE total_damage property_damage crop_damage
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 3312277. 3212258. 100019.
Greatest cause of injuries, fatalities, and total human casualties: tornadoes make a clean sweep, which marks them as a very deadly disaster.
population_data[which.max(population_data$total_casualties),]
## # A tibble: 1 × 4
## EVTYPE total_casualties total_fatalities total_injuries
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 96979 5633 91346
population_data[which.max(population_data$total_fatalities),]
## # A tibble: 1 × 4
## EVTYPE total_casualties total_fatalities total_injuries
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 96979 5633 91346
population_data[which.max(population_data$total_injuries),]
## # A tibble: 1 × 4
## EVTYPE total_casualties total_fatalities total_injuries
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 96979 5633 91346
Graph for 5 largest sources of economic damage:
d1<-ggplot(property_damage_data,aes(x=EVTYPE))+geom_point(aes(y = (total_damage), color = "Total Damage")) +
geom_point(aes(y = (crop_damage), color = "Crop Damage")) +
geom_point(aes(y = (property_damage), color = "Property Damage")) +
labs(x="Natural Disaster", y="Damage Cost",title="Damage for the top 5 natural disasters")
d1
Graph for 5 largest sources of human casualties:
p1<-ggplot(population_data,aes(x=EVTYPE))+geom_point(aes(y = (total_casualties), color = "Total Casualties")) +
geom_point(aes(y = (total_fatalities), color = "Total Fatalities")) +
geom_point(aes(y = (total_injuries), color = "Total Injuries")) +
labs(x="Natural Disaster", y="Number of given casualty type",title="Casualties for the top 5 natural disasters")
p1
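The plots above add one `geom_point` layer per measure. An alternative sketch, assuming the tidyr package is available (it is not loaded in this document), reshapes the summary to long format so that a single layer and a single color mapping cover all measures. The toy data frame and the names `casualties_long` and `p_long` are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy summary with the same column layout as population_data (values are made up).
toy_casualties <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD"),
  total_fatalities = c(10, 4),
  total_injuries = c(90, 6)
) %>%
  mutate(total_casualties = total_fatalities + total_injuries)

# One row per event type and measure instead of one column per measure.
casualties_long <- toy_casualties %>%
  pivot_longer(cols = starts_with("total_"), names_to = "measure", values_to = "count")

# A single layer; the legend entries come from the `measure` column.
p_long <- ggplot(casualties_long, aes(x = EVTYPE, y = count, color = measure)) +
  geom_point() +
  labs(x = "Natural Disaster", y = "Count", color = "Measure")
```

Because the legend labels are taken from the data rather than typed into each layer, this layout avoids one layer being given another layer's label by mistake.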
State and municipal governments need a more local view. The same data can be grouped by state to show the events with the largest impact in each one:
state_population_data <- file %>%
group_by(STATE, EVTYPE) %>%
summarise(total_casualties = sum(FATALITIES + INJURIES, na.rm = TRUE),.groups = "drop") %>%
filter(total_casualties != 0) %>% group_by(STATE) %>% slice_max(order_by = total_casualties, n = 5, with_ties = FALSE)
state_damage_data<-file%>%group_by(STATE, EVTYPE)%>%
summarise(total_damage = sum(PROPDMG + CROPDMG, na.rm = TRUE),.groups = "drop") %>%
filter(total_damage != 0) %>% group_by(STATE) %>% slice_max(order_by = total_damage, n = 5, with_ties = FALSE)
g1<-ggplot(state_damage_data,aes(x=STATE,y=total_damage,color=EVTYPE))+geom_point() +
labs(x="State", y="Damage Cost",title="Top 5 Damage Per State")
g2<-ggplot(state_population_data,aes(x=STATE,y=total_casualties,color=EVTYPE))+geom_point() +
labs(x="State", y="Total Casualties",title="Top 5 Casualty Causes Per State")
combined_plot<-g1+g2
print(combined_plot)
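The per-state selection above relies on `slice_max` being applied within each `STATE` group. As a small illustration on made-up data (the toy data frame and `top2_per_state` are hypothetical), the same pattern with `n = 2` looks like this:

```r
library(dplyr)

# Toy per-state totals (values are made up for illustration).
toy_state <- data.frame(
  STATE = c("TX", "TX", "TX", "OK", "OK"),
  EVTYPE = c("TORNADO", "FLOOD", "HAIL", "TORNADO", "HEAT"),
  total_casualties = c(50, 20, 5, 30, 0)
)

# Drop zero rows, then keep the 2 largest rows within each state.
top2_per_state <- toy_state %>%
  filter(total_casualties != 0) %>%
  group_by(STATE) %>%
  slice_max(order_by = total_casualties, n = 2, with_ties = FALSE) %>%
  ungroup()
```

A state with fewer than `n` non-zero event types simply contributes fewer rows, as OK does here, so the number of points per state in the plots can vary.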