Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
In this report, I analyze the health-related and economic effects of severe weather events in the United States. The effects are shown in tables and plots, sorted by the number of casualties or the amount of damage each event type has caused.
For the health-related analysis, I use the injury and fatality data; for the economic analysis, I use the property and crop damage data.
First, we install and load the necessary libraries:
install.packages("ggplot2")
install.packages("data.table")
install.packages("dplyr")
install.packages("knitr")
install.packages("reshape2")
library(ggplot2)
library(data.table)
library(dplyr)
library(knitr)
library(reshape2)
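Calling `install.packages()` on every run re-downloads packages that may already be present. A minimal sketch of a more defensive setup follows, assuming a hypothetical helper named `missing_packages` (not part of the original report) that reports which packages are absent. The actual install call is left commented out so the sketch has no side effects beyond loading namespaces.

```r
# Hypothetical helper (not from the original report): return the subset of
# `pkgs` whose namespace cannot be loaded, i.e. packages not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("ggplot2", "data.table", "dplyr", "knitr", "reshape2")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Packages to install: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to actually install them
}
```

After the missing packages are installed, the `library()` calls can be run as before.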
Now we download the data from the given link:
if(!file.exists("storm_data")){
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", "storm_data")
}
storm_data <- as.data.table(read.csv(bzfile("storm_data")), keep.rownames=TRUE)
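The loading step relies on `read.csv()` being able to read a bzip2-compressed file. The sketch below illustrates this on a tiny made-up file written to a temporary path, so it does not need the full download. The toy columns and values are assumptions for illustration only.

```r
# Write a small, made-up CSV through a bzip2 connection.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(
  data.frame(EVTYPE = c("TORNADO", "HAIL"), INJURIES = c(3, 0)),
  con,
  row.names = FALSE
)
close(con)

# Read it back the same way the report does, through bzfile().
toy_explicit <- read.csv(bzfile(tmp))

# read.csv() can also be given the path directly; R detects the compression.
toy_direct <- read.csv(tmp)

print(toy_explicit)
```

Both calls should return the same two-row data frame, so the `bzfile()` wrapper in the report is a matter of explicitness rather than necessity.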
If you want to learn more about the dataset you can read more here and here.
Here we aggregate the data to see how many injuries (INJURIES) each event type (EVTYPE) has caused, and select the 10 event types that caused the most injuries.
total_injuries <- aggregate(INJURIES~EVTYPE, storm_data, sum)
total_injuries <- arrange(total_injuries, desc(INJURIES))
kable(total_injuries[1:10,],caption = "Total number of injuries by event type", align = 'cc')
| EVTYPE | INJURIES |
|---|---|
| TORNADO | 91346 |
| TSTM WIND | 6957 |
| FLOOD | 6789 |
| EXCESSIVE HEAT | 6525 |
| LIGHTNING | 5230 |
| HEAT | 2100 |
| ICE STORM | 1975 |
| FLASH FLOOD | 1777 |
| THUNDERSTORM WIND | 1488 |
| HAIL | 1361 |
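To make the aggregation step concrete, here is a small worked example on made-up data. It uses base R only (`order()` in place of dplyr's `arrange()`) so that it runs without extra packages; the toy values are assumptions and are not taken from the NOAA data.

```r
# Made-up records: one row per reported event.
toy <- data.frame(
  EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  INJURIES = c(10, 2, 5, 0, 3),
  stringsAsFactors = FALSE
)

# Sum injuries within each event type, as in the report.
toy_totals <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)

# Sort from most to least injuries (base R equivalent of arrange(desc(...))).
toy_totals <- toy_totals[order(-toy_totals$INJURIES), ]
rownames(toy_totals) <- NULL

print(toy_totals)
```

Note that the formula interface of `aggregate()` drops rows with missing values by default, which matters if a column contains `NA`.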
Now we do the same for fatalities (FATALITIES) to see how many deaths each event type (EVTYPE) has caused.
total_fatalities <- aggregate(FATALITIES~EVTYPE, storm_data, sum)
total_fatalities <- arrange(total_fatalities, desc(FATALITIES))
kable(total_fatalities[1:10,],
caption = "Total number of fatalities by event type", align = 'cc')
| EVTYPE | FATALITIES |
|---|---|
| TORNADO | 5633 |
| EXCESSIVE HEAT | 1903 |
| FLASH FLOOD | 978 |
| HEAT | 937 |
| LIGHTNING | 816 |
| TSTM WIND | 504 |
| FLOOD | 470 |
| RIP CURRENT | 368 |
| HIGH WIND | 248 |
| AVALANCHE | 224 |
health <- merge(total_fatalities, total_injuries, by.x = "EVTYPE", by.y = "EVTYPE")
health <- arrange(health, desc(FATALITIES+INJURIES))[1:10,]
health <- melt(health, id.vars="EVTYPE", variable.name = "outcome")
graph <- ggplot(health, aes(x=reorder(EVTYPE,-value),y=value))+
geom_bar(stat = 'identity',aes(fill=outcome))+
facet_grid(~outcome)+
theme(axis.text.x = element_text(angle = 45, hjust = 1))+
xlab('Event type')+
ylab('Number of Fatalities and Injuries')+
ggtitle('Top 10 Natural disasters in US by health-related factors')+
theme(plot.title = element_text(hjust = 0.5))+
theme(legend.position = "none")
graph
### Question 2. Across the United States, which types of events have the greatest economic consequences?
#### Aggregating the data
Here we aggregate the data to see how much property damage (PROPDMG) each event type (EVTYPE) has caused, and select the 10 event types that caused the most damage.
total_prop_dmg <- aggregate(PROPDMG~EVTYPE, storm_data, sum)
total_prop_dmg <- arrange(total_prop_dmg, desc(PROPDMG))
kable(total_prop_dmg[1:10,],caption = "Total amount of property damage by event type", align = 'cc')
| EVTYPE | PROPDMG |
|---|---|
| TORNADO | 3212258.2 |
| FLASH FLOOD | 1420124.6 |
| TSTM WIND | 1335965.6 |
| FLOOD | 899938.5 |
| THUNDERSTORM WIND | 876844.2 |
| HAIL | 688693.4 |
| LIGHTNING | 603351.8 |
| THUNDERSTORM WINDS | 446293.2 |
| HIGH WIND | 324731.6 |
| WINTER STORM | 132720.6 |
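The PROPDMG totals in the table above are sums of the raw magnitude column. The NOAA file also carries a PROPDMGEXP column with unit codes such as K, M and B (thousands, millions, billions), which this report does not apply. The sketch below shows one way the magnitudes could be scaled to dollars on made-up rows. Mapping unrecognized or empty codes to a multiplier of 1 is an assumption made here for illustration, not something the report or NOAA prescribes, and `PROPDMG_USD` is a hypothetical column name.

```r
# Made-up rows; in the real data PROPDMGEXP contains further codes as well.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL", "LIGHTNING"),
  PROPDMG    = c(25, 2.5, 500, 10),
  PROPDMGEXP = c("K", "B", "M", ""),
  stringsAsFactors = FALSE
)

# Lookup table for the documented letter codes.
exp_multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code; anything not in the table comes back as NA.
mult <- unname(exp_multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1  # assumption: treat unknown or empty codes as 1

# Hypothetical new column holding the damage in dollars.
toy$PROPDMG_USD <- toy$PROPDMG * mult

print(toy)
```

Aggregating a scaled column like this instead of the raw PROPDMG values may change the ranking of event types, since a single billion-dollar event can outweigh many small ones.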
Now we do the same for crop damage (CROPDMG) to see how much each event type (EVTYPE) has caused.
total_crop_dmg <- aggregate(CROPDMG~EVTYPE, storm_data, sum)
total_crop_dmg <- arrange(total_crop_dmg, desc(CROPDMG))
kable(total_crop_dmg[1:10,],
caption = "Total amount of crop damage by event type", align = 'cc')
| EVTYPE | CROPDMG |
|---|---|
| HAIL | 579596.28 |
| FLASH FLOOD | 179200.46 |
| FLOOD | 168037.88 |
| TSTM WIND | 109202.60 |
| TORNADO | 100018.52 |
| THUNDERSTORM WIND | 66791.45 |
| DROUGHT | 33898.62 |
| THUNDERSTORM WINDS | 18684.93 |
| HIGH WIND | 17283.21 |
| HEAVY RAIN | 11122.80 |
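The tables above list TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate rows, although they appear to describe the same kind of event. The following is a minimal sketch of how such labels could be merged before aggregating. `normalize_evtype` is a hypothetical helper, and its three rules (upper-casing and trimming, expanding TSTM, dropping a trailing plural on WINDS) are illustrative assumptions rather than a complete cleaning scheme.

```r
# Hypothetical helper: map a few spelling variants onto one label.
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))               # unify case and strip stray spaces
  x <- sub("^TSTM", "THUNDERSTORM", x)  # expand the abbreviation
  x <- sub("WINDS$", "WIND", x)         # treat plural and singular alike
  x
}

# Made-up rows using label variants seen in the tables above.
toy <- data.frame(
  EVTYPE  = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", "Hail "),
  CROPDMG = c(10, 5, 2, 7),
  stringsAsFactors = FALSE
)

toy$EVTYPE <- normalize_evtype(toy$EVTYPE)
merged <- aggregate(CROPDMG ~ EVTYPE, data = toy, FUN = sum)

print(merged)
```

With the variants merged, the three thunderstorm rows collapse into a single total, which could move that event type up in the rankings.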
econ <- merge(total_crop_dmg, total_prop_dmg, by.x = "EVTYPE", by.y = "EVTYPE")
econ <- arrange(econ, desc(PROPDMG+CROPDMG))[1:10,]
econ <- melt(econ, id.vars="EVTYPE", variable.name = "outcome")
graph <- ggplot(econ, aes(x=reorder(EVTYPE,-value),y=value))+
geom_bar(stat = 'identity',aes(fill=outcome))+
facet_grid(.~outcome)+
theme(axis.text.x = element_text(angle = 45, hjust = 1))+
xlab('Event type')+
ylab('Amount of damage')+
ggtitle('Top 10 Natural disasters in US by Economic factors')+
theme(plot.title = element_text(hjust = 0.5))+
theme(legend.position = "none")
graph
Using the data and plots generated in this analysis, we can conclude that tornadoes are the most destructive and deadly weather event in the United States: they have caused the most injuries and fatalities, as well as the largest amount of property damage. Hail is the most destructive event for crops, as it has caused the largest amount of crop damage.