This report uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The objective is to examine the impact of weather events in the United States, specifically their impact on public health and the economic damage that results from them.
The data was downloaded from the URL shown in the code below. The file is a bzip2 archive containing a compressed CSV file. It was downloaded using R and then read into R with the read.csv command. Three R packages were also used to process the data: dplyr, ggplot2, and scales.
Please see the code below:
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.3
library(scales)
FileURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(FileURL, "StormData.bz2")
StormData <- read.csv(bzfile("StormData.bz2"))
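The loading step relies on read.csv accepting a bzfile connection, so the archive never has to be unpacked by hand. The following is a minimal sketch of that round trip on a small, made-up data frame; the toy values and the temporary file are assumptions for illustration and are not part of the storm data.

```r
# Hypothetical toy data standing in for the storm database
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

# Write the toy data to a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() decompresses transparently when given a bzfile() connection
roundtrip <- read.csv(bzfile(tmp))
print(roundtrip)
```

The same pattern applies to the real file: only the path passed to bzfile changes.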
To see which events affect the public’s health the most, I extracted the variables “FATALITIES” and “INJURIES” and added them together. The combined values were then grouped by event type and totalled. Event types with no fatalities or injuries were removed. Then the event types at or below the 95th percentile of combined fatalities and injuries were filtered out. Finally, the data was arranged so that a chart could be created showing the results in descending order. Please see the code below:
PubHealth <- select(StormData, EVTYPE, FATALITIES, INJURIES)
PubHealth <- summarise(group_by(PubHealth, EVTYPE), FatalitiesAndInjuries = sum(FATALITIES, INJURIES))
PubHealth <- filter(PubHealth, FatalitiesAndInjuries > 0)
PubHealth <- filter(PubHealth, FatalitiesAndInjuries > quantile(FatalitiesAndInjuries, probs = 0.95))
PubHealth <- arrange(PubHealth, desc(FatalitiesAndInjuries))
PubHealth$EVTYPE <- factor(PubHealth$EVTYPE, levels = PubHealth$EVTYPE[order(PubHealth$FatalitiesAndInjuries)])
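To make the percentile filter concrete, here is a small sketch in base R on invented totals. The 100 event names and their totals of 1 to 100 are hypothetical; only the comparison against quantile() mirrors the code above.

```r
# Hypothetical per-event totals: 100 event types with totals 1..100
totals <- data.frame(
  EVTYPE = paste0("EVENT_", 1:100),
  Total  = 1:100,
  stringsAsFactors = FALSE
)

# 95th percentile of the totals (R's default quantile type interpolates)
cutoff <- quantile(totals$Total, probs = 0.95)

# Keep only event types strictly above the cutoff, largest first
top <- totals[totals$Total > cutoff, ]
top <- top[order(-top$Total), ]
print(top)
```

With these toy values the cutoff falls between 95 and 96, so five event types remain (EVENT_100 down to EVENT_96). On the real data the number kept depends on how many distinct event types survive the earlier zero filter.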
To assess economic impact, I extracted the “PROPDMG”, “PROPDMGEXP”, “CROPDMG”, and “CROPDMGEXP” variables. I split the property and crop damage records by their exponent code (“K” for thousands, “M” for millions, or “B” for billions) and multiplied each damage value by the matching factor to obtain the cost in dollars. Only records carrying one of these three codes for both property and crop damage are kept by this step. The data set was then reconstructed, and the property and crop damage costs were added together and totalled by weather event. Again, the event types at or below the 95th percentile were filtered out to show the most damaging events in terms of cost, and the data was arranged so that a chart would show the costs in descending order. Please see the code below:
EconDam <- select(StormData, EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
PROPK <- filter(EconDam, PROPDMGEXP == "K")
PROPK <- mutate(PROPK, AdjPROPDMG = PROPDMG * 1000)
PROPM <- filter(EconDam, PROPDMGEXP == "M")
PROPM <- mutate(PROPM, AdjPROPDMG = PROPDMG * 1000000)
PROPB <- filter(EconDam, PROPDMGEXP == "B")
PROPB <- mutate(PROPB, AdjPROPDMG = PROPDMG * 1000000000)
EconDam <- rbind(PROPB, PROPK, PROPM)
CROPK <- filter(EconDam, CROPDMGEXP == "K")
CROPK <- mutate(CROPK, AdjCROPDMG = CROPDMG * 1000)
CROPM <- filter(EconDam, CROPDMGEXP == "M")
CROPM <- mutate(CROPM, AdjCROPDMG = CROPDMG * 1000000)
CROPB <- filter(EconDam, CROPDMGEXP == "B")
CROPB <- mutate(CROPB, AdjCROPDMG = CROPDMG * 1000000000)
EconDam <- rbind(CROPB, CROPK, CROPM)
EconDam <- select(EconDam, EVTYPE, AdjPROPDMG, AdjCROPDMG)
EconDam <- summarise(group_by(EconDam, EVTYPE), TotalDMG = sum(AdjPROPDMG, AdjCROPDMG))
EconDam <- filter(EconDam, TotalDMG > quantile(TotalDMG, probs = 0.95))
EconDam <- arrange(EconDam, desc(TotalDMG))
EconDam$EVTYPE <- factor(EconDam$EVTYPE, levels = EconDam$EVTYPE[order(EconDam$TotalDMG)])
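As an alternative to splitting and re-binding the data by exponent code, the conversion can be sketched with a named lookup vector. This is not the method used in this report: it keeps records whose exponent is blank or unrecognised (treating them as zero dollars) and accepts lower-case codes, both of which are assumptions of the sketch, so totals computed this way would not match the figures reported here. The toy records and the helper name to_dollars are hypothetical.

```r
# Hypothetical toy records; the column names mirror the storm data
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(2.5, 10, 1, 5),
  PROPDMGEXP = c("M", "K", "B", ""),
  CROPDMG    = c(0, 3, 0, 50),
  CROPDMGEXP = c("", "K", "", "K"),
  stringsAsFactors = FALSE
)

# Named lookup vector: exponent code -> multiplier
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Assumption: any code other than K, M or B (after upper-casing) counts as zero
to_dollars <- function(value, code) {
  m <- multiplier[toupper(as.character(code))]
  m[is.na(m)] <- 0
  unname(value * m)
}

# Property plus crop damage in dollars, then totalled per event type
toy$TotalDMG <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP) +
  to_dollars(toy$CROPDMG, toy$CROPDMGEXP)
by_event <- aggregate(TotalDMG ~ EVTYPE, data = toy, FUN = sum)
print(by_event)
```

In this toy example the HAIL record has a blank property exponent but still contributes its 50 thousand dollars of crop damage, which a filter on both exponent columns would have discarded.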
The following code generates two charts illustrating the results of the data processing:
TopEventPlot <- ggplot(PubHealth, aes(x = EVTYPE, y = FatalitiesAndInjuries)) +
  geom_bar(stat = "identity", fill = "lightcoral") +
  labs(title = "U.S. Population Health's Most Harmful Events") +
  labs(x = "Weather Event Type") +
  labs(y = "Total Fatalities and Injuries") +
  coord_flip() +
  theme_bw()
TopEconDomPlot <- ggplot(EconDam, aes(x = EVTYPE, y = TotalDMG)) +
  geom_bar(stat = "identity", fill = "lightgoldenrod") +
  geom_text(aes(label = dollar(TotalDMG)), position = position_nudge(y = 0.9)) +
  labs(title = "Top Weather Events with Economic Consequences in the U.S.") +
  labs(x = "Weather Event Type") +
  labs(y = "Total Amount of Damage in Dollars") +
  coord_flip() +
  theme_bw()
print(TopEventPlot)
print(TopEconDomPlot)
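Both charts depend on EVTYPE being a factor whose levels are sorted by the plotted total, because ggplot2 draws discrete categories in level order and, after coord_flip(), places the first level at the bottom. The sketch below shows the reordering on three invented totals; the values are hypothetical and no plotting package is needed to run it.

```r
# Hypothetical totals for three event types
totals <- c(FLOOD = 30, TORNADO = 90, HAIL = 10)

# Sort the level names by ascending total, as done before plotting above
ev <- factor(names(totals), levels = names(totals)[order(totals)])

# Levels run from the smallest total to the largest
print(levels(ev))
```

Because the smallest total becomes the first level, the largest bar ends up at the top of the flipped chart.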
As you can see, tornadoes are the most dangerous event in terms of public health, and they are also the third most economically damaging. Floods are the most costly overall in terms of economic damage, causing $138,007,444,500 in damage.