This brief report is examining the most damaging types of weather events in terms of fatalities and economic impact throughout the USA from 1950 to 2011. Severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The data is downloaded and create a data frame
zipUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
zipFile <- "repdata_data_Storm_Data.csv.bz2"
if (!file.exists(zipFile)) {
download.file(zipUrl, zipFile, mode = "wb")
}
data <- read.csv(zipFile, stringsAsFactors = FALSE)
The data frame are being subsetted so as to be kept only the variables of our interest
data <- subset(data, select=c(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP))
Reading the units and units abbreviations used by the initial data frame regarding the property damages
unique(data$PROPDMGEXP)
## [1] "K" "M" "" "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-"
## [18] "1" "8"
We are transforming them into decimal powers, zeroing parallely the ones with no value “?”,“”,“-”
data$PROPDMGEXP[data$PROPDMGEXP == ""] <- 1
data$PROPDMGEXP[data$PROPDMGEXP == "-"] <- 0
data$PROPDMGEXP[data$PROPDMGEXP == "?"] <- 0
data$PROPDMGEXP[data$PROPDMGEXP == "+"] <- 0
data$PROPDMGEXP[data$PROPDMGEXP == "0"] <- 10^0
data$PROPDMGEXP[data$PROPDMGEXP == "1"] <- 10^1
data$PROPDMGEXP[data$PROPDMGEXP == "2"] <- 10^2
data$PROPDMGEXP[data$PROPDMGEXP == "3"] <- 10^3
data$PROPDMGEXP[data$PROPDMGEXP == "4"] <- 10^4
data$PROPDMGEXP[data$PROPDMGEXP == "5"] <- 10^5
data$PROPDMGEXP[data$PROPDMGEXP == "6"] <- 10^6
data$PROPDMGEXP[data$PROPDMGEXP == "7"] <- 10^7
data$PROPDMGEXP[data$PROPDMGEXP == "8"] <- 10^8
data$PROPDMGEXP[data$PROPDMGEXP == "B"] <- 10^9
data$PROPDMGEXP[data$PROPDMGEXP == "h"] <- 10^2
data$PROPDMGEXP[data$PROPDMGEXP == "H"] <- 10^2
data$PROPDMGEXP[data$PROPDMGEXP == "K"] <- 10^3
data$PROPDMGEXP[data$PROPDMGEXP == "m"] <- 10^6
data$PROPDMGEXP[data$PROPDMGEXP == "M"] <- 10^6
Reading the units and units abbreviations used by the initial data frame regarding the crops damages
unique(data$CROPDMGEXP)
## [1] "" "M" "K" "m" "B" "?" "0" "k" "2"
We are transforming them into decimal powers, zeroing parallely the ones with no value “?”
data$CROPDMGEXP[data$CROPDMGEXP == ""] <- 1
data$CROPDMGEXP[data$CROPDMGEXP == "M"] <- 10^6
data$CROPDMGEXP[data$CROPDMGEXP == "K"] <- 10^3
data$CROPDMGEXP[data$CROPDMGEXP == "m"] <- 10^6
data$CROPDMGEXP[data$CROPDMGEXP == "B"] <- 10^9
data$CROPDMGEXP[data$CROPDMGEXP == "?"] <- 0
data$CROPDMGEXP[data$CROPDMGEXP == "0"] <- 10^0
data$CROPDMGEXP[data$CROPDMGEXP == "k"] <- 10^3
data$CROPDMGEXP[data$CROPDMGEXP == "2"] <- 10^2
We are creating a new variable multiplying the numbers with the new units
data$PROPDMG <- as.numeric(data$PROPDMGEXP) * as.numeric(data$PROPDMG)
data$CROPDMG <- as.numeric(data$CROPDMGEXP) * as.numeric(data$CROPDMG)
We are subsetting again, keeping only the new variable and deleting the other two (numbers and units)
data <- subset(data, select=-c(PROPDMGEXP, CROPDMGEXP))
We are creating four new data frames summarizing the values of property nd crops damage, and human fatalities and injuries by each different type of weather event
property_damage <- aggregate(data$PROPDMG,by=list(data$EVTYPE),FUN=sum)
crops_damage <- aggregate(data$CROPDMG,by=list(data$EVTYPE),FUN=sum)
fatalities <- aggregate(data$FATALITIES,by=list(data$EVTYPE),FUN=sum)
injuries <- aggregate(data$INJURIES,by=list(data$EVTYPE),FUN=sum)
# Loadind dplyr library in order to use arrange function
library(dplyr)
We are arranging the four final data frames in decreasing order of damages and casualties
property_damage <- arrange(property_damage, desc(x))
crops_damage <- arrange(crops_damage, desc(x))
fatalities <- arrange(fatalities, desc(x))
injuries <- arrange(injuries, desc(x))
We select for the purposes of our presentation onlyt the first 20 obs for each kind of damages/casualties
property_damage20 <- property_damage[1:20,]
crops_damage20 <- crops_damage[1:20,]
fatalities20 <- fatalities[1:20,]
injuries20 <- injuries[1:20,]
par(mfrow = c(1, 2), mar = c(12.5, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.7)
barplot(fatalities20$x, las = 3, names.arg = fatalities20$Group.1, main = "Weather Events With The Top 20 Highest Fatalities",
ylab = "number of fatalities", col = "red")
barplot(injuries20$x, las = 3, names.arg = injuries20$Group.1, main = "Weather Events With the Top 20 Highest Injuries",
ylab = "number of injuries", col = "red")
The weather events with the greatest human casualties are tornados (and thunderstorm winds), floods (and flash-floods) and excessive heat.
par(mfrow = c(1, 2), mar = c(13.5, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.7)
barplot(property_damage20$x/(10^9), las = 3, names.arg = property_damage20$Group.1,
main = "Top 20 Events with Greatest Property Damages", ylab = "Cost of damage ($ billions)",
col = "blue")
barplot(crops_damage20$x/(10^9), las = 3, names.arg = property_damage20$Group.1,
main = "Top 20 Events With Greatest Crop Damages", ylab = "Cost of damage ($ billions)",
col = "blue")
The weather events have the greatest economic consequences are: flood, drought, tornado, typhoons and storm surges.
Across the USA, floods, tornados, typhoons and storm surges have caused the greatest damage to properties. Drought and flood are the causes of greatest damage to crops.