Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The goal of this analysis is to find the weather events that are most harmful to population health and the weather events that have the greatest economic consequences.
Reading the Storm Data
data <- read.csv("repdata-data-StormData.csv")
Create a new data set with only the relevant columns
dataset <- data[, c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]
Exploring the property damage exponent
unique(dataset$PROPDMGEXP)
## [1] K M B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
## Levels: - ? + 0 1 2 3 4 5 6 7 8 B h H K m M
Converting the property exponent data to calculable values
dataset$PROPDMGEXP <- as.character(dataset$PROPDMGEXP)
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="K"] <- 1000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="M" | dataset$PROPDMGEXP =="m"] <- 1000000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="B"] <- 1000000000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="H" | dataset$PROPDMGEXP=="h"] <- 100
# Map "1" before "0"/"" so that values just set to 1 are not rewritten to 10
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="1"] <- 10
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="0" | dataset$PROPDMGEXP==""] <- 1
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="2"] <- 100
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="3"] <- 1000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="4"] <- 10000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="5"] <- 100000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="6"] <- 1000000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="7"] <- 10000000
dataset$PROPDMGEXP[dataset$PROPDMGEXP=="8"] <- 100000000
dataset$PROPDMGEXP[dataset$PROPDMGEXP == "+" | dataset$PROPDMGEXP == "?" | dataset$PROPDMGEXP == "-"] <- 0
dataset$PROPDMGEXP <- as.numeric(dataset$PROPDMGEXP)
Compute the property damage value
dataset$PROPDMGVAL <- dataset$PROPDMG * dataset$PROPDMGEXP
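Chained replacements on a character column are order-sensitive: a value rewritten to "1" by one line can be matched again by a later line that tests for "1". One way to avoid this is a single lookup table. The following is a minimal sketch, assuming the same symbol-to-multiplier mapping as in the steps above; the names `exp_map` and `exp_to_multiplier` are hypothetical and not part of the original analysis.

```r
# Hypothetical lookup table: exponent symbol -> multiplier (same mapping as above)
exp_map <- c("H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9,
             "0" = 1,   "1" = 1e1, "2" = 1e2, "3" = 1e3, "4" = 1e4,
             "5" = 1e5, "6" = 1e6, "7" = 1e7, "8" = 1e8,
             "+" = 0,   "?" = 0,   "-" = 0)

# Hypothetical helper: convert a vector of exponent symbols to multipliers
exp_to_multiplier <- function(x) {
  x <- toupper(as.character(x))   # treat "h", "k", "m" like "H", "K", "M"
  out <- unname(exp_map[x])       # one lookup per element; unknown symbols give NA
  out[x %in% ""] <- 1             # an empty exponent leaves the value unscaled
  out
}

# Each symbol is looked up exactly once, so no value is converted twice
exp_to_multiplier(c("K", "m", "", "0", "1", "?"))
```

Because every element is translated in one step, the result does not depend on the order of the table entries. Symbols missing from the table come back as `NA`, which makes unexpected codes easy to spot.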
Exploring the crop damage exponent
unique(dataset$CROPDMGEXP)
## [1] M K m B ? 0 k 2
## Levels: ? 0 2 B k K m M
Converting the crop exponent data to calculable values
dataset$CROPDMGEXP <- as.character(dataset$CROPDMGEXP)
dataset$CROPDMGEXP[dataset$CROPDMGEXP=="0" | dataset$CROPDMGEXP==""] <- 1
dataset$CROPDMGEXP[dataset$CROPDMGEXP=="2"] <- 100
dataset$CROPDMGEXP[dataset$CROPDMGEXP=="B"] <- 1000000000
dataset$CROPDMGEXP[dataset$CROPDMGEXP=="K" |dataset$CROPDMGEXP =="k"] <- 1000
dataset$CROPDMGEXP[dataset$CROPDMGEXP=="M" | dataset$CROPDMGEXP == "m"] <- 1000000
dataset$CROPDMGEXP[dataset$CROPDMGEXP=="?"] <- 0
dataset$CROPDMGEXP <- as.numeric(dataset$CROPDMGEXP)
Compute the crop damage value
dataset$CROPDMGVAL <- dataset$CROPDMG * dataset$CROPDMGEXP
Sum fatalities, injuries, and damage values by event type
fatal <- aggregate(FATALITIES ~ EVTYPE, dataset, sum)
inj <- aggregate(INJURIES ~ EVTYPE, dataset, sum)
prop <- aggregate(PROPDMGVAL ~ EVTYPE, dataset, sum)
crop <- aggregate(CROPDMGVAL ~ EVTYPE, dataset, sum)
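To illustrate what these `aggregate()` calls return, here is a small sketch on made-up data (the values in `toy` are invented for illustration and are not from the NOAA database). It also shows that `cbind()` on the left-hand side of the formula can sum several columns in one call.

```r
# Toy data (made-up values, not from the NOAA database)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(3, 1, 2, 0, 4),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# One row per event type, with each measure summed within the group
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
print(totals)
```

Note that the formula interface of `aggregate()` drops rows with missing values by default (`na.action = na.omit`), so with `cbind()` a row is excluded if any of the listed columns is `NA`.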
The top 10 events with the highest fatalities and injuries
topfatal <- fatal[order(fatal$FATALITIES,decreasing=TRUE)[1:10],]
topinj <- inj[order(inj$INJURIES,decreasing=TRUE)[1:10],]
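The same ranking pattern is used for every measure, so it can be wrapped in a small helper. This is a sketch only; `top_n_by` is a hypothetical name, and the toy values are made up. Using `head()` instead of indexing with `[1:10]` avoids rows of `NA` when a table has fewer than `n` rows.

```r
# Hypothetical helper: the n rows of df with the largest values in `column`
top_n_by <- function(df, column, n = 10) {
  ranked <- df[order(df[[column]], decreasing = TRUE), ]
  head(ranked, n)   # returns at most n rows, never padded with NA
}

# Toy totals (made-up values, not from the NOAA database)
toy_totals <- data.frame(
  EVTYPE     = c("FLOOD", "HAIL", "TORNADO", "HEAT"),
  FATALITIES = c(5, 0, 9, 7)
)

top_n_by(toy_totals, "FATALITIES", n = 2)
```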
Plots for the top 10 events
par(mfrow=c(1,2), mar = c(8,4,3,1), mgp= c(3,1,0), cex = 0.8,cex.axis = 0.7,cex.main=1)
barplot(topfatal$FATALITIES, las = 3, col = "light blue", names.arg = topfatal$EVTYPE, main = "Weather events with the highest fatalities", ylab = "Number of fatalities")
barplot(topinj$INJURIES, las = 3, col = "light blue", names.arg = topinj$EVTYPE, main = "Weather events with the highest injuries",ylab = "Number of injuries")
Based on these two plots, tornadoes were the most harmful to population health, in terms of both fatalities and injuries.
The top 10 events with the highest damage value
topprop <- prop[order(prop$PROPDMGVAL,decreasing=TRUE)[1:10],]
topcrop <- crop[order(crop$CROPDMGVAL,decreasing=TRUE)[1:10],]
Plots for the top 10 events
par(mfrow=c(1,2), mar = c(8,4,3,1), mgp= c(3,1,0), cex = 0.8, cex.axis = 0.7, cex.main=1)
barplot(topprop$PROPDMGVAL/(10^9), las = 3, col = "light blue", names.arg = topprop$EVTYPE, main = "Weather events with the highest property damage", ylab = "Property damage ($ Billion)")
barplot(topcrop$CROPDMGVAL/(10^9), las = 3, col = "light blue", names.arg = topcrop$EVTYPE, main = "Weather events with the highest crop damage",ylab = "Crop damage ($ Billion)")
Based on these plots, floods caused the greatest property damage and droughts caused the greatest crop damage across the United States.