This report identifies which types of weather event have the greatest economic consequences and the greatest impact on public health across the United States. The data come from the NOAA storm database and are aggregated by event type for damage and health indicators. The results show that tornadoes have the greatest human cost, while floods have the greatest economic consequences.
# Download zip file if it doesn't exist in the working directory.
if (!file.exists("StormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "StormData.csv.bz2", method = "curl")
}
# Read file
stormData <- read.csv("StormData.csv.bz2", stringsAsFactors = FALSE)
EVTYPE - the type of the weather phenomena event.
FATALITIES - number of fatalities.
INJURIES - number of injuries.
PROPDMG - estimate of property damage.
PROPDMGEXP - magnitude / exponent of property damage.
CROPDMG - estimate of crop damage.
CROPDMGEXP - magnitude / exponent of crop damage.
Start by loading the dplyr library, then select only the columns that we want to use.
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.0.2
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
stormData <- select(stormData, EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
# Show the structure of the dataset.
str(stormData)
## 'data.frame': 902297 obs. of 7 variables:
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
# Create a conversion table
expFac <- data.frame(c("", "B", "m", "M", "K", "H", "h", "1", "2", "3", "4", "5", "6", "7", "8", "0", "+", "-", "?", "k"),
                     c(1, 1e+09, 1e+06, 1e+06, 1000, 100, 100, 10, 100, 1000, 10000, 1e+05, 1e+06, 1e+07, 1e+08,
                       1, 0, 0, 0, 1000))
# Extract property and crop damage data
PD <- stormData$PROPDMG
PDE <- stormData$PROPDMGEXP
CD <- stormData$CROPDMG
CDE <- stormData$CROPDMGEXP
# Create new variables and populate the value of the damage to property and crops
rows <- length(PD)
PROPDMGVAL <- 1:rows
CROPDMGVAL <- 1:rows
for (i in 1:rows) {
  PROPDMGVAL[i] <- PD[i] * expFac[expFac[, 1] == PDE[i], 2]
  CROPDMGVAL[i] <- CD[i] * expFac[expFac[, 1] == CDE[i], 2]
}
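The row-by-row lookup above can also be expressed as a single vectorized step. The following is a minimal sketch on made-up toy values, not the code used to produce the results in this report; the names `toy`, `codes` and `factors` are hypothetical, and only a subset of the exponent codes from the conversion table is included.

```r
# Toy stand-in for the PROPDMG / PROPDMGEXP columns (values are made up).
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 3, 10),
  PROPDMGEXP = c("K", "M", "B", "", "?"),
  stringsAsFactors = FALSE
)

# Exponent codes and their multipliers, kept as two parallel vectors.
# match() is used instead of a named vector because an empty string
# cannot be looked up by name in R.
codes   <- c("", "B", "M", "m", "K", "k", "H", "h", "?")
factors <- c(1, 1e9, 1e6, 1e6, 1e3, 1e3, 100, 100, 0)

# match() returns the position of each code in `codes`; a code that is
# not in the table would yield NA rather than an error.
toy$PROPDMGVAL <- toy$PROPDMG * factors[match(toy$PROPDMGEXP, codes)]

print(toy)
```

In this toy data the first row works out to 25 * 1000 = 25000 and the row with an empty exponent keeps its raw value of 3. Because unmatched codes become NA here, checking the result with `anyNA()` is one way to confirm that the table covers every code in the data.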
# Add values of damages to stormData
stormData <- cbind(stormData, PROPDMGVAL)
stormData <- cbind(stormData, CROPDMGVAL)
stormData$HCOST <- stormData$FATALITIES + stormData$INJURIES
stormData$ECOST <- stormData$PROPDMGVAL + stormData$CROPDMGVAL
h_cost <- subset(stormData, select = c(EVTYPE, FATALITIES, INJURIES, HCOST ))
e_cost <- subset(stormData, select = c(EVTYPE, PROPDMGVAL, CROPDMGVAL, ECOST))
# Aggregates for fatalities and injuries
totFatalities <- aggregate(FATALITIES ~ EVTYPE, data = h_cost, FUN = sum)
totFatalities <- arrange(totFatalities, desc(FATALITIES))
totInjuries <- aggregate(INJURIES ~ EVTYPE, data = h_cost, FUN = sum)
totInjuries <- arrange(totInjuries, desc(INJURIES))
totHCost <- aggregate(HCOST ~ EVTYPE, data = h_cost, FUN = sum)
totHCost <- arrange(totHCost, desc(HCOST))
# Aggregates for property and crop damage
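# Use e_cost here: h_cost has no damage columns, so the formula would only
# resolve PROPDMGVAL / CROPDMGVAL through same-named global vectors.
totPropDmg <- aggregate(PROPDMGVAL ~ EVTYPE, data = e_cost, FUN = sum)
totPropDmg <- arrange(totPropDmg, desc(PROPDMGVAL))
totCropDmg <- aggregate(CROPDMGVAL ~ EVTYPE, data = e_cost, FUN = sum)
totCropDmg <- arrange(totCropDmg, desc(CROPDMGVAL))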
totECost <- aggregate(ECOST ~ EVTYPE, data = e_cost, FUN = sum)
totECost <- arrange(totECost, desc(ECOST))
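The per-column aggregations above could also be collapsed into one call by putting several columns on the left-hand side of the formula with `cbind()`. This is a minimal base R sketch on made-up toy values, offered as an alternative rather than as the method used for the tables in this report; `toy` and `totals` are hypothetical names.

```r
# Toy stand-in for the h_cost data frame (values are made up).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(1, 0, 2, 3, 1),
  INJURIES   = c(10, 2, 5, 0, 1),
  stringsAsFactors = FALSE
)
toy$HCOST <- toy$FATALITIES + toy$INJURIES

# Sum all three measures per event type in a single aggregate() call.
totals <- aggregate(cbind(FATALITIES, INJURIES, HCOST) ~ EVTYPE,
                    data = toy, FUN = sum)

# Sort by combined human cost, largest first (base R equivalent of
# arrange(totals, desc(HCOST))).
totals <- totals[order(-totals$HCOST), ]

print(totals)
```

One difference to keep in mind: the formula interface drops a row if any of the combined columns is NA, so the single-call form and the separate calls can disagree when the data contain missing values.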
head(totFatalities)
## EVTYPE FATALITIES
## 1 TORNADO 5633
## 2 EXCESSIVE HEAT 1903
## 3 FLASH FLOOD 978
## 4 HEAT 937
## 5 LIGHTNING 816
## 6 TSTM WIND 504
head(totInjuries)
## EVTYPE INJURIES
## 1 TORNADO 91346
## 2 TSTM WIND 6957
## 3 FLOOD 6789
## 4 EXCESSIVE HEAT 6525
## 5 LIGHTNING 5230
## 6 HEAT 2100
head(totHCost)
## EVTYPE HCOST
## 1 TORNADO 96979
## 2 EXCESSIVE HEAT 8428
## 3 TSTM WIND 7461
## 4 FLOOD 7259
## 5 LIGHTNING 6046
## 6 HEAT 3037
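From the tables, we can deduce that tornadoes are the most harmful with respect to population health: 96,979 combined fatalities and injuries, more than eleven times the total for the next event type (excessive heat, 8,428).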
head(totPropDmg)
## EVTYPE PROPDMGVAL
## 1 FLOOD 144657709807
## 2 HURRICANE/TYPHOON 69305840000
## 3 TORNADO 56947380616
## 4 STORM SURGE 43323536000
## 5 FLASH FLOOD 16822673978
## 6 HAIL 15735267513
head(totCropDmg)
## EVTYPE CROPDMGVAL
## 1 DROUGHT 13972566000
## 2 FLOOD 5661968450
## 3 RIVER FLOOD 5029459000
## 4 ICE STORM 5022113500
## 5 HAIL 3025954473
## 6 HURRICANE 2741910000
head(totECost)
## EVTYPE ECOST
## 1 FLOOD 150319678257
## 2 HURRICANE/TYPHOON 71913712800
## 3 TORNADO 57362333886
## 4 STORM SURGE 43323541000
## 5 HAIL 18761221986
## 6 FLASH FLOOD 18243991078
From the tables, even though droughts cause more crop damage than floods (about $14.0 billion versus $5.7 billion), floods cause the greatest overall economic consequences once property damage is added (about $150.3 billion in total).
The plot below shows the Top 10 Total Human Costs (Fatalities + Injuries) by Types of Weather Event.
par(mar= c(8, 5, 4, 2) + 0.1)
barplot(height = totHCost$HCOST[1:10], names.arg = totHCost$EVTYPE[1:10], las = 2, cex.names= 0.7, col = heat.colors (10), main = "Top 10 Total Human Costs (Fatalities + Injuries) by Event Type")
The plot below shows the 10 types of weather event with the greatest total economic cost (property damage + crop damage).
par(mar= c(8, 6, 4, 2) + 0.1)
barplot(height = totECost$ECOST[1:10], names.arg = totECost$EVTYPE[1:10], las = 2, cex.names= 0.7, col = heat.colors (10), main = "Top 10 Total Economic Costs (Property + Crop) by Event Type")
The results show that tornadoes have the greatest human cost, while floods have the greatest economic consequences.