This project analyzes the effects of storm events in the United States. It addresses two questions. First, across the United States, which types of storm events are most harmful with respect to population health? Second, across the United States, which types of events have the greatest economic consequences? The raw data come from the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database, which tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The events in the database start in 1950 and end in November 2011.
The raw data used to perform the following analysis can be found here: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
Documentation on the raw data used to perform the following analysis can be found here: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf
sd <- read.csv("StormData.csv.bz2")
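A minimal sketch of fetching the archive before reading it, assuming the file should be cached in the working directory. The helper name `load_storm_data` and its arguments are hypothetical and not part of the original analysis. Setting `stringsAsFactors = FALSE` is also a choice made here, so `EVTYPE` would be a character column rather than the factor shown in the outputs.

```r
# Hypothetical helper: download the archive only if it is not already on disk.
load_storm_data <- function(url, dest = "StormData.csv.bz2") {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  # read.csv() opens bzip2-compressed files transparently
  read.csv(dest, stringsAsFactors = FALSE)
}

# Example call (commented out so the sketch does not trigger a large download):
# sd <- load_storm_data(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# )
```

Because the download is skipped when the file already exists, re-running the report would not fetch the archive again.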
library(dplyr)
sda <- select(sd, EVTYPE, FATALITIES,
INJURIES,
PROPDMG,
CROPDMG)
sdpophealth <- select(sda, EVTYPE, FATALITIES, INJURIES)
pophealthsum <- summarise(group_by(sdpophealth, EVTYPE), Total_Deaths = sum(FATALITIES),
Total_Injuries = sum(INJURIES), Average_Deaths = mean(FATALITIES),
Average_Injuries = mean(INJURIES))
phsd <- filter(pophealthsum, Total_Deaths > 500)
phsad <- filter(pophealthsum, Average_Deaths > 5)
phsi <- filter(pophealthsum, Total_Injuries > 1500)
phsai <- filter(pophealthsum, Average_Injuries > 30)
##Total Deaths
phsd
## # A tibble: 6 × 5
## EVTYPE Total_Deaths Total_Injuries Average_Deaths
## <fctr> <dbl> <dbl> <dbl>
## 1 EXCESSIVE HEAT 1903 6525 1.134088200
## 2 FLASH FLOOD 978 1777 0.018018682
## 3 HEAT 937 2100 1.221642764
## 4 LIGHTNING 816 5230 0.051796369
## 5 TORNADO 5633 91346 0.092874101
## 6 TSTM WIND 504 6957 0.002291534
## # ... with 1 more variables: Average_Injuries <dbl>
##Average Deaths
phsad
## # A tibble: 4 × 5
## EVTYPE Total_Deaths Total_Injuries Average_Deaths
## <fctr> <dbl> <dbl> <dbl>
## 1 COLD AND SNOW 14 0 14.000000
## 2 RECORD/EXCESSIVE HEAT 17 0 5.666667
## 3 TORNADOES, TSTM WIND, HAIL 25 0 25.000000
## 4 TROPICAL STORM GORDON 8 43 8.000000
## # ... with 1 more variables: Average_Injuries <dbl>
##Total Injuries
phsi
## # A tibble: 8 × 5
## EVTYPE Total_Deaths Total_Injuries Average_Deaths
## <fctr> <dbl> <dbl> <dbl>
## 1 EXCESSIVE HEAT 1903 6525 1.134088200
## 2 FLASH FLOOD 978 1777 0.018018682
## 3 FLOOD 470 6789 0.018558004
## 4 HEAT 937 2100 1.221642764
## 5 ICE STORM 89 1975 0.044366899
## 6 LIGHTNING 816 5230 0.051796369
## 7 TORNADO 5633 91346 0.092874101
## 8 TSTM WIND 504 6957 0.002291534
## # ... with 1 more variables: Average_Injuries <dbl>
##Average Injuries
phsai
## # A tibble: 3 × 5
## EVTYPE Total_Deaths Total_Injuries Average_Deaths
## <fctr> <dbl> <dbl> <dbl>
## 1 Heat Wave 0 70 0.00
## 2 TROPICAL STORM GORDON 8 43 8.00
## 3 WILD FIRES 3 150 0.75
## # ... with 1 more variables: Average_Injuries <dbl>
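The averages above can be dominated by event types with very few records: a total of 25 deaths with an average of 25 for TORNADOES, TSTM WIND, HAIL implies a single record. The following is a sketch, on toy data with made-up values, of reporting the number of records next to each average. The `Events` column and the `toy` data frame are additions for illustration and are not part of the original summary.

```r
library(dplyr)

# Toy data standing in for the EVTYPE / FATALITIES columns (values are made up)
toy <- data.frame(
  EVTYPE = c("TORNADO", "TORNADO", "TORNADO", "RARE EVENT"),
  FATALITIES = c(10, 0, 20, 25),
  stringsAsFactors = FALSE
)

ranked <- toy %>%
  group_by(EVTYPE) %>%
  summarise(
    Events = n(),                      # number of records behind each average
    Total_Deaths = sum(FATALITIES),
    Average_Deaths = mean(FATALITIES)
  ) %>%
  arrange(desc(Total_Deaths))          # rank instead of filtering by a threshold

print(ranked)
```

Sorting with `arrange(desc(...))` also avoids hand-picked cutoffs such as `Total_Deaths > 500`, although which approach reads better is a judgment call.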
sdecon <- select(sda, EVTYPE, PROPDMG,
CROPDMG)
econsum <- summarise(group_by(sdecon, EVTYPE), Total_Property_Damage = sum(PROPDMG),
Total_Crop_Damage = sum(CROPDMG), Average_Property_Damage = mean(PROPDMG),
Average_Crop_Damage = mean(CROPDMG))
tpd <- filter(econsum, Total_Property_Damage > 750000)
apd <- filter(econsum, Average_Property_Damage > 500)
tcd <- filter(econsum, Total_Crop_Damage > 125000)
acd <- filter(econsum, Average_Crop_Damage > 250)
##Total Property Damage
tpd
## # A tibble: 5 × 5
## EVTYPE Total_Property_Damage Total_Crop_Damage
## <fctr> <dbl> <dbl>
## 1 FLASH FLOOD 1420124.6 179200.46
## 2 FLOOD 899938.5 168037.88
## 3 THUNDERSTORM WIND 876844.2 66791.45
## 4 TORNADO 3212258.2 100018.52
## 5 TSTM WIND 1335965.6 109202.60
## # ... with 2 more variables: Average_Property_Damage <dbl>,
## # Average_Crop_Damage <dbl>
##Average Property Damage
apd
## # A tibble: 4 × 5
## EVTYPE Total_Property_Damage Total_Crop_Damage
## <fctr> <dbl> <dbl>
## 1 COASTAL EROSION 766 0
## 2 HEAVY RAIN AND FLOOD 600 0
## 3 Landslump 570 0
## 4 RIVER AND STREAM FLOOD 1200 0
## # ... with 2 more variables: Average_Property_Damage <dbl>,
## # Average_Crop_Damage <dbl>
##Total Crop Damage
tcd
## # A tibble: 3 × 5
## EVTYPE Total_Property_Damage Total_Crop_Damage
## <fctr> <dbl> <dbl>
## 1 FLASH FLOOD 1420124.6 179200.5
## 2 FLOOD 899938.5 168037.9
## 3 HAIL 688693.4 579596.3
## # ... with 2 more variables: Average_Property_Damage <dbl>,
## # Average_Crop_Damage <dbl>
##Average Crop Damage
acd
## # A tibble: 4 × 5
## EVTYPE Total_Property_Damage Total_Crop_Damage
## <fctr> <dbl> <dbl>
## 1 DUST STORM/HIGH WINDS 50 500
## 2 FOREST FIRES 5 500
## 3 HIGH WINDS/COLD 610 2005
## 4 TROPICAL STORM GORDON 500 500
## # ... with 2 more variables: Average_Property_Damage <dbl>,
## # Average_Crop_Damage <dbl>
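The damage figures above are sums of the raw `PROPDMG` and `CROPDMG` columns. The storm data also carries `PROPDMGEXP` and `CROPDMGEXP` columns that encode a magnitude for each figure, and the column selection above does not include them. The following is a sketch of applying such a magnitude on toy data with made-up values. It assumes the codes H, K, M, and B stand for hundreds, thousands, millions, and billions, and it treats any other code as a multiplier of 1; the helper `exp_to_multiplier` is hypothetical.

```r
library(dplyr)

# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumed mapping; unrecognized or missing codes fall back to 1.
exp_to_multiplier <- function(code) {
  code <- toupper(as.character(code))
  mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)[code]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy data (values are made up)
toy <- data.frame(
  EVTYPE = c("FLOOD", "FLOOD", "TORNADO"),
  PROPDMG = c(2.5, 10, 500),
  PROPDMGEXP = c("B", "K", "K"),
  stringsAsFactors = FALSE
)

scaled <- toy %>%
  mutate(Property_Damage_USD = PROPDMG * exp_to_multiplier(PROPDMGEXP)) %>%
  group_by(EVTYPE) %>%
  summarise(Total_Property_Damage_USD = sum(Property_Damage_USD)) %>%
  arrange(desc(Total_Property_Damage_USD))

print(scaled)
```

In this toy example the unscaled sums would rank TORNADO (500) above FLOOD (12.5), while the scaled sums rank FLOOD first, so rankings based on the raw columns may shift once magnitudes are applied.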
par(mfrow = c(2,2), mar = c(5,5,5,5), pch = 15, cex = 0.5, cex.axis = 0.8)
barplot(phsai$Average_Injuries, xlab = "Event",
  ylab = "Average Injuries", main = "Top Average Injuries",
  names.arg = phsai$EVTYPE)
barplot(phsi$Total_Injuries, xlab = "Event",
  ylab = "Total Injuries", main = "Top Total Injuries",
  names.arg = phsi$EVTYPE)
barplot(phsad$Average_Deaths, xlab = "Event",
  ylab = "Average Deaths", main = "Top Average Deaths",
  names.arg = phsad$EVTYPE)
barplot(phsd$Total_Deaths, xlab = "Event",
  ylab = "Total Deaths", main = "Top Total Deaths",
  names.arg = phsd$EVTYPE)
The four panels above show the storm event types most harmful with respect to population health.
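Long event names tend to overlap on the x axis of small panels. The following is a sketch of one way to make a panel easier to read: sort the bars and rotate the labels. The three totals are taken from the Total Deaths output above, while the `toy` data frame and the choice of `las = 2` are assumptions made for illustration.

```r
# Three rows copied from the Total Deaths output above
toy <- data.frame(
  EVTYPE = c("HEAT", "TORNADO", "LIGHTNING"),
  Total_Deaths = c(937, 5633, 816),
  stringsAsFactors = FALSE
)

# Sort so the tallest bar comes first
ordered <- toy[order(toy$Total_Deaths, decreasing = TRUE), ]

# las = 2 draws axis labels perpendicular to the axis
mids <- barplot(ordered$Total_Deaths,
  names.arg = ordered$EVTYPE,
  las = 2,
  ylab = "Total Deaths",
  main = "Top Total Deaths"
)
```

`barplot()` returns the bar midpoints, which can be passed to `text()` if value labels are wanted above the bars.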
par(mfrow = c(2,2), mar = c(5,5,5,5), pch = 15, cex = 0.5, cex.axis = 0.8)
barplot(acd$Average_Crop_Damage, xlab = "Event",
  ylab = "Average Crop Damage", main = "Top Average Crop Damage",
  names.arg = acd$EVTYPE)
barplot(tcd$Total_Crop_Damage, xlab = "Event",
  ylab = "Total Crop Damage", main = "Top Total Crop Damage",
  names.arg = tcd$EVTYPE)
barplot(apd$Average_Property_Damage, xlab = "Event",
  ylab = "Average Property Damage", main = "Top Average Property Damage",
  names.arg = apd$EVTYPE)
barplot(tpd$Total_Property_Damage, xlab = "Event",
  ylab = "Total Property Damage", main = "Top Total Property Damage",
  names.arg = tpd$EVTYPE)
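The outputs above list TSTM WIND and THUNDERSTORM WIND as separate event types, and mix upper-case labels such as HEAT with mixed-case ones such as Heat Wave, because `EVTYPE` is grouped exactly as recorded. The following is a sketch of normalizing labels before grouping. The helper `normalize_evtype` and its single recoding rule are illustrative assumptions, and a real cleanup would likely need more rules.

```r
# Labels copied from the outputs above
labels <- c("TSTM WIND", "THUNDERSTORM WIND", "Heat Wave", "HEAT")

# Hypothetical helper: unify case and whitespace, expand one abbreviation
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- sub("^TSTM", "THUNDERSTORM", x)  # assumed rule: TSTM means THUNDERSTORM
  x
}

normalized <- normalize_evtype(labels)
print(unique(normalized))
```

Applying such a function to `EVTYPE` before `group_by()` would pool the records of labels that differ only in spelling.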
| Category of Devastation | Event Type |
|---|---|
| Greatest Total Deaths | TORNADO |
| Greatest Average Deaths | TORNADO, TSTM WIND, HAIL |
| Greatest Total Injuries | TORNADO |
| Greatest Average Injuries | Heat Wave |
| Greatest Total Property Damage | TORNADO |
| Greatest Average Property Damage | COASTAL EROSION |
| Greatest Total Crop Damage | HAIL |
| Greatest Average Crop Damage | DUST STORM/HIGH WINDS + FOREST FIRES + TROPICAL STORM GORDON |
Tornadoes are the most harmful storm event to population health, since they cause the greatest total number of fatalities. They also have the greatest economic consequences, since they cause the greatest total property damage. According to this analysis, tornadoes are the most catastrophic storm event in terms of both population health and economic consequences.