In this report, we analyze the impact of different weather events on public health and the economy, using the storm database collected by the U.S. National Oceanic and Atmospheric Administration (NOAA) from 1950 to 2011. We use the estimates of fatalities, injuries, property damage, and crop damage to determine which types of events are most harmful to population health and the economy. The analysis addresses the following questions:
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Across the United States, which types of events have the greatest economic consequences?
Through this analysis, we found that tornadoes are the most harmful event type with respect to population health, while floods have the greatest economic consequences.
knitr::opts_chunk$set(echo = TRUE) # Always make code visible
options(scipen = 1) # Prefer fixed over scientific notation for numbers
library(ggplot2)
library(plyr)
require(gridExtra)
## Loading required package: gridExtra
library(knitr)
Data Processing
NOAAdata <- read.csv("StormData.csv", sep=",", header=TRUE)
head(NOAAdata)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE EVTYPE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL TORNADO
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL TORNADO
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL TORNADO
## 4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL TORNADO
## 5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL TORNADO
## 6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL TORNADO
## BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1 0 0 NA
## 2 0 0 NA
## 3 0 0 NA
## 4 0 0 NA
## 5 0 0 NA
## 6 0 0 NA
## END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1 0 14.0 100 3 0 0 15 25.0
## 2 0 2.0 150 2 0 0 0 2.5
## 3 0 0.1 123 2 0 0 2 25.0
## 4 0 0.0 100 2 0 0 2 2.5
## 5 0 0.0 150 2 0 0 2 2.5
## 6 0 1.5 177 2 0 0 6 2.5
## PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1 K 0 3040 8812
## 2 K 0 3042 8755
## 3 K 0 3340 8742
## 4 K 0 3458 8626
## 5 K 0 3412 8642
## 6 K 0 3450 8748
## LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3051 8806 1
## 2 0 0 2
## 3 0 0 3
## 4 0 0 4
## 5 0 0 5
## 6 0 0 6
Subset the NOAA storm database
tidyNOAA <- NOAAdata[,c('EVTYPE','FATALITIES','INJURIES', 'PROPDMG', 'PROPDMGEXP', 'CROPDMG', 'CROPDMGEXP')]
head(tidyNOAA)
## EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO 0 15 25.0 K 0
## 2 TORNADO 0 0 2.5 K 0
## 3 TORNADO 0 2 25.0 K 0
## 4 TORNADO 0 2 2.5 K 0
## 5 TORNADO 0 2 2.5 K 0
## 6 TORNADO 0 6 2.5 K 0
str(tidyNOAA)
## 'data.frame': 902297 obs. of 7 variables:
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
To calculate the economic damage, we need:
PROPDMG and CROPDMG: Amount (without unit) of property damage and crop damage.
PROPDMGEXP and CROPDMGEXP: Unit of the above variables, expressed as a power of 10 (H, K, M, and B mean hundreds, thousands, millions, and billions, respectively).
# Convert the H, K, M, B unit codes into dollar amounts of property damage
## create the column, initialised to 0 (rows with any other unit code stay 0)
tidyNOAA$PROPDMGNUM = 0
## fill in the values for rows with a recognised unit code
tidyNOAA[tidyNOAA$PROPDMGEXP == "H", ]$PROPDMGNUM = tidyNOAA[tidyNOAA$PROPDMGEXP == "H", ]$PROPDMG * 10^2
tidyNOAA[tidyNOAA$PROPDMGEXP == "K", ]$PROPDMGNUM = tidyNOAA[tidyNOAA$PROPDMGEXP == "K", ]$PROPDMG * 10^3
tidyNOAA[tidyNOAA$PROPDMGEXP == "M", ]$PROPDMGNUM = tidyNOAA[tidyNOAA$PROPDMGEXP == "M", ]$PROPDMG * 10^6
tidyNOAA[tidyNOAA$PROPDMGEXP == "B", ]$PROPDMGNUM = tidyNOAA[tidyNOAA$PROPDMGEXP == "B", ]$PROPDMG * 10^9
head(tidyNOAA, 100)
## EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO 0 15 25.00 K 0
## 2 TORNADO 0 0 2.50 K 0
## 3 TORNADO 0 2 25.00 K 0
## 4 TORNADO 0 2 2.50 K 0
## 5 TORNADO 0 2 2.50 K 0
## 6 TORNADO 0 6 2.50 K 0
## 7 TORNADO 0 1 2.50 K 0
## 8 TORNADO 0 0 2.50 K 0
## 9 TORNADO 1 14 25.00 K 0
## 10 TORNADO 0 0 25.00 K 0
## 11 TORNADO 0 3 2.50 M 0
## 12 TORNADO 0 3 2.50 M 0
## 13 TORNADO 1 26 250.00 K 0
## 14 TORNADO 0 12 0.00 K 0
## 15 TORNADO 0 6 25.00 K 0
## 16 TORNADO 4 50 25.00 K 0
## 17 TORNADO 0 2 25.00 K 0
## 18 TORNADO 0 0 25.00 K 0
## 19 TORNADO 0 0 25.00 K 0
## 20 TORNADO 0 0 25.00 K 0
## 21 TORNADO 0 0 25.00 K 0
## 22 TORNADO 0 0 2.50 K 0
## 23 TORNADO 0 0 2.50 K 0
## 24 TORNADO 0 1 25.00 K 0
## 25 TORNADO 0 1 25.00 K 0
## 26 TORNADO 1 8 25.00 K 0
## 27 TORNADO 0 2 25.00 K 0
## 28 TORNADO 0 1 25.00 K 0
## 29 TORNADO 0 6 25.00 K 0
## 30 TORNADO 0 2 2.50 K 0
## 31 TORNADO 0 0 2.50 K 0
## 32 TORNADO 0 12 2.50 K 0
## 33 TORNADO 0 0 25.00 K 0
## 34 TORNADO 6 195 2.50 M 0
## 35 TORNADO 0 2 25.00 K 0
## 36 TORNADO 7 12 250.00 K 0
## 37 TORNADO 0 0 2.50 K 0
## 38 TORNADO 2 3 25.00 K 0
## 39 TORNADO 0 2 2.50 K 0
## 40 TORNADO 0 0 25.00 K 0
## 41 TORNADO 0 0 2.50 K 0
## 42 TORNADO 0 1 25.00 K 0
## 43 TORNADO 0 0 2.50 K 0
## 44 TORNADO 0 0 25.00 K 0
## 45 TORNADO 0 0 25.00 K 0
## 46 TORNADO 0 0 0.03 K 0
## 47 TORNADO 0 1 25.00 K 0
## 48 TORNADO 0 4 250.00 K 0
## 49 TORNADO 0 26 250.00 K 0
## 50 TORNADO 0 3 2.50 K 0
## 51 TORNADO 0 2 2.50 K 0
## 52 TORNADO 0 0 25.00 K 0
## 53 TORNADO 0 1 25.00 K 0
## 54 TSTM WIND 0 0 0.00 0
## 55 HAIL 0 0 0.00 0
## 56 HAIL 0 0 0.00 0
## 57 TSTM WIND 0 0 0.00 0
## 58 HAIL 0 0 0.00 0
## 59 TSTM WIND 0 0 0.00 0
## 60 TSTM WIND 0 0 0.00 0
## 61 HAIL 0 0 0.00 0
## 62 HAIL 0 0 0.00 0
## 63 HAIL 0 0 0.00 0
## 64 TSTM WIND 0 0 0.00 0
## 65 TSTM WIND 0 0 0.00 0
## 66 TSTM WIND 0 0 0.00 0
## 67 HAIL 0 0 0.00 0
## 68 TORNADO 0 1 25.00 K 0
## 69 TSTM WIND 0 0 0.00 0
## 70 TORNADO 5 20 2.50 M 0
## 71 TSTM WIND 0 0 0.00 0
## 72 TSTM WIND 0 0 0.00 0
## 73 TSTM WIND 0 0 0.00 0
## 74 HAIL 0 0 0.00 0
## 75 TSTM WIND 0 0 0.00 0
## 76 TORNADO 0 0 2.50 K 0
## 77 TORNADO 0 0 2.50 K 0
## 78 TORNADO 0 0 25.00 K 0
## 79 TORNADO 0 0 2.50 M 0
## 80 TORNADO 0 5 2.50 M 0
## 81 TSTM WIND 0 0 0.00 0
## 82 HAIL 0 0 0.00 0
## 83 TSTM WIND 0 0 0.00 0
## 84 TSTM WIND 0 0 0.00 0
## 85 TSTM WIND 0 0 0.00 0
## 86 TSTM WIND 0 0 0.00 0
## 87 TSTM WIND 0 0 0.00 0
## 88 HAIL 0 0 0.00 0
## 89 HAIL 0 0 0.00 0
## 90 TORNADO 0 0 25.00 K 0
## 91 TSTM WIND 0 0 0.00 0
## 92 HAIL 0 0 0.00 0
## 93 TSTM WIND 0 0 0.00 0
## 94 TORNADO 25 200 2.50 M 0
## 95 TSTM WIND 0 0 0.00 0
## 96 TSTM WIND 0 0 0.00 0
## 97 TSTM WIND 0 0 0.00 0
## 98 TSTM WIND 0 0 0.00 0
## 99 TORNADO 0 2 25.00 K 0
## 100 HAIL 0 0 0.00 0
## PROPDMGNUM
## 1 25000
## 2 2500
## 3 25000
## 4 2500
## 5 2500
## 6 2500
## 7 2500
## 8 2500
## 9 25000
## 10 25000
## 11 2500000
## 12 2500000
## 13 250000
## 14 0
## 15 25000
## 16 25000
## 17 25000
## 18 25000
## 19 25000
## 20 25000
## 21 25000
## 22 2500
## 23 2500
## 24 25000
## 25 25000
## 26 25000
## 27 25000
## 28 25000
## 29 25000
## 30 2500
## 31 2500
## 32 2500
## 33 25000
## 34 2500000
## 35 25000
## 36 250000
## 37 2500
## 38 25000
## 39 2500
## 40 25000
## 41 2500
## 42 25000
## 43 2500
## 44 25000
## 45 25000
## 46 30
## 47 25000
## 48 250000
## 49 250000
## 50 2500
## 51 2500
## 52 25000
## 53 25000
## 54 0
## 55 0
## 56 0
## 57 0
## 58 0
## 59 0
## 60 0
## 61 0
## 62 0
## 63 0
## 64 0
## 65 0
## 66 0
## 67 0
## 68 25000
## 69 0
## 70 2500000
## 71 0
## 72 0
## 73 0
## 74 0
## 75 0
## 76 2500
## 77 2500
## 78 25000
## 79 2500000
## 80 2500000
## 81 0
## 82 0
## 83 0
## 84 0
## 85 0
## 86 0
## 87 0
## 88 0
## 89 0
## 90 25000
## 91 0
## 92 0
## 93 0
## 94 2500000
## 95 0
## 96 0
## 97 0
## 98 0
## 99 25000
## 100 0
# Convert the H, K, M, B unit codes into dollar amounts of crop damage
## create the column, initialised to 0 (rows with any other unit code stay 0)
tidyNOAA$CROPDMGNUM = 0
## fill in the values for rows with a recognised unit code
tidyNOAA[tidyNOAA$CROPDMGEXP == "H", ]$CROPDMGNUM = tidyNOAA[tidyNOAA$CROPDMGEXP == "H", ]$CROPDMG * 10^2
tidyNOAA[tidyNOAA$CROPDMGEXP == "K", ]$CROPDMGNUM = tidyNOAA[tidyNOAA$CROPDMGEXP == "K", ]$CROPDMG * 10^3
tidyNOAA[tidyNOAA$CROPDMGEXP == "M", ]$CROPDMGNUM = tidyNOAA[tidyNOAA$CROPDMGEXP == "M", ]$CROPDMG * 10^6
tidyNOAA[tidyNOAA$CROPDMGEXP == "B", ]$CROPDMGNUM = tidyNOAA[tidyNOAA$CROPDMGEXP == "B", ]$CROPDMG * 10^9
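The two conversion blocks above repeat the same pattern for each unit code. As a minimal sketch, the same mapping could be written once with a named lookup vector. The names `exp_multiplier` and `to_dollars` and the toy data frame are hypothetical and not part of the original analysis. Note one deliberate difference: the sketch upper-cases the code first, on the assumption that a lowercase code such as "k" means the same as "K", whereas the blocks above leave such rows at 0.

```r
# Hypothetical lookup table: unit code -> multiplier
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: convert an amount and its unit code to dollars
to_dollars <- function(amount, exp_code) {
  # toupper() lets lowercase codes match (an assumption, see the lead-in)
  mult <- exp_multiplier[toupper(exp_code)]
  # Codes outside H/K/M/B (including "") index to NA; treat them as 0,
  # which mirrors the zero default used in the blocks above
  mult[is.na(mult)] <- 0
  unname(amount * mult)
}

# Toy data standing in for tidyNOAA (values are made up for illustration)
toy <- data.frame(PROPDMG    = c(25, 2.5, 2.5, 1, 3),
                  PROPDMGEXP = c("K", "M", "", "B", "h"),
                  stringsAsFactors = FALSE)
toy$PROPDMGNUM <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP)
toy
```

The same helper would apply unchanged to CROPDMG and CROPDMGEXP.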
# Plot total fatalities for the 10 event types with the most fatalities
fatalities <- aggregate(FATALITIES ~ EVTYPE, data=tidyNOAA, sum)
fatalities <- fatalities[order(-fatalities$FATALITIES), ][1:10, ]
# Fix the factor levels so the bars keep the descending order
fatalities$EVTYPE <- factor(fatalities$EVTYPE, levels = fatalities$EVTYPE)
ggplot(fatalities, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "#32CD32") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Event Type") + ylab("Fatalities") + ggtitle("Number of fatalities by top 10 Weather Events")
# Plot total injuries for the 10 event types with the most injuries
injuries <- aggregate(INJURIES ~ EVTYPE, data=tidyNOAA, sum)
injuries <- injuries[order(-injuries$INJURIES), ][1:10, ]
# Fix the factor levels so the bars keep the descending order
injuries$EVTYPE <- factor(injuries$EVTYPE, levels = injuries$EVTYPE)
ggplot(injuries, aes(x = EVTYPE, y = INJURIES)) +
  geom_bar(stat = "identity", fill = "#32CD32") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Event Type") + ylab("Injuries") + ggtitle("Number of injuries by top 10 Weather Events")
Tornadoes are the weather event most harmful to public health. The graphs above show that they are the largest cause of both fatalities and injuries from weather events in the United States.
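The two rankings above are computed separately. As a minimal sketch, both columns could be summed per event type in one `aggregate()` call and compared in a single table. The toy data and the combined `CASUALTIES` column (fatalities plus injuries) are assumptions made for illustration, not a measure used in the original analysis.

```r
# Toy data standing in for tidyNOAA (values are made up for illustration)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(5, 1, 2, 4, 0),
  INJURIES   = c(50, 3, 20, 10, 2)
)

# Sum both columns per event type in a single aggregate() call
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# CASUALTIES is a hypothetical combined measure: fatalities plus injuries
health$CASUALTIES <- health$FATALITIES + health$INJURIES

# Sort from most to least harmful by the combined measure
health <- health[order(-health$CASUALTIES), ]
health
```

Weighting a fatality the same as an injury is a simplification; keeping the two columns side by side lets a reader apply a different weighting.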
# Plot total property and crop damage for the 10 costliest event types
damages <- aggregate(PROPDMGNUM + CROPDMGNUM ~ EVTYPE, data=tidyNOAA, sum)
names(damages) = c("EVTYPE", "TOTALDAMAGE")
damages <- damages[order(-damages$TOTALDAMAGE), ][1:10, ]
# Fix the factor levels so the bars keep the descending order
damages$EVTYPE <- factor(damages$EVTYPE, levels = damages$EVTYPE)
ggplot(damages, aes(x = EVTYPE, y = TOTALDAMAGE)) +
  geom_bar(stat = "identity", fill = "#32CD32") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Event Type") + ylab("Damages ($)") + ggtitle("Property & Crop Damages by top 10 Weather Events")
Floods are the event type with the greatest economic consequences.
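The damage ranking uses the sum of property and crop damage. As a minimal sketch, the two components could also be reported separately, to show which one drives the total for each event type. The toy values and the `CROPSHARE` column are hypothetical and chosen only to illustrate the calculation.

```r
# Toy data standing in for tidyNOAA after the unit conversion
# (values are made up for illustration)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "DROUGHT", "FLOOD", "HAIL"),
  PROPDMGNUM = c(5e9, 1e8, 2e9, 3e8),
  CROPDMGNUM = c(1e8, 4e9, 0, 2e8)
)

# Sum property and crop damage per event type
damage_split <- aggregate(cbind(PROPDMGNUM, CROPDMGNUM) ~ EVTYPE,
                          data = toy, FUN = sum)

# Total damage, and the hypothetical share of it that is crop damage
damage_split$TOTALDAMAGE <- damage_split$PROPDMGNUM + damage_split$CROPDMGNUM
damage_split$CROPSHARE   <- damage_split$CROPDMGNUM / damage_split$TOTALDAMAGE

# Sort from most to least costly
damage_split <- damage_split[order(-damage_split$TOTALDAMAGE), ]
damage_split
```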