In the data processing step, we first load the data from the bz2-compressed CSV file.
stormdata <- read.csv(bzfile("repdata-data-StormData.csv.bz2"), stringsAsFactors=FALSE)
With data loaded, we start the filtering process.
For the purpose of this report, we’ll consider only storm events.
The U.S. National Oceanic and Atmospheric Administration (NOAA) defines a list of event types that qualify as storm events.
stormEvents <- c("ASTRONOMICAL LOW TIDE","AVALANCHE","BLIZZARD","COASTAL FLOOD",
"COLD/WIND CHILL","DEBRIS FLOW","DENSE FOG","DENSE SMOKE",
"DROUGHT","DUST DEVIL","DUST STORM","EXCESSIVE HEAT",
"EXTREME COLD/WIND CHILL","FLASH FLOOD","FLOOD","FROST/FREEZE",
"FUNNEL CLOUD","FREEZING FOG","HAIL","HEAT","HEAVY RAIN",
"HEAVY SNOW","HIGH SURF","HIGH WIND","HURRICANE (TYPHOON)",
"ICE STORM","LAKE-EFFECT SNOW","LAKESHORE FLOOD","LIGHTNING",
"MARINE HAIL","MARINE HIGH WIND","MARINE STRONG WIND",
"MARINE THUNDERSTORM WIND","RIP CURRENT","SEICHE","SLEET",
"STORM SURGE/TIDE","STRONG WIND","THUNDERSTORM WIND","TORNADO",
"TROPICAL DEPRESSION","TROPICAL STORM","TSUNAMI","VOLCANIC ASH",
"WATERSPOUT","WILDFIRE","WINTER STORM","WINTER WEATHER")
stormdata <- stormdata[stormdata$EVTYPE %in% stormEvents, ] # Only storm events
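The filter above keeps only rows whose EVTYPE equals one of the official names exactly. As a minimal sketch of what that implies, the toy vector below (hypothetical labels, not rows from the dataset) shows that `%in%` is sensitive to case and surrounding whitespace, and how normalising with `toupper()` and `trimws()` would change the result. Whether the raw labels actually vary this way, and whether normalising them suits this report, are assumptions; the analysis above does not do it.

```r
# Hypothetical labels for illustration only; not rows from the storm dataset
official <- c("FLOOD", "TORNADO", "HAIL")
raw <- c("TORNADO", "tornado", " FLOOD", "Hail", "OTHER")

# Exact matching, as used in the filter above
exact <- raw %in% official

# Matching after normalising case and surrounding whitespace
normalised <- toupper(trimws(raw)) %in% official

# Side-by-side view of which labels each approach keeps
data.frame(raw, exact, normalised)
```

Comparing `sum(exact)` with `sum(normalised)` on the real EVTYPE column would show how many rows the exact filter leaves out for formatting reasons alone.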
Another task before starting the analysis is to convert the exponent notation used for the damage values. To do so, we create a table that maps the symbols in the dataset to multiplication factors.
We also create a list that contains the names of columns that will be used ahead.
exp <- data.frame(id=c("?", "0", "", "+", "-", "1", "2", "H", "h", "3", "K", "k",
"4", "5", "6", "M", "m", "7", "8", "9", "B", "b"),
value=c(NA, 0, 0, 1, -1, 10, 100, 100, 100, 1000, 1000, 1000,
10000, 100000, 1000000, 1000000, 1000000, 10000000,
100000000, 1000000000, 1000000000, 1000000000))
collabels <- c(names(stormdata), "PROPDMGEXP_VAL", "CROPDMGEXP_VAL")
With the lookup table created, we merge it with the original dataset.
stormdata <- merge(stormdata, exp, by.x="PROPDMGEXP", by.y="id", all=FALSE) #high processing code
stormdata <- merge(stormdata, exp, by.x="CROPDMGEXP", by.y="id", all=FALSE) #high processing code
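To make the effect of the merge concrete, here is a small sketch on toy data. The `lookup` and `toy` data frames and the symbol "Z" are hypothetical and only mirror the shape of the tables above. It shows two things: how a damage figure and its symbol combine into a dollar amount, and that an inner join (`all=FALSE`) silently drops rows whose symbol is missing from the lookup table.

```r
# Toy lookup table in the same shape as the one above (a subset of symbols)
lookup <- data.frame(id = c("K", "M", "B"),
                     value = c(1e3, 1e6, 1e9),
                     stringsAsFactors = FALSE)

# Hypothetical records; "Z" is a symbol that is absent from the lookup table
toy <- data.frame(PROPDMG = c(25, 2.5, 1, 7),
                  PROPDMGEXP = c("K", "M", "B", "Z"),
                  stringsAsFactors = FALSE)

# Inner join, as in the report: the unmatched "Z" row is dropped
merged <- merge(toy, lookup, by.x = "PROPDMGEXP", by.y = "id", all = FALSE)

# Damage in dollars = reported figure times the multiplication factor
merged$PROPDMG_TOTAL <- merged$PROPDMG * merged$value

nrow(toy)     # 4 rows before the merge
nrow(merged)  # 3 rows after the merge
merged
```

Comparing `nrow()` before and after each merge on the real data is a quick way to see how many records the join removes.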
To finish the data processing task, we rename the columns, reorder them and calculate the actual value of property and crop damage (the variables that use exponent notation).
# Renaming columns
names(stormdata)[names(stormdata) == "value.x"] <- "PROPDMGEXP_VAL"
names(stormdata)[names(stormdata) == "value.y"] <- "CROPDMGEXP_VAL"
# Reordering columns
stormdata <- stormdata[, collabels]
# Calculating real values
stormdata$PROPDMG_TOTAL <- stormdata$PROPDMG * stormdata$PROPDMGEXP_VAL
stormdata$CROPDMG_TOTAL <- stormdata$CROPDMG * stormdata$CROPDMGEXP_VAL
stormdata$BGN_YEAR <- format(as.Date(strptime(stormdata$BGN_DATE,
"%m/%d/%Y %H:%M:%S")), "%Y")
stormdata$BGN_MONTH <- format(as.Date(strptime(stormdata$BGN_DATE,
"%m/%d/%Y %H:%M:%S")), "%m")
We summarize the results into two sections: health and economic.
Considering the health aspect, expressed as the sum of “total number of fatalities” and “total number of injuries”, the event “Tornado” is responsible for more than 69% of cases (96979 in total).
# Grouping data by event type
library(plyr)
eventsByType <- ddply(stormdata, ~EVTYPE, summarise, SUM_FATALITIES=sum(FATALITIES, na.rm=TRUE),
SUM_INJURIES=sum(INJURIES, na.rm=TRUE), SUM_PROPDMG=sum(PROPDMG_TOTAL, na.rm=TRUE),
SUM_CROPDMG=sum(CROPDMG_TOTAL, na.rm=TRUE),
SUM_HEALTH=sum(FATALITIES, na.rm=TRUE)+sum(INJURIES, na.rm=TRUE),
SUM_ECONOMY=sum(PROPDMG_TOTAL, na.rm=TRUE)+sum(CROPDMG_TOTAL, na.rm=TRUE))
# Adjusting data to show
eventsByTypeHealth <- eventsByType[order(eventsByType$SUM_HEALTH, decreasing = TRUE),
c("EVTYPE", "SUM_HEALTH", "SUM_FATALITIES", "SUM_INJURIES")]
eventsByTypeHealth <- cbind(eventsByTypeHealth,
PERC_HEALTH=sprintf("%.2f%%", prop.table(eventsByTypeHealth$SUM_HEALTH) * 100),
PERC_FATALITIES=sprintf("%.2f%%", prop.table(eventsByTypeHealth$SUM_FATALITIES) * 100),
PERC_INJURIES=sprintf("%.2f%%", prop.table(eventsByTypeHealth$SUM_INJURIES) * 100))
eventsByTypeHealth
## EVTYPE SUM_HEALTH SUM_FATALITIES SUM_INJURIES
## 38 TORNADO 96979 5633 91346
## 11 EXCESSIVE HEAT 8428 1903 6525
## 14 FLOOD 7259 470 6789
## 27 LIGHTNING 6046 816 5230
## 19 HEAT 3037 937 2100
## 13 FLASH FLOOD 2755 978 1777
## 24 ICE STORM 2064 89 1975
## 37 THUNDERSTORM WIND 1621 133 1488
## 45 WINTER STORM 1527 206 1321
## 23 HIGH WIND 1385 248 1137
## 18 HAIL 1376 15 1361
## 21 HEAVY SNOW 1148 127 1021
## 44 WILDFIRE 986 75 911
## 3 BLIZZARD 906 101 805
## 32 RIP CURRENT 600 368 232
## 10 DUST STORM 462 22 440
## 46 WINTER WEATHER 431 33 398
## 40 TROPICAL STORM 398 58 340
## 2 AVALANCHE 394 224 170
## 36 STRONG WIND 383 103 280
## 6 DENSE FOG 360 18 342
## 20 HEAVY RAIN 349 98 251
## 22 HIGH SURF 253 101 152
## 41 TSUNAMI 162 33 129
## 12 EXTREME COLD/WIND CHILL 149 125 24
## 5 COLD/WIND CHILL 107 95 12
## 9 DUST DEVIL 44 2 42
## 30 MARINE STRONG WIND 36 14 22
## 31 MARINE THUNDERSTORM WIND 36 10 26
## 43 WATERSPOUT 32 3 29
## 35 STORM SURGE/TIDE 16 11 5
## 4 COASTAL FLOOD 5 3 2
## 8 DROUGHT 4 0 4
## 17 FUNNEL CLOUD 3 0 3
## 29 MARINE HIGH WIND 2 1 1
## 34 SLEET 2 2 0
## 1 ASTRONOMICAL LOW TIDE 0 0 0
## 7 DENSE SMOKE 0 0 0
## 15 FREEZING FOG 0 0 0
## 16 FROST/FREEZE 0 0 0
## 25 LAKE-EFFECT SNOW 0 0 0
## 26 LAKESHORE FLOOD 0 0 0
## 28 MARINE HAIL 0 0 0
## 33 SEICHE 0 0 0
## 39 TROPICAL DEPRESSION 0 0 0
## 42 VOLCANIC ASH 0 0 0
## PERC_HEALTH PERC_FATALITIES PERC_INJURIES
## 38 69.40% 43.15% 72.10%
## 11 6.03% 14.58% 5.15%
## 14 5.19% 3.60% 5.36%
## 27 4.33% 6.25% 4.13%
## 19 2.17% 7.18% 1.66%
## 13 1.97% 7.49% 1.40%
## 24 1.48% 0.68% 1.56%
## 37 1.16% 1.02% 1.17%
## 45 1.09% 1.58% 1.04%
## 23 0.99% 1.90% 0.90%
## 18 0.98% 0.11% 1.07%
## 21 0.82% 0.97% 0.81%
## 44 0.71% 0.57% 0.72%
## 3 0.65% 0.77% 0.64%
## 32 0.43% 2.82% 0.18%
## 10 0.33% 0.17% 0.35%
## 46 0.31% 0.25% 0.31%
## 40 0.28% 0.44% 0.27%
## 2 0.28% 1.72% 0.13%
## 36 0.27% 0.79% 0.22%
## 6 0.26% 0.14% 0.27%
## 20 0.25% 0.75% 0.20%
## 22 0.18% 0.77% 0.12%
## 41 0.12% 0.25% 0.10%
## 12 0.11% 0.96% 0.02%
## 5 0.08% 0.73% 0.01%
## 9 0.03% 0.02% 0.03%
## 30 0.03% 0.11% 0.02%
## 31 0.03% 0.08% 0.02%
## 43 0.02% 0.02% 0.02%
## 35 0.01% 0.08% 0.00%
## 4 0.00% 0.02% 0.00%
## 8 0.00% 0.00% 0.00%
## 17 0.00% 0.00% 0.00%
## 29 0.00% 0.01% 0.00%
## 34 0.00% 0.02% 0.00%
## 1 0.00% 0.00% 0.00%
## 7 0.00% 0.00% 0.00%
## 15 0.00% 0.00% 0.00%
## 16 0.00% 0.00% 0.00%
## 25 0.00% 0.00% 0.00%
## 26 0.00% 0.00% 0.00%
## 28 0.00% 0.00% 0.00%
## 33 0.00% 0.00% 0.00%
## 39 0.00% 0.00% 0.00%
## 42 0.00% 0.00% 0.00%
As you can see above, there is no great difference between the ranking by total harmful health cases and the ranking by each of its components (fatalities and injuries) separately. The trend and distribution are almost the same.
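The percentage columns in the table come from `prop.table()`, which divides each value by the column total, combined with `sprintf()` for formatting. The sketch below reproduces that pattern on hypothetical records. It uses base R's `aggregate()` as a stand-in for the `ddply`/`summarise` call so that it runs without plyr; this is a substitution for illustration, not the method used in this report.

```r
# Hypothetical event records for illustration only
toy <- data.frame(EVTYPE = c("A", "A", "B", "C"),
                  FATALITIES = c(1, 2, 0, 1),
                  INJURIES = c(8, 4, 4, 0),
                  stringsAsFactors = FALSE)

# Sum per event type (base R stand-in for the ddply/summarise call)
byType <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
byType$SUM_HEALTH <- byType$FATALITIES + byType$INJURIES

# Share of the grand total, formatted like the PERC_* columns
byType$PERC_HEALTH <- sprintf("%.2f%%", prop.table(byType$SUM_HEALTH) * 100)

# Largest health impact first
byType[order(byType$SUM_HEALTH, decreasing = TRUE), ]
```

Here the totals are 15, 4 and 1 out of 20, so the shares are 75.00%, 20.00% and 5.00%.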
Looking at the distribution of tornado cases across states, Texas and Alabama are outliers, while most other states are more evenly balanced.
# Analysing Tornados
# Filtering Tornados
tornados <- stormdata[stormdata$EVTYPE == "TORNADO", ]
# Grouping them by state
tornadoByState <- ddply(tornados, ~STATE, summarise, SUM_FATALITIES=sum(FATALITIES, na.rm=TRUE),
SUM_INJURIES=sum(INJURIES, na.rm=TRUE),
SUM_HEALTH=sum(FATALITIES, na.rm=TRUE)+sum(INJURIES, na.rm=TRUE))
# Adding percentage
tornadoByState <- cbind(tornadoByState,
PERC_HEALTH=sprintf("%.2f%%", prop.table(tornadoByState$SUM_HEALTH) * 100),
PERC_FATALITIES=sprintf("%.2f%%", prop.table(tornadoByState$SUM_FATALITIES) * 100),
PERC_INJURIES=sprintf("%.2f%%", prop.table(tornadoByState$SUM_INJURIES) * 100))
# Adjusting data to show
tornadoByState <- tornadoByState[order(tornadoByState$SUM_HEALTH, decreasing = TRUE),
c("STATE", "SUM_HEALTH", "SUM_FATALITIES", "SUM_INJURIES", "PERC_HEALTH", "PERC_FATALITIES", "PERC_INJURIES")]
tornadoByState
## STATE SUM_HEALTH SUM_FATALITIES SUM_INJURIES PERC_HEALTH
## 45 TX 8745 538 8207 9.02%
## 2 AL 8546 617 7929 8.81%
## 26 MS 6694 450 6244 6.90%
## 3 AR 5495 379 5116 5.67%
## 37 OK 5125 296 4829 5.28%
## 44 TN 5116 368 4748 5.28%
## 25 MO 4718 388 4330 4.86%
## 36 OH 4629 191 4438 4.77%
## 16 IN 4476 252 4224 4.62%
## 15 IL 4348 203 4145 4.48%
## 11 GA 4106 180 3926 4.23%
## 23 MI 3605 243 3362 3.72%
## 10 FL 3501 161 3340 3.61%
## 17 KS 2957 236 2721 3.05%
## 18 KY 2931 125 2806 3.02%
## 19 LA 2790 153 2637 2.88%
## 28 NC 2662 126 2536 2.74%
## 13 IA 2289 81 2208 2.36%
## 24 MN 2075 99 1976 2.14%
## 20 MA 1866 108 1758 1.92%
## 50 WI 1697 96 1601 1.75%
## 42 SC 1373 59 1314 1.42%
## 39 PA 1323 82 1241 1.36%
## 30 NE 1212 54 1158 1.25%
## 47 VA 950 36 914 0.98%
## 7 CT 707 4 703 0.73%
## 43 SD 470 18 452 0.48%
## 29 ND 351 25 326 0.36%
## 35 NY 337 22 315 0.35%
## 21 MD 321 7 314 0.33%
## 49 WA 309 6 303 0.32%
## 6 CO 266 5 261 0.27%
## 33 NM 160 5 155 0.16%
## 4 AZ 150 3 147 0.15%
## 51 WV 117 3 114 0.12%
## 52 WY 105 4 101 0.11%
## 46 UT 92 1 91 0.09%
## 5 CA 88 0 88 0.09%
## 9 DE 75 2 73 0.08%
## 32 NJ 71 1 70 0.07%
## 31 NH 31 1 30 0.03%
## 27 MT 25 4 21 0.03%
## 41 RI 23 0 23 0.02%
## 22 ME 20 1 19 0.02%
## 48 VT 10 0 10 0.01%
## 14 ID 9 0 9 0.01%
## 12 HI 6 0 6 0.01%
## 38 OR 5 0 5 0.01%
## 34 NV 2 0 2 0.00%
## 1 AK 0 0 0 0.00%
## 8 DC 0 0 0 0.00%
## 40 PR 0 0 0 0.00%
## PERC_FATALITIES PERC_INJURIES
## 45 9.55% 8.98%
## 2 10.95% 8.68%
## 26 7.99% 6.84%
## 3 6.73% 5.60%
## 37 5.25% 5.29%
## 44 6.53% 5.20%
## 25 6.89% 4.74%
## 36 3.39% 4.86%
## 16 4.47% 4.62%
## 15 3.60% 4.54%
## 11 3.20% 4.30%
## 23 4.31% 3.68%
## 10 2.86% 3.66%
## 17 4.19% 2.98%
## 18 2.22% 3.07%
## 19 2.72% 2.89%
## 28 2.24% 2.78%
## 13 1.44% 2.42%
## 24 1.76% 2.16%
## 20 1.92% 1.92%
## 50 1.70% 1.75%
## 42 1.05% 1.44%
## 39 1.46% 1.36%
## 30 0.96% 1.27%
## 47 0.64% 1.00%
## 7 0.07% 0.77%
## 43 0.32% 0.49%
## 29 0.44% 0.36%
## 35 0.39% 0.34%
## 21 0.12% 0.34%
## 49 0.11% 0.33%
## 6 0.09% 0.29%
## 33 0.09% 0.17%
## 4 0.05% 0.16%
## 51 0.05% 0.12%
## 52 0.07% 0.11%
## 46 0.02% 0.10%
## 5 0.00% 0.10%
## 9 0.04% 0.08%
## 32 0.02% 0.08%
## 31 0.02% 0.03%
## 27 0.07% 0.02%
## 41 0.00% 0.03%
## 22 0.02% 0.02%
## 48 0.00% 0.01%
## 14 0.00% 0.01%
## 12 0.00% 0.01%
## 38 0.00% 0.01%
## 34 0.00% 0.00%
## 1 0.00% 0.00%
## 8 0.00% 0.00%
## 40 0.00% 0.00%
Alaska, the District of Columbia and Puerto Rico recorded no tornado-related fatalities or injuries.
boxplot(tornadoByState$SUM_HEALTH)
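The outlier observation for Texas and Alabama can also be checked numerically. As a minimal sketch, `boxplot.stats()` returns the values that a boxplot would draw as individual points, using the same default rule (more than 1.5 times the interquartile range beyond the hinges). The named vector below is hypothetical; with the report's data the call would be `boxplot.stats(tornadoByState$SUM_HEALTH)$out`.

```r
# Hypothetical per-state totals for illustration only
toy <- c(AA = 1, BB = 2, CC = 3, DD = 4, EE = 5, FF = 100)

# Values beyond 1.5 times the interquartile range from the hinges,
# the same default rule boxplot() uses to draw individual points
out <- boxplot.stats(toy)$out

# Names of the entries flagged as outliers
flagged <- names(toy)[toy %in% out]
flagged
```

In this toy vector the hinges are 2 and 5, so the upper fence is 9.5 and only the entry with value 100 is flagged.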
# Analysing evolution through years
eventsByYear <- ddply(stormdata, ~BGN_YEAR, summarise, SUM_FATALITIES=sum(FATALITIES, na.rm=TRUE),
SUM_INJURIES=sum(INJURIES, na.rm=TRUE), SUM_PROPDMG=sum(PROPDMG_TOTAL, na.rm=TRUE),
SUM_CROPDMG=sum(CROPDMG_TOTAL, na.rm=TRUE),
SUM_HEALTH=sum(FATALITIES, na.rm=TRUE)+sum(INJURIES, na.rm=TRUE),
SUM_ECONOMY=sum(PROPDMG_TOTAL, na.rm=TRUE)+sum(CROPDMG_TOTAL, na.rm=TRUE))
eventsByMonth <- ddply(stormdata, ~BGN_MONTH, summarise, SUM_FATALITIES=sum(FATALITIES, na.rm=TRUE),
SUM_INJURIES=sum(INJURIES, na.rm=TRUE), SUM_PROPDMG=sum(PROPDMG_TOTAL, na.rm=TRUE),
SUM_CROPDMG=sum(CROPDMG_TOTAL, na.rm=TRUE),
SUM_HEALTH=sum(FATALITIES, na.rm=TRUE)+sum(INJURIES, na.rm=TRUE),
SUM_ECONOMY=sum(PROPDMG_TOTAL, na.rm=TRUE)+sum(CROPDMG_TOTAL, na.rm=TRUE))
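The yearly and monthly groupings rely on the BGN_YEAR and BGN_MONTH columns derived earlier from BGN_DATE. The sketch below walks through that parsing step on a single timestamp. The sample value is hypothetical and simply follows the format string used in the processing code.

```r
# Sample timestamp in the format the parsing code expects
# (hypothetical value, not read from the dataset)
stamp <- "4/18/1950 0:00:00"

parsed <- as.Date(strptime(stamp, "%m/%d/%Y %H:%M:%S"))
year <- format(parsed, "%Y")   # year as a character string
month <- format(parsed, "%m")  # zero-padded month as a character string

# A string that does not fit the format yields NA rather than an error
bad <- as.Date(strptime("1950-04-18", "%m/%d/%Y %H:%M:%S"))

c(year = year, month = month)
is.na(bad)
```

Both results are character strings ("1950" and "04" here). Wrapping them in `as.integer()` gives numeric values, which may be preferable when they are used as a plot axis.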
As you can see in the charts below, the health impact of storms shows several peaks over the study period. In particular, the highest impact appears at the end of the 1990s, possibly as a consequence of a specific natural disaster.
par(mfrow=c(1,2))
plot(eventsByYear$BGN_YEAR, eventsByYear$SUM_HEALTH, type="l", main="Storm health impact by Years", xlab="Year", ylab="# occurrences")
plot(eventsByMonth$BGN_MONTH, eventsByMonth$SUM_HEALTH, type="l", main="Storm health impact by Months", xlab="Month", ylab="# occurrences")
If we analyse the months, we find a pattern too: the US is more affected by this kind of disaster in April than in other months.
Unlike the health impact, the major economic impact is caused by “Flood”.
eventsByTypeEconomy <- eventsByType[order(eventsByType$SUM_ECONOMY, decreasing = TRUE),
c("EVTYPE", "SUM_ECONOMY", "SUM_PROPDMG", "SUM_CROPDMG")]
# Adding percentage
eventsByTypeEconomy <- cbind(eventsByTypeEconomy,
PERC_ECONOMY=sprintf("%.2f%%", prop.table(eventsByTypeEconomy$SUM_ECONOMY) * 100),
PERC_PROPDMG=sprintf("%.2f%%", prop.table(eventsByTypeEconomy$SUM_PROPDMG) * 100),
PERC_CROPDMG=sprintf("%.2f%%", prop.table(eventsByTypeEconomy$SUM_CROPDMG) * 100))
eventsByTypeEconomy
## EVTYPE SUM_ECONOMY SUM_PROPDMG SUM_CROPDMG
## 14 FLOOD 150319678250 144657709800 5661968450
## 38 TORNADO 57362333650 56947380540 414953110
## 18 HAIL 18761221670 15735267220 3025954450
## 13 FLASH FLOOD 18243990610 16822673510 1421317100
## 8 DROUGHT 15018672000 1046106000 13972566000
## 24 ICE STORM 8967041310 3944927810 5022113500
## 40 TROPICAL STORM 8382236550 7703890550 678346000
## 45 WINTER STORM 6715441250 6688497250 26944000
## 23 HIGH WIND 5908617565 5270046265 638571300
## 44 WILDFIRE 5060586800 4765114000 295472800
## 35 STORM SURGE/TIDE 4642038000 4641188000 850000
## 37 THUNDERSTORM WIND 3897965390 3483122340 414843050
## 20 HEAVY RAIN 1427647890 694248090 733399800
## 16 FROST/FREEZE 1103566000 9480000 1094086000
## 21 HEAVY SNOW 1067412240 932759140 134653100
## 27 LIGHTNING 942471370 930379280 12092090
## 3 BLIZZARD 771273950 659213950 112060000
## 11 EXCESSIVE HEAT 500155700 7753700 492402000
## 19 HEAT 403258500 1797000 401461500
## 36 STRONG WIND 240194950 175241450 64953500
## 4 COASTAL FLOOD 237665560 237665560 0
## 41 TSUNAMI 144082000 144062000 20000
## 22 HIGH SURF 89575000 89575000 0
## 25 LAKE-EFFECT SNOW 40115000 40115000 0
## 46 WINTER WEATHER 35866000 20866000 15000000
## 6 DENSE FOG 9674000 9674000 0
## 43 WATERSPOUT 9353700 9353700 0
## 12 EXTREME COLD/WIND CHILL 8698000 8648000 50000
## 10 DUST STORM 8649000 5549000 3100000
## 26 LAKESHORE FLOOD 7540000 7540000 0
## 2 AVALANCHE 3721800 3721800 0
## 5 COLD/WIND CHILL 2590000 1990000 600000
## 15 FREEZING FOG 2182000 2182000 0
## 39 TROPICAL DEPRESSION 1737000 1737000 0
## 29 MARINE HIGH WIND 1297010 1297010 0
## 33 SEICHE 980000 980000 0
## 9 DUST DEVIL 700330 700330 0
## 42 VOLCANIC ASH 500000 500000 0
## 31 MARINE THUNDERSTORM WIND 486400 436400 50000
## 30 MARINE STRONG WIND 418330 418330 0
## 1 ASTRONOMICAL LOW TIDE 320000 320000 0
## 17 FUNNEL CLOUD 194600 194600 0
## 7 DENSE SMOKE 100000 100000 0
## 28 MARINE HAIL 4000 4000 0
## 32 RIP CURRENT 1000 1000 0
## 34 SLEET 0 0 0
## PERC_ECONOMY PERC_PROPDMG PERC_CROPDMG
## 14 48.44% 52.47% 16.35%
## 38 18.48% 20.66% 1.20%
## 18 6.05% 5.71% 8.74%
## 13 5.88% 6.10% 4.10%
## 8 4.84% 0.38% 40.34%
## 24 2.89% 1.43% 14.50%
## 40 2.70% 2.79% 1.96%
## 45 2.16% 2.43% 0.08%
## 23 1.90% 1.91% 1.84%
## 44 1.63% 1.73% 0.85%
## 35 1.50% 1.68% 0.00%
## 37 1.26% 1.26% 1.20%
## 20 0.46% 0.25% 2.12%
## 16 0.36% 0.00% 3.16%
## 21 0.34% 0.34% 0.39%
## 27 0.30% 0.34% 0.03%
## 3 0.25% 0.24% 0.32%
## 11 0.16% 0.00% 1.42%
## 19 0.13% 0.00% 1.16%
## 36 0.08% 0.06% 0.19%
## 4 0.08% 0.09% 0.00%
## 41 0.05% 0.05% 0.00%
## 22 0.03% 0.03% 0.00%
## 25 0.01% 0.01% 0.00%
## 46 0.01% 0.01% 0.04%
## 6 0.00% 0.00% 0.00%
## 43 0.00% 0.00% 0.00%
## 12 0.00% 0.00% 0.00%
## 10 0.00% 0.00% 0.01%
## 26 0.00% 0.00% 0.00%
## 2 0.00% 0.00% 0.00%
## 5 0.00% 0.00% 0.00%
## 15 0.00% 0.00% 0.00%
## 39 0.00% 0.00% 0.00%
## 29 0.00% 0.00% 0.00%
## 33 0.00% 0.00% 0.00%
## 9 0.00% 0.00% 0.00%
## 42 0.00% 0.00% 0.00%
## 31 0.00% 0.00% 0.00%
## 30 0.00% 0.00% 0.00%
## 1 0.00% 0.00% 0.00%
## 17 0.00% 0.00% 0.00%
## 7 0.00% 0.00% 0.00%
## 28 0.00% 0.00% 0.00%
## 32 0.00% 0.00% 0.00%
## 34 0.00% 0.00% 0.00%
Tornado, the focus of our health analysis, is also significant here, accounting for more than 18% of total economic damage.
Among flood events, California is by far the most economically affected state.
As you can see below, the state accounts for almost 80% of total flood damage, most of it property damage.
flood <- stormdata[stormdata$EVTYPE == "FLOOD", ]
# Grouping them by state
floodByState <- ddply(flood, ~STATE, summarise, SUM_PROPDMG=sum(PROPDMG_TOTAL, na.rm=TRUE),
SUM_CROPDMG=sum(CROPDMG_TOTAL, na.rm=TRUE),
SUM_ECONOMY=sum(PROPDMG_TOTAL, na.rm=TRUE)+sum(CROPDMG_TOTAL, na.rm=TRUE))
# Adding percentage
floodByState <- cbind(floodByState,
PERC_ECONOMY=sprintf("%.2f%%", prop.table(floodByState$SUM_ECONOMY) * 100),
PERC_PROPDMG=sprintf("%.2f%%", prop.table(floodByState$SUM_PROPDMG) * 100),
PERC_CROPDMG=sprintf("%.2f%%", prop.table(floodByState$SUM_CROPDMG) * 100))
# Adjusting data to show
floodByState <- floodByState[order(floodByState$SUM_ECONOMY, decreasing = TRUE),
c("STATE", "SUM_ECONOMY", "SUM_PROPDMG", "SUM_CROPDMG", "PERC_ECONOMY", "PERC_PROPDMG", "PERC_CROPDMG")]
floodByState
## STATE SUM_ECONOMY SUM_PROPDMG SUM_CROPDMG PERC_ECONOMY PERC_PROPDMG
## 6 CA 117377795000 116751420000 626375000 78.09% 80.71%
## 46 TN 4249543300 4245346300 4197000 2.83% 2.93%
## 31 ND 3989802000 3916842000 72960000 2.65% 2.71%
## 15 IA 2970229000 1381403000 1588826000 1.98% 0.95%
## 34 NJ 2111650000 2111650000 0 1.40% 1.46%
## 11 FL 1824366700 1144859200 679507500 1.21% 0.79%
## 18 IN 1547180750 848982750 698198000 1.03% 0.59%
## 26 MN 1397963800 1293248800 104715000 0.93% 0.89%
## 37 NY 1328764490 1326149490 2615000 0.88% 0.92%
## 51 VT 1112434000 1100484000 11950000 0.74% 0.76%
## 41 PA 1083142000 1078667000 4475000 0.72% 0.75%
## 28 MS 1033250100 1030650100 2600000 0.69% 0.71%
## 47 TX 975417400 890191400 85226000 0.65% 0.62%
## 40 OR 741032500 722172500 18860000 0.49% 0.50%
## 21 LA 709210400 707510400 1700000 0.47% 0.49%
## 27 MO 701018500 107978500 593040000 0.47% 0.07%
## 36 NV 683945000 677945000 6000000 0.45% 0.47%
## 53 WI 615863990 148975490 466888500 0.41% 0.10%
## 20 KY 554795000 547811000 6984000 0.37% 0.38%
## 12 GA 540163300 528577250 11586050 0.36% 0.37%
## 38 OH 497932500 409079500 88853000 0.33% 0.28%
## 3 AR 478999000 337934000 141065000 0.32% 0.23%
## 30 NC 369368900 244762000 124606900 0.25% 0.17%
## 19 KS 338677750 251888750 86789000 0.23% 0.17%
## 48 UT 331875000 331761500 113500 0.22% 0.23%
## 54 WV 274132000 273982000 150000 0.18% 0.19%
## 2 AL 252140000 251540000 600000 0.17% 0.17%
## 25 MI 249284750 237264750 12020000 0.17% 0.16%
## 22 MA 227335000 227335000 0 0.15% 0.16%
## 52 WA 212683000 212683000 0 0.14% 0.15%
## 42 PR 161589000 113589000 48000000 0.11% 0.08%
## 1 AK 157139940 157131940 8000 0.10% 0.11%
## 17 IL 149631260 123552260 26079000 0.10% 0.09%
## 39 OK 131325000 81065000 50260000 0.09% 0.06%
## 24 ME 120673700 120673700 0 0.08% 0.08%
## 16 ID 114217000 114192000 25000 0.08% 0.08%
## 43 RI 92860000 92860000 0 0.06% 0.06%
## 32 NE 85537900 52620900 32917000 0.06% 0.04%
## 49 VA 77001400 67449400 9552000 0.05% 0.05%
## 23 MD 67444000 67444000 0 0.04% 0.05%
## 45 SD 62864100 34693100 28171000 0.04% 0.02%
## 33 NH 61513270 61313270 200000 0.04% 0.04%
## 7 CO 50575650 46860650 3715000 0.03% 0.03%
## 5 AZ 44535500 39035500 5500000 0.03% 0.03%
## 29 MT 38520400 38020400 500000 0.03% 0.03%
## 35 NM 32200000 32199000 1000 0.02% 0.02%
## 44 SC 32015500 17015500 15000000 0.02% 0.01%
## 8 CT 29358000 29358000 0 0.02% 0.02%
## 55 WY 16879500 15739500 1140000 0.01% 0.01%
## 9 DC 10000000 10000000 0 0.01% 0.01%
## 50 VI 1505000 1505000 0 0.00% 0.00%
## 13 GU 750000 750000 0 0.00% 0.00%
## 10 DE 600000 600000 0 0.00% 0.00%
## 4 AS 547000 547000 0 0.00% 0.00%
## 14 HI 400000 400000 0 0.00% 0.00%
## PERC_CROPDMG
## 6 11.06%
## 46 0.07%
## 31 1.29%
## 15 28.06%
## 34 0.00%
## 11 12.00%
## 18 12.33%
## 26 1.85%
## 37 0.05%
## 51 0.21%
## 41 0.08%
## 28 0.05%
## 47 1.51%
## 40 0.33%
## 21 0.03%
## 27 10.47%
## 36 0.11%
## 53 8.25%
## 20 0.12%
## 12 0.20%
## 38 1.57%
## 3 2.49%
## 30 2.20%
## 19 1.53%
## 48 0.00%
## 54 0.00%
## 2 0.01%
## 25 0.21%
## 22 0.00%
## 52 0.00%
## 42 0.85%
## 1 0.00%
## 17 0.46%
## 39 0.89%
## 24 0.00%
## 16 0.00%
## 43 0.00%
## 32 0.58%
## 49 0.17%
## 23 0.00%
## 45 0.50%
## 33 0.00%
## 7 0.07%
## 5 0.10%
## 29 0.01%
## 35 0.00%
## 44 0.26%
## 8 0.00%
## 55 0.02%
## 9 0.00%
## 50 0.00%
## 13 0.00%
## 10 0.00%
## 4 0.00%
## 14 0.00%
Iowa had the largest share of flood-related crop damage (28.06%).
Splitting the economic impact into “Property damage” and “Crop damage”, we see that crop damage started to grow earlier (in the 1990s) than property damage (around 2006).
par(mfrow=c(1,2))
plot(eventsByYear$BGN_YEAR, eventsByYear$SUM_PROPDMG, type="l", main="Storm economy impact - Property", xlab="Year", ylab="USD")
plot(eventsByYear$BGN_YEAR, eventsByYear$SUM_CROPDMG, type="l", main="Storm economy impact - Crop", xlab="Year", ylab="USD")
Regardless of type, this metric varies over time, and authorities need to pursue plans to reduce the impact in the future.