Our objective is to analyze the impact of severe weather events on public health and the economy, based on the storm database collected by the U.S. National Oceanic and Atmospheric Administration (NOAA) from 1950 through November 2011. Our overall hypothesis is that excessive heat and tornadoes are most harmful to population health, while floods and droughts have the greatest economic consequences. To investigate the hypothesis, we use the estimates of fatalities, injuries, property damage and crop damage to determine which types of events are most harmful to population health and the economy. We restrict the analysis to the years 1996 to 2011, as the most recent years are considered more complete.
echo=TRUE
options(scipen = 1)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.2
library(R.utils)
## Warning: package 'R.utils' was built under R version 3.2.2
## Loading required package: R.oo
## Warning: package 'R.oo' was built under R version 3.2.2
## Loading required package: R.methodsS3
## Warning: package 'R.methodsS3' was built under R version 3.2.2
## R.methodsS3 v1.7.0 (2015-02-19) successfully loaded. See ?R.methodsS3 for help.
## R.oo v1.19.0 (2015-02-27) successfully loaded. See ?R.oo for help.
##
## Attaching package: 'R.oo'
##
## The following objects are masked from 'package:methods':
##
## getClasses, getMethods
##
## The following objects are masked from 'package:base':
##
## attach, detach, gc, load, save
##
## R.utils v2.1.0 (2015-05-27) successfully loaded. See ?R.utils for help.
##
## Attaching package: 'R.utils'
##
## The following object is masked from 'package:utils':
##
## timestamp
##
## The following objects are masked from 'package:base':
##
## cat, commandArgs, getOption, inherits, isOpen, parse, warnings
library(reshape2)
## Warning: package 'reshape2' was built under R version 3.2.2
library(plyr)
## Warning: package 'plyr' was built under R version 3.2.2
library(data.table)
## Warning: package 'data.table' was built under R version 3.2.2
##
## Attaching package: 'data.table'
##
## The following objects are masked from 'package:reshape2':
##
## dcast, melt
First, we set the directory, download the file and unzip the file.
setwd('C:/Users/user/Desktop/PeerAssessment2')
getwd()
## [1] "C:/Users/user/Desktop/PeerAssessment2"
list.files()
## [1] "repdata-data-StormData.csv.bz2" "StormData.csv"
## [3] "StormData.html" "StormData.Rmd"
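fileUrl<-"https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Download the archive only if it is not already in the working directory
if (!file.exists("repdata-data-StormData.csv.bz2")) download.file(fileUrl, "repdata-data-StormData.csv.bz2", mode = "wb")
bunzip2("repdata-data-StormData.csv.bz2", "StormData.csv", remove = FALSE, skip = TRUE)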
## [1] "StormData.csv"
## attr(,"temporary")
## [1] FALSE
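As a side note on the decompression step, base R can usually read a bzip2-compressed CSV directly, because `read.csv()` opens the path through `file()`, which detects the compression. The sketch below demonstrates this on a tiny temporary file. The file name and its two rows are illustrative stand-ins, not the storm data.

```r
# Toy stand-in for the storm archive; path and contents are illustrative only.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(1, 0)),
          con, row.names = FALSE)
close(con)

# read.csv() reads the compressed file without an explicit bunzip2() step.
toy <- read.csv(tmp, stringsAsFactors = FALSE)
print(toy)
unlink(tmp)
```

Unzipping first, as done in this report, keeps a plain `StormData.csv` on disk, which is convenient when the file is inspected with other tools.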
Once the data file exists, read the .csv file.
storm_data <- read.csv("StormData.csv")
head(storm_data)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL
## 4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL
## 5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL
## 6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO 0 0
## 2 TORNADO 0 0
## 3 TORNADO 0 0
## 4 TORNADO 0 0
## 5 TORNADO 0 0
## 6 TORNADO 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1 NA 0 14.0 100 3 0 0
## 2 NA 0 2.0 150 2 0 0
## 3 NA 0 0.1 123 2 0 0
## 4 NA 0 0.0 100 2 0 0
## 5 NA 0 0.0 150 2 0 0
## 6 NA 0 1.5 177 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1 15 25.0 K 0
## 2 0 2.5 K 0
## 3 2 25.0 K 0
## 4 2 2.5 K 0
## 5 2 2.5 K 0
## 6 6 2.5 K 0
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3040 8812 3051 8806 1
## 2 3042 8755 0 0 2
## 3 3340 8742 0 0 3
## 4 3458 8626 0 0 4
## 5 3412 8642 0 0 5
## 6 3450 8748 0 0 6
str(storm_data)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
## $ BGN_TIME : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
## $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
## $ STATE : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ EVTYPE : Factor w/ 985 levels " HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : Factor w/ 35 levels ""," N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_LOCATI: Factor w/ 54429 levels "","- 1 N Albion",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ END_DATE : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ END_TIME : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ END_LOCATI: Factor w/ 34506 levels "","- .5 NNW",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ WFO : Factor w/ 542 levels ""," CI","$AC",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ ZONENAMES : Factor w/ 25112 levels ""," "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : Factor w/ 436774 levels "","-2 at Deer Park\n",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
dim(storm_data)
## [1] 902297 37
The recorded events start in 1950 and end in November 2011. Few events were recorded in the earlier years, so the most recent years should be considered more complete.
storm_data$year <- as.numeric(format(as.Date(storm_data$BGN_DATE, format = "%m/%d/%Y %H:%M:%S"), "%Y"))
storm_event <- storm_data[storm_data$year < 1955,]
dim(storm_event)
## [1] 1865 38
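To make the year extraction concrete, here is a minimal sketch on three date strings in the same `BGN_DATE` layout. The first two appear in the `head()` output above. The third is a made-up value, added only to show a post-1995 case.

```r
# Two BGN_DATE values from the data plus one hypothetical 1996 value.
bgn_date <- c("4/18/1950 0:00:00", "6/8/1951 0:00:00", "6/8/1996 0:00:00")

# Parse month/day/year; the time part is matched by the format and then dropped.
parsed <- as.Date(bgn_date, format = "%m/%d/%Y %H:%M:%S")

# Pull out the four-digit year and convert it to a number for comparisons.
year <- as.numeric(format(parsed, "%Y"))
print(year)

# Logical filters equivalent to the subsets used in this report.
before_1955 <- year < 1955
after_1995 <- year > 1995
```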
summary(storm_event$EVTYPE)
## TORNADO HIGH SURF ADVISORY
## 1865 0
## COASTAL FLOOD FLASH FLOOD
## 0 0
## LIGHTNING TSTM WIND
## 0 0
## TSTM WIND (G45) WATERSPOUT
## 0 0
## WIND ?
## 0 0
## ABNORMAL WARMTH ABNORMALLY DRY
## 0 0
## ABNORMALLY WET ACCUMULATED SNOWFALL
## 0 0
## AGRICULTURAL FREEZE APACHE COUNTY
## 0 0
## ASTRONOMICAL HIGH TIDE ASTRONOMICAL LOW TIDE
## 0 0
## AVALANCE AVALANCHE
## 0 0
## BEACH EROSIN Beach Erosion
## 0 0
## BEACH EROSION BEACH EROSION/COASTAL FLOOD
## 0 0
## BEACH FLOOD BELOW NORMAL PRECIPITATION
## 0 0
## BITTER WIND CHILL BITTER WIND CHILL TEMPERATURES
## 0 0
## Black Ice BLACK ICE
## 0 0
## BLIZZARD BLIZZARD AND EXTREME WIND CHIL
## 0 0
## BLIZZARD AND HEAVY SNOW Blizzard Summary
## 0 0
## BLIZZARD WEATHER BLIZZARD/FREEZING RAIN
## 0 0
## BLIZZARD/HEAVY SNOW BLIZZARD/HIGH WIND
## 0 0
## BLIZZARD/WINTER STORM BLOW-OUT TIDE
## 0 0
## BLOW-OUT TIDES BLOWING DUST
## 0 0
## blowing snow Blowing Snow
## 0 0
## BLOWING SNOW BLOWING SNOW- EXTREME WIND CHI
## 0 0
## BLOWING SNOW & EXTREME WIND CH BLOWING SNOW/EXTREME WIND CHIL
## 0 0
## BREAKUP FLOODING BRUSH FIRE
## 0 0
## BRUSH FIRES COASTAL FLOODING/EROSION
## 0 0
## COASTAL EROSION Coastal Flood
## 0 0
## COASTAL FLOOD coastal flooding
## 0 0
## Coastal Flooding COASTAL FLOODING
## 0 0
## COASTAL FLOODING/EROSION Coastal Storm
## 0 0
## COASTAL STORM COASTAL SURGE
## 0 0
## COASTAL/TIDAL FLOOD COASTALFLOOD
## 0 0
## COASTALSTORM Cold
## 0 0
## COLD COLD AIR FUNNEL
## 0 0
## COLD AIR FUNNELS COLD AIR TORNADO
## 0 0
## Cold and Frost COLD AND FROST
## 0 0
## COLD AND SNOW COLD AND WET CONDITIONS
## 0 0
## Cold Temperature COLD TEMPERATURES
## 0 0
## COLD WAVE COLD WEATHER
## 0 0
## COLD WIND CHILL TEMPERATURES COLD/WIND CHILL
## 0 0
## COLD/WINDS COOL AND WET
## 0 0
## COOL SPELL CSTL FLOODING/EROSION
## 0 0
## DAM BREAK DAM FAILURE
## 0 0
## Damaging Freeze DAMAGING FREEZE
## 0 0
## DEEP HAIL DENSE FOG
## 0 0
## DENSE SMOKE DOWNBURST
## 0 0
## DOWNBURST WINDS DRIEST MONTH
## 0 0
## Drifting Snow DROUGHT
## 0 0
## DROUGHT/EXCESSIVE HEAT DROWNING
## 0 0
## DRY (Other)
## 0 0
As the result above shows, Tornado is the only event type recorded before 1955, so this analysis treats those early years as incomplete.
hist(storm_data$year,xlab="Year",ylab="Frequency",main = "Event Per Year",breaks=30)
Based on the histogram above, the number of recorded events increases significantly after 1995. Hence, we use the subset of the data starting in 1996 to work with the more complete records.
event_types <- storm_data[storm_data$year > 1995,]
dim(event_types)
## [1] 653530 38
summary(event_types$EVTYPE)
## HAIL TSTM WIND
## 207715 128662
## THUNDERSTORM WIND FLASH FLOOD
## 81402 50999
## FLOOD TORNADO
## 24247 23154
## HIGH WIND HEAVY SNOW
## 19907 14000
## LIGHTNING HEAVY RAIN
## 13203 11509
## WINTER STORM WINTER WEATHER
## 11317 6968
## MARINE TSTM WIND FUNNEL CLOUD
## 6175 6058
## MARINE THUNDERSTORM WIND STRONG WIND
## 5812 3561
## URBAN/SML STREAM FLD WATERSPOUT
## 3392 3390
## WILDFIRE BLIZZARD
## 2732 2633
## DROUGHT ICE STORM
## 2433 1879
## EXCESSIVE HEAT WILD/FOREST FIRE
## 1656 1443
## FROST/FREEZE DENSE FOG
## 1342 1193
## WINTER WEATHER/MIX TSTM WIND/HAIL
## 1104 1028
## EXTREME COLD/WIND CHILL HIGH SURF
## 1002 717
## HEAT TROPICAL STORM
## 716 682
## LAKE-EFFECT SNOW EXTREME COLD
## 635 615
## COASTAL FLOOD LANDSLIDE
## 589 588
## COLD/WIND CHILL FOG
## 539 532
## MARINE HAIL RIP CURRENT
## 442 432
## DUST STORM SNOW
## 417 395
## AVALANCHE WIND
## 378 320
## RIP CURRENTS STORM SURGE
## 302 253
## HEAVY SURF/HIGH SURF EXTREME WINDCHILL
## 228 204
## STRONG WINDS FREEZING RAIN
## 184 176
## ASTRONOMICAL LOW TIDE DRY MICROBURST
## 174 173
## HURRICANE LIGHT SNOW
## 170 152
## STORM SURGE/TIDE MARINE HIGH WIND
## 148 135
## RECORD WARMTH DUST DEVIL
## 135 128
## UNSEASONABLY WARM COASTAL FLOODING
## 113 107
## ASTRONOMICAL HIGH TIDE RIVER FLOOD
## 103 102
## MODERATE SNOWFALL HURRICANE/TYPHOON
## 101 88
## WINTRY MIX HEAVY SURF
## 82 77
## FREEZE TROPICAL DEPRESSION
## 67 60
## SLEET GUSTY WINDS
## 58 51
## MARINE STRONG WIND OTHER
## 48 47
## UNSEASONABLY DRY FREEZING FOG
## 46 45
## SMALL HAIL Temperature record
## 45 43
## FROST RECORD HEAT
## 42 40
## TSTM WIND (G45) Coastal Flooding
## 39 38
## MONTHLY PRECIPITATION COLD
## 36 34
## MIXED PRECIPITATION Snow
## 34 30
## RECORD COLD EXCESSIVE SNOW
## 29 25
## ICY ROADS GUSTY WIND
## 24 23
## LAKESHORE FLOOD LIGHT FREEZING RAIN
## 23 23
## VOLCANIC ASH GLAZE
## 22 21
## Light Snow SEICHE
## 21 21
## TIDAL FLOODING TSUNAMI
## 20 20
## EXTREME WINDCHILL TEMPERATURES LAKE EFFECT SNOW
## 19 19
## THUNDERSTORM (Other)
## 19 1150
In this section, we determine the number of fatalities and injuries caused by severe weather events. The helper below sums a given column by event type and extracts the 20 event types with the highest totals.
combine_data<-function(header_name,top = 20, dataset=storm_data){
header_title <- which(colnames(dataset) == header_name)
aggregate_evtype <- aggregate(dataset[,header_title],by=list(dataset$EVTYPE),FUN = "sum")
names(aggregate_evtype) <- c("EVTYPE",header_name)
aggregate_evtype <-arrange(aggregate_evtype,aggregate_evtype[,2],decreasing = TRUE)
aggregate_evtype <-head(aggregate_evtype,n=top)
aggregate_evtype <-within(aggregate_evtype, EVTYPE <- factor(x=EVTYPE,levels = aggregate_evtype$EVTYPE))
return(aggregate_evtype)
}
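The aggregation performed by `combine_data` can be illustrated on a toy data frame. The sketch below is a base-R approximation of the same steps (sum by event type, sort in decreasing order, keep the top rows, fix the factor order for plotting). The name `top_events` and the toy values are hypothetical and not part of the analysis.

```r
# Hypothetical base-R counterpart of combine_data(), without plyr.
top_events <- function(column, top = 20, dataset) {
  # Sum the chosen column within each event type.
  totals <- aggregate(dataset[[column]],
                      by = list(EVTYPE = dataset$EVTYPE), FUN = sum)
  names(totals)[2] <- column
  # Sort by the total, largest first, and keep the top rows.
  totals <- totals[order(-totals[[column]]), ]
  totals <- head(totals, n = top)
  # Freeze the factor levels in ranked order so plots keep this ordering.
  totals$EVTYPE <- factor(totals$EVTYPE, levels = totals$EVTYPE)
  rownames(totals) <- NULL
  totals
}

# Made-up records: TORNADO sums to 5, FLOOD to 4, HAIL to 1.
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL", "TORNADO", "FLOOD", "HAIL"),
                  FATALITIES = c(3, 0, 2, 4, 1),
                  stringsAsFactors = FALSE)
result <- top_events("FATALITIES", top = 2, dataset = toy)
print(result)
```

Setting the factor levels in ranked order is what makes the bar charts later in the report appear sorted by severity rather than alphabetically.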
Property damage and crop damage are converted into comparable numeric form. The PROPDMGEXP and CROPDMGEXP columns hold the magnitude of each estimate, using the codes hundred (H), thousand (K), million (M) and billion (B).
summary(event_types$PROPDMGEXP)
## - ? + 0 1 2 3 4 5
## 276185 0 0 0 1 0 0 0 0 0
## 6 7 8 B h H K m M
## 0 0 0 32 0 0 369938 0 7374
event_types$PROPDMGEXP<-as.character(event_types$PROPDMGEXP)
event_types$PROPDMGEXP[event_types$PROPDMGEXP %in% c("-","?","+","1","2","3","4","5","6","7","8")]<-NA
event_types$PROPDMGEXP[event_types$PROPDMGEXP %in% c("","0")]<-0
event_types$PROPDMGEXP<-gsub("B", 10^9, event_types$PROPDMGEXP)
event_types$PROPDMGEXP<-gsub("K", 1000, event_types$PROPDMGEXP)
event_types$PROPDMGEXP<-gsub("H", 100, event_types$PROPDMGEXP)
event_types$PROPDMGEXP<-gsub("M", 10^6, event_types$PROPDMGEXP)
event_types$PROPDMGEXP<-as.numeric(event_types$PROPDMGEXP)
event_types$PROPDMG<-event_types$PROPDMGEXP*event_types$PROPDMG
summary(event_types$CROPDMGEXP)
## ? 0 2 B k K m M
## 373069 0 0 0 4 0 278686 0 1771
event_types$CROPDMGEXP<-as.character(event_types$CROPDMGEXP)
event_types$CROPDMGEXP[event_types$CROPDMGEXP %in% c("?","2")]<-NA
event_types$CROPDMGEXP[event_types$CROPDMGEXP %in% c("","0")]<-0
event_types$CROPDMGEXP<-gsub("B", 10^9, event_types$CROPDMGEXP)
event_types$CROPDMGEXP<-gsub("K", 1000, event_types$CROPDMGEXP)
event_types$CROPDMGEXP<-gsub("M", 10^6, event_types$CROPDMGEXP)
event_types$CROPDMGEXP<-as.numeric(event_types$CROPDMGEXP)
event_types$CROPDMG<-event_types$CROPDMGEXP*event_types$CROPDMG
summary(event_types$CROPDMG)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 53180 0 1510000000
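The exponent conversion above can also be written as a single lookup table, which avoids the chain of `gsub()` calls. The sketch below is one possible formulation, not the code used to produce the results in this report. The names `exp_multiplier` and `to_dollars` are hypothetical. It mirrors the report's choices: blank and "0" codes give a multiplier of 0, and unrecognized codes give NA. Treating a blank code as a multiplier of 1 would be a different assumption and would change the totals.

```r
# Multiplier for each recognized magnitude code.
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Convert an amount and its magnitude code into a dollar value.
to_dollars <- function(amount, exp_code) {
  code <- toupper(as.character(exp_code))
  # Codes missing from the table (such as "?" or "") look up as NA.
  mult <- unname(exp_multiplier[code])
  # Mirror the report: blank and "0" codes are given a multiplier of 0.
  mult[code %in% c("", "0")] <- 0
  amount * mult
}

# Illustrative inputs, one per kind of code.
dollars <- to_dollars(c(25, 2.5, 1.5, 3, 10), c("K", "M", "B", "", "?"))
print(dollars)
```

Unlike the `gsub()` version, the lookup also accepts lower-case codes such as "k" or "m", because the code is upper-cased first.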
Question 1 Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
We build two sorted lists of severe weather events, one ranked by fatalities and one by injuries, to measure their impact on population health.
Fatalities
fatalities <- combine_data("FATALITIES", dataset = event_types)
fatalities
## EVTYPE FATALITIES
## 1 EXCESSIVE HEAT 1797
## 2 TORNADO 1511
## 3 FLASH FLOOD 887
## 4 LIGHTNING 651
## 5 FLOOD 414
## 6 RIP CURRENT 340
## 7 TSTM WIND 241
## 8 HEAT 237
## 9 HIGH WIND 235
## 10 AVALANCHE 223
## 11 RIP CURRENTS 202
## 12 WINTER STORM 191
## 13 THUNDERSTORM WIND 130
## 14 EXTREME COLD/WIND CHILL 125
## 15 EXTREME COLD 113
## 16 HEAVY SNOW 107
## 17 STRONG WIND 103
## 18 COLD/WIND CHILL 95
## 19 HEAVY RAIN 94
## 20 HIGH SURF 87
PlotFatalities<-qplot(EVTYPE,data = fatalities,weight=FATALITIES,geom="bar",binwidth=1)+
theme(axis.text.x = element_text(angle = 90, hjust = 1))+
xlab("Types of Weather Event")+
ylab("Number of Deaths")+
ggtitle("Number of Fatalities Caused by Weather Events in the U.S. (1996-2011)")
PlotFatalities
Injuries
injuries <- combine_data("INJURIES", dataset = event_types)
injuries
## EVTYPE INJURIES
## 1 TORNADO 20667
## 2 FLOOD 6758
## 3 EXCESSIVE HEAT 6391
## 4 LIGHTNING 4141
## 5 TSTM WIND 3629
## 6 FLASH FLOOD 1674
## 7 THUNDERSTORM WIND 1400
## 8 WINTER STORM 1292
## 9 HURRICANE/TYPHOON 1275
## 10 HEAT 1222
## 11 HIGH WIND 1083
## 12 WILDFIRE 911
## 13 HAIL 713
## 14 FOG 712
## 15 HEAVY SNOW 698
## 16 WILD/FOREST FIRE 545
## 17 BLIZZARD 385
## 18 DUST STORM 376
## 19 WINTER WEATHER 343
## 20 TROPICAL STORM 338
PlotInjuries<-qplot(EVTYPE,data = injuries,weight=INJURIES,geom="bar",binwidth=1)+
theme(axis.text.x = element_text(angle = 90, hjust = 1))+
xlab("Types of Weather Event")+
ylab("Number of Injuries")+
ggtitle("Number of Injuries Caused by Weather Events in the U.S. (1996-2011)")
PlotInjuries
Based on the bar plots above, we can conclude that Excessive Heat and Tornado caused the most fatalities, while Tornado also caused the most injuries in the United States from 1996 to 2011.
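The fatalities table lists RIP CURRENT and RIP CURRENTS as separate event types, so near-duplicate labels split what is arguably one category. The sketch below shows one possible way to merge such labels before aggregating. The `synonyms` mapping is a hypothetical example and was not applied in this analysis. The three rows reuse the fatality counts from the table above.

```r
# Three rows taken from the fatalities table in this report.
fatal <- data.frame(EVTYPE = c("RIP CURRENT", "RIP CURRENTS", "HEAT"),
                    FATALITIES = c(340, 202, 237),
                    stringsAsFactors = FALSE)

# Hypothetical mapping from a variant label to its canonical label.
synonyms <- c("RIP CURRENTS" = "RIP CURRENT")

# Normalize case and surrounding whitespace, then apply the mapping.
clean <- toupper(trimws(fatal$EVTYPE))
hit <- clean %in% names(synonyms)
clean[hit] <- synonyms[clean[hit]]
fatal$EVTYPE <- clean

# Re-aggregate so the merged labels are summed together.
merged <- aggregate(FATALITIES ~ EVTYPE, data = fatal, FUN = sum)
print(merged)
```

With the two labels merged, rip currents account for 542 fatalities, which would place them above FLOOD (414) in the ranking. The leading positions of Excessive Heat and Tornado are unaffected.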
Question 2 Across the United States, which types of events have the greatest economic consequences?
Severe weather events affect the economy through two types of damage: property damage and crop damage.
prop_dmg <- as.data.table(subset(aggregate(PROPDMG ~ EVTYPE, data = event_types, FUN = "sum"), PROPDMG > 0))
prop_dmg
## EVTYPE PROPDMG
## 1: HIGH SURF ADVISORY 200000
## 2: FLASH FLOOD 50000
## 3: TSTM WIND 8100000
## 4: TSTM WIND (G45) 8000
## 5: ASTRONOMICAL HIGH TIDE 9425000
## ---
## 172: WINTER WEATHER 20866000
## 173: WINTER WEATHER MIX 60000
## 174: WINTER WEATHER/MIX 6372000
## 175: Wintry Mix 2500
## 176: WINTRY MIX 10000
crop_dmg <- as.data.table(subset(aggregate(CROPDMG ~ EVTYPE, data = event_types, FUN = "sum"), CROPDMG > 0))
crop_dmg
## EVTYPE CROPDMG
## 1: AGRICULTURAL FREEZE 28820000
## 2: BLIZZARD 7060000
## 3: COLD/WIND CHILL 600000
## 4: Damaging Freeze 34130000
## 5: DROUGHT 13367566000
## 6: DRY MICROBURST 15000
## 7: DUST STORM 3100000
## 8: Early Frost 42000000
## 9: EXCESSIVE HEAT 492402000
## 10: Extreme Cold 20000000
## 11: EXTREME COLD 1288973000
## 12: EXTREME COLD/WIND CHILL 50000
## 13: EXTREME WINDCHILL 17000000
## 14: FLASH FLOOD 1334901700
## 15: FLOOD 4974778400
## 16: Freeze 10500000
## 17: FREEZE 146225000
## 18: Frost/Freeze 100000
## 19: FROST/FREEZE 1094086000
## 20: GUSTY WIND 10000
## 21: GUSTY WINDS 200000
## 22: HAIL 2476029450
## 23: HARD FREEZE 12900000
## 24: HEAT 176500
## 25: HEAVY RAIN 728169800
## 26: Heavy Rain/High Surf 1500000
## 27: HEAVY SNOW 71122100
## 28: HIGH WIND 633561300
## 29: HURRICANE 2741410000
## 30: HURRICANE/TYPHOON 2607872800
## 31: ICE STORM 15660000
## 32: LANDSLIDE 20017000
## 33: LIGHTNING 6898440
## 34: MARINE THUNDERSTORM WIND 50000
## 35: OTHER 1034400
## 36: RAIN 250000
## 37: RIVER FLOOD 1875000
## 38: River Flooding 28020000
## 39: SMALL HAIL 20793000
## 40: STORM SURGE 5000
## 41: STORM SURGE/TIDE 850000
## 42: STRONG WIND 64953500
## 43: THUNDERSTORM WIND 398331000
## 44: TORNADO 283425010
## 45: TROPICAL STORM 677711000
## 46: TSTM WIND 553915350
## 47: TSTM WIND/HAIL 64696250
## 48: TSUNAMI 20000
## 49: TYPHOON 825000
## 50: Unseasonable Cold 5100000
## 51: UNSEASONABLY COLD 25042500
## 52: UNSEASONABLY WARM 10000
## 53: UNSEASONAL RAIN 10000000
## 54: URBAN/SML STREAM FLD 8488100
## 55: WILD/FOREST FIRE 106782330
## 56: WILDFIRE 295472800
## 57: WIND 300000
## 58: WINTER STORM 11944000
## 59: WINTER WEATHER 15000000
## EVTYPE CROPDMG
economic_consequences <- join_all(list(prop_dmg, crop_dmg), by="EVTYPE")
economic_consequences$TOTAL <- economic_consequences$PROPDMG + economic_consequences$CROPDMG
economic_consequences <- economic_consequences[order(-economic_consequences$TOTAL),][1:20,]
economic_consequences <- transform(economic_consequences, EVTYPE=reorder(EVTYPE, TOTAL) )
economic_consequences <- melt(economic_consequences, id=c("EVTYPE"), measure.vars=c("PROPDMG","CROPDMG"))
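One detail of the join above is worth noting. `join_all` performs a left join by default, and `crop_dmg` only contains event types with positive crop damage, so an event type with property damage but no crop damage ends up with NA for CROPDMG and therefore for TOTAL. The sketch below shows an alternative using a full outer join with base `merge()` and zero-filling. It is an illustration under that assumption, not the code behind the results shown here, and the toy values are made up.

```r
# Made-up damage totals; STORM SURGE has no crop damage entry.
prop <- data.frame(EVTYPE = c("FLOOD", "STORM SURGE", "DROUGHT"),
                   PROPDMG = c(100, 50, 1), stringsAsFactors = FALSE)
crop <- data.frame(EVTYPE = c("FLOOD", "DROUGHT"),
                   CROPDMG = c(5, 13), stringsAsFactors = FALSE)

# Full outer join keeps event types present in either table.
both <- merge(prop, crop, by = "EVTYPE", all = TRUE)

# Treat a missing damage figure as zero so the total is always defined.
both$PROPDMG[is.na(both$PROPDMG)] <- 0
both$CROPDMG[is.na(both$CROPDMG)] <- 0
both$TOTAL <- both$PROPDMG + both$CROPDMG

# Rank by total damage, largest first.
both <- both[order(-both$TOTAL), ]
print(both)
```

In the toy data, STORM SURGE keeps its total of 50 instead of becoming NA and falling to the bottom of the ranking.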
With both types of damage combined, we plot the totals to determine which event type has the greatest economic consequences.
ggplot(data=economic_consequences, aes(EVTYPE, value/(10^6), fill=variable)) +
geom_bar(stat="identity")+labs(x="Types of Weather Event") +
labs(y="Total Damage (in million USD)") +
ggtitle("Total Damage Caused by Weather Event") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
list(economic_consequences)
## [[1]]
## EVTYPE variable value
## 1: FLOOD PROPDMG 143944833550
## 2: HURRICANE/TYPHOON PROPDMG 69305840000
## 3: STORM SURGE PROPDMG 43193536000
## 4: TORNADO PROPDMG 24616945710
## 5: HAIL PROPDMG 14595143420
## 6: FLASH FLOOD PROPDMG 15222203910
## 7: HURRICANE PROPDMG 11812819010
## 8: DROUGHT PROPDMG 1046101000
## 9: TROPICAL STORM PROPDMG 7642475550
## 10: HIGH WIND PROPDMG 5247860360
## 11: WILDFIRE PROPDMG 4758667000
## 12: TSTM WIND PROPDMG 4478026440
## 13: STORM SURGE/TIDE PROPDMG 4641188000
## 14: THUNDERSTORM WIND PROPDMG 3382654440
## 15: ICE STORM PROPDMG 3642248810
## 16: WILD/FOREST FIRE PROPDMG 3001782500
## 17: WINTER STORM PROPDMG 1532743250
## 18: HEAVY RAIN PROPDMG 584864440
## 19: EXTREME COLD PROPDMG 19760400
## 20: FROST/FREEZE PROPDMG 9480000
## 21: FLOOD CROPDMG 4974778400
## 22: HURRICANE/TYPHOON CROPDMG 2607872800
## 23: STORM SURGE CROPDMG 5000
## 24: TORNADO CROPDMG 283425010
## 25: HAIL CROPDMG 2476029450
## 26: FLASH FLOOD CROPDMG 1334901700
## 27: HURRICANE CROPDMG 2741410000
## 28: DROUGHT CROPDMG 13367566000
## 29: TROPICAL STORM CROPDMG 677711000
## 30: HIGH WIND CROPDMG 633561300
## 31: WILDFIRE CROPDMG 295472800
## 32: TSTM WIND CROPDMG 553915350
## 33: STORM SURGE/TIDE CROPDMG 850000
## 34: THUNDERSTORM WIND CROPDMG 398331000
## 35: ICE STORM CROPDMG 15660000
## 36: WILD/FOREST FIRE CROPDMG 106782330
## 37: WINTER STORM CROPDMG 11944000
## 38: HEAVY RAIN CROPDMG 728169800
## 39: EXTREME COLD CROPDMG 1288973000
## 40: FROST/FREEZE CROPDMG 1094086000
## EVTYPE variable value
Based on the bar plot above, we can conclude that Flood had the greatest overall economic impact in the U.S. from 1996 to 2011. The list above also shows that Drought caused the greatest crop damage over the same period, making it the most costly event type for agriculture.