Introduction
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
Synopsis
This report addresses the following questions:
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Across the United States, which types of events have the greatest economic consequences?
After analysing and summarizing the data for the top 10 weather event types, we can conclude that tornadoes caused the most total harm to population health (judged by the combined number of fatalities and injuries), and floods caused the most total harm to the economy (judged by the combined value of property damage and crop damage).
Data
The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size.
The links to download the dataset and the documentation are below:
Storm Data [47Mb]
National Weather Service Storm Data Documentation
National Climatic Data Center Storm Events FAQ
The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
Data Processing
Install the packages dplyr, reshape2, and ggplot2, then load the following libraries:
# library(dplyr)
# library(ggplot2)
# library(reshape2)
# These calls are commented out here; each library is loaded in the chunk that needs it.
Download the file from the URL and save it in a folder under your working directory.
#fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
#download.file(fileUrl, destfile = "./data/repdata-data-StormData.csv.bz2" , method = "curl")
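As a side note, read.csv() can read a bzip2-compressed file directly, so decompressing the download first is optional. A minimal sketch, using a tiny temporary file as a stand-in for the real download (the file contents and the name toy_storms are made up for illustration):

```r
# Write a tiny bzip2-compressed CSV to a temporary file.
# This stands in for the real StormData.csv.bz2 download.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,1", "HAIL,0"), con)
close(con)

# read.csv() opens the path as a file connection, which detects the
# compression, so the .bz2 file is read without a separate unzip step.
toy_storms <- read.csv(tmp, as.is = TRUE)
dim(toy_storms)
```

The same call pattern should apply to the full dataset, at the cost of a slower read than from an uncompressed copy.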
With the file already downloaded into my working directory, I load it into a data frame and check its dimensions, summary, structure, and column names.
# Set the cache option to TRUE: the dataset is large, and reloading it on every run is time-consuming.
rawstormData <- read.csv("repdata-data-StormData.csv",as.is=TRUE)
dim(rawstormData)
## [1] 902297 37
summary(rawstormData)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE
## Min. : 1.0 Length:902297 Length:902297 Length:902297
## 1st Qu.:19.0 Class :character Class :character Class :character
## Median :30.0 Mode :character Mode :character Mode :character
## Mean :31.2
## 3rd Qu.:45.0
## Max. :95.0
##
## COUNTY COUNTYNAME STATE EVTYPE
## Min. : 0.0 Length:902297 Length:902297 Length:902297
## 1st Qu.: 31.0 Class :character Class :character Class :character
## Median : 75.0 Mode :character Mode :character Mode :character
## Mean :100.6
## 3rd Qu.:131.0
## Max. :873.0
##
## BGN_RANGE BGN_AZI BGN_LOCATI
## Min. : 0.000 Length:902297 Length:902297
## 1st Qu.: 0.000 Class :character Class :character
## Median : 0.000 Mode :character Mode :character
## Mean : 1.484
## 3rd Qu.: 1.000
## Max. :3749.000
##
## END_DATE END_TIME COUNTY_END COUNTYENDN
## Length:902297 Length:902297 Min. :0 Mode:logical
## Class :character Class :character 1st Qu.:0 NA's:902297
## Mode :character Mode :character Median :0
## Mean :0
## 3rd Qu.:0
## Max. :0
##
## END_RANGE END_AZI END_LOCATI
## Min. : 0.0000 Length:902297 Length:902297
## 1st Qu.: 0.0000 Class :character Class :character
## Median : 0.0000 Mode :character Mode :character
## Mean : 0.9862
## 3rd Qu.: 0.0000
## Max. :925.0000
##
## LENGTH WIDTH F MAG
## Min. : 0.0000 Min. : 0.000 Min. :0.0 Min. : 0.0
## 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.:0.0 1st Qu.: 0.0
## Median : 0.0000 Median : 0.000 Median :1.0 Median : 50.0
## Mean : 0.2301 Mean : 7.503 Mean :0.9 Mean : 46.9
## 3rd Qu.: 0.0000 3rd Qu.: 0.000 3rd Qu.:1.0 3rd Qu.: 75.0
## Max. :2315.0000 Max. :4400.000 Max. :5.0 Max. :22000.0
## NA's :843563
## FATALITIES INJURIES PROPDMG
## Min. : 0.0000 Min. : 0.0000 Min. : 0.00
## 1st Qu.: 0.0000 1st Qu.: 0.0000 1st Qu.: 0.00
## Median : 0.0000 Median : 0.0000 Median : 0.00
## Mean : 0.0168 Mean : 0.1557 Mean : 12.06
## 3rd Qu.: 0.0000 3rd Qu.: 0.0000 3rd Qu.: 0.50
## Max. :583.0000 Max. :1700.0000 Max. :5000.00
##
## PROPDMGEXP CROPDMG CROPDMGEXP
## Length:902297 Min. : 0.000 Length:902297
## Class :character 1st Qu.: 0.000 Class :character
## Mode :character Median : 0.000 Mode :character
## Mean : 1.527
## 3rd Qu.: 0.000
## Max. :990.000
##
## WFO STATEOFFIC ZONENAMES LATITUDE
## Length:902297 Length:902297 Length:902297 Min. : 0
## Class :character Class :character Class :character 1st Qu.:2802
## Mode :character Mode :character Mode :character Median :3540
## Mean :2875
## 3rd Qu.:4019
## Max. :9706
## NA's :47
## LONGITUDE LATITUDE_E LONGITUDE_ REMARKS
## Min. :-14451 Min. : 0 Min. :-14455 Length:902297
## 1st Qu.: 7247 1st Qu.: 0 1st Qu.: 0 Class :character
## Median : 8707 Median : 0 Median : 0 Mode :character
## Mean : 6940 Mean :1452 Mean : 3509
## 3rd Qu.: 9605 3rd Qu.:3549 3rd Qu.: 8735
## Max. : 17124 Max. :9706 Max. :106220
## NA's :40
## REFNUM
## Min. : 1
## 1st Qu.:225575
## Median :451149
## Mean :451149
## 3rd Qu.:676723
## Max. :902297
##
str(rawstormData)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
names(rawstormData)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
Of the 902,297 weather events, let’s find out how many unique event types are present.
UniqueEV <- unique(rawstormData$EVTYPE)
length(UniqueEV)
## [1] 985
Let’s summarize and get the count of those unique event types.
# Display the event types by count
suppressMessages(library(dplyr))
eventsCount <- rawstormData %>% select(EVTYPE) %>% group_by(EVTYPE) %>% summarise(count = n()) %>% data.frame()
#eventsCount
# This gives 985 entries with messy data: spelling mistakes, duplicate names, and mixed upper and lower case in the event type descriptions. The documentation lists only 48 valid event types, so let's clean up.
The valid EVTYPE values from the NOAA documentation are as follows:
## [1] "Astronomical Low Tide" "Avalanche"
## [3] "Blizzard" "Coastal Flood"
## [5] "Cold/Wind Chill" "Debris Flow"
## [7] "Dense Fog" "Dense Smoke"
## [9] "Drought" "Dust Devil"
## [11] "Dust Storm" "Excessive Heat"
## [13] "Extreme Cold/Wind Chill" "Flash Flood"
## [15] "Flood" "Freezing Fog"
## [17] "Frost/Freeze" "Funnel Cloud"
## [19] "Hail" "Heat"
## [21] "Heavy Rain" "Heavy Snow"
## [23] "High Surf" "High Wind"
## [25] "Hurricane (Typhoon)" "Ice Storm"
## [27] "Lake-Effect Snow" "Lakeshore Flood"
## [29] "Lightning" "Marine Hail"
## [31] "Marine High Wind" "Marine Strong Wind"
## [33] "Marine Thunderstorm Wind" "Rip Current"
## [35] "Seiche" "Sleet"
## [37] "Storm Surge/Tide" "Strong Wind"
## [39] "Thunderstorm Wind" "Tornado"
## [41] "Tropical Depression" "Tropical Storm"
## [43] "Tsunami" "Volcanic Ash"
## [45] "Waterspout" "Wildfire"
## [47] "Winter Storm" "Winter Weather"
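One way to see how much cleanup is needed is to compare the observed values against the official list after normalising case and whitespace. The sketch below uses a hypothetical four-name subset of the list (valid_types) and made-up raw values (evtype), not the real data:

```r
# Hypothetical subset of the 48 official names, upper-cased for comparison.
valid_types <- toupper(c("Flood", "Hail", "Tornado", "Thunderstorm Wind"))

# Toy EVTYPE values mimicking the kinds of inconsistencies in the raw data.
evtype <- c("TORNADO", "Hail", "TSTM WIND", "FLOODING", " FLOOD")

# Normalise case and surrounding whitespace before comparing.
cleaned <- toupper(trimws(evtype))
is_valid <- cleaned %in% valid_types

# Values that still need a manual mapping rule.
unmatched <- unique(cleaned[!is_valid])
unmatched
```

Here only "TSTM WIND" and "FLOODING" remain unmatched, which is the kind of value the pattern-based rules are meant to handle.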
Since some entries are in lower case, let’s convert every entry in the EVTYPE column to upper case and standardise some of the names.
rawstormData$EVTYPE<-toupper(rawstormData$EVTYPE)
rawstormData[grep("BLIZZARD*", rawstormData$EVTYPE), c("EVTYPE")] <- "BLIZZARD"
rawstormData[grep("COASTAL*", rawstormData$EVTYPE), c("EVTYPE")] <- "COASTAL FLOOD"
rawstormData[grep("*COLD*", rawstormData$EVTYPE), c("EVTYPE")] <- "COLD/WIND CHILL"
rawstormData[grep("EROSION*", rawstormData$EVTYPE), c("EVTYPE")] <- "COASTAL FLOOD"
rawstormData[grep("DENSE FOG", rawstormData$EVTYPE),c("EVTYPE")] <- "FOG"
rawstormData[grep("EXTREME COLD", rawstormData$EVTYPE), c("EVTYPE")] <- "EXTREME COLD/WIND CHILL"
rawstormData[grep("EXTREME WIND CHILL", rawstormData$EVTYPE), c("EVTYPE")] <- "EXTREME COLD/WIND CHILL"
rawstormData[grep("FUNNEL CLOUDS", rawstormData$EVTYPE), c("EVTYPE")] <- "FUNNEL CLOUD"
rawstormData[grep("FREEZE", rawstormData$EVTYPE), c("EVTYPE")] <- "FROST/FREEZE"
rawstormData[grep("FROST", rawstormData$EVTYPE), c("EVTYPE")] <- "FROST/FREEZE"
rawstormData[grep("*FLOOD*", rawstormData$EVTYPE), c("EVTYPE")] <- "FLOOD"
rawstormData[grep("*FLD*", rawstormData$EVTYPE), c("EVTYPE")] <- "FLOOD"
rawstormData[grep("HEAVY SURF", rawstormData$EVTYPE), c("EVTYPE")] <- "HIGH SURF"
rawstormData[grep("HEAVY SURF/HIGH SURF", rawstormData$EVTYPE), c("EVTYPE")] <- "HIGH SURF"
rawstormData[grep("HIGH WINDS", rawstormData$EVTYPE), c("EVTYPE")] <- "HIGH WIND"
rawstormData[grep("HAIL*", rawstormData$EVTYPE), c("EVTYPE")] <- "HAIL"
rawstormData[grep("*HEAT*", rawstormData$EVTYPE), c("EVTYPE")] <- "HEAT"
rawstormData[grep("HEAVY RAIN*", rawstormData$EVTYPE), c("EVTYPE")] <- "HEAVY RAIN"
rawstormData[grep("HURRICANE*", rawstormData$EVTYPE), c("EVTYPE")] <- "HURRICANE/TYPHOON"
rawstormData[grep("SLIDE", rawstormData$EVTYPE),c("EVTYPE")] <- "LAND/MUD/ROCK SLIDES"
rawstormData[grep("ICE", rawstormData$EVTYPE), c("EVTYPE")] <- "ICE STORM"
rawstormData[grep("LIGHTNING|LIGHTING|LIGNTNING", rawstormData$EVTYPE), c("EVTYPE")] <- "LIGHTNING"
rawstormData[grep("*RAIN*", rawstormData$EVTYPE), c("EVTYPE")] <- "RAIN"
rawstormData[grep("RIP CURRENTS", rawstormData$EVTYPE), c("EVTYPE")] <- "RIP CURRENT"
rawstormData[grep("STORM SURGE/TIDE", rawstormData$EVTYPE), c("EVTYPE")] <- "STORM SURGE"
rawstormData[grep("STRONG WINDS", rawstormData$EVTYPE), c("EVTYPE")] <- "STRONG WIND"
rawstormData[grep("TYPHOON", rawstormData$EVTYPE), c("EVTYPE")] <- "HURRICANE/TYPHOON"
rawstormData[grep("TROPICAL D*", rawstormData$EVTYPE), c("EVTYPE")] <- "TROPICAL DEPRESSION"
rawstormData[grep("TROPICAL S*", rawstormData$EVTYPE), c("EVTYPE")] <- "TROPICAL STORM"
rawstormData[grep("TORNADO*", rawstormData$EVTYPE), c("EVTYPE")] <- "TORNADO"
rawstormData[grep("THUNDERSTORM*|TSTM*", rawstormData$EVTYPE), c("EVTYPE")] <- "THUNDERSTORM"
rawstormData[grep("WILD/FOREST FIRE*", rawstormData$EVTYPE), c("EVTYPE")] <- "WILDFIRE"
rawstormData[grep("WINTER WEATHER/MIX", rawstormData$EVTYPE), c("EVTYPE")] <- "WINTER WEATHER"
rawstormData[grep("WATERSPOUT*", rawstormData$EVTYPE), c("EVTYPE")] <- "WATERSPOUT"
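Two properties of grep() are worth keeping in mind when reading rules like these: the pattern is a regular expression matched anywhere in the string (so a trailing `*` is a quantifier on the preceding character, not a wildcard), and the rules run in sequence, so an earlier replacement can change what a later pattern sees. A small sketch on made-up vectors (x, ev, and ev2 are illustrative only):

```r
x <- c("HAIL 75", "MARINE HAIL", "HAIRCUT")

# "L*" means zero or more "L", so any string containing "HAI" matches.
loose <- grepl("HAIL*", x)

# Anchoring with "^" restricts the match to values that start with "HAIL".
anchored <- grepl("^HAIL", x)

# Rule order: a broad pattern applied first rewrites the values
# that a later, more specific pattern was meant to catch.
ev <- c("EXTREME COLD", "COLD")
ev[grep("COLD", ev)] <- "COLD/WIND CHILL"
ev[grep("EXTREME COLD", ev)] <- "EXTREME COLD/WIND CHILL"  # matches nothing now

# Specific rule first, then an anchored broad rule, keeps the two apart.
ev2 <- c("EXTREME COLD", "COLD")
ev2[grep("EXTREME COLD", ev2)] <- "EXTREME COLD/WIND CHILL"
ev2[grep("^COLD", ev2)] <- "COLD/WIND CHILL"
```

After running this, both elements of ev are "COLD/WIND CHILL", while ev2 keeps the extreme and ordinary categories separate.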
With the cleanup done, let’s subset the data frame to keep only the events that caused injuries, fatalities, property damage, or crop damage, along with the columns needed for the analysis.
stormData <- subset(x = rawstormData, subset = INJURIES > 0 | FATALITIES > 0 | PROPDMG > 0 | CROPDMG > 0,
select = c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG","PROPDMGEXP", "CROPDMG", "CROPDMGEXP"))
Let’s check the dimensions of the subsetted data frame, its first and last 3 rows, and the number of unique event types.
dim(stormData)
## [1] 254633 7
head(stormData,3)
## EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO 0 15 25.0 K 0
## 2 TORNADO 0 0 2.5 K 0
## 3 TORNADO 0 2 25.0 K 0
tail(stormData,3)
## EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 902257 STRONG WIND 0 0 1.0 K 0
## 902259 DROUGHT 0 0 2.0 K 0
## 902260 HIGH WIND 0 0 7.5 K 0
## CROPDMGEXP
## 902257 K
## 902259 K
## 902260 K
UniqueEV <- unique(stormData$EVTYPE)
length(UniqueEV)
## [1] 138
Let’s count the unique event types in the subsetted data frame, keeping only those with at least 50 occurrences.
# Display the event types by count
suppressMessages(library(dplyr))
eventsCount <- stormData %>% select(EVTYPE) %>% group_by(EVTYPE) %>% summarise(count = n()) %>%
filter(count >= 50) %>% data.frame()
eventsCount
## EVTYPE count
## 1 AVALANCHE 268
## 2 BLIZZARD 259
## 3 COLD/WIND CHILL 469
## 4 DROUGHT 266
## 5 DRY MICROBURST 78
## 6 DUST DEVIL 95
## 7 DUST STORM 103
## 8 FLOOD 33197
## 9 FOG 181
## 10 FROST/FREEZE 155
## 11 HAIL 26673
## 12 HEAT 3519
## 13 HIGH SURF 221
## 14 HIGH WIND 6191
## 15 HURRICANE/TYPHOON 232
## 16 ICE STORM 752
## 17 LAKE-EFFECT SNOW 194
## 18 LAND/MUD/ROCK SLIDES 209
## 19 LIGHT SNOW 141
## 20 LIGHTNING 13309
## 21 RAIN 95
## 22 RIP CURRENT 641
## 23 SNOW 54
## 24 STORM SURGE 224
## 25 STRONG WIND 3424
## 26 THUNDERSTORM 119292
## 27 TORNADO 39968
## 28 TROPICAL STORM 456
## 29 WATERSPOUT 55
## 30 WILDFIRE 1246
## 31 WIND 84
## 32 WINTER STORM 1508
## 33 WINTER WEATHER 546
The list shrinks to 33 event types because of the filter requiring at least 50 occurrences.
Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Let’s construct a variable called “W_INJURIES” that sums the stormData$INJURIES column by stormData$EVTYPE. This shows which weather event type caused the most injuries.
W_INJURIES <- aggregate(INJURIES ~ EVTYPE,data= stormData,FUN=sum,na.rm= TRUE)
Let’s construct a variable called “W_FATALITIES” that sums the stormData$FATALITIES column by stormData$EVTYPE. This shows which weather event type caused the most fatalities.
W_FATALITIES <- aggregate(FATALITIES ~ EVTYPE,data= stormData,FUN=sum,na.rm=TRUE)
Finally, let’s construct a variable called “W_CASUALTIES” that sums stormData$INJURIES plus stormData$FATALITIES per stormData$EVTYPE. This shows which weather event type was most harmful to population health.
stormData$CASUALTIES <- stormData$INJURIES + stormData$FATALITIES
W_CASUALTIES <- aggregate(CASUALTIES ~ EVTYPE,data= stormData,FUN=sum,na.rm=TRUE)
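The three aggregations can also be produced in a single call by putting several columns on the left-hand side of the formula. This is only a sketch on a made-up three-row data frame (toy and totals are illustrative names), not a replacement for the steps above:

```r
# Toy data standing in for stormData.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT"),
  FATALITIES = c(1, 0, 2),
  INJURIES   = c(14, 2, 0),
  stringsAsFactors = FALSE
)
toy$CASUALTIES <- toy$FATALITIES + toy$INJURIES

# cbind() on the left-hand side sums all three columns per event type at once.
totals <- aggregate(cbind(FATALITIES, INJURIES, CASUALTIES) ~ EVTYPE,
                    data = toy, FUN = sum)

# Rank event types by combined casualties, largest first.
totals <- totals[order(-totals$CASUALTIES), ]
totals
```

In this toy case TORNADO ranks first with 17 casualties (1 fatality plus 16 injuries), ahead of HEAT with 2.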
Question 2: Across the United States, which types of events have the greatest economic consequences?
Now let’s calculate the property (stormData$PROPDMG) and crop (stormData$CROPDMG) damage per event in US dollars. The stormData$PROPDMGEXP and stormData$CROPDMGEXP variables are magnitude fields for the damage figures: per the documentation, H, K, M, and B stand for hundreds, thousands, millions, and billions of US dollars. Any other code (blank, lower case, digits, or symbols) is not in the lookup below and yields NA, so those values are left out of the totals.
magnitude <- c(H = 10^2,K = 10^3, M = 10^6, B = 10^9)
stormData$PROPDMG <- stormData$PROPDMG * magnitude[as.character(stormData$PROPDMGEXP)]
stormData$CROPDMG <- stormData$CROPDMG * magnitude[as.character(stormData$CROPDMGEXP)]
To find which weather event type caused the most expensive damage, let’s create the variable “W_DAMAGES”, which sums property and crop damage in US dollars (the stormData$PROPDMG and stormData$CROPDMG columns) per stormData$EVTYPE.
W_DAMAGES <- aggregate(cbind(PROPDMG,CROPDMG) ~ EVTYPE, data = stormData, FUN=sum,na.rm=TRUE)
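To make the handling of unrecognised magnitude codes concrete, here is a sketch on a made-up three-row data frame (toy, PROP_USD, CROP_USD, both, and prop_only are illustrative names). It shows that a code missing from the lookup yields NA, and that the formula interface of aggregate() omits by default any row where either column on the left-hand side is NA:

```r
magnitude <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "FLOOD"),
  PROPDMG    = c(25, 2, 5),
  PROPDMGEXP = c("K", "M", "k"),   # lower-case "k" is not in the lookup
  CROPDMG    = c(0, 1, 3),
  CROPDMGEXP = c("", "K", "K"),    # blank code is not in the lookup
  stringsAsFactors = FALSE
)

# Name-based lookup: unknown codes return NA, and NA times anything is NA.
toy$PROP_USD <- toy$PROPDMG * unname(magnitude[toy$PROPDMGEXP])
toy$CROP_USD <- toy$CROPDMG * unname(magnitude[toy$CROPDMGEXP])

# With both columns on the left-hand side, only row 2 is complete,
# so rows 1 and 3 are omitted before summing.
both <- aggregate(cbind(PROP_USD, CROP_USD) ~ EVTYPE, data = toy, FUN = sum)

# Aggregating one column at a time keeps every row where that column is known.
prop_only <- aggregate(PROP_USD ~ EVTYPE, data = toy, FUN = sum)
```

In this toy case the joint aggregation reports 2,000,000 USD of property damage while the single-column one reports 2,025,000 USD, so totals from the joint form are best read as lower bounds when many magnitude codes are blank.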
Results for Question 1
W_INJURIES[order(W_INJURIES$INJURIES, decreasing = T), ][1:10, ]
## EVTYPE INJURIES
## 111 TORNADO 91407
## 43 HEAT 10513
## 107 THUNDERSTORM 9447
## 27 FLOOD 8685
## 72 LIGHTNING 5232
## 62 ICE STORM 2154
## 52 HIGH WIND 1496
## 41 HAIL 1467
## 127 WILDFIRE 1456
## 58 HURRICANE/TYPHOON 1333
W_FATALITIES[order(W_FATALITIES$FATALITIES, decreasing = T), ][1:10, ]
## EVTYPE FATALITIES
## 111 TORNADO 5636
## 43 HEAT 3366
## 27 FLOOD 1557
## 72 LIGHTNING 817
## 107 THUNDERSTORM 724
## 87 RIP CURRENT 572
## 12 COLD/WIND CHILL 451
## 52 HIGH WIND 291
## 7 AVALANCHE 224
## 134 WINTER STORM 206
Top10_HEALTH <- W_CASUALTIES[order(W_CASUALTIES$CASUALTIES, decreasing = T), ][1:10,]
Top10_HEALTH
## EVTYPE CASUALTIES
## 111 TORNADO 97043
## 43 HEAT 13879
## 27 FLOOD 10242
## 107 THUNDERSTORM 10171
## 72 LIGHTNING 6049
## 62 ICE STORM 2256
## 52 HIGH WIND 1787
## 127 WILDFIRE 1543
## 134 WINTER STORM 1527
## 41 HAIL 1512
Visualization for Question 1
Let’s plot the top 10 most harmful event types with respect to population health.
library(ggplot2)
ggplot(Top10_HEALTH, aes(reorder(EVTYPE, -CASUALTIES), CASUALTIES)) +
geom_bar(stat = "identity", aes(fill = EVTYPE)) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(x = "Event Type", y = "Total Number of Casualties",
title = "Top 10 Most Harmful Weather Events in US (1950-2011)")
From the graph, we can see that tornadoes are the most harmful event type, followed by heat and floods.
Results for Question 2
library(reshape2)
Top10_ECONOMY <- melt(head(W_DAMAGES[order(-W_DAMAGES$PROPDMG,-W_DAMAGES$CROPDMG), ],10))
## Using EVTYPE as id variables
Top10_ECONOMY
## EVTYPE variable value
## 1 FLOOD PROPDMG 146036259290
## 2 HURRICANE/TYPHOON PROPDMG 38876883000
## 3 TORNADO PROPDMG 16166781890
## 4 HAIL PROPDMG 9596509190
## 5 STORM SURGE PROPDMG 4643558000
## 6 THUNDERSTORM PROPDMG 4340057500
## 7 WILDFIRE PROPDMG 3547227470
## 8 HIGH WIND PROPDMG 2685097340
## 9 TROPICAL STORM PROPDMG 1063393350
## 10 WINTER STORM PROPDMG 1017844200
## 11 FLOOD CROPDMG 11739276150
## 12 HURRICANE/TYPHOON CROPDMG 5313117800
## 13 TORNADO CROPDMG 353383660
## 14 HAIL CROPDMG 2055922950
## 15 STORM SURGE CROPDMG 855000
## 16 THUNDERSTORM CROPDMG 1144216750
## 17 WILDFIRE CROPDMG 284822100
## 18 HIGH WIND CROPDMG 667134850
## 19 TROPICAL STORM CROPDMG 468261000
## 20 WINTER STORM CROPDMG 23724000
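Since the conclusion is judged on combined property and crop damage, the two components can be summed per event type. The sketch below re-enters the figures for the three largest event types from the table above by hand (top3 and combined are illustrative names):

```r
# Property and crop damage (USD) for the top three event types,
# copied from the Top10_ECONOMY output above.
top3 <- data.frame(
  EVTYPE   = rep(c("FLOOD", "HURRICANE/TYPHOON", "TORNADO"), times = 2),
  variable = rep(c("PROPDMG", "CROPDMG"), each = 3),
  value    = c(146036259290, 38876883000, 16166781890,
               11739276150, 5313117800, 353383660),
  stringsAsFactors = FALSE
)

# Sum property and crop damage per event type.
combined <- aggregate(value ~ EVTYPE, data = top3, FUN = sum)

# Express the totals in billions of USD, rounded to one decimal place.
combined$billions <- round(combined$value / 1e9, 1)
combined <- combined[order(-combined$value), ]
combined
```

This gives roughly 157.8 billion USD for floods, 44.2 billion for hurricanes/typhoons, and 16.5 billion for tornadoes.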
Visualization for Question 2
To show the damage in billions of USD on the plot, let’s divide the property and crop damage values by 1e+09.
ggplot(Top10_ECONOMY, aes(x = EVTYPE, y = value/1e+09, fill = variable)) +
geom_bar(stat = "identity") +
coord_flip() +
ggtitle("Top 10 Most Economically Damaging Weather Events in US (1950-2011)") +
labs(x = "Event Type", y = "Total Damages in Billions (USD)") +
scale_fill_manual(values = c("brown", "green"), labels = c("Property Damage", "Crop Damage"))
From the graph, we can see that floods cause the most economic damage to property and crops.
Conclusions
1 : Across the United States, which types of events are most harmful with respect to population health?
Tornadoes are the most harmful weather events, as they caused the most casualties.
2 : Across the United States, which types of events have the greatest economic consequences?
Floods are the most expensive weather events: they caused the highest combined property and crop damage, more than 150 billion USD.