This document is the final report for Peer Assessment 2 of Coursera’s Reproducible Research course, part of the Specialization in Data Science. It was written in RStudio with knitr and is meant to be published in HTML format.
Data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database were used for the analysis. Records are available from 1950 onward, but only the 1995 - 2011 subset was used, because reporting is more complete in those years.
The analysis below shows that most of the fatalities in the U.S. due to severe weather conditions come from Excessive Heat and Tornados. Tornados are also responsible for the great majority of the injuries reported. Most of the injuries are reported in the states of Texas and Missouri, while the largest number of fatalities is reported in Illinois.
When the focus of the analysis turns to the economic impact of the weather events, Flood, Hurricane/Typhoon and Storm Surge are responsible for most of the damage to property in the U.S., while Drought is the main cause of crop damage.
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
This short routine clears the workspace before the data are loaded and loads the packages needed for the data analysis and plotting.
It also sets “echo=TRUE” as the default for knitr and turns off scientific notation for numbers.
rm(list=ls()) # free up memory for the download of the data sets
library(knitr)
library(ggplot2)
library(gridExtra)
library(dplyr)
library(tidyr)
opts_chunk$set(echo = TRUE)
options(scipen = 999) # switches off the scientific notation
Latest information about the system where the analysis has been run:
sessionInfo()
## R version 3.2.1 (2015-06-18)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
##
## locale:
## [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
## [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
## [5] LC_TIME=Portuguese_Brazil.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tidyr_0.3.1 dplyr_0.4.3 gridExtra_2.0.0 ggplot2_1.0.1
## [5] knitr_1.10.5
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.0 magrittr_1.5 MASS_7.3-40 munsell_0.4.2
## [5] colorspace_1.2-6 R6_2.1.1 stringr_1.0.0 plyr_1.8.3
## [9] tools_3.2.1 parallel_3.2.1 grid_3.2.1 gtable_0.1.2
## [13] DBI_0.3.1 htmltools_0.2.6 yaml_2.1.13 digest_0.6.8
## [17] assertthat_0.1 reshape2_1.4.1 formatR_1.2 evaluate_0.7
## [21] rmarkdown_0.7 stringi_0.5-5 scales_0.3.0 proto_0.3-10
The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. The file was downloaded from the course web site:
There is also some documentation of the database available. Here you will find how some of the variables are constructed/defined.
National Weather Service Storm Data Documentation
National Climatic Data Center Storm Events FAQ
The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
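As a side note on the bzip2 compression mentioned above, `read.csv()` can normally read such a file directly, because R's file connections detect the compression when a file is opened for reading. The sketch below illustrates this on a small temporary file; the toy data frame and the temporary file name are made up for illustration and are not part of the storm data set.

```r
# Toy data standing in for the storm data (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"),
                  FATALITIES = c(1, 0),
                  stringsAsFactors = FALSE)

# Write it as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path through a file connection,
# which detects the bzip2 compression on its own
back <- read.csv(tmp, stringsAsFactors = FALSE)
print(back)

# Remove the temporary file
unlink(tmp)
```

Under the same assumption, the compressed assignment file could be passed to `read.csv()` without a separate unzip step.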
The data file was downloaded, unzipped and saved in a local research directory. To reproduce this analysis, the same procedure should be followed after setting the working directory with the ‘setwd()’ function. The full dataset was loaded into the data frame stormData.
stormData <- read.csv("repdata-data-StormData.csv", header = TRUE, sep = ",")
# check the data
dim(stormData)
## [1] 902297 37
The data frame has 902297 observations of 37 variables. An overview of the data can be seen below:
head(stormData, n=2)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO 0 0
## 2 TORNADO 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1 NA 0 14 100 3 0 0
## 2 NA 0 2 150 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1 15 25.0 K 0
## 2 0 2.5 K 0
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3040 8812 3051 8806 1
## 2 3042 8755 0 0 2
And the dataset details are described by:
str(stormData)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
## $ BGN_TIME : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
## $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
## $ STATE : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ EVTYPE : Factor w/ 985 levels " HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : Factor w/ 35 levels ""," N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_LOCATI: Factor w/ 54429 levels "","- 1 N Albion",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ END_DATE : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ END_TIME : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ END_LOCATI: Factor w/ 34506 levels "","- .5 NNW",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ WFO : Factor w/ 542 levels ""," CI","$AC",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ ZONENAMES : Factor w/ 25112 levels ""," "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : Factor w/ 436774 levels "","-2 at Deer Park\n",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
To get an initial visual overview of the data, a histogram of the total number of weather events per year is plotted.
stormData$year <- as.numeric(format(as.Date(stormData$BGN_DATE,
format = "%m/%d/%Y %H:%M:%S"), "%Y"))
ggplot(stormData, aes(x = year)) +
geom_histogram(binwidth=1,colour="#999999",fill="#000099") +
ggtitle("Total Weather Events per Year")+
xlab("year")+
ylab("weather events")
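The year extraction used for the plot can be checked in isolation on a couple of timestamps. This is a minimal sketch: the first value is the BGN_DATE of the first record shown in the overview, while the second is a made-up timestamp in the same format.

```r
# First value taken from the data overview; second one is hypothetical
bgn <- c("4/18/1950 0:00:00", "11/15/2011 0:00:00")

# Parse month/day/year (the time part is read and discarded), then keep the year
years <- as.numeric(format(as.Date(bgn, format = "%m/%d/%Y %H:%M:%S"), "%Y"))
print(years)
```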
As previously described, the dataset becomes more consistent and complete from 1995 onward. The accuracy of the data has not been checked, nor has any trend of increasing events; both are left for later studies.
With that in mind, a subset of the data containing only events from 1995 onward was loaded into the data frame stormDataSET. This data frame is used for the rest of this analysis.
The subset was taken with the following procedure:
stormDataSET <- subset(stormData, year >= 1995)
# converts the exponent codes of the cost variables to upper-case character
stormDataSET$PROPDMGEXP <- as.character(toupper(stormDataSET$PROPDMGEXP))
stormDataSET$CROPDMGEXP <- as.character(toupper(stormDataSET$CROPDMGEXP))
# check the new dataset
dim(stormDataSET)
## [1] 681500 38
The new dataset now has 681500 registered events. The number of registered events per event type is given by:
totalEvents<- stormDataSET %>%
group_by(EVTYPE) %>%
summarize(total=n()) %>%
arrange(desc(total))
head(totalEvents,15)
## Source: local data frame [15 x 2]
##
## EVTYPE total
## (fctr) (int)
## 1 HAIL 215932
## 2 TSTM WIND 128923
## 3 THUNDERSTORM WIND 81745
## 4 FLASH FLOOD 52673
## 5 FLOOD 24641
## 6 TORNADO 24335
## 7 HIGH WIND 19956
## 8 HEAVY SNOW 14710
## 9 LIGHTNING 14280
## 10 HEAVY RAIN 11621
## 11 WINTER STORM 11372
## 12 THUNDERSTORM WINDS 10041
## 13 WINTER WEATHER 6973
## 14 FUNNEL CLOUD 6357
## 15 MARINE TSTM WIND 6175
Hail and Thunderstorm Winds are the most reported severe weather events.
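The table lists thunderstorm wind under several spellings (TSTM WIND, THUNDERSTORM WIND, THUNDERSTORM WINDS), so the ranking depends on whether those labels are merged. The sketch below shows one possible way to merge them, using only the counts printed above. The helper `normalise_evtype` and its two substitution rules are assumptions made for illustration; they are not part of the original analysis and would not cover every variant among the 985 EVTYPE levels.

```r
# Counts copied from the table above
counts <- data.frame(
  EVTYPE = c("HAIL", "TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS"),
  total  = c(215932, 128923, 81745, 10041),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map spelling variants onto one label
normalise_evtype <- function(x) {
  x <- toupper(trimws(x))               # consistent case, no stray spaces
  x <- sub("^TSTM", "THUNDERSTORM", x)  # expand the abbreviation
  x <- sub("WINDS$", "WIND", x)         # singular form
  x
}

counts$EVGROUP <- normalise_evtype(counts$EVTYPE)

# Sum the counts per merged label and sort in decreasing order
merged <- aggregate(total ~ EVGROUP, data = counts, FUN = sum)
merged <- merged[order(-merged$total), ]
print(merged)
```

With these four rows, the merged thunderstorm wind label adds up to 220709 events, slightly more than the 215932 hail events.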
The impact of the severe weather events on public health can be assessed through the variables FATALITIES and INJURIES, whose names are self-explanatory.
impactFatalities <- stormDataSET %>%
group_by(EVTYPE) %>%
summarize(totFatalities=sum(FATALITIES)) %>%
arrange(desc(totFatalities))
impactFatalities <- impactFatalities[1:15,]
head(impactFatalities,15)
## Source: local data frame [15 x 2]
##
## EVTYPE totFatalities
## (fctr) (dbl)
## 1 EXCESSIVE HEAT 1903
## 2 TORNADO 1545
## 3 FLASH FLOOD 934
## 4 HEAT 924
## 5 LIGHTNING 729
## 6 FLOOD 423
## 7 RIP CURRENT 360
## 8 HIGH WIND 241
## 9 TSTM WIND 241
## 10 AVALANCHE 223
## 11 RIP CURRENTS 204
## 12 WINTER STORM 195
## 13 HEAT WAVE 161
## 14 THUNDERSTORM WIND 131
## 15 EXTREME COLD 126
Excessive Heat and Tornados are responsible for most of the fatalities due to weather events in the U.S. since 1995.
impactInjuries <- stormDataSET %>%
group_by(EVTYPE) %>%
summarize(totInjuries=sum(INJURIES)) %>%
arrange(desc(totInjuries))
impactInjuries <- impactInjuries[1:15,]
head(impactInjuries,15)
## Source: local data frame [15 x 2]
##
## EVTYPE totInjuries
## (fctr) (dbl)
## 1 TORNADO 21765
## 2 FLOOD 6769
## 3 EXCESSIVE HEAT 6525
## 4 LIGHTNING 4631
## 5 TSTM WIND 3630
## 6 HEAT 2030
## 7 FLASH FLOOD 1734
## 8 THUNDERSTORM WIND 1426
## 9 WINTER STORM 1298
## 10 HURRICANE/TYPHOON 1275
## 11 HIGH WIND 1093
## 12 HAIL 916
## 13 WILDFIRE 911
## 14 HEAVY SNOW 751
## 15 FOG 718
Tornados are the weather event that causes the most injuries.
To get an overview of how the most important weather events seen above affect each state, the data were filtered to the events “TORNADO”, “EXCESSIVE HEAT”, “FLASH FLOOD”, “HEAT”, “FLOOD”, “LIGHTNING” and “TSTM WIND”, which are common to the two analyses above.
targetEvents <- c("TORNADO","EXCESSIVE HEAT","FLASH FLOOD",
"HEAT","FLOOD","LIGHTNING","TSTM WIND")
stateData <- filter(stormDataSET, EVTYPE %in% targetEvents )
impactStates <- stateData %>%
select(STATE,EVTYPE,FATALITIES,INJURIES) %>%
group_by(STATE) %>%
summarize(totFATAL=sum(FATALITIES),totINJUR=sum(INJURIES)) %>%
mutate(totalImpact=totFATAL+totINJUR) %>%
arrange(desc(totalImpact))
impactStates <- impactStates[1:15,]
head(impactStates,15)
## Source: local data frame [15 x 4]
##
## STATE totFATAL totINJUR totalImpact
## (fctr) (dbl) (dbl) (dbl)
## 1 TX 639 8743 9382
## 2 MO 555 6501 7056
## 3 AL 416 3797 4213
## 4 TN 309 2410 2719
## 5 OK 210 2245 2455
## 6 IL 1045 1384 2429
## 7 GA 151 1848 1999
## 8 AR 213 1603 1816
## 9 FL 214 1600 1814
## 10 PA 495 1012 1507
## 11 NC 146 1156 1302
## 12 MS 133 1074 1207
## 13 MI 57 1083 1140
## 14 MD 132 853 985
## 15 VA 78 831 909
In total (fatalities plus injuries), Texas and Missouri are the most impacted states. Most of the injuries are reported in those states.
On the other hand, Illinois is the state with the largest number of fatalities.
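The difference between the two rankings can be made explicit by sorting the same rows twice. This is a small sketch on the first six rows of the state table above; it only ranks the states shown there, not all states in the data set.

```r
# Rows copied from the state table above
states <- data.frame(
  STATE    = c("TX", "MO", "AL", "TN", "OK", "IL"),
  totFATAL = c(639, 555, 416, 309, 210, 1045),
  totINJUR = c(8743, 6501, 3797, 2410, 2245, 1384),
  stringsAsFactors = FALSE
)
states$totalImpact <- states$totFATAL + states$totINJUR

# Ranking by combined fatalities and injuries
by_total <- states$STATE[order(-states$totalImpact)]

# Ranking by fatalities alone
by_fatal <- states$STATE[order(-states$totFATAL)]

print(by_total)
print(by_fatal)
```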
The total impact can be better seen in the two following graphs:
# total fatalities graph
gbar01<- ggplot(impactFatalities,aes(x=reorder(EVTYPE,-totFatalities),
y=totFatalities, fill=EVTYPE)) +
geom_bar(stat = "identity") +
ggtitle(expression(atop("Total Fatalities per Event Type",
atop(italic("1995 - 2011"),"")))) +
xlab("events") +
ylab("number of fatalities") +
theme(axis.text.x=element_text(angle=45,hjust=1,vjust=1),
legend.position="none")
# total injuries graph
gbar02<- ggplot(impactInjuries,aes(x=reorder(EVTYPE,-totInjuries),
y=totInjuries, fill=EVTYPE)) +
geom_bar(stat = "identity") +
ggtitle(expression(atop("Total Injuries per Event Type",
atop(italic("1995 - 2011"),"")))) +
xlab("events") +
ylab("number of injuries") +
theme(axis.text.x=element_text(angle=45,hjust=1,vjust=1),
legend.position="none")
grid.arrange(gbar01, gbar02, ncol = 2)
The economic impact of these severe weather events on society and government is measured by the losses reported in the variables PROPDMG and CROPDMG, which hold the cost of the damage to property and crops respectively.
Before calculating these costs, the values have to be scaled by the unit given in PROPDMGEXP and CROPDMGEXP (H for hundreds, K for thousands, M for millions, B for billions), with a short procedure:
# transforms the property damage costs
stormDataSET$PROPDMGEXP = gsub("\\-|\\+|\\?","0", stormDataSET$PROPDMGEXP)
stormDataSET$PROPDMGEXP = gsub("B", "9", stormDataSET$PROPDMGEXP)
stormDataSET$PROPDMGEXP = gsub("M", "6", stormDataSET$PROPDMGEXP)
stormDataSET$PROPDMGEXP = gsub("K", "3", stormDataSET$PROPDMGEXP)
stormDataSET$PROPDMGEXP = gsub("H", "2", stormDataSET$PROPDMGEXP)
stormDataSET$PROPDMGEXP <- as.numeric(stormDataSET$PROPDMGEXP)
stormDataSET$PROPDMGEXP[is.na(stormDataSET$PROPDMGEXP)] = 0
stormDataSET$PropDMGTransf<- stormDataSET$PROPDMG * 10^stormDataSET$PROPDMGEXP
# transforms the crop damage costs
stormDataSET$CROPDMGEXP = gsub("\\-|\\+|\\?","0", stormDataSET$CROPDMGEXP)
stormDataSET$CROPDMGEXP = gsub("B", "9", stormDataSET$CROPDMGEXP)
stormDataSET$CROPDMGEXP = gsub("M", "6", stormDataSET$CROPDMGEXP)
stormDataSET$CROPDMGEXP = gsub("K", "3", stormDataSET$CROPDMGEXP)
stormDataSET$CROPDMGEXP = gsub("H", "2", stormDataSET$CROPDMGEXP)
stormDataSET$CROPDMGEXP <- as.numeric(stormDataSET$CROPDMGEXP)
stormDataSET$CROPDMGEXP[is.na(stormDataSET$CROPDMGEXP)] = 0
stormDataSET$CropDMGTransf<- stormDataSET$CROPDMG * 10^stormDataSET$CROPDMGEXP
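The same conversion can also be written once as a function and applied to both columns. The sketch below is an illustrative alternative and not the code used in this report: the name `exp_to_power` is hypothetical, and the function is meant to follow the same rules as the procedure above (letters mapped to powers of ten, single digits kept as they are, everything else treated as 0).

```r
# Hypothetical helper: turn an exponent code into a power of ten
exp_to_power <- function(code) {
  code <- toupper(as.character(code))
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  power <- unname(lookup[code])             # NA for codes that are not letters
  digit <- grepl("^[0-9]$", code)
  power[digit] <- as.numeric(code[digit])   # single digits are used directly
  power[is.na(power)] <- 0                  # "", "-", "+", "?" count as 10^0
  power
}

# Toy values; the first pair is the first record in the data overview
dmg  <- c(25, 2.5, 1.2, 3, 10, 7)
code <- c("K", "k", "M", "B", "", "?")

cost <- dmg * 10^exp_to_power(code)
print(cost)
```

Under that assumption, the two derived columns would be computed as `PROPDMG * 10^exp_to_power(PROPDMGEXP)` and `CROPDMG * 10^exp_to_power(CROPDMGEXP)`.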
Then the total costs of damage can finally be calculated.
The cost of Property Damage (in US$) is calculated with:
impactProperty <- stormDataSET %>%
group_by(EVTYPE) %>%
summarize(totPropImpact=sum(PropDMGTransf)) %>%
arrange(desc(totPropImpact))
impactProperty <- impactProperty[1:15,]
head(impactProperty,15)
## Source: local data frame [15 x 2]
##
## EVTYPE totPropImpact
## (fctr) (dbl)
## 1 FLOOD 144022037057
## 2 HURRICANE/TYPHOON 69305840000
## 3 STORM SURGE 43193536000
## 4 TORNADO 24935939545
## 5 FLASH FLOOD 16047794571
## 6 HAIL 15048722103
## 7 HURRICANE 11812819010
## 8 TROPICAL STORM 7653335550
## 9 HIGH WIND 5259785375
## 10 WILDFIRE 4759064000
## 11 STORM SURGE/TIDE 4641188000
## 12 TSTM WIND 4482361440
## 13 ICE STORM 3643555810
## 14 THUNDERSTORM WIND 3399282992
## 15 HURRICANE OPAL 3172846000
The highest property damage costs (above US$ 10 billion) are due to FLOOD, HURRICANE/TYPHOON, STORM SURGE, TORNADO, FLASH FLOOD, HAIL and HURRICANE.
The cost of Crop Damage (in US$) is calculated with:
impactCrop <- stormDataSET %>%
group_by(EVTYPE) %>%
summarize(totCropImpact=sum(CropDMGTransf)) %>%
arrange(desc(totCropImpact))
impactCrop <- impactCrop[1:15,]
head(impactCrop,15)
## Source: local data frame [15 x 2]
##
## EVTYPE totCropImpact
## (fctr) (dbl)
## 1 DROUGHT 13922066000
## 2 FLOOD 5422810400
## 3 HURRICANE 2741410000
## 4 HAIL 2614127070
## 5 HURRICANE/TYPHOON 2607872800
## 6 FLASH FLOOD 1343915000
## 7 EXTREME COLD 1292473000
## 8 FROST/FREEZE 1094086000
## 9 HEAVY RAIN 728399800
## 10 TROPICAL STORM 677836000
## 11 HIGH WIND 633561300
## 12 TSTM WIND 553947350
## 13 EXCESSIVE HEAT 492402000
## 14 THUNDERSTORM WIND 414354000
## 15 HEAT 401411500
Likewise, the highest crop damage costs (above US$ 1 billion) are due to DROUGHT, FLOOD, HURRICANE, HAIL, HURRICANE/TYPHOON, FLASH FLOOD, EXTREME COLD and FROST/FREEZE.
A complete picture, combining property and crop damage, is shown below.
impactTotalCost <- stormDataSET %>%
group_by(EVTYPE) %>%
summarize(totCrop=sum(CropDMGTransf),
totProp=sum(PropDMGTransf)) %>%
mutate(totCost=totCrop+totProp) %>%
arrange(desc(totCost))
impactTotalCost <- impactTotalCost[1:15,]
head(impactTotalCost,15)
## Source: local data frame [15 x 4]
##
## EVTYPE totCrop totProp totCost
## (fctr) (dbl) (dbl) (dbl)
## 1 FLOOD 5422810400 144022037057 149444847457
## 2 HURRICANE/TYPHOON 2607872800 69305840000 71913712800
## 3 STORM SURGE 5000 43193536000 43193541000
## 4 TORNADO 296595770 24935939545 25232535315
## 5 HAIL 2614127070 15048722103 17662849173
## 6 FLASH FLOOD 1343915000 16047794571 17391709571
## 7 DROUGHT 13922066000 1046106000 14968172000
## 8 HURRICANE 2741410000 11812819010 14554229010
## 9 TROPICAL STORM 677836000 7653335550 8331171550
## 10 HIGH WIND 633561300 5259785375 5893346675
## 11 WILDFIRE 295472800 4759064000 5054536800
## 12 TSTM WIND 553947350 4482361440 5036308790
## 13 STORM SURGE/TIDE 850000 4641188000 4642038000
## 14 THUNDERSTORM WIND 414354000 3399282992 3813636992
## 15 ICE STORM 15660000 3643555810 3659215810
Grouping the same damage costs by state shows the most impacted states:
totalCostState <- stormDataSET %>%
group_by(STATE) %>%
summarize(totCropSt=sum(CropDMGTransf),
totPropSt=sum(PropDMGTransf)) %>%
mutate(totCostSt=totCropSt+totPropSt) %>%
arrange(desc(totCostSt))
totalCostState <- totalCostState[1:15,]
head(totalCostState,15)
## Source: local data frame [15 x 4]
##
## STATE totCropSt totPropSt totCostSt
## (fctr) (dbl) (dbl) (dbl)
## 1 CA 3461941700 122421713252 125883654952
## 2 LA 1178141000 59029677250 60207818250
## 3 FL 3850241400 38834725270 42684966670
## 4 MS 1609752600 28696098452 30305851052
## 5 TX 7280847900 22648381290 29929229190
## 6 AL 539759240 10825484400 11365243640
## 7 NC 2048807400 7849194782 9898002182
## 8 IA 4590075660 3216192200 7806267860
## 9 MO 731448200 5869876270 6601324470
## 10 TN 19543500 6053418445 6072961945
## 11 ND 473010000 5222936400 5695946400
## 12 OH 361301900 4814507743 5175809643
## 13 OK 1106393550 3790507245 4896900795
## 14 MN 268853150 4256860720 4525713870
## 15 NY 160658100 4122348240 4283006340
California, Louisiana, Florida, Mississippi and Texas suffered the highest damage costs from severe weather events.
A summary of the damages can be seen in the following graphs.
gbar04<- ggplot(impactProperty,aes(x=reorder(EVTYPE,-totPropImpact),
y=totPropImpact,fill=EVTYPE)) +
geom_bar(stat="identity") +
ggtitle(expression(atop("Property Damage Cost",
atop(italic("1995 - 2011"),"")))) +
xlab("event") +
ylab("total cost (in US$)") +
theme(axis.text.x=element_text(angle=45,hjust=1,vjust=1),
legend.position="none")
gbar05<- ggplot(impactCrop,aes(x=reorder(EVTYPE,-totCropImpact),
y=totCropImpact,fill=EVTYPE)) +
geom_bar(stat="identity") +
ggtitle(expression(atop("Crop Damage Cost",
atop(italic("1995 - 2011"),"")))) +
xlab("event") +
ylab("total cost (in US$)") +
theme(axis.text.x=element_text(angle=45,hjust=1,vjust=1),
legend.position="none")
grid.arrange(gbar04, gbar05, ncol = 2)
Answering the questions proposed for this Peer Assessment, we can summarise the conclusions as follows:
Question 01:
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Question 02:
Across the United States, which types of events have the greatest economic consequences?