You need to install the “reshape2” and “ggplot2” packages to run this code.
In this report, we explore the NOAA Storm Database from 1950-2011.
The objective is to answer the following questions :
1. Across the U.S., which types of events have the greatest impact on population health;
2. Across the U.S., which types of events have the greatest economic impact.
It was found that for 1950-1992, only 3 broad categories of events were captured. However, for 1993-2011, all 11 broad categories of events were captured.
For a meaningful comparison across all event types, only the data for 1993-2011 were used.
Considering the data from 1993-2011, we found that :
1. Personal Health Impact
a. The events that cause the most personal health impact are : tornado, heat, flood and wintry weather.
b. Heat causes the most fatalities, while tornado causes the most injuries.
2. Economic Impact
a. The events that cause the most economic impact are : flood, hurricane/storm, tornado, rain and hail.
b. Flood causes the highest property damage, while hail causes the highest crop damage.
We first read the compressed CSV file into a data frame. The file is comma-delimited, and missing values are coded as “?”.
DF <- read.csv(bzfile("repdata_data_StormData.csv.bz2"), na.strings="?")
After reading, we preview the data frame (DF).
dim(DF)
## [1] 902297 37
head(DF)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL
## 4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL
## 5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL
## 6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO 0 0
## 2 TORNADO 0 0
## 3 TORNADO 0 0
## 4 TORNADO 0 0
## 5 TORNADO 0 0
## 6 TORNADO 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1 NA 0 14.0 100 3 0 0
## 2 NA 0 2.0 150 2 0 0
## 3 NA 0 0.1 123 2 0 0
## 4 NA 0 0.0 100 2 0 0
## 5 NA 0 0.0 150 2 0 0
## 6 NA 0 1.5 177 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1 15 25.0 K 0
## 2 0 2.5 K 0
## 3 2 25.0 K 0
## 4 2 2.5 K 0
## 5 2 2.5 K 0
## 6 6 2.5 K 0
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3040 8812 3051 8806 1
## 2 3042 8755 0 0 2
## 3 3340 8742 0 0 3
## 4 3458 8626 0 0 4
## 5 3412 8642 0 0 5
## 6 3450 8748 0 0 6
The columns we are interested in are FATALITIES, INJURIES, PROPDMG & CROPDMG. We check each of these columns for NA values.
sum(is.na(DF$FATALITIES))
## [1] 0
sum(is.na(DF$INJURIES))
## [1] 0
sum(is.na(DF$PROPDMG))
## [1] 0
sum(is.na(DF$CROPDMG))
## [1] 0
From the above, we see that there are no missing values for the 4 columns that we are concerned with.
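The four checks above can also be done in one call. The following is a minimal sketch on a toy data frame, not on the real DF; the values in `toy` are made up for illustration, and one NA is planted on purpose so that the count is visible.

```r
# Toy data frame standing in for DF (values are made up for illustration).
toy <- data.frame(
  FATALITIES = c(0, 1, NA),   # one NA planted deliberately
  INJURIES   = c(15, 0, 2),
  PROPDMG    = c(25.0, 2.5, 25.0),
  CROPDMG    = c(0, 0, 0)
)

cols <- c("FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")

# is.na() returns a logical matrix; colSums() counts the TRUE values per column.
na_counts <- colSums(is.na(toy[, cols]))
print(na_counts)
```

On the real data, replacing `toy` with `DF` should give a zero for each of the four columns, matching the individual checks.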
We then clean up the data as follows :
- All EVTYPE values are converted to lowercase.
- Rows whose EVTYPE contains the words “summary”, “apache county”, “southeast” or “monthly” are dropped.
- EVTYPE is re-classified into 11 broad categories.
- Each of the 11 broad categories includes every EVTYPE that matches one of the following keywords :
1. sea/coast - surf, swell, sea, wave, marine, seiche, beach, coastal, coastal flood, dam, tidal flood, storm surge, blow out tide, low tide, high tide, tsunami, rip current, red flag
2. flood - flash, flood, rapidly rising/high water, urban, small stream, drowning
3. hail - hail
4. rain - torrential, thunderstorm, heavy/excessive rain/shower, wet, metro storm, tropical depression
5. storm/hurr - hurricane, typhoon, tropical storm, tstm, floyd
6. lightning - lightning
7. tornado - wall cloud, tornado, water/land spout, funnel
8. wintry - wintry, thundersnow, blizzard, heavy snow, snow, freeze, frost, ice, sleet, freezing rain, glaze, low temperature, cold, cool, wind chill, hypothermia, icy
9. wind - wind, wnd, gust, microburst, downburst
10. heat - heat, hot, high/record temperature, warm, hyperthermia, drought, below normal precipitation, dry, driest, fire, smoke
11. others - precipitation, heavy mix, severe turbulence, northern lights, record high/low, none, no severe weather, mild pattern, high, excessive, other, dust, fog, avalanche, land slide/slump, volcanic, vog
DF$EVTYPE <- tolower(DF$EVTYPE)
DF <- DF[grepl("summary(.*)", DF$EVTYPE)==F,]
DF <- DF[grepl("apache county", DF$EVTYPE)==F,]
DF <- DF[grepl("southeast", DF$EVTYPE)==F,]
DF <- DF[grepl("monthly(.*)", DF$EVTYPE)==F,]
DF[grepl("(.*)surf(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)swell(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)sea(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)wave(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("marine(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("seiche(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("beach(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("coastal(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("c(.*)st(.*)l(.*)flood(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("dam(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("tidal flood(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("storm surge(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("blow(.*)out tide(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)low tide(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)high tide(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)tsunami(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)rip current(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)red flag(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)flood(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)rapidly rising water(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)flash fl(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)urban(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)sm(.*)stream(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)high water(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("drowning", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("hail(.*)", DF$EVTYPE)==T,8] <- "hail"
DF[grepl("torrential rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("t(.*)u(.*)e(.*)storm(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("h(.*)vy rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("excessive rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("heavy shower(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("(.*)wet(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("metro storm", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("(.*)rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("tropical depression(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("hurricane(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("typhoon(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("tropical storm(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("tstm(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("(.)floyd(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("lig(.*)t(.*)ing(.*)", DF$EVTYPE)==T,8] <- "lightning"
DF[grepl("wall cloud(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("torn(.*)o(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("wa(.*)ter(.*)spout(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("(.*)funnel(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("land(.*)spout(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("wint(.*)r(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("thunder(.*)w(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)blizzard(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("heavy(.*)snow(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)snow(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)freez(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)frost(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)ice(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)sleet(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)freezing rain(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)glaze(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)icy(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("low temperature(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)cold(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)cool(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)wind chil(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("hypothermia(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)wind(.*)", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("gust(.*)", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("wnd", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("(.*)mic(.*)oburst", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("(.*)down(.*)burst", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("heat(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("hot(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("high temperature(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("record temperature(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("temperature record(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("(.*)warm(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("hyperthermia(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("drought(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("below normal precipitation", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("dry(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("driest(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("(.*)fire(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("(.*)smoke", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("(.*)precip(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("heavy mix", DF$EVTYPE)==T,8] <- "others"
DF[grepl("severe turbulence", DF$EVTYPE)==T,8] <- "others"
DF[grepl("northern lights", DF$EVTYPE)==T,8] <- "others"
DF[grepl("record high", DF$EVTYPE)==T,8] <- "others"
DF[grepl("record low", DF$EVTYPE)==T,8] <- "others"
DF[grepl("none", DF$EVTYPE)==T,8] <- "others"
DF[grepl("no severe weather", DF$EVTYPE)==T,8] <- "others"
DF[grepl("mild pattern", DF$EVTYPE)==T,8] <- "others"
DF[grepl("high", DF$EVTYPE)==T,8] <- "others"
DF[grepl("excessive", DF$EVTYPE)==T,8] <- "others"
DF[grepl("other", DF$EVTYPE)==T,8] <- "others"
DF[grepl("dust(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("fog(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("avalanc(.*)e(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("(.*)slide(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("land(.*)slump(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("volcanic(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("vog(.*)", DF$EVTYPE)==T,8] <- "others"
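The long run of assignments above applies one pattern per line. As an alternative, the same idea can be driven by a rule table. The following is only a sketch under stated assumptions: `rules` and `classify` are hypothetical names, the rule table is heavily abbreviated, and matching is done on the original event names with the first matching rule winning. It is not a drop-in replacement for the code above and may classify some events differently.

```r
# Hypothetical, abbreviated rule table: category name -> regular expression.
# Order matters: the first rule that matches an event wins.
rules <- c(
  "sea/coast" = "surf|coastal|rip current|tsunami",
  "flood"     = "flood|urban|high water",
  "hail"      = "hail",
  "tornado"   = "torn.*o|funnel|spout",
  "wintry"    = "snow|blizzard|ice|frost|cold",
  "heat"      = "heat|drought|warm"
)

# Classify event names; anything that matches no rule falls into `default`.
classify <- function(evtype, rules, default = "others") {
  evtype   <- tolower(evtype)
  out      <- rep(default, length(evtype))
  assigned <- rep(FALSE, length(evtype))
  for (category in names(rules)) {
    # Only rows not yet assigned can be claimed by this rule.
    hit <- !assigned & grepl(rules[[category]], evtype)
    out[hit] <- category
    assigned <- assigned | hit
  }
  out
}

events <- c("TORNADO", "Coastal Flood", "FLASH FLOOD", "Excessive Heat", "Fog")
print(classify(events, rules))
```

Keeping the patterns in one table makes the priority between categories explicit: “Coastal Flood” goes to sea/coast because that rule is listed before flood.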
After that, we extract only the year from the “BGN_DATE” field and store it in a new column, PERIOD.
DF$PERIOD <- as.character(DF$BGN_DATE)
DF$PERIOD <- gsub(" 0:00:00","",DF$PERIOD)
DF$PERIOD <- sapply(strsplit(DF$PERIOD, "/"), "[", 3)
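The year can also be obtained by parsing the date instead of splitting the string. This is a minimal sketch using a few BGN_DATE values copied from the preview above; `bgn_date`, `parsed` and `year` are names introduced here for illustration only.

```r
# A few BGN_DATE values as they appear in the preview of DF.
bgn_date <- c("4/18/1950 0:00:00", "2/20/1951 0:00:00", "11/15/1951 0:00:00")

# as.Date() reads only as much of the string as the format needs,
# so the trailing " 0:00:00" is ignored.
parsed <- as.Date(bgn_date, format = "%m/%d/%Y")

# format() with "%Y" returns the four-digit year as a character string.
year <- format(parsed, "%Y")
print(year)
```

The result is a character vector, the same type that the string-splitting approach produces.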
Next, we make a contingency table of the YEAR with the types of event (EVTYPE), and print out the contingency table.
YEAR <- table(DF$PERIOD, DF$EVTYPE)
YEAR
##
## flood hail heat lightning others rain sea/coast storm/hurr
## 1950 0 0 0 0 0 0 0 0
## 1951 0 0 0 0 0 0 0 0
## 1952 0 0 0 0 0 0 0 0
## 1953 0 0 0 0 0 0 0 0
## 1954 0 0 0 0 0 0 0 0
## 1955 0 360 0 0 0 0 0 421
## 1956 0 401 0 0 0 0 0 735
## 1957 0 479 0 0 0 0 0 775
## 1958 0 706 0 0 0 0 0 899
## 1959 0 531 0 0 0 0 0 652
## 1960 0 581 0 0 0 0 0 719
## 1961 0 722 0 0 0 0 0 752
## 1962 0 886 0 0 0 0 0 830
## 1963 0 652 0 0 0 0 0 823
## 1964 0 679 0 0 0 0 0 909
## 1965 0 805 0 0 0 0 0 1055
## 1966 0 732 0 0 0 0 0 1050
## 1967 0 764 0 0 0 0 0 958
## 1968 0 1068 0 0 0 0 0 1529
## 1969 0 766 0 0 0 0 0 1510
## 1970 0 721 0 0 0 0 0 1794
## 1971 0 964 0 0 0 0 0 1544
## 1972 0 681 0 0 0 0 0 712
## 1973 0 1098 0 0 0 0 0 2166
## 1974 0 1660 0 0 0 0 0 2603
## 1975 0 1374 0 0 0 0 0 2639
## 1976 0 1091 0 0 0 0 0 1742
## 1977 0 1083 0 0 0 0 0 1723
## 1978 0 1024 0 0 0 0 0 1758
## 1979 0 1315 0 0 0 0 0 2046
## 1980 0 1993 0 0 0 0 0 3181
## 1981 0 1494 0 0 0 0 0 2193
## 1982 0 2381 0 0 0 0 0 3570
## 1983 0 2334 0 0 0 0 0 4993
## 1984 0 2749 0 0 0 0 0 3566
## 1985 0 3379 0 0 0 0 0 3827
## 1986 0 3512 0 0 0 0 0 4365
## 1987 0 2416 0 0 0 0 0 4256
## 1988 0 2537 0 0 0 0 0 3947
## 1989 0 3778 0 0 0 0 0 5711
## 1990 0 3618 0 0 0 0 0 6064
## 1991 0 4811 0 0 0 0 0 6503
## 1992 0 5687 0 0 0 0 0 6443
## 1993 1579 4216 30 467 25 3889 52 15
## 1994 1868 6733 57 1010 50 8001 73 54
## 1995 3021 8370 211 1083 123 10731 212 305
## 1996 4551 10855 218 914 85 399 159 10045
## 1997 3984 8801 145 841 142 395 173 9869
## 1998 4933 12730 452 901 101 718 277 13627
## 1999 3397 10236 750 863 144 520 186 10378
## 2000 3560 11372 700 907 152 646 172 12171
## 2001 3850 12389 487 880 206 513 371 11762
## 2002 4167 12689 656 875 161 363 1325 11849
## 2003 4912 13911 424 741 188 938 1653 12036
## 2004 5704 13142 188 705 217 741 1551 11963
## 2005 4325 13788 368 864 221 858 1422 12397
## 2006 3851 16638 709 840 267 1549 1418 13176
## 2007 5494 12711 726 719 346 13877 1253 25
## 2008 6123 17546 623 766 307 17651 1502 181
## 2009 6069 13313 471 721 225 14472 1270 11
## 2010 6700 10922 814 867 340 16947 1517 34
## 2011 7182 17761 1587 801 320 22743 1953 162
##
## tornado wind wintry
## 1950 223 0 0
## 1951 269 0 0
## 1952 272 0 0
## 1953 492 0 0
## 1954 609 0 0
## 1955 632 0 0
## 1956 567 0 0
## 1957 930 0 0
## 1958 608 0 0
## 1959 630 0 0
## 1960 645 0 0
## 1961 772 0 0
## 1962 673 0 0
## 1963 493 0 0
## 1964 760 0 0
## 1965 995 0 0
## 1966 606 0 0
## 1967 966 0 0
## 1968 715 0 0
## 1969 650 0 0
## 1970 700 0 0
## 1971 963 0 0
## 1972 775 0 0
## 1973 1199 0 0
## 1974 1123 0 0
## 1975 962 0 0
## 1976 935 0 0
## 1977 922 0 0
## 1978 875 0 0
## 1979 918 0 0
## 1980 972 0 0
## 1981 830 0 0
## 1982 1181 0 0
## 1983 995 0 0
## 1984 1020 0 0
## 1985 773 0 0
## 1986 849 0 0
## 1987 695 0 0
## 1988 773 0 0
## 1989 921 0 0
## 1990 1264 0 0
## 1991 1208 0 0
## 1992 1404 0 0
## 1993 895 497 942
## 1994 1516 586 681
## 1995 1755 901 1257
## 1996 1693 1230 2046
## 1997 1769 899 1658
## 1998 2147 910 1321
## 1999 2113 1123 1571
## 2000 1748 1123 1909
## 2001 1831 992 1671
## 2002 1480 1022 1694
## 2003 2121 855 1973
## 2004 2571 846 1735
## 2005 1968 912 2061
## 2006 1860 1792 1934
## 2007 1853 1965 4320
## 2008 2483 3069 5412
## 2009 1941 2666 4658
## 2010 2119 2568 5333
## 2011 2921 2547 4197
From the above contingency table, we observe that :
1. From 1950 to 1954, only data for tornado were captured.
2. From 1955 to 1992, only data for hail, rain/storm/hurricane and tornado were captured.
3. From 1993 to 2011, data for all 11 broad categories of events were captured.
For meaningful comparison of all types of events, we choose to use only data for 1993-2011.
We therefore label each record as either “1950-1992” or “1993-2011”, and then keep only the records for 1993-2011.
DF$PERIOD <- gsub("19[5-8].","1950",DF$PERIOD)
DF$PERIOD <- gsub("199[0-2]","1950",DF$PERIOD)
DF$PERIOD <- gsub("199[3-9]","1993",DF$PERIOD)
DF$PERIOD <- gsub("20.[0-9]","1993",DF$PERIOD)
DF$PERIOD <- gsub("1950","1950-1992",DF$PERIOD)
DF$PERIOD <- gsub("1993","1993-2011",DF$PERIOD)
DF <- DF[DF$PERIOD=="1993-2011",]
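The same two-way labelling can be written as a single numeric comparison rather than a chain of substitutions. This is a sketch on a small vector of years chosen to cover both sides of the cut-off; `year` and `period` are illustrative names.

```r
# Years as character strings, as produced by the extraction step.
year <- c("1950", "1989", "1992", "1993", "2011")

# Convert to integer and compare against the last year of the first period.
period <- ifelse(as.integer(year) <= 1992, "1950-1992", "1993-2011")
print(period)

# Keep only the later period, as in the subsetting step.
kept <- year[period == "1993-2011"]
print(kept)
```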
Personal Health Impact is measured by the columns FATALITIES and INJURIES.
Economic Impact is measured by the columns PROPDMG and CROPDMG. The values are summed as recorded, without applying the PROPDMGEXP and CROPDMGEXP columns.
We then re-shape the data by finding the sum of each of the columns : FATALITIES, INJURIES, PROPDMG, CROPDMG, segregated by events.
# table() gives one row per event category; the counts in column 2 are overwritten below.
sumDF <- as.data.frame(table(DF$EVTYPE))
sumDF[,2] <- data.frame(tapply(DF$FATALITIES,DF$EVTYPE,sum))
sumDF[,3] <- data.frame(tapply(DF$INJURIES,DF$EVTYPE,sum))
sumDF[,4] <- data.frame(tapply(DF$PROPDMG,DF$EVTYPE,sum))
sumDF[,5] <- data.frame(tapply(DF$CROPDMG,DF$EVTYPE,sum))
sumDF[,6] <- data.frame("1993-2011")
colnames(sumDF) <- c("EVTYPE","FATALITIES","INJURIES","PROPERTY","CROP","PERIOD")
Then we print the table of the sum of FATALITIES, INJURIES, PROPDMG, CROPDMG by each event.
sumDF
## EVTYPE FATALITIES INJURIES PROPERTY CROP PERIOD
## 1 flood 1552 8673 2444573 367245 1993-2011
## 2 hail 40 1066 699166 585957 1993-2011
## 3 heat 3048 10444 131181 44632 1993-2011
## 4 lightning 817 5231 603397 3581 1993-2011
## 5 others 375 1815 47851 2673 1993-2011
## 6 rain 313 2757 1386979 98286 1993-2011
## 7 sea/coast 1098 1478 61835 1862 1993-2011
## 8 storm/hurr 443 5350 1411757 127306 1993-2011
## 9 tornado 1627 23403 1401014 100027 1993-2011
## 10 wind 445 1874 452839 21714 1993-2011
## 11 wintry 1107 6674 419397 24547 1993-2011
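A table of per-event sums like the one above can also be produced in one step with `aggregate()`. The following is a sketch on a toy data frame whose values are made up for illustration; it is shown as an alternative, not as the method used in this report.

```r
# Toy records standing in for DF (values are made up for illustration).
toy <- data.frame(
  EVTYPE     = c("flood", "flood", "hail", "tornado"),
  FATALITIES = c(1, 2, 0, 3),
  INJURIES   = c(0, 5, 1, 10),
  PROPDMG    = c(25, 2.5, 10, 50),
  CROPDMG    = c(0, 1, 4, 0)
)

# Sum the four measures within each event category.
sums <- aggregate(
  cbind(FATALITIES, INJURIES, PROPDMG, CROPDMG) ~ EVTYPE,
  data = toy,
  FUN  = sum
)
print(sums)
```

The result has one row per event category and one column per summed measure, already carrying the original column names.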
We then use the “reshape2” library to melt the data frame into :
1. sumDF1 : PERIOD, EVTYPE, HEALTH, NUM_LIVES
2. sumDF2 : PERIOD, EVTYPE, ECONOMIC, AMOUNT_DMG
sumDF1 - sums the number of lives affected by Personal HEALTH, taking into consideration FATALITIES & INJURIES.
sumDF2 - sums the amount of ECONOMIC damages, taking into consideration PROPDMG (property damage) & CROPDMG (crop damage).
The purpose is to allow faceting in ggplot, which needs one row per combination of event and measure.
library(reshape2)
sumDF1 <- melt(sumDF, id.vars=c("PERIOD","EVTYPE"), measure.vars=c("FATALITIES","INJURIES"), variable.name="HEALTH", value.name = "NUM_LIVES")
sumDF2 <- melt(sumDF, id.vars=c("PERIOD","EVTYPE"), measure.vars=c("PROPERTY","CROP"), variable.name="ECONOMIC", value.name = "AMOUNT_DMG")
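To make the shape of the melted data concrete, the sketch below builds the same long layout by hand in base R for two rows taken from the table of sums. It is meant only to show what the wide-to-long step produces; `wide` and `long` are illustrative names.

```r
# Two rows from the table of sums, in wide form (one column per measure).
wide <- data.frame(
  PERIOD     = "1993-2011",
  EVTYPE     = c("flood", "hail"),
  FATALITIES = c(1552, 40),
  INJURIES   = c(8673, 1066)
)

# Long form: the measure name moves into HEALTH and its value into NUM_LIVES,
# so each event appears once per measure.
long <- data.frame(
  PERIOD    = rep(wide$PERIOD, times = 2),
  EVTYPE    = rep(wide$EVTYPE, times = 2),
  HEALTH    = rep(c("FATALITIES", "INJURIES"), each = nrow(wide)),
  NUM_LIVES = c(wide$FATALITIES, wide$INJURIES)
)
print(long)
```

Each event now has one row for FATALITIES and one for INJURIES, which is the layout that the HEALTH facet relies on.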
Finally, we use ggplot to get the barplots.
We first look at the Impact on Personal Health - by plotting Types of Events against Number of Lives Affected, facetted by HEALTH (FATALITIES/INJURIES)
library(ggplot2)
h <- ggplot(sumDF1,aes(x = reorder(EVTYPE, -NUM_LIVES), y = NUM_LIVES))
h <- h + geom_bar(stat="identity") + facet_grid(HEALTH~., margins = TRUE)
h <- h + labs( y = "Number of Lives Affected")
h <- h + labs( x = "Event")
h <- h + ggtitle(expression(atop("Impact on Population Health", atop("Across United States from 1993-2011",""))))
h <- h + theme(axis.text.x = element_text(angle=+90, hjust=0, vjust=1))
print(h)
Observations from the Plot on Personal Health Impact
Next, we look at the Economic Consequences - by plotting Types of Events against Amount of Damage, facetted by ECONOMIC (PROPDMG/CROPDMG)
e <- ggplot(sumDF2,aes(x = reorder(EVTYPE, -AMOUNT_DMG), y = AMOUNT_DMG))
e <- e + geom_bar(stat="identity") + facet_grid(ECONOMIC~., margins = TRUE)
e <- e + labs( y = "Amount of Damage")
e <- e + labs( x = "Event")
e <- e + ggtitle(expression(atop("Economic Consequences", atop("Across United States from 1993-2011",""))))
e <- e + theme(axis.text.x = element_text(angle=+90, hjust=0, vjust=1))
print(e)
Observations from the Plot on Economic Impact