You need to install the “ggplot2” package to run this code.

Synopsis


In this report, we explore the NOAA Storm Database, which covers the years 1950 to 2011.

The objective is to answer two questions:
1. Across the U.S., which types of events have the greatest impact on population health?
2. Across the U.S., which types of events have the greatest economic impact?

For 1950-1992, only 3 broad categories of events were captured, whereas for 1993-2011 all 11 broad categories were captured.
To compare event types on an equal footing, only data for 1993-2011 were used.

Considering the data from 1993-2011, we found that :
1. Population Health Impact
a. The events that cause the most population health impact are : tornado, heat, flood and wintry weather.
b. Heat causes the most fatalities, while tornado causes the most injuries.

2. Economic Impact
a. The events that cause the most economic impact are : flood, hurricane/storm, tornado, rain and hail.
b. Flood causes the highest property damage, while hail causes the highest crop damage.


Data Processing



1 : Loading the data

We first read the CSV file into a data frame. The file is comma-delimited, and missing values are coded as “?”.

DF <- read.csv(bzfile("repdata_data_StormData.csv.bz2"), na.strings="?")


After reading, we preview the data frame (DF).

dim(DF)
## [1] 902297     37
head(DF)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL
##    EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO         0                                               0
## 2 TORNADO         0                                               0
## 3 TORNADO         0                                               0
## 4 TORNADO         0                                               0
## 5 TORNADO         0                                               0
## 6 TORNADO         0                                               0
##   COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1         NA         0                      14.0   100 3   0          0
## 2         NA         0                       2.0   150 2   0          0
## 3         NA         0                       0.1   123 2   0          0
## 4         NA         0                       0.0   100 2   0          0
## 5         NA         0                       0.0   150 2   0          0
## 6         NA         0                       1.5   177 2   0          0
##   INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1       15    25.0          K       0                                    
## 2        0     2.5          K       0                                    
## 3        2    25.0          K       0                                    
## 4        2     2.5          K       0                                    
## 5        2     2.5          K       0                                    
## 6        6     2.5          K       0                                    
##   LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1     3040      8812       3051       8806              1
## 2     3042      8755          0          0              2
## 3     3340      8742          0          0              3
## 4     3458      8626          0          0              4
## 5     3412      8642          0          0              5
## 6     3450      8748          0          0              6


The columns we are interested in are FATALITIES, INJURIES, PROPDMG and CROPDMG, so we check each of them for NA values.

sum(is.na(DF$FATALITIES))
## [1] 0
sum(is.na(DF$INJURIES))
## [1] 0
sum(is.na(DF$PROPDMG))
## [1] 0
sum(is.na(DF$CROPDMG))
## [1] 0


From the above, we see that there are no missing values for the 4 columns that we are concerned with.
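
The same check can be run over all four columns in a single call. The sketch below is only an illustration: `toy` is a small made-up data frame standing in for DF, and `cols` and `na_counts` are names introduced here.

```r
# Toy stand-in for DF; the values are made up for illustration only.
toy <- data.frame(
  FATALITIES = c(0, 1, 0),
  INJURIES   = c(15, 0, 2),
  PROPDMG    = c(25.0, 2.5, 25.0),
  CROPDMG    = c(0, 0, 0)
)

cols <- c("FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")

# Count missing values per column in a single pass.
na_counts <- colSums(is.na(toy[, cols]))
print(na_counts)
```

With the real data, replacing `toy` by `DF` should reproduce the four zero counts shown above.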

2a : Cleaning the data (EVTYPE)

We then clean up the data as follows :

- Convert all event names to lowercase.
- Drop records whose EVTYPE contains the words “summary”, “apache county”, “southeast” or “monthly”.
- Re-classify EVTYPE into 11 broad categories.
- Each of the 11 broad categories includes the EVTYPE values that contain the following keywords :

1. sea/coast - surf, swell, sea, wave, marine, seiche, beach, coastal, coastal flood, dam, tidal flood, storm surge, blow-out tide, low tide, high tide, tsunami, rip current, red flag
2. flood - flash, flood, rapidly rising/high water, urban, small stream, drowning
3. hail - hail
4. rain - torrential, thunderstorm, heavy/excessive rain/shower, wet, metro storm, tropical depression
5. storm/hurr - hurricane, typhoon, tropical storm, tstm, floyd
6. lightning - lightning
7. tornado - wall cloud, tornado, water/land spout, funnel
8. wintry - wintry, thundersnow, blizzard, heavy snow, snow, freeze, frost, ice, sleet, freezing rain, glaze, low temperature, cold, cool, wind chill, hypothermia, icy
9. wind - wind, wnd, gust, microburst, downburst
10. heat - heat, hot, high/record temperature, warm, hyperthermia, drought, below normal precipitation, dry, driest, fire, smoke
11. others - precipitation, heavy mix, severe turbulence, northern lights, record high/low, none, no severe weather, mild pattern, high, excessive, other, dust, fog, avalanche, land slide/slump, volcanic, vog

DF$EVTYPE <- tolower(DF$EVTYPE)

DF <- DF[grepl("summary(.*)", DF$EVTYPE)==F,] 
DF <- DF[grepl("apache county", DF$EVTYPE)==F,] 
DF <- DF[grepl("southeast", DF$EVTYPE)==F,] 
DF <- DF[grepl("monthly(.*)", DF$EVTYPE)==F,] 

DF[grepl("(.*)surf(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)swell(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)sea(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)wave(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("marine(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("seiche(.*)", DF$EVTYPE)==T,8] <- "sea/coast"

DF[grepl("beach(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("coastal(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("c(.*)st(.*)l(.*)flood(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("dam(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("tidal flood(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("storm surge(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("blow(.*)out tide(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)low tide(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)high tide(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)tsunami(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)rip current(.*)", DF$EVTYPE)==T,8] <- "sea/coast"
DF[grepl("(.*)red flag(.*)", DF$EVTYPE)==T,8] <- "sea/coast"

DF[grepl("(.*)flood(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)rapidly rising water(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)flash fl(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)urban(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)sm(.*)stream(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("(.*)high water(.*)", DF$EVTYPE)==T,8] <- "flood"
DF[grepl("drowning", DF$EVTYPE)==T,8] <- "flood"

DF[grepl("hail(.*)", DF$EVTYPE)==T,8] <- "hail"

DF[grepl("torrential rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("t(.*)u(.*)e(.*)storm(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("h(.*)vy rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("excessive rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("heavy shower(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("(.*)wet(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("metro storm", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("(.*)rain(.*)", DF$EVTYPE)==T,8] <- "rain"
DF[grepl("tropical depression(.*)", DF$EVTYPE)==T,8] <- "rain"

DF[grepl("hurricane(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("typhoon(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("tropical storm(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("tstm(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"
DF[grepl("(.)floyd(.*)", DF$EVTYPE)==T,8] <- "storm/hurr"

DF[grepl("lig(.*)t(.*)ing(.*)", DF$EVTYPE)==T,8] <- "lightning"

DF[grepl("wall cloud(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("torn(.*)o(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("wa(.*)ter(.*)spout(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("(.*)funnel(.*)", DF$EVTYPE)==T,8] <- "tornado"
DF[grepl("land(.*)spout(.*)", DF$EVTYPE)==T,8] <- "tornado"

DF[grepl("wint(.*)r(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("thunder(.*)w(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)blizzard(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("heavy(.*)snow(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)snow(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)freez(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)frost(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)ice(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)sleet(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)freezing rain(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)glaze(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)icy(.*)", DF$EVTYPE)==T,8] <- "wintry"

DF[grepl("low temperature(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)cold(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)cool(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("(.*)wind chil(.*)", DF$EVTYPE)==T,8] <- "wintry"
DF[grepl("hypothermia(.*)", DF$EVTYPE)==T,8] <- "wintry"

DF[grepl("(.*)wind(.*)", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("gust(.*)", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("wnd", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("(.*)mic(.*)oburst", DF$EVTYPE)==T,8] <- "wind"
DF[grepl("(.*)down(.*)burst", DF$EVTYPE)==T,8] <- "wind"

DF[grepl("heat(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("hot(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("high temperature(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("record temperature(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("temperature record(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("(.*)warm(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("hyperthermia(.*)", DF$EVTYPE)==T,8] <- "heat"

DF[grepl("drought(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("below normal precipitation", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("dry(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("driest(.*)", DF$EVTYPE)==T,8] <- "heat"

DF[grepl("(.*)fire(.*)", DF$EVTYPE)==T,8] <- "heat"
DF[grepl("(.*)smoke", DF$EVTYPE)==T,8] <- "heat"

DF[grepl("(.*)precip(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("heavy mix", DF$EVTYPE)==T,8] <- "others"
DF[grepl("severe turbulence", DF$EVTYPE)==T,8] <- "others"
DF[grepl("northern lights", DF$EVTYPE)==T,8] <- "others"
DF[grepl("record high", DF$EVTYPE)==T,8] <- "others"
DF[grepl("record low", DF$EVTYPE)==T,8] <- "others"
DF[grepl("none", DF$EVTYPE)==T,8] <- "others"
DF[grepl("no severe weather", DF$EVTYPE)==T,8] <- "others"
DF[grepl("mild pattern", DF$EVTYPE)==T,8] <- "others"
DF[grepl("high", DF$EVTYPE)==T,8] <- "others"
DF[grepl("excessive", DF$EVTYPE)==T,8] <- "others"
DF[grepl("other", DF$EVTYPE)==T,8] <- "others"

DF[grepl("dust(.*)", DF$EVTYPE)==T,8] <- "others"

DF[grepl("fog(.*)", DF$EVTYPE)==T,8] <- "others"

DF[grepl("avalanc(.*)e(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("(.*)slide(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("land(.*)slump(.*)", DF$EVTYPE)==T,8] <- "others"

DF[grepl("volcanic(.*)", DF$EVTYPE)==T,8] <- "others"
DF[grepl("vog(.*)", DF$EVTYPE)==T,8] <- "others"
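
Because each assignment overwrites EVTYPE with its category label, the sequence above appears to behave as "first matching rule wins". Under that assumption, the same idea can be sketched as a lookup table of patterns. This is a sketch only: `classify_event` is a hypothetical helper, and `rules` is a heavily shortened rule set that does not reproduce the full classification used in this report.

```r
# Hypothetical helper: assign each event label to the first category whose
# pattern matches; anything left unmatched falls into the default category.
classify_event <- function(x, rules, default = "others") {
  x <- tolower(x)
  out <- rep(default, length(x))
  unassigned <- rep(TRUE, length(x))
  for (category in names(rules)) {
    hit <- unassigned & grepl(rules[[category]], x)
    out[hit] <- category
    unassigned[hit] <- FALSE
  }
  out
}

# Shortened rule set for illustration; order matters (first match wins).
rules <- c(
  "sea/coast"  = "surf|swell|coastal|tsunami|rip current",
  "flood"      = "flood|urban|high water",
  "hail"       = "hail",
  "storm/hurr" = "hurricane|typhoon|tropical storm|tstm",
  "tornado"    = "torn.*o|funnel|spout",
  "wintry"     = "snow|blizzard|ice|cold|freez",
  "wind"       = "wind|gust",
  "heat"       = "heat|hot|drought|dry"
)

events <- c("TORNADO", "Coastal Flood", "FLASH FLOOD", "TSTM WIND",
            "Heavy Snow", "HIGH WIND", "Excessive Heat", "Northern Lights")
print(classify_event(events, rules))
```

Keeping the patterns in one named vector makes the rule order explicit and avoids relying on EVTYPE being column 8 of the data frame.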


2b : Cleaning the data (BGN_DATE)

After that, we extract only the year from the “BGN_DATE” field and store it in a new YEAR column.

DF$YEAR <- as.character(DF$BGN_DATE)
DF$YEAR <- strptime(DF$YEAR, "%m/%d/%Y %H:%M:%S")
DF$YEAR <- format(DF$YEAR, "%Y")
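
As an aside, the year can also be obtained directly as an integer by parsing only the date part. A minimal sketch, using three BGN_DATE values copied from the preview above (`bgn_date` and `year` are names introduced here):

```r
# Sample values copied from the BGN_DATE column shown earlier.
bgn_date <- c("4/18/1950 0:00:00", "2/20/1951 0:00:00", "11/15/1951 0:00:00")

# Parse the date part only (the trailing time is ignored) and
# pull out the year as an integer.
year <- as.integer(format(as.Date(bgn_date, format = "%m/%d/%Y"), "%Y"))
print(year)
```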


Next, we make a contingency table of the YEAR with the types of event (EVTYPE), and print out the contingency table.

YEAR_EV <- table(DF$YEAR, DF$EVTYPE)
YEAR_EV
##       
##        flood  hail  heat lightning others  rain sea/coast storm/hurr
##   1950     0     0     0         0      0     0         0          0
##   1951     0     0     0         0      0     0         0          0
##   1952     0     0     0         0      0     0         0          0
##   1953     0     0     0         0      0     0         0          0
##   1954     0     0     0         0      0     0         0          0
##   1955     0   360     0         0      0     0         0        421
##   1956     0   401     0         0      0     0         0        735
##   1957     0   479     0         0      0     0         0        775
##   1958     0   706     0         0      0     0         0        899
##   1959     0   531     0         0      0     0         0        652
##   1960     0   581     0         0      0     0         0        719
##   1961     0   722     0         0      0     0         0        752
##   1962     0   886     0         0      0     0         0        830
##   1963     0   652     0         0      0     0         0        823
##   1964     0   679     0         0      0     0         0        909
##   1965     0   805     0         0      0     0         0       1055
##   1966     0   732     0         0      0     0         0       1050
##   1967     0   764     0         0      0     0         0        958
##   1968     0  1068     0         0      0     0         0       1529
##   1969     0   766     0         0      0     0         0       1510
##   1970     0   721     0         0      0     0         0       1794
##   1971     0   964     0         0      0     0         0       1544
##   1972     0   681     0         0      0     0         0        712
##   1973     0  1098     0         0      0     0         0       2166
##   1974     0  1660     0         0      0     0         0       2603
##   1975     0  1374     0         0      0     0         0       2639
##   1976     0  1091     0         0      0     0         0       1742
##   1977     0  1083     0         0      0     0         0       1723
##   1978     0  1024     0         0      0     0         0       1758
##   1979     0  1315     0         0      0     0         0       2046
##   1980     0  1993     0         0      0     0         0       3181
##   1981     0  1494     0         0      0     0         0       2193
##   1982     0  2381     0         0      0     0         0       3570
##   1983     0  2334     0         0      0     0         0       4993
##   1984     0  2749     0         0      0     0         0       3566
##   1985     0  3379     0         0      0     0         0       3827
##   1986     0  3512     0         0      0     0         0       4365
##   1987     0  2416     0         0      0     0         0       4256
##   1988     0  2537     0         0      0     0         0       3947
##   1989     0  3778     0         0      0     0         0       5711
##   1990     0  3618     0         0      0     0         0       6064
##   1991     0  4811     0         0      0     0         0       6503
##   1992     0  5687     0         0      0     0         0       6443
##   1993  1579  4216    30       467     25  3889        52         15
##   1994  1868  6733    57      1010     50  8001        73         54
##   1995  3021  8370   211      1083    123 10731       212        305
##   1996  4551 10855   218       914     85   399       159      10045
##   1997  3984  8801   145       841    142   395       173       9869
##   1998  4933 12730   452       901    101   718       277      13627
##   1999  3397 10236   750       863    144   520       186      10378
##   2000  3560 11372   700       907    152   646       172      12171
##   2001  3850 12389   487       880    206   513       371      11762
##   2002  4167 12689   656       875    161   363      1325      11849
##   2003  4912 13911   424       741    188   938      1653      12036
##   2004  5704 13142   188       705    217   741      1551      11963
##   2005  4325 13788   368       864    221   858      1422      12397
##   2006  3851 16638   709       840    267  1549      1418      13176
##   2007  5494 12711   726       719    346 13877      1253         25
##   2008  6123 17546   623       766    307 17651      1502        181
##   2009  6069 13313   471       721    225 14472      1270         11
##   2010  6700 10922   814       867    340 16947      1517         34
##   2011  7182 17761  1587       801    320 22743      1953        162
##       
##        tornado  wind wintry
##   1950     223     0      0
##   1951     269     0      0
##   1952     272     0      0
##   1953     492     0      0
##   1954     609     0      0
##   1955     632     0      0
##   1956     567     0      0
##   1957     930     0      0
##   1958     608     0      0
##   1959     630     0      0
##   1960     645     0      0
##   1961     772     0      0
##   1962     673     0      0
##   1963     493     0      0
##   1964     760     0      0
##   1965     995     0      0
##   1966     606     0      0
##   1967     966     0      0
##   1968     715     0      0
##   1969     650     0      0
##   1970     700     0      0
##   1971     963     0      0
##   1972     775     0      0
##   1973    1199     0      0
##   1974    1123     0      0
##   1975     962     0      0
##   1976     935     0      0
##   1977     922     0      0
##   1978     875     0      0
##   1979     918     0      0
##   1980     972     0      0
##   1981     830     0      0
##   1982    1181     0      0
##   1983     995     0      0
##   1984    1020     0      0
##   1985     773     0      0
##   1986     849     0      0
##   1987     695     0      0
##   1988     773     0      0
##   1989     921     0      0
##   1990    1264     0      0
##   1991    1208     0      0
##   1992    1404     0      0
##   1993     895   497    942
##   1994    1516   586    681
##   1995    1755   901   1257
##   1996    1693  1230   2046
##   1997    1769   899   1658
##   1998    2147   910   1321
##   1999    2113  1123   1571
##   2000    1748  1123   1909
##   2001    1831   992   1671
##   2002    1480  1022   1694
##   2003    2121   855   1973
##   2004    2571   846   1735
##   2005    1968   912   2061
##   2006    1860  1792   1934
##   2007    1853  1965   4320
##   2008    2483  3069   5412
##   2009    1941  2666   4658
##   2010    2119  2568   5333
##   2011    2921  2547   4197


From the above contingency table, we observe that :
1. From 1950 to 1954, only data for tornado were captured.
2. From 1955 to 1992, only data for hail, storm/hurricane (storm/hurr) and tornado were captured.
3. From 1993 to 2011, data for all 11 broad categories of events were captured.

For meaningful comparison of all types of events, we choose to use only data for 1993-2011.
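
The cut-off year can also be read off the contingency table programmatically instead of by eye. The sketch below uses a small matrix with counts copied from the table above, restricted to three years and three categories; `year_ev`, `n_categories` and `full_years` are names introduced here.

```r
# Counts copied from the contingency table above (three years, three categories).
year_ev <- matrix(
  c(223,    0,  0,
    632,  360,  0,
    895, 4216, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("1950", "1955", "1993"), c("tornado", "hail", "heat"))
)

# Number of categories with at least one record in each year.
n_categories <- rowSums(year_ev > 0)

# Years in which every category is represented.
full_years <- names(n_categories)[n_categories == ncol(year_ev)]
print(full_years)
```

Applied to the full YEAR_EV table, the same two lines should return the years 1993 to 2011.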

2c : Further Cleaning of Data (subset for Year 1993-2011)

So, we subset the data for 1993-2011 only.

DF$YEAR <- as.numeric(DF$YEAR)
DF <- DF[DF$YEAR >= 1993 & DF$YEAR <= 2011,]


Results



From this point on, only data from 1993-2011 were considered.

The analysis covers the entire United States.


Population Health Impact is measured by the columns FATALITIES and INJURIES.
Economic Impact is measured by the columns PROPDMG and CROPDMG.
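
Note that the data preview also shows PROPDMGEXP and CROPDMGEXP columns next to the damage amounts. The totals in this report are sums of the raw PROPDMG and CROPDMG values. If the amounts were to be scaled by those codes, one possible approach is sketched below. It assumes the common reading K = thousands, M = millions, B = billions and leaves any other code unscaled; `exp_multiplier` is a hypothetical helper and the sample values are made up.

```r
# Hypothetical helper: convert an exponent code to a numeric multiplier.
# Assumes K = 1e3, M = 1e6, B = 1e9; any other code is left unscaled (1).
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Made-up sample values for illustration.
propdmg    <- c(25.0, 2.5, 1.2, 10)
propdmgexp <- c("K", "k", "M", "")

damage_usd <- propdmg * exp_multiplier(propdmgexp)
print(damage_usd)
```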

Re-shaping the data


The purpose is to allow faceting in ggplot.

POPULATION HEALTH IMPACT
========================
We re-shape the data by creating the following data frames :
1a. sumDF_FATALITIES : EVTYPE (event type), NUM_LIVES (number of lives lost), “Fatalities” (Health Category)
1b. sumDF_INJURIES : EVTYPE (event type), NUM_LIVES (number of people injured), “Injuries” (Health Category)

Then we row-bind the above 2 data frames to get one combined data frame for HEALTH Impact.
1c. sumDF_HEALTH : EVTYPE, NUM_LIVES, HEALTH_CAT
This gives the number of lives affected per event type, split into FATALITIES and INJURIES.

ECONOMIC IMPACT
================
We re-shape the data by creating the following data frames :
2a. sumDF_PROPDMG : EVTYPE (event type), AMOUNT_DMG (amount of damage), “Property Damage” (Economic Category)
2b. sumDF_CROPDMG : EVTYPE (event type), AMOUNT_DMG (amount of damage), “Crop Damage” (Economic Category)

Then we row-bind the above 2 data frames to get one combined data frame for ECONOMIC Impact.
2c. sumDF_ECONOMIC : EVTYPE, AMOUNT_DMG, ECONOMIC_CAT
This gives the amount of economic damage per event type, split into PROPDMG (property damage) and CROPDMG (crop damage).

EVTYPE <- data.frame(table(DF$EVTYPE))[,1]

sumDF_FATALITIES <- data.frame(EVTYPE, tapply(DF$FATALITIES,DF$EVTYPE,sum), "Fatalities")
sumDF_INJURIES<- data.frame(EVTYPE, tapply(DF$INJURIES,DF$EVTYPE,sum), "Injuries")
colnames(sumDF_FATALITIES) <- c("EVTYPE","NUM_LIVES","HEALTH_CAT")
colnames(sumDF_INJURIES) <- c("EVTYPE","NUM_LIVES","HEALTH_CAT")
sumDF_HEALTH <- rbind(sumDF_FATALITIES, sumDF_INJURIES)

sumDF_PROPDMG <- data.frame(EVTYPE, tapply(DF$PROPDMG,DF$EVTYPE,sum), "Property Damage")
sumDF_CROPDMG <- data.frame(EVTYPE, tapply(DF$CROPDMG,DF$EVTYPE,sum), "Crop Damage")
colnames(sumDF_PROPDMG) <- c("EVTYPE","AMOUNT_DMG","ECONOMIC_CAT")
colnames(sumDF_CROPDMG) <- c("EVTYPE","AMOUNT_DMG","ECONOMIC_CAT")
sumDF_ECONOMIC <- rbind(sumDF_PROPDMG, sumDF_CROPDMG)
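
For comparison, the same long-format table can be built with aggregate() instead of separate tapply() calls. This is an alternative sketch, not the code used for the results below; `toy`, `wide` and `health` are names introduced here and the values are made up.

```r
# Toy stand-in for DF after cleaning; values are made up for illustration.
toy <- data.frame(
  EVTYPE     = c("flood", "heat", "flood", "tornado", "heat"),
  FATALITIES = c(1, 3, 2, 0, 4),
  INJURIES   = c(5, 10, 0, 7, 1)
)

# Sum both measures per event type in one call.
wide <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Stack into long format: one row per (event type, health category).
health <- data.frame(
  EVTYPE     = rep(wide$EVTYPE, times = 2),
  NUM_LIVES  = c(wide$FATALITIES, wide$INJURIES),
  HEALTH_CAT = rep(c("Fatalities", "Injuries"), each = nrow(wide))
)
print(health)
```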


Then we print the table of the sum of FATALITIES, INJURIES, PROPDMG, CROPDMG by each event.

sumDF_HEALTH
##                 EVTYPE NUM_LIVES HEALTH_CAT
## flood            flood      1552 Fatalities
## hail              hail        40 Fatalities
## heat              heat      3048 Fatalities
## lightning    lightning       817 Fatalities
## others          others       375 Fatalities
## rain              rain       313 Fatalities
## sea/coast    sea/coast      1098 Fatalities
## storm/hurr  storm/hurr       443 Fatalities
## tornado        tornado      1627 Fatalities
## wind              wind       445 Fatalities
## wintry          wintry      1107 Fatalities
## flood1           flood      8673   Injuries
## hail1             hail      1066   Injuries
## heat1             heat     10444   Injuries
## lightning1   lightning      5231   Injuries
## others1         others      1815   Injuries
## rain1             rain      2757   Injuries
## sea/coast1   sea/coast      1478   Injuries
## storm/hurr1 storm/hurr      5350   Injuries
## tornado1       tornado     23403   Injuries
## wind1             wind      1874   Injuries
## wintry1         wintry      6674   Injuries
sumDF_ECONOMIC
##                 EVTYPE AMOUNT_DMG    ECONOMIC_CAT
## flood            flood 2444573.31 Property Damage
## hail              hail  699166.38 Property Damage
## heat              heat  131180.95 Property Damage
## lightning    lightning  603396.78 Property Damage
## others          others   47851.43 Property Damage
## rain              rain 1386978.96 Property Damage
## sea/coast    sea/coast   61834.74 Property Damage
## storm/hurr  storm/hurr 1411756.54 Property Damage
## tornado        tornado 1401013.64 Property Damage
## wind              wind  452839.05 Property Damage
## wintry          wintry  419397.16 Property Damage
## flood1           flood  367244.53     Crop Damage
## hail1             hail  585956.66     Crop Damage
## heat1             heat   44632.24     Crop Damage
## lightning1   lightning    3580.61     Crop Damage
## others1         others    2672.90     Crop Damage
## rain1             rain   98285.93     Crop Damage
## sea/coast1   sea/coast    1861.50     Crop Damage
## storm/hurr1 storm/hurr  127305.51     Crop Damage
## tornado1       tornado  100026.77     Crop Damage
## wind1             wind   21713.76     Crop Damage
## wintry1         wintry   24546.91     Crop Damage


Plotting the Impact on Population Health

Finally, we use ggplot to get the barplots.

We first look at the Impact on Population Health - by plotting Types of Events against Number of Lives Affected, facetted by HEALTH (FATALITIES/INJURIES)

library(ggplot2)

# Bar chart of lives affected per event type, one panel per health category
h <- ggplot(sumDF_HEALTH, aes(x = reorder(EVTYPE, -NUM_LIVES), y = NUM_LIVES))
h <- h + geom_bar(stat = "identity") + facet_grid(HEALTH_CAT ~ ., margins = TRUE)
h <- h + labs(y = "Number of Lives Affected")
h <- h + labs(x = "Event")
h <- h + ggtitle(expression(atop("Impact on Population Health", atop("Across United States from 1993-2011", ""))))
h <- h + theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 1))
print(h)


Observations from the Plot on Population Health Impact

  1. The events that cause the most population health impact are : tornado, heat and flood.
  2. Heat causes the highest fatalities, while tornado causes the highest injuries.
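
These observations can be cross-checked numerically against the totals printed in the sumDF_HEALTH table. The sketch below re-enters those totals by hand; the vector names are introduced here for illustration.

```r
# Totals copied from the sumDF_HEALTH table printed earlier in this report.
events     <- c("flood", "hail", "heat", "lightning", "others", "rain",
                "sea/coast", "storm/hurr", "tornado", "wind", "wintry")
fatalities <- c(1552, 40, 3048, 817, 375, 313, 1098, 443, 1627, 445, 1107)
injuries   <- c(8673, 1066, 10444, 5231, 1815, 2757, 1478, 5350, 23403, 1874, 6674)

# Event with the largest total in each health category.
top_fatalities <- events[which.max(fatalities)]
top_injuries   <- events[which.max(injuries)]

# Top three events by combined fatalities and injuries.
combined <- setNames(fatalities + injuries, events)
top3 <- names(sort(combined, decreasing = TRUE))[1:3]

print(c(top_fatalities, top_injuries))
print(top3)
```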



Plotting the Economic Impact

Next, we look at the Economic Consequences - by plotting Types of Events against Amount of Damage, facetted by ECONOMIC (PROPDMG/CROPDMG)

# Bar chart of damage per event type, one panel per economic category
e <- ggplot(sumDF_ECONOMIC, aes(x = reorder(EVTYPE, -AMOUNT_DMG), y = AMOUNT_DMG))
e <- e + geom_bar(stat = "identity") + facet_grid(ECONOMIC_CAT ~ ., margins = TRUE)
e <- e + labs(y = "Amount of Damage")
e <- e + labs(x = "Event")
e <- e + ggtitle(expression(atop("Economic Consequences", atop("Across United States from 1993-2011", ""))))
e <- e + theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 1))
print(e)


Observations from the Plot on Economic Impact

  1. The events that cause the most economic impact are : flood, hurricane/storm, tornado, rain and hail.
  2. Flood causes the highest property damage, while hail causes the highest crop damage.
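
The ranking can be cross-checked against the totals printed in the sumDF_ECONOMIC table. The sketch below re-enters those totals by hand and ranks events by combined damage; `econ` and `ranked` are names introduced here. The figures are sums of the raw PROPDMG and CROPDMG columns as printed, so the ranking inherits whatever units those columns carry.

```r
# Raw totals copied from the sumDF_ECONOMIC table printed earlier.
econ <- data.frame(
  EVTYPE  = c("flood", "hail", "heat", "lightning", "others", "rain",
              "sea/coast", "storm/hurr", "tornado", "wind", "wintry"),
  PROPDMG = c(2444573.31, 699166.38, 131180.95, 603396.78, 47851.43, 1386978.96,
              61834.74, 1411756.54, 1401013.64, 452839.05, 419397.16),
  CROPDMG = c(367244.53, 585956.66, 44632.24, 3580.61, 2672.90, 98285.93,
              1861.50, 127305.51, 100026.77, 21713.76, 24546.91),
  stringsAsFactors = FALSE
)
econ$TOTAL <- econ$PROPDMG + econ$CROPDMG

# Rank events by combined raw damage, largest first.
ranked <- econ[order(-econ$TOTAL), ]
print(head(ranked, 5))

# Event with the largest total in each economic category.
top_property <- econ$EVTYPE[which.max(econ$PROPDMG)]
top_crop     <- econ$EVTYPE[which.max(econ$CROPDMG)]
print(c(top_property, top_crop))
```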

End of Report