Introduction

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Synopsis

This report addresses the following questions:

  1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

  2. Across the United States, which types of events have the greatest economic consequences?

After analysing and summarizing the data for the top 10 weather event types, we conclude that tornadoes caused the most total harm to population health (judged by the combined number of fatalities and injuries), and floods caused the most total harm to the economy (judged by the combined value of property damage and crop damage).

Data

The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size.

The links to download the dataset and the documentation are below:

Storm Data [47Mb]

National Weather Service Storm Data Documentation

National Climatic Data Center Storm Events FAQ

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

Data Processing

Install the packages dplyr, reshape2 and ggplot2, then load the following libraries.

# library(dplyr)
# library(ggplot2)
# library(reshape2)
# These calls are commented out here; each library is loaded in the chunk below that uses it.

Download the file from the URL and save it in your working directory.

#fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
#download.file(fileUrl, destfile = "./data/repdata-data-StormData.csv.bz2" , method = "curl")
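
The download step above can be made safe to re-run. The following is a minimal sketch rather than the code used for this report: fetch_once is a hypothetical helper name, and it assumes that skipping the download whenever the destination file already exists is acceptable.

```r
# Hypothetical helper: download a file only if it is not already on disk.
fetch_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # Create the target folder (for example ./data) if it is missing.
    dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps the compressed archive intact on Windows.
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage (commented out, as in the chunk above, to avoid a network call):
# fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# fetch_once(fileUrl, "./data/repdata-data-StormData.csv.bz2")
```

Returning the path invisibly lets the call be passed straight to a reader function without printing anything.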

Since the file has already been downloaded to my working directory, I load it into a data frame and check its dimensions, summary, structure and column names.

# Set the chunk's cache option to TRUE: the dataset is large and time-consuming to reload on every run.
rawstormData <- read.csv("repdata-data-StormData.csv",as.is=TRUE)
dim(rawstormData) 
## [1] 902297     37
summary(rawstormData)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##                                                                         
##      COUNTY       COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :100.6                                                           
##  3rd Qu.:131.0                                                           
##  Max.   :873.0                                                           
##                                                                          
##    BGN_RANGE          BGN_AZI           BGN_LOCATI       
##  Min.   :   0.000   Length:902297      Length:902297     
##  1st Qu.:   0.000   Class :character   Class :character  
##  Median :   0.000   Mode  :character   Mode  :character  
##  Mean   :   1.484                                        
##  3rd Qu.:   1.000                                        
##  Max.   :3749.000                                        
##                                                          
##    END_DATE           END_TIME           COUNTY_END COUNTYENDN    
##  Length:902297      Length:902297      Min.   :0    Mode:logical  
##  Class :character   Class :character   1st Qu.:0    NA's:902297   
##  Mode  :character   Mode  :character   Median :0                  
##                                        Mean   :0                  
##                                        3rd Qu.:0                  
##                                        Max.   :0                  
##                                                                   
##    END_RANGE          END_AZI           END_LOCATI       
##  Min.   :  0.0000   Length:902297      Length:902297     
##  1st Qu.:  0.0000   Class :character   Class :character  
##  Median :  0.0000   Mode  :character   Mode  :character  
##  Mean   :  0.9862                                        
##  3rd Qu.:  0.0000                                        
##  Max.   :925.0000                                        
##                                                          
##      LENGTH              WIDTH                F               MAG         
##  Min.   :   0.0000   Min.   :   0.000   Min.   :0.0      Min.   :    0.0  
##  1st Qu.:   0.0000   1st Qu.:   0.000   1st Qu.:0.0      1st Qu.:    0.0  
##  Median :   0.0000   Median :   0.000   Median :1.0      Median :   50.0  
##  Mean   :   0.2301   Mean   :   7.503   Mean   :0.9      Mean   :   46.9  
##  3rd Qu.:   0.0000   3rd Qu.:   0.000   3rd Qu.:1.0      3rd Qu.:   75.0  
##  Max.   :2315.0000   Max.   :4400.000   Max.   :5.0      Max.   :22000.0  
##                                         NA's   :843563                    
##    FATALITIES          INJURIES            PROPDMG       
##  Min.   :  0.0000   Min.   :   0.0000   Min.   :   0.00  
##  1st Qu.:  0.0000   1st Qu.:   0.0000   1st Qu.:   0.00  
##  Median :  0.0000   Median :   0.0000   Median :   0.00  
##  Mean   :  0.0168   Mean   :   0.1557   Mean   :  12.06  
##  3rd Qu.:  0.0000   3rd Qu.:   0.0000   3rd Qu.:   0.50  
##  Max.   :583.0000   Max.   :1700.0000   Max.   :5000.00  
##                                                          
##   PROPDMGEXP           CROPDMG         CROPDMGEXP       
##  Length:902297      Min.   :  0.000   Length:902297     
##  Class :character   1st Qu.:  0.000   Class :character  
##  Mode  :character   Median :  0.000   Mode  :character  
##                     Mean   :  1.527                     
##                     3rd Qu.:  0.000                     
##                     Max.   :990.000                     
##                                                         
##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
##  Length:902297      Length:902297      Length:902297      Min.   :   0  
##  Class :character   Class :character   Class :character   1st Qu.:2802  
##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
##                                                           Mean   :2875  
##                                                           3rd Qu.:4019  
##                                                           Max.   :9706  
##                                                           NA's   :47    
##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
##  Median :  8707   Median :   0   Median :     0   Mode  :character  
##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
##  Max.   : 17124   Max.   :9706   Max.   :106220                     
##                   NA's   :40                                        
##      REFNUM      
##  Min.   :     1  
##  1st Qu.:225575  
##  Median :451149  
##  Mean   :451149  
##  3rd Qu.:676723  
##  Max.   :902297  
## 
str(rawstormData)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
names(rawstormData)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
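
The loading step above reads an uncompressed CSV file. As a hedged aside, read.csv can usually read a bzip2-compressed file directly, because R's file connections detect the compression when opened for reading. The sketch below demonstrates this on a small toy file written to a temporary path; the toy data frame is an assumption for illustration and is not part of the storm database.

```r
# Toy data written to a temporary bzip2-compressed CSV file.
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(0, 1))
tmp <- tempfile(fileext = ".csv.bz2")

con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# No manual decompression step: the file name is passed as-is.
back <- read.csv(tmp, as.is = TRUE)
dim(back)
```

With the real archive, the same call would take the path of the downloaded .bz2 file in place of tmp.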

Of the 902,297 weather events, let’s find out how many unique event types are present.

UniqueEV <- unique(rawstormData$EVTYPE)
length(UniqueEV)
## [1] 985

Let’s summarize and get the count of those unique event types.

# Display the event types by count
suppressMessages(library(dplyr))
eventsCount <- rawstormData %>% select(EVTYPE) %>% group_by(EVTYPE) %>% summarise(count = n()) %>% data.frame() 
#eventsCount
# We get 985 entries, with messy event type descriptions: spelling mistakes, duplicate names and mixed upper and lower case. The documentation lists only 48 valid event types, so let's clean up.
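
How much of the inflated count comes from case differences alone can be checked on a toy vector. This is a base R sketch under the assumption that labels differing only in case denote the same event; the vector ev is made up for illustration.

```r
# Toy labels with mixed case; not taken from the storm database.
ev <- c("TORNADO", "Tornado", "HAIL", "TSTM WIND", "HAIL", "hail")

# Distinct labels before and after converting to upper case.
n_raw   <- length(unique(ev))
n_upper <- length(unique(toupper(ev)))

# Frequency table of the normalised labels, most common first.
counts <- sort(table(toupper(ev)), decreasing = TRUE)
counts
```

Here the five raw labels collapse to three once case is ignored; spelling variants such as TSTM WIND still need explicit rules.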

The valid EVTYPE values from the NOAA documentation are as follows:

##  [1] "Astronomical Low Tide"    "Avalanche"               
##  [3] "Blizzard"                 "Coastal Flood"           
##  [5] "Cold/Wind Chill"          "Debris Flow"             
##  [7] "Dense Fog"                "Dense Smoke"             
##  [9] "Drought"                  "Dust Devil"              
## [11] "Dust Storm"               "Excessive Heat"          
## [13] "Extreme Cold/Wind Chill"  "Flash Flood"             
## [15] "Flood"                    "Freezing Fog"            
## [17] "Frost/Freeze"             "Funnel Cloud"            
## [19] "Hail"                     "Heat"                    
## [21] "Heavy Rain"               "Heavy Snow"              
## [23] "High Surf"                "High Wind"               
## [25] "Hurricane (Typhoon)"      "Ice Storm"               
## [27] "Lake-Effect Snow"         "Lakeshore Flood"         
## [29] "Lightning"                "Marine Hail"             
## [31] "Marine High Wind"         "Marine Strong Wind"      
## [33] "Marine Thunderstorm Wind" "Rip Current"             
## [35] "Seiche"                   "Sleet"                   
## [37] "Storm Surge/Tide"         "Strong Wind"             
## [39] "Thunderstorm Wind"        "Tornado"                 
## [41] "Tropical Depression"      "Tropical Storm"          
## [43] "Tsunami"                  "Volcanic Ash"            
## [45] "Waterspout"               "Wildfire"                
## [47] "Winter Storm"             "Winter Weather"
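
Given such a list, the labels that do not match any official name can be flagged with %in%. The sketch below uses a three-item excerpt of the list and a toy vector of labels; both are assumptions for illustration.

```r
# Short excerpt of the official names, upper-cased for comparison.
valid <- toupper(c("Flood", "Hail", "Tornado"))

# Toy labels; not taken from the storm database.
ev <- c("TORNADO", "TSTM WIND", "hail", "FLOODING")

# TRUE where the upper-cased label is an exact official name.
is_valid  <- toupper(ev) %in% valid
unmatched <- unique(ev[!is_valid])
unmatched
```

The unmatched labels are the ones that need a cleanup rule or a decision to leave them as they are.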

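Since some entries are in lower case, let’s convert all entries in the EVTYPE column to upper case and standardise some of them. Note that some of the target labels used below (for example FOG, RAIN, THUNDERSTORM and STORM SURGE) are broader groupings rather than exact names from the official list.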

rawstormData$EVTYPE<-toupper(rawstormData$EVTYPE)
rawstormData[grep("BLIZZARD*", rawstormData$EVTYPE), c("EVTYPE")] <- "BLIZZARD"
rawstormData[grep("COASTAL*", rawstormData$EVTYPE), c("EVTYPE")] <- "COASTAL FLOOD"
rawstormData[grep("*COLD*", rawstormData$EVTYPE), c("EVTYPE")] <- "COLD/WIND CHILL"
rawstormData[grep("EROSION*", rawstormData$EVTYPE), c("EVTYPE")] <- "COASTAL FLOOD"
rawstormData[grep("DENSE FOG", rawstormData$EVTYPE),c("EVTYPE")] <- "FOG"
rawstormData[grep("EXTREME COLD", rawstormData$EVTYPE), c("EVTYPE")] <- "EXTREME COLD/WIND CHILL"
rawstormData[grep("EXTREME WIND CHILL", rawstormData$EVTYPE), c("EVTYPE")] <- "EXTREME COLD/WIND CHILL"
rawstormData[grep("FUNNEL CLOUDS", rawstormData$EVTYPE), c("EVTYPE")] <- "FUNNEL CLOUD"
rawstormData[grep("FREEZE", rawstormData$EVTYPE), c("EVTYPE")] <- "FROST/FREEZE"
rawstormData[grep("FROST", rawstormData$EVTYPE), c("EVTYPE")] <- "FROST/FREEZE"
rawstormData[grep("*FLOOD*", rawstormData$EVTYPE), c("EVTYPE")] <- "FLOOD"
rawstormData[grep("*FLD*", rawstormData$EVTYPE), c("EVTYPE")] <- "FLOOD"
rawstormData[grep("HEAVY SURF", rawstormData$EVTYPE), c("EVTYPE")] <- "HIGH SURF"
rawstormData[grep("HEAVY SURF/HIGH SURF", rawstormData$EVTYPE), c("EVTYPE")] <- "HIGH SURF"
rawstormData[grep("HIGH WINDS", rawstormData$EVTYPE), c("EVTYPE")] <- "HIGH WIND"
rawstormData[grep("HAIL*", rawstormData$EVTYPE), c("EVTYPE")] <- "HAIL"
rawstormData[grep("*HEAT*", rawstormData$EVTYPE), c("EVTYPE")] <- "HEAT"
rawstormData[grep("HEAVY RAIN*", rawstormData$EVTYPE), c("EVTYPE")] <- "HEAVY RAIN"
rawstormData[grep("HURRICANE*", rawstormData$EVTYPE), c("EVTYPE")] <- "HURRICANE/TYPHOON"
rawstormData[grep("SLIDE", rawstormData$EVTYPE),c("EVTYPE")] <- "LAND/MUD/ROCK SLIDES"
rawstormData[grep("ICE", rawstormData$EVTYPE), c("EVTYPE")] <- "ICE STORM"
rawstormData[grep("LIGHTNING|LIGHTING|LIGNTNING", rawstormData$EVTYPE), c("EVTYPE")] <- "LIGHTNING"
rawstormData[grep("*RAIN*", rawstormData$EVTYPE), c("EVTYPE")] <- "RAIN"
rawstormData[grep("RIP CURRENTS", rawstormData$EVTYPE), c("EVTYPE")] <- "RIP CURRENT"
rawstormData[grep("STORM SURGE/TIDE", rawstormData$EVTYPE), c("EVTYPE")] <- "STORM SURGE"
rawstormData[grep("STRONG WINDS", rawstormData$EVTYPE), c("EVTYPE")] <- "STRONG WIND"
rawstormData[grep("TYPHOON", rawstormData$EVTYPE), c("EVTYPE")] <- "HURRICANE/TYPHOON"
rawstormData[grep("TROPICAL D*", rawstormData$EVTYPE), c("EVTYPE")] <- "TROPICAL DEPRESSION"
rawstormData[grep("TROPICAL S*", rawstormData$EVTYPE), c("EVTYPE")] <- "TROPICAL STORM"
rawstormData[grep("TORNADO*", rawstormData$EVTYPE), c("EVTYPE")] <- "TORNADO"
rawstormData[grep("THUNDERSTORM*|TSTM*", rawstormData$EVTYPE), c("EVTYPE")] <- "THUNDERSTORM"
rawstormData[grep("WILD/FOREST FIRE*", rawstormData$EVTYPE), c("EVTYPE")] <- "WILDFIRE"
rawstormData[grep("WINTER WEATHER/MIX", rawstormData$EVTYPE), c("EVTYPE")] <- "WINTER WEATHER"
rawstormData[grep("WATERSPOUT*", rawstormData$EVTYPE), c("EVTYPE")] <- "WATERSPOUT"
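
Several patterns above use * as a shell-style wildcard, but grep interprets its pattern as a regular expression, where * means "zero or more of the preceding character". The rules are also applied in sequence, so a general rule that runs first can absorb labels meant for a later, more specific rule. The sketch below illustrates both points on toy labels; it is not a replacement for the cleanup above, and the printed results in this report come from the original rules.

```r
# Toy labels; not taken from the storm database.
ev <- c("TROPICAL STORM", "TROPICAL DEPRESSION", "EXTREME COLD", "COLD")

# "D*" allows zero D characters, so this pattern reduces to the
# prefix "TROPICAL " and matches both tropical labels.
loose <- grepl("TROPICAL D*", ev)

# Anchoring and dropping the star restricts the match to depressions.
strict <- grepl("^TROPICAL D", ev)

# glob2rx() translates a shell-style wildcard into a regular expression.
from_glob <- grepl(glob2rx("TROPICAL D*"), ev)

# Order matters: apply the specific rule first, then the general one
# only to labels the specific rule did not claim.
clean <- ev
is_extreme <- grepl("EXTREME COLD", clean, fixed = TRUE)
is_cold    <- grepl("COLD", clean, fixed = TRUE) & !is_extreme
clean[is_extreme] <- "EXTREME COLD/WIND CHILL"
clean[is_cold]    <- "COLD/WIND CHILL"
clean
```

Using fixed = TRUE for plain substrings avoids regular expression syntax altogether when no wildcard is needed.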

Once the cleanup is done, let’s subset the data frame to keep only the events that caused injuries, fatalities, property damage or crop damage, and only the columns needed for the analysis.

stormData <- subset(x = rawstormData, subset = INJURIES > 0 | FATALITIES > 0 | PROPDMG > 0 | CROPDMG > 0,
                select = c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG","PROPDMGEXP", "CROPDMG", "CROPDMGEXP"))

Let’s check the dimensions of the subsetted data frame, its first and last 3 rows, and the number of unique event types.

dim(stormData) 
## [1] 254633      7
head(stormData,3)
##    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO          0       15    25.0          K       0           
## 2 TORNADO          0        0     2.5          K       0           
## 3 TORNADO          0        2    25.0          K       0
tail(stormData,3)
##             EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 902257 STRONG WIND          0        0     1.0          K       0
## 902259     DROUGHT          0        0     2.0          K       0
## 902260   HIGH WIND          0        0     7.5          K       0
##        CROPDMGEXP
## 902257          K
## 902259          K
## 902260          K
UniqueEV <- unique(stormData$EVTYPE)
length(UniqueEV) 
## [1] 138

Let’s get the counts of the event types in the subsetted data frame that have at least 50 occurrences.

# Display the event types by count
suppressMessages(library(dplyr))
eventsCount <- stormData %>% select(EVTYPE) %>% group_by(EVTYPE) %>% summarise(count = n()) %>%
    filter(count >= 50) %>% data.frame() 
eventsCount
##                  EVTYPE  count
## 1             AVALANCHE    268
## 2              BLIZZARD    259
## 3       COLD/WIND CHILL    469
## 4               DROUGHT    266
## 5        DRY MICROBURST     78
## 6            DUST DEVIL     95
## 7            DUST STORM    103
## 8                 FLOOD  33197
## 9                   FOG    181
## 10         FROST/FREEZE    155
## 11                 HAIL  26673
## 12                 HEAT   3519
## 13            HIGH SURF    221
## 14            HIGH WIND   6191
## 15    HURRICANE/TYPHOON    232
## 16            ICE STORM    752
## 17     LAKE-EFFECT SNOW    194
## 18 LAND/MUD/ROCK SLIDES    209
## 19           LIGHT SNOW    141
## 20            LIGHTNING  13309
## 21                 RAIN     95
## 22          RIP CURRENT    641
## 23                 SNOW     54
## 24          STORM SURGE    224
## 25          STRONG WIND   3424
## 26         THUNDERSTORM 119292
## 27              TORNADO  39968
## 28       TROPICAL STORM    456
## 29           WATERSPOUT     55
## 30             WILDFIRE   1246
## 31                 WIND     84
## 32         WINTER STORM   1508
## 33       WINTER WEATHER    546

The list is reduced to 33 event types because we applied a filter of at least 50 occurrences.

Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

Let’s construct a variable called “W_INJURIES” that sums the stormData$INJURIES column by stormData$EVTYPE. This shows which weather event type caused the most injuries.

W_INJURIES <- aggregate(INJURIES ~ EVTYPE,data= stormData,FUN=sum,na.rm= TRUE)

Let’s construct a variable called “W_FATALITIES” that sums the stormData$FATALITIES column by stormData$EVTYPE. This shows which weather event type caused the most fatalities.

W_FATALITIES <- aggregate(FATALITIES ~ EVTYPE,data= stormData,FUN=sum,na.rm=TRUE)

Finally, let’s construct a variable called “W_CASUALTIES” that sums stormData$INJURIES plus stormData$FATALITIES by stormData$EVTYPE. This shows which weather event type was most harmful to population health.

stormData$CASUALTIES <- stormData$INJURIES + stormData$FATALITIES
W_CASUALTIES <- aggregate(CASUALTIES ~ EVTYPE,data= stormData,FUN=sum,na.rm=TRUE)
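
The three aggregations above can also be produced in a single call. The following sketch runs on a toy data frame (an assumption for illustration) and combines both columns with cbind before deriving the casualty total; health is a hypothetical name.

```r
# Toy data; not taken from the storm database.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD"),
  FATALITIES = c(1, 3, 0, 2),
  INJURIES   = c(15, 4, 2, 0),
  stringsAsFactors = FALSE
)

# Sum both health columns per event type in one pass.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Casualties are fatalities plus injuries; rank from most to least harmful.
health$CASUALTIES <- health$FATALITIES + health$INJURIES
health <- health[order(-health$CASUALTIES), ]
health
```

Keeping the three totals in one data frame makes it easier to compare the rankings side by side.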

Question 2: Across the United States, which types of events have the greatest economic consequences?

Now let’s calculate the property (stormData$PROPDMG) and crop (stormData$CROPDMG) damage per event. The stormData$PROPDMGEXP and stormData$CROPDMGEXP variables are magnitude fields in which H, K, M and B represent hundreds, thousands, millions and billions of US dollars, per the documentation. Any other value, including an empty or lower-case code, has no entry in the lookup below and therefore yields NA.

magnitude <- c(H = 10^2,K = 10^3, M = 10^6, B = 10^9)
stormData$PROPDMG <- stormData$PROPDMG * magnitude[as.character(stormData$PROPDMGEXP)]
stormData$CROPDMG <- stormData$CROPDMG * magnitude[as.character(stormData$CROPDMGEXP)]
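
The lookup above returns NA for every code that is not exactly H, K, M or B. One possible variant is sketched below on toy data: to_dollars is a hypothetical helper, and it assumes that lower-case codes should count as their upper-case equivalents and that any other code should contribute 0 dollars instead of NA. These assumptions would change the totals reported here.

```r
magnitude <- c(H = 10^2, K = 10^3, M = 10^6, B = 10^9)

# Hypothetical helper: convert a damage figure and its magnitude code to dollars.
to_dollars <- function(value, exp_code) {
  # Upper-case the code, then look up its multiplier (NA if unrecognised).
  mult <- unname(magnitude[toupper(as.character(exp_code))])
  # Assumption: unrecognised or empty codes contribute nothing.
  mult[is.na(mult)] <- 0
  value * mult
}

# Toy data; not taken from the storm database.
toy <- data.frame(PROPDMG    = c(25, 2.5, 3, 10),
                  PROPDMGEXP = c("K", "m", "", "?"),
                  stringsAsFactors = FALSE)
toy$PROPDMG_USD <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP)
toy
```

Returning 0 rather than NA keeps every event in later aggregations, at the cost of silently discarding amounts whose code could not be interpreted.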

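To find which weather event has the most expensive damages, let’s create the variable “W_DAMAGES”, which sums the property and crop damage in US dollars (stormData$PROPDMG and stormData$CROPDMG) per stormData$EVTYPE. Because the formula interface of aggregate omits rows with an NA in either damage column by default, only events with a recognised magnitude code in both fields contribute to these totals.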

W_DAMAGES <- aggregate(cbind(PROPDMG,CROPDMG) ~ EVTYPE, data = stormData, FUN=sum,na.rm=TRUE)
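
The effect of missing values on this aggregation can be seen on a toy data frame. The sketch below compares the formula interface, which drops a whole row when either column is NA, with the data-frame interface, which keeps the row and lets na.rm = TRUE skip only the missing cell. The toy data and the names by_formula and by_columns are assumptions for illustration.

```r
# Toy data: the first tornado row has no usable crop damage value.
toy <- data.frame(
  EVTYPE  = c("TORNADO", "TORNADO", "FLOOD"),
  PROPDMG = c(100, 200, 50),
  CROPDMG = c(NA, 10, 5),
  stringsAsFactors = FALSE
)

# Formula interface: rows with any NA are omitted before summing.
by_formula <- aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE, data = toy,
                        FUN = sum, na.rm = TRUE)

# Data-frame interface: all rows are kept, na.rm skips single NA cells.
by_columns <- aggregate(toy[c("PROPDMG", "CROPDMG")],
                        by = list(EVTYPE = toy$EVTYPE),
                        FUN = sum, na.rm = TRUE)

by_formula
by_columns
```

In this toy case the tornado property total is 200 with the formula interface and 300 with the data-frame interface, because the first row is discarded in the former.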

Results for Question 1

  1. Top 10 weather events with the most injuries:
W_INJURIES[order(W_INJURIES$INJURIES, decreasing = T), ][1:10, ]
##                EVTYPE INJURIES
## 111           TORNADO    91407
## 43               HEAT    10513
## 107      THUNDERSTORM     9447
## 27              FLOOD     8685
## 72          LIGHTNING     5232
## 62          ICE STORM     2154
## 52          HIGH WIND     1496
## 41               HAIL     1467
## 127          WILDFIRE     1456
## 58  HURRICANE/TYPHOON     1333
  2. Top 10 weather events with the most fatalities:
W_FATALITIES[order(W_FATALITIES$FATALITIES, decreasing = T), ][1:10, ]
##              EVTYPE FATALITIES
## 111         TORNADO       5636
## 43             HEAT       3366
## 27            FLOOD       1557
## 72        LIGHTNING        817
## 107    THUNDERSTORM        724
## 87      RIP CURRENT        572
## 12  COLD/WIND CHILL        451
## 52        HIGH WIND        291
## 7         AVALANCHE        224
## 134    WINTER STORM        206
  3. Top 10 weather events with the most casualties:
Top10_HEALTH <- W_CASUALTIES[order(W_CASUALTIES$CASUALTIES, decreasing = T), ][1:10,]
Top10_HEALTH
##           EVTYPE CASUALTIES
## 111      TORNADO      97043
## 43          HEAT      13879
## 27         FLOOD      10242
## 107 THUNDERSTORM      10171
## 72     LIGHTNING       6049
## 62     ICE STORM       2256
## 52     HIGH WIND       1787
## 127     WILDFIRE       1543
## 134 WINTER STORM       1527
## 41          HAIL       1512

Visualization for Question 1
Let’s plot the top 10 most harmful event types with respect to population health.

library(ggplot2)
ggplot(Top10_HEALTH, aes(reorder(EVTYPE, -CASUALTIES), CASUALTIES)) + 
geom_bar(stat = "identity", aes(fill = EVTYPE)) + 
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
labs(x="Event Type", y=expression("Total Number of Casualties")) +
labs(title=expression("Top10 Most Harmful Weather Events in US (1950-2011)"))

From the graph, we can see that tornadoes are the most harmful event type, followed by heat and floods.

Results for Question 2

library(reshape2)
Top10_ECONOMY <- melt(head(W_DAMAGES[order(-W_DAMAGES$PROPDMG,-W_DAMAGES$CROPDMG), ],10))
## Using EVTYPE as id variables
Top10_ECONOMY
##               EVTYPE variable        value
## 1              FLOOD  PROPDMG 146036259290
## 2  HURRICANE/TYPHOON  PROPDMG  38876883000
## 3            TORNADO  PROPDMG  16166781890
## 4               HAIL  PROPDMG   9596509190
## 5        STORM SURGE  PROPDMG   4643558000
## 6       THUNDERSTORM  PROPDMG   4340057500
## 7           WILDFIRE  PROPDMG   3547227470
## 8          HIGH WIND  PROPDMG   2685097340
## 9     TROPICAL STORM  PROPDMG   1063393350
## 10      WINTER STORM  PROPDMG   1017844200
## 11             FLOOD  CROPDMG  11739276150
## 12 HURRICANE/TYPHOON  CROPDMG   5313117800
## 13           TORNADO  CROPDMG    353383660
## 14              HAIL  CROPDMG   2055922950
## 15       STORM SURGE  CROPDMG       855000
## 16      THUNDERSTORM  CROPDMG   1144216750
## 17          WILDFIRE  CROPDMG    284822100
## 18         HIGH WIND  CROPDMG    667134850
## 19    TROPICAL STORM  CROPDMG    468261000
## 20      WINTER STORM  CROPDMG     23724000
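
The table above ranks event types by property damage first and uses crop damage only to break ties, so an event type with large crop losses but small property losses could be left out of the top 10. A ranking by combined damage is sketched below on toy data; the data frame and the TOTAL column are assumptions for illustration.

```r
# Toy totals per event type; not taken from the storm database.
toy <- data.frame(
  EVTYPE  = c("EVENT A", "EVENT B", "EVENT C"),
  PROPDMG = c(10, 8, 1),
  CROPDMG = c(0, 5, 1),
  stringsAsFactors = FALSE
)

# Ranking used above: property damage first, crop damage as tie-breaker.
by_property <- toy[order(-toy$PROPDMG, -toy$CROPDMG), ]

# Alternative: rank by the sum of both damage types.
toy$TOTAL <- toy$PROPDMG + toy$CROPDMG
by_total  <- toy[order(-toy$TOTAL), ]

by_property$EVTYPE
by_total$EVTYPE
```

In the toy data the two rankings disagree on first place, which is the situation the combined total is meant to guard against.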

Visualization for Question 2
To show the damage in billions of USD on the plot, let’s divide the property and crop damage values by 1e+09.

ggplot(Top10_ECONOMY, aes(x = EVTYPE,  y = value/1e+09, fill = variable)) +
geom_bar(stat = "identity") + 
coord_flip() + 
ggtitle("Top 10 Most Economically Damaging Weather Events in US (1950-2011)") +
labs(x = "Event Type", y = "Total Damages in Billions (USD)") + 
scale_fill_manual(values = c("brown", "green"), labels = c("Property Damage", "Crop Damage"))

From the graph, we can see that floods cause the most economic damage to property and crops.

Conclusions

1. Across the United States, which types of events are most harmful with respect to population health?

Tornadoes are the most harmful weather events, as they caused the most casualties.

2. Across the United States, which types of events have the greatest economic consequences?

Floods are the most expensive weather event, as they caused the highest combined property and crop damage, totalling more than 150 billion USD.