Synopsis

Severe weather events in the United States sometimes have severe, even catastrophic, consequences for population health and the economy. These events cause injuries and fatalities and often severely damage property and crops. Preventing such outcomes is a vital concern for any modern society affected by such natural events. A proper understanding of the impacts of these weather events is needed to focus attention on the events with the greatest consequences for the population and the economy. This analysis of weather event types by population health impact and economic consequences is based on publicly available data from the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database. It aims to answer two basic questions: which types of weather events are most harmful with respect to population health, and which types have the greatest economic consequences. The results of the analysis indicate:

Data Processing

The data processing steps, including the subsets defined for specific focus and the additional data elements derived from the raw data, are described below:

  1. Load the required R libraries and set up the working directory
# Load required libraries
library(ggplot2)
library(scales)

# Set the working directory
setwd("~/Documents/R_Programming/ReproducibleResearch/RepData_PeerAssessment2")
  2. Load the raw/source data in R
# Download the raw data file to the local machine
raw.data.url <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
raw.data.file <- "./data/repdata-data-StormData.csv.bz2"

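# Create the target directory if needed so download.file() has somewhere to write
if (!dir.exists(dirname(raw.data.file))) dir.create(dirname(raw.data.file), recursive = TRUE)

if (!file.exists(raw.data.file)) { # only download the file if it's not locally present
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(raw.data.url, raw.data.file, mode = "wb")
    print("File download complete and now beginning to read the data")
} else {
    print("File already exists, beginning to read the data")
}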
## [1] "File already exists, beginning to read the data"
# Read the local raw data file into R
stormdata <- read.table(file = raw.data.file, 
                         header = TRUE,
                         sep = ",",
                         stringsAsFactors = FALSE)

cat("The raw dataset has", nrow(stormdata), "rows and", ncol(stormdata), "columns.")
## The raw dataset has 902297 rows and 37 columns.
print("The first few rows of stormdata:")
## [1] "The first few rows of stormdata:"
head(stormdata)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL
##    EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO         0                                               0
## 2 TORNADO         0                                               0
## 3 TORNADO         0                                               0
## 4 TORNADO         0                                               0
## 5 TORNADO         0                                               0
## 6 TORNADO         0                                               0
##   COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1         NA         0                      14.0   100 3   0          0
## 2         NA         0                       2.0   150 2   0          0
## 3         NA         0                       0.1   123 2   0          0
## 4         NA         0                       0.0   100 2   0          0
## 5         NA         0                       0.0   150 2   0          0
## 6         NA         0                       1.5   177 2   0          0
##   INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1       15    25.0          K       0                                    
## 2        0     2.5          K       0                                    
## 3        2    25.0          K       0                                    
## 4        2     2.5          K       0                                    
## 5        2     2.5          K       0                                    
## 6        6     2.5          K       0                                    
##   LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1     3040      8812       3051       8806              1
## 2     3042      8755          0          0              2
## 3     3340      8742          0          0              3
## 4     3458      8626          0          0              4
## 5     3412      8642          0          0              5
## 6     3450      8748          0          0              6
print("Summary Analysis of stormdata")
## [1] "Summary Analysis of stormdata"
summary(stormdata)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##                                                                         
##      COUNTY     COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31   Class :character   Class :character   Class :character  
##  Median : 75   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :101                                                           
##  3rd Qu.:131                                                           
##  Max.   :873                                                           
##                                                                        
##    BGN_RANGE      BGN_AZI           BGN_LOCATI          END_DATE        
##  Min.   :   0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:   0   Class :character   Class :character   Class :character  
##  Median :   0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :   1                                                           
##  3rd Qu.:   1                                                           
##  Max.   :3749                                                           
##                                                                         
##    END_TIME           COUNTY_END COUNTYENDN       END_RANGE  
##  Length:902297      Min.   :0    Mode:logical   Min.   :  0  
##  Class :character   1st Qu.:0    NA's:902297    1st Qu.:  0  
##  Mode  :character   Median :0                   Median :  0  
##                     Mean   :0                   Mean   :  1  
##                     3rd Qu.:0                   3rd Qu.:  0  
##                     Max.   :0                   Max.   :925  
##                                                              
##    END_AZI           END_LOCATI            LENGTH           WIDTH     
##  Length:902297      Length:902297      Min.   :   0.0   Min.   :   0  
##  Class :character   Class :character   1st Qu.:   0.0   1st Qu.:   0  
##  Mode  :character   Mode  :character   Median :   0.0   Median :   0  
##                                        Mean   :   0.2   Mean   :   8  
##                                        3rd Qu.:   0.0   3rd Qu.:   0  
##                                        Max.   :2315.0   Max.   :4400  
##                                                                       
##        F               MAG          FATALITIES     INJURIES     
##  Min.   :0        Min.   :    0   Min.   :  0   Min.   :   0.0  
##  1st Qu.:0        1st Qu.:    0   1st Qu.:  0   1st Qu.:   0.0  
##  Median :1        Median :   50   Median :  0   Median :   0.0  
##  Mean   :1        Mean   :   47   Mean   :  0   Mean   :   0.2  
##  3rd Qu.:1        3rd Qu.:   75   3rd Qu.:  0   3rd Qu.:   0.0  
##  Max.   :5        Max.   :22000   Max.   :583   Max.   :1700.0  
##  NA's   :843563                                                 
##     PROPDMG      PROPDMGEXP           CROPDMG       CROPDMGEXP       
##  Min.   :   0   Length:902297      Min.   :  0.0   Length:902297     
##  1st Qu.:   0   Class :character   1st Qu.:  0.0   Class :character  
##  Median :   0   Mode  :character   Median :  0.0   Mode  :character  
##  Mean   :  12                      Mean   :  1.5                     
##  3rd Qu.:   0                      3rd Qu.:  0.0                     
##  Max.   :5000                      Max.   :990.0                     
##                                                                      
##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
##  Length:902297      Length:902297      Length:902297      Min.   :   0  
##  Class :character   Class :character   Class :character   1st Qu.:2802  
##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
##                                                           Mean   :2875  
##                                                           3rd Qu.:4019  
##                                                           Max.   :9706  
##                                                           NA's   :47    
##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
##  Median :  8707   Median :   0   Median :     0   Mode  :character  
##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
##  Max.   : 17124   Max.   :9706   Max.   :106220                     
##                   NA's   :40                                        
##      REFNUM      
##  Min.   :     1  
##  1st Qu.:225575  
##  Median :451149  
##  Mean   :451149  
##  3rd Qu.:676723  
##  Max.   :902297  
## 

Data Processing: Population Impact

  1. Subset the raw data set into a “humancost” data set to focus on the weather events most harmful with respect to population health
# Add Column for "Population.Harm" by adding the values in the FATALITIES and
# INJURIES columns
stormdata$Population.Harm <- stormdata$FATALITIES + stormdata$INJURIES


# Subset the storm data to examine the human cost defined as Population.Harm > 0
humancost <- subset(stormdata, Population.Harm > 0)
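As a quick illustration of this step, the sketch below applies the same derivation and filter to a small made-up data frame. The `toy` and `toy.humancost` names and all row values are assumptions for illustration only and do not come from the NOAA data.

```r
# Toy stand-in for stormdata; the rows are invented for illustration only
toy <- data.frame(
    EVTYPE     = c("TORNADO", "HAIL", "FLOOD", "HEAT"),
    FATALITIES = c(2, 0, 0, 1),
    INJURIES   = c(15, 0, 3, 0),
    stringsAsFactors = FALSE
)

# Same derivation as above: total harm is fatalities plus injuries
toy$Population.Harm <- toy$FATALITIES + toy$INJURIES

# Keep only the events that harmed at least one person
toy.humancost <- subset(toy, Population.Harm > 0)

print(toy.humancost)
```

The HAIL row, with no fatalities and no injuries, is the only one dropped by the filter.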
  2. Enhance the “humancost” data set by defining indexes to be used for fatalities and injuries
# Calculate the mean number of fatalities per event in the humancost subset
meanfatalities <- mean(humancost$FATALITIES)

# Calculate the mean number of injuries per event in the humancost subset
meaninjuries <- mean(humancost$INJURIES)

# Create a Fatality.Index: FATALITIES / mean(FATALITIES)
humancost$Fatality.Index <- humancost$FATALITIES / meanfatalities

# Create an Injury.Index: INJURIES / mean(INJURIES)
humancost$Injury.Index <- humancost$INJURIES / meaninjuries

# Create a Population.Harm.Index for each weather event based on the following formula:
# FATALITIES/mean(FATALITIES) + INJURIES/mean(INJURIES)
humancost$Population.Harm.Index <- humancost$Fatality.Index + 
    humancost$Injury.Index
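To make the index arithmetic concrete, here is a minimal worked sketch on three made-up events. The `toy`, `mean.fat` and `mean.inj` names and the values are assumptions chosen so the means come out to round numbers; they are not taken from the storm data.

```r
# Toy stand-in for the humancost subset; values are invented for illustration
toy <- data.frame(
    EVTYPE     = c("TORNADO", "FLOOD", "HEAT"),
    FATALITIES = c(2, 0, 1),
    INJURIES   = c(15, 3, 0),
    stringsAsFactors = FALSE
)

mean.fat <- mean(toy$FATALITIES)  # (2 + 0 + 1) / 3 = 1
mean.inj <- mean(toy$INJURIES)    # (15 + 3 + 0) / 3 = 6

# Each index expresses an event's count as a multiple of the subset mean
toy$Fatality.Index <- toy$FATALITIES / mean.fat  # 2, 0, 1
toy$Injury.Index   <- toy$INJURIES / mean.inj    # 2.5, 0.5, 0

# The combined index is the sum of the two scaled values
toy$Population.Harm.Index <- toy$Fatality.Index + toy$Injury.Index  # 4.5, 0.5, 1

print(toy)
```

Because each count is divided by its own mean, a single fatality in this toy set adds 1 to the index while a single injury adds only 1/6, so the rarer outcome carries more weight per occurrence.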
  3. “Map” the EVTYPE (event type) values into groups for easier aggregation and for better consistency across the data set, since EVTYPE values were entered differently over the time span

NOTE: This grouping could be done on the full stormdata data set, but I chose to group at the subset level so that the manual mapping only has to cover the unique EVTYPE values in the subset.
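For readers who want a more compact form of the same manual mapping, the sketch below shows one possible alternative: a named lookup vector built from a list of groups. This is not the method used in this report, only an illustration. The `toy`, `groups` and `lookup` names are hypothetical, and the list covers just a few EVTYPE values.

```r
# Toy event types; values are invented or borrowed for illustration only
toy <- data.frame(
    EVTYPE = c("Avalanche", "AVALANCE", "WILDFIRE", "brush fire", "SEICHE"),
    stringsAsFactors = FALSE
)

# Uppercase first, as in the report, so that case differences do not matter
toy$EVTYPE <- toupper(toy$EVTYPE)

# Hypothetical partial mapping: group label -> raw EVTYPE values
groups <- list(
    "Avalanche" = c("AVALANCE", "AVALANCHE"),
    "Wildfire"  = c("BRUSH FIRE", "WILD FIRES", "WILDFIRE")
)

# Flatten into a named character vector: names are raw values, elements are labels
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# Index the lookup by EVTYPE; values missing from the lookup come back as NA
toy$EVTYPE.Grouping <- unname(lookup[toy$EVTYPE])

print(toy)
```

As with the `%in%` assignments, any EVTYPE value that is not listed is left as NA, which makes unmapped entries easy to find with `is.na()`.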

# Create the EVTYPE.Grouping (event type grouping) to "normalize/clean" the various EVTYPE
# entries
## Make all entries uppercase to increase consistency
humancost$EVTYPE <- toupper(humancost$EVTYPE)

## "Map" the events into groupings
### Note: I realize this is a brute-force method instead of regexpr, but after
### examining the data I decided to do the mapping manually because I actually want
### to use this same dataset for a side project not related to the Coursera courses
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("AVALANCE",
                                                  "AVALANCHE")] <- "Avalanche"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("BLACK ICE", 
                                                  "BLIZZARD",
                                                  "BLOWING SNOW",
                                                  "COLD AND SNOW", 
                                                  "EXCESSIVE SNOW", 
                                                  "FALLING SNOW/ICE", 
                                                  "FREEZING RAIN", 
                                                  "FREEZING RAIN/SNOW",
                                                  "FREEZING SPRAY", 
                                                  "FROST", 
                                                  "GLAZE",
                                                  "GLAZE/ICE STORM", 
                                                  "HEAVY SNOW",
                                                  "HEAVY SNOW AND HIGH WINDS", 
                                                  "HEAVY SNOW SHOWER", 
                                                  "HEAVY SNOW/BLIZZARD/AVALANCHE",
                                                  "HEAVY SNOW/ICE", 
                                                  "HEAVY WIND/HEAVY SNOW", 
                                                  "HIGH WINDS/SNOW", 
                                                  "ICE", 
                                                  "ICE ON ROAD",
                                                  "ICE ROADS", 
                                                  "ICE STORM", 
                                                  "ICE STORM/FLAH FLOOD", 
                                                  "ICY ROADS",
                                                  "LIGHT SNOW", 
                                                  "MIXED PRECIP", 
                                                  "RAIN/SNOW", 
                                                  "SLEET", 
                                                  "SNOW", 
                                                  "SNOW AND ICE", 
                                                  "SNOW SQUALL",
                                                  "SNOW SQUALLS", 
                                                  "SNOW/ BITTER COLD",
                                                  "SNOW/HIGH WINDS", 
                                                  "THUNDERSNOW",
                                                  "WINTER STORM", 
                                                  "WINTER STORM HIGH WINDS", 
                                                  "WINTER STORMS", 
                                                  "WINTER WEATHER",
                                                  "WINTER WEATHER MIX",
                                                  "WINTER WEATHER/MIX", 
                                                  "WINTRY MIX",
                                                  "ICE STORM/FLASH FLOOD",
                                                  "SNOW/ICE",
                                                  "HIGH WIND/HEAVY SNOW",
                                                  "BLIZZARD/WINTER STORM",
                                                  "FROST/FREEZE",
                                                  "FROST\\FREEZE",
                                                  "GLAZE ICE",
                                                  "GROUND BLIZZARD",
                                                  "HEAVY LAKE SNOW",
                                                  "HEAVY MIX",
                                                  "HEAVY SNOW AND STRONG WINDS",
                                                  "HEAVY SNOW SQUALLS",
                                                  "HEAVY SNOW-SQUALLS",
                                                  "HEAVY SNOW/BLIZZARD",
                                                  "HEAVY SNOW/FREEZING RAIN",
                                                  "HEAVY SNOW/HIGH WINDS & FLOOD",
                                                  "HEAVY SNOW/SQUALLS",
                                                  "HEAVY SNOW/WIND",
                                                  "HEAVY SNOW/WINTER STORM",
                                                  "HEAVY SNOWPACK",
                                                  "ICE AND SNOW",
                                                  "ICE FLOES",
                                                  "ICE JAM",
                                                  "ICE/STRONG WINDS",
                                                  "LAKE EFFECT SNOW",
                                                  "LAKE-EFFECT SNOW",
                                                  "LATE SEASON SNOW",
                                                  "LIGHT FREEZING RAIN",
                                                  "LIGHT SNOWFALL",
                                                  "MIXED PRECIPITATION",
                                                  "RECORD SNOW",
                                                  "SLEET/ICE STORM",
                                                  "SNOW ACCUMULATION",
                                                  "SNOW AND HEAVY SNOW",
                                                  "SNOW AND ICE STORM",
                                                  "SNOW FREEZING RAIN",
                                                  "SNOW/ ICE",
                                                  "SNOW/BLOWING SNOW",
                                                  "SNOW/COLD",
                                                  "SNOW/FREEZING RAIN",
                                                  "SNOW/HEAVY SNOW",
                                                  "SNOW/ICE STORM",
                                                  "SNOW/SLEET",
                                                  "SNOW/SLEET/FREEZING RAIN",
                                                  "")] <- "Winter Weather"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("BRUSH FIRE", 
                                                  "WILD FIRES", 
                                                  "WILD/FOREST FIRE",
                                                  "WILDFIRE",
                                                  "DENSE SMOKE",
                                                  "FOREST FIRES",
                                                  "GRASS FIRES",
                                                  "WILD/FOREST FIRES",
                                                  "WILDFIRES")] <- "Wildfire"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("COASTAL FLOOD", 
                                                  "COASTAL FLOODING",
                                                  "COASTAL FLOODING/EROSION",
                                                  "FLASH FLOOD", 
                                                  "FLASH FLOOD/FLOOD",
                                                  "FLASH FLOODING", 
                                                  "FLASH FLOODING/FLOOD",
                                                  "FLASH FLOODS", 
                                                  "FLOOD", 
                                                  "FLOOD & HEAVY RAIN", 
                                                  "FLOOD/FLASH FLOOD",
                                                  "FLOOD/RIVER FLOOD", 
                                                  "FLOODING",
                                                  "MINOR FLOODING", 
                                                  "RAPIDLY RISING WATER",
                                                  "RIVER FLOOD", 
                                                  "RIVER FLOODING",
                                                  "STORM SURGE", 
                                                  "STORM SURGE/TIDE",
                                                  "TIDAL FLOODING", 
                                                  "FLOODING/COASTAL FLOODING",
                                                  "URBAN AND SMALL STREAM FLOODIN",
                                                  "URBAN/SML STREAM FLD",
                                                  " FLASH FLOOD",
                                                  "ASTRONOMICAL HIGH TIDE",
                                                  "BEACH EROSION",
                                                  "BREAKUP FLOODING",
                                                  "COASTAL  FLOODING/EROSION",
                                                  "COASTAL EROSION",
                                                  "COASTAL SURGE",
                                                  "DAM BREAK",
                                                  "EROSION/CSTL FLOOD",
                                                  "FLASH FLOOD - HEAVY RAIN",
                                                  "FLASH FLOOD FROM ICE JAMS",
                                                  "FLASH FLOOD WINDS",
                                                  "FLASH FLOOD/",
                                                  "FLASH FLOOD/ STREET",
                                                  "FLASH FLOODING/THUNDERSTORM WI",
                                                  "FLOOD FLASH",
                                                  "FLOOD/FLASH",
                                                  "FLOOD/FLASH/FLOOD",
                                                  "FLOOD/FLASHFLOOD",
                                                  "FLOOD/RAIN/WINDS",
                                                  "FLOODING/HEAVY RAIN",
                                                  "FLOODS",
                                                  "HEAVY SURF COASTAL FLOODING",
                                                  "HIGH TIDES",
                                                  "ICE JAM FLOOD (MINOR",
                                                  "ICE JAM FLOODING",
                                                  "LAKE FLOOD",
                                                  "LAKESHORE FLOOD",
                                                  "MAJOR FLOOD",
                                                  "RIVER AND STREAM FLOOD",
                                                  "RURAL FLOOD",
                                                  "SEICHE",
                                                  "SMALL STREAM FLOOD",
                                                  "SNOWMELT FLOODING",
                                                  "URBAN AND SMALL",
                                                  "URBAN FLOOD",
                                                  "URBAN FLOODING",
                                                  "URBAN FLOODS",
                                                  "URBAN SMALL",
                                                  "URBAN/SMALL STREAM",
                                                  "URBAN/SMALL STREAM FLOOD")] <- "Flooding"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("COASTAL STORM", 
                                                  "COASTALSTORM")] <- "Coastal Storm"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("COLD", 
                                                  "COLD TEMPERATURE", 
                                                  "COLD WAVE",
                                                  "COLD WEATHER", 
                                                  "COLD/WIND CHILL",
                                                  "COLD/WINDS", 
                                                  "EXTENDED COLD",
                                                  "EXTREME COLD", 
                                                  "EXTREME COLD/WIND CHILL",
                                                  "EXTREME WINDCHILL", 
                                                  "FREEZE",
                                                  "FREEZING DRIZZLE", 
                                                  "LOW TEMPERATURE",
                                                  "RECORD COLD", 
                                                  "UNSEASONABLY COLD",
                                                  "AGRICULTURAL FREEZE",
                                                  "COLD AND WET CONDITIONS",
                                                  "COOL AND WET",
                                                  "DAMAGING FREEZE",
                                                  "EARLY FROST",
                                                  "EXTREME WIND CHILL",
                                                  "FREEZING RAIN/SLEET",
                                                  "HARD FREEZE",
                                                  "UNSEASONABLE COLD")] <- "Cold Weather"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("DENSE FOG", 
                                                  "FOG", 
                                                  "FOG AND COLD TEMPERATURES",
                                                  "FREEZING FOG")] <- "Fog"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("DROUGHT", 
                                                  "DROUGHT/EXCESSIVE HEAT")] <- "Drought"
humancost$EVTYPE.Grouping[humancost$EVTYPE == "DROWNING"] <- "Drowning"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("DRY MICROBURST", 
                                                  "DRY MICROBURST WINDS",
                                                  "DRY MIRCOBURST WINDS",
                                                  "DOWNBURST",
                                                  "DUST DEVIL WATERSPOUT",
                                                  "DUST STORM/HIGH WINDS",
                                                  "MICROBURST",
                                                  "MICROBURST WINDS",
                                                  "WET MICROBURST")] <- "Microburst"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("DUST DEVIL", 
                                                  "DUST STORM", 
                                                  "WHIRLWIND",
                                                  "BLOWING DUST")] <- "Dust Devil/Whirlwind"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("EXCESSIVE HEAT",
                                                  "EXTREME HEAT",
                                                  "HEAT",
                                                  "HEAT WAVE", 
                                                  "HEAT WAVE DROUGHT",
                                                  "HEAT WAVES", 
                                                  "RECORD HEAT",
                                                  "RECORD/EXCESSIVE HEAT", 
                                                  "UNSEASONABLY WARM",
                                                  "UNSEASONABLY WARM AND DRY",
                                                  "WARM WEATHER")] <- "Heat"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("EXCESSIVE RAINFALL", 
                                                  "HEAVY RAIN",
                                                  "HEAVY RAINS", 
                                                  "RAIN/WIND", 
                                                  "TORRENTIAL RAINFALL",
                                                  "HEAVY RAIN/LIGHTNING",
                                                  "EXCESSIVE WETNESS",
                                                  "HEAVY PRECIPITATION",
                                                  "HEAVY RAIN AND FLOOD",
                                                  "HEAVY RAIN/HIGH SURF",
                                                  "HEAVY RAIN/LIGHTNING",
                                                  "HEAVY RAIN/SEVERE WEATHER",
                                                  "HEAVY RAIN/SMALL STREAM URBAN",
                                                  "HEAVY RAIN/SNOW",
                                                  "HEAVY RAINS/FLOODING",
                                                  "HEAVY SHOWER",
                                                  "HVY RAIN",
                                                  "RAIN",
                                                  "RAINSTORM",
                                                  "RECORD RAINFALL",
                                                  "UNSEASONAL RAIN")] <- "Excessive Rainfall"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("FUNNEL CLOUD", 
                                                  "TORNADO",
                                                  "TORNADO F2", 
                                                  "TORNADO F3", 
                                                  "TORNADOES, TSTM WIND, HAIL",
                                                  "TORNADO F0",
                                                  "FUNNEL",
                                                  "COLD AIR TORNADO",
                                                  "TORNADO F1",
                                                  "TORNADOES",
                                                  "TORNDAO")] <- "Tornado"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("HAIL", 
                                                  "SMALL HAIL",
                                                  "HAIL 0.75",
                                                  "HAIL 075",
                                                  "HAIL 100",
                                                  "HAIL 125",
                                                  "HAIL 150",
                                                  "HAIL 175",
                                                  "HAIL 200",
                                                  "HAIL 275",
                                                  "HAIL 450",
                                                  "HAIL 75",
                                                  "HAIL DAMAGE",
                                                  "HAIL/WIND",
                                                  "HAIL/WINDS",
                                                  "HAILSTORM",
                                                  "MARINE HAIL")] <- "Hail"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("HAZARDOUS SURF", 
                                                  "HEAVY SURF",
                                                  "HEAVY SURF AND WIND", 
                                                  "HEAVY SURF/HIGH WIND", 
                                                  "HIGH SURF",
                                                  "ROUGH SURF",
                                                  "HEAVY SURF/HIGH SURF",
                                                  "   HIGH SURF ADVISORY")] <- "Hazardous Surf"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("HEAVY SEAS", 
                                                  "HIGH", 
                                                  "HIGH SEAS",
                                                  "HIGH SWELLS", 
                                                  "HIGH WATER", 
                                                  "HIGH WAVES",
                                                  "HIGH WIND/SEAS",
                                                  "HIGH WIND AND SEAS",
                                                  "ROUGH SEAS")] <- "Heavy/High Seas"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("HIGH WIND", 
                                                  "HIGH WINDS 48", 
                                                  "HIGH WINDS", 
                                                  "HIGH WINDS/COLD", 
                                                  "STRONG WIND", 
                                                  "STRONG WINDS", 
                                                  "WIND", 
                                                  "WIND STORM", 
                                                  "WINDS",
                                                  "HIGH WIND 48",
                                                  "GRADIENT WIND",
                                                  "GUSTNADO",
                                                  "GUSTY WIND/HAIL",
                                                  "GUSTY WIND/HVY RAIN",
                                                  "GUSTY WIND/RAIN",
                                                  "HIGH WIND (G40)",
                                                  "HIGH WIND DAMAGE",
                                                  "HIGH WIND/BLIZZARD",
                                                  "HIGH WINDS HEAVY RAINS",
                                                  "HIGH WINDS/",
                                                  "HIGH WINDS/COASTAL FLOOD",
                                                  "HIGH WINDS/HEAVY RAIN",
                                                  "GUSTY WIND",
                                                  "GUSTY WINDS",
                                                  "SEVERE TURBULENCE",
                                                  "STORM FORCE WINDS",
                                                  "WIND AND WAVE",
                                                  "WIND DAMAGE",
                                                  "WIND/HAIL")] <- "High Winds"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("HURRICANE", 
                                                  "HURRICANE EDOUARD",
                                                  "HURRICANE EMILY", 
                                                  "HURRICANE ERIN", 
                                                  "HURRICANE FELIX", 
                                                  "HURRICANE OPAL",
                                                  "HURRICANE OPAL/HIGH WINDS",
                                                  "HURRICANE-GENERATED SWELLS",
                                                  "HURRICANE/TYPHOON",
                                                  "TYPHOON",
                                                  "HURRICANE GORDON")] <- "Hurricane/Typhoon"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("HYPERTHERMIA/EXPOSURE", 
                                                  "HYPOTHERMIA",
                                                  "HYPOTHERMIA/EXPOSURE")] <- "Hypothermia/Exposure"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("LANDSLIDE", 
                                                  "LANDSLIDES",
                                                  "FLASH FLOOD LANDSLIDES",
                                                  "FLASH FLOOD/LANDSLIDE",
                                                  "ROCK SLIDE")] <- "Landslide"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("LIGHTNING", 
                                                  "LIGHTNING AND THUNDERSTORM WIND",
                                                  "LIGHTNING INJURY", 
                                                  "LIGHTNING.",
                                                  "LIGHTING",
                                                  "LIGHTNING AND HEAVY RAIN",
                                                  "LIGHTNING AND THUNDERSTORM WIN",
                                                  "LIGHTNING  WAUSEON",
                                                  "LIGHTNING FIRE",
                                                  "LIGHTNING THUNDERSTORM WINDS",
                                                  "LIGHTNING/HEAVY RAIN",
                                                  "LIGNTNING")] <- "Lightning"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("MARINE ACCIDENT", 
                                                  "MARINE MISHAP")] <- "Marine Accident"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("MARINE HIGH WIND", 
                                                  "MARINE STRONG WIND",
                                                  "MARINE THUNDERSTORM WIND", 
                                                  "MARINE TSTM WIND")] <- "Marine Storm"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("MUDSLIDE", 
                                                  "MUDSLIDES",
                                                  "MUD SLIDE",
                                                  "MUD SLIDES",
                                                  "MUD SLIDES URBAN FLOODING")] <- "Mudslide"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("NON TSTM WIND", 
                                                  "NON-SEVERE WIND DAMAGE",
                                                  "NON-TSTM WIND")] <- "Non-Thunderstorm Wind"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("OTHER", 
                                                  "?",
                                                  "APACHE COUNTY",
                                                  "ASTRONOMICAL LOW TIDE",
                                                  "LANDSLUMP",
                                                  "LANDSPOUT",
                                                  "VOLCANIC ASH")] <- "Other"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("RIP CURRENT", 
                                                  "RIP CURRENTS",
                                                  "RIP CURRENTS/HEAVY SURF")] <- "Rip Current"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("THUNDERSTORM", "THUNDERSTORM  WINDS",
                                                  "THUNDERSTORM WIND", 
                                                  "THUNDERSTORM WIND (G48)",
                                                  "THUNDERSTORM WIND G52",
                                                  "THUNDERSTORM WINDS", 
                                                  "THUNDERSTORM WINDS 13",
                                                  "THUNDERSTORM WINDS/HAIL",
                                                  "THUNDERSTORM WINDSS",
                                                  "THUNDERSTORMS WINDS",
                                                  "TSTM WIND", "TSTM WIND (G35)",
                                                  "TSTM WIND (G40)", "TSTM WIND (G45)",
                                                  "TSTM WIND/HAIL",
                                                  "THUNDERSTORM WINS",
                                                  "THUNDERSTORM WINDS LIGHTNING",
                                                  "THUNDERTORM WINDS",
                                                  "THUNDERSTORMW",
                                                  "THUNDERSTORM WIND (G40)",
                                                  "THUNDERSTORM WINDS HAIL",
                                                  " TSTM WIND",
                                                  " TSTM WIND (G45)",
                                                  "SEVERE THUNDERSTORM",
                                                  "SEVERE THUNDERSTORM WINDS",
                                                  "SEVERE THUNDERSTORMS",
                                                  "THUDERSTORM WINDS",
                                                  "THUNDERESTORM WINDS",
                                                  "THUNDERSTORM DAMAGE TO",
                                                  "THUNDERSTORM HAIL",
                                                  "THUNDERSTORM WIND 60 MPH",
                                                  "THUNDERSTORM WIND 65 MPH",
                                                  "THUNDERSTORM WIND 65MPH",
                                                  "THUNDERSTORM WIND 98 MPH",
                                                  "THUNDERSTORM WIND G50",
                                                  "THUNDERSTORM WIND G55",
                                                  "THUNDERSTORM WIND G60",
                                                  "THUNDERSTORM WIND TREES",
                                                  "THUNDERSTORM WIND.",
                                                  "THUNDERSTORM WIND/ TREE",
                                                  "THUNDERSTORM WIND/ TREES",
                                                  "THUNDERSTORM WIND/AWNING",
                                                  "THUNDERSTORM WIND/HAIL",
                                                  "THUNDERSTORM WIND/LIGHTNING",
                                                  "THUNDERSTORM WINDS 63 MPH",
                                                  "THUNDERSTORM WINDS AND",
                                                  "THUNDERSTORM WINDS G60",
                                                  "THUNDERSTORM WINDS HAIL",
                                                  "THUNDERSTORM WINDS.",
                                                  "THUNDERSTORM WINDS/ FLOOD",
                                                  "THUNDERSTORM WINDS/FLOODING",
                                                  "THUNDERSTORM WINDS/FUNNEL CLOU",
                                                  "THUNDERSTORM WINDS53",
                                                  "THUNDERSTORM WINDSHAIL",
                                                  "THUNDERSTORMS",
                                                  "THUNDERSTORMS WIND",
                                                  "THUNDERSTORMWINDS",
                                                  "THUNDERSTROM WIND",
                                                  "THUNERSTORM WINDS",
                                                  "TSTM WIND  (G45)",
                                                  "TSTM WIND (41)",
                                                  "TSTM WIND 40",
                                                  "TSTM WIND 45",
                                                  "TSTM WIND 55",
                                                  "TSTM WIND 65)",
                                                  "TSTM WIND AND LIGHTNING",
                                                  "TSTM WIND DAMAGE",
                                                  "TSTM WIND G45",
                                                  "TSTM WIND G58",
                                                  "TSTM WINDS",
                                                  "TSTMW",
                                                  "TUNDERSTORM WIND")] <- "Thunderstorm"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("TROPICAL STORM",
                                                  "TROPICAL STORM GORDON",
                                                  "TROPICAL DEPRESSION",
                                                  "TROPICAL STORM ALBERTO",
                                                  "TROPICAL STORM DEAN",
                                                  "TROPICAL STORM JERRY")] <- "Tropical Storm"
humancost$EVTYPE.Grouping[humancost$EVTYPE == "TSUNAMI"] <- "Tsunami"
humancost$EVTYPE.Grouping[humancost$EVTYPE %in% c("WATERSPOUT",
                                                  "WATERSPOUT TORNADO",
                                                  "WATERSPOUT/TORNADO",
                                                  "WATERSPOUT-",
                                                  "WATERSPOUT-TORNADO",
                                                  "WATERSPOUT/ TORNADO")] <- "Waterspout"
humancost$EVTYPE.Grouping[humancost$EVTYPE == "ROGUE WAVE"] <- "Rogue Wave"
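Because the mapping above is entered by hand, it is worth checking whether any raw EVTYPE value was left without a group. The following is a minimal sketch on a toy data frame rather than the real humancost data; the `toy` data frame and the "SNEAKER WAVE" value are illustrative assumptions, not entries taken from the storm database.

```r
# Toy stand-in for humancost (illustrative values only)
toy <- data.frame(EVTYPE = c("TSUNAMI", "ROGUE WAVE", "TSUNAMI", "SNEAKER WAVE"),
                  stringsAsFactors = FALSE)

# Start with an explicit NA column so unmapped rows are easy to find
toy$EVTYPE.Grouping <- NA_character_
toy$EVTYPE.Grouping[toy$EVTYPE == "TSUNAMI"] <- "Tsunami"
toy$EVTYPE.Grouping[toy$EVTYPE == "ROGUE WAVE"] <- "Rogue Wave"

# Raw EVTYPE values that no mapping rule matched
unmapped <- sort(unique(toy$EVTYPE[is.na(toy$EVTYPE.Grouping)]))
print(unmapped)
```

Running the same `is.na()` check against humancost$EVTYPE.Grouping would list any EVTYPE values that still need a group.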
  1. Use the Population.Harm.Index (defined earlier in the data processing) to subset the humancost data set to the events whose index exceeds the 99th percentile
# Explore the EVTYPE data vs "most harmful with respect to population health"
## Keep only the events above the 99th percentile of Population.Harm.Index
highhumancost <- subset(humancost, 
                             Population.Harm.Index > 
                                 quantile(Population.Harm.Index, 0.99))
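
The quantile-based cutoff can be illustrated on toy data. This is a sketch only: the `toy` data frame and its index values (the integers 1 to 200) are assumptions made for illustration and do not reflect how Population.Harm.Index is actually computed.

```r
# Toy data: 200 events with an arbitrary index from 1 to 200
toy <- data.frame(EVTYPE.Grouping = rep(c("Tornado", "Heat", "Flooding", "Lightning"),
                                        each = 50),
                  Population.Harm.Index = 1:200,
                  stringsAsFactors = FALSE)

# 99th percentile of the index (default quantile type 7 interpolates between values)
cutoff <- quantile(toy$Population.Harm.Index, 0.99)

# Keep only rows strictly above the cutoff
top <- subset(toy, Population.Harm.Index > cutoff)
print(cutoff)
print(nrow(top))
```

Because the comparison is strict (`>`), rows exactly equal to the cutoff are excluded, so the subset holds at most about 1% of the rows.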

Data Processing: Economic Impact

  1. Subset the raw data set into an “econcost” data set that keeps only the weather events with recorded property or crop damage, to focus on the events with the greatest economic consequences
# Subset the storm data to examine the economic cost
econcost <- subset(stormdata, (PROPDMG > 0 | CROPDMG > 0))
  1. “Map” the EVTYPE (event type) values into groups for easier aggregation and for consistency, since the same kind of event was entered under different EVTYPE values over the time span of the data set

NOTE: This grouping could be done on the full stormdata data set, but I chose to group at the subset level so that the manual mapping only has to cover the unique EVTYPE values present in the subset.
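
One possible way to keep such a manual mapping manageable, and to reuse it across subsets, is to hold the groups in a named list and apply them through a lookup vector. The following is a minimal sketch on toy data and is not the method used in this analysis; `map_evtype`, `groups` and `toy` are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: map raw EVTYPE values to group labels via a named list
map_evtype <- function(evtype, groups) {
    # Flat lookup vector: names are raw EVTYPE values, values are group labels
    lookup <- setNames(rep(names(groups), lengths(groups)),
                       unlist(groups, use.names = FALSE))
    # Uppercase the input for consistency; unmatched values become NA
    unname(lookup[toupper(evtype)])
}

# Two groups copied from the mapping in this report
groups <- list("Avalanche"     = c("AVALANCE", "AVALANCHE"),
               "Coastal Storm" = c("COASTAL STORM", "COASTALSTORM"))

toy <- data.frame(EVTYPE = c("Avalanche", "COASTALSTORM", "AVALANCE", "HAIL"),
                  stringsAsFactors = FALSE)
toy$EVTYPE.Grouping <- map_evtype(toy$EVTYPE, groups)
print(toy)
```

One difference to keep in mind: if a raw value were listed under two groups, this lookup would return the first match, whereas a sequence of separate assignments leaves the last matching group in place.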

# Create the EVTYPE.Grouping (event type grouping) to "normalize/clean" the various EVTYPE
# entries
## Make all entries uppercase to increase consistency
econcost$EVTYPE <- toupper(econcost$EVTYPE) 
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("AVALANCE",
                                                "AVALANCHE")] <- "Avalanche"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("BLACK ICE", 
                                                "BLIZZARD",
                                                "BLOWING SNOW",
                                                "COLD AND SNOW", 
                                                "EXCESSIVE SNOW", 
                                                "FALLING SNOW/ICE", 
                                                "FREEZING RAIN", 
                                                "FREEZING RAIN/SNOW",
                                                "FREEZING SPRAY", 
                                                "FROST", 
                                                "GLAZE",
                                                "GLAZE/ICE STORM", 
                                                "HEAVY SNOW",
                                                "HEAVY SNOW AND HIGH WINDS", 
                                                "HEAVY SNOW SHOWER", 
                                                "HEAVY SNOW/BLIZZARD/AVALANCHE",
                                                "HEAVY SNOW/ICE", 
                                                "HEAVY WIND/HEAVY SNOW", 
                                                "HIGH WINDS/SNOW", 
                                                "ICE", 
                                                "ICE ON ROAD",
                                                "ICE ROADS", 
                                                "ICE STORM", 
                                                "ICE STORM/FLAH FLOOD", 
                                                "ICY ROADS",
                                                "LIGHT SNOW", 
                                                "MIXED PRECIP", 
                                                "RAIN/SNOW", 
                                                "SLEET", 
                                                "SNOW", 
                                                "SNOW AND ICE", 
                                                "SNOW SQUALL",
                                                "SNOW SQUALLS", 
                                                "SNOW/ BITTER COLD",
                                                "SNOW/HIGH WINDS", 
                                                "THUNDERSNOW",
                                                "WINTER STORM", 
                                                "WINTER STORM HIGH WINDS", 
                                                "WINTER STORMS", 
                                                "WINTER WEATHER",
                                                "WINTER WEATHER MIX",
                                                "WINTER WEATHER/MIX", 
                                                "WINTRY MIX",
                                                "ICE STORM/FLASH FLOOD",
                                                "SNOW/ICE",
                                                "HIGH WIND/HEAVY SNOW",
                                                "BLIZZARD/WINTER STORM",
                                                "FROST/FREEZE",
                                                "FROST\\FREEZE",
                                                "GLAZE ICE",
                                                "GROUND BLIZZARD",
                                                "HEAVY LAKE SNOW",
                                                "HEAVY MIX",
                                                "HEAVY SNOW AND STRONG WINDS",
                                                "HEAVY SNOW SQUALLS",
                                                "HEAVY SNOW-SQUALLS",
                                                "HEAVY SNOW/BLIZZARD",
                                                "HEAVY SNOW/FREEZING RAIN",
                                                "HEAVY SNOW/HIGH WINDS & FLOOD",
                                                "HEAVY SNOW/SQUALLS",
                                                "HEAVY SNOW/WIND",
                                                "HEAVY SNOW/WINTER STORM",
                                                "HEAVY SNOWPACK",
                                                "ICE AND SNOW",
                                                "ICE FLOES",
                                                "ICE JAM",
                                                "ICE/STRONG WINDS",
                                                "LAKE EFFECT SNOW",
                                                "LAKE-EFFECT SNOW",
                                                "LATE SEASON SNOW",
                                                "LIGHT FREEZING RAIN",
                                                "LIGHT SNOWFALL",
                                                "MIXED PRECIPITATION",
                                                "RECORD SNOW",
                                                "SLEET/ICE STORM",
                                                "SNOW ACCUMULATION",
                                                "SNOW AND HEAVY SNOW",
                                                "SNOW AND ICE STORM",
                                                "SNOW FREEZING RAIN",
                                                "SNOW/ ICE",
                                                "SNOW/BLOWING SNOW",
                                                "SNOW/COLD",
                                                "SNOW/FREEZING RAIN",
                                                "SNOW/HEAVY SNOW",
                                                "SNOW/ICE STORM",
                                                "SNOW/SLEET",
                                                "SNOW/SLEET/FREEZING RAIN",
                                                "")] <- "Winter Weather"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("BRUSH FIRE", 
                                                "WILD FIRES", 
                                                "WILD/FOREST FIRE",
                                                "WILDFIRE",
                                                "DENSE SMOKE",
                                                "FOREST FIRES",
                                                "GRASS FIRES",
                                                "WILD/FOREST FIRES",
                                                "WILDFIRES")] <- "Wildfire"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("COASTAL FLOOD", 
                                                "COASTAL FLOODING",
                                                "COASTAL FLOODING/EROSION",
                                                "FLASH FLOOD", 
                                                "FLASH FLOOD/FLOOD",
                                                "FLASH FLOODING", 
                                                "FLASH FLOODING/FLOOD",
                                                "FLASH FLOODS", 
                                                "FLOOD", 
                                                "FLOOD & HEAVY RAIN", 
                                                "FLOOD/FLASH FLOOD",
                                                "FLOOD/RIVER FLOOD", 
                                                "FLOODING",
                                                "MINOR FLOODING", 
                                                "RAPIDLY RISING WATER",
                                                "RIVER FLOOD", 
                                                "RIVER FLOODING",
                                                "STORM SURGE", 
                                                "STORM SURGE/TIDE",
                                                "TIDAL FLOODING", 
                                                "FLOODING/COASTAL FLOODING",
                                                "URBAN AND SMALL STREAM FLOODIN",
                                                "URBAN/SML STREAM FLD",
                                                " FLASH FLOOD",
                                                "ASTRONOMICAL HIGH TIDE",
                                                "BEACH EROSION",
                                                "BREAKUP FLOODING",
                                                "COASTAL  FLOODING/EROSION",
                                                "COASTAL EROSION",
                                                "COASTAL SURGE",
                                                "DAM BREAK",
                                                "EROSION/CSTL FLOOD",
                                                "FLASH FLOOD - HEAVY RAIN",
                                                "FLASH FLOOD FROM ICE JAMS",
                                                "FLASH FLOOD WINDS",
                                                "FLASH FLOOD/",
                                                "FLASH FLOOD/ STREET",
                                                "FLASH FLOODING/THUNDERSTORM WI",
                                                "FLOOD FLASH",
                                                "FLOOD/FLASH",
                                                "FLOOD/FLASH/FLOOD",
                                                "FLOOD/FLASHFLOOD",
                                                "FLOOD/RAIN/WINDS",
                                                "FLOODING/HEAVY RAIN",
                                                "FLOODS",
                                                "HEAVY SURF COASTAL FLOODING",
                                                "HIGH TIDES",
                                                "ICE JAM FLOOD (MINOR",
                                                "ICE JAM FLOODING",
                                                "LAKE FLOOD",
                                                "LAKESHORE FLOOD",
                                                "MAJOR FLOOD",
                                                "RIVER AND STREAM FLOOD",
                                                "RURAL FLOOD",
                                                "SEICHE",
                                                "SMALL STREAM FLOOD",
                                                "SNOWMELT FLOODING",
                                                "URBAN AND SMALL",
                                                "URBAN FLOOD",
                                                "URBAN FLOODING",
                                                "URBAN FLOODS",
                                                "URBAN SMALL",
                                                "URBAN/SMALL STREAM",
                                                "URBAN/SMALL STREAM FLOOD",
                                                "HEAVY SWELLS")] <- "Flooding"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("COASTAL STORM", 
                                                "COASTALSTORM")] <- "Coastal Storm"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("COLD", 
                                                "COLD TEMPERATURE", 
                                                "COLD WAVE",
                                                "COLD WEATHER", 
                                                "COLD/WIND CHILL",
                                                "COLD/WINDS", 
                                                "EXTENDED COLD",
                                                "EXTREME COLD", 
                                                "EXTREME COLD/WIND CHILL",
                                                "EXTREME WINDCHILL", 
                                                "FREEZE",
                                                "FREEZING DRIZZLE", 
                                                "LOW TEMPERATURE",
                                                "RECORD COLD", 
                                                "UNSEASONABLY COLD",
                                                "AGRICULTURAL FREEZE",
                                                "COLD AND WET CONDITIONS",
                                                "COOL AND WET",
                                                "DAMAGING FREEZE",
                                                "EARLY FROST",
                                                "EXTREME WIND CHILL",
                                                "FREEZING RAIN/SLEET",
                                                "HARD FREEZE",
                                                "UNSEASONABLE COLD")] <- "Cold Weather"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("DENSE FOG", 
                                                "FOG", 
                                                "FOG AND COLD TEMPERATURES",
                                                "FREEZING FOG")] <- "Fog"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("DROUGHT", 
                                                "DROUGHT/EXCESSIVE HEAT")] <- "Drought"
econcost$EVTYPE.Grouping[econcost$EVTYPE == "DROWNING"] <- "Drowning"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("DRY MICROBURST", 
                                                "DRY MICROBURST WINDS",
                                                "DRY MIRCOBURST WINDS",
                                                "DOWNBURST",
                                                "DUST DEVIL WATERSPOUT",
                                                "DUST STORM/HIGH WINDS",
                                                "MICROBURST",
                                                "MICROBURST WINDS",
                                                "WET MICROBURST")] <- "Microburst"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("DUST DEVIL", 
                                                "DUST STORM", 
                                                "WHIRLWIND",
                                                "BLOWING DUST")] <- "Dust Devil/Whirlwind"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("EXCESSIVE HEAT",
                                                "EXTREME HEAT",
                                                "HEAT",
                                                "HEAT WAVE", 
                                                "HEAT WAVE DROUGHT",
                                                "HEAT WAVES", 
                                                "RECORD HEAT",
                                                "RECORD/EXCESSIVE HEAT", 
                                                "UNSEASONABLY WARM",
                                                "UNSEASONABLY WARM AND DRY",
                                                "WARM WEATHER")] <- "Heat"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("EXCESSIVE RAINFALL", 
                                                "HEAVY RAIN",
                                                "HEAVY RAINS", 
                                                "RAIN/WIND", 
                                                "TORRENTIAL RAINFALL",
                                                "HEAVY RAIN/LIGHTNING",
                                                "EXCESSIVE WETNESS",
                                                "HEAVY PRECIPITATION",
                                                "HEAVY RAIN AND FLOOD",
                                                "HEAVY RAIN/HIGH SURF",
                                                "HEAVY RAIN/LIGHTNING",
                                                "HEAVY RAIN/SEVERE WEATHER",
                                                "HEAVY RAIN/SMALL STREAM URBAN",
                                                "HEAVY RAIN/SNOW",
                                                "HEAVY RAINS/FLOODING",
                                                "HEAVY SHOWER",
                                                "HVY RAIN",
                                                "RAIN",
                                                "RAINSTORM",
                                                "RECORD RAINFALL",
                                                "UNSEASONAL RAIN")] <- "Excessive Rainfall"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("FUNNEL CLOUD", 
                                                "TORNADO",
                                                "TORNADO F2", 
                                                "TORNADO F3", 
                                                "TORNADOES, TSTM WIND, HAIL",
                                                "TORNADO F0",
                                                "FUNNEL",
                                                "COLD AIR TORNADO",
                                                "TORNADO F1",
                                                "TORNADOES",
                                                "TORNDAO")] <- "Tornado"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("HAIL", 
                                                "SMALL HAIL",
                                                "HAIL 0.75",
                                                "HAIL 075",
                                                "HAIL 100",
                                                "HAIL 125",
                                                "HAIL 150",
                                                "HAIL 175",
                                                "HAIL 200",
                                                "HAIL 275",
                                                "HAIL 450",
                                                "HAIL 75",
                                                "HAIL DAMAGE",
                                                "HAIL/WIND",
                                                "HAIL/WINDS",
                                                "HAILSTORM",
                                                "MARINE HAIL")] <- "Hail"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("HAZARDOUS SURF", 
                                                "HEAVY SURF",
                                                "HEAVY SURF AND WIND", 
                                                "HEAVY SURF/HIGH WIND", 
                                                "HIGH SURF",
                                                "ROUGH SURF",
                                                "HEAVY SURF/HIGH SURF",
                                                "   HIGH SURF ADVISORY")] <- "Hazardous Surf"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("HEAVY SEAS", 
                                                "HIGH", 
                                                "HIGH SEAS",
                                                "HIGH SWELLS", 
                                                "HIGH WATER", 
                                                "HIGH WAVES",
                                                "HIGH WIND/SEAS",
                                                "HIGH WIND AND SEAS",
                                                "ROUGH SEAS")] <- "Heavy/High Seas"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("HIGH WIND", 
                                                "HIGH WINDS 48", 
                                                "HIGH WINDS", 
                                                "HIGH WINDS/COLD", 
                                                "STRONG WIND", 
                                                "STRONG WINDS", 
                                                "WIND", 
                                                "WIND STORM", 
                                                "WINDS",
                                                "HIGH WIND 48",
                                                "GRADIENT WIND",
                                                "GUSTNADO",
                                                "GUSTY WIND/HAIL",
                                                "GUSTY WIND/HVY RAIN",
                                                "GUSTY WIND/RAIN",
                                                "HIGH WIND (G40)",
                                                "HIGH WIND DAMAGE",
                                                "HIGH WIND/BLIZZARD",
                                                "HIGH WINDS HEAVY RAINS",
                                                "HIGH WINDS/",
                                                "HIGH WINDS/COASTAL FLOOD",
                                                "HIGH WINDS/HEAVY RAIN",
                                                "GUSTY WIND",
                                                "GUSTY WINDS",
                                                "SEVERE TURBULENCE",
                                                "STORM FORCE WINDS",
                                                "WIND AND WAVE",
                                                "WIND DAMAGE",
                                                "WIND/HAIL",
                                                "HIGH  WINDS")] <- "High Winds"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("HURRICANE", 
                                                "HURRICANE EDOUARD",
                                                "HURRICANE EMILY", 
                                                "HURRICANE ERIN", 
                                                "HURRICANE FELIX", 
                                                "HURRICANE OPAL",
                                                "HURRICANE OPAL/HIGH WINDS",
                                                "HURRICANE-GENERATED SWELLS",
                                                "HURRICANE/TYPHOON",
                                                "TYPHOON",
                                                "HURRICANE GORDON")] <- "Hurricane/Typhoon"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("HYPERTHERMIA/EXPOSURE", 
                                                "HYPOTHERMIA",
                                                "HYPOTHERMIA/EXPOSURE")] <- "Hypothermia/Exposure"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("LANDSLIDE", 
                                                "LANDSLIDES",
                                                "FLASH FLOOD LANDSLIDES",
                                                "FLASH FLOOD/LANDSLIDE",
                                                "ROCK SLIDE")] <- "Landslide"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("LIGHTNING", 
                                                "LIGHTNING AND THUNDERSTORM WIND",
                                                "LIGHTNING INJURY", 
                                                "LIGHTNING.",
                                                "LIGHTING",
                                                "LIGHTNING AND HEAVY RAIN",
                                                "LIGHTNING AND THUNDERSTORM WIN",
                                                "LIGHTNING  WAUSEON",
                                                "LIGHTNING FIRE",
                                                "LIGHTNING THUNDERSTORM WINDS",
                                                "LIGHTNING/HEAVY RAIN",
                                                "LIGNTNING")] <- "Lightning"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("MARINE ACCIDENT", 
                                                "MARINE MISHAP")] <- "Marine Accident"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("MARINE HIGH WIND", 
                                                "MARINE STRONG WIND",
                                                "MARINE THUNDERSTORM WIND", 
                                                "MARINE TSTM WIND")] <- "Marine Storm"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("MUDSLIDE", 
                                                "MUDSLIDES",
                                                "MUD SLIDE",
                                                "MUD SLIDES",
                                                "MUD SLIDES URBAN FLOODING")] <- "Mudslide"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("NON TSTM WIND", 
                                                "NON-SEVERE WIND DAMAGE",
                                                "NON-TSTM WIND")] <- "Non-Thunderstorm Wind"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("OTHER", 
                                                "?",
                                                "APACHE COUNTY",
                                                "ASTRONOMICAL LOW TIDE",
                                                "LANDSLUMP",
                                                "LANDSPOUT",
                                                "VOLCANIC ASH")] <- "Other"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("RIP CURRENT", 
                                                "RIP CURRENTS",
                                                "RIP CURRENTS/HEAVY SURF")] <- "Rip Current"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("THUNDERSTORM", "THUNDERSTORM  WINDS",
                                                "THUNDERSTORM WIND", 
                                                "THUNDERSTORM WIND (G48)",
                                                "THUNDERSTORM WIND G52",
                                                "THUNDERSTORM WINDS", 
                                                "THUNDERSTORM WINDS 13",
                                                "THUNDERSTORM WINDS/HAIL",
                                                "THUNDERSTORM WINDSS",
                                                "THUNDERSTORMS WINDS",
                                                "TSTM WIND", "TSTM WIND (G35)",
                                                "TSTM WIND (G40)", "TSTM WIND (G45)",
                                                "TSTM WIND/HAIL",
                                                "THUNDERSTORM WINS",
                                                "THUNDERSTORM WINDS LIGHTNING",
                                                "THUNDERTORM WINDS",
                                                "THUNDERSTORMW",
                                                "THUNDERSTORM WIND (G40)",
                                                "THUNDERSTORM WINDS HAIL",
                                                " TSTM WIND",
                                                " TSTM WIND (G45)",
                                                "SEVERE THUNDERSTORM",
                                                "SEVERE THUNDERSTORM WINDS",
                                                "SEVERE THUNDERSTORMS",
                                                "THUDERSTORM WINDS",
                                                "THUNDERESTORM WINDS",
                                                "THUNDERSTORM DAMAGE TO",
                                                "THUNDERSTORM HAIL",
                                                "THUNDERSTORM WIND 60 MPH",
                                                "THUNDERSTORM WIND 65 MPH",
                                                "THUNDERSTORM WIND 65MPH",
                                                "THUNDERSTORM WIND 98 MPH",
                                                "THUNDERSTORM WIND G50",
                                                "THUNDERSTORM WIND G55",
                                                "THUNDERSTORM WIND G60",
                                                "THUNDERSTORM WIND TREES",
                                                "THUNDERSTORM WIND.",
                                                "THUNDERSTORM WIND/ TREE",
                                                "THUNDERSTORM WIND/ TREES",
                                                "THUNDERSTORM WIND/AWNING",
                                                "THUNDERSTORM WIND/HAIL",
                                                "THUNDERSTORM WIND/LIGHTNING",
                                                "THUNDERSTORM WINDS 63 MPH",
                                                "THUNDERSTORM WINDS AND",
                                                "THUNDERSTORM WINDS G60",
                                                "THUNDERSTORM WINDS HAIL",
                                                "THUNDERSTORM WINDS.",
                                                "THUNDERSTORM WINDS/ FLOOD",
                                                "THUNDERSTORM WINDS/FLOODING",
                                                "THUNDERSTORM WINDS/FUNNEL CLOU",
                                                "THUNDERSTORM WINDS53",
                                                "THUNDERSTORM WINDSHAIL",
                                                "THUNDERSTORMS",
                                                "THUNDERSTORMS WIND",
                                                "THUNDERSTORMWINDS",
                                                "THUNDERSTROM WIND",
                                                "THUNERSTORM WINDS",
                                                "TSTM WIND  (G45)",
                                                "TSTM WIND (41)",
                                                "TSTM WIND 40",
                                                "TSTM WIND 45",
                                                "TSTM WIND 55",
                                                "TSTM WIND 65)",
                                                "TSTM WIND AND LIGHTNING",
                                                "TSTM WIND DAMAGE",
                                                "TSTM WIND G45",
                                                "TSTM WIND G58",
                                                "TSTM WINDS",
                                                "TSTMW",
                                                "TUNDERSTORM WIND",
                                                "THUNDEERSTORM WINDS")] <- "Thunderstorm"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("TROPICAL STORM",
                                                "TROPICAL STORM GORDON",
                                                "TROPICAL DEPRESSION",
                                                "TROPICAL STORM ALBERTO",
                                                "TROPICAL STORM DEAN",
                                                "TROPICAL STORM JERRY")] <- "Tropical Storm"
econcost$EVTYPE.Grouping[econcost$EVTYPE == "TSUNAMI"] <- "Tsunami"
econcost$EVTYPE.Grouping[econcost$EVTYPE %in% c("WATERSPOUT",
                                                "WATERSPOUT TORNADO",
                                                "WATERSPOUT/TORNADO",
                                                "WATERSPOUT-",
                                                "WATERSPOUT-TORNADO",
                                                "WATERSPOUT/ TORNADO")] <- "Waterspout"
econcost$EVTYPE.Grouping[econcost$EVTYPE == "ROGUE WAVE"] <- "Rogue Wave"
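The assignments above match each raw EVTYPE string exactly, so every spelling variant has to be listed by hand. As a rough sketch of a more compact alternative, the same idea can be expressed with regular expressions. The helper `group_evtype` and its patterns are hypothetical and cover only a few groupings; they are not the lists used in this analysis and may classify some combined labels differently.

```r
# Hypothetical helper: assign each event type to the first pattern it matches.
# The patterns are illustrative only and do NOT reproduce the exact lists above.
group_evtype <- function(evtype) {
    patterns <- c("Tornado"      = "TORN|FUNNEL",
                  "Hail"         = "^(SMALL |MARINE )?HAIL",
                  "Thunderstorm" = "THUNDER|TSTM",
                  "Lightning"    = "LIGHTN|LIGHTING|LIGNTNING")
    # Normalize case and strip stray leading/trailing spaces before matching
    cleaned <- toupper(trimws(evtype))
    grouping <- rep(NA_character_, length(cleaned))
    for (group in names(patterns)) {
        # Only fill entries that no earlier pattern has claimed
        hit <- is.na(grouping) & grepl(patterns[[group]], cleaned)
        grouping[hit] <- group
    }
    grouping
}

example <- c("TORNDAO", " TSTM WIND", "HAIL 075", "LIGNTNING", "DROUGHT")
group_evtype(example)
```

Because the first matching pattern wins, the order of `patterns` decides how mixed labels are grouped. Unmatched values stay `NA` and can be reviewed separately.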
  1. Clean up the Property Damage Exponent (PROPDMGEXP) field, which serves as a “multiplier” for the value contained in the Property Damage (PROPDMG) field
# Clean up the PROPDMGEXP field to obtain the proper multiplier for PROPDMG
econcost$PROPDMGEXP <- toupper(econcost$PROPDMGEXP)

## Note: digits in the PROPDMGEXP column are interpreted as the number of zeros
## to append to the value in PROPDMG

### No or invalid multiplier specified
econcost$PROPDMGEXP[econcost$PROPDMGEXP %in% c("0", "", "+", "-")] <- 1
### Hundreds
econcost$PROPDMGEXP[econcost$PROPDMGEXP %in% c("H", "2")] <- 100
### Thousands
econcost$PROPDMGEXP[econcost$PROPDMGEXP %in% c("K", "3")] <- 1000
### Ten Thousands
econcost$PROPDMGEXP[econcost$PROPDMGEXP == "4"] <- 10000
### Hundred Thousands
econcost$PROPDMGEXP[econcost$PROPDMGEXP == "5"] <- 100000
### Millions
econcost$PROPDMGEXP[econcost$PROPDMGEXP %in% c("M", "6")] <- 1000000
### Ten Millions
econcost$PROPDMGEXP[econcost$PROPDMGEXP == "7"] <- 10000000
### Billions
econcost$PROPDMGEXP[econcost$PROPDMGEXP == "B"] <- 1000000000
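The same code-to-multiplier mapping can also be written as a single lookup table. The sketch below is one possible formulation, not the method used in this analysis; `exp_multiplier` and `to_multiplier` are hypothetical names, and treating every unlisted code (for example "?") as a multiplier of 1 is an assumption.

```r
# Named vector mapping exponent codes to numeric multipliers.
# Letter codes and digit codes follow the interpretation described above.
exp_multiplier <- c("H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9,
                    "2" = 1e2, "3" = 1e3, "4" = 1e4, "5" = 1e5,
                    "6" = 1e6, "7" = 1e7)

# Hypothetical helper: look up each code; unlisted codes fall back to 1
# (assumption: blank, "0", "+", "-", "?" and similar carry no multiplier).
to_multiplier <- function(code) {
    mult <- unname(exp_multiplier[toupper(code)])
    mult[is.na(mult)] <- 1
    mult
}

to_multiplier(c("k", "M", "", "5", "?", "B"))
```

A lookup of this kind returns a numeric vector directly, so no later `as.numeric()` conversion of a character column is needed.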
  1. Calculate the Extended Property Damage value
# Add Column for Extended Property Damage based on PROPDMG and PROPDMGEXP column
# data
econcost$Extended.Property.Damage <- as.numeric(econcost$PROPDMG) * 
    as.numeric(econcost$PROPDMGEXP)
  1. Clean up the Crop Damage Exponent (CROPDMGEXP) field, which serves as a “multiplier” for the value contained in the Crop Damage (CROPDMG) field
# Clean up the CROPDMGEXP field to obtain the proper multiplier for CROPDMG
econcost$CROPDMGEXP <- toupper(econcost$CROPDMGEXP)

## Note: digits in the CROPDMGEXP column are interpreted as the number of zeros
## to append to the value in CROPDMG

### No or invalid multiplier specified
econcost$CROPDMGEXP[econcost$CROPDMGEXP %in% c("0", "", "?")] <- 1
### Thousands
econcost$CROPDMGEXP[econcost$CROPDMGEXP == "K"] <- 1000
### Millions
econcost$CROPDMGEXP[econcost$CROPDMGEXP == "M"] <- 1000000
### Billions
econcost$CROPDMGEXP[econcost$CROPDMGEXP == "B"] <- 1000000000
  1. Calculate the Extended Crop Damage value
# Add Column for Extended Crop Damage based on CROPDMG and CROPDMGEXP column
# data
econcost$Extended.Crop.Damage <- as.numeric(econcost$CROPDMG) * 
    as.numeric(econcost$CROPDMGEXP)
  1. Calculate the Total Economic Damage value and then convert it into Billions of US Dollars
# Add Column for Total.Economic.Damage
econcost$Total.Economic.Damage <- econcost$Extended.Property.Damage +
    econcost$Extended.Crop.Damage

# Add Column for Total.Economic.Damage.in.Billions
econcost$Total.Economic.Damage.in.Billions <- econcost$Total.Economic.Damage / 1000000000
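As a small worked example of the arithmetic in the steps above, the sketch below applies the same calculation to two made-up records. The data frame `toy` and its values are invented for illustration and do not come from the NOAA data.

```r
# Two invented records with multipliers already converted to numbers:
# row 1: 25 x 1,000 property damage and 10 x 1,000,000 crop damage
# row 2: 2.5 x 1,000,000,000 property damage and no crop damage
toy <- data.frame(PROPDMG = c(25, 2.5), PROPDMGEXP = c(1e3, 1e9),
                  CROPDMG = c(10, 0),   CROPDMGEXP = c(1e6, 1))

# Extended damage = reported value times its multiplier
toy$Extended.Property.Damage <- toy$PROPDMG * toy$PROPDMGEXP
toy$Extended.Crop.Damage     <- toy$CROPDMG * toy$CROPDMGEXP

# Total damage in US dollars, then rescaled to billions
toy$Total.Economic.Damage <- toy$Extended.Property.Damage +
    toy$Extended.Crop.Damage
toy$Total.Economic.Damage.in.Billions <- toy$Total.Economic.Damage / 1e9

toy[, c("Total.Economic.Damage", "Total.Economic.Damage.in.Billions")]
```

The first record totals 10,025,000 USD (about 0.01 billion) and the second 2.5 billion USD.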
  1. Use Total.Economic.Damage.in.Billions to subset the econcost data set to the records above the 99th percentile of Total.Economic.Damage.in.Billions
# Explore the EVTYPE data vs "most harmful with respect to economic damage"
## keep only the records above the 99th percentile of total economic damage
higheconcost <- subset(econcost, 
                        Total.Economic.Damage.in.Billions > 
                            quantile(Total.Economic.Damage.in.Billions, 0.99))
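To make the percentile filter concrete, the sketch below applies the same `subset()` and `quantile()` pattern to an invented data frame. The names `damage`, `cutoff`, and `top` are hypothetical and the values are not taken from the storm data.

```r
# Invented data: 200 records with costs 1, 2, ..., 200
damage <- data.frame(id = 1:200, cost = 1:200)

# 99th percentile using R's default quantile method (type 7)
cutoff <- quantile(damage$cost, 0.99)

# Keep only the records strictly above the cutoff
top <- subset(damage, cost > cutoff)
top
```

With R's default interpolation the cutoff falls between 198 and 199, so only the two largest records remain. Because the comparison is strict, a record exactly equal to the cutoff would be excluded.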

Results

The results of the analysis appear below and answer the two questions posed at the beginning of the analysis:

Across the United States, which types of weather events are most harmful with respect to population health?

# Build the plot data for High Human Cost
hhc.plot.data <- aggregate(highhumancost$Population.Harm.Index,
                       by = list(highhumancost$EVTYPE.Grouping),
                       FUN = sum)
names(hhc.plot.data) <- c("EVTYPE.Grouping", "Total.Population.Harm.Index")

# Build the plot for the High Human Cost
hhc.plot <- ggplot(hhc.plot.data, 
                   aes(x = reorder(EVTYPE.Grouping, -Total.Population.Harm.Index),
                       y = Total.Population.Harm.Index,
                       ymax = max(Total.Population.Harm.Index) + 1000)) + 
    geom_bar(stat = "identity", 
             fill = "blue") + 
    geom_text(aes(label = round(Total.Population.Harm.Index)), 
              vjust = -0.5) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
    xlab("Event Type Grouping") + 
    ylab("Total Population Harm Index") + 
    ggtitle("Weather Events Most Harmful to Population Health \nbetween 1950 and Nov. 2011\n") + 
    theme(axis.text=element_text(size=14, face="bold"), 
          axis.title=element_text(size=14, face="bold"), 
          plot.title = element_text(lineheight=1, 
                                    face="bold", 
                                    color="black", 
                                    size=18))

Table: Top Population Health Impacts by Weather Event Grouping

hhc.plot.data
##      EVTYPE.Grouping Total.Population.Harm.Index
## 1 Excessive Rainfall                       28.76
## 2           Flooding                      882.68
## 3               Heat                     3085.76
## 4  Hurricane/Typhoon                      220.56
## 5            Tornado                     9286.43
## 6     Tropical Storm                       64.51
## 7            Tsunami                       66.46
## 8           Wildfire                       62.07
## 9     Winter Weather                      401.61
print(hhc.plot)

FIGURE 1: Weather Events Most Harmful to Population Health between 1950 and Nov. 2011

Answer:

Tornadoes are the most harmful to population health, based on the NOAA Storm Database dataset from 1950 through November 2011.

Across the United States, which types of weather events have the greatest economic consequences?

# Build the plot data for High Economic Cost
hec.plot.data <- aggregate(higheconcost$Total.Economic.Damage.in.Billions,
                           by = list(higheconcost$EVTYPE.Grouping),
                           FUN = sum)

names(hec.plot.data) <- c("EVTYPE.Grouping", "Total.Economic.Damage.in.Billions")

hec.plot <- ggplot(hec.plot.data, 
                   aes(x = reorder(EVTYPE.Grouping, -Total.Economic.Damage.in.Billions),
                       y = Total.Economic.Damage.in.Billions,
                       ymax = max(Total.Economic.Damage.in.Billions) + 15)) + 
    geom_bar(stat = "identity", 
             fill = "red") + 
    geom_text(aes(label = round(Total.Economic.Damage.in.Billions)), 
              vjust = -0.5) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
    xlab("Event Type Grouping") + 
    ylab("Total Economic Damage in Billions USD") + 
    ggtitle("Weather Events with Greatest Economic \nConsequences between 1950 and Nov. 2011\n") + 
    theme(axis.text=element_text(size=14, face="bold"), 
          axis.title=element_text(size=14, face="bold"), 
          plot.title = element_text(lineheight=1, 
                                    face="bold", 
                                    color="black", 
                                    size=18))

Table: Top Economic Consequences by Weather Event Grouping

hec.plot.data
##       EVTYPE.Grouping Total.Economic.Damage.in.Billions
## 1        Cold Weather                           2.21812
## 2             Drought                          14.79117
## 3  Excessive Rainfall                           3.77264
## 4            Flooding                         216.71963
## 5                Hail                          14.38426
## 6      Hazardous Surf                           0.04792
## 7                Heat                           0.89257
## 8          High Winds                           5.41130
## 9   Hurricane/Typhoon                          90.58742
## 10          Landslide                           0.24180
## 11          Lightning                           0.02900
## 12       Thunderstorm                           6.95576
## 13            Tornado                          45.73866
## 14     Tropical Storm                           8.08619
## 15            Tsunami                           0.12182
## 16         Waterspout                           0.05000
## 17           Wildfire                           8.14838
## 18     Winter Weather                          17.02068
print(hec.plot)

FIGURE 2: Weather Events with Greatest Economic Consequences between 1950 and Nov. 2011

Answer:

Flooding events lead to the greatest economic consequences, based on the NOAA Storm Database dataset from 1950 through November 2011.
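
Both answers rest on the same pattern: sum a measure per event grouping, then rank the groupings. The sketch below shows one way to read the top grouping off programmatically instead of from the plot. The data frame `toy` and the column `Damage` are invented for illustration and are not part of the analysis data.

```r
# Invented records: one damage value per event, with its grouping
toy <- data.frame(EVTYPE.Grouping = c("Tornado", "Flooding", "Tornado", "Heat"),
                  Damage = c(1.5, 4.0, 0.5, 0.25),
                  stringsAsFactors = FALSE)

# Sum the measure within each grouping
totals <- aggregate(Damage ~ EVTYPE.Grouping, data = toy, FUN = sum)

# Sort from largest to smallest total
totals <- totals[order(-totals$Damage), ]

# The first row is the grouping with the greatest total
as.character(totals$EVTYPE.Grouping[1])
```

Sorting the aggregated table this way gives the same ordering that `reorder()` imposes on the x axis of the bar charts.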