Analysis of effects of storms and other weather phenomena in the US

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property and crop damage. Preventing such outcomes to the extent possible is a key concern.

This data analysis provides information to government or municipal managers who might be responsible for preparing for severe weather events and will need to prioritize resources for different types of events.

The following questions are addressed for the United States:
- Which types of events are most harmful with respect to population health?
- Which types of events have the greatest economic consequences?

According to the NOAA storm database, tornadoes had by far the most adverse health effects in the United States between 1950 and 2011. Tornadoes also rank highest in estimated costs of property damage. Drought, however, appears to cause the highest estimated costs of crop damage.

Data Processing

Description of the source data

To answer these questions, the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database was analysed. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The events in the database start in the year 1950 and end in November 2011.

The source data is a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. Documentation of the database and its variables is available from the National Weather Service and the National Climatic Data Center.

Download source and load in R

fileURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
destfile <- "StormData.csv.bz2"
download.file(fileURL, destfile, method = "curl")
downloaddate <- date()
stormdataRAW <- read.table(destfile, header = TRUE, sep = ",")

The source data was downloaded as a bzip2-compressed file on Tue Aug 13 19:04:42 2019 from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2.
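
The download step can be made repeatable by skipping the download when the file already exists locally. The following is a minimal sketch, not the code used for this report. The helper name `load_storm_data` is hypothetical, and `stringsAsFactors = FALSE` is an assumption that differs from the factor columns shown in the appendix. It relies on `read.csv()` reading bzip2-compressed files directly, as `read.table()` does above.

```r
# Hypothetical helper: download the archive only if it is missing, then read it.
load_storm_data <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # binary mode keeps the compressed archive intact on all platforms
    download.file(url, destfile, mode = "wb")
  }
  # read.csv() decompresses .bz2 files transparently
  read.csv(destfile, stringsAsFactors = FALSE)
}

# Self-contained demonstration on a tiny bzip2-compressed file (no network access)
demo_file <- tempfile(fileext = ".csv.bz2")
con <- bzfile(demo_file, "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,0", "HAIL,1"), con)
close(con)

# The URL is never used here because the file already exists
demo <- load_storm_data(url = "unused-because-file-exists", destfile = demo_file)
dim(demo)
```

For the real data, the call would pass `fileURL` and `destfile` as defined above.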

dim(stormdataRAW)
## [1] 902297     37

The source data contains 902,297 observations of 37 variables.

Preparation for analysis

The following steps were undertaken to make the data ready for analysis. Details, including code, can be found in the appendix at the end of the report.

Reduce raw source to relevant columns

  • Given the size of the source file, only the relevant columns were selected for further analysis.

Clean eventtype

  • add new field eventtype containing all EVTYPE values in uppercase

  • clean typos and inconsistent use of singular/plural
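
As an illustration of these cleaning steps, the sketch below normalises a toy vector of event types with a small table of pattern rules. It is a sketch under stated assumptions, not the appendix code: the toy values and the four rules are illustrative only, and `trimws()` is added here to handle leading spaces.

```r
# Toy input; in the report the values come from the EVTYPE column
evtype <- c("Tornado", " TSTM WIND", "THUNDERSTORM WINDS", "Flash Flooding",
            "RIP CURRENTS", "Hail")

# Step 1: trim surrounding whitespace and convert to uppercase
eventtype <- toupper(trimws(evtype))

# Step 2: apply pattern -> replacement rules in order (illustrative rules only)
rules <- c("^TSTM .*"             = "THUNDERSTORM WIND",
           "^THUNDERSTORM WIND.*" = "THUNDERSTORM WIND",
           "^FLASH FLOOD.*"       = "FLASH FLOOD",
           "^RIP CURRENTS.*"      = "RIP CURRENT")
for (pattern in names(rules)) {
  # logical indexing changes nothing when no value matches the pattern
  eventtype[grepl(pattern, eventtype)] <- rules[[pattern]]
}
eventtype
```

Keeping the rules in one named vector makes it easy to review which raw spellings are merged into which category.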

Create a dataframe to base analysis of health effects

  • reduce to columns eventtype, fatalities and injuries
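
A compact way to build such a table is to sum both columns in a single `aggregate()` call. The following is a minimal sketch on toy data (the data frame `toy` and its values are made up for illustration); it also drops event types without any fatalities or injuries, as the appendix does.

```r
# Toy data with the three columns used for the health analysis
toy <- data.frame(
  eventtype  = c("TORNADO", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(1, 0, 0, 2),
  INJURIES   = c(15, 2, 0, 0),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries per event type in one step
healthdemo <- aggregate(cbind(FATALITIES, INJURIES) ~ eventtype, data = toy, FUN = sum)

# Keep only event types with at least one fatality or injury
healthdemo <- healthdemo[healthdemo$FATALITIES > 0 | healthdemo$INJURIES > 0, ]
healthdemo
```

Summing both columns in one call avoids relying on two separately aggregated tables having the same row order.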

Convert multiplier and calculate estimated costs

According to the documentation of the data source, the alphabetical characters used to signify magnitude include “K” for thousands, “M” for millions, and “B” for billions, e.g. 1.55B for $1,550,000,000. “k” and “m” were interpreted as “K” and “M”. For “h” and “H”, 100 was taken as the multiplier. In this report, empty, “-” and “?” codes were given a multiplier of 0, “+” a multiplier of 1, and the digits 0 to 8 a multiplier of 10.

An analysis of how to interpret values that do not conform to the characters K, M and B can be found at https://rstudio-pubs-static.s3.amazonaws.com/58957_37b6723ee52b455990e149edde45e5b6.html

In order to analyse economic effects four new variables were created:

  • add new variable propmultiplier and transform values from PROPDMGEXP

  • add new variable cropmultiplier and transform values from CROPDMGEXP

  • add new variable propcost = propmultiplier * PROPDMG
  • add new variable cropcost = cropmultiplier * CROPDMG
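
The conversion can be sketched with a lookup table instead of one assignment per code. This is a hedged illustration, not the appendix code: the helper name `exp_to_multiplier` and the toy data are hypothetical, and the handling of codes other than H, K, M and B is an assumption that mirrors the choices made in this report. Note that the sketch uses 1e9 for “B”, in line with the documentation quoted above, whereas the appendix code assigns 10000000, so totals computed this way would differ from the tables in this report.

```r
# Lookup table for the documented codes; "B" is taken as 1e9 (billions)
multiplier_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: translate exponent codes into numeric multipliers
exp_to_multiplier <- function(exp_code) {
  code <- toupper(as.character(exp_code))    # also works for factor columns
  mult <- unname(multiplier_lookup[code])    # NA for codes not in the table
  mult[code %in% c("", "-", "?")] <- 0       # assumption: no usable estimate
  mult[code == "+"] <- 1                     # assumption
  mult[grepl("^[0-9]$", code)] <- 10         # assumption: numeric codes mean tens
  mult
}

# Toy data using the column names of the storm database
damage <- data.frame(PROPDMG    = c(25, 2.5, 1.55, 3, 7),
                     PROPDMGEXP = c("K", "m", "B", "", "5"),
                     stringsAsFactors = FALSE)
damage$propmultiplier <- exp_to_multiplier(damage$PROPDMGEXP)
damage$propcost <- damage$propmultiplier * damage$PROPDMG
damage
```

The same helper could be applied to CROPDMGEXP and CROPDMG to obtain cropcost.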

Create a dataframe to base analysis of economic effects

  • reduce to columns eventtype, propcost and cropcost

Analysis

Weather events most harmful for health

Below are three top-ten lists:

  • sorted descending by total: the combined effect (number of fatalities AND injuries)
  • sorted descending by fatalities
  • sorted descending by injuries

The three lists show that tornadoes, thunderstorm wind and lightning, (excessive) heat and (flash) floods rank near the top of all three lists.

##             eventtype FATALITIES INJURIES total
## 573           TORNADO       5633    91346 96979
## 557 THUNDERSTORM WIND        710     9458 10168
## 103    EXCESSIVE HEAT       1922     6575  8497
## 126             FLOOD        476     6791  7267
## 310         LIGHTNING        817     5232  6049
## 212              HEAT        937     2100  3037
## 124       FLASH FLOOD       1035     1800  2835
## 279         ICE STORM         89     1975  2064
## 258         HIGH WIND        293     1471  1764
## 665          WILDFIRE         90     1606  1696
##             eventtype FATALITIES INJURIES total
## 573           TORNADO       5633    91346 96979
## 103    EXCESSIVE HEAT       1922     6575  8497
## 124       FLASH FLOOD       1035     1800  2835
## 212              HEAT        937     2100  3037
## 310         LIGHTNING        817     5232  6049
## 557 THUNDERSTORM WIND        710     9458 10168
## 402       RIP CURRENT        577      529  1106
## 126             FLOOD        476     6791  7267
## 258         HIGH WIND        293     1471  1764
## 112      EXTREME COLD        288      255   543
##             eventtype FATALITIES INJURIES total
## 573           TORNADO       5633    91346 96979
## 557 THUNDERSTORM WIND        710     9458 10168
## 126             FLOOD        476     6791  7267
## 103    EXCESSIVE HEAT       1922     6575  8497
## 310         LIGHTNING        817     5232  6049
## 212              HEAT        937     2100  3037
## 279         ICE STORM         89     1975  2064
## 124       FLASH FLOOD       1035     1800  2835
## 665          WILDFIRE         90     1606  1696
## 258         HIGH WIND        293     1471  1764
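
The three rankings above differ only in the column used for sorting, so they can be produced by one small helper. The following is a minimal sketch; the function name `top_n_by` is hypothetical, and the toy data frame contains four rows copied from the tables above.

```r
# Hypothetical helper: the n rows with the largest values in one column
top_n_by <- function(df, column, n = 10) {
  head(df[order(df[[column]], decreasing = TRUE), ], n)
}

# Four rows taken from the tables above
health <- data.frame(
  eventtype  = c("TORNADO", "EXCESSIVE HEAT", "FLOOD", "LIGHTNING"),
  FATALITIES = c(5633, 1922, 476, 817),
  INJURIES   = c(91346, 6575, 6791, 5232),
  stringsAsFactors = FALSE
)
health$total <- health$FATALITIES + health$INJURIES

top_fatal   <- top_n_by(health, "FATALITIES", n = 3)
top_injured <- top_n_by(health, "INJURIES", n = 3)
top_fatal$eventtype
top_injured$eventtype
```

The example shows why the lists are reported separately: floods rank above excessive heat by injuries, but not by fatalities.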

Weather events most harmful for economy

Below are three top-ten lists:

  • sorted descending by total: the combined effect (sum of property AND crop cost estimates)
  • sorted descending by propcost
  • sorted descending by cropcost

The three lists show that, overall, tornadoes, (flash) floods, hail and hurricanes result in the highest cost estimates. However, weather events affect property and crop cost estimates differently: crop damage is highest for drought, whereas drought is not in the top ten for property damage.

##             eventtype    propcost    cropcost       total
## 573           TORNADO 51690162897   414954710 52105117607
## 126             FLOOD 23490964860  5670823950 29161788810
## 124       FLASH FLOOD 15916911203  1532197150 17449108353
## 181              HAIL 13950269877  3025954650 16976224527
## 265         HURRICANE 11100180010  4020392800 15120572810
## 73            DROUGHT  1046106000 12487566000 13533672000
## 557 THUNDERSTORM WIND  9761179222  1224398700 10985577922
## 665          WILDFIRE  5876463500   402281630  6278745130
## 258         HIGH WIND  4716353745   686301900  5402655645
## 279         ICE STORM  3944928310    72113500  4017041810
##             eventtype    propcost   cropcost       total
## 573           TORNADO 51690162897  414954710 52105117607
## 126             FLOOD 23490964860 5670823950 29161788810
## 124       FLASH FLOOD 15916911203 1532197150 17449108353
## 181              HAIL 13950269877 3025954650 16976224527
## 265         HURRICANE 11100180010 4020392800 15120572810
## 557 THUNDERSTORM WIND  9761179222 1224398700 10985577922
## 665          WILDFIRE  5876463500  402281630  6278745130
## 258         HIGH WIND  4716353745  686301900  5402655645
## 279         ICE STORM  3944928310   72113500  4017041810
## 587    TROPICAL STORM  2605890550  678846000  3284736550
##             eventtype    propcost    cropcost       total
## 73            DROUGHT  1046106000 12487566000 13533672000
## 126             FLOOD 23490964860  5670823950 29161788810
## 265         HURRICANE 11100180010  4020392800 15120572810
## 181              HAIL 13950269877  3025954650 16976224527
## 124       FLASH FLOOD 15916911203  1532197150 17449108353
## 112      EXTREME COLD   132385400  1313023000  1445408400
## 557 THUNDERSTORM WIND  9761179222  1224398700 10985577922
## 157      FROST/FREEZE    10480000  1094186000  1104666000
## 222        HEAVY RAIN   694248090   733399800  1427647890
## 258         HIGH WIND  4716353745   686301900  5402655645

Disclaimer

As mentioned on the National Climatic Data Center Storm Data FAQ page, the National Weather Service does not guarantee the accuracy or validity of the information. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

The categorisation of event types is somewhat arbitrary and does not allow a sound analysis. For example, terminology varies from “Heat” and “Excessive heat” to “Extreme heat”. To deliver good material for decisions, efforts to catalogue event types consistently and reliably, using domain knowledge, are advised.

Due to time constraints, neither missing values nor their potential influence on the results were analysed.
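
A first check for missing values could look like the sketch below. It is an illustration on toy data (the data frame `toy` is made up), not an analysis of the storm database. Empty exponent codes, which the summary output in the appendix shows to be the most common value of PROPDMGEXP, are not `NA` and therefore need a separate count.

```r
# Toy data with the same column names as the storm database
toy <- data.frame(
  FATALITIES = c(0, NA, 1),
  INJURIES   = c(15, 0, NA),
  PROPDMGEXP = c("K", "", "M"),
  stringsAsFactors = FALSE
)

# Number of NA values per column
na_counts <- colSums(is.na(toy))

# Empty exponent codes are not NA, so they are counted separately
empty_exp <- sum(toy$PROPDMGEXP == "")

na_counts
empty_exp
```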

It might be of interest to restrict the analysis of health and economic effects of weather events to the last 5 or 10 years. The data quality might be better and the results might be more valid for future budgeting.
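
Such a restriction could be sketched as follows, assuming the date format visible in the BGN_DATE column (month/day/year followed by a time). The toy rows, the variable names `last_year` and `recent`, and the 10-year window are illustrative assumptions.

```r
# Toy rows using the BGN_DATE format of the storm database
toy <- data.frame(
  BGN_DATE   = c("4/18/1950 0:00:00", "5/25/2011 0:00:00",
                 "4/2/2006 0:00:00", "5/30/2004 0:00:00"),
  FATALITIES = c(0, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Parse the date part; as.character() also covers factor columns,
# and the trailing time of day is ignored by the format
dates <- as.Date(as.character(toy$BGN_DATE), format = "%m/%d/%Y")
toy$year <- as.integer(format(dates, "%Y"))

# Keep the last 10 years of the database (2002 to 2011)
last_year <- 2011
recent <- toy[toy$year > last_year - 10, ]
recent
```

The aggregation and ranking steps of the report could then be repeated on the reduced data.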

Results

Weather events most harmful to health

According to the NOAA storm database, tornadoes had by far the most adverse health effects in the United States between 1950 and 2011.

Weather events most harmful to economy

Tornadoes also rank highest in estimated costs of property damage. Drought appears to cause the highest estimated costs of crop damage.

Appendix: Code Details

This section offers all coding details of the analysis.

Preparation for analysis

The following steps were undertaken to make the data ready for analysis.

Reduce raw source to relevant columns

  • Given the size of the source file, only the relevant columns were selected for further analysis.
stormdata <- stormdataRAW[,c("BGN_DATE","TIME_ZONE", "STATE","EVTYPE","FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]
dim(stormdata)
## [1] 902297     10
str(stormdata)
## 'data.frame':    902297 obs. of  10 variables:
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
summary(stormdata)
##               BGN_DATE        TIME_ZONE          STATE       
##  5/25/2011 0:00:00:  1202   CST    :547493   TX     : 83728  
##  4/27/2011 0:00:00:  1193   EST    :245558   KS     : 53440  
##  6/9/2011 0:00:00 :  1030   MST    : 68390   OK     : 46802  
##  5/30/2004 0:00:00:  1016   PST    : 28302   MO     : 35648  
##  4/4/2011 0:00:00 :  1009   AST    :  6360   IA     : 31069  
##  4/2/2006 0:00:00 :   981   HST    :  2563   NE     : 30271  
##  (Other)          :895866   (Other):  3631   (Other):621339  
##                EVTYPE         FATALITIES          INJURIES        
##  HAIL             :288661   Min.   :  0.0000   Min.   :   0.0000  
##  TSTM WIND        :219940   1st Qu.:  0.0000   1st Qu.:   0.0000  
##  THUNDERSTORM WIND: 82563   Median :  0.0000   Median :   0.0000  
##  TORNADO          : 60652   Mean   :  0.0168   Mean   :   0.1557  
##  FLASH FLOOD      : 54277   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
##  FLOOD            : 25326   Max.   :583.0000   Max.   :1700.0000  
##  (Other)          :170878                                         
##     PROPDMG          PROPDMGEXP        CROPDMG          CROPDMGEXP    
##  Min.   :   0.00          :465934   Min.   :  0.000          :618413  
##  1st Qu.:   0.00   K      :424665   1st Qu.:  0.000   K      :281832  
##  Median :   0.00   M      : 11330   Median :  0.000   M      :  1994  
##  Mean   :  12.06   0      :   216   Mean   :  1.527   k      :    21  
##  3rd Qu.:   0.50   B      :    40   3rd Qu.:  0.000   0      :    19  
##  Max.   :5000.00   5      :    28   Max.   :990.000   B      :     9  
##                    (Other):    84                     (Other):     9
head(stormdata)
##             BGN_DATE TIME_ZONE STATE  EVTYPE FATALITIES INJURIES PROPDMG
## 1  4/18/1950 0:00:00       CST    AL TORNADO          0       15    25.0
## 2  4/18/1950 0:00:00       CST    AL TORNADO          0        0     2.5
## 3  2/20/1951 0:00:00       CST    AL TORNADO          0        2    25.0
## 4   6/8/1951 0:00:00       CST    AL TORNADO          0        2     2.5
## 5 11/15/1951 0:00:00       CST    AL TORNADO          0        2     2.5
## 6 11/15/1951 0:00:00       CST    AL TORNADO          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP
## 1          K       0           
## 2          K       0           
## 3          K       0           
## 4          K       0           
## 5          K       0           
## 6          K       0

Clean eventtype

  • add new field eventtype containing all EVTYPE values in uppercase
stormdata$eventtype <- toupper(stormdata$EVTYPE)
  • clean typos and inconsistent use of singular/plural
stormdata[stormdata$eventtype == "AVALANCE",]$eventtype <- "AVALANCHE"
stormdata[substr(stormdata$eventtype,1,10) == "COASTAL FL",]$eventtype <- "COASTAL FLOOD"
stormdata[stormdata$eventtype == "COASTALSTORM",]$eventtype <- "COASTAL STORM"
stormdata[stormdata$eventtype == "COLD AND SNOW",]$eventtype <- "COLD"
stormdata[stormdata$eventtype == "COLD TEMPERATURE",]$eventtype <- "COLD"
stormdata[stormdata$eventtype == "COLD WEATHER",]$eventtype <- "COLD"
stormdata[substr(stormdata$eventtype,1,6) == "COLD/W",]$eventtype <- "COLD/WIND"
stormdata[stormdata$eventtype == "DRY MIRCOBURST WINDS",]$eventtype <- "DRY MICROBURST"
stormdata[stormdata$eventtype == "DUST DEVIL",]$eventtype <- "DUST STORM"
stormdata[stormdata$eventtype == "EXTREME COLD/WIND CHILL",]$eventtype <- "EXTREME COLD"
stormdata[substr(stormdata$eventtype,1,11) == "FLASH FLOOD",]$eventtype <- "FLASH FLOOD"
stormdata[substr(stormdata$eventtype,1,12) == " FLASH FLOOD",]$eventtype <- "FLASH FLOOD"
stormdata[stormdata$eventtype == "FLOOD/FLASH FLOOD",]$eventtype <- "FLASH FLOOD"
stormdata[stormdata$eventtype == "FLOODING",]$eventtype <- "FLOOD"
stormdata[stormdata$eventtype == "GUSTY WINDS",]$eventtype <- "GUSTY WIND"
stormdata[stormdata$eventtype == "HEAT WAVES",]$eventtype <- "HEAT WAVE"
stormdata[substr(stormdata$eventtype,1,10) == "HEAVY SNOW",]$eventtype <- "HEAVY SNOW"
stormdata[substr(stormdata$eventtype,1,10) == "HEAVY SURF",]$eventtype <- "HEAVY SURF"
stormdata[substr(stormdata$eventtype,1,9) == "HIGH WIND",]$eventtype <- "HIGH WIND"
stormdata[substr(stormdata$eventtype,1,9) == "HURRICANE",]$eventtype <- "HURRICANE"
stormdata[stormdata$eventtype == "HYPOTHERMIA/EXPOSURE",]$eventtype <- "HYPOTHERMIA"
stormdata[substr(stormdata$eventtype,1,9) == "LIGHTNING",]$eventtype <- "LIGHTNING"
stormdata[stormdata$eventtype == "RECORD/EXCESSIVE HEAT",]$eventtype <- "EXCESSIVE HEAT"
stormdata[stormdata$eventtype == "RECORD HEAT",]$eventtype <- "EXCESSIVE HEAT"
stormdata[stormdata$eventtype == "RECORD COLD",]$eventtype <- "EXTREME COLD"
stormdata[substr(stormdata$eventtype,1,12) == "RIP CURRENTS",]$eventtype <- "RIP CURRENT"
stormdata[stormdata$eventtype == "RIVER FLOODING",]$eventtype <- "RIVER FLOOD"
stormdata[stormdata$eventtype == "SNOW SQUALLS",]$eventtype <- "SNOW SQUALL"
stormdata[stormdata$eventtype == "STRONG WINDS",]$eventtype <- "STRONG WIND"
stormdata[substr(stormdata$eventtype,1,17) == "THUNDERSTORM WIND",]$eventtype <- "THUNDERSTORM WIND"
stormdata[substr(stormdata$eventtype,1,17) == "THUNDERTORM WINDS",]$eventtype <- "THUNDERSTORM WIND"
stormdata[stormdata$eventtype == "TROPICAL STORM GORDON",]$eventtype <- "TROPICAL STORM"
stormdata[substr(stormdata$eventtype,1,5) == "TSTM ",]$eventtype <- "THUNDERSTORM WIND"
stormdata[substr(stormdata$eventtype,1,6) == " TSTM ",]$eventtype <- "THUNDERSTORM WIND"
stormdata[substr(stormdata$eventtype,1,10) == "WILD FIRES",]$eventtype <- "WILDFIRE"
stormdata[substr(stormdata$eventtype,1,10) == "WILD/FORES",]$eventtype <- "WILDFIRE"
stormdata[substr(stormdata$eventtype,1,14) == "WINTER WEATHER",]$eventtype <- "WINTER WEATHER"
stormdata[stormdata$eventtype == "WINTRY MIX",]$eventtype <- "WINTER WEATHER"

Create a dataframe to base analysis of health effects

  • reduce to columns eventtype, fatalities and injuries
fatal <- aggregate(FATALITIES ~ eventtype, stormdata, FUN ="sum")
injured <- aggregate(INJURIES ~ eventtype, stormdata, FUN ="sum")
healtheffects <- fatal
healtheffects$INJURIES <- injured$INJURIES
dim(healtheffects)
## [1] 685   3
# reduce to rows where either fatalities or injuries > 0
healtheffects <- healtheffects[(healtheffects$FATALITIES > 0) | (healtheffects$INJURIES >0), ]
dim(healtheffects)
## [1] 135   3

Convert multiplier and calculate estimated costs

In order to analyse economic effects four new variables were created:
  • add new variable propmultiplier and transform values from PROPDMGEXP

# default multiplier 10 applies to the numeric codes 0-8; all other codes are overwritten below
stormdata$propmultiplier <- 10

#plot(stormdata$PROPDMGEXP)
table(stormdata$PROPDMGEXP)
## 
##             -      ?      +      0      1      2      3      4      5 
## 465934      1      8      5    216     25     13      4      4     28 
##      6      7      8      B      h      H      K      m      M 
##      4      5      1     40      1      6 424665      7  11330
stormdata[stormdata$PROPDMGEXP == "",]$propmultiplier <- 0
stormdata[stormdata$PROPDMGEXP == "-",]$propmultiplier <- 0
stormdata[stormdata$PROPDMGEXP == "?",]$propmultiplier <- 0
stormdata[stormdata$PROPDMGEXP == "+",]$propmultiplier <- 1
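# NOTE: the documentation defines "B" as billions (1e9); the value assigned here is 10000000 (1e7),
# and the cost tables in this report were computed with it
stormdata[stormdata$PROPDMGEXP == "B",]$propmultiplier <- 10000000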
stormdata[stormdata$PROPDMGEXP == "h",]$propmultiplier <- 100
stormdata[stormdata$PROPDMGEXP == "H",]$propmultiplier <- 100
stormdata[stormdata$PROPDMGEXP == "K",]$propmultiplier <- 1000
stormdata[stormdata$PROPDMGEXP == "m",]$propmultiplier <- 1000000
stormdata[stormdata$PROPDMGEXP == "M",]$propmultiplier <- 1000000
options(scipen=999)
table(stormdata$propmultiplier)
## 
##        0        1       10      100     1000  1000000 10000000 
##   465943        5      300        7   424665    11337       40
  • add new variable cropmultiplier and transform values from CROPDMGEXP
# default multiplier 10 applies to the numeric codes 0 and 2; all other codes are overwritten below
stormdata$cropmultiplier <- 10

#plot(stormdata$CROPDMGEXP)
table(stormdata$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994
stormdata[stormdata$CROPDMGEXP == "",]$cropmultiplier <- 0
stormdata[stormdata$CROPDMGEXP == "?",]$cropmultiplier <- 0
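# NOTE: the documentation defines "B" as billions (1e9); the value assigned here is 10000000 (1e7),
# and the cost tables in this report were computed with it
stormdata[stormdata$CROPDMGEXP == "B",]$cropmultiplier <- 10000000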
stormdata[stormdata$CROPDMGEXP == "k",]$cropmultiplier <- 1000
stormdata[stormdata$CROPDMGEXP == "K",]$cropmultiplier <- 1000
stormdata[stormdata$CROPDMGEXP == "m",]$cropmultiplier <- 1000000
stormdata[stormdata$CROPDMGEXP == "M",]$cropmultiplier <- 1000000

options(scipen=999)
table(stormdata$cropmultiplier)
## 
##        0       10     1000  1000000 10000000 
##   618420       20   281853     1995        9
  • add new variable propcost = propmultiplier * PROPDMG
  • add new variable cropcost = cropmultiplier * CROPDMG
#dim(stormdata)
stormdata$propcost <- stormdata$propmultiplier * stormdata$PROPDMG
stormdata$cropcost <- stormdata$cropmultiplier * stormdata$CROPDMG
#dim(stormdata)

Create a dataframe to base analysis of economic effects

  • reduce to columns eventtype, propcost and cropcost
prop <- aggregate(propcost ~ eventtype, stormdata, FUN ="sum")
crop <- aggregate(cropcost ~ eventtype, stormdata, FUN ="sum")
economiceffects <- prop
economiceffects$cropcost <- crop$cropcost
dim(economiceffects)
## [1] 685   3

Analysis

Weather events most harmful for health

healtheffects$total <- healtheffects$FATALITIES + healtheffects$INJURIES
head(healtheffects[order(-healtheffects$total),], 10)
##             eventtype FATALITIES INJURIES total
## 573           TORNADO       5633    91346 96979
## 557 THUNDERSTORM WIND        710     9458 10168
## 103    EXCESSIVE HEAT       1922     6575  8497
## 126             FLOOD        476     6791  7267
## 310         LIGHTNING        817     5232  6049
## 212              HEAT        937     2100  3037
## 124       FLASH FLOOD       1035     1800  2835
## 279         ICE STORM         89     1975  2064
## 258         HIGH WIND        293     1471  1764
## 665          WILDFIRE         90     1606  1696
head(healtheffects[order(-healtheffects$FATALITIES),], 10)
##             eventtype FATALITIES INJURIES total
## 573           TORNADO       5633    91346 96979
## 103    EXCESSIVE HEAT       1922     6575  8497
## 124       FLASH FLOOD       1035     1800  2835
## 212              HEAT        937     2100  3037
## 310         LIGHTNING        817     5232  6049
## 557 THUNDERSTORM WIND        710     9458 10168
## 402       RIP CURRENT        577      529  1106
## 126             FLOOD        476     6791  7267
## 258         HIGH WIND        293     1471  1764
## 112      EXTREME COLD        288      255   543
head(healtheffects[order(-healtheffects$INJURIES),], 10)
##             eventtype FATALITIES INJURIES total
## 573           TORNADO       5633    91346 96979
## 557 THUNDERSTORM WIND        710     9458 10168
## 126             FLOOD        476     6791  7267
## 103    EXCESSIVE HEAT       1922     6575  8497
## 310         LIGHTNING        817     5232  6049
## 212              HEAT        937     2100  3037
## 279         ICE STORM         89     1975  2064
## 124       FLASH FLOOD       1035     1800  2835
## 665          WILDFIRE         90     1606  1696
## 258         HIGH WIND        293     1471  1764
# In order to plot the adverse health effects in the results section, we need to prepare the data:

#limit the data for plot
plotdataH <- head(healtheffects[order(-healtheffects$total),], 10)
plotdataH <- plotdataH[,1:3]

#make wide format to long format in order to plot the results
library(reshape2)
head(plotdataH)
##             eventtype FATALITIES INJURIES
## 573           TORNADO       5633    91346
## 557 THUNDERSTORM WIND        710     9458
## 103    EXCESSIVE HEAT       1922     6575
## 126             FLOOD        476     6791
## 310         LIGHTNING        817     5232
## 212              HEAT        937     2100
plotdataH <- melt(plotdataH)
## Using eventtype as id variables
head(plotdataH)
##           eventtype   variable value
## 1           TORNADO FATALITIES  5633
## 2 THUNDERSTORM WIND FATALITIES   710
## 3    EXCESSIVE HEAT FATALITIES  1922
## 4             FLOOD FATALITIES   476
## 5         LIGHTNING FATALITIES   817
## 6              HEAT FATALITIES   937

Weather events most harmful for economy

economiceffects$total <- economiceffects$propcost + economiceffects$cropcost
head(economiceffects[order(-economiceffects$total),], 10)
##             eventtype    propcost    cropcost       total
## 573           TORNADO 51690162897   414954710 52105117607
## 126             FLOOD 23490964860  5670823950 29161788810
## 124       FLASH FLOOD 15916911203  1532197150 17449108353
## 181              HAIL 13950269877  3025954650 16976224527
## 265         HURRICANE 11100180010  4020392800 15120572810
## 73            DROUGHT  1046106000 12487566000 13533672000
## 557 THUNDERSTORM WIND  9761179222  1224398700 10985577922
## 665          WILDFIRE  5876463500   402281630  6278745130
## 258         HIGH WIND  4716353745   686301900  5402655645
## 279         ICE STORM  3944928310    72113500  4017041810
head(economiceffects[order(-economiceffects$propcost),], 10)
##             eventtype    propcost   cropcost       total
## 573           TORNADO 51690162897  414954710 52105117607
## 126             FLOOD 23490964860 5670823950 29161788810
## 124       FLASH FLOOD 15916911203 1532197150 17449108353
## 181              HAIL 13950269877 3025954650 16976224527
## 265         HURRICANE 11100180010 4020392800 15120572810
## 557 THUNDERSTORM WIND  9761179222 1224398700 10985577922
## 665          WILDFIRE  5876463500  402281630  6278745130
## 258         HIGH WIND  4716353745  686301900  5402655645
## 279         ICE STORM  3944928310   72113500  4017041810
## 587    TROPICAL STORM  2605890550  678846000  3284736550
head(economiceffects[order(-economiceffects$cropcost),], 10)
##             eventtype    propcost    cropcost       total
## 73            DROUGHT  1046106000 12487566000 13533672000
## 126             FLOOD 23490964860  5670823950 29161788810
## 265         HURRICANE 11100180010  4020392800 15120572810
## 181              HAIL 13950269877  3025954650 16976224527
## 124       FLASH FLOOD 15916911203  1532197150 17449108353
## 112      EXTREME COLD   132385400  1313023000  1445408400
## 557 THUNDERSTORM WIND  9761179222  1224398700 10985577922
## 157      FROST/FREEZE    10480000  1094186000  1104666000
## 222        HEAVY RAIN   694248090   733399800  1427647890
## 258         HIGH WIND  4716353745   686301900  5402655645
# In order to plot the adverse economic effects in the results section, we need to prepare the data:

# limit the data for plot
plotdataE <- head(economiceffects[order(-economiceffects$total),], 10)
plotdataE <- plotdataE[,1:3]

#make wide format to long format in order to plot the results
#library(reshape2)
#head(plotdataE)
plotdataE <- melt(plotdataE)
## Using eventtype as id variables
#head(plotdataE)

Results

Weather events most harmful to health

library(ggplot2)
p1 <- ggplot(plotdataH, aes(x=reorder(eventtype,-value), y=value, fill=variable))
p1 <- p1 + geom_bar(stat='identity')
p1 <- p1 + theme_bw() + theme(axis.text.x=element_text(angle=45, hjust=1))
p1 <- p1 + labs(y="Number of fatalities/injuries", x = "Eventtype", fill="")
p1 <- p1 + ggtitle("Weather events most harmful to health (Total Top 10)")
p1 + scale_fill_manual(values = c("Black", "Red"))

Weather events most harmful to economy

p2 <- ggplot(plotdataE, aes(x=reorder(eventtype,-value), y=value, fill=variable))
p2 <- p2 + geom_bar(stat='identity')
p2 <- p2 + theme_bw() + theme(axis.text.x=element_text(angle=45, hjust=1))
p2 <- p2 + labs(y="Sum of estimated costs", x = "Eventtype", fill="")
p2 <- p2 + ggtitle("Weather events most harmful to economy (Total Top 10)")
p2 + scale_fill_manual(values = c("Steelblue", "Darkgreen"))

Software Environment:

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18362)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=German_Austria.1252  LC_CTYPE=German_Austria.1252   
## [3] LC_MONETARY=German_Austria.1252 LC_NUMERIC=C                   
## [5] LC_TIME=German_Austria.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.2.0  reshape2_1.4.3
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1       knitr_1.23       magrittr_1.5     tidyselect_0.2.5
##  [5] munsell_0.5.0    colorspace_1.4-1 R6_2.4.0         rlang_0.4.0     
##  [9] stringr_1.4.0    plyr_1.8.4       dplyr_0.8.3      tools_3.6.1     
## [13] grid_3.6.1       gtable_0.3.0     xfun_0.8         withr_2.1.2     
## [17] htmltools_0.3.6  assertthat_0.2.1 lazyeval_0.2.2   digest_0.6.20   
## [21] tibble_2.1.3     crayon_1.3.4     purrr_0.3.2      glue_1.3.1      
## [25] evaluate_0.14    rmarkdown_1.14   labeling_0.3     stringi_1.4.3   
## [29] compiler_3.6.1   pillar_1.4.2     scales_1.0.0     pkgconfig_2.0.2

This analysis was a project for the course “Reproducible Research” from Johns Hopkins on Coursera.