Synopsis

The most destructive weather events for population health are tornadoes (by a significant margin), followed by heat. Heat casualties, however, are fatal in a larger proportion of cases than tornado casualties.

The worst weather events for property and crop damage are floods, followed by hurricanes. Most of the economic damage is to property rather than to crops.

Data Processing

The storm data is compressed as a bz2 file, which can be read directly into R.

storms <- read.csv("repdata_data_StormData.csv.bz2")
head(storms)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
## 4         0                                               0         NA
## 5         0                                               0         NA
## 6         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
## 4         0                       0.0   100 2   0          0        2     2.5
## 5         0                       0.0   150 2   0          0        2     2.5
## 6         0                       1.5   177 2   0          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
## 4          K       0                                         3458      8626
## 5          K       0                                         3412      8642
## 6          K       0                                         3450      8748
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3
## 4          0          0              4
## 5          0          0              5
## 6          0          0              6
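read.csv decompresses bz2 files transparently, so the archive does not need to be unpacked by hand. A minimal sketch of that behaviour, assuming a small made-up table written to a temporary file (the toy values and the temporary path are illustrative, not the storm data):

```r
# Toy table standing in for the storm data (values are made up)
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(1, 0))

# Write the table as a bz2-compressed CSV to a temporary file
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv detects the compression and reads the file directly
roundtrip <- read.csv(path)
print(roundtrip)
```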

Data Processing to Determine Harm to Population Health

The variables we are interested in to determine the events that pose the biggest threat to population health are EVTYPE (event type - class character), FATALITIES (class numeric), and INJURIES (class numeric). These variables are taken out of the main data table and stored separately for ease of analysis.

health <- subset(storms, select = c(EVTYPE, FATALITIES, INJURIES))
head(health)
##    EVTYPE FATALITIES INJURIES
## 1 TORNADO          0       15
## 2 TORNADO          0        0
## 3 TORNADO          0        2
## 4 TORNADO          0        2
## 5 TORNADO          0        2
## 6 TORNADO          0        6

The fatalities and injuries are summed up over each event type and the outputs combined in one data table called healthsummary.

healthsummary <- aggregate(cbind(health$FATALITIES, health$INJURIES), list(health$EVTYPE), sum)
colnames(healthsummary) <- c("event", "fatalities", "injuries")
head(healthsummary)
##                   event fatalities injuries
## 1    HIGH SURF ADVISORY          0        0
## 2         COASTAL FLOOD          0        0
## 3           FLASH FLOOD          0        0
## 4             LIGHTNING          0        0
## 5             TSTM WIND          0        0
## 6       TSTM WIND (G45)          0        0
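The same grouping can also be written with the formula interface of aggregate, which keeps the original column names so they do not have to be reassigned afterwards. A small sketch on made-up data (the table toyhealth and its values are assumptions for illustration only):

```r
# Toy stand-in for the health table (values are made up)
toyhealth <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "HEAT"),
  FATALITIES = c(1, 2, 3, 4),
  INJURIES   = c(10, 0, 5, 1)
)

# Sum both columns within each event type; the formula keeps the column names
toysummary <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                        data = toyhealth, FUN = sum)
print(toysummary)
```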

Rows with neither fatalities nor injuries are removed. Then fatalities and injuries are added together and the total is placed in a new column.

smallhealthsummary <- subset(healthsummary, fatalities !=0 | injuries != 0)
smallhealthsummary$total <- smallhealthsummary$fatalities + smallhealthsummary$injuries
head(smallhealthsummary)
##           event fatalities injuries total
## 18     AVALANCE          1        0     1
## 19    AVALANCHE        224      170   394
## 29    BLACK ICE          1       24    25
## 30     BLIZZARD        101      805   906
## 42 blowing snow          1        1     2
## 44 BLOWING SNOW          1       13    14

There are many rows in “smallhealthsummary” that could be combined. To determine how precisely the data need to be processed, the row with the maximum total fatalities and injuries is pulled.

smallhealthsummary[which.max(smallhealthsummary$total), ]
##       event fatalities injuries total
## 834 TORNADO       5633    91346 96979

Since the maximum total of fatalities and injuries is nearly 100,000, any row with a total of 1,000 or fewer fatalities and injuries, roughly 1% of the maximum, is culled from the data set. This cut should not skew the data analysis too heavily, while simplifying the data table significantly.

finalhealthsummary <- subset(smallhealthsummary, total > 1000)
print(finalhealthsummary)
##                 event fatalities injuries total
## 130    EXCESSIVE HEAT       1903     6525  8428
## 153       FLASH FLOOD        978     1777  2755
## 170             FLOOD        470     6789  7259
## 244              HAIL         15     1361  1376
## 275              HEAT        937     2100  3037
## 310        HEAVY SNOW        127     1021  1148
## 359         HIGH WIND        248     1137  1385
## 411 HURRICANE/TYPHOON         64     1275  1339
## 427         ICE STORM         89     1975  2064
## 464         LIGHTNING        816     5230  6046
## 760 THUNDERSTORM WIND        133     1488  1621
## 834           TORNADO       5633    91346 96979
## 856         TSTM WIND        504     6957  7461
## 972      WINTER STORM        206     1321  1527
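The cutoff could also be derived from the data instead of being typed in as a rounded number. A minimal sketch, assuming a made-up table with a total column (the names toy, cutoff and kept are hypothetical); note that an exact 1% of 96979 is 969.79, slightly below the rounded 1,000 used here:

```r
# Toy totals standing in for smallhealthsummary (event names are made up)
toy <- data.frame(event = c("A", "B", "C"), total = c(96979, 1376, 394))

# Cutoff as a fraction of the largest total instead of a hard-coded number
cutoff <- 0.01 * max(toy$total)
kept <- subset(toy, total > cutoff)
print(cutoff)
print(kept)
```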

This data table still contains a few rows with data that need to be combined. For instance, “Flood” and “Flash Flood” are combined as “Flood”.

finalhealthsummary$event <- replace(finalhealthsummary$event, 1, "HEAT")
finalhealthsummary$event <- replace(finalhealthsummary$event, 2, "FLOOD")
finalhealthsummary$event <- replace(finalhealthsummary$event, 6, "WINTER STORM")
finalhealthsummary$event <- replace(finalhealthsummary$event, 7, "WIND")
finalhealthsummary$event <- replace(finalhealthsummary$event, 8, "HURRICANE")
finalhealthsummary$event <- replace(finalhealthsummary$event, 9, "WINTER STORM")
finalhealthsummary$event <- replace(finalhealthsummary$event, 11, "WIND")
finalhealthsummary$event <- replace(finalhealthsummary$event, 13, "WIND")
print(finalhealthsummary)
##            event fatalities injuries total
## 130         HEAT       1903     6525  8428
## 153        FLOOD        978     1777  2755
## 170        FLOOD        470     6789  7259
## 244         HAIL         15     1361  1376
## 275         HEAT        937     2100  3037
## 310 WINTER STORM        127     1021  1148
## 359         WIND        248     1137  1385
## 411    HURRICANE         64     1275  1339
## 427 WINTER STORM         89     1975  2064
## 464    LIGHTNING        816     5230  6046
## 760         WIND        133     1488  1621
## 834      TORNADO       5633    91346 96979
## 856         WIND        504     6957  7461
## 972 WINTER STORM        206     1321  1527

These values are then summed up to make a table with unique rows.

finalhealthsummary <- aggregate(cbind(finalhealthsummary$fatalities, finalhealthsummary$injuries, finalhealthsummary$total), list(finalhealthsummary$event), sum)
colnames(finalhealthsummary) <- c("event", "fatalities", "injuries", "total")
print(finalhealthsummary)
##          event fatalities injuries total
## 1        FLOOD       1448     8566 10014
## 2         HAIL         15     1361  1376
## 3         HEAT       2840     8625 11465
## 4    HURRICANE         64     1275  1339
## 5    LIGHTNING        816     5230  6046
## 6      TORNADO       5633    91346 96979
## 7         WIND        885     9582 10467
## 8 WINTER STORM        422     4317  4739
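Replacing labels by row position works for this table, but it breaks silently if the row order ever changes. One alternative is a lookup keyed on the label itself. The sketch below uses made-up counts and a hypothetical lookup vector named recode; it is an illustration of the idea, not the code used to produce the table above:

```r
# Toy table with a few of the event labels shown above (counts are made up)
toy <- data.frame(
  event = c("EXCESSIVE HEAT", "HEAT", "FLASH FLOOD", "FLOOD", "TORNADO"),
  total = c(5, 3, 2, 7, 9),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: original label -> combined label
recode <- c("EXCESSIVE HEAT" = "HEAT", "FLASH FLOOD" = "FLOOD")

# Replace only the labels that appear in the lookup; leave the rest untouched
hit <- toy$event %in% names(recode)
toy$event[hit] <- unname(recode[toy$event[hit]])

# Sum the totals over the combined labels
combined <- aggregate(total ~ event, data = toy, FUN = sum)
print(combined)
```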

Data Processing to Determine Economic Consequences

The variables we are interested in to determine the events with the greatest economic consequences are EVTYPE (event type - class character), PROPDMG (class numeric), PROPDMGEXP (class character), CROPDMG (class numeric), and CROPDMGEXP (class character). These variables are taken out of the main data table and stored separately for ease of analysis.

damage <- subset(storms, select = c(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP))
nrow(damage)
## [1] 902297
head(damage)
##    EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO    25.0          K       0           
## 2 TORNADO     2.5          K       0           
## 3 TORNADO    25.0          K       0           
## 4 TORNADO     2.5          K       0           
## 5 TORNADO     2.5          K       0           
## 6 TORNADO     2.5          K       0

Because the damage is listed with various exponents, the values of PROPDMG and CROPDMG cannot simply be summed to get the total amount of damage for each event. First, PROPDMG and CROPDMG have to be converted to their actual amounts, taking the exponent into account. To work with as little data as possible, rows in which both PROPDMG and CROPDMG are zero are removed first.

smalldamage <- subset(damage, PROPDMG !=0 | CROPDMG != 0)
nrow(smalldamage)
## [1] 245031
head(smalldamage)
##    EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO    25.0          K       0           
## 2 TORNADO     2.5          K       0           
## 3 TORNADO    25.0          K       0           
## 4 TORNADO     2.5          K       0           
## 5 TORNADO     2.5          K       0           
## 6 TORNADO     2.5          K       0

Next, it would be nice to have a sense of what exponents are listed and how many of each there are. To do this, the dplyr package is used.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked _by_ '.GlobalEnv':
## 
##     storms
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

The exponent options and counts for property damage and crop damage are shown below:

smalldamage %>% count(PROPDMGEXP)
##    PROPDMGEXP      n
## 1               4357
## 2           -      1
## 3           +      5
## 4           0    209
## 5           2      1
## 6           3      1
## 7           4      4
## 8           5     18
## 9           6      3
## 10          7      2
## 11          B     40
## 12          h      1
## 13          H      6
## 14          K 229057
## 15          m      7
## 16          M  11319
smalldamage %>% count(CROPDMGEXP)
##   CROPDMGEXP      n
## 1            145037
## 2          ?      6
## 3          0     17
## 4          B      7
## 5          k     21
## 6          K  97960
## 7          m      1
## 8          M   1982

The storm data documentation states on page 12 that the allowable entries for an exponent are “K” for thousands, “M” for millions, and “B” for billions. Lowercase “k” and “m” are treated here as their uppercase equivalents. All other entries are treated as missing data. The column is then converted to a numeric class. The code and data for the PROPDMGEXP column are as follows:

# Replacing the valid exponents with numbers
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "K", 1000)
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "M", 1000000)
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "m", 1000000)
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "B", 1000000000)
# Replacing the invalid exponent entries with NA
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "-", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "+", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "0", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "2", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "3", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "4", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "5", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "6", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "7", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "h", "NA")
smalldamage$PROPDMGEXP <- replace(smalldamage$PROPDMGEXP, smalldamage$PROPDMGEXP == "H", "NA")
smalldamage$PROPDMGEXP <- as.numeric(smalldamage$PROPDMGEXP)
## Warning: NAs introduced by coercion
smalldamage %>% count(PROPDMGEXP)
##   PROPDMGEXP      n
## 1      1e+03 229057
## 2      1e+06  11326
## 3      1e+09     40
## 4         NA   4608
class(smalldamage$PROPDMGEXP)
## [1] "numeric"

The same is done for the CROPDMGEXP column.

# Replacing the valid exponents with numbers
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "K", 1000)
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "k", 1000)
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "M", 1000000)
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "m", 1000000)
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "B", 1000000000)
# Replacing the invalid exponent entries with NA
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "", "NA")
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "?", "NA")
smalldamage$CROPDMGEXP <- replace(smalldamage$CROPDMGEXP, smalldamage$CROPDMGEXP == "0", "NA")
smalldamage$CROPDMGEXP <- as.numeric(smalldamage$CROPDMGEXP)
## Warning: NAs introduced by coercion
smalldamage %>% count(CROPDMGEXP)
##   CROPDMGEXP      n
## 1      1e+03  97981
## 2      1e+06   1983
## 3      1e+09      7
## 4         NA 145060
class(smalldamage$CROPDMGEXP)
## [1] "numeric"
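The chain of replace calls can be condensed into a single lookup. A minimal sketch, assuming a named vector of multipliers (the names codes, multiplier and values are hypothetical); any code that is not a name in the lookup, including the empty string, comes back as NA:

```r
# Toy exponent codes, including some invalid ones
codes <- c("K", "M", "m", "B", "", "5", "?")

# Multipliers for the codes treated as valid above; anything else maps to NA
multiplier <- c(K = 1e3, k = 1e3, M = 1e6, m = 1e6, B = 1e9)
values <- unname(multiplier[codes])
print(values)
```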

From here, the actual damage value is calculated for both, with each being assigned to a new column.

smalldamage$PROPDMGVALUE <- smalldamage$PROPDMG * smalldamage$PROPDMGEXP
smalldamage$CROPDMGVALUE <- smalldamage$CROPDMG * smalldamage$CROPDMGEXP
head(smalldamage)
##    EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP PROPDMGVALUE CROPDMGVALUE
## 1 TORNADO    25.0       1000       0         NA        25000           NA
## 2 TORNADO     2.5       1000       0         NA         2500           NA
## 3 TORNADO    25.0       1000       0         NA        25000           NA
## 4 TORNADO     2.5       1000       0         NA         2500           NA
## 5 TORNADO     2.5       1000       0         NA         2500           NA
## 6 TORNADO     2.5       1000       0         NA         2500           NA

The property damage and crop damage values are summed up over each event type and the outputs combined in one data table called damagesummary. NA values are ignored for these calculations.

damagesummary <- aggregate(cbind(smalldamage$PROPDMGVALUE, smalldamage$CROPDMGVALUE), list(smalldamage$EVTYPE), sum, na.rm=TRUE)
colnames(damagesummary) <- c("event", "propdmg", "cropdmg")
head(damagesummary)
##                   event propdmg  cropdmg
## 1    HIGH SURF ADVISORY  200000        0
## 2           FLASH FLOOD   50000        0
## 3             TSTM WIND 8100000        0
## 4       TSTM WIND (G45)    8000        0
## 5                     ?    5000        0
## 6   AGRICULTURAL FREEZE       0 28820000
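Why na.rm=TRUE is needed can be seen on a few made-up numbers: multiplying by a missing exponent yields NA even when the amount is 0, and a single NA makes an ordinary sum NA. As a consequence, a record with a non-zero amount but an invalid exponent contributes nothing to the totals. The vector value below is illustrative only:

```r
# An amount times a missing multiplier is missing, even when the amount is 0
print(0 * NA)

# Toy damage values with missing entries (made up)
value <- c(25000, NA, 2500, NA)

# One missing value makes the plain sum missing
print(sum(value))

# With na.rm = TRUE the missing values are skipped
print(sum(value, na.rm = TRUE))
```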

The property damage and crop damage values are added up and the total is placed in a new column.

damagesummary$total <- damagesummary$propdmg + damagesummary$cropdmg
head(damagesummary)
##                   event propdmg  cropdmg    total
## 1    HIGH SURF ADVISORY  200000        0   200000
## 2           FLASH FLOOD   50000        0    50000
## 3             TSTM WIND 8100000        0  8100000
## 4       TSTM WIND (G45)    8000        0     8000
## 5                     ?    5000        0     5000
## 6   AGRICULTURAL FREEZE       0 28820000 28820000

As with the population health data, there are many rows in damagesummary that could be combined. To determine how precisely the data need to be processed, the row with the maximum total damage value is pulled.

damagesummary[which.max(damagesummary$total), ]
##    event      propdmg    cropdmg        total
## 72 FLOOD 144657709800 5661968450 150319678250

Since the maximum total of property and crop damage is about $150 billion, any row with total damage of $1,500,000,000 or less, roughly 1% of the maximum, is culled from the data set. This cut should not skew the data analysis too heavily, while simplifying the data table significantly.

smalldamagesummary <- subset(damagesummary, total > 1500000000)
print(smalldamagesummary)
##                          event      propdmg     cropdmg        total
## 39                     DROUGHT   1046106000 13972566000  15018672000
## 59                 FLASH FLOOD  16140811510  1421317100  17562128610
## 72                       FLOOD 144657709800  5661968450 150319678250
## 116                       HAIL  15732266720  3025954450  18758221170
## 142  HEAVY RAIN/SEVERE WEATHER   2500000000           0   2500000000
## 174                  HIGH WIND   5270046260   638571300   5908617560
## 189                  HURRICANE  11868319010  2741910000  14610229010
## 195             HURRICANE OPAL   3172846000    19000000   3191846000
## 197          HURRICANE/TYPHOON  69305840000  2607872800  71913712800
## 206                  ICE STORM   3944927810  5022113500   8967041310
## 262                RIVER FLOOD   5118945500  5029459000  10148404500
## 299                STORM SURGE  43323536000        5000  43323541000
## 300           STORM SURGE/TIDE   4641188000      850000   4642038000
## 313          THUNDERSTORM WIND   3483121140   414843050   3897964190
## 328         THUNDERSTORM WINDS   1735952850   190654700   1926607550
## 354                    TORNADO  56937160480   414953110  57352113590
## 360 TORNADOES, TSTM WIND, HAIL   1600000000     2500000   1602500000
## 363             TROPICAL STORM   7703890550   678346000   8382236550
## 369                  TSTM WIND   4484928440   554007350   5038935790
## 412           WILD/FOREST FIRE   3001829500   106796830   3108626330
## 414                   WILDFIRE   4765114000   295472800   5060586800
## 424               WINTER STORM   6688497250    26944000   6715441250

This data table still contains a number of rows that need to be combined. For instance, “Flash Flood” and “River Flood” are combined into “Flood”.

smalldamagesummary$event <- replace(smalldamagesummary$event, 2, "FLOOD")
smalldamagesummary$event <- replace(smalldamagesummary$event, 5, "WIND")
smalldamagesummary$event <- replace(smalldamagesummary$event, 6, "WIND")
smalldamagesummary$event <- replace(smalldamagesummary$event, 8, "HURRICANE")
smalldamagesummary$event <- replace(smalldamagesummary$event, 9, "HURRICANE")
smalldamagesummary$event <- replace(smalldamagesummary$event, 10, "WINTER STORM")
smalldamagesummary$event <- replace(smalldamagesummary$event, 11, "FLOOD")
smalldamagesummary$event <- replace(smalldamagesummary$event, 13, "STORM SURGE")
smalldamagesummary$event <- replace(smalldamagesummary$event, 14, "WIND")
smalldamagesummary$event <- replace(smalldamagesummary$event, 15, "WIND")
smalldamagesummary$event <- replace(smalldamagesummary$event, 17, "TORNADO")
smalldamagesummary$event <- replace(smalldamagesummary$event, 18, "HURRICANE")
smalldamagesummary$event <- replace(smalldamagesummary$event, 19, "WIND")
smalldamagesummary$event <- replace(smalldamagesummary$event, 20, "WILDFIRE")
print(smalldamagesummary)
##            event      propdmg     cropdmg        total
## 39       DROUGHT   1046106000 13972566000  15018672000
## 59         FLOOD  16140811510  1421317100  17562128610
## 72         FLOOD 144657709800  5661968450 150319678250
## 116         HAIL  15732266720  3025954450  18758221170
## 142         WIND   2500000000           0   2500000000
## 174         WIND   5270046260   638571300   5908617560
## 189    HURRICANE  11868319010  2741910000  14610229010
## 195    HURRICANE   3172846000    19000000   3191846000
## 197    HURRICANE  69305840000  2607872800  71913712800
## 206 WINTER STORM   3944927810  5022113500   8967041310
## 262        FLOOD   5118945500  5029459000  10148404500
## 299  STORM SURGE  43323536000        5000  43323541000
## 300  STORM SURGE   4641188000      850000   4642038000
## 313         WIND   3483121140   414843050   3897964190
## 328         WIND   1735952850   190654700   1926607550
## 354      TORNADO  56937160480   414953110  57352113590
## 360      TORNADO   1600000000     2500000   1602500000
## 363    HURRICANE   7703890550   678346000   8382236550
## 369         WIND   4484928440   554007350   5038935790
## 412     WILDFIRE   3001829500   106796830   3108626330
## 414     WILDFIRE   4765114000   295472800   5060586800
## 424 WINTER STORM   6688497250    26944000   6715441250

These values are then summed up to make a table with unique rows.

finaldamagesummary <- aggregate(cbind(smalldamagesummary$propdmg, smalldamagesummary$cropdmg, smalldamagesummary$total), list(smalldamagesummary$event), sum)
colnames(finaldamagesummary) <- c("event", "propdmg", "cropdmg", "total")
print(finaldamagesummary)
##          event      propdmg     cropdmg        total
## 1      DROUGHT   1046106000 13972566000  15018672000
## 2        FLOOD 165917466810 12112744550 178030211360
## 3         HAIL  15732266720  3025954450  18758221170
## 4    HURRICANE  92050895560  6047128800  98098024360
## 5  STORM SURGE  47964724000      855000  47965579000
## 6      TORNADO  58537160480   417453110  58954613590
## 7     WILDFIRE   7766943500   402269630   8169213130
## 8         WIND  17474048690  1798076400  19272125090
## 9 WINTER STORM  10633425060  5049057500  15682482560

Results

Results for Population Health

The data for fatalities and injuries are formatted to make it easy to plot a bar graph. First, the log10 of the fatalities and injuries columns is taken to condense the range of values that must be plotted. The two log values are then added together (this sum is the log of the product of the two counts, not the log of the combined count) and the rows are sorted in decreasing order of that sum, from most hazardous to least hazardous.

# Take log10 of both fatalities and injuries columns
finalhealthsummary$fatalities <- log10(finalhealthsummary$fatalities)
finalhealthsummary$injuries <- log10(finalhealthsummary$injuries)
# Add values and order data
finalhealthsummary$total <- finalhealthsummary$fatalities + finalhealthsummary$injuries
finalhealthsummary <- finalhealthsummary[order(finalhealthsummary$total, decreasing=TRUE), ]
# Format table so barplot can graph the data
healthplot <- rbind(finalhealthsummary$fatalities, finalhealthsummary$injuries)
colnames(healthplot) <- finalhealthsummary$event
print(healthplot)
##      TORNADO     HEAT    FLOOD     WIND LIGHTNING WINTER STORM HURRICANE
## [1,] 3.75074 3.453318 3.160769 2.946943  2.911690     2.625312   1.80618
## [2,] 4.96069 3.935759 3.932778 3.981456  3.718502     3.635182   3.10551
##          HAIL
## [1,] 1.176091
## [2,] 3.133858
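Because the bars stack two logarithms, their combined length is not the logarithm of the total casualty count. A short check using the TORNADO counts from the summary table above (the variable names are chosen for illustration):

```r
# Fatalities and injuries for TORNADO, taken from the summary table above
fatalities <- 5633
injuries <- 91346

# Adding the two logs gives the log of the product of the counts
print(log10(fatalities) + log10(injuries))
print(log10(fatalities * injuries))

# That differs from the log of the combined casualty count
print(log10(fatalities + injuries))
```

The stacked bar length is therefore useful for ranking events, but it should not be read off the axis as a casualty count.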

The data can be plotted as a stacked bar graph to reveal that tornadoes are the most hazardous weather event for population health, in both the number of fatalities and the number of injuries. Heat events, which come in second, have a larger share of fatalities among their casualties (about 25%, versus about 6% for tornadoes).

par(mar = c(4,11,4,2))
barplot(healthplot, main = "Most Hazardous Weather Events to Population Health",
      horiz=TRUE,
      xlab = "log10(Health Event)",
      las=1,
      legend.text=c("Fatalities", "Injuries"),
      col = rainbow(2),
      axes = TRUE)
title(ylab="Weather Event", line=9)
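The share of casualties that were fatal can be computed directly from the untransformed counts, since the log scale of the plot hides it. A minimal sketch using the TORNADO and HEAT rows of the combined summary table (the vector names are assumptions):

```r
# Counts taken from the combined health summary table above
fatalities <- c(TORNADO = 5633, HEAT = 2840)
total <- c(TORNADO = 96979, HEAT = 11465)

# Share of casualties that were fatal, in percent
share <- round(100 * fatalities / total, 1)
print(share)
```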

Results for Economic Consequences

The data for property and crop damage values are also formatted to make it easy to plot a bar graph. First, the property damage and crop damage columns are scaled by dividing by 10,000,000,000, so that the values are expressed in tens of billions of dollars and are easier to compare. The scaled property and crop damage values are added together and the rows are sorted in decreasing order, from the most damage to the least.

# Scale both property damage value and crop damage value to tens of billions
finaldamagesummary$propdmg <- finaldamagesummary$propdmg/10000000000
finaldamagesummary$cropdmg <- finaldamagesummary$cropdmg/10000000000
# Add values and order data
finaldamagesummary$total <- finaldamagesummary$propdmg + finaldamagesummary$cropdmg
finaldamagesummary <- finaldamagesummary[order(finaldamagesummary$total, decreasing=TRUE), ]
# Format table so barplot can graph the data
damageplot <- rbind(finaldamagesummary$propdmg, finaldamagesummary$cropdmg)
colnames(damageplot) <- finaldamagesummary$event
print(damageplot)
##          FLOOD HURRICANE    TORNADO STORM SURGE      WIND      HAIL
## [1,] 16.591747 9.2050896 5.85371605   4.7964724 1.7474049 1.5732267
## [2,]  1.211274 0.6047129 0.04174531   0.0000855 0.1798076 0.3025954
##      WINTER STORM   DROUGHT   WILDFIRE
## [1,]    1.0633425 0.1046106 0.77669435
## [2,]    0.5049058 1.3972566 0.04022696

The data can be plotted as a stacked bar graph to reveal that floods are the most destructive weather event in economic terms, with the vast majority of the damage occurring to property. Hurricane events, which come in second, also incur the vast majority of their damage as property damage. The only weather event whose damage falls largely on crops is drought.

par(mar = c(4,11,4,2))
barplot(damageplot, main = "Most Costly Weather Events to Property and Crops",
      horiz=TRUE,
      xlab = "Damage Value (Tens of Billions)",
      las=1,
      legend.text=c("Property Damage", "Crop Damage"),
      col = rainbow(2),
      axes = TRUE)
title(ylab="Weather Event", line=9)
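With horiz=TRUE, barplot draws the first column at the bottom of the chart, so a table sorted in decreasing order puts the largest bar at the bottom. If the largest bar should appear at the top, the columns can be reversed before plotting. A sketch on a small matrix with values rounded from the output above (the names toyplot and reversed are hypothetical):

```r
# Toy matrix in the same layout as damageplot (values rounded from the output above)
toyplot <- rbind(c(16.6, 9.2, 5.9), c(1.2, 0.6, 0.04))
colnames(toyplot) <- c("FLOOD", "HURRICANE", "TORNADO")

# Horizontal bars are drawn bottom-up, so reverse the columns to put the largest on top
reversed <- toyplot[, ncol(toyplot):1]
barplot(reversed, horiz = TRUE, las = 1,
        legend.text = c("Property Damage", "Crop Damage"))
```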