Synopsis

The National Oceanic and Atmospheric Administration (NOAA) keeps track of the amount of damage caused by natural events both physically and financially. Events are categorized into 16 discrete categories (Tornado, SNOW/ICE, STORMS, etc). The physical damage is tracked for properties and crops separately. This reports looks at the overall damage caused by these events by adding up the number of fatalities and injuries per category. In the report, I also take a look at the top 20 events recorded nationally. For financial harm, I computed the top 20 events that caused property damage and crop damage. This summary includes all the data process and computations note needed to reproduce the analysis. The results section includes the key data tables and figures all in one place.

Data Processing

As part of the data processing, I like to make sure that I have the needed libraries loaded (see “Loading the appropriate libraries) and that I access the file directly from the server (see”Getting the stormdata file from the coursera site").

To be able to answer the questions in this assignment, data variables need to be created (columns added to the data frame) by combining data from the existing data frame. Those steps are also shown below.

Loading the appropriate libraries

## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:data.table':
## 
##     between, first, last
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Attaching package: 'mice'
## The following object is masked from 'package:stats':
## 
##     filter
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
## 
## Attaching package: 'pastecs'
## The following objects are masked from 'package:dplyr':
## 
##     first, last
## The following objects are masked from 'package:data.table':
## 
##     first, last

Getting the Storm data from the coursera site and read the file

url <-"https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
destfile <-("~/Desktop/SantanaHardDisk/visualizationcourse/R_practice_files/reproducibility/week4/StormData.csv")
download.file(url, destfile)
stormdata <- read.table("StormData.csv", sep = ",", header = TRUE)
head(stormdata[1:5,1:4])
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE
## 1       1  4/18/1950 0:00:00     0130       CST
## 2       1  4/18/1950 0:00:00     0145       CST
## 3       1  2/20/1951 0:00:00     1600       CST
## 4       1   6/8/1951 0:00:00     0900       CST
## 5       1 11/15/1951 0:00:00     1500       CST

Creating a data subset

To help with the analysis it is helpful to create a more manegeable subset of the data.

SUBstormdata <- select(stormdata, c(1:8, 23:28))
head(SUBstormdata)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1          0       15    25.0          K       0           
## 2          0        0     2.5          K       0           
## 3          0        2    25.0          K       0           
## 4          0        2     2.5          K       0           
## 5          0        2     2.5          K       0           
## 6          0        6     2.5          K       0

Calculating the Health Impact

Determining the Healthy Impact by using the sum of the fatalities and injuries for that state.

The code below adds a column called ‘healthimpact’ by adding the fatalities and injuries together.

SUBstormdata$healthimpact <- SUBstormdata$FATALITIES + SUBstormdata$INJURIES
summary(SUBstormdata)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##      COUNTY       COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :100.6                                                           
##  3rd Qu.:131.0                                                           
##  Max.   :873.0                                                           
##    FATALITIES          INJURIES            PROPDMG         PROPDMGEXP       
##  Min.   :  0.0000   Min.   :   0.0000   Min.   :   0.00   Length:902297     
##  1st Qu.:  0.0000   1st Qu.:   0.0000   1st Qu.:   0.00   Class :character  
##  Median :  0.0000   Median :   0.0000   Median :   0.00   Mode  :character  
##  Mean   :  0.0168   Mean   :   0.1557   Mean   :  12.06                     
##  3rd Qu.:  0.0000   3rd Qu.:   0.0000   3rd Qu.:   0.50                     
##  Max.   :583.0000   Max.   :1700.0000   Max.   :5000.00                     
##     CROPDMG         CROPDMGEXP         healthimpact      
##  Min.   :  0.000   Length:902297      Min.   :   0.0000  
##  1st Qu.:  0.000   Class :character   1st Qu.:   0.0000  
##  Median :  0.000   Mode  :character   Median :   0.0000  
##  Mean   :  1.527                      Mean   :   0.1725  
##  3rd Qu.:  0.000                      3rd Qu.:   0.0000  
##  Max.   :990.000                      Max.   :1742.0000

Top 20 Health Impact Events

Now, I will calculate the total number of cases (sum of Health Impact) by Event Type.

SUBevtype <- group_by(SUBstormdata, SUBstormdata$EVTYPE)
SUBtableevtype <- summarise(SUBevtype, healthimpact = sum(healthimpact))
## `summarise()` ungrouping output (override with `.groups` argument)
SUBevtypes_top20 <- head(arrange(SUBtableevtype, desc(healthimpact)), 20)
print(SUBevtypes_top20)
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE` healthimpact
##    <chr>                        <dbl>
##  1 TORNADO                      96979
##  2 EXCESSIVE HEAT                8428
##  3 TSTM WIND                     7461
##  4 FLOOD                         7259
##  5 LIGHTNING                     6046
##  6 HEAT                          3037
##  7 FLASH FLOOD                   2755
##  8 ICE STORM                     2064
##  9 THUNDERSTORM WIND             1621
## 10 WINTER STORM                  1527
## 11 HIGH WIND                     1385
## 12 HAIL                          1376
## 13 HURRICANE/TYPHOON             1339
## 14 HEAVY SNOW                    1148
## 15 WILDFIRE                       986
## 16 THUNDERSTORM WINDS             972
## 17 BLIZZARD                       906
## 18 FOG                            796
## 19 RIP CURRENT                    600
## 20 WILD/FOREST FIRE               557

Visualizing the information is always helpful. Below is the same data in a barplot.

color <- brewer.pal(5, "Blues")

barplot(SUBevtypes_top20$healthimpact, main = "Top 20 Health Impact Events", 
        names.arg = SUBevtypes_top20$`SUBstormdata$EVTYPE`, cex.names = 0.6,
        col=color,
        xlab = "Event Type Count")

This graph indicates that TORNADO has had the most significant health impact among the top 20 events. In fact, it is approximately 12 times higher than the next highest impact event.

Economic Impact

Next, conversion values for propdmg and cropdmg variables where they are set to below values for propdmgexp values are included in two different columns. The conversation values used are: 1000 for K, 1000,000 for M, 1000,000,000 for B and 0. It is not clear what conversion value to use for the rest of the variables.

SUBstormdata$propconv <- 0
SUBstormdata$cropconv <- 0
head(SUBstormdata)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP healthimpact
## 1          0       15    25.0          K       0                      15
## 2          0        0     2.5          K       0                       0
## 3          0        2    25.0          K       0                       2
## 4          0        2     2.5          K       0                       2
## 5          0        2     2.5          K       0                       2
## 6          0        6     2.5          K       0                       6
##   propconv cropconv
## 1        0        0
## 2        0        0
## 3        0        0
## 4        0        0
## 5        0        0
## 6        0        0

The table below shows that there are 465858 row where PRPDMEXP=0–>0 indicates that there is no impact and thus they do not need to be included in the analysis.

 table(filter(SUBstormdata, SUBstormdata$PROPDMGEXP == "")[,11])
## 
##      0   0.41      1      2      3      4      5      6      7      8      9 
## 465858      1      4      7     16      9     11      6      3      2      3 
##     10     20     35     75 
##      8      4      1      1

Removing entries where the PRPDMGEXP = 0

SUBstormdata[(SUBstormdata$PROPDMGEXP == "K"),][,16] <- 1000
head(filter(SUBstormdata, SUBstormdata$PROPDMGEXP == "K"))
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP healthimpact
## 1          0       15    25.0          K       0                      15
## 2          0        0     2.5          K       0                       0
## 3          0        2    25.0          K       0                       2
## 4          0        2     2.5          K       0                       2
## 5          0        2     2.5          K       0                       2
## 6          0        6     2.5          K       0                       6
##   propconv cropconv
## 1     1000        0
## 2     1000        0
## 3     1000        0
## 4     1000        0
## 5     1000        0
## 6     1000        0
SUBstormdata[(SUBstormdata$PROPDMGEXP == "M"),][,16] <- 1000000
head(filter(SUBstormdata, SUBstormdata$PROPDMGEXP == "M"))
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  2/13/1952 0:00:00     2030       CST     43    CULLMAN    AL TORNADO
## 2       1  2/13/1952 0:00:00     2030       CST      9     BLOUNT    AL TORNADO
## 3       1  4/18/1953 0:00:00     1700       CST     81        LEE    AL TORNADO
## 4       1  4/24/1955 0:00:00     0545       CST    103     MORGAN    AL TORNADO
## 5       1 10/16/1955 0:00:00     1840       CST     31     COFFEE    AL TORNADO
## 6       1 10/16/1955 0:00:00     1840       CST     45       DALE    AL TORNADO
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP healthimpact
## 1          0        3     2.5          M       0                       3
## 2          0        3     2.5          M       0                       3
## 3          6      195     2.5          M       0                     201
## 4          5       20     2.5          M       0                      25
## 5          0        0     2.5          M       0                       0
## 6          0        5     2.5          M       0                       5
##   propconv cropconv
## 1    1e+06        0
## 2    1e+06        0
## 3    1e+06        0
## 4    1e+06        0
## 5    1e+06        0
## 6    1e+06        0
SUBstormdata[(SUBstormdata$PROPDMGEXP == "B"),][,16] <- 1000000000
head(filter(SUBstormdata, SUBstormdata$PROPDMGEXP == "B"))
##   STATE__          BGN_DATE BGN_TIME TIME_ZONE COUNTY
## 1       1 3/12/1993 0:00:00     2200       CST      0
## 2       1 10/4/1995 0:00:00     1200       CST      0
## 3      12 10/3/1995 0:00:00     1200       CST      0
## 4      12 3/12/1993 0:00:00     2200       EST      0
## 5      12 10/4/1995 0:00:00     0000       EST      0
## 6      17 8/31/1993 0:00:00     0000       CST      1
##                             COUNTYNAME STATE                     EVTYPE
## 1                           ALZ001>018    AL               WINTER STORM
## 2                           ALZ001>050    AL  HURRICANE OPAL/HIGH WINDS
## 3 FLZ001 - 002 - 003 - 004 - 005 - 006    FL             HURRICANE OPAL
## 4                           FLZ001>023    FL TORNADOES, TSTM WIND, HAIL
## 5           FLZ007>019 - 026>029 - 034    FL             HURRICANE OPAL
## 6            ADAMS, CALHOUN AND JERSEY    IL                RIVER FLOOD
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP healthimpact
## 1          4        0     5.0          B     0.0                       4
## 2          2        0     0.1          B    10.0          M            2
## 3          1        0     2.1          B     5.0          M            1
## 4         25        0     1.6          B     2.5          M           25
## 5          0        0     1.0          B     0.0                       0
## 6          0        0     5.0          B     5.0          B            0
##   propconv cropconv
## 1    1e+09        0
## 2    1e+09        0
## 3    1e+09        0
## 4    1e+09        0
## 5    1e+09        0
## 6    1e+09        0
SUBstormdata[!(SUBstormdata$PROPDMGEXP %in% c("K","M","B")),][,16] <- 0
head(SUBstormdata[!(SUBstormdata$PROPDMGEXP %in% c("K","M","B")),])
##    STATE__          BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 54       1 3/21/1955 0:00:00     1800       CST     73  JEFFERSON    AL
## 55       1 3/25/1955 0:00:00     1610       CST    103     MORGAN    AL
## 56       1 3/25/1955 0:00:00     1700       CST     83  LIMESTONE    AL
## 57       1  4/6/1955 0:00:00     0600       CST     73  JEFFERSON    AL
## 58       1  4/6/1955 0:00:00     1730       CST     57    FAYETTE    AL
## 59       1  4/6/1955 0:00:00     1816       CST      9     BLOUNT    AL
##       EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 54 TSTM WIND          0        0       0                  0           
## 55      HAIL          0        0       0                  0           
## 56      HAIL          0        0       0                  0           
## 57 TSTM WIND          0        0       0                  0           
## 58      HAIL          0        0       0                  0           
## 59 TSTM WIND          0        0       0                  0           
##    healthimpact propconv cropconv
## 54            0        0        0
## 55            0        0        0
## 56            0        0        0
## 57            0        0        0
## 58            0        0        0
## 59            0        0        0

Showing the CROPDMGEXP cells with value of 0

 table(filter(SUBstormdata, SUBstormdata$CROPDMGEXP == "")[,11])
## 
##      0   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08    0.1   0.11 
## 473376    901     75   1604      3    547     24      3      2   1221     27 
##   0.12   0.13   0.15   0.16   0.17    0.2   0.21   0.22   0.25    0.3   0.35 
##      6      1     39      1      2    415      2      1   1783    198      6 
##    0.4   0.45    0.5   0.51   0.55    0.6   0.65   0.66    0.7   0.75   0.76 
##     77      2   4551      2      2     55      3      1     34    306     10 
##   0.78    0.8   0.81   0.84   0.85   0.86    0.9   0.95   0.99      1   1.01 
##      1     77      1      1     12      1     30      1      1   9225      3 
##   1.02   1.03   1.04   1.05   1.06   1.07    1.1   1.11   1.13   1.14   1.15 
##      3      2      4      9      4      2     55      2      5      2     13 
##   1.16   1.18    1.2   1.24   1.25   1.26   1.27   1.28   1.29    1.3   1.32 
##      1      1    101      3     50      3      1      3      2     36      2 
##   1.33   1.34   1.35   1.37   1.38    1.4   1.41   1.43   1.45   1.46   1.47 
##      1      1      2      2      1     35      1      2      2      2      2 
##   1.48   1.49    1.5   1.51   1.53   1.54   1.55   1.57   1.58   1.59    1.6 
##      2      2    959      3      1      1      3      2      1      2     32 
##   1.61   1.62   1.65   1.68   1.69    1.7   1.71   1.72   1.73   1.74   1.75 
##      2      2      4      2      1     28      1      2      1      1     12 
##   1.76   1.77   1.78   1.79    1.8   1.85   1.86   1.87   1.88    1.9   1.92 
##      1      1      1      1     38      2      1      1      2     19      1 
##   1.95   1.99      2   2.01   2.03   2.05   2.09    2.1   2.16   2.19    2.2 
##      6      8   9672      1      3      4      1     21      2      1     21 
##   2.22   2.25   2.27    2.3   2.32   2.33   2.35   2.36   2.38    2.4   2.41 
##      1      9      1      8      2      1      1      1      2     17      1 
##   2.45   2.46   2.48    2.5   2.52   2.53   2.54   2.55   2.57   2.58    2.6 
##      2      1      1   7841      1      1      1      3      1      2     11 
##   2.65   2.66   2.69    2.7   2.73   2.75   2.77   2.78    2.8   2.81   2.82 
##      2      1      1     13      1      2      1      2     11      1      1 
##    2.9   2.95      3   3.02   3.04   3.05    3.1   3.13   3.15   3.17    3.2 
##      8      2   5686      1      2      4      5      1      2      1     16 
##   3.24   3.25    3.3   3.31   3.37    3.4   3.43   3.45   3.47    3.5   3.53 
##      1      5     15      1      1      5      2      1      1    216      2 
##   3.54   3.55   3.57    3.6   3.64   3.65   3.68    3.7   3.71   3.72   3.74 
##      1      3      1      8      2      1      1      2      1      1      1 
##   3.75   3.78    3.8   3.83    3.9   3.92   3.94   3.96      4   4.02    4.1 
##      1      1      6      1      5      2      1      1   1771      2      8 
##   4.12   4.15    4.2   4.22   4.25   4.26    4.3   4.36   4.38    4.4   4.43 
##      2      3     10      1      3      1      5      4      1      9      1 
##   4.44    4.5   4.51   4.52   4.57    4.6   4.65    4.7   4.71    4.8   4.85 
##     10     76      1      1      1      2      1      2      1      4      1 
##    4.9   4.96      5   5.05   5.08   5.09    5.1   5.13   5.15   5.16    5.2 
##      5      2  20197      2      1      1      4      1      1      1      5 
##   5.24   5.27    5.3    5.4   5.42   5.47    5.5   5.51   5.55   5.58    5.6 
##      1      1      3      6      1      1     34      1     13      1      4 
##    5.7   5.75    5.8    5.9   5.94   5.99      6   6.06   6.07    6.1   6.14 
##      4      1      4      2      1      2    896      1      1      2      1 
##    6.2   6.25    6.3   6.32   6.34    6.4   6.45    6.5   6.51   6.55    6.6 
##      2      3      2      1      1      4      2     22      1      1      3 
##   6.68    6.7   6.75    6.8    6.9      7   7.05   7.15    7.2   7.29    7.3 
##      1      3      2      2      1    689      2      1      3      1      2 
##   7.35   7.45    7.5   7.51   7.55    7.6   7.64    7.7   7.72   7.75    7.8 
##      1      2    122      1      1      5      1      5      1      1      3 
##    7.9      8   8.09    8.1    8.2   8.25    8.3   8.37    8.4   8.43   8.45 
##      3   1501      1      2      1      1      2      1      2      1      1 
##    8.5    8.6    8.7    8.8   8.85   8.87    8.9   8.97      9   9.06   9.17 
##     24      1      3      2      1      1      1      1    184      1      1 
##    9.2    9.3   9.31    9.5   9.51    9.6   9.72   9.76   9.77    9.8    9.9 
##      1      2      1      8      1      1      1      1      1      1      4 
##     10  10.05   10.1  10.15   10.2  10.25   10.3   10.4   10.5  10.72   10.8 
##  12072      1      1      1      5      1      2      2      7      1      1 
##  10.88     11  11.02   11.1  11.15  11.16  11.18  11.26   11.5   11.6  11.62 
##      1    142      2      3      1      1      1      1      7      2      2 
##   11.7  11.83  11.85     12  12.05   12.2   12.4   12.5   12.7  12.71   12.8 
##      3      1      1    681      1      1      1     10      1      1      3 
##   12.9     13  13.25   13.3  13.36   13.4  13.47   13.5  13.53   13.8  13.95 
##      2     97      1      3      1      1      1      4      1      1      1 
##     14   14.2  14.25  14.28   14.3   14.4   14.5   14.6  14.96  14.98     15 
##     95      2      2      1      1      2      2      1      1      1   4737 
##   15.3   15.5  15.75     16  16.05   16.1   16.2  16.25   16.5   16.6  16.74 
##      1      6      1     85      1      1      2      1      2      2      1 
##   16.9  16.93  16.96     17  17.03   17.3   17.5   17.6   17.7  17.75   17.8 
##      2      1      1     86      1      1      6      1      1      1      1 
##   17.9     18  18.05   18.5  18.54   18.8  18.97     19   19.2   19.3   19.5 
##      1    216      1      2      1      1      1     30      1      1      4 
##  19.64  19.77   19.9  19.94     20  20.02   20.1   20.4   20.5     21   21.1 
##      1      1      1      8   5227      2      1      1      1     32      1 
##   21.2   21.3   21.5   21.7  21.88     22  22.14  22.18   22.2   22.5   22.7 
##      1      1      6      1      1     93      1      1      1      2      1 
##  22.75  22.88     23   23.2  23.23   23.5   23.7     24   24.5   24.7     25 
##      1      1     46      1      2      2      1     34      2      1  13121 
##  25.13   25.5  25.52     26   26.2   26.3   26.5  26.87     27   27.5     28 
##      1      2      1     26      1      1      1      1     41      1     47 
##   28.5  28.68     29  29.96     30  30.06   30.3   30.5     31   31.3   31.5 
##      3      1     20      1   2447      1      1      1     17      1      4 
##  31.52  31.95     32   32.2  32.22   32.5     33   33.5     34  34.89     35 
##      1      2     20      1      1      1     21      1     16      1    657 
##     36   36.2     37   37.4   37.5     38   38.5     39   39.6     40   40.2 
##     14      1     16      1      9     26      2      4      1   1347      1 
##     41   41.7     42  42.31   42.4     43   43.6     44   44.6   44.7  44.72 
##      7      1     21      1      1      9      1     12      1      1      1 
##     45   45.5   45.7     46   46.5   46.8     47   47.3   47.5     48  48.02 
##    246      1      2     11      1      1     12      1      1      9      1 
##     49  49.94  49.98     50  50.02   50.1     51   51.5     52     53   53.8 
##      8      1      1   8484      1      1      8      1      6      7      1 
##     54   54.1   54.9     55  55.08   55.6   55.9     56  56.54     57  57.12 
##      3      1      1    134      1      1      1      7      1      6      1 
##     58     59     60     61  61.98     62     63     64     65   65.5     66 
##      9      2    578      6      1     10     11      6    112      1      3 
##   66.5   66.9     67     68     69     70     71   71.5     72   72.7     73 
##      1      1      4      8      2    341      4      1      3      1      7 
##     74  74.25     75   75.3     76   76.3     77     78   78.2     79   79.2 
##      2      1   1149      1      3      1      3      4      1      2      1 
##  79.98     80     81     82   82.5     83     84     85     86   86.6     87 
##      1    499      5      7      1      4      1     78      1      1      2 
##   87.8     88  88.15   88.5     89   89.5     90  90.43     91     92     93 
##      1      5      1      1      3      1    154      1      1      4      2 
##     94   94.5     95     96     97     98  98.26     99  99.97    100 100.02 
##      2      1     39      4      3      4      1      5      1   3251      1 
## 100.03  100.5    101    102    103    104    105    106 106.72    107    108 
##      1      1      5      3      1      3     20      4      1      1      7 
## 108.63    109    110    111    112    113    114    115    116    117    118 
##      1      1     76      1      2      3      7     25      2      1      4 
##    120  121.7    122    123  124.9    125    126    127  127.2    128    130 
##    150      1      2      1      1    224      2      3      1      2     58 
## 130.02    132    133    134  134.8    135    136  137.9    138  138.6    139 
##      1      4      1      1      1     25      1      1      1      1      1 
##    140    141    142    143    144    145    146  146.5    147    148 148.25 
##     40      1      1      1      1      9      1      2      3      1      1 
##    149 149.58 149.85    150    151  151.4    153 153.55    154    155    158 
##      2      1      1   1077      1      1      1      1      1     11      2 
##    159  159.5    160  160.8    161 161.11    162    163  163.5    164    165 
##      1      1     55      1      2      1      1      1      1      1     17 
##    166  166.5    167    168    170    171    172    173  174.4    175    176 
##      3      1      1      1     30      1      2      2      1    104      3 
##    177    178  178.4    179  179.4  179.5 179.61    180    183  183.5    184 
##      2      1      1      1      1      1      7     41      1      1      1 
##    185    186    187    189    190    192    193    195    196  198.5    200 
##      6      1      1      1     20      1      1      9      1      1   1168 
##    201    202    203    204    205    206    207    209    210    212    213 
##      1      2      2      1      5      1      2      1     25      1      2 
##    214    215    218    219    220    224    225    227    229  229.9    230 
##      1      6      1      1     21      2     46      3      1      1     18 
##    231    233    235    237    240    242    243    245    246  246.1    250 
##      2      1      5      2     18      1      2      3      1      1   7294 
## 250.03    253    255    257 257.95    259    260    261    262    265    266 
##      1      1      1      1      2      1     11      1      1      5      1 
##    269    270 270.75    271    275    277    278  278.6    279    280  280.1 
##      1      7      1      1     48      1      1      1      2     12      1 
##    283    284    285    286 287.18    288    290    294    295 297.08 299.88 
##      1      2      1      1      1      1      9      1      3      1      1 
##    300    303    304    305    308    310    312    315    320  322.2    324 
##    573      1      1      3      1      6      1      4     13      1      3 
##    325    327    328    330    331    335    337    340    343    345    346 
##     31      1      1     13      1      1      1      8      1      1      1 
##    347    348    350    352    354    355    357    358    359    360    363 
##      1      2    168      1      1      2      1      1      1      9      1 
##    365    367    370    373    375    378    380    381    382  382.5    385 
##      3      1      4      1     17      1     12      1      1      2      2 
##    390    400    405    410    411 411.14  413.5    415    420    425    430 
##      1    312      1      3      1      1      1      1      4      9      1 
## 431.72    432    435    437    438    440    442    445    450    451    459 
##      1      1      1      2      1      3      1      1     74      1      1 
##    460 460.56    463    465    467    470    475    476    478    479    480 
##      3      1      1      4      2      3      6      2      1      2      7 
##    482    485    490    493 499.92 499.96    500 500.01  500.4    501    502 
##      2      3      2      1      1      1   2491      2      1      1      2 
##  502.7    505    510 510.07    515    518    520    524    525    529    530 
##      1      1      6      1      3      1      3      1      9      1      2 
##  531.1    534    535    536    540    545    547    550    552    554    557 
##      1      1      2      1      4      1      3     35      1      1      1 
##    560    561    565    570 570.45    571    575    580    585    586    590 
##      3      1      1     24      1      1      7      3      2      1      2 
##    592    594    595    600    602    604    605    610    613    617    619 
##      1      1      2    130      1      2      1      2      1      1      1 
##    620    623    625    630    632    635    640    643    645 645.15    650 
##      2      1      5      3      1      1      5      1      2      1     31 
##    655    660    661    662    665    670    675    680    690  693.4  696.4 
##      1      1      1      1      3      3      2      3      1      1      1 
##    700    701    702    706    710    713    720    724    725  733.4    738 
##     94      1      1      1      1      1      4      1      1      1      1 
##    740    745    746    750    755 758.25    760    762    766    770    775 
##      1      1      1    202      1      1      1      1      1      2      4 
##  777.8    780    785    787 792.15    800 806.77    810    815    820    824 
##      1      2      1      1      1    116      1      3      1      1      1 
##    825    830    840    850  868.5    870    875    880    887    888    890 
##      5      1      1     21      1      1      1      2      1      1      1 
##    900    910    915    920    925    930    932    935    936    940    950 
##     52      2      1      2      3      2      2      1      1      1      8 
##  952.5    954    955    960    968    970    971    973    975    979    988 
##      1      1      1      2      1      1      1      1      2      1      1 
##    990    995    996 
##      1      1      1

This goes through the same process removing entries where CROPDMGEXP = 0

SUBstormdata[(SUBstormdata$CROPDMGEXP == "K"),][,17] <- 1000
head(filter(SUBstormdata, SUBstormdata$CROPDMGEXP == "K"))
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  6/26/1994 0:00:00     2025       CST     83  LIMESTONE    AL
## 2       1  3/24/1994 0:00:00     1410       CST      7       BIBB    AL
## 3       1  3/24/1994 0:00:00     1425       CST      7       BIBB    AL
## 4       1 11/28/1994 0:00:00     0030       CST      9     BLOUNT    AL
## 5       1  3/24/1994 0:00:00     1804       CST     17   CHAMBERS    AL
## 6       1   7/9/1995 0:00:00     1610       CST     19   CHEROKEE    AL
##                    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 1      THUNDERSTORM WINDS          0        0       5          M     500
## 2      THUNDERSTORM WINDS          0        0      50          K      50
## 3      THUNDERSTORM WINDS          0        2     500          K      50
## 4                 TORNADO          0        5     500          K       5
## 5                 TORNADO          0        0     500          K      50
## 6 THUNDERSTORM WINDS/HAIL          0        0       5          K      15
##   CROPDMGEXP healthimpact propconv cropconv
## 1          K            0    1e+06     1000
## 2          K            0    1e+03     1000
## 3          K            2    1e+03     1000
## 4          K            5    1e+03     1000
## 5          K            0    1e+03     1000
## 6          K            0    1e+03     1000
SUBstormdata[(SUBstormdata$CROPDMGEXP == "M"),][,17] <- 1000000
head(filter(SUBstormdata, SUBstormdata$CROPDMGEXP == "M"))
##   STATE__          BGN_DATE BGN_TIME TIME_ZONE COUNTY
## 1       1 10/4/1995 0:00:00     1200       CST      0
## 2       1  8/3/1995 0:00:00     0900       CST      0
## 3       1 10/3/1995 0:00:00     1200       CST      0
## 4       1  7/3/1994 0:00:00     0000       CST     77
## 5       1  7/3/1994 0:00:00     0000       CST     77
## 6       1 3/27/1994 0:00:00     1500       CST    107
##                                                     COUNTYNAME STATE
## 1                                                   ALZ001>050    AL
## 2 ALZ051 - 052 - 053 - 055 - 056 - 059 - 061 - 062 - 063 - 064    AL
## 3                                                   ALZ051>064    AL
## 4                                                   LAUDERDALE    AL
## 5                                                   LAUDERDALE    AL
## 6                                                      PICKENS    AL
##                      EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 1 HURRICANE OPAL/HIGH WINDS          2        0     0.1          B      10
## 2            HURRICANE ERIN          0        0    25.0          M       1
## 3            HURRICANE OPAL          0        0    48.0          M       4
## 4                  FLOODING          2        0    50.0          M       5
## 5                HEAVY RAIN          2        0    50.0          M       5
## 6        THUNDERSTORM WINDS          0        0   500.0          K       5
##   CROPDMGEXP healthimpact propconv cropconv
## 1          M            2    1e+09    1e+06
## 2          M            0    1e+06    1e+06
## 3          M            0    1e+06    1e+06
## 4          M            2    1e+06    1e+06
## 5          M            2    1e+06    1e+06
## 6          M            0    1e+03    1e+06
SUBstormdata[(SUBstormdata$CROPDMGEXP == "B"),][,17] <- 1000000000
head(filter(SUBstormdata, SUBstormdata$CROPDMGEXP == "B"))
##   STATE__          BGN_DATE    BGN_TIME TIME_ZONE COUNTY
## 1       1 8/20/1995 0:00:00        0000       CST    121
## 2      17 8/31/1993 0:00:00        0000       CST      1
## 3      19  8/1/1995 0:00:00        0000       CST      0
## 4      19 9/21/1995 0:00:00        2300       CST      0
## 5      28  2/9/1994 0:00:00        0000       CST      0
## 6      28 8/29/2005 0:00:00 08:00:00 AM       CST     18
##                                                                                     COUNTYNAME
## 1                                                                                    TALLADEGA
## 2                                                                    ADAMS, CALHOUN AND JERSEY
## 3 IAZ004>011 - 015 - 019 - 023>030 - 033>042 - 044>054 - 057>068 - 070>078 - 081>089 - 092>099
## 4   IAZ004>011 - 015>019 - 023>030 - 033>039 - 044>050 - 057>062 - 070>075 - 081>086 - 092>097
## 5                                                               MSZ001 - 023 - 025 - 026 - 034
## 6                                                               MSZ018>019 - 025>066 - 072>074
##   STATE            EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 1    AL              HEAT          0        0    0.00               0.40
## 2    IL       RIVER FLOOD          0        0    5.00          B    5.00
## 3    IA           DROUGHT          0        0    0.00               0.50
## 4    IA            FREEZE          0        0    0.00               0.20
## 5    MS         ICE STORM          0        0  500.00          K    5.00
## 6    MS HURRICANE/TYPHOON         15      104    5.88          B    1.51
##   CROPDMGEXP healthimpact propconv cropconv
## 1          B            0    0e+00    1e+09
## 2          B            0    1e+09    1e+09
## 3          B            0    0e+00    1e+09
## 4          B            0    0e+00    1e+09
## 5          B            0    1e+03    1e+09
## 6          B          119    1e+09    1e+09
SUBstormdata[!(SUBstormdata$CROPDMGEXP %in% c("K","M","B")),][,17] <- 0
head(SUBstormdata[!(SUBstormdata$CROPDMGEXP %in% c("K","M","B")),])
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP healthimpact
## 1          0       15    25.0          K       0                      15
## 2          0        0     2.5          K       0                       0
## 3          0        2    25.0          K       0                       2
## 4          0        2     2.5          K       0                       2
## 5          0        2     2.5          K       0                       2
## 6          0        6     2.5          K       0                       6
##   propconv cropconv
## 1     1000        0
## 2     1000        0
## 3     1000        0
## 4     1000        0
## 5     1000        0
## 6     1000        0

After cleaning up the data frame we are now ready to calculate the prop and crop damange amount. This can be achieved by multiplying the propdmg and crop dmg values by propconv and cropconv respectively. After this we are able to compute the top events that had maximum prop and crop damage.

SUBstormdata$PROPDMGAMT <- 0
SUBstormdata$CROPDMGAMT <- 0
head(SUBstormdata)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP healthimpact
## 1          0       15    25.0          K       0                      15
## 2          0        0     2.5          K       0                       0
## 3          0        2    25.0          K       0                       2
## 4          0        2     2.5          K       0                       2
## 5          0        2     2.5          K       0                       2
## 6          0        6     2.5          K       0                       6
##   propconv cropconv PROPDMGAMT CROPDMGAMT
## 1     1000        0          0          0
## 2     1000        0          0          0
## 3     1000        0          0          0
## 4     1000        0          0          0
## 5     1000        0          0          0
## 6     1000        0          0          0

Now that the variable columns have been created, we can run the analysis to find the top 20 events that had the most impact.

SUBstormdata$PROPDMGAMT <- SUBstormdata$propconv*SUBstormdata$PROPDMG
SUBstormdata$CROPDMGAMT <- SUBstormdata$cropconv*SUBstormdata$CROPDMG

grpbyevtype <- group_by(SUBstormdata,SUBstormdata$EVTYPE)
storm_prop_impact <- summarise(grpbyevtype,propimpact = sum(PROPDMGAMT))
## `summarise()` ungrouping output (override with `.groups` argument)
storm_crop_impact <- summarise(grpbyevtype,cropimpact = sum(CROPDMGAMT))
## `summarise()` ungrouping output (override with `.groups` argument)
storm_top20_propimpact<- head(arrange(storm_prop_impact,desc(propimpact)),20)
print(storm_top20_propimpact)
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE`       propimpact
##    <chr>                            <dbl>
##  1 FLOOD                     144657709800
##  2 HURRICANE/TYPHOON          69305840000
##  3 TORNADO                    56925660480
##  4 STORM SURGE                43323536000
##  5 FLASH FLOOD                16140811510
##  6 HAIL                       15727366720
##  7 HURRICANE                  11868319010
##  8 TROPICAL STORM              7703890550
##  9 WINTER STORM                6688497250
## 10 HIGH WIND                   5270046260
## 11 RIVER FLOOD                 5118945500
## 12 WILDFIRE                    4765114000
## 13 STORM SURGE/TIDE            4641188000
## 14 TSTM WIND                   4484928440
## 15 ICE STORM                   3944927810
## 16 THUNDERSTORM WIND           3483121140
## 17 HURRICANE OPAL              3152846000
## 18 WILD/FOREST FIRE            3001829500
## 19 HEAVY RAIN/SEVERE WEATHER   2500000000
## 20 THUNDERSTORM WINDS          1733452850
storm_top20_cropimpact<- head(arrange(storm_crop_impact,desc(cropimpact)),20)
print(storm_top20_cropimpact)
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE`  cropimpact
##    <chr>                       <dbl>
##  1 DROUGHT               13972566000
##  2 FLOOD                  5661968450
##  3 RIVER FLOOD            5029459000
##  4 ICE STORM              5022113500
##  5 HAIL                   3025537450
##  6 HURRICANE              2741910000
##  7 HURRICANE/TYPHOON      2607872800
##  8 FLASH FLOOD            1421317100
##  9 EXTREME COLD           1292973000
## 10 FROST/FREEZE           1094086000
## 11 HEAVY RAIN              733399800
## 12 TROPICAL STORM          678346000
## 13 HIGH WIND               638571300
## 14 TSTM WIND               554007350
## 15 EXCESSIVE HEAT          492402000
## 16 FREEZE                  446225000
## 17 TORNADO                 414953110
## 18 THUNDERSTORM WIND       414843050
## 19 HEAT                    401461500
## 20 WILDFIRE                295472800

To compute the events that had the most economic impact, I added up the property and crop damage.

SUBstormdata$ECONtot <- 0
SUBstormdata$ECONtot <- SUBstormdata$PROPDMGAMT + SUBstormdata$CROPDMGAMT

grpbyevtype <- group_by(SUBstormdata,SUBstormdata$EVTYPE)
storm_econ_impact<- summarise(grpbyevtype,econimpact = sum(ECONtot))
## `summarise()` ungrouping output (override with `.groups` argument)
top20_econ <-head(arrange(storm_econ_impact, desc(econimpact)),20)

Results

Overall, the table below shows the top 20 events are are most harmful with respect to population health.

print(SUBevtypes_top20)
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE` healthimpact
##    <chr>                        <dbl>
##  1 TORNADO                      96979
##  2 EXCESSIVE HEAT                8428
##  3 TSTM WIND                     7461
##  4 FLOOD                         7259
##  5 LIGHTNING                     6046
##  6 HEAT                          3037
##  7 FLASH FLOOD                   2755
##  8 ICE STORM                     2064
##  9 THUNDERSTORM WIND             1621
## 10 WINTER STORM                  1527
## 11 HIGH WIND                     1385
## 12 HAIL                          1376
## 13 HURRICANE/TYPHOON             1339
## 14 HEAVY SNOW                    1148
## 15 WILDFIRE                       986
## 16 THUNDERSTORM WINDS             972
## 17 BLIZZARD                       906
## 18 FOG                            796
## 19 RIP CURRENT                    600
## 20 WILD/FOREST FIRE               557

The table below shows the top 20 events with the biggest property damage.

print(storm_top20_propimpact)
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE`       propimpact
##    <chr>                            <dbl>
##  1 FLOOD                     144657709800
##  2 HURRICANE/TYPHOON          69305840000
##  3 TORNADO                    56925660480
##  4 STORM SURGE                43323536000
##  5 FLASH FLOOD                16140811510
##  6 HAIL                       15727366720
##  7 HURRICANE                  11868319010
##  8 TROPICAL STORM              7703890550
##  9 WINTER STORM                6688497250
## 10 HIGH WIND                   5270046260
## 11 RIVER FLOOD                 5118945500
## 12 WILDFIRE                    4765114000
## 13 STORM SURGE/TIDE            4641188000
## 14 TSTM WIND                   4484928440
## 15 ICE STORM                   3944927810
## 16 THUNDERSTORM WIND           3483121140
## 17 HURRICANE OPAL              3152846000
## 18 WILD/FOREST FIRE            3001829500
## 19 HEAVY RAIN/SEVERE WEATHER   2500000000
## 20 THUNDERSTORM WINDS          1733452850

This table shows the top 20 events with the biggest crop damage.

print(storm_top20_cropimpact)
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE`  cropimpact
##    <chr>                       <dbl>
##  1 DROUGHT               13972566000
##  2 FLOOD                  5661968450
##  3 RIVER FLOOD            5029459000
##  4 ICE STORM              5022113500
##  5 HAIL                   3025537450
##  6 HURRICANE              2741910000
##  7 HURRICANE/TYPHOON      2607872800
##  8 FLASH FLOOD            1421317100
##  9 EXTREME COLD           1292973000
## 10 FROST/FREEZE           1094086000
## 11 HEAVY RAIN              733399800
## 12 TROPICAL STORM          678346000
## 13 HIGH WIND               638571300
## 14 TSTM WIND               554007350
## 15 EXCESSIVE HEAT          492402000
## 16 FREEZE                  446225000
## 17 TORNADO                 414953110
## 18 THUNDERSTORM WIND       414843050
## 19 HEAT                    401461500
## 20 WILDFIRE                295472800

Finally, this last table shows the total economic impact, combining the property and crop financial damage.

print(top20_econ) 
## # A tibble: 20 x 2
##    `SUBstormdata$EVTYPE`       econimpact
##    <chr>                            <dbl>
##  1 FLOOD                     150319678250
##  2 HURRICANE/TYPHOON          71913712800
##  3 TORNADO                    57340613590
##  4 STORM SURGE                43323541000
##  5 HAIL                       18752904170
##  6 FLASH FLOOD                17562128610
##  7 DROUGHT                    15018672000
##  8 HURRICANE                  14610229010
##  9 RIVER FLOOD                10148404500
## 10 ICE STORM                   8967041310
## 11 TROPICAL STORM              8382236550
## 12 WINTER STORM                6715441250
## 13 HIGH WIND                   5908617560
## 14 WILDFIRE                    5060586800
## 15 TSTM WIND                   5038935790
## 16 STORM SURGE/TIDE            4642038000
## 17 THUNDERSTORM WIND           3897964190
## 18 HURRICANE OPAL              3161846000
## 19 WILD/FOREST FIRE            3108626330
## 20 HEAVY RAIN/SEVERE WEATHER   2500000000

A Multi-Panel Visualization

This visualization includes the Top 20 Overall Events and the Top 20 Economic Impact Events.

par(mfrow = c(1, 2))

barplot(SUBevtypes_top20$healthimpact, main = "Top 20 Overall Events", 
        names.arg = SUBevtypes_top20$`SUBstormdata$EVTYPE`, cex.names = 0.5,
        col=color,
        ylab = "Event Type Count")

barplot(top20_econ$econimpact, main="Top 20 Economic Impact Events",
        names.arg = top20_econ$`SUBstormdata$EVTYPE`, cex.names=0.6,
        col=color,
        xlab = "Event Type Count")