SUMMARY

This analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to determine which types of weather events are the most harmful in terms of community health and economic impact. First, the NOAA data were cleaned: event labels belonging to the same category were combined, and very infrequent event types were excluded. Then, boxplots were created to show visually which type of weather was most harmful. The most harmful type of weather in terms of both fatalities and property damage was tornadoes, while the highest number of injuries was due to heat and drought. Floods and wildfires were also high in terms of property and crop damage.

DATA PROCESSING

First, the knitr package is loaded in.

library(knitr) 

Next, we load the doBy package (the commented-out line installs it if it is not already available). It will be useful in the final analysis, when we look at damage per weather type.

#install.packages("doBy")
library(doBy)

The working directory is set. This is necessary in order for R to be able to find the dataset which will be loaded into it.

setwd("C:/Users/Nina/Desktop/Essays/Coursera/Data Science/Reproducible Research/Programming Assignment 2")

Next, we load the storm dataset (collected by NOAA).

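# stringsAsFactors = TRUE keeps EVTYPE a factor; the default became FALSE in R 4.0.0
dataset <- read.csv("repdata_data_StormData.csv.bz2", stringsAsFactors = TRUE)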

The attach function is used so that columns can later be referred to by name alone (e.g. EVTYPE instead of dataset$EVTYPE), which keeps the commands concise.

attach(dataset)
## The following object is masked from package:base:
## 
##     F
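As the masking message shows, attach() can hide existing objects with the same name. For comparison, here is a minimal sketch of two ways to reach columns without attaching, using a small made-up data frame (`toy` and its values are hypothetical, not part of the storm data):

```r
# Hypothetical miniature data frame, for illustration only
toy <- data.frame(
  EVTYPE     = c("HAIL", "TORNADO", "HAIL"),
  FATALITIES = c(0, 2, 0)
)

# with() evaluates an expression using the columns of the data frame,
# without adding anything to the search path
total_fatalities <- with(toy, sum(FATALITIES))

# The $ operator names the data frame explicitly
hail_rows <- toy$EVTYPE == "HAIL"

print(total_fatalities)
print(sum(hail_rows))
```

The rest of this report keeps the attached columns, so this is only an alternative notation.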

A helper operator, %notin%, is defined. It will be used later to filter out some of the data.

`%notin%` <- Negate(`%in%`)
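As a quick illustration of how the operator behaves, here is a small sketch on made-up labels (the vectors `events` and `keep` are hypothetical):

```r
# Define the negated membership operator, as in the report
`%notin%` <- Negate(`%in%`)

# Hypothetical labels, for illustration only
events <- c("HAIL", "TORNADO", "SEICHE")
keep   <- c("HAIL", "TORNADO")

# TRUE marks the elements of `events` that are absent from `keep`
not_kept <- events %notin% keep
print(not_kept)
```

Only the label that does not appear in `keep` is flagged, which is the behaviour needed when everything outside a list of valid categories should be discarded.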

We view EVTYPE through the summary() function to see what it consists of.

summary(EVTYPE)
##                     HAIL                TSTM WIND        THUNDERSTORM WIND 
##                   288661                   219940                    82563 
##                  TORNADO              FLASH FLOOD                    FLOOD 
##                    60652                    54277                    25326 
##       THUNDERSTORM WINDS                HIGH WIND                LIGHTNING 
##                    20843                    20212                    15754 
##               HEAVY SNOW               HEAVY RAIN             WINTER STORM 
##                    15708                    11723                    11433 
##           WINTER WEATHER             FUNNEL CLOUD         MARINE TSTM WIND 
##                     7026                     6839                     6175 
## MARINE THUNDERSTORM WIND               WATERSPOUT              STRONG WIND 
##                     5812                     3796                     3566 
##     URBAN/SML STREAM FLD                 WILDFIRE                 BLIZZARD 
##                     3392                     2761                     2719 
##                  DROUGHT                ICE STORM           EXCESSIVE HEAT 
##                     2488                     2006                     1678 
##               HIGH WINDS         WILD/FOREST FIRE             FROST/FREEZE 
##                     1533                     1457                     1342 
##                DENSE FOG       WINTER WEATHER/MIX           TSTM WIND/HAIL 
##                     1293                     1104                     1028 
##  EXTREME COLD/WIND CHILL                     HEAT                HIGH SURF 
##                     1002                      767                      725 
##           TROPICAL STORM           FLASH FLOODING             EXTREME COLD 
##                      690                      682                      655 
##            COASTAL FLOOD         LAKE-EFFECT SNOW        FLOOD/FLASH FLOOD 
##                      650                      636                      624 
##                LANDSLIDE                     SNOW          COLD/WIND CHILL 
##                      600                      587                      539 
##                      FOG              RIP CURRENT              MARINE HAIL 
##                      538                      470                      442 
##               DUST STORM                AVALANCHE                     WIND 
##                      427                      386                      340 
##             RIP CURRENTS              STORM SURGE            FREEZING RAIN 
##                      304                      261                      250 
##              URBAN FLOOD     HEAVY SURF/HIGH SURF        EXTREME WINDCHILL 
##                      249                      228                      204 
##             STRONG WINDS           DRY MICROBURST    ASTRONOMICAL LOW TIDE 
##                      196                      186                      174 
##                HURRICANE              RIVER FLOOD               LIGHT SNOW 
##                      174                      173                      154 
##         STORM SURGE/TIDE            RECORD WARMTH         COASTAL FLOODING 
##                      148                      146                      143 
##               DUST DEVIL         MARINE HIGH WIND        UNSEASONABLY WARM 
##                      141                      135                      126 
##                 FLOODING   ASTRONOMICAL HIGH TIDE        MODERATE SNOWFALL 
##                      120                      103                      101 
##           URBAN FLOODING               WINTRY MIX        HURRICANE/TYPHOON 
##                       98                       90                       88 
##            FUNNEL CLOUDS               HEAVY SURF              RECORD HEAT 
##                       87                       84                       81 
##                   FREEZE                HEAT WAVE                     COLD 
##                       74                       74                       72 
##              RECORD COLD                      ICE  THUNDERSTORM WINDS HAIL 
##                       64                       61                       61 
##      TROPICAL DEPRESSION                    SLEET         UNSEASONABLY DRY 
##                       60                       59                       56 
##                    FROST              GUSTY WINDS      THUNDERSTORM WINDSS 
##                       53                       53                       51 
##       MARINE STRONG WIND                    OTHER               SMALL HAIL 
##                       48                       48                       47 
##                   FUNNEL             FREEZING FOG             THUNDERSTORM 
##                       46                       45                       45 
##       Temperature record          TSTM WIND (G45)         Coastal Flooding 
##                       43                       39                       38 
##              WATERSPOUTS    MONTHLY PRECIPITATION                    WINDS 
##                       37                       36                       36 
##                  (Other) 
##                     2940

As we just saw, the dataset is not clean. For example, many event-type labels appear to be mistyped, and others seem to belong to the same category (e.g. “FLASH FLOODING” and “FLASH FLOOD”). We next take some time to clean up the dataset by merging redundant labels and combining closely related event types into categories.
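Some of the inconsistency is only a matter of letter case or stray spaces (for example “Coastal Flooding” and “COASTAL FLOODING”). A minimal sketch of how such labels could be normalised before merging, on a made-up vector `raw`, is given here. This step is not applied in the report, and applying it to the real data would change the counts shown in later sections:

```r
# Hypothetical raw labels showing the kinds of inconsistency seen above
raw <- c("FLASH FLOOD", "Coastal Flooding", "COASTAL FLOODING",
         " HAIL", "THUNDERSTORM  WINDS")

# Upper-case, trim both ends, and collapse repeated internal spaces
normalised <- gsub("\\s+", " ", trimws(toupper(raw)))

# Count the labels after normalisation
label_counts <- table(normalised)
print(label_counts)
```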

We copy EVTYPE into a new variable so that the original values remain intact.

EVTYPE2 <- EVTYPE;

Now we can start cleaning up the dataset. The name of each new category is shown above the chunk of code that creates it.

FLOOD/FLASH FLOOD

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOODING"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN/SML STREAM FLD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RIVER FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN FLOODING"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Coastal Flooding"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOODS"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN/SMALL STREAM FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RIVER FLOODING"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LAKESHORE FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOOD/FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HIGH SURF"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "COASTAL FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SURF/HIGH SURF"), "FLOOD/FLASH FLOOD");

HAIL

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE HAIL"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SMALL HAIL"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL "), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL 0.75"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL 100"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL 175"), "HAIL");

THUNDERSTORM

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TSTM WIND"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WIND"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TSTM WIND/HAIL"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDSS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TSTM WIND (G45)"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDS/HAIL"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SEVERE THUNDERSTORMS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDS HAIL"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORMS WINDS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SEVERE THUNDERSTORM"), "THUNDERSTORM");

WIND

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HIGH WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE TSTM WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE THUNDERSTORM WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "STRONG WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HIGH WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "STRONG WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME WINDCHILL"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "COLD/WIND CHILL"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE HIGH WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "GUSTY WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE STRONG WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WIND DAMAGE"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "GUSTY WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME WINDCHILL TEMPERATURES"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WIND ADVISORY"), "WIND");

RAIN

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY RAIN"), "RAIN");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Heavy Rain"), "RAIN");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD RAINFALL"), "RAIN");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MONTHLY RAINFALL"), "RAIN");

WINTER WEATHER

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINTER STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "BLIZZARD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "ICE STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINTER WEATHER/MIX"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FROST/FREEZE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME COLD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LAKE-EFFECT SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LIGHT SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MODERATE SNOWFALL"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINTRY MIX"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FREEZE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW AND ICE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SNOW SQUALLS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Snow"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "ICY ROADS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXCESSIVE SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY LAKE SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME COLD/WIND CHILL"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "COLD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD COLD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "ICE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LAKE EFFECT SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Light Snow"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW SQUALL"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW SQUALLS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Winter Weather"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW/ICE STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SNOW-SQUALLS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "BLOWING SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SLEET STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY COOL"), "WINTER WEATHER");

TORNADO

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATER SPROUT"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATERSPROUT"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATERSPOUT"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATERSPROUTs"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TORNADO F0"), "TORNADO");

WILDFIRE

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WILD/FOREST FIRE"), "WILDFIRE");

HEAT/DROUGHT

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DROUGHT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXCESSIVE HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DRY MICROBURST"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY WARM"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD WARMTH"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAT WAVE"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY DRY"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Temperature record"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DROUGHT/EXCESSIVE HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY WARM AND DRY"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Record temperature"), "HEAT/DROUGHT");

FOG

EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DENSE FOG"), "FOG");
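The one-line-per-label replacements above could also be expressed as a lookup from category to raw labels. The following is a minimal sketch on a made-up factor (`evtype`, `mapping` and `recoded` are hypothetical names, and only a few labels are listed). It relies on the fact that giving several factor levels the same name merges them:

```r
# Hypothetical miniature of EVTYPE, stored as a factor like the real column
evtype <- factor(c("TSTM WIND", "THUNDERSTORM WIND", "FLOOD",
                   "FLASH FLOOD", "HAIL", "DENSE FOG"))

# Category name -> raw labels to fold into it (a subset, for illustration)
mapping <- list(
  "THUNDERSTORM"      = c("TSTM WIND", "THUNDERSTORM WIND"),
  "FLOOD/FLASH FLOOD" = c("FLOOD", "FLASH FLOOD"),
  "FOG"               = c("DENSE FOG")
)

recoded <- evtype
for (category in names(mapping)) {
  # Renaming several levels to the same name merges them into one level
  levels(recoded)[levels(recoded) %in% mapping[[category]]] <- category
}
print(table(recoded))
```

Keeping the mapping in one list makes it easier to review which raw labels end up in which category.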

We now check whether the event labels were correctly merged into the new categories.

summary(EVTYPE2)
##           THUNDERSTORM                   HAIL      FLOOD/FLASH FLOOD 
##                 324644                 289194                  86593 
##                TORNADO         WINTER WEATHER                   WIND 
##                  64467                  45172                  38930 
##              LIGHTNING                   RAIN           FUNNEL CLOUD 
##                  15754                  11780                   6839 
##           HEAT/DROUGHT               WILDFIRE                    FOG 
##                   5705                   4218                   1831 
##         TROPICAL STORM              LANDSLIDE            RIP CURRENT 
##                    690                    600                    470 
##             DUST STORM              AVALANCHE           RIP CURRENTS 
##                    427                    386                    304 
##            STORM SURGE          FREEZING RAIN  ASTRONOMICAL LOW TIDE 
##                    261                    250                    174 
##              HURRICANE       STORM SURGE/TIDE       COASTAL FLOODING 
##                    174                    148                    143 
##             DUST DEVIL               FLOODING ASTRONOMICAL HIGH TIDE 
##                    141                    120                    103 
##      HURRICANE/TYPHOON          FUNNEL CLOUDS             HEAVY SURF 
##                     88                     87                     84 
##    TROPICAL DEPRESSION                  SLEET                  FROST 
##                     60                     59                     53 
##                  OTHER                 FUNNEL           FREEZING FOG 
##                     48                     46                     45 
##            WATERSPOUTS  MONTHLY PRECIPITATION    MIXED PRECIPITATION 
##                     37                     36                     34 
##                  GLAZE                HAIL 75            HEAVY RAINS 
##                     32                     29                     26 
##    LIGHT FREEZING RAIN      UNSEASONABLY COLD           VOLCANIC ASH 
##                     23                     23                     22 
##                 SEICHE       FREEZING DRIZZLE         TIDAL FLOODING 
##                     21                     20                     20 
##                TSUNAMI       UNSEASONABLY WET             WIND CHILL 
##                     20                     19                     18 
##           PROLONG COLD              BLACK ICE                  Glaze 
##                     17                     14                     11 
##                  SMOKE     SNOW FREEZING RAIN                TYPHOON 
##                     11                     11                     11 
##                   Cold            DENSE SMOKE            Gusty Winds 
##                     10                     10                     10 
##           MIXED PRECIP             SNOW/SLEET        TSTM WIND (G40) 
##                     10                     10                     10 
##       UNSEASONABLY HOT         UNUSUAL WARMTH            WATERSPOUT- 
##                     10                     10                     10 
##                    DRY             FIRST SNOW    FREEZING RAIN/SLEET 
##                      9                      9                      9 
##   HEAVY RAINS/FLOODING              High Surf         HURRICANE OPAL 
##                      9                      9                      9 
##               MUDSLIDE          COASTAL STORM             Dust Devil 
##                      9                      8                      8 
##   FLASH FLOODING/FLOOD         GRADIENT WINDS              HEAVY MIX 
##                      8                      8                      8 
##              HIGH SEAS             LANDSLIDES               Mudslide 
##                      8                      8                      8 
##            RECORD SNOW          Record Warmth         UNUSUALLY COLD 
##                      8                      8                      8 
##     URBAN/SMALL STREAM     WATERSPOUT/TORNADO              WILDFIRES 
##                      8                      8                      8 
##          Freezing Rain            HARD FREEZE         HURRICANE ERIN 
##                      7                      7                      7 
##        LOW TEMPERATURE              MUD SLIDE        NON SEVERE HAIL 
##                      7                      7                      7 
##     SMALL STREAM FLOOD           SNOW DROUGHT      SNOW/BLOWING SNOW 
##                      7                      7                      7 
##               SNOW/ICE           Strong Winds    THUNDERSTORM  WINDS 
##                      7                      7                      7 
##                (Other) 
##                   1405

We can see that some uncategorized event types remain. However, they represent a very small proportion of the data. Hence, we consider it appropriate to leave these values out in order to make the figures clearer. It is unlikely that these infrequent events would prove to be the ones causing the most damage.

We create EVTYPE3 from EVTYPE2, as we did earlier when creating EVTYPE2, in order to protect the data.

EVTYPE3 <- EVTYPE2;

We list the major event categories (as defined above) in a vector. All other values will be set to NA and excluded due to infrequency.

validTypes <- c("FLOOD/FLASH FLOOD","HAIL","THUNDERSTORM","WIND","RAIN","WINTER WEATHER","TORNADO","WILDFIRE","HEAT/DROUGHT","FOG");
EVTYPE3 <- replace(EVTYPE3,which(EVTYPE3 %notin% validTypes),NA);
EVTYPE3 <- droplevels(EVTYPE3)

We check if the values were correctly excluded.

summary(EVTYPE3)  
## FLOOD/FLASH FLOOD               FOG              HAIL      HEAT/DROUGHT 
##             86593              1831            289194              5705 
##              RAIN      THUNDERSTORM           TORNADO          WILDFIRE 
##             11780            324644             64467              4218 
##              WIND    WINTER WEATHER              NA's 
##             38930             45172             29763
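To put the excluded records in proportion, the share of rows set to NA can be worked out from the counts in this summary. A small sketch of the arithmetic (the variable names are hypothetical; the numbers are copied from the output above):

```r
# Counts copied from the summary(EVTYPE3) output above
counts <- c(
  "FLOOD/FLASH FLOOD" = 86593, "FOG" = 1831, "HAIL" = 289194,
  "HEAT/DROUGHT" = 5705, "RAIN" = 11780, "THUNDERSTORM" = 324644,
  "TORNADO" = 64467, "WILDFIRE" = 4218, "WIND" = 38930,
  "WINTER WEATHER" = 45172
)
excluded <- 29763  # the NA's entry

total_records  <- sum(counts) + excluded
excluded_share <- excluded / total_records

# Percentage of records that were set to NA
print(round(100 * excluded_share, 1))
```

Roughly 3.3 percent of the records fall outside the ten categories.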

RESULTS

First, we create boxplots for fatalities and injuries separately. It would probably be incorrect to lump these two categories into one, as they are very different: injuries may be minor and not require medical attention, whereas deaths are always a great loss. We also use functions from the doBy package to look at fatalities per weather category.

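We combine the two damage types into one category. PROPDMG and CROPDMG are added as recorded, without applying the magnitude codes stored in PROPDMGEXP and CROPDMGEXP, so the combined figure is an unscaled value rather than a dollar amount.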

DAMAGE <- PROPDMG + CROPDMG
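The storm data also carries a magnitude code for each damage figure, in the columns PROPDMGEXP and CROPDMGEXP, which the sum above does not use. A minimal sketch of how the figures could be scaled to dollars before adding them is given here, on made-up rows. The helper `exp_multiplier` and the column `DAMAGE_USD` are hypothetical, and the handling of codes other than K, M and B is a simplifying assumption:

```r
# Hypothetical rows in the layout of the storm data (values are made up)
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  CROPDMG    = c(0, 5, 0, 3),
  CROPDMGEXP = c("", "k", "", "K"),
  stringsAsFactors = FALSE
)

# Assumed meaning of the codes: K = thousand, M = million, B = billion.
# Any other code (blank, digits, symbols) is treated as a multiplier of 1,
# which is a simplification.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

toy$DAMAGE_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
print(toy$DAMAGE_USD)
```

The results reported in this document are based on the unscaled sum, so a scaled version would produce different figures and possibly a different ranking.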

We collect the variables of interest in a data frame.

my_data <- data.frame(EVTYPE3, FATALITIES, INJURIES, DAMAGE)


fatalities_plot <- boxplot(FATALITIES ~ EVTYPE3, main = "Fatalities by Weather Type", xlab = "Weather Type", ylab = "Fatalities")

summaryBy(FATALITIES ~ EVTYPE3, data = my_data, 
          FUN = list(mean, median, sd))
##              EVTYPE3 FATALITIES.mean FATALITIES.median FATALITIES.sd
## 1  FLOOD/FLASH FLOOD    0.0194011063                 0    0.23335691
## 2                FOG    0.0436919716                 0    0.36552578
## 3               HAIL    0.0000518683                 0    0.01001381
## 4       HEAT/DROUGHT    0.5530236635                 0    8.25069816
## 5               RAIN    0.0083191851                 0    0.20253607
## 6       THUNDERSTORM    0.0021777701                 0    0.06163957
## 7            TORNADO    0.0874245738                 0    1.36982471
## 8           WILDFIRE    0.0206258890                 0    0.34817371
## 9               WIND    0.0145902903                 0    0.15896354
## 10    WINTER WEATHER    0.0206101125                 0    0.21632331
## 11              <NA>    0.0735140947                 0    0.47273456

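Other than an outlier value in the heat and drought category (which might be due to measurement or input error), the highest fatalities in the boxplot are due to tornadoes. In the table, heat and drought has the highest mean fatalities per event (0.55), followed by tornadoes (0.09).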
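The table above reports per-event means, which answer a slightly different question than total fatalities per category. A minimal base R sketch of computing both on a made-up data frame (the `toy` values are hypothetical and do not reflect the storm data):

```r
# Hypothetical miniature of my_data (values are made up)
toy <- data.frame(
  EVTYPE3    = c("TORNADO", "TORNADO", "HAIL", "HAIL", "HAIL", "HEAT/DROUGHT"),
  FATALITIES = c(3, 0, 0, 0, 1, 10)
)

# Mean per event and total per category, using base R only
per_event <- aggregate(FATALITIES ~ EVTYPE3, data = toy, FUN = mean)
totals    <- aggregate(FATALITIES ~ EVTYPE3, data = toy, FUN = sum)

# Sort categories by total, largest first
totals <- totals[order(-totals$FATALITIES), ]
print(per_event)
print(totals)
```

A frequent event type with a low per-event mean can still have a large total, so the two rankings need not agree.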

injuries_plot <- boxplot(INJURIES ~ EVTYPE3, main = "Injuries by Weather Type", xlab = "Weather Type", ylab = "Injuries")

summaryBy(INJURIES ~ EVTYPE3, data = my_data, 
          FUN = list(mean, median, sd))
##              EVTYPE3 INJURIES.mean INJURIES.median INJURIES.sd
## 1  FLOOD/FLASH FLOOD   0.102456319               0   6.0321675
## 2                FOG   0.587657018               0   4.0048933
## 3               HAIL   0.004740762               0   0.3846984
## 4       HEAT/DROUGHT   1.610517090               0  16.3919417
## 5               RAIN   0.021307301               0   0.4780301
## 6       THUNDERSTORM   0.029167334               0   0.5402649
## 7            TORNADO   1.417391844               0  16.6671881
## 8           WILDFIRE   0.345187293               0   2.5868553
## 9               WIND   0.049062420               0   0.7865245
## 10    WINTER WEATHER   0.137784468               0   7.8883399
## 11              <NA>   0.313678057               0   5.4049244

The highest mean number of injuries per event is due to heat and drought (1.61). This is followed by tornadoes (1.42), fog (0.59), and wildfires (0.35), the last with an outlier value that skews the data.

The categories of property damage and crop damage were combined into one category of general economic damage above. We visualize the combined damage data in a boxplot.

damage_plot <- boxplot(DAMAGE ~ EVTYPE3, main = "Damage by Weather Type", ylab = "Damage (PROPDMG + CROPDMG, unscaled)", xlab = "Weather Type")

summaryBy(DAMAGE ~ EVTYPE3, data = my_data, 
          FUN = list(mean, median, sd))
##              EVTYPE3 DAMAGE.mean DAMAGE.median DAMAGE.sd
## 1  FLOOD/FLASH FLOOD   32.451204           0.0 112.43893
## 2                FOG    9.325647           0.0  59.17839
## 3               HAIL    4.391972           0.0  37.85159
## 4       HEAT/DROUGHT    7.750449           0.0  55.50257
## 5               RAIN    5.349787           0.0  42.11475
## 6       THUNDERSTORM    8.847064           0.0  45.51774
## 7            TORNADO   51.525800           2.5 115.98509
## 8           WILDFIRE   31.379334           0.0 114.01762
## 9               WIND   12.233942           0.0  63.76805
## 10    WINTER WEATHER    9.469637           0.0  56.73931
## 11              <NA>   27.805950           0.0  92.86525

The highest mean damage is caused by tornadoes, which are also the only type of weather with a nonzero median. Floods and wildfires follow, with mean values close to each other.