The present analysis examines the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database in order to determine which types of weather events are the most harmful in terms of community health and economic impact. First, the NOAA data was cleaned up, event types belonging to the same category were combined, and very infrequent event types were excluded. Then, boxplots were created to show visually which type of weather was most harmful. The most harmful type of weather in terms of both fatalities and property damage was tornadoes, while the highest number of injuries was due to heat and drought. Wildfires and floods were also high in terms of property and crop damage.
First, the knitr package is loaded in.
library(knitr)
Next, we load the doBy package (installing it first if necessary). This will be useful in the final analysis, when we want to look at the amount of damage per weather type.
#install.packages("doBy")
library(doBy)
The working directory is set. This is necessary in order for R to be able to find the dataset which will be loaded into it.
setwd("C:/Users/Nina/Desktop/Essays/Coursera/Data Science/Reproducible Research/Programming Assignment 2")
Next, we load the storm dataset (collected by NOAA).
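# stringsAsFactors = TRUE keeps EVTYPE a factor on R >= 4.0.0, where the default changed to FALSE
dataset <- read.csv("repdata_data_StormData.csv.bz2", stringsAsFactors = TRUE)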
The attach function is used so that the dataset's columns can later be referred to directly by name, which keeps the commands concise.
attach(dataset)
## The following object is masked from package:base:
##
## F
A function which will be used later, %notin%, is defined. This will be used in filtering out some of the data.
`%notin%` <- Negate(`%in%`)
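As a quick illustration of how the negated operator behaves, here is a minimal sketch on a made-up vector; `events` and `keep` are hypothetical names used only for this example.

```r
# Toy illustration of a negated %in% operator (the vectors below are made up)
`%notin%` <- Negate(`%in%`)

events <- c("HAIL", "TORNADO", "SEICHE")
keep   <- c("HAIL", "TORNADO")

# TRUE marks elements of `events` that are absent from `keep`
not_kept <- events %notin% keep
print(not_kept)

# Subsetting with the logical vector returns the elements that would be filtered out
print(events[not_kept])
```

The operator returns one logical value per element of its left-hand side, so it can be used directly inside which() or as a logical index.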
We view EVTYPE through the summary() function to see what it consists of.
summary(EVTYPE)
## HAIL TSTM WIND THUNDERSTORM WIND
## 288661 219940 82563
## TORNADO FLASH FLOOD FLOOD
## 60652 54277 25326
## THUNDERSTORM WINDS HIGH WIND LIGHTNING
## 20843 20212 15754
## HEAVY SNOW HEAVY RAIN WINTER STORM
## 15708 11723 11433
## WINTER WEATHER FUNNEL CLOUD MARINE TSTM WIND
## 7026 6839 6175
## MARINE THUNDERSTORM WIND WATERSPOUT STRONG WIND
## 5812 3796 3566
## URBAN/SML STREAM FLD WILDFIRE BLIZZARD
## 3392 2761 2719
## DROUGHT ICE STORM EXCESSIVE HEAT
## 2488 2006 1678
## HIGH WINDS WILD/FOREST FIRE FROST/FREEZE
## 1533 1457 1342
## DENSE FOG WINTER WEATHER/MIX TSTM WIND/HAIL
## 1293 1104 1028
## EXTREME COLD/WIND CHILL HEAT HIGH SURF
## 1002 767 725
## TROPICAL STORM FLASH FLOODING EXTREME COLD
## 690 682 655
## COASTAL FLOOD LAKE-EFFECT SNOW FLOOD/FLASH FLOOD
## 650 636 624
## LANDSLIDE SNOW COLD/WIND CHILL
## 600 587 539
## FOG RIP CURRENT MARINE HAIL
## 538 470 442
## DUST STORM AVALANCHE WIND
## 427 386 340
## RIP CURRENTS STORM SURGE FREEZING RAIN
## 304 261 250
## URBAN FLOOD HEAVY SURF/HIGH SURF EXTREME WINDCHILL
## 249 228 204
## STRONG WINDS DRY MICROBURST ASTRONOMICAL LOW TIDE
## 196 186 174
## HURRICANE RIVER FLOOD LIGHT SNOW
## 174 173 154
## STORM SURGE/TIDE RECORD WARMTH COASTAL FLOODING
## 148 146 143
## DUST DEVIL MARINE HIGH WIND UNSEASONABLY WARM
## 141 135 126
## FLOODING ASTRONOMICAL HIGH TIDE MODERATE SNOWFALL
## 120 103 101
## URBAN FLOODING WINTRY MIX HURRICANE/TYPHOON
## 98 90 88
## FUNNEL CLOUDS HEAVY SURF RECORD HEAT
## 87 84 81
## FREEZE HEAT WAVE COLD
## 74 74 72
## RECORD COLD ICE THUNDERSTORM WINDS HAIL
## 64 61 61
## TROPICAL DEPRESSION SLEET UNSEASONABLY DRY
## 60 59 56
## FROST GUSTY WINDS THUNDERSTORM WINDSS
## 53 53 51
## MARINE STRONG WIND OTHER SMALL HAIL
## 48 48 47
## FUNNEL FREEZING FOG THUNDERSTORM
## 46 45 45
## Temperature record TSTM WIND (G45) Coastal Flooding
## 43 39 38
## WATERSPOUTS MONTHLY PRECIPITATION WINDS
## 37 36 36
## (Other)
## 2940
As we just saw, the dataset is not clean. For example, there are many event type labels which appear to be mistyped, as well as labels that seem to belong to the same category (e.g. “FLASH FLOODING” and “FLASH FLOOD”). We next take some time to clean up the dataset by merging redundant labels and combining closely related event types into categories.
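One way to reduce the number of near-duplicate labels before any recoding is to normalize case and whitespace. This is an optional preprocessing step that the analysis itself does not perform; the sketch below uses a made-up vector, and `normalize_label` is a hypothetical helper name.

```r
# Hypothetical helper: trim, collapse repeated whitespace, and upper-case labels
normalize_label <- function(x) {
  x <- trimws(as.character(x))   # drop leading/trailing whitespace
  x <- gsub("\\s+", " ", x)      # collapse runs of whitespace to one space
  toupper(x)                     # make the comparison case-insensitive
}

# Made-up labels resembling the kinds of inconsistencies seen in EVTYPE
raw   <- c("Coastal Flooding", "HAIL ", "Heavy  Rain", "FLASH FLOOD")
clean <- normalize_label(raw)
print(clean)
```

After such a step, labels like "Coastal Flooding" and "COASTAL FLOODING" would fall into the same level, so fewer explicit replacements would be needed.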
We copy EVTYPE into a new variable, EVTYPE2, so that the original values remain intact.
EVTYPE2 <- EVTYPE;
Now we can start cleaning up the dataset. The new category name is shown above each chunk of code.
FLOOD/FLASH FLOOD
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOODING"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN/SML STREAM FLD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RIVER FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN FLOODING"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Coastal Flooding"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOODS"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "URBAN/SMALL STREAM FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RIVER FLOODING"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LAKESHORE FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FLASH FLOOD/FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HIGH SURF"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "COASTAL FLOOD"), "FLOOD/FLASH FLOOD");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SURF/HIGH SURF"), "FLOOD/FLASH FLOOD");
HAIL
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE HAIL"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SMALL HAIL"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL "), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL 0.75"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL 100"), "HAIL");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HAIL 175"), "HAIL");
THUNDERSTORM
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TSTM WIND"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WIND"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TSTM WIND/HAIL"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDSS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TSTM WIND (G45)"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDS/HAIL"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SEVERE THUNDERSTORMS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORM WINDS HAIL"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "THUNDERSTORMS WINDS"), "THUNDERSTORM");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SEVERE THUNDERSTORM"), "THUNDERSTORM");
WIND
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HIGH WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE TSTM WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE THUNDERSTORM WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "STRONG WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HIGH WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "STRONG WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME WINDCHILL"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "COLD/WIND CHILL"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE HIGH WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "GUSTY WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MARINE STRONG WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINDS"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WIND DAMAGE"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "GUSTY WIND"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME WINDCHILL TEMPERATURES"), "WIND");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WIND ADVISORY"), "WIND");
RAIN
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY RAIN"), "RAIN");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Heavy Rain"), "RAIN");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD RAINFALL"), "RAIN");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MONTHLY RAINFALL"), "RAIN");
WINTER WEATHER
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINTER STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "BLIZZARD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "ICE STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINTER WEATHER/MIX"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FROST/FREEZE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME COLD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LAKE-EFFECT SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LIGHT SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "MODERATE SNOWFALL"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WINTRY MIX"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "FREEZE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW AND ICE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SNOW SQUALLS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Snow"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "ICY ROADS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXCESSIVE SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY LAKE SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME COLD/WIND CHILL"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "COLD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD COLD"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "ICE"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "LAKE EFFECT SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Light Snow"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW SQUALL"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW SQUALLS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Winter Weather"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SNOW/ICE STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAVY SNOW-SQUALLS"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "BLOWING SNOW"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "SLEET STORM"), "WINTER WEATHER");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY COOL"), "WINTER WEATHER");
TORNADO
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATER SPROUT"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATERSPROUT"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATERSPOUT"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WATERSPROUTs"), "TORNADO");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "TORNADO F0"), "TORNADO");
WILDFIRE
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "WILD/FOREST FIRE"), "WILDFIRE");
HEAT/DROUGHT
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DROUGHT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXCESSIVE HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DRY MICROBURST"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY WARM"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD WARMTH"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "RECORD HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "HEAT WAVE"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY DRY"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Temperature record"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "EXTREME HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DROUGHT/EXCESSIVE HEAT"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "UNSEASONABLY WARM AND DRY"), "HEAT/DROUGHT");
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "Record temperature"), "HEAT/DROUGHT");
FOG
EVTYPE2 <- replace(EVTYPE2, which(EVTYPE == "DENSE FOG"), "FOG");
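The chain of replace() calls can also be expressed as a single lookup table. The following is a minimal sketch on toy data, assuming exact-match recoding as in the code above; `lookup` and `recode_exact` are hypothetical names introduced here, and only a handful of the mappings are reproduced.

```r
# A few of the mappings from the chunks above, written as a named character vector
lookup <- c(
  "FLASH FLOODING" = "FLOOD/FLASH FLOOD",
  "FLASH FLOOD"    = "FLOOD/FLASH FLOOD",
  "TSTM WIND"      = "THUNDERSTORM",
  "MARINE HAIL"    = "HAIL"
)

# Hypothetical helper: replace labels found in the lookup, leave the rest unchanged
recode_exact <- function(x, lookup) {
  x   <- as.character(x)
  hit <- x %in% names(lookup)
  x[hit] <- unname(lookup[x[hit]])
  x
}

toy     <- c("FLASH FLOOD", "TSTM WIND", "TORNADO", "MARINE HAIL")
recoded <- recode_exact(toy, lookup)
print(recoded)
```

The helper works on a character copy and returns a character vector, so the result would need to be wrapped in factor() before calling summary() or droplevels() on it.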
We now check whether the event types were correctly merged and redistributed.
summary(EVTYPE2)
## THUNDERSTORM HAIL FLOOD/FLASH FLOOD
## 324644 289194 86593
## TORNADO WINTER WEATHER WIND
## 64467 45172 38930
## LIGHTNING RAIN FUNNEL CLOUD
## 15754 11780 6839
## HEAT/DROUGHT WILDFIRE FOG
## 5705 4218 1831
## TROPICAL STORM LANDSLIDE RIP CURRENT
## 690 600 470
## DUST STORM AVALANCHE RIP CURRENTS
## 427 386 304
## STORM SURGE FREEZING RAIN ASTRONOMICAL LOW TIDE
## 261 250 174
## HURRICANE STORM SURGE/TIDE COASTAL FLOODING
## 174 148 143
## DUST DEVIL FLOODING ASTRONOMICAL HIGH TIDE
## 141 120 103
## HURRICANE/TYPHOON FUNNEL CLOUDS HEAVY SURF
## 88 87 84
## TROPICAL DEPRESSION SLEET FROST
## 60 59 53
## OTHER FUNNEL FREEZING FOG
## 48 46 45
## WATERSPOUTS MONTHLY PRECIPITATION MIXED PRECIPITATION
## 37 36 34
## GLAZE HAIL 75 HEAVY RAINS
## 32 29 26
## LIGHT FREEZING RAIN UNSEASONABLY COLD VOLCANIC ASH
## 23 23 22
## SEICHE FREEZING DRIZZLE TIDAL FLOODING
## 21 20 20
## TSUNAMI UNSEASONABLY WET WIND CHILL
## 20 19 18
## PROLONG COLD BLACK ICE Glaze
## 17 14 11
## SMOKE SNOW FREEZING RAIN TYPHOON
## 11 11 11
## Cold DENSE SMOKE Gusty Winds
## 10 10 10
## MIXED PRECIP SNOW/SLEET TSTM WIND (G40)
## 10 10 10
## UNSEASONABLY HOT UNUSUAL WARMTH WATERSPOUT-
## 10 10 10
## DRY FIRST SNOW FREEZING RAIN/SLEET
## 9 9 9
## HEAVY RAINS/FLOODING High Surf HURRICANE OPAL
## 9 9 9
## MUDSLIDE COASTAL STORM Dust Devil
## 9 8 8
## FLASH FLOODING/FLOOD GRADIENT WINDS HEAVY MIX
## 8 8 8
## HIGH SEAS LANDSLIDES Mudslide
## 8 8 8
## RECORD SNOW Record Warmth UNUSUALLY COLD
## 8 8 8
## URBAN/SMALL STREAM WATERSPOUT/TORNADO WILDFIRES
## 8 8 8
## Freezing Rain HARD FREEZE HURRICANE ERIN
## 7 7 7
## LOW TEMPERATURE MUD SLIDE NON SEVERE HAIL
## 7 7 7
## SMALL STREAM FLOOD SNOW DROUGHT SNOW/BLOWING SNOW
## 7 7 7
## SNOW/ICE Strong Winds THUNDERSTORM WINDS
## 7 7 7
## (Other)
## 1405
We can see from here that some uncategorized event types remain. However, they represent a very small proportion of the data (about 3% of all records, as the summary of EVTYPE3 below shows). Hence, we considered it appropriate to leave these values out in order to make the figures clearer. It is unlikely that these infrequent events would prove to be the ones causing the most damage.
We create EVTYPE3 as a copy of EVTYPE2, as we did earlier for EVTYPE2, in order to protect the data.
EVTYPE3 <- EVTYPE2;
We list the major event categories (as summarized above) in validTypes. All other event types are set to NA and thereby excluded due to infrequency.
validTypes <- c("FLOOD/FLASH FLOOD","HAIL","THUNDERSTORM","WIND","RAIN","WINTER WEATHER","TORNADO","WILDFIRE","HEAT/DROUGHT","FOG");
EVTYPE3 <- replace(EVTYPE3,which(EVTYPE3 %notin% validTypes),NA);
EVTYPE3 <- droplevels(EVTYPE3)
We check if the values were correctly excluded.
summary(EVTYPE3)
##  FLOOD/FLASH FLOOD                FOG               HAIL       HEAT/DROUGHT
##              86593               1831             289194               5705
##               RAIN       THUNDERSTORM            TORNADO           WILDFIRE
##              11780             324644              64467               4218
##               WIND     WINTER WEATHER               NA's
##              38930              45172              29763
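The exclusion step can be illustrated on a small factor. This is a minimal sketch with made-up values; `ev` and `valid` are hypothetical names, and the logic mirrors the replace-with-NA and droplevels() calls used above.

```r
# Toy factor with one infrequent label that should be excluded
ev    <- factor(c("HAIL", "SEICHE", "TORNADO", "HAIL"))
valid <- c("HAIL", "TORNADO")

# Set everything outside the valid categories to NA ...
ev[!(ev %in% valid)] <- NA

# ... and drop the now-unused level so it no longer appears in summaries or plots
ev <- droplevels(ev)

print(levels(ev))
print(table(ev, useNA = "ifany"))
```

Without droplevels(), the excluded label would remain as an empty level and would show up as an empty box in a boxplot.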
First, we create boxplots of fatalities and of injuries. The two are analyzed separately, because it would probably be incorrect to lump them into one category: injuries might be minor and not require medical attention, whereas deaths are always a large loss. We also use functions from the doBy package to look at the fatalities per weather category.
We combine the two damage types into one category. Note that this sum uses the raw PROPDMG and CROPDMG magnitudes and does not apply the multipliers stored in the PROPDMGEXP and CROPDMGEXP columns, so the resulting values are not dollar amounts.
DAMAGE <- PROPDMG + CROPDMG
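The storm dataset stores each damage figure as a magnitude (PROPDMG, CROPDMG) plus an exponent code (PROPDMGEXP, CROPDMGEXP). The sketch below shows one way the codes could be applied to obtain dollar amounts. It is not part of the analysis reported here: `exp_multiplier` is a hypothetical helper, the toy data frame is made up, and treating blank or unrecognized codes as a multiplier of 1 is an assumption.

```r
# Hypothetical helper: map exponent codes to numeric multipliers
exp_multiplier <- function(code) {
  code <- toupper(as.character(code))
  mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)[code]
  mult[is.na(mult)] <- 1   # assumption: blank or unrecognized codes mean no scaling
  unname(mult)
}

# Made-up rows using the dataset's column names
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 10),
  PROPDMGEXP = c("K", "M", ""),
  CROPDMG    = c(0, 1, 5),
  CROPDMGEXP = c("", "K", "k")
)

# Combined damage in dollars under the assumptions stated above
toy$DAMAGE_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP) +
                  toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
print(toy)
```

Applying the multipliers would change the scale of the damage figures considerably, so the ranking of categories should be rechecked if this conversion is adopted.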
We collect the variables needed for the analysis into a data frame.
my_data <- data.frame(EVTYPE3, FATALITIES, INJURIES, DAMAGE)
fatalities_plot <- boxplot(FATALITIES ~ EVTYPE3, main = "Fatalities by Weather Type", xlab = "Weather Type", ylab = "Fatalities")
summaryBy(FATALITIES ~ EVTYPE3, data = my_data,
FUN = list(mean, median, sd))
## EVTYPE3 FATALITIES.mean FATALITIES.median FATALITIES.sd
## 1 FLOOD/FLASH FLOOD 0.0194011063 0 0.23335691
## 2 FOG 0.0436919716 0 0.36552578
## 3 HAIL 0.0000518683 0 0.01001381
## 4 HEAT/DROUGHT 0.5530236635 0 8.25069816
## 5 RAIN 0.0083191851 0 0.20253607
## 6 THUNDERSTORM 0.0021777701 0 0.06163957
## 7 TORNADO 0.0874245738 0 1.36982471
## 8 WILDFIRE 0.0206258890 0 0.34817371
## 9 WIND 0.0145902903 0 0.15896354
## 10 WINTER WEATHER 0.0206101125 0 0.21632331
## 11 <NA> 0.0735140947 0 0.47273456
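For readers without the doBy package, roughly the same per-category statistics can be obtained with base R's aggregate(). This is a minimal sketch on a made-up data frame; `toy`, `type`, and `fatalities` are hypothetical names.

```r
# Made-up data: two event types with a handful of records each
toy <- data.frame(
  type       = c("TORNADO", "TORNADO", "HAIL", "HAIL", "HAIL"),
  fatalities = c(0, 3, 0, 0, 0)
)

# One row per type; the statistics are stored as a matrix column named "fatalities"
stats <- aggregate(
  fatalities ~ type,
  data = toy,
  FUN  = function(v) c(mean = mean(v), median = median(v), sd = sd(v))
)
print(stats)
```

Unlike the formula interface used with boxplot(), aggregate() with a formula drops rows whose grouping value is NA by default, so an excluded category would not appear as a separate row.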
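In the table, heat and drought has the highest mean number of fatalities per event (0.55), but the boxplot shows an outlier value in that category (which might be due to measurement or input error). Other than that outlier, the highest fatalities are due to tornadoes.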
injuries_plot <- boxplot(INJURIES ~ EVTYPE3, main = "Injuries by Weather Type", xlab = "Weather Type", ylab = "Injuries")
summaryBy(INJURIES ~ EVTYPE3, data = my_data,
FUN = list(mean, median, sd))
## EVTYPE3 INJURIES.mean INJURIES.median INJURIES.sd
## 1 FLOOD/FLASH FLOOD 0.102456319 0 6.0321675
## 2 FOG 0.587657018 0 4.0048933
## 3 HAIL 0.004740762 0 0.3846984
## 4 HEAT/DROUGHT 1.610517090 0 16.3919417
## 5 RAIN 0.021307301 0 0.4780301
## 6 THUNDERSTORM 0.029167334 0 0.5402649
## 7 TORNADO 1.417391844 0 16.6671881
## 8 WILDFIRE 0.345187293 0 2.5868553
## 9 WIND 0.049062420 0 0.7865245
## 10 WINTER WEATHER 0.137784468 0 7.8883399
## 11 <NA> 0.313678057 0 5.4049244
The highest number of injuries per event is due to heat and drought (mean 1.61). This is followed by tornadoes (1.42), fog (0.59), and wildfires (0.35), the latter with an outlier value that skews the data.
The categories of property damage and crop damage were combined into one category of general economic damage above. We visualize the combined damage data in a boxplot.
damage_plot <- boxplot(DAMAGE ~ EVTYPE3, main = "Damage by Weather Type", ylab = "Damage (PROPDMG + CROPDMG, exponents not applied)", xlab = "Weather Type")
summaryBy(DAMAGE ~ EVTYPE3, data = my_data,
FUN = list(mean, median, sd))
## EVTYPE3 DAMAGE.mean DAMAGE.median DAMAGE.sd
## 1 FLOOD/FLASH FLOOD 32.451204 0.0 112.43893
## 2 FOG 9.325647 0.0 59.17839
## 3 HAIL 4.391972 0.0 37.85159
## 4 HEAT/DROUGHT 7.750449 0.0 55.50257
## 5 RAIN 5.349787 0.0 42.11475
## 6 THUNDERSTORM 8.847064 0.0 45.51774
## 7 TORNADO 51.525800 2.5 115.98509
## 8 WILDFIRE 31.379334 0.0 114.01762
## 9 WIND 12.233942 0.0 63.76805
## 10 WINTER WEATHER 9.469637 0.0 56.73931
## 11 <NA> 27.805950 0.0 92.86525
The highest amount of damage is done by tornadoes, which is also the only type of weather with a nonzero median. This is followed by floods and wildfires, which are close together.