Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This analysis involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Goal

Data analysis must address the following questions:
1. Across the US, which types of events are most harmful with respect to population health?
2. Across the US, which types of events have the greatest economic consequences?

Data Source

NCDC receives Storm Data from the National Weather Service. The National Weather Service receives its information from a variety of sources, which include but are not limited to: county, state and federal emergency management officials, local law enforcement officials, skywarn spotters, NWS damage surveys, newspaper clipping services, the insurance industry and the general public.

Storm Data is an official publication of the National Oceanic and Atmospheric Administration (NOAA) which documents the occurrence of storms and other significant weather phenomena having sufficient intensity to cause loss of life, injuries, significant property damage, and/or disruption to commerce. In addition, it is a partial record of other significant meteorological events, such as record maximum or minimum temperatures or precipitation that occurs in connection with another event. Some information appearing in Storm Data may be provided by or gathered from sources outside the National Weather Service (NWS), such as the media, law enforcement and/or other government agencies, private companies, individuals, etc. An effort is made to use the best available information but because of time and resource constraints, information from these sources may be unverified by the NWS. Therefore, when using information from Storm Data, customers should be cautious as the NWS does not guarantee the accuracy or validity of the information.

Input Data

The data file was downloaded from the following link:
https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

The events in the database start in the year 1950 and end in November 2011.
In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

Loading Storm Data into R:

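storm.url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
storm.file <- "repdata_data_StormData.csv.bz2"
# Download once; mode = "wb" keeps the compressed file intact on Windows
if (!file.exists(storm.file)) download.file(storm.url, storm.file, mode = "wb")
# read.csv decompresses .bz2 files transparently; factor columns let summary() report per-level counts
Storms <- read.csv(storm.file, header = TRUE, stringsAsFactors = TRUE)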

Data Processing

Loading relevant packages:

library(rmarkdown)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(cowplot)
## 
## Attaching package: 'cowplot'
## The following object is masked from 'package:ggplot2':
## 
##     ggsave
library(sqldf)
## Loading required package: gsubfn
## Loading required package: proto
## Warning in doTryCatch(return(expr), name, parentenv, handler): unable to load shared object '/Library/Frameworks/R.framework/Resources/modules//R_X11.so':
##   dlopen(/Library/Frameworks/R.framework/Resources/modules//R_X11.so, 6): Library not loaded: /opt/X11/lib/libSM.6.dylib
##   Referenced from: /Library/Frameworks/R.framework/Resources/modules//R_X11.so
##   Reason: image not found
## Could not load tcltk.  Will use slower R code instead.
## Loading required package: RSQLite
## Loading required package: DBI
library(pander)

Let's check how many event types there are in the data table.

summary(Storms$EVTYPE)
##                     HAIL                TSTM WIND        THUNDERSTORM WIND 
##                   288661                   219940                    82563 
##                  TORNADO              FLASH FLOOD                    FLOOD 
##                    60652                    54277                    25326 
##       THUNDERSTORM WINDS                HIGH WIND                LIGHTNING 
##                    20843                    20212                    15754 
##               HEAVY SNOW               HEAVY RAIN             WINTER STORM 
##                    15708                    11723                    11433 
##           WINTER WEATHER             FUNNEL CLOUD         MARINE TSTM WIND 
##                     7026                     6839                     6175 
## MARINE THUNDERSTORM WIND               WATERSPOUT              STRONG WIND 
##                     5812                     3796                     3566 
##     URBAN/SML STREAM FLD                 WILDFIRE                 BLIZZARD 
##                     3392                     2761                     2719 
##                  DROUGHT                ICE STORM           EXCESSIVE HEAT 
##                     2488                     2006                     1678 
##               HIGH WINDS         WILD/FOREST FIRE             FROST/FREEZE 
##                     1533                     1457                     1342 
##                DENSE FOG       WINTER WEATHER/MIX           TSTM WIND/HAIL 
##                     1293                     1104                     1028 
##  EXTREME COLD/WIND CHILL                     HEAT                HIGH SURF 
##                     1002                      767                      725 
##           TROPICAL STORM           FLASH FLOODING             EXTREME COLD 
##                      690                      682                      655 
##            COASTAL FLOOD         LAKE-EFFECT SNOW        FLOOD/FLASH FLOOD 
##                      650                      636                      624 
##                LANDSLIDE                     SNOW          COLD/WIND CHILL 
##                      600                      587                      539 
##                      FOG              RIP CURRENT              MARINE HAIL 
##                      538                      470                      442 
##               DUST STORM                AVALANCHE                     WIND 
##                      427                      386                      340 
##             RIP CURRENTS              STORM SURGE            FREEZING RAIN 
##                      304                      261                      250 
##              URBAN FLOOD     HEAVY SURF/HIGH SURF        EXTREME WINDCHILL 
##                      249                      228                      204 
##             STRONG WINDS           DRY MICROBURST    ASTRONOMICAL LOW TIDE 
##                      196                      186                      174 
##                HURRICANE              RIVER FLOOD               LIGHT SNOW 
##                      174                      173                      154 
##         STORM SURGE/TIDE            RECORD WARMTH         COASTAL FLOODING 
##                      148                      146                      143 
##               DUST DEVIL         MARINE HIGH WIND        UNSEASONABLY WARM 
##                      141                      135                      126 
##                 FLOODING   ASTRONOMICAL HIGH TIDE        MODERATE SNOWFALL 
##                      120                      103                      101 
##           URBAN FLOODING               WINTRY MIX        HURRICANE/TYPHOON 
##                       98                       90                       88 
##            FUNNEL CLOUDS               HEAVY SURF              RECORD HEAT 
##                       87                       84                       81 
##                   FREEZE                HEAT WAVE                     COLD 
##                       74                       74                       72 
##              RECORD COLD                      ICE  THUNDERSTORM WINDS HAIL 
##                       64                       61                       61 
##      TROPICAL DEPRESSION                    SLEET         UNSEASONABLY DRY 
##                       60                       59                       56 
##                    FROST              GUSTY WINDS      THUNDERSTORM WINDSS 
##                       53                       53                       51 
##       MARINE STRONG WIND                    OTHER               SMALL HAIL 
##                       48                       48                       47 
##                   FUNNEL             FREEZING FOG             THUNDERSTORM 
##                       46                       45                       45 
##       Temperature record          TSTM WIND (G45)         Coastal Flooding 
##                       43                       39                       38 
##              WATERSPOUTS    MONTHLY PRECIPITATION                    WINDS 
##                       37                       36                       36 
##                  (Other) 
##                     2940
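
The summary above prints only the most frequent labels. As a minimal sketch, on a toy vector rather than the real EVTYPE column, here is how the distinct labels can be counted and how much of the variety comes only from letter case or stray whitespace. The padded " HAIL" value is made up for illustration; the other labels appear in this report.

```r
# Toy vector standing in for Storms$EVTYPE (" HAIL" is a hypothetical value)
evtype <- c("HAIL", "Coastal Flooding", "COASTAL FLOODING", " HAIL",
            "TSTM WIND", "Heavy Rain", "HEAVY RAIN")

# Number of distinct labels before any cleaning
n_raw <- length(unique(evtype))

# Normalize: trim surrounding whitespace, then convert to upper case
evtype_clean <- toupper(trimws(evtype))
n_clean <- length(unique(evtype_clean))

# Frequency table of the cleaned labels, most common first
sort(table(evtype_clean), decreasing = TRUE)
```

Normalizing like this before grouping would shorten the lists of labels that have to be typed by hand, although it does not merge true synonyms such as TSTM WIND and THUNDERSTORM WIND.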

985 types of events is too many. Let's simplify by reducing the number of unique event types from 985 to about 10 groups:

Storms$EVTYPE <- as.character(Storms$EVTYPE)


Storms$event <- ""
Storms$event[Storms$EVTYPE %in% c("FLOOD","FLASH FLOOD","FLOOD/FLASH FLOOD",
      "URBAN FLOOD","URBAN FLOODING","COASTAL FLOODING","RIVER FLOOD",
      "FLOODING","COASTAL FLOOD","Coastal Flooding","FLASH FLOODING", 
      "FLASH FLOODING/FLOOD", "TIDAL FLOODING", "FLASH FLOOD/FLOOD", 
      "LAKESHORE FLOOD", "RIVER FLOODING", "URBAN/SMALL STREAM FLOOD", 
      "FLASH FLOODS", "SMALL STREAM FLOOD", "URBAN/SMALL STREAM FLOODING", 
      "SMALL STREAM FLOODING", "ICE JAM FLOODING", 
      "URBAN AND SMALL STREAM FLOODIN", "FLOOD/RAIN/WINDS")] <- "Flood"
Storms$event[Storms$EVTYPE %in% c("RECORD COLD","WINTER WEATHER","EXCESSIVE HEAT",
      "WINTER WEATHER/MIX","EXTREME COLD","EXTREME COLD/WIND CHILL","COLD","FREEZE",
      "RECORD HEAT","HEAT WAVE","FROST","HEAT","FROST/FREEZE","Temperature record",
      "UNSEASONABLY HOT", "Cold", "Record temperature", "UNSEASONABLY COOL", 
      "PROLONG COLD", "WIND CHILL", "Winter Weather", "EXTREME WINDCHILL TEMPERATURES",
      "FREEZING DRIZZLE", "EXTREME HEAT", "UNSEASONABLY COLD", "FREEZING FOG", 
      "EXTREME/RECORD COLD", "DROUGHT/EXCESSIVE HEAT", "DROUGHT/EXCESSIVE HEAT",
      "UNUSUALLY COLD", "HARD FREEZE", "LOW TEMPERATURE", "AGRICULTURAL FREEZE", 
      "BITTER WIND CHILL TEMPERATURES", "HIGH WINDS/COLD", "RECORD COOL")] <- "Temperature"
Storms$event[Storms$EVTYPE %in% c("TSTM WIND","THUNDERSTORM WIND","STRONG WIND",
      "TSTM WIND/HAIL","HIGH WIND", "MARINE TSTM WIND","MARINE THUNDERSTORM WIND",
      "STRONG WINDS", "THUNDERSTORM WINDSS", "MARINE STRONG WIND",
      "MARINE HIGH WIND","WIND","GUSTY WINDS","TSTM WIND (G45)","WINDS", 
      "GRADIENT WINDS","TSTM WIND (G40)", "Gusty Winds", "WIND ADVISORY", 
      "GUSTY WIND", "THUNDERSTORM WINDS/HAIL", "WIND DAMAGE", "HIGH WINDS", 
      "COLD/WIND CHILL", "Strong Winds", "Wind Damage")] <- "Wind"

Storms$event[Storms$EVTYPE %in% c("HAIL","HEAVY RAIN","HEAVY SNOW","WINTER STORM",
      "ICE STORM","BLIZZARD","SMALL HAIL","MARINE HAIL","FREEZING RAIN",
      "THUNDERSTORM WINDS HAIL","MODERATE SNOWFALL","LIGHT SNOW","SNOW","ICE",
      "LAKE-EFFECT SNOW","SLEET","WINTRY MIX", "HEAVY RAINS/FLOODING", 
      "FREEZING RAIN/SLEET","FIRST SNOW", "SNOW/SLEET", "SNOW FREEZING RAIN",
      "MONTHLY RAINFALL", "BLOWING SNOW", "RAIN", "RECORD RAINFALL", 
      "BLACK ICE", "HEAVY SNOW-SQUALLS", "Heavy Rain", "SNOW/ICE STORM", 
      "SNOW SQUALLS", "HAIL 0.75", "SNOW SQUALL", "Light Snow", 
      "LAKE EFFECT SNOW", "LIGHT FREEZING RAIN", "HEAVY LAKE SNOW", 
      "EXCESSIVE SNOW", "HEAVY RAINS", "ICY ROADS", "HAIL 75", 
      "SNOW AND ICE", "EXCESSIVE RAINFALL", "HEAVY SNOW SQUALLS", 
      "Snow", "HAIL 100", "HAIL 175", "RECORD SNOW", "Freezing Rain",
      "NON SEVERE HAIL", "SNOW DROUGHT", "SNOW/BLOWING SNOW", 
      "SNOW/ICE", "SNOW/SLEET/FREEZING RAIN", "Blowing Snow", 
      "Black Ice", "SNOW AND SLEET")] <- "Rain/Snow"

Storms$event[Storms$EVTYPE %in% c("TORNADO", "TORNADO F0", "HURRICANE ERIN",
      "TORNADO F1")] <- "Tornado"
Storms$event[Storms$EVTYPE %in% c("WATERSPOUT", "WATERSPOUT-", "WATERSPOUTS",
      "WATERSPOUT/TORNADO")] <- "Waterspout"
Storms$event[Storms$EVTYPE %in% c("TYPHOON")] <- "Typhoon"
Storms$event[Storms$EVTYPE %in% c("WILDFIRES", "WILD FIRES", "BRUSH FIRE")] <- "Fire"

Storms$event[Storms$EVTYPE %in% c("LIGHTNING","WILDFIRE","WILD/FOREST FIRE")] <- "Fire"
Storms$event[Storms$EVTYPE %in% c("DROUGHT","OTHER","UNSEASONABLY WARM",
      "EXTREME WINDCHILL","DUST DEVIL","AVALANCHE","FOG","RIP CURRENT",
      "UNSEASONABLY DRY", "LANDSLIDES", "HIGH SEAS", "HEAVY MIX", 
      "MUDSLIDE", "DRY", "UNUSUAL WARMTH", "MIXED PRECIP", "DENSE SMOKE", 
      "SMOKE", "Glaze", "UNSEASONABLY WARM AND DRY", "UNSEASONABLY WET", 
      "SEICHE", "VOLCANIC ASH", "TROPICAL DEPRESSION", "RECORD WARMTH", 
      "DRY MICROBURST", "FUNNEL CLOUD", "DENSE FOG", "LANDSLIDE", 
      "DRY WEATHER", "URBAN/SML STREAM FLD", "RIP CURRENTS", 
      "MONTHLY PRECIPITATION", "MIXED PRECIPITATION", "GLAZE")] <- "Others"
Storms$event[Storms$EVTYPE %in% c("HIGH SURF","HEAVY SURF/HIGH SURF", 
      "ASTRONOMICAL LOW TIDE","HEAVY SURF","STORM SURGE/TIDE", "High Surf", 
      "TSUNAMI", "ASTRONOMICAL HIGH TIDE")] <- "Tides"
Storms$event[Storms$EVTYPE %in% c("TROPICAL STORM","DUST STORM","FUNNEL",
      "THUNDERSTORM","HURRICANE/TYPHOON", "FUNNEL CLOUDS","HURRICANE", 
      "COASTAL STORM", "HURRICANE OPAL", "SLEET STORM", "SEVERE THUNDERSTORMS", 
      "THUNDERSTORM WINDS", "FUNNEL CLOUD", "STORM SURGE", 
      "THUNDERSTORMS WINDS", "SEVERE THUNDERSTORM", "THUNDERSTORM  WINDS",
      "THUNDERSTORM WINDS LIGHTNING", "THUNDERSTORMS", 
      "THUNDERSTORM WIND/ TREES", "THUNDERSTORM WIND G50", 
      "THUNDERSTORM WIND 60 MPH", "SEVERE THUNDERSTORM WINDS", 
      "GUSTY THUNDERSTORM WINDS")] <- "Storm"

Storms$event[which(Storms$event=="")] <- "Others"

Storms$event <- as.factor(Storms$event)
summary(Storms$event)
##        Fire       Flood      Others   Rain/Snow       Storm Temperature 
##       19987       82571       12038      335337       29619       14319 
##       Tides     Tornado     Typhoon  Waterspout        Wind 
##        1491       60682          11        3851      342391

Now we have just 11 groups of events.
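
The grouping above is an explicit list of labels. As an alternative sketch, a keyword-based version could assign groups by pattern matching. The helper name `group_event`, the patterns, and their order are assumptions for illustration; this is not the mapping used in this report and it would give somewhat different counts.

```r
# Hypothetical helper: assign a coarse group by keyword, first matching pattern wins
group_event <- function(evtype) {
  x <- toupper(trimws(evtype))
  # Illustrative patterns only; they cover just four of the groups
  patterns <- c(Flood       = "FLOOD",
                Tornado     = "TORNADO",
                Wind        = "WIND",
                "Rain/Snow" = "HAIL|RAIN|SNOW|ICE|SLEET")
  out <- rep("Others", length(x))
  # Apply patterns in reverse so that earlier patterns overwrite later ones
  for (grp in rev(names(patterns))) {
    out[grepl(patterns[[grp]], x)] <- grp
  }
  out
}

grouped <- group_event(c("FLASH FLOOD", "Coastal Flooding", "TSTM WIND",
                         "HEAVY SNOW", "TORNADO F0", "SEICHE"))
```

A pattern approach catches spelling variants automatically, but it needs careful ordering for mixed labels such as FLOOD/RAIN/WINDS, which is one reason an explicit list can be the safer choice.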

Let's find the right damage amount for each event. The damage figures use the following multiplier codes:

summary(Storms$PROPDMGEXP)
##             -      ?      +      0      1      2      3      4      5 
## 465934      1      8      5    216     25     13      4      4     28 
##      6      7      8      B      h      H      K      m      M 
##      4      5      1     40      1      6 424665      7  11330
summary(Storms$CROPDMGEXP)
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994

Let's convert them to lower case:

Storms$PROPDMGEXP <- as.character(Storms$PROPDMGEXP)
Storms$CROPDMGEXP <- as.character(Storms$CROPDMGEXP)

Storms$PROPDMGEXP <- tolower(Storms$PROPDMGEXP)
Storms$CROPDMGEXP <- tolower(Storms$CROPDMGEXP)

Now translate the codes into numeric multipliers, stored in new columns of the same Storms table (codes other than h, k, m and b keep a multiplier of 1):

Storms$Crop.factor <- 1
Storms$Crop.factor[which(Storms$CROPDMGEXP=="h")] <- 100
Storms$Crop.factor[which(Storms$CROPDMGEXP=="k")] <- 1000
Storms$Crop.factor[which(Storms$CROPDMGEXP=="m")] <- 1000000
Storms$Crop.factor[which(Storms$CROPDMGEXP=="b")] <- 1000000000
Storms$Prop.factor <- 1
Storms$Prop.factor[which(Storms$PROPDMGEXP=="h")] <- 100
Storms$Prop.factor[which(Storms$PROPDMGEXP=="k")] <- 1000
Storms$Prop.factor[which(Storms$PROPDMGEXP=="m")] <- 1000000
Storms$Prop.factor[which(Storms$PROPDMGEXP=="b")] <- 1000000000
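
The same code-to-multiplier translation can be sketched with a named lookup vector. The names `exp_lookup` and `exp_to_factor` are hypothetical; the fallback of 1 for any other code mirrors the default used above.

```r
# Named lookup vector: exponent code -> multiplier
exp_lookup <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)

# Hypothetical helper: codes missing from the lookup (including "") fall back to 1
exp_to_factor <- function(code) {
  f <- unname(exp_lookup[tolower(code)])
  f[is.na(f)] <- 1
  f
}

factors <- exp_to_factor(c("K", "m", "B", "", "?", "5"))
```

For example, a record with PROPDMG equal to 25 and PROPDMGEXP equal to "K" stands for 25 * 1000 = 25,000 dollars.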

Then create a new column with the total damage (property plus crop), in millions of US dollars:

Storms$Damage <- (Storms$PROPDMG*Storms$Prop.factor+
                    Storms$CROPDMG*Storms$Crop.factor)/1000000 # in MM

Fine! We are ready to build the summary table. Summing the injuries, fatalities and damage per event group in a new table:

Storm.Groups = sqldf("SELECT event,
                          sum(INJURIES) as inj, 
                          sum(Damage) as prop, 
                          sum(FATALITIES) as fat,
                          COUNT(*) AS 'freq'
                   FROM Storms GROUP BY 1")
Storm.Groups$event <- as.character(Storm.Groups$event)
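
For readers without sqldf, the same kind of aggregation can be sketched in base R. The toy data frame below reuses the column names of the Storms table, but its values are made up. Note that the formula interface of `aggregate` drops rows containing NA, which may differ from the SQL behaviour on incomplete data.

```r
# Toy data with the same column names as the Storms table (values are made up)
toy <- data.frame(
  event      = c("Flood", "Flood", "Wind", "Tornado"),
  INJURIES   = c(2, 0, 1, 10),
  FATALITIES = c(0, 1, 0, 3),
  Damage     = c(1.5, 0.5, 0.1, 20)
)

# Sum the three measures per event group
sums <- aggregate(cbind(INJURIES, Damage, FATALITIES) ~ event, data = toy, FUN = sum)
names(sums) <- c("event", "inj", "prop", "fat")

# Count the records per group and attach the count as freq
counts <- as.data.frame(table(event = toy$event),
                        responseName = "freq", stringsAsFactors = FALSE)
toy.groups <- merge(sums, counts, by = "event")
```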

Now compute the injuries, damage and fatalities per recorded event in each group:

Storm.Groups$inj.per.event <- Storm.Groups$inj/Storm.Groups$freq
Storm.Groups$prpDMG.per.event <- Storm.Groups$prop/Storm.Groups$freq
Storm.Groups$fat.per.event <- Storm.Groups$fat/Storm.Groups$freq

And prepare the result table, ranking the groups from highest to lowest per-event value:

Storm.Ordered <- as.data.frame(x = c(1:11))
Storm.Ordered$Injures <- arrange(Storm.Groups, desc(inj.per.event))[,1]
Storm.Ordered$Fatalities <- arrange(Storm.Groups, desc(fat.per.event))[,1]
Storm.Ordered$Property.Damaged <- arrange(Storm.Groups, desc(prpDMG.per.event))[,1]
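
Ranking by per-event rate can give a different order than ranking by totals, because frequent groups spread their totals over many records. A small sketch with made-up numbers (not taken from the Storms table) shows the effect:

```r
# Made-up totals and frequencies for three groups
toy.groups <- data.frame(
  event = c("Wind", "Tornado", "Typhoon"),
  inj   = c(9000, 91000, 5),
  freq  = c(340000, 60000, 11)
)
toy.groups$inj.per.event <- toy.groups$inj / toy.groups$freq

# Order by total injuries, then by injuries per event (both descending)
by_total <- toy.groups$event[order(-toy.groups$inj)]
by_rate  <- toy.groups$event[order(-toy.groups$inj.per.event)]
```

Here the rare group moves ahead of the frequent one once the totals are divided by the number of records, which is why the report looks at both views.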

Results

Here is the ranking of event groups by losses per event:

names(Storm.Ordered) <- c("#", "by Injuries", "by Fatalities", "by Economic Losses")
pander(Storm.Ordered)
 #    by Injuries   by Fatalities   by Economic Losses
---- ------------- --------------- --------------------
 1    Tornado       Temperature     Typhoon
 2    Temperature   Tides           Storm
 3    Typhoon       Others          Tides
 4    Fire          Tornado         Flood
 5    Tides         Fire            Others
 6    Others        Flood           Tornado
 7    Flood         Storm           Fire
 8    Storm         Wind            Temperature
 9    Wind          Rain/Snow       Rain/Snow
 10   Rain/Snow     Waterspout      Wind
 11   Waterspout    Typhoon         Waterspout

A lower position number means a greater per-event impact on the US economy and public health.
Let's plot the event groups, starting with human injuries:

1 Highest level of injuries in the US

a1 <- ggplot(Storm.Groups, aes(x = event, y = inj, fill = event)) +
  geom_bar(stat = "identity") +
  scale_fill_hue(l=30) +
  coord_flip() +
  xlab("Event") +
  ylab("Number of injuries") +
  theme_minimal(base_size = 10) +
  guides(fill = "none") +
  ggtitle("Events with highest level of injuries in the US")

a2 <- ggplot(Storm.Groups, aes(x = event, y = inj.per.event, fill = event)) +
  geom_bar(stat = "identity") +
  scale_fill_hue(l=30) +
  coord_flip() +
  xlab("Event") +
  ylab("Injuries per event") +
  theme_minimal(base_size = 10) +
  guides(fill = "none") +
  ggtitle("Injuries per natural event")

plot_grid(a1, a2, ncol = 2, nrow = 1)

2 Highest level of deaths in the US

a1 <- ggplot(Storm.Groups, aes(x = event, y = fat, fill = event)) +
  geom_bar(stat = "identity") +
  scale_fill_hue(l=30) +
  coord_flip() +
  xlab("Event") +
  ylab("Number of deaths") +
  theme_minimal(base_size = 10) +
  guides(fill = "none") +
  ggtitle("Events with highest level of deaths in the US")

a2 <- ggplot(Storm.Groups, aes(x = event, y = fat.per.event, fill = event)) +
  geom_bar(stat = "identity") +
  scale_fill_hue(l=30) +
  coord_flip() +
  xlab("Event") +
  ylab("Deaths per event") +
  theme_minimal(base_size = 10) +
  guides(fill = "none") +
  ggtitle("Deaths per natural event")

plot_grid(a1, a2, ncol = 2, nrow = 1)

3 Economic losses of natural events in the US

a1 <- ggplot(Storm.Groups, aes(x = event, y = prop, fill = event)) +
  geom_bar(stat = "identity") +
  scale_fill_hue(l=30) +
  coord_flip() +
  xlab("Event") +
  ylab("Losses in mln US dollars") +
  theme_minimal(base_size = 10) +
  guides(fill = "none") +
  ggtitle("Economic losses of natural events in the US")

a2 <- ggplot(Storm.Groups, aes(x = event, y = prpDMG.per.event, fill = event)) +
  geom_bar(stat = "identity") +
  scale_fill_hue(l=30) +
  coord_flip() +
  xlab("Event") +
  ylab("Losses per event in mln US dollars") +
  theme_minimal(base_size = 10) +
  guides(fill = "none") +
  ggtitle("Economic losses per natural event")

plot_grid(a1, a2, ncol = 2, nrow = 1)
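
The six plots above share one recipe and differ only in the plotted column, the axis label and the title. As a sketch, the recipe could be wrapped in a helper. The name `plot_metric` is hypothetical, and the code assumes a ggplot2 version that provides `geom_col()` and the `.data` pronoun (3.0.0 or later); the toy data values are made up.

```r
library(ggplot2)

# Hypothetical helper wrapping the repeated bar-plot recipe
plot_metric <- function(data, metric, y_label, title) {
  ggplot(data, aes(x = event, y = .data[[metric]], fill = event)) +
    geom_col() +                      # same as geom_bar(stat = "identity")
    scale_fill_hue(l = 30) +
    coord_flip() +
    xlab("Event") +
    ylab(y_label) +
    theme_minimal(base_size = 10) +
    guides(fill = "none") +           # hide the redundant fill legend
    ggtitle(title)
}

# Toy input with the column names used in Storm.Groups (values are made up)
toy <- data.frame(event = c("Flood", "Wind"), inj = c(3, 1))
p <- plot_metric(toy, "inj", "Number of injuries", "Toy example")
```

With such a helper, each pair of panels would reduce to two calls followed by `plot_grid()`.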