Synopsis

This analysis describes the impact of storms and other severe weather events on public health and the economy of communities and municipalities in the US. The U.S. National Oceanic and Atmospheric Administration (NOAA) tracks the characteristics of major storms and weather events in the United States and regularly publishes the resulting data set.

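The report examines which types of weather events have the greatest impact on population health and which have the greatest economic impact.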

Data Processing

The data, obtained from the NOAA Storm Data website, is a CSV file compressed with bzip2. We first download the file into the working directory.
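The download step itself is not shown in this report. A minimal sketch of it, assuming a hypothetical helper named `fetch_if_missing` and a placeholder URL (neither comes from the original analysis), might look like this:

```r
# Hypothetical helper: download `url` to `dest` only when `dest` is missing.
# Returns TRUE if a download was attempted, FALSE if the file already existed.
fetch_if_missing <- function(url, dest) {
  if (file.exists(dest)) {
    return(FALSE)
  }
  # mode = "wb" keeps the compressed archive intact on Windows
  download.file(url, destfile = dest, mode = "wb")
  TRUE
}

# Placeholder URL: substitute the actual Storm Data location before running.
# data_url <- "https://example.com/repdata-data-StormData.csv.bz2"
# fetch_if_missing(data_url, "repdata-data-StormData.csv.bz2")
```

Skipping the download when the file is already present keeps repeated runs of the report fast. The archive does not need to be unpacked by hand, since `read.csv` can read a bzip2-compressed file directly.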

Load Libraries

## Load the required R libraries, installing tidyr only if it is missing
if (!requireNamespace("tidyr", quietly = TRUE)) {
    install.packages("tidyr", repos = "https://cran.rstudio.com/")
}
library(tidyr)
library(dplyr)
library(ggplot2)

Here we load the required data

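## read.csv reads the bzip2-compressed file directly; as_tibble() replaces the deprecated tbl_df()
StormData <- as_tibble(read.csv("repdata-data-StormData.csv.bz2"))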

After loading the dataset we check the number of observations and variables.

dim(StormData)
## [1] 902297     37

The dataset contains 902,297 observations of 37 variables.

Here are the first few rows of the Storm Data. We are particularly interested in the following variables: EVTYPE, FATALITIES, INJURIES, PROPDMG and CROPDMG.

head(StormData)
## Source: local data frame [6 x 37]
## 
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME  STATE
##     (dbl)             (fctr)   (fctr)    (fctr)  (dbl)     (fctr) (fctr)
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE     AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN     AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE     AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON     AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN     AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE     AL
## Variables not shown: EVTYPE (fctr), BGN_RANGE (dbl), BGN_AZI (fctr),
##   BGN_LOCATI (fctr), END_DATE (fctr), END_TIME (fctr), COUNTY_END (dbl),
##   COUNTYENDN (lgl), END_RANGE (dbl), END_AZI (fctr), END_LOCATI (fctr),
##   LENGTH (dbl), WIDTH (dbl), F (int), MAG (dbl), FATALITIES (dbl),
##   INJURIES (dbl), PROPDMG (dbl), PROPDMGEXP (fctr), CROPDMG (dbl),
##   CROPDMGEXP (fctr), WFO (fctr), STATEOFFIC (fctr), ZONENAMES (fctr),
##   LATITUDE (dbl), LONGITUDE (dbl), LATITUDE_E (dbl), LONGITUDE_ (dbl),
##   REMARKS (fctr), REFNUM (dbl)
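Only a handful of the 37 columns are used later in this report (EVTYPE, FATALITIES, INJURIES, PROPDMG and CROPDMG). As a sketch, narrowing the table to those columns could look like the following. The `toy_storms` data frame and its values are made up for illustration and stand in for `StormData`.

```r
library(dplyr)

# Toy stand-in for StormData with a few of its columns (values are made up).
toy_storms <- data.frame(
  STATE      = c("AL", "AL", "TX"),
  EVTYPE     = c("TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(1, 0, 2),
  INJURIES   = c(15, 0, 3),
  PROPDMG    = c(25, 2.5, 100),
  CROPDMG    = c(0, 1, 50),
  REFNUM     = 1:3
)

# Keep only the columns that the health and economic summaries use.
storm_subset <- toy_storms %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG)

names(storm_subset)
```

Working with the narrower table is optional. It mainly makes printed output easier to read and reduces memory use on the full data set.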

Summarizing the data

To present the analysis, we summarize the observations by health impact and by economic impact for each event type.

## Group the Storm Data by event type and sum fatalities and injuries separately, plus their combined total
##      Keep the event types with a positive total and gather the two measures into ImpactType / ImpactByType
##      After gather() each event type has two rows, so top_n(20, ...) keeps 20 rows,
##      i.e. the 10 event types with the highest totals (barring ties)
HealthImpactByStorm <- 
                StormData %>%
                group_by(EVTYPE) %>%
                summarise(Fatalities = sum(FATALITIES), Injuries = sum(INJURIES), TotalHealthImpact = sum(FATALITIES + INJURIES))  %>%
                filter(TotalHealthImpact > 0) %>%
                gather(ImpactType, ImpactByType, Fatalities:Injuries) %>%
                arrange(desc(TotalHealthImpact) ) %>%
                top_n(20, TotalHealthImpact)
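Because `gather()` runs before `top_n()` in the pipeline above, the cut is applied to rows rather than to event types. The sketch below illustrates the difference on made-up totals for 30 hypothetical event types. `toy_summary` and its values are not taken from the Storm Data.

```r
library(dplyr)
library(tidyr)

# Made-up totals for 30 hypothetical event types.
toy_summary <- data.frame(
  EVTYPE     = paste0("EVENT_", 1:30),
  Fatalities = 30:1,
  Injuries   = (30:1) * 10
) %>%
  mutate(TotalHealthImpact = Fatalities + Injuries)

# Same order as the pipeline above: reshape first, then keep the top 20 rows.
rows_after_gather <- toy_summary %>%
  gather(ImpactType, ImpactByType, Fatalities:Injuries) %>%
  top_n(20, TotalHealthImpact)

# Alternative order: keep the top 20 event types, then reshape.
events_before_gather <- toy_summary %>%
  top_n(20, TotalHealthImpact) %>%
  gather(ImpactType, ImpactByType, Fatalities:Injuries)

n_distinct(rows_after_gather$EVTYPE)     # event types kept by the first order
n_distinct(events_before_gather$EVTYPE)  # event types kept by the second order
```

On this toy data the first order keeps 10 event types (20 rows) and the second keeps 20 event types (40 rows). Which one is appropriate depends on how many bars the final plot should show.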


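## Group the Storm Data by event type and sum property and crop damage separately, plus their combined total
##      Note: PROPDMG and CROPDMG are summed as reported, without applying the
##      PROPDMGEXP / CROPDMGEXP magnitude codes, so the totals are not dollar amounts
##      Keep the event types with a positive total and gather the two measures into ImpactType / ImpactByType
##      After gather() each event type has two rows, so top_n(20, ...) keeps 20 rows,
##      i.e. the 10 event types with the highest totals (barring ties)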
EconomicImpactByStorm <- 
                StormData %>%
                group_by(EVTYPE) %>%
                summarise(PropertyDamage = sum(PROPDMG), CropDamage = sum(CROPDMG), TotalEconomicImpact = sum(PROPDMG + CROPDMG))  %>%
                filter(TotalEconomicImpact > 0) %>%
                gather(ImpactType, ImpactByType, PropertyDamage:CropDamage) %>%
                arrange(desc(TotalEconomicImpact) ) %>%
                top_n(20, TotalEconomicImpact)
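The summary above adds up PROPDMG and CROPDMG as reported. The data set also carries the magnitude columns PROPDMGEXP and CROPDMGEXP, so a dollar-based total would need to scale each value first. The following is a sketch only. The helper `exp_multiplier`, the toy data, and the treatment of every code other than K, M and B as a multiplier of 1 are assumptions, not part of the original analysis.

```r
library(dplyr)

# Assumed multipliers: K = thousands, M = millions, B = billions.
# Any other code (blank, digits, symbols) is treated as 1; this is a simplifying assumption.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  ifelse(is.na(mult), 1, mult)
}

# Made-up rows in the shape of the Storm Data damage columns.
toy_damage <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "TORNADO", "HAIL"),
  PROPDMG    = c(2, 500, 250, 10),
  PROPDMGEXP = c("M", "K", "K", ""),
  CROPDMG    = c(1, 0, 5, 3),
  CROPDMGEXP = c("M", "", "K", "K")
)

scaled <- toy_damage %>%
  mutate(
    PropertyDollars = PROPDMG * exp_multiplier(PROPDMGEXP),
    CropDollars     = CROPDMG * exp_multiplier(CROPDMGEXP)
  ) %>%
  group_by(EVTYPE) %>%
  summarise(
    PropertyDamage      = sum(PropertyDollars),
    CropDamage          = sum(CropDollars),
    TotalEconomicImpact = sum(PropertyDollars + CropDollars)
  ) %>%
  arrange(desc(TotalEconomicImpact))

scaled
```

In this toy example the unscaled sums would rank TORNADO first (755 against 3 for FLOOD), while the scaled totals rank FLOOD first. This shows how the ranking can depend on whether the magnitude codes are applied.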

Results

Weather Events and Population Health

The graph below shows that tornadoes are by far the weather event with the greatest impact on population health in the US, measured by injuries and fatalities combined.

ggplot(HealthImpactByStorm,aes(EVTYPE,ImpactByType, fill=ImpactType))+
        geom_bar(position = "stack",stat = "identity")+
        ggtitle("Impact of Storms on Health (Injuries and Fatalities) in the US") +
        xlab("Event Type") + 
        ylab("Total Impact") +
        guides(fill=guide_legend(title="Impact Type")) +
        theme(plot.title = element_text(lineheight=3, face="bold", color="black", size=13)
              , axis.text.x = element_text(angle = 90)
              , axis.title.x = element_text(size=10, face="bold")
              , axis.title.y = element_text(size=10, face="bold")
              , legend.title = element_text(size=10, face="bold")
        )
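In the plot above the bars appear in alphabetical order of EVTYPE, because ggplot2 orders a discrete axis by factor level. One way to sort the bars by total impact instead is `reorder()`. The sketch below uses made-up data (`toy_health`) in the same long format as `HealthImpactByStorm`.

```r
library(ggplot2)

# Made-up long-format data in the same shape as HealthImpactByStorm.
toy_health <- data.frame(
  EVTYPE            = rep(c("HEAT", "TORNADO", "FLOOD"), each = 2),
  ImpactType        = rep(c("Fatalities", "Injuries"), times = 3),
  ImpactByType      = c(30, 70, 50, 900, 20, 60),
  TotalHealthImpact = rep(c(100, 950, 80), each = 2)
)

# reorder() sorts the factor levels by -TotalHealthImpact, so the largest total comes first.
p <- ggplot(toy_health,
            aes(reorder(EVTYPE, -TotalHealthImpact), ImpactByType, fill = ImpactType)) +
  geom_col() +  # geom_col() is shorthand for geom_bar(stat = "identity"); bars stack by default
  xlab("Event Type") +
  ylab("Total Impact") +
  theme(axis.text.x = element_text(angle = 90))

p
```

Sorting by total makes the ranking readable at a glance, which matters when one event type dominates the others.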

Weather Event and Economic Impact

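The graph below shows that tornadoes rank highest for combined property and crop damage. The totals sum the PROPDMG and CROPDMG values as reported, without applying the PROPDMGEXP and CROPDMGEXP magnitude codes, so the ranking reflects unscaled figures rather than dollar amounts.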

ggplot(EconomicImpactByStorm,aes(EVTYPE,ImpactByType/1000, fill=ImpactType))+
        geom_bar(position = "stack",stat = "identity")+
        ggtitle("Impact of Storms on the Economy") +
        xlab("Event Type") + 
        ylab("Total Damage (thousands of reported units)") +
        guides(fill=guide_legend(title="Impact Type")) +
        theme(plot.title = element_text(lineheight=3, face="bold", color="black", size=15)
              , axis.text.x = element_text(angle = 90)
              , axis.title.x = element_text(size=12, face="bold")
              , axis.title.y = element_text(size=12, face="bold")
              , legend.title = element_text(size=12, face="bold")
        )