Introduction

Using data from the National Oceanic and Atmospheric Administration's (NOAA) storm database, this report explores the public health and economic impact of severe weather events in the United States. These events include excessive heat, tornadoes, blizzards (including those accompanied by high winds), coastal flooding, drought, dense smoke and dust, among others. Their impact is measured here as fatalities, injuries, and damage to crops and property, and it varies across states.


Data Processing

The Storm Data was downloaded from https://www.coursera.org/learn/reproducible-research/peer/OMZ37/course-project-2 as a zipped file into a working directory. The file was unzipped and the associated .csv file, repdata_data_StormData.csv, was read in as a data frame. This file contains 902297 observations of 37 variables, of which 8 were deemed relevant to the present analysis. STATE and EVTYPE identify a US state and a severe weather event respectively. The other six variables describe the damage to life and property caused by each event: fatalities (FATALITIES), injuries (INJURIES), property damage (PROPDMG) and its magnitude code (PROPDMGEXP, for example "K"), and crop damage (CROPDMG) and its magnitude code (CROPDMGEXP). This report focuses first on loss of life and injuries to persons, so no pre-processing of the data was required. The storm data give a count of fatalities and injuries for each event. The associated crop and property damage are recorded as numeric values and are summed here as recorded, without applying the magnitude codes. The first few rows of the selected variables are shown in the table below:

# 
knitr::opts_chunk$set(echo = TRUE, cache=TRUE, comment = NA, warning=FALSE, message=FALSE)
library(xtable)
library(dplyr)
library(ggplot2)
library(plotly)
library(gridExtra)
library(ggrepel)
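# Read the storm data into a data frame
mydata <- read.csv("repdata_data_StormData.csv", stringsAsFactors = FALSE)
# Keep the 8 variables used in this report (columns 7:8 and 23:28)
myData <- mydata[, c("STATE", "EVTYPE", "FATALITIES", "INJURIES",
                     "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]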
str(myData)
## 'data.frame':    902297 obs. of  8 variables:
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
knitr::kable(head(myData, 5), caption = ("Severe weather events with fatal and property damage across USA"))
Severe weather events with fatal and property damage across USA
STATE EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
AL TORNADO 0 15 25.0 K 0
AL TORNADO 0 0 2.5 K 0
AL TORNADO 0 2 25.0 K 0
AL TORNADO 0 2 2.5 K 0
AL TORNADO 0 2 2.5 K 0
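
The PROPDMGEXP and CROPDMGEXP columns hold a magnitude code for the damage figure next to them. The sketch below shows one way such codes could be turned into dollar amounts. It is not part of this report's analysis, which sums PROPDMG and CROPDMG as recorded. The helper `exp_to_multiplier`, the column `PROPDMG_USD` and the toy rows are hypothetical. The mapping assumes K, M and B mean thousand, million and billion, and treats every other code as 1, which is a simplification.

```r
# Toy rows shaped like the storm data; the values are made up for illustration.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Assumed multipliers: K = thousand, M = million, B = billion.
# Any other code (blank, digits, symbols) is treated as 1 here,
# which is a simplifying assumption rather than an official rule.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])
  mult[is.na(mult)] <- 1
  mult
}

# Scale each recorded figure by its multiplier.
toy$PROPDMG_USD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
toy
```

Applying the same helper to CROPDMG and CROPDMGEXP before summing would put the property and crop totals on a common dollar scale.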

Damage Assessment

The four tables below list the ten state and event-type combinations with the most fatalities, injuries, property damage and crop damage. Because the totals are grouped by both STATE and EVTYPE, a state can appear more than once in a table.

10 States with most fatalities

# Total each damage measure per state and event type, then rank by fatalities
FatalityDamage <- myData %>%
  select(STATE, EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG) %>%
  group_by(STATE, EVTYPE) %>%
  summarise(across(c(FATALITIES, INJURIES, PROPDMG, CROPDMG), sum), .groups = "drop") %>%
  arrange(desc(FATALITIES), desc(INJURIES))
knitr::kable(head(FatalityDamage, 10), caption = ("Table 1: 10 US States With Most Fatalities"))
Table 1: 10 US States With Most Fatalities
STATE EVTYPE FATALITIES INJURIES PROPDMG CROPDMG
IL HEAT 653 241 55.0 10.40
AL TORNADO 617 7929 167816.2 1652.70
TX TORNADO 538 8207 283097.2 4866.20
MS TORNADO 450 6244 187840.9 24964.20
MO TORNADO 388 4330 132159.9 2286.00
AR TORNADO 379 5116 119550.2 388.13
TN TORNADO 368 4748 112161.0 681.00
PA EXCESSIVE HEAT 359 320 0.0 0.00
IL EXCESSIVE HEAT 330 352 0.0 0.00
OK TORNADO 296 4829 165167.9 606.55
Top10 <-  paste(round(100*(sum(FatalityDamage$FATALITIES[1:10])/(sum(FatalityDamage$FATALITIES))), 2), "%", sep="")
Top10
[1] "28.91%"
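
The percentage above is the share of the overall total contributed by the ten largest state and event-type totals. As a minimal sketch, the same arithmetic can be wrapped in a small helper. The name `top_share` and the toy totals are hypothetical and are not part of the original analysis. Unlike the expression above, the helper sorts its input itself, so it does not depend on the data frame already being in descending order.

```r
# Share of a total contributed by the n largest values, as a percentage.
top_share <- function(x, n = 10) {
  x <- sort(x, decreasing = TRUE)
  round(100 * sum(head(x, n)) / sum(x), 2)
}

# Made-up totals for illustration only.
totals <- c(5, 30, 10, 50, 5)
top_share(totals, n = 2)
```

With the report's data, a call such as `top_share(FatalityDamage$FATALITIES)` would be expected to give the same number as the expression above, before the "%" sign is pasted on.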

10 States with most injuries

# Total each damage measure per state and event type, then rank by injuries
InjuryDamage <- myData %>%
  select(STATE, EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG) %>%
  group_by(STATE, EVTYPE) %>%
  summarise(across(c(FATALITIES, INJURIES, PROPDMG, CROPDMG), sum), .groups = "drop") %>%
  arrange(desc(INJURIES), desc(FATALITIES))
knitr::kable(head(InjuryDamage, 10), caption = ("Table 2: 10 US States With Most Injuries"))
Table 2: 10 US States With Most Injuries
STATE EVTYPE FATALITIES INJURIES PROPDMG CROPDMG
TX TORNADO 538 8207 283097.24 4866.20
AL TORNADO 617 7929 167816.20 1652.70
TX FLOOD 49 6338 25926.53 9701.60
MS TORNADO 450 6244 187840.91 24964.20
AR TORNADO 379 5116 119550.24 388.13
OK TORNADO 296 4829 165167.88 606.55
TN TORNADO 368 4748 112160.96 681.00
OH TORNADO 191 4438 95744.09 388.50
MO TORNADO 388 4330 132159.87 2286.00
IN TORNADO 252 4224 104686.49 516.00
#
Top10I <- paste(round(100*(sum(InjuryDamage$INJURIES[1:10])/(sum(InjuryDamage$INJURIES))), 2), "%", sep="")
Top10I
[1] "40.14%"

10 States with most property damage

# Total each damage measure per state and event type, then rank by property damage
PropertyDamage <- myData %>%
  select(STATE, EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG) %>%
  group_by(STATE, EVTYPE) %>%
  summarise(across(c(FATALITIES, INJURIES, PROPDMG, CROPDMG), sum), .groups = "drop") %>%
  arrange(desc(PROPDMG), desc(CROPDMG))
knitr::kable(head(PropertyDamage, 10), caption = ("Table 3: 10 US States With Most Property Damage"))
Table 3: 10 US States With Most Property Damage
STATE EVTYPE FATALITIES INJURIES PROPDMG CROPDMG
TX TORNADO 538 8207 283097.2 4866.20
MS TORNADO 450 6244 187840.9 24964.20
AL TORNADO 617 7929 167816.2 1652.70
OK TORNADO 296 4829 165167.9 606.55
FL TORNADO 161 3340 159752.6 148.50
IA TORNADO 81 2208 152142.8 4751.97
GA TORNADO 180 3926 151349.5 3792.50
TX TSTM WIND 42 484 144959.0 6859.95
KS TORNADO 236 2721 143209.9 5481.80
MO TORNADO 388 4330 132159.9 2286.00
Top10P <- paste(round(100*(sum(PropertyDamage$PROPDMG[1:10])/(sum(PropertyDamage$PROPDMG))), 2), "%", sep="")
Top10P
[1] "15.5%"

10 States with most crop damage

# Total each damage measure per state and event type, then rank by crop damage
CropDamage <- myData %>%
  select(STATE, EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG) %>%
  group_by(STATE, EVTYPE) %>%
  summarise(across(c(FATALITIES, INJURIES, PROPDMG, CROPDMG), sum), .groups = "drop") %>%
  arrange(desc(CROPDMG), desc(PROPDMG))
knitr::kable(head(CropDamage, 10), caption = ("Table 4: 10 US States With Most Crop Damage"))
Table 4: 10 US States With Most Crop Damage
STATE EVTYPE FATALITIES INJURIES PROPDMG CROPDMG
NE HAIL 0 93 67022.80 201031.15
TX HAIL 4 287 111287.90 103947.70
KS HAIL 0 121 48168.76 80734.15
IA HAIL 4 134 69847.35 47875.76
IA FLOOD 1 121 106589.09 43273.10
NE TSTM WIND 1 85 46771.50 37418.00
ND HAIL 0 27 18568.15 28818.70
WI FLASH FLOOD 7 16 43970.84 25645.37
IA FLASH FLOOD 6 15 103418.62 25187.50
NE FLASH FLOOD 3 4 23347.60 25018.17
#
Top10C <- paste(round(100*(sum(CropDamage$CROPDMG[1:10])/(sum(CropDamage$CROPDMG))), 2), "%", sep="")
Top10C
[1] "44.92%"

Results

Table 1 shows that the ten leading state and event-type combinations account for 28.91% of all fatalities; seven of the ten are tornado totals and the other three are heat or excessive heat totals. Table 2 shows that the ten leading combinations account for 40.14% of all injuries to persons; nine are tornado totals and one is a flood total. Table 3 shows that the ten leading combinations account for 15.5% of the summed property damage values; nine are tornado totals and one is a thunderstorm wind (TSTM WIND) total. Table 4 shows that the ten leading combinations account for 44.92% of the summed crop damage values, spread across hail, flood, flash flood and thunderstorm wind. These results suggest that tornadoes and other wind-related events may have a dominant impact on the public health and property damage caused by severe weather in the US, while crop damage is led by hail and flooding.
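
The tables rank state and event-type combinations, so one way to check a claim about an event type as a whole is to total the measures by EVTYPE alone. The sketch below does this with base R on a made-up data frame. The object names `toy` and `by_event`, the column `FATALITY_SHARE` and all values are hypothetical, and base R is used instead of dplyr only to keep the example self-contained.

```r
# Made-up rows shaped like the storm data, for illustration only.
toy <- data.frame(
  STATE      = c("AL", "TX", "IL", "TX", "IL"),
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "FLOOD", "HEAT"),
  FATALITIES = c(5, 3, 4, 1, 2),
  INJURIES   = c(40, 30, 6, 20, 1),
  stringsAsFactors = FALSE
)

# Total fatalities and injuries per event type, ignoring the state.
by_event <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank event types by fatalities and add each one's share of all fatalities.
by_event <- by_event[order(-by_event$FATALITIES), ]
by_event$FATALITY_SHARE <- round(100 * by_event$FATALITIES / sum(by_event$FATALITIES), 2)
by_event
```

Running the same steps on `myData` would give national totals per event type, which could then be compared with the state-level pattern in Tables 1 to 4.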

# Top 10 rows of Table 1. A state listed under two event types shares one stacked bar.
df <- data.frame(STATE = FatalityDamage$STATE[1:10],
                 FATALITIES = FatalityDamage$FATALITIES[1:10])
# Use a separate name for the plot so the FatalityDamage table is not overwritten
FatalityPlot <- ggplot(data = df, aes(x = STATE, y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = FATALITIES), vjust = 1.6, color = "white", size = 3.5) +
  theme_minimal() +
  # Add title and axis labels
  labs(
    title = "10 States with Most Fatalities",
    x = "US States",
    y = "Number of fatalities"
  )
# Top 10 rows of Table 2
df2 <- data.frame(STATE = InjuryDamage$STATE[1:10],
                  INJURIES = InjuryDamage$INJURIES[1:10])
InjDamage <- ggplot(data = df2, aes(x = STATE, y = INJURIES)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = INJURIES), vjust = 1.6, color = "white", size = 3.5) +
  theme_minimal() +
  # Add title and axis labels
  labs(
    title = "10 States with Most Injuries",
    x = "US States",
    y = "Number of injuries",
    caption = "Figure 1: Fatalities and injuries in 10 US states"
  )
# Display the fatality and injury bar charts side by side
grid.arrange(FatalityPlot, InjDamage, ncol = 2)
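
In the bar charts, a state that appears under two event types in a table (for example IL under HEAT and EXCESSIVE HEAT in Table 1) shares a single position on the x axis. A possible alternative, sketched below, is to label each bar by its state and event type so that every row gets its own bar. The three rows are taken from Table 1. The object names `top_rows` and `LABEL` are hypothetical, and the plot is drawn only if ggplot2 is installed.

```r
# Three rows copied from Table 1; IL appears under two event types.
top_rows <- data.frame(
  STATE      = c("IL", "AL", "IL"),
  EVTYPE     = c("HEAT", "TORNADO", "EXCESSIVE HEAT"),
  FATALITIES = c(653, 617, 330),
  stringsAsFactors = FALSE
)

# Build one unique label per row, e.g. "IL: HEAT".
top_rows$LABEL <- paste(top_rows$STATE, top_rows$EVTYPE, sep = ": ")

# Fix the bar order to descending fatalities instead of alphabetical order.
top_rows$LABEL <- factor(top_rows$LABEL,
                         levels = top_rows$LABEL[order(-top_rows$FATALITIES)])

# Draw the chart only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(top_rows, aes(x = LABEL, y = FATALITIES)) +
    geom_col(fill = "steelblue") +
    geom_text(aes(label = FATALITIES), vjust = 1.6, color = "white", size = 3.5) +
    theme_minimal() +
    labs(x = "State: event type", y = "Number of fatalities")
  print(p)
}
```

The trade-off is longer axis labels, which may need rotating or a horizontal layout when all ten rows are shown.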

# Property damage: top 10 rows of Table 3
df3 <- data.frame(STATE = PropertyDamage$STATE[1:10],
                  PROPDMG = PropertyDamage$PROPDMG[1:10])
PropDamage <- ggplot(data = df3, aes(x = STATE, y = PROPDMG)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = PROPDMG), vjust = 1.6, color = "white", size = 3.5) +
  theme_minimal() +
  # Add title and axis labels
  labs(
    title = "10 States with Most Property Damage",
    x = "US States",
    y = "Summed PROPDMG (as recorded)",
    caption = "Figure 2: Property damage in some US states"
  )
# Display chart
PropDamage

# Crop damage: top 10 rows of Table 4
df4 <- data.frame(STATE = CropDamage$STATE[1:10],
                  CROPDMG = CropDamage$CROPDMG[1:10])
CrpDamage <- ggplot(data = df4, aes(x = STATE, y = CROPDMG)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text_repel(aes(label = CROPDMG), vjust = 1.6, color = "white", size = 3.5) +
  theme_minimal() +
  # Add title and axis labels
  labs(
    title = "10 States with Most Crop Damage",
    x = "US States",
    y = "Summed CROPDMG (as recorded)",
    caption = "Figure 3: Crop damage in some US states"
  )
# Display chart
CrpDamage

Conclusion

The results of this analysis show that severe weather has a serious public health and economic impact in the United States. They also show that tornadoes account for most of the leading fatality and injury totals in the United States. For this reason, this report deems tornadoes the most harmful weather events for population health, on the view that loss of life and injury to persons outweigh loss of crops or property.