WEATHER EVENTS IMPACT ON POPULATION HEALTH AND ECONOMY

Synopsis

The basic goal of this assignment is to explore the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database and answer some basic questions about severe weather events.

This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The events in the database start in the year 1950 and end in November 2011.

We give answers to the following questions:

  1. Across the United States, which types of events are most harmful with respect to population health?

  2. Across the United States, which types of events have the greatest economic consequences?

From the analysis we can say:

  1. Tornadoes are responsible for the most fatalities and injuries.

  2. Floods are responsible for the most property damage, while droughts cause the most crop damage.

Data Processing

1. Load packages and data

library(dplyr)
library(ggplot2)
library(gridExtra)
library(rmarkdown)
library(knitr)
data <- read.csv("repdata_data_StormData.csv")

2. Explore and transform data

dim(data)
## [1] 902297     37
str(data)
# View(data) opens an interactive viewer; skip it when knitting
data <- as_tibble(data)  # tbl_df() is deprecated in current dplyr

3. Across the United States, which types of events are most harmful with respect to population health?

fatalities <- aggregate(FATALITIES~EVTYPE, data, sum)
fatalities <- arrange(fatalities, desc(FATALITIES))
fatalities10 <- fatalities[1:10, ]

injuries <- aggregate(INJURIES~EVTYPE, data, sum)
injuries <- arrange(injuries, desc(INJURIES))
injuries10 <- injuries[1:10,]

fatalities10
##            EVTYPE FATALITIES
## 1         TORNADO       5633
## 2  EXCESSIVE HEAT       1903
## 3     FLASH FLOOD        978
## 4            HEAT        937
## 5       LIGHTNING        816
## 6       TSTM WIND        504
## 7           FLOOD        470
## 8     RIP CURRENT        368
## 9       HIGH WIND        248
## 10      AVALANCHE        224
injuries10
##               EVTYPE INJURIES
## 1            TORNADO    91346
## 2          TSTM WIND     6957
## 3              FLOOD     6789
## 4     EXCESSIVE HEAT     6525
## 5          LIGHTNING     5230
## 6               HEAT     2100
## 7          ICE STORM     1975
## 8        FLASH FLOOD     1777
## 9  THUNDERSTORM WIND     1488
## 10              HAIL     1361
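
The aggregate, sort, and truncate steps above can also be written as a single dplyr pipeline. The following is only a sketch on a small invented data frame (the `toy` values are not NOAA figures), and `top_events()` is a hypothetical helper name introduced for illustration. It assumes dplyr 1.0.0 or later for the `.groups` argument.

```r
library(dplyr)

# Toy data, invented for illustration only (not NOAA values)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HAIL"),
  FATALITIES = c(3, 1, 2, 4, 0, 0)
)

# Hypothetical helper: total FATALITIES per event type,
# largest first, keeping the top n rows
top_events <- function(df, n = 10) {
  df %>%
    group_by(EVTYPE) %>%
    summarise(FATALITIES = sum(FATALITIES), .groups = "drop") %>%
    arrange(desc(FATALITIES)) %>%
    head(n)
}

top2 <- top_events(toy, n = 2)
print(top2)
```

Applied to the full dataset, `top_events(data)` should be equivalent to the `fatalities10` table, assuming the same column names.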

The bar charts below give an overview of these results.

plotfatalities <- ggplot(fatalities10, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "green") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 8)) +
  xlab("Event Type") +
  ylab("Fatalities") +
  ggtitle("Fatalities by top 10 Weather Event Types") +
  theme(plot.title = element_text(size = 10))

plotinjuries <- ggplot(injuries10, aes(x = EVTYPE, y = INJURIES)) +
  geom_bar(stat = "identity", fill = "blue") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, size = 8)) +
  xlab("Event Type") +
  ylab("Injuries") +
  ggtitle("Injuries by top 10 Weather Event Types") +
  theme(plot.title = element_text(size = 10))

grid.arrange(plotfatalities, plotinjuries, ncol = 2, top = "Most Harmful Events with Respect to Population Health")
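
Because `EVTYPE` is mapped directly to the x axis, ggplot2 draws the bars in alphabetical order rather than by height. If a ranked display is preferred, one option is to wrap the variable in `reorder()`. The sketch below uses three rows copied from the fatalities table above; the name `toy10` is an assumption made for illustration, and this is not a change to the figure itself.

```r
library(ggplot2)

# Three rows copied from the fatalities table, for illustration
toy10 <- data.frame(
  EVTYPE     = c("FLOOD", "HEAT", "TORNADO"),
  FATALITIES = c(470, 937, 5633)
)

# reorder() returns a factor whose levels are sorted by -FATALITIES,
# so the event with the most fatalities comes first on the x axis
toy10$EVTYPE <- reorder(toy10$EVTYPE, -toy10$FATALITIES)

p <- ggplot(toy10, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity") +
  xlab("Event Type") +
  ylab("Fatalities")

levels(toy10$EVTYPE)
```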

4. Across the United States, which types of events have the greatest economic consequences?

The dataset records property damage in PROPDMG and crop damage in CROPDMG. Each amount has to be multiplied by the magnitude code stored in its companion column (PROPDMGEXP or CROPDMGEXP) before totals can be compared.

Calculating property damage

unique(data$PROPDMGEXP)
##  [1] K M   B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
## Levels:  - ? + 0 1 2 3 4 5 6 7 8 B h H K m M
damageProperty <- select(data, EVTYPE, PROPDMG, PROPDMGEXP)
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "0"] <- 1
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "1"] <- 10
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "2"] <- 100
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "3"] <- 1000
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "4"] <- 10000
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "5"] <- 1e+05
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "6"] <- 1e+06
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "7"] <- 1e+07
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "8"] <- 1e+08
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "B"] <- 1e+09
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "h"] <- 100
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "H"] <- 100
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "K"] <- 1000
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "m"] <- 1e+06
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "M"] <- 1e+06
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == ""] <- 1
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "+"] <- 0
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "-"] <- 0
damageProperty$ChangeExp[damageProperty$PROPDMGEXP == "?"] <- 0

damageProperty$damageValue <- damageProperty$PROPDMG*damageProperty$ChangeExp
damageValueProperty <- aggregate(damageValue ~ EVTYPE, damageProperty, sum)
damageValueProperty <- arrange(damageValueProperty, desc(damageValue))
damageValueProperty10 <- damageValueProperty[1:10,]

The 10 events with the greatest property damage

damageValueProperty10
##               EVTYPE  damageValue
## 1              FLOOD 144657709807
## 2  HURRICANE/TYPHOON  69305840000
## 3            TORNADO  56947380617
## 4        STORM SURGE  43323536000
## 5        FLASH FLOOD  16822673979
## 6               HAIL  15735267513
## 7          HURRICANE  11868319010
## 8     TROPICAL STORM   7703890550
## 9       WINTER STORM   6688497251
## 10         HIGH WIND   5270046260
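
The per-code assignments used for the property exponents can also be expressed as one named lookup vector. The sketch below mirrors the same code-to-multiplier mapping; the names `exp_multiplier` and `decode_exp()` and the sample values are assumptions made for illustration, not part of the original analysis.

```r
# Multipliers keyed by exponent code, mirroring the assignments above.
# The blank code is handled separately because "" cannot be used
# as an element name.
exp_multiplier <- c(
  "0" = 1, "1" = 10, "2" = 100, "3" = 1e3, "4" = 1e4,
  "5" = 1e5, "6" = 1e6, "7" = 1e7, "8" = 1e8,
  "h" = 100, "H" = 100, "K" = 1e3, "m" = 1e6, "M" = 1e6, "B" = 1e9,
  "+" = 0, "-" = 0, "?" = 0
)

# Hypothetical helper: translate exponent codes into numeric multipliers
decode_exp <- function(code) {
  code <- as.character(code)            # accepts factor or character input
  mult <- unname(exp_multiplier[code])  # codes without an entry become NA
  mult[code == ""] <- 1                 # blank code: amount is taken as-is
  mult
}

# Invented sample values, for illustration only
sample_dmg <- c(25, 2.5, 1, 10, 7)
sample_exp <- c("K", "M", "B", "", "?")
sample_value <- sample_dmg * decode_exp(sample_exp)
sample_value
```

With such a helper, the scaling step could be written as `damageProperty$PROPDMG * decode_exp(damageProperty$PROPDMGEXP)`, assuming the same column names.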

Calculating crop damage

unique(data$CROPDMGEXP)
## [1]   M K m B ? 0 k 2
## Levels:  ? 0 2 B k K m M
damageCrop <- select(data, EVTYPE, CROPDMG,CROPDMGEXP)
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "0"] <- 1
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "2"] <- 100
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "B"] <- 1e+09
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "K"] <- 1000
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "k"] <- 1000
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "m"] <- 1e+06
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "M"] <- 1e+06
damageCrop$ChangeExp[damageCrop$CROPDMGEXP == "?"] <- 0
damageCrop$damageValue <- damageCrop$CROPDMG*damageCrop$ChangeExp
damageValueCrop <- aggregate(damageValue ~ EVTYPE, damageCrop, sum)
damageValueCrop <- arrange(damageValueCrop, desc(damageValue))
damageValueCrop10 <- damageValueCrop[1:10,]

The 10 events with the greatest crop damage

damageValueCrop10
##               EVTYPE damageValue
## 1            DROUGHT 13972566000
## 2              FLOOD  5661968450
## 3        RIVER FLOOD  5029459000
## 4          ICE STORM  5022113500
## 5               HAIL  3025954470
## 6          HURRICANE  2741910000
## 7  HURRICANE/TYPHOON  2607872800
## 8        FLASH FLOOD  1421317100
## 9       EXTREME COLD  1292973000
## 10      FROST/FREEZE  1094086000
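
One detail of the crop calculation is worth spelling out. The blank `CROPDMGEXP` code has no assignment, so `ChangeExp` stays `NA` for those rows, and the formula interface of `aggregate()` omits rows containing `NA` by default (`na.action = na.omit`). The toy example below, with invented values, sketches the effect and one possible alternative (treating the blank code as a multiplier of 1, as the property calculation does).

```r
# Toy data, invented for illustration only
toy <- data.frame(
  EVTYPE     = c("DROUGHT", "DROUGHT", "HAIL", "HAIL"),
  CROPDMG    = c(5, 2, 3, 4),
  CROPDMGEXP = c("K", "", "K", "")
)

# Map only "K", as a stand-in for a mapping that skips the blank code
toy$ChangeExp <- NA_real_
toy$ChangeExp[toy$CROPDMGEXP == "K"] <- 1000
toy$damageValue <- toy$CROPDMG * toy$ChangeExp

# Rows with an NA damageValue are silently left out of the sums
dropped <- aggregate(damageValue ~ EVTYPE, toy, sum)

# Treating the blank code as a multiplier of 1 keeps those rows
toy$ChangeExp[toy$CROPDMGEXP == ""] <- 1
toy$damageValue <- toy$CROPDMG * toy$ChangeExp
kept <- aggregate(damageValue ~ EVTYPE, toy, sum)

dropped
kept
```

Whether this changes the ranking in the real data depends on how many blank-coded rows carry a non-zero `CROPDMG`, which has not been checked here.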

The bar charts below give an overview of these results, in billions of dollars.

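# Two panels side by side; deep bottom margin for the rotated event names,
# left margin widened to leave room for the y-axis label
par(mfrow = c(1, 2), mar = c(11, 5, 3, 2))
barplot(damageValueProperty10$damageValue / 10^9,
        names.arg = damageValueProperty10$EVTYPE, las = 2, col = "blue",
        ylab = "Property damage (billions of dollars)", main = "Property Damages")
barplot(damageValueCrop10$damageValue / 10^9,
        names.arg = damageValueCrop10$EVTYPE, las = 2, col = "green",
        ylab = "Crop damage (billions of dollars)", main = "Crop Damages")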

Results

Tornadoes caused the greatest number of fatalities and injuries over the recorded period (1950 to November 2011): 5,633 and 91,346, respectively.

Floods caused the greatest property damage, about 144.7 billion dollars (144,657,709,807), while droughts caused the greatest crop losses, about 14.0 billion dollars (13,972,566,000).