Synopsis

This report examines the impact of severe weather events in the U.S. on the population and the economy, and outlines the approach and the results. It is the week 4 final project of the course “Reproducible Research”. The analysis is based on the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database, which records characteristics of severe storms and weather events across the United States, including estimates of fatalities, injuries and property damage.

This report addresses two questions: 1. Which types of events are most harmful to the health of the population in the U.S.? 2. Which types of events have the greatest economic impact in the U.S.?

We found the following results:

For the first question, tornadoes are at the top of the list for both deaths and injuries, by a wide margin.

For the second question, floods have the greatest impact on property, and drought has the greatest impact on crops.

Data

Resources

The storm event data used in this analysis can be found at Storm Data [47Mb].

The storm data is documented in two accompanying documents, which can be found at the same location.

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
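The uneven coverage over time can be checked by counting events per year. The following is a minimal sketch on toy rows, assuming the raw file has a `BGN_DATE` column formatted like `4/18/1950 0:00:00`; that column is dropped later in this report, so the check would have to run on the full `NOAA_data`. The names `sample_events`, `event_year` and `events_per_year` are illustrative only.

```r
# Toy rows mimicking the assumed BGN_DATE format of the raw file ("m/d/Y H:M:S")
sample_events <- data.frame(
  BGN_DATE = c("4/18/1950 0:00:00", "6/8/1951 0:00:00",
               "3/2/2011 0:00:00", "11/28/2011 0:00:00", "5/22/2011 0:00:00"),
  stringsAsFactors = FALSE
)

# Parse the date part (the trailing time is ignored) and extract the year
event_dates <- as.Date(sample_events$BGN_DATE, format = "%m/%d/%Y")
event_year <- as.integer(format(event_dates, "%Y"))

# Count events per year; on the full data this would show how sparse the early years are
events_per_year <- table(event_year)
events_per_year
```

On the real data, the same table could be passed to `barplot()` to see the growth in recorded events over the decades.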

Get Data

Download and read the storm event data from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2, then extract only the columns that we need.

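url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Download the compressed file only once; mode = "wb" keeps the archive intact on Windows
dest_file <- "stormData.csv.bz2"
if (!file.exists(dest_file)) {
  download.file(url, dest_file, mode = "wb")
}

# read.csv() reads the bz2-compressed file directly
NOAA_data <- read.csv(dest_file)
# Keep column 8 (EVTYPE) and columns 23 to 28 (fatalities, injuries and the damage fields)
Storm_data <- NOAA_data[, -c(1:7, 9:22, 29:37)]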
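Selecting columns by negative position works, but it silently picks the wrong fields if the column order ever changes. As a hedged alternative, the sketch below selects by name on a small stand-in data frame. `NOAA_sample`, `keep_cols` and `Storm_sample` are illustrative names, and the sample values are invented.

```r
# Toy stand-in for NOAA_data, including two columns the analysis does not need
NOAA_sample <- data.frame(
  STATE = c("AL", "TX"), EVTYPE = c("TORNADO", "FLOOD"),
  FATALITIES = c(0, 2), INJURIES = c(15, 0),
  PROPDMG = c(25, 1.5), PROPDMGEXP = c("K", "M"),
  CROPDMG = c(0, 10), CROPDMGEXP = c("", "K"),
  REFNUM = 1:2, stringsAsFactors = FALSE
)

# The seven fields used in the rest of this report
keep_cols <- c("EVTYPE", "FATALITIES", "INJURIES",
               "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")

# Fail early if an expected column is missing, then subset by name
stopifnot(all(keep_cols %in% names(NOAA_sample)))
Storm_sample <- NOAA_sample[, keep_cols]
names(Storm_sample)
```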

Data Processing

Investigate the fatalities and injuries

fatal <- aggregate(FATALITIES ~ EVTYPE, data=Storm_data, sum)
fatal_10 <- fatal[order(-fatal$FATALITIES), ][1:10,]
fatal_10
##             EVTYPE FATALITIES
## 834        TORNADO       5633
## 130 EXCESSIVE HEAT       1903
## 153    FLASH FLOOD        978
## 275           HEAT        937
## 464      LIGHTNING        816
## 856      TSTM WIND        504
## 170          FLOOD        470
## 585    RIP CURRENT        368
## 359      HIGH WIND        248
## 19       AVALANCHE        224
inj <- aggregate(INJURIES ~ EVTYPE, data=Storm_data, sum)
inj_10 <- inj[order(-inj$INJURIES), ][1:10,]
inj_10
##                EVTYPE INJURIES
## 834           TORNADO    91346
## 856         TSTM WIND     6957
## 170             FLOOD     6789
## 130    EXCESSIVE HEAT     6525
## 464         LIGHTNING     5230
## 275              HEAT     2100
## 427         ICE STORM     1975
## 153       FLASH FLOOD     1777
## 760 THUNDERSTORM WIND     1488
## 244              HAIL     1361

The fatality and injury counts are on very different scales. To keep the picture clear, we look only at the top 10 event types for each.
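The aggregate, order and subset pattern is used once for fatalities and once for injuries. It could be wrapped in a small helper, sketched here on invented toy data. `top_events` is a hypothetical function and not part of the original analysis.

```r
# Hypothetical helper: total one measure per event type and return the n largest
top_events <- function(data, measure, n = 10) {
  totals <- aggregate(data[[measure]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- measure
  head(totals[order(-totals[[measure]]), ], n)
}

# Toy data, not taken from the NOAA file
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  stringsAsFactors = FALSE
)

top_fatal <- top_events(toy, "FATALITIES", n = 2)
top_fatal
```

With the real data, `top_events(Storm_data, "FATALITIES")` and `top_events(Storm_data, "INJURIES")` should give the same rankings as the two tables above.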

Plot fatalities and injuries

par(mfrow = c(1, 2), mar = c(12, 8, 3, 2),  mgp = c(4, 1, 0), cex = 0.8)

with(fatal_10, barplot(FATALITIES,  las = 2, names.arg = EVTYPE, 
                            ylab = 'Number of Fatalities', main='Top 10 Highest Fatalities Events',
                            col = 'steelblue', ylim = c(0,6000)))

with(inj_10, barplot(INJURIES, las = 2, names.arg = EVTYPE, 
                          ylab = 'Injuries', main='Top 10 Highest Injuries Events', 
                          col = 'steelblue')) 

Top 10 Fatalities and Injuries events

Economic Impact

The code issue

The storm data contains the fields PROPDMGEXP and CROPDMGEXP, which presumably hold the exponents for the PROPDMG and CROPDMG fields respectively. Unfortunately, we found no information on how to decode these exponent fields at NOAA or in the course project description. Through the discussion forum and an Internet search, we found a key for decoding PROPDMGEXP. We assume that it is correct, that it applies to CROPDMGEXP as well, and use it for the further analysis. The key is available at https://github.com/dsong99/Reproducible-Proj-2/blob/master/storm_exp_code.csv

code <- data.frame(
        'Expo'=factor(c('K','k','M','m','B','b','H','h','0','1','2','3','4','5','6','7','8','+','-','?','')),
        'Multiplier' =c(1000,1000,1000000,1000000,1000000000,1000000000,100,100,1,1,1,1,1,1,1,1,1,1,0,0,0)
)
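To illustrate how the key is applied, here is a small worked example on invented records. It uses a named vector as the lookup instead of a data frame, and covers only a subset of the codes. `multiplier`, `toy` and `damage_dollars` are illustrative names.

```r
# Subset of the exponent key from this report, as a named lookup vector
multiplier <- c(K = 1e3, M = 1e6, B = 1e9, H = 1e2)

# Toy records, not taken from the NOAA file
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 5),
  PROPDMGEXP = c("K", "M", "B", "H"),
  stringsAsFactors = FALSE
)

# Look up each row's multiplier by its exponent code and scale the damage value
toy$damage_dollars <- toy$PROPDMG * unname(multiplier[toy$PROPDMGEXP])
toy
```

A value of 25 with code K therefore counts as 25,000 dollars, and 1.2 with code B as 1.2 billion dollars.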

Property Impact

prop <- Storm_data[,c('EVTYPE','PROPDMG','PROPDMGEXP')]

unique(prop$PROPDMGEXP)
##  [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-" "1" "8"
prop <- merge(x=prop, y=code, by.x = 'PROPDMGEXP', by.y = 'Expo', all.x=TRUE )

prop$PROPDMG <- prop$PROPDMG * prop$Multiplier
prop_damages_analysis <- aggregate(PROPDMG~EVTYPE, data= prop, sum)

prop_damages_analysis <- prop_damages_analysis[order(-prop_damages_analysis$PROPDMG),]
prop_damages_10 <- head(prop_damages_analysis, 10)
prop_damages_10
##                EVTYPE      PROPDMG
## 170             FLOOD 144657709800
## 411 HURRICANE/TYPHOON  69305840000
## 834           TORNADO  56937160776
## 670       STORM SURGE  43323536000
## 153       FLASH FLOOD  16140811860
## 244              HAIL  15732267486
## 402         HURRICANE  11868319010
## 848    TROPICAL STORM   7703890550
## 972      WINTER STORM   6688497251
## 359         HIGH WIND   5270046280
prop_damages_10$PROPDMG <- round(prop_damages_10$PROPDMG/1e+9, 3)
prop_damages_10
##                EVTYPE PROPDMG
## 170             FLOOD 144.658
## 411 HURRICANE/TYPHOON  69.306
## 834           TORNADO  56.937
## 670       STORM SURGE  43.324
## 153       FLASH FLOOD  16.141
## 244              HAIL  15.732
## 402         HURRICANE  11.868
## 848    TROPICAL STORM   7.704
## 972      WINTER STORM   6.688
## 359         HIGH WIND   5.270
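Because the merge uses `all.x=TRUE`, any exponent code missing from the key would get an `NA` multiplier, and the formula interface of `aggregate()` omits such rows by default. In this report every code returned by `unique()` appears in the key, so the check should come back empty. The sketch below shows what it would catch, using a deliberately incomplete key and invented records. `key`, `toy` and `unmatched_codes` are illustrative names.

```r
# Deliberately incomplete key (hypothetical) so that one code goes unmatched
key <- data.frame(Expo = c("K", "M"), Multiplier = c(1e3, 1e6),
                  stringsAsFactors = FALSE)

# Toy records, not taken from the NOAA file; "Z" is not in the key
toy <- data.frame(
  EVTYPE = c("FLOOD", "FLOOD", "HAIL"),
  PROPDMG = c(10, 2, 7),
  PROPDMGEXP = c("K", "M", "Z"),
  stringsAsFactors = FALSE
)

merged <- merge(toy, key, by.x = "PROPDMGEXP", by.y = "Expo", all.x = TRUE)

# Rows whose exponent code was not found in the key carry an NA multiplier
unmatched_codes <- unique(merged$PROPDMGEXP[is.na(merged$Multiplier)])
unmatched_codes

# The NA row is omitted from the totals, so HAIL does not appear in the result
merged$PROPDMG <- merged$PROPDMG * merged$Multiplier
totals <- aggregate(PROPDMG ~ EVTYPE, data = merged, sum)
totals
```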

Crop Impact

crop <- Storm_data[,c('EVTYPE','CROPDMG','CROPDMGEXP')]

unique(crop$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k" "2"
crop <- merge(x=crop, y=code, by.x = 'CROPDMGEXP', by.y = 'Expo', all.x=TRUE )
crop$CROPDMG <- crop$CROPDMG * crop$Multiplier

crop_damages <- aggregate(CROPDMG~EVTYPE, data = crop, FUN=sum)
crop_damages_10 <- crop_damages[order(-crop_damages$CROPDMG),][1:10,]
crop_damages_10
##                EVTYPE     CROPDMG
## 95            DROUGHT 13972566000
## 170             FLOOD  5661968450
## 590       RIVER FLOOD  5029459000
## 427         ICE STORM  5022113500
## 244              HAIL  3025954470
## 402         HURRICANE  2741910000
## 411 HURRICANE/TYPHOON  2607872800
## 153       FLASH FLOOD  1421317100
## 140      EXTREME COLD  1292973000
## 212      FROST/FREEZE  1094086000
crop_damages_10$CROPDMG <- round(crop_damages_10$CROPDMG/1e+9, 3)
crop_damages_10
##                EVTYPE CROPDMG
## 95            DROUGHT  13.973
## 170             FLOOD   5.662
## 590       RIVER FLOOD   5.029
## 427         ICE STORM   5.022
## 244              HAIL   3.026
## 402         HURRICANE   2.742
## 411 HURRICANE/TYPHOON   2.608
## 153       FLASH FLOOD   1.421
## 140      EXTREME COLD   1.293
## 212      FROST/FREEZE   1.094

Plot Property and Crop Impact

par(mfrow = c(1, 2), mar = c(12, 8, 3, 2),  mgp = c(4, 1, 0), cex = 0.8)

with(prop_damages_10, barplot(PROPDMG, las = 2, names.arg = EVTYPE, 
                              ylab = 'Property Damage (billion USD)', 
                              main='Top 10 Highest Property Damages Events', col = 'steelblue'))

with(crop_damages_10, barplot(CROPDMG, las = 2, names.arg = EVTYPE, 
                              ylab = 'Crop Damage (billion USD)', 
                              main='Top 10 Highest Crop Damages Events', ylim=c(0, 14),col = 'steelblue'))

Top 10 Property and Crop Damages events

Results

This report addressed two questions: 1. Which types of events are most harmful to the health of the population in the U.S.? 2. Which types of events have the greatest economic impact in the U.S.?

For the first question, tornadoes are at the top of the list for both deaths (5,633) and injuries (91,346), by a wide margin.

For the second question, floods have the greatest impact on property (about 144.7 billion dollars), and drought has the greatest impact on crops (about 14.0 billion dollars).