This analysis is performed as part of a peer-graded assignment for the Coursera course ‘Reproducible Research’.

Synopsis

The objective of this analysis is to answer two questions about severe weather events: (1) Which types of severe weather events are most harmful to population health? (2) Which types of severe weather events have the greatest economic consequences?

The questions are answered based on a subset of the National Oceanic and Atmospheric Administration’s (NOAA) Storm Database, covering events in the US from 1950 to 2011 (no date filter is applied in the processing below). Impacts on human health are measured as the total fatalities and injuries caused by severe weather, and economic consequences as the total cost of property and crop damage.

Results: From 1950 to 2011, the event category labelled HURRICANE (which in this analysis also contains all tornado events, because they are recoded to that label during data processing) accounts for by far the greatest impact on human health, both in fatalities and in injuries. The weather types with the biggest economic consequences are floods and that same HURRICANE category for property damage, and drought and floods for crop damage.

Data Processing

Load the data in R as ‘StormData’, without first decompressing:

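library(dplyr)  # provides mutate(), arrange(), slice() and %>% used below

# read.csv() reads the bz2-compressed file directly; skip the load if the data is already in memory
if (!exists("StormData")) {
  StormData <- read.csv("repdata_data_StormData.csv.bz2")
}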

We don’t need any geographical or temporal information, so we remove those columns from the dataset and call the reduced dataset ‘Storm’.

# drop columns 1-6, 9-20 and 29-37 by position
Storm <- StormData[-c(1:6, 9:20, 29:37)]
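Dropping columns by position depends on the exact column order of the input file. As a hedged alternative, the sketch below selects columns by name on a small made-up data frame. The rows are hypothetical, and the `keep` vector lists only the column names that the rest of this analysis uses, so it is not identical to the positional selection above.

```r
# Toy stand-in for StormData; all values are invented for illustration.
toy <- data.frame(
  BGN_DATE   = c("1/1/2000", "2/2/2001"),
  EVTYPE     = c("FLOOD", "HAIL"),
  FATALITIES = c(1, 0),
  INJURIES   = c(0, 2),
  PROPDMG    = c(25, 1.5),
  PROPDMGEXP = c("K", "M"),
  CROPDMG    = c(0, 3),
  CROPDMGEXP = c("", "K"),
  REFNUM     = c(1, 2),
  stringsAsFactors = FALSE
)

# Columns referenced later in the analysis (assumed to be all that is needed).
keep <- c("EVTYPE", "FATALITIES", "INJURIES",
          "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")

# Selecting by name keeps working even if the column order changes.
toy_reduced <- toy[, keep]
names(toy_reduced)
```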

Exploring how many event types there are:

length(unique(Storm$EVTYPE))
## [1] 985

We will clean this column, as it is crucial for our research questions. We notice that quite a few of the event types contain numbers, which should all be removed since the official event types do not contain numbers:

# remove numbers
Storm$EVTYPE <- gsub("[[:digit:]]+", "", Storm$EVTYPE)
# replace every blank or punctuation character with a single space
Storm$EVTYPE <- gsub("[[:blank:][:punct:]]", " ", Storm$EVTYPE)
# remove leading and trailing white space
Storm$EVTYPE <- trimws(Storm$EVTYPE)
# Replace a number of strings that don't appear in the list of 48 official event types on https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf.

# "/" has already been turned into a space above, so the pattern is "HURRICANE TYPHOON"
orig_event_types <- c("RIP CURRENTS", "TSTM WIND", "EXTREME COLD WIND CHILL", "EXTREME HEAT", "HURRICANE TYPHOON")
new_event_types <- c("RIP CURRENT", "THUNDERSTORM WIND", "EXTREME COLD", "EXCESSIVE HEAT", "HURRICANE")
for (i in seq_along(orig_event_types)) {
    Storm$EVTYPE <- gsub(orig_event_types[i], new_event_types[i], Storm$EVTYPE, ignore.case = TRUE)
}

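# collapse every label that contains a keyword into a single event type
Storm$EVTYPE <- gsub(".*Hurricane.*", "HURRICANE", Storm$EVTYPE, ignore.case = TRUE)
# NOTE: this line relabels tornado events as HURRICANE, so every HURRICANE row in the
# tables below combines tornadoes and hurricanes; use "TORNADO" as the replacement
# (and re-run the analysis) to keep the two event types separate
Storm$EVTYPE <- gsub(".*Tornado.*", "HURRICANE", Storm$EVTYPE, ignore.case = TRUE)
Storm$EVTYPE <- gsub("TH.*WIND.*", "THUNDERSTORM WIND", Storm$EVTYPE, ignore.case = TRUE)
Storm$EVTYPE <- gsub("HI.*WIND.*", "HIGH WIND", Storm$EVTYPE, ignore.case = TRUE)
Storm$EVTYPE <- gsub(".*HE.*SNOW.*", "HEAVY SNOW", Storm$EVTYPE, ignore.case = TRUE)
Storm$EVTYPE <- gsub(".*HE.*RAIN.*", "HEAVY RAIN", Storm$EVTYPE, ignore.case = TRUE)
Storm$EVTYPE <- gsub(".*FLOOD.*", "FLOOD", Storm$EVTYPE, ignore.case = TRUE)
Storm$EVTYPE <- gsub(".*BLIZZ.*", "BLIZZARD", Storm$EVTYPE, ignore.case = TRUE)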
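To make the effect of these cleaning steps concrete, here is a minimal sketch that applies a subset of them to a few raw labels. The input strings are hypothetical examples in the style of the raw EVTYPE values, not rows taken from the dataset.

```r
# Hypothetical raw labels, invented for illustration.
raw <- c("TSTM WIND 45", "  Flash Flood/Flood ", "RIP CURRENTS", "HEAVY SNOW-SQUALLS")

clean <- gsub("[[:digit:]]+", "", raw)              # drop numbers
clean <- gsub("[[:blank:][:punct:]]", " ", clean)   # punctuation and blanks become spaces
clean <- trimws(clean)                              # strip leading/trailing spaces

# A subset of the recoding rules from the analysis above.
clean <- gsub("RIP CURRENTS", "RIP CURRENT", clean, ignore.case = TRUE)
clean <- gsub("TSTM WIND", "THUNDERSTORM WIND", clean, ignore.case = TRUE)
clean <- gsub(".*HE.*SNOW.*", "HEAVY SNOW", clean, ignore.case = TRUE)
clean <- gsub(".*FLOOD.*", "FLOOD", clean, ignore.case = TRUE)

clean
```

Because patterns such as `.*FLOOD.*` match the whole string, any label that merely contains the keyword is collapsed into the single official type.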

Clean the PROPDMGEXP and CROPDMGEXP columns:

unique(Storm$PROPDMGEXP)
##  [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-" "1" "8"
unique(Storm$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k" "2"
# replace H, K, M and B (upper or lower case) with their power of ten; "-", "+" and "?" become 1
# digits already present are kept as exponents; empty strings become NA in as.numeric() later on
orig_exp_values <- c("H", "K", "M", "B", "\\-", "\\+", "\\?")
new_exp_values <- c(2, 3, 6, 9, 1, 1, 1)
for (i in seq_along(orig_exp_values)) {
    Storm$PROPDMGEXP <- gsub(orig_exp_values[i], new_exp_values[i], Storm$PROPDMGEXP, ignore.case = TRUE)
    Storm$CROPDMGEXP <- gsub(orig_exp_values[i], new_exp_values[i], Storm$CROPDMGEXP, ignore.case = TRUE)
}

Calculate the actual property and crop damage and store them as two new columns in the Storm data frame:

Storm<-mutate(Storm,PropertyDamage=PROPDMG*10^as.numeric(PROPDMGEXP),
              CropDamage=CROPDMG*10^as.numeric(CROPDMGEXP))
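As a worked illustration of this calculation, the sketch below runs the same exponent recoding and multiplication on a tiny made-up data frame, using base R only. The rows are hypothetical. One thing it makes visible is that an empty exponent code turns into `NA`, so that row drops out of a formula-based `aggregate()` sum.

```r
# Hypothetical rows, invented for illustration.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 2.5, 10),
  PROPDMGEXP = c("K", "m", ""),
  stringsAsFactors = FALSE
)

# Same letter-to-exponent mapping as in the analysis (H, K, M, B).
codes  <- c("H", "K", "M", "B")
powers <- c("2", "3", "6", "9")
for (i in seq_along(codes)) {
  toy$PROPDMGEXP <- gsub(codes[i], powers[i], toy$PROPDMGEXP, ignore.case = TRUE)
}

# 25 * 10^3 = 25000, 2.5 * 10^6 = 2500000, and 10 * 10^NA = NA for the empty code.
toy$PropertyDamage <- toy$PROPDMG * 10^as.numeric(toy$PROPDMGEXP)

# The formula interface omits NA rows by default, so HAIL does not appear in the totals.
totals <- aggregate(PropertyDamage ~ EVTYPE, toy, sum)
totals
```

If events without an exponent code should count at face value instead, one option (an assumption, not what the analysis above does) would be to map the empty string to an exponent of 0 before multiplying.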

Results

Effect of severe weather events on public health

Fatalities

# sum fatalities per event type
fatal_events <- aggregate(FATALITIES ~ EVTYPE, Storm, sum)

#list the top 10 most fatal events
top_fatal_events <- fatal_events %>% arrange(desc(FATALITIES)) %>% slice(1:10)
top_fatal_events
##               EVTYPE FATALITIES
## 1          HURRICANE       5796
## 2     EXCESSIVE HEAT       1999
## 3              FLOOD       1524
## 4               HEAT        937
## 5          LIGHTNING        817
## 6  THUNDERSTORM WIND        710
## 7        RIP CURRENT        572
## 8          HIGH WIND        293
## 9       EXTREME COLD        285
## 10         AVALANCHE        224
#plot
par(mar=c(5,8,4,2))
par(oma=c(8,1,3,3))
barplot(top_fatal_events$FATALITIES,names.arg=top_fatal_events$EVTYPE,las=2,main="10 most fatal event types",ylab="number of fatalities")

Injuries

# sum injuries per event type
injury_events <- aggregate(INJURIES ~ EVTYPE, Storm, sum)
#list the 10 event types with most injuries
top_injury_events <-injury_events %>% arrange(desc(INJURIES)) %>% slice(1:10)
top_injury_events
##               EVTYPE INJURIES
## 1          HURRICANE    92735
## 2  THUNDERSTORM WIND     9469
## 3              FLOOD     8604
## 4     EXCESSIVE HEAT     6680
## 5          LIGHTNING     5230
## 6               HEAT     2100
## 7          ICE STORM     1975
## 8          HIGH WIND     1471
## 9               HAIL     1361
## 10      WINTER STORM     1321
#plot
par(mar=c(5,8,4,2))
par(oma=c(8,1,3,3))
# barplot()'s width argument sets bar widths, not the figure size, so it is omitted here
barplot(top_injury_events$INJURIES, names.arg = top_injury_events$EVTYPE, las = 2,
        main = "10 event types causing most injuries", ylab = "number of injuries")

Which events have the greatest economic consequences

Property damage

PropertyDamage_events<-aggregate(PropertyDamage~EVTYPE,Storm,sum)
#list the 10 event types with most property damage
top_PropertyDamage_events <-PropertyDamage_events %>% arrange(desc(PropertyDamage)) %>% slice(1:10)
top_PropertyDamage_events
##               EVTYPE PropertyDamage
## 1              FLOOD   168190218789
## 2          HURRICANE   143359498474
## 3        STORM SURGE    43323536000
## 4               HAIL    15735819456
## 5  THUNDERSTORM WIND     9970370523
## 6     TROPICAL STORM     7703890550
## 7       WINTER STORM     6688497251
## 8          HIGH WIND     6003356490
## 9           WILDFIRE     4765114000
## 10  STORM SURGE TIDE     4641188000
par(mar=c(5,5,4,2))
par(oma=c(5,1,3,3))
# Plot for Property Damage
barplot(top_PropertyDamage_events$PropertyDamage,names.arg=top_PropertyDamage_events$EVTYPE,las=2,main="10 event types causing most property damage")

Crop damage

CropDamage_events<-aggregate(CropDamage~EVTYPE,Storm,sum)
#list the 10 event types with most crop damage
top_CropDamage_events <-CropDamage_events %>% arrange(desc(CropDamage)) %>% slice(1:10)
top_CropDamage_events
##               EVTYPE  CropDamage
## 1            DROUGHT 13972566000
## 2              FLOOD 12379706100
## 3          HURRICANE  5932754320
## 4          ICE STORM  5022113500
## 5               HAIL  3026044470
## 6       EXTREME COLD  1293023000
## 7  THUNDERSTORM WIND  1224408980
## 8       FROST FREEZE  1094086000
## 9         HEAVY RAIN   795752800
## 10         HIGH WIND   686301900
# Plot for Crop Damage
#par(mar=c(5,5,4,2))
#par(oma=c(5,1,3,3))
# barplot(top_CropDamage_events$CropDamage,names.arg=top_CropDamage_events$EVTYPE,las=2,main="10 event types causing most crop damage")
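The four summaries above repeat the same three steps: sum a measure per event type, sort in descending order, and keep the top rows. A minimal base-R sketch of that pattern is shown below. The helper name `top_events` and the toy data are hypothetical and not part of the original analysis. Unlike the formula-based `aggregate()` calls above, this version sums with `na.rm = TRUE`.

```r
# Hypothetical helper: total of `measure` per EVTYPE, largest first, top n rows.
top_events <- function(data, measure, n = 10) {
  totals <- aggregate(data[[measure]], by = list(EVTYPE = data$EVTYPE),
                      FUN = sum, na.rm = TRUE)
  names(totals)[2] <- measure                      # aggregate() names the value column "x"
  totals <- totals[order(totals[[measure]], decreasing = TRUE), ]
  rownames(totals) <- NULL                         # renumber rows 1..k after sorting
  head(totals, n)
}

# Toy data, invented for illustration.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "HAIL", "FLOOD", "HEAT", "HAIL"),
  FATALITIES = c(3, 0, 2, 7, 1),
  stringsAsFactors = FALSE
)

res <- top_events(toy, "FATALITIES", n = 2)
res
```

With the real data, a call such as `top_events(Storm, "INJURIES")` would be expected to play the role of the `aggregate()`, `arrange()` and `slice()` chain, under the assumption that `Storm` has the columns used above.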

Conclusion

When looking at the impact of severe weather on public health, we distinguish between fatalities and injuries. Between 1950 and 2011, the most fatalities were caused by the HURRICANE category (which, because tornado events are recoded to that label during data processing, combines hurricanes and tornadoes), followed by excessive heat and floods. The most injuries were caused by that same combined category, followed by thunderstorm winds and floods.

When looking at the impact of severe weather on the economy, we distinguish between property damage and crop damage. Between 1950 and 2011, the most property damage was caused by floods, the HURRICANE category (which includes tornado events after recoding) and storm surges. The most crop damage was caused by drought, floods and the HURRICANE category.