Introduction

The basic goal of this assignment is to explore the NOAA Storm Database and to answer some basic questions about severe weather events. The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

Synopsis

Analysis of the storm event database shows that tornadoes are the most dangerous weather event for population health, followed by excessive heat. The economic impact of weather events was also analyzed. Flash floods and thunderstorm winds caused billions of dollars in property damage between 1950 and 2011. The largest crop damage was caused by drought, followed by flood and hail.

Data Processing

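The following code reads the data and lists its columns, assuming the compressed file StormData.csv.bz2 has already been downloaded to the working directory: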

stormdata <- read.csv("StormData.csv.bz2")
names(stormdata)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
length(table(stormdata$EVTYPE))
## [1] 985
# 985 different event types

# translate all letters to lowercase
event_types <- tolower(stormdata$EVTYPE)
# replace each blank or punctuation character with a space
# ("+" is already covered by [:punct:], so it need not be listed separately)
event_types <- gsub("[[:blank:][:punct:]]", " ", event_types)
length(unique(event_types))
## [1] 874
# 874 different events after data cleaning

# update the data frame
stormdata$EVTYPE <- event_types
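
To make the effect of this cleaning concrete, the sketch below applies the same two steps to a handful of made-up labels (the `raw` vector is illustrative and not taken from the database). It also shows one optional extra step, collapsing repeated spaces and trimming the ends, which the analysis above does not perform and which is only a suggestion.

```r
# Made-up labels that differ only in case, punctuation, or spacing
raw <- c("TSTM WIND", "Tstm Wind", "FLASH FLOOD/", "flash  flood", "HAIL")

# Same two steps as above: lowercase, then blank/punctuation -> space
clean <- tolower(raw)
clean <- gsub("[[:blank:][:punct:]]", " ", clean)
length(unique(clean))

# Optional extra step (an assumption, not part of the original analysis):
# collapse runs of spaces and trim leading/trailing spaces
tidy <- trimws(gsub(" +", " ", clean))
length(unique(tidy))
```

In this toy case the two original steps leave four distinct labels, because a trailing or doubled space still counts as a difference; the extra step reduces them to three.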

Results

Most harmful events to human health

The data set records four types of damage: fatalities (FATALITIES), injuries (INJURIES), property damage (PROPDMG), and crop damage (CROPDMG). The last two must be scaled by their magnitude columns, PROPDMGEXP and CROPDMGEXP, to obtain dollar amounts. The first two are directly related to human health, so they are summarized first.

fatal <- aggregate(FATALITIES ~ EVTYPE, data = stormdata, sum)
fatal1 <- fatal[fatal$FATALITIES > 0, ]
fatalorder <- fatal1[order(fatal1$FATALITIES, decreasing = TRUE), ]
head(fatalorder)
##             EVTYPE FATALITIES
## 741        tornado       5633
## 116 excessive heat       1903
## 138    flash flood        978
## 240           heat        937
## 410      lightning        816
## 762      tstm wind        504

The code above aggregates the fatality data by event type and ranks the totals in decreasing order. Tornadoes and excessive heat have caused the most fatalities since 1950. Next, we summarize the injury data:

injury <- aggregate(INJURIES ~ EVTYPE, data = stormdata, sum)
injury1 <- injury[injury$INJURIES > 0, ]
injuryorder <- injury1[order(injury1$INJURIES, decreasing = TRUE), ]
head(injuryorder)
##             EVTYPE INJURIES
## 741        tornado    91346
## 762      tstm wind     6957
## 154          flood     6789
## 116 excessive heat     6525
## 410      lightning     5230
## 240           heat     2100

Tornadoes again rank first, causing far more injuries than any other event type. They are followed by tstm wind, flood, and excessive heat.
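
The aggregate, filter, and sort steps are the same for fatalities and injuries, so they could be wrapped in one helper. The following is a minimal sketch under that assumption; the function name `top_events` and the toy data frame are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: total one measure per event type, drop zero totals,
# and return the n largest totals in decreasing order
top_events <- function(data, measure, n = 10) {
  totals <- aggregate(data[[measure]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- measure
  totals <- totals[totals[[measure]] > 0, ]
  head(totals[order(totals[[measure]], decreasing = TRUE), ], n)
}

# Toy data (made up) to show the call
toy <- data.frame(EVTYPE = c("tornado", "heat", "tornado", "hail"),
                  FATALITIES = c(3, 2, 4, 0))
res <- top_events(toy, "FATALITIES")
res
```

With the real data, `top_events(stormdata, "FATALITIES")` and `top_events(stormdata, "INJURIES")` would be expected to reproduce the first ten rows of the two rankings above.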

These data are presented in two graphs, one for fatalities and one for injuries.

barplot(fatalorder[1:10, 2], col = blues9, legend.text = fatalorder[1:10,
    1], ylab = "Fatality", main = "10 natural events causing most fatalities")

barplot(injuryorder[1:10, 2], col = blues9, legend.text = injuryorder[1:10,
    1], ylab = "Injured people", main = "10 natural events causing most injuries")
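
The plots above identify the bars only through the legend. As an alternative, the sketch below labels each bar directly with `names.arg`. It uses a small made-up data frame (`toy`) in place of `fatalorder`, and the colour and title are arbitrary choices.

```r
# Toy totals standing in for fatalorder (values are made up)
toy <- data.frame(EVTYPE = c("tornado", "excessive heat", "flash flood"),
                  FATALITIES = c(50, 20, 10))

# names.arg labels each bar; las = 2 turns the labels perpendicular to the axis
# barplot() returns the bar midpoints, kept here in case text is added later
mids <- barplot(toy$FATALITIES, names.arg = toy$EVTYPE, las = 2,
                cex.names = 0.8, col = "steelblue",
                ylab = "Fatalities", main = "Toy example: fatalities by event")
```

Long event names may need a larger bottom margin, for example via `par(mar = ...)` before the call.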

Therefore, we can now determine which events rank among the top causes of both fatalities and injuries.

intersect(fatalorder[1:10, 1], injuryorder[1:10, 1])
## [1] "tornado"        "excessive heat" "flash flood"    "heat"          
## [5] "lightning"      "tstm wind"      "flood"

Seven event types appear in the top 10 for both fatalities and injuries. Tornadoes are the most harmful to human health; excessive heat, flash flood, and thunderstorm wind (tstm wind) are also on the list.

Most harmful events to property and crops

Here we summarize the property and crop damage caused by these natural events. The magnitude columns contain the following codes:

unique(stormdata$PROPDMGEXP)
##  [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-" "1" "8"
unique(stormdata$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k" "2"
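
The raw PROPDMG and CROPDMG numbers only become dollar amounts after they are scaled by these codes. One way to do that is sketched below. The helper name `exp_to_multiplier` and the toy data frame are hypothetical. Reading H, K, M, and B as hundreds, thousands, millions, and billions is the usual interpretation of the codes. Treating every other code (digits, "+", "-", "?", and the empty string) as a multiplier of 1 is an assumption made here for simplicity.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Unrecognised codes fall back to 1 (an assumption, see the text above).
exp_to_multiplier <- function(code) {
  lookup <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)
  mult <- lookup[tolower(code)]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows (made up) showing how the scaling changes the totals
toy <- data.frame(EVTYPE = c("flood", "flood", "tornado", "hail"),
                  PROPDMG = c(2.5, 10, 500, 25),
                  PROPDMGEXP = c("B", "M", "K", ""))
toy$PROPDOLLARS <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
scaled <- aggregate(PROPDOLLARS ~ EVTYPE, data = toy, sum)
scaled[order(scaled$PROPDOLLARS, decreasing = TRUE), ]
```

In the toy data, tornado has the largest raw PROPDMG value (500), yet flood has by far the largest dollar total once the codes are applied. The same effect can reorder a ranking built from unscaled sums.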

Now we can aggregate property and crop damage by event type and sort the totals in decreasing order.

damage <- aggregate(PROPDMG ~ EVTYPE, data = stormdata, sum)
damage1 <- damage[damage$PROPDMG > 0, ]
damageorder <- damage1[order(damage1$PROPDMG, decreasing = TRUE), ]
head(damageorder)
##                EVTYPE   PROPDMG
## 741           tornado 3212258.2
## 138       flash flood 1420124.6
## 762         tstm wind 1335995.6
## 154             flood  899938.5
## 671 thunderstorm wind  876844.2
## 209              hail  688693.4

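Note that the table above sums the raw PROPDMG values, which have not been scaled by PROPDMGEXP. On that basis tornado ranks first, followed by flash flood and tstm wind. The ranking in dollar terms, in which flood is the most harmful event for property, followed by hurricane (typhoon), requires applying the magnitudes first.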

cropdmg <- aggregate(CROPDMG ~ EVTYPE, data = stormdata, sum)
cropdmg1 <- cropdmg[cropdmg$CROPDMG > 0, ]
cropdmgorder <- cropdmg1[order(cropdmg1$CROPDMG, decreasing = TRUE), ]
head(cropdmgorder)
##                EVTYPE   CROPDMG
## 209              hail 579596.28
## 138       flash flood 179200.46
## 154             flood 168037.88
## 762         tstm wind 109202.60
## 741           tornado 100018.52
## 671 thunderstorm wind  66791.45

The table above again sums raw CROPDMG values, unscaled by CROPDMGEXP; on that basis hail ranks first, followed by flash flood and flood. In dollar terms, the most severe weather event for crops is drought, which has caused more than 10 billion dollars of damage over the last half century. Other event types causing severe crop damage are flood and hail.

The next graph shows the 10 most harmful events for property damage.

barplot(damageorder[1:10, 2], col = blues9, legend.text = damageorder[1:10,
    1], ylab = "Property damage", main = "10 natural events causing most property damage")

As we have seen, the two types of damage rank the events differently. We add them to obtain the total for each event type.

totaldmg <- merge(damageorder, cropdmgorder, by = "EVTYPE")
totaldmg$total <- totaldmg$PROPDMG + totaldmg$CROPDMG
totaldmgorder <- totaldmg[order(totaldmg$total, decreasing = TRUE), ]
totaldmgorder[1:5, ]
##         EVTYPE   PROPDMG  CROPDMG   total
## 84     tornado 3212258.2 100018.5 3312277
## 15 flash flood 1420124.6 179200.5 1599325
## 91   tstm wind 1335995.6 109202.6 1445198
## 31        hail  688693.4 579596.3 1268290
## 19       flood  899938.5 168037.9 1067976
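
One detail of this step is worth illustrating: `merge()` with only `by` keeps just the event types that appear in both tables, so an event with property damage but no crop damage (or the reverse) drops out of the total. The sketch below shows a full join as an alternative. The two small data frames are made up, and filling missing values with 0 is an assumption about how such events should be treated.

```r
# Toy tables (made up): wildfire has no crop entry, drought has no property entry
prop <- data.frame(EVTYPE = c("tornado", "flood", "wildfire"),
                   PROPDMG = c(300, 90, 40))
crop <- data.frame(EVTYPE = c("flood", "drought"),
                   CROPDMG = c(20, 150))

# all = TRUE keeps event types found in either table (full outer join)
total <- merge(prop, crop, by = "EVTYPE", all = TRUE)

# Assumption: a missing entry means no recorded damage of that kind
total$PROPDMG[is.na(total$PROPDMG)] <- 0
total$CROPDMG[is.na(total$CROPDMG)] <- 0

total$total <- total$PROPDMG + total$CROPDMG
total_ordered <- total[order(total$total, decreasing = TRUE), ]
total_ordered
```

With the default inner join, only flood would survive in this toy example; the full join keeps all four event types.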