Introduction
The basic goal of this assignment is to explore the NOAA Storm Database and to answer some basic questions about severe weather events. The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
Synopsis
The analysis of the storm event database shows that tornadoes are the most dangerous weather event for population health. Excessive heat is the second most dangerous event type by fatalities. The economic impact of weather events was also analyzed. Summing the recorded damage values without applying the magnitude codes (PROPDMGEXP and CROPDMGEXP), tornadoes, flash floods and thunderstorm winds account for the largest property damage between 1950 and 2011, and hail, flash floods and floods account for the largest crop damage.
Data Processing
The following code reads the data from the compressed CSV file and lists its columns:
stormdata <- read.csv("StormData.csv.bz2")
names(stormdata)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
length(table(stormdata$EVTYPE))
## [1] 985
# 985 different event types
# translate all letters to lowercase
event_types <- tolower(stormdata$EVTYPE)
# replace each blank or punctuation character with a space
event_types <- gsub("[[:blank:][:punct:]+]", " ", event_types)
length(unique(event_types))
## [1] 874
# 874 different events after data cleaning
# update the data frame
stormdata$EVTYPE <- event_types
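The cleaning step above replaces each blank or punctuation character individually, so labels that differ only in repeated or leading spaces stay distinct. As a minimal sketch on made-up event labels (the `raw` vector is illustrative, not taken from the database), runs of separators could instead be collapsed and the result trimmed. Applying this to the real data may change the count of 874 reported above.

```r
# Hypothetical labels illustrating common spelling variants.
raw <- c("TSTM WIND", "Tstm Wind/Hail", " FLASH FLOOD", "flash  flood", "Flash-Flood")

clean <- tolower(raw)
# Collapse every run of blanks or punctuation into one space.
clean <- gsub("[[:blank:][:punct:]]+", " ", clean)
# Drop leading and trailing whitespace.
clean <- trimws(clean)

unique(clean)
```

Whether such variants should be merged is a judgment call; merging reduces the number of event types but also changes the per-type totals.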
Results
Most harmful events to human health
This data set records four types of harm: fatalities (FATALITIES), injuries (INJURIES), property damage (PROPDMG) and crop damage (CROPDMG). The last two must be scaled by their magnitude codes, PROPDMGEXP and CROPDMGEXP. The first two are directly related to human health, so they are summarized first.
fatal <- aggregate(FATALITIES ~ EVTYPE, data = stormdata, sum)
fatal1 <- fatal[fatal$FATALITIES > 0, ]
fatalorder <- fatal1[order(fatal1$FATALITIES, decreasing = TRUE), ]
head(fatalorder)
## EVTYPE FATALITIES
## 741 tornado 5633
## 116 excessive heat 1903
## 138 flash flood 978
## 240 heat 937
## 410 lightning 816
## 762 tstm wind 504
The code above aggregates the fatality data by event type and ranks it in decreasing order. Tornadoes and excessive heat have caused the most fatalities from 1950 onwards. Next, we summarize the injury data:
injury <- aggregate(INJURIES ~ EVTYPE, data = stormdata, sum)
injury1 <- injury[injury$INJURIES > 0, ]
injuryorder <- injury1[order(injury1$INJURIES, decreasing = TRUE), ]
head(injuryorder)
## EVTYPE INJURIES
## 741 tornado 91346
## 762 tstm wind 6957
## 154 flood 6789
## 116 excessive heat 6525
## 410 lightning 5230
## 240 heat 2100
Tornadoes again rank first and cause by far the most injuries, followed by thunderstorm winds (tstm wind), floods and excessive heat.
These data are presented in two graphs, one for fatalities and one for injuries.
barplot(fatalorder[1:10, 2], col = blues9, legend.text = fatalorder[1:10,
1], ylab = "Fatality", main = "10 natural events causing most fatalities")
barplot(injuryorder[1:10, 2], col = blues9, legend.text = injuryorder[1:10,
1], ylab = "Injured people", main = "10 natural events causing most injuries")
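The plots above identify bars through a legend, and `blues9` supplies nine colours for ten bars, so the colours are recycled. As an alternative sketch, each bar can be labelled directly with `names.arg`. The three values are copied from the fatality table above; the colour, margins and title are illustrative choices.

```r
# Top three fatality totals copied from the table above.
top <- data.frame(
  EVTYPE = c("tornado", "excessive heat", "flash flood"),
  FATALITIES = c(5633, 1903, 978),
  stringsAsFactors = FALSE
)

op <- par(mar = c(8, 4, 3, 1))   # widen the bottom margin for rotated labels
mids <- barplot(
  top$FATALITIES,
  names.arg = top$EVTYPE,        # label each bar with its event type
  las = 2,                       # draw axis labels perpendicular to the axis
  col = "steelblue",
  ylab = "Fatalities",
  main = "Events causing most fatalities"
)
par(op)                          # restore the previous margins
```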
Therefore, we can now determine which events rank among the top 10 causes of both fatalities and injuries.
intersect(fatalorder[1:10, 1], injuryorder[1:10, 1])
## [1] "tornado" "excessive heat" "flash flood" "heat"
## [5] "lightning" "tstm wind" "flood"
Seven event types appear in the top 10 for both fatalities and injuries. Of these, tornadoes are the most harmful to human health, while excessive heat, flash floods and thunderstorm winds are listed as well.
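The aggregate, filter and order steps are repeated for fatalities and injuries, so they could be wrapped in one function. The following is a sketch only: `top_events` is a hypothetical helper, not part of the original analysis, and the toy data frame is made up for illustration.

```r
# Hypothetical helper: total `column` per event type, drop zero totals,
# and return the n largest in decreasing order.
top_events <- function(data, column, n = 10) {
  totals <- aggregate(data[[column]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- column
  totals <- totals[totals[[column]] > 0, ]
  totals <- totals[order(totals[[column]], decreasing = TRUE), ]
  head(totals, n)
}

# Made-up records, not values from the storm database.
toy <- data.frame(
  EVTYPE = c("tornado", "heat", "tornado", "flood", "heat", "fog"),
  FATALITIES = c(5, 3, 2, 1, 1, 0),
  INJURIES = c(40, 0, 10, 6, 2, 0),
  stringsAsFactors = FALSE
)

top_fatal <- top_events(toy, "FATALITIES", n = 2)
top_injury <- top_events(toy, "INJURIES", n = 2)

# Event types that appear in both top lists.
both <- intersect(top_fatal$EVTYPE, top_injury$EVTYPE)
both
```

With the real data, `top_events(stormdata, "FATALITIES")` would play the role of `fatalorder[1:10, ]`.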
Most harmful events to property and crops
Here we summarize the property and crop damage caused by these natural events. The magnitude columns PROPDMGEXP and CROPDMGEXP contain the following codes:
unique(stormdata$PROPDMGEXP)
## [1] "K" "M" "" "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-" "1" "8"
unique(stormdata$CROPDMGEXP)
## [1] "" "M" "K" "m" "B" "?" "0" "k" "2"
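One way to apply these codes is to convert them to multipliers before summing. The sketch below assumes the common reading of H, K, M and B as hundred, thousand, million and billion, in either case. Leaving all other codes (digits, symbols and the empty string) unscaled is an assumption made here for simplicity. `exp_to_multiplier` is a hypothetical helper and the toy data frame is made up.

```r
# Hypothetical helper: map an exponent code to a numeric multiplier.
exp_to_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])  # NA for codes not in the lookup
  mult[is.na(mult)] <- 1                 # assumption: other codes leave the value unscaled
  mult
}

# Made-up records, not values from the storm database.
toy <- data.frame(
  EVTYPE = c("flood", "flood", "hail", "tornado", "hail"),
  PROPDMG = c(2.5, 10, 500, 25, 3),
  PROPDMGEXP = c("B", "M", "K", "k", ""),
  stringsAsFactors = FALSE
)

# Scale each record to dollars, then total by event type.
toy$prop_usd <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
prop_totals <- aggregate(prop_usd ~ EVTYPE, data = toy, sum)
prop_totals[order(prop_totals$prop_usd, decreasing = TRUE), ]
```

Because a single record with a `B` code can outweigh thousands of `K` records, rankings based on scaled values can differ considerably from rankings based on the raw numbers.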
Now we aggregate property and crop damage by event type and sort each in decreasing order. Note that these sums use the raw PROPDMG and CROPDMG values; the magnitude codes are not applied.
damage <- aggregate(PROPDMG ~ EVTYPE, data = stormdata, sum)
damage1 <- damage[damage$PROPDMG > 0, ]
damageorder <- damage1[order(damage1$PROPDMG, decreasing = TRUE), ]
head(damageorder)
## EVTYPE PROPDMG
## 741 tornado 3212258.2
## 138 flash flood 1420124.6
## 762 tstm wind 1335995.6
## 154 flood 899938.5
## 671 thunderstorm wind 876844.2
## 209 hail 688693.4
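As the table above shows, tornadoes have the largest summed PROPDMG value, followed by flash floods and thunderstorm winds (recorded as both tstm wind and thunderstorm wind). These sums do not apply the PROPDMGEXP magnitude codes, so they are not dollar amounts, and the ranking could differ once the values are scaled.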
cropdmg <- aggregate(CROPDMG ~ EVTYPE, data = stormdata, sum)
cropdmg1 <- cropdmg[cropdmg$CROPDMG > 0, ]
cropdmgorder <- cropdmg1[order(cropdmg1$CROPDMG, decreasing = TRUE), ]
head(cropdmgorder)
## EVTYPE CROPDMG
## 209 hail 579596.28
## 138 flash flood 179200.46
## 154 flood 168037.88
## 762 tstm wind 109202.60
## 741 tornado 100018.52
## 671 thunderstorm wind 66791.45
For crop damage, hail has the largest summed CROPDMG value, followed by flash floods and floods. As with property damage, the CROPDMGEXP magnitude codes are not applied here, so the sums are not dollar amounts. Drought does not appear among the top six events on this basis.
The next graph shows the 10 events with the largest property damage.
barplot(damageorder[1:10, 2], col = blues9, legend.text = damageorder[1:10,
1], ylab = "Property damage", main = "10 natural events causing most property damage")
As the tables show, the ranking of events differs between the two types of damage. We add them to obtain the total damage per event type.
totaldmg <- merge(damageorder, cropdmgorder, by = "EVTYPE")
totaldmg$total <- totaldmg$PROPDMG + totaldmg$CROPDMG
totaldmgorder <- totaldmg[order(totaldmg$total, decreasing = TRUE), ]
totaldmgorder[1:5, ]
## EVTYPE PROPDMG CROPDMG total
## 84 tornado 3212258.2 100018.5 3312277
## 15 flash flood 1420124.6 179200.5 1599325
## 91 tstm wind 1335995.6 109202.6 1445198
## 31 hail 688693.4 579596.3 1268290
## 19 flood 899938.5 168037.9 1067976
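By default, `merge` keeps only event types present in both inputs, so an event with property damage but no crop damage (or the reverse) is dropped from the total. The sketch below, on made-up values, shows the difference and one possible alternative: an outer join with `all = TRUE`, treating a missing value as zero damage. That treatment is an assumption.

```r
# Made-up totals, not values from the storm database.
prop <- data.frame(EVTYPE = c("tornado", "hail", "wildfire"),
                   PROPDMG = c(300, 70, 50), stringsAsFactors = FALSE)
crop <- data.frame(EVTYPE = c("tornado", "hail", "drought"),
                   CROPDMG = c(10, 60, 90), stringsAsFactors = FALSE)

# Default (inner) join: wildfire and drought are dropped.
inner <- merge(prop, crop, by = "EVTYPE")

# Outer join: keep every event type from either table.
outer <- merge(prop, crop, by = "EVTYPE", all = TRUE)
outer$PROPDMG[is.na(outer$PROPDMG)] <- 0   # assumption: missing means no damage
outer$CROPDMG[is.na(outer$CROPDMG)] <- 0
outer$total <- outer$PROPDMG + outer$CROPDMG

outer <- outer[order(outer$total, decreasing = TRUE), ]
outer
```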