Weather event data in the United States from 1950 through 2011 are examined to determine their impact on human populations. The analysis addresses two questions: 1) Which types of events are most harmful to population health? 2) Which types of events have the greatest economic consequences?
Original data source: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
Trivial processing steps not shown: downloading the data file and placing it in the working directory so that it can be loaded.
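The download step could look something like the sketch below. It assumes the file is saved under the name used by the `read.csv` call later in this report; `fetch_if_missing` is a hypothetical helper and not part of the original analysis.

```r
# Hypothetical helper: download a file only when it is not already present.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the bzip2 archive intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  invisible(dest)
}

# URL taken from "Original data source" above; file name taken from Step 1
data_url  <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
data_file <- "repdata-data-StormData.csv.bz2"
```

Calling `fetch_if_missing(data_url, data_file)` once before Step 1 would leave the archive in the working directory and skip the download on later runs.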
options(warn=-1)
library(knitr)
library(ggplot2)
library(reshape2)
Each question requires a different subset of data for analysis.
Step 1. Read in the file.
storms <- read.csv(bzfile("repdata-data-StormData.csv.bz2"), header=TRUE, sep=",")
head(storms,3)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO 0 0
## 2 TORNADO 0 0
## 3 TORNADO 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1 NA 0 14.0 100 3 0 0
## 2 NA 0 2.0 150 2 0 0
## 3 NA 0 0.1 123 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1 15 25.0 K 0
## 2 0 2.5 K 0
## 3 2 25.0 K 0
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3040 8812 3051 8806 1
## 2 3042 8755 0 0 2
## 3 3340 8742 0 0 3
Step 2. Prepare the data to answer Q1. Before processing, an examination of the data reveals several types of weather events that we want to aggregate. For example, there are a number of wind-related EVTYPE values, and many events related to cold weather and winter-like conditions. Consolidating weather event types gives a broader overview for presentation purposes.
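# Aggregate EVTYPEs using logical vectors (grepl).
# as.character() guards against EVTYPE being read as a factor, where assigning a label that is not an existing level would yield NA.
stormTypes<- as.character(storms$EVTYPE)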
wind<- grepl('wind', stormTypes, ignore.case=TRUE)
heat<- grepl('heat|hot|warm', stormTypes, ignore.case=TRUE)
cold<- grepl('blizzard|freez|ice|snow|winter|cold', stormTypes, ignore.case=TRUE)
flood<- grepl('flood|rain', stormTypes, ignore.case=TRUE)
hurricane<- grepl('hurricane|typhoon', stormTypes, ignore.case=TRUE)
waters<- grepl('water|surf|current', stormTypes, ignore.case=TRUE)
stormTypes[wind] <- "WIND"
stormTypes[heat] <- "HEAT"
stormTypes[cold] <- "COLD"
stormTypes[flood] <- "FLOOD"
stormTypes[hurricane] <- "HURRICANE"
stormTypes[waters] <- "WATERS"
storms$STYPES <- stormTypes
sub1 <- data.frame(TYPE = storms$STYPES, INJURIES = storms$INJURIES, FATALITIES = storms$FATALITIES)
# Now aggregate health columns based on weather event types
stormInjuries <- aggregate(INJURIES ~ TYPE, sub1, sum)
stormFatalities <- aggregate(FATALITIES ~ TYPE, sub1, sum)
stormHarm <- merge(stormInjuries, stormFatalities, by = "TYPE")
colnames(stormHarm) <- c("TYPE", "INJURIES", "FATALITIES")
# And prepare the data for presentation in the Results section
c1<- order(stormHarm$INJURIES, decreasing=TRUE)
c2<- order(stormHarm$FATALITIES, decreasing=TRUE)
c3 <- order(stormHarm$INJURIES + stormHarm$FATALITIES, decreasing=TRUE)
stormI<- head(stormHarm[c1,], 10)
stormF <- head(stormHarm[c2,], 10)
stormA <- head(stormHarm[c3,], 10)
zLabels<- c(as.character(stormI$TYPE), as.character(stormF$TYPE))
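To make the consolidation in Step 2 concrete, here is a minimal sketch on a handful of made-up event names and injury counts (not taken from the data set). It uses the same patterns and the same assignment order as above, so a label matched by more than one pattern ends up in whichever group is assigned last.

```r
# Made-up event names and injury counts, for illustration only
ev  <- c("TSTM WIND", "FLASH FLOOD", "EXCESSIVE HEAT",
         "HEAVY SNOW", "TORNADO", "FLOOD/RAIN/WINDS")
inj <- c(1, 2, 3, 4, 5, 6)

# Patterns are matched against the original names, then applied in order
grp <- ev
grp[grepl("wind", ev, ignore.case = TRUE)] <- "WIND"
grp[grepl("heat|hot|warm", ev, ignore.case = TRUE)] <- "HEAT"
grp[grepl("blizzard|freez|ice|snow|winter|cold", ev, ignore.case = TRUE)] <- "COLD"
grp[grepl("flood|rain", ev, ignore.case = TRUE)] <- "FLOOD"

# Sum injuries within each consolidated type
toy <- data.frame(TYPE = grp, INJURIES = inj)
agg <- aggregate(INJURIES ~ TYPE, data = toy, FUN = sum)
print(agg)
```

Here "FLOOD/RAIN/WINDS" matches both the wind and the flood patterns and is counted under FLOOD, because the flood assignment runs later. Names that match no pattern, such as "TORNADO", keep their original label.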
Step 3. Now prepare the data to answer Q2.
#We already have our weather event types -- just need economic consequence aggregation
sub2 <- data.frame(TYPE = storms$STYPES, Property = storms$PROPDMG, Crops = storms$CROPDMG)
stormProperty <- aggregate(Property ~ TYPE, sub2, sum)
stormCrops <- aggregate(Crops ~ TYPE, sub2, sum)
stormEharm <- merge(stormProperty, stormCrops, by = "TYPE")
colnames(stormEharm) <- c("TYPE", "Property", "Crops")
#And we prepare again for presentation in the Results section
cE1<- order(stormEharm$Property, decreasing=TRUE)
cE2<- order(stormEharm$Crops, decreasing=TRUE)
cE3 <- order(stormEharm$Property + stormEharm$Crops, decreasing=TRUE)
stormP<- head(stormEharm[cE1,], 10)
stormC <- head(stormEharm[cE2,], 10)
stormE <- head(stormEharm[cE3,], 10)
zELabels<- c(as.character(stormP$TYPE), as.character(stormC$TYPE))
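The sums above add `PROPDMG` and `CROPDMG` as stored. The `PROPDMGEXP` and `CROPDMGEXP` columns visible in the `head()` output hold magnitude codes such as K, and the sketch below shows one way they might be applied before aggregating. The mapping of K, M, and B to thousand, million, and billion is an assumption here and should be checked against the data set's documentation. `scale_damage` is a hypothetical helper, and any other code is treated as a multiplier of 1.

```r
# Hypothetical helper: scale a damage figure by its magnitude code.
# Assumed mapping: K = 1e3, M = 1e6, B = 1e9; anything else leaves the value unchanged.
scale_damage <- function(value, exp_code) {
  mult <- c(K = 1e3, M = 1e6, B = 1e9)
  code <- toupper(as.character(exp_code))
  m <- unname(mult[code])   # NA where the code is not in the lookup
  m[is.na(m)] <- 1
  value * m
}

# Made-up values for illustration
scale_damage(c(25, 2.5, 1, 3), c("K", "M", "", "b"))
```

Applied to the data, this might read `storms$PROPDMG * ...` replaced by `scale_damage(storms$PROPDMG, storms$PROPDMGEXP)` before the `aggregate` calls. Doing so would change the totals and could change the rankings reported in the Results.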
Q1: Which types of events are most harmful to population health?
par(mar=c(10,5,5,5))
color1=as.character(rep("grey", 10))
color2=as.character(rep("red",10))
colors<- c(color1, color2)
mids <- barplot(as.matrix(c(stormI[,2], stormF[,3])), beside=TRUE, names.arg = NULL, axisnames = FALSE, las=1, log="y", main="WEATHER RELATED HEALTH\nWeather Events Most Harmful to Population Health in U.S.\n(log scale)", col=colors, space=0.4)
par(las=2)
axis(1, at=mids, labels=zLabels, cex.axis=0.75)
par(las=1)
legend("topright", c("Left (grey): INJURIES", "Right (red): FATALITIES"), bg="yellow", cex=0.8)
vr<-sum(storms$INJURIES) / sum(storms$FATALITIES)
print(paste("Aggregate Number of Injuries is", round(vr,2), " times greater than aggregate number of Fatalities."))
## [1] "Aggregate Number of Injuries is 9.28 times greater than aggregate number of Fatalities."
A: Using a logarithmic scale to depict the data above (which makes it easier to rank event types), Tornado, other Wind, Heat, Flood, and Cold related events cause the most injuries. But (in order) Tornado, Heat, Flood, Wind, and Cold related weather events cause the most fatalities.
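An alternative layout, sketched below with made-up numbers, pairs the two measures for each event type so that injuries and fatalities can be compared within a type. `plot_harm` and the `toy` values are hypothetical; only base R's `barplot` is used.

```r
# Hypothetical helper: grouped bars (one pair per event type) on a log y-axis.
# Expects a data frame with TYPE, INJURIES and FATALITIES columns.
plot_harm <- function(df) {
  heights <- t(as.matrix(df[, c("INJURIES", "FATALITIES")]))
  # A log axis cannot show zero or negative counts
  stopifnot(all(heights > 0))
  barplot(heights, beside = TRUE, names.arg = as.character(df$TYPE),
          log = "y", las = 2, col = c("grey", "red"),
          legend.text = TRUE, args.legend = list(x = "topright", cex = 0.8),
          main = "Injuries and fatalities by event type\n(log scale)")
}

# Made-up counts for illustration only
toy <- data.frame(TYPE = c("A", "B", "C"),
                  INJURIES = c(900, 120, 40),
                  FATALITIES = c(50, 30, 8))
```

Calling `plot_harm(toy)` draws three pairs of bars and returns the bar midpoints. The same call should work on the top rows of `stormHarm`, provided none of the selected counts is zero.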
Q2: Which types of events have the greatest economic consequences?
par(mar=c(10,5,5,5))
color1=as.character(rep("green", 10))
color2=as.character(rep("darkgreen",10))
colors<- c(color1, color2)
mids <- barplot(as.matrix(c(stormP[,2], stormC[,3])), beside=T, names.arg = NULL, axisnames = FALSE, las=1, log="y", main="WEATHER RELATED ECONOMICS\nGreatest Impact\n(log scale)", col=colors, space=0.4)
par(las=2)
axis(1, at=mids, labels=zELabels, cex.axis=0.75)
par(las=1)
legend("topright", c("Left (green): PROPERTY", "Right (dark green): CROPS"), bg="pink", cex=0.8)
vr<-sum(storms$PROPDMG) / sum(storms$CROPDMG)
print(paste("Aggregate Property Damage is", round(vr,2), " times greater than aggregate Crop damage."))
## [1] "Aggregate Property Damage is 7.9 times greater than aggregate Crop damage."
A: Again, a logarithmic scale is used to depict the data above, since it makes event types easier to compare and rank. Tornado, other Wind, Flood, Hail, and Lightning related events cause the most Property damage. But (in order) Hail, Flood, Wind, Tornado, and Drought cause the most Crop damage.
Public officials are better prepared to plan effective programs that reduce weather-related harm to population health and ameliorate economic consequences when they can determine how best to prioritize preparations for such events. An analysis of the greatest damage as a function of weather event type can help public officials prioritize resources accordingly.