Storm Data is an official publication of the National Oceanic and Atmospheric Administration (NOAA) which documents:
The occurrence of storms and other significant weather phenomena having sufficient intensity to cause loss of life, injuries, significant property damage, and/or disruption to commerce;
Rare, unusual weather phenomena that generate media attention, such as snow flurries in South Florida or the San Diego coastal area; and
Other significant meteorological events, such as record maximum or minimum temperatures or precipitation that occur in connection with another event.
The data for this analysis were downloaded from: https://t.ly/dv0RA
The data documentation is available at: https://t.ly/yYdEn
Between 1950 and 2011, the National Oceanic and Atmospheric Administration (NOAA) recorded storm events across the United States in a database, along with their economic and health-related impact on the population. The goal of this report is to review the available data and answer two questions:
Which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Which types of events have the greatest economic consequences?
I present my findings and graph the results to better understand these events, how they affect population health, and what economic cost is required to remediate and recover from these disasters.
# read.csv reads the bz2-compressed file directly and handles quoted text fields such as REMARKS
noaaData <- read.csv("repdata_data_StormData.csv.bz2", header = TRUE)
Now that the raw data have been read into a data frame, let's review the column headers and the first few rows:
head(noaaData)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE EVTYPE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL TORNADO
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL TORNADO
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL TORNADO
## 4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL TORNADO
## 5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL TORNADO
## 6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL TORNADO
## BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1 0 0 NA
## 2 0 0 NA
## 3 0 0 NA
## 4 0 0 NA
## 5 0 0 NA
## 6 0 0 NA
## END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1 0 14.0 100 3 0 0 15 25.0
## 2 0 2.0 150 2 0 0 0 2.5
## 3 0 0.1 123 2 0 0 2 25.0
## 4 0 0.0 100 2 0 0 2 2.5
## 5 0 0.0 150 2 0 0 2 2.5
## 6 0 1.5 177 2 0 0 6 2.5
## PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1 K 0 3040 8812
## 2 K 0 3042 8755
## 3 K 0 3340 8742
## 4 K 0 3458 8626
## 5 K 0 3412 8642
## 6 K 0 3450 8748
## LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3051 8806 1
## 2 0 0 2
## 3 0 0 3
## 4 0 0 4
## 5 0 0 5
## 6 0 0 6
… and the dimensions of the data used for the analysis:
dim(noaaData)
## [1] 902297 37
library(plyr)     # ddply
library(dplyr)    # first, last (loaded after plyr)
library(stringr)  # str_sub

# summary(noaaData)
# Date range of the records: strip the trailing " 0:00:00" from the first and last BGN_DATE
eDatef <- first(str_sub(noaaData$BGN_DATE, end=-9))
eDatel <- last(str_sub(noaaData$BGN_DATE, end=-9))

# Total fatalities and injuries per event type, sorted by injuries (largest first)
dataC = ddply(noaaData,.(EVTYPE),summarise,FATALITIES = sum(FATALITIES), INJURIES = sum(INJURIES))
dataC = dataC[order(-dataC$INJURIES),]

# Rows 4 and 6 hold EXCESSIVE HEAT and HEAT; they are merged into row b, all other rows are copied as-is
a <- data.frame(EVTYPE = dataC[1,1], FATALITIES = dataC[1,2], INJURIES = dataC[1,3])
b <- data.frame(EVTYPE = dataC[6,1], FATALITIES = dataC[4,2] + dataC[6,2], INJURIES = dataC[4,3] + dataC[6,3])
d <- data.frame(EVTYPE = dataC[2,1], FATALITIES = dataC[2,2], INJURIES = dataC[2,3])
e <- data.frame(EVTYPE = dataC[3,1], FATALITIES = dataC[3,2], INJURIES = dataC[3,3])
f <- data.frame(EVTYPE = dataC[5,1], FATALITIES = dataC[5,2], INJURIES = dataC[5,3])
g <- data.frame(EVTYPE = dataC[7,1], FATALITIES = dataC[7,2], INJURIES = dataC[7,3])
h <- data.frame(EVTYPE = dataC[8,1], FATALITIES = dataC[8,2], INJURIES = dataC[8,3])
i <- data.frame(EVTYPE = dataC[9,1], FATALITIES = dataC[9,2], INJURIES = dataC[9,3])
j <- data.frame(EVTYPE = dataC[10,1], FATALITIES = dataC[10,2], INJURIES = dataC[10,3])
k <- data.frame(EVTYPE = dataC[11,1], FATALITIES = dataC[11,2], INJURIES = dataC[11,3])
cData <- rbind(a,b,d,e,f,g,h,i,j,k)
names(cData) <- c("Event", "Fatalities", "Injuries")

# Total PROPDMG and CROPDMG per event type (raw values; PROPDMGEXP/CROPDMGEXP are not applied)
dataCost<-ddply(noaaData,.(EVTYPE),summarise,PROPDMG = sum(PROPDMG), CROPDMG = sum(CROPDMG))
# Top 10 by property damage (EVTYPE, PROPDMG) and, ranked separately, top 10 by crop damage (EVTYPE, CROPDMG)
propDamg = dataCost[order(-dataCost$PROPDMG),][1:10,1:2]
cropDamg = dataCost[order(-dataCost$CROPDMG),][1:10,c(1,3)]
# Scale the totals by 0.0001 and round to whole numbers for reporting
propDamgB <- round((propDamg[,2]*0.0001),0)
cropDamgB <- round((cropDamg[,2]*0.0001),0)
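The grouped sums above rely on `plyr::ddply`. As a point of comparison, the same "sum per event type, then rank" pattern can be sketched in base R. The `toy` data frame and the `top_n_events` helper below are hypothetical names introduced only for illustration, and the numbers are made up rather than taken from the NOAA file.

```r
# Toy data standing in for noaaData (values are illustrative only)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(2, 1, 3, 0, 0),
  INJURIES   = c(10, 4, 20, 1, 2),
  stringsAsFactors = FALSE
)

# Sum both health columns per event type
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Hypothetical helper: sort a data frame by one column (largest first), keep the top n rows
top_n_events <- function(df, column, n = 10) {
  head(df[order(-df[[column]]), ], n)
}

top_health <- top_n_events(health, "INJURIES", n = 2)
print(top_health)
```

Selecting rows by rank through a helper like this avoids hard-coding row positions, which would shift if the underlying data changed.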
In the United States, according to the collected data, the 10 event types most harmful to human health between 4/18/1950 and 11/28/2011 were:
TORNADO, causing 5633 Fatalities and 91346 Injuries.
HEAT / EXCESSIVE HEAT, causing 2840 Fatalities and 8625 Injuries.
(Please note that I combine the EXCESSIVE HEAT and HEAT event types into a single HEAT result. Taken together, I believe these two event types give a better representation of the overall impact of heat on human health.)
TSTM WIND, causing 504 Fatalities and 6957 Injuries.
FLOOD, causing 470 Fatalities and 6789 Injuries.
LIGHTNING, causing 816 Fatalities and 5230 Injuries.
ICE STORM, causing 89 Fatalities and 1975 Injuries.
FLASH FLOOD, causing 978 Fatalities and 1777 Injuries.
THUNDERSTORM WIND, causing 133 Fatalities and 1488 Injuries.
HAIL, causing 15 Fatalities and 1361 Injuries.
WINTER STORM, causing 206 Fatalities and 1321 Injuries.
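The HEAT / EXCESSIVE HEAT combination in the list above could also be made by relabelling the event types before summing, instead of adding rows after sorting. The sketch below assumes a hypothetical lookup vector `relabel` and a toy data frame with made-up numbers; it is not the code used to produce the figures in this report. The same pattern might apply to other near-duplicate labels in the list, such as TSTM WIND and THUNDERSTORM WIND, though whether those should be merged is a separate judgment call.

```r
# Toy data standing in for noaaData (values are illustrative only)
toy <- data.frame(
  EVTYPE     = c("HEAT", "EXCESSIVE HEAT", "TORNADO", "HEAT"),
  FATALITIES = c(1, 2, 5, 3),
  INJURIES   = c(10, 20, 50, 30),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: names are the raw labels, values are the merged label
relabel <- c("HEAT"           = "HEAT / EXCESSIVE HEAT",
             "EXCESSIVE HEAT" = "HEAT / EXCESSIVE HEAT")

# Replace labels found in the lookup; leave every other label unchanged
toy$EVENT <- ifelse(toy$EVTYPE %in% names(relabel),
                    unname(relabel[toy$EVTYPE]),
                    toy$EVTYPE)

# Sum per merged label
merged <- aggregate(cbind(FATALITIES, INJURIES) ~ EVENT, data = toy, FUN = sum)
print(merged)
```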
# Log10-scaled fatalities and injuries, one pair of bars per event
plotdata <- t(cbind(log10(cData$Fatalities), log10(cData$Injuries)))
op <- par(mfrow=c(1,1), mar = c(8,3,3,3))
barplot(plotdata, beside=TRUE, names.arg = cData$Event,
main = "Plot of Most Harmful Events (Log10)",
col=c("gold","orange"), legend = c("Fatalities","Injuries"),
args.legend = list(x = "topright", bty="n", horiz = TRUE),
cex.names = 0.7, las = 3)
par(op)  # restore the previous graphics settings
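In the United States, the events with the greatest economic consequences from 1950 to 2011 are listed below. The figures are the summed PROPDMG and CROPDMG values scaled by 0.0001, without applying the PROPDMGEXP and CROPDMGEXP magnitude codes, so they are best read as a relative ranking rather than exact dollar amounts. Property and crop totals are ranked independently, so the crop figure on each line appears to be the crop total of that rank and may belong to a different event type.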
TORNADO, causing 321 Billion Dollars in Property Damage and 58 Billion Dollars in Crop Damage
FLASH FLOOD, causing 142 Billion Dollars in Property Damage and 18 Billion Dollars in Crop Damage
TSTM WIND, causing 134 Billion Dollars in Property Damage and 17 Billion Dollars in Crop Damage
FLOOD, causing 90 Billion Dollars in Property Damage and 11 Billion Dollars in Crop Damage
THUNDERSTORM WIND, causing 88 Billion Dollars in Property Damage and 10 Billion Dollars in Crop Damage
HAIL, causing 69 Billion Dollars in Property Damage and 7 Billion Dollars in Crop Damage
LIGHTNING, causing 60 Billion Dollars in Property Damage and 3 Billion Dollars in Crop Damage
THUNDERSTORM WINDS, causing 45 Billion Dollars in Property Damage and 2 Billion Dollars in Crop Damage
HIGH WIND, causing 32 Billion Dollars in Property Damage and 2 Billion Dollars in Crop Damage
WINTER STORM, causing 13 Billion Dollars in Property Damage and 1 Billion Dollars in Crop Damage
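Dollar amounts in Storm Data are stored as a number (PROPDMG, CROPDMG) plus a magnitude code (PROPDMGEXP, CROPDMGEXP), where the documentation describes K as thousands, M as millions and B as billions. A minimal sketch of applying those codes before summing is shown below. The toy values are made up, the `exp_to_multiplier` helper is a hypothetical name, and treating blank or unrecognised codes as a multiplier of 1 is an assumption rather than documented behaviour.

```r
# Toy data standing in for noaaData (values are illustrative only)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 2.5, 1.5, 500),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Documented magnitude codes: thousands, millions, billions
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: map a vector of codes to numeric multipliers
exp_to_multiplier <- function(code) {
  m <- multiplier[toupper(code)]  # assumption: lower-case codes mean the same thing
  m[is.na(m)] <- 1                # assumption: blank/unknown codes leave the value unscaled
  unname(m)
}

# Damage in dollars, then summed per event type and expressed in billions
toy$PROP_USD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
totals <- aggregate(PROP_USD ~ EVTYPE, data = toy, FUN = sum)
totals$PROP_BILLIONS <- totals$PROP_USD / 1e9
print(totals[order(-totals$PROP_USD), ])
```

The same helper could be applied to CROPDMG and CROPDMGEXP, and the two dollar columns kept in one data frame so that property and crop damage stay matched to the same event type.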
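# Take crop totals for the same ten events as the property totals, so both bars match the event label
ecoTop <- dataCost[order(-dataCost$PROPDMG),][1:10,]
ecodata <- t(cbind(log10(ecoTop$PROPDMG), log10(ecoTop$CROPDMG)))
op <- par(mfrow=c(1,1), mar = c(8,3,3,3))
barplot(ecodata, beside=TRUE, names.arg = ecoTop$EVTYPE,
main = "Plot of Economic Consequence (Log10)",
col=c("gold","orange"), legend = c("Property","Crop"),
args.legend = list(x = "topright", bty="n", horiz = TRUE),
cex.names = 0.7, las = 3)
par(op)  # restore the previous graphics settings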
In the United States, the natural event most harmful to population health was TORNADO, causing 5633 Fatalities and 91346 Injuries. TORNADO was also the event that cost the most to recover from, causing 321 Billion Dollars in Property Damage and 58 Billion Dollars in Crop Damage.