Storm Data Disclaimer

Storm Data is an official publication of the National Oceanic and Atmospheric Administration (NOAA) which documents:

  1. The occurrence of storms and other significant weather phenomena having sufficient intensity to cause loss of life, injuries, significant property damage, and/or disruption to commerce;

  2. Rare, unusual, weather phenomena that generate media attention, such as snow flurries in South Florida or the San Diego coastal area; and…

  3. Other significant meteorological events, such as record maximum or minimum temperatures or precipitation that occur in connection with another event.

The Data for this analysis were downloaded from: https://t.ly/dv0RA

The Data documentation is available from the following link: https://t.ly/yYdEn

1. Synopsis

Between 1950 and 2011, the National Oceanic and Atmospheric Administration (NOAA) recorded storm events across the United States in a database, together with their economic and health-related impact on the population. The goal of this report is to review the available data and answer two key questions:

  1. Which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

  2. Which types of events have the greatest economic consequences?

I present my findings and graph the results in an effort to better understand these events, how they affect our overall health, and the economic cost of remediating and recovering from these disasters.

2. Data Processing

library(plyr)     # ddply()
library(dplyr)    # first(), last()
library(stringr)  # str_sub()

noaaData <- read.csv("repdata_data_StormData.csv.bz2")

Now that the raw data have been read into the noaaData table, let's review the column headers and the first few rows:

head(noaaData)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
## 4         0                                               0         NA
## 5         0                                               0         NA
## 6         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
## 4         0                       0.0   100 2   0          0        2     2.5
## 5         0                       0.0   150 2   0          0        2     2.5
## 6         0                       1.5   177 2   0          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
## 4          K       0                                         3458      8626
## 5          K       0                                         3412      8642
## 6          K       0                                         3450      8748
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3
## 4          0          0              4
## 5          0          0              5
## 6          0          0              6

… and Data Dimensions for analysis:

dim(noaaData)
## [1] 902297     37
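# summary(noaaData)
# First and last event dates in file order, with the trailing " 0:00:00" removed
eDatef <- first(str_sub(noaaData$BGN_DATE, end = -9))
eDatel <- last(str_sub(noaaData$BGN_DATE, end = -9))
# Total fatalities and injuries per event type, sorted by injuries (descending)
dataC <- ddply(noaaData, .(EVTYPE), summarise, FATALITIES = sum(FATALITIES), INJURIES = sum(INJURIES))
dataC <- dataC[order(-dataC$INJURIES), ]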
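The per-event totals above can also be computed with base R alone. The following is a minimal sketch, assuming a small made-up data frame (`toy` and its values are hypothetical, not taken from the NOAA file) that has the same column names as noaaData:

```r
# Toy stand-in for noaaData; the values are invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(1, 2, 3, 0, 4),
  INJURIES   = c(10, 5, 20, 7, 1)
)

# Sum fatalities and injuries within each event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank event types by injuries, largest first.
totals <- totals[order(-totals$INJURIES), ]
print(totals)
```

Ordering on the column vector `totals$INJURIES` rather than on a one-column data frame keeps `order()` working on plain numbers.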

Prepare the plot data: build a clean data set of the top events for the graph.

a <- data.frame(EVTYPE = dataC[1,1], FATALITIES = dataC[1,2], INJURIES = dataC[1,3])
# Rows 4 and 6 hold the two heat categories (EXCESSIVE HEAT and HEAT); add them into one entry
b <- data.frame(EVTYPE = dataC[6,1], FATALITIES = dataC[4,2] + dataC[6,2], INJURIES = dataC[4,3] + dataC[6,3])
d <- data.frame(EVTYPE = dataC[2,1], FATALITIES = dataC[2,2], INJURIES = dataC[2,3])
e <- data.frame(EVTYPE = dataC[3,1], FATALITIES = dataC[3,2], INJURIES = dataC[3,3])
f <- data.frame(EVTYPE = dataC[5,1], FATALITIES = dataC[5,2], INJURIES = dataC[5,3])
g <- data.frame(EVTYPE = dataC[7,1], FATALITIES = dataC[7,2], INJURIES = dataC[7,3])
h <- data.frame(EVTYPE = dataC[8,1], FATALITIES = dataC[8,2], INJURIES = dataC[8,3])
i <- data.frame(EVTYPE = dataC[9,1], FATALITIES = dataC[9,2], INJURIES = dataC[9,3])
j <- data.frame(EVTYPE = dataC[10,1], FATALITIES = dataC[10,2], INJURIES = dataC[10,3])
k <- data.frame(EVTYPE = dataC[11,1], FATALITIES = dataC[11,2], INJURIES = dataC[11,3])
cData <- rbind(a,b,d,e,f,g,h,i,j,k)
names(cData) <- c("Event", "Fatalities", "Injuries")
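# Total PROPDMG and CROPDMG per event type (raw values; the PROPDMGEXP and CROPDMGEXP magnitude codes are not applied)
dataCost <- ddply(noaaData, .(EVTYPE), summarise, PROPDMG = sum(PROPDMG), CROPDMG = sum(CROPDMG))
# Top 10 by property damage and, separately, top 10 by crop damage
propDamg <- dataCost[order(-dataCost$PROPDMG), ][1:10, c(1, 2)]
cropDamg <- dataCost[order(-dataCost$CROPDMG), ][1:10, c(1, 3)]

# Scaled totals used in the Results text
propDamgB <- round(propDamg[, 2] * 0.0001, 0)
cropDamgB <- round(cropDamg[, 2] * 0.0001, 0)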
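The table header shown earlier includes PROPDMGEXP and CROPDMGEXP columns (for example "K" in the first rows), which carry the magnitude of each damage figure. The sketch below shows one way such codes could be turned into dollar amounts before summing. It assumes that K, M and B stand for thousands, millions and billions; the helper `exp_multiplier` and the `toy` data are hypothetical, and treating every other code as a multiplier of 1 is a simplifying assumption, not the documented rule.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Only K, M and B are handled; any other code is treated as 1 (an assumption).
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy stand-in for noaaData; the values are invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(25, 1.5, 2.5, 10),
  PROPDMGEXP = c("K", "B", "M", "")
)

# Convert each record to dollars, then total per event type.
toy$PROP_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
usd <- aggregate(PROP_USD ~ EVTYPE, data = toy, FUN = sum)
usd <- usd[order(-usd$PROP_USD), ]
print(usd)
```

Applying the multiplier per record before aggregation matters because records of the same event type can carry different magnitude codes.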

3. Results

3.1. First Question: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

In the United States of America, according to the collected data, the 10 event types most harmful to human health between 4/18/1950 and 11/28/2011, ranked by number of injuries, were:

  1. TORNADO, causing 5633 Fatalities and 91346 Injuries.

  2. HEAT / EXCESSIVE HEAT, causing 2840 Fatalities and 8625 Injuries.

(Please note that I am combining the EXCESSIVE HEAT and HEAT event types into a single HEAT result. In my view, the two combined give a better representation of the overall harm of heat to human health.)

  3. TSTM WIND, causing 504 Fatalities and 6957 Injuries.

  4. FLOOD, causing 470 Fatalities and 6789 Injuries.

  5. LIGHTNING, causing 816 Fatalities and 5230 Injuries.

  6. ICE STORM, causing 89 Fatalities and 1975 Injuries.

  7. FLASH FLOOD, causing 978 Fatalities and 1777 Injuries.

  8. THUNDERSTORM WIND, causing 133 Fatalities and 1488 Injuries.

  9. HAIL, causing 15 Fatalities and 1361 Injuries.

  10. WINTER STORM, causing 206 Fatalities and 1321 Injuries.
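The HEAT and EXCESSIVE HEAT combination in the list above is built in the Data Processing step by adding two rows of the sorted table by position. A less position-dependent alternative is to relabel the event types before aggregating. The following is a minimal sketch on made-up data (`toy` and its values are hypothetical), not the code used for the figures above:

```r
# Toy stand-in for noaaData; the values are invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "EXCESSIVE HEAT", "HEAT", "FLOOD", "HEAT"),
  FATALITIES = c(5, 3, 2, 1, 4),
  INJURIES   = c(50, 10, 6, 8, 2),
  stringsAsFactors = FALSE
)

# Give both heat categories one combined label before aggregating.
is_heat <- toy$EVTYPE %in% c("HEAT", "EXCESSIVE HEAT")
toy$EVTYPE[is_heat] <- "HEAT / EXCESSIVE HEAT"

# Totals per (relabelled) event type, ranked by injuries.
combined <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
combined <- combined[order(-combined$INJURIES), ]
print(combined)
```

Because the merge happens by name, the result does not depend on where the two heat rows happen to fall in the ranking.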

Plot the data for the most harmful events in the United States from 1950 to 2011:

# Log10 scale keeps the smaller events visible next to TORNADO
plotdata <- t(cbind(log10(cData$Fatalities), log10(cData$Injuries)))
op <- par(mfrow = c(1, 1), mar = c(8, 3, 3, 3))
barplot(plotdata, beside = TRUE, names.arg = cData$Event,
        main = "Plot of Most Harmful Events (Log10)",
        col = c("gold", "orange"), legend = c("Fatalities", "Injuries"),
        args.legend = list(x = "topright", bty = "o", horiz = TRUE),
        cex.names = 0.7, las = 3)
par(op)  # restore the previous graphics settings

3.2. Second Question: Across the United States, which types of events have the greatest economic consequences?

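In the United States, the events with the greatest economic consequences from 1950 to 2011 are listed below. The figures are the summed PROPDMG and CROPDMG values scaled by 0.0001 in the Data Processing step, without the PROPDMGEXP and CROPDMGEXP magnitude codes applied. Property and crop damage are ranked separately, so the crop figure on each line is the crop total at that rank and may belong to a different event than the one named: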

  1. TORNADO, causing 321 Billion Dollars in Property Damage and 58 Billion Dollars in Crop Damage

  2. FLASH FLOOD, causing 142 Billion Dollars in Property Damage and 18 Billion Dollars in Crop Damage

  3. TSTM WIND, causing 134 Billion Dollars in Property Damage and 17 Billion Dollars in Crop Damage

  4. FLOOD, causing 90 Billion Dollars in Property Damage and 11 Billion Dollars in Crop Damage

  5. THUNDERSTORM WIND, causing 88 Billion Dollars in Property Damage and 10 Billion Dollars in Crop Damage

  6. HAIL, causing 69 Billion Dollars in Property Damage and 7 Billion Dollars in Crop Damage

  7. LIGHTNING, causing 60 Billion Dollars in Property Damage and 3 Billion Dollars in Crop Damage

  8. THUNDERSTORM WINDS, causing 45 Billion Dollars in Property Damage and 2 Billion Dollars in Crop Damage

  9. HIGH WIND, causing 32 Billion Dollars in Property Damage and 2 Billion Dollars in Crop Damage

  10. WINTER STORM, causing 13 Billion Dollars in Property Damage and 1 Billion Dollars in Crop Damage

Plot of the greatest Economic Consequences in the United States from 1950 - 2011

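# Log10 of the top-10 property and top-10 crop totals.
# Note: cropDamg is ranked on its own, so the crop bars follow the crop ranking
# rather than the event labels taken from propDamg.
ecodata <- t(cbind(log10(propDamg$PROPDMG), log10(cropDamg$CROPDMG)))
op <- par(mfrow = c(1, 1), mar = c(8, 3, 3, 3))
barplot(ecodata, beside = TRUE, names.arg = propDamg$EVTYPE,
        main = "Plot of Economic Consequence (Log10)",
        col = c("gold", "orange"), legend = c("Property", "Crop"),
        args.legend = list(x = "topright", bty = "n", horiz = TRUE),
        cex.names = 0.7, las = 3)
par(op)  # restore the previous graphics settings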
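In the plot above, the property and crop series come from two separately ranked tables, while the bar labels come from the property ranking only. A possible way to keep both bars of a pair tied to the same event is to select the top events once and read both damage columns from those rows. The sketch below uses made-up data (`toy` and its values are hypothetical) and is not the code behind the figure above:

```r
# Toy stand-in for dataCost; the values are invented for illustration only.
toy <- data.frame(
  EVTYPE  = c("TORNADO", "HAIL", "FLOOD", "DROUGHT"),
  PROPDMG = c(300, 70, 90, 1),
  CROPDMG = c(10, 58, 17, 3)
)

# Pick the top events once, by property damage.
top <- toy[order(-toy$PROPDMG), ][1:3, ]

# One column per event; both rows refer to the same event.
both <- rbind(Property = top$PROPDMG, Crop = top$CROPDMG)
colnames(both) <- as.character(top$EVTYPE)

barplot(log10(both), beside = TRUE,
        main = "Property and crop damage by event (Log10, toy data)",
        col = c("gold", "orange"), legend.text = rownames(both),
        cex.names = 0.7, las = 3)
```

Reading both columns from the same rows means a label can never be paired with another event's crop total.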

4. Conclusion

In the United States of America, the leading health hazard among natural disasters was TORNADO, causing 5633 Fatalities and 91346 Injuries. By the figures reported above, TORNADO was also the most costly event to recover from, with 321 Billion Dollars in Property Damage and 58 Billion Dollars in Crop Damage.