An Analysis of Severe Weather and Its Economic and Population Health Consequences

Coursera Reproducible Research Project II

John Slough II

25 Jan 2015

Synopsis

Severe weather event data were analyzed to determine which event types were most harmful to population health and which had the greatest economic consequences. Analysis of the data from 1996 to 2011 showed that tornadoes, excessive heat, and floods have the worst consequences for public health. Thunderstorm wind (recorded as TSTM WIND), flash floods, tornadoes, and hail are among the worst in terms of economic consequences. Tornadoes were the worst overall offender.

Data

The data analyzed come from the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. The dataset is a collection of severe weather events in the USA from 1950 to 2011, together with their economic and population health consequences. The goal is to determine which kinds of severe weather have the greatest effect on the economy and on public health.

Data Processing

A feature of the data is that not all severe weather events were recorded until 1996; prior to 1996 only tornado, thunderstorm wind, and hail events were recorded, and prior to 1955 only tornadoes were recorded. This means that if we used the entire dataset, the results would be biased towards tornado, thunderstorm wind, and hail. Therefore, only the data from 1996 onwards are analyzed. See http://www.ncdc.noaa.gov/stormevents/details.jsp?type=eventtype for more information. The code chunk below first loads the raw csv file, keeps only the required columns, and writes a new, smaller csv file. This speeds up processing, as the raw file takes a long time to load. The BGN_DATE variable is then converted to date format and all records from 1996 onwards are selected.

setwd("~/Desktop/Coursera/Reproducible Research/project2")

# the following code, now set as comments, needs to be run only the first
# time the dataset is loaded
#stormAll=read.csv("repdata_data_StormData.csv",header=TRUE)
#storm=subset(stormAll,select=c(BGN_DATE, STATE,EVTYPE,FATALITIES,
#                           INJURIES,PROPDMG,CROPDMG))
# write smaller file, load for subsequent analysis
#write.csv(storm, file = "storm.csv",row.names=FALSE)
storm=read.csv("storm.csv",header=TRUE)
# assumes BGN_DATE in storm.csv is stored as YYYY-MM-DD
storm$BGN_DATE = as.Date(storm$BGN_DATE, "%Y-%m-%d")
# keep only events that began in 1996 or later
AllET=subset(storm, BGN_DATE>as.Date("1995-12-31"))
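
The date conversion and the 1996 cutoff can be checked on a small made-up data frame. This is only a minimal sketch, assuming the dates are stored as YYYY-MM-DD text; the rows and the names demo and recent are invented for illustration and are not part of the analysis.

```r
# Toy data (hypothetical rows, not taken from the NOAA file)
demo <- data.frame(
  BGN_DATE = c("1995-12-31", "1996-01-01", "2011-11-30"),
  EVTYPE   = c("HAIL", "FLOOD", "TORNADO"),
  stringsAsFactors = FALSE
)
# Convert the text column to Date, then keep events from 1996 onwards
demo$BGN_DATE <- as.Date(demo$BGN_DATE, format = "%Y-%m-%d")
recent <- subset(demo, BGN_DATE >= as.Date("1996-01-01"))
print(recent$EVTYPE)
```

Comparing against an explicit Date value rather than a bare string makes the intent of the cutoff unambiguous.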

Results of Population Health Analysis

The numbers of fatalities and injuries were recorded for each event. Severe weather events with at least one fatality or injury were selected for analysis. The top 10 severe weather types, ranked first by number of fatalities and then by number of injuries, are shown in the code chunk below.

FatInj = subset(AllET, FATALITIES>0 |INJURIES>0,
                  select=c(EVTYPE,FATALITIES,INJURIES))

library(plyr)
Fatsum=ddply(FatInj,"EVTYPE", summarize,fatalities=sum(FATALITIES))
Injsum=ddply(FatInj,"EVTYPE", summarize,injuries=sum(INJURIES))
FatInjCount=merge(Injsum,Fatsum,all.x=TRUE)
FatInjCountSort=FatInjCount[order(c(FatInjCount$fatalities),decreasing=TRUE),]
FatInjCountSortInj=FatInjCount[order(c(FatInjCount$injuries),decreasing=TRUE),]

FatInjCountSort[1:10,]
##             EVTYPE injuries fatalities
## 26  EXCESSIVE HEAT     6391       1797
## 115        TORNADO    20667       1511
## 34     FLASH FLOOD     1674        887
## 78       LIGHTNING     4141        651
## 35           FLOOD     6758        414
## 92     RIP CURRENT      209        340
## 118      TSTM WIND     3629        241
## 49            HEAT     1222        237
## 64       HIGH WIND     1083        235
## 1        AVALANCHE      156        223
FatInjCountSortInj[1:10,]
##                EVTYPE injuries fatalities
## 115           TORNADO    20667       1511
## 35              FLOOD     6758        414
## 26     EXCESSIVE HEAT     6391       1797
## 78          LIGHTNING     4141        651
## 118         TSTM WIND     3629        241
## 34        FLASH FLOOD     1674        887
## 112 THUNDERSTORM WIND     1400        130
## 134      WINTER STORM     1292        191
## 67  HURRICANE/TYPHOON     1275         64
## 49               HEAT     1222        237
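
The per-type totals above are computed with ddply from the plyr package. As a cross-check that needs no extra package, the same grouping can be sketched with base R's aggregate(). The rows below are invented for illustration, and the names toy and totals are hypothetical.

```r
# Toy data (hypothetical rows, not taken from the NOAA file)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 0, 1, 3, 5),
  INJURIES   = c(10, 4, 0, 1, 0),
  stringsAsFactors = FALSE
)
# Sum both outcome columns within each event type in a single call
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
# Sort by fatalities, largest first
totals <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
print(totals)
```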

Almost the same weather types appear in the top ten under each sorting method. Excessive heat, tornadoes, and floods are clearly among the worst in both lists.

To visualize these data, a scatterplot of the number of injuries and fatalities from severe weather was produced. The top ten types of severe weather are highlighted. The two worst, by a large margin, are tornadoes and excessive heat. Note that flood and flash flood both rank high and may be similar occurrences; however, NOAA records them as separate types.

top=FatInjCountSort[c(1:7,9,10),]
heat=FatInjCountSort[8,]
library(calibrate)
## Loading required package: MASS
plot(FatInjCountSort$injuries,FatInjCountSort$fatalities,cex=.9,
     xlim=c(0,20500),ylim=c(0,1800),
     col=rgb(100,0,0,90,maxColorValue=150), pch=16,
     xlab="Number of Injuries",ylab="Number of Deaths",
     main="Injuries & Deaths from Different Types of Severe Weather\nUSA: 1996 to 2011")
par(new=TRUE)
plot(top$injuries,top$fatalities,xlim=c(0,20500),ylim=c(0,1800),xaxt='n',yaxt='n',bty='n',ylab='',xlab='',col=rgb(100,0,0,maxColorValue=100),pch=16)
textxy(top$injuries,top$fatalities, top$EVTYPE, cex=.6,pos=3)
par(new=TRUE)
plot(heat$injuries,heat$fatalities,xlim=c(0,20500),ylim=c(0,1800),xaxt='n',
     yaxt='n',bty='n',ylab='',xlab='',col=rgb(100,0,0,
                                              maxColorValue=100),pch=16)
textxy(heat$injuries,heat$fatalities, heat$EVTYPE, cex=.6,pos=4)

Results of Economic Consequences Analysis

The dataset includes data on crop and property damage for each severe weather event. Severe weather events with more than $0 of crop or property damage were selected for analysis. The top 10 severe weather types, ranked first by property damage and then by crop damage, are shown in the code chunk below.

EconDam = subset(AllET, PROPDMG>0 |CROPDMG>0,
                 select=c(EVTYPE,PROPDMG,CROPDMG))

Propsum=ddply(EconDam,"EVTYPE", summarize,property=sum(PROPDMG))
Cropsum=ddply(EconDam,"EVTYPE", summarize,crop=sum(CROPDMG))
PropCropSum=merge(Propsum,Cropsum,all.x=TRUE)
# change from million to billion $ for plots
PropCropSum$property=PropCropSum$property/1000
PropCropSum$crop=PropCropSum$crop/1000
PropCropSort=PropCropSum[order(PropCropSum$property,PropCropSum$crop,
                               decreasing=TRUE),]
PropCropSortcrop=PropCropSum[order(PropCropSum$crop,
                                   PropCropSum$property,decreasing=TRUE),]

PropCropSort[1:10,]
##                EVTYPE   property      crop
## 152         TSTM WIND 1330.70691 109.11060
## 44        FLASH FLOOD 1247.56254 161.06671
## 148           TORNADO 1187.87823  90.12850
## 145 THUNDERSTORM WIND  862.25736  66.66300
## 46              FLOOD  824.93671 151.82618
## 72               HAIL  575.31728 498.33912
## 109         LIGHTNING  488.56185   1.90344
## 86          HIGH WIND  315.09806  17.26821
## 181      WINTER STORM  126.91049   1.96399
## 77         HEAVY SNOW   89.39311   1.59170
PropCropSortcrop[1:10,]
##                EVTYPE   property      crop
## 72               HAIL  575.31728 498.33912
## 44        FLASH FLOOD 1247.56254 161.06671
## 46              FLOOD  824.93671 151.82618
## 152         TSTM WIND 1330.70691 109.11060
## 148           TORNADO 1187.87823  90.12850
## 145 THUNDERSTORM WIND  862.25736  66.66300
## 30            DROUGHT    4.09405  33.29362
## 86          HIGH WIND  315.09806  17.26821
## 75         HEAVY RAIN   47.00284  10.97771
## 59       FROST/FREEZE    0.96852   7.03414

Again, similar severe weather types appear in the top ten under each sorting method.
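
One point worth checking when interpreting these totals: in the raw NOAA file, PROPDMG and CROPDMG are accompanied by PROPDMGEXP and CROPDMGEXP columns holding a magnitude code such as K, M, or B. Those columns are not among the ones kept in the reduced file used here. The sketch below, on made-up rows, shows one way such codes could be turned into dollar amounts before summing. The lookup table and the names toy, multiplier, and prop_dollars are illustrative assumptions, not part of the original analysis.

```r
# Toy data (hypothetical rows); PROPDMGEXP is a magnitude code for PROPDMG
toy <- data.frame(
  EVTYPE     = c("HAIL", "FLOOD", "TORNADO"),
  PROPDMG    = c(25, 1.5, 2),
  PROPDMGEXP = c("K", "M", "B"),
  stringsAsFactors = FALSE
)
# Illustrative lookup (assumption): K = thousand, M = million, B = billion
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
# Map each code to its multiplier; codes absent from the table become NA
toy$prop_dollars <- toy$PROPDMG * unname(multiplier[toupper(toy$PROPDMGEXP)])
print(toy)
```

Codes outside the lookup table yield NA rather than a silent guess, so unexpected values in the real file would be easy to spot.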

As above, a scatterplot of crop damage against property damage from severe weather was produced to visualize these data. The top ten types of severe weather are highlighted. Hail produces, by far, the most crop damage. The worst types for property damage appear to be wind-related events (tornadoes, thunderstorm winds) and floods/flash floods.

topdam=PropCropSort[1:9,]
topdamHSnow=PropCropSort[10,]

plot(PropCropSort$property,PropCropSort$crop,cex=.9,xlim=c(0,1320),
     ylim=c(0,500),col=rgb(100,0,0,90,maxColorValue=150), pch=16,
     xlab="Property Damage in billion $",ylab="Crop Damage in billion $",
     main="Crop & Property Damage from Different Types of Severe Weather\nUSA: 1996 to 2011")
par(new=TRUE)
plot(topdam$property,topdam$crop,xlim=c(0,1320),ylim=c(0,500),xaxt='n',
     yaxt='n',bty='n',ylab='',xlab='',col=rgb(100,0,0,maxColorValue=100),
     pch=16)
textxy(topdam$property,topdam$crop, topdam$EVTYPE, cex=.6,pos=3)
par(new=TRUE)
plot(topdamHSnow$property,topdamHSnow$crop,xlim=c(0,1320),ylim=c(0,500),
     xaxt='n',yaxt='n',bty='n',ylab='',xlab='',
     col=rgb(100,0,0,maxColorValue=100),pch=16)
textxy(topdamHSnow$property,topdamHSnow$crop, topdamHSnow$EVTYPE, 
       cex=.6,pos=1)

Overall Results

To give a better idea of the overall consequences of severe weather events, property damage and crop damage were combined into one variable called total damage. As it does not seem prudent to combine the population health variables, injuries and fatalities are kept separate in the following analysis.
A 3-D scatterplot was created using these three variables. The number of injuries is shown on the x axis, the number of fatalities on the y axis, and the total damage in billion dollars on the z axis. The top ten severe weather types by total damage, together with excessive heat, are highlighted and labeled. Tornadoes are clearly the worst single severe weather type for total damage and population health combined, as they have the highest or almost the highest value for each variable. Floods and flash floods also stand out. Excessive heat leads to a high number of fatalities but virtually no property or crop damage.

PropCropSumTotal=PropCropSum
PropCropSumTotal$total=PropCropSumTotal$property+PropCropSumTotal$crop
AllCount=merge(PropCropSumTotal,FatInjCount,all=TRUE)

AllCount[is.na(AllCount)] = 0
AllCountSort=AllCount[order(AllCount$fatalities,
        AllCount$total,AllCount$injuries,decreasing=TRUE),]

All10=AllCount[order(AllCount$total,decreasing=TRUE),][1:10,]
ExHeat=AllCount[which(AllCount$EVTYPE=="EXCESSIVE HEAT"),] 
All10=rbind(All10,ExHeat)
library(scatterplot3d)

sp3D=scatterplot3d(AllCountSort$injuries,AllCountSort$fatalities,
        AllCountSort$total,color=rgb(100,0,0,90,maxColorValue=150),
        pch=16,xlab="Number of Injuries", 
        ylab="Number of Fatalities", zlab="Total Damage in billion $",
        main="Total Damage, Fatalities & Injuries from Different\nTypes of Severe Weather\nUSA: 1996 to 2011",
        highlight.3d=FALSE,type="h")
sp3D$points3d(x=All10$injuries,y=All10$fatalities,z=All10$total,
              col=rgb(100,0,0,maxColorValue=100),pch=16)
sp3D.coords = sp3D$xyz.convert(All10$injuries,All10$fatalities,All10$total)
text(sp3D.coords$x, sp3D.coords$y, labels=All10$EVTYPE, cex=.6, pos=3) 
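
The merge step above relies on all=TRUE so that event types present in only one of the two summaries are kept, with the resulting NA values set to 0. A minimal sketch of that behaviour on invented numbers follows; the names damage, health, and combined are hypothetical.

```r
# Toy summaries (hypothetical values); each table is missing one event type
damage <- data.frame(EVTYPE = c("HAIL", "TORNADO"),
                     total  = c(3.5, 7.25),
                     stringsAsFactors = FALSE)
health <- data.frame(EVTYPE     = c("TORNADO", "EXCESSIVE HEAT"),
                     fatalities = c(12, 40),
                     stringsAsFactors = FALSE)
# all = TRUE keeps event types found in either table (full outer join)
combined <- merge(damage, health, by = "EVTYPE", all = TRUE)
# Types missing from one table come back as NA; treat them as zero
combined[is.na(combined)] <- 0
print(combined)
```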

Conclusion

Policy makers may use this analysis to better prioritize resources in order to deal with the consequences of these severe weather events. Knowing which severe weather events produce the most damage, and which types of damage they produce, allows resources to be directed more efficiently. Considering the variables analyzed above, tornadoes produce the worst economic and public health consequences.