Synopsis

This report analyzes the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database, which records storms and other severe weather events between 1950 and 2011. These events affect human health through fatalities and injuries, and their damage to property and crops can lead to significant economic loss. It is therefore important to understand which weather events are the most harmful in these respects. This study analyzes the subset of the original dataset covering the years from 2000 onward, since recent events tend to be of most interest.

Data Processing

The Storm Data dataset is obtained from NOAA for analysis.

1. Read the data into R

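temp <- tempfile(fileext = ".csv.bz2")
# mode = "wb" keeps the compressed file intact on Windows
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", temp, mode = "wb")
# read.csv() decompresses bzip2 files itself; factors match the summaries shown later
storm <- read.csv(temp, stringsAsFactors = TRUE)
unlink(temp)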
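
As a small self-contained check of the reading step, the sketch below writes a toy CSV to a bzip2-compressed temporary file and reads it back. It relies on read.csv() decompressing bzip2 files transparently; the toy data frame and its values are made up for illustration and are not part of the NOAA data.

```r
# Toy data standing in for the storm file (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(2, 0))

# Write the toy data to a bzip2-compressed CSV through a bzfile connection
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Reading by file name is enough: the compression is detected on opening
toy.back <- read.csv(tmp, stringsAsFactors = FALSE)
unlink(tmp)
print(toy.back)
```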

2. Subset the data to the 21st century
Since the climate changes over time, the current situation is probably not similar to that of the last century. In addition, older records are often less complete than recent ones. We therefore focus on the data collected in the 21st century, that is, from 2000-01-01 to the most recent data available.

storm$BeginDate <- as.Date(storm$BGN_DATE, format="%m/%d/%Y")
# Compare against a Date object rather than a bare character string
storm2k <- subset(storm, BeginDate >= as.Date("2000-01-01"))
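
The date filter can be tried on a few made-up rows. This is only a sketch: the date strings below are invented and assume the month/day/year form with a trailing time that the format string above implies; as.Date() ignores whatever follows the part matched by the format.

```r
# Hypothetical begin dates, two before and two on or after 2000-01-01
toy <- data.frame(BGN_DATE = c("4/18/1950 0:00:00", "12/31/1999 0:00:00",
                               "1/1/2000 0:00:00", "11/28/2011 0:00:00"),
                  stringsAsFactors = FALSE)

# Parse the date part; the trailing time is not covered by the format
toy$BeginDate <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y")

# Keep only the rows from 2000-01-01 onward
toy2k <- subset(toy, BeginDate >= as.Date("2000-01-01"))
print(toy2k$BeginDate)
```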

3. Tidy the variable EVTYPE
The EVTYPE variable in the original dataset is not tidy: the subset contains 196 event types, most of which are just different spellings of the same natural events. The Storm Data Event Table in the NATIONAL WEATHER SERVICE INSTRUCTION provided by NOAA defines only 48 events. I classified the 196 types one by one according to the descriptions in that document.

storm2k$EVTYPE <- tolower(storm2k$EVTYPE)
original <- sort(unique(storm2k$EVTYPE)) 
rp.0 <- c("high surf", "flash flood", "thunderstorm wind", "waterspout","drought",
          "flood", "heavy snow", rep("astronomical tide", 2), "avalanche")
rp.1 <- c("high surf", "ice storm", "blizzard", "dust storm", "wildfire",
          "coastal flood", "coastal flood", rep("cold/wind chill",3) )
rp.2 <- c("cold/wind chill", "coastal flood", "debris flow", "dense fog", "dense smoke",
          "drought", "drowning", rep("drought",3) )
rp.3 <- c("drought", rep("dust devil",2), "dust storm", "heavy snow",
          rep("excessive heat",2), rep("extreme cold/wind chill",3) )
rp.4 <- c("extreme cold/wind chill", "flood", rep("heavy snow",2), "flash flood",
          "flood", rep("frost/freeze", 3), "freezing fog" )
rp.5 <- c(rep("frost/freeze",4), rep("funnel cloud",2),
          "winter weather", "strong wind", "strong wind", "thunderstorm wind")
rp.6 <- c("thunderstorm wind", "strong wind", "strong wind", "hail", "frost/freeze",
          "high surf", "heat", "heavy rain", "heavy rain", "heavy snow")
rp.7 <- c(rep("high surf", 7), "high wind", rep("hurricane/typhoon",2) )
rp.8 <- c(rep("ice storm",4), rep("lake-effect snow",2),
          rep("lakeshore flood",2), "avalanche","heavy snow"  )
rp.9 <- c("frost/freeze", "heavy snow", "lightning", "heavy rain", "marine hail", 
          "marine high wind", "marine strong wind", rep("marine thunderstorm wind",2), "mixed precipitation")
rp.10 <- c("heavy snow", "mixed precipitation", rep("debris flow",2), "strong wind",
           "strong wind", "hail", "strong wind", "northern lights", "other" )
rp.11 <- c("ice storm", "cold/wind chill", "heat", "heavy rain", "extreme cold/wind chill",
           "excessive heat", "drought", "heavy rain", rep("heavy snow",2) )
rp.12 <- c("excessive heat", "wildfire", rep("rip current",4), 
           "seiche", "thunderstorm wind", "sleet", "sleet" )
rp.13 <- c("hail", "dense smoke", rep("heavy snow",8) )
rp.14 <- c("sleet", "heavy snow", rep("storm surge/tide",2), 
           rep("strong wind",2), rep("thunderstorm wind",4)  )
rp.15 <- c("storm surge/tide", "tornado", "tornado", "tropical depression",
           "tropical storm", rep("thunderstorm wind",5)  )
rp.16 <- c("tsunami", rep("cold/wind chill",3), "drought",
           rep("heat",4), "flood" )
rp.17 <- c(rep("cold/wind chill",2), "heavy snow", "heat", "flood",
           "drought", "excessive heat", rep("volcanic ash",2), "thunderstorm wind"  )
rp.18 <- c("heat", "waterspout", "tornado", rep("wildfire",2), 
           rep("strong wind",2), "cold/wind chill", rep("strong wind",2)  )
rp.19 <- c("winter storm", rep("winter weather",4), "strong wind" )
rp <- c(rp.0,rp.1,rp.2,rp.3,rp.4,rp.5,rp.6,rp.7,rp.8,rp.9,rp.10,rp.11,rp.12,
        rp.13,rp.14,rp.15,rp.16,rp.17,rp.18,rp.19)
storm2k$newEVTYPE <- storm2k$EVTYPE
n <- length(original)
for (i in 1:n) {
      storm2k$newEVTYPE[ storm2k$newEVTYPE==original[i] ] <- rp[i]
}
sort( unique(storm2k$newEVTYPE) )
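
The same recoding can be sketched without an explicit loop by using match() to index a lookup vector. This is an illustrative alternative on made-up values, not the code that produced the results in this report; the names raw, from and to are hypothetical.

```r
# Hypothetical raw event labels with duplicates and variant spellings
raw <- c("tstm wind", "hail", "thunderstorm wind", "tstm wind", "small hail")

# Sorted unique labels and their hand-assigned categories, position by position
from <- sort(unique(raw))
to <- c("hail", "hail", "thunderstorm wind", "thunderstorm wind")

# match() gives, for every raw label, its position in `from`;
# indexing `to` with those positions recodes the whole vector at once
recoded <- to[match(raw, from)]
print(table(recoded))
```

Because every label is looked up against the original values in one step, an already recoded value can never be recoded a second time, which a sequential in-place loop cannot rule out in general.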

Now there are 52 event categories, of which 48 are exactly the ones described in the reference document. Four categories cannot be properly classified: for “drowning”, the cause of the accident is unknown; for “mixed precipitation”, the specific cause of the deaths and injuries is also unclear; and “northern lights” and “other” are entirely new categories. These four categories are therefore kept as they are. Fortunately, the number of records in these categories is negligible compared to the whole dataset.

which(storm2k$newEVTYPE=="drowning")
## [1] 94931
which(storm2k$newEVTYPE=="mixed precipitation")
##  [1]  12093  12115  12972  19459  20022  23510  23534  31426  32105  32573
## [11]  32578  32804  33101  33126  33129  46769  47259  48148  48156  48168
## [21]  48182  55686  55693  55702  55713  56273  56282  56297  56317  56757
## [31]  58300  58789  66719  66720  66728  66736  66746  66825  66827  66967
## [41]  67510  67853  68067  68070  68079  80838  80870  81557  81626  89843
## [51]  92106  92123  92860  93016 101696 101711 102901 102909 103130 103159
which(storm2k$newEVTYPE=="northern lights")
## [1] 68088
which(storm2k$newEVTYPE=="other")
## [1]  2773 19409 19978 56225

The full classification is shown below. Since I am not an expert in climate, there are probably some misclassifications, which can easily be found in the following table.

df <- cbind(original,rp)
require(knitr)
print(kable(df,col.names = c("Original Event Type", "Classified Event Type")))
## 
## 
## Original Event Type              Classified Event Type    
## -------------------------------  -------------------------
## high surf advisory               high surf                
## flash flood                      flash flood              
## tstm wind                        thunderstorm wind        
## waterspout                       waterspout               
## abnormally dry                   drought                  
## abnormally wet                   flood                    
## accumulated snowfall             heavy snow               
## astronomical high tide           astronomical tide        
## astronomical low tide            astronomical tide        
## avalanche                        avalanche                
## beach erosion                    high surf                
## black ice                        ice storm                
## blizzard                         blizzard                 
## blowing dust                     dust storm               
## brush fire                       wildfire                 
## coastal flood                    coastal flood            
## coastal flooding                 coastal flood            
## cold                             cold/wind chill          
## cold weather                     cold/wind chill          
## cold wind chill temperatures     cold/wind chill          
## cold/wind chill                  cold/wind chill          
## cstl flooding/erosion            coastal flood            
## dam break                        debris flow              
## dense fog                        dense fog                
## dense smoke                      dense smoke              
## drought                          drought                  
## drowning                         drowning                 
## dry                              drought                  
## dry conditions                   drought                  
## dry microburst                   drought                  
## dry spell                        drought                  
## dust devel                       dust devil               
## dust devil                       dust devil               
## dust storm                       dust storm               
## early snowfall                   heavy snow               
## excessive heat                   excessive heat           
## excessive heat/drought           excessive heat           
## extreme cold                     extreme cold/wind chill  
## extreme cold/wind chill          extreme cold/wind chill  
## extreme windchill                extreme cold/wind chill  
## extreme windchill temperatures   extreme cold/wind chill  
## extremely wet                    flood                    
## falling snow/ice                 heavy snow               
## first snow                       heavy snow               
## flash flood                      flash flood              
## flood                            flood                    
## fog                              frost/freeze             
## freeze                           frost/freeze             
## freezing drizzle                 frost/freeze             
## freezing fog                     freezing fog             
## freezing rain                    frost/freeze             
## freezing rain/sleet              frost/freeze             
## frost                            frost/freeze             
## frost/freeze                     frost/freeze             
## funnel cloud                     funnel cloud             
## funnel clouds                    funnel cloud             
## glaze                            winter weather           
## gradient wind                    strong wind              
## gusty lake wind                  strong wind              
## gusty thunderstorm wind          thunderstorm wind        
## gusty thunderstorm winds         thunderstorm wind        
## gusty wind                       strong wind              
## gusty winds                      strong wind              
## hail                             hail                     
## hard freeze                      frost/freeze             
## hazardous surf                   high surf                
## heat                             heat                     
## heavy rain                       heavy rain               
## heavy rain effects               heavy rain               
## heavy snow                       heavy snow               
## heavy surf                       high surf                
## heavy surf/high surf             high surf                
## high seas                        high surf                
## high surf                        high surf                
## high surf advisories             high surf                
## high surf advisory               high surf                
## high water                       high surf                
## high wind                        high wind                
## hurricane                        hurricane/typhoon        
## hurricane/typhoon                hurricane/typhoon        
## ice on road                      ice storm                
## ice storm                        ice storm                
## ice/snow                         ice storm                
## icy roads                        ice storm                
## lake-effect snow                 lake-effect snow         
## lake effect snow                 lake-effect snow         
## lakeshore flood                  lakeshore flood          
## landslide                        lakeshore flood          
## landslump                        avalanche                
## late season snow                 heavy snow               
## light freezing rain              frost/freeze             
## light snow                       heavy snow               
## lightning                        lightning                
## locally heavy rain               heavy rain               
## marine hail                      marine hail              
## marine high wind                 marine high wind         
## marine strong wind               marine strong wind       
## marine thunderstorm wind         marine thunderstorm wind 
## marine tstm wind                 marine thunderstorm wind 
## mixed precipitation              mixed precipitation      
## moderate snowfall                heavy snow               
## monthly precipitation            mixed precipitation      
## mud slide                        debris flow              
## mudslide                         debris flow              
## non-severe wind damage           strong wind              
## non-tstm wind                    strong wind              
## non severe hail                  hail                     
## non tstm wind                    strong wind              
## northern lights                  northern lights          
## other                            other                    
## patchy ice                       ice storm                
## prolong cold                     cold/wind chill          
## prolong warmth                   heat                     
## rain                             heavy rain               
## record cold                      extreme cold/wind chill  
## record heat                      excessive heat           
## record low rainfall              drought                  
## record rainfall                  heavy rain               
## record snow                      heavy snow               
## record snowfall                  heavy snow               
## record warmth                    excessive heat           
## red flag criteria                wildfire                 
## rip current                      rip current              
## rip currents                     rip current              
## rogue wave                       rip current              
## rough seas                       rip current              
## seiche                           seiche                   
## severe thunderstorms             thunderstorm wind        
## sleet                            sleet                    
## sleet storm                      sleet                    
## small hail                       hail                     
## smoke                            dense smoke              
## snow                             heavy snow               
## snow advisory                    heavy snow               
## snow and ice                     heavy snow               
## snow drought                     heavy snow               
## snow showers                     heavy snow               
## snow squalls                     heavy snow               
## snow/blowing snow                heavy snow               
## snow/freezing rain               heavy snow               
## snow/sleet                       sleet                    
## snowmelt flooding                heavy snow               
## storm surge                      storm surge/tide         
## storm surge/tide                 storm surge/tide         
## strong wind                      strong wind              
## strong winds                     strong wind              
## thunderstorm                     thunderstorm wind        
## thunderstorm wind                thunderstorm wind        
## thunderstorm wind (g40)          thunderstorm wind        
## thunderstorms                    thunderstorm wind        
## tidal flooding                   storm surge/tide         
## tornado                          tornado                  
## tornado debris                   tornado                  
## tropical depression              tropical depression      
## tropical storm                   tropical storm           
## tstm wind                        thunderstorm wind        
## tstm wind (g40)                  thunderstorm wind        
## tstm wind (g45)                  thunderstorm wind        
## tstm wind g45                    thunderstorm wind        
## tstm wind/hail                   thunderstorm wind        
## tsunami                          tsunami                  
## unseasonably cold                cold/wind chill          
## unseasonably cool                cold/wind chill          
## unseasonably cool & wet          cold/wind chill          
## unseasonably dry                 drought                  
## unseasonably hot                 heat                     
## unseasonably warm                heat                     
## unseasonably warm & wet          heat                     
## unseasonably warm/wet            heat                     
## unseasonably wet                 flood                    
## unseasonal low temp              cold/wind chill          
## unusually cold                   cold/wind chill          
## unusually late snow              heavy snow               
## unusually warm                   heat                     
## urban/sml stream fld             flood                    
## very dry                         drought                  
## very warm                        excessive heat           
## volcanic ash                     volcanic ash             
## volcanic ashfall                 volcanic ash             
## wall cloud                       thunderstorm wind        
## warm weather                     heat                     
## waterspout                       waterspout               
## whirlwind                        tornado                  
## wild/forest fire                 wildfire                 
## wildfire                         wildfire                 
## wind                             strong wind              
## wind advisory                    strong wind              
## wind chill                       cold/wind chill          
## wind damage                      strong wind              
## wind gusts                       strong wind              
## winter storm                     winter storm             
## winter weather                   winter weather           
## winter weather mix               winter weather           
## winter weather/mix               winter weather           
## wintry mix                       winter weather           
## wnd                              strong wind

4. Tidy the damage variables (PROPDMG, PROPDMGEXP, CROPDMG and CROPDMGEXP)
Variables related to economic loss are PROPDMG, PROPDMGEXP, CROPDMG and CROPDMGEXP. They are interpreted as follows:
1. PROPDMG: The numeric value of the property damage estimate.
2. PROPDMGEXP: The magnitude code (unit multiplier) for the property damage estimate.
3. CROPDMG: The numeric value of the crop damage estimate.
4. CROPDMGEXP: The magnitude code (unit multiplier) for the crop damage estimate.
The new variables newPROPDMG and newCROPDMG are created to express the property damage estimates and crop damage estimates in a common unit.

4.1 Tidy the Property Damage Estimates

summary(storm2k$PROPDMGEXP)
##             -      ?      +      0      1      2      3      4      5 
## 189121      0      0      0      1      0      0      0      0      0 
##      6      7      8      B      h      H      K      m      M 
##      0      0      0     29      0      0 328461      0   5551

In the NATIONAL WEATHER SERVICE INSTRUCTION, the alphabetical characters used for PROPDMGEXP and CROPDMGEXP signify magnitude: “B” means billions of dollars, “M” means millions of dollars and “K” means thousands of dollars. An empty value or “0” is treated here as plain dollars. In the following step, all property damage estimates are converted to dollars.

m <- length(storm2k$PROPDMG)
newPROPDMG <- numeric(m)  # preallocate instead of growing inside the loop
for (i in seq_len(m)) {
      if (storm2k$PROPDMGEXP[i]=="B") 
            newPROPDMG[i] <- storm2k$PROPDMG[i]*10^9
      else if(storm2k$PROPDMGEXP[i]=="M")
            newPROPDMG[i] <- storm2k$PROPDMG[i]*10^6
      else if(storm2k$PROPDMGEXP[i]=="K")
            newPROPDMG[i] <- storm2k$PROPDMG[i]*10^3
      else
            newPROPDMG[i] <- storm2k$PROPDMG[i]
}
length(newPROPDMG)
## [1] 523163
summary(newPROPDMG)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 0.000e+00 0.000e+00 6.323e+05 1.000e+03 1.150e+11

Now all property damage estimates are expressed in dollars.

4.2 Tidy the Crop Damage Estimates

summary(storm2k$CROPDMGEXP)
##             ?      0      2      B      k      K      m      M 
## 250613      0      0      0      4      0 271351      0   1195

The strategy for tidying the crop damage estimates is the same as before.

newCROPDMG <- numeric(m)  # preallocate; m is the number of rows, as above
for (i in seq_len(m)) {
      if (storm2k$CROPDMGEXP[i]=="B")   
            newCROPDMG[i] <- storm2k$CROPDMG[i]*10^9
      else if(storm2k$CROPDMGEXP[i]=="M")
            newCROPDMG[i] <- storm2k$CROPDMG[i]*10^6
      else if(storm2k$CROPDMGEXP[i]=="K")
            newCROPDMG[i] <- storm2k$CROPDMG[i]*10^3
      else
            newCROPDMG[i] <- storm2k$CROPDMG[i]
}
length(newCROPDMG)
## [1] 523163
summary(newCROPDMG)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.00e+00 0.00e+00 0.00e+00 4.51e+04 0.00e+00 1.51e+09

Now all crop damage estimates are expressed in dollars.
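
The unit conversion applied to both damage variables can also be written without an explicit loop. The sketch below uses a hypothetical helper, exp.to.dollars(), which is not part of the original analysis. It assumes the same rule as the loops above: “K”, “M” and “B” are matched case-sensitively and every other code, including the empty string, leaves the value unchanged.

```r
# Hypothetical helper: convert value/magnitude-code pairs to dollars
exp.to.dollars <- function(value, exp) {
  mult <- c(K = 1e3, M = 1e6, B = 1e9)
  # Look up each code by name; codes not in `mult` (e.g. "" or "0") give NA
  m <- mult[as.character(exp)]
  # Unrecognised codes fall back to a multiplier of 1, as in the loops
  m[is.na(m)] <- 1
  unname(value * m)
}

# Made-up values and codes for illustration
toy.value <- c(2.5, 10, 3, 7)
toy.exp <- c("K", "M", "B", "")
toy.dollars <- exp.to.dollars(toy.value, toy.exp)
print(toy.dollars)
```

On the full dataset this would be called as exp.to.dollars(storm2k$PROPDMG, storm2k$PROPDMGEXP), and likewise for the crop variables.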

Results

This analysis addresses two questions:
1. Across the United States in the 21st century, which types of events are most harmful with respect to population health?
2. Across the United States in the 21st century, which types of events have the greatest economic consequences?

The most harmful events with respect to population health

First, I will study which events cause the most fatalities on average per recorded event.

fatality.mean <- sapply(split(storm2k$FATALITIES, storm2k$newEVTYPE), mean)
most.fatality <- head(sort(fatality.mean, decreasing=TRUE ),5)
barplot(most.fatality, main="Top 5 Events Causing Fatalities", 
        cex.main=1.5, ylab="Average Deaths per Event" )

From the above plot, we can see that the most harmful event type with respect to fatalities, measured as average deaths per event, is tsunami.
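
The ranking above is based on the mean per recorded event, not on the total. The toy example below, with invented numbers, sketches the split/sapply/sort/head pattern used in this section and shows that a rare but severe event type can rank first by mean even when a more frequent type has the larger total.

```r
# Invented records: four tornado events and a single tsunami event
toy <- data.frame(type = c("tornado", "tornado", "tornado", "tornado", "tsunami"),
                  FATALITIES = c(3, 1, 4, 4, 10),
                  stringsAsFactors = FALSE)

# Group the fatality counts by event type
by.type <- split(toy$FATALITIES, toy$type)

# Mean per event (the measure used in this report) and total per type
toy.mean <- sapply(by.type, mean)
toy.total <- sapply(by.type, sum)

# Rank each measure from largest to smallest and keep the top entry
top.by.mean <- head(sort(toy.mean, decreasing = TRUE), 1)
top.by.total <- head(sort(toy.total, decreasing = TRUE), 1)
print(top.by.mean)
print(top.by.total)
```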

Second, I will study which events cause the most injuries on average per recorded event.

injury.mean <- sapply(split(storm2k$INJURIES, storm2k$newEVTYPE), mean)
most.injury <- head(sort(injury.mean, decreasing=TRUE ),5)
barplot(most.injury, main="Top 5 Events Causing Injuries", 
        cex.main=1.5, ylab="Average Injuries per Event")

From the above plot, we can see that the most harmful event type with respect to injuries is hurricane/typhoon, while tsunami ranks second.

The most harmful events with respect to economic consequences

I will study which events cause the greatest economic losses, defined as the sum of the estimated property losses and crop losses, again averaged per recorded event.

property.mean <- sapply(split(newPROPDMG, storm2k$newEVTYPE), mean)
crop.mean <- sapply(split(newCROPDMG, storm2k$newEVTYPE), mean)
# Transform the units to million dollars for plotting
economic.mean <- (property.mean+crop.mean)/(10^6)
most.economic <- head(sort(economic.mean, decreasing=TRUE ),5)
barplot(most.economic, main="Top 5 Events Causing Economic Losses (Million Dollars)", 
        cex.main=1.5, ylab="Average Economic Losses per Event (Million Dollars)" )

From the above plot, we can see that the most harmful event type with respect to economic consequences is hurricane/typhoon, while storm surge/tide ranks second.

BIN FANG
July, 2015