Severe weather events are costly in terms of lives and property, and municipalities can suffer for years from the impact of a single event. This analysis looks at 48 severe weather events and their effects on population health and economic indicators. The purpose of this study is to provide information to state, county, and city planners so they can prioritize resources and improve their emergency preparedness programs. National Oceanographic and Atmospheric Administration (NOAA) storm data collected from 1950 to 2011 provided the historical information. Preliminary results of this analysis indicate that weather impacts abide by the Pareto Principle and a small number of events (between four and eight) are responsible for approximately 80% of impacts. This insight allows planners to focus on the weather events most likely to occur in their region. Population health is most susceptible to tornadic, extreme heat, and tropical storm events. Property and crop damage is most susceptible to flood, hurricane, tornadic, and drought events. This study also provides a full summary of all 48 event types along with the aggregate impact on health and economics over the 60-years of data collected.
This analysis is based on a sample of severe weather events occurring in the United States between 1950 and 2011 [1]. NOAA collected this data to include numerical and categorical information as well as event description narratives. The data were downloaded from a site provided for educational use through Johns Hopkins University [2] using R programming software version 3.1.0 (2014-04-10, “Spring Dance”) [3].
The data itself was made available in a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. The data contained 902,297 weather event observations and 37 variables.
This analysis was executed on a Windows i386-w64-mingw32/i386 (32-bit) platform. One package was used that is not included in the R base packages and must be installed:
Analysis began by loading the data into the R Programming environment. This analysis begins with the raw data in its original .csv.bz2 so as to enable reproducible research. This code chunk reads in the data and provides a view of the data as revealed by the variable names.
## Unzip and read in NOAA bzip2 file
StormData <- read.csv(bzfile("repdata-data-StormData.csv.bz2")) # read in csv
names(StormData) # take a look at names
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
This analysis addresses two fundamental questions:
The purpose of this analysis is to provide insights to municipality planners in order to prioritize resources and improve emergency preparedness programs. Based on these questions and purpose, the original dataframe of 37 variables was reduced to seven variables of interest: EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP.
storm <- StormData[, c(8, 23:28)] # subset dataframe
names(storm) <- tolower(colnames(storm)) # reset variables using tidy data
## Get overall appreciation for data
str(storm)
## 'data.frame': 902297 obs. of 7 variables:
## $ evtype : Factor w/ 985 levels " HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
## $ fatalities: num 0 0 0 0 0 0 0 0 1 0 ...
## $ injuries : num 15 0 2 2 2 6 1 0 14 0 ...
## $ propdmg : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ propdmgexp: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
## $ cropdmg : num 0 0 0 0 0 0 0 0 0 0 ...
## $ cropdmgexp: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
rm(StormData) # view structure and clean up environment
Exploratory analysis has as one of its purposes the goal of becoming familiar with the data and identifying any pre-processing needs prior to further analysis. First, the structure of the data is satisfactory.
bad <- is.na(storm) # identify 'NA' in the dataset (i.e., missing data)
table(bad) # view results
## bad
## FALSE
## 6316079
rm(bad) # clean up R environment
A look for “NA” within observations shows no missing values in the storm dataframe because all (902297 obs.) * (7 variables) = 6,316,079 are FALSE. However, there is some concern about some of the levels under propdmgexp and cropdmgexp. A table() call of these two factors reveals some anomalies (possible bad data)
# Create tables of expontial data to see occurences of the factor levels
table(storm$propdmgexp)
##
## - ? + 0 1 2 3 4 5
## 465934 1 8 5 216 25 13 4 4 28
## 6 7 8 B h H K m M
## 4 5 1 40 1 6 424665 7 11330
table(storm$cropdmgexp)
##
## ? 0 2 B k K m M
## 618413 7 19 1 9 21 281832 1 1994
The variables propdmgexp and cropdmgexp will need a transformation to mathematically standardize the numerical variables propdmg and cropdmg - specifically those levels labeled “-”, “?”, and “+” will need to be replaced and the labels for billion (B), million (m or M), thousand (k or K), and hundred (h or H) will need to be replaced with a number to allow scaling of the damage level.
Before eliminating or transforming the data one should understand the potential impacts:
options(scipen = 6)
look <- storm[storm$propdmgexp == "+" | storm$propdmgexp == "-" | storm$propdmgexp ==
"?" | storm$cropdmgexp == "?", ]
look <- look[order(look$propdmgexp), ]
look[, c(1, 4:7)]
## evtype propdmg propdmgexp cropdmg cropdmgexp
## 192467 FLASH FLOOD WINDS 0.41 0 ?
## 220300 THUNDERSTORM WINDS 0.00 0 ?
## 229327 HIGH WIND 15.00 - 0
## 198689 THUNDERSTORM WINDS 0.00 ? 0
## 225254 FLASH FLOOD 0.00 ? 0
## 227409 FLASH FLOOD 0.00 ? 0
## 232016 THUNDERSTORM WIND 0.00 ? 0
## 233746 HAIL 0.00 ? 0
## 233747 HAIL 0.00 ? 0
## 233748 HAIL 0.00 ? 0
## 247617 THUNDERSTORM WINDS 0.00 ? 0
## 188780 BREAKUP FLOODING 20.00 + 0
## 189001 HIGH WIND 20.00 + 0
## 192262 FLOODING/HEAVY RAIN 2.00 + 0
## 216755 HIGH WINDS 15.00 + 0
## 216802 TORNADO 60.00 + 0
## 197066 THUNDERSTORM WINDS 0.50 K 0 ?
## 197331 THUNDERSTORM WINDS 0.50 K 0 ?
## 220877 FLOOD/FLASH FLOOD 400.00 K 0 ?
## 242953 THUNDERSTORM WINDS 80.00 K 0 ?
## 232901 FLOOD/FLASH FLOOD 0.50 M 0 ?
rm(look) # clean up R environment
Based on this subset of data, the following transformations will be made using the procedures provided below:
stormMod <- storm # create new dataframe to retain original
stormMod$propdmgexp <- as.character(stormMod$propdmgexp) # set as char for gsub
stormMod$cropdmgexp <- as.character(stormMod$cropdmgexp) # set as char for gsub
stormMod$propdmgexp <- gsub("\\?", "", stormMod$propdmgexp) #replace ? with blank
stormMod$cropdmgexp <- gsub("\\?", "", stormMod$cropdmgexp) #replace ? with blank
## Start a vector called remove to gather row numbers of data to remove
remove <- grep("\\+", stormMod$propdmgexp, value = FALSE, ignore.case = TRUE)
moreRemove <- grep("\\-", stormMod$propdmgexp, value = FALSE, ignore.case = TRUE)
remove <- append(remove, moreRemove, after = length(remove)) # combine grep info
stormMod <- stormMod[-remove, ] # create final dataframe by removing rows
rm(moreRemove, remove) #clean up R environment
Now the labels propmdgexp and cropdmgexp are changed to numeral characters.
## Replace exp labels for hundred, thousand, million, and billion
stormMod$propdmgexp <- gsub("[hH]", "2", stormMod$propdmgexp)
stormMod$propdmgexp <- gsub("[kK]", "3", stormMod$propdmgexp)
stormMod$propdmgexp <- gsub("[mM]", "6", stormMod$propdmgexp)
stormMod$propdmgexp <- gsub("[bB]", "9", stormMod$propdmgexp)
## Convert single digit to exponential form
for (i in 1:9) {
stormMod$propdmgexp <- gsub(i, as.numeric(paste("1E", i, sep = "")), stormMod$propdmgexp)
}
stormMod$propdmgexp <- gsub("^[0]", "1", stormMod$propdmgexp) #fixes the value '0'
stormMod$propdmgexp <- as.numeric(stormMod$propdmgexp) # converts to numeric
fix.na <- is.na(stormMod$propdmgexp) # finds blanks converted to 'NA'
stormMod$propdmgexp[fix.na] <- 1 # replaces 'NA' with the number 1
## Creates new column for normalized property damage values
stormMod$propertydamage <- stormMod$propdmg * stormMod$propdmgexp
## Compare original and transformed data for quality check
table(storm$propdmgexp)
##
## - ? + 0 1 2 3 4 5
## 465934 1 8 5 216 25 13 4 4 28
## 6 7 8 B h H K m M
## 4 5 1 40 1 6 424665 7 11330
table(stormMod$propdmgexp)
##
## 1 10 100 1000 10000 100000
## 466158 25 20 424669 4 28
## 1000000 10000000 100000000 1000000000
## 11341 5 1 40
rm(i, fix.na) # clean up R environment
Although it is a lot of code to show, the nature of the gsub() function makes it a good choice to simply repeat the transformation for property damage to create good numbers for crop damage.
## Replace exp labels for hundred, thousand, million, and billion
stormMod$cropdmgexp <- gsub("[kK]", "3", stormMod$cropdmgexp)
stormMod$cropdmgexp <- gsub("[mM]", "6", stormMod$cropdmgexp)
stormMod$cropdmgexp <- gsub("[bB]", "9", stormMod$cropdmgexp)
## Convert single digit to exponential form
for (i in 1:9) {
stormMod$cropdmgexp <- gsub(i, as.numeric(paste("1E", i, sep = "")), stormMod$cropdmgexp)
}
stormMod$cropdmgexp <- gsub("^[0]", "1", stormMod$cropdmgexp) #fixes the value '0'
stormMod$cropdmgexp <- as.numeric(stormMod$cropdmgexp) # converts to numeric
fix.na <- is.na(stormMod$cropdmgexp) # finds blanks converted to 'NA'
stormMod$cropdmgexp[fix.na] <- 1 # replaces 'NA' with the number 1
## Creates new column for normalized property damage values
stormMod$cropdamage <- stormMod$cropdmg * stormMod$cropdmgexp
## Compare original and transformed data for quality check
table(storm$cropdmgexp)
##
## ? 0 2 B k K m M
## 618413 7 19 1 9 21 281832 1 1994
table(stormMod$cropdmgexp)
##
## 1 100 1000 1000000 1000000000
## 618433 1 281853 1995 9
rm(i, fix.na) # clean up R environment
Lastly, a new variable totaldamage is added to the original data, combining the levels of property and crop damage. The interim variables are removed to cleanup the transformed data.
## Creates new column for normalized property damage values
stormMod$totaldamage <- stormMod$propertydamage + stormMod$cropdamage
## remove all interim property and crop damage variables
stormMod <- stormMod[, -(4:7)]
## table(stormMod$evtype) #commented out as it is very long
message(paste("Number of unique event types in dataset:", length(unique(stormMod$evtype)),
sep = " "))
## Number of unique event types in dataset: 983
The most complex aspect of data transformation during the exploratory phases was the event types. A lengthy description plus all code is provided here to clearly demonstrate the assumptions and decisions made in the transformation process. There were 983 unique event types as determined from the length(unique()) call below. According to the document STORM DATA PREPARATION, NWSI 10-1605 [1], there are only 48 different types of entries permitted to be recorded. Section 2.1 of the document notes:
2.1 Permitted Storm Data Events. The only events permitted in Storm Data are listed in Table 1 of Section 2.1.1. The chosen event name should be the one that most accurately describes the meteorological event leading to fatalities, injuries, damage, etc. However, significant events, such as tornadoes, having no impact or causing no damage, should also be included in Storm Data. See Section 7 for detailed examples.
Therefore, during the exploratory analysis phase the event type variable is transformed to better conform to the 48 permitted entries for event type. This also serves to combine the data into less categories and make the results more useful.
stormMod$evtype <- as.character(stormMod$evtype)
stormMod$evtype <- gsub("^[ ]+", "", stormMod$evtype) # remove leading whitespace
## Remove lines that are simply summaries containing no data
remove <- grep("[sS]ummar", stormMod$evtype, value = FALSE, ignore.case = TRUE)
remove <- append(remove, grep("\\?", stormMod$evtype, value = FALSE, ignore.case = TRUE),
after = length(remove))
remove <- append(remove, grep("^[mM]onthly", stormMod$evtype, value = FALSE,
ignore.case = TRUE), after = length(remove))
remove <- append(remove, grep("excessive$", stormMod$evtype, value = FALSE,
ignore.case = TRUE), after = length(remove))
remove <- append(remove, grep("^HIGH$", stormMod$evtype, value = FALSE, ignore.case = TRUE),
after = length(remove))
remove <- append(remove, grep("SOUTHEAST", stormMod$evtype, value = FALSE, ignore.case = TRUE),
after = length(remove))
remove <- append(remove, grep("APACHE COUNTY", stormMod$evtype, value = FALSE,
ignore.case = TRUE), after = length(remove))
stormMod <- stormMod[-remove, ] # eliminates these observations
rm(remove)
The details of the grepl code will show readers the groupings of event types that were combined and transformed into one of the 48 permitted event types.
new <- stormMod
## Good codings of event types
new$swap[grepl("^ASTRO(.*)", new$evtype)] <- "ASTRONOMICAL LOW TIDE"
new$swap[grepl("^AVAL(.*)", new$evtype)] <- "AVALANCHE"
new$swap[grepl("^DUST(.*)L", new$evtype, ignore.case = T)] <- "DUST DEVIL"
dust <- c("sahar(.*)", "^DUSTS(.*)", "^dust sto(.*)")
new$swap[grepl(paste(dust, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "DUST STORM"
heat <- c("^EXCESSIVE HEAT$", "RECORD/EXCESSIVE HEAT")
new$swap[grepl(paste(heat, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "EXCESSIVE HEAT"
blizzard <- c("^BLIZZ(.*)", "^BLOW(.*)", "^RECORD(.*)snow$", "ground blizzard")
new$swap[grepl(paste(blizzard, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "BLIZZARD"
new$swap[grepl("BLOWING DUST", new$evtype)] <- "DUST STORM"
cflood <- c("^Beach e", "^coastal", "cstl", "^tidal flood", "Beach flood")
new$swap[grepl(paste(cflood, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "COASTAL FLOODING"
tstm <- c("^TROPICAL S(.*)", "^TSTM(.*)")
new$swap[grepl(paste(tstm, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "TROPICAL STORM"
new$swap[grepl("TSUNAMI", new$evtype)] <- "TSUNAMI"
cold <- c("^cold", "cool", "^WIND CHILL$", "WIND CHILL/HIGH WIND", "^COLD/WIND CHILL",
"COLD WIND CHILL TEMPERATURES", "Unseasonable Cold", "UNSEASONABLY COLD",
"UNSEASONAL LOW TEMP")
new$swap[grepl(paste(cold, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "COLD/WIND CHILL"
fog <- c("DENSE FOG", "^FOG", "FOG AND COLD TEMPERATURES", "VOG", "PATCHY DENSE FOG")
new$swap[grepl(paste(fog, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "DENSE FOG"
new$swap[grepl("SMOKE", new$evtype)] <- "DENSE SMOKE"
drought <- c("BELOW NORMAL PRECIPITATION", "DRIEST MONTH", "^DROUGHT$", "DROUGHT/EXCESSIVE HEAT",
"^DRY$", "DRY CONDITIONS", "DRY HOT WEATHER", "DRY PATTERN", "DRY SPELL",
"DRY WEATHER", "DRYNESS", "EXCESSIVE HEAT/DROUGHT", "EXCESSIVELY DRY", "EXTREME HEAT",
"HEAT WAVE DROUGHT", "LACK OF SNOW", "Mild and Dry Pattern", "MILD/DRY PATTERN",
"PROLONG WARMTH", "Record dry month", "RECORD DRYNESS", "RECORD LOW RAINFALL",
"heat drought", "UNSEASONABLY DRY")
new$swap[grepl(paste(drought, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "DROUGHT"
excold <- c("hyp", "low temperature", "^LOW WIND CHILL", "BITTER WIND CHILL",
"BITTER WIND CHILL TEMPERATURES", "Excessive Cold", "Extended Cold", "^Extreme ",
"EXTREME/RECORD COLD", "Prolong Cold", "RECORD COLD", "Record Cold", "RECORD LOW$",
"SEVERE COLD", "UNUSUALLY COLD")
new$swap[grepl(paste(excold, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "EXTREME COLD/WIND CHILL"
flash <- c("^flash flood", "floood", "LOCAL FLASH FLOOD", "RAPIDLY RISING WATER",
"^street flood", "^small stream and", "^small stream flood", "SMALL STREAM URBAN FLOOD",
"SMALL STREAM/URBAN FLOOD", "^Sml Stream Fld", "^STREAM FLOODING$")
new$swap[grepl(paste(flash, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "FLASH FLOOD"
flood <- c("^flood", "^river", "HIGH WATER", "ICE JAM FLOODING", "LAKE FLOOD",
"LOCAL FLOOD", "MAJOR FLOOD", "rural flood", "^DAM ")
new$swap[grepl(paste(flood, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "FLOOD"
fog <- c("g fog$", "ice fog")
new$swap[grepl(paste(fog, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "FREEZING FOG"
frost <- c("^frost$", "freeze", "Freezing Spray", "Early Frost", "EARLY FROST",
"FIRST FROST", "freezing drizzle", "^freezing rain", "^frost", "glaze")
new$swap[grepl(paste(frost, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "FROST/FREEZE"
new$swap[grepl("^funnel", new$evtype, ignore.case = T)] <- "FUNNEL CLOUD"
hail <- c("DEEP HAIL", "^hail", "LATE SEASON HAIL", "NON SEVERE HAIL", "small hail",
"THUNDERSTORM HAIL", "THUNDERSTORM WINDS/ HAIL")
new$swap[grepl(paste(hail, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HAIL"
heat <- c("^heat w", "^heat$", "ABNORMAL WARMTH", "ABNORMALLY DRY", "^heat/",
"record heat", "record high", "^hot", "hot$", "record warm", "red flag",
"UNSEASONABLY WARM", "HIGH TEMPERATURE RECORD", "HOT/DRY PATTERN", "UNSEASONABLY HOT",
"UNUSUAL WARMTH", "UNUSUALLY WARM", "VERY DRY", "VERY WARM", "WARM DRY CONDITIONS",
"WARM WEATHER")
new$swap[grepl(paste(heat, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HEAT"
hrain <- c("^heavy rain", "heavy prec", "^mud", "^rain", "torrent", "^landsl",
"heavy shower", "EXCESSIVE PRECIPITATION", "EXCESSIVE RAIN", "EXCESSIVE WETNESS",
"EXTREMELY WET", "Heatburst", "RECORD PRECIPITATION", "RECORD RAINFALL",
"RECORD/EXCESSIVE RAINFALL", "PROLONGED RAIN", "LOCALLY HEAVY RAIN", "Metro Storm, May 26",
"^hvy rain")
new$swap[grepl(paste(hrain, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HEAVY RAIN"
hsnow <- c("^heavy snow", "HEAVY LAKE SNOW", "HEAVY MIX", "EXCESSIVE SNOW",
"^snow", "HEAVY WET SNOW", "Mountain Snows", "NEAR RECORD SNOW", "RECORD SNOW/COLD",
"RECORD SNOWFALL")
new$swap[grepl(paste(hsnow, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HEAVY SNOW"
surf <- c("^high surf", "^heavy surf", " seas$", "HAZARDOUS SURF", "HEAVY SEAS",
"HEAVY SWELLS", "HIGH SWELLS", "HIGH SEAS", "HIGH SWELLS", "^HIGH TIDES",
"HIGH WAVES", "ROGUE WAVE", "ROUGH SURF")
new$swap[grepl(paste(surf, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HIGH SURF"
hwind <- c("^high wind", "high wind", "^wnd$", "^non(.*)wind", "^gust", "gradient")
new$swap[grepl(paste(hwind, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HIGH WIND"
hurr <- c("hurricane", "REMNANTS OF FLOYD", "TYPHOON")
new$swap[grepl(paste(hurr, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "HURRICANE/TYPHOON"
ice <- c("^ice", "^icy", "LIGHT FREEZING RAIN")
new$swap[grepl(paste(ice, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "ICE STORM"
new$swap[grepl("^lake(.*)snow$", new$evtype, ignore.case = T)] <- "LAKE-EFFECT SNOW"
new$swap[grepl("LAKESHORE FLOOD", new$evtype, ignore.case = T)] <- "LAKESHORE FLOOD"
lightn <- c("^lightn", "lighting", "LIGNTNING")
new$swap[grepl(paste(lightn, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "LIGHTNING"
new$swap[grepl("MARINE HAIL", new$evtype, ignore.case = T)] <- "MARINE HAIL"
new$swap[grepl("MARINE HIGH WIND", new$evtype, ignore.case = T)] <- "MARINE HIGH WIND"
new$swap[grepl("MARINE STRONG WIND", new$evtype, ignore.case = T)] <- "MARINE STRONG WIND"
new$swap[grepl("MARINE THUNDERSTORM WIND", new$evtype, ignore.case = T)] <- "MARINE THUNDERSTORM WIND"
new$swap[grepl("rip current", new$evtype, ignore.case = T)] <- "RIP CURRENT"
new$swap[grepl("seiche", new$evtype, ignore.case = T)] <- "SEICHE"
new$swap[grepl("^sleet", new$evtype, ignore.case = T)] <- "SLEET"
new$swap[grepl("TROPICAL DEPRESSION", new$evtype, ignore.case = T)] <- "TROPICAL DEPRESSION"
new$swap[grepl("^VOLCANIC", new$evtype, ignore.case = T)] <- "VOLCANIC ASH"
wspout <- c("^waterspout", "WAYTERSPOUT", "WATER SPOUT")
new$swap[grepl(paste(wspout, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "WATERSPOUT"
wfires <- c("fires$", "^wild(.*)fire$", "BRUSH FIRE")
new$swap[grepl(paste(wfires, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "WILDFIRE"
wstorm <- c("^winter storm", "UNUSUALLY LATE SNOW", "WINTER MIX")
new$swap[grepl(paste(wstorm, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "WINTER STORM"
tornado <- c("^tornado", "microburst", "downburst", "wall cloud", "mirco", "torndao")
new$swap[grepl(paste(tornado, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "TORNADO"
swind <- c("^strong wind", "^wind", "whirlwind")
new$swap[grepl(paste(swind, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "STRONG WIND"
tstorm <- c("^thund", "THUNERSTORM", "TUNDERSTORM", "micoburst", "severe thun",
"SEVERE TURBULENCE", "STORM FORCE WINDS", "THUDERSTORM WINDS", "LANDSPOUT",
"MARINE TSTM WIND")
new$swap[grepl(paste(tstorm, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "THUNDERSTORM WIND"
surge <- c("^urban", "HIGHWAY FLOODING", "MINOR FLOOD", "storm surge")
new$swap[grepl(paste(surge, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "STORM SURGE/TIDE"
winterwx <- c("BLACK ICE", "Drifting Snow", "EARLY SNOW", "FALLING SNOW/ICE",
"FIRST SNOW", "LATE(.*)SNOW", "Late(.*)fall", "^LIGHT SNOW", "MIXED PRECIP",
"MODERATE SNOW", "PATCHY ICE", "Seasonal Snowfall", "WINTER WEATHER", "WINTERY MIX",
"wintry mix")
new$swap[grepl(paste(winterwx, collapse = "|"), new$evtype, ignore.case = TRUE)] <- "WINTER WEATHER"
rm(surge, tstorm, wfires, wstorm, tornado, dust, heat, blizzard, cflood, tstm,
cold, fog, drought, excold, flash, frost, flood, hail, hrain, hsnow, surf,
hwind, hurr, ice, lightn, swind, winterwx, wspout)
rm(stormMod)
Lastly, the small number of orphaned event types are gathered up and placed into an event type called OTHER EVENT. This code makes the transformation. The table() call shows the 48 (plus one) list of event types now available after transformation.
## replace all 'oddball items' with 'OTHER EVENT'
bad <- is.na(new$swap)
new[is.na(new)] <- 0
new$swap <- gsub(0, "OTHER EVENT", new$swap)
table(new$swap) # show final set of event types
##
## ASTRONOMICAL LOW TIDE AVALANCHE BLIZZARD
## 277 387 2765
## COASTAL FLOODING COLD/WIND CHILL DENSE FOG
## 901 704 1836
## DENSE SMOKE DROUGHT DUST DEVIL
## 21 2601 151
## DUST STORM EXCESSIVE HEAT EXTREME COLD/WIND CHILL
## 437 1681 2047
## FLASH FLOOD FLOOD FREEZING FOG
## 55071 26319 46
## FROST/FREEZE FUNNEL CLOUD HAIL
## 1857 6980 288838
## HEAT HEAVY RAIN HEAVY SNOW
## 1291 12521 16663
## HIGH SURF HIGH WIND HURRICANE/TYPHOON
## 1088 21929 301
## ICE STORM LAKE-EFFECT SNOW LAKESHORE FLOOD
## 2149 659 23
## LIGHTNING MARINE HAIL MARINE HIGH WIND
## 15770 442 135
## MARINE STRONG WIND MARINE THUNDERSTORM WIND OTHER EVENT
## 48 5812 166
## RIP CURRENT SEICHE SLEET
## 777 21 78
## STORM SURGE/TIDE STRONG WIND THUNDERSTORM WIND
## 4230 4229 109946
## TORNADO TROPICAL DEPRESSION TROPICAL STORM
## 60920 60 221754
## TSUNAMI VOLCANIC ASH WATERSPOUT
## 20 29 3861
## WILDFIRE WINTER STORM WINTER WEATHER
## 4237 11442 8635
## permanently transform the dataframe
new$evtypeoriginal <- new$evtype
new$evtype <- new$swap
new <- new[, -7]
## Create final dataset for analysis and remove temporary dataframe
stormFinal <- new # create finalized dataset
stormFinal$evtype <- as.factor(stormFinal$evtype) # make class 'factor'
rm(storm, new)
A summary() call give us an idea of what the final dataframe looks like. Notice the highly left skewed distribution of raw data based on the median = 0 for all numeric variables.
summary(stormFinal) # get idea of what data looks like
## evtype fatalities injuries
## HAIL :288838 Min. : 0 Min. : 0.0
## TROPICAL STORM :221754 1st Qu.: 0 1st Qu.: 0.0
## THUNDERSTORM WIND:109946 Median : 0 Median : 0.0
## TORNADO : 60920 Mean : 0 Mean : 0.2
## FLASH FLOOD : 55071 3rd Qu.: 0 3rd Qu.: 0.0
## FLOOD : 26319 Max. :583 Max. :1700.0
## (Other) :139307
## propertydamage cropdamage totaldamage
## Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0
## Mean : 474669 Mean : 54430 Mean : 529099
## 3rd Qu.: 500 3rd Qu.: 0 3rd Qu.: 1000
## Max. :115000000000 Max. :5000000000 Max. :115032500000
##
## evtypeoriginal
## Length:902155
## Class :character
## Mode :character
##
##
##
##
The final code book accompanying this analysis after transformations is provided in Table 1. This table was coded by hand using HTML and not generated using the xtable() function of R.
Table 1: Codebook for Storm Data Analysis Accounting for Transformations of Original Dataset
| Variable Name | Description {class of variable} |
|---|---|
| evtype | Event type per document STORM DATA PREPARATION, NWSI 10-1605 {factor} |
| fatalities | Number of fatalities during event {numeric} |
| injuries | Number of injuries during event {numeric} |
| propertydamage | Property damage from event in U.S. dollars {numeric} |
| cropdamage | Crop damage from event in U.S. dollars {numeric} |
| totaldamage | Addend of property and crop damage {numeric} |
| evtypeoriginal | Event type as coded in the original data set {factor} |
The results are presented in three sections: aggregating the dataset, plotting aggregated data, and interpreting analysis results.
The dataset stormFinal is a tidy dataset with one row per observation and one column per variable. Analyzing this data through visual means first requires an aggregation of variables of interest by categorical group to enable plotting. Beginning with the full dataset, a temporary dataframe sum is generated by applying the sum() function across each variable and across each event type then merging the new information into sum. This results in a summary of impact by event type as provided in Table 2.
## Calculate the total level of impact by aggregating Event Types Begin with
## fatalities and injuries
sum <- aggregate(stormFinal$fatalities, by = list(stormFinal$evtype), sum)
sumInjuries <- aggregate(stormFinal$injuries, by = list(stormFinal$evtype),
sum)
sum <- merge(sum, sumInjuries, by.x = "Group.1", by.y = "Group.1")
## add property damage to dataframe
sumProp <- aggregate(stormFinal$propertydamage, by = list(stormFinal$evtype),
sum)
sum <- merge(sum, sumProp, by.x = "Group.1", by.y = "Group.1")
## add crop damage to dataframe
sumCrop <- aggregate(stormFinal$cropdamage, by = list(stormFinal$evtype), sum)
sum <- merge(sum, sumCrop, by.x = "Group.1", by.y = "Group.1")
## add total damage to dataframe
sumTotDamage <- aggregate(stormFinal$totaldamage, by = list(stormFinal$evtype),
sum)
sum <- merge(sum, sumTotDamage, by.x = "Group.1", by.y = "Group.1")
## provide appropriate variable names
names(sum) <- c("eventtype", "fatalities", "injuries", "propertydamage", "cropdamage",
"totaldamage")
rm(sumCrop, sumInjuries, sumProp, sumTotDamage)
# Install package if not available to R and then open library
if (!"xtable" %in% installed.packages()) {
install.packages("xtable")
}
library(xtable)
sumtab <- sum # create temporary dataframe
sumtab[, 4:6] <- sumtab[, 4:6]/1000000 #scale down large numbers
names(sumtab) <- c("Event Type", "Fatalities", "Injuries", "Property Damage ($M)",
"Crop Damage ($M)", "Total Damage ($M)") # give human readable names
tabletwo <- xtable(sumtab, caption = "Table 2: Weather Impacts. Fatalities, Injuries, and Damage Resulting from Severe Weather in the U.S. (1950-2011)",
digits = c(0, 0, 0, 0, 2, 2, 2), align = c("l", "l", "r", "r", "r", "r",
"r"), html.table.attributes = getOption("xtable.html.table.attributes",
"border=1"), html.table.attributes = getOption("xtable.html.table.attributes",
"align = center"))
print(tabletwo, type = "html", caption.placement = "top", include.rownames = F)
| Event Type | Fatalities | Injuries | Property Damage ($M) | Crop Damage ($M) | Total Damage ($M) |
|---|---|---|---|---|---|
| ASTRONOMICAL LOW TIDE | 0 | 0 | 9.74 | 0.00 | 9.74 |
| AVALANCHE | 225 | 170 | 3.72 | 0.00 | 3.72 |
| BLIZZARD | 103 | 819 | 660.83 | 112.06 | 772.89 |
| COASTAL FLOODING | 10 | 10 | 445.20 | 0.06 | 445.25 |
| COLD/WIND CHILL | 160 | 60 | 2.54 | 101.74 | 104.29 |
| DENSE FOG | 81 | 1077 | 22.83 | 0.00 | 22.83 |
| DENSE SMOKE | 0 | 0 | 0.10 | 0.00 | 0.10 |
| DROUGHT | 2 | 4 | 1046.11 | 13972.57 | 15018.68 |
| DUST DEVIL | 2 | 43 | 0.72 | 0.00 | 0.72 |
| DUST STORM | 22 | 440 | 5.62 | 3.60 | 9.22 |
| EXCESSIVE HEAT | 1920 | 6525 | 7.75 | 492.40 | 500.16 |
| EXTREME COLD/WIND CHILL | 418 | 415 | 133.41 | 1335.02 | 1468.43 |
| FLASH FLOOD | 1019 | 1785 | 17414.73 | 1437.17 | 18851.90 |
| FLOOD | 502 | 6809 | 150301.11 | 10936.19 | 161237.31 |
| FREEZING FOG | 0 | 0 | 2.18 | 0.00 | 2.18 |
| FROST/FREEZE | 20 | 272 | 28.92 | 1997.06 | 2025.98 |
| FUNNEL CLOUD | 0 | 3 | 0.19 | 0.00 | 0.19 |
| HAIL | 15 | 1371 | 15977.54 | 3046.89 | 19024.43 |
| HEAT | 1160 | 2563 | 12.46 | 407.07 | 419.53 |
| HEAVY RAIN | 149 | 337 | 3564.10 | 958.52 | 4522.62 |
| HEAVY SNOW | 142 | 1139 | 984.52 | 134.68 | 1119.21 |
| HIGH SURF | 179 | 261 | 101.56 | 0.00 | 101.56 |
| HIGH WIND | 296 | 1491 | 6005.21 | 686.51 | 6691.72 |
| HURRICANE/TYPHOON | 135 | 1333 | 85356.41 | 5516.12 | 90872.53 |
| ICE STORM | 101 | 2146 | 3972.51 | 5027.11 | 8999.62 |
| LAKE-EFFECT SNOW | 0 | 0 | 40.18 | 0.00 | 40.18 |
| LAKESHORE FLOOD | 0 | 0 | 7.54 | 0.00 | 7.54 |
| LIGHTNING | 817 | 5232 | 935.46 | 12.09 | 947.55 |
| MARINE HAIL | 0 | 0 | 0.00 | 0.00 | 0.00 |
| MARINE HIGH WIND | 1 | 1 | 1.30 | 0.00 | 1.30 |
| MARINE STRONG WIND | 14 | 22 | 0.42 | 0.00 | 0.42 |
| MARINE THUNDERSTORM WIND | 10 | 26 | 0.44 | 0.05 | 0.49 |
| OTHER EVENT | 9 | 11 | 0.26 | 11.03 | 11.29 |
| RIP CURRENT | 577 | 529 | 0.16 | 0.00 | 0.16 |
| SEICHE | 0 | 0 | 0.98 | 0.00 | 0.98 |
| SLEET | 2 | 0 | 0.50 | 0.00 | 0.50 |
| STORM SURGE/TIDE | 54 | 122 | 48046.99 | 10.63 | 48057.63 |
| STRONG WIND | 137 | 388 | 188.41 | 70.81 | 259.23 |
| THUNDERSTORM WIND | 211 | 2461 | 6646.75 | 652.96 | 7299.71 |
| TORNADO | 5661 | 91393 | 58559.45 | 417.48 | 58976.93 |
| TROPICAL DEPRESSION | 0 | 0 | 1.74 | 0.00 | 1.74 |
| TROPICAL STORM | 576 | 7439 | 12252.46 | 1313.60 | 13566.06 |
| TSUNAMI | 33 | 129 | 144.06 | 0.02 | 144.08 |
| VOLCANIC ASH | 0 | 0 | 0.50 | 0.00 | 0.50 |
| WATERSPOUT | 6 | 72 | 60.73 | 0.00 | 60.73 |
| WILDFIRE | 90 | 1608 | 8496.63 | 403.28 | 8899.91 |
| WINTER STORM | 217 | 1353 | 6749.00 | 32.44 | 6781.44 |
| WINTER WEATHER | 67 | 668 | 30.88 | 15.00 | 45.88 |
rm(tabletwo, sumtab)
This table is useful, but visualization is even more useful in analysis. The first plot is provided to present a visual representation of the number of impacts to population health. It also will provide a feel for the distribution of the data as shown in Figure 1.
## Get a feel for what the big picture looks like...just exploring fatalities
par(mar = c(10, 4, 3, 2))
sumexplore <- sum[order(sum$fatalities, decreasing = T), ]
barplot(sumexplore$fatalities, names.arg = sumexplore$eventtype, horiz = F,
las = 2, axes = F, cex.names = 0.75, ylim = c(0, 6000), ylab = "Number of Fatalities by Event Type",
cex.lab = 1, main = "Exploring Distribution of Fatalities Based on Event Type",
cex.main = 1)
axis(2, las = 2, cex.axis = 0.8)
Figure 1. Representative Distribution of “fatalities” Variable. This figure provides a view of a highly right-skewed variable aggregated from the pre-processed dataset. Similar highly skewed distributions would be seen for all of the six variables based on an inspection of the data in Table 2. This provides a rationale to further condense the data for a more meaningful analysis.
Analysis proceeds by employing the Pareto Principle, also referred to as the 80-20 rule, and named after the Italian economist Vilfredo Pareto. He observed that 20% of his countrymen owned 80% of the land. This led quality expert Joseph Juran to bring it to fame noting that a large number of events are based on sparsity and the vital few dictate the macro effects [4]. Based on the representative distribution in Figure 1, these appears to be a viable strategy to simplify the results. If the Pareto Principle for severe weather impacts in society, we might expect that 20% of the 49 categories (roughly 5 event types) would account for 80% of the impacts on population health and economics.
This chunk of code generates two dataframes for population health by ordering the data, cutting it at the 80% point, and recombining the remaining 20% into a single category. This transformed data is then plotted in Figure 2 with annotations showing various percentages of the whole.
rm(sumexplore)
## Generate Pareto-like plots showing the Event Types causing the top 80% of
## fatalities and injuries Extract fatality and injury data from summary
fatsort <- sum[, 1:2]
injsort <- sum[, c(1, 3)]
## Create pareto-style charts for both --> fatalities
fatsort <- fatsort[order(fatsort$fatalities, decreasing = T), ] # order data
twentypercent <- sum(fatsort$fatalities[9:48]) # establish the sum of bottom 20%
eventtype = c("40 Other Events")
fatalities = twentypercent # temp data frame
tempdf <- data.frame(eventtype, fatalities)
fatsort <- rbind(fatsort, tempdf)
fatsort <- fatsort[c(1:8, 49), ]
fatsort$fatalities <- as.numeric(fatsort$fatalities/1000)
## --> injuries
injsort <- injsort[order(injsort$injuries, decreasing = T), ]
twentypercent <- sum(injsort$injuries[5:48])
eventtype = c("44 Other Events")
injuries = twentypercent
temp <- data.frame(eventtype, injuries)
injsort <- rbind(injsort, temp)
injsort <- injsort[c(1:4, 49), ]
injsort$injuries <- as.numeric(injsort$injuries/1000)
## Generate figure - side-by-side plot
par(mfrow = c(1, 2))
par(mar = c(9, 4, 2, 2) + 0.1, oma = c(0, 0, 3, 0))
options(scipen = -2)
barplot(fatsort$fatalities, names.arg = fatsort$eventtype, horiz = F, las = 2,
axes = F, cex.names = 1, ylim = c(0, 10), col = "black", ylab = "Number of Fatalities by Event Type (in thousands)",
cex.lab = 1, main = "Eight (8) of 48 Event Types Represent 80% of Fatalities",
cex.main = 1)
axis(2, las = 2, cex.axis = 1)
tiles = c(sum(sum$fatalities)/1000 * 0.05, sum(sum$fatalities)/1000 * 0.2, sum(sum$fatalities)/1000 *
0.35)
abline(h = tiles, lty = 3)
text(8, tiles, c("5%", "20% of total", "35%"), cex = 1, pos = 3, offset = 0.25)
text(6, 9.5, cex = 1.2, "Note: left panel scale is an \norder of magnitude less than right panel")
barplot(injsort$injuries, names.arg = injsort$eventtype, horiz = F, las = 2,
axes = F, cex.names = 1, ylim = c(0, 100), col = "red3", ylab = "Number of Injuries by Event Type (in thousands)",
cex.lab = 1, main = "Four (4) of 48 Event Types Represent 80% of Injuries",
cex.main = 1)
axis(2, las = 2, cex.axis = 1)
tiles = c(sum(sum$injuries)/1000 * 0.05, sum(sum$injuries)/1000 * 0.2, sum(sum$injuries)/1000 *
0.65)
abline(h = tiles, lty = 3)
text(4, tiles, c("5%", "20% of total", "65%"), cex = 1, pos = 3, offset = 0.25)
par(mfrow = c(1, 1))
title(main = "Severe Weather Events with Greatest Impact to Population Health",
outer = TRUE)
Figure 2. Impacts to Population Health. Two panels showing the impacts to population health in terms of fatalities (left panel) and injuries (right panel). It is worth noting that the left panel uses a y-axis that is one order of magnitude less than the right panel - simply to make the results more readable. Both distributions follow the Pareto Principle with results of 10-15% of event types causing 80% of the impact. All key event types causing injury are seen in roughly the same order as fatalities. A noteworthy difference is the impact of heat, lightning, and rip currents on fatalities.
Similar analysis was conducted on the individual and collective indicators of economic impact. Figure 3 provides similar annotations to show the levels representing 20% of the whole.
rm(eventtype, fatalities, injuries, tiles, twentypercent)
rm(tempdf, temp, fatsort, injsort)
## Generate Pareto-like plots showing the Event Types causing the top 80% of
## property, crop, and total damage
## Extract damage data from summary
propsort <- sum[, c(1, 4)]
cropsort <- sum[, c(1, 5)]
totalsort <- sum[, c(1, 6)]
## Create pareto-style chart with panel for individual and combined damage
## --> property
propsort <- propsort[order(propsort$propertydamage, decreasing = T), ] # order data
twentypercent <- sum(propsort$propertydamage[5:48]) # sum of bottom 20%
eventtype = c("44 Other Events")
propertydamage = twentypercent # temp data frame
tempdf <- data.frame(eventtype, propertydamage)
propsort <- rbind(propsort, tempdf)
propsort <- propsort[c(1:4, 49), ]
propsort$propertydamage <- as.numeric(propsort$propertydamage)
## --> crop
cropsort <- cropsort[order(cropsort$cropdamage, decreasing = T), ] # order data
twentypercent <- sum(cropsort$cropdamage[6:48]) # sum of bottom 20%
eventtype = c("43 Other Events")
cropdamage = twentypercent # temp data frame
tempdf <- data.frame(eventtype, cropdamage)
cropsort <- rbind(cropsort, tempdf)
cropsort <- cropsort[c(1:5, 49), ]
cropsort$cropdamage <- as.numeric(cropsort$cropdamage)
## --> total damage (property + crop)
totalsort <- totalsort[order(totalsort$totaldamage, decreasing = T), ] # order data
twentypercent <- sum(totalsort$totaldamage[6:48]) # sum of bottom 20%
eventtype = c("43 Other Events")
totaldamage = twentypercent # temp data frame
tempdf <- data.frame(eventtype, totaldamage)
totalsort <- rbind(totalsort, tempdf)
totalsort <- totalsort[c(1:5, 49), ]
totalsort$totaldamage <- as.numeric(totalsort$totaldamage)
#
par(mfrow = c(1, 3))
par(mar = c(13, 4, 4, 2) + 0.1, oma = c(0, 0, 3, 0))
barplot(propsort$propertydamage/1e+09, names.arg = propsort$eventtype, horiz = F,
las = 2, axes = F, cex.names = 1.2, ylim = c(0, 200), col = "chocolate",
ylab = "Property Damage by Event Type (in billions of USD)", cex.lab = 1.2,
main = "Four (4) of 48 Event Types Represent 80% of Property Damage", cex.main = 1.2)
axis(2, las = 2, cex.axis = 1)
tiles = c(sum(sum$propertydamage)/1e+09 * 0.2, sum(sum$propertydamage)/1e+09 *
0.35)
abline(h = tiles, lty = 3)
text(4, tiles, c("20% of total", "35%"), cex = 1.2, pos = 3, offset = 0.25)
barplot(cropsort$cropdamage/1e+09, names.arg = cropsort$eventtype, horiz = F,
las = 2, axes = F, cex.names = 1.2, ylim = c(0, 20), col = "chartreuse3",
ylab = "Crop Damage by Event Type (in billions of USD)", cex.lab = 1.2,
main = "Five (5) of 48 Event Types Represent 80% of Crop Damage", cex.main = 1.2)
axis(2, las = 2, cex.axis = 1)
tiles = c(sum(sum$cropdamage)/1e+09 * 0.2, sum(sum$cropdamage)/1e+09 * 0.3)
abline(h = tiles, lty = 3)
text(5, tiles, c("20% of total", "30%"), cex = 1.2, pos = 3, offset = 0.25)
text(4, 18, cex = 1.2, "Note: center panel scale is an order \nof magnitude less than either left or right")
barplot(totalsort$totaldamage/1e+09, names.arg = totalsort$eventtype, horiz = F,
las = 2, axes = F, cex.names = 1.2, ylim = c(0, 200), col = "lightskyblue",
ylab = "Total Damage by Event Type (in billions of USD)", cex.lab = 1.2,
main = "Five (5) of 48 Event Types Represent 80% of Total Damage", cex.main = 1.2)
axis(2, las = 2, cex.axis = 1)
tiles = c(sum(sum$totaldamage)/1e+09 * 0.2, sum(sum$totaldamage)/1e+09 * 0.35)
abline(h = tiles, lty = 3)
text(5, tiles, c("20% of total", "35%"), cex = 1.2, pos = 3, offset = 0.25)
par(mfrow = c(1, 1))
title(main = "Severe Weather Events with Greatest Economic Consequences", outer = TRUE)
Figure 3. Impacts to Economic Indicators. Three panels showing the impacts to population health in terms of property damage (left panel), crop damage (center panel), and total damage (property and crop combined) (right panel). It is worth noting that the center panel uses a y-axis that is one order of magnitude less than either the left or right panel - simply to make the results more readable. All three distributions follow the Pareto Principle with results of about 10% of event types causing 80% of the impact. Floods, hurricanes, and tornadoes are key events causing damage. Drought and hail have greater impact to crop damage than in property damage.
At the outset, two questions were presented to direct this analysis.
With respect to the first questions, four event types are key causes of death and injury:
Additionally heat, lightning, and rip currents have a greater impact on fatalities than on injuries.
With respect to the second question, three event types are key causes of property and crop damage:
Crop damage is also uniquely impacted more by drought than any other event type.
Lastly, it is very interesting to see that roughly 10-15% of the 48 severe weather event types are responsible for 80% of the population health and economic impact. This phenomenon may be leveraged by state, county, and city planners in developing their emergency preparedness plans. Focus and depth can be given to prevention and remediation plans for fewer event types rather than trying to address all the the possible severe weather possibilities.
[1] National Weather Service Storm Data Documentation. URL https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf
[2] Coursera-Johns Hopkins University. NOAA Storm Data. URL https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
[3] R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/
[4] Pareto Principle. URL http://en.wikipedia.org/wiki/Pareto_principle
sessionInfo()
## R version 3.1.0 (2014-04-10)
## Platform: i386-w64-mingw32/i386 (32-bit)
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] xtable_1.7-3 knitr_1.5
##
## loaded via a namespace (and not attached):
## [1] evaluate_0.5.5 formatR_0.10 stringr_0.6.2 tools_3.1.0