Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

In the database, only three types of weather events (hail, tornado, and thunderstorm wind) are recorded from 1950 to 1992, most likely due to a lack of good records. Therefore, the analysis considers only records from 1993 to November 2011, which cover the full spectrum of weather events.
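
The coverage of each period can be checked by listing the distinct event types on either side of the 1993 cutoff. The following is a minimal sketch on a small invented data frame; the rows and the names `toy`, `period`, and `typesByPeriod` are illustrative assumptions, not records or objects from the NOAA file.

```r
# Toy stand-in for the storm data; these rows are invented for illustration.
toy <- data.frame(
    year = c(1950, 1975, 1990, 1993, 2001, 2011),
    type = c("tornado", "hail", "tstm wind", "flood", "drought", "wildfire"),
    stringsAsFactors = FALSE
)

# Label each record by period, then collect the distinct types in each period.
period <- ifelse(toy$year < 1993, "before 1993", "1993 onward")
typesByPeriod <- lapply(split(toy$type, period), function(x) sort(unique(x)))

typesByPeriod[["before 1993"]]
```

Applied to the real data, the same split on `year` would show how many distinct event types each period contains.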

Drought-type events (a group that also includes heat), tornadoes, and floods were found to cause the most harm to population health. Floods, hurricanes, and storms were found to cause the most property damage.

Download and Read the Data

if (!file.exists("repdata%2Fdata%2FStormData.csv.bz2")){
    fileURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
    download.file(fileURL, "repdata%2Fdata%2FStormData.csv.bz2")
}  
if (!"data" %in% ls() ){
    data <- read.csv("repdata%2Fdata%2FStormData.csv.bz2")
}
if (!file.exists("repdata%2Fpeer2_doc%2Fpd01016005curr.pdf")){
        pdfURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf"
        download.file(pdfURL, "repdata%2Fpeer2_doc%2Fpd01016005curr.pdf")
} 

Data Processing

storm <- data[,c("BGN_DATE", "EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP")]
names(storm) <- c("year","type", "fatalities", "injuries", "propertyDamage", "pde")
storm$year <- as.numeric(format(as.Date(storm$year, "%m/%d/%Y"),"%Y"))
change <- function(x){
        x <- as.character(x)  # guard against factor input, where as.numeric() returns level codes
        if (x %in% as.character(0:8)){
                return(as.numeric(x))
        }
        else if (x == "B"){
                return(9)
        }
        else if (x %in% c("h","H")){
                return(2)
        }
        else if (x == "K"){
                return(3)
        }
        else if (x %in% c("m","M")){
                return(6)
        }
        else {
                return(0)
        }
}
storm$pde <- sapply(storm$pde, change)
storm$propertyDamage <- storm$propertyDamage * (10 ** storm$pde)
storm$type <- tolower(storm$type)
old <- storm[storm$year < 1993,]
oldtype <- length(unique(old$type))
sum2 <- aggregate.data.frame(old[,3:5], by =list(old$type), sum)
oldcount <- as.data.frame(table(old$type))
oldsummary <- merge(oldcount, sum2, by.x = "Var1", by.y = "Group.1")
names(oldsummary)[1] <- "type"
all <- storm
storm <- all[all$year >1992,]
newtype <- length(unique(storm$type))
storm$type[grepl("tide|surf|rip current|wave", storm$type)]  <-  "tide"
storm$type[grepl("avalanche|avalance", storm$type)]  <-  "avalanche"
storm$type[grepl("snow|blizzard|ice storm|ice", storm$type)]  <-  "snow"
storm$type[grepl("flood|flash|drowning|seiche", storm$type)]  <-  "flood"
storm$type[grepl("cold|cool|frost|freeze|freezing", storm$type)]  <-  "cold"
storm$type[grepl("fog", storm$type)]  <-  "fog"
storm$type[grepl("dust|smoke|vog", storm$type)]  <-  "dust"
storm$type[grepl("drought|heat|dry|warm|hot|driest", storm$type)]  <-  "drought"
storm$type[grepl("tornado", storm$type)]  <-  "tornado"
storm$type[grepl("thunderstorm|tstm|storm", storm$type)]  <-  "storm"
storm$type[grepl("hail|rain", storm$type)]  <-  "hail"
storm$type[grepl("wind", storm$type)]  <-  "wind"
storm$type[grepl("hurricane|typhoon", storm$type)]  <-  "hurricane"
storm$type[grepl("lightning", storm$type)]  <-  "lightning"
storm$type[grepl("tropical", storm$type)]  <-  "tropical weather"
storm$type[grepl("volcanic|ash", storm$type)]  <-  "volcanic ash"
storm$type[grepl("waterspout|funnel|cloud", storm$type)]  <-  "waterspout"
storm$type[grepl("fire", storm$type)]  <-  "wildfire"
storm$type[grepl("winter|wintry mix", storm$type)]  <-  "winter weather"
storm$type[grepl("wet", storm$type)]  <-  "wet"
storm <- storm[!grepl("summary|none", ignore.case=TRUE, storm$type),]
newtype2 <- length(unique(storm$type))
# named "totals" rather than "sum" to avoid masking base::sum
totals <- aggregate.data.frame(storm[,3:5], by =list(storm$type), sum)
names(totals)[1] <- "type"
fatalities <- totals[order(totals$fatalities, decreasing = TRUE),1:2]
injuries <- totals[order(totals$injuries, decreasing = TRUE),c(1,3)]
prop <- totals[order(totals$propertyDamage, decreasing = TRUE),c(1,4)]

As mentioned, only a few types of weather events were recorded before 1993. The procedure above separates those records from the rest and standardizes property damage to dollars. The dataset initially contained 898 types of weather events. To avoid skew caused by similarly phrased event names, event types were consolidated, leaving only 111 types. Only the 10 most harmful weather event types in each category are presented below.
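
The dollar standardization multiplies each damage figure by a power of ten taken from its exponent code. As a rough sketch of the same mapping in vectorized form (the names `expLookup` and `toDollars` are hypothetical and not part of the analysis code; unrecognized or empty codes are assumed to mean a multiplier of 1, as in `change`, and lower-case codes are accepted as a convenience):

```r
# Hypothetical lookup table: exponent code -> power of ten.
expLookup <- c(setNames(0:8, as.character(0:8)), H = 2, K = 3, M = 6, B = 9)

toDollars <- function(amount, code) {
    # Look up the power of ten for each code; unmatched codes give NA.
    power <- expLookup[toupper(as.character(code))]
    power[is.na(power)] <- 0          # unknown or empty codes: 10^0 = 1
    unname(amount * 10^power)
}

toDollars(c(25, 2.5, 1, 10), c("K", "M", "B", ""))
```

For instance, a recorded value of 25 with code K becomes 25 × 10^3 = 25,000 dollars, while a value with an empty code is left unchanged.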

Results

barplot(fatalities[1:10, 2], col = heat.colors(10), legend.text = fatalities[1:10, 1], ylab = "Fatalities", main = "Fatalities: Top 10")

Drought-type events cause the most deaths, followed by tornadoes and floods.
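
The "drought" group is broader than its name suggests: the grouping pattern `drought|heat|dry|warm|hot|driest` also absorbs heat-related labels. The sketch below applies four of the grouping rules, in the same order as the processing code, to a few invented labels. The helper name `groupType` and the sample labels are hypothetical. It also illustrates that rule order matters, since a label is claimed by the first pattern that matches it.

```r
# Hypothetical helper reproducing four of the grouping rules, in source order.
groupType <- function(type) {
    type <- tolower(type)
    type[grepl("tide|surf|rip current|wave", type)] <- "tide"
    type[grepl("flood|flash|drowning|seiche", type)] <- "flood"
    type[grepl("drought|heat|dry|warm|hot|driest", type)] <- "drought"
    type[grepl("thunderstorm|tstm|storm", type)] <- "storm"
    type
}

# Invented sample labels, not taken from the dataset.
rawLabels <- c("EXCESSIVE HEAT", "Heat Wave", "DROUGHT", "TSTM WIND", "FLASH FLOOD")
groupType(rawLabels)
# "drought" "tide" "drought" "storm" "flood"
```

Here "excessive heat" lands in the drought group, while "heat wave" is picked up by the earlier `wave` pattern and ends up under "tide", so the ordering of the rules affects which group a label's fatalities are counted in.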

barplot(injuries[1:10, 2], col = rainbow(10), legend.text = injuries[1:10, 1], ylab = "Injuries", main = "Injuries: Top 10")

Tornadoes cause the most injuries, followed by drought-type events and floods.

barplot(prop[1:10, 2], col = topo.colors(10), legend.text = prop[1:10, 1], ylab = "Property Damage (US$)", main = "Property Damage: Top 10")

Floods have caused the most property damage, with hurricanes and storms second and third.

knitr::kable(oldsummary)
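| type      |  Freq | fatalities | injuries | propertyDamage |
|:----------|------:|-----------:|---------:|---------------:|
| hail      | 61832 |          5 |      401 |              0 |
| tornado   | 34764 |       4012 |    68036 |    30598198570 |
| tstm wind | 90963 |        263 |     3326 |              0 |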

Data before 1993 indicate that tornadoes were the most harmful events for both population health and the economy in that period. However, this result may be unreliable because the early records are likely incomplete; for example, hail and thunderstorm wind show no property damage at all.