Analysis of Storm Data

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This analysis involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The basic goal of this analysis is to explore the NOAA Storm Database and determine which events have the most severe impact on human life and property.

Data Processing

Load libraries for the analysis

library(ggplot2)
library(reshape2)

Read the StormData file and create a subset with only the required columns

StormData <- read.csv("StormData.csv.bz2")
StormSubset <- StormData[, c("EVTYPE", "FATALITIES", "INJURIES",
                             "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]

Determine events most harmful to human life

  1. First we calculate the total number of casualties by adding injuries and fatalities.
  2. Create a summary table, StormHrmCst, with the sum of fatalities, injuries and total casualties for each event.
  3. Find the top 10 events that cause the most casualties.
  4. Create a melted table, meltTopHarm, which will be used to plot the chart.

StormSubset$Casualties <- StormSubset$FATALITIES + StormSubset$INJURIES
StormHarm <- melt(StormSubset, id = c("EVTYPE"), measure.vars = c("INJURIES", "FATALITIES", "Casualties"))
StormHrmCst <- dcast(StormHarm, EVTYPE ~ variable, sum)

TopStormHarm <- StormHrmCst[order(StormHrmCst$Casualties, decreasing = TRUE)[1:10], ]

meltTopHarm <- melt(TopStormHarm, id = c("EVTYPE"), measure.vars = c("INJURIES", "FATALITIES"))
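The summarise-then-rank steps can also be sketched in base R on a small toy data frame. This is only an illustration under stated assumptions: the values are made up, `toy`, `totals`, `top_n` and `top` are hypothetical names, and `aggregate()` stands in for the `melt`/`dcast` pair used in the analysis.

```r
# Toy data standing in for StormSubset (values are made up for illustration)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 20, 5, 2)
)
toy$Casualties <- toy$FATALITIES + toy$INJURIES

# Sum each measure per event type (base-R counterpart of melt + dcast)
totals <- aggregate(cbind(INJURIES, FATALITIES, Casualties) ~ EVTYPE,
                    data = toy, FUN = sum)

# Rank by total casualties; head() avoids NA rows when fewer than top_n events exist
top_n <- 2
top <- head(totals[order(totals$Casualties, decreasing = TRUE), ], top_n)
print(top)
```

Using `head()` instead of indexing with `[1:10]` is a small safety choice: it returns fewer rows rather than NA-filled rows when the table has fewer event types than requested.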

Multiplier function decodes the notation used to denote damage

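The multiplier function reads the exponent codes in the two columns PROPDMGEXP and CROPDMGEXP to determine the magnitude of the damage. The codes are interpreted as follows:

  H - hundred (10^2)
  K - thousand (10^3)
  M - million (10^6)
  B - billion (10^9)
  0-9 - 10^n, where n is the digit

Any other code, including a blank, is treated as a multiplier of 1. Codes are converted to upper case first, so lower-case letters are handled the same way.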

StormSubset$PROPDMGEXP <- toupper(StormSubset$PROPDMGEXP)
StormSubset$CROPDMGEXP<- toupper(StormSubset$CROPDMGEXP)


# Map a single exponent code to its numeric multiplier
multiplier <- function(x) {
  if (x == "K") {
    10^3
  } else if (x == "M") {
    10^6
  } else if (x == "B") {
    10^9
  } else if (x == "H") {
    10^2
  } else if (x %in% as.character(0:9)) {
    10^as.numeric(as.character(x))
  } else {
    10^0  # blank or unrecognized code: leave the amount unscaled
  }
}
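As an aside, the same decoding can be sketched with a named lookup vector so that a whole column is handled in one call instead of element by element. This is a minimal sketch, not the function used in this analysis; `exp_codes` and `multiplier_vec` are hypothetical names introduced here for illustration.

```r
# Named lookup table for the letter codes described above
exp_codes <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical vectorized counterpart of multiplier()
multiplier_vec <- function(x) {
  x <- toupper(as.character(x))
  out <- rep(1, length(x))                  # default: blank or unknown code -> 10^0
  is_letter <- x %in% names(exp_codes)
  out[is_letter] <- exp_codes[x[is_letter]] # H, K, M, B
  is_digit <- x %in% as.character(0:9)
  out[is_digit] <- 10^as.numeric(x[is_digit])
  out
}

# Letters in either case, a digit, a blank and an unrecognized symbol
multiplier_vec(c("K", "m", "B", "h", "3", "", "+"))
```

On a column with several hundred thousand rows, a vectorized lookup of this kind would likely run faster than calling a function once per row, but both approaches should give the same multipliers.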

Calculate Total Property, Crop and Economic Damage

StormSubset$PropDamage<-StormSubset$PROPDMG*sapply(StormSubset$PROPDMGEXP,multiplier)
StormSubset$CropDamage<-StormSubset$CROPDMG*sapply(StormSubset$CROPDMGEXP,multiplier)
StormSubset$EcoDamage<-StormSubset$PropDamage+StormSubset$CropDamage

Determine events most harmful to property

  1. Total economic damage is the sum of property and crop damage (EcoDamage, computed above).
  2. Create a summary table, StormDmgCst, with the sum of property damage, crop damage and total economic damage for each event.
  3. Find the top 10 events that cause the most economic damage.
  4. Create a melted table, meltTopEco, which will be used to plot the chart, and convert its values to millions of USD.

StormDamage <- melt(StormSubset, id = c("EVTYPE"), measure.vars = c("PropDamage", "CropDamage", "EcoDamage"))
StormDmgCst <- dcast(StormDamage, EVTYPE ~ variable, sum)

TopEcoDmg <- StormDmgCst[order(StormDmgCst$EcoDamage, decreasing = TRUE)[1:10], ]

meltTopEco <- melt(TopEcoDmg, id = c("EVTYPE"), measure.vars = c("PropDamage", "CropDamage"))

meltTopEco$value <- meltTopEco$value / 10^6

Results

Plot top 10 events responsible for most casualties

Based on the analysis of the storm data, tornadoes are the most harmful to human health, causing over 90,000 casualties. Excessive heat is the second most harmful event, accounting for about 8,500 casualties, followed by TSTM (thunderstorm) wind with about 7,500 casualties.

g <- ggplot(meltTopHarm, aes(x = reorder(EVTYPE, value), y = value))
g + geom_bar(stat = "identity", aes(fill = variable)) +
  coord_flip() +
  xlab("Event Type") +
  ylab("Number of casualties") +
  ggtitle("Top Events harmful to human health")
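The bar order in the chart comes from `reorder(EVTYPE, value)`, which by default ranks each event by the mean of its values. With exactly two rows per event (injuries and fatalities) the mean is half the total, so the order matches a ranking by total casualties. A small sketch on made-up toy data shows how passing `FUN = sum` would state that intent explicitly; `toy` and `ordered_events` are hypothetical names.

```r
# Toy melted data: two rows (one per measure) for each event type; values are made up
toy <- data.frame(
  EVTYPE   = c("HEAT", "TORNADO", "FLOOD", "HEAT", "TORNADO", "FLOOD"),
  variable = c("INJURIES", "INJURIES", "INJURIES",
               "FATALITIES", "FATALITIES", "FATALITIES"),
  value    = c(5, 30, 2, 4, 5, 1)
)

# Order event types by their summed value rather than the default mean
ordered_events <- reorder(factor(toy$EVTYPE), toy$value, FUN = sum)

# Levels run from the smallest total to the largest
levels(ordered_events)
```

Because the chart uses `coord_flip()`, the last level (the largest total) is drawn at the top of the plot.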

Plot top 10 events responsible for most economic damage

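In terms of economic damage, floods cause the most damage (about $150 Bn), followed by hurricanes (about $71.9 Bn) and tornadoes (about $57.3 Bn). These figures correspond to values of roughly 150,000, 71,900 and 57,300 on the chart, whose axis is in millions of USD.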

g2 <- ggplot(meltTopEco, aes(x = reorder(EVTYPE, value), y = value))
g2 + geom_bar(stat = "identity", aes(fill = variable)) +
  coord_flip() +
  xlab("Event Type") +
  ylab("Damage in Mn USD") +
  ggtitle("Top Events causing Economic Damage")