Synopsis

The goal of this report is to explore the NOAA Storm Database and answer two basic questions about severe weather events: across the United States, which types of events are most harmful with respect to population health, and which types of events have the greatest economic consequences?

The events in the database start in the year 1950 and end in November 2011. After analyzing the data, we found that tornadoes cause the most harm to population health. The report also concludes that floods cause the highest economic damage.

Data Processing

Download and read the data

download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","repdata_data_StormData.csv.bz2")
storm_data <- read.csv("repdata_data_StormData.csv.bz2")
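
Because the archive is large, re-running the report re-downloads it every time. A minimal sketch of one way to skip the download when the file is already present is shown below. The helper name `fetch_once` is hypothetical and not part of the original analysis, and `mode = "wb"` is added on the assumption that the report may be run on Windows, where binary files need it.

```r
# Hypothetical helper: download a file only when it is not already on disk.
fetch_once <- function(url, destfile) {
        if (!file.exists(destfile)) {
                # mode = "wb" keeps the compressed file byte-exact on Windows
                download.file(url, destfile, mode = "wb")
        }
        destfile
}

# Usage with the report's URL and file name (commented out so the sketch runs offline):
# path <- fetch_once("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#                    "repdata_data_StormData.csv.bz2")
# storm_data <- read.csv(path)

# Offline demonstration: an existing file is left untouched and no download is attempted.
existing <- tempfile(fileext = ".csv")
writeLines("a,b", existing)
same_path <- fetch_once("http://example.invalid/unused.csv", existing)
```

Returning the path from the helper lets the result be passed straight to `read.csv()`, which reads `.bz2` files directly.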

Data processing to find the events that are most harmful with respect to population health

Aggregate fatalities by event type

fatalities <- aggregate(storm_data$FATALITIES, by=list(Event_TYPE=tolower(storm_data$EVTYPE)), FUN=sum )

Get the top 5 event types by fatalities

topfatalities <- fatalities[order(fatalities$x,decreasing=TRUE)[1:5],]
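
To make the aggregate-then-rank pattern concrete, here is a small sketch on made-up data (the values below are invented for illustration and are not taken from the NOAA database). It shows how lower-casing the event type merges differently cased labels before summing, and how `order()` picks the largest totals.

```r
# Toy data, made up for illustration only
toy <- data.frame(
        EVTYPE = c("TORNADO", "Tornado", "FLOOD", "HEAT", "flood", "HAIL"),
        FATALITIES = c(3, 2, 1, 4, 1, 0),
        stringsAsFactors = FALSE
)

# Lower-casing merges "TORNADO"/"Tornado" and "FLOOD"/"flood" before summing
toy_totals <- aggregate(toy$FATALITIES,
                        by = list(Event_TYPE = tolower(toy$EVTYPE)),
                        FUN = sum)

# order() returns row indices sorted by the total; keep the first two
toy_top <- toy_totals[order(toy_totals$x, decreasing = TRUE)[1:2], ]
print(toy_top)
```

One caveat of indexing with `[1:5]`: if there are fewer than five groups, the missing positions become NA rows. Wrapping the indices in `head(order(...), 5)` avoids that.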

Aggregate injuries based on event type

injury_data <- aggregate(storm_data$INJURIES, by=list(Event_TYPE=tolower(storm_data$EVTYPE)), FUN=sum)

Get the top 5 event types by injuries

topinjuries <- injury_data[order(injury_data$x,decreasing=TRUE)[1:5],]

Data processing to find which types of events have the greatest economic consequences

Convert function: returns the power-of-ten multiplier for a damage exponent code, which is then used to scale the recorded damage amount. Example 1: 2.5 with code k -> 2.5 * 1000 = 2500. Example 2: 7 with code M -> 7 * 1000000 = 7000000.

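convert <- function(unit) {
        # Return the multiplier for a damage exponent code ("h", "k", "m", "b" or a digit).
        # The code is handled as text because apply() passes each row as character strings,
        # so a digit code would never pass an is.numeric() check.
        unit <- tolower(trimws(as.character(unit)))
        multipliers <- c(h = 100, k = 1000, m = 1000000, b = 1000000000)
        if (grepl("^[0-9]$", unit)) {
                # a single digit n means 10^n
                u <- 10^as.numeric(unit)
        } else if (unit %in% names(multipliers)) {
                u <- multipliers[[unit]]
        } else {
                # blank or unrecognised codes give a multiplier of 0
                u <- 0
        }
        as.numeric(u)
}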

Function to convert a single value to numeric, treating blank values as 0

convert_nums <- function(x) {
        if(x == '' | x == ' ') 0
        else as.numeric(x)
}

Add a new column propdmgtotal to storm_data which stores the total property damage

storm_data$propdmgtotal <- apply(data.frame(storm_data$PROPDMG,storm_data$PROPDMGEXP),1,function(x) {convert_nums(x[1]) * convert(x[2])})
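
One detail of the row-wise `apply()` call is easy to miss: `apply()` converts the data frame to a matrix first, and because one column is text, every cell reaches the function as a character string. The sketch below, on a made-up three-row data frame, illustrates this and shows why the damage amount has to be converted back with `as.numeric()`.

```r
# Toy data, made up for illustration only
toy <- data.frame(
        PROPDMG = c(2.5, 7, 10),
        PROPDMGEXP = c("K", "M", ""),
        stringsAsFactors = FALSE
)

# With a character column present, each row arrives as a character vector
cell_classes <- apply(toy, 1, function(x) class(x))
print(cell_classes)

# The numeric column is recovered with as.numeric(), which tolerates padding spaces
recovered <- apply(toy, 1, function(x) as.numeric(x[1]))
print(recovered)
```

This also means that a type check such as `is.numeric()` on a row element returns FALSE inside the applied function, even for the damage amount.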

Add a new column cropdmgtotal to storm_data which stores the total crop damage

storm_data$cropdmgtotal <- apply(data.frame(storm_data$CROPDMG,storm_data$CROPDMGEXP),1,function(x) {convert_nums(x[1]) * convert(x[2])})

Add a new column totaldamage which is the sum of property damage and crop damage

storm_data$totaldamage <- storm_data$propdmgtotal + storm_data$cropdmgtotal
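
Row-wise `apply()` can be slow on a table of this size. As an alternative, the same totals can be computed with a vectorised lookup. This is a sketch and not the original method: the helper name `exp_multiplier` is hypothetical, the toy data is made up, and reading single-digit codes as powers of ten is an assumption about the intended handling. Blank or unrecognised codes are given a multiplier of 0.

```r
# Hypothetical vectorised helper (not part of the original report)
exp_multiplier <- function(codes) {
        codes <- tolower(trimws(as.character(codes)))
        lookup <- c(h = 100, k = 1000, m = 1e6, b = 1e9)
        mult <- unname(lookup[codes])      # NA where the code is not in the table
        is_digit <- grepl("^[0-9]$", codes)
        mult[is_digit] <- 10^as.numeric(codes[is_digit])  # assumption: digit n means 10^n
        mult[is.na(mult)] <- 0             # blank or unknown codes count as zero
        mult
}

# Toy data, made up for illustration only
toy <- data.frame(
        PROPDMG = c(2.5, 7, 10), PROPDMGEXP = c("K", "M", ""),
        CROPDMG = c(0, 1, 5),    CROPDMGEXP = c("", "k", "2"),
        stringsAsFactors = FALSE
)

# Whole columns are multiplied at once, with no per-row function call
toy$propdmgtotal <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy$cropdmgtotal <- toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
toy$totaldamage <- toy$propdmgtotal + toy$cropdmgtotal
print(toy[, c("propdmgtotal", "cropdmgtotal", "totaldamage")])
```

The first two rows reproduce the examples given for the convert function: 2.5 with code K gives 2500, and 7 with code M gives 7000000.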

Aggregate total damage by event type

topdamage_data <- aggregate(storm_data$totaldamage, by=list(Event_TYPE=tolower(storm_data$EVTYPE)), FUN=sum)

Get the top 5 event types by total damage

topdamage <- topdamage_data[order(topdamage_data$x,decreasing=TRUE)[1:5],]

Results

Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

par(mfrow = c(1, 2), mar=c(8,4,4,2))
barplot(topfatalities$x,names.arg=topfatalities$Event_TYPE,main="Top Fatalities in USA by Event",las=2, col=heat.colors(5))
barplot(topinjuries$x,names.arg=topinjuries$Event_TYPE,main="Top injuries in USA by Event", las = 2, col=heat.colors(5))

Fig 1: Top 5 event types by total fatalities and by total injuries

The results show that tornadoes cause the most fatalities and the most injuries in the USA.

Across the United States, which types of events have the greatest economic consequences?

par(mfrow = c(1, 1), mar=c(8,6,4,2))
barplot(topdamage$x,names.arg=topdamage$Event_TYPE,main="Top Damage in USA by Event", las = 2, col=heat.colors(5))

Fig 2: Top 5 event types by total economic damage

These results show that floods cause the greatest economic damage.