Synopsis

In this report we explore the NOAA Storm Database to find which type of severe weather event is the most harmful with respect to population health and which type has the greatest economic consequences. The result may be used to prioritize actions to deal with these events.

Data Processing

In this section we describe the processing steps applied to the data from the NOAA database. We assume the data file, downloaded from “https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2”, is in the working directory.
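If the file is not yet in the working directory, it could be fetched first. The following is only a sketch: the helper name fetch_if_missing is hypothetical, and it simply wraps base R's download.file so that an existing file is never downloaded again.

```r
# Hypothetical helper: download `url` to `destfile` only when the file is missing.
fetch_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps the compressed .bz2 file intact on every platform
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Possible call for this report (left commented out because it needs network access):
# fetch_if_missing(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#   "repdata%2Fdata%2FStormData.csv.bz2"
# )
```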

In the first step I read the data into a data frame:

df = read.csv("repdata%2Fdata%2FStormData.csv.bz2")

In the second step I add a column, PROPDMGCOST, holding the total property damage cost of each event. I assume that the column PROPDMG captures the economic value of the damage and that the column PROPDMGEXP gives its order of magnitude (i.e. “K” multiplies the value of PROPDMG by 1,000, “M” by 1,000,000 and “B” by 1,000,000,000).
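The later steps use a PROPDMGCOST column, but the code that creates it is not shown. A minimal sketch of the scaling follows, assuming the magnitude codes are stored in a column named PROPDMGEXP and that any code other than “K”, “M” or “B” (in either case) leaves the value unscaled. The helper name prop_multiplier and the small example_df are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: map a magnitude code to its numeric multiplier.
# Assumption: only K, M and B are scaled; every other code (including "" and NA) maps to 1.
prop_multiplier <- function(exp_code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- lookup[toupper(as.character(exp_code))]
  m[is.na(m)] <- 1
  unname(m)
}

# Small made-up data frame with the two assumed columns.
example_df <- data.frame(
  PROPDMG    = c(25, 2.5, 5, 10),
  PROPDMGEXP = c("K", "M", "B", "")
)

# Total property damage cost in US$ for each row.
example_df$PROPDMGCOST <- example_df$PROPDMG * prop_multiplier(example_df$PROPDMGEXP)
```

On the full data set the same idea would read df$PROPDMGCOST = df$PROPDMG * prop_multiplier(df$PROPDMGEXP).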

Third, I group the data by event type and calculate the average of each impact indicator:

Fatalities_Mean = aggregate(FATALITIES ~ EVTYPE,df,mean)
Injuries_Mean = aggregate(INJURIES ~ EVTYPE,df,mean)
TotalPropDamage_Mean = aggregate(PROPDMGCOST ~ EVTYPE,df,mean)

Finally, I keep only the 4 event types with the highest average impact in each category:

Ordered_Fatalities_Mean = Fatalities_Mean[with(Fatalities_Mean, order(-FATALITIES)), ]
Ordered_Injuries_Mean = Injuries_Mean[with(Injuries_Mean, order(-INJURIES)), ]
Ordered_TotalPropDamage_Mean = TotalPropDamage_Mean[with(TotalPropDamage_Mean, order(-PROPDMGCOST)), ]

Highest_Fatalities_Mean = Ordered_Fatalities_Mean[1:4,]
Highest_Injuries_Mean = Ordered_Injuries_Mean[1:4,]
Highest_TotalPropDamage_Mean = Ordered_TotalPropDamage_Mean[1:4,]
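The aggregate, order and subset steps are repeated once per indicator. As an illustration only, they could be folded into a single helper; the name top_events_by_mean and the toy data are hypothetical, and the sketch assumes the data frame has an EVTYPE column as in the report.

```r
# Hypothetical helper: mean of `column` per EVTYPE, keeping the n largest means.
top_events_by_mean <- function(data, column, n = 4) {
  # Builds the formula `column ~ EVTYPE`, as in the aggregate() calls of the report
  means <- aggregate(reformulate("EVTYPE", response = column), data = data, FUN = mean)
  ordered <- means[order(-means[[column]]), ]
  head(ordered, n)
}

# Made-up data: per-type means are A = 1, B = 5, C = 1, D = 0, E = 3.
toy <- data.frame(
  EVTYPE     = c("A", "A", "B", "C", "C", "D", "E"),
  FATALITIES = c(0, 2, 5, 1, 1, 0, 3)
)

top_two <- top_events_by_mean(toy, "FATALITIES", n = 2)
```

With the report's data, a call such as top_events_by_mean(df, "FATALITIES") would be expected to give the same rows as Highest_Fatalities_Mean.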

Results

The following bar plots show the 4 event types with the highest impact on population health, considering two indicators (number of fatalities and number of injuries):

library(ggplot2)  # provides ggplot(), aes(), geom_bar() and labs()

p<-ggplot(data=Highest_Fatalities_Mean, aes(x=EVTYPE, y=FATALITIES)) +
  geom_bar(stat="identity") + labs(title="Average Fatalities Number per Event Type")
p

p<-ggplot(data=Highest_Injuries_Mean, aes(x=EVTYPE, y=INJURIES)) +
  geom_bar(stat="identity") + labs(title="Average Injuries Number per Event Type")
p

As can be seen, the event type Tornadoes, Wind and Hail has the highest average number of fatalities, whereas Heat Wave has the highest average number of injuries.

The following bar plot shows the 4 event types with the highest economic impact, measured by the average property damage value:

p<-ggplot(data=Highest_TotalPropDamage_Mean, aes(x=EVTYPE, y=PROPDMGCOST)) +
  geom_bar(stat="identity") + labs(title="Average Property Damage (in US$) per Event Type")
p

As can be seen, the event type Tornadoes, Wind and Hail also has the greatest average economic impact.