Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. In this data analysis we use the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. We want to find which types of events are most harmful with respect to population health and which types of events have the greatest economic consequences. To do this we aggregate fatalities, injuries, and property and crop damage by event type (the EVTYPE variable).

Loading libraries

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(scales)

Data Processing

First download the data, load it into R, and select only the columns of interest.

fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

download.file(fileUrl, destfile="storm_data.csv.bz2")
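# read.csv reads the bz2-compressed file directly and handles quoted text fields
stormdata <- read.csv("storm_data.csv.bz2", header=TRUE)
# Keep EVTYPE (column 8) and FATALITIES through CROPDMGEXP (columns 23 to 28)
stormdata <- select(stormdata,c(8,23:28))
# Upper-case the event labels so that spelling variants in different case are grouped together
stormdata$EVTYPE <- toupper(stormdata$EVTYPE)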
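
As a small illustration of why the event labels are upper-cased before grouping, the sketch below uses a made-up data frame (the `toy` values are invented for this example and are not taken from the NOAA data). Labels that differ only in case collapse into one group once `toupper` is applied.

```r
# Toy data, invented for illustration only
toy <- data.frame(EVTYPE = c("Tornado", "TORNADO", "Flood", "flood", "HAIL"),
                  FATALITIES = c(1, 4, 2, 3, 0))

# Number of distinct labels before normalising the case
n_before <- length(unique(toy$EVTYPE))

toy$EVTYPE <- toupper(toy$EVTYPE)

# Number of distinct labels after normalising the case
n_after <- length(unique(toy$EVTYPE))

# Total fatalities per (normalised) event type
totals <- tapply(toy$FATALITIES, toy$EVTYPE, sum)
```

Without the `toupper` step, "Tornado" and "TORNADO" would be summed separately and each would rank lower than the combined group.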

Analysis

We want to answer the following question.

Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

# Keep only events with at least one fatality or injury,
# then total fatalities and injuries per event type
harmful <- stormdata %>%
    filter(FATALITIES != 0 | INJURIES != 0) %>%
    group_by(EVTYPE) %>%
    summarize(fatality=sum(FATALITIES,na.rm=TRUE),
        injury=sum(INJURIES,na.rm=TRUE))
harmful <- data.frame(harmful)

fatalities<-arrange(harmful,desc(fatality))
injuries <-arrange(harmful,desc(injury))

Another question we want to answer is: Across the United States, which types of events have the greatest economic consequences?

‘CROPDMGEXP’ holds the exponent code for ‘CROPDMG’ (crop damage). In the same way, ‘PROPDMGEXP’ holds the exponent code for ‘PROPDMG’ (property damage). We combine each value with its exponent code to get the total crop and property damage. The letter codes are B or b = billion, M or m = million, K or k = thousand, H or h = hundred. A digit represents a power of ten (10^digit). The symbols “-”, “+” and “?” refer to less than, greater than and low certainty. The code below scales only the letter codes and leaves values with any other code unchanged.
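
The exponent scheme described above can be sketched as a small lookup function. This is a minimal sketch, not the method used in this analysis: `exp_to_multiplier` and the `toy` data are hypothetical names and values introduced for illustration, and treating “-”, “+”, “?” and blank codes as a multiplier of 1 is an assumption.

```r
# Hypothetical helper: translate damage exponent codes into numeric multipliers
exp_to_multiplier <- function(code) {
    code <- toupper(as.character(code))
    letter_map <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
    mult <- unname(letter_map[code])       # NA for anything that is not H, K, M or B
    # Single digits are read as powers of ten, as described in the text
    is_digit <- !is.na(code) & grepl("^[0-9]$", code)
    mult[is_digit] <- 10^as.numeric(code[is_digit])
    # Assumption: "-", "+", "?" and blank codes leave the value unscaled
    mult[is.na(mult)] <- 1
    mult
}

# Toy values, invented for illustration only
toy <- data.frame(DMG = c(2.5, 10, 3, 1, 7),
                  EXP = c("K", "m", "B", "2", "?"))
scaled <- toy$DMG * exp_to_multiplier(toy$EXP)
```

Because the lookup is vectorised, it handles a whole column in one call rather than looping over rows.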

# Multiplier for each letter code (matched case-insensitively); any other code
# (digits, "-", "+", "?", blank) leaves the value unchanged
expmult <- c(B=10^9, M=10^6, K=10^3, H=10^2)

# Property damage: look up the multiplier for every row at once
propmult <- unname(expmult[toupper(stormdata$PROPDMGEXP)])
propmult[is.na(propmult)] <- 1
stormdata$PROPDMG <- stormdata$PROPDMG * propmult

# Crop damage: same approach
cropmult <- unname(expmult[toupper(stormdata$CROPDMGEXP)])
cropmult[is.na(cropmult)] <- 1
stormdata$CROPDMG <- stormdata$CROPDMG * cropmult

econdmg<- stormdata%>% 
    mutate(dmg= PROPDMG + CROPDMG)%>%
    group_by(EVTYPE) %>%
    summarize(damage=sum(dmg,na.rm=TRUE))%>%
    arrange(desc(damage))
econdmg<-data.frame(econdmg)

Results

In this section we present the answers to the two questions posed above.

The plot below shows that tornadoes (EVTYPE TORNADO) cause the most fatalities.

p<-ggplot(fatalities[1:8,], aes(EVTYPE,fatality))
p+geom_bar(stat="identity")+ylab("Fatalities")+xlab("Event type")+
    ggtitle("Top eight event types by fatalities across the USA")+
    theme(axis.text.x=element_text(size=8),
        axis.text.y=element_text(size=10), axis.title.y=element_text(size=10),
    plot.title = element_text(color = "#993333", size=10, 
    face="bold", hjust=0.5))

Types of events that cause the most injuries.

g<-ggplot(injuries[1:8,], aes(EVTYPE,injury))
g+geom_bar(stat="identity")+ylab("Injuries")+xlab("Event type")+
    ggtitle("Top eight event types by injuries across the USA")+
    theme(axis.text.x=element_text(size=8),
        axis.text.y=element_text(size=10), axis.title.y=element_text(size=10),
    plot.title = element_text(color = "#993333", size=10, 
    face="bold", hjust=0.5))

Types of events that have the greatest economic consequences.

g<-ggplot(econdmg[1:8,], aes(EVTYPE,damage))
g+geom_bar(stat="identity")+ylab("Economic damage (USD)")+xlab("Event type")+
    ggtitle("Top eight event types by economic damage across the USA")+
    theme(axis.text.x=element_text(size=6),
        axis.text.y=element_text(size=10), axis.title.y=element_text(size=10),
    plot.title = element_text(color = "#993333", size=10, 
    face="bold", hjust=0.5))
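
The bar charts above place the event types in alphabetical order along the x axis, because ggplot2 sorts character values that way. If bars ordered by size are preferred, one option is to wrap the x variable in `reorder`. The sketch below shows this on a made-up data frame (the `toy` numbers are invented for illustration and are not results from this analysis).

```r
library(ggplot2)

# Toy totals, invented for illustration only
toy <- data.frame(EVTYPE = c("FLOOD", "HEAT", "TORNADO"),
                  fatality = c(10, 40, 90))

# reorder() sorts the factor levels by increasing value of its second argument,
# so negating the count puts the largest bar first
bar_order <- levels(reorder(toy$EVTYPE, -toy$fatality))

p <- ggplot(toy, aes(reorder(EVTYPE, -fatality), fatality)) +
    geom_bar(stat = "identity") +
    xlab("Event type") + ylab("Fatalities")
```

The same `reorder(EVTYPE, -value)` expression could be used in the `aes()` call of each of the three plots, with `value` replaced by `fatality`, `injury` or `damage`.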