Synopsis

Severe weather events can cause both public health and economic problems, such as deaths, injuries, and damage to property and crops.

The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database tracks characteristics of major storms and weather events in the United States, including estimates of any fatalities, injuries, and property damage. We use these data to explore which weather events cause the most harm in terms of deaths, injuries, and economic losses. Rather than spreading effort equally across all weather events, mitigation can then focus on the events with the highest health and economic impacts.

Data processing

We load the data using the following code. This step is time-consuming, so caching is used. We used the dplyr package to pre-process the data. Initially, we select only the variables needed for our analysis.

if (!file.exists("stormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",method = "auto",destfile = "stormData.csv.bz2")
}
storm<-read.csv("stormData.csv.bz2")
library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
data<-storm %>%
select(EVTYPE,FATALITIES,INJURIES,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP,REFNUM)
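
The report does not show how the caching mentioned above is set up. As one possible approach, and not necessarily the one used here, the parsed data frame can be saved to an `.rds` file so that later runs skip the slow CSV parse. The helper name `load_cached` and the file name `stormData.rds` are hypothetical.

```r
# Hypothetical helper: parse the CSV once, then reuse a cached .rds copy.
load_cached <- function(csv_path, rds_path) {
  if (file.exists(rds_path)) {
    return(readRDS(rds_path))    # fast path: reuse the earlier parse
  }
  parsed <- read.csv(csv_path)   # slow path: read.csv handles .bz2 files directly
  saveRDS(parsed, rds_path)      # store the parsed data frame for next time
  parsed
}

# Possible usage with the file downloaded above (the .rds name is an assumption):
# storm <- load_cached("stormData.csv.bz2", "stormData.rds")
```

The cache is not refreshed automatically, so the `.rds` file has to be deleted by hand if the source file changes.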

The EVTYPE variable contains some irrelevant event types, such as ‘Summary of…’. The following code omits these entries.

excluded<-grep("[Ss]ummary", data$EVTYPE)
included<-data[-excluded,]
head(included)
##    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP REFNUM
## 1 TORNADO          0       15    25.0          K       0                 1
## 2 TORNADO          0        0     2.5          K       0                 2
## 3 TORNADO          0        2    25.0          K       0                 3
## 4 TORNADO          0        2     2.5          K       0                 4
## 5 TORNADO          0        2     2.5          K       0                 5
## 6 TORNADO          0        6     2.5          K       0                 6
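
Negative indexing with `grep` works here because the pattern matches at least one row. If it matched nothing, `data[-integer(0), ]` would return zero rows rather than all of them. A minimal sketch of a logical-index variant on made-up toy data (the `toy` data frame and its labels are illustrative and do not come from the storm database):

```r
# Toy data standing in for the storm data (values are invented).
toy <- data.frame(
  EVTYPE = c("TORNADO", "Summary of May 22", "FLOOD"),
  FATALITIES = c(1, 0, 2)
)

# grepl() returns one TRUE/FALSE per row, so negating it is always safe.
is_summary <- grepl("[Ss]ummary", toy$EVTYPE)
kept <- toy[!is_summary, ]

# With a pattern that matches nothing, every row is kept.
no_match <- toy[!grepl("no such label", toy$EVTYPE), ]
```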

Next, we used “PROPDMGEXP” and “CROPDMGEXP” to estimate the total economic losses. We create new variables, propexp and cropexp, to be multiplied by PROPDMG and CROPDMG, respectively.

unique(included$PROPDMGEXP)
##  [1] K M   B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
## Levels:  - ? + 0 1 2 3 4 5 6 7 8 B h H K m M
included$propexp[included$PROPDMGEXP %in% c("","-","?","+","0")]<-1
included$propexp[included$PROPDMGEXP %in% c("B","b")]<-1000000000
included$propexp[included$PROPDMGEXP %in% c("M","m","6")]<-1000000
included$propexp[included$PROPDMGEXP %in% c("K","k","3")]<-1000
included$propexp[included$PROPDMGEXP %in% c("H","h","2")]<-100
included$propexp[included$PROPDMGEXP =="1"]<-10
included$propexp[included$PROPDMGEXP =="4"]<-10000
included$propexp[included$PROPDMGEXP =="5"]<-100000
included$propexp[included$PROPDMGEXP =="7"]<-10000000
included$propexp[included$PROPDMGEXP =="8"]<-100000000
unique(included$CROPDMGEXP)
## [1]   M K m B ? 0 k 2
## Levels:  ? 0 2 B k K m M
included$cropexp[included$CROPDMGEXP %in% c("","-","?","+","0")]<-1
included$cropexp[included$CROPDMGEXP %in% c("B","b")]<-1000000000
included$cropexp[included$CROPDMGEXP %in% c("M","m","6")]<-1000000
included$cropexp[included$CROPDMGEXP %in% c("K","k","3")]<-1000
included$cropexp[included$CROPDMGEXP %in% c("H","h","2")]<-100
included$cropexp[included$CROPDMGEXP =="1"]<-10
included$cropexp[included$CROPDMGEXP =="4"]<-10000
included$cropexp[included$CROPDMGEXP =="5"]<-100000
included$cropexp[included$CROPDMGEXP =="7"]<-10000000
included$cropexp[included$CROPDMGEXP =="8"]<-100000000
included$proploss<-included$propexp*included$PROPDMG
included$croploss<-included$cropexp*included$CROPDMG
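
The same mapping can be written more compactly as a named lookup vector applied to both columns. This is only a sketch of an alternative, using the same multipliers as the assignments above. The names `exp_lookup` and `exp_to_multiplier` are hypothetical.

```r
# Multipliers copied from the assignments above; letters are stored in upper case.
exp_lookup <- c(
  "-" = 1, "?" = 1, "+" = 1, "0" = 1,
  "1" = 1e1, "2" = 1e2, "3" = 1e3, "4" = 1e4, "5" = 1e5,
  "6" = 1e6, "7" = 1e7, "8" = 1e8,
  "H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9
)

# Hypothetical helper: turn exponent codes into numeric multipliers.
exp_to_multiplier <- function(code) {
  code <- toupper(as.character(code))  # also handles factor columns and lower case
  mult <- unname(exp_lookup[code])     # codes not in the table become NA
  mult[code %in% ""] <- 1              # an empty code is treated as a multiplier of 1
  mult
}

# Possible usage on the data frame built above:
# included$propexp <- exp_to_multiplier(included$PROPDMGEXP)
# included$cropexp <- exp_to_multiplier(included$CROPDMGEXP)
```

Unrecognized codes stay `NA`, which makes any gap in the mapping visible instead of silently assigning a value.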

Data Analysis

Across the United States, tornadoes are the most harmful weather events with respect to population health. As seen in Figure 1, tornadoes cause the most fatalities and the most injuries. Excessive heat, the second leading cause of death among major weather events, accounts for fewer than half as many deaths as tornadoes.

fat<-included %>%
select(EVTYPE,FATALITIES) %>%
group_by(EVTYPE) %>% 
summarize(Fatalities=sum(FATALITIES)) %>%
ungroup() %>%
arrange(desc(Fatalities)) %>%
top_n(10) 
## Selecting by Fatalities
fat <- transform(fat, EVTYPE = reorder(EVTYPE,Fatalities))
inj<-included %>%
select(EVTYPE,INJURIES) %>%
group_by(EVTYPE) %>% 
summarize(Injuries=sum(INJURIES)) %>%
ungroup() %>%
arrange(desc(Injuries)) %>%
top_n(10) 
## Selecting by Injuries
inj <- transform(inj, EVTYPE = reorder(EVTYPE,Injuries))
library(ggplot2)
library(gridExtra)
x<-ggplot(fat,aes(x=EVTYPE,y=Fatalities))+geom_bar(stat="identity") + coord_flip()
y<-ggplot(inj,aes(x=EVTYPE,y=Injuries))+geom_bar(stat="identity") + coord_flip()
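# gridExtra >= 2.0 takes the title via `top` (formerly `main`); textGrob() and gpar() come from the grid package
grid.arrange(x,y, ncol = 2,top=grid::textGrob("Figure 1. Fatalities and Injuries from the Leading 10 Weather Events in the U.S.",gp=grid::gpar(fontsize=15,font=3)))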

Across the United States, as shown in Figure 2, floods have the greatest economic consequences due to property damage, while droughts have the greatest economic consequences due to crop damage.

prop<-included %>%
select(EVTYPE,proploss) %>%
group_by(EVTYPE) %>% 
summarize(Proploss=sum(proploss)/1000000) %>%
ungroup() %>%
arrange(desc(Proploss)) %>%
top_n(10) 
## Selecting by Proploss
prop<-transform(prop,EVTYPE=reorder(EVTYPE,Proploss))
crop<-included %>%
select(EVTYPE,croploss) %>%
group_by(EVTYPE) %>% 
summarize(croploss=sum(croploss)/1000000) %>%
ungroup() %>%
arrange(desc(croploss)) %>%
top_n(10) 
## Selecting by croploss
crop<-transform(crop,EVTYPE=reorder(EVTYPE,croploss))
a<-ggplot(prop,aes(x=EVTYPE,y=Proploss))+geom_bar(stat="identity") + coord_flip()+ylab("Property loss")+xlab("Event type")
b<-ggplot(crop,aes(x=EVTYPE,y=croploss))+geom_bar(stat="identity") + coord_flip()+ylab("Crop loss")+xlab("Event type")
# Losses were divided by 1,000,000 above, so both axes are in millions of dollars
grid.arrange(a,b, ncol = 2, top=grid::textGrob("Figure 2. Monetary Losses in Millions of Dollars due to Damage to Properties and Crops caused by the Leading 10 Weather Events in the U.S.",gp=grid::gpar(fontsize=15,font=3)))
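
The four summaries above follow the same pattern: total a column by event type, sort, and keep the ten largest. A minimal base R sketch of that pattern as one reusable function is given below. The name `top_events` is hypothetical, and it differs from the pipelines above in two ways: it drops missing values with `na.rm = TRUE`, and it returns exactly `n` rows, whereas `top_n` also keeps ties.

```r
# Hypothetical helper: total `value_col` by EVTYPE and return the n largest totals.
top_events <- function(df, value_col, n = 10) {
  totals <- tapply(df[[value_col]], df$EVTYPE, sum, na.rm = TRUE)
  totals <- sort(totals, decreasing = TRUE)  # largest totals first
  top <- head(totals, n)                     # exactly n rows, ties are not kept
  data.frame(EVTYPE = names(top), total = as.numeric(top), row.names = NULL)
}

# Possible usage on the data frame built above:
# top_events(included, "FATALITIES")
# top_events(included, "proploss")
```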

Conclusion

Efforts to control the health consequences of tornadoes should be prioritized, as they are responsible for the most deaths and injuries. To limit the economic consequences of major weather events, measures should be implemented to deal with floods and droughts.