This study analyzes how severe weather events affect population health and the economy of the United States. The first part identifies which types of events are most harmful to population health, measured by the number of fatalities and injuries (shown in Figure 1). The second part identifies which types of events have the greatest economic consequences, measured by property and crop damage (shown in Figure 2). Each figure is a two-panel bar chart showing the impact of different weather events on the variables considered here. Heat and ice storms had the greatest impact on population health. Waterspouts and tropical storms had the greatest impact on the country's economy.
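# Set the working directory to the folder that contains the data file
setwd("C:/Users/postdoc/Desktop/Data Science/Reproducible Research/hw2")
# Read the bzip2-compressed CSV through a connection
file <- bzfile("repdata-data-StormData.csv.bz2")
data <- read.csv(file)
# Keep only the columns used in the analysis and drop rows with missing values
set <- subset(data, select = c(EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG))
set <- set[complete.cases(set), ]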
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lattice)
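The column selection and missing-value filter used in the data-loading step can be tried on a small data frame first. The sketch below is only an illustration: the `toy` data frame, its values, and the extra `REMARKS` column are made up and do not come from the storm database.

```r
# Toy data (made-up values, not from the storm database)
toy <- data.frame(
  EVTYPE     = c("HEAT", "FLOOD", "TORNADO"),
  FATALITIES = c(2, NA, 5),
  INJURIES   = c(10, 3, NA),
  REMARKS    = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Keep only the columns of interest (REMARKS is dropped)
toy_set <- subset(toy, select = c(EVTYPE, FATALITIES, INJURIES))

# Keep only rows with no missing value in any remaining column
toy_set <- toy_set[complete.cases(toy_set), ]
print(toy_set)
```

Only the HEAT row survives here, because the other two rows each contain an `NA` in one of the selected columns.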
First, I subset the data to the variables event type, number of fatalities, and number of injuries, keeping only the records at or above a threshold (3% of the maximum number of fatalities and 10% of the maximum number of injuries) so that the plots show the strongest impacts of weather events. Then, I grouped each subset by event type and calculated the mean of the variable for each group. Next, I plotted bar charts side by side to show the impact of severe weather events on population health. I found that heat-related weather events, tornadoes, and tsunamis were associated with the highest mean number of fatalities. I also found that ice storms, floods, and hurricanes/typhoons were associated with the highest mean number of injuries.
# Fatalities: keep events at or above 3% of the maximum, then average by event type
set_ef <- subset(set, FATALITIES >= 0.03 * max(set$FATALITIES), select = c(EVTYPE, FATALITIES))
set_ef_group <- set_ef %>% group_by(EVTYPE) %>% summarize(mean = mean(FATALITIES))
plot1 <- barchart(mean ~ EVTYPE, data = set_ef_group, scales = list(x = list(rot = 45)),
                  main = "Fatalities vs. Weather Event Type", col = "red",
                  xlab = "weather event type", ylab = "number of fatalities")
# Injuries: keep events at or above 10% of the maximum, then average by event type
set_ei <- subset(set, INJURIES >= 0.10 * max(set$INJURIES), select = c(EVTYPE, INJURIES))
set_ei_group <- set_ei %>% group_by(EVTYPE) %>% summarize(mean = mean(INJURIES))
plot2 <- barchart(mean ~ EVTYPE, data = set_ei_group, scales = list(x = list(rot = 45)),
                  main = "Injuries vs. Weather Event Type", col = "blue",
                  xlab = "weather event type", ylab = "number of injuries")
# Draw the two panels side by side (left half, then right half of the page)
print(plot1, position = c(0, 0, 0.5, 1), more = TRUE)
print(plot2, position = c(0.5, 0, 1, 1))
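The threshold-then-average step used for fatalities and injuries can be checked by hand on a few rows. The following is a minimal sketch under stated assumptions: the `toy` values are made up, and base R's `aggregate` stands in for the `dplyr` pipeline so the example runs without extra packages.

```r
# Toy data (made-up values, not from the storm database)
toy <- data.frame(
  EVTYPE     = c("HEAT", "HEAT", "TORNADO", "TORNADO", "FLOOD"),
  FATALITIES = c(100, 40, 10, 2, 1),
  stringsAsFactors = FALSE
)

# Keep only events at or above 3% of the maximum (here 0.03 * 100 = 3)
threshold <- 0.03 * max(toy$FATALITIES)
severe <- subset(toy, FATALITIES >= threshold)

# Mean fatalities per event type among the retained events
severe_mean <- aggregate(FATALITIES ~ EVTYPE, data = severe, FUN = mean)
print(severe_mean)
```

In this toy case the FLOOD record and the weaker TORNADO record fall below the threshold, so FLOOD disappears from the result and the TORNADO mean is computed from a single event.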
First, I subset the data to the variables event type, property damage, and crop damage, keeping only the records at or above a threshold (20% of the maximum property damage and 80% of the maximum crop damage) so that the plots show the strongest impacts of weather events. Then, I grouped each subset by event type and calculated the mean of the variable for each group. Next, I plotted bar charts side by side to show the impact of severe weather events on the economy of the United States. I found that waterspouts, thunderstorm winds, and landslides were associated with the highest mean property damage. I also found that tropical storms, river flooding, and heavy snow were associated with the highest mean crop damage.
# Property damage: keep events at or above 20% of the maximum, then average by event type
set_epd <- subset(set, PROPDMG >= 0.20 * max(set$PROPDMG), select = c(EVTYPE, PROPDMG))
set_epd_group <- set_epd %>% group_by(EVTYPE) %>% summarize(mean = mean(PROPDMG))
plot3 <- barchart(mean ~ EVTYPE, data = set_epd_group, scales = list(x = list(rot = 45)),
                  main = "Property Damage vs. Weather Event", col = "red",
                  xlab = "weather event type", ylab = "property damage")
# Crop damage: keep events at or above 80% of the maximum, then average by event type
set_ecd <- subset(set, CROPDMG >= 0.80 * max(set$CROPDMG), select = c(EVTYPE, CROPDMG))
set_ecd_group <- set_ecd %>% group_by(EVTYPE) %>% summarize(mean = mean(CROPDMG))
plot4 <- barchart(mean ~ EVTYPE, data = set_ecd_group, scales = list(x = list(rot = 45)),
                  main = "Crop Damage vs. Weather Event", col = "blue",
                  xlab = "weather event type", ylab = "crop damage")
# Draw the two panels side by side (left half, then right half of the page)
print(plot3, position = c(0, 0, 0.5, 1), more = TRUE)
print(plot4, position = c(0.5, 0, 1, 1))
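Because the bars in this analysis are per-group means, an event type with a single extreme record can rank above a type with many large records. The sketch below illustrates that reading on made-up numbers. The `toy` data frame is hypothetical, and the sum is shown only for comparison; it is not part of the analysis above.

```r
# Toy data (made-up values, not from the storm database)
toy <- data.frame(
  EVTYPE  = c("WATERSPOUT", "FLOOD", "FLOOD", "FLOOD"),
  PROPDMG = c(900, 500, 600, 700),
  stringsAsFactors = FALSE
)

# Mean damage per event type: a single extreme record is enough to rank first
by_mean <- aggregate(PROPDMG ~ EVTYPE, data = toy, FUN = mean)

# Total damage per event type: frequent events accumulate more damage
by_sum <- aggregate(PROPDMG ~ EVTYPE, data = toy, FUN = sum)

# Event type ranked first under each summary
top_by_mean <- by_mean$EVTYPE[which.max(by_mean$PROPDMG)]
top_by_sum  <- by_sum$EVTYPE[which.max(by_sum$PROPDMG)]
print(c(mean = top_by_mean, sum = top_by_sum))
```

With these numbers the mean ranks WATERSPOUT first (900 against 600), while the sum ranks FLOOD first (1800 against 900), so the choice of summary determines which event type appears most damaging.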