Analysis of the impact of Severe Events on health and economy

Synopsis

The Severe Weather data was downloaded from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 and extracted as a csv file.

The data was subset to include only the necessary columns and only rows in which all four variables measuring harm to the population and economic consequences (FATALITIES, INJURIES, PROPDMG, CROPDMG) are non-zero.

Non-standard event names were replaced with the standard names.

Then the PROPDMG and CROPDMG values were converted to dollar amounts using their exponent columns (PROPDMGEXP and CROPDMGEXP) and stored in two new columns. The two amounts were added together and stored in a third new column as the total damage.

Then the data was grouped by Events (EVTYPE) and the sums of the damages (to crop and property) and the impact on health (fatalities and injuries) were plotted to analyze the trends.

It was concluded that

  • Hurricanes and typhoons cause the most economic damage.
  • Tornadoes cause the most fatalities.
  • Floods cause the most injuries.

Data Processing

Data loading and subsetting

The data was downloaded from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2, extracted as a CSV file called StormData.csv, and read into R. The data set was then subset to include only the necessary columns. Rows in which any of Fatalities, Injuries, Property Damage or Crop Damage was 0 were removed, so only events with a non-zero value in all four remain.
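As an aside on the extraction step: base R can usually read a bzip2-compressed CSV without extracting it first, because read.csv() opens a path through file(), which detects the compression. The sketch below shows this on a tiny made-up file written to a temporary location. The file contents and the variable names (tmp, small) are illustrative assumptions, not part of the report.

```r
# Sketch only: build a tiny bzip2-compressed CSV, then read it without extracting.
# The file and its contents are made up for illustration.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "FLOOD,1"), con)
close(con)

# read.csv() opens the path through file(), which detects bzip2 compression
small <- read.csv(tmp, stringsAsFactors = FALSE)
```

For the storm data, the analogous call would pass the path of the downloaded .bz2 archive to read.csv(), assuming it was saved locally.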

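# Read in the CSV file; keep strings as character so event names can be reassigned
storm <- read.csv("StormData.csv", stringsAsFactors = FALSE)

# Subset and keep only the needed columns
library(dplyr)
stormsub <- storm[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]
# Comma-separated conditions are combined with AND: all four measures must be non-zero
stormsub <- filter(stormsub, FATALITIES != 0, INJURIES != 0, PROPDMG != 0, CROPDMG != 0)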
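
The filter above keeps a row only when every one of the four measures is non-zero, because filter() combines comma-separated conditions with AND. A minimal base R sketch on made-up rows shows how that differs from keeping rows where at least one measure is non-zero. The data frame toy and the names all_nonzero and any_nonzero are hypothetical and used only for illustration.

```r
# Made-up rows for illustration; not taken from the storm data
toy <- data.frame(
  EVTYPE     = c("A", "B", "C"),
  FATALITIES = c(1, 0, 0),
  INJURIES   = c(2, 5, 0),
  PROPDMG    = c(10, 0, 0),
  CROPDMG    = c(3, 0, 0),
  stringsAsFactors = FALSE
)
measures <- c("FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")

# Every measure non-zero: what comma-separated conditions in filter() select
all_nonzero <- toy[rowSums(toy[, measures] != 0) == length(measures), ]

# At least one measure non-zero: the looser alternative
any_nonzero <- toy[rowSums(toy[, measures] != 0) > 0, ]
```

Here all_nonzero keeps only event A, while any_nonzero keeps A and B. Which reading fits an analysis is a choice for the analyst, and the report's code uses the first.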

Data cleanup Steps

Replace non-standard event names with the standard names. The standard names were taken from the documentation at https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf

stormsub$EVTYPE[stormsub$EVTYPE=="EXCESSIVE HEAT"] <- "HEAT"
stormsub$EVTYPE[stormsub$EVTYPE=="HEAT WAVE DROUGHT"] <- "DROUGHT"
stormsub$EVTYPE[stormsub$EVTYPE=="THUNDERSTORM WINDS"] <- "THUNDERSTORM WIND"
stormsub$EVTYPE[stormsub$EVTYPE=="TROPICAL STORM GORDON"] <- "TROPICAL STORM"
stormsub$EVTYPE[stormsub$EVTYPE=="TSTM WIND"] <- "THUNDERSTORM WIND"
stormsub$EVTYPE[stormsub$EVTYPE=="WINTER STORM HIGH WINDS"] <- "WINTER STORM"
stormsub$EVTYPE[stormsub$EVTYPE=="WINTER STORMS"] <- "WINTER STORM"
stormsub$EVTYPE[stormsub$EVTYPE=="HIGH WINDS"] <- "HIGH WIND"
stormsub$EVTYPE[stormsub$EVTYPE=="HURRICANE"] <- "HURRICANE/TYPHOON"
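
The same kind of renaming can be written more compactly with a named lookup vector. The sketch below is one possible alternative, not the report's method. It uses three of the replacements above on a made-up event vector, and assumes the event column is character rather than factor. The names fixes, evtype and hit are hypothetical.

```r
# Named vector: names are the non-standard spellings, values the standard names
# (three of the replacements from the report, used as a sample)
fixes <- c(
  "TSTM WIND"          = "THUNDERSTORM WIND",
  "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
  "HURRICANE"          = "HURRICANE/TYPHOON"
)

# Made-up event column for illustration
evtype <- c("TSTM WIND", "FLOOD", "HURRICANE", "THUNDERSTORM WINDS")

# Replace only the entries that appear in the lookup; leave the rest untouched
hit <- evtype %in% names(fixes)
evtype[hit] <- unname(fixes[evtype[hit]])
```

Keeping the mapping in one vector makes it easier to review and extend than a series of separate assignments.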

Create new columns for the Property Damage amount and the Crop Damage amount by multiplying the number in each damage column by the multiplier encoded in its exponent column (K = thousand, M = million, B = billion).

# Property damage in dollars (rows with any other exponent code are left as NA)
stormsub$propamt[which(stormsub$PROPDMGEXP=="M")] <- stormsub$PROPDMG[which(stormsub$PROPDMGEXP=="M")] * 1000000
stormsub$propamt[which(stormsub$PROPDMGEXP=="K")] <- stormsub$PROPDMG[which(stormsub$PROPDMGEXP=="K")] * 1000
stormsub$propamt[which(stormsub$PROPDMGEXP=="B")] <- stormsub$PROPDMG[which(stormsub$PROPDMGEXP=="B")] * 1000000000

# Crop damage in dollars (rows with any other exponent code are left as NA)
stormsub$cropamt[which(stormsub$CROPDMGEXP=="M")] <- stormsub$CROPDMG[which(stormsub$CROPDMGEXP=="M")] * 1000000
stormsub$cropamt[which(stormsub$CROPDMGEXP=="K")] <- stormsub$CROPDMG[which(stormsub$CROPDMGEXP=="K")] * 1000
stormsub$cropamt[which(stormsub$CROPDMGEXP=="B")] <- stormsub$CROPDMG[which(stormsub$CROPDMGEXP=="B")] * 1000000000
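
The conversion comes down to multiplying each damage figure by the multiplier its exponent code stands for. A minimal sketch with a named multiplier vector on made-up values illustrates the arithmetic. The names mult, dmg, dmgexp and amt are hypothetical, and only the three codes used in the report (K, M, B) are covered.

```r
# Multipliers for the three exponent codes handled in the report
mult <- c(K = 1e3, M = 1e6, B = 1e9)

# Made-up damage figures and exponent codes for illustration
dmg <- c(25, 1.5, 2, 10)
dmgexp <- c("K", "M", "B", "")

# Indexing by an unmatched code (here "") yields NA, as in the report's code
amt <- dmg * unname(mult[dmgexp])
```

For example, 25 with code K becomes 25,000 dollars and 1.5 with code M becomes 1,500,000 dollars. The last value has no recognised code, so its amount stays NA.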

Create a column for the total damage (property plus crop)

stormsub$totalamt <- stormsub$propamt + stormsub$cropamt
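
One detail worth keeping in mind with this addition: in R, adding NA to a number gives NA, so a row whose property or crop amount was left as NA also gets an NA total. The sketch below shows that behaviour on made-up amounts, next to one possible alternative that treats a missing amount as 0. Whether that alternative is appropriate is an analysis decision the report does not make. The names total_plus and total_narm are hypothetical.

```r
# Made-up amounts; NA marks a row whose exponent code was not K, M or B
propamt <- c(1000, 2e6, NA)
cropamt <- c(500, NA, NA)

# Plain addition: any NA makes the total NA
total_plus <- propamt + cropamt

# Alternative: treat a missing amount as 0 when summing across the two columns
total_narm <- rowSums(cbind(propamt, cropamt), na.rm = TRUE)
```

The same propagation applies to sum() within a group unless na.rm = TRUE is passed.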

Results

The results are shown as plots below.

Property and Crop damages

Draw the plot for the damages

library(ggplot2)
# Sum the total damage for each event type
group <- group_by(stormsub, EVTYPE)
damagesumm <- summarize(group, sum(totalamt))
damagesumm <- as.data.frame(damagesumm)
names(damagesumm) <- c("event", "amount")
# Bar chart of total damage per event, with vertical x-axis labels
ggplot(damagesumm, aes(event, amount)) +
  geom_bar(stat = "identity") +
  labs(title = "Damage amounts for events", x = "Events", y = "Damage Amounts") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

The plot shows that hurricanes/typhoons cause the most costly damage.
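
Besides reading the tallest bar off the plot, the top event can be confirmed by sorting the per-event sums. The sketch below does this in base R on made-up totals. The data frame toy and the names per_event and ranked are hypothetical, and the numbers are not results from the storm data.

```r
# Made-up totals for illustration; not results from the storm data
toy <- data.frame(
  EVTYPE   = c("FLOOD", "TORNADO", "FLOOD", "HAIL", "TORNADO"),
  totalamt = c(10, 5, 20, 1, 7),
  stringsAsFactors = FALSE
)

# Sum per event type (base R counterpart of group_by() + summarize())
per_event <- aggregate(totalamt ~ EVTYPE, data = toy, FUN = sum)

# Order from largest to smallest so the top event is the first row
ranked <- per_event[order(per_event$totalamt, decreasing = TRUE), ]
```

The same ordering step could be applied to a summary such as damagesumm to list events by total damage.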

Impact of severe weather on human health

Draw the plot for the Fatalities

# Sum fatalities per event type and plot them
fatalities <- as.data.frame(summarize(group, sum(FATALITIES)))
names(fatalities) <- c("event", "amount")
ggplot(fatalities, aes(event, amount)) + geom_bar(stat = "identity") +
  labs(title = "Fatalities for events", x = "Events", y = "Fatalities") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

The plot shows that Tornadoes cause the most fatalities.

Draw the plot for Injuries

# Sum injuries per event type and plot them
injuries <- as.data.frame(summarize(group, sum(INJURIES)))
names(injuries) <- c("event", "amount")
ggplot(injuries, aes(event, amount)) + geom_bar(stat = "identity") +
  labs(title = "Injuries for events", x = "Events", y = "Injuries") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

The plot shows that Floods cause the most injuries.

End of report