The dataset provides a history of the weather events that occurred in the US from 1950 to 2011 and records the extent of the damage caused in each year. This analysis identifies the 10 weather events most harmful to human health and the 10 weather events with the greatest economic impact.
Reading data from the source dataset:
data <- read.csv("~/assignment/repdata-data-StormData.csv",header = TRUE, sep = ",")
Based on the dataset provided, the columns INJURIES and FATALITIES are used to calculate the casualties caused by each event. The event with the highest total casualties is taken to be the most harmful to the human population.
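The calculation described above can be sketched as follows. This is a minimal illustration in base R, assuming casualties are simply INJURIES plus FATALITIES summed per EVTYPE; the `toy` data frame and the names `casualties` and `top` are made up for the example and are not part of the dataset or the original analysis.

```r
## Hypothetical toy data standing in for the storm dataset (values are made up)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  INJURIES   = c(10, 2, 5, 0, 1),
  FATALITIES = c(1, 0, 2, 3, 1),
  stringsAsFactors = FALSE
)

## casualties per record = injuries + fatalities (assumed definition)
toy$CASUALTIES <- toy$INJURIES + toy$FATALITIES

## total casualties per event type
casualties <- aggregate(CASUALTIES ~ EVTYPE, data = toy, FUN = sum)

## sort from most to least harmful and keep at most the top 10
casualties <- casualties[order(-casualties$CASUALTIES), ]
top <- head(casualties, 10)
print(top)
```

On the full dataset, `toy` would be replaced by `data`, and `top` would hold the 10 event types with the most casualties.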
## Loading required package: reshape2
## Loading required package: ggplot2
Based on the chart, tornadoes registered the highest casualties of any event over the 61 years covered.
From the dataset, we use the cost of damage to property and crops to calculate the economic impact.
## converting cost of damages to Property and Crop
## map an exponent code to its multiplier (K = thousand, M = million, B = billion); any other code counts as 1
expToMultiplier <- function(x) {
  if (x == "K") 10^3
  else if (x == "M") 10^6
  else if (x == "B") 10^9
  else 1
}
data$PROPDMGconv <- sapply(data$PROPDMGEXP, expToMultiplier)
data$CROPDMGconv <- sapply(data$CROPDMGEXP, expToMultiplier)
data$PropDmgValue <- data$PROPDMG * data$PROPDMGconv
data$CropDmgValue <- data$CROPDMG * data$CROPDMGconv
## sum property and crop damage per event type
cost <- melt(data, id = "EVTYPE", measure.vars = c("PropDmgValue", "CropDmgValue"))
cost_df <- dcast(cost, EVTYPE ~ variable, function(x) sum(x, na.rm = TRUE))
## rank events by total damage (property + crop) and keep the 10 costliest
econDamage <- cost_df[order(-rowSums(cost_df[c("PropDmgValue", "CropDmgValue")])), ]
top10econDamage <- econDamage[1:10, ]
## reshape for plotting and express values in billions of USD
top10econDamage <- melt(top10econDamage, id = "EVTYPE", measure.vars = c("PropDmgValue", "CropDmgValue"))
top10econDamage$value <- top10econDamage$value / 10^9
top10econDamage$EVTYPE <- reorder(top10econDamage$EVTYPE, top10econDamage$value)
c2<-ggplot(top10econDamage,aes(x=EVTYPE,y=value))
c2<-c2+geom_bar(stat="identity", fill = "green")
c2<-c2+theme(axis.text.x=element_text(angle=45,vjust=1,hjust=1))
c2<-c2+labs(x="Weather events", y="USD (Billions)", title="Top 10 Weather Events with the Highest Economic Consequences")
print(c2)
Based on the chart, the weather event with the highest economic impact was flood.
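The exponent conversion used in the analysis can also be written as a vectorised lookup. The sketch below is one possible variant, not the method used in this report. It rests on two assumptions: the hypothetical helper `exp_multiplier` treats lower-case codes like their upper-case counterparts (the code above does not), and every unrecognised code maps to 1. The values in `dmg` and `code` are made up.

```r
## Hypothetical helper: map exponent codes to multipliers with a named lookup vector
exp_multiplier <- function(codes) {
  lookup <- c(K = 10^3, M = 10^6, B = 10^9)
  mult <- lookup[toupper(as.character(codes))]  # NA for codes not in the lookup
  mult[is.na(mult)] <- 1                        # assumption: unrecognised codes count as 1
  unname(mult)
}

## Made-up example values: damage figures and their exponent codes
dmg  <- c(25, 2.5, 1, 10)
code <- c("K", "M", "B", "")

## damage in USD = reported figure * multiplier
dmg_usd <- dmg * exp_multiplier(code)
print(dmg_usd)
```

Because the lookup works on whole vectors, it avoids the element-by-element `sapply` call. Whether lower-case codes should be folded into upper-case ones is a judgment call that changes the totals slightly, so it is best checked against the codes actually present in the data.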