Synopsis

The dataset provides a history of the weather events that occurred in the US from 1950 to 2011, and records the extent of the damage caused in each year. This analysis identifies the 10 weather events most harmful to human health and the 10 weather events with the greatest economic impact.

Data Processing

Reading data from the source dataset:

data <- read.csv("~/assignment/repdata-data-StormData.csv",header = TRUE, sep = ",")

Q1 Analysing which events cause the most harm to the human population.

Based on the dataset provided, the columns INJURIES and FATALITIES are used to calculate the casualties caused by each event. The event with the highest registered casualties is taken to be the most harmful to the human population.

## Loading required package: reshape2
## Loading required package: ggplot2
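
The code for this step is not echoed in the report. As a minimal sketch of how the casualty totals could be computed, the example below uses base R's aggregate() on a small made-up data frame. The name toy and all of its values are illustrative assumptions, not rows from the storm dataset, and the report's own code may have used reshape2 instead.

```r
# Hypothetical toy data standing in for the storm dataset (values are made up)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 25, 6, 2)
)

# Sum fatalities and injuries per event type
casualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Total casualties = fatalities + injuries
casualties$TOTAL <- casualties$FATALITIES + casualties$INJURIES

# Rank events from most to least harmful and keep at most the top 10
casualties <- casualties[order(-casualties$TOTAL), ]
top10casualties <- head(casualties, 10)
print(top10casualties)
```

To apply the same steps to the full dataset, toy would be replaced with data.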

Result

Based on the chart, Tornado is the event that registered the highest casualties over the 61 years.

Q2 Analysing which weather conditions have the highest impact on the economy.

From the dataset, we use the cost of damage to property and crops to calculate the impact on the economy.

## converting cost of damages to Property and Crop
## exponent codes: K = thousands, M = millions, B = billions;
## any other code (including lowercase or numeric codes) falls through to 1
expToMultiplier <- function(x) {
  if(x=="K") 10^3
  else if(x=="M") 10^6
  else if(x=="B") 10^9
  else 1
}

data$PROPDMGconv <- sapply(data$PROPDMGEXP, expToMultiplier)

data$CROPDMGconv <- sapply(data$CROPDMGEXP, expToMultiplier)

data$PropDmgValue <- data$PROPDMG * data$PROPDMGconv
data$CropDmgValue <- data$CROPDMG * data$CROPDMGconv
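
As a worked illustration of the conversion above, the sketch below multiplies a few made-up damage figures by the multiplier implied by their exponent code. It uses a vectorised named-vector lookup instead of sapply(), which is one possible alternative rather than the method used in this report. The names toy, multiplier and conv are hypothetical.

```r
# Hypothetical rows (made-up values) showing how magnitude and exponent code combine
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 7),
  PROPDMGEXP = c("K", "M", "B", "")
)

# Lookup of multipliers by exponent code; codes not listed here yield NA
multiplier <- c(K = 10^3, M = 10^6, B = 10^9)
conv <- unname(multiplier[as.character(toy$PROPDMGEXP)])

# Unrecognised codes fall back to a multiplier of 1, as in the sapply() version
conv[is.na(conv)] <- 1

toy$PropDmgValue <- toy$PROPDMG * conv
print(toy)
```

For example, a PROPDMG of 2.5 with code "M" becomes 2,500,000 USD, while a row with an empty code keeps its raw value.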

cost<-melt(data, id="EVTYPE", measure.vars = c("PropDmgValue", "CropDmgValue"))

cost_df<-dcast(cost, EVTYPE~variable, function(x) sum(x, na.rm=TRUE))

econDamage<-cost_df[with(cost_df, order(-rowSums(cost_df[c("PropDmgValue","CropDmgValue")]))),c(1,2:3)]

top10econDamage <- econDamage[1:10,]
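
The melt, dcast and order steps above sum both damage columns per event type and rank the events by their combined total. A minimal base R sketch of the same idea, on made-up data, can serve as a cross-check. The names toy, totals and ranked are hypothetical and the values are not taken from the storm dataset.

```r
# Hypothetical toy data (made-up values), including one missing crop figure
toy <- data.frame(
  EVTYPE       = c("FLOOD", "HAIL", "FLOOD", "DROUGHT", "HAIL"),
  PropDmgValue = c(5e9, 1e9, 3e9, 0, 2e9),
  CropDmgValue = c(1e9, 5e8, 0, 4e9, NA)
)

# Sum each damage column per event type, ignoring missing values
totals <- aggregate(
  toy[c("PropDmgValue", "CropDmgValue")],
  by = list(EVTYPE = toy$EVTYPE),
  FUN = sum, na.rm = TRUE
)

# Rank by combined property + crop damage, largest first
combined <- rowSums(totals[c("PropDmgValue", "CropDmgValue")])
ranked <- totals[order(-combined), ]
print(ranked)
```

The non-formula interface of aggregate() is used here because the formula interface drops rows containing NA by default, which would not match the sum(x, na.rm=TRUE) behaviour of the dcast call.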

top10econDamage<-melt(top10econDamage, id="EVTYPE", measure.vars = c("PropDmgValue","CropDmgValue"))

top10econDamage$value <- top10econDamage$value/10^9

top10econDamage$EVTYPE<-reorder(top10econDamage$EVTYPE, top10econDamage$value)

c2<-ggplot(top10econDamage,aes(x=EVTYPE,y=value))

c2<-c2+geom_bar(stat="identity", fill = "green") 

c2<-c2+theme(axis.text.x=element_text(angle=45,vjust=1,hjust=1))

c2<-c2+labs(x="Weather events", y="USD (Billions)", title="Top 10 Weather Conditions with the Highest Economic Consequences")

print(c2)

Result

Based on the chart, the weather condition with the highest impact on the economy was Flood.