Synopsis

This report briefly reviews the top 5 severe weather events that are most harmful to population health, as well as the top 5 that caused the greatest economic damage, over the last 60 years.

The dataset I used for this analysis comes from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. It can be retrieved from the URL given in the Data Processing section.

According to my analysis, TORNADO is the weather event most harmful to population health and also the one with the greatest economic damage. The figures are sums grouped by event type (EVTYPE) and then sorted.

Data Processing

This section describes how to download the dataset and load it into R.

### Download the dataset to the local computer
# url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# download.file(url, destfile="./repdata-data-StormData.csv.bz2", method="curl")

### Before unzipping the bz2 file, install the following packages in R. Some of them
### have system-level prerequisites on the host; search for each package's
### installation notes if an install fails. bunzip2() itself comes from R.utils.
# install.packages("ncdf4", method = "curl")
# install.packages("RNetCDF", method = "curl")
# install.packages("R.utils", method = "curl")

### Loading packages to unzip the bz2 file
library(ncdf4)
library(R.utils)

### Unzip the bz2 file
#bunzip2("repdata-data-StormData.csv.bz2")

### Load the data into the data frame rawdata
setwd("/home/winfield/R-lab/course5/week5")
rawdata <- read.csv("repdata-data-StormData.csv")
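
As a side note on the unzip step: base R's read.csv() can open a bzip2-compressed file directly, so the bunzip2() call is optional. The following is a minimal sketch of that behaviour on a small made-up data frame written to a temporary file; the toy values and the names toy, bz_path and roundtrip are assumptions for illustration and are not part of the storm data.

```r
# Toy stand-in for the storm file; the real file is much larger.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD"),
  FATALITIES = c(3, 1),
  stringsAsFactors = FALSE
)

# Write the toy data as a bzip2-compressed CSV in a temporary location.
bz_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(bz_path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and reads the file directly,
# so no separate bunzip2() step is needed.
roundtrip <- read.csv(bz_path, stringsAsFactors = FALSE)
print(roundtrip)
```

Under the same assumption, calling read.csv() on the downloaded .csv.bz2 file should work without unzipping it first, at the cost of decompressing on every read.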

Results

This section builds a table of the top 5 severe weather events that are most harmful to population health and a table of the top 5 that cause the greatest economic damage, with a bar chart for each:

### Load dplyr for the data processing that produces the results
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
### Top 5 severe weather events most harmful to population health,
### ranked by FATALITIES with ties broken by INJURIES
### (the object keeps its top10 name although it holds 5 rows)
top10harmhealthevents <- rawdata %>%
  group_by(EVTYPE) %>%
  summarise_at(c("FATALITIES", "INJURIES"), sum) %>%
  arrange(desc(FATALITIES), desc(INJURIES)) %>%
  select(EVTYPE, FATALITIES, INJURIES) %>%
  mutate(TOTAL = FATALITIES + INJURIES) %>%
  head(5)

### Top 5 severe weather events causing the greatest economic damage,
### ranked by PROPDMG with ties broken by CROPDMG
### (the object keeps its top10 name although it holds 5 rows)
top10econmicdamagevents <- rawdata %>%
  group_by(EVTYPE) %>%
  summarise_at(c("PROPDMG", "CROPDMG"), sum) %>%
  arrange(desc(PROPDMG), desc(CROPDMG)) %>%
  select(EVTYPE, PROPDMG, CROPDMG) %>%
  mutate(TOTALDMG = PROPDMG + CROPDMG) %>%
  head(5)
### Bar chart of total casualties for the top 5 events
library(ggplot2)
ggplot(top10harmhealthevents, aes(EVTYPE, TOTAL)) +
  labs(title = "Total Casualties Caused by Each of the Top 5 Severe Weather Events") +
  geom_bar(stat = "identity", fill = "#FF9999", colour = "black")
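
The chart plots TOTAL, while the top 5 are chosen by ranking on FATALITIES first. If the intent is to rank by combined casualties, the selection could sort on TOTAL instead. Here is a minimal sketch on toy data; the event names A, B and C and all counts are made up for illustration and are not values from the NOAA data.

```r
library(dplyr)

# Toy rows for illustration only; not values from the NOAA data.
toy <- data.frame(
  EVTYPE     = c("A", "A", "B", "C"),
  FATALITIES = c(5, 5, 12, 1),
  INJURIES   = c(100, 50, 3, 20),
  stringsAsFactors = FALSE
)

# Sum per event type, compute the combined total, then rank by that total.
by_total <- toy %>%
  group_by(EVTYPE) %>%
  summarise(FATALITIES = sum(FATALITIES), INJURIES = sum(INJURIES)) %>%
  mutate(TOTAL = FATALITIES + INJURIES) %>%
  arrange(desc(TOTAL)) %>%
  head(2)

print(by_total)
```

In this toy data, event B has the most fatalities but the smallest combined total, so the two ranking rules pick different events. Whether they differ on the real data depends on the data itself.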

### Bar chart of total economic loss for the top 5 events (ggplot2 is already loaded)
ggplot(top10econmicdamagevents, aes(EVTYPE, TOTALDMG)) +
  labs(title = "Total Economic Loss Caused by Each of the Top 5 Severe Weather Events") +
  geom_bar(stat = "identity", fill = "#FF9999", colour = "black")
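
The economic totals above sum the raw PROPDMG and CROPDMG columns. The storm data also carries exponent columns (PROPDMGEXP and CROPDMGEXP) that indicate the magnitude of each figure, and the code above does not use them. Below is a minimal sketch of how the figures could be scaled before summing. The helper exp_to_multiplier is hypothetical, the toy rows are made up, and treating every code other than K, M and B as a multiplier of 1 is an assumption, not a rule taken from this report.

```r
library(dplyr)

# Hypothetical helper: map an exponent code to a numeric multiplier.
# Assumption: only K, M and B are interpreted; any other code counts as 1.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  ifelse(is.na(mult), 1, mult)
}

# Toy rows for illustration only; not values from the NOAA data.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "TORNADO", "HAIL"),
  PROPDMG    = c(2, 500, 250, 10),
  PROPDMGEXP = c("B", "K", "K", ""),
  CROPDMG    = c(1, 0, 5, 3),
  CROPDMGEXP = c("M", "", "K", "M"),
  stringsAsFactors = FALSE
)

# Scale each figure by its exponent, then sum per event type and rank.
scaled <- toy %>%
  mutate(
    PROPUSD = PROPDMG * exp_to_multiplier(PROPDMGEXP),
    CROPUSD = CROPDMG * exp_to_multiplier(CROPDMGEXP)
  ) %>%
  group_by(EVTYPE) %>%
  summarise(TOTALUSD = sum(PROPUSD + CROPUSD)) %>%
  arrange(desc(TOTALUSD))

print(scaled)
```

In this toy data the raw sums would rank TORNADO first, while the scaled totals rank FLOOD first, which is why the exponent columns may be worth checking before drawing conclusions about economic damage.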

Thanks for reading!