Synopsis

Storms and other severe weather events can have a large impact on public health and the economy.

It is therefore important to limit that impact before a disaster occurs.

One thing we can do is analyze past records to get a sense of which severe weather events cause the most harm. The storm dataset recorded by NOAA is useful for this purpose.

Data Processing

This dataset covers storms and severe weather events recorded by NOAA (National Oceanic and Atmospheric Administration). It is accessible through the URL below:
https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
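
As a minimal sketch of how the file at this URL could be fetched and read in one step: base R's read.csv() can read a bzip2-compressed CSV directly, so decompressing by hand is optional. The local file name `StormData.csv.bz2` and the helper name `fetch_storm_data` are hypothetical choices for illustration, not part of the original analysis.

```r
# URL taken from the text above; the local file name is a hypothetical choice
storm_url  <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
storm_file <- "StormData.csv.bz2"

# download the file once (if it is not already present), then read it;
# read.csv() handles the bzip2 compression transparently
fetch_storm_data <- function(url = storm_url, dest = storm_file) {
  if (!file.exists(dest)) {
    download.file(url, destfile = dest, mode = "wb")
  }
  read.csv(dest, stringsAsFactors = FALSE)
}
```

Calling `fetch_storm_data()` with no arguments would download the full dataset on first use and reuse the local copy afterwards.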

Reading and pre-processing data:

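# read only column 8 and columns 23-28 of the 37 columns; "NULL" skips a column, NA lets R pick the class
stormData <- read.csv("repdata_data_StormData.csv",
                      colClasses = c(rep("NULL", 7), NA, rep("NULL", 14), rep(NA, 6), rep("NULL", 9)))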
## Warning in scan(file, what, nmax, sep, dec, quote, skip, nlines,
## na.strings, : EOF within quoted string
# keep only relevant columns
storm <- stormData[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG",  "CROPDMG")]
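# convert the columns to numeric; go through as.character() first, because
# as.numeric() applied directly to a factor returns level codes, not the recorded values
storm$FATALITIES <- as.numeric(as.character(storm$FATALITIES))
storm$INJURIES <- as.numeric(as.character(storm$INJURIES))
storm$PROPDMG <- as.numeric(as.character(storm$PROPDMG))
storm$CROPDMG <- as.numeric(as.character(storm$CROPDMG))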

head(storm)
##    EVTYPE FATALITIES INJURIES PROPDMG CROPDMG
## 1 TORNADO      19282    19199   19216   17967
## 2 TORNADO      19282    18909   19130   17967
## 3 TORNADO      19282    19229   19216   17967
## 4 TORNADO      19282    19229   19130   17967
## 5 TORNADO      19282    19229   19130   17967
## 6 TORNADO      19282    19368   19130   17967
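
The size of the numbers printed by head() is a useful check on the class conversion. If a column was read as a factor, as.numeric() applied to it directly returns the factor's internal level codes rather than the recorded values. The sketch below uses a made-up vector (not taken from the storm data) to show the difference.

```r
# made-up values for illustration, stored as a factor
x <- factor(c("0", "15", "2.5", "0"))

codes  <- as.numeric(x)                # internal level codes of the factor
values <- as.numeric(as.character(x))  # the values as they were written

print(codes)
print(values)
```

Comparing the two printed vectors shows whether a conversion has kept the original values or replaced them with codes.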

Aggregate the data by event type into four small datasets, one per damage measure:

# total each kind of damage (fatalities, injuries, property damage, crop damage) by type of severe weather event
fat_storm <- aggregate(storm$FATALITIES, by=list(storm$EVTYPE), FUN=sum)
inj_storm <- aggregate(storm$INJURIES, by=list(storm$EVTYPE), FUN=sum)
prop_storm <- aggregate(storm$PROPDMG, by=list(storm$EVTYPE), FUN=sum)
crop_storm <- aggregate(storm$CROPDMG, by=list(storm$EVTYPE), FUN=sum)
# rename the columns ("frequency" holds the summed total for each event type)
names(fat_storm) <- c("type", "frequency")
names(inj_storm) <- c("type", "frequency")
names(prop_storm) <- c("type", "frequency")
names(crop_storm) <- c("type", "frequency")
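
The sums above add the raw PROPDMG and CROPDMG figures. In the NOAA data each of those figures is paired with an exponent column (PROPDMGEXP, CROPDMGEXP), which the read step loads but the column selection drops. The sketch below shows one way the figures could be scaled to dollars before aggregating. It assumes the codes K, M and B mean thousands, millions and billions and treats every other code as a multiplier of 1, which is a simplification. The helper name `exp_to_multiplier` and the toy rows are hypothetical.

```r
# assumed mapping of exponent codes to multipliers; any other code is treated as 1
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# toy rows (made up) to show the scaling
toy <- data.frame(PROPDMG = c(25, 2.5, 1),
                  PROPDMGEXP = c("K", "M", ""),
                  stringsAsFactors = FALSE)
toy$PROPDMG_USD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
```

Applying this to the real data would require keeping PROPDMGEXP and CROPDMGEXP when selecting the relevant columns.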

Finally, select the top five weather events from the sorted data:

# sort each dataset in descending order
fat_storm <- fat_storm[order(fat_storm$frequency, decreasing=TRUE), ]
inj_storm <- inj_storm[order(inj_storm$frequency, decreasing=TRUE), ]
prop_storm <- prop_storm[order(prop_storm$frequency, decreasing=TRUE), ]
crop_storm <- crop_storm[order(crop_storm$frequency, decreasing=TRUE), ]
# keep the top 5 rows of each dataset for plotting
new.fat_storm <- fat_storm[1:5, ]
new.inj_storm <- inj_storm[1:5, ]
new.prop_storm <- prop_storm[1:5, ]
new.crop_storm <- crop_storm[1:5, ]
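
The aggregate, rename, sort and select steps are repeated four times with only the column changing. As a sketch of how they could be folded into one helper (the name `top_events` is hypothetical and not part of the original analysis):

```r
# sum `values` within each event type and return the n types with the largest totals
top_events <- function(values, types, n = 5) {
  totals <- aggregate(values, by = list(types), FUN = sum)
  names(totals) <- c("type", "frequency")
  head(totals[order(totals$frequency, decreasing = TRUE), ], n)
}

# toy data (made up) to show the call
toy_types  <- c("A", "B", "A", "C")
toy_values <- c(1, 5, 2, 0)
top2 <- top_events(toy_values, toy_types, n = 2)
```

With the storm data, a call such as `top_events(storm$FATALITIES, storm$EVTYPE)` would stand in for the fatalities steps above.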

Results

I use the base plotting system for the figures.

To answer the first question, I plot a figure showing that TSTM WIND is the weather event with the greatest effect on people's health, for both fatalities and injuries.

# compare which types of weather events cause the most fatalities and injuries
par(mfrow=c(1,2))
barplot(new.fat_storm$frequency, names.arg=new.fat_storm$type, 
        xlab="Weather Event Types", ylab="Fatalities", main="Events Causing Deaths")
barplot(new.inj_storm$frequency, names.arg=new.inj_storm$type, 
        xlab="Weather Event Types", ylab="Injuries", main="Events Causing Injuries")

Similarly, to answer the second question, I plot the figure below. It shows the same result: TSTM WIND also has the greatest impact on property.

# compare which types of weather events cause the most property and crop damage
par(mfrow=c(1,2))
barplot(new.prop_storm$frequency, names.arg=new.prop_storm$type, 
        xlab="Weather Event Types", ylab="Properties", main="Events Causing Property Damage")
barplot(new.crop_storm$frequency, names.arg=new.crop_storm$type, 
        xlab="Weather Event Types", ylab="Crops", main="Events Causing Crop Damage")
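
With two panels side by side, base barplot() may leave out some bar labels when the event names are too long to fit. A possible adjustment, sketched below, is to rotate and shrink the labels and widen the bottom margin. The helper name `plot_top_events`, the margin sizes and the toy data are illustrative assumptions.

```r
# draw one bar chart with rotated, smaller labels so long event names stay readable
plot_top_events <- function(top, ylab, main) {
  op <- par(mar = c(8, 4, 4, 2))  # wider bottom margin for the rotated labels
  on.exit(par(op))                # restore the previous margins afterwards
  barplot(top$frequency, names.arg = top$type,
          las = 2,                # draw labels perpendicular to the axis
          cex.names = 0.7,        # shrink the label text
          ylab = ylab, main = main)
}

# toy data (made up) in the same shape as new.fat_storm
toy_top <- data.frame(type = c("EVENT A", "EVENT B", "EVENT C"),
                      frequency = c(30, 20, 10),
                      stringsAsFactors = FALSE)
```

A call such as `plot_top_events(new.fat_storm, "Fatalities", "Events Causing Deaths")` would then draw the first panel.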