Storms and other severe weather events can have a large impact on public health and the economy.
It is therefore important to anticipate and limit that impact before a disaster occurs.
One thing we can do is analyze past records to get a sense of which severe weather events cause the most harm. The storm dataset recorded by NOAA is useful for this purpose.
The dataset covers storm and severe weather events recorded by NOAA (National Oceanic and Atmospheric Administration) and is accessible through the URL below: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
Reading and pre-processing data:
# read only column 8 and columns 23-28: "NULL" skips a column, NA lets read.csv infer its type
stormData <- read.csv("repdata_data_StormData.csv",
                      colClasses = c(rep("NULL", 7), NA, rep("NULL", 14), rep(NA, 6), rep("NULL", 9)))
## Warning in scan(file, what, nmax, sep, dec, quote, skip, nlines,
## na.strings, : EOF within quoted string
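The warning above usually means the parser met an unbalanced quote and stopped early, so some rows may be missing and some columns may not have been imported with the expected type. One possible way to make such problems visible is to read the compressed file directly and declare the column types explicitly, so that a non-numeric value raises an error instead of passing silently. The sketch below uses a hypothetical toy file with an assumed four-column layout, not the NOAA file itself.

```r
# Hypothetical toy data written to a bzip2-compressed temporary file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, "w")
writeLines(c("id,EVTYPE,note,FATALITIES",
             "1,TORNADO,\"a, b\",0",
             "2,HAIL,\"c\",3"), con)
close(con)

# read.csv can read a bzip2-compressed file directly from its path.
# "NULL" skips a column; naming a type makes read.csv fail loudly
# if a value in that column cannot be parsed as that type.
toy <- read.csv(tmp, colClasses = c("NULL", "character", "NULL", "numeric"))

str(toy)
```

With the types declared up front, no later class conversion is needed for the numeric columns.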
# keep only relevant columns
storm <- stormData[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")]
# convert the count and damage columns to numeric
storm$FATALITIES <- as.numeric(storm$FATALITIES)
storm$INJURIES <- as.numeric(storm$INJURIES)
storm$PROPDMG <- as.numeric(storm$PROPDMG)
storm$CROPDMG <- as.numeric(storm$CROPDMG)
head(storm)
## EVTYPE FATALITIES INJURIES PROPDMG CROPDMG
## 1 TORNADO 19282 19199 19216 17967
## 2 TORNADO 19282 18909 19130 17967
## 3 TORNADO 19282 19229 19216 17967
## 4 TORNADO 19282 19229 19130 17967
## 5 TORNADO 19282 19229 19130 17967
## 6 TORNADO 19282 19368 19130 17967
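The values printed above (for example 19282 fatalities for a single tornado) look more like factor level codes than counts. That would be the case if `read.csv` imported these columns as factors, because `as.numeric()` applied to a factor returns its internal integer codes rather than the numbers spelled out by the labels. A minimal sketch of the difference, using a hypothetical toy factor rather than the storm data:

```r
# Hypothetical toy factor standing in for a column imported as a factor
f <- factor(c("0", "15", "2", "15"))

codes  <- as.numeric(f)                # internal level codes, not the values
values <- as.numeric(as.character(f))  # the numbers the labels spell out

print(codes)
print(values)
```

If `is.factor(storm$FATALITIES)` returns `TRUE`, going through `as.character()` first is one way to recover the recorded numbers.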
Aggregate the data by event type into four small datasets
# total the damage (fatalities, injuries, property damage, crop damage) for each type of severe weather event
fat_storm <- aggregate(storm$FATALITIES, by=list(storm$EVTYPE), FUN=sum)
inj_storm <- aggregate(storm$INJURIES, by=list(storm$EVTYPE), FUN=sum)
prop_storm <- aggregate(storm$PROPDMG, by=list(storm$EVTYPE), FUN=sum)
crop_storm <- aggregate(storm$CROPDMG, by=list(storm$EVTYPE), FUN=sum)
# rename the columns
names(fat_storm) <- c("type", "frequency")
names(inj_storm) <- c("type", "frequency")
names(prop_storm) <- c("type", "frequency")
names(crop_storm) <- c("type", "frequency")
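The four aggregations above follow the same pattern, so they could also be computed in one call with the formula interface of `aggregate`. The sketch below assumes a hypothetical toy data frame with two measurement columns; the column names mirror the ones used here, but the numbers are made up.

```r
# Hypothetical toy data, not taken from the NOAA file
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "FLOOD"),
  FATALITIES = c(1, 0, 2, 3),
  INJURIES   = c(10, 1, 5, 0)
)

# cbind() on the left-hand side sums several columns per event type at once.
# Note: the formula interface drops rows containing NA by default.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

print(totals)
```

The result keeps the original column names, so no renaming step is needed afterwards.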
Finally, sort the data and select the top five weather events
# sort in descending order
fat_storm <- fat_storm[order(fat_storm$frequency, decreasing=TRUE), ]
inj_storm <- inj_storm[order(inj_storm$frequency, decreasing=TRUE), ]
prop_storm <- prop_storm[order(prop_storm$frequency, decreasing=TRUE), ]
crop_storm <- crop_storm[order(crop_storm$frequency, decreasing=TRUE), ]
# keep the top 5 rows for plotting
new.fat_storm <- fat_storm[1:5, ]
new.inj_storm <- inj_storm[1:5, ]
new.prop_storm <- prop_storm[1:5, ]
new.crop_storm <- crop_storm[1:5, ]
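The sort-then-slice steps above repeat for each dataset and could be wrapped in a small helper. The function name `top_events` is hypothetical, and the sketch assumes a data frame with the `type` and `frequency` columns defined earlier.

```r
# Hypothetical helper: rank by the frequency column and keep the first n rows
top_events <- function(df, n = 5) {
  ranked <- df[order(df$frequency, decreasing = TRUE), ]
  head(ranked, n)  # returns fewer rows if df has fewer than n
}

# Toy data with made-up totals
toy <- data.frame(type = c("A", "B", "C", "D"),
                  frequency = c(3, 10, 1, 7))

top2 <- top_events(toy, n = 2)
print(top2)
```

Using `head()` instead of indexing with `1:5` avoids rows of `NA` when a dataset has fewer than five event types.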
The plots use the base plotting system.
To answer the first question, I plot a figure showing that TSTM WIND is the weather event with the greatest impact on public health, in both fatalities and injuries.
# show which types of weather events cause the most fatalities and injuries
par(mfrow=c(1,2))
barplot(new.fat_storm$frequency, names.arg=new.fat_storm$type,
        xlab="Weather Event Types", ylab="Fatalities", main="Events Causing Deaths")
barplot(new.inj_storm$frequency, names.arg=new.inj_storm$type,
        xlab="Weather Event Types", ylab="Injuries", main="Events Causing Injuries")
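With five bars in each of two side-by-side panels, long event names may overlap, in which case base graphics omits some axis labels. One possible adjustment is to rotate and shrink the labels and widen the bottom margin. The sketch below uses made-up totals and draws to a null device so it can run without opening a window; the margin and size values are assumptions to tune by eye.

```r
# Hypothetical totals; the labels mimic long event names
toy <- data.frame(type = c("THUNDERSTORM WIND", "FLASH FLOOD", "TORNADO"),
                  frequency = c(30, 20, 10))

pdf(NULL)                        # null device: nothing is written to disk
op <- par(mar = c(10, 4, 3, 1))  # widen the bottom margin for rotated labels

# las = 2 draws labels perpendicular to the axis; cex.names shrinks them
mids <- barplot(toy$frequency, names.arg = toy$type, las = 2, cex.names = 0.8,
                ylab = "Fatalities", main = "Events Causing Deaths")

par(op)                          # restore the previous margins
dev.off()
```

`barplot` returns the bar midpoints, which can be reused to place custom labels with `text()` or `axis()`.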
Similarly, to answer the second question, I plot the figure below. It shows the same result: TSTM WIND also has the greatest impact on property.
# show which types of weather events cause the most property and crop damage
par(mfrow=c(1,2))
barplot(new.prop_storm$frequency, names.arg=new.prop_storm$type,
        xlab="Weather Event Types", ylab="Properties", main="Events Causing Property Damage")
barplot(new.crop_storm$frequency, names.arg=new.crop_storm$type,
        xlab="Weather Event Types", ylab="Crops", main="Events Causing Crop Damage")