Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
This report analyses the NOAA storm database and addresses the following questions:
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health? Across the United States, which types of events have the greatest economic consequences?
The raw data is downloaded from “http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2” and saved locally as “StormData.csv.bz2”.
## Download the raw data only if it is not already present locally
data_file <- "StormData.csv.bz2"
if(!file.exists(data_file)) {
Original_Data_URL <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(Original_Data_URL, destfile=data_file, mode="wb")
}
## read.csv reads the bz2-compressed file directly
data <- read.csv(data_file, stringsAsFactors=FALSE)
##str(data)
##summary(data)
## Use just the required columns for analysis
stormdata <- data[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP",
"CROPDMG", "CROPDMGEXP")]
str(stormdata)
## 'data.frame': 902297 obs. of 7 variables:
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## table(stormdata$EVTYPE)
stormdata$EVTYPE <- tolower(stormdata$EVTYPE)
summary(stormdata$EVTYPE)
## Length Class Mode
## 902297 character character
## The subset contains 902297 observations of 7 variables.
Analysis
## Question1
##"Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?"
## Aggregate how many injuries+fatalities an event caused using aggregate function.
## Use stormdata (not data) so that the lower-cased EVTYPE values are grouped together
Aggcasualties <- aggregate(INJURIES + FATALITIES ~ EVTYPE, data=stormdata, FUN=sum)
# Change the name
names(Aggcasualties)[2] <- "Totalcasualties"
# Order by number of casualties, descending
ordered_casualties <- Aggcasualties[order(-Aggcasualties$Totalcasualties),]
# Keep the top 6 rows (the default for head()); the object is named Top10 but holds 6 rows
Top10 <- head(ordered_casualties)
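The aggregate-then-order pattern used for Question 1 can be checked on a small example. The sketch below uses a made-up data frame (`toy`, with hypothetical values that are not taken from the NOAA file) so the expected totals are easy to verify by hand.

```r
# Toy data: hypothetical values, not taken from the NOAA file
toy <- data.frame(
  EVTYPE     = c("tornado", "flood", "tornado", "heat", "flood"),
  FATALITIES = c(1, 0, 2, 3, 1),
  INJURIES   = c(10, 2, 5, 0, 1),
  stringsAsFactors = FALSE
)

# Sum injuries + fatalities within each event type
toy_agg <- aggregate(INJURIES + FATALITIES ~ EVTYPE, data = toy, FUN = sum)
names(toy_agg)[2] <- "Totalcasualties"

# Sort so the event type with the most casualties comes first
toy_ordered <- toy_agg[order(-toy_agg$Totalcasualties), ]
print(toy_ordered)
```

In this toy case the tornado rows add up to 18 casualties, the flood rows to 4, and the heat row to 3, so tornado is listed first.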
Report
# Draw a barplot of total casualties for the top event types
barplot(Top10$Totalcasualties, main = "Events most harmful to population health", xlab = "Event type", ylab = "Total casualties (injuries + fatalities)", col = 'blue', names.arg = Top10$EVTYPE)
## Question 2
##Across the United States, which types of events have the greatest economic consequences?
stormdata$PROPDMGEXP <- factor(tolower(stormdata$PROPDMGEXP))
stormdata$CROPDMGEXP <- factor(tolower(stormdata$CROPDMGEXP))
require(car)
## Loading required package: car
require(plyr)
## Loading required package: plyr
stormdata$PROPDMGEXP <- as.numeric(recode(as.character(stormdata$PROPDMGEXP),
"'0'=1;'1'=10;'2'=10^2;'3'=10^3;'4'=10^4;'5'=10^5;'6'=10^6;'7'=10^7;'8'=10^8;'b'=10^9;'h'=10^2;'k'=10^3;'m'=10^6;'-'=0;'?'=0;'+'=0"))
stormdata$CROPDMGEXP <- as.numeric(recode(as.character(stormdata$CROPDMGEXP),
"'0'=1;'1'=10;'2'=10^2;'3'=10^3;'4'=10^4;'5'=10^5;'6'=10^6;'7'=10^7;'8'=10^8;'b'=10^9;'h'=10^2;'k'=10^3;'m'=10^6;'-'=0;'?'=0;'+'=0"))
# calculate values in dollars
stormdata$PROPDMGDOLLAR <- stormdata$PROPDMG * stormdata$PROPDMGEXP
stormdata$CROPDMGDOLLAR <- stormdata$CROPDMG * stormdata$CROPDMGEXP
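# Sum damages per event type; na.rm = TRUE skips rows whose exponent code was not converted to a number (for example a blank code)
stormdataAgg <- ddply(stormdata, ~EVTYPE, summarise, PROPDMG = sum(PROPDMGDOLLAR, na.rm = TRUE), CROPDMG = sum(CROPDMGDOLLAR, na.rm = TRUE))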
## Keep the 20 event types with the largest property and crop damage totals
propdmg <- stormdataAgg[order(stormdataAgg$PROPDMG, decreasing = T), c("EVTYPE",
"PROPDMG")][1:20, ]
cropdmg <- stormdataAgg[order(stormdataAgg$CROPDMG, decreasing = T), c("EVTYPE",
"CROPDMG")][1:20, ]
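The exponent codes in PROPDMGEXP and CROPDMGEXP are turned into multipliers before the dollar amounts are computed. As an illustration only, the same mapping can be sketched in base R with a named lookup vector. The names `exp_lookup` and `to_multiplier` are hypothetical, and treating a blank code as a multiplier of 1 is an assumption, not something the analysis above specifies.

```r
# Hypothetical lookup table: exponent code -> multiplier.
# The letter, digit, and symbol codes mirror the recode() mapping in the report.
exp_lookup <- c("h" = 1e2, "k" = 1e3, "m" = 1e6, "b" = 1e9,
                "-" = 0, "?" = 0, "+" = 0)
exp_lookup[as.character(0:8)] <- 10^(0:8)

to_multiplier <- function(code) {
  code <- tolower(code)
  mult <- unname(exp_lookup[code])  # NA for unknown or blank codes
  mult[code == ""] <- 1             # assumption: a blank code means "no scaling"
  mult
}

# Toy rows: hypothetical values, not taken from the NOAA file
toy_dmg <- data.frame(DMG = c(25, 2.5, 3, 0), EXP = c("K", "m", "", "?"),
                      stringsAsFactors = FALSE)
toy_dmg$DOLLAR <- toy_dmg$DMG * to_multiplier(toy_dmg$EXP)
print(toy_dmg)
```

A lookup vector keeps the mapping in one place and makes the handling of blank codes explicit. Whether a blank should count as 1 or be excluded is a judgement call that affects the totals.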
Report
par(mfrow = c(1, 2), oma = c(0, 0, 2, 0))
par(mar = c(12, 4, 3, 4))
barplot(propdmg$PROPDMG, names.arg = propdmg$EVTYPE, main = "property damages", col = 'blue', cex.axis = 0.8, cex.names = 0.7, las = 2)
barplot(cropdmg$CROPDMG, names.arg = cropdmg$EVTYPE, main = "crop damages",col = 'blue',
cex.axis = 0.8, cex.names = 0.7, las = 2)
title("Economic damage by event type ($)", outer = TRUE)
Final Analysis
This study indicates that the ‘TORNADO’ event type has the greatest effect on both the economy and population health.