Introduction
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
1: Code for reading in the dataset and processing the data.
1.a: Load the Data
NSD <- read.csv(bzfile("data/StormData.csv.bz2"))
1.b: Keep only the fields needed for the analysis.
NSDR <- data.frame(NSD$EVTYPE, NSD$FATALITIES, NSD$INJURIES, NSD$PROPDMG, NSD$PROPDMGEXP, NSD$CROPDMG, NSD$CROPDMGEXP)
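A note on the column names used from here on: building a data frame from `NSD$...` expressions makes R derive each column name from the expression, which is why later code refers to `NSD.EVTYPE` rather than `EVTYPE`. The sketch below shows this on a small made-up data frame (`toy` and its values are illustrative, not taken from the storm data), next to selection by name, which keeps the original names.

```r
# Made-up stand-in for the storm data; only two columns for brevity.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO"),
  FATALITIES = c(1, 3),
  stringsAsFactors = FALSE
)

# Same pattern as above: the column names are derived from the expressions.
prefixed <- data.frame(toy$EVTYPE, toy$FATALITIES)
names(prefixed)

# Selecting by name keeps the original column names instead.
byName <- toy[, c("EVTYPE", "FATALITIES")]
names(byName)
```

Either form works; the rest of this analysis relies on the prefixed names.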
2: Aggregation Section
- Keep only EVTYPE and the numeric values
- Summarize values
- Get only the top 10 values
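## Calculate the total damage of each event in USD from the value and exponent columns
## Assumption: exponent codes H/K/M/B mean 10^2/10^3/10^6/10^9; any other code is treated as 10^0
expCodes <- c(H = 2, K = 3, M = 6, B = 9)
toPower <- function(e) { p <- expCodes[toupper(as.character(e))]; unname(ifelse(is.na(p), 0, p)) }
totalDamageUSD <- NSDR$NSD.PROPDMG * 10^toPower(NSDR$NSD.PROPDMGEXP) + NSDR$NSD.CROPDMG * 10^toPower(NSDR$NSD.CROPDMGEXP)
NSDR <- cbind(NSDR, totalDamageUSD)
NSDR <- NSDR[,c(1,2,3,8)]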
## Summarize Values
NSDRAggData <- aggregate(. ~ NSD.EVTYPE, data = NSDR, FUN = sum)
## Get the top 10 items of Fatalities, Injuries, and Damages
stormTopFatalities <- head(NSDRAggData[order(NSDRAggData$NSD.FATALITIES, decreasing = TRUE), ], 10)
stormTopInjuries <- head(NSDRAggData[order(NSDRAggData$NSD.INJURIES, decreasing = TRUE), ], 10)
stormTopDamages <- head(NSDRAggData[order(NSDRAggData$totalDamageUSD, decreasing = TRUE), ], 10)
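The summarize-then-rank pattern used in this section can be tried on a small scale. The following is a minimal sketch on made-up rows; `toy`, `toyAgg`, `toyTop` and the helper `topN` are hypothetical names introduced only for illustration.

```r
# Made-up rows; column names mirror the ones used in this analysis.
toy <- data.frame(
  NSD.EVTYPE     = c("FLOOD", "TORNADO", "FLOOD", "HAIL", "TORNADO"),
  NSD.FATALITIES = c(1, 3, 2, 0, 4),
  NSD.INJURIES   = c(4, 10, 1, 0, 6),
  stringsAsFactors = FALSE
)

# One row per event type, every other column summed within the group.
toyAgg <- aggregate(. ~ NSD.EVTYPE, data = toy, FUN = sum)

# Hypothetical helper: sort by one column (descending) and keep the first n rows.
topN <- function(df, column, n = 10) {
  head(df[order(df[[column]], decreasing = TRUE), ], n)
}

toyTop <- topN(toyAgg, "NSD.FATALITIES", n = 2)
print(toyTop)
```

Two details are worth keeping in mind. The formula interface of `aggregate` drops rows containing `NA` by default (`na.action = na.omit`), so missing values silently reduce the totals. Subsetting also keeps the row names of the aggregated table, which is presumably where row labels such as 834 and 170 in the outputs further down come from.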
3 Results of Analysis
3.1 Top 10 events that are most harmful to population health, by number of fatalities.
library(ggplot2)
ggplot(data = stormTopFatalities, aes(x = NSD.EVTYPE, y = NSD.FATALITIES)) + geom_bar(stat = "identity") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Event Type") + ylab("# of Fatalities") + ggtitle("NOAA Top 10: Highest Fatality Counts, 1950-2011")
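By default ggplot2 places a discrete x axis in alphabetical (factor level) order, so the bars of a top-10 chart are not sorted by height. One possible adjustment, sketched here with base R's `reorder` on made-up values (`evtype` and `fatalities` are illustrative vectors, not the real totals), is to order the levels by the plotted quantity.

```r
# Made-up event types and counts, for illustration only.
evtype     <- c("FLOOD", "HEAT", "TORNADO")
fatalities <- c(5, 20, 50)

# reorder() sorts the factor levels by the second argument (ascending),
# so negating the counts puts the largest bar first.
ordered <- reorder(evtype, -fatalities)
levels(ordered)

# In the plot, the same idea would be written inside aes(), for example:
#   aes(x = reorder(NSD.EVTYPE, -NSD.FATALITIES), y = NSD.FATALITIES)
```

This changes only the display order of the bars, not the data.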

3.2 Top 10 events that cause the highest number of injuries.
ggplot(data = stormTopInjuries, aes(x = NSD.EVTYPE, y = NSD.INJURIES)) + geom_bar(stat = "identity") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Event Type") + ylab("# of Injuries") + ggtitle("NOAA Top 10: Highest Injury Counts, 1950-2011")

3.3 Tornadoes appear to be the most harmful. What are their fatality and injury counts?
stormTopFatalities[stormTopFatalities$NSD.EVTYPE=="TORNADO",c(1,2,3)]
##     NSD.EVTYPE NSD.FATALITIES NSD.INJURIES
## 834    TORNADO           5633        91346
3.4 Damage cost of the top 10 most expensive event types.
ggplot(data = stormTopDamages, aes(x = NSD.EVTYPE, y = totalDamageUSD)) + geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Event Type") +
ylab("Damages in USD") + ggtitle("NOAA Top 10: Most Expensive Event Types, 1950-2011")

4 Total Flood Damage
4.1 Total damage attributed to floods
stormTopDamages[stormTopDamages$NSD.EVTYPE=="FLOOD",c(1,4)]
##     NSD.EVTYPE totalDamageUSD
## 170      FLOOD   150319678257
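A twelve-digit total is hard to read at a glance. As a small worked example, the flood figure reported above can be expressed in billions of USD or printed with thousands separators; the variable names below are illustrative.

```r
# Flood total as reported in the output above.
floodDamageUSD <- 150319678257

# Express the total in billions of USD, rounded to one decimal place.
floodDamageBillions <- floodDamageUSD / 1e9
round(floodDamageBillions, 1)

# Alternatively, print the full value with thousands separators.
formatC(floodDamageUSD, format = "f", digits = 0, big.mark = ",")
```

That puts the flood total at roughly 150 billion USD over the period covered.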