Synopsis
This document examines the impact of various weather events on public health and property. Impact is measured as the mean over all occurrences of each event type. Since there is a large number of different event types, the plots show only those with the highest means. The graphs indicate that the combination of tornadoes, wind and hail causes the most fatalities on average, heat waves cause the most injuries, and coastal erosion causes the most property damage.
The filtering is applied mainly to reduce clutter in the plots.
Data download and processing
The first step downloads the compressed file and reads it into a data frame. read.csv reads the .bz2 file directly, so no separate unzip step is needed.
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2" , "FStormData.csv.bz2")
dfStormData <- read.csv("FStormData.csv.bz2")
library(ggplot2)
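Because the download is large, it can help to skip it when the file is already on disk. The following is a minimal sketch of that idea; the helper name `fetch_once` and its `downloader` argument are illustrative assumptions and are not part of the original analysis.

```r
# Download `url` to `dest` only when `dest` is not already on disk.
# `fetch_once` and `downloader` are hypothetical names used for illustration.
fetch_once <- function(url, dest, downloader = download.file) {
  if (!file.exists(dest)) {
    downloader(url, dest, mode = "wb")  # request a binary transfer for the archive
  }
  invisible(dest)
}

# Possible usage with the URL and file name from this report
# (left commented out to avoid triggering the download here):
# fetch_once("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#            "FStormData.csv.bz2")
# dfStormData <- read.csv("FStormData.csv.bz2")
```

Passing the download function as an argument is only a convenience that lets the helper be exercised without network access.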
Data processing Fatalities: Aggregation, sort and display
The data is aggregated to compute the mean number of fatalities for each event type. The aggregated data is sorted in descending order and only event types with a mean of at least 1 are kept.
Results
dfFatalSorted <- aggregate(dfStormData$FATALITIES , by = list(dfStormData$EVTYPE) , FUN = mean)
dfFatalSorted <- dfFatalSorted[order(-dfFatalSorted$x) , ]
dfFatalSorted <- dfFatalSorted[dfFatalSorted$x >= 1, ]
gFatal <- ggplot(dfFatalSorted , aes(Group.1 , x )) +
ylab("Mean fatalities per event") +
xlab("Event") +
ylim(0, 30) +
ggtitle("Fatalities per event type" )+
geom_bar(stat = "identity" , fill = "Orange") +
theme(axis.text.x = element_text(angle=60, hjust=1 , size = 5))
gFatal
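To make the aggregate, sort and filter steps concrete, here is a small sketch of the same three calls on a toy data frame. The event names and counts are invented for illustration and do not come from the storm data.

```r
# Toy data standing in for dfStormData (values invented for illustration)
toy <- data.frame(
  EVTYPE     = c("A", "A", "B", "B", "C"),
  FATALITIES = c(4, 2, 0, 1, 1)
)

# Mean fatalities per event type; aggregate() names the columns Group.1 and x
toyMean <- aggregate(toy$FATALITIES, by = list(toy$EVTYPE), FUN = mean)

# Sort by descending mean, then keep event types with a mean of at least 1
toyMean <- toyMean[order(-toyMean$x), ]
toyMean <- toyMean[toyMean$x >= 1, ]
print(toyMean)
```

Here event type B has a mean of 0.5 and is dropped, while A (mean 3) and C (mean 1) remain, in that order.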
Data processing Injuries: Aggregation, sort and display
The data is aggregated to compute the mean number of injuries for each event type. The aggregated data is sorted in descending order and only event types with a mean of at least 1 are kept. The y axis of the plot is limited to 75.
Results
dfInjuriesSorted <- aggregate(dfStormData$INJURIES , by = list(dfStormData$EVTYPE) , FUN = mean)
dfInjuriesSorted <- dfInjuriesSorted[order(-dfInjuriesSorted$x) , ]
dfInjuriesSorted <- dfInjuriesSorted[dfInjuriesSorted$x >= 1, ]
gInjuries <- ggplot(dfInjuriesSorted , aes(Group.1 , x )) +
ylab("Mean injuries per event") +
xlab("Event") +
ylim(0, 75) +
ggtitle("Injuries per event type" )+
geom_bar(stat = "identity" , fill = "Lightgreen") +
theme(axis.text.x = element_text(angle=60, hjust=1 , size = 5))
gInjuries
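The data frame is sorted before plotting, but ggplot2 lays out a discrete x axis in factor-level order (alphabetical by default), so the row order does not carry over to the bars. One possible way to have the bars follow the sorted means is to reorder the factor levels, sketched here on invented toy values. The ggplot2 call is guarded so the sketch also runs where the package is not installed.

```r
# Toy aggregate result (values invented for illustration)
toy <- data.frame(Group.1 = c("Flood", "Heat", "Wind"), x = c(2, 9, 5))

# reorder() returns a factor whose levels are sorted by ascending -x,
# which is the same as descending mean
toy$Group.1 <- reorder(toy$Group.1, -toy$x)
levels(toy$Group.1)

# With the levels reordered, the bars are drawn from highest to lowest mean
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  gToy <- ggplot(toy, aes(Group.1, x)) +
    geom_bar(stat = "identity")
}
```

The same effect could be had inline with `aes(reorder(Group.1, -x), x)` in any of the plots in this report.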
Data processing Property Damage / Economic consequence: Aggregation, sort and display
The data is aggregated to compute the mean property damage for each event type. The aggregated data is sorted in descending order and only event types with a mean of at least 100 are kept.
Results
dfPropDmgd <- aggregate(dfStormData$PROPDMG , by = list(dfStormData$EVTYPE) , FUN = mean)
dfPropDmgd <- dfPropDmgd[order(-dfPropDmgd$x) , ]
dfPropDmgd <- dfPropDmgd[dfPropDmgd$x >= 100, ]
gPropDmgd <- ggplot(dfPropDmgd , aes(Group.1 , x )) +
ylab("Property Damage") +
xlab("Event") +
ylim(0, 1000) +
ggtitle("Property Damage per event type" )+
geom_bar(stat = "identity" , fill = "Lightblue") +
theme(axis.text.x = element_text(angle=60, hjust=1 , size = 8))
gPropDmgd
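A caveat on units: in the NOAA storm data, PROPDMG is recorded alongside a PROPDMGEXP column holding a magnitude code, so a mean of PROPDMG alone can mix thousands with millions. The sketch below shows one way the two columns might be combined before aggregating, using invented toy rows. The helper name `toDollars` and the choice to treat any code other than K, M or B as a multiplier of 1 are assumptions, not part of the original analysis.

```r
# Toy rows standing in for dfStormData (values invented for illustration)
toy <- data.frame(
  EVTYPE     = c("A", "A", "B"),
  PROPDMG    = c(25, 2, 500),
  PROPDMGEXP = c("K", "M", "K")
)

# Hypothetical helper: scale the recorded figure by its magnitude code.
# Assumption: codes other than K, M, B (any case) get a multiplier of 1.
toDollars <- function(value, code) {
  mult <- c(K = 1e3, M = 1e6, B = 1e9)[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  value * unname(mult)
}

# Aggregate and sort the scaled damage in the same way as the raw PROPDMG
toy$damage <- toDollars(toy$PROPDMG, toy$PROPDMGEXP)
toyDmg <- aggregate(toy$damage, by = list(toy$EVTYPE), FUN = mean)
toyDmg <- toyDmg[order(-toyDmg$x), ]
```

In this toy case the raw PROPDMG means would rank B above A, while the scaled means rank A above B, which shows how the exponent column can change the ordering.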