Title: An analysis of the NOAA Storm Database
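Synopsis: This report analyses the NOAA Storm Database to identify which types of storm events are most harmful to population health and which have the greatest economic consequences. Fatalities, injuries, and property damage are summed by event type (EVTYPE) and plotted. Tornadoes are by far the biggest threat to population health, with more than 5000 fatalities and 80000 injuries. Measured by property damage, the events with the biggest economic consequences are tornadoes, thunderstorm winds, and floods.
The data is loaded using the ProjectTemplate workflow. Placing the bz2 file inside the data folder and running the following code chunk automatically loads the data into the global environment as repdata.data.StormData.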
library("ProjectTemplate")
load.project()
## Loading project configuration
## Autoloading helper functions
## Running helper script: helpers.R
## Autoloading packages
## Loading package: reshape
## Loading package: plyr
##
## Attaching package: 'plyr'
##
## The following objects are masked from 'package:reshape':
##
## rename, round_any
##
## Loading package: ggplot2
## Loading package: stringr
## Loading package: lubridate
##
## Attaching package: 'lubridate'
##
## The following object is masked from 'package:plyr':
##
## here
##
## The following object is masked from 'package:reshape':
##
## stamp
##
## Autoloading data
## Loading data set: repdata.data.StormData
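For readers who do not use ProjectTemplate, base R can also read a bz2-compressed CSV directly. The sketch below is an illustration under assumptions, not the workflow used in this report: it writes a tiny made-up stand-in file to a temporary path (the real file name and location are not shown here) and reads it back with read.csv.

```r
# Write a tiny stand-in CSV, bz2-compressed, to a temporary file.
# The toy rows are made up purely for illustration.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO"),
  FATALITIES = c(2, 0, 1),
  INJURIES = c(10, 1, 4),
  stringsAsFactors = FALSE
)
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() decompresses bz2 files transparently when given the path.
storm <- read.csv(tmp, stringsAsFactors = FALSE)
str(storm)
```

Passing stringsAsFactors = FALSE keeps EVTYPE as character from the start, whatever the R version's default is.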
Ensure that FATALITIES, INJURIES, and PROPDMG are numeric and that EVTYPE is character.
repdata.data.StormData$FATALITIES <- as.numeric(repdata.data.StormData$FATALITIES)
repdata.data.StormData$INJURIES <- as.numeric(repdata.data.StormData$INJURIES)
repdata.data.StormData$EVTYPE <- as.character(repdata.data.StormData$EVTYPE)
repdata.data.StormData$PROPDMG <- as.numeric(repdata.data.StormData$PROPDMG)
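One thing worth checking before relying on these conversions: if a column happened to be read as a factor (an assumption; the column types are not shown in this report), as.numeric returns the internal level codes rather than the printed values. A minimal sketch with a made-up factor:

```r
# Toy factor whose labels are numbers; the values are made up.
x <- factor(c("10", "0", "25"))

# Direct conversion returns the internal level codes, not the labels.
codes <- as.numeric(x)

# Converting through character recovers the printed values.
values <- as.numeric(as.character(x))

print(codes)
print(values)
```

Calling class() on each column first shows whether the detour through as.character is needed.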
Sum fatalities, injuries, and property damage, grouped by event type (EVTYPE).
sum_fatalities_by_event_data <- aggregate(repdata.data.StormData$FATALITIES,
by = list(repdata.data.StormData$EVTYPE), FUN = sum, na.rm = TRUE)
sum_injuries_by_event_data <- aggregate(repdata.data.StormData$INJURIES, by = list(repdata.data.StormData$EVTYPE),
FUN = sum, na.rm = TRUE)
sum_property_damage_by_event_data <- aggregate(repdata.data.StormData$PROPDMG,
by = list(repdata.data.StormData$EVTYPE), FUN = sum, na.rm = TRUE)
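The three aggregations above follow the same pattern, so they could also be written as one call using the formula interface of aggregate. This is a sketch on a made-up data frame, not a replacement for the code above; the name toy and its values are illustrative only.

```r
# Toy stand-in for the storm data; values are made up.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  FATALITIES = c(2, 0, 1, 0),
  INJURIES = c(10, 1, 4, 0),
  PROPDMG = c(25, 50, 5, 1),
  stringsAsFactors = FALSE
)

# One call sums all three measures per event type.
totals <- aggregate(cbind(FATALITIES, INJURIES, PROPDMG) ~ EVTYPE,
                    data = toy, FUN = sum)
print(totals)
```

Note that the formula interface drops any row containing an NA in one of the listed columns by default, which differs from passing na.rm = TRUE to sum for each column separately.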
Rename the columns in all 3 data sets and convert Events to a factor.
cnames <- c("Events", "Fatalities")
colnames(sum_fatalities_by_event_data) <- cnames
cnames <- c("Events", "Injuries")
colnames(sum_injuries_by_event_data) <- cnames
cnames <- c("Events", "PropertyDamage")
colnames(sum_property_damage_by_event_data) <- cnames
sum_fatalities_by_event_data$Events <- as.factor(sum_fatalities_by_event_data$Events)
sum_injuries_by_event_data$Events <- as.factor(sum_injuries_by_event_data$Events)
sum_property_damage_by_event_data$Events <- as.factor(sum_property_damage_by_event_data$Events)
# Create the plot for population health
par(mfrow = c(1, 2))
# Create the plot for fatalities
fatalities_plot <- plot(sum_fatalities_by_event_data$Events, sum_fatalities_by_event_data$Fatalities,
xlab = "Events", ylab = "Fatalities", col = "red")
text(sum_fatalities_by_event_data$Events, sum_fatalities_by_event_data$Fatalities,
labels = sum_fatalities_by_event_data$Events, pos = 1, offset = 0.5)
# Create the plot for injuries
injuries_plot <- plot(sum_injuries_by_event_data$Events, sum_injuries_by_event_data$Injuries,
xlab = "Events", ylab = "Injuries", col = "orange")
text(sum_injuries_by_event_data$Events, sum_injuries_by_event_data$Injuries,
labels = sum_injuries_by_event_data$Events, pos = 1, offset = 0.5)
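With many distinct event types, a plot over every level of Events can be hard to read. One option is to rank the totals and plot only the largest few. The sketch below uses made-up totals; the cut-off n and the toy values are assumptions for illustration.

```r
# Toy totals per event type; values are made up.
totals <- data.frame(
  Events = c("FLOOD", "HAIL", "HEAT", "TORNADO"),
  Fatalities = c(5, 1, 9, 30),
  stringsAsFactors = FALSE
)

# Sort by the measure, largest first, and keep the top n rows.
n <- 3
top <- head(totals[order(totals$Fatalities, decreasing = TRUE), ], n)

# A bar chart with one labelled bar per event type.
barplot(top$Fatalities, names.arg = top$Events, las = 2,
        ylab = "Fatalities", col = "red")
```

The same pattern applies to the injuries and property damage data sets by swapping the column name.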
The biggest threat to population health is, by far, tornadoes. They account for more than 5000 fatalities and 80000 injuries, more than any other single type of storm event.
# Create the plot for property damage
property_damage_plot <- plot(sum_property_damage_by_event_data$Events, sum_property_damage_by_event_data$PropertyDamage,
xlab = "Events", ylab = "Property Damage", col = "red")
text(sum_property_damage_by_event_data$Events, sum_property_damage_by_event_data$PropertyDamage,
labels = sum_property_damage_by_event_data$Events, pos = 1, offset = 0.5)
The top 3 events with the biggest economic consequences are tornadoes, thunderstorm winds, and floods.
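The totals above sum PROPDMG as stored. In the NOAA storm data this column is accompanied by PROPDMGEXP, a magnitude code (for example K, M, and B for thousands, millions, and billions), so a dollar-scale ranking would scale each value before summing. A minimal sketch on made-up rows; the multiplier table, the treatment of any other code as 1, and the column name damage_usd are assumptions, not part of this analysis.

```r
# Toy stand-in for the storm data; values are made up.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG = c(25, 2, 5, 100),
  PROPDMGEXP = c("K", "M", "K", ""),
  stringsAsFactors = FALSE
)

# Assumed multipliers; any code not listed here is treated as 1.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
scale <- unname(multiplier[toupper(toy$PROPDMGEXP)])
scale[is.na(scale)] <- 1

# Scale each record, then sum per event type.
toy$damage_usd <- toy$PROPDMG * scale
damage <- aggregate(damage_usd ~ EVTYPE, data = toy, FUN = sum)
print(damage)
```

How to treat the less common codes in PROPDMGEXP is a judgment call that should be stated alongside any ranking built this way.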