Introduction to document

Title: An analysis of the NOAA Storm Database

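Synopsis: This analysis uses the NOAA Storm Database to identify which types of storm events are most harmful to population health and which have the greatest economic consequences. Fatalities, injuries, and property damage are summed by event type (EVTYPE) and plotted. Tornadoes are the biggest threat to population health, with more than 5000 fatalities and 80000 injuries. The top 3 events by property damage are tornadoes, thunderstorm winds, and floods.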

Data Processing

The data is loaded using the ProjectTemplate workflow. With the bz2 file placed inside the data folder, running the following code chunk automatically loads the data into the global environment as repdata.data.StormData.

library("ProjectTemplate")
load.project()
## Loading project configuration
## Autoloading helper functions
##  Running helper script: helpers.R
## Autoloading packages
##  Loading package: reshape
##  Loading package: plyr
## 
## Attaching package: 'plyr'
## 
## The following objects are masked from 'package:reshape':
## 
##     rename, round_any
## 
##  Loading package: ggplot2
##  Loading package: stringr
##  Loading package: lubridate
## 
## Attaching package: 'lubridate'
## 
## The following object is masked from 'package:plyr':
## 
##     here
## 
## The following object is masked from 'package:reshape':
## 
##     stamp
## 
## Autoloading data
##  Loading data set: repdata.data.StormData
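
For readers who are not using ProjectTemplate, base R can also read a bz2-compressed CSV directly. The sketch below is only an illustration of that alternative, not the workflow used in this report: it writes a tiny stand-in file (the rows are invented) and reads it back through bzfile(). With the real data, the path to the downloaded bz2 file would take the place of tmp.

```r
# Hypothetical stand-in for the storm file: a tiny bz2-compressed CSV in a temp location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES,INJURIES,PROPDMG",
             "TORNADO,3,15,25",
             "FLOOD,0,2,10"), con)
close(con)

# bzfile() opens the compressed file as a connection, so no manual unzip step is needed
storm_sample <- read.csv(bzfile(tmp), stringsAsFactors = FALSE)
str(storm_sample)
```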

Ensure that FATALITIES, INJURIES, and PROPDMG are numeric and that EVTYPE is character.

repdata.data.StormData$FATALITIES <- as.numeric(repdata.data.StormData$FATALITIES)
repdata.data.StormData$INJURIES <- as.numeric(repdata.data.StormData$INJURIES)
repdata.data.StormData$EVTYPE <- as.character(repdata.data.StormData$EVTYPE)
repdata.data.StormData$PROPDMG <- as.numeric(repdata.data.StormData$PROPDMG)
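
One caveat with conversions like these: if a column was read in as a factor, as.numeric() returns the internal level codes rather than the recorded values. Whether that applies here depends on how the loader typed the columns, which this report does not show. The toy factor below is purely illustrative.

```r
# Toy factor with made-up values, standing in for a numeric column read as a factor
f <- factor(c("10", "25", "10"))

codes  <- as.numeric(f)                # internal level codes, not the values
values <- as.numeric(as.character(f))  # the recorded values themselves

print(codes)
print(values)
```

Converting through as.character() first is the safe route whenever the column type is in doubt.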

Sum fatalities, injuries, and property damage, grouping the totals by event type (EVTYPE).

sum_fatalities_by_event_data <- aggregate(repdata.data.StormData$FATALITIES, 
    by = list(repdata.data.StormData$EVTYPE), FUN = sum, na.rm = TRUE)

sum_injuries_by_event_data <- aggregate(repdata.data.StormData$INJURIES, by = list(repdata.data.StormData$EVTYPE), 
    FUN = sum, na.rm = TRUE)

sum_property_damage_by_event_data <- aggregate(repdata.data.StormData$PROPDMG, 
    by = list(repdata.data.StormData$EVTYPE), FUN = sum, na.rm = TRUE)
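
To see what these aggregate() calls return, here is the same pattern applied to a toy data frame (the rows are invented for illustration). The result carries the default column names Group.1 and x, and na.rm = TRUE is passed through to sum() so missing counts are skipped.

```r
# Toy stand-in for the storm data; values are invented
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  FATALITIES = c(3, 1, NA, 0),
  stringsAsFactors = FALSE
)

# Same call shape as above: one row per event type, NA values ignored by sum()
toy_sum <- aggregate(toy$FATALITIES, by = list(toy$EVTYPE), FUN = sum, na.rm = TRUE)
print(toy_sum)
```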

Rename the columns in all 3 data sets, then convert Events to a factor.

cnames <- c("Events", "Fatalities")
colnames(sum_fatalities_by_event_data) <- cnames
cnames <- c("Events", "Injuries")
colnames(sum_injuries_by_event_data) <- cnames
cnames <- c("Events", "PropertyDamage")
colnames(sum_property_damage_by_event_data) <- cnames

sum_fatalities_by_event_data$Events <- as.factor(sum_fatalities_by_event_data$Events)
sum_injuries_by_event_data$Events <- as.factor(sum_injuries_by_event_data$Events)
sum_property_damage_by_event_data$Events <- as.factor(sum_property_damage_by_event_data$Events)

Results


# Create the plot for population health
par(mfrow = c(1, 2))
# Create the plot for fatalities
fatalities_plot <- plot(sum_fatalities_by_event_data$Events, sum_fatalities_by_event_data$Fatalities, 
    xlab = "Events", ylab = "Fatalities", col = "red")
text(sum_fatalities_by_event_data$Events, sum_fatalities_by_event_data$Fatalities, 
    labels = sum_fatalities_by_event_data$Events, pos = 1, offset = 0.5)

# Create the plot for injuries
injuries_plot <- plot(sum_injuries_by_event_data$Events, sum_injuries_by_event_data$Injuries, 
    xlab = "Events", ylab = "Injuries", col = "orange")
text(sum_injuries_by_event_data$Events, sum_injuries_by_event_data$Injuries, 
    labels = sum_injuries_by_event_data$Events, pos = 1, offset = 0.5)

Figure (chunk unnamed-chunk-5): fatalities (left panel) and injuries (right panel) summed by event type.

Tornadoes are by far the biggest threat to population health. With more than 5000 fatalities and 80000 injuries, they cause more harm than any other single event type.
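
A ranking like this can be read off the summed tables by sorting them. The sketch below shows the pattern on an invented stand-in for sum_fatalities_by_event_data; the counts are placeholders, not values from the storm database.

```r
# Invented stand-in for the summed fatalities table
toy_fatalities <- data.frame(
  Events     = factor(c("FLOOD", "HAIL", "HEAT", "TORNADO")),
  Fatalities = c(40, 2, 90, 500)
)

# Sort by total, largest first, and keep the top 3 event types
ranked <- toy_fatalities[order(toy_fatalities$Fatalities, decreasing = TRUE), ]
top3 <- head(ranked, 3)
print(top3)
```

The same two lines apply unchanged to the injuries and property damage tables once the column name is swapped.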

property_damage_plot <- plot(sum_property_damage_by_event_data$Events, sum_property_damage_by_event_data$PropertyDamage, 
    xlab = "Events", ylab = "Property Damage", col = "red")
text(sum_property_damage_by_event_data$Events, sum_property_damage_by_event_data$PropertyDamage, 
    labels = sum_property_damage_by_event_data$Events, pos = 1, offset = 0.5)

Figure (chunk unnamed-chunk-6): property damage summed by event type.

Measured by summed PROPDMG values, the top 3 events with the biggest economic consequences are tornadoes, thunderstorm winds, and floods.
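
When there are many distinct event types, a plot that labels every one of them becomes hard to read. One possible alternative, offered as a suggestion rather than the method used in this report, is a bar plot restricted to the top few events. The data frame and its values below are invented, and top_n is a hypothetical parameter.

```r
# Invented stand-in for the summed property damage table
toy_damage <- data.frame(
  Events         = c("FLOOD", "HAIL", "TORNADO", "TSTM WIND", "LIGHTNING"),
  PropertyDamage = c(900, 300, 3200, 1300, 100),
  stringsAsFactors = FALSE
)

# Keep only the top_n events by total damage
top_n <- 3
top_damage <- head(toy_damage[order(toy_damage$PropertyDamage, decreasing = TRUE), ], top_n)

# One labelled bar per event; barplot() returns the bar midpoints
bar_mid <- barplot(top_damage$PropertyDamage, names.arg = top_damage$Events,
                   xlab = "Events", ylab = "Property Damage", col = "red")
```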