Synopsis

This data analysis report on the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database was produced for the Coursera Johns Hopkins Reproducible Research course assignment. The aim of this work is to identify the weather events that are most harmful from health and financial perspectives. Fatalities and injuries were summed to represent health harm. Property and crop losses were summed to represent economic damage. The worst event type from a human health perspective was the tornado. From an economic perspective, the worst event type was the flood.

Load Libraries

Load the tidyverse packages, such as dplyr, tibble and ggplot2.

# Load libraries
library(tidyverse)

Load Data

R's built-in read.csv function, which reads compressed comma-separated files automatically, was used to load the NOAA data into the R environment.

# Load data
df <- read.csv("repdata_data_StormData.csv.bz2")
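
As a minimal sketch of why no explicit decompression step is needed, the snippet below writes a small bzip2-compressed CSV to a temporary file and reads it back with read.csv. The toy data frame and the names tmp, toy and toy_back are illustrative assumptions, not part of the analysis.

```r
# Toy data standing in for the storm database (hypothetical values).
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  FATALITIES = c(3, 1),
                  stringsAsFactors = FALSE)

# Write the CSV through a bzip2 connection to a temporary file.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv opens the path with file(), which detects the compression itself.
toy_back <- read.csv(tmp, stringsAsFactors = FALSE)
unlink(tmp)
```

The same call works unchanged on an uncompressed CSV, so the file name is the only thing that differs between the two cases.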

Pre Process

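The pre-processing step applied to the data is the selection of the columns/variables relevant to the analysis and the exponentiation of the damages: each damage value is multiplied by the power of ten encoded in its exponent column (PROPDMGEXP or CROPDMGEXP).

Columns Selected: EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP.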

# Define exponentiation dictionary
dict_exp <- c("K"= 10^3,
              "M"= 10^6,
              "B"= 10^9,
              "m"= 10^6,
              "+"= 10^0,
              "0"= 10^0,
              "5"= 10^5,
              "6"= 10^6,
              "?"= 10^0,
              "4"= 10^4,
              "2"= 10^2,
              "3"= 10^3,
              "h"= 10^2,
              "7"= 10^7,
              "H"= 10^2,
              "-"= 10^0,
              "1"= 10^1,
              "8"= 10^8)

# Select columns and exponentiate damages
df2 <- df %>% 
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) %>% 
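  # as.character() guards against factor columns (the read.csv default before R 4.0)
  mutate(PROPDMG = ifelse(PROPDMGEXP != "", dict_exp[as.character(PROPDMGEXP)] * PROPDMG, PROPDMG),
         CROPDMG = ifelse(CROPDMGEXP != "", dict_exp[as.character(CROPDMGEXP)] * CROPDMG, CROPDMG))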
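
The lookup can be illustrated on a few made-up rows. This is only a sketch in base R: toy_exp is a hypothetical subset of the dict_exp vector above, and the toy data frame and the PROPDMG_USD column are assumptions introduced for the example.

```r
# Hypothetical subset of the exponent dictionary.
toy_exp <- c("K" = 10^3, "M" = 10^6, "B" = 10^9)

# Made-up damage records; the third row has no exponent code.
toy <- data.frame(PROPDMG = c(25, 2.5, 10, 7),
                  PROPDMGEXP = c("K", "M", "", "B"),
                  stringsAsFactors = FALSE)

# Index the named vector by code; codes not in the vector (including "") give NA.
mult <- unname(toy_exp[toy$PROPDMGEXP])

# Keep the raw value where no multiplier was found, otherwise scale it.
toy$PROPDMG_USD <- ifelse(is.na(mult), toy$PROPDMG, toy$PROPDMG * mult)
```

With these rows, 25 with code "K" becomes 25,000 dollars, while the row with an empty code keeps its raw value of 10.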

Results

Question 1

Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

# Group by and calculate pop health harm 
pop_health <- df2 %>% 
  group_by(EVTYPE) %>% 
  summarise(HEALTH = sum(FATALITIES) + sum(INJURIES)) %>% 
  ungroup() %>% 
  arrange(desc(HEALTH)) %>% 
  slice(1:5) %>% 
  mutate(EVTYPE = factor(EVTYPE, levels = EVTYPE))

# Draw plot of health harm events
pop_health %>% 
  ggplot(mapping = aes(y = EVTYPE, x = HEALTH))+
  geom_col(fill = "darkred")+
  labs(title = "Top 5 Harmful Events to the Population Health",
       subtitle = "Date Reference: 1950 to 2011",
       x = "Health Harm (People)",
       y = "Event Type",
       caption = "Source: NOAA Storm Database\nAggregation: Sum of Fatalities and Injuries by Event Type")

  • The most harmful weather event type for human health is the tornado, with a total of 96,979 fatalities and injuries.
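
The group, sum and rank logic behind this ranking can be sketched on a toy table. The example uses base R rather than dplyr so that it runs without extra packages; the data values and the names toy, totals and top2 are made up for illustration.

```r
# Made-up event records (hypothetical values).
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
                  FATALITIES = c(2, 1, 0, 4, 0),
                  INJURIES = c(10, 3, 5, 1, 2),
                  stringsAsFactors = FALSE)

# Health harm per record, then summed per event type.
toy$HEALTH <- toy$FATALITIES + toy$INJURIES
totals <- aggregate(HEALTH ~ EVTYPE, data = toy, FUN = sum)

# Rank event types from most to least harmful and keep the top two.
totals <- totals[order(-totals$HEALTH), ]
top2 <- head(totals, 2)
```

Summing per record first and then per group gives the same totals as sum(FATALITIES) + sum(INJURIES) within each group, which is the form used in the analysis.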

Question 2

Across the United States, which types of events have the greatest economic consequences?

# Group by and calculate economy damage 
economy <- df2 %>% 
  group_by(EVTYPE) %>% 
  summarise(ECONOMY = sum(PROPDMG) + sum(CROPDMG)) %>% 
  ungroup() %>% 
  arrange(desc(ECONOMY)) %>%
  slice(1:5) %>% 
  mutate(EVTYPE = factor(EVTYPE, levels = c(EVTYPE)))

# Draw plot of economic damage events
economy %>% 
  ggplot(mapping = aes(y = EVTYPE, x = ECONOMY))+
  geom_col(fill = "darkgreen")+
  labs(title = "Top 5 Damage Events to the Economy",
       subtitle = "Date Reference: 1950 to 2011",
       x = "Damage (Dollars)",
       y = "Event Type",
       caption = "Source: NOAA Storm Database\nAggregation: Sum of Property and Crop Damages by Event Type")

  • The weather event type with the greatest economic consequences is the flood, with a total of about \(1.5 \times 10^{11}\) dollars in property and crop damage.