Synopsis

In this analysis, I use the NOAA Storm Database to identify which type of event has the most devastating effect on human lives and which one causes the greatest damage in monetary terms. For this, I rely mostly on the data.table package for faster processing.

Data Processing

Data Preparation

First, load the needed packages and set the working directory.

library(data.table)
library(ggplot2)
library(R.utils) 
library(scales)
setwd("C:/Users/Rafael/Documents/GitHub/RepData_PeerAssessment2")

Environment Information

sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 14393)
## 
## locale:
## [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
## [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
## [5] LC_TIME=Portuguese_Brazil.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] backports_1.0.5 magrittr_1.5    rprojroot_1.2   tools_3.3.2    
##  [5] htmltools_0.3.5 yaml_2.1.14     Rcpp_0.12.9     stringi_1.1.2  
##  [9] rmarkdown_1.3   knitr_1.15.1    stringr_1.1.0   digest_0.6.11  
## [13] evaluate_0.10

Load the data by downloading the raw file, decompressing it and importing only the needed columns.

file <- "repdata-data-StormData.csv.bz2"
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", file)

pa2 <- fread(bunzip2(file), select = c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP"))
## Read 902297 rows and 7 (of 37) columns from 0.523 GB file in 00:00:03
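
Because the archive is large, it may be worth skipping the download when the file is already on disk. The following is only a minimal sketch of that idea: `download_once` is a hypothetical helper name that is not part of the original analysis, and the URL in the demonstration is a placeholder.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` is missing.
# Returns TRUE if a download was attempted, FALSE if the file was already there.
download_once <- function(url, dest) {
  if (file.exists(dest)) {
    return(FALSE)
  }
  # mode = "wb" keeps the compressed archive byte-for-byte intact on Windows
  download.file(url, dest, mode = "wb")
  TRUE
}

# Demonstration with a file that already exists, so no network access occurs
existing <- tempfile(fileext = ".csv.bz2")
writeLines("placeholder", existing)
skipped <- !download_once("https://example.invalid/StormData.csv.bz2", existing)
```

Note that bunzip2() from R.utils removes the archive after decompressing unless told otherwise, so in this workflow the existence check may be better pointed at the decompressed .csv file.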

Population Health

Begin the data transformation to evaluate the impact on population health.
Create a summary with total fatalities and injuries by event type, ordered by fatalities and then by injuries.

ph <- pa2[,.(FATALITIES = sum(FATALITIES), INJURIES = sum(INJURIES)), by=(EVTYPE)][order(-FATALITIES,-INJURIES)]
phMax <- tolower(ph[1,EVTYPE])
phMaxF <- format(ph[1,FATALITIES], big.mark = ",")
phMaxI <- format(ph[1,INJURIES], big.mark =  ",")
head(ph)
##            EVTYPE FATALITIES INJURIES
## 1:        TORNADO       5633    91346
## 2: EXCESSIVE HEAT       1903     6525
## 3:    FLASH FLOOD        978     1777
## 4:           HEAT        937     2100
## 5:      LIGHTNING        816     5230
## 6:      TSTM WIND        504     6957

Economic Consequences

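Begin the data transformation to evaluate economic consequences.
Create a summary of property and crop damage in USD, grouped by event type. The exponent codes K, M and B are converted to multipliers of 10^3, 10^6 and 10^9; any other exponent code is treated as zero.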

ec <- pa2[,.(propDmgValue = sum(PROPDMG * ifelse(PROPDMGEXP == "K"
                                                 , 10^3
                                                 ,ifelse(PROPDMGEXP == "M", 10^6
                                                         ,ifelse(PROPDMGEXP == "B"
                                                                 , 10^9,0)
                                                 )))
             ,cropDmgValue = sum(CROPDMG * ifelse(CROPDMGEXP == "K"
                                                  , 10^3
                                                  ,ifelse(CROPDMGEXP == "M", 10^6
                                                          ,ifelse(CROPDMGEXP == "B"
                                                                  , 10^9,0)
                                                  ))))
          ,by=(EVTYPE)]
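
The nested ifelse() chain can also be expressed as a lookup on a named vector, which may be easier to read and extend. This is a sketch only: `exp_multiplier` is a hypothetical helper name, the toy rows are made up for illustration, and the mapping reproduces the same assumption as the chain above (only upper-case K, M and B count, everything else becomes 0).

```r
# Hypothetical helper: map a damage exponent code to its numeric multiplier.
# Only "K", "M" and "B" are recognised, matching the ifelse() chain;
# every other code (blank, digits, "?", lower case, ...) maps to 0.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[as.character(code)])
  mult[is.na(mult)] <- 0
  mult
}

# Toy data, not taken from the storm database
toy <- data.frame(PROPDMG = c(25, 2.5, 1, 10),
                  PROPDMGEXP = c("K", "M", "B", ""),
                  stringsAsFactors = FALSE)
toy$propDmgValue <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
```

Inside the data.table call, the same helper could be used as `sum(PROPDMG * exp_multiplier(PROPDMGEXP))`, and again for the crop columns.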

Create a column with the sum of property and crop damage, and order the data set by it. Plot a column chart of the top 10 values.

ec <- ec[,totalValue := propDmgValue + cropDmgValue][order(-totalValue)]
ecMax <- tolower(ec[1,EVTYPE])
ecMaxM <- format(ec[1,totalValue], small.mark = ".", big.mark = ",")

# head() returns 6 rows by default, so request 10 explicitly to match the title
ggplot(head(ec, 10), aes(reorder(EVTYPE, -totalValue), totalValue)) +
    geom_col() +
    labs(y = "Total Value (USD)"
         , x = "Event Type"
         , title = "Top 10 Event Types - Damage in USD") +
    scale_y_continuous(labels = comma)

Results

1 - Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

Answer: The tornado is the most harmful event type with respect to population health, with 5,633 fatalities and 91,346 injuries.

2 - Across the United States, which types of events have the greatest economic consequences?

Answer: The flood is the event type with the greatest economic consequences, with $150,319,678,250 in property and crop damage.