1. Synopsis

This report is the final project of the Coursera Reproducible Research course. The purpose of the project is to explore the NOAA Storm Database and analyze the impact of weather events on population and property.

The first analysis shows which types of severe weather events are most harmful to population health, specifically in terms of injuries and fatalities.
The second analysis shows the economic consequences of severe weather events in terms of property and crop damage.

The results show that tornadoes had the greatest impact on population health. They also show that floods caused the most property damage and droughts the most crop damage.

2. Data Processing

The data can be downloaded from the course web site; the URL is the one assigned to fileUrl in the download step below.

The variables used in the analysis are EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG and CROPDMGEXP.

The database contains data from 1950 until November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete. Some documentation of the database is also available.
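To see how the number of recorded events changes over time, one could tabulate events by year. The sketch below uses a small made-up data frame; the column name BGN_DATE and its month/day/year format are assumptions about the raw file, since the date column is not shown in this report.

```r
# Toy stand-in for the raw data; BGN_DATE and its format are assumed here.
toy <- data.frame(
  BGN_DATE = c("4/18/1950 0:00:00", "6/8/1951 0:00:00",
               "11/15/2011 0:00:00", "3/2/2011 0:00:00", "7/4/2011 0:00:00"),
  stringsAsFactors = FALSE
)

# Parse the date part (trailing time text is ignored) and extract the year.
event_year <- as.integer(format(as.Date(toy$BGN_DATE, format = "%m/%d/%Y"), "%Y"))

# Number of recorded events per year.
events_per_year <- table(event_year)
events_per_year
```

Applied to the full data set, a table like this would make the sparser early years visible at a glance.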

Downloading, loading and subsetting the Data

Downloading the file

fileUrl<-"https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
directory<-("raw_data.csv")
download.file(fileUrl,directory)
library(plyr)
library(dplyr)
library(ggplot2)
library(grid)
library(gridExtra)
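Since the file is large, it may be worth skipping the download when the file is already on disk. The following is a minimal sketch of that idea; fetch_once and its downloader argument are hypothetical names introduced here, not part of the original analysis.

```r
# Hypothetical helper: download `url` to `dest` only if `dest` is missing.
# `downloader` defaults to download.file; it is a parameter so the logic can
# be exercised without network access.
fetch_once <- function(url, dest, downloader = download.file) {
  if (!file.exists(dest)) {
    downloader(url, dest, mode = "wb")  # binary mode keeps the compressed file intact
  }
  invisible(dest)
}

# Usage (not run here): fetch_once(fileUrl, directory)
```

With this pattern, re-running the report reuses the local copy instead of downloading the data again.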

Loading the data

data<-read.table("raw_data.csv",header=TRUE, sep=",")

Subsetting the main Data for Analysis

storm_data<-data[,c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]

2.1 Events most harmful to population health

Processing the data

To show the impact of weather on the population, the data were summarized by event type and sorted in decreasing order of fatalities and of injuries.

FATALITIES

fatalities<-storm_data%>%
    group_by(EVTYPE)%>%
    summarise(FATALITIES=sum(FATALITIES))
fatalities<-fatalities[order(fatalities$FATALITIES,decreasing=TRUE),]

INJURIES

injuries<-storm_data%>%
    group_by(EVTYPE)%>%
    summarise(INJURIES=sum(INJURIES))
injuries<-injuries[order(injuries$INJURIES,decreasing=TRUE),]
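The two summaries above can also be produced in a single pass. The sketch below uses base R's aggregate on a small made-up data frame (toy is a hypothetical stand-in for storm_data); it is an alternative illustration, not the method used in this report.

```r
# Toy stand-in for storm_data with the two health columns.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(3, 10, 2, 1, 4),
  INJURIES   = c(40, 0, 15, 2, 6)
)

# Sum both measures per event type in a single call.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Order by fatalities, largest first.
health_sorted <- health[order(health$FATALITIES, decreasing = TRUE), ]
health_sorted
```

Keeping both measures in one table makes it easy to compare an event type's rank for fatalities with its rank for injuries.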

2.2 Events with the greatest economic consequences

Setting the right units

The exponents are stored in a separate column that encodes their value with letters (h, H = hundred; k, K = thousand; m, M = million; B = billion). To convert them to numeric values, each letter is replaced by the corresponding power of 10, a digit x is read as 10^x, an empty value is treated as a multiplier of 1, and symbols that are neither a digit nor a letter are set to 0. The resulting multipliers are stored in new columns.
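The conversion rules described above can be written compactly with a lookup vector. This is a minimal sketch under those rules; exp_lookup and exp_to_multiplier are hypothetical names used only for illustration.

```r
# Multiplier for each letter code, following the rules described above.
exp_lookup <- c(
  "h" = 1e2, "H" = 1e2,
  "k" = 1e3, "K" = 1e3,
  "m" = 1e6, "M" = 1e6,
  "B" = 1e9
)

# Hypothetical helper: map a vector of exponent codes to numeric multipliers.
exp_to_multiplier <- function(code) {
  code <- as.character(code)
  out <- rep(0, length(code))                      # symbols such as "-", "?", "+" stay 0
  out[code == ""] <- 1                             # no exponent: value is used as-is
  is_digit <- grepl("^[0-9]$", code)
  out[is_digit] <- 10^as.numeric(code[is_digit])   # digit x is read as 10^x
  is_letter <- code %in% names(exp_lookup)
  out[is_letter] <- exp_lookup[code[is_letter]]    # letters use the lookup table
  out
}

exp_to_multiplier(c("", "K", "m", "B", "6", "?"))
```

Because digits are handled by one rule rather than one line per digit, no digit code can be left out by accident.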

levels of PROPDMGEXP

levels(storm_data$PROPDMGEXP)
##  [1] ""  "-" "?" "+" "0" "1" "2" "3" "4" "5" "6" "7" "8" "B" "h" "H" "K"
## [18] "m" "M"
# Map every PROPDMGEXP level listed above to a numeric multiplier.
# A level without an assignment would leave PROPEXP as NA for its rows.
storm_data$PROPEXP[storm_data$PROPDMGEXP == ""]<-1
storm_data$PROPEXP[storm_data$PROPDMGEXP == "-"]<-0
storm_data$PROPEXP[storm_data$PROPDMGEXP == "?"]<-0
storm_data$PROPEXP[storm_data$PROPDMGEXP == "+"]<-0
storm_data$PROPEXP[storm_data$PROPDMGEXP == "0"]<-1
storm_data$PROPEXP[storm_data$PROPDMGEXP == "1"]<-10^1
storm_data$PROPEXP[storm_data$PROPDMGEXP == "2"]<-10^2
storm_data$PROPEXP[storm_data$PROPDMGEXP == "3"]<-10^3
storm_data$PROPEXP[storm_data$PROPDMGEXP == "4"]<-10^4
storm_data$PROPEXP[storm_data$PROPDMGEXP == "5"]<-10^5
storm_data$PROPEXP[storm_data$PROPDMGEXP == "6"]<-10^6
storm_data$PROPEXP[storm_data$PROPDMGEXP == "7"]<-10^7
storm_data$PROPEXP[storm_data$PROPDMGEXP == "8"]<-10^8
storm_data$PROPEXP[storm_data$PROPDMGEXP == "B"]<-10^9
storm_data$PROPEXP[storm_data$PROPDMGEXP == "h" | storm_data$PROPDMGEXP == "H"]<-10^2
storm_data$PROPEXP[storm_data$PROPDMGEXP == "k" | storm_data$PROPDMGEXP == "K"]<-10^3
storm_data$PROPEXP[storm_data$PROPDMGEXP == "m" | storm_data$PROPDMGEXP == "M"]<-10^6

levels of CROPDMGEXP

levels(storm_data$CROPDMGEXP)
## [1] ""  "?" "0" "2" "B" "k" "K" "m" "M"
storm_data$CROPEXP[storm_data$CROPDMGEXP == ""]<-1
storm_data$CROPEXP[storm_data$CROPDMGEXP == "?"]<-0
storm_data$CROPEXP[storm_data$CROPDMGEXP == "0"]<-1
storm_data$CROPEXP[storm_data$CROPDMGEXP == "2"]<-10^2
storm_data$CROPEXP[storm_data$CROPDMGEXP == "B"]<-10^9
storm_data$CROPEXP[storm_data$CROPDMGEXP == "k" | storm_data$CROPDMGEXP == "K"]<-10^3
storm_data$CROPEXP[storm_data$CROPDMGEXP == "m" | storm_data$CROPDMGEXP == "M"]<-10^6

Applying the multiplier to obtain the damage cost

storm_data$PROPDMGCOST<-storm_data$PROPDMG*(as.numeric(storm_data$PROPEXP))
storm_data$CROPDMGCOST<-storm_data$CROPDMG*(as.numeric(storm_data$CROPEXP))
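A code with no matching assignment leaves its multiplier as NA, and NA propagates through the multiplication and through any later sum(). A quick check for unmapped codes can catch this. The sketch below uses made-up data in which the code "6" is deliberately left out of the mapping; toy and mapping are hypothetical names.

```r
# Toy data: one row uses the code "6", which the toy mapping omits on purpose.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 10),
  PROPDMGEXP = c("K", "M", "6"),
  stringsAsFactors = FALSE
)
mapping <- c("K" = 1e3, "M" = 1e6)

toy$PROPEXP     <- unname(mapping[toy$PROPDMGEXP])  # unmapped codes become NA
toy$PROPDMGCOST <- toy$PROPDMG * toy$PROPEXP

# Codes that received no multiplier, and how many rows each affects.
unmapped <- table(toy$PROPDMGEXP[is.na(toy$PROPEXP)])
unmapped

# A sum over values containing NA is itself NA unless na.rm = TRUE is used.
total_with_na    <- sum(toy$PROPDMGCOST)
total_without_na <- sum(toy$PROPDMGCOST, na.rm = TRUE)
```

If the table of unmapped codes is not empty, the per-event totals for the affected event types would come out as NA and sort to the end of an ordered summary.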

Processing the data

To show the impact of weather on the economy, the data were summarized by event type and sorted in decreasing order of property damage cost and of crop damage cost.

PROPERTY DAMAGE

propdmg<-storm_data%>%
    group_by(EVTYPE)%>%
    summarise(PROPDMGCOST=sum(PROPDMGCOST))
propdmg<-propdmg[order(propdmg$PROPDMGCOST,decreasing=TRUE),]

CROP DAMAGE

cropdmg<-storm_data%>%
    group_by(EVTYPE)%>%
    summarise(CROPDMGCOST=sum(CROPDMGCOST))
cropdmg<-cropdmg[order(cropdmg$CROPDMGCOST,decreasing=TRUE),]

3. Results

3.1 Effects of weather events on population health

Across the United States, 985 types of events have been recorded. This report shows the top 10 weather events that affected population health (injuries and deaths). Tornadoes caused the highest number of both fatalities and injuries.
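The number of distinct event types depends on how the labels are compared; the tables in this report, for example, list TSTM WIND and THUNDERSTORM WIND as separate types. The sketch below counts distinct labels before and after a light normalization. The values are made up, and whether the real file contains such case or whitespace variants is an assumption.

```r
# Toy EVTYPE values with the kind of near-duplicates that raw labels can contain.
evtype <- c("TORNADO", "Tornado", " TORNADO", "TSTM WIND", "THUNDERSTORM WIND", "HEAT")

# Distinct labels as recorded.
n_raw <- length(unique(evtype))

# Distinct labels after trimming whitespace and upper-casing.
n_clean <- length(unique(toupper(trimws(evtype))))

c(raw = n_raw, clean = n_clean)
```

Normalization of this kind only merges labels that differ in case or spacing; synonyms such as TSTM WIND and THUNDERSTORM WIND would still need an explicit recoding step.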

report_f<-head(fatalities,10)
report_f
## # A tibble: 10 x 2
##            EVTYPE FATALITIES
##            <fctr>      <dbl>
##  1        TORNADO       5633
##  2 EXCESSIVE HEAT       1903
##  3    FLASH FLOOD        978
##  4           HEAT        937
##  5      LIGHTNING        816
##  6      TSTM WIND        504
##  7          FLOOD        470
##  8    RIP CURRENT        368
##  9      HIGH WIND        248
## 10      AVALANCHE        224
report_i<-head(injuries,10)
report_i
## # A tibble: 10 x 2
##               EVTYPE INJURIES
##               <fctr>    <dbl>
##  1           TORNADO    91346
##  2         TSTM WIND     6957
##  3             FLOOD     6789
##  4    EXCESSIVE HEAT     6525
##  5         LIGHTNING     5230
##  6              HEAT     2100
##  7         ICE STORM     1975
##  8       FLASH FLOOD     1777
##  9 THUNDERSTORM WIND     1488
## 10              HAIL     1361
# Horizontal bar chart of the 10 event types with the most fatalities
plot_f <- ggplot(data = head(fatalities, 10),
                 aes(x = reorder(EVTYPE, FATALITIES), y = FATALITIES)) +
    geom_bar(fill = "purple", stat = "identity", width = 0.5) +
    coord_flip() +
    labs(title = "Top 10 Events Causing Impact on Population Health",
         x = "Event Type", y = "Total Number of Fatalities")

# Horizontal bar chart of the 10 event types with the most injuries
plot_i <- ggplot(data = head(injuries, 10),
                 aes(x = reorder(EVTYPE, INJURIES), y = INJURIES)) +
    geom_bar(fill = "turquoise", stat = "identity", width = 0.5) +
    coord_flip() +
    xlab("Event Type") + ylab("Total Number of Injuries")

grid.arrange(plot_f, plot_i, nrow =2)

3.2 Effects of weather events on the economy

Regarding the cost of property damage, floods produced the highest losses, followed by hurricanes/typhoons and storm surges. For crops, the highest damage costs were caused by drought, followed by floods and river floods.

report_prop<-head(propdmg,10)
report_prop
## # A tibble: 10 x 2
##               EVTYPE  PROPDMGCOST
##               <fctr>        <dbl>
##  1             FLOOD 144657709807
##  2 HURRICANE/TYPHOON  69305840000
##  3       STORM SURGE  43323536000
##  4       FLASH FLOOD  16822673979
##  5              HAIL  15735267513
##  6         HURRICANE  11868319010
##  7    TROPICAL STORM   7703890550
##  8      WINTER STORM   6688497251
##  9         HIGH WIND   5270046260
## 10       RIVER FLOOD   5118945500
report_crop<-head(cropdmg,10)
report_crop
## # A tibble: 10 x 2
##               EVTYPE CROPDMGCOST
##               <fctr>       <dbl>
##  1           DROUGHT 13972566000
##  2             FLOOD  5661968450
##  3       RIVER FLOOD  5029459000
##  4         ICE STORM  5022113500
##  5              HAIL  3025954473
##  6         HURRICANE  2741910000
##  7 HURRICANE/TYPHOON  2607872800
##  8       FLASH FLOOD  1421317100
##  9      EXTREME COLD  1292973000
## 10      FROST/FREEZE  1094086000
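# Horizontal bar chart of the 10 event types with the highest property damage cost
plot_prop <- ggplot(data = head(propdmg, 10),
                    aes(x = reorder(EVTYPE, PROPDMGCOST), y = PROPDMGCOST)) +
    geom_bar(fill = "purple", stat = "identity", width = 0.5) +
    coord_flip() +
    labs(title = "Top 10 Events by Damage Cost",
         x = "Event Type", y = "Property Damage Cost (USD)")

# Horizontal bar chart of the 10 event types with the highest crop damage cost
plot_crop <- ggplot(data = head(cropdmg, 10),
                    aes(x = reorder(EVTYPE, CROPDMGCOST), y = CROPDMGCOST)) +
    geom_bar(fill = "turquoise", stat = "identity", width = 0.5) +
    coord_flip() +
    xlab("Event Type") + ylab("Crop Damage Cost (USD)")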

grid.arrange(plot_prop, plot_crop, nrow =2)