Title:

“Analysis of severe weather events based on the United States National Oceanic and Atmospheric Administration’s (NOAA) storm database. The impact of fatalities and injuries on population health is examined, along with the economic consequences of weather events.”
AUTHOR: “E.D.”
DATE: “November 2, 2016”
OUTPUT: html_document

———————————————————–

Synopsis:

The goal of this analysis is to explore the impact of severe weather events on population health and the economy using the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The main questions to answer:
1. Across the United States, which types of events are most harmful with respect to population health? The analysis found that tornadoes have the greatest impact on population health.
2. Across the United States, which types of events have the greatest economic consequences? Floods, hurricanes/typhoons, and droughts cause the most economic damage.

———————————————————–

Data Processing:

echo = TRUE
#---------------------------------------------------------------------------------------------------------
## Code for reading in the dataset and/or processing the data

## Download the data file.
if(!file.exists("./mydata_DS5")){dir.create("./mydata_DS5")}
fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
dataFile <- "./mydata_DS5/StormData.csv.bz2"
if(!file.exists(dataFile)){download.file(fileUrl, destfile = dataFile, mode = "wb")}

# Read the bzip2-compressed CSV directly from the download location;
# bzfile() decompresses on the fly, so no separate unzip step is needed.
library(dplyr)
stormData <- read.csv(bzfile(dataFile), header = TRUE, stringsAsFactors = FALSE)

Take a preliminary look at the data and associated data structure.

Examine the names of the columns in the data file.

echo = TRUE
names(stormData)
 [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"     "COUNTYNAME" "STATE"      "EVTYPE"    
 [9] "BGN_RANGE"  "BGN_AZI"    "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN" "END_RANGE" 
[17] "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"      "F"          "MAG"        "FATALITIES" "INJURIES"  
[25] "PROPDMG"    "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC" "ZONENAMES"  "LATITUDE"  
[33] "LONGITUDE"  "LATITUDE_E" "LONGITUDE_" "REMARKS"    "REFNUM" 

Examine the table structure.

echo = TRUE
str(stormData)

'data.frame': 902297 obs. of 37 variables:
$ STATE__ : num 1 1 1 1 1 1 1 1 1 1 …
$ BGN_DATE : chr “4/18/1950 0:00:00” “4/18/1950 0:00:00” “2/20/1951 0:00:00” “6/8/1951 0:00:00” …
$ BGN_TIME : chr “0130” “0145” “1600” “0900” …
$ TIME_ZONE : chr “CST” “CST” “CST” “CST” …
$ COUNTY : num 97 3 57 89 43 77 9 123 125 57 …
$ COUNTYNAME: chr “MOBILE” “BALDWIN” “FAYETTE” “MADISON” …
$ STATE : chr “AL” “AL” “AL” “AL” …
$ EVTYPE : chr “TORNADO” “TORNADO” “TORNADO” “TORNADO” …
$ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 …
$ BGN_AZI : chr “” “” “” “” …
$ BGN_LOCATI: chr “” “” “” “” …
$ END_DATE : chr “” “” “” “” …
$ END_TIME : chr “” “” “” “” …
$ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 …
$ COUNTYENDN: logi NA NA NA NA NA NA …
$ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 …
$ END_AZI : chr “” “” “” “” …
$ END_LOCATI: chr “” “” “” “” …
$ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 …
$ WIDTH : num 100 150 123 100 150 177 33 33 100 100 …
$ F : int 3 2 2 2 2 2 2 1 3 3 …
$ MAG : num 0 0 0 0 0 0 0 0 0 0 …
$ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 …
$ INJURIES : num 15 0 2 2 2 6 1 0 14 0 …
$ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 …
$ PROPDMGEXP: chr “K” “K” “K” “K” …
$ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 …
$ CROPDMGEXP: chr “” “” “” “” …
$ WFO : chr “” “” “” “” …
$ STATEOFFIC: chr “” “” “” “” …
$ ZONENAMES : chr “” “” “” “” …
$ LATITUDE : num 3040 3042 3340 3458 3412 …
$ LONGITUDE : num 8812 8755 8742 8626 8642 …
$ LATITUDE_E: num 3051 0 0 0 0 …
$ LONGITUDE_: num 8806 0 0 0 0 …
$ REMARKS : chr “” “” “” “” …
$ REFNUM : num 1 2 3 4 5 6 7 8 9 10 …

Examine the first few rows of the table.

echo = TRUE
head(stormData)

STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN END_RANGE
1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL TORNADO 0 0 NA 0
2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL TORNADO 0 0 NA 0
3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL TORNADO 0 0 NA 0
4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL TORNADO 0 0 NA 0
5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL TORNADO 0 0 NA 0
6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL TORNADO 0 0 NA 0
END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE LATITUDE_E
1 14.0 100 3 0 0 15 25.0 K 0 3040 8812 3051
2 2.0 150 2 0 0 0 2.5 K 0 3042 8755 0
3 0.1 123 2 0 0 2 25.0 K 0 3340 8742 0
4 0.0 100 2 0 0 2 2.5 K 0 3458 8626 0
5 0.0 150 2 0 0 2 2.5 K 0 3412 8642 0
6 1.5 177 2 0 0 6 2.5 K 0 3450 8748 0
LONGITUDE_ REMARKS REFNUM
1 8806 1
2 0 2
3 0 3
4 0 4
5 0 5
6 0 6

echo = TRUE
length(table(stormData$EVTYPE))

[1] 985

There are 985 unique event types.

Examine the dimensions (R,C) of the table

echo = TRUE
dim(stormData)

[1] 902297 37

The data set has 902297 rows and 37 columns.

———————————————————–

Determine the economic impact of each event type.

Calculate the amount of property damage and crop damage from the adverse weather event types.
CROPDMG is the amount of crop damage
CROPDMGEXP is a multiplier factor for the crop damage
PROPDMG is the amount of property damage
PROPDMGEXP is a multiplier factor for the property damage

The data uses “B” for billions, “M” for millions, “K” for thousands.
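
As a minimal sketch of how such a code-to-multiplier conversion can work, the toy example below maps the three documented codes to numeric multipliers with a named lookup vector. The helper name `exp_to_multiplier` and the toy values are illustrative assumptions, not part of the original analysis. Any code other than B, M, or K (including a blank) is assumed here to contribute nothing.

```r
# Hypothetical helper: convert exponent codes ("K", "M", "B") to numeric multipliers.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  # Indexing by name returns NA for blanks and for any code not in the lookup.
  mult <- unname(lookup[as.character(code)])
  # Assumption: unknown or blank codes count as a multiplier of 0.
  mult[is.na(mult)] <- 0
  mult
}

# Toy data, not taken from the storm database.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 3, 7),
  PROPDMGEXP = c("K", "M", "B", "", "?"),
  stringsAsFactors = FALSE
)

# Damage in dollars = reported amount * multiplier.
toy$PROPDMG_USD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
print(toy)
```

Because the helper always returns a numeric vector, the multiplication cannot fail on unexpected codes. Whether those codes should count as 0 is a modelling choice.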

echo = TRUE
# Property damage assessment
# Convert the exponent code to a numeric multiplier. Codes other than B, M, and K
# (including blanks) are treated as 0 so that the column stays numeric.
PROPDMG <- stormData %>%
    mutate(PROPDMGEXP_Conv = ifelse(PROPDMGEXP == 'B', 10^9,
                             ifelse(PROPDMGEXP == 'M', 10^6,
                             ifelse(PROPDMGEXP == 'K', 10^3, 0))))
# Damage in dollars = reported amount * multiplier
PROPDMG$PROPDMGEXP_Mult <- PROPDMG$PROPDMGEXP_Conv * PROPDMG$PROPDMG
# Total damage per event type, then the ten largest totals
PROPDMG_Total <- aggregate(PROPDMG$PROPDMGEXP_Mult, by = list(PROPDMG$EVTYPE), FUN = sum)
names(PROPDMG_Total) <- c("EVTYPE","PROPDMG")
PROPDMG_TOP10 <- head(PROPDMG_Total[order(-PROPDMG_Total$PROPDMG),],10)
rownames(PROPDMG_TOP10) <- 1:nrow(PROPDMG_TOP10)

# Crop damage assessment
# Convert the exponent code to a numeric multiplier. Codes other than B, M, and K
# (including blanks) are treated as 0 so that the column stays numeric.
CROPDMG <- stormData %>%
    mutate(CROPDMGEXP_Conv = ifelse(CROPDMGEXP == 'B', 10^9,
                             ifelse(CROPDMGEXP == 'M', 10^6,
                             ifelse(CROPDMGEXP == 'K', 10^3, 0))))
# Damage in dollars = reported amount * multiplier
CROPDMG$CROPDMGEXP_Mult <- CROPDMG$CROPDMGEXP_Conv * CROPDMG$CROPDMG
# Total damage per event type, then the ten largest totals
CROPDMG_Total <- aggregate(CROPDMG$CROPDMGEXP_Mult, by = list(CROPDMG$EVTYPE), FUN = sum)
names(CROPDMG_Total) <- c("EVTYPE","CROPDMG")
CROPDMG_TOP10 <- head(CROPDMG_Total[order(-CROPDMG_Total$CROPDMG),],10)
rownames(CROPDMG_TOP10) <- 1:nrow(CROPDMG_TOP10)

———————————————————–

Determine the effects of Event Types on the injuries and fatalities of the population.

echo = TRUE
## 

## Determine the total number of Injuries and aggregate by the Event Type.
INJURIES_Total <- aggregate(stormData$INJURIES, by = list(stormData$EVTYPE), "sum")
names(INJURIES_Total) <- c("EVTYPE", "INJURIES")
## Determine the Top 10 Event Types for Injuries.
INJURIES_Top10 <- head(INJURIES_Total[order(-INJURIES_Total$INJURIES),],10)
rownames(INJURIES_Top10)  = 1:nrow(INJURIES_Top10)


## Determine the total number of Fatalities and aggregate by the Event Type.
FATALITIES_Total <- aggregate(stormData$FATALITIES, by = list(stormData$EVTYPE), "sum")
names(FATALITIES_Total) <- c("EVTYPE", "FATALITIES")
## Determine the Top 10 Event Types for Fatalities.
FATALITIES_Top10 <- head(FATALITIES_Total[order(-FATALITIES_Total$FATALITIES),],10)
rownames(FATALITIES_Top10)  = 1:nrow(FATALITIES_Top10)
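
The aggregate-then-rank pattern used above for injuries and fatalities can be sketched on toy data as follows. The helper name `top_events` and the toy values are illustrative assumptions, not part of the original analysis. The helper uses base R's `split()` and `vapply()` rather than `aggregate()`, which should give the same per-type totals.

```r
# Hypothetical helper: sum a numeric column per event type and keep the n largest totals.
top_events <- function(values, event_type, n = 10) {
  # One total per event type, returned as a plain named numeric vector.
  totals <- vapply(split(values, event_type), sum, numeric(1))
  totals <- head(sort(totals, decreasing = TRUE), n)
  data.frame(EVTYPE = names(totals), TOTAL = as.numeric(totals),
             stringsAsFactors = FALSE)
}

# Toy data, not taken from the storm database.
toy <- data.frame(
  EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD", "TORNADO"),
  INJURIES = c(15, 2, 6, 0, 1, 14),
  stringsAsFactors = FALSE
)

top2 <- top_events(toy$INJURIES, toy$EVTYPE, n = 2)
print(top2)
```

Wrapping the pattern in one function avoids repeating the same four lines for each measure, at the cost of a generic `TOTAL` column name.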

———————————————————–

Results

1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health? Tornadoes are the most harmful: they caused 91,346 injuries and 5,633 deaths.
echo = TRUE
INJURIES_Top10
               EVTYPE INJURIES
 1            TORNADO    91346
 2          TSTM WIND     6957
 3              FLOOD     6789
 4     EXCESSIVE HEAT     6525
 5          LIGHTNING     5230
 6               HEAT     2100
 7          ICE STORM     1975
 8        FLASH FLOOD     1777
 9  THUNDERSTORM WIND     1488
 10              HAIL     1361
echo = TRUE
FATALITIES_Top10
            EVTYPE FATALITIES
 1         TORNADO       5633
 2  EXCESSIVE HEAT       1903
 3     FLASH FLOOD        978
 4            HEAT        937
 5       LIGHTNING        816
 6       TSTM WIND        504
 7           FLOOD        470
 8     RIP CURRENT        368
 9       HIGH WIND        248
 10      AVALANCHE        224
echo = TRUE

png("RR_Plot1ab.png", width=520, height=520) 
par(mfrow=c(1,2),  mar=c(14,4,4,1), las=3, cex=0.7)

# Bar charts of the top 10 adverse weather events by number of injuries and fatalities.
barplot(INJURIES_Top10$INJURIES, names.arg = INJURIES_Top10$EVTYPE, col = 'yellow',
          main = 'Top 10 Adverse Weather Events for Injuries', ylab = 'Number of Injuries')
barplot(FATALITIES_Top10$FATALITIES, names.arg = FATALITIES_Top10$EVTYPE, col = 'red',
        main = 'Top 10 Adverse Weather Events for Fatalities', ylab = 'Number of Fatalities')
dev.off()

[Click here for bar charts of the top 10 adverse weather events by number of injuries and fatalities.](https://github.com/EJD2016/ReproducibleResearch/blob/master/RR_Plot1ab.png)


2. Across the United States, which types of events have the greatest economic consequences? Floods have the greatest economic consequences for property, causing over $144 billion in property damage. Droughts have the greatest economic consequences for crops, causing over $13 billion in crop damage.

echo = TRUE
PROPDMG_TOP10
               EVTYPE      PROPDMG
 1              FLOOD 144657709800
 2  HURRICANE/TYPHOON  69305840000
 3            TORNADO  56925662617
 4        STORM SURGE  43323536000
 5        FLASH FLOOD  16140813837
 6               HAIL  15727368787
 7          HURRICANE  11868319010
 8     TROPICAL STORM   7703890550
 9       WINTER STORM   6688497255
 10         HIGH WIND   5270046370

CROPDMG_TOP10
               EVTYPE     CROPDMG
 1            DROUGHT 13972566000
 2              FLOOD  5661968450
 3        RIVER FLOOD  5029459000
 4          ICE STORM  5022113500
 5               HAIL  3025540012
 6          HURRICANE  2741910000
 7  HURRICANE/TYPHOON  2607872800
 8        FLASH FLOOD  1421317100
 9       EXTREME COLD  1292973000
 10      FROST/FREEZE  1094086000
echo = TRUE
# Bar charts of the top 10 event types by property damage and by crop damage.
png("RR_Plot2ab.png", width=520, height=520)
par(mfrow=c(1,2),  mar=c(14,4,4,1), las=3, cex=0.8)
barplot(PROPDMG_TOP10$PROPDMG, names.arg = PROPDMG_TOP10$EVTYPE, col = 'green',
        main = 'Top 10 Adverse Weather Events - Property damage', ylab = 'Property damage (USD)')
barplot(CROPDMG_TOP10$CROPDMG, names.arg = CROPDMG_TOP10$EVTYPE, col = 'blue',
        main = 'Top 10 Adverse Weather Events - Crop damage', ylab = 'Crop damage (USD)')
dev.off()

[Click here for bar charts of the top 10 adverse weather events by property damage and crop damage.](https://github.com/EJD2016/ReproducibleResearch/blob/master/RR_Plot2ab.png)