Synopsis

An analysis of the health and economic impact of severe weather events in the United States, based on the NOAA Storm Database. Database documentation: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries and property damage. Preventing such outcomes to the extent possible is a key concern. The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries and property damage. This report presents the results of an exploratory analysis of the health and economic impact of severe weather events, based on the data from the NOAA database.

Loading all required R packages

suppressMessages(library(dplyr))
## Warning: package 'dplyr' was built under R version 3.2.3
library(knitr)
## Warning: package 'knitr' was built under R version 3.2.3
suppressMessages(library(R.utils))
## Warning: package 'R.utils' was built under R version 3.2.3
## Warning: package 'R.oo' was built under R version 3.2.3
## Warning: package 'R.methodsS3' was built under R version 3.2.3
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 3.2.3
suppressMessages(library(ggplot2))
## Warning: package 'ggplot2' was built under R version 3.2.3

Downloading the bz2 file and decompressing it

# Choose a download method suited to the operating system
dlMethod <- "curl"
if(substr(Sys.getenv("OS"),1,7) == "Windows") dlMethod <- "wininet"
Url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Download and decompress only when the compressed file is not already present
if(!file.exists("repdata-data-StormData.csv.bz2")){
    download.file(Url,destfile="repdata-data-StormData.csv.bz2",method=dlMethod,mode="wb")
    bunzip2(filename="repdata-data-StormData.csv.bz2",destname="repdata-data-StormData.CSV",remove=FALSE,overwrite=TRUE)
}

Fetching Data into R

StormData<-read.csv("repdata-data-StormData.CSV")
str(StormData)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","  N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels "","- 1 N Albion",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels "","- .5 NNW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels ""," CI","$AC",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436774 levels "","-2 at Deer Park\n",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...

Cleaning data

# Extract the four-digit year from the event start date
StormData$YEAR <- as.numeric(format(as.Date(StormData$BGN_DATE, format = "%m/%d/%Y %H:%M:%S"), "%Y"))
# Keep only the columns needed for the health and economic analysis
StormData1<- select(StormData,YEAR,EVTYPE,FATALITIES,INJURIES,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP)
# Keep only events that caused at least one fatality, injury or damage entry
StormData2<- filter(StormData1,FATALITIES>0|INJURIES>0|PROPDMG>0|CROPDMG>0)
# Save the filtered data to the working directory so it can be reloaded later
write.csv(format(StormData2),"filtered_dataset.csv",row.names=TRUE)
summary(StormData2)
##       YEAR                    EVTYPE        FATALITIES      
##  Min.   :1950   TSTM WIND        :63234   Min.   :  0.0000  
##  1st Qu.:1997   THUNDERSTORM WIND:43655   1st Qu.:  0.0000  
##  Median :2002   TORNADO          :39944   Median :  0.0000  
##  Mean   :2000   HAIL             :26130   Mean   :  0.0595  
##  3rd Qu.:2008   FLASH FLOOD      :20967   3rd Qu.:  0.0000  
##  Max.   :2011   LIGHTNING        :13293   Max.   :583.0000  
##                 (Other)          :47410                     
##     INJURIES            PROPDMG          PROPDMGEXP        CROPDMG       
##  Min.   :   0.0000   Min.   :   0.00   K      :231428   Min.   :  0.000  
##  1st Qu.:   0.0000   1st Qu.:   2.00          : 11585   1st Qu.:  0.000  
##  Median :   0.0000   Median :   5.00   M      : 11320   Median :  0.000  
##  Mean   :   0.5519   Mean   :  42.75   0      :   210   Mean   :  5.411  
##  3rd Qu.:   0.0000   3rd Qu.:  25.00   B      :    40   3rd Qu.:  0.000  
##  Max.   :1700.0000   Max.   :5000.00   5      :    18   Max.   :990.000  
##                                        (Other):    32                    
##    CROPDMGEXP    
##         :152664  
##  K      : 99932  
##  M      :  1985  
##  k      :    21  
##  0      :    17  
##  B      :     7  
##  (Other):     7
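
The YEAR column used above comes from parsing the BGN_DATE strings. A minimal sketch of that step on a few illustrative date strings (the sample values are assumptions chosen to match the month/day/year layout shown in the str() output, not rows taken from the database):

```r
# Illustrative BGN_DATE-style strings in month/day/year layout
bgn_date <- c("1/1/1966 0:00:00", "4/18/1950 0:00:00", "11/28/2011 0:00:00")

# Parse the date part, then keep the four-digit year as a number
parsed <- as.Date(bgn_date, format = "%m/%d/%Y %H:%M:%S")
year   <- as.numeric(format(parsed, "%Y"))

print(year)
```

Strings that do not match the format become NA, so a quick check with sum(is.na(year)) can confirm that every row was parsed.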

Loading the filtered data required for the analysis

StormData4<-read.csv("filtered_dataset.csv")

Data Processing

To address the question of how weather events affect public health, we’ll summarize the storm data by event type, separately for fatalities and injuries.

Health<-select(StormData4,EVTYPE,FATALITIES,INJURIES)
Health1<-filter(Health,FATALITIES>0|INJURIES>0)
Health_data<-summarise(group_by(Health1,EVTYPE),FATALITIES=sum(FATALITIES),INJURIES=sum(INJURIES))
InjuriesData_top10<-arrange(Health_data,INJURIES) %>% top_n(10)
## Selecting by INJURIES
FatalData_top10<-arrange(Health_data,FATALITIES)%>% top_n(10)
## Selecting by INJURIES
head(InjuriesData_top10)
## Source: local data frame [6 x 3]
## 
##              EVTYPE FATALITIES INJURIES
##              (fctr)      (int)    (int)
## 1              HAIL         15     1361
## 2 THUNDERSTORM WIND        133     1488
## 3       FLASH FLOOD        978     1777
## 4         ICE STORM         89     1975
## 5              HEAT        937     2100
## 6         LIGHTNING        816     5230
head(FatalData_top10)
## Source: local data frame [6 x 3]
## 
##              EVTYPE FATALITIES INJURIES
##              (fctr)      (int)    (int)
## 1              HAIL         15     1361
## 2         ICE STORM         89     1975
## 3 THUNDERSTORM WIND        133     1488
## 4             FLOOD        470     6789
## 5         TSTM WIND        504     6957
## 6         LIGHTNING        816     5230

Top 10 weather events impacting public health, by injuries

InjuriesData_top10
## Source: local data frame [10 x 3]
## 
##               EVTYPE FATALITIES INJURIES
##               (fctr)      (int)    (int)
## 1               HAIL         15     1361
## 2  THUNDERSTORM WIND        133     1488
## 3        FLASH FLOOD        978     1777
## 4          ICE STORM         89     1975
## 5               HEAT        937     2100
## 6          LIGHTNING        816     5230
## 7     EXCESSIVE HEAT       1903     6525
## 8              FLOOD        470     6789
## 9          TSTM WIND        504     6957
## 10           TORNADO       5633    91346

Fatalities for the ten event types with the most injuries, ordered by fatalities (top_n() reported "Selecting by INJURIES" for this table as well)

FatalData_top10
## Source: local data frame [10 x 3]
## 
##               EVTYPE FATALITIES INJURIES
##               (fctr)      (int)    (int)
## 1               HAIL         15     1361
## 2          ICE STORM         89     1975
## 3  THUNDERSTORM WIND        133     1488
## 4              FLOOD        470     6789
## 5          TSTM WIND        504     6957
## 6          LIGHTNING        816     5230
## 7               HEAT        937     2100
## 8        FLASH FLOOD        978     1777
## 9     EXCESSIVE HEAT       1903     6525
## 10           TORNADO       5633    91346
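
Both tables above contain the same ten event types because top_n() was called without a weighting column, and the "Selecting by INJURIES" message shows which column it picked. A minimal sketch of ranking by fatalities explicitly, using a small sample of totals copied from the tables above (the object names health_sample, by_default and by_fatal are hypothetical):

```r
suppressMessages(library(dplyr))

# Small sample built from totals reported in the tables above
health_sample <- data.frame(
  EVTYPE     = c("HAIL", "ICE STORM", "FLASH FLOOD", "FLOOD", "TORNADO"),
  FATALITIES = c(15, 89, 978, 470, 5633),
  INJURIES   = c(1361, 1975, 1777, 6789, 91346),
  stringsAsFactors = FALSE
)

# Without a weighting column, top_n() ranks by the last column (INJURIES)
by_default <- suppressMessages(top_n(health_sample, 3))

# Naming the column explicitly ranks by fatalities instead
by_fatal <- health_sample %>%
  top_n(3, FATALITIES) %>%
  arrange(desc(FATALITIES))

print(by_default$EVTYPE)
print(by_fatal$EVTYPE)
```

In this sample the two selections differ: ICE STORM is in the top three by injuries, while FLASH FLOOD replaces it when ranking by fatalities.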

Property damage from all types of weather events, converted to millions of dollars

Converting all property damage values according to the PROPDMGEXP column, “H” (hundred), “K” (thousand), “M” (million), “B” (billion), to millions

PropDmg<-select(StormData4, EVTYPE,PROPDMG,PROPDMGEXP)
PropDmg1<-filter(PropDmg,PROPDMG> 0)
PropDmg1$PROPDMGNUM =0
PropDmg1[PropDmg1$PROPDMGEXP == "H", ]$PROPDMGNUM = PropDmg1[PropDmg1$PROPDMGEXP == "H", ]$PROPDMG * 0.0001

PropDmg1[PropDmg1$PROPDMGEXP == "K", ]$PROPDMGNUM = PropDmg1[PropDmg1$PROPDMGEXP == "K", ]$PROPDMG * 0.001

PropDmg1[PropDmg1$PROPDMGEXP == "M", ]$PROPDMGNUM = PropDmg1[PropDmg1$PROPDMGEXP == "M", ]$PROPDMG * 1

PropDmg1[PropDmg1$PROPDMGEXP == "B", ]$PROPDMGNUM = PropDmg1[PropDmg1$PROPDMGEXP == "B", ]$PROPDMG * 1000

sum(PropDmg1$PROPDMGNUM)
## [1] 427279.7
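
The row-by-row assignments above can also be written as a single lookup. A minimal sketch, assuming a hypothetical helper to_millions() and a made-up sample frame; unlike the code above it also treats lowercase codes such as "k" like their uppercase form, which is an assumption and not what the report does:

```r
# Multipliers that convert a damage value to millions of dollars
exp_to_millions <- c(H = 1e-4, K = 1e-3, M = 1, B = 1e3)

# Hypothetical helper, not part of the original analysis
to_millions <- function(value, exponent) {
  mult <- exp_to_millions[toupper(as.character(exponent))]
  mult[is.na(mult)] <- 0   # unrecognised codes contribute nothing
  unname(value * mult)
}

# Made-up sample rows covering each code plus one unrecognised code
sample_dmg <- data.frame(
  PROPDMG    = c(25, 2.5, 3, 1.5, 10),
  PROPDMGEXP = c("K", "M", "B", "h", "?"),
  stringsAsFactors = FALSE
)
sample_dmg$PROPDMGNUM <- to_millions(sample_dmg$PROPDMG, sample_dmg$PROPDMGEXP)

print(sample_dmg)
```

Keeping the multipliers in one named vector makes the hundred, thousand, million and billion factors easy to check at a glance.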

To address the question of how weather events affect the economy, we’ll summarize the storm data by event type, separately for damage to property and damage to crops.

PropDmg_data<-summarise(group_by(PropDmg1,EVTYPE),PROPDMG=round(sum(PROPDMGNUM),digits=1))
PropDmgData_top10<-arrange(PropDmg_data,PROPDMG) %>% top_n(10)
## Selecting by PROPDMG

Top 10 events by property damage (millions of dollars)

PropDmgData_top10
## Source: local data frame [10 x 2]
## 
##               EVTYPE  PROPDMG
##               (fctr)    (dbl)
## 1          HIGH WIND   5270.0
## 2       WINTER STORM   6688.5
## 3     TROPICAL STORM   7703.9
## 4          HURRICANE  11868.3
## 5               HAIL  15727.4
## 6        FLASH FLOOD  16140.8
## 7        STORM SURGE  43323.5
## 8            TORNADO  56925.7
## 9  HURRICANE/TYPHOON  69305.8
## 10             FLOOD 144657.7

Crop damage from all types of weather events, converted to millions of dollars

Converting all crop damage values according to the CROPDMGEXP column, “H” (hundred), “K” (thousand), “M” (million), “B” (billion), to millions

CropDmg<-select(StormData4, EVTYPE,CROPDMG,CROPDMGEXP)
CropDmg1<-filter(CropDmg,CROPDMG > 0)
CropDmg1$CROPDMGNUM =0
CropDmg1[CropDmg1$CROPDMGEXP == "H", ]$CROPDMGNUM = CropDmg1[CropDmg1$CROPDMGEXP == "H", ]$CROPDMG * 0.0001

CropDmg1[CropDmg1$CROPDMGEXP == "K", ]$CROPDMGNUM = CropDmg1[CropDmg1$CROPDMGEXP == "K", ]$CROPDMG * 0.001

CropDmg1[CropDmg1$CROPDMGEXP == "M", ]$CROPDMGNUM = CropDmg1[CropDmg1$CROPDMGEXP == "M", ]$CROPDMG * 1

CropDmg1[CropDmg1$CROPDMGEXP == "B", ]$CROPDMGNUM = CropDmg1[CropDmg1$CROPDMGEXP == "B", ]$CROPDMG * 1000

sum(CropDmg1$CROPDMGNUM)
## [1] 49093.76
CropDmg_data<-summarise(group_by(CropDmg1,EVTYPE),CROPDMG=round(sum(CROPDMGNUM),digits=1))
CropDmgData_top10<-arrange(CropDmg_data,CROPDMG) %>% top_n(10)
## Selecting by CROPDMG

Top 10 events by crop damage (millions of dollars)

CropDmgData_top10
## Source: local data frame [10 x 2]
## 
##               EVTYPE CROPDMG
##               (fctr)   (dbl)
## 1       FROST/FREEZE  1094.1
## 2       EXTREME COLD  1293.0
## 3        FLASH FLOOD  1421.3
## 4  HURRICANE/TYPHOON  2607.9
## 5          HURRICANE  2741.9
## 6               HAIL  3025.5
## 7          ICE STORM  5022.1
## 8        RIVER FLOOD  5029.5
## 9              FLOOD  5662.0
## 10           DROUGHT 13972.6

RESULTS

PLOTS “HEALTH DAMAGE”

library(ggplot2)
Injggplot<-ggplot(InjuriesData_top10,aes(EVTYPE,INJURIES,fill=EVTYPE))+
          geom_bar(stat="identity")+ 
              theme(axis.text.x  = element_text(angle=45,hjust = 1))  +
              labs(x="Event Types",y= "Injuries ")+
              labs(title=" Injuries from Major Disasters")+
              scale_fill_discrete(name = "EVTYPE")+
              geom_text(aes(label=INJURIES), vjust = -0.3, size = 3.5)

           

Fatalggplot<-ggplot(FatalData_top10,aes(EVTYPE,FATALITIES,fill=EVTYPE))+
          geom_bar(stat="identity")+ 
              theme(axis.text.x  = element_text(angle=45,hjust = 1)) +
              labs(x="Event Types",y= "Fatalities ")+
              labs(title=" Fatalities from Major Disasters")+
              scale_fill_discrete(name = "EVTYPE")+
              geom_text(aes(label=FATALITIES), vjust = -0.3, size = 3.5)
             
grid.arrange(Injggplot,Fatalggplot,nrow=2)    

According to the data, tornadoes are by far the most harmful weather event for public health in the US, with 5,633 fatalities and 91,346 injuries. Excessive heat follows with 1,903 fatalities.
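
The bars in the plots above follow the factor order of EVTYPE, not the size of the values. A minimal sketch of ordering the x axis by value with base R's reorder(), using a few injury totals copied from the table above (the data frame name inj is hypothetical, and the plot is only built when ggplot2 is installed):

```r
# Sample of injury totals taken from the top-10 table
inj <- data.frame(
  EVTYPE   = c("HAIL", "LIGHTNING", "TORNADO", "FLOOD"),
  INJURIES = c(1361, 5230, 91346, 6789),
  stringsAsFactors = FALSE
)

# reorder() returns a factor whose levels are sorted by INJURIES (ascending)
inj$EVTYPE <- reorder(inj$EVTYPE, inj$INJURIES)
print(levels(inj$EVTYPE))

# The reordered factor then drives the bar order on the x axis
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(inj, ggplot2::aes(EVTYPE, INJURIES)) +
    ggplot2::geom_bar(stat = "identity") +
    ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 45, hjust = 1))
}
```

With the levels sorted this way, the tallest bar sits at the right edge and the ranking can be read directly from the plot.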

PLOTS “PROPERTY/CROP DAMAGE”

PropDmgPl<-ggplot(PropDmgData_top10,aes(EVTYPE,PROPDMG,fill=EVTYPE))+
          geom_bar(stat="identity")+ 
              theme(axis.text.x  = element_text(angle=45,hjust = 1))  +
              labs(x="Event Types",y= "Property Damage in Millions ")+
              labs(title=" Property Damage from Major Disasters")+
              scale_fill_discrete(name = "EVTYPE")+
              geom_text(aes(label=PROPDMG), vjust = -0.3, size = 3.5)

           

CropDmgPl<-ggplot(CropDmgData_top10,aes(EVTYPE,CROPDMG,fill=EVTYPE))+
          geom_bar(stat="identity")+ 
              theme(axis.text.x  = element_text(angle=45,hjust = 1)) +
              labs(x="Event Types",y= "Crop Damage in Millions  ")+
              labs(title=" Crop Damage from Major Disasters")+
              scale_fill_discrete(name = "EVTYPE")+
              geom_text(aes(label=CROPDMG), vjust = -0.3, size = 3.5)
             
grid.arrange(PropDmgPl,CropDmgPl,nrow=2)    

According to the data, floods cause the most property damage (about 144,658 million dollars), followed by hurricanes/typhoons (about 69,306 million) and tornadoes (about 56,926 million).

Drought (about 13,973 million dollars) and flood (about 5,662 million) cause the greatest damage to crops.

Overall, the weather events with the greatest economic consequences are flood, hurricane/typhoon, tornado and drought.
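
Property and crop damage can be added per event type to compare overall economic impact. A minimal sketch using only the five event types that appear in both top-10 tables above, with values copied from those tables (tornado and drought are left out because each appears in only one table; the names prop, crop and total are hypothetical):

```r
# Property damage in millions of dollars, from the property top-10 table
prop <- data.frame(
  EVTYPE  = c("FLOOD", "HURRICANE/TYPHOON", "HAIL", "FLASH FLOOD", "HURRICANE"),
  PROPDMG = c(144657.7, 69305.8, 15727.4, 16140.8, 11868.3),
  stringsAsFactors = FALSE
)

# Crop damage in millions of dollars, from the crop top-10 table
crop <- data.frame(
  EVTYPE  = c("FLOOD", "HURRICANE/TYPHOON", "HAIL", "FLASH FLOOD", "HURRICANE"),
  CROPDMG = c(5662.0, 2607.9, 3025.5, 1421.3, 2741.9),
  stringsAsFactors = FALSE
)

# Join on event type, add the two damage columns, sort largest first
total <- merge(prop, crop, by = "EVTYPE")
total$TOTALDMG <- total$PROPDMG + total$CROPDMG
total <- total[order(-total$TOTALDMG), ]

print(total)
```

A full comparison would join the complete per-event summaries instead of the top-10 extracts, so that every event type contributes both kinds of damage.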

-END-