Adverse Weather Impact on US Population and its Economy

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the US National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database can be used to track the different characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The main goals of the project are to (i) identify the events that are most harmful to population health and (ii) identify the events that have the greatest economic consequences. This report studies the effect of weather events on both personal harm and property damage. Separate bar plots are provided for the top 8 weather events causing the most fatalities and injuries. The results indicate that tornadoes caused the most fatalities and injuries.

Bar plots are also provided for the top 25 weather events that cause the highest property damage and crop damage.

Data Processing

Data

The data for this project is taken from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. It is a comma-separated-value file compressed with the bzip2 algorithm to reduce its size.
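
As a minimal sketch of how R handles a bzip2-compressed CSV, the example below writes a small made-up table to a temporary `.csv.bz2` file and reads it back. The toy values and the temporary path are assumptions for illustration only; they are not part of the NOAA data.

```r
# Toy stand-in for the storm file; the column names mirror the real data,
# but the values are made up for illustration.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  FATALITIES = c(1, 0),
                  stringsAsFactors = FALSE)

# Write a bzip2-compressed CSV to a temporary path.
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path as a file connection, which detects the
# compression and decompresses on the fly.
roundtrip <- read.csv(path, stringsAsFactors = FALSE)
```

No separate decompression step is needed, which is why the compressed file can be passed directly to the reader.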

Importing and Loading Data

Read the original files and display column names.

StormData <- read.table("repdata_data_StormData.csv.bz2", sep=",", header=TRUE)
colnames(StormData)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
head(StormData)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
## 4         0                                               0         NA
## 5         0                                               0         NA
## 6         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
## 4         0                       0.0   100 2   0          0        2     2.5
## 5         0                       0.0   150 2   0          0        2     2.5
## 6         0                       1.5   177 2   0          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
## 4          K       0                                         3458      8626
## 5          K       0                                         3412      8642
## 6          K       0                                         3450      8748
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3
## 4          0          0              4
## 5          0          0              5
## 6          0          0              6
str(StormData)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","  N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels "","- 1 N Albion",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels "","- .5 NNW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels ""," CI","$AC",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436781 levels "","-2 at Deer Park\n",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...

Extracting Required Data

The code below extracts the columns needed to study health and economic effects.

event <- c("EVTYPE", "FATALITIES", "INJURIES", 
           "PROPDMG", "PROPDMGEXP", "CROPDMG", 
           "CROPDMGEXP")
data <- StormData[event]

Property Damage Analysis

The levels of the property and crop damage exponents (PROPDMGEXP, CROPDMGEXP) were tabulated, and each valid level was mapped to a numeric multiplier. Invalid exponent codes are effectively excluded: their multiplier is 0, so the product of the damage figure and the multiplier contributes nothing to the totals. The code is:

table(data$PROPDMGEXP)
## 
##             -      ?      +      0      1      2      3      4      5      6 
## 465934      1      8      5    216     25     13      4      4     28      4 
##      7      8      B      h      H      K      m      M 
##      5      1     40      1      6 424665      7  11330
table(data$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994
data$PROPDMGEXP<-factor(data$PROPDMGEXP,levels=c("H","K","M","B","h","m","O"))
data$PROPDMGEXP[is.na(data$PROPDMGEXP)] <- "O"
data$CROPDMGEXP<-factor(data$CROPDMGEXP,levels=c("K","M","B","k","m","O"))
data$CROPDMGEXP[is.na(data$CROPDMGEXP)] <- "O"
health <- StormData[,(c(8,23:24))]
property<-StormData[,c(8,25:28)]
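
The recoding step above relies on how `factor()` treats values that are not listed in `levels`. The following sketch, using made-up exponent codes rather than the real column, shows that such values become `NA` and can then be assigned the placeholder level "O".

```r
# Made-up exponent codes, including some that are not valid magnitudes.
exp_raw <- c("K", "M", "", "?", "5", "B", "k")

# Values absent from `levels` become NA when the factor is built.
exp_fac <- factor(exp_raw, levels = c("K", "M", "B", "k", "m", "O"))
was_invalid <- is.na(exp_fac)

# The NA entries can then be assigned the placeholder level "O".
exp_fac[was_invalid] <- "O"
recoded <- as.character(exp_fac)
```

Assigning "O" only works because "O" was included in `levels`; assigning a value outside the level set would produce `NA` with a warning.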

Computing Property Damage Magnitude

The following key is used to identify the multiplier for each order of magnitude.

  1. o(one) = 1
  2. h(undred)=100
  3. k(thousand)=1000
  4. m(million)=1000000
  5. b(billion)=1000000000
StormData$PROPDMGEXP <- as.character(StormData$PROPDMGEXP)
StormData$CROPDMGEXP <- as.character(StormData$CROPDMGEXP)

StormData$PROPDMGMLT <- 0
StormData$CROPDMGMLT <- 0
# Property Damage
StormData$PROPDMGMLT[grepl("h", StormData$PROPDMGEXP,ignore.case = TRUE)]<-100
StormData$PROPDMGMLT[grepl("k", StormData$PROPDMGEXP,ignore.case = TRUE)]<-1000
StormData$PROPDMGMLT[grepl("m", StormData$PROPDMGEXP,ignore.case = TRUE)]<-1000000
StormData$PROPDMGMLT[grepl("b", StormData$PROPDMGEXP,ignore.case = TRUE)]<-1000000000
StormData$PROPDMGMLT[grepl("o", StormData$PROPDMGEXP,ignore.case = TRUE)]<-1
# Crop Damage
StormData$CROPDMGMLT[grepl("k", StormData$CROPDMGEXP,ignore.case = TRUE)]<-1000
StormData$CROPDMGMLT[grepl("m", StormData$CROPDMGEXP,ignore.case = TRUE)]<-1000000
StormData$CROPDMGMLT[grepl("b", StormData$CROPDMGEXP,ignore.case = TRUE)]<-1000000000
StormData$CROPDMGMLT[grepl("o", StormData$CROPDMGEXP,ignore.case = TRUE)]<-1

StormData$PROPDMG <- StormData$PROPDMG * StormData$PROPDMGMLT
StormData$CROPDMG <- StormData$CROPDMG * StormData$CROPDMGMLT

StormData$total <- StormData$PROPDMG + StormData$CROPDMG
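
The multiplier key above can also be expressed as a named lookup vector. This is a sketch of an alternative formulation, not the code used in this report; `exp_to_multiplier` is a hypothetical helper name and the damage values are made up. Like the code above, it gives a multiplier of 0 to any code that is not a recognised letter.

```r
# Multipliers keyed by upper-case exponent letter (same key as above).
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: returns the multiplier for each exponent code,
# and 0 for anything that is not H, K, M or B (blank, "?", digits, ...).
exp_to_multiplier <- function(exp_code) {
  out <- unname(multiplier[toupper(as.character(exp_code))])
  out[is.na(out)] <- 0
  out
}

# Made-up damage figures and exponent codes.
damage   <- c(25, 2.5, 10, 3, 7)
exponent <- c("K", "m", "B", "", "?")
dollars  <- damage * exp_to_multiplier(exponent)
```

A lookup keeps the key in one place, so adding or changing a multiplier means editing a single vector rather than several `grepl()` lines.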

Results

The results obtained from the cleaned data are:

Population Health

Harm to population health is measured by fatalities and injuries, so the FATALITIES and INJURIES columns are used for the health statistics. Economic consequences are measured by property and crop damage, so the PROPDMG and CROPDMG columns are used for the economic statistics. Both subsets (health and property) were created in the data processing step above.

For each event type, the total fatalities and injuries were then computed as follows.

health.totals <- aggregate(cbind(FATALITIES,INJURIES) ~ EVTYPE, data = health, sum, na.rm=TRUE)
health.totals$TOTAL <- health.totals$FATALITIES + health.totals$INJURIES
health.totals <- health.totals[order(-health.totals$TOTAL), ]
health.totals <- health.totals[1:25,]
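
To make the aggregation step concrete, here is the same pattern applied to a small made-up table. The toy values are assumptions for illustration and do not come from the storm database.

```r
# Made-up records: several rows per event type.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum both outcome columns within each event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
totals$TOTAL <- totals$FATALITIES + totals$INJURIES

# Sort by combined harm, largest first, and keep the top two rows.
top2 <- head(totals[order(-totals$TOTAL), ], 2)
```

Here TORNADO has a combined total of 20 and HEAT a total of 5, so those two event types make up `top2`.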

Graphical Representation

par(mfrow = c(1, 2), 
    mar = c(12, 4, 3, 2), 
    mgp = c(3, 1, 0), 
    cex = 0.8)
# Fatalities
barplot(health.totals[ ,2], las=3, names.arg = health.totals$EVTYPE, 
        main="Highest Health Fatalities", 
        ylab="Number of Fatalities", 
        col="green"
        )
# Injuries
barplot(health.totals[,3], las=3, names.arg = health.totals$EVTYPE, 
        main="Highest Health Injuries", 
        ylab="Number of Injuries", 
        col="green"
        )

The bar plots of fatalities and injuries show that tornadoes pose the highest risk to population health.

The top 8 events by fatalities, by injuries, and by both combined can be visualized as follows.

par(mfrow = c(1, 3), 
    mar = c(12, 4, 3, 2), 
    mgp = c(3, 1, 0), 
    cex = 0.8)
# Fatalities
highdamag <- health.totals[order(-health.totals$FATALITIES), ][1:8, ]
barplot(highdamag$FATALITIES, names.arg=highdamag$EVTYPE, las=3, main ="Highest Damage Event Causing Fatalities")

# Injuries
highdamag <- health.totals[order(-health.totals$INJURIES), ][1:8, ]
barplot(highdamag$INJURIES, names.arg=highdamag$EVTYPE, las=3, main ="Highest Damage Event Causing Injuries")
# Combined fatalities and injuries (health.totals is already sorted by TOTAL)
hightotal <- health.totals[1:8, ]
barplot(hightotal$TOTAL, names.arg=hightotal$EVTYPE, las=3, main ="Highest Damage Event Causing both Fatalities & Injuries")

The relationship between fatalities and injuries can be quantified with a correlation:

# Correlation from health totals
cor(health.totals[, -c(1,4)])
##            FATALITIES  INJURIES
## FATALITIES  1.0000000 0.9483005
## INJURIES    0.9483005 1.0000000
# Correlation for Highest Event
cor(highdamag[,-c(1,4)])
##            FATALITIES  INJURIES
## FATALITIES  1.0000000 0.9568194
## INJURIES    0.9568194 1.0000000

Economic Problem

We now study the impact of weather events on the US economy using the cleaned data.

# Aggregate the dollar-scaled damage columns directly from StormData
economic.total <- aggregate(cbind(PROPDMG, CROPDMG, total) ~ EVTYPE, 
                            data = StormData, sum, na.rm=TRUE)
# Crop
economic.crop <- economic.total[order(-economic.total$CROPDMG), ]
economic.crop <- economic.crop[1:25,]
# Property
economic.prop <- economic.total[order(-economic.total$PROPDMG), ]
economic.prop <- economic.prop[1:25,]

Visualization of the Economic data

par(mfrow = c(1, 3), 
    mar = c(12, 4, 3, 2), 
    mgp = c(3, 1, 0), 
    cex = 0.8)
barplot(economic.prop[,2], names.arg=economic.prop$EVTYPE, las=3,
     col = rep(c("red", "pink", "yellow", "dark blue", "black" ), each = 5),
     main = "Economic Impact of Weather on Property Damage")

# Crop damage is plotted from the crop-ordered table built above
barplot(economic.crop[,3], names.arg=economic.crop$EVTYPE, las=3,
     col = rep(c("red", "pink", "yellow", "dark blue", "black" ), each = 5),
     main = "Economic Impact of Weather on Crop Damage")

barplot(economic.prop[,4], names.arg=economic.prop$EVTYPE, las=3,
     col = rep(c("red", "pink", "yellow", "dark blue", "black" ), each = 5),
     main = "Economic Impact of Weather on total Damage")

Conclusion

Drought has the largest impact on crop damage. However, flooding produces the largest overall weather-related impact on the economy. Hurricanes/typhoons are the second-largest cause of property damage, and also rank second in overall economic impact.