Effects on Public Health and the Economy of the Most Damaging Weather Events

Synopsis

Data on the most damaging weather events, provided by NOAA, let us summarise the damage these events cause to the population.

The goal is to give the authorities clear information on how to better allocate resources to mitigate that damage.

We assess damage along two dimensions: public health and the economy. Flood-related events are the worst in economic terms, for both property damage and crop damage. In terms of public health, the most harmful events are tornadoes.

Data Processing

Loading the data

Our code downloads data automatically.

# setwd('~/Desktop')
if (!file.exists("stormdata.csv.bz2")) {
    download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", 
        destfile = "stormdata.csv.bz2", method = "curl")  # it can take some minutes
}

The data are provided as a bz2-compressed file. We decompress it and then read the CSV file into an R object.

library(R.utils, warn.conflicts = FALSE)
bunzip2("stormdata.csv.bz2", remove = FALSE, overwrite = TRUE)  # decompressing
data <- read.csv("stormdata.csv", stringsAsFactors = FALSE)  # reading
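
As a possible shortcut, base R's `read.csv()` can usually read a bzip2-compressed file directly, because the underlying `file()` connection detects the compression when reading. The sketch below shows this on a small temporary file; the toy data frame and the temporary file name are made up for illustration and are not part of the NOAA data.

```r
# Toy data written to a temporary .bz2 file (hypothetical, not the NOAA data)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))
tmp <- tempfile(fileext = ".csv.bz2")

con <- bzfile(tmp, open = "w")  # write through a bzip2 connection
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects bzip2 compression
toy_read <- read.csv(tmp, stringsAsFactors = FALSE)
print(toy_read)
```

If this also works on the full file, the separate decompression step with `R.utils` could be skipped, at the cost of a somewhat slower read.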

We select the columns that we will use next. Then we convert the BGN_DATE column values into proper R date-time objects.

col_used <- which(names(data) %in% c("EVTYPE", "BGN_DATE", "FATALITIES", "INJURIES", 
    "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP"))
df <- data[, col_used]
df$BGN_DATE = strptime(as.character(df$BGN_DATE), format = "%m/%d/%Y %H:%M:%S")
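
`strptime()` returns a list-based `POSIXlt` object. A minimal sketch of an alternative, assuming the same month/day/year layout, is to parse with `as.POSIXct()`, which stores one number per row and tends to behave more predictably as a data frame column. The two sample timestamps and the `tz = "UTC"` setting are assumptions made for the demonstration.

```r
# Example timestamps in the month/day/year layout assumed by the format string
raw_dates <- c("4/18/1950 0:00:00", "11/15/2011 0:00:00")

# tz = "UTC" is an assumption for this demo; the report does not set a time zone
parsed <- as.POSIXct(raw_dates, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Extract the year, e.g. to filter or group records by period
years <- as.integer(format(parsed, "%Y"))
print(years)
```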

Dealing with the given data

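We want to calculate the actual damages in US dollars, so we need the multiplying factor that each exponent code in PROPDMGEXP and CROPDMGEXP stands for. We assume the following conversion: a digit code n means a factor of 10^n; “h” or “H” means hundreds; “k” or “K” thousands; “m” or “M” millions; and “B” billions. Empty or unrecognised codes get a factor of 0, so those records do not contribute to the totals.

To obtain damages in USD, the values in the PROPDMG and CROPDMG fields are multiplied by these factors, and the results are added to the data as two extra columns (named “p.damage” and “c.damage”).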

# Multiplying factors in p are matched by position to the sorted codes in p.level
p.level = levels(as.factor(df$PROPDMGEXP))
p = c(0, 0, 0, 0, 1, 10, 100, 1000, 10000, 1e+05, 1e+06, 1e+07, 1e+08, 1e+09, 
    100, 100, 1000, 1e+06, 1e+06)
p.deno = rep(0, nrow(df))

for (i in seq_along(p.level)) {
    p.deno[df$PROPDMGEXP == p.level[i]] = p[i]
}
p.damage = p.deno * df$PROPDMG  # property damage in USD
df$p.damage = p.damage
# =======================================
# Same procedure for the crop damage exponent codes
c.level = levels(as.factor(df$CROPDMGEXP))
c <- c(0, 0, 1, 100, 1e+09, 1000, 1000, 1e+06, 1e+06)
c.deno = rep(0, nrow(df))

for (i in seq_along(c.level)) {
    c.deno[df$CROPDMGEXP == c.level[i]] = c[i]
}
c.damage = c.deno * df$CROPDMG  # crop damage in USD
df$c.damage = c.damage
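
The code above pairs factors with codes by position, which depends on how the session sorts the factor levels. A minimal sketch of a more explicit alternative is a named lookup vector, shown here on toy values. The names `exp_factor` and `to_usd` are hypothetical helpers, and the table lists only a subset of codes for brevity.

```r
# Hypothetical lookup table: exponent code -> multiplying factor
exp_factor <- c("0" = 1, "2" = 100, "h" = 100, "H" = 100,
                "k" = 1e3, "K" = 1e3, "m" = 1e6, "M" = 1e6, "B" = 1e9)

to_usd <- function(value, code) {
    mult <- unname(exp_factor[code])  # NA for codes that are not in the table
    mult[is.na(mult)] <- 0            # unknown or empty code -> factor 0
    value * mult
}

# Toy rows (made-up values, not taken from the NOAA data)
toy_value <- c(25, 2.5, 10, 7)
toy_code  <- c("K", "M", "", "?")
usd <- to_usd(toy_value, toy_code)
print(usd)
```

With a lookup by name, each code's factor is visible next to it, and the result no longer depends on the order in which the levels happen to be sorted.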

Results

Impact of the events on population HEALTH

We analyze the impact of the different events on population health (FATALITIES and INJURIES). The numbers of fatalities and injuries are aggregated by event type to show which events are the most harmful.

aggd1 = aggregate(df$FATALITIES, by = list(df$EVTYPE), FUN = sum)
fatalities = aggd1[order(aggd1$x, decreasing = T), ]
names(fatalities) = c("Event.Type", "Fatalities")
head(fatalities, n = 8)
##         Event.Type Fatalities
## 826        TORNADO       5633
## 124 EXCESSIVE HEAT       1903
## 151    FLASH FLOOD        978
## 271           HEAT        937
## 453      LIGHTNING        816
## 846      TSTM WIND        504
## 167          FLOOD        470
## 572    RIP CURRENT        368

aggd2 = aggregate(df$INJURIES, by = list(df$EVTYPE), FUN = sum)
injuries = aggd2[order(aggd2$x, decreasing = T), ]
names(injuries) = c("Event.Type", "Injuries")
head(injuries, n = 8)
##         Event.Type Injuries
## 826        TORNADO    91346
## 846      TSTM WIND     6957
## 167          FLOOD     6789
## 124 EXCESSIVE HEAT     6525
## 453      LIGHTNING     5230
## 271           HEAT     2100
## 422      ICE STORM     1975
## 151    FLASH FLOOD     1777
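
Both health measures can also be aggregated in one call with the formula interface of `aggregate()`. The sketch below uses a small made-up data frame with the same column names, so the numbers are purely illustrative.

```r
# Toy records with the same column names as df (values are made up)
toy <- data.frame(
    EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
    FATALITIES = c(2, 1, 3, 4, 0),
    INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum both columns per event type in a single call
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities, most harmful first
health <- health[order(health$FATALITIES, decreasing = TRUE), ]
print(health)
```

This yields one table with a row per event type, which makes it easier to compare the fatality and injury rankings side by side.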

We plot the 20 most harmful event types for both fatalities and injuries:

# par(oma = c(3.5, 0, 0, 0))
barplot(height = fatalities$Fatalities[1:20], names.arg = fatalities$Event.Type[1:20], 
    las = 2, cex.axis = 0.8, cex.names = 0.7, col = rainbow(20, start = 0, end = 0.35), 
    ylab = "Number of Fatalities")
title("Top Events \n causing Fatalities", line = -2)

[Figure: bar plot of the 20 event types causing the most fatalities]

barplot(height = injuries$Injuries[1:20], names.arg = injuries$Event.Type[1:20], 
    las = 2, cex.axis = 0.7, cex.names = 0.7, col = rainbow(20, start = 0, end = 0.35), 
    ylab = "Number of Injuries")
title("Top Events \n causing Injuries", line = -2)

[Figure: bar plot of the 20 event types causing the most injuries]

Impact of the events on ECONOMY

We summarize the economic consequences of weather events across the USA.

Property damages and Crop damages in USD are aggregated for each event type to better observe which ones are more harmful compared to others.

agg.d1 = aggregate(df$p.damage, by = list(df$EVTYPE), FUN = sum)
property = agg.d1[order(agg.d1$x, decreasing = T), ]
names(property) = c("Event.Type", "Property.Damage")
row.names(property) = 1:dim(property)[1]
head(property, 10)
##           Event.Type Property.Damage
## 1              FLOOD       1.226e+11
## 2  HURRICANE/TYPHOON       6.550e+10
## 3        STORM SURGE       4.256e+10
## 4          HURRICANE       5.707e+09
## 5            TORNADO       5.677e+09
## 6     TROPICAL STORM       5.157e+09
## 7       WINTER STORM       5.015e+09
## 8        RIVER FLOOD       5.001e+09
## 9   STORM SURGE/TIDE       4.001e+09
## 10    HURRICANE OPAL       3.120e+09

agg.d2 = aggregate(df$c.damage, by = list(df$EVTYPE), FUN = sum)
crop = agg.d2[order(agg.d2$x, decreasing = T), ]
names(crop) = c("Event.Type", "Crop.Damage")
row.names(crop) = 1:dim(crop)[1]
head(crop, 10)
##           Event.Type Crop.Damage
## 1        RIVER FLOOD   5.003e+09
## 2          ICE STORM   5.002e+09
## 3            DROUGHT   1.534e+09
## 4  HURRICANE/TYPHOON   1.515e+09
## 5               HAIL   9.962e+08
## 6               HEAT   4.007e+08
## 7             FREEZE   2.009e+08
## 8        FLASH FLOOD   1.792e+08
## 9              FLOOD   1.680e+08
## 10         TSTM WIND   1.092e+08
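
Because the property and crop rankings differ, it may help to also look at a combined total per event type. The following is a minimal sketch on made-up values; the `total` column is a hypothetical addition and is not computed in this report.

```r
# Toy records with the damage columns built above (values are made up)
toy <- data.frame(
    EVTYPE   = c("FLOOD", "DROUGHT", "FLOOD", "HAIL"),
    p.damage = c(5e6, 0, 1e6, 2e5),
    c.damage = c(1e5, 3e6, 0, 4e5)
)

# Hypothetical combined measure: property plus crop damage in USD
toy$total <- toy$p.damage + toy$c.damage

econ <- aggregate(total ~ EVTYPE, data = toy, FUN = sum)
econ <- econ[order(econ$total, decreasing = TRUE), ]
print(econ)
```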

We plot the 15 most harmful event types in terms of property damage and crop damage.

# par(oma = c(3.5, 0, 0, 0))
barplot(height = property$Property.Damage[1:15], names.arg = property$Event.Type[1:15], 
    las = 2, cex.axis = 0.7, cex.names = 0.7, col = rainbow(20, start = 0, end = 0.35), 
    ylab = "Damage in US dollars")
title("Top Events \n causing Property damages", line = -2)

[Figure: bar plot of the 15 event types causing the most property damage]

barplot(height = crop$Crop.Damage[1:15], names.arg = crop$Event.Type[1:15], 
    las = 2, cex.axis = 0.7, cex.names = 0.7, col = rainbow(20, start = 0, end = 0.35), 
    ylab = "Damage in US dollars")
title("Top Events \n causing Crop damages", line = -2)

[Figure: bar plot of the 15 event types causing the most crop damage]

Conclusion

In terms of public health, the most harmful events are “tornadoes”, followed by others such as “excessive heat”, “flood” and “tstm wind” (thunderstorm wind). Regarding the economy, although flood-related events appear to be the most damaging in both cases, the ranking of the most harmful events differs depending on whether we focus on property damage or crop damage.

These graphs can serve as input for other methods, such as a Pareto diagram, to prioritise actions against the main damages to public health and the economy caused by weather events.
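
As a rough illustration of the Pareto idea, the sketch below computes cumulative shares from the eight fatality totals printed in the Results section. The shares are relative to those eight event types only, not to the full data set, and the 80% threshold is an assumption chosen for the example.

```r
# Fatality totals copied from the top-8 table in the Results section
fatal <- c("TORNADO" = 5633, "EXCESSIVE HEAT" = 1903, "FLASH FLOOD" = 978,
           "HEAT" = 937, "LIGHTNING" = 816, "TSTM WIND" = 504,
           "FLOOD" = 470, "RIP CURRENT" = 368)

fatal <- sort(fatal, decreasing = TRUE)

# Cumulative share of fatalities within these eight event types
cum_share <- cumsum(fatal) / sum(fatal)

# Number of event types needed to reach the assumed 80% threshold
n_needed <- unname(which(cum_share >= 0.8)[1])

print(round(cum_share, 3))
print(names(fatal)[seq_len(n_needed)])
```

The event types returned by the last line are the ones a Pareto-style prioritisation would address first under these assumptions.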