Health and Economic Impacts of US Severe Weather

Synopsis

This report examines the impact of severe weather on population health and the economy in the United States. The health measures explored are the total numbers of injuries and fatalities resulting from severe weather events. The economic measures explored are the crop and property damage caused by these events.

Data Processing

Loading packages
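
The code in this report calls functions from three packages: the %>% pipe, filter, select, mutate, group_by, summarise, arrange and slice come from dplyr, year comes from lubridate, and ggplot comes from ggplot2. The loading step itself is not shown, so the lines below are a minimal sketch that assumes these are the packages the analysis used.

```r
# Assumed package set, inferred from the functions called later in the report.
library(dplyr)      # data manipulation verbs and the %>% pipe
library(lubridate)  # year() for extracting the year from a Date
library(ggplot2)    # ggplot() and geom_bar() for the final chart
```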

Downloading and loading data in R

# Checks if the data is available, and if not it downloads it. 

if (!file.exists("StormData.csv.bz2")){
    
    url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
    
    download.file(url, "StormData.csv.bz2")
}

stormdata <- read.csv("StormData.csv.bz2")
## Warning in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
## : EOF within quoted string

Transforming and cleaning the data

The code below does the following:
- Transforms the date column from character to Date
- Filters the data to the years 1990 to 2011
- Selects the columns that are relevant to our analysis
- Removes rows with no recorded values in the impact columns
- Prints the first rows of the resulting subset

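# Parse the character BGN_DATE column into Date objects.
data <- transform(stormdata, BGN_DATE = as.Date(BGN_DATE, format = "%m/%d/%Y %H:%M:%S"))

# Keep 1990-2011, select the analysis columns, and keep rows with any recorded impact.
# Note: PROPDMGEXP and CROPDMGEXP are character columns, so "> 0" on them is a
# string comparison (an empty code is never greater than "0"), not a numeric test.
data <- data %>% 
            filter(year(BGN_DATE) %in% 1990:2011)  %>% 
            select(BGN_DATE, EVTYPE, FATALITIES, INJURIES,
                   PROPDMG,PROPDMGEXP, CROPDMG, CROPDMGEXP) %>% 
            filter(FATALITIES > 0 | INJURIES > 0 | PROPDMG > 0 | 
                       PROPDMGEXP > 0 | CROPDMG > 0| CROPDMGEXP > 0)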
head(data)
##     BGN_DATE  EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 1990-01-25 TORNADO          0       28     2.5          M       0           
## 2 1990-02-03 TORNADO          0        0    25.0          K       0           
## 3 1990-02-03 TORNADO          0        0    25.0          K       0           
## 4 1990-02-03 TORNADO          0        3     2.5          M       0           
## 5 1990-02-03 TORNADO          0        2     2.5          M       0           
## 6 1990-02-03 TORNADO          0       15     2.5          M       0
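
To see what the date step does, the sketch below applies the same format string to a few hand-written date strings and then applies the 1990 to 2011 filter. The strings are illustrative values in the month/day/year layout that the format string implies, not rows copied from the data, and the year is extracted with base R's format() rather than lubridate's year().

```r
# Illustrative date strings (assumed layout: month/day/year hour:minute:second).
raw_dates <- c("4/18/1950 0:00:00", "1/25/1990 0:00:00", "11/28/2011 0:00:00")

# Same format string as in the report; the time part is parsed and then dropped.
parsed <- as.Date(raw_dates, format = "%m/%d/%Y %H:%M:%S")

# Extract the year and flag the dates that fall inside the analysis window.
years <- as.integer(format(parsed, "%Y"))
in_window <- years %in% 1990:2011

print(data.frame(raw_dates, parsed, years, in_window))
```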

Converting the exponent columns (PROPDMGEXP and CROPDMGEXP)

We need to convert the exponent values in CROPDMGEXP and PROPDMGEXP

These are possible values of CROPDMGEXP and PROPDMGEXP:

H,h = hundreds = 100

K,k = kilos = thousands = 1,000

M,m = millions = 1,000,000

B,b = billions = 1,000,000,000

(+) = 1 , (-) = 0, (?) = 0

blank/empty character = 0

numeric 0..8 = 10

The code below looks up the values in CROPDMGEXP and PROPDMGEXP and replaces them with the corresponding numeric multipliers.

data <- data %>% 
            mutate(CROPDMGEXP = ifelse( CROPDMGEXP %in% c("M","m"), 10^6,
                                    ifelse( CROPDMGEXP  == "K", 10^3, 
                                        ifelse( CROPDMGEXP  == "B", 10^9,
                                            ifelse(CROPDMGEXP  %in% c("","?","0"), 0,
                                                ifelse(CROPDMGEXP == "2", 10, "") 
                                                )))),
                   PROPDMGEXP = ifelse(PROPDMGEXP %in% c("M","m"), 10^6,
                                    ifelse( PROPDMGEXP == "K", 10^3, 
                                       ifelse( PROPDMGEXP  == "B", 10^9,
                                           ifelse(PROPDMGEXP %in% c("","?","0"), 0,
                                              ifelse(PROPDMGEXP %in% 
                                                              c("0","5","6",
                                                                "4","2","3",
                                                                "7","H","1","8"),
                                                        10, ""))))))
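
The nested ifelse calls above can also be written as a lookup table. The sketch below is an alternative, not the code used to produce the results in this report. It follows the list of codes as stated (H = 100, any digit 0..8 = 10, blank = 0), whereas the chain above maps H to 10 in PROPDMGEXP, maps "0" to 0, and leaves lower-case k and h unmapped, so totals computed this way may differ from the tables that follow. The names exp_lookup and exp_to_multiplier are hypothetical.

```r
# Hypothetical lookup table following the list of codes above.
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9, "+" = 1, "-" = 0, "?" = 0)

exp_to_multiplier <- function(code) {
  code <- toupper(as.character(code))      # treat "k" and "K" alike
  out <- unname(exp_lookup[code])          # NA for anything not in the table
  out[code %in% as.character(0:8)] <- 10   # digits 0..8 map to 10
  out[!nzchar(code)] <- 0                  # blank codes map to 0
  out
}

codes <- c("K", "m", "B", "", "5", "h", "?", "+")
multipliers <- exp_to_multiplier(codes)
print(data.frame(codes, multipliers))

# Worked example: a PROPDMG of 2.5 with code "M" is 2.5 * 1,000,000.
cost <- 2.5 * exp_to_multiplier("M")
```

Codes that appear neither in the table nor in the digit rule come back as NA, which makes unexpected values easy to spot before any totals are computed.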

Results

Question 1: Type of events which are most harmful to population health

The table below presents the event types with the highest combined totals of fatalities and injuries.

Inj_fat_data <- data %>% 
                    group_by(EVTYPE) %>% 
                    summarise(fatalities = sum(FATALITIES), 
                              injuries  = sum(INJURIES)) %>% 
                    mutate(total = (fatalities + injuries)) %>% 
                    arrange(desc(total)) %>% 
                    slice(1:10)

head(Inj_fat_data)
## # A tibble: 6 × 4
##   EVTYPE         fatalities injuries total
##   <chr>               <dbl>    <dbl> <dbl>
## 1 TORNADO              1134    20080 21214
## 2 EXCESSIVE HEAT       1828     6324  8152
## 3 FLOOD                 377     6658  7035
## 4 LIGHTNING             764     4885  5649
## 5 TSTM WIND             327     5022  5349
## 6 FLASH FLOOD           858     1561  2419

The table above shows that TORNADO has the highest impact when injuries and fatalities are combined. The charts below show how the contribution of the top 10 events differs by type of impact. Whilst TORNADO caused the most injuries, EXCESSIVE HEAT caused the most fatalities.

par(mfrow = c(1,3))

pie(Inj_fat_data$total, Inj_fat_data$EVTYPE, col = factor(Inj_fat_data$EVTYPE))
title(main = "Total Impact")


pie(Inj_fat_data$injuries, Inj_fat_data$EVTYPE, col = factor(Inj_fat_data$EVTYPE))
title(main = "Injuries Impact")

pie(Inj_fat_data$fatalities, Inj_fat_data$EVTYPE, col = factor(Inj_fat_data$EVTYPE))
title(main = "Fatalities Impact")
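
The leading event for each measure can be confirmed from the six rows printed in the table above. The sketch below re-enters those rows by hand in a small data frame (named top6 here for illustration) and uses base R only; it covers just the printed rows, not the full top 10.

```r
# First six rows of the fatalities and injuries table, copied from the printed output.
top6 <- data.frame(
  EVTYPE     = c("TORNADO", "EXCESSIVE HEAT", "FLOOD",
                 "LIGHTNING", "TSTM WIND", "FLASH FLOOD"),
  fatalities = c(1134, 1828, 377, 764, 327, 858),
  injuries   = c(20080, 6324, 6658, 4885, 5022, 1561),
  stringsAsFactors = FALSE
)

# Recompute the combined total and find the leader for each measure.
top6$total <- top6$fatalities + top6$injuries
most_fatal    <- top6$EVTYPE[which.max(top6$fatalities)]
most_injuries <- top6$EVTYPE[which.max(top6$injuries)]

# Share of each event's casualties that were fatal.
top6$fatal_share <- round(top6$fatalities / top6$total, 3)
print(top6)
```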

Question 2: Type of event with the greatest economic consequences

The code below first creates cost variables for crop damage and property damage. It then summarises the total crop and property cost by event type.

data <- transform(data, PROPDMGEXP = as.numeric(PROPDMGEXP),
                     CROPDMGEXP = as.numeric(CROPDMGEXP))
                                     
econ_impact <-  data %>% 
                    mutate(CROPCOST = (CROPDMG * CROPDMGEXP),
                           PROPCOST = (PROPDMG * PROPDMGEXP)) %>% 
                    group_by(EVTYPE) %>% 
                    summarise(CROPCOST = sum(CROPCOST),
                              PROPCOST = sum(PROPCOST ),
                              TOTALCOST = (CROPCOST + PROPCOST)) %>% 
                    arrange(desc(TOTALCOST)) %>% 
                    slice(1:10)

head(econ_impact)
## # A tibble: 6 × 4
##   EVTYPE               CROPCOST     PROPCOST    TOTALCOST
##   <chr>                   <dbl>        <dbl>        <dbl>
## 1 FLOOD              4405166450 134822237410 139227403860
## 2 HURRICANE/TYPHOON  2607872800  69305840000  71913712800
## 3 STORM SURGE              5000  43323536000  43323541000
## 4 FLASH FLOOD        1184061100  14058230296  15242291396
## 5 DROUGHT           13935485000   1045992000  14981477000
## 6 HURRICANE          2731410000  11857819010  14589229010
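
One detail of the conversion step is worth checking. Exponent codes that match none of the ifelse branches are set to "", and as.numeric("") is NA, so a single such row makes sum() return NA for its whole event type. The sketch below illustrates the mechanism on made-up values (not rows from the storm data) and shows the na.rm = TRUE option. Whether the totals should ignore such rows is a judgment call for the analyst.

```r
# Illustrative values only: multipliers as they look after the ifelse chain,
# stored as character, with "" standing for an unmapped exponent code.
mult_chr <- c("1000", "", "1e+06")
damage   <- c(25, 10, 2.5)

mult <- as.numeric(mult_chr)   # "" becomes NA
cost <- damage * mult

total_default <- sum(cost)                # NA because one element is NA
total_na_rm   <- sum(cost, na.rm = TRUE)  # sums only rows with a known multiplier
print(c(total_default = total_default, total_na_rm = total_na_rm))
```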

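As shown in the table above and the graph below, floods had the greatest total economic impact, at about 139 billion dollars in combined damage, most of it to property. Crop damage follows a different pattern: among the events shown in the table, DROUGHT caused the largest crop losses, at about 13.9 billion dollars.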

ggplot(econ_impact, aes(EVTYPE, TOTALCOST/10^9)) +
           geom_bar(stat = "identity", aes(fill = EVTYPE))
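
The split between crop and property damage can be checked from the six rows printed in the cost table above. The sketch below re-enters those rows by hand in a data frame (named econ6 here for illustration) and uses base R only; it covers just the printed rows, not the full top 10.

```r
# First six rows of the cost table, copied from the printed output.
econ6 <- data.frame(
  EVTYPE   = c("FLOOD", "HURRICANE/TYPHOON", "STORM SURGE",
               "FLASH FLOOD", "DROUGHT", "HURRICANE"),
  CROPCOST = c(4405166450, 2607872800, 5000,
               1184061100, 13935485000, 2731410000),
  PROPCOST = c(134822237410, 69305840000, 43323536000,
               14058230296, 1045992000, 11857819010),
  stringsAsFactors = FALSE
)

# Recompute the total and find the leading event for each cost type.
econ6$TOTALCOST <- econ6$CROPCOST + econ6$PROPCOST
top_crop  <- econ6$EVTYPE[which.max(econ6$CROPCOST)]
top_prop  <- econ6$EVTYPE[which.max(econ6$PROPCOST)]
top_total <- econ6$EVTYPE[which.max(econ6$TOTALCOST)]

# Fraction of each event's total cost that comes from crop damage.
econ6$crop_share <- round(econ6$CROPCOST / econ6$TOTALCOST, 3)
print(econ6)
```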