Data Processing

First, I read the data from the original file. Because the analysis focuses only on the impact of severe weather on population health and the economy, I selected the relevant variables and created the “weather_2” data frame.

library(readr)
library(dplyr)
library(stringr)
library(lubridate)
weather=read_csv("C:/Users/Lenovo/Desktop/R/rdata/repdata_task2/repdata_data_StormData.csv.bz2")
weather_2=weather %>% 
    select(BGN_DATE,COUNTY,COUNTYNAME,STATE,EVTYPE,FATALITIES,INJURIES,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP) %>% 
    mutate(BGN_DATE=mdy(str_replace(BGN_DATE," 0:00:00","")),year=year(BGN_DATE))
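
The date step above can be tried on its own. The following is a minimal sketch, assuming BGN_DATE strings have the "m/d/yyyy 0:00:00" layout implied by the code; the two sample strings and the names raw_dates, parsed, and years are made up for illustration.

```r
library(lubridate)
library(stringr)

# Hypothetical sample strings in the same layout as BGN_DATE
raw_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# Strip the constant time suffix, then parse as month/day/year
parsed <- mdy(str_replace(raw_dates, " 0:00:00", ""))

# Extract the calendar year used later for grouping
years <- year(parsed)
```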

Results

The effects of severe weather on population health

In this part, I analyze the impact of severe weather on fatalities and injuries.

popuhealth=weather_2 %>% 
    group_by(EVTYPE) %>% 
    summarise(total_fatal=sum(FATALITIES),total_injury=sum(INJURIES))

top_fatal=popuhealth %>% 
    arrange(-total_fatal) %>% 
    select(EVTYPE,total_fatal)
head(top_fatal,5)
## # A tibble: 5 × 2
##   EVTYPE         total_fatal
##   <chr>                <dbl>
## 1 TORNADO               5633
## 2 EXCESSIVE HEAT        1903
## 3 FLASH FLOOD            978
## 4 HEAT                   937
## 5 LIGHTNING              816

The table above shows the five event types that caused the most fatalities from 1950 to 2011: tornado, excessive heat, flash flood, heat, and lightning.
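
The ranking step (sort in descending order, then keep the first rows) can also be written with slice_max() from dplyr. This is only a sketch on made-up totals, not the storm data; the names toy, top_a, and top_b are hypothetical.

```r
library(dplyr)

# Made-up totals for illustration only; not taken from the storm data
toy <- tibble(
  EVTYPE = c("A", "B", "C", "D"),
  total_fatal = c(10, 40, 5, 25)
)

# arrange() followed by head(), as used in this report
top_a <- toy %>% arrange(-total_fatal) %>% head(2)

# slice_max() orders by the column and keeps the n largest rows in one step
top_b <- toy %>% slice_max(total_fatal, n = 2)
```

Both approaches return the same two rows here. Note that slice_max() keeps tied rows by default, so it can return more than n rows when totals are equal.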

top_injury=popuhealth %>% 
    arrange(-total_injury) %>% 
    select(EVTYPE,total_injury)
head(top_injury,5)
## # A tibble: 5 × 2
##   EVTYPE         total_injury
##   <chr>                 <dbl>
## 1 TORNADO               91346
## 2 TSTM WIND              6957
## 3 FLOOD                  6789
## 4 EXCESSIVE HEAT         6525
## 5 LIGHTNING              5230

The table above shows the five leading causes of injuries: tornado, TSTM wind (thunderstorm wind), flood, excessive heat, and lightning. In summary, three event types caused large numbers of both fatalities and injuries: tornado, excessive heat, and lightning. I created two time series plots to show the fatalities and injuries caused by these three event types over time.

library(ggplot2)
library(patchwork)
p1=weather_2 %>% 
    filter(EVTYPE %in% c("TORNADO","EXCESSIVE HEAT","LIGHTNING")) %>% 
    group_by(year,EVTYPE) %>% 
    summarize(sum_fatal=sum(FATALITIES)) %>% 
    ggplot(aes(x=year,y=sum_fatal,color=EVTYPE))+
    geom_line()+
    labs(x="Year",y="Fatalities")
p2=weather_2 %>% 
    filter(EVTYPE %in% c("TORNADO","EXCESSIVE HEAT","LIGHTNING")) %>% 
    group_by(year,EVTYPE) %>% 
    summarize(sum_injur=sum(INJURIES)) %>% 
    ggplot(aes(x=year,y=sum_injur,color=EVTYPE))+
    geom_line()+
    labs(x="Year",y="Injuries")
p1+p2+
    plot_annotation(title="Fatalities and injuries caused by the most dangerous weather events, 1950-2011")+
    plot_layout(guides = "collect")&theme(legend.position = "right")

The economic consequences of severe weather

In the second part, I analyze the economic consequences of severe weather, defined as the sum of property damage and crop damage.
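
To make the damage calculation concrete, here is a minimal sketch on made-up rows. It assumes the exponent codes mean K = thousand, M = million, and B = billion, as in the analysis code; the columns multiplier and dollars are hypothetical names used only in this example.

```r
library(dplyr)

# Made-up rows for illustration; values are not from the storm data
toy <- tibble(
  PROPDMG = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "M", "B")
)

toy <- toy %>%
  mutate(
    # Translate each exponent code into a numeric multiplier
    multiplier = case_when(
      PROPDMGEXP == "K" ~ 1e3,
      PROPDMGEXP == "M" ~ 1e6,
      PROPDMGEXP == "B" ~ 1e9,
      TRUE ~ 0  # any other code contributes no damage
    ),
    # Damage in dollars is the reported figure times its multiplier
    dollars = PROPDMG * multiplier
  )
```

In this toy table, 25 with code K becomes 25,000 dollars and 2.5 with code M becomes 2,500,000 dollars.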

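# Convert the exponent codes to numeric multipliers (K = 1e3, M = 1e6, B = 1e9).
# Any other code, including a missing one, maps to 0, so those rows add no damage.
weather_2=weather_2 %>% 
    mutate(PROPDMGEXPn=case_when(
        PROPDMGEXP=="B"~1000000000,
        PROPDMGEXP=="K"~1000,
        PROPDMGEXP %in% c("M","m")~1000000,
        TRUE~0
    )) %>% 
    mutate(CROPDMGEXPn=case_when(
        CROPDMGEXP=="B"~1000000000,
        CROPDMGEXP %in% c("K","k")~1000,
        CROPDMGEXP %in% c("M","m")~1000000,
        TRUE~0
    )) %>% 
    # Total economic damage per record, in dollars
    mutate(ecosum=PROPDMG*PROPDMGEXPn+CROPDMG*CROPDMGEXPn)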
economic_sum=weather_2 %>% 
    group_by(EVTYPE) %>% 
    summarize(ecocons=sum(ecosum,na.rm = TRUE)) %>% 
    arrange(-ecocons)
head(economic_sum,5)
## # A tibble: 5 × 2
##   EVTYPE                 ecocons
##   <chr>                    <dbl>
## 1 FLOOD             150319678250
## 2 HURRICANE/TYPHOON  71913712800
## 3 TORNADO            57352113590
## 4 STORM SURGE        43323541000
## 5 HAIL               18758221170

As the table above shows, the five event types with the largest economic consequences are flood, hurricane/typhoon, tornado, storm surge, and hail. The figure below shows how the damage caused by these five event types changed over time.

weather_2 %>% 
    filter(EVTYPE %in% c("FLOOD","HURRICANE/TYPHOON","TORNADO","STORM SURGE","HAIL")) %>% 
    group_by(year,EVTYPE) %>% 
    summarise(ecocons=sum(ecosum,na.rm = TRUE)) %>%
    ggplot(aes(x=year,y=ecocons,color=EVTYPE))+
    geom_line()+
    scale_y_log10()+
    labs(x="Year",y="Damages (property damage+crop damage)",title="The top 5 events that caused the largest economic consequences, 1950-2011")
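
One point to keep in mind when reading a log-scale plot like this one: log10(0) is -Inf, so an event type with zero recorded damage in a given year has no finite position on the y axis. The sketch below uses made-up yearly totals, and the names yearly, logged, and positive_only are hypothetical.

```r
# Made-up yearly damage totals for one event type; not from the storm data
yearly <- c(0, 1500, 2e6)

# On a log10 scale the zero becomes -Inf and cannot be drawn
logged <- log10(yearly)

# One possible way to handle this is to keep only years with positive damage
positive_only <- yearly[yearly > 0]
```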