Data Collection

Objective

Modern society is witnessing climate change firsthand. The growing number of recorded extreme weather events is a warning from Mother Nature to all of humanity. The objective of this dashboard is to explore the relationship between different weather events and the injuries and deaths they caused throughout 2014. The data come from the Storm Events Dataset, requested through the API of the National Centers for Environmental Information (NCEI).

Dataset

# A tibble: 6 x 51
  begin_yearmonth begin_day begin_time end_yearmonth end_day end_time episode_id
            <int>     <int>      <int>         <int>   <int>    <int>      <int>
1          201402        18       1000        201402      18     2000      83473
2          201402         5        300        201402       5     2300      83491
3          201401        18       1000        201401      19      700      82185
4          201411        26       1000        201411      27     1000      91728
5          201402        13        630        201402      14      800      83476
6          201404        24       1014        201404      24     1316      84793
# … with 44 more variables: event_id <int>, state <chr>, state_fips <int>,
#   year <int>, month_name <chr>, event_type <chr>, cz_type <chr>,
#   cz_fips <int>, cz_name <chr>, wfo <chr>, begin_date_time <chr>,
#   cz_timezone <chr>, end_date_time <chr>, injuries_direct <int>,
#   injuries_indirect <int>, deaths_direct <int>, deaths_indirect <int>,
#   damage_property <chr>, damage_crops <chr>, source <chr>, magnitude <dbl>,
#   magnitude_type <chr>, flood_cause <chr>, category <int>, tor_f_scale <chr>,
#   tor_length <dbl>, tor_width <int>, tor_other_wfo <chr>,
#   tor_other_cz_state <chr>, tor_other_cz_fips <int>, tor_other_cz_name <chr>,
#   begin_range <int>, begin_azimuth <chr>, begin_location <chr>,
#   end_range <int>, end_azimuth <chr>, end_location <chr>, begin_lat <dbl>,
#   begin_lon <dbl>, end_lat <dbl>, end_lon <dbl>, episode_narrative <chr>,
#   event_narrative <chr>, data_source <chr>

Weather Event Duration Analysis

Weather Event Duration and Injuries

Insight 1

Events of shorter duration account for more direct and indirect injuries. For events lasting less than half a day, direct injuries are more than four times as frequent as indirect injuries. This makes sense: shorter events generally tend to be more intense, so they cause more direct injuries than indirect ones.

Weather Event Duration and Deaths

Insight 2

First, deaths are far less common than injuries. Second, the shorter the event, the more deaths it accounts for: as with injuries, deaths peak for events lasting less than half a day and decline as the duration increases. Finally, direct deaths generally outnumber indirect deaths.

Month Analysis

Heavy Rain & High Wind

Insight 3

  1. High wind is more likely in the winter, while heavy rain is more likely in the summer.
  2. High wind events in the winter are nearly twice as frequent as in the spring and roughly 10 times as frequent as in the summer. However, the dataset covers only 2014, so more years of data are needed to verify this pattern.
  3. Heavy rain events are distributed more evenly across the seasons than high wind events.

Injuries by Month

Insight 4

  1. Both direct and indirect injuries are most likely in January, possibly because short-duration weather events are more common in the winter.
  2. Overall, direct injuries outnumber indirect injuries in every month.
  3. Indirect injuries stay relatively flat throughout the year, which is consistent with the definition of ‘indirect injuries’: they can happen at any time.

Deaths by Month

Insight 5

In contrast to injuries, direct deaths are more likely in the spring and summer, while indirect deaths are at their lowest during those seasons.

---
title: "512 Lab2"
author: "Zhuoxin Jiang"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    theme: readable
    source_code: embed
---

```{r setup, include=FALSE}
library(flexdashboard)
library(lubridate)
library(date)
library(dplyr)
library(ggplot2)
```

Data Collection {data-icon="fa-database" data-orientation=rows}
=============================

Row {data-height=250}
-----------------------------------------------------------------------

### Objective

Modern society is witnessing climate change firsthand. The growing number of recorded extreme weather events is a warning from Mother Nature to all of humanity. The objective of this dashboard is to explore the relationship between different weather events and the injuries and deaths they caused throughout 2014. The data come from the Storm Events Dataset, requested through the API of the National Centers for Environmental Information (NCEI).

Row {data-height=750}
-----------------------------------------------------------------------
### Dataset

```{r}
library("rnoaa")
df <- se_data(year = 2014, type = "details")
head(df)
```


Weather Event Duration Analysis {data-icon="fa-chart-line"}
=============================

Column {data-width=500}
-----------------------------------------------------------------------

### Weather Event Duration and Injuries
```{R}
# year month, day and time
df$begin_yearmonth <-  gsub('^([0-9]{4})([0-9]+)$', '\\1-\\2', df$begin_yearmonth)
df$begin_day <- substring(df$begin_date_time, 1, 2)
df$begin_time <- substring(df$begin_date_time, 11)

# similar for end side
df$end_yearmonth <-  gsub('^([0-9]{4})([0-9]+)$', '\\1-\\2', df$end_yearmonth)
df$end_day <- substring(df$end_date_time, 1, 2)
df$end_time <- substring(df$end_date_time, 11)

# paste together
df$cleaned_begin <- paste0(df$begin_yearmonth, '-', df$begin_day, ' ', df$begin_time )
df$cleaned_begin <- as.POSIXct(df$cleaned_begin)

df$cleaned_end <- paste0(df$end_yearmonth, '-', df$end_day, ' ', df$end_time )
df$cleaned_end <- as.POSIXct(df$cleaned_end)

df$duration <- round(difftime(df$cleaned_end, df$cleaned_begin, units = 'hours'), 0)

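# Bin the duration (in hours). Only the 12-hour cut-off is given; the others
# (24, 168, 336, 720 hours) are assumptions inferred from the bin labels.
hrs <- as.numeric(df$duration)
df$duration_range <- ifelse(hrs <= 12, "Half Day",
                     ifelse(hrs <= 24, "A Day",
                     ifelse(hrs <= 168, "A Week",
                     ifelse(hrs <= 336, "Two Weeks",
                     ifelse(hrs <= 720, "A Month", "More Than A Month")))))

# total injuries and deaths per duration bin
df %>% 
  select(duration_range, injuries_direct, injuries_indirect, deaths_direct, deaths_indirect) %>% 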
  group_by(duration_range) %>%
  summarise(total_direct_injuries=sum(injuries_direct),
            total_indirect_injuries=sum(injuries_indirect),
            total_direct_deaths=sum(deaths_direct),
            total_indirect_deaths=sum(deaths_indirect)) %>% ungroup() -> df1
  
df1$duration_range <- factor(df1$duration_range, levels=c('Half Day', 'A Day','A Week', 'Two Weeks', 'A Month', 'More Than A Month'))

ggplot(df1,aes(x=duration_range, group = 1)) +
  geom_bar(mapping = aes(x=duration_range,y=total_indirect_injuries),stat = "identity", fill = "grey") +
   geom_line(mapping = aes(x=duration_range,y=total_direct_injuries), size = 1, color = "blue") +
  xlab("Weather Event Duration") +
  ylab("Injury Case Number") +
  ggtitle("Weather Event Duration and Injuries ") +
  theme_bw()
```

### Insight 1

Events of shorter duration account for more direct and indirect injuries. For events lasting less than half a day, direct injuries are more than four times as frequent as indirect injuries. This makes sense: shorter events generally tend to be more intense, so they cause more direct injuries than indirect ones.
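
A comparison such as "four times as frequent" can be checked by dividing the per-bin totals. The following is a minimal sketch in base R, not the dashboard's own code: the data frame `toy`, its numbers, and the bin cut-offs (including the "More Than A Week" label) are hypothetical and chosen only for illustration.

```r
# Toy data: hypothetical durations (hours) and injury counts, not real 2014 values
toy <- data.frame(
  duration          = c(2, 5, 10, 20, 30, 100),
  injuries_direct   = c(8, 4, 0, 3, 1, 0),
  injuries_indirect = c(1, 1, 1, 1, 1, 0)
)

# Bin the durations; the cut-offs and labels are assumptions for this sketch
toy$duration_range <- cut(
  toy$duration,
  breaks = c(-Inf, 12, 24, 168, Inf),
  labels = c("Half Day", "A Day", "A Week", "More Than A Week")
)

# Total injuries per bin, then the direct-to-indirect ratio
totals <- aggregate(
  cbind(injuries_direct, injuries_indirect) ~ duration_range,
  data = toy, FUN = sum
)
totals$direct_to_indirect <- totals$injuries_direct / totals$injuries_indirect
print(totals)
```

A bin with zero indirect injuries would give a ratio of `Inf` or `NaN`, so sparse bins are worth inspecting before quoting a ratio.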

Column {data-width=500}
-----------------------------------------------------------------------

### Weather Event Duration and Deaths
```{R}
ggplot(df1,aes(x=duration_range, group = 1)) +
  geom_bar(mapping = aes(x=duration_range,y=total_indirect_deaths),stat = "identity", fill = "grey") +
   geom_line(mapping = aes(x=duration_range,y=total_direct_deaths), size = 1, color = "red") +
  xlab("Weather Event Duration") +
  ylab("Death Case Number") +
  ggtitle("Weather Event Duration and Deaths ") +
  theme_bw()
```

### Insight 2

First, deaths are far less common than injuries. Second, the shorter the event, the more deaths it accounts for: as with injuries, deaths peak for events lasting less than half a day and decline as the duration increases. Finally, direct deaths generally outnumber indirect deaths.


Month Analysis {data-icon="fa-exclamation-circle"}
=============================

Column {data-width=400}
-----------------------------------------------------------------------

### Heavy Rain & High Wind
```{R}
df$event_type <- as.factor(df$event_type)
df$month_name <- as.factor(df$month_name)

df %>% select(month_name,event_type) %>% 
  group_by(month_name, event_type) %>%
  summarise(n=n()) %>%
  arrange(match(month_name, c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))) %>% 
  filter(event_type %in% c('Heavy Rain','High Wind')) -> df_2


df_2$month_name <- factor(df_2$month_name, levels=c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))


ggplot(df_2, aes(x = month_name , y = n, group=event_type, color=event_type, label = n)) +
  geom_line() +
  geom_text() +
  xlab("Month") +
  ylab("Total cases") +
  theme_bw()   
```

### Insight 3

1. High wind is more likely in the winter, while heavy rain is more likely in the summer.
2. High wind events in the winter are nearly twice as frequent as in the spring and roughly 10 times as frequent as in the summer. However, the dataset covers only 2014, so more years of data are needed to verify this pattern.
3. Heavy rain events are distributed more evenly across the seasons than high wind events.
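
The seasonal comparison in point 2 comes down to summing monthly counts by season and dividing. Here is a minimal sketch in base R under two assumptions: the monthly counts are made up (they are not the 2014 figures), and seasons follow the meteorological convention (December to February is winter), which the dashboard itself does not define.

```r
# Hypothetical monthly counts of high-wind events (not real 2014 values)
monthly <- data.frame(
  month_name = month.name,  # built-in vector "January" ... "December"
  n = c(300, 250, 200, 150, 100, 40, 20, 30, 60, 120, 200, 350)
)

# Month-to-season lookup; meteorological seasons are an assumption here
season_of <- c(
  December = "Winter", January = "Winter", February = "Winter",
  March = "Spring", April = "Spring", May = "Spring",
  June = "Summer", July = "Summer", August = "Summer",
  September = "Fall", October = "Fall", November = "Fall"
)
monthly$season <- unname(season_of[monthly$month_name])

# Sum the counts per season and compare winter against spring and summer
seasonal <- tapply(monthly$n, monthly$season, sum)
winter_vs_spring <- seasonal[["Winter"]] / seasonal[["Spring"]]
winter_vs_summer <- seasonal[["Winter"]] / seasonal[["Summer"]]
```

With a single year of data, these ratios describe 2014 only; repeating the calculation over several years would show how stable they are.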

Column {data-width=300}
-----------------------------------------------------------------------

### Injuries by Month
```{r}
df %>% select(month_name,injuries_direct, injuries_indirect) %>% 
  group_by(month_name) %>%
 summarise(total_direct_injuries=sum(injuries_direct),
            total_indirect_injuries=sum(injuries_indirect)) %>% 
  arrange(match(month_name, c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))) %>% ungroup() -> df_3

df_3$month_name <- factor(df_3$month_name, levels=c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))

ggplot(df_3, aes(x=month_name, group=1)) +
  geom_line(aes(y=total_direct_injuries),color = "darkred") +
  geom_line(aes(y=total_indirect_injuries)) +
  xlab ("Month") +
  ylab("Injury Cases") +
  theme_classic()
```

### Insight 4
1. Both direct and indirect injuries are most likely in January, possibly because short-duration weather events are more common in the winter.
2. Overall, direct injuries outnumber indirect injuries in every month.
3. Indirect injuries stay relatively flat throughout the year, which is consistent with the definition of 'indirect injuries': they can happen at any time.
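
One way to put a number on "flat" is the coefficient of variation (standard deviation divided by the mean) of each monthly series. This is a sketch of that idea, not a measure the dashboard computes: the helper `cv` and both vectors of monthly totals are hypothetical.

```r
# Hypothetical monthly totals, January to December (not real 2014 values)
direct   <- c(90, 40, 30, 35, 50, 45, 40, 30, 25, 20, 30, 45)
indirect <- c(12, 10, 9, 11, 10, 9, 10, 11, 9, 10, 10, 9)

# Coefficient of variation: spread of the series relative to its mean
cv <- function(x) sd(x) / mean(x)

cv_direct   <- cv(direct)
cv_indirect <- cv(indirect)

# A smaller value means a flatter series
indirect_is_flatter <- cv_indirect < cv_direct

# Month with the most direct injuries in the toy series
peak_month <- month.name[which.max(direct)]
```

Because the measure is scaled by the mean, it allows a fair comparison between a series with large counts and one with small counts.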


Column {data-width=300}
-----------------------------------------------------------------------

### Deaths by Month
```{r}
df %>% select(month_name,deaths_direct, deaths_indirect) %>% 
  group_by(month_name) %>%
 summarise(total_direct_death=sum(deaths_direct),
            total_indirect_death=sum(deaths_indirect)) %>% 
  arrange(match(month_name, c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))) %>% ungroup() -> df_4

df_4$month_name <- factor(df_4$month_name, levels=c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))

ggplot(df_4, aes(x=month_name, group=1)) +
  geom_line(aes(y=total_direct_death),color = "blue") +
  geom_line(aes(y=total_indirect_death)) +
  xlab ("Month") +
  ylab("Death Cases") +
  theme_classic()
```

### Insight 5

In contrast to injuries, direct deaths are more likely in the spring and summer, while indirect deaths are at their lowest during those seasons.