# A tibble: 6 x 51
begin_yearmonth begin_day begin_time end_yearmonth end_day end_time episode_id
<int> <int> <int> <int> <int> <int> <int>
1 201402 18 1000 201402 18 2000 83473
2 201402 5 300 201402 5 2300 83491
3 201401 18 1000 201401 19 700 82185
4 201411 26 1000 201411 27 1000 91728
5 201402 13 630 201402 14 800 83476
6 201404 24 1014 201404 24 1316 84793
# … with 44 more variables: event_id <int>, state <chr>, state_fips <int>,
# year <int>, month_name <chr>, event_type <chr>, cz_type <chr>,
# cz_fips <int>, cz_name <chr>, wfo <chr>, begin_date_time <chr>,
# cz_timezone <chr>, end_date_time <chr>, injuries_direct <int>,
# injuries_indirect <int>, deaths_direct <int>, deaths_indirect <int>,
# damage_property <chr>, damage_crops <chr>, source <chr>, magnitude <dbl>,
# magnitude_type <chr>, flood_cause <chr>, category <int>, tor_f_scale <chr>,
# tor_length <dbl>, tor_width <int>, tor_other_wfo <chr>,
# tor_other_cz_state <chr>, tor_other_cz_fips <int>, tor_other_cz_name <chr>,
# begin_range <int>, begin_azimuth <chr>, begin_location <chr>,
# end_range <int>, end_azimuth <chr>, end_location <chr>, begin_lat <dbl>,
# begin_lon <dbl>, end_lat <dbl>, end_lon <dbl>, episode_narrative <chr>,
# event_narrative <chr>, data_source <chr>
---
title: "512 Lab2"
author: "Zhuoxin Jiang"
output:
flexdashboard::flex_dashboard:
orientation: columns
vertical_layout: fill
theme: readable
source_code: embed
---
```{r setup, include=FALSE}
library(flexdashboard)
library(lubridate)
library(date)
library(dplyr)
library(ggplot2)
```
Data Collection {data-icon="fa-database" data-orientation=rows}
=============================
Row {data-height=250}
-----------------------------------------------------------------------
### Objective
Modern society is witnessing climate change firsthand. The growing number of recorded extreme weather events is a warning from Mother Nature to all of humanity. This dashboard explores the relationship between different weather events and the injuries and deaths associated with them throughout 2014. The data come from the Storm Events Dataset of the National Centers for Environmental Information, retrieved with the `rnoaa` package.
Row {data-height=750}
-----------------------------------------------------------------------
### Dataset
```{r}
library("rnoaa")
df <- se_data(year = 2014, type = "details")
head(df)
```
Weather Event Duration Analysis {data-icon="fa-chart-line"}
=============================
Column {data-width=500}
-----------------------------------------------------------------------
### Weather Event Duration and Injuries
```{R}
# year month, day and time
df$begin_yearmonth <- gsub('^([0-9]{4})([0-9]+)$', '\\1-\\2', df$begin_yearmonth)
df$begin_day <- substring(df$begin_date_time, 1, 2)
df$begin_time <- substring(df$begin_date_time, 11)
# similar for end side
df$end_yearmonth <- gsub('^([0-9]{4})([0-9]+)$', '\\1-\\2', df$end_yearmonth)
df$end_day <- substring(df$end_date_time, 1, 2)
df$end_time <- substring(df$end_date_time, 11)
# paste together; parse in UTC so daylight-saving shifts in the system time zone cannot distort durations
df$cleaned_begin <- paste0(df$begin_yearmonth, '-', df$begin_day, ' ', df$begin_time)
df$cleaned_begin <- as.POSIXct(df$cleaned_begin, tz = "UTC")
df$cleaned_end <- paste0(df$end_yearmonth, '-', df$end_day, ' ', df$end_time)
df$cleaned_end <- as.POSIXct(df$cleaned_end, tz = "UTC")
# event duration in whole hours
df$duration <- round(difftime(df$cleaned_end, df$cleaned_begin, units = 'hours'), 0)
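# Bucket each event by duration in hours. Only the 12-hour cutoff is legible in the
# source; the other cutoffs (24, 168, 336 and 720 hours) are assumptions inferred
# from the range labels used below.
duration_hours <- as.numeric(df$duration)
df$duration_range <- ifelse(duration_hours <= 12, "Half Day",
                     ifelse(duration_hours <= 24, "A Day",
                     ifelse(duration_hours <= 168, "A Week",
                     ifelse(duration_hours <= 336, "Two Weeks",
                     ifelse(duration_hours <= 720, "A Month", "More Than A Month")))))
df %>%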
select(duration_range, injuries_direct, injuries_indirect, deaths_direct, deaths_indirect) %>%
group_by(duration_range) %>%
summarise(total_direct_injuries=sum(injuries_direct),
total_indirect_injuries=sum(injuries_indirect),
total_direct_deaths=sum(deaths_direct),
total_indirect_deaths=sum(deaths_indirect)) %>% ungroup() -> df1
df1$duration_range <- factor(df1$duration_range, levels=c('Half Day', 'A Day','A Week', 'Two Weeks', 'A Month', 'More Than A Month'))
ggplot(df1,aes(x=duration_range, group = 1)) +
geom_bar(mapping = aes(x=duration_range,y=total_indirect_injuries),stat = "identity", fill = "grey") +
geom_line(mapping = aes(x=duration_range,y=total_direct_injuries), size = 1, color = "blue") +
xlab("Weather Event Duration") +
ylab("Injury Case Number") +
ggtitle("Weather Event Duration and Injuries ") +
theme_bw()
```
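As a rough cross-check of the duration logic, the sketch below rebuilds timestamps directly from the integer `*_yearmonth`, `*_day` and `*_time` columns instead of slicing the date-time strings. The six events are copied from the `head(df)` output; `to_timestamp()` is a hypothetical helper, and every cutoff other than the 12-hour one is an assumption inferred from the range labels. It uses base R only, so it runs without the dashboard's packages.

```r
# Six events copied from the head(df) output (integer columns only).
events <- data.frame(
  begin_yearmonth = c(201402, 201402, 201401, 201411, 201402, 201404),
  begin_day       = c(18, 5, 18, 26, 13, 24),
  begin_time      = c(1000, 300, 1000, 1000, 630, 1014),
  end_yearmonth   = c(201402, 201402, 201401, 201411, 201402, 201404),
  end_day         = c(18, 5, 19, 27, 14, 24),
  end_time        = c(2000, 2300, 700, 1000, 800, 1316)
)

# Hypothetical helper: zero-pad the day and the HHMM time, then parse one timestamp.
# UTC is used only to sidestep daylight-saving gaps; the real values are local times.
to_timestamp <- function(yearmonth, day, time) {
  stamp <- sprintf("%06d%02d %04d",
                   as.integer(yearmonth), as.integer(day), as.integer(time))
  as.POSIXct(stamp, format = "%Y%m%d %H%M", tz = "UTC")
}

begin <- to_timestamp(events$begin_yearmonth, events$begin_day, events$begin_time)
end   <- to_timestamp(events$end_yearmonth, events$end_day, events$end_time)
events$duration_hours <- as.numeric(difftime(end, begin, units = "hours"))

# Assumed cutoffs in hours: 12, 24, 168 (7 days), 336 (14 days), 720 (30 days).
events$duration_range <- cut(
  events$duration_hours,
  breaks = c(-Inf, 12, 24, 168, 336, 720, Inf),
  labels = c("Half Day", "A Day", "A Week", "Two Weeks", "A Month", "More Than A Month")
)

print(events[, c("duration_hours", "duration_range")])
```

With these cutoffs the six events fall into Half Day, A Day, A Day, A Day, A Week and Half Day. `cut()` returns an ordered set of factor levels directly, so the separate `factor(..., levels = ...)` step would not be needed in this variant.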
### Insight 1
The shorter the weather event, the more direct and indirect injuries are recorded. For events lasting less than half a day, direct injuries are more than four times as frequent as indirect injuries. This makes sense: shorter events are generally more intense, so they cause more direct injuries than indirect ones.
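The direct-to-indirect comparison can be checked by dividing the two totals for each duration range. A minimal sketch with made-up totals (the numbers and the `df1_toy` name are illustrative only, not values from the 2014 data):

```r
# Made-up totals per duration range, for illustration only.
df1_toy <- data.frame(
  duration_range          = c("Half Day", "A Day", "A Week"),
  total_direct_injuries   = c(90, 12, 4),
  total_indirect_injuries = c(20, 6, 0)
)

# Ratio of direct to indirect injuries; NA where there are no indirect
# injuries, so a division by zero does not show up as Inf.
df1_toy$direct_to_indirect <- ifelse(
  df1_toy$total_indirect_injuries > 0,
  df1_toy$total_direct_injuries / df1_toy$total_indirect_injuries,
  NA_real_
)

print(df1_toy)
```

Applied to the real `df1`, a ratio above 4 in the "Half Day" row would back up the claim.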
Column {data-width=500}
-----------------------------------------------------------------------
### Weather Event Duration and Deaths
```{R}
ggplot(df1,aes(x=duration_range, group = 1)) +
geom_bar(mapping = aes(x=duration_range,y=total_indirect_deaths),stat = "identity", fill = "grey") +
geom_line(mapping = aes(x=duration_range,y=total_direct_deaths), size = 1, color = "red") +
xlab("Weather Event Duration") +
ylab("Death Case Number") +
ggtitle("Weather Event Duration and Deaths ") +
theme_bw()
```
### Insight 2
First, deaths are far fewer than injuries. Second, the shorter the weather event, the more deaths are recorded: as with injuries, deaths peak for events lasting less than half a day and decline as duration increases. Finally, direct deaths generally outnumber indirect deaths.
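The chart plots total counts, so a duration range that contains many events can dominate even if each event is mild. One possible way to get an actual rate is to divide each total by the number of events in that range. The sketch below does this on made-up data (`toy` and its values are hypothetical):

```r
# Made-up events, for illustration only (not values from the 2014 data).
toy <- data.frame(
  duration_range = c("Half Day", "Half Day", "Half Day", "Half Day", "A Day", "A Week"),
  deaths_direct  = c(0, 2, 0, 2, 3, 1)
)

# Total deaths and number of events per duration range.
totals <- tapply(toy$deaths_direct, toy$duration_range, sum)
counts <- tapply(toy$deaths_direct, toy$duration_range, length)

# Deaths per event: a rate that does not depend on how many events a range has.
deaths_per_event <- totals / counts

print(rbind(totals, counts, deaths_per_event))
```

In this toy case "Half Day" has the largest total (4) but a lower per-event rate (1) than "A Day" (3), which is the kind of difference a totals-only chart cannot show.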
Month Analysis {data-icon="fa-exclamation-circle"}
=============================
Column {data-width=400}
-----------------------------------------------------------------------
### Heavy Rain & High Wind
```{R}
df$event_type <- as.factor(df$event_type)
df$month_name <- as.factor(df$month_name)
df %>% select(month_name,event_type) %>%
group_by(month_name, event_type) %>%
summarise(n=n()) %>%
arrange(match(month_name, c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))) %>%
filter(event_type %in% c('Heavy Rain','High Wind')) -> df_2
df_2$month_name <- factor(df_2$month_name, levels=c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))
ggplot(df_2, aes(x = month_name , y = n, group=event_type, color=event_type, label = n)) +
geom_line() +
geom_text() +
xlab("Month") +
ylab("Total cases") +
theme_bw()
```
### Insight 3
1. High wind events are more common in winter, while heavy rain events are more common in summer.
2. High wind events in winter are nearly twice as frequent as in spring and about 10 times as frequent as in summer. However, the dataset covers only 2014, so more years of data are needed to confirm this pattern.
3. Heavy rain events are spread more evenly across the seasons than high wind events.
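Seasonal comparisons like the ones above can be computed by summing the monthly counts per season and dividing. This is a sketch under two assumptions: the counts are made up, and seasons follow the meteorological convention (winter is December to February).

```r
# Made-up monthly event counts, for illustration only.
monthly <- data.frame(
  month_name = month.name,
  n = c(40, 30, 20, 10, 10, 5, 5, 5, 10, 20, 25, 30)
)

# Assumed meteorological seasons: Jan-Feb and Dec are winter, Mar-May spring,
# Jun-Aug summer, Sep-Nov fall.
monthly$season <- rep(c("Winter", "Spring", "Summer", "Fall", "Winter"),
                      times = c(2, 3, 3, 3, 1))

# Total events per season, then the two ratios discussed in the insight.
seasonal <- tapply(monthly$n, monthly$season, sum)
winter_vs_spring <- seasonal[["Winter"]] / seasonal[["Spring"]]
winter_vs_summer <- seasonal[["Winter"]] / seasonal[["Summer"]]

print(seasonal)
print(c(winter_vs_spring = winter_vs_spring, winter_vs_summer = winter_vs_summer))
```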
Column {data-width=300}
-----------------------------------------------------------------------
### Injuries by Month
```{r}
df %>% select(month_name,injuries_direct, injuries_indirect) %>%
group_by(month_name) %>%
summarise(total_direct_injuries=sum(injuries_direct),
total_indirect_injuries=sum(injuries_indirect)) %>%
arrange(match(month_name, c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))) %>% ungroup() -> df_3
df_3$month_name <- factor(df_3$month_name, levels=c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))
ggplot(df_3, aes(x=month_name, group=1)) +
geom_line(aes(y=total_direct_injuries),color = "darkred") +
geom_line(aes(y=total_indirect_injuries)) +
xlab ("Month") +
ylab("Injury Cases") +
theme_classic()
```
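The month-ordering step in the chunk above could likely be shortened with base R's built-in `month.name` constant, which holds the twelve English month names in calendar order. A small sketch on made-up data (`toy_months` is hypothetical):

```r
# Made-up monthly totals in arbitrary row order, for illustration only.
toy_months <- data.frame(
  month_name            = c("March", "January", "December", "February"),
  total_direct_injuries = c(7, 12, 9, 4)
)

# Calendar-ordered factor levels make ggplot2 draw the x axis from January to December.
toy_months$month_name <- factor(toy_months$month_name, levels = month.name)

# Sorting by the factor follows the level order, not alphabetical order.
ordered_rows <- toy_months[order(toy_months$month_name), ]
print(ordered_rows)
```

Because the factor levels alone control the axis order in a plot, the separate `arrange(match(...))` call matters only for how the table itself is sorted.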
### Insight 4
1. Both direct and indirect injuries peak in January, possibly because short-duration weather events are more common in winter.
2. Overall, direct injuries outnumber indirect injuries in every month.
3. Indirect injuries stay relatively flat throughout the year, which is consistent with their definition: an indirect injury can happen at any time.
Column {data-width=300}
-----------------------------------------------------------------------
### Deaths by Month
```{r}
df %>% select(month_name,deaths_direct, deaths_indirect) %>%
group_by(month_name) %>%
summarise(total_direct_death=sum(deaths_direct),
total_indirect_death=sum(deaths_indirect)) %>%
arrange(match(month_name, c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))) %>% ungroup() -> df_4
df_4$month_name <- factor(df_4$month_name, levels=c('January', 'February','March', 'April', 'May', 'June', 'July', 'August', 'September','October', 'November', 'December'))
ggplot(df_4, aes(x=month_name, group=1)) +
geom_line(aes(y=total_direct_death),color = "blue") +
geom_line(aes(y=total_indirect_death)) +
xlab ("Month") +
ylab("Death Cases") +
theme_classic()
```
### Insight 5
Unlike injuries, direct deaths are most common in spring and summer, while indirect deaths are at their lowest during those seasons.