This project uses data collected in the Lake Tahoe Basin during the Jeffrey pine beetle (JPB) outbreak of 1991 to 1996, when JPB caused tree mortality throughout the basin during a severe drought. The data set covers a 60-acre study area in which 10,722 trees were followed annually, and it describes patterns of JPB-caused mortality. This project uses the ‘pine beetle’ dataset to analyze and predict the minimum linear distance to the nearest brood tree (DeadDist).
The pine beetle dataset was used to predict the minimum distance to the nearest brood tree (DeadDist). The predictors used for these models were tree diameter (TreeDiam), infestation severity nearest to the response tree (Infest_Serv1), the stand density index in the 1/20th-acre neighborhood surrounding the response tree (SDI_20th), and the total basal area of all trees within the 1/2-acre neighborhood of the response tree (Neigh_1/2th).
Variables | Description |
---|---|
TreeDiam | Tree diameter/size |
Infest_Serv1 | Infestation severity nearest to response tree |
Infest_Serv2 | Infestation severity nearest to response tree |
Ind_DeadDist | Indicator of whether a brood tree was found within the 50 m effective distance |
DeadDist | Minimum linear distance to nearest brood tree |
SDI_20th | Stand Density Index @ 1/20th-acre neighborhood surrounding response tree |
Neigh_SDI_1/4th | Stand Density Index @ 1/4th-acre neighborhood surrounding response tree |
BA_20th | Basal Area @ 1/20th-acre neighborhood surrounding response tree |
Neigh_1/4th | Basal area total summed for all trees within 1/4th-acre neighborhood of response tree |
Neigh_1/2th | Basal area total summed for all trees within 1/2-acre neighborhood of response tree |
Neigh_1 | Basal area total summed for all trees within 1-acre neighborhood of response tree |
Neigh_1.5 | Basal area total summed for all trees within 1.5-acre neighborhood of response tree |
BA_Inf_20th | Basal area total for all infested trees within 1/20th-acre neighborhood |
BA_infest_1/4 | Basal area total for all infested trees within 1/4th-acre neighborhood |
BA_infest_1/2 | Basal area total for all infested trees within 1/2-acre neighborhood |
BA_infest_1 | Basal area total for all infested trees within 1-acre neighborhood |
BA_infest_1.5 | Basal area total for all infested trees within 1.5-acre neighborhood |
IND_BA_Infest_20th | Indicator of any infested trees within 1/20th-acre neighborhood of response tree |
IND_BA_infest_1/4th | Indicator of any infested trees within 1/4th-acre neighborhood of response tree |
IND_BA_infest_1/2th | Indicator of any infested trees within 1/2-acre neighborhood of response tree |
IND_BA_infest_1 | Indicator of any infested trees within 1-acre neighborhood of response tree |
IND_BA_infest_1.5 | Indicator of any infested trees within 1.5-acre neighborhood of response tree |
# A tibble: 5 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 7.97 0.0641 124. 0
2 TreeDiam -0.00549 0.00269 -2.04 4.14e- 2
3 Infest_Serv1 -0.0157 0.000901 -17.4 9.05e-67
4 SDI_20th -0.0132 0.00164 -8.05 9.12e-16
5 `Neigh_1/2th` -0.0232 0.000513 -45.3 0
# A tibble: 5 × 3
term estimate penalty
<chr> <dbl> <dbl>
1 (Intercept) 4.50 0.163
2 TreeDiam -0.0488 0.163
3 Infest_Serv1 -0.256 0.163
4 SDI_20th -0.231 0.163
5 Neigh_1/2th -0.817 0.163
---
title: "HGEN 612 Project 2"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    theme: journal
    source_code: embed
---
```{r setup, include=FALSE}
library(flexdashboard)
library(DT)
library(ggplot2)
library(plotly)
library(corrr)
library(emo)
library(tidyverse)
library(readxl)
library(broom)
library(car)
library(ggfortify)
library(tidymodels)
library(vip)
library(performance)
library(GGally)
pine_tbl <- read_excel("Data/Data_1993.xlsx", sheet = 1)
```
Introduction to Project and Dataset
========================================
Column {data-width=400}
-----------------------------------------------------------------------
### **Project Overview**
```{r}
```
This project uses data collected in the Lake Tahoe Basin during the Jeffrey pine
beetle (JPB) outbreak of 1991 to 1996, when JPB caused tree mortality throughout
the basin during a severe drought. The data set covers a 60-acre study area in
which 10,722 trees were followed annually, and it describes patterns of JPB-caused
mortality. This project uses the 'pine beetle' dataset to analyze and predict
the minimum linear distance to the nearest brood tree (DeadDist).
### **Jeffrey Pine Beetle**
```{r photo input, out.width='100%'}
knitr::include_graphics("~/Desktop/HGEN_612_Project2/Dendroctonus_ponderosae.jpg")
```
### **Model Evaluation**
The pine beetle dataset was used to predict the minimum distance to the nearest
brood tree (DeadDist). The predictors used for these models were tree diameter
(TreeDiam), infestation severity nearest to the response tree (Infest_Serv1), the
stand density index in the 1/20th-acre neighborhood surrounding the response tree
(SDI_20th), and the total basal area of all trees within the 1/2-acre neighborhood
of the response tree (Neigh_1/2th).
Column {data-width=600}
-----------------------------------------------------------------------
### **Data Variable Explanation**
```{r variable input, out.width='100%'}
data.frame(Variables = c("TreeDiam", "Infest_Serv1", "Infest_Serv2", "Ind_DeadDist",
"DeadDist", "SDI_20th", "Neigh_SDI_1/4th", "BA_20th",
"Neigh_1/4th", "Neigh_1/2th", "Neigh_1", "Neigh_1.5",
"BA_Inf_20th", "BA_infest_1/4", "BA_infest_1/2", "BA_infest_1",
"BA_infest_1.5", "IND_BA_Infest_20th", "IND_BA_infest_1/4th",
"IND_BA_infest_1/2th", "IND_BA_infest_1", "IND_BA_infest_1.5"),
Description = c("Tree diameter/size",
"Infestation severity nearest to response tree",
"Infestation severity nearest to response tree",
"Indicator of whether a brood tree was found within the 50 m effective distance",
"Minimum linear distance to nearest brood tree",
"Stand Density Index @ 1/20th-acre neighborhood surrounding response tree",
"Stand Density Index @ 1/4th-acre neighborhood surrounding response tree",
"Basal Area @ 1/20th-acre neighborhood surrounding response tree",
"Basal area total summed for all trees within 1/4th-acre neighborhood of response tree",
"Basal area total summed for all trees within 1/2-acre neighborhood of response tree",
"Basal area total summed for all trees within 1-acre neighborhood of response tree",
"Basal area total summed for all trees within 1.5-acre neighborhood of response tree",
"Basal area total for all infested trees within 1/20th-acre neighborhood",
"Basal area total for all infested trees within 1/4th-acre neighborhood",
"Basal area total for all infested trees within 1/2-acre neighborhood",
"Basal area total for all infested trees within 1-acre neighborhood",
"Basal area total for all infested trees within 1.5-acre neighborhood",
"Indicator of any infested trees within 1/20th-acre neighborhood of response tree",
"Indicator of any infested trees within 1/4th-acre neighborhood of response tree",
"Indicator of any infested trees within 1/2-acre neighborhood of response tree",
"Indicator of any infested trees within 1-acre neighborhood of response tree",
"Indicator of any infested trees within 1.5-acre neighborhood of response tree")) %>%
knitr::kable()
```
Predictor Overview
========================================
``` {r predictor overview}
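# Keep the response (DeadDist) and the four predictors used in the models.
pine_tbl_select <- pine_tbl %>%
  select("DeadDist", "TreeDiam", "Infest_Serv1", "SDI_20th", "Neigh_1/2th")
# Pairwise scatterplots, densities, and correlations for the selected variables.
ggpairs(pine_tbl_select)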
```
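The pairs plot is a visual check for collinearity among the predictors, and the recipes in the following sections call `step_corr()`, whose default threshold is an absolute correlation of 0.9. The sketch below illustrates the underlying idea only: it uses base R on a small synthetic data frame (`toy` and its columns `a`, `b`, `c` are made up for illustration and are not part of the pine beetle data), and it merely flags pairs rather than reproducing the selection logic of `step_corr()`.

```r
# Synthetic data: `b` is nearly a copy of `a`, `c` is unrelated (illustrative only).
set.seed(2)
n <- 300
toy <- data.frame(a = rnorm(n))
toy$b <- toy$a + rnorm(n, sd = 0.1)
toy$c <- rnorm(n)

# Pairwise Pearson correlations between all columns.
cors <- cor(toy)

# Look at each pair once (upper triangle) and flag those above the threshold.
high <- which(abs(cors) > 0.9 & upper.tri(cors), arr.ind = TRUE)
flagged <- data.frame(
  var1 = rownames(cors)[high[, "row"]],
  var2 = colnames(cors)[high[, "col"]],
  r    = cors[high]
)
flagged
```

A pair that shows up in `flagged` would be a candidate for dropping one of its members before fitting.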
Linear Regression Model
========================================
Column {data-width=400}
-----------------------------------------------------------------------
### **Tidy Model Fit Table**
``` {r Linear Regression model build out}
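# Recipe: model sqrt(DeadDist) and drop predictors that are highly correlated with each other.
pine_tbl_recipe <-
  recipe(DeadDist ~ ., data = pine_tbl_select) %>%
  step_sqrt(all_outcomes()) %>%
  step_corr(all_predictors())

# Ordinary least squares through the base R lm engine.
lm_model <-
  linear_reg() %>%
  set_engine("lm")

# Bundle the preprocessing and the model into one workflow.
pine_wflow <-
  workflow() %>%
  add_model(lm_model) %>%
  add_recipe(pine_tbl_recipe)

# Fit on the full selected data set (no train/test split is used for this model).
pine_fit <-
  pine_wflow %>%
  fit(data = pine_tbl_select)

# Coefficient table; estimates are on the square-root scale of DeadDist.
pine_fit %>%
  extract_fit_parsnip() %>%
  tidy()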
```
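Because the recipe takes the square root of DeadDist, the fitted coefficients and any predictions are on the square-root scale. The following is a minimal sketch, in base R on synthetic data, of how such predictions could be squared to return to the original distance units. The `toy` data frame and the coefficients used to simulate it are assumptions for illustration, not values from the pine beetle data.

```r
# Synthetic stand-in for the pine beetle data (all values are made up).
set.seed(1)
toy <- data.frame(
  TreeDiam = runif(200, 5, 40),
  SDI_20th = runif(200, 0, 100)
)
toy$DeadDist <- (8 - 0.005 * toy$TreeDiam - 0.013 * toy$SDI_20th +
                   rnorm(200, sd = 0.5))^2

# Fit on the square-root scale, mirroring step_sqrt(all_outcomes()).
toy_fit <- lm(sqrt(DeadDist) ~ TreeDiam + SDI_20th, data = toy)

# Predictions come back on the square-root scale.
pred_sqrt <- predict(toy_fit, newdata = toy)

# Square them to express the predictions in the original distance units.
pred_dist <- pred_sqrt^2
head(pred_dist)
```

Squaring a predicted mean on the square-root scale generally understates the mean on the original scale, since E[Y] = E[sqrt(Y)]^2 + Var(sqrt(Y)), so back-transformed values are best read as approximate.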
### **VIP Plot**
``` {r VIP Plot }
pine_fit %>%
extract_fit_parsnip() %>%
vip::vip()
```
Column {data-width=600}
-----------------------------------------------------------------------
``` {r check model, out.width='100%'}
pine_fit %>%
extract_fit_parsnip() %>%
check_model()
```
Ridge Regression Model
========================================
Column {data-width=400}
-----------------------------------------------------------------------
``` {r ridge set up, include=FALSE}
# Hold out a test set; the seed value is arbitrary and was added for reproducibility.
set.seed(612)
pine_split <- initial_split(pine_tbl_select)
pine_train <- training(pine_split)
pine_test <- testing(pine_split)

# mixture = 0 requests a pure ridge (L2) penalty from glmnet.
ridge_mod <-
  linear_reg(mixture = 0, penalty = 0.1629751) %>%
  set_engine("glmnet")

ridge_mod %>%
  translate()

# Build the recipe from the training data only, so no test rows leak into preprocessing.
# Zero-variance predictors are removed before normalizing to avoid dividing by zero.
pine_rec <-
  recipe(DeadDist ~ ., data = pine_train) %>%
  step_sqrt(all_outcomes()) %>%
  step_corr(all_predictors()) %>%
  step_zv(all_numeric_predictors()) %>%
  step_normalize(all_numeric_predictors())

pine_ridge_wflow <-
  workflow() %>%
  add_model(ridge_mod) %>%
  add_recipe(pine_rec)

pine_ridge_wflow

pine_ridge_fit <-
  pine_ridge_wflow %>%
  fit(data = pine_train)
```
### **Tidy Model Fit Ridge Table**
``` {r tidy table ridge}
pine_ridge_fit %>%
extract_fit_parsnip() %>%
tidy()
```
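The ridge model is specified with `mixture = 0`, which requests a pure L2 penalty that shrinks coefficients toward zero. The sketch below shows that shrinkage with the textbook closed-form ridge estimator in base R on synthetic, standardized predictors. The data, the helper `ridge_coef()`, and the value `lambda = 50` are assumptions for illustration. glmnet scales its objective differently and uses a different solver, so these numbers are not expected to match glmnet output for the same penalty value.

```r
# Synthetic standardized predictors, similar in spirit to step_normalize() output.
set.seed(3)
n <- 200
X <- scale(matrix(rnorm(n * 3), ncol = 3))
y <- drop(X %*% c(-0.8, -0.25, -0.05)) + rnorm(n)
y <- y - mean(y)  # centre y so that no intercept is needed

# Hypothetical helper: solve (X'X + lambda * I) beta = X'y.
ridge_coef <- function(X, y, lambda) {
  p <- ncol(X)
  drop(solve(crossprod(X) + lambda * diag(p), crossprod(X, y)))
}

beta_ols   <- ridge_coef(X, y, lambda = 0)   # no penalty: ordinary least squares
beta_ridge <- ridge_coef(X, y, lambda = 50)  # penalized: coefficients shrink
rbind(ols = beta_ols, ridge = beta_ridge)
```

With any positive `lambda`, the overall size of the ridge coefficient vector is smaller than that of the unpenalized fit, which is the behavior the penalty is meant to produce.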
### **VIP Ridge Plot**
``` {r VIP Plot ridge}
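# Compute importance at the penalty used to fit the model (passed through to vip::vi()).
pine_ridge_fit %>%
  extract_fit_parsnip() %>%
  vip::vip(lambda = 0.1629751)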
```
Column {data-width=600}
-----------------------------------------------------------------------
``` {r check model ridge, out.width='100%'}
```
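The ridge setup creates a held-out `pine_test` set, but no chunk evaluates the model on it. As a hedged sketch of what such a check could look like, the base R example below computes train and test RMSE for a square-root-scale model on synthetic data. The `toy` data, the helper `rmse()`, and the simple `lm()` fit are assumptions for illustration and stand in for the ridge workflow.

```r
# Synthetic data with a response that is linear on the square-root scale (made up).
set.seed(4)
toy <- data.frame(x = runif(400, 0, 100))
toy$y <- (8 - 0.02 * toy$x + rnorm(400, sd = 0.5))^2

# 75/25 split, the default proportion of rsample::initial_split().
train_idx <- sample(nrow(toy), size = floor(0.75 * nrow(toy)))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Fit on the training rows only.
fit <- lm(sqrt(y) ~ x, data = train)

# Hypothetical helper: root mean squared error.
rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))

# Compare on the square-root scale, the scale the model was fitted on.
rmse_train <- rmse(sqrt(train$y), predict(fit, newdata = train))
rmse_test  <- rmse(sqrt(test$y),  predict(fit, newdata = test))
c(train = rmse_train, test = rmse_test)
```

A test RMSE far above the training RMSE would suggest overfitting; similar values suggest the fit generalizes to unseen trees.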