ANLY 505 - Problem Set #2

The statistical model:

\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)

Where:

\(\beta_0\) is the mean response when both Elevation and Slope are 0
\(\beta_1\) is the change in mean response for a 1-unit change in elevation. Elevation is measured at the stand level, so all plots in a stand share a single value in elevation.
\(\beta_2\) is the change in mean response for a 1-unit change in slope. Slope is measured at the plot level, so every plot potentially has a unique value of slope.

Let’s define the parameters:

the intercept \((\beta_0)\) will be -1
the coefficient for elevation \((\beta_1)\) will be set to 0.005
the coefficient for slope \((\beta_2)\) will be set to 0.1

nstand = 5
nplot = 4
b0 = -1
b1 = .005
b2 = .1
sds = 2
sd = 1

Simulate other variables:

set.seed(16)
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep( rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)

Simulate elevation and slope:

elevation = rep( runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)

Simulate response variable:

resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff

Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):

list.of.packages <- c("lme4","lmerTest","broom","broom.mixed","tidyselect","dplyr","tidyverse","ggplot2","future","bindrcpp")

new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]

if(length(new.packages)) install.packages(new.packages)

all.package = lapply(list.of.packages, require, character.only = TRUE)

## Loading required package: lme4

## Warning: package 'lme4' was built under R version 3.5.3

## Loading required package: Matrix

## Loading required package: lmerTest

## Warning: package 'lmerTest' was built under R version 3.5.3

## 
## Attaching package: 'lmerTest'

## The following object is masked from 'package:lme4':
## 
##     lmer

## The following object is masked from 'package:stats':
## 
##     step

## Loading required package: broom

## Warning: package 'broom' was built under R version 3.5.3

## Loading required package: broom.mixed

## Warning: package 'broom.mixed' was built under R version 3.5.3

## Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
## TMB was built with Matrix version 1.2.17
## Current Matrix version is 1.2.14
## Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package

## 
## Attaching package: 'broom.mixed'

## The following object is masked from 'package:broom':
## 
##     tidyMCMC

## Loading required package: tidyselect

## Warning: package 'tidyselect' was built under R version 3.5.3

## Loading required package: dplyr

## Warning: package 'dplyr' was built under R version 3.5.2

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

## Loading required package: tidyverse

## Warning: package 'tidyverse' was built under R version 3.5.3

## -- Attaching packages --------------------------- tidyverse 1.2.1 --

## v ggplot2 3.1.0     v readr   1.3.1
## v tibble  1.4.2     v purrr   0.3.2
## v tidyr   0.8.2     v stringr 1.3.1
## v ggplot2 3.1.0     v forcats 0.4.0

## Warning: package 'ggplot2' was built under R version 3.5.2

## Warning: package 'tibble' was built under R version 3.5.2

## Warning: package 'tidyr' was built under R version 3.5.2

## Warning: package 'readr' was built under R version 3.5.3

## Warning: package 'purrr' was built under R version 3.5.3

## Warning: package 'stringr' was built under R version 3.5.2

## Warning: package 'forcats' was built under R version 3.5.3

## -- Conflicts ------------------------------ tidyverse_conflicts() --
## x tidyr::expand()         masks Matrix::expand()
## x dplyr::filter()         masks stats::filter()
## x dplyr::lag()            masks stats::lag()
## x broom.mixed::tidyMCMC() masks broom::tidyMCMC()

## Loading required package: future

## Warning: package 'future' was built under R version 3.5.3

## Loading required package: bindrcpp

## Warning: package 'bindrcpp' was built under R version 3.5.2

Fit a linear mixed model with the response variable as a function of elevation and slope with stand as a random effect. Are the estimated parameters similar to the intial parameters as we defined them?

fit <- lmer(resp2 ~ 1 + elevation + slope + (1|stand))
summary(fit)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: resp2 ~ 1 + elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 82
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.65583 -0.62467 -0.01693  0.53669  1.41736 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 1.208    1.099   
##  Residual             1.358    1.165   
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept) -21.314628   6.602053   3.001313  -3.228   0.0482 *  
## elevation     0.020600   0.004916   3.113482   4.190   0.0230 *  
## slope         0.095105   0.016441  15.868032   5.785 2.88e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.991       
## slope      0.049 -0.148

Refer to defined terms, we have b0 = -1, b1 = .005, b2 = .1. Let’s check the output, we have b0 = -21.314628, b1 = 0.020600, b2 = 0.095105. By simple comparison, it is obvious that b0 and b1 are different, b2 is similar.

Create a function for your model and run 1000 simulations of that model.

fit_func <- function(nstand = 5, nplot = 4, b0 = -1, b1 = 0.005, b2 = 0.1, sds = 2, sd = 1) {
  stand <- rep(LETTERS[1:nstand], each = nplot)
  standeff <- rep(rnorm(nstand, 0, sds), each = nplot)
  ploteff <- rnorm(nstand * nplot, 0, sd)
  elevation <- rep(runif(nstand, 1000, 1500), each = nplot)
  slope <- runif(nstand * nplot, 2, 75)
  resp2 <- b0 + b1 * elevation + b2 * slope + standeff + ploteff
  
  dat <- data.frame(resp2, elevation, slope, stand)
  lmer(resp2 ~ 1 + elevation + slope + (1|stand), data = dat)
}
fit_func()

## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: resp2 ~ 1 + elevation + slope + (1 | stand)
##    Data: dat
## REML criterion at convergence: 85.1321
## Random effects:
##  Groups   Name        Std.Dev.
##  stand    (Intercept) 2.992   
##  Residual             1.077   
## Number of obs: 20, groups:  stand, 5
## Fixed Effects:
## (Intercept)    elevation        slope  
##    7.365232    -0.002138     0.102090

output <- replicate(n = 1000, expr = fit_func())

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00494375
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00426347
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00287986
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00314794
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00789635
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0162996
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00307489
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

Extract the stand and residual variances from this simulation run. Print the first 6 rows of the data.

variances <- output %>% map_dfr(tidy, effects = "ran_pars", scales = "vcov")
head(variances)

## # A tibble: 6 x 4
##   effect   group    term             estimate
##   <chr>    <chr>    <chr>               <dbl>
## 1 ran_pars stand    var__(Intercept)    2.78 
## 2 ran_pars Residual var__Observation    1.20 
## 3 ran_pars stand    var__(Intercept)    2.91 
## 4 ran_pars Residual var__Observation    0.771
## 5 ran_pars stand    var__(Intercept)    1.72 
## 6 ran_pars Residual var__Observation    1.82

Choose three different sample sizes (your choice) and run 1000 model simulations with each sample size. Create 3 visualizations that compare distributions of the variances for each of the 3 sample sizes. Make sure that the axes are labelled correctly. What do these graphs say about the relationship between sample size and variance?

sample_size <- c(10, 15, 20) %>% 
  set_names(c("10", "15", "20")) %>% 
  map(~replicate(1000, fit_func(nstand = .x)))

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00613041
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00465885
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00217807
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0127477
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0132353
## (tol = 0.002, component 1)

vars <- sample_size %>%
  modify_depth(2, ~tidy(.x, effects = "ran_pars", scales = "vcov")) %>%
  map_dfr(bind_rows, .id = "id") %>%
  filter(group == "stand")

add_prefix <- function(string) {
  paste("sample size:", string, sep = " ")
}

groupmed = vars %>%
  group_by(id) %>%
  summarise(mvar = median(estimate))

ggplot(vars, aes(x = estimate)) +
  geom_density(fill = "purple", alpha = "0.25") +
  facet_wrap(~id, labeller = as_labeller(add_prefix)) +
  geom_vline(aes(xintercept = 4, linetype = "True Variance"), size=0.5) +
  geom_vline(data = groupmed, aes(xintercept=mvar,linetype="Median Variance"), size = 0.5) +
  theme_bw() +
  labs(x="Estimated Variance")

Plot the coefficients of the estimates of elevation and slope. Hint: the x-axis should have 1000 values. Discuss the graphs.

est <- output %>% 
  map(tidy, effects = "fixed") %>% 
  bind_rows()

est %>% 
  dplyr::filter(term %in% c("slope", "elevation")) %>% 
  group_by(term) %>% 
  mutate(x = 1 : 1000) %>% 
  ungroup() %>%
  mutate(real_value = ifelse(term == "elevation", 0.005, 0.1)) %>% 
  ggplot(aes(x = x, y = estimate)) +
  facet_wrap(~term) +
  geom_line() +
  theme_bw() +
  geom_hline(aes(yintercept = real_value, color = term), linetype = 2, size = 1) +
  labs(x="Estimates", y = "Mean Response")

Submit a link to this document in R Pubs to your Moodle.

ANLY 505 - Problem Set #2

Simulation of Linear Mixed Models

Jingshu Zhao

2019-04-27