ANLY505 - Problem Set 2

The statistical model:

\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)

Where:

\(\beta_0\) is the mean response when both Elevation and Slope are 0
\(\beta_1\) is the change in mean response for a 1-unit change in elevation. Elevation is measured at the stand level, so all plots in a stand share a single value in elevation.
\(\beta_2\) is the change in mean response for a 1-unit change in slope. Slope is measured at the plot level, so every plot potentially has a unique value of slope.

Let’s define the parameters:

the intercept \((\beta_0)\) will be -1
the coefficient for elevation \((\beta_1)\) will be set to 0.005
the coefficient for slope \((\beta_2)\) will be set to 0.1

nstand = 5
nplot = 4
b0 = -1
b1 = .005
b2 = .1
sds = 2
sd = 1

Simulate other variables:

set.seed(16)
stand = rep(LETTERS[1:nstand], each = nplot)
plot = letters[1:(nstand*nplot)]
standeff = rep( rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)

Simulate elevation and slope:

elevation = rep( runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)

Simulate response variable:

resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff

Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):

Fit a linear mixed model with the response variable as a function of elevation and slope with stand as a random effect. Are the estimated parameters similar to the intial parameters as we defined them?

library(lme4)

## Warning: package 'lme4' was built under R version 3.5.2

## Loading required package: Matrix

dat = data.frame(elevation, slope, resp2, stand)
fit_resp2 = lmer(resp2 ~ elevation + slope + (1|stand), data = dat)
fit_resp2

## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
##    Data: dat
## REML criterion at convergence: 81.9874
## Random effects:
##  Groups   Name        Std.Dev.
##  stand    (Intercept) 1.099   
##  Residual             1.165   
## Number of obs: 20, groups:  stand, 5
## Fixed Effects:
## (Intercept)    elevation        slope  
##   -21.31463      0.02060      0.09511

# the given elevation coefficient b1=0.005, fitted b1=0.0206
# the given slope coefficient b2=0.1, fitted b2=0.0951

Create a function for your model and run 1000 simulations of that model.

mix_model=function(nstand=5, nplot=4, b0=-1, b1=0.005, b2=0.1, sd=1, sds=2){
  stand = rep(LETTERS[1:nstand], each = nplot)
  plot = letters[1:(nstand*nplot)]
  standeff = rep( rnorm(nstand, 0, sds), each = nplot)
  ploteff = rnorm(nstand*nplot, 0, sd)
  elevation = rep( runif(nstand, 1000, 1500), each = nplot)
  slope = runif(nstand*nplot, 2, 75)
  resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff
  dat = data.frame(resp2, elevation, slope, stand)
  lmer(resp2 ~ elevation + slope + (1|stand), data=dat)
}

library(purrr)

## Warning: package 'purrr' was built under R version 3.5.2

mix_1000 = rerun(1000, mix_model())

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00350336
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0045514
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00363544
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00439561
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00239513
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00219495
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00322495
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular

Extract the stand and residual variances from this simulation run. Print the first 6 rows of the data.

library(broom)

## Warning: package 'broom' was built under R version 3.5.2

get_vars = mix_1000 %>%
  map_dfr(tidy, effects = "ran_pars", scales = "vcov")

head(get_vars)

## # A tibble: 6 x 3
##   term                     group    estimate
##   <chr>                    <chr>       <dbl>
## 1 var_(Intercept).stand    stand       5.56 
## 2 var_Observation.Residual Residual    0.951
## 3 var_(Intercept).stand    stand       2.61 
## 4 var_Observation.Residual Residual    1.11 
## 5 var_(Intercept).stand    stand       9.73 
## 6 var_Observation.Residual Residual    1.36

Choose three different sample sizes (your choice) and run 1000 model simulations with each sample size. Create 3 visualizations that compare distributions of the variances for each of the 3 sample sizes. Make sure that the axes are labelled correctly. What do these graphs say about the relationship between sample size and variance?

library(tidyverse)

## ── Attaching packages ────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──

## ✔ ggplot2 3.0.0     ✔ readr   1.3.1
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   0.8.3     ✔ stringr 1.3.1
## ✔ ggplot2 3.0.0     ✔ forcats 0.4.0

## Warning: package 'tibble' was built under R version 3.5.2

## Warning: package 'tidyr' was built under R version 3.5.2

## Warning: package 'dplyr' was built under R version 3.5.2

## Warning: package 'forcats' was built under R version 3.5.2

## ── Conflicts ───────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::expand() masks Matrix::expand()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

stand_sims = c(5,20,100) %>%
  set_names() %>%
  map(~replicate(1000, mix_model(nstand = .x)))

## boundary (singular) fit: see ?isSingular

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00315635
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00279738
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00312312
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0061472
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0024346
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00366529
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00315657
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00452179
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00294617
## (tol = 0.002, component 1)

## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00256339
## (tol = 0.002, component 1)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00202024
## (tol = 0.002, component 1)

stand_vars = stand_sims %>%
  modify_depth(2, ~tidy(.x, effects = "ran_pars", scales = "vcov")) %>%
  map_dfr(bind_rows, .id = "stand_num") %>%
  filter(group == "stand")

stand_vars = mutate(stand_vars, stand_num=fct_inorder(stand_num))

add_prefix = function(string){
  paste("Number of stands:", string, sep = " ")
}

groupmed = stand_vars %>%
  group_by(stand_num) %>%
  summarise(mvar = median(estimate))

ggplot(stand_vars, aes(x = estimate)) +
  geom_density(fill = "orange", alpha = 0.25) +
  facet_wrap(~stand_num, labeller = as_labeller(add_prefix)) +
  geom_vline(aes(xintercept =4, linetype = "True Variance"), size = 0.5) +
  geom_vline(data = groupmed, aes(xintercept = mvar, linetype = "Median Variance"), size = 0.5) +
  theme_bw() +
  scale_linetype_manual(name = "", values = c(2,1)) +
  theme(legend.position = "bottom", legend.key.width = unit(.1, "cm")) +
  labs(x = "Estimated Variance", y = NULL)

# with number of stands getting larger, the variance is getting much smaller, and the Median Variance is getting closer to the true variance

Plot the coefficients of the estimates of elevation and slope. Hint: the x-axis should have 1000 values. Discuss the graphs.

mix_1000 %>%
  map_df(tidy, effects = "fixed") %>%
  filter(term == "elevation") %>%
  ggplot(aes(x=1:1000, y=estimate)) +
  geom_line() +
  facet_wrap(~term) +
  geom_hline(aes(yintercept = 0.005, linetype = "True Elevation")) +
  labs(x = "x", y = "Estimated Elevation")

mix_1000 %>%
  map_df(tidy, effects = "fixed") %>%
  filter(term == "slope") %>%
  ggplot(aes(x=1:1000, y=estimate)) +
  geom_line() +
  facet_wrap(~term) +
  geom_hline(aes(yintercept = 0.1, linetype = "True Slope")) +
  labs(x = "x", y = "Estimated Slope")

# As can be seen in the 2 plots, the estimated values will differ simulation by simulation; however, if we run enough amount of times, the estimated values will bounce around the true values.

ANLY505 - Problem Set 2

Simulation of Linear Mixed Models

Yiqi Niu

2019-08-23