The statistical model:

\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)

Where:

Let’s define the parameters:

nstand = 5
nplot = 4
b0 = -1
b1 = .005
b2 = .1
sds = 2
sd = 1

Simulate other variables:

set.seed(16)
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep( rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)

Simulate elevation and slope:

elevation = rep( runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)

Simulate response variable:

resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff 

Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):

  1. Fit a linear mixed model with the response variable as a function of elevation and slope with stand as a random effect. Are the estimated parameters similar to the intial parameters as we defined them?
fit2 = lmer(resp2 ~ elevation + slope + (1|stand))
fit2
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## REML criterion at convergence: 81.9874
## Random effects:
##  Groups   Name        Std.Dev.
##  stand    (Intercept) 1.099   
##  Residual             1.165   
## Number of obs: 20, groups:  stand, 5
## Fixed Effects:
## (Intercept)    elevation        slope  
##   -21.31463      0.02060      0.09511

The estimated parameters for elevation (b1) and slope (b2) are 0.0206 and 0.09511, respectively. These are pretty close to the initial defined parameters (.005 and .1). The large difference is in the intercept parameter (-21.3 vs -1).

  1. Create a function for your model and run 1000 simulations of that model.
fun2 = function(b0=-1, b1=.005, b2=.1, nstand=5, nplot=4, sds=2, sd=1) {
  stand = rep(LETTERS[1:nstand], each = nplot)
  standeff = rep(rnorm(nstand, 0, sds), each = nplot)  # Stand Effect
  ploteff = rnorm(nstand * nplot, 0, sd)  # Plot Effect
  elevation = rep(runif(nstand, 1000, 1500), each = nplot)
  slope = runif(nstand*nplot, 2, 75)
  resp2 = b0 + b1 * elevation + b2 * slope + standeff + ploteff
  fit2 = lmer(resp2 ~ elevation + slope + (1|stand))
  fit2
}
fun2()
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## REML criterion at convergence: 80.9781
## Random effects:
##  Groups   Name        Std.Dev.
##  stand    (Intercept) 2.3573  
##  Residual             0.9754  
## Number of obs: 20, groups:  stand, 5
## Fixed Effects:
## (Intercept)    elevation        slope  
##   10.584601    -0.005464     0.086839
sim_fun2 = replicate(n=1000, fun2(), simplify = F)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
length(sim_fun2)
## [1] 1000
  1. Extract the stand and residual variances from this simulation run. Print the first 6 rows of the data.
head(map(sim_fun2, ~summary(.x)), 6)
## [[1]]
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 81.8
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.64222 -0.50652 -0.07439  0.39168  1.49956 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 2.610    1.616   
##  Residual             1.106    1.052   
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##               Estimate Std. Error t value
## (Intercept) -15.946250   5.626086  -2.834
## elevation     0.015212   0.004263   3.568
## slope         0.121069   0.015653   7.735
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.984       
## slope     -0.058 -0.055
## 
## [[2]]
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 87.8
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.25126 -0.48517 -0.04175  0.49356  1.54182 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 9.731    3.120   
##  Residual             1.365    1.168   
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept) -13.92550   13.31884  -1.046
## elevation     0.01556    0.01170   1.330
## slope         0.12435    0.01663   7.479
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.993       
## slope     -0.044 -0.003
## 
## [[3]]
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 76.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9391 -0.3691 -0.1649  0.5172  1.7249 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 0.8269   0.9093  
##  Residual             0.9138   0.9559  
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept) -12.25232    4.45186  -2.752
## elevation     0.01300    0.00336   3.870
## slope         0.09365    0.01005   9.323
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.992       
## slope      0.059 -0.137
## 
## [[4]]
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 86.8
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.17685 -0.67552 -0.04607  0.66582  1.92244 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 9.588    3.096   
##  Residual             1.316    1.147   
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept) -11.92341   21.43384  -0.556
## elevation     0.01345    0.01573   0.856
## slope         0.07667    0.01519   5.048
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.997       
## slope     -0.112  0.080
## 
## [[5]]
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 65.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3503 -0.5292 -0.3537  0.7736  1.5564 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 0.000    0.0000  
##  Residual             0.815    0.9028  
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 31.75050   15.80359   2.009
## elevation   -0.02351    0.01323  -1.776
## slope        0.12712    0.01061  11.985
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -1.000       
## slope      0.072 -0.100
## convergence code: 0
## boundary (singular) fit: see ?isSingular
## 
## 
## [[6]]
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 83.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.9877 -0.4081 -0.1085  0.6376  1.8509 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 3.555    1.886   
##  Residual             1.301    1.141   
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept) -29.18993   15.11541  -1.931
## elevation     0.02430    0.01076   2.259
## slope         0.11198    0.01475   7.593
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.998       
## slope      0.043 -0.079

I was not able to figure out how to extract just the stand and residual variances from the summary, so I have printed out the full summary of the first 6 simulations, which includes the variances. I will continue to try and figure out how to do it.

  1. Choose three different sample sizes (your choice) and run 1000 model simulations with each sample size. Create 3 visualizations that compare distributions of the variances for each of the 3 sample sizes. Make sure that the axes are labelled correctly. What do these graphs say about the relationship between sample size and variance?
# Creating more variance by adding additional observations
stand_sims = c(5, 200, 100) %>%
  set_names() %>%
  map(~replicate(1000, fun2(nstand = .x)))
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
stand_vars = stand_sims %>%
  modify_depth(2, ~tidy(.x, effects = "ran_pars", scales = "vcov")) %>%
  map_dfr(bind_rows, .id = "stand_num") %>%
  filter(group == "stand")
head(stand_vars)
## # A tibble: 6 x 4
##   stand_num term                  group estimate
##   <chr>     <chr>                 <chr>    <dbl>
## 1 5         var_(Intercept).stand stand   12.0  
## 2 5         var_(Intercept).stand stand    6.84 
## 3 5         var_(Intercept).stand stand    2.89 
## 4 5         var_(Intercept).stand stand    2.52 
## 5 5         var_(Intercept).stand stand    0.966
## 6 5         var_(Intercept).stand stand    2.91
ggplot(stand_vars, aes(x = estimate)) +
  geom_density(fill = "orange", alpha = .25) +
  facet_wrap(~stand_num)

ggplot(stand_vars, aes(x = estimate)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

  1. Plot the coefficients of the estimates of elevation and slope. Hint: the x-axis should have 1000 values. Discuss the graphs.

  2. Submit a link to this document in R Pubs to your Moodle. This assignment is worth 25 points.