Load packages

The statistical model:

\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)

Where:

Let’s define the parameters:

nstand = 5
nplot = 4
b0 = -1
b1 = .005
b2 = .1
sds = 2
sd = 1

Simulate other variables:

set.seed(16)
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep( rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)

Simulate elevation and slope:

elevation = rep( runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)

Simulate response variable:

resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff 

Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):

  1. Fit a linear mixed model with the response variable as a function of elevation and slope with stand as a random effect. Are the estimated parameters similar to the intial parameters as we defined them?
model=lmer(resp2~1+elevation+slope+(1|stand))
summary(model)
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ 1 + elevation + slope + (1 | stand)
## 
## REML criterion at convergence: 82
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.65583 -0.62467 -0.01693  0.53669  1.41736 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  stand    (Intercept) 1.208    1.099   
##  Residual             1.358    1.165   
## Number of obs: 20, groups:  stand, 5
## 
## Fixed effects:
##               Estimate Std. Error t value
## (Intercept) -21.314628   6.602053  -3.228
## elevation     0.020600   0.004916   4.190
## slope         0.095105   0.016441   5.785
## 
## Correlation of Fixed Effects:
##           (Intr) elevtn
## elevation -0.991       
## slope      0.049 -0.148

Estimated slope 0.0951 is close to parameters. But -21.314 and elevation 0.0206 are both not close to the parameters.

  1. Create a function for your model and run 1000 simulations of that model.
function1=function(nstand=5,nplot=4,b0=-1,b1=0.005,b2=0.1,sds=2,sd=1){
  stand=rep(LETTERS[1:nstand],each=nplot)
  standeff=rep(rnorm(nstand,0,sds),each=nplot)
  ploteff=rnorm(nstand*nplot,0,sd)
  elevation=rep(runif(nstand,1000,1500),each=nplot)
  slope=runif(nstand*nplot,2,75)
  resp2=b0+b1*elevation+b2*slope+standeff+ploteff
  data1=data.frame(resp2,elevation,slope,stand)
  lmer(resp2~1+elevation+slope+(1|stand),data=data1)
}
function1()
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ 1 + elevation + slope + (1 | stand)
##    Data: data1
## REML criterion at convergence: 80.9781
## Random effects:
##  Groups   Name        Std.Dev.
##  stand    (Intercept) 2.3573  
##  Residual             0.9754  
## Number of obs: 20, groups:  stand, 5
## Fixed Effects:
## (Intercept)    elevation        slope  
##   10.584601    -0.005464     0.086839
simulation=replicate(n=1000,expr=function1())
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00350333
## (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00455142
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0036354
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00439557
## (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00239512
## (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00219495
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00322495
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
  1. Extract the stand and residual variances from this simulation run. Print the first 6 rows of the data.
variances=simulation %>% map_dfr(tidy,effects="ran_pars",scales="vcov")
variances[1:6,]
## # A tibble: 6 x 4
##   effect   group    term             estimate
##   <chr>    <chr>    <chr>               <dbl>
## 1 ran_pars stand    var__(Intercept)    2.61 
## 2 ran_pars Residual var__Observation    1.11 
## 3 ran_pars stand    var__(Intercept)    9.73 
## 4 ran_pars Residual var__Observation    1.36 
## 5 ran_pars stand    var__(Intercept)    0.827
## 6 ran_pars Residual var__Observation    0.914
  1. Choose three different sample sizes (your choice) and run 1000 model simulations with each sample size. Create 3 visualizations that compare distributions of the variances for each of the 3 sample sizes. Make sure that the axes are labelled correctly. What do these graphs say about the relationship between sample size and variance?
plan(multiprocess)
stand_sim=c(5, 100, 150) %>% 
  set_names(c("size_5","size_100","size_150")) %>% 
  future_map(function(.size)replicate(n=1000,expr=function1(nstand=.size)))
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00264363
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00294062
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00261699
## (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00202183
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00291416
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00422466
## (tol = 0.002, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.0034376
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl =
## control$checkConv, : Model failed to converge with max|grad| = 0.00310566
## (tol = 0.002, component 1)
## boundary (singular) fit: see ?isSingular
stand_var=stand_sim %>%
     modify_depth(2,~tidy(.x,effects="ran_pars",scales="vcov")) %>%
     map_dfr(bind_rows,.id="stand_num") %>%
     filter(group=="stand")
stand_var
## # A tibble: 3,000 x 5
##    stand_num effect   group term             estimate
##    <chr>     <chr>    <chr> <chr>               <dbl>
##  1 size_5    ran_pars stand var__(Intercept)    5.61 
##  2 size_5    ran_pars stand var__(Intercept)    1.07 
##  3 size_5    ran_pars stand var__(Intercept)    1.23 
##  4 size_5    ran_pars stand var__(Intercept)    3.99 
##  5 size_5    ran_pars stand var__(Intercept)    2.50 
##  6 size_5    ran_pars stand var__(Intercept)    8.70 
##  7 size_5    ran_pars stand var__(Intercept)    6.65 
##  8 size_5    ran_pars stand var__(Intercept)    0.742
##  9 size_5    ran_pars stand var__(Intercept)    1.34 
## 10 size_5    ran_pars stand var__(Intercept)    2.05 
## # ... with 2,990 more rows
ggplot(stand_var,aes(x=estimate))+
     geom_density(fill="blue",alpha=.25)+
     facet_wrap(~stand_num)+
     geom_vline(xintercept=4)

It looks like the variance decreases when sample size increases.

  1. Plot the coefficients of the estimates of elevation and slope. Hint: the x-axis should have 1000 values. Discuss the graphs.
coef=simulation %>%
  map_dfr(tidy,effects="fixed") %>%
  filter(term %in% c("elevation","slope"))
coef
## # A tibble: 2,000 x 5
##    effect term      estimate std.error statistic
##    <chr>  <chr>        <dbl>     <dbl>     <dbl>
##  1 fixed  elevation   0.0152   0.00426     3.57 
##  2 fixed  slope       0.121    0.0157      7.73 
##  3 fixed  elevation   0.0156   0.0117      1.33 
##  4 fixed  slope       0.124    0.0166      7.48 
##  5 fixed  elevation   0.0130   0.00336     3.87 
##  6 fixed  slope       0.0937   0.0100      9.32 
##  7 fixed  elevation   0.0135   0.0157      0.856
##  8 fixed  slope       0.0767   0.0152      5.05 
##  9 fixed  elevation  -0.0235   0.0132     -1.78 
## 10 fixed  slope       0.127    0.0106     12.0  
## # ... with 1,990 more rows
coef %>% 
  dplyr::filter(term %in% c("elevation","slope")) %>% 
  group_by(term) %>% 
  mutate(x=1:1000) %>%
  ungroup() %>% 
  mutate(real_value=ifelse(term=="elevation",0.005,0.1)) %>% 
  ggplot(aes(x=x,y=estimate))+
  geom_line()+
  facet_wrap(~term)+
  geom_hline(aes(yintercept=real_value,color=term),linetype=4,size=0.5)+
  theme_bw()

The plots show that the estimated coefficiencies are close to the parameters of 0.005 and 0.1. But they look like really scattered in the plots, which might indicate the model might not fit well.

  1. Submit a link to this document in R Pubs to your Moodle. This assignment is worth 25 points.