The statistical model:
\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)
Where:
Let’s define the parameters:
nstand <- 5; nplot <- 4
b0 <- -1; b1 <- 0.005; b2 <- 0.1
sds <- 2; sd <- 1
Simulate other variables:
set.seed(16)
stand <- rep(LETTERS[1:nstand], each = nplot)
standeff <- rep(rnorm(nstand, 0, sds), each = nplot)
ploteff <- rnorm(nstand*nplot, 0, sd)
Simulate elevation and slope:
elevation <- rep(runif(nstand, 1000, 1500), each = nplot)
slope <- runif(nstand*nplot, 2, 75)
Simulate response variable:
resp2 <- b0 + b1*elevation + b2*slope + standeff + ploteff
Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):
fit2 <- lmer(resp2 ~ elevation + slope + (1|stand)) #the linear mixed model
options(scipen=100,digits=4)
tidy(fit2) %>% kable() %>% kable_styling()
| term | estimate | std.error | statistic | group |
|---|---|---|---|---|
| (Intercept) | -21.3146 | 6.6021 | -3.228 | fixed |
| elevation | 0.0206 | 0.0049 | 4.190 | fixed |
| slope | 0.0951 | 0.0164 | 5.785 | fixed |
| sd_(Intercept).stand | 1.0989 | NA | NA | stand |
| sd_Observation.Residual | 1.1653 | NA | NA | Residual |
ANSWER: Estimated β2 (0.0951) is close to the initial parameter 0.1, but estimated β0 (-21.3146) and estimated β1 (0.0206) is both not close to the initial parameters -1 and 0.005 respectively.
linearmixed_fun <- function(nstand=5, nplot=4, b0=-1, b1=0.005, b2=0.1, sigma_s=2, sigma=1) {
stand <- rep(LETTERS[1:nstand], each = nplot)
standeff <- rep(rnorm(nstand, 0, sigma_s), each = nplot)
ploteff <- rnorm(nstand*nplot, 0, sigma)
elevation <- rep(runif(nstand, 1000, 1500), each = nplot)
slope <- runif(nstand*nplot, 2, 75)
resp2 <- b0 + b1*elevation + b2*slope + standeff + ploteff
fit2 <- lmer(resp2 ~ elevation + slope + (1|stand))
}
times <- 1000
set.seed(16)
sims <- rerun(times, linearmixed_fun()) #omit messages "boundary (singular) fit: see ?isSingular"
stand_var <- sims %>%
map_dfr(tidy, effects="ran_pars", scales="vcov") %>%
filter(group=="stand")
stand_var$sequence <- rownames(stand_var)
res_var <- sims %>%
map_dfr(tidy, effects="ran_pars", scales="vcov") %>%
filter(group=="Residual")
res_var$sequence <- rownames(res_var)
dat <- merge(stand_var[c(1:6),], res_var[c(1:6),], by="sequence", all=T)
dat %>% kable() %>% kable_styling()
| sequence | term.x | group.x | estimate.x | term.y | group.y | estimate.y |
|---|---|---|---|---|---|---|
| 1 | var_(Intercept).stand | stand | 1.2075 | var_Observation.Residual | Residual | 1.3579 |
| 2 | var_(Intercept).stand | stand | 5.5570 | var_Observation.Residual | Residual | 0.9514 |
| 3 | var_(Intercept).stand | stand | 2.6105 | var_Observation.Residual | Residual | 1.1057 |
| 4 | var_(Intercept).stand | stand | 9.7313 | var_Observation.Residual | Residual | 1.3650 |
| 5 | var_(Intercept).stand | stand | 0.8269 | var_Observation.Residual | Residual | 0.9138 |
| 6 | var_(Intercept).stand | stand | 9.5880 | var_Observation.Residual | Residual | 1.3158 |
stand_sims <- c(5,20,100) %>%
set_names() %>%
map(~rerun(times, linearmixed_fun(nstand=.x)))
stand_vars <- stand_sims %>%
modify_depth(2, ~tidy(.x, effects="ran_pars", scales="vcov")) %>%
map_dfr(bind_rows, .id="num") %>%
filter(group=="stand")
stand_vars <- mutate(stand_vars, num=fct_inorder(num))
groupmed <- stand_vars %>%
group_by(num) %>%
summarise(medvar=median(estimate))
f1 <- ggplot(stand_vars, aes(x=estimate))+
geom_density(fill="orange", alpha=0.5)+
facet_wrap(~num, labeller=as_labeller(function(x) paste("Number of stands:", x, sep=" ")))+
geom_vline(aes(xintercept=sds^2, linetype="True Variance"), size=0.4)+
geom_vline(data=groupmed, aes(xintercept=medvar, linetype="Median Variance"), size=0.4)+
scale_linetype_manual(name="", values=c(2,1))+
labs(x="Estimated Variance", y="Density")+
theme_bw()
f1
ANSWER: Sample size is respectively chosen as 5, 20, 100. As sample size increases, variance will decrease.
coef <- sims %>%
map_dfr(tidy, effects="fixed") %>%
filter(term %in% c("elevation", "slope"))
coef$sequence <- rep(1:times, each=2)
coefmed <- coef %>%
group_by(term) %>%
summarise(med=median(estimate))
f2 <- ggplot(coef, aes(sequence, estimate))+
geom_point(size=0.2)+
facet_wrap(~term)+
geom_hline(data=coefmed, aes(yintercept=med), size=0.4)+
labs(x="Sequence number", y="Estimated coefficients")+
theme_bw()
f2
ANSWER: The median values of the estimated coefficients are shown as line whose values 0.00472 and 0.0998 are close to the initial parameters 0.005 and 0.1. However, the estimated coefficients β1 and β2 are much scattered in each plot. Some estimated β1 are even negative. The model may not fit well in this case.