The statistical model:
\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)
Where:
Let’s define the parameters:
nstand = 5
nplot = 4
b0 = -1
b1 = .005
b2 = .1
sds = 2
sd = 1
Simulate other variables:
set.seed(16)
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep( rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)
Simulate elevation and slope:
elevation = rep( runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)
Simulate response variable:
resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff
Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):
if (!require("lme4")) {
install.packages("lme4")
library(lme4)
}
## Loading required package: lme4
## Loading required package: Matrix
fit = lmer(resp2 ~ elevation + slope + (1|stand))
fit
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## REML criterion at convergence: 81.9874
## Random effects:
## Groups Name Std.Dev.
## stand (Intercept) 1.099
## Residual 1.165
## Number of obs: 20, groups: stand, 5
## Fixed Effects:
## (Intercept) elevation slope
## -21.31463 0.02060 0.09511
summary(fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
##
## REML criterion at convergence: 82
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.65583 -0.62467 -0.01693 0.53669 1.41736
##
## Random effects:
## Groups Name Variance Std.Dev.
## stand (Intercept) 1.208 1.099
## Residual 1.358 1.165
## Number of obs: 20, groups: stand, 5
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -21.314628 6.602050 -3.228
## elevation 0.020600 0.004916 4.190
## slope 0.095105 0.016441 5.785
##
## Correlation of Fixed Effects:
## (Intr) elevtn
## elevation -0.991
## slope 0.049 -0.148
Answer: The fitted linear model have an estimated b2 at 0.095 which is close to the initial value 0.1, however the estimated b1 and b0 are different the initial values.
times = 1000
sims_func = function(nstand = 5, nplot = 4, b0 = -1, b1 = .005, b2 = 0.1, sds = 2, sd = 1) {
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep(rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)
elevation = rep(runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)
resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff
fit2 = lmer(resp2 ~ elevation + slope + (1|stand))
}
sims1000 = replicate(n = times, expr = sims_func())
if (!require("broom")) {
install.packages("broom")
library(broom)
}
## Loading required package: broom
## Warning: package 'broom' was built under R version 3.5.2
if (!require("tidyverse")) {
install.packages("tidyverse")
library(tidyverse)
}
## Loading required package: tidyverse
## Warning: package 'tidyverse' was built under R version 3.5.2
## -- Attaching packages ---------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0 v purrr 0.2.5
## v tibble 1.4.2 v dplyr 0.7.8
## v tidyr 0.8.2 v stringr 1.3.1
## v readr 1.3.0 v forcats 0.3.0
## Warning: package 'ggplot2' was built under R version 3.5.2
## -- Conflicts ------------------------------------------------------------------------------- tidyverse_conflicts() --
## x tidyr::expand() masks Matrix::expand()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
variances = sims1000 %>% map_dfr(tidy, effects = "ran_pars", scales = "vcov")
variances %>% print(n = 6)
## # A tibble: 2,000 x 3
## term group estimate
## <chr> <chr> <dbl>
## 1 var_(Intercept).stand stand 5.56
## 2 var_Observation.Residual Residual 0.951
## 3 var_(Intercept).stand stand 2.61
## 4 var_Observation.Residual Residual 1.11
## 5 var_(Intercept).stand stand 9.73
## 6 var_Observation.Residual Residual 1.36
## # ... with 1,994 more rows
Answer: With the sample size increases the variance will decrease.
if (!require("furrr")) {
install.packages("furrr")
library(furrr)
}
## Loading required package: furrr
## Warning: package 'furrr' was built under R version 3.5.3
## Loading required package: future
## Warning: package 'future' was built under R version 3.5.3
coef = sims1000 %>%
future_map(tidy, effects = "fixed") %>%
bind_rows()
med_elevation = median(coef$estimate[coef$term=="elevation"])
med_elevation
## [1] 0.00469506
med_slope = median(coef$estimate[coef$term=="slope"])
med_slope
## [1] 0.09985177
mean_elevation = mean(coef$estimate[coef$term=="elevation"])
mean_elevation
## [1] 0.004896685
mean_slope = mean(coef$estimate[coef$term=="slope"])
mean_slope
## [1] 0.09969903
coef %>%
dplyr::filter(term %in% c("elevation", "slope")) %>%
group_by(term) %>%
mutate(series = 1 : times) %>%
ungroup() %>%
mutate(med_intercept = ifelse(term == "elevation", med_elevation, med_slope)) %>%
mutate(mean_intercept = ifelse(term == "elevation", mean_elevation, mean_slope)) %>%
ggplot(aes(x = series, y = estimate)) +
geom_line() +
facet_wrap(~term) +
geom_hline(aes(yintercept = med_intercept, color = term), linetype = 4, size = 0.3) +
geom_hline(aes(yintercept = mean_intercept, color = term), size = 0.3) +
labs(x="Series", y="Estimated Coefficients") +
theme_bw()
Answer: Mean elevation is 0.004897, mean slope is 0.099699, median elevation is 0.004695, and median slope us 0.099852. Both mean and median value of 1000 times of simulation are close to the initial value or 0.005 and 0.1. However, their distribution are various and scattered with some extreme values.
Sure