The statistical model:
\(y_t = \beta_0 + \beta_1 * (Elevation_s)_t + \beta_2 * Slope_t + (b_s)_t + \epsilon_t\)
Where:
Let’s define the parameters:
nstand = 5
nplot = 4
b0 = -1
b1 = .005
b2 = .1
sds = 2
sd = 1
Simulate other variables:
set.seed(16)
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep( rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)
Simulate elevation and slope:
elevation = rep( runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)
Simulate response variable:
resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff
Your tasks (complete each task in its’ own code chunk, make sure to use echo=TRUE so I can see your code):
library(Matrix)
library(lme4)
resp2 <- b0 + b1*elevation + b2*slope + standeff + ploteff
m1 <- lmer(resp2 ~ elevation + slope +(1|stand))
m1
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
## REML criterion at convergence: 81.9874
## Random effects:
## Groups Name Std.Dev.
## stand (Intercept) 1.099
## Residual 1.165
## Number of obs: 20, groups: stand, 5
## Fixed Effects:
## (Intercept) elevation slope
## -21.31463 0.02060 0.09511
summary(m1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: resp2 ~ elevation + slope + (1 | stand)
##
## REML criterion at convergence: 82
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.65583 -0.62467 -0.01693 0.53669 1.41736
##
## Random effects:
## Groups Name Variance Std.Dev.
## stand (Intercept) 1.208 1.099
## Residual 1.358 1.165
## Number of obs: 20, groups: stand, 5
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -21.314628 6.602053 -3.228
## elevation 0.020600 0.004916 4.190
## slope 0.095105 0.016441 5.785
##
## Correlation of Fixed Effects:
## (Intr) elevtn
## elevation -0.991
## slope 0.049 -0.148
Defined intercept(b0) is -1 which is totally different than estimated intercept -21.314628. Defined Elevation(b1) is .005 which is much smaller than estimated elevation 0.020600. Defined slope(b1) is 0.1 which is quite close to estimated slope 0.095105
set.seed(16)
m1_function = function(nstand = 5, nplot = 4, b0 = -1, b1 = 0.005, b2 = .1, sds = 2, sd = 1)
{
stand = rep(LETTERS[1:nstand], each = nplot)
standeff = rep(rnorm(nstand, 0, sds), each = nplot)
ploteff = rnorm(nstand*nplot, 0, sd)
elevation = rep(runif(nstand, 1000, 1500), each = nplot)
slope = runif(nstand*nplot, 2, 75)
resp2 = b0 + b1*elevation + b2*slope + standeff + ploteff
dframe = data.frame(resp2, elevation, slope, stand)
m1 = lmer(resp2 ~ elevation + slope + (1|stand), data = dframe)
}
set.seed(16)
m1_simul = replicate (1000, expr = m1_function())
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.1 ✔ purrr 0.3.2
## ✔ tibble 2.1.3 ✔ dplyr 0.8.1
## ✔ tidyr 0.8.3 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ───────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::expand() masks Matrix::expand()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(purrr)
library(tidyr)
library(broom)
tidy(m1, effects = "ran_pars", scales = "vcov")
## # A tibble: 2 x 3
## term group estimate
## <chr> <chr> <dbl>
## 1 var_(Intercept).stand stand 1.21
## 2 var_Observation.Residual Residual 1.36
library("broom")
head(map_dfr(m1_simul, broom::tidy, .id = "model", effects = "ran_pars"))
## # A tibble: 6 x 4
## model term group estimate
## <chr> <chr> <chr> <dbl>
## 1 1 sd_(Intercept).stand stand 1.10
## 2 1 sd_Observation.Residual Residual 1.17
## 3 2 sd_(Intercept).stand stand 2.36
## 4 2 sd_Observation.Residual Residual 0.975
## 5 3 sd_(Intercept).stand stand 1.62
## 6 3 sd_Observation.Residual Residual 1.05
m1_sims <- c(5, 20, 100) %>% set_names() %>% map(~replicate (1000, m1_function(nstand = .x)))
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
## boundary (singular) fit: see ?isSingular
library(dplyr)
var_stand = m1_sims %>% modify_depth(2, ~tidy(.x, effects = "ran_pars", scales = "vcov")) %>% map_dfr(bind_rows, .id = "stand_number") %>% filter (group == "stand")
head(var_stand)
## # A tibble: 6 x 4
## stand_number term group estimate
## <chr> <chr> <chr> <dbl>
## 1 5 var_(Intercept).stand stand 6.91
## 2 5 var_(Intercept).stand stand 1.92
## 3 5 var_(Intercept).stand stand 12.0
## 4 5 var_(Intercept).stand stand 6.84
## 5 5 var_(Intercept).stand stand 2.89
## 6 5 var_(Intercept).stand stand 2.52
library(forcats)
var_stand = mutate(var_stand, stand_number = fct_inorder(stand_number))
add_prefix <- function(string) { paste(" #Stands= ", string, sep = " ")}
med_group = var_stand %>% group_by(stand_number) %>% summarise(mvar = median(estimate))
library(ggplot2)
ggplot(var_stand, aes(x = estimate)) + geom_density(fill = "blue", alpha = 0.25) + facet_wrap(~stand_number, labeller = as_labeller(add_prefix)) + geom_vline(aes(xintercept = 4, linetype = "True Variance"), size = 0.6) + geom_vline(data = med_group, aes(xintercept = mvar, linetype = "Median Variance"), size = 0.5) + theme_bw() + scale_linetype_manual(name = "", values = c(2,1)) + theme(legend.position = "bottom", legend.key.width = unit(.1, "cm")) + labs(x = "Estimated Variance", y = "Density (Probability)")
When sample size increases, variance decreases. As the sample size increases from 5 to 20 its more evident but not so much when it increases from 20 to 100. Hence the threshold of the relationship between true variance and estimated is 20. So increase in sample will not have any effect on accuracy of variance estimate.
coeff = m1_simul %>%
map_dfr(tidy, effects = "fixed") %>%
filter(term %in% c("elevation","slope")) %>%
group_by(term)
coeff
## # A tibble: 2,000 x 4
## # Groups: term [2]
## term estimate std.error statistic
## <chr> <dbl> <dbl> <dbl>
## 1 elevation 0.0206 0.00492 4.19
## 2 slope 0.0951 0.0164 5.78
## 3 elevation -0.00546 0.00927 -0.589
## 4 slope 0.0868 0.0140 6.21
## 5 elevation 0.0152 0.00426 3.57
## 6 slope 0.121 0.0157 7.73
## 7 elevation 0.0156 0.0117 1.33
## 8 slope 0.124 0.0166 7.48
## 9 elevation 0.0130 0.00336 3.87
## 10 slope 0.0937 0.0100 9.32
## # … with 1,990 more rows
library(dplyr)
mean = coeff %>%
summarise_at("estimate", funs( mean))
mean
## # A tibble: 2 x 2
## term estimate
## <chr> <dbl>
## 1 elevation 0.00492
## 2 slope 0.0997
plot = ggplot(coeff, aes(x=rep(1:1000, each=2),y=estimate))+ geom_point(size=0.3) + facet_wrap(~term) + geom_hline(data=mean,aes(yintercept=mean$estimate,color = term), size = 0.9) + labs(x=" Numbers that replicated", y="Evaluated coefficients")
plot
The mean estimates of coefficient (elevation and slope) are 0.004 and 0.099 respectively which are close to the initial parameters of 0.005 and 0.1.