This chapter extended the basic multilevel strategy of partial pooling to slopes as well as intercepts. Accomplishing this meant modeling covariation in the statistical population of parameters. The LKJcorr prior was introduced as a convenient family of priors for correlation matrices. You saw how covariance models can be applied to causal inference, using instrumental variables and the front-door criterion. Gaussian processes represent a practical method of extending the varying effects strategy to continuous dimensions of similarity, such as spatial, network, phylogenetic, or any other abstract distance between entities in the data.
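As a quick, hedged illustration of what the LKJcorr prior implies, one can sample correlation matrices with rlkjcorr() from the rethinking package; with eta = 2 the density of the off-diagonal correlation is highest near zero and tapers toward the extremes.
library(rethinking)
R <- rlkjcorr(1e4, K = 2, eta = 2) # 10,000 draws of 2x2 correlation matrices
dens(R[, 1, 2], xlab = "correlation") # mass concentrated away from -1 and 1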
Place each answer inside the code chunk (grey box). The code chunks should contain a text response or code that completes or answers the question or activity requested. Make sure to include plots if the question requests them. Problems are labeled Easy (E), Medium (M), and Hard (H).
Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html, publish the assignment to your RPubs account, and submit the link to Canvas. Each question is worth 5 points.
14E1. Add to the following model varying slopes on the predictor x. \[\begin{align} \ y_i ∼ Normal(μ_i, σ) \\ \ μ_i = α_{group[i]} + βx_i \\ \ α_{group} ∼ Normal(α, σ_α) \\ \ α ∼ Normal(0, 10) \\ \ β ∼ Normal(0, 1) \\ \ σ ∼ Exponential(1) \\ \ σ_α ∼ Exponential(1) \\ \end{align}\]
To add varying slopes, give each group its own β and place a joint MVNormal prior on the intercept-slope pairs; the covariance matrix S combines the two scale parameters with a correlation matrix R, which needs its own (LKJcorr) prior. The question's Exponential(1) priors are kept for the scales:
\[\begin{align}
y_i &\sim Normal(\mu_i, \sigma) \\
\mu_i &= \alpha_{group[i]} + \beta_{group[i]} x_i \\
\begin{bmatrix} \alpha_{group} \\ \beta_{group} \end{bmatrix} &\sim MVNormal\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, S \right) \\
S &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta} \end{pmatrix} R \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta} \end{pmatrix} \\
\alpha &\sim Normal(0, 10) \\
\beta &\sim Normal(0, 1) \\
\sigma &\sim Exponential(1) \\
\sigma_{\alpha} &\sim Exponential(1) \\
\sigma_{\beta} &\sim Exponential(1) \\
R &\sim LKJcorr(2)
\end{align}\]
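For concreteness, a minimal ulam() sketch of the same model, assuming a data list dat with outcome y, predictor x, and an integer group index (the data and model names here are hypothetical):
m_E1 <- ulam(
  alist(
    y ~ normal(mu, sigma),
    mu <- a_group[group] + b_group[group] * x,
    c(a_group, b_group)[group] ~ multi_normal(c(a, b), Rho, sigma_group),
    a ~ normal(0, 10),
    b ~ normal(0, 1),
    sigma_group ~ exponential(1),
    sigma ~ exponential(1),
    Rho ~ lkj_corr(2)
  ),
  data = dat, chains = 4, cores = 4
)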
14E2. Think up a context in which varying intercepts will be positively correlated with varying slopes. Provide a mechanistic explanation for the correlation.
# Think of the cafés example: suppose popular cafés have long waits in the
# morning and stay busy, so their afternoon waits are relatively long too,
# while unpopular cafés are quick at both times of day. Popularity then pushes
# both the varying intercept (morning wait) and the varying slope (afternoon
# difference) in the same direction, producing a positive correlation between
# intercepts and slopes. If busy mornings instead predicted relatively short
# afternoons, the correlation would be negative.
# The correlation measures the strength and direction of the linear
# association between X and Y:
# rho_{X,Y} = corr(X, Y) = cov(X, Y) / (sigma_X * sigma_Y)
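A small simulation sketch of the positive-correlation case (mvrnorm() is from MASS; the numbers are illustrative, echoing the chapter's café parameters):
library(MASS)
Mu <- c(3.5, -1) # average morning wait, average afternoon difference
sigma_a <- 1
sigma_b <- 0.5
rho <- 0.7 # positive correlation between intercepts and slopes
Sigma <- matrix(c(
  sigma_a^2, rho * sigma_a * sigma_b,
  rho * sigma_a * sigma_b, sigma_b^2
), ncol = 2)
ab <- mvrnorm(1e4, Mu, Sigma)
cor(ab[, 1], ab[, 2]) # recovers roughly 0.7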
14E3. When is it possible for a varying slopes model to have fewer effective parameters (as estimated by WAIC or PSIS) than the corresponding model with fixed (unpooled) slopes? Explain.
# A varying-slopes model can have fewer effective parameters than the
# corresponding unpooled model when the adaptive prior shrinks the varying
# intercepts and slopes aggressively toward their means. This happens when
# variation among clusters is small, so the estimated standard deviations are
# small, pooling is strong, and the posterior is much less flexible than the
# raw parameter count suggests; WAIC and PSIS measure that reduced flexibility
# as fewer effective parameters. As the chapter puts it: "you can get a sense
# from the small values that shrinkage is pretty aggressive here. This is what
# takes the model from 56 actual parameters to 18 effective parameters, as
# measured by WAIC."
14M1. Repeat the café robot simulation from the beginning of the chapter. This time, set rho to zero, so that there is no correlation between intercepts and slopes. Plot the posterior distribution. How does the posterior distribution of the correlation reflect this change in the underlying simulation?
# simulation uses mvrnorm() from MASS and ulam() from rethinking (assumed loaded in setup)
a <- 3.5 # average morning wait time
b <- (-1) # average difference afternoon wait time
sigma_a <- 1 # std dev in intercepts
sigma_b <- 0.5 # std dev in slopes
rho <- 0 # correlation between intercepts and slopes
Mu <- c(a, b)
cov_ab <- sigma_a * sigma_b * rho
Sigma <- matrix(c(sigma_a^2, cov_ab, cov_ab, sigma_b^2), ncol = 2)
# simulate observations
N_cafes <- 20
set.seed(6)
vary_effects <- mvrnorm(N_cafes, Mu, Sigma)
a_cafe <- vary_effects[, 1]
b_cafe <- vary_effects[, 2]
N_visits <- 10
afternoon <- rep(0:1, N_visits * N_cafes / 2)
cafe_id <- rep(1:N_cafes, each = N_visits)
mu <- a_cafe[cafe_id] + b_cafe[cafe_id] * afternoon
sigma <- 0.5 # std dev within cafes
wait <- rnorm(N_visits * N_cafes, mu, sigma)
# package into data frame
d <- data.frame(cafe = cafe_id, afternoon = afternoon, wait = wait)
m_M1 <- ulam(
alist(
wait ~ normal(mu, sigma),
mu <- a_cafe[cafe] + b_cafe[cafe] * afternoon,
c(a_cafe, b_cafe)[cafe] ~ multi_normal(c(a, b), Rho, sigma_cafe),
a ~ normal(5, 2),
b ~ normal(-1, 0.5),
sigma_cafe ~ exponential(1),
sigma ~ exponential(1),
Rho ~ lkj_corr(2)
),
data = d, chains = 6, cores = 6
)
post <- extract.samples(m_M1)
# stat_halfeye() is from the ggdist package
ggplot() +
  stat_halfeye(aes(x = post$Rho[, 1, 2])) +
  theme_bw() +
  labs(x = "Rho")
precis(m_M1, pars = "Rho[1,2]")
##                mean         sd        5.5%      94.5%         n_eff       Rhat
## Rho[1,2] 0.01308577 0.23733085 -0.36905311 0.40167809 2612.70739645 0.99993719
# With rho = 0 in the simulation, the posterior for Rho[1,2] is centred on
# zero with wide uncertainty: the model correctly finds no correlation between
# intercepts and slopes, matching the data-generating process.
14M2. Fit this multilevel model to the simulated café data: \[\begin{align} \ W_i ∼ Normal(μ_i, σ) \\ \ μ_i = α_{CAFE[i]} + β_{CAFE[i]}A_i \\ \ α_{CAFE} ∼ Normal(α, σ_α) \\ \ β_{CAFE} ∼ Normal(β, σ_β) \\ \ α ∼ Normal(0, 10) \\ \ β ∼ Normal(0, 10) \\ \ σ, σ_α, σ_β ∼ Exponential(1) \\ \end{align}\] Use WAIC to compare this model to the model from the chapter, the one that uses a multi-variate Gaussian prior. Visualize the differences between the posterior means. Explain the result.
# set up parameters of population
a <- 3.5 # average morning wait time
b <- -1 # average difference afternoon wait time
sigma_a <- 1 # std dev in intercepts
sigma_b <- 0.5 # std dev in slopes
rho <- -0.7 # correlation between intercepts and slopes
Mu <- c(a, b)
cov_ab <- sigma_a * sigma_b * rho
Sigma <- matrix(c(sigma_a^2, cov_ab, cov_ab, sigma_b^2), ncol = 2)
# simulate observations
N_cafes <- 20
set.seed(42)
vary_effects <- mvrnorm(N_cafes, Mu, Sigma)
a_cafe <- vary_effects[, 1]
b_cafe <- vary_effects[, 2]
N_visits <- 10
afternoon <- rep(0:1, N_visits * N_cafes / 2)
cafe_id <- rep(1:N_cafes, each = N_visits)
mu <- a_cafe[cafe_id] + b_cafe[cafe_id] * afternoon
sigma <- 0.5 # std dev within cafes
wait <- rnorm(N_visits * N_cafes, mu, sigma)
# package into data frame
d <- data.frame(cafe = cafe_id, afternoon = afternoon, wait = wait)
m_M2Baseline <- ulam(
alist(
wait ~ normal(mu, sigma),
mu <- a_cafe[cafe] + b_cafe[cafe] * afternoon,
c(a_cafe, b_cafe)[cafe] ~ multi_normal(c(a, b), Rho, sigma_cafe),
a ~ normal(5, 2),
b ~ normal(-1, 0.5),
sigma_cafe ~ exponential(1),
sigma ~ exponential(1),
Rho ~ lkj_corr(2)
),
data = d, chains = 4, cores = 4, log_lik = TRUE
)
m_M2 <- ulam(
alist(
wait ~ dnorm(mu, sigma),
mu <- a_cafe[cafe] + b_cafe[cafe] * afternoon,
a_cafe[cafe] ~ dnorm(a, sigma_alpha),
b_cafe[cafe] ~ dnorm(b, sigma_beta),
a ~ dnorm(0, 10),
b ~ dnorm(0, 10),
sigma ~ dexp(1),
sigma_alpha ~ dexp(1),
sigma_beta ~ dexp(1)
),
data = d, chains = 4, cores = 4, log_lik = TRUE
)
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#tail-ess
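The question asks for a WAIC comparison, which is why both fits above include log_lik = TRUE; compare() (WAIC is its default criterion) can then be run directly. With rho = -0.7 in the simulation, expect the two models to score similarly, the multivariate prior gaining whatever advantage there is by exploiting the intercept-slope correlation that the independent Normal priors ignore.
compare(m_M2Baseline, m_M2)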
post_Base <- extract.samples(m_M2Baseline)
a_Base <- apply(post_Base$a_cafe, 2, mean)
b_Base <- apply(post_Base$b_cafe, 2, mean)
post_M2 <- extract.samples(m_M2)
a_M2 <- apply(post_M2$a_cafe, 2, mean)
b_M2 <- apply(post_M2$b_cafe, 2, mean)
plot(a_M2, b_M2,
  xlab = "intercept", ylab = "slope",
  pch = 16, col = rangi2, ylim = c(min(b_M2) - 0.05, max(b_M2) + 0.05),
  xlim = c(min(a_M2) - 0.1, max(a_M2) + 0.1), cex = 2
)
points(a_Base, b_Base, pch = 1, cex = 2)
# Filled blue points: independent-priors model m_M2; open circles: the
# multivariate-prior baseline. The two sets of posterior means should lie
# close together; where they differ, the multivariate prior has pooled
# information across intercepts and slopes through their negative correlation
# (rho = -0.7 in the simulation), which the independent priors cannot capture.
14M3. Re-estimate the varying slopes model for the UCBadmit data, now using a non-centered parameterization. Compare the efficiency of the forms of the model, using n_eff. Which is better? Which chain sampled faster?
data(UCBadmit)
d <- UCBadmit
dat_list <- list(
admit = d$admit,
applications = d$applications,
male = ifelse(d$applicant.gender == "male", 1, 0),
dept_id = rep(1:6, each = 2)
)
str(dat_list)
## List of 4
## $ admit : int [1:12] 512 89 353 17 120 202 138 131 53 94 ...
## $ applications: int [1:12] 825 108 560 25 325 593 417 375 191 393 ...
## $ male : num [1:12] 1 0 1 0 1 0 1 0 1 0 ...
## $ dept_id : int [1:12] 1 1 2 2 3 3 4 4 5 5 ...
Begin_C <- Sys.time()
m_M3 <- ulam(
alist(
admit ~ dbinom(applications, p),
logit(p) <- a[dept_id] + bm[dept_id] * male,
c(a, bm)[dept_id] ~ multi_normal(c(a_bar, bm_bar), Rho, sigma_dept),
a_bar ~ dnorm(0, 1),
bm_bar ~ dnorm(0, 1),
sigma_dept ~ dexp(1),
Rho ~ dlkjcorr(2)
),
data = dat_list, chains = 4, cores = 4
)
## Warning: There were 1 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
End_C <- Sys.time()
precis(m_M3, 3)
## mean sd 5.5% 94.5% n_eff
## bm[1] -0.76728926 2.673536e-01 -1.1965420 -0.3477217 1056.1396
## bm[2] -0.21514939 3.019798e-01 -0.7070368 0.2541455 1089.9844
## bm[3] 0.08072888 1.454014e-01 -0.1548227 0.3052824 2093.6313
## bm[4] -0.09659289 1.390897e-01 -0.3117404 0.1291158 1300.7489
## bm[5] 0.11771462 1.850484e-01 -0.1749345 0.4103478 1395.8194
## bm[6] -0.12162066 2.537355e-01 -0.5405836 0.2774844 1584.8821
## a[1] 1.28182951 2.537503e-01 0.8708790 1.6866249 1126.4066
## a[2] 0.74551009 3.031217e-01 0.2777246 1.2410216 1085.4284
## a[3] -0.64512741 8.655395e-02 -0.7822318 -0.5043864 2258.8750
## a[4] -0.61699318 1.037801e-01 -0.7829737 -0.4585176 1284.1910
## a[5] -1.12989019 1.117887e-01 -1.3102466 -0.9526836 1516.6141
## a[6] -2.59249602 1.897452e-01 -2.8979317 -2.2773330 1706.2246
## a_bar -0.36228767 5.168624e-01 -1.1780189 0.4939375 2009.5696
## bm_bar -0.17051278 2.097375e-01 -0.5019040 0.1591672 1436.8215
## sigma_dept[1] 1.47087671 4.534269e-01 0.9264733 2.2956434 1634.5424
## sigma_dept[2] 0.44403707 2.146551e-01 0.1848376 0.8030274 811.6385
## Rho[1,1] 1.00000000 0.000000e+00 1.0000000 1.0000000 NaN
## Rho[1,2] -0.33281867 3.276035e-01 -0.7940553 0.2525757 1583.9537
## Rho[2,1] -0.33281867 3.276035e-01 -0.7940553 0.2525757 1583.9537
## Rho[2,2] 1.00000000 8.145354e-17 1.0000000 1.0000000 2000.5663
## Rhat4
## bm[1] 1.0011513
## bm[2] 1.0037393
## bm[3] 0.9995542
## bm[4] 1.0006212
## bm[5] 0.9995703
## bm[6] 1.0020097
## a[1] 1.0009658
## a[2] 1.0043315
## a[3] 1.0003773
## a[4] 1.0007484
## a[5] 0.9983855
## a[6] 1.0021877
## a_bar 1.0035289
## bm_bar 1.0022218
## sigma_dept[1] 1.0023342
## sigma_dept[2] 1.0030660
## Rho[1,1] NaN
## Rho[1,2] 0.9998541
## Rho[2,1] 0.9998541
## Rho[2,2] 0.9979980
Begin_NC <- Sys.time()
m_M3NonCent <- ulam(
alist(
admit ~ dbinom(applications, p),
logit(p) <- a_bar + v[dept_id, 1] + (bm_bar + v[dept_id, 2]) * male,
transpars> matrix[dept_id, 2]:v <- compose_noncentered(sigma_dept, L_Rho, z),
matrix[2, dept_id]:z ~ dnorm(0, 1),
a_bar ~ dnorm(0, 1.5),
bm_bar ~ dnorm(0, 1),
vector[2]:sigma_dept ~ dexp(1),
cholesky_factor_corr[2]:L_Rho ~ lkj_corr_cholesky(2)
),
data = dat_list, chains = 4, cores = 4
)
End_NC <- Sys.time()
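For intuition about the non-centered form: compose_noncentered(sigma_dept, L_Rho, z) constructs the varying effects as the deterministic transform v = (diag(sigma_dept) L_Rho z)^T of standard-normal draws z, which is what simplifies the posterior geometry. A minimal base-R illustration of that transform (all numbers hypothetical):
L <- t(chol(matrix(c(1, -0.3, -0.3, 1), 2))) # lower-triangular Cholesky factor of a correlation matrix
z <- matrix(rnorm(2 * 6), nrow = 2) # standard-normal "raw" effects, one column per department
v <- t(diag(c(1.5, 0.4)) %*% L %*% z) # scaled, correlated varying effects, 6 x 2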
precis(m_M3NonCent, 3)
## mean sd 5.5% 94.5% n_eff Rhat4
## z[1,1] 1.23477592 0.5535698 0.41561508 2.16812164 571.8684 1.0006700
## z[1,2] 0.85622509 0.5099890 0.07290837 1.72021855 558.7171 1.0011555
## z[1,3] -0.14601009 0.3955301 -0.76082181 0.49990436 497.6378 1.0020836
## z[1,4] -0.12489755 0.3970778 -0.76387643 0.54051648 513.8818 1.0009937
## z[1,5] -0.49987532 0.4075483 -1.15619280 0.16385946 534.6285 1.0020257
## z[1,6] -1.55990345 0.5741875 -2.49959622 -0.65410407 688.7347 1.0033130
## z[2,1] -1.18918126 0.8094670 -2.47812398 0.04760238 1211.7757 0.9994392
## z[2,2] 0.17697210 0.8042798 -1.13615757 1.42376307 1862.8265 1.0004954
## z[2,3] 0.63997551 0.6225204 -0.30374341 1.63869613 1148.7481 1.0015636
## z[2,4] 0.13803191 0.5884489 -0.84403657 1.09140834 1415.4605 0.9999269
## z[2,5] 0.59888118 0.6648959 -0.45598216 1.68079773 1288.1685 0.9998901
## z[2,6] -0.42993856 0.8083375 -1.73926892 0.83763542 1278.7622 0.9995814
## a_bar -0.42924099 0.5957797 -1.41079633 0.54191276 484.2472 1.0027684
## bm_bar -0.16531002 0.2144009 -0.50509417 0.15660782 982.2696 1.0033215
## sigma_dept[1] 1.49933212 0.4627462 0.92956692 2.37712056 689.9083 1.0021833
## sigma_dept[2] 0.43980918 0.2261581 0.16390312 0.82048380 671.2762 1.0090505
## L_Rho[1,1] 1.00000000 0.0000000 1.00000000 1.00000000 NaN NaN
## L_Rho[1,2] 0.00000000 0.0000000 0.00000000 0.00000000 NaN NaN
## L_Rho[2,1] -0.31215447 0.3486588 -0.80449754 0.29718413 1593.6956 1.0013815
## L_Rho[2,2] 0.87250654 0.1407112 0.59114977 0.99917694 1329.4438 1.0014389
## v[1,1] 1.69751965 0.6330367 0.72021892 2.72731703 541.4973 1.0024496
## v[1,2] -0.58585292 0.3093785 -1.10712630 -0.12271885 1126.5245 1.0016512
## v[2,1] 1.17160617 0.6542532 0.13254366 2.19781667 592.4121 1.0017427
## v[2,2] -0.04783454 0.3252172 -0.56985585 0.48054848 1766.1048 1.0003284
## v[3,1] -0.21543051 0.6019046 -1.18561590 0.72799206 485.6293 1.0031505
## v[3,2] 0.23920026 0.2402506 -0.12075227 0.62948847 918.4453 1.0047373
## v[4,1] -0.18640503 0.6031192 -1.15052684 0.77804809 497.7053 1.0021949
## v[4,2] 0.07004324 0.2375508 -0.28630451 0.44793376 1138.9613 1.0023857
## v[5,1] -0.70349189 0.6031298 -1.68767250 0.27159266 491.4576 1.0033947
## v[5,2] 0.28077994 0.2637041 -0.08975926 0.73920632 984.8848 1.0027847
## v[6,1] -2.16418301 0.6266241 -3.17210267 -1.14763339 523.9875 1.0034506
## v[6,2] 0.04670092 0.3047181 -0.44809180 0.51224962 1467.0518 0.9996538
# Centred
End_C - Begin_C
## Time difference of 50.13902 secs
# Non-Centred
End_NC - Begin_NC
## Time difference of 37.60459 secs
# The non-centred form sampled faster (roughly 38 s vs 50 s) and produced no
# divergence warning, whereas the centred fit reported one divergent
# transition. The n_eff values are broadly comparable between the two
# parameterizations here, so the non-centred version wins mainly on speed and
# sampling stability.
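To compare efficiency more systematically than eyeballing the tables, the n_eff columns can be pulled out of the precis summaries; this sketch assumes the precis output coerces cleanly to a data frame.
neff_c <- as.data.frame(precis(m_M3, depth = 3))$n_eff
neff_nc <- as.data.frame(precis(m_M3NonCent, depth = 3))$n_eff
summary(neff_c) # effective sample sizes, centred form
summary(neff_nc) # effective sample sizes, non-centred form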
14M4. Use WAIC to compare the Gaussian process model of Oceanic tools to the models fit to the same data in Chapter 11. Pay special attention to the effective numbers of parameters, as estimated by WAIC.
data(Kline)
d <- Kline
# Chapter 11 stuff
d$P <- scale(log(d$population))
d$contact_id <- ifelse(d$contact == "high", 2, 1)
# Chapter 14 stuff
d$society <- 1:10
data(islandsDistMatrix)
## Chapter 11 Model
dat2 <- list(T = d$total_tools, P = d$population, cid = d$contact_id)
m11.11 <- ulam(
alist(
T ~ dpois(lambda),
lambda <- exp(a[cid]) * P^b[cid] / g,
a[cid] ~ dnorm(1, 1),
b[cid] ~ dexp(1),
g ~ dexp(1)
),
data = dat2, chains = 4, cores = 4, log_lik = TRUE
)
## Chapter 14 Model
dat_list <- list(T = d$total_tools, P = d$population, society = d$society, Dmat = islandsDistMatrix)
m14.8 <- ulam(
alist(
T ~ dpois(lambda),
lambda <- (a * P^b / g) * exp(k[society]),
vector[10]:k ~ multi_normal(0, SIGMA),
matrix[10, 10]:SIGMA <- cov_GPL2(Dmat, etasq, rhosq, 0.01),
c(a, b, g) ~ dexp(1),
etasq ~ dexp(2),
rhosq ~ dexp(0.5)
),
data = dat_list, chains = 4, cores = 4, iter = 2000, log_lik = TRUE
)
compare(m11.11, m14.8)
##            WAIC        SE    dWAIC      dSE    pWAIC      weight
## m14.8  67.11598  2.299248  0.00000       NA 3.876694 0.998722564
## m11.11 80.43922 11.350980 13.32324 11.41478 5.065518 0.001277436
# Note the pWAIC column: although m14.8 adds 10 varying society effects plus
# the GP hyperparameters, its effective number of parameters (~3.9) is lower
# than m11.11's (~5.1). The Gaussian process prior regularizes the varying
# effects adaptively, so each counts as much less than one free parameter,
# and m14.8 carries nearly all of the model weight.
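To see where the GP model earns its smaller penalty, WAIC can be computed pointwise (pointwise = TRUE is an option of rethinking's WAIC()); each element of penalty is one society's contribution to pWAIC.
WAIC(m14.8, pointwise = TRUE)$penalty
WAIC(m11.11, pointwise = TRUE)$penalty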
14M5. Modify the phylogenetic distance example to use group size as the outcome and brain size as a predictor. Plot the estimates. Assuming brain size influences group size, what is your estimate of the effect? How does phylogeny influence the estimate?
data(Primates301)
d <- Primates301
d$name <- as.character(d$name)
dstan <- d[complete.cases(d$group_size, d$body, d$brain), ]
spp_obs <- dstan$name
dat_list <- list(
N_spp = nrow(dstan),
M = standardize(log(dstan$body)),
B = standardize(log(dstan$brain)),
G = standardize(log(dstan$group_size)),
Imat = diag(nrow(dstan))
)
data(Primates301_nex)
# keep.tip(), corBrownian(), vcv(), and cophenetic() below come from the ape package
tree_trimmed <- keep.tip(Primates301_nex, spp_obs) # keep only the tips for the species in our data
Rbm <- corBrownian(phy = tree_trimmed) # expected correlation structure under a Brownian-motion model
V <- vcv(Rbm) # compute expected variances and covariances
## Warning in Initialize.corPhyl(phy, dummy.df): No covariate specified, species
## will be taken as ordered in the data frame. To avoid this message, specify a
## covariate containing the species names with the 'form' argument.
Dmat <- cophenetic(tree_trimmed) # cophenetic distance matrix
dat_list$V <- V[spp_obs, spp_obs] # covariances in speciesXspecies matrix
dat_list$R <- dat_list$V / max(V) # relative covariances of speciesXspecies matrix
m_M5Ordi <- ulam(
alist(
G ~ multi_normal(mu, SIGMA),
mu <- a + bM * M + bB * B,
matrix[N_spp, N_spp]:SIGMA <- Imat * sigma_sq,
a ~ normal(0, 1),
c(bM, bB) ~ normal(0, 0.5),
sigma_sq ~ exponential(1)
),
data = dat_list, chains = 4, cores = 4
)
m_M5Brown <- ulam(
alist(
G ~ multi_normal(mu, SIGMA),
mu <- a + bM * M + bB * B,
matrix[N_spp, N_spp]:SIGMA <- R * sigma_sq,
a ~ normal(0, 1),
c(bM, bB) ~ normal(0, 0.5),
sigma_sq ~ exponential(1)
),
data = dat_list, chains = 4, cores = 4
)
dat_list$Dmat <- Dmat[spp_obs, spp_obs] / max(Dmat)
m_M5GP <- ulam(
alist(
G ~ multi_normal(mu, SIGMA),
mu <- a + bM * M + bB * B,
matrix[N_spp, N_spp]:SIGMA <- cov_GPL1(Dmat, etasq, rhosq, 0.01),
a ~ normal(0, 1),
c(bM, bB) ~ normal(0, 0.5),
etasq ~ half_normal(1, 0.25),
rhosq ~ half_normal(3, 0.25)
),
data = dat_list, chains = 4, cores = 4
)
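As a sanity check on the GP hyperparameter priors, the implied L1 kernel at roughly the prior means (etasq = 1, rhosq = 3 under the half-normal priors above) can be drawn directly, since cov_GPL1 declines exponentially with distance:
curve(1 * exp(-3 * x),
  from = 0, to = 1,
  xlab = "scaled phylogenetic distance", ylab = "prior covariance"
)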
plot(coeftab(m_M5Ordi, m_M5Brown, m_M5GP), pars = c("bM", "bB"))
# bB is the effect of (log, standardized) brain size on group size, controlling
# for body mass. If these fits follow the chapter's pattern, the ordinary model
# (identity covariance) estimates a clearly positive bB, while the Brownian and
# GP models, which allow closely related species to covary, pull bB toward
# zero: much of the raw brain-group association is shared phylogenetic history.
14H1. Let’s revisit the Bangladesh fertility data, data(bangladesh), from the practice problems for Chapter 13. Fit a model with both varying intercepts by district_id and varying slopes of urban by district_id. You are still predicting use.contraception. Inspect the correlation between the intercepts and slopes. Can you interpret this correlation, in terms of what it tells you about the pattern of contraceptive use in the sample? It might help to plot the mean (or median) varying effect estimates for both the intercepts and slopes, by district. Then you can visualize the correlation and maybe more easily think through what it means to have a particular correlation. Plotting predicted proportion of women using contraception, with urban women on one axis and rural on the other, might also help.
data(bangladesh)
d <- bangladesh
dat_list <- list(
C = d$use.contraception,
did = as.integer(as.factor(d$district)),
urban = d$urban
)
m_H1 <- ulam(
alist(
C ~ bernoulli(p),
logit(p) <- a[did] + b[did] * urban,
c(a, b)[did] ~ multi_normal(c(abar, bbar), Rho, Sigma),
abar ~ normal(0, 1),
bbar ~ normal(0, 0.5),
Rho ~ lkj_corr(2),
Sigma ~ exponential(1)
),
data = dat_list, chains = 4, cores = 4, iter = 4000
)
precis(m_H1)
## mean sd 5.5% 94.5% n_eff Rhat4
## abar -0.6868581 0.09928555 -0.8473043 -0.5313484 6015.311 0.9999369
## bbar 0.6409046 0.15988868 0.3859150 0.8960423 4644.005 0.9999363
precis(m_H1, depth = 3, pars = c("Rho", "Sigma"))
##               mean           sd       5.5%      94.5%     n_eff     Rhat4
## Rho[1,1]  1.0000000 0.000000e+00  1.0000000  1.0000000       NaN       NaN
## Rho[1,2] -0.6488515 1.640059e-01 -0.8611723 -0.3451257 1565.7005 1.0011707
## Rho[2,1] -0.6488515 1.640059e-01 -0.8611723 -0.3451257 1565.7005 1.0011707
## Rho[2,2]  1.0000000 5.868555e-17  1.0000000  1.0000000 6913.8832 0.9994999
## Sigma[1]  0.5733948 9.666828e-02  0.4289012  0.7355289 2096.9227 1.0013255
## Sigma[2]  0.7701551 2.068129e-01  0.4595600  1.1100284  883.0168 1.0026664
# The intercept-slope correlation is strongly negative (posterior mean about
# -0.65): districts with higher baseline (rural) contraceptive use tend to
# have smaller urban slopes, so the urban-rural gap is largest in districts
# where rural use is lowest.
post <- extract.samples(m_H1)
a <- apply(post$a, 2, mean)
b <- apply(post$b, 2, mean)
plot(a, b, xlab = "a (intercept)", ylab = "b (urban slope)")
abline(h = 0, lty = 2)
abline(v = 0, lty = 2)
R <- apply(post$Rho, 2:3, mean)
s <- apply(post$Sigma, 2, mean)
S <- diag(s) %*% R %*% diag(s)
# overlay posterior covariance ellipses; ellipse() is from the ellipse package
ll <- c(0.5, 0.67, 0.89, 0.97)
for (l in ll) {
  el <- ellipse(S, centre = c(mean(post$abar), mean(post$bbar)), level = l)
  lines(el, col = "black", lwd = 0.5)
}
u0 <- inv_logit(a) # predicted proportion using contraception when rural
u1 <- inv_logit(a + b) # predicted proportion when urban
plot(u0, u1, xlim = c(0, 1), ylim = c(0, 1), xlab = "urban = 0", ylab = "urban = 1")
abline(h = 0.5, lty = 2)
abline(v = 0.5, lty = 2)
# Each point is a district. Because abar is about -0.69, rural use sits below
# 0.5 in the typical district, and because bbar is about 0.64, urban use tends
# to be higher than rural use within the same district.