Chapter 13 - Models With Memory

This chapter has been an introduction to the motivation, implementation, and interpretation of basic multilevel models. It focused on varying intercepts, which achieve better estimates of baseline differences among clusters in the data. They achieve better estimates because they simultaneously model the population of clusters and use inferences about the population to pool information among parameters. From another perspective, varying intercepts are adaptively regularized parameters, relying upon a prior that is itself learned from the data.

Place each answer inside the code chunk (grey box). The code chunks should contain a text response or code that completes/answers the question or activity requested. Make sure to include plots if the question requests them. Problems are labeled Easy (E), Medium (M), and Hard (H).

Finally, upon completion, name your final output .html file YourName_ANLY505-Year-Semester.html, publish the assignment to your RPubs account, and submit the link to Canvas. Each question is worth 5 points.

Questions

13E1. Which of the following priors will produce more shrinkage in the estimates? Show the prior plot. \[\begin{align} α_{TANK} &∼ Normal(0, 1) \tag{a} \\ α_{TANK} &∼ Normal(0, 2) \tag{b} \end{align}\]

curve(dnorm(x,0,1), from=-5, to=5, ylab="Density") #solid line
curve(dnorm(x,0,2), add=TRUE, lty=2) #dashed line

# Shrinkage comes from the tightness of the regularizing prior: prior (a) has
# the smaller standard deviation, so it concentrates more mass near 0 and will
# produce more shrinkage. The plot shows this directly: the solid Normal(0, 1)
# density is more concentrated around 0 than the dashed Normal(0, 2) density.
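
# To make the shrinkage concrete, here is a small grid-approximation sketch.
# The tank counts (7 survivors out of 10) are made up purely for illustration,
# and inv_logit() comes from the rethinking package, assumed loaded as
# elsewhere in this document.
grid <- seq(-5, 5, length.out = 1000)
lik <- dbinom(7, size = 10, prob = inv_logit(grid))  # likelihood on the log-odds grid
post_a <- lik * dnorm(grid, 0, 1)                    # posterior under prior (a)
post_b <- lik * dnorm(grid, 0, 2)                    # posterior under prior (b)
sum(grid * post_a) / sum(post_a)  # posterior mean under (a): pulled closer to 0
sum(grid * post_b) / sum(post_b)  # posterior mean under (b): closer to logit(0.7)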

13E2. Rewrite the following model as a multilevel model. \[\begin{align} y_i ∼ Binomial(1, p_i) \\ logit(p_i) = α_{group[i]} + βx_i \\ α_{group} ∼ Normal(0, 1.5) \\ β ∼ Normal(0, 0.5) \\ \end{align}\]

# y_i ~ Binomial(1, p_i)
# logit(p_i) = α_{group[i]} + βx_i
# α_{group} ~ Normal(α_bar, σ_α)
# α_bar ~ Normal(0, 1.5)
# β ~ Normal(0, 0.5)
# σ_α ~ Exponential(1)
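
# A sketch of how the multilevel version maps to ulam code, fit to simulated
# data so the chunk is self-contained; the simulated y, x, and group index g
# are stand-ins for whatever the real data would contain.
set.seed(13)
g <- rep(1:5, each = 20)  # 5 groups, 20 observations each
x <- rnorm(100)
a_true <- rnorm(5, 0, 1)
y <- rbinom(100, 1, inv_logit(a_true[g] + 0.5 * x))
m_13e2 <- ulam(
  alist(
    y ~ dbinom(1, p),
    logit(p) <- a[g] + b * x,
    a[g] ~ dnorm(a_bar, sigma_a),  # adaptive prior pools across groups
    a_bar ~ dnorm(0, 1.5),
    b ~ dnorm(0, 0.5),
    sigma_a ~ dexp(1)
  ), data = list(y = y, x = x, g = g), chains = 4
)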

13E3. Rewrite the following model as a multilevel model. \[\begin{align} y_i ∼ Normal(μ_i, σ) \\ μ_i = α_{group[i]} + βx_i \\ α_{group} ∼ Normal(0, 5) \\ β ∼ Normal(0, 1) \\ σ ∼ Exponential(1) \\ \end{align}\]

# y_i ∼ Normal(μ_i, σ)
# μ_i = α_{group[i]} + βx_i
# α_{group} ∼ Normal(α_bar, σ_α)
# α_bar ~ Normal(0, 5)
# β ∼ Normal(0, 1)
# σ ∼ Exponential(1)
# σ_α ~ Exponential(1)

13E4. Write a mathematical model formula for a Poisson regression with varying intercepts.

# y_i ~ Poisson(λ_i)
# log(λ_i) = α_{group[i]} + βx_i
# α_{group} ~ Normal(α_bar, σ_α)
# α_bar ~ Normal(0, 1)
# β ~ Normal(0, 1)
# σ_α ~ Exponential(1)

13E5. Write a mathematical model formula for a Poisson regression with two different kinds of varying intercepts, a cross-classified model.

# y_i ~ Poisson(λ_i)
# log(λ_i) = α_bar + α_{group[i]} + α_{var[i]} + βx_i
# α_{group} ~ Normal(0, σ_{group})
# α_{var} ~ Normal(0, σ_{var})
# α_bar ~ Normal(0, 1)
# β ~ Normal(0, 1)
# σ_{group} ~ Exponential(1)
# σ_{var} ~ Exponential(1)
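
# A sketch of the cross-classified model in ulam, again with simulated data
# standing in for the two cluster types (g and v are hypothetical indices):
set.seed(13)
N <- 200
g <- sample(1:10, N, replace = TRUE)
v <- sample(1:7, N, replace = TRUE)
x <- rnorm(N)
y <- rpois(N, exp(1 + rnorm(10, 0, 0.5)[g] + rnorm(7, 0, 0.3)[v] + 0.2 * x))
m_13e5 <- ulam(
  alist(
    y ~ dpois(lambda),
    log(lambda) <- a_bar + a_g[g] + a_v[v] + b * x,
    a_g[g] ~ dnorm(0, sigma_g),  # both intercept types centered on 0 ...
    a_v[v] ~ dnorm(0, sigma_v),
    a_bar ~ dnorm(0, 1),         # ... so only one grand mean is estimated
    b ~ dnorm(0, 1),
    sigma_g ~ dexp(1),
    sigma_v ~ dexp(1)
  ), data = list(y = y, g = g, v = v, x = x), chains = 4
)
# Only one overall mean a_bar appears: giving each cluster type its own mean
# would leave the overall intercept non-identified (see 13M5).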

13M1. Revisit the Reed frog survival data, data(reedfrogs), and add the predation and size treatment variables to the varying intercepts model. Consider models with either main effect alone, both main effects, as well as a model including both and their interaction. Instead of focusing on inferences about these two predictor variables, focus on the inferred variation across tanks. Explain why it changes as it does across models. Plot the sigma estimates.

data(reedfrogs)
d <- reedfrogs
str(d)
## 'data.frame':    48 obs. of  5 variables:
##  $ density : int  10 10 10 10 10 10 10 10 10 10 ...
##  $ pred    : Factor w/ 2 levels "no","pred": 1 1 1 1 1 1 1 1 2 2 ...
##  $ size    : Factor w/ 2 levels "big","small": 1 1 1 1 2 2 2 2 1 1 ...
##  $ surv    : int  9 10 7 10 9 9 10 9 4 9 ...
##  $ propsurv: num  0.9 1 0.7 1 0.9 0.9 1 0.9 0.4 0.9 ...
?reedfrogs

#dummy variables
d$predator <- ifelse( d$pred=="no" , 0 , 1 )
d$big <- ifelse( d$size=="big" , 1 , 0 )

d$tank <- 1:nrow(d)

dat_df <- list(
  S = d$surv,
  n = d$density,
  tank = 1:nrow(d),
  pred = ifelse(d$pred == "no", 0L, 1L),
  sizes = ifelse(d$size == "small", 1L, 2L)
)

# Tank only model
m_tank <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a[tank],
    a[tank] ~ dnorm(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
    ),
  data = dat_df, chains = 4, cores = 1, log_lik = TRUE, iter = 2e3
)
# Predation-only model
m_pred <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a[tank] + bp * pred,
    a[tank] ~ dnorm(a_bar, sigma),
    bp ~ dnorm(-0.5, 1),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
    ),
  data = dat_df, chains = 4, cores = 1, log_lik = TRUE, iter = 2e3
)
# Size-only model
m_size <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a[tank] + s[sizes],
    a[tank] ~ dnorm(a_bar, sigma),
    s[sizes] ~ dnorm(0, 0.5),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
    ),
  data = dat_df, chains = 4, cores = 1, log_lik = TRUE, iter = 2e3
)
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#bulk-ess
# Predation + Size model
m_add <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a[tank] + bp * pred + s[sizes],
    a[tank] ~ dnorm(a_bar, sigma),
    bp ~ dnorm(-0.5, 1),
    s[sizes] ~ dnorm(0, 0.5),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
    ),
  data = dat_df, chains = 4, cores = 1, log_lik = TRUE, iter = 2e3
)
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#bulk-ess
# Predation + Size + Interaction model: the interaction enters by letting the
# predation slope vary with size class (bp[sizes]); the tank intercepts are
# written in non-centered form, a_bar + sigma * a[tank]
m_inter <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a_bar + a[tank] * sigma + bp[sizes] * pred + s[sizes],
    a[tank] ~ dnorm(0, 1),       # standardized tank offsets
    bp[sizes] ~ dnorm(-0.5, 1),  # one predation slope per size class
    s[sizes] ~ dnorm(0, 0.5),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = dat_df, chains = 4, cores = 1, log_lik = TRUE, iter = 2e3
)
# Assess the variation among tanks through the sigma parameter

plot(coeftab(m_tank, m_pred, m_size, m_add, m_inter),
  pars = "sigma",
  labels = c("Tank", "Predation", "Size", "Additive", "Interaction")
)

# Focusing on sigma, the "inferred variation across tanks": the tank-only
# model and the size model infer the most variation (sigma ≈ 1.61-1.62), so
# adding size alone explains essentially none of the tank-to-tank differences.
# Variation is smallest in the interaction model (sigma ≈ 0.75), which
# includes predation, size, and their interaction; the additive model is very
# close (sigma ≈ 0.78), so the interaction adds only a little information.
# The predation-only model gives sigma ≈ 0.82. In short, predation absorbs
# much of the variation that the tank intercepts otherwise had to account
# for, which is why sigma drops sharply whenever predation enters the model,
# while size on its own barely moves it.
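
# An alternative view of the same comparison overlays the full sigma
# posteriors from all five models (a sketch using rethinking's dens):
sigma_post <- list(
  Tank        = extract.samples(m_tank)$sigma,
  Predation   = extract.samples(m_pred)$sigma,
  Size        = extract.samples(m_size)$sigma,
  Additive    = extract.samples(m_add)$sigma,
  Interaction = extract.samples(m_inter)$sigma
)
dens(sigma_post$Tank, xlim = c(0, 2.5), xlab = "sigma (across-tank sd)")
for (i in 2:5) dens(sigma_post[[i]], add = TRUE, col = i)
legend("topright", legend = names(sigma_post), col = 1:5, lty = 1, bty = "n")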

13M2. Compare the models you fit just above, using WAIC. Can you reconcile the differences in WAIC with the posterior distributions of the models? Show the WAIC table.

compare(m_tank, m_pred, m_size, m_add, m_inter)
##             WAIC       SE     dWAIC      dSE    pWAIC    weight
## m_pred  199.3418 9.054166 0.0000000       NA 19.34210 0.2671996
## m_tank  199.8585 7.274915 0.5166516 5.727324 20.77260 0.2063699
## m_inter 200.0409 9.063731 0.6990753 3.019872 19.29641 0.1883795
## m_add   200.0478 8.763894 0.7060164 1.893949 19.18137 0.1877268
## m_size  200.4922 7.264123 1.1504026 5.829732 21.10868 0.1503242
precis(m_tank)
##           mean        sd      5.5%    94.5%    n_eff    Rhat4
## a_bar 1.342861 0.2569343 0.9481485 1.753561 4912.189 0.999667
## sigma 1.609501 0.2062903 1.3072823 1.960482 3600.577 1.000491
precis(m_pred)
##             mean        sd      5.5%     94.5%    n_eff    Rhat4
## bp    -2.4169763 0.3022560 -2.889423 -1.929843 414.6427 1.006026
## a_bar  2.5237191 0.2356665  2.147840  2.901404 490.8606 1.006824
## sigma  0.8243025 0.1527523  0.609259  1.081816 871.3499 1.001931
precis(m_size)
##           mean        sd      5.5%    94.5%     n_eff    Rhat4
## a_bar 1.296244 0.4052460 0.6517304 1.946644  531.5796 1.005083
## sigma 1.619325 0.2141472 1.3109215 1.982791 1879.9737 1.000646
precis(m_add)
##             mean        sd       5.5%     94.5%     n_eff    Rhat4
## bp    -2.4398833 0.2856159 -2.8825090 -1.974110 1014.8363 1.002815
## a_bar  2.3905052 0.3914234  1.7603295  3.011944  268.1827 1.011234
## sigma  0.7768474 0.1486175  0.5574306  1.036306 1101.9161 1.000928
precis(m_inter)
##            mean       sd      5.5%     94.5%    n_eff    Rhat4
## a_bar 2.3203457 0.404230 1.6911798 2.9817322 1716.396 1.002045
## sigma 0.7505262 0.147153 0.5345387 0.9984328 1863.792 1.001388
# The coefficients for size sit close to 0 in every model, while the predation
# coefficient is large and negative, so predation does the real work of
# explaining tank-level variation. By WAIC the predation-only model scores
# best, with the tank-only and interaction models a close second and third;
# the additive and size-only models trail slightly. But every dWAIC is small
# relative to its dSE, and the pWAIC values are all similar (~19-21), so the
# models make essentially indistinguishable predictions. That reconciles the
# WAIC table with the posteriors: the predictors change what the tank
# intercepts shrink toward (and how much, via sigma), but the implied survival
# probabilities barely change. Given the variables in this data set, size adds
# little except through its interaction with predation.
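
# As a check that even the largest gap (m_pred vs m_size) is within noise, the
# standard error of that WAIC difference can be computed from the pointwise
# values, following the Chapter 7 recipe:
waic_pred <- WAIC(m_pred, pointwise = TRUE)$WAIC
waic_size <- WAIC(m_size, pointwise = TRUE)$WAIC
sqrt(length(waic_pred) * var(waic_pred - waic_size))  # dWAIC of ~1.2 is well inside this SE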

13M3. Re-estimate the basic Reed frog varying intercept model, but now using a Cauchy distribution in place of the Gaussian distribution for the varying intercepts. That is, fit this model: \[\begin{align} s_i ∼ Binomial(n_i, p_i) \\ logit(p_i) = α_{tank[i]} \\ α_{tank} ∼ Cauchy(α, σ) \\ α ∼ Normal(0, 1) \\ σ ∼ Exponential(1) \\ \end{align}\]

(You are likely to see many divergent transitions for this model. Can you figure out why? Can you fix them?) Plot and compare the posterior means of the intercepts, α_tank, to the posterior means produced in the chapter, using the customary Gaussian prior. Can you explain the pattern of differences? Take note of any change in the mean α as well.

m_tankCauchy <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a[tank],
    a[tank] ~ dcauchy(a_bar, sigma),
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
    ),
    data = dat_df, chains = 4, cores = 1, log_lik = TRUE,
    iter = 2e3, control = list(adapt_delta = 0.99)
)
## Warning: There were 229 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
a_tank <- apply(extract.samples(m_tank)$a, 2, mean)
a_tankCauchy <- apply(extract.samples(m_tankCauchy)$a, 2, mean)
plot(a_tank, a_tankCauchy,
  pch = 16, col = rangi2,
  xlab = "Intercept (Gaussian prior)", ylab = "Intercept (Cauchy prior)"
)

abline(a = 0, b = 1, lty = 2)

# For most tanks the Cauchy prior and the Gaussian prior produce essentially
# the same intercepts (points near the dashed identity line). At the high end,
# however, the tanks where every tadpole survived are far more extreme under
# the Cauchy prior: the data push those log-odds toward infinity, and the
# fat-tailed Cauchy barely resists, while the thin-tailed Gaussian shrinks
# them strongly back toward the mean. The divergent transitions come from this
# heavy-tailed geometry; raising adapt_delta suppresses only some of them.
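
# A more effective fix than raising adapt_delta is a non-centered
# parameterization. Because the Cauchy is a location-scale family, we can
# sample standard Cauchy offsets and rebuild the intercepts. This is a sketch;
# the heavy tails may still produce a few divergences.
m_cauchy_nc <- ulam(
  alist(
    S ~ dbinom(n, p),
    logit(p) <- a_bar + sigma * z[tank],  # a[tank] = a_bar + sigma * z[tank]
    z[tank] ~ dcauchy(0, 1),              # standard Cauchy offsets
    a_bar ~ dnorm(0, 1.5),
    sigma ~ dexp(1),
    gq> vector[tank]:a <<- a_bar + z * sigma  # recover the intercepts
  ), data = dat_df, chains = 4, log_lik = TRUE
)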

13M4. Now use a Student-t distribution with ν = 2 for the intercepts: \[\begin{align} α_{tank} ∼ Student(2, α, σ) \end{align}\]

Refer back to the Student-t example in Chapter 7 (page 234), if necessary. Plot and compare the resulting posterior to both the original model and the Cauchy model in 13M3. Can you explain the differences and similarities in shrinkage in terms of the properties of these distributions?

data(reedfrogs)
d <- reedfrogs

d$pred <- ifelse(d$pred=="no", 0, 1)
d$big <- ifelse(d$size=="big", 1, 0)

# Define and fit the models
d$tank <- 1:nrow(d)

m_st <- map2stan(
  alist(
  surv ~ dbinom(density, p),
  logit(p) <- a_tank[tank],
  a_tank[tank] ~ dstudent(2, a, sigma),
  a ~ dnorm(0, 1),
  sigma ~ dcauchy(0, 1)
  ),
  data=d, chains=4, warmup=1000, iter=3000, control=list(adapt_delta=0.99), cores=1)
## Chain 2: Rejecting initial value:
## Chain 2:   Error evaluating the log probability at the initial value.
## Chain 2: Exception: student_t_lpdf: Scale parameter is -1.4059, but must be > 0!
## Chain 4: Rejecting initial value:
## Chain 4:   Error evaluating the log probability at the initial value.
## Chain 4: Exception: student_t_lpdf: Scale parameter is -0.050959, but must be > 0!
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#tail-ess
# Compare the posterior means of the a_tank parameters
post_m_st <- extract.samples(m_st)
a_tank_m_st <- apply(post_m_st$a_tank, 2, mean)

# Plotting the posterior means
# original model vs. Student-t
plot(a_tank, a_tank_m_st, pch=16, col=rangi2,
  xlab="Intercept (Gaussian prior)", ylab="Intercept (Student-t prior)")
abline(a=0, b=1, lty=2)

# Cauchy vs. Student-t 
plot(a_tankCauchy, a_tank_m_st, pch=16, col=rangi2, 
     xlab="Intercept (Cauchy prior)", ylab="Intercept (Student-t prior)")
abline(a=0, b=1, lty=2)

# In this run the estimated scale is smallest for the Cauchy model (~1.02),
# intermediate for Student-t (~1.26), and largest for the Gaussian (~1.63):
# the heavier-tailed priors can use a smaller scale for the bulk of the tanks
# because their tails absorb the extreme ones.

# The dashed line marks equal intercepts. For the majority of tanks all three
# priors produce similar posterior means; the differences concentrate in the
# tanks where every tadpole survived, whose raw log-odds of survival are
# infinite. The adaptive prior pools those estimates back toward the mean, and
# how hard it pulls depends on the tails. The thin-tailed Gaussian shrinks the
# extremes the most; the Student-t with ν = 2 has heavier tails and shrinks
# them less; the Cauchy (a Student-t with ν = 1) has the heaviest tails and
# shrinks them least. That is why the extreme intercepts are largest under the
# Cauchy prior, and why in the Cauchy vs Student-t plot the most extreme
# points fall below the identity line: those tanks are pulled in somewhat more
# by the Student-t prior than by the Cauchy.
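
# The pattern follows directly from the tails of the three distributions,
# which a quick plot of the densities (all location 0, scale 1) makes visible:
curve(dnorm(x, 0, 1), from = -6, to = 6, ylab = "Density")  # Gaussian: thin tails
curve(dt(x, df = 2), add = TRUE, lty = 2)                   # Student-t(2): heavier
curve(dcauchy(x, 0, 1), add = TRUE, lty = 3)                # Cauchy: heaviest
# The heavier the tail, the less surprised the prior is by an extreme
# intercept, and the less shrinkage it applies to it.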

13M5. Modify the cross-classified chimpanzees model m13.4 so that the adaptive prior for blocks contains a parameter \(\bar{γ}\) for its mean: \[\begin{align} γ_j ∼ Normal(\bar{γ}, σ_γ) \\ \bar{γ} ∼ Normal(0, 1.5) \\ \end{align}\]

Compare the precis output of this model to m13.4. What has including \(\bar{γ}\) done?

data(chimpanzees)
d <- chimpanzees
d$treatment <- 1 + d$prosoc_left + 2*d$condition

dat_list <- list(
    pulled_left = d$pulled_left,
    actor = d$actor,
    block_id = d$block,
    treatment = as.integer(d$treatment) )

set.seed(13)
m13.m5 <- ulam(
    alist(
        pulled_left ~ dbinom(1, p) ,
        logit(p) <- a[actor] + g[block_id] + b[treatment],
        b[treatment] ~ dnorm(0 , 0.5),
      ## adaptive priors
        a[actor] ~ dnorm(a_bar, sigma_a),
        g[block_id] ~ dnorm(g_bar, sigma_g),
      ## hyper-priors
        a_bar ~ dnorm(0 , 1.5),
        sigma_a ~ dexp(1),
        g_bar ~ dnorm( 0 , 1.5),
        sigma_g ~ dexp(1)
    ), data=dat_list, chains=4, cores=4, log_lik=TRUE)
## Warning: There were 45 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#tail-ess
precis(m13.m5, depth=2)
##               mean        sd        5.5%       94.5%    n_eff     Rhat4
## b[1]    -0.1424963 0.2932419 -0.61590110  0.30545981 793.0816 1.0052564
## b[2]     0.3833844 0.3033354 -0.08942929  0.86996757 817.9242 1.0064184
## b[3]    -0.4887525 0.2996009 -0.97165388 -0.02423786 586.1809 1.0047189
## b[4]     0.2718516 0.2975282 -0.21235212  0.74544335 745.7570 1.0052244
## a[1]    -0.7635225 1.1219478 -2.51532245  1.09798471 179.8003 1.0321022
## a[2]     4.2731839 1.6424192  1.90839707  7.02681055 321.2669 1.0170072
## a[3]    -1.0712900 1.1309734 -2.84354542  0.79376554 174.7139 1.0343995
## a[4]    -1.0612456 1.1400593 -2.86374873  0.84227311 179.5626 1.0319627
## a[5]    -0.7649474 1.1277781 -2.58431015  1.14170913 181.5657 1.0326821
## a[6]     0.1770241 1.1275745 -1.61742436  1.99839365 181.6722 1.0338694
## a[7]     1.7162027 1.1700311 -0.13188781  3.65160243 194.1145 1.0293125
## g[1]     0.2328924 1.1114283 -1.63015857  2.01051749 158.0517 1.0365936
## g[2]     0.4600031 1.1111238 -1.42972744  2.20277630 174.2923 1.0348350
## g[3]     0.4713142 1.1125794 -1.39355247  2.20616215 173.5618 1.0351668
## g[4]     0.4250929 1.1095505 -1.47683883  2.19878809 172.8190 1.0354198
## g[5]     0.3823321 1.1098455 -1.49672771  2.14289262 170.4677 1.0348847
## g[6]     0.5329241 1.1108632 -1.34200923  2.25426180 183.8572 1.0335899
## a_bar    0.2682630 1.1041243 -1.42446350  2.06895242 274.1352 1.0184422
## sigma_a  1.9931509 0.6388321  1.15793698  3.09559963 867.4210 0.9999258
## g_bar    0.4184948 1.1001619 -1.44193749  2.14421658 173.5054 1.0353378
## sigma_g  0.2341232 0.1876910  0.04418729  0.55755282 218.7320 1.0207599
plot(precis(m13.m5,depth=2)) # also plot

# model from chapter
set.seed(13)
m13.4 <- ulam(
    alist(
        pulled_left ~ dbinom( 1 , p ) ,
        logit(p) <- a[actor] + g[block_id] + b[treatment] ,
        b[treatment] ~ dnorm( 0 , 0.5 ),
      ## adaptive priors
        a[actor] ~ dnorm( a_bar , sigma_a ),
        g[block_id] ~ dnorm( 0 , sigma_g ),
      ## hyper-priors
        a_bar ~ dnorm( 0 , 1.5 ),
        sigma_a ~ dexp(1),
        sigma_g ~ dexp(1)
    ) , data=dat_list , chains=4 , cores=4 , log_lik=TRUE )
## Warning: There were 3 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## http://mc-stan.org/misc/warnings.html#tail-ess
precis(m13.4, depth=2)
##                mean        sd        5.5%       94.5%    n_eff    Rhat4
## b[1]    -0.12761972 0.3037129 -0.61252512  0.35183186 403.9460 1.010975
## b[2]     0.40823885 0.3023219 -0.05326095  0.90132047 358.1884 1.008743
## b[3]    -0.47268650 0.2997765 -0.93307498  0.02072530 325.5048 1.013445
## b[4]     0.28714055 0.3040014 -0.19691575  0.78344953 314.6443 1.010892
## a[1]    -0.37004084 0.3811197 -0.95628869  0.25016530 359.9558 1.012071
## a[2]     4.72220826 1.3250881  3.09931231  6.97058585 441.6557 1.004065
## a[3]    -0.67101760 0.3808711 -1.28662169 -0.06097011 393.4247 1.008186
## a[4]    -0.67676183 0.3768182 -1.28197076 -0.07983407 379.6604 1.010870
## a[5]    -0.36288568 0.3731567 -0.95811917  0.24818279 388.4028 1.009631
## a[6]     0.57237309 0.3700805 -0.01941836  1.17313202 413.4078 1.006331
## a[7]     2.10982675 0.4584000  1.38057372  2.85182835 468.8966 1.006541
## g[1]    -0.15680088 0.2287594 -0.56696571  0.09442740 321.4707 1.008384
## g[2]     0.03827762 0.1947428 -0.22232152  0.36334359 947.7353 1.001397
## g[3]     0.05590012 0.1958143 -0.18855486  0.36400551 764.3195 1.000097
## g[4]     0.01132998 0.1764046 -0.24564546  0.29159179 900.1668 1.001605
## g[5]    -0.03345170 0.1915442 -0.34386016  0.23453987 821.9538 1.008256
## g[6]     0.11021854 0.2025267 -0.12957112  0.43837744 760.1784 1.002336
## a_bar    0.63302585 0.7379718 -0.51603187  1.78529881 700.5277 1.004161
## sigma_a  2.03260593 0.6482588  1.22001000  3.13988882 570.0554 1.000680
## sigma_g  0.20945647 0.1742578  0.03294841  0.51642363 252.4811 1.016748
plot(precis(m13.4, depth=2)) # also plot

compare(m13.m5, m13.4)
##            WAIC       SE     dWAIC       dSE    pWAIC    weight
## m13.4  532.2419 19.42157 0.0000000        NA 10.68379 0.5677934
## m13.m5 532.7876 19.43458 0.5457081 0.2630791 11.00983 0.4322066
coeftab(m13.m5, m13.4)
##         m13.m5  m13.4  
## b[1]      -0.14   -0.13
## b[2]       0.38    0.41
## b[3]      -0.49   -0.47
## b[4]       0.27    0.29
## a[1]      -0.76   -0.37
## a[2]       4.27    4.72
## a[3]      -1.07   -0.67
## a[4]      -1.06   -0.68
## a[5]      -0.76   -0.36
## a[6]       0.18    0.57
## a[7]       1.72    2.11
## g[1]       0.23   -0.16
## g[2]       0.46    0.04
## g[3]       0.47    0.06
## g[4]       0.43    0.01
## g[5]       0.38   -0.03
## g[6]       0.53    0.11
## a_bar      0.27    0.63
## sigma_a    1.99    2.03
## g_bar      0.42      NA
## sigma_g    0.23    0.21
## nobs        504     504
# Including g_bar over-parameterizes the model: the linear predictor now
# contains two overall means, a_bar and g_bar, and the data can only identify
# their sum. The posterior trades one off against the other, which shows up as
# very wide, correlated posteriors for a_bar, g_bar, and every a and g
# parameter, along with more divergent transitions, lower n_eff, and higher
# Rhat values: m13.m5 samples much less efficiently than m13.4. The inferences
# that matter are unchanged, though: the treatment effects b and both sigmas
# are essentially identical across the two models, and WAIC rates them as
# equivalent. So g_bar buys nothing and costs sampling efficiency, which is
# why only one adaptive prior in a cross-classified model gets a free mean.
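
# A quick way to see that only the sum is identified (a sketch): the posterior
# of a_bar + g_bar from m13.m5 should line up with a_bar from m13.4.
post5 <- extract.samples(m13.m5)
post4 <- extract.samples(m13.4)
dens(post5$a_bar + post5$g_bar, lwd = 2, xlab = "overall intercept")  # m13.m5 sum
dens(post4$a_bar, add = TRUE, lty = 2)                                # m13.4 a_bar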

13M6. Sometimes the prior and the data (through the likelihood) are in conflict, because they concentrate around different regions of parameter space. What happens in these cases depends a lot upon the shape of the tails of the distributions. Likewise, the tails of distributions strongly influence whether outliers are shrunk towards the mean or not. I want you to consider four different models to fit to one observation at y = 0. The models differ only in the distributions assigned to the likelihood and prior. Here are the four models:

\[\begin{align} Model \;NN: y &∼ Normal(μ, 1) & Model \;TN: y &∼ Student(2, μ, 1) \\ μ &∼ Normal(10, 1) & μ &∼ Normal(10, 1) \\ Model \;NT: y &∼ Normal(μ, 1) & Model \;TT: y &∼ Student(2, μ, 1) \\ μ &∼ Student(2, 10, 1) & μ &∼ Student(2, 10, 1) \\ \end{align}\]

Estimate and plot the posterior distributions against the likelihoods for these models and compare them. Can you explain the results, using the properties of the distributions?

# Fit each model to the single observation y = 0, as the question specifies
dat <- list(y = 0)

# Model NN: Normal likelihood, Normal prior
m13m6.1 <- map2stan( 
  alist(
    y ~ dnorm(mu, 1),
    mu ~ dnorm(10, 1)
  ), 
  data=dat, iter=2000, chains=1, cores=1 
)
m13m6.2 <- map2stan( 
  alist(
    pulled_left ~ dstudent(2, mu , 1),
    mu~dnorm(10,1)
  ), 
  data=d , iter=2000 , chains=1, cores=1 
)
## 
## SAMPLING FOR MODEL '7994c03ef71beed8326306b1ac52e1df' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 3.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.34 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.089325 seconds (Warm-up)
## Chain 1:                0.102854 seconds (Sampling)
## Chain 1:                0.192179 seconds (Total)
## Chain 1: 
## Error in .local(fit, data, n, ...) : 
##   There appear to be no linear models here
m13m6.3 <- map2stan( 
  alist(
    pulled_left ~ dnorm(mu , 1),
    mu~dstudent(2,10,1)
  ), 
  data=d , iter=2000 , chains=1, cores=1 
)
## 
## SAMPLING FOR MODEL '5323d1639e8f8bde5ea077e49d14c989' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.023373 seconds (Warm-up)
## Chain 1:                0.021607 seconds (Sampling)
## Chain 1:                0.04498 seconds (Total)
## Chain 1: 
## Error in .local(fit, data, n, ...) : 
##   There appear to be no linear models here
m13m6.4 <- map2stan( 
  alist(
    pulled_left ~ dstudent(2,mu , 1),
    mu~dstudent(2,10,1)
  ), 
  data=d , iter=2000 , chains=1, cores=1 
)
## 
## SAMPLING FOR MODEL '47e5a433792eb1526c956629b13a4a4b' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.091816 seconds (Warm-up)
## Chain 1:                0.091822 seconds (Sampling)
## Chain 1:                0.183638 seconds (Total)
## Chain 1: 
## Error in .local(fit, data, n, ...) : 
##   There appear to be no linear models here
precis(m13m6.1)
##         mean         sd      5.5%     94.5%   n_eff   Rhat4
## mu 0.5945776 0.04567043 0.5192311 0.6664719 291.764 1.00859
precis(m13m6.2)
##         mean         sd      5.5%     94.5%    n_eff   Rhat4
## mu 0.6176782 0.04557713 0.5472078 0.6928091 386.9974 1.00012
precis(m13m6.3)
##         mean         sd      5.5%     94.5%    n_eff    Rhat4
## mu 0.5814504 0.04521341 0.5110258 0.6545654 374.3171 1.000478
precis(m13m6.4)
##         mean         sd      5.5%     94.5%    n_eff     Rhat4
## mu 0.5980499 0.04363902 0.5309723 0.6646481 278.0218 0.9989996
post_m13m6.1 <- extract.samples(m13m6.1)
post_m13m6.2 <- extract.samples(m13m6.2)
post_m13m6.3 <- extract.samples(m13m6.3)
post_m13m6.4 <- extract.samples(m13m6.4)

dens(post_m13m6.2$mu, col="blue")

dens(post_m13m6.1$mu, col="red", add = TRUE)
dens(post_m13m6.3$mu, col="green", add=TRUE)
dens(post_m13m6.4$mu, col="black", add=TRUE)
abline(v=0.6, col = "grey")
legend("topright",
       col = c("red","blue", "green", "black"),
       pch = 19,
       legend = c("Model 1","Model 2", "Model 3", "Model 4"))

# The tails of the Student-t distribution are heavier than those of the Normal 
# distribution, so a Student-t density keeps substantial mass far from its center. 
# The posterior plot shows that when the prior and the data (through the 
# likelihood) come from the same family of distributions, the posterior tends 
# to be symmetric.
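
# To visualize why the tails matter, a sketch comparing the two densities from 
# the models above (dstudent is from the rethinking package):
curve( dnorm(x,10,1) , from=0 , to=20 , ylab="Density" )  # solid: Normal(10,1)
curve( dstudent(x,2,10,1) , add=TRUE , lty=2 )  # dashed: Student-t(2,10,1), heavier tails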

EXTRA CREDIT (10 POINTS)

13H1. In 1980, a typical Bengali woman could have 5 or more children in her lifetime. By the year 2000, a typical Bengali woman had only 2 or 3. You’re going to look at a historical set of data, when contraception was widely available but many families chose not to use it. These data reside in data(bangladesh) and come from the 1988 Bangladesh Fertility Survey. Each row is one of 1934 women. There are six variables, but you can focus on two of them for this practice problem: (1) district: ID number of administrative district each woman resided in (2) use.contraception: An indicator (0/1) of whether the woman was using contraception The first thing to do is ensure that the cluster variable, district, is a contiguous set of integers. Recall that these values will be index values inside the model. If there are gaps, you’ll have parameters for which there is no data to inform them. Worse, the model probably won’t run. Look at the unique values of the district variable:

data(bangladesh)
d <- bangladesh
sort(unique(d$district)) #District 54 is absent. So district isn’t yet a good 
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
## [51] 51 52 53 55 56 57 58 59 60 61
# index variable, because it is not contiguous. This is easy to fix. 
# Just make a new variable that is contiguous. This is enough to do it:

d$district_id <- as.integer(as.factor(d$district))
sort(unique(d$district_id)) 
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
## [51] 51 52 53 54 55 56 57 58 59 60
# Now there are 60 values, contiguous integers 
# 1 to 60. Now, focus on predicting use.contraception, clustered by district_id.

Fit both (1) a traditional fixed-effects model that uses an index variable for district and (2) a multilevel model with varying intercepts for district. Plot the predicted proportions of women in each district using contraception, for both the fixed-effects model and the varying-effects model. That is, make a plot in which district ID is on the horizontal axis and expected proportion using contraception is on the vertical. Make one plot for each model, or layer them on the same plot, as you prefer. How do the models disagree? Can you explain the pattern of disagreement? In particular, can you explain the most extreme cases of disagreement, both why they happen where they do and why the models reach different inferences?

# prep trimmed data list
dlist <- list(
  use_contraception = d$use.contraception,
  district = d$district_id )

# fixed effects model
m12H1f <- map2stan(
  alist(
    use_contraception ~ dbinom( 1 , p ),
    logit(p) <- a_district[district],
    a_district[district] ~ dnorm(0,10)
    ),
  data=dlist )
## 
## SAMPLING FOR MODEL 'dc325c27954842f31c4174c34dc5f98d' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.00024 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.4 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 6.56313 seconds (Warm-up)
## Chain 1:                4.34202 seconds (Sampling)
## Chain 1:                10.9052 seconds (Total)
## Chain 1:
# varying effects model
m12H1v <- map2stan(
  alist(
    use_contraception ~ dbinom( 1 , p ),
    logit(p) <- a + a_district[district],
    a ~ dnorm(0,10),
    a_district[district] ~ dnorm(0,sigma),
    sigma ~ dcauchy(0,1)
    ),
  data=dlist )
## 
## SAMPLING FOR MODEL '210a85d2a312395f299aa8716e031c84' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000269 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.69 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 5.8696 seconds (Warm-up)
## Chain 1:                4.75166 seconds (Sampling)
## Chain 1:                10.6213 seconds (Total)
## Chain 1:
# get predicted probabilities for each district for each model
pred.dat <- list(district=1:60)

# call link for each model, passing it the new cases to compute predicted probabilities for:
pred1 <- link(m12H1f,data=pred.dat)
## [ 1000 / 1000 ]
pred2 <- link(m12H1v,data=pred.dat)
## [ 1000 / 1000 ]
#  Average probabilities in each district:
p1.mean <- apply( pred1 , 2 , mean )
p2.mean <- apply( pred2 , 2 , mean )
round(p1.mean, 2)
##  [1] 0.26 0.35 0.96 0.50 0.36 0.29 0.28 0.38 0.30 0.08 0.00 0.35 0.42 0.63 0.36
## [16] 0.55 0.29 0.34 0.39 0.41 0.39 0.20 0.27 0.07 0.45 0.38 0.18 0.24 0.28 0.49
## [31] 0.45 0.21 0.43 0.65 0.50 0.35 0.54 0.28 0.50 0.46 0.50 0.55 0.53 0.22 0.33
## [46] 0.52 0.47 0.53 0.02 0.47 0.45 0.44 0.42 0.17 0.58 0.19 0.46 0.10 0.22 0.21
# plot the estimated proportions using contraception:
plot( 1:60 , p1.mean , col=rangi2 , pch=16 , xlab="District" ,
ylab="probability use contraception" )
points( 1:60 , p2.mean )
abline( h=logistic(coef(m12H1v)[1]) , lty=2 ) # plot line for 'a'

# In the plot, blue points are the fixed-effects estimates, open black circles 
# are the varying-effects estimates, and the dashed line is the average 
# proportion of women using contraception. The black points sit closer to the 
# dashed line, just as in the tadpole example from lecture. This is shrinkage, 
# which results from pooling information across districts.

# Given the extreme estimates visible in that graph, it helps to re-plot with 
# the number of women sampled in each district on the horizontal axis and the 
# absolute difference between the fixed and varying estimates on the vertical axis.

# compute number of women sampled in each district
n_by_district <- sapply( 1:60 , function(did) sum( d$district_id==did ) )

# compute shrinkage
shrinkage <- abs( p1.mean - p2.mean )
plot( n_by_district , shrinkage , col ="slateblue" ,
xlab ="number of women sampled" , ylab ="shrinkage by district" )

# The plot shows the per-district disagreement (shrinkage) against the number of 
# women sampled. The differences between the fixed and varying estimates are 
# tiny on the right side of the plot, where districts have on the order of 100 
# observations. But districts with fewer than about 15 observations show large 
# shrinkage: with so little data, the fixed-effects estimate over-fits the 
# sample, while the pooled estimate is pulled towards the overall mean. In other 
# words, districts with fewer women show more shrinkage, because there is less 
# information to counter the pooling.
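
# As a follow-up sketch (using the objects computed above), the largest 
# disagreements can be listed directly. Districts 3, 11, and 49, whose fixed 
# estimates are the most extreme (0.96, 0.00, 0.02), likely top the list:
ord <- order( shrinkage , decreasing=TRUE )[1:3]
data.frame( district=ord , n=n_by_district[ord] ,
            fixed=round(p1.mean[ord],2) , varying=round(p2.mean[ord],2) )
# Each has very few women sampled, so the fixed-effects model chases the raw 
# proportion while the varying-effects model pools hard towards the grand mean.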