Chapter 13 - Models with Memory

This chapter has been an introduction to the motivation, implementation, and interpretation of basic multilevel models. It focused on varying intercepts, which achieve better estimates of baseline differences among clusters in the data. They achieve better estimates because they simultaneously model the population of clusters and use inferences about the population to pool information among parameters. From another perspective, varying intercepts are adaptively regularized parameters, relying upon a prior that is itself learned from the data.
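For reference in the questions below, the basic varying-intercept model for the tadpole data (fit as m_tank in 13-1) has this structure, with the tank intercepts drawn from an adaptive prior whose mean and spread are themselves estimated: \[\begin{align} s_i &\sim \text{Binomial}(n_i, p_i) \\ \text{logit}(p_i) &= \alpha_{\text{tank}[i]} \\ \alpha_{\text{tank}} &\sim \text{Normal}(\bar{\alpha}, \sigma) \\ \bar{\alpha} &\sim \text{Normal}(0, 1.5) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]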

Place each answer inside a code chunk (grey box). Each code chunk should contain a text response or code that completes or answers the question or activity. Make sure to include plots when the question requests them.

Finally, upon completion, name your final output .html file as YourName_ANLY505-Year-Semester.html, publish the assignment to your R Pubs account, and submit the link on Canvas. Each question is worth 5 points.

Questions

13-1. Revisit the Reed frog survival data, data(reedfrogs), and add the predation and size treatment variables to the varying intercepts model. Consider models with each main effect alone, with both main effects, and with both main effects and their interaction. Instead of focusing on inferences about these two predictor variables, focus on the inferred variation across tanks. Explain why it changes as it does across models. Plot the sigma estimates.

# rethinking provides ulam, the reedfrogs data, and the helper functions used below
library(rethinking)

data(reedfrogs)
d <- reedfrogs

# Build a data list with integer index variables for Stan
dat <- list(
  surv_ = d$surv,                             # number of tadpoles surviving
  dens_ = d$density,                          # initial tadpole count per tank
  tank_ = 1:nrow(d),                          # tank index
  pred_ = ifelse(d$pred == "no", 0L, 1L),     # predation: 0 = none, 1 = predators present
  size_ = ifelse(d$size == "small", 1L, 2L)   # body size: 1 = small, 2 = big
)
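# Quick sanity check of the index coding: number of tanks in each
# predation-by-size cell
table(dat$pred_, dat$size_)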
 
# Varying intercepts only: one intercept per tank, with an adaptive prior
m_tank <- ulam(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a[tank_],
    a[tank_] ~ dnorm(a_1, sigma),   # tank intercepts pooled toward a_1
    a_1 ~ dnorm(0, 1.5),            # population mean (log-odds of survival)
    sigma ~ dexp(1)                 # variation across tanks
  ),
  data = dat, chains = 4, cores = 4, log_lik = TRUE, iter = 2e3
)

# Varying intercepts plus a main effect of predation
m_pred <- ulam(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a[tank_] + bp * pred_,
    a[tank_] ~ dnorm(a_1, sigma),
    bp ~ dnorm(-0.5, 1),
    a_1 ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = dat, chains = 4, cores = 4, log_lik = TRUE, iter = 2e3
)

# Varying intercepts plus a main effect of size
m_size <- ulam(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a[tank_] + s[size_],
    a[tank_] ~ dnorm(a_1, sigma),
    s[size_] ~ dnorm(0, 0.5),
    a_1 ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = dat, chains = 4, cores = 4, log_lik = TRUE, iter = 2e3
)

# Varying intercepts plus both main effects
m_pred_size <- ulam(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a[tank_] + bp * pred_ + s[size_],
    a[tank_] ~ dnorm(a_1, sigma),
    bp ~ dnorm(-0.5, 1),
    s[size_] ~ dnorm(0, 0.5),
    a_1 ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = dat, chains = 4, cores = 4, log_lik = TRUE, iter = 2e3
)

# Both main effects plus their interaction: the predation effect bp is indexed
# by size, and the tank intercepts use a non-centered parameterization
m_Interaction <- ulam(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a_1 + a[tank_] * sigma + bp[size_] * pred_ + s[size_],
    a[tank_] ~ dnorm(0, 1),       # standardized tank offsets
    bp[size_] ~ dnorm(-0.5, 1),   # size-specific predation effect (the interaction)
    s[size_] ~ dnorm(0, 0.5),
    a_1 ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = dat, chains = 4, cores = 4, log_lik = TRUE, iter = 2e3
)

plot(coeftab(m_tank, m_pred, m_size, m_pred_size, m_Interaction),
  pars = "sigma",
  labels = c("Tank", "Predation", "Size", "Pred+Size", "Interaction")
)

# Adding a predictor always decreases the posterior mean of sigma, the variation across tanks. The predictors account for some of the between-tank variation in survival, leaving less residual variation for the varying intercepts to absorb; predation explains the most, so sigma drops furthest whenever pred_ is in the model. Because the tank intercepts already capture most of what matters for prediction, though, adding the predictors changes the model's predictions very little.
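# To see the change numerically, pull the posterior mean of sigma from each
# fit (quick check using the five models above)
sapply(
  list(tank = m_tank, pred = m_pred, size = m_size,
       pred_size = m_pred_size, interaction = m_Interaction),
  function(m) mean(extract.samples(m)$sigma)
)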

13-2. Compare the models you fit just above, using WAIC. Can you reconcile the differences in WAIC with the posterior distributions of the models? Show the WAIC table.

compare(m_tank, m_pred, m_size, m_pred_size, m_Interaction)
##                   WAIC       SE     dWAIC      dSE    pWAIC     weight
## m_pred        198.5191 8.906167 0.0000000       NA 18.98466 0.32957764
## m_Interaction 198.8937 8.973269 0.3745387 3.220463 18.75149 0.27329248
## m_pred_size   199.7165 8.782557 1.1973164 1.806188 19.10943 0.18111890
## m_tank        200.4993 7.325003 1.9801421 5.696359 21.10481 0.12245467
## m_size        201.0376 7.046793 2.5184968 5.634164 21.32244 0.09355631
precis(m_pred, 2)
##             mean        sd        5.5%     94.5%     n_eff     Rhat4
## a[1]   2.4547764 0.6842472  1.38427945  3.565727 2804.0947 1.0015081
## a[2]   2.9709308 0.7549380  1.82725153  4.216540 3271.2301 1.0016061
## a[3]   1.7013744 0.6177928  0.72331140  2.715977 3208.6190 1.0003863
## a[4]   2.9671926 0.7513868  1.84848307  4.208027 3055.3957 1.0016294
## a[5]   2.5004920 0.6991209  1.41252124  3.652552 3086.2833 1.0004745
## a[6]   2.4722115 0.6778966  1.40187377  3.583627 3687.7348 1.0002898
## a[7]   2.9472262 0.7672396  1.81674426  4.247077 2740.0995 0.9997851
## a[8]   2.4859525 0.6802084  1.42925098  3.595798 3169.1747 1.0013831
## a[9]   2.2011548 0.5665074  1.28743715  3.086426 1611.3002 1.0026527
## a[10]  3.5720177 0.6383421  2.58959491  4.610477 1560.5277 1.0011613
## a[11]  2.9854429 0.5696490  2.09063262  3.919706 1507.7805 1.0019510
## a[12]  2.7303971 0.5873522  1.78641420  3.662706 1566.7273 1.0032371
## a[13]  3.0133229 0.5884069  2.07233099  3.979951 1436.4355 1.0009905
## a[14]  2.4659673 0.5913512  1.50668449  3.406391 1462.1128 1.0020991
## a[15]  3.5700126 0.6323598  2.59479311  4.605112 1579.5062 1.0045170
## a[16]  3.5737856 0.6233412  2.62529119  4.575110 1600.4797 1.0009298
## a[17]  2.8876157 0.6122651  1.95525711  3.915650 3902.1421 0.9995788
## a[18]  2.5633704 0.5701273  1.69934805  3.517710 3842.2795 1.0002725
## a[19]  2.2759118 0.5248530  1.46919022  3.135039 3093.4154 0.9998767
## a[20]  3.2770303 0.6934484  2.26878429  4.443128 2436.7066 1.0008498
## a[21]  2.5526006 0.5619163  1.71061718  3.498270 2574.6305 1.0011721
## a[22]  2.5559423 0.5416140  1.73037253  3.456130 3226.6253 1.0004965
## a[23]  2.5600754 0.5838783  1.65778678  3.540410 3897.5802 1.0014431
## a[24]  2.0079127 0.5151939  1.22546081  2.863591 3048.0444 1.0001420
## a[25]  1.5504630 0.4876157  0.73001614  2.309887 1103.4103 1.0025866
## a[26]  2.5132562 0.4609045  1.77088677  3.241238  988.0399 1.0028838
## a[27]  1.2203781 0.5051491  0.37219835  1.999024 1193.6283 1.0028974
## a[28]  1.9769583 0.4733654  1.20563936  2.713473 1018.8853 1.0029334
## a[29]  2.5161181 0.4592867  1.74976193  3.243415  976.1690 1.0023416
## a[30]  3.5172171 0.4897483  2.74744744  4.310282 1164.9170 1.0017767
## a[31]  1.8429499 0.4693307  1.08968544  2.596389  995.7754 1.0028962
## a[32]  2.1169389 0.4669889  1.37533907  2.851092 1019.0478 1.0019730
## a[33]  3.0613805 0.5807592  2.20326683  4.060531 3196.2696 1.0012581
## a[34]  2.7501853 0.5416691  1.90449008  3.634424 3567.2494 1.0002216
## a[35]  2.7509200 0.5493631  1.91477358  3.670654 3897.7916 1.0007824
## a[36]  2.2657393 0.4683367  1.54727648  3.036518 3474.6254 0.9996318
## a[37]  2.2638404 0.4788706  1.53419321  3.045801 3556.0232 1.0005311
## a[38]  3.4190708 0.6548613  2.43071026  4.535138 2572.4632 1.0018660
## a[39]  2.7557540 0.5519617  1.90419696  3.676108 3279.3596 1.0006053
## a[40]  2.5116974 0.5138011  1.72274289  3.379111 3719.5302 1.0002083
## a[41]  0.9059233 0.5077046  0.08812746  1.685448 1121.7230 1.0014858
## a[42]  1.8942251 0.4408752  1.20709299  2.599686  899.5043 1.0027131
## a[43]  1.9910842 0.4308359  1.32153991  2.680174  871.0715 1.0022702
## a[44]  2.0913054 0.4280646  1.40643031  2.777853  851.2095 1.0035957
## a[45]  2.8913685 0.4248321  2.22028152  3.581249  873.1258 1.0035124
## a[46]  1.8964501 0.4364219  1.18978211  2.576710  878.4396 1.0025802
## a[47]  4.0015793 0.4876063  3.23698904  4.791036 1112.9242 1.0023372
## a[48]  2.4014410 0.4262177  1.71003578  3.075853  862.8145 1.0028031
## bp    -2.4329951 0.3073848 -2.90689067 -1.933734  477.8519 1.0068303
## a_1    2.5338237 0.2389046  2.15174483  2.906545  588.1476 1.0047222
## sigma  0.8274781 0.1441678  0.61824224  1.077693 1336.0908 1.0024504
precis(m_size, 2)
##              mean        sd        5.5%       94.5%     n_eff    Rhat4
## a[1]   2.22912003 0.9638341  0.79793702  3.82883512 1677.2008 1.001145
## a[2]   3.07884602 1.1278960  1.39972089  5.01306759 1946.8067 1.000682
## a[3]   1.08099246 0.7628572 -0.09872935  2.31458319 1352.9802 1.000624
## a[4]   3.11907910 1.1737044  1.43268866  5.12169771 1704.9975 1.001204
## a[5]   1.97151316 0.9785974  0.52467137  3.62925724 1496.3871 1.002125
## a[6]   1.98688597 0.9784683  0.53736938  3.65590419 1655.1865 1.006193
## a[7]   2.91884413 1.1721446  1.16385272  4.95372610 1737.7730 1.001342
## a[8]   1.96325727 0.9470642  0.52112136  3.52565433 1288.9821 1.002682
## a[9]  -0.07973628 0.7079450 -1.22195193  1.04267475 1241.7284 1.002352
## a[10]  2.18293546 0.9402855  0.77261877  3.74711702 1812.5424 1.000858
## a[11]  1.09525929 0.7820893 -0.09304433  2.39349330 1373.6271 1.001462
## a[12]  0.67028337 0.7281719 -0.47884371  1.87390756 1212.0287 1.002085
## a[13]  0.82381879 0.7796008 -0.39390411  2.08050145 1228.5996 1.000369
## a[14]  0.02110893 0.7314864 -1.12861780  1.19477882 1037.5191 1.002494
## a[15]  2.01060062 0.9623577  0.56496858  3.64211751 1530.4731 1.002336
## a[16]  1.98193903 0.9820149  0.53147381  3.62614103 1575.0959 1.000800
## a[17]  2.98017437 0.8734070  1.67158227  4.42976637 1510.0923 1.000451
## a[18]  2.47205585 0.7494253  1.33412014  3.70086929 1574.5701 1.001871
## a[19]  2.12188360 0.7171456  1.02376514  3.34675563 1266.1171 1.001260
## a[20]  3.71684411 1.0521911  2.14465614  5.53905031 1635.3267 1.000506
## a[21]  2.21788102 0.7735745  1.00971869  3.47110540 1322.8425 1.002461
## a[22]  2.22320595 0.7865050  1.03547933  3.56525783  969.1197 1.003392
## a[23]  2.22278785 0.7898126  1.03543254  3.51776587 1198.6047 1.001322
## a[24]  1.51886619 0.6822254  0.44298653  2.61999359  834.6361 1.003424
## a[25] -0.89906655 0.5913370 -1.88923191  0.02082721  937.6156 1.002157
## a[26]  0.26417177 0.5494313 -0.60007261  1.14446624  859.3195 1.003662
## a[27] -1.32221824 0.6319258 -2.34701840 -0.35295781 1044.5299 1.001797
## a[28] -0.37311555 0.5665953 -1.27148515  0.52789519  871.9641 1.001840
## a[29] -0.02836775 0.5678599 -0.92153839  0.88570106  684.5245 1.002620
## a[30]  1.26556898 0.6357492  0.26412182  2.28883639  920.1810 1.001873
## a[31] -0.82285560 0.5815830 -1.77981978  0.10024963  729.1630 1.005081
## a[32] -0.50049783 0.5645886 -1.40629862  0.41233589  706.7900 1.003436
## a[33]  3.25938804 0.8323029  2.02717235  4.65290781 1497.3415 1.001532
## a[34]  2.79801430 0.7359285  1.69551802  4.02965402 1261.3272 1.002086
## a[35]  2.80469999 0.7419513  1.69280426  4.04012166 1189.9544 1.000623
## a[36]  2.14495147 0.6356123  1.13140621  3.19843963 1078.1386 1.002943
## a[37]  1.88230420 0.6445102  0.90347866  2.94023699  861.9576 1.000463
## a[38]  3.73199731 1.0474084  2.21614912  5.55836605 1839.2883 1.001146
## a[39]  2.51908944 0.7459633  1.37374151  3.75741864 1151.2852 1.001623
## a[40]  2.17436366 0.6987797  1.13409796  3.35046189 1031.8357 1.001828
## a[41] -1.71367892 0.6070065 -2.71469767 -0.74972520  972.3874 1.002467
## a[42] -0.46570784 0.5192616 -1.28704059  0.38046582  814.6003 1.003356
## a[43] -0.34855873 0.5208499 -1.17815557  0.48792488  771.5811 1.003435
## a[44] -0.23186229 0.5175352 -1.07112948  0.57904434  784.7659 1.002224
## a[45]  0.39104164 0.5355397 -0.47111360  1.22586776  668.1027 1.003798
## a[46] -0.76618988 0.5331292 -1.63843220  0.05404932  649.2345 1.003621
## a[47]  1.87126319 0.6679176  0.85941508  2.97225273  862.2204 1.002389
## a[48] -0.19608439 0.5146424 -1.01761495  0.64611184  615.6491 1.004452
## s[1]   0.19864398 0.4088957 -0.45383840  0.85172963  424.1757 1.006349
## s[2]  -0.11090639 0.3967157 -0.75130689  0.51440651  525.8826 1.005577
## a_1    1.30388746 0.4323892  0.62938924  2.00245929  495.1230 1.004066
## sigma  1.60836372 0.2106510  1.30389344  1.97904296 2155.6765 1.001668
# The models are essentially tied by WAIC: the differences are small relative to their standard errors. That matches the posteriors. The predation coefficient bp is estimated well away from zero, while the size effects s are small and straddle zero, so size contributes little predictive information. And whenever bp is in a model, sigma drops substantially, because predation absorbs variation that the tank intercepts otherwise have to explain; the intercepts adjust accordingly, so overall predictive accuracy barely changes.
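# To see the contrast directly, summarize just the treatment effects from the
# model with both main effects
post <- extract.samples(m_pred_size)
precis(data.frame(bp = post$bp, s1 = post$s[, 1], s2 = post$s[, 2]))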

13-3. Re-estimate the basic Reed frog varying intercept model, but now using a Cauchy distribution in place of the Gaussian distribution for the varying intercepts. That is, fit this model: \[\begin{align} s_i &\sim \text{Binomial}(n_i, p_i) \\ \text{logit}(p_i) &= \alpha_{\text{tank}[i]} \\ \alpha_{\text{tank}} &\sim \text{Cauchy}(\alpha, \sigma) \\ \alpha &\sim \text{Normal}(0, 1) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]

(You are likely to see many divergent transitions for this model. Can you figure out why? Can you fix them?) Plot and compare the posterior means of the intercepts, α_tank, to the posterior means produced in the chapter, using the customary Gaussian prior. Can you explain the pattern of differences? Take note of any change in the mean α as well.

# Same varying-intercept model, but with Cauchy-distributed intercepts.
# The heavy Cauchy tails produce divergent transitions; raising adapt_delta
# (forcing a smaller step size) removes most of them.
m_tank_cauchy <- ulam(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a[tank_],
    a[tank_] ~ dcauchy(a_1, sigma),
    a_1 ~ dnorm(0, 1.5),
    sigma ~ dexp(1)
  ),
  data = dat, chains = 4, cores = 4, log_lik = TRUE,
  iter = 2e3, control = list(adapt_delta = 0.99)
)
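# Count any remaining divergent transitions (the @stanfit slot holds the
# underlying rstan object for ulam fits)
rstan::check_hmc_diagnostics(m_tank_cauchy@stanfit)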

s_tank <- apply(extract.samples(m_tank)$a, 2, mean)
s_tank_cauchy <- apply(extract.samples(m_tank_cauchy)$a, 2, mean)
plot(s_tank, s_tank_cauchy,
  pch = 16, col = "blue",
  xlab = "Gaussian intercept", ylab = "Cauchy intercept"
)
abline(a = 0, b = 1, lty = 2)
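# A quick WAIC comparison of the Gaussian and Cauchy versions
compare(m_tank, m_tank_cauchy)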

# By WAIC the two models perform essentially the same, but the α_tank estimates differ for the extreme tanks. The Gaussian prior concentrates its mass near the mean, so it applies strong shrinkage and pulls large intercepts down toward α; the Cauchy prior's heavy tails apply much less shrinkage, so intercepts for the tanks with very high survival stay far more extreme. That is why the Cauchy estimates sit well above the Gaussian estimates on the right side of the plot.
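# The largest differences come from tanks where every tadpole survived, whose
# intercepts want to run off toward +Inf; count how many tanks that is
sum(d$surv == d$density)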

13-4. Now use a Student-t distribution with ν = 2 for the intercepts: \[\begin{align} \alpha_{\text{tank}} &\sim \text{Student}(2, \alpha, \sigma) \end{align}\]

Refer back to the Student-t example in Chapter 7 (page 234), if necessary. Plot and compare the resulting posterior to both the original model and the Cauchy model in 13-3. Can you explain the differences and similarities in shrinkage in terms of the properties of these distributions?

# Student-t (nu = 2) varying intercepts, fit with map2stan
m_student <- map2stan(
  alist(
    surv_ ~ dbinom(dens_, p),
    logit(p) <- a[tank_],
    a[tank_] ~ dstudent(2, a_1, sigma),
    a_1 ~ dnorm(0, 1),
    sigma ~ dcauchy(0, 1)
  ),
  data = dat, chains = 4, cores = 4, warmup = 1000,
  iter = 3000, control = list(adapt_delta = 0.99)
)

s_student <- apply(extract.samples(m_student)$a, 2, mean)

 
plot(s_tank, s_student,
  pch = 16, col = "red",
  xlab = "Gaussian intercept", ylab = "student-t intercept"
)
abline(a = 0, b = 1, lty = 2)

plot(s_student, s_tank_cauchy,
  pch = 16, col = "green",
  xlab = "student-t intercept", ylab = "Cauchy intercept"
)
abline(a = 0, b = 1, lty = 2)

# For the moderate tanks, the Student-t posterior means are nearly identical to both the Gaussian and Cauchy versions. For the extreme tanks the three priors separate: the Gaussian prior shrinks extreme intercepts the most, the Student-t (nu = 2) prior shrinks them less, and the Cauchy prior shrinks them least, so the Cauchy estimates are the most extreme and the Gaussian the least. The ordering follows directly from tail thickness: the heavier the prior's tails, the less surprised it is by large intercepts and the less it pulls them back toward the mean.
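# To visualize the tail behavior behind the shrinkage differences, overlay the
# three densities at a common location and scale (dstudent comes from the
# rethinking package; dnorm and dcauchy are base R)
curve(dnorm(x, 0, 1), from = -6, to = 6, ylab = "density", lty = 1)
curve(dstudent(x, 2, 0, 1), add = TRUE, lty = 2)
curve(dcauchy(x, 0, 1), add = TRUE, lty = 3)
legend("topright", lty = 1:3,
       legend = c("Normal(0, 1)", "Student-t(2, 0, 1)", "Cauchy(0, 1)"))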

13-5. Sometimes the prior and the data (through the likelihood) are in conflict, because they concentrate around different regions of parameter space. What happens in these cases depends a lot upon the shape of the tails of the distributions. Likewise, the tails of the distributions strongly influence whether outliers are shrunk towards the mean or not. I want you to consider four different models to fit to one observation at y = 0. The models differ only in the distributions assigned to the likelihood and prior. Here are the four models:

\[\begin{align} \text{Model NN:}\quad y &\sim \text{Normal}(\mu, 1) & \text{Model TN:}\quad y &\sim \text{Student}(2, \mu, 1) \\ \mu &\sim \text{Normal}(10, 1) & \mu &\sim \text{Normal}(10, 1) \\ \text{Model NT:}\quad y &\sim \text{Normal}(\mu, 1) & \text{Model TT:}\quad y &\sim \text{Student}(2, \mu, 1) \\ \mu &\sim \text{Student}(2, 10, 1) & \mu &\sim \text{Student}(2, 10, 1) \end{align}\]

Estimate and plot the posterior distributions against the likelihoods for these models and compare them. Can you explain the results, using the properties of the distributions?
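A minimal sketch of the setup the question describes, fitting a single observation at y = 0 with the same ulam interface used above (the names data_y and m_NN are illustrative; the other three models differ only in swapping dnorm and dstudent between the likelihood and the prior):

# Single observation at y = 0, as the question specifies (sketch)
data_y <- list(y = 0)

# Model NN (sketch): Normal likelihood, Normal prior
m_NN <- ulam(
  alist(
    y ~ dnorm(mu, 1),
    mu ~ dnorm(10, 1)
  ),
  data = data_y, chains = 4, cores = 4, iter = 2e3
)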

data(bangladesh)
d <- bangladesh
library(dplyr)
d$district_id <- as.integer(as.factor(d$district))
d$use_contraception <- d$use.contraception
d$age_centered <- d$age.centered
d1 <- select(d,-use.contraception, -age.centered, -district)

# Model NN: Normal likelihood, Normal prior
m13m6.1 <- map2stan(
  alist(
    pulled_left ~ dnorm(mu, 1),
    mu ~ dnorm(10, 1)
  ),
  data = d1, iter = 2000, chains = 1, cores = 1
)
precis(m13m6.1)
##                 mean       sd     5.5%    94.5%    n_eff     Rhat4
## pulled_left 9.940927 1.357999 7.899563 12.11358 460.2006 0.9999645
## mu          9.935908 1.020641 8.259513 11.56642 370.3153 0.9992017
# Model TN: Student-t likelihood, Normal prior
m13m6.2 <- map2stan(
  alist(
    pulled_left ~ dstudent(2, mu, 1),
    mu ~ dnorm(10, 1)
  ),
  data = d1, iter = 2000, chains = 1, cores = 1
)
precis(m13m6.2)
##                 mean        sd     5.5%    94.5%    n_eff    Rhat4
## pulled_left 9.937625 2.1624976 6.830660 12.93576 561.0197 1.004583
## mu          9.922028 0.9641712 8.418448 11.38846 433.0664 1.002075
# Model NT: Normal likelihood, Student-t prior
m13m6.3 <- map2stan(
  alist(
    pulled_left ~ dnorm(mu, 1),
    mu ~ dstudent(2, 10, 1)
  ),
  data = d1, iter = 2000, chains = 1, cores = 1
)
precis(m13m6.3)
##                 mean       sd     5.5%    94.5%    n_eff     Rhat4
## pulled_left 10.06413 2.153756 7.172677 13.28855 312.9945 0.9996861
## mu          10.04764 1.929205 7.468338 12.75254 320.0405 1.0001757
# Model TT: Student-t likelihood, Student-t prior
m13m6.4 <- map2stan(
  alist(
    pulled_left ~ dstudent(2, mu, 1),
    mu ~ dstudent(2, 10, 1)
  ),
  data = d1, iter = 2000, chains = 1, cores = 1
)
precis(m13m6.4)
##                 mean       sd     5.5%    94.5%    n_eff     Rhat4
## pulled_left 9.751712 3.019636 5.569520 13.64759 344.0565 1.0022427
## mu          9.939943 1.681568 7.437044 12.27910 373.9948 0.9996087
post_m13m6.1 <- extract.samples(m13m6.1)
post_m13m6.2 <- extract.samples(m13m6.2)
post_m13m6.3 <- extract.samples(m13m6.3)
post_m13m6.4 <- extract.samples(m13m6.4)

# Overlay the posterior densities of mu from the four models
dens(post_m13m6.2$mu, col = "blue")
dens(post_m13m6.1$mu, col = "red", add = TRUE)
dens(post_m13m6.3$mu, col = "black", add = TRUE)
dens(post_m13m6.4$mu, col = "purple", add = TRUE)
abline(v = 0.6, col = "grey")
legend("topright",
       col = c("red", "blue", "black", "purple"),
       pch = 19,
       legend = c("Model NN", "Model TN", "Model NT", "Model TT"))

# When the likelihood and the prior have the same distributional form (NN and TT), neither dominates and the posterior is a roughly symmetric compromise between the data and the prior. When the tails differ, the thin-tailed component wins: a Normal distribution treats values far from its center as nearly impossible, while the Student-t, with its heavier tails, can accommodate them cheaply. So a heavy-tailed likelihood with a thin-tailed prior (TN) leaves the posterior near the prior, a heavy-tailed prior with a thin-tailed likelihood (NT) lets the posterior move toward the data, and heavier tails in general mean less shrinkage toward the mean.
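For reference, Model NN is a conjugate Normal-Normal update with known variance, so the posterior for μ given the single observation y = 0 has a closed form: \[\begin{align} \mu \mid y = 0 &\sim \text{Normal}\!\left(\frac{1 \cdot 0 + 1 \cdot 10}{1 + 1},\ \frac{1}{\sqrt{1 + 1}}\right) \approx \text{Normal}(5,\ 0.71), \end{align}\] an even compromise between the observation and the prior mean, which is exactly what matching thin tails on both sides produce.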