Chapter 11 - God Spiked the Integers

This chapter described some of the most common generalized linear models, those used to model counts. It is important to never convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that parameters are on a different scale, typically log-odds (for binomial) or log-rate (for Poisson), than the outcome variable they describe. Therefore computing implied predictions is even more important than before.

Place each answer inside the code chunk (grey box). The code chunks should contain a text response or code that completes/answers the question or activity requested. Make sure to include plots if the question requests them.

Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html and publish the assignment to your R Pubs account and submit the link to Canvas. Each question is worth 5 points.

Questions

11-1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?

# The likelihood differs because the aggregated binomial counts must account for all of the orders, or permutations, in which the successes could have occurred; the disaggregated counts, in 0/1 form, do not have to cope with order.

# As in the example shown in the chapter, when calculating the likelihood of the aggregated data, dbinom(6, 9, 0.2) contains a term for all the orders in which the 6 successes could appear across the 9 trials.

\[ \Pr(6 \mid 9, p) = \frac{9!}{6!(9-6)!} \, p^6 (1-p)^{9-6} \]

# The fraction in front is the multiplicity: it counts all the ways 6 successes could be ordered among 9 trials.
# With disaggregated data, when we instead split the 6 successes into 9 separate 0/1 trials, as in a logistic regression, there is no multiplicity term to compute. The joint probability of all 9 trials is instead:

\[ p^6(1-p)^{9-6} \]

# So the aggregated data carry an extra constant in front to handle all the permutations. This doesn’t influence inference, because the multiplicity constant isn’t a function of the parameter p, but it does change the magnitude of the likelihood and log-likelihood.
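
# A quick numerical check (a sketch, using p = 0.2 as in the chapter): the two
# likelihoods differ exactly by the multiplicity constant choose(9, 6) = 84.
p <- 0.2
dbinom( 6 , size=9 , prob=p )                              # aggregated: includes choose(9,6)
prod( dbinom( c(1,1,1,1,1,1,0,0,0) , size=1 , prob=p ) )   # disaggregated: p^6 * (1-p)^3
dbinom( 6 , 9 , p ) / ( p^6 * (1-p)^3 )                    # recovers the constant: 84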

11-2. Use quap to construct a quadratic approximate posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Plot and compare the quadratic approximation to the posterior distribution produced instead from MCMC. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Plot and compare the posterior distributions. Do the differences increase or decrease? Why?

# Trim the data
library(rethinking)  # provides the chimpanzees data plus quap, ulam, and plotting helpers
data(chimpanzees)
d <- chimpanzees
d$treatment <- 1 + d$prosoc_left + 2 * d$condition
dat_list <- list(
  pulled_left = d$pulled_left,
  actor = d$actor,
  treatment = as.integer(d$treatment)
)

# Re-run the MCMC model in Chapter m11.4
m11.4 <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dat_list, chains = 4, log_lik = TRUE
)
precis(m11.4, depth=2)
##             mean        sd        5.5%       94.5%     n_eff    Rhat4
## a[1] -0.45513694 0.3068005 -0.92979013  0.04748336  759.9765 1.007343
## a[2]  3.88539594 0.7390949  2.78130210  5.09493955 1021.8760 0.998933
## a[3] -0.74248183 0.3326749 -1.27739978 -0.21252352  811.5868 1.007643
## a[4] -0.74512021 0.3276820 -1.26946919 -0.21583817  799.2258 1.004049
## a[5] -0.44608332 0.3220214 -0.95628677  0.06098074  696.1487 1.007831
## a[6]  0.47457640 0.3267403 -0.06014964  0.98544512  814.9036 1.005584
## a[7]  1.96411775 0.4254917  1.30223185  2.66761418 1023.1440 1.003967
## b[1] -0.04523939 0.2825148 -0.49499205  0.42093847  758.9986 1.006363
## b[2]  0.48241582 0.2749191  0.03314098  0.90997321  714.9600 1.005894
## b[3] -0.38636441 0.2817833 -0.83543661  0.07073891  738.4869 1.006607
## b[4]  0.36301423 0.2774921 -0.10320245  0.81966126  698.1003 1.006143
#  quadratic approximation model m11.4_quap
m11.4_quap <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dat_list
)

precis(m11.4_quap, depth=2)
##             mean        sd        5.5%       94.5%
## a[1] -0.43921686 0.3276018 -0.96278784  0.08435412
## a[2]  3.70603635 0.7217449  2.55254861  4.85952409
## a[3] -0.73274635 0.3329754 -1.26490540 -0.20058730
## a[4] -0.73274595 0.3329754 -1.26490498 -0.20058692
## a[5] -0.43920356 0.3276017 -0.96277427  0.08436715
## a[6]  0.46894877 0.3317748 -0.06129139  0.99918893
## a[7]  1.90507072 0.4136449  1.24398620  2.56615524
## b[1] -0.04065793 0.2837319 -0.49411629  0.41280043
## b[2]  0.47214276 0.2842163  0.01791029  0.92637523
## b[3] -0.37835225 0.2852441 -0.83422742  0.07752291
## b[4]  0.36201959 0.2838328 -0.09160003  0.81563921
coeftab_plot(coeftab(m11.4, m11.4_quap),cex=0.6)

# The posterior distributions from the quadratic approximation (m11.4_quap) and the MCMC model (m11.4) are quite similar, except for the second intercept, a[2]. While this makes almost no difference for estimating the treatment effects, the quadratic approximation understates the uncertainty about actor 2’s handedness. That is because the quadratic approximation enforces symmetric uncertainty around the posterior mode, while the nature of logistic regression often produces asymmetric uncertainty. This can be seen by comparing the marginal posterior distributions for a[2] below:

set.seed(1999)
post <- extract.samples(m11.4)
post_quap <- extract.samples(m11.4_quap)
dens(post$a[, 2], lwd = 2, legend=TRUE)
dens(post_quap$a[, 2], add = TRUE, lwd = 2, col = rangi2)

# The blue line is the marginal posterior distribution of a[2] from the quadratic approximation; the black line is the marginal posterior from MCMC. The long right tail in the MCMC posterior arises because any sufficiently large value of a[2] implies a probability of essentially 1 for the event, which is consistent with the data: actor 2 always pulled the left lever.
# Overall, both fits capture the strong preference, but the quadratic approximation cannot represent the asymmetry in uncertainty, whereas MCMC has no trouble recognizing it.
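
# One way to quantify the asymmetry (a sketch, reusing the samples drawn above):
# the distance from the mean to the 89% interval endpoints is lopsided for MCMC
# but symmetric by construction for quap.
PI( post$a[,2] , prob=0.89 ) - mean( post$a[,2] )            # MCMC: upper gap larger
PI( post_quap$a[,2] , prob=0.89 ) - mean( post_quap$a[,2] )  # quap: gaps roughly equal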
# Change the prior on a to Normal(0, 10)

# ulam model m11.4_np
m11.4_np <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 10),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dat_list, chains = 4, log_lik = TRUE
)
precis(m11.4_np, depth=2)
##            mean        sd        5.5%       94.5%    n_eff    Rhat4
## a[1] -0.3422740 0.3519298 -0.88562726  0.23078014 405.2034 1.009166
## a[2] 11.0437237 5.1047765  4.93580343 20.18864516 746.9067 1.002515
## a[3] -0.6432142 0.3534593 -1.18537666 -0.06334683 390.1187 1.003280
## a[4] -0.6501124 0.3649075 -1.20498426 -0.06876005 346.9239 1.005320
## a[5] -0.3412372 0.3525298 -0.88530284  0.23189752 324.7699 1.005100
## a[6]  0.6160937 0.3587410  0.05886377  1.19590504 416.8038 1.007300
## a[7]  2.2118114 0.4585261  1.51142424  2.96248649 656.3452 1.001118
## b[1] -0.1658482 0.3063649 -0.65889012  0.31921410 378.4970 1.006973
## b[2]  0.3714890 0.3046392 -0.13542783  0.86011105 331.7466 1.006865
## b[3] -0.5174327 0.3135090 -1.02446972 -0.02012784 409.3005 1.005154
## b[4]  0.2554701 0.2994190 -0.20974984  0.71989385 337.7104 1.008124
#  quadratic approximation model m11.4_quap_np
m11.4_quap_np <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 10),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dat_list
)

precis(m11.4_quap_np, depth=2)
##            mean        sd        5.5%        94.5%
## a[1] -0.3519849 0.3477652 -0.90778094  0.203811072
## a[2]  6.9924561 3.5461790  1.32497713 12.659935090
## a[3] -0.6546508 0.3537050 -1.21993968 -0.089361832
## a[4] -0.6546404 0.3537047 -1.21992887 -0.089351883
## a[5] -0.3520059 0.3477655 -0.90780237  0.203790537
## a[6]  0.5811998 0.3522753  0.01819581  1.144203809
## a[7]  2.1186550 0.4523601  1.39569615  2.841613862
## b[1] -0.1418852 0.3011498 -0.62318077  0.339410386
## b[2]  0.3815926 0.3009912 -0.09944948  0.862634767
## b[3] -0.4901647 0.3030962 -0.97457094 -0.005758549
## b[4]  0.2695532 0.3007596 -0.21111867  0.750225160
coeftab_plot(coeftab(m11.4_np, m11.4_quap_np),cex=0.6)

set.seed(1999)
post <- extract.samples(m11.4_np)
post_quap <- extract.samples(m11.4_quap_np)
dens(post$a[, 2], lwd = 2, legend=TRUE)
dens(post_quap$a[, 2], add = TRUE, lwd = 2, col = rangi2)

# Re-estimating the posterior with the relaxed prior a ~ Normal(0, 10), the difference between the posterior distributions of a[2] increases. The relaxed prior is too flat: a flat prior on the logit scale is not a flat prior on the outcome probability scale, and in fact it produces a very non-flat prior distribution on the outcome scale. A prior this unskeptical of large differences can also lead to overfitting.
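
# Prior predictive sketch: pushing a "flat" Normal(0, 10) prior through the
# inverse-logit piles almost all prior mass near probabilities 0 and 1.
set.seed(11)
a_flat <- rnorm( 1e4 , 0 , 10 )
dens( inv_logit(a_flat) , xlab="prior probability of pulled_left" )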

11-3. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. What changes do you observe?

data(Kline)
# drop Hawaii
d <- Kline[ Kline$culture != "Hawaii" , ]
d
##      culture population contact total_tools mean_TU
## 1   Malekula       1100     low          13     3.2
## 2    Tikopia       1500     low          22     4.7
## 3 Santa Cruz       3600     low          24     4.0
## 4        Yap       4791    high          43     5.0
## 5   Lau Fiji       7400    high          33     5.0
## 6  Trobriand       8000    high          19     4.0
## 7      Chuuk       9200    high          40     3.8
## 8      Manus      13000     low          28     6.6
## 9      Tonga      17500    high          55     5.4
d$P <- scale( log(d$population) )
d$contact_id <- ifelse( d$contact=="high" , 2 , 1 )

# geocentric model
dat <- list(
  T = d$total_tools ,
  P = d$P ,
  cid = d$contact_id )


m11.10 <- ulam(
  alist(
    T ~ dpois( lambda ),
    log(lambda) <- a[cid] + b[cid]*P,
    a[cid] ~ dnorm( 3 , 0.5 ),
    b[cid] ~ dnorm( 0 , 0.2 )
    ), data=dat , chains=4 , log_lik=TRUE )
k <- PSIS( m11.10 , pointwise=TRUE )$k
plot( dat$P , dat$T , xlab="log population (std)" , ylab="total tools" ,
      col=rangi2 , pch=ifelse( dat$cid==1 , 1 , 16 ) , lwd=2 ,
      ylim=c(0,75) , cex=1+normalize(k) )

# set up the horizontal axis values to compute predictions at
ns <- 100
P_seq <- seq( from=min(dat$P)-0.15 , to=max(dat$P)+0.15 , length.out=ns )

# predictions for cid=1 (low contact)
lambda <- link( m11.10 , data=data.frame( P=P_seq , cid=1 ) )
lmu <- apply( lambda , 2 , mean )
lci <- apply( lambda , 2 , PI, prob=0.89 )
lines( P_seq , lmu , lty=2 , lwd=1.5 )
shade( lci , P_seq , xpd=TRUE )

# predictions for cid=2 (high contact)
lambda <- link( m11.10 , data=data.frame( P=P_seq , cid=2 ) )
lmu <- apply( lambda , 2 , mean )
lci <- apply( lambda , 2 , PI, prob=0.89 )
lines( P_seq , lmu , lty=1 , lwd=1.5 )
shade( lci , P_seq , xpd=TRUE )

# Plot with population
plot( d$population , d$total_tools , xlab="population" , ylab="total tools" ,
col=rangi2 , pch=ifelse( dat$cid==1 , 1 , 16 ) , lwd=2 ,
ylim=c(0,75) , cex=1+normalize(k) )
ns <- 100
# sequence over the natural population scale; it is converted back to the
# standardized log scale inside link() below
pop_seq <- seq( from=min(d$population) , to=max(d$population) , length.out=ns )
lambda <- link( m11.10 , data=data.frame(P=(log(pop_seq) - mean(log(d$population)))/sd(log(d$population)), cid=1 ) )
lmu <- apply( lambda , 2 , mean )
lci <- apply( lambda , 2 , PI, prob=0.89 )
lines( pop_seq , lmu , lty=2 , lwd=1.5 )
shade( lci , pop_seq , xpd=TRUE )

lambda <- link( m11.10 , data=data.frame( P=(log(pop_seq) - mean(log(d$population)))/sd(log(d$population)) , cid=2 ) )
lmu <- apply( lambda , 2 , mean )
lci <- apply( lambda , 2 , PI, prob=0.89 )
lines( pop_seq , lmu , lty=1 , lwd=1.5 )
shade( lci , pop_seq , xpd=TRUE )

# Hawaii was identified as a highly influential point in the original model in the chapter: it has an extreme population size and the most tools, compared to the other islands.
# Before we dropped the point, the heavy influence of Hawaii, a low-contact island, produced a silly pattern: the trend for societies with high contact (solid) sat above the trend for societies with low contact (dashed) at small population sizes, but fell below it at large population sizes. That does not make sense, because a counterfactual Hawaii with the same population size but high contact should theoretically have at least as many tools as the real Hawaii, not fewer.
# After we drop Hawaii, the remaining high- and low-contact societies span similar population ranges, so the model has a better idea of where both trends go for population sizes between 0 and 20,000. The pattern now looks reasonable: the trend for societies with high contact (solid) is above the trend for societies with low contact (dashed) at every population size.
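
# A quick look at the raw data (a sketch) confirms how extreme Hawaii is: by far
# the largest population and the most tools, yet coded as low contact.
Kline[ which.max(Kline$population) , c("culture","population","contact","total_tools") ]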

11-4. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.

# 1. Intercept-only model (a single common intercept):
m11.4_intercept <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(0, 10)
  ),
  data = dat_list
)

# 2. Common intercept plus treatment model:
m11.4_int_treat <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + b[treatment],
    a ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dat_list
)

# 3. Individual actor intercepts plus treatment model:
m11.4 <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dat_list, chains = 4, log_lik = TRUE
)
compare(m11.4_intercept, m11.4_int_treat, m11.4)
##                     WAIC        SE    dWAIC      dSE     pWAIC       weight
## m11.4           531.8807 18.944167   0.0000       NA 8.3230644 1.000000e+00
## m11.4_int_treat 682.5002  9.020159 150.6195 18.42176 3.6351440 1.965154e-33
## m11.4_intercept 687.9330  7.139779 156.0522 18.99391 0.9964187 1.299234e-34
# plot(compare(m11.4_intercept, m11.4_int_treat, m11.4))

# The WAIC table shows clearly that the model accounting for individual intercepts as well as treatment effects (m11.4) outperforms the simpler models, and m11.4 also takes essentially all of the model weight. This indicates that variation among actors matters far more for prediction than variation arising from treatment. Since only m11.4 recognizes the variation among actors, it is expected to out-predict the simpler models even after accounting for its larger effective-parameter penalty (pWAIC).
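
# A raw-data sketch of why the actor intercepts matter so much: the proportion
# of left pulls varies far more across actors than across treatments.
tapply( dat_list$pulled_left , dat_list$actor , mean )
tapply( dat_list$pulled_left , dat_list$treatment , mean )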

11-5. The data contained in data(salamanders) are counts of salamanders (Plethodon elongatus) from 47 different 49-m² plots in northern California. The column SALAMAN is the count in each plot, and the columns PCTCOVER and FORESTAGE are percent of ground cover and age of trees in the plot, respectively. You will model SALAMAN as a Poisson variable. (a) Model the relationship between density and percent cover, using a log-link (same as the example in the book and lecture). Use weakly informative priors of your choosing. Check the quadratic approximation again, by comparing quap to ulam. Then plot the expected counts and their 89% interval against percent cover. In which ways does the model do a good job? A bad job? (b) Can you improve the model by using the other predictor, FORESTAGE? Try any models you think useful. Can you explain why FORESTAGE helps or does not help with prediction?

# (a) 

# Load the data
data(salamanders)
d <- salamanders
d$PCTCOVER_std <- standardize(d$PCTCOVER)
d$FORESTAGE_std <- standardize(d$FORESTAGE)

# Construct the model formula:
f <- alist(
  SALAMAN ~ dpois(lambda),
  log(lambda) <- a + b*PCTCOVER_std,
  a ~ dnorm(0,1),
  b ~ dnorm(0,0.5)
)

# Prior simulation check:
# Prior for exp(a), the expected count at average cover, implied by a ~ Normal(0, 1):
curve( dlnorm( x , 0 , 1 ) , from=0 , to=100 , n=200, xlab = "mean number of salamanders", ylab="density" )

# Prior check for lambda (mean number of salamanders):

par(mfrow = c(1,2))
set.seed(10)
N <- 100
a <- rnorm( N , 0 , 1 )
b <- rnorm( N , 0 , 0.5 )
plot( NULL , xlim=c(-2,2) , ylim=c(0,100), xlab = "PCTCOVER_std", ylab="number of salamanders" )
for ( i in 1:N ) curve( exp( a[i] + b[i]*x ) , add=TRUE , col=grau() )
mtext("a~dnorm(0,1), b~dnorm(0,0.5)")


# The same priors over a deliberately exaggerated predictor range, to show how
# quickly exp() explodes outside the range of the data:
x_seq <- seq(from = 0, to = 20, length.out = 100)
lambda <- sapply(x_seq, function(x) exp(a + b*x))
plot(NULL, xlim = range(x_seq), ylim = c(0,500), xlab = "predictor (std scale, extended)", ylab = "number of salamanders")
for ( i in 1:N ) lines(x_seq, lambda[i,],col=grau(), lwd=1.5)
mtext("a~dnorm(0,1), b~dnorm(0,0.5)")

# Build ulam model
dat <- list(
  PCTCOVER_std = d$PCTCOVER_std,
  SALAMAN = d$SALAMAN
)

m11.5_ulam <- ulam(
  f, data=dat, chains=4, log_lik = TRUE)  # use the trimmed list built above
m11.5_quap <- quap(
  f, data=dat
)
coeftab_plot(coeftab(m11.5_ulam, m11.5_quap),cex=0.6)

# Comparing the posterior distributions from ulam and quap, the results are quite similar, with no meaningful differences.

# Plot the counts and interval
plot(dat$PCTCOVER_std, dat$SALAMAN,
  col = rangi2, lwd = 2,
  xlab = "PCTCOVER(standardized)", ylab = "salamanders observed"
)
p_seq <- seq(min(dat$PCTCOVER_std)-0.15, to = max(dat$PCTCOVER_std)+0.15, length.out = 100)
l <- link(m11.5_ulam, data = list(PCTCOVER_std=p_seq))
lines(p_seq, colMeans(l))
shade(apply(l, 2, PI,prob=0.89), p_seq)

# Good job: the plot shows a clear relationship between cover and density: the more cover, the more salamanders on average.
# Bad job: the observed data seem to fall into two clusters. Plots with below-average cover all have uniformly low counts, with no obvious increasing trend, while plots with higher cover have much larger and much more variable counts. The model does not really capture this pattern, nor the large spread at high cover.
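
# Posterior predictive sketch: simulate counts at the observed plots and check
# how often the data fall inside the 89% prediction interval.
sim_counts <- sim( m11.5_ulam , data=dat )
pred_PI <- apply( sim_counts , 2 , PI , prob=0.89 )
mean( dat$SALAMAN >= pred_PI[1,] & dat$SALAMAN <= pred_PI[2,] )  # empirical coverage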
# (b)
f2 <- alist(
  SALAMAN ~ dpois(lambda),
  log(lambda) <- a + bC * PCTCOVER_std + bA * FORESTAGE_std,
  a ~ dnorm(0, 1),
  bC ~ dnorm(0, 0.5),
  bA ~ dnorm(0, 0.5)
)

dat2<- list(
  PCTCOVER_std = d$PCTCOVER_std,
  FORESTAGE_std = d$FORESTAGE_std,
  SALAMAN = d$SALAMAN
)
m11.5_ulam_2 <- ulam(f2, data = dat2, chains = 4, log_lik = TRUE)
precis(m11.5_ulam_2)
##          mean         sd       5.5%     94.5%     n_eff    Rhat4
## a  0.48715720 0.13861201  0.2637263 0.7011763  747.1388 1.006676
## bC 1.03650261 0.17332808  0.7778636 1.3327213  889.3690 1.003758
## bA 0.01326378 0.09539204 -0.1445817 0.1628367 1052.3664 1.000059
precis_plot(precis(m11.5_ulam_2))

# Marginal Posterior
plot(dat2$PCTCOVER_std, dat2$SALAMAN,
  col = rangi2, lwd = 2,
  xlab = "PCTCOVER(standardized)", ylab = "salamanders observed"
)
p_seq <- seq(from = min(dat2$PCTCOVER_std)-0.15, to = max(dat2$PCTCOVER_std)+0.15, length.out = 100)
l <- link(m11.5_ulam_2, data = data.frame(PCTCOVER_std=p_seq, FORESTAGE_std = 0))
lines(p_seq, colMeans(l))
shade(apply(l, 2, PI,prob=0.89), p_seq)

plot(dat2$FORESTAGE_std, dat2$SALAMAN,
  col = rangi2, lwd = 2,
  xlab = "FORESTAGE(standardized)", ylab = "salamanders observed"
)
p_seq <- seq(from = min(dat2$FORESTAGE_std)-0.15, to = max(dat2$FORESTAGE_std)+0.15, length.out = 100)
l <- link(m11.5_ulam_2, data = data.frame(FORESTAGE_std=p_seq, PCTCOVER_std = 0))
lines(p_seq, colMeans(l))
shade(apply(l, 2, PI,prob=0.89), p_seq)

# From the results, the estimate for bA is nearly zero, with a narrow interval around it. So there isn’t much association between forest age and salamander density while controlling for percent cover. This is probably because the two predictors are correlated: whatever predictive value forest age has arises mainly through its correlation with percent cover. Since the model already conditions on percent cover, forest age adds little additional information.
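
# Supporting checks (a sketch): the two predictors are correlated, and a WAIC
# comparison should show little predictive gain from adding FORESTAGE.
cor( d$PCTCOVER , d$FORESTAGE )
compare( m11.5_ulam , m11.5_ulam_2 )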