Chapter 11 - God Spiked the Integers

This chapter described some of the most common generalized linear models, those used to model counts. It is important to never convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that parameters are on a different scale, typically log-odds (for binomial) or log-rate (for Poisson), than the outcome variable they describe. Therefore computing implied predictions is even more important than before.
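For example, a quick base-R illustration of moving from the parameter scale back to the outcome scale:

plogis(1.5)   # binomial: a log-odds of 1.5 implies a probability of about 0.82
exp(1.5)      # Poisson: a log-rate of 1.5 implies an expected count of about 4.48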

Place each answer inside the code chunk (grey box). The code chunks should contain a text response or code that completes/answers the question or activity requested. Make sure to include plots if the question requests them.

Finally, upon completion, name your final output .html file as YourName_ANLY505-Year-Semester.html, publish the assignment to your RPubs account, and submit the link to Canvas. Each question is worth 5 points.

Questions

11-1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?

# Aggregated and disaggregated forms carry the same information about the
# parameters, so inference is unchanged. The likelihoods differ because the
# aggregated binomial likelihood includes the multiplicity term choose(n, k),
# which counts all the orderings in which k successes could occur across n
# trials, while the disaggregated (Bernoulli) likelihood scores one particular
# ordering and contains no such term. Because choose(n, k) does not depend on
# the parameters, it only rescales the posterior by a constant that drops out
# after normalization. It does, however, change the value of the likelihood
# itself, which is also why information criteria computed on the two formats
# are not directly comparable.
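# A minimal numeric check of this point (a sketch with made-up values): the
# two log-likelihoods differ by exactly the log multiplicity term lchoose(n, k).
p <- 0.6; n <- 9; k <- 6                              # hypothetical probability and counts
ll_aggregated    <- dbinom(k, size = n, prob = p, log = TRUE)
ll_disaggregated <- sum(dbinom(c(rep(1, k), rep(0, n - k)), size = 1, prob = p, log = TRUE))
ll_aggregated - ll_disaggregated                      # equals lchoose(9, 6) = log(84)
lchoose(n, k)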

11-2. Use quap to construct a quadratic approximate joint posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Plot and compare the quadratic approximation to the joint posterior distribution produced instead from MCMC. Do not use the ‘pairs’ plot. Can you explain both the differences and the similarities between the approximate and the MCMC distributions?

Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Plot and compare the posterior distributions. Do not use the ‘pairs’ plot. Do the differences increase or decrease? Why?

# rethinking provides quap, ulam, precis, extract.samples, inv_logit, logit,
# and dens, all of which are used throughout this assignment.
library(rethinking)

data("chimpanzees")
d <-  chimpanzees
d$recipient <- NULL

# Construct a quadratic approximate joint posterior distribution
model_11.2 <- quap(
                alist(
                  pulled_left ~ dbinom(1,p),
                  logit(p) <- a[actor] + (bp + bpC * condition)*prosoc_left,
                  a[actor] ~ dnorm(0, 1.5),
                  bp ~ dnorm(0,10),
                  bpC ~ dnorm(0,10)
                ),
                data = d
              )

precis(model_11.2, depth=2)
##            mean        sd       5.5%      94.5%
## a[1] -0.7073140 0.2628184 -1.1273486 -0.2872795
## a[2]  3.5021260 0.7156040  2.3584525  4.6457994
## a[3] -1.0026167 0.2716336 -1.4367396 -0.5684938
## a[4] -1.0025466 0.2716310 -1.4366654 -0.5684278
## a[5] -0.7072851 0.2628177 -1.1273186 -0.2872516
## a[6]  0.2077845 0.2620126 -0.2109622  0.6265312
## a[7]  1.6564220 0.3603289  1.0805467  2.2322972
## bp    0.8197491 0.2556820  0.4111200  1.2283783
## bpC  -0.1297303 0.2944443 -0.6003091  0.3408485
# Visualize joint posterior
post <- extract.samples(model_11.2)
p_left <- inv_logit(post$a)
plot(precis(as.data.frame(p_left)) , xlim=c(0,1))

# Construct a MCMC joint posterior distribution
model_11.2_stan <- ulam(
                        alist(
                          pulled_left ~ dbinom(1, p),
                          logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left,
                          a[actor] ~ dnorm(0, 1.5),
                          bp ~ dnorm(0, 10),
                          bpC ~ dnorm(0, 10)
                        ),
                        data=d, chains=2, iter=2500, warmup=500
                    )
## Running MCMC with 2 sequential chains, with 1 thread(s) per chain...
## Chain 1 finished in 1.8 seconds.
## Chain 2 finished in 1.8 seconds.
## Both chains finished successfully.
## Mean chain execution time: 1.8 seconds.
## Total execution time: 3.9 seconds.
precis(model_11.2_stan, depth=2)
##            mean        sd       5.5%      94.5%    n_eff     Rhat4
## a[1] -0.7182017 0.2689244 -1.1570976 -0.2914097 4005.647 1.0003926
## a[2]  3.6895678 0.7522819  2.6078535  4.9826093 3281.081 1.0010617
## a[3] -1.0130333 0.2741159 -1.4532987 -0.5737669 3667.987 0.9998836
## a[4] -1.0210778 0.2769099 -1.4762988 -0.5935663 4333.402 1.0002458
## a[5] -0.7177741 0.2627250 -1.1444226 -0.3012562 4767.985 1.0003881
## a[6]  0.2141878 0.2646526 -0.2005260  0.6278846 4259.104 1.0004642
## a[7]  1.7018950 0.3741498  1.1269594  2.3188768 4655.145 0.9996224
## bp    0.8349202 0.2617422  0.4176912  1.2396860 2816.944 0.9998250
## bpC  -0.1353687 0.3027293 -0.6184692  0.3448239 3506.118 1.0000294
# Visualize joint posterior
post <- extract.samples(model_11.2_stan)
p_left <- inv_logit(post$a)
plot(precis(as.data.frame(p_left)) , xlim=c(0,1))

# Comparison plot
quap_post <- extract.samples(model_11.2)
stan_post <- extract.samples(model_11.2_stan)

par(mfrow=c(1,2))
dens(quap_post$a[,2], main="Quadratic Approximation for \n Actor 2")
dens(stan_post$a[,2], main="MCMC Approximation for \n Actor 2")

# Comparing the quap and MCMC joint posteriors, the estimates for bp and bpC
# are nearly identical, and most actor intercepts agree closely as well. The
# visible difference is actor 2, who pulled left on every trial: the MCMC
# posterior for a[2] is noticeably right-skewed, while quap is symmetric by
# construction. The data cannot bound a[2] from above, so the true posterior
# has a long right tail that a Gaussian approximation cannot capture.
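# A small follow-up sketch: overlaying the two posteriors for a[2] on one
# panel makes the tail difference easier to see (rangi2 is a rethinking color).
par(mfrow=c(1,1))
dens(stan_post$a[,2], lwd=2, xlab="a[2] (log-odds)",
     main="Actor 2: MCMC (black) vs quap (blue)")
dens(quap_post$a[,2], lwd=2, col=rangi2, add=TRUE)
# If the MCMC posterior is right-skewed, its upper quantiles should sit
# farther from the median than quap's:
quantile(stan_post$a[,2], c(0.05, 0.5, 0.95))
quantile(quap_post$a[,2], c(0.05, 0.5, 0.95))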

# Relaxed actor intercept using quap
model_11.2_relax_quap <- quap(
                          alist(
                            pulled_left ~ dbinom(1,p),
                            logit(p) <- a[actor] + (bp + bpC * condition)*prosoc_left,
                            a[actor] ~ dnorm(0,10),
                            bp ~ dnorm(0,10),
                            bpC ~ dnorm(0,10)
                          ),
                          data = d
                        )

precis(model_11.2_relax_quap, depth=2)
##            mean        sd       5.5%      94.5%
## a[1] -0.7261782 0.2684852 -1.1552694 -0.2970870
## a[2]  6.6754210 3.6126519  0.9017055 12.4491364
## a[3] -1.0309218 0.2784264 -1.4759010 -0.5859427
## a[4] -1.0309223 0.2784264 -1.4759014 -0.5859432
## a[5] -0.7261792 0.2684852 -1.1552704 -0.2970879
## a[6]  0.2127797 0.2670015 -0.2139404  0.6394997
## a[7]  1.7545536 0.3845072  1.1400368  2.3690704
## bp    0.8221297 0.2610079  0.4049887  1.2392707
## bpC  -0.1318294 0.2969351 -0.6063890  0.3427302
# Visualize joint posterior
post <- extract.samples(model_11.2_relax_quap)
p_left <- inv_logit(post$a)
plot(precis(as.data.frame(p_left)) , xlim=c(0,1))

# Relaxed actor intercept using ulam
model_11.2_relax_ulam <- ulam(
                            alist(
                              pulled_left ~ dbinom(1,p),
                              logit(p) <- a[actor] + (bp + bpC * condition)*prosoc_left,
                              a[actor] ~ dnorm(0,10),
                              bp ~ dnorm(0,10),
                              bpC ~ dnorm(0,10)
                            ),
                            data = d
                          )
## Running MCMC with 1 chain, with 1 thread(s) per chain...
## Chain 1 finished in 1.0 seconds.
precis(model_11.2_relax_ulam, depth=2)
##            mean        sd       5.5%      94.5%    n_eff     Rhat4
## a[1] -0.7561307 0.2580807 -1.1947142 -0.3705990 451.3323 0.9988888
## a[2] 11.0498517 5.9714631  4.6073502 21.6115455 233.4800 1.0006052
## a[3] -1.0598802 0.2906747 -1.5529763 -0.5685164 450.8404 0.9997604
## a[4] -1.0470657 0.2740113 -1.4675208 -0.5833726 396.5901 0.9998241
## a[5] -0.7563852 0.2617926 -1.1820171 -0.3375344 360.8891 1.0001453
## a[6]  0.2170449 0.2745060 -0.2295342  0.6665440 340.1466 0.9980859
## a[7]  1.7763603 0.4178019  1.1694053  2.4507573 598.0670 0.9989420
## bp    0.8603156 0.2790071  0.4023583  1.2674911 241.0798 0.9981557
## bpC  -0.1575911 0.3168249 -0.6570838  0.3253090 212.7618 1.0037173
# Visualize joint posterior
post <- extract.samples(model_11.2_relax_ulam)
p_left <- inv_logit(post$a)
plot(precis(as.data.frame(p_left)) , xlim=c(0,1))

# Comparison plot
quap_post <- extract.samples(model_11.2_relax_quap)
stan_post <- extract.samples(model_11.2_relax_ulam)

par(mfrow=c(1,2))
dens(quap_post$a[,2], main="Relaxed Quadratic Approximation for \n Actor 2")
dens(stan_post$a[,2], main="Relaxed MCMC Approximation for \n Actor 2")

# With the relaxed Normal(0, 10) prior, the differences between quap and MCMC
# increase. The bp and bpC estimates still agree, but the posteriors for a[2]
# are now quite different: the MCMC posterior is even more strongly
# right-skewed and reaches much larger values, while the quap posterior stays
# a symmetric Gaussian. Because actor 2 always pulled left, the likelihood is
# nearly flat above a certain value of a[2]; with a weak prior nothing reins
# in that upper tail, so the non-Gaussian shape of the true posterior becomes
# more pronounced.
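# A quick check on the outcome scale (a sketch using the samples above): both
# fits agree that actor 2 almost always pulls left, but they can disagree
# about how extreme the log-odds are allowed to be.
mean(inv_logit(stan_post$a[,2]) > 0.99)   # MCMC: share of samples implying p > 0.99
mean(inv_logit(quap_post$a[,2]) > 0.99)   # quap: same share under the Gaussian approximation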

11-3. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. Plot the joint posterior. What changes do you observe?

data("Kline")
d <- Kline
d$P <- scale(log(d$population))
d$id <- ifelse(d$contact == "high", 2, 1)
d
##       culture population contact total_tools mean_TU            P id
## 1    Malekula       1100     low          13     3.2 -1.291473310  1
## 2     Tikopia       1500     low          22     4.7 -1.088550750  1
## 3  Santa Cruz       3600     low          24     4.0 -0.515764892  1
## 4         Yap       4791    high          43     5.0 -0.328773359  2
## 5    Lau Fiji       7400    high          33     5.0 -0.044338980  2
## 6   Trobriand       8000    high          19     4.0  0.006668287  2
## 7       Chuuk       9200    high          40     3.8  0.098109204  2
## 8       Manus      13000     low          28     6.6  0.324317564  1
## 9       Tonga      17500    high          55     5.4  0.518797917  2
## 10     Hawaii     275000     low          71     6.6  2.321008320  1
dat <- list(
    T = d$total_tools ,
    P = d$P ,
    cid = d$id )   # contact index built above as d$id (d$contact_id does not exist)

# Intercept only
m11.9 <- ulam(
    alist(
        T ~ dpois( lambda ),
        log(lambda) <- a,
        a ~ dnorm(3,0.5)
    ), data=dat , chains=4 , log_lik=TRUE )
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## Chain 1 finished in 0.0 seconds.
## Chain 2 finished in 0.0 seconds.
## Chain 3 finished in 0.0 seconds.
## Chain 4 finished in 0.0 seconds.
## All 4 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.5 seconds.
# Drop Hawaii and re-standardize log population on the reduced sample
d <- subset(d, d$culture != "Hawaii")
d$P <- scale(log(d$population))
d$id <- ifelse(d$contact == "high", 2, 1)
d
##      culture population contact total_tools mean_TU          P id
## 1   Malekula       1100     low          13     3.2 -1.6838108  1
## 2    Tikopia       1500     low          22     4.7 -1.3532297  1
## 3 Santa Cruz       3600     low          24     4.0 -0.4201043  1
## 4        Yap       4791    high          43     5.0 -0.1154764  2
## 5   Lau Fiji       7400    high          33     5.0  0.3478956  2
## 6  Trobriand       8000    high          19     4.0  0.4309916  2
## 7      Chuuk       9200    high          40     3.8  0.5799580  2
## 8      Manus      13000     low          28     6.6  0.9484740  1
## 9      Tonga      17500    high          55     5.4  1.2653019  2
dat <- list(
    T = d$total_tools ,
    P = d$P ,
    cid = d$id )   # again use the d$id contact index

# Intercept only
model_11.3 <- ulam(
    alist(
        T ~ dpois( lambda ),
        log(lambda) <- a,
        a ~ dnorm(3,0.5)
    ), data=dat , chains=4 , log_lik=TRUE )
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## Chain 1 finished in 0.0 seconds.
## Chain 2 finished in 0.0 seconds.
## Chain 3 finished in 0.0 seconds.
## Chain 4 finished in 0.0 seconds.
## All 4 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.5 seconds.
# Compare model with and without Hawaii
compare(m11.9, model_11.3, func=PSIS)
##                PSIS       SE    dPSIS      dSE    pPSIS       weight
## model_11.3  98.5695 18.78927  0.00000       NA 5.018937 1.000000e+00
## m11.9      141.6724 34.02885 43.10294 23.41756 8.359836 4.368338e-10
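# A hedged diagnostic sketch: pointwise Pareto k values for the full-data
# model (m11.9 was fit with log_lik=TRUE, so PSIS works). Hawaii is the
# natural candidate for a large k, flagging it as a highly influential point.
k_vals <- PSIS(m11.9, pointwise=TRUE)$k
data.frame(culture=Kline$culture, k=round(k_vals, 2))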
# Comparison plot
whi_post <- extract.samples(m11.9)        # model fit with Hawaii
wohi_post <- extract.samples(model_11.3)  # model fit without Hawaii

par(mfrow=c(1,2))
dens(whi_post$a, main="Model with Hawaii in Data")
dens(wohi_post$a, main="Model without Hawaii in Data")

# Excluding Hawaii shifts the posterior for a toward smaller values, since the
# remaining islands have fewer total tools on average, and widens it slightly,
# since there is less data; correspondingly, the density for the model fit
# with Hawaii peaks higher. One caution on the PSIS table above: the two
# models were fit to different datasets, so their scores are not directly
# comparable, and the more informative change is in the posterior itself.
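# A small numeric follow-up (sketch): summarize a and the implied mean tool
# count exp(a) under each fit, using the samples extracted above.
mean(whi_post$a);  PI(whi_post$a)    # with Hawaii
mean(wohi_post$a); PI(wohi_post$a)   # without Hawaii
mean(exp(whi_post$a))                # implied mean total tools, with Hawaii
mean(exp(wohi_post$a))               # implied mean total tools, without Hawaii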

11-4. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.

data(chimpanzees)
d <- chimpanzees

# Model 11.1
m11.1 <- quap(
              alist(
                  pulled_left ~ dbinom(1, p),
                  logit(p) <- a,
                  a ~ dnorm(0 , 10)
              ), data=d)

# Model 11.2
m11.2 <- quap(
              alist(
                pulled_left ~ dbinom(1, p) ,
                logit(p) <- a + bp*prosoc_left ,
                a ~ dnorm(0,10) ,
                bp ~ dnorm(0,10)
              ),
              data=d )

# Model 11.3
m11.3 <- quap(
              alist(
                  pulled_left ~ dbinom(1, p) ,
                  logit(p) <- a + (bp + bpC * condition)*prosoc_left,
                  a ~ dnorm(0,10),
                  bp ~ dnorm(0,10),
                  bpC ~ dnorm(0,10)
              ), data=d)

# Model 11.4
m11.4 <- quap(
              alist(
                pulled_left ~ dbinom(1,p),
                logit(p) <- a[actor] + (bp + bpC * condition)*prosoc_left,
                a[actor] ~ dnorm(0,10),
                bp ~ dnorm(0,10),
                bpC ~ dnorm(0,10)
              ),
              data = d
            )

# Comparing all models
compare(m11.1,m11.2,m11.3,m11.4)
##           WAIC        SE    dWAIC      dSE      pWAIC       weight
## m11.4 556.2846 18.333057   0.0000       NA 18.2794042 1.000000e+00
## m11.2 680.4654  9.260809 124.1807 17.85138  1.9834452 1.082672e-27
## m11.3 682.1438  9.364389 125.8591 17.79886  2.8994699 4.677772e-28
## m11.1 687.7927  7.067068 131.5080 18.69627  0.9259206 2.775820e-29
# m11.4, the model with a unique intercept for each actor, has the lowest
# WAIC by a wide margin (dWAIC > 120 with dSE of about 18) and carries
# essentially all of the model weight. Although it spends far more effective
# parameters (pWAIC of about 18 versus 1-3 for the simpler models), the
# individual handedness differences among the actors are so strong that
# modeling them improves expected out-of-sample prediction decisively, making
# m11.4 the preferred model.
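# The compare table also has a plot method in rethinking, which draws the
# WAIC estimates with their standard errors and makes m11.4's margin easy to see.
plot(compare(m11.1, m11.2, m11.3, m11.4))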

11-5. Explain why the logit link is appropriate for a binomial generalized linear model.

# The parameter of a binomial model is a probability, which must lie between
# 0 and 1, while a linear model a + b*x can produce any real number. The
# logit link, log(p / (1 - p)), maps the probability scale (0, 1) onto the
# entire real line, so its inverse (the logistic function) converts any value
# of the linear model back into a valid probability. This keeps every
# prediction inside the legal probability space and gives the parameters a
# natural interpretation as changes in log-odds.
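# A minimal sketch of the round trip (rethinking provides logit and inv_logit):
x <- c(-10, -2, 0, 2, 10)      # arbitrary linear-model (log-odds) values
inv_logit(x)                   # always strictly between 0 and 1
logit(inv_logit(x))            # recovers the original log-odds values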