Chapter 11 - God Spiked the Integers

This chapter described some of the most common generalized linear models, those used to model counts. It is important to never convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that parameters are on a different scale, typically log-odds (for binomial) or log-rate (for Poisson), than the outcome variable they describe. Therefore computing implied predictions is even more important than before.

Place each answer inside the code chunk (grey box). The code chunks should contain a text response or a code that completes/answers the question or activity requested. Make sure to include plots if the question requests them.

Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html and publish the assignment to your R Pubs account and submit the link to Canvas. Each question is worth 5 points.

Questions

11-1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?

#Because the multiplicative factor in the likelihood is a constant in the aggregated form and the multiplicative factor is computed based on the number of observations in the data form in the disaggregated form.

11-2. Use quap to construct a quadratic approximate joint posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Plot and compare the quadratic approximation to the joint posterior distribution produced instead from MCMC. Do not use the ‘pairs’ plot. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Plot and compare the posterior distributions. Do not use the ‘pairs’ plot. Do the differences increase or decrease? Why?

data("chimpanzees")
data_1 <-  chimpanzees
data_1$recipient <- NULL

model_1 <- quap(
          alist(
            pulled_left ~ dbinom(1,p),
            logit(p) <- a[actor] + (xp + xpc * condition)*prosoc_left,
            a[actor] ~ dnorm(0,10),
            xp ~ dnorm(0,10),
            xpc ~ dnorm(0,10)
          ),
          data = data_1
)
pairs(model_1)

They are very similar. This is because the MCMC posterior is slightly skewed, whereas the QUAP posterior is forced to be Gaussian. Therefore, more density is given to the lower tail and less to the upper tail than in the MCMC posterior.

11-3. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. Plot the joint posterior. What changes do you observe?

data("Kline")
d <- Kline
d$P <- scale(log(d$population))
d$id <- ifelse(d$contact == "high", 2, 1)
d
##       culture population contact total_tools mean_TU            P id
## 1    Malekula       1100     low          13     3.2 -1.291473310  1
## 2     Tikopia       1500     low          22     4.7 -1.088550750  1
## 3  Santa Cruz       3600     low          24     4.0 -0.515764892  1
## 4         Yap       4791    high          43     5.0 -0.328773359  2
## 5    Lau Fiji       7400    high          33     5.0 -0.044338980  2
## 6   Trobriand       8000    high          19     4.0  0.006668287  2
## 7       Chuuk       9200    high          40     3.8  0.098109204  2
## 8       Manus      13000     low          28     6.6  0.324317564  1
## 9       Tonga      17500    high          55     5.4  0.518797917  2
## 10     Hawaii     275000     low          71     6.6  2.321008320  1
dat <- list(
    T = d$total_tools ,
    P = d$P ,
    cid = d$contact_id )

# Intercept only
# m11.9 <- ulam(
#     alist(
#         T ~ dpois( lambda ),
#         log(lambda) <- a,
#         a ~ dnorm(3,0.5)
#     ), data=dat , chains=4 , log_lik=TRUE )


d <- subset(d, d$culture != "Hawaii")
d$P <- scale(log(d$population))
d$id <- ifelse(d$contact == "high", 2, 1)
dat <- list(
    T = d$total_tools ,
    P = d$P ,
    cid = d$contact_id )

# # Intercept only
# model_11.3 <- ulam(
#     alist(
#         T ~ dpois( lambda ),
#         log(lambda) <- a,
#         a ~ dnorm(3,0.5)
#     ), data=dat , chains=4 , log_lik=TRUE )

# compare(m11.9, model_11.3, func=PSIS)
# 
# 
# whi_post <- extract.samples(m11.9)
# wohi_post <- extract.samples(model_11.3)
# 
# par(mfrow=c(1,2))
# dens(whi_post$a, main="Model without Hawaii in Data")
# dens(wohi_post$a, main="Model with Hawaii in Data")

#Without Hawaii, the posterior distribution for a is wider. The maximum value of the density that including Hawaii data is higher. Both models have similar bandwidth.

9-4. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.

data(chimpanzees)
d <- chimpanzees

# Model 11.1
m11.1 <- quap(
              alist(
                  pulled_left ~ dbinom(1, p),
                  logit(p) <- a,
                  a ~ dnorm(0 , 10)
              ), data=d)

# Model 11.2
m11.2 <- quap(
              alist(
                pulled_left ~ dbinom(1, p) ,
                logit(p) <- a + bp*prosoc_left ,
                a ~ dnorm(0,10) ,
                bp ~ dnorm(0,10)
              ),
              data=d )

# Model 11.3
m11.3 <- quap(
              alist(
                  pulled_left ~ dbinom(1, p) ,
                  logit(p) <- a + (bp + bpC * condition)*prosoc_left,
                  a ~ dnorm(0,10),
                  bp ~ dnorm(0,10),
                  bpC ~ dnorm(0,10)
              ), data=d)

# Model 11.4
m11.4 <- quap(
              alist(
                pulled_left ~ dbinom(1,p),
                logit(p) <- a[actor] + (bp + bpC * condition)*prosoc_left,
                a[actor] ~ dnorm(0,10),
                bp ~ dnorm(0,10),
                bpC ~ dnorm(0,10)
              ),
              data = d
            )

# Comparing all models
compare(m11.1,m11.2,m11.3,m11.4)
##           WAIC        SE    dWAIC      dSE     pWAIC       weight
## m11.4 543.7846 18.876967   0.0000       NA 12.722729 1.000000e+00
## m11.2 680.6137  9.321945 136.8291 18.27567  2.058617 1.940630e-30
## m11.3 682.4283  9.294972 138.6437 18.19399  3.041692 7.832361e-31
## m11.1 688.0091  7.134189 144.2245 19.06978  1.034486 4.808950e-32

#The most complex model (m11.4) has the lowest WAIC which means the m11.4 is our preferred model for this problem.

11-5. Explain why the logit link is appropriate for a binomial generalized linear model?

#Because the goal of a binomial generalized linear model is to map continuous value to a probability space of between 0 and 1.