This chapter described some of the most common generalized linear models, those used to model counts. It is important never to convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that their parameters are on a different scale from the outcome variable they describe, typically log-odds (for the binomial) or log-rate (for the Poisson). Therefore computing implied predictions is even more important than before.
Place each answer inside the code chunk (grey box). Each code chunk should contain a text response or code that completes or answers the question or activity requested. Problems are labeled Easy (E), Medium (M), and Hard (H).
Finally, upon completion, name your final output .html file YourName_ANLY505-Year-Semester.html, publish the assignment to your RPubs account, and submit the link to Canvas. Each question is worth 5 points.
11E1. If an event has probability 0.35, what are the log-odds of this event?
p <- 0.35
odds <- p/(1 - p)
log_odds <- log(odds)
log_odds
## [1] -0.6190392
11E2. If an event has log-odds 3.2, what is the probability of this event?
log_odds <- 3.2
odds <- exp(log_odds)
p <- odds/(1 + odds)
p
## [1] 0.9608343
11E3. Suppose that a coefficient in a logistic regression has value 1.7. What does this imply about the proportional change in odds of the outcome?
exp(1.7)
## [1] 5.473947
# A one-unit increase in the predictor multiplies the odds of the outcome by exp(1.7), i.e. about 5.47.
11E4. Why do Poisson regressions sometimes require the use of an offset? Provide an example.
# A Poisson regression needs an offset when the exposure (the length of time, area, or number of opportunities over which the counts were collected) differs across observations. Adding the log of the exposure to the linear model puts all observations on a common rate scale.
# For example, one observer records counts of an event per day while another records the same event per week; including the log of each record's exposure (1 day vs. 7 days) as an offset lets a single model estimate the rate on a daily basis.
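A minimal sketch of how such an offset enters a Poisson model fit with ulam, using made-up counts y collected over exposures of one day versus one week:
dat_off <- list(
  y = c(3, 5, 25, 30),              # hypothetical event counts
  log_days = log( c(1, 1, 7, 7) ) ) # log exposure: daily vs. weekly records
m_off <- ulam(
  alist(
    y ~ dpois( lambda ),
    log(lambda) <- log_days + a,    # offset: log exposure enters with coefficient fixed at 1
    a ~ dnorm( 0 , 1.5 )
  ), data=dat_off , chains=4 )
# exp(a) is the estimated rate per day, comparable across both kinds of records.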
11M1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?
# The aggregated (binomial) likelihood includes a multiplicity constant: the number of orderings in which the observed successes could have occurred. The disaggregated (Bernoulli) likelihood treats each trial separately and describes one particular ordering, so no such constant appears. That constant changes the magnitude of the likelihood (and of criteria such as WAIC) but not its shape as a function of the parameters, so inference is unchanged.
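A quick numerical check, assuming 6 successes in 9 trials with p = 0.7:
p <- 0.7
dbinom( 6 , size=9 , prob=p )   # aggregated likelihood, includes the multiplicity term
p^6 * (1 - p)^3                 # disaggregated likelihood of one particular ordering
choose( 9 , 6 )                 # their ratio: the number of possible orderings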
11M2. If a coefficient in a Poisson regression has value 1.7, what does this imply about the change in the outcome?
# A one-unit increase in the predictor adds 1.7 to the log of the expected count, which multiplies the expected count itself by exp(1.7) = 5.47.
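A small check of the multiplicative effect, using arbitrary values for the intercept and predictor:
a <- 1 ; b <- 1.7 ; x <- 0.5           # arbitrary illustrative values
exp( a + b*(x + 1) ) / exp( a + b*x )  # equals exp(1.7) ~ 5.47 for any a and x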
11M3. Explain why the logit link is appropriate for a binomial generalized linear model.
# The binomial likelihood has a probability parameter p that must lie between 0 and 1, while a linear model can produce any real number. The logit link maps the (0, 1) probability scale onto the whole real line, so its inverse guarantees that every value of the linear predictor corresponds to a valid probability.
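This can be checked directly: the inverse-logit of any real-valued linear predictor lands strictly inside (0, 1).
inv_logit <- function(x) 1 / (1 + exp(-x))   # rethinking::inv_logit does the same
inv_logit( c(-10, -1, 0, 1, 10) )            # all results fall between 0 and 1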
11M4. Explain why the log link is appropriate for a Poisson generalized linear model.
# The Poisson rate parameter lambda must be positive, while a linear model can produce any real number. The log link maps the positive reals onto the whole real line, so its inverse (exp) guarantees a positive rate for any value of the linear predictor.
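Likewise, the inverse of the log link (exp) turns any real-valued linear predictor into a strictly positive rate.
exp( c(-10, 0, 3, 10) )   # always positive, so lambda can never be negative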
11M5. What would it imply to use a logit link for the mean of a Poisson generalized linear model? Can you think of a real research problem for which this would make sense?
# Using a logit link for the mean would constrain lambda to lie between 0 and 1 (or between 0 and some maximum M, if the inverse-logit is scaled by M), instead of the (0, +inf) range implied by the log link. That could make sense when the expected count has a known ceiling, for example a rate that saturates at a maximum number of events per observation window.
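A sketch of what that would look like, with a hypothetical maximum rate M scaling the inverse-logit:
inv_logit <- function(x) 1 / (1 + exp(-x))
M <- 20                          # hypothetical ceiling on the expected count
x <- seq( -3 , 3 , length.out=7 )
M * inv_logit( 0.5 + 1.2*x )     # arbitrary intercept and slope; stays between 0 and M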
11M6. State the constraints for which the binomial and Poisson distributions have maximum entropy. Are the constraints different at all for binomial and Poisson? Why or why not?
# The binomial distribution has maximum entropy among distributions for discrete counts of binary (two-outcome) trials with a constant expected value. The Poisson has the same constraints, not additional ones, because it is a special case of the binomial (many trials, each with a very small probability of the event), so it inherits the binomial's maximum-entropy constraints.
11M7. Use quap to construct a quadratic approximate posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Compare the quadratic approximation to the posterior distribution produced instead from MCMC. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Do the differences increase or decrease? Why?
data("chimpanzees")
d <- chimpanzees
d$recipient <- NULL
# quap
q11.7 <- quap(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left ,
    a[actor] ~ dnorm(0,10) ,
    bp ~ dnorm(0,10) ,
    bpC ~ dnorm(0,10)
  ) , data=d )
pairs(q11.7)
# For most parameters the quap and MCMC posterior means and standard deviations are very close, and the MCMC standard deviations are slightly higher. The largest difference is the intercept for actor 2, who always pulled the left lever: its MCMC posterior is strongly skewed toward large values, while the quadratic approximation is forced to be a symmetric Gaussian. Relaxing the prior on the actor intercepts to Normal(0,10) increases the differences, because the flatter prior lets that skew extend even further into the tail.
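For the MCMC side of the comparison, a sketch of the matching ulam fit (the data list keeps only the variables the model uses, since ulam is strict about unused columns):
dat_chimp <- list(
  pulled_left = d$pulled_left,
  actor = d$actor,
  prosoc_left = d$prosoc_left,
  condition = d$condition )
q11.7_mcmc <- ulam(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left ,
    a[actor] ~ dnorm(0,10) ,
    bp ~ dnorm(0,10) ,
    bpC ~ dnorm(0,10)
  ) , data=dat_chimp , chains=4 )
precis( q11.7 , depth=2 )       # quadratic approximation
precis( q11.7_mcmc , depth=2 )  # MCMC posterior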
11M8. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. What changes do you observe?
data(Kline)
d <- Kline
d <- d[ d$culture != "Hawaii" , ]
d$P <- scale( log(d$population) )
d$contact_id <- ifelse( d$contact=="high" , 2 , 1 )
d
## culture population contact total_tools mean_TU P contact_id
## 1 Malekula 1100 low 13 3.2 -1.6838108 1
## 2 Tikopia 1500 low 22 4.7 -1.3532297 1
## 3 Santa Cruz 3600 low 24 4.0 -0.4201043 1
## 4 Yap 4791 high 43 5.0 -0.1154764 2
## 5 Lau Fiji 7400 high 33 5.0 0.3478956 2
## 6 Trobriand 8000 high 19 4.0 0.4309916 2
## 7 Chuuk 9200 high 40 3.8 0.5799580 2
## 8 Manus 13000 low 28 6.6 0.9484740 1
## 9 Tonga 17500 high 55 5.4 1.2653019 2
dat <- list(
  T = d$total_tools ,
  P = d$P ,
  cid = d$contact_id )
# intercept only
m11.9 <- ulam(
  alist(
    T ~ dpois( lambda ),
    log(lambda) <- a,
    a ~ dnorm( 3 , 0.5 )
  ), data=dat , chains=4 , log_lik=TRUE )
## (Stan sampling progress output for 4 chains omitted)
# interaction model
m11.10 <- ulam(
  alist(
    T ~ dpois( lambda ),
    log(lambda) <- a[cid] + b[cid]*P,
    a[cid] ~ dnorm( 3 , 0.5 ),
    b[cid] ~ dnorm( 0 , 0.2 )
  ), data=dat , chains=4 , log_lik=TRUE )
## (Stan sampling progress output for 4 chains omitted)
compare( m11.9 , m11.10 , func=PSIS )
## Some Pareto k values are high (>0.5). Set pointwise=TRUE to inspect individual points.
## PSIS SE dPSIS dSE pPSIS weight
## m11.10 78.08711 12.31847 0.0000 NA 7.021960 9.999560e-01
## m11.9 98.14741 18.67052 20.0603 17.17407 4.704405 4.404955e-05
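Since compare() still warns about high Pareto k values even with Hawaii removed, the remaining influential societies and the fitted coefficients can be inspected as follows:
PSIS( m11.10 , pointwise=TRUE )$k   # one Pareto k value per society
precis( m11.10 , depth=2 )          # intercepts and slopes by contact level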
11H1. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.
d <- chimpanzees
m11.1 <- quap(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a ,
    a ~ dnorm(0,10)
  ) , data=d )
m11.2.2 <- quap(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a + bp*prosoc_left ,
    a ~ dnorm(0,10) ,
    bp ~ dnorm(0,10)
  ) , data=d )
m11.3 <- quap(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a + (bp + bpC*condition)*prosoc_left ,
    a ~ dnorm(0,10) ,
    bp ~ dnorm(0,10) ,
    bpC ~ dnorm(0,10)
  ) , data=d )
m11.4 <- quap(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left ,
    a[actor] ~ dnorm(0,10) ,
    bp ~ dnorm(0,10) ,
    bpC ~ dnorm(0,10)
  ) , data=d )
compare(m11.1, m11.2.2, m11.3, m11.4)
## WAIC SE dWAIC dSE pWAIC weight
## m11.4 561.1518 18.211962 0.0000 NA 20.7252360 1.000000e+00
## m11.2.2 680.5055 9.419013 119.3537 17.77880 2.0040859 1.209707e-26
## m11.3 682.5466 9.345130 121.3948 17.68225 3.1022168 4.359581e-27
## m11.1 687.8924 7.081407 126.7406 18.63143 0.9759254 3.010447e-28
# m11.4 has a much lower WAIC than the other three models and receives essentially all of the model weight. Adding a unique intercept for each actor captures the large, stable individual differences (handedness) among the chimpanzees, which none of the pooled-intercept models can explain.
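To see where the improvement comes from, the actor intercepts can be inspected directly; a wide spread among the a[actor] values reflects those individual handedness differences.
precis( m11.4 , depth=2 )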