Assignment #9

Questions

11E1. If an event has probability 0.35, what are the log-odds of this event?

p <- 0.35
log(p/(1-p))

## [1] -0.6190392

11E2. If an event has log-odds 3.2, what is the probability of this event?

lo <- 3.2
exp(lo)/(1+exp(lo))

## [1] 0.9608343
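
#A quick cross-check of 11E1 and 11E2 against base R's qlogis() (logit) and plogis() (inverse logit). The helper names log_odds and inv_log_odds are made up here for illustration; this is only a sketch of the two conversions.

```r
# Hypothetical helpers, written out so the formulas are visible.
log_odds <- function(p) log(p / (1 - p))
inv_log_odds <- function(lo) exp(lo) / (1 + exp(lo))

log_odds(0.35)      # should agree with qlogis(0.35)
inv_log_odds(3.2)   # should agree with plogis(3.2)

# The two functions undo each other.
all.equal(inv_log_odds(log_odds(0.35)), 0.35)
```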

11E3. Suppose that a coefficient in a logistic regression has value 1.7. What does this imply about the proportional change in odds of the outcome?

exp(1.7)

## [1] 5.473947

#the odds of the outcome are multiplied by about 5.47 for each one-unit increase in the predictor
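
#A small sketch of what the proportional change in odds means. The baseline probabilities in p0 are made-up values; the point is that the odds multiplier is the same at every baseline while the change in probability is not.

```r
b <- 1.7
p0 <- c(0.1, 0.5, 0.9)        # hypothetical baseline probabilities
odds0 <- p0 / (1 - p0)
odds1 <- odds0 * exp(b)       # odds after a one-unit increase in the predictor
p1 <- odds1 / (1 + odds1)

odds1 / odds0                 # same multiplier, exp(b), at every baseline
p1 - p0                       # change in probability depends on the baseline
```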

11E4. Why do Poisson regressions sometimes require the use of an offset? Provide an example.

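#A Poisson model describes the number of events per unit of exposure (time, distance, area). When the exposure differs across cases, the raw counts are not comparable, so the log of the exposure is added to the linear model as an offset. For example, if one site records counts per day and another per week, using log(days) as an offset puts both on a per-day rate.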
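
#A minimal simulated sketch of an offset, using base R's glm() rather than the rethinking tools used elsewhere in this assignment. All names and values (exposure, x, true_a, true_b) are made up for illustration.

```r
set.seed(11)
n <- 200
exposure <- sample(1:30, n, replace = TRUE)  # hypothetical days observed per case
x <- rnorm(n)                                # hypothetical predictor
true_a <- log(1.5)                           # log of events per day at x = 0
true_b <- 0.3
y <- rpois(n, lambda = exposure * exp(true_a + true_b * x))

# log(exposure) enters with its coefficient fixed at 1, so the
# remaining terms describe the rate per day rather than the raw count.
fit <- glm(y ~ x + offset(log(exposure)), family = poisson)
coef(fit)
```

#The fitted intercept and slope should land near true_a and true_b even though the cases were observed for very different lengths of time.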

11M1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?

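#In the aggregated form the binomial likelihood contains the multiplier choose(n, k), which counts the orderings of k successes in n trials. In the disaggregated form each 0/1 outcome is modeled as its own independent trial, so that multiplier is absent. The multiplier does not depend on p, which is why inference is unchanged.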
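
#A short check of this with made-up 0/1 data: the aggregated and disaggregated log-likelihoods differ by the same constant at every value of p, and that constant is the log of the binomial coefficient.

```r
x <- c(1, 0, 1, 1, 0, 1)      # hypothetical 0/1 outcomes from 6 trials
k <- sum(x)
n <- length(x)
p_grid <- c(0.2, 0.5, 0.8)

# Disaggregated: one Bernoulli term per trial.
ll_disagg <- sapply(p_grid, function(p) sum(dbinom(x, 1, p, log = TRUE)))
# Aggregated: a single binomial term for k successes in n trials.
ll_agg <- dbinom(k, n, p_grid, log = TRUE)

ll_agg - ll_disagg            # constant across p
lchoose(n, k)                 # log of the binomial coefficient
```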

11M2. If a coefficient in a Poisson regression has value 1.7, what does this imply about the change in the outcome?

#The expected number of events per interval is multiplied by exp(1.7), about 5.47, for each one-unit increase in the predictor.
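
#A sketch with a made-up intercept and made-up predictor values, assuming the usual log link: the ratio of expected counts is exp(1.7) everywhere, while the absolute change grows with the baseline.

```r
a <- 0.5                          # hypothetical intercept
b <- 1.7
x <- c(-1, 0, 2)                  # hypothetical predictor values
lambda0 <- exp(a + b * x)
lambda1 <- exp(a + b * (x + 1))   # one-unit increase in the predictor

lambda1 / lambda0                 # exp(b) at every x
lambda1 - lambda0                 # absolute change differs across x
```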

11M3. Explain why the logit link is appropriate for a binomial generalized linear model.

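#The binomial likelihood is parametrised by the probability p, which must lie in [0,1].
#A linear combination of predictors, such as a + b*[treatment], can take any real value.
#The logit link, logit(p) = a + b*[treatment], maps p from the (0,1) scale onto the whole real line.
#So whatever value the linear model takes, the implied p satisfies the required constraint.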
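
#A sketch of the constraint in action, with made-up coefficients and predictor values: the linear predictor wanders far outside [0,1], but the inverse logit always returns a valid probability.

```r
a <- -1
b <- 3                          # hypothetical coefficients
x <- seq(-10, 10, by = 2.5)     # hypothetical predictor values
eta <- a + b * x                # linear predictor, any real value
range(eta)                      # not usable as a probability directly

p <- plogis(eta)                # inverse logit
range(p)                        # strictly between 0 and 1
```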

11M4. Explain why the log link is appropriate for a Poisson generalized linear model.

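#The Poisson parameter lambda is an expected count and must be positive. With a log link, lambda = exp(linear model), which is positive for any value of the linear model.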

11M5. What would it imply to use a logit link for the mean of a Poisson generalized linear model? Can you think of a real research problem for which this would make sense?

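#Using a logit link implies that the lambda parameter of the Poisson likelihood always falls in the (0,1) range. This would make sense only when the expected count per unit of exposure cannot exceed 1, for example a rare event expected less than once per observation period.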

11M6. State the constraints for which the binomial and Poisson distributions have maximum entropy. Are the constraints different at all for binomial and Poisson? Why or why not?

#The binomial distribution has maximum entropy when each trial must result in one of two possible events and the expected value is constant. The Poisson distribution is a special case of the binomial (many trials, each with a small probability), so its constraints are the same.
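
#A numerical sketch of the special-case claim, using a made-up rate lambda = 2: holding n*p fixed while n grows, the binomial probabilities approach the Poisson ones.

```r
lambda <- 2                     # hypothetical expected count
k <- 0:10
n_values <- c(10, 100, 10000)

# Largest pointwise gap between Binomial(n, lambda/n) and Poisson(lambda).
max_gap <- sapply(n_values, function(n) {
  max(abs(dbinom(k, size = n, prob = lambda / n) - dpois(k, lambda)))
})
max_gap                         # shrinks as n grows
```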

11M7. Use quap to construct a quadratic approximate posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Compare the quadratic approximation to the posterior distribution produced instead from MCMC. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Do the differences increase or decrease? Why?

library(rethinking)
data("chimpanzees")
d <- chimpanzees
d$recipient <- NULL

# quadratic approximation with the relaxed Normal(0,10) priors
q2 <- quap(alist(
  pulled_left ~ dbinom( 1 , p ) ,
  logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left ,
  a[actor] ~ dnorm(0,10),
  bp ~ dnorm(0,10),
  bpC ~ dnorm(0,10)
) ,
data=d)
pairs(q2)

# The posterior standard deviations from MCMC are larger than those from the quadratic approximation.
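
#A base R sketch of one likely reason for such a gap. It does not refit the chimpanzee model; it assumes a hypothetical actor who pulls left on all 72 of 72 trials and computes that intercept's posterior on a grid. A quadratic approximation is a symmetric Gaussian centered at the mode, so it cannot represent a long right tail, and the tail gets longer as the prior is relaxed.

```r
# Hypothetical data: one actor pulls left on every trial.
n_left <- 72
n_trials <- 72
a_grid <- seq(-10, 60, length.out = 7001)

# Grid posterior for the intercept under a Normal(0, prior_sd) prior.
grid_summary <- function(prior_sd) {
  log_post <- dbinom(n_left, n_trials, plogis(a_grid), log = TRUE) +
    dnorm(a_grid, 0, prior_sd, log = TRUE)
  post <- exp(log_post - max(log_post))
  post <- post / sum(post)
  c(mode = a_grid[which.max(post)], mean = sum(a_grid * post))
}

grid_summary(10)    # wide prior: mean sits well above the mode
grid_summary(1.5)   # tighter prior: mean and mode are much closer
```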

11M8. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. What changes do you observe?

data(Kline)
d <- Kline
d <- d[ d$culture != "Hawaii" , ]  # drop Hawaii before standardizing
d$P <- scale( log(d$population) )
d$contact_id <- ifelse( d$contact=="high" , 2 , 1 )
#the coefficient drops when Hawaii is excluded

11H1. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.

data("chimpanzees")

d <- chimpanzees

m1 <- map(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a ,
    a ~ dnorm(0,10)
  ),
  data=d )

m2 <- map(
  alist(
    pulled_left ~ dbinom(1, p) ,
    logit(p) <- a + bp*prosoc_left ,
    a ~ dnorm(0,10) ,
    bp ~ dnorm(0,10)
  ),
  data=d )

m3 <- map(
  alist(
    pulled_left ~ dbinom(1, p) ,
    logit(p) <- a + (bp + bpC*condition)*prosoc_left ,
    a ~ dnorm(0,10) ,
    bp ~ dnorm(0,10) ,
    bpC ~ dnorm(0,10)
  ), data=d )

m4 <- map(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left,
    a[actor] ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ),
  data = d)


compare(m1,m2,m3,m4)

##        WAIC        SE    dWAIC      dSE     pWAIC       weight
## m4 564.1452 18.064555   0.0000       NA 22.339407 1.000000e+00
## m2 680.6278  9.142781 116.4826 17.65966  2.060980 5.082957e-26
## m3 682.5917  9.394598 118.4465 17.63338  3.125592 1.903972e-26
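## m1 687.9815  7.143163 123.8364 18.58945  1.020692 1.286106e-27

#m4, the model with a unique intercept for each actor, has the lowest WAIC and essentially all of the weight. Its WAIC is about 116 lower than that of the next model (m2), far more than the standard error of that difference (about 18). m1, m2 and m3 are close to one another, so the actor intercepts improve expected predictive accuracy much more than the treatment variables do.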
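
#A sketch that recomputes the weight column from the WAIC values printed above, assuming the weights are the usual exp(-dWAIC/2) terms normalized to sum to 1. The numbers are copied from the table, not refit.

```r
# WAIC values copied from the compare() table.
waic <- c(m4 = 564.1452, m2 = 680.6278, m3 = 682.5917, m1 = 687.9815)
d_waic <- waic - min(waic)
weight <- exp(-0.5 * d_waic) / sum(exp(-0.5 * d_waic))
signif(weight, 4)

# Gap between m4 and m2 in units of the standard error of the difference.
116.4826 / 17.65966
```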

Yucheng Hu

2020-10-06

Chapter 11 - God Spiked the Integers