This chapter described some of the most common generalized linear models: those used to model counts. It is important never to convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that their parameters are on a different scale than the outcome variable they describe, typically log-odds (for binomial models) or log-rate (for Poisson models). Therefore computing implied predictions is even more important than before.
Place each answer inside the code chunk (grey box). The code chunks should contain a text response or code that completes or answers the question or activity requested. Make sure to include plots if the question requests them. Problems are labeled Easy (E), Medium (M), and Hard (H).
Finally, upon completion, name your final output .html file as YourName_ANLY505-Year-Semester.html, publish the assignment to your RPubs account, and submit the link to Canvas. Each question is worth 5 points.
11E1. If an event has probability 0.35, what are the log-odds of this event?
p <- 0.35
odds <- p / (1 - p)      # odds from probability
log_odds <- log(odds)    # log-odds (the logit)
log_odds
## [1] -0.6190392
11E2. If an event has log-odds 3.2, what is the probability of this event?
p <- exp(3.2) / (1 + exp(3.2))   # inverse logit of 3.2
p
## [1] 0.9608343
11E3. Suppose that a coefficient in a logistic regression has value 1.7. What does this imply about the proportional change in odds of the outcome?
# It implies that a one-unit increase in the predictor adds 1.7 to the log-odds of the outcome, which multiplies the odds by exp(1.7), roughly 5.47: a proportional increase in odds of about 447%.
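# Computed directly:
exp(1.7)
## [1] 5.473947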
11E4. Why do Poisson regressions sometimes require the use of an offset? Provide an example.
# Poisson regressions need an offset when observations differ in exposure: the length of time (or amount of opportunity) over which the counts were accumulated. For example, two bloggers produce similar content, but one records the number of published videos per day and the other per week. The counts sit on different time scales, and adding the log of each observation's exposure as an offset puts them on a common rate scale.
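# A minimal sketch with simulated (hypothetical) data: two blogs observed at
# different exposures; including log(exposure) as an offset recovers the
# shared per-day rate.
set.seed(11)
n_days  <- 30                                  # blog A: 30 daily counts
n_weeks <- 4                                   # blog B: 4 weekly counts
y_daily  <- rpois(n_days, lambda = 1.5)        # true rate: 1.5 videos/day
y_weekly <- rpois(n_weeks, lambda = 1.5 * 7)   # same daily rate, weekly totals
dat <- data.frame(
  y        = c(y_daily, y_weekly),
  log_days = log(c(rep(1, n_days), rep(7, n_weeks)))
)
glm(y ~ 1 + offset(log_days), data = dat, family = poisson)
# The intercept estimates the common log daily rate, log(1.5) ~ 0.405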
11M1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?
# In the disaggregated (Bernoulli) form, each trial is a separate 0/1 observation. In the aggregated form, only the total number of successes is recorded, so the likelihood includes a multiplicity term (the binomial coefficient) counting all the orderings of trials that produce the same total. That term is a constant that does not involve the parameters, so inference is unaffected, but it changes the value of the likelihood itself, and of quantities computed from it, such as WAIC.
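# Sketch: 6 successes in 9 trials. The two log-likelihoods differ by
# log(choose(9, 6)), a constant that does not depend on p.
p <- 0.7
log_agg <- dbinom(6, size = 9, prob = p, log = TRUE)   # aggregated form
log_dis <- 6 * log(p) + 3 * log(1 - p)                 # nine Bernoulli terms
log_agg - log_dis    # equals log(choose(9, 6)) ~ 4.43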
11M2. If a coefficient in a Poisson regression has value 1.7, what does this imply about the change in the outcome?
# If a coefficient in a Poisson regression has value 1.7, a one-unit increase in the predictor adds 1.7 to the log of the expected count, so the expected count itself is multiplied by exp(1.7):
exp(1.7)
## [1] 5.473947
11M3. Explain why the logit link is appropriate for a binomial generalized linear model.
# The parameter of a binomial model is a probability, so it must lie between zero and one, while a linear model can return any real number. The logit link maps probabilities in (0,1) onto the entire real line (the log-odds scale), so whatever value the linear model produces corresponds, through the inverse link, to a valid probability. That makes the logit the natural link for a binomial GLM.
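# Sketch: logit maps (0,1) onto the real line; its inverse maps back
probs <- c(0.01, 0.5, 0.99)
log(probs / (1 - probs))    # log-odds: unconstrained
x <- c(-5, 0, 5)
1 / (1 + exp(-x))           # inverse logit: always inside (0,1)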
11M4. Explain why the log link is appropriate for a Poisson generalized linear model.
# The Poisson rate parameter must be positive. The log link maps positive values onto the whole real line, so the linear model is unconstrained while the implied rate, exp(linear model), is always positive. This matches the Poisson distribution, whose mean cannot be negative.
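# Sketch: exp(), the inverse of the log link, turns any real value into a
# strictly positive rate
x <- c(-3, 0, 3)
exp(x)    # always > 0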
11M5. What would it imply to use a logit link for the mean of a Poisson generalized linear model? Can you think of a real research problem for which this would make sense?
# A logit link would constrain the mean of the Poisson to lie between zero and one. That could make sense when there is a known upper bound of one event per unit of exposure, for example counting whether a rare event occurs in each short time interval: the expected count per interval can never exceed one, so a logit-linked mean respects that ceiling.
11M6. State the constraints for which the binomial and Poisson distributions have maximum entropy. Are the constraints different at all for binomial and Poisson? Why or why not?
# The binomial distribution has maximum entropy when (1) each trial has only two possible outcomes and (2) the expected number of successes is constant. The Poisson is a special case of the binomial (many trials with a small probability per trial), so it has maximum entropy under exactly the same constraints.
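# Sketch: with many trials and small per-trial probability, the binomial
# converges to the Poisson, reflecting their shared constraints
lambda <- 2
dbinom(0:5, size = 1e4, prob = lambda / 1e4)
dpois(0:5, lambda)    # nearly identical probabilities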
11M7. Use quap to construct a quadratic approximate posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Plot and compare the quadratic approximation to the posterior distribution produced instead from MCMC. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Plot and compare the posterior distributions. Do the differences increase or decrease? Why?
library(rethinking)
data("chimpanzees")
d<- chimpanzees
d$recipient <- NULL
q <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left,
    a[actor] ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ),
  data = d
)
pairs(q)
precis(q , depth=2)
## mean sd 5.5% 94.5%
## a[1] -0.7261626 0.2684847 -1.1552530 -0.2970722
## a[2] 6.6751154 3.6121645 0.9021789 12.4480520
## a[3] -1.0309237 0.2784264 -1.4759029 -0.5859445
## a[4] -1.0309166 0.2784261 -1.4758953 -0.5859378
## a[5] -0.7261504 0.2684844 -1.1552403 -0.2970605
## a[6] 0.2127613 0.2670008 -0.2139575 0.6394801
## a[7] 1.7545450 0.3845051 1.1400316 2.3690585
## bp 0.8221123 0.2610073 0.4049722 1.2392525
## bpC -0.1318310 0.2969346 -0.6063899 0.3427279
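# A sketch of the MCMC counterpart for comparison (same model and Normal(0,10)
# priors, fit with ulam; run to reproduce, output not shown here).
d2 <- d[, c("pulled_left", "actor", "condition", "prosoc_left")]
q_mcmc <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left,
    a[actor] ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ), data = d2, chains = 4, cores = 4
)
precis(q_mcmc, depth = 2)
# Overlay the two posteriors for actor 2, where they differ most
post_quap <- extract.samples(q)
post_mcmc <- extract.samples(q_mcmc)
dens(post_mcmc$a[, 2], lwd = 2)
dens(post_quap$a[, 2], add = TRUE, lty = 2)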
# For most actors the quap and MCMC posteriors have very similar means and standard deviations, because those posteriors are approximately Gaussian. The clear exception is actor 2, who always pulled left: the MCMC posterior for a[2] is strongly skewed to the right, while the quadratic approximation forces a symmetric Gaussian. Relaxing the actor intercept prior to Normal(0,10), as above, lets a[2] drift even further into the tail, so the differences between the quap and ulam posteriors increase; the flatter prior does less to restrain the skew.
11M8. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. What changes do you observe?
data(Kline)
d <- Kline
d <- d[ d$culture != "Hawaii" , ]   # drop Hawaii before refitting
d$P <- scale( log(d$population) )
d$contact_id <- ifelse( d$contact=="high" , 2 , 1 )
dat <- list(
Totals = d$total_tools ,
Pop = d$P ,
cid = d$contact_id
)
m2.0 <- ulam(
alist(
Totals ~ dpois(lambda),
log(lambda) <- a[cid] + b[cid]*Pop,
a[cid] ~ dnorm(3, 0.5),
b[cid] ~ dnorm(0, 0.2)
),data = dat, chains = 4, cores = 4
)
precis(m2.0 , depth=2)
## mean sd 5.5% 94.5% n_eff Rhat4
## a[1] 3.3183795 0.08714611 3.18091376 3.4528157 1796.875 0.9991361
## a[2] 3.6103512 0.07133624 3.49871674 3.7244736 1953.409 0.9990935
## b[1] 0.3766972 0.05401822 0.28826760 0.4626805 1514.460 1.0010900
## b[2] 0.1944644 0.15931625 -0.05764787 0.4530994 2163.315 1.0000940
# With Hawaii removed, the intercepts are nearly unchanged but the slopes shift: the high-contact slope b[2] is estimated with wide uncertainty and its interval overlaps the low-contact slope b[1], so there is little evidence that the two contact groups respond differently to population. Hawaii, a large-population, low-contact society with many tools, was a highly influential point in the original fit.
11H1. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.
library(rethinking)
data("chimpanzees")
d <- chimpanzees
m11.1 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(0, 10)
  ),
  data = d )
m11.2 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + bp*prosoc_left,
    a ~ dnorm(0, 10),
    bp ~ dnorm(0, 10)
  ),
  data = d )
m11.3 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + (bp + bpC*condition)*prosoc_left,
    a ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ), data = d )
m11.4 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left,
    a[actor] ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ),
  data = d )
compare(m11.1,m11.2,m11.3,m11.4)
## WAIC SE dWAIC dSE pWAIC weight
## m11.4 548.0019 18.626081 0.0000 NA 14.192038 1.000000e+00
## m11.2 680.7498 9.164020 132.7479 18.07875 2.122543 1.493337e-29
## m11.3 682.3906 9.317492 134.3887 18.03439 3.022706 6.574438e-30
## m11.1 687.9189 7.160203 139.9170 18.90861 0.989257 4.143849e-31
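# m11.4, with a unique intercept for each actor, has by far the lowest WAIC
# (dWAIC > 130 relative to the other models) and carries essentially all of
# the Akaike weight: the actors differ strongly in their baseline tendency to
# pull the left lever (handedness), so actor-specific intercepts greatly
# improve expected out-of-sample prediction despite the added parameters
# (pWAIC ~ 14). m11.2 and m11.3 are nearly tied, suggesting the condition
# interaction adds little beyond the prosocial-option effect.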