This chapter described some of the most common generalized linear models, those used to model counts. It is important to never convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that parameters are on a different scale, typically log-odds (for binomial) or log-rate (for Poisson), than the outcome variable they describe. Therefore computing implied predictions is even more important than before.
Place each answer inside the code chunk (grey box). Each code chunk should contain a text response or code that answers the question or completes the requested activity. Make sure to include plots if the question requests them. Problems are labeled Easy (E), Medium (M), and Hard (H).
Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html and publish the assignment to your R Pubs account and submit the link to Canvas. Each question is worth 5 points.
11E1. If an event has probability 0.35, what are the log-odds of this event?
log(0.35/(1-0.35))
## [1] -0.6190392
11E2. If an event has log-odds 3.2, what is the probability of this event?
logistic(3.2)
## [1] 0.9608343
# Check: log(0.96/(1-0.96)) is about 3.18, close to the original log-odds of 3.2
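The same conversions are available in base R without the rethinking package. A minimal sketch, assuming `plogis()` and `qlogis()` from the stats package are acceptable stand-ins for `logistic()` and `logit()`:

```r
# Base-R equivalents of the rethinking helpers (stats is attached by default).
p_from_logodds <- plogis(3.2)    # inverse logit: 1 / (1 + exp(-3.2))
logodds_from_p <- qlogis(0.35)   # logit: log(0.35 / (1 - 0.35))

# The hand-written formulas give the same values.
stopifnot(all.equal(p_from_logodds, 1 / (1 + exp(-3.2))))
stopifnot(all.equal(logodds_from_p, log(0.35 / (1 - 0.35))))

print(c(p_from_logodds, logodds_from_p))
```

The two printed values should match the outputs of 11E2 and 11E1.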
11E3. Suppose that a coefficient in a logistic regression has value 1.7. What does this imply about the proportional change in odds of the outcome?
exp(1.7)
## [1] 5.473947
# A one-unit increase in the predictor multiplies the odds of the outcome by exp(1.7), about 5.47. This is a proportional change in odds, not an additive change in probability.
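A short sketch of what a proportional change in odds means in practice. The baseline probabilities below are chosen only for illustration; the point is that the odds ratio is the same at every baseline while the change in probability is not.

```r
# Baseline probabilities chosen only for illustration.
p0 <- c(0.10, 0.50, 0.90)
beta <- 1.7                          # coefficient from the question

odds0 <- p0 / (1 - p0)               # baseline odds
odds1 <- odds0 * exp(beta)           # odds after a one-unit increase
p1 <- odds1 / (1 + odds1)            # back to the probability scale

odds_ratio <- odds1 / odds0          # identical for every baseline
prob_change <- p1 - p0               # differs by baseline

print(data.frame(p0, p1, odds_ratio, prob_change))
```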
11E4. Why do Poisson regressions sometimes require the use of an offset? Provide an example.
# The Poisson distribution arises as a limiting form of the binomial. Sometimes it is more relevant to model rates than raw counts, which is the case when observations do not share the same exposure (for example, the same length of follow-up time). Five cases over 1 year should not count the same as five cases over 10 years. If T_x is the exposure time for observations with covariate x, then log(T_x) is added to the linear model as the offset.
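The five-cases example can be sketched with base R's `glm()`. The `group` indicator is a hypothetical covariate added only so the two observations get their own rates; with two observations and two parameters the fit is saturated, so it recovers the raw rates exactly.

```r
# The two observations from the example: five cases each, different exposure.
d_off <- data.frame(
  cases = c(5, 5),
  years = c(1, 10),
  group = c(0, 1)      # hypothetical indicator for the second observation
)

# log(years) enters as an offset, so the coefficients describe rates per year.
fit_off <- glm(cases ~ group + offset(log(years)),
               family = poisson, data = d_off)

# Rate for group 0, then for group 1, on the per-year scale.
rate_per_year <- exp(cumsum(coef(fit_off)))
print(rate_per_year)
```

Without the offset both observations would be treated as the same count of five; with it, the implied rates are five per year and half a case per year.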
11M1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?
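# The likelihood changes because the aggregated form includes an extra multiplicative factor, the binomial coefficient choose(n, k), which counts the number of orderings of the k successes among n trials. The disaggregated (0/1) form fixes the order, so this factor is absent. The factor does not depend on p, so it does not affect inference.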
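The difference between the two likelihoods can be checked numerically. A minimal sketch with hypothetical 0/1 data: the gap between the aggregated and disaggregated log-likelihoods is the log binomial coefficient, and it is the same for every value of p.

```r
# Hypothetical data: 9 Bernoulli trials with 6 successes.
y <- c(1, 0, 1, 1, 0, 1, 1, 0, 1)
n <- length(y)
k <- sum(y)

p_grid <- c(0.2, 0.5, 0.8)   # a few candidate values of p

# Log-likelihood of the disaggregated (0/1) data and of the aggregated count.
ll_disagg <- sapply(p_grid, function(p) sum(dbinom(y, size = 1, prob = p, log = TRUE)))
ll_agg    <- sapply(p_grid, function(p) dbinom(k, size = n, prob = p, log = TRUE))

# The gap is the log binomial coefficient, the same for every p.
gap <- ll_agg - ll_disagg
print(cbind(p_grid, ll_disagg, ll_agg, gap, log_choose = lchoose(n, k)))
```

Because the gap is constant in p, it cancels when the posterior is normalized.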
11M2. If a coefficient in a Poisson regression has value 1.7, what does this imply about the change in the outcome?
exp(1.7)
## [1] 5.473947
# A one-unit increase in the predictor multiplies the expected count by exp(1.7), about 5.47. The change is multiplicative, not an additive 5.47 units.
11M3. Explain why the logit link is appropriate for a binomial generalized linear model.
# The logit link maps a parameter p_i that is constrained to lie between zero and one onto the whole real line: logit(p_i) = log(p_i/(1-p_i))
# Here p_i is a probability, so the linear model can take any real value while the implied p_i stays inside (0, 1), which is what a binomial GLM needs.
# logit is only defined on (0, 1), so the curve is drawn inside that interval
curve(logit, from = 0.001, to = 0.999)
11M4. Explain why the log link is appropriate for a Poisson generalized linear model.
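# The log link maps a rate lambda that must be positive onto the whole real line, so the linear model can take any real value while exp() of it stays positive.
# log is only defined for positive values, so the curve starts just above zero
curve(log, from = 0.001, to = 100000)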
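A small numerical sketch of why the log link suits a Poisson mean. The linear-predictor values are hypothetical; the comparison with an identity link is included only as a contrast.

```r
# Hypothetical linear-predictor values spanning negative and positive numbers.
eta <- seq(-5, 5, by = 0.5)

lambda_log <- exp(eta)   # log link: lambda = exp(eta), always positive
lambda_identity <- eta   # identity link: lambda = eta, can be zero or negative

print(range(lambda_log))
print(sum(lambda_identity <= 0))  # number of invalid Poisson means
```

Every value under the log link is a valid Poisson mean, whereas the identity link produces non-positive means for part of the range.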
11M5. What would it imply to use a logit link for the mean of a Poisson generalized linear model? Can you think of a real research problem for which this would make sense?
# This implies that the mean mu lies between zero and one. The Poisson distribution is defined by a single parameter, and the usual premise of a Poisson regression is that the GLM models a count with an unknown maximum.
# To resolve this, we can confine the Poisson mean to a particular range.
# For example, the COVID-19 test problem can be constrained with a log(p/(s-p)) link, which keeps p between zero and s.
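The log(p/(s-p)) form can be sketched as a link and its inverse. The function names and the ceiling `s = 50` are hypothetical choices for illustration, not part of any package.

```r
# s is a hypothetical known ceiling on the mean; eta is the linear predictor.
scaled_logit <- function(mu, s) log(mu / (s - mu))
inv_scaled_logit <- function(eta, s) s * plogis(eta)

s <- 50
eta <- c(-10, -1, 0, 1, 10)
mu <- inv_scaled_logit(eta, s)

print(mu)                          # all values fall strictly between 0 and s
print(scaled_logit(mu, s) - eta)   # round trip recovers eta up to rounding
```

With `s = 1` this reduces to the ordinary logit link.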
11M6. State the constraints for which the binomial and Poisson distributions have maximum entropy. Are the constraints different at all for binomial and Poisson? Why or why not?
# The binomial distribution has maximum entropy under two constraints: 1. discrete binary outcomes 2. constant probability of each outcome across trials
# It is defined by the number of trials and the probability. The experiment reduces to a series of independent and identical Bernoulli trials with only two outcomes. The Poisson distribution is derived as a limiting form of the binomial, where n -> ∞ and p -> 0. Since this does not change the underlying constraints, the Poisson is a maximum entropy distribution under the same constraints.
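The limiting relationship between the two distributions can be checked numerically. A minimal sketch, with n and p chosen only for illustration:

```r
# Hypothetical setting: many trials, small success probability.
n <- 1000
p <- 0.002
lambda <- n * p             # matching Poisson mean

k <- 0:20
binom_pmf <- dbinom(k, size = n, prob = p)
pois_pmf  <- dpois(k, lambda)

max_diff <- max(abs(binom_pmf - pois_pmf))
print(max_diff)             # largest pointwise gap between the two pmfs
```

Increasing n while holding n * p fixed should shrink the gap further.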
11M7. Use quap to construct a quadratic approximate posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Plot and compare the quadratic approximation to the posterior distribution produced instead from MCMC. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Plot and compare the posterior distributions. Do the differences increase or decrease? Why?
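library(rethinking)
data('chimpanzees')
d <- chimpanzees
d$recipient <- NULL  # drop the unused recipient column

# Quadratic approximation (quap, formerly called map) with a unique
# intercept for each actor and the relaxed Normal(0, 10) priors
m11_7 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + (bp + bpc*condition)*prosoc_left,
    a[actor] ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpc ~ dnorm(0, 10)
  ),
  data = d
)
pairs(m11_7)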
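The comparison the question asks for can be illustrated in base R without fitting the chimpanzee model. The sketch below is not the chimpanzee model: it uses hypothetical data for a single actor who pulled left on every trial, a Normal(0, 10) prior on the intercept, and a fine grid as a stand-in for the MCMC posterior. The quadratic approximation is built by hand from the posterior mode and curvature.

```r
# Hypothetical data: one actor who pulled left on all 20 trials.
n_trials <- 20
n_left <- 20

# Unnormalized log posterior for intercept a with a Normal(0, 10) prior.
log_post <- function(a) {
  dbinom(n_left, n_trials, plogis(a), log = TRUE) + dnorm(a, 0, 10, log = TRUE)
}

# Quadratic approximation: Gaussian centered at the mode with matching curvature.
quad_mean <- optimize(log_post, interval = c(-20, 60), maximum = TRUE)$maximum
p_mode <- plogis(quad_mean)
quad_sd <- 1 / sqrt(n_trials * p_mode * (1 - p_mode) + 1 / 10^2)

# Grid approximation as a stand-in for the exact (MCMC-like) posterior.
a_grid <- seq(-20, 60, length.out = 4001)
w <- exp(log_post(a_grid) - max(log_post(a_grid)))
w <- w / sum(w)
grid_mean <- sum(a_grid * w)
grid_sd <- sqrt(sum((a_grid - grid_mean)^2 * w))

print(c(quad_mean = quad_mean, quad_sd = quad_sd,
        grid_mean = grid_mean, grid_sd = grid_sd))
```

In this setup the grid posterior has a long right tail, because the data cannot rule out very large intercepts and only the wide prior bounds them. A symmetric Gaussian centered at the mode cannot represent that skew, so its mean and spread fall below the grid values.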
11M8. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. What changes do you observe?
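library('dplyr')
data(Kline)
kDat <- Kline
# Index contact rate (1 = low, 2 = high), standardize log population, then drop Hawaii.
# Note that stdPop is standardized over all societies, before Hawaii is removed.
kDat <- kDat %>%
  dplyr::mutate(cid = ifelse(contact == "high", 2, 1),
                stdPop = standardize(log(population))) %>%
  dplyr::filter(culture != "Hawaii")
dataList <- list(
  totTools = kDat$total_tools,
  stdPop = kDat$stdPop,
  cid = as.integer(kDat$cid)
)
# Poisson model with a separate intercept and slope for each contact level
m11_8 <- ulam(
  alist(
    totTools ~ dpois(lambda),
    log(lambda) <- a[cid] + b[cid]*stdPop,
    a[cid] ~ dnorm(3, 0.5),
    b[cid] ~ dnorm(0, 0.2)
  ), data = dataList, chains = 4, cores = 4
)
pairs(m11_8)
m11_8 %>% precis(2)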
##           mean         sd        5.5%     94.5%    n_eff     Rhat4
## a[1] 3.1793203 0.12729884  2.96582405 3.3782197 1691.368 0.9998414
## a[2] 3.6117057 0.07382473  3.49626185 3.7277584 1779.533 1.0020825
## b[1] 0.1902573 0.12816463 -0.00860378 0.3971389 1653.918 1.0004470
## b[2] 0.1909199 0.15284178 -0.05823923 0.4389852 1765.421 0.9997861
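Because the parameters are on the log scale, it helps to convert them to implied tool counts. The sketch below plugs in the posterior means of the intercepts reported above. It is a rough plug-in calculation rather than a summary of the full posterior, and the labels `low_contact` and `high_contact` are added here to follow the `cid` coding (1 = low, 2 = high).

```r
# Posterior means copied from the precis output above (a[1], a[2]).
a_mean <- c(low_contact = 3.1793203, high_contact = 3.6117057)

# At stdPop = 0 the slope term vanishes, so lambda = exp(a).
lambda_at_mean_pop <- exp(a_mean)
print(round(lambda_at_mean_pop, 1))
```

These are the expected total tools at stdPop = 0, which corresponds to the average log population used in the standardization.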
11H1. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.
data("chimpanzees")
d2 <- chimpanzees

# Intercept only
m11h1_1 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(0, 10)
  ),
  data = d2)

# Adds the prosocial option
m11h1_2 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + bp*prosoc_left,
    a ~ dnorm(0, 10),
    bp ~ dnorm(0, 10)
  ),
  data = d2)

# Adds the interaction with condition
m11h1_3 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + (bp + bpC*condition)*prosoc_left,
    a ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ),
  data = d2)

# Adds a unique intercept for each actor
m11h1_4 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left,
    a[actor] ~ dnorm(0, 10),
    bp ~ dnorm(0, 10),
    bpC ~ dnorm(0, 10)
  ),
  data = d2)

compare(m11h1_1, m11h1_2, m11h1_3, m11h1_4)
##             WAIC        SE    dWAIC      dSE      pWAIC       weight
## m11h1_4 549.9617 18.554308   0.0000       NA 15.5689313 1.000000e+00
## m11h1_2 680.5102  9.232976 130.5485 17.98034  2.0057272 4.484884e-29
## m11h1_3 682.1850  9.317678 132.2233 17.92136  2.9205058 1.941175e-29
## m11h1_1 687.9396  7.156139 137.9780 18.83602  0.9996638 1.092599e-30
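The weight column can be reproduced by hand from the WAIC values, which makes the comparison easier to interpret. A minimal sketch, assuming the weights are the usual Akaike-style weights (relative likelihoods exp(-0.5 * dWAIC), normalized to sum to one):

```r
# WAIC values copied from the compare() table above.
waic <- c(m11h1_4 = 549.9617, m11h1_2 = 680.5102,
          m11h1_3 = 682.1850, m11h1_1 = 687.9396)

d_waic <- waic - min(waic)        # difference from the best model
rel_lik <- exp(-0.5 * d_waic)     # relative likelihood of each model
weight <- rel_lik / sum(rel_lik)  # normalized to sum to one

print(signif(weight, 4))
```

The model with a unique intercept per actor, m11h1_4, takes essentially all of the weight. Its WAIC gap to the next model is about 130, several times the reported dSE of about 18, while the three simpler models differ from one another by only a few points.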