Chapter 11 - God Spiked the Integers

This chapter described some of the most common generalized linear models, those used to model counts. It is important never to convert counts to proportions before analysis, because doing so destroys information about sample size. A fundamental difficulty with these models is that the parameters are on a different scale, typically log-odds (for binomial) or log-rate (for Poisson), than the outcome variable they describe. Therefore computing implied predictions is even more important than before.

Place each answer inside the code chunk (grey box). The code chunks should contain a text response or code that completes/answers the question or activity requested. Make sure to include plots if the question requests them.

Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html and publish the assignment to your R Pubs account and submit the link to Canvas. Each question is worth 5 points.

Questions

11-1. As explained in the chapter, binomial data can be organized in aggregated and disaggregated forms, without any impact on inference. But the likelihood of the data does change when the data are converted between the two formats. Can you explain why?

## Explanation (referencing the card example from an earlier chapter of "Rethinking"):

# Take the card example for reference. Suppose each draw produces a white card with probability p = 2/3 (two white cards out of three) and that in three draws we observe two white cards and one black card. In the aggregated form, the binomial likelihood of this observation is choose(3, 2) * p^2 * (1 - p) = 3 * p^2 * (1 - p): the constant 3 counts the number of orderings in which two whites and one black can occur.

# In the disaggregated form, each draw is modeled as its own Bernoulli outcome and the per-draw probabilities are multiplied together, giving p^2 * (1 - p) with no multiplicity constant, because the order of the draws is now itself part of the data.

# So the aggregated likelihood carries an extra constant that accounts for all the permutations of the outcomes. That constant changes the likelihood and log-likelihood of the data, but because it does not depend on p it cancels when the posterior is normalized, so inference is unchanged.
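
# A small numeric sketch of this (the specific numbers are for illustration only): with 2 successes in 3 trials at p = 2/3, the two likelihoods differ exactly by choose(3, 2) = 3, a factor that does not involve p.

p <- 2/3
aggregated <- dbinom(2, size = 3, prob = p)  # binomial likelihood, includes choose(3, 2)
disaggregated <- p^2 * (1 - p)               # product of Bernoulli likelihoods
aggregated / disaggregated                   # equals choose(3, 2) = 3, for any p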

11-2. Use quap to construct a quadratic approximate joint posterior distribution for the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330). Plot and compare the quadratic approximation to the joint posterior distribution produced instead from MCMC. Do not use the ‘pairs’ plot. Can you explain both the differences and the similarities between the approximate and the MCMC distributions? Relax the prior on the actor intercepts to Normal(0,10). Re-estimate the posterior using both ulam and quap. Plot and compare the posterior distributions. Do not use the ‘pairs’ plot. Do the differences increase or decrease? Why?

## Loading the chimpanzees data set. The rethinking package (and dplyr, used in later questions) must be attached first.

library(rethinking)
library(dplyr)

data("chimpanzees")


dta_chmp <- chimpanzees


str(dta_chmp)
## 'data.frame':    504 obs. of  8 variables:
##  $ actor       : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ recipient   : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ condition   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ block       : int  1 1 1 1 1 1 2 2 2 2 ...
##  $ trial       : int  2 4 6 8 10 12 14 16 18 20 ...
##  $ prosoc_left : int  0 0 1 0 1 1 1 1 0 0 ...
##  $ chose_prosoc: int  1 0 0 1 1 1 0 0 1 1 ...
##  $ pulled_left : int  0 1 0 0 1 1 0 0 0 0 ...
set.seed(123)

dta_chmp$treatment <- 1 + dta_chmp$prosoc_left + 2 * dta_chmp$condition

Data_list <- list(
  pulled_left = dta_chmp$pulled_left,
  actor = dta_chmp$actor,
  treatment = as.integer(dta_chmp$treatment))
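
# Quick sanity check (mirroring the chapter): cross-tabulate the treatment index against its components to confirm the four treatment combinations are coded as intended.

xtabs(~ treatment + prosoc_left + condition, dta_chmp)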


## Fitting the MCMC chimpanzee model (m11.4):

Model_11.4 <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = Data_list, chains = 4, log_lik = TRUE)
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## (per-chain iteration progress omitted)
## All 4 chains finished successfully.
## Mean chain execution time: 0.7 seconds.
## Total execution time: 3.2 seconds.
## Determination of "precis" for our model (Model_11.4)

precis(Model_11.4, depth = 2)
##             mean        sd        5.5%       94.5%     n_eff     Rhat4
## a[1] -0.43226290 0.3169509 -0.93153209  0.07713279  725.1897 0.9999461
## a[2]  3.93960628 0.7856877  2.77534850  5.29598075 1102.8927 1.0019990
## a[3] -0.73576828 0.3338179 -1.25944465 -0.19861085  744.3575 1.0028710
## a[4] -0.72912108 0.3329218 -1.27069250 -0.19264369  636.2889 1.0021449
## a[5] -0.43350177 0.3285755 -0.93899370  0.08061543  686.9268 1.0021754
## a[6]  0.49898114 0.3243207 -0.01816447  1.00690500  703.9423 1.0017723
## a[7]  1.97010433 0.4091758  1.31875755  2.63334540 1002.8958 0.9996774
## b[1] -0.05600863 0.2821516 -0.51113120  0.39173633  655.3929 1.0010477
## b[2]  0.47218438 0.2853034  0.01720480  0.92449658  654.7957 1.0003854
## b[3] -0.39457784 0.2858289 -0.85436577  0.06337371  609.8488 1.0044779
## b[4]  0.35228353 0.2803761 -0.08034195  0.81656701  612.8803 1.0029137
## Constructing the quadratic approximation to the joint posterior for the chimpanzee model with a unique intercept for each actor, using quap.

Quap_Model_11.4 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = Data_list
)



## Plotting the quadratic approximation against the joint posterior produced by MCMC.

# Refitting the chapter's MCMC model m11.4 for use in the plot.

m11.4 <- ulam(
    alist(
        pulled_left ~ dbinom( 1 , p ) ,
        logit(p) <- a[actor] + b[treatment] ,
        a[actor] ~ dnorm( 0 , 1.5 ),
        b[treatment] ~ dnorm( 0 , 0.5 )
    ) , data=Data_list) 
## Running MCMC with 1 chain, with 1 thread(s) per chain...
## (iteration progress omitted)
## Chain 1 finished in 0.4 seconds.
## Coefficient plot for the quadratic (quap) model.

plot(coeftab(Quap_Model_11.4))

## Coefficient plot for the MCMC model (m11.4).
plot(coeftab(m11.4))

## Coefficient plot of our 'Quap' and 'MCMC' models for comparison purposes.

ct <- coeftab(m11.4, Quap_Model_11.4)
plot(ct,
  labels = paste(rep(rownames(ct@coefs), each = 2),
    rep(c("MCMC", "quap"), times = nrow(ct@coefs)),
    sep = " - "
  )
)

## Plot result inference for the quap and MCMC models:

# The coefficient plots show that the quadratic (quap) approximation tracks the MCMC (ulam) estimates closely for nearly every parameter. The one tangible difference is the intercept for actor 2, a[2], where the MCMC estimate is slightly larger than the quadratic approximation's.



## Posterior density plots of the a[2] estimates from the quap and ulam models, to evaluate the differences:


Posterior_MCMC <- extract.samples(Model_11.4)

Posterior_Quap <- extract.samples(Quap_Model_11.4)

dens(Posterior_MCMC$a[, 2], lwd = 2, col = "blue")

dens(Posterior_Quap$a[, 2], add = TRUE, lwd = 2, col = "red")

## Posterior plot differences, reasons and analysis:

# Comparing the density plots for the two models, the MCMC (ulam) posterior for a[2] places somewhat more probability mass in the upper tail than the quadratic approximation does, which pulls its posterior mean slightly to the right of the quap posterior. The quadratic approximation (Quap_Model_11.4) is constrained to be Gaussian and therefore symmetric, so it cannot reproduce this mild right skew.
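
# To quantify the tail difference (a quick check, not part of the original assignment), compare upper quantiles of the two sets of posterior samples; the MCMC quantiles should sit to the right of the quap ones.

quantile(Posterior_MCMC$a[, 2], probs = c(0.5, 0.9, 0.99))
quantile(Posterior_Quap$a[, 2], probs = c(0.5, 0.9, 0.99))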





#### Refitting the quap and ulam models with the prior on the actor intercepts relaxed to Normal(0, 10).

### Re-estimating the posterior using both ulam and quap.

## Revised MCMC (ulam) model:

Revised_Model_11.4 <- ulam(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 10),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = Data_list, chains = 4, log_lik = TRUE)
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## (per-chain iteration progress omitted)
## All 4 chains finished successfully.
## Mean chain execution time: 0.9 seconds.
## Total execution time: 3.7 seconds.
## Revised Quadratic (quap) model:

Revised_Quap_Model_11.4 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a[actor] + b[treatment],
    a[actor] ~ dnorm(0, 10),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = Data_list
)



## Refitting the chapter's m11.4 via MCMC with the relaxed Normal(0, 10) prior on the actor intercepts, for use in the plot.

Revised_m11.4 <- ulam(
    alist(
        pulled_left ~ dbinom( 1 , p ) ,
        logit(p) <- a[actor] + b[treatment] ,
        a[actor] ~ dnorm( 0 , 10 ),
        b[treatment] ~ dnorm( 0 , 0.5 )
    ) , data=Data_list) 
## Running MCMC with 1 chain, with 1 thread(s) per chain...
## (iteration progress omitted)
## Chain 1 finished in 0.7 seconds.
## Collective "Coef" plot of our Revised "Quap" and "MCMC" with relaxed prior on the actor intercepts to Normal(0,10) models for comparison purposes.

ct2 <- coeftab(Revised_m11.4, Revised_Quap_Model_11.4)
plot(ct2,
  labels = paste(rep(rownames(ct2@coefs), each = 2),
    rep(c("MCMC", "quap"), times = nrow(ct2@coefs)),
    sep = " - "
  )
)

## Plot result inference for the revised quap and MCMC models with the relaxed Normal(0, 10) prior on the actor intercepts:

# The quadratic approximation still matches the MCMC estimates well for most parameters, but the single discrepancy at a[2] is now much larger: the revised MCMC (ulam) estimate of actor 2's intercept is substantially greater than the revised quadratic (quap) estimate.
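
# A numeric check of the a[2] discrepancy, pulling the estimates straight from the coeftab object built for the plot above (indexing by row name, as the coeftab @coefs matrix permits):

ct2@coefs["a[2]", ]  # MCMC vs quap posterior means for actor 2's intercept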



## Posterior density plots of the a[2] estimates from the revised quap and ulam models, for further evaluation of the differences:


Rvsd_Posterior_MCMC <- extract.samples(Revised_Model_11.4)

Rvsd_Posterior_Quap <- extract.samples(Revised_Quap_Model_11.4)

dens(Rvsd_Posterior_MCMC$a[, 2], lwd = 2, col = "grey")

dens(Rvsd_Posterior_Quap$a[, 2], add = TRUE, lwd = 2, col = "green")

## Posterior plot differences, reasons and analysis for the revised models with the relaxed Normal(0, 10) prior:

# In the revised models, the MCMC (ulam) posterior for a[2] places far more probability mass in the upper tail, pushing its mean well to the right of the revised quadratic (quap) posterior (Revised_Quap_Model_11.4). The difference has increased relative to the earlier plot because actor 2 pulled left on every trial, so the likelihood alone cannot rule out very large intercepts; with the weak Normal(0, 10) prior there is little regularization left, and the true posterior becomes strongly right-skewed. The quadratic approximation assumes a Gaussian, hence symmetric, posterior, so it places too little mass in that upper tail and the gap between the two fits grows.
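
# A quick data check of the skew's source: actor 2 pulled the left lever on every one of its trials, so the data cannot bound its intercept from above.

mean(dta_chmp$pulled_left[dta_chmp$actor == 2])  # proportion of left pulls for actor 2; equals 1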

11-3. Revisit the data(Kline) islands example. This time drop Hawaii from the sample and refit the models. Plot the joint posterior. What changes do you observe?

data("Kline")


## Loading the data(Kline) data set, standardizing the log of population, and adding an index variable for contact:

Data_Kline <- Kline %>% 
  mutate(P = standardize(log(population)))
                         
Data_Kline$id <- ifelse(Data_Kline$contact=="high", 2, 1)

Data_Kline
##       culture population contact total_tools mean_TU            P id
## 1    Malekula       1100     low          13     3.2 -1.291473310  1
## 2     Tikopia       1500     low          22     4.7 -1.088550750  1
## 3  Santa Cruz       3600     low          24     4.0 -0.515764892  1
## 4         Yap       4791    high          43     5.0 -0.328773359  2
## 5    Lau Fiji       7400    high          33     5.0 -0.044338980  2
## 6   Trobriand       8000    high          19     4.0  0.006668287  2
## 7       Chuuk       9200    high          40     3.8  0.098109204  2
## 8       Manus      13000     low          28     6.6  0.324317564  1
## 9       Tonga      17500    high          55     5.4  0.518797917  2
## 10     Hawaii     275000     low          71     6.6  2.321008320  1
## Dropping the "Hawaii" from our standardized "Data_Kline" data set.

Data_Excldng_Hawaii <- filter(Data_Kline, culture != "Hawaii")

Data_Excldng_Hawaii
##      culture population contact total_tools mean_TU            P id
## 1   Malekula       1100     low          13     3.2 -1.291473310  1
## 2    Tikopia       1500     low          22     4.7 -1.088550750  1
## 3 Santa Cruz       3600     low          24     4.0 -0.515764892  1
## 4        Yap       4791    high          43     5.0 -0.328773359  2
## 5   Lau Fiji       7400    high          33     5.0 -0.044338980  2
## 6  Trobriand       8000    high          19     4.0  0.006668287  2
## 7      Chuuk       9200    high          40     3.8  0.098109204  2
## 8      Manus      13000     low          28     6.6  0.324317564  1
## 9      Tonga      17500    high          55     5.4  0.518797917  2
## Building the data list for refitting the model after dropping Hawaii:

Dat <- list(
    T = Data_Excldng_Hawaii$total_tools ,
    P = Data_Excldng_Hawaii$P ,
    cid = Data_Excldng_Hawaii$id )

Dat
## $T
## [1] 13 22 24 43 33 19 40 28 55
## 
## $P
## [1] -1.291473310 -1.088550750 -0.515764892 -0.328773359 -0.044338980
## [6]  0.006668287  0.098109204  0.324317564  0.518797917
## attr(,"scaled:center")
## [1] 8.977005
## attr(,"scaled:scale")
## [1] 1.52844
## 
## $cid
## [1] 1 1 1 2 2 2 2 1 2
## Refitting the chapter's interaction model (m11.10) on the data without Hawaii:

set.seed(123)

Revised_m11.10 <- ulam(
  alist(
    T ~ dpois(lambda),
    log(lambda) <- a[cid] + b[cid] * P,
    a[cid] ~ dnorm(3, 0.5),
    b[cid] ~ dnorm(0, 0.2)
  ),
  data = Dat, chains = 4, log_lik = TRUE)
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## (per-chain iteration progress omitted)
## All 4 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.5 seconds.
## Plotting the joint posterior predictions of total tools against population for the high- and low-contact societies, with Hawaii dropped (following the chapter's island model):

k <- PSIS(Revised_m11.10 , pointwise=TRUE )$k

ns <- 100

plot( Data_Excldng_Hawaii$population , Data_Excldng_Hawaii$total_tools , xlab="population" , ylab="total tools" ,
    col=rangi2 , pch=ifelse( Dat$cid==1 , 1 , 16 ) , lwd=2 ,
    ylim=c(0,75) , cex=1+normalize(k) )
P_seq <- seq( from=-5 , to=3 , length.out=ns )
pop_seq <- exp( P_seq*1.53 + 9 ) # 1.53 and 9 are the sd and mean of log(population)
lambda <- link( Revised_m11.10 , data=data.frame( P=P_seq , cid=1 ) )
lmu <- apply( lambda , 2 , mean )
lci <- apply( lambda , 2 , PI )
lines( pop_seq , lmu , lty=2 , lwd=1.5 )
shade( lci , pop_seq , xpd=TRUE )
lambda <- link(Revised_m11.10 , data=data.frame( P=P_seq , cid=2 ) )
lmu <- apply( lambda , 2 , mean )
lci <- apply( lambda , 2 , PI )
lines( pop_seq , lmu , lty=1 , lwd=1.5 )
shade( lci , pop_seq , xpd=TRUE )

## Result inference:

# After refitting without Hawaii, the posterior mean trends for the low-contact and high-contact societies are nearly identical. This differs from the chapter's posterior prediction plot, where the two trends diverged clearly. Hawaii is a highly influential point (a very large population, many tools, and low contact), so its presence in the original fit largely drove the apparent difference between the high- and low-contact relationships; removing it removes most of that difference.
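
# As an additional check, the high- and low-contact slopes can be contrasted directly in the posterior; an interval comfortably covering zero supports the "nearly identical slopes" reading.

post <- extract.samples(Revised_m11.10)
slope_diff <- post$b[, 2] - post$b[, 1]  # high-contact minus low-contact slope
mean(slope_diff)
PI(slope_diff)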

11-4. Use WAIC or PSIS to compare the chimpanzee model that includes a unique intercept for each actor, m11.4 (page 330), to the simpler models fit in the same section. Interpret the results.

data("chimpanzees")


dta_chmp <- chimpanzees


dta_chmp$treatment <- 1 + dta_chmp$prosoc_left + 2 * dta_chmp$condition

Data_list <- list(
  pulled_left = dta_chmp$pulled_left,
  actor = dta_chmp$actor,
  treatment = as.integer(dta_chmp$treatment))

## Fitting the simpler chimpanzee models from the same chapter section for comparison with the unique-intercept model (m11.4).

## The intercept-only model (m11.1):

Intercept_m11.1 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a,
    a ~ dnorm(0, 10)
  ),
  data = dta_chmp
)


## The common-intercept-plus-treatment model (m11.3):

Intrcpt_Trtmnt_m11.3 <- quap(
  alist(
    pulled_left ~ dbinom(1, p),
    logit(p) <- a + b[treatment],
    a ~ dnorm(0, 1.5),
    b[treatment] ~ dnorm(0, 0.5)
  ),
  data = dta_chmp
)



## The chapter's model m11.4, with a unique intercept per actor plus treatment effects:

Model_11.4 <- ulam(
    alist(
        pulled_left ~ dbinom( 1 , p ) ,
        logit(p) <- a[actor] + b[treatment] ,
        a[actor] ~ dnorm( 0 , 1.5 ),
        b[treatment] ~ dnorm( 0 , 0.5 )
    ) , data=Data_list , chains=4 , log_lik=TRUE )
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## (per-chain iteration progress omitted)
## All 4 chains finished successfully.
## Mean chain execution time: 0.7 seconds.
## Total execution time: 3.1 seconds.
## Comparing Intercept_m11.1, Intrcpt_Trtmnt_m11.3, and Model_11.4 with the compare() function, using both WAIC and PSIS.

## Comparison using WAIC:

Comp_Waic <-
  compare(Intercept_m11.1, Intrcpt_Trtmnt_m11.3, Model_11.4, func = WAIC) %>% 
  as_tibble(rownames = "Model") %>% 
  knitr::kable(digits = 2)

Comp_Waic
| Model                | WAIC   | SE    | dWAIC  | dSE   | pWAIC | weight |
|----------------------|--------|-------|--------|-------|-------|--------|
| Model_11.4           | 531.88 | 18.91 | 0.00   | NA    | 8.31  | 1      |
| Intrcpt_Trtmnt_m11.3 | 682.28 | 9.15  | 150.40 | 18.40 | 3.54  | 0      |
| Intercept_m11.1      | 687.84 | 7.11  | 155.96 | 18.94 | 0.95  | 0      |
## Comparison using PSIS:

Comp_PSIS <- 
  compare(Intercept_m11.1, Intrcpt_Trtmnt_m11.3, Model_11.4, func = PSIS) %>% 
  as_tibble(rownames = "Model") %>% 
  knitr::kable(digits = 2)

Comp_PSIS
| Model                | PSIS   | SE    | dPSIS  | dSE   | pPSIS | weight |
|----------------------|--------|-------|--------|-------|-------|--------|
| Model_11.4           | 531.94 | 18.93 | 0.00   | NA    | 8.34  | 1      |
| Intrcpt_Trtmnt_m11.3 | 682.76 | 8.99  | 150.82 | 18.37 | 3.76  | 0      |
| Intercept_m11.1      | 687.90 | 7.02  | 155.96 | 18.92 | 0.98  | 0      |
## Plotting the WAIC and PSIS comparisons for easier visual inspection:


plot(compare(Intercept_m11.1, Intrcpt_Trtmnt_m11.3, Model_11.4, func = WAIC))

plot(compare(Intercept_m11.1, Intrcpt_Trtmnt_m11.3, Model_11.4, func = PSIS))

## Model comparison results inference:

# Both the WAIC and PSIS comparisons tell the same story: Model_11.4, with a unique intercept for each actor plus treatment effects, has by far the lowest score (WAIC 531.88, PSIS 531.94 in the tables above), roughly 150 points of deviance better than m11.3 and m11.1, and it carries essentially all of the Akaike weight. Most of the variation in pulled_left comes from stable handedness differences among the individual actors, so the model that accounts for them fits best.
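
# As a rough scale check using the dWAIC and dSE values from the WAIC table above: the next-best model is about eight standard errors worse, so the ranking is decisive.

150.40 / 18.40  # dWAIC / dSE for Intrcpt_Trtmnt_m11.3, roughly 8.2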

11-5. Explain why the logit link is appropriate for a binomial generalized linear model?

## Response:

# The logit link is appropriate for a binomial generalized linear model because the parameter being modeled is a probability and must lie between 0 and 1, while a linear model such as a + b*x ranges over the entire real line. The logit function, log(p / (1 - p)), maps the probability space (0, 1) onto the real line, so setting logit(p) equal to the linear model guarantees that every prediction, once passed back through the inverse logit, is a valid probability. It also gives the parameters a natural interpretation as changes in log-odds.
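
# A minimal sketch of the link and its inverse (rethinking provides inv_logit as well, but base R suffices):

logit <- function(p) log(p / (1 - p))        # maps probabilities in (0, 1) onto the real line
inv_logit <- function(x) 1 / (1 + exp(-x))   # maps the real line back into (0, 1)
inv_logit(logit(0.73))                       # round trip recovers 0.73
inv_logit(c(-10, 0, 10))                     # linear-scale values of any size stay valid probabilities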