Assignment #7

Questions

8-1. Recall the tulips example from the chapter. Suppose another set of treatments adjusted the temperature in the greenhouse over two levels: cold and hot. The data in the chapter were collected at the cold temperature. You find none of the plants grown under the hot temperature developed any blooms at all, regardless of the water and shade levels. Can you explain this result in terms of interactions between water, shade, and temperature?

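# Tulips do not bloom at all under the hot temperature, so the effects of water and shade on blooms depend on temperature. Because water and shade already interact with each other, adding temperature produces a three-way interaction.

# Blooming is influenced not just by the combination of water and shade, but also by temperature.

# No amount of shade or water will make the tulips bloom if the temperature is too high.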

8-2. Can you invent a regression equation that would make the bloom size zero, whenever the temperature is hot?

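# Let Li stand for the ordinary linear model from the chapter (intercept, water, shade, and their interaction).

# Then, let Ti be a 0/1 indicator that equals 1 when the temperature is hot and 0 when it is cold.

# μi = Li(1 − Ti)

# When Ti = 1, the expected bloom size μi is zero, regardless of the value of Li. When Ti = 0, μi = Li, which is the original model.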
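A quick numeric check of this construction may help. The sketch below uses made-up coefficients (a, bw, bs, and bws are illustrative values, not estimates from the chapter) and two hypothetical helper functions, linear_model() and mu_bloom(). It only shows that multiplying the linear model by (1 − T) forces the prediction to zero when T = 1 and leaves it unchanged when T = 0.

```r
# Hypothetical coefficients, chosen only for illustration
a <- 0.35; bw <- 0.2; bs <- -0.1; bws <- -0.15

# L: ordinary linear model with a water-by-shade interaction
linear_model <- function(water, shade) {
  a + bw * water + bs * shade + bws * water * shade
}

# mu = L * (1 - T), where T = 1 marks the hot treatment
mu_bloom <- function(water, shade, hot) {
  linear_model(water, shade) * (1 - hot)
}

# All combinations of centered water, centered shade, and temperature
grid <- expand.grid(water = -1:1, shade = -1:1, hot = 0:1)
grid$mu <- mu_bloom(grid$water, grid$shade, grid$hot)

# Every hot row is zero; cold rows equal the plain linear model
all(grid$mu[grid$hot == 1] == 0)
cold <- grid$hot == 0
all.equal(grid$mu[cold], linear_model(grid$water[cold], grid$shade[cold]))
```

Expanding L(1 − T) gives the extra terms −a·T, −bw·W·T, −bs·S·T, and −bws·W·S·T, so temperature interacts with water, with shade, and with their product. The last term is the three-way interaction.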

8-3. Repeat the tulips analysis, but this time use priors that constrain the effect of water to be positive and the effect of shade to be negative. Use prior predictive simulation and visualize. What do these prior assumptions mean for the interaction prior, if anything?

library(rethinking)
data(tulips)
dt <- tulips

# For convenience, replace shade with a new variable, "light", that has the same magnitude as shade but the opposite sign. The constraint "shade has a negative effect" then becomes "light has a positive effect", which a log-normal prior can encode.

# Adjusting the variables
dt$light <- -1 * dt$shade
dt$blooms_std <- dt$blooms / max(dt$blooms)
dt$water_cent <- dt$water - mean(dt$water)
dt$shade_cent <- dt$shade - mean(dt$shade)
dt$light_cent <- dt$light - mean(dt$light)
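As a small sanity check on the recoding, the sketch below confirms that centered light is just centered shade with the sign flipped. It uses a stand-in vector for tulips$shade (the values 1, 2, 3 repeated nine times, which is an assumption about the design rather than the loaded data), so it runs without the rethinking package.

```r
# Stand-in for tulips$shade: three shade levels, assumed to be 1, 2, 3
shade <- rep(1:3, times = 9)
light <- -1 * shade

# Center both variables, as in the data preparation
shade_cent <- shade - mean(shade)
light_cent <- light - mean(light)

# Centered light is centered shade with the sign flipped
all(light_cent == -shade_cent)

# The three centered light levels used as plot panels
sort(unique(light_cent))
```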

# Fit the interaction model with log-normal priors, which constrain the coefficients bw, bl, and bwl to be positive
m1 <- quap(
  alist(
    blooms_std ~ dnorm(mu, sigma),
    mu <- a + bw*water_cent + bl*light_cent + bwl*water_cent*light_cent,
    a ~ dnorm(0.5, 0.25),
    bw ~ dlnorm(0, 0.25),
    bl ~ dlnorm(0, 0.25),
    bwl ~ dlnorm(0, 0.25),
    sigma ~ dexp(1)
  ), data = dt)
summary(m1)

##            mean         sd      5.5%     94.5%
## a     0.3674934 0.06889164 0.2573913 0.4775956
## bw    0.4310389 0.07918482 0.3044863 0.5575915
## bl    0.3904834 0.07899628 0.2642321 0.5167347
## bwl   0.4487317 0.08977111 0.3052601 0.5922033
## sigma 0.3712643 0.09547310 0.2186799 0.5238488

# Plot posterior predictions: one panel per light level, with 20 lines sampled from the posterior
par(mfrow=c(1,3)) # 3 plots in 1 row
for (l in -1:1) {
    idx <- which(dt$light_cent == l)
    plot( dt$water_cent[idx], dt$blooms_std[idx], xlim=c(-1,1), ylim=c(0,1),
        xlab="water", ylab="blooms", pch=16, col=rangi2)
    mu <- link(m1, data=data.frame( light_cent=l , water_cent=-1:1))
    for (i in 1:20) lines( -1:1, mu[i,], col=col.alpha("black",0.3))
}

# Plot prior predictions: same panels, with 20 lines sampled from the prior
set.seed(7)
prior <- extract.prior(m1)

par(mfrow=c(1,3)) # 3 plots in 1 row
for (l in -1:1) {
    idx <- which(dt$light_cent == l)
    plot( dt$water_cent[idx], dt$blooms_std[idx], xlim=c(-1,1), ylim=c(0,1),
        xlab="water", ylab="blooms", pch=16, col=rangi2)
    mu <- link(m1, post = prior, data=data.frame( light_cent=l , water_cent=-1:1))
    for (i in 1:20) lines( -1:1, mu[i,], col=col.alpha("black",0.3))
}

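# The prior prediction plots imply that the effect of water on blooms depends on light. Because bwl is also given a positive prior here, more light can only strengthen the effect of water, and when there is not enough light, water does not lead to blooms. Since light is the negative of shade, a positive water-by-light interaction is the same as a negative water-by-shade interaction, so these priors also constrain the interaction: more shade can only weaken the effect of water.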
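The implications of these priors can also be inspected without fitting anything, by sampling directly from the stated prior distributions. The sketch below uses base R only; the number of draws and the seed are arbitrary choices. It computes the prior slope of blooms on water at each light level, which under the model is bw + bwl * light_cent.

```r
set.seed(7)
n <- 1e4

# Draws from the priors used in m1
bw  <- rlnorm(n, 0, 0.25)
bl  <- rlnorm(n, 0, 0.25)
bwl <- rlnorm(n, 0, 0.25)

# Log-normal priors are strictly positive
c(min(bw), min(bl), min(bwl))

# Prior slope of blooms on water at each light level
slope <- sapply(c(low = -1, mid = 0, high = 1), function(l) bw + bwl * l)
round(apply(slope, 2, quantile, probs = c(0.055, 0.5, 0.945)), 2)

# Share of prior draws in which water still helps under the least light
mean(slope[, "low"] > 0)

# Median prior effect of water, in units of the full 0-1 bloom range
median(bw)
```

At the lowest light level the slope is bw − bwl, which is centered near zero because both coefficients share the same prior, while at the highest light level it is bw + bwl. A log-normal(0, 0.25) prior has a median of 1, so these priors expect one unit of water to change blooms by roughly the entire observed range. That is a strong assumption, and a smaller scale may be worth considering.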

8-4. Return to the data(tulips) example in the chapter. Now include the bed variable as a predictor in the interaction model. Don’t interact bed with the other predictors; just include it as a main effect. Note that bed is categorical. So to use it properly, you will need to either construct dummy variables or rather an index variable, as explained in Chapter 5.

d <- tulips
d$shade.c <- d$shade - mean(d$shade)
d$water.c <- d$water - mean(d$water)

# Dummy variables
d$bedb <- d$bed == "b"
d$bedc <- d$bed == "c"

# Index variable (an alternative coding; the model below uses the dummy variables)
d$bedx <- coerce_index(d$bed)

# quap() is the current name for map() in rethinking and is used earlier in this assignment
m_dummy <- quap(
  alist(
    blooms ~ dnorm(mu, sigma),
    mu <- a + bW*water.c + bS*shade.c + bWS*water.c*shade.c + bBb*bedb + bBc*bedc,
    a ~ dnorm(130, 100),
    bW ~ dnorm(0, 100),
    bS ~ dnorm(0, 100),
    bWS ~ dnorm(0, 100),
    bBb ~ dnorm(0, 100),
    bBc ~ dnorm(0, 100),
    sigma ~ dunif(0, 100)
  ),
  data = d,
  start = list(a = mean(d$blooms), bW = 0, bS = 0, bWS = 0, bBb = 0, bBc = 0, sigma = sd(d$blooms))
)
precis(m_dummy)

##            mean        sd      5.5%     94.5%
## a      99.36131 12.757521  78.97233 119.75029
## bW     75.12433  9.199747  60.42136  89.82730
## bS    -41.23103  9.198481 -55.93198 -26.53008
## bWS   -52.15060 11.242951 -70.11901 -34.18219
## bBb    42.41139 18.039255  13.58118  71.24160
## bBc    47.03141 18.040136  18.19979  75.86303
## sigma  39.18964  5.337920  30.65862  47.72067
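The question also mentions index variables, which the model above does not use. The sketch below first shows, in base R, how the two codings relate, using a stand-in for tulips$bed (three beds with nine plants each, an assumed layout). It then outlines an index-variable version of the model. That second part is an assumption-laden sketch: it uses standardized variables and chapter-style priors rather than the priors of m_dummy, the names d2 and m_index are hypothetical, and it only runs if the rethinking package is installed.

```r
# Stand-in for tulips$bed: three beds with nine plants each (assumed layout)
bed <- factor(rep(c("a", "b", "c"), each = 9))

# Dummy coding: bed "a" is the baseline absorbed into the intercept
bedb <- as.integer(bed == "b")
bedc <- as.integer(bed == "c")

# Index coding: one integer label per bed
bedx <- as.integer(bed)
table(bed, bedx)

# Index-variable model, run only if rethinking is available
if (requireNamespace("rethinking", quietly = TRUE)) {
  library(rethinking)
  data(tulips)
  d2 <- data.frame(
    blooms_std = tulips$blooms / max(tulips$blooms),
    water_cent = tulips$water - mean(tulips$water),
    shade_cent = tulips$shade - mean(tulips$shade),
    bedx       = as.integer(tulips$bed)
  )
  m_index <- quap(
    alist(
      blooms_std ~ dnorm(mu, sigma),
      mu <- a[bedx] + bW*water_cent + bS*shade_cent + bWS*water_cent*shade_cent,
      a[bedx] ~ dnorm(0.5, 0.25),   # one intercept per bed
      bW ~ dnorm(0, 0.25),
      bS ~ dnorm(0, 0.25),
      bWS ~ dnorm(0, 0.25),
      sigma ~ dexp(1)
    ), data = d2)
  print(precis(m_index, depth = 2))
}
```

With index coding every bed gets its own intercept and the same prior, so no bed is singled out as the baseline. With dummy coding, beds "b" and "c" carry the uncertainty of two parameters (a plus an offset).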

8-5. Use WAIC to compare the model from 8-4 to a model that omits bed. What do you infer from this comparison? Can you reconcile the WAIC results with the posterior distribution of the bed coefficients?

# Same model as m_dummy, without the bed terms
m_omit <- quap(
  alist(
    blooms ~ dnorm(mu, sigma),
    mu <- a + bW*water.c + bS*shade.c + bWS*water.c*shade.c,
    a ~ dnorm(130, 100),
    bW ~ dnorm(0, 100),
    bS ~ dnorm(0, 100),
    bWS ~ dnorm(0, 100),
    sigma ~ dunif(0, 100)
  ),
  data = d,
  start = list(a = mean(d$blooms), bW = 0, bS = 0, bWS = 0, sigma = sd(d$blooms))
)
precis(m_omit)

##            mean        sd      5.5%     94.5%
## a     129.00797  8.670771 115.15041 142.86554
## bW     74.95946 10.601997  58.01542  91.90350
## bS    -41.14054 10.600309 -58.08188 -24.19920
## bWS   -51.87265 12.948117 -72.56625 -31.17906
## sigma  45.22497  6.152982  35.39132  55.05863

compare(m_dummy, m_omit)

##             WAIC       SE    dWAIC      dSE    pWAIC    weight
## m_dummy 293.3087 9.679256 0.000000       NA 9.078013 0.7717443
## m_omit  295.7450 9.968972 2.436373 7.110708 6.415325 0.2282557
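The weight column in the table above can be reproduced by hand from dWAIC, which helps in reading the comparison. The sketch below copies dWAIC and dSE from the printed output. The rough interval at the end relies on a normal approximation for the difference, which is an assumption made here for illustration.

```r
# Values copied from the compare() table above
dWAIC <- c(m_dummy = 0, m_omit = 2.436373)
dSE   <- 7.110708

# Akaike-style weights: exp(-dWAIC / 2), normalized to sum to one
rel_lik <- exp(-0.5 * dWAIC)
weights <- rel_lik / sum(rel_lik)
round(weights, 4)

# Rough 95% interval for the WAIC difference (normal approximation)
diff_interval <- dWAIC[["m_omit"]] + c(-1, 1) * 1.96 * dSE
diff_interval
```

The weights match the table, and the interval for the difference comfortably includes zero, so the two models are hard to tell apart on expected predictive accuracy.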

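# Posterior distribution of expected blooms in each bed, at average water and shade
post <- extract.samples(m_dummy)
post.a <- post$a               # bed a is the baseline
post.b <- post$a + post$bBb    # bed b = baseline + offset
post.c <- post$a + post$bBc    # bed c = baseline + offset
dens(post.a, col = "red", xlim = c(50, 200), ylim = c(0, 0.035))
dens(post.b, col = "blue", add = TRUE)
dens(post.c, col = "black", add = TRUE)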

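# m_dummy, the model that includes bed, has the lower WAIC and receives about 77% of the model weight. However, the difference (dWAIC = 2.4) is much smaller than its standard error (dSE = 7.1), so WAIC does not strongly favor either model.

# The bed coefficients are consistent with this: bBb and bBc are both positive with intervals that exclude zero, so bed “a” had noticeably fewer blooms than the other beds. Including bed explains some variation (sigma drops from about 45 to about 39), but the extra parameters (pWAIC of 9.1 versus 6.4) offset most of the gain in expected predictive accuracy.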


Nischal Bondalapati

5/17/2022

Chapter 8 - Conditional Manatees
