Assignment #7

Questions

8E1. For each of the causal relationships below, name a hypothetical third variable that would lead to an interaction effect:

Bread dough rises because of yeast.
Education leads to higher income.
Gasoline makes a car go.

# 1. Temperature or sugar
# 2. Gender or race or major
# 3. Brand of the car which means different quality of engine or something.

8E2. Which of the following explanations invokes an interaction?

Caramelizing onions requires cooking over low heat and making sure the onions do not dry out.
A car will go faster when it has more cylinders or when it has a better fuel injector.
Most people acquire their political beliefs from their parents, unless they get them instead from their friends.
Intelligent animal species tend to be either highly social or have manipulative appendages (hands, tentacles, etc.).

# 1

8E3. For each of the explanations in 8E2, write a linear model that expresses the stated relationship.

# 1. Caramelization = a + b*heat + c*dryness + d*heat*dryness
# 2. Speed = a + b*cylinders + c*fuel_injector
# 3. political_beliefs = a + b*from_parent + c*from_friends 
# 4. Intelligence = a + b*sociality + c*manipulative_appendages

8M1. Recall the tulips example from the chapter. Suppose another set of treatments adjusted the temperature in the greenhouse over two levels: cold and hot. The data in the chapter were collected at the cold temperature. You find none of the plants grown under the hot temperature developed any blooms at all, regardless of the water and shade levels. Can you explain this result in terms of interactions between water, shade, and temperature?

# yes. Since the interaction on water and shade depends on the level of temperature, this is the a statement for the three-way interaction.

8M2. Can you invent a regression equation that would make the bloom size zero, whenever the temperature is hot?

# I = 1 when temperature is hot and 0 when temperature is cold
# bloom = (original model)(1 - I)

8M4. Repeat the tulips analysis, but this time use priors that constrain the effect of water to be positive and the effect of shade to be negative. Use prior predictive simulation. What do these prior assumptions mean for the interaction prior, if anything? Visualize the prior simulation.

library(dplyr)
library(rethinking)
data(tulips)
df = tulips
df = df %>% 
  mutate(water_c = water - mean(water), 
         shade_c = shade - mean(shade), 
         blooms_std = blooms/max(blooms)
         )

m1 <- quap(
  alist(
    blooms_std ~ dnorm( mu , sigma ) ,
    mu <- a + bw*water_c + bs*shade_c + bws*water_c*shade_c,
    a ~ dnorm( 0.5, 0.25) ,
    bw ~ dnorm( 1 , 0.25 ) ,
    bs ~ dnorm( -1 , 0.25 ) ,
    bws ~ dnorm( 0 , 0.25 ) ,
    sigma ~ dexp( 1 )
  ),
  data = df)

precis(m1)

##             mean         sd        5.5%       94.5%
## a      0.3579972 0.02404867  0.31956275  0.39643160
## bw     0.2205119 0.02953045  0.17331658  0.26770729
## bs    -0.1272575 0.02956790 -0.17451270 -0.08000226
## bws   -0.1431292 0.03587130 -0.20045844 -0.08579991
## sigma  0.1255285 0.01721840  0.09801021  0.15304686

8H1. Return to the data(tulips) example in the chapter. Now include the bed variable as a predictor in the interaction model. Don’t interact bed with the other predictors; just include it as a main effect. Note that bed is categorical. So to use it properly, you will need to either construct dummy variables or rather an index variable, as explained in Chapter 5.

library(dplyr)
library(rethinking)
data(tulips)
df = tulips
df = df %>% 
  mutate(water_c = water - mean(water), 
         shade_c = shade - mean(shade), 
         bed_b = ifelse(bed == "b", 1, 0), 
         bed_c = ifelse(bed == "c", 1, 0))

m1 <- map(
  alist(
    blooms ~ dnorm(mu,sigma),
    mu <- a + bW*water_c + bS*shade_c + bWS*water_c*shade_c + bBB*bed_b + bBC*bed_c ,
    a ~ dnorm(0,100),
    c(bW,bS,bWS,bBB,bBC) ~ dnorm(0,100),
    sigma ~ dunif(0,100)
  ),
  start=list(a=130,bW=0,bS=0,bWS=0,bBB=0,bBC=0,sigma=90),
  data=df)

precis(m1)

##            mean        sd      5.5%     94.5%
## a      97.33917 12.738825  76.98007 117.69827
## bW     75.16581  9.195133  60.47021  89.86141
## bS    -41.25380  9.194014 -55.94761 -26.55999
## bWS   -52.22757 11.237265 -70.18689 -34.26825
## bBB    44.42496 18.015989  15.63193  73.21799
## bBC    49.03426 18.016352  20.24065  77.82787
## sigma  39.17074  5.330386  30.65175  47.68972

8H5. Consider the data(Wines2012) data table. These data are expert ratings of 20 different French and American wines by 9 different French and American judges. Your goal is to model score, the subjective rating assigned by each judge to each wine. I recommend standardizing it. In this problem, consider only variation among judges and wines. Construct index variables of judge and wine and then use these index variables to construct a linear regression model. Justify your priors. You should end up with 9 judge parameters and 20 wine parameters. Plot the parameter estimates. How do you interpret the variation among individual judges and individual wines? Do you notice any patterns, just by plotting the differences? Which judges gave the highest/lowest ratings? Which wines were rated worst/best on average?

library(dplyr)
library(rethinking)
data(Wines2012)
df = Wines2012

df2 = df %>% 
  mutate(s = standardize(score), 
         judge = as.integer(judge), 
         wine = as.integer(wine)) %>% 
  select(s, judge, wine)
  
m2 <- ulam(
    alist(
        s ~ dnorm(mu, sigma),
        mu <- a[judge] + b[wine],
        a[judge] ~ dnorm(0, 0.5),
        b[wine] ~ dnorm(0, 0.5),
        sigma ~ dexp(1)
    ),
    data = df2, 
    chains = 4 ,
    cores = 4
)

## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1

precis(m2, 2)

##               mean         sd        5.5%       94.5%    n_eff     Rhat4
## a[1]  -0.280600714 0.19050389 -0.58365797  0.01437641 2190.695 0.9981236
## a[2]   0.208540438 0.19689691 -0.10330061  0.51208528 2327.454 0.9997113
## a[3]   0.200911365 0.19697702 -0.10844624  0.50956785 2431.067 0.9983620
## a[4]  -0.545389840 0.20309564 -0.86572747 -0.22711477 2384.073 0.9984522
## a[5]   0.791030001 0.19678133  0.46589186  1.11292314 2570.728 0.9985539
## a[6]   0.469484279 0.20014324  0.13491483  0.77773327 2490.065 0.9990251
## a[7]   0.129271907 0.19581095 -0.17882993  0.43985072 1991.046 1.0001667
## a[8]  -0.659839261 0.19396898 -0.96306987 -0.34558684 1935.295 0.9988078
## a[9]  -0.343878335 0.20002000 -0.66583855 -0.02642579 2090.180 0.9988553
## b[1]   0.117215690 0.25591504 -0.28952398  0.52357563 3036.023 0.9987001
## b[2]   0.087212154 0.25761460 -0.33593030  0.49437528 3187.735 0.9980640
## b[3]   0.230274258 0.26603216 -0.20828863  0.64371621 2728.102 0.9989098
## b[4]   0.461681793 0.25550438  0.05153983  0.86384518 2518.212 0.9990931
## b[5]  -0.109983727 0.25889057 -0.54180620  0.31100129 2716.183 0.9987349
## b[6]  -0.309980213 0.26369897 -0.73030114  0.11664519 2391.580 1.0004569
## b[7]   0.252041584 0.25747705 -0.15275898  0.66591833 2645.964 0.9987999
## b[8]   0.230265779 0.25093112 -0.16251925  0.63508341 2215.460 0.9993079
## b[9]   0.069223689 0.25149291 -0.33223548  0.47424692 2605.972 0.9998092
## b[10]  0.108235338 0.25956634 -0.30005543  0.52807133 3047.277 0.9988528
## b[11] -0.003802214 0.25904608 -0.43091926  0.39634458 2589.138 0.9984171
## b[12] -0.025183534 0.26651038 -0.45468444  0.40102842 2913.684 0.9999867
## b[13] -0.088262138 0.27025338 -0.53047788  0.34649205 3262.710 0.9986698
## b[14]  0.011612358 0.26633690 -0.41329444  0.44495306 3387.121 0.9987698
## b[15] -0.179735075 0.24688948 -0.58131438  0.22230428 3175.406 0.9985195
## b[16] -0.165456034 0.25505328 -0.58178629  0.22836526 3089.810 0.9987438
## b[17] -0.114545615 0.24882635 -0.50945145  0.28530275 2906.113 0.9989782
## b[18] -0.711675066 0.25715266 -1.12193425 -0.29045388 2651.458 0.9996908
## b[19] -0.131881679 0.25895529 -0.55341809  0.27506882 3064.918 0.9995525
## b[20]  0.325907242 0.27307327 -0.11599489  0.75818074 2880.483 0.9990134
## sigma  0.846955058 0.04683172  0.77579793  0.92392809 3046.120 0.9991995

traceplot(m2)

## [1] 1000
## [1] 1
## [1] 1000

# From this plot, we know that judge 5 generally gave highest rating in average and judge 8 gave lowest rating in average. And, wine 4 got highest rating and 18 got lowest rating in anverage.

Assignment #7

Yin-Ting Chou

2021-05-11

Chapter 8 - Conditional Manatees

Questions