MORE: heteroscedastic error covariance matrix, https://rpubs.com/sherloconan/1075410
THIS: weakly informative prior for covariance matrices, https://rpubs.com/sherloconan/1081051
In Loftus and Masson (1994), to-be-recalled 20-word lists are presented at a rate of 1, 2, or 5 sec per word. Suppose that this hypothetical experiment is run as a one-way within-subject (repeated-measures) design with three conditions.
dat <- matrix(c(10,6,11,22,16,15,1,12,9,8,
13,8,14,23,18,17,1,15,12,9,
13,8,14,25,20,17,4,17,12,12), ncol=3,
dimnames=list(paste0("s",1:10), paste0("Level",1:3))) #wide data format
# dat <- dat * 100
df <- data.frame("ID"=factor(rep(paste0("s",1:10), 3)),
"Level"=factor(rep(paste0("Level",1:3), each=10)),
"Response"=c(dat)) #long data format
Alternatively, test a simulated data set.
mu.b <- 100; sigma.b <- 20; n <- 24
delta.mu <- 6.7; sigma.e <- 6.67 #rho=.9, power=.8
set.seed(277)
TrueScore <- rnorm(n, mu.b, sigma.b)
eij1 <- rnorm(n, 0, sigma.e)
eij2 <- rnorm(n, 0, sigma.e)
dat <- cbind("Level1"=TrueScore+eij1+delta.mu,
"Level2"=TrueScore+eij2)
# dat <- dat * 100
df <- data.frame("ID"=factor(rep(paste0("s",1:n), 2)),
"Level"=factor(rep(c("Level1", "Level2"), each=n)),
"Response"=c(dat))
## The Bayes factors will be computed for the simulated data (in the second chunk).
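As a quick check of the stated \(\rho=.9\): with a shared true score, the implied correlation between the two conditions is \(\sigma_b^2/(\sigma_b^2+\sigma_\epsilon^2)\).
sigma.b^2 / (sigma.b^2 + sigma.e^2) #400/444.4889 = 0.89991, close to .9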
Method: Constructing the Jeffreys-Zellner-Siow (JZS) Bayes factor.
\[\begin{align*} \tag{1} \mathcal{M}_1:\ Y_{ij}&=\mu+\sigma_\epsilon(d_i+b_j)+\epsilon_{ij}\\ \text{versus}\quad\mathcal{M}_0:\ Y_{ij}&=\mu+\sigma_\epsilon b_j+\epsilon_{ij},\qquad\epsilon_{ij}\overset{\text{i.i.d.}}{\sim}\mathcal{N}(0,\sigma_\epsilon^2) \text{ for } i=1,\dotsb,a;\ j=1,\dotsb,n. \end{align*}\]
\[\begin{equation} \tag{2} \pi(\mu,\sigma_\epsilon^2)\propto 1/\sigma_\epsilon^{2} \end{equation}\]
\[\begin{equation} \tag{3} d_i^\star\mid g\overset{\text{i.i.d.}}{\sim}\mathcal{N}(0,g) \end{equation}\]
\[\begin{equation} \tag{4} g\sim\text{Scale-inv-}\chi^2(1,h^2) \end{equation}\]
\[\begin{equation} \tag{5} (d_1^\star,\dotsb,d_{a-1}^\star)=(d_1,\dotsb,d_a)\cdot\mathbf{Q} \end{equation}\]
\[\begin{equation} \tag{6} \mathbf{I}_a-\frac{1}{a}\mathbf{J}_a=\mathbf{Q}\cdot\mathbf{Q}^\top \end{equation}\]
\[\begin{equation} \tag{7} b_j\mid g_b\overset{\text{i.i.d.}}{\sim}\mathcal{N}(0,g_b) \end{equation}\]
\[\begin{equation} \tag{8} g_b\sim\text{Scale-inv-}\chi^2(1,h_b^2) \end{equation}\]
\(\mathbf{Q}\) is an \(a\times(a-1)\) matrix whose columns are the \(a-1\) unit-length eigenvectors corresponding to the nonzero eigenvalues of the left-hand side of (6). For example, the projected main effect is \(d^\star=(d_1-d_2)/\sqrt{2}\) when \(a=2\). In the other direction (given \(d_1+d_2=0\)), \(d_1=d^\star/\sqrt{2}\) and \(d_2=-d^\star/\sqrt{2}\).
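A minimal numeric illustration of the \(a=2\) case, with hypothetical effects \(d=(0.5,-0.5)\) chosen to sum to zero:
d <- c(.5, -.5) #hypothetical effects satisfying d1+d2=0
Q <- matrix(c(1, -1)/sqrt(2), ncol=1) #unit eigenvector of I-J/2
(d.star <- c(d %*% Q)) #(d1-d2)/sqrt(2) = 0.7071068
c(Q %*% d.star) #maps back to d1=.5, d2=-.5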
set.seed(277)
(JZS_BF <- anovaBF(Response~Level+ID, data=df, whichRandom="ID", progress=F))
## Bayes factor analysis
## --------------
## [1] Level + ID : 11.02461 ±0.67%
##
## Against denominator:
## Response ~ ID
## ---
## Bayes factor type: BFlinearModel, JZS
Note: Fixed effects are assumed. See the discussion here.
Markov chain Monte Carlo (MCMC) diagnostics: trace plots of iterations versus sampled values for each variable in the chain.
set.seed(277)
draws <- posterior(JZS_BF, iterations=1e5, progress=F)
summary(draws)
##
## Iterations = 1:1e+05
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1e+05
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 108.7277 5.172 0.016354 0.016354
## Level-Level1 3.2648 1.197 0.003786 0.004314
## Level-Level2 -3.2648 1.197 0.003786 0.004314
## ID-s1 30.4422 7.557 0.023898 0.025012
## ID-s10 -19.4938 7.550 0.023875 0.023875
## ID-s11 -0.8085 7.493 0.023695 0.023576
## ID-s12 8.1456 7.522 0.023788 0.023788
## ID-s13 45.9977 7.635 0.024143 0.026187
## ID-s14 -28.2025 7.538 0.023838 0.024622
## ID-s15 -12.8174 7.539 0.023840 0.023840
## ID-s16 23.6346 7.549 0.023872 0.024428
## ID-s17 4.3252 7.521 0.023784 0.023784
## ID-s18 -3.2712 7.523 0.023791 0.023791
## ID-s19 14.9474 7.555 0.023890 0.024280
## ID-s2 10.1977 7.531 0.023814 0.023814
## ID-s20 -13.9946 7.550 0.023874 0.023874
## ID-s21 -21.1130 7.571 0.023942 0.024498
## ID-s22 -19.3361 7.549 0.023873 0.024044
## ID-s23 35.0892 7.594 0.024015 0.025727
## ID-s24 7.3267 7.553 0.023886 0.023886
## ID-s3 -18.7996 7.534 0.023825 0.024219
## ID-s4 31.0016 7.584 0.023983 0.024427
## ID-s5 -33.4762 7.580 0.023969 0.025458
## ID-s6 22.1565 7.528 0.023805 0.024781
## ID-s7 -29.3490 7.573 0.023948 0.024707
## ID-s8 -0.2011 7.526 0.023800 0.023937
## ID-s9 -32.1534 7.573 0.023946 0.024887
## sig2 70.0214 23.074 0.072968 0.185488
## g_Level 5.4399 478.834 1.514206 1.514206
## g_ID 9.5521 4.501 0.014235 0.029968
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu 98.51260 105.3381 108.7324 112.119 118.9246
## Level-Level1 0.91212 2.4740 3.2605 4.052 5.6351
## Level-Level2 -5.63512 -4.0518 -3.2605 -2.474 -0.9121
## ID-s1 15.59911 25.3911 30.4842 35.481 45.2401
## ID-s10 -34.41346 -24.4874 -19.4985 -14.513 -4.5370
## ID-s11 -15.50512 -5.8010 -0.8069 4.185 13.8785
## ID-s12 -6.60409 3.1258 8.1533 13.152 22.9883
## ID-s13 30.90195 40.9262 46.0031 51.035 60.9876
## ID-s14 -43.06944 -33.2252 -28.2251 -23.183 -13.3536
## ID-s15 -27.65613 -17.8460 -12.8154 -7.811 2.0251
## ID-s16 8.76354 18.6665 23.6514 28.628 38.5223
## ID-s17 -10.47225 -0.6513 4.3171 9.299 19.1054
## ID-s18 -17.93953 -8.2818 -3.2555 1.701 11.5917
## ID-s19 0.01507 9.9408 14.9718 19.959 29.7518
## ID-s2 -4.62003 5.1722 10.2083 15.219 25.0096
## ID-s20 -28.87834 -19.0018 -13.9904 -8.952 0.8958
## ID-s21 -36.06758 -26.1364 -21.1057 -16.090 -6.1758
## ID-s22 -34.18702 -24.4105 -19.3543 -14.309 -4.4714
## ID-s23 20.04937 30.0764 35.0963 40.140 49.9902
## ID-s24 -7.58871 2.3097 7.3593 12.320 22.2375
## ID-s3 -33.67105 -23.8061 -18.7900 -13.807 -3.9780
## ID-s4 15.99679 25.9912 31.0130 36.031 45.9018
## ID-s5 -48.36228 -38.5250 -33.5321 -28.437 -18.4998
## ID-s6 7.28988 17.1852 22.1697 27.173 36.9708
## ID-s7 -44.14209 -34.4148 -29.3593 -24.302 -14.3894
## ID-s8 -14.97812 -5.1907 -0.2325 4.780 14.6205
## ID-s9 -47.01976 -37.1770 -32.1381 -27.148 -17.1065
## sig2 38.37888 53.9211 65.7141 81.237 127.3316
## g_Level 0.06451 0.2002 0.4255 1.057 12.2281
## g_ID 3.45870 6.4077 8.6860 11.696 20.7292
plot(draws, cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
\[\begin{align*} \tag{a} \mathcal{M}_1:\ Y_{ij}&=\mu+t_i+\epsilon_{ij}\\ \text{versus}\quad\mathcal{M}_0:\ Y_{ij}&=\mu+\epsilon_{ij}\\ (\epsilon_{1j},\dotsb,\epsilon_{aj})^\top&\overset{\text{i.i.d.}}{\sim}\mathbf{\mathcal{N}}_a(\mathbf{0},\mathbf{\Sigma})\qquad\text{for } i=1,\dotsb,a;\ j=1,\dotsb,n. \end{align*}\]
\[\begin{equation} \tag{b} \pi(\mu)\propto 1 \end{equation}\]
\[\begin{equation} \tag{c} \mathbf{\Sigma}\sim\mathcal{W}\left(\frac{\mathbf{I}_{a}}{2\theta},\ a+2\right),\quad\theta=10^{-4} \end{equation}\]
\[\begin{equation} \tag{3} t_i^\star\mid g\overset{\text{i.i.d.}}{\sim}\mathcal{N}(0,g) \end{equation}\]
\[\begin{equation} \tag{4} g\sim\text{Scale-inv-}\chi^2(1,h^2) \end{equation}\]
\[\begin{equation} \tag{5} (t_1^\star,\dotsb,t_{a-1}^\star)=(t_1,\dotsb,t_a)\cdot\mathbf{Q} \end{equation}\]
\[\begin{equation} \tag{6} \mathbf{I}_a-\frac{1}{a}\mathbf{J}_a=\mathbf{Q}\cdot\mathbf{Q}^\top \end{equation}\]
\(\mathbf{Q}\) is an \(a\times(a-1)\) matrix whose columns are the \(a-1\) unit-length eigenvectors corresponding to the nonzero eigenvalues of the left-hand side of (6). For example, the projected main effect is \(t^\star=(t_1-t_2)/\sqrt{2}\) when \(a=2\). In the other direction (given \(t_1+t_2=0\)), \(t_1=t^\star/\sqrt{2}\) and \(t_2=-t^\star/\sqrt{2}\).
computeQ <- function(C) {
#' reduced parametrization for the fixed treatment effects
S <- diag(C) - matrix(1, nrow = C, ncol = C) / C
e <- qr(S)
Q <- qr.Q(e) %*% diag(sign(diag(qr.R(e))))
Q[, which(abs(e$qraux) > 1e-3), drop = FALSE]
}
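A quick sanity check of computeQ() against (6), here assuming \(a=3\) conditions:
Q <- computeQ(3)
round(crossprod(Q), 12) #orthonormal columns: Q'Q = I
round(Q %*% t(Q) - (diag(3) - 1/3), 12) #recovers I - J/3, the left-hand side of (6)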
Specify the full model \(\mathcal{M}_1\) in (a). stancodeM1 declares the reduced fixed treatment effects tf. Note that the effects are not standardized (in contrast to \(d_i=t_i/\sigma_\epsilon\) in (1)); instead, the data are scaled. Compile and fit the model in Stan.
Note that to compute the (log) marginal likelihood
for a Stan model, we need to specify the model in a certain way. Instead
of using “~” signs for specifying distributions, we need to directly use
the (log) density functions. The reason for this is that when using the
“~” sign, constant terms are dropped which are not needed for sampling
from the posterior. However, for computing the marginal likelihood,
these constants need to be retained. For instance, instead of writing
y ~ normal(mu, sigma); we would need to write
target += normal_lpdf(y | mu, sigma); (Gronau,
2021). See also the discussion regarding the implicit
uniform prior for computing the log marginal likelihood.
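The size of the dropped constant is easy to verify in R (a toy check with arbitrary values, using dnorm(..., log=TRUE) as the analogue of normal_lpdf):
y <- 1; mu <- 0; sigma <- 2 #arbitrary toy values
dnorm(y, mu, sigma, log=TRUE) - (-0.5*((y - mu)/sigma)^2) #full log density minus unnormalized kernel
-log(sigma) - 0.5*log(2*pi) #the additive constant a sampling statement drops; both lines give -1.612086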
stancodeM1 <- "
data {
int<lower=1> N; // number of subjects
int<lower=2> C; // number of conditions
// vector[C] Y[N]; // responses [removed features]
array[N] vector[C] Y; // responses [Stan 2.26+ syntax for array declarations]
matrix[C,C-1] P; // projecting C-1 fixed effects into C
real<lower=0> ht; // scale parameter of the variance of the reduced treatment effect
matrix[C,C] V; // scale matrix of the Wishart prior
real lwr; // lower bound for the grand mean
real upr; // upper bound for the grand mean
}
parameters {
real<lower=lwr, upper=upr> mu; // grand mean
cov_matrix[C] Sigma; // error covariance matrix
real<lower=0> gt; // variance of the reduced treatment effect
vector[C-1] tf; // reduced treatment effect
}
transformed parameters {
vector[C] t; // treatment effect
t = P * tf;
}
model {
target += uniform_lpdf(mu | lwr, upr);
target += wishart_lpdf(Sigma | C+2, V); // Chung et al. (2015)
target += scaled_inv_chi_square_lpdf(gt | 1, ht);
target += normal_lpdf(tf | 0, sqrt(gt));
for (k in 1:N) {
target += multi_normal_lpdf(Y[k] | mu + t, Sigma);
}
}"
stanmodelM1 <- stan_model(model_code=stancodeM1)
datalistM1 <- list("N"=nrow(dat), "C"=ncol(dat),
"Y"=dat/sd(dat), #scale the data
"P"=computeQ(ncol(dat)), "ht"=.5,
"V"=diag(ncol(dat))*5E3)
datalistM1$lwr <- mean(datalistM1[["Y"]])-6*sd(datalistM1[["Y"]]) #mean(Y) - 6, since the scaled data have overall sd 1
datalistM1$upr <- mean(datalistM1[["Y"]])+6*sd(datalistM1[["Y"]])
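Note that V matches the Wishart scale matrix in (c): with \(\theta=10^{-4}\), \(\mathbf{I}_a/(2\theta)=5000\,\mathbf{I}_a\). A one-line check:
theta <- 1E-4
all.equal(diag(ncol(dat))/(2*theta), diag(ncol(dat))*5E3) #TRUE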
stanfitM1 <- sampling(stanmodelM1, data=datalistM1,
iter=50000, warmup=10000, chains=4, seed=277, refresh=0)
rstan::summary(stanfitM1)[[1]] %>% kable(digits=4) %>% kable_classic(full_width=F)
| mean | se_mean | sd | 2.5% | 25% | 50% | 75% | 97.5% | n_eff | Rhat | |
|---|---|---|---|---|---|---|---|---|---|---|
| mu | 4.2847 | 0.0007 | 0.2395 | 3.8098 | 4.1292 | 4.2842 | 4.4402 | 4.7564 | 129317.57 | 1 |
| Sigma[1,1] | 1.4030 | 0.0020 | 0.5306 | 0.7124 | 1.0389 | 1.2956 | 1.6402 | 2.7177 | 70584.72 | 1 |
| Sigma[1,2] | 1.3021 | 0.0020 | 0.5175 | 0.6281 | 0.9473 | 1.1964 | 1.5355 | 2.5851 | 67199.80 | 1 |
| Sigma[2,1] | 1.3021 | 0.0020 | 0.5175 | 0.6281 | 0.9473 | 1.1964 | 1.5355 | 2.5851 | 67199.80 | 1 |
| Sigma[2,2] | 1.4703 | 0.0021 | 0.5577 | 0.7463 | 1.0881 | 1.3564 | 1.7211 | 2.8531 | 69917.23 | 1 |
| gt | 1.7221 | 0.2270 | 56.3103 | 0.0393 | 0.1061 | 0.2134 | 0.5135 | 5.8548 | 61547.46 | 1 |
| tf[1] | 0.1992 | 0.0002 | 0.0740 | 0.0518 | 0.1510 | 0.1995 | 0.2478 | 0.3441 | 131586.40 | 1 |
| t[1] | 0.1409 | 0.0001 | 0.0523 | 0.0366 | 0.1067 | 0.1410 | 0.1753 | 0.2433 | 131586.40 | 1 |
| t[2] | -0.1409 | 0.0001 | 0.0523 | -0.2433 | -0.1753 | -0.1410 | -0.1067 | -0.0366 | 131586.40 | 1 |
| lp__ | -91.0545 | 0.0080 | 1.9080 | -95.6979 | -92.0644 | -90.6942 | -89.6523 | -88.4253 | 56227.93 | 1 |
mean(dat/sd(dat))
## [1] 4.283763
cov(dat/sd(dat))
## Level1 Level2
## Level1 0.9757775 0.9058127
## Level2 0.9058127 1.0230817
Chung, Y., Gelman, A., Rabe-Hesketh, S., Liu, J., & Dorie, V. (2015). Weakly informative prior for point estimation of covariance matrices in hierarchical models. Journal of Educational and Behavioral Statistics, 40, 136-157. https://doi.org/10.3102/1076998615570945
Specify the null model \(\mathcal{M}_0\) in (a). Compile and fit the model in Stan.
stancodeM0 <- "
data {
int<lower=1> N; // number of subjects
int<lower=2> C; // number of conditions
// vector[C] Y[N]; // responses [removed features]
array[N] vector[C] Y; // responses [Stan 2.26+ syntax for array declarations]
matrix[C,C] V; // scale matrix of the Wishart prior
real lwr; // lower bound for the grand mean
real upr; // upper bound for the grand mean
}
parameters {
real<lower=lwr, upper=upr> mu; // grand mean
cov_matrix[C] Sigma; // error covariance matrix
}
transformed parameters {
vector[C] muS; // grand mean vector
for (i in 1:C) {
muS[i] = mu;
}
}
model {
target += uniform_lpdf(mu | lwr, upr);
target += wishart_lpdf(Sigma | C+2, V); // Chung et al. (2015)
for (k in 1:N) {
target += multi_normal_lpdf(Y[k] | muS, Sigma);
}
}"
stanmodelM0 <- stan_model(model_code=stancodeM0)
stanfitM0 <- sampling(stanmodelM0, data=datalistM1[-c(4,5)],
iter=50000, warmup=10000, chains=4, seed=277, refresh=0)
rstan::summary(stanfitM0)[[1]] %>% kable(digits=4) %>% kable_classic(full_width=F)
| mean | se_mean | sd | 2.5% | 25% | 50% | 75% | 97.5% | n_eff | Rhat | |
|---|---|---|---|---|---|---|---|---|---|---|
| mu | 4.3193 | 0.0010 | 0.2810 | 3.7633 | 4.1363 | 4.3197 | 4.5019 | 4.8739 | 83212.88 | 1 |
| Sigma[1,1] | 1.4514 | 0.0024 | 0.5728 | 0.7237 | 1.0620 | 1.3316 | 1.7005 | 2.8900 | 58230.87 | 1 |
| Sigma[1,2] | 1.3107 | 0.0024 | 0.5467 | 0.6070 | 0.9390 | 1.1985 | 1.5522 | 2.6772 | 53452.35 | 1 |
| Sigma[2,1] | 1.3107 | 0.0024 | 0.5467 | 0.6070 | 0.9390 | 1.1985 | 1.5522 | 2.6772 | 53452.35 | 1 |
| Sigma[2,2] | 1.5444 | 0.0026 | 0.6138 | 0.7630 | 1.1282 | 1.4153 | 1.8111 | 3.0858 | 55762.98 | 1 |
| muS[1] | 4.3193 | 0.0010 | 0.2810 | 3.7633 | 4.1363 | 4.3197 | 4.5019 | 4.8739 | 83212.88 | 1 |
| muS[2] | 4.3193 | 0.0010 | 0.2810 | 3.7633 | 4.1363 | 4.3197 | 4.5019 | 4.8739 | 83212.88 | 1 |
| lp__ | -92.3540 | 0.0066 | 1.5162 | -96.1960 | -93.0836 | -92.0091 | -91.2473 | -90.4749 | 52910.61 | 1 |
mean(dat/sd(dat))
## [1] 4.283763
cov(dat/sd(dat))
## Level1 Level2
## Level1 0.9757775 0.9058127
## Level2 0.9058127 1.0230817
The newly proposed models in (a): (i) include no explicit subject random-effects term, (ii) allow a more general error covariance structure, (iii) do not standardize the treatment effects, and (iv) scale the data.
Compute log marginal likelihoods for full and null models via bridge sampling. Then, \(\textit{BF}_{10}=\frac{p(\text{Data}\ \mid\ \mathcal{M}_1)}{p(\text{Data}\ \mid\ \mathcal{M}_0)}\).
M1 <- bridge_sampler(stanfitM1, silent=T)
summary(M1)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -93.32558
##
## Error Measures:
##
## Relative Mean-Squared Error: 1.887701e-06
## Coefficient of Variation: 0.001373936
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
M0 <- bridge_sampler(stanfitM0, silent=T)
summary(M0)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -94.69036
##
## Error Measures:
##
## Relative Mean-Squared Error: 9.999568e-07
## Coefficient of Variation: 0.0009999784
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
bf(M1, M0)
## Estimated Bayes factor in favor of M1 over M0: 3.91487
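Equivalently, recompute the Bayes factor by hand from the two log marginal likelihood estimates stored in the bridge sampling objects:
exp(M1$logml - M0$logml) #matches bf(M1, M0) above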
Should we expect the Bayes factor estimates to vary appreciably across these different model specifications?
Check the Monte Carlo error of estimating the Bayes factor.
num <- 50; BF <- PerErr.M1 <- PerErr.M0 <- numeric(num)
system.time(
for (i in 1:num) { #roughly 74 min of run time
result <- BFBS(seed=i) #defined in All-in-One R Script
BF[i] <- result$bf
PerErr.M1[i] <- result$pererr.M1
PerErr.M0[i] <- result$pererr.M0
})
range(PerErr.M1); range(PerErr.M0)
hist(BF,prob=T,col="white",yaxt="n",ylab="",
xlab="Bayes factor estimates",main="")
lines(density(BF),col="red",lwd=2)
## [1] 0.001344015 0.001392706
## [1] 0.0009759583 0.0010113908
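Beyond eyeballing the histogram, the spread of the 50 estimates can be summarized directly:
c("mean"=mean(BF), "sd"=sd(BF), "cv"=sd(BF)/mean(BF)) #Monte Carlo variability of the BF estimates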
library(rstan)
# devtools::install_github("quentingronau/bridgesampling@master")
library(bridgesampling)
library(BayesFactor)
run.this <- FALSE #if TRUE, test the hypothetical data; if FALSE, test the simulated data
if (run.this) {
dat <- matrix(c(10,6,11,22,16,15,1,12,9,8,
13,8,14,23,18,17,1,15,12,9,
13,8,14,25,20,17,4,17,12,12), ncol=3,
dimnames=list(paste0("s",1:10), paste0("Level",1:3))) #wide data format
df <- data.frame("ID"=factor(rep(paste0("s",1:10), 3)),
"Level"=factor(rep(paste0("Level",1:3), each=10)),
"Response"=c(dat)) #long data format
} else {
mu.b <- 100; sigma.b <- 20; n <- 24
delta.mu <- 6.7; sigma.e <- 6.67 #rho=.9, power=.8
set.seed(277)
TrueScore <- rnorm(n, mu.b, sigma.b)
eij1 <- rnorm(n, 0, sigma.e)
eij2 <- rnorm(n, 0, sigma.e)
dat <- cbind("Level1" = TrueScore + eij1 + delta.mu,
"Level2" = TrueScore + eij2)
df <- data.frame("ID" = factor(rep(paste0("s",1:n), 2)),
"Level" = factor(rep(c("Level1", "Level2"), each = n)),
"Response" = as.vector(dat))
}
set.seed(277)
(JZS_BF <- anovaBF(Response~Level+ID, data=df, whichRandom="ID", progress=F))
# set.seed(277)
# draws <- posterior(JZS_BF, iterations=1e5, progress=F)
# summary(draws)
# plot(draws, cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
computeQ <- function(C) {
#' reduced parametrization for the fixed treatment effects
S <- diag(C) - matrix(1, nrow = C, ncol = C) / C
e <- qr(S)
Q <- qr.Q(e) %*% diag(sign(diag(qr.R(e))))
Q[, which(abs(e$qraux) > 1e-3), drop = FALSE]
}
{
stancodeM1 <- "
data {
int<lower=1> N; // number of subjects
int<lower=2> C; // number of conditions
// vector[C] Y[N]; // responses [removed features]
array[N] vector[C] Y; // responses [Stan 2.26+ syntax for array declarations]
matrix[C,C-1] P; // projecting C-1 fixed effects into C
real<lower=0> ht; // scale parameter of the variance of the reduced treatment effect
matrix[C,C] V; // scale matrix of the Wishart prior
real lwr; // lower bound for the grand mean
real upr; // upper bound for the grand mean
}
parameters {
real<lower=lwr, upper=upr> mu; // grand mean
cov_matrix[C] Sigma; // error covariance matrix
real<lower=0> gt; // variance of the reduced treatment effect
vector[C-1] tf; // reduced treatment effect
}
transformed parameters {
vector[C] t; // treatment effect
t = P * tf;
}
model {
target += uniform_lpdf(mu | lwr, upr);
target += wishart_lpdf(Sigma | C+2, V); // Chung et al. (2015)
target += scaled_inv_chi_square_lpdf(gt | 1, ht);
target += normal_lpdf(tf | 0, sqrt(gt));
for (k in 1:N) {
target += multi_normal_lpdf(Y[k] | mu + t, Sigma);
}
}"
stanmodelM1 <- stan_model(model_code=stancodeM1)
stancodeM0 <- "
data {
int<lower=1> N; // number of subjects
int<lower=2> C; // number of conditions
// vector[C] Y[N]; // responses [removed features]
array[N] vector[C] Y; // responses [Stan 2.26+ syntax for array declarations]
matrix[C,C] V; // scale matrix of the Wishart prior
real lwr; // lower bound for the grand mean
real upr; // upper bound for the grand mean
}
parameters {
real<lower=lwr, upper=upr> mu; // grand mean
cov_matrix[C] Sigma; // error covariance matrix
}
transformed parameters {
vector[C] muS; // grand mean vector
for (i in 1:C) {
muS[i] = mu;
}
}
model {
target += uniform_lpdf(mu | lwr, upr);
target += wishart_lpdf(Sigma | C+2, V); // Chung et al. (2015)
for (k in 1:N) {
target += multi_normal_lpdf(Y[k] | muS, Sigma);
}
}"
stanmodelM0 <- stan_model(model_code=stancodeM0)
} #Stan models
datalistM1 <- list("N"=nrow(dat), "C"=ncol(dat),
"Y"=dat/sd(dat), #scale the data
"P"=computeQ(ncol(dat)), "ht"=.5,
"V"=diag(ncol(dat))*5E3)
datalistM1$lwr <- mean(datalistM1[["Y"]])-6*sd(datalistM1[["Y"]])
datalistM1$upr <- mean(datalistM1[["Y"]])+6*sd(datalistM1[["Y"]])
BFBS <- function(seed=277, diagnostics=F) {
stanfitM0 <- sampling(stanmodelM0, data=datalistM1[-c(4,5)],
iter=50000, warmup=10000, chains=4, seed=seed, refresh=0)
stanfitM1 <- sampling(stanmodelM1, data=datalistM1,
iter=50000, warmup=10000, chains=4, seed=seed, refresh=0)
set.seed(seed)
M0 <- bridge_sampler(stanfitM0, silent=T)
set.seed(seed)
M1 <- bridge_sampler(stanfitM1, silent=T)
if (diagnostics) {
list("bf"=exp(M1$logml - M0$logml),
"pererr.M1"=error_measures(M1)$cv,
"pererr.M0"=error_measures(M0)$cv,
"stan.M1"=stanfitM1, "stan.M0"=stanfitM0)
} else {
list("bf"=exp(M1$logml - M0$logml),
"pererr.M1"=error_measures(M1)$cv,
"pererr.M0"=error_measures(M0)$cv)
}
}
BFBS()
# MCMC <- BFBS(diag=T)
# rstan::summary(MCMC$stan.M1)[[1]]
# rstan::summary(MCMC$stan.M0)[[1]]
All responses are now multiplied by 100, to check whether the results are sensitive to the measurement scale.
if (run.this) { #hypothetical
dat <- dat * 100
df <- data.frame("ID"=factor(rep(paste0("s",1:10), 3)),
"Level"=factor(rep(paste0("Level",1:3), each=10)),
"Response"=c(dat))
} else { #simulated
dat <- dat * 100
df <- data.frame("ID"=factor(rep(paste0("s",1:n), 2)),
"Level"=factor(rep(c("Level1", "Level2"), each=n)),
"Response"=c(dat))
}
## The Bayes factors will be computed for the simulated data (in the else-statement).
set.seed(277)
(JZS_BF <- anovaBF(Response~Level+ID, data=df, whichRandom="ID", progress=F))
## Bayes factor analysis
## --------------
## [1] Level + ID : 11.02461 ±0.67%
##
## Against denominator:
## Response ~ ID
## ---
## Bayes factor type: BFlinearModel, JZS
Compute log marginal likelihoods for full and null models via bridge sampling.
datalistM1$Y <- dat/sd(dat)
datalistM1$lwr <- mean(datalistM1[["Y"]])-6*sd(datalistM1[["Y"]])
datalistM1$upr <- mean(datalistM1[["Y"]])+6*sd(datalistM1[["Y"]])
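Since the data are standardized by sd(dat), the rescaling by 100 cancels out of Y up to floating point, so the fit should change only through Monte Carlo error:
all.equal(dat/sd(dat), (dat/100)/sd(dat/100)) #TRUE: the standardization cancels the scale change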
stanfitM1 <- sampling(stanmodelM1, data=datalistM1,
iter=50000, warmup=10000, chains=4, seed=277, refresh=0)
rstan::summary(stanfitM1)[[1]] %>% kable(digits=4) %>% kable_classic(full_width=F)
| mean | se_mean | sd | 2.5% | 25% | 50% | 75% | 97.5% | n_eff | Rhat | |
|---|---|---|---|---|---|---|---|---|---|---|
| mu | 4.2848 | 0.0007 | 0.2378 | 3.8134 | 4.1301 | 4.2850 | 4.4393 | 4.7547 | 125730.55 | 1 |
| Sigma[1,1] | 1.4023 | 0.0020 | 0.5302 | 0.7119 | 1.0385 | 1.2935 | 1.6398 | 2.7295 | 68770.45 | 1 |
| Sigma[1,2] | 1.3012 | 0.0020 | 0.5149 | 0.6262 | 0.9469 | 1.1954 | 1.5326 | 2.5950 | 65687.00 | 1 |
| Sigma[2,1] | 1.3012 | 0.0020 | 0.5149 | 0.6262 | 0.9469 | 1.1954 | 1.5326 | 2.5950 | 65687.00 | 1 |
| Sigma[2,2] | 1.4692 | 0.0021 | 0.5523 | 0.7451 | 1.0897 | 1.3559 | 1.7190 | 2.8543 | 68715.60 | 1 |
| gt | 1.2876 | 0.1180 | 33.3903 | 0.0390 | 0.1056 | 0.2124 | 0.5153 | 5.9464 | 80005.44 | 1 |
| tf[1] | 0.1992 | 0.0002 | 0.0736 | 0.0531 | 0.1513 | 0.1992 | 0.2474 | 0.3441 | 125501.26 | 1 |
| t[1] | 0.1409 | 0.0001 | 0.0520 | 0.0375 | 0.1070 | 0.1409 | 0.1749 | 0.2433 | 125501.26 | 1 |
| t[2] | -0.1409 | 0.0001 | 0.0520 | -0.2433 | -0.1749 | -0.1409 | -0.1070 | -0.0375 | 125501.26 | 1 |
| lp__ | -91.0471 | 0.0080 | 1.9025 | -95.6780 | -92.0667 | -90.6896 | -89.6482 | -88.4230 | 55903.08 | 1 |
M1 <- bridge_sampler(stanfitM1, silent=T)
summary(M1)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -93.32272
##
## Error Measures:
##
## Relative Mean-Squared Error: 1.845126e-06
## Coefficient of Variation: 0.001358354
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
stanfitM0 <- sampling(stanmodelM0, data=datalistM1[-c(4,5)],
iter=50000, warmup=10000, chains=4, seed=277, refresh=0)
rstan::summary(stanfitM0)[[1]] %>% kable(digits=4) %>% kable_classic(full_width=F)
| mean | se_mean | sd | 2.5% | 25% | 50% | 75% | 97.5% | n_eff | Rhat | |
|---|---|---|---|---|---|---|---|---|---|---|
| mu | 4.3207 | 0.0010 | 0.2807 | 3.7644 | 4.1389 | 4.3218 | 4.5043 | 4.8731 | 85370.05 | 1.0000 |
| Sigma[1,1] | 1.4496 | 0.0024 | 0.5711 | 0.7208 | 1.0626 | 1.3308 | 1.6980 | 2.8686 | 55709.08 | 1.0001 |
| Sigma[1,2] | 1.3095 | 0.0024 | 0.5441 | 0.6063 | 0.9401 | 1.1982 | 1.5518 | 2.6650 | 51429.12 | 1.0001 |
| Sigma[2,1] | 1.3095 | 0.0024 | 0.5441 | 0.6063 | 0.9401 | 1.1982 | 1.5518 | 2.6650 | 51429.12 | 1.0001 |
| Sigma[2,2] | 1.5434 | 0.0026 | 0.6101 | 0.7632 | 1.1286 | 1.4172 | 1.8093 | 3.0645 | 54262.92 | 1.0001 |
| muS[1] | 4.3207 | 0.0010 | 0.2807 | 3.7644 | 4.1389 | 4.3218 | 4.5043 | 4.8731 | 85370.05 | 1.0000 |
| muS[2] | 4.3207 | 0.0010 | 0.2807 | 3.7644 | 4.1389 | 4.3218 | 4.5043 | 4.8731 | 85370.05 | 1.0000 |
| lp__ | -92.3445 | 0.0064 | 1.4972 | -96.0943 | -93.0803 | -92.0088 | -91.2452 | -90.4770 | 55573.66 | 1.0000 |
M0 <- bridge_sampler(stanfitM0, silent=T)
summary(M0)
##
## Bridge sampling log marginal likelihood estimate
## (method = "normal", repetitions = 1):
##
## -94.68977
##
## Error Measures:
##
## Relative Mean-Squared Error: 9.805829e-07
## Coefficient of Variation: 0.0009902438
## Percentage Error: 0%
##
## Note:
## All error measures are approximate.
bf(M1, M0) #Bayes factor
## Estimated Bayes factor in favor of M1 over M0: 3.92376
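Note that the JZS Bayes factor is unchanged after the rescaling (11.02461 in both runs). Because the responses are divided by sd(dat) before fitting, the Stan models receive essentially the same input as before, and the bridge sampling estimate moves only from 3.91487 to 3.92376, a discrepancy on the order of the Monte Carlo error examined above.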