============================================================================================================

This up-to-date document is available at https://rpubs.com/sherloconan/1206246

 

1. Methods

BayesFactor” R Package

 

\[\begin{align} \tag{1} \mathcal{M}_1:\ y_i&=\beta_0+\beta_1 x_i+\epsilon_i\\ \text{versus}\quad\mathcal{M}_0:\ y_i&=\beta_0+\epsilon_i,\\ \qquad\qquad&\epsilon_{i}\overset{\text{i.i.d.}}{\sim}\mathcal{N}(0,\ \sigma_\epsilon^2)\quad\text{ for } i=1,\dotsb,n \end{align}\]

\[\begin{equation} \tag{2} \pi(\beta_0,\sigma_\epsilon^2)\propto 1/\ \sigma_\epsilon^{2} \end{equation}\]

\[\begin{equation} \tag{3} \beta_1\sim\mathcal{N}\left(0,\ ng\cdot\sigma_\epsilon^2\cdot(\boldsymbol{x}^\top\boldsymbol{x})^{-1}\right) \end{equation}\]

\[\begin{equation} \tag{4} g\sim\text{Inv-Gamma}(1/2,\ \color{lightgray}{\text{scale}=}\varphi/2) \end{equation}\]

By default, the prior scale on the standardized slope is \(\varphi=\sqrt{2}/4\).

The scales are medium, wide, and ultrawide for \(\varphi\in\{\sqrt{2}/4,\ 1/2,\ \sqrt{2}/2\}\) in linear regression, \(h\in\{1/2,\ \sqrt{2}/2,\ 1\}\) in analysis of variance, and \(\gamma\in\{\sqrt{2}/2,\ 1,\ \sqrt{2}\}\) in \(t\)-tests, respectively.

See ?BayesFactor::regressionBF.

 

Equivalently, \(g^{-1}\sim\text{Gamma}(1/2,\ \color{lightgray}{\text{rate}=}\varphi/2)\).   Or, \(g\sim\text{Scale-inv-}\chi^2(1,\ \color{lightgray}{h^2=}\varphi)\).

 

scroll to top

   

BIC Approximation

 

Approximating Bayes Factors from Bayesian Information Criterion (BIC)

   

The BIC for model \(i\) with the number of free parameters \(k_i\) and the sample size \(n\) is \(\text{BIC}(\mathcal{M}_i)=-2\ln{\mathcal{L}_i}+k_i\ln{n}\quad\) for \(n\gg k_i\),

where \(\mathcal{L}_i=p(\boldsymbol{y}\mid\hat{\boldsymbol{\theta}}_i,\mathcal{M}_i)\) is the maximum likelihood.

 

Use the second-order Taylor approximation of log-marginal likelihood of each model.

\[\begin{equation} \tag{8} \mathit{BF}_{10}\approx\exp\left\{\frac{\text{BIC}(\mathcal{M}_0)-\text{BIC}(\mathcal{M}_1)}{2}\right\} \end{equation}\]

Derivation in Wagenmakers (2007, Appendix B).

 

?BIC

 

scroll to top

   

Approximate Bayes Factor

 

The test statistic in a Wald test is \(W=\left(\frac{\hat{\theta}-\theta_0}{\text{se}(\hat{\theta})}\right)^2\sim\chi_1^2\) under \(\mathcal{H}_0\), where \(\hat{\theta}\) is the maximum likelihood estimate, and \(\text{se}(\hat{\theta})\) is the standard error.

Wagenmakers (2022; its Eq. 6) reviewed the Jeffreys-style approximate Bayes factor (JAB) and proposed a piecewise approximation (WAB) from \(p\)-value and effective sample size \(N\) in Eq. 10.

The complication with JAB is that the approximation is valid only when \(\text{se}(\hat{\theta})\ll\sigma_g\), where \(\sigma_g\) is the scale of the prior distribution \(g(\theta)\). Note that with very small sample sizes, this assumption is not true and Eq. 9 is biased against \(\mathcal{H}_0\).

\[\begin{equation} \tag{9} \textit{JAB}_{01}=A\cdot\sqrt{N}\exp\left\{-\frac{1}{2}W\right\} \end{equation}\]

A normal unit-information prior for \(\theta\) yields \(A=1\). Based on the Jeffreys prior, the general approximation specifies \(A=\sqrt{\pi/2}\approx1.253\).

 

Wagenmakers’s piecewise approximation is

\[ \textit{WAB}_{01}\approx \begin{cases} \tag{10} \text{$\ p^{\frac{1}{4}}\sqrt{N}$,} & \text{$p>0.5$} \\ \\ \text{$\ \sqrt{pN}$,} & \text{$0.1<p\leqslant0.5$ (simpler)} \\ \\ \text{$\ \frac{4}{3}p^{\frac{2}{3}}\sqrt{N}$,} & \text{$0.1<p\leqslant0.5$ (more precise)} \\ \\ \text{$\ 3p\sqrt{N}$,} & \text{$p\leqslant0.1$} \end{cases} \]

 

qchisq(p, 1, lower.tail=F) # inverse CDF of chisq(1): 1-p → W = t² → JAB
qchisq(1-p, 1) # same

     

Note: The two-sided \(p\)-value of the \(t\)-statistic differs from the \(p\)-value of the Wald statistic, particularly when the sample size is insufficient.

# suppose t(𝜈 = 10 + 10 - 2) = 0.77, compute the p-values
c(pt(0.77, 18, lower.tail=F)*2, pchisq(0.77^2, 1, lower.tail=F))
## [1] 0.4512878 0.4412999
# suppose a two-sided p = 0.45, compute the test statistics t² and W
c(qt(0.45/2, 18, lower.tail=F)^2, qchisq(0.45, 1, lower.tail=F)) # try increasing 𝜈
## [1] 0.5963371 0.5706519

 

scroll to top

   

Savage–Dickey Density Ratio

 

The Savage–Dickey density ratio is a special form of the Bayes factor for nested models by dividing the value of
the posterior density over the parameters for the alternative model evaluated at the hypothesized value by
the prior for the same model evaluated at the same point.

\[\begin{equation} \tag{11} \textit{BF}_{01}:=\frac{p(\boldsymbol{y}\mid \mathcal{M}_0)}{p(\boldsymbol{y}\mid \mathcal{M}_1)}=\frac{\color{#0055A4}{p(\boldsymbol{\theta}=\boldsymbol{\theta}_0\mid\boldsymbol{y},\mathcal{M}_1)}}{\color{#EF4135}{p(\boldsymbol{\theta}=\boldsymbol{\theta}_0\mid \mathcal{M}_1)}} \end{equation}\]

 

stancode1M1 <- "
data {
  int<lower=1> N;           // total number of observations
  vector[N] x;              // predictor variable
  vector[N] y;              // response variable
  vector[2] mu;             // mean vector for theta
  matrix[2,2] Sigma;        // covariance matrix for theta
}

parameters {
  vector[2] theta;          // combined vector for alpha and beta
  real<lower=0> sigma;      // error standard deviation
}

transformed parameters {
  real alpha = theta[1];    // intercept
  real beta = theta[2];     // slope
}

model {
  // Priors
  theta ~ multi_normal(mu, Sigma);    // unit information
  target += -log(sigma);              // Jeffreys prior

  // Likelihood (linear regression model)
  y ~ normal(alpha + beta * x, sigma);
}"

stanmodel1M1 <- stan_model(model_code=stancode1M1)

 

scroll to top

   

2. Simulation

 

Effect Size

\[\begin{equation} Y=\beta_0+\beta_1X+\epsilon \end{equation}\]

If \(X\sim\mathcal{N}(0,\ \sigma_X^2)\ \perp \!\!\! \perp\ \epsilon\sim\mathcal{N}(0,\ \sigma_\epsilon^2)\),   then \(Y\sim\mathcal{N}(\beta_0,\ \color{lightgray}{\sigma_Y^2=}\beta_1^2\sigma_X^2+\sigma_\epsilon^2)\).

 

In the linear regression analysis, the Pearson’s correlation coefficient is \(r:=\frac{\text{Cov}(Y,X)}{\sigma_X\sigma_Y}\). Recall that \(\frac{\text{Cov}(Y,X)}{\sigma_X^2}=\frac{\text{Cov}(\beta_0+\beta_1X+\epsilon,\ X)}{\sigma_X^2}=\beta_1\).

Thus, the standardized slope (also known as the beta coefficient in standardized units) is \(r=\beta_1\frac{\sigma_X}{\sigma_Y}=\beta_1\frac{\sigma_X}{\sqrt{\beta_1^2\sigma_X^2+\sigma_\epsilon^2}}\).

And, the (unstandardized) regression slope is \(\beta_1=\frac{\sigma_\epsilon}{\sigma_X}\cdot\frac{r}{\sqrt{1-r^2}}\).

 

  • Cohen (1988, p. 76-81) suggests that the effect sizes are small, medium, and large for \(r\in\{0.1,\ 0.3,\ 0.5\}\), respectively.

   

Data

simData2 <- function(n, intercept=NULL, slope=NULL, sd.X=2, sd.e=10) {
  #' Input - 
  #' n :             number of observations
  #' intercept:      true baseline at x = 0
  #' slope:          true regression coefficient; Cov(Y, X) / sd.X²
  #' sd.X:           true standard deviation of X
  #' sd.e:           true error standard deviation
  #' 
  #' Details - 
  #' sd.Y²:          variance of Y; slope² · sd.X² + sd.e²
  #' r:              standardized slope; Cov(Y, X) / (sd.X · sd.Y)
  #'                 r ≡ slope · sd.X / sd.Y  ==>
  #'             slope ≡ r · sd.e / (sd.X · sqrt(1 - r²))
  #'             
  #'                 r = 0.1 small
  #'                 r = 0.3 medium
  #'                 r = 0.5 large effect size (Cohen, 1988, p. 76-81)
  #' 
  #' Output - dependent and independent variables, y and x
  
  # no random seed
  x <- stats::rnorm(n, 0, sd.X) # independent variable
  epsilon <- stats::rnorm(n, 0, sd.e) # error term
  data.frame("DV"=intercept + slope * x + epsilon, "IV"=x)
}

   

The Bayes Factors

Note: All the Bayes factors are \(\textit{BF}_{01}\).

computeBFs2 <- function(data, precise=F) {
  #' Input -
  #' data:     long format data whose columns must be `DV` and `IV`
  #' precise:  logical; if FALSE (default), use the simpler form for Output 6;
  #'                             otherwise, use the more precise form
  #' 
  #' Global -  stanmodel1M1
  #' Dependency - BayesFactor (v0.9.12-4.7), rstan (v2.32.6), polspline (v1.1.25)
  #' Output -  a vector of the Bayes factors in favor of the null
  #'           1. Savage–Dickey density ratio (Eq. 11) using the normal unit-information prior
  #'           2. lmBF function or regressionBF function
  #'           3. BIC approximation to the Bayes factor in Eq. 8; BIC(fit)
  #'           4. Jeffreys approximate Bayes factor in Eq. 9 using the normal unit-information prior
  #'           5. Jeffreys approximate Bayes factor in Eq. 9 using the Jeffreys prior
  #'           6. Wagenmakers approximate Bayes factor based on the p-value and sample size in Eq. 10
  
  data <- stats::na.omit(data)
  N <- nrow(data) # number of observations
  
  fit0 <- stats::lm(DV ~ 1, data) # null model fit,  ŷ = ^β₀
  fit1 <- stats::lm(DV ~ IV, data) # full model fit, ŷ = ^β₀ + ^β₁ · x
  pVal <- summary(fit1)$coefficients["IV", "Pr(>|t|)"] # p-value of the regression coefficient
  tVal <- stats::qt(pVal/2, N-2, lower.tail=F) # t-statistic (^β₁ >= 0), ["IV", "t value"]
  
  datalist <- list(N=N, x=data$IV, y=data$DV,
                   mu=stats::coef(fit1), Sigma=stats::vcov(fit1)*N)
  stanfitM1 <- rstan::sampling(stanmodel1M1, data=datalist,
                               iter=15000, warmup=5000, chains=2, seed=277, refresh=0)
  betaS <- rstan::extract(stanfitM1, pars="beta")$beta
  dPost <- polspline::dlogspline(0, polspline::logspline(betaS)) # estimated posterior density
  dPrior <- stats::dnorm(0, datalist$mu[2], sqrt(datalist$Sigma[2,2]))
  SDdr01 <- dPost / dPrior # Savage–Dickey density ratio
  
  bf <- BayesFactor::lmBF(DV ~ IV, data, progress=F) # lmBF10
  BF01 <- ifelse(BayesFactor::extractBF(bf)$error < 0.01, # Monte Carlo error
                 1 / BayesFactor::extractBF(bf)$bf, NA) # lmBF01
  
  BICB01 <- exp((stats::BIC(fit1) - stats::BIC(fit0)) / 2) # Eq. 8
  
  # sqrt(N) * exp(-0.5 * stats::qchisq(1-pVal, 1)) #  <== CAUTION, tiny p-value --> infinity Wald statistic
  JAB01_UI <- sqrt(N) * exp(-0.5 * tVal * tVal) # Eq. 9, unit-information prior

  WAB01 <- ifelse(pVal <= .5,
                  ifelse(pVal > .1,
                         ifelse(precise,
                                4 * pVal^(2/3) * sqrt(N) / 3, # 0.1 < p <= 0.5 (more precise)
                                sqrt(pVal * N)), # 0.1 < p <= 0.5 (simpler)
                         3 * pVal * sqrt(N)), # p <= 0.1
                  pVal^(1/4) * sqrt(N)) # p > 0.5, Eq. 10
  
  c("SD"=SDdr01, "lmBF"=BF01, "BIC_approx"=BICB01, 
                 "JAB"=JAB01_UI, "JAB_J"=sqrt(pi/2) * JAB01_UI, "WAB"=WAB01)
}

# set.seed(277)
# test2 <- simData2(30, 10, 2)
# computeBFs2(test2)

   

Simulation Study

We aim to conduct several simulation studies to evaluate the accuracy of approximate objective Bayes factors in their ability to convert \(p\)-values and \(n\) reported in scientific articles to Bayesian measures of evidence.

 

nSim <- 1000 # number of simulation runs for each setting
nObs <- c(10, 20, 30, 100, 500) # number of observations
BETA1 <- c("Null"=0, "Small"=0.5, "Medium"=1.6, "Large"=2.9) # regression coefficient (slope)

reportBF <- reportBF_avg <- reportBF_sd <- reportERR <- reportERR_avg <- reportERR_sd <- 
  reportRULE <- reportRULE_H1 <- reportRULE_H0 <- reportRULE_inc <-
  setNames(vector(mode="list", length=length(nObs) * length(BETA1)), # pre-allocation
           apply(expand.grid(BETA1, nObs), 1, function(r) paste0("n = ", r[2], ", slope = ", r[1])))

index <- 1 # initial list index

for (n in nObs) {
  
  for (slope in BETA1) {
    set.seed(n+slope+277)
    temp <- t(replicate(nSim, computeBFs2(simData2(n, intercept=10, slope=slope)))) # replicate the Bayes factors
    reportBF[[index]] <- temp
    reportBF_avg[[index]] <- colMeans(temp, na.rm=T) # mean Bayes factors in favor of the null
    reportBF_sd[[index]] <- apply(na.omit(temp), 2, sd) # sample standard deviations of them
    
    tempERR <- 100* (temp - temp[,1]) / temp[,1] # percent errors; the first column should be all 0
    reportERR[[index]] <- tempERR # (can take in NA)
    reportERR_avg[[index]] <- colMeans(tempERR, na.rm=T) # mean percent errors
    reportERR_sd[[index]] <- apply(na.omit(tempERR), 2, sd) # sample standard deviations of the percent errors
    
    tempRULE_H1 <- temp < 1/3 # support H1
    tempRULE_H0 <- temp > 3 # support H0
    tempRULE_inc <- temp >= 1/3 & temp <= 3 # inconclusive
    
    # counts of matching decisions for H1
    reportRULE_H1[[index]] <- colSums(tempRULE_H1 * tempRULE_H1[,1], na.rm=T)
    
    # counts of matching decisions for H0
    reportRULE_H0[[index]] <- colSums(tempRULE_H0 * tempRULE_H0[,1], na.rm=T)
    
    # counts of matching inconclusiveness
    reportRULE_inc[[index]] <- colSums(tempRULE_inc * tempRULE_inc[,1], na.rm=T)
    
    tempRULE <- 1*(temp < 1/3) - 1*(temp > 3) # 1: support H1;  0: inconclusive;  -1: support H0
    reportRULE[[index]] <- colSums(tempRULE[,1] == tempRULE, na.rm=T) # counts of matching decisions
    # colMeans(); a value near 1 suggests matching results; the first column should be all 1
    
    index <- index + 1
  }
} # roughly 4.6 hours of run time

# Warning messages:
# 1-8: In polspline::logspline(betaS) :  Not all models could be fitted

 

reshapeW2L2 <- function(list, prob=T, lab=c("SD", "BF", "BIC", "JAB", "JAB_J", "WAB")) {
  #' Input -
  #' list:    a list of size length(nObs) * length(BETA1);
  #'          reportERR (unaggregated),
  #'          reportRULE, reportRULE_H1, reportRULE_H0, or reportRULE_inc (aggregated)
  #' prob:    logical; if TRUE (default), return the proportions;
  #'                           otherwise, return the original values
  #' lab:     a vector of characters representing the Bayes factor methods
  #' 
  #' Global -
  #' nSim:    number of simulation runs for each setting
  #' nObs:    number of observations
  #' BETA1:   regression coefficient (slope)
  #' 
  #' Output - unlist the report and reshape the wide into the long
  
  len <- length(lab) # six methods
  nRep <- ifelse(!prob, nSim, 1) # nSim if unaggregated; 1 if aggregated
  long <- data.frame("value"=unname(unlist(list)),
                     "method"=factor(rep(rep(1:len, each=nRep), length(list)), labels=lab),
                     "n"=rep(nObs, each=nRep*len*length(BETA1)),
                     "slope"=rep(rep(unname(BETA1), each=nRep*len), length(nObs)))
  
  if (prob) {
    list2 <- lapply(list, function(vec) {
      if (vec[1] == 0) {
        vec * NA
      } else {
        vec / vec[1] # convert counts to proportions
      }
    })
    list3 <- lapply(list, function(vec) rep(vec[1], length(vec))) # denominator
    long$total <- unname(unlist(list3))
    long$prop <- unname(unlist(list2))
  }
  long
}


formatting <- function(val) {
  #' format the value using scientific notation
  if (abs(val) >= 10) {
    paste("~italic(BF)[\"01\"] %~~%", 
          gsub("e\\+0*(\\d+)", "%*%10^\\1", sprintf("%.2e", val)))
  } else if (abs(val) < 1) {
    paste("~italic(BF)[\"01\"] %~~%", 
          gsub("e-0*(\\d+)", "%*%10^{-\\1}", sprintf("%.2e", val)))
  } else {
    paste("~italic(BF)[\"01\"] %~~%", sprintf("%.2f", val))
  }
}

 

scroll to top

   

3. Graphic Results

 

Visualization

Note: The Jeffreys approximate Bayes factor (JAB_J) in Eq. 9, using the Jeffreys prior, will not be plotted.

Boxplot (Percent Errors)

# percent errors
df2 <- reshapeW2L2(reportERR, prob=F)

index <- 0

# boxplot of the percent errors
for (beta in BETA1) {
  baseBF01 <- sapply(reportBF_avg[seq(1, by=length(BETA1), length.out=length(nObs)) + index], 
                     function(x) x[1]) # baseline is the Savage–Dickey density ratio
  baseBF01_lab <- sapply(baseBF01, formatting) # scientific notation
  index <- index + 1
  
  sub <- subset(df2, !(method %in% c("SD", "BF", "JAB_J")) & slope == beta) # subset the long
  anno <- data.frame("label"=baseBF01_lab, "n"=unique(sub$n),
                     "x"=2, "y"=.9 * aggregate(value~n, sub, FUN=min)$value) # for annotation use
  
  print(ggplot(sub, aes(x=method, y=value)) +
          geom_boxplot(alpha=0.3, na.rm=T) +
          geom_text(data=anno, aes(label=label, x=x, y=y), parse=T, col="red") +
          facet_wrap(.~n, scales="free_y", nrow=1,
                     labeller=label_bquote(cols=italic(n)==.(n))) +
          labs(x="\nBayes Factor Methods", y="Percent Error (%)\n") +
          ggtitle(paste0(names(BETA1[BETA1==beta]), " Effect Size of the Slope")) +
          geom_hline(yintercept=0, linetype="dashed", color="gray") + # reference line 0%
          theme_classic() +
          theme(axis.text.x=element_text(angle=90)))
  cat("<br><br><br><br><br>")
}





















 

scroll to top

   

Boxplot (\(\textit{BF}_{01}\))

df_BF <- reshapeW2L2(reportBF, prob=F)

# boxplot of the BF₀₁
for (beta in BETA1) {
  sub <- subset(df_BF, slope == beta & method != "JAB_J") # subset the long
  print(ggplot(sub, aes(x=method, y=value)) +
          geom_boxplot(alpha=0.3, na.rm=T) +
          facet_wrap(.~n, scales="free_y", nrow=1,
                     labeller=label_bquote(cols=italic(n)==.(n))) +
          labs(x="\nBayes Factor Methods", y=expression(italic("BF")["01"])) +
          ggtitle(paste0(names(BETA1[BETA1==beta]), " Effect Size of the Slope")) +
          theme_classic() +
          theme(axis.text.x=element_text(angle=90)))
  cat("<br><br><br><br><br>")
}





















 

scroll to top

   

Line Chart (Proportions of Agreement)

# decisions
decis2 <- rbind(reshapeW2L2(reportRULE),
                reshapeW2L2(reportRULE_H1),
                reshapeW2L2(reportRULE_H0),
                reshapeW2L2(reportRULE_inc))
decis2$result <- factor(rep(c("Overall", "BF01 < 1/3", "BF01 > 3", "[1/3, 3]"),
                            each=length(levels(df2$method)) * length(nObs) * length(BETA1)),
                        levels=c("Overall", "BF01 < 1/3", "BF01 > 3", "[1/3, 3]"))
decis2$delta <- paste0("Slope = ", decis2$slope)
decis2 <- decis2[!is.na(decis2[,"prop"]),]
if (F) {
  # compute the pointwise confidence intervals using the normal approximation
  decis2$half.len <- qnorm(1-.05/2) * 
    sqrt(decis2$value * (decis2$total - decis2$value) / decis2$total^3) # 95% CI width
  decis2$lwr <- max(0, decis2$value / decis2$total - decis2$half.len) # 95% CI lower bound
  decis2$upr <- min(1, decis2$value / decis2$total + decis2$half.len) # 95% CI upper bound
} else { # Jeffreys intervals
  decis2$lwr <- qbeta(.05/2, decis2$value, decis2$total - decis2$value + 1) # 95% CI lower bound
  decis2$upr <- qbeta(1-.05/2, decis2$value + 1, decis2$total - decis2$value) # 95% CI upper bound
  decis2$half.len <- (decis2$upr - decis2$lwr) / 2 # 95% CI width
}

# line chart of the proportions of agreement
ggplotly(ggplot(decis2, aes(n, prop, color=method, label=value)) +
           geom_line(linewidth=1.2, na.rm=T) + geom_point(size=2, na.rm=T) +
           geom_ribbon(aes(ymin=lwr, ymax=upr), alpha=.2) +
           scale_color_manual(values=c("black", "gray", "#E69F00", "#CC79A7", "#D55E00", "#0072B2"),
                              labels=c("SD", "BF", "BIC", "JAB", "JAB_J", "WAB")) +
           facet_grid(result~delta) +
           labs(x="Number of Observations", y="Proportion of Agreement", color="Method") +
           scale_x_continuous(breaks=c(10, 100, 500)) +
           scale_y_continuous(breaks=c(0, 0.5, 1)) + # min(decis2$prop, na.rm=T)
           theme_minimal()) # interactive plots

     

 

scroll to top

   

4. Decision Rules

 

Debugging

(mat <- matrix(rep(c(0, -1), 5), nrow=2))
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    0    0    0    0
## [2,]   -1   -1   -1   -1   -1
(res <- mat[mat[,1] == -1, 1] == mat[mat[,1] == -1,]) # only one row ==> column vector
## [1] TRUE TRUE TRUE TRUE TRUE
colSums(res) #  ERROR!
## Error in base::colSums(x, na.rm = na.rm, dims = dims, ...): 'x' must be an array of at least two dimensions
colSums(rbind(rep(F, 5), res)) # counts of matching decisions
## [1] 1 1 1 1 1
colSums((mat == -1) * (mat == -1)[,1]) # counts of matching decisions
## [1] 1 1 1 1 1

 

The counts of matching Bayes factors between the gold standard and approximate methods.

reportRULE
## $`n = 10, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        689        910        935        765        871 
## 
## $`n = 10, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        692        907        928        773        874 
## 
## $`n = 10, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        750        947        955        831        918 
## 
## $`n = 10, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        773        938        951        909        897 
## 
## $`n = 20, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        393        974        978        879        921 
## 
## $`n = 20, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        401        979        992        909        940 
## 
## $`n = 20, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        675        972        966        907        936 
## 
## $`n = 20, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        878        977        972        957        955 
## 
## $`n = 30, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        291        981        991        932        978 
## 
## $`n = 30, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        349        988        989        919        980 
## 
## $`n = 30, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        720        981        979        918        970 
## 
## $`n = 30, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        947        990        979        967        984 
## 
## $`n = 100, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        828        998        997        965        971 
## 
## $`n = 100, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        759        990        989        946        956 
## 
## $`n = 100, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        918        994        992        962        976 
## 
## $`n = 100, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        999       1000       1000       1000       1000 
## 
## $`n = 500, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        935        999        999        984        998 
## 
## $`n = 500, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000        732        983        981        925        961 
## 
## $`n = 500, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000       1000       1000       1000       1000       1000 
## 
## $`n = 500, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000       1000       1000       1000       1000       1000
reportRULE_inc
## $`n = 10, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        665        665        652        651        430        665 
## 
## $`n = 10, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        666        666        652        652        439        666 
## 
## $`n = 10, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        659        659        637        633        490        659 
## 
## $`n = 10, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        555        555        518        516        464        555 
## 
## $`n = 20, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        372        372        365        356        251        372 
## 
## $`n = 20, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        368        368        366        362        279        368 
## 
## $`n = 20, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        498        498        480        467        411        498 
## 
## $`n = 20, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        377        377        360        349        339        377 
## 
## $`n = 30, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        267        267        263        262        199        267 
## 
## $`n = 30, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        311        311        309        304        233        311 
## 
## $`n = 30, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        465        465        458        447        392        465 
## 
## $`n = 30, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        224        224        215        204        204        224 
## 
## $`n = 100, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        116        109        115        114         84         87 
## 
## $`n = 100, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        218        201        213        209        173        174 
## 
## $`n = 100, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        247        199        245        242        231        223 
## 
## $`n = 100, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          2          1          2          2          2          2 
## 
## $`n = 500, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         41         39         40         40         26         39 
## 
## $`n = 500, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        342        245        330        326        294        303 
## 
## $`n = 500, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          0          0          0          0          0          0 
## 
## $`n = 500, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          0          0          0          0          0          0
reportRULE_H1
## $`n = 10, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         49         24         49         49         49         35 
## 
## $`n = 10, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         57         26         57         57         57         39 
## 
## $`n = 10, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        166         91        166        166        166        133 
## 
## $`n = 10, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        348        218        348        348        348        282 
## 
## $`n = 20, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         31         21         31         31         31         26 
## 
## $`n = 20, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         43         33         43         43         41         39 
## 
## $`n = 20, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        220        177        220        220        214        191 
## 
## $`n = 20, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        555        501        555        555        550        528 
## 
## $`n = 30, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         24         24         24         24         24         24 
## 
## $`n = 30, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         41         38         40         41         38         38 
## 
## $`n = 30, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        263        255        263        263        254        254 
## 
## $`n = 30, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        736        723        736        736        723        723 
## 
## $`n = 100, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         12         12         12         12          9         12 
## 
## $`n = 100, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         51         51         48         50         42         51 
## 
## $`n = 100, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        710        710        706        707        688        710 
## 
## $`n = 100, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        998        998        998        998        998        998 
## 
## $`n = 500, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          4          4          4          4          3          4 
## 
## $`n = 500, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        235        235        230        232        208        235 
## 
## $`n = 500, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000       1000       1000       1000       1000       1000 
## 
## $`n = 500, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##       1000       1000       1000       1000       1000       1000
reportRULE_H0
## $`n = 10, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        286          0        209        235        286        171 
## 
## $`n = 10, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        277          0        198        219        277        169 
## 
## $`n = 10, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        175          0        144        156        175        126 
## 
## $`n = 10, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         97          0         72         87         97         60 
## 
## $`n = 20, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        597          0        578        591        597        523 
## 
## $`n = 20, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        589          0        570        587        589        533 
## 
## $`n = 20, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        282          0        272        279        282        247 
## 
## $`n = 20, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         68          0         62         68         68         50 
## 
## $`n = 30, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        709          0        694        705        709        687 
## 
## $`n = 30, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        648          0        639        644        648        631 
## 
## $`n = 30, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        272          0        260        269        272        251 
## 
## $`n = 30, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         40          0         39         39         40         37 
## 
## $`n = 100, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        872        707        871        871        872        872 
## 
## $`n = 100, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        731        507        729        730        731        731 
## 
## $`n = 100, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##         43          9         43         43         43         43 
## 
## $`n = 100, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          0          0          0          0          0          0 
## 
## $`n = 500, slope = 0`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        955        892        955        955        955        955 
## 
## $`n = 500, slope = 0.5`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##        423        252        423        423        423        423 
## 
## $`n = 500, slope = 1.6`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          0          0          0          0          0          0 
## 
## $`n = 500, slope = 2.9`
##         SD       lmBF BIC_approx        JAB      JAB_J        WAB 
##          0          0          0          0          0          0

 

scroll to top

   

5. Finite Sample Correction

 

Centering the normal unit-information prior on the null hypothesized value \(\theta_0\), the customary choice in objective Bayesian testing yields the following JAB* (its Eq. 4; Wagenmakers, 2022).

\[\begin{equation} \textit{JAB}_{01}^*\approx\sqrt{N}\exp{\left\{-\frac{N-1}{2}\left(\frac{\hat{\theta}-\theta_0}{\sigma_g}\right)^2\right\}} \end{equation}\]

Alternatively, the JAB in Eq. 9 centers the prior on the maximum likelihood estimate.

\[\begin{equation} \textit{JAB}_{01}\approx\sqrt{N}\exp{\left\{-\frac{N}{2}\left(\frac{\hat{\theta}-\theta_0}{\sigma_g}\right)^2\right\}} \end{equation}\]

Using a power transformation, it can be shown that \(\textit{JAB}_{01}^*=\textit{JAB}_{01}^{\frac{N-1}{N}}\cdot N^{\frac{1}{2N}}\).

 

\(N=10\)

par(mfrow=c(1, length(BETA1)))
for (i in 1:length(BETA1)) {
  reportBF[[i]][,5] <- reportBF[[i]][,4]^(9/10) * 10^(1/20) # number of observations is 10
  boxplot(reportBF[[i]][,c(1,4:6)], names=c("SD", "JAB", "JAB*", "WAB"),
          xlab="", ylab=expression(italic("BF")["01"]), main=paste0(names(BETA1)[i], " Slope"), las=2)
}

 

scroll to top

   

6. eJAB

 

To test the hypotheses \(\mathcal{H}_0:\boldsymbol{\theta}=\boldsymbol{\theta}_0\) versus \(\mathcal{H}_1:\boldsymbol{\theta}\neq\boldsymbol{\theta}_0\), we define the extended Jeffreys approximate objective Bayes factor (eJAB) as \[\begin{equation} \mathit{eJAB}_{01}=\sqrt{N}\exp\left\{-\frac{1}{2}\frac{N^{1/q}-1}{N^{1/q}}\cdot Q_{\chi^{2}_{q}}(1-p)\right\}, \end{equation}\]

where \(q\) is the size of the parameter vector \(\boldsymbol{\theta}\), \(N\) is the sample size, \(Q_{\chi^{2}_{q}}(\cdot)\) is the quantile function of the chi-squared distribution with \(q\) degrees of freedom, and \(p\) is the \(p\)-value from a null-hypothesis significance test. We further define \(\mathit{eJAB}_{10}=1\ \!/\ \!\mathit{eJAB}_{01}\).

 

The sample size \(N\) is

  • the total number of observations for for \(t\)-tests, linear regression, logistic regression, one-way ANOVA, and chi-squared tests.

  • the number of events for Cox models.

  • the number of independent observations for one-way repeated-measures ANOVA (Nathoo & Masson, 2016).

 

The degrees of freedom are

  • \(q=1\) for \(t\)-tests, linear regression, logistic regression, and Cox models, where \(\mathcal{H}_0\) specifies a point-null hypothesis (Wagenmakers, 2022).

  • \(q=I-1\) for one-way ANOVA and one-way repeated-measures ANOVA, where \(I\) is the number of conditions.

  • \(q=(R-1)(C-1)\) for chi-squared tests, where \(R\) and \(C\) are the numbers of rows and columns, respectively.

Since the Bayes factor is the ratio of the marginal likelihoods of two competing models, \(q=p_1-p_0\) also represents the difference in the number of free parameters between the alternative and null models.

 

For further reading, please visit https://rpubs.com/sherloconan/1361280.

 

scroll to top