Simulation Study

Prior

# prior at 0
prior0_delta = dnorm(0, 0, 1) * 2
prior0_xi = dnorm(0, 0, 1) * 2

1. Single parameter

To investigate how the Bayes factor (Savage-Dikcey density ratio, SDR) varies with the standardized magnitude parameter \(\delta\) and the smoothness parameter \(\xi\), I conducted a simulation study in which one parameter was held fixed while the other was estimated using GPR. Specifically, datasets were generated for combinations of values from the following grids. For each combination of parameter values, 50 datasets were independently generated.

xi_list = c(1, 2, 10) # lambda: 1, 0.5, 0.1
delta_list = c(0.1, 0.5, 1, 2, 3)

Figure 1 presents the results. In panel (a), where \(\xi\) is fixed and \(\delta\) varies, we observe that larger values of \(\delta\) lead to higher SDRs. When \(\delta\) is small, the differences in SDR across different values of \(\xi\) are minor. As \(\delta\) increases, the SDR becomes substantially larger for higher \(\xi\). In panel (b), where \(\delta\) is fixed and \(\xi\) varies, we find that increasing \(\xi\) typically results in higher SDRs. The rate of increase depends on the value of \(\delta\): for large \(\delta\), SDR rises sharply with \(\xi\), while for small \(\delta\) (e.g., 0.1), the SDR remains nearly flat across all \(\xi\) values.

2. Joint posterior density

delta_list = c(0.1, 0.5, 1, 1.5, 2, 2.5, 3)
xi_list = c(0.1, 0.2, 1, 2, 10)
n_datasets = 50

Next, we generate datasets using 35 unique combinations of \(\delta\) and \(\xi\). For each combination, 50 datasets are simulated, resulting in 1750 GPR model fits in total. For each combination, we compute the mean and standard deviation of the Bayes factor estimated via bridge sampling (BF10) and the reciprocal of the Savage–Dickey ratio (SDR10`) estimated by copula-based log-spline approach, allowing us to assess the consistency and variability of these metrics across repeated datasets.

# prior at 0
prior0_delta_or_xi = dnorm(0) * 2 * 2

Figure 2 (a) and (b) illustrate how log(BF10) and log(SDR10) vary with \(\delta_{\text{true}}\) and \(\xi_{\text{true}}\), respectively. Each point in the plots represents the mean of the logged Bayes factor (bf10) or the Savage–Dickey ratio (sdr10) across all fitted models with the same true parameter values. Error bars show the standard deviation, reflecting the variability across datasets. As \(\delta_{\text{true}}\) or \(\xi_{\text{true}}\) increases, both log(BF10) and log(SDR10) tend to increase. When the true parameter values are small, sdr10 closely aligns with bf10. However, as the values grow, sdr10 increases more slowly, leading to greater divergence between the two. This is likely because the posterior mass moves further from zero, making it harder to accurately estimate posterior density at the null using limited samples. Nonetheless, even when underestimated, sdr10 remains sufficiently large to favor the alternative model.

Figure 2: Mean and standard deviation of log-transformed Bayes factors (log(BF10)) and Savage–Dickey ratios (log(SDR10)) across repeated simulations. Panel (a) shows the trend with respect to \(\delta_{\text{true}}\), and panel (b) with respect to \(\xi_{\text{true}}\).

To further demonstrate the feasibility of the proposed method, Figure 3 presents heatmaps of log(BF10) and log(SDR10) varying with both \(\delta_{\text{true}}\) and \(\xi_{\text{true}}\). From the plots, we observe that neither a large \(\delta_{\text{true}}\) nor a large \(\xi_{\text{true}}\) alone is sufficient to produce high Bayes factors. Substantial evidence in favor of the alternative model — reflected by large log(BF10) and log(SDR10) values — only emerges when both \(\delta_{\text{true}}\) and \(\xi_{\text{true}}\) are large. Notably, the SDR10 heatmap exhibits a similar overall pattern to that of BF10, supporting the consistency of the Savage–Dickey approximation with bridge sampling.

Prior

Prior 1: std ~ halfN(0,1)

Prior 2: var ~ halfN(0,1)

Data

x = seq(0, 1, length=N)
simu_data = list(delta=0.3, lambda=0.5, sigma2=0.25, N=100, x=as.numeric(x))
set.seed(1)
simu_fit = stan(file='stan_programs/simu.stan', data=simu_data,
                warmup=0, iter=10, chains=1, algorithm="Fixed_param", seed=1)
y = as.vector(rstan::extract(simu_fit)$y[1,])
fit_data <- list(
  N = length(x), Y = y, Kgp_1 = 1, Dgp_1 = 1, prior_only = 0,
  Nsubgp_1 = length(unique(x)),
  Jgp_1 = match(x, unique(x)),
  Xgp_1 = as.array(matrix(unique(x), ncol = 1))
)

Fit models with the two priors

prior 1
- sdr = 0.6265

prior 2
- sdr = 0.6317