最近ベイズ統計モデリングが流行している。2018年の『心理学評論』でも「統計革命」と題した特集が組まれ、ベイズ統計モデリングに関する特集論文がいくつか出ている。また、浜田宏・石田淳・清水裕二(著)『社会科学のためのベイズ統計モデリング』も2019年12月に公刊された。僕も乗り遅れないように勉強を本格的に進め出した。そこで、ここではこれまで使用されてきた分析方法をベイズによるアプローチでやってみる。自分のための備忘録である。新たな発見があり、記事にする余裕ができたら随時更新していこうと思う。学習中ですので間違いがあればご指摘ください。
まず、使用するパッケージを読み込む。
# get packages
library(tidyverse)
## ─ Attaching packages ─────────
## ✔ ggplot2 3.2.0 ✔ purrr 0.3.3
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 1.0.0 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ─ Conflicts ─ tidyverse_conflicts() ─
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(rstan)
## Loading required package: StanHeaders
## rstan (Version 2.19.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
##
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
##
## extract
library(ggmcmc)
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
データは以下のように設定する。
# making data
N_A <- 1000
N_B <- 800
meanA <- 59
meanB <- 60
sigmaA <- 10
sigmaB <- 15
set.seed(1234)
Y_A <- rnorm(N_A, meanA, sigmaA)
Y_B <- rnorm(N_B, meanB, sigmaB)
まずは普通にt検定(等分散を仮定)を行ってみる。
# normal t test (var equal)
ttest <- t.test(Y_A, Y_B, var.equal = TRUE)
print(ttest)
##
## Two Sample t-test
##
## data: Y_A and Y_B
## t = -2.2716, df = 1798, p-value = 0.02323
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.456978 -0.180138
## sample estimates:
## mean of x mean of y
## 58.73403 60.05259
次にベイズによるアプローチ。まず、手元のデータYAとYBは以下のように、それぞれ異なるμを持つ正規分布から導出されたと考える。
YA〜Normal(μA,σ)YB〜Normal(μB,σ)
μAとμBとσについては全くわからない(情報がない)とし0から100をとる一様分布に従うとする。すなわち以下のようになる。
μA〜Uniform(0,100)μB〜Uniform(0,100)σ〜Uniform(0,100)
以上を踏まえて、stan側のファイルに以下のように記述する。なお、これはt.test(normal).stanとして保存しておく。
// t test (var equal)
data {
int N_A;
int N_B;
real Y_A[N_A];
real Y_B[N_B];
}
parameters {
real mu_A;
real mu_B;
real<lower = 0> sigma;
}
model {
mu_A ~ uniform(0, 100);
mu_B ~ uniform(0, 100);
sigma ~ uniform(0, 100);
for(n in 1:N_A)
Y_A[n] ~ normal(mu_A, sigma);
for(n in 1:N_B)
Y_B[n] ~ normal(mu_B, sigma);
}
次に、MCMCサンプルを生成し結果を確認、またMCMCサンプルを抽出するために以下のコードをR側のファイルに書く。
# bayesian t test
data <- list(N_A = N_A, N_B = N_B, Y_A = Y_A, Y_B = Y_B)
fit <- rstan::stan(file = 't test(normal).stan', data = data, seed = 1234)
##
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 1).
## Chain 1: Rejecting initial value:
## Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1: Stan can't start sampling from this initial value.
## Chain 1: Rejecting initial value:
## Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1: Stan can't start sampling from this initial value.
## Chain 1: Rejecting initial value:
## Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1: Stan can't start sampling from this initial value.
## Chain 1:
## Chain 1: Gradient evaluation took 0.000345 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 3.45 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.873036 seconds (Warm-up)
## Chain 1: 0.630921 seconds (Sampling)
## Chain 1: 1.50396 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 2).
## Chain 2: Rejecting initial value:
## Chain 2: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 2: Stan can't start sampling from this initial value.
## Chain 2:
## Chain 2: Gradient evaluation took 0.000115 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 1.15 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.87003 seconds (Warm-up)
## Chain 2: 0.620414 seconds (Sampling)
## Chain 2: 1.49044 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 3).
## Chain 3: Rejecting initial value:
## Chain 3: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3: Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3: Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3: Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3: Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3: Stan can't start sampling from this initial value.
## Chain 3:
## Chain 3: Gradient evaluation took 8.7e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.87 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 1.0203 seconds (Warm-up)
## Chain 3: 0.837934 seconds (Sampling)
## Chain 3: 1.85823 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 4).
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4: Stan can't start sampling from this initial value.
## Chain 4:
## Chain 4: Gradient evaluation took 8.7e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.87 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.720689 seconds (Warm-up)
## Chain 4: 0.610845 seconds (Sampling)
## Chain 4: 1.33153 seconds (Total)
## Chain 4:
print(fit)
## Inference for Stan model: t test(normal).
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5%
## mu_A 58.74 0.01 0.39 57.95 58.48 58.74 59.01 59.51
## mu_B 60.05 0.01 0.42 59.21 59.76 60.04 60.33 60.86
## sigma 12.25 0.00 0.20 11.87 12.10 12.24 12.38 12.65
## lp__ -5405.98 0.03 1.15 -5408.90 -5406.52 -5405.72 -5405.13 -5404.62
## n_eff Rhat
## mu_A 4135 1
## mu_B 3985 1
## sigma 4253 1
## lp__ 2004 1
##
## Samples were drawn using NUTS(diag_e) at Tue Dec 17 14:08:37 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
sample <- rstan::extract(fit)
一応、traceplotを確認し収束しているかをチェックする。なお、上記でRhatの結果も確認する必要がある。
# check convergence
g_mcmc <- fit %>%
ggs() %>%
ggs_traceplot()
print(g_mcmc)
最後に、μAとμBのMCMCサンプルを用いて両者の差(diff=μA−μB)を計算しその事後分布を図示する。なお、ここで赤い縦ラインは2.5パーセンタイル(-2.36951037)、97.5パーセンタイル(-0.22392976)、青い縦ラインは0.5パーセンタイル(-2.65241993)、99.5パーセンタイル(0.06217632)である。頻度主義の文脈では両側検定でそれぞれ5%有意、1%有意の判定ラインに該当する。なお、黒の点線は0である。また、事後分布はそのパラメータが従う確率分布なので「そのパラメータは〇〇%の確率で△〜▽に含まれる」と解釈できる。ここから、95%の確率でdiffは-2.36951037〜-0.22392976に含まれ、99%の確率で-2.65241993〜0.06217632に含まれると解釈できる。よって、99%の確率の場合は0をまたいでしまう。
# visualize
df <- tibble(diff = sample$mu_A - sample$mu_B)
quantile(df$diff, c(0.025, 0.975, 0.005, 0.995))
## 2.5% 97.5% 0.5% 99.5%
## -2.36951037 -0.22392976 -2.65241993 0.06217632
g <- df %>%
ggplot() +
geom_density(aes(x = diff), fill = 'blue', alpha = 0.3) +
geom_vline(xintercept = 0, colour = 'black', linetype = 'dashed') +
geom_vline(xintercept = quantile(df$diff, 0.975), colour = 'red') +
geom_vline(xintercept = quantile(df$diff, 0.025), colour = 'red') +
geom_vline(xintercept = quantile(df$diff, 0.995), colour = 'blue') +
geom_vline(xintercept = quantile(df$diff, 0.005), colour = 'blue')
print(g)
もし等分散を仮定できないと考える場合はまず、手元のデータYAとYBは以下のように、それぞれ異なるμと異なるσを持つ正規分布から導出されたと考えれば良い。
YA〜Normal(μA,σA)YB〜Normal(μB,σB)
これを考慮してコードを書き直せば等分散を仮定しないt検定的な分析が可能となる。