Loading [MathJax]/jax/output/HTML-CSS/jax.js

はじめに

最近ベイズ統計モデリングが流行している。2018年の『心理学評論』でも「統計革命」と題した特集が組まれ、ベイズ統計モデリングに関する特集論文がいくつか出ている。また、浜田宏・石田淳・清水裕二(著)『社会科学のためのベイズ統計モデリング』も2019年12月に公刊された。僕も乗り遅れないように勉強を本格的に進め出した。そこで、ここではこれまで使用されてきた分析方法をベイズによるアプローチでやってみる。自分のための備忘録である。新たな発見があり、記事にする余裕ができたら随時更新していこうと思う。学習中ですので間違いがあればご指摘ください。

t検定

まず、使用するパッケージを読み込む。

# get packages
library(tidyverse)
## ─ Attaching packages ─────────
## ✔ ggplot2 3.2.0     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ─ Conflicts ─ tidyverse_conflicts() ─
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(rstan)
## Loading required package: StanHeaders
## rstan (Version 2.19.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## 
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
## 
##     extract
library(ggmcmc)
## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2

データは以下のように設定する。

# making data
N_A <- 1000
N_B <- 800
meanA <- 59
meanB <- 60
sigmaA <- 10
sigmaB <- 15

set.seed(1234)
Y_A <- rnorm(N_A, meanA, sigmaA)
Y_B <- rnorm(N_B, meanB, sigmaB)

まずは普通にt検定(等分散を仮定)を行ってみる。

# normal t test (var equal)
ttest <- t.test(Y_A, Y_B, var.equal = TRUE)
print(ttest)
## 
##  Two Sample t-test
## 
## data:  Y_A and Y_B
## t = -2.2716, df = 1798, p-value = 0.02323
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.456978 -0.180138
## sample estimates:
## mean of x mean of y 
##  58.73403  60.05259

次にベイズによるアプローチ。まず、手元のデータYAYBは以下のように、それぞれ異なるμを持つ正規分布から導出されたと考える。

YANormal(μA,σ)YBNormal(μB,σ)

μAμBσについては全くわからない(情報がない)とし0から100をとる一様分布に従うとする。すなわち以下のようになる。

μAUniform(0,100)μBUniform(0,100)σUniform(0,100)

以上を踏まえて、stan側のファイルに以下のように記述する。なお、これはt.test(normal).stanとして保存しておく。

// t test (var equal)
data {
  int N_A;
  int N_B;
  real Y_A[N_A];
  real Y_B[N_B];
}

parameters {
  real mu_A;
  real mu_B;
  real<lower = 0> sigma;
}

model {
  mu_A ~ uniform(0, 100);
  mu_B ~ uniform(0, 100);
  sigma ~ uniform(0, 100);
  for(n in 1:N_A)
    Y_A[n] ~ normal(mu_A, sigma);
  for(n in 1:N_B)
    Y_B[n] ~ normal(mu_B, sigma);
}

次に、MCMCサンプルを生成し結果を確認、またMCMCサンプルを抽出するために以下のコードをR側のファイルに書く。

# bayesian t test
data <- list(N_A = N_A, N_B = N_B, Y_A = Y_A, Y_B = Y_B)
fit <- rstan::stan(file = 't test(normal).stan', data = data, seed = 1234)
## 
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 1).
## Chain 1: Rejecting initial value:
## Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1:   Stan can't start sampling from this initial value.
## Chain 1: Rejecting initial value:
## Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1:   Stan can't start sampling from this initial value.
## Chain 1: Rejecting initial value:
## Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 1:   Stan can't start sampling from this initial value.
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000345 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 3.45 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.873036 seconds (Warm-up)
## Chain 1:                0.630921 seconds (Sampling)
## Chain 1:                1.50396 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 2).
## Chain 2: Rejecting initial value:
## Chain 2:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 2:   Stan can't start sampling from this initial value.
## Chain 2: 
## Chain 2: Gradient evaluation took 0.000115 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 1.15 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.87003 seconds (Warm-up)
## Chain 2:                0.620414 seconds (Sampling)
## Chain 2:                1.49044 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 3).
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: 
## Chain 3: Gradient evaluation took 8.7e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.87 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 1.0203 seconds (Warm-up)
## Chain 3:                0.837934 seconds (Sampling)
## Chain 3:                1.85823 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 't test(normal)' NOW (CHAIN 4).
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: Rejecting initial value:
## Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 4:   Stan can't start sampling from this initial value.
## Chain 4: 
## Chain 4: Gradient evaluation took 8.7e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.87 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.720689 seconds (Warm-up)
## Chain 4:                0.610845 seconds (Sampling)
## Chain 4:                1.33153 seconds (Total)
## Chain 4:
print(fit)
## Inference for Stan model: t test(normal).
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##           mean se_mean   sd     2.5%      25%      50%      75%    97.5%
## mu_A     58.74    0.01 0.39    57.95    58.48    58.74    59.01    59.51
## mu_B     60.05    0.01 0.42    59.21    59.76    60.04    60.33    60.86
## sigma    12.25    0.00 0.20    11.87    12.10    12.24    12.38    12.65
## lp__  -5405.98    0.03 1.15 -5408.90 -5406.52 -5405.72 -5405.13 -5404.62
##       n_eff Rhat
## mu_A   4135    1
## mu_B   3985    1
## sigma  4253    1
## lp__   2004    1
## 
## Samples were drawn using NUTS(diag_e) at Tue Dec 17 14:08:37 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).
sample <- rstan::extract(fit)

一応、traceplotを確認し収束しているかをチェックする。なお、上記でRhatの結果も確認する必要がある。

# check convergence
g_mcmc <- fit %>% 
  ggs() %>% 
  ggs_traceplot()
print(g_mcmc)

最後に、μAμBのMCMCサンプルを用いて両者の差(diff=μAμB)を計算しその事後分布を図示する。なお、ここで赤い縦ラインは2.5パーセンタイル(-2.36951037)、97.5パーセンタイル(-0.22392976)、青い縦ラインは0.5パーセンタイル(-2.65241993)、99.5パーセンタイル(0.06217632)である。頻度主義の文脈では両側検定でそれぞれ5%有意、1%有意の判定ラインに該当する。なお、黒の点線は0である。また、事後分布はそのパラメータが従う確率分布なので「そのパラメータは〇〇%の確率で△〜▽に含まれる」と解釈できる。ここから、95%の確率でdiffは-2.36951037〜-0.22392976に含まれ、99%の確率で-2.65241993〜0.06217632に含まれると解釈できる。よって、99%の確率の場合は0をまたいでしまう。

# visualize
df <- tibble(diff = sample$mu_A - sample$mu_B)
quantile(df$diff, c(0.025, 0.975, 0.005, 0.995))
##        2.5%       97.5%        0.5%       99.5% 
## -2.36951037 -0.22392976 -2.65241993  0.06217632
g <- df %>% 
  ggplot() +
  geom_density(aes(x = diff), fill = 'blue', alpha = 0.3) + 
  geom_vline(xintercept = 0, colour = 'black', linetype = 'dashed') + 
  geom_vline(xintercept = quantile(df$diff, 0.975), colour = 'red') + 
  geom_vline(xintercept = quantile(df$diff, 0.025), colour = 'red') + 
  geom_vline(xintercept = quantile(df$diff, 0.995), colour = 'blue') +
  geom_vline(xintercept = quantile(df$diff, 0.005), colour = 'blue') 
print(g)

もし等分散を仮定できないと考える場合はまず、手元のデータYAYBは以下のように、それぞれ異なるμと異なるσを持つ正規分布から導出されたと考えれば良い。

YANormal(μA,σA)YBNormal(μB,σB)

これを考慮してコードを書き直せば等分散を仮定しないt検定的な分析が可能となる。