我々は全知全能の神ではないため,「真の」傾向スコアを知ることはできない.
そのため,次善策として「推定」傾向スコアを使って効果を推定している.
しかし,その効果はどれくらい頑健なのだろうか.
そんな不安を少しばかり軽くしてくれる(してくれないときもある)のが,感度分析(Sensitivity Analysis)である.
『傾向スコア(計量分析 One Point)』の練習データを使う.
pacman::p_load(tidyverse)
read_csv("psdemo.csv") -> data
data |>
mutate_all(~ifelse(. == -9, NA, .)) |>
mutate(s_grade = factor(s_grade)) |>
dplyr::select(!c(teacher_id, School_id, district_id, PS)) |>
drop_na() -> data
今回は非復元最近傍マッチングで処置効果を推定していく.
本来であれば,マッチングの前に傾向スコアの共通サポート等を確認すべきではあるが,主要な関心事ではないので今回は省略する.
共通サポートの確認については前にまとめたものを参照.
pacman::p_load(MatchIt)
matching <- matchit(s_treatment ~ student_gender + s_grade +
S_CLIMATE_COMMUNITY + S_CLIMATE_SCHOOLSAFETY +
S_CONFLICTRES_AGGRESSIVE + S_CONFLICTRES_RELATIONSHIPS +
S_CONFLICTRES_AGGBELIEF + S_LEARNING_RECESSEFFECT +
S_LEARNING_SPORTSEFFECT + S_LEARNING_ENGAGEMENT +
S_RECESS_ORGANIZED + S_RECESS_ENJOYMENT +
S_YOUTHDEV_INTERACTIONS + S_YOUTHDEV_PEERCONFLICT +
S_YOUTHDEV_PEERNONCONFLICT + S_PHYSICAL_SELFCONCEPT,
data = data, method = "nearest", replace = F)
結果を眺めてみる.
summary(matching)
##
## Call:
## matchit(formula = s_treatment ~ student_gender + s_grade + S_CLIMATE_COMMUNITY +
## S_CLIMATE_SCHOOLSAFETY + S_CONFLICTRES_AGGRESSIVE + S_CONFLICTRES_RELATIONSHIPS +
## S_CONFLICTRES_AGGBELIEF + S_LEARNING_RECESSEFFECT + S_LEARNING_SPORTSEFFECT +
## S_LEARNING_ENGAGEMENT + S_RECESS_ORGANIZED + S_RECESS_ENJOYMENT +
## S_YOUTHDEV_INTERACTIONS + S_YOUTHDEV_PEERCONFLICT + S_YOUTHDEV_PEERNONCONFLICT +
## S_PHYSICAL_SELFCONCEPT, data = data, method = "nearest",
## replace = F)
##
## Summary of Balance for All Data:
## Means Treated Means Control Std. Mean Diff.
## distance 0.2211 0.1413 0.7235
## student_gender 0.5286 0.5078 0.0417
## s_grade4 0.5143 0.5039 0.0208
## s_grade5 0.4857 0.4961 -0.0208
## S_CLIMATE_COMMUNITY 3.0121 2.6881 0.7697
## S_CLIMATE_SCHOOLSAFETY 2.6696 2.4441 0.3198
## S_CONFLICTRES_AGGRESSIVE 1.3304 1.4693 -0.3343
## S_CONFLICTRES_RELATIONSHIPS 3.1640 3.1253 0.0583
## S_CONFLICTRES_AGGBELIEF 1.5183 1.6755 -0.2514
## S_LEARNING_RECESSEFFECT 2.5141 2.4399 0.1242
## S_LEARNING_SPORTSEFFECT 2.6307 2.5405 0.1213
## S_LEARNING_ENGAGEMENT 3.2662 3.1935 0.1802
## S_RECESS_ORGANIZED 2.0624 1.9596 0.1846
## S_RECESS_ENJOYMENT 3.6070 3.6175 -0.0267
## S_YOUTHDEV_INTERACTIONS 3.3261 3.2483 0.1358
## S_YOUTHDEV_PEERCONFLICT 2.2191 2.0405 0.2512
## S_YOUTHDEV_PEERNONCONFLICT 1.8375 1.7053 0.2080
## S_PHYSICAL_SELFCONCEPT 1.7277 1.7449 -0.0837
## Var. Ratio eCDF Mean eCDF Max
## distance 1.3025 0.2216 0.4030
## student_gender . 0.0208 0.0208
## s_grade4 . 0.0104 0.0104
## s_grade5 . 0.0104 0.0104
## S_CLIMATE_COMMUNITY 0.6259 0.1267 0.3215
## S_CLIMATE_SCHOOLSAFETY 0.7172 0.0702 0.1561
## S_CONFLICTRES_AGGRESSIVE 0.5234 0.0579 0.1363
## S_CONFLICTRES_RELATIONSHIPS 0.7751 0.0283 0.0764
## S_CONFLICTRES_AGGBELIEF 0.8435 0.0639 0.1137
## S_LEARNING_RECESSEFFECT 0.6425 0.0411 0.0956
## S_LEARNING_SPORTSEFFECT 0.7270 0.0494 0.0880
## S_LEARNING_ENGAGEMENT 0.6207 0.0306 0.1014
## S_RECESS_ORGANIZED 0.7546 0.0360 0.0779
## S_RECESS_ENJOYMENT 0.6805 0.0245 0.1070
## S_YOUTHDEV_INTERACTIONS 0.7617 0.0294 0.0874
## S_YOUTHDEV_PEERCONFLICT 0.8338 0.0619 0.1497
## S_YOUTHDEV_PEERNONCONFLICT 0.8158 0.0571 0.1476
## S_PHYSICAL_SELFCONCEPT 0.7739 0.0385 0.1061
##
## Summary of Balance for Matched Data:
## Means Treated Means Control Std. Mean Diff.
## distance 0.2211 0.2204 0.0059
## student_gender 0.5286 0.4786 0.1002
## s_grade4 0.5143 0.4929 0.0429
## s_grade5 0.4857 0.5071 -0.0429
## S_CLIMATE_COMMUNITY 3.0121 2.9700 0.1001
## S_CLIMATE_SCHOOLSAFETY 2.6696 2.6738 -0.0059
## S_CONFLICTRES_AGGRESSIVE 1.3304 1.3311 -0.0015
## S_CONFLICTRES_RELATIONSHIPS 3.1640 3.0952 0.1034
## S_CONFLICTRES_AGGBELIEF 1.5183 1.4469 0.1142
## S_LEARNING_RECESSEFFECT 2.5141 2.5265 -0.0208
## S_LEARNING_SPORTSEFFECT 2.6307 2.5856 0.0606
## S_LEARNING_ENGAGEMENT 3.2662 3.2859 -0.0487
## S_RECESS_ORGANIZED 2.0624 2.0325 0.0538
## S_RECESS_ENJOYMENT 3.6070 3.6265 -0.0495
## S_YOUTHDEV_INTERACTIONS 3.3261 3.2829 0.0754
## S_YOUTHDEV_PEERCONFLICT 2.2191 2.2855 -0.0935
## S_YOUTHDEV_PEERNONCONFLICT 1.8375 1.8709 -0.0525
## S_PHYSICAL_SELFCONCEPT 1.7277 1.7232 0.0219
## Var. Ratio eCDF Mean eCDF Max Std. Pair Dist.
## distance 1.0167 0.0016 0.0286 0.0125
## student_gender . 0.0500 0.0500 0.9873
## s_grade4 . 0.0214 0.0214 1.0433
## s_grade5 . 0.0214 0.0214 1.0433
## S_CLIMATE_COMMUNITY 0.6921 0.0291 0.0929 0.8531
## S_CLIMATE_SCHOOLSAFETY 0.6945 0.0442 0.0857 1.1704
## S_CONFLICTRES_AGGRESSIVE 1.1499 0.0237 0.0714 0.9071
## S_CONFLICTRES_RELATIONSHIPS 0.8248 0.0293 0.1143 1.2136
## S_CONFLICTRES_AGGBELIEF 1.3691 0.0347 0.1000 0.9378
## S_LEARNING_RECESSEFFECT 0.7533 0.0307 0.0714 1.2315
## S_LEARNING_SPORTSEFFECT 0.7612 0.0307 0.0714 1.1944
## S_LEARNING_ENGAGEMENT 0.8463 0.0290 0.0857 1.2307
## S_RECESS_ORGANIZED 0.8698 0.0210 0.0500 1.1195
## S_RECESS_ENJOYMENT 0.6308 0.0354 0.1429 1.1330
## S_YOUTHDEV_INTERACTIONS 0.9131 0.0247 0.0857 1.1588
## S_YOUTHDEV_PEERCONFLICT 0.9352 0.0264 0.0786 1.1823
## S_YOUTHDEV_PEERNONCONFLICT 0.9002 0.0179 0.0500 1.1223
## S_PHYSICAL_SELFCONCEPT 0.8113 0.0244 0.0500 1.1582
##
## Sample Sizes:
## Control Treated
## All 772 140
## Matched 140 140
## Unmatched 632 0
## Discarded 0 0
処置効果を推定して,結果をmsummaryでキレイに出力する.
match_data <- match.data(matching)
pacman::p_load(estimatr)
result <- lm_robust(S_CLIMATE_RECESSSAFETY ~ s_treatment, data = match_data)
pacman::p_load(modelsummary)
summary <- list("Result" = result)
msummary(summary, stars = c("*" = .05, "**" = .01, "***" = .001),
notes = list('括弧内は標準誤差'))
| Result | |
|---|---|
| * p < 0.05, ** p < 0.01, *** p < 0.001 | |
| 括弧内は標準誤差 | |
| (Intercept) | 2.685*** |
| (0.077) | |
| s_treatment | 0.225* |
| (0.097) | |
| Num.Obs. | 280 |
| R2 | 0.019 |
| R2 Adj. | 0.016 |
| AIC | 680.4 |
| BIC | 691.3 |
| RMSE | 0.81 |
処置効果は5%水準で有意という結果になった.
本題の感度分析をやっていく.
処置効果が5%水準で有意という上の結果は,どれくらい頑健なのだろうか.
感度分析をやるにあたって,下準備をする.
マッチングしたデータについて,subclassの値が小さい順に並び替える.
match_data |>
arrange(subclass) -> match_data
次に,アウトカム(S_CLIMATE_RECESSSAFETY)を処置群・統制群別でベクトルに変換する.
#処置群
match_data |>
filter(s_treatment == 1) |>
dplyr::select(S_CLIMATE_RECESSSAFETY) -> t_outcome
t_outcome <- t_outcome$S_CLIMATE_RECESSSAFETY
#統制群
match_data |>
filter(s_treatment == 0) |>
dplyr::select(S_CLIMATE_RECESSSAFETY) -> c_outcome
c_outcome <- c_outcome$S_CLIMATE_RECESSSAFETY
感度分析にはいくつか種類があるが,今回はWilcoxon符号順位p値に対するRosenbaum感度分析(Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value)をやってみる.
Gammaで処置割り当てのオッズの最大値,GammaIncでオッズをいくつずつ増加させていくかを指示する.詳しくはここを参照.
pacman::p_load(rbounds)
psens(t_outcome, c_outcome, Gamma = 2, GammaInc = 0.1)
##
## Rosenbaum Sensitivity Test for Wilcoxon Signed Rank P-Value
##
## Unconfounded estimate .... 0.017
##
## Gamma Lower bound Upper bound
## 1.0 0.0170 0.0170
## 1.1 0.0048 0.0489
## 1.2 0.0012 0.1086
## 1.3 0.0003 0.1980
## 1.4 0.0001 0.3109
## 1.5 0.0000 0.4353
## 1.6 0.0000 0.5580
## 1.7 0.0000 0.6686
## 1.8 0.0000 0.7612
## 1.9 0.0000 0.8339
## 2.0 0.0000 0.8882
##
## Note: Gamma is Odds of Differential Assignment To
## Treatment Due to Unobserved Factors
##
Gammaが1.2の時点でp値は0.1086となり,有意ではなくなってしまう.
これは,欠落変数による選択バイアスがわずかに増加する(オッズが1.2になる)だけで,処置効果が有意ではなくなってしまうかもしれないことを意味している.
つまり,感度分析前の上の結果は頑健ではない可能性が高いといえる.
「処置効果がある」と主張したいときにこんな結果が出たら,そっ閉じしたくなりますね.