Hiro_macchan
2015/2/21
Multivariate regression (Ordinary Least Squares)
\[ Y = \beta_0 + \beta_1 x_1 + \eta \]
\[ Y \in \{0, 1\} \]
summary(lm(y~x_1))
Call:
lm(formula = y ~ x_1)
Residuals:
Min 1Q Median 3Q Max
-0.71711 -0.18350 0.00627 0.18907 0.67767
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.494726 0.008252 59.95 <2e-16 ***
x_1 0.073624 0.001424 51.72 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2609 on 998 degrees of freedom
Multiple R-squared: 0.7282, Adjusted R-squared: 0.728
F-statistic: 2674 on 1 and 998 DF, p-value: < 2.2e-16
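The fit above treats a 0/1 outcome as if it were continuous. A minimal sketch of one consequence, using simulated data (the seed, the sample size, the predictor distribution, and the true model are all assumptions for illustration, not the data behind the output above): a straight line fitted to a binary outcome can return fitted values outside the interval [0, 1].

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
n   <- 1000
x_1 <- rnorm(n, mean = 0, sd = 5)    # hypothetical predictor
p   <- plogis(0.5 * x_1)             # assumed true event probability
y   <- rbinom(n, size = 1, prob = p) # binary outcome

fit_ols    <- lm(y ~ x_1)
fitted_ols <- fitted(fit_ols)

# Fitted values of a linear model are not constrained to [0, 1]
range(fitted_ols)
n_outside <- sum(fitted_ols < 0 | fitted_ols > 1)
n_outside
```

Fitted values below 0 or above 1 cannot be read as probabilities, which is one common reason to model the log-odds instead.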
Logistic Function & Logistic regression
Prediction of Outcome
e.g.
res <- glm(y~x1+x2+x3, family = binomial(logit))
summary(res)
predict(res, newdata = new.df)  # log-odds by default; add type = "response" for predicted probabilities
Estimate Causal Effect
For example
Comparison of proportions
| x | col1 | col2 |
|---|---|---|
| row1 | a | b |
| row2 | c | d |
Logistic regression and the Odds Ratio
\( \beta_1 = \frac{z_m - z_f}{1 - 0} \)
\( = \log\left(\frac{p_m}{1-p_m}\right) - \log\left(\frac{p_f}{1-p_f}\right) \)
\( = \log\left(\frac{p_m}{1-p_m} \Big/ \frac{p_f}{1-p_f}\right) \)
\( OR = \frac{p_m}{1-p_m} \Big/ \frac{p_f}{1-p_f} = e^{\beta_1} \)
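For age (a continuous variable), we can compute the odds ratio associated with a one-unit increase in age.
Leave details such as how to compute the confidence interval for \( \beta \) to the software.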
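The identity \( OR = e^{\beta_1} \) can be checked numerically. The sketch below uses simulated data; the variable name `male`, its 0/1 coding, and the event probabilities are assumptions made for illustration. It computes the odds ratio once from the two observed proportions and once from the fitted coefficient, then asks R for a Wald confidence interval via `confint.default()`.

```r
set.seed(2)                              # arbitrary seed
n    <- 2000
male <- rbinom(n, 1, 0.5)                # 1 = male, 0 = female (hypothetical coding)
y    <- rbinom(n, 1, ifelse(male == 1, 0.40, 0.25))

# Odds ratio from the observed proportions
p_m <- mean(y[male == 1])
p_f <- mean(y[male == 0])
or_table <- (p_m / (1 - p_m)) / (p_f / (1 - p_f))

# Odds ratio from the logistic regression coefficient
fit    <- glm(y ~ male, family = binomial(logit))
or_glm <- unname(exp(coef(fit)["male"]))

c(table = or_table, glm = or_glm)

# Wald 95% confidence interval for the odds ratio
exp(confint.default(fit)["male", ])
```

With a single binary predictor the two odds ratios agree up to the numerical tolerance of the fitting algorithm.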
\[ z = \mathrm{logit}(p) = \log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 Sex + \beta_2 Age + \beta_3 Treat \]
\( \exp(\beta_3) \):
the odds ratio for Treatment, adjusted for sex and age
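As a rough illustration of reading adjusted odds ratios off such a model, the sketch below simulates data with the same three covariates. The coefficient values, the sample size, and the 0/1 coding of `Sex` and `Treat` are assumptions, not values from the slides.

```r
set.seed(3)                                   # arbitrary seed
n     <- 5000
Sex   <- rbinom(n, 1, 0.5)                    # hypothetical 0/1 coding
Age   <- round(runif(n, 20, 80))              # age in years
Treat <- rbinom(n, 1, 0.5)                    # 1 = treated
z     <- -3 + 0.3 * Sex + 0.04 * Age + 0.7 * Treat  # assumed true log-odds
y     <- rbinom(n, 1, plogis(z))

res <- glm(y ~ Sex + Age + Treat, family = binomial(logit))

# Exponentiated coefficients are odds ratios, each adjusted for the other terms
or_adj <- exp(coef(res))
or_adj
```

Here `or_adj["Treat"]` is the Treatment odds ratio adjusted for sex and age, and `or_adj["Age"]` is the odds ratio per one-year increase in age.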
Prediction of Outcome
Causal Effect Estimation
Donald B. Rubin
missing value
#install.packages("dagR")
library(dagR)
# Node codes used in arcs: 0 = exposure (X), -1 = outcome (Y), 1 = first covariate (C)
dag.dat <-
  dag.init(outcome = NULL, exposure = NULL, covs = c(1),
           arcs = c(0, -1,   # X -> Y
                    1, 0,    # C -> X
                    1, -1),  # C -> Y
           assocs = c(0, 0), xgap = 0.04, ygap = 0.05, len = 0.1,
           x.name = "Hospital admission",
           cov.names = c("Confounder; Patient Age"),
           y.name = "Death"
  )
junk <- dag.draw(dag.dat)  # draw the DAG
X <- C -> Y ; Back Door, Open Path
Stratify by C, or include C in a multivariable regression
Close Back Door
Closed Path
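A small simulation can make the idea of closing the back door concrete. In the sketch below, all numbers are assumptions: C drives both the exposure X and the outcome Y, and X has no effect on Y at all. The crude model still shows an association between X and Y, while the model that includes C does not.

```r
set.seed(4)                                # arbitrary seed
n <- 20000
C <- rnorm(n)                              # confounder (e.g. standardized age)
X <- rbinom(n, 1, plogis(1.0 * C))         # exposure depends on C
Y <- rbinom(n, 1, plogis(-1 + 1.0 * C))    # outcome depends on C only, not on X

crude    <- glm(Y ~ X,     family = binomial(logit))  # back door left open
adjusted <- glm(Y ~ X + C, family = binomial(logit))  # back door closed by adjusting for C

b_crude    <- unname(coef(crude)["X"])
b_adjusted <- unname(coef(adjusted)["X"])
c(crude = b_crude, adjusted = b_adjusted)
```

The crude log odds ratio is clearly positive even though the true effect is zero; after adjustment it is close to zero.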
http://dagitty.net/dags.html?id=qKWMS
1:None
2:C1
3:C2
4:C1,C2
1:C1
2:C2
3:C3
4:None
4:None
http://dagitty.net/dags.html
1:None, 2:C1, 3:C2, 4:C1,C2
Prediction of Outcome
Causal Effect Estimation
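The APE (average partial effect) corresponds to a hypothetical risk difference computed from the results of a logistic model.
For a binary variable x, the APE is the estimated event proportion if every case had x = 1, minus the estimated event proportion if every case had x = 0.
Specifically, it is given by the following formulas [2].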
\( g(z) = \frac{1}{1+\exp(-z)}, \quad g'(z) = g(z)\,(1 - g(z)) \)
\( APE = \hat{\beta}_K \left( N^{-1}\sum_{i=1}^{N} g'(x_i\hat{\beta}) \right) \quad (x_K \text{ continuous}) \)
\( APE = N^{-1}\sum_{i=1}^{N} \left[ g(\hat{\beta}_1+\hat{\beta}_2 x_{i2}+\cdots+\hat{\beta}_{K-1}x_{i,K-1}+\hat{\beta}_K) - g(\hat{\beta}_1+\hat{\beta}_2 x_{i2}+\cdots+\hat{\beta}_{K-1}x_{i,K-1}) \right] \quad (x_K \text{ binary}) \)
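A minimal sketch of how these quantities might be computed in R, on simulated data. The variable names `x2` and `xK`, the coefficients, and the sample size are assumptions for illustration. For the binary variable, every case is set to 1 and then to 0 and the mean predicted risks are subtracted. For the continuous variable, the sketch scales the coefficient by the mean slope of the logistic function, \( g(z)(1 - g(z)) \).

```r
set.seed(5)                              # arbitrary seed
n  <- 3000
x2 <- rnorm(n)                           # continuous covariate (hypothetical)
xK <- rbinom(n, 1, 0.5)                  # binary variable of interest (hypothetical)
y  <- rbinom(n, 1, plogis(-0.5 + 0.8 * x2 + 0.6 * xK))
dat <- data.frame(y, x2, xK)

fit <- glm(y ~ x2 + xK, family = binomial(logit), data = dat)

# Binary xK: mean predicted risk with everyone at xK = 1 minus everyone at xK = 0
p1 <- predict(fit, newdata = transform(dat, xK = 1), type = "response")
p0 <- predict(fit, newdata = transform(dat, xK = 0), type = "response")
ape_binary <- mean(p1 - p0)

# Continuous x2: coefficient times the mean slope of the logistic function
p_hat <- fitted(fit)
ape_continuous <- unname(coef(fit)["x2"]) * mean(p_hat * (1 - p_hat))

c(binary = ape_binary, continuous = ape_continuous)
```

Both results are on the probability scale, so `ape_binary` can be read as an estimated risk difference between the two hypothetical populations.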