Software for implementing matching methods and propensity scores: http://www.biostat.jhsph.edu/~estuart/propensityscoresoftware.html
Graphical representation of balance using PSAgraphics: http://www.jstatsoft.org/v29/i06/paper
MatchIt: Nonparametric Preprocessing for Parametric Causal Inference1: http://r.iq.harvard.edu/docs/matchit/2.4-20/matchit.pdf
Propensity Score Matching: Nearest Neighbor (Greedy) Matching with & without Subclassification: http://www.youtube.com/watch?v=LmtR8B9pRWc
Most of the methods resulted in ORs between 1.5 to 2.0.
The matching method (unconditional) and the SMR weighting method are answering the same question, contrasting the exposed subjects to similar subjects who were not exposed (average treatment effect in the treated, ATT).
The inverse probability of treatment assignment weight (IPTW) method is answering a different question, i.e., contrasting the whole cohort on TPA to the whole cohort off TPA (average treatment effect, ATE). It means a counterfactual situation where even those with contraindications were treated with TPA, thus, the OR of death would increase.
-- Unadjusted result
Unadjusted logistic OR 3.17 (2.33, 4.31)
-- Conventional logistic regression adjusting for covariates as they are
Conventional logistic OR 1.91 (1.27, 2.88)
-- Propensity score as a covariate in logistic regression
Linear PS adjusted OR 2.76 (1.97, 3.87)
Curvilinear PS adjusted OR 1.65 (1.09, 2.51)
-- PS quintile stratification and crude OR within each stratum
Stratum 1 OR 0.00 (0.00, 888.11) # No exposed case
Stratum 2 OR 0.00 (0.00, 143.14) # No exposed case
Stratum 3 OR 7.95 (0.75, 49.44)
Stratum 4 OR 1.94 (0.21, 8.72)
Stratum 5 OR 1.72 (1.14, 2.56)
M-H combined OR 1.80 (1.25, 2.60) # Not appropriate here because of effect modification
-- PS Matching and logistic regression
MatchIt 1:1 OR 1.78 (1.01, 3.19) # Unconditional logistic regression
nonrandom 1:1 OR 1.69 (0.96, 2.97) # Unconditional logistic regression
nonrandom 1:1 OR 1.72 (0.96, 3.08) # Conditional logistic regression
nonrandom 1:1 OR 1.68 (0.96, 2.97) # GEE with exchangeable correlation
MatchIt 1:4 OR 1.80 (1.17, 2.72) # Unconditional logistic regression
nonrandom 1:4 OR 1.80 (1.18, 2.74) # Unconditional logistic regression
nonrandom 1:4 OR 1.83 (1.19, 2.82) # Conditional logistic regression
nonrandom 1:4 OR 1.80 (1.18, 2.74) # GEE with exchangeable correlation
-- PS Weighting and logistic regression with robust variance estimator
IPTW OR 4.61 (1.22, 17.45) # Everybody on TPA vs Nobody on TPA (ATE)
SMRW OR 1.54 (0.96, 2.48) # Treated on TPA vs Treated not on TPA (ATT)
## Load data
library(sas7bdat)
dat <- read.sas7bdat("./epi271data.sas7bdat")
names(dat) <- tolower(names(dat))
## Create new variables
dat <- within(dat, {
## age ≥ 70
age70 <- as.numeric(age >= 70)
## submart ≥ 90
sumbart90 <- as.numeric(sumbartel >= 90)
## rankpre 4,5,6 to 5
rankpre[rankpre %in% c(4,5,6)] <- 5
## Categorical time
timeintcat <- cut(timeint, breaks = c(-Inf,1,3,5,Inf), labels = 1:4,
right = FALSE, include.lowest = T)
## Quantile of resident
residentq <- cut(residents, breaks = quantile(residents), labels = 1:4, include.lowest = T)
})
## Convert categoricals to factors
categoricals <- c("age5","age70","afib","aphasia","living","gender","rankpre","time","residentq","referral","paresis","sumbart90","timeintcat","transport","vigilanz","ward")
dat[categoricals] <- lapply(dat[categoricals],
function(var) {
var <- factor(var)
})
logit.ps <- glm(tpa ~ age5 + afib + aphasia + cardiac + gender + htn + hyperchol + icu + living + rankpre + residentq + referral + paresis + prevstroke + sumbart90 + transport + timeintcat + vigilanz + ward + timeintcat:year + age70:year, data = dat, family = binomial)
## Extract propensity score (for those who ps is available)
pscores <- fitted(logit.ps)
## Put them back into the dataset using case numbers
dat$pscore <- NA
## Put the PS into the dataset for those who have PS
dat$pscore[as.numeric(names(pscores))] <- pscores
Crude & conventional analyses
## Fit a crude model
logit.crude <- glm(death ~ tpa,
data = dat,
family = binomial(link = "logit"))
library(epicalc)
logistic.display(logit.crude)$table[1,, drop = F]
OR(95%CI) P(Wald's test) P(LR-test)
tpa: 1 vs 0 "3.17 (2.33,4.31) " "< 0.001" "< 0.001"
## Fit a traditional logistic regression model
logit.traditional <- glm(death ~ tpa + age5 + afib + aphasia + cardiac + gender + htn + hyperchol + icu + living + rankpre + residentq + referral + paresis + prevstroke + sumbart90 + transport + timeintcat + vigilanz + ward + timeintcat:year + age70:year,
data = dat,
family = binomial(link = "logit"))
logistic.display(logit.traditional)$table[1,, drop = F]
OR lower95ci upper95ci Pr(>|Z|)
tpa 1.914 1.27 2.885 0.001924
a. Replace the confounder set in your outcome model with the propensity score (i.e., linear, quintiles, deciles etc). 2. Regression model: run an outcome model of the association between TPA and death controlling for confounding by including the propensity score.
## Linear relationship assumed
logit.linear.ps.adjusted <- glm(death ~ tpa + pscore,
data = dat,
family = binomial(link = "logit"))
logistic.display(logit.linear.ps.adjusted)
Logistic regression predicting death
crude OR(95%CI) adj. OR(95%CI) P(Wald's test) P(LR-test)
tpa: 1 vs 0 2.76 (1.97,3.87) 1.71 (1.11,2.63) 0.016 0.019
pscore (cont. var.) 9.61 (4.89,18.88) 5.24 (2.21,12.41) < 0.001 < 0.001
Log-likelihood = -2023.0324
No. of observations = 9146
AIC value = 4052.0647
## curvilinear relationship assumed
logit.curvilinear.ps.adjusted <- glm(death ~ tpa + pscore + I(pscore^2),
data = dat,
family = binomial(link = "logit"))
logistic.display(logit.curvilinear.ps.adjusted)
OR lower95ci upper95ci Pr(>|Z|)
tpa 1.653413 1.09112590 2.50546 0.017729754
pscore 147.758293 18.05759275 1209.04893 0.000003193
I(pscore^2) 0.001006 0.00001467 0.06899 0.001376494
## Create quintiles
quintiles <- quantile(dat$pscore, prob = seq(from = 0,to = 1, by = 0.2), na.rm = T)
dat$pscoreq <- cut(dat$pscore, breaks = quintiles, labels = 1:5, include.lowest = T)
## Perform logistic regression within each stratum
library(plyr)
logistic.stratified <- dlply(.data = dat, .variables = "pscoreq",
.fun = function(DF) {
glm(death ~ tpa, data = DF, family = binomial)
})
a. What is the association between t-PA and death in each stratum?
## Get OR in each stratum
res.strata.logit <- lapply(logistic.stratified[1:5], function(X){
logistic.display(X)$table[1,]
})
print(do.call(rbind, res.strata.logit), quote = F)
OR(95%CI) P(Wald's test) P(LR-test)
1 0 (0,Inf) 0.984 0.771
2 0 (0,Inf) 0.986 0.701
3 7.97 (1.53,41.67) 0.014 0.039
4 1.94 (0.43,8.71) 0.385 0.422
5 1.72 (1.17,2.53) 0.006 0.008
## In the first two strata, no exposed cases are present, thus OR of 0
xtabs.stratified <- dlply(.data = dat, .variables = "pscoreq",
.fun = function(DF) {
xtabs(~ tpa + death, data = DF)[2:1, 2:1]
})
## Show 2x2 table in each stratum
xtabs.stratified[1:5]
$`1`
death
tpa 1 0
1 0 1
0 76 1754
$`2`
death
tpa 1 0
1 0 2
0 66 1761
$`3`
death
tpa 1 0
1 2 5
0 87 1734
$`4`
death
tpa 1 0
1 2 13
0 133 1681
$`5`
death
tpa 1 0
1 38 236
0 133 1422
b. What is the overall effect estimate?
The MH-OR is probably not useful in this situation as the effects are heterogenous across PS strata.
## Transform list to array
array.stratified <- laply(xtabs.stratified[1:5], invisible)
## Array permutation to change dimention order
array.stratified <- aperm(array.stratified, c(2,3,1))
## Rename dimensions
dimnames(array.stratified) <- list(tpa = c("1", "0"), death = c("1", "0"), stratum = c("1", "2", "3", "4", "5"))
## OR both stratum-specific and Mantel-Haenszel
epicalc::mhor(mhtable = array.stratified)
Stratified analysis by stratum
OR lower lim. upper lim. P value
stratum 1 0.00 0.000 888.11 1.00000
stratum 2 0.00 0.000 143.14 1.00000
stratum 3 7.95 0.747 49.44 0.04196
stratum 4 1.94 0.211 8.72 0.30501
stratum 5 1.72 1.137 2.56 0.00919
M-H combined 1.80 1.252 2.60 0.00129
M-H Chi2(1) = 10.36 , P value = 0.001
One or more cells of the stratified table == 0.
Homogeneity test not computable.
Graph not drawn
Graphical representation of balance within strata using PSAgraphics
http://www.jstatsoft.org/v29/i06/paper
## Complete cases only for this package
dat.ps.complete <- dat[complete.cases(dat),]
nrow(dat.ps.complete)
[1] 7410
## Very few exposed with low PS
xtabs(~ pscoreq + tpa, data = dat.ps.complete)
tpa
pscoreq 0 1
1 1491 0
2 1487 1
3 1456 7
4 1471 14
5 1255 228
## Visualization using PSAgraphics
library(PSAgraphics)
## Age compared (Compare balance graphically of a continuous covariate as part of a PSA)
box.psa(continuous = dat.ps.complete$age,
treatment = dat.ps.complete$tpa,
strata = dat.ps.complete$pscoreq)
## Atrial fibrillation compared (Compare balance graphically of a categorical covariate as part of a PSA)
cat.psa(categorical = dat.ps.complete$afib,
treatment = dat.ps.complete$tpa,
strata = dat.ps.complete$pscoreq)
$`treatment:stratum.proportions`
0:1 1:1 0:2 1:2 0:3 1:3 0:4 1:4 0:5 1:5
0 0.881 NaN 0.841 1 0.792 1 0.704 0.857 0.68 0.627
1 0.119 NaN 0.159 0 0.208 0 0.296 0.143 0.32 0.373
## Generates a Propensity Score Assessment Plot (does not work)
## circ.psa(response = (dat.ps.complete$death == 1),
## treatment = dat.ps.complete$tpa,
## strata = dat.ps.complete$pscoreq)
Matching using MatchIt package
Reference
full and optimal will be done through optmatch package.
method: This argument specifies a matching method. Currently,
‘"exact"’ (exact matching), ‘"full"’ (full matching),
‘"genetic"’ (genetic matching), ‘"nearest"’ (nearest neighbor
matching), ‘"optimal"’ (optimal matching), and ‘"subclass"’
(subclassification) are available. The default is
‘"nearest"’. Note that within each of these matching methods,
_MatchIt_ offers a variety of options.
## Full syntax
matchit(formula, data = NULL, discard = 0, exact = FALSE,
replace = FALSE, ratio = 1, model = "logit", reestimate = FALSE,
nearest = TRUE, m.order = 2, caliper = 0, calclosest = FALSE,
mahvars = NULL, subclass = 0, sub.by = "treat", counter = TRUE,
full = FALSE, full.options = list(), ...)
Propensity score matching using MatchIt
Most of these methods (such as logistic or probit regression) define the distance by first estimating the propensity score, defined as the probability of receiving treatment, conditional on the covariates. ( http://r.iq.harvard.edu/docs/matchit/2.4-20/matchit.pdf 4.1.0.2.2 Additional Arguments for Specification of Distance Measures)
## Load MatchIt by Dr. Gary King
library(MatchIt)
##
## MatchIt (Version 2.4-20, built: 2011-10-24)
## Please refer to http://gking.harvard.edu/matchit for full documentation
## or help.matchit() for help with commands supported by MatchIt.
##
## matchit() takes formula for propensity score, not propensity score itself
out.matchit <- matchit(## Give formula for propensity score model
formula = tpa ~ age5 + afib + aphasia + cardiac + gender + htn + hyperchol + icu + living + rankpre + residentq + referral + paresis + prevstroke + sumbart90 + transport + timeintcat + vigilanz + ward + timeintcat:year + age70:year,
data = dat.ps.complete, # Cannot have missing values. Use complete dataset
method = "nearest", # nearest is the same as greedy match
distance = "logit", # Distance defined by usual propensity score from logistic model
ratio = 1 # 1:1 match is the default
)
## 1 tx:4 control matching
out.matchit4 <- matchit(## Give formula for propensity score model
formula = tpa ~ age5 + afib + aphasia + cardiac + gender + htn + hyperchol + icu + living + rankpre + residentq + referral + paresis + prevstroke + sumbart90 + transport + timeintcat + vigilanz + ward + timeintcat:year + age70:year,
data = dat.ps.complete, # Cannot have missing values. Use complete dataset
method = "nearest", # nearest is the same as greedy match
distance = "logit", # Distance defined by usual propensity score from logistic model
ratio = 4 # 1:4 match
)
## 1:1 match. There are only 250 treated, thus, 250 contols were chosen
out.matchit
Call:
matchit(formula = tpa ~ age5 + afib + aphasia + cardiac + gender +
htn + hyperchol + icu + living + rankpre + residentq + referral +
paresis + prevstroke + sumbart90 + transport + timeintcat +
vigilanz + ward + timeintcat:year + age70:year, data = dat.ps.complete,
method = "nearest", distance = "logit", ratio = 1)
Sample sizes:
Control Treated
All 7160 250
Matched 250 250
Unmatched 6910 0
Discarded 0 0
## out.matchit$match.matrix # Stratum indicator is in $match.matrix
## Output Matched Data Sets for further analysis
dat.matchit <- match.data(out.matchit)
## 1:4 match. There are only 250 treated, thus, 1000 contols were chosen
out.matchit4
Call:
matchit(formula = tpa ~ age5 + afib + aphasia + cardiac + gender +
htn + hyperchol + icu + living + rankpre + residentq + referral +
paresis + prevstroke + sumbart90 + transport + timeintcat +
vigilanz + ward + timeintcat:year + age70:year, data = dat.ps.complete,
method = "nearest", distance = "logit", ratio = 4)
Sample sizes:
Control Treated
All 7160 250
Matched 1000 250
Unmatched 6160 0
Discarded 0 0
## out.matchit4$match.matrix # Stratum indicator is in $match.matrix
## Output Matched Data Sets for further analysis
dat.matchit4 <- match.data(out.matchit4)
Examine balance
## See balance of all covariates (not run)
## plot(out.matchit)
## Propensity score jittered dot plots 1:1
plot(out.matchit, type = "jitter", interactive = F)
## Propensity score jittered dot plots 1:4
plot(out.matchit4, type = "jitter", interactive = F)
## Propensity score histogram 1:1
plot(out.matchit, type = "hist")
## Propensity score histogram 1:4
plot(out.matchit4, type = "hist")
## Density plot 1:1
library(ggplot2)
ggplot(data = dat.matchit, mapping = aes(x = distance, fill = factor(tpa))) +
layer(geom = "density", stat = "density", alpha = 1/3) +
theme_bw() +
theme(legend.key = element_blank()) +
labs(title = "1:1 Matched cohort")
## Density plot 1:4
ggplot(data = dat.matchit4, mapping = aes(x = distance, fill = factor(tpa), y = ..count..)) +
layer(geom = "density", stat = "density", alpha = 1/3) +
theme_bw() +
theme(legend.key = element_blank()) +
labs(title = "1:4 Matched cohort")
## Summary of balance before and after matching 1:1
summary(out.matchit)
Call:
matchit(formula = tpa ~ age5 + afib + aphasia + cardiac + gender +
htn + hyperchol + icu + living + rankpre + residentq + referral +
paresis + prevstroke + sumbart90 + transport + timeintcat +
vigilanz + ward + timeintcat:year + age70:year, data = dat.ps.complete,
method = "nearest", distance = "logit", ratio = 1)
Summary of balance for all data:
Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance 0.302 0.024 0.070 0.277 0.298 0.276 0.559
age51 0.164 0.084 0.278 0.080 0.000 0.080 1.000
age52 0.100 0.113 0.317 -0.013 0.000 0.016 1.000
age53 0.168 0.115 0.319 0.053 0.000 0.052 1.000
age54 0.132 0.139 0.346 -0.007 0.000 0.008 1.000
age55 0.148 0.166 0.372 -0.018 0.000 0.020 1.000
age56 0.228 0.179 0.384 0.049 0.000 0.048 1.000
age57 0.040 0.102 0.303 -0.062 0.000 0.064 1.000
age58 0.020 0.101 0.301 -0.081 0.000 0.084 1.000
afib1 0.348 0.217 0.412 0.131 0.000 0.132 1.000
aphasia1 0.284 0.238 0.426 0.046 0.000 0.044 1.000
cardiac 0.284 0.271 0.444 0.013 0.000 0.012 1.000
gender1 0.568 0.486 0.500 0.082 0.000 0.080 1.000
gender2 0.416 0.431 0.495 -0.015 0.000 0.016 1.000
htn 0.700 0.731 0.443 -0.031 0.000 0.032 1.000
hyperchol 0.280 0.311 0.463 -0.031 0.000 0.032 1.000
icu 0.988 0.949 0.219 0.039 0.000 0.040 1.000
living1 0.120 0.204 0.403 -0.084 0.000 0.084 1.000
living2 0.736 0.611 0.487 0.125 0.000 0.124 1.000
living3 0.020 0.039 0.194 -0.019 0.000 0.020 1.000
rankpre1 0.800 0.568 0.495 0.232 0.000 0.232 1.000
rankpre2 0.160 0.206 0.405 -0.046 0.000 0.048 1.000
rankpre3 0.020 0.090 0.286 -0.070 0.000 0.072 1.000
rankpre5 0.016 0.093 0.291 -0.077 0.000 0.080 1.000
residentq2 0.152 0.242 0.428 -0.090 0.000 0.092 1.000
residentq3 0.292 0.239 0.426 0.053 0.000 0.052 1.000
residentq4 0.264 0.223 0.416 0.041 0.000 0.040 1.000
referral1 0.244 0.198 0.399 0.046 0.000 0.044 1.000
referral2 0.104 0.341 0.474 -0.237 0.000 0.236 1.000
referral3 0.520 0.241 0.428 0.279 0.000 0.280 1.000
referral4 0.092 0.136 0.342 -0.044 0.000 0.044 1.000
paresis1 0.028 0.092 0.290 -0.064 0.000 0.064 1.000
paresis2 0.908 0.587 0.492 0.321 0.000 0.320 1.000
paresis3 0.004 0.012 0.110 -0.008 0.000 0.008 1.000
prevstroke 0.124 0.172 0.377 -0.048 0.000 0.048 1.000
sumbart901 0.080 0.313 0.464 -0.233 0.000 0.232 1.000
transport1 0.024 0.261 0.439 -0.237 0.000 0.236 1.000
transport2 0.628 0.230 0.421 0.398 0.000 0.396 1.000
transport3 0.324 0.388 0.487 -0.064 0.000 0.064 1.000
transport4 0.016 0.034 0.181 -0.018 0.000 0.020 1.000
timeintcat2 0.812 0.182 0.386 0.630 1.000 0.628 1.000
timeintcat3 0.116 0.122 0.327 -0.006 0.000 0.008 1.000
timeintcat4 0.024 0.422 0.494 -0.398 0.000 0.400 1.000
vigilanz1 0.004 0.021 0.142 -0.017 0.000 0.020 1.000
vigilanz2 0.200 0.103 0.304 0.097 0.000 0.096 1.000
vigilanz3 0.708 0.769 0.421 -0.061 0.000 0.060 1.000
ward1 0.024 0.206 0.405 -0.182 0.000 0.184 1.000
ward2 0.728 0.641 0.480 0.087 0.000 0.088 1.000
ward3 0.204 0.047 0.212 0.157 0.000 0.156 1.000
timeintcat1:year 0.148 0.777 1.388 -0.629 0.000 0.636 4.000
timeintcat2:year 2.412 0.507 1.171 1.905 2.000 1.900 4.000
timeintcat3:year 0.264 0.366 1.048 -0.102 0.000 0.112 2.000
timeintcat4:year 0.068 1.191 1.558 -1.123 0.000 1.124 4.000
year:age701 1.140 1.050 1.528 0.090 0.000 0.088 1.000
Summary of balance for matched data:
Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance 0.302 0.280 0.180 0.022 0.001 0.022 0.165
age51 0.164 0.192 0.395 -0.028 0.000 0.028 1.000
age52 0.100 0.088 0.284 0.012 0.000 0.012 1.000
age53 0.168 0.176 0.382 -0.008 0.000 0.008 1.000
age54 0.132 0.152 0.360 -0.020 0.000 0.020 1.000
age55 0.148 0.132 0.339 0.016 0.000 0.016 1.000
age56 0.228 0.224 0.418 0.004 0.000 0.004 1.000
age57 0.040 0.024 0.153 0.016 0.000 0.016 1.000
age58 0.020 0.012 0.109 0.008 0.000 0.008 1.000
afib1 0.348 0.336 0.473 0.012 0.000 0.012 1.000
aphasia1 0.284 0.296 0.457 -0.012 0.000 0.012 1.000
cardiac 0.284 0.336 0.473 -0.052 0.000 0.052 1.000
gender1 0.568 0.624 0.485 -0.056 0.000 0.056 1.000
gender2 0.416 0.368 0.483 0.048 0.000 0.048 1.000
htn 0.700 0.632 0.483 0.068 0.000 0.068 1.000
hyperchol 0.280 0.248 0.433 0.032 0.000 0.032 1.000
icu 0.988 0.988 0.109 0.000 0.000 0.000 0.000
living1 0.120 0.112 0.316 0.008 0.000 0.008 1.000
living2 0.736 0.732 0.444 0.004 0.000 0.004 1.000
living3 0.020 0.024 0.153 -0.004 0.000 0.004 1.000
rankpre1 0.800 0.816 0.388 -0.016 0.000 0.016 1.000
rankpre2 0.160 0.156 0.364 0.004 0.000 0.004 1.000
rankpre3 0.020 0.016 0.126 0.004 0.000 0.004 1.000
rankpre5 0.016 0.008 0.089 0.008 0.000 0.008 1.000
residentq2 0.152 0.180 0.385 -0.028 0.000 0.028 1.000
residentq3 0.292 0.232 0.423 0.060 0.000 0.060 1.000
residentq4 0.264 0.284 0.452 -0.020 0.000 0.020 1.000
referral1 0.244 0.236 0.425 0.008 0.000 0.008 1.000
referral2 0.104 0.084 0.278 0.020 0.000 0.020 1.000
referral3 0.520 0.512 0.501 0.008 0.000 0.008 1.000
referral4 0.092 0.116 0.321 -0.024 0.000 0.024 1.000
paresis1 0.028 0.020 0.140 0.008 0.000 0.008 1.000
paresis2 0.908 0.892 0.311 0.016 0.000 0.016 1.000
paresis3 0.004 0.016 0.126 -0.012 0.000 0.012 1.000
prevstroke 0.124 0.100 0.301 0.024 0.000 0.024 1.000
sumbart901 0.080 0.132 0.339 -0.052 0.000 0.052 1.000
transport1 0.024 0.020 0.140 0.004 0.000 0.004 1.000
transport2 0.628 0.612 0.488 0.016 0.000 0.016 1.000
transport3 0.324 0.324 0.469 0.000 0.000 0.000 0.000
transport4 0.016 0.032 0.176 -0.016 0.000 0.016 1.000
timeintcat2 0.812 0.816 0.388 -0.004 0.000 0.004 1.000
timeintcat3 0.116 0.104 0.306 0.012 0.000 0.012 1.000
timeintcat4 0.024 0.048 0.214 -0.024 0.000 0.024 1.000
vigilanz1 0.004 0.020 0.140 -0.016 0.000 0.016 1.000
vigilanz2 0.200 0.212 0.410 -0.012 0.000 0.012 1.000
vigilanz3 0.708 0.692 0.463 0.016 0.000 0.016 1.000
ward1 0.024 0.032 0.176 -0.008 0.000 0.008 1.000
ward2 0.728 0.728 0.446 0.000 0.000 0.000 0.000
ward3 0.204 0.200 0.401 0.004 0.000 0.004 1.000
timeintcat1:year 0.148 0.104 0.592 0.044 0.000 0.044 2.000
timeintcat2:year 2.412 2.424 1.488 -0.012 0.000 0.060 1.000
timeintcat3:year 0.264 0.240 0.791 0.024 0.000 0.032 1.000
timeintcat4:year 0.068 0.132 0.617 -0.064 0.000 0.072 3.000
year:age701 1.140 1.160 1.585 -0.020 0.000 0.044 1.000
Percent Balance Improvement:
Mean Diff. eQQ Med eQQ Mean eQQ Max
distance 92.034 99.59 91.91 70.41
age51 64.904 0.00 65.00 0.00
age52 7.613 0.00 25.00 0.00
age53 84.882 0.00 84.61 0.00
age54 -170.802 0.00 -150.00 0.00
age55 10.022 0.00 20.00 0.00
age56 91.758 0.00 91.67 0.00
age57 74.291 0.00 75.00 0.00
age58 90.104 0.00 90.48 0.00
afib1 90.847 0.00 90.91 0.00
aphasia1 73.998 0.00 72.73 0.00
cardiac -286.064 0.00 -333.33 0.00
gender1 31.446 0.00 30.00 0.00
gender2 -225.948 0.00 -200.00 0.00
htn -119.315 0.00 -112.50 0.00
hyperchol -4.051 0.00 0.00 0.00
icu 100.000 0.00 100.00 100.00
living1 90.434 0.00 90.48 0.00
living2 96.792 0.00 96.77 0.00
living3 79.064 0.00 80.00 0.00
rankpre1 93.111 0.00 93.10 0.00
rankpre2 91.332 0.00 91.67 0.00
rankpre3 94.258 0.00 94.44 0.00
rankpre5 89.631 0.00 90.00 0.00
residentq2 68.757 0.00 69.56 0.00
residentq3 -12.543 0.00 -15.38 0.00
residentq4 51.166 0.00 50.00 0.00
referral1 82.539 0.00 81.82 0.00
referral2 91.573 0.00 91.53 0.00
referral3 97.135 0.00 97.14 0.00
referral4 44.972 0.00 45.45 0.00
paresis1 87.589 0.00 87.50 0.00
paresis2 95.011 0.00 95.00 0.00
paresis3 -47.224 0.00 -50.00 0.00
prevstroke 50.070 0.00 50.00 0.00
sumbart901 77.668 0.00 77.59 0.00
transport1 98.310 0.00 98.31 0.00
transport2 95.975 0.00 95.96 0.00
transport3 100.000 0.00 100.00 100.00
transport4 10.807 0.00 20.00 0.00
timeintcat2 99.365 100.00 99.36 0.00
timeintcat3 -112.463 0.00 -50.00 0.00
timeintcat4 93.975 0.00 94.00 0.00
vigilanz1 3.211 0.00 20.00 0.00
vigilanz2 87.584 0.00 87.50 0.00
vigilanz3 73.947 0.00 73.33 0.00
ward1 95.608 0.00 95.65 0.00
ward2 100.000 0.00 100.00 100.00
ward3 97.449 0.00 97.44 0.00
timeintcat1:year 93.007 0.00 93.08 50.00
timeintcat2:year 99.370 100.00 96.84 75.00
timeintcat3:year 76.420 0.00 71.43 50.00
timeintcat4:year 94.300 0.00 93.59 25.00
year:age701 77.709 0.00 50.00 0.00
Sample sizes:
Control Treated
All 7160 250
Matched 250 250
Unmatched 6910 0
Discarded 0 0
## Summary of balance before and after matching 1:4
summary(out.matchit4)
Call:
matchit(formula = tpa ~ age5 + afib + aphasia + cardiac + gender +
htn + hyperchol + icu + living + rankpre + residentq + referral +
paresis + prevstroke + sumbart90 + transport + timeintcat +
vigilanz + ward + timeintcat:year + age70:year, data = dat.ps.complete,
method = "nearest", distance = "logit", ratio = 4)
Summary of balance for all data:
Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance 0.302 0.024 0.070 0.277 0.298 0.276 0.559
age51 0.164 0.084 0.278 0.080 0.000 0.080 1.000
age52 0.100 0.113 0.317 -0.013 0.000 0.016 1.000
age53 0.168 0.115 0.319 0.053 0.000 0.052 1.000
age54 0.132 0.139 0.346 -0.007 0.000 0.008 1.000
age55 0.148 0.166 0.372 -0.018 0.000 0.020 1.000
age56 0.228 0.179 0.384 0.049 0.000 0.048 1.000
age57 0.040 0.102 0.303 -0.062 0.000 0.064 1.000
age58 0.020 0.101 0.301 -0.081 0.000 0.084 1.000
afib1 0.348 0.217 0.412 0.131 0.000 0.132 1.000
aphasia1 0.284 0.238 0.426 0.046 0.000 0.044 1.000
cardiac 0.284 0.271 0.444 0.013 0.000 0.012 1.000
gender1 0.568 0.486 0.500 0.082 0.000 0.080 1.000
gender2 0.416 0.431 0.495 -0.015 0.000 0.016 1.000
htn 0.700 0.731 0.443 -0.031 0.000 0.032 1.000
hyperchol 0.280 0.311 0.463 -0.031 0.000 0.032 1.000
icu 0.988 0.949 0.219 0.039 0.000 0.040 1.000
living1 0.120 0.204 0.403 -0.084 0.000 0.084 1.000
living2 0.736 0.611 0.487 0.125 0.000 0.124 1.000
living3 0.020 0.039 0.194 -0.019 0.000 0.020 1.000
rankpre1 0.800 0.568 0.495 0.232 0.000 0.232 1.000
rankpre2 0.160 0.206 0.405 -0.046 0.000 0.048 1.000
rankpre3 0.020 0.090 0.286 -0.070 0.000 0.072 1.000
rankpre5 0.016 0.093 0.291 -0.077 0.000 0.080 1.000
residentq2 0.152 0.242 0.428 -0.090 0.000 0.092 1.000
residentq3 0.292 0.239 0.426 0.053 0.000 0.052 1.000
residentq4 0.264 0.223 0.416 0.041 0.000 0.040 1.000
referral1 0.244 0.198 0.399 0.046 0.000 0.044 1.000
referral2 0.104 0.341 0.474 -0.237 0.000 0.236 1.000
referral3 0.520 0.241 0.428 0.279 0.000 0.280 1.000
referral4 0.092 0.136 0.342 -0.044 0.000 0.044 1.000
paresis1 0.028 0.092 0.290 -0.064 0.000 0.064 1.000
paresis2 0.908 0.587 0.492 0.321 0.000 0.320 1.000
paresis3 0.004 0.012 0.110 -0.008 0.000 0.008 1.000
prevstroke 0.124 0.172 0.377 -0.048 0.000 0.048 1.000
sumbart901 0.080 0.313 0.464 -0.233 0.000 0.232 1.000
transport1 0.024 0.261 0.439 -0.237 0.000 0.236 1.000
transport2 0.628 0.230 0.421 0.398 0.000 0.396 1.000
transport3 0.324 0.388 0.487 -0.064 0.000 0.064 1.000
transport4 0.016 0.034 0.181 -0.018 0.000 0.020 1.000
timeintcat2 0.812 0.182 0.386 0.630 1.000 0.628 1.000
timeintcat3 0.116 0.122 0.327 -0.006 0.000 0.008 1.000
timeintcat4 0.024 0.422 0.494 -0.398 0.000 0.400 1.000
vigilanz1 0.004 0.021 0.142 -0.017 0.000 0.020 1.000
vigilanz2 0.200 0.103 0.304 0.097 0.000 0.096 1.000
vigilanz3 0.708 0.769 0.421 -0.061 0.000 0.060 1.000
ward1 0.024 0.206 0.405 -0.182 0.000 0.184 1.000
ward2 0.728 0.641 0.480 0.087 0.000 0.088 1.000
ward3 0.204 0.047 0.212 0.157 0.000 0.156 1.000
timeintcat1:year 0.148 0.777 1.388 -0.629 0.000 0.636 4.000
timeintcat2:year 2.412 0.507 1.171 1.905 2.000 1.900 4.000
timeintcat3:year 0.264 0.366 1.048 -0.102 0.000 0.112 2.000
timeintcat4:year 0.068 1.191 1.558 -1.123 0.000 1.124 4.000
year:age701 1.140 1.050 1.528 0.090 0.000 0.088 1.000
Summary of balance for matched data:
Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance 0.302 0.147 0.133 0.155 0.202 0.155 0.272
age51 0.164 0.127 0.333 0.037 0.000 0.036 1.000
age52 0.100 0.105 0.307 -0.005 0.000 0.004 1.000
age53 0.168 0.116 0.320 0.052 0.000 0.052 1.000
age54 0.132 0.158 0.365 -0.026 0.000 0.028 1.000
age55 0.148 0.185 0.388 -0.037 0.000 0.036 1.000
age56 0.228 0.211 0.408 0.017 0.000 0.016 1.000
age57 0.040 0.063 0.243 -0.023 0.000 0.024 1.000
age58 0.020 0.035 0.184 -0.015 0.000 0.016 1.000
afib1 0.348 0.293 0.455 0.055 0.000 0.056 1.000
aphasia1 0.284 0.258 0.438 0.026 0.000 0.024 1.000
cardiac 0.284 0.269 0.444 0.015 0.000 0.016 1.000
gender1 0.568 0.529 0.499 0.039 0.000 0.040 1.000
gender2 0.416 0.443 0.497 -0.027 0.000 0.028 1.000
htn 0.700 0.699 0.459 0.001 0.000 0.004 1.000
hyperchol 0.280 0.272 0.445 0.008 0.000 0.008 1.000
icu 0.988 0.976 0.153 0.012 0.000 0.012 1.000
living1 0.120 0.139 0.346 -0.019 0.000 0.020 1.000
living2 0.736 0.714 0.452 0.022 0.000 0.024 1.000
living3 0.020 0.016 0.126 0.004 0.000 0.004 1.000
rankpre1 0.800 0.717 0.451 0.083 0.000 0.084 1.000
rankpre2 0.160 0.206 0.405 -0.046 0.000 0.048 1.000
rankpre3 0.020 0.037 0.189 -0.017 0.000 0.016 1.000
rankpre5 0.016 0.023 0.150 -0.007 0.000 0.008 1.000
residentq2 0.152 0.209 0.407 -0.057 0.000 0.056 1.000
residentq3 0.292 0.233 0.423 0.059 0.000 0.060 1.000
residentq4 0.264 0.268 0.443 -0.004 0.000 0.004 1.000
referral1 0.244 0.210 0.408 0.034 0.000 0.032 1.000
referral2 0.104 0.151 0.358 -0.047 0.000 0.048 1.000
referral3 0.520 0.455 0.498 0.065 0.000 0.064 1.000
referral4 0.092 0.128 0.334 -0.036 0.000 0.036 1.000
paresis1 0.028 0.036 0.186 -0.008 0.000 0.008 1.000
paresis2 0.908 0.833 0.373 0.075 0.000 0.076 1.000
paresis3 0.004 0.006 0.077 -0.002 0.000 0.004 1.000
prevstroke 0.124 0.148 0.355 -0.024 0.000 0.024 1.000
sumbart901 0.080 0.138 0.345 -0.058 0.000 0.060 1.000
transport1 0.024 0.043 0.203 -0.019 0.000 0.020 1.000
transport2 0.628 0.529 0.499 0.099 0.000 0.100 1.000
transport3 0.324 0.381 0.486 -0.057 0.000 0.056 1.000
transport4 0.016 0.032 0.176 -0.016 0.000 0.016 1.000
timeintcat2 0.812 0.691 0.462 0.121 0.000 0.124 1.000
timeintcat3 0.116 0.205 0.404 -0.089 0.000 0.088 1.000
timeintcat4 0.024 0.030 0.171 -0.006 0.000 0.008 1.000
vigilanz1 0.004 0.007 0.083 -0.003 0.000 0.004 1.000
vigilanz2 0.200 0.177 0.382 0.023 0.000 0.024 1.000
vigilanz3 0.708 0.721 0.449 -0.013 0.000 0.012 1.000
ward1 0.024 0.043 0.203 -0.019 0.000 0.020 1.000
ward2 0.728 0.787 0.410 -0.059 0.000 0.056 1.000
ward3 0.204 0.121 0.326 0.083 0.000 0.084 1.000
timeintcat1:year 0.148 0.224 0.842 -0.076 0.000 0.080 2.000
timeintcat2:year 2.412 2.041 1.616 0.371 0.000 0.372 2.000
timeintcat3:year 0.264 0.537 1.170 -0.273 0.000 0.272 2.000
timeintcat4:year 0.068 0.077 0.466 -0.009 0.000 0.028 2.000
year:age701 1.140 1.129 1.573 0.011 0.000 0.044 1.000
Percent Balance Improvement:
Mean Diff. eQQ Med eQQ Mean eQQ Max
distance 44.083 32.26 43.84 51.34
age51 53.624 0.00 55.00 0.00
age52 61.505 0.00 75.00 0.00
age53 1.731 0.00 0.00 0.00
age54 -252.042 0.00 -250.00 0.00
age55 -108.074 0.00 -80.00 0.00
age56 64.971 0.00 66.67 0.00
age57 63.043 0.00 62.50 0.00
age58 81.444 0.00 80.95 0.00
afib1 58.047 0.00 57.58 0.00
aphasia1 43.663 0.00 45.45 0.00
cardiac -11.365 0.00 -33.33 0.00
gender1 52.257 0.00 50.00 0.00
gender2 -83.346 0.00 -75.00 0.00
htn 96.775 0.00 87.50 0.00
hyperchol 73.987 0.00 75.00 0.00
icu 68.991 0.00 70.00 0.00
living1 77.281 0.00 76.19 0.00
living2 82.356 0.00 80.64 0.00
living3 79.064 0.00 80.00 0.00
rankpre1 64.265 0.00 63.79 0.00
rankpre2 0.315 0.00 0.00 0.00
rankpre3 75.597 0.00 77.78 0.00
rankpre5 90.928 0.00 90.00 0.00
residentq2 36.398 0.00 39.13 0.00
residentq3 -10.668 0.00 -15.38 0.00
residentq4 90.233 0.00 90.00 0.00
referral1 25.790 0.00 27.27 0.00
referral2 80.197 0.00 79.66 0.00
referral3 76.721 0.00 77.14 0.00
referral4 17.459 0.00 18.18 0.00
paresis1 87.589 0.00 87.50 0.00
paresis2 76.614 0.00 76.25 0.00
paresis3 75.463 0.00 50.00 0.00
prevstroke 50.070 0.00 50.00 0.00
sumbart901 75.091 0.00 74.14 0.00
transport1 91.975 0.00 91.53 0.00
transport2 75.098 0.00 74.75 0.00
transport3 10.922 0.00 12.50 0.00
transport4 10.807 0.00 20.00 0.00
timeintcat2 80.803 100.00 80.25 0.00
timeintcat3 -1475.767 0.00 -1000.00 0.00
timeintcat4 98.494 0.00 98.00 0.00
vigilanz1 81.852 0.00 80.00 0.00
vigilanz2 76.202 0.00 75.00 0.00
vigilanz3 78.832 0.00 80.00 0.00
ward1 89.569 0.00 89.13 0.00
ward2 31.807 0.00 36.36 0.00
ward3 47.064 0.00 46.15 0.00
timeintcat1:year 87.922 0.00 87.42 50.00
timeintcat2:year 80.522 100.00 80.42 50.00
timeintcat3:year -168.220 0.00 -142.86 0.00
timeintcat4:year 99.198 0.00 97.51 50.00
year:age701 87.740 0.00 50.00 0.00
Sample sizes:
Control Treated
All 7160 250
Matched 1000 250
Unmatched 6160 0
Discarded 0 0
a. What is the association between t-PA and death after matching?
Unconditional logistic regression on matched sample
Conditional logistic regression is considered unnecessary ().
## Unconditional logistic regression using matched dataset
logistic.matched <- glm(death ~ tpa,
data = dat.matchit,
family = binomial(link = "logit"))
logistic.display(logistic.matched)
Logistic regression predicting death
OR(95%CI) P(Wald's test) P(LR-test)
tpa: 1 vs 0 1.78 (1,3.15) 0.049 0.046
Log-likelihood = -173.3488
No. of observations = 500
AIC value = 350.6976
## Using 1:4 match dataset
logistic.matched4 <- glm(death ~ tpa,
data = dat.matchit4,
family = binomial(link = "logit"))
logistic.display(logistic.matched4)
Logistic regression predicting death
OR(95%CI) P(Wald's test) P(LR-test)
tpa: 1 vs 0 1.8 (1.18,2.74) 0.006 0.008
Log-likelihood = -387.2768
No. of observations = 1250
AIC value = 778.5537
Matching example using nonrandom package
There are multiple ways to handle PS-matched datasets. But these results in similar results (Hosmer&Lemeshow, Applied Logistic Regression, 3rd).
## Load nonrandom
library(nonrandom)
## Propensity score estimation (pscore object returned)
dat.nonrandom <- pscore(formula = tpa ~ age5 + afib + aphasia + cardiac + gender + htn + hyperchol + icu + living + rankpre + residentq + referral + paresis + prevstroke + sumbart90 + transport + timeintcat + vigilanz + ward + timeintcat:year + age70:year,
data = dat.ps.complete,
family = "binomial",
name.pscore = "ps")
## Graphical check for propensity score distributions in treatment groups
plot(dat.nonrandom, main = "PS distribution")
## Actual Propensity score matching
res.ps.match <- ps.match(object = dat.nonrandom,
control.matched.by = matched.by,
who.treated = 1, # value 1 is treated
name.match.index = "match.index",
ratio = 1, # default 1:1 match
caliper = "logit",
x = 0.2,
givenTmatchingC = TRUE,
bestmatch.first = TRUE,
setseed = FALSE,
combine.output = TRUE)
## Actual Propensity score matching (1:4 Match)
res.ps.match4 <- ps.match(object = dat.nonrandom,
control.matched.by = matched.by,
who.treated = 1, # value 1 is treated
name.match.index = "match.index",
ratio = 4, # 1:4 match
caliper = "logit",
x = 0.2,
givenTmatchingC = TRUE,
bestmatch.first = TRUE,
setseed = FALSE,
combine.output = TRUE)
## Show summary
res.ps.match
Matched by: ps
Matching parameter:
Caliper size: 0.532
Ratio: 1.000
Who is treated?: 1.000
Matching information:
Untreated to treated?: TRUE
Best match?: TRUE
Matching data:
Number of treated obs.: 250
Number of matched treated obs.: 250
Number of untreated obs.: 7160
Number of matched untreated obs.: 250
Number of total matched obs.: 500
Number of not matched obs.: 6910
Number of matching sets: 250
Number of incomplete matching sets: 0
res.ps.match4
Matched by: ps
Matching parameter:
Caliper size: 0.532
Ratio: 4.000
Who is treated?: 1.000
Matching information:
Untreated to treated?: TRUE
Best match?: TRUE
Matching data:
Number of treated obs.: 250
Number of matched treated obs.: 250
Number of untreated obs.: 7160
Number of matched untreated obs.: 1000
Number of total matched obs.: 1250
Number of not matched obs.: 6160
Number of matching sets: 250
Number of incomplete matching sets: 0
## Make stratified dataset from propensity score ($data contains a dataset with staratum index)
even5strata <- ps.makestrata(res.ps.match, breaks = 5) # Five evenly spaced
even5strata
Stratified by: ps
Strata information:
Strata bounds n n (proportion)
1 [-0.000802,0.16] 6910 0.933
2 (0.16,0.321] 272 0.037
3 (0.321,0.482] 149 0.02
4 (0.482,0.643] 57 0.008
5 (0.643,0.804] 22 0.003
quintileStrata <- ps.makestrata(res.ps.match, breaks = quintiles) # Quintiles
quintileStrata
Stratified by: ps
Strata information:
Strata bounds n n (proportion)
1 [5.35e-07,0.00024] 1550 0.209
2 (0.00024,0.00109] 1507 0.203
3 (0.00109,0.0041] 1416 0.191
4 (0.0041,0.0223] 1468 0.198
5 (0.0223,0.819] 1469 0.198
### 1:1 match
## Using UNCONDITIONAL logistic regression
logistic.matched <- glm(death ~ tpa,
data = res.ps.match$data.matched,
family = binomial(link = "logit"))
logistic.display(logistic.matched)
Logistic regression predicting death
OR(95%CI) P(Wald's test) P(LR-test)
tpa: 1 vs 0 1.69 (0.96,2.97) 0.07 0.066
Log-likelihood = -175.7124
No. of observations = 500
AIC value = 355.4247
## Using CONDITIONAL logistic regression
library(survival)
clogit.matched <- clogit(formula = death ~ tpa + strata(match.index),
data = res.ps.match$data.matched,
method =c("exact", "approximate", "efron", "breslow")[1],
)
clogit.matched
Call:
clogit(formula = death ~ tpa + strata(match.index), data = res.ps.match$data.matched,
method = c("exact", "approximate", "efron", "breslow")[1],
)
coef exp(coef) se(coef) z p
tpa 0.544 1.72 0.296 1.83 0.067
Likelihood ratio test=3.49 on 1 df, p=0.0617 n= 500, number of events= 57
## Using GEE with exchangeable working correlation
library(geepack)
gee.matched <- geeglm(formula = death ~ tpa,
family = binomial(link = "logit"),
id = match.index,
data = res.ps.match$data.matched,
corstr = "exchangeable",
scale.fix = FALSE
)
summary(gee.matched)
Call:
geeglm(formula = death ~ tpa, family = binomial(link = "logit"),
data = res.ps.match$data.matched, id = match.index, corstr = "exchangeable",
scale.fix = FALSE)
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) -2.338 0.223 109.70 <2e-16 ***
tpa 0.523 0.288 3.29 0.07 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimated Scale Parameters:
Estimate Std.err
(Intercept) 1 0.263
Correlation: Structure = exchangeable Link = identity
Estimated Correlation Parameters:
Estimate Std.err
alpha 0 0
Number of clusters: 500 Maximum cluster size: 1
exp(coef(gee.matched))
(Intercept) tpa
0.0965 1.6871
### 1:4 match
## Using UNCONDITIONAL logistic regression
logistic.matched4 <- glm(death ~ tpa,
data = res.ps.match4$data.matched,
family = binomial(link = "logit"))
logistic.display(logistic.matched4)
Logistic regression predicting death
OR(95%CI) P(Wald's test) P(LR-test)
tpa: 1 vs 0 1.8 (1.18,2.74) 0.006 0.008
Log-likelihood = -387.2768
No. of observations = 1250
AIC value = 778.5537
## Using CONDITIONAL logistic regression
library(survival)
clogit.matched4 <- clogit(formula = death ~ tpa + strata(match.index),
data = res.ps.match4$data.matched,
method =c("exact", "approximate", "efron", "breslow")[1],
)
clogit.matched4
Call:
clogit(formula = death ~ tpa + strata(match.index), data = res.ps.match4$data.matched,
method = c("exact", "approximate", "efron", "breslow")[1],
)
coef exp(coef) se(coef) z p
tpa 0.605 1.83 0.219 2.76 0.0058
Likelihood ratio test=7.16 on 1 df, p=0.00747 n= 1250, number of events= 118
## Using GEE with exchangeable working correlation
library(geepack)
gee.matched4 <- geeglm(formula = death ~ tpa,
family = binomial(link = "logit"),
id = match.index,
data = res.ps.match4$data.matched,
corstr = "exchangeable",
scale.fix = FALSE
)
summary(gee.matched4)
Call:
geeglm(formula = death ~ tpa, family = binomial(link = "logit"),
data = res.ps.match4$data.matched, id = match.index, corstr = "exchangeable",
scale.fix = FALSE)
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) -2.402 0.115 439.09 <2e-16 ***
tpa 0.587 0.215 7.43 0.0064 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimated Scale Parameters:
Estimate Std.err
(Intercept) 1 0.207
Correlation: Structure = exchangeable Link = identity
Estimated Correlation Parameters:
Estimate Std.err
alpha 0.0905 0.0289
Number of clusters: 1249 Maximum cluster size: 2
exp(coef(gee.matched4))
(Intercept) tpa
0.0905 1.7982
a. Create weights for either the IPTW or the SMR-weighted analyses
dat.ps.complete <- within(dat.ps.complete, {
IPTW <- NA
IPTW[tpa == "1"] <- (1 / pscore)[tpa == "1"] # 1/PS for treated
IPTW[tpa == "0"] <- (1 / (1 - pscore))[tpa == "0"] # 1/(1-PS) for untreated
SMRW <- NA
SMRW[tpa == "1"] <- 1 # 1 for treated
SMRW[tpa == "0"] <- (pscore / (1 - pscore))[tpa == "0"] # PS/(1-PS) (PS odds) for untreated
})
## Check distribution
library(doBy)
summaryBy(IPTW ~ tpa, dat.ps.complete, FUN = summary)
tpa IPTW.Min. IPTW.1st Qu. IPTW.Median IPTW.Mean IPTW.3rd Qu. IPTW.Max.
1 0 1.00 1.00 1.00 1.03 1.01 3.84
2 1 1.23 2.29 3.51 30.30 10.20 1680.00
summaryBy(IPTW ~ tpa, dat.ps.complete, FUN = sum)
tpa IPTW.sum
1 0 7401
2 1 7576
summaryBy(SMRW ~ tpa, dat.ps.complete, FUN = summary)
tpa SMRW.Min. SMRW.1st Qu. SMRW.Median SMRW.Mean SMRW.3rd Qu. SMRW.Max.
1 0 0.000000535 0.000335 0.00183 0.0336 0.0108 2.84
2 1 1.000000000 1.000000 1.00000 1.0000 1.0000 1.00
summaryBy(SMRW ~ tpa, dat.ps.complete, FUN = sum)
tpa SMRW.sum
1 0 241
2 1 250
## Unweighted cohort (area 1 for each group, but these groups are different in sizes)
ggplot(data = dat.ps.complete,
mapping = aes(x = pscore, fill = factor(tpa), weight = NULL)) +
layer(geom = "density", stat = "density", alpha = 1/3) +
theme_bw() +
theme(legend.key = element_blank()) +
labs(title = "Unweighted cohort")
## IPTW
ggplot(data = dat.ps.complete,
mapping = aes(x = pscore, fill = factor(tpa), weight = IPTW)) +
layer(geom = "density", stat = "density", alpha = 1/3) +
theme_bw() +
theme(legend.key = element_blank()) +
labs(title = "IPTW")
## SMRW
ggplot(data = dat.ps.complete,
mapping = aes(x = pscore, fill = factor(tpa), weight = SMRW)) +
layer(geom = "density", stat = "density", alpha = 1/3) +
theme_bw() +
theme(legend.key = element_blank()) +
labs(title = "SMRW")
b. Use the weights to adjust for confounding
Inverse Probability of Treatment assignment Weight (IPTW)
Estimates averate treatment effect (ATE).
## IPTW logistic regression
logit.iptw <- glm(death ~ tpa,
data = dat.ps.complete,
weights = IPTW, # weighted analysis
family = binomial(link = "logit"))
## Load sandwich package for robust sandwich covariance matrix estimators
library(sandwich)
## Load lmtest package for coeftest
library(lmtest)
## Test using robust sandwich covariance matrix estimators
coeftest(logit.iptw, vcov = sandwich)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.7832 0.0516 -53.95 <2e-16 ***
tpa 1.5282 0.6793 2.25 0.024 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Show OR
exp(coef(logit.iptw))
(Intercept) tpa
0.0618 4.6097
SMR weight
Estimates averate treatment effect in the treated (ATT).
logit.smrw <- glm(death ~ tpa,
data = dat.ps.complete,
weights = SMRW, # weighted analysis
family = binomial(link = "logit"))
## Test using robust sandwich covariance matrix estimators
coeftest(logit.smrw, vcov = sandwich)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.248 0.161 -13.96 <2e-16 ***
tpa 0.433 0.243 1.78 0.075 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Show OR
exp(coef(logit.smrw))
(Intercept) tpa
0.106 1.542