In this section we will define some key samples to use later in the analysis:
The nonelderly sample are individuals aged 64 or younger as of the last vital status update.
The SCCS sample are individuals in the 12 SCCS states (e.g., there are 409 individuals who had a primary residence outside the 12 SCCS states when they were recruited into the sample; for now we’ll just exclude those from our anlayses.)
The Expansion State sample includes individuals from Kentucky, West Virginia, Arkansas and Louisiana. Note that a Restricted Expansion State sample is also defined excluding Louisiana respondants, as Louisiana expanded Medicaid while we were in the field with FU3.
The Medicare sample are individuals aged 65 or older as the last vital status update, or who self-reported Medicare coverage in any follow-up survey.
# Define Samples
xx <- xx0 %>%
mutate(ss.nonelderly.fu3 = as.integer(vitalstatus_age <= 62 &
r_1 == 1)) %>% # Non-Elderly
# These are the 12 SCCS states
mutate(ss.sccs = as.integer(
state %in% c(
"05",
"21",
"22",
"54",
"51",
"47",
"13",
"01",
"28",
"12",
"37",
"45"
)
)) %>%
mutate(
ss.expansion = as.integer(state %in% c("05", "21", "54", "22")),
# Expansion States (includes LA)
ss.expansion2 = as.integer(state %in% c("05", "21", "54")),
ss.nonmedicare = ifelse(vitalstatus_age <= 64 , 1, 0),
ss.medicare = as.integer(vitalstatus_age >= 65)
)
# NOTE: NEED TO FIND A WAY TO GET THE MEDICARE VARIABLES TO INTEGRATE THE FU1-3
# RESPONSES; RIGHT NOW SOME ARE BEING SET TO NA BECAUSE IF THE FU MEDICARE
# VARIABLE IS MISSING, THE WHOLE THING GETS SET TO MISSING.
# Merge in the Nonresponse weight
load(file.path(
"../../../data",
filedir,
paste0("sccs-fu3-nonresponse-weight", vers, ".Rdata")
))
xx <-
xx %>% left_join(nr.weight, "idnumber") %>% dplyr::mutate(w.FU3 = ifelse(is.na(w.FU3), 0, w.FU3))
source("./R/composite-sf12-score.R")
xx <-
xx %>% left_join(sf12f1, "idnumber") %>% left_join(sf12f3, "idnumber")
Our first objective is to come up with outcome measures of health status and mortality.
For this analysis we will rely on established composite measures of physical and mental health from the SF-12. The scoring instructions have been converted (and validated using the test data) from the SAS file for the SF-12 v2.0 composite score found here.
One critical item to note is that the “couplets” of SF12 questions in the survey led to substantial nonresponse for the second item in the couplet. This can be seen in the missing data summary plot below, where there are clear patterns of missingness; this is evident by the vertical “bands” of grey for the I2B, I3B and I4B SF-12 questions (i.e., the second item in the couplets).
What this means is that we end up with more missing data for our composite analyses than we will for the general analyses, since we cannot construct a composite for indivduals who did not fill out all 12 items.
ggplot_missing <- function(x){
x %>%
is.na %>%
reshape2::melt() %>%
ggplot(data = .,
aes(x = Var2,
y = Var1)) +
geom_raster(aes(fill = value)) +
scale_fill_grey(name = "",
labels = c("Present","Missing")) +
theme_minimal() +
theme(axis.text.x = element_text(angle=45, vjust=0.5)) +
labs(x = "Variables in Dataset",
y = "Rows / observations")
}
sf12f1 %>% select(starts_with("i")) %>% select(-idnumber) %>% filter(row_number() %in% sample(seq(nrow(sf12f1)),200)) %>% ggplot_missing()
The diagram shows how we derive our final sample (circles) for the composite health outcome anlayses. Note that anlayses of general health status (5 categories) will have more observations because they do not rely on fully-observed responses to the SF-12 questions. The other interesting thing to note is that the partial non-response for the SF-12 questions seems to be somewhat random: roughly 25% of FU1 respondants did not fully answer the 12 questions, and among full SF-12 respondants in FU1, roughly 25% did not fully respond in FU3. We will, obviously, need to do some more analytic work and decide what we want to do with these cases (e.g., imputation).
Note that the final analytic sample, for now, includes slightly fewer observations due to item nonresponse in the baseline covariates used to adjust the regressions.
The spike histograms below provide detail on the distribution of the composite physical and mental health measures for FU1 by state Medicaid expansion status.
xx %>% filter(ss.composite.health.nonmedicare==1) %>%
mutate(healthf1 =sf12_physicalf1,
healthf3 = sf12_physicalf3,
mentalf1 = sf12_mentalf1,
mentalf3 = sf12_mentalf3) %>%
filter(!is.na(healthf3) & !(is.na(healthf1))) %>%
select(ss.expansion,healthf3,healthf1) %>%
mutate(health.cs = healthf3-healthf1) %>%
with(histboxp(x=healthf1,group=ss.expansion,sd=TRUE,bins=200,xlab="Composite Physical Health Status (FU1)")) -> p
putHfig(p, "Spike histograms, means, quantiles, Gini's mean difference, and SD for SF-12 Composite Physical Health Status stratified by expansion status",
scap="Stratified spike histograms and quantiles for SF-12 Physical Health Status")
Spike histograms, means, quantiles, Gini’s mean difference, and SD for SF-12 Composite Physical Health Status stratified by expansion status
xx %>% filter(ss.composite.health.nonmedicare==1) %>%
mutate(healthf1 =sf12_physicalf1,
healthf3 = sf12_physicalf3,
mentalf1 = sf12_mentalf1,
mentalf3 = sf12_mentalf3) %>%
filter(!is.na(mentalf1) & !(is.na(mentalf3))) %>%
select(ss.expansion,mentalf3,mentalf1) %>%
mutate(mental.cs = mentalf3-mentalf1) %>%
with(histboxp(x=mentalf1,group=ss.expansion,sd=TRUE,bins=200,xlab="Composite Mental Health Status (FU1)")) -> p
putHfig(p, "Spike histograms, means, quantiles, Gini's mean difference, and SD for SF-12 Composite Mental Health Status stratified by expansion status",
scap="Stratified spike histograms and quantiles for SF-12 Mental Health Status")
Spike histograms, means, quantiles, Gini’s mean difference, and SD for SF-12 Composite Mental Health Status stratified by expansion status
putHfig(p, "Spike histograms, means, quantiles, Gini's mean difference, and SD for SF-12 Composite Physical Health Status stratified by expansion status",
scap="Stratified spike histograms and quantiles for SF-12 Physical Health Status")
<hr>
### <span style="font-family:Verdana;font-size:10px;">∟ </span><span style="font-family:Verdana;font-size:12px;color:MidnightBlue;">Stratified spike histograms and quantiles for SF-12 Physical Health Status</span>
<span style="font-family:Verdana;font-size:12px;color:MidnightBlue;">Spike histograms, means, quantiles, Gini's mean difference, and SD for SF-12 Composite Physical Health Status stratified by expansion status</span>
The following plots compare the distribution of the physical and mental health composite scores to analogous scores from a representative sample of Southern Region MEPS repsondants who are exact matched based on age and gender to the SCCS sample.
sccs.health.status <-xx %>% filter(ss.composite.health.nonmedicare==1) %>%
select(sf12_mentalf1,sf12_physicalf1,age=vitalstatus_age,sex) %>%
mutate(sample="SCCS",tx=1,
male = as.integer(sex=="M")) %>% select(-sex)
load("../../../data/meps/sf12f1_meps15.RData")
meps.health.status <- sf12f1 %>% filter(region15==3) %>%
select(sf12_mentalf1,sf12_physicalf1,sex,age=age15x) %>%
mutate(sample = "MEPS",tx=0,
male=as.integer(sex==1)) %>% select(-sex) # Has the same name in the MEPS dataset so need to
require(MatchIt)
rbind(sccs.health.status,meps.health.status) %>% na.omit() -> meps.sccs.compare
m.out <- matchit(tx ~ male + age , data = meps.sccs.compare,
method = "exact")
meps.sccs.compare <- cbind.data.frame(weight = m.out$weights,meps.sccs.compare)
meps.sccs.compare %>% group_by(tx) %>% summarise(sum_weight=sum(weight)) %>% pull(sum_weight) -> totals
ratio <- totals[2] / totals[1]
meps.sccs.compare %>% mutate(weight = ((1-tx)*ratio+1*tx)*weight) -> meps.sccs.compare
meps.sccs.compare %>% ggplot(aes(x=sf12_physicalf1))+geom_density(aes(fill=sample,x=sf12_physicalf1,weights=weight),alpha=0.4)+theme_few()+
labs(fill="Sample",x="Physical Health Composite",y="Density")
Ignoring unknown aesthetics: weights
meps.sccs.compare %>% ggplot(aes(x=sf12_mentalf1))+geom_density(aes(fill=sample,x=sf12_mentalf1,weights=weight),alpha=0.4)+theme_few()+
labs(fill="Sample",x="Mental Health Composite",y="Density")
Ignoring unknown aesthetics: weights
As is clear from the CONSORT diagram, even among the nonelderly we have a relatively high death rate between the FU1 and FU3 surveys; 9.1669
percent of nonelderly FU1 respondants with full responses to the FU1 SF12 questions have FU3 health status truncated by death.
To begin, its easiest start by specifying a DD approach for the health status outcome that lends itself well to the principal stratification set-up (i.e., a regression of the outcome on a treatment indicator plus baseline covariates). That is, the principal stratification framework was developed for a case where we have treatment assignment indicator (\(Z\)) and principal strata (\(S\)) which could correnspond to treatment compliance status (which sets up the usual IV framework) or strata based on survival status, in which case we are interested in health status changes among the substrata of survivors only. However, our observational study is being identified off of a difference-in-difference framework, not an RCT. So we first need to map at DD framework to something that looks like an RCT regression specification of the outcome on the treatment indicator + covariates. As we move forward with the analyses we can, obviously, spend a bit more time thinking through the analytic specifications for the more robust DD estimator we plan to use for other analyses (e.g., with state and time effects) under the principal stratification framework. But for now, the longitudinal nature of our data allows us to easily map the standard DD framework to the principal stratification framework via a change score approach.
To see this, define:
Using the above we can define the average causal effect as \[ E[Y_{it}(1)-Y_{it}(0))] = \tau \] and given that we have longitudinal data we also have the following \[ \tau = E[Y_{i1}-Y_{i0}|Z_i=1] - E[Y_{i1}-Y_{i0}|Z_i=0] \] Thus, we can estimate \(\tau\) by regressing the change in health status for each unit, \(Y_{i1}-Y_{i0}\) on the indicator for treatment.
To validate this we can fit a standard \(expansion + post + expansion \times post\) regression with the SCCS data (across-state comparisons of nonelderly in expansion vs. nonexpansion states), and compare the results from that regression to a regression of the health status change scores. As the table below shows, both approaches yield the same coeffient estimate (I have suppressed the inference as I did not adjust for longitudinally in these regressions).
# Define covariates and regression formulas.
XX <- c(
"education",
"sex",
"maritalstatus",
"hhsize",
"employed",
"raceanalysis"
)
XX.spline <- "vitalstatus_age"
# post + expnasion + post*expansion
phys.fmla.unadj <- tofmla(y="health",x=c("ss.expansion","post","post_expansion"))
phys.fmla.adj <- tofmla(y="health",x=c("ss.expansion","post","post_expansion",XX),x.spline=XX.spline)
# state + year + post*expansion
phys.fmla.unadj2 <- tofmla(y="health",x=c("state","year"),interact=list(c("post","ss.expansion")))
phys.fmla.adj2 <- tofmla(y="health",x=c("state","year",XX),x.spline=XX.spline,interact=list(c("post","ss.expansion")))
# change score
phys.fmla.cs.unadj <- tofmla(y="health.cs",x=c("ss.expansion"))
phys.fmla.cs.adj <- tofmla(y="health.cs",x=c("ss.expansion",XX),x.spline=XX.spline)
# Mental health version of all the above.
mental.fmla.unadj <- tofmla(y="mental",x=c("ss.expansion","post","post_expansion"))
mental.fmla.adj <- tofmla(y="mental",x=c("ss.expansion","post","post_expansion",XX),x.spline=XX.spline)
mental.fmla.unadj2 <- tofmla(y="mental",x=c("state","year"),interact=list(c("post","ss.expansion")))
mental.fmla.adj2 <- tofmla(y="mental",x=c("state","year",XX),x.spline=XX.spline,interact=list(c("post","ss.expansion")))
mental.fmla.cs.unadj <- tofmla(y="mental.cs",x=c("ss.expansion"))
mental.fmla.cs.adj <- tofmla(y="mental.cs",x=c("ss.expansion",XX),x.spline=XX.spline)
# Mental health version of all the above.
ins.fmla.unadj <- tofmla(y="insurance01",x=c("ss.expansion","post","post_expansion"))
ins.fmla.adj <- tofmla(y="insurance01",x=c("ss.expansion","post","post_expansion",XX),x.spline=XX.spline)
ins.fmla.unadj2 <- tofmla(y="insurance01",x=c("state","year"),interact=list(c("post","ss.expansion")))
ins.fmla.adj2 <- tofmla(y="insurance01",x=c("state","year",XX),x.spline=XX.spline,interact=list(c("post","ss.expansion")))
ins.fmla.cs.unadj <- tofmla(y="insurance01.cs",x=c("ss.expansion"))
ins.fmla.cs.adj <- tofmla(y="insurance01.cs",x=c("ss.expansion",XX),x.spline=XX.spline)
# Excellent/V. Good
ex_vg.fmla.unadj <- tofmla(y="ex_vg",x=c("ss.expansion","post","post_expansion"))
ex_vg.fmla.adj <- tofmla(y="ex_vg",x=c("ss.expansion","post","post_expansion",XX),x.spline=XX.spline)
ex_vg.fmla.unadj2 <- tofmla(y="ex_vg",x=c("state","year"),interact=list(c("post","ss.expansion")))
ex_vg.fmla.adj2 <- tofmla(y="ex_vg",x=c("state","year",XX),x.spline=XX.spline,interact=list(c("post","ss.expansion")))
ex_vg.fmla.cs.unadj <- tofmla(y="ex_vg.cs",x=c("ss.expansion"))
ex_vg.fmla.cs.adj <- tofmla(y="ex_vg.cs",x=c("ss.expansion",XX),x.spline=XX.spline)
# Fair / Poor
fair_poor.fmla.unadj <- tofmla(y="fair_poor",x=c("ss.expansion","post","post_expansion"))
fair_poor.fmla.adj <- tofmla(y="fair_poor",x=c("ss.expansion","post","post_expansion",XX),x.spline=XX.spline)
fair_poor.fmla.unadj2 <- tofmla(y="fair_poor",x=c("state","year"),interact=list(c("post","ss.expansion")))
fair_poor.fmla.adj2 <- tofmla(y="fair_poor",x=c("state","year",XX),x.spline=XX.spline,interact=list(c("post","ss.expansion")))
fair_poor.fmla.cs.unadj <- tofmla(y="fair_poor.cs",x=c("ss.expansion"))
fair_poor.fmla.cs.adj <- tofmla(y="fair_poor.cs",x=c("ss.expansion",XX),x.spline=XX.spline)
# Standard Diff-in-Diff
xx %>% filter(ss.sccs==1 & ss.nonmedicare==1) %>%
mutate(healthf1 =sf12_physicalf1,
healthf3 = sf12_physicalf3) %>%
filter(!is.na(healthf3) & !(is.na(healthf1))) %>%
select(ss.expansion,health=healthf1,XX,XX.spline) %>% mutate(post=0) %>%
mutate_at(.vars=XX,.funs=function(x) as.factor(x)) -> tt.xx.1
xx %>% filter(ss.sccs==1 & ss.nonmedicare==1) %>%
mutate(healthf1 =sf12_physicalf1,
healthf3 = sf12_physicalf3) %>%
filter(!is.na(healthf3) & !(is.na(healthf1))) %>%
select(ss.expansion,health=healthf3,XX,XX.spline) %>% mutate(post=1) %>%
mutate_at(.vars=XX,.funs=function(x) as.factor(x)) -> tt.xx.2
# Unadjusted
rbind(tt.xx.1,tt.xx.2) %>% mutate(post_expansion = ss.expansion*post) %>%
do(tidy(lm(phys.fmla.unadj,data=.))) %>%
filter(term=="post_expansion") -> result.std
# Adjusted
rbind(tt.xx.1,tt.xx.2) %>% mutate(post_expansion = ss.expansion*post) %>%
do(tidy(lm(phys.fmla.adj,data=.))) %>%
filter(term=="post_expansion") -> result.std.adj
# Change Score
xx %>% filter(ss.sccs==1 & ss.nonmedicare==1) %>%
mutate(healthf1 =sf12_physicalf1,
healthf3 = sf12_physicalf3) %>%
filter(!is.na(healthf3) & !(is.na(healthf1))) %>%
select(ss.expansion,healthf3,healthf1,XX,XX.spline) %>%
mutate(health.cs = healthf3-healthf1) %>%
mutate_at(.vars=XX,.funs=function(x) as.factor(x)) -> xx.cs
# Unadjusted
xx.cs %>% do(tidy(lm(phys.fmla.cs.unadj,data=.))) %>%
filter(term=="ss.expansion") -> result.cs
# Adjusted
xx.cs %>% do(tidy(lm(phys.fmla.cs.adj,data=.))) %>%
filter(term=="ss.expansion") -> result.cs.adj
rbind(result.std,result.cs) %>% mutate(Approach=c("Standard","Change Score")) %>% select(Approach,everything()) %>%
select(Approach,term,estimate) %>%
knitr::kable(caption = "Comparison of Standard DD to Change Score DD")
Approach | term | estimate |
---|---|---|
Standard | post_expansion | 0.1684 |
Change Score | ss.expansion | 0.1684 |
We’ll start off by first fitting regressions of the form: \[ Y_{it} = \beta_0 + \beta_1 \textrm{State}_{i} + \beta_2 \textrm{Year}_t + \beta_3 \textrm{Expansion}_{it} \times \textrm{Post}_t + \epsilon_{it} \]
# Reshape the Data for Analyses
xx %>%
#filter(ss.nonmedicare==1 & r_1==1) %>%
filter(ss.nonelderly.fu3==1 & r_1==1 & state!="" & ss.sccs==1 ) %>%
#filter(ss.composite.health.nonmedicare == 1 ) %>%
select(
healthf1 = sf12_physicalf1,
healthf3 = sf12_physicalf3,
mentalf1 = sf12_mentalf1,
mentalf3 = sf12_mentalf3,
ss.expansion,
idnumber,
questionnaireyrf1,
questionnaireyrf3,
insurancecoveragef1,
insurancecoveragef3,
sf12v2healthf1,
sf12v2healthf3,
vitalstatus
) %>%
gather(var,
value,
-ss.expansion,
-idnumber,
-vitalstatus) %>%
mutate(round = ifelse(grep("f[1:3]$", var), gsub("[^1:3]", "", var), "")) %>%
mutate(round = ifelse(round=="11","1",ifelse(round=="13","3",round))) %>%
separate(var, c("var", "temp"), "f[1:3]$") %>% select(-temp) %>%
spread(var, value) %>% arrange(idnumber) %>%
mutate(post = as.integer(questionnaireyr >= 2014)) %>%
mutate(expansion = as.integer(ss.expansion == 1)) %>%
mutate(ex_vg = as.integer(sf12v2health %in% c(1)),
fair_poor = as.integer(sf12v2health %in% c(4,5))) %>%
mutate(post_expansion = post * expansion) -> xx.composite.health.nonmedicare.panel0
# Make sure we have a balance sample
xx.composite.health.nonmedicare.panel0 %>%
filter(!is.na(sf12v2health)) %>%
group_by(idnumber) %>%
mutate(n_gen = n()) %>% filter(n_gen == 2) %>% pull(idnumber) %>% unique() -> sample.genhealth
xx.composite.health.nonmedicare.panel0 %>%
filter(!is.na(health) & !is.na(mental)) %>%
group_by(idnumber) %>%
mutate(n_gen = n()) %>% filter(n_gen == 2) %>% pull(idnumber) %>% unique() -> sample.compositehealth
# Add in Baseline Covariates
xx %>% select(idnumber,XX,XX.spline,state) %>%
mutate_at(
.vars = XX,
.funs = function(x)
as.factor(x)
) %>%
inner_join(xx.composite.health.nonmedicare.panel0,"idnumber") %>%
mutate(state=droplevels(state),
year = as.factor(questionnaireyr),
post = as.integer(questionnaireyr>=2014),
insurance01 = as.integer(insurancecoverage==1)) %>%
####### RESTRICTING TO SURVIVORS ONLY!!!! #########
filter(vitalstatus!="D") -> xx.composite.health.nonmedicare.panel
f.composite.physical.panel.adj <- lm(phys.fmla.adj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% sample.compositehealth))
f.composite.mental.panel.adj <- lm(mental.fmla.adj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% sample.compositehealth))
f.composite.insurance.panel.adj <- lm(ins.fmla.adj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% sample.genhealth))
f.composite.ex_vg.panel.adj <- lm(ex_vg.fmla.adj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% sample.genhealth))
f.composite.fair_poor.panel.adj <- lm(fair_poor.fmla.adj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% sample.genhealth))
e.samp.physical = expand.model.frame(f.composite.physical.panel.adj,~idnumber) %>% na.omit() %>% pull(idnumber)
e.samp.mental = expand.model.frame(f.composite.mental.panel.adj,~idnumber) %>% na.omit() %>% pull(idnumber)
e.samp.insurance = expand.model.frame(f.composite.insurance.panel.adj,~idnumber) %>% na.omit() %>% pull(idnumber)
e.samp.ex_vg = expand.model.frame(f.composite.ex_vg.panel.adj,~idnumber) %>% na.omit() %>% pull(idnumber)
e.samp.fair_poor = expand.model.frame(f.composite.fair_poor.panel.adj,~idnumber) %>% na.omit() %>% pull(idnumber)
f.composite.physical.panel <- lm(phys.fmla.unadj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% e.samp.physical))
f.composite.mental.panel <- lm(mental.fmla.unadj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% e.samp.mental))
f.composite.insurance.panel <- lm(ins.fmla.unadj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% e.samp.insurance))
f.composite.ex_vg.panel <- lm(ex_vg.fmla.unadj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% e.samp.ex_vg))
f.composite.fair_poor.panel <- lm(fair_poor.fmla.unadj2,data=subset(xx.composite.health.nonmedicare.panel,idnumber %in% e.samp.fair_poor))
summ.phys <- model.frame(f.composite.physical.panel.adj) %>% select(health) %>% data.frame() %>%
mutate(health = as.numeric(health)) %>% summarise(mean = mean(health,na.rm=TRUE),sd=sd(health)) %>% mutate(mean = paste0(round(mean,1)) , sd = paste0("(",round(sd,1),")")) %>% as.vector()
summ.mental <- model.frame(f.composite.mental.panel.adj) %>% select(mental) %>% data.frame() %>% mutate(mental=as.numeric(mental)) %>% summarise(mean = mean(mental,na.rm=TRUE),sd=sd(mental)) %>% mutate(mean = paste0(round(mean,1)) , sd = paste0("(",round(sd,1),")")) %>% as.vector()
summ.insurance <- model.frame(f.composite.insurance.panel.adj) %>% select(insurance01) %>% data.frame() %>% summarise(mean = mean(insurance01),sd=sd(insurance01)) %>% mutate(mean = paste0(round(mean,3)) , sd = paste0("(",round(sd,3),")")) %>% as.vector()
summ.ex_vg <- model.frame(f.composite.ex_vg.panel.adj) %>% select(ex_vg) %>% data.frame() %>% summarise(mean = mean(ex_vg),sd=sd(ex_vg)) %>% mutate(mean = paste0(round(mean,3)) , sd = paste0("(",round(sd,3),")")) %>% as.vector()
summ.fair_poor <- model.frame(f.composite.fair_poor.panel.adj) %>% select(fair_poor) %>% data.frame() %>% summarise(mean = mean(fair_poor),sd=sd(fair_poor)) %>% mutate(mean = paste0(round(mean,3)) , sd = paste0("(",round(sd,3),")")) %>% as.vector()
r1 <- tidy(f.composite.physical.panel) %>% filter(grepl("^I",term)) %>% mutate(adjusted="No",outcome="Physical Composite") %>%
cbind(summ.phys) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r2 <- tidy(f.composite.physical.panel.adj) %>% filter(grepl("^I",term)) %>% mutate(adjusted="Yes",outcome="Physical Composite") %>%
cbind(summ.phys) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r3 <- tidy(f.composite.mental.panel) %>% filter(grepl("^I",term)) %>% mutate(adjusted="No",outcome="Mental Composite") %>%
cbind(summ.mental) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r4 <- tidy(f.composite.mental.panel.adj) %>% filter(grepl("^I",term)) %>% mutate(adjusted="Yes",outcome="Mental Composite") %>%
cbind(summ.mental) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r5 <- tidy(f.composite.insurance.panel) %>% filter(grepl("^I",term)) %>% mutate(adjusted="No",outcome="Insurance Coverage Yes/No") %>%
cbind(summ.insurance) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r6 <- tidy(f.composite.insurance.panel.adj) %>% filter(grepl("^I",term)) %>% mutate(adjusted="Yes",outcome="Insurance Coverage Yes/No") %>%
cbind(summ.insurance) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r7 <- tidy(f.composite.ex_vg.panel) %>% filter(grepl("^I",term)) %>% mutate(adjusted="No",outcome="Excellent or Very Good Health") %>%
cbind(summ.ex_vg) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r8 <- tidy(f.composite.ex_vg.panel.adj) %>% filter(grepl("^I",term)) %>% mutate(adjusted="Yes",outcome="Excellent or Very Good Health") %>%
cbind(summ.ex_vg) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r9 <- tidy(f.composite.fair_poor.panel) %>% filter(grepl("^I",term)) %>% mutate(adjusted="No",outcome="Fair or Poor Health") %>%
cbind(summ.fair_poor) %>% select(outcome,mean,sd,adjusted,everything(),-term)
r10 <- tidy(f.composite.fair_poor.panel.adj) %>% filter(grepl("^I",term)) %>% mutate(adjusted="Yes",outcome="Fair or Poor Health") %>%
cbind(summ.fair_poor) %>% select(outcome,mean,sd,adjusted,everything(),-term)
rbind(r5,r6,r7,r8,r9,r10,r1,r2,r3,r4) %>% knitr::kable(caption="Across State Difference-in-Difference Estimates")
outcome | mean | sd | adjusted | estimate | std.error | statistic | p.value |
---|---|---|---|---|---|---|---|
Insurance Coverage Yes/No | 0.775 | (0.418) | No | 0.0764 | 0.0116 | 6.5851 | 0.0000 |
Insurance Coverage Yes/No | 0.775 | (0.418) | Yes | 0.0777 | 0.0114 | 6.8027 | 0.0000 |
Excellent or Very Good Health | 0.063 | (0.243) | No | 0.0029 | 0.0069 | 0.4151 | 0.6781 |
Excellent or Very Good Health | 0.063 | (0.243) | Yes | 0.0022 | 0.0067 | 0.3205 | 0.7486 |
Fair or Poor Health | 0.376 | (0.484) | No | 0.0031 | 0.0137 | 0.2263 | 0.8210 |
Fair or Poor Health | 0.376 | (0.484) | Yes | 0.0044 | 0.0129 | 0.3385 | 0.7350 |
Physical Composite | 41.5 | (13.1) | No | 0.3488 | 0.4722 | 0.7386 | 0.4601 |
Physical Composite | 41.5 | (13.1) | Yes | 0.2529 | 0.4298 | 0.5885 | 0.5562 |
Mental Composite | 46.6 | (12.2) | No | 0.1685 | 0.4446 | 0.3791 | 0.7046 |
Mental Composite | 46.6 | (12.2) | Yes | 0.1686 | 0.4215 | 0.4001 | 0.6891 |
require(nnet)
source("./principal-stratification/ding-lu-replication/PS_M_weighting_SA.R")
source("./principal-stratification/ding-lu-replication/PS_M_weighting.R")
source("./principal-stratification/ding-lu-replication/PS_SM_weighting.R")
################################################################
## Define variables used in principal stratification analyses.
################################################################
# Baseline variables for predicting principal strata
XX <- c(
"education",
"sex",
"maritalstatus",
"hhsize",
"employed",
"raceanalysis",
"insurancecoverage",
"enrollment_age"
)
# Other variables needed for the principal stratification analysis
XX2 <- c("ss.expansion", "r_3", "vitalstatus")
# Outcome variables
YY <- c("health_cs","mental_cs","healthf1","healthf3","mentalf1","mentalf3")
# Construct and standardize the data
xx.ps0 <- xx %>% filter(ss.composite.health.nonmedicare == 1) %>%
rename(
healthf1 = sf12_physicalf1,
healthf3 = sf12_physicalf3,
mentalf1 = sf12_mentalf1,
mentalf3 = sf12_mentalf3
) %>%
mutate(health_cs = healthf3 - healthf1,
mental_cs = mentalf3 - mentalf1
) %>%
select_at(vars("idnumber",YY,XX,XX2)) %>%
mutate_at(.vars=YY,.funs=function(x) ifelse(is.na(x),0,x)) %>%
mutate_at(.vars=c(XX),.funs = function(x) ifelse(x %in% c("8888","9999"),NA,x)) %>%
mutate(raceanalysis = ifelse(raceanalysis %in% 3:6,3,raceanalysis)) %>%
mutate(hhsize = ifelse(hhsize>4,5,hhsize)) %>%
mutate(employed = ifelse(employed==3,NA,employed)) %>%
mutate_at(.vars = c(XX[-which(XX=="enrollment_age")]),.funs=function(x) as.factor(x))
# The model matrix expands out the factors into dummy variables.
f.temp <- lm(tofmla(y="health_cs",x=c(XX,"ss.expansion","mental_cs","vitalstatus","r_3")),data=xx.ps0)
mf <-cbind.data.frame(na.omit(expand.model.frame(f.temp, ~ idnumber)),
model.matrix(f.temp)[,-which(colnames(model.matrix(f.temp)) %in%
colnames(model.frame(f.temp)))]) %>% tbl_df() %>%
mutate(
Z = as.integer(ss.expansion),
D = as.integer(vitalstatus == "A"),
Y = health_cs
)
# These model runs aren't used anywhere; they're mainly a convenient way to expand the model frame
# to include all the factors as dummy variables, exclude levels of categorical variables, etc.
f.temp2 <- lm(tofmla(y="health_cs",x=c(XX)),data=mf)
covariates <- colnames(model.matrix(f.temp2))[-1]
covariates <- grep("sex|enrollment_age",covariates,value=TRUE)
# Scale the covariates for easier fitting.
xx.ps = mf #%>% mutate_at(covariates,scale)
# For code development and debugging purporses we only select 250 random cases to fit the
# principal stratification code on.
set.seed(123)
ss <- sample(seq(nrow(mf)),200)
ss <- seq(nrow(mf))
# Define the final principal stratification sample (i.e., only the sampled cases)
xx.ps.s <- xx.ps %>% filter(row_number() %in% ss)
# Check Balance
covariates %>%
purrr::map(~PSPS_M_weighting(xx.ps.s$Z, xx.ps.s$D, as.matrix(xx.ps.s[,covariates]),as.matrix(xx.ps.s[,.x]))) -> balance
AACE.reg.balance <- seq(length(balance)) %>% purrr::map_dbl(~balance[[.x]]$AACE.reg)
AACE.balance <- seq(length(balance)) %>% purrr::map_dbl(~balance[[.x]]$AACE)
# Fit the point estimates
c("mental_cs","health_cs") %>%
purrr::map(~PSPS_M_weighting(xx.ps.s$Z, xx.ps.s$D, as.matrix(xx.ps.s[,covariates]),as.matrix(xx.ps.s[,.x]))) -> point
# Get inference via bootstrapped samples.
xx.ps.s %>% bootstrap(1000) %>%
do(data.frame(value=PSPS_M_weighting(.$Z, .$D, as.matrix(.[,covariates]),.$health_cs)$AACE.reg)) -> inference.physical
xx.ps.s %>% bootstrap(1000) %>%
do(data.frame(value=PSPS_M_weighting(.$Z, .$D, as.matrix(.[,covariates]),.$mental_cs)$AACE.reg)) -> inference.mental