ECLS-K-2011

Here is an example using the Early Childhood Longitudinal Study Kindergarten 2011 cohort study data to evaluate how kindergartens in assigned public (i.e. kindergarteners attending the school they are geographically assigned) versus those in alternative regular public schools (e.g. private, public charter or magnet schools) over time on a self control construct. I will go through how to grab and clean data, impute the missing data, match the data using propensity score matching, use multilevel modeling to track self control over time, then aggregate the results from the different imputed data sets to create a table of final results. Here is a link to a description of the variables: https://docs.google.com/spreadsheets/d/1UngXf85XP29WHhUx0f88b7ceGJ49ezPKTuUqjedgPlE/edit?usp=sharing

The three variables of interest will be the XPUBPRI, which is whether a student attended an assigned public (i.e the school assigned the child based upon their address) or an alternative to regular public school (e.g. private school, public charter or magnet). Time, which indicates the four time points described below and XTCHCON, which is the teacher’s reports of the student’s self control.

The first step is to create a data set with each of the variables above over the four time points. The first time point takes place in the fall of 2010, then spring 2011, then fall 2011, and finally spring 2012. The 1 in each variable indicates fall 2010 and so on for 2 (spring 2011),3 (fall 2011), 4 (spring 2012).

I then change all of the negative -9’s to NA, so that I can impute then later on.

setwd("~/Google Drive/PARCS/Projects/ECLSK2011/Data")
data = read.csv("ELCS-K-2011.csv", header = TRUE)
data1 = cbind(id = 1:length(data$X1TCHAPP), X1TCHAPP = data$X1TCHAPP, X2TCHAPP = data$X2TCHAPP, X3TCHAPP = data$X3TCHAPP, X4TCHAPP = data$X4TCHAPP, X1TCHCON = data$X1TCHCON, X2TCHCON = data$X2TCHCON, X3TCHCON = data$X3TCHCON, X4TCHCON = data$X4TCHCON,X1TCHPER = data$X1TCHPER, X2TCHPER = data$X2TCHPER, X3TCHPER = data$X3TCHPER, X4TCHPER = data$X4TCHPER, X1TCHEXT = data$X1TCHEXT, X2TCHEXT = data$X2TCHEXT, X3TCHEXT = data$X3TCHEXT, X4TCHEXT = data$X4TCHEXT, X1TCHINT = data$X1TCHINT, X2TCHINT = data$X2TCHINT, X3TCHINT = data$X3TCHINT, X4TCHINT = data$X4TCHINT,X1RTHET = data$X1RTHET, X2RTHET = data$X2RTHET, X3RTHET = data$X3RTHET, X4RTHET = data$X4RTHET, X1MTHET = data$X1MTHET, X2MTHET = data$X2MTHET, X3MTHET = data$X3MTHET, X4MTHET = data$X4MTHET, X1BMI = data$X1BMI, X2BMI = data$X2BMI, X3BMI = data$X3BMI, X4BMI = data$X4BMI)

data1 = apply(data1, 2, function(x){ifelse(x == -9, NA, x)})
data1 = as.data.frame(data1)
head(data1)

##   id X1TCHAPP X2TCHAPP X3TCHAPP X4TCHAPP X1TCHCON X2TCHCON X3TCHCON
## 1  1     3.57     3.00       NA     3.29     4.00     3.25       NA
## 2  2     2.43     3.71       NA     2.71     2.67     4.00       NA
## 3  3       NA     3.00       NA     2.43       NA     2.75       NA
## 4  4     2.14     2.00       NA     2.86     2.00     1.67       NA
## 5  5     3.67     3.57       NA       NA     3.75     3.50       NA
## 6  6     2.29     2.17       NA     2.57       NA     3.00       NA
##   X4TCHCON X1TCHPER X2TCHPER X3TCHPER X4TCHPER X1TCHEXT X2TCHEXT X3TCHEXT
## 1     3.25      4.0      3.8       NA      3.4      1.2      1.6       NA
## 2     3.25      3.0      4.0       NA      3.2      1.2      1.0       NA
## 3     4.00       NA      4.0       NA      4.0       NA      1.8       NA
## 4     3.00      2.2      2.0       NA      2.6      3.2      3.0       NA
## 5       NA      3.8      4.0       NA       NA      2.2      1.2       NA
## 6     3.25       NA      2.2       NA      2.2       NA      1.8       NA
##   X4TCHEXT X1TCHINT X2TCHINT X3TCHINT X4TCHINT X1RTHET X2RTHET X3RTHET
## 1     2.83     1.25     1.25       NA     2.00  0.1930  1.7614      NA
## 2     1.33     1.75     1.25       NA     1.25 -0.7870  0.7452      NA
## 3     1.83       NA     2.00       NA     1.50      NA  0.3323      NA
## 4     1.83     1.50     2.25       NA     1.25 -1.7087 -0.2922      NA
## 5       NA     1.25     1.75       NA       NA -0.0734  0.8013      NA
## 6     1.67     2.00     2.00       NA     2.25 -1.4601 -0.6792      NA
##   X4RTHET X1MTHET X2MTHET X3MTHET X4MTHET X1BMI X2BMI X3BMI X4BMI
## 1  3.1523  0.5827  1.5384      NA  2.7286 16.40 16.32    NA 16.05
## 2  2.2023 -0.3473  0.8800      NA  2.0825 22.19 22.92    NA 22.84
## 3  1.1861      NA  1.0112      NA  2.2080    NA 15.10    NA 17.92
## 4  0.8632 -2.1245 -0.2834      NA  1.1082 18.14 19.47    NA 21.71
## 5      NA -0.5545  0.6163      NA      NA 15.20 15.54    NA    NA
## 6  1.0132 -1.0616 -0.5379      NA  0.4484 17.28 17.19    NA 17.78

Here I am creating a separate data set for the indicator of interest PUBPRI where I am setting the variable to be a 1 is alternative to public school and 0 is a regular public school.

Here I am creating a separate data set for the indicator of interest PUBPRI where I am setting the variable to be a 1 is alternative to public school and 0 is a regular public school. I am making these changes to PUBPRI separately, because I need to make specific transformations to these variables and not to the other variables in the dataset. I will combine this data set with the one described above.

XPUBPRI = cbind(X1PUBPRI = data$X1PUBPRI, X2PUBPRI = data$X2PUBPRI, X3PUBPRI = data$X3PUBPRI, X4PUBPRI = data$X4PUBPRI)
XPUBPRI = apply(XPUBPRI, 2, function(x){ifelse(x == -9, NA, x)})
XPUBPRI = as.data.frame(XPUBPRI)
XPUBPRI = as.data.frame(apply(XPUBPRI, 2, function(x){ifelse(x == 1, 0, 1)}))
head(XPUBPRI)

##   X1PUBPRI X2PUBPRI X3PUBPRI X4PUBPRI
## 1        0        0       NA        0
## 2        0        0       NA        0
## 3       NA        0       NA        0
## 4        0        0       NA        0
## 5        1        1       NA       NA
## 6        0        0       NA        0

data1 = cbind(data1, XPUBPRI); head(data1)

##   id X1TCHAPP X2TCHAPP X3TCHAPP X4TCHAPP X1TCHCON X2TCHCON X3TCHCON
## 1  1     3.57     3.00       NA     3.29     4.00     3.25       NA
## 2  2     2.43     3.71       NA     2.71     2.67     4.00       NA
## 3  3       NA     3.00       NA     2.43       NA     2.75       NA
## 4  4     2.14     2.00       NA     2.86     2.00     1.67       NA
## 5  5     3.67     3.57       NA       NA     3.75     3.50       NA
## 6  6     2.29     2.17       NA     2.57       NA     3.00       NA
##   X4TCHCON X1TCHPER X2TCHPER X3TCHPER X4TCHPER X1TCHEXT X2TCHEXT X3TCHEXT
## 1     3.25      4.0      3.8       NA      3.4      1.2      1.6       NA
## 2     3.25      3.0      4.0       NA      3.2      1.2      1.0       NA
## 3     4.00       NA      4.0       NA      4.0       NA      1.8       NA
## 4     3.00      2.2      2.0       NA      2.6      3.2      3.0       NA
## 5       NA      3.8      4.0       NA       NA      2.2      1.2       NA
## 6     3.25       NA      2.2       NA      2.2       NA      1.8       NA
##   X4TCHEXT X1TCHINT X2TCHINT X3TCHINT X4TCHINT X1RTHET X2RTHET X3RTHET
## 1     2.83     1.25     1.25       NA     2.00  0.1930  1.7614      NA
## 2     1.33     1.75     1.25       NA     1.25 -0.7870  0.7452      NA
## 3     1.83       NA     2.00       NA     1.50      NA  0.3323      NA
## 4     1.83     1.50     2.25       NA     1.25 -1.7087 -0.2922      NA
## 5       NA     1.25     1.75       NA       NA -0.0734  0.8013      NA
## 6     1.67     2.00     2.00       NA     2.25 -1.4601 -0.6792      NA
##   X4RTHET X1MTHET X2MTHET X3MTHET X4MTHET X1BMI X2BMI X3BMI X4BMI X1PUBPRI
## 1  3.1523  0.5827  1.5384      NA  2.7286 16.40 16.32    NA 16.05        0
## 2  2.2023 -0.3473  0.8800      NA  2.0825 22.19 22.92    NA 22.84        0
## 3  1.1861      NA  1.0112      NA  2.2080    NA 15.10    NA 17.92       NA
## 4  0.8632 -2.1245 -0.2834      NA  1.1082 18.14 19.47    NA 21.71        0
## 5      NA -0.5545  0.6163      NA      NA 15.20 15.54    NA    NA        1
## 6  1.0132 -1.0616 -0.5379      NA  0.4484 17.28 17.19    NA 17.78        0
##   X2PUBPRI X3PUBPRI X4PUBPRI
## 1        0       NA        0
## 2        0       NA        0
## 3        0       NA        0
## 4        0       NA        0
## 5        1       NA       NA
## 6        0       NA        0

Here I am cleaning up the demographic variables that I want to include, which are the child’s gender (female 1 male 0), the child’s ethnicity (hispanic, black, asian, american indian, hawaiian, and multiracial with white as the reference category). And the parent’s ethnicity (nonwhite 1 and white 0).

After each variable is loaded in, I then need to change all -9’s to NA like I did above, because I want -9’s as NA’s for imputation later. Then I use the apply and ifelse inside a function within the apply function to change each data set to one’s and zero’s while maintaining the NA’s because I want to impute those later.

At the end of this section, I am deleting each of the X1_RACETHP_R variables, because I no longer need that variable, since I broke it into the different ethnicities.

childGender =cbind(X1_CHSEX_R = data$X_CHSEX_R)
childGender = as.data.frame(childGender)

childGender = apply(childGender, 2, function(x){ifelse(x == -9, NA, x)})
childGender = as.data.frame(childGender)

childGender = ifelse(is.na(childGender), NA, ifelse(childGender == 1, 1,0))
childGender = as.data.frame(childGender)


childEth = cbind(X1_RACETHP_R = data$X_RACETHP_R)
childEth = apply(childEth, 2, function(x){ifelse(x == -9, NA, x)})
childEth = as.data.frame(childEth)


X_WHITE_R = ifelse(is.na(childEth), NA, ifelse(childEth == 1,1,0))
X_WHITE_R = as.data.frame(X_WHITE_R)
names(X_WHITE_R) = paste0("X", 1:ncol(X_WHITE_R),"_WHITE_R")
head(X_WHITE_R)

##   X1_WHITE_R
## 1          1
## 2          1
## 3          0
## 4          1
## 5          0
## 6          1

n = dim(X_WHITE_R)
n = n[1]
missingWhite = sum(is.na(X_WHITE_R))
perMissingWhite = missingWhite/n; perMissingWhite

## [1] 0.06921976

X_HISP_R = ifelse(is.na(childEth), NA, ifelse(childEth == 3,1, ifelse(childEth == 4, 1, 0)))
X_HISP_R = as.data.frame(X_HISP_R)
names(X_HISP_R) = paste0("X", 1:ncol(X_HISP_R),"_HISP_R")

X_BLACK_R = ifelse(is.na(childEth), NA, ifelse(childEth == 2, 1, 0))
X_BLACK_R = as.data.frame(X_BLACK_R)
names(X_BLACK_R) = paste0("X", 1:ncol(X_BLACK_R),"_BLACK_R")


X_ASIAN_R = ifelse(is.na(childEth), NA, ifelse(childEth == 5, 1, 0))
X_ASIAN_R = as.data.frame(X_ASIAN_R)
names(X_ASIAN_R) = paste0("X", 1:ncol(X_ASIAN_R),"_ASIAN_R")


X_AMINAN_R = ifelse(is.na(childEth), NA, ifelse(childEth == 7, 1, 0))
X_AMINAN_R = as.data.frame(X_AMINAN_R)
names(X_AMINAN_R) = paste0("X", 1:ncol(X_AMINAN_R),"_AMINAN_R")


X_HAWPI_R = ifelse(is.na(childEth), NA, ifelse(childEth == 6, 1, 0))
X_HAWPI_R = as.data.frame(X_HAWPI_R)
names(X_AMINAN_R) = paste0("X", 1:ncol(X_AMINAN_R),"_AMINAN_R")


X_MULTR_R = ifelse(is.na(childEth), NA, ifelse(childEth == 8, 1, 0))
X_MULTR_R = as.data.frame(X_MULTR_R)
names(X_MULTR_R) = paste0("X", 1:ncol(X_MULTR_R),"_MULTR_R")


childEth = cbind(X_HISP_R, X_BLACK_R, X_ASIAN_R, X_AMINAN_R, X_HAWPI_R, X_MULTR_R)
childEth = as.data.frame(childEth)

parEth = cbind(X1PAR1RAC = data$X1PAR1RAC)
parEth = apply(parEth, 2, function(x){ifelse(x == -9, NA, x)})
parEth = as.data.frame(parEth)
parEth = ifelse(is.na(parEth), NA, ifelse(parEth == 1, 0,1))
parEth = as.data.frame(parEth)
names(parEth) = paste0("X", 1:ncol(parEth),"PAR1RAC")

data1 = cbind(data1, childGender, childEth, parEth)

data1$X1_RACETHP_R = data1$X2_RACETHP_R = data1$X3_RACETHP_R = data1$X4_RACETHP_R = NULL
head(data1)

##   id X1TCHAPP X2TCHAPP X3TCHAPP X4TCHAPP X1TCHCON X2TCHCON X3TCHCON
## 1  1     3.57     3.00       NA     3.29     4.00     3.25       NA
## 2  2     2.43     3.71       NA     2.71     2.67     4.00       NA
## 3  3       NA     3.00       NA     2.43       NA     2.75       NA
## 4  4     2.14     2.00       NA     2.86     2.00     1.67       NA
## 5  5     3.67     3.57       NA       NA     3.75     3.50       NA
## 6  6     2.29     2.17       NA     2.57       NA     3.00       NA
##   X4TCHCON X1TCHPER X2TCHPER X3TCHPER X4TCHPER X1TCHEXT X2TCHEXT X3TCHEXT
## 1     3.25      4.0      3.8       NA      3.4      1.2      1.6       NA
## 2     3.25      3.0      4.0       NA      3.2      1.2      1.0       NA
## 3     4.00       NA      4.0       NA      4.0       NA      1.8       NA
## 4     3.00      2.2      2.0       NA      2.6      3.2      3.0       NA
## 5       NA      3.8      4.0       NA       NA      2.2      1.2       NA
## 6     3.25       NA      2.2       NA      2.2       NA      1.8       NA
##   X4TCHEXT X1TCHINT X2TCHINT X3TCHINT X4TCHINT X1RTHET X2RTHET X3RTHET
## 1     2.83     1.25     1.25       NA     2.00  0.1930  1.7614      NA
## 2     1.33     1.75     1.25       NA     1.25 -0.7870  0.7452      NA
## 3     1.83       NA     2.00       NA     1.50      NA  0.3323      NA
## 4     1.83     1.50     2.25       NA     1.25 -1.7087 -0.2922      NA
## 5       NA     1.25     1.75       NA       NA -0.0734  0.8013      NA
## 6     1.67     2.00     2.00       NA     2.25 -1.4601 -0.6792      NA
##   X4RTHET X1MTHET X2MTHET X3MTHET X4MTHET X1BMI X2BMI X3BMI X4BMI X1PUBPRI
## 1  3.1523  0.5827  1.5384      NA  2.7286 16.40 16.32    NA 16.05        0
## 2  2.2023 -0.3473  0.8800      NA  2.0825 22.19 22.92    NA 22.84        0
## 3  1.1861      NA  1.0112      NA  2.2080    NA 15.10    NA 17.92       NA
## 4  0.8632 -2.1245 -0.2834      NA  1.1082 18.14 19.47    NA 21.71        0
## 5      NA -0.5545  0.6163      NA      NA 15.20 15.54    NA    NA        1
## 6  1.0132 -1.0616 -0.5379      NA  0.4484 17.28 17.19    NA 17.78        0
##   X2PUBPRI X3PUBPRI X4PUBPRI X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R
## 1        0       NA        0          1         0          0          0
## 2        0       NA        0          0         0          0          0
## 3        0       NA        0          1         1          0          0
## 4        0       NA        0          1         0          0          0
## 5        1       NA       NA          0         1          0          0
## 6        0       NA        0          0         0          0          0
##   X1_AMINAN_R X1_MULTR_R X1PAR1RAC
## 1           0          0         0
## 2           0          0         0
## 3           0          0        NA
## 4           0          0         0
## 5           0          0         0
## 6           0          0         0

Here is where I will use Amelia. Amelia is an R package developed by Gary King at Harvard University. If a person is missing a data point, Amelia uses information from that person’s variables (i.e. demographics, other test scores) to predict what that missing value would have been. Because there can be variability in inputting or predicting missing data values researchers like King recommend developing at least five data sets containing imputed values and then averaging them together (I average the results together at the end). To use amelia, you can set x to the data set of interest, m equal to the number of data sets that you want to impute and for non-continuous variables you can use the noms argument if you have nominal variables or you can use logs for count variables.

The summary function provides information on the algorithm’s convergence and the percentage of missing data per variable.

The compare density function compares the density of values for the self control, variable of interest over actual values. So this function predicts what the nonmissing values would have been using Amelia’s algorithm and then compares the density of a specified variable to the actual variable density. If the density’s match, then there is some evidence that the missing value algorithm is at least able to predict nonmissing values, which as close as we can get to estimating how well the algorithm would predict the missing values. I only focus on the dependent variable self control, since it is the most important variable and there is evidence that the density’s match.

The disperse function evaluates how quickly the chains converged or if they converged at all. Essentially, if the chains all converge on the same location the algorithm was able to adequately able to find a maximum likelihood (i.e. given the data find the parameter estimates that maximizes the probability that parameter estimates predict the actual values).

Finally, I wrioe the data sets to csv files, which will be loaded in again for further analyses.

library(Amelia)

## Loading required package: Rcpp

## ## 
## ## Amelia II: Multiple Imputation
## ## (Version 1.7.4, built: 2015-12-05)
## ## Copyright (C) 2005-2017 James Honaker, Gary King and Matthew Blackwell
## ## Refer to http://gking.harvard.edu/amelia/ for more information
## ##

library(mitools)
m = 5
a.out = amelia(x = data1, m=m, noms = c("X1PUBPRI", "X2PUBPRI", "X3PUBPRI", "X4PUBPRI", "X1_CHSEX_R",  "X1_HISP_R", "X1_BLACK_R", "X1_ASIAN_R", "X1_AMINAN_R", "X1_MULTR_R", "X1PAR1RAC"))

## -- Imputation 1 --
## 
##   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
##  21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
## 
## 
## -- Imputation 2 --
## 
##   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
##  21 22 23 24 25 26 27 28 29 30 31 32
## 
## -- Imputation 3 --
## 
##   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
##  21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
##  41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
## 
## -- Imputation 4 --
## 
##   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
##  21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
## 
## -- Imputation 5 --
## 
##   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
##  21 22 23 24 25 26 27 28

summary(a.out)

## 
## Amelia output with 5 imputed datasets.
## Return code:  1 
## Message:  Normal EM convergence. 
## 
## Chain Lengths:
## --------------
## Imputation 1:  40
## Imputation 2:  32
## Imputation 3:  55
## Imputation 4:  39
## Imputation 5:  28
## 
## Rows after Listwise Deletion:  2140 
## Rows after Imputation:  18174 
## Patterns of missingness in the data:  1528 
## 
## Fraction Missing for original variables: 
## -----------------------------------------
## 
##             Fraction Missing
## id               0.000000000
## X1TCHAPP         0.187300539
## X2TCHAPP         0.120831958
## X3TCHAPP         0.723671179
## X4TCHAPP         0.259986794
## X1TCHCON         0.254429405
## X2TCHCON         0.130846264
## X3TCHCON         0.743699791
## X4TCHCON         0.273577638
## X1TCHPER         0.245735666
## X2TCHPER         0.130681193
## X3TCHPER         0.740068229
## X4TCHPER         0.268845604
## X1TCHEXT         0.208484648
## X2TCHEXT         0.124958732
## X3TCHEXT         0.726862551
## X4TCHEXT         0.262793001
## X1TCHINT         0.216518103
## X2TCHINT         0.127049631
## X3TCHINT         0.733245295
## X4TCHINT         0.267855178
## X1RTHET          0.137834269
## X2RTHET          0.054418400
## X3RTHET          0.714207109
## X4RTHET          0.168317376
## X1MTHET          0.141906020
## X2MTHET          0.056729394
## X3MTHET          0.712666447
## X4MTHET          0.168977660
## X1BMI            0.136018488
## X2BMI            0.055683944
## X3BMI            0.712666447
## X4BMI            0.171288654
## X1PUBPRI         0.094090459
## X2PUBPRI         0.019478376
## X3PUBPRI         0.705953560
## X4PUBPRI         0.157752834
## X1_CHSEX_R       0.002145923
## X1_HISP_R        0.069219764
## X1_BLACK_R       0.069219764
## X1_ASIAN_R       0.069219764
## X1_AMINAN_R      0.069219764
## X1_MULTR_R       0.069219764
## X1PAR1RAC        0.268405414

compare.density(a.out, var = "X1PUBPRI", main = "Observed and Imputed values of Private Time 1")

compare.density(a.out, var = "X2PUBPRI", main = "Observed and Imputed values of Private Time 2")

compare.density(a.out, var = "X3PUBPRI", main = "Observed and Imputed values of Private Time 3")

compare.density(a.out, var = "X4PUBPRI", main = "Observed and Imputed values of Private Time 4")

compare.density(a.out, var = "X1TCHCON", main = "Observed and Imputed values of Self Control Time 1")

compare.density(a.out, var = "X2TCHCON", main = "Observed and Imputed values of Self Control Time 2")

compare.density(a.out, var = "X3TCHCON", main = "Observed and Imputed values of Self Control Time 3")

compare.density(a.out, var = "X4TCHCON", main = "Observed and Imputed values of Self Control Time 4")

overimpute(a.out, var = "X1TCHCON")

overimpute(a.out, var = "X2TCHCON")

overimpute(a.out, var = "X3TCHCON")

overimpute(a.out, var = "X4TCHCON")

disperse(a.out, dims = 1, m = 5)

write.amelia(obj = a.out, file.stem = "ECLSK")
head(data1)

##   id X1TCHAPP X2TCHAPP X3TCHAPP X4TCHAPP X1TCHCON X2TCHCON X3TCHCON
## 1  1     3.57     3.00       NA     3.29     4.00     3.25       NA
## 2  2     2.43     3.71       NA     2.71     2.67     4.00       NA
## 3  3       NA     3.00       NA     2.43       NA     2.75       NA
## 4  4     2.14     2.00       NA     2.86     2.00     1.67       NA
## 5  5     3.67     3.57       NA       NA     3.75     3.50       NA
## 6  6     2.29     2.17       NA     2.57       NA     3.00       NA
##   X4TCHCON X1TCHPER X2TCHPER X3TCHPER X4TCHPER X1TCHEXT X2TCHEXT X3TCHEXT
## 1     3.25      4.0      3.8       NA      3.4      1.2      1.6       NA
## 2     3.25      3.0      4.0       NA      3.2      1.2      1.0       NA
## 3     4.00       NA      4.0       NA      4.0       NA      1.8       NA
## 4     3.00      2.2      2.0       NA      2.6      3.2      3.0       NA
## 5       NA      3.8      4.0       NA       NA      2.2      1.2       NA
## 6     3.25       NA      2.2       NA      2.2       NA      1.8       NA
##   X4TCHEXT X1TCHINT X2TCHINT X3TCHINT X4TCHINT X1RTHET X2RTHET X3RTHET
## 1     2.83     1.25     1.25       NA     2.00  0.1930  1.7614      NA
## 2     1.33     1.75     1.25       NA     1.25 -0.7870  0.7452      NA
## 3     1.83       NA     2.00       NA     1.50      NA  0.3323      NA
## 4     1.83     1.50     2.25       NA     1.25 -1.7087 -0.2922      NA
## 5       NA     1.25     1.75       NA       NA -0.0734  0.8013      NA
## 6     1.67     2.00     2.00       NA     2.25 -1.4601 -0.6792      NA
##   X4RTHET X1MTHET X2MTHET X3MTHET X4MTHET X1BMI X2BMI X3BMI X4BMI X1PUBPRI
## 1  3.1523  0.5827  1.5384      NA  2.7286 16.40 16.32    NA 16.05        0
## 2  2.2023 -0.3473  0.8800      NA  2.0825 22.19 22.92    NA 22.84        0
## 3  1.1861      NA  1.0112      NA  2.2080    NA 15.10    NA 17.92       NA
## 4  0.8632 -2.1245 -0.2834      NA  1.1082 18.14 19.47    NA 21.71        0
## 5      NA -0.5545  0.6163      NA      NA 15.20 15.54    NA    NA        1
## 6  1.0132 -1.0616 -0.5379      NA  0.4484 17.28 17.19    NA 17.78        0
##   X2PUBPRI X3PUBPRI X4PUBPRI X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R
## 1        0       NA        0          1         0          0          0
## 2        0       NA        0          0         0          0          0
## 3        0       NA        0          1         1          0          0
## 4        0       NA        0          1         0          0          0
## 5        1       NA       NA          0         1          0          0
## 6        0       NA        0          0         0          0          0
##   X1_AMINAN_R X1_MULTR_R X1PAR1RAC
## 1           0          0         0
## 2           0          0         0
## 3           0          0        NA
## 4           0          0         0
## 5           0          0         0
## 6           0          0         0

Here I am reading back in the imputed data sets. Then I am creating data sets with only the fall 2010 (i.e. variables with 1’s in the beginning of the variable’s name) for each of the imputed data sets. I am only gathering the first time points for each variable for two reasons. First, my analysis is an intent to treat (ITT). ITT in this context means that I am focusing on students starting in the treatment (i.e. an alternative to public school) and not worrying about if they stay in that school. Given that I have no control over whether students switch or move between schools focusing on students starting in alternative to regular public schools makes sense. Second, researchers such as Gary King who developed a propensity score matching software have recommended that when using propensity score matching people, students in this case, are matched on pretreatment covariates. The Fall 2010 is at the beginning of kindergarten therefore it is as close to pretreatment as we can get with this data set. It makes sense to match students on pretreatment variables, because we want to make sure they start off being similar and we expect that changes, some due to the dependent variable of interest will alter how similar they are over time.

In the data sets below, the first number in the variable name means that it only contains the first time points for each variable and the second number in each variable name indicates which imputed data set within the first variable data set that I am referring to. For example, ECLSK11 is the dataset with only the first time points for each variable with the first imputed data set.

setwd("~/Google Drive/PARCS/Projects/PropScore/Data")
ECLSK1  = read.csv("ECLSK1.csv", header = TRUE)
ECLSK1 = na.omit(ECLSK1)
ECLSK1 = as.data.frame(ECLSK1)

ECLSK2  = read.csv("ECLSK2.csv", header = TRUE)
ECLSK2 = na.omit(ECLSK2)
ECLSK2 = as.data.frame(ECLSK2)

ECLSK3  = read.csv("ECLSK3.csv", header = TRUE)
ECLSK3 = na.omit(ECLSK3)
ECLSK3 = as.data.frame(ECLSK3)

ECLSK4  = read.csv("ECLSK4.csv", header = TRUE)
ECLSK4 = na.omit(ECLSK4)
ECLSK4 = as.data.frame(ECLSK4)

ECLSK5  = read.csv("ECLSK5.csv", header = TRUE)
ECLSK5 = na.omit(ECLSK5)
ECLSK5 = as.data.frame(ECLSK5)

ECLSK11 = cbind(id = ECLSK1$id, X1TCHAPP= ECLSK1$X1TCHAPP, X1TCHCON =ECLSK1$X1TCHCON, X1TCHPER=ECLSK1$X1TCHPER, X1TCHEXT=ECLSK1$X1TCHEXT, X1TCHINT=ECLSK1$X1TCHINT, X1RTHET=ECLSK1$X1RTHET, X1MTHET=ECLSK1$X1MTHET, X1BMI=ECLSK1$X1BMI, X1PUBPRI=ECLSK1$X1PUBPRI, X1_CHSEX_R = ECLSK1$X1_CHSEX_R, X1_HISP_R = ECLSK1$X1_HISP_R, X1_BLACK_R = ECLSK1$X1_BLACK_R, X1_ASIAN_R = ECLSK1$X1_ASIAN_R, X1_AMINAN_R = ECLSK1$X1_AMINAN_R, X1_MULTR_R = ECLSK1$X1_MULTR_R, X1PAR1RAC = ECLSK1$X1PAR1RAC); 

ECLSK12 = cbind(id = ECLSK2$id, X1TCHAPP= ECLSK2$X1TCHAPP, X1TCHCON =ECLSK2$X1TCHCON, X1TCHPER=ECLSK2$X1TCHPER, X1TCHEXT=ECLSK2$X1TCHEXT, X1TCHINT=ECLSK2$X1TCHINT, X1RTHET=ECLSK2$X1RTHET, X1MTHET=ECLSK2$X1MTHET, X1BMI=ECLSK2$X1BMI, X1PUBPRI=ECLSK2$X1PUBPRI, X1_CHSEX_R = ECLSK2$X1_CHSEX_R, X1_HISP_R = ECLSK2$X1_HISP_R, X1_BLACK_R = ECLSK2$X1_BLACK_R, X1_ASIAN_R = ECLSK2$X1_ASIAN_R, X1_AMINAN_R = ECLSK2$X1_AMINAN_R, X1_MULTR_R = ECLSK2$X1_MULTR_R, X1PAR1RAC = ECLSK2$X1PAR1RAC); 

ECLSK13 = cbind(id = ECLSK3$id, X1TCHAPP= ECLSK3$X1TCHAPP, X1TCHCON =ECLSK3$X1TCHCON, X1TCHPER=ECLSK3$X1TCHPER, X1TCHEXT=ECLSK3$X1TCHEXT, X1TCHINT=ECLSK3$X1TCHINT, X1RTHET=ECLSK3$X1RTHET, X1MTHET=ECLSK3$X1MTHET, X1BMI=ECLSK3$X1BMI, X1PUBPRI=ECLSK3$X1PUBPRI, X1_CHSEX_R = ECLSK3$X1_CHSEX_R, X1_HISP_R = ECLSK3$X1_HISP_R, X1_BLACK_R = ECLSK3$X1_BLACK_R, X1_ASIAN_R = ECLSK3$X1_ASIAN_R, X1_AMINAN_R = ECLSK3$X1_AMINAN_R, X1_MULTR_R = ECLSK3$X1_MULTR_R, X1PAR1RAC = ECLSK3$X1PAR1RAC); 

ECLSK14 = cbind(id = ECLSK4$id, X1TCHAPP= ECLSK4$X1TCHAPP, X1TCHCON =ECLSK4$X1TCHCON, X1TCHPER=ECLSK4$X1TCHPER, X1TCHEXT=ECLSK4$X1TCHEXT, X1TCHINT=ECLSK4$X1TCHINT, X1RTHET=ECLSK4$X1RTHET, X1MTHET=ECLSK4$X1MTHET, X1BMI=ECLSK4$X1BMI, X1PUBPRI=ECLSK4$X1PUBPRI, X1_CHSEX_R = ECLSK4$X1_CHSEX_R, X1_HISP_R = ECLSK4$X1_HISP_R, X1_BLACK_R = ECLSK4$X1_BLACK_R, X1_ASIAN_R = ECLSK4$X1_ASIAN_R, X1_AMINAN_R = ECLSK4$X1_AMINAN_R, X1_MULTR_R = ECLSK4$X1_MULTR_R, X1PAR1RAC = ECLSK4$X1PAR1RAC); 

ECLSK15 = cbind(id = ECLSK5$id, X1TCHAPP= ECLSK5$X1TCHAPP, X1TCHCON =ECLSK5$X1TCHCON, X1TCHPER=ECLSK5$X1TCHPER, X1TCHEXT=ECLSK5$X1TCHEXT, X1TCHINT=ECLSK5$X1TCHINT, X1RTHET=ECLSK5$X1RTHET, X1MTHET=ECLSK5$X1MTHET, X1BMI=ECLSK5$X1BMI, X1PUBPRI=ECLSK5$X1PUBPRI, X1_CHSEX_R = ECLSK5$X1_CHSEX_R, X1_HISP_R = ECLSK5$X1_HISP_R, X1_BLACK_R = ECLSK5$X1_BLACK_R, X1_ASIAN_R = ECLSK5$X1_ASIAN_R, X1_AMINAN_R = ECLSK5$X1_AMINAN_R, X1_MULTR_R = ECLSK5$X1_MULTR_R, X1PAR1RAC = ECLSK5$X1PAR1RAC);

Now I need to get rid of the first time points for each variable in the original imputed data sets, because I want to combine the students who are matched on their first time points with their data from other time points later on and not create duplicates.

ECLSK1$X1TCHAPP = ECLSK1$X1TCHCON = ECLSK1$X1TCHPER = ECLSK1$X1TCHEXT= ECLSK1$X1TCHINT= ECLSK1$X1RTHET= ECLSK1$X1MTHET= ECLSK1$X1BMI= ECLSK1$X1PUBPRI = ECLSK1$X1_CHSEX_R= ECLSK1$X1_HISP_R =ECLSK1$X1_BLACK_R=ECLSK1$X1_ASIAN_R=ECLSK1$X1_AMINAN_R=ECLSK1$X1_MULTR_R=ECLSK1$X1PAR1RAC= NULL; head(ECLSK1)

##   X id X2TCHAPP X3TCHAPP X4TCHAPP X2TCHCON X3TCHCON X4TCHCON X2TCHPER
## 1 1  1     3.00 2.782508 3.290000     3.25 2.220770 3.250000      3.8
## 2 2  2     3.71 3.672679 2.710000     4.00 3.948725 3.250000      4.0
## 3 3  3     3.00 2.258966 2.430000     2.75 2.896026 4.000000      4.0
## 4 4  4     2.00 1.935345 2.860000     1.67 2.249771 3.000000      2.0
## 5 5  5     3.57 3.663384 3.481786     3.50 3.990938 3.466804      4.0
## 6 6  6     2.17 2.892432 2.570000     3.00 4.029104 3.250000      2.2
##   X3TCHPER X4TCHPER X2TCHEXT X3TCHEXT  X4TCHEXT X2TCHINT X3TCHINT X4TCHINT
## 1 3.043015 3.400000      1.6 2.095746 2.8300000     1.25 1.901459 2.000000
## 2 3.676641 3.200000      1.0 1.050640 1.3300000     1.25 1.097103 1.250000
## 3 3.644675 4.000000      1.8 2.440212 1.8300000     2.00 2.127760 1.500000
## 4 2.250536 2.600000      3.0 2.855797 1.8300000     2.25 1.215868 1.250000
## 5 3.315024 3.115097      1.2 1.055278 0.9094211     1.75 1.061368 0.734857
## 6 2.870739 2.200000      1.8 1.790236 1.6700000     2.00 1.721055 2.250000
##   X2RTHET    X3RTHET  X4RTHET X2MTHET     X3MTHET  X4MTHET X2BMI    X3BMI
## 1  1.7614 2.34962926 3.152300  1.5384  2.53200533 2.728600 16.32 16.93500
## 2  0.7452 0.89808379 2.202300  0.8800  1.00809281 2.082500 22.92 24.18920
## 3  0.3323 0.88050961 1.186100  1.0112  0.55828675 2.208000 15.10 17.81556
## 4 -0.2922 0.12163952 0.863200 -0.2834  0.10080883 1.108200 19.47 20.79843
## 5  0.8013 0.70428075 1.439961  0.6163 -0.03791495 1.219532 15.54 14.73541
## 6 -0.6792 0.06192273 1.013200 -0.5379 -0.09088203 0.448400 17.19 17.30015
##      X4BMI X2PUBPRI X3PUBPRI X4PUBPRI
## 1 16.05000        0        0        0
## 2 22.84000        0        0        0
## 3 17.92000        0        0        0
## 4 21.71000        0        0        0
## 5 13.88093        1        1        1
## 6 17.78000        0        0        0

ECLSK2$X1TCHAPP = ECLSK2$X1TCHCON = ECLSK2$X1TCHPER = ECLSK2$X1TCHEXT= ECLSK2$X1TCHINT= ECLSK2$X1RTHET= ECLSK2$X1MTHET= ECLSK2$X1BMI= ECLSK2$X1PUBPRI = ECLSK2$X1_CHSEX_R= ECLSK2$X1_HISP_R =ECLSK2$X1_BLACK_R=ECLSK2$X1_ASIAN_R=ECLSK2$X1_AMINAN_R=ECLSK2$X1_MULTR_R=ECLSK2$X1PAR1RAC= NULL; head(ECLSK2)

##   X id X2TCHAPP X3TCHAPP X4TCHAPP X2TCHCON X3TCHCON X4TCHCON X2TCHPER
## 1 1  1     3.00 3.497192 3.290000     3.25 3.470739 3.250000      3.8
## 2 2  2     3.71 3.244063 2.710000     4.00 3.863807 3.250000      4.0
## 3 3  3     3.00 2.328649 2.430000     2.75 3.394821 4.000000      4.0
## 4 4  4     2.00 2.429293 2.860000     1.67 2.968973 3.000000      2.0
## 5 5  5     3.57 3.254047 3.110572     3.50 3.667085 3.237876      4.0
## 6 6  6     2.17 3.262664 2.570000     3.00 2.722379 3.250000      2.2
##   X3TCHPER X4TCHPER X2TCHEXT  X3TCHEXT X4TCHEXT X2TCHINT X3TCHINT X4TCHINT
## 1 3.003929 3.400000      1.6 2.1034365  2.83000     1.25 2.067914 2.000000
## 2 3.100503 3.200000      1.0 0.8209779  1.33000     1.25 1.676324 1.250000
## 3 3.487341 4.000000      1.8 2.3005313  1.83000     2.00 1.560177 1.500000
## 4 2.047030 2.600000      3.0 2.1689861  1.83000     2.25 1.473268 1.250000
## 5 3.553475 3.227734      1.2 1.8880763  1.67663     1.75 1.609909 2.134583
## 6 3.681915 2.200000      1.8 2.1602021  1.67000     2.00 1.168969 2.250000
##   X2RTHET     X3RTHET  X4RTHET X2MTHET    X3MTHET  X4MTHET X2BMI    X3BMI
## 1  1.7614  2.60438140 3.152300  1.5384  1.9901336 2.728600 16.32 15.77690
## 2  0.7452  1.28017953 2.202300  0.8800  0.5494903 2.082500 22.92 23.06920
## 3  0.3323  0.84172330 1.186100  1.0112  1.1208704 2.208000 15.10 15.75879
## 4 -0.2922 -0.06325954 0.863200 -0.2834 -0.4607727 1.108200 19.47 19.25400
## 5  0.8013  1.38699376 1.517941  0.6163  1.4138479 2.080058 15.54 15.65149
## 6 -0.6792  0.45592617 1.013200 -0.5379  0.2225133 0.448400 17.19 17.82710
##      X4BMI X2PUBPRI X3PUBPRI X4PUBPRI
## 1 16.05000        0        0        0
## 2 22.84000        0        0        0
## 3 17.92000        0        0        0
## 4 21.71000        0        0        0
## 5 14.54577        1        1        1
## 6 17.78000        0        0        0

ECLSK3$X1TCHAPP = ECLSK3$X1TCHCON = ECLSK3$X1TCHPER = ECLSK3$X1TCHEXT= ECLSK3$X1TCHINT= ECLSK3$X1RTHET= ECLSK3$X1MTHET= ECLSK3$X1BMI= ECLSK3$X1PUBPRI= ECLSK3$X1_CHSEX_R= ECLSK3$X1_HISP_R =ECLSK3$X1_BLACK_R=ECLSK3$X1_ASIAN_R=ECLSK3$X1_AMINAN_R=ECLSK3$X1_MULTR_R=ECLSK3$X1PAR1RAC= NULL; head(ECLSK3)

##   X id X2TCHAPP X3TCHAPP X4TCHAPP X2TCHCON X3TCHCON X4TCHCON X2TCHPER
## 1 1  1     3.00 3.221424 3.290000     3.25 3.180986 3.250000      3.8
## 2 2  2     3.71 2.867293 2.710000     4.00 3.603820 3.250000      4.0
## 3 3  3     3.00 3.192093 2.430000     2.75 2.969523 4.000000      4.0
## 4 4  4     2.00 2.905346 2.860000     1.67 2.892274 3.000000      2.0
## 5 5  5     3.57 3.205616 3.772295     3.50 3.941475 3.868375      4.0
## 6 6  6     2.17 2.576567 2.570000     3.00 2.971302 3.250000      2.2
##   X3TCHPER X4TCHPER X2TCHEXT  X3TCHEXT  X4TCHEXT X2TCHINT  X3TCHINT
## 1 3.062670 3.400000      1.6 1.8886239 2.8300000     1.25 1.5117094
## 2 3.086080 3.200000      1.0 0.6608571 1.3300000     1.25 0.7402232
## 3 3.522695 4.000000      1.8 1.3823443 1.8300000     2.00 1.3416348
## 4 2.650058 2.600000      3.0 1.8689814 1.8300000     2.25 1.9985187
## 5 4.105735 4.277933      1.2 0.2449470 0.5309798     1.75 1.3785064
## 6 2.698337 2.200000      1.8 1.0508469 1.6700000     2.00 2.2233617
##    X4TCHINT X2RTHET    X3RTHET  X4RTHET X2MTHET      X3MTHET  X4MTHET
## 1 2.0000000  1.7614  1.7862784 3.152300  1.5384  1.862279638 2.728600
## 2 1.2500000  0.7452  1.2005340 2.202300  0.8800  1.090047932 2.082500
## 3 1.5000000  0.3323  0.6931774 1.186100  1.0112  1.041094428 2.208000
## 4 1.2500000 -0.2922 -0.2661892 0.863200 -0.2834 -0.173559699 1.108200
## 5 0.9101853  0.8013  0.7613987 1.038999  0.6163  0.999369515 1.889081
## 6 2.2500000 -0.6792 -0.5622996 1.013200 -0.5379  0.001488898 0.448400
##   X2BMI    X3BMI    X4BMI X2PUBPRI X3PUBPRI X4PUBPRI
## 1 16.32 15.67643 16.05000        0        0        0
## 2 22.92 21.26740 22.84000        0        0        0
## 3 15.10 15.56543 17.92000        0        0        0
## 4 19.47 18.60358 21.71000        0        0        0
## 5 15.54 16.53990 17.26113        1        1        0
## 6 17.19 17.05379 17.78000        0        0        0

ECLSK4$X1TCHAPP = ECLSK4$X1TCHCON = ECLSK4$X1TCHPER = ECLSK4$X1TCHEXT= ECLSK4$X1TCHINT= ECLSK4$X1RTHET= ECLSK4$X1MTHET= ECLSK4$X1BMI= ECLSK4$X1PUBPRI = ECLSK4$X1_CHSEX_R= ECLSK4$X1_HISP_R =ECLSK4$X1_BLACK_R=ECLSK4$X1_ASIAN_R=ECLSK4$X1_AMINAN_R=ECLSK4$X1_MULTR_R=ECLSK4$X1PAR1RAC=NULL; head(ECLSK4)

##   X id X2TCHAPP X3TCHAPP X4TCHAPP X2TCHCON X3TCHCON X4TCHCON X2TCHPER
## 1 1  1     3.00 3.271957 3.290000     3.25 2.973837 3.250000      3.8
## 2 2  2     3.71 3.166463 2.710000     4.00 3.464733 3.250000      4.0
## 3 3  3     3.00 2.906256 2.430000     2.75 3.277272 4.000000      4.0
## 4 4  4     2.00 2.203293 2.860000     1.67 2.163832 3.000000      2.0
## 5 5  5     3.57 3.127168 4.242666     3.50 3.607413 3.669118      4.0
## 6 6  6     2.17 2.704005 2.570000     3.00 3.336438 3.250000      2.2
##   X3TCHPER X4TCHPER X2TCHEXT  X3TCHEXT X4TCHEXT X2TCHINT  X3TCHINT
## 1 2.966093 3.400000      1.6 2.3570533  2.83000     1.25 1.4808616
## 2 2.952690 3.200000      1.0 0.5002567  1.33000     1.25 2.1758779
## 3 3.865213 4.000000      1.8 1.0761579  1.83000     2.00 1.6371609
## 4 2.346653 2.600000      3.0 2.5156549  1.83000     2.25 0.5709399
## 5 4.042179 4.043726      1.2 1.4677999  1.19443     1.75 1.6135397
## 6 2.460741 2.200000      1.8 1.5606344  1.67000     2.00 1.5179091
##   X4TCHINT X2RTHET     X3RTHET  X4RTHET X2MTHET     X3MTHET  X4MTHET X2BMI
## 1 2.000000  1.7614  2.69406629 3.152300  1.5384  2.50483671 2.728600 16.32
## 2 1.250000  0.7452  0.73749908 2.202300  0.8800  0.46173896 2.082500 22.92
## 3 1.500000  0.3323  0.95565927 1.186100  1.0112  1.36679864 2.208000 15.10
## 4 1.250000 -0.2922 -0.04272936 0.863200 -0.2834 -0.25960270 1.108200 19.47
## 5 1.734666  0.8013  1.37818565 1.737159  0.6163  1.43888270 2.036757 15.54
## 6 2.250000 -0.6792  0.29472602 1.013200 -0.5379  0.08945304 0.448400 17.19
##      X3BMI   X4BMI X2PUBPRI X3PUBPRI X4PUBPRI
## 1 16.36377 16.0500        0        0        0
## 2 21.00701 22.8400        0        0        0
## 3 18.04266 17.9200        0        0        0
## 4 20.94828 21.7100        0        0        0
## 5 15.73721 15.3375        1        1        1
## 6 18.95435 17.7800        0        0        0

ECLSK5$X1TCHAPP = ECLSK5$X1TCHCON = ECLSK5$X1TCHPER = ECLSK5$X1TCHEXT= ECLSK5$X1TCHINT= ECLSK5$X1RTHET= ECLSK5$X1MTHET= ECLSK5$X1BMI= ECLSK5$X1PUBPRI = ECLSK5$X1_CHSEX_R= ECLSK5$X1_HISP_R =ECLSK5$X1_BLACK_R=ECLSK5$X1_ASIAN_R=ECLSK5$X1_AMINAN_R=ECLSK5$X1_MULTR_R=ECLSK5$X1PAR1RAC= NULL; head(ECLSK5)

##   X id X2TCHAPP X3TCHAPP X4TCHAPP X2TCHCON X3TCHCON X4TCHCON X2TCHPER
## 1 1  1     3.00 3.019350 3.290000     3.25 3.246884 3.250000      3.8
## 2 2  2     3.71 2.720240 2.710000     4.00 3.896280 3.250000      4.0
## 3 3  3     3.00 3.141751 2.430000     2.75 3.331117 4.000000      4.0
## 4 4  4     2.00 3.136820 2.860000     1.67 3.363474 3.000000      2.0
## 5 5  5     3.57 3.789130 3.146126     3.50 4.046981 3.031817      4.0
## 6 6  6     2.17 2.780977 2.570000     3.00 2.825802 3.250000      2.2
##   X3TCHPER X4TCHPER X2TCHEXT  X3TCHEXT X4TCHEXT X2TCHINT  X3TCHINT
## 1 3.643697  3.40000      1.6 2.2581823 2.830000     1.25 1.2144426
## 2 3.568072  3.20000      1.0 1.4309279 1.330000     1.25 1.2082418
## 3 3.252823  4.00000      1.8 2.1677423 1.830000     2.00 1.8363239
## 4 3.228493  2.60000      3.0 1.6342253 1.830000     2.25 1.1509811
## 5 3.465395  3.11138      1.2 0.9275937 1.713889     1.75 0.5814109
## 6 2.581877  2.20000      1.8 2.2122015 1.670000     2.00 2.1347089
##   X4TCHINT X2RTHET     X3RTHET X4RTHET X2MTHET     X3MTHET X4MTHET X2BMI
## 1 2.000000  1.7614  1.82368436 3.15230  1.5384  2.17363138 2.72860 16.32
## 2 1.250000  0.7452  0.61969215 2.20230  0.8800  1.02563425 2.08250 22.92
## 3 1.500000  0.3323  0.89781515 1.18610  1.0112  1.49360487 2.20800 15.10
## 4 1.250000 -0.2922 -0.09669842 0.86320 -0.2834 -0.08906052 1.10820 19.47
## 5 1.557898  0.8013  1.68890642 2.50197  0.6163  1.09898997 1.58664 15.54
## 6 2.250000 -0.6792  0.22126193 1.01320 -0.5379  0.44815971 0.44840 17.19
##      X3BMI    X4BMI X2PUBPRI X3PUBPRI X4PUBPRI
## 1 16.82059 16.05000        0        0        0
## 2 22.24472 22.84000        0        0        0
## 3 15.24411 17.92000        0        0        0
## 4 19.62823 21.71000        0        0        0
## 5 17.17621 15.58692        1        1        1
## 6 17.43088 17.78000        0        0        0

Now we are analyzing the data with only the first time points for each student. We are matching students on their fall 2010 characteristics included in the model below which are creating scores for how likely a student is to fall into the treatment category. Then matchit finds those students in the control group that have the highest likelihood to be in the treatment (i.e. highest propensity scores) and matches them with a student with a similar propensity score in the treatment group. Propensity score matching allows us to say that students in the treatment and control are at least similar on the included pretreatment covariates, because matchit will select students for a final data set that similar on characteristics. In this model, I am using the nearest neighbor algorithm where I am matching each student from the treatment (i.e. alternative to assigned public school) to a control student (i.e. assigned public school). Then, for the first data, remember there are five imputed datasets, I examine the jitter plot which shows the distribution of matched students in the treatment and control. In this example the two almost exactly align providing a good sign that I have students that are very similar on the included covariates. Next I look at a histogram of the matched students from the treatment and control groups and again the distributions of propensity scores seems almost identical providing more evidence that students in the treatment and control are very similar on included covariates.

Then I need to compile a dataset with the weights, which include whether a student was matched or not, the treatment, which is whether or not the student is in an alternative to assigned public school or not, and only select those students who were matched (i.e. if they have a 1 on the weight variable).

library(MatchIt)
ECLSK11 = as.data.frame(ECLSK11)
m.out1 = matchit(X1PUBPRI ~ X1TCHAPP + X1TCHCON + X1TCHPER + X1TCHEXT + X1TCHINT + X1RTHET + X1MTHET + X1BMI +X1_CHSEX_R + X1_HISP_R + X1_BLACK_R + X1_ASIAN_R + X1_AMINAN_R + X1_MULTR_R + X1PAR1RAC, data = ECLSK11, method = "nearest", ratio = 1)

#plot(m.out1, type = "jitter")
#plot(m.out1, type = "hist")


m.out1Data = as.data.frame(cbind(m.out1$X, weights = m.out1$weights, X1PUBPRI = m.out1$treat))
m.out1Data = m.out1Data[which(m.out1Data$weights == 1),]
head(m.out1Data)

##    X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT X1RTHET X1MTHET X1BMI
## 5      3.67     3.75      3.8      2.2     1.25 -0.0734 -0.5545 15.20
## 12     2.57     2.25      2.4      2.6     1.75 -0.3999 -0.2823 17.39
## 13     3.86     4.00      3.6      1.0     2.25  0.3953  0.3557 17.07
## 23     3.00     3.00      2.8      1.8     1.25  0.8916  1.2520 14.34
## 29     2.86     3.00      3.2      2.0     1.00 -0.0284  0.2346 15.28
## 30     4.00     3.00      3.6      1.4     1.25 -0.3733  0.3625 16.07
##    X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R X1_MULTR_R
## 5           0         1          0          0           0          0
## 12          1         0          1          0           0          0
## 13          1         0          0          0           0          0
## 23          1         0          0          0           0          0
## 29          0         0          0          0           0          0
## 30          0         0          0          0           0          0
##    X1PAR1RAC weights X1PUBPRI
## 5          0       1        1
## 12         1       1        0
## 13         0       1        1
## 23         0       1        1
## 29         0       1        0
## 30         0       1        1

write.csv(m.out1Data, "m.out1Data.csv")
m.out1Data = read.csv("m.out1Data.csv", header = TRUE)
head(m.out1Data)

##    X X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT X1RTHET X1MTHET X1BMI
## 1  5     3.67     3.75      3.8      2.2     1.25 -0.0734 -0.5545 15.20
## 2 12     2.57     2.25      2.4      2.6     1.75 -0.3999 -0.2823 17.39
## 3 13     3.86     4.00      3.6      1.0     2.25  0.3953  0.3557 17.07
## 4 23     3.00     3.00      2.8      1.8     1.25  0.8916  1.2520 14.34
## 5 29     2.86     3.00      3.2      2.0     1.00 -0.0284  0.2346 15.28
## 6 30     4.00     3.00      3.6      1.4     1.25 -0.3733  0.3625 16.07
##   X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R X1_MULTR_R
## 1          0         1          0          0           0          0
## 2          1         0          1          0           0          0
## 3          1         0          0          0           0          0
## 4          1         0          0          0           0          0
## 5          0         0          0          0           0          0
## 6          0         0          0          0           0          0
##   X1PAR1RAC weights X1PUBPRI
## 1         0       1        1
## 2         1       1        0
## 3         0       1        1
## 4         0       1        1
## 5         0       1        0
## 6         0       1        1

colnames(m.out1Data)[1] = c("id")

Repeat the process above for data set two.

ECLSK12 = as.data.frame(ECLSK12)

m.out2 = matchit(X1PUBPRI ~ X1TCHAPP + X1TCHCON + X1TCHPER + X1TCHEXT + X1TCHINT + X1RTHET + X1MTHET + X1BMI +X1_CHSEX_R + X1_HISP_R + X1_BLACK_R + X1_ASIAN_R + X1_AMINAN_R + X1_MULTR_R + X1PAR1RAC, data = ECLSK12, method = "nearest", ratio = 1)


m.out2Data = as.data.frame(cbind(m.out2$X, weights = m.out2$weights, X1PUBPRI = m.out2$treat))
m.out2Data = m.out2Data[which(m.out2Data$weights == 1),]
head(m.out2Data)

##    X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT X1RTHET X1MTHET X1BMI
## 5      3.67     3.75      3.8      2.2     1.25 -0.0734 -0.5545 15.20
## 12     2.57     2.25      2.4      2.6     1.75 -0.3999 -0.2823 17.39
## 13     3.86     4.00      3.6      1.0     2.25  0.3953  0.3557 17.07
## 23     3.00     3.00      2.8      1.8     1.25  0.8916  1.2520 14.34
## 30     4.00     3.00      3.6      1.4     1.25 -0.3733  0.3625 16.07
## 32     3.71     3.25      3.8      1.8     1.25  0.0991  0.4504 16.51
##    X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R X1_MULTR_R
## 5           0         1          0          0           0          0
## 12          1         0          1          0           0          0
## 13          1         0          0          0           0          0
## 23          1         0          0          0           0          0
## 30          0         0          0          0           0          0
## 32          0         0          0          0           0          0
##    X1PAR1RAC weights X1PUBPRI
## 5          0       1        1
## 12         1       1        0
## 13         0       1        1
## 23         0       1        1
## 30         0       1        1
## 32         0       1        0

write.csv(m.out2Data, "m.out2Data.csv")
m.out2Data = read.csv("m.out2Data.csv", header = TRUE)
head(m.out2Data)

##    X X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT X1RTHET X1MTHET X1BMI
## 1  5     3.67     3.75      3.8      2.2     1.25 -0.0734 -0.5545 15.20
## 2 12     2.57     2.25      2.4      2.6     1.75 -0.3999 -0.2823 17.39
## 3 13     3.86     4.00      3.6      1.0     2.25  0.3953  0.3557 17.07
## 4 23     3.00     3.00      2.8      1.8     1.25  0.8916  1.2520 14.34
## 5 30     4.00     3.00      3.6      1.4     1.25 -0.3733  0.3625 16.07
## 6 32     3.71     3.25      3.8      1.8     1.25  0.0991  0.4504 16.51
##   X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R X1_MULTR_R
## 1          0         1          0          0           0          0
## 2          1         0          1          0           0          0
## 3          1         0          0          0           0          0
## 4          1         0          0          0           0          0
## 5          0         0          0          0           0          0
## 6          0         0          0          0           0          0
##   X1PAR1RAC weights X1PUBPRI
## 1         0       1        1
## 2         1       1        0
## 3         0       1        1
## 4         0       1        1
## 5         0       1        1
## 6         0       1        0

colnames(m.out2Data)[1] = c("id")

Repeat the process above for data set three.

ECLSK13 = as.data.frame(ECLSK13)

m.out3 = matchit(X1PUBPRI ~ X1TCHAPP + X1TCHCON + X1TCHPER + X1TCHEXT + X1TCHINT + X1RTHET + X1MTHET + X1BMI +X1_CHSEX_R + X1_HISP_R + X1_BLACK_R + X1_ASIAN_R + X1_AMINAN_R + X1_MULTR_R + X1PAR1RAC, data = ECLSK13, method = "nearest", ratio = 1)


m.out3Data = as.data.frame(cbind(m.out3$X, weights = m.out3$weights, X1PUBPRI = m.out3$treat))
m.out3Data = m.out3Data[which(m.out3Data$weights == 1),]
head(m.out3Data)

##    X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT X1RTHET X1MTHET X1BMI
## 5      3.67     3.75      3.8      2.2     1.25 -0.0734 -0.5545 15.20
## 12     2.57     2.25      2.4      2.6     1.75 -0.3999 -0.2823 17.39
## 13     3.86     4.00      3.6      1.0     2.25  0.3953  0.3557 17.07
## 20     3.00     3.33      3.4      1.8     1.50 -0.4954 -0.0743 19.16
## 23     3.00     3.00      2.8      1.8     1.25  0.8916  1.2520 14.34
## 24     2.29     3.00      2.4      2.2     1.50  1.7069  0.9574 17.67
##    X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R X1_MULTR_R
## 5           0         1          0          0           0          0
## 12          1         0          1          0           0          0
## 13          1         0          0          0           0          0
## 20          0         0          0          1           0          0
## 23          1         0          0          0           0          0
## 24          1         0          0          0           0          0
##    X1PAR1RAC weights X1PUBPRI
## 5          0       1        1
## 12         1       1        0
## 13         0       1        1
## 20         0       1        0
## 23         0       1        1
## 24         0       1        0

write.csv(m.out3Data, "m.out3Data.csv")
m.out3Data = read.csv("m.out3Data.csv", header = TRUE)
head(m.out3Data)

##    X X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT X1RTHET X1MTHET X1BMI
## 1  5     3.67     3.75      3.8      2.2     1.25 -0.0734 -0.5545 15.20
## 2 12     2.57     2.25      2.4      2.6     1.75 -0.3999 -0.2823 17.39
## 3 13     3.86     4.00      3.6      1.0     2.25  0.3953  0.3557 17.07
## 4 20     3.00     3.33      3.4      1.8     1.50 -0.4954 -0.0743 19.16
## 5 23     3.00     3.00      2.8      1.8     1.25  0.8916  1.2520 14.34
## 6 24     2.29     3.00      2.4      2.2     1.50  1.7069  0.9574 17.67
##   X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R X1_MULTR_R
## 1          0         1          0          0           0          0
## 2          1         0          1          0           0          0
## 3          1         0          0          0           0          0
## 4          0         0          0          1           0          0
## 5          1         0          0          0           0          0
## 6          1         0          0          0           0          0
##   X1PAR1RAC weights X1PUBPRI
## 1         0       1        1
## 2         1       1        0
## 3         0       1        1
## 4         0       1        0
## 5         0       1        1
## 6         0       1        0

colnames(m.out3Data)[1] = c("id")
m.out1

## 
## Call: 
## matchit(formula = X1PUBPRI ~ X1TCHAPP + X1TCHCON + X1TCHPER + 
##     X1TCHEXT + X1TCHINT + X1RTHET + X1MTHET + X1BMI + X1_CHSEX_R + 
##     X1_HISP_R + X1_BLACK_R + X1_ASIAN_R + X1_AMINAN_R + X1_MULTR_R + 
##     X1PAR1RAC, data = ECLSK11, method = "nearest", ratio = 1)
## 
## Sample sizes:
##           Control Treated
## All         15918    2256
## Matched      2256    2256
## Unmatched   13662       0
## Discarded       0       0

Repeat the process above for data set four.

ECLSK14 = as.data.frame(ECLSK14)

m.out4 = matchit(X1PUBPRI ~ X1TCHAPP + X1TCHCON + X1TCHPER + X1TCHEXT + X1TCHINT + X1RTHET + X1MTHET + X1BMI +X1_CHSEX_R + X1_HISP_R + X1_BLACK_R + X1_ASIAN_R + X1_AMINAN_R + X1_MULTR_R + X1PAR1RAC, data = ECLSK14, method = "nearest", ratio = 1)


m.out4Data = as.data.frame(cbind(m.out4$X, weights = m.out4$weights, X1PUBPRI = m.out4$treat))
m.out4Data = m.out4Data[which(m.out4Data$weights == 1),]
head(m.out4Data)

##    X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT    X1RTHET   X1MTHET
## 1  3.570000 4.000000 4.000000 1.200000 1.250000  0.1930000  0.582700
## 3  2.863156 3.133581 2.922238 2.199735 1.618352  0.6787664  1.217776
## 5  3.670000 3.750000 3.800000 2.200000 1.250000 -0.0734000 -0.554500
## 6  2.290000 2.758068 2.689421 1.509188 2.000000 -1.4601000 -1.061600
## 8  3.570000 3.670000 3.600000 1.000000 1.000000 -0.4149000 -0.292100
## 13 3.860000 4.000000 3.600000 1.000000 2.250000  0.3953000  0.355700
##      X1BMI X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R
## 1  16.4000          1         0          0          0           0
## 3  16.1203          1         1          0          0           0
## 5  15.2000          0         1          0          0           0
## 6  17.2800          0         0          0          0           0
## 8  13.3500          0         0          0          0           0
## 13 17.0700          1         0          0          0           0
##    X1_MULTR_R X1PAR1RAC weights X1PUBPRI
## 1           0         0       1        0
## 3           0         1       1        0
## 5           0         0       1        1
## 6           0         0       1        0
## 8           0         0       1        0
## 13          0         0       1        1

write.csv(m.out4Data, "m.out4Data.csv")
m.out4Data = read.csv("m.out4Data.csv", header = TRUE)
head(m.out4Data)

##    X X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT    X1RTHET   X1MTHET
## 1  1 3.570000 4.000000 4.000000 1.200000 1.250000  0.1930000  0.582700
## 2  3 2.863156 3.133581 2.922238 2.199735 1.618352  0.6787664  1.217776
## 3  5 3.670000 3.750000 3.800000 2.200000 1.250000 -0.0734000 -0.554500
## 4  6 2.290000 2.758068 2.689421 1.509188 2.000000 -1.4601000 -1.061600
## 5  8 3.570000 3.670000 3.600000 1.000000 1.000000 -0.4149000 -0.292100
## 6 13 3.860000 4.000000 3.600000 1.000000 2.250000  0.3953000  0.355700
##     X1BMI X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R
## 1 16.4000          1         0          0          0           0
## 2 16.1203          1         1          0          0           0
## 3 15.2000          0         1          0          0           0
## 4 17.2800          0         0          0          0           0
## 5 13.3500          0         0          0          0           0
## 6 17.0700          1         0          0          0           0
##   X1_MULTR_R X1PAR1RAC weights X1PUBPRI
## 1          0         0       1        0
## 2          0         1       1        0
## 3          0         0       1        1
## 4          0         0       1        0
## 5          0         0       1        0
## 6          0         0       1        1

colnames(m.out4Data)[1] = c("id")

Repeat the process above for data set five.

ECLSK15 = as.data.frame(ECLSK15)

m.out5 = matchit(X1PUBPRI ~ X1TCHAPP + X1TCHCON + X1TCHPER + X1TCHEXT + X1TCHINT + X1RTHET + X1MTHET + X1BMI +X1_CHSEX_R + X1_HISP_R + X1_BLACK_R + X1_ASIAN_R + X1_AMINAN_R + X1_MULTR_R + X1PAR1RAC, data = ECLSK15, method = "nearest", ratio = 1)


m.out5Data = as.data.frame(cbind(m.out5$X, weights = m.out5$weights, X1PUBPRI = m.out5$treat))
m.out5Data = m.out5Data[which(m.out5Data$weights == 1),]
head(m.out5Data)

##    X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT    X1RTHET    X1MTHET
## 3  2.784357   2.9082 3.728913 1.614667 1.629037 -0.3511338 -0.1779907
## 4  2.140000   2.0000 2.200000 3.200000 1.500000 -1.7087000 -2.1245000
## 5  3.670000   3.7500 3.800000 2.200000 1.250000 -0.0734000 -0.5545000
## 13 3.860000   4.0000 3.600000 1.000000 2.250000  0.3953000  0.3557000
## 20 3.000000   3.3300 3.400000 1.800000 1.500000 -0.4954000 -0.0743000
## 23 3.000000   3.0000 2.800000 1.800000 1.250000  0.8916000  1.2520000
##       X1BMI X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R
## 3  13.89359          1         1          0          0           0
## 4  18.14000          1         0          0          0           0
## 5  15.20000          0         1          0          0           0
## 13 17.07000          1         0          0          0           0
## 20 19.16000          0         0          0          1           0
## 23 14.34000          1         0          0          0           0
##    X1_MULTR_R X1PAR1RAC weights X1PUBPRI
## 3           0         1       1        0
## 4           0         0       1        0
## 5           0         0       1        1
## 13          0         0       1        1
## 20          0         1       1        0
## 23          0         0       1        1

write.csv(m.out5Data, "m.out5Data.csv")
m.out5Data = read.csv("m.out5Data.csv", header = TRUE)
head(m.out5Data)

##    X X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT    X1RTHET    X1MTHET
## 1  3 2.784357   2.9082 3.728913 1.614667 1.629037 -0.3511338 -0.1779907
## 2  4 2.140000   2.0000 2.200000 3.200000 1.500000 -1.7087000 -2.1245000
## 3  5 3.670000   3.7500 3.800000 2.200000 1.250000 -0.0734000 -0.5545000
## 4 13 3.860000   4.0000 3.600000 1.000000 2.250000  0.3953000  0.3557000
## 5 20 3.000000   3.3300 3.400000 1.800000 1.500000 -0.4954000 -0.0743000
## 6 23 3.000000   3.0000 2.800000 1.800000 1.250000  0.8916000  1.2520000
##      X1BMI X1_CHSEX_R X1_HISP_R X1_BLACK_R X1_ASIAN_R X1_AMINAN_R
## 1 13.89359          1         1          0          0           0
## 2 18.14000          1         0          0          0           0
## 3 15.20000          0         1          0          0           0
## 4 17.07000          1         0          0          0           0
## 5 19.16000          0         0          0          1           0
## 6 14.34000          1         0          0          0           0
##   X1_MULTR_R X1PAR1RAC weights X1PUBPRI
## 1          0         1       1        0
## 2          0         0       1        0
## 3          0         0       1        1
## 4          0         0       1        1
## 5          0         1       1        0
## 6          0         0       1        1

colnames(m.out5Data)[1] = c("id")

Now I need to gather the final sample. The final sample size is difficult, because the number of control and treatment units differs, because in the imputation process in some cases codes missing values for public and private schools was different for different datas. Therefore, the number of private schools varied slightly meaning that total number sample size differs depending on how many missing value in the public private variable were coded as private (each private school is matched with one public school so the number of students in private school is the final sample size).

Therefore, I took the mean of the total sample from the matched control (i.e. private) units for each of the five data sets. I then calculated the standard deviation for each of them as well.

n = t(cbind(m.out1$nn[1,2], m.out2$nn[1,2], m.out3$nn[1,2], m.out4$nn[1,2], m.out5$nn[1,2]))
nMean = round(mean(n),0); nMean

## [1] 2256

nSD = round(sd(n),0); nSD

## [1] 7

Now I need to get the means and sds for the private and public students. Need to create two data sets one with private and one with public. Then get means and sds for each group.

m.out1DataPri = m.out1Data[which(m.out1Data$X1PUBPRI == 1),]
m.out2DataPri = m.out2Data[which(m.out2Data$X1PUBPRI == 1),]
m.out3DataPri = m.out3Data[which(m.out3Data$X1PUBPRI == 1),]
m.out4DataPri = m.out4Data[which(m.out4Data$X1PUBPRI == 1),]
m.out5DataPri = m.out5Data[which(m.out5Data$X1PUBPRI == 1),]


mean1Pri =apply(m.out1DataPri, 2, mean)
mean2Pri =apply(m.out2DataPri, 2, mean)
mean3Pri =apply(m.out3DataPri, 2, mean)
mean4Pri =apply(m.out4DataPri, 2, mean)
mean5Pri =apply(m.out5DataPri, 2, mean)

sd1Pri = apply(m.out1DataPri, 2, sd)
sd2Pri = apply(m.out2DataPri, 2, sd)
sd3Pri = apply(m.out3DataPri, 2, sd)
sd4Pri = apply(m.out4DataPri, 2, sd)
sd5Pri = apply(m.out5DataPri, 2, sd)


allMeansPri = t(as.matrix(cbind(mean1Pri, mean2Pri, mean3Pri, mean4Pri, mean5Pri)))

allSDsPri = t(as.matrix(cbind(sd1Pri, sd2Pri, sd3Pri, sd4Pri, sd5Pri)))

allMeansSDsComPri = mi.meld(q = allMeansPri, se = allSDsPri)
allMeansSDsComPri = as.data.frame(allMeansSDsComPri)
write.csv(allMeansSDsComPri, "allMeansSDsComPri.csv")


m.out1DataPub = m.out1Data[which(m.out1Data$X1PUBPRI == 0),]
m.out2DataPub = m.out2Data[which(m.out2Data$X1PUBPRI == 0),]
m.out3DataPub = m.out3Data[which(m.out3Data$X1PUBPRI == 0),]
m.out4DataPub = m.out4Data[which(m.out4Data$X1PUBPRI == 0),]
m.out5DataPub = m.out5Data[which(m.out5Data$X1PUBPRI == 0),]

mean1Pub =apply(m.out1DataPub, 2, mean)
mean2Pub =apply(m.out2DataPub, 2, mean)
mean3Pub =apply(m.out3DataPub, 2, mean)
mean4Pub =apply(m.out4DataPub, 2, mean)
mean5Pub =apply(m.out5DataPub, 2, mean)

sd1Pub = apply(m.out1DataPub, 2, sd)
sd2Pub = apply(m.out2DataPub, 2, sd)
sd3Pub = apply(m.out3DataPub, 2, sd)
sd4Pub = apply(m.out4DataPub, 2, sd)
sd5Pub = apply(m.out5DataPub, 2, sd)

allMeansPub = t(as.matrix(cbind(mean1Pub, mean2Pub, mean3Pub, mean4Pub, mean5Pub)))

allSDsPub = t(as.matrix(cbind(sd1Pub, sd2Pub, sd3Pub, sd4Pub, sd5Pub)))

allMeansSDsComPub = mi.meld(q = allMeansPub, se = allSDsPub)
allMeansSDsComPub = as.data.frame(allMeansSDsComPub)
write.csv(allMeansSDsComPub, "allMeansSDsComPub.csv")

Now I need to conduct an inner join with the data set that has time points 2 through 4 with the matched data set which has the time point 1 variable. This is pretty simple with the merge function in r, which by default with conduct an inner join (i.e. only include in the data set those id values that are present in both data sets).

ECLSK1 = merge(ECLSK1, m.out1Data, by = c("id"))
ECLSK2 = merge(ECLSK2, m.out2Data, by = c("id"))
ECLSK3 = merge(ECLSK3, m.out3Data, by = c("id"))
ECLSK4 = merge(ECLSK4, m.out4Data, by = c("id"))
ECLSK5 = merge(ECLSK5, m.out5Data, by = c("id"))

Next, because I am including several fixed variables (i.e. variables that do not change over time), I need to create those values for time points 2 through 4. Therefore, I replicate each column three times and then use the paste0 function to produce names that go from 1 to the number of columns in the data set, so 1 through 4. I need to repeat this process five times once for each imputed data set.

X1_CHSEX_R = data.frame(ECLSK1$X1_CHSEX_R)
colnames(X1_CHSEX_R) = c("X1_CHSEX_R")
X1_CHSEX_R = cbind(X1_CHSEX_R, replicate(3, X1_CHSEX_R$X1_CHSEX_R))
names(X1_CHSEX_R) = paste0("X", 1:ncol(X1_CHSEX_R),"_CHSEX_R")


X1_HISP_R = data.frame(ECLSK1$X1_HISP_R)
colnames(X1_HISP_R) = c("X1_HISP_R")
X1_HISP_R = cbind(X1_HISP_R, replicate(3, X1_HISP_R$X1_HISP_R))
names(X1_HISP_R) = paste0("X", 1:ncol(X1_HISP_R),"_HISP_R")


X1_BLACK_R = data.frame(ECLSK1$X1_BLACK_R)
colnames(X1_BLACK_R) = c("X1_BLACK_R")
X1_BLACK_R = cbind(X1_BLACK_R, replicate(3, X1_BLACK_R$X1_BLACK_R))
names(X1_BLACK_R) = paste0("X", 1:ncol(X1_BLACK_R),"_BLACK_R")


X1_ASIAN_R = data.frame(ECLSK1$X1_ASIAN_R)
colnames(X1_ASIAN_R) = c("X1_ASIAN_R")
X1_ASIAN_R = cbind(X1_ASIAN_R, replicate(3, X1_ASIAN_R$X1_ASIAN_R))
names(X1_ASIAN_R) = paste0("X", 1:ncol(X1_ASIAN_R),"_ASIAN_R")


X1_AMINAN_R = data.frame(ECLSK1$X1_AMINAN_R)
colnames(X1_AMINAN_R) = c("X1_AMINAN_R")
X1_AMINAN_R = cbind(X1_AMINAN_R, replicate(3, X1_AMINAN_R$X1_AMINAN_R))
names(X1_AMINAN_R) = paste0("X", 1:ncol(X1_AMINAN_R),"_AMINAN_R")


X1_MULTR_R = data.frame(ECLSK1$X1_MULTR_R)
colnames(X1_MULTR_R) = c("X1_MULTR_R")
X1_MULTR_R = cbind(X1_MULTR_R, replicate(3, X1_MULTR_R$X1_MULTR_R))
names(X1_MULTR_R) = paste0("X", 1:ncol(X1_MULTR_R),"_MULTR_R")


X1PAR1RAC = data.frame(ECLSK1$X1PAR1RAC)
colnames(X1PAR1RAC) = c("X1PAR1RAC")
X1PAR1RAC = cbind(X1PAR1RAC, replicate(3, X1PAR1RAC$X1PAR1RAC))
names(X1PAR1RAC) = paste0("X", 1:ncol(X1PAR1RAC),"PAR1RAC")


ECLSK1 = cbind(ECLSK1, X1_CHSEX_R, X1_HISP_R, X1_BLACK_R, X1_ASIAN_R, X1_AMINAN_R, X1_MULTR_R, X1PAR1RAC)

X1_CHSEX_R = data.frame(ECLSK2$X1_CHSEX_R)
colnames(X1_CHSEX_R) = c("X1_CHSEX_R")
X1_CHSEX_R = cbind(X1_CHSEX_R, replicate(3, X1_CHSEX_R$X1_CHSEX_R))
names(X1_CHSEX_R) = paste0("X", 1:ncol(X1_CHSEX_R),"_CHSEX_R")


X1_HISP_R = data.frame(ECLSK2$X1_HISP_R)
colnames(X1_HISP_R) = c("X1_HISP_R")
X1_HISP_R = cbind(X1_HISP_R, replicate(3, X1_HISP_R$X1_HISP_R))
names(X1_HISP_R) = paste0("X", 1:ncol(X1_HISP_R),"_HISP_R")


X1_BLACK_R = data.frame(ECLSK2$X1_BLACK_R)
colnames(X1_BLACK_R) = c("X1_BLACK_R")
X1_BLACK_R = cbind(X1_BLACK_R, replicate(3, X1_BLACK_R$X1_BLACK_R))
names(X1_BLACK_R) = paste0("X", 1:ncol(X1_BLACK_R),"_BLACK_R")


X1_ASIAN_R = data.frame(ECLSK2$X1_ASIAN_R)
colnames(X1_ASIAN_R) = c("X1_ASIAN_R")
X1_ASIAN_R = cbind(X1_ASIAN_R, replicate(3, X1_ASIAN_R$X1_ASIAN_R))
names(X1_ASIAN_R) = paste0("X", 1:ncol(X1_ASIAN_R),"_ASIAN_R")


X1_AMINAN_R = data.frame(ECLSK2$X1_AMINAN_R)
colnames(X1_AMINAN_R) = c("X1_AMINAN_R")
X1_AMINAN_R = cbind(X1_AMINAN_R, replicate(3, X1_AMINAN_R$X1_AMINAN_R))
names(X1_AMINAN_R) = paste0("X", 1:ncol(X1_AMINAN_R),"_AMINAN_R")


X1_MULTR_R = data.frame(ECLSK2$X1_MULTR_R)
colnames(X1_MULTR_R) = c("X1_MULTR_R")
X1_MULTR_R = cbind(X1_MULTR_R, replicate(3, X1_MULTR_R$X1_MULTR_R))
names(X1_MULTR_R) = paste0("X", 1:ncol(X1_MULTR_R),"_MULTR_R")


X1PAR1RAC = data.frame(ECLSK2$X1PAR1RAC)
colnames(X1PAR1RAC) = c("X1PAR1RAC")
X1PAR1RAC = cbind(X1PAR1RAC, replicate(3, X1PAR1RAC$X1PAR1RAC))
names(X1PAR1RAC) = paste0("X", 1:ncol(X1PAR1RAC),"PAR1RAC")


ECLSK2 = cbind(ECLSK2, X1_CHSEX_R, X1_HISP_R, X1_BLACK_R, X1_ASIAN_R, X1_AMINAN_R, X1_MULTR_R, X1PAR1RAC)

X1_CHSEX_R = data.frame(ECLSK3$X1_CHSEX_R)
colnames(X1_CHSEX_R) = c("X1_CHSEX_R")
X1_CHSEX_R = cbind(X1_CHSEX_R, replicate(3, X1_CHSEX_R$X1_CHSEX_R))
names(X1_CHSEX_R) = paste0("X", 1:ncol(X1_CHSEX_R),"_CHSEX_R")


X1_HISP_R = data.frame(ECLSK3$X1_HISP_R)
colnames(X1_HISP_R) = c("X1_HISP_R")
X1_HISP_R = cbind(X1_HISP_R, replicate(3, X1_HISP_R$X1_HISP_R))
names(X1_HISP_R) = paste0("X", 1:ncol(X1_HISP_R),"_HISP_R")


X1_BLACK_R = data.frame(ECLSK3$X1_BLACK_R)
colnames(X1_BLACK_R) = c("X1_BLACK_R")
X1_BLACK_R = cbind(X1_BLACK_R, replicate(3, X1_BLACK_R$X1_BLACK_R))
names(X1_BLACK_R) = paste0("X", 1:ncol(X1_BLACK_R),"_BLACK_R")


X1_ASIAN_R = data.frame(ECLSK3$X1_ASIAN_R)
colnames(X1_ASIAN_R) = c("X1_ASIAN_R")
X1_ASIAN_R = cbind(X1_ASIAN_R, replicate(3, X1_ASIAN_R$X1_ASIAN_R))
names(X1_ASIAN_R) = paste0("X", 1:ncol(X1_ASIAN_R),"_ASIAN_R")


X1_AMINAN_R = data.frame(ECLSK3$X1_AMINAN_R)
colnames(X1_AMINAN_R) = c("X1_AMINAN_R")
X1_AMINAN_R = cbind(X1_AMINAN_R, replicate(3, X1_AMINAN_R$X1_AMINAN_R))
names(X1_AMINAN_R) = paste0("X", 1:ncol(X1_AMINAN_R),"_AMINAN_R")


X1_MULTR_R = data.frame(ECLSK3$X1_MULTR_R)
colnames(X1_MULTR_R) = c("X1_MULTR_R")
X1_MULTR_R = cbind(X1_MULTR_R, replicate(3, X1_MULTR_R$X1_MULTR_R))
names(X1_MULTR_R) = paste0("X", 1:ncol(X1_MULTR_R),"_MULTR_R")


X1PAR1RAC = data.frame(ECLSK3$X1PAR1RAC)
colnames(X1PAR1RAC) = c("X1PAR1RAC")
X1PAR1RAC = cbind(X1PAR1RAC, replicate(3, X1PAR1RAC$X1PAR1RAC))
names(X1PAR1RAC) = paste0("X", 1:ncol(X1PAR1RAC),"PAR1RAC")


ECLSK3 = cbind(ECLSK3, X1_CHSEX_R, X1_HISP_R, X1_BLACK_R, X1_ASIAN_R, X1_AMINAN_R, X1_MULTR_R, X1PAR1RAC)

X1_CHSEX_R = data.frame(ECLSK4$X1_CHSEX_R)
colnames(X1_CHSEX_R) = c("X1_CHSEX_R")
X1_CHSEX_R = cbind(X1_CHSEX_R, replicate(3, X1_CHSEX_R$X1_CHSEX_R))
names(X1_CHSEX_R) = paste0("X", 1:ncol(X1_CHSEX_R),"_CHSEX_R")


X1_HISP_R = data.frame(ECLSK4$X1_HISP_R)
colnames(X1_HISP_R) = c("X1_HISP_R")
X1_HISP_R = cbind(X1_HISP_R, replicate(3, X1_HISP_R$X1_HISP_R))
names(X1_HISP_R) = paste0("X", 1:ncol(X1_HISP_R),"_HISP_R")


X1_BLACK_R = data.frame(ECLSK4$X1_BLACK_R)
colnames(X1_BLACK_R) = c("X1_BLACK_R")
X1_BLACK_R = cbind(X1_BLACK_R, replicate(3, X1_BLACK_R$X1_BLACK_R))
names(X1_BLACK_R) = paste0("X", 1:ncol(X1_BLACK_R),"_BLACK_R")


X1_ASIAN_R = data.frame(ECLSK4$X1_ASIAN_R)
colnames(X1_ASIAN_R) = c("X1_ASIAN_R")
X1_ASIAN_R = cbind(X1_ASIAN_R, replicate(3, X1_ASIAN_R$X1_ASIAN_R))
names(X1_ASIAN_R) = paste0("X", 1:ncol(X1_ASIAN_R),"_ASIAN_R")


X1_AMINAN_R = data.frame(ECLSK4$X1_AMINAN_R)
colnames(X1_AMINAN_R) = c("X1_AMINAN_R")
X1_AMINAN_R = cbind(X1_AMINAN_R, replicate(3, X1_AMINAN_R$X1_AMINAN_R))
names(X1_AMINAN_R) = paste0("X", 1:ncol(X1_AMINAN_R),"_AMINAN_R")


X1_MULTR_R = data.frame(ECLSK4$X1_MULTR_R)
colnames(X1_MULTR_R) = c("X1_MULTR_R")
X1_MULTR_R = cbind(X1_MULTR_R, replicate(3, X1_MULTR_R$X1_MULTR_R))
names(X1_MULTR_R) = paste0("X", 1:ncol(X1_MULTR_R),"_MULTR_R")


X1PAR1RAC = data.frame(ECLSK4$X1PAR1RAC)
colnames(X1PAR1RAC) = c("X1PAR1RAC")
X1PAR1RAC = cbind(X1PAR1RAC, replicate(3, X1PAR1RAC$X1PAR1RAC))
names(X1PAR1RAC) = paste0("X", 1:ncol(X1PAR1RAC),"PAR1RAC")


ECLSK4 = cbind(ECLSK4, X1_CHSEX_R, X1_HISP_R, X1_BLACK_R, X1_ASIAN_R, X1_AMINAN_R, X1_MULTR_R, X1PAR1RAC)

X1_CHSEX_R = data.frame(ECLSK5$X1_CHSEX_R)
colnames(X1_CHSEX_R) = c("X1_CHSEX_R")
X1_CHSEX_R = cbind(X1_CHSEX_R, replicate(3, X1_CHSEX_R$X1_CHSEX_R))
names(X1_CHSEX_R) = paste0("X", 1:ncol(X1_CHSEX_R),"_CHSEX_R")


X1_HISP_R = data.frame(ECLSK5$X1_HISP_R)
colnames(X1_HISP_R) = c("X1_HISP_R")
X1_HISP_R = cbind(X1_HISP_R, replicate(3, X1_HISP_R$X1_HISP_R))
names(X1_HISP_R) = paste0("X", 1:ncol(X1_HISP_R),"_HISP_R")


X1_BLACK_R = data.frame(ECLSK5$X1_BLACK_R)
colnames(X1_BLACK_R) = c("X1_BLACK_R")
X1_BLACK_R = cbind(X1_BLACK_R, replicate(3, X1_BLACK_R$X1_BLACK_R))
names(X1_BLACK_R) = paste0("X", 1:ncol(X1_BLACK_R),"_BLACK_R")


X1_ASIAN_R = data.frame(ECLSK5$X1_ASIAN_R)
colnames(X1_ASIAN_R) = c("X1_ASIAN_R")
X1_ASIAN_R = cbind(X1_ASIAN_R, replicate(3, X1_ASIAN_R$X1_ASIAN_R))
names(X1_ASIAN_R) = paste0("X", 1:ncol(X1_ASIAN_R),"_ASIAN_R")


X1_AMINAN_R = data.frame(ECLSK5$X1_AMINAN_R)
colnames(X1_AMINAN_R) = c("X1_AMINAN_R")
X1_AMINAN_R = cbind(X1_AMINAN_R, replicate(3, X1_AMINAN_R$X1_AMINAN_R))
names(X1_AMINAN_R) = paste0("X", 1:ncol(X1_AMINAN_R),"_AMINAN_R")


X1_MULTR_R = data.frame(ECLSK5$X1_MULTR_R)
colnames(X1_MULTR_R) = c("X1_MULTR_R")
X1_MULTR_R = cbind(X1_MULTR_R, replicate(3, X1_MULTR_R$X1_MULTR_R))
names(X1_MULTR_R) = paste0("X", 1:ncol(X1_MULTR_R),"_MULTR_R")


X1PAR1RAC = data.frame(ECLSK5$X1PAR1RAC)
colnames(X1PAR1RAC) = c("X1PAR1RAC")
X1PAR1RAC = cbind(X1PAR1RAC, replicate(3, X1PAR1RAC$X1PAR1RAC))
names(X1PAR1RAC) = paste0("X", 1:ncol(X1PAR1RAC),"PAR1RAC")


ECLSK5 = cbind(ECLSK5, X1_CHSEX_R, X1_HISP_R, X1_BLACK_R, X1_ASIAN_R, X1_AMINAN_R, X1_MULTR_R, X1PAR1RAC)

Here I am going to grab the means and standard deviations across the variables across time points.

mean1 =apply(ECLSK1, 2, mean)
mean2 =apply(ECLSK2, 2, mean)
mean3 =apply(ECLSK3, 2, mean)
mean4 =apply(ECLSK4, 2, mean)
mean5 =apply(ECLSK5, 2, mean)

sd1 = apply(ECLSK1, 2, sd)
sd2 = apply(ECLSK2, 2, sd)
sd3 = apply(ECLSK3, 2, sd)
sd4 = apply(ECLSK4, 2, sd)
sd5 = apply(ECLSK5, 2, sd)


allMeans = t(as.matrix(cbind(mean1, mean2, mean3, mean4, mean5)))

allSDs = t(as.matrix(cbind(sd1, sd2, sd3, sd4, sd5)))

allMeansSDsCom = mi.meld(q = allMeans, se = allSDs)
allMeansSDsCom = as.data.frame(allMeansSDsCom)
write.csv(allMeansSDsCom, "allMeansSDsCom.csv")

Here I can create transform my cross sectional data set into a longitudinal data set, where instead of having four columns for each variable, I have one column for each variable with a time variable that indicates, which time point each response is aligned with. I do this for each of the five imputed datasets.

ECLSK1= reshape(ECLSK1, varying = list(c("X1TCHAPP", "X2TCHAPP", "X3TCHAPP", "X4TCHAPP"), c("X1TCHCON", "X2TCHCON", "X3TCHCON", "X4TCHCON"), c("X1TCHPER", "X2TCHPER", "X3TCHPER", "X4TCHPER"), c("X1TCHEXT", "X2TCHEXT", "X3TCHEXT", "X4TCHEXT"), c("X1TCHINT", "X2TCHINT", "X3TCHINT", "X4TCHINT"), c("X1RTHET", "X2RTHET", "X3RTHET", "X4RTHET"), c("X1MTHET", "X2MTHET", "X3MTHET", "X4MTHET"), c("X1BMI", "X2BMI", "X3BMI", "X4BMI"), c("X1PUBPRI", "X2PUBPRI", "X3PUBPRI","X4PUBPRI"), c("X1_CHSEX_R", "X2_CHSEX_R", "X3_CHSEX_R", "X4_CHSEX_R"), c("X1_HISP_R", "X2_HISP_R", "X3_HISP_R", "X4_HISP_R"), c("X1_BLACK_R", "X2_BLACK_R", "X3_BLACK_R", "X4_BLACK_R"), c("X1_ASIAN_R", "X2_ASIAN_R", "X3_ASIAN_R", "X4_ASIAN_R"), c("X1_AMINAN_R", "X2_AMINAN_R", "X3_AMINAN_R", "X4_AMINAN_R"), c("X1_MULTR_R", "X2_MULTR_R", "X3_MULTR_R", "X4_MULTR_R"), c("X1PAR1RAC", "X2PAR1RAC", "X3PAR1RAC", "X4PAR1RAC")), times = c(0,1,2,3), direction = "long")

ECLSK2= reshape(ECLSK2, varying = list(c("X1TCHAPP", "X2TCHAPP", "X3TCHAPP", "X4TCHAPP"), c("X1TCHCON", "X2TCHCON", "X3TCHCON", "X4TCHCON"), c("X1TCHPER", "X2TCHPER", "X3TCHPER", "X4TCHPER"), c("X1TCHEXT", "X2TCHEXT", "X3TCHEXT", "X4TCHEXT"), c("X1TCHINT", "X2TCHINT", "X3TCHINT", "X4TCHINT"), c("X1RTHET", "X2RTHET", "X3RTHET", "X4RTHET"), c("X1MTHET", "X2MTHET", "X3MTHET", "X4MTHET"), c("X1BMI", "X2BMI", "X3BMI", "X4BMI"), c("X1PUBPRI", "X2PUBPRI", "X3PUBPRI","X4PUBPRI"), c("X1_CHSEX_R", "X2_CHSEX_R", "X3_CHSEX_R", "X4_CHSEX_R"), c("X1_HISP_R", "X2_HISP_R", "X3_HISP_R", "X4_HISP_R"), c("X1_BLACK_R", "X2_BLACK_R", "X3_BLACK_R", "X4_BLACK_R"), c("X1_ASIAN_R", "X2_ASIAN_R", "X3_ASIAN_R", "X4_ASIAN_R"), c("X1_AMINAN_R", "X2_AMINAN_R", "X3_AMINAN_R", "X4_AMINAN_R"), c("X1_MULTR_R", "X2_MULTR_R", "X3_MULTR_R", "X4_MULTR_R"), c("X1PAR1RAC", "X2PAR1RAC", "X3PAR1RAC", "X4PAR1RAC")), times = c(0,1,2,3), direction = "long")

ECLSK3= reshape(ECLSK3, varying = list(c("X1TCHAPP", "X2TCHAPP", "X3TCHAPP", "X4TCHAPP"), c("X1TCHCON", "X2TCHCON", "X3TCHCON", "X4TCHCON"), c("X1TCHPER", "X2TCHPER", "X3TCHPER", "X4TCHPER"), c("X1TCHEXT", "X2TCHEXT", "X3TCHEXT", "X4TCHEXT"), c("X1TCHINT", "X2TCHINT", "X3TCHINT", "X4TCHINT"), c("X1RTHET", "X2RTHET", "X3RTHET", "X4RTHET"), c("X1MTHET", "X2MTHET", "X3MTHET", "X4MTHET"), c("X1BMI", "X2BMI", "X3BMI", "X4BMI"), c("X1PUBPRI", "X2PUBPRI", "X3PUBPRI","X4PUBPRI"), c("X1_CHSEX_R", "X2_CHSEX_R", "X3_CHSEX_R", "X4_CHSEX_R"), c("X1_HISP_R", "X2_HISP_R", "X3_HISP_R", "X4_HISP_R"), c("X1_BLACK_R", "X2_BLACK_R", "X3_BLACK_R", "X4_BLACK_R"), c("X1_ASIAN_R", "X2_ASIAN_R", "X3_ASIAN_R", "X4_ASIAN_R"), c("X1_AMINAN_R", "X2_AMINAN_R", "X3_AMINAN_R", "X4_AMINAN_R"), c("X1_MULTR_R", "X2_MULTR_R", "X3_MULTR_R", "X4_MULTR_R"), c("X1PAR1RAC", "X2PAR1RAC", "X3PAR1RAC", "X4PAR1RAC")), times = c(0,1,2,3), direction = "long")

ECLSK4= reshape(ECLSK4, varying = list(c("X1TCHAPP", "X2TCHAPP", "X3TCHAPP", "X4TCHAPP"), c("X1TCHCON", "X2TCHCON", "X3TCHCON", "X4TCHCON"), c("X1TCHPER", "X2TCHPER", "X3TCHPER", "X4TCHPER"), c("X1TCHEXT", "X2TCHEXT", "X3TCHEXT", "X4TCHEXT"), c("X1TCHINT", "X2TCHINT", "X3TCHINT", "X4TCHINT"), c("X1RTHET", "X2RTHET", "X3RTHET", "X4RTHET"), c("X1MTHET", "X2MTHET", "X3MTHET", "X4MTHET"), c("X1BMI", "X2BMI", "X3BMI", "X4BMI"), c("X1PUBPRI", "X2PUBPRI", "X3PUBPRI","X4PUBPRI"), c("X1_CHSEX_R", "X2_CHSEX_R", "X3_CHSEX_R", "X4_CHSEX_R"), c("X1_HISP_R", "X2_HISP_R", "X3_HISP_R", "X4_HISP_R"), c("X1_BLACK_R", "X2_BLACK_R", "X3_BLACK_R", "X4_BLACK_R"), c("X1_ASIAN_R", "X2_ASIAN_R", "X3_ASIAN_R", "X4_ASIAN_R"), c("X1_AMINAN_R", "X2_AMINAN_R", "X3_AMINAN_R", "X4_AMINAN_R"), c("X1_MULTR_R", "X2_MULTR_R", "X3_MULTR_R", "X4_MULTR_R"), c("X1PAR1RAC", "X2PAR1RAC", "X3PAR1RAC", "X4PAR1RAC")), times = c(0,1,2,3), direction = "long")

ECLSK5= reshape(ECLSK5, varying = list(c("X1TCHAPP", "X2TCHAPP", "X3TCHAPP", "X4TCHAPP"), c("X1TCHCON", "X2TCHCON", "X3TCHCON", "X4TCHCON"), c("X1TCHPER", "X2TCHPER", "X3TCHPER", "X4TCHPER"), c("X1TCHEXT", "X2TCHEXT", "X3TCHEXT", "X4TCHEXT"), c("X1TCHINT", "X2TCHINT", "X3TCHINT", "X4TCHINT"), c("X1RTHET", "X2RTHET", "X3RTHET", "X4RTHET"), c("X1MTHET", "X2MTHET", "X3MTHET", "X4MTHET"), c("X1BMI", "X2BMI", "X3BMI", "X4BMI"), c("X1PUBPRI", "X2PUBPRI", "X3PUBPRI","X4PUBPRI"), c("X1_CHSEX_R", "X2_CHSEX_R", "X3_CHSEX_R", "X4_CHSEX_R"), c("X1_HISP_R", "X2_HISP_R", "X3_HISP_R", "X4_HISP_R"), c("X1_BLACK_R", "X2_BLACK_R", "X3_BLACK_R", "X4_BLACK_R"), c("X1_ASIAN_R", "X2_ASIAN_R", "X3_ASIAN_R", "X4_ASIAN_R"), c("X1_AMINAN_R", "X2_AMINAN_R", "X3_AMINAN_R", "X4_AMINAN_R"), c("X1_MULTR_R", "X2_MULTR_R", "X3_MULTR_R", "X4_MULTR_R"), c("X1PAR1RAC", "X2PAR1RAC", "X3PAR1RAC", "X4PAR1RAC")), times = c(0,1,2,3), direction = "long")

Equations 1.1 through 1.3 describe the multilevel model. First there is level one, which has a intercept Beta(0j) and the variables of interest Beta(1j) which is the coefficient for the time variable and Beta(2j) is the coefficient for the binary variable indicating whether a student is in private (1) or public (0) school at each time point (i) over each person (j). The i’s in time indicate each time point, which are nested in each person j where j can vary across each person’s intercept and slope, which are described in equations 1.2 and 1.3. Beta(1j) interacts with Beta(2j) and is the variable of interest and represents whether or not students are in a private school and are different from public school students on teacher’s reported self control scores for students over time. Then there is Beta(xj)(X), where X represents a vector of time varying covariates. Finally there is the individual error term e(ij), which is the difference between the predicted and observed value for each person over each time point.

Next, the intercept is broken down into the different factors influencing its value for each student. Beta(0j) represents the average intercept value and Beta(0j)(Q) represents the level two time invariant covariates including the in the model where Q is a vector of time invariant variables. Then there is the individual variation that each student has from the average intercept value represented by u0j allowing each student to have their intercept value.

Finally, there is the random slope component, which allows each student (i) to have their own trajectory or slope over self control over the vector of time varying variables X across time j. The X’s represent the same vector of beta’s from the level one equation, but are broken down into two components an average slope coefficient and a unique error term (uxj) that allows each time varying beta to be different for each time point across self control scores.

Level one covariates X1TCHAPP, X1TCHPER, X1TCHEXT, X1TCHINT, X1RTHET, X1MTHET, X1BMI, X1PUBPRI

Level two variables: X1_CHSEX_R, X1_HISP_R, X1_BLACK_R, X1_ASIAN_R, X1_AMINAN_R, X1_MULTR_R, X1PAR1RAC

\[ Level~1:~~~{y_{ij} = \beta_{0j} + \beta_{1j}(time_{ij})*(Private_{ij}) + \beta_{xj}*(X) + e_{ij}}~~~ (1.1)\]

\[ Level~2~Intercept:~~~{\beta_{0j} = \gamma_{00} + \gamma_{0j}*(Q) + u_{0j}} ~~~ (1.2)\]

\[ Level~2~Slope:~~~{\beta_{xj} = \gamma_{x0} + u_{xj}} ~~~ (1.3)\] \[Mixed~model: ~~~{y_{ij} = \gamma_{00}+\gamma_{10}(Time_{ij})*(Private_{j}) + \gamma_{x0}*(X)+ \gamma_{0j}*(Q) + u_{xj} + u_{0j} + e_{ij}} ~~~(1.4)\] Here I am going to mean center the continuous variables to reduce the potential effect of multicollinearity.

head(ECLSK1)

##      id  X weights time X1TCHAPP X1TCHCON X1TCHPER X1TCHEXT X1TCHINT
## 5.0   5  5       1    0     3.67     3.75      3.8      2.2     1.25
## 12.0 12 12       1    0     2.57     2.25      2.4      2.6     1.75
## 13.0 13 13       1    0     3.86     4.00      3.6      1.0     2.25
## 23.0 23 23       1    0     3.00     3.00      2.8      1.8     1.25
## 29.0 29 29       1    0     2.86     3.00      3.2      2.0     1.00
## 30.0 30 30       1    0     4.00     3.00      3.6      1.4     1.25
##      X1RTHET X1MTHET X1BMI X1PUBPRI X1_CHSEX_R X1_HISP_R X1_BLACK_R
## 5.0  -0.0734 -0.5545 15.20        1          0         1          0
## 12.0 -0.3999 -0.2823 17.39        0          1         0          1
## 13.0  0.3953  0.3557 17.07        1          1         0          0
## 23.0  0.8916  1.2520 14.34        1          1         0          0
## 29.0 -0.0284  0.2346 15.28        0          0         0          0
## 30.0 -0.3733  0.3625 16.07        1          0         0          0
##      X1_ASIAN_R X1_AMINAN_R X1_MULTR_R X1PAR1RAC
## 5.0           0           0          0         0
## 12.0          0           0          0         1
## 13.0          0           0          0         0
## 23.0          0           0          0         0
## 29.0          0           0          0         0
## 30.0          0           0          0         0

ECLSK1Scale = ECLSK1[,c(5:12)]
ECLSK1Scale = scale(ECLSK1Scale, scale = FALSE)
ECLSK1NoScale = cbind(ECLSK1[c(1:4)], ECLSK1[c(13:20)])
ECLSK1 = cbind(ECLSK1NoScale, ECLSK1Scale)

ECLSK2Scale = ECLSK2[,c(5:12)]
ECLSK2Scale = scale(ECLSK2Scale, scale = FALSE)
ECLSK2NoScale = cbind(ECLSK2[c(1:4)], ECLSK2[c(13:20)])
ECLSK2 = cbind(ECLSK2NoScale, ECLSK2Scale)

ECLSK3Scale = ECLSK3[,c(5:12)]
ECLSK3Scale = scale(ECLSK3Scale, scale = FALSE)
ECLSK3NoScale = cbind(ECLSK3[c(1:4)], ECLSK3[c(13:20)])
ECLSK3 = cbind(ECLSK3NoScale, ECLSK3Scale)

ECLSK4Scale = ECLSK4[,c(5:12)]
ECLSK4Scale = scale(ECLSK4Scale, scale = FALSE)
ECLSK4NoScale = cbind(ECLSK4[c(1:4)], ECLSK4[c(13:20)])
ECLSK4 = cbind(ECLSK4NoScale, ECLSK4Scale)

ECLSK5Scale = ECLSK5[,c(5:12)]
ECLSK5Scale = scale(ECLSK5Scale, scale = FALSE)
ECLSK5NoScale = cbind(ECLSK5[c(1:4)], ECLSK5[c(13:20)])
ECLSK5 = cbind(ECLSK5NoScale, ECLSK5Scale)

Now I will put the model into nlme to analyze it longitudinally. The first model is the null model that only contains the intercept for each person (i.e. their average self control value over time). Then we compare the null model to a model with the covariates of interest with variable of interest the interaction between time and the PUBPRI variable where each person receives their own intercept and then compare that model to a model where each person receives their and intercept and slope (i.e. trajectory of a person’s self control over time). I then compare each model using the anova package, which in all cases indicates that the model with random intercepts and slopes is a better fit relative to the null and random intercepts only models. In order to compare the models, I need to use restricted maximum likelihood estimation to ensure the degrees of freedom are comparable.

head(ECLSK1)

##      id  X weights time X1PUBPRI X1_CHSEX_R X1_HISP_R X1_BLACK_R
## 5.0   5  5       1    0        1          0         1          0
## 12.0 12 12       1    0        0          1         0          1
## 13.0 13 13       1    0        1          1         0          0
## 23.0 23 23       1    0        1          1         0          0
## 29.0 29 29       1    0        0          0         0          0
## 30.0 30 30       1    0        1          0         0          0
##      X1_ASIAN_R X1_AMINAN_R X1_MULTR_R X1PAR1RAC   X1TCHAPP   X1TCHCON
## 5.0           0           0          0         0  0.5572863  0.5613903
## 12.0          0           0          0         1 -0.5427137 -0.9386097
## 13.0          0           0          0         0  0.7472863  0.8113903
## 23.0          0           0          0         0 -0.1127137 -0.1886097
## 29.0          0           0          0         0 -0.2527137 -0.1886097
## 30.0          0           0          0         0  0.8872863 -0.1886097
##        X1TCHPER  X1TCHEXT   X1TCHINT   X1RTHET    X1MTHET      X1BMI
## 5.0   0.6598489  0.537983 -0.2350162 -0.849159 -1.4208179 -1.2095692
## 12.0 -0.7401511  0.937983  0.2649838 -1.175659 -1.1486179  0.9804308
## 13.0  0.4598489 -0.662017  0.7649838 -0.380459 -0.5106179  0.6604308
## 23.0 -0.3401511  0.137983 -0.2350162  0.115841  0.3856821 -2.0695692
## 29.0  0.0598489  0.337983 -0.4850162 -0.804159 -0.6317179 -1.1295692
## 30.0  0.4598489 -0.262017 -0.2350162 -1.149059 -0.5038179 -0.3395692

library(nlme)

model11 = lme(fixed = X1TCHCON ~1, random =  ~ 1 | id, data = ECLSK1, method = "ML")
summary(model11)

## Linear mixed-effects model fit by maximum likelihood
##  Data: ECLSK1 
##        AIC      BIC    logLik
##   28781.39 28804.79 -14387.69
## 
## Random effects:
##  Formula: ~1 | id
##         (Intercept)  Residual
## StdDev:   0.4428333 0.4382241
## 
## Fixed effects: X1TCHCON ~ 1 
##                     Value   Std.Error    DF       t-value p-value
## (Intercept) -3.074733e-17 0.007355659 13536 -4.180092e-15       1
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -5.04905518 -0.60433467  0.04130509  0.62747223  3.88528174 
## 
## Number of Observations: 18048
## Number of Groups: 4512

model21 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC
, random = ~1 | id, data = ECLSK1, method = "ML")

model31 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC
, random = ~time | id, data = ECLSK1, method = "ML")
summary(model31)

## Linear mixed-effects model fit by maximum likelihood
##  Data: ECLSK1 
##        AIC      BIC    logLik
##   9271.457 9435.273 -4614.728
## 
## Random effects:
##  Formula: ~time | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev     Corr  
## (Intercept) 0.15438442 (Intr)
## time        0.07296314 -0.642
## Residual    0.28184779       
## 
## Fixed effects: X1TCHCON ~ time * X1PUBPRI + X1MTHET + X1TCHAPP + X1TCHPER +      X1TCHEXT + X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHINT + X1MTHET +      X1BMI + X1_CHSEX_R + X1_HISP_R + X1_BLACK_R + X1_ASIAN_R +      X1_AMINAN_R + X1_MULTR_R + X1PAR1RAC 
##                    Value  Std.Error    DF   t-value p-value
## (Intercept)   -0.0837828 0.00792250 13527 -10.57530  0.0000
## time           0.0583871 0.00384688 13527  15.17776  0.0000
## X1PUBPRI      -0.0158433 0.00835424 13527  -1.89644  0.0579
## X1MTHET       -0.0247142 0.00376723 13527  -6.56033  0.0000
## X1TCHAPP       0.1610749 0.00588536 13527  27.36873  0.0000
## X1TCHPER       0.4669074 0.00578860 13527  80.65987  0.0000
## X1TCHEXT      -0.3258988 0.00539856 13527 -60.36777  0.0000
## X1TCHINT       0.0234863 0.00544579 13527   4.31273  0.0000
## X1BMI         -0.0015943 0.00116037 13527  -1.37394  0.1695
## X1_CHSEX_R     0.0287664 0.00570923  4504   5.03857  0.0000
## X1_HISP_R     -0.0228653 0.00984075  4504  -2.32353  0.0202
## X1_BLACK_R    -0.0245429 0.01125252  4504  -2.18111  0.0292
## X1_ASIAN_R     0.0297704 0.01165369  4504   2.55459  0.0107
## X1_AMINAN_R    0.0376698 0.04788427  4504   0.78668  0.4315
## X1_MULTR_R     0.0290168 0.01184518  4504   2.44967  0.0143
## X1PAR1RAC     -0.0169623 0.00863494  4504  -1.96438  0.0495
## time:X1PUBPRI -0.0050467 0.00436521 13527  -1.15612  0.2477
##  Correlation: 
##               (Intr) time   X1PUBP X1MTHE X1TCHA X1TCHP X1TCHE X1TCHI
## time          -0.749                                                 
## X1PUBPRI      -0.524  0.430                                          
## X1MTHET        0.474 -0.632 -0.008                                   
## X1TCHAPP      -0.176  0.173  0.011 -0.311                            
## X1TCHPER       0.041 -0.079 -0.008  0.052 -0.499                     
## X1TCHEXT       0.047 -0.030  0.002 -0.063  0.276  0.294              
## X1TCHINT       0.003 -0.051  0.012  0.034  0.109  0.121 -0.074       
## X1BMI          0.071 -0.065 -0.005  0.029  0.001 -0.015 -0.044 -0.005
## X1_CHSEX_R    -0.415  0.069 -0.003 -0.116  0.146  0.020 -0.066  0.058
## X1_HISP_R     -0.079 -0.050  0.000  0.082  0.002  0.003 -0.003  0.008
## X1_BLACK_R    -0.075 -0.041  0.013  0.081 -0.002 -0.019 -0.079  0.006
## X1_ASIAN_R    -0.119  0.030 -0.001 -0.057 -0.011  0.026  0.028  0.014
## X1_AMINAN_R   -0.020 -0.004 -0.012  0.005  0.011  0.007  0.010 -0.002
## X1_MULTR_R    -0.125  0.014  0.012 -0.020  0.007  0.004 -0.013  0.003
## X1PAR1RAC     -0.050 -0.031 -0.002  0.043 -0.025  0.037  0.022  0.015
## time:X1PUBPRI  0.382 -0.513 -0.763 -0.011  0.006 -0.005 -0.003 -0.008
##               X1BMI  X1_CHS X1_HIS X1_BLA X1_ASI X1_AMI X1_MUL X1PAR1
## time                                                                 
## X1PUBPRI                                                             
## X1MTHET                                                              
## X1TCHAPP                                                             
## X1TCHPER                                                             
## X1TCHEXT                                                             
## X1TCHINT                                                             
## X1BMI                                                                
## X1_CHSEX_R    -0.008                                                 
## X1_HISP_R     -0.043  0.013                                          
## X1_BLACK_R    -0.021  0.039  0.457                                   
## X1_ASIAN_R     0.063  0.046  0.395  0.387                            
## X1_AMINAN_R   -0.027  0.009  0.070  0.067  0.047                     
## X1_MULTR_R     0.028  0.032  0.257  0.249  0.224  0.041              
## X1PAR1RAC     -0.040 -0.027 -0.580 -0.604 -0.538 -0.073 -0.299       
## time:X1PUBPRI -0.004  0.007 -0.001 -0.009  0.001  0.001 -0.004  0.005
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -6.65670345 -0.58152488  0.03949771  0.57214713  4.68063282 
## 
## Number of Observations: 18048
## Number of Groups: 4512

model41 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK1, correlation = corAR1(), method = "ML")


anova(model11, model21, model31, model41)

##         Model df       AIC       BIC     logLik   Test   L.Ratio p-value
## model11     1  3 28781.386 28804.789 -14387.693                         
## model21     2 19  9400.087  9548.302  -4681.043 1 vs 2 19413.300  <.0001
## model31     3 21  9271.457  9435.273  -4614.728 2 vs 3   132.630  <.0001
## model41     4 22  9258.546  9430.163  -4607.273 3 vs 4    14.911   1e-04

model12 = lme(fixed = X1TCHCON ~1, random =  ~ 1 | id, data = ECLSK2, method = "ML")


model22 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC
              , random = ~1 | id, data = ECLSK2, method = "ML")

model32 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK2, method = "ML")

model42 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK2, correlation = corAR1(), method = "ML")

anova(model12, model22, model42)

##         Model df       AIC       BIC     logLik   Test   L.Ratio p-value
## model12     1  3 28727.444 28750.850 -14360.722                         
## model22     2 19  9536.966  9685.206  -4749.483 1 vs 2 19222.478  <.0001
## model42     3 22  9332.478  9504.124  -4644.239 2 vs 3   210.488  <.0001

model13 = lme(fixed = X1TCHCON ~1, random =  ~ 1 | id, data = ECLSK3, method = "ML")


model23 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC
              , random = ~1 | id, data = ECLSK3, method = "ML")

model33 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK3, method = "ML")

model43 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK3, correlation = corAR1(), method = "ML")

anova(model13, model23, model33, model43)

##         Model df       AIC       BIC     logLik   Test   L.Ratio p-value
## model13     1  3 28412.689 28436.081 -14203.345                         
## model23     2 19  8988.660  9136.807  -4475.330 1 vs 2 19456.030  <.0001
## model33     3 21  8841.699  9005.441  -4399.850 2 vs 3   150.960  <.0001
## model43     4 22  8836.584  9008.123  -4396.292 3 vs 4     7.115  0.0076

model14 = lme(fixed = X1TCHCON ~1, random =  ~ 1 | id, data = ECLSK4, method = "ML")


model24 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~1 | id, data = ECLSK4, method = "ML")

model34 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK4, method = "ML")

model44 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK4, correlation = corAR1(), method = "ML")

anova(model14, model24, model34, model44)

##         Model df       AIC       BIC     logLik   Test   L.Ratio p-value
## model14     1  3 28895.127 28918.524 -14444.563                         
## model24     2 19  9870.243 10018.425  -4916.122 1 vs 2 19056.884  <.0001
## model34     3 21  9764.713  9928.492  -4861.356 2 vs 3   109.530  <.0001
## model44     4 22  9735.541  9907.120  -4845.771 3 vs 4    31.171  <.0001

model15 = lme(fixed = X1TCHCON ~1, random =  ~ 1 | id, data = ECLSK5, method = "ML")


model25 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~1 | id, data = ECLSK5, method = "ML")

model35 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK5, method = "ML")

model45 = lme(fixed = X1TCHCON ~ time*X1PUBPRI  +X1MTHET +  X1TCHAPP + X1TCHPER + X1TCHEXT + X1TCHAPP+X1TCHPER +X1TCHEXT+ X1TCHINT+X1MTHET+X1BMI +X1_CHSEX_R + X1_HISP_R +X1_BLACK_R+X1_ASIAN_R+X1_AMINAN_R+X1_MULTR_R+X1PAR1RAC, random = ~time | id, data = ECLSK5, correlation = corAR1(), method = "ML")

anova(model15, model25, model35, model45)

##         Model df       AIC       BIC     logLik   Test   L.Ratio p-value
## model15     1  3 29202.503 29225.919 -14598.251                         
## model25     2 19  9803.909  9952.208  -4882.955 1 vs 2 19430.594  <.0001
## model35     3 21  9680.268  9844.178  -4819.134 2 vs 3   127.641  <.0001
## model45     4 22  9665.041  9836.756  -4810.521 3 vs 4    17.227  <.0001

Next I need to grab the parameter estimate and standard error for the PUBPRI interaction with time variable, because this is the variable that can tell us if students in alternative to assigned public school change on the self control over time relative to students in assigned public schools. I will also get the degrees of freedom for later use.

Then I use the mi.meld function in Amelia to “average” the parameter estimates and standard errors across the five imputed datasets to create one parameter estimate and standard error for the interaction effect between PUBPRI and time.

n = model41$fixDF$X[3]
model41 = summary(model41)
model41= model41$tTable[17,c(1:2)]

model42 = summary(model42)
model42= model42$tTable[17,c(1:2)]

model43 = summary(model43)
model43= model43$tTable[17,c(1:2)]

model44 = summary(model44)
model44= model44$tTable[17,c(1:2)]

model45 = summary(model45)
model45= model45$tTable[17,c(1:2)]

modelAll = as.data.frame(t(cbind(model41, model42, model43, model44, model45)))
rownames(modelAll) <- c()

modelPars = as.data.frame(modelAll$Value)
colnames(modelPars) = c("Pars")
modelSE = as.data.frame(modelAll$Std.Error)
colnames(modelSE) = c("SE")
library(Amelia)
allPars = mi.meld(q = modelPars, se = modelSE)

allParsPE = t(as.data.frame(allPars$q.mi))

allParsSE =  t(as.data.frame(allPars$se.mi))

allParSesPaperSC = cbind(allParsPE, allParsSE)
colnames(allParSesPaperSC) = c("ParameterEstimate", "StandardError")
write.csv(allParSesPaperSC, "allParSesPaperSC.csv")

Here I am creating the t-statistic and p-value using the parameter estimate for the interaction of the alternative with time variable for teacher reported self control by dividing the parameter estimate by the standard error to get the t-statistic, then using the degrees of freedom for the interaction effect to get the associated p-value and confidence interval.

allParSesPaperSC = data.frame(allParSesPaperSC)
allParSesPaperSC$TStatistic = allParSesPaperSC$ParameterEstimate / allParSesPaperSC$StandardError

pValue = 2*pt(-abs(allParSesPaperSC$TStatistic), df = n)
allParSesPaperSC$pValue = pValue
allParSesPaperSC

##      ParameterEstimate StandardError TStatistic    pValue
## Pars      -0.006683166   0.004842435  -1.380125 0.1675709

Upper = 1.96*allParSesPaperSC$StandardError+allParSesPaperSC$ParameterEstimate
Lower = -(1.96*allParSesPaperSC$StandardError-allParSesPaperSC$ParameterEstimate)
allParSesPaperSC$Upper = Upper
allParSesPaperSC$Lower = Lower
allParSesPaperSC

##      ParameterEstimate StandardError TStatistic    pValue       Upper
## Pars      -0.006683166   0.004842435  -1.380125 0.1675709 0.002808006
##            Lower
## Pars -0.01617434