Link to questions for Felix
Initial data-munging was done using a series of SQL queries. For direct further education entrants we took those who indicated that they had finished year 12 the year before and indicated that they were currently undertaking some form of teriary study. Gap year students were derived by taking those who had completed year 12 the year before and indicated that they had defered further education and/or confirmed the following year that they had defered further education in the year previous.
Further education drop-outs were taking by exploring if in any of the years under investigation they indicated they had withdrawn from their teriary education course. Gap-year returners composed those who had, at any time during the period of investigation indicated that they currently undertaking teriary education. SQL queries underlying theis subsetting are in .Rmd file associated with this file.
## Loading required package: DBI
## [1] "META1995" "META1998" "META2003" "Student1995" "Student1998"
## [6] "Student2003"
The number of adolescence who directly entered university after high-school was 2259. The number of adolescents that took a gap-year (offered further education but defered) was 645. Of the direct entrants, those who withdraw from further education in the direct entrant group were 81. For ther gap-year group, those who eventually returned to further education were 461.
The PISA database consistst on individuals who are the same age but may be in a veriety of year in school grades. In the current research we wanted to compare individuals in terms of years since high-school graduation rather than age in years. As such, we needed to rearrange the data.
The plots show relatively consistent quadratic growth suggesting an increase in satisfaction with career and future prospects before returning to high-school levels. The dramatic spike in satisfaction in the 'gap-year' group reflects large amounts of missing data in this group in the final waves. With Bayes imputation this mean returns to the trend present in both groups.
Growth curve models were fitted via hierachical bayes models to account for missing data and provide appropriate estiamtes of uncertainty. Before fitting these models, however, we fitted standard glmm linear and quadratic models in the R package lme4. This was done to choose the appropriate model to fit in JAGS. Most models appeared to display quadratic growth, fitted via the following using the following script.
Hierachical Bayes Model - JAGS script
model{
# Model
for (i in 1:n) {
mu[i] <- alpha[id[i]] + beta[id[i]] * wave[i] + beta2 * group[i] +
beta3 * wave[i] * group[i] + beta.quad * wave[i] *wave[i];
Y[i] ~ dnorm(mu[i], tau.c)
}
for (j in 1:n.id) {
alpha[j] ~ dnorm(alpha.mu, alpha.tau);
beta[j] ~ dnorm(beta.mu, beta.tau);
}
# Priors
alpha.mu ~ dnorm(0, 1.0E-4);
beta.mu ~ dnorm(0, 1.0E-4);
tau.c ~ dgamma(1.0E-3, 1.0E-3);
alpha.tau ~ dgamma(1.0E-3, 1.0E-3);
beta.tau ~ dgamma(1.0E-3, 1.0E-3);
beta2 ~ dnorm(0, 1.0E-4);
beta3 ~ dnorm(0, 1.0E-4);
beta.quad ~ dnorm(0, 1.0E-4);
alpha.sigma <- 1.0/sqrt(alpha.tau);
beta.sigma <- 1.0/sqrt(beta.tau);
}
GLMM and Hierachical Bayes models were then run. I do have questions here though:
## Linear mixed model fit by REML ['lmerMod']
## Formula: CAREER.PROS ~ wave + group + wave:group + (wave | id)
## Data: careerPros
##
## REML criterion at convergence: 20886
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.12411 0.3523
## wave 0.00636 0.0797 -0.29
## Residual 0.21419 0.4628
## Number of obs: 12884, groups: id, 2883
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.39699 0.01145 296.7
## wave -0.01147 0.00403 -2.8
## group 0.00306 0.01145 0.3
## wave:group -0.00120 0.00403 -0.3
##
## Correlation of Fixed Effects:
## (Intr) wave group
## wave -0.581
## group 0.565 -0.336
## wave:group -0.336 0.592 -0.581
## Linear mixed model fit by REML ['lmerMod']
## Formula: CAREER.PROS ~ wave + group + I(wave^2) + wave:group + I(wave^2):group + (wave | id)
## Data: careerPros
##
## REML criterion at convergence: 20859
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.12503 0.354
## wave 0.00656 0.081 -0.30
## Residual 0.21279 0.461
## Number of obs: 12884, groups: id, 2883
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.365820 0.012574 267.7
## wave 0.051468 0.011424 4.5
## group 0.000293 0.012574 0.0
## I(wave^2) -0.015303 0.002628 -5.8
## wave:group 0.006906 0.011424 0.6
## group:I(wave^2) -0.002326 0.002628 -0.9
##
## Correlation of Fixed Effects:
## (Intr) wave group I(w^2) wv:grp
## wave -0.573
## group 0.562 -0.331
## I(wave^2) 0.412 -0.935 0.244
## wave:group -0.331 0.602 -0.573 -0.578
## grp:I(wv^2) 0.244 -0.578 0.412 0.632 -0.935
## Data: careerPros
## Models:
## M_linear: CAREER.PROS ~ wave + group + wave:group + (wave | id)
## M_quad: CAREER.PROS ~ wave + group + I(wave^2) + wave:group + I(wave^2):group +
## M_quad: (wave | id)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## M_linear 8 20867 20927 -10426 20851
## M_quad 10 20825 20899 -10402 20805 46.7 2 7.1e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Inference for Bugs model at "/Users/phparker/Dropbox/Projects_Research/LSAY_GapYear//QuadraticGrowth_ByGroup.txt", fit using jags,
## 4 chains, each with 15000 iterations (first 1000 discarded), n.thin = 10
## n.sims = 5600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha.mu 3.044 9.39e-01 0.037 3.371 3.381 3.389 3.403 1.014 310
## alpha.sigma 0.300 1.02e-01 0.013 0.331 0.337 0.342 0.351 1.026 200
## beta.mu -0.030 2.20e-02 -0.051 -0.042 -0.037 -0.031 0.041 1.083 96
## beta.quad 0.038 1.45e-01 -0.020 -0.016 -0.014 -0.012 0.495 1.016 280
## beta.sigma 0.055 1.70e-02 0.013 0.055 0.061 0.065 0.071 1.003 1300
## beta2 -0.194 5.40e-01 -1.915 -0.011 -0.002 0.007 0.021 1.014 310
## beta3 -0.005 1.20e-02 -0.024 -0.011 -0.006 -0.001 0.025 1.039 150
## beta4 0.027 7.90e-02 -0.007 -0.003 -0.001 0.001 0.274 1.017 270
## deviance 21003.544 1.10e+04 16736.368 16908.711 17007.291 17134.976 54746.529 1.019 250
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 59865655.1 and DIC = 59886658.6
## DIC is an estimate of expected predictive error (lower deviance is better).
## Linear mixed model fit by REML ['lmerMod']
## Formula: FUTURE.PROS ~ wave + group + wave:group + (wave | id)
## Data: futPros
##
## REML criterion at convergence: 18131
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.10635 0.3261
## wave 0.00243 0.0493 -0.23
## Residual 0.17588 0.4194
## Number of obs: 12890, groups: id, 2882
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.421000 0.010468 327
## wave 0.000731 0.003398 0
## group 0.010825 0.010468 1
## wave:group -0.002889 0.003398 -1
##
## Correlation of Fixed Effects:
## (Intr) wave group
## wave -0.569
## group 0.564 -0.330
## wave:group -0.330 0.595 -0.569
## Linear mixed model fit by REML ['lmerMod']
## Formula: FUTURE.PROS ~ wave + group + I(wave^2) + wave:group + I(wave^2):group + (wave | id)
## Data: futPros
##
## REML criterion at convergence: 18135
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.1068 0.327
## wave 0.0025 0.050 -0.24
## Residual 0.1755 0.419
## Number of obs: 12890, groups: id, 2882
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.40820 0.01149 296.6
## wave 0.02590 0.01023 2.5
## group 0.01580 0.01149 1.4
## I(wave^2) -0.00600 0.00235 -2.6
## wave:group -0.01181 0.01023 -1.2
## group:I(wave^2) 0.00200 0.00235 0.8
##
## Correlation of Fixed Effects:
## (Intr) wave group I(w^2) wv:grp
## wave -0.560
## group 0.562 -0.324
## I(wave^2) 0.411 -0.943 0.243
## wave:group -0.324 0.602 -0.560 -0.582
## grp:I(wv^2) 0.243 -0.582 0.411 0.632 -0.943
## Data: futPros
## Models:
## M_linear: FUTURE.PROS ~ wave + group + wave:group + (wave | id)
## M_quad: FUTURE.PROS ~ wave + group + I(wave^2) + wave:group + I(wave^2):group +
## M_quad: (wave | id)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## M_linear 8 18112 18171 -9048 18096
## M_quad 10 18099 18174 -9039 18079 16.6 2 0.00025 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Inference for Bugs model at "/Users/phparker/Dropbox/Projects_Research/LSAY_GapYear//QuadraticGrowth_ByGroup.txt", fit using jags,
## 4 chains, each with 15000 iterations (first 1000 discarded), n.thin = 10
## n.sims = 5600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha.mu 3.072 0.971 0.051 3.421 3.430 3.438 3.451 1.003 1200
## alpha.sigma 0.278 0.096 0.015 0.309 0.314 0.319 0.327 1.007 630
## beta.mu -0.004 0.017 -0.020 -0.012 -0.008 -0.003 0.052 1.105 80
## beta.quad 0.049 0.147 -0.010 -0.007 -0.005 -0.003 0.498 1.004 920
## beta.sigma 0.037 0.010 0.012 0.034 0.040 0.044 0.051 1.006 540
## beta2 -0.208 0.556 -1.934 -0.012 -0.003 0.004 0.017 1.003 1200
## beta3 0.000 0.013 -0.037 -0.004 0.001 0.005 0.028 1.020 230
## beta4 0.032 0.080 -0.002 0.001 0.003 0.005 0.274 1.004 880
## deviance 18839.970 12122.839 14048.083 14219.937 14310.634 14432.482 54848.083 1.005 730
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 73279322.2 and DIC = 73298162.2
## DIC is an estimate of expected predictive error (lower deviance is better).
## Linear mixed model fit by REML ['lmerMod']
## Formula: LIFE.SAT ~ wave + group + wave:group + (wave | id)
## Data: lifeSat
##
## REML criterion at convergence: 17944
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.10782 0.3284
## wave 0.00226 0.0476 -0.11
## Residual 0.14805 0.3848
## Number of obs: 14154, groups: id, 2883
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.56615 0.00873 409
## wave -0.01603 0.00237 -7
## group 0.01825 0.00873 2
## wave:group -0.00242 0.00237 -1
##
## Correlation of Fixed Effects:
## (Intr) wave group
## wave -0.289
## group 0.421 0.039
## wave:group 0.039 0.286 -0.289
## Linear mixed model fit by REML ['lmerMod']
## Formula: LIFE.SAT ~ wave + group + I(wave^2) + wave:group + I(wave^2):group + (wave | id)
## Data: lifeSat
##
## REML criterion at convergence: 17929
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.10826 0.3290
## wave 0.00234 0.0484 -0.12
## Residual 0.14738 0.3839
## Number of obs: 14154, groups: id, 2883
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.56068 0.00929 383
## wave 0.01031 0.00492 2
## group 0.03560 0.00929 4
## I(wave^2) -0.00732 0.00127 -6
## wave:group -0.02011 0.00492 -4
## group:I(wave^2) 0.00304 0.00127 2
##
## Correlation of Fixed Effects:
## (Intr) wave group I(w^2) wv:grp
## wave -0.216
## group 0.392 0.284
## I(wave^2) -0.016 -0.828 -0.337
## wave:group 0.284 -0.447 -0.216 0.343
## grp:I(wv^2) -0.337 0.343 -0.016 -0.070 -0.828
## Data: lifeSat
## Models:
## M_linear: LIFE.SAT ~ wave + group + wave:group + (wave | id)
## M_quad: LIFE.SAT ~ wave + group + I(wave^2) + wave:group + I(wave^2):group +
## M_quad: (wave | id)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## M_linear 8 17923 17984 -8954 17907
## M_quad 10 17890 17965 -8935 17870 37.3 2 7.8e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Inference for Bugs model at "/Users/phparker/Dropbox/Projects_Research/LSAY_GapYear//QuadraticGrowth_ByGroup.txt", fit using jags,
## 4 chains, each with 15000 iterations (first 1000 discarded), n.thin = 10
## n.sims = 5600 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## alpha.mu 3.199 9.63e-01 0.040 3.525 3.533 3.539 3.552 1.012 380
## alpha.sigma 0.294 9.60e-02 0.013 0.322 0.328 0.332 0.340 1.021 260
## beta.mu -0.030 1.10e-02 -0.044 -0.032 -0.029 -0.026 -0.006 1.178 80
## beta.quad 0.027 9.90e-02 -0.010 -0.008 -0.007 -0.006 0.353 1.010 440
## beta.sigma 0.038 9.00e-03 0.013 0.037 0.040 0.043 0.048 1.001 5400
## beta2 -0.133 3.93e-01 -1.411 -0.006 0.002 0.009 0.021 1.012 380
## beta3 -0.025 6.30e-02 -0.224 -0.007 -0.004 -0.001 0.004 1.020 270
## beta4 -0.005 2.40e-02 -0.083 0.002 0.003 0.004 0.006 1.015 330
## deviance 18337.036 1.45e+04 12985.293 13158.038 13247.878 13362.111 63412.247 1.017 290
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 104176064.9 and DIC = 104194401.9
## DIC is an estimate of expected predictive error (lower deviance is better).
For Direct University Entrant Students, Status equals:
## $`3`
##
## 1 5
## 2180 68
##
## $`4`
##
## 1 2 3 4 5
## 1900 3 1 1 171
##
## $`5`
##
## 1 2 3 4 5
## 1648 80 9 3 226
##
## $`6`
##
## 1 2 3 4 5
## 1078 409 112 34 191
##
## $`7`
##
## 1 2 3 4 5
## 502 750 200 70 166
##
## $`8`
##
## 1 2 3 4 5
## 88 239 41 16 45
##
## $`9`
##
## 1 2 5
## 2 8 1
## Linear mixed model fit by REML ['lmerMod']
## Formula: LIFE.SAT ~ wave + as.factor(STATUS) + I(wave^2) + (wave | id)
## Data: UlifeSat
##
## REML criterion at convergence: 12959
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.15672 0.3959
## wave 0.00239 0.0489 -0.52
## Residual 0.14420 0.3797
## Number of obs: 10227, groups: id, 2248
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.34444 0.04716 70.9
## wave 0.08999 0.01948 4.6
## as.factor(STATUS)2 -0.00278 0.01584 -0.2
## as.factor(STATUS)3 -0.03064 0.02604 -1.2
## as.factor(STATUS)4 -0.04071 0.04351 -0.9
## as.factor(STATUS)5 0.00296 0.01945 0.2
## I(wave^2) -0.00996 0.00193 -5.2
##
## Correlation of Fixed Effects:
## (Intr) wave a.(STATUS)2 a.(STATUS)3 a.(STATUS)4 a.(STATUS)5
## wave -0.972
## a.(STATUS)2 -0.120 0.184
## a.(STATUS)3 -0.056 0.095 0.296
## a.(STATUS)4 -0.043 0.066 0.142 0.087
## a.(STATUS)5 0.075 -0.088 0.111 0.069 0.057
## I(wave^2) 0.929 -0.983 -0.286 -0.159 -0.103 0.058
## Linear mixed model fit by REML ['lmerMod']
## Formula: CAREER.PROS ~ wave + as.factor(STATUS) + I(wave^2) + (wave | id)
## Data: UcarrerPros
##
## REML criterion at convergence: 16533
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.24502 0.4950
## wave 0.00649 0.0805 -0.72
## Residual 0.21475 0.4634
## Number of obs: 10158, groups: id, 2248
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.10110 0.05810 53.4
## wave 0.12922 0.02404 5.4
## as.factor(STATUS)2 -0.01786 0.01958 -0.9
## as.factor(STATUS)3 0.04541 0.03231 1.4
## as.factor(STATUS)4 -0.09334 0.05460 -1.7
## as.factor(STATUS)5 -0.09982 0.02334 -4.3
## I(wave^2) -0.01332 0.00239 -5.6
##
## Correlation of Fixed Effects:
## (Intr) wave a.(STATUS)2 a.(STATUS)3 a.(STATUS)4 a.(STATUS)5
## wave -0.975
## a.(STATUS)2 -0.125 0.189
## a.(STATUS)3 -0.059 0.098 0.295
## a.(STATUS)4 -0.044 0.067 0.138 0.084
## a.(STATUS)5 0.069 -0.081 0.115 0.072 0.057
## I(wave^2) 0.928 -0.982 -0.292 -0.162 -0.104 0.050
## Linear mixed model fit by REML ['lmerMod']
## Formula: FUTURE.PROS ~ wave + as.factor(STATUS) + I(wave^2) + (wave | id)
## Data: UfutPros
##
## REML criterion at convergence: 14284
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.14390 0.3793
## wave 0.00223 0.0472 -0.54
## Residual 0.17561 0.4191
## Number of obs: 10156, groups: id, 2247
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.20696 0.05180 61.9
## wave 0.08506 0.02142 4.0
## as.factor(STATUS)2 -0.01149 0.01718 -0.7
## as.factor(STATUS)3 -0.01896 0.02827 -0.7
## as.factor(STATUS)4 -0.05975 0.04710 -1.3
## as.factor(STATUS)5 -0.03341 0.02072 -1.6
## I(wave^2) -0.00765 0.00212 -3.6
##
## Correlation of Fixed Effects:
## (Intr) wave a.(STATUS)2 a.(STATUS)3 a.(STATUS)4 a.(STATUS)5
## wave -0.975
## a.(STATUS)2 -0.114 0.176
## a.(STATUS)3 -0.052 0.089 0.292
## a.(STATUS)4 -0.040 0.063 0.146 0.089
## a.(STATUS)5 0.072 -0.083 0.124 0.077 0.060
## I(wave^2) 0.933 -0.984 -0.278 -0.152 -0.100 0.052
For gap-year, special equals:
## $`1`
##
## 0 1
## 87 454
##
## $`2`
##
## 0 1
## 87 454
##
## $`3`
##
## 0 1
## 87 454
##
## $`4`
##
## 0 1
## 87 454
##
## $`5`
##
## 0 1
## 87 454
##
## $`6`
##
## 0 1
## 87 454
##
## $`7`
##
## 0 1
## 87 454
##
## $`8`
##
## 0 1
## 87 454
##
## $`9`
##
## 0 1
## 87 454
## Linear mixed model fit by REML ['lmerMod']
## Formula: LIFE.SAT ~ wave + as.factor(special) + I(wave^2) + (wave | id)
## Data: DlifeSat
##
## REML criterion at convergence: 4540
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.11588 0.3404
## wave 0.00237 0.0487 -0.50
## Residual 0.15647 0.3956
## Number of obs: 3571, groups: id, 541
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.54131 0.04527 78.2
## wave 0.01567 0.01552 1.0
## as.factor(special)1 0.05981 0.03889 1.5
## I(wave^2) -0.00441 0.00185 -2.4
##
## Correlation of Fixed Effects:
## (Intr) wave as.()1
## wave -0.585
## as.fctr(s)1 -0.720 -0.001
## I(wave^2) 0.512 -0.966 0.003
## Linear mixed model fit by REML ['lmerMod']
## Formula: CAREER.PROS ~ wave + as.factor(special) + I(wave^2) + (wave | id)
## Data: DcarrerPros
##
## REML criterion at convergence: 3992
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.20178 0.4492
## wave 0.00665 0.0815 -0.71
## Residual 0.20255 0.4501
## Number of obs: 2545, groups: id, 541
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.95135 0.12374 23.85
## wave 0.15484 0.04865 3.18
## as.factor(special)1 0.13180 0.04508 2.92
## I(wave^2) -0.01685 0.00477 -3.53
##
## Correlation of Fixed Effects:
## (Intr) wave as.()1
## wave -0.933
## as.fctr(s)1 -0.311 0.006
## I(wave^2) 0.901 -0.989 -0.004
## Linear mixed model fit by REML ['lmerMod']
## Formula: FUTURE.PROS ~ wave + as.factor(special) + I(wave^2) + (wave | id)
## Data: DfutPros
##
## REML criterion at convergence: 3587
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.16026 0.4003
## wave 0.00297 0.0545 -0.67
## Residual 0.17764 0.4215
## Number of obs: 2553, groups: id, 541
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 3.32741 0.11456 29.04
## wave 0.02401 0.04496 0.53
## as.factor(special)1 0.07038 0.04146 1.70
## I(wave^2) -0.00281 0.00440 -0.64
##
## Correlation of Fixed Effects:
## (Intr) wave as.()1
## wave -0.932
## as.fctr(s)1 -0.309 0.005
## I(wave^2) 0.903 -0.990 -0.004