Description of the Data
The Raw Data
The original dataset comes from Snijders and Bosker (2012), who adapted
the raw data of a 1989 study by H. P. Brandsma and W. M. Knuver. It
contains information on 4106 pupils at 216 schools and is available in
the R mice library (1). The 14 variables of the adapted dataset are
listed below; they cover demographic information on the students and
schools as well as pre- and post-test scores for language and
mathematics. According to the description of the original study (2), a
random sample of 250 Dutch primary schools was selected, within which
all seventh grade students were tested on their proficiency in Dutch
language and mathematics before and after an interval of one year.
Information was also gathered on student backgrounds and
school-related factors, with the intention of measuring the effects of
school and classroom characteristics on the students' progress in
these subjects.
sch - School number (numeric)
pup - Pupil ID (numeric)
iqv - IQ verbal (numeric)
iqp - IQ performal (numeric)
sex - Sex of pupil (categorical)
ses - SES score of pupil (numeric)
min - Minority member 0/1 (categorical)
rpg - Number of repeated groups, 0, 1, 2 (categorical)
lpr - language score PRE (numeric)
lpo - language score POST (numeric)
apr - Arithmetic score PRE (numeric)
apo - Arithmetic score POST (numeric)
den - Denomination classification 1-4 - at school level
(categorical)
ssi - School SES indicator - at school level (numeric)
The Analytic Dataset
Over the course of the past two analyses, we examined this dataset
through an initial exploration of the individual features and their
relationships (3), followed by data cleaning to create a final analytic
dataset (4). We found skewness among the continuous features and sparse
categories in the categorical features; both were addressed during
feature engineering, through transformations and meaningful regrouping
of categories respectively. Missing data were prevalent and were
imputed through multiple imputation using the R mice package. When
examining relationships between the features, we noticed correlation
among several of the continuous features and examined PCA to check its
relevance for future analysis. Clustering algorithms were also
considered, but we found no obvious evidence that they would help in
future analysis. Each of the numeric features was standardized to make
distance-based algorithms more effective.
The final analytic dataset consists of 4106 observations of 15
variables, listed as follows:
sch - School number (numeric)
pup - Pupil ID (numeric)
tiqv - Transformed, standardized IQ verbal (numeric)
iqp - Standardized IQ performal (numeric)
sex - Sex of pupil (categorical)
tses - Transformed, standardized SES score of pupil (numeric)
min - Minority member 0/1 (categorical)
has_repeated_group - Whether or not there was a repeated group 0/1
(categorical)
tlpr - Transformed, standardized language score PRE (numeric)
tlpo - Transformed, standardized language score POST (numeric)
apr - Standardized Arithmetic score PRE (numeric)
tapo - Transformed, standardized Arithmetic score POST (numeric)
den - Denomination classification 1-4 - at school level
(categorical)
tssi - Transformed, standardized School SES indicator - at school
level (numeric)
tpost - Transformed, standardized post-score (transformed +
standardized after calculating lpo+apo) (numeric)
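As a rough illustration of how a column such as tpost could be derived,
the sketch below sums two simulated scores, applies a power transform,
and standardizes the result. The simulated data, the helper name
power_transform, and the Box-Cox form of the transform are assumptions
made for illustration; only the lambda value (1.515152, reported for the
post-test score in the linear regression section) comes from this
report.

```r
# Hypothetical data: simulated stand-ins for the lpo and apo columns.
set.seed(1)
lpo <- round(runif(200, min = 10, max = 58))
apo <- round(runif(200, min = 2, max = 30))

# Assumed Box-Cox form of the power transform (valid for positive x).
power_transform <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

post  <- lpo + apo                                  # total post-test score
tpost <- as.numeric(scale(power_transform(post, lambda = 1.515152)))

c(mean = mean(tpost), sd = sd(tpost))               # approximately 0 and 1
```

Because scale() centers and rescales the column, the result has mean 0
and standard deviation 1 regardless of which lambda is used.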
A summary of the variables can be observed as follows:
## # A tibble: 2 × 2
## l_chg_pos n
## <lgl> <int>
## 1 FALSE 646
## 2 TRUE 3460
## # A tibble: 2 × 2
## a_chg_pos n
## <lgl> <int>
## 1 FALSE 431
## 2 TRUE 3675
## pup sch sex min den tlpr
## Min. : 1 Min. : 1.0 0:2100 0:3868 1:1271 Min. :-2.81814
## 1st Qu.:1073 1st Qu.: 66.0 1:2006 1: 238 2:1600 1st Qu.:-0.70954
## Median :2128 Median :132.0 3:1038 Median : 0.03653
## Mean :2121 Mean :129.2 4: 197 Mean : 0.00000
## 3rd Qu.:3170 3rd Qu.:187.0 3rd Qu.: 0.69505
## Max. :4214 Max. :259.0 Max. : 2.57173
## tses tssi tiqv tlpo
## Min. :-2.171494 Min. :-2.47302 Min. :-3.14261 Min. :-2.589462
## 1st Qu.:-0.579751 1st Qu.:-0.58918 1st Qu.:-0.67684 1st Qu.:-0.764652
## Median : 0.001848 Median :-0.08691 Median : 0.03873 Median : 0.005185
## Mean : 0.000000 Mean : 0.00000 Mean : 0.00000 Mean : 0.000000
## 3rd Qu.: 0.807671 3rd Qu.: 0.79825 3rd Qu.: 0.53429 3rd Qu.: 0.754953
## Max. : 1.752968 Max. : 2.24266 Max. : 3.47449 Max. : 2.183419
## tapo apr iqp tpost
## Min. :-2.258335 Min. :-3.13107 Min. :-3.03163 Min. :-2.562547
## 1st Qu.:-0.882041 1st Qu.:-0.84354 1st Qu.:-0.77656 1st Qu.:-0.742193
## Median : 0.002267 Median : 0.01428 Median :-0.02488 Median :-0.006606
## Mean : 0.000000 Mean : 0.00000 Mean : 0.00000 Mean : 0.000000
## 3rd Qu.: 0.805056 3rd Qu.: 0.87210 3rd Qu.: 0.72380 3rd Qu.: 0.800893
## Max. : 1.658299 Max. : 2.30180 Max. : 2.98187 Max. : 2.092595
## has_repeated_group l_chg a_chg
## 0:3572 0: 646 0: 431
## 1: 534 1:3460 1:3675
##
##
##
##
As we can see, there are no longer missing values in any of the
variables and all of the numeric variables are centered at 0. We can
take a quick look at their distributions as follows:

The transformed distributions, while not perfect, are generally
unimodal and roughly symmetric around a mean of 0.
Linear Regression
We intend to fit a linear regression model to identify factors
that impact the total post-score of the students and to quantify their
effects. The predictors identified in the initial round of feature
selection in the previous analysis were apr, tlpr, tiqv, iqp, tses,
rpg (recoded here as has_repeated_group), tssi, den, min, and sex. The
pairwise correlation plots of the numeric variables are shown below.

Some of the identified predictors are visibly correlated with one
another. We will check later whether this causes problems with our
regression assumptions. If necessary, PCA will be used to create
uncorrelated principal components. We also observe that all of the
correlations are identified
Main effect models
We will fit a full model with all the selected predictors and then
perform stepwise selection, which removes predictors when doing so
improves the model's AIC.
kable(summary(full_model)$coef, caption ="Full Main Effects Linear Model Parameter Estimates")
Full Main Effects Linear Model Parameter Estimates
Term | Estimate | Std. Error | t value | p-value
(Intercept) | -0.1308347 | 0.0200139 | -6.537187 | 0.0000000
sex1 | 0.0668590 | 0.0189540 | 3.527429 | 0.0004242
min1 | 0.0524918 | 0.0412210 | 1.273421 | 0.2029408
den2 | 0.2517373 | 0.0225912 | 11.143168 | 0.0000000
den3 | 0.0911318 | 0.0248985 | 3.660127 | 0.0002553
den4 | 0.1054136 | 0.0467413 | 2.255254 | 0.0241700
tlpr | 0.3266385 | 0.0131490 | 24.841355 | 0.0000000
tses | 0.0926378 | 0.0112352 | 8.245346 | 0.0000000
tssi | 0.0619850 | 0.0106659 | 5.811492 | 0.0000000
tiqv | 0.1672796 | 0.0127085 | 13.162781 | 0.0000000
apr | 0.2524611 | 0.0118620 | 21.283210 | 0.0000000
iqp | 0.1275108 | 0.0113208 | 11.263382 | 0.0000000
has_repeated_group1 | -0.2388497 | 0.0288936 | -8.266537 | 0.0000000
Stepwise selection removed only min, which indicates whether or not
the student belongs to a minority group. This is the same feature that
had a nonsignificant p-value in the initial exploratory analysis, when
a linear model for post was fitted with the same predictors using the
mice-imputed dataset.
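For readers who want to see the mechanics, AIC-based stepwise selection
can be sketched with base R's step() on simulated data. The data frame,
the variable names x1 to x3, and the choice of direction = "both" are
assumptions made for illustration; the report does not state which
direction was used.

```r
# Simulated data: y depends on x1 and x2; x3 is pure noise.
set.seed(42)
n  <- 500
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$y <- 0.5 * df$x1 + 0.3 * df$x2 + rnorm(n)

full_fit <- lm(y ~ x1 + x2 + x3, data = df)

# step() drops (or re-adds) one term at a time while that lowers the AIC.
step_fit <- step(full_fit, direction = "both", trace = 0)

formula(step_fit)
AIC(step_fit) <= AIC(full_fit)   # the selected model is never worse on AIC
```

A term with a nonsignificant p-value, like min in our full model, is a
typical candidate for removal, although step() decides on AIC rather
than on p-values.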
We can now check the assumptions of both models, looking for evidence
of violations to normality, homoscedasticity, and multicollinearity.


Full main effects model:
Term | GVIF | Df | GVIF^(1/(2*Df))
sex | 1.051641 | 1 | 1.025495
min | 1.086965 | 1 | 1.042576
den | 1.104577 | 3 | 1.016715
tlpr | 2.025026 | 1 | 1.423034
tses | 1.478446 | 1 | 1.215914
tssi | 1.332431 | 1 | 1.154310
tiqv | 1.891633 | 1 | 1.375366
apr | 1.648014 | 1 | 1.283750
iqp | 1.501076 | 1 | 1.225184
has_repeated_group | 1.106545 | 1 | 1.051924
Stepwise main effects model:
Term | GVIF | Df | GVIF^(1/(2*Df))
sex | 1.051535 | 1 | 1.025443
den | 1.091854 | 3 | 1.014754
tlpr | 2.024256 | 1 | 1.422764
tses | 1.457989 | 1 | 1.207472
tssi | 1.329791 | 1 | 1.153166
tiqv | 1.870996 | 1 | 1.367844
apr | 1.645849 | 1 | 1.282906
iqp | 1.501039 | 1 | 1.225169
has_repeated_group | 1.105350 | 1 | 1.051356
The Q-Q plot is largely linear and therefore consistent with
normality of the residuals, and the VIF values show no multicollinearity
that would require adjusting the predictors in either model. However,
the residuals vs. fitted plots do show a “football” shape, with smaller
spread at both ends and much larger spread in the middle, rather than
completely random scatter around the horizontal line y = 0. This could
be evidence against the variance being constant, which would violate
the assumption of homogeneity of variance and can make the standard
errors and p-values of the linear regression model unreliable.
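To make the VIF values more concrete, the sketch below computes
variance inflation factors by hand on simulated predictors, using
VIF = 1 / (1 - R^2), where R^2 comes from regressing one predictor on
all the others. The data and the name vif_manual are assumptions made
for illustration; this is not the call that produced the VIF values
reported above, and it covers only numeric predictors.

```r
# Simulated predictors: x2 is built to be correlated with x1.
set.seed(7)
n  <- 1000
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n, sd = 0.7)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# VIF of each predictor: regress it on the others and use 1 / (1 - R^2).
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  fit_v  <- lm(reformulate(others, response = v), data = X)
  1 / (1 - summary(fit_v)$r.squared)
})
round(vif_manual, 3)

# For a term with 1 Df, the last reported value is the square root of
# the first, e.g. for sex in the full main effects model:
sqrt(1.051641)   # about 1.025495
```

Values near 1, as for most of our predictors, mean that a predictor is
barely explained by the others.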
However, with a large dataset, it is also possible that the narrowness
observed at each end arises because there are far fewer observations at
those points. This would naturally reduce the visual spread of the
residuals compared to points along the x axis where there are many more
observations, even if all observations have the same variance. We will
proceed with the rest of the linear regression analysis with caution.
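The argument about sparse extremes can be checked with a small
simulation. The sketch below generates data with constant error
variance by construction and compares the residual spread in the sparse
tails and the dense middle of the fitted values. All names, sample
sizes, and cut points are assumptions chosen for illustration.

```r
set.seed(123)
n <- 4000
x <- rnorm(n)
y <- 0.8 * x + rnorm(n, sd = 0.6)   # constant error variance by construction
fit <- lm(y ~ x)

# Group observations by fitted value: two sparse tails and a dense middle.
bins <- cut(fitted(fit), breaks = c(-Inf, -1.6, 1.6, Inf),
            labels = c("low", "middle", "high"))

spread_by_bin <- data.frame(
  n      = as.vector(table(bins)),
  sd     = as.vector(tapply(resid(fit), bins, sd)),
  spread = as.vector(tapply(resid(fit), bins, function(r) diff(range(r)))),
  row.names = levels(bins)
)
spread_by_bin
```

The residual range in the middle bin should come out wider than in the
tails simply because it contains many more points, while the per-bin
standard deviations should stay roughly comparable. A similar check on
the fitted models would help separate true heteroscedasticity from this
visual effect.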
Using these two candidate models, we will compare their performance
using 5-fold cross-validation.
## Linear Regression
##
## 4106 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3284, 3285
## Resampling results:
##
## RMSE Rsquared MAE
## 0.5926082 0.6489202 0.473492
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
## Linear Regression
##
## 4106 samples
## 9 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3285, 3284
## Resampling results:
##
## RMSE Rsquared MAE
## 0.592711 0.6489602 0.4737837
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
Comparing the RMSE, the R-squared, and the MAE, we observe very small
differences between the performance of the full and stepwise models.
The test RMSE of around 0.593 for both models indicates that the
predictions of the standardized total post-score were typically off by
about 0.593 standard deviations. Using the standard deviation
(118.3467) of the transformed post-test score and the optimal lambda
(1.515152) that was used to transform it, we can put this back into the
meaningful units of the original test score: predicted total post-test
scores were, on average, off by about 8.85 points.
With such similar performance between the two models, the principle
of parsimony favors the stepwise model as the best linear model for
predicting the total post-test score.
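The 5-fold cross-validation comparison above can be sketched in base R
as follows. The simulated data and the helper cv_rmse are assumptions
made for illustration; the reported results come from a different
resampling routine, so this only mirrors the idea of averaging the
held-out RMSE over folds.

```r
set.seed(2024)
n  <- 500
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$y <- 0.5 * df$x1 + 0.3 * df$x2 + rnorm(n, sd = 0.6)

# Mean held-out RMSE over k folds; assumes the response column is named y.
cv_rmse <- function(formula, data, k = 5) {
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  fold_rmse <- sapply(seq_len(k), function(i) {
    fit  <- lm(formula, data = data[folds != i, ])
    held <- data[folds == i, ]
    sqrt(mean((held$y - predict(fit, newdata = held))^2))
  })
  mean(fold_rmse)
}

# Re-seeding gives both models the same fold assignment.
set.seed(1); rmse_full    <- cv_rmse(y ~ x1 + x2 + x3, df)
set.seed(1); rmse_reduced <- cv_rmse(y ~ x1 + x2, df)
c(full = rmse_full, reduced = rmse_reduced)
```

When the dropped predictor carries little information, as with min in
our models, the two cross-validated RMSE values should be nearly
identical.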
Interaction models
To continue our investigation, we will go through a similar process
to the above but with a full model including all two-way interactions
and then run a stepwise variable selection algorithm to create a second
model.
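As a sketch of how such a model can be specified, the formula shorthand
.^2 expands to all main effects plus all two-way interactions. The data
below are simulated with the same column names and types as our
predictors (an assumption made purely for illustration); the point is
the number of coefficients, which matches the size of the full
interaction table.

```r
set.seed(3)
n <- 600
df <- data.frame(
  sex = factor(sample(0:1, n, replace = TRUE)),
  min = factor(sample(0:1, n, replace = TRUE)),
  den = factor(sample(1:4, n, replace = TRUE)),
  has_repeated_group = factor(sample(0:1, n, replace = TRUE)),
  tlpr = rnorm(n), tses = rnorm(n), tssi = rnorm(n),
  tiqv = rnorm(n), apr = rnorm(n), iqp = rnorm(n)
)
df$tpost <- 0.3 * df$tlpr + 0.25 * df$apr + rnorm(n)   # simulated response

main_fit <- lm(tpost ~ ., data = df)     # main effects only
int_fit  <- lm(tpost ~ .^2, data = df)   # main effects + two-way interactions

length(coef(main_fit))  # 13: intercept + 12 main-effect columns
length(coef(int_fit))   # 76: adds 63 two-way interaction columns
```

With 76 coefficients, many of them for sparse factor combinations, it is
unsurprising that stepwise selection removes most of the interaction
terms.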
Full Model Parameter Estimates (including 2-way Interaction Terms)
Term | Estimate | Std. Error | t value | p-value
(Intercept) | -0.1064979 | 0.0274184 | -3.8841685 | 0.0001043
sex1 | 0.0834443 | 0.0359217 | 2.3229521 | 0.0202313
min1 | 0.0301980 | 0.0857932 | 0.3519856 | 0.7248675
den2 | 0.2680596 | 0.0347261 | 7.7192615 | 0.0000000
den3 | 0.0693947 | 0.0375079 | 1.8501350 | 0.0643672
den4 | 0.0061992 | 0.0911167 | 0.0680359 | 0.9457604
tlpr | 0.3567131 | 0.0289933 | 12.3032765 | 0.0000000
tses | 0.0867820 | 0.0244123 | 3.5548412 | 0.0003825
tssi | 0.0811018 | 0.0221348 | 3.6639911 | 0.0002515
tiqv | 0.1636655 | 0.0276889 | 5.9108769 | 0.0000000
apr | 0.2416855 | 0.0255163 | 9.4718254 | 0.0000000
iqp | 0.1369337 | 0.0246568 | 5.5535907 | 0.0000000
has_repeated_group1 | -0.1968657 | 0.0613606 | -3.2083392 | 0.0013455
sex1:min1 | -0.0736986 | 0.0862125 | -0.8548486 | 0.3926858
sex1:den2 | -0.0186334 | 0.0464428 | -0.4012110 | 0.6882861
sex1:den3 | 0.0299292 | 0.0511305 | 0.5853495 | 0.5583455
sex1:den4 | -0.1077445 | 0.0987973 | -1.0905609 | 0.2755313
sex1:tlpr | -0.0003232 | 0.0267767 | -0.0120692 | 0.9903710
sex1:tses | -0.0068189 | 0.0232842 | -0.2928536 | 0.7696491
sex1:tssi | 0.0028004 | 0.0219202 | 0.1277525 | 0.8983512
sex1:tiqv | -0.0193043 | 0.0259309 | -0.7444518 | 0.4566466
sex1:apr | -0.0134254 | 0.0243814 | -0.5506387 | 0.5819119
sex1:iqp | 0.0476249 | 0.0230989 | 2.0617852 | 0.0392922
sex1:has_repeated_group1 | -0.0661562 | 0.0609397 | -1.0856009 | 0.2777205
min1:den2 | -0.0329569 | 0.1106262 | -0.2979121 | 0.7657856
min1:den3 | 0.2261507 | 0.1075253 | 2.1032317 | 0.0355074
min1:den4 | -0.2282251 | 0.2883784 | -0.7914081 | 0.4287524
min1:tlpr | 0.0079044 | 0.0608150 | 0.1299743 | 0.8965933
min1:tses | -0.0586789 | 0.0437015 | -1.3427185 | 0.1794388
min1:tssi | -0.0196607 | 0.0405581 | -0.4847537 | 0.6278774
min1:tiqv | -0.0335875 | 0.0532779 | -0.6304205 | 0.5284553
min1:apr | 0.0358244 | 0.0555244 | 0.6452014 | 0.5188334
min1:iqp | -0.0324949 | 0.0557085 | -0.5833024 | 0.5597224
min1:has_repeated_group1 | -0.0706659 | 0.1069767 | -0.6605724 | 0.5089243
den2:tlpr | -0.0500320 | 0.0324534 | -1.5416539 | 0.1232362
den3:tlpr | -0.0166327 | 0.0354607 | -0.4690444 | 0.6390633
den4:tlpr | -0.0009627 | 0.0662051 | -0.0145406 | 0.9883994
den2:tses | 0.0072369 | 0.0273919 | 0.2641977 | 0.7916411
den3:tses | 0.0057750 | 0.0305631 | 0.1889528 | 0.8501394
den4:tses | 0.0348950 | 0.0596058 | 0.5854295 | 0.5582917
den2:tssi | -0.0592745 | 0.0251548 | -2.3563879 | 0.0185012
den3:tssi | 0.0005565 | 0.0295975 | 0.0188021 | 0.9849999
den4:tssi | 0.1420138 | 0.0616173 | 2.3047714 | 0.0212300
den2:tiqv | 0.0495857 | 0.0319241 | 1.5532374 | 0.1204449
den3:tiqv | 0.0307880 | 0.0337703 | 0.9116894 | 0.3619868
den4:tiqv | -0.0150837 | 0.0621569 | -0.2426714 | 0.8082723
den2:apr | 0.0540228 | 0.0289048 | 1.8689948 | 0.0616960
den3:apr | -0.0033687 | 0.0322360 | -0.1045000 | 0.9167778
den4:apr | 0.0446393 | 0.0601225 | 0.7424725 | 0.4578444
den2:iqp | -0.0494948 | 0.0279247 | -1.7724345 | 0.0763980
den3:iqp | -0.0551015 | 0.0307824 | -1.7900305 | 0.0735241
den4:iqp | -0.0677145 | 0.0572862 | -1.1820404 | 0.2372594
den2:has_repeated_group1 | -0.1269193 | 0.0687168 | -1.8469906 | 0.0648218
den3:has_repeated_group1 | -0.1092569 | 0.0819053 | -1.3339425 | 0.1822981
den4:has_repeated_group1 | -0.0224686 | 0.1555317 | -0.1444630 | 0.8851421
tlpr:tses | -0.0086364 | 0.0161707 | -0.5340745 | 0.5933194
tlpr:tssi | 0.0152900 | 0.0149937 | 1.0197620 | 0.3079026
tlpr:tiqv | -0.0258014 | 0.0133845 | -1.9277052 | 0.0539620
tlpr:apr | -0.0033486 | 0.0148403 | -0.2256402 | 0.8214927
tlpr:iqp | -0.0130072 | 0.0157016 | -0.8283983 | 0.4074940
tlpr:has_repeated_group1 | -0.0830017 | 0.0413268 | -2.0084202 | 0.0446653
tses:tssi | -0.0172101 | 0.0104507 | -1.6467857 | 0.0996801
tses:tiqv | -0.0052787 | 0.0154559 | -0.3415336 | 0.7327197
tses:apr | 0.0085199 | 0.0145080 | 0.5872540 | 0.5570661
tses:iqp | 0.0170547 | 0.0137498 | 1.2403640 | 0.2149130
tses:has_repeated_group1 | 0.0004933 | 0.0354527 | 0.0139157 | 0.9888979
tssi:tiqv | -0.0076874 | 0.0147291 | -0.5219199 | 0.6017548
tssi:apr | -0.0175288 | 0.0139338 | -1.2580080 | 0.2084618
tssi:iqp | -0.0135971 | 0.0134064 | -1.0142289 | 0.3105344
tssi:has_repeated_group1 | -0.0143241 | 0.0331753 | -0.4317696 | 0.6659320
tiqv:apr | -0.0166028 | 0.0157760 | -1.0524064 | 0.2926762
tiqv:iqp | 0.0029304 | 0.0147826 | 0.1982330 | 0.8428728
tiqv:has_repeated_group1 | -0.0407572 | 0.0406055 | -1.0037374 | 0.3155654
apr:iqp | 0.0009278 | 0.0131381 | 0.0706159 | 0.9437069
apr:has_repeated_group1 | -0.0351618 | 0.0363686 | -0.9668194 | 0.3336923
iqp:has_repeated_group1 | 0.0249920 | 0.0353212 | 0.7075648 | 0.4792565
Stepwise Selected Model Parameter Estimates (including 2-way Interaction Terms)
Term | Estimate | Std. Error | t value | p-value
(Intercept) | -0.0938870 | 0.0214198 | -4.3831819 | 0.0000120
sex1 | 0.0673288 | 0.0188566 | 3.5705675 | 0.0003603
min1 | 0.0196275 | 0.0670235 | 0.2928444 | 0.7696560
den2 | 0.2432223 | 0.0231820 | 10.4918435 | 0.0000000
den3 | 0.0749242 | 0.0257478 | 2.9099217 | 0.0036347
den4 | -0.0477896 | 0.0696477 | -0.6861614 | 0.4926503
tlpr | 0.3374911 | 0.0135927 | 24.8288267 | 0.0000000
tses | 0.0924705 | 0.0115384 | 8.0141386 | 0.0000000
tssi | 0.0746634 | 0.0163702 | 4.5609465 | 0.0000052
tiqv | 0.1721524 | 0.0126799 | 13.5768219 | 0.0000000
apr | 0.2550977 | 0.0118231 | 21.5761823 | 0.0000000
iqp | 0.1087985 | 0.0142298 | 7.6458219 | 0.0000000
has_repeated_group1 | -0.2947357 | 0.0337451 | -8.7341685 | 0.0000000
sex1:iqp | 0.0325983 | 0.0186337 | 1.7494267 | 0.0802924
min1:den2 | -0.0932103 | 0.1021298 | -0.9126656 | 0.3614723
min1:den3 | 0.1838061 | 0.1017406 | 1.8066155 | 0.0708959
min1:den4 | -0.2264644 | 0.2773336 | -0.8165775 | 0.4142176
min1:tses | -0.0737487 | 0.0353519 | -2.0861338 | 0.0370285
den2:tssi | -0.0533074 | 0.0221998 | -2.4012563 | 0.0163834
den3:tssi | 0.0023204 | 0.0260210 | 0.0891761 | 0.9289463
den4:tssi | 0.1402723 | 0.0532475 | 2.6343438 | 0.0084619
tlpr:tiqv | -0.0389074 | 0.0084378 | -4.6111006 | 0.0000041
tlpr:has_repeated_group1 | -0.1138520 | 0.0314174 | -3.6238571 | 0.0002938
tses:tssi | -0.0145763 | 0.0095299 | -1.5295377 | 0.1262087
tssi:apr | -0.0157276 | 0.0096419 | -1.6311683 | 0.1029320
The main and interaction effects with p-values below an alpha level of
0.05 in the full interaction model are the following:
Main effects: sex, den, tlpr, tses, tssi, tiqv, apr, iqp, has_repeated_group
Interaction effects: sex:iqp, min:den, den:tssi, tlpr:has_repeated_group
All of the main effects that were selected in the main effects
stepwise selected model have been selected for the interaction model as
well, and min has also been included. The following two-way interaction
effects were also kept in the stepwise selected model: sex:iqp, min:den,
min:tses, den:tssi, tlpr:tiqv, tlpr:has_repeated_group, tses:tssi, and
tssi:apr. The ones significant at a 0.05 alpha level are the following:
min:tses, den:tssi, tlpr:tiqv, and tlpr:has_repeated_group.
Once again, we will look at the residual plots and VIF to check for
violations to the assumptions of linear regression.


## there are higher-order terms (interactions) in this model
## consider setting type = 'predictor'; see ?vif
Term | GVIF | Df | GVIF^(1/(2*Df))
sex | 3.809247 | 1 | 1.951729
min | 4.748375 | 1 | 2.179077
den | 23.809346 | 3 | 1.696125
tlpr | 9.928978 | 1 | 3.151028
tses | 7.039262 | 1 | 2.653161
tssi | 5.787093 | 1 | 2.405638
tiqv | 9.055627 | 1 | 3.009257
apr | 7.690275 | 1 | 2.773134
iqp | 7.180935 | 1 | 2.679727
has_repeated_group | 5.032781 | 1 | 2.243386
sex:min | 2.248965 | 1 | 1.499655
sex:den | 19.999647 | 3 | 1.647544
sex:tlpr | 4.071754 | 1 | 2.017859
sex:tses | 3.155137 | 1 | 1.776271
sex:tssi | 2.725603 | 1 | 1.650940
sex:tiqv | 3.767442 | 1 | 1.940990
sex:apr | 3.162665 | 1 | 1.778388
sex:iqp | 2.883590 | 1 | 1.698114
sex:has_repeated_group | 1.984716 | 1 | 1.408799
min:den | 3.469597 | 3 | 1.230401
min:tlpr | 4.163131 | 1 | 2.040375
min:tses | 2.830600 | 1 | 1.682439
min:tssi | 2.008851 | 1 | 1.417339
min:tiqv | 4.360568 | 1 | 2.088197
min:apr | 2.538613 | 1 | 1.593302
min:iqp | 2.627520 | 1 | 1.620963
min:has_repeated_group | 1.978804 | 1 | 1.406699
den:tlpr | 27.074722 | 3 | 1.732849
den:tses | 11.067287 | 3 | 1.492818
den:tssi | 15.308579 | 3 | 1.575757
den:tiqv | 22.032438 | 3 | 1.674340
den:apr | 17.173784 | 3 | 1.606242
den:iqp | 12.455987 | 3 | 1.522520
den:has_repeated_group | 6.129867 | 3 | 1.352826
tlpr:tses | 3.270337 | 1 | 1.808407
tlpr:tssi | 2.853894 | 1 | 1.689347
tlpr:tiqv | 2.955831 | 1 | 1.719253
tlpr:apr | 2.890235 | 1 | 1.700069
tlpr:iqp | 3.311464 | 1 | 1.819743
tlpr:has_repeated_group | 3.110993 | 1 | 1.763801
tses:tssi | 1.437101 | 1 | 1.198792
tses:tiqv | 3.235540 | 1 | 1.798761
tses:apr | 2.599010 | 1 | 1.612144
tses:iqp | 2.309144 | 1 | 1.519587
tses:has_repeated_group | 2.260765 | 1 | 1.503584
tssi:tiqv | 2.822146 | 1 | 1.679924
tssi:apr | 2.341525 | 1 | 1.530204
tssi:iqp | 2.108531 | 1 | 1.452078
tssi:has_repeated_group | 1.749666 | 1 | 1.322749
tiqv:apr | 3.332620 | 1 | 1.825546
tiqv:iqp | 3.171794 | 1 | 1.780953
tiqv:has_repeated_group | 3.115000 | 1 | 1.764936
apr:iqp | 2.238348 | 1 | 1.496111
apr:has_repeated_group | 2.216783 | 1 | 1.488887
iqp:has_repeated_group | 2.066145 | 1 | 1.437409
## there are higher-order terms (interactions) in this model
## consider setting type = 'predictor'; see ?vif
Term | GVIF | Df | GVIF^(1/(2*Df))
sex | 1.054132 | 1 | 1.026709
min | 2.910283 | 1 | 1.705955
den | 2.957540 | 3 | 1.198087
tlpr | 2.191604 | 1 | 1.480407
tses | 1.579220 | 1 | 1.256670
tssi | 3.178745 | 1 | 1.782903
tiqv | 1.907127 | 1 | 1.380988
apr | 1.658111 | 1 | 1.287677
iqp | 2.401859 | 1 | 1.549793
has_repeated_group | 1.528596 | 1 | 1.236364
sex:iqp | 1.884477 | 1 | 1.372762
min:den | 2.565254 | 3 | 1.170007
min:tses | 1.860164 | 1 | 1.363878
den:tssi | 7.024258 | 3 | 1.383885
tlpr:tiqv | 1.179702 | 1 | 1.086141
tlpr:has_repeated_group | 1.805575 | 1 | 1.343717
tses:tssi | 1.200084 | 1 | 1.095483
tssi:apr | 1.125975 | 1 | 1.061119
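The results look much the same as for the main effect full and
stepwise models, although the GVIF values of the full interaction model
are noticeably larger, which is expected when interaction terms share
variables with their main effects.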
Finally, we will perform 5-fold cross-validation with the interaction
models.
## Linear Regression
##
## 4106 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3284, 3285
## Resampling results:
##
## RMSE Rsquared MAE
## 0.5944228 0.6469267 0.4742027
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
## Linear Regression
##
## 4106 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3285, 3284
## Resampling results:
##
## RMSE Rsquared MAE
## 0.5903401 0.651934 0.4711809
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
Once again, the results appear similar to the main effect models.
Model Selection
We will assess the four models with a few goodness of fit
measures.
Goodness-of-fit Measures of Candidate Models
Model | SSE | R-squared | Adj. R-squared | AIC | BIC
Full Main Effect Model | 1434.527 | 0.6505416 | 0.6495171 | -4291.929 | -4209.766
Stepwise Main Effect Model | 1435.095 | 0.6504032 | 0.6494639 | -4292.302 | -4216.460
Full Interaction Model | 1400.588 | 0.6588094 | 0.6524597 | -4264.239 | -3783.903
Stepwise Interaction Model | 1412.310 | 0.6559537 | 0.6539304 | -4332.016 | -4174.011
Once again, the statistics are all quite similar across the board;
this, combined with the similar performance under 5-fold
cross-validation, suggests that the four models perform similarly in
terms of predictive potential. The adjusted R-squared and AIC values
favor the stepwise interaction model, while the BIC favors the stepwise
main effect model.
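The goodness-of-fit values for the full main effects model can be
reproduced with a few lines of arithmetic. This is a sketch under two
assumptions that the report does not state explicitly: that 1434.527 is
the residual sum of squares, and that AIC and BIC are on the
n * log(SSE / n) scale (the scale used by R's extractAIC for linear
models) with p = 13 estimated coefficients.

```r
# Values reported for the full main effects model.
n   <- 4106        # observations
sse <- 1434.527    # assumed: residual sum of squares
r2  <- 0.6505416   # R-squared
p   <- 13          # assumed: 12 slope coefficients plus the intercept

aic    <- n * log(sse / n) + 2 * p
bic    <- n * log(sse / n) + log(n) * p
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p)

round(c(AIC = aic, BIC = bic, adj_R2 = adj_r2), 4)
```

Under these assumptions the results agree with the reported -4291.929,
-4209.766, and 0.6495171. Since log(4106) is about 8.3, the BIC charges
far more per coefficient than the AIC's 2, which is why the BIC prefers
the smaller main effect model.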
As a final check, we will try one last reduced model in which all
nonsignificant interaction effects from the stepwise interaction model
are removed. Starting from the term with the greatest p-value, we
remove one term at a time and refit until every remaining term has a
p-value under the alpha of 0.05.
This ends up giving the following predictors:
Main effects: sex, min, den, tlpr, tses, tssi, tiqv, apr, iqp, has_repeated_group
Interaction effects: den:tssi, tlpr:tiqv, tlpr:has_repeated_group, tssi:apr
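The one-term-at-a-time procedure can be sketched with base R's drop1(),
which tests each term that can be removed without breaking the model
hierarchy. The simulated data and the helper backward_by_p are
assumptions made for illustration. Unlike the procedure described
above, this sketch considers every droppable term rather than only the
interaction effects.

```r
set.seed(11)
n  <- 500
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
df$y <- 0.6 * df$x1 + 0.4 * df$x2 + rnorm(n)

# Repeatedly drop the term with the largest p-value until all are < alpha.
backward_by_p <- function(fit, alpha = 0.05) {
  repeat {
    if (length(attr(terms(fit), "term.labels")) == 0) return(fit)
    tab <- drop1(fit, test = "F")          # one F-test per droppable term
    p   <- tab[["Pr(>F)"]][-1]             # first row is "<none>"
    if (max(p) < alpha) return(fit)
    worst <- rownames(tab)[-1][which.max(p)]
    fit   <- update(fit, as.formula(paste(". ~ . -", worst)))
  }
}

reduced <- backward_by_p(lm(y ~ x1 + x2 + x3 + x4, data = df))
attr(terms(reduced), "term.labels")
```

Refitting after each removal matters because the p-values of the
remaining terms change whenever a term is dropped.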
## Linear Regression
##
## 4106 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3284, 3284, 3285, 3286
## Resampling results:
##
## RMSE Rsquared MAE
## 0.5889761 0.6528472 0.4701559
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
Based on this, we do get a slightly smaller RMSE value compared to
the other models, but the measures are still very similar. Once again,
we can look at the goodness of fit statistics in comparison with our
other candidate models.
Goodness-of-fit Measures of Candidate Models
Model | SSE | R-squared | Adj. R-squared | AIC | BIC
Full Main Effect Model | 1434.527 | 0.6505416 | 0.6495171 | -4291.929 | -4209.766
Stepwise Main Effect Model | 1435.095 | 0.6504032 | 0.6494639 | -4292.302 | -4216.460
Full Interaction Model | 1400.588 | 0.6588094 | 0.6524597 | -4264.239 | -3783.903
Stepwise Interaction Model | 1412.310 | 0.6559537 | 0.6539304 | -4332.016 | -4174.011
Simplified Stepwise Interaction Model | 1417.166 | 0.6547708 | 0.6532503 | -4329.923 | -4209.839
The adjusted R-squared and AIC values still favor the original
stepwise interaction model. However, the BIC of the simplified stepwise
interaction model is better than that of both the original stepwise
interaction model and the full main effects model, although the
stepwise main effects model still has the best BIC overall.
The models all seem to perform very similarly. As such, our first
recommendation would be the simplest model, the stepwise main effects
model, which has the fewest predictors and no interaction effects. The
differences in performance appear small enough that it does not differ
substantially from the other models for predictive purposes. However,
for analytic purposes, and in order to examine in more depth how the
features affect the response, our second recommendation would be the
stepwise interaction or simplified stepwise interaction model. These
include significant interaction terms and allow the effect of certain
features on the response to change depending on other features.
The parameter estimates of the stepwise selected main effects model
are as follows:
Term | Estimate | Std. Error | t value | p-value
(Intercept) | -0.1261515 | 0.0196746 | -6.411893 | 0.0000000
sex1 | 0.0666164 | 0.0189545 | 3.514540 | 0.0004453
den2 | 0.2486549 | 0.0224628 | 11.069615 | 0.0000000
den3 | 0.0895137 | 0.0248680 | 3.599555 | 0.0003225
den4 | 0.1039741 | 0.0467312 | 2.224940 | 0.0261396
tlpr | 0.3263121 | 0.0131475 | 24.819363 | 0.0000000
tses | 0.0909549 | 0.0111580 | 8.151532 | 0.0000000
tssi | 0.0613804 | 0.0106562 | 5.760079 | 0.0000000
tiqv | 0.1655893 | 0.0126400 | 13.100441 | 0.0000000
apr | 0.2530087 | 0.0118551 | 21.341785 | 0.0000000
iqp | 0.1274391 | 0.0113215 | 11.256336 | 0.0000000
has_repeated_group1 | -0.2376406 | 0.0288801 | -8.228511 | 0.0000000
All of the candidate main effects are included except for min, which
had two levels to represent whether or not the student was a minority.
This model is based on the transformed, standardized values of the
variables. However, the signs of the parameters should be unchanged.
From this model, we can see that higher pre-test scores for arithmetic
and language, higher individual socioeconomic scores and school
socioeconomic status, and higher IQ scores are positively associated
with the total post-test score. For the categorical features, being
part of the group sex = 1 was associated with higher average post-test
scores, and the denomination classifications 2-4 performed better than
the denomination classified as 1. The only feature with a negative
parameter estimate is has_repeated_group: the presence of a repeated
group for the individual is negatively associated with their post-test
score.
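To show how these estimates combine, the sketch below computes the
predicted standardized post-score for one hypothetical pupil. The
coefficient values are copied from the stepwise main effects model; the
pupil's characteristics are invented for illustration.

```r
# Coefficients of the stepwise main effects model, copied from the report.
beta <- c("(Intercept)" = -0.1261515, sex1 = 0.0666164, den2 = 0.2486549,
          den3 = 0.0895137, den4 = 0.1039741, tlpr = 0.3263121,
          tses = 0.0909549, tssi = 0.0613804, tiqv = 0.1655893,
          apr = 0.2530087, iqp = 0.1274391, has_repeated_group1 = -0.2376406)

# Hypothetical pupil: sex = 1, den = 2, no repeated group, tlpr one
# standard deviation above the mean, all other numeric predictors at 0.
x <- setNames(numeric(length(beta)), names(beta))
x[c("(Intercept)", "sex1", "den2", "tlpr")] <- 1

pred <- sum(beta * x)
pred   # predicted tpost in standardized units
```

The prediction is on the transformed, standardized scale of tpost, so a
value of about 0.52 means roughly half a standard deviation above the
average on that scale, not a raw test score.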
Main Effect Models
We’ll begin by fitting two full main effect logistic regression
models.
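As a minimal sketch of what such a fit looks like, the example below
fits a logistic regression with glm() on simulated data. The data
frame, the variable names, and the true coefficients are assumptions
made for illustration; the outcome is deliberately imbalanced to
loosely mimic the improvement indicators.

```r
set.seed(5)
n  <- 1000
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Binary response where most cases are 1, loosely mimicking "improved".
df$improved <- rbinom(n, size = 1, prob = plogis(1.6 + 0.7 * df$x1))

logit_fit <- glm(improved ~ x1 + x2, data = df, family = binomial)

# Columns: Estimate, Std. Error, z value, Pr(>|z|)
round(summary(logit_fit)$coefficients, 4)
```

The estimates are on the log-odds scale, so a positive estimate means
that higher values of the predictor raise the odds of the outcome being
1.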
Language Improvement Logistic Full Main Effects Model Parameter Estimates
Term | Estimate | Std. Error | z value | p-value
(Intercept) | 1.6361975 | 0.0957575 | 17.086893 | 0.0000000
sex1 | 0.6883649 | 0.0957025 | 7.192757 | 0.0000000
min1 | 0.3101520 | 0.1864894 | 1.663107 | 0.0962909
den2 | 0.2228432 | 0.1139980 | 1.954799 | 0.0506068
den3 | -0.1227161 | 0.1161427 | -1.056597 | 0.2906954
den4 | 0.1635197 | 0.2613480 | 0.625678 | 0.5315262
tses | 0.1077922 | 0.0566879 | 1.901501 | 0.0572364
tssi | 0.0652368 | 0.0536795 | 1.215302 | 0.2242508
tiqv | 0.1769375 | 0.0589754 | 3.000193 | 0.0026981
apr | -0.1188408 | 0.0622530 | -1.908999 | 0.0562622
iqp | -0.1289585 | 0.0581090 | -2.219252 | 0.0264696
has_repeated_group1 | -0.3020738 | 0.1189820 | -2.538819 | 0.0111227
tapo | 0.8997798 | 0.0726812 | 12.379821 | 0.0000000
Arithmetic Improvement Logistic Full Main Effects Model Parameter Estimates
Term | Estimate | Std. Error | z value | p-value
(Intercept) | 3.0768835 | 0.1396567 | 22.0317616 | 0.0000000
sex1 | -0.5606782 | 0.1185947 | -4.7276856 | 0.0000023
min1 | 0.2706972 | 0.2094831 | 1.2922149 | 0.1962827
den2 | 0.2330302 | 0.1378617 | 1.6903179 | 0.0909671
den3 | 0.0932450 | 0.1440809 | 0.6471712 | 0.5175212
den4 | 0.3407571 | 0.3798616 | 0.8970559 | 0.3696891
tses | 0.0545382 | 0.0703005 | 0.7757869 | 0.4378748
tssi | 0.2195656 | 0.0653789 | 3.3583529 | 0.0007841
tiqv | -0.0549398 | 0.0824169 | -0.6666082 | 0.5050224
tlpr | -0.0926403 | 0.0907471 | -1.0208619 | 0.3073199
iqp | 0.4949656 | 0.0699995 | 7.0709867 | 0.0000000
has_repeated_group1 | -0.3500899 | 0.1382767 | -2.5318076 | 0.0113476
tlpo | 1.1531362 | 0.1001138 | 11.5182524 | 0.0000000
Just like the linear regression modeling process, we will continue by
conducting stepwise feature selection for both models.
Language Improvement Logistic Stepwise Main Effects Model Parameter Estimates
Term | Estimate | Std. Error | z value | p-value
(Intercept) | 1.6299280 | 0.0954738 | 17.0719897 | 0.0000000
sex1 | 0.6888521 | 0.0956709 | 7.2002278 | 0.0000000
min1 | 0.2992200 | 0.1860192 | 1.6085436 | 0.1077162
den2 | 0.2265830 | 0.1138930 | 1.9894375 | 0.0466529
den3 | -0.1187971 | 0.1159940 | -1.0241653 | 0.3057572
den4 | 0.2255486 | 0.2563477 | 0.8798541 | 0.3789384
tses | 0.1363646 | 0.0516008 | 2.6426827 | 0.0082252
tiqv | 0.1756527 | 0.0589409 | 2.9801527 | 0.0028810
apr | -0.1207157 | 0.0621988 | -1.9408044 | 0.0522820
iqp | -0.1317355 | 0.0580559 | -2.2691140 | 0.0232614
has_repeated_group1 | -0.2921902 | 0.1185932 | -2.4638033 | 0.0137472
tapo | 0.9095235 | 0.0722238 | 12.5931189 | 0.0000000
Arithmetic Improvement Logistic Stepwise Main Effects Model Parameter Estimates
Term | Estimate | Std. Error | z value | p-value
(Intercept) | 3.2057049 | 0.1125474 | 28.483163 | 0.0000000
sex1 | -0.5561029 | 0.1176502 | -4.726747 | 0.0000023
tssi | 0.2329103 | 0.0577382 | 4.033902 | 0.0000549
tlpr | -0.1207150 | 0.0853155 | -1.414924 | 0.1570906
iqp | 0.4885369 | 0.0686741 | 7.113842 | 0.0000000
has_repeated_group1 | -0.3327376 | 0.1362473 | -2.442160 | 0.0145997
tlpo | 1.1613681 | 0.0957747 | 12.126043 | 0.0000000
The stepwise feature selection algorithm selected the following
effects:
Language Improvement Prediction Model: sex, min, den, tses, tiqv, apr, iqp, has_repeated_group, tapo
Arithmetic Improvement Prediction Model: sex, tssi, tlpr, iqp, has_repeated_group, tlpo
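The post-test score for the other subject is a highly significant
predictor of whether or not the student improves in Arithmetic or
Language, which makes logical sense, as a higher score in one subject
probably goes along with improvement over time in the other. The
pre-test score for the other subject (apr for Language, tlpr for
Arithmetic) was also retained by the stepwise algorithm in both models,
although neither is significant at the 0.05 level (p = 0.052 and
p = 0.157, respectively). Sex was selected by both stepwise algorithms,
as were performal IQ and the presence of a repeated group.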
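Because logistic regression estimates are on the log-odds scale,
exponentiating them gives odds ratios, which are often easier to read.
The sketch below does this for a few estimates copied from the language
stepwise model; the selection of terms is arbitrary and only meant as
an illustration.

```r
# Selected estimates from the language stepwise model (log-odds scale).
log_odds <- c(sex1 = 0.6888521, tses = 0.1363646,
              has_repeated_group1 = -0.2921902, tapo = 0.9095235)

odds_ratio <- exp(log_odds)
round(odds_ratio, 3)
```

For example, the odds of improving in language are roughly twice as
high for the group sex = 1, and about 25% lower for pupils with a
repeated group, holding the other predictors fixed.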
Once again, we will do 5-fold cross-validation to compare the full
and reduced models.
## Generalized Linear Model
##
## 4106 samples
## 10 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3284, 3285
## Resampling results:
##
## Accuracy Kappa
## 0.8429122 0.09970755
## Generalized Linear Model
##
## 4106 samples
## 9 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3284, 3285, 3285, 3285, 3285
## Resampling results:
##
## Accuracy Kappa
## 0.8412087 0.0853706
## Generalized Linear Model
##
## 4106 samples
## 10 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3284, 3285, 3285, 3285, 3285
## Resampling results:
##
## Accuracy Kappa
## 0.8933257 0.147092
## Generalized Linear Model
##
## 4106 samples
## 6 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3285, 3284
## Resampling results:
##
## Accuracy Kappa
## 0.8950313 0.1562784
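The training-set sizes in the resampling summaries (3284 or 3285) follow directly from splitting 4106 pupils into five near-equal folds. The sketch below reproduces that arithmetic; the function name `kfold_training_sizes` is hypothetical, and it assumes only that the folds are as equal in size as possible. Which fold receives the extra pupil is arbitrary, which is why the position of the 3284 entry varies between the outputs.

```python
def kfold_training_sizes(n_samples: int, k: int) -> list[int]:
    """Training-set size for each fold when n_samples is split into k near-equal folds."""
    base, extra = divmod(n_samples, k)
    # The first `extra` folds hold one additional held-out observation.
    fold_sizes = [base + 1 if i < extra else base for i in range(k)]
    # Each model is trained on everything outside its held-out fold.
    return [n_samples - size for size in fold_sizes]


sizes = kfold_training_sizes(4106, 5)
print(sorted(sizes))
```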
Prediction accuracy is similar within each subject: about 84% for
both language models and about 89% for both arithmetic models. However,
the kappa values are all very low (between 0.08 and 0.16) compared to
the accuracy. Kappa measures how much better the model agrees with the
observed classes than would be expected by chance given the class
frequencies. The two groups being predicted in the response variable
are heavily imbalanced, as many more students improved over the course
of the year than did not. A model that assigns every individual to the
“improved” category would correctly classify about 89% of the
individuals for arithmetic and about 84% of the individuals for
language, so the accuracy of the fitted models is close to this
baseline.
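To make the relationship between accuracy and kappa concrete, the following minimal sketch computes both from a 2x2 confusion matrix. The counts are hypothetical (1000 pupils, 84% of whom improved) and the function name `accuracy_and_kappa` is an illustration, not part of the original analysis. It shows that a classifier labelling everyone as "improved" reaches 84% accuracy while its kappa is zero.

```python
def accuracy_and_kappa(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Accuracy and Cohen's kappa for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals of
    # predicted and actual classes.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical data: 840 improvers, 160 non-improvers, everyone predicted "improved".
acc, kappa = accuracy_and_kappa(tp=840, fp=160, fn=0, tn=0)
print(f"accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

Because the chance-expected agreement equals the observed agreement in this case, kappa gives the majority-class rule no credit, which is why it is a more informative summary than accuracy for imbalanced outcomes.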
Specificity and Sensitivity Comparisons of Main Effect Logistic Candidate Models

| Model | Specificity | Sensitivity |
| --- | --- | --- |
| Language Full Model | 0.0804954 | 0.9852601 |
| Language Stepwise Model | 0.0712074 | 0.9849711 |
| Arithmetic Full Model | 0.1136891 | 0.9847619 |
| Arithmetic Stepwise Model | 0.1183295 | 0.9861224 |
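We can see that the specificity is much lower than the sensitivity in
all models: each model correctly identifies only about 7% to 12% of the
students who did not improve, while identifying over 98% of those who
did. This is in large part because of the imbalance in the population
sizes between the two groups being predicted, as many more students
improved over the course of the year than did not.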
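As a worked check, the sketch below computes sensitivity, specificity, and accuracy from a confusion matrix for the Language full main effects model. The counts are an assumption: they are not reported in the source and were back-calculated from the reported rates and the 4106 pupils, treating "improved" as the positive class. The function name `classification_rates` is hypothetical.

```python
def classification_rates(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) for a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # share of improvers correctly identified
    specificity = tn / (tn + fp)   # share of non-improvers correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# ASSUMED counts, inferred from the reported rates (not given in the report):
# 3460 improvers and 646 non-improvers, 4106 pupils in total.
sens, spec, acc = classification_rates(tp=3409, fn=51, tn=52, fp=594)
print(f"sensitivity = {sens:.7f}")
print(f"specificity = {spec:.7f}")
print(f"accuracy    = {acc:.7f}")
```

Under these assumed counts, only 52 of the 646 non-improvers are classified correctly, so the high overall accuracy is driven almost entirely by the majority class.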
Interaction models
Again, we will consider candidate models including the two-way
interaction effects.
Language Improvement Logistic Full Interaction Model Parameter Estimates

| Term | Estimate | Std. Error | z value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 1.7062976 | 0.1394097 | 12.2394432 | 0.0000000 |
| sex1 | 0.8634790 | 0.1954405 | 4.4181172 | 0.0000100 |
| min1 | 0.3198667 | 0.4646885 | 0.6883465 | 0.4912346 |
| den2 | 0.2438101 | 0.1807563 | 1.3488329 | 0.1773906 |
| den3 | -0.1696851 | 0.1844776 | -0.9198145 | 0.3576697 |
| den4 | -0.0422157 | 0.5086950 | -0.0829883 | 0.9338608 |
| tses | 0.0342911 | 0.1224336 | 0.2800792 | 0.7794167 |
| tssi | 0.1214235 | 0.1132944 | 1.0717519 | 0.2838315 |
| tiqv | 0.3985457 | 0.1259812 | 3.1635338 | 0.0015587 |
| apr | -0.1118514 | 0.1340454 | -0.8344287 | 0.4040394 |
| iqp | -0.2524666 | 0.1275772 | -1.9789323 | 0.0478236 |
| has_repeated_group1 | -0.3416819 | 0.2619239 | -1.3045080 | 0.1920604 |
| tapo | 1.0653578 | 0.1553065 | 6.8597100 | 0.0000000 |
| sex1:min1 | 0.1609503 | 0.4173068 | 0.3856883 | 0.6997275 |
| sex1:den2 | -0.4393610 | 0.2422003 | -1.8140400 | 0.0696716 |
| sex1:den3 | -0.0244622 | 0.2512248 | -0.0973719 | 0.9224310 |
| sex1:den4 | -0.9519515 | 0.5674422 | -1.6776185 | 0.0934216 |
| sex1:tses | 0.0359980 | 0.1233606 | 0.2918108 | 0.7704313 |
| sex1:tssi | -0.0308302 | 0.1164553 | -0.2647385 | 0.7912109 |
| sex1:tiqv | -0.0571562 | 0.1258223 | -0.4542617 | 0.6496405 |
| sex1:apr | 0.2430774 | 0.1325842 | 1.8333811 | 0.0667459 |
| sex1:iqp | 0.0747956 | 0.1249368 | 0.5986671 | 0.5493949 |
| sex1:has_repeated_group1 | -0.3092365 | 0.2544738 | -1.2151995 | 0.2242900 |
| sex1:tapo | -0.2576516 | 0.1526308 | -1.6880706 | 0.0913977 |
| min1:den2 | 0.1846860 | 0.5325324 | 0.3468071 | 0.7287362 |
| min1:den3 | 0.7325912 | 0.5418983 | 1.3518979 | 0.1764080 |
| min1:den4 | -0.7722948 | 1.3846286 | -0.5577631 | 0.5770062 |
| min1:tses | 0.4555638 | 0.2414286 | 1.8869504 | 0.0591670 |
| min1:tssi | -0.4578091 | 0.2001644 | -2.2871658 | 0.0221861 |
| min1:tiqv | 0.0288928 | 0.2194117 | 0.1316832 | 0.8952349 |
| min1:apr | 0.1793195 | 0.2703232 | 0.6633521 | 0.5071051 |
| min1:iqp | -0.1010233 | 0.2628319 | -0.3843647 | 0.7007082 |
| min1:has_repeated_group1 | 0.4115781 | 0.4444190 | 0.9261037 | 0.3543921 |
| min1:tapo | 0.1137495 | 0.3220799 | 0.3531717 | 0.7239597 |
| den2:tses | -0.0468959 | 0.1478047 | -0.3172827 | 0.7510291 |
| den3:tses | 0.0249514 | 0.1508217 | 0.1654366 | 0.8686004 |
| den4:tses | 0.8345710 | 0.3631425 | 2.2981917 | 0.0215509 |
| den2:tssi | -0.1310320 | 0.1337752 | -0.9794940 | 0.3273360 |
| den3:tssi | -0.0274568 | 0.1470471 | -0.1867210 | 0.8518794 |
| den4:tssi | 0.4204708 | 0.3665275 | 1.1471739 | 0.2513098 |
| den2:tiqv | -0.2145140 | 0.1515525 | -1.4154435 | 0.1569385 |
| den3:tiqv | -0.2141464 | 0.1534821 | -1.3952533 | 0.1629395 |
| den4:tiqv | -0.3290181 | 0.3745718 | -0.8783845 | 0.3797351 |
| den2:apr | -0.2435519 | 0.1587055 | -1.5346151 | 0.1248784 |
| den3:apr | 0.0552162 | 0.1609949 | 0.3429687 | 0.7316220 |
| den4:apr | -0.0314004 | 0.3884972 | -0.0808253 | 0.9355809 |
| den2:iqp | 0.1660909 | 0.1487873 | 1.1162972 | 0.2642949 |
| den3:iqp | -0.0573170 | 0.1558758 | -0.3677090 | 0.7130902 |
| den4:iqp | -0.1157497 | 0.3426259 | -0.3378311 | 0.7354905 |
| den2:has_repeated_group1 | -0.0458396 | 0.2871392 | -0.1596423 | 0.8731628 |
| den3:has_repeated_group1 | -0.0938449 | 0.3190394 | -0.2941482 | 0.7686446 |
| den4:has_repeated_group1 | 1.2326111 | 0.9065194 | 1.3597183 | 0.1739191 |
| den2:tapo | 0.0922433 | 0.1814866 | 0.5082653 | 0.6112673 |
| den3:tapo | 0.1703408 | 0.1952135 | 0.8725871 | 0.3828882 |
| den4:tapo | 0.0833544 | 0.4274201 | 0.1950176 | 0.8453792 |
| tses:tssi | -0.2155143 | 0.0528084 | -4.0810585 | 0.0000448 |
| tses:tiqv | 0.0108522 | 0.0778450 | 0.1394083 | 0.8891275 |
| tses:apr | 0.0219100 | 0.0808791 | 0.2708982 | 0.7864694 |
| tses:iqp | 0.0887965 | 0.0740750 | 1.1987370 | 0.2306302 |
| tses:has_repeated_group1 | 0.2303632 | 0.1513631 | 1.5219246 | 0.1280280 |
| tses:tapo | 0.1368794 | 0.0935307 | 1.4634703 | 0.1433387 |
| tssi:tiqv | 0.0648836 | 0.0737792 | 0.8794290 | 0.3791687 |
| tssi:apr | 0.0461141 | 0.0765111 | 0.6027111 | 0.5467009 |
| tssi:iqp | -0.0212389 | 0.0731393 | -0.2903905 | 0.7715175 |
| tssi:has_repeated_group1 | -0.2166419 | 0.1407285 | -1.5394319 | 0.1236989 |
| tssi:tapo | -0.1345021 | 0.0856194 | -1.5709305 | 0.1161988 |
| tiqv:apr | -0.0290393 | 0.0779735 | -0.3724250 | 0.7095765 |
| tiqv:iqp | -0.0336456 | 0.0719880 | -0.4673784 | 0.6402292 |
| tiqv:has_repeated_group1 | -0.1621035 | 0.1524055 | -1.0636330 | 0.2874950 |
| tiqv:tapo | 0.0220366 | 0.0902475 | 0.2441797 | 0.8070917 |
| apr:iqp | -0.1361049 | 0.0755065 | -1.8025585 | 0.0714576 |
| apr:has_repeated_group1 | 0.1910030 | 0.1605133 | 1.1899515 | 0.2340655 |
| apr:tapo | 0.2178975 | 0.0822817 | 2.6481897 | 0.0080924 |
| iqp:has_repeated_group1 | 0.2198454 | 0.1527205 | 1.4395283 | 0.1500009 |
| iqp:tapo | 0.0503895 | 0.0845277 | 0.5961296 | 0.5510887 |
| has_repeated_group1:tapo | -0.3164266 | 0.1935579 | -1.6347906 | 0.1020930 |
Arithmetic Improvement Logistic Full Interaction Model Parameter Estimates

| Term | Estimate | Std. Error | z value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 3.4695761 | 0.2606815 | 13.3096374 | 0.0000000 |
| sex1 | -0.8617315 | 0.2739196 | -3.1459286 | 0.0016556 |
| min1 | -0.3416063 | 0.5332428 | -0.6406206 | 0.5217692 |
| den2 | 0.3328552 | 0.3095574 | 1.0752617 | 0.2822576 |
| den3 | -0.0281453 | 0.3158084 | -0.0891214 | 0.9289854 |
| den4 | 2.7795977 | 1.5085784 | 1.8425278 | 0.0653980 |
| tses | -0.0287365 | 0.1831552 | -0.1568970 | 0.8753260 |
| tssi | 0.6044787 | 0.1744104 | 3.4658401 | 0.0005286 |
| tiqv | -0.1624788 | 0.2162353 | -0.7513981 | 0.4524131 |
| tlpr | -0.0176622 | 0.2302605 | -0.0767053 | 0.9388580 |
| iqp | 0.8027781 | 0.1903237 | 4.2179614 | 0.0000247 |
| has_repeated_group1 | -0.2672107 | 0.3907214 | -0.6838907 | 0.4940442 |
| tlpo | 1.6252616 | 0.2344123 | 6.9333459 | 0.0000000 |
| sex1:min1 | -0.2168854 | 0.4318151 | -0.5022645 | 0.6154814 |
| sex1:den2 | 0.0395357 | 0.3003371 | 0.1316378 | 0.8952707 |
| sex1:den3 | 0.0934059 | 0.3156058 | 0.2959576 | 0.7672624 |
| sex1:den4 | -0.4829058 | 0.9578003 | -0.5041822 | 0.6141334 |
| sex1:tses | 0.1783996 | 0.1538894 | 1.1592713 | 0.2463456 |
| sex1:tssi | -0.4004225 | 0.1470836 | -2.7224152 | 0.0064807 |
| sex1:tiqv | -0.0028840 | 0.1786224 | -0.0161459 | 0.9871180 |
| sex1:tlpr | 0.0737025 | 0.1940536 | 0.3798051 | 0.7040901 |
| sex1:iqp | 0.0783898 | 0.1534750 | 0.5107658 | 0.6095150 |
| sex1:has_repeated_group1 | 0.0515487 | 0.2929535 | 0.1759621 | 0.8603237 |
| sex1:tlpo | -0.3506656 | 0.2134852 | -1.6425758 | 0.1004707 |
| min1:den2 | -0.5114155 | 0.5479291 | -0.9333607 | 0.3506338 |
| min1:den3 | -0.8148272 | 0.5161926 | -1.5785334 | 0.1144431 |
| min1:den4 | -1.0389231 | 1.6972414 | -0.6121245 | 0.5404554 |
| min1:tses | -0.2961879 | 0.2259620 | -1.3107861 | 0.1899300 |
| min1:tssi | -0.3154094 | 0.2075310 | -1.5198183 | 0.1285566 |
| min1:tiqv | 0.3646741 | 0.2791589 | 1.3063315 | 0.1914399 |
| min1:tlpr | -0.3967096 | 0.3520100 | -1.1269838 | 0.2597493 |
| min1:iqp | -0.3001086 | 0.2627899 | -1.1420096 | 0.2534500 |
| min1:has_repeated_group1 | 0.2100001 | 0.4662606 | 0.4503921 | 0.6524278 |
| min1:tlpo | -0.0511923 | 0.3938281 | -0.1299863 | 0.8965773 |
| den2:tses | -0.1176445 | 0.1760279 | -0.6683286 | 0.5039238 |
| den3:tses | 0.2602258 | 0.1828672 | 1.4230315 | 0.1547270 |
| den4:tses | 0.2724586 | 0.5682346 | 0.4794827 | 0.6315953 |
| den2:tssi | -0.1528418 | 0.1612343 | -0.9479486 | 0.3431556 |
| den3:tssi | -0.0525908 | 0.1743992 | -0.3015540 | 0.7629921 |
| den4:tssi | -1.3008416 | 0.6886695 | -1.8889200 | 0.0589025 |
| den2:tiqv | 0.2136252 | 0.2115856 | 1.0096398 | 0.3126679 |
| den3:tiqv | -0.0255183 | 0.2095768 | -0.1217612 | 0.9030882 |
| den4:tiqv | 0.0530526 | 0.6544739 | 0.0810614 | 0.9353931 |
| den2:tlpr | 0.0466693 | 0.2281831 | 0.2045255 | 0.8379428 |
| den3:tlpr | 0.1561803 | 0.2331556 | 0.6698546 | 0.5029505 |
| den4:tlpr | -0.8203246 | 0.7657520 | -1.0712667 | 0.2840495 |
| den2:iqp | -0.1913753 | 0.1765179 | -1.0841697 | 0.2782896 |
| den3:iqp | -0.2280933 | 0.1831837 | -1.2451614 | 0.2130724 |
| den4:iqp | -0.7141710 | 0.6223839 | -1.1474766 | 0.2511847 |
| den2:has_repeated_group1 | -0.4869359 | 0.3340924 | -1.4574888 | 0.1449815 |
| den3:has_repeated_group1 | 0.0427078 | 0.3762529 | 0.1135083 | 0.9096276 |
| den4:has_repeated_group1 | -1.6764096 | 0.9917218 | -1.6904032 | 0.0909509 |
| den2:tlpo | 0.0667916 | 0.2534120 | 0.2635694 | 0.7921117 |
| den3:tlpo | -0.2134310 | 0.2560447 | -0.8335693 | 0.4045237 |
| den4:tlpo | 1.4258712 | 1.0386258 | 1.3728440 | 0.1698008 |
| tses:tssi | -0.0305961 | 0.0643780 | -0.4752560 | 0.6346045 |
| tses:tiqv | 0.1790616 | 0.1103070 | 1.6233024 | 0.1045248 |
| tses:tlpr | -0.0280786 | 0.1160395 | -0.2419749 | 0.8087996 |
| tses:iqp | -0.1284964 | 0.0888481 | -1.4462477 | 0.1481077 |
| tses:has_repeated_group1 | 0.0025836 | 0.1766905 | 0.0146221 | 0.9883337 |
| tses:tlpo | 0.0672296 | 0.1239356 | 0.5424560 | 0.5875044 |
| tssi:tiqv | -0.2102156 | 0.1007570 | -2.0863634 | 0.0369457 |
| tssi:tlpr | 0.1236997 | 0.1074341 | 1.1514012 | 0.2495672 |
| tssi:iqp | 0.0374776 | 0.0869173 | 0.4311867 | 0.6663326 |
| tssi:has_repeated_group1 | -0.1660776 | 0.1619886 | -1.0252423 | 0.3052489 |
| tssi:tlpo | 0.0072497 | 0.1209184 | 0.0599550 | 0.9521915 |
| tiqv:tlpr | 0.0089603 | 0.1154480 | 0.0776131 | 0.9381358 |
| tiqv:iqp | -0.0704031 | 0.1007303 | -0.6989262 | 0.4845981 |
| tiqv:has_repeated_group1 | -0.1085234 | 0.1990527 | -0.5451993 | 0.5856164 |
| tiqv:tlpo | -0.0532668 | 0.1341620 | -0.3970336 | 0.6913426 |
| tlpr:iqp | 0.0050247 | 0.1103647 | 0.0455278 | 0.9636866 |
| tlpr:has_repeated_group1 | -0.0269426 | 0.2283576 | -0.1179842 | 0.9060802 |
| tlpr:tlpo | 0.0986089 | 0.1188317 | 0.8298203 | 0.4066404 |
| iqp:has_repeated_group1 | 0.2046135 | 0.1727716 | 1.1843011 | 0.2362939 |
| iqp:tlpo | 0.3279312 | 0.1308045 | 2.5070331 | 0.0121749 |
| has_repeated_group1:tlpo | -0.0909160 | 0.2595510 | -0.3502818 | 0.7261272 |
Language Improvement Logistic Stepwise Interaction Model Parameter Estimates

| Term | Estimate | Std. Error | z value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 1.6668519 | 0.1206726 | 13.8130119 | 0.0000000 |
| sex1 | 0.8012155 | 0.1821726 | 4.3981118 | 0.0000109 |
| min1 | 0.6000305 | 0.2970773 | 2.0197792 | 0.0434063 |
| den2 | 0.3308813 | 0.1547745 | 2.1378279 | 0.0325307 |
| den3 | -0.1669390 | 0.1541003 | -1.0833138 | 0.2786692 |
| den4 | 0.4690459 | 0.3766675 | 1.2452520 | 0.2130391 |
| tses | 0.1523072 | 0.0921371 | 1.6530493 | 0.0983208 |
| tssi | 0.0632335 | 0.0589607 | 1.0724690 | 0.2835094 |
| tiqv | 0.1813492 | 0.0598108 | 3.0320468 | 0.0024290 |
| apr | -0.1179500 | 0.0843279 | -1.3987071 | 0.1619008 |
| iqp | -0.2032185 | 0.0687764 | -2.9547729 | 0.0031290 |
| has_repeated_group1 | -0.5020179 | 0.1679044 | -2.9899030 | 0.0027907 |
| tapo | 1.1202457 | 0.0991842 | 11.2946015 | 0.0000000 |
| sex1:den2 | -0.4570286 | 0.2316355 | -1.9730511 | 0.0484897 |
| sex1:den3 | -0.0099298 | 0.2402417 | -0.0413326 | 0.9670308 |
| sex1:den4 | -0.9236706 | 0.5360329 | -1.7231604 | 0.0848595 |
| sex1:apr | 0.2017032 | 0.1237583 | 1.6298150 | 0.1031406 |
| sex1:tapo | -0.2318731 | 0.1311260 | -1.7683219 | 0.0770071 |
| min1:tses | 0.5087286 | 0.2096014 | 2.4271237 | 0.0152191 |
| min1:tssi | -0.4270360 | 0.1814359 | -2.3536470 | 0.0185903 |
| den2:tses | -0.1988805 | 0.1120345 | -1.7751723 | 0.0758694 |
| den3:tses | -0.0570852 | 0.1207752 | -0.4726566 | 0.6364582 |
| den4:tses | 0.7884426 | 0.3036920 | 2.5961917 | 0.0094263 |
| tses:tssi | -0.1829103 | 0.0497905 | -3.6735983 | 0.0002392 |
| tses:iqp | 0.0836591 | 0.0580539 | 1.4410583 | 0.1495682 |
| tses:tapo | 0.1047601 | 0.0672718 | 1.5572660 | 0.1194073 |
| apr:iqp | -0.1359372 | 0.0610653 | -2.2260960 | 0.0260078 |
| apr:tapo | 0.2024796 | 0.0692151 | 2.9253671 | 0.0034405 |
| iqp:has_repeated_group1 | 0.2358161 | 0.1409126 | 1.6734917 | 0.0942305 |
| has_repeated_group1:tapo | -0.3160075 | 0.1664020 | -1.8990601 | 0.0575566 |
Arithmetic Improvement Logistic Stepwise Interaction Model Parameter Estimates

| Term | Estimate | Std. Error | z value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 3.4515175 | 0.1338034 | 25.7954357 | 0.0000000 |
| sex1 | -0.6895429 | 0.1274397 | -5.4107382 | 0.0000001 |
| min1 | -0.6877967 | 0.3081668 | -2.2318976 | 0.0256217 |
| tses | 0.0764993 | 0.0915006 | 0.8360533 | 0.4031249 |
| tssi | 0.4589999 | 0.1098708 | 4.1776319 | 0.0000295 |
| tiqv | -0.0011900 | 0.0902710 | -0.0131827 | 0.9894820 |
| tlpr | -0.0192352 | 0.0960728 | -0.2002151 | 0.8413124 |
| iqp | 0.7341440 | 0.1015723 | 7.2277991 | 0.0000000 |
| has_repeated_group1 | -0.3901779 | 0.1431474 | -2.7257064 | 0.0064164 |
| tlpo | 1.3593522 | 0.1135831 | 11.9679130 | 0.0000000 |
| sex1:tssi | -0.3362162 | 0.1167376 | -2.8801022 | 0.0039755 |
| min1:tses | -0.4223782 | 0.1818217 | -2.3230348 | 0.0201773 |
| min1:tlpr | -0.3643695 | 0.2094919 | -1.7393007 | 0.0819819 |
| tses:tiqv | 0.1438352 | 0.0801324 | 1.7949700 | 0.0726585 |
| tses:iqp | -0.1290262 | 0.0738014 | -1.7482891 | 0.0804140 |
| tssi:tiqv | -0.1777920 | 0.0849924 | -2.0918571 | 0.0364513 |
| tssi:tlpr | 0.1171160 | 0.0808115 | 1.4492493 | 0.1472680 |
| tssi:has_repeated_group1 | -0.2431223 | 0.1338241 | -1.8167311 | 0.0692583 |
| iqp:tlpo | 0.3471903 | 0.0853256 | 4.0690027 | 0.0000472 |
The main and interaction effects chosen for each of the models by
stepwise variable selection are as follows:
Language:
- Main effects: sex, min, den, tses, tssi, tiqv, apr, iqp,
has_repeated_group, tapo
- Interaction effects: sex:den, sex:apr, sex:tapo, min:tses, min:tssi,
den:tses, tses:tssi, tses:iqp, tses:tapo, apr:iqp, apr:tapo,
iqp:has_repeated_group, has_repeated_group:tapo
Arithmetic:
- Main effects: sex, min, tses, tssi, tiqv, tlpr, iqp,
has_repeated_group, tlpo
- Interaction effects: sex:tssi, min:tses, min:tlpr, tses:tiqv,
tses:iqp, tssi:tiqv, tssi:tlpr, tssi:has_repeated_group, iqp:tlpo
Again, we conduct 5-fold cross-validation on the interaction
models.
## Generalized Linear Model
##
## 4106 samples
## 10 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3284, 3285
## Resampling results:
##
## Accuracy Kappa
## 0.838773 0.1259755
## Generalized Linear Model
##
## 4106 samples
## 10 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3284, 3285, 3285, 3285, 3285
## Resampling results:
##
## Accuracy Kappa
## 0.8426706 0.1240706
## Generalized Linear Model
##
## 4106 samples
## 10 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3284, 3285, 3285, 3285, 3285
## Resampling results:
##
## Accuracy Kappa
## 0.8891862 0.1188952
## Generalized Linear Model
##
## 4106 samples
## 9 predictor
## 2 classes: '0', '1'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 3285, 3285, 3285, 3285, 3284
## Resampling results:
##
## Accuracy Kappa
## 0.8943005 0.1253494
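The accuracy results for the interaction models are quite similar to
those of the main effect models (about 84% for language and 89% for
arithmetic), and the kappa values remain low, at roughly 0.12 to 0.13
for all four models.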
We can also find the specificity and sensitivity for each of the
interaction models.
Specificity and Sensitivity Comparisons of Interaction Logistic Candidate Models

| Model | Specificity | Sensitivity |
| --- | --- | --- |
| Language Full Model | 0.1145511 | 0.9739884 |
| Language Stepwise Model | 0.1037152 | 0.9806358 |
| Arithmetic Full Model | 0.0974478 | 0.9820408 |
| Arithmetic Stepwise Model | 0.0928074 | 0.9882993 |
Model Selection
To begin with model selection, we compare the ROC curves for all of
the language models and all of the arithmetic models.

The ROC curves for the Language interaction models look considerably
better than those for the main effect models, while there is no obvious
difference in performance between the full and stepwise models of
either type. On the other hand, all of the Arithmetic ROC curves look
very similar.
We can examine the area under the ROC curve (AUC) in greater depth in
the tables below, which give summary statistics of the AUC across the
five cross-validation folds:
5-fold Cross-Validated AUC for Language candidate models

| Model | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| --- | --- | --- | --- | --- | --- | --- |
| Full Main Effect Model | 0.7451 | 0.7461 | 0.7479 | 0.76004 | 0.7790 | 0.7821 |
| Stepwise Main Effect Model | 0.7437 | 0.7459 | 0.7485 | 0.76014 | 0.7792 | 0.7834 |
| Full Interaction Model | 0.7996 | 0.8198 | 0.8367 | 0.83366 | 0.8434 | 0.8688 |
| Stepwise Interaction Model | 0.7981 | 0.8185 | 0.8332 | 0.83272 | 0.8439 | 0.8699 |
5-fold Cross-Validated AUC for Arithmetic candidate models

| Model | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| --- | --- | --- | --- | --- | --- | --- |
| Full Main Effect Model | 0.8343 | 0.8465 | 0.8496 | 0.84720 | 0.8517 | 0.8539 |
| Stepwise Main Effect Model | 0.8372 | 0.8442 | 0.8473 | 0.84670 | 0.8487 | 0.8561 |
| Full Interaction Model | 0.8362 | 0.8595 | 0.8602 | 0.86036 | 0.8697 | 0.8762 |
| Stepwise Interaction Model | 0.8268 | 0.8460 | 0.8468 | 0.84900 | 0.8583 | 0.8671 |
As expected from the plots, the differences in mean AUC between the
interaction models and the main effect models are much larger for
Language (about 0.07) than for Arithmetic (between about 0.002 and
0.013).
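The mean AUC gaps can be checked with a few lines of arithmetic. The sketch below uses the values reported for the stepwise models and assumes that, with exactly five folds, the minimum, first quartile, median, third quartile, and maximum are the five fold-level AUC values themselves. Averaging them reproduces the reported means, which supports that assumption. The variable names are hypothetical.

```python
from statistics import mean

# Fold-level AUC values for the stepwise models, read from the summary tables
# (ASSUMPTION: the five quantiles coincide with the five fold values).
stepwise_auc = {
    ("language", "main"): [0.7437, 0.7459, 0.7485, 0.7792, 0.7834],
    ("language", "interaction"): [0.7981, 0.8185, 0.8332, 0.8439, 0.8699],
    ("arithmetic", "main"): [0.8372, 0.8442, 0.8473, 0.8487, 0.8561],
    ("arithmetic", "interaction"): [0.8268, 0.8460, 0.8468, 0.8583, 0.8671],
}

mean_auc = {key: mean(values) for key, values in stepwise_auc.items()}

# Gain in mean AUC from adding the selected interaction terms.
gap_language = mean_auc[("language", "interaction")] - mean_auc[("language", "main")]
gap_arithmetic = mean_auc[("arithmetic", "interaction")] - mean_auc[("arithmetic", "main")]

print(f"language gap:   {gap_language:.5f}")
print(f"arithmetic gap: {gap_arithmetic:.5f}")
```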
The stepwise and full models performed similarly enough in all
considered metrics for their differences in predictive performance to be
considered negligible. As such, the stepwise models are preferred over
the full models for both the main effect and interaction models for
both subjects on the principle of parsimony. Based on the AUC, the
difference between the interaction and main effect models for Language
(a gain of about 0.07 in mean AUC) justifies the selection of the
interaction model, while the corresponding difference for Arithmetic
(about 0.002 between the stepwise models) does not seem as important.
Therefore, the final logistic models for both subjects are the
following:
Language: Stepwise Interaction Model
Arithmetic: Stepwise Main Effects Model
Further Specifications
For each of the selected models, we can find the point on the ROC
curve that best balances sensitivity and specificity, and then report
the specificity and sensitivity at that point.

We can also plot the accuracy, specificity, and sensitivity of each
of these models over different cutoff probabilities.


From these graphs, we see that a cutoff probability of about 0.77 for
the language model and 0.87 for the arithmetic model gives the best
balance between sensitivity and specificity.
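One common way to pick such a cutoff is to scan a grid of probabilities and keep the one that maximizes Youden's J (sensitivity + specificity - 1). The sketch below illustrates the idea on synthetic data. The report does not state which criterion was used, so Youden's J, the simulated probabilities, and the function names are all assumptions made for illustration.

```python
import random


def youden_j(probs: list[float], labels: list[int], cutoff: float) -> float:
    """Sensitivity + specificity - 1 when predicting 1 for probs >= cutoff."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= cutoff)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < cutoff)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < cutoff)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= cutoff)
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_cutoff(probs: list[float], labels: list[int], grid: list[float]) -> float:
    """Cutoff on the grid with the largest Youden's J."""
    return max(grid, key=lambda c: youden_j(probs, labels, c))


# Synthetic, imbalanced example: about 85% of cases belong to class 1,
# and predicted probabilities are high for both classes.
random.seed(0)
labels = [1 if random.random() < 0.85 else 0 for _ in range(2000)]
probs = [
    min(max(random.gauss(0.88 if y == 1 else 0.72, 0.08), 0.0), 1.0)
    for y in labels
]

grid = [i / 100 for i in range(1, 100)]
cutoff = best_cutoff(probs, labels, grid)
print(f"best cutoff = {cutoff:.2f}, J = {youden_j(probs, labels, cutoff):.3f}")
```

With an imbalanced outcome the predicted probabilities cluster near the majority-class rate, so the balancing cutoff sits well above the default 0.5, in line with the values of about 0.77 and 0.87 read from the graphs.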
We take a final look at the parameter estimates for the two selected
models.
Language Improvement Logistic Stepwise Interaction Model Parameter Estimates

| Term | Estimate | Std. Error | z value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 1.6668519 | 0.1206726 | 13.8130119 | 0.0000000 |
| sex1 | 0.8012155 | 0.1821726 | 4.3981118 | 0.0000109 |
| min1 | 0.6000305 | 0.2970773 | 2.0197792 | 0.0434063 |
| den2 | 0.3308813 | 0.1547745 | 2.1378279 | 0.0325307 |
| den3 | -0.1669390 | 0.1541003 | -1.0833138 | 0.2786692 |
| den4 | 0.4690459 | 0.3766675 | 1.2452520 | 0.2130391 |
| tses | 0.1523072 | 0.0921371 | 1.6530493 | 0.0983208 |
| tssi | 0.0632335 | 0.0589607 | 1.0724690 | 0.2835094 |
| tiqv | 0.1813492 | 0.0598108 | 3.0320468 | 0.0024290 |
| apr | -0.1179500 | 0.0843279 | -1.3987071 | 0.1619008 |
| iqp | -0.2032185 | 0.0687764 | -2.9547729 | 0.0031290 |
| has_repeated_group1 | -0.5020179 | 0.1679044 | -2.9899030 | 0.0027907 |
| tapo | 1.1202457 | 0.0991842 | 11.2946015 | 0.0000000 |
| sex1:den2 | -0.4570286 | 0.2316355 | -1.9730511 | 0.0484897 |
| sex1:den3 | -0.0099298 | 0.2402417 | -0.0413326 | 0.9670308 |
| sex1:den4 | -0.9236706 | 0.5360329 | -1.7231604 | 0.0848595 |
| sex1:apr | 0.2017032 | 0.1237583 | 1.6298150 | 0.1031406 |
| sex1:tapo | -0.2318731 | 0.1311260 | -1.7683219 | 0.0770071 |
| min1:tses | 0.5087286 | 0.2096014 | 2.4271237 | 0.0152191 |
| min1:tssi | -0.4270360 | 0.1814359 | -2.3536470 | 0.0185903 |
| den2:tses | -0.1988805 | 0.1120345 | -1.7751723 | 0.0758694 |
| den3:tses | -0.0570852 | 0.1207752 | -0.4726566 | 0.6364582 |
| den4:tses | 0.7884426 | 0.3036920 | 2.5961917 | 0.0094263 |
| tses:tssi | -0.1829103 | 0.0497905 | -3.6735983 | 0.0002392 |
| tses:iqp | 0.0836591 | 0.0580539 | 1.4410583 | 0.1495682 |
| tses:tapo | 0.1047601 | 0.0672718 | 1.5572660 | 0.1194073 |
| apr:iqp | -0.1359372 | 0.0610653 | -2.2260960 | 0.0260078 |
| apr:tapo | 0.2024796 | 0.0692151 | 2.9253671 | 0.0034405 |
| iqp:has_repeated_group1 | 0.2358161 | 0.1409126 | 1.6734917 | 0.0942305 |
| has_repeated_group1:tapo | -0.3160075 | 0.1664020 | -1.8990601 | 0.0575566 |
Arithmetic Improvement Logistic Stepwise Main Effects Model Parameter Estimates

| Term | Estimate | Std. Error | z value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 3.2057049 | 0.1125474 | 28.483163 | 0.0000000 |
| sex1 | -0.5561029 | 0.1176502 | -4.726747 | 0.0000023 |
| tssi | 0.2329103 | 0.0577382 | 4.033902 | 0.0000549 |
| tlpr | -0.1207150 | 0.0853155 | -1.414924 | 0.1570906 |
| iqp | 0.4885369 | 0.0686741 | 7.113842 | 0.0000000 |
| has_repeated_group1 | -0.3327376 | 0.1362473 | -2.442160 | 0.0145997 |
| tlpo | 1.1613681 | 0.0957747 | 12.126043 | 0.0000000 |
We can see that sex, the post-test score for the other subject,
performal IQ, and the presence of a repeated group show up as
significant features in both models. The socioeconomic status of the
school (ssi) is significant as a main effect in the arithmetic model;
in the language model its main effect is not significant (p = 0.284),
but it enters through significant interactions. In the language model,
significant interactions are observed between sex and denomination
group, minority status and individual socioeconomic score (ses),
minority status and school socioeconomic score, denomination group and
individual SES score, individual SES score and school SSI score,
arithmetic pre-test score and performal IQ score, and arithmetic
pre-test score and arithmetic post-test score in the prediction of a
positive change between the language pre- and post-test scores.
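Because these are logistic models, each coefficient is a change in the log-odds of improvement, and exponentiating it gives an odds ratio. The sketch below does this for the arithmetic stepwise main effects model, using the estimates from its table. The helper `predicted_probability` is hypothetical, and the baseline interpretation assumes that a value of zero on every predictor is meaningful (reference levels for the categorical features, and the zero point of the transformed numeric features).

```python
import math

# Estimates copied from the arithmetic stepwise main effects table.
arithmetic_coefs = {
    "(Intercept)": 3.2057049,
    "sex1": -0.5561029,
    "tssi": 0.2329103,
    "tlpr": -0.1207150,
    "iqp": 0.4885369,
    "has_repeated_group1": -0.3327376,
    "tlpo": 1.1613681,
}

# Odds ratio per one-unit increase (or versus the reference level),
# holding the other predictors fixed.
odds_ratios = {
    term: math.exp(beta)
    for term, beta in arithmetic_coefs.items()
    if term != "(Intercept)"
}


def predicted_probability(coefs: dict[str, float], x: dict[str, float]) -> float:
    """Inverse-logit of the linear predictor; terms missing from x count as 0."""
    eta = coefs["(Intercept)"] + sum(coefs[term] * value for term, value in x.items())
    return 1.0 / (1.0 + math.exp(-eta))


baseline = predicted_probability(arithmetic_coefs, {})
for term, ratio in odds_ratios.items():
    print(f"{term:>20}: odds ratio = {ratio:.3f}")
print(f"baseline probability of improvement = {baseline:.3f}")
```

For example, the sex1 estimate corresponds to an odds ratio of roughly 0.57, so pupils at level 1 of sex have a little over half the odds of improving in arithmetic compared with the reference level, other predictors being equal.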