For this week’s assignment, I chose a data set of examination results consisting of scores on two components of a science examination in 1989 by 1905 students in 73 schools and colleges. The exam is the General Certificate of Secondary Education (GCSE) taken at the end of compulsory schooling in England, normally when students are 16 years old. The GCSE is an academic qualification, generally taken in a number of subjects by pupils in secondary education in England. The two components of the exam that are shown in the data set are the results of the traditional written question paper (marked out of a total score of 160) and coursework score (marked out of a total score of 108). The exam is taken by students after they have completed the coursework.
The purpose of this analysis is to see the relationship between the marks of the written test at both the school and student level and whether there are gender differences in this relationship.The students are nested within schools and the schools provide a context for students. I used the written score as the dependent variable and gender and the coursework score as the predictor variables. The coursework score is taken as a measure of ability in obtaining high marks in the GCSE exam. In this case, in obtaining high marks on the written question paper. I used multilevel and random effect models to demonstrate my findings. I analyzed this data set using three competing approaches which include a school-level analysis (ecological regression), student-level analysis (complete-pooling) and another student-level analysis (no-pooling).
school: ID for 73 different schools and colleges
student: ID for 1905 students in the different schools
gender: 0 is for male and 1 is for female
writtenscore: score obtained on the written question paper exam (marked out of a total score of 160)
courseworkscore: score obtained upon completing the course and before taking the GCSE exam (marked out of a total score of 160)
library(readr)
scores<-read_csv("C:\\users\\Sangita Roy\\Desktop\\scores.csv")
head(scores)
library(dplyr)
scores2<-scores%>%
rename(gender=`gender, girl=1`,
writtenscore=`written score, missing=-1`,
courseworkscore=`coursework score, missing=-1`)%>%
filter(writtenscore>-1,
courseworkscore>-1)%>%
select(School, Student, gender, writtenscore, courseworkscore)
head(scores2)
After filtering missing values for both exams, there are now 1523 observations.
length(unique(scores2$School))
[1] 73
scores2 %>%
group_by(School) %>%
summarise(n_sch = n())
schoold <- scores2 %>%
group_by(School) %>%
summarise(mean_s = mean(writtenscore, na.rm = TRUE), mean_g = mean(gender, na.rm = TRUE))
head(schoold)
The data set has been grouped by schools and summarized using the mean written score for the GCSE exam and the mean gender.
ecoreg <- lm(mean_s ~ mean_g, data = schoold)
summary(ecoreg)
Call:
lm(formula = mean_s ~ mean_g, data = schoold)
Residuals:
Min 1Q Median 3Q Max
-18.2849 -4.6580 0.4681 3.9008 18.4128
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 57.062 3.243 17.598 < 2e-16 ***
mean_g -14.699 5.299 -2.774 0.00707 **
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 7.958 on 71 degrees of freedom
Multiple R-squared: 0.09778, Adjusted R-squared: 0.08507
F-statistic: 7.695 on 1 and 71 DF, p-value: 0.007071
This is a school-level analysis which shows the average written score for the schools is 57 and the average written score for females is 14 points lower than males. Ecological fallacy is a limitation of this approach because school-level association may or may not reflect student-level connection.
cpooling <- lm(writtenscore ~ gender, data = scores2)
summary(cpooling)
Call:
lm(formula = writtenscore ~ gender, data = scores2)
Residuals:
Min 1Q Median 3Q Max
-45.782 -9.113 0.056 9.218 44.431
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 48.9073 0.5363 91.196 < 2e-16 ***
gender -3.3379 0.6980 -4.782 1.9e-06 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 13.4 on 1521 degrees of freedom
Multiple R-squared: 0.01481, Adjusted R-squared: 0.01416
F-statistic: 22.87 on 1 and 1521 DF, p-value: 1.904e-06
This model is a student-level analysis in which schools are ignored. The average written exam score for females is 3 points lower than males. Average written exam for males is 48. The results are statistically significant, (p<.05).The limitation with the complete-pooling approach is that there may be important between-school variations in the relationship.
library(ggplot2)
Need help getting started? Try the cookbook for R:
http://www.cookbook-r.com/Graphs/
dcoef <- scores2 %>%
group_by(School) %>%
do(mod = lm(writtenscore ~ gender, data = .))
coef <- dcoef %>% do(data.frame(intc = coef(.$mod)[1]))
ggplot(coef, aes(x = intc)) + geom_histogram()
This histogram shows the earlier linear model grouped by schools, one model per school. The intercept shows that most schools score between 40 to 60 marks on the written exam. However, there are a few schools that score lower on the written exam as show on the right and left side of the histogram.
dcoef <- scores2 %>%
group_by(School) %>%
do(mod = lm(writtenscore ~ gender, data = .))
coef <- dcoef %>% do(data.frame(genderc = coef(.$mod)[2]))
ggplot(coef, aes(x = genderc)) + geom_histogram()
The histogram of the slope is shown above and it is evident that males score much higher on the written exam. In summary, there is a great deal of between-school level variation in both the intercept and the slope parameters. The complete pooling model results of 48.9073 and -3.3379 are the results of many countervailing forces.
Partial-pooling models allows between-school variation.
library(nlme)
m1_lme <- lme(writtenscore ~ gender, data = scores2, random = ~1|School, method = "ML")
summary(m1_lme)
Linear mixed-effects model fit by maximum likelihood
Data: scores2
AIC BIC logLik
11827.66 11848.97 -5909.83
Random effects:
Formula: ~1 | School
(Intercept) Residual
StdDev: 7.02069 11.17925
Fixed effects: writtenscore ~ gender
Value Std.Error DF t-value p-value
(Intercept) 49.61493 0.9734214 1449 50.96964 0
gender -2.44861 0.6016647 1449 -4.06972 0
Correlation:
(Intr)
gender -0.362
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.15425682 -0.67237074 -0.04393553 0.66460074 3.82829157
Number of Observations: 1523
Number of Groups: 73
Multilevel model #1 shows the effect of gender on the written exam score with random effects for the school. This is a random intercept model which does not allow gender difference in the written exam score to differ between schools. The average written exam score for males (reference group) is 49 and for females, the score is 2 points lower than males. The p-value for both results are 0 demonstrating statistical significance (p<.05).
m2_lme <- lme(writtenscore ~ gender, data = scores2, random = ~ gender|School, method = "ML")
summary(m2_lme)
Linear mixed-effects model fit by maximum likelihood
Data: scores2
AIC BIC logLik
11831 11862.97 -5909.501
Random effects:
Formula: ~gender | School
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 7.204514 (Intr)
gender 1.854513 -0.239
Residual 11.143610
Fixed effects: writtenscore ~ gender
Value Std.Error DF t-value p-value
(Intercept) 49.64272 0.9972701 1449 49.77861 0e+00
gender -2.49802 0.6589697 1449 -3.79079 2e-04
Correlation:
(Intr)
gender -0.415
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.12115295 -0.69057170 -0.05821439 0.65852908 3.81286240
Number of Observations: 1523
Number of Groups: 73
Multilevel model #2 shows the effect of gender on the written exam score with random effects for the gender and school nested within the school.There is not much of a difference between the first and second multilevel models.The average written exam score is approximately 50 for males and 2 points lower for females. Results are statistically significant (p<.05).
AIC(cpooling, m1_lme, m2_lme)
Based on the AIC values, multilevel model #1 is the best fit because it has the lowest AIC value of 11827.66
m3_lme <- lme(writtenscore ~ gender + courseworkscore, data = scores2, random = ~ gender|School, method = "ML")
summary(m3_lme)
Linear mixed-effects model fit by maximum likelihood
Data: scores2
AIC BIC logLik
11418.4 11455.7 -5702.202
Random effects:
Formula: ~gender | School
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 6.342501 (Intr)
gender 1.946438 -0.158
Residual 9.705215
Fixed effects: writtenscore ~ gender + courseworkscore
Value Std.Error DF t-value p-value
(Intercept) 21.475527 1.5607541 1448 13.759712 0
gender -5.347579 0.6090938 1448 -8.779566 0
courseworkscore 0.404319 0.0185338 1448 21.815194 0
Correlation:
(Intr) gender
gender -0.036
courseworkscore -0.827 -0.216
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.53113379 -0.66818742 -0.03719285 0.66189140 3.64597728
Number of Observations: 1523
Number of Groups: 73
I added a student-level covariate since I did not have any school-level variables in my data set. Model #3 shows the effect of gender and course work score obtained has on the written exam score with random effects for the gender and school nested within the school. Females receive 5 points lower on the written exam score than males, and for every increase in the course work score, the written exam score increases by 0.40. Results are statistically significant (p<.05).
m4_lme <- lme(writtenscore ~ gender*courseworkscore, data = scores2, random = ~ gender|School, method = "ML")
summary(m4_lme)
Linear mixed-effects model fit by maximum likelihood
Data: scores2
AIC BIC logLik
11413.16 11455.79 -5698.582
Random effects:
Formula: ~gender | School
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 6.402169 (Intr)
gender 2.103725 -0.185
Residual 9.673303
Fixed effects: writtenscore ~ gender * courseworkscore
Value Std.Error DF t-value p-value
(Intercept) 24.942316 2.0249013 1447 12.317794 0.0000
gender -11.974642 2.5288679 1447 -4.735179 0.0000
courseworkscore 0.354669 0.0261801 1447 13.547273 0.0000
gender:courseworkscore 0.090834 0.0337030 1447 2.695133 0.0071
Correlation:
(Intr) gender crswrk
gender -0.625
courseworkscore -0.900 0.649
gender:courseworkscore 0.636 -0.970 -0.707
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.50405734 -0.66224240 -0.02831706 0.66668044 3.58806548
Number of Observations: 1523
Number of Groups: 73
I also ran a cross-level interaction between gender and course work score. The interaction is not statistically significant since (p>.05).
AIC(cpooling, m1_lme, m2_lme, m3_lme, m4_lme)
Based on the AIC values, Multilevel model #4 is the best fit model because it has the lowest AIC value of 11413.16
library(magrittr)
scores2 %<>% mutate(ccourseworkscore = courseworkscore - mean(courseworkscore))
m4a_lme <- lme(writtenscore ~ gender*ccourseworkscore, data = scores2, random = ~ gender|School, method = "ML")
summary(m4a_lme)
Linear mixed-effects model fit by maximum likelihood
Data: scores2
AIC BIC logLik
11413.16 11455.79 -5698.582
Random effects:
Formula: ~gender | School
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 6.402168 (Intr)
gender 2.103743 -0.185
Residual 9.673302
Fixed effects: writtenscore ~ gender * ccourseworkscore
Value Std.Error DF t-value p-value
(Intercept) 50.98535 0.8889876 1447 57.35215 0.0000
gender -5.30478 0.6186048 1447 -8.57540 0.0000
ccourseworkscore 0.35467 0.0261801 1447 13.54727 0.0000
gender:ccourseworkscore 0.09083 0.0337031 1447 2.69513 0.0071
Correlation:
(Intr) gender ccrswr
gender -0.406
ccourseworkscore 0.113 -0.176
gender:ccourseworkscore -0.081 0.037 -0.707
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.50405829 -0.66224197 -0.02831735 0.66668066 3.58806522
Number of Observations: 1523
Number of Groups: 73
library(texreg)
htmlreg(list(m4_lme, m4a_lme))
| Model 1 | Model 2 | ||
|---|---|---|---|
| (Intercept) | 24.94*** | 50.99*** | |
| (2.02) | (0.89) | ||
| gender | -11.97*** | -5.30*** | |
| (2.53) | (0.62) | ||
| courseworkscore | 0.35*** | ||
| (0.03) | |||
| gender:courseworkscore | 0.09** | ||
| (0.03) | |||
| ccourseworkscore | 0.35*** | ||
| (0.03) | |||
| gender:ccourseworkscore | 0.09** | ||
| (0.03) | |||
| AIC | 11413.16 | 11413.16 | |
| BIC | 11455.79 | 11455.79 | |
| Log Likelihood | -5698.58 | -5698.58 | |
| Num. obs. | 1523 | 1523 | |
| Num. groups | 73 | 73 | |
| p < 0.001, p < 0.01, p < 0.05 | |||
These are two identical models, and despite the numerical differences in some coefficient, they tell the same story. Model 4 demonstrates a cross-level interaction between gender and course work score to see the effect on the written exam score. The second models (model 4a) refits the model by centering the covariates of the course work score variable and uses the centered covariates in the second model. Centering the independent variables (course work score) redefines the 0 point for the predictor variable to be the mean of the course work score. This shifts the scale over but retains the unit. The effect of the slope between course work score and written exam score does not change at all but the interpretation of the intercept does. The intercept of the second model is the mean written exam score when all the independent variables equal 0. Since the second model has turned “0” into the average score through centering,the intercept is now recognized as the value of the written exam score when the course work score is set to the average.
m0_lme <- lme(writtenscore ~ 1, random = ~ 1|School, data = scores2, method = "ML")
summary(m0_lme)
Linear mixed-effects model fit by maximum likelihood
Data: scores2
AIC BIC logLik
11842.15 11858.13 -5918.073
Random effects:
Formula: ~1 | School
(Intercept) Residual
StdDev: 7.124212 11.23567
Fixed effects: writtenscore ~ 1
Value Std.Error DF t-value p-value
(Intercept) 48.18298 0.91906 1450 52.42636 0
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.03956792 -0.66708524 -0.05519453 0.65553457 3.73019890
Number of Observations: 1523
Number of Groups: 73
Intra-class correlation measures the reliability of ratings or measurements for clusters - data that has been collected as groups or sorted into groups. The Intra-class correlation can be calculated using the following formula: (7.124212)/(7.124212 + 11.23567)= 0.3880314699. Since the ICC value is low and close to zero, it means that values (written exam score) from the same group (schools) are not similar.
intervals(m0_lme)
Approximate 95% confidence intervals
Fixed effects:
lower est. upper
(Intercept) 46.38074 48.18298 49.98521
attr(,"label")
[1] "Fixed effects:"
Random Effects:
Level: School
lower est. upper
sd((Intercept)) 5.837725 7.124212 8.694208
Within-group standard error:
lower est. upper
10.83388 11.23567 11.65235
A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data. Based on this same, the interval is very narrow and close to the estimates. (95% confident that the true written exam score in the population for fixed effects is between 46.3 and 49.9)
I used the “lmer” function of the merTools package to generate quantities of interest for the random intercept model. This version of the model gives a nearly identical result to the “lme” random slope model used above.
library(merTools)
Mod_RandInctMer <- lmer(writtenscore ~ gender + (1|School), data= scores2)
summary(Mod_RandInctMer)
Linear mixed model fit by REML ['lmerMod']
Formula: writtenscore ~ gender + (1 | School)
Data: scores2
REML criterion at convergence: 11817.2
Scaled residuals:
Min 1Q Median 3Q Max
-3.1531 -0.6717 -0.0434 0.6626 3.8270
Random effects:
Groups Name Variance Std.Dev.
School (Intercept) 50.16 7.082
Residual 125.06 11.183
Number of obs: 1523, groups: School, 73
Fixed effects:
Estimate Std. Error t value
(Intercept) 49.6161 0.9791 50.67
gender -2.4469 0.6015 -4.07
Correlation of Fixed Effects:
(Intr)
gender -0.360
reEx <- REsim(Mod_RandInctMer)
p1 <- plotREsim(reEx)
p1
This plot demonstrates that there is variation in both directions when schools are used as the interest variable in our model.
confint.merMod(Mod_RandInctMer,method="profile")
2.5 % 97.5 %
.sig01 5.779647 8.617223
.sigma 10.784234 11.598940
(Intercept) 47.691495 51.546352
gender -3.628869 -1.268884
The confidence intervals obtained using the merTools package indicates there is not a wide variation in the random effect.
University of Bristol. “Centre for Multilevel Modelling.” Datasets Used in Multilevel Statistical Models; 3rd Edition 2003 | Centre for Multilevel Modelling | University of Bristol, University of Bristol, 5 Nov. 2014, www.bristol.ac.uk/cmm/team/hg/msm-3rd-ed/datasets.html.
Goldstein, Harvey. Multilevel Statistical Models. E. Arnold, 2003.