1 Problem statement

Interpret the figure in the bottom panel of the Chem97 example.

2 Introduction

The data set contains A-level chemistry examination results for 31,022 students from 2,410 schools in Britain in 1997. For each student, a mean General Certificate of Secondary Education (GCSE) score is derived from all of the student's GCSE subjects. The cohort is aged between 18 and 19 years. Schools are grouped into 131 local education authorities.

Source: Fielding, A., Yang, M., & Goldstein, H. (2003). Multilevel ordinal models for examination grades. Statistical Modelling, 3, 127-153.

Data: Chem97{mlmRev}

3 Data management

# install package
#install.packages("mlmRev")
#install.packages("lme4")
# load it
library(mlmRev)
## Loading required package: lme4
## Loading required package: Matrix
# invoke data
# some packages require data() to load a data set into the workspace
data(Chem97, package="mlmRev")
# data structure
str(Chem97)
## 'data.frame':    31022 obs. of  8 variables:
##  $ lea      : Factor w/ 131 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ school   : Factor w/ 2410 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ student  : Factor w/ 31022 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ score    : num  4 10 10 10 8 10 6 8 4 10 ...
##  $ gender   : Factor w/ 2 levels "M","F": 2 2 2 2 2 2 2 2 2 2 ...
##  $ age      : num  3 -3 -4 -2 -1 4 1 4 3 0 ...
##  $ gcsescore: num  6.62 7.62 7.25 7.5 6.44 ...
##  $ gcsecnt  : num  0.339 1.339 0.964 1.214 0.158 ...
# first 6 rows
head(Chem97)
##   lea school student score gender age gcsescore   gcsecnt
## 1   1      1       1     4      F   3     6.625 0.3393157
## 2   1      1       2    10      F  -3     7.625 1.3393157
## 3   1      1       3    10      F  -4     7.250 0.9643157
## 4   1      1       4    10      F  -2     7.500 1.2143157
## 5   1      1       5     8      F  -1     6.444 0.1583157
## 6   1      1       6    10      F   4     7.750 1.4643157
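
The hierarchy described in the introduction (students within schools, schools within local education authorities) can be checked directly from the factor levels. The following is a minimal sketch; the object names (`n_student`, `lea_per_school`, `schools_per_lea`) are introduced here for illustration only and are not part of the original analysis.

```r
# load the data without attaching the package
data(Chem97, package = "mlmRev")

# number of students, schools and local education authorities
n_student <- nrow(Chem97)
n_school  <- nlevels(Chem97$school)
n_lea     <- nlevels(Chem97$lea)
c(students = n_student, schools = n_school, leas = n_lea)

# number of distinct authorities observed for each school;
# a value of 1 for every school would mean schools are nested within authorities
lea_per_school <- tapply(Chem97$lea, Chem97$school,
                         function(x) length(unique(x)))
all(lea_per_school == 1)

# distribution of the number of schools per authority
schools_per_lea <- tapply(Chem97$school, Chem97$lea,
                          function(x) length(unique(x)))
summary(as.vector(schools_per_lea))
```

The counts should agree with the `str(Chem97)` output above; the nesting check is worth running before grouping any summary by `lea`.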

4 Visualization

# install package
#install.packages("ggplot2")
# load it
library(ggplot2)
# use black and white theme 
theme_set(theme_bw())
# density plot by lea
ggplot(Chem97, aes(gcsescore, group = factor(lea))) +
 geom_density() +
 labs(x = "GCSE Score", y = "Density")

# forest plot of mean GCSE score by local education authority
# authorities are ordered by their median score;
# mean_cl_boot requires the Hmisc package to be installed
ggplot(Chem97, aes(x = reorder(lea, gcsescore, median),
                   y = gcsescore)) +
  stat_summary(fun.data = "mean_cl_boot") +
  coord_flip() +
  labs(x = "Local Education Authority", y = "Mean GCSE score") +
  theme(axis.text.y = element_text(size = 6))

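This figure shows how the mean GCSE score varies across local education authorities. Each point is the mean score of one authority, the bar through it is the bootstrap confidence interval for that mean, and the authorities are sorted by their median score.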
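
To make the points and bars in the forest plot concrete, the sketch below recomputes a mean and a percentile bootstrap interval for each authority in base R. It only approximates what `mean_cl_boot` reports: `boot_mean_ci` is a hypothetical helper written for this illustration, and the number of resamples (`B = 1000`), the 95% level and the seed are assumptions rather than settings taken from the original analysis.

```r
# load the data without attaching the package
data(Chem97, package = "mlmRev")

# percentile bootstrap interval for a mean
# (illustrative helper, not the code used by Hmisc or ggplot2)
boot_mean_ci <- function(x, B = 1000, conf = 0.95) {
  # resample x with replacement B times and keep each resample's mean
  boot_means <- replicate(B, mean(sample(x, replace = TRUE)))
  alpha <- 1 - conf
  ci <- quantile(boot_means, c(alpha / 2, 1 - alpha / 2), names = FALSE)
  c(mean = mean(x), lower = ci[1], upper = ci[2])
}

set.seed(1997)  # arbitrary seed so the resampling is reproducible

# one row per local education authority: mean, lower, upper
lea_ci <- t(sapply(split(Chem97$gcsescore, Chem97$lea), boot_mean_ci))

# sort by mean and show the authorities at both ends of the plot
lea_ci <- lea_ci[order(lea_ci[, "mean"]), ]
head(lea_ci, 3)
tail(lea_ci, 3)
```

Authorities with few students tend to have wider intervals, so the width of each bar mostly reflects sample size, while the spread of the points reflects differences between authorities.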

5 The End