1 Problem statement

In the NLSY86 example, do the trends in reading scores over time (in months) appear to vary by gender and ethnicity?

2 Introduction

The data are drawn from the National Longitudinal Survey of Youth (NLSY). The sample observations are from the 1986, 1988, 1990, and 1992 assessment periods. Children were selected to be in kindergarten, first, and second grade and to be of age 5, 6, or 7 at the first assessment (1986). Both reading and mathematical achievement scores are recorded. The former is a recognition subscore of the Peabody Individual Achievement Test (PIAT). This was scaled as the percentage of 84 items that were answered correctly. The same 84 items were administered at all four time points, providing a consistent scale over time. The data set is a subsample of 166 subjects with complete observations.
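As a small illustration of the scaling described above, a percentage score can be converted back to a count of correct items. This is a minimal sketch: `pct_to_items` is a hypothetical helper name, not part of the original analysis, and the three input values are the first distinct reading scores shown in the data listing in the next section.

```r
# Convert a PIAT percentage score back to the number of items answered
# correctly. n_items = 84 is taken from the description of the test above.
# `pct_to_items` is a hypothetical helper, introduced only for illustration.
pct_to_items <- function(pct, n_items = 84) {
  pct * n_items / 100
}

# Reading scores of three children at the first occasion (from head(dta)).
# Rounding removes the small error caused by the printed precision.
round(pct_to_items(c(19.047619, 21.428571, 7.142857)))
```

If the scaling is as described, each score should map to a whole number of items between 0 and 84.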

Source: Bollen, K.A. & Curran, P.J. (2006). Latent curve models: A structural equation perspective. p. 59.

Column 1: Student ID
Column 2: Gender, male or female
Column 3: Race, minority or majority
Column 4: Measurement occasions
Column 5: Grade at which measurements were made, Kindergarten = 0, First grade = 1, Second grade = 2
Column 6: Age in years
Column 7: Age in months
Column 8: Math score
Column 9: Reading score

3 Data management

# input data
dta <- read.csv("C:/Users/Ching-Fang Wu/Documents/lmm/nlsy86long.csv", header = TRUE)
# inspect data structure
str(dta)
## 'data.frame':    664 obs. of  9 variables:
##  $ id   : int  2390 2560 3740 4020 6350 7030 7200 7610 7680 7700 ...
##  $ sex  : chr  "Female" "Female" "Female" "Male" ...
##  $ race : chr  "Majority" "Majority" "Majority" "Majority" ...
##  $ time : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ grade: int  0 0 0 0 1 0 0 0 0 0 ...
##  $ year : int  6 6 6 5 7 5 6 7 6 6 ...
##  $ month: int  67 66 67 60 78 62 66 79 76 67 ...
##  $ math : num  14.29 20.24 17.86 7.14 29.76 ...
##  $ read : num  19.05 21.43 21.43 7.14 30.95 ...
# examine first 6 lines
head(dta)
##     id    sex     race time grade year month      math      read
## 1 2390 Female Majority    1     0    6    67 14.285714 19.047619
## 2 2560 Female Majority    1     0    6    66 20.238095 21.428571
## 3 3740 Female Majority    1     0    6    67 17.857143 21.428571
## 4 4020   Male Majority    1     0    5    60  7.142857  7.142857
## 5 6350   Male Majority    1     1    7    78 29.761905 30.952381
## 6 7030   Male Majority    1     0    5    62 14.285714 17.857143
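Because the data are described as complete observations on 166 subjects at four occasions, the 664 rows should consist of exactly one row per child per occasion. The check below is a minimal sketch of how that could be verified. `check_balanced` is a hypothetical helper, and the column names `id` and `time` are taken from the `str(dta)` output above.

```r
# Check that a long-format data frame is balanced, i.e. every subject
# appears exactly once at each measurement occasion.
# `check_balanced` is a hypothetical helper, introduced only for illustration.
check_balanced <- function(d, id = "id", time = "time") {
  counts <- table(d[[id]], d[[time]])   # subjects in rows, occasions in columns
  list(
    n_subjects  = nrow(counts),
    n_occasions = ncol(counts),
    balanced    = all(counts == 1)
  )
}

# Usage, assuming `dta` has been read in as above:
# check_balanced(dta)
```

If the file matches the description in the introduction, this should report 166 subjects, 4 occasions, and a balanced design.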

4 Visualization

# tools
#install.packages("tidyverse")
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# plot
ggplot(data=dta, aes(x=month, y=read, group=id)) +
 geom_point(size=rel(.5)) +
 stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
 facet_grid(race ~ sex) +
 labs(x="Month", y="Reading score") +
 theme_bw()

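Reading scores increase over time, and the trends show no apparent differences by gender or ethnicity. However, the scores of females vary less than those of males.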
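The visual impression can be supplemented with numbers by fitting the same per-child regression lines that the plot draws and summarising their slopes by group. The following is a rough sketch rather than a formal test. `subject_slopes` is a hypothetical helper, and it assumes the columns `id`, `sex`, `race`, `month`, and `read` shown in the `str(dta)` output.

```r
# Fit read ~ month separately for each child (the same lines drawn by
# stat_smooth in the plot) and return one row per child.
# `subject_slopes` is a hypothetical helper, introduced only for illustration.
subject_slopes <- function(d) {
  fits <- lapply(split(d, d$id), function(g) {
    co <- coef(lm(read ~ month, data = g))
    data.frame(id = g$id[1], sex = g$sex[1], race = g$race[1],
               intercept = co[[1]], slope = co[[2]])
  })
  do.call(rbind, fits)
}

# Usage, assuming `dta` has been read in as above:
# slopes <- subject_slopes(dta)
# aggregate(slope ~ sex + race, data = slopes,
#           FUN = function(x) c(mean = mean(x), sd = sd(x)))
```

Similar mean slopes across the four panels would be consistent with the absence of gender and ethnicity differences in trend, while the standard deviations give a rough sense of how much the individual trajectories spread within each group.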

5 The End