Problem: Perform an appropriate analysis of the free recall example.
Data: In a learning task known as free recall, each subject was presented with a list of words, one at a time, for a few seconds each. After all the words had been presented, the subject was asked to recall as many of the words as possible. On each successive study trial the order in which the words were presented differed.
Control: 2 trials on List A, followed by 10 trials on List B.
Experimental: 2 trials on List B, followed by 10 trials on List B.
The mean number of correct responses on each trial for the two groups was the outcome variable.
Column 1: Group Control / Experimental
Column 2: Trial 1-10
Column 3: mean number of correct responses
pacman::p_load(lattice, tidyverse, nlme, nlstools)
dta1 <- read.table("freeRecall.asc", header = T)
names(dta1) <- c("Group","Trial","CorrectN")
str(dta1)
## 'data.frame': 20 obs. of 3 variables:
## $ Group : chr "C" "C" "C" "C" ...
## $ Trial : int 1 2 3 4 5 6 7 8 9 10 ...
## $ CorrectN: num 7.9 10.9 11.9 13 14.2 14.2 14.7 15.1 14.8 15.2 ...
lattice.options(default.theme = modifyList(standard.theme(color=FALSE), list(strip.background=list(col="transparent"))))
xyplot(CorrectN ~ Trial | Group , data=dta1,
type=c("g", "p"), cex=.5,
xlab="Trial on list B",
ylab="Mean number of correct responses")
ggplot(dta1, aes(Trial, CorrectN, color = Group)) +
geom_point(size = rel(2)) +
geom_line() +
labs(x = "Trial on list B", y = "Mean number of correct responses")+
scale_x_continuous(limits = c(0, 10), breaks = seq(0, 10, by = 2)) +
scale_y_continuous(limits = c(5, 20), breaks = seq(5, 20, by = 5)) +
guides(color = guide_legend(reverse = T)) +
theme(legend.position = c(.9,.2))
Model:
CR_i = α × exp(β × sqrt(Trial_i)),
where CR_i is the mean number of correct responses on trial i, α is the starting performance level, and β is the rate of improvement over trials.
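A rough check of what the starting values used below imply (a = 5, b = log(10/5) = log 2); starting values only need to be in the right ballpark for nls() to converge.
# implied mean correct responses at trials 1, 5, and 10 under the starting values
5 * exp(log(2) * sqrt(c(1, 5, 10)))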
dta1_init <- c(a = 5, b = log(10/5))
summary(m0 <- nls(CorrectN ~ a*exp(b*sqrt(Trial)), data = dta1,
start = dta1_init))
##
## Formula: CorrectN ~ a * exp(b * sqrt(Trial))
##
## Parameters:
## Estimate Std. Error t value Pr(>|t|)
## a 8.92939 0.50253 17.769 7.34e-13 ***
## b 0.17714 0.02269 7.806 3.47e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.864 on 18 degrees of freedom
##
## Number of iterations to convergence: 6
## Achieved convergence tolerance: 4.602e-07
ggplot(dta1, aes(Trial, CorrectN, color = Group)) +
stat_smooth(method = "nls", formula = y ~ a*exp(b*sqrt(x)),
method.args = list(start = dta1_init),
se = F,
size = rel(.5)) +
geom_point(pch = 1, size = rel(2)) +
scale_x_continuous(limits = c(0, 10), breaks = seq(0, 10, by = 2)) +
scale_y_continuous(limits = c(5, 20), breaks = seq(5, 20, by = 5)) +
labs(x = "Trial on list B",
y = "Mean number of correct responses")
The group-specific curves do not coincide, which hints at a possible Group × Trial interaction; fitting group-specific parameters and comparing the nested models, as sketched below, would turn this into a formal test.
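A minimal sketch of such a comparison, reusing the exponential-growth model and starting values from above; the indicator coding (and the assumption that the experimental group is labelled "E") is an illustration, not part of the original analysis.
# pooled fit: one curve for both groups
m_pool <- nls(CorrectN ~ a * exp(b * sqrt(Trial)), data = dta1,
start = dta1_init)
# group-specific fit: let a and b shift with an indicator for the experimental group
dta1$E <- as.numeric(dta1$Group == "E")  # assumes the experimental group is coded "E"
m_grp <- nls(CorrectN ~ (a + a1 * E) * exp((b + b1 * E) * sqrt(Trial)),
data = dta1,
start = c(a = 5, a1 = 0, b = log(2), b1 = 0))
# F-test comparison of the nested nls fits
anova(m_pool, m_grp)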
Source: Schwartz, R.M., & Humphreys, M.S. (1973). List differentiation in part/whole free recall. American Journal of Psychology, 86, 79-88.
Problem: Fit the model described in the *Fox & Geese* example to the data.
Data: During a 3-week period, subjects played "Fox & Geese", a board game played on a checkerboard. Each subject played up to 27 games. The number of moves the subject made before committing a catastrophic error (resulting in defeat) was the outcome of interest. The subjects' reading scores were also collected as a potential covariate.
Column 1: Subject ID
Column 2: Game played
Column 3: Number of moves made before defeat
Column 4: Reading score
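The read-in step for these data is not shown; a minimal sketch, assuming a plain-text file with no header row (the file name is hypothetical):
# hypothetical file name; adjust header/format to the actual file
dta2 <- read.table("foxnGeese.asc", header = FALSE)
names(dta2) <- c("ID", "Game", "Moves", "Read")
dta2$ID <- factor(dta2$ID)
str(dta2)
head(dta2)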
## 'data.frame': 445 obs. of 4 variables:
## $ ID : Factor w/ 17 levels "S101","S102",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Game : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Moves: int 4 7 8 3 3 3 7 6 3 7 ...
## $ Read : num 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 1.4 ...
## ID Game Moves Read
## 1 S101 1 4 1.4
## 2 S101 2 7 1.4
## 3 S101 3 8 1.4
## 4 S101 4 3 1.4
## 5 S101 5 3 1.4
## 6 S101 6 3 1.4
xyplot(Moves ~ Game | ID , data=dta2,
type=c("g", "p"), cex=.5,
xlab="Number of game played",
ylab="Number of moves made before defeat")
In general, the more games a subject has played, the more moves they make before defeat, although the trajectories are noisy: the occasional very short games may reflect careless attacks or lapses of concentration.
S101 and S106 differ noticeably from the other subjects.
# individual profiles on one panel
ggplot(dta2, aes(Game, Moves, group = ID)) +
geom_point(pch = 20)+
geom_line(alpha = .3) +
labs(x = "Number of game played", y = "Number of moves made before defeat")
# coerce data into groupedData object
dta2_g <- groupedData(Moves ~ Game | ID, outer = ~ Read,
data = as.data.frame(dta2),
labels = list(x = "Number of game played",
y = "Number of moves made before defeat"),
units = list(x = "(bouts)", y = "(steps)"))
Model: Let Y_ij be the number of moves made by subject i in the jth game played.
Y_ij = 1 + 19 / (1 + b0_i × exp(-b1_i × Time_ij)) + ε_ij,
where Time_ij is the number of games subject i has played, and (b0_i, b1_i) are subject-specific random effects.
Illustrative parameter values: b0_i = 150, 15, or 1.5; b1_i = 0.1, 0.3, or 0.5 (the first pair, 150 and 0.1, is used as starting values below).
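Before fitting, one can check what these values imply for the curve shape; a minimal sketch (the pairings of b0 and b1 here are arbitrary, for illustration only):
# shape of 1 + 19 / (1 + b0 * exp(-b1 * Game)) for a few candidate values
curve(1 + 19 / (1 + 150 * exp(-0.1 * x)), from = 1, to = 27,
ylim = c(1, 20), xlab = "Game", ylab = "Expected moves", lty = 1)
curve(1 + 19 / (1 + 15 * exp(-0.3 * x)), add = TRUE, lty = 2)
curve(1 + 19 / (1 + 1.5 * exp(-0.5 * x)), add = TRUE, lty = 3)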
# logistic-type growth curve, bounded between 1 and 20 moves
mfx <- function(x, A, B) 1 + 19 / (1 + A * exp(-B * x))
# fit the curve separately to each subject (one subject fails to converge)
m0 <- nlsList(Moves ~ mfx(Game, A, B),
start = c(A = 150, B = 0.1),
data = dta2_g)
coef(m0)  # per-subject parameter estimates
## Warning: 1 error caught in nls(model, data = data, control = controlvals, start
## = start): number of iterations exceeded maximum of 50
## A B
## S113 3.444376 0.019393820
## S110 19.021257 0.083381612
## S101 3.085531 -0.034697912
## S102 28.499570 0.148899148
## S103 5.385204 0.005747663
## S104 6.257776 0.133481440
## S108 26.681234 0.250404162
## S116 7.093699 0.141653735
## S106 8.742240 0.017753536
## S105 16.633833 0.125658158
## S117 NA NA
## S115 16.674963 0.160175972
## S107 3.146240 0.047900002
## S114 101.147125 0.298797401
## S112 88.215308 0.176818526
## S111 56.277746 0.198327030
## S109 8.227554 0.155514017
m1 <- nlme(m0, random = pdDiag(A + B ~ 1))
# fixed-effect estimates from m1, reused below as starting values for m3
prm0 <- round(unlist(lapply(m1$coef$fixed, mean)), digits = 3)
prm0
## A B
## 10.357 0.109
# allow a separate residual standard deviation for each subject
m2 <- update(m1, weights = varIdent(form = ~ 1 | ID),
control = nlmeControl(maxIter = 1e8, msMaxIter = 1e8, niterEM = 500))
summary(m2)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: Moves ~ mfx(Game, A, B)
## Data: dta2_g
## AIC BIC logLik
## 2456.174 2542.234 -1207.087
##
## Random effects:
## Formula: list(A ~ 1, B ~ 1)
## Level: ID
## Structure: Diagonal
## A B Residual
## StdDev: 3.468549 0.05936851 3.982061
##
## Variance function:
## Structure: Different standard deviations per stratum
## Formula: ~1 | ID
## Parameter estimates:
## S113 S110 S101 S102 S103 S104 S108 S116
## 1.0000000 0.8277653 0.4279184 1.1215850 0.5515780 1.1217129 0.7118691 1.2247194
## S106 S105 S117 S115 S107 S114 S112 S111
## 0.3392785 0.8255157 1.0328394 0.9281161 0.6649241 0.9527503 0.9995639 1.1471650
## S109
## 1.2466181
## Fixed effects: list(A ~ 1, B ~ 1)
## Value Std.Error DF t-value p-value
## A 8.579734 1.2977639 427 6.611167 0
## B 0.096326 0.0159375 427 6.044003 0
## Correlation:
## A
## B 0.287
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.1980973 -0.6733032 -0.1749343 0.5312265 4.0182717
##
## Number of Observations: 445
## Number of Groups: 17
# let both curve parameters A and B depend on the reading score
m3 <- update(m2,
fixed = list(A + B ~ Read),
random = pdDiag(A + B ~ 1),
groups = ~ ID,
start = list(fixed = c(A = c(prm0[1], 0),
B = c(prm0[2], 0))))
summary(m3)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: Moves ~ mfx(Game, A, B)
## Data: dta2_g
## AIC BIC logLik
## 2452.096 2546.352 -1203.048
##
## Random effects:
## Formula: list(A ~ 1, B ~ 1)
## Level: ID
## Structure: Diagonal
## A.(Intercept) B.(Intercept) Residual
## StdDev: 2.026809 0.05351836 3.894973
##
## Variance function:
## Structure: Different standard deviations per stratum
## Formula: ~1 | ID
## Parameter estimates:
## S113 S110 S101 S102 S103 S104 S108 S116
## 1.0000000 0.8633940 0.4407039 1.1670976 0.5603937 1.1435745 0.7424576 1.2491533
## S106 S105 S117 S115 S107 S114 S112 S111
## 0.3458037 0.8407405 1.0628409 0.9404110 0.8148864 0.9537729 0.9889874 1.1371776
## S109
## 1.3115064
## Fixed effects: list(A + B ~ Read)
## Value Std.Error DF t-value p-value
## A.(Intercept) -4.631553 4.837131 425 -0.9575000 0.3389
## A.Read 8.021219 3.152020 425 2.5447863 0.0113
## B.(Intercept) -0.001241 0.040061 425 -0.0309764 0.9753
## B.Read 0.054672 0.019629 425 2.7852689 0.0056
## Correlation:
## A.(In) A.Read B.(In)
## A.Read -0.975
## B.(Intercept) 0.306 -0.273
## B.Read -0.395 0.389 -0.928
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.2978828 -0.6328086 -0.1662572 0.5546343 4.0940868
##
## Number of Observations: 445
## Number of Groups: 17
## $fixed
## A.(Intercept) A.Read B.(Intercept) B.Read
## -2.74734459 7.22293656 0.01193214 0.04959430
##
## $random
## $random$ID
## A.(Intercept) B.(Intercept)
## S113 -4.085972e-09 -0.0270401937
## S110 6.172083e-07 -0.0337839074
## S101 -5.643789e-07 -0.0648744083
## S102 2.413142e-07 -0.0009086372
## S103 -2.609555e-08 -0.0540500740
## S104 -3.181928e-07 0.0525316734
## S108 -2.533298e-08 0.0778790059
## S116 -3.087107e-07 0.0486219917
## S106 8.287331e-08 -0.0727529894
## S105 1.920052e-07 0.0035211630
## S117 2.764930e-07 0.0494357469
## S115 -1.579409e-08 0.0296559402
## S107 -3.029171e-07 0.0029401752
## S114 3.284553e-08 0.0487637620
## S112 1.309247e-07 -0.0476460667
## S111 5.234170e-08 -0.0267635683
## S109 -6.049783e-08 0.0144703865
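The comparison table below matches the output of nlme's anova method; models m4 and m5 were fit in code not shown here, so the call is reconstructed as an assumption:
anova(m1, m2, m3, m4, m5)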
## Model df AIC BIC logLik Test L.Ratio p-value
## m1 1 5 2500.012 2520.502 -1245.006
## m2 2 21 2456.174 2542.234 -1207.087 1 vs 2 75.83759 <.0001
## m3 3 23 2452.096 2546.352 -1203.048 2 vs 3 8.07780 0.0176
## m4 4 24 2423.659 2522.013 -1187.830 3 vs 4 30.43706 <.0001
## m5 5 23 2494.401 2588.656 -1224.200 4 vs 5 72.74149 <.0001
dta2$yhat <- fitted(m3)
ggplot(dta2, aes(Game, Moves)) +
geom_point(size = rel(.5), pch = 1, col = "skyblue") +
geom_line(aes(y = yhat, group = ID), col = "indianred") +
labs(x = "Number of games played (bouts)",
y = "Number of moves made before defeat (steps)") +
facet_wrap(~ ID, ncol = 6) +
theme(legend.position = "none")
Source: Tivnan, T. (1980). Doctoral dissertation, Harvard Graduate School of Education. Reported in Singer, J.D., & Willett, J.B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. New York: Oxford University Press.
Problem: Use a non-linear mixed-effects model to account for the individual data presented in the planes example.
Data: A computer game simulating the task of an air traffic controller was used in a study of skill acquisition. The objective was to examine how well subjects could bring planes in to land safely. The task was continuous, and the response variable was the number of planes brought in to land safely every 10 minutes. Subjects were allowed 10-minute breaks after each set of three trials following the initial trial (i.e., after trials 4 and 7) to minimize massed-practice effects. The sample consists of 28 subjects. Scores were recorded continuously over a period of 100 minutes, yielding 10 scores per subject; however, scores from the first trial were excluded to allow for an adjustment period to the task.
Column 1: Subject ID
Column 2: Trial number (1-9)
Column 3: Number of planes landed safely
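As before, the read-in step is not shown; a minimal sketch with a hypothetical file name:
# hypothetical file name; adjust header/format to the actual file
dta3 <- read.table("planes.asc", header = FALSE)
names(dta3) <- c("ID", "Trial", "Plane")
dta3$ID <- factor(dta3$ID)
str(dta3)
head(dta3)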
## 'data.frame': 252 obs. of 3 variables:
## $ ID : Factor w/ 28 levels "S11","S12","S13",..: 1 1 1 1 1 1 1 1 1 2 ...
## $ Trial: int 1 2 3 4 5 6 7 8 9 1 ...
## $ Plane: int 39 40 47 51 52 50 60 52 56 38 ...
## ID Trial Plane
## 1 S11 1 39
## 2 S11 2 40
## 3 S11 3 47
## 4 S11 4 51
## 5 S11 5 52
## 6 S11 6 50
ggplot(dta3, aes(Trial, Plane, group = ID)) +
geom_line() +
geom_point() +
# facet_wrap( ~ ID) +
labs(x = "Trial", y = "Plane")
ggplot(dta3, aes(Trial, Plane)) +
geom_line() +
geom_point() +
facet_wrap( ~ ID) +
labs(x = "Trial", y = "Plane")
dta3_g <- groupedData(Plane ~ Trial | ID, data = as.data.frame(dta3),
labels = list(x = "Trial", y = "number of planes land safely"),
units = list(x = "(order)", y = "(unit)") )
plot(dta3_g, pch = 1, aspect = .9)
dta3_init <- c(a = 1, b = log(10/1))
summary(m0 <- nls(Plane ~ a*exp(b*sqrt(Trial)), data = dta3,
start = dta3_init))
##
## Formula: Plane ~ a * exp(b * sqrt(Trial))
##
## Parameters:
## Estimate Std. Error t value Pr(>|t|)
## a 16.35533 1.17486 13.92 <2e-16 ***
## b 0.30581 0.02953 10.36 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.637 on 250 degrees of freedom
##
## Number of iterations to convergence: 9
## Achieved convergence tolerance: 4.348e-07
ggplot(dta3, aes(Trial, Plane)) +
stat_smooth(method = "nls", formula = y ~ a*exp(b*sqrt(x)),
method.args = list(start = dta3_init),
se = F,
size = rel(.5)) +
geom_point(pch = 1, size = rel(2)) +
scale_x_continuous(limits = c(0, 10), breaks = seq(0, 10, by = 5)) +
scale_y_continuous(limits = c(0, 70), breaks = seq(0, 70, by = 20)) +
facet_wrap( ~ ID) +
labs(x = "Trial (order)",
y = "Number of planes land safely (unit)")
## Warning: 2 times caught the same error in nls(y ~ x/(K + x), data = xy,
## start = list(K = abs(pars[2L]/pars[1L])), algorithm = "plinear"): step factor
## 0.000488281 reduced below 'minFactor' of 0.000976562
## Call:
## Model: Plane ~ SSmicmen(Trial, Vm, K) | ID
## Data: dta3_g
##
## Coefficients:
## Vm K
## S37 NA NA
## S32 38.42522 2.6382919
## S21 35.58824 0.6117632
## S26 33.49397 0.8342775
## S31 38.25314 1.4534320
## S29 46.66226 2.9971659
## S35 NA NA
## S23 35.37661 0.6846419
## S27 39.04922 0.8537614
## S28 46.16958 2.2907550
## S33 58.51689 4.8733793
## S34 58.73462 5.3899245
## S36 72.57835 7.9155112
## S14 30.51662 -0.1808489
## S30 48.34655 1.5879277
## S19 41.70994 0.9235549
## S25 46.88423 1.9489698
## S20 45.17192 1.2972892
## S38 102.88531 12.2902544
## S18 45.74841 0.8423952
## S22 45.58569 0.9419266
## S13 43.09619 0.1979081
## S24 70.15462 5.2446174
## S15 46.70072 0.4936488
## S17 52.95904 1.5766528
## S12 53.66601 0.7619322
## S16 54.07300 0.7409107
## S11 58.77729 0.6510748
##
## Degrees of freedom: 234 total; 182 residual
## Residual standard error: 3.575345
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: Plane ~ SSmicmen(Trial, Vm, K)
## Data: dta3_g
## AIC BIC logLik
## 1554.784 1575.96 -771.3919
##
## Random effects:
## Formula: list(Vm ~ 1, K ~ 1)
## Level: ID
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## Vm 7.582077 Vm
## K 1.681533 0.441
## Residual 3.861442
##
## Fixed effects: list(Vm ~ 1, K ~ 1)
## Value Std.Error DF t-value p-value
## Vm 45.83635 1.7120608 223 26.772621 0
## K 1.93060 0.3545793 223 5.444764 0
## Correlation:
## Vm
## K 0.546
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.72172107 -0.66635039 -0.03338046 0.57852813 2.08310019
##
## Number of Observations: 252
## Number of Groups: 28
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: Plane ~ SSmicmen(Trial, Vm, K)
## Data: dta3_g
## AIC BIC logLik
## 1632.314 1646.431 -812.1568
##
## Random effects:
## Formula: Vm ~ 1 | ID
## Vm Residual
## StdDev: 8.988347 5.192305
##
## Fixed effects: list(Vm ~ 1, K ~ 1)
## Value Std.Error DF t-value p-value
## Vm 42.64821 1.9577317 223 21.78450 0
## K 1.22035 0.1196625 223 10.19829 0
## Correlation:
## Vm
## K 0.439
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79153678 -0.52887678 -0.03144453 0.53400462 3.16157119
##
## Number of Observations: 252
## Number of Groups: 28
dta3_m1a <- data.frame(dta3, fit = fitted(m1a_g), fit1 = fitted(m1_g))
ggplot(dta3_m1a, aes(Trial, fit)) +
geom_point(aes(Trial, Plane), pch = 1) +
geom_line(col = "steelblue", lwd = .6) +  # fitted values from m1a_g
geom_line(aes(Trial, fit1), col = "indianred", lwd = .6) +  # fitted values from m1_g
facet_wrap( ~ ID) +
labs(x = "Trial (order)", y = "Number of planes landed safely (unit)")
Source: Kanfer, R., & Ackerman, P. L. (1989). Motivation and cognitive abilities: An integrative/aptitude-treatment interaction approach to skill acquisition. Journal of Applied Psychology, 74, 657-690.
Problem: One can treat the item response data using a cumulative normal distribution function (pnorm in R) to form a non-linear model (like a probit link for binary data). Perform an analysis of the mathematics placement test example using this approach.
Data: The dataset contains the results of a mathematics placement test for 200 freshmen entering Bowling Green State University. Form B of this test consists of 35 multiple-choice questions on topics in high school algebra.
Column 1: Student ID
Columns 2-36: Score on questions 1-35, Correct = 1, Incorrect = 0
dta4 <- read.table("mathPlacement.asc", header = F)
names(dta4) <- c("ID", paste("a", 1:35, sep = ""))
str(dta4)
## 'data.frame': 200 obs. of 36 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ a1 : int 0 1 1 1 1 1 0 0 1 0 ...
## $ a2 : int 1 1 1 1 1 1 1 0 0 0 ...
## $ a3 : int 1 1 1 1 1 1 0 0 1 1 ...
## $ a4 : int 0 1 1 0 1 1 0 0 0 1 ...
## $ a5 : int 1 1 1 1 1 1 1 0 1 0 ...
## $ a6 : int 1 1 1 1 1 1 1 0 1 1 ...
## $ a7 : int 1 1 1 1 1 1 1 0 1 1 ...
## $ a8 : int 0 1 1 1 1 1 1 0 0 1 ...
## $ a9 : int 0 1 1 1 1 1 0 0 1 1 ...
## $ a10: int 1 1 1 1 1 0 1 0 0 1 ...
## $ a11: int 1 1 1 1 1 1 1 0 1 0 ...
## $ a12: int 1 1 1 1 0 1 0 0 1 1 ...
## $ a13: int 1 0 1 1 1 1 1 0 1 0 ...
## $ a14: int 1 1 1 1 1 1 0 1 0 1 ...
## $ a15: int 1 1 1 0 0 1 1 0 0 1 ...
## $ a16: int 0 0 1 0 0 1 0 0 0 1 ...
## $ a17: int 0 1 1 0 1 1 1 0 1 0 ...
## $ a18: int 1 1 1 1 1 1 1 0 1 0 ...
## $ a19: int 1 1 1 0 0 1 1 0 0 0 ...
## $ a20: int 1 1 1 1 1 1 1 0 0 1 ...
## $ a21: int 1 1 0 0 0 0 0 0 0 0 ...
## $ a22: int 1 0 1 1 0 0 0 0 1 1 ...
## $ a23: int 1 1 1 1 1 1 1 0 0 0 ...
## $ a24: int 1 1 1 0 1 0 1 0 0 0 ...
## $ a25: int 1 1 1 1 0 1 1 1 0 0 ...
## $ a26: int 1 1 1 1 0 1 0 1 0 1 ...
## $ a27: int 0 1 1 0 0 1 1 0 1 1 ...
## $ a28: int 0 1 1 0 0 1 1 0 0 0 ...
## $ a29: int 0 0 1 1 1 1 0 0 0 0 ...
## $ a30: int 1 1 1 1 0 1 1 0 1 1 ...
## $ a31: int 1 1 1 1 1 1 0 0 0 0 ...
## $ a32: int 0 0 1 1 0 1 1 0 0 0 ...
## $ a33: int 1 0 1 1 0 0 1 0 0 0 ...
## $ a34: int 1 1 0 0 1 0 1 0 0 1 ...
## $ a35: int 1 1 1 1 1 1 1 0 0 0 ...
## ID a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 a17 a18 a19 a20 a21
## 1 1 0 1 1 0 1 1 1 0 0 1 1 1 1 1 1 0 0 1 1 1 1
## 2 2 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1
## 3 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
## 4 4 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0 1 0 1 0
## 5 5 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 0 1 0
## 6 6 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0
## a22 a23 a24 a25 a26 a27 a28 a29 a30 a31 a32 a33 a34 a35
## 1 1 1 1 1 1 0 0 0 1 1 0 1 1 1
## 2 0 1 1 1 1 1 1 0 1 1 0 0 1 1
## 3 1 1 1 1 1 1 1 1 1 1 1 1 0 1
## 4 1 1 0 1 1 0 0 1 1 1 1 1 0 1
## 5 0 1 1 0 0 0 0 1 0 1 0 0 1 1
## 6 0 1 0 1 1 1 1 1 1 1 1 0 0 1
Model:
Pr{Y_ij = 1} = Φ(a_j θ_i - b_j),
where i indexes subjects, j indexes items, θ_i is the latent ability of subject i, and a_j and b_j are the discrimination and difficulty parameters of item j.
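A minimal sketch of one way to carry this out, simplifying to a common discrimination (a_j = 1) and fitting the model as a probit GLMM; this assumes the lme4 package (not loaded above). The full two-parameter model Φ(a_j θ_i - b_j) would need a dedicated IRT routine (e.g., the mirt package) or a custom likelihood.
# reshape to long format: one row per person-item response
library(lme4)
dta4_long <- dta4 %>%
pivot_longer(a1:a35, names_to = "Item", values_to = "Score") %>%
mutate(ID = factor(ID),
Item = factor(Item, levels = paste0("a", 1:35)))
# probit GLMM: Pr{Y_ij = 1} = pnorm(theta_i - b_j), with person abilities as
# random intercepts and item easiness as fixed effects
m_irt <- glmer(Score ~ 0 + Item + (1 | ID),
family = binomial(link = "probit"),
data = dta4_long)
summary(m_irt)
# item difficulties b_j are minus the fixed effects; person abilities theta_i
# are the conditional modes of the random intercepts
b_hat <- -fixef(m_irt)
theta_hat <- ranef(m_irt)$ID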
Source: Johnson, V.E., & Albert, J.H. (1998). Ordinal Data Modeling. p. 212.