1 Homework-1

Anorexia example

1.1 Problem

Perform an analysis of the anorexia data example following the paired or independent samples example in the course note.
Data info: This dataset presents 17 paired data corresponding to the weights of girls before and after treatment for anorexia. A more complete version can be found in the package MASS. There is actually a cluster of four points in this dataset.

1.2 Setting and input data

#install.packages("PairedData")
#load package
library(PairedData, Hmisc)

## Loading required package: MASS

## Loading required package: gld

## Loading required package: mvtnorm

## Loading required package: lattice

## Loading required package: ggplot2

## 
## Attaching package: 'PairedData'

## The following object is masked from 'package:base':
## 
##     summary

#input data
data(Anorexia, package="PairedData")
dta1 <- Anorexia
str(dta1)

## 'data.frame':    17 obs. of  2 variables:
##  $ Prior: num  83.8 83.3 86 82.5 86.7 79.6 76.9 94.2 73.4 80.5 ...
##  $ Post : num  95.2 94.3 91.5 91.9 100.3 ...

head(dta1)

##   Prior  Post
## 1  83.8  95.2
## 2  83.3  94.3
## 3  86.0  91.5
## 4  82.5  91.9
## 5  86.7 100.3
## 6  79.6  76.7

1.3 Data analysis

# compute variable means (Prior mean weight=83.22941 < Post mean weight =90.49412)
colMeans(dta1)

##    Prior     Post 
## 83.22941 90.49412

# compute variable sd (Prior SD of weight=5.02 < Post SD of weight =8.48)
print(apply(dta1, 2, sd), 3)

## Prior  Post 
##  5.02  8.48

# co-variances
cov(dta1)

##          Prior     Post
## Prior 25.16721 22.88268
## Post  22.88268 71.82684

# correlations of Post and Prior weight = 0.538
print(cor(dta1),3)

##       Prior  Post
## Prior 1.000 0.538
## Post  0.538 1.000

# 2-sample t-test
t.test(dta1$Post, dta1$Prior)

## 
##  Welch Two Sample t-test
## 
## data:  dta1$Post and dta1$Prior
## t = 3.0414, df = 25.986, p-value = 0.005324
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   2.35469 12.17472
## sample estimates:
## mean of x mean of y 
##  90.49412  83.22941

# paired t-test
t.test(dta1$Post, dta1$Prior, pair=T)

## 
##  Paired t-test
## 
## data:  dta1$Post and dta1$Prior
## t = 4.1849, df = 16, p-value = 0.0007003
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   3.58470 10.94471
## sample estimates:
## mean of the differences 
##                7.264706

95% CI and p-value of Paired t-test (more precise) are smaller than Two Sample t-test.

1.4 Visualization

# turn wide format to long format
dta1L <- tidyr::gather(dta1, key = "Therapy", value = "weights", Prior, Post)
# show first 6 lines
head(dta1L)

##   Therapy weights
## 1   Prior    83.8
## 2   Prior    83.3
## 3   Prior    86.0
## 4   Prior    82.5
## 5   Prior    86.7
## 6   Prior    79.6

#

#mean_cl_boot: implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality.(x, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)??

ggplot(dta1L, aes(Therapy, weights)) + 
  geom_point(shape = 1,
             colour = "coral") + 
  stat_summary(fun.data = mean_cl_boot, 
               geom = "pointrange",
               colour = "deepskyblue4", 
               width = 0.5) + 
  coord_flip() + #Cartesian coordinates with x and y flipped?
  labs(x = "Therapy", y = "weights") +
  theme_bw()

## Warning: Ignoring unknown parameters: width

The blue line show the mean weights and CI of post therapy higher than prior.

1.5 Without subject effects

# show results of anova
summary(lm(weights ~ Therapy -1, data=dta1L))

## 
## Call:
## lm(formula = weights ~ Therapy - 1, data = dta1L)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.294  -2.454   1.106   4.004  11.106 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## TherapyPost    90.494      1.689   53.58   <2e-16 ***
## TherapyPrior   83.229      1.689   49.28   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.964 on 32 degrees of freedom
## Multiple R-squared:  0.994,  Adjusted R-squared:  0.9936 
## F-statistic:  2649 on 2 and 32 DF,  p-value: < 2.2e-16

summary(aov(weights ~ Therapy -1 , data=dta1L))

##           Df Sum Sq Mean Sq F value Pr(>F)    
## Therapy    2 256977  128489    2649 <2e-16 ***
## Residuals 32   1552      48                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Weight of therapyPost=(weight of therapyPrior + 7.265) The Estimate value of therapyPost(90.494) and therapyPrior(83.229) are same as sample estimates mean from the two sample t test.

1.6 Same as paired t-test

# generate sbj id for each group
dta1L <- dplyr::mutate(dta1L, Subject=rep(paste0("S", 1:17), 2), Sbj=paste0("S", 1:34))
# a random effect from Subject but then fix Therapy with Therapy nested within Subject levels versus which is just taking Subject as the random effects with no constraint on other variables. 
summary(aov(weights ~ Therapy + Error(Subject/Therapy), data = dta1L))

## 
## Error: Subject
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 16   1142   71.38               
## 
## Error: Subject:Therapy
##           Df Sum Sq Mean Sq F value Pr(>F)    
## Therapy    1  448.6   448.6   17.51  7e-04 ***
## Residuals 16  409.8    25.6                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# within subject design
summary(aov(weights ~ Therapy + Subject, data = dta1L))

##             Df Sum Sq Mean Sq F value Pr(>F)    
## Therapy      1  448.6   448.6  17.513 0.0007 ***
## Subject     16 1142.1    71.4   2.787 0.0240 *  
## Residuals   16  409.8    25.6                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# between design
summary(aov(weights ~ Therapy + Sbj, data = dta1L))

##             Df Sum Sq Mean Sq
## Therapy      1  448.6   448.6
## Sbj         32 1551.9    48.5

Sun sq of sbj (1551.9)=1142.1(Subject) + 409.8 (Residuals)
Sum Sq between therapy is smaller the Sum Sq between subject. (整體個體間變異性相較於治療組內個體變異性大)
Mean Sq of Therapy without subject effects (128489) is larger than 128489 > Mean Sq of Therapy within subject (448.6)

# plot for paired
with(Anorexia, plot(paired(Prior, Post), type="profile")) +
 labs(x="Therapy", y="Weight (in lb)") +
 theme_bw()

2 Homework-2

2.1 Problem

Identify the study design of the ergoStool example and replicate the results of analysis reported. (similar to random effect nested within a fixed effect? repeated measurement?)
Data info: An experimenter wishes to compare 4 particular type of stools. Each of nine individuals randomly selected from a subject pool is asked to test the stools by recording the effort required to arise from each of 4 types of stools.(ergoStool{nlme}) 大便型態對於解便用力的程度分數的關係

2.2 Setting and input data

input data from nlme package

data(ergoStool, package="nlme")
dta2 <- ergoStool
str(dta2)

## Classes 'nffGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame':   36 obs. of  3 variables:
##  $ effort : num  12 15 12 10 10 14 13 12 7 14 ...
##  $ Type   : Factor w/ 4 levels "T1","T2","T3",..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ Subject: Ord.factor w/ 9 levels "8"<"5"<"4"<"9"<..: 8 8 8 8 9 9 9 9 6 6 ...
##  - attr(*, "formula")=Class 'formula'  language effort ~ Type | Subject
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##  - attr(*, "labels")=List of 2
##   ..$ x: chr "Type of stool"
##   ..$ y: chr "Effort required to arise"
##  - attr(*, "units")=List of 1
##   ..$ y: chr "(Borg scale)"
##  - attr(*, "FUN")=function (x)  
##   ..- attr(*, "source")= chr "function (x) mean(x, na.rm = TRUE)"
##  - attr(*, "order.groups")= logi TRUE

head(dta2)

##   effort Type Subject
## 1     12   T1       1
## 2     15   T2       1
## 3     12   T3       1
## 4     10   T4       1
## 5     10   T1       2
## 6     14   T2       2

2.3 Data analysis

2.3.1 Descriptive statistics

mean effort score of each type of stool and of each subject
the highest mean effort score of stool type is type 2

aggregate(dta2[,1], list(dta2$Type), mean)

##   Group.1         x
## 1      T1  8.555556
## 2      T2 12.444444
## 3      T3 10.777778
## 4      T4  9.222222

aggregate(dta2[,1], list(dta2$Subject), mean)

##   Group.1     x
## 1       8  8.25
## 2       5  8.50
## 3       4  9.25
## 4       9 10.00
## 5       6 10.25
## 6       3 10.75
## 7       7 10.75
## 8       1 12.25
## 9       2 12.25

2.3.2 Lmm fit by REML

# Random effects (random effect term: Subject)
summary(m0 <- nlme::lme(effort ~ Type, random = ~ 1 | Subject, data=dta2, method="REML") )

## Linear mixed-effects model fit by REML
##  Data: dta2 
##        AIC      BIC    logLik
##   133.1308 141.9252 -60.56539
## 
## Random effects:
##  Formula: ~1 | Subject
##         (Intercept) Residual
## StdDev:    1.332465 1.100295
## 
## Fixed effects: effort ~ Type 
##                Value Std.Error DF   t-value p-value
## (Intercept) 8.555556 0.5760123 24 14.853079  0.0000
## TypeT2      3.888889 0.5186838 24  7.497610  0.0000
## TypeT3      2.222222 0.5186838 24  4.284348  0.0003
## TypeT4      0.666667 0.5186838 24  1.285304  0.2110
##  Correlation: 
##        (Intr) TypeT2 TypeT3
## TypeT2 -0.45               
## TypeT3 -0.45   0.50        
## TypeT4 -0.45   0.50   0.50 
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.80200345 -0.64316591  0.05783115  0.70099706  1.63142054 
## 
## Number of Observations: 36
## Number of Groups: 9

faraway::sumary(m1 <-lme4::lmer(effort ~ Type + (1 | Subject), data=dta2) )

## Fixed Effects:
##             coef.est coef.se
## (Intercept) 8.56     0.58   
## TypeT2      3.89     0.52   
## TypeT3      2.22     0.52   
## TypeT4      0.67     0.52   
## 
## Random Effects:
##  Groups   Name        Std.Dev.
##  Subject  (Intercept) 1.33    
##  Residual             1.10    
## ---
## number of obs: 36, groups: Subject, 9
## AIC = 133.1, DIC = 123.2
## deviance = 122.1

faraway::sumary(m2 <-lme4::lmer(effort ~ 1 + (1 | Subject), data=dta2) )

## Fixed Effects:
## coef.est  coef.se 
##    10.25     0.48 
## 
## Random Effects:
##  Groups   Name        Std.Dev.
##  Subject  (Intercept) 1.03    
##  Residual             2.02    
## ---
## number of obs: 36, groups: Subject, 9
## AIC = 163.8, DIC = 158.5
## deviance = 158.2

2.3.3 Approximate 95% CI

nlme::intervals(m0)

## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                  lower      est.    upper
## (Intercept)  7.3667247 8.5555556 9.744386
## TypeT2       2.8183781 3.8888889 4.959400
## TypeT3       1.1517114 2.2222222 3.292733
## TypeT4      -0.4038442 0.6666667 1.737177
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: Subject 
##                    lower     est.    upper
## sd((Intercept)) 0.749509 1.332465 2.368835
## 
##  Within-group standard error:
##     lower      est.     upper 
## 0.8292494 1.1002946 1.4599324

# fixed effects parameters only
confint(m1,parm="beta_", oldNames=F)

## Computing profile confidence intervals ...

##                  2.5 %   97.5 %
## (Intercept)  7.4238425 9.687269
## TypeT2       2.8953043 4.882473
## TypeT3       1.2286377 3.215807
## TypeT4      -0.3269179 1.660251

# variance covariance
confint(m1,parm="theta_", oldNames=F)

## Computing profile confidence intervals ...

##                            2.5 %   97.5 %
## sd_(Intercept)|Subject 0.7342354 2.287261
## sigma                  0.8119798 1.390104

# plot
VCA::varPlot(effort ~ Type/Subject, 
             Data = dta2, 
             keep.order = T, 
             Points = list(pch=1, cex=1, col="gray"), 
             YLabel = list(text="effort",side=2,cex=1), 
             MeanLine = list(var=c("Type", "Subject"),
                             col=c("deepskyblue4","coral"), 
                             lwd=c(1,2)))

## Warning in min(x): min 中沒有無漏失的引數; 回傳 Inf

## Warning in max(x): max 中沒有無漏失的引數；回傳 -Inf

3 Homework-3

3.1 Problem

Examine the channel effects of the EEG study by following the analysis in the left brow EMG example.
data info: At the end of six weeks of therapy, the changes of the absolute theta power of electroencephalogram (EEG) of 19 depressed patients in nine selected channels of the patients’ brain were recorded.

3.2 Setting and input data

pacman::p_load(car, tidyverse, lme4, GGally)
dta3 <- read.table("thetaEEG.txt", header = T)
str(dta3)

## 'data.frame':    19 obs. of  10 variables:
##  $ ID  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Ch3 : num  -3.54 5.72 0.52 0 2.07 1.67 9.13 -0.43 -0.56 1.28 ...
##  $ Ch4 : num  -3.11 5.07 -0.18 0.74 0.76 ...
##  $ Ch5 : num  -0.24 6.87 0.9 1.1 3.51 2.77 3.44 -0.31 -1.22 1.89 ...
##  $ Ch6 : num  0.42 5.96 0.6 0.13 0.6 4.55 4.8 -0.61 0.67 1.77 ...
##  $ Ch7 : num  -0.49 8.2 1.27 0.19 3.71 1.8 0.48 -1.04 -0.97 1.83 ...
##  $ Ch8 : num  2.13 4.87 1.28 0.07 1.86 4.79 1.63 -0.13 -0.98 0.91 ...
##  $ Ch17: num  -4.15 5.48 -0.95 0.8 1.49 4.51 9.94 -0.61 0 1.4 ...
##  $ Ch18: num  2.87 5.57 1.74 0.25 3.11 3.24 1.34 -0.61 -1.22 1.1 ...
##  $ Ch19: num  1.34 6.33 0.79 -0.66 1.8 3.99 1.53 -0.43 -0.91 -0.12 ...

head(dta3)

##   ID   Ch3   Ch4   Ch5  Ch6   Ch7  Ch8  Ch17 Ch18  Ch19
## 1  1 -3.54 -3.11 -0.24 0.42 -0.49 2.13 -4.15 2.87  1.34
## 2  2  5.72  5.07  6.87 5.96  8.20 4.87  5.48 5.57  6.33
## 3  3  0.52 -0.18  0.90 0.60  1.27 1.28 -0.95 1.74  0.79
## 4  4  0.00  0.74  1.10 0.13  0.19 0.07  0.80 0.25 -0.66
## 5  5  2.07  0.76  3.51 0.60  3.71 1.86  1.49 3.11  1.80
## 6  6  1.67  4.18  2.77 4.55  1.80 4.79  4.51 3.24  3.99

chan <- as.factor(colnames(dta3[, -1]))
# linear model
m0 <- lm(as.matrix(dta3[, -1]) ~ 1)
m0_aov <- Anova(m0, idata = data.frame(chan), idesign = ~ chan)

## Note: model has only an intercept; equivalent type-III tests substituted.

summary(m0_aov)

## 
## Type III Repeated Measures MANOVA Tests:
## 
## ------------------------------------------
##  
## Term: (Intercept) 
## 
##  Response transformation matrix:
##      (Intercept)
## Ch3            1
## Ch4            1
## Ch5            1
## Ch6            1
## Ch7            1
## Ch8            1
## Ch17           1
## Ch18           1
## Ch19           1
## 
## Sum of squares and products for the hypothesis:
##             (Intercept)
## (Intercept)    1761.616
## 
## Multivariate Tests: (Intercept)
##                  Df test stat approx F num Df den Df   Pr(>F)  
## Pillai            1 0.2035907 4.601445      1     18 0.045839 *
## Wilks             1 0.7964093 4.601445      1     18 0.045839 *
## Hotelling-Lawley  1 0.2556358 4.601445      1     18 0.045839 *
## Roy               1 0.2556358 4.601445      1     18 0.045839 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ------------------------------------------
##  
## Term: chan 
## 
##  Response transformation matrix:
##      chan1 chan2 chan3 chan4 chan5 chan6 chan7 chan8
## Ch3      0     0     0     1     0     0     0     0
## Ch4      0     0     0     0     1     0     0     0
## Ch5      0     0     0     0     0     1     0     0
## Ch6      0     0     0     0     0     0     1     0
## Ch7      0     0     0     0     0     0     0     1
## Ch8     -1    -1    -1    -1    -1    -1    -1    -1
## Ch17     1     0     0     0     0     0     0     0
## Ch18     0     1     0     0     0     0     0     0
## Ch19     0     0     1     0     0     0     0     0
## 
## Sum of squares and products for the hypothesis:
##         chan1         chan2        chan3        chan4       chan5        chan6
## chan1  5.7475  1.650000e-02  1.485000000  0.506000000  7.70000000  1.930500000
## chan2  0.0165  4.736842e-05  0.004263158  0.001452632  0.02210526  0.005542105
## chan3  1.4850  4.263158e-03  0.383684211  0.130736842  1.98947368  0.498789474
## chan4  0.5060  1.452632e-03  0.130736842  0.044547368  0.67789474  0.169957895
## chan5  7.7000  2.210526e-02  1.989473684  0.677894737 10.31578947  2.586315789
## chan6  1.9305  5.542105e-03  0.498789474  0.169957895  2.58631579  0.648426316
## chan7  3.0635  8.794737e-03  0.791526316  0.269705263  4.10421053  1.028984211
## chan8 -0.0165 -4.736842e-05 -0.004263158 -0.001452632 -0.02210526 -0.005542105
##              chan7         chan8
## chan1  3.063500000 -1.650000e-02
## chan2  0.008794737 -4.736842e-05
## chan3  0.791526316 -4.263158e-03
## chan4  0.269705263 -1.452632e-03
## chan5  4.104210526 -2.210526e-02
## chan6  1.028984211 -5.542105e-03
## chan7  1.632889474 -8.794737e-03
## chan8 -0.008794737  4.736842e-05
## 
## Multivariate Tests: chan
##                  Df test stat approx F num Df den Df  Pr(>F)
## Pillai            1 0.5243665 1.515882      8     11 0.25588
## Wilks             1 0.4756335 1.515882      8     11 0.25588
## Hotelling-Lawley  1 1.1024593 1.515882      8     11 0.25588
## Roy               1 1.1024593 1.515882      8     11 0.25588
## 
## Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
## 
##              Sum Sq num Df Error SS den Df F value  Pr(>F)  
## (Intercept) 195.735      1   765.68     18  4.6014 0.04584 *
## chan         10.702      8   389.31    144  0.4948 0.85840  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Mauchly Tests for Sphericity
## 
##      Test statistic    p-value
## chan     0.00037434 9.7367e-11
## 
## 
## Greenhouse-Geisser and Huynh-Feldt Corrections
##  for Departure from Sphericity
## 
##       GG eps Pr(>F[GG])
## chan 0.31531     0.6559
## 
##         HF eps Pr(>F[HF])
## chan 0.3709213  0.6854028

# wide format to long format
dta3L <- dta3 %>% 
  gather( key = channel, value = EMG, 2:10) %>% 
  mutate( channel = factor(channel)) %>% 
  mutate( ID = factor(ID))

#
str(dta3L)

## 'data.frame':    171 obs. of  3 variables:
##  $ ID     : Factor w/ 19 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ channel: Factor w/ 9 levels "Ch17","Ch18",..: 4 4 4 4 4 4 4 4 4 4 ...
##  $ EMG    : num  -3.54 5.72 0.52 0 2.07 1.67 9.13 -0.43 -0.56 1.28 ...

# set theme to black and white and save parameters
ot <- theme_set(theme_bw())

3.3 Data analysis

# boxplot
ggplot(dta3L, aes(reorder(channel, EMG, median), EMG)) +
 geom_boxplot() +
 geom_point(alpha = .3) +
 geom_hline(yintercept = mean(dta3L$EMG)) +
 labs(x ="Channel condition", y = "EMG amplitutde")

# univariate approach
summary(m1<- aov(EMG ~ channel + Error(ID/channel), data = dta3L))

## 
## Error: ID
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 18  765.7   42.54               
## 
## Error: ID:channel
##            Df Sum Sq Mean Sq F value Pr(>F)
## channel     8   10.7   1.338   0.495  0.858
## Residuals 144  389.3   2.704

# show tables of model output
model.tables(m1, se = T)

## Tables of effects
## 
##  channel 
## channel
##    Ch17    Ch18    Ch19     Ch3     Ch4     Ch5     Ch6     Ch7     Ch8 
##  0.3327 -0.2157 -0.0751 -0.1688  0.5196 -0.0325  0.0759 -0.2188 -0.2173 
## 
## Standard errors of effects
##         channel
##          0.3772
## replic.      19

model.tables(m1, "means")

## Tables of means
## Grand mean
##          
## 1.069883 
## 
##  channel 
## channel
##   Ch17   Ch18   Ch19    Ch3    Ch4    Ch5    Ch6    Ch7    Ch8 
## 1.4026 0.8542 0.9947 0.9011 1.5895 1.0374 1.1458 0.8511 0.8526

# relevel factor channel by mean EMG #[c(3,2,1,4)]?
dta3L$channel <- factor(dta3L$channel, levels(dta3L$channel))

# EMG over channel condition with CIs
ggplot(dta3L, aes(channel, EMG)) +
 stat_summary(fun.data = "mean_cl_boot", geom = "pointrange", col = "firebrick") + 
 geom_hline(yintercept = mean(dta3L$EMG), linetype = "dashed") +
 geom_point(alpha = .3, size = rel(.7)) +
 labs(x = "channel condition", y = "EMG amplititude")

# Patient by emg with chan colors
ggplot(dta3L, aes(reorder(ID, EMG, mean), EMG, color = channel)) +
 geom_point() +
 geom_hline(yintercept = mean(dta3L$EMG), linetype = "dashed") + 
 coord_flip() +
 scale_color_manual(values = c("gray", "forestgreen", "orangered", "navy", "blue", "red", "coral", "green", "skyblue"))+
 labs(y = "EMG amplitude", x = "Patient ID") +
 theme(legend.position = c(.9, .2))

# mixed-effect approach
summary(m2 <- lmer(EMG ~ channel + (1 | ID), data = dta3L))

## Linear mixed model fit by REML ['lmerMod']
## Formula: EMG ~ channel + (1 | ID)
##    Data: dta3L
## 
## REML criterion at convergence: 697
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.6502 -0.3882 -0.0002  0.4215  4.6394 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 4.426    2.104   
##  Residual             2.704    1.644   
## Number of obs: 171, groups:  ID, 19
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   1.4026     0.6126   2.290
## channelCh18  -0.5484     0.5335  -1.028
## channelCh19  -0.4079     0.5335  -0.765
## channelCh3   -0.5016     0.5335  -0.940
## channelCh4    0.1868     0.5335   0.350
## channelCh5   -0.3653     0.5335  -0.685
## channelCh6   -0.2568     0.5335  -0.481
## channelCh7   -0.5516     0.5335  -1.034
## channelCh8   -0.5500     0.5335  -1.031
## 
## Correlation of Fixed Effects:
##             (Intr) chnC18 chnC19 chnnC3 chnnC4 chnnC5 chnnC6 chnnC7
## channelCh18 -0.435                                                 
## channelCh19 -0.435  0.500                                          
## channelCh3  -0.435  0.500  0.500                                   
## channelCh4  -0.435  0.500  0.500  0.500                            
## channelCh5  -0.435  0.500  0.500  0.500  0.500                     
## channelCh6  -0.435  0.500  0.500  0.500  0.500  0.500              
## channelCh7  -0.435  0.500  0.500  0.500  0.500  0.500  0.500       
## channelCh8  -0.435  0.500  0.500  0.500  0.500  0.500  0.500  0.500

#
confint(m2)

## Computing profile confidence intervals ...

##                  2.5 %    97.5 %
## .sig01       1.4944091 2.9720041
## .sigma       1.4360421 1.7986384
## (Intercept)  0.2052226 2.6000405
## channelCh18 -1.5725629 0.4757208
## channelCh19 -1.4320366 0.6162471
## channelCh3  -1.5257208 0.5225629
## channelCh4  -0.8372997 1.2109839
## channelCh5  -1.3894050 0.6588787
## channelCh6  -1.2809839 0.7672997
## channelCh7  -1.5757208 0.4725629
## channelCh8  -1.5741418 0.4741418

#
ranef(m2)

## $ID
##    (Intercept)
## 1  -1.49820126
## 2   4.62406308
## 3  -0.38071113
## 4  -0.72927649
## 5   0.96568759
## 6   2.27566903
## 7   3.70218575
## 8  -1.60121012
## 9  -1.44617658
## 10  0.14057616
## 11  1.24662086
## 12  0.11560432
## 13 -2.64586570
## 14 -1.24848279
## 15 -1.24744230
## 16  2.16017424
## 17 -0.09977786
## 18 -2.34100107
## 19 -1.99243572
## 
## with conditional variances for "ID"

# residual plot
plot(m2, form = resid(., type = "pearson") ~ fitted(.) | channel, 
     abline = 0, lty = 2, layout = c(4,1), pch = 20, 
     xlab = "Fitted values", ylab = "Residuals")

# variance emg by chan
v_chan <- dta3L %>% 
  group_by(channel) %>% 
  summarise(var_chan = var(EMG)) %>% 
  as.data.frame

## `summarise()` ungrouping output (override with `.groups` argument)

# augument data with variance of emg by channel
dta3L2 <- merge(dta3L, v_chan, by = "channel")
as.data.frame(dta3L2)

##     channel ID   EMG  var_chan
## 1      Ch17  1 -4.15 10.738754
## 2      Ch17  2  5.48 10.738754
## 3      Ch17  3 -0.95 10.738754
## 4      Ch17  4  0.80 10.738754
## 5      Ch17  5  1.49 10.738754
## 6      Ch17  6  4.51 10.738754
## 7      Ch17  7  9.94 10.738754
## 8      Ch17  8 -0.61 10.738754
## 9      Ch17  9  0.00 10.738754
## 10     Ch17 10  1.40 10.738754
## 11     Ch17 11  3.31 10.738754
## 12     Ch17 12  1.46 10.738754
## 13     Ch17 13 -1.40 10.738754
## 14     Ch17 14 -0.12 10.738754
## 15     Ch17 15  0.71 10.738754
## 16     Ch17 16  6.12 10.738754
## 17     Ch17 17  1.20 10.738754
## 18     Ch17 18 -1.77 10.738754
## 19     Ch17 19 -0.77 10.738754
## 20     Ch18  1  2.87  6.769626
## 21     Ch18  2  5.57  6.769626
## 22     Ch18  3  1.74  6.769626
## 23     Ch18  4  0.25  6.769626
## 24     Ch18  5  3.11  6.769626
## 25     Ch18  6  3.24  6.769626
## 26     Ch18  7  1.34  6.769626
## 27     Ch18  8 -0.61  6.769626
## 28     Ch18  9 -1.22  6.769626
## 29     Ch18 10  1.10  6.769626
## 30     Ch18 11  1.20  6.769626
## 31     Ch18 12  2.26  6.769626
## 32     Ch18 13 -1.28  6.769626
## 33     Ch18 14  0.98  6.769626
## 34     Ch18 15  0.35  6.769626
## 35     Ch18 16  2.83  6.769626
## 36     Ch18 17 -0.24  6.769626
## 37     Ch18 18 -0.12  6.769626
## 38     Ch18 19 -7.14  6.769626
## 39     Ch19  1  1.34  5.566882
## 40     Ch19  2  6.33  5.566882
## 41     Ch19  3  0.79  5.566882
## 42     Ch19  4 -0.66  5.566882
## 43     Ch19  5  1.80  5.566882
## 44     Ch19  6  3.99  5.566882
## 45     Ch19  7  1.53  5.566882
## 46     Ch19  8 -0.43  5.566882
## 47     Ch19  9 -0.91  5.566882
## 48     Ch19 10 -0.12  5.566882
## 49     Ch19 11  2.15  5.566882
## 50     Ch19 12  2.01  5.566882
## 51     Ch19 13 -1.59  5.566882
## 52     Ch19 14  0.06  5.566882
## 53     Ch19 15 -1.29  5.566882
## 54     Ch19 16  4.39  5.566882
## 55     Ch19 17  0.36  5.566882
## 56     Ch19 18 -3.66  5.566882
## 57     Ch19 19  2.81  5.566882
## 58      Ch3  1 -3.54  8.645877
## 59      Ch3  2  5.72  8.645877
## 60      Ch3  3  0.52  8.645877
## 61      Ch3  4  0.00  8.645877
## 62      Ch3  5  2.07  8.645877
## 63      Ch3  6  1.67  8.645877
## 64      Ch3  7  9.13  8.645877
## 65      Ch3  8 -0.43  8.645877
## 66      Ch3  9 -0.56  8.645877
## 67      Ch3 10  1.28  8.645877
## 68      Ch3 11  3.21  8.645877
## 69      Ch3 12  1.47  8.645877
## 70      Ch3 13 -2.14  8.645877
## 71      Ch3 14 -1.40  8.645877
## 72      Ch3 15  0.29  8.645877
## 73      Ch3 16  2.58  8.645877
## 74      Ch3 17  0.85  8.645877
## 75      Ch3 18 -1.71  8.645877
## 76      Ch3 19 -1.89  8.645877
## 77      Ch4  1 -3.11 12.348605
## 78      Ch4  2  5.07 12.348605
## 79      Ch4  3 -0.18 12.348605
## 80      Ch4  4  0.74 12.348605
## 81      Ch4  5  0.76 12.348605
## 82      Ch4  6  4.18 12.348605
## 83      Ch4  7 12.92 12.348605
## 84      Ch4  8 -1.59 12.348605
## 85      Ch4  9  0.92 12.348605
## 86      Ch4 10  0.92 12.348605
## 87      Ch4 11  3.41 12.348605
## 88      Ch4 12  2.38 12.348605
## 89      Ch4 13 -2.44 12.348605
## 90      Ch4 14  0.37 12.348605
## 91      Ch4 15  0.31 12.348605
## 92      Ch4 16  3.16 12.348605
## 93      Ch4 17  3.02 12.348605
## 94      Ch4 18 -1.40 12.348605
## 95      Ch4 19  0.76 12.348605
## 96      Ch5  1 -0.24  5.560076
## 97      Ch5  2  6.87  5.560076
## 98      Ch5  3  0.90  5.560076
## 99      Ch5  4  1.10  5.560076
## 100     Ch5  5  3.51  5.560076
## 101     Ch5  6  2.77  5.560076
## 102     Ch5  7  3.44  5.560076
## 103     Ch5  8 -0.31  5.560076
## 104     Ch5  9 -1.22  5.560076
## 105     Ch5 10  1.89  5.560076
## 106     Ch5 11  3.92  5.560076
## 107     Ch5 12  1.22  5.560076
## 108     Ch5 13 -2.01  5.560076
## 109     Ch5 14 -1.10  5.560076
## 110     Ch5 15 -0.13  5.560076
## 111     Ch5 16  2.67  5.560076
## 112     Ch5 17 -0.59  5.560076
## 113     Ch5 18 -1.83  5.560076
## 114     Ch5 19 -1.15  5.560076
## 115     Ch6  1  0.42  5.073637
## 116     Ch6  2  5.96  5.073637
## 117     Ch6  3  0.60  5.073637
## 118     Ch6  4  0.13  5.073637
## 119     Ch6  5  0.60  5.073637
## 120     Ch6  6  4.55  5.073637
## 121     Ch6  7  4.80  5.073637
## 122     Ch6  8 -0.61  5.073637
## 123     Ch6  9  0.67  5.073637
## 124     Ch6 10  1.77  5.073637
## 125     Ch6 11  0.85  5.073637
## 126     Ch6 12  1.71  5.073637
## 127     Ch6 13 -1.95  5.073637
## 128     Ch6 14  0.18  5.073637
## 129     Ch6 15 -0.89  5.073637
## 130     Ch6 16  3.77  5.073637
## 131     Ch6 17  2.17  5.073637
## 132     Ch6 18 -2.01  5.073637
## 133     Ch6 19 -0.95  5.073637
## 134     Ch7  1 -0.49  5.181343
## 135     Ch7  2  8.20  5.181343
## 136     Ch7  3  1.27  5.181343
## 137     Ch7  4  0.19  5.181343
## 138     Ch7  5  3.71  5.181343
## 139     Ch7  6  1.80  5.181343
## 140     Ch7  7  0.48  5.181343
## 141     Ch7  8 -1.04  5.181343
## 142     Ch7  9 -0.97  5.181343
## 143     Ch7 10  1.83  5.181343
## 144     Ch7 11  2.77  5.181343
## 145     Ch7 12 -0.25  5.181343
## 146     Ch7 13  0.31  5.181343
## 147     Ch7 14 -1.71  5.181343
## 148     Ch7 15 -0.65  5.181343
## 149     Ch7 16  1.26  5.181343
## 150     Ch7 17  0.65  5.181343
## 151     Ch7 18  0.24  5.181343
## 152     Ch7 19 -1.43  5.181343
## 153     Ch8  1  2.13  4.281098
## 154     Ch8  2  4.87  4.281098
## 155     Ch8  3  1.28  4.281098
## 156     Ch8  4  0.07  4.281098
## 157     Ch8  5  1.86  4.281098
## 158     Ch8  6  4.79  4.281098
## 159     Ch8  7  1.63  4.281098
## 160     Ch8  8 -0.13  4.281098
## 161     Ch8  9 -0.98  4.281098
## 162     Ch8 10  0.91  4.281098
## 163     Ch8 11  0.79  4.281098
## 164     Ch8 12 -1.52  4.281098
## 165     Ch8 13 -3.30  4.281098
## 166     Ch8 14  0.37  4.281098
## 167     Ch8 15 -1.06  4.281098
## 168     Ch8 16  3.61  4.281098
## 169     Ch8 17  1.25  4.281098
## 170     Ch8 18 -0.61  4.281098
## 171     Ch8 19  0.24  4.281098

# weighted fit
summary(m3 <- lmer(EMG ~ channel + (1 | ID), data = dta3L2,
                   weights = 1/var_chan))

## Linear mixed model fit by REML ['lmerMod']
## Formula: EMG ~ channel + (1 | ID)
##    Data: dta3L2
## Weights: 1/var_chan
## 
## REML criterion at convergence: 683.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8607 -0.4157 -0.0211  0.4295  4.0287 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  ID       (Intercept) 3.9833   1.9958  
##  Residual             0.3712   0.6093  
## Number of obs: 171, groups:  ID, 19
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   1.4026     0.6477   2.166
## channelCh18  -0.5484     0.5849  -0.938
## channelCh19  -0.4079     0.5644  -0.723
## channelCh3   -0.5016     0.6154  -0.815
## channelCh4    0.1868     0.6716   0.278
## channelCh5   -0.3653     0.5643  -0.647
## channelCh6   -0.2568     0.5558  -0.462
## channelCh7   -0.5516     0.5577  -0.989
## channelCh8   -0.5500     0.5417  -1.015
## 
## Correlation of Fixed Effects:
##             (Intr) chnC18 chnC19 chnnC3 chnnC4 chnnC5 chnnC6 chnnC7
## channelCh18 -0.554                                                 
## channelCh19 -0.574  0.636                                          
## channelCh3  -0.526  0.583  0.604                                   
## channelCh4  -0.482  0.534  0.553  0.508                            
## channelCh5  -0.574  0.636  0.659  0.604  0.554                     
## channelCh6  -0.583  0.645  0.669  0.613  0.562  0.669              
## channelCh7  -0.581  0.643  0.667  0.611  0.560  0.667  0.677       
## channelCh8  -0.598  0.662  0.686  0.629  0.577  0.686  0.697  0.694

# residual plot
plot(m3, form = resid(., type = "pearson") ~ fitted(.) | channel, 
     abline = 0, lty = 2, layout = c(4,1), pch = 20,
     xlab = "Fitted values", ylab = "Residuals")

# residual plots - combined and compared
# dta3L_m3 <- fortify(m3, dta3L2) %>% select() %>% mutate(fit3 = .fitted, res3 = .scresid) %>% select(- contains(".")) 

dta3L_m3 <- dta3L2 %>% 
  mutate(fit3 = fitted(m3),res3 = residuals(m3))

# dta3L_m2 <- fortify(m2, dta3L) %>% select(1:4, 6) %>% mutate( fit2 = .fitted, res2 = .scresid) %>% select( - contains("."))

dta3L_m2 <- dta3L %>% 
  mutate(fit2 = fitted(m2),res2 = residuals(m2))

dta3_2m <- merge(dta3L_m2, dta3L_m3, by = c("ID","channel", "EMG"))
#
ggplot(dta3_2m, aes(fit2, res2)) +
 geom_point(alpha = .3) +
 geom_point(aes(fit3, res3), alpha = .6) +
 geom_hline(yintercept = 0, linetype="dashed") +
 facet_grid(. ~ channel) +
 labs(x = "Fitted values", y = "Standardized Residuals")

4 Homework-4

Nested designs

4.1 Problem

Conduct an analysis of the cognitive task example and comment on the results produced by each code chunk.
data info: An experiment on a cognitive task was conducted. The design involved 4 classes nested within two methods of instruction. Thirty-two subjects were randomly assigned to one of the eight conditions. The score was the number of correct responses on the post-treatment test.

4.2 Setting and input data

input data

dta4 <- read.table("cognitive_task.txt", h=T)
str(dta4)

## 'data.frame':    32 obs. of  5 variables:
##  $ ID    : chr  "S01" "S02" "S03" "S04" ...
##  $ Score : int  3 6 3 3 1 2 2 2 5 6 ...
##  $ Method: chr  "I1" "I1" "I1" "I1" ...
##  $ Class : chr  "C1" "C1" "C1" "C1" ...
##  $ Klass : chr  "K1" "K1" "K1" "K1" ...

head(dta4)

##    ID Score Method Class Klass
## 1 S01     3     I1    C1    K1
## 2 S02     6     I1    C1    K1
## 3 S03     3     I1    C1    K1
## 4 S04     3     I1    C1    K1
## 5 S05     1     I1    C2    K2
## 6 S06     2     I1    C2    K2

4.3 Plot

4.3.1 Variability Chart for Hierarchical Models.

VCA::varPlot(Score ~ Method/Class/ID, 
             Data=dta4,
             keep.order = T, 
             Points = list(pch=1, cex=1, col="gray"),
             YLabel=list(text="Score", side=2, cex=1),
             MeanLine=list(var=c("Method", "Class"),
                           col=c("deepskyblue4", "salmon"), 
                           lwd=c(1, 2)))

## Warning in min(x): min 中沒有無漏失的引數; 回傳 Inf

## Warning in max(x): max 中沒有無漏失的引數；回傳 -Inf

32 children(S01-S32) were randomly assigned to 4 classes(each for 4 children) with 2 intervention(I1/I2)
MeanLine (deepskyblue4: 各介入措施組別治療組的平均值, coral: 各班的平均值), Score = (後測-前測)
mean score I2 > I1

4.4 Data analysis

4.4.1 The GLM Procedure

# Class as the random effects with no constraint on other variables. 
m0 <- aov(Score ~ Method + Error(Class), data=dta4)
summary(m0)

## 
## Error: Class
##           Df Sum Sq Mean Sq F value Pr(>F)  
## Method     1  112.5  112.50   6.459  0.044 *
## Residuals  6  104.5   17.42                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 24   18.5  0.7708

# A random effect from Method but then fix Class with Class nested within Method levels versus which is just taking Method as the random effects with no constraint on other variables. 
m01 <- aov(Score ~ Method + Error(Method / Klass), data=dta4)
summary(m01)

## 
## Error: Method
##        Df Sum Sq Mean Sq
## Method  1  112.5   112.5
## 
## Error: Method:Klass
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  6  104.5   17.42               
## 
## Error: Within
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 24   18.5  0.7708

m0 result is same as m01

4.4.2 Random effects (random effect term: Class)

# random effect term: Class
m1 <- lme4::lmer(Score ~ Method + (1 | Class), data=dta4)
summary(m1)

## Linear mixed model fit by REML ['lmerMod']
## Formula: Score ~ Method + (1 | Class)
##    Data: dta4
## 
## REML criterion at convergence: 101.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3028 -0.8416 -0.0126  0.3150  2.5753 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Class    (Intercept) 4.1615   2.040   
##  Residual             0.7708   0.878   
## Number of obs: 32, groups:  Class, 8
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    3.500      1.043   3.355
## MethodI2       3.750      1.475   2.542
## 
## Correlation of Fixed Effects:
##          (Intr)
## MethodI2 -0.707

# estimated deviation
lme4::ranef(m1)

## $Class
##    (Intercept)
## C1   0.2389354
## C2  -1.6725478
## C3   1.9114833
## C4  -0.4778708
## C5  -0.2389354
## C6  -3.1061603
## C7   0.7168062
## C8   2.6282895
## 
## with conditional variances for "Class"

# random effect term: Method, fix Class with Class nested within Method levels
m11 <- lme4::lmer(Score ~ Method + (1 | Method:Klass), data=dta4)
summary(m11)

## Linear mixed model fit by REML ['lmerMod']
## Formula: Score ~ Method + (1 | Method:Klass)
##    Data: dta4
## 
## REML criterion at convergence: 101.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3028 -0.8416 -0.0126  0.3150  2.5753 
## 
## Random effects:
##  Groups       Name        Variance Std.Dev.
##  Method:Klass (Intercept) 4.1615   2.040   
##  Residual                 0.7708   0.878   
## Number of obs: 32, groups:  Method:Klass, 8
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)    3.500      1.043   3.355
## MethodI2       3.750      1.475   2.542
## 
## Correlation of Fixed Effects:
##          (Intr)
## MethodI2 -0.707

lme4::ranef(m11)

## $`Method:Klass`
##       (Intercept)
## I1:K1   0.2389354
## I1:K2  -1.6725478
## I1:K3   1.9114833
## I1:K4  -0.4778708
## I2:K1  -0.2389354
## I2:K2  -3.1061603
## I2:K3   0.7168062
## I2:K4   2.6282895
## 
## with conditional variances for "Method:Klass"

# computes confidence intervals for parameters in m1 by bootstrap.
confint(m1, oldNames=F, method="boot")

## Computing bootstrap confidence intervals ...

##                          2.5 %   97.5 %
## sd_(Intercept)|Class 0.7871557 3.131486
## sigma                0.6281750 1.147624
## (Intercept)          1.4499874 5.812490
## MethodI2             1.0244681 7.018932

Estimate coefficients of Method I1 (3.500) < Method I2 (7.250)

Homework: Analysis of variance

Chang-Chun Chen

2020-10-03