Coursera Statistics One Week 6 Assignment 6

We now return to cognitive training. Suppose we conducted a training experiment in which subjects were randomly assigned to
ONE (1) of THREE (3) conditions:

(a) Working Memory training (WM)

(b) Physical Exercise training (PE)

(c) Designed Sports training (DS)

Further assume that we measured spatial reasoning ability BEFORE and AFTER training, using TWO (2) separate measures:

(a) SR1 (pretraining)

(b) SR2 (posttraining)

Fictional data are available in the file: DAA.05.txt. Write an R script to answer the following questions (the main analysis
should be a 3x2 mixed factorial ANOVA).

Round to TWO (2) significant digits (for example, if the correlation is .456 then write .46).

library(psych)
library(car)

## Loading required package: MASS

## Loading required package: nnet

## Attaching package: 'car'

## The following object(s) are masked from 'package:psych':
## 
## logit


#
# |------------------------------------------------------------------------------------------|
# | I N I T I A L I Z A T I O N |
# |------------------------------------------------------------------------------------------|
Init <- function(fileStr, workDirStr = "C:/Users/denbrige/100 FxOption/103 FxOptionVerBack/080 Fx Git/R-nonsource") {
    setwd(workDirStr)
    # header = TRUE: the first line of the file holds the column names
    retDfr <- read.table(fileStr, header = TRUE)
    return(retDfr)
}

#
# |------------------------------------------------------------------------------------------|
# | I N T E R N A L F U N C T I O N S |
# |------------------------------------------------------------------------------------------|
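# Effect size for each term of an aov() model fitted with an Error() term:
# SS_effect / (SS_effect + SS_error), using the error term of the same stratum
# (i.e. partial eta-squared).
eta.2 = function(aov.mdl, ret.labels = FALSE) {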
    eta.2vector = c()
    labels = c()
    for (table in summary(aov.mdl)) {
        #each block of factors
        SS.vector = table[[1]]$"Sum Sq"  #table is a list with 1 entry, but you have to use [[1]] anyway
        last = length(SS.vector)
        labels = c(labels, row.names(table[[1]])[-last])  #all but last (error term)
        for (SS in SS.vector[-last]) {
            #all but last entry (error term)
            new.etaval = SS/(SS + SS.vector[last])
            eta.2vector = c(eta.2vector, new.etaval)
        }
    }
    if (ret.labels) 
        return(data.frame(eta.2 = eta.2vector, row.names = labels))
    return(eta.2vector)
}

#
# |------------------------------------------------------------------------------------------|
# | M A I N P R O C E D U R E |
# |------------------------------------------------------------------------------------------|
# --- Init loading raw data
rawDfr <- Init("DAA.05.txt")
# --- Count of raw data
nrow(rawDfr)

## [1] 96

# --- Names of header
names(rawDfr)

## [1] "subject"   "condition" "time"      "SR"

# --- Peek at data
head(rawDfr)

##   subject condition time SR
## 1       1        WM  pre 11
## 2       2        WM  pre 13
## 3       3        WM  pre 16
## 4       4        WM  pre 11
## 5       5        WM  pre  8
## 6       6        WM  pre 15


# --- Omnibus analysis: a THREE(3)xTWO(2) mixed factorial ANOVA with
# condition (3 levels, between subjects) and time (2 levels, within
# subjects) as the independent variables and spatial reasoning ability
# (SR) as the dependent variable.  The THREE (3) levels of condition are
# Working Memory training (WM), Physical Exercise training (PE), and
# Designed Sports training (DS).  The TWO (2) levels of time are PRE and
# POST training. --- The Error(subject/time) term tells aov() that time is
# a repeated measure, i.e. EACH subject is measured at both PRE and POST
# training, so time effects are tested against within-subject error.
timeFtr <- factor(rawDfr$time, levels = c("pre", "post"))
mainAov = aov(rawDfr$SR ~ (rawDfr$condition * rawDfr$time) + Error(factor(rawDfr$subject)/rawDfr$time))
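
As a rough check on the design just described, the degrees of freedom that a 3x2 mixed factorial should produce can be worked
out by hand. The sketch below assumes 48 subjects (96 rows divided by 2 time points, an inference from the row count rather than
something stated in the assignment); all variable names are illustrative.

```r
# Hypothetical design constants, inferred from the 96-row data file
n_subjects   <- 48   # assumption: 96 rows / 2 time points
n_conditions <- 3    # WM, PE, DS (between-subjects factor)
n_times      <- 2    # pre, post (within-subjects factor)

# Between-subjects stratum
df_condition   <- n_conditions - 1
df_between_err <- n_subjects - n_conditions

# Within-subjects stratum
df_time        <- n_times - 1
df_interaction <- df_condition * df_time
df_within_err  <- df_between_err * df_time

df_total <- df_condition + df_between_err + df_time + df_interaction + df_within_err

c(condition = df_condition, between_error = df_between_err, time = df_time,
  interaction = df_interaction, within_error = df_within_err, total = df_total)
```

If the assumption holds, these values should agree with the Df column of the ANOVA summary, and the total should equal the
number of rows minus one.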

Main effect (Omnibus ANOVA Analysis)

The main effect of 'condition' is not statistically significant, F(2, 45) = 1.55, p = 0.22, so there is no evidence that the three
training groups differ in spatial reasoning ability when scores are averaged over time. In contrast, both the main effect of 'time',
F(1, 45) = 39.5, p < 0.001, and the 'condition*time' interaction, F(2, 45) = 11.5, p < 0.001, are significant. Scores therefore
differ between PRE and POST training, and the size of that difference depends on the training condition. The omnibus test does not
show the direction of the change or which conditions drive the interaction; the simple effects analyses and the bar chart do.

summary(mainAov)

## 
## Error: factor(rawDfr$subject)
##                  Df Sum Sq Mean Sq F value Pr(>F)
## rawDfr$condition  2     33    16.6    1.55   0.22
## Residuals        45    483    10.7               
## 
## Error: factor(rawDfr$subject):rawDfr$time
##                              Df Sum Sq Mean Sq F value  Pr(>F)    
## rawDfr$time                   1   38.8    38.8    39.5 1.2e-07 ***
## rawDfr$condition:rawDfr$time  2   22.6    11.3    11.5 9.2e-05 ***
## Residuals                    45   44.2     1.0                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
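
To make the table easier to read, the F ratios and p-values can be recomputed by hand from the printed mean squares. The sketch
below copies the rounded values from the summary above, so small rounding differences are expected; the variable names are
illustrative and not part of the original script.

```r
# Values copied (already rounded) from the summary(mainAov) table
ms_condition   <- 16.6        # between-subjects effect
ms_between_err <- 10.7        # between-subjects residual mean square
ms_time        <- 38.8        # within-subjects effect
ms_interaction <- 11.3        # within-subjects effect
ms_within_err  <- 44.2 / 45   # within-subjects residual: Sum Sq / Df

# Each effect is tested against the error term of its own stratum
f_condition   <- ms_condition   / ms_between_err
f_time        <- ms_time        / ms_within_err
f_interaction <- ms_interaction / ms_within_err

# Upper-tail probabilities of the F distribution
p_condition   <- pf(f_condition,   df1 = 2, df2 = 45, lower.tail = FALSE)
p_time        <- pf(f_time,        df1 = 1, df2 = 45, lower.tail = FALSE)
p_interaction <- pf(f_interaction, df1 = 2, df2 = 45, lower.tail = FALSE)

data.frame(F = c(f_condition, f_time, f_interaction),
           p = c(p_condition, p_time, p_interaction),
           row.names = c("condition", "time", "condition:time"))
```

The point of the exercise is that 'condition' is compared against between-subject variability, while 'time' and the interaction
are compared against the much smaller within-subject variability.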

Proportion of Variance Explained (partial eta-squared)

eta.2(mainAov, ret.labels = TRUE)

##                                eta.2
## rawDfr$condition             0.06442
## rawDfr$time                  0.46746
## rawDfr$condition:rawDfr$time 0.33838
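
The eta.2() helper divides each effect's sum of squares by that sum plus the residual sum of squares of the same error stratum.
The short sketch below repeats the arithmetic with the rounded Sum Sq values printed in the ANOVA summary; partial_eta2 is a
hypothetical helper introduced only for illustration, and the results differ slightly from the table because of rounding.

```r
# Partial eta-squared: effect SS relative to effect SS plus its own error SS
partial_eta2 <- function(ss_effect, ss_error) ss_effect / (ss_effect + ss_error)

# Rounded sums of squares copied from summary(mainAov)
eta_condition   <- partial_eta2(33,   483)    # between-subjects stratum
eta_time        <- partial_eta2(38.8, 44.2)   # within-subjects stratum
eta_interaction <- partial_eta2(22.6, 44.2)   # within-subjects stratum

round(c(condition = eta_condition, time = eta_time,
        interaction = eta_interaction), 3)
```

Because each value uses the error term of its own stratum, the three numbers are not shares of one total and need not sum to 1.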

Levene's Test

Levene's test is not significant, F(2, 93) = 0.62, p = 0.54, so there is no evidence against homogeneity of variance: the three
conditions show comparable variability in SR scores.

# --- Levene's test for homogeneity of variance. A significant result (large
# F-value, p < 0.05) would indicate that the assumption is violated.
leveneTest(rawDfr$SR, rawDfr$condition, center = "mean")

## Levene's Test for Homogeneity of Variance (center = "mean")
##       Df F value Pr(>F)
## group  2    0.62   0.54
##       93
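
For intuition, Levene's test with center = "mean" amounts to a one-way ANOVA on the absolute deviations of each score from its
group mean. The sketch below shows this in base R on simulated scores; the data are made up for illustration and are not the
contents of DAA.05.txt.

```r
set.seed(123)

# Simulated stand-in for the real data: 3 groups of 32 scores (hypothetical values)
condition <- factor(rep(c("WM", "PE", "DS"), each = 32))
score     <- rnorm(96, mean = 12, sd = 3)

# Absolute deviation of every score from the mean of its own group
# (ave() returns the group mean for each observation by default)
abs_dev <- abs(score - ave(score, condition))

# One-way ANOVA on the deviations: a large F would mean the groups differ in spread
levene_tab <- anova(lm(abs_dev ~ condition))
levene_tab
```

With 96 observations in 3 groups the table has 2 and 93 degrees of freedom, the same layout as the leveneTest() output above.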

Simple Effect (ONE(1)-way ANOVA Analysis)

We test EACH condition (WM, PE, DS) separately.

For WM and DS, the effect of 'time' is significant (WM: F(1, 15) = 19.3, p < 0.001; DS: F(1, 15) = 32.5, p < 0.001). Spatial
reasoning scores therefore differ between PRE and POST training in these two conditions, which is consistent with training improving
spatial reasoning ability provided the POST means are the higher ones (see the bar chart).

For PE, in contrast, the effect of 'time' is not significant, F(1, 15) = 0.04, p = 0.84, so there is no evidence that physical
exercise training changed spatial reasoning ability.

# --- Simple effects analysis for EACH simple condition WM, PE, DS
simpleAov = aov(rawDfr$SR[rawDfr$condition == "WM"] ~ rawDfr$time[rawDfr$condition == 
    "WM"] + Error(factor(rawDfr$subject[rawDfr$condition == "WM"])/rawDfr$time[rawDfr$condition == 
    "WM"]))
summary(simpleAov)

## 
## Error: factor(rawDfr$subject[rawDfr$condition == "WM"])
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 15    133    8.88               
## 
## Error: factor(rawDfr$subject[rawDfr$condition == "WM"]):rawDfr$time[rawDfr$condition == "WM"]
##                                       Df Sum Sq Mean Sq F value  Pr(>F)
## rawDfr$time[rawDfr$condition == "WM"]  1   13.8   13.78    19.3 0.00053
## Residuals                             15   10.7    0.71                
##                                          
## rawDfr$time[rawDfr$condition == "WM"] ***
## Residuals                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(simpleAov, ret.labels = TRUE)

##                                        eta.2
## rawDfr$time[rawDfr$condition == "WM"] 0.5625


simpleAov = aov(rawDfr$SR[rawDfr$condition == "PE"] ~ rawDfr$time[rawDfr$condition == 
    "PE"] + Error(factor(rawDfr$subject[rawDfr$condition == "PE"])/rawDfr$time[rawDfr$condition == 
    "PE"]))
summary(simpleAov)

## 
## Error: factor(rawDfr$subject[rawDfr$condition == "PE"])
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 15    169    11.3               
## 
## Error: factor(rawDfr$subject[rawDfr$condition == "PE"]):rawDfr$time[rawDfr$condition == "PE"]
##                                       Df Sum Sq Mean Sq F value Pr(>F)
## rawDfr$time[rawDfr$condition == "PE"]  1   0.03   0.031    0.04   0.84
## Residuals                             15  11.47   0.765

eta.2(simpleAov, ret.labels = TRUE)

##                                          eta.2
## rawDfr$time[rawDfr$condition == "PE"] 0.002717


simpleAov = aov(rawDfr$SR[rawDfr$condition == "DS"] ~ rawDfr$time[rawDfr$condition == 
    "DS"] + Error(factor(rawDfr$subject[rawDfr$condition == "DS"])/rawDfr$time[rawDfr$condition == 
    "DS"]))
summary(simpleAov)

## 
## Error: factor(rawDfr$subject[rawDfr$condition == "DS"])
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 15    180      12               
## 
## Error: factor(rawDfr$subject[rawDfr$condition == "DS"]):rawDfr$time[rawDfr$condition == "DS"]
##                                       Df Sum Sq Mean Sq F value  Pr(>F)
## rawDfr$time[rawDfr$condition == "DS"]  1   47.5    47.5    32.5 4.2e-05
## Residuals                             15   22.0     1.5                
##                                          
## rawDfr$time[rawDfr$condition == "DS"] ***
## Residuals                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(simpleAov, ret.labels = TRUE)

##                                        eta.2
## rawDfr$time[rawDfr$condition == "DS"] 0.6839

Simple Effect (TWO(2)xTWO(2) Mixed Factorial ANOVA Analysis)

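We run a TWO(2)xTWO(2) Mixed Factorial ANOVA on the PE and DS groups only, i.e. excluding the group WM.

The pattern matches the Main effect (Omnibus ANOVA Analysis): 'condition' is not significant, F(1, 30) = 2.6, p = 0.12, while
'time', F(1, 30) = 22.4, p < 0.001, and the 'condition*time' interaction, F(1, 30) = 20.2, p < 0.001, are both significant.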

# --- Simple effects analysis for PE vs DS only, WM excluded (this is a 2x2
# mixed factorial)
complexAov = aov(rawDfr$SR[rawDfr$condition != "WM"] ~ rawDfr$condition[rawDfr$condition != 
    "WM"] * rawDfr$time[rawDfr$condition != "WM"] + Error(factor(rawDfr$subject[rawDfr$condition != 
    "WM"])/rawDfr$time[rawDfr$condition != "WM"]))
summary(complexAov)

## 
## Error: factor(rawDfr$subject[rawDfr$condition != "WM"])
##                                            Df Sum Sq Mean Sq F value
## rawDfr$condition[rawDfr$condition != "WM"]  1     30    30.2     2.6
## Residuals                                  30    350    11.7        
##                                            Pr(>F)
## rawDfr$condition[rawDfr$condition != "WM"]   0.12
## Residuals                                        
## 
## Error: factor(rawDfr$subject[rawDfr$condition != "WM"]):rawDfr$time[rawDfr$condition != "WM"]
##                                                                                  Df
## rawDfr$time[rawDfr$condition != "WM"]                                             1
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"]  1
## Residuals                                                                        30
##                                                                                  Sum Sq
## rawDfr$time[rawDfr$condition != "WM"]                                              25.0
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"]   22.6
## Residuals                                                                          33.4
##                                                                                  Mean Sq
## rawDfr$time[rawDfr$condition != "WM"]                                              25.00
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"]   22.56
## Residuals                                                                           1.11
##                                                                                  F value
## rawDfr$time[rawDfr$condition != "WM"]                                               22.4
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"]    20.2
## Residuals                                                                               
##                                                                                   Pr(>F)
## rawDfr$time[rawDfr$condition != "WM"]                                            4.9e-05
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"] 9.5e-05
## Residuals                                                                               
##                                                                                     
## rawDfr$time[rawDfr$condition != "WM"]                                            ***
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"] ***
## Residuals                                                                           
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(complexAov, ret.labels = TRUE)

##                                                                                    eta.2
## rawDfr$condition[rawDfr$condition != "WM"]                                       0.07962
## rawDfr$time[rawDfr$condition != "WM"]                                            0.42781
## rawDfr$condition[rawDfr$condition != "WM"]:rawDfr$time[rawDfr$condition != "WM"] 0.40290


# --- Graph Bar plot. Group by timeFtr (levels "pre", "post") so that the rows
# come out in PRE, POST order and match the row names assigned below; grouping
# by rawDfr$time would sort the levels alphabetically ("post" before "pre").
wspan = describeBy(rawDfr$SR[rawDfr$condition == "WM"], group = timeFtr[rawDfr$condition == 
    "WM"], mat = TRUE)

rspan = describeBy(rawDfr$SR[rawDfr$condition == "PE"], group = timeFtr[rawDfr$condition == 
    "PE"], mat = TRUE)

sspan = describeBy(rawDfr$SR[rawDfr$condition == "DS"], group = timeFtr[rawDfr$condition == 
    "DS"], mat = TRUE)

graphme = cbind(WorkingMemory = wspan$mean, PhysicalExercise = rspan$mean, DesignedSports = sspan$mean)
rownames(graphme) = c("PRE-training", "POST-training")
se = cbind(wspan$se, rspan$se, sspan$se)

Bar chart of results

bp = barplot(graphme, beside = TRUE, space = c(0, 0.5), ylim = c(0, 20), ylab = "Spatial Reasoning Ability", 
    legend.text = TRUE, args.legend = c(x = "topright"))
abline(h = 0)
for (ii in 1:3) {
    arrows(bp[1, ii], graphme[1, ii] - se[1, ii], y1 = graphme[1, ii] + se[1, 
        ii], angle = 90, code = 3)
    arrows(bp[2, ii], graphme[2, ii] - se[2, ii], y1 = graphme[2, ii] + se[2, 
        ii], angle = 90, code = 3)
}
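
The cell means and standard errors behind such a chart can also be computed in base R, which makes the row order explicit. The
sketch below uses simulated scores (hypothetical values, same column names as the real file) and sets the factor levels by hand
so that PRE comes before POST.

```r
set.seed(11)

# Simulated data: 3 conditions x 2 time points x 16 subjects (hypothetical values)
sim <- data.frame(
    condition = factor(rep(c("WM", "PE", "DS"), each = 32), levels = c("WM", "PE", "DS")),
    time      = factor(rep(rep(c("pre", "post"), each = 16), times = 3),
                       levels = c("pre", "post")),
    SR        = rnorm(96, mean = 12, sd = 3)
)

# 2 x 3 matrices: rows follow the levels of time, columns the levels of condition
cell_mean <- tapply(sim$SR, list(sim$time, sim$condition), mean)
cell_se   <- tapply(sim$SR, list(sim$time, sim$condition),
                    function(x) sd(x) / sqrt(length(x)))   # standard error of the mean

cell_mean
cell_se
```

Both matrices have the shape barplot(..., beside = TRUE) expects, and checking rownames(cell_mean) before labelling the bars
guards against swapped PRE and POST labels.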

[Figure: bar chart of mean spatial reasoning ability, PRE vs POST training, for each condition, with standard-error bars]

Phonological Similarity Effects in Simple and Complex Span Tasks (Part TWO)

The analysis for this part should be done as an exercise.

#
# |------------------------------------------------------------------------------------------|
# | P A R T T W O P R O C E D U R E |
# |------------------------------------------------------------------------------------------|
e1sr <- Init("STATS1.EX.08.txt")
nrow(e1sr)

## [1] 122

names(e1sr)

## [1] "task"    "recall"  "subject" "stim"

head(e1sr)

##   task recall subject stim
## 1    R   0.48       1    S
## 2    R   0.65       2    S
## 3    R   0.74       3    S
## 4    R   0.83       4    S
## 5    R   0.35       5    S
## 6    R   0.78       6    S


stim = factor(e1sr$stim, levels = c("S", "D"))  #reverse levels (for graphs like the article)
aov.e1sr = aov(e1sr$recall ~ (e1sr$task * e1sr$stim) + Error(factor(e1sr$subject)/e1sr$stim))
summary(aov.e1sr)

## 
## Error: factor(e1sr$subject)
##           Df Sum Sq Mean Sq F value  Pr(>F)    
## e1sr$task  2  0.738   0.369    10.5 0.00012 ***
## Residuals 58  2.031   0.035                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Error: factor(e1sr$subject):e1sr$stim
##                     Df Sum Sq Mean Sq F value  Pr(>F)    
## e1sr$stim            1  0.016  0.0161    1.96    0.17    
## e1sr$task:e1sr$stim  2  0.372  0.1858   22.62 5.5e-08 ***
## Residuals           58  0.476  0.0082                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(aov.e1sr, ret.labels = TRUE)

##                       eta.2
## e1sr$task           0.26664
## e1sr$stim           0.03262
## e1sr$task:e1sr$stim 0.43825


# Levene's test
leveneTest(e1sr$recall, e1sr$task, center = "mean")

## Levene's Test for Homogeneity of Variance (center = "mean")
##        Df F value Pr(>F)  
## group   2    3.43  0.036 *
##       119                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


# Simple effects analysis for simple span (i.e., word span) and for the other
# two tasks
aov.e1srw = aov(e1sr$recall[e1sr$task == "W"] ~ e1sr$stim[e1sr$task == "W"] + 
    Error(factor(e1sr$subject[e1sr$task == "W"])/e1sr$stim[e1sr$task == "W"]))
summary(aov.e1srw)

## 
## Error: factor(e1sr$subject[e1sr$task == "W"])
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 19   0.34  0.0179               
## 
## Error: factor(e1sr$subject[e1sr$task == "W"]):e1sr$stim[e1sr$task == "W"]
##                             Df Sum Sq Mean Sq F value  Pr(>F)    
## e1sr$stim[e1sr$task == "W"]  1  0.328   0.328    78.8 3.5e-08 ***
## Residuals                   19  0.079   0.004                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(aov.e1srw, ret.labels = TRUE)

##                              eta.2
## e1sr$stim[e1sr$task == "W"] 0.8057


aov.e1srw = aov(e1sr$recall[e1sr$task == "R"] ~ e1sr$stim[e1sr$task == "R"] + 
    Error(factor(e1sr$subject[e1sr$task == "R"])/e1sr$stim[e1sr$task == "R"]))
summary(aov.e1srw)

## 
## Error: factor(e1sr$subject[e1sr$task == "R"])
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 19  0.582  0.0306               
## 
## Error: factor(e1sr$subject[e1sr$task == "R"]):e1sr$stim[e1sr$task == "R"]
##                             Df Sum Sq Mean Sq F value Pr(>F)  
## e1sr$stim[e1sr$task == "R"]  1 0.0292 0.02916    3.71  0.069 .
## Residuals                   19 0.1492 0.00785                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(aov.e1srw, ret.labels = TRUE)

##                              eta.2
## e1sr$stim[e1sr$task == "R"] 0.1635


aov.e1srw = aov(e1sr$recall[e1sr$task == "S"] ~ e1sr$stim[e1sr$task == "S"] + 
    Error(factor(e1sr$subject[e1sr$task == "S"])/e1sr$stim[e1sr$task == "S"]))
summary(aov.e1srw)

## 
## Error: factor(e1sr$subject[e1sr$task == "S"])
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 20   1.11  0.0554               
## 
## Error: factor(e1sr$subject[e1sr$task == "S"]):e1sr$stim[e1sr$task == "S"]
##                             Df Sum Sq Mean Sq F value Pr(>F)
## e1sr$stim[e1sr$task == "S"]  1 0.0309  0.0309    2.49   0.13
## Residuals                   20 0.2482  0.0124

eta.2(aov.e1srw, ret.labels = TRUE)

##                              eta.2
## e1sr$stim[e1sr$task == "S"] 0.1109


# Simple effects analysis for complex span (this is a 2x2 mixed factorial)
aov.e1srnw = aov(e1sr$recall[e1sr$task != "W"] ~ e1sr$task[e1sr$task != "W"] * 
    e1sr$stim[e1sr$task != "W"] + Error(factor(e1sr$subject[e1sr$task != "W"])/e1sr$stim[e1sr$task != 
    "W"]))
summary(aov.e1srnw)

## 
## Error: factor(e1sr$subject[e1sr$task != "W"])
##                             Df Sum Sq Mean Sq F value Pr(>F)
## e1sr$task[e1sr$task != "W"]  1  0.112  0.1118    2.58   0.12
## Residuals                   39  1.691  0.0433               
## 
## Error: factor(e1sr$subject[e1sr$task != "W"]):e1sr$stim[e1sr$task != "W"]
##                                                         Df Sum Sq Mean Sq
## e1sr$stim[e1sr$task != "W"]                              1  0.060  0.0601
## e1sr$task[e1sr$task != "W"]:e1sr$stim[e1sr$task != "W"]  1  0.000  0.0000
## Residuals                                               39  0.397  0.0102
##                                                         F value Pr(>F)  
## e1sr$stim[e1sr$task != "W"]                                 5.9   0.02 *
## e1sr$task[e1sr$task != "W"]:e1sr$stim[e1sr$task != "W"]     0.0   0.99  
## Residuals                                                               
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

eta.2(aov.e1srnw, ret.labels = TRUE)

##                                                             eta.2
## e1sr$task[e1sr$task != "W"]                             6.201e-02
## e1sr$stim[e1sr$task != "W"]                             1.314e-01
## e1sr$task[e1sr$task != "W"]:e1sr$stim[e1sr$task != "W"] 1.052e-06


####### Graph Bar plot. Group by stim (levels "S", "D") so that the rows come out
# in Similar, Dissimilar order and match the row names assigned below.
wspan = describeBy(e1sr$recall[e1sr$task == "W"], group = stim[e1sr$task == 
    "W"], mat = TRUE)

rspan = describeBy(e1sr$recall[e1sr$task == "R"], group = stim[e1sr$task == 
    "R"], mat = TRUE)

sspan = describeBy(e1sr$recall[e1sr$task == "S"], group = stim[e1sr$task == 
    "S"], mat = TRUE)

graphme = cbind(Words = wspan$mean, Sentences = rspan$mean, Stories = sspan$mean)
rownames(graphme) = c("Phonologically Similar", "Phonologically Dissimilar")
se = cbind(wspan$se, rspan$se, sspan$se)

bp = barplot(graphme, beside = TRUE, space = c(0, 0.5), ylim = c(0, 1), ylab = "Proportion recalled", 
    legend.text = TRUE, args.legend = c(x = "topright"))
abline(h = 0)
for (ii in 1:3) {
    arrows(bp[1, ii], graphme[1, ii] - se[1, ii], y1 = graphme[1, ii] + se[1, 
        ii], angle = 90, code = 3)
    arrows(bp[2, ii], graphme[2, ii] - se[2, ii], y1 = graphme[2, ii] + se[2, 
        ii], angle = 90, code = 3)
}

[Figure: bar chart of mean recall for phonologically similar vs dissimilar stimuli in each task, with standard-error bars]


#
# |------------------------------------------------------------------------------------------|
# | E N D O F S C R I P T |
# |------------------------------------------------------------------------------------------|