What assumption must we test to include a variable as a blocking factor? Nrmality, Independence of Observation, Equal Variance, and Additivity of Interactions. The block should explain some of the variance from the Sum of Squares (SS) error so we can first: explain more variance and second: have less unexplained variance.
Recognize the IV, DV, block and create a table for the following research statement. “A company is planning to investigate the motor skills or elderly population. The company separates the target population into three age categories: 60 – 69, 70 – 79, and above 80 then randomly assign the participants in the study to one of the three task conditions. After individuals have completed the task, their performance will be compared.”
IV - task condition 1, 2, 3 (predictor) DV - performance score (outcome) Block - age categories (1: 60-69, 2: 70-79, 3: 80&above) Table - BlockTask Condition 1 Task Condition 2 Task Condition 3 Age 1
Age 2
Age 3
Use the data “Lab 3” with the research question to perform a fine report. age “1”:60-69, “2”: 70-79 and “3”: above 80.
The statement of the research/ study purpose H0: Motor skills (performance score) of elederly population are equal (Condition1 = Condition2 = Condition3) H1: at least one pair of scores differ from one another
The type of analysis conducted, i.e. D’Agostino test, Scatterplot of residuals, Bartlett test. etc.
Descriptive statistics: basic information of the data, i.e. age and gender of the participants.
The ANOVA test
Post-hoc analysis
Effect size
Conclusions
setwd("E:\\mikhilesh\\HU Sem VI ANLY 510 and 506\\ANLY 510 Kao Principals and Applications\\assignment and project")
library(readxl)
## Warning: package 'readxl' was built under R version 3.6.3
data3 <- read_excel("Lecture 3 Lab3.xlsx")
names(data3)
## [1] "Age" "Performance_score" "Condition"
str(data3)
## Classes 'tbl_df', 'tbl' and 'data.frame': 89 obs. of 3 variables:
## $ Age : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Performance_score: num 36 39 35 36 37 35 36 37 35 35 ...
## $ Condition : num 1 1 1 1 1 1 1 1 1 2 ...
summary(data3)
## Age Performance_score Condition
## Min. :1 Min. :15.00 Min. :1.000
## 1st Qu.:1 1st Qu.:24.00 1st Qu.:1.000
## Median :2 Median :27.00 Median :2.000
## Mean :2 Mean :27.52 Mean :2.034
## 3rd Qu.:3 3rd Qu.:32.00 3rd Qu.:3.000
## Max. :3 Max. :39.00 Max. :3.000
Assumptions
library(moments)
plot(density(data3$Performance_score), main = "Density Plot")
qqnorm(data3$Performance_score)
agostino.test(data3$Performance_score)
##
## D'Agostino skewness test
##
## data: data3$Performance_score
## skew = -0.11171, z = -0.45976, p-value = 0.6457
## alternative hypothesis: data have a skewness
shapiro.test(data3$Performance_score)
##
## Shapiro-Wilk normality test
##
## data: data3$Performance_score
## W = 0.9755, p-value = 0.09018
anscombe.test(data3$Performance_score)
##
## Anscombe-Glynn kurtosis test
##
## data: data3$Performance_score
## kurt = 2.2365, z = -2.0554, p-value = 0.03984
## alternative hypothesis: kurtosis is not equal to 3
#Residual plot (lm - linear model)
performance.lm <- lm(Performance_score ~ Condition, data = data3)
performance.lm
##
## Call:
## lm(formula = Performance_score ~ Condition, data = data3)
##
## Coefficients:
## (Intercept) Condition
## 36.686 -4.509
performance.res <- resid(performance.lm) #OR performance.res <- residuals(performance.lm)
performance.res
## 1 2 3 4 5 6 7
## 3.8225868 6.8225868 2.8225868 3.8225868 4.8225868 2.8225868 3.8225868
## 8 9 10 11 12 13 14
## 4.8225868 2.8225868 7.3311713 6.3311713 3.3311713 4.3311713 5.3311713
## 15 16 17 18 19 20 21
## 5.3311713 6.3311713 7.3311713 6.3311713 4.3311713 6.8397558 6.8397558
## 22 23 24 25 26 27 28
## 4.8397558 4.8397558 5.8397558 5.8397558 2.8397558 3.8397558 3.8397558
## 29 30 31 32 33 34 35
## 4.8397558 1.8225868 0.8225868 2.8225868 -0.1774132 4.8225868 0.8225868
## 36 37 38 39 40 41 42
## 1.8225868 -0.1774132 -1.1774132 -0.1774132 2.3311713 0.3311713 0.3311713
## 43 44 45 46 47 48 49
## -0.6688287 -0.6688287 -1.6688287 -1.6688287 0.3311713 1.3311713 -1.6688287
## 50 51 52 53 54 55 56
## 1.8397558 -0.1602442 -0.1602442 0.8397558 1.8397558 0.8397558 -0.1602442
## 57 58 59 60 61 62 63
## -2.1602442 -1.1602442 0.8397558 0.8397558 -5.1774132 -4.1774132 -5.1774132
## 64 65 66 67 68 69 70
## -6.1774132 -7.1774132 -6.1774132 -7.1774132 -5.1774132 -4.1774132 -4.6688287
## 71 72 73 74 75 76 77
## -5.6688287 -3.6688287 -5.6688287 -3.6688287 -2.6688287 -3.6688287 -4.6688287
## 78 79 80 81 82 83 84
## -6.6688287 -7.6688287 -4.1602442 -6.1602442 -6.1602442 -5.1602442 -4.1602442
## 85 86 87 88 89
## -7.1602442 -6.1602442 -5.1602442 -4.1602442 -8.1602442
plot(data3$Performance_score, performance.res, ylab = "Residual", xlab = "Condition", main = "Independence of Observation")
abline(0, 0)
bartlett.test(data3$Performance_score, data3$Condition)
##
## Bartlett test of homogeneity of variances
##
## data: data3$Performance_score and data3$Condition
## Bartlett's K-squared = 0.14381, df = 2, p-value = 0.9306
# Variance are equal. p-value = 0.9306 failed to reject the null hypothesis
#Checking variance among groups
tapply(data3$Performance_score, data3$Condition, var)
## 1 2 3
## 18.36508 20.94713 20.72903
#ratio of largest variance to smallest variance 20.95/18.37 = 1.1 - is way less than 3 (signifies that there is no issue with failing to reject null hypothesis)
#After checking all the assumptions fulfill our criteria, now we will perform ANOVA with Blocking Design
#IV - 3 conditions (predictor) categorical
#DV - performance score (outcome) continuous
#Testing Additivity of Interactions - To check there is no interaction between predictor/IV - conditions and block - age (Interaction variable - factor(Condition)*factor(Age))
#Perform the linear model:
model1 <- aov(Performance_score ~ factor(Condition)*factor(Age), data = data3)
model1
## Call:
## aov(formula = Performance_score ~ factor(Condition) * factor(Age),
## data = data3)
##
## Terms:
## factor(Condition) factor(Age) factor(Condition):factor(Age)
## Sum of Squares 1199.0299 1549.6499 22.6398
## Deg. of Freedom 2 2 4
## Residuals
## Sum of Squares 152.9051
## Deg. of Freedom 80
##
## Residual standard error: 1.382502
## Estimated effects may be unbalanced
summary(model1) #OR summary(aov(Performance_score ~ factor(Condition), data = data3)) #(Don't forget to factor()ize your predictor)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(Condition) 2 1199.0 599.5 313.667 <2e-16 ***
## factor(Age) 2 1549.6 774.8 405.389 <2e-16 ***
## factor(Condition):factor(Age) 4 22.6 5.7 2.961 0.0246 *
## Residuals 80 152.9 1.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Interaction effect - factor(Condition):factor(Age) is not sifnificant (p-value < 0.05)
model2 <- aov(Performance_score ~ factor(Condition), data = data3)
model2
## Call:
## aov(formula = Performance_score ~ factor(Condition), data = data3)
##
## Terms:
## factor(Condition) Residuals
## Sum of Squares 1199.030 1725.195
## Deg. of Freedom 2 86
##
## Residual standard error: 4.478884
## Estimated effects may be unbalanced
summary(model2)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(Condition) 2 1199 599.5 29.89 1.4e-10 ***
## Residuals 86 1725 20.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Effect of Condition is signifiacnt
#OTHER MOdels FOR COmparison
model3 <- aov(Performance_score ~ factor(Age), data = data3)
model3
## Call:
## aov(formula = Performance_score ~ factor(Age), data = data3)
##
## Terms:
## factor(Age) Residuals
## Sum of Squares 1549.733 1374.492
## Deg. of Freedom 2 86
##
## Residual standard error: 3.997807
## Estimated effects may be unbalanced
summary(model3)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(Age) 2 1550 774.9 48.48 7.97e-15 ***
## Residuals 86 1374 16.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model4 <- aov(Performance_score ~ factor(Condition) + factor(Age), data = data3)
model4
## Call:
## aov(formula = Performance_score ~ factor(Condition) + factor(Age),
## data = data3)
##
## Terms:
## factor(Condition) factor(Age) Residuals
## Sum of Squares 1199.0299 1549.6499 175.5449
## Deg. of Freedom 2 2 84
##
## Residual standard error: 1.445621
## Estimated effects may be unbalanced
summary(model4)
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(Condition) 2 1199.0 599.5 286.9 <2e-16 ***
## factor(Age) 2 1549.6 774.8 370.8 <2e-16 ***
## Residuals 84 175.5 2.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Age can not be considered a block as there is significant Interaction effect.
#Analysis of Variance Table
anova(model1, model2)
## Analysis of Variance Table
##
## Model 1: Performance_score ~ factor(Condition) * factor(Age)
## Model 2: Performance_score ~ factor(Condition)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 80 152.91
## 2 86 1725.19 -6 -1572.3 137.1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#To find out what group differs from the others
library(pgirmess)
## Warning: package 'pgirmess' was built under R version 3.6.3
#Pairwise comparisons using t tests - Bonferroni Method
pairwise.t.test(data3$Performance_score, data3$Condition, paired = FALSE, p.adjust.method = "bonferroni") #method can be "none", "bonferroni", "holm", "hochberg", "hommel", "BH", or "BY“
##
## Pairwise comparisons using t tests with pooled SD
##
## data: data3$Performance_score and data3$Condition
##
## 1 2
## 2 0.0017 -
## 3 6e-11 0.0002
##
## P value adjustment method: bonferroni
#Significant difference between condition 1, 2, and 3 noted.
#Kurskal-Wallis:
kruskalmc(Performance_score ~ factor(Condition), data = data3)
## Multiple comparison test after Kruskal-Wallis
## p.value: 0.05
## Comparisons
## obs.dif critical.dif difference
## 1-2 19.60357 16.25251 TRUE
## 1-3 39.31970 16.12547 TRUE
## 2-3 19.71613 15.84052 TRUE
#Tukey’s Test:
TukeyHSD(model1)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = Performance_score ~ factor(Condition) * factor(Age), data = data3)
##
## $`factor(Condition)`
## diff lwr upr p adj
## 2-1 -4.204762 -5.072310 -3.337214 0
## 3-1 -9.006912 -9.867679 -8.146146 0
## 3-2 -4.802151 -5.647707 -3.956594 0
##
## $`factor(Age)`
## diff lwr upr p adj
## 2-1 -4.516166 -5.369099 -3.663233 0
## 3-1 -10.310345 -11.177377 -9.443313 0
## 3-2 -5.794179 -6.647112 -4.941246 0
##
## $`factor(Condition):factor(Age)`
## diff lwr upr p adj
## 2:1-1:1 -2.922222e+00 -4.947474 -0.8969700 0.0005097
## 3:1-1:1 -8.022222e+00 -10.047474 -5.9969700 0.0000000
## 1:2-1:1 -2.922222e+00 -4.947474 -0.8969700 0.0005097
## 2:2-1:1 -8.722222e+00 -10.747474 -6.6969700 0.0000000
## 3:2-1:1 -1.276768e+01 -14.748843 -10.7865103 0.0000000
## 1:3-1:1 -9.666667e+00 -11.744532 -7.5888017 0.0000000
## 2:3-1:1 -1.342222e+01 -15.447474 -11.3969700 0.0000000
## 3:3-1:1 -1.872222e+01 -20.747474 -16.6969700 0.0000000
## 3:1-2:1 -5.100000e+00 -7.071236 -3.1287642 0.0000000
## 1:2-2:1 1.421085e-14 -1.971236 1.9712358 1.0000000
## 2:2-2:1 -5.800000e+00 -7.771236 -3.8287642 0.0000000
## 3:2-2:1 -9.845455e+00 -11.771369 -7.9195406 0.0000000
## 1:3-2:1 -6.744444e+00 -8.769697 -4.7191922 0.0000000
## 2:3-2:1 -1.050000e+01 -12.471236 -8.5287642 0.0000000
## 3:3-2:1 -1.580000e+01 -17.771236 -13.8287642 0.0000000
## 1:2-3:1 5.100000e+00 3.128764 7.0712358 0.0000000
## 2:2-3:1 -7.000000e-01 -2.671236 1.2712358 0.9674422
## 3:2-3:1 -4.745455e+00 -6.671369 -2.8195406 0.0000000
## 1:3-3:1 -1.644444e+00 -3.669697 0.3808078 0.2078478
## 2:3-3:1 -5.400000e+00 -7.371236 -3.4287642 0.0000000
## 3:3-3:1 -1.070000e+01 -12.671236 -8.7287642 0.0000000
## 2:2-1:2 -5.800000e+00 -7.771236 -3.8287642 0.0000000
## 3:2-1:2 -9.845455e+00 -11.771369 -7.9195406 0.0000000
## 1:3-1:2 -6.744444e+00 -8.769697 -4.7191922 0.0000000
## 2:3-1:2 -1.050000e+01 -12.471236 -8.5287642 0.0000000
## 3:3-1:2 -1.580000e+01 -17.771236 -13.8287642 0.0000000
## 3:2-2:2 -4.045455e+00 -5.971369 -2.1195406 0.0000001
## 1:3-2:2 -9.444444e-01 -2.969697 1.0808078 0.8583631
## 2:3-2:2 -4.700000e+00 -6.671236 -2.7287642 0.0000000
## 3:3-2:2 -1.000000e+01 -11.971236 -8.0287642 0.0000000
## 1:3-3:2 3.101010e+00 1.119844 5.0821766 0.0001161
## 2:3-3:2 -6.545455e-01 -2.580459 1.2713685 0.9750171
## 3:3-3:2 -5.954545e+00 -7.880459 -4.0286315 0.0000000
## 2:3-1:3 -3.755556e+00 -5.780808 -1.7303033 0.0000028
## 3:3-1:3 -9.055556e+00 -11.080808 -7.0303033 0.0000000
## 3:3-2:3 -5.300000e+00 -7.271236 -3.3287642 0.0000000
library(pastecs)
## Warning: package 'pastecs' was built under R version 3.6.3
library(compute.es)
## Warning: package 'compute.es' was built under R version 3.6.3
#First get the relevant stats for each factor level (only relevant output pasted below):
by(data3$Performance_score, data3$Condition, stat.desc)
## data3$Condition: 1
## nbr.val nbr.null nbr.na min max range
## 28.0000000 0.0000000 0.0000000 25.0000000 39.0000000 14.0000000
## sum median mean SE.mean CI.mean.0.95 var
## 898.0000000 33.0000000 32.0714286 0.8098739 1.6617239 18.3650794
## std.dev coef.var
## 4.2854497 0.1336220
## ------------------------------------------------------------
## data3$Condition: 2
## nbr.val nbr.null nbr.na min max range
## 30.0000000 0.0000000 0.0000000 20.0000000 35.0000000 15.0000000
## sum median mean SE.mean CI.mean.0.95 var
## 836.0000000 27.5000000 27.8666667 0.8356061 1.7090064 20.9471264
## std.dev coef.var
## 4.5768031 0.1642393
## ------------------------------------------------------------
## data3$Condition: 3
## nbr.val nbr.null nbr.na min max range
## 31.0000000 0.0000000 0.0000000 15.0000000 30.0000000 15.0000000
## sum median mean SE.mean CI.mean.0.95 var
## 715.0000000 24.0000000 23.0645161 0.8177276 1.6700226 20.7290323
## std.dev coef.var
## 4.5529147 0.1973991
#n - sample size, M - mean, SD - standard deviation
#Condition 1 (n = 28, M = 32.07 , SD = 4.3)
#Condition 2 (n = 30, M = 27.87, SD = 4.6)
#Condition 3 (n = 31, M = 23.06, SD = 4.6)
#Use these values to compute a few standard Measures of Effect Size (MES or mes) for any pair of interest
#OPtion 1 will compare Condition 1 and 2 ~ Performace_score vs Condition
mes(27.87, 32.07, 4.6, 4.3, 30, 28)
## Mean Differences ES:
##
## d [ 95 %CI] = -0.94 [ -1.48 , -0.4 ]
## var(d) = 0.08
## p-value(d) = 0
## U3(d) = 17.31 %
## CLES(d) = 25.26 %
## Cliff's Delta = -0.49
##
## g [ 95 %CI] = -0.93 [ -1.46 , -0.39 ]
## var(g) = 0.07
## p-value(g) = 0
## U3(g) = 17.63 %
## CLES(g) = 25.55 %
##
## Correlation ES:
##
## r [ 95 %CI] = -0.43 [ -0.62 , -0.2 ]
## var(r) = 0.01
## p-value(r) = 0
##
## z [ 95 %CI] = -0.46 [ -0.73 , -0.2 ]
## var(z) = 0.02
## p-value(z) = 0
##
## Odds Ratio ES:
##
## OR [ 95 %CI] = 0.18 [ 0.07 , 0.48 ]
## p-value(OR) = 0
##
## Log OR [ 95 %CI] = -1.71 [ -2.69 , -0.72 ]
## var(lOR) = 0.25
## p-value(Log OR) = 0
##
## Other:
##
## NNT = -6.14
## Total N = 58
#OPtion 2 will compare Condition 2 and 3 ~ Performace_score vs Condition
mes(23.06, 27.87, 4.6, 4.6, 31, 30)
## Mean Differences ES:
##
## d [ 95 %CI] = -1.05 [ -1.58 , -0.51 ]
## var(d) = 0.07
## p-value(d) = 0
## U3(d) = 14.79 %
## CLES(d) = 22.98 %
## Cliff's Delta = -0.54
##
## g [ 95 %CI] = -1.03 [ -1.56 , -0.5 ]
## var(g) = 0.07
## p-value(g) = 0
## U3(g) = 15.1 %
## CLES(g) = 23.27 %
##
## Correlation ES:
##
## r [ 95 %CI] = -0.47 [ -0.64 , -0.25 ]
## var(r) = 0.01
## p-value(r) = 0
##
## z [ 95 %CI] = -0.51 [ -0.77 , -0.25 ]
## var(z) = 0.02
## p-value(z) = 0
##
## Odds Ratio ES:
##
## OR [ 95 %CI] = 0.15 [ 0.06 , 0.4 ]
## p-value(OR) = 0
##
## Log OR [ 95 %CI] = -1.9 [ -2.87 , -0.93 ]
## var(lOR) = 0.25
## p-value(Log OR) = 0
##
## Other:
##
## NNT = -5.87
## Total N = 61
#OPtion 3 will compare Condition 1 and 3 ~ Performace_score vs Condition
mes(23.06, 32.07, 4.6, 4.3, 31, 28)
## Mean Differences ES:
##
## d [ 95 %CI] = -2.02 [ -2.65 , -1.39 ]
## var(d) = 0.1
## p-value(d) = 0
## U3(d) = 2.17 %
## CLES(d) = 7.66 %
## Cliff's Delta = -0.85
##
## g [ 95 %CI] = -1.99 [ -2.61 , -1.37 ]
## var(g) = 0.1
## p-value(g) = 0
## U3(g) = 2.31 %
## CLES(g) = 7.93 %
##
## Correlation ES:
##
## r [ 95 %CI] = -0.72 [ -0.82 , -0.56 ]
## var(r) = 0
## p-value(r) = 0
##
## z [ 95 %CI] = -0.9 [ -1.16 , -0.64 ]
## var(z) = 0.02
## p-value(z) = 0
##
## Odds Ratio ES:
##
## OR [ 95 %CI] = 0.03 [ 0.01 , 0.08 ]
## p-value(OR) = 0
##
## Log OR [ 95 %CI] = -3.66 [ -4.8 , -2.53 ]
## var(lOR) = 0.34
## p-value(Log OR) = 0
##
## Other:
##
## NNT = -5.05
## Total N = 59
#Effect size (more than or equal to) < = 0.1 is small, 0.25 is medium, 0.4 is large (Cohen, 1988)
#Summary of ANOVA:
Observations from the study were analyzed by conducting a one-way analysis of variance using R version 3.6.1. First, all assumptions are met and there is no adjustment made. Results suggest that the task conditions (predictor) has a significant effect on performace score (outcome) (F(2, 86) = 29.89, p < .001). [We can not consider age group as blocks because it also has a significant effect on outcomes F(1, 87) = 60.21, p < .001.] Continuing the discussion with specifically which task condition produced the signiificantaly differed measures of the performance score, a Tukey’s hoc test was established. The result suggested that there is a significant difference between task condition 1 and 2, condition 2 and 3, and condition 1 and 3 (All p-value < 0.001) in terms of the performance score. The effect were large, Cohen’ D = 0.95, 1.05, and 2.02.