Problem: You have been given the following data by an upset client whose data scientists all quit before providing a summary of the results of a study they had conducted. The study looked at the effects of gender (male coded 0), handedness (left-hand dominant coded 0), and age group (younger coded 0) on IQ. The client has no idea how the study was conducted. Using the data below, determine the type of study run, analyze the data properly, and write up the study design and conclusions in a concise and clear fashion.
Looking for the distribution in the data
library(readr)
lab6_510 <- read_csv("~/Downloads/lab6_510.csv")
## Parsed with column specification:
## cols(
## Handedness = col_integer(),
## Gender = col_integer(),
## AgeGroup = col_integer(),
## Block = col_integer(),
## Repetition = col_integer(),
## IQ = col_integer()
## )
WeeklyLab6Data <- lab6_510
WeeklyLab6Data
## # A tibble: 16 x 6
## Handedness Gender AgeGroup Block Repetition IQ
## <int> <int> <int> <int> <int> <int>
## 1 0 0 0 0 0 95
## 2 1 1 0 0 0 101
## 3 1 0 1 0 0 102
## 4 0 1 1 0 0 97
## 5 1 0 0 1 0 120
## 6 0 1 0 1 0 99
## 7 0 0 1 1 0 96
## 8 1 1 1 1 0 111
## 9 0 0 0 0 1 100
## 10 1 1 0 0 1 100
## 11 1 0 1 1 1 107
## 12 0 1 1 1 1 97
## 13 1 0 0 1 1 116
## 14 0 1 0 1 1 101
## 15 0 0 1 0 1 95
## 16 1 1 1 0 1 112
d <- density(WeeklyLab6Data$IQ)
plot(d)
The distribution looks fairly normal with a potential positive skew.
Checking for skewness in the data
library(moments)
agostino.test(WeeklyLab6Data$IQ)
##
## D'Agostino skewness test
##
## data: WeeklyLab6Data$IQ
## skew = 0.91806, z = 1.79580, p-value = 0.07252
## alternative hypothesis: data have a skewness
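The skewness estimate reported by the test can be reproduced in base R as a sanity check. The sketch below is not part of the original analysis: it assumes the IQ values are typed in from the printed tibble and uses the moment-based estimator (third central moment divided by the second central moment raised to the 3/2 power). The variable names are illustrative.

```r
# IQ values copied by hand from the tibble printed above (assumption: no typos).
iq <- c(95, 101, 102, 97, 120, 99, 96, 111,
        100, 100, 107, 97, 116, 101, 95, 112)

# Central moments use a divisor of n, not n - 1.
dev <- iq - mean(iq)
m2 <- mean(dev^2)   # second central moment
m3 <- mean(dev^3)   # third central moment

# Moment-based sample skewness.
skew <- m3 / m2^(3/2)
print(round(skew, 3))
```

A positive value agrees with the right tail seen in the density plot. With p = 0.07 the test does not reject symmetry at the 5% level, although 16 observations give it limited power.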
Making sure all variables are treated as categorical
WeeklyLab6Data$Handedness <- factor(WeeklyLab6Data$Handedness)
WeeklyLab6Data$Gender <- factor(WeeklyLab6Data$Gender)
WeeklyLab6Data$AgeGroup <- factor(WeeklyLab6Data$AgeGroup)
WeeklyLab6Data$Block <- factor(WeeklyLab6Data$Block)
WeeklyLab6Data$Repetition <- factor(WeeklyLab6Data$Repetition)
Checking for Variance Equality
bartlett.test(WeeklyLab6Data$IQ, WeeklyLab6Data$Handedness)
##
## Bartlett test of homogeneity of variances
##
## data: WeeklyLab6Data$IQ and WeeklyLab6Data$Handedness
## Bartlett's K-squared = 7.5206, df = 1, p-value = 0.0061
bartlett.test(WeeklyLab6Data$IQ, WeeklyLab6Data$Gender)
##
## Bartlett test of homogeneity of variances
##
## data: WeeklyLab6Data$IQ and WeeklyLab6Data$Gender
## Bartlett's K-squared = 1.513, df = 1, p-value = 0.2187
bartlett.test(WeeklyLab6Data$IQ, WeeklyLab6Data$AgeGroup)
##
## Bartlett test of homogeneity of variances
##
## data: WeeklyLab6Data$IQ and WeeklyLab6Data$AgeGroup
## Bartlett's K-squared = 0.38713, df = 1, p-value = 0.5338
The equal-variance assumption is violated for Handedness (p = 0.006). Checking how large the violation is:
tapply(WeeklyLab6Data$IQ, WeeklyLab6Data$Handedness, var)
## 0 1
## 5.142857 54.267857
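The difference is large: the variance for right-handers is roughly ten times that for left-handers. One remedy is to shift the response by subtracting 93 (two less than the smallest observation, 95), take the log, and then fit the model to the transformed values.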
WeeklyLab6Data$IQfixed <- log(WeeklyLab6Data$IQ-93)
Model <- aov(IQfixed ~ Repetition+Repetition/Block+Handedness*Gender*AgeGroup, data = WeeklyLab6Data)
library(xtable)
table <- xtable(Model)
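As a check on the shifted log transformation used above, one could compare the Handedness group variances before and after transforming. The following is a minimal sketch under the assumption that the IQ and Handedness columns are typed in from the printed tibble; the names `var_raw`, `var_log`, `ratio_raw`, and `ratio_log` are illustrative and do not appear in the original analysis.

```r
# Values copied by hand from the printed tibble (assumption: no typos).
iq <- c(95, 101, 102, 97, 120, 99, 96, 111,
        100, 100, 107, 97, 116, 101, 95, 112)
handedness <- factor(c(0, 1, 1, 0, 1, 0, 0, 1,
                       0, 1, 1, 0, 1, 0, 0, 1))

# Group variances on the raw scale and on the log(IQ - 93) scale.
var_raw <- tapply(iq, handedness, var)
var_log <- tapply(log(iq - 93), handedness, var)

# Ratio of the larger to the smaller group variance on each scale.
ratio_raw <- max(var_raw) / min(var_raw)
ratio_log <- max(var_log) / min(var_log)
print(round(c(raw = ratio_raw, transformed = ratio_log), 2))

# Re-run Bartlett's test on the transformed response.
print(bartlett.test(log(iq - 93), handedness))
```

If the transformation is doing its job, the variance ratio on the transformed scale should be much closer to 1 than on the raw scale, and Bartlett's test should no longer flag Handedness.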
Checking for residuals
qqnorm(Model$residuals)
Do they violate normality?
shapiro.test(Model$residuals)
##
## Shapiro-Wilk normality test
##
## data: Model$residuals
## W = 0.98005, p-value = 0.964
With p = 0.96, the Shapiro-Wilk test gives no evidence that the residuals depart from normality, so the normality assumption appears reasonable for this model.
tapply(WeeklyLab6Data$IQfixed, WeeklyLab6Data$Handedness, mean)
## 0 1
## 1.384326 2.640972
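The means above are on the log(IQ - 93) scale, which is hard to read as IQ. A small sketch of back-transforming them to IQ points follows; it assumes the two means are copied from the output above and that the shift of 93 matches the transformation used earlier. Back-transformed means of logs are not arithmetic means of the raw IQ scores, so they are best read as typical values per group.

```r
# Group means on the log(IQ - 93) scale, copied from the tapply output above.
mean_log <- c(left = 1.384326, right = 2.640972)

# Undo the log, then undo the shift of 93.
back <- exp(mean_log) + 93
print(round(back, 1))
```

On this reading, the typical right-hander scores roughly ten IQ points above the typical left-hander in this sample.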
Right-handers (coded 1) have a higher mean IQ than left-handers (coded 0) on the transformed scale.
interaction.plot(WeeklyLab6Data$Gender, WeeklyLab6Data$AgeGroup, WeeklyLab6Data$IQ)
The interaction plot suggests that males have a higher IQ when younger, while females have a slightly higher IQ when older.
library(lsmeans)
## The 'lsmeans' package is being deprecated.
## Users are encouraged to switch to 'emmeans'.
## See help('transition') for more information, including how
## to convert 'lsmeans' objects and scripts to work with 'emmeans'.
lsmip(Model, Gender~AgeGroup)
## NOTE: Results may be misleading due to involvement in interactions
TukeyHSD(Model, "Gender:AgeGroup")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = IQfixed ~ Repetition + Repetition/Block + Handedness * Gender * AgeGroup, data = WeeklyLab6Data)
##
## $`Gender:AgeGroup`
## diff lwr upr p adj
## 1:0-0:0 -0.2934589 -1.2806549 0.6937371 0.7066505
## 0:1-0:0 -0.6105868 -1.5977828 0.3766093 0.2212729
## 1:1-0:0 -0.1157472 -1.1029433 0.8714488 0.9702234
## 0:1-1:0 -0.3171278 -1.3043239 0.6700682 0.6602403
## 1:1-1:0 0.1777117 -0.8094843 1.1649077 0.9062055
## 1:1-0:1 0.4948395 -0.4923565 1.4820355 0.3524188
The Tukey comparisons show that the change in IQ with age is not significant for either males (0:1-0:0, p = 0.22) or females (1:1-1:0, p = 0.91).
We analyzed a partially confounded design over two replications with two blocks each. The design was a 2 (handedness) x 2 (age group) x 2 (gender) factorial, and we analyzed it with a factorial ANOVA after a shifted log transformation, log(IQ - 93), to correct for skew and unequal variances. In terms of main effects, only the effect of handedness was significant, with left-handers (M = 1.38 on the log scale) having lower IQs than right-handers (M = 2.64), F(1,5) = 44.12, p < .01. We found the interaction between age group and gender to be marginally significant, F(1,5) = 4.34, p = .09. Figure 1 suggests this is driven by males having a higher IQ when younger than when older, while the reverse appears true for females; importantly, neither of these shifts in IQ with age is significant, ps > .21. In short, it appears right-handers have higher IQs than left-handers.
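The claim of partial confounding in the write-up above can be checked directly from the design columns. The sketch below is an illustration, not the original authors' procedure: it assumes the factor columns are typed in from the printed tibble. An effect is treated as confounded with blocks in a replicate when the mod-2 sum of its factor levels lines up exactly with the block label. The helper `is_confounded` and the `effects` list are hypothetical names.

```r
# Design columns copied by hand from the printed tibble (0/1 coding as in the data).
design <- data.frame(
  Handedness = c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1),
  Gender     = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  AgeGroup   = c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1),
  Block      = c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0),
  Repetition = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1)
)

# Main effects and interactions, each given by the columns it involves.
effects <- list(
  H   = "Handedness",
  G   = "Gender",
  A   = "AgeGroup",
  HG  = c("Handedness", "Gender"),
  HA  = c("Handedness", "AgeGroup"),
  GA  = c("Gender", "AgeGroup"),
  HGA = c("Handedness", "Gender", "AgeGroup")
)

# TRUE when the effect's parity (mod-2 sum of levels) matches the block
# label, or its complement, for every run in the given replicate.
is_confounded <- function(cols, rep) {
  rows <- design$Repetition == rep
  parity <- rowSums(design[rows, cols, drop = FALSE]) %% 2
  block <- design$Block[rows]
  all(parity == block) || all(parity != block)
}

# Rows: replicates; columns: effects.
confounded <- sapply(effects, function(cols) {
  c(rep0 = is_confounded(cols, 0), rep1 = is_confounded(cols, 1))
})
print(confounded)
```

For these data the check flags the three-way interaction in the first replicate and the Handedness x Gender interaction in the second, which is what "partially confounded" means here: each of those effects is still estimable from the replicate in which it is not aliased with blocks.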