Problem: You have been given the following data from an upset client whose data scientists all quit before providing a summary of the results of a study they had conducted. The study looked at the effects of gender (male coded 0), handedness (left-hand dominant coded 0), and age group (younger coded 0) on IQ. The client has no idea how the study was conducted. Using the data below, determine the type of study run, analyze the data properly, and write up the study design and conclusions in a concise and clear fashion.

Looking at the distribution of the data

library(readr)
lab6_510 <- read_csv("~/Downloads/lab6_510.csv")

## Parsed with column specification:
## cols(
##   Handedness = col_integer(),
##   Gender = col_integer(),
##   AgeGroup = col_integer(),
##   Block = col_integer(),
##   Repetition = col_integer(),
##   IQ = col_integer()
## )

WeeklyLab6Data <- lab6_510
WeeklyLab6Data

## # A tibble: 16 x 6
##    Handedness Gender AgeGroup Block Repetition    IQ
##         <int>  <int>    <int> <int>      <int> <int>
##  1          0      0        0     0          0    95
##  2          1      1        0     0          0   101
##  3          1      0        1     0          0   102
##  4          0      1        1     0          0    97
##  5          1      0        0     1          0   120
##  6          0      1        0     1          0    99
##  7          0      0        1     1          0    96
##  8          1      1        1     1          0   111
##  9          0      0        0     0          1   100
## 10          1      1        0     0          1   100
## 11          1      0        1     1          1   107
## 12          0      1        1     1          1    97
## 13          1      0        0     1          1   116
## 14          0      1        0     1          1   101
## 15          0      0        1     0          1    95
## 16          1      1        1     0          1   112

d <- density(WeeklyLab6Data$IQ)
plot(d)

The distribution looks fairly normal with a potential positive skew.

Checking for skewness in the data

library(moments)
agostino.test(WeeklyLab6Data$IQ)

## 
##  D'Agostino skewness test
## 
## data:  WeeklyLab6Data$IQ
## skew = 0.91806, z = 1.79580, p-value = 0.07252
## alternative hypothesis: data have a skewness
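
The skew statistic reported above can be reproduced by hand. The following is a minimal base R sketch, assuming the test uses the moment-based coefficient (third central moment divided by the second central moment raised to the power 3/2, both dividing by n). The vector `iq` is retyped from the tibble printed earlier.

```r
# IQ values retyped from the printed tibble (rows 1 to 16)
iq <- c(95, 101, 102, 97, 120, 99, 96, 111,
        100, 100, 107, 97, 116, 101, 95, 112)

n   <- length(iq)
dev <- iq - mean(iq)          # deviations from the sample mean

m2 <- sum(dev^2) / n          # second central moment (divides by n, not n - 1)
m3 <- sum(dev^3) / n          # third central moment

skew <- m3 / m2^(3 / 2)       # moment-based skewness coefficient
print(round(skew, 3))
```

A positive value indicates a longer right tail, which agrees with the shape of the density plot.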

Making sure all variables are treated as categorical

WeeklyLab6Data$Handedness <- factor(WeeklyLab6Data$Handedness)
WeeklyLab6Data$Gender <- factor(WeeklyLab6Data$Gender)
WeeklyLab6Data$AgeGroup <- factor(WeeklyLab6Data$AgeGroup)
WeeklyLab6Data$Block <- factor(WeeklyLab6Data$Block)
WeeklyLab6Data$Repetition <- factor(WeeklyLab6Data$Repetition)

Checking for Variance Equality

bartlett.test(WeeklyLab6Data$IQ, WeeklyLab6Data$Handedness)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  WeeklyLab6Data$IQ and WeeklyLab6Data$Handedness
## Bartlett's K-squared = 7.5206, df = 1, p-value = 0.0061

bartlett.test(WeeklyLab6Data$IQ, WeeklyLab6Data$Gender)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  WeeklyLab6Data$IQ and WeeklyLab6Data$Gender
## Bartlett's K-squared = 1.513, df = 1, p-value = 0.2187

bartlett.test(WeeklyLab6Data$IQ, WeeklyLab6Data$AgeGroup)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  WeeklyLab6Data$IQ and WeeklyLab6Data$AgeGroup
## Bartlett's K-squared = 0.38713, df = 1, p-value = 0.5338

The assumption for Handedness is violated. Checking for how big the violation is:

tapply(WeeklyLab6Data$IQ, WeeklyLab6Data$Handedness, var)

##         0         1 
##  5.142857 54.267857

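The difference is quite large: the variance for right-handers is roughly ten times that for left-handers. One way to address this is to subtract 93 (two less than the smallest observation, 95) from each observation so that all values are positive, take the log, and then fit the model to the transformed response.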
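
As a quick check on this transformation, the sketch below compares the group variances before and after taking the log. It is a base R illustration only: the vectors are retyped from the tibble printed earlier, the shift of 93 is the one used in the code that follows, and the names `var_raw`, `var_log`, `ratio_raw`, and `ratio_log` are introduced here for illustration.

```r
# Data retyped from the printed tibble (rows 1 to 16)
iq <- c(95, 101, 102, 97, 120, 99, 96, 111,
        100, 100, 107, 97, 116, 101, 95, 112)
handedness <- factor(c(0, 1, 1, 0, 1, 0, 0, 1,
                       0, 1, 1, 0, 1, 0, 0, 1))

# Group variances on the raw scale and on the shifted-log scale
var_raw <- tapply(iq, handedness, var)
var_log <- tapply(log(iq - 93), handedness, var)

# Ratio of largest to smallest variance; values near 1 mean similar spread
ratio_raw <- max(var_raw) / min(var_raw)
ratio_log <- max(var_log) / min(var_log)

print(round(c(raw = ratio_raw, log = ratio_log), 2))
```

If the transformation is doing its job, the second ratio should be much closer to 1 than the first.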

WeeklyLab6Data$IQfixed <- log(WeeklyLab6Data$IQ-93)
Model <- aov(IQfixed ~ Repetition+Repetition/Block+Handedness*Gender*AgeGroup, data = WeeklyLab6Data)
library(xtable)
table <- xtable(Model)
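
The ANOVA table itself is never printed in this report, although the summary quotes F statistics from it. The sketch below shows one way to display it with `summary()`. It is self-contained for illustration: the data frame `d` is retyped from the printed tibble, and the names `d`, `fit`, and `anova_tab` are not from the original analysis.

```r
# Data retyped from the printed tibble; all design columns as factors
d <- data.frame(
  Handedness = factor(c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1)),
  Gender     = factor(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)),
  AgeGroup   = factor(c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1)),
  Block      = factor(c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0)),
  Repetition = factor(rep(c(0, 1), each = 8)),
  IQ         = c(95, 101, 102, 97, 120, 99, 96, 111,
                 100, 100, 107, 97, 116, 101, 95, 112)
)
d$IQfixed <- log(d$IQ - 93)   # same shifted-log response as above

# Same model formula as in the report
fit <- aov(IQfixed ~ Repetition + Repetition/Block + Handedness * Gender * AgeGroup,
           data = d)

# summary() returns a list; its first element is the ANOVA table
anova_tab <- summary(fit)[[1]]
print(anova_tab)
```

With 16 observations and 10 model degrees of freedom, the table should leave 5 residual degrees of freedom, which is the denominator in the F(1,5) statistics quoted in the summary.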

Checking the residuals

qqnorm(Model$residuals)

Do they violate normality?

shapiro.test(Model$residuals)

## 
##  Shapiro-Wilk normality test
## 
## data:  Model$residuals
## W = 0.98005, p-value = 0.964

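With p = 0.964, the Shapiro-Wilk test gives no evidence that the residuals depart from normality, so the normality assumption appears reasonable for this model.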

tapply(WeeklyLab6Data$IQfixed, WeeklyLab6Data$Handedness, mean)

##        0        1 
## 1.384326 2.640972

Right-handers (coded 1) have a higher mean IQ than left-handers (coded 0) on the log-transformed scale.
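
Because these means are on the shifted-log scale, it can help to map them back to IQ points. The sketch below simply inverts the transformation (exponentiate, then add 93) for the two means printed above. The names `mean_log` and `back` are introduced here for illustration, and the back-transformed values are not the same as the arithmetic group means of the raw IQ scores.

```r
# Group means on the log(IQ - 93) scale, copied from the tapply output
mean_log <- c(left = 1.384326, right = 2.640972)

# Invert the transformation: undo the log, then undo the shift of 93
back <- exp(mean_log) + 93

print(round(back, 1))
```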

interaction.plot(WeeklyLab6Data$Gender, WeeklyLab6Data$AgeGroup, WeeklyLab6Data$IQ)

The interaction plot shows that males have a higher IQ when they are younger, while females have a slightly higher IQ when they are older.
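
The four cell means behind the plot can be tabulated directly. This is a small base R sketch using vectors retyped from the printed tibble; `cell_means` is a name introduced here for illustration.

```r
# Data retyped from the printed tibble (rows 1 to 16)
iq <- c(95, 101, 102, 97, 120, 99, 96, 111,
        100, 100, 107, 97, 116, 101, 95, 112)
gender    <- c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)  # 0 = male
age_group <- c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1)  # 0 = younger

# Mean raw IQ for each Gender x AgeGroup cell (4 observations per cell)
cell_means <- tapply(iq, list(Gender = gender, AgeGroup = age_group), mean)
print(cell_means)
```

Rows are gender and columns are age group, so reading across a row gives the change with age for that gender.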

library(lsmeans)

## The 'lsmeans' package is being deprecated.
## Users are encouraged to switch to 'emmeans'.
## See help('transition') for more information, including how
## to convert 'lsmeans' objects and scripts to work with 'emmeans'.

lsmip(Model, Gender~AgeGroup)

## NOTE: Results may be misleading due to involvement in interactions

TukeyHSD(Model, "Gender:AgeGroup")

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = IQfixed ~ Repetition + Repetition/Block + Handedness * Gender * AgeGroup, data = WeeklyLab6Data)
## 
## $`Gender:AgeGroup`
##               diff        lwr       upr     p adj
## 1:0-0:0 -0.2934589 -1.2806549 0.6937371 0.7066505
## 0:1-0:0 -0.6105868 -1.5977828 0.3766093 0.2212729
## 1:1-0:0 -0.1157472 -1.1029433 0.8714488 0.9702234
## 0:1-1:0 -0.3171278 -1.3043239 0.6700682 0.6602403
## 1:1-1:0  0.1777117 -0.8094843 1.1649077 0.9062055
## 1:1-0:1  0.4948395 -0.4923565 1.4820355 0.3524188

The Tukey comparisons show that the change in IQ with age is not significant for either males (0:1-0:0, p adj = 0.22) or females (1:1-1:0, p adj = 0.91).

Summary

We analyzed a partially confounded design run over two replications with two blocks each. The design was a 2 (handedness) x 2 (age group) x 2 (gender) factorial, and we analyzed it with a factorial ANOVA after log-transforming IQ to correct the skew and the unequal variances across handedness groups. Among the main effects, only handedness was significant: left-handers (M = 1.38 on the log-transformed scale) had lower IQs than right-handers (M = 2.64), F(1,5) = 44.12, p < .01. The interaction between age group and gender was marginally significant, F(1,5) = 4.34, p = .09. Figure 1 suggests this is driven by males having a higher IQ when younger than when older, while the reverse appears true for females; importantly, neither of these shifts in IQ with age is significant, ps > .21. In short, it appears that right-handers have higher IQs than left-handers.
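
The "partially confounded" description can be checked directly from the Block column. The base R sketch below recodes each factor as -1/+1, forms the interaction contrasts as products, and reports which contrast coincides with the block split in each repetition. The data are retyped from the printed tibble; the helper `pm` and the short names `H`, `G`, `A`, `eff`, and `confounded` are introduced here for illustration only.

```r
# Design columns retyped from the printed tibble (rows 1 to 16)
d <- data.frame(
  H     = c(0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1),  # Handedness
  G     = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),  # Gender
  A     = c(0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1),  # AgeGroup
  Block = c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0),
  Rep   = rep(c(0, 1), each = 8)
)

# Recode 0/1 as -1/+1 so interaction contrasts are simple products
pm <- function(x) 2 * x - 1

eff <- with(d, cbind(
  H = pm(H), G = pm(G), A = pm(A),
  HG = pm(H) * pm(G), HA = pm(H) * pm(A), GA = pm(G) * pm(A),
  HGA = pm(H) * pm(G) * pm(A)
))

# Within each repetition, find the contrast perfectly correlated with Block
confounded <- sapply(split(seq_len(nrow(d)), d$Rep), function(rows) {
  r <- abs(cor(eff[rows, ], pm(d$Block[rows])))
  rownames(r)[r[, 1] > 0.999]
})
print(confounded)
```

On these data the sketch should report the three-way interaction for repetition 0 and the Handedness x Gender interaction for repetition 1. A different effect being confounded in each repetition is what makes the confounding partial: each of those effects can still be estimated from the other repetition.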

Data Model in R

Shreejit

5/14/2018
