You have been given the following data from an upset client whose data scientists have all quit before they provided a summary of the results of a study they had conducted. The study looked at the effects of gender (male coded 0), handedness (i.e., left hand dominant coded 0), and age group (younger coded 0) on IQ. The client has no idea how the study was conducted.
Using the data below determine the type of study run, analyze the data properly, and write up the study design and conclusions in a concise and clear fashion.
library("readxl")
## Warning: package 'readxl' was built under R version 3.4.4
IQData <- read_excel("C:/Users/Enrique/OneDrive/Documents/HU/ANLY510_Principles7Applicaitons02/Data/Effects on IQ.xlsx")
str(IQData)
## Classes 'tbl_df', 'tbl' and 'data.frame': 16 obs. of 6 variables:
## $ Handedness: num 0 1 1 0 1 0 0 1 0 1 ...
## $ Gender : num 0 1 0 1 0 1 0 1 0 1 ...
## $ AgeGroup : num 0 0 1 1 0 0 1 1 0 0 ...
## $ Block : num 0 0 0 0 1 1 1 1 0 0 ...
## $ Repetition: num 0 0 0 0 0 0 0 0 1 1 ...
## $ IQ : num 95 101 102 97 120 99 96 111 100 100 ...
The categorical variables are stored as numeric codes, so the codes need to be replaced with descriptive labels and the columns converted to factors.
#### Assign names to values
IQData$Handedness= replace(IQData$Handedness,IQData$Handedness=="0","Left")
IQData$Handedness= replace(IQData$Handedness,IQData$Handedness=="1","Right")
IQData$Gender= replace(IQData$Gender,IQData$Gender=="0","Male")
IQData$Gender= replace(IQData$Gender,IQData$Gender=="1","Female")
IQData$AgeGroup= replace(IQData$AgeGroup,IQData$AgeGroup=="0","Younger")
IQData$AgeGroup= replace(IQData$AgeGroup,IQData$AgeGroup=="1","Older")
str(IQData)
## Classes 'tbl_df', 'tbl' and 'data.frame': 16 obs. of 6 variables:
## $ Handedness: chr "Left" "Right" "Right" "Left" ...
## $ Gender : chr "Male" "Female" "Male" "Female" ...
## $ AgeGroup : chr "Younger" "Younger" "Older" "Older" ...
## $ Block : num 0 0 0 0 1 1 1 1 0 0 ...
## $ Repetition: num 0 0 0 0 0 0 0 0 1 1 ...
## $ IQ : num 95 101 102 97 120 99 96 111 100 100 ...
#### Convert to factors
IQData$Handedness=as.factor(IQData$Handedness)
IQData$Gender=as.factor(IQData$Gender)
IQData$AgeGroup=as.factor(IQData$AgeGroup)
IQData$Block=as.factor(IQData$Block)
IQData$Repetition=as.factor(IQData$Repetition)
str(IQData)
## Classes 'tbl_df', 'tbl' and 'data.frame': 16 obs. of 6 variables:
## $ Handedness: Factor w/ 2 levels "Left","Right": 1 2 2 1 2 1 1 2 1 2 ...
## $ Gender : Factor w/ 2 levels "Female","Male": 2 1 2 1 2 1 2 1 2 1 ...
## $ AgeGroup : Factor w/ 2 levels "Older","Younger": 2 2 1 1 2 2 1 1 2 2 ...
## $ Block : Factor w/ 2 levels "0","1": 1 1 1 1 2 2 2 2 1 1 ...
## $ Repetition: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 2 2 ...
## $ IQ : num 95 101 102 97 120 99 96 111 100 100 ...
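As a possible alternative to the replace-then-convert steps, the labels can be attached in a single `factor()` call per column. The sketch below is only an illustration: `toy` is a hypothetical data frame holding the first four rows of codes shown in the `str()` output, not the full data set.

```r
# Hypothetical stand-in: the first four rows of codes from the str() output
toy <- data.frame(
  Handedness = c(0, 1, 1, 0),
  Gender     = c(0, 1, 0, 1),
  AgeGroup   = c(0, 0, 1, 1)
)

# levels = the numeric codes, labels = the names they stand for
toy$Handedness <- factor(toy$Handedness, levels = c(0, 1), labels = c("Left", "Right"))
toy$Gender     <- factor(toy$Gender,     levels = c(0, 1), labels = c("Male", "Female"))
toy$AgeGroup   <- factor(toy$AgeGroup,   levels = c(0, 1), labels = c("Younger", "Older"))

str(toy)
```

With this approach the level order follows the codes (for example Male, then Female) rather than alphabetical order, so tables and plots would list the groups in a different order from the output shown in this report.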
Now that the data are in the proper format, let’s take a look at the distribution of IQ:
plot(density(IQData$IQ),main="Distribution of IQData",xlab="IQ",col="red",lwd=2)
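The density curve is roughly bell-shaped, so IQ appears to be approximately normally distributed, although with only 16 observations a visual check is weak evidence.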
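A visual check of normality can be backed up with a Shapiro-Wilk test and a normal Q-Q plot. The sketch below uses simulated values (`iq_sim` is a made-up vector, not the study data) purely to show the calls; on the real data one would pass `IQData$IQ` instead.

```r
set.seed(1)
# Simulated stand-in for the 16 IQ scores, NOT the study data
iq_sim <- rnorm(16, mean = 100, sd = 8)

# Shapiro-Wilk: H0 = the sample comes from a normal distribution,
# so a small p-value is evidence AGAINST normality
sw <- shapiro.test(iq_sim)
print(sw)

# Q-Q plot: points close to the reference line support normality
qqnorm(iq_sim, main = "Normal Q-Q plot (simulated IQ)")
qqline(iq_sim, col = "red", lwd = 2)
```

A non-significant Shapiro-Wilk result does not prove normality, especially with a sample this small; it only means the test found no evidence against it.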
Let’s visualize IQ by category to see if there are any patterns that might need further analysis:
boxplot(IQData$IQ~IQData$Handedness,main="Handedness IQ",xlab="Handedness",ylab="IQ",col=c("yellow","gold"))
boxplot(IQData$IQ~IQData$Gender,main="Gender IQ",xlab="Gender",ylab="IQ",col=c("yellow","gold"))
boxplot(IQData$IQ~IQData$AgeGroup,main="AgeGroup IQ",xlab="AgeGroup",ylab="IQ",col=c("yellow","gold"))
Right-handed participants seem to have a noticeably higher IQ than left-handed participants. Both genders seem to have a similar typical IQ, and the younger group seems slightly higher than the older group.
Let’s verify these means to confirm boxplot results:
tapply(IQData$IQ,INDEX =IQData$Handedness,FUN = mean)
## Left Right
## 97.500 108.625
tapply(IQData$IQ,INDEX =IQData$Gender,FUN = mean)
## Female Male
## 102.250 103.875
tapply(IQData$IQ,INDEX =IQData$AgeGroup,FUN = mean)
## Older Younger
## 102.125 104.000
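Mean IQ differs by about 11 points between the handedness groups (97.5 vs. 108.6), while the gender and age-group means are within 2 points of each other. Before testing these differences, let’s check whether the group variances are equal.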
bartlett.test(IQData$IQ,IQData$Handedness)
##
## Bartlett test of homogeneity of variances
##
## data: IQData$IQ and IQData$Handedness
## Bartlett's K-squared = 7.5206, df = 1, p-value = 0.0061
bartlett.test(IQData$IQ,IQData$Gender)
##
## Bartlett test of homogeneity of variances
##
## data: IQData$IQ and IQData$Gender
## Bartlett's K-squared = 1.513, df = 1, p-value = 0.2187
bartlett.test(IQData$IQ,IQData$AgeGroup)
##
## Bartlett test of homogeneity of variances
##
## data: IQData$IQ and IQData$AgeGroup
## Bartlett's K-squared = 0.38713, df = 1, p-value = 0.5338
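Bartlett’s test rejects equal variances for Handedness (p = 0.0061): the IQ variance differs between left- and right-handed participants, which violates the homogeneity-of-variance assumption of a standard ANOVA. For Gender (p = 0.2187) and AgeGroup (p = 0.5338) there is no evidence of unequal variances.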
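When group variances look unequal, one option is Welch's one-way test, which compares means without assuming a common variance. The sketch below runs both tests on simulated data (`sim` is a hypothetical data frame with deliberately unequal spreads, not the study data).

```r
set.seed(2)
# Simulated stand-in with unequal group spreads, NOT the study data
sim <- data.frame(
  Handedness = factor(rep(c("Left", "Right"), each = 8)),
  IQ = c(rnorm(8, mean = 98, sd = 3), rnorm(8, mean = 108, sd = 9))
)

# Bartlett: H0 = the group variances are equal
bt <- bartlett.test(IQ ~ Handedness, data = sim)

# Welch one-way test: compares group means without assuming equal variances
wt <- oneway.test(IQ ~ Handedness, data = sim, var.equal = FALSE)

print(bt)
print(wt)
```

On the real data the same calls would use `data = IQData`; with only two groups the Welch one-way test is equivalent to Welch's two-sample t-test.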
Let’s perform a one-way ANOVA for each factor to test whether mean IQ differs significantly between its groups:
anova(lm(IQData$IQ~IQData$Handedness))
## Analysis of Variance Table
##
## Response: IQData$IQ
## Df Sum Sq Mean Sq F value Pr(>F)
## IQData$Handedness 1 495.06 495.06 16.666 0.00112 **
## Residuals 14 415.88 29.71
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lm(IQData$IQ~IQData$Gender))
## Analysis of Variance Table
##
## Response: IQData$IQ
## Df Sum Sq Mean Sq F value Pr(>F)
## IQData$Gender 1 10.56 10.563 0.1642 0.6914
## Residuals 14 900.37 64.312
anova(lm(IQData$IQ~IQData$AgeGroup))
## Analysis of Variance Table
##
## Response: IQData$IQ
## Df Sum Sq Mean Sq F value Pr(>F)
## IQData$AgeGroup 1 14.06 14.063 0.2195 0.6466
## Residuals 14 896.87 64.062
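For Handedness, the large F value (16.67) and p = 0.0011 < 0.05 indicate that mean IQ differs significantly between left- and right-handed participants, so we reject H0 of equal means in favor of HA. Because Bartlett’s test found unequal variances for this factor, the result should be read with some caution. Gender (p = 0.6914) and AgeGroup (p = 0.6466) show no significant effect on IQ.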
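The one-way models above ignore Block and Repetition. The ten rows printed by `str()` are consistent with two repetitions of a 2^3 factorial in which Block follows the parity of the three factor codes, which would alias Block with the three-way interaction. Assuming that pattern holds for all 16 rows (an assumption, not something the report states), a single model with blocks, main effects, and two-way interactions might look like the sketch below. The `design` data frame and its IQ values are simulated, not the study data.

```r
set.seed(3)
# Two repetitions of all 2^3 treatment combinations (codes as in the brief)
design <- expand.grid(Handedness = 0:1, Gender = 0:1, AgeGroup = 0:1,
                      Repetition = 0:1)

# Assumption: Block is the parity of the three codes, so it is aliased
# with the Handedness:Gender:AgeGroup interaction
design$Block <- (design$Handedness + design$Gender + design$AgeGroup) %% 2

# Simulated response, NOT the study data: a handedness effect plus noise
design$IQ <- 98 + 10 * design$Handedness + rnorm(nrow(design), sd = 5)

# Convert the five design columns to factors
design[1:5] <- lapply(design[1:5], factor)

# Blocks and repetitions enter as nuisance terms; ^2 expands to the
# main effects plus all two-way interactions
fit <- aov(IQ ~ Repetition + Block + (Handedness + Gender + AgeGroup)^2,
           data = design)
print(summary(fit))
```

The three-way interaction is left out on purpose: under the assumed blocking it cannot be separated from the Block effect.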
Let’s draw interaction plots to see whether any patterns emerge when combining variables:
interaction.plot(IQData$Gender,IQData$AgeGroup,IQData$IQ,ylab="mean of IQ",xlab="Gender",cex.lab=2)
interaction.plot(IQData$Gender,IQData$Handedness,IQData$IQ)
The interaction plots suggest a possible interaction between gender and age group; this pattern is visual only and was not tested formally.
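The pattern in an interaction plot can be read off numerically from the cell means it draws. The sketch below uses a small hand-made data frame (`toy_int`, hypothetical values chosen so that the lines cross) to show the idea; it is not the study data.

```r
# Hypothetical values chosen to produce crossing lines, NOT the study data
toy_int <- data.frame(
  Gender   = rep(c("Male", "Female"), each = 4),
  AgeGroup = rep(c("Younger", "Older"), times = 4),
  IQ       = c(106, 100, 108, 102, 99, 105, 101, 107)
)

# Cell means: rows = Gender, columns = AgeGroup (what interaction.plot draws)
cell <- with(toy_int, tapply(IQ, list(Gender, AgeGroup), mean))
print(cell)

# Younger-minus-Older gap within each gender; opposite signs mean the
# two age-group lines cross between the genders
gap <- cell[, "Younger"] - cell[, "Older"]
print(gap)
lines_cross <- prod(sign(gap)) < 0
print(lines_cross)
```

Crossing lines only hint at an interaction; whether it is statistically significant would still need an interaction term in a model.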
Conclusion:
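The density plot suggests that IQ is approximately normally distributed. One-way ANOVA found that right-handed individuals have a significantly higher mean IQ than left-handed individuals (108.6 vs. 97.5; F = 16.67, p = 0.0011), although the unequal variances flagged by Bartlett’s test call for some caution. Gender and age group showed no significant main effect. The interaction plot suggests that males have a higher IQ in the younger group and females in the older group, but this interaction was not tested formally.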