One-way ANOVA with different species of lizard from Europe. The data set records the lengths of lizards from three species. With the ANOVA test, we want to see whether at least one species differs in mean length from the others.

Assumption 1: All samples are independent and come from more than two independent categorical groups.

\(H_0\): There is no difference between the mean lengths of the three species, that is, the mean lengths are equal.
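
In symbols, the hypothesis and the statistic that tests it can be sketched as follows. The notation is an addition and not part of the original worksheet: \(\mu_1, \mu_2, \mu_3\) stand for the population mean lengths of the three species, \(k = 3\) is the number of groups, and \(N\) is the total number of lizards measured.

\[
H_0:\ \mu_1 = \mu_2 = \mu_3
\qquad
H_A:\ \text{at least one } \mu_i \text{ differs from the others}
\]

\[
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
\]

A large \(F\) means the group means vary more than the spread within groups would lead us to expect under \(H_0\).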

We will conduct the experiment following the steps below:

1. First load the data and call it “dataoneway”.

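# Read the text file; header = TRUE takes the column names from the first row
dataoneway <- read.table("onewayanova.txt", header = TRUE)
# dataoneway <- read.table("series.txt", header = TRUE)
# attach() is not needed: every later call refers to dataoneway explicitly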
  2. What are the names of the columns? Print the names. How many groups are there?
names(dataoneway)
## [1] "Group"  "Length"
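  3. Categorize “Group” by converting it to a factor.
# Convert Group to a factor
dataoneway$Group <- as.factor(dataoneway$Group)
# Labels are matched to the existing levels in their sorted order
dataoneway$Group <- factor(dataoneway$Group, labels = c("Wall Lizard", "Viviparous Lizard", "Snake-eyed Lizard"))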
  4. Check that you have done the classification correctly!
dataoneway$Group
##   [1] Wall Lizard       Wall Lizard       Wall Lizard      
##   [4] Wall Lizard       Wall Lizard       Wall Lizard      
##   [7] Wall Lizard       Wall Lizard       Wall Lizard      
##  [10] Wall Lizard       Wall Lizard       Wall Lizard      
##  [13] Wall Lizard       Wall Lizard       Wall Lizard      
##  [16] Wall Lizard       Wall Lizard       Wall Lizard      
##  [19] Wall Lizard       Wall Lizard       Wall Lizard      
##  [22] Wall Lizard       Wall Lizard       Wall Lizard      
##  [25] Wall Lizard       Wall Lizard       Wall Lizard      
##  [28] Wall Lizard       Wall Lizard       Wall Lizard      
##  [31] Wall Lizard       Wall Lizard       Wall Lizard      
##  [34] Wall Lizard       Wall Lizard       Viviparous Lizard
##  [37] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [40] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [43] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [46] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [49] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [52] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [55] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [58] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [61] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [64] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [67] Viviparous Lizard Viviparous Lizard Viviparous Lizard
##  [70] Viviparous Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [73] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [76] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [79] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [82] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [85] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [88] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [91] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [94] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
##  [97] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
## [100] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
## [103] Snake-eyed Lizard Snake-eyed Lizard Snake-eyed Lizard
## Levels: Wall Lizard Viviparous Lizard Snake-eyed Lizard
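
Printing every value works, but a count per level is easier to read. The sketch below also names the levels explicitly, so each label is tied to a specific code rather than to the sorted order of the values. The raw codes 1, 2, 3 and the names `raw_group` and `species` are assumptions for illustration only; the actual coding of Group in onewayanova.txt is not shown here.

```r
# Hypothetical raw codes: the real coding of Group in onewayanova.txt is an assumption
raw_group <- rep(c(1, 2, 3), each = 35)

# Naming the levels explicitly ties each label to one specific code,
# instead of relying on the sorted order of the unique values
species <- factor(
  raw_group,
  levels = c(1, 2, 3),
  labels = c("Wall Lizard", "Viviparous Lizard", "Snake-eyed Lizard")
)

levels(species)  # the three labels, in the order given above
table(species)   # number of observations per level
```

On the real data, table(dataoneway$Group) would give the same kind of summary in one line.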
  5. In this experiment, the dependent variable is length, which is continuous, so assumption 2 of the ANOVA test (the dependent variable is continuous) is fulfilled. Next we have to check assumption 3: there are no major outliers. You can check this in R.

(a) First create Group1, Group2, and Group3, one subset of the data for each level of “Group”.
Group1 <- subset(dataoneway, Group == "Wall Lizard")
Group2 <- subset(dataoneway, Group == "Viviparous Lizard")
Group3 <- subset(dataoneway, Group == "Snake-eyed Lizard")
(b) Draw the normal quantile plot for each group and see whether there are any major outliers in each group.
qqnorm(Group1$Length)
qqline(Group1$Length)

qqnorm(Group2$Length)
qqline(Group2$Length)

qqnorm(Group3$Length)
qqline(Group3$Length)
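
Reading outliers off a Q-Q plot is a judgment call, so a numeric rule can serve as a cross-check. The sketch below applies the 1.5 × IQR rule used by box plots, via boxplot.stats(), to each group separately. The data frame `sim` is a made-up stand-in for the lizard data, with hypothetical group names and means, and one extreme value is planted on purpose so the rule has something to flag.

```r
# Hypothetical stand-in for the lizard data: three groups of 35 lengths
# with made-up means and a fixed, symmetric spread
spread <- seq(-1.5, 1.5, length.out = 35)
sim <- data.frame(
  Group  = factor(rep(c("A", "B", "C"), each = 35)),
  Length = c(6.0 + spread, 5.3 + spread, 5.9 + spread)
)

# Plant one extreme value in group A for demonstration purposes
sim$Length[1] <- 12

# For each group, return the values lying beyond 1.5 * IQR from the hinges
flagged <- lapply(split(sim$Length, sim$Group), function(x) boxplot.stats(x)$out)
flagged
```

An empty result for a group means the rule found nothing to flag there. The same call on the real data would use dataoneway$Length and dataoneway$Group.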

  6. Before doing the ANOVA, check assumption 4: homogeneity of variance across the groups.
bartlett.test(Length ~ Group, data = dataoneway)
## 
##  Bartlett test of homogeneity of variances
## 
## data:  Length by Group
## Bartlett's K-squared = 0.43292, df = 2, p-value = 0.8054
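
It can help to look at the per-group variances that the Bartlett test compares, and to store the test result so the p-value can be used in code. The following is a minimal sketch on made-up data: `sim`, its group names, means, and spreads are all hypothetical and only stand in for the lizard measurements.

```r
# Hypothetical stand-in data: three groups of 35 lengths with slightly different spreads
spread <- seq(-1.5, 1.5, length.out = 35)
sim <- data.frame(
  Group  = factor(rep(c("A", "B", "C"), each = 35)),
  Length = c(6.0 + spread, 5.3 + 1.1 * spread, 5.9 + 0.95 * spread)
)

# Sample variance of Length within each group
group_vars <- tapply(sim$Length, sim$Group, var)
group_vars

# Keep the test object so its components can be reused
bt <- bartlett.test(Length ~ Group, data = sim)
bt$p.value > 0.05  # TRUE means no evidence against equal variances at the 5% level
```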
  7. What is the p-value from the bartlett.test? Is it > 0.05? What does it mean?

    The p-value is 0.8054. Because it is > 0.05, we cannot reject the null hypothesis of the Bartlett test, which states that the group variances are equal. There is no evidence that the variances of the three groups differ, so assumption 4 is reasonable.

  8. For the ANOVA test, fit a linear model of “Length” against “Group” and call it model1. Then run the ANOVA with anova(model1).

model1 <- lm(Length ~ Group, data = dataoneway)
anova(model1)
## Analysis of Variance Table
## 
## Response: Length
##            Df Sum Sq Mean Sq F value Pr(>F)   
## Group       2 10.615  5.3074  7.0982 0.0013 **
## Residuals 102 76.267  0.7477                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
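
The F value and Pr(>F) in the table can be traced back to the two mean squares. As a short check, the sketch below copies the numbers from the table above, divides the Group mean square by the Residuals mean square, and looks up the upper tail of the F distribution with pf(). The variable names are chosen here for illustration.

```r
# Values copied from the ANOVA table above
ms_group <- 5.3074   # mean square for Group
ms_resid <- 0.7477   # mean square for Residuals
df_group <- 2
df_resid <- 102

# F statistic: between-group mean square over within-group mean square
f_value <- ms_group / ms_resid

# Upper-tail probability of the F distribution gives Pr(>F)
p_value <- pf(f_value, df_group, df_resid, lower.tail = FALSE)

round(f_value, 2)
signif(p_value, 2)
```

Up to rounding of the printed mean squares, both results should agree with the F value and Pr(>F) columns of the table.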
  9. Report the p-value. What can you conclude about the null hypothesis?

    The p-value is 0.0013, which is below 0.05, so we can reject the null hypothesis: at least one species has a mean length that differs from the others.

  10. The ANOVA does not tell us which species differ from the others. We will check this with the post-hoc test TukeyHSD.

TukeyHSD(aov(model1))
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = model1)
## 
## $Group
##                                           diff        lwr        upr
## Viviparous Lizard-Wall Lizard       -0.7200000 -1.2116284 -0.2283716
## Snake-eyed Lizard-Wall Lizard       -0.1028571 -0.5944855  0.3887713
## Snake-eyed Lizard-Viviparous Lizard  0.6171429  0.1255145  1.1087713
##                                         p adj
## Viviparous Lizard-Wall Lizard       0.0020955
## Snake-eyed Lizard-Wall Lizard       0.8726158
## Snake-eyed Lizard-Viviparous Lizard 0.0098353
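What can you say from the p-values?
The snake-eyed and wall lizards do not differ significantly in mean length (adjusted p = 0.87). The viviparous lizard differs significantly from both the wall lizard (adjusted p = 0.0021) and the snake-eyed lizard (adjusted p = 0.0098). The signs of the differences show that the viviparous lizard is the shortest on average, by about 0.72 cm and 0.62 cm respectively.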
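
With more groups the Tukey table gets long, so it can be convenient to pull out the significant pairs in code. The sketch below does this on made-up data: `sim`, its group names A, B, C, and its means are hypothetical values chosen to resemble the pattern above, not the lizard measurements. The `$Group` component of the TukeyHSD result is a matrix whose last column is named "p adj".

```r
# Hypothetical stand-in data: three groups of 35 lengths with made-up means
spread <- seq(-1.5, 1.5, length.out = 35)
sim <- data.frame(
  Group  = factor(rep(c("A", "B", "C"), each = 35)),
  Length = c(6.0 + spread, 5.3 + spread, 5.9 + spread)
)

# Fit the one-way ANOVA and run Tukey's post-hoc comparisons
fit <- aov(Length ~ Group, data = sim)
tk  <- TukeyHSD(fit)$Group   # matrix with columns diff, lwr, upr, p adj

# Names of the pairs whose adjusted p-value is below 0.05
significant <- rownames(tk)[tk[, "p adj"] < 0.05]
significant
```

For the worksheet data, the same indexing would be applied to TukeyHSD(aov(model1))$Group.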
  11. Visualize the data with ggplot2:
library(ggplot2)

# Box plot of length for each species
ggplot(dataoneway, aes(x = Group, y = Length)) +
  geom_boxplot(fill = "grey80", colour = "black") +
  xlab("Species") +
  ylab("Length (cm)")
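
A small table of group sizes, means, and standard deviations is a useful companion to the box plot when reporting results. The following is a minimal base R sketch on made-up data: `sim`, its group names, and its means are hypothetical stand-ins for the lizard measurements.

```r
# Hypothetical stand-in data: three groups of 35 lengths with made-up means
spread <- seq(-1.5, 1.5, length.out = 35)
sim <- data.frame(
  Group  = factor(rep(c("A", "B", "C"), each = 35)),
  Length = c(6.0 + spread, 5.3 + spread, 5.9 + spread)
)

# Per-group sample size, mean, and standard deviation
summary_tbl <- data.frame(
  n    = as.vector(table(sim$Group)),
  mean = as.vector(tapply(sim$Length, sim$Group, mean)),
  sd   = as.vector(tapply(sim$Length, sim$Group, sd)),
  row.names = levels(sim$Group)
)
summary_tbl
```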