One-way ANOVA with different species of lizard from Europe. Data were collected on three species of lizard, and their lengths were recorded. With the ANOVA test, we want to see whether the mean length differs between the species.
Assumption 1: All samples are independent, and collected in more than two independent categorical groups.
\(H_0\): There is no difference between the mean lengths of the three species, i.e. \(\mu_1 = \mu_2 = \mu_3\).
We will conduct the experiment following the steps below:
dataoneway <- read.table("onewayanova.txt", header = TRUE)
# dataoneway <- read.table("series.txt", header = TRUE) ## Example
names(dataoneway)
## [1] "Group" "Length"
dataoneway$Group <- as.factor(dataoneway$Group)
dataoneway$Group <- factor(dataoneway$Group, labels = c("Wall lizard", "Viviparous lizard", "Snake-eyed lizard"))
Check if you have done the classification correctly!
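One way to run this check is sketched below on a small made-up data frame. The name `toy`, its values, and the numeric codes 1, 2, 3 are assumptions for illustration; the actual coding used in onewayanova.txt is not shown here. `factor()` assigns `labels` in the order of the levels, so it is worth confirming that order and the group counts before going further.

```r
# Hypothetical stand-in for dataoneway; the codes 1, 2, 3 are an assumption
toy <- data.frame(
  Group  = c(1, 2, 3, 1, 2, 3),
  Length = c(5.1, 4.4, 5.0, 5.3, 4.6, 4.9)
)

# Spell out the levels so each label is tied to a known code
toy$Group <- factor(
  toy$Group,
  levels = c(1, 2, 3),
  labels = c("Wall lizard", "Viviparous lizard", "Snake-eyed lizard")
)

levels(toy$Group)  # order of the labels
table(toy$Group)   # number of rows per species
str(toy)           # Group should be a factor, Length numeric
```

If the counts per species do not match what you expect from the raw file, the labels were probably attached to the wrong codes.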
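In this experiment, the dependent variable is length, which is continuous, so assumption 2 of the ANOVA test (the dependent variable is continuous) is fulfilled. Next, we have to check assumption 3: no major outliers. You can check this in R with a normal Q-Q plot for each group; points that fall far away from the rest of the pattern indicate possible outliers.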
Group1 = subset(dataoneway, Group == "Wall lizard")
Group2 = subset(dataoneway, Group == "Viviparous lizard")
Group3 = subset(dataoneway, Group == "Snake-eyed lizard" )
qqnorm(Group1$Length)
qqnorm(Group2$Length)
qqnorm(Group3$Length)
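A Q-Q plot is easier to read with a reference line, and a numeric outlier check can back up the visual one. The sketch below uses simulated data, not the original measurements: the data frame `sim`, its group means, and its standard deviation are all assumptions chosen only to make the example runnable. It lists the points that the boxplot rule (beyond 1.5 times the interquartile range from the box) would flag in each group.

```r
# Simulated stand-in data (an assumption, not the real lizard data)
set.seed(1)
groups <- c("Wall lizard", "Viviparous lizard", "Snake-eyed lizard")
sim <- data.frame(
  Group  = factor(rep(groups, each = 35), levels = groups),
  Length = c(rnorm(35, 5.0, 0.85), rnorm(35, 4.3, 0.85), rnorm(35, 4.9, 0.85))
)

# Points flagged by the boxplot rule, per group
outliers <- tapply(sim$Length, sim$Group, function(x) boxplot.stats(x)$out)
print(outliers)

# One Q-Q plot per group, each with a reference line
op <- par(mfrow = c(1, 3))
for (g in levels(sim$Group)) {
  x <- sim$Length[sim$Group == g]
  qqnorm(x, main = g)
  qqline(x)
}
par(op)
```

With the real data, replace `sim` with `dataoneway`. A group whose flagged values sit far from the line in its Q-Q plot deserves a closer look before running the ANOVA.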
bartlett.test(Length ~ Group, data = dataoneway)
##
## Bartlett test of homogeneity of variances
##
## data: Length by Group
## Bartlett's K-squared = 0.43292, df = 2, p-value = 0.8054
The p-value in our bartlett.test output is about 0.81, which is greater than 0.05, so we fail to reject the null hypothesis of equal variances. The data are consistent with homogeneity of variance across the three groups.
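To see what Bartlett's test is comparing, it can help to print the variance of each group next to the test result. This is a minimal sketch on simulated data; `sim`, the seed, and the `alpha` threshold variable are assumptions for illustration, not part of the original analysis.

```r
# Simulated stand-in data with equal true variances (an assumption)
set.seed(2)
groups <- c("Wall lizard", "Viviparous lizard", "Snake-eyed lizard")
sim <- data.frame(
  Group  = factor(rep(groups, each = 35), levels = groups),
  Length = rnorm(105, mean = 5, sd = 0.85)
)

# Sample variance of Length within each group
group_var <- tapply(sim$Length, sim$Group, var)
print(group_var)

# Bartlett's test of the null hypothesis that all group variances are equal
bt <- bartlett.test(Length ~ Group, data = sim)
alpha <- 0.05
if (bt$p.value > alpha) {
  message("Fail to reject equal variances (p = ", round(bt$p.value, 4), ")")
} else {
  message("Reject equal variances (p = ", round(bt$p.value, 4), ")")
}
```

Bartlett's test is sensitive to departures from normality, so it is best read together with the Q-Q plots.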
8. For the ANOVA test, create a linear model of Length versus Group and call it model1. Then run the ANOVA.
model1 = lm(Length ~ Group, data = dataoneway)
anova(model1)
## Analysis of Variance Table
##
## Response: Length
##            Df Sum Sq Mean Sq F value Pr(>F)
## Group       2 10.615  5.3074  7.0982 0.0013 **
## Residuals 102 76.267  0.7477
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value in this ANOVA table is 0.0013, which is less than 0.05, so we reject the null hypothesis that the mean lengths of the three species are equal. At least one species differs in mean length from the others.
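The F value and p-value in the table can be recomputed by hand from the sums of squares and degrees of freedom, which shows where they come from. The sketch below only uses numbers copied from the ANOVA table; the variable names are my own.

```r
# Values copied from the ANOVA table
ss_group <- 10.615
df_group <- 2
ss_resid <- 76.267
df_resid <- 102

# Mean square = sum of squares / degrees of freedom
ms_group <- ss_group / df_group
ms_resid <- ss_resid / df_resid

# F is the ratio of between-group to within-group mean squares
f_value <- ms_group / ms_resid

# Upper-tail probability of the F distribution gives Pr(>F)
p_value <- pf(f_value, df_group, df_resid, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```

Up to rounding, the result should agree with the F value and Pr(>F) columns printed by `anova(model1)`.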
TukeyHSD(aov(model1))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = model1)
##
## $Group
##                                           diff        lwr        upr
## Viviparous lizard-Wall lizard       -0.7200000 -1.2116284 -0.2283716
## Snake-eyed lizard-Wall lizard       -0.1028571 -0.5944855  0.3887713
## Snake-eyed lizard-Viviparous lizard  0.6171429  0.1255145  1.1087713
##                                         p adj
## Viviparous lizard-Wall lizard       0.0020955
## Snake-eyed lizard-Wall lizard       0.8726158
## Snake-eyed lizard-Viviparous lizard 0.0098353
The adjusted p-values compare each pair of species. Viviparous lizard versus Wall lizard has an adjusted p-value of 0.0021, which is below 0.05, so we reject the null hypothesis that their mean lengths are equal; the estimated difference of -0.72 means that Viviparous lizards are shorter on average. The same conclusion holds for Snake-eyed lizard versus Viviparous lizard (adjusted p-value 0.0098, difference 0.62). Snake-eyed lizard versus Wall lizard has an adjusted p-value of 0.8726, which is greater than 0.05, so we fail to reject the null hypothesis. The data show no significant difference in mean length between Snake-eyed and Wall lizards, although this does not prove that their means are equal.
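The Tukey confidence intervals can also be rebuilt from the ANOVA table, which makes the link between the two outputs explicit. This is a sketch under one assumption: a balanced design with 35 lizards per species. The total of 105 follows from the degrees of freedom (102 + 2 + 1), and the equal widths of the three intervals suggest equal group sizes, but the group sizes are not printed in the output. A pair differs significantly at the 5% family-wise level when its interval excludes zero.

```r
mse      <- 0.7477  # residual mean square from the ANOVA table
df_resid <- 102     # residual degrees of freedom
k        <- 3       # number of species
n        <- 35      # assumption: equal group sizes, 105 / 3

# Critical value of the studentized range distribution
q_crit <- qtukey(0.95, nmeans = k, df = df_resid)

# Half-width of each Tukey interval in a balanced design
half_width <- q_crit * sqrt(mse / n)

# Differences in means copied from the TukeyHSD output
diffs <- c(
  "Viviparous lizard-Wall lizard"       = -0.7200000,
  "Snake-eyed lizard-Wall lizard"       = -0.1028571,
  "Snake-eyed lizard-Viviparous lizard" =  0.6171429
)

ci <- cbind(diff = diffs, lwr = diffs - half_width, upr = diffs + half_width)
significant <- ci[, "lwr"] > 0 | ci[, "upr"] < 0

print(round(ci, 4))
print(significant)
```

The pairs flagged here are the same ones with adjusted p-values below 0.05 in the TukeyHSD output.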
library(ggplot2)
ggplot(dataoneway, aes(x = Group, y = Length)) +
  geom_boxplot(fill = "grey80", colour = "black") +
  labs(x = "Species", y = "Length (cm)")