data <- read.csv("zoombies(1).csv")
boxplot(data$Basic,data$Conehead,data$Buckethead,names = c('basic','conehead','buckethead'))
The boxplots suggest that the three population means differ, while the spreads of the three groups (and hence their variances) look roughly equal.
\(H_0: \mu_1=\mu_2=\mu_3\)
\(H_1:\) at least one \(\mu\) differs
data1 <- data[,-1]
stacked_data <- stack(data1)
model <- aov(values~ind,data = stacked_data)
summary(model)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## ind          2  396.4  198.20    73.8 1.79e-14 ***
## Residuals   42  112.8    2.69
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
With a p-value of 1.79e-14, we reject the null hypothesis at \(\alpha=0.05\): at least one group mean differs from the others.
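The F value and p-value in the table can be reproduced by hand from the sums of squares and degrees of freedom. The sketch below uses only the rounded numbers printed by `summary(model)`, so the result agrees with the table up to rounding; the variable names are illustrative.

```r
# Values copied from the summary(model) output (rounded as printed)
ss_between <- 396.4; df_between <- 2    # row "ind"
ss_within  <- 112.8; df_within  <- 42   # row "Residuals"

ms_between <- ss_between / df_between   # mean square between groups
ms_within  <- ss_within / df_within     # mean square within groups (MSE)
f_stat     <- ms_between / ms_within    # F statistic

# Upper-tail probability of the F(2, 42) distribution
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```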
plot(model,2)
From a visual inspection of the normal Q-Q plot, the residuals appear approximately normal; however, the points show a slight "S" shape, which could indicate somewhat long tails.
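A formal test can back up the visual impression. The sketch below applies the Shapiro-Wilk test to the residuals of a one-way ANOVA. So that it runs on its own, it uses simulated stand-in data (`demo`, with arbitrary means and spread), not the data from the CSV file; in this analysis the call would presumably be `shapiro.test(residuals(model))`.

```r
# Stand-in data so the sketch is self-contained; NOT the data analysed above.
set.seed(1)
demo <- data.frame(
  values = c(rnorm(15, 10, 1.6), rnorm(15, 15, 1.6), rnorm(15, 8, 1.6)),
  ind    = factor(rep(c("Basic", "Conehead", "Buckethead"), each = 15))
)
demo_model <- aov(values ~ ind, data = demo)

# Shapiro-Wilk test; H0: the residuals come from a normal distribution
sw <- shapiro.test(residuals(demo_model))
print(sw)
```

A large p-value here would mean there is no evidence against normality; a small one would support the long-tail concern.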
plot(model,5)
From a visual inspection of this plot, the residuals have a similar spread in each group, so the constant-variance assumption seems reasonable.
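The equal-variance assumption can also be checked with a test. The sketch below uses Bartlett's test, which is one option among several and is itself sensitive to non-normality. It runs on simulated stand-in data (`demo`, arbitrary values), not on the CSV data; for this analysis the call would presumably be `bartlett.test(values ~ ind, data = stacked_data)`.

```r
# Stand-in data so the sketch is self-contained; NOT the data analysed above.
set.seed(2)
demo <- data.frame(
  values = c(rnorm(15, 10, 1.6), rnorm(15, 15, 1.6), rnorm(15, 8, 1.6)),
  ind    = factor(rep(c("Basic", "Conehead", "Buckethead"), each = 15))
)

# Bartlett's test; H0: all groups have the same variance
bt <- bartlett.test(values ~ ind, data = demo)
print(bt)
```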
a <- TukeyHSD(model)
print(a)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = values ~ ind, data = stacked_data)
##
## $ind
##                     diff       lwr        upr     p adj
## Conehead-Basic       5.2  3.746165  6.6538348 0.0000000
## Buckethead-Basic    -1.8 -3.253835 -0.3461652 0.0120723
## Buckethead-Conehead -7.0 -8.453835 -5.5461652 0.0000000
plot(a)
None of the intervals contains 0, and all adjusted p-values are below 0.05, so the differences between all pairs are significant at \(\alpha=0.05\). The plot of the intervals shows the same thing: 0 lies outside every pairwise interval, which agrees with our conclusion.
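The Tukey intervals can be reconstructed from the ANOVA table. For a balanced design, each interval is the difference in means plus or minus \(q_{0.95;\,k,\,df}\sqrt{MSE/n}\). The sketch below assumes 15 observations per group, which is inferred from the 42 residual degrees of freedom with 3 groups and from the equal interval widths; it is not stated in the output.

```r
mse      <- 112.8 / 42   # residual mean square from the ANOVA table
k        <- 3            # number of groups
df_resid <- 42           # residual degrees of freedom
n        <- 15           # ASSUMED group size: (42 + 3) / 3

# Critical value of the studentized range distribution
q_crit     <- qtukey(0.95, nmeans = k, df = df_resid)
half_width <- q_crit * sqrt(mse / n)

# Interval for Conehead-Basic, using the reported difference of 5.2
ci <- c(lwr = 5.2 - half_width, upr = 5.2 + half_width)
print(ci)
```

The result should match the `lwr` and `upr` columns for Conehead-Basic up to the rounding of the sums of squares.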