data <- read.csv("zoombies(1).csv")

Question 1

boxplot(data$Basic, data$Conehead, data$Buckethead, names = c('basic', 'conehead', 'buckethead'))

From these boxplots, the means of the three populations appear to differ, while their variances seem to be roughly the same.

Question 2

\(H_0: \mu_1=\mu_2=\mu_3\)

\(H_1:\) at least one \(\mu\) differs

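# Drop the first column, keeping only the three group columns
data1 <- data[, -1]
# Reshape to long format: 'values' holds the observations, 'ind' the group label
stacked_data <- stack(data1)
# One-way ANOVA of the observations on group
model <- aov(values ~ ind, data = stacked_data)
summary(model)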

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## ind          2  396.4  198.20    73.8 1.79e-14 ***
## Residuals   42  112.8    2.69                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

With a p-value of 1.79e-14, we reject the null hypothesis at \(\alpha=0.05\), meaning that at least one group mean differs from the others.
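As a rough check on the table, the F statistic and its p-value can be recomputed by hand from the sums of squares and degrees of freedom printed by summary(model). This is only a sketch: the variable names are made up for illustration, and the inputs are the rounded values from the output, so the results will match the table only up to rounding.

```r
# Values read off the ANOVA table above (rounded as printed)
ss_between <- 396.4   # Sum Sq for ind
df_between <- 2       # Df for ind
ss_within  <- 112.8   # Sum Sq for Residuals
df_within  <- 42      # Df for Residuals

# Mean squares are sums of squares divided by their degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F statistic and its upper-tail probability under H0
f_stat  <- ms_between / ms_within
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```

The recomputed F should be close to the 73.8 in the table, and the p-value should be far below 0.05.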

Question 3

plot(model,2)

On visual inspection of this Q-Q plot, the residuals appear to be approximately normally distributed. However, there is a hint of an “S” shape, which suggests the distribution may be somewhat long-tailed.

plot(model,5)

On visual inspection of this plot, the spread of the residuals looks similar across the three groups, so the constant-variance assumption seems reasonable.

Question 4

a <- TukeyHSD(model)
print(a)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = values ~ ind, data = stacked_data)
## 
## $ind
##                     diff       lwr        upr     p adj
## Conehead-Basic       5.2  3.746165  6.6538348 0.0000000
## Buckethead-Basic    -1.8 -3.253835 -0.3461652 0.0120723
## Buckethead-Conehead -7.0 -8.453835 -5.5461652 0.0000000

plot(a)

Since 0 is not included in any of the three confidence intervals, and all adjusted p-values are below 0.05, the differences between all pairs of group means are significant at \(\alpha=0.05\). The plot of the intervals shows the same thing: none of the pairwise intervals contains 0, which agrees with our conclusion.
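The Tukey intervals can also be reproduced approximately from the ANOVA table. The sketch below assumes a balanced design with 15 observations per group, which is inferred from the 45 total observations (42 residual degrees of freedom plus 3 groups) and the equal interval widths, not stated in the data description. The variable names are illustrative only.

```r
# Values read off the ANOVA and Tukey output above
ss_resid <- 112.8   # residual sum of squares
df_resid <- 42      # residual degrees of freedom
k <- 3              # number of groups
n <- 15             # ASSUMED observations per group: (42 + 3) / 3

# Mean squared error and the studentized range critical value
mse    <- ss_resid / df_resid
q_crit <- qtukey(0.95, nmeans = k, df = df_resid)

# For a balanced design the Tukey half-width is q * sqrt(MSE / n)
half_width <- q_crit * sqrt(mse / n)

# Pairwise differences as printed by TukeyHSD
diffs <- c("Conehead-Basic" = 5.2,
           "Buckethead-Basic" = -1.8,
           "Buckethead-Conehead" = -7.0)
intervals <- cbind(diff = diffs,
                   lwr  = diffs - half_width,
                   upr  = diffs + half_width)
print(round(intervals, 3))
```

If the balanced-design assumption holds, the lwr and upr columns should agree with the TukeyHSD output up to rounding, and none of the intervals should contain 0.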

Midterm_question 6

2022-10-20
