
Title: Design of Experiments-Project
Aim: Test whether the mean landing distances of three different balls are the same or different.
Instructor: Dr. Timothy Matis
Authors: Sujit Thakur, Tajammul Mohammed, Gowtam Sasikumar
Determining the sample size using a power calculation
library(pwr)  # provides pwr.anova.test()
pwr.anova.test(k = 3, n = NULL, f = 0.5, sig.level = 0.05, power = 0.75)
##
## Balanced one-way analysis of variance power calculation
##
## k = 3
## n = 12.50714
## f = 0.5
## sig.level = 0.05
## power = 0.75
##
## NOTE: n is number in each group
The power calculation gives n = 12.51 per group; rounding up, 13 observations are required per group.
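As a cross-check on the rounding, the achieved power for a whole number of throws can be computed in base R from the noncentral F distribution. This is a minimal sketch, not the authors' code: the helper name `anova_power` is hypothetical, and it assumes the noncentrality parameter k * n * f^2 for a balanced design with Cohen's effect size f.

```r
# Power of a balanced one-way ANOVA with k groups, n observations per group
# and effect size f, using the noncentral F distribution.
# Assumption: noncentrality parameter = k * n * f^2.
anova_power <- function(n, k = 3, f = 0.5, sig.level = 0.05) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  ncp <- k * n * f^2
  f_crit <- qf(1 - sig.level, df1, df2)  # rejection threshold under H0
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

n_required <- ceiling(12.50714)  # round the fractional n up to whole throws
power_12 <- anova_power(12)
power_13 <- anova_power(n_required)

print(c(n = n_required, power_12 = power_12, power_13 = power_13))
```

Twelve throws per group fall just short of the 0.75 target and thirteen exceed it, which is why the fractional n is rounded up rather than to the nearest integer.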
Randomized Design of Sample Collection - Completely Randomized Design
library(agricolae)  # provides design.crd()
trt1 <- c("Yellow", "Red", "black")
crd <- design.crd(trt1, r = 13, seed = 1234)
crd$book
## plots r trt1
## 1 101 1 Red
## 2 102 1 Yellow
## 3 103 2 Yellow
## 4 104 1 black
## 5 105 2 Red
## 6 106 2 black
## 7 107 3 Red
## 8 108 3 Yellow
## 9 109 4 Red
## 10 110 5 Red
## 11 111 3 black
## 12 112 4 Yellow
## 13 113 4 black
## 14 114 5 Yellow
## 15 115 6 Yellow
## 16 116 5 black
## 17 117 6 black
## 18 118 6 Red
## 19 119 7 Red
## 20 120 7 Yellow
## 21 121 8 Red
## 22 122 8 Yellow
## 23 123 7 black
## 24 124 9 Red
## 25 125 9 Yellow
## 26 126 10 Yellow
## 27 127 10 Red
## 28 128 8 black
## 29 129 11 Yellow
## 30 130 11 Red
## 31 131 9 black
## 32 132 12 Yellow
## 33 133 13 Yellow
## 34 134 12 Red
## 35 135 10 black
## 36 136 11 black
## 37 137 12 black
## 38 138 13 Red
## 39 139 13 black
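The idea behind a completely randomized design can be sketched in base R: list every planned throw once, then shuffle the whole list. This is only an illustration under assumed names (`run_order`, `plan`); it will not reproduce the exact order in the table above, because it does not use the same random number generator or procedure as `design.crd`.

```r
set.seed(1234)  # any fixed seed makes the run order reproducible

colours <- c("Yellow", "Red", "black")
reps <- 13  # throws per ball, from the power calculation

# One entry per planned throw, shuffled all at once so that every
# ordering of the 39 throws is equally likely.
run_order <- sample(rep(colours, each = reps))

plan <- data.frame(run = seq_along(run_order), colour = run_order)
print(head(plan))
print(table(plan$colour))
```

Shuffling the full list, rather than drawing a colour at random for each throw, keeps the design balanced at 13 throws per ball.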
Collection of Data
library(readxl)  # provides read_excel()
data <- read_excel("Proj1-Oct5-DOE -Tidy2.xlsx")
data <- as.data.frame(data)
print.data.frame(data)
## Ball Colour Obs
## 1 black 50.0
## 2 black 51.0
## 3 black 50.0
## 4 black 48.5
## 5 black 49.0
## 6 black 53.0
## 7 black 52.0
## 8 black 51.0
## 9 black 51.0
## 10 black 51.0
## 11 black 49.5
## 12 black 50.0
## 13 black 50.0
## 14 Red 50.0
## 15 Red 51.0
## 16 Red 49.0
## 17 Red 49.0
## 18 Red 50.0
## 19 Red 49.5
## 20 Red 49.0
## 21 Red 50.5
## 22 Red 49.0
## 23 Red 50.0
## 24 Red 50.0
## 25 Red 49.0
## 26 Red 52.0
## 27 Yellow 55.0
## 28 Yellow 49.0
## 29 Yellow 53.0
## 30 Yellow 54.0
## 31 Yellow 55.0
## 32 Yellow 51.0
## 33 Yellow 50.0
## 34 Yellow 54.0
## 35 Yellow 51.0
## 36 Yellow 51.0
## 37 Yellow 56.0
## 38 Yellow 56.0
## 39 Yellow 57.0
Data Wrangling
data <- read_excel("Proj1-Oct5-DOE -Tidy2.xlsx")
data <- as.data.frame(data)
data$Obs <- as.numeric(data$Obs)
data$`Ball Colour` <- as.factor(data$`Ball Colour`)
Hypothesis Testing
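Notation:
\(\mu_1\) = mean landing distance of the red ball
\(\mu_2\) = mean landing distance of the black ball
\(\mu_3\) = mean landing distance of the yellow ball
Null Hypothesis: \(H_0: \mu_1 = \mu_2 = \mu_3 = \mu\)
Alternative Hypothesis: \(H_a\): at least one of the \(\mu_i\) differs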
first.model <- aov(data$Obs~data$`Ball Colour`,data = data)
summary(first.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## data$`Ball Colour` 2 84.51 42.26 14.05 3.08e-05 ***
## Residuals 36 108.23 3.01
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is 3.08e-05, which is smaller than the significance level of 0.05.
Hence we reject the null hypothesis and conclude that at least one of the means differs.
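The F value and p-value in the table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them from the rounded table entries, so the last digits may differ slightly from the printed output; the variable names are illustrative.

```r
# Values read from the ANOVA table above (rounded as printed).
ss_treatment <- 84.51
df_treatment <- 2
ss_residual <- 108.23
df_residual <- 36

# Mean squares are sums of squares divided by their degrees of freedom.
ms_treatment <- ss_treatment / df_treatment
ms_residual <- ss_residual / df_residual
f_value <- ms_treatment / ms_residual

# Upper-tail probability of the F distribution under the null hypothesis.
p_value <- pf(f_value, df_treatment, df_residual, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
print(p_value < 0.05)  # TRUE means the null hypothesis is rejected at the 5% level
```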
Conclusions on residual plots
The residual plots give no indication that the ANOVA assumptions of normally distributed residuals and constant variance are violated.
The residuals fall fairly close to a straight line in the normal probability plot, which suggests that they are approximately normally distributed.
The residuals-versus-fitted-values plot shows no pronounced change in spread across the fitted values, so the constant-variance assumption appears reasonable.
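The code that produced the residual plots is not shown in this report. One way such plots could be drawn is sketched below, with the observations retyped from the data listing so the block runs without the Excel file. The names `landing` and `fit` are illustrative, and this is not necessarily how the original figures were made.

```r
# Landing distances copied from the data listing (13 throws per colour).
black  <- c(50, 51, 50, 48.5, 49, 53, 52, 51, 51, 51, 49.5, 50, 50)
red    <- c(50, 51, 49, 49, 50, 49.5, 49, 50.5, 49, 50, 50, 49, 52)
yellow <- c(55, 49, 53, 54, 55, 51, 50, 54, 51, 51, 56, 56, 57)

landing <- data.frame(
  colour = factor(rep(c("black", "Red", "Yellow"), each = 13)),
  obs = c(black, red, yellow)
)

fit <- aov(obs ~ colour, data = landing)

# When run as a script, draw to a null device so no file is written;
# in an interactive session the plots appear on screen.
if (!interactive()) pdf(file = NULL)
op <- par(mfrow = c(1, 2))
plot(fit, which = 1)  # residuals vs fitted values (constant variance check)
plot(fit, which = 2)  # normal Q-Q plot of residuals (normality check)
par(op)
if (!interactive()) dev.off()
```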
Investigating pairwise comparisons
tukey_firstmodel <- TukeyHSD(first.model)
tukey_firstmodel
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = data$Obs ~ data$`Ball Colour`, data = data)
##
## $`data$`Ball Colour``
## diff lwr upr p adj
## Red-black -0.6153846 -2.277731 1.046961 0.6407829
## Yellow-black 2.7692308 1.106885 4.431577 0.0006994
## Yellow-Red 3.3846154 1.722269 5.046961 0.0000471
plot(tukey_firstmodel,col="deepskyblue")

From the TukeyHSD results and plot, the Red-black pair does not differ significantly, because zero lies inside its 95% confidence interval (p adj = 0.64).
The mean for Yellow differs from both Red and black: the Yellow-Red and Yellow-black intervals exclude zero (p adj = 0.0000471 and 0.0006994), so these pairs differ significantly.
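The interval limits in the Tukey table can be rebuilt from the ANOVA summary. For a balanced design, each interval is the observed difference plus or minus q * sqrt(MSE / n), where q is the studentized range quantile. The sketch below uses the rounded values printed in this report, so the results agree with the table only to about three decimals; the variable names are illustrative.

```r
ms_residual <- 108.23 / 36  # residual mean square from the ANOVA table
df_residual <- 36
n_per_group <- 13
k_groups <- 3

# Half-width of each 95% family-wise interval in a balanced design.
q_crit <- qtukey(0.95, nmeans = k_groups, df = df_residual)
half_width <- q_crit * sqrt(ms_residual / n_per_group)

# Observed mean differences, copied from the TukeyHSD output.
diffs <- c(
  "Red-black" = -0.6153846,
  "Yellow-black" = 2.7692308,
  "Yellow-Red" = 3.3846154
)

intervals <- cbind(
  diff = diffs,
  lwr = diffs - half_width,
  upr = diffs + half_width
)

# A pair differs significantly when its interval excludes zero.
differs <- intervals[, "lwr"] > 0 | intervals[, "upr"] < 0

print(round(intervals, 3))
print(differs)
```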