Title: Design of Experiments Project

Aim: Test whether the mean landing distance of three differently coloured balls is the same or different.

Instructor: Dr. Timothy Matis

Authors: Sujit Thakur, Tajammul Mohammed, Gowtam Sasikumar


Determining the sample size using a power calculation

library(pwr)  # provides pwr.anova.test()
pwr.anova.test(k = 3, n = NULL, f = 0.5, sig.level = 0.05, power = 0.75)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 3
##               n = 12.50714
##               f = 0.5
##       sig.level = 0.05
##           power = 0.75
## 
## NOTE: n is number in each group

The power calculation gives n = 12.51 per group, which we round up to 13 observations per group (39 observations in total).
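As a cross-check on the rounding, the power at a given group size can be sketched in base R with the noncentral F distribution. The helper name `anova_power` is hypothetical, and the noncentrality parameter k·n·f² is the form assumed here for a balanced one-way design; this is an illustration, not part of the original analysis.

```r
# Power of a balanced one-way ANOVA from the noncentral F distribution.
# anova_power is a hypothetical helper; ncp = k * n * f^2 is an assumption
# that holds for a balanced design with Cohen's effect size f.
anova_power <- function(n, k = 3, f = 0.5, sig.level = 0.05) {
  df1 <- k - 1                      # treatment degrees of freedom
  df2 <- k * (n - 1)                # residual degrees of freedom
  ncp <- k * n * f^2                # noncentrality parameter
  f_crit <- qf(1 - sig.level, df1, df2)
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

power_12 <- anova_power(12)  # one observation fewer per group
power_13 <- anova_power(13)  # the group size chosen above
print(c(n12 = power_12, n13 = power_13))
```

Comparing the two values shows why the fractional n is rounded up rather than down: only the larger group size reaches the 0.75 target.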

Randomized Design of Sample Collection - Completely Randomized Design

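library(agricolae)  # provides design.crd()
trt1 <- c("Yellow", "Red", "black")
# Completely randomized design: 13 replicates per ball colour, fixed seed for reproducibility
crd <- design.crd(trt1, r = 13, seed = 1234)
crd$book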
##    plots  r   trt1
## 1    101  1    Red
## 2    102  1 Yellow
## 3    103  2 Yellow
## 4    104  1  black
## 5    105  2    Red
## 6    106  2  black
## 7    107  3    Red
## 8    108  3 Yellow
## 9    109  4    Red
## 10   110  5    Red
## 11   111  3  black
## 12   112  4 Yellow
## 13   113  4  black
## 14   114  5 Yellow
## 15   115  6 Yellow
## 16   116  5  black
## 17   117  6  black
## 18   118  6    Red
## 19   119  7    Red
## 20   120  7 Yellow
## 21   121  8    Red
## 22   122  8 Yellow
## 23   123  7  black
## 24   124  9    Red
## 25   125  9 Yellow
## 26   126 10 Yellow
## 27   127 10    Red
## 28   128  8  black
## 29   129 11 Yellow
## 30   130 11    Red
## 31   131  9  black
## 32   132 12 Yellow
## 33   133 13 Yellow
## 34   134 12    Red
## 35   135 10  black
## 36   136 11  black
## 37   137 12  black
## 38   138 13    Red
## 39   139 13  black
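The idea behind a completely randomized design can be sketched in base R: list every treatment 13 times and shuffle the 39 runs. The names `colours`, `runs`, and `run_order` are illustrative, and the resulting order will not match the `design.crd()` output above because the random number stream is used differently.

```r
# A minimal base-R sketch of a completely randomized run order.
# Assumption: any fixed seed will do; this does not reproduce design.crd().
set.seed(1234)
colours <- c("Yellow", "Red", "black")
runs <- sample(rep(colours, each = 13))   # shuffle 13 copies of each colour
run_order <- data.frame(run = seq_along(runs), colour = runs)

head(run_order)            # first few throws in randomized order
table(run_order$colour)    # each colour still appears 13 times
```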

Collection of Data

library(readxl)  # provides read_excel()
data <- read_excel("Proj1-Oct5-DOE -Tidy2.xlsx")
data <- as.data.frame(data)
print.data.frame(data)
##    Ball Colour  Obs
## 1        black 50.0
## 2        black 51.0
## 3        black 50.0
## 4        black 48.5
## 5        black 49.0
## 6        black 53.0
## 7        black 52.0
## 8        black 51.0
## 9        black 51.0
## 10       black 51.0
## 11       black 49.5
## 12       black 50.0
## 13       black 50.0
## 14         Red 50.0
## 15         Red 51.0
## 16         Red 49.0
## 17         Red 49.0
## 18         Red 50.0
## 19         Red 49.5
## 20         Red 49.0
## 21         Red 50.5
## 22         Red 49.0
## 23         Red 50.0
## 24         Red 50.0
## 25         Red 49.0
## 26         Red 52.0
## 27      Yellow 55.0
## 28      Yellow 49.0
## 29      Yellow 53.0
## 30      Yellow 54.0
## 31      Yellow 55.0
## 32      Yellow 51.0
## 33      Yellow 50.0
## 34      Yellow 54.0
## 35      Yellow 51.0
## 36      Yellow 51.0
## 37      Yellow 56.0
## 38      Yellow 56.0
## 39      Yellow 57.0

Data Wrangling

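# Re-read the tidy data and coerce the columns to the types the analysis needs
data <- read_excel("Proj1-Oct5-DOE -Tidy2.xlsx")
data <- as.data.frame(data)
data$Obs <- as.numeric(data$Obs)                      # response: landing distance
data$`Ball Colour` <- as.factor(data$`Ball Colour`)   # treatment factor with 3 levels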
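A quick sanity check after wrangling is to confirm the column types and look at the group sizes and means. The sketch below transcribes the 39 observations from the printed table so that it runs without the Excel file; the data frame `d` and the column name `Colour` are stand-ins for `data` and `Ball Colour`.

```r
# Observations transcribed from the printed table above (assumed stand-in
# for the Excel file); Colour is an illustrative name for `Ball Colour`.
obs_black  <- c(50, 51, 50, 48.5, 49, 53, 52, 51, 51, 51, 49.5, 50, 50)
obs_red    <- c(50, 51, 49, 49, 50, 49.5, 49, 50.5, 49, 50, 50, 49, 52)
obs_yellow <- c(55, 49, 53, 54, 55, 51, 50, 54, 51, 51, 56, 56, 57)

d <- data.frame(
  Colour = factor(rep(c("black", "Red", "Yellow"), each = 13)),
  Obs    = c(obs_black, obs_red, obs_yellow)
)

str(d)                                          # Colour is a factor, Obs is numeric
group_n    <- tapply(d$Obs, d$Colour, length)   # observations per colour
group_mean <- tapply(d$Obs, d$Colour, mean)     # mean landing distance per colour
print(group_n)
print(round(group_mean, 2))
```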

Hypothesis Testing

Notation:

Red: \(\mu_1\)

Black: \(\mu_2\)

Yellow: \(\mu_3\)

Null Hypothesis: \(H_0: \mu_1 = \mu_2 = \mu_3 = \mu\)

Alternative Hypothesis: \(H_a\): at least one of the \(\mu_i\) differs
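These hypotheses can be read against the usual one-way means model. The report does not write the model out, so the form below is an assumption: the standard model for a balanced design with three groups of 13 observations.

```latex
y_{ij} = \mu_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2), \qquad
i = 1, 2, 3, \quad j = 1, \dots, 13
```

Here \(y_{ij}\) is the j-th landing distance for ball colour i, and the normality and common variance \(\sigma^2\) of the errors are the assumptions examined in the adequacy check.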

first.model <- aov(data$Obs~data$`Ball Colour`,data = data)
summary(first.model)
##                    Df Sum Sq Mean Sq F value   Pr(>F)    
## data$`Ball Colour`  2  84.51   42.26   14.05 3.08e-05 ***
## Residuals          36 108.23    3.01                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value is 3.08e-05, which is smaller than the significance level of 0.05.

Hence we reject the null hypothesis and conclude that at least one of the group means differs.
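The F value and p-value in the ANOVA table can be retraced from the sums of squares. The sketch below uses the rounded values printed in the table, so the result agrees with the table only up to rounding; the variable names are illustrative.

```r
# Recompute F and its p-value from the (rounded) ANOVA table entries.
ss_treat <- 84.51;  df_treat <- 2    # between-colour sum of squares and df
ss_resid <- 108.23; df_resid <- 36   # residual sum of squares and df

ms_treat <- ss_treat / df_treat      # mean square for ball colour
ms_resid <- ss_resid / df_resid      # mean square error
f_value  <- ms_treat / ms_resid
p_value  <- pf(f_value, df_treat, df_resid, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```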

Checking ANOVA Model Adequacy

plot(first.model,col="deepskyblue")

Conclusions on residual plots

  • From the residual plots above, the two ANOVA assumptions checked here, normality of the residuals and constant variance, do not appear to be violated.

  • The residuals fall approximately on a straight line in the normal probability plot, which indicates that they are approximately normally distributed.

  • The residuals vs fitted values plot shows no marked change in the spread of the residuals across the fitted values.
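Formal tests can complement a visual reading of the residual plots. The sketch below is not part of the original analysis: it refits the model on the observations transcribed from the printed table, using the stand-in names `d` and `Colour`, and then applies `shapiro.test()` to the residuals and `bartlett.test()` to the groups. No results are asserted here.

```r
# Assumed stand-in data, transcribed from the printed table above.
d <- data.frame(
  Colour = factor(rep(c("black", "Red", "Yellow"), each = 13)),
  Obs = c(50, 51, 50, 48.5, 49, 53, 52, 51, 51, 51, 49.5, 50, 50,   # black
          50, 51, 49, 49, 50, 49.5, 49, 50.5, 49, 50, 50, 49, 52,   # Red
          55, 49, 53, 54, 55, 51, 50, 54, 51, 51, 56, 56, 57)       # Yellow
)

fit <- aov(Obs ~ Colour, data = d)

normality_check <- shapiro.test(residuals(fit))         # H0: residuals are normal
variance_check  <- bartlett.test(Obs ~ Colour, data = d) # H0: equal group variances
group_var       <- tapply(d$Obs, d$Colour, var)          # sample variance per colour

print(normality_check)
print(variance_check)
print(round(group_var, 2))
```

A small p-value from either test would suggest revisiting the corresponding assumption, for example with a transformation or a test that does not assume equal variances.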

Investigating pairwise comparisons

tukey_firstmodel <- TukeyHSD(first.model)
tukey_firstmodel
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = data$Obs ~ data$`Ball Colour`, data = data)
## 
## $`data$`Ball Colour``
##                    diff       lwr      upr     p adj
## Red-black    -0.6153846 -2.277731 1.046961 0.6407829
## Yellow-black  2.7692308  1.106885 4.431577 0.0006994
## Yellow-Red    3.3846154  1.722269 5.046961 0.0000471
plot(tukey_firstmodel,col="deepskyblue")

From the TukeyHSD results and plot, the means of Red and black do not differ significantly: the 95% confidence interval for Red-black (-2.28, 1.05) contains zero (adjusted p = 0.64).

The mean of Yellow differs significantly from the means of both Red and black: the 95% confidence intervals for Yellow-Red (1.72, 5.05) and Yellow-black (1.11, 4.43) do not contain zero.
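The width of these intervals can be retraced from the studentized range distribution. The sketch below assumes the balanced-design form of the Tukey interval and uses the rounded residual sum of squares from the ANOVA table, so it matches the printed limits only up to rounding; the variable names are illustrative.

```r
# Tukey HSD half-width for a balanced design with 3 groups of 13.
ms_resid    <- 108.23 / 36          # mean square error from the ANOVA table
n_per_group <- 13
q_crit      <- qtukey(0.95, nmeans = 3, df = 36)   # studentized range quantile

# Half-width: (q / sqrt(2)) * standard error of a difference in means
half_width <- q_crit / sqrt(2) * sqrt(ms_resid * 2 / n_per_group)

diff_red_black <- -0.6153846        # Red-black difference from the output above
ci_red_black   <- diff_red_black + c(-1, 1) * half_width
print(round(ci_red_black, 3))
```

Because the design is balanced, all three comparisons share the same half-width, which is why the three intervals in the plot have equal length.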