
Title: Design of Experiment-Project
Aim: Test whether the mean landing distances of three different balls are the same or different.
Instructor: Dr. Timoty Matis
Authors: Sujit Thakur, Tajammul Mohammed, Gowtam Sasikumar
 
Determining the sample size using a power calculation
library(pwr)  # provides pwr.anova.test()
pwr.anova.test(k=3,n=NULL,f=0.5, sig.level = 0.05 ,power = 0.75)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 3
##               n = 12.50714
##               f = 0.5
##       sig.level = 0.05
##           power = 0.75
## 
## NOTE: n is number in each group
The power calculation gives n = 12.51 per group. Rounding up, 13 observations are required per group, or 39 in total across the three balls.
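As a cross-check on the rounding step, the sketch below recomputes the power of a balanced one-way ANOVA directly from the noncentral F distribution in base R and searches for the smallest whole-number group size that reaches 75% power. The helper name `anova_power` and the search range 2 to 50 are assumptions made for illustration; the noncentrality parameter k * n * f^2 is the standard one for Cohen's f.

```r
# Power of a balanced one-way ANOVA with k groups of size n and effect size f.
# anova_power is a hypothetical helper, not part of the pwr package.
anova_power <- function(n, k, f, alpha = 0.05) {
  df1  <- k - 1                 # treatment degrees of freedom
  df2  <- k * (n - 1)           # residual degrees of freedom
  ncp  <- k * n * f^2           # noncentrality parameter for Cohen's f
  crit <- qf(1 - alpha, df1, df2)
  pf(crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Smallest integer group size whose power is at least 0.75
n_candidates <- 2:50
power_by_n   <- anova_power(n_candidates, k = 3, f = 0.5)
n_required   <- n_candidates[which(power_by_n >= 0.75)[1]]
n_required
```

Because the power is increasing in n, the first group size that crosses the target is the one to use, which is why the fractional n reported by `pwr.anova.test` is rounded up rather than to the nearest integer.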
 
Randomized Design of Sample Collection - Completely Randomized Design
library(agricolae)  # provides design.crd()
trt1 <- c("Yellow","Red","black")
crd <- design.crd(trt1,r=13,seed=1234)
crd$book
##    plots  r   trt1
## 1    101  1    Red
## 2    102  1 Yellow
## 3    103  2 Yellow
## 4    104  1  black
## 5    105  2    Red
## 6    106  2  black
## 7    107  3    Red
## 8    108  3 Yellow
## 9    109  4    Red
## 10   110  5    Red
## 11   111  3  black
## 12   112  4 Yellow
## 13   113  4  black
## 14   114  5 Yellow
## 15   115  6 Yellow
## 16   116  5  black
## 17   117  6  black
## 18   118  6    Red
## 19   119  7    Red
## 20   120  7 Yellow
## 21   121  8    Red
## 22   122  8 Yellow
## 23   123  7  black
## 24   124  9    Red
## 25   125  9 Yellow
## 26   126 10 Yellow
## 27   127 10    Red
## 28   128  8  black
## 29   129 11 Yellow
## 30   130 11    Red
## 31   131  9  black
## 32   132 12 Yellow
## 33   133 13 Yellow
## 34   134 12    Red
## 35   135 10  black
## 36   136 11  black
## 37   137 12  black
## 38   138 13    Red
## 39   139 13  black
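The idea behind the completely randomized design can also be sketched in base R: build a vector with 13 copies of each treatment and shuffle it, so that every run order is equally likely. This is only an illustration of the principle. The seed and the column names `plot` and `trt` are assumptions, and the resulting order will not match the `design.crd` output, which uses its own randomization.

```r
# Minimal base-R sketch of a completely randomized run order.
set.seed(1234)  # assumed seed; does not reproduce the agricolae ordering

treatments <- c("Yellow", "Red", "black")
reps       <- 13

# 13 copies of each treatment, shuffled into a random run order
run_order <- sample(rep(treatments, each = reps))

# Plot labels 101..139, mirroring the numbering style of the plan
plan <- data.frame(plot = 100 + seq_along(run_order), trt = run_order)

head(plan)
table(plan$trt)  # each treatment should appear 13 times
```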
 
Collection of Data
library(readxl)  # provides read_excel()
data <- read_excel("Proj1-Oct5-DOE -Tidy2.xlsx")
data
## # A tibble: 39 x 2
##    `Ball Colour`   Obs
##    <chr>         <dbl>
##  1 black          50  
##  2 black          51  
##  3 black          50  
##  4 black          48.5
##  5 black          49  
##  6 black          53  
##  7 black          52  
##  8 black          51  
##  9 black          51  
## 10 black          51  
## # ... with 29 more rows
 
Data Wrangling
data <- read_excel("Proj1-Oct5-DOE -Tidy2.xlsx")
data <- as.data.frame(data)
data$Obs <- as.numeric(data$Obs)
data$`Ball Colour` <- as.factor(data$`Ball Colour`)
 
Hypothesis Testing 
Notation:
Red: \(\mu_1\)
Black: \(\mu_2\)
Yellow: \(\mu_3\)
Null hypothesis: \(H_0: \mu_1 = \mu_2 = \mu_3 = \mu\)
Alternative hypothesis: \(H_a\): at least one of the \(\mu_i\) differs from the others
first.model <- aov(data$Obs~data$`Ball Colour`,data = data)
summary(first.model)
##                    Df Sum Sq Mean Sq F value   Pr(>F)    
## data$`Ball Colour`  2  84.51   42.26   14.05 3.08e-05 ***
## Residuals          36 108.23    3.01                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is 3.08e-05, which is smaller than the significance level of 0.05.
Hence we reject the null hypothesis and conclude that at least one of the group means differs from the others.
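The F statistic and p-value in the ANOVA table can be recomputed by hand from the sums of squares and degrees of freedom it reports. The sketch below uses only the rounded values printed in the table, so small rounding differences from the `aov` output are to be expected; the variable names are chosen for illustration.

```r
# Values copied from the ANOVA table (rounded as printed)
ss_treat <- 84.51;  df_treat <- 2    # Ball Colour
ss_resid <- 108.23; df_resid <- 36   # Residuals

# Mean squares are sums of squares divided by their degrees of freedom
ms_treat <- ss_treat / df_treat
ms_resid <- ss_resid / df_resid

# F statistic and its upper-tail probability under the null hypothesis
f_stat  <- ms_treat / ms_resid
p_value <- pf(f_stat, df_treat, df_resid, lower.tail = FALSE)

c(F = f_stat, p = p_value)
```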
 
 Conclusions on residual plots
From the residual plots, the ANOVA model assumptions of normality and constant variance do not appear to be violated.
 
The residuals fall approximately on a straight line in the normal probability plot, which indicates that they are approximately normally distributed.
 
 The residuals vs fitted values plot shows no noticeable change in spread across the fitted values, so the variance appears to be roughly constant.
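For reference, residual plots of this kind are commonly produced by calling `plot()` on the fitted `aov` object. The sketch below shows one way to do so. Because the project's spreadsheet is not reproduced here, it fits the model to simulated placeholder data (`placeholder_data`, drawn from a normal distribution with an arbitrary mean and standard deviation), so the resulting plots illustrate the procedure only and say nothing about the actual measurements.

```r
# Placeholder data only: NOT the project's measurements.
set.seed(1)
placeholder_data <- data.frame(
  colour = factor(rep(c("black", "Red", "Yellow"), each = 13)),
  obs    = rnorm(39, mean = 50, sd = 2)  # arbitrary values for illustration
)

placeholder_model <- aov(obs ~ colour, data = placeholder_data)

# Side-by-side diagnostic plots
par(mfrow = c(1, 2))
plot(placeholder_model, which = 1)  # residuals vs fitted values
plot(placeholder_model, which = 2)  # normal Q-Q plot of residuals
```

With the real data, the same two calls on `first.model` would give the plots discussed in this section.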
 
 
 
Investigating pairwise comparisons
tukey_firstmodel <- TukeyHSD(first.model)
tukey_firstmodel
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = data$Obs ~ data$`Ball Colour`, data = data)
## 
## $`data$`Ball Colour``
##                    diff       lwr      upr     p adj
## Red-black    -0.6153846 -2.277731 1.046961 0.6407829
## Yellow-black  2.7692308  1.106885 4.431577 0.0006994
## Yellow-Red    3.3846154  1.722269 5.046961 0.0000471
plot(tukey_firstmodel,col="deepskyblue")

 From the Tukey HSD results and plot, the Red and black means do not differ significantly: the 95% confidence interval for their difference (-2.28 to 1.05) contains zero, with an adjusted p-value of 0.64.
 The Yellow mean differs significantly from both the Red and the black means: the 95% confidence intervals for Yellow-Red (1.72 to 5.05) and Yellow-black (1.11 to 4.43) both exclude zero.
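The width of these intervals can be checked by hand. For a balanced design, the Tukey half-width is the studentized range quantile times sqrt(MSE / n), where MSE is the residual mean square and n the group size. The sketch below uses the rounded values from the ANOVA table and the Yellow-Red difference from the Tukey output; the variable names are chosen for illustration, and small rounding differences from the `TukeyHSD` output are expected.

```r
# Inputs taken from the ANOVA table and the design
ms_resid    <- 108.23 / 36  # residual mean square (MSE)
df_resid    <- 36           # residual degrees of freedom
n_per_group <- 13           # observations per ball colour
k           <- 3            # number of groups

# Studentized range quantile for a 95% family-wise confidence level
q_crit <- qtukey(0.95, nmeans = k, df = df_resid)

# Half-width shared by all three pairwise intervals in a balanced design
half_width <- q_crit * sqrt(ms_resid / n_per_group)

# Rebuild the Yellow-Red interval from its reported difference
diff_yellow_red <- 3.3846154
c(lwr = diff_yellow_red - half_width, upr = diff_yellow_red + half_width)
```

Because all groups have the same size, every pairwise interval has the same half-width, so a difference is significant exactly when its absolute value exceeds that half-width.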