1 Phase 1

1.1 Part A

Using pwr.anova.test, it was determined that for a power of 55% and a significance of 0.05.

11 samples are needed for each group, so 33 samples total.

pwr.anova.test(k=3,sig.level = 0.05, power=.55,f=.9/2)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 3
##               n = 10.21488
##               f = 0.45
##       sig.level = 0.05
##           power = 0.55
## 
## NOTE: n is number in each group

1.2 Parts B & C

Using the agricolae package, the following layout was devised.

dat
##    Order Treatmeant Distance.in.
## 1      1       Golf           86
## 2      2     Tennis           55
## 3      3     Tennis           61
## 4      4     Tennis           49
## 5      5       Golf           78
## 6      6       Rock           41
## 7      7       Golf           58
## 8      8       Golf           77
## 9      9       Rock           46
## 10    10       Golf           77
## 11    11       Rock           45
## 12    12     Tennis           63
## 13    13     Tennis           72
## 14    14       Golf           71
## 15    15       Rock           43
## 16    16       Golf           64
## 17    17     Tennis           65
## 18    18       Rock           73
## 19    19       Rock           45
## 20    20     Tennis           74
## 21    21     Tennis           73
## 22    22     Tennis           69
## 23    23     Tennis           70
## 24    24       Rock           62
## 25    25       Rock           70
## 26    26       Golf           69
## 27    27       Rock           81
## 28    28     Tennis           76
## 29    29       Rock           73
## 30    30       Rock           61
## 31    31       Golf           70
## 32    32       Golf           72
## 33    33       Golf           73

The measured distances in inches are also included in the table above.

1.3 Part D

Before conducting hypothesis test, data is checked for normality through the use of box-plots.

As seen from the box-plots, the data is normal enough from the sample sizes of 11.

Before doing any hypothesis testing it seems that in general the golf ball flew the furthest.

The Null Hypothesis is that there is not a difference in the means. An ANOVA test is used to test the hypothesis.

The results of the ANOVA test are shown below.

##             Df Sum Sq Mean Sq F value Pr(>F)  
## Treatmeant   2   1098   548.8   4.815 0.0154 *
## Residuals   30   3419   114.0                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From a P-value of 0.0154, we can definitely reject the null hypothesis and perform pairwise comparison.

1.4 Part E

Comparisons <- TukeyHSD(model,conf.level = .05)

Comparisons
##   Tukey multiple comparisons of means
##     5% family-wise confidence level
## 
## Fit: aov(formula = Distance.in. ~ Treatmeant, data = dat)
## 
## $Treatmeant
##                   diff        lwr        upr     p adj
## Rock-Golf   -14.090909 -15.480644 -12.701174 0.0114484
## Tennis-Golf  -6.181818  -7.571553  -4.792083 0.3752133
## Tennis-Rock   7.909091   6.519356   9.298826 0.2081230

From pairwise comparison, all levels of the factor are statiscally different from each other.

1.5 Part F

We found that all factors are tatiscally different from each other, meaning that all tested balls have different means in their travel distance .

2 Full Code Phase 1

library(pwr)
library(agricolae) 
library(car)
pwr.anova.test(k=3,sig.level = 0.05, power=.55,f=.9/2)
treatments = c("Golf" ,"Tennis","Rock ")
design.crd(trt= treatments, r=12, seed= 828465)

# The experimental data is in the github below
dat <- read.csv("https://raw.githubusercontent.com/luaraiza/IE-Exploratory-Data-Analysis/main/Ball%20throws.csv")

names(dat)[names(dat) == "ï..Order"] <- "Order"
golf <- c(86,78,58,77,77,71,64,69,70,72,73)
Rock <- c(81,73,73,70,62,61,46,45,45,43,41)
Tennis <- c(76,74,73,72,70,69,65,63,61,55,49)

boxplot(Tennis,golf,Rock, names = c("Tennis", "Golf", "Rock"),ylab= "Distance in Inches",main= "Ball Type")

model <- aov(Distance.in.~Treatmeant, data = dat)
summary(model)

Comparisons <- TukeyHSD(model,conf.level = .05)

3 Phase 2

3.1 Part A

We have 2 factors, pin elevation and release angle, therefore the model equation takes the form:

\[ y_{ij}=\mu + \alpha_i+\beta_j+ \alpha\beta_{ij}+\epsilon_{ij}\]

where we want to test first if there is interaction between the effects:

\[ H_0: \alpha\beta_{ij}=0 \hspace{1cm} \text{for any ij}\\ H_a: \alpha\beta_{ij} \neq 0 \hspace{1cm} \text{for any ij}\]

3.2 Part B

The proposed layout is performed by using the design.ab function with a randomized order.

trt <- c(2,3)
design.ab(trt=trt,r=3,design="crd",seed=548613)
## $parameters
## $parameters$design
## [1] "factorial"
## 
## $parameters$trt
## [1] "1 1" "1 2" "1 3" "2 1" "2 2" "2 3"
## 
## $parameters$r
## [1] 3 3 3 3 3 3
## 
## $parameters$serie
## [1] 2
## 
## $parameters$seed
## [1] 548613
## 
## $parameters$kinds
## [1] "Super-Duper"
## 
## $parameters[[7]]
## [1] TRUE
## 
## $parameters$applied
## [1] "crd"
## 
## 
## $book
##    plots r A B
## 1    101 1 1 1
## 2    102 1 2 3
## 3    103 1 1 3
## 4    104 2 1 1
## 5    105 1 2 1
## 6    106 2 1 3
## 7    107 1 2 2
## 8    108 2 2 1
## 9    109 2 2 2
## 10   110 3 1 1
## 11   111 1 1 2
## 12   112 2 2 3
## 13   113 3 2 1
## 14   114 2 1 2
## 15   115 3 1 3
## 16   116 3 2 3
## 17   117 3 2 2
## 18   118 3 1 2

3.3 Part C

3.4 Part D

library(DoE.base)
library(GAD)

Release_angle<-c(rep(90,6),rep(110,6),rep(120,6))
Pin_position<-c(rep(1,3),rep(3,3),rep(1,3),rep(3,3),rep(1,3),rep(3,3))
obs<-c(28,34,30,38,52,45,33,36,35,45,53,66,29,31,26,37,47,61)

Release_angle<-as.random(Release_angle)
Pin_position<-as.fixed(Pin_position)

dat<-data.frame(Release_angle,Pin_position,obs)


model <- aov(obs~Release_angle*Pin_position,data = dat)

GAD::gad(model)
## Analysis of Variance Table
## 
## Response: obs
##                            Df  Sum Sq Mean Sq F value  Pr(>F)  
## Release_angle               2  170.33   85.17  1.5739 0.24716  
## Pin_position                1 1458.00 1458.00 96.1319 0.01024 *
## Release_angle:Pin_position  2   30.33   15.17  0.2803 0.76038  
## Residual                   12  649.33   54.11                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the ANOVA test, considering the release angle as random effects and the pin elevation as fixed effects, using a significance level of \(\alpha=0.05\) we can see that the interaction between them is not significant. Therefore we proceed to test the main effects, with the null and alternative hypotheses as:

\[ H_0: \alpha_{i}=0 \\ H_a: \alpha_{i} \neq 0 \\ H_0: \beta_{i}=0 \\ H_a: \beta_{i} \neq 0\]

model <- aov(obs~Release_angle+Pin_position,data = dat)
GAD::gad(model)
## Analysis of Variance Table
## 
## Response: obs
##               Df  Sum Sq Mean Sq F value    Pr(>F)    
## Release_angle  2  170.33   85.17  1.7543     0.209    
## Pin_position   1 1458.00 1458.00 30.0324 8.105e-05 ***
## Residual      14  679.67   48.55                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The statistical test suggests that the pin position significantly affected the ball distance whereas the release angle was not significant. These findings may also be affected by the read precision.

plot(model,1)

plot(model,2)

The data seems to show outliers in the variance plot but in general it seems fairly constant. The normal plot shows that the data is not so normal but the test statistics performed is robust against deviations of normality.

4 Phase 3

4.1 Part A

library(agricolae)
trt <- c(2,2,2,2) # Vector Containing the amount of factors and levels
Our_design <-design.ab(trt,design = "crd",r=1,randomization = TRUE) # Used completly randomized design
Our_design
## $parameters
## $parameters$design
## [1] "factorial"
## 
## $parameters$trt
##  [1] "1 1 1 1" "1 1 1 2" "1 1 2 1" "1 1 2 2" "1 2 1 1" "1 2 1 2" "1 2 2 1"
##  [8] "1 2 2 2" "2 1 1 1" "2 1 1 2" "2 1 2 1" "2 1 2 2" "2 2 1 1" "2 2 1 2"
## [15] "2 2 2 1" "2 2 2 2"
## 
## $parameters$r
##  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 
## $parameters$serie
## [1] 2
## 
## $parameters$seed
## [1] 1312868305
## 
## $parameters$kinds
## [1] "Super-Duper"
## 
## $parameters[[7]]
## [1] TRUE
## 
## $parameters$applied
## [1] "crd"
## 
## 
## $book
##    plots r A B C D
## 1    101 1 1 2 1 2
## 2    102 1 2 2 1 1
## 3    103 1 1 1 1 2
## 4    104 1 2 1 2 1
## 5    105 1 2 2 2 2
## 6    106 1 1 2 1 1
## 7    107 1 1 2 2 1
## 8    108 1 2 2 1 2
## 9    109 1 2 1 1 1
## 10   110 1 2 1 1 2
## 11   111 1 2 1 2 2
## 12   112 1 1 1 2 1
## 13   113 1 1 1 1 1
## 14   114 1 2 2 2 1
## 15   115 1 1 1 2 2
## 16   116 1 1 2 2 2

4.2 Part B

4.3 Part C

The model equation for a \(2^4\) factorial design takes the form:

\[ y_{ijklm}=\mu+\alpha_i+\beta_j+\gamma_k+\zeta_l+\alpha\beta_{ij}+ \alpha\gamma_{ik}+\alpha\zeta_{il}+\beta\gamma_{jk}+\beta\zeta_{jl}+ \gamma\zeta_{kl}+\alpha\beta\gamma_{ijk}+\alpha\beta\zeta_{ijl}+\alpha\gamma\zeta_{ikl}+\beta\gamma\zeta_{jkl} + \alpha\beta\gamma\zeta_{ijkl} + \epsilon_{ijklm}\]

library(DoE.base)
Pin_elevation<-c(-1,-1,-1,-1,1,1,1,1,1,1,-1,-1,1,1,-1,-1)
Bunge_position<-c(1,1,1,1,-1,-1,1,1,-1,1,-1,-1,-1,1,-1,-1)
Angle<-c(1,-1,1,-1,-1,-1,1,-1,1,1,-1,1,1,-1,-1,1)
Ball_type<-c(-1,1,1,-1,1,-1,-1,1,-1,1,1,1,1,-1,-1,-1)
             
obs<-c(20,23,28,23,29,34,29,22,38,40,22,17,29,37,21,24)
       
dat<-data.frame(Pin_elevation,Bunge_position,Angle,Ball_type,obs)
dat
##    Pin_elevation Bunge_position Angle Ball_type obs
## 1             -1              1     1        -1  20
## 2             -1              1    -1         1  23
## 3             -1              1     1         1  28
## 4             -1              1    -1        -1  23
## 5              1             -1    -1         1  29
## 6              1             -1    -1        -1  34
## 7              1              1     1        -1  29
## 8              1              1    -1         1  22
## 9              1             -1     1        -1  38
## 10             1              1     1         1  40
## 11            -1             -1    -1         1  22
## 12            -1             -1     1         1  17
## 13             1             -1     1         1  29
## 14             1              1    -1        -1  37
## 15            -1             -1    -1        -1  21
## 16            -1             -1     1        -1  24
mod<-lm(obs~Pin_elevation*Bunge_position*Angle*Ball_type,data = dat)
coef(mod)
##                                  (Intercept) 
##                                       27.250 
##                                Pin_elevation 
##                                        5.000 
##                               Bunge_position 
##                                        0.500 
##                                        Angle 
##                                        0.875 
##                                    Ball_type 
##                                       -1.000 
##                 Pin_elevation:Bunge_position 
##                                       -0.750 
##                          Pin_elevation:Angle 
##                                        0.875 
##                         Bunge_position:Angle 
##                                        0.625 
##                      Pin_elevation:Ball_type 
##                                       -1.250 
##                     Bunge_position:Ball_type 
##                                        1.500 
##                              Angle:Ball_type 
##                                        1.375 
##           Pin_elevation:Bunge_position:Angle 
##                                        0.125 
##       Pin_elevation:Bunge_position:Ball_type 
##                                       -0.250 
##                Pin_elevation:Angle:Ball_type 
##                                        1.375 
##               Bunge_position:Angle:Ball_type 
##                                        2.875 
## Pin_elevation:Bunge_position:Angle:Ball_type 
##                                        0.875
halfnormal(mod)

According with the Half-normal scores plot, the factor Pin Elevation and the interaction (Bunge Position : Angle Ball : Ball Type) appear to be significant, with an \(\alpha = 0.05\).

4.4 Part D

aov_mod <- aov(obs~Pin_elevation+Bunge_position*Angle*Ball_type,data = dat)
summary(aov_mod)
##                                Df Sum Sq Mean Sq F value   Pr(>F)    
## Pin_elevation                   1  400.0   400.0  31.111 0.000835 ***
## Bunge_position                  1    4.0     4.0   0.311 0.594379    
## Angle                           1   12.3    12.3   0.953 0.361532    
## Ball_type                       1   16.0    16.0   1.244 0.301443    
## Bunge_position:Angle            1    6.3     6.3   0.486 0.508154    
## Bunge_position:Ball_type        1   36.0    36.0   2.800 0.138180    
## Angle:Ball_type                 1   30.2    30.2   2.353 0.168930    
## Bunge_position:Angle:Ball_type  1  132.2   132.2  10.286 0.014916 *  
## Residuals                       7   90.0    12.9                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Based on the Anova test, the new model equation can be written as:

\[ y_{ijklm}=\mu+\alpha_i+\beta\gamma\zeta_{jkl}+\epsilon{ijklm} \]