Using pwr.anova.test, it was determined that, for a power of 55% and a significance level of 0.05, 11 samples are needed in each group, for 33 samples in total.
pwr.anova.test(k=3,sig.level = 0.05, power=.55,f=.9/2)
##
## Balanced one-way analysis of variance power calculation
##
## k = 3
## n = 10.21488
## f = 0.45
## sig.level = 0.05
## power = 0.55
##
## NOTE: n is number in each group
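Since n must be a whole number, the result is rounded up to 11 per group. As a quick check (a sketch, not part of the original analysis), the power actually achieved with n = 11 can be computed by supplying n and leaving power unspecified:
# Achieved power with 11 throws per group at the same effect size f = 0.45
library(pwr)
pwr.anova.test(k = 3, n = 11, f = 0.45, sig.level = 0.05)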
Using the agricolae package, the following layout was devised.
dat
## Order Treatmeant Distance.in.
## 1 1 Golf 86
## 2 2 Tennis 55
## 3 3 Tennis 61
## 4 4 Tennis 49
## 5 5 Golf 78
## 6 6 Rock 41
## 7 7 Golf 58
## 8 8 Golf 77
## 9 9 Rock 46
## 10 10 Golf 77
## 11 11 Rock 45
## 12 12 Tennis 63
## 13 13 Tennis 72
## 14 14 Golf 71
## 15 15 Rock 43
## 16 16 Golf 64
## 17 17 Tennis 65
## 18 18 Rock 73
## 19 19 Rock 45
## 20 20 Tennis 74
## 21 21 Tennis 73
## 22 22 Tennis 69
## 23 23 Tennis 70
## 24 24 Rock 62
## 25 25 Rock 70
## 26 26 Golf 69
## 27 27 Rock 81
## 28 28 Tennis 76
## 29 29 Rock 73
## 30 30 Rock 61
## 31 31 Golf 70
## 32 32 Golf 72
## 33 33 Golf 73
The measured distances in inches are also included in the table above.
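For a quick numeric summary (a sketch using the same dat data frame, not part of the original code), the group means can be computed directly:
# Mean distance (inches) for each ball type
aggregate(Distance.in. ~ Treatmeant, data = dat, FUN = mean)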
Before conducting the hypothesis test, the data are checked for normality using box-plots. The box-plots suggest the data are sufficiently close to normal for the sample size of 11 per group. They also suggest that, even before any formal testing, the golf ball tended to fly the farthest.
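The box-plots are produced in the appendix from separate vectors; an equivalent sketch using the formula interface on dat would be:
# Box-plots of distance by ball type, drawn directly from the data frame
boxplot(Distance.in. ~ Treatmeant, data = dat,
        ylab = "Distance in Inches", main = "Ball Type")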
The null hypothesis is that there is no difference among the treatment means. A one-way ANOVA is used to test this hypothesis, and the results are shown below.
## Df Sum Sq Mean Sq F value Pr(>F)
## Treatmeant 2 1098 548.8 4.815 0.0154 *
## Residuals 30 3419 114.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
With a p-value of 0.0154, which is below the 0.05 significance level, we reject the null hypothesis and proceed with pairwise comparisons.
Comparisons <- TukeyHSD(model,conf.level = .05)
Comparisons
## Tukey multiple comparisons of means
## 5% family-wise confidence level
##
## Fit: aov(formula = Distance.in. ~ Treatmeant, data = dat)
##
## $Treatmeant
## diff lwr upr p adj
## Rock-Golf -14.090909 -15.480644 -12.701174 0.0114484
## Tennis-Golf -6.181818 -7.571553 -4.792083 0.3752133
## Tennis-Rock 7.909091 6.519356 9.298826 0.2081230
From the pairwise comparison, only the Rock and Golf balls differ significantly in mean travel distance (adjusted p = 0.0114); the Tennis-Golf and Tennis-Rock contrasts are not significant at the 0.05 level (adjusted p = 0.375 and 0.208, respectively).
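The same contrasts can be visualized (a sketch, not in the original code) by plotting the Tukey results:
# Plot the family-wise confidence intervals for the three pairwise differences
plot(Comparisons)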
library(pwr)        # power / sample-size calculation
library(agricolae)  # experimental design layouts
library(car)        # regression diagnostics

# Sample size for a balanced one-way ANOVA with k = 3 groups and effect size f = 0.9/2 = 0.45
pwr.anova.test(k = 3, sig.level = 0.05, power = .55, f = .9/2)

# Completely randomized layout for the three ball types
treatments <- c("Golf", "Tennis", "Rock")
design.crd(trt = treatments, r = 12, seed = 828465)

# The experimental data is in the github below
dat <- read.csv("https://raw.githubusercontent.com/luaraiza/IE-Exploratory-Data-Analysis/main/Ball%20throws.csv")
names(dat)[names(dat) == "ï..Order"] <- "Order"   # clean up the BOM-mangled column name

# Distances (inches) by ball type, used for the box-plots
golf <- c(86,78,58,77,77,71,64,69,70,72,73)
Rock <- c(81,73,73,70,62,61,46,45,45,43,41)
Tennis <- c(76,74,73,72,70,69,65,63,61,55,49)
boxplot(Tennis, golf, Rock, names = c("Tennis", "Golf", "Rock"),
        ylab = "Distance in Inches", main = "Ball Type")

# One-way ANOVA and Tukey pairwise comparisons
model <- aov(Distance.in. ~ Treatmeant, data = dat)
summary(model)
Comparisons <- TukeyHSD(model, conf.level = .05)   # .05 matches the 5% family-wise intervals reported; the adjusted p-values are unaffected
We have two factors, pin elevation and release angle, so the model equation takes the form:
\[ y_{ijk}=\mu + \alpha_i+\beta_j+ \alpha\beta_{ij}+\epsilon_{ijk}\]
where we want to test first if there is interaction between the effects:
\[ H_0: \alpha\beta_{ij}=0 \hspace{1cm} \text{for all } i,j\\ H_a: \alpha\beta_{ij} \neq 0 \hspace{1cm} \text{for at least one } i,j\]
The proposed layout is generated using the design.ab function, which randomizes the run order.
trt <- c(2, 3)   # levels per factor: 2 pin positions and 3 release angles
design.ab(trt = trt, r = 3, design = "crd", seed = 548613)   # completely randomized factorial layout with 3 replicates
## $parameters
## $parameters$design
## [1] "factorial"
##
## $parameters$trt
## [1] "1 1" "1 2" "1 3" "2 1" "2 2" "2 3"
##
## $parameters$r
## [1] 3 3 3 3 3 3
##
## $parameters$serie
## [1] 2
##
## $parameters$seed
## [1] 548613
##
## $parameters$kinds
## [1] "Super-Duper"
##
## $parameters[[7]]
## [1] TRUE
##
## $parameters$applied
## [1] "crd"
##
##
## $book
## plots r A B
## 1 101 1 1 1
## 2 102 1 2 3
## 3 103 1 1 3
## 4 104 2 1 1
## 5 105 1 2 1
## 6 106 2 1 3
## 7 107 1 2 2
## 8 108 2 2 1
## 9 109 2 2 2
## 10 110 3 1 1
## 11 111 1 1 2
## 12 112 2 2 3
## 13 113 3 2 1
## 14 114 2 1 2
## 15 115 3 1 3
## 16 116 3 2 3
## 17 117 3 2 2
## 18 118 3 1 2
library(DoE.base)   # used later for half-normal plots
library(GAD)        # ANOVA with fixed and random factors
Release_angle <- c(rep(90,6), rep(110,6), rep(120,6))                           # three release angles, 6 runs each
Pin_position  <- c(rep(1,3), rep(3,3), rep(1,3), rep(3,3), rep(1,3), rep(3,3))  # two pin positions, 3 replicates per angle
obs <- c(28,34,30,38,52,45,33,36,35,45,53,66,29,31,26,37,47,61)                 # observed distances
Release_angle <- as.random(Release_angle)   # release angle treated as a random factor
Pin_position  <- as.fixed(Pin_position)     # pin position treated as a fixed factor
dat <- data.frame(Release_angle, Pin_position, obs)
model <- aov(obs ~ Release_angle * Pin_position, data = dat)   # full model with interaction
GAD::gad(model)   # ANOVA table with the appropriate error terms for the mixed model
## Analysis of Variance Table
##
## Response: obs
## Df Sum Sq Mean Sq F value Pr(>F)
## Release_angle 2 170.33 85.17 1.5739 0.24716
## Pin_position 1 1458.00 1458.00 96.1319 0.01024 *
## Release_angle:Pin_position 2 30.33 15.17 0.2803 0.76038
## Residual 12 649.33 54.11
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the ANOVA test, treating the release angle as a random effect and the pin elevation as a fixed effect, at a significance level of \(\alpha=0.05\) the interaction between them is not significant. Therefore we proceed to test the main effects, with the null and alternative hypotheses:
\[ H_0: \alpha_{i}=0 \ \text{for all } i \hspace{1cm} H_a: \alpha_{i} \neq 0 \ \text{for some } i\\ H_0: \beta_{j}=0 \ \text{for all } j \hspace{1cm} H_a: \beta_{j} \neq 0 \ \text{for some } j\]
model <- aov(obs ~ Release_angle + Pin_position, data = dat)   # additive model after dropping the non-significant interaction
GAD::gad(model)
## Analysis of Variance Table
##
## Response: obs
## Df Sum Sq Mean Sq F value Pr(>F)
## Release_angle 2 170.33 85.17 1.7543 0.209
## Pin_position 1 1458.00 1458.00 30.0324 8.105e-05 ***
## Residual 14 679.67 48.55
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The statistical test suggests that the pin position significantly affected the ball distance, whereas the release angle did not have a significant effect. These findings may also be affected by the precision of the distance readings.
plot(model, 1)   # residuals vs fitted values (constant-variance check)
plot(model, 2)   # normal Q-Q plot of residuals (normality check)
The residuals-versus-fitted plot shows a few possible outliers, but the variance appears fairly constant overall. The normal Q-Q plot shows some departure from normality, but the F-test used here is reasonably robust to such deviations.
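As a supplementary check not performed in the original analysis, the same assumptions could also be tested formally; this sketch uses a Shapiro-Wilk test on the residuals and Levene's test from the already-loaded car package:
# Formal checks of the residual assumptions (supplementary sketch)
shapiro.test(residuals(model))                                               # normality of residuals
car::leveneTest(obs ~ interaction(Release_angle, Pin_position), data = dat)  # homogeneity of variance across the six cells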
library(agricolae)
trt <- c(2, 2, 2, 2)   # four factors, each with two levels
Our_design <- design.ab(trt, design = "crd", r = 1, randomization = TRUE)   # completely randomized 2^4 layout, single replicate
Our_design
## $parameters
## $parameters$design
## [1] "factorial"
##
## $parameters$trt
## [1] "1 1 1 1" "1 1 1 2" "1 1 2 1" "1 1 2 2" "1 2 1 1" "1 2 1 2" "1 2 2 1"
## [8] "1 2 2 2" "2 1 1 1" "2 1 1 2" "2 1 2 1" "2 1 2 2" "2 2 1 1" "2 2 1 2"
## [15] "2 2 2 1" "2 2 2 2"
##
## $parameters$r
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##
## $parameters$serie
## [1] 2
##
## $parameters$seed
## [1] 1312868305
##
## $parameters$kinds
## [1] "Super-Duper"
##
## $parameters[[7]]
## [1] TRUE
##
## $parameters$applied
## [1] "crd"
##
##
## $book
## plots r A B C D
## 1 101 1 1 2 1 2
## 2 102 1 2 2 1 1
## 3 103 1 1 1 1 2
## 4 104 1 2 1 2 1
## 5 105 1 2 2 2 2
## 6 106 1 1 2 1 1
## 7 107 1 1 2 2 1
## 8 108 1 2 2 1 2
## 9 109 1 2 1 1 1
## 10 110 1 2 1 1 2
## 11 111 1 2 1 2 2
## 12 112 1 1 1 2 1
## 13 113 1 1 1 1 1
## 14 114 1 2 2 2 1
## 15 115 1 1 1 2 2
## 16 116 1 1 2 2 2
The model equation for a \(2^4\) factorial design takes the form:
\[ y_{ijklm}=\mu+\alpha_i+\beta_j+\gamma_k+\zeta_l+\alpha\beta_{ij}+ \alpha\gamma_{ik}+\alpha\zeta_{il}+\beta\gamma_{jk}+\beta\zeta_{jl}+ \gamma\zeta_{kl}+\alpha\beta\gamma_{ijk}+\alpha\beta\zeta_{ijl}+\alpha\gamma\zeta_{ikl}+\beta\gamma\zeta_{jkl} + \alpha\beta\gamma\zeta_{ijkl} + \epsilon_{ijklm}\]
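Under the −1/+1 coding used below, the 16 treatment combinations of the \(2^4\) design can be enumerated in standard order; this is a sketch of the design matrix, separate from the randomized run order generated above:
# All 16 level combinations of the 2^4 factorial in standard (Yates) order
design_matrix <- expand.grid(Pin_elevation  = c(-1, 1),
                             Bunge_position = c(-1, 1),
                             Angle          = c(-1, 1),
                             Ball_type      = c(-1, 1))
nrow(design_matrix)   # 16 runs, one replicate each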
library(DoE.base)   # provides halfnormal() for the effect plot
# -1/+1 coded factor levels in the randomized run order
Pin_elevation <- c(-1,-1,-1,-1,1,1,1,1,1,1,-1,-1,1,1,-1,-1)
Bunge_position <- c(1,1,1,1,-1,-1,1,1,-1,1,-1,-1,-1,1,-1,-1)
Angle <- c(1,-1,1,-1,-1,-1,1,-1,1,1,-1,1,1,-1,-1,1)
Ball_type <- c(-1,1,1,-1,1,-1,-1,1,-1,1,1,1,1,-1,-1,-1)
obs <- c(20,23,28,23,29,34,29,22,38,40,22,17,29,37,21,24)   # observed response for each run
dat <- data.frame(Pin_elevation, Bunge_position, Angle, Ball_type, obs)
dat
## Pin_elevation Bunge_position Angle Ball_type obs
## 1 -1 1 1 -1 20
## 2 -1 1 -1 1 23
## 3 -1 1 1 1 28
## 4 -1 1 -1 -1 23
## 5 1 -1 -1 1 29
## 6 1 -1 -1 -1 34
## 7 1 1 1 -1 29
## 8 1 1 -1 1 22
## 9 1 -1 1 -1 38
## 10 1 1 1 1 40
## 11 -1 -1 -1 1 22
## 12 -1 -1 1 1 17
## 13 1 -1 1 1 29
## 14 1 1 -1 -1 37
## 15 -1 -1 -1 -1 21
## 16 -1 -1 1 -1 24
mod <- lm(obs ~ Pin_elevation * Bunge_position * Angle * Ball_type, data = dat)   # full 2^4 model with all interactions
coef(mod)   # under -1/+1 coding, each factor effect equals twice its coefficient
## (Intercept)
## 27.250
## Pin_elevation
## 5.000
## Bunge_position
## 0.500
## Angle
## 0.875
## Ball_type
## -1.000
## Pin_elevation:Bunge_position
## -0.750
## Pin_elevation:Angle
## 0.875
## Bunge_position:Angle
## 0.625
## Pin_elevation:Ball_type
## -1.250
## Bunge_position:Ball_type
## 1.500
## Angle:Ball_type
## 1.375
## Pin_elevation:Bunge_position:Angle
## 0.125
## Pin_elevation:Bunge_position:Ball_type
## -0.250
## Pin_elevation:Angle:Ball_type
## 1.375
## Bunge_position:Angle:Ball_type
## 2.875
## Pin_elevation:Bunge_position:Angle:Ball_type
## 0.875
halfnormal(mod)   # half-normal plot of the estimated effects
According to the half-normal plot of the effects, the factor Pin_elevation and the three-way interaction Bunge_position:Angle:Ball_type appear to be significant at \(\alpha = 0.05\).
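Because the factors are coded −1/+1, each estimated effect is twice its regression coefficient; a sketch of computing the effects that the half-normal plot ranks:
# Factor effects = 2 * coefficient under -1/+1 coding (intercept dropped)
effects <- 2 * coef(mod)[-1]
sort(abs(effects), decreasing = TRUE)   # Pin_elevation and Bunge_position:Angle:Ball_type are the largest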
aov_mod <- aov(obs ~ Pin_elevation + Bunge_position * Angle * Ball_type, data = dat)   # drop the interactions involving Pin_elevation
summary(aov_mod)
## Df Sum Sq Mean Sq F value Pr(>F)
## Pin_elevation 1 400.0 400.0 31.111 0.000835 ***
## Bunge_position 1 4.0 4.0 0.311 0.594379
## Angle 1 12.3 12.3 0.953 0.361532
## Ball_type 1 16.0 16.0 1.244 0.301443
## Bunge_position:Angle 1 6.3 6.3 0.486 0.508154
## Bunge_position:Ball_type 1 36.0 36.0 2.800 0.138180
## Angle:Ball_type 1 30.2 30.2 2.353 0.168930
## Bunge_position:Angle:Ball_type 1 132.2 132.2 10.286 0.014916 *
## Residuals 7 90.0 12.9
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the ANOVA test, the reduced model equation can be written as:
\[ y_{ijklm}=\mu+\alpha_i+\beta\gamma\zeta_{jkl}+\epsilon_{ijklm} \]
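A sketch of fitting the model corresponding to this reduced equation (keeping only the Pin_elevation main effect and the Bunge_position:Angle:Ball_type product term, which departs from effect hierarchy but matches the equation above):
# Reduced model with only the terms identified as significant
reduced_mod <- lm(obs ~ Pin_elevation + Bunge_position:Angle:Ball_type, data = dat)
summary(reduced_mod)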