Project, Part 1

Perform a designed experiment on the effect of Release Angle on the distance the ball is thrown. Specifically, we would like to study whether the settings of 175, 180, and 185 degrees differ significantly in their mean distance. Since pulling the lever back farther takes additional work, we would like to investigate whether it makes a significant difference in the mean distance thrown. The other factors will be set as follows: Fire Angle = 90°, Bungee Position = 200 mm, Pin Elevation = 200 mm, and Cup Elevation = 300 mm. To test this hypothesis, we use a completely randomized design with α = 0.05.

Determine how many samples should be collected to detect a mean difference with a medium effect (i.e. 50% of the standard deviation) with a probability of 75%.

Given

groups: k = 3
effect size: f = 0.5
α = 0.05
power (1-β) = 0.75

The power calculation below gives n = 12.51 per group, which rounds up to 13 samples from each population (39 runs in total).

# balanced one-way analysis effect size = 0.5
library(pwr)
pwr.anova.test(k=3,n=NULL,f=0.5,sig.level=0.05,power=.75)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 3
##               n = 12.50714
##               f = 0.5
##       sig.level = 0.05
##           power = 0.75
## 
## NOTE: n is number in each group
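
As a cross-check that does not depend on the pwr package, the same power can be computed from the noncentral F distribution in base R. This is a minimal sketch, assuming the noncentrality parameter λ = k·n·f² for a balanced one-way design; the helper name `power_at` is hypothetical and not part of any package.

```r
# Power of a balanced one-way ANOVA from the noncentral F distribution.
k     <- 3     # number of groups
f     <- 0.5   # effect size
alpha <- 0.05  # significance level

# power_at is a hypothetical helper, not part of any package
power_at <- function(n) {
  df1  <- k - 1
  df2  <- (n - 1) * k
  crit <- qf(alpha, df1, df2, lower.tail = FALSE)  # critical F value under H0
  pf(crit, df1, df2, ncp = k * n * f^2, lower.tail = FALSE)
}

n_per_group <- ceiling(12.50714)  # round the fractional n up
total_runs  <- k * n_per_group

power_at(12)  # falls short of the 0.75 target
power_at(13)  # meets the 0.75 target
```

Rounding up rather than to the nearest integer matters here: 12 samples per group would leave the power slightly under the requested 75%.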

Propose a layout using the number of samples from part (a) with randomized run order

The layout consists of a minimum of 39 runs, 13 at each Release Angle in the set {175°, 180°, 185°}, performed in random order so that the experimental error is completely randomized. If the angle for each run is instead drawn at random, we may need to continue past 39 runs until 13 samples of each Release Angle are recorded, which is the number required to maintain the power of our experiment.
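
One way to produce such a layout is to shuffle a balanced vector of angles, which guarantees 13 runs per angle in exactly 39 runs. The sketch below is an illustration rather than the procedure actually used for the recorded data; the seed value and the names `layout` and `n_per_grp` are arbitrary assumptions.

```r
# Completely randomized run order: 13 runs at each release angle, shuffled.
set.seed(1)  # arbitrary seed, only so the layout can be regenerated
angles    <- c(175, 180, 185)
n_per_grp <- 13

layout <- data.frame(
  run   = seq_len(length(angles) * n_per_grp),
  angle = sample(rep(angles, each = n_per_grp))  # random permutation
)

table(layout$angle)  # each angle appears 13 times
head(layout)         # first few runs in the order they would be performed
```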

Collect data and record observations on layout proposed in part (b)

# Load data into a data frame
dat<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/projectpart1.csv", header=TRUE)
dat$angle <- as.character(dat$angle)
dat

##    angle distance
## 1    185    475.0
## 2    185    452.0
## 3    175    432.0
## 4    185    432.5
## 5    180    434.0
## 6    175    420.0
## 7    180    445.5
## 8    175    418.0
## 9    185    464.0
## 10   180    440.0
## 11   180    429.5
## 12   180    441.5
## 13   180    440.5
## 14   185    467.5
## 15   175    416.0
## 16   175    420.0
## 17   180    434.0
## 18   185    450.0
## 19   180    463.5
## 20   185    457.0
## 21   180    440.0
## 22   175    427.0
## 23   175    420.5
## 24   175    420.5
## 25   180    442.0
## 26   180    418.0
## 27   185    454.0
## 28   180    437.0
## 29   175    414.5
## 30   185    464.0
## 31   175    425.5
## 32   185    451.5
## 33   175    425.0
## 34   180    416.0
## 35   175    425.5
## 36   175    421.5
## 37   185    445.0
## 38   185    462.0
## 39   185    453.5

Perform hypothesis test and check residuals. Be sure to comment and take corrective action if necessary.

Based on the normal Q-Q plot of the ANOVA residuals, the data are NOT normally distributed.

# AOV normal Q-Q plots of the data for the 3 release angles
aov1<-aov(distance ~ angle,data=dat)
plot(aov1,2)

The treatment groups do not have equal variance, as the residuals vs. fitted plot and the boxplots show.

# AOV residuals vs fitted plot for the 3 release angles
plot(aov1,1)

# boxplots comparing the 3 release angles
boxplot(distance ~ angle,data=dat, col=c("steelblue","firebrick2", "forestgreen"),xlab = "release angle",ylab = "distance (mm)", main="distance by release angle")
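
Formal tests can complement the visual reading of these plots. The sketch below applies the Shapiro-Wilk test to the ANOVA residuals and the Fligner-Killeen test (chosen here because it is less sensitive to non-normality than Bartlett's test) to the group variances. It assumes the observations are re-entered by hand from the data table above, grouped by angle rather than in run order; the names `d175`, `d180`, `d185`, and `dat_chk` are illustrative.

```r
# Observations re-entered from the data table, grouped by release angle.
d175 <- c(432.0, 420.0, 418.0, 416.0, 420.0, 427.0, 420.5, 420.5, 414.5, 425.5, 425.0, 425.5, 421.5)
d180 <- c(434.0, 445.5, 440.0, 429.5, 441.5, 440.5, 434.0, 463.5, 440.0, 442.0, 418.0, 437.0, 416.0)
d185 <- c(475.0, 452.0, 432.5, 464.0, 467.5, 450.0, 457.0, 454.0, 464.0, 451.5, 445.0, 462.0, 453.5)
dat_chk <- data.frame(
  angle    = factor(rep(c(175, 180, 185), each = 13)),
  distance = c(d175, d180, d185)
)

fit <- aov(distance ~ angle, data = dat_chk)

# Normality of the residuals: Shapiro-Wilk test
sw <- shapiro.test(residuals(fit))

# Homogeneity of variance: per-group variances and Fligner-Killeen test
group_var <- tapply(dat_chk$distance, dat_chk$angle, var)
fk <- fligner.test(distance ~ angle, data = dat_chk)

sw$p.value  # a small value supports the non-normality reading
group_var   # sample variance at each release angle
fk$p.value  # a small value supports unequal variances
```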

We need to apply a transform to stabilize the variance and use a non-parametric test in place of ANOVA. The Box-Cox analysis below indicates a negative lambda. We adopt the logarithmic transform, which is the λ = 0 member of the Box-Cox family and a common variance-stabilizing choice; it brings the group variances closer to equal.

# Box-Cox power estimate, modelling distance by release angle
library(MASS)
boxcox(distance ~ angle, data=dat, plotit=TRUE)

# Logarithmic transformation
dat2<-dat
dat2[,2]<-log(dat2[,2])

# boxplots comparing the logarithmically transformed data from 3 release angles
boxplot(distance ~ angle,data=dat2, col=c("steelblue","firebrick2", "forestgreen"),xlab = "release angle",ylab = "logarithm of the distance (mm)", main="logarithm of the distance by release angle")

The hypotheses to test are:

  H₀: log(µ_k) = log(µ_l) for all k,l ∈ {175°,180°,185°}
  H₁: log(µ_k) ≠ log(µ_l) for at least one pair k,l ∈ {175°,180°,185°} where k ≠ l

Looking at the normal Q-Q plot for the transformed data, we see it is still not normally distributed. Using the non-parametric Kruskal-Wallis test with release angle as the grouping factor, our p-value = 2.309e-06 < 0.05 = α, so we reject the null hypothesis and conclude that at least one release angle produces distances that differ from the others.

# AOV normal Q-Q plots of the logarithmically transformed data for the 3 release angles
aov2<-aov(distance ~ angle,data=dat2)
plot(aov2,2)

# Kruskal-Wallis Test, grouping the transformed distances by release angle
kruskal.test(distance ~ angle,data=dat2)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  distance by angle
## Kruskal-Wallis chi-squared = 25.957, df = 2, p-value = 2.309e-06

If the null hypothesis is rejected, investigate pairwise comparisons.

Using Tukey’s test on the untransformed data, we see that all three pairs differ significantly (adjusted p < 0.01); the smallest mean difference is ~15.0 mm (180° vs. 175°).

# Use Tukey's test to determine Honest Significant Differences between the treatment means, 95% confidence
library(car)

## Loading required package: carData

TukeyHSD(aov1)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = distance ~ angle, data = dat)
## 
## $angle
##             diff       lwr      upr     p adj
## 180-175 15.03846  5.694264 24.38266 0.0010416
## 185-175 34.00000 24.655802 43.34420 0.0000000
## 185-180 18.96154  9.617341 28.30574 0.0000496

plot(TukeyHSD(aov1))
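
Because Tukey's test relies on the same normality assumption as ANOVA, a rank-based pairwise comparison is one possible companion check. The sketch below swaps in pairwise Wilcoxon rank-sum tests with a Holm adjustment; this is an alternative technique, not the method used above. It assumes the observations are re-entered by hand from the data table, and the names `d175`, `d180`, `d185`, and `dat_chk` are illustrative.

```r
# Observations re-entered from the data table, grouped by release angle.
d175 <- c(432.0, 420.0, 418.0, 416.0, 420.0, 427.0, 420.5, 420.5, 414.5, 425.5, 425.0, 425.5, 421.5)
d180 <- c(434.0, 445.5, 440.0, 429.5, 441.5, 440.5, 434.0, 463.5, 440.0, 442.0, 418.0, 437.0, 416.0)
d185 <- c(475.0, 452.0, 432.5, 464.0, 467.5, 450.0, 457.0, 454.0, 464.0, 451.5, 445.0, 462.0, 453.5)
dat_chk <- data.frame(
  angle    = factor(rep(c(175, 180, 185), each = 13)),
  distance = c(d175, d180, d185)
)

# Pairwise Wilcoxon rank-sum tests with Holm-adjusted p-values.
# exact = FALSE uses the normal approximation, since the data contain ties.
pw <- pairwise.wilcox.test(dat_chk$distance, dat_chk$angle,
                           p.adjust.method = "holm", exact = FALSE)

pw$p.value  # lower-triangular matrix of adjusted p-values
```

The rows and columns of `pw$p.value` are labelled by release angle, so the entry in row "185", column "175" is the adjusted p-value for that pair.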

State conclusions and make recommendation.

Each release angle setting produces a significantly different mean distance. We observe the highest mean distance at a release angle of 185°, and each 5° increase in release angle yields a similar increase in mean distance (about 15 mm from 175° to 180° and about 19 mm from 180° to 185°).

Complete Code

Complete R code used in this analysis.

# balanced one-way analysis effect size = 0.5
library(pwr)
pwr.anova.test(k=3,n=NULL,f=0.5,sig.level=0.05,power=.75)

# Load data into a data frame
dat<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/projectpart1.csv", header=TRUE)
dat$angle <- as.character(dat$angle)
dat

# AOV normal Q-Q plots of the data for the 3 release angles
aov1<-aov(distance ~ angle,data=dat)
plot(aov1,2)

# AOV residuals vs fitted plot for the 3 release angles
plot(aov1,1)

# boxplots comparing the 3 release angles
boxplot(distance ~ angle,data=dat, col=c("steelblue","firebrick2", "forestgreen"),xlab = "release angle",ylab = "distance (mm)", main="distance by release angle")

# Box-Cox power estimate, modelling distance by release angle
library(MASS)
boxcox(distance ~ angle, data=dat, plotit=TRUE)

# Logarithmic transformation
dat2<-dat
dat2[,2]<-log(dat2[,2])

# boxplots comparing the logarithmically transformed data from 3 release angles
boxplot(distance ~ angle,data=dat2, col=c("steelblue","firebrick2", "forestgreen"),xlab = "release angle",ylab = "logarithm of the distance (mm)", main="logarithm of the distance by release angle")

# AOV normal Q-Q plots of the logarithmically transformed data for the 3 release angles
aov2<-aov(distance ~ angle,data=dat2)
plot(aov2,2)

# Kruskal-Wallis Test, grouping the transformed distances by release angle
kruskal.test(distance ~ angle,data=dat2)

# Use Tukey's test to determine Honest Significant Differences between the treatment means, 95% confidence
library(car)
TukeyHSD(aov1)
plot(TukeyHSD(aov1))