Monte Carlo Estimation of Power
If resampling is performed from a known theoretical distribution, it is refered to as a “Monte Carlo” simulation. Monte Carlo simulations are algorithms that utilize repeated random sampling techniques to determine the distribution of a data set.
meanstar=with(weight,tapply(Weight,Team,mean))
#creates an array of the means of the levels
with(weight,tapply(Weight,Team,var))
## Broncos Cowboys Dolphins Forty Niners Packers
## 263.6 235.3 281.1 288.5 291.6
#creates an array of the variances of the levels
with(weight,tapply(Weight,Team,length))
## Broncos Cowboys Dolphins Forty Niners Packers
## 17 17 17 17 17
#creates an array of the lengths of the levels
summary(aov(Weight~Team,data=weight))
## Df Sum Sq Mean Sq F value Pr(>F)
## Team 4 1714 428 1.58 0.19
## Residuals 80 21761 272
#summary of the avona model for reference
meanWeight = mean(weight$Weight)
#mean of all the weights in the data set
sqrtMS = sqrt(272)
#squareroot of the mean square of the residuals
simTeam = weight$Team
#assigns the Teams to one data set
Runs = 1000
#number of random simulations that will be run
Fstar = numeric(Runs)
for (i in 1:Runs)
{A = rnorm(17, mean=meanWeight, sd=sqrtMS)
B = rnorm(17, mean=meanWeight, sd=sqrtMS)
C = rnorm(17, mean=meanWeight, sd=sqrtMS)
D = rnorm(17, mean=meanWeight, sd=sqrtMS)
E = rnorm(17, mean=meanWeight, sd=sqrtMS)
simWeight = c(A,B,C,D,E)
simdata = data.frame(simWeight,simTeam)
Fstar[i] = oneway.test(simWeight~simTeam, var.equal=T, data=simdata)$statistic}
The plotted points of the theoretical distribution match up relatively nicely with the Monte Carlo F-distribution. This indicates that the original data set is close to being normally distributed.
Bootstrap Method
Bootstrap F-distribution Histogram
hist(newFstar, prob=TRUE, main ="BootStrap F-Distribution", ylim=c(0,0.8), xlim=c(0,7))
x=seq(.25,7,.5)
#4=5levels-1mean and 80=85values-5means
points(x,y=df(x,4,80),type="b",col="blue")

The bootstrapping histogram is very similar to the theoretical distribution from the first histogram, confirming that the original data set is very close to being normally distributed.
Estimatation of the Alpha Level
# Alpha level (Bootstrapped Distribution)
print(realFstar<-oneway.test(Weight~Team, var.equal=T, data=weight)$statistic)
## F
## 1.575
#Display the probability
mean(newFstar>=realFstar)
## [1] 0.16
#Quantiles
#Analytic F distribution
qf(.95,4,80)
## [1] 2.486
#Bootstrapped F distribution (alpha= 0.05)
quantile(newFstar,.95)
## 95%
## 2.231
Repeat ANOVA Test with Resampling
newmodel=aov(newsimweight~newsimTeam, data=newsimdata)
anova(newmodel)
## Analysis of Variance Table
##
## Response: newsimweight
## Df Sum Sq Mean Sq F value Pr(>F)
## newsimTeam 4 1575 394 1.55 0.2
## Residuals 80 20293 254
ANOVA Results: The second anova test that analyzed the variation in player weight as a result of the variation in player team produced a p-value greater than 0.19. Again, this indicates that there is a high probability that the variation of player weight can be attributed to solely randomization.
Estimation of Parameters
Summary of all factors and levels
#Summaries
#Broncos
broncos<-weight$Team=="Broncos"
#assigns all bronco data a vector
summary(weight[broncos,"Weight"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 216 250 256 254 265 281
#displays a summary of the weight data for the Broncos
#Packers
packers<-weight$Team=="Packers"
#assigns all packer data a vector
summary(weight[packers,"Weight"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 222 240 254 251 263 275
#displays a summary of the weight data for the Packers
#Dolphins
dolphins<-weight$Team=="Dolphins"
#assigns all packer data a vector
summary(weight[dolphins,"Weight"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 210 236 253 249 263 268
#displays a summary of the weight data for the Dolphins
#49ers
fortyniners<-weight$Team=="Forty Niners"
#assigns all packer data a vector
summary(weight[fortyniners,"Weight"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 208 228 247 241 249 271
#displays a summary of the weight data for the 49ers
#Cowboys
cowboys<-weight$Team=="Cowboys"
#assigns all packer data a vector
summary(weight[cowboys,"Weight"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 220 245 250 247 255 266
#displays a summary of the weight data for the Cowboys
Diagnostics/Model Adequacy Checking
The Normal Q-Q plots for both anova models produced relatively linear relationships between the residual and theoretical values, indicating that the use of ANOVA in this recipe was appropriate.
A Residuals vs. Fits Plot is a common graph used in residual analysis. It is a scatter plot of residuals as a function of fitted values, or the estimated responses. These plots are used to identify linearity, outliers, and error variances.
#Player Weights
plot(fitted(model),residuals(model), main="Residual vs Fitted Plot for NFL Player Weights")

plot(fitted(newmodel),residuals(newmodel), main="Residual vs Fitted Plot for NFL Player Weights (Resampling)")

#produces a Residual vs Fits Plot for the first and second ANOVA
The residual plots above both produced relatively even distributions about zero and do not contain any obvious outliers, indicating that the use of ANOVA was appropraite in this recipe.