19.12, 19.13, 19.31, 19.39, 20.8, 20.9, 20.10, 21.9, 21.10, 21.20
## [1] 9.2
## [1] 13
## [1] 13.6
## [1] 16.4
## 1 6 11 16
## 9.2 13.6 13.0 16.4
#av$fitted.values
round(sum(av$residuals),6) # Sum to zer
## [1] 0
The residuals sum to 0
plot(av,1)
Using this plot we cannot find major departures from: \[\sum_i(\alpha\beta)_{ij} = \sum_j(\alpha\beta)_{ij} =0$ and $\epsilon \sim N(0,\sigma^2)\]
plot(av,2)
n = length(av$residuals)
se = sqrt(sum(av$residuals^2/n))
exp_values = sapply(1:n, function(k) se * qnorm((k-.375)/(n+.25)))
cor(exp_values,sort(av$residuals))
## [1] 0.9762576
\(\hat{\rho} = 0.9762576\) Using \(\alpha =0.05\) and table B.6, \(\hat{\rho} = 0.9762576>.951\) so we fail to reject normality. So yes, normality seems reasonable ## Part E
plot(df$Rating[(df$Gender == "Male") & (df$Eye=="Presnt")],residuals(av)[1:5])
plot(df$Rating[(df$Gender == "Male") & (df$Eye=="Absent")],residuals(av)[6:10])
plot(df$Rating[(df$Gender == "Female") & (df$Eye=="Presnt")],residuals(av)[11:15])
plot(df$Rating[(df$Gender == "Female") & (df$Eye=="Absent")],residuals(av)[16:20])
#or
plot(residuals(av)[1:5])
plot(residuals(av)[6:10])
plot(residuals(av)[11:15])
plot(residuals(av)[16:20])
#plot(df$V4,residuals(av))
They seem to be growing (not sure about this one)
interaction.plot(df$Gender, df$Eye, df$Rating, fun = mean)
Yes there seems to be factor effect since the lines are far apart.
anova(av)
## Analysis of Variance Table
##
## Response: Rating
## Df Sum Sq Mean Sq F value Pr(>F)
## Gender 1 76.05 76.050 12.5185 0.002734 **
## Eye 1 54.45 54.450 8.9630 0.008589 **
## Gender:Eye 1 1.25 1.250 0.2058 0.656202
## Residuals 16 97.20 6.075
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It seems that both sources account for the variation, not just one.
\[H_0: (\alpha_i\beta_i) = 0 \forall i\] \[H_1:\neg H_0\]
qf(.99,1,16)
## [1] 8.530965
Since \(8.530965=F>F^* =0.2058\) we conclude \(H_0\) and that the interaction terms are not significant.
Also the p value from the ANOVA table is \(p_{value} = 0.656202 > .01\)
They all have the same \(F\), and the \(F^*\) differ in each case. For \(\alpha\) \[H_0: (\alpha_i) = 0 \forall i\] \[H_1:\neg H_0\] For \(\beta\) \[H_0: (\beta_i) = 0 \forall i\] \[H_1:\neg H_0\] In both cases \(F^*_\alpha = 8.9630 > 8.53 = F\) & \(F^*_\beta = 12.5185 > 8.53 =F\) So in both cases we conclude \(H_1\) and that the main effects are significant. The P values are \(p_\alpha = 0.008589\), \(p_{\beta} = 0.002734\) ## Part e \[\alpha \leq 1-(.99)^3 \leq 0.03\] ## Part f The results from C,D agree with the graph in part A. There are factor effects and no interaction effects
It is not meaningful to test here since we have an unimportant factor still in the ANOVA model. # Question 3 - 19.31 ## Part a
av = aov(data=df,Rating~Eye+Gender+Eye*Gender)
av$fitted.values
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
## 9.2 9.2 9.2 9.2 9.2 13.6 13.6 13.6 13.6 13.6 13.0 13.0 13.0 13.0 13.0
## 16 17 18 19 20
## 16.4 16.4 16.4 16.4 16.4
model.tables(av,se=T)
## Tables of effects
##
## Eye
## Eye
## Absent Presnt
## 1.65 -1.65
##
## Gender
## Gender
## Female Male
## 1.95 -1.95
##
## Eye:Gender
## Gender
## Eye Female Male
## Absent -0.25 0.25
## Presnt 0.25 -0.25
##
## Standard errors of effects
## Eye Gender Eye:Gender
## 0.7794 0.7794 1.1023
## replic. 10 10 5
t = qt(.995,16)
print("Confidence interval:")
## [1] "Confidence interval:"
13+1.1023*t*c(-1,1)
## [1] 9.780422 16.219578
The fitted values of \(Y_{21.}=13\) and \(s(Y_{21})=1.1023\), \(t(0.995,16)=2.921\). The confidence interval is then \(9.780422\leq \mu_{21} \leq 16.219578\)
mu = mean(df$Rating[df$Eye=="Presnt"])
mu+t*0.7794*c(-1,1)
## [1] 9.123543 13.676457
\(\hat{Y_{1..}} = 11.4\), \(s(\hat{Y_{1..}}) = 0.7794\) from the table above. Using the same t value we get this confidence interval: \(9.123543 \leq \mu_{1.}\leq 13.676457\)
mu.1 = mean(df$Rating[df$Gender=="Male"])
mu.2 =mean(df$Rating[df$Gender=="Female"])
barplot(c(mu.1,mu.2),names.arg=c("Male","Female"),main = "Means of factor B")
This plot suggest that Females tend to have higher ratings. ## Part D
# Y1.
mu.1 + 0.7794*t*c(-1,1)
## [1] 8.823543 13.376457
mu.2 + 0.7794*t*c(-1,1)
## [1] 12.72354 17.27646
\(8.823543 \leq \mu_{.1} \leq 13.376457\), \(12.72354 \leq \mu_{.2} \leq 17.27646\)
The family confidence coefficient fo the set of two estimates is \(1-.01-.01 =.98\) .
mu1. = mean(df$Rating[df$Eye=="Presnt"])
mu2. =mean(df$Rating[df$Eye=="Absent"])
barplot(c(mu.1,mu.2),names.arg=c("Present","Absent"),main = "Means of factor A")
This plot suggests that no eye contact has higher rating
d1 = mu2.-mu1.
d2 = mu.2-mu.1
s.d1 = sqrt(4*6.075/(length(df$Rating)))
t.bonf = qt(1-.05/4,16)
d1+s.d1*t.bonf*c(-1,1)
## [1] 0.5742195 6.0257805
d2+s.d1*t.bonf*c(-1,1)
## [1] 1.174219 6.625781
So we get: \(0.5742195 \leq D_1\leq 6.0257805\) and \(1.174219 \leq D_2 \leq 6.625781\)
Yes since we are making a few pairwise comparison which are specifically of interest in advance of this study.
floor((2.32/(1.2/2.4))^2)
## [1] 21
Using table B.13, with \(\lambda =1.2\), \(\sigma =2.4\) we get the proper value of n should be 21. # Question 5 - 20.8
df = read.table("http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%2020%20Data%20Sets/CH20PR08.txt")
df$Color = df$V1
df$Humidity = factor(df$V2)
df$Temp = factor(df$V3)
interaction.plot(df$Temp,df$Humidity, df$Color, fun = mean)
av = aov(Color~Humidity+Temp,data=df)
It appears that there are interactions. Main effects may be present but seem small.
anova(av)
## Analysis of Variance Table
##
## Response: Color
## Df Sum Sq Mean Sq F value Pr(>F)
## Humidity 2 2.122 1.061 0.9666 0.4326
## Temp 3 202.200 67.400 61.4123 6.781e-05 ***
## Residuals 6 6.585 1.097
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We will test the following on \(alpha = .025\) \[H_0: (\alpha_i) = 0 \forall i\] \[H_1:\exists i: (\alpha_i) \neq 0 \] \[H_0: (\beta_i) = 0 \forall i\] \[H_1:\exists i: (\beta_i) \neq 0 \] Our \(p_a = 0.4326>0.025\) so we fail to reject the null so that \(\alpha\) are all 0.
Our \(p_b \approx 0 < 0.025\) so we reject the null and conclude that at least one of the \(\beta\)’s is not 0.
y.1 = mean(df$Color[df$Temp==1])
y.2 = mean(df$Color[df$Temp==2])
y.3 = mean(df$Color[df$Temp==3])
y.4 = mean(df$Color[df$Temp==4])
d1 = y.2 -y.1
d2 = y.3-y.2
d3=y.4-y.3
s.d = sqrt(8*1.097/(length(df$Temp)))
bonf = qt(1-.025/3,6)
d1 +(s.d*bonf)*c(-1,1)
## [1] -2.411367 3.211367
d2 +(s.d*bonf)*c(-1,1)
## [1] 2.588633 8.211367
d3 +(s.d*bonf)*c(-1,1)
## [1] 1.321966 6.944700
The bonferoni is the most efficient here since these differences are of interest before the expirement. # Question 6 - 20.9 ## A)
y2. = mean(df$Color[df$Humidity==2])
u.23.hat = y2.+y.3 -mean(df$Color)
u.23.hat
## [1] 21.09167
s.hat = 1.097
var.mu23 = s.hat*(6)/(12) #(\sigma^2(a+b+1)(ab))
var.mu23
## [1] 0.5485
\(Var(\hat\mu_{ij}) = \hat\sigma^2(a+b+1)(ab) = 1.097(6)/(12) = 0.5485\)
sig.mu23 = c(sqrt(var.mu23))
u.23.hat + qt(1-.02/2,6)*sig.mu23*c(-1,1)
## [1] 18.76418 23.41915
There is a 98% probability that the interval above will cover the true value of \(\mu_{23}\)
mu.23.hat = sin(u.23.hat/2)^2
The interval will not be applicapable if the two factors interact and the interpretation of the arcsine transformed variable will change.
We are testing (\(\alpha = 0.005\)): \[H_0:D=0\] \[H_a:D\neq0\]
require(asbio)
## Loading required package: asbio
## Warning: package 'asbio' was built under R version 3.4.3
## Loading required package: tcltk
tukey.add.test(df$Color, df$Humidity, df$Temp)
##
## Tukey's one df test for additivity
## F = 2.0481273 Denom df = 5 p-value = 0.2118051
So we fail to reject the null hypothesis and conclude that \(D=0\).
If the additive model to work we can preform other transformations on Y via boxcox or other methods. # Question 8 - 21.9
Pain tolerance of the subject is a blocking variable because it is not something that can be controlled by an expirement. It is an external factor observed in the expirement.
The assumption of no interactions. Accupuncture and pain tolerance will likely interect
av = aov(data =df, Score~Block+A+B+A*B )
plot(av,1)
plot(av,2)
n = length(av$residuals)
se = sqrt(sum(av$residuals^2/n))
exp_values = sapply(1:n, function(k) se * qnorm((k-.375)/(n+.25)))
cor(exp_values,sort(av$residuals))
## [1] 0.9830392
Using table B.6 and \(\alpha=0.05\) we get a \(\rho = 0.964 < 0.9830392=\hat{\rho}\) so we fail to reject normality.
The Residual vs. Fitted plot suggest that there may be interaction terms between blocking variables and factors
df$together[df$A==1 & df$B==1] = 11
df$together[df$A==1 & df$B==2] = 12
df$together[df$A==2 & df$B==1] = 21
df$together[df$A==2 & df$B==2] = 22
df$together = factor(df$together)
interaction.plot(df$together, df$Block, df$Score, fun = mean,col = 2:8)
The interactions do not seem severe.
We will conduct a Tukey Test for additivity with \(\alpha =0.01\)
\[H_0:(\tau\rho)_{ij}=0 \forall ij\] \[H_0:(\tau\rho)_{ij}\neq0 \exists (i,j)\]
tukey.add.test(df$Score, df$Block, df$together)
##
## Tukey's one df test for additivity
## F = 0.3368083 Denom df = 20 p-value = 0.5681593
We have a \(p^* = 0.5681593 > .01 =\alpha\) we we fail to reject the null and conclude that there are no interaction terms between the blocking variabels and factors. # Question 9 - 21.10
anova(av)
## Analysis of Variance Table
##
## Response: Score
## Df Sum Sq Mean Sq F value Pr(>F)
## Block 7 5.5987 0.7998 55.2963 4.126e-12 ***
## A 1 2.3112 2.3112 159.7901 2.773e-11 ***
## B 1 3.3800 3.3800 233.6790 7.465e-13 ***
## A:B 1 0.0450 0.0450 3.1111 0.0923 .
## Residuals 21 0.3037 0.0145
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We will test for interaction between \(\alpha\) and \(\beta\) with \(\alpha_{error} =.01\) \[H_0: (\alpha\beta)_{ij} = 0 \forall i,j\] \[H_a: (\alpha\beta)_{ij} \neq 0 \exists (i,j)\] The p value for the F test is $ p^* = 0.0923 >_{error}$ so we fail to reject the null hypothesis and conclude that we can remove these interactions.
\[H_0: (\alpha)_{i} = 0 \forall i,\] \[H_a: (\alpha)_{i} \neq 0 \exists i\] \[H_0: (\beta)_{j} = 0 \forall j\] \[H_a: (\beta)_{j} \neq 0 \exists j\]
anova(aov(data =df, Score~Block+A+B))
## Analysis of Variance Table
##
## Response: Score
## Df Sum Sq Mean Sq F value Pr(>F)
## Block 7 5.5987 0.7998 50.455 4.273e-12 ***
## A 1 2.3112 2.3112 145.799 3.528e-11 ***
## B 1 3.3800 3.3800 213.219 8.428e-13 ***
## Residuals 22 0.3487 0.0159
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In both tests we see that the \(p_value < \alpha_{error} = .01\). In both cases we conclude that the main effects are significant. The \(p_{value} \approx 0\) for both tests.
l1= mean(df$Score[df$A==1])- mean(df$Score[df$A==2])
l2= mean(df$Score[df$B==1])- mean(df$Score[df$B==2])
s.d = sqrt(0.0159)/3 # sqrt of MSE since c1^2 =c2^2 = 1
bonf = qt(1-.05/2/2,21) # Not sure about this
l1 +(s.d*bonf)*c(-1,1)
## [1] -0.6389581 -0.4360419
l2 +(s.d*bonf)*c(-1,1)
## [1] -0.7514581 -0.5485419
There is a .95 probability that the interval will cover both the true values of \(L_1\) and \(L_2\)
\[H_0: (\rho)_{i} = 0 \forall i,\] \[H_a: (\rho)_{i} \neq 0 \exists i\] Looking at the ANOVA table \(p_{value} \approx 0 <\alpha_{error}\). We reject the null hypothesis and conclude that blocking variables are significant.
s_r2 = (7*0.7998+8*(3)*0.0145)/(8*4-1)
E = s_r2/0.0145
E
## [1] 13.22937
Since we have such a high \(\hat{E}\) value, we concure that the blocking variable is efficient. Using this blocking variable is much more useful then just using a CRD.