######### HOMEWORK 3, STOR 665: COMPUTING PART, DUE MAR 3, 2016 #########
######### Student Name: Jie Huang
# INSTRUCTION: Edit this R file by adding your solution to the end of each
# question. Your submitted file should run smoothly in R and provide all
# required results. You are allowed to work with other students but the
# homework should be in your own words. Identical solutions will receive a 0
# in grade and will be investigated.
# This homework is an exercise on Two-way ANOVA. We will build up R code
# that interprets the coefficients in Two-way ANOVA.
# We will be negligent in the sense that we will ignore precautions
# that a numerical analyst would build into his code, and we will not
# consider measures to improve numerical conditioning such as pivoting
# (switching the order of the variables) either.
#----------------------------------------------------------------
# PROBLEM 1: Description of the pvc data. Start by loading the data:
library(faraway) # install this package if needed
attach(pvc) # load the pvc data.
pvc # Type help(pvc) for more details
## psize operator resin
## 1 36.2 1 1
## 2 36.3 1 1
## 3 35.3 1 2
## 4 35.0 1 2
## 5 30.8 1 3
## 6 30.6 1 3
## 7 29.8 1 4
## 8 29.6 1 4
## 9 32.0 1 5
## 10 31.7 1 5
## 11 30.7 1 6
## 12 29.7 1 6
## 13 33.4 1 7
## 14 32.4 1 7
## 15 37.1 1 8
## 16 36.5 1 8
## 17 35.8 2 1
## 18 35.0 2 1
## 19 35.6 2 2
## 20 35.1 2 2
## 21 30.4 2 3
## 22 28.9 2 3
## 23 30.2 2 4
## 24 29.9 2 4
## 25 31.1 2 5
## 26 31.7 2 5
## 27 30.9 2 6
## 28 30.4 2 6
## 29 32.9 2 7
## 30 32.1 2 7
## 31 36.7 2 8
## 32 36.2 2 8
## 33 36.0 3 1
## 34 34.6 3 1
## 35 33.0 3 2
## 36 33.7 3 2
## 37 31.3 3 3
## 38 27.1 3 3
## 39 30.0 3 4
## 40 27.3 3 4
## 41 28.7 3 5
## 42 29.9 3 5
## 43 30.8 3 6
## 44 28.7 3 6
## 45 35.5 3 7
## 46 30.1 3 7
## 47 32.6 3 8
## 48 33.7 3 8
operator;resin
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3
## [36] 3 3 3 3 3 3 3 3 3 3 3 3 3
## Levels: 1 2 3
## [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 1 1 2
## [36] 2 3 3 4 4 5 5 6 6 7 7 8 8
## Levels: 1 2 3 4 5 6 7 8
# Describe the data. Are the group balanced? Also make the interaction plot.
# Does there seem to be an interaction effect?
# SOLUTION:
There are two main effects: operator and resin.
Operator has 3 levels, resin has 8 levels.
There are 2 repeats in each group (K=2), it’s a balanced design.
par(mfrow=c(2,1),mar=c(3,3,2,2),mgp=c(1.8,0.5,0))
interaction.plot(pvc$operator,pvc$resin,pvc$psize)
title("Interaction Plot", sub="PVC")
interaction.plot(pvc$resin,pvc$operator,pvc$psize)
title("Interaction Plot", sub="PVC")
The lines of some factors are not perfectly parallel, so there seems to be interaction effect. But clearly there is no servere interaction. We will further check interaction from the anova table
#----------------------------------------------------------------
# PROBLEM 2: Start with the default contrasts
options()$contrasts
## unordered ordered
## "contr.treatment" "contr.poly"
# Find the mean particle size for each resin in operator 3
# (with two decimal places).
# Use only numbers from the following output
summary(lm(psize~operator*resin, data=pvc))
##
## Call:
## lm(formula = psize ~ operator * resin, data = pvc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.7000 -0.3625 0.0000 0.3625 2.7000
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.2500 0.8598 42.164 < 2e-16 ***
## operator2 -0.8500 1.2159 -0.699 0.491216
## operator3 -0.9500 1.2159 -0.781 0.442245
## resin2 -1.1000 1.2159 -0.905 0.374615
## resin3 -5.5500 1.2159 -4.565 0.000126 ***
## resin4 -6.5500 1.2159 -5.387 1.56e-05 ***
## resin5 -4.4000 1.2159 -3.619 0.001372 **
## resin6 -6.0500 1.2159 -4.976 4.42e-05 ***
## resin7 -3.3500 1.2159 -2.755 0.011014 *
## resin8 0.5500 1.2159 0.452 0.655078
## operator2:resin2 1.0500 1.7195 0.611 0.547175
## operator3:resin2 -0.8500 1.7195 -0.494 0.625567
## operator2:resin3 -0.2000 1.7195 -0.116 0.908372
## operator3:resin3 -0.5500 1.7195 -0.320 0.751842
## operator2:resin4 1.2000 1.7195 0.698 0.491960
## operator3:resin4 -0.1000 1.7195 -0.058 0.954105
## operator2:resin5 0.4000 1.7195 0.233 0.818024
## operator3:resin5 -1.6000 1.7195 -0.931 0.361376
## operator2:resin6 1.3000 1.7195 0.756 0.456985
## operator3:resin6 0.5000 1.7195 0.291 0.773715
## operator2:resin7 0.4500 1.7195 0.262 0.795782
## operator3:resin7 0.8500 1.7195 0.494 0.625567
## operator2:resin8 0.5000 1.7195 0.291 0.773715
## operator3:resin8 -2.7000 1.7195 -1.570 0.129454
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.216 on 24 degrees of freedom
## Multiple R-squared: 0.8999, Adjusted R-squared: 0.804
## F-statistic: 9.382 on 23 and 24 DF, p-value: 3.447e-07
summary(aov(psize~operator*resin, data=pvc))
## Df Sum Sq Mean Sq F value Pr(>F)
## operator 2 20.72 10.36 7.007 0.00401 **
## resin 7 283.95 40.56 27.439 5.66e-10 ***
## operator:resin 14 14.34 1.02 0.693 0.75987
## Residuals 24 35.48 1.48
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# SOLUTION:
\(\bar{y}_{1 1 \cdot}\) = coefficient of Intercept = 36.2500
\(\bar{y}_{3 1 \cdot}\) = coefficient of Intercept + operator3 = 36.2500-0.9500=35.30
\(\bar{y}_{3 2 \cdot}\) = coefficient of Intercept + operator3 + resin2 + operator3:resin2 =36.2500-0.9500-1.1000-0.8500 = 33.35
\(\bar{y}_{3 3 \cdot}\) = coefficient of Intercept + operator3 + resin3 + operator3:resin3 =36.2500-0.9500-5.5500-0.5500 = 29.20
\(\bar{y}_{3 4 \cdot}\) = coefficient of Intercept + operator3 + resin4 + operator3:resin4 =36.2500-0.9500-6.5500-0.1000 = 28.65
\(\bar{y}_{3 5 \cdot}\) = coefficient of Intercept + operator3 + resin5 + operator3:resin5 =36.2500-0.9500-4.4000-1.600 = 29.30
\(\bar{y}_{3 6 \cdot}\) = coefficient of Intercept + operator3 + resin6 + operator3:resin6 =36.2500-0.9500-6.0500+0.5000 = 29.75
\(\bar{y}_{3 7 \cdot}\) = coefficient of Intercept + operator3 + resin7 + operator3:resin7 =36.2500-0.9500-3.3500+0.8500 = 32.80
\(\bar{y}_{3 8 \cdot}\) = coefficient of Intercept + operator3 + resin8 + operator3:resin8 =36.2500-0.9500+0.5500-2.7000 = 33.15
#----------------------------------------------------------------
# PROBLEM 3: Now change the contrasts to
options(contrasts=c("contr.sum", "contr.sum"))
# Find the mean particle size for each resin in operator 2.
# (with two decimal places).
# Use only numbers from the following output.
summary(lm(psize~operator*resin, data=pvc))
##
## Call:
## lm(formula = psize ~ operator * resin, data = pvc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.7000 -0.3625 0.0000 0.3625 2.7000
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 32.35417 0.17550 184.359 < 2e-16 ***
## operator1 0.58958 0.24819 2.376 0.025855 *
## operator2 0.32708 0.24819 1.318 0.199983
## resin1 3.29583 0.46432 7.098 2.45e-07 ***
## resin2 2.26250 0.46432 4.873 5.74e-05 ***
## resin3 -2.50417 0.46432 -5.393 1.54e-05 ***
## resin4 -2.88750 0.46432 -6.219 1.99e-06 ***
## resin5 -1.50417 0.46432 -3.240 0.003490 **
## resin6 -2.15417 0.46432 -4.639 0.000104 ***
## resin7 0.37917 0.46432 0.817 0.422183
## operator1:resin1 0.01042 0.65664 0.016 0.987474
## operator2:resin1 -0.57708 0.65664 -0.879 0.388203
## operator1:resin2 -0.05625 0.65664 -0.086 0.932445
## operator2:resin2 0.40625 0.65664 0.619 0.541957
## operator1:resin3 0.26042 0.65664 0.397 0.695176
## operator2:resin3 -0.52708 0.65664 -0.803 0.430030
## operator1:resin4 -0.35625 0.65664 -0.543 0.592455
## operator2:resin4 0.25625 0.65664 0.390 0.699800
## operator1:resin5 0.41042 0.65664 0.625 0.537854
## operator2:resin5 0.22292 0.65664 0.339 0.737202
## operator1:resin6 -0.58958 0.65664 -0.898 0.378172
## operator2:resin6 0.12292 0.65664 0.187 0.853086
## operator1:resin7 -0.42292 0.65664 -0.644 0.525646
## operator2:resin7 -0.56042 0.65664 -0.853 0.401844
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.216 on 24 degrees of freedom
## Multiple R-squared: 0.8999, Adjusted R-squared: 0.804
## F-statistic: 9.382 on 23 and 24 DF, p-value: 3.447e-07
summary(aov(psize~operator*resin, data=pvc))
## Df Sum Sq Mean Sq F value Pr(>F)
## operator 2 20.72 10.36 7.007 0.00401 **
## resin 7 283.95 40.56 27.439 5.66e-10 ***
## operator:resin 14 14.34 1.02 0.693 0.75987
## Residuals 24 35.48 1.48
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# SOLUTION:
\(\bar{y}_{\cdot \cdot \cdot}=\hat{\mu}\) = coefficient of Intercept = 32.35417
\(\bar{y}_{i j \cdot}=\hat{\mu}_{i j} =\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j +\hat{\gamma}_{i j}\) = coefficient of Intercept + \(\alpha_i+\beta_j+\gamma_{i j}\) in the table (think \(\alpha\) as the operator, and \(\beta\) as rasin)
Therefore:
\(\bar{y}_{2 1 \cdot}\) = coefficient of Intercept + operator2 + resin1 + operator2:resin1 = 32.35417+0.32708+3.29583-0.57708 = 35.40
\(\bar{y}_{2 2 \cdot}\) = coefficient of Intercept + operator2 + resin2 + operator2:resin2 = 32.35417+0.32708+2.26250+0.40625 = 35.35
\(\bar{y}_{2 3 \cdot}\) = coefficient of Intercept + operator2 + resin3 + operator2:resin3 = 32.35417+0.32708-2.50417 -0.52708 = 29.65
\(\bar{y}_{2 4 \cdot}\) = coefficient of Intercept + operator2 + resin4 + operator2:resin4 = 32.35417+0.32708-2.88750 +0.25625 = 30.05
\(\bar{y}_{2 5 \cdot}\) = coefficient of Intercept + operator2 + resin5 + operator2:resin5 = 32.35417+0.32708-1.50417 +0.22292 = 31.40
\(\bar{y}_{2 6 \cdot}\) = coefficient of Intercept + operator2 + resin6 + operator2:resin6 = 32.35417+0.32708-2.15417 +0.12292 = 30.65
\(\bar{y}_{2 7 \cdot}\) = coefficient of Intercept + operator2 + resin7 + operator2:resin7 = 32.35417+0.32708+0.37917 -0.56042 = 32.50
Under this contract, \(\sum_{i=1}^8 resin_i=0\), and \(\sum_{i=1}^8 operator2:resin_i=0\)
\(resin8 = -\sum_{i=1}^7 resin_i=3.11251\)
\(operator2:resin8 = -\sum_{i=1}^7 operator2:resin_i=0.65624\)
Therefore,
\(\bar{y}_{2 8 \cdot}\) = coefficient of Intercept + operator2 + resin8 + operator2:resin8 = 32.35417+0.32708+3.11251 +0.65624 = 36.45
#----------------------------------------------------------------
# PROBLEM 4: Produce the ANOVA table for the complete model. Is
# the interaction significant?
# SOLUTION:
summary(aov(psize~operator*resin, data=pvc))
## Df Sum Sq Mean Sq F value Pr(>F)
## operator 2 20.72 10.36 7.007 0.00401 **
## resin 7 283.95 40.56 27.439 5.66e-10 ***
## operator:resin 14 14.34 1.02 0.693 0.75987
## Residuals 24 35.48 1.48
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#----------------------------------------------------------------
The interaction does not seem to be significant.