Homework #4 is to be completed and submitted individually. You may discuss concepts with classmates, but all work you submit must be your own.
You MUST include the names of the students you worked with.
Worked with: None
A materials lab is comparing four coating methods for corrosion resistance. Metal specimens are processed in five production batches, and batch-to-batch variation is expected to affect the response. Therefore, the experiment uses a randomized complete block design with:
Each batch contains one specimen for each coating method. The observed scores are:
| Batch | A | B | C | D |
|---|---|---|---|---|
| 1 | 62 | 68 | 64 | 70 |
| 2 | 58 | 65 | 61 | 67 |
| 3 | 64 | 71 | 66 | 73 |
| 4 | 60 | 66 | 63 | 69 |
| 5 | 59 | 67 | 62 | 68 |
State why this is an RCBD:
\[Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}\]
Define each term in context (treatment effect, block effect, and random error).
Write the null and alternative hypotheses for testing whether the mean corrosion scores differ by coating method.
\(H_0: \mu_A = \mu_B = \mu_C = \mu_D\) \(H_a: \mu_i \neq \mu_{i'} \text{ for at least one pair } i \neq i'\)
Use R to fit the RCBD model (for example, with aov) and
report:
Provide a brief interpretation of each test.
scores <- c(62,68,64,70, 58,65,61,67, 64,71,66,73, 60,66,63,69, 59,67,62,68)
coating <- factor(rep(c("A","B","C","D"), 5))
batch <- factor(rep(1:5, each=4))
model <- aov(score ~ coating + batch, data=data.frame(score=scores, coating, batch))
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## coating 3 238.2 79.38 366.38 4.63e-12 ***
## batch 4 77.8 19.45 89.77 7.78e-09 ***
## Residuals 12 2.6 0.22
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Treatment (Coating): F(3, 12) = 366.39, p < 0.0001. We reject H0, there is strong evidence that mean corrosion scores differ significantly across the four coating methods.
Blocks (Batch): F(4, 12) = 89.77, p < 0.0001. The batch effect is also highly significant, confirming that blocking was essential. Batches differed greatly, and including them remove a large source of variability from the error term. Without blocking, this variation would have inflated \(MS_E\) and weakened the treatment test
model.tables(model, type = "means")
## Tables of means
## Grand mean
##
## 65.15
##
## coating
## coating
## A B C D
## 60.6 67.4 63.2 69.4
##
## batch
## batch
## 1 2 3 4 5
## 66.00 62.75 68.50 64.50 64.00
Compute and report treatment means. Based on your model output and treatment means, identify which coating method appears best and explain whether your conclusion is statistical, practical, or both.
Coating D has the highest mean corrosion score (69.40), followed by B (67.40), C (63.20), and A (60.60). Statistically, coating D is best; the near-zero \(MS_E\) = 0.217 makes it clear the means are different. Practically, the gap between D and A is ~8.8 points on the corrosion scale, a relevant difference in protection. Coating D provides the strongest corrosion resistance across all five batches consistently. This conclusion is both statistical (significant F-test) and practical (large, consistent effect size).
Check RCBD model assumptions. State whether assumptions appear reasonable, and use the methods described in class.
par(mfrow = c(1, 2))
plot(model, which = 1)
plot(model, which = 2)
shapiro.test(residuals(model))
##
## Shapiro-Wilk normality test
##
## data: residuals(model)
## W = 0.98619, p-value = 0.9879
interaction.plot(batch, coating, scores,
xlab = "Batch",
ylab = "Mean Corrosion Score",
trace.label = "Coating")
The Normal Q-Q plot shows points falling roughly in a linear pattern,
and the residual plot shows no clear pattern, suggesting assumptions are
reasonably met. Interaction plots show roughly parallel lines,
supporting the additivity assumption.
A sensory panel is evaluating five formulations of a beverage concentrate (T1, T2, T3, T4, T5). Because each panel session can evaluate only three formulations before fatigue effects appear, blocks are incomplete.
Suppose the study plans to use:
Explain why an RCBD is not feasible in this setting and why a BIBD is an appropriate target design.
An RCBD requires every block to contain all treatments. Here, each panel session can only evaluate k=3 formulations, but there are t=5 treatments, so it is physically impossible to include all five in one session. A BIBD is appropriate because it distributes treatments across blocks so that every pair of treatments appears together in exactly \(\lambda\) blocks, ensuring all pairwise comparisons are estimable with equal precision despite the incomplete blocks.
Using the BIBD identities, compute:
\(r\) (replications per treatment), \[r = \frac{bk}{t} = \frac{10 \times 3}{5} = 6\]
\(\lambda\) (pairwise concurrence).
\[\lambda = \frac{r(k-1)}{t-1} = \frac{6(3-1)}{5-1} = \frac{12}{4} = 3\]
Based on your values of \(r\) and \(\lambda\), verify whether this parameter set can satisfy BIBD balance requirements. State what “balanced” means in this context.
\(bk = tr \Rightarrow 10 \times 3 = 5 \times 6 = 30 \checkmark\) \(bk = tr \Rightarrow 10 \times 3 = 5 \times 6 = 30 \checkmark\) \(b \geq t: 10 \geq 5 \checkmark\)
Balanced means every pair of treatments co-occurs in exactly \(\lambda\) = 3 blocks, so all pairwise treatment comparisons are estimated with equal variance, no pair is more precisely estimated than another.
Consider the proposed block schedule:
For this schedule, determine if it satisfied the BIBD criteria and explain why.
# Define the block schedule
blocks <- list(
c("T1","T2","T3"),
c("T1","T4","T5"),
c("T2","T4","T5"),
c("T3","T4","T5"),
c("T1","T2","T4"),
c("T1","T3","T5"),
c("T2","T3","T4"),
c("T2","T3","T5"),
c("T1","T3","T4"),
c("T1","T2","T5")
)
treatments <- c("T1","T2","T3","T4","T5")
# Check r: how many times each treatment appears
r_counts <- sapply(treatments, function(t) sum(sapply(blocks, function(b) t %in% b)))
print(r_counts)
## T1 T2 T3 T4 T5
## 6 6 6 6 6
# Check lambda: how many times each pair co-occurs
pairs <- combn(treatments, 2)
lambda_counts <- apply(pairs, 2, function(p) {
sum(sapply(blocks, function(b) all(p %in% b)))
})
pair_names <- apply(pairs, 2, paste, collapse="-")
names(lambda_counts) <- pair_names
print(lambda_counts)
## T1-T2 T1-T3 T1-T4 T1-T5 T2-T3 T2-T4 T2-T5 T3-T4 T3-T5 T4-T5
## 3 3 3 3 3 3 3 3 3 3
This satisfies the BIBD criteria as every treatment appears exactly r=6 times and every pair appears exactly \(\lambda\) = 3 times.
Assume this schedule is used and response is a sensory score.
\[Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}\]
\(H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4 = \tau_5 = 0\) \(H_a: \tau_i \neq 0 \text{ for at least one } i\)
Panelist/session fatigue and rater differences inflate variability. By modeling session effects as \(\beta_j\), that variability is separated out of \(MS_E\), making the treatment F-test more powerful. Ignoring sessions would fold session-to-session differences into the error, inflating \(MS_E\) and making it harder to detect real formulation differences.
Calculate estimates of each treatment effect using R. Use the data
sensory.csv
sensory <- read.csv("sensory.csv")
sensory$Formulation <- factor(sensory$Formulation)
sensory$Session <- factor(sensory$Session)
model <- aov(Score ~ Formulation + Session, data = sensory)
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Formulation 4 14.707 3.677 521.102 < 2e-16 ***
## Session 9 0.442 0.049 6.962 0.000427 ***
## Residuals 16 0.113 0.007
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model.tables(model, type = "effects")
## Tables of effects
##
## Formulation
## Formulation
## T1 T2 T3 T4 T5
## -0.1667 -0.8333 -0.5333 0.4000 1.1333
##
## Session
## Session
## 1 2 3 4 5 6 7 8
## -0.07222 0.09444 -0.21667 -0.08333 0.08333 0.17222 -0.02778 -0.10556
## 9 10
## 0.05000 0.10556
model.tables(model, type = "means")
## Tables of means
## Grand mean
##
## 7.583333
##
## Formulation
## Formulation
## T1 T2 T3 T4 T5
## 7.417 6.750 7.050 7.983 8.717
##
## Session
## Session
## 1 2 3 4 5 6 7 8 9 10
## 7.511 7.678 7.367 7.500 7.667 7.756 7.556 7.478 7.633 7.689