library(daewr)
library(tidyverse)
Lawson, John. Design and Analysis of Experiments with R (Chapman & Hall/CRC Texts in Statistical Science) (p. 137). CRC Press. Kindle Edition. Submit responses to Exercises 5a, 5b, 5c, 7a, & 7b from Lawson, Chapter 4.
Lew (2007) presents the data from an experiment to determine whether cultured cells respond to two drugs. The experiment was conducted using a stable cell line plated onto Petri dishes, with each experimental run involving assays of responses in three Petri dishes: one treated with drug 1, one treated with drug 2, and one untreated serving as a control. The data are shown in the table below:
Control Drug 1 Drug 2
Experiment 1 1147 1169 1009
Experiment 2 1273 1323 1260
Experiment 3 1216 1276 1143
Experiment 4 1046 1240 1099
Experiment 5 1108 1432 1385
Experiment 6 1265 1562 1164
exp <- factor(rep(c(1,2,3,4,5,6), each = 3))
exp
## [1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6
## Levels: 1 2 3 4 5 6
treat <- factor(rep(c("Control", "Drug 1", "Drug 2")))
treat
## [1] Control Drug 1 Drug 2
## Levels: Control Drug 1 Drug 2
resp <- c(1147, 1169, 1009,
1273, 1323, 1260,
1216, 1276, 1143,
1046, 1240, 1099,
1108, 1432, 1385,
1265, 1562, 1164)
data.frame(block = exp, treatment = treat, response = resp) -> data
data
## block treatment response
## 1 1 Control 1147
## 2 1 Drug 1 1169
## 3 1 Drug 2 1009
## 4 2 Control 1273
## 5 2 Drug 1 1323
## 6 2 Drug 2 1260
## 7 3 Control 1216
## 8 3 Drug 1 1276
## 9 3 Drug 2 1143
## 10 4 Control 1046
## 11 4 Drug 1 1240
## 12 4 Drug 2 1099
## 13 5 Control 1108
## 14 5 Drug 1 1432
## 15 5 Drug 2 1385
## 16 6 Control 1265
## 17 6 Drug 1 1562
## 18 6 Drug 2 1164
# Using CRD methodology
# Hypothesis: u1 = u2 = u3
# Alternative: one of the factor effects is different
# Based on the p-value, we can reject the alternative, and conclude that there is no significant difference between the treatment groups at alpha of 0.05
(aov(response ~ treatment, data = data)) -> mod
summary(mod)
## Df Sum Sq Mean Sq F value Pr(>F)
## treatment 2 99122 49561 3.258 0.0669 .
## Residuals 15 228213 15214
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Using RCB Methodology
# Hypothesis: u1 = u2 = u3
# Alternative: one of the factor effects is different
# Based on the p-value, we can reject the null hypothesis, and conclude that there is significant difference between the treatment groups at alpha of 0.05. At least one of the treatment group is significantly different than the rest.
summary(mod <- aov(response ~ treatment + block, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## treatment 2 99122 49561 5.271 0.0273 *
## block 5 134190 26838 2.854 0.0743 .
## Residuals 10 94024 9402
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(mod, "treatment")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = response ~ treatment + block, data = data)
##
## $treatment
## diff lwr upr p adj
## Drug 1-Control 157.8333333 4.366592 311.300075 0.0440035
## Drug 2-Control 0.8333333 -152.633408 154.300075 0.9998778
## Drug 2-Drug 1 -157.0000000 -310.466741 -3.533259 0.0450905
Yes, there is difference between (a) and (b). The cause of difference is that in completely randomized design, we treat each entry as an experimental unit. In randomized complete block approach, we group homogenous units into blocks which helps to minimize within group variance. This blocking is intended to increase the sensitivity to treatment. For this particular experiment, it makes sense to minimize the natural variation in order to detect response to treatment since we trying to determine response to drug treatments. The presence of natural variance in individual cells might impede with the test which blocking helps us to minimize.
knitr::include_graphics("q7.png")
Q7
He used a stopwatch to time a runner going from home to second. He started the watch when the runner crossed a point 35 feet from home plate and stopped the watch at a point 15 feet short of second base. This eliminated the variability in times caused by stopping and starting. Finally, he timed a random sample of 22 different runners, so that his conclusions could be generalized to all runners. In addition, after an appropriate rest period, he had each runner take each path (in a random order). In that way he could use the runners as blocks and eliminate the runner-to-runner variability from the error. The data is shown in the table below.
knitr::include_graphics("q7_b.PNG")
Q7
# An RCB model is ideal for this data with each runner representing a block
# Blocking the runners this way helps to keep them homogenous within block while separate blocks contain heterogenous runners
block <- factor(rep(c(1:22), each = 3))
block
## [1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8
## [24] 8 9 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16
## [47] 16 16 17 17 17 18 18 18 19 19 19 20 20 20 21 21 21 22 22 22
## Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
treat <- c("Roundout", "Narrow", "Wide")
treat
## [1] "Roundout" "Narrow" "Wide"
resp <- c(5.40, 5.50, 5.55,
5.85, 5.70, 5.75,
5.20, 5.60, 5.50,
5.55, 5.50, 5.40,
5.90, 5.85, 5.70,
5.45, 5.55, 5.60,
5.40, 5.40, 5.35,
5.45, 5.50, 5.35,
5.25, 5.15, 5.00,
5.85, 5.80, 5.70,
5.25, 5.20, 5.10,
5.65, 5.55, 5.45,
5.60, 5.35, 5.45,
5.05, 5.00, 4.95,
5.50, 5.50, 5.40,
5.45, 5.55, 5.50,
5.55, 5.55, 5.35,
5.45, 5.50, 5.55,
5.50, 5.45, 5.25,
5.65, 5.60, 5.40,
5.70, 5.65, 5.55,
6.30, 6.30, 6.25)
resp
## [1] 5.40 5.50 5.55 5.85 5.70 5.75 5.20 5.60 5.50 5.55 5.50 5.40 5.90 5.85
## [15] 5.70 5.45 5.55 5.60 5.40 5.40 5.35 5.45 5.50 5.35 5.25 5.15 5.00 5.85
## [29] 5.80 5.70 5.25 5.20 5.10 5.65 5.55 5.45 5.60 5.35 5.45 5.05 5.00 4.95
## [43] 5.50 5.50 5.40 5.45 5.55 5.50 5.55 5.55 5.35 5.45 5.50 5.55 5.50 5.45
## [57] 5.25 5.65 5.60 5.40 5.70 5.65 5.55 6.30 6.30 6.25
data.frame(block, treatment = treat, response = resp) -> data
data
## block treatment response
## 1 1 Roundout 5.40
## 2 1 Narrow 5.50
## 3 1 Wide 5.55
## 4 2 Roundout 5.85
## 5 2 Narrow 5.70
## 6 2 Wide 5.75
## 7 3 Roundout 5.20
## 8 3 Narrow 5.60
## 9 3 Wide 5.50
## 10 4 Roundout 5.55
## 11 4 Narrow 5.50
## 12 4 Wide 5.40
## 13 5 Roundout 5.90
## 14 5 Narrow 5.85
## 15 5 Wide 5.70
## 16 6 Roundout 5.45
## 17 6 Narrow 5.55
## 18 6 Wide 5.60
## 19 7 Roundout 5.40
## 20 7 Narrow 5.40
## 21 7 Wide 5.35
## 22 8 Roundout 5.45
## 23 8 Narrow 5.50
## 24 8 Wide 5.35
## 25 9 Roundout 5.25
## 26 9 Narrow 5.15
## 27 9 Wide 5.00
## 28 10 Roundout 5.85
## 29 10 Narrow 5.80
## 30 10 Wide 5.70
## 31 11 Roundout 5.25
## 32 11 Narrow 5.20
## 33 11 Wide 5.10
## 34 12 Roundout 5.65
## 35 12 Narrow 5.55
## 36 12 Wide 5.45
## 37 13 Roundout 5.60
## 38 13 Narrow 5.35
## 39 13 Wide 5.45
## 40 14 Roundout 5.05
## 41 14 Narrow 5.00
## 42 14 Wide 4.95
## 43 15 Roundout 5.50
## 44 15 Narrow 5.50
## 45 15 Wide 5.40
## 46 16 Roundout 5.45
## 47 16 Narrow 5.55
## 48 16 Wide 5.50
## 49 17 Roundout 5.55
## 50 17 Narrow 5.55
## 51 17 Wide 5.35
## 52 18 Roundout 5.45
## 53 18 Narrow 5.50
## 54 18 Wide 5.55
## 55 19 Roundout 5.50
## 56 19 Narrow 5.45
## 57 19 Wide 5.25
## 58 20 Roundout 5.65
## 59 20 Narrow 5.60
## 60 20 Wide 5.40
## 61 21 Roundout 5.70
## 62 21 Narrow 5.65
## 63 21 Wide 5.55
## 64 22 Roundout 6.30
## 65 22 Narrow 6.30
## 66 22 Wide 6.25
# Hypothesis: u1 = u2 = u3
# Alternative: one of the treatment effects is different
# Based on the p-value, we can reject the null hypothesis, and conclude that there is significant difference between the treatment groups at alpha of 0.05. At least one of the treatment group is significantly different than the rest.
# Tukey's HSD shows that while Randout vs. Narrow is not significant, wide vs. narrow and wide vs. roundout is significant
summary(mod <- aov(response ~ treatment + block, data = data))
## Df Sum Sq Mean Sq F value Pr(>F)
## treatment 2 0.094 0.04686 6.288 0.00408 **
## block 21 4.219 0.20089 26.960 < 2e-16 ***
## Residuals 42 0.313 0.00745
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(mod, "treatment")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = response ~ treatment + block, data = data)
##
## $treatment
## diff lwr upr p adj
## Roundout-Narrow 0.009090909 -0.05414087 0.07232269 0.9350652
## Wide-Narrow -0.075000000 -0.13823178 -0.01176822 0.0167382
## Wide-Roundout -0.084090909 -0.14732269 -0.02085913 0.0066475