Analysis of Daphnia data
課題レポート解説(三中信宏)
2013-11-04
## Loading required package: splines コマンダーの GUI
## はインタラクティブなセッションでしか起動しません.
> # データ行列読み込み
> Dataset <- read.table("/Users/minaka/Documents/R packages/data/daphnia.txt",
+ header = TRUE, sep = "", na.strings = "NA", dec = ".", strip.white = TRUE)
> # 以下,念のため,正規性検定と等分散性検定を実行する.
> # 正規性検定(Shapiro-Wilk test)
> shapiro.test(Dataset$Growth.rate)
Shapiro-Wilk normality test
data: Dataset$Growth.rate
W = 0.9602, p-value = 0.0226
> # 等分散性検定(Bartlett test):Daphnia
> tapply(Dataset$Growth.rate, Dataset$Daphnia, var, na.rm = TRUE)
Clone1 Clone2 Clone3
0.3313 1.5301 1.5290
> bartlett.test(Growth.rate ~ Daphnia, data = Dataset)
Bartlett test of homogeneity of variances
data: Growth.rate by Daphnia
Bartlett's K-squared = 14.03, df = 2, p-value = 0.0008987
> # 等分散性検定(Bartlett test):Detergent
> tapply(Dataset$Growth.rate, Dataset$Detergent, var, na.rm = TRUE)
BrandA BrandB BrandC BrandD
1.511 1.090 1.780 2.381
> bartlett.test(Growth.rate ~ Detergent, data = Dataset)
Bartlett test of homogeneity of variances
data: Growth.rate by Detergent
Bartlett's K-squared = 2.606, df = 3, p-value = 0.4565
> # 等分散性検定(Bartlett test):Water
> tapply(Dataset$Growth.rate, Dataset$Water, var, na.rm = TRUE)
Tyne Wear
1.105 2.186
> bartlett.test(Growth.rate ~ Water, data = Dataset)
Bartlett test of homogeneity of variances
data: Growth.rate by Water
Bartlett's K-squared = 3.936, df = 1, p-value = 0.04727
> # 線形統計モデル(完全モデル)構築と当てはめ.演算子「*」によりすべての主効果と交互作用効果がモデルに含まれる.
> LinearModel.1 <- lm(Growth.rate ~ Daphnia * Detergent * Water, data = Dataset)
> summary(LinearModel.1)
Call:
lm(formula = Growth.rate ~ Daphnia * Detergent * Water, data = Dataset)
Residuals:
Min 1Q Median 3Q Max
-1.4882 -0.5440 0.0239 0.3560 1.5250
Coefficients:
Estimate Std. Error t value
(Intercept) 2.8113 0.4818 5.83
DaphniaClone2 0.4964 0.6814 0.73
DaphniaClone3 2.0553 0.6814 3.02
DetergentBrandB -0.0354 0.6814 -0.05
DetergentBrandC 0.4763 0.6814 0.70
DetergentBrandD -0.2141 0.6814 -0.31
WaterWear -0.1581 0.6814 -0.23
DaphniaClone2:DetergentBrandB 0.9189 0.9636 0.95
DaphniaClone3:DetergentBrandB -0.0649 0.9636 -0.07
DaphniaClone2:DetergentBrandC -0.1634 0.9636 -0.17
DaphniaClone3:DetergentBrandC -0.8079 0.9636 -0.84
DaphniaClone2:DetergentBrandD 1.0121 0.9636 1.05
DaphniaClone3:DetergentBrandD -1.2867 0.9636 -1.34
DaphniaClone2:WaterWear 1.3808 0.9636 1.43
DaphniaClone3:WaterWear 0.4316 0.9636 0.45
DetergentBrandB:WaterWear 0.4646 0.9636 0.48
DetergentBrandC:WaterWear -0.2743 0.9636 -0.28
DetergentBrandD:WaterWear 0.2173 0.9636 0.23
DaphniaClone2:DetergentBrandB:WaterWear -1.2638 1.3628 -0.93
DaphniaClone3:DetergentBrandB:WaterWear -0.8744 1.3628 -0.64
DaphniaClone2:DetergentBrandC:WaterWear 1.3561 1.3628 1.00
DaphniaClone3:DetergentBrandC:WaterWear -1.0302 1.3628 -0.76
DaphniaClone2:DetergentBrandD:WaterWear 0.7762 1.3628 0.57
DaphniaClone3:DetergentBrandD:WaterWear -1.5540 1.3628 -1.14
Pr(>|t|)
(Intercept) 4.5e-07 ***
DaphniaClone2 0.4699
DaphniaClone3 0.0041 **
DetergentBrandB 0.9588
DetergentBrandC 0.4879
DetergentBrandD 0.7547
WaterWear 0.8175
DaphniaClone2:DetergentBrandB 0.3451
DaphniaClone3:DetergentBrandB 0.9466
DaphniaClone2:DetergentBrandC 0.8661
DaphniaClone3:DetergentBrandC 0.4060
DaphniaClone2:DetergentBrandD 0.2988
DaphniaClone3:DetergentBrandD 0.1881
DaphniaClone2:WaterWear 0.1584
DaphniaClone3:WaterWear 0.6563
DetergentBrandB:WaterWear 0.6319
DetergentBrandC:WaterWear 0.7771
DetergentBrandD:WaterWear 0.8226
DaphniaClone2:DetergentBrandB:WaterWear 0.3584
DaphniaClone3:DetergentBrandB:WaterWear 0.5241
DaphniaClone2:DetergentBrandC:WaterWear 0.3247
DaphniaClone3:DetergentBrandC:WaterWear 0.4534
DaphniaClone2:DetergentBrandD:WaterWear 0.5716
DaphniaClone3:DetergentBrandD:WaterWear 0.2598
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.835 on 48 degrees of freedom
Multiple R-squared: 0.715, Adjusted R-squared: 0.578
F-statistic: 5.23 on 23 and 48 DF, p-value: 7.02e-07
> library(car)
> # 分散分析表
> Anova(LinearModel.1, type = "II")
Anova Table (Type II tests)
Response: Growth.rate
Sum Sq Df F value Pr(>F)
Daphnia 39.2 2 28.13 8.2e-09 ***
Detergent 2.2 3 1.06 0.37548
Water 2.0 1 2.85 0.09784 .
Daphnia:Detergent 20.6 6 4.93 0.00053 ***
Daphnia:Water 13.7 2 9.86 0.00026 ***
Detergent:Water 0.2 3 0.08 0.96861
Daphnia:Detergent:Water 5.8 6 1.40 0.23432
Residuals 33.4 48
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> # この分散分析表から,主効果 Daphnia ならびに Daphnia
> # が関係する二次の交互作用効果 Daphnia:Detergent と Daphnia:Water
> # が有意になることがわかる.
> # AICによる逐次的モデル改良
> stepwise(LinearModel.1, direction = "backward/forward", criterion = "AIC")
Loading required package: MASS
Direction: backward/forward
Criterion: AIC
Start: AIC=-7.24
Growth.rate ~ Daphnia * Detergent * Water
Df Sum of Sq RSS AIC
- Daphnia:Detergent:Water 6 5.85 39.3 -7.64
<none> 33.4 -7.24
Step: AIC=-7.64
Growth.rate ~ Daphnia + Detergent + Water + Daphnia:Detergent +
Daphnia:Water + Detergent:Water
Df Sum of Sq RSS AIC
- Detergent:Water 3 0.17 39.5 -13.32
<none> 39.3 -7.64
+ Daphnia:Detergent:Water 6 5.85 33.4 -7.24
- Daphnia:Water 2 13.73 53.0 9.95
- Daphnia:Detergent 6 20.60 59.9 10.72
Step: AIC=-13.32
Growth.rate ~ Daphnia + Detergent + Water + Daphnia:Detergent +
Daphnia:Water
Df Sum of Sq RSS AIC
<none> 39.5 -13.32
+ Detergent:Water 3 0.17 39.3 -7.64
- Daphnia:Water 2 13.73 53.2 4.19
- Daphnia:Detergent 6 20.60 60.1 4.93
Call:
lm(formula = Growth.rate ~ Daphnia + Detergent + Water + Daphnia:Detergent +
Daphnia:Water, data = Dataset)
Coefficients:
(Intercept) DaphniaClone2
2.7603 0.3878
DaphniaClone3 DetergentBrandB
2.4876 0.1969
DetergentBrandC DetergentBrandD
0.3391 -0.1054
WaterWear DaphniaClone2:DetergentBrandB
-0.0562 0.2870
DaphniaClone3:DetergentBrandB DaphniaClone2:DetergentBrandC
-0.5021 0.5147
DaphniaClone3:DetergentBrandC DaphniaClone2:DetergentBrandD
-1.3230 1.4002
DaphniaClone3:DetergentBrandD DaphniaClone2:WaterWear
-2.0637 1.5979
DaphniaClone3:WaterWear
-0.4331
> # AICによる逐次的モデル改良の結果は,分散分析表で有意になったすべての項とともに
> # Daphnia 以外のふたつの主効果 Detergent と Water も含まれることがわかる.
> library(multcomp, pos = 4)
Loading required package: mvtnorm Loading required package: survival
> library(abind, pos = 4)
> # Tukey HSD 検定:Daphnia
> AnovaModel.2 <- aov(Growth.rate ~ Daphnia, data = Dataset)
> summary(AnovaModel.2)
Df Sum Sq Mean Sq F value Pr(>F)
Daphnia 2 39.2 19.59 17.3 8e-07 ***
Residuals 69 78.0 1.13
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> numSummary(Dataset$Growth.rate, groups = Dataset$Daphnia, statistics = c("mean",
+ "sd"))
Loading required package: e1071 Loading required package: class
mean sd data:n
Clone1 2.840 0.5756 24
Clone2 4.577 1.2370 24
Clone3 4.139 1.2365 24
> .Pairs <- glht(AnovaModel.2, linfct = mcp(Daphnia = "Tukey"))
> summary(.Pairs) # pairwise tests
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = Growth.rate ~ Daphnia, data = Dataset)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Clone2 - Clone1 == 0 1.737 0.307 5.66 < 1e-04 ***
Clone3 - Clone1 == 0 1.299 0.307 4.23 0.00022 ***
Clone3 - Clone2 == 0 -0.438 0.307 -1.43 0.33207
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
> confint(.Pairs) # confidence intervals
Simultaneous Confidence Intervals
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = Growth.rate ~ Daphnia, data = Dataset)
Quantile = 2.395
95% family-wise confidence level
Linear Hypotheses:
Estimate lwr upr
Clone2 - Clone1 == 0 1.737 1.002 2.472
Clone3 - Clone1 == 0 1.299 0.564 2.034
Clone3 - Clone2 == 0 -0.438 -1.174 0.297
> cld(.Pairs) # compact letter display
Clone1 Clone2 Clone3
"a" "b" "b"
> old.oma <- par(oma = c(0, 5, 0, 0))
> plot(confint(.Pairs))

> par(old.oma)
> remove(.Pairs)
> # Tukey HSD 検定:Detergent
> AnovaModel.3 <- aov(Growth.rate ~ Detergent, data = Dataset)
> summary(AnovaModel.3)
Df Sum Sq Mean Sq F value Pr(>F)
Detergent 3 2.2 0.737 0.44 0.73
Residuals 68 114.9 1.690
> numSummary(Dataset$Growth.rate, groups = Dataset$Detergent, statistics = c("mean",
+ "sd"))
mean sd data:n
BrandA 3.885 1.229 18
BrandB 4.010 1.044 18
BrandC 3.955 1.334 18
BrandD 3.558 1.543 18
> .Pairs <- glht(AnovaModel.3, linfct = mcp(Detergent = "Tukey"))
> summary(.Pairs) # pairwise tests
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = Growth.rate ~ Detergent, data = Dataset)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
BrandB - BrandA == 0 0.1252 0.4334 0.29 0.99
BrandC - BrandA == 0 0.0697 0.4334 0.16 1.00
BrandD - BrandA == 0 -0.3266 0.4334 -0.75 0.87
BrandC - BrandB == 0 -0.0555 0.4334 -0.13 1.00
BrandD - BrandB == 0 -0.4518 0.4334 -1.04 0.73
BrandD - BrandC == 0 -0.3963 0.4334 -0.91 0.80
(Adjusted p values reported -- single-step method)
> confint(.Pairs) # confidence intervals
Simultaneous Confidence Intervals
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = Growth.rate ~ Detergent, data = Dataset)
Quantile = 2.633
95% family-wise confidence level
Linear Hypotheses:
Estimate lwr upr
BrandB - BrandA == 0 0.1252 -1.0160 1.2665
BrandC - BrandA == 0 0.0697 -1.0716 1.2109
BrandD - BrandA == 0 -0.3266 -1.4679 0.8147
BrandC - BrandB == 0 -0.0555 -1.1968 1.0857
BrandD - BrandB == 0 -0.4518 -1.5931 0.6894
BrandD - BrandC == 0 -0.3963 -1.5375 0.7450
> cld(.Pairs) # compact letter display
BrandA BrandB BrandC BrandD
"a" "a" "a" "a"
> old.oma <- par(oma = c(0, 5, 0, 0))
> plot(confint(.Pairs))

> par(old.oma)
> remove(.Pairs)
> # Tukey HSD 検定:Water ※二水準なので通常のt検定
> AnovaModel.4 <- aov(Growth.rate ~ Water, data = Dataset)
> summary(AnovaModel.4)
Df Sum Sq Mean Sq F value Pr(>F)
Water 1 2 1.99 1.21 0.28
Residuals 70 115 1.65
> numSummary(Dataset$Growth.rate, groups = Dataset$Water, statistics = c("mean",
+ "sd"))
mean sd data:n
Tyne 3.686 1.051 36
Wear 4.018 1.478 36
> # 平均値のプロット(1)
> plotMeans(Dataset$Growth.rate, Dataset$Detergent, Dataset$Daphnia, error.bars = "se")

> # 平均値のプロット(2)
> plotMeans(Dataset$Growth.rate, Dataset$Daphnia, Dataset$Water, error.bars = "se")

> # 平均値のプロット(3)
> plotMeans(Dataset$Growth.rate, Dataset$Detergent, Dataset$Water, error.bars = "se")

> # これらの平均値のプロットを見ると,Daphnia
> # が関係する最初の二つのグラフは「交差」が生じ,交互作用効果が可視化されていることがわかる.
> # −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
> # 以上の解説は R Commander での実行結果を markdown
> # ファイルとして保存した上で,RStudio の knitr を用いて html 化し,最終的に
> # RPubs にオンライン公開した.