Analysis of Daphnia data

課題レポート解説（三中信宏）

2013-11-04

## Loading required package: splines コマンダーの GUI
## はインタラクティブなセッションでしか起動しません．

> # データ行列読み込み
> Dataset <- read.table("/Users/minaka/Documents/R packages/data/daphnia.txt", 
+     header = TRUE, sep = "", na.strings = "NA", dec = ".", strip.white = TRUE)

> # 以下，念のため，正規性検定と等分散性検定を実行する．

> # 正規性検定（Shapiro-Wilk test）
> shapiro.test(Dataset$Growth.rate)


    Shapiro-Wilk normality test

data:  Dataset$Growth.rate
W = 0.9602, p-value = 0.0226

> # 等分散性検定（Bartlett test）：Daphnia
> tapply(Dataset$Growth.rate, Dataset$Daphnia, var, na.rm = TRUE)

Clone1 Clone2 Clone3 
0.3313 1.5301 1.5290

> bartlett.test(Growth.rate ~ Daphnia, data = Dataset)


    Bartlett test of homogeneity of variances

data:  Growth.rate by Daphnia
Bartlett's K-squared = 14.03, df = 2, p-value = 0.0008987

> # 等分散性検定（Bartlett test）：Detergent
> tapply(Dataset$Growth.rate, Dataset$Detergent, var, na.rm = TRUE)

BrandA BrandB BrandC BrandD 
 1.511  1.090  1.780  2.381

> bartlett.test(Growth.rate ~ Detergent, data = Dataset)


    Bartlett test of homogeneity of variances

data:  Growth.rate by Detergent
Bartlett's K-squared = 2.606, df = 3, p-value = 0.4565

> # 等分散性検定（Bartlett test）：Water
> tapply(Dataset$Growth.rate, Dataset$Water, var, na.rm = TRUE)

 Tyne  Wear 
1.105 2.186

> bartlett.test(Growth.rate ~ Water, data = Dataset)


    Bartlett test of homogeneity of variances

data:  Growth.rate by Water
Bartlett's K-squared = 3.936, df = 1, p-value = 0.04727

> # 線形統計モデル（完全モデル）構築と当てはめ．演算子「*」によりすべての主効果と交互作用効果がモデルに含まれる．
> LinearModel.1 <- lm(Growth.rate ~ Daphnia * Detergent * Water, data = Dataset)
> summary(LinearModel.1)


Call:
lm(formula = Growth.rate ~ Daphnia * Detergent * Water, data = Dataset)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.4882 -0.5440  0.0239  0.3560  1.5250 

Coefficients:
                                        Estimate Std. Error t value
(Intercept)                               2.8113     0.4818    5.83
DaphniaClone2                             0.4964     0.6814    0.73
DaphniaClone3                             2.0553     0.6814    3.02
DetergentBrandB                          -0.0354     0.6814   -0.05
DetergentBrandC                           0.4763     0.6814    0.70
DetergentBrandD                          -0.2141     0.6814   -0.31
WaterWear                                -0.1581     0.6814   -0.23
DaphniaClone2:DetergentBrandB             0.9189     0.9636    0.95
DaphniaClone3:DetergentBrandB            -0.0649     0.9636   -0.07
DaphniaClone2:DetergentBrandC            -0.1634     0.9636   -0.17
DaphniaClone3:DetergentBrandC            -0.8079     0.9636   -0.84
DaphniaClone2:DetergentBrandD             1.0121     0.9636    1.05
DaphniaClone3:DetergentBrandD            -1.2867     0.9636   -1.34
DaphniaClone2:WaterWear                   1.3808     0.9636    1.43
DaphniaClone3:WaterWear                   0.4316     0.9636    0.45
DetergentBrandB:WaterWear                 0.4646     0.9636    0.48
DetergentBrandC:WaterWear                -0.2743     0.9636   -0.28
DetergentBrandD:WaterWear                 0.2173     0.9636    0.23
DaphniaClone2:DetergentBrandB:WaterWear  -1.2638     1.3628   -0.93
DaphniaClone3:DetergentBrandB:WaterWear  -0.8744     1.3628   -0.64
DaphniaClone2:DetergentBrandC:WaterWear   1.3561     1.3628    1.00
DaphniaClone3:DetergentBrandC:WaterWear  -1.0302     1.3628   -0.76
DaphniaClone2:DetergentBrandD:WaterWear   0.7762     1.3628    0.57
DaphniaClone3:DetergentBrandD:WaterWear  -1.5540     1.3628   -1.14
                                        Pr(>|t|)    
(Intercept)                              4.5e-07 ***
DaphniaClone2                             0.4699    
DaphniaClone3                             0.0041 ** 
DetergentBrandB                           0.9588    
DetergentBrandC                           0.4879    
DetergentBrandD                           0.7547    
WaterWear                                 0.8175    
DaphniaClone2:DetergentBrandB             0.3451    
DaphniaClone3:DetergentBrandB             0.9466    
DaphniaClone2:DetergentBrandC             0.8661    
DaphniaClone3:DetergentBrandC             0.4060    
DaphniaClone2:DetergentBrandD             0.2988    
DaphniaClone3:DetergentBrandD             0.1881    
DaphniaClone2:WaterWear                   0.1584    
DaphniaClone3:WaterWear                   0.6563    
DetergentBrandB:WaterWear                 0.6319    
DetergentBrandC:WaterWear                 0.7771    
DetergentBrandD:WaterWear                 0.8226    
DaphniaClone2:DetergentBrandB:WaterWear   0.3584    
DaphniaClone3:DetergentBrandB:WaterWear   0.5241    
DaphniaClone2:DetergentBrandC:WaterWear   0.3247    
DaphniaClone3:DetergentBrandC:WaterWear   0.4534    
DaphniaClone2:DetergentBrandD:WaterWear   0.5716    
DaphniaClone3:DetergentBrandD:WaterWear   0.2598    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.835 on 48 degrees of freedom
Multiple R-squared:  0.715, Adjusted R-squared:  0.578 
F-statistic: 5.23 on 23 and 48 DF,  p-value: 7.02e-07

> library(car)

> # 分散分析表
> Anova(LinearModel.1, type = "II")

Anova Table (Type II tests)

Response: Growth.rate
                        Sum Sq Df F value  Pr(>F)    
Daphnia                   39.2  2   28.13 8.2e-09 ***
Detergent                  2.2  3    1.06 0.37548    
Water                      2.0  1    2.85 0.09784 .  
Daphnia:Detergent         20.6  6    4.93 0.00053 ***
Daphnia:Water             13.7  2    9.86 0.00026 ***
Detergent:Water            0.2  3    0.08 0.96861    
Daphnia:Detergent:Water    5.8  6    1.40 0.23432    
Residuals                 33.4 48                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> # この分散分析表から，主効果 Daphnia ならびに Daphnia
> # が関係する二次の交互作用効果 Daphnia:Detergent と Daphnia:Water
> # が有意になることがわかる．

> # AICによる逐次的モデル改良
> stepwise(LinearModel.1, direction = "backward/forward", criterion = "AIC")

Loading required package: MASS


Direction:  backward/forward
Criterion:  AIC 

Start:  AIC=-7.24
Growth.rate ~ Daphnia * Detergent * Water

                          Df Sum of Sq  RSS   AIC
- Daphnia:Detergent:Water  6      5.85 39.3 -7.64
<none>                                 33.4 -7.24

Step:  AIC=-7.64
Growth.rate ~ Daphnia + Detergent + Water + Daphnia:Detergent + 
    Daphnia:Water + Detergent:Water

                          Df Sum of Sq  RSS    AIC
- Detergent:Water          3      0.17 39.5 -13.32
<none>                                 39.3  -7.64
+ Daphnia:Detergent:Water  6      5.85 33.4  -7.24
- Daphnia:Water            2     13.73 53.0   9.95
- Daphnia:Detergent        6     20.60 59.9  10.72

Step:  AIC=-13.32
Growth.rate ~ Daphnia + Detergent + Water + Daphnia:Detergent + 
    Daphnia:Water

                    Df Sum of Sq  RSS    AIC
<none>                           39.5 -13.32
+ Detergent:Water    3      0.17 39.3  -7.64
- Daphnia:Water      2     13.73 53.2   4.19
- Daphnia:Detergent  6     20.60 60.1   4.93


Call:
lm(formula = Growth.rate ~ Daphnia + Detergent + Water + Daphnia:Detergent + 
    Daphnia:Water, data = Dataset)

Coefficients:
                  (Intercept)                  DaphniaClone2  
                       2.7603                         0.3878  
                DaphniaClone3                DetergentBrandB  
                       2.4876                         0.1969  
              DetergentBrandC                DetergentBrandD  
                       0.3391                        -0.1054  
                    WaterWear  DaphniaClone2:DetergentBrandB  
                      -0.0562                         0.2870  
DaphniaClone3:DetergentBrandB  DaphniaClone2:DetergentBrandC  
                      -0.5021                         0.5147  
DaphniaClone3:DetergentBrandC  DaphniaClone2:DetergentBrandD  
                      -1.3230                         1.4002  
DaphniaClone3:DetergentBrandD        DaphniaClone2:WaterWear  
                      -2.0637                         1.5979  
      DaphniaClone3:WaterWear  
                      -0.4331

> # AICによる逐次的モデル改良の結果は，分散分析表で有意になったすべての項とともに
> # Daphnia 以外のふたつの主効果 Detergent と Water も含まれることがわかる．

> library(multcomp, pos = 4)

Loading required package: mvtnorm Loading required package: survival

> library(abind, pos = 4)

> # Tukey HSD 検定：Daphnia
> AnovaModel.2 <- aov(Growth.rate ~ Daphnia, data = Dataset)
> summary(AnovaModel.2)

            Df Sum Sq Mean Sq F value Pr(>F)    
Daphnia      2   39.2   19.59    17.3  8e-07 ***
Residuals   69   78.0    1.13                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> numSummary(Dataset$Growth.rate, groups = Dataset$Daphnia, statistics = c("mean", 
+     "sd"))

Loading required package: e1071 Loading required package: class

        mean     sd data:n
Clone1 2.840 0.5756     24
Clone2 4.577 1.2370     24
Clone3 4.139 1.2365     24

> .Pairs <- glht(AnovaModel.2, linfct = mcp(Daphnia = "Tukey"))
> summary(.Pairs)  # pairwise tests


     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = Growth.rate ~ Daphnia, data = Dataset)

Linear Hypotheses:
                     Estimate Std. Error t value Pr(>|t|)    
Clone2 - Clone1 == 0    1.737      0.307    5.66  < 1e-04 ***
Clone3 - Clone1 == 0    1.299      0.307    4.23  0.00022 ***
Clone3 - Clone2 == 0   -0.438      0.307   -1.43  0.33207    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

> confint(.Pairs)  # confidence intervals


     Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = Growth.rate ~ Daphnia, data = Dataset)

Quantile = 2.395
95% family-wise confidence level


Linear Hypotheses:
                     Estimate lwr    upr   
Clone2 - Clone1 == 0  1.737    1.002  2.472
Clone3 - Clone1 == 0  1.299    0.564  2.034
Clone3 - Clone2 == 0 -0.438   -1.174  0.297

> cld(.Pairs)  # compact letter display

Clone1 Clone2 Clone3 
   "a"    "b"    "b"

> old.oma <- par(oma = c(0, 5, 0, 0))
> plot(confint(.Pairs))

plot of chunk unnamed-chunk-16

> par(old.oma)
> remove(.Pairs)

> # Tukey HSD 検定：Detergent
> AnovaModel.3 <- aov(Growth.rate ~ Detergent, data = Dataset)
> summary(AnovaModel.3)

            Df Sum Sq Mean Sq F value Pr(>F)
Detergent    3    2.2   0.737    0.44   0.73
Residuals   68  114.9   1.690

> numSummary(Dataset$Growth.rate, groups = Dataset$Detergent, statistics = c("mean", 
+     "sd"))

        mean    sd data:n
BrandA 3.885 1.229     18
BrandB 4.010 1.044     18
BrandC 3.955 1.334     18
BrandD 3.558 1.543     18

> .Pairs <- glht(AnovaModel.3, linfct = mcp(Detergent = "Tukey"))
> summary(.Pairs)  # pairwise tests


     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = Growth.rate ~ Detergent, data = Dataset)

Linear Hypotheses:
                     Estimate Std. Error t value Pr(>|t|)
BrandB - BrandA == 0   0.1252     0.4334    0.29     0.99
BrandC - BrandA == 0   0.0697     0.4334    0.16     1.00
BrandD - BrandA == 0  -0.3266     0.4334   -0.75     0.87
BrandC - BrandB == 0  -0.0555     0.4334   -0.13     1.00
BrandD - BrandB == 0  -0.4518     0.4334   -1.04     0.73
BrandD - BrandC == 0  -0.3963     0.4334   -0.91     0.80
(Adjusted p values reported -- single-step method)

> confint(.Pairs)  # confidence intervals


     Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = Growth.rate ~ Detergent, data = Dataset)

Quantile = 2.633
95% family-wise confidence level


Linear Hypotheses:
                     Estimate lwr     upr    
BrandB - BrandA == 0  0.1252  -1.0160  1.2665
BrandC - BrandA == 0  0.0697  -1.0716  1.2109
BrandD - BrandA == 0 -0.3266  -1.4679  0.8147
BrandC - BrandB == 0 -0.0555  -1.1968  1.0857
BrandD - BrandB == 0 -0.4518  -1.5931  0.6894
BrandD - BrandC == 0 -0.3963  -1.5375  0.7450

> cld(.Pairs)  # compact letter display

BrandA BrandB BrandC BrandD 
   "a"    "a"    "a"    "a"

> old.oma <- par(oma = c(0, 5, 0, 0))
> plot(confint(.Pairs))

plot of chunk unnamed-chunk-17

> par(old.oma)
> remove(.Pairs)

> # Tukey HSD 検定：Water ※二水準なので通常のt検定
> AnovaModel.4 <- aov(Growth.rate ~ Water, data = Dataset)
> summary(AnovaModel.4)

            Df Sum Sq Mean Sq F value Pr(>F)
Water        1      2    1.99    1.21   0.28
Residuals   70    115    1.65

> numSummary(Dataset$Growth.rate, groups = Dataset$Water, statistics = c("mean", 
+     "sd"))

      mean    sd data:n
Tyne 3.686 1.051     36
Wear 4.018 1.478     36

> # 平均値のプロット（1）
> plotMeans(Dataset$Growth.rate, Dataset$Detergent, Dataset$Daphnia, error.bars = "se")

plot of chunk unnamed-chunk-19

> # 平均値のプロット（2）
> plotMeans(Dataset$Growth.rate, Dataset$Daphnia, Dataset$Water, error.bars = "se")

plot of chunk unnamed-chunk-20

> # 平均値のプロット（3）
> plotMeans(Dataset$Growth.rate, Dataset$Detergent, Dataset$Water, error.bars = "se")

plot of chunk unnamed-chunk-21

> # これらの平均値のプロットを見ると，Daphnia
> # が関係する最初の二つのグラフは「交差」が生じ，交互作用効果が可視化されていることがわかる．

> # −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
> # 以上の解説は R Commander での実行結果を markdown
> # ファイルとして保存した上で，RStudio の knitr を用いて html 化し，最終的に
> # RPubs にオンライン公開した．