Data Set: Plant Growth
library(ggplot2)
plant.df <- PlantGrowth
plant.df$group <- factor(plant.df$group, labels = c("Control", "Treatment1", "Treatment2"))
Visualizing Factors (with reordering):
attach(plant.df)
ggplot(plant.df, aes(group, weight, fill = group)) +
  geom_boxplot(aes(x = reorder(group, weight, median)), color = "blue", notch = FALSE)
Fitting the ANOVA Model:
anovafit <- aov(weight ~ group, data = plant.df)
summary(anovafit)
##             Df Sum Sq Mean Sq F value Pr(>F)
## group        2  3.766  1.8832   4.846 0.0159 *
## Residuals   27 10.492  0.3886
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
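The F value in this table is the ratio of the group mean square to the residual mean square, and the p-value is the upper tail of an F distribution with 2 and 27 degrees of freedom. A minimal sketch that recomputes both from the fitted model; it refits on the built-in `PlantGrowth` data so it runs on its own, and the names `fit`, `tab`, `f_stat` and `p_val` are illustrative rather than part of the original analysis:

```r
# Refit the one-way ANOVA on the built-in data (group labels do not affect F)
fit <- aov(weight ~ group, data = PlantGrowth)

# summary() of an aov fit is a list; its first element is the ANOVA table
tab <- summary(fit)[[1]]
ms  <- tab[["Mean Sq"]]   # mean squares: group, residuals
dof <- tab[["Df"]]        # degrees of freedom: group, residuals

# F = MS(group) / MS(residuals)
f_stat <- ms[1] / ms[2]

# p-value = P(F_{2,27} > f_stat)
p_val <- pf(f_stat, dof[1], dof[2], lower.tail = FALSE)

print(c(F = f_stat, p = p_val))
```

The printed values should agree with the `F value` and `Pr(>F)` columns above up to rounding.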
The ANOVA table shows a p-value of 0.0159, so H0 (the null hypothesis that all group means are equal) is rejected at the 5% level. The F test does not say which groups differ or by how much; for that we need pairwise comparisons.
Pairwise t-test comparison:
pairwise.t.test(weight, group, p.adjust.method = "bonferroni")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: weight and group
##
##            Control Treatment1
## Treatment1 0.583   -
## Treatment2 0.263   0.013
##
## P value adjustment method: bonferroni
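The Bonferroni adjustment used above multiplies each raw p-value by the number of comparisons (three here) and caps the result at 1. A short sketch of that arithmetic, assuming the built-in `PlantGrowth` data with its original group labels; the names `raw`, `p_raw` and `p_bonf` are illustrative:

```r
# Unadjusted pairwise t tests with pooled SD
raw <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                       p.adjust.method = "none")$p.value

# The result is a lower-triangular matrix; keep the three actual comparisons
p_raw <- raw[lower.tri(raw, diag = TRUE)]

# Bonferroni by hand: multiply by the number of tests, cap at 1
p_bonf <- pmin(1, p_raw * length(p_raw))

# Same adjustment via the built-in helper
p_builtin <- p.adjust(p_raw, method = "bonferroni")

print(round(cbind(raw = p_raw, bonferroni = p_bonf), 4))
```

The `bonferroni` column should reproduce the adjusted p-values in the table above.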
Tukey Honest Significant Difference test:
TukeyHSD(anovafit, conf.level = 0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = weight ~ group, data = plant.df)
##
## $group
##                         diff        lwr       upr     p adj
## Treatment1-Control    -0.371 -1.0622161 0.3202161 0.3908711
## Treatment2-Control     0.494 -0.1972161 1.1852161 0.1979960
## Treatment2-Treatment1  0.865  0.1737839 1.5562161 0.0120064
The Tukey test shows that the Treatment1 and Treatment2 means differ (adjusted p = 0.012, and the interval excludes zero), but there is no conclusive evidence that either treatment differs from the control group.
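The `lwr` and `upr` columns are each difference plus or minus a common half-width taken from the studentized range distribution. A sketch of that calculation, assuming a balanced design (equal observations per group, which holds for `PlantGrowth`); the names `mse`, `q_crit` and `half_width` are illustrative:

```r
fit <- aov(weight ~ group, data = PlantGrowth)

k   <- nlevels(PlantGrowth$group)    # number of groups
n   <- nrow(PlantGrowth) / k         # observations per group (balanced design assumed)
dfr <- df.residual(fit)              # residual degrees of freedom
mse <- deviance(fit) / dfr           # residual mean square

# Critical value of the studentized range at the 95% family-wise level
q_crit <- qtukey(0.95, nmeans = k, df = dfr)

# Half-width of every pairwise interval in a balanced design
half_width <- q_crit / sqrt(2) * sqrt(mse * (1 / n + 1 / n))

print(half_width)
```

Adding and subtracting `half_width` from a `diff` value should give the corresponding `lwr` and `upr` entries.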
ANOVA test using the lm function:
lmfit <- lm(weight ~ group, data = plant.df)
summary(lmfit)
##
## Call:
## lm(formula = weight ~ group, data = plant.df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.0710 -0.4180 -0.0060  0.2627  1.3690
##
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)       5.0320     0.1971  25.527   <2e-16 ***
## groupTreatment1  -0.3710     0.2788  -1.331   0.1944
## groupTreatment2   0.4940     0.2788   1.772   0.0877 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6234 on 27 degrees of freedom
## Multiple R-squared: 0.2641, Adjusted R-squared: 0.2096
## F-statistic: 4.846 on 2 and 27 DF, p-value: 0.01591
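With R's default treatment contrasts, the intercept is the mean of the reference group (Control) and each remaining coefficient is that group's mean minus the Control mean. A small sketch that checks this against the raw group means; the data frame is rebuilt so the block runs on its own, and `group_means` and `coefs` are illustrative names:

```r
# Rebuild the relabelled data frame used in this analysis
plant.df <- PlantGrowth
plant.df$group <- factor(plant.df$group,
                         labels = c("Control", "Treatment1", "Treatment2"))

# Raw mean weight per group
group_means <- tapply(plant.df$weight, plant.df$group, mean)

# Coefficients of the linear model
coefs <- coef(lm(weight ~ group, data = plant.df))

# Intercept vs. Control mean, and coefficients vs. differences from Control
print(group_means)
print(c(coefs["(Intercept)"],
        group_means["Treatment1"] - group_means["Control"],
        group_means["Treatment2"] - group_means["Control"]))
```

This is also why the `groupTreatment1` and `groupTreatment2` estimates equal the first two `diff` values reported by `TukeyHSD`.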
anova(lmfit)
## Analysis of Variance Table
##
## Response: weight
##           Df  Sum Sq Mean Sq F value  Pr(>F)
## group      2  3.7663  1.8832  4.8461 0.01591 *
## Residuals 27 10.4921  0.3886
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Both methods give the same F statistic (4.846 on 2 and 27 degrees of freedom).
Calculation of confidence intervals:
confint(lmfit, level = 0.95)
##                       2.5 %    97.5 %
## (Intercept)      4.62752600 5.4364740
## groupTreatment1 -0.94301261 0.2010126
## groupTreatment2 -0.07801261 1.0660126
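Each interval above is the coefficient estimate plus or minus a t critical value (27 residual degrees of freedom) times its standard error. A minimal sketch of the same calculation done by hand, refitting on the built-in `PlantGrowth` data (the coefficient names differ from the relabelled fit, the numbers do not); `est`, `se`, `t_crit` and `manual_ci` are illustrative names:

```r
fit <- lm(weight ~ group, data = PlantGrowth)

# Estimates and standard errors from the coefficient table
ctab <- coef(summary(fit))
est  <- ctab[, "Estimate"]
se   <- ctab[, "Std. Error"]

# Two-sided 95% critical value of the t distribution
t_crit <- qt(0.975, df = df.residual(fit))

# estimate +/- t_crit * standard error
manual_ci <- cbind(lower = est - t_crit * se,
                   upper = est + t_crit * se)

print(manual_ci)
```

Both treatment intervals contain zero, which is consistent with the non-significant t tests for those coefficients in the `summary(lmfit)` output.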
# Residuals vs. fitted values, colored by treatment group
plant.mod <- data.frame(fitted = fitted(lmfit), residuals = resid(lmfit), treatment = plant.df$group)
ggplot(plant.mod, aes(fitted, residuals, color = treatment)) + geom_point()