PROBLEM 1
library(nlme)
dye <- read.table("http://www.rohan.sdsu.edu/~babailey/stat700/dye.dat", header=T)
dye$Batch <- as.factor(dye$Batch)
#dye
(a) Make strip chart of Strength by Batch. What do you notice?
stripchart(Strength ~ Batch, vertical=T,method="stack",xlab="Batch",ylab="Strength",data=dye,main="Dyestuff Data Stripchart")
One notices that batches 2,3 and 5 are on the lower quartiles of strength with roughly similar standard deviations. While batches 4 and 6 are in the higher quartiles of strength. Again the standard deviations of those five batches seem plausible to all be the same although the means may be different. As for batch one there appears to be a higher standard devation.
fit = lm(Strength ~ Batch , data=dye)
summary(fit)
##
## Call:
## lm(formula = Strength ~ Batch, data = dye)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.000 -1.250 0.125 1.500 4.500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 92.000 0.863 106.59 < 2e-16 ***
## Batch2 -1.500 1.221 -1.23 0.23
## Batch3 -0.750 1.221 -0.61 0.54
## Batch4 6.250 1.221 5.12 1.7e-05 ***
## Batch5 -0.833 1.221 -0.68 0.50
## Batch6 6.500 1.221 5.33 9.3e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.11 on 30 degrees of freedom
## Multiple R-squared: 0.756, Adjusted R-squared: 0.715
## F-statistic: 18.6 on 5 and 30 DF, p-value: 2.16e-08
anova(fit)
## Analysis of Variance Table
##
## Response: Strength
## Df Sum Sq Mean Sq F value Pr(>F)
## Batch 5 415 83.1 18.6 2.2e-08 ***
## Residuals 30 134 4.5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
After looking at the Anova table the p-value of the batch is 2.157e-08 which is signicantly small, therefore we reject the null hypothesis H0: = 0
Problem 2 (a) We can use the R function lme to fit
fit2 = lme(Strength~1, data=dye, random=~1 | Batch)
summary(fit2)
## Linear mixed-effects model fit by REML
## Data: dye
## AIC BIC logLik
## 175.9 180.6 -84.96
##
## Random effects:
## Formula: ~1 | Batch
## (Intercept) Residual
## StdDev: 3.62 2.114
##
## Fixed effects: Strength ~ 1
## Value Std.Error DF t-value p-value
## (Intercept) 93.61 1.519 30 61.62 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.93304 -0.65770 0.07683 0.81223 2.08757
##
## Number of Observations: 36
## Number of Groups: 6
For the intercept /simga^2 we get a StdDev = 3.619968 and a Residual Std Deviation simga^2 a = 2.114106.
plot(fit2)
fit
##
## Call:
## lm(formula = Strength ~ Batch, data = dye)
##
## Coefficients:
## (Intercept) Batch2 Batch3 Batch4 Batch5
## 92.000 -1.500 -0.750 6.250 -0.833
## Batch6
## 6.500
The plot now has a gap in values from 93 to 98 but overall there is no specific pattern in the “Standardized residuals vs Fitted values” plot.