Low birth weight has concerned physicians for years because infant mortality and birth defect rates are very high among low birth weight babies. A woman’s behavior during pregnancy (including diet, smoking habits, and prenatal care) can greatly alter her chances of carrying the baby to term and, consequently, of delivering a baby of normal birth weight.
Data were collected on 189 women, 59 of whom had low birth weight babies and 130 of whom had normal birth weight babies (weighing at least 2500 grams). Two variables thought to be important were race and the number of physician visits during the first trimester of pregnancy.
x <- read.csv("~/Desktop/Zoe/Recipe 3.csv")
attach(x)
head(x)
## ID LOW AGE LWT RACE SMOKE PTL HT UI FTV BWT
## 1 85 0 19 182 2 0 0 0 1 0 2523
## 2 86 0 33 155 3 0 0 0 0 3 2551
## 3 87 0 20 105 1 1 0 0 0 1 2557
## 4 88 0 21 108 1 1 0 0 1 2 2594
## 5 89 0 18 107 1 1 0 0 1 0 2600
## 6 91 0 21 124 3 0 0 0 0 0 2622
The two factors of interest are RACE and FTV, each with 3 levels. RACE is coded 1 = White, 2 = Black, 3 = Other. FTV, the number of physician visits during the first trimester, is recoded to 0 = no visits, 1 = one visit, 2 = multiple visits.
plot(as.factor(FTV), xlab="FTV");plot(as.factor(RACE), xlab="RACE")
FTV[FTV > 1] <- 2 # collapse 2 or more visits into level 2; counts of 0 and 1 are already coded correctly
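The same three-level recoding can be done without overwriting the raw counts, and the result can be stored as a labelled factor. This is a minimal sketch, assuming a hypothetical vector `ftv_raw` of visit counts; the values and the names `ftv_capped` and `ftv3` are made up for illustration and are not taken from the data set.

```r
# Hypothetical visit counts, for illustration only (not the study data)
ftv_raw <- c(0, 1, 2, 3, 0, 6, 1, 4)

# Cap counts at 2 so that every value above 1 falls into "multiple visits"
ftv_capped <- pmin(ftv_raw, 2)

# Store as a labelled factor so that modelling functions treat it as categorical
ftv3 <- factor(ftv_capped, levels = 0:2,
               labels = c("none", "one", "multiple"))

# Count of observations per level
table(ftv3)
```

Keeping the raw counts in a separate variable makes it possible to try other groupings later.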
The continuous response variable is BWT, the baby's birth weight in grams.
summary(BWT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 709 2410 2980 2940 3480 4990
We analyze the data as a completely randomized design, with the caveat that race cannot be randomly assigned, so the data are observational. We will use a two-factor ANOVA to test the hypothesis that either RACE or FTV can explain the variation in BWT.
From initial boxplots of BWT across RACE and FTV levels, it is not clear whether the means differ significantly across levels.
boxplot(BWT~RACE, xlab="RACE") ; boxplot(BWT~FTV, xlab="FTV")
boxplot(BWT~RACE+FTV, xlab="RACE.FTV")
qqnorm(BWT) ; qqline(BWT)
interaction.plot(RACE, FTV, BWT)
It does, however, appear that BWT is approximately normally distributed and that there is only slight interaction between RACE and FTV.
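The interaction plot draws the mean of BWT within each RACE-by-FTV cell, so the same cell means can also be tabulated directly. The sketch below uses simulated data; the data frame `d`, its column names, and its values are hypothetical stand-ins, not the study data.

```r
set.seed(1)

# Simulated stand-in for the study data (hypothetical values)
d <- data.frame(
  race = factor(sample(1:3, 90, replace = TRUE)),
  ftv  = factor(sample(0:2, 90, replace = TRUE)),
  bwt  = rnorm(90, mean = 2900, sd = 700)
)

# Mean birth weight in each race-by-ftv cell (rows: race, columns: ftv)
cell_means <- with(d, tapply(bwt, list(race = race, ftv = ftv), mean))
round(cell_means)
```

If the rows of this table rise and fall together across the columns, the lines in the interaction plot are roughly parallel, which suggests little interaction.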
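# RACE and FTV are numeric codes here, so each term is fit with 1 degree of freedom
model <- aov(BWT ~ RACE * FTV) # expands to RACE + FTV + RACE:FTV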
anova(model)
## Analysis of Variance Table
##
## Response: BWT
## Df Sum Sq Mean Sq F value Pr(>F)
## RACE 1 3846362 3846362 7.42 0.0071 **
## FTV 1 201337 201337 0.39 0.5338
## RACE:FTV 1 31369 31369 0.06 0.8059
## Residuals 185 95837985 518043
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the ANOVA, we can reject the null hypothesis that randomization alone can account for the variation in BWT.
RACE is a significant factor, with a p-value of 0.0071. Neither FTV nor the interaction term RACE:FTV is significant.
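In the table above each term has 1 degree of freedom because RACE and FTV enter the model as numeric codes. To treat them as three-level categorical factors, they can be converted with `factor()` before fitting, which gives 2 degrees of freedom per main effect and 4 for the interaction. The sketch below uses simulated data (the data frame `d` and its values are hypothetical), so its F statistics say nothing about the real data.

```r
set.seed(42)
n <- 189

# Simulated stand-in for the study data (hypothetical values)
d <- data.frame(
  race = sample(1:3, n, replace = TRUE),
  ftv  = sample(0:2, n, replace = TRUE)
)
d$bwt <- rnorm(n, mean = 2900, sd = 700)

# Convert the numeric codes to labelled factors
d$race <- factor(d$race, levels = 1:3, labels = c("White", "Black", "Other"))
d$ftv  <- factor(d$ftv, levels = 0:2, labels = c("none", "one", "multiple"))

# Two-factor ANOVA with interaction
fit <- aov(bwt ~ race * ftv, data = d)
tab <- anova(fit)

# Degrees of freedom for race, ftv, race:ftv, and residuals
tab$Df
```

Because the cell counts are unequal, the sequential sums of squares reported by `anova()` depend on the order in which the terms are entered.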
summary(BWT)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 709 2410 2980 2940 3480 4990
aggregate(BWT, by=list(RACE), FUN=mean)
## Group.1 x
## 1 1 3104
## 2 2 2720
## 3 3 2804
tukey <- TukeyHSD(aov(BWT~as.factor(RACE)))
plot(tukey)
tukey
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = BWT ~ as.factor(RACE))
##
## $`as.factor(RACE)`
## diff lwr upr p adj
## 2-1 -384.05 -757.0 -11.05 0.0420
## 3-1 -299.72 -568.3 -31.15 0.0245
## 3-2 84.32 -305.5 474.15 0.8661
Recalling that RACE is coded 1 = White, 2 = Black, 3 = Other, we can see that the mean for White mothers differs significantly from the means for both Black and Other mothers, while the Black and Other groups do not differ significantly from each other.
White mothers gave birth to babies with a mean BWT of 3104 grams, whereas Black and Other mothers gave birth to babies roughly 300 to 400 grams lighter (means of 2720 and 2804 grams, respectively).
We can check the model's normality assumption with a Q-Q plot of the residuals.
qqnorm(residuals(model)) ; qqline(residuals(model))
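A Q-Q plot is a visual check, and a formal test can complement it. The sketch below applies `shapiro.test()` to the residuals for normality and `bartlett.test()` for equal variances across groups. It runs on simulated data (the data frame `d` and its values are hypothetical), so the p-values it prints are not results for the study data.

```r
set.seed(7)

# Simulated stand-in for the study data (hypothetical values)
d <- data.frame(
  race = factor(sample(c("White", "Black", "Other"), 120, replace = TRUE)),
  bwt  = rnorm(120, mean = 2900, sd = 700)
)
fit <- aov(bwt ~ race, data = d)

# Shapiro-Wilk: a small p-value is evidence against normally distributed residuals
sw <- shapiro.test(residuals(fit))

# Bartlett: a small p-value is evidence against equal variances across groups
bt <- bartlett.test(bwt ~ race, data = d)

c(shapiro = sw$p.value, bartlett = bt$p.value)
```

Bartlett's test is itself sensitive to non-normality, so it is best read alongside the Q-Q plot rather than on its own.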