Recipe 3: Completely Randomized Design

Recipes for the Design of Experiments

Zoe Konrad

Rensselaer Polytechnic Institute

Fall 2014 v1

1. Setting

System under test

Low birth weight has concerned physicians for years because infant mortality rates and birth defect rates are very high among low birth weight babies. A woman’s behavior during pregnancy (including diet, smoking habits, and prenatal care) can greatly alter her chances of carrying the baby to term and, consequently, of delivering a baby of normal birth weight.

Data were collected on 189 women, 59 of whom had low birth weight babies and 130 of whom had normal birth weight babies (weighing at least 2500 grams). Two variables thought to be important were race and the number of physician visits during the first trimester of pregnancy.

x <- read.csv("~/Desktop/Zoe/Recipe 3.csv")
attach(x)
head(x)
##   ID LOW AGE LWT RACE SMOKE PTL HT UI FTV  BWT
## 1 85   0  19 182    2     0   0  0  1   0 2523
## 2 86   0  33 155    3     0   0  0  0   3 2551
## 3 87   0  20 105    1     1   0  0  0   1 2557
## 4 88   0  21 108    1     1   0  0  1   2 2594
## 5 89   0  18 107    1     1   0  0  1   0 2600
## 6 91   0  21 124    3     0   0  0  0   0 2622

Factors and Levels

The two factors of interest are RACE and FTV, each with 3 levels. RACE is coded 1 = White, 2 = Black, 3 = Other. FTV, the number of physician visits during the first trimester, is recoded to 0 = no visits, 1 = one visit, 2 = multiple visits.

plot(as.factor(FTV), xlab="FTV");plot(as.factor(RACE), xlab="RACE")

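# Collapse FTV to three levels. Because x is attached, this assignment creates a
# modified FTV in the global environment that masks the column; x$FTV is unchanged.
FTV[FTV > 0 & FTV < 2] <- 1   # exactly one visit (a no-op for integer counts)
FTV[FTV > 1] <- 2             # two or more visits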
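The same three-level recoding can also be written as a labeled factor, which avoids overwriting an attached vector. The sketch below is illustrative only: `ftv_raw` holds made-up visit counts (not the study data), and the labels `none`, `one`, and `multiple` are hypothetical names chosen for readability.

```r
# Hypothetical visit counts for illustration only (not the study data)
ftv_raw <- c(0, 1, 2, 3, 0, 6, 1, 4)

# pmin() caps every count at 2, so 2, 3, 4, ... all map to "multiple"
ftv3 <- factor(pmin(ftv_raw, 2),
               levels = 0:2,
               labels = c("none", "one", "multiple"))

# Counts per recoded level
table(ftv3)
```

Storing the result as a factor also tells R to treat the variable as categorical when it appears in a model formula.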

Response variables

The continuous response variable is BWT, the baby's birth weight in grams.

summary(BWT)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     709    2410    2980    2940    3480    4990

2. (Experimental) Design

This experiment is a completely randomized design. We will use a two-factor ANOVA to test whether RACE, FTV, or their interaction explains the variation in BWT; the null hypothesis for each term is that mean BWT is the same at every level.
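For reference, a two-factor ANOVA with interaction is usually written as an effects model. The notation below is the standard textbook form and is an assumption here, since the original does not define symbols: $i$ indexes the RACE level, $j$ the FTV level, and $k$ the mothers within each RACE and FTV combination.

```latex
\begin{align}
y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
  \qquad \varepsilon_{ijk} \sim N(0, \sigma^2) \\
H_0^{\text{RACE}} &: \alpha_1 = \alpha_2 = \alpha_3 = 0 \\
H_0^{\text{FTV}} &: \beta_1 = \beta_2 = \beta_3 = 0 \\
H_0^{\text{RACE:FTV}} &: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
\end{align}
```

Here $y_{ijk}$ is BWT, $\alpha_i$ is the RACE effect, $\beta_j$ is the FTV effect, and $(\alpha\beta)_{ij}$ is their interaction.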

3. (Statistical) Analysis

Exploratory Data Analysis Graphics

From initial boxplots of BWT across RACE and FTV levels, it is not clear whether mean BWT differs significantly across levels.

boxplot(BWT~RACE, xlab="RACE") ; boxplot(BWT~FTV, xlab="FTV")

boxplot(BWT~RACE+FTV, xlab="RACE.FTV")

qqnorm(BWT) ; qqline(BWT)

interaction.plot(RACE, FTV, BWT)

It does, however, appear that BWT is approximately normally distributed and that there is only slight interaction between RACE and FTV.

Testing

model <- aov(BWT~RACE+FTV+RACE*FTV)
anova(model)
## Analysis of Variance Table
## 
## Response: BWT
##            Df   Sum Sq Mean Sq F value Pr(>F)   
## RACE        1  3846362 3846362    7.42 0.0071 **
## FTV         1   201337  201337    0.39 0.5338   
## RACE:FTV    1    31369   31369    0.06 0.8059   
## Residuals 185 95837985  518043                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
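Each F value in the table is the term's mean square divided by the residual mean square. As a quick check, the arithmetic can be reproduced from the printed sums of squares and degrees of freedom:

```latex
\begin{align}
MS_{\text{Residuals}} &= \frac{95837985}{185} \approx 518043 \\
F_{\text{RACE}} &= \frac{3846362}{518043} \approx 7.42 \\
F_{\text{FTV}} &= \frac{201337}{518043} \approx 0.39 \\
F_{\text{RACE:FTV}} &= \frac{31369}{518043} \approx 0.06
\end{align}
```

Because each term has 1 degree of freedom here, its mean square equals its sum of squares.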

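From the ANOVA, we can reject the null hypothesis that randomization alone accounts for the variation in BWT.

RACE is a significant factor (p = 0.0071). Neither FTV (p = 0.5338) nor the RACE:FTV interaction (p = 0.8059) is significant. Note that each term has only 1 degree of freedom: RACE and FTV entered the model as numeric codes rather than as factors, so each F test assesses a linear trend in the codes, not differences among three level means.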
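To treat both variables as categorical, they can be converted with `factor()` before fitting, which gives each 3-level main effect 2 degrees of freedom. The sketch below is a hedged illustration, not the author's analysis: it assumes the recipe's CSV holds the same data as the `birthwt` data set in the MASS package (which uses lowercase column names), and the names `d`, `ftv3`, and `model_f` are hypothetical. No output is shown because the results were not part of the original.

```r
library(MASS)  # assumption: MASS::birthwt matches the recipe's CSV

d <- birthwt

# Convert the numeric codes to labeled factors
d$race <- factor(d$race, levels = 1:3,
                 labels = c("White", "Black", "Other"))
d$ftv3 <- factor(pmin(d$ftv, 2), levels = 0:2,
                 labels = c("none", "one", "multiple"))

# Two-factor ANOVA with interaction; race * ftv3 expands to
# race + ftv3 + race:ftv3
model_f <- aov(bwt ~ race * ftv3, data = d)
tab <- anova(model_f)
tab

# Cell counts show how unbalanced the design is
table(d$race, d$ftv3)
```

`anova()` reports sequential sums of squares, so with unequal cell counts the result for each term can depend on the order in which terms appear in the formula.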

Estimation (of Parameters)

summary(BWT)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     709    2410    2980    2940    3480    4990
aggregate(BWT, by=list(RACE), FUN=mean)
##   Group.1    x
## 1       1 3104
## 2       2 2720
## 3       3 2804
tukey <- TukeyHSD(aov(BWT~as.factor(RACE)))
plot(tukey)

tukey
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = BWT ~ as.factor(RACE))
## 
## $`as.factor(RACE)`
##        diff    lwr    upr  p adj
## 2-1 -384.05 -757.0 -11.05 0.0420
## 3-1 -299.72 -568.3 -31.15 0.0245
## 3-2   84.32 -305.5 474.15 0.8661

Recalling that RACE is coded 1 = White, 2 = Black, 3 = Other, we can see significant differences in mean BWT between White mothers and both Black mothers (adjusted p = 0.0420) and Other mothers (adjusted p = 0.0245). The Black and Other groups do not differ significantly (adjusted p = 0.8661).

White mothers gave birth to babies with a mean BWT of 3104 grams, whereas Black and Other mothers gave birth to babies roughly 380 and 300 grams lighter on average (means of 2720 and 2804 grams, respectively).
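The `diff` column of the Tukey output is simply the difference between two group means, which can be checked by hand. The following sketch assumes, as a labeled assumption, that the `birthwt` data set in the MASS package matches the recipe's CSV; the names `d`, `means`, `tk`, and `diff_by_hand` are hypothetical.

```r
library(MASS)  # assumption: MASS::birthwt matches the recipe's CSV

d <- transform(birthwt,
               race = factor(race, levels = 1:3,
                             labels = c("White", "Black", "Other")))

# Group sizes and group means of birth weight by race
table(d$race)
means <- tapply(d$bwt, d$race, mean)
means

# One-way ANOVA followed by Tukey's honest significant differences
tk <- TukeyHSD(aov(bwt ~ race, data = d))$race

# The Black-White row's diff equals the difference of the two group means
diff_by_hand <- means[["Black"]] - means[["White"]]
c(tukey = tk["Black-White", "diff"], by_hand = diff_by_hand)
```

The confidence limits `lwr` and `upr` widen this difference by a margin that accounts for all three pairwise comparisons being made at once.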

Diagnostics/Model Adequacy Checking

We can check the model’s normality assumption with a normal Q-Q plot of the residuals.

qqnorm(residuals(model)) ; qqline(residuals(model))
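A Q-Q plot is a visual check; numeric tests can supplement it. The sketch below is an addition that is not part of the original analysis: it applies the Shapiro-Wilk test to the residuals and Bartlett's test for equal variances across RACE groups. It assumes the `birthwt` data set in the MASS package matches the recipe's CSV, refits the recipe's model with numeric codes as written, and uses the hypothetical names `d`, `fit`, `sw`, and `bt`.

```r
library(MASS)  # assumption: MASS::birthwt matches the recipe's CSV

d <- birthwt
d$ftv <- pmin(d$ftv, 2)        # same three-level recoding as the recipe
d$race_f <- factor(d$race)     # grouping factor for the variance test

# Refit the recipe's model (numeric codes, as in the original)
fit <- aov(bwt ~ race * ftv, data = d)

# Shapiro-Wilk test: a small p-value suggests non-normal residuals
sw <- shapiro.test(residuals(fit))

# Bartlett's test: a small p-value suggests unequal group variances
bt <- bartlett.test(bwt ~ race_f, data = d)

c(shapiro_p = sw$p.value, bartlett_p = bt$p.value)
```

Bartlett's test is itself sensitive to non-normality, so the two results are best read together with the Q-Q plot rather than in isolation.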

4. References to the literature