8.3 Baby weights, Part III.

We considered the variables smoke and parity, one at a time, in modeling birth weights of babies in Exercises 8.1 and 8.2. A more realistic approach to modeling infant weights is to consider all possibly related variables at once. Other variables of interest include length of pregnancy in days (gestation), mother’s age in years (age), mother’s height in inches (height), and mother’s pregnancy weight in pounds (weight). Below are three observations from this data set.

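|      | bwt | gestation | parity | age | height | weight | smoke |
|------|-----|-----------|--------|-----|--------|--------|-------|
| 1    | 120 | 284       | 0      | 27  | 62     | 100    | 0     |
| 2    | 113 | 282       | 0      | 33  | 64     | 135    | 0     |
| ⋮    | ⋮   | ⋮         | ⋮      | ⋮   | ⋮      | ⋮      | ⋮     |
| 1236 | 117 | 297       | 0      | 38  | 65     | 129    | 0     |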

The summary table below shows the results of a regression model for predicting the average birth weight of babies based on all of the variables included in the data set.

|             | Estimate | Std. Error | t value | Pr(>\|t\|) |
|-------------|----------|------------|---------|------------|
| (Intercept) | -80.41   | 14.35      | -5.60   | 0.0000     |
| gestation   | 0.44     | 0.03       | 15.26   | 0.0000     |
| parity      | -3.33    | 1.13       | -2.95   | 0.0033     |
| age         | -0.01    | 0.09       | -0.10   | 0.9170     |
| height      | 1.15     | 0.21       | 5.63    | 0.0000     |
| weight      | 0.05     | 0.03       | 1.99    | 0.0471     |
| smoke       | -8.40    | 0.95       | -8.81   | 0.0000     |

  1. Write the equation of the regression line that includes all of the variables.

\[\widehat{\text{baby_weight}} = -80.41 + 0.44 \times \text{gestation} - 3.33 \times \text{parity} - 0.01 \times \text{age}\] \[+ 1.15 \times \text{height} + 0.05 \times \text{weight} - 8.40 \times \text{smoke}\]
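The fitted equation can be evaluated directly for any set of predictor values. The following is a minimal Python sketch, not part of the original exercise: the names `COEFFICIENTS` and `predict_birth_weight` are illustrative, and the inputs are the first observation from the data table. It also checks that changing one predictor by one unit, with the others held fixed, shifts the prediction by exactly that predictor's coefficient.

```python
# Coefficients copied from the regression summary table.
COEFFICIENTS = {
    "intercept": -80.41,
    "gestation": 0.44,
    "parity": -3.33,
    "age": -0.01,
    "height": 1.15,
    "weight": 0.05,
    "smoke": -8.40,
}


def predict_birth_weight(gestation, parity, age, height, weight, smoke):
    """Evaluate the fitted regression equation for one set of predictors."""
    c = COEFFICIENTS
    return (
        c["intercept"]
        + c["gestation"] * gestation
        + c["parity"] * parity
        + c["age"] * age
        + c["height"] * height
        + c["weight"] * weight
        + c["smoke"] * smoke
    )


# Predictor values of the first observation in the data table.
base = predict_birth_weight(284, 0, 27, 62, 100, 0)

# Same mother, but one more day of gestation or one more year of age.
one_more_day = predict_birth_weight(285, 0, 27, 62, 100, 0)
one_more_year = predict_birth_weight(284, 0, 28, 62, 100, 0)

print(round(one_more_day - base, 2))   # 0.44, the gestation coefficient
print(round(one_more_year - base, 2))  # -0.01, the age coefficient
```

Because the model is linear, these differences do not depend on which baseline values are chosen for the other predictors.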

  1. Interpret the slopes of gestation and age in this context.
  1. The coefficient for parity is different from the one in the linear model shown in Exercise 8.2. Why might there be a difference?
  1. Calculate the residual for the first observation in the data set.

\[\widehat{\text{baby_weight}} = -80.41 + 0.44 \times 284 - 3.33 \times 0 - 0.01 \times 27\] \[+ 1.15 \times 62 + 0.05 \times 100 - 8.40 \times 0 = 120.58\]

\[e_i = y_i - \hat{y}_i = 120 - 120.58 = -0.58\]
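The residual arithmetic above can be reproduced in a few lines of Python. This is an illustrative sketch only: the variable names are assumptions, and the coefficients are the rounded values from the summary table, so the result matches the hand calculation rather than the output of the original software.

```python
# Fitted coefficients from the summary table, in the order:
# intercept, gestation, parity, age, height, weight, smoke.
coefficients = [-80.41, 0.44, -3.33, -0.01, 1.15, 0.05, -8.40]

# First observation; the leading 1 multiplies the intercept.
first_obs = [1, 284, 0, 27, 62, 100, 0]
observed_bwt = 120

# Predicted value: sum of coefficient * predictor over all terms.
predicted_bwt = sum(b * x for b, x in zip(coefficients, first_obs))

# Residual: observed minus predicted.
residual = observed_bwt - predicted_bwt

print(round(predicted_bwt, 2))  # 120.58
print(round(residual, 2))       # -0.58
```

The negative residual means the model slightly overpredicts the birth weight for this observation.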

  1. The variance of the residuals is \(249.28\), and the variance of the birth weights of all babies in the data set is \(332.57\). Calculate the \(R^2\) and the adjusted \(R^2\). Note that there are \(1{,}236\) observations in the data set.

\[R^2 = 1 - \frac{\text{variability in residuals}}{\text{variability in the outcome}} = 1 - \frac{\text{Var}(e_i)}{\text{Var}(y_i)}\]

\[R^2 = 1 - \frac{249.28}{332.57} = 0.2504\]

\[R^2_{adj} = 1 - \frac{\text{Var}(e_i)/(n - k - 1)}{\text{Var}(y_i)/(n - 1)}\]

\[R^2_{adj} = 1 - \frac{249.28/(1236 - 6 - 1)}{332.57/(1236 - 1)} = 0.2468\]
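As a check on the arithmetic, the two formulas can be written as small Python functions. This is a sketch under the assumptions stated in the exercise (\(n = 1236\) observations and \(k = 6\) predictors); the function names `r_squared` and `adjusted_r_squared` are illustrative.

```python
def r_squared(var_residuals, var_outcome):
    """R^2 = 1 - Var(e_i) / Var(y_i)."""
    return 1 - var_residuals / var_outcome


def adjusted_r_squared(var_residuals, var_outcome, n, k):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (var_residuals / (n - k - 1)) / (var_outcome / (n - 1))


# Values given in the exercise.
r2 = r_squared(249.28, 332.57)
r2_adj = adjusted_r_squared(249.28, 332.57, n=1236, k=6)

print(round(r2, 4))      # 0.2504
print(round(r2_adj, 4))  # 0.2468
```

With this many observations relative to the number of predictors, the adjustment factor \((n - 1)/(n - k - 1)\) is close to 1, so the adjusted \(R^2\) is only slightly smaller than \(R^2\).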