We considered the variables smoke and parity, one at a time, in modeling birth weights of babies in Exercises 8.1 and 8.2. A more realistic approach to modeling infant weights is to consider all possibly related variables at once. Other variables of interest include length of pregnancy in days (gestation), mother’s age in years (age), mother’s height in inches (height), and mother’s pregnancy weight in pounds (weight). Below are three observations from this data set.
| | bwt | gestation | parity | age | height | weight | smoke |
|---|---|---|---|---|---|---|---|
| 1 | 120 | 284 | 0 | 27 | 62 | 100 | 0 |
| 2 | 113 | 282 | 0 | 33 | 64 | 135 | 0 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 1236 | 117 | 297 | 0 | 38 | 65 | 129 | 0 |
The summary table below shows the results of a regression model for predicting the average birth weight of babies based on all of the variables included in the data set.
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | -80.41 | 14.35 | -5.60 | 0.0000 |
| gestation | 0.44 | 0.03 | 15.26 | 0.0000 |
| parity | -3.33 | 1.13 | -2.95 | 0.0033 |
| age | -0.01 | 0.09 | -0.10 | 0.9170 |
| height | 1.15 | 0.21 | 5.63 | 0.0000 |
| weight | 0.05 | 0.03 | 1.99 | 0.0471 |
| smoke | -8.40 | 0.95 | -8.81 | 0.0000 |
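The fitted regression equation, with coefficients read from the summary table, is:
\[\begin{aligned}
\widehat{\text{baby_weight}} = {} & -80.41 + 0.44 \times \text{gestation} - 3.33 \times \text{parity} - 0.01 \times \text{age} \\
& + 1.15 \times \text{height} + 0.05 \times \text{weight} - 8.40 \times \text{smoke}
\end{aligned}\]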
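For the first observation in the data set (gestation = 284, parity = 0, age = 27, height = 62, weight = 100, smoke = 0), the predicted birth weight is:
\[\begin{aligned}
\widehat{\text{baby_weight}} = {} & -80.41 + 0.44 \times 284 - 3.33 \times 0 - 0.01 \times 27 \\
& + 1.15 \times 62 + 0.05 \times 100 - 8.40 \times 0 = 120.58
\end{aligned}\]
The observed birth weight for this baby is 120, so the residual is:
\[e_i = y_i - \hat{y}_i = 120 - 120.58 = -0.58\]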
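The prediction and residual for the first observation can be checked numerically. The following is a minimal sketch, not part of the original exercise: the coefficients are copied from the summary table, and the names `COEFFICIENTS` and `predict_bwt` are hypothetical helpers introduced only for illustration.

```python
# Coefficients copied from the regression summary table (rounded as printed).
COEFFICIENTS = {
    "intercept": -80.41,
    "gestation": 0.44,
    "parity": -3.33,
    "age": -0.01,
    "height": 1.15,
    "weight": 0.05,
    "smoke": -8.40,
}


def predict_bwt(obs):
    """Return the fitted birth weight for one observation.

    `obs` maps each predictor name to its value; this helper is an
    illustrative assumption, not a function from any library.
    """
    return COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in obs.items()
    )


# First row of the data set shown above.
first = {
    "gestation": 284,
    "parity": 0,
    "age": 27,
    "height": 62,
    "weight": 100,
    "smoke": 0,
}
observed = 120

predicted = predict_bwt(first)
residual = observed - predicted  # e_i = y_i - y_hat_i
print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
```

Because the table reports rounded coefficients, a prediction computed from the unrounded fit could differ slightly in the last digit.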
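The coefficient of determination compares the variability in the residuals with the variability in the outcome:
\[R^2 = 1 - \frac{\text{variability in residuals}}{\text{variability in the outcome}} = 1 - \frac{Var(e_i)}{Var(y_i)}\]
With a residual variance of 249.28 and an outcome variance of 332.57:
\[R^2 = 1 - \frac{249.28}{332.57} = 0.2504\]
The adjusted version divides each variance by its degrees of freedom, where \(n = 1236\) is the number of observations and \(k = 6\) is the number of predictors:
\[R^2_{adj} = 1 - \frac{Var(e_i)/(n - k - 1)}{Var(y_i)/(n - 1)}\]
\[R^2_{adj} = 1 - \frac{249.28/(1236 - 6 - 1)}{332.57/(1236 - 1)} = 0.2468\]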
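The two values can be reproduced with a few lines of arithmetic. This is a sketch under the assumption that the variances 249.28 and 332.57 are taken as given; the variable names are illustrative only.

```python
# Quantities given in the worked solution.
var_residuals = 249.28  # Var(e_i)
var_outcome = 332.57    # Var(y_i)
n = 1236                # number of observations
k = 6                   # number of predictors in the model

# Unadjusted R^2: share of outcome variability explained by the model.
r2 = 1 - var_residuals / var_outcome

# Adjusted R^2: each variance is divided by its degrees of freedom,
# which penalizes the model for the number of predictors.
r2_adj = 1 - (var_residuals / (n - k - 1)) / (var_outcome / (n - 1))

print(f"R^2 = {r2:.4f}, adjusted R^2 = {r2_adj:.4f}")
```

The adjustment is small here because the sample size is large relative to the number of predictors.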