8.1 BABY WEIGHTS, PART I. The Child Health and Development Studies investigate a range of topics. One study considered all pregnancies between 1960 and 1967 among women in the Kaiser Foundation Health Plan in the San Francisco East Bay area. Here, we study the relationship between smoking and weight of the baby. The variable smoke is coded 1 if the mother is a smoker, and 0 if not. The summary table below shows the results of a linear regression model for predicting the average birth weight of babies, measured in ounces, based on the smoking status of the mother.

Var          Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)    123.05        0.65   189.60    0.0000
smoke           -8.94        1.03    -8.65    0.0000

The variability within the smoker and non-smoker groups is about equal, and the distributions are symmetric. With these conditions satisfied, it is reasonable to apply the model. (Note that we don’t need to check linearity since the predictor has only two levels.)

(a) Write the equation of the regression line.

The regression output gives the intercept (123.05) and the slope for smoke (-8.94), so the equation of the line is:

\(\widehat{\text{babyweight}} = 123.05 - 8.94 \times \text{smoke}\)

(b) Interpret the slope in this context, and calculate the predicted birth weight of babies born to smoker and non-smoker mothers.

The variable smoke is a two-level categorical variable (smoker = 1, non-smoker = 0), so the x-axis takes only the values 0 and 1, with the y-values (weight in ounces) stacked above each. The regression line starts at the y-intercept (123.05) at smoke = 0 and runs to smoke = 1 with the given slope (-8.94). Because the slope is negative, the line trends downward from left to right.

In a regular scatterplot, the slope of the regression line is the change in the predicted response for a one-unit increase in the predictor. Since the predictor here has only two levels, a one-unit increase means going from non-smoker (0) to smoker (1), so the slope is the estimated difference in average birth weight between babies born to mothers who smoke and those born to mothers who do not:

Mother who smokes: 123.05 - 8.94 × 1 = 114.11 ounces

Mother who does not smoke: 123.05 - 8.94 × 0 = 123.05 ounces

The model predicts that babies born to mothers who smoke will weigh, on average, 8.94 ounces less than babies born to mothers who do not smoke.
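The arithmetic in part (b) can be checked with a few lines of Python. This is only a minimal sketch: the coefficients are copied from the regression table, while the function name `predict_birth_weight` and the constant names are made up for illustration and are not part of the original exercise.

```python
# Coefficients copied from the regression table in the exercise.
INTERCEPT = 123.05  # predicted birth weight (oz) when smoke = 0
SLOPE = -8.94       # change in predicted weight (oz) going from smoke = 0 to smoke = 1


def predict_birth_weight(smoke: int) -> float:
    """Return the predicted birth weight in ounces for smoke = 0 or 1.

    Hypothetical helper for illustration; it simply evaluates the fitted line.
    """
    if smoke not in (0, 1):
        raise ValueError("smoke must be 0 (non-smoker) or 1 (smoker)")
    # Round to two decimals to match the precision of the table.
    return round(INTERCEPT + SLOPE * smoke, 2)


non_smoker = predict_birth_weight(0)
smoker = predict_birth_weight(1)
# The gap between the two predictions is the slope itself.
difference = round(smoker - non_smoker, 2)

print(f"non-smoker: {non_smoker} oz")
print(f"smoker:     {smoker} oz")
print(f"difference: {difference} oz")
```

Because the predictor only takes the values 0 and 1, the difference between the two predictions equals the slope, which is why the slope can be read directly as the estimated gap in average birth weight.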

(c) Is there a statistically significant relationship between the average birth weight and smoking?

In the regression output, the p-value for smoke is reported as 0.0000, that is, zero to four decimal places. In terms of a hypothesis test on the slope:

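\(H_{0}\): \(\beta_{1} = 0\) (the true slope is zero: average birth weight does not differ by smoking status)
\(H_{A}\): \(\beta_{1} \neq 0\) (the true slope is not zero: average birth weight differs by smoking status)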

Since the p-value is so small, we reject the null hypothesis and conclude that the data provide strong evidence that the slope (\(\beta_{1}\)) is not zero. In other words, there is a statistically significant association between the mother's smoking status and average birth weight.
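As a rough check on part (c), the test statistic can be recomputed as the estimate divided by its standard error. The sketch below is written under stated assumptions: the sample size is not given in this write-up, so instead of the t-distribution it uses a standard normal approximation for the two-sided p-value (reasonable only if the sample is large), and the variable names are illustrative.

```python
import math

# Values copied from the "smoke" row of the regression table.
estimate = -8.94
std_error = 1.03

# Test statistic for H0: beta_1 = 0.
t_stat = estimate / std_error

# ASSUMPTION: the degrees of freedom are not stated here, so the two-sided
# p-value is approximated with the standard normal distribution:
# p = 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"t statistic: {t_stat:.2f}")
print(f"approximate two-sided p-value: {p_value:.3g}")
```

The recomputed statistic is about -8.68 rather than the tabled -8.65, presumably because the table's estimate and standard error are themselves rounded. Either way, the approximate p-value is far below any conventional significance level, which agrees with the 0.0000 shown in the output.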