8.2 Baby weights, Part II:

Exercise 8.1 introduces a data set on birth weight of babies. Another variable we consider is parity, which is 0 if the child is the first born, and 1 otherwise. The summary table below shows the results of a linear regression model for predicting the average birth weight of babies, measured in ounces, from parity.

  1. Write the equation of the regression line.
  2. Interpret the slope in this context, and calculate the predicted birth weight of first borns and others.
  3. Is there a statistically significant relationship between the average birth weight and parity?
Answer:
  1. y <- 120.07 - 1.93 * parity
  2. firstBorns = 120.07; notFirstBorns = 120.07 - 1.93 = 118.14
  3. There is no statistically significant relationship, since the t-value of -1.62 is diferent from the typical .05 alpha level.

8.4 Absenteeism. Researchers interested in the relationship between absenteeism from school

and certain demographic characteristics of children collected data from 146 randomly sampled students in rural New SouthWales, Australia, in a particular school year. Below are three observations from this data set.

  1. Write the equation of the regression line.
  2. Interpret each one of the slopes in this context.
  3. Calculate the residual for the first observation in the data set: a student who is aboriginal, male, a slow learner, and missed 2 days of school.
  4. The variance of the residuals is 240.57, and the variance of the number of absent days for all students in the data set is 264.17. Calculate the R2 and the adjusted R2. Note that there are 146 observations in the data set.
Answer:
  1. \(\hat{y} = 18.93 ??? 9.11 ??? eth + 3.10 ??? sex + 2.15 ??? lrn\)
  2. With everything else being equal, the slope coefficients mean that an aboriginal student is likely to be absent 9.11 more days, a male is likely to miss 3.11 more days and a “slow learner” was likely to miss 2.15 more days.
  3. \(\hat{y_0} = 18.93 ??? 9.11 ??? 0 + 3.10 ??? 1 + 2.15 ??? 1 = 24.18\)

$ e_1 = y_1 ??? = 2 ??? 24.18 = ???22.18 $

$R^2 = 1 - = 0.0893364 $ \(R^2_adj = 1 - \frac{240.57}{264.17} * \frac{145}{145 - 3} = 0.070097\)

8.8 Absenteeism, Part II:

The no learner status has the highest \(R_{adj}^2 = 0.0723\), and it is higher than the \(R_{adj}^2 = 0.0701\) of the full model. The learner status variable should be eliminated from the model first.

8.16

Challenger disaster, Part I:

a). Out of 23 shuttle missions, it appears that missions launched at higher ambient tempreatures produced the least or no damaged O-rings. Also, the launch with the most damaged O-rings (5) happens to be the mission launched on the coldest ambient temperature (53 F). All missions launched in temperature below 65 F had at least one damaged O-ring. Out of 13 missions launched in temperature 70 F or higher, only 3 showed damaged O-rings. These two variables seem to be inversely related.

b). From the summary table, there is an inverse relationship between temperature and O-ring failures. This implies that Increase in temperature decreases the probability of an O-ring failure by 0.216.

c). \(log(\frac{\hat{p}}{1-\hat{p}}) = 11.6630 - 0.2162 * Temperature\)

d). The p-value is practically 0 which shows that the failures are likely not due to chance. Moreover, there appears to be an inverse relationship between temperature and failures of the O-RINGS. So, I think it is justified to be concerned.

8.18 Challenger disaster, Part II:

p51<- exp(11.663-51*.2126)/(1+exp(11.663-51*.2126))
p51
## [1] 0.6943212
p53<- exp(11.663-53*.2126)/(1+exp(11.663-53*.2126))
p53
## [1] 0.5975339
p55<- exp(11.663-55*.2126)/(1+exp(11.663-55*.2126))
p55
## [1] 0.4925006
probs <- data.frame(pbs = c(p51, p53, p55, .341, .251, .179, .124, .084, .056, .037, .024), index = c(1:11))

ggplot(probs) + geom_smooth(aes(y = pbs, x = index))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Binary codes (0 and 1) are used to denote either damaged or not for the O-rings without any indication about the degree of damage of the rings, and this could hold significant information. Moreover, there might be a situation whereby the rings were reused for different missions. The sample size of 23 diffrerent missions doesn’t seem large enough. The model could be improved by collecting more data on more missions