Exercise 8.2 Baby Weights, Part II (p395)

(a) Write the equation of the regression line.

\(\hat{y} = 120.07 - 1.93 x_{parity}\)

(b) Interpret the slope in this context, and calculate the predicted birth weight of first borns and others.
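
A minimal sketch of the calculation for part (b), using the coefficients from part (a). The coding of parity is not stated in this section, so the sketch assumes 0 for first borns and 1 for all other children; the names `b0`, `b1`, and `predicted_weight` are illustrative only.

```r
# Coefficients taken from the regression line in part (a)
b0 <- 120.07
b1 <- -1.93

# ASSUMPTION: parity = 0 for first borns, 1 for all other children
parity <- c(first_born = 0, other = 1)

# Predicted birth weight for each group
predicted_weight <- b0 + b1 * parity
predicted_weight
```

Under that assumed coding, the slope is the difference between the two predictions: children who are not first born are predicted to weigh 1.93 units less than first borns.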

(c) Is there a statistically significant relationship between the average birth weight and parity?

Exercise 8.4 Absenteeism (p397)

(a) Write the equation of the regression line.

\(\hat{y} = 18.93 - 9.11 x_{eth} + 3.10 x_{sex} + 2.15 x_{lrn}\)

(b) Interpret each one of the slopes in this context.
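
One way to read the slopes is to switch a single indicator from 0 to 1 while holding the other two fixed. The sketch below does that with the fitted line from part (a). The helper `predict_days` is a hypothetical name, and the coding (eth = 0 for aboriginal, sex = 1 for male, lrn = 1 for slow learner) is inferred from the values used in part (c) rather than stated explicitly.

```r
# Fitted line from part (a); predict_days is a hypothetical helper name
predict_days <- function(eth, sex, lrn) {
  18.93 - 9.11 * eth + 3.10 * sex + 2.15 * lrn
}

# ASSUMPTION (inferred from part (c)):
# eth = 0 aboriginal, sex = 1 male, lrn = 1 slow learner
baseline <- predict_days(eth = 0, sex = 0, lrn = 0)

# Change in predicted days absent when one indicator switches from 0 to 1
slope_check <- c(
  eth = predict_days(1, 0, 0) - baseline,
  sex = predict_days(0, 1, 0) - baseline,
  lrn = predict_days(0, 0, 1) - baseline
)
slope_check
```

Each difference equals the corresponding coefficient, which is why a slope on a 0/1 variable is interpreted as the predicted difference in days absent between the two groups, all else held constant.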

(c) Calculate the residual for the first observation in the data set: a student who is aboriginal, male, a slow learner, and missed 2 days of school.

eth <- 0
sex <- 1
lrn <- 1

predictdays <- 18.93 - 9.11*eth + 3.1*sex + 2.15*lrn
days <- 2

resid <- days - predictdays

resid
## [1] -22.18

(d) The variance of the residuals is 240.57, and the variance of the number of absent days for all students in the data set is 264.17. Calculate the \(R^2\) and the adjusted \(R^2\). Note that there are 146 observations in the data set.

varresid <- 240.57
varabs <- 264.17
n <- 146
k <- 3

R2 <- 1-(varresid/varabs)
R2a <- 1 - ((varresid/varabs)*((n-1)/(n-k-1)))

R2
## [1] 0.08933641
R2a
## [1] 0.07009704

Exercise 8.8 Absenteeism, Part II (p399)

Exercise 8.16 Challenger disaster, Part I (p403)

(a) Each column of the table above represents a different shuttle mission. Examine these data and describe what you observe with respect to the relationship between temperature and damaged O-rings.

library(ggplot2)
temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,
                 78,79,81)

damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)

undamaged <- c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6)

data <- data.frame(temperature = temperature, damaged = damaged, 
                   undamaged = undamaged)

ggplot(data,aes(x=temperature,y=damaged)) + geom_point() 
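
As a numeric complement to the scatterplot, the sketch below splits the missions at 65 degrees and counts damage in each group. The 65-degree cutoff is an arbitrary choice made for illustration, not something taken from the exercise, and `summary_by_group` is an illustrative name.

```r
temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,
                 78,79,81)
damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)

# ASSUMPTION: 65 degrees is an arbitrary cutoff chosen for illustration
cold <- temperature < 65

# Count missions, missions with any damage, and damaged O-rings per group
summary_by_group <- data.frame(
  group = c("below 65", "65 and above"),
  missions = c(sum(cold), sum(!cold)),
  missions_with_damage = c(sum(damaged[cold] > 0), sum(damaged[!cold] > 0)),
  damaged_orings = c(sum(damaged[cold]), sum(damaged[!cold]))
)
summary_by_group
```

All four missions below the cutoff had at least one damaged O-ring, compared with 3 of the 19 warmer missions, which is consistent with damage being more common at lower temperatures.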

(b) Failures have been coded as 1 for a damaged O-ring and 0 for an undamaged O-ring, and a logistic regression model was fit to these data. A summary of this model is given below. Describe the key components of this summary table in words.
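
The summary table itself is not reproduced in this section. As a sketch of how a comparable table could be produced, the code below fits a grouped binomial logistic regression to the mission counts listed above. It assumes the textbook model was fit to these same counts; the object name `fit` is illustrative.

```r
temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,
                 78,79,81)
damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
undamaged <- c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6)

# Grouped binomial fit: each row is one mission, with counts of
# damaged (successes) and undamaged (failures) O-rings
fit <- glm(cbind(damaged, undamaged) ~ temperature, family = binomial)

# Columns: estimate, standard error, z value, and two-sided p-value
summary(fit)$coefficients
```

The rows of the coefficient table correspond to the intercept and the temperature slope; the estimates should be close to the point estimates used in part (c).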

(c) Write out the logistic model using the point estimates of the model parameters.

\(\log_e(\frac{p_i}{1-p_i}) = 11.6630 - 0.2162 x_{temp}\)
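
Because the model is linear on the log-odds scale, exponentiating the slope gives the multiplicative change in the odds of damage per one-degree increase in temperature. A short sketch using the point estimates above (the variable names are illustrative, and `plogis()` is base R's inverse logit):

```r
# Point estimates from the logistic model in part (c)
b0 <- 11.6630
b1 <- -0.2162

# Multiplicative change in the odds of damage per one-degree increase
odds_ratio_per_degree <- exp(b1)
odds_ratio_per_degree   # roughly 0.81

# plogis() converts log-odds back to a probability
p51 <- plogis(b0 + b1 * 51)
p51
```

An odds ratio of about 0.81 means each additional degree is associated with roughly a 19% reduction in the odds that an O-ring is damaged.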

(d) Based on the model, do you think concerns regarding O-rings are justified? Explain.

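# Probability that an O-ring is damaged at a given temperature,
# using the point estimates from part (c)
oringModel <- function(temp)
{
  # Linear predictor (log-odds of damage)
  right <- 11.6630 - 0.2162 * temp
  
  # Inverse logit: convert log-odds to a probability
  prob <- exp(right) / (1 + exp(right))
  
  return (prob)
}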
temps <- seq(32, 85)
dfProbDamage <- data.frame(Temperature=temps, ProbDamage=oringModel(temps))

g1 <- ggplot(dfProbDamage) + geom_line(aes(x=Temperature, y=ProbDamage )) 
g1

Exercise 8.18 Challenger disaster, Part II (p404)

temps <- c(51,53,55)
dfProbDamage <- data.frame(Temperature=temps, ProbDamage=oringModel(temps))
dfProbDamage
##   Temperature ProbDamage
## 1          51  0.6540297
## 2          53  0.5509228
## 3          55  0.4432456
dfRaw <- data.frame(Mission=seq(1, 23), 
                    Temp=c(53,57,58,63,66,67,67,67,68,69,70,70,70,
                           70,72,73,75,75,76,76,78,79,81),
                    Damaged=c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0),
                    Undamaged=c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6))
dfRaw$ProbDamage <- dfRaw$Damaged / (dfRaw$Damaged + dfRaw$Undamaged)
head(dfRaw)
##   Mission Temp Damaged Undamaged ProbDamage
## 1       1   53       5         1  0.8333333
## 2       2   57       1         5  0.1666667
## 3       3   58       1         5  0.1666667
## 4       4   63       1         5  0.1666667
## 5       5   66       0         6  0.0000000
## 6       6   67       0         6  0.0000000
temps <- seq(51, 71, by=2)
dfProbDamage <- data.frame(Temperature=temps, ProbDamage=oringModel(temps))
g1 <- ggplot(dfRaw) + 
  geom_point(aes(x=Temp, y=ProbDamage), alpha=0.5, colour="blue") + 
  geom_line(data=dfProbDamage, aes(x=Temperature, y=ProbDamage), colour="red") +
  geom_point(data=dfProbDamage, aes(x=Temperature, y=ProbDamage), colour="red") +
  labs(x="Temperature", y="Probability of damage") +
  ylim(0, 1) 
g1

(c) Describe any concerns you may have regarding applying logistic regression in this application, and note any assumptions that are required to accept the model’s validity.