\[birthweight = 120.07 - 1.93parity\]
(not_first_born = 120.07 - 1.93 * 1 )
## [1] 118.14
(first_born = 120.07)
## [1] 120.07
\[y = 18.93 - 9.11eth + 3.10sex + 2.15lrn\]
eth: a student has 9.11 fewer days absent if ethnicity is not aboriginal sex: a student has 3.10 more days absent if sex is male lrn: a student has 2.15 more days absent if a slow learner
eth = 0
sex = 1
lrn = 1
days = 2
(fitted_value = 18.93 - 9.11 * eth + 3.10 * sex + 2.15 * lrn )
## [1] 24.18
(residual = days - fitted_value )
## [1] -22.18
The residual is -22.18 days. That is, the model predicts a higher number of absent days for this student that was actually realized.
(R_squared = 1 - (240.57/264.17) )
## [1] 0.08933641
n = 146
k = 3
( R_squared_adj = 1 - (240.57/264.17) * ( n - 1)/(n - k - 1) )
## [1] 0.07009704
We prefer the model with the highest adjusted R-squared. THis would be the model 4 which has no learner status. It has an adjusted R-squared of 7.23%. So we should remove the lrn (learner status) variable.
The data suggests that lower temperatures are associated with more damaged O-rings.
As the temperature increases, the probability of O-ring damage decreases because the sign of the coefficient for temp predictor is negative. The intercept of 11.663 is associated with the logistic regression model for which temperature is zero.
The logistic model equation has the form:
\[log( \frac{p}{1-p}) = 11.663 - 0.2162 temp\]
This is equivalent to:
\[p = \frac{ exp( 11.663 - 0.2162 temp )}{ 1 + exp( 11.663 - 0.2162 temp)}\]
t = 53
(prob = exp( 11.663 - 0.2162 * t)/ ( 1 + exp(11.663 - 0.2162 * t)))
## [1] 0.5509228
model_prob <- function(t){
return( exp( 11.663 - 0.2162 * t) / ( 1 + exp( 11.663 - 0.2162 * t )))
}
(model_prob(51 ) )
## [1] 0.6540297
(model_prob(53 ) )
## [1] 0.5509228
(model_prob(55 ) )
## [1] 0.4432456
library(knitr)
library(tidyverse)
library(kableExtra)
raw_data = data.frame( mission = c(1:23),
temp = c( 53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70, 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81 ) ,
damage = c( 5, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0 , 0 ) )
raw_data %>% mutate( freq = damage / 6.0 , fitted = model_prob(temp)) -> raw_data
Display the raw data from actual missions first.
knitr::kable( raw_data, digits = 4 ) %>% kable_styling( bootstrap_options = c("striped", "hover") )
| mission | temp | damage | freq | fitted |
|---|---|---|---|---|
| 1 | 53 | 5 | 0.8333 | 0.5509 |
| 2 | 57 | 1 | 0.1667 | 0.3406 |
| 3 | 58 | 1 | 0.1667 | 0.2939 |
| 4 | 63 | 1 | 0.1667 | 0.1237 |
| 5 | 66 | 0 | 0.0000 | 0.0687 |
| 6 | 67 | 0 | 0.0000 | 0.0561 |
| 7 | 67 | 0 | 0.0000 | 0.0561 |
| 8 | 67 | 0 | 0.0000 | 0.0561 |
| 9 | 68 | 0 | 0.0000 | 0.0457 |
| 10 | 69 | 0 | 0.0000 | 0.0372 |
| 11 | 70 | 1 | 0.1667 | 0.0301 |
| 12 | 70 | 0 | 0.0000 | 0.0301 |
| 13 | 70 | 1 | 0.1667 | 0.0301 |
| 14 | 70 | 0 | 0.0000 | 0.0301 |
| 15 | 72 | 0 | 0.0000 | 0.0198 |
| 16 | 73 | 0 | 0.0000 | 0.0160 |
| 17 | 75 | 0 | 0.0000 | 0.0104 |
| 18 | 75 | 1 | 0.1667 | 0.0104 |
| 19 | 76 | 0 | 0.0000 | 0.0084 |
| 20 | 76 | 0 | 0.0000 | 0.0084 |
| 21 | 78 | 0 | 0.0000 | 0.0055 |
| 22 | 79 | 0 | 0.0000 | 0.0044 |
| 23 | 81 | 0 | 0.0000 | 0.0029 |
To render the plot, add the points associated with temperatures for which no mission occurred into the data frame.
raw_data = add_row( raw_data, mission = 24, temp = 51, damage = 0, fitted = model_prob(temp), freq = fitted )
raw_data = add_row( raw_data, mission = 25, temp = 55, damage = 0, fitted = model_prob(temp), freq = fitted )
raw_data = add_row( raw_data, mission = 26, temp = 59, damage = 0, fitted = model_prob(temp), freq = fitted )
raw_data = add_row( raw_data, mission = 26, temp = 61, damage = 0, fitted = model_prob(temp), freq = fitted )
raw_data = add_row( raw_data, mission = 27, temp = 65, damage = 0, fitted = model_prob(temp), freq = fitted )
ggplot( data=raw_data, aes( x= temp, y = freq)) + geom_point() + geom_line( aes( x= temp, y = fitted), color = "red") +
ggtitle("Challenger O-ring probability of damage with logistic fitted model in red")