Questions

8.2

a.) Equation: \(\hat{y} = 120.07 - 1.93 * parity\)
b.) The slope of -1.93 means that for each one-unit increase in parity, the predicted birth weight decreases by 1.93 ounces.

library(knitr)
library(tidyverse)
birth_weight <- function(parity) 120.07 - 1.93 * parity
kable(tibble(child = 0:3, weight = map_dbl(0:3, birth_weight)))

child	weight
0	120.07
1	118.14
2	116.21
3	114.28
c.) The p-value for parity is 0.1052, which is greater than 0.05, so we cannot conclude that there is a statistically significant relationship between birth weight and parity.

8.4

a.) Equation: \(\hat{y} = 18.93 - 9.11 * eth + 3.10 * sex + 2.15 * lrn\)
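b.) Holding the other variables constant:
eth: the predicted number of days absent decreases by 9.11 when eth increases by 1 (goes from aboriginal to not aboriginal).
sex: the predicted number of days absent increases by 3.10 when sex increases by 1 (goes from female to male).
lrn: the predicted number of days absent increases by 2.15 when lrn increases by 1 (goes from average learner to slow learner).
c.) Residual for a student with eth = 0, sex = 1, and lrn = 1 who was absent 2 days: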

days_absent <- function(eth, sex, lrn) 18.93 - 9.11 * eth + 3.1 * sex + 2.15 * lrn
actual <- 2
prediction <- days_absent(0, 1, 1)

residual <- (actual - prediction) %>% print

## [1] -22.18

d.) \(R^2\) and adjusted \(R^2\), with n students and k predictors:

n <- 146
k <- 3

varResidual <- 240.57
varStudents <- 264.17

r2 <- 1 - (varResidual/varStudents)
adjR2 <- 1 - (varResidual/varStudents) * ((n - 1)/(n - k - 1))

\(R^2 = 0.0893364\)

\(R^2_{adj} = 0.070097\)
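For reference, the two quantities computed above follow these formulas, where \(Var(e_i)\) is the variance of the residuals (varResidual), \(Var(y_i)\) is the variance of days absent across all students (varStudents), n is the number of students, and k is the number of predictors:

\(R^2 = 1 - \frac{Var(e_i)}{Var(y_i)} = 1 - \frac{240.57}{264.17} \approx 0.0893\)

\(R^2_{adj} = 1 - \frac{Var(e_i)}{Var(y_i)} * \frac{n - 1}{n - k - 1} = 1 - \frac{240.57}{264.17} * \frac{145}{142} \approx 0.0701\)

The adjustment factor \(\frac{n - 1}{n - k - 1}\) is always greater than 1 when k is at least 1, so the adjusted value is always lower than the unadjusted one.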

8.8

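The lrn variable should be removed from the model first because the model without lrn has the highest adjusted R2 of the candidate models, and that value is higher than the adjusted R2 of the full model.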
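One step of backward elimination by adjusted R2 can be sketched as follows. This is only an illustration under stated assumptions: the absenteeism data is not reproduced here, so the built-in mtcars data set stands in for it, and the helper name adj_r2 and the choice of predictors are hypothetical.

```r
# Illustration only: mtcars stands in for the absenteeism data.
# adj_r2() is a hypothetical helper, not part of any package.
adj_r2 <- function(predictors, response, data) {
  # reformulate() builds the formula response ~ p1 + p2 + ...
  fit <- lm(reformulate(predictors, response = response), data = data)
  summary(fit)$adj.r.squared
}

predictors <- c("wt", "hp", "qsec")
full <- adj_r2(predictors, "mpg", mtcars)

# Adjusted R^2 of each model with exactly one predictor removed
dropped <- sapply(predictors, function(p) {
  adj_r2(setdiff(predictors, p), "mpg", mtcars)
})

# Candidate for removal: the variable whose removal leaves the highest adjusted R^2
candidate <- names(which.max(dropped))

# Remove it only if the reduced model beats the full model
remove_it <- dropped[[candidate]] > full
```

If remove_it is FALSE, no single removal improves on the full model and the elimination stops; otherwise the step is repeated on the reduced model.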

8.16

a.) As temperature increases, the number of damaged rings seems to decrease
b.) The key components of the summary table are:
intercept: 11.6630 is the predicted log-odds of a damaged O-ring when the temperature is zero.
slope: -0.2162 means that each one-degree increase in temperature lowers the log-odds of a damaged O-ring by 0.2162.
z value/p-value: these indicate significance. The p-value for temperature is close to 0, so temperature is a statistically significant predictor of damage.
c.) \(\log_{e}\left( \frac{p_i}{1 - p_i} \right) = 11.6630 - 0.2162 * temperature\)
d.) Based on the model, the probability of a damaged O-ring is high at temperatures under 50 degrees. Since O-rings are critical components, the concerns are justified.
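Because the coefficients are on the log-odds scale, it can help to convert them to odds and probabilities. The sketch below uses only the two reported coefficients; the variable names b0, b1, odds_ratio, and p_50 are introduced here for illustration, and 50 degrees is chosen to match the answer to part d.

```r
# Coefficients as reported in the summary table
b0 <- 11.6630
b1 <- -0.2162

# exp(slope) is the multiplicative change in the odds of damage
# for each one-degree increase in temperature
odds_ratio <- exp(b1)

# plogis() is base R's inverse logit: exp(x) / (1 + exp(x))
p_50 <- plogis(b0 + b1 * 50)

round(c(odds_ratio = odds_ratio, p_50 = p_50), 3)
```

Under this model the odds of damage shrink by a factor of roughly 0.81 per degree, and the predicted probability of damage at 50 degrees is about 0.70.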

8.18

prob_dmg <- function(temp) {
    dmg_o <- 11.663 - 0.2162 * temp
    p <- exp(dmg_o)/(1 + exp(dmg_o))
    return(round(p, 3))
}
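The function above first computes the log-odds and then inverts the logit. Writing \(\eta\) (a symbol introduced here for the linear predictor, dmg_o in the code) for \(11.6630 - 0.2162 * temperature\), the inversion is:

\(\log_{e}\left( \frac{p}{1 - p} \right) = \eta \implies \frac{p}{1 - p} = e^{\eta} \implies p = \frac{e^{\eta}}{1 + e^{\eta}}\)

For example, at 51 degrees \(\eta = 11.663 - 0.2162 * 51 = 0.6368\), so \(p = \frac{e^{0.6368}}{1 + e^{0.6368}} \approx 0.654\).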

# Predicted probability of a damaged O-ring at each temperature
temps <- seq(from = 51, to = 71, by = 2)
probs <- prob_dmg(temps)
df <- tibble(temps, probs)
kable(df)

temps	probs
51	0.654
53	0.551
55	0.443
57	0.341
59	0.251
61	0.179
63	0.124
65	0.084
67	0.056
69	0.037
71	0.024

# Connect the predicted probabilities; a straight "lm" fit would not follow the logistic curve
ggplot(df, aes(x = temps, y = probs)) + geom_point() + geom_line()

c.) For a logistic regression, each predictor \(x_i\) should be linearly related to \(\text{logit}(p_i)\) when all other predictors are held constant, and each outcome \(Y_i\) should be independent of the other outcomes. Based on this, the conditions appear to be met.

Data 606 - HW8

Baron Curtin

2018-05-04
