library(ggplot2)
library(tidyverse)
## -- Attaching packages ----------------------------------------------------------------- tidyverse 1.2.1 --
## v tibble 1.4.1 v purrr 0.2.4
## v tidyr 0.8.0 v dplyr 0.7.4
## v readr 1.1.1 v stringr 1.2.0
## v tibble 1.4.1 v forcats 0.2.0
## -- Conflicts -------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
8.2
A
- baby_weight = 120.07 - 1.93(parity_1)
B
- By this model, first-born children are predicted to be 1.93 heavier than children who are not first born.
- First born = 120.07
- Not first born = 120.07 - 1.93(1) = 118.14
paste(120.07 - 1.93)
## [1] "118.14"
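Both predictions can be reproduced with a small helper. This is only a sketch: it assumes parity is coded 0 for first-born babies and 1 otherwise, as the answers above imply, and `predict_weight` is a name made up for illustration.

```r
# Hypothetical helper: predicted birth weight from the fitted line
# baby_weight = 120.07 - 1.93 * parity
# Assumed coding: parity = 0 (first born), parity = 1 (not first born)
predict_weight <- function(parity) {
  120.07 - 1.93 * parity
}

# Evaluate the line for both groups at once
predict_weight(c(first_born = 0, not_first_born = 1))
```

The difference between the two predictions is the slope itself, which is why the slope can be read directly as the estimated gap between the groups.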
C
- The p-value is above 0.05, so there is no statistically significant relationship between birth weight and parity.
8.4
a
- ABSENTEE = 18.93 - 9.11(eth) + 3.1(sex) + 2.15(lrn)
b
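- Holding the other variables constant, Aboriginal children are predicted to miss 9.11 more days than non-Aboriginal children, male children 3.1 more days than female children, and slow learners 2.15 more days than children who are not slow learners. Only the ethnicity effect (Aboriginal or non-Aboriginal) is statistically significant.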
c
- residual = actual - predicted
- -22.18 (observed 2 days, predicted 24.18)
18.93 - 9.11*(0) + 3.1*(1) + 2.15*(1)
## [1] 24.18
paste("residual =", 2 - 24.18)
## [1] "residual = -22.18"
d
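- Variance of the residuals = 240.57; variance of the outcome = 264
R2 = 1 - (variability in residuals) / (variability in the outcome) = 1 - Var(e_i) / Var(y_i)
adjusted R2 = 1 - [Var(e_i) / Var(y_i)] * [(n - 1) / (n - k - 1)]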
rsquared <- 1 - 240.57/264
adjusted_r <- 1 - (240.57/264) * (145/142)
paste( "rsquared is ",rsquared," and adjusted r squared is ",adjusted_r )
## [1] "rsquared is 0.08875 and adjusted r squared is 0.0694982394366197"
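The same calculation can be written with the sample size and number of predictors made explicit. In this sketch, `r_squared` is a hypothetical helper, and n = 146 and k = 3 are inferred from the 145/142 factor in the code above rather than stated in the answer.

```r
# Hypothetical helper: R-squared and adjusted R-squared from two variances.
# n = number of observations, k = number of predictors in the model.
r_squared <- function(var_resid, var_outcome, n, k) {
  ratio <- var_resid / var_outcome
  c(r2     = 1 - ratio,
    adj_r2 = 1 - ratio * (n - 1) / (n - k - 1))
}

# Assumption: n = 146 and k = 3, which gives (n - 1) / (n - k - 1) = 145 / 142
r_squared(var_resid = 240.57, var_outcome = 264, n = 146, k = 3)
```

Because (n - 1) / (n - k - 1) is always greater than 1 when k is at least 1, the adjusted value is always below the unadjusted one.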
8.8
- The table shows the adjusted R2 of the model after dropping each variable in turn. If dropping a variable raises the adjusted R2 above the full model's 0.0701, that variable should be removed. In this case we would drop learner status, because the model without it has an adjusted R2 of 0.0723.
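The decision rule for one backward-elimination step can be sketched as follows. Only the two adjusted R2 values quoted in the answer are used; the remaining rows of the exercise table are not reproduced here, and the variable names are chosen for illustration.

```r
# Adjusted R-squared of the full model, as quoted in the answer
full_adj_r2 <- 0.0701

# Adjusted R-squared after dropping each variable. Only the row mentioned in
# the answer is listed; other rows would be added to this vector the same way.
dropped_adj_r2 <- c(learner_status = 0.0723)

# One backward-elimination step: find the variable whose removal gives the
# highest adjusted R-squared, and drop it only if that beats the full model.
best <- which.max(dropped_adj_r2)
if (dropped_adj_r2[[best]] > full_adj_r2) {
  variable_to_drop <- names(dropped_adj_r2)[best]
} else {
  variable_to_drop <- NA_character_
}
variable_to_drop
```

In a full backward elimination the step would be repeated on the reduced model until no removal improves the adjusted R2.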
8.16
temp <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81)
damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
not_damaged <-c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6)
# tibble() builds the data frame directly; as_data_frame() is deprecated
shuttles <- tibble(temp, damaged, not_damaged)
ggplot(shuttles, aes(temp, damaged)) +
  geom_point()
A
- Lower temperatures appear to be associated with a higher risk of damaged O-rings.
B
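- The intercept is on the log-odds scale: at a temperature of 0 the model gives log-odds of damage of 11.663, a probability of essentially 1. This is an extrapolation far outside the observed temperatures and is not of practical value.
- Each 1-degree increase in temperature lowers the log-odds of an O-ring being damaged by 0.2162, which multiplies the odds of damage by exp(-0.2162), about 0.81.
- The p-value of approximately 0 indicates that temperature is a statistically significant predictor of damage to O-rings.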
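Since the coefficients of a logistic model live on the log-odds scale, it can help to convert them before interpreting them. The short sketch below uses the two coefficients from this exercise; the variable names are chosen for illustration.

```r
b0 <- 11.6630   # intercept: log-odds of damage at temp = 0
b1 <- -0.2162   # change in log-odds per one-degree increase in temperature

# Exponentiating the slope gives the multiplicative change in the odds
odds_ratio <- exp(b1)

# plogis() is base R's inverse logit: it turns log-odds into a probability
p_at_zero <- plogis(b0)

round(c(odds_ratio = odds_ratio, p_at_zero = p_at_zero), 4)
```

An odds ratio below 1 means the odds of damage shrink as temperature rises, which matches the negative slope.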
C
- log(p / (1 - p)) = 11.6630 - 0.2162(temp)
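A model of this form could be fitted with `glm()`. The sketch below re-enters the data from the start of this exercise so that it runs by itself. Whether it reproduces the coefficients above exactly depends on these vectors matching the exercise table, so no output is shown.

```r
temp <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81)
damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
# Six O-rings per mission, consistent with the not_damaged vector above
not_damaged <- 6 - damaged

# Binomial logistic regression: the response is a two-column matrix of
# (damaged, not damaged) counts for each mission
fit <- glm(cbind(damaged, not_damaged) ~ temp, family = binomial)

# Coefficients are on the log-odds scale
coef(fit)
```

Passing counts as a two-column response lets each mission contribute six trials instead of one.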
D
- This model does not tell us anything about the failure rates of missions; it only lets us evaluate whether temperature affects the condition of the O-rings. The p-value lets us conclude that temperature helps explain damage to O-rings, but the concerns regarding the effect of O-rings on missions cannot be evaluated from this model.
8.18
A
- 0.6540297 0.5509228 0.4432456
temps<- c(51,53,55)
damaged_prob <- exp(11.6630-0.2162*temps)/(1+exp(11.6630-0.2162*temps))
damaged_prob
## [1] 0.6540297 0.5509228 0.4432456
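The hand-written inverse logit above is equivalent to base R's `plogis()`. The following sketch checks that the two agree for the same three temperatures; the names `manual` and `builtin` are illustrative.

```r
temps <- c(51, 53, 55)
log_odds <- 11.6630 - 0.2162 * temps

# Formula written out by hand, as in the answer
manual <- exp(log_odds) / (1 + exp(log_odds))

# Base R inverse logit
builtin <- plogis(log_odds)

# TRUE when the two vectors agree up to numerical tolerance
all.equal(manual, builtin)
```

`plogis()` is also more robust for very large log-odds, where `exp()` in the hand-written version can overflow.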
B
temps_2 <- 51:71
length(temps_2)
## [1] 21
damaged_prob <- exp(11.6630 - 0.2162*temps_2) / (1 + exp(11.6630 - 0.2162*temps_2))
shuttles_2 <- tibble(temps_2, damaged_prob)
# temperature on the x-axis, model-estimated probability on the y-axis
ggplot(shuttles_2, aes(temps_2, damaged_prob)) +
  geom_smooth()
## `geom_smooth()` using method = 'loess'
# p_vals <- c(.341,.251,.179,.124,.084,.056,.037,.024)
C
There are two key conditions for fitting a logistic regression model:
1. Each predictor x_i is linearly related to logit(p_i) if all other predictors are held constant.
2. Each outcome Y_i is independent of the other outcomes.
- Condition two is clearly met and I believe condition one is met