require("ggplot2")
## Loading required package: ggplot2
Weight = 120.07 - 1.93(Parity)
The slope of -1.93 means that a baby who is not a first born is predicted to weigh, on average, 1.93 ounces less than a first born.
According to this model, a first born baby weighs, on average, 120.07 ounces.
According to this model, a non-first-born baby weighs, on average, 118.14 ounces (120.07 - 1.93).
Because the p-value is greater than .05 and the t statistic lies between -2 and 2, I would say there is not a statistically significant relationship between average birth weight and parity.
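The two group means quoted above follow directly from plugging Parity = 0 and Parity = 1 into the fitted equation. A minimal sketch, assuming Parity is coded 0 for a first born and 1 otherwise (the helper name `predict_weight` is hypothetical, not part of the original analysis):

```r
# Coefficients as reported in the fitted equation above
b0 <- 120.07   # intercept: mean weight (oz) when Parity = 0 (first born)
b1 <- -1.93    # slope: change in mean weight when Parity = 1

# Hypothetical helper: predicted mean weight for a given parity code
predict_weight <- function(parity) b0 + b1 * parity

first_born     <- predict_weight(0)
not_first_born <- predict_weight(1)
c(first_born = first_born, not_first_born = not_first_born)
```

The difference between the two predictions is exactly the slope, which is why the slope can be read as a difference in group means.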
Days = 18.93 -9.11(eth) + 3.10(sex) + 2.15(lrn)
According to this model, holding sex and learner status constant, a student who is not aboriginal is absent 9.11 fewer days on average.
According to this model, holding ethnicity and learner status constant, a male student is absent 3.10 more days on average than a female student.
According to this model, holding ethnicity and sex constant, a slow learner is absent 2.15 more days on average.
Days(predicted) = 18.93 -9.11(0) + 3.10(1) + 2.15(1)
Days(predicted) = 24.18
Days(actual) = 2
residual = Days(actual) - Days(predicted) = 2 - 24.18 = -22.18
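The prediction and residual can be checked in R. This is a sketch under the assumption that the indicators are coded eth = 0 (aboriginal), sex = 1 (male), lrn = 1 (slow learner), as in the substitution above; the vector names are hypothetical. It uses the usual convention residual = observed - predicted, so a negative value means the model overestimated.

```r
# Coefficients from the fitted equation for days absent
coefs <- c(intercept = 18.93, eth = -9.11, sex = 3.10, lrn = 2.15)

# Indicator values for this student (assumed coding, see lead-in)
student <- c(intercept = 1, eth = 0, sex = 1, lrn = 1)

predicted <- sum(coefs * student)   # 18.93 + 3.10 + 2.15
observed  <- 2
residual  <- observed - predicted   # observed minus predicted

c(predicted = predicted, residual = residual)
```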
R2 = 1 - (240.57)/(264.17)
R2
## [1] 0.08933641
AdjR2 = 1 - (240.57/264.17)*((146-1)/(146-3-1))
AdjR2
## [1] 0.07009704
Confidence interval at the 95% level: alpha = .05, t = 2.05
CI = (b1 - t * se, b1 + t * se)
Lower = .34 - 2.05 * .13
Lower
## [1] 0.0735
Upper = .34 + 2.05 * .13
Upper
## [1] 0.6065
CI = (0.0735, 0.6065)
We are 95% confident that the true coefficient of height lies between .0735 and .6065.
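Rather than reading the critical value from a table, it can be computed with `qt()`. The sketch below assumes 28 residual degrees of freedom (for example, 31 observations and two predictors); that figure is not stated in the original and should be replaced with the model's actual value.

```r
b1 <- 0.34   # estimated coefficient of height
se <- 0.13   # its standard error
df <- 28     # ASSUMPTION: residual degrees of freedom (n - k - 1)

# Two-sided 95% critical value from the t distribution
t_star <- qt(0.975, df)

# Interval: estimate plus or minus t* times the standard error
ci <- b1 + c(-1, 1) * t_star * se
ci
```

With these inputs the critical value rounds to 2.05, so the interval agrees with the hand calculation to about three decimal places.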
Volume(predicted) = -57.99 + .34(height) + 4.71(diameter)
Volume = -57.99 + .34*(79) + 4.71*(11.3)
Volume
## [1] 22.093
Volume(predicted) = 22.093
Volume(actual) = 24.2
The model slightly underestimated the volume of this tree, by 2.107 (residual = 24.2 - 22.093 = 2.107).
The learner status variable should be removed because we get a better adjusted R2 value.
Ethnicity should be added to the model first due to the adjusted R2 value and the p-value below .05, meaning it is statistically significant.
I think the p-value approach would be better for selecting variables here. The p-value will show if the variable is statistically significant in the model.
Looking at the plots, the residuals seem mostly normal, and follow the regression line very well. It seems like the regression should have a strong R2 value. There are just a few lower values that do not follow the regression line. It is also good to note that there does not seem to be correlation between the residuals and fitted values. The regression model looks appropriate for these data.
Temperature1 = 51
Damage1 = 11.6630 - 0.2162 * Temperature1
P1 = exp(Damage1) / (1 + exp(Damage1))
P1
## [1] 0.6540297
Temperature2 = 53
Damage2 = 11.6630 - 0.2162 * Temperature2
P2 = exp(Damage2) / (1 + exp(Damage2))
P2
## [1] 0.5509228
Temperature3 = 55
Damage3 = 11.6630 - 0.2162 * Temperature3
P3 = exp(Damage3) / (1 + exp(Damage3))
P3
## [1] 0.4432456
Temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81)
Damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
Undamaged <- c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6)
Challenger <- data.frame(Temperature, Damaged, Undamaged)
library(ggplot2)
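# Plot the proportion of damaged O-rings per launch and overlay a logistic fit.
# family must be passed through method.args; weight gives the number of O-rings per launch.
ggplot(Challenger, aes(x = Temperature, y = Damaged / (Damaged + Undamaged), weight = Damaged + Undamaged)) + geom_point() +
stat_smooth(method = "glm", method.args = list(family = "binomial"))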
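The coefficients used in the hand calculations can be compared against a model fitted to these data. A minimal sketch, assuming the two-column `cbind(successes, failures)` response form of `glm()`; the data vectors are repeated so the block runs on its own, and the object name `fit` is hypothetical.

```r
# Challenger O-ring data, as entered above
Temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81)
Damaged     <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
Undamaged   <- c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6)

# Binomial GLM: each launch contributes 6 O-rings (damaged vs. undamaged)
fit <- glm(cbind(Damaged, Undamaged) ~ Temperature, family = binomial)

# Estimated intercept and slope on the log-odds scale
coef(fit)
```

If the fit matches the reported model, the estimates should be close to 11.6630 and -0.2162.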
Temp <- seq(from = 51, to = 71, by = 2)
Prob <- c(P1, P2, P3, 0.341, 0.251, 0.179, 0.124, 0.084, 0.056, 0.037, 0.024)
plot(Temp, Prob, type = "o", col = "red")
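The probabilities typed in by hand above can also be generated in one step. This sketch uses `plogis()`, base R's inverse logit, which computes exp(x) / (1 + exp(x)); the names `b0`, `b1`, and `Prob_calc` are hypothetical.

```r
# Reported logistic regression coefficients (log-odds scale)
b0 <- 11.6630
b1 <- -0.2162

Temp <- seq(from = 51, to = 71, by = 2)

# plogis(x) is the inverse logit: exp(x) / (1 + exp(x))
Prob_calc <- plogis(b0 + b1 * Temp)
round(Prob_calc, 3)
```

Computing the whole vector this way avoids transcription errors and makes it easy to change the temperature grid.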
In order to apply logistic regression in this application, two conditions must be met. The first condition is:
Each predictor xi must be linearly related to logit(pi) if all other predictors are held constant.
This condition is tough to verify.
The second condition is:
Each outcome Yi is independent of the other outcomes.
Each launch should be independent of the others, so this condition appears to be met.