a. Regression line equation:
\(\widehat{babyweight} = 120.07 - 1.93 \times parity\)
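b. The predicted birth weight of first-borns is 120.07 ounces, which is 1.93 ounces more than the 118.14 ounces (120.07 - 1.93) predicted for other babies.
c. The p-value is 0.1052, which is greater than 0.05, so the null hypothesis cannot be rejected: the data do not show a statistically significant relationship between birth weight and parity.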
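The two predictions in part (b) can be checked directly from the fitted equation. This is a minimal sketch: `predict_weight` is a hypothetical helper name, and it assumes parity is coded 0 for first-borns and 1 otherwise, as the interpretation above implies.

```r
# Hypothetical helper: predicted birth weight (ounces) from the fitted line
predict_weight <- function(parity) 120.07 - 1.93 * parity

first_born <- predict_weight(0)  # parity = 0 assumed to mean first-born
others <- predict_weight(1)      # parity = 1 assumed to mean not first-born

# Difference between the two groups equals the magnitude of the slope
first_born - others
```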
a. Regression Equation:
\(\widehat{daysabsent} = 18.93 - 9.11 \times eth + 3.10 \times sex + 2.15 \times lrn\)
b.
1. Slope of *eth*: The model predicts that non-aboriginal students miss 9.11 fewer days of school than aboriginal students, all else held constant.
2. Slope of *sex*: The model predicts that male students miss 3.10 more days of school than female students, all else held constant.
3. Slope of *lrn*: The model predicts that slow learners miss 2.15 more days of school than other students, all else held constant.
c.
The residual (observed minus predicted days absent) is -22.18.
\(\widehat{daysabsent} = 18.93 - (9.11 \times 0) + (3.10 \times 1) + (2.15 \times 1)\)
daysabsent <- 18.93 - (9.11 * 0) + (3.10 * 1) + (2.15 * 1)
daysabsent
## [1] 24.18
residual <- 2 - daysabsent
residual
## [1] -22.18
d.
variance_e <- 240.57
variance_y <- 264.17
n <- 146
k <- 3
r2 <- 1 - (variance_e / variance_y)
r2
## [1] 0.08933641
adjR2 <- 1 - ((variance_e / variance_y) * (n - 1) / (n - k - 1))
adjR2
## [1] 0.07009704
a. The model without *lrn* has the highest adjusted R^2, 0.0723, so the learner status variable should be removed first.
a. Shuttle missions launched at lower temperatures had more damaged O-rings.
temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81)
damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
undamaged <- c(1,5,5,5,6,6,6,6,6,6,5,6,5,6,6,6,6,5,6,6,6,6,6)
shuttleMission <- data.frame(temperature, damaged, undamaged)
summary(shuttleMission)
## temperature damaged undamaged
## Min. :53.00 Min. :0.0000 Min. :1.000
## 1st Qu.:67.00 1st Qu.:0.0000 1st Qu.:5.000
## Median :70.00 Median :0.0000 Median :6.000
## Mean :69.57 Mean :0.4783 Mean :5.522
## 3rd Qu.:75.00 3rd Qu.:1.0000 3rd Qu.:6.000
## Max. :81.00 Max. :5.0000 Max. :6.000
plot(shuttleMission)
b. The intercept, 11.6630, is the log-odds of O-ring damage at a temperature of zero degrees. For every one-degree increase in temperature, the log-odds of damage decrease by 0.2162, so damage becomes more likely as the temperature drops.
c.
\(\log_e\left(\frac{p}{1-p}\right) = 11.6630 - 0.2162 \times temperature\)
d. The p-value for the temperature coefficient is low, so temperature is significantly associated with O-ring damage. Because O-rings are critical components of the shuttle, the concerns about damage are justified.
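The coefficients quoted in part (b) can be reproduced, at least approximately, by fitting a binomial model to the mission data. The sketch below assumes each mission had six O-rings (the damaged and undamaged counts always sum to 6) and re-enters the data so that it runs on its own. The names `fit`, `coefs`, and `odds_ratio` are illustrative.

```r
# Mission data re-entered so the block is self-contained
temperature <- c(53,57,58,63,66,67,67,67,68,69,70,70,70,70,72,73,75,75,76,76,78,79,81)
damaged <- c(5,1,1,1,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0)
undamaged <- 6 - damaged  # assumes 6 O-rings per mission

# Binomial logistic regression on (successes, failures) counts
fit <- glm(cbind(damaged, undamaged) ~ temperature, family = binomial)
coefs <- coef(fit)
round(coefs, 4)

# Multiplicative change in the odds of damage per one-degree increase
odds_ratio <- exp(coefs[["temperature"]])
odds_ratio
```

The estimates should land close to 11.6630 and -0.2162. An odds ratio below 1 means the odds of damage shrink as the temperature rises.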
a.
If \(\hat{p}\) is the probability that an O-ring is damaged, then \(\log_e\left(\frac{\hat{p}}{1-\hat{p}}\right) = 11.6630 - 0.2162 \times temperature\)
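The calculations that follow rely on solving the logit equation for \(\hat{p}\). As a short worked step, write \(\eta = 11.6630 - 0.2162 \times temperature\), where \(\eta\) is shorthand introduced here only for the right-hand side:

\(\log_e\left(\frac{\hat{p}}{1-\hat{p}}\right) = \eta \;\Rightarrow\; \frac{\hat{p}}{1-\hat{p}} = e^{\eta} \;\Rightarrow\; \hat{p} = \frac{e^{\eta}}{1+e^{\eta}}\)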
#phat <- exp(11.6630 - 0.2162 * temp) / (1 + exp(11.6630 - 0.2162 * temp))
phat51 <- exp(11.6630 - 0.2162 * 51) / (1 + exp(11.6630 - 0.2162 * 51))
phat53 <- exp(11.6630 - 0.2162 * 53) / (1 + exp(11.6630 - 0.2162 * 53))
phat55 <- exp(11.6630 - 0.2162 * 55) / (1 + exp(11.6630 - 0.2162 * 55))
phat51
## [1] 0.6540297
phat53
## [1] 0.5509228
phat55
## [1] 0.4432456
\(\widehat{p_{51}}\) = 0.6540297
\(\widehat{p_{53}}\) = 0.5509228
\(\widehat{p_{55}}\) = 0.4432456
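The same three probabilities can be computed in one vectorized call. This is a minimal sketch using base R's `plogis()`, the inverse logit, which is a substitute for the explicit `exp()` formula rather than the method used above. `phat`, `temps`, and `probs` are illustrative names.

```r
# Inverse logit of the fitted linear predictor; plogis(x) = exp(x) / (1 + exp(x))
phat <- function(temp) plogis(11.6630 - 0.2162 * temp)

temps <- c(51, 53, 55)
probs <- phat(temps)   # vectorized over all temperatures at once
names(probs) <- temps  # label each probability with its temperature
probs
```

Wrapping the formula in a function avoids retyping the coefficients for each temperature and makes it easy to evaluate a whole range of temperatures.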
b.
library(ggplot2)
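# Each mission has 6 O-rings, so plot the proportion damaged and weight each point by 6 trials.
# The family is passed through method.args; stat_smooth() ignores a bare family argument.
ggplot(shuttleMission, aes(x = temperature, y = damaged / 6)) +
  geom_point() +
  stat_smooth(aes(weight = 6), method = 'glm', method.args = list(family = 'binomial'))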
c.
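Logistic regression assumes that each predictor is linearly related to the logit of the outcome probability, \(\log_e\left(\frac{p}{1-p}\right)\), and that each outcome is independent of the other outcomes.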