Regress Days on Index using simple linear regression. What are the estimates of your fitted regression line?
x<-c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y<-c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)
model<-lm(y~x)
coefficients(model)
## (Intercept) x
## -192.98383 15.29637
--> The estimated coefficients are $$\hat{\beta}_{0} = -192.98383$$ $$\hat{\beta}_{1} = 15.29637$$ so the fitted regression line is $$\hat{y} = -192.984 + 15.296x$$
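As a cross-check on the `lm()` output, the same estimates can be reproduced from the closed-form least-squares formulas. This is a minimal sketch; the names `Sxx`, `Sxy`, `b0` and `b1` are introduced here for illustration and do not appear in the original solution.

```r
# Data as given in the solution above
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)

# Corrected sum of squares of x and corrected sum of cross-products
Sxx <- sum((x - mean(x))^2)
Sxy <- sum((x - mean(x)) * (y - mean(y)))

b1 <- Sxy / Sxx               # slope estimate
b0 <- mean(y) - b1 * mean(x)  # intercept estimate

c(intercept = b0, slope = b1)

# Compare with lm(); this should be TRUE up to floating-point error
all.equal(unname(coef(lm(y ~ x))), c(b0, b1))
```

The hand-computed values should agree with `coefficients(model)`, which confirms that `lm(y~x)` fits the ordinary least-squares line.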
What is the value of R^2?
summary(model)$r.squared
## [1] 0.1584636
--> As shown above, the R-squared value is 0.1585, so the model explains only about 16% of the variability in y.
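To make the meaning of this number concrete, R-squared can be recomputed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes the same data as above; `fit`, `SS_res`, `SS_tot` and `r2` are illustrative names.

```r
# Data as given in the solution above
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)

fit <- lm(y ~ x)

SS_res <- sum(resid(fit)^2)       # variation left unexplained by the line
SS_tot <- sum((y - mean(y))^2)    # total variation in y about its mean

r2 <- 1 - SS_res / SS_tot
r2

# Both comparisons should be TRUE: in simple linear regression
# R-squared is also the squared sample correlation between x and y
all.equal(r2, summary(fit)$r.squared)
all.equal(r2, cor(x, y)^2)
```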
Test for the significance of the regression at a 0.05 level of significance, assuming the response is Normally distributed. What is your conclusion?
summary(model)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.70 -21.54 2.12 18.56 36.42
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -192.984 163.503 -1.180 0.258
## x 15.296 9.421 1.624 0.127
##
## Residual standard error: 23.79 on 14 degrees of freedom
## Multiple R-squared: 0.1585, Adjusted R-squared: 0.09835
## F-statistic: 2.636 on 1 and 14 DF, p-value: 0.1267
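--> The p-value for the regression is 0.1267, which is greater than 0.05, so we fail to reject the null hypothesis that the slope is zero. At the 0.05 level there is not enough evidence that x is linearly related to y. The low R^2 of 0.1585 is consistent with this conclusion.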
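The decision rule can also be applied in code by pulling the F statistic out of the model summary and comparing its p-value with the significance level. This is a minimal sketch on the same data; `fit`, `fstat`, `p_value`, `alpha` and `reject_h0` are illustrative names.

```r
# Data as given in the solution above
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)

fit <- lm(y ~ x)

# summary()$fstatistic is a named vector: value, numdf, dendf
fstat <- summary(fit)$fstatistic

# Upper-tail probability of the F distribution gives the p-value
p_value <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))

alpha <- 0.05
reject_h0 <- p_value < alpha   # TRUE means the regression is significant

p_value
reject_h0

# With a single predictor, the F test and the t test on the slope
# give the same p-value, so this comparison should be TRUE
all.equal(p_value, coef(summary(fit))["x", "Pr(>|t|)"])
```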
Regardless of whether you conclude that the regression is significant above, make a scatterplot of the data showing the fitted regression line, confidence interval, and prediction interval.
min(x)
## [1] 16
max(x)
## [1] 18.2
newx<-seq(16,18.2,.05)
confd<-predict(model,data.frame(x=newx),interval="confidence")
pred<-predict(model,data.frame(x=newx),interval="prediction")
#view confidence and prediction intervals
confd
## fit lwr upr
## 1 51.75801 21.75791 81.75811
## 2 52.52283 23.43393 81.61172
## 3 53.28765 25.10320 81.47209
## 4 54.05246 26.76504 81.33989
## 5 54.81728 28.41869 81.21588
## 6 55.58210 30.06330 81.10090
## 7 56.34692 31.69790 80.99594
## 8 57.11174 33.32139 80.90208
## 9 57.87656 34.93253 80.82058
## 10 58.64137 36.52990 80.75285
## 11 59.40619 38.11188 80.70051
## 12 60.17101 39.67663 80.66539
## 13 60.93583 41.22205 80.64961
## 14 61.70065 42.74576 80.65554
## 15 62.46546 44.24504 80.68589
## 16 63.23028 45.71681 80.74375
## 17 63.99510 47.15763 80.83258
## 18 64.75992 48.56359 80.95625
## 19 65.52474 49.93042 81.11906
## 20 66.28956 51.25340 81.32571
## 21 67.05437 52.52748 81.58127
## 22 67.81919 53.74736 81.89103
## 23 68.58401 54.90761 82.26042
## 24 69.34883 56.00294 82.69472
## 25 70.11365 57.02842 83.19887
## 26 70.87847 57.97983 83.77710
## 27 71.64328 58.85392 84.43265
## 28 72.40810 59.64870 85.16750
## 29 73.17292 60.36362 85.98222
## 30 73.93774 60.99960 86.87588
## 31 74.70256 61.55896 87.84616
## 32 75.46738 62.04522 88.88953
## 33 76.23219 62.46281 90.00157
## 34 76.99701 62.81679 91.17723
## 35 77.76183 63.11249 92.41117
## 36 78.52665 63.35533 93.69796
## 37 79.29147 63.55057 95.03237
## 38 80.05628 63.70317 96.40940
## 39 80.82110 63.81774 97.82447
## 40 81.58592 63.89847 99.27337
## 41 82.35074 63.94915 100.75233
## 42 83.11556 63.97312 102.25799
## 43 83.88038 63.97338 103.78737
## 44 84.64519 63.95255 105.33784
## 45 85.41001 63.91295 106.90708
pred
## fit lwr upr
## 1 51.75801 -7.4415504 110.9576
## 2 52.52283 -6.2202208 111.2659
## 3 53.28765 -5.0128258 111.5881
## 4 54.05246 -3.8196852 111.9246
## 5 54.81728 -2.6411177 112.2757
## 6 55.58210 -1.4774404 112.6416
## 7 56.34692 -0.3289677 113.0228
## 8 57.11174 0.8039897 113.4195
## 9 57.87656 1.9211255 113.8320
## 10 58.64137 3.0221391 114.2606
## 11 59.40619 4.1067363 114.7056
## 12 60.17101 5.1746310 115.1674
## 13 60.93583 6.2255454 115.6461
## 14 61.70065 7.2592119 116.1421
## 15 62.46546 8.2753737 116.6556
## 16 63.23028 9.2737863 117.1868
## 17 63.99510 10.2542182 117.7360
## 18 64.75992 11.2164521 118.3034
## 19 65.52474 12.1602862 118.8892
## 20 66.28956 13.0855346 119.4936
## 21 67.05437 13.9920288 120.1167
## 22 67.81919 14.8796182 120.7588
## 23 68.58401 15.7481710 121.4199
## 24 69.34883 16.5975750 122.1001
## 25 70.11365 17.4277379 122.7996
## 26 70.87847 18.2385881 123.5183
## 27 71.64328 19.0300748 124.2565
## 28 72.40810 19.8021686 125.0140
## 29 73.17292 20.5548615 125.7910
## 30 73.93774 21.2881669 126.5873
## 31 74.70256 22.0021195 127.4030
## 32 75.46738 22.6967753 128.2380
## 33 76.23219 23.3722112 129.0922
## 34 76.99701 24.0285244 129.9655
## 35 77.76183 24.6658322 130.8578
## 36 78.52665 25.2842711 131.7690
## 37 79.29147 25.8839964 132.6989
## 38 80.05628 26.4651808 133.6474
## 39 80.82110 27.0280144 134.6142
## 40 81.58592 27.5727029 135.5991
## 41 82.35074 28.0994673 136.6020
## 42 83.11556 28.6085424 137.6226
## 43 83.88038 29.1001759 138.6606
## 44 84.64519 29.5746275 139.7158
## 45 85.41001 30.0321673 140.7879
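#plot the data, the fitted line, the confidence band and the prediction band
#ylim is widened so the prediction band is not clipped by the plot region
plot(x,y,ylim=range(y,pred))
abline(model)
#in confd and pred, column 2 is the lower limit (lwr) and column 3 the upper limit (upr)
#confidence band: dashed lines
lines(newx,confd[,2],lty=2)
lines(newx,confd[,3],lty=2)
#prediction band: dotted lines
lines(newx,pred[,2],lty=3)
lines(newx,pred[,3],lty=3)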
Calculate a 95% confidence interval on the mean number of days the ozone level exceeds 20ppm when the meteorological index is 17.0. Comment on the meaning of this interval. Calculate a 95% prediction interval on the mean number of days the ozone level exceeds 20ppm when the meteorological index is 17.0. Comment on the meaning of this interval. Compare the width of the prediction interval to that of the confidence interval and comment.
newx<-c(17.0)
#define vector with single point where interval is desired
predict(model,data.frame(x=newx),interval="confidence") #confidence interval
## fit lwr upr
## 1 67.05437 52.52748 81.58127
predict(model,data.frame(x=newx),interval="prediction") #prediction interval
## fit lwr upr
## 1 67.05437 13.99203 120.1167
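--> The widths of the two intervals are: $$\text{Width of confidence interval}=81.58127-52.52748=29.05379$$ $$\text{Width of prediction interval}=120.1167-13.99203=106.12467$$ The confidence interval means that we are 95% confident that the mean number of days the ozone level exceeds 20ppm, over all seasons with a meteorological index of 17.0, lies between 52.53 and 81.58. The prediction interval means that we are 95% confident that the number of days in a single future season with an index of 17.0 will fall between 13.99 and 120.12. The prediction interval is much wider than the confidence interval. Using the regression model fitted to the data of problem 2.13 from the book, the confidence interval only reflects the uncertainty in the estimated mean response, whereas the prediction interval must also cover where the next individual observation may fall. In terms of variance, a new observation scatters around the mean with the error variance, so the prediction interval adds that variance to the variance of the estimated mean. The more dispersed the data are around the fitted line (here the residual standard error is 23.79), the larger this extra term, and this is why the prediction interval covers a wider range of values.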
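The difference in width can be traced to the standard errors behind the two intervals. The sketch below rebuilds both intervals at an index of 17.0 from the standard textbook formulas for simple linear regression and checks them against `predict()`. Names such as `x0`, `se_mean`, `se_pred`, `conf_int` and `pred_int` are illustrative and not part of the original solution.

```r
# Data as given in the solution above
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)

fit <- lm(y ~ x)
n   <- length(x)
x0  <- 17.0                              # point where the intervals are wanted

Sxx   <- sum((x - mean(x))^2)
s     <- summary(fit)$sigma              # residual standard error
tcrit <- qt(0.975, df = n - 2)           # two-sided 95% critical value

yhat0 <- unname(coef(fit)[1] + coef(fit)[2] * x0)   # fitted value at x0

# Standard error of the estimated mean response at x0
se_mean <- s * sqrt(1 / n + (x0 - mean(x))^2 / Sxx)
# Standard error for a new observation: the extra "1 +" is the error variance
se_pred <- s * sqrt(1 + 1 / n + (x0 - mean(x))^2 / Sxx)

conf_int <- yhat0 + c(-1, 1) * tcrit * se_mean
pred_int <- yhat0 + c(-1, 1) * tcrit * se_pred

conf_int
pred_int

# Both comparisons should be TRUE
new_data <- data.frame(x = x0)
all.equal(conf_int, unname(predict(fit, new_data, interval = "confidence")[1, 2:3]))
all.equal(pred_int, unname(predict(fit, new_data, interval = "prediction")[1, 2:3]))
```

The only difference between `se_mean` and `se_pred` is the added 1 under the square root, which is the contribution of the error variance of a single new observation.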