Regress Days on Index using simple linear regression. What are the estimates of your fitted regression line?
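# x: meteorological index
x<-c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
# y: number of days the ozone level exceeds 20 ppm
y<-c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)
# simple linear regression of Days (y) on Index (x)
model<-lm(y~x)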
coefficients(model)
## (Intercept) x
## -192.98383 15.29637
--> The estimates are $$\hat{\beta}_{0} = -192.98383$$ $$\hat{\beta}_{1}=15.29637$$ so the fitted regression line is $$\hat{y} = -192.98383 + 15.29637x$$
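As a cross-check, the same estimates can be computed directly from the least-squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). This is a minimal sketch; the names `sxx`, `sxy`, `b0`, and `b1` are introduced here for illustration only.

```r
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)

# Corrected sums of squares and cross-products
sxx <- sum((x - mean(x))^2)
sxy <- sum((x - mean(x)) * (y - mean(y)))

# Least-squares slope and intercept (illustrative names)
b1 <- sxy / sxx
b0 <- mean(y) - b1 * mean(x)

c(b0 = b0, b1 = b1)
```

The two values should agree with `coefficients(model)` up to rounding.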
What is the value of R^2?
summary(model)$r.squared
## [1] 0.1584636
--> The R-squared value is 0.1585, so the fitted line explains only about 15.85% of the variability in y.
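R^2 can also be obtained from its definition, 1 - SSE/SST, where SSE is the residual sum of squares and SST is the total corrected sum of squares. The sketch below assumes the same data as above; `sse`, `sst`, and `r2` are illustrative names.

```r
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)
model <- lm(y ~ x)

sse <- sum(residuals(model)^2)   # residual (error) sum of squares
sst <- sum((y - mean(y))^2)      # total corrected sum of squares
r2  <- 1 - sse / sst             # proportion of variability explained

r2
```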
Test for the significance of the regression at a 0.05 level of significance, assuming the response is Normally distributed. What is your conclusion?
summary(model)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.70 -21.54 2.12 18.56 36.42
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -192.984 163.503 -1.180 0.258
## x 15.296 9.421 1.624 0.127
##
## Residual standard error: 23.79 on 14 degrees of freedom
## Multiple R-squared: 0.1585, Adjusted R-squared: 0.09835
## F-statistic: 2.636 on 1 and 14 DF, p-value: 0.1267
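--> The p-value of the F-test is 0.1267 (the same as the p-value of the t-test on x, since there is only one regressor). Because 0.1267 is greater than 0.05, we fail to reject the null hypothesis $$H_{0}: \beta_{1} = 0$$ At the 0.05 level there is not sufficient evidence of a linear relationship between x and y. The low R^2 (0.1585) is consistent with this conclusion.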
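The decision rule can be made explicit by pulling the F-statistic out of the summary and comparing its p-value with the significance level. This is a sketch; `fstat`, `p_value`, and `alpha` are names chosen here for illustration.

```r
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)
model <- lm(y ~ x)

# F-statistic with its numerator and denominator degrees of freedom
fstat <- summary(model)$fstatistic

# Upper-tail probability of the F distribution gives the p-value
p_value <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))

alpha <- 0.05
p_value          # about 0.1267, matching the summary output
p_value < alpha  # FALSE: the regression is not significant at the 0.05 level
```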
Regardless of whether you conclude that the regression is significant above, make a scatterplot of the data showing the fitted regression line, confidence interval, and prediction interval.
plot(x,y)
abline(model)
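The plot above shows only the data and the fitted line. One way to add the 95% confidence and prediction bands is to evaluate `predict()` over a grid of index values and draw the limits with `lines()`. This is a sketch rather than the only way to do it; `grid`, `ci_band`, and `pi_band` are illustrative names, and the axis labels are assumed from the problem statement.

```r
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)
model <- lm(y ~ x)

# Grid of index values spanning the observed range
grid <- data.frame(x = seq(min(x), max(x), length.out = 100))

# Pointwise 95% limits for the mean response and for a new observation
ci_band <- predict(model, grid, interval = "confidence", level = 0.95)
pi_band <- predict(model, grid, interval = "prediction", level = 0.95)

# Scatterplot with enough vertical room for the prediction band
plot(x, y, xlab = "Index", ylab = "Days",
     ylim = range(y, pi_band[, c("lwr", "upr")]))
abline(model)

# Confidence band (dashed) and prediction band (dotted)
lines(grid$x, ci_band[, "lwr"], lty = 2, col = "blue")
lines(grid$x, ci_band[, "upr"], lty = 2, col = "blue")
lines(grid$x, pi_band[, "lwr"], lty = 3, col = "red")
lines(grid$x, pi_band[, "upr"], lty = 3, col = "red")

legend("topleft",
       legend = c("Fitted line", "95% confidence band", "95% prediction band"),
       lty = c(1, 2, 3), col = c("black", "blue", "red"))
```

The prediction band lies outside the confidence band at every index value, because it also includes the variability of an individual observation.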
Calculate a 95% confidence interval on the mean number of days the ozone level exceeds 20 ppm when the meteorological index is 17.0. Comment on the meaning of this interval. Calculate a 95% prediction interval on the mean number of days the ozone level exceeds 20 ppm when the meteorological index is 17.0. Comment on the meaning of this interval. Compare the width of the prediction interval to that of the confidence interval and comment.
newx<-c(17.0)
#define vector with single point where interval is desired
predict(model,data.frame(x=newx),interval="confidence") #confidence interval
## fit lwr upr
## 1 67.05437 52.52748 81.58127
predict(model,data.frame(x=newx),interval="prediction") #prediction interval
## fit lwr upr
## 1 67.05437 13.99203 120.1167
--> The widths of the two intervals are: $$\text{Width of confidence interval}=81.58127-52.52748=29.05379$$ $$\text{Width of prediction interval}=120.1167-13.99203=106.12467$$
The confidence interval (52.53, 81.58) is a range for the mean response: we are 95% confident that the mean number of days the ozone level exceeds 20 ppm, when the index is 17.0, lies in this interval. The prediction interval (13.99, 120.12) is a range for a single future observation: using the regression model fitted to the data of problem 2.13 in the book, we are 95% confident that the number of days observed in a new season with index 17.0 will fall in this interval.
The prediction interval is much wider than the confidence interval. The confidence interval reflects only the uncertainty in the estimated mean, whereas the prediction interval also includes the variance of an individual observation around that mean. The more dispersion there is in the data, the larger the error variance and the wider the prediction interval.
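The difference in width can be traced to the standard errors behind the two intervals. For the mean response at x0 the variance is MSE * (1/n + (x0 - xbar)^2 / Sxx), while for a new observation it is MSE * (1 + 1/n + (x0 - xbar)^2 / Sxx). The extra 1 is the variance of the individual observation. The sketch below reproduces the widths by hand; all variable names other than `x`, `y`, and `model` are illustrative.

```r
x <- c(16.7,17.1,18.2,18.1,17.2,18.2,16.0,17.2,18.0,17.2,16.9,17.1,18.2,17.3,17.5,16.6)
y <- c(91,105,106,108,88,91,58,82,81,65,61,48,61,43,33,36)
model <- lm(y ~ x)

x0  <- 17.0
n   <- length(x)
mse <- sum(residuals(model)^2) / df.residual(model)  # residual mean square
sxx <- sum((x - mean(x))^2)
t_crit <- qt(0.975, df.residual(model))              # two-sided 95% critical value

# Standard errors for the mean response and for a new observation at x0
se_mean <- sqrt(mse * (1 / n + (x0 - mean(x))^2 / sxx))
se_pred <- sqrt(mse * (1 + 1 / n + (x0 - mean(x))^2 / sxx))

ci_width <- 2 * t_crit * se_mean
pi_width <- 2 * t_crit * se_pred
c(ci_width = ci_width, pi_width = pi_width)
```

The results should match the widths obtained from `predict()`, roughly 29.05 and 106.12.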