Question 1

Regress Days on Index using simple linear regression, what are the estimates of your fitted regression line?

From the regression summary, the estimated coefficients are \(\hat\beta_{0}\) = -192.98 and \(\hat\beta_{1}\) = 15.296, so the fitted regression line is \(\hat{y} = -192.98 + 15.296x\), where \(x\) is Index and \(\hat{y}\) is the predicted value of Days. The fitted values from this line are listed below.

##    Year Days Index Fitted Values
## 1  1976   91  16.7      62.46546
## 2  1977  105  17.1      68.58401
## 3  1978  106  18.2      85.41001
## 4  1979  108  18.1      83.88038
## 5  1980   88  17.2      70.11365
## 6  1981   91  18.2      85.41001
## 7  1982   58  16.0      51.75801
## 8  1983   82  17.2      70.11365
## 9  1984   81  18.0      82.35074
## 10 1985   65  17.2      70.11365
## 11 1986   61  16.9      65.52474
## 12 1987   48  17.1      68.58401
## 13 1988   61  18.2      85.41001
## 14 1989   43  17.3      71.64328
## 15 1990   33  17.5      74.70256
## 16 1991   36  16.6      60.93583
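The table above can be reproduced with a short script. This is a minimal sketch that assumes the 16 rows shown are the complete data set, so the data frame is re-entered by hand from the printed table; the names `Regression_Data` and `Regression_model` follow the calls that appear elsewhere in this report, and the column name `Fitted` is chosen here for illustration.

```r
# Data transcribed from the printed table (Year, Days, Index).
Regression_Data <- data.frame(
  Year  = 1976:1991,
  Days  = c(91, 105, 106, 108, 88, 91, 58, 82,
            81, 65, 61, 48, 61, 43, 33, 36),
  Index = c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
            18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
)

# Simple linear regression of Days on Index.
Regression_model <- lm(Days ~ Index, data = Regression_Data)

# Estimated intercept and slope.
print(coef(Regression_model))

# Append the fitted values to the data, as in the printed table.
Regression_Data$Fitted <- fitted(Regression_model)
print(Regression_Data)
```

The slope and intercept printed by `coef()` should agree with the estimates quoted above up to rounding.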

Question 2

What is the \(R^{2}\)?

The \(R^{2}\) is 0.1585, so the model explains only 15.85% of the variation in Days.
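As a rough cross-check, and assuming the standard simple-linear-regression identity that links \(R^{2}\) to the overall F-statistic, the value can be recovered from the F-statistic of 2.636 on 1 and 14 degrees of freedom reported in the model summary (here \(F_0\) denotes that statistic and \(n = 16\) is the number of observations):

\[
R^{2} = \frac{F_0}{F_0 + (n - 2)} = \frac{2.636}{2.636 + 14} \approx 0.1585
\]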

Question 3

Test for the significance of the regression at a 0.05 level of significance assuming the response is Normally distributed, what is your conclusion?

\(H_0: \beta_1 = 0\)

\(H_a: \beta_1 \neq 0\)

## 
## Call:
## lm(formula = Days ~ Index, data = Regression_Data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -41.70 -21.54   2.12  18.56  36.42 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -192.984    163.503  -1.180    0.258
## Index         15.296      9.421   1.624    0.127
## 
## Residual standard error: 23.79 on 14 degrees of freedom
## Multiple R-squared:  0.1585, Adjusted R-squared:  0.09835 
## F-statistic: 2.636 on 1 and 14 DF,  p-value: 0.1267

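From the summary, the test statistic is \(t_{0} = 1.624\), and the two-sided critical value is \(t_{critical} = t_{0.025,14} = 2.1448\). Since \(|t_{0}| < t_{critical}\) (equivalently, the p-value of 0.127 exceeds 0.05), we fail to reject the null hypothesis (\(H_0\)): at the 0.05 level there is insufficient evidence of a linear relationship between Days and Index.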
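The comparison can be checked numerically. The sketch below uses only the slope estimate, its standard error, and the residual degrees of freedom read from the summary output; the variable names are illustrative and not taken from the original analysis.

```r
# Values read from the regression summary output.
slope_estimate <- 15.296
slope_se       <- 9.421
df_residual    <- 14
alpha          <- 0.05

t0         <- slope_estimate / slope_se            # observed t statistic
t_critical <- qt(1 - alpha / 2, df = df_residual)  # two-sided critical value
p_value    <- 2 * pt(-abs(t0), df = df_residual)   # two-sided p-value

print(c(t0 = t0, t_critical = t_critical, p_value = p_value))

# TRUE would mean rejecting H0 at level alpha.
reject_H0 <- abs(t0) > t_critical
print(reject_H0)
```

Working from the rounded summary values means the p-value may differ from the printed 0.127 in the last digit or so.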

Question 4

Regardless of whether you conclude that the regression is significant above, make a scatter plot of the data showing the fitted regression line, confidence interval, and prediction interval.

Here is the plot with the fitted line, the confidence interval in green, and the prediction interval in brown.
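One way such a plot could be drawn in base R is sketched below. It is not necessarily the code used for the original figure: the data are re-entered from the table in Question 1, and the grid size, line types, axis labels, and object names `grid`, `conf_band`, and `pred_band` are assumptions made for illustration.

```r
# Data transcribed from the table in Question 1.
Regression_Data <- data.frame(
  Days  = c(91, 105, 106, 108, 88, 91, 58, 82,
            81, 65, 61, 48, 61, 43, 33, 36),
  Index = c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
            18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
)
Regression_model <- lm(Days ~ Index, data = Regression_Data)

# Evaluate both 95% bands on a fine grid of Index values.
grid <- data.frame(Index = seq(min(Regression_Data$Index),
                               max(Regression_Data$Index),
                               length.out = 100))
conf_band <- predict(Regression_model, grid, interval = "confidence", level = 0.95)
pred_band <- predict(Regression_model, grid, interval = "prediction", level = 0.95)

# Scatter plot, fitted line, then the two bands.
plot(Days ~ Index, data = Regression_Data, pch = 19,
     ylim = range(pred_band, Regression_Data$Days),
     xlab = "Index", ylab = "Days")
lines(grid$Index, conf_band[, "fit"], lwd = 2)
matlines(grid$Index, conf_band[, c("lwr", "upr")], col = "green", lty = 2)
matlines(grid$Index, pred_band[, c("lwr", "upr")], col = "brown", lty = 3)
legend("topleft", bty = "n",
       legend = c("Fitted line", "95% confidence interval", "95% prediction interval"),
       col = c("black", "green", "brown"), lty = c(1, 2, 3))
```

The prediction band encloses the confidence band at every grid point, which is why `ylim` is set from `pred_band`.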

Question 5

Calculate a 95% confidence interval on the mean number of days the ozone level exceeds 0.20 ppm when the meteorological index is 17.0. Comment on the meaning of this interval?

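The interval was computed with the following code.

conf_at_17 <- predict(Regression_model, data.frame(Index = 17), interval = "confidence")

The 95% confidence interval is \(52.53 \le E(y \mid x = 17) \le 81.58\). We are 95% confident that the mean number of days on which the ozone level exceeds 0.20 ppm, taken over all years with a meteorological index of 17.0, lies between 52.53 and 81.58.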
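The interval can be checked approximately by hand with the usual confidence-interval formula for the mean response. The notation is assumed here rather than taken from the report: \(\hat\sigma = 23.79\) is the residual standard error from the summary, \(t_{0.025,14} = 2.1448\), and \(\bar{x} \approx 17.344\) and \(S_{xx} = \sum (x_i - \bar{x})^2 \approx 6.379\) are computed from the Index column of the table in Question 1.

\[
\hat{y}_0 \pm t_{0.025,14}\,\hat\sigma\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\approx 67.05 \pm 2.1448 \times 23.79 \times \sqrt{\frac{1}{16} + \frac{(17 - 17.344)^2}{6.379}}
\approx 67.05 \pm 14.5
\]

This gives roughly (52.5, 81.6), which agrees with the reported interval up to rounding of the inputs.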

Question 6

Calculate a 95% prediction interval on the mean number of days the ozone level exceeds 0.20 ppm when the meteorological index is 17.0. Comment on the meaning of this interval? Compare the width of the prediction interval to that of the confidence interval and comment.

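pridiction_at_17 <- predict(Regression_model, data.frame(Index = 17), interval = "prediction")

The 95% prediction interval is \(13.99 \le y_0 \le 120.12\), where \(y_0\) is a new observation of Days at Index = 17.0.

The prediction interval covers a single new value of y (Days) rather than its mean: we are 95% confident that the number of days in one future year with an index of 17.0 falls between 13.99 and 120.12. It is much wider than the confidence interval (a width of 106.13 versus 29.05) because it accounts for the random variation of an individual new response in addition to the uncertainty in the estimated mean.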
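The difference in width can be seen in the standard prediction-interval formula, which adds a 1 under the square root for the variance of the new observation. As an approximate hand check, with notation assumed here rather than taken from the report (\(\hat\sigma = 23.79\) from the summary, \(t_{0.025,14} = 2.1448\), and \(\bar{x} \approx 17.344\), \(S_{xx} \approx 6.379\) computed from the Index column):

\[
\hat{y}_0 \pm t_{0.025,14}\,\hat\sigma\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\approx 67.05 \pm 2.1448 \times 23.79 \times \sqrt{1 + \frac{1}{16} + \frac{(17 - 17.344)^2}{6.379}}
\approx 67.05 \pm 53.1
\]

The two terms describing uncertainty in the estimated mean sum to only about 0.081, so the added 1 dominates and the half-width grows from roughly 14.5 to roughly 53.1.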