This report analyzes the relationship between Index (X) and Days (Y) using a Simple Linear Regression Model. The goal of this analysis is to determine whether Index is a useful predictor of Days and to evaluate the significance and reliability of the relationship.
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2, 18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)  # Index (predictor X)
y <- c(91, 105, 106, 108, 88, 91, 58, 82, 81, 65, 61, 48, 61, 43, 33, 36)  # Days (response Y)
# check that both vectors have 16 values
length(x)
## [1] 16
length(y)
## [1] 16
## x y
## 1 16.7 91
## 2 17.1 105
## 3 18.2 106
## 4 18.1 108
## 5 17.2 88
## 6 18.2 91
## 7 16.0 58
## 8 17.2 82
## 9 18.0 81
## 10 17.2 65
## 11 16.9 61
## 12 17.1 48
## 13 18.2 61
## 14 17.3 43
## 15 17.5 33
## 16 16.6 36
The data set consists of 16 observations of 2 variables, as shown in the table above.
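The printed table above looks like an R data frame built from the two vectors. The following is a minimal sketch under that assumption; the name `dat` is hypothetical and does not appear in the original report.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

# Assumption: the printed table is a data frame of the two vectors;
# `dat` is a hypothetical name
dat <- data.frame(x, y)

dim(dat)  # number of rows (observations) and columns (variables)
str(dat)  # column names and types
```

Checking `dim(dat)` is a quick way to confirm the count of observations and variables quoted in the text.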
Use the plot() function to create a scatter plot of Days against Index.
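The scatter plot could be produced along these lines. This is a sketch only: the axis labels, title, and plotting symbol are illustrative choices, not taken from the original report.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

# Scatter plot of the response (Days) against the predictor (Index);
# labels and symbol are illustrative choices
plot(x, y,
     xlab = "Index", ylab = "Days",
     main = "Days vs. Index", pch = 19)
```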
We fit the model:
\[ Y = \beta_0 + \beta_1 x + \epsilon \]
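The `Call:` lines in the output indicate that the model was fit with `lm(formula = y ~ x)`. A minimal sketch of that step follows; the object name `fit` is an assumption, since the original code is not shown.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

# Least-squares fit of Days on Index; `fit` is a hypothetical name
fit <- lm(y ~ x)

fit           # prints the call and the two estimated coefficients
summary(fit)  # adds standard errors, t tests, R-squared and the F test
```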
##
## Call:
## lm(formula = y ~ x)
##
## Coefficients:
## (Intercept) x
## -193.0 15.3
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.70 -21.54 2.12 18.56 36.42
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -192.984 163.503 -1.180 0.258
## x 15.296 9.421 1.624 0.127
##
## Residual standard error: 23.79 on 14 degrees of freedom
## Multiple R-squared: 0.1585, Adjusted R-squared: 0.09835
## F-statistic: 2.636 on 1 and 14 DF, p-value: 0.1267
Estimated intercept (\(\hat{\beta}_0\)): -192.98
Estimated slope (\(\hat{\beta}_1\)): 15.30
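We test \(H_0: \beta_1 = 0\) vs \(H_a: \beta_1 \neq 0\) using \(t = \hat{\beta}_1 / se(\hat{\beta}_1)\), where \(se(\hat{\beta}_1) = s / \sqrt{S_{xx}}\). The output below shows, in order, the mean of x, the squared deviations \((x_i - \bar{x})^2\), their sum \(S_{xx}\), the standard error of the slope, and the t statistic.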
## [1] 17.34375
## [1] 0.414414063 0.059414062 0.733164062 0.571914063 0.020664063 0.733164062
## [7] 1.805664062 0.020664063 0.430664062 0.020664063 0.196914063 0.059414062
## [13] 0.733164062 0.001914062 0.024414062 0.553164062
## [1] 6.379375
## [1] 9.420995
## [1] 1.624032
At \(\alpha = 0.05\), the critical value \(t_{0.025,\,14}\) is:
## [1] 2.144787
## [1] TRUE
Since \(|t| = 1.62\) is less than 2.145, we fail to reject \(H_0\); therefore the regression is not significant at the 5% level.
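The hand calculation of the slope test can be sketched as follows. The variable names (`Sxx`, `se_b1`, `t_stat`, `t_crit`) are hypothetical, and the sketch uses the unrounded slope from `lm()`, so the t statistic may differ from the printed value in the fourth decimal place.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

fit <- lm(y ~ x)
n   <- length(x)

b1  <- coef(fit)[["x"]]        # estimated slope
s   <- summary(fit)$sigma      # residual standard error
Sxx <- sum((x - mean(x))^2)    # sum of squared deviations of x

se_b1  <- s / sqrt(Sxx)        # standard error of the slope
t_stat <- b1 / se_b1           # test statistic for H0: beta1 = 0

alpha   <- 0.05
t_crit  <- qt(1 - alpha / 2, df = n - 2)  # two-sided critical value
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# TRUE would mean "reject H0"; here |t| is below the critical value
reject <- abs(t_stat) > t_crit
c(t = t_stat, critical = t_crit, p = p_value)
```

Comparing `abs(t_stat)` with `t_crit` and comparing `p_value` with `alpha` are equivalent decision rules and should agree.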
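\[ R^2=\frac{SSR}{SST} \]
The output below lists, in order: the fitted values \(\hat{y}_i\) (which appear to be computed from the rounded coefficients -193.0 and 15.3), the mean \(\bar{y}\), the deviations \(\hat{y}_i - \bar{y}\), their squares, SSR, the deviations \(y_i - \bar{y}\), their squares, SST, and \(R^2\).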
## [1] 62.51 68.63 85.46 83.93 70.16 85.46 51.80 70.16 82.40 70.16 65.57 68.63
## [13] 85.46 71.69 74.75 60.98
## [1] 72.3125
## [1] -9.8025 -3.6825 13.1475 11.6175 -2.1525 13.1475 -20.5125 -2.1525
## [9] 10.0875 -2.1525 -6.7425 -3.6825 13.1475 -0.6225 2.4375 -11.3325
## [1] 96.0890063 13.5608062 172.8567562 134.9663063 4.6332562 172.8567562
## [7] 420.7626562 4.6332562 101.7576563 4.6332562 45.4613063 13.5608062
## [13] 172.8567562 0.3875063 5.9414062 128.4255562
## [1] 1493.383
## [1] 18.6875 32.6875 33.6875 35.6875 15.6875 18.6875 -14.3125 9.6875
## [9] 8.6875 -7.3125 -11.3125 -24.3125 -11.3125 -29.3125 -39.3125 -36.3125
## [1] 349.22266 1068.47266 1134.84766 1273.59766 246.09766 349.22266
## [7] 204.84766 93.84766 75.47266 53.47266 127.97266 591.09766
## [13] 127.97266 859.22266 1545.47266 1318.59766
## [1] 9419.438
## [1] 0.1585427
The value is \(R^2 = 0.1585\), so Index explains only about 15.9% of the variation in Days.
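The sums of squares behind \(R^2\) can be reproduced with a short sketch. It takes the fitted values from the unrounded model via `fitted()`, so SSR differs slightly from a calculation based on rounded coefficients; the names `SSR`, `SSE`, `SST`, and `r2` are hypothetical.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

fit   <- lm(y ~ x)
y_hat <- fitted(fit)   # fitted values from the unrounded coefficients
y_bar <- mean(y)

SSR <- sum((y_hat - y_bar)^2)   # regression sum of squares
SSE <- sum(residuals(fit)^2)    # error (residual) sum of squares
SST <- sum((y - y_bar)^2)       # total sum of squares

r2 <- SSR / SST                 # coefficient of determination
c(SSR = SSR, SSE = SSE, SST = SST, R2 = r2)
```

As a consistency check, `SSR + SSE` should equal `SST`, and `r2` should match `summary(fit)$r.squared`.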
95% confidence intervals for the mean of Days at each observed Index value:
## fit lwr upr
## 1 62.46546 44.24504 80.68589
## 2 68.58401 54.90761 82.26042
## 3 85.41001 63.91295 106.90708
## 4 83.88038 63.97338 103.78737
## 5 70.11365 57.02842 83.19887
## 6 85.41001 63.91295 106.90708
## 7 51.75801 21.75791 81.75811
## 8 70.11365 57.02842 83.19887
## 9 82.35074 63.94915 100.75233
## 10 70.11365 57.02842 83.19887
## 11 65.52474 49.93042 81.11906
## 12 68.58401 54.90761 82.26042
## 13 85.41001 63.91295 106.90708
## 14 71.64328 58.85392 84.43265
## 15 74.70256 61.55896 87.84616
## 16 60.93583 41.22205 80.64961
95% prediction intervals for an individual value of Days at each observed Index value:
## fit lwr upr
## 1 62.46546 8.275374 116.6556
## 2 68.58401 15.748171 121.4199
## 3 85.41001 30.032167 140.7879
## 4 83.88038 29.100176 138.6606
## 5 70.11365 17.427738 122.7996
## 6 85.41001 30.032167 140.7879
## 7 51.75801 -7.441550 110.9576
## 8 70.11365 17.427738 122.7996
## 9 82.35074 28.099467 136.6020
## 10 70.11365 17.427738 122.7996
## 11 65.52474 12.160286 118.8892
## 12 68.58401 15.748171 121.4199
## 13 85.41001 30.032167 140.7879
## 14 71.64328 19.030075 124.2565
## 15 74.70256 22.002119 127.4030
## 16 60.93583 6.225545 115.6461
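The two interval tables above have the `fit`, `lwr`, `upr` layout returned by `predict()` for a linear model. The following sketch shows one way they might have been produced, assuming a 95% level and the observed Index values as the prediction points; the names `new_x`, `conf_int`, and `pred_int` are hypothetical.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

fit <- lm(y ~ x)

# Assumption: intervals are evaluated at the observed Index values
new_x <- data.frame(x = x)

# Interval for the mean response at each x
conf_int <- predict(fit, newdata = new_x,
                    interval = "confidence", level = 0.95)

# Interval for a single future observation at each x (always wider)
pred_int <- predict(fit, newdata = new_x,
                    interval = "prediction", level = 0.95)

head(conf_int, 3)
head(pred_int, 3)
```

The prediction intervals are wider because they include the variability of an individual observation in addition to the uncertainty in the estimated mean.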
| Plot | Purpose |
|---|---|
| Residual vs Fitted | To check linearity |
| Normal Q-Q | To check normality |
| Scale-Location | To check constant variance |
| Residuals vs Leverage | To check influential points |
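The four plots in the table correspond to the default panels drawn by `plot()` on an `lm` object. A minimal sketch, assuming the fitted model is stored under the hypothetical name `fit`:

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

fit <- lm(y ~ x)

# Arrange the four default diagnostic plots in a 2 x 2 grid:
# Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous graphics settings
```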
Three assumptions are checked to assess model adequacy:
Normal errors: use the normal probability (Q-Q) plot of the residuals.
Constant variance: plot the residuals against the fitted values.
Independence: plot the residuals against the time order of collection.
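The three checks above can also be drawn one at a time from the residuals. This is a sketch under one notable assumption: the order of the values in `x` and `y` is taken to be the order of collection, which the original report does not state.

```r
# Index (x) and Days (y) values, copied from the vectors defined earlier
x <- c(16.7, 17.1, 18.2, 18.1, 17.2, 18.2, 16.0, 17.2,
       18.0, 17.2, 16.9, 17.1, 18.2, 17.3, 17.5, 16.6)
y <- c(91, 105, 106, 108, 88, 91, 58, 82,
       81, 65, 61, 48, 61, 43, 33, 36)

fit <- lm(y ~ x)
res <- residuals(fit)

op <- par(mfrow = c(1, 3))

# Normal errors: points should lie close to the reference line
qqnorm(res, main = "Normal Q-Q")
qqline(res)

# Constant variance: spread should look similar across fitted values
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted")
abline(h = 0, lty = 2)

# Independence: no trend or cycle over time
# Assumption: vector order is the order of collection
plot(seq_along(res), res, type = "b", xlab = "Observation order",
     ylab = "Residuals", main = "Residuals vs Order")
abline(h = 0, lty = 2)

par(op)  # restore the previous graphics settings
```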
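In conclusion, the fitted model explains only about 16% of the variation in Days, and the slope is not significant at the 5% level (p = 0.127), so Index alone is not a reliable tool for understanding or forecasting the expected Days.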