Part A

Regress \(y\) on all four predictor variables
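A minimal sketch of the fit, assuming the data frame `FA11_Data` and the variable names shown in the output below (the model object name `fit` is an assumption):

```r
# Fit the full model with all four predictors
fit <- lm(y ~ x_1 + x_2 + x_3 + x_4, data = FA11_Data)
summary(fit)
```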

## 
## Call:
## lm(formula = y ~ x_1 + x_2 + x_3 + x_4, data = FA11_Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1750 -1.6709  0.2508  1.3783  3.9254 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  62.4054    70.0710   0.891   0.3991  
## x_1           1.5511     0.7448   2.083   0.0708 .
## x_2           0.5102     0.7238   0.705   0.5009  
## x_3           0.1019     0.7547   0.135   0.8959  
## x_4          -0.1441     0.7091  -0.203   0.8441  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.446 on 8 degrees of freedom
## Multiple R-squared:  0.9824, Adjusted R-squared:  0.9736 
## F-statistic: 111.5 on 4 and 8 DF,  p-value: 4.756e-07

Looking at the F-statistic p-value (4.756e-07), the overall model is a good fit to the data.

The individual t-test p-values are not significant, except for \(x_1\), which is marginally significant (p = 0.0708). This suggests the predictors may be collinear, so we inspect whether multicollinearity is causing problems in the model.

Part B

Determine the VIF for each predictor in the model

##       x_1       x_2       x_3       x_4 
##  38.49621 254.42317  46.86839 282.51286
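The VIF for predictor \(j\) is \(\mathrm{VIF}_j = 1/(1 - R_j^2)\), where \(R_j^2\) is obtained by regressing \(x_j\) on the remaining predictors. A minimal sketch of how the values above could be produced, assuming the fitted model object is named `fit` and the `car` package is available:

```r
# VIFs via the car package (assumes `fit` is the full model fitted above)
library(car)
vif(fit)

# Equivalent manual calculation for x_1: regress x_1 on the other predictors
aux <- lm(x_1 ~ x_2 + x_3 + x_4, data = FA11_Data)
1 / (1 - summary(aux)$r.squared)
```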

The table above shows the VIFs for all predictor variables. All VIFs are greater than 10, indicating serious multicollinearity among the predictors.

Removing the predictor with the highest VIF (\(x_4\), VIF ≈ 282.5) and refitting might improve the model.
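One way to proceed, sketched under the assumption that \(x_4\) (the largest VIF) is dropped first and that `fit` is the full model from above:

```r
# Drop the predictor with the largest VIF and refit
fit_reduced <- update(fit, . ~ . - x_4)
summary(fit_reduced)
vif(fit_reduced)   # re-check collinearity among the remaining predictors
```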