Team members: Nadia Shchetnikova, Daniel Zhang, Esubalew Milam, Aubrey Cook

Project 1

Question 1:

NFL=read.csv("NFL Team Scoring.csv")
NFL=NFL[,-1]
summary(NFL)

      Year           PTS              TA              PAVG      
 Min.   :2009   Min.   :10.90   Min.   :0.9375   Min.   :5.100  
 1st Qu.:2009   1st Qu.:18.77   1st Qu.:1.3750   1st Qu.:6.500  
 Median :2010   Median :22.20   Median :1.6250   Median :7.000  
 Mean   :2010   Mean   :21.90   Mean   :1.6504   Mean   :7.042  
 3rd Qu.:2011   3rd Qu.:24.60   3rd Qu.:1.8906   3rd Qu.:7.700  
 Max.   :2011   Max.   :35.00   Max.   :2.5000   Max.   :9.300  
       TO             SCK              RAVG             FD       
 Min.   :0.625   Min.   :0.8125   Min.   :3.300   Min.   :14.10  
 1st Qu.:1.375   1st Qu.:1.7969   1st Qu.:4.000   1st Qu.:17.48  
 Median :1.594   Median :2.1875   Median :4.200   Median :18.95  
 Mean   :1.655   Mean   :2.2259   Mean   :4.231   Mean   :19.02  
 3rd Qu.:1.938   3rd Qu.:2.6875   3rd Qu.:4.400   3rd Qu.:20.70  
 Max.   :2.625   Max.   :3.5000   Max.   :5.400   Max.   :26.00  
      P20             R20              COMP            THRD      
 Min.   :1.562   Min.   :0.1875   Min.   :49.40   Min.   :25.80  
 1st Qu.:2.500   1st Qu.:0.4375   1st Qu.:57.58   1st Qu.:34.02  
 Median :2.812   Median :0.6875   Median :60.45   Median :37.65  
 Mean   :3.024   Mean   :0.7201   Mean   :60.36   Mean   :38.33  
 3rd Qu.:3.578   3rd Qu.:0.8750   3rd Qu.:62.95   3rd Qu.:41.73  
 Max.   :4.500   Max.   :1.6875   Max.   :71.30   Max.   :56.70  
       PA              RA              RY              PY       
 Min.   :24.60   Min.   :20.00   Min.   : 80.9   Min.   :129.8  
 1st Qu.:31.20   1st Qu.:25.20   1st Qu.:100.7   1st Qu.:192.3  
 Median :33.95   Median :27.00   Median :114.5   Median :219.0  
 Mean   :33.67   Mean   :27.33   Mean   :116.1   Mean   :223.2  
 3rd Qu.:36.02   3rd Qu.:29.10   3rd Qu.:127.0   3rd Qu.:253.9  
 Max.   :42.40   Max.   :37.90   Max.   :172.3   Max.   :334.2

attach(NFL)

Question 2:

NFL.lm1=lm(PTS ~ ., data=NFL)
summary(NFL.lm1)


Call:
lm(formula = PTS ~ ., data = NFL)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.5866 -1.3118  0.0109  1.3825  4.0220 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 110.79041  577.63096   0.192   0.8484    
Year         -0.05770    0.28673  -0.201   0.8410    
TA            3.69956    0.63061   5.867  9.6e-08 ***
PAVG          0.58548    2.46659   0.237   0.8130    
TO           -1.50645    0.57465  -2.622   0.0105 *  
SCK          -0.46959    0.65403  -0.718   0.4748    
RAVG          0.65070    1.04548   0.622   0.5355    
FD            0.58207    0.36352   1.601   0.1133    
P20           0.29124    0.82689   0.352   0.7256    
R20           0.06099    1.14600   0.053   0.9577    
COMP         -0.07057    0.11360  -0.621   0.5362    
THRD          0.07478    0.06683   1.119   0.2665    
PA           -0.17546    0.49597  -0.354   0.7244    
RA            0.01212    0.11852   0.102   0.9188    
RY            0.01517    0.03393   0.447   0.6560    
PY            0.04865    0.06784   0.717   0.4754    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.032 on 80 degrees of freedom
Multiple R-squared:  0.8518,    Adjusted R-squared:  0.8241 
F-statistic: 30.67 on 15 and 80 DF,  p-value: < 2.2e-16

library(boot)

Warning: package 'boot' was built under R version 4.5.3

NFL.glm1=glm(PTS~., NFL, family=gaussian)
cv.err1=cv.glm(NFL, NFL.glm1)
cv.err1$delta

[1] 4.963998 4.955193

NFL.glm1$aic

[1] 425.1115

The MSE of PTS is 6.737557, but since this MSE is the squared error of PTS in points, it’s units are currently in points squared. To put the MSE into the the same scale as PTS (points), we take the square root of the MSE and get 2.596, indicating that the predicted points scored per game will be off by approximtaely 2.596 points from the actual points scored per game, more or less.

Question 3:

plot(NFL.lm1, which=1)

library(car)

Warning: package 'car' was built under R version 4.5.3

Loading required package: carData


Attaching package: 'car'

The following object is masked from 'package:boot':

    logit

residualPlots(NFL.lm1)

           Test stat Pr(>|Test stat|)   
Year         -1.1992         0.234030   
TA            0.6917         0.491169   
PAVG          0.1056         0.916155   
TO            0.3792         0.705553   
SCK           0.8677         0.388165   
RAVG          0.3998         0.690410   
FD            1.9734         0.051948 . 
P20          -0.6348         0.527388   
R20           0.5836         0.561123   
COMP          3.0311         0.003294 **
THRD          2.4008         0.018713 * 
PA            2.0456         0.044124 * 
RA            0.8955         0.373217   
RY            1.8072         0.074544 . 
PY            1.3956         0.166750   
Tukey test    0.7276         0.466865   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

par(mfrow=c(1,2))
plot(NFL.lm1, which=2)
hist(NFL.lm1$residuals, breaks=30)

par(mfrow=c(1,1))

plot(NFL.lm1, which=3)

ncvTest(NFL.lm1)

Non-constant Variance Score Test 
Variance formula: ~ fitted.values 
Chisquare = 0.004490501, Df = 1, p = 0.94657

plot(NFL.lm1, which=5)

durbinWatsonTest(NFL.lm1)

 lag Autocorrelation D-W Statistic p-value
   1       0.1568834      1.648938   0.084
 Alternative hypothesis: rho != 0

par(mfrow=c(1,2))
plot(acf(NFL.lm1$residuals, plot=F)[1:23])
pacf(NFL.lm1$residuals)

par(mfrow=c(1,1))

vif(NFL.lm1)

      Year         TA       PAVG         TO        SCK       RAVG         FD 
  1.273739   1.273692  99.727090   1.383903   3.559683   4.293351  17.548656 
       P20        R20       COMP       THRD         PA         RA         RY 
  8.251755   3.002759   5.960095   3.505388  64.810355   3.133505  10.982873 
        PY 
188.392735

cor(NFL[,c("PAVG","P20","PA","PY")])

          PAVG       P20        PA        PY
PAVG 1.0000000 0.8803076 0.2126571 0.8112698
P20  0.8803076 1.0000000 0.3718023 0.8181119
PA   0.2126571 0.3718023 1.0000000 0.7317244
PY   0.8112698 0.8181119 0.7317244 1.0000000

cor(NFL[,c("RAVG","R20","RA","RY")])

          RAVG       R20        RA        RY
RAVG 1.0000000 0.7574716 0.3139383 0.6789253
R20  0.7574716 1.0000000 0.4563373 0.7020973
RA   0.3139383 0.4563373 1.0000000 0.7442925
RY   0.6789253 0.7020973 0.7442925 1.0000000

Linearity: Satisfied; The residuals are distributed relatively evenly across the residuals vs fitted plot, implying linearity.

Normality: Not Satisfied; The residuals closely follow the diagonal line with some slight skew at the beginning and end, indicating a not normal pattern. To resolve this issue, we could linearize the model by squaring the response variable, PTS.

Homoscedasticity: Satisfied; the NCV test of the model results in a p-value greater than 0.05, meaning we fail to reject the null hypothesis that there is no heteroscedasticity or non-constant variance between residuals.

Independence of Errors: Satisfied; the Durbin Watson test of the model results in a p-value greater than 0.05, meaning we fail to reject the null hypothesis that there is a no significant evidence of autocorrelation between residuals.

No Highly Influencial Observations: Satisfied; No points venture beyond Cook’s distance.

No Multicollinearity: Not Satisfied; multiple predictor variables have VIF statistics above 5 or 10, indicating moderate or high multicollinearity with other predictor variables. To resolve this issue, we could eliminate the predictors with high VIF statistics one at a time starting with the greatest VIF and rerun the model until all VIF’s are below 10 or 5.

Question 4:

pred.lm1=predict(NFL.lm1,interval="prediction")

Warning in predict.lm(NFL.lm1, interval = "prediction"): predictions on current data refer to _future_ responses

pred.table.lm1=data.frame(PTS, pred.lm1)
pred.table.lm1[PTS>=30,]

    PTS      fit      lwr      upr
1  35.0 32.65760 28.18026 37.13495
2  34.2 30.99501 26.32840 35.66161
3  32.1 33.09497 28.55918 37.63076
33 32.4 29.87107 25.43507 34.30707
65 31.9 30.31310 25.99270 34.63351

pred.table.lm1[PTS<=15,]

    PTS      fit       lwr      upr
30 13.6 16.83217 12.528517 21.13583
31 13.3 18.88663 14.596145 23.17712
32 12.1 14.62476 10.313931 18.93559
64 12.3 13.60868  9.288410 17.92896
95 12.3 13.03727  8.746164 17.32838
96 10.9 13.51508  9.262574 17.76758

Question 5:

library(leaps)

Warning: package 'leaps' was built under R version 4.5.3

all.fit=regsubsets(PTS~.,NFL,nvmax=15)
all.sum=summary(all.fit)

par(mfrow=c(1,3))
plot(all.sum$cp,type="b")
all.mincp=which.min(all.sum$cp)
all.mincp

[1] 6

points(all.mincp,all.sum$cp[all.mincp],pch=4,cex=2,col=2)

plot(all.sum$bic,type="b")
all.minbic=which.min(all.sum$bic)
all.minbic

[1] 4

points(all.minbic,all.sum$bic[all.minbic],pch=4,cex=2,col=2)

plot(all.sum$adjr2,type="b")
all.maxadjr2=which.max(all.sum$adjr2)
all.maxadjr2

[1] 7

points(all.maxadjr2,all.sum$adjr2[all.maxadjr2],pch=4,cex=2,col=2)

par(mfrow=c(1,1))

Cp (left plot)optimal predictors: 6 The minimum Cp is at index 6, meaning a 6-predictor model offers the best balance of fit and complexity by this criterion. / BIC (middle plot) → optimal predictors: 4 BIC penalizes model complexity more heavily than Cp, so it selects a more parsimonious 4-predictor model. / Adjusted R^2 (right plot) optimal predictors: 7, The adjusted R^2 keeps increasing until 7 predictors, where it levels off — adding more predictors beyond that doesn’t meaningfully improve the model. / The three criteria suggest slightly different optimal model sizes. Cp selects 6 predictors, BIC selects 4, and adjusted R^2 selects 7. BIC tends to favor smaller models due to its stronger penalty for complexity, while adjusted R^2 allows for a few more predictors. Overall, a model with around 4–7 predictors appears to be a reasonable choice for predicting NFL team points scored ##

Question 6:

all.sum$outmat

          Year TA  PAVG TO  SCK RAVG FD  P20 R20 COMP THRD PA  RA  RY  PY 
1  ( 1 )  " "  " " " "  " " " " " "  "*" " " " " " "  " "  " " " " " " " "
2  ( 1 )  " "  "*" " "  " " " " " "  "*" " " " " " "  " "  " " " " " " " "
3  ( 1 )  " "  "*" "*"  " " " " " "  "*" " " " " " "  " "  " " " " " " " "
4  ( 1 )  " "  "*" "*"  "*" " " " "  "*" " " " " " "  " "  " " " " " " " "
5  ( 1 )  " "  "*" " "  "*" " " " "  " " " " " " " "  "*"  " " " " "*" "*"
6  ( 1 )  " "  "*" " "  "*" " " " "  "*" " " " " " "  " "  "*" " " "*" "*"
7  ( 1 )  " "  "*" " "  "*" " " " "  "*" " " " " " "  "*"  "*" " " "*" "*"
8  ( 1 )  " "  "*" " "  "*" " " " "  "*" " " " " "*"  "*"  "*" " " "*" "*"
9  ( 1 )  " "  "*" " "  "*" "*" "*"  "*" " " " " "*"  "*"  "*" " " " " "*"
10  ( 1 ) " "  "*" "*"  "*" "*" "*"  "*" " " " " "*"  "*"  " " " " "*" "*"
11  ( 1 ) " "  "*" " "  "*" "*" "*"  "*" "*" " " "*"  "*"  "*" " " "*" "*"
12  ( 1 ) " "  "*" "*"  "*" "*" "*"  "*" "*" " " "*"  "*"  "*" " " "*" "*"
13  ( 1 ) "*"  "*" "*"  "*" "*" "*"  "*" "*" " " "*"  "*"  "*" " " "*" "*"
14  ( 1 ) "*"  "*" "*"  "*" "*" "*"  "*" "*" " " "*"  "*"  "*" "*" "*" "*"
15  ( 1 ) "*"  "*" "*"  "*" "*" "*"  "*" "*" "*" "*"  "*"  "*" "*" "*" "*"

write.csv(all.sum$outmat, file="all_outmat.csv", row.names = TRUE)

model2 <- lm(PTS ~ TA + TO + FD + PA + RY + PY, data = NFL)
model3 <- lm(PTS ~ TA + PAVG + TO + FD, data = NFL)
model4 <- lm(PTS ~ TA + TO + FD + THRD + PA + RY + PY, data = NFL)

cv2 <- cv.glm(NFL, glm(PTS ~ TA + TO + FD + PA + RY + PY, data = NFL))$delta[1]
cv3 <- cv.glm(NFL, glm(PTS ~ TA + PAVG + TO + FD, data = NFL))$delta[1]
cv4 <- cv.glm(NFL, glm(PTS ~ TA + TO + FD + THRD + PA + RY + PY, data = NFL))$delta[1]
cv2; cv3; cv4

[1] 4.145494

[1] 4.322625

[1] 4.152935

AIC(model2); AIC(model3); AIC(model4)

[1] 410.4483

[1] 414.2749

[1] 410.8583

Model 2 (Cp, 6 predictors) has the best prediction accuracy with the lowest LOOCV error (4.145) and lowest AIC (410.45). / Model 4 (adjusted R^2, 7 predictors) performs similarly with a LOOCV of 4.153 and AIC of 410.86. / Model 3 (BIC, 4 predictors) has the weakest prediction accuracy, with the highest LOOCV (4.323) and AIC (414.27).

Question 7:

x=model.matrix(PTS~(TA+TO+FD+PA+RY+PY)^2., NFL)
dim(x)

[1] 96 22

library(leaps)                                                                  
x = model.matrix(PTS ~ (TA + TO + FD + PA + RY + PY)^2, data = NFL)
y = NFL$PTS

all.interact.fit = regsubsets(PTS ~ (TA + TO + FD + PA + RY + PY)^2, data = NFL, nvmax = 21)
all.interact.sum = summary(all.interact.fit)

best.cp.index = which.min(all.interact.sum$cp)

coef(all.interact.fit, best.cp.index)

 (Intercept)           TA           PA        TA:TO        TA:RY        FD:RY 
12.504544487 10.398222190 -0.736362590 -0.795752994 -0.045227617  0.005504967 
       PA:PY 
 0.002056339

The six predictors that are selected in the model are the the following:

TA (Takeaways)
FD (First Downs)
PA (Pass Attempts)
PY (Passing Yards)
TA:TO (Interaction: Takeaways \(\times\) Turnovers)
FD:RY (Interaction: First Downs \(\times\) Rushing Yards)

Question 8:

model_hierarchical <- lm(PTS ~ TA + TO + FD + RY + PA + PY + TA:TO + FD:RY, data = NFL)
summary(model_hierarchical)


Call:
lm(formula = PTS ~ TA + TO + FD + RY + PA + PY + TA:TO + FD:RY, 
    data = NFL)

Residuals:
   Min     1Q Median     3Q    Max 
-5.793 -1.348  0.024  1.367  3.835 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.212004  13.508044  -0.016  0.98751    
TA           5.297162   1.949036   2.718  0.00793 ** 
TO           0.352340   2.165974   0.163  0.87116    
FD           0.202642   0.643805   0.315  0.75370    
RY          -0.026915   0.111214  -0.242  0.80934    
PA          -0.266589   0.117470  -2.269  0.02572 *  
PY           0.072659   0.014605   4.975 3.26e-06 ***
TA:TO       -1.028073   1.183307  -0.869  0.38734    
FD:RY        0.003427   0.005544   0.618  0.53807    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.971 on 87 degrees of freedom
Multiple R-squared:  0.8485,    Adjusted R-squared:  0.8345 
F-statistic: 60.89 on 8 and 87 DF,  p-value: < 2.2e-16

model_step2 <- update(model_hierarchical, . ~ . - TA:TO)
summary(model_step2)


Call:
lm(formula = PTS ~ TA + TO + FD + RY + PA + PY + FD:RY, data = NFL)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.8249 -1.3553  0.0341  1.4507  3.7683 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.592633  13.098365   0.198  0.84355    
TA           3.681243   0.581816   6.327 1.01e-08 ***
TO          -1.471540   0.532621  -2.763  0.00698 ** 
FD           0.268742   0.638403   0.421  0.67481    
RY          -0.024314   0.111019  -0.219  0.82715    
PA          -0.279938   0.116298  -2.407  0.01817 *  
PY           0.071431   0.014516   4.921 3.99e-06 ***
FD:RY        0.003096   0.005523   0.561  0.57652    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.968 on 88 degrees of freedom
Multiple R-squared:  0.8472,    Adjusted R-squared:  0.835 
F-statistic: 69.68 on 7 and 88 DF,  p-value: < 2.2e-16

model_step3 <- update(model_step2, . ~ . - FD:RY)
summary(model_step3)


Call:
lm(formula = PTS ~ TA + TO + FD + RY + PA + PY, data = NFL)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.8633 -1.3791  0.0302  1.3929  3.8124 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -4.40314    3.96207  -1.111  0.26942    
TA           3.63016    0.57242   6.342 9.19e-09 ***
TO          -1.45196    0.52942  -2.743  0.00737 ** 
FD           0.59558    0.25902   2.299  0.02383 *  
RY           0.03693    0.01965   1.879  0.06347 .  
PA          -0.27055    0.11464  -2.360  0.02046 *  
PY           0.07261    0.01431   5.075 2.11e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.961 on 89 degrees of freedom
Multiple R-squared:  0.8466,    Adjusted R-squared:  0.8363 
F-statistic: 81.87 on 6 and 89 DF,  p-value: < 2.2e-16

Model5 <- lm(PTS ~ TA + TO + FD + PA + RY + PY, data = NFL)
summary(Model5)


Call:
lm(formula = PTS ~ TA + TO + FD + PA + RY + PY, data = NFL)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.8633 -1.3791  0.0302  1.3929  3.8124 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -4.40314    3.96207  -1.111  0.26942    
TA           3.63016    0.57242   6.342 9.19e-09 ***
TO          -1.45196    0.52942  -2.743  0.00737 ** 
FD           0.59558    0.25902   2.299  0.02383 *  
PA          -0.27055    0.11464  -2.360  0.02046 *  
RY           0.03693    0.01965   1.879  0.06347 .  
PY           0.07261    0.01431   5.075 2.11e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.961 on 89 degrees of freedom
Multiple R-squared:  0.8466,    Adjusted R-squared:  0.8363 
F-statistic: 81.87 on 6 and 89 DF,  p-value: < 2.2e-16

The Predictors included in Model 5 are as follows:

TA (Takeaways)
TO (Turnovers)
FD (First Downs)
PA (Pass Attempts)
RY (Rushing Yards)
PY (Passing Yards)

Question 9:

plot(Model5, which=1)

library(car)
residualPlots(Model5)

           Test stat Pr(>|Test stat|)  
TA            0.3940          0.69451  
TO            0.5232          0.60218  
FD            1.4793          0.14264  
PA            1.7235          0.08831 .
RY            1.7665          0.08078 .
PY            0.8285          0.40961  
Tukey test    0.5010          0.61638  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

par(mfrow=c(1,2))
plot(Model5, which=2)
hist(Model5$residuals, breaks=30)

par(mfrow=c(1,1))

plot(Model5, which=3)

ncvTest(Model5)

Non-constant Variance Score Test 
Variance formula: ~ fitted.values 
Chisquare = 0.0002173105, Df = 1, p = 0.98824

plot(Model5, which=5)

durbinWatsonTest(Model5)

 lag Autocorrelation D-W Statistic p-value
   1       0.1824282      1.597095    0.04
 Alternative hypothesis: rho != 0

par(mfrow=c(1,2))
plot(acf(Model5$residuals, plot=F)[1:23])
pacf(Model5$residuals)

par(mfrow=c(1,1))

vif(Model5)

      TA       TO       FD       PA       RY       PY 
1.127632 1.262146 9.573658 3.720616 3.958191 9.004008

cor(NFL[,c("PAVG","P20","PA","PY")])

          PAVG       P20        PA        PY
PAVG 1.0000000 0.8803076 0.2126571 0.8112698
P20  0.8803076 1.0000000 0.3718023 0.8181119
PA   0.2126571 0.3718023 1.0000000 0.7317244
PY   0.8112698 0.8181119 0.7317244 1.0000000

cor(NFL[,c("RAVG","R20","RA","RY")])

          RAVG       R20        RA        RY
RAVG 1.0000000 0.7574716 0.3139383 0.6789253
R20  0.7574716 1.0000000 0.4563373 0.7020973
RA   0.3139383 0.4563373 1.0000000 0.7442925
RY   0.6789253 0.7020973 0.7442925 1.0000000

Linearity: Satisfied; The residuals are distributed relatively evenly across the residuals vs fitted plot, implying linearity.

Normality: Satisfied; At first, the residuals appear to have a slight skew at the edges of the Q-Q residuals plot, indicating a non-normal pattern. Normally, to resolve this issue, we could make the model more linear by finding out which variables need to be squared with residualPlots(). However, after using residualPlots(), none of the predictors have a p-value less than 0.05, meaning the skewness is statistically insignificant at a = 0.05. PA and RY residuals have p statistics insignificant at a = 0.05, but significant at a = 0.10.

Homoscedasticity: Satisfied; the NCV test of Model5 has a p-value greater than 0.05, meaning we fail to reject the null hypothesis that there is no heteroscedasticity or non-constant variance between residuals.

Independence of Errors: Not Satisfied; the Durbin Watson test of the model results in a p-value less than 0.05, meaning there is autocorrelation between residuals. To fix this, generalized least squares (GLS) functions can be used.

No Highly Influential Observations: Satisfied; No points are beyond Cook’s distance.

No Multicollinearity: Not Satisfied; FD and PY have VIF statistics above 5, indicating multicollinearity with other predictor variables. To resolve this issue, we could eliminate the predictors with high VIF statistics one at a time starting with the greatest VIF (FD) and rerun the model until all VIF’s are below 10 or 5.

Fixing the model

Multicollinearity: Remove the FD predictor since it has the greatest VIF score.

Model6 <- lm(PTS ~ TA + TO + PA + RY + PY, data = NFL)
vif(Model6)

      TA       TO       PA       RY       PY 
1.082188 1.209053 3.236346 1.513007 2.711652

The new VIF statistics show no scores above 5, so no further removals are necessary.

Independence of errors:

durbinWatsonTest(Model6)

 lag Autocorrelation D-W Statistic p-value
   1       0.1484058      1.665536   0.072
 Alternative hypothesis: rho != 0

par(mfrow=c(1,2))
plot(acf(Model6$residuals, plot=F)[1:23])
pacf(Model6$residuals)

par(mfrow=c(1,1))

After removing the FD predictor, the new test statistics show no violation of independence of errors, so no further changes are necessary.

Question 10:

Model6_glm <- glm(PTS ~ TA + TO + PA + RY + PY, data = NFL)
cv.glm(NFL, Model6_glm)$delta

[1] 4.305078 4.302199

AIC(Model6)

[1] 413.988

Model5_glm <- glm(PTS ~ TA + TO + FD + PA + RY + PY, data = NFL)
cv.glm(NFL, Model5_glm)$delta

[1] 4.145494 4.142310

AIC(Model5)

[1] 410.4483

Model6 is not the best model in terms of prediction accuracy. Models 2 and 5 have the lowest test MSE estimate and AIC, making them the best for predictions.