Multiple Regression

Using R, build a multiple regression model for data that interests you. Include in this model at least one quadratic term, one dichotomous term, and one dichotomous vs. quantitative interaction term. Interpret all coefficients. Conduct residual analysis. Was the linear model appropriate? Why or why not?

I will use a dataset from an old project where I completed a linear regression ( my research question was: Does the temperature predict the quantity of daily bike rentals?) and I will add more variables to it to complete the discussion board.

# load data
library (readr)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
urlfile="https://raw.githubusercontent.com/MarjeteV/606-Project/main/Dataset%20Days"
mydata<-read_csv(url(urlfile))
## Rows: 731 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl  (15): instant, season, yr, mnth, holiday, weekday, workingday, weathers...
## date  (1): dteday
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
summary(mydata)
##     instant          dteday               season            yr        
##  Min.   :  1.0   Min.   :2011-01-01   Min.   :1.000   Min.   :0.0000  
##  1st Qu.:183.5   1st Qu.:2011-07-02   1st Qu.:2.000   1st Qu.:0.0000  
##  Median :366.0   Median :2012-01-01   Median :3.000   Median :1.0000  
##  Mean   :366.0   Mean   :2012-01-01   Mean   :2.497   Mean   :0.5007  
##  3rd Qu.:548.5   3rd Qu.:2012-07-01   3rd Qu.:3.000   3rd Qu.:1.0000  
##  Max.   :731.0   Max.   :2012-12-31   Max.   :4.000   Max.   :1.0000  
##       mnth          holiday           weekday        workingday   
##  Min.   : 1.00   Min.   :0.00000   Min.   :0.000   Min.   :0.000  
##  1st Qu.: 4.00   1st Qu.:0.00000   1st Qu.:1.000   1st Qu.:0.000  
##  Median : 7.00   Median :0.00000   Median :3.000   Median :1.000  
##  Mean   : 6.52   Mean   :0.02873   Mean   :2.997   Mean   :0.684  
##  3rd Qu.:10.00   3rd Qu.:0.00000   3rd Qu.:5.000   3rd Qu.:1.000  
##  Max.   :12.00   Max.   :1.00000   Max.   :6.000   Max.   :1.000  
##    weathersit         temp             atemp              hum        
##  Min.   :1.000   Min.   :0.05913   Min.   :0.07907   Min.   :0.0000  
##  1st Qu.:1.000   1st Qu.:0.33708   1st Qu.:0.33784   1st Qu.:0.5200  
##  Median :1.000   Median :0.49833   Median :0.48673   Median :0.6267  
##  Mean   :1.395   Mean   :0.49538   Mean   :0.47435   Mean   :0.6279  
##  3rd Qu.:2.000   3rd Qu.:0.65542   3rd Qu.:0.60860   3rd Qu.:0.7302  
##  Max.   :3.000   Max.   :0.86167   Max.   :0.84090   Max.   :0.9725  
##    windspeed           casual         registered        cnt      
##  Min.   :0.02239   Min.   :   2.0   Min.   :  20   Min.   :  22  
##  1st Qu.:0.13495   1st Qu.: 315.5   1st Qu.:2497   1st Qu.:3152  
##  Median :0.18097   Median : 713.0   Median :3662   Median :4548  
##  Mean   :0.19049   Mean   : 848.2   Mean   :3656   Mean   :4504  
##  3rd Qu.:0.23321   3rd Qu.:1096.0   3rd Qu.:4776   3rd Qu.:5956  
##  Max.   :0.50746   Max.   :3410.0   Max.   :6946   Max.   :8714
#temp was rescaled so the range of these values is between 0-1
# how do I determine quadratic variable? look for potential non-linear relationship
#I decided to use humidity as the quadratic variable

modela <- lm(cnt ~ hum + I(hum^2), data = mydata)

# Plot the data and the quadratic regression line
ggplot(mydata, aes(x = hum, y = cnt)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE, col = "blue") +
  labs(title = "Scatter Plot with Quadratic Regression Line",
       x = "hum",
       y = "Bike Rentals")

summary(modela)
## 
## Call:
## lm(formula = cnt ~ hum + I(hum^2), data = mydata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4281.4 -1252.3     2.3  1515.0  3936.6 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -1245.5      954.2  -1.305    0.192    
## hum          21021.7     3093.3   6.796 2.24e-11 ***
## I(hum^2)    -17971.9     2452.3  -7.329 6.19e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1863 on 728 degrees of freedom
## Multiple R-squared:  0.07814,    Adjusted R-squared:  0.07561 
## F-statistic: 30.86 on 2 and 728 DF,  p-value: 1.372e-13
#Dischotomous Term is a binary - yes or no/ 1 or 0; working day and holiday could be selected from the dataset. 
#One Dichotomous vs Quantitative Interaction Term- will try to use cnt and working day
modelb <- lm(cnt ~ temp * workingday, data = mydata)
summary(modelb)
## 
## Call:
## lm(formula = cnt ~ temp * workingday, data = mydata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4652.8 -1121.4  -100.5  1059.8  3781.6 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        850.0      274.2   3.100  0.00201 ** 
## temp              7232.1      531.1  13.618  < 2e-16 ***
## workingday         560.1      338.9   1.653  0.09880 .  
## temp:workingday   -907.1      649.2  -1.397  0.16277    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1508 on 727 degrees of freedom
## Multiple R-squared:  0.3962, Adjusted R-squared:  0.3937 
## F-statistic:   159 on 3 and 727 DF,  p-value: < 2.2e-16
#Next I will create a dataframe & add new columns 
dfa <- mydata %>%
  mutate( hum_squared = `hum`^2,
          DvsQ= `temp` * `workingday`) #this seems off since any working day = 0 will make the new variable 0

#I want to remove some columns that seem unlikely to help in MLR
remove <- c("instant", "dteday", "yr")
df <- dfa [, !(names(dfa) %in% remove)]

glimpse(df)
## Rows: 731
## Columns: 15
## $ season      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ mnth        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ holiday     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0…
## $ weekday     <dbl> 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4…
## $ workingday  <dbl> 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1…
## $ weathersit  <dbl> 2, 2, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 2, 2, 2, 2…
## $ temp        <dbl> 0.3441670, 0.3634780, 0.1963640, 0.2000000, 0.2269570, 0.2…
## $ atemp       <dbl> 0.3636250, 0.3537390, 0.1894050, 0.2121220, 0.2292700, 0.2…
## $ hum         <dbl> 0.805833, 0.696087, 0.437273, 0.590435, 0.436957, 0.518261…
## $ windspeed   <dbl> 0.1604460, 0.2485390, 0.2483090, 0.1602960, 0.1869000, 0.0…
## $ casual      <dbl> 331, 131, 120, 108, 82, 88, 148, 68, 54, 41, 43, 25, 38, 5…
## $ registered  <dbl> 654, 670, 1229, 1454, 1518, 1518, 1362, 891, 768, 1280, 12…
## $ cnt         <dbl> 985, 801, 1349, 1562, 1600, 1606, 1510, 959, 822, 1321, 12…
## $ hum_squared <dbl> 0.6493668, 0.4845371, 0.1912077, 0.3486135, 0.1909314, 0.2…
## $ DvsQ        <dbl> 0.0000000, 0.0000000, 0.1963640, 0.2000000, 0.2269570, 0.2…
pairs(df, gap = .5)

model.all <- lm(cnt ~ ., data = df)

summary(model.all)
## Warning in summary.lm(model.all): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ ., data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.466e-11 -3.890e-13 -4.800e-14  2.430e-13  4.394e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -8.378e-13  1.362e-12 -6.150e-01  0.53867    
## season       6.988e-13  1.445e-13  4.837e+00 1.61e-06 ***
## mnth        -8.440e-14  4.400e-14 -1.918e+00  0.05548 .  
## holiday     -1.194e-12  5.188e-13 -2.302e+00  0.02163 *  
## weekday      4.148e-14  4.211e-14  9.850e-01  0.32500    
## workingday  -1.472e-12  5.088e-13 -2.894e+00  0.00392 ** 
## weathersit   1.932e-12  2.141e-13  9.027e+00  < 2e-16 ***
## temp        -1.073e-12  3.758e-12 -2.850e-01  0.77538    
## atemp       -6.743e-12  4.085e-12 -1.651e+00  0.09921 .  
## hum          3.650e-12  4.013e-12  9.090e-01  0.36343    
## windspeed   -3.425e-13  1.190e-12 -2.880e-01  0.77354    
## casual       1.000e+00  2.535e-16  3.944e+15  < 2e-16 ***
## registered   1.000e+00  8.880e-17  1.126e+16  < 2e-16 ***
## hum_squared -1.772e-12  3.280e-12 -5.400e-01  0.58907    
## DvsQ         6.401e-13  1.083e-12  5.910e-01  0.55474    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.234e-12 on 716 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 3.92e+31 on 14 and 716 DF,  p-value: < 2.2e-16
#Warning: essentially perfect fit: summary may be unreliable
#Backward Elimination- take the predictor with the largest p-value & recompute
model1 <- update(model.all, .~. - temp, data = df)

summary(model1)
## Warning in summary.lm(model1): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + mnth + holiday + weekday + workingday + 
##     weathersit + atemp + hum + windspeed + casual + registered + 
##     hum_squared + DvsQ, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.560e-11 -3.010e-13 -5.500e-14  1.850e-13  4.446e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -3.096e-12  1.334e-12 -2.320e+00   0.0206 *  
## season       1.807e-13  1.416e-13  1.276e+00   0.2024    
## mnth        -3.370e-15  4.312e-14 -7.800e-02   0.9377    
## holiday     -7.950e-13  5.079e-13 -1.565e+00   0.1179    
## weekday      2.822e-14  4.126e-14  6.840e-01   0.4943    
## workingday   4.734e-13  4.888e-13  9.680e-01   0.3331    
## weathersit   2.587e-12  2.097e-13  1.234e+01   <2e-16 ***
## atemp        2.452e-12  1.159e-12  2.115e+00   0.0348 *  
## hum         -5.865e-13  3.931e-12 -1.490e-01   0.8815    
## windspeed   -2.814e-12  1.149e-12 -2.448e+00   0.0146 *  
## casual       1.000e+00  2.443e-16  4.094e+15   <2e-16 ***
## registered   1.000e+00  8.675e-17  1.153e+16   <2e-16 ***
## hum_squared -4.034e-13  3.214e-12 -1.260e-01   0.9001    
## DvsQ        -1.240e-12  1.019e-12 -1.217e+00   0.2241    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.19e-12 on 717 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 4.394e+31 on 13 and 717 DF,  p-value: < 2.2e-16
model2 <- update(model1, .~. - mnth, data = df)

summary(model2)
## Warning in summary.lm(model2): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + holiday + weekday + workingday + 
##     weathersit + atemp + hum + windspeed + casual + registered + 
##     hum_squared + DvsQ, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.376e-11 -2.570e-13  1.400e-14  2.780e-13  4.822e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -1.629e-12  1.515e-12 -1.075e+00 0.282663    
## season       1.238e-13  9.725e-14  1.273e+00 0.203571    
## holiday      7.136e-14  5.780e-13  1.230e-01 0.901777    
## weekday      9.401e-14  4.695e-14  2.002e+00 0.045616 *  
## workingday  -2.760e-12  5.565e-13 -4.960e+00 8.81e-07 ***
## weathersit   1.604e-12  2.388e-13  6.714e+00 3.85e-11 ***
## atemp       -3.566e-12  1.319e-12 -2.702e+00 0.007047 ** 
## hum         -1.725e-12  4.453e-12 -3.870e-01 0.698671    
## windspeed    4.958e-12  1.309e-12  3.788e+00 0.000164 ***
## casual       1.000e+00  2.777e-16  3.600e+15  < 2e-16 ***
## registered   1.000e+00  9.879e-17  1.012e+16  < 2e-16 ***
## hum_squared  2.866e-12  3.646e-12  7.860e-01 0.432075    
## DvsQ         5.786e-13  1.160e-12  4.990e-01 0.618180    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.494e-12 on 718 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 3.67e+31 on 12 and 718 DF,  p-value: < 2.2e-16
model3 <- update(model2, .~. - holiday, data = df)

summary(model3)
## Warning in summary.lm(model3): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + weekday + workingday + weathersit + 
##     atemp + hum + windspeed + casual + registered + hum_squared + 
##     DvsQ, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.563e-11 -2.830e-13  1.000e-15  2.440e-13  4.546e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -1.061e-12  1.442e-12 -7.360e-01   0.4621    
## season       8.882e-14  9.258e-14  9.590e-01   0.3377    
## weekday     -4.392e-14  4.454e-14 -9.860e-01   0.3245    
## workingday  -2.527e-12  5.257e-13 -4.808e+00 1.86e-06 ***
## weathersit  -3.967e-13  2.274e-13 -1.745e+00   0.0814 .  
## atemp        5.181e-13  1.255e-12  4.130e-01   0.6799    
## hum          5.617e-13  4.233e-12  1.330e-01   0.8945    
## windspeed    7.650e-12  1.246e-12  6.140e+00 1.36e-09 ***
## casual       1.000e+00  2.632e-16  3.800e+15  < 2e-16 ***
## registered   1.000e+00  9.404e-17  1.063e+16  < 2e-16 ***
## hum_squared  5.959e-13  3.464e-12  1.720e-01   0.8635    
## DvsQ         5.285e-13  1.104e-12  4.790e-01   0.6324    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.374e-12 on 719 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 4.419e+31 on 11 and 719 DF,  p-value: < 2.2e-16
model4 <- update(model3, .~. - hum, data = df)

summary(model4)
## Warning in summary.lm(model4): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + weekday + workingday + weathersit + 
##     atemp + windspeed + casual + registered + hum_squared + DvsQ, 
##     data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.614e-11 -3.110e-13 -2.800e-14  1.780e-13  4.346e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  2.421e-12  5.987e-13  4.044e+00 5.81e-05 ***
## season       2.010e-14  8.603e-14  2.340e-01  0.81531    
## weekday      5.060e-15  4.139e-14  1.220e-01  0.90274    
## workingday  -4.131e-12  4.882e-13 -8.460e+00  < 2e-16 ***
## weathersit  -6.090e-13  2.069e-13 -2.943e+00  0.00335 ** 
## atemp        3.443e-12  1.164e-12  2.958e+00  0.00320 ** 
## windspeed   -7.838e-13  1.147e-12 -6.840e-01  0.49442    
## casual       1.000e+00  2.442e-16  4.095e+15  < 2e-16 ***
## registered   1.000e+00  8.664e-17  1.154e+16  < 2e-16 ***
## hum_squared -1.323e-14  6.526e-13 -2.000e-02  0.98383    
## DvsQ         1.625e-12  1.024e-12  1.586e+00  0.11312    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.206e-12 on 720 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 5.629e+31 on 10 and 720 DF,  p-value: < 2.2e-16
model5 <- update(model4, .~. - hum_squared, data = df)

summary(model5)
## Warning in summary.lm(model5): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + weekday + workingday + weathersit + 
##     atemp + windspeed + casual + registered + DvsQ, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.614e-11 -3.110e-13 -2.700e-14  1.780e-13  4.346e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept)  2.421e-12  5.914e-13  4.094e+00 4.71e-05 ***
## season       2.010e-14  8.500e-14  2.360e-01 0.813112    
## weekday      5.060e-15  4.128e-14  1.230e-01 0.902465    
## workingday  -4.131e-12  4.879e-13 -8.467e+00  < 2e-16 ***
## weathersit  -6.090e-13  1.604e-13 -3.797e+00 0.000159 ***
## atemp        3.443e-12  1.143e-12  3.012e+00 0.002686 ** 
## windspeed   -7.838e-13  1.098e-12 -7.140e-01 0.475751    
## casual       1.000e+00  2.430e-16  4.116e+15  < 2e-16 ***
## registered   1.000e+00  8.603e-17  1.162e+16  < 2e-16 ***
## DvsQ         1.626e-12  1.023e-12  1.590e+00 0.112332    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.205e-12 on 721 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 6.263e+31 on 9 and 721 DF,  p-value: < 2.2e-16
model6 <- update(model5, .~. - weekday, data = df)

summary(model6)
## Warning in summary.lm(model6): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + workingday + weathersit + atemp + 
##     windspeed + casual + registered + DvsQ, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.238e-11 -2.400e-13 -2.400e-14  1.780e-13  4.631e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -1.159e-12  5.140e-13 -2.256e+00   0.0244 *  
## season       3.272e-13  7.528e-14  4.346e+00 1.58e-05 ***
## workingday  -6.873e-13  4.321e-13 -1.591e+00   0.1121    
## weathersit   9.839e-13  1.417e-13  6.941e+00 8.66e-12 ***
## atemp       -1.420e-12  1.006e-12 -1.412e+00   0.1585    
## windspeed    1.868e-12  9.725e-13  1.921e+00   0.0552 .  
## casual       1.000e+00  2.131e-16  4.692e+15  < 2e-16 ***
## registered   1.000e+00  7.619e-17  1.313e+16  < 2e-16 ***
## DvsQ         2.590e-13  9.036e-13  2.870e-01   0.7745    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.952e-12 on 722 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 8.983e+31 on 8 and 722 DF,  p-value: < 2.2e-16
model7 <- update(model6, .~. - DvsQ, data = df)

summary(model7)
## Warning in summary.lm(model7): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = cnt ~ season + workingday + weathersit + atemp + 
##     windspeed + casual + registered, data = df)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.238e-11 -2.440e-13 -2.500e-14  1.740e-13  4.632e-11 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -1.257e-12  4.228e-13 -2.972e+00  0.00306 ** 
## season       3.264e-13  7.505e-14  4.349e+00 1.56e-05 ***
## workingday  -5.426e-13  2.521e-13 -2.153e+00  0.03169 *  
## weathersit   9.839e-13  1.415e-13  6.953e+00 8.03e-12 ***
## atemp       -1.420e-12  6.199e-13 -2.291e+00  0.02225 *  
## windspeed    1.868e-12  9.710e-13  1.924e+00  0.05479 .  
## casual       1.000e+00  1.937e-16  5.164e+15  < 2e-16 ***
## registered   1.000e+00  7.328e-17  1.365e+16  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.951e-12 on 723 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 1.028e+32 on 7 and 723 DF,  p-value: < 2.2e-16
summary(model7)$r.squared
## Warning in summary.lm(model7): essentially perfect fit: summary may be
## unreliable
## [1] 1
par(mmfrow = c(2.2))
## Warning in par(mmfrow = c(2.2)): "mmfrow" is not a graphical parameter
plot(model7)

##conclusion

Rsquared continues to be 1 in all the models so obviously something went wrong, making interpretation of residuals not worth it since there are errors in the model , and I do not feel confident in stating either way if MLR is a good model for this data.