Q2

KNN classifier and KNN regression models both determine the value at a given point by evaluating the values at the K nearest points. The difference is that a classifier estimates the probability of each class at the point from the proportion of its K nearest neighbors in that class and predicts the most probable class, whereas a regression model simply averages the response values of the K nearest neighbors. For example, if we were interested in home prices in Bexar County, we could use either model. For regression, we would average the home prices in the K nearest counties and predict that value for Bexar County. For classification, we would assign a factor level (say 1-5, with 1 being the lowest price band and 5 the highest) to each of the surrounding counties, and then predict Bexar County’s factor level as the most common level among them.
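
As a minimal sketch of this distinction, the base R snippet below applies both rules to a made-up one-dimensional data set. The helper names `nearest`, `knn_regress`, and `knn_classify`, the toy values, and the choice K = 3 are illustrative assumptions and not part of the original answer.

```r
# Toy training data (hypothetical values chosen for illustration only)
train_x     <- c(1, 2, 3, 6, 7, 8)
train_value <- c(100, 110, 120, 200, 210, 220)   # numeric response
train_class <- factor(c("low", "low", "low", "high", "high", "high"))

# Indices of the k training points closest to x0
nearest <- function(x0, k) order(abs(train_x - x0))[seq_len(k)]

# KNN regression: average the response over the k nearest neighbors
knn_regress <- function(x0, k = 3) mean(train_value[nearest(x0, k)])

# KNN classification: estimate class probabilities as neighbor proportions,
# then predict the most probable class
knn_classify <- function(x0, k = 3) {
  probs <- prop.table(table(train_class[nearest(x0, k)]))
  names(probs)[which.max(probs)]
}

knn_regress(2.4)   # mean response of the neighbors at x = 2, 3, 1
knn_classify(2.4)  # majority class among the same three neighbors
```

Both functions use the same neighbor set; only the way the neighbors' values are combined differs.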

Q9

  1. Scatterplot Matrix of Variables.
library(ISLR)  # assumed source of the Auto and Carseats data sets
auto = Auto
attach(auto)
pairs(auto)

  2. Matrix of Correlation.
cor(auto[,1:8])
##                     mpg  cylinders displacement horsepower     weight
## mpg           1.0000000 -0.7776175   -0.8051269 -0.7784268 -0.8322442
## cylinders    -0.7776175  1.0000000    0.9508233  0.8429834  0.8975273
## displacement -0.8051269  0.9508233    1.0000000  0.8972570  0.9329944
## horsepower   -0.7784268  0.8429834    0.8972570  1.0000000  0.8645377
## weight       -0.8322442  0.8975273    0.9329944  0.8645377  1.0000000
## acceleration  0.4233285 -0.5046834   -0.5438005 -0.6891955 -0.4168392
## year          0.5805410 -0.3456474   -0.3698552 -0.4163615 -0.3091199
## origin        0.5652088 -0.5689316   -0.6145351 -0.4551715 -0.5850054
##              acceleration       year     origin
## mpg             0.4233285  0.5805410  0.5652088
## cylinders      -0.5046834 -0.3456474 -0.5689316
## displacement   -0.5438005 -0.3698552 -0.6145351
## horsepower     -0.6891955 -0.4163615 -0.4551715
## weight         -0.4168392 -0.3091199 -0.5850054
## acceleration    1.0000000  0.2903161  0.2127458
## year            0.2903161  1.0000000  0.1815277
## origin          0.2127458  0.1815277  1.0000000
  3. MLR. There is a significant relationship between the predictors and the response: the F-statistic is 252.4 on 7 and 384 degrees of freedom (p-value < 2.2e-16), and the model explains about 82% of the variance in mpg (multiple R-squared 0.8215, adjusted 0.8182). Displacement, weight, year, and origin are significant at the 0.01 level. The coefficient for year suggests that, holding the other predictors fixed, each additional model year is associated with an increase of about 0.75 mpg. In other words, newer vehicles tend to have greater mpg.
autolm1 = lm(mpg ~. -name, data = auto)
summary(autolm1)
## 
## Call:
## lm(formula = mpg ~ . - name, data = auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5903 -2.1565 -0.1169  1.8690 13.0604 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -17.218435   4.644294  -3.707  0.00024 ***
## cylinders     -0.493376   0.323282  -1.526  0.12780    
## displacement   0.019896   0.007515   2.647  0.00844 ** 
## horsepower    -0.016951   0.013787  -1.230  0.21963    
## weight        -0.006474   0.000652  -9.929  < 2e-16 ***
## acceleration   0.080576   0.098845   0.815  0.41548    
## year           0.750773   0.050973  14.729  < 2e-16 ***
## origin         1.426141   0.278136   5.127 4.67e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared:  0.8215, Adjusted R-squared:  0.8182 
## F-statistic: 252.4 on 7 and 384 DF,  p-value: < 2.2e-16
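
The significance claim can be checked mechanically. The sketch below copies the p-values from the coefficient table above into a named vector and filters them at alpha = 0.01; the names `p_values`, `alpha`, and `significant` are introduced here for illustration, and the "< 2e-16" entries are entered as 2e-16, which is only an upper bound.

```r
# p-values copied from the summary(autolm1) coefficient table
# ("< 2e-16" entries are entered as 2e-16, an upper bound)
p_values <- c(
  cylinders    = 0.12780,
  displacement = 0.00844,
  horsepower   = 0.21963,
  weight       = 2e-16,
  acceleration = 0.41548,
  year         = 2e-16,
  origin       = 4.67e-07
)

alpha <- 0.01
significant <- names(p_values)[p_values < alpha]
significant
```

With the fitted object available, the same vector could be taken from `coef(summary(autolm1))[, "Pr(>|t|)"]` instead of being typed by hand.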
  4. MLR Diagnostics. The Q-Q plot shows that most residuals follow a normal distribution; however, the residuals in the upper quantiles are increasingly larger than expected, which indicates a heavy right tail. There appear to be a few outliers: observations 321, 324, and 325 have unusually large residuals. Observations 325 and 389 have unusually high leverage.
par(mfrow=c(2,2))
plot(autolm1)
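
Outliers and high-leverage points can also be flagged numerically rather than read off the plots. The sketch below is a minimal illustration on simulated data, not on the Auto data: the planted outlier, the planted high-leverage point, and the cutoffs (studentized residual beyond 3 in absolute value, leverage above three times the average p/n) are assumptions based on common rules of thumb.

```r
set.seed(42)
n <- 50
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)

# Plant one high-leverage point (extreme x, still close to the true line)
x[n] <- 8
y[n] <- 2 + 3 * 8 + rnorm(1)

# Plant one outlier (typical x, response shifted far from the line)
y[1] <- y[1] + 15

fit <- lm(y ~ x)

lev   <- hatvalues(fit)   # leverage of each observation
rstud <- rstudent(fit)    # studentized residuals
p     <- length(coef(fit))  # number of coefficients, intercept included

high_leverage <- which(lev > 3 * p / n)   # average leverage is p / n
outliers      <- which(abs(rstud) > 3)

high_leverage
outliers
```

The same calls, `hatvalues()` and `rstudent()`, could be applied to `autolm1` to list the observations behind the diagnostic plots.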

  5. MLR with interactions. Starting from a linear model with all two-way interactions, backward stepwise selection by AIC, using step(), produced a reduced model. Terms with a p-value greater than 0.1 were then removed from the step model to simplify interpretation. The resulting model has an R-squared of 88%, a clear improvement over the 82% of the model without interactions. Note that the main effects for ‘year’ and ‘cylinders’ dropped out of the model. Cylinders is most likely redundant given its high correlation with displacement (0.95), and year now enters only through its interaction terms.
autolm2 = lm(mpg ~ (cylinders+displacement+horsepower+weight+acceleration+year+origin)^2, data = auto)
automodel = step(autolm2)
## Start:  AIC=804.99
## mpg ~ (cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin)^2
## 
##                             Df Sum of Sq    RSS    AIC
## - horsepower:origin          1     0.042 2635.6 803.00
## - displacement:horsepower    1     0.629 2636.2 803.09
## - weight:origin              1     0.961 2636.5 803.14
## - cylinders:weight           1     1.157 2636.7 803.16
## - cylinders:horsepower       1     1.672 2637.2 803.24
## - cylinders:displacement     1     1.995 2637.6 803.29
## - horsepower:weight          1     3.291 2638.9 803.48
## - cylinders:origin           1     4.839 2640.4 803.71
## - weight:acceleration        1     7.631 2643.2 804.13
## - displacement:acceleration  1     7.869 2643.4 804.16
## - weight:year                1     8.092 2643.7 804.19
## - displacement:origin        1    11.020 2646.6 804.63
## <none>                                   2635.6 804.99
## - horsepower:year            1    15.950 2651.5 805.36
## - cylinders:acceleration     1    20.241 2655.8 805.99
## - displacement:weight        1    20.542 2656.1 806.04
## - cylinders:year             1    23.329 2658.9 806.45
## - year:origin                1    25.719 2661.3 806.80
## - horsepower:acceleration    1    27.303 2662.9 807.03
## - acceleration:year          1    34.324 2669.9 808.06
## - displacement:year          1    44.728 2680.3 809.59
## - acceleration:origin        1    62.142 2697.7 812.13
## 
## Step:  AIC=803
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:displacement + cylinders:horsepower + 
##     cylinders:weight + cylinders:acceleration + cylinders:year + 
##     cylinders:origin + displacement:horsepower + displacement:weight + 
##     displacement:acceleration + displacement:year + displacement:origin + 
##     horsepower:weight + horsepower:acceleration + horsepower:year + 
##     weight:acceleration + weight:year + weight:origin + acceleration:year + 
##     acceleration:origin + year:origin
## 
##                             Df Sum of Sq    RSS    AIC
## - displacement:horsepower    1     0.697 2636.3 801.10
## - weight:origin              1     1.081 2636.7 801.16
## - cylinders:weight           1     1.200 2636.8 801.18
## - cylinders:horsepower       1     1.637 2637.2 801.24
## - cylinders:displacement     1     2.006 2637.6 801.30
## - horsepower:weight          1     3.448 2639.1 801.51
## - cylinders:origin           1     5.195 2640.8 801.77
## - weight:acceleration        1     7.604 2643.2 802.13
## - displacement:acceleration  1     7.860 2643.5 802.17
## - weight:year                1     8.077 2643.7 802.20
## - displacement:origin        1    10.981 2646.6 802.63
## <none>                                   2635.6 803.00
## - horsepower:year            1    16.183 2651.8 803.40
## - cylinders:acceleration     1    20.302 2655.9 804.01
## - displacement:weight        1    21.220 2656.8 804.14
## - cylinders:year             1    23.341 2659.0 804.46
## - horsepower:acceleration    1    28.176 2663.8 805.17
## - year:origin                1    29.336 2664.9 805.34
## - acceleration:year          1    34.968 2670.6 806.17
## - displacement:year          1    44.704 2680.3 807.59
## - acceleration:origin        1    94.035 2729.7 814.74
## 
## Step:  AIC=801.1
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:displacement + cylinders:horsepower + 
##     cylinders:weight + cylinders:acceleration + cylinders:year + 
##     cylinders:origin + displacement:weight + displacement:acceleration + 
##     displacement:year + displacement:origin + horsepower:weight + 
##     horsepower:acceleration + horsepower:year + weight:acceleration + 
##     weight:year + weight:origin + acceleration:year + acceleration:origin + 
##     year:origin
## 
##                             Df Sum of Sq    RSS    AIC
## - weight:origin              1     0.929 2637.2 799.24
## - cylinders:horsepower       1     0.939 2637.2 799.24
## - cylinders:displacement     1     2.246 2638.6 799.44
## - horsepower:weight          1     3.668 2640.0 799.65
## - cylinders:weight           1     4.007 2640.3 799.70
## - cylinders:origin           1     5.338 2641.7 799.90
## - weight:acceleration        1     7.060 2643.4 800.15
## - displacement:acceleration  1     7.868 2644.2 800.27
## - weight:year                1    10.386 2646.7 800.64
## - displacement:origin        1    10.696 2647.0 800.69
## <none>                                   2636.3 801.10
## - horsepower:year            1    15.548 2651.9 801.41
## - displacement:weight        1    24.949 2661.3 802.80
## - cylinders:acceleration     1    25.880 2662.2 802.93
## - horsepower:acceleration    1    27.483 2663.8 803.17
## - cylinders:year             1    27.649 2664.0 803.19
## - year:origin                1    29.006 2665.3 803.39
## - acceleration:year          1    37.662 2674.0 804.66
## - displacement:year          1    53.216 2689.5 806.94
## - acceleration:origin        1    94.783 2731.1 812.95
## 
## Step:  AIC=799.24
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:displacement + cylinders:horsepower + 
##     cylinders:weight + cylinders:acceleration + cylinders:year + 
##     cylinders:origin + displacement:weight + displacement:acceleration + 
##     displacement:year + displacement:origin + horsepower:weight + 
##     horsepower:acceleration + horsepower:year + weight:acceleration + 
##     weight:year + acceleration:year + acceleration:origin + year:origin
## 
##                             Df Sum of Sq    RSS    AIC
## - cylinders:horsepower       1     1.472 2638.7 797.46
## - horsepower:weight          1     4.477 2641.7 797.91
## - cylinders:origin           1     5.343 2642.6 798.03
## - cylinders:weight           1     6.723 2644.0 798.24
## - cylinders:displacement     1     6.989 2644.2 798.28
## - weight:acceleration        1     7.380 2644.6 798.34
## - displacement:acceleration  1     8.579 2645.8 798.51
## - weight:year                1    10.656 2647.9 798.82
## <none>                                   2637.2 799.24
## - horsepower:year            1    15.247 2652.5 799.50
## - displacement:origin        1    16.471 2653.7 799.68
## - cylinders:acceleration     1    25.560 2662.8 801.02
## - horsepower:acceleration    1    26.564 2663.8 801.17
## - displacement:weight        1    27.727 2665.0 801.34
## - year:origin                1    28.209 2665.4 801.41
## - cylinders:year             1    29.544 2666.8 801.61
## - acceleration:year          1    40.115 2677.4 803.16
## - displacement:year          1    55.417 2692.7 805.39
## - acceleration:origin        1    96.200 2733.4 811.29
## 
## Step:  AIC=797.46
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:displacement + cylinders:weight + 
##     cylinders:acceleration + cylinders:year + cylinders:origin + 
##     displacement:weight + displacement:acceleration + displacement:year + 
##     displacement:origin + horsepower:weight + horsepower:acceleration + 
##     horsepower:year + weight:acceleration + weight:year + acceleration:year + 
##     acceleration:origin + year:origin
## 
##                             Df Sum of Sq    RSS    AIC
## - horsepower:weight          1     3.832 2642.6 796.03
## - cylinders:displacement     1     5.631 2644.3 796.30
## - cylinders:origin           1     7.132 2645.8 796.52
## - displacement:acceleration  1     8.065 2646.8 796.66
## - weight:year                1     9.360 2648.1 796.85
## - cylinders:weight           1    11.708 2650.4 797.20
## <none>                                   2638.7 797.46
## - weight:acceleration        1    13.909 2652.6 797.52
## - displacement:origin        1    15.690 2654.4 797.78
## - horsepower:year            1    19.966 2658.7 798.41
## - cylinders:acceleration     1    26.541 2665.2 799.38
## - year:origin                1    28.859 2667.6 799.72
## - cylinders:year             1    29.724 2668.4 799.85
## - displacement:weight        1    32.488 2671.2 800.26
## - horsepower:acceleration    1    33.940 2672.7 800.47
## - acceleration:year          1    39.192 2677.9 801.24
## - displacement:year          1    54.491 2693.2 803.47
## - acceleration:origin        1   100.968 2739.7 810.18
## 
## Step:  AIC=796.03
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:displacement + cylinders:weight + 
##     cylinders:acceleration + cylinders:year + cylinders:origin + 
##     displacement:weight + displacement:acceleration + displacement:year + 
##     displacement:origin + horsepower:acceleration + horsepower:year + 
##     weight:acceleration + weight:year + acceleration:year + acceleration:origin + 
##     year:origin
## 
##                             Df Sum of Sq    RSS    AIC
## - cylinders:displacement     1     3.294 2645.8 794.52
## - cylinders:origin           1     7.262 2649.8 795.10
## - displacement:acceleration  1     7.521 2650.1 795.14
## - weight:year                1     9.152 2651.7 795.38
## - cylinders:weight           1    10.084 2652.6 795.52
## <none>                                   2642.6 796.03
## - weight:acceleration        1    14.852 2657.4 796.23
## - displacement:origin        1    15.789 2658.3 796.36
## - horsepower:year            1    16.286 2658.8 796.44
## - cylinders:acceleration     1    25.474 2668.0 797.79
## - year:origin                1    26.094 2668.6 797.88
## - cylinders:year             1    26.913 2669.5 798.00
## - horsepower:acceleration    1    30.557 2673.1 798.54
## - displacement:weight        1    32.449 2675.0 798.81
## - displacement:year          1    50.719 2693.3 801.48
## - acceleration:year          1    51.787 2694.3 801.64
## - acceleration:origin        1    98.788 2741.3 808.42
## 
## Step:  AIC=794.52
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:weight + cylinders:acceleration + 
##     cylinders:year + cylinders:origin + displacement:weight + 
##     displacement:acceleration + displacement:year + displacement:origin + 
##     horsepower:acceleration + horsepower:year + weight:acceleration + 
##     weight:year + acceleration:year + acceleration:origin + year:origin
## 
##                             Df Sum of Sq    RSS    AIC
## - displacement:acceleration  1     5.515 2651.3 793.33
## - cylinders:weight           1     6.798 2652.6 793.52
## - weight:year                1    10.364 2656.2 794.05
## - cylinders:origin           1    10.686 2656.5 794.10
## - weight:acceleration        1    12.002 2657.8 794.29
## <none>                                   2645.8 794.52
## - displacement:origin        1    15.944 2661.8 794.87
## - horsepower:year            1    16.149 2662.0 794.90
## - year:origin                1    27.133 2673.0 796.52
## - cylinders:acceleration     1    27.218 2673.1 796.53
## - cylinders:year             1    27.556 2673.4 796.58
## - horsepower:acceleration    1    30.063 2675.9 796.95
## - displacement:weight        1    32.157 2678.0 797.25
## - acceleration:year          1    49.390 2695.2 799.77
## - displacement:year          1    51.412 2697.2 800.06
## - acceleration:origin        1   108.293 2754.1 808.24
## 
## Step:  AIC=793.33
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:weight + cylinders:acceleration + 
##     cylinders:year + cylinders:origin + displacement:weight + 
##     displacement:year + displacement:origin + horsepower:acceleration + 
##     horsepower:year + weight:acceleration + weight:year + acceleration:year + 
##     acceleration:origin + year:origin
## 
##                           Df Sum of Sq    RSS    AIC
## - cylinders:weight         1     6.213 2657.6 792.25
## - weight:year              1     6.383 2657.7 792.28
## - weight:acceleration      1     7.291 2658.7 792.41
## - cylinders:origin         1     9.589 2660.9 792.75
## <none>                                 2651.3 793.33
## - displacement:origin      1    17.045 2668.4 793.85
## - horsepower:year          1    19.567 2670.9 794.22
## - cylinders:year           1    22.565 2673.9 794.66
## - year:origin              1    24.170 2675.5 794.89
## - cylinders:acceleration   1    25.823 2677.2 795.13
## - displacement:weight      1    36.241 2687.6 796.66
## - horsepower:acceleration  1    44.515 2695.9 797.86
## - acceleration:year        1    47.132 2698.5 798.24
## - displacement:year        1    47.429 2698.8 798.28
## - acceleration:origin      1   137.836 2789.2 811.20
## 
## Step:  AIC=792.25
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:acceleration + cylinders:year + 
##     cylinders:origin + displacement:weight + displacement:year + 
##     displacement:origin + horsepower:acceleration + horsepower:year + 
##     weight:acceleration + weight:year + acceleration:year + acceleration:origin + 
##     year:origin
## 
##                           Df Sum of Sq    RSS    AIC
## - weight:year              1      5.27 2662.8 791.03
## - cylinders:origin         1      5.65 2663.2 791.08
## - weight:acceleration      1      6.32 2663.9 791.18
## <none>                                 2657.6 792.25
## - displacement:origin      1     21.20 2678.8 793.37
## - cylinders:acceleration   1     23.99 2681.6 793.77
## - horsepower:year          1     24.67 2682.2 793.87
## - year:origin              1     25.72 2683.3 794.03
## - cylinders:year           1     29.70 2687.3 794.61
## - horsepower:acceleration  1     39.28 2696.8 796.00
## - acceleration:year        1     43.81 2701.4 796.66
## - displacement:year        1     55.67 2713.2 798.38
## - acceleration:origin      1    134.47 2792.0 809.60
## - displacement:weight      1    323.15 2980.7 835.23
## 
## Step:  AIC=791.03
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:acceleration + cylinders:year + 
##     cylinders:origin + displacement:weight + displacement:year + 
##     displacement:origin + horsepower:acceleration + horsepower:year + 
##     weight:acceleration + acceleration:year + acceleration:origin + 
##     year:origin
## 
##                           Df Sum of Sq    RSS    AIC
## - weight:acceleration      1      3.78 2666.6 789.58
## - cylinders:origin         1      4.74 2667.6 789.72
## <none>                                 2662.8 791.03
## - displacement:origin      1     22.01 2684.8 792.25
## - cylinders:acceleration   1     25.80 2688.6 792.81
## - year:origin              1     29.76 2692.6 793.38
## - cylinders:year           1     31.10 2693.9 793.58
## - horsepower:acceleration  1     34.06 2696.9 794.01
## - acceleration:year        1     38.59 2701.4 794.67
## - horsepower:year          1     48.14 2711.0 796.05
## - displacement:year        1     51.56 2714.4 796.54
## - acceleration:origin      1    132.70 2795.5 808.09
## - displacement:weight      1    358.76 3021.6 838.57
## 
## Step:  AIC=789.58
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:acceleration + cylinders:year + 
##     cylinders:origin + displacement:weight + displacement:year + 
##     displacement:origin + horsepower:acceleration + horsepower:year + 
##     acceleration:year + acceleration:origin + year:origin
## 
##                           Df Sum of Sq    RSS    AIC
## - cylinders:origin         1      4.22 2670.8 788.20
## <none>                                 2666.6 789.58
## - displacement:origin      1     23.34 2689.9 791.00
## - year:origin              1     29.16 2695.8 791.85
## - cylinders:year           1     30.17 2696.8 791.99
## - horsepower:acceleration  1     30.30 2696.9 792.01
## - acceleration:year        1     39.49 2706.1 793.35
## - cylinders:acceleration   1     42.93 2709.5 793.84
## - displacement:year        1     52.39 2719.0 795.21
## - horsepower:year          1     55.84 2722.4 795.71
## - acceleration:origin      1    129.75 2796.4 806.21
## - displacement:weight      1    466.55 3133.2 850.79
## 
## Step:  AIC=788.2
## mpg ~ cylinders + displacement + horsepower + weight + acceleration + 
##     year + origin + cylinders:acceleration + cylinders:year + 
##     displacement:weight + displacement:year + displacement:origin + 
##     horsepower:acceleration + horsepower:year + acceleration:year + 
##     acceleration:origin + year:origin
## 
##                           Df Sum of Sq    RSS    AIC
## <none>                                 2670.8 788.20
## - year:origin              1     26.71 2697.5 790.10
## - cylinders:year           1     27.00 2697.8 790.14
## - horsepower:acceleration  1     30.30 2701.1 790.62
## - acceleration:year        1     37.57 2708.4 791.68
## - cylinders:acceleration   1     42.66 2713.5 792.41
## - displacement:year        1     48.20 2719.0 793.21
## - horsepower:year          1     53.54 2724.4 793.98
## - displacement:origin      1     86.56 2757.4 798.70
## - acceleration:origin      1    133.55 2804.4 805.33
## - displacement:weight      1    496.55 3167.4 853.04
automodel$coefficients
##             (Intercept)               cylinders            displacement 
##            2.829154e+01            9.862727e+00           -4.291861e-01 
##              horsepower                  weight            acceleration 
##            6.120519e-01           -9.522978e-03           -4.695130e+00 
##                    year                  origin  cylinders:acceleration 
##            6.621645e-01           -1.985448e+01            1.686669e-01 
##          cylinders:year     displacement:weight       displacement:year 
##           -1.590265e-01            2.218471e-05            4.228778e-03 
##     displacement:origin horsepower:acceleration         horsepower:year 
##            2.762865e-02           -5.467217e-03           -7.806115e-03 
##       acceleration:year     acceleration:origin             year:origin 
##            4.601257e-02            4.766509e-01            1.234407e-01
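
The AIC values in the trace above can be reproduced from the printed RSS. For linear models, step() relies on extractAIC(), which computes n * log(RSS / n) + 2 * edf, where edf is the number of estimated coefficients. The sketch below checks the first and last steps; the helper name `step_aic` is introduced here for illustration, and the results should agree with the trace only up to the rounding of the printed RSS.

```r
# AIC as reported by step() for linear models: n * log(RSS / n) + 2 * edf,
# where edf counts the estimated coefficients (including the intercept)
step_aic <- function(rss, n, edf) n * log(rss / n) + 2 * edf

n <- 392  # 384 residual df + 8 coefficients in the main-effects model

# Full two-way interaction model: intercept + 7 main effects + 21 interactions
aic_start <- step_aic(rss = 2635.6, n = n, edf = 29)

# Final step model: intercept + 7 main effects + 10 interactions
aic_final <- step_aic(rss = 2670.8, n = n, edf = 18)

round(c(start = aic_start, final = aic_final), 2)
```

This makes the trade-off visible: the final model has a slightly larger RSS, but dropping 11 terms lowers the penalty by 22, so its AIC is smaller.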
autolm3 = lm(mpg ~ displacement + horsepower + weight + acceleration + origin + displacement:weight + displacement:year + displacement:origin + horsepower:year + acceleration:year + acceleration:origin + year:origin, data = auto)
summary(autolm3)
## 
## Call:
## lm(formula = mpg ~ displacement + horsepower + weight + acceleration + 
##     origin + displacement:weight + displacement:year + displacement:origin + 
##     horsepower:year + acceleration:year + acceleration:origin + 
##     year:origin, data = auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.5747 -1.6133 -0.0176  1.3028 10.9087 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          7.127e+01  3.789e+00  18.812  < 2e-16 ***
## displacement        -2.925e-01  7.891e-02  -3.706 0.000242 ***
## horsepower           6.034e-01  1.735e-01   3.478 0.000564 ***
## weight              -9.218e-03  7.725e-04 -11.932  < 2e-16 ***
## acceleration        -5.095e+00  5.973e-01  -8.529 3.57e-16 ***
## origin              -2.087e+01  4.815e+00  -4.334 1.88e-05 ***
## displacement:weight  2.148e-05  2.332e-06   9.210  < 2e-16 ***
## displacement:year    2.572e-03  1.010e-03   2.546 0.011301 *  
## displacement:origin  2.368e-02  7.592e-03   3.119 0.001951 ** 
## horsepower:year     -8.799e-03  2.337e-03  -3.766 0.000193 ***
## acceleration:year    5.742e-02  7.668e-03   7.488 4.94e-13 ***
## acceleration:origin  4.054e-01  9.067e-02   4.471 1.03e-05 ***
## origin:year          1.569e-01  5.612e-02   2.796 0.005434 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.69 on 379 degrees of freedom
## Multiple R-squared:  0.8849, Adjusted R-squared:  0.8813 
## F-statistic: 242.8 on 12 and 379 DF,  p-value: < 2.2e-16
  6. Predictor Transformation. Applying log, square-root, and square transformations to the remaining variables gave negligible improvement over the existing model. For example, replacing weight with log(weight) raises R-squared only from 0.8849 to 0.8856. It is possible that the interaction terms already capture some of this nonlinearity.
autolm4 = lm(mpg ~ displacement + horsepower + I(log(weight)) + acceleration + origin + displacement:weight + displacement:year + displacement:origin + horsepower:year + acceleration:year + acceleration:origin + year:origin, data = auto)
summary(autolm4)
## 
## Call:
## lm(formula = mpg ~ displacement + horsepower + I(log(weight)) + 
##     acceleration + origin + displacement:weight + displacement:year + 
##     displacement:origin + horsepower:year + acceleration:year + 
##     acceleration:origin + year:origin, data = auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.7073 -1.5315 -0.0119  1.2626 11.1870 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          2.179e+02  1.353e+01  16.109  < 2e-16 ***
## displacement        -2.444e-01  7.862e-02  -3.109 0.002022 ** 
## horsepower           5.853e-01  1.731e-01   3.381 0.000798 ***
## I(log(weight))      -2.184e+01  1.810e+00 -12.065  < 2e-16 ***
## acceleration        -5.422e+00  5.901e-01  -9.187  < 2e-16 ***
## origin              -1.914e+01  4.804e+00  -3.985 8.10e-05 ***
## displacement:weight  1.154e-05  1.860e-06   6.204 1.44e-09 ***
## displacement:year    2.319e-03  1.008e-03   2.300 0.022013 *  
## displacement:origin  2.635e-02  7.649e-03   3.444 0.000636 ***
## horsepower:year     -8.505e-03  2.333e-03  -3.646 0.000303 ***
## acceleration:year    6.166e-02  7.612e-03   8.101 7.55e-15 ***
## acceleration:origin  4.038e-01  9.040e-02   4.467 1.05e-05 ***
## origin:year          1.311e-01  5.606e-02   2.338 0.019918 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.681 on 379 degrees of freedom
## Multiple R-squared:  0.8856, Adjusted R-squared:  0.882 
## F-statistic: 244.5 on 12 and 379 DF,  p-value: < 2.2e-16

Q10

  1. MLR with Select Predictors.
carseats = Carseats
attach(carseats)
carseatslm1 = lm(Sales ~ Price + Urban + US)
summary(carseatslm1)
## 
## Call:
## lm(formula = Sales ~ Price + Urban + US)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9206 -1.6220 -0.0564  1.5786  7.0581 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.043469   0.651012  20.036  < 2e-16 ***
## Price       -0.054459   0.005242 -10.389  < 2e-16 ***
## UrbanYes    -0.021916   0.271650  -0.081    0.936    
## USYes        1.200573   0.259042   4.635 4.86e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2335 
## F-statistic: 41.52 on 3 and 396 DF,  p-value: < 2.2e-16
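  2. Coefficient Interpretation. The Price coefficient is small and negative: holding the other predictors fixed, each one-dollar increase in price is associated with a decrease in sales of 0.054 thousand units (about 54 units). A store in an urban area would be expected to sell 0.022 thousand fewer units than a non-urban store, although this effect is not statistically significant (p = 0.936). A store located in the US would be expected to sell about 1.2 thousand more units than a store outside the US.
  3. Sales (thousands) = 13.04 - 0.054 * Price - 0.022 * Urban + 1.20 * US, where Urban and US equal 1 if the store is urban or US-located, respectively, and 0 otherwise.
  4. We can reject the null hypothesis of a zero coefficient for Price and US. The Urban predictor has a p-value of 0.936, so there is no evidence that it is related to Sales, and we can remove it.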

  5. Reduced MLR.

carseatslm2 = lm(Sales ~ Price + US)
summary(carseatslm2)
## 
## Call:
## lm(formula = Sales ~ Price + US)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9269 -1.6286 -0.0574  1.5766  7.0515 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.03079    0.63098  20.652  < 2e-16 ***
## Price       -0.05448    0.00523 -10.416  < 2e-16 ***
## USYes        1.19964    0.25846   4.641 4.71e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.469 on 397 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2354 
## F-statistic: 62.43 on 2 and 397 DF,  p-value: < 2.2e-16
  6. With adjusted R-squared values of 0.234 and 0.235 (multiple R-squared 0.239 for both), the models explain only about 24% of the variance in carseat sales at a given location. Neither model fits the data well, so we do not have much predictive power for sales; the reduced model does essentially as well with one fewer predictor.

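  7. Confidence Intervals (95%, reduced model, computed as estimate plus or minus the t quantile on 397 degrees of freedom times the standard error). Price: (-0.0648, -0.0442) US Located: (0.692, 1.708) Intercept: (11.790, 14.271)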

  8. Outliers and Leverage. The diagnostic plots flag observations 51, 69, and 377 as potential outliers and observations 26, 50, and 368 as high-leverage points. However, I would keep them in the model because none of them is extreme. I suspect that if we removed them, new points would simply take their place as the most outlying and highest-leverage observations.

par(mfrow=c(2,2))
plot(carseatslm2)
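
A 95% confidence interval for each coefficient is the estimate plus or minus the 0.975 quantile of the t distribution (on the residual degrees of freedom) times the standard error. The sketch below recomputes the intervals for the reduced model from the numbers printed by summary(carseatslm2); the names `est`, `se`, `t_crit`, and `ci` are introduced here for illustration.

```r
# Estimates and standard errors copied from summary(carseatslm2)
est <- c(Intercept = 13.03079, Price = -0.05448, USYes = 1.19964)
se  <- c(Intercept = 0.63098,  Price = 0.00523,  USYes = 0.25846)
df_resid <- 397

# 95% interval: estimate +/- t quantile * standard error
t_crit <- qt(0.975, df = df_resid)
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
round(ci, 4)
```

With the fitted object available, `confint(carseatslm2)` should give the same intervals up to rounding of the copied inputs. An interval of plus or minus one standard error is much narrower and covers only about 68%.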

Q12

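  1. For a regression without an intercept, the coefficient estimate for y onto x is sum(x*y)/sum(x^2) and the estimate for x onto y is sum(x*y)/sum(y^2). The two estimates are therefore identical when sum(x^2) = sum(y^2), that is, when x and y have the same sum of squares. In the first example below the sums of squares differ and the estimates differ (3.99 versus 0.23); in the second, y is a reordering of x, so the estimates coincide (0.57).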

set.seed(1)
x = rnorm(100)
y = 4 * x + rnorm(100)

test = lm(y~x+0)
summary(test)
## 
## Call:
## lm(formula = y ~ x + 0)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.9154 -0.6472 -0.1771  0.5056  2.3109 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## x   3.9939     0.1065   37.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9586 on 99 degrees of freedom
## Multiple R-squared:  0.9343, Adjusted R-squared:  0.9336 
## F-statistic:  1407 on 1 and 99 DF,  p-value: < 2.2e-16
test2 = lm(x~y+0)
summary(test2)
## 
## Call:
## lm(formula = x ~ y + 0)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.50931 -0.10863  0.05499  0.14436  0.44044 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## y 0.233923   0.006236   37.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.232 on 99 degrees of freedom
## Multiple R-squared:  0.9343, Adjusted R-squared:  0.9336 
## F-statistic:  1407 on 1 and 99 DF,  p-value: < 2.2e-16
plot(x,y)
abline(test)
abline(test2, col= 'red', lty = 'dashed')

x = 1:10
y = 10:1

test = lm(y~x+0)
summary(test)
## 
## Call:
## lm(formula = y ~ x + 0)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.714 -1.179  2.357  5.893  9.429 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)  
## x   0.5714     0.2736   2.089   0.0663 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.367 on 9 degrees of freedom
## Multiple R-squared:  0.3265, Adjusted R-squared:  0.2517 
## F-statistic: 4.364 on 1 and 9 DF,  p-value: 0.0663
test2 = lm(x~y+0)
summary(test2)
## 
## Call:
## lm(formula = x ~ y + 0)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.714 -1.179  2.357  5.893  9.429 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)  
## y   0.5714     0.2736   2.089   0.0663 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.367 on 9 degrees of freedom
## Multiple R-squared:  0.3265, Adjusted R-squared:  0.2517 
## F-statistic: 4.364 on 1 and 9 DF,  p-value: 0.0663
plot(x,y)
abline(test)
abline(test2, col= 'red', lty = 'dashed')
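
The two fits above can be reproduced directly from the closed-form slope of a regression through the origin, sum(predictor * response) / sum(predictor^2). The sketch below is a minimal check of when the two directions agree; the helper name `slope_no_intercept` and the suffixed variable names are introduced here for illustration.

```r
# Least-squares slope for a regression through the origin
# of `response` on `predictor`
slope_no_intercept <- function(response, predictor) {
  sum(predictor * response) / sum(predictor^2)
}

# Case 1: sums of squares differ, so the two slopes differ
set.seed(1)
x1 <- rnorm(100)
y1 <- 4 * x1 + rnorm(100)
c(y_on_x = slope_no_intercept(y1, x1), x_on_y = slope_no_intercept(x1, y1))

# Case 2: y is a reordering of x, so sum(x^2) == sum(y^2) and the slopes match
x2 <- 1:10
y2 <- 10:1
c(y_on_x = slope_no_intercept(y2, x2), x_on_y = slope_no_intercept(x2, y2))
```

Both directions share the numerator sum(x * y), so only the denominators decide whether the estimates agree.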