KNN classifier and KNN regression models both determine the value at a given point by evaluating the values at the K nearest points. The difference is that a classifier estimates the probability that a point belongs to each class from the classes of its neighbors and predicts the most likely class, whereas a regression model simply averages the response values of the nearest neighbors. For example, if we were interested in the average home price in Bexar County, we could use either a KNN regression model or a KNN classification model. For regression, we would average the home prices in the surrounding counties and predict that value for Bexar County. For classification, we would assign a factor level (say 1-5, with 1 being the lowest value and 5 the highest) to each surrounding county, and then predict Bexar County’s factor level as the most common level among those counties.
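The contrast can be sketched in a few lines of base R. The data, the helper names (`nearest_k`, `knn_reg`, `knn_class`), and the "low"/"high" labels below are made up for illustration and are not part of the exercise; this is a minimal sketch, not a full KNN implementation.

```r
# Toy data (hypothetical): one predictor, a numeric response,
# and a class label derived from that response.
x_train <- c(1, 2, 3, 4, 5, 6, 7, 8)
price   <- c(100, 110, 120, 130, 200, 210, 220, 230)
level   <- ifelse(price < 150, "low", "high")

# Indices of the k training points closest to x0
nearest_k <- function(x0, k) order(abs(x_train - x0))[1:k]

# KNN regression: average the neighbours' responses
knn_reg <- function(x0, k = 3) mean(price[nearest_k(x0, k)])

# KNN classification: estimate class probabilities from the
# neighbours' labels and predict the most common one
knn_class <- function(x0, k = 3) {
  probs <- prop.table(table(level[nearest_k(x0, k)]))
  names(probs)[which.max(probs)]
}

knn_reg(4.6)    # averages the prices at x = 5, 4 and 6
knn_class(4.6)  # majority label among those three neighbours
```

Both functions share the same neighbour search; only the final step (mean versus majority vote) differs.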
library(ISLR)  # assumption: Auto and Carseats are loaded from the ISLR package
auto = Auto
attach(auto)
pairs(auto)
cor(auto[, 1:8])  # columns 1-8 are numeric; column 9 (name) is excluded
##                     mpg  cylinders displacement horsepower     weight
## mpg           1.0000000 -0.7776175   -0.8051269 -0.7784268 -0.8322442
## cylinders    -0.7776175  1.0000000    0.9508233  0.8429834  0.8975273
## displacement -0.8051269  0.9508233    1.0000000  0.8972570  0.9329944
## horsepower   -0.7784268  0.8429834    0.8972570  1.0000000  0.8645377
## weight       -0.8322442  0.8975273    0.9329944  0.8645377  1.0000000
## acceleration  0.4233285 -0.5046834   -0.5438005 -0.6891955 -0.4168392
## year          0.5805410 -0.3456474   -0.3698552 -0.4163615 -0.3091199
## origin        0.5652088 -0.5689316   -0.6145351 -0.4551715 -0.5850054
##              acceleration       year     origin
## mpg             0.4233285  0.5805410  0.5652088
## cylinders      -0.5046834 -0.3456474 -0.5689316
## displacement   -0.5438005 -0.3698552 -0.6145351
## horsepower     -0.6891955 -0.4163615 -0.4551715
## weight         -0.4168392 -0.3091199 -0.5850054
## acceleration    1.0000000  0.2903161  0.2127458
## year            0.2903161  1.0000000  0.1815277
## origin          0.2127458  0.1815277  1.0000000
autolm1 = lm(mpg ~. -name, data = auto)
summary(autolm1)
##
## Call:
## lm(formula = mpg ~ . - name, data = auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5903 -2.1565 -0.1169 1.8690 13.0604
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.218435 4.644294 -3.707 0.00024 ***
## cylinders -0.493376 0.323282 -1.526 0.12780
## displacement 0.019896 0.007515 2.647 0.00844 **
## horsepower -0.016951 0.013787 -1.230 0.21963
## weight -0.006474 0.000652 -9.929 < 2e-16 ***
## acceleration 0.080576 0.098845 0.815 0.41548
## year 0.750773 0.050973 14.729 < 2e-16 ***
## origin 1.426141 0.278136 5.127 4.67e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared: 0.8215, Adjusted R-squared: 0.8182
## F-statistic: 252.4 on 7 and 384 DF, p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(autolm1)
autolm2 = lm(mpg ~ (cylinders+displacement+horsepower+weight+acceleration+year+origin)^2, data = auto)
automodel = step(autolm2)
## Start: AIC=804.99
## mpg ~ (cylinders + displacement + horsepower + weight + acceleration +
## year + origin)^2
##
## Df Sum of Sq RSS AIC
## - horsepower:origin 1 0.042 2635.6 803.00
## - displacement:horsepower 1 0.629 2636.2 803.09
## - weight:origin 1 0.961 2636.5 803.14
## - cylinders:weight 1 1.157 2636.7 803.16
## - cylinders:horsepower 1 1.672 2637.2 803.24
## - cylinders:displacement 1 1.995 2637.6 803.29
## - horsepower:weight 1 3.291 2638.9 803.48
## - cylinders:origin 1 4.839 2640.4 803.71
## - weight:acceleration 1 7.631 2643.2 804.13
## - displacement:acceleration 1 7.869 2643.4 804.16
## - weight:year 1 8.092 2643.7 804.19
## - displacement:origin 1 11.020 2646.6 804.63
## <none> 2635.6 804.99
## - horsepower:year 1 15.950 2651.5 805.36
## - cylinders:acceleration 1 20.241 2655.8 805.99
## - displacement:weight 1 20.542 2656.1 806.04
## - cylinders:year 1 23.329 2658.9 806.45
## - year:origin 1 25.719 2661.3 806.80
## - horsepower:acceleration 1 27.303 2662.9 807.03
## - acceleration:year 1 34.324 2669.9 808.06
## - displacement:year 1 44.728 2680.3 809.59
## - acceleration:origin 1 62.142 2697.7 812.13
##
## Step: AIC=803
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:displacement + cylinders:horsepower +
## cylinders:weight + cylinders:acceleration + cylinders:year +
## cylinders:origin + displacement:horsepower + displacement:weight +
## displacement:acceleration + displacement:year + displacement:origin +
## horsepower:weight + horsepower:acceleration + horsepower:year +
## weight:acceleration + weight:year + weight:origin + acceleration:year +
## acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## - displacement:horsepower 1 0.697 2636.3 801.10
## - weight:origin 1 1.081 2636.7 801.16
## - cylinders:weight 1 1.200 2636.8 801.18
## - cylinders:horsepower 1 1.637 2637.2 801.24
## - cylinders:displacement 1 2.006 2637.6 801.30
## - horsepower:weight 1 3.448 2639.1 801.51
## - cylinders:origin 1 5.195 2640.8 801.77
## - weight:acceleration 1 7.604 2643.2 802.13
## - displacement:acceleration 1 7.860 2643.5 802.17
## - weight:year 1 8.077 2643.7 802.20
## - displacement:origin 1 10.981 2646.6 802.63
## <none> 2635.6 803.00
## - horsepower:year 1 16.183 2651.8 803.40
## - cylinders:acceleration 1 20.302 2655.9 804.01
## - displacement:weight 1 21.220 2656.8 804.14
## - cylinders:year 1 23.341 2659.0 804.46
## - horsepower:acceleration 1 28.176 2663.8 805.17
## - year:origin 1 29.336 2664.9 805.34
## - acceleration:year 1 34.968 2670.6 806.17
## - displacement:year 1 44.704 2680.3 807.59
## - acceleration:origin 1 94.035 2729.7 814.74
##
## Step: AIC=801.1
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:displacement + cylinders:horsepower +
## cylinders:weight + cylinders:acceleration + cylinders:year +
## cylinders:origin + displacement:weight + displacement:acceleration +
## displacement:year + displacement:origin + horsepower:weight +
## horsepower:acceleration + horsepower:year + weight:acceleration +
## weight:year + weight:origin + acceleration:year + acceleration:origin +
## year:origin
##
## Df Sum of Sq RSS AIC
## - weight:origin 1 0.929 2637.2 799.24
## - cylinders:horsepower 1 0.939 2637.2 799.24
## - cylinders:displacement 1 2.246 2638.6 799.44
## - horsepower:weight 1 3.668 2640.0 799.65
## - cylinders:weight 1 4.007 2640.3 799.70
## - cylinders:origin 1 5.338 2641.7 799.90
## - weight:acceleration 1 7.060 2643.4 800.15
## - displacement:acceleration 1 7.868 2644.2 800.27
## - weight:year 1 10.386 2646.7 800.64
## - displacement:origin 1 10.696 2647.0 800.69
## <none> 2636.3 801.10
## - horsepower:year 1 15.548 2651.9 801.41
## - displacement:weight 1 24.949 2661.3 802.80
## - cylinders:acceleration 1 25.880 2662.2 802.93
## - horsepower:acceleration 1 27.483 2663.8 803.17
## - cylinders:year 1 27.649 2664.0 803.19
## - year:origin 1 29.006 2665.3 803.39
## - acceleration:year 1 37.662 2674.0 804.66
## - displacement:year 1 53.216 2689.5 806.94
## - acceleration:origin 1 94.783 2731.1 812.95
##
## Step: AIC=799.24
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:displacement + cylinders:horsepower +
## cylinders:weight + cylinders:acceleration + cylinders:year +
## cylinders:origin + displacement:weight + displacement:acceleration +
## displacement:year + displacement:origin + horsepower:weight +
## horsepower:acceleration + horsepower:year + weight:acceleration +
## weight:year + acceleration:year + acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## - cylinders:horsepower 1 1.472 2638.7 797.46
## - horsepower:weight 1 4.477 2641.7 797.91
## - cylinders:origin 1 5.343 2642.6 798.03
## - cylinders:weight 1 6.723 2644.0 798.24
## - cylinders:displacement 1 6.989 2644.2 798.28
## - weight:acceleration 1 7.380 2644.6 798.34
## - displacement:acceleration 1 8.579 2645.8 798.51
## - weight:year 1 10.656 2647.9 798.82
## <none> 2637.2 799.24
## - horsepower:year 1 15.247 2652.5 799.50
## - displacement:origin 1 16.471 2653.7 799.68
## - cylinders:acceleration 1 25.560 2662.8 801.02
## - horsepower:acceleration 1 26.564 2663.8 801.17
## - displacement:weight 1 27.727 2665.0 801.34
## - year:origin 1 28.209 2665.4 801.41
## - cylinders:year 1 29.544 2666.8 801.61
## - acceleration:year 1 40.115 2677.4 803.16
## - displacement:year 1 55.417 2692.7 805.39
## - acceleration:origin 1 96.200 2733.4 811.29
##
## Step: AIC=797.46
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:displacement + cylinders:weight +
## cylinders:acceleration + cylinders:year + cylinders:origin +
## displacement:weight + displacement:acceleration + displacement:year +
## displacement:origin + horsepower:weight + horsepower:acceleration +
## horsepower:year + weight:acceleration + weight:year + acceleration:year +
## acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## - horsepower:weight 1 3.832 2642.6 796.03
## - cylinders:displacement 1 5.631 2644.3 796.30
## - cylinders:origin 1 7.132 2645.8 796.52
## - displacement:acceleration 1 8.065 2646.8 796.66
## - weight:year 1 9.360 2648.1 796.85
## - cylinders:weight 1 11.708 2650.4 797.20
## <none> 2638.7 797.46
## - weight:acceleration 1 13.909 2652.6 797.52
## - displacement:origin 1 15.690 2654.4 797.78
## - horsepower:year 1 19.966 2658.7 798.41
## - cylinders:acceleration 1 26.541 2665.2 799.38
## - year:origin 1 28.859 2667.6 799.72
## - cylinders:year 1 29.724 2668.4 799.85
## - displacement:weight 1 32.488 2671.2 800.26
## - horsepower:acceleration 1 33.940 2672.7 800.47
## - acceleration:year 1 39.192 2677.9 801.24
## - displacement:year 1 54.491 2693.2 803.47
## - acceleration:origin 1 100.968 2739.7 810.18
##
## Step: AIC=796.03
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:displacement + cylinders:weight +
## cylinders:acceleration + cylinders:year + cylinders:origin +
## displacement:weight + displacement:acceleration + displacement:year +
## displacement:origin + horsepower:acceleration + horsepower:year +
## weight:acceleration + weight:year + acceleration:year + acceleration:origin +
## year:origin
##
## Df Sum of Sq RSS AIC
## - cylinders:displacement 1 3.294 2645.8 794.52
## - cylinders:origin 1 7.262 2649.8 795.10
## - displacement:acceleration 1 7.521 2650.1 795.14
## - weight:year 1 9.152 2651.7 795.38
## - cylinders:weight 1 10.084 2652.6 795.52
## <none> 2642.6 796.03
## - weight:acceleration 1 14.852 2657.4 796.23
## - displacement:origin 1 15.789 2658.3 796.36
## - horsepower:year 1 16.286 2658.8 796.44
## - cylinders:acceleration 1 25.474 2668.0 797.79
## - year:origin 1 26.094 2668.6 797.88
## - cylinders:year 1 26.913 2669.5 798.00
## - horsepower:acceleration 1 30.557 2673.1 798.54
## - displacement:weight 1 32.449 2675.0 798.81
## - displacement:year 1 50.719 2693.3 801.48
## - acceleration:year 1 51.787 2694.3 801.64
## - acceleration:origin 1 98.788 2741.3 808.42
##
## Step: AIC=794.52
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:weight + cylinders:acceleration +
## cylinders:year + cylinders:origin + displacement:weight +
## displacement:acceleration + displacement:year + displacement:origin +
## horsepower:acceleration + horsepower:year + weight:acceleration +
## weight:year + acceleration:year + acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## - displacement:acceleration 1 5.515 2651.3 793.33
## - cylinders:weight 1 6.798 2652.6 793.52
## - weight:year 1 10.364 2656.2 794.05
## - cylinders:origin 1 10.686 2656.5 794.10
## - weight:acceleration 1 12.002 2657.8 794.29
## <none> 2645.8 794.52
## - displacement:origin 1 15.944 2661.8 794.87
## - horsepower:year 1 16.149 2662.0 794.90
## - year:origin 1 27.133 2673.0 796.52
## - cylinders:acceleration 1 27.218 2673.1 796.53
## - cylinders:year 1 27.556 2673.4 796.58
## - horsepower:acceleration 1 30.063 2675.9 796.95
## - displacement:weight 1 32.157 2678.0 797.25
## - acceleration:year 1 49.390 2695.2 799.77
## - displacement:year 1 51.412 2697.2 800.06
## - acceleration:origin 1 108.293 2754.1 808.24
##
## Step: AIC=793.33
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:weight + cylinders:acceleration +
## cylinders:year + cylinders:origin + displacement:weight +
## displacement:year + displacement:origin + horsepower:acceleration +
## horsepower:year + weight:acceleration + weight:year + acceleration:year +
## acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## - cylinders:weight 1 6.213 2657.6 792.25
## - weight:year 1 6.383 2657.7 792.28
## - weight:acceleration 1 7.291 2658.7 792.41
## - cylinders:origin 1 9.589 2660.9 792.75
## <none> 2651.3 793.33
## - displacement:origin 1 17.045 2668.4 793.85
## - horsepower:year 1 19.567 2670.9 794.22
## - cylinders:year 1 22.565 2673.9 794.66
## - year:origin 1 24.170 2675.5 794.89
## - cylinders:acceleration 1 25.823 2677.2 795.13
## - displacement:weight 1 36.241 2687.6 796.66
## - horsepower:acceleration 1 44.515 2695.9 797.86
## - acceleration:year 1 47.132 2698.5 798.24
## - displacement:year 1 47.429 2698.8 798.28
## - acceleration:origin 1 137.836 2789.2 811.20
##
## Step: AIC=792.25
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:acceleration + cylinders:year +
## cylinders:origin + displacement:weight + displacement:year +
## displacement:origin + horsepower:acceleration + horsepower:year +
## weight:acceleration + weight:year + acceleration:year + acceleration:origin +
## year:origin
##
## Df Sum of Sq RSS AIC
## - weight:year 1 5.27 2662.8 791.03
## - cylinders:origin 1 5.65 2663.2 791.08
## - weight:acceleration 1 6.32 2663.9 791.18
## <none> 2657.6 792.25
## - displacement:origin 1 21.20 2678.8 793.37
## - cylinders:acceleration 1 23.99 2681.6 793.77
## - horsepower:year 1 24.67 2682.2 793.87
## - year:origin 1 25.72 2683.3 794.03
## - cylinders:year 1 29.70 2687.3 794.61
## - horsepower:acceleration 1 39.28 2696.8 796.00
## - acceleration:year 1 43.81 2701.4 796.66
## - displacement:year 1 55.67 2713.2 798.38
## - acceleration:origin 1 134.47 2792.0 809.60
## - displacement:weight 1 323.15 2980.7 835.23
##
## Step: AIC=791.03
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:acceleration + cylinders:year +
## cylinders:origin + displacement:weight + displacement:year +
## displacement:origin + horsepower:acceleration + horsepower:year +
## weight:acceleration + acceleration:year + acceleration:origin +
## year:origin
##
## Df Sum of Sq RSS AIC
## - weight:acceleration 1 3.78 2666.6 789.58
## - cylinders:origin 1 4.74 2667.6 789.72
## <none> 2662.8 791.03
## - displacement:origin 1 22.01 2684.8 792.25
## - cylinders:acceleration 1 25.80 2688.6 792.81
## - year:origin 1 29.76 2692.6 793.38
## - cylinders:year 1 31.10 2693.9 793.58
## - horsepower:acceleration 1 34.06 2696.9 794.01
## - acceleration:year 1 38.59 2701.4 794.67
## - horsepower:year 1 48.14 2711.0 796.05
## - displacement:year 1 51.56 2714.4 796.54
## - acceleration:origin 1 132.70 2795.5 808.09
## - displacement:weight 1 358.76 3021.6 838.57
##
## Step: AIC=789.58
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:acceleration + cylinders:year +
## cylinders:origin + displacement:weight + displacement:year +
## displacement:origin + horsepower:acceleration + horsepower:year +
## acceleration:year + acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## - cylinders:origin 1 4.22 2670.8 788.20
## <none> 2666.6 789.58
## - displacement:origin 1 23.34 2689.9 791.00
## - year:origin 1 29.16 2695.8 791.85
## - cylinders:year 1 30.17 2696.8 791.99
## - horsepower:acceleration 1 30.30 2696.9 792.01
## - acceleration:year 1 39.49 2706.1 793.35
## - cylinders:acceleration 1 42.93 2709.5 793.84
## - displacement:year 1 52.39 2719.0 795.21
## - horsepower:year 1 55.84 2722.4 795.71
## - acceleration:origin 1 129.75 2796.4 806.21
## - displacement:weight 1 466.55 3133.2 850.79
##
## Step: AIC=788.2
## mpg ~ cylinders + displacement + horsepower + weight + acceleration +
## year + origin + cylinders:acceleration + cylinders:year +
## displacement:weight + displacement:year + displacement:origin +
## horsepower:acceleration + horsepower:year + acceleration:year +
## acceleration:origin + year:origin
##
## Df Sum of Sq RSS AIC
## <none> 2670.8 788.20
## - year:origin 1 26.71 2697.5 790.10
## - cylinders:year 1 27.00 2697.8 790.14
## - horsepower:acceleration 1 30.30 2701.1 790.62
## - acceleration:year 1 37.57 2708.4 791.68
## - cylinders:acceleration 1 42.66 2713.5 792.41
## - displacement:year 1 48.20 2719.0 793.21
## - horsepower:year 1 53.54 2724.4 793.98
## - displacement:origin 1 86.56 2757.4 798.70
## - acceleration:origin 1 133.55 2804.4 805.33
## - displacement:weight 1 496.55 3167.4 853.04
automodel$coefficients
## (Intercept) cylinders displacement
## 2.829154e+01 9.862727e+00 -4.291861e-01
## horsepower weight acceleration
## 6.120519e-01 -9.522978e-03 -4.695130e+00
## year origin cylinders:acceleration
## 6.621645e-01 -1.985448e+01 1.686669e-01
## cylinders:year displacement:weight displacement:year
## -1.590265e-01 2.218471e-05 4.228778e-03
## displacement:origin horsepower:acceleration horsepower:year
## 2.762865e-02 -5.467217e-03 -7.806115e-03
## acceleration:year acceleration:origin year:origin
## 4.601257e-02 4.766509e-01 1.234407e-01
autolm3 = lm(mpg ~ displacement + horsepower + weight + acceleration + origin + displacement:weight + displacement:year + displacement:origin + horsepower:year + acceleration:year + acceleration:origin + year:origin, data = auto)
summary(autolm3)
##
## Call:
## lm(formula = mpg ~ displacement + horsepower + weight + acceleration +
## origin + displacement:weight + displacement:year + displacement:origin +
## horsepower:year + acceleration:year + acceleration:origin +
## year:origin, data = auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.5747 -1.6133 -0.0176 1.3028 10.9087
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.127e+01 3.789e+00 18.812 < 2e-16 ***
## displacement -2.925e-01 7.891e-02 -3.706 0.000242 ***
## horsepower 6.034e-01 1.735e-01 3.478 0.000564 ***
## weight -9.218e-03 7.725e-04 -11.932 < 2e-16 ***
## acceleration -5.095e+00 5.973e-01 -8.529 3.57e-16 ***
## origin -2.087e+01 4.815e+00 -4.334 1.88e-05 ***
## displacement:weight 2.148e-05 2.332e-06 9.210 < 2e-16 ***
## displacement:year 2.572e-03 1.010e-03 2.546 0.011301 *
## displacement:origin 2.368e-02 7.592e-03 3.119 0.001951 **
## horsepower:year -8.799e-03 2.337e-03 -3.766 0.000193 ***
## acceleration:year 5.742e-02 7.668e-03 7.488 4.94e-13 ***
## acceleration:origin 4.054e-01 9.067e-02 4.471 1.03e-05 ***
## origin:year 1.569e-01 5.612e-02 2.796 0.005434 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.69 on 379 degrees of freedom
## Multiple R-squared: 0.8849, Adjusted R-squared: 0.8813
## F-statistic: 242.8 on 12 and 379 DF, p-value: < 2.2e-16
autolm4 = lm(mpg ~ displacement + horsepower + I(log(weight)) + acceleration + origin + displacement:weight + displacement:year + displacement:origin + horsepower:year + acceleration:year + acceleration:origin + year:origin, data = auto)
summary(autolm4)
##
## Call:
## lm(formula = mpg ~ displacement + horsepower + I(log(weight)) +
## acceleration + origin + displacement:weight + displacement:year +
## displacement:origin + horsepower:year + acceleration:year +
## acceleration:origin + year:origin, data = auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.7073 -1.5315 -0.0119 1.2626 11.1870
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.179e+02 1.353e+01 16.109 < 2e-16 ***
## displacement -2.444e-01 7.862e-02 -3.109 0.002022 **
## horsepower 5.853e-01 1.731e-01 3.381 0.000798 ***
## I(log(weight)) -2.184e+01 1.810e+00 -12.065 < 2e-16 ***
## acceleration -5.422e+00 5.901e-01 -9.187 < 2e-16 ***
## origin -1.914e+01 4.804e+00 -3.985 8.10e-05 ***
## displacement:weight 1.154e-05 1.860e-06 6.204 1.44e-09 ***
## displacement:year 2.319e-03 1.008e-03 2.300 0.022013 *
## displacement:origin 2.635e-02 7.649e-03 3.444 0.000636 ***
## horsepower:year -8.505e-03 2.333e-03 -3.646 0.000303 ***
## acceleration:year 6.166e-02 7.612e-03 8.101 7.55e-15 ***
## acceleration:origin 4.038e-01 9.040e-02 4.467 1.05e-05 ***
## origin:year 1.311e-01 5.606e-02 2.338 0.019918 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.681 on 379 degrees of freedom
## Multiple R-squared: 0.8856, Adjusted R-squared: 0.882
## F-statistic: 244.5 on 12 and 379 DF, p-value: < 2.2e-16
carseats = Carseats
attach(carseats)
carseatslm1 = lm(Sales ~ Price + Urban + US)
summary(carseatslm1)
##
## Call:
## lm(formula = Sales ~ Price + Urban + US)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.9206 -1.6220 -0.0564 1.5786 7.0581
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.043469 0.651012 20.036 < 2e-16 ***
## Price -0.054459 0.005242 -10.389 < 2e-16 ***
## UrbanYes -0.021916 0.271650 -0.081 0.936
## USYes 1.200573 0.259042 4.635 4.86e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared: 0.2393, Adjusted R-squared: 0.2335
## F-statistic: 41.52 on 3 and 396 DF, p-value: < 2.2e-16
The Urban predictor has a p-value of 0.936, so there is no evidence that it is associated with Sales once Price and US are in the model. We can remove it.
Reduced MLR.
carseatslm2 = lm(Sales ~ Price + US)
summary(carseatslm2)
##
## Call:
## lm(formula = Sales ~ Price + US)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.9269 -1.6286 -0.0574 1.5766 7.0515
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.03079 0.63098 20.652 < 2e-16 ***
## Price -0.05448 0.00523 -10.416 < 2e-16 ***
## USYes 1.19964 0.25846 4.641 4.71e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.469 on 397 degrees of freedom
## Multiple R-squared: 0.2393, Adjusted R-squared: 0.2354
## F-statistic: 62.43 on 2 and 397 DF, p-value: < 2.2e-16
With adjusted R-squared values of 0.234 and 0.235 (the multiple R-squared is 0.239 for both), the models explain only about 24% of the variance in car seat sales at a given location. Price and US are highly significant predictors, but the overall fit is weak, so we do not have much predictive power for sales.
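As a quick check on how the two adjusted values relate to the shared multiple R-squared, the usual adjustment formula can be applied by hand. The helper name `adj_r2` is hypothetical; n = 400 is inferred from the residual degrees of freedom in the summaries (396 + 4 and 397 + 3), and the inputs are the rounded values printed there, so the results only match to about three decimals.

```r
# Adjusted R-squared from multiple R-squared, n observations, p predictors
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

adj_full    <- adj_r2(0.2393, n = 400, p = 3)  # Sales ~ Price + Urban + US
adj_reduced <- adj_r2(0.2393, n = 400, p = 2)  # Sales ~ Price + US

round(c(full = adj_full, reduced = adj_reduced), 3)
```

Dropping Urban leaves the multiple R-squared essentially unchanged while removing one parameter, which is why the adjusted value rises slightly.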
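Confidence Intervals. The 95% confidence intervals (estimate plus or minus about 1.97 standard errors, using 397 residual degrees of freedom) are approximately: Price: (-0.0648, -0.0442) US Located: (0.692, 1.708) Intercept: (11.79, 14.27). For comparison, the narrower ranges given by the estimate plus or minus one standard error are Price: (-.05971, -.04925) US Located: (.94118, 1.4581) Intercept: (12.39981, 13.66177).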
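A 95% interval for a coefficient is the estimate plus or minus the t critical value times its standard error. The sketch below recomputes the intervals from the estimates and standard errors printed in the reduced-model summary, so it runs without the data set; the vector names `est` and `se` are introduced here for illustration. With the fitted object available, `confint(carseatslm2)` should give the same intervals up to rounding.

```r
# Estimates and standard errors copied from summary(carseatslm2)
est <- c(Intercept = 13.03079, Price = -0.05448, USYes = 1.19964)
se  <- c(Intercept = 0.63098,  Price = 0.00523,  USYes = 0.25846)

# Two-sided 95% critical value with 397 residual degrees of freedom
tcrit <- qt(0.975, df = 397)

ci <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
round(ci, 4)
```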
Outliers and Leverage. There is evidence for observations 377, 51, and 69 as outliers and observations 26, 50, and 368 as high leverage points. However, I would keep these in the model, as they are relatively typical. I suspect that if we removed them, other points would simply take their place as the most extreme outliers and high leverage points.
par(mfrow=c(2,2))
plot(carseatslm2)
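Besides reading the diagnostic plots, outliers and high leverage points can be flagged numerically with studentized residuals and hat values. The sketch below uses simulated data rather than Carseats, and the cutoffs (|studentized residual| > 3, leverage above twice the average) are common rules of thumb, not thresholds taken from the analysis above.

```r
set.seed(1)
x_sim <- rnorm(50)
x_sim[50] <- 6                      # far from the other x values: high leverage
y_sim <- 2 * x_sim + rnorm(50)
y_sim[49] <- y_sim[49] + 8          # far from the fitted line: outlier

fit_sim <- lm(y_sim ~ x_sim)
stud <- rstudent(fit_sim)           # studentized residuals
lev  <- hatvalues(fit_sim)          # leverage statistics

outlier_idx  <- which(abs(stud) > 3)
leverage_idx <- which(lev > 2 * mean(lev))
outlier_idx
leverage_idx
```

Applying the same two calls to `carseatslm2` would give a numeric counterpart to the four-panel plot.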
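The coefficient estimates will be identical when the sum of squares of x equals the sum of squares of y. Without an intercept, regressing y onto x gives the slope sum(x*y)/sum(x^2), while regressing x onto y gives sum(x*y)/sum(y^2); the numerators match, so the two estimates agree whenever sum(x^2) = sum(y^2). A perfect fit alone is not enough: if y = 4x exactly, the two slopes are 4 and 0.25.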
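For a regression through the origin, the least squares slope has the closed form sum(response * predictor) / sum(predictor^2). The short sketch below checks that formula against `lm()` on a pair of vectors with equal sums of squares; the helper `slope_no_intercept` and the vectors `u` and `v` are hypothetical names introduced here.

```r
# Closed-form slope for a no-intercept regression of resp onto pred
slope_no_intercept <- function(resp, pred) sum(resp * pred) / sum(pred^2)

u <- 1:10
v <- 10:1   # same values in reverse order, so sum(u^2) == sum(v^2)

b_vu <- slope_no_intercept(v, u)   # v onto u
b_uv <- slope_no_intercept(u, v)   # u onto v

# Compare with the fitted coefficient from lm()
b_lm <- unname(coef(lm(v ~ u + 0)))
c(b_vu = b_vu, b_uv = b_uv, b_lm = b_lm)
```

Because the numerator is the same in both directions, equal denominators are all that is needed for the two slopes to coincide.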
set.seed(1)
x = rnorm(100)
y = 4 * x + rnorm(100)
test = lm(y~x+0)
summary(test)
##
## Call:
## lm(formula = y ~ x + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9154 -0.6472 -0.1771 0.5056 2.3109
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## x 3.9939 0.1065 37.51 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9586 on 99 degrees of freedom
## Multiple R-squared: 0.9343, Adjusted R-squared: 0.9336
## F-statistic: 1407 on 1 and 99 DF, p-value: < 2.2e-16
test2 = lm(x~y+0)
summary(test2)
##
## Call:
## lm(formula = x ~ y + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.50931 -0.10863 0.05499 0.14436 0.44044
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y 0.233923 0.006236 37.51 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.232 on 99 degrees of freedom
## Multiple R-squared: 0.9343, Adjusted R-squared: 0.9336
## F-statistic: 1407 on 1 and 99 DF, p-value: < 2.2e-16
plot(x,y)
abline(test)
abline(test2, col= 'red', lty = 'dashed')
x = 1:10
y = 10:1
test = lm(y~x+0)
summary(test)
##
## Call:
## lm(formula = y ~ x + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.714 -1.179 2.357 5.893 9.429
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## x 0.5714 0.2736 2.089 0.0663 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.367 on 9 degrees of freedom
## Multiple R-squared: 0.3265, Adjusted R-squared: 0.2517
## F-statistic: 4.364 on 1 and 9 DF, p-value: 0.0663
test2 = lm(x~y+0)
summary(test2)
##
## Call:
## lm(formula = x ~ y + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.714 -1.179 2.357 5.893 9.429
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y 0.5714 0.2736 2.089 0.0663 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.367 on 9 degrees of freedom
## Multiple R-squared: 0.3265, Adjusted R-squared: 0.2517
## F-statistic: 4.364 on 1 and 9 DF, p-value: 0.0663
plot(x,y)
abline(test)
abline(test2, col= 'red', lty = 'dashed')