library(tidyverse)
library(openintro)
library(car)

Exercise 2

Carefully explain the differences between the KNN classifier and KNN regression methods.

The KNN classifier is used to predict classes, while KNN regression is used to predict numerical values.

Specifically, KNN classification will predict the class of a new instance by finding the K nearest training instances and assigning the most frequent class amongst them as the prediction for the new instance.

In contrast, KNN regression estimates the numerical value of a new instance by finding the K nearest training instances and averaging their response values (or applying another aggregate function to them).
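The contrast can be sketched on a toy one-dimensional data set. The data and the helper names (nearest_k, knn_classify, knn_regress) are made up for illustration and use base R only; this is a minimal sketch, not a full KNN implementation.

```r
# Toy 1-D training data (made up for illustration)
train_x   <- c(1, 2, 3, 6, 7, 8)
train_cls <- c("a", "a", "a", "b", "b", "b")   # qualitative response
train_num <- c(1.0, 1.5, 2.0, 6.5, 7.0, 9.0)   # quantitative response

# Indices of the k training points closest to x0
nearest_k <- function(x0, k) order(abs(train_x - x0))[seq_len(k)]

# KNN classifier: majority vote among the k neighbours
knn_classify <- function(x0, k) {
  votes <- table(train_cls[nearest_k(x0, k)])
  names(votes)[which.max(votes)]
}

# KNN regression: average response of the k neighbours
knn_regress <- function(x0, k) mean(train_num[nearest_k(x0, k)])

pred_class <- knn_classify(2.5, k = 3)
pred_value <- knn_regress(2.5, k = 3)
```

Both functions share the same neighbour search; only the final aggregation step (vote versus mean) differs.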

Exercise 9

This question involves the use of multiple linear regression on the Auto data set.

\((a)\) Produce a scatterplot matrix which includes all of the variables in the data set.

library(ISLR2)
library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:ISLR2':
## 
##     Boston
## The following objects are masked from 'package:openintro':
## 
##     housing, mammals
## The following object is masked from 'package:dplyr':
## 
##     select
plot(Auto)

\((b)\) Compute the matrix of correlations between the variables using the function cor(). You will need to exclude the name variable, which is qualitative.

Auto$name <- NULL
cor(Auto)
##                     mpg  cylinders displacement horsepower     weight
## mpg           1.0000000 -0.7776175   -0.8051269 -0.7784268 -0.8322442
## cylinders    -0.7776175  1.0000000    0.9508233  0.8429834  0.8975273
## displacement -0.8051269  0.9508233    1.0000000  0.8972570  0.9329944
## horsepower   -0.7784268  0.8429834    0.8972570  1.0000000  0.8645377
## weight       -0.8322442  0.8975273    0.9329944  0.8645377  1.0000000
## acceleration  0.4233285 -0.5046834   -0.5438005 -0.6891955 -0.4168392
## year          0.5805410 -0.3456474   -0.3698552 -0.4163615 -0.3091199
## origin        0.5652088 -0.5689316   -0.6145351 -0.4551715 -0.5850054
##              acceleration       year     origin
## mpg             0.4233285  0.5805410  0.5652088
## cylinders      -0.5046834 -0.3456474 -0.5689316
## displacement   -0.5438005 -0.3698552 -0.6145351
## horsepower     -0.6891955 -0.4163615 -0.4551715
## weight         -0.4168392 -0.3091199 -0.5850054
## acceleration    1.0000000  0.2903161  0.2127458
## year            0.2903161  1.0000000  0.1815277
## origin          0.2127458  0.1815277  1.0000000

\((c)\) Use the lm() function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors. Use the summary() function to print the results. Comment on the output. For instance:

Automodel <- lm(mpg ~., data = Auto)
summary(Automodel)
## 
## Call:
## lm(formula = mpg ~ ., data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5903 -2.1565 -0.1169  1.8690 13.0604 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -17.218435   4.644294  -3.707  0.00024 ***
## cylinders     -0.493376   0.323282  -1.526  0.12780    
## displacement   0.019896   0.007515   2.647  0.00844 ** 
## horsepower    -0.016951   0.013787  -1.230  0.21963    
## weight        -0.006474   0.000652  -9.929  < 2e-16 ***
## acceleration   0.080576   0.098845   0.815  0.41548    
## year           0.750773   0.050973  14.729  < 2e-16 ***
## origin         1.426141   0.278136   5.127 4.67e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared:  0.8215, Adjusted R-squared:  0.8182 
## F-statistic: 252.4 on 7 and 384 DF,  p-value: < 2.2e-16

\(i.\) Is there a relationship between the predictors and the response?

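Yes. The F-statistic of 252.4 on 7 and 384 degrees of freedom (p-value < 2.2e-16) rejects the null hypothesis that all slope coefficients are zero. The multiple R-squared of 0.8215 means that about 82% of the variance in mpg is explained by the predictors in this regression model (adjusted R-squared: 0.8182). Not every individual predictor has a statistically significant effect on the response, however.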
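The summary output reports both a Multiple R-squared (0.8215) and an Adjusted R-squared (0.8182). As a quick check, the adjusted value can be recovered from the unadjusted one with the standard formula. The numbers are copied from the output above, and n = 392 is inferred from 384 residual degrees of freedom plus 8 estimated coefficients.

```r
r2 <- 0.8215   # Multiple R-squared from the summary output
n  <- 392      # observations: 384 residual df + 8 estimated coefficients
p  <- 7        # number of predictors

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)
```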

\(ii.\) Which predictors appear to have a statistically significant relationship to the response?

displacement, weight, year, and origin appear to have a statistically significant relationship to the response.

\(iii.\) What does the coefficient for the year variable suggest?

Holding every other predictor constant, each additional model year is associated with an average increase of about 0.75 mpg. This suggests that newer cars tend to have a higher mpg.

\((d)\) Use the plot() function to produce diagnostic plots of the linear regression fit. Comment on any problems you see with the fit. Do the residual plots suggest any unusually large outliers? Does the leverage plot identify any observations with unusually high leverage?

par(mfrow = c(2,2))
plot(Automodel)

The Residuals vs Fitted plot shows a pattern in the residuals, which suggests a non-linear relationship between the response and the predictors. The Q-Q plot shows that the residuals are not normally distributed and are right-skewed. The Scale-Location plot shows that the constant error variance assumption does not hold for this model. In the Residuals vs Leverage plot, most observations have low leverage; observation 14 is the exception and stands out as a potential high-leverage point.

\((e)\) Use the * and : symbols to fit linear regression models with interaction effects. Do any interactions appear to be statistically significant?

Auto$origin <- as.factor(Auto$origin)
Auto$cylinders <- as.factor(Auto$cylinders)
vif(Automodel)
##    cylinders displacement   horsepower       weight acceleration         year 
##    10.737535    21.836792     9.943693    10.831260     2.625806     1.244952 
##       origin 
##     1.772386
Automodel0.1 <- lm(mpg ~ . - displacement, Auto)
vif(Automodel0.1)
##                  GVIF Df GVIF^(1/(2*Df))
## cylinders    8.904486  4        1.314320
## horsepower   9.761605  1        3.124357
## weight       9.675322  1        3.110518
## acceleration 2.651730  1        1.628413
## year         1.305357  1        1.142522
## origin       2.023470  2        1.192681
summary(Automodel0.1)
## 
## Call:
## lm(formula = mpg ~ . - displacement, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.4432 -1.9509 -0.0635  1.5634 12.7861 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -21.420180   4.567763  -4.689 3.82e-06 ***
## cylinders4     7.674517   1.624427   4.724 3.25e-06 ***
## cylinders5     8.387557   2.483501   3.377 0.000807 ***
## cylinders6     5.244637   1.684031   3.114 0.001983 ** 
## cylinders8     8.031903   1.792288   4.481 9.82e-06 ***
## horsepower    -0.025486   0.012811  -1.989 0.047369 *  
## weight        -0.005097   0.000578  -8.819  < 2e-16 ***
## acceleration  -0.000761   0.093156  -0.008 0.993486    
## year           0.722135   0.048950  14.753  < 2e-16 ***
## origin2        1.280424   0.522596   2.450 0.014730 *  
## origin3        2.213387   0.507419   4.362 1.66e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.121 on 381 degrees of freedom
## Multiple R-squared:  0.8442, Adjusted R-squared:  0.8401 
## F-statistic: 206.5 on 10 and 381 DF,  p-value: < 2.2e-16

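origin and cylinders were converted into factors given their categorical nature. displacement has the largest VIF in the full model (21.84), so it is dropped to reduce collinearity. The refit without it, Automodel0.1, serves as the baseline model.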
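For purely numeric predictors, the quantity reported by vif() can be reproduced by hand as 1 / (1 - R²), where R² comes from regressing one predictor on all the others. The sketch below uses the built-in mtcars data as a stand-in for Auto so that it runs with base R alone; the choice of columns is arbitrary.

```r
# Built-in mtcars stands in for Auto so the sketch needs no extra packages
X <- mtcars[, c("disp", "hp", "wt", "qsec")]

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the others
vif_by_hand <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2_j <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2_j)
})
vif_by_hand
```

A large value means the predictor is almost a linear combination of the others, which inflates the standard error of its coefficient.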

Interactions will be tested amongst the significant predictors in the baseline model: cylinders, horsepower, weight, year, and origin.
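The exercise mentions both the * and : symbols. In an R formula, a:b adds only the interaction term, while a * b expands to a + b + a:b. A minimal check on the built-in mtcars data (a stand-in for Auto, so the sketch runs with base R alone):

```r
# Built-in mtcars stands in for Auto here
fit_star  <- lm(mpg ~ wt * hp, data = mtcars)          # main effects + interaction
fit_colon <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)  # same model, spelled out

# Both formulas produce the same coefficients
same_fit <- isTRUE(all.equal(coef(fit_star), coef(fit_colon)))
same_fit
```

The models in this part start from mpg ~ . and add interactions with :, which is equivalent because the main effects are already included by the dot.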

Automodel1 <- lm(mpg ~. - displacement + year:origin, data = Auto)
summary(Automodel1)
## 
## Call:
## lm(formula = mpg ~ . - displacement + year:origin, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.5305 -1.9615 -0.1253  1.3497 13.6865 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -6.222e+00  5.405e+00  -1.151  0.25040    
## cylinders4    7.322e+00  1.586e+00   4.617 5.35e-06 ***
## cylinders5    6.609e+00  2.433e+00   2.716  0.00690 ** 
## cylinders6    4.461e+00  1.650e+00   2.705  0.00714 ** 
## cylinders8    7.012e+00  1.759e+00   3.986 8.07e-05 ***
## horsepower   -2.693e-02  1.243e-02  -2.166  0.03094 *  
## weight       -5.099e-03  5.606e-04  -9.095  < 2e-16 ***
## acceleration  1.665e-04  9.072e-02   0.002  0.99854    
## year          5.333e-01  6.110e-02   8.729  < 2e-16 ***
## origin2      -4.390e+01  9.637e+00  -4.555 7.06e-06 ***
## origin3      -2.591e+01  8.722e+00  -2.971  0.00316 ** 
## year:origin2  5.922e-01  1.265e-01   4.683 3.95e-06 ***
## year:origin3  3.618e-01  1.122e-01   3.224  0.00137 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.026 on 379 degrees of freedom
## Multiple R-squared:  0.8543, Adjusted R-squared:  0.8497 
## F-statistic: 185.2 on 12 and 379 DF,  p-value: < 2.2e-16

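In Automodel1, both year:origin terms are statistically significant (origin 2 = European, origin 3 = Japanese; American is the reference level). The estimates indicate that mpg rises faster over time for European cars than for American cars, and that Japanese cars also improve faster than American cars, though at a somewhat lower rate than European cars. The R-squared value increases slightly relative to the baseline (0.8543 vs. 0.8442; adjusted 0.8497 vs. 0.8401). acceleration is not significant (p = 0.999) and is removed from the following models. cylinders and horsepower are removed as well, even though both remain significant at the 5% level here.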
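With a factor interaction, the year slope for each origin is the baseline year coefficient plus that origin's interaction coefficient. The short calculation below uses the estimates copied from the summary(Automodel1) output; the variable names are chosen here for illustration.

```r
# Coefficients copied from the summary(Automodel1) output
b_year         <- 0.5333   # year slope for the reference level (origin 1)
b_year_origin2 <- 0.5922   # additional year slope for origin 2
b_year_origin3 <- 0.3618   # additional year slope for origin 3

# Implied mpg gain per model year within each origin group
year_slope <- c(origin1 = b_year,
                origin2 = b_year + b_year_origin2,
                origin3 = b_year + b_year_origin3)
year_slope
```

The interaction coefficients are differences from the reference slope, so a positive value means a steeper improvement than in the reference group, not a slope on its own.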

Automodel2 <- lm(mpg ~. - displacement - cylinders - horsepower - acceleration
                 + year:origin + weight:origin, data=Auto)
summary(Automodel2)
## 
## Call:
## lm(formula = mpg ~ . - displacement - cylinders - horsepower - 
##     acceleration + year:origin + weight:origin, data = Auto)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.129 -1.899 -0.049  1.764 12.360 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -9.439e+00  4.997e+00  -1.889  0.05964 .  
## weight         -5.653e-03  2.764e-04 -20.450  < 2e-16 ***
## year            6.421e-01  6.007e-02  10.690  < 2e-16 ***
## origin2        -3.219e+01  9.855e+00  -3.267  0.00119 ** 
## origin3        -1.209e+01  9.296e+00  -1.301  0.19409    
## year:origin2    5.402e-01  1.287e-01   4.197 3.36e-05 ***
## year:origin3    3.513e-01  1.145e-01   3.069  0.00230 ** 
## weight:origin2 -2.663e-03  8.390e-04  -3.174  0.00162 ** 
## weight:origin3 -5.579e-03  1.144e-03  -4.878 1.57e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.138 on 383 degrees of freedom
## Multiple R-squared:  0.8417, Adjusted R-squared:  0.8384 
## F-statistic: 254.5 on 8 and 383 DF,  p-value: < 2.2e-16

In Automodel2, year:origin continues to be significant. The weight:origin terms are significant and negative, which suggests that the penalty on fuel efficiency due to weight is largest for Japanese cars (origin 3), followed by European cars, and smallest for American cars. With cylinders, horsepower, and acceleration removed, the R-squared value decreases slightly relative to Automodel1 (0.8417 vs. 0.8543) and falls just below the baseline (0.8442).

Automodel3 <- lm(mpg ~. - displacement -cylinders - horsepower - acceleration
                 + year:origin + weight:origin + year:weight, 
                 data=Auto)
summary(Automodel3)
## 
## Call:
## lm(formula = mpg ~ . - displacement - cylinders - horsepower - 
##     acceleration + year:origin + weight:origin + year:weight, 
##     data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.0371 -1.8479 -0.0772  1.6285 12.2731 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -9.526e+01  1.722e+01  -5.533 5.86e-08 ***
## weight          2.166e-02  5.266e-03   4.113 4.78e-05 ***
## year            1.786e+00  2.278e-01   7.840 4.53e-14 ***
## origin2        -1.256e+01  1.026e+01  -1.225   0.2214    
## origin3         1.164e+01  1.009e+01   1.154   0.2493    
## year:origin2    2.589e-01  1.358e-01   1.907   0.0573 .  
## year:origin3    7.586e-03  1.290e-01   0.059   0.9531    
## weight:origin2 -1.895e-03  8.253e-04  -2.296   0.0222 *  
## weight:origin3 -4.523e-03  1.125e-03  -4.019 7.04e-05 ***
## weight:year    -3.656e-04  7.040e-05  -5.194 3.36e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.037 on 382 degrees of freedom
## Multiple R-squared:  0.8521, Adjusted R-squared:  0.8486 
## F-statistic: 244.5 on 9 and 382 DF,  p-value: < 2.2e-16

In Automodel3, year:origin is no longer significant at the 5% level. This could be due to the inclusion of year:weight, which may better explain how the effect of year on mpg varies. The negative weight:year coefficient suggests that in older cars, weight didn’t reduce mpg as much, but in newer cars, added weight leads to a larger drop in fuel efficiency. The R-squared value (0.8521) is now greater than the value found in the baseline model, Automodel0.1 (0.8442).
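The effect of the weight:year term can be made concrete by computing the implied weight slope at different model years. The sketch uses the Automodel3 estimates for the reference origin (origin 1) and assumes year takes values such as 70, 76, and 82; the helper name weight_slope is made up for illustration.

```r
# Coefficients copied from the summary(Automodel3) output (reference origin)
b_weight      <-  2.166e-02   # weight main effect
b_weight_year <- -3.656e-04   # weight:year interaction

# Implied change in mpg per unit of weight, as a function of model year
weight_slope <- function(year) b_weight + b_weight_year * year

slopes <- weight_slope(c(70, 76, 82))
slopes
```

The slope is negative throughout this range and becomes more negative for later years, which is the pattern described above. The positive main effect of weight on its own only describes the extrapolated case year = 0.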

\((f)\) Try a few different transformations of the variables, such as \(\log(X)\), \(\sqrt{X}\), \(X^2\). Comment on your findings.

Auto$log_year <- log(Auto$year)
Auto$sqrt_year <- sqrt(Auto$year)
Auto$sq_year <- Auto$year^2

Auto$log_weight <- log(Auto$weight)
Auto$sqrt_weight <- sqrt(Auto$weight)
Auto$sq_weight <- Auto$weight^2
Automodel5<-lm(mpg ~ cylinders + horsepower + log_weight + acceleration 
               + log_year + origin, Auto)
summary(Automodel5)
## 
## Call:
## lm(formula = mpg ~ cylinders + horsepower + log_weight + acceleration + 
##     log_year + origin, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.7011 -1.9145 -0.0667  1.4550 12.7600 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -89.48120   17.49646  -5.114 5.00e-07 ***
## cylinders4     7.11016    1.56295   4.549 7.25e-06 ***
## cylinders5     8.84993    2.38531   3.710 0.000238 ***
## cylinders6     5.52389    1.61655   3.417 0.000701 ***
## cylinders8     7.52252    1.69958   4.426 1.25e-05 ***
## horsepower    -0.01283    0.01220  -1.052 0.293331    
## log_weight   -17.68816    1.60401 -11.027  < 2e-16 ***
## acceleration   0.04044    0.08792   0.460 0.645780    
## log_year      57.07426    3.59855  15.860  < 2e-16 ***
## origin2        1.05522    0.50244   2.100 0.036369 *  
## origin3        1.61346    0.49685   3.247 0.001268 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.997 on 381 degrees of freedom
## Multiple R-squared:  0.8563, Adjusted R-squared:  0.8526 
## F-statistic: 227.1 on 10 and 381 DF,  p-value: < 2.2e-16
par(mfrow = c(2,2))
plot(Automodel5)

Replacing year and weight with their log transformations, while keeping the other baseline predictors, results in both log_weight and log_year being statistically significant. For log_weight, a 1% increase in weight is associated with a decrease of approximately 0.18 mpg. Because the relationship is logarithmic, heavier cars have worse fuel efficiency, but the rate of decline slows as weight increases. For log_year, a 1% increase in year is associated with an increase of about 0.57 mpg. Since year is coded as 70 to 82, a single model year is a change of roughly 1.2% to 1.4%, or about 0.7 to 0.8 mpg, possibly due to advancements in technology, regulations, and manufacturing techniques. horsepower is no longer significant in this model (p = 0.293). The R-squared value (0.8563) is greater than that of the baseline model (0.8442).
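For a model of the form mpg = a + b * log(x), a change from x to c * x shifts the prediction by b * log(c). The arithmetic for the two log coefficients can be sketched as follows, using the estimates from the summary(Automodel5) output and assuming year is coded 70 to 82.

```r
# Coefficients copied from the summary(Automodel5) output
b_log_weight <- -17.68816
b_log_year   <-  57.07426

# Effect of a 1% increase in the untransformed predictor
effect_weight_1pct <- b_log_weight * log(1.01)
effect_year_1pct   <- b_log_year   * log(1.01)

# Effect of one model year at the start and end of the range (year coded 70-82)
effect_year_70_71 <- b_log_year * log(71 / 70)
effect_year_81_82 <- b_log_year * log(82 / 81)

round(c(effect_weight_1pct, effect_year_1pct, effect_year_70_71, effect_year_81_82), 3)
```

Dividing the coefficient by 100 is the usual shortcut for the 1% effect; it is close to the exact value because log(1.01) is approximately 0.01.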

Automodel6<-lm(mpg ~ cylinders + sqrt_weight + sqrt_year + origin, Auto)
summary(Automodel6)
## 
## Call:
## lm(formula = mpg ~ cylinders + sqrt_weight + sqrt_year + origin, 
##     data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.2281 -1.8103 -0.0506  1.5655 12.9361 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -64.35442    7.46492  -8.621  < 2e-16 ***
## cylinders4    7.66922    1.57850   4.859 1.73e-06 ***
## cylinders5    9.23969    2.41597   3.824 0.000153 ***
## cylinders6    5.76830    1.64268   3.512 0.000499 ***
## cylinders8    7.58827    1.73561   4.372 1.59e-05 ***
## sqrt_weight  -0.67217    0.04752 -14.144  < 2e-16 ***
## sqrt_year    13.35462    0.80174  16.657  < 2e-16 ***
## origin2       1.18526    0.51201   2.315 0.021145 *  
## origin3       1.77767    0.49760   3.572 0.000399 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.059 on 383 degrees of freedom
## Multiple R-squared:  0.8496, Adjusted R-squared:  0.8464 
## F-statistic: 270.3 on 8 and 383 DF,  p-value: < 2.2e-16
par(mfrow = c(2,2))
plot(Automodel6)

When horsepower and acceleration were included in this model, they were not significant, so they were removed and the model was refit. In this updated model, weight and year enter through square root transformations, and both transformed predictors are statistically significant. The R-squared value decreases marginally compared to the log model (0.8496 vs. 0.8563).

Automodel7<-lm(mpg ~ cylinders + sq_weight + sq_year + origin,Auto)
summary(Automodel7)
## 
## Call:
## lm(formula = mpg ~ cylinders + sq_weight + sq_year + origin, 
##     data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5047 -2.1180 -0.0608  1.6639 13.1049 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -5.809e+00  2.616e+00  -2.220 0.026976 *  
## cylinders4   8.613e+00  1.712e+00   5.032 7.49e-07 ***
## cylinders5   8.251e+00  2.623e+00   3.145 0.001791 ** 
## cylinders6   4.802e+00  1.782e+00   2.695 0.007344 ** 
## cylinders8   6.439e+00  1.908e+00   3.375 0.000812 ***
## sq_weight   -7.205e-07  7.032e-08 -10.246  < 2e-16 ***
## sq_year      4.876e-03  3.290e-04  14.819  < 2e-16 ***
## origin2      1.496e+00  5.560e-01   2.691 0.007428 ** 
## origin3      2.692e+00  5.296e-01   5.083 5.81e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.324 on 383 degrees of freedom
## Multiple R-squared:  0.8223, Adjusted R-squared:  0.8186 
## F-statistic: 221.6 on 8 and 383 DF,  p-value: < 2.2e-16
par(mfrow = c(2,2))
plot(Automodel7)

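In this model, a squared transformation is applied to year and weight. Both transformed predictors are once again statistically significant. However, the R-squared value drops to 0.8223 (adjusted 0.8186), and the residual standard error rises to 3.324. Between the two models with identical predictors, the square root transformation therefore fits better than the squared one. Across all three, the log model has the lowest residual standard error (2.997) and the highest adjusted R-squared (0.8526), although it also retains horsepower and acceleration, so that comparison is not strictly like for like.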
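A like-for-like comparison keeps the other predictors fixed and varies only the transformation. The sketch below shows one way to tabulate such a comparison. It uses the built-in mtcars data as a stand-in for Auto (an assumption made so that it runs with base R alone) and writes the transformations directly in the formula instead of creating new columns.

```r
# Built-in mtcars stands in for Auto; only the transformation of wt changes
fits <- list(
  log    = lm(mpg ~ log(wt)  + hp, data = mtcars),
  sqrt   = lm(mpg ~ sqrt(wt) + hp, data = mtcars),
  square = lm(mpg ~ I(wt^2)  + hp, data = mtcars)   # I() protects ^ inside a formula
)

# Collect the fit statistics used in the discussion above, plus AIC
comparison <- data.frame(
  adj_r2 = sapply(fits, function(f) summary(f)$adj.r.squared),
  rse    = sapply(fits, sigma),
  aic    = sapply(fits, AIC)
)
comparison[order(-comparison$adj_r2), ]
```

Because all three models have the same response and the same number of coefficients, ranking by adjusted R-squared and ranking by residual standard error give the same order.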

Exercise 10

This question should be answered using the Carseats data set.

data("Carseats")
summary(Carseats)
##      Sales          CompPrice       Income        Advertising    
##  Min.   : 0.000   Min.   : 77   Min.   : 21.00   Min.   : 0.000  
##  1st Qu.: 5.390   1st Qu.:115   1st Qu.: 42.75   1st Qu.: 0.000  
##  Median : 7.490   Median :125   Median : 69.00   Median : 5.000  
##  Mean   : 7.496   Mean   :125   Mean   : 68.66   Mean   : 6.635  
##  3rd Qu.: 9.320   3rd Qu.:135   3rd Qu.: 91.00   3rd Qu.:12.000  
##  Max.   :16.270   Max.   :175   Max.   :120.00   Max.   :29.000  
##    Population        Price        ShelveLoc        Age          Education   
##  Min.   : 10.0   Min.   : 24.0   Bad   : 96   Min.   :25.00   Min.   :10.0  
##  1st Qu.:139.0   1st Qu.:100.0   Good  : 85   1st Qu.:39.75   1st Qu.:12.0  
##  Median :272.0   Median :117.0   Medium:219   Median :54.50   Median :14.0  
##  Mean   :264.8   Mean   :115.8                Mean   :53.32   Mean   :13.9  
##  3rd Qu.:398.5   3rd Qu.:131.0                3rd Qu.:66.00   3rd Qu.:16.0  
##  Max.   :509.0   Max.   :191.0                Max.   :80.00   Max.   :18.0  
##  Urban       US     
##  No :118   No :142  
##  Yes:282   Yes:258  
##                     
##                     
##                     
## 

\((a)\) Fit a multiple regression model to predict Sales using Price, Urban, and US.

Carseatsmodel <-  lm(Sales ~ Price + Urban + US, data = Carseats)
summary(Carseatsmodel)
## 
## Call:
## lm(formula = Sales ~ Price + Urban + US, data = Carseats)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9206 -1.6220 -0.0564  1.5786  7.0581 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.043469   0.651012  20.036  < 2e-16 ***
## Price       -0.054459   0.005242 -10.389  < 2e-16 ***
## UrbanYes    -0.021916   0.271650  -0.081    0.936    
## USYes        1.200573   0.259042   4.635 4.86e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2335 
## F-statistic: 41.52 on 3 and 396 DF,  p-value: < 2.2e-16

\((b)\) Provide an interpretation of each coefficient in the model. Be careful—some of the variables in the model are qualitative!

Price: The coefficient is negative and highly significant. Holding Urban and US fixed, each one-dollar increase in price is associated with a decrease in sales of approximately 0.0545 units, that is, about 54 car seats, since Sales is recorded in thousands of units.

UrbanYes: The coefficient compares urban stores with non-urban stores at the same price and US status. Its estimate (-0.022) is far from significant (p = 0.936), so there is not enough evidence to suggest a link between an urban location and the number of sales.

USYes: Holding Price and Urban fixed, stores located in the US sell approximately 1.2 units (about 1,200 car seats) more on average than stores outside the US.

\((c)\) Write out the model in equation form, being careful to handle the qualitative variables properly.

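\(\text{Sales}=13.04-0.054\times\text{Price}-0.022\times\text{UrbanYes}+1.20\times\text{USYes}\)

Here UrbanYes equals 1 for a store in an urban location and 0 otherwise, and USYes equals 1 for a store in the US and 0 otherwise.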
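The fitted equation can also be written as a small function, which makes the 0/1 coding of the qualitative variables explicit. The function name predict_sales and the example inputs are made up for illustration; the coefficients are the unrounded estimates from the summary output.

```r
# Fitted equation from summary(Carseatsmodel), with Urban and US coded as 0/1 dummies
predict_sales <- function(price, urban, us) {
  13.043469 - 0.054459 * price -
    0.021916 * (urban == "Yes") +   # dummy: 1 if urban, 0 otherwise
    1.200573 * (us == "Yes")        # dummy: 1 if in the US, 0 otherwise
}

sales_urban_us <- predict_sales(price = 100, urban = "Yes", us = "Yes")
sales_rural_abroad <- predict_sales(price = 100, urban = "No", us = "No")
c(sales_urban_us, sales_rural_abroad)
```

For a fitted model object, predict(Carseatsmodel, newdata = ...) performs the same calculation without copying coefficients by hand.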

\((d)\) For which of the predictors can you reject the null hypothesis \(H_0\): \(\beta_j\)=0?

The null hypothesis can be rejected for Price and USYes, whose p-values are both far below 0.05. It cannot be rejected for UrbanYes (p = 0.936).

\((e)\) On the basis of your response to the previous question, fit a smaller model that only uses the predictors for which there is evidence of association with the outcome.

Carseatsmodel2<-  lm(Sales ~ Price + US, data = Carseats)
summary(Carseatsmodel2)
## 
## Call:
## lm(formula = Sales ~ Price + US, data = Carseats)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9269 -1.6286 -0.0574  1.5766  7.0515 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.03079    0.63098  20.652  < 2e-16 ***
## Price       -0.05448    0.00523 -10.416  < 2e-16 ***
## USYes        1.19964    0.25846   4.641 4.71e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.469 on 397 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2354 
## F-statistic: 62.43 on 2 and 397 DF,  p-value: < 2.2e-16

\((f)\) How well do the models in \((a)\) and \((e)\) fit the data?

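Both models fit the data about equally well, and only modestly: each explains roughly 23.9% of the variance in Sales (multiple R-squared 0.2393). The smaller model from \((e)\) is marginally better on adjusted R-squared (0.2354 vs. 0.2335) and residual standard error (2.469 vs. 2.472).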

\((g)\) Using the model from \((e)\), obtain 95% confidence intervals for the coefficient(s).

confint(Carseatsmodel2)
##                   2.5 %      97.5 %
## (Intercept) 11.79032020 14.27126531
## Price       -0.06475984 -0.04419543
## USYes        0.69151957  1.70776632
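The intervals returned by confint() follow the usual formula, estimate ± t-quantile × standard error, with the residual degrees of freedom of the model. As a check, they can be rebuilt from the rounded values printed in the summary(Carseatsmodel2) output, so the last digits may differ slightly.

```r
# Estimates and standard errors copied from the summary(Carseatsmodel2) output
est <- c(Price = -0.05448, USYes = 1.19964)
se  <- c(Price =  0.00523, USYes = 0.25846)
df  <- 397                      # residual degrees of freedom

t_crit <- qt(0.975, df)         # two-sided 95% critical value
ci <- cbind(lower = est - t_crit * se,
            upper = est + t_crit * se)
ci
```

Neither interval contains zero, which agrees with rejecting the null hypothesis for both coefficients.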

\((h)\) Is there evidence of outliers or high leverage observations in the model from \((e)\)?

par(mfrow = c(2, 2))
plot(Carseatsmodel2)

summary(influence.measures(Carseatsmodel2))
## Potentially influential observations of
##   lm(formula = Sales ~ Price + US, data = Carseats) :
## 
##     dfb.1_ dfb.Pric dfb.USYs dffit   cov.r   cook.d hat    
## 26   0.24  -0.18    -0.17     0.28_*  0.97_*  0.03   0.01  
## 29  -0.10   0.10    -0.10    -0.18    0.97_*  0.01   0.01  
## 43  -0.11   0.10     0.03    -0.11    1.05_*  0.00   0.04_*
## 50  -0.10   0.17    -0.17     0.26_*  0.98    0.02   0.01  
## 51  -0.05   0.05    -0.11    -0.18    0.95_*  0.01   0.00  
## 58  -0.05  -0.02     0.16    -0.20    0.97_*  0.01   0.01  
## 69  -0.09   0.10     0.09     0.19    0.96_*  0.01   0.01  
## 126 -0.07   0.06     0.03    -0.07    1.03_*  0.00   0.03_*
## 160  0.00   0.00     0.00     0.01    1.02_*  0.00   0.02  
## 166  0.21  -0.23    -0.04    -0.24    1.02    0.02   0.03_*
## 172  0.06  -0.07     0.02     0.08    1.03_*  0.00   0.02  
## 175  0.14  -0.19     0.09    -0.21    1.03_*  0.02   0.03_*
## 210 -0.14   0.15    -0.10    -0.22    0.97_*  0.02   0.01  
## 270 -0.03   0.05    -0.03     0.06    1.03_*  0.00   0.02  
## 298 -0.06   0.06    -0.09    -0.15    0.97_*  0.01   0.00  
## 314 -0.05   0.04     0.02    -0.05    1.03_*  0.00   0.02_*
## 353 -0.02   0.03     0.09     0.15    0.97_*  0.01   0.00  
## 357  0.02  -0.02     0.02    -0.03    1.03_*  0.00   0.02  
## 368  0.26  -0.23    -0.11     0.27_*  1.01    0.02   0.02_*
## 377  0.14  -0.15     0.12     0.24    0.95_*  0.02   0.01  
## 384  0.00   0.00     0.00     0.00    1.02_*  0.00   0.02  
## 387 -0.03   0.04    -0.03     0.05    1.02_*  0.00   0.02  
## 396 -0.05   0.05     0.08     0.14    0.98_*  0.01   0.00

The residuals appear to stay close to the reference line, so there is little evidence of outliers in the data.

All of the DFBETA values are small in absolute value (at most about 0.3), so no single observation changes a coefficient estimate much. A few DFFITS values are flagged as influential (observations 26, 50, and 368). Cook’s distance does not cause any alarm, since every value is 0.03 or less. The hat values are low overall, but observations 43, 126, 166, 175, 314, and 368 are flagged and may be high-leverage points.
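The asterisks in the table mark values beyond fixed cutoffs. The sketch below computes two of them, assuming the cutoffs documented for R's influence.measures(): hat values above 3k/n and absolute DFFITS above 3 * sqrt(k / (n - k)), where k is the number of coefficients. Here n = 400 is inferred from 397 residual degrees of freedom plus 3 coefficients.

```r
n <- 400   # observations: 397 residual df + 3 coefficients
k <- 3     # coefficients in Sales ~ Price + US, including the intercept

mean_leverage <- k / n                    # average hat value
hat_cutoff    <- 3 * k / n                # assumed flagging cutoff for hat values
dffits_cutoff <- 3 * sqrt(k / (n - k))    # assumed flagging cutoff for |DFFITS|

round(c(mean_leverage, hat_cutoff, dffits_cutoff), 4)
```

Under these assumptions, a hat value of 0.04 (observation 43) is more than five times the average leverage, while the flagged DFFITS values of 0.26 to 0.28 sit only just at or above their cutoff.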

Exercise 12

This problem involves simple linear regression without an intercept.

\((a)\) Recall that the coefficient estimate \(\hat{\beta}\) for the linear regression of Y onto X without an intercept is given by (3.38). Under what circumstance is the coefficient estimate for the regression of X onto Y the same as the coefficient estimate for the regression of Y onto X?

The coefficient estimate for the regression of Y onto X is \(\hat{\beta}\) = \(\frac{\sum_ix_iy_i}{\sum_jx_j^2}\).

The coefficient estimate for the regression of X onto Y is \(\hat{\beta}\) = \(\frac{\sum_ix_iy_i}{\sum_jy_j^2}\).

The two estimates share the numerator \(\sum_ix_iy_i\) and differ only in the denominator, so the coefficients are the same if \(\sum_jx_j^2=\sum_jy_j^2\).
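The two closed-form expressions can be checked against lm() fitted without an intercept, which is specified by adding + 0 to the formula. The data below are simulated with an arbitrary seed purely for illustration.

```r
set.seed(42)               # arbitrary seed, chosen only for reproducibility
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Closed-form no-intercept estimates
beta_y_on_x <- sum(x * y) / sum(x^2)
beta_x_on_y <- sum(x * y) / sum(y^2)

# The same regressions fitted with lm(); "+ 0" removes the intercept
fit_yx <- lm(y ~ x + 0)
fit_xy <- lm(x ~ y + 0)

c(formula = beta_y_on_x, lm = unname(coef(fit_yx)))
c(formula = beta_x_on_y, lm = unname(coef(fit_xy)))
```

The two estimates differ here because y has a larger sum of squares than x, so the shared numerator is divided by different denominators.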

\((b)\) Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is different from the coefficient estimate for the regression of Y onto X.

set.seed(1)
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n, mean = 0, sd = 1)
fit.Y <- lm(y ~ x)
summary(fit.Y)
## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8768 -0.6138 -0.1395  0.5394  2.3462 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.03769    0.09699  -0.389    0.698    
## x            1.99894    0.10773  18.556   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9628 on 98 degrees of freedom
## Multiple R-squared:  0.7784, Adjusted R-squared:  0.7762 
## F-statistic: 344.3 on 1 and 98 DF,  p-value: < 2.2e-16
fit.X <- lm(x ~ y)
summary(fit.X)
## 
## Call:
## lm(formula = x ~ y)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.90848 -0.28101  0.06274  0.24570  0.85736 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.03880    0.04266    0.91    0.365    
## y            0.38942    0.02099   18.56   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4249 on 98 degrees of freedom
## Multiple R-squared:  0.7784, Adjusted R-squared:  0.7762 
## F-statistic: 344.3 on 1 and 98 DF,  p-value: < 2.2e-16

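The slope from regressing y on x (1.999) is much larger than the slope from regressing x on y (0.389), so the two coefficient estimates differ. Note that both fits above include an intercept; the no-intercept regressions that this exercise concerns would be specified as lm(y ~ x + 0) and lm(x ~ y + 0).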

\((c)\) Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.

n <- 100
x <- rnorm(n)
y <- x
fit.Y <- lm(y ~ x)
summary(fit.Y)
## Warning in summary.lm(fit.Y): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.318e-16 -6.100e-17 -2.560e-17 -8.000e-19  3.220e-15 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -1.110e-17  3.379e-17 -3.290e-01    0.743    
## x            1.000e+00  3.283e-17  3.046e+16   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.378e-16 on 98 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 9.28e+32 on 1 and 98 DF,  p-value: < 2.2e-16
fit.X <- lm(x ~ y)
summary(fit.X)
## Warning in summary.lm(fit.X): essentially perfect fit: summary may be
## unreliable
## 
## Call:
## lm(formula = x ~ y)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.318e-16 -6.100e-17 -2.560e-17 -8.000e-19  3.220e-15 
## 
## Coefficients:
##               Estimate Std. Error    t value Pr(>|t|)    
## (Intercept) -1.110e-17  3.379e-17 -3.290e-01    0.743    
## y            1.000e+00  3.283e-17  3.046e+16   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.378e-16 on 98 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 9.28e+32 on 1 and 98 DF,  p-value: < 2.2e-16

The two coefficient estimates are exactly the same (both equal 1), because y = x implies \(\sum_jx_j^2=\sum_jy_j^2\).
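Setting y equal to x is the simplest case. A less trivial example, sketched below as an alternative, takes y to be a random permutation of x: the values are paired differently, but the sums of squares match, so the no-intercept slopes coincide. The seed is arbitrary, and + 0 in the formula removes the intercept.

```r
set.seed(1)          # arbitrary seed, chosen only for reproducibility
x <- rnorm(100)
y <- sample(x)       # same values in a shuffled order, so sum(x^2) equals sum(y^2)

# No-intercept regressions in both directions
slope_yx <- unname(coef(lm(y ~ x + 0)))
slope_xy <- unname(coef(lm(x ~ y + 0)))

c(y_on_x = slope_yx, x_on_y = slope_xy)
```

Unlike the y = x case, the fit is not perfect here, so the equality of the two slopes comes only from the matching sums of squares.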

IHNpZ25pZmljYW50IG5lZ2F0aXZlIGludGVyYWN0aW9uIGFjcm9zcyBmYWN0b3JzLCBhbmQgc3VnZ2VzdHMgdGhhdCB0aGUgcGVuYWx0eSAKb24gZnVlbCBlZmZpY2llbmN5IGR1ZSB0byB3ZWlnaHQgaXMgbGFyZ2VzdCBmb3IgQXNpYW4gY2FycywgZm9sbG93ZWQgYnkgRXVyb3BlYW4gCmNhcnMsIGFuZCBzbWFsbGVzdCBmb3IgQW1lcmljYW4gY2Fycy4gVGhlIFItU3F1YXJlZCB2YWx1ZSBtYXJnaW5hbGx5IGltcHJvdmVzIGluIAp0aGlzIG1vZGVsLiAKCmBgYHtyfQpBdXRvbW9kZWwzIDwtIGxtKG1wZyB+LiAtIGRpc3BsYWNlbWVudCAtY3lsaW5kZXJzIC0gaG9yc2Vwb3dlciAtIGFjY2VsZXJhdGlvbgogICAgICAgICAgICAgICAgICsgeWVhcjpvcmlnaW4gKyB3ZWlnaHQ6b3JpZ2luICsgeWVhcjp3ZWlnaHQsIAogICAgICAgICAgICAgICAgIGRhdGE9QXV0bykKc3VtbWFyeShBdXRvbW9kZWwzKQpgYGAKCkluIGBBdXRvbW9kZWwzYCwgYHllYXI6b3JpZ2luYCBpcyBubyBsb25nZXIgc2lnbmlmaWNhbnQuIFRoaXMgY291bGQgcG9zc2libHkgb2NjdXIgCmR1ZSB0byB0aGUgaW5jbHVzaW9uIG9mIGB5ZWFyOndlaWdodGAsIHdoaWNoIGJldHRlciBleHBsYWlucyB0aGUgZWZmZWN0IG9mIGB5ZWFyYCAKb24gbXBnLiBUaGUgbmVnYXRpdmUgcmVsYXRpb25zaGlwIGJldHdlZW4gYHllYXJgIGFuZCBgd2VpZ2h0YCB3aXRoIG1wZyBzdWdnZXN0cyAKdGhhdCBpbiBvbGRlciBjYXJzLCB3ZWlnaHQgZGlkbid0IHJlZHVjZSBtcGcgYXMgbXVjaCwgYnV0IGluIG5ld2VyIGNhcnMsIGFkZGVkIAp3ZWlnaHQgbGVhZHMgdG8gYSBsYXJnZXIgZHJvcCBpbiBmdWVsIGVmZmljaWVuY3kuIFRoZSBSLVNxdWFyZSB2YWx1ZSBpcyBub3cgZ3JlYXRlciAKdGhhbiB0aGUgdmFsdWUgZm91bmQgaW4gdGhlIGJhc2VsaW5lIG1vZGVsIChgQXV0b21vZGVsMC4xYCkuIAoKKiokKGYpJCBUcnkgYSBmZXcgZGlmZmVyZW50IHRyYW5zZm9ybWF0aW9ucyBvZiB0aGUgdmFyaWFibGVzLCBzdWNoIGFzIGxvZyhYKSwg4oiaWCwgWDIuIENvbW1lbnQgb24geW91ciBmaW5kaW5ncy4qKgoKYGBge3J9CkF1dG8kbG9nX3llYXIgPC0gbG9nKEF1dG8keWVhcikKQXV0byRzcXJ0X3llYXIgPC0gc3FydChBdXRvJHllYXIpCkF1dG8kc3FfeWVhciA8LSBBdXRvJHllYXJeMgoKQXV0byRsb2dfd2VpZ2h0IDwtIGxvZyhBdXRvJHdlaWdodCkKQXV0byRzcXJ0X3dlaWdodCA8LSBzcXJ0KEF1dG8kd2VpZ2h0KQpBdXRvJHNxX3dlaWdodCA8LSBBdXRvJHdlaWdodF4yCmBgYAoKYGBge3J9CkF1dG9tb2RlbDU8LWxtKG1wZyB+IGN5bGluZGVycyArIGhvcnNlcG93ZXIgKyBsb2dfd2VpZ2h0ICsgYWNjZWxlcmF0aW9uIAogICAgICAgICAgICAgICArIGxvZ195ZWFyICsgb3JpZ2luLCBBdXRvKQpzdW1tYXJ5KEF1dG9tb2RlbDUpCnBhcihtZnJvdyA9IGMoMiwyKSkKcGxvdChBdXRvbW9kZWw1KQpgYGAKClJlcGxhY2luZyBgeWVhcmAgYW5kIGB3ZWlnaHRgIHdpdGggbG9nIHRyYW5zZm9ybWF0
aW9ucyBpbiBhIG1vZGVsIHRoYXQgCmluY2x1ZGVkIGFsbCBvdGhlciBzaWduaWZpY2FudCB2YXJpYWJsZXMgZnJvbSB0aGUgYmFzZWxpbmUgbW9kZWwgcmVzdWx0ZWQgaW4gCmBsb2dfd2VpZ2h0YCBhbmQgYGxvZ195ZWFyYCBiZWluZyBzdGF0aXN0aWNhbGx5IHNpZ25pZmljYW50LiBGb3IgYGxvZ193ZWlnaHRgLCAKYSAxJSBpbmNyZWFzZSBpbiBgd2VpZ2h0YCBsZWFkcyB0byBhIGRlY3JlYXNlIGluIG1wZyBieSBhcHByb3hpbWF0ZWx5IDAuMTc2OSBtcGcuIApUaGlzIHN1Z2dlc3RzIHRoYXQgaGVhdmllciBjYXJzIGhhdmUgd29yc2UgZnVlbCBlZmZpY2llbmN5LCBidXQgdGhlIHJhdGUgb2YgCmRlY2xpbmUgc2xvd3MgYXMgd2VpZ2h0IGluY3JlYXNlcy4gRm9yIGBsb2dfeWVhcmAsIGEgMSUgaW5jcmVhc2UgaW4geWVhciAobW92aW5nIApmcm9tIDE5NzAgdG8gMTk3MSwgZXRjLikgcmVzdWx0cyBpbiBhbiBpbmNyZWFzZSBvZiB+MC41NyBtcGcsIHBvc3NpYmx5IGR1ZSB0byAKYWR2YW5jZW1lbnRzIGluIHRlY2hub2xvZ3ksIHJlZ3VsYXRpb25zLCBhbmQgbWFudWZhY3R1cmluZyB0ZWNobmlxdWVzLiBUaGUgUi1TcXVhcmVkIAp2YWx1ZSBpcyBncmVhdGVyIHRoYW4gdGhlIGJhc2VsaW5lIG1vZGVsIGluIHRoaXMgY2FzZS4KCmBgYHtyfQpBdXRvbW9kZWw2PC1sbShtcGcgfiBjeWxpbmRlcnMgKyBzcXJ0X3dlaWdodCArIHNxcnRfeWVhciArIG9yaWdpbiwgQXV0bykKc3VtbWFyeShBdXRvbW9kZWw2KQpwYXIobWZyb3cgPSBjKDIsMikpCnBsb3QoQXV0b21vZGVsNikKYGBgCgpXaGVuIGBob3JzZXBvd2VyYCBhbmQgYGFjY2VsZXJhdGlvbmAgd2VyZSBpbmNsdWRlZCBpbiB0aGUgbW9kZWwsIHRoZXkgd2VyZSBpbnNpZ25pZmljYW50LiAKVGh1cywgdGhleSB3ZXJlIHJlbW92ZWQgYW5kIHRoZSBtb2RlbCB3YXMgcmVmaXQuIEluIHRoaXMgdXBkYXRlZCBtb2RlbCwgCnRoZSBwcmVkaWN0b3JzIGhhdmUgYSBzcXVhcmUgcm9vdCB0cmFuc2Zvcm1hdGlvbiBpbnN0ZWFkLiBUaGUgc3F1YXJlIHJvb3QgCnRyYW5zZm9ybWF0aW9uIG9mIGB3ZWlnaHRgIGFuZCBgeWVhcmAgYXJlIGJvdGggc3RhdGlzdGljYWxseSBzaWduaWZpY2FudC4gClRoZSBSLVNxdWFyZWQgdmFsdWUgaW4gdGhpcyBtb2RlbCBkZWNyZWFzZXMgbWFyZ2luYWxseSBjb21wYXJlZCB0byB0aGUgcHJldmlvdXMuIAoKYGBge3J9CkF1dG9tb2RlbDc8LWxtKG1wZyB+IGN5bGluZGVycyArIHNxX3dlaWdodCArIHNxX3llYXIgKyBvcmlnaW4sQXV0bykKc3VtbWFyeShBdXRvbW9kZWw3KQpwYXIobWZyb3cgPSBjKDIsMikpCnBsb3QoQXV0b21vZGVsNykKYGBgCgpJbiB0aGlzIG1vZGVsLCBhIHNxdWFyZWQgdHJhbnNmb3JtYXRpb24gaXMgYXBwbGllZCB0byBgeWVhcmAgYW5kIGB3ZWlnaHRgLgpCb3RoIHRyYW5zZm9ybWF0aW9ucyBhcmUgb25jZSBhZ2FpbiBzdGF0aXN0aWNhbGx5IHNpZ25pZmljYW50LiBIb3dldmVyLCB0aGUgClItU3F1YXJlZCB2YWx1ZSBvbiB0aGlzIG1vZGVs
IGRlY3JlYXNlcyBhIGxpdHRsZSB0byA4MS44NiUuIFRodXMsIHRoZSBzcXVhcmUgcm9vdCAKdHJhbnNmb3JtYXRpb24gaXMgdGhlIGlkZWEgdHJhbnNmb3JtYXRpb24gdG8gYXBwbHkgZm9yIHRoZXNlIHZhcmlhYmxlcyBiYXNlZCBvbiAKaXRzIGxvd2VyIHJlc2lkdWFsIGVycm9yIGFuZCBoaWdoZXIgUi1TcXVhcmVkLiAKCiMjIyBFeGVyY2lzZSAxMAoqKlRoaXMgcXVlc3Rpb24gc2hvdWxkIGJlIGFuc3dlcmVkIHVzaW5nIHRoZSBDYXJzZWF0cyBkYXRhIHNldC4qKgoKYGBge3J9CmRhdGEoIkNhcnNlYXRzIikKc3VtbWFyeShDYXJzZWF0cykKYGBgCgoqKiQoYSkkIEZpdCBhIG11bHRpcGxlIHJlZ3Jlc3Npb24gbW9kZWwgdG8gcHJlZGljdCBTYWxlcyB1c2luZyBQcmljZSwgVXJiYW4sIGFuZCBVUy4qKgoKYGBge3J9CkNhcnNlYXRzbW9kZWwgPC0gIGxtKFNhbGVzIH4gUHJpY2UgKyBVcmJhbiArIFVTLCBkYXRhID0gQ2Fyc2VhdHMpCnN1bW1hcnkoQ2Fyc2VhdHNtb2RlbCkKYGBgCgoqKiQoYikkIFByb3ZpZGUgYW4gaW50ZXJwcmV0YXRpb24gb2YgZWFjaCBjb2VmZmljaWVudCBpbiB0aGUgbW9kZWwuIEJlIGNhcmVmdWzigJRzb21lIG9mIHRoZSB2YXJpYWJsZXMgaW4gdGhlIG1vZGVsIGFyZSBxdWFsaXRhdGl2ZSEqKgoKUHJpY2U6IFRoZXJlIGlzIGxpa2VseSBhIGNvcnJlbGF0aW9uIGJldHdlZW4gcHJpY2UgYW5kIHNhbGVzLCB3aXRoIHRoZSBjb2VmZmljaWVudCBzaG93aW5nIGEgbmVnYXRpdmUgcmVsYXRpb25zaGlwLiBUaGlzIHN1Z2dlc3RzIHRoYXQgZm9yIGVhY2ggb25lLXVuaXQgaW5jcmVhc2UgaW4gcHJpY2UsIApzYWxlcyBhcmUgZXhwZWN0ZWQgdG8gZGVjcmVhc2UgYnkgYXBwcm94aW1hdGVseSAwLjA1NDQgdW5pdHMuCgpVcmJhblllczogVGhlcmUgaXMgbm90IGVub3VnaCBldmlkZW5jZSB0byBzdWdnZXN0IGEgbGluayBiZXR3ZWVuIHRoZSBsb2NhdGlvbiBvZiAKdGhlIHN0b3JlIGFuZCB0aGUgbnVtYmVyIG9mIHNhbGVzLiBXaXRoIHRoZSBnaXZlbiBpbmZvcm1hdGlvbiwgVXJiYW5ZZXMgaXMgbm90IGEgCnNpZ25pZmljYW50IHByZWRpY3RvciBmb3IgdGhlIG1vZGVsLiAKClVTWWVzOiBUaGVyZSBhcHBlYXJzIHRvIGJlIGEgcG9zaXRpdmUgcmVsYXRpb25zaGlwIGJldHdlZW4gd2hldGhlciBhIHN0b3JlIGlzIApsb2NhdGVkIGluIHRoZSBVUyBvciBub3QgYW5kIHRoZSBhbW91bnQgb2Ygc2FsZXMsIHdpdGggYW4gYXBwcm94aW1hdGUgaW5jcmVhc2Ugb2YgCjEuMiBzYWxlcyB1bml0cyBpZiB0aGUgc3RvcmUgaXMgYmFzZWQgaW4gdGhlIFVTLgoKKiokKGMpJCBXcml0ZSBvdXQgdGhlIG1vZGVsIGluIGVxdWF0aW9uIGZvcm0sIGJlaW5nIGNhcmVmdWwgdG8gaGFuZGxlIHRoZSBxdWFsaXRhdGl2ZSB2YXJpYWJsZXMgcHJvcGVybHkuKioKCiRcdGV4dHtTYWxlc309MTMuMDQtMC4wNVx0aW1lc1x0ZXh0e1ByaWNlfS0wLjAyXHRpbWVzXHRleHR7VXJiYW5ZZXN9KzEuMjBcdGltZXNcdGV4dHtVU1llc30kCgoqKiQoZCkk
IEZvciB3aGljaCBvZiB0aGUgcHJlZGljdG9ycyBjYW4geW91IHJlamVjdCB0aGUgbnVsbCBoeXBvdGhlc2lzICRIXzAkOiAkXGJldGFfaiQ9MD8qKgoKVGhlIG51bGwgaHlwb3RoZXNpcyBjYW4gYmUgcmVqZWN0ZWQgZm9yIFByaWNlIGFuZCBVU1llcyBiYXNlZCBvbiB0aGUgcC12YWx1ZXMuIAoKKiokKGUpJCBPbiB0aGUgYmFzaXMgb2YgeW91ciByZXNwb25zZSB0byB0aGUgcHJldmlvdXMgcXVlc3Rpb24sIGZpdCBhIHNtYWxsZXIgbW9kZWwgdGhhdCBvbmx5IHVzZXMgdGhlIHByZWRpY3RvcnMgZm9yIHdoaWNoIHRoZXJlIGlzIGV2aWRlbmNlIG9mIGFzc29jaWF0aW9uIHdpdGggdGhlIG91dGNvbWUuKioKCmBgYHtyfQpDYXJzZWF0c21vZGVsMjwtICBsbShTYWxlcyB+IFByaWNlICsgVVMsIGRhdGEgPSBDYXJzZWF0cykKc3VtbWFyeShDYXJzZWF0c21vZGVsMikKYGBgCioqJChmKSQgSG93IHdlbGwgZG8gdGhlIG1vZGVscyBpbiAkKGEpJCBhbmQgJChlKSQgZml0IHRoZSBkYXRhPyoqCgpCYXNlZCBvbiB0aGUgUi1zcXVhcmVkIHZhbHVlcywgYm90aCBtb2RlbHMgZml0IHRoZSBkYXRhIHNpbWlsYXJseS4gCgoqKiQoZykkIFVzaW5nIHRoZSBtb2RlbCBmcm9tICQoZSkkLCBvYnRhaW4gOTUlIGNvbmZpZGVuY2UgaW50ZXJ2YWxzIGZvciB0aGUgY29lZmZpY2llbnQocykuKioKCmBgYHtyfQpjb25maW50KENhcnNlYXRzbW9kZWwyKQpgYGAKCioqJChoKSQgSXMgdGhlcmUgZXZpZGVuY2Ugb2Ygb3V0bGllcnMgb3IgaGlnaCBsZXZlcmFnZSBvYnNlcnZhdGlvbnMgaW4gdGhlIG1vZGVsIGZyb20gJChlKSQ/KioKCmBgYHtyfQpwYXIobWZyb3cgPSBjKDIsIDIpKQpwbG90KENhcnNlYXRzbW9kZWwyKQpzdW1tYXJ5KGluZmx1ZW5jZS5tZWFzdXJlcyhDYXJzZWF0c21vZGVsMikpCmBgYAoKVGhlIHJlc2lkdWFscyBhcHBlYXIgdG8gYmUgYm91bmRlZCBjbG9zZSB0byB0aGUgcmVmZXJlbmNlIGxpbmUuIApUaGVyZWZvcmUsIHdlIGNhbiBzYXkgdGhhdCB0aGVyZSBhcmUgbm90IG1hbnkgb3V0bGllcnMgcHJlc2VudCBpbiB0aGUgZGF0YS4KCk1vc3Qgb2YgdGhlIHByb3ZpZGVkIERGQiB2YWx1ZXMgYXBwZWFyIHRvIGJlIHJlbGF0aXZlbHkgc21hbGwgKGluIHRoZSByYW5nZSBvZiAwLjAgCnRvIDAuMyksIHNvIHRoZXJlIGFyZSBubyBzdHJvbmcgaW5kaWNhdGlvbnMgb2Ygb3V0bGllcnMgZnJvbSB0aGUgREZCIHZhbHVlcyBhbG9uZS4KVGhlcmUgYXJlIGEgZmV3IERGRklUIHZhbHVlcyB0aGF0IGZsYWcgb2JzZXJ2YXRpb25zIGFzIGluZmx1ZW50aWFsIChvYnMuICMyNiwgIzUwLCAKYW5kICMzNjgpLiBDb29rJ3MgRCB2YWx1ZXMgZG8gbm90IGNhdXNlIGFueSBhbGFybSBpbiB0aGlzIGNhc2UuIFRoZSBIYXQgdmFsdWVzIGFyZSAKcmVsYXRpdmVseSBsb3cgYnV0IHRob3NlIHRoYXQgYXJlIGhpZ2hlciBtaWdodCBiZSBvYnNlcnZhdGlvbnMgb2YgaGlnaCBsZXZlcmFnZS4gCgojIyMgRXhlcmNpc2UgMTIKKipUaGlzIHByb2JsZW0gaW52b2x2ZXMg
c2ltcGxlIGxpbmVhciByZWdyZXNzaW9uIHdpdGhvdXQgYW4gaW50ZXJjZXB0LioqCgoqKiQoYSkkIFJlY2FsbCB0aGF0IHRoZSBjb2VmZmljaWVudCBlc3RpbWF0ZSAkXGhhdHtcYmV0YX0kIGZvciB0aGUgbGluZWFyIHJlZ3Jlc3Npb24gb2YgWSBvbnRvIFggd2l0aG91dCBhbiBpbnRlcmNlcHQgaXMgZ2l2ZW4gYnkgKDMuMzgpLiBVbmRlciB3aGF0IGNpcmN1bXN0YW5jZSBpcyB0aGUgY29lZmZpY2llbnQgZXN0aW1hdGUgZm9yIHRoZSByZWdyZXNzaW9uIG9mIFggb250byBZIHRoZSBzYW1lIGFzIHRoZSBjb2VmZmljaWVudCBlc3RpbWF0ZSBmb3IgdGhlIHJlZ3Jlc3Npb24gb2YgWSBvbnRvIFg/KioKClRoZSBjb2VmZmljaWVudCBlc3RpbWF0ZSBmb3IgdGhlIHJlZ3Jlc3Npb24gb2YgWSBvbnRvIFggaXMgJFxoYXR7XGJldGF9JCA9ICRcZnJhY3tcc3VtX2l4X2l5X2l9e1xzdW1fanhfal4yfSQuIAoKVGhlIGNvZWZmaWNpZW50IGVzdGltYXRlIGZvciB0aGUgcmVncmVzc2lvbiBvZiBYIG9udG8gWSBpcyAkXGhhdHtcYmV0YX0kID0gJFxmcmFje1xzdW1faXhfaXlfaX17XHN1bV9qeV9qXjJ9JC4gCgpUaGUgY29lZmZpY2llbnRzIGFyZSB0aGUgc2FtZSBpZiAkXHN1bV9qeF9qXjI9XHN1bV9qeV9qXjIkLiAKCioqJChiKSQgR2VuZXJhdGUgYW4gZXhhbXBsZSBpbiBSIHdpdGggbiA9IDEwMCBvYnNlcnZhdGlvbnMgaW4gd2hpY2ggdGhlIGNvZWZmaWNpZW50IGVzdGltYXRlIGZvciB0aGUgcmVncmVzc2lvbiBvZiBYIG9udG8gWSBpcyBkaWZmZXJlbnQgZnJvbSB0aGUgY29lZmZpY2llbnQgZXN0aW1hdGUgZm9yIHRoZSByZWdyZXNzaW9uIG9mIFkgb250byBYLioqCgpgYGB7cn0Kc2V0LnNlZWQoMSkKbiA8LSAxMDAKeCA8LSBybm9ybShuKQp5IDwtIDIqeCtybm9ybSgxMDAsIG1lYW4gPSAwLCBzZCA9IDEpCmZpdC5ZIDwtIGxtKHkgfiB4KQpzdW1tYXJ5KGZpdC5ZKQpmaXQuWCA8LSBsbSh4IH4geSkKc3VtbWFyeShmaXQuWCkKYGBgCgpUaGUgY29lZmZpY2llbnQgZm9yIFkgaXMgbXVjaCBzbWFsbGVyIHRoYW4gdGhlIGNvZWZmaWNpZW50IGZvciBYLCBzaG93aW5nIHRoYXQgCnRoZSByZWdyZXNzaW9uIG9mIFggb24gWSBkb2VzIG5vdCBoYXZlIHRoZSBzYW1lIHNsb3BlIGFzIHRoZSByZWdyZXNzaW9uIG9mIFkgb24gWC4gCgoqKiQoYykkIEdlbmVyYXRlIGFuIGV4YW1wbGUgaW4gUiB3aXRoIG4gPSAxMDAgb2JzZXJ2YXRpb25zIGluIHdoaWNoIHRoZSBjb2VmZmljaWVudCBlc3RpbWF0ZSBmb3IgdGhlIHJlZ3Jlc3Npb24gb2YgWCBvbnRvIFkgaXMgdGhlIHNhbWUgYXMgdGhlIGNvZWZmaWNpZW50IGVzdGltYXRlIGZvciB0aGUgcmVncmVzc2lvbiBvZiBZIG9udG8gWC4qKgoKYGBge3J9Cm4gPC0gMTAwCnggPC0gcm5vcm0obikKeSA8LSB4CmZpdC5ZIDwtIGxtKHkgfiB4KQpzdW1tYXJ5KGZpdC5ZKQpmaXQuWCA8LSBsbSh4IH4geSkKc3VtbWFyeShmaXQuWCkKYGBgCgpUaGUgY29lZmZpY2llbnRzIGFyZSBleGFjdGx5IHRoZSBzYW1lLiAK