Chapter 3: Assignment #2

Chapter 03 (page 120): 2, 9, 10, 12

Question 2:

Carefully explain the differences between the KNN classifier and KNN regression methods.

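The KNN classifier is used when the response is qualitative. For a test point it finds the K nearest training observations, estimates the conditional probability of each class as the fraction of those neighbors that belong to it, and assigns the class with the highest estimated probability (an approximation to the Bayes classifier). KNN regression is used when the response is quantitative. It finds the same K nearest neighbors but predicts the response as the average of their response values.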
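The contrast can be sketched in a few lines of base R. The toy training data and the helper names `nearest_k`, `knn_classify`, and `knn_regress` below are made up for illustration; they are not part of the assignment or of any package.

```r
# Toy 1-D training data (hypothetical values, for illustration only)
train_x     <- c(1, 2, 3, 6, 7, 8)
train_class <- c("a", "a", "a", "b", "b", "b")   # qualitative response
train_y     <- c(1.0, 2.1, 2.9, 6.2, 6.8, 8.1)   # quantitative response

# Indices of the k training points closest to x0
nearest_k <- function(x0, k) order(abs(train_x - x0))[1:k]

# KNN classifier: class proportions among the neighbors, then majority vote
knn_classify <- function(x0, k) {
  probs <- table(train_class[nearest_k(x0, k)]) / k
  names(probs)[which.max(probs)]
}

# KNN regression: average response of the neighbors
knn_regress <- function(x0, k) mean(train_y[nearest_k(x0, k)])

knn_classify(2.5, k = 3)   # "a"
knn_regress(2.5, k = 3)    # 2
```

Both functions share the neighbor search; only the last step (vote versus average) differs.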

Question 9:

This question involves the use of multiple linear regression on the Auto data set.

library(ISLR)
attach(Auto)

Produce a scatterplot matrix which includes all of the variables in the data set.

pairs(Auto, panel = panel.smooth)

Compute the matrix of correlations between the variables using the function cor(). You will need to exclude the name variable, which is qualitative.

#all columns but name
cor(Auto[,1:8])

##                     mpg  cylinders displacement horsepower     weight
## mpg           1.0000000 -0.7776175   -0.8051269 -0.7784268 -0.8322442
## cylinders    -0.7776175  1.0000000    0.9508233  0.8429834  0.8975273
## displacement -0.8051269  0.9508233    1.0000000  0.8972570  0.9329944
## horsepower   -0.7784268  0.8429834    0.8972570  1.0000000  0.8645377
## weight       -0.8322442  0.8975273    0.9329944  0.8645377  1.0000000
## acceleration  0.4233285 -0.5046834   -0.5438005 -0.6891955 -0.4168392
## year          0.5805410 -0.3456474   -0.3698552 -0.4163615 -0.3091199
## origin        0.5652088 -0.5689316   -0.6145351 -0.4551715 -0.5850054
##              acceleration       year     origin
## mpg             0.4233285  0.5805410  0.5652088
## cylinders      -0.5046834 -0.3456474 -0.5689316
## displacement   -0.5438005 -0.3698552 -0.6145351
## horsepower     -0.6891955 -0.4163615 -0.4551715
## weight         -0.4168392 -0.3091199 -0.5850054
## acceleration    1.0000000  0.2903161  0.2127458
## year            0.2903161  1.0000000  0.1815277
## origin          0.2127458  0.1815277  1.0000000
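As an alternative to selecting columns by position, the qualitative column can be dropped by type. This is only a sketch: it assumes the ISLR package is installed and that `name` is the only non-numeric column in Auto; the variable names `numeric_cols`, `auto_cor`, and `mpg_cor` are illustrative.

```r
library(ISLR)  # provides the Auto data set, as in the code above

# Keep only the numeric columns; this drops the qualitative `name` column
numeric_cols <- sapply(Auto, is.numeric)
auto_cor <- cor(Auto[, numeric_cols])

# Correlations of each predictor with mpg, strongest first
mpg_cor <- auto_cor["mpg", colnames(auto_cor) != "mpg"]
round(mpg_cor[order(abs(mpg_cor), decreasing = TRUE)], 3)
```

Selecting by type keeps working if the column order of the data frame changes.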

Use the lm() function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors. Use the summary() function to print the results.

lm.auto <- lm(mpg ~.-name, data = Auto)
summary(lm.auto)

## 
## Call:
## lm(formula = mpg ~ . - name, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5903 -2.1565 -0.1169  1.8690 13.0604 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -17.218435   4.644294  -3.707  0.00024 ***
## cylinders     -0.493376   0.323282  -1.526  0.12780    
## displacement   0.019896   0.007515   2.647  0.00844 ** 
## horsepower    -0.016951   0.013787  -1.230  0.21963    
## weight        -0.006474   0.000652  -9.929  < 2e-16 ***
## acceleration   0.080576   0.098845   0.815  0.41548    
## year           0.750773   0.050973  14.729  < 2e-16 ***
## origin         1.426141   0.278136   5.127 4.67e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared:  0.8215, Adjusted R-squared:  0.8182 
## F-statistic: 252.4 on 7 and 384 DF,  p-value: < 2.2e-16

Comment on the output. For instance: i. Is there a relationship between the predictors and the response? ii. Which predictors appear to have a statistically significant relationship to the response? iii. What does the coefficient for the year variable suggest?

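We will use a significance level of 0.05. Our response variable Y is mpg, and the predictor variables are every variable except name.

The F-statistic of 252.4 on 7 and 384 degrees of freedom has a p-value below 2.2e-16, so we reject the hypothesis that all coefficients are zero: there is a relationship between the predictors and the response.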

We check the individual significance by the t-test.

cylinders - 0.12780
displacement - 0.00844
horsepower - 0.21963
weight - < 2e-16
acceleration - 0.41548
year - < 2e-16
origin - 4.67e-07

With these p-values we test the following hypotheses for each predictor:

Ho: No linear relationship
Ha: Linear relationship

For a p-value less than 0.05, we reject the null hypothesis in favor of the alternative. Displacement, weight, year, and origin have a significant linear relationship with mpg.

The coefficient for year suggests that, with all other predictors held constant, mpg increases on average by 0.750773 for each additional model year.

Next we look at R^2. Our model has a multiple R^2 of 0.8215 and an adjusted R^2 of 0.8182, so the model with all variables but name explains about 82% of the variation in mpg.
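The significance checks above can also be read off the fitted model programmatically instead of by eye. The sketch below assumes the ISLR package is available and refits the same model; `coef_table`, `p_values`, and `alpha` are illustrative names.

```r
library(ISLR)

lm.auto <- lm(mpg ~ . - name, data = Auto)
coef_table <- coef(summary(lm.auto))

# p-values of the individual t-tests, dropping the intercept
p_values <- coef_table[-1, "Pr(>|t|)"]
alpha <- 0.05
names(p_values)[p_values < alpha]

# Overall F-test: is at least one predictor related to mpg?
f <- summary(lm.auto)$fstatistic
pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
```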

Use the plot() function to produce diagnostic plots of the linear regression fit. Comment on any problems you see with the fit. Do the residual plots suggest any unusually large outliers? Does the leverage plot identify any observations with unusually high leverage?

par(mfrow=c(2,2))
plot(lm.auto)

The normal Q-Q plot broadly supports the normality assumption: the standardized residuals fall close to a straight line, although we do see some outliers in the upper tail. On the scale-location plot (square root of the absolute standardized residuals), all values are between 0.0 and 2.0.

We use the residuals vs fitted plot to check for homoscedasticity (equal variance of Y at each given X = x). We see no special pattern in the residual plot, so there isn't strong evidence of unequal variance. Now we check the Residuals vs Leverage plot. Three points lie much farther from the rest than the other points, and observation 14 has high leverage.
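The visual checks can be complemented with numeric ones. This is a sketch assuming the ISLR package is available; the cut-offs used here (|standardized residual| > 3 and leverage above three times the average) are common rules of thumb chosen for illustration, not values from the text.

```r
library(ISLR)

lm.auto <- lm(mpg ~ . - name, data = Auto)

std_res  <- rstandard(lm.auto)   # standardized residuals
leverage <- hatvalues(lm.auto)   # diagonal of the hat matrix

# Average leverage is (number of coefficients) / n
avg_leverage <- length(coef(lm.auto)) / nrow(Auto)

# Rule-of-thumb flags (cut-offs are assumptions)
which(abs(std_res) > 3)
which(leverage > 3 * avg_leverage)
```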

Use the * and : symbols to fit linear regression models with interaction effects. Do any interactions appear to be statistically significant?

interaction_1 <- lm(mpg ~ . - name + weight*acceleration,data=Auto)
summary(interaction_1)

## 
## Call:
## lm(formula = mpg ~ . - name + weight * acceleration, data = Auto)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.247 -2.048 -0.045  1.619 12.193 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -4.364e+01  5.811e+00  -7.511 4.18e-13 ***
## cylinders           -2.141e-01  3.078e-01  -0.696 0.487117    
## displacement         3.138e-03  7.495e-03   0.419 0.675622    
## horsepower          -4.141e-02  1.348e-02  -3.071 0.002287 ** 
## weight               4.027e-03  1.636e-03   2.462 0.014268 *  
## acceleration         1.629e+00  2.422e-01   6.726 6.36e-11 ***
## year                 7.821e-01  4.833e-02  16.184  < 2e-16 ***
## origin               1.033e+00  2.686e-01   3.846 0.000141 ***
## weight:acceleration -5.826e-04  8.408e-05  -6.928 1.81e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.141 on 383 degrees of freedom
## Multiple R-squared:  0.8414, Adjusted R-squared:  0.838 
## F-statistic: 253.9 on 8 and 383 DF,  p-value: < 2.2e-16

The p-value for the interaction term, weight:acceleration, is small (1.81e-11), indicating strong evidence for Ha : β ≠ 0. In other words, the data suggest that the true relationship is not additive.

The multiple R^2 for this model is 84.14% (adjusted 83.8%), compared with 82.15% (adjusted 81.82%) for the previous model without an interaction term. This means that (84.14 − 82.15) / (100 − 82.15) ≈ 11% of the variability in mpg that remains after fitting the additive model is explained by the interaction term.
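The share of remaining variability explained by an added term can be computed directly from the two fits as (R^2 new − R^2 old) / (1 − R^2 old). A minimal sketch, assuming the ISLR package is available; the object names `additive` and `with_int` are illustrative.

```r
library(ISLR)

additive <- lm(mpg ~ . - name, data = Auto)
with_int <- lm(mpg ~ . - name + weight:acceleration, data = Auto)

r2_add <- summary(additive)$r.squared
r2_int <- summary(with_int)$r.squared

# Share of the variability left unexplained by the additive model
# that the interaction term accounts for
frac <- (r2_int - r2_add) / (1 - r2_add)
frac

# Nested-model F-test for the same comparison
anova(additive, with_int)
```

The denominator uses the additive model's R^2 because the unexplained variability being divided up is what that model leaves behind.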

interaction_2 <-lm(mpg ~.-name+displacement:weight, data = Auto)
summary(interaction_2)

## 
## Call:
## lm(formula = mpg ~ . - name + displacement:weight, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.9027 -1.8092 -0.0946  1.5549 12.1687 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -5.389e+00  4.301e+00  -1.253   0.2109    
## cylinders            1.175e-01  2.943e-01   0.399   0.6899    
## displacement        -6.837e-02  1.104e-02  -6.193 1.52e-09 ***
## horsepower          -3.280e-02  1.238e-02  -2.649   0.0084 ** 
## weight              -1.064e-02  7.136e-04 -14.915  < 2e-16 ***
## acceleration         6.724e-02  8.805e-02   0.764   0.4455    
## year                 7.852e-01  4.553e-02  17.246  < 2e-16 ***
## origin               5.610e-01  2.622e-01   2.139   0.0331 *  
## displacement:weight  2.269e-05  2.257e-06  10.054  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.964 on 383 degrees of freedom
## Multiple R-squared:  0.8588, Adjusted R-squared:  0.8558 
## F-statistic: 291.1 on 8 and 383 DF,  p-value: < 2.2e-16

The p-value for the interaction term, displacement:weight, is small (< 2e-16), indicating strong evidence for Ha : β ≠ 0. In other words, the data suggest that the true relationship is not additive.

The multiple R^2 for this model is 85.88% (adjusted 85.58%), compared with 82.15% (adjusted 81.82%) for the model without an interaction term. This means that (85.88 − 82.15) / (100 − 82.15) ≈ 21% of the variability in mpg that remains after fitting the additive model is explained by the interaction term.

interaction_3 <-lm(mpg ~.-name+year:weight, data = Auto)
summary(interaction_3)

## 
## Call:
## lm(formula = mpg ~ . - name + year:weight, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.9995 -1.8495 -0.1559  1.6061 11.7042 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -1.186e+02  1.338e+01  -8.864  < 2e-16 ***
## cylinders    -1.218e-01  3.032e-01  -0.402   0.6881    
## displacement  1.293e-02  7.019e-03   1.842   0.0663 .  
## horsepower   -2.877e-02  1.286e-02  -2.236   0.0259 *  
## weight        3.044e-02  4.652e-03   6.543 1.94e-10 ***
## acceleration  1.447e-01  9.196e-02   1.574   0.1164    
## year          2.084e+00  1.732e-01  12.033  < 2e-16 ***
## origin        1.174e+00  2.597e-01   4.519 8.30e-06 ***
## weight:year  -4.879e-04  6.097e-05  -8.002 1.47e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.084 on 383 degrees of freedom
## Multiple R-squared:  0.847,  Adjusted R-squared:  0.8439 
## F-statistic: 265.1 on 8 and 383 DF,  p-value: < 2.2e-16

The p-value for the interaction term, weight:year, is small (1.47e-14), indicating strong evidence for Ha : β ≠ 0. In other words, the data suggest that the true relationship is not additive.

The multiple R^2 for this model is 84.70% (adjusted 84.39%), compared with 82.15% (adjusted 81.82%) for the model without an interaction term. This means that (84.70 − 82.15) / (100 − 82.15) ≈ 14% of the variability in mpg that remains after fitting the additive model is explained by the interaction term.

interaction_4 <-lm(mpg ~.-name+horsepower:origin, data = Auto)
summary(interaction_4)

## 
## Call:
## lm(formula = mpg ~ . - name + horsepower:origin, data = Auto)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.277 -1.875 -0.225  1.570 12.080 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -2.196e+01  4.396e+00  -4.996 8.94e-07 ***
## cylinders         -5.275e-01  3.028e-01  -1.742   0.0823 .  
## displacement      -1.486e-03  7.607e-03  -0.195   0.8452    
## horsepower         8.173e-02  1.856e-02   4.404 1.38e-05 ***
## weight            -4.710e-03  6.555e-04  -7.186 3.52e-12 ***
## acceleration      -1.124e-01  9.617e-02  -1.168   0.2434    
## year               7.327e-01  4.780e-02  15.328  < 2e-16 ***
## origin             7.695e+00  8.858e-01   8.687  < 2e-16 ***
## horsepower:origin -7.955e-02  1.074e-02  -7.405 8.44e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.116 on 383 degrees of freedom
## Multiple R-squared:  0.8438, Adjusted R-squared:  0.8406 
## F-statistic: 258.7 on 8 and 383 DF,  p-value: < 2.2e-16

The p-value for the interaction term, horsepower:origin, is small (8.44e-13), indicating strong evidence for Ha : β ≠ 0. In other words, the data suggest that the true relationship is not additive.

The multiple R^2 for this model is 84.38% (adjusted 84.06%), compared with 82.15% (adjusted 81.82%) for the model without an interaction term. This means that (84.38 − 82.15) / (100 − 82.15) ≈ 12% of the variability in mpg that remains after fitting the additive model is explained by the interaction term.
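The models above use both `*` and `:`. In a formula, `a * b` expands to `a + b + a:b`, while `a:b` adds only the product term. Because `.` already supplies every main effect here, the two spellings give the same fit. The sketch below assumes the ISLR package is available; `fit_star` and `fit_colon` are illustrative names.

```r
library(ISLR)

fit_star  <- lm(mpg ~ . - name + weight * acceleration, data = Auto)
fit_colon <- lm(mpg ~ . - name + weight:acceleration, data = Auto)

# Same coefficients, because the main effects are already in the model
all.equal(coef(fit_star), coef(fit_colon))

# Without the other predictors the difference becomes visible
names(coef(lm(mpg ~ weight:acceleration, data = Auto)))
names(coef(lm(mpg ~ weight * acceleration, data = Auto)))
```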

Try a few different transformations of the variables, such as log(X), √X, X^2. Comment on your findings.

log_lm <- lm(mpg ~ . -name + log(acceleration), data=Auto)
summary(log_lm)

## 
## Call:
## lm(formula = mpg ~ . - name + log(acceleration), data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.7931 -2.0052 -0.1279  1.9299 13.1085 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        4.552e+01  1.479e+01   3.077  0.00224 ** 
## cylinders         -2.796e-01  3.193e-01  -0.876  0.38172    
## displacement       8.042e-03  7.805e-03   1.030  0.30344    
## horsepower        -3.434e-02  1.401e-02  -2.450  0.01473 *  
## weight            -5.343e-03  6.854e-04  -7.795 6.15e-14 ***
## acceleration       2.167e+00  4.782e-01   4.532 7.82e-06 ***
## year               7.560e-01  4.978e-02  15.186  < 2e-16 ***
## origin             1.329e+00  2.724e-01   4.877 1.58e-06 ***
## log(acceleration) -3.513e+01  7.886e+00  -4.455 1.10e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.249 on 383 degrees of freedom
## Multiple R-squared:  0.8303, Adjusted R-squared:  0.8267 
## F-statistic: 234.2 on 8 and 383 DF,  p-value: < 2.2e-16

The p-value for log(acceleration) is small (1.10e-05), indicating strong evidence for Ha : β ≠ 0. The multiple R^2 for this model is 83.03% (adjusted 82.67%), compared with 82.15% (adjusted 81.82%) for the original model without the log term. This means that (83.03 − 82.15) / (100 − 82.15) ≈ 5% of the variability in mpg that remains after fitting the original model is explained by adding log(acceleration).

# quadratic term for cylinders
sq_lm <- lm(mpg ~ . -name + I(cylinders^2), data=Auto)
summary(sq_lm)

## 
## Call:
## lm(formula = mpg ~ . - name + I(cylinders^2), data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.426  -2.028  -0.161   1.717  12.876 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -2.0178890  5.7290872  -0.352  0.72487    
## cylinders      -5.8179557  1.2643565  -4.602 5.71e-06 ***
## displacement    0.0197886  0.0073457   2.694  0.00737 ** 
## horsepower     -0.0312646  0.0138721  -2.254  0.02478 *  
## weight         -0.0062906  0.0006387  -9.848  < 2e-16 ***
## acceleration    0.1048520  0.0967778   1.083  0.27930    
## year            0.7453135  0.0498398  14.954  < 2e-16 ***
## origin          1.2279200  0.2756596   4.454 1.11e-05 ***
## I(cylinders^2)  0.4644689  0.1067911   4.349 1.75e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.253 on 383 degrees of freedom
## Multiple R-squared:  0.8299, Adjusted R-squared:  0.8263 
## F-statistic: 233.5 on 8 and 383 DF,  p-value: < 2.2e-16

The p-value for cylinders squared is small (1.75e-05), indicating strong evidence for Ha : β ≠ 0.

The multiple R^2 for this model is 82.99% (adjusted 82.63%), compared with 82.15% (adjusted 81.82%) for the original model without the squared term. This means that (82.99 − 82.15) / (100 − 82.15) ≈ 5% of the variability in mpg that remains after fitting the original model is explained by the squared term.
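The exercise also mentions √X. A square-root term can be tried in the same way. This is only a sketch: the choice of horsepower as the variable to transform is an assumption made for illustration, `base_lm` and `root_lm` are illustrative names, and the ISLR package is assumed to be available.

```r
library(ISLR)

base_lm <- lm(mpg ~ . - name, data = Auto)
root_lm <- lm(mpg ~ . - name + sqrt(horsepower), data = Auto)

# Adjusted R^2 before and after adding the square-root term
c(base = summary(base_lm)$adj.r.squared,
  root = summary(root_lm)$adj.r.squared)

# Estimate, standard error, t value, and p-value for the new term
coef(summary(root_lm))["sqrt(horsepower)", ]
```

Unlike `cylinders^2`, `sqrt()` does not need to be wrapped in `I()`, because it is an ordinary function call rather than a formula operator.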

Question 10.

This question should be answered using the Carseats data set.

attach(Carseats)

Fit a multiple regression model to predict Sales using Price, Urban, and US.

lm.carseats <- lm(Sales~Price+Urban+US, data = Carseats)
summary(lm.carseats)

## 
## Call:
## lm(formula = Sales ~ Price + Urban + US, data = Carseats)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9206 -1.6220 -0.0564  1.5786  7.0581 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.043469   0.651012  20.036  < 2e-16 ***
## Price       -0.054459   0.005242 -10.389  < 2e-16 ***
## UrbanYes    -0.021916   0.271650  -0.081    0.936    
## USYes        1.200573   0.259042   4.635 4.86e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2335 
## F-statistic: 41.52 on 3 and 396 DF,  p-value: < 2.2e-16

Provide an interpretation of each coefficient in the model. Be careful—some of the variables in the model are qualitative!

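The following are the coefficients with their corresponding p-values:

Price - < 2e-16
UrbanYes - 0.936
USYes - 4.86e-06

With these p-values we test the following hypotheses for each coefficient:

Ho: No linear relationship
Ha: Linear relationship

For a p-value less than 0.05, we reject the null hypothesis in favor of the alternative.

Price has a significant linear relationship with Sales: holding Urban and US fixed, each one-unit increase in Price is associated with an average decrease in Sales of 0.054459 (Sales is recorded in thousands of units).

UrbanYes is a dummy variable for stores in an urban location. Its coefficient of -0.021916 is the estimated difference in average Sales between urban and non-urban stores with the same Price and US status, and it is not statistically significant.

USYes is a dummy variable for stores located in the US. Its coefficient means that US stores have average Sales 1.200573 higher than non-US stores with the same Price and Urban status, and this difference is significant.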

Write out the model in equation form, being careful to handle the qualitative variables properly.

Our estimated regression line is:

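y hat = 13.043469 - 0.054459 (Price) - 0.021916 (UrbanYes) + 1.200573 (USYes)

where UrbanYes = 1 if the store is in an urban location and 0 otherwise, and USYes = 1 if the store is in the US and 0 otherwise.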
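To see how the dummy variables enter the equation, it can be evaluated by hand. The sketch below copies the coefficients from the output above; the helper `predict_sales` and the example price of 100 are hypothetical, chosen only for illustration.

```r
# Coefficients copied from the fitted model output
b <- c(intercept = 13.043469, Price = -0.054459,
       UrbanYes = -0.021916, USYes = 1.200573)

# Each qualitative variable contributes its coefficient only when "Yes"
predict_sales <- function(price, urban, us) {
  unname(b["intercept"] + b["Price"] * price +
           b["UrbanYes"] * (urban == "Yes") +
           b["USYes"] * (us == "Yes"))
}

predict_sales(price = 100, urban = "Yes", us = "No")   # 7.575653
predict_sales(price = 100, urban = "Yes", us = "Yes")  # 8.776226
```

The two predictions differ by exactly the USYes coefficient, which is what a dummy variable encodes.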

For which of the predictors can you reject the null hypothesis H0 : βj = 0?

We can reject the null hypothesis for Price and USYes.

On the basis of your response to the previous question, fit a smaller model that only uses the predictors for which there is evidence of association with the outcome.

lm.carseats2 <-  lm(Sales ~ Price + US, data = Carseats)
summary(lm.carseats2)

## 
## Call:
## lm(formula = Sales ~ Price + US, data = Carseats)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9269 -1.6286 -0.0574  1.5766  7.0515 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.03079    0.63098  20.652  < 2e-16 ***
## Price       -0.05448    0.00523 -10.416  < 2e-16 ***
## USYes        1.19964    0.25846   4.641 4.71e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.469 on 397 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2354 
## F-statistic: 62.43 on 2 and 397 DF,  p-value: < 2.2e-16

How well do the models in (a) and (e) fit the data?

Both models have a multiple R^2 of 0.2393, so each explains about 23.93% of the variation in Sales. The adjusted R^2 is 0.2335 for model (a) and 0.2354 for model (e), and the residual standard error falls slightly from 2.472 to 2.469. Both models fit the data similarly, and neither explains much of the variation, but model (e) is marginally better because it achieves the same fit with one fewer predictor.

Our estimated regression line from model (a) is:

y hat = 13.043469 - 0.054459 (Price) - 0.021916 (UrbanYes) + 1.200573 (USYes)

Our estimated regression line from model (e) is:

y hat = 13.03079 - 0.05448 (Price) + 1.19964 (USYes)

The Price coefficient barely changes from model (a) to model (e). In model (a), average sales decrease by 0.054459 for each one-unit increase in price. In model (e), they decrease by 0.05448. The two estimates are almost equal.
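The two models are nested, so dropping Urban can also be checked with a partial F-test. A minimal sketch, assuming the ISLR package is available; `full`, `reduced`, and `cmp` are illustrative names. With a single dropped coefficient, the F-test p-value matches the t-test p-value for UrbanYes.

```r
library(ISLR)

full    <- lm(Sales ~ Price + Urban + US, data = Carseats)
reduced <- lm(Sales ~ Price + US, data = Carseats)

# Partial F-test: does adding Urban improve on the reduced model?
cmp <- anova(reduced, full)
cmp

# p-value of the comparison (second row of the table)
p_urban <- cmp[["Pr(>F)"]][2]
p_urban
```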

Using the model from (e), obtain 95 % confidence intervals for the coefficient(s).

confint(lm.carseats2)

##                   2.5 %      97.5 %
## (Intercept) 11.79032020 14.27126531
## Price       -0.06475984 -0.04419543
## USYes        0.69151957  1.70776632
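These intervals are estimate ± t × standard error, with the t quantile taken on the residual degrees of freedom. The sketch below rebuilds two of them from the rounded values printed in the summary for model (e), so the results agree with confint() only up to rounding; the variable names are illustrative.

```r
# Estimates and standard errors copied from summary(lm.carseats2)
est <- c(Price = -0.05448, USYes = 1.19964)
se  <- c(Price = 0.00523,  USYes = 0.25846)
df  <- 397                      # residual degrees of freedom

t_crit <- qt(0.975, df)         # two-sided 95% critical value
ci <- cbind(lower = est - t_crit * se,
            upper = est + t_crit * se)
ci
```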

Is there evidence of outliers or high leverage observations in the model from (e)?

par(mfrow = c(2, 2))
plot(lm.carseats2)

Observing the plots above, we see evidence of both outliers and high leverage observations. Points that fall horizontally far from the center of the cloud tend to pull harder on the line, so we call them points with high leverage.
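The idea that points far from the center in the horizontal direction have high leverage can be checked numerically. For simple linear regression with an intercept, the leverage of observation i is 1/n + (x_i − mean(x))^2 / sum((x − mean(x))^2). The sketch below uses simulated data (an assumption, not the Carseats data) to compare that formula with hatvalues().

```r
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50)
fit <- lm(y ~ x)

# Leverage from the closed-form expression for simple regression
h_formula <- 1 / length(x) + (x - mean(x))^2 / sum((x - mean(x))^2)

# Matches the diagonal of the hat matrix computed by R
all.equal(unname(hatvalues(fit)), h_formula)

# The highest-leverage point is the one farthest from mean(x)
unname(which.max(hatvalues(fit))) == which.max(abs(x - mean(x)))
```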

Question 12.

This problem involves simple linear regression without an intercept.

Recall that the coefficient estimate \(\hat{\beta}\) for the linear regression of Y onto X without an intercept is given by (3.38). Under what circumstance is the coefficient estimate for the regression of X onto Y the same as the coefficient estimate for the regression of Y onto X?

For the regression of Y onto X without an intercept, \(\hat{y} = \hat{\beta}_{yx} x\), where \(\hat{y}\) is the prediction of Y on the basis of X = x, and

\[\hat{\beta}_{yx} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i'=1}^{n} x_{i'}^2}\]

For the regression of X onto Y, \(\hat{x} = \hat{\beta}_{xy} y\), where

\[\hat{\beta}_{xy} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i'=1}^{n} y_{i'}^2}\]

Thus the two coefficient estimates are the same if

\[\sum_{i'=1}^{n} x_{i'}^2 = \sum_{i'=1}^{n} y_{i'}^2\]

(They also coincide, both being zero, in the trivial case \(\sum_{i=1}^{n} x_i y_i = 0\).)

Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is different from the coefficient estimate for the regression of Y onto X.

set.seed(1)
x=rnorm(100)
sum(x^2)

## [1] 81.05509

y <- 3 * x + rnorm(100)
sum(y^2)

## [1] 817.4962

fit_y <- lm(y ~ x + 0)
summary(fit_y)

## 
## Call:
## lm(formula = y ~ x + 0)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.9154 -0.6472 -0.1771  0.5056  2.3109 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## x   2.9939     0.1065   28.12   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9586 on 99 degrees of freedom
## Multiple R-squared:  0.8887, Adjusted R-squared:  0.8876 
## F-statistic: 790.6 on 1 and 99 DF,  p-value: < 2.2e-16

fit_x <- lm(x ~ y + 0)
summary(fit_x)

## 
## Call:
## lm(formula = x ~ y + 0)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.63420 -0.16066  0.07099  0.18507  0.59841 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## y  0.29684    0.01056   28.12   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3019 on 99 degrees of freedom
## Multiple R-squared:  0.8887, Adjusted R-squared:  0.8876 
## F-statistic: 790.6 on 1 and 99 DF,  p-value: < 2.2e-16
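The two estimates can be checked against the closed-form expressions sum(x*y)/sum(x^2) and sum(x*y)/sum(y^2). This sketch regenerates the same simulated data; `beta_yx` and `beta_xy` are illustrative names.

```r
set.seed(1)
x <- rnorm(100)
y <- 3 * x + rnorm(100)

# Closed-form no-intercept estimates
beta_yx <- sum(x * y) / sum(x^2)   # Y onto X
beta_xy <- sum(x * y) / sum(y^2)   # X onto Y

# Both agree with lm() fitted without an intercept
all.equal(unname(coef(lm(y ~ x + 0))), beta_yx)
all.equal(unname(coef(lm(x ~ y + 0))), beta_xy)

# The estimates differ by the ratio of the sums of squares
all.equal(beta_yx / beta_xy, sum(y^2) / sum(x^2))
```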

Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.

set.seed(1)
x_2=rnorm(100)
sum((x_2)^2)

## [1] 81.05509

set.seed(1)
y_2=rnorm(100)
sum((y_2)^2)

## [1] 81.05509

fit.Y <- lm(y_2 ~ x_2 + 0)
fit.X <- lm(x_2 ~ y_2 + 0)
coef(fit.Y)
coef(fit.X)

Because y_2 is generated from the same seed as x_2, the two vectors are identical, so sum(x_2^2) = sum(y_2^2) = 81.05509 and both regressions give a coefficient estimate of 1.
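Identical vectors are the simplest case, but any y with the same sum of squares as x gives equal estimates. One less trivial construction, sketched here as an illustration, takes y to be a random permutation of x; the names `x_p` and `y_p` are hypothetical.

```r
set.seed(1)
x_p <- rnorm(100)
y_p <- sample(x_p)   # same values in a different order

# Equal sums of squares (up to floating-point rounding)
all.equal(sum(x_p^2), sum(y_p^2))

# The two no-intercept estimates agree, without being equal to 1
b_yx <- unname(coef(lm(y_p ~ x_p + 0)))
b_xy <- unname(coef(lm(x_p ~ y_p + 0)))
all.equal(b_yx, b_xy)
```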

Melissa Franco

6/8/2021
