Carefully explain the differences between the KNN classifier and KNN regression methods.
The K-nearest neighbors (KNN) classifier is a predictive method for qualitative (categorical) responses. It begins by identifying the K training observations nearest to a test observation and then assigns the test observation to the class with the highest estimated conditional probability, i.e., the class most common among those K neighbors. KNN regression is the analogous method for quantitative (continuous) responses. It uses the same logic: it identifies the K training observations closest to a prediction point, but instead of taking a majority vote among the neighbors it estimates the response as the average of those K training responses.
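As a brief sketch, assuming the class and FNN packages are installed, both methods can be applied to the same simulated data:
set.seed(1)
x.train <- matrix(rnorm(200), ncol = 2)   # 100 training observations, 2 predictors
x.test  <- matrix(rnorm(20),  ncol = 2)   # 10 test observations
# KNN classification: each test point gets the majority class among its K nearest neighbors
class.train <- factor(ifelse(x.train[, 1] + x.train[, 2] > 0, "Yes", "No"))
class::knn(train = x.train, test = x.test, cl = class.train, k = 5)
# KNN regression: each prediction is the average response of the K nearest training points
y.train <- x.train[, 1] + rnorm(100)
FNN::knn.reg(train = x.train, test = x.test, y = y.train, k = 5)$pred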
This question involves the use of multiple linear regression on the Auto data set.
library(ISLR)
attach(Auto)
(a) Produce a scatterplot matrix which includes all of the variables in the data set.
pairs(Auto)
(b) Compute the matrix of correlations between the variables using the function cor(). You will need to exclude the name variable, which is qualitative.
cor(Auto[-9])
## mpg cylinders displacement horsepower weight
## mpg 1.0000000 -0.7776175 -0.8051269 -0.7784268 -0.8322442
## cylinders -0.7776175 1.0000000 0.9508233 0.8429834 0.8975273
## displacement -0.8051269 0.9508233 1.0000000 0.8972570 0.9329944
## horsepower -0.7784268 0.8429834 0.8972570 1.0000000 0.8645377
## weight -0.8322442 0.8975273 0.9329944 0.8645377 1.0000000
## acceleration 0.4233285 -0.5046834 -0.5438005 -0.6891955 -0.4168392
## year 0.5805410 -0.3456474 -0.3698552 -0.4163615 -0.3091199
## origin 0.5652088 -0.5689316 -0.6145351 -0.4551715 -0.5850054
## acceleration year origin
## mpg 0.4233285 0.5805410 0.5652088
## cylinders -0.5046834 -0.3456474 -0.5689316
## displacement -0.5438005 -0.3698552 -0.6145351
## horsepower -0.6891955 -0.4163615 -0.4551715
## weight -0.4168392 -0.3091199 -0.5850054
## acceleration 1.0000000 0.2903161 0.2127458
## year 0.2903161 1.0000000 0.1815277
## origin 0.2127458 0.1815277 1.0000000
(c) Use the lm() function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors. Use the summary() function to print the results.
model=lm(mpg~.-name,data=Auto)
summary(model)
##
## Call:
## lm(formula = mpg ~ . - name, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5903 -2.1565 -0.1169 1.8690 13.0604
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.218435 4.644294 -3.707 0.00024 ***
## cylinders -0.493376 0.323282 -1.526 0.12780
## displacement 0.019896 0.007515 2.647 0.00844 **
## horsepower -0.016951 0.013787 -1.230 0.21963
## weight -0.006474 0.000652 -9.929 < 2e-16 ***
## acceleration 0.080576 0.098845 0.815 0.41548
## year 0.750773 0.050973 14.729 < 2e-16 ***
## origin 1.426141 0.278136 5.127 4.67e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared: 0.8215, Adjusted R-squared: 0.8182
## F-statistic: 252.4 on 7 and 384 DF, p-value: < 2.2e-16
Comment on the output. For instance:
i. Is there a relationship between the predictors and the
response?
Yes. The F-statistic of 252.4, with a p-value below 2.2e-16, indicates that at least one predictor is related to the response, so we can reject the null hypothesis that all of the regression coefficients are zero. Among the individual predictors, Displacement, Weight, Year, and Origin show a significant relationship with MPG, while Cylinders, Horsepower, and Acceleration do not, given p-values greater than 0.05. The R-squared value shows that 82.15% of the variance in MPG can be explained by the predictors in this regression model.
ii. Which predictors appear to have a statistically significant
relationship to the response?
The predictors that appear to have a statistically significant
relationship to the response variable MPG are
Displacement, Weight, Year, and
Origin. These four predictors all have p-values that are
less than 0.05.
iii. What does the coefficient for the year variable
suggest?
The regression coefficient for year is 0.750773. This suggests that, with every other predictor held constant, cars become more fuel efficient over time: predicted MPG increases by about 0.75 for each additional model year.
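As a quick check of this arithmetic using the fitted model:
coef(model)["year"]        # estimated increase in MPG per additional model year, other predictors fixed
coef(model)["year"] * 10   # roughly a 7.5 MPG gain over a decade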
(d) Use the plot() function to produce diagnostic plots of the linear regression fit. Comment on any problems you see with the fit. Do the residual plots suggest any unusually large outliers? Does the leverage plot identify any observations with unusually high leverage?
par(mfrow=c(2,2))
plot(model)
We can determine that the data reflect a non-linear relationship, given the U-shaped pattern in the Residuals vs. Fitted plot. The Normal Q-Q plot suggests the residuals are roughly normally distributed but somewhat right-skewed in the upper tail. The Scale-Location plot also suggests that the variance of the residuals is not constant, as they appear increasingly spread out at larger fitted values. From the Residuals vs. Leverage plot, we can see that observation 14 has noticeably higher leverage, but it is not beyond the Cook's distance lines. The residual plots do not suggest any unusually large outliers.
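These visual impressions can be checked numerically; a brief sketch using the same fitted model:
which.max(hatvalues(model))      # index of the highest-leverage observation (observation 14 in the plot)
max(cooks.distance(model))       # largest Cook's distance, to compare against the usual threshold of 1
sum(abs(rstudent(model)) > 3)    # number of studentized residuals larger than 3 in absolute value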
(e) Use the * and : symbols to fit linear regression models with interaction effects. Do any interactions appear to be statistically significant?
model=lm(mpg~.-name+displacement*weight,data=Auto)
summary(model)
##
## Call:
## lm(formula = mpg ~ . - name + displacement * weight, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.9027 -1.8092 -0.0946 1.5549 12.1687
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.389e+00 4.301e+00 -1.253 0.2109
## cylinders 1.175e-01 2.943e-01 0.399 0.6899
## displacement -6.837e-02 1.104e-02 -6.193 1.52e-09 ***
## horsepower -3.280e-02 1.238e-02 -2.649 0.0084 **
## weight -1.064e-02 7.136e-04 -14.915 < 2e-16 ***
## acceleration 6.724e-02 8.805e-02 0.764 0.4455
## year 7.852e-01 4.553e-02 17.246 < 2e-16 ***
## origin 5.610e-01 2.622e-01 2.139 0.0331 *
## displacement:weight 2.269e-05 2.257e-06 10.054 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.964 on 383 degrees of freedom
## Multiple R-squared: 0.8588, Adjusted R-squared: 0.8558
## F-statistic: 291.1 on 8 and 383 DF, p-value: < 2.2e-16
model=lm(mpg~.-name+displacement*weight+acceleration*horsepower+cylinders*weight,data=Auto)
summary(model)
##
## Call:
## lm(formula = mpg ~ . - name + displacement * weight + acceleration *
## horsepower + cylinders * weight, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.849 -1.620 0.035 1.492 12.002
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.247e+01 6.070e+00 -2.053 0.0407 *
## cylinders -1.268e+00 1.538e+00 -0.825 0.4100
## displacement -4.872e-02 2.389e-02 -2.040 0.0421 *
## horsepower 6.296e-02 2.526e-02 2.492 0.0131 *
## weight -9.994e-03 1.596e-03 -6.261 1.03e-09 ***
## acceleration 6.654e-01 1.638e-01 4.061 5.92e-05 ***
## year 7.834e-01 4.457e-02 17.577 < 2e-16 ***
## origin 4.845e-01 2.594e-01 1.868 0.0626 .
## displacement:weight 1.269e-05 6.561e-06 1.934 0.0539 .
## horsepower:acceleration -7.876e-03 1.824e-03 -4.318 2.01e-05 ***
## cylinders:weight 4.943e-04 4.545e-04 1.088 0.2774
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.901 on 381 degrees of freedom
## Multiple R-squared: 0.8654, Adjusted R-squared: 0.8618
## F-statistic: 244.9 on 10 and 381 DF, p-value: < 2.2e-16
model=lm(mpg~.-name+displacement*weight+acceleration*horsepower+cylinders*weight+year*origin,data=Auto)
summary(model)
##
## Call:
## lm(formula = mpg ~ . - name + displacement * weight + acceleration *
## horsepower + cylinders * weight + year * origin, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.0077 -1.6880 0.0343 1.3731 12.7822
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.845e+00 9.215e+00 0.960 0.337743
## cylinders -1.120e+00 1.522e+00 -0.736 0.462409
## displacement -5.387e-02 2.369e-02 -2.274 0.023533 *
## horsepower 5.740e-02 2.506e-02 2.291 0.022516 *
## weight -9.879e-03 1.580e-03 -6.254 1.08e-09 ***
## acceleration 6.239e-01 1.626e-01 3.836 0.000146 ***
## year 5.133e-01 9.895e-02 5.187 3.48e-07 ***
## origin -1.209e+01 4.133e+00 -2.926 0.003640 **
## displacement:weight 1.356e-05 6.497e-06 2.087 0.037596 *
## horsepower:acceleration -7.212e-03 1.818e-03 -3.968 8.68e-05 ***
## cylinders:weight 4.394e-04 4.500e-04 0.976 0.329454
## year:origin 1.618e-01 5.307e-02 3.049 0.002456 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.87 on 380 degrees of freedom
## Multiple R-squared: 0.8686, Adjusted R-squared: 0.8648
## F-statistic: 228.3 on 11 and 380 DF, p-value: < 2.2e-16
From the last model produced, we can see that the interaction between Cylinders and Weight is not significant, given a p-value well above 0.05. All of the other interactions in this model appear to be significant, and the R-squared value shows that 86.86% of the variance in MPG can be explained by the predictors.
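One way to test the Cylinders and Weight interaction formally is a partial F-test between nested fits; a sketch, refitting under new names since model has been reassigned above:
fit.full <- lm(mpg ~ . - name + displacement*weight + acceleration*horsepower +
                 cylinders*weight + year*origin, data = Auto)
fit.red  <- update(fit.full, . ~ . - cylinders:weight)
anova(fit.red, fit.full)   # partial F-test for the cylinders:weight interaction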
(f) Try a few different transformations of the variables, such as log(X), √X, X². Comment on your findings.
summary(lm(mpg~.-name+log(displacement),data=Auto))
##
## Call:
## lm(formula = mpg ~ . - name + log(displacement), data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.1562 -1.8388 -0.0423 1.6999 11.7871
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.529e+01 8.485e+00 5.337 1.62e-07 ***
## cylinders 3.391e-03 3.025e-01 0.011 0.991060
## displacement 7.744e-02 9.655e-03 8.021 1.29e-14 ***
## horsepower -4.380e-02 1.304e-02 -3.358 0.000864 ***
## weight -4.536e-03 6.404e-04 -7.083 6.80e-12 ***
## acceleration -1.352e-02 9.142e-02 -0.148 0.882479
## year 7.827e-01 4.695e-02 16.671 < 2e-16 ***
## origin 4.485e-01 2.799e-01 1.602 0.109926
## log(displacement) -1.537e+01 1.804e+00 -8.520 3.70e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.055 on 383 degrees of freedom
## Multiple R-squared: 0.8499, Adjusted R-squared: 0.8468
## F-statistic: 271.1 on 8 and 383 DF, p-value: < 2.2e-16
Adding the log(displacement) transformation yields a highly significant term (p = 3.7e-16), and the fit improves: R-squared rises from 0.8215 in the original model to 0.8499.
summary(lm(mpg~.-name+I(acceleration^2),data=Auto))
##
## Call:
## lm(formula = mpg ~ . - name + I(acceleration^2), data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.9680 -1.9266 -0.0124 1.9153 13.2722
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.1088174 6.4930423 0.787 0.4319
## cylinders -0.3181584 0.3165577 -1.005 0.3155
## displacement 0.0090446 0.0076528 1.182 0.2380
## horsepower -0.0346411 0.0139094 -2.490 0.0132 *
## weight -0.0054113 0.0006719 -8.053 1.03e-14 ***
## acceleration -2.6374431 0.5758788 -4.580 6.30e-06 ***
## year 0.7535781 0.0495815 15.199 < 2e-16 ***
## origin 1.3265929 0.2713219 4.889 1.49e-06 ***
## I(acceleration^2) 0.0790472 0.0165131 4.787 2.42e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.237 on 383 degrees of freedom
## Multiple R-squared: 0.8316, Adjusted R-squared: 0.828
## F-statistic: 236.3 on 8 and 383 DF, p-value: < 2.2e-16
Adding the squared term I(acceleration^2) yields a significant coefficient (p = 2.42e-06), and the acceleration term, which was not significant in part (c), also becomes significant; R-squared rises slightly to 0.8316.
summary(lm(mpg~.-name+sqrt(cylinders),data=Auto))
##
## Call:
## lm(formula = mpg ~ . - name + sqrt(cylinders), data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.7190 -2.1361 -0.1756 1.7299 12.9229
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.281e+01 1.453e+01 2.258 0.024490 *
## cylinders 8.550e+00 2.513e+00 3.402 0.000739 ***
## displacement 2.001e-02 7.399e-03 2.704 0.007149 **
## horsepower -2.867e-02 1.395e-02 -2.055 0.040585 *
## weight -6.365e-03 6.427e-04 -9.905 < 2e-16 ***
## acceleration 1.062e-01 9.757e-02 1.088 0.277224
## year 7.474e-01 5.019e-02 14.891 < 2e-16 ***
## origin 1.255e+00 2.779e-01 4.514 8.46e-06 ***
## sqrt(cylinders) -4.261e+01 1.175e+01 -3.628 0.000325 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.276 on 383 degrees of freedom
## Multiple R-squared: 0.8274, Adjusted R-squared: 0.8238
## F-statistic: 229.5 on 8 and 383 DF, p-value: < 2.2e-16
Adding sqrt(cylinders) yields a significant term (p = 0.000325), and the cylinders term, which was not significant in part (c), also becomes significant; R-squared rises slightly to 0.8274.
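A quick sketch for comparing the three transformed fits on a common scale is to collect their adjusted R-squared values:
fits <- list(
  log.disp = lm(mpg ~ . - name + log(displacement), data = Auto),
  acc.sq   = lm(mpg ~ . - name + I(acceleration^2), data = Auto),
  sqrt.cyl = lm(mpg ~ . - name + sqrt(cylinders),   data = Auto)
)
sapply(fits, function(m) summary(m)$adj.r.squared)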
This question should be answered using the Carseats data set.
library(ISLR)
attach(Carseats)
(a) Fit a multiple regression model to predict
Sales using Price, Urban, and
US.
fit<-lm(Sales~Price+Urban+US)
summary(fit)
##
## Call:
## lm(formula = Sales ~ Price + Urban + US)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.9206 -1.6220 -0.0564 1.5786 7.0581
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.043469 0.651012 20.036 < 2e-16 ***
## Price -0.054459 0.005242 -10.389 < 2e-16 ***
## UrbanYes -0.021916 0.271650 -0.081 0.936
## USYes 1.200573 0.259042 4.635 4.86e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared: 0.2393, Adjusted R-squared: 0.2335
## F-statistic: 41.52 on 3 and 396 DF, p-value: < 2.2e-16
(b) Provide an interpretation of each coefficient in the
model. Be careful—some of the variables in the model are
qualitative!
From the table above, Price and US are significant predictors of Sales. Sales is recorded in thousands of units and Price in dollars, so the Price coefficient of -0.054459 means that, holding the other predictors fixed, a $1 increase in price is associated with roughly 54 fewer car seats sold. The USYes coefficient of 1.200573 means that stores in the US sell about 1,201 more car seats, on average, than stores outside the US at the same price. The UrbanYes coefficient is not statistically significant (p = 0.936), so there is no evidence that urban location affects Sales.
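As a quick sketch, using a hypothetical price of $120, the US effect can be seen directly from predictions:
newdat <- data.frame(Price = 120, Urban = "Yes", US = c("No", "Yes"))   # Price of $120 is a hypothetical value
predict(fit, newdata = newdat)   # the two predictions differ by the USYes coefficient, about 1.2 (thousand units)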
(c) Write out the model in equation form, being careful to handle the qualitative variables properly. \(\widehat{Sales}=13.043469 - 0.054459\,Price - 0.021916\,Urban_{Yes} + 1.200573\,US_{Yes}\), where \(Urban_{Yes}\) and \(US_{Yes}\) are indicator variables equal to 1 when Urban = Yes and US = Yes, respectively, and 0 otherwise.
(d) For which of the predictors can you reject the null
hypothesis \(H_0 : \beta_j =
0\)?
Given p-values of less than 0.05, we can reject the null hypothesis for both of the predictors Price and US.
(e) On the basis of your response to the previous question, fit a smaller model that only uses the predictors for which there is evidence of association with the outcome.
fit<-lm(Sales~Price+US)
summary(fit)
##
## Call:
## lm(formula = Sales ~ Price + US)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.9269 -1.6286 -0.0574 1.5766 7.0515
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.03079 0.63098 20.652 < 2e-16 ***
## Price -0.05448 0.00523 -10.416 < 2e-16 ***
## USYes 1.19964 0.25846 4.641 4.71e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.469 on 397 degrees of freedom
## Multiple R-squared: 0.2393, Adjusted R-squared: 0.2354
## F-statistic: 62.43 on 2 and 397 DF, p-value: < 2.2e-16
(f) How well do the models in (a) and (e) fit the
data?
Neither model fits the data particularly well: both have a multiple R-squared of 0.2393, meaning each explains only about 24% of the variance in Sales. The smaller model in (e) fits essentially as well as the full model in (a), with a marginally higher adjusted R-squared (0.2354 vs. 0.2335).
(g) Using the model from (e), obtain 95 % confidence intervals for the coefficient(s).
confint(fit)
## 2.5 % 97.5 %
## (Intercept) 11.79032020 14.27126531
## Price -0.06475984 -0.04419543
## USYes 0.69151957 1.70776632
(h) Is there evidence of outliers or high leverage
observations in the model from (e)?
Included below are several plots used to assess whether there are outliers or high-leverage observations in our model. In the Residuals vs. Fitted plot, the residuals scatter randomly around the zero line, which suggests the linear fit is reasonable. The Residuals vs. Leverage plot shows a few points with somewhat higher leverage, but none fall beyond the Cook's distance lines. A common benchmark for an acceptable level of leverage is the average leverage, which for this model is \(\frac{(2+1)}{400} = 0.0075\); observations with leverage well above this value warrant attention.
par(mfrow=c(2,2))
plot(fit)
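A brief sketch for checking leverage against the 0.0075 benchmark numerically:
lev <- hatvalues(fit)
mean(lev)                            # equals (p + 1)/n = 3/400 = 0.0075
head(sort(lev, decreasing = TRUE))   # the largest leverage values in the model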
Summarized below are the points that R identifies as exceeding one of its thresholds for an acceptable level of influence.
summary(influence.measures(fit))
## Potentially influential observations of
## lm(formula = Sales ~ Price + US) :
##
## dfb.1_ dfb.Pric dfb.USYs dffit cov.r cook.d hat
## 26 0.24 -0.18 -0.17 0.28_* 0.97_* 0.03 0.01
## 29 -0.10 0.10 -0.10 -0.18 0.97_* 0.01 0.01
## 43 -0.11 0.10 0.03 -0.11 1.05_* 0.00 0.04_*
## 50 -0.10 0.17 -0.17 0.26_* 0.98 0.02 0.01
## 51 -0.05 0.05 -0.11 -0.18 0.95_* 0.01 0.00
## 58 -0.05 -0.02 0.16 -0.20 0.97_* 0.01 0.01
## 69 -0.09 0.10 0.09 0.19 0.96_* 0.01 0.01
## 126 -0.07 0.06 0.03 -0.07 1.03_* 0.00 0.03_*
## 160 0.00 0.00 0.00 0.01 1.02_* 0.00 0.02
## 166 0.21 -0.23 -0.04 -0.24 1.02 0.02 0.03_*
## 172 0.06 -0.07 0.02 0.08 1.03_* 0.00 0.02
## 175 0.14 -0.19 0.09 -0.21 1.03_* 0.02 0.03_*
## 210 -0.14 0.15 -0.10 -0.22 0.97_* 0.02 0.01
## 270 -0.03 0.05 -0.03 0.06 1.03_* 0.00 0.02
## 298 -0.06 0.06 -0.09 -0.15 0.97_* 0.01 0.00
## 314 -0.05 0.04 0.02 -0.05 1.03_* 0.00 0.02_*
## 353 -0.02 0.03 0.09 0.15 0.97_* 0.01 0.00
## 357 0.02 -0.02 0.02 -0.03 1.03_* 0.00 0.02
## 368 0.26 -0.23 -0.11 0.27_* 1.01 0.02 0.02_*
## 377 0.14 -0.15 0.12 0.24 0.95_* 0.02 0.01
## 384 0.00 0.00 0.00 0.00 1.02_* 0.00 0.02
## 387 -0.03 0.04 -0.03 0.05 1.02_* 0.00 0.02
## 396 -0.05 0.05 0.08 0.14 0.98_* 0.01 0.00
We can examine these points further by refitting the regression with the flagged observations removed and comparing the results to the regression on the full data set.
outlying.obs<-c(26,29,43,50,51,58,69,126,160,166,172,175,210,270,298,314,353,357,368,377,384,396)
Carseats.small<-Carseats[-outlying.obs,]
fit2<-lm(Sales~Price+US,data=Carseats.small)
summary(fit2)
##
## Call:
## lm(formula = Sales ~ Price + US, data = Carseats.small)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2772 -1.5953 -0.0449 1.5735 5.4274
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.902800 0.662642 19.472 < 2e-16 ***
## Price -0.053710 0.005473 -9.813 < 2e-16 ***
## USYes 1.246667 0.247884 5.029 7.65e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.288 on 375 degrees of freedom
## Multiple R-squared: 0.2397, Adjusted R-squared: 0.2356
## F-statistic: 59.11 on 2 and 375 DF, p-value: < 2.2e-16
We can see from the output above that removing these flagged observations does not meaningfully change the fit of the linear model relative to the full data set. By comparing the confidence intervals, we can also see that the intervals for the coefficient estimates from the reduced data set are contained within those from the full data set. It is safe to conclude that there are no outliers that should be excluded from our model.
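The confidence-interval comparison can be reproduced directly:
confint(fit)    # model fit to the full data set
confint(fit2)   # model fit with the flagged observations removed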
This problem involves simple linear regression without an intercept.
(a) Recall that the coefficient estimate \(\hat{\beta}\) for the linear regression of Y onto X without an intercept is given by (3.38). Under what circumstance is the coefficient estimate for the regression of X onto Y the same as the coefficient estimate for the regression of Y onto X? The coefficient estimate for the regression of \(Y\) onto \(X\) is \[\hat{\beta} = \frac{\sum_i x_iy_i}{\sum_j x_j^2};\] the coefficient estimate for the regression of \(X\) onto \(Y\) is \[\hat{\beta}' = \frac{\sum_i x_iy_i}{\sum_j y_j^2}.\] The coefficients are the same if \(\sum_j x_j^2 = \sum_j y_j^2\).
In other words, when the sum of squared observations of X equals the sum of squared observations of Y (for example, when X and Y are centered and have the same variance), the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.
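A quick numerical sketch of the formula, using simulated data and the illustrative names x0 and y0:
x0 <- rnorm(50)
y0 <- 2 * x0 + rnorm(50)
sum(x0 * y0) / sum(x0^2)        # beta-hat from (3.38) for regressing y0 onto x0
coefficients(lm(y0 ~ x0 + 0))   # matches the value from the formula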
(b) Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is different from the coefficient estimate for the regression of Y onto X.
x=rnorm(100)
y=0.5*x+rnorm(100)
coefficients(lm(x~y+0))
## y
## 0.3854639
coefficients(lm(y~x+0))
## x
## 0.5761462
(c) Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.
x=1:100
y=100:1
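# Added sketch: the condition from part (a) holds here because y is a permutation of x,
# so the two sums of squares are equal
sum(x^2) == sum(y^2)   # TRUE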
eg3<-lm(y~x+0)
summary(eg3)
##
## Call:
## lm(formula = y ~ x + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.75 -12.44 24.87 62.18 99.49
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## x 0.5075 0.0866 5.86 6.09e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.37 on 99 degrees of freedom
## Multiple R-squared: 0.2575, Adjusted R-squared: 0.25
## F-statistic: 34.34 on 1 and 99 DF, p-value: 6.094e-08
eg4<-lm(x~y+0)
summary(eg4)
##
## Call:
## lm(formula = x ~ y + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -49.75 -12.44 24.87 62.18 99.49
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y 0.5075 0.0866 5.86 6.09e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.37 on 99 degrees of freedom
## Multiple R-squared: 0.2575, Adjusted R-squared: 0.25
## F-statistic: 34.34 on 1 and 99 DF, p-value: 6.094e-08