library(ISLR)
data(Auto)
fit <- lm(mpg ~ horsepower, data = Auto)
summary(fit)
Call:
lm(formula = mpg ~ horsepower, data = Auto)
Residuals:
Min 1Q Median 3Q Max
-13.5710 -3.2592 -0.3435 2.7630 16.9240
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.935861 0.717499 55.66 <2e-16 ***
horsepower -0.157845 0.006446 -24.49 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.906 on 390 degrees of freedom
Multiple R-squared: 0.6059, Adjusted R-squared: 0.6049
F-statistic: 599.7 on 1 and 390 DF, p-value: < 2.2e-16
predict(fit, data.frame(horsepower = c(98)), interval = 'confidence')
fit lwr upr
1 24.46708 23.97308 24.96108
predict(fit, data.frame(horsepower = c(98)), interval = 'prediction')
fit lwr upr
1 24.46708 14.8094 34.12476
There is a relationship between mpg and horsepower in the Auto data frame, because the p-value of the t-statistic for horsepower is far below 0.05.
The \(R^2\) statistic suggests horsepower explains about 61% of the variance in mpg.
Negative, since the coefficient of horsepower is less than 0.
\(mpg = 39.936 - 0.1578 \times horsepower = 39.936 - 0.1578 \times 98 \approx 24.47\)
The associated 95% confidence interval is \([23.97, 24.96]\), the prediction interval is \([14.81, 34.12]\).
plot(Auto$horsepower, Auto$mpg)
abline(fit)
plot(predict(fit), residuals(fit))
plot(predict(fit), rstudent(fit))
par(mfrow=c(2,2))
plot(fit)
The pattern in the Residuals vs Fitted plot shows some non-linearity in the relationship between horsepower and mpg.
plot(Auto) # equivalent to `pairs(Auto)`
cor(within(Auto, rm(name)))
mpg cylinders displacement horsepower weight acceleration year origin
mpg 1.0000000 -0.7776175 -0.8051269 -0.7784268 -0.8322442 0.4233285 0.5805410 0.5652088
cylinders -0.7776175 1.0000000 0.9508233 0.8429834 0.8975273 -0.5046834 -0.3456474 -0.5689316
displacement -0.8051269 0.9508233 1.0000000 0.8972570 0.9329944 -0.5438005 -0.3698552 -0.6145351
horsepower -0.7784268 0.8429834 0.8972570 1.0000000 0.8645377 -0.6891955 -0.4163615 -0.4551715
weight -0.8322442 0.8975273 0.9329944 0.8645377 1.0000000 -0.4168392 -0.3091199 -0.5850054
acceleration 0.4233285 -0.5046834 -0.5438005 -0.6891955 -0.4168392 1.0000000 0.2903161 0.2127458
year 0.5805410 -0.3456474 -0.3698552 -0.4163615 -0.3091199 0.2903161 1.0000000 0.1815277
origin 0.5652088 -0.5689316 -0.6145351 -0.4551715 -0.5850054 0.2127458 0.1815277 1.0000000
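As an aside, the strongest pairwise correlations can be pulled out of this matrix programmatically instead of by eye (cm and top.cor are just helper names used in this sketch):
cm <- cor(within(Auto, rm(name)))
cm[upper.tri(cm, diag = TRUE)] <- NA            # keep each pair only once
top.cor <- na.omit(as.data.frame(as.table(cm))) # long format: Var1, Var2, Freq
head(top.cor[order(-abs(top.cor$Freq)), ], 5)   # five strongest correlations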
mfit <- lm(mpg ~ . -name, data = Auto)
summary(mfit)
Call:
lm(formula = mpg ~ . - name, data = Auto)
Residuals:
Min 1Q Median 3Q Max
-9.5903 -2.1565 -0.1169 1.8690 13.0604
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.218435 4.644294 -3.707 0.00024 ***
cylinders -0.493376 0.323282 -1.526 0.12780
displacement 0.019896 0.007515 2.647 0.00844 **
horsepower -0.016951 0.013787 -1.230 0.21963
weight -0.006474 0.000652 -9.929 < 2e-16 ***
acceleration 0.080576 0.098845 0.815 0.41548
year 0.750773 0.050973 14.729 < 2e-16 ***
origin 1.426141 0.278136 5.127 4.67e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.328 on 384 degrees of freedom
Multiple R-squared: 0.8215, Adjusted R-squared: 0.8182
F-statistic: 252.4 on 7 and 384 DF, p-value: < 2.2e-16
Yes, there is: the p-value of the F-statistic is far below 0.05, so at least one predictor is related to mpg.
displacement, weight, year and origin (their t-statistic p-values are all below 0.05).
Holding the other predictors fixed, mpg increases by about 0.75 per model year; that is, newer cars are more fuel-efficient.
par(mfrow=c(2,2))
plot(mfit)
par(mfrow=c(1,1))
plot(predict(mfit), rstudent(mfit))
abline(3, 0)
The residual plot suggests there are no unusually large outliers. The leverage plot suggests that observation 14 has unusually high leverage.
In the studentized-residual plot there are about 4 observations with absolute values above 3, but not far above it, so they are at most borderline outliers.
From the result of (c).ii, I choose the following 3 predictors:
imfit1 <- lm(mpg ~ weight * year * origin, data = Auto)
summary(imfit1)
Call:
lm(formula = mpg ~ weight * year * origin, data = Auto)
Residuals:
Min 1Q Median 3Q Max
-9.7880 -1.9187 -0.1022 1.4576 12.1862
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.170e+02 3.551e+01 -6.111 2.43e-09 ***
weight 7.198e-02 1.334e-02 5.398 1.18e-07 ***
year 3.331e+00 4.660e-01 7.147 4.50e-12 ***
origin 9.961e+01 2.508e+01 3.972 8.51e-05 ***
weight:year -1.005e-03 1.749e-04 -5.749 1.83e-08 ***
weight:origin -4.313e-02 1.080e-02 -3.995 7.75e-05 ***
year:origin -1.236e+00 3.254e-01 -3.798 0.000170 ***
weight:year:origin 5.402e-04 1.399e-04 3.861 0.000132 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.055 on 384 degrees of freedom
Multiple R-squared: 0.8495, Adjusted R-squared: 0.8468
F-statistic: 309.7 on 7 and 384 DF, p-value: < 2.2e-16
The result shows that all of the interaction terms have a statistically significant effect on mpg.
From the results of (b), the two largest correlations are cylinders vs. displacement (0.95) and weight vs. displacement (0.93), so we test their interaction effects:
imfit2 <- lm(mpg ~ cylinders * displacement + displacement * weight, data = Auto)
summary(imfit2)
Call:
lm(formula = mpg ~ cylinders * displacement + displacement *
weight, data = Auto)
Residuals:
Min 1Q Median 3Q Max
-13.2934 -2.5184 -0.3476 1.8399 17.7723
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.262e+01 2.237e+00 23.519 < 2e-16 ***
cylinders 7.606e-01 7.669e-01 0.992 0.322
displacement -7.351e-02 1.669e-02 -4.403 1.38e-05 ***
weight -9.888e-03 1.329e-03 -7.438 6.69e-13 ***
cylinders:displacement -2.986e-03 3.426e-03 -0.872 0.384
displacement:weight 2.128e-05 5.002e-06 4.254 2.64e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.103 on 386 degrees of freedom
Multiple R-squared: 0.7272, Adjusted R-squared: 0.7237
F-statistic: 205.8 on 5 and 386 DF, p-value: < 2.2e-16
So the interaction between displacement and weight is significant, while that between cylinders and displacement is not.
data("Carseats")
fit <- lm(Sales ~ Price + Urban + US, data = Carseats)
summary(fit)
Call:
lm(formula = Sales ~ Price + Urban + US, data = Carseats)
Residuals:
Min 1Q Median 3Q Max
-6.9206 -1.6220 -0.0564 1.5786 7.0581
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.043469 0.651012 20.036 < 2e-16 ***
Price -0.054459 0.005242 -10.389 < 2e-16 ***
UrbanYes -0.021916 0.271650 -0.081 0.936
USYes 1.200573 0.259042 4.635 4.86e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.472 on 396 degrees of freedom
Multiple R-squared: 0.2393, Adjusted R-squared: 0.2335
F-statistic: 41.52 on 3 and 396 DF, p-value: < 2.2e-16
Higher-priced car seats sell less than cheaper ones. Whether or not the store is in an urban area does not significantly affect sales. Stores in the US have higher sales than those outside the US.
\[sales = \begin{cases} 13.04 - 0.05 \times price & \text{if US = 0} \\ 13.04 - 0.05 \times price + 1.20 & \text{if US = 1} \end{cases} \]
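The piecewise form above follows from R's dummy coding of the qualitative predictors, which can be inspected directly (Urban is shown as well for completeness):
contrasts(Carseats$Urban)
contrasts(Carseats$US)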
Price and US.
sfit <- lm(Sales ~ Price + US, data = Carseats)
summary(sfit)
Call:
lm(formula = Sales ~ Price + US, data = Carseats)
Residuals:
Min 1Q Median 3Q Max
-6.9269 -1.6286 -0.0574 1.5766 7.0515
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.03079 0.63098 20.652 < 2e-16 ***
Price -0.05448 0.00523 -10.416 < 2e-16 ***
USYes 1.19964 0.25846 4.641 4.71e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.469 on 397 degrees of freedom
Multiple R-squared: 0.2393, Adjusted R-squared: 0.2354
F-statistic: 62.43 on 2 and 397 DF, p-value: < 2.2e-16
Model (e) fits a little better than model (a) in terms of adjusted \(R^2\) (0.2354 vs. 0.2335).
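Since model (e) is nested in model (a), the two can also be compared with an F-test for dropping Urban (a quick check, not required by the exercise; here fit is the model from (a) and sfit the model from (e)):
anova(sfit, fit)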
confint(sfit)
2.5 % 97.5 %
(Intercept) 11.79032020 14.27126531
Price -0.06475984 -0.04419543
USYes 0.69151957 1.70776632
Compared with 8(a).iv: confidence and prediction intervals for the response are computed with `predict(model, data.frame(predictor = c(v1, v2, ...)), interval = 'confidence')` (or `interval = 'prediction'`), while confidence intervals for the coefficients are computed with `confint(model)`.
plot(predict(sfit), rstudent(sfit))
par(mfrow=c(2,2))
plot(sfit)
So in model (e) there are no outliers: all studentized residuals lie within \(\pm 3\).
For leverage, the average leverage is \((p + 1) / n = 3 / 400 = 0.0075\); observations whose leverage statistic greatly exceeds this value would be suspected of high leverage, and a few points in the leverage plot do lie well above it.
Reference: the last paragraph on p. 98:
So if a given observation has a leverage statistic that greatly exceeds \((p+1)/n\), then we may suspect that the corresponding point has high leverage.
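This rule can be applied to model (e) directly by comparing the hat values with the average leverage (the 3x multiplier below is just an illustrative cutoff for "greatly exceeds"):
lev <- hatvalues(sfit)                          # leverage statistics
avg.lev <- length(coef(sfit)) / nrow(Carseats)  # (p + 1) / n = 3 / 400
which(lev > 3 * avg.lev)                        # observations well above the average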
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
fit <- lm(y ~ x + 0)
summary(fit)
Call:
lm(formula = y ~ x + 0)
Residuals:
Min 1Q Median 3Q Max
-1.9154 -0.6472 -0.1771 0.5056 2.3109
Coefficients:
Estimate Std. Error t value Pr(>|t|)
x 1.9939 0.1065 18.73 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9586 on 99 degrees of freedom
Multiple R-squared: 0.7798, Adjusted R-squared: 0.7776
F-statistic: 350.7 on 1 and 99 DF, p-value: < 2.2e-16
So the coefficient estimate is \(\hat\beta = 1.99\) with standard error 0.1065. The t-statistic is 18.73, and its p-value is below \(2 \times 10^{-16}\). The estimate, 1.99, is very close to the true value of 2.
fitr <- lm(x ~ y + 0)
summary(fitr)
Call:
lm(formula = x ~ y + 0)
Residuals:
Min 1Q Median 3Q Max
-0.8699 -0.2368 0.1030 0.2858 0.8938
Coefficients:
Estimate Std. Error t value Pr(>|t|)
y 0.39111 0.02089 18.73 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4246 on 99 degrees of freedom
Multiple R-squared: 0.7798, Adjusted R-squared: 0.7776
F-statistic: 350.7 on 1 and 99 DF, p-value: < 2.2e-16
The coefficient estimate is \(\hat\beta = 0.39\) with standard error 0.02. The t-statistic is 18.73 with a p-value below \(2 \times 10^{-16}\). The estimate, 0.39, is noticeably smaller than the theoretical value of \(\frac12\).
In theory the coefficients in (a) and (b) are reciprocals of each other, because \[ y = 2x + \epsilon \\ \Rightarrow x = \frac{y - \epsilon}2 \\ \Rightarrow x = \frac12 y - \frac{\epsilon}2 \\ \therefore \beta = \frac12 \] In practice the noise term keeps the two estimates (1.99 and 0.39) from being exact reciprocals.
## 11d
\[\begin{aligned} SE(\hat\beta) &= \sqrt{\frac{\sum_{i=1}^n(y_i - x_i \hat\beta)^2}{(n-1)\sum_{i=1}^n x_i^2}} \\ &= \sqrt{\frac{\sum_{i=1}^n y_i^2 - 2\hat\beta\sum_{i=1}^nx_iy_i + \hat\beta^2\sum_{i=1}^nx_i^2}{(n-1)\sum_{i=1}^n x_i^2}} \\ \end{aligned} \] Take this into (3.14), we have: \[ \begin{aligned} t &= \frac{\hat\beta}{SE(\hat\beta)} \\ &= \frac{\hat\beta\sqrt{(n-1)\sum_{i=1}^n x_i^2}}{\sqrt{\sum_{i=1}^n y_i^2 - 2\hat\beta\sum_{i=1}^nx_iy_i + \hat\beta^2\sum_{i=1}^nx_i^2}} \\ &= \sqrt{\frac{(n-1)\sum_{i=1}^nx_i^2}{\frac{\sum_{i=1}^n y_i^2 - 2\hat\beta\sum_{i=1}^nx_iy_i + \hat\beta^2\sum_{i=1}^nx_i^2}{\hat\beta^2}}} \\ &= \sqrt{\frac{(n-1)\sum_{i=1}^nx_i^2}{\frac{\sum_{i=1}^n y_i^2}{\hat\beta^2} - 2\frac1{\hat\beta}\sum_{i=1}^nx_iy_i + \sum_{i=1}^nx_i^2}} \\ &= \sqrt{\frac{n-1}{\frac{\sum_{i=1}^ny_i^2}{\hat\beta^2 \sum_{i=1}^n x_i^2} -2\frac{\sum_{i=1}^n x_i y_i}{\hat\beta\sum_{i=1}^n x_i^2} + 1}} \end{aligned} \]
Substituting (3.38) into the above equation, we have: \[\begin{aligned} t &= \sqrt{\frac{n-1}{\frac{\sum_{i=1}^nx_i^2 \sum_{i=1}^ny_i^2}{(\sum_{i=1}^n x_i y_i)^2} - 2 + 1}} \\ &= \sqrt { \frac{ (n-1)(\sum_{i=1}^n x_i y_i)^2} { \sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i^2 - (\sum_{i=1}^n x_i y_i)^2} } \\ &= \frac { \sqrt{n-1} \sum_{i=1}^n x_i y_i } { \sqrt{ \sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i^2 - (\sum_{i=1}^n x_i y_i)^2 } } \end{aligned}\]
Proof complete.
\(t\) calculated in R:
sqrt(length(x) -1) * sum(x*y) / sqrt(sum(x^2)*sum(y^2) - (sum(x * y))^2)
[1] 18.72593
18.72593, which matches the t-statistic of 18.73 reported in 11a.
From the \(t\) expression above, it is easy to see that \(x\) and \(y\) enter symmetrically, so the result stays the same when we swap \(x\) and \(y\). Hence the t-statistic for the regression of \(y\) onto \(x\) is the same as the t-statistic for the regression of \(x\) onto \(y\).
summary(lm(y ~ x))
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.8768 -0.6138 -0.1395 0.5394 2.3462
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03769 0.09699 -0.389 0.698
x 1.99894 0.10773 18.556 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9628 on 98 degrees of freedom
Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
summary(lm(x ~ y))
Call:
lm(formula = x ~ y)
Residuals:
Min 1Q Median 3Q Max
-0.90848 -0.28101 0.06274 0.24570 0.85736
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03880 0.04266 0.91 0.365
y 0.38942 0.02099 18.56 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4249 on 98 degrees of freedom
Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
So the t-statistics are 18.556 and 18.56, which are the same up to rounding.
Based on (3.38), to keep the coefficient the same, we need: \[ \frac{\sum_{i=1}^n x_i y_i}{\sum_{j=1}^n x_j^2} = \frac{\sum_{i=1}^n x_i y_i}{\sum_{j=1}^n y_j^2} \\ \therefore \sum_{i=1}^n x_i^2 = \sum_{i=1}^n y_i^2 \] So when the sum of squares of the observed \(y\) equals the sum of squares of the observed \(x\), \(\hat\beta\) is the same in both regressions.
The example in 11a already shows this: the two estimates (1.9939 and 0.39111) differ.
set.seed(1)
x <- rnorm(100)
y <- sample(x, 100)
print(sum(x ^ 2) == sum(y ^ 2))
[1] TRUE
summary(lm(y ~ x + 0))
Call:
lm(formula = y ~ x + 0)
Residuals:
Min 1Q Median 3Q Max
-2.2315 -0.5124 0.1027 0.6877 2.3926
Coefficients:
Estimate Std. Error t value Pr(>|t|)
x 0.02148 0.10048 0.214 0.831
Residual standard error: 0.9046 on 99 degrees of freedom
Multiple R-squared: 0.0004614, Adjusted R-squared: -0.009635
F-statistic: 0.0457 on 1 and 99 DF, p-value: 0.8312
summary(lm(x ~ y + 0))
Call:
lm(formula = x ~ y + 0)
Residuals:
Min 1Q Median 3Q Max
-2.2400 -0.5154 0.1213 0.6788 2.3959
Coefficients:
Estimate Std. Error t value Pr(>|t|)
y 0.02148 0.10048 0.214 0.831
Residual standard error: 0.9046 on 99 degrees of freedom
Multiple R-squared: 0.0004614, Adjusted R-squared: -0.009635
F-statistic: 0.0457 on 1 and 99 DF, p-value: 0.8312
set.seed(1)
x <- rnorm(100)
eps <- rnorm(100, sd = 0.25)
y <- -1 + 0.5 * x + eps
length(y)
[1] 100
The length of the vector \(y\) is 100. \(\beta_0 = -1\), \(\beta_1 = 0.5\).
plot(x, y)
By eye, \(y \approx -1\) when \(x = 0\) and \(y \approx 0\) when \(x = 2\), which matches the line \(y = -1 + 0.5x\); the points scatter around this line because of the noise term.
f13e <- lm(y ~ x)
summary(f13e)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-0.46921 -0.15344 -0.03487 0.13485 0.58654
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.00942 0.02425 -41.63 <2e-16 ***
x 0.49973 0.02693 18.56 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2407 on 98 degrees of freedom
Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
So \(\beta_0 = -1\) vs. \(\hat\beta_0 = -1.009\), and \(\beta_1 = 0.5\) vs. \(\hat\beta_1 = 0.4997\); the estimates are very close to the true values.
plot(x, y)
abline(coef(f13e), col = 'red', lty = 4)
abline(-1, 0.5, col = 'blue', lty = 2)
legend('bottomright', c('population regression', 'least square'), lty = c(4,2), col = c('red', 'blue'), bty = 'n')
See figure 3.3 for reference.
f13g <- lm(y ~ poly(x, 2))
summary(f13g)
Call:
lm(formula = y ~ poly(x, 2))
Residuals:
Min 1Q Median 3Q Max
-0.4913 -0.1563 -0.0322 0.1451 0.5675
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.95501 0.02395 -39.874 <2e-16 ***
poly(x, 2)1 4.46612 0.23951 18.647 <2e-16 ***
poly(x, 2)2 -0.33602 0.23951 -1.403 0.164
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2395 on 97 degrees of freedom
Multiple R-squared: 0.7828, Adjusted R-squared: 0.7784
F-statistic: 174.8 on 2 and 97 DF, p-value: < 2.2e-16
The adjusted \(R^2\) increases only from 0.7762 to 0.7784, and the quadratic term is not significant (p = 0.164), so adding it barely improves the model fit.
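The same conclusion follows from an explicit F-test comparing the linear fit from (e) with the quadratic fit:
anova(f13e, f13g)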
set.seed(1)
x <- rnorm(100)
eps <- rnorm(100, sd = 0.1)
y <- -1 + 0.5 * x + eps
f13h <- lm(y ~ x)
summary(f13h)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-0.18768 -0.06138 -0.01395 0.05394 0.23462
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.003769 0.009699 -103.5 <2e-16 ***
x 0.499894 0.010773 46.4 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.09628 on 98 degrees of freedom
Multiple R-squared: 0.9565, Adjusted R-squared: 0.956
F-statistic: 2153 on 1 and 98 DF, p-value: < 2.2e-16
plot(x, y)
abline(f13h, col = 'red', lty = 4)
abline(-1, 0.5, col = 'blue', lty = 2)
legend('bottomright', c('population regression', 'least square'), lty = c(4,2), col = c('red', 'blue'), bty = 'n')
The observations are closer to the population regression line. The \(R^2\) statistic increases from 0.78 to 0.96.
set.seed(1)
x <- rnorm(100)
eps <- rnorm(100, sd = 0.5)
y <- -1 + 0.5 * x + eps
f13i <- lm(y ~ x)
summary(f13i)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-0.93842 -0.30688 -0.06975 0.26970 1.17309
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.01885 0.04849 -21.010 < 2e-16 ***
x 0.49947 0.05386 9.273 4.58e-15 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4814 on 98 degrees of freedom
Multiple R-squared: 0.4674, Adjusted R-squared: 0.4619
F-statistic: 85.99 on 1 and 98 DF, p-value: 4.583e-15
plot(x, y)
abline(f13i, col = 'red', lty = 4)
abline(-1, 0.5, col = 'blue', lty = 2)
legend('bottomright', c('population regression', 'least square'), lty = c(4,2), col = c('red', 'blue'), bty = 'n')
The observations are more dispersed around the population regression line. The \(R^2\) statistic decreases from 0.78 to 0.46.
confint(f13e)
2.5 % 97.5 %
(Intercept) -1.0575402 -0.9613061
x 0.4462897 0.5531801
confint(f13h)
2.5 % 97.5 %
(Intercept) -1.0230161 -0.9845224
x 0.4785159 0.5212720
confint(f13i)
2.5 % 97.5 %
(Intercept) -1.1150804 -0.9226122
x 0.3925794 0.6063602
We can see that as the noise in the observations increases, the confidence intervals become much wider, while the point estimates stay essentially the same.
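The widening is easy to quantify by comparing the width of the slope's 95% interval in each of the three fits (a small check of the claim above; the list names are just labels):
sapply(list(less.noise = f13h, original = f13e, more.noise = f13i),
       function(m) diff(confint(m)['x', ]))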
Perform the following commands in R:
set.seed(1)
x1 <- runif(100)
x2 <- 0.5 * x1 + rnorm(100) / 10
y <- 2 + 2 * x1 + 0.3 * x2 + rnorm(100)
The last line corresponds to creating a linear model in which \(y\) is a function of \(x1\) and \(x2\). Write out the form of the linear model. What are the regression coefficients?
The linear model is: \[ y = 2 + 2x_1 + 0.3x_2 + \epsilon \]
The coefficients are: \[ \beta_0 = 2 \\ \beta_1 = 2 \\ \beta_2 = 0.3 \]
What is the correlation between x1 and x2? Create a scatterplot displaying the relationship between the variables.
cor(x1, x2)
[1] 0.8351212
plot(x1, x2)
Using this data, fit a least squares regression to predict \(y\) using \(x1\) and \(x2\). Describe the results obtained. What are \(\hat \beta_0\), \(\hat \beta_1\), and \(\hat \beta_2\)? How do these relate to the true \(\beta_0\), \(\beta_1\), and \(\beta_2\)? Can you reject the null hypothesis \(H_0 : \beta_1 = 0\)? How about the null hypothesis \(H_0 : \beta_2 = 0\)?
f14c <- lm(y ~ x1 + x2)
summary(f14c)
Call:
lm(formula = y ~ x1 + x2)
Residuals:
Min 1Q Median 3Q Max
-2.8311 -0.7273 -0.0537 0.6338 2.3359
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.1305 0.2319 9.188 7.61e-15 ***
x1 1.4396 0.7212 1.996 0.0487 *
x2 1.0097 1.1337 0.891 0.3754
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.056 on 97 degrees of freedom
Multiple R-squared: 0.2088, Adjusted R-squared: 0.1925
F-statistic: 12.8 on 2 and 97 DF, p-value: 1.164e-05
\(\hat\beta_0 = 2.13\), \(\hat\beta_1 = 1.44\), \(\hat\beta_2 = 1.01\). \(\hat\beta_1\) and \(\hat\beta_2\) are both far from the true coefficients (2 and 0.3, respectively).
From the p-values of the t-statistics (0.0487 and 0.3754): the evidence against \(H_0 : \beta_1 = 0\) is only marginal, since 0.0487 is barely below 0.05, and we cannot reject \(H_0 : \beta_2 = 0\).
Now fit a least squares regression to predict \(y\) using only \(x1\). Comment on your results. Can you reject the null hypothesis \(H_0 : \beta_1 = 0\)?
f14d <- lm(y ~ x1)
summary(f14d)
Call:
lm(formula = y ~ x1)
Residuals:
Min 1Q Median 3Q Max
-2.89495 -0.66874 -0.07785 0.59221 2.45560
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.1124 0.2307 9.155 8.27e-15 ***
x1 1.9759 0.3963 4.986 2.66e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.055 on 98 degrees of freedom
Multiple R-squared: 0.2024, Adjusted R-squared: 0.1942
F-statistic: 24.86 on 1 and 98 DF, p-value: 2.661e-06
According to the p-value of the t-statistic (2.66e-06), we can reject \(H_0\).
Now fit a least squares regression to predict \(y\) using only \(x2\). Comment on your results. Can you reject the null hypothesis \(H_0 : \beta_1 = 0\)?
f14e <- lm(y ~ x2)
summary(f14e)
Call:
lm(formula = y ~ x2)
Residuals:
Min 1Q Median 3Q Max
-2.62687 -0.75156 -0.03598 0.72383 2.44890
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.3899 0.1949 12.26 < 2e-16 ***
x2 2.8996 0.6330 4.58 1.37e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.072 on 98 degrees of freedom
Multiple R-squared: 0.1763, Adjusted R-squared: 0.1679
F-statistic: 20.98 on 1 and 98 DF, p-value: 1.366e-05
According to the p-value of the t-statistic (1.37e-05), we can reject \(H_0\).
Do the results obtained in (c)–(e) contradict each other? Explain your answer.
No, the results do not contradict each other. Because \(x1\) and \(x2\) are highly correlated (collinear), the joint model cannot separate their individual effects: the standard errors of both coefficients are inflated, so neither appears significant. Regressed on its own, each predictor picks up the shared signal and appears strongly significant.
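The collinearity can be quantified with a hand-computed variance inflation factor (no extra packages needed; the vif() function in the car package computes the same quantity):
1 / (1 - summary(lm(x1 ~ x2))$r.squared)   # VIF for x1 (and, with two predictors, also for x2)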
Now suppose we obtain one additional observation, which was unfortunately mismeasured.
x1n <- c(x1, 0.1)
x2n <- c(x2, 0.8)
yn <- c(y, 6)
Re-fit the linear models from (c) to (e) using this new data. What effect does this new observation have on the each of the models? In each model, is this observation an outlier? A high-leverage point? Both? Explain your answers.
f14g <- lm(yn ~ x1n + x2n)
summary(f14g)
Call:
lm(formula = yn ~ x1n + x2n)
Residuals:
Min 1Q Median 3Q Max
-2.73348 -0.69318 -0.05263 0.66385 2.30619
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.2267 0.2314 9.624 7.91e-16 ***
x1n 0.5394 0.5922 0.911 0.36458
x2n 2.5146 0.8977 2.801 0.00614 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.075 on 98 degrees of freedom
Multiple R-squared: 0.2188, Adjusted R-squared: 0.2029
F-statistic: 13.72 on 2 and 98 DF, p-value: 5.564e-06
par(mfrow=c(2,2))
plot(f14g)
par(mfrow=c(1,1))
plot(predict(f14g), rstudent(f14g))
This observation severely disturbs the estimated relationship between \(y\) and \(x1\), \(x2\): the coefficients move even further from the true values. From these plots we can see that it is both an outlier and a high-leverage point.
f14g2 <- lm(yn ~ x1n)
summary(f14g2)
Call:
lm(formula = yn ~ x1n)
Residuals:
Min 1Q Median 3Q Max
-2.8897 -0.6556 -0.0909 0.5682 3.5665
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.2569 0.2390 9.445 1.78e-15 ***
x1n 1.7657 0.4124 4.282 4.29e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.111 on 99 degrees of freedom
Multiple R-squared: 0.1562, Adjusted R-squared: 0.1477
F-statistic: 18.33 on 1 and 99 DF, p-value: 4.295e-05
par(mfrow=c(2,2))
plot(f14g2)
par(mfrow=c(1,1))
plot(predict(f14g2), rstudent(f14g2))
points(predict(f14g2)[101], rstudent(f14g2)[101], col = 'red', cex = 2, pch = 3)
Adding the 101st observation weakens the fit on \(x1n\) (\(R^2\) drops from 0.2024 to 0.1562), although \(x1n\) remains statistically significant. From these plots we can see that this observation is both an outlier and a high-leverage point.
f14g3 <- lm(yn ~ x2n)
summary(f14g3)
Call:
lm(formula = yn ~ x2n)
Residuals:
Min 1Q Median 3Q Max
-2.64729 -0.71021 -0.06899 0.72699 2.38074
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.3451 0.1912 12.264 < 2e-16 ***
x2n 3.1190 0.6040 5.164 1.25e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.074 on 99 degrees of freedom
Multiple R-squared: 0.2122, Adjusted R-squared: 0.2042
F-statistic: 26.66 on 1 and 99 DF, p-value: 1.253e-06
par(mfrow=c(2,2))
plot(f14g3)
par(mfrow=c(1,1))
plot(predict(f14g3), rstudent(f14g3))
\(x2n\) still has a statistically significant relationship with \(y\), though the overall \(R^2\) remains low. From these plots we can see that this observation is both an outlier and a high-leverage point.
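The leverage of the added observation in each refit can be compared with the average leverage \((p+1)/n\) directly (a small check; the list names are just labels):
sapply(list(both = f14g, x1.only = f14g2, x2.only = f14g3),
       function(m) hatvalues(m)[101] / mean(hatvalues(m)))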
library(MASS)
names(Boston)
[1] "crim" "zn" "indus" "chas" "nox" "rm" "age" "dis" "rad" "tax" "ptratio"
[12] "black" "lstat" "medv"
f15a1 <- lm(crim ~ zn, data = Boston)
f15a2 <- lm(crim ~ indus, data = Boston)
f15a3 <- lm(crim ~ chas, data = Boston)
f15a4 <- lm(crim ~ nox, data = Boston)
f15a5 <- lm(crim ~ rm, data = Boston)
f15a6 <- lm(crim ~ age, data = Boston)
f15a7 <- lm(crim ~ dis, data = Boston)
f15a8 <- lm(crim ~ rad, data = Boston)
f15a9 <- lm(crim ~ tax, data = Boston)
f15a10 <- lm(crim ~ ptratio, data = Boston)
f15a11 <- lm(crim ~ black, data = Boston)
f15a12 <- lm(crim ~ lstat, data = Boston)
f15a13 <- lm(crim ~ medv, data = Boston)
summary(f15a1)
Call:
lm(formula = crim ~ zn, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-4.429 -4.222 -2.620 1.250 84.523
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.45369 0.41722 10.675 < 2e-16 ***
zn -0.07393 0.01609 -4.594 5.51e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.435 on 504 degrees of freedom
Multiple R-squared: 0.04019, Adjusted R-squared: 0.03828
F-statistic: 21.1 on 1 and 504 DF, p-value: 5.506e-06
summary(f15a2)
Call:
lm(formula = crim ~ indus, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-11.972 -2.698 -0.736 0.712 81.813
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.06374 0.66723 -3.093 0.00209 **
indus 0.50978 0.05102 9.991 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.866 on 504 degrees of freedom
Multiple R-squared: 0.1653, Adjusted R-squared: 0.1637
F-statistic: 99.82 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a3)
Call:
lm(formula = crim ~ chas, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-3.738 -3.661 -3.435 0.018 85.232
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.7444 0.3961 9.453 <2e-16 ***
chas -1.8928 1.5061 -1.257 0.209
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.597 on 504 degrees of freedom
Multiple R-squared: 0.003124, Adjusted R-squared: 0.001146
F-statistic: 1.579 on 1 and 504 DF, p-value: 0.2094
summary(f15a4)
Call:
lm(formula = crim ~ nox, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-12.371 -2.738 -0.974 0.559 81.728
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -13.720 1.699 -8.073 5.08e-15 ***
nox 31.249 2.999 10.419 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.81 on 504 degrees of freedom
Multiple R-squared: 0.1772, Adjusted R-squared: 0.1756
F-statistic: 108.6 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a5)
Call:
lm(formula = crim ~ rm, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-6.604 -3.952 -2.654 0.989 87.197
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 20.482 3.365 6.088 2.27e-09 ***
rm -2.684 0.532 -5.045 6.35e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.401 on 504 degrees of freedom
Multiple R-squared: 0.04807, Adjusted R-squared: 0.04618
F-statistic: 25.45 on 1 and 504 DF, p-value: 6.347e-07
summary(f15a6)
Call:
lm(formula = crim ~ age, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-6.789 -4.257 -1.230 1.527 82.849
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.77791 0.94398 -4.002 7.22e-05 ***
age 0.10779 0.01274 8.463 2.85e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.057 on 504 degrees of freedom
Multiple R-squared: 0.1244, Adjusted R-squared: 0.1227
F-statistic: 71.62 on 1 and 504 DF, p-value: 2.855e-16
summary(f15a7)
Call:
lm(formula = crim ~ dis, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-6.708 -4.134 -1.527 1.516 81.674
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.4993 0.7304 13.006 <2e-16 ***
dis -1.5509 0.1683 -9.213 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.965 on 504 degrees of freedom
Multiple R-squared: 0.1441, Adjusted R-squared: 0.1425
F-statistic: 84.89 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a8)
Call:
lm(formula = crim ~ rad, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-10.164 -1.381 -0.141 0.660 76.433
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.28716 0.44348 -5.157 3.61e-07 ***
rad 0.61791 0.03433 17.998 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.718 on 504 degrees of freedom
Multiple R-squared: 0.3913, Adjusted R-squared: 0.39
F-statistic: 323.9 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a9)
Call:
lm(formula = crim ~ tax, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-12.513 -2.738 -0.194 1.065 77.696
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.528369 0.815809 -10.45 <2e-16 ***
tax 0.029742 0.001847 16.10 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.997 on 504 degrees of freedom
Multiple R-squared: 0.3396, Adjusted R-squared: 0.3383
F-statistic: 259.2 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a10)
Call:
lm(formula = crim ~ ptratio, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-7.654 -3.985 -1.912 1.825 83.353
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.6469 3.1473 -5.607 3.40e-08 ***
ptratio 1.1520 0.1694 6.801 2.94e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.24 on 504 degrees of freedom
Multiple R-squared: 0.08407, Adjusted R-squared: 0.08225
F-statistic: 46.26 on 1 and 504 DF, p-value: 2.943e-11
summary(f15a11)
Call:
lm(formula = crim ~ black, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-13.756 -2.299 -2.095 -1.296 86.822
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.553529 1.425903 11.609 <2e-16 ***
black -0.036280 0.003873 -9.367 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.946 on 504 degrees of freedom
Multiple R-squared: 0.1483, Adjusted R-squared: 0.1466
F-statistic: 87.74 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a12)
Call:
lm(formula = crim ~ lstat, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-13.925 -2.822 -0.664 1.079 82.862
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.33054 0.69376 -4.801 2.09e-06 ***
lstat 0.54880 0.04776 11.491 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.664 on 504 degrees of freedom
Multiple R-squared: 0.2076, Adjusted R-squared: 0.206
F-statistic: 132 on 1 and 504 DF, p-value: < 2.2e-16
summary(f15a13)
Call:
lm(formula = crim ~ medv, data = Boston)
Residuals:
Min 1Q Median 3Q Max
-9.071 -4.022 -2.343 1.298 80.957
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.79654 0.93419 12.63 <2e-16 ***
medv -0.36316 0.03839 -9.46 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.934 on 504 degrees of freedom
Multiple R-squared: 0.1508, Adjusted R-squared: 0.1491
F-statistic: 89.49 on 1 and 504 DF, p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(f15a1)
In the simple regressions, every predictor except chas has a statistically significant association with crim.
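The thirteen slope p-values can also be collected in one pass instead of reading thirteen summaries (preds and pvals are just helper names):
preds <- setdiff(names(Boston), 'crim')
pvals <- sapply(preds, function(p)
  summary(lm(reformulate(p, response = 'crim'), data = Boston))$coefficients[2, 4])
sort(pvals)   # every p-value except the one for chas is far below 0.05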
f15b <- lm(crim ~ ., data = Boston)
summary(f15b)
Call:
lm(formula = crim ~ ., data = Boston)
Residuals:
Min 1Q Median 3Q Max
-9.924 -2.120 -0.353 1.019 75.051
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.033228 7.234903 2.354 0.018949 *
zn 0.044855 0.018734 2.394 0.017025 *
indus -0.063855 0.083407 -0.766 0.444294
chas -0.749134 1.180147 -0.635 0.525867
nox -10.313535 5.275536 -1.955 0.051152 .
rm 0.430131 0.612830 0.702 0.483089
age 0.001452 0.017925 0.081 0.935488
dis -0.987176 0.281817 -3.503 0.000502 ***
rad 0.588209 0.088049 6.680 6.46e-11 ***
tax -0.003780 0.005156 -0.733 0.463793
ptratio -0.271081 0.186450 -1.454 0.146611
black -0.007538 0.003673 -2.052 0.040702 *
lstat 0.126211 0.075725 1.667 0.096208 .
medv -0.198887 0.060516 -3.287 0.001087 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.439 on 492 degrees of freedom
Multiple R-squared: 0.454, Adjusted R-squared: 0.4396
F-statistic: 31.47 on 13 and 492 DF, p-value: < 2.2e-16
For the predictors zn, dis, rad, black and medv we can reject the null hypothesis \(H_0 : \beta_j = 0\) (their p-values are below 0.05).
sim.coef <- c(coef(f15a1)[2], coef(f15a2)[2], coef(f15a3)[2], coef(f15a4)[2], coef(f15a5)[2], coef(f15a6)[2], coef(f15a7)[2], coef(f15a8)[2], coef(f15a9)[2], coef(f15a10)[2], coef(f15a11)[2], coef(f15a12)[2], coef(f15a13)[2])
mul.coef <- coef(f15b)[-1]
plot(sim.coef, mul.coef)
The predictor nox has a positive coefficient on crim in the simple linear regression and a negative one in the multiple linear regression. The former says that, averaging over all the other variables, crim increases as nox increases. The latter says that, with the other predictors held fixed, crim decreases as nox increases. Section 4.3.4 on page 135 provides a detailed explanation of a similar problem:
> In general, the phenomenon seen in Figure 4.3 is known as confounding.
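To see which predictor is which in the scatterplot above, the points can be labelled (this merely decorates the plot from (c)):
plot(sim.coef, mul.coef)
text(sim.coef, mul.coef, labels = names(mul.coef), pos = 4, cex = 0.7)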
f15d1 <- lm(crim ~ poly(zn, 3), data = Boston)
f15d2 <- lm(crim ~ poly(indus, 3), data = Boston)
# f15d3 <- lm(crim ~ poly(chas, 3), data = Boston)  # chas is a 0/1 dummy, so a polynomial fit is not applicable
f15d4 <- lm(crim ~ poly(nox, 3), data = Boston)
f15d5 <- lm(crim ~ poly(rm, 3), data = Boston)
f15d6 <- lm(crim ~ poly(age, 3), data = Boston)
f15d7 <- lm(crim ~ poly(dis, 3), data = Boston)
f15d8 <- lm(crim ~ poly(rad, 3), data = Boston)
f15d9 <- lm(crim ~ poly(tax, 3), data = Boston)
f15d10 <- lm(crim ~ poly(ptratio, 3), data = Boston)
f15d11 <- lm(crim ~ poly(black, 3), data = Boston)
f15d12 <- lm(crim ~ poly(lstat, 3), data = Boston)
f15d13 <- lm(crim ~ poly(medv, 3), data = Boston)
summary(f15d1)
Call:
lm(formula = crim ~ poly(zn, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-4.821 -4.614 -1.294 0.473 84.130
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3722 9.709 < 2e-16 ***
poly(zn, 3)1 -38.7498 8.3722 -4.628 4.7e-06 ***
poly(zn, 3)2 23.9398 8.3722 2.859 0.00442 **
poly(zn, 3)3 -10.0719 8.3722 -1.203 0.22954
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.372 on 502 degrees of freedom
Multiple R-squared: 0.05824, Adjusted R-squared: 0.05261
F-statistic: 10.35 on 3 and 502 DF, p-value: 1.281e-06
summary(f15d2)
Call:
lm(formula = crim ~ poly(indus, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-8.278 -2.514 0.054 0.764 79.713
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.614 0.330 10.950 < 2e-16 ***
poly(indus, 3)1 78.591 7.423 10.587 < 2e-16 ***
poly(indus, 3)2 -24.395 7.423 -3.286 0.00109 **
poly(indus, 3)3 -54.130 7.423 -7.292 1.2e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.423 on 502 degrees of freedom
Multiple R-squared: 0.2597, Adjusted R-squared: 0.2552
F-statistic: 58.69 on 3 and 502 DF, p-value: < 2.2e-16
# summary(f15d3)
summary(f15d4)
Call:
lm(formula = crim ~ poly(nox, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-9.110 -2.068 -0.255 0.739 78.302
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3216 11.237 < 2e-16 ***
poly(nox, 3)1 81.3720 7.2336 11.249 < 2e-16 ***
poly(nox, 3)2 -28.8286 7.2336 -3.985 7.74e-05 ***
poly(nox, 3)3 -60.3619 7.2336 -8.345 6.96e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.234 on 502 degrees of freedom
Multiple R-squared: 0.297, Adjusted R-squared: 0.2928
F-statistic: 70.69 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d5)
Call:
lm(formula = crim ~ poly(rm, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-18.485 -3.468 -2.221 -0.015 87.219
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3703 9.758 < 2e-16 ***
poly(rm, 3)1 -42.3794 8.3297 -5.088 5.13e-07 ***
poly(rm, 3)2 26.5768 8.3297 3.191 0.00151 **
poly(rm, 3)3 -5.5103 8.3297 -0.662 0.50858
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.33 on 502 degrees of freedom
Multiple R-squared: 0.06779, Adjusted R-squared: 0.06222
F-statistic: 12.17 on 3 and 502 DF, p-value: 1.067e-07
summary(f15d6)
Call:
lm(formula = crim ~ poly(age, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-9.762 -2.673 -0.516 0.019 82.842
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3485 10.368 < 2e-16 ***
poly(age, 3)1 68.1820 7.8397 8.697 < 2e-16 ***
poly(age, 3)2 37.4845 7.8397 4.781 2.29e-06 ***
poly(age, 3)3 21.3532 7.8397 2.724 0.00668 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.84 on 502 degrees of freedom
Multiple R-squared: 0.1742, Adjusted R-squared: 0.1693
F-statistic: 35.31 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d7)
Call:
lm(formula = crim ~ poly(dis, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-10.757 -2.588 0.031 1.267 76.378
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3259 11.087 < 2e-16 ***
poly(dis, 3)1 -73.3886 7.3315 -10.010 < 2e-16 ***
poly(dis, 3)2 56.3730 7.3315 7.689 7.87e-14 ***
poly(dis, 3)3 -42.6219 7.3315 -5.814 1.09e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.331 on 502 degrees of freedom
Multiple R-squared: 0.2778, Adjusted R-squared: 0.2735
F-statistic: 64.37 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d8)
Call:
lm(formula = crim ~ poly(rad, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-10.381 -0.412 -0.269 0.179 76.217
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.2971 12.164 < 2e-16 ***
poly(rad, 3)1 120.9074 6.6824 18.093 < 2e-16 ***
poly(rad, 3)2 17.4923 6.6824 2.618 0.00912 **
poly(rad, 3)3 4.6985 6.6824 0.703 0.48231
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.682 on 502 degrees of freedom
Multiple R-squared: 0.4, Adjusted R-squared: 0.3965
F-statistic: 111.6 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d9)
Call:
lm(formula = crim ~ poly(tax, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-13.273 -1.389 0.046 0.536 76.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3047 11.860 < 2e-16 ***
poly(tax, 3)1 112.6458 6.8537 16.436 < 2e-16 ***
poly(tax, 3)2 32.0873 6.8537 4.682 3.67e-06 ***
poly(tax, 3)3 -7.9968 6.8537 -1.167 0.244
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.854 on 502 degrees of freedom
Multiple R-squared: 0.3689, Adjusted R-squared: 0.3651
F-statistic: 97.8 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d10)
Call:
lm(formula = crim ~ poly(ptratio, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-6.833 -4.146 -1.655 1.408 82.697
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.614 0.361 10.008 < 2e-16 ***
poly(ptratio, 3)1 56.045 8.122 6.901 1.57e-11 ***
poly(ptratio, 3)2 24.775 8.122 3.050 0.00241 **
poly(ptratio, 3)3 -22.280 8.122 -2.743 0.00630 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.122 on 502 degrees of freedom
Multiple R-squared: 0.1138, Adjusted R-squared: 0.1085
F-statistic: 21.48 on 3 and 502 DF, p-value: 4.171e-13
summary(f15d11)
Call:
lm(formula = crim ~ poly(black, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-13.096 -2.343 -2.128 -1.439 86.790
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3536 10.218 <2e-16 ***
poly(black, 3)1 -74.4312 7.9546 -9.357 <2e-16 ***
poly(black, 3)2 5.9264 7.9546 0.745 0.457
poly(black, 3)3 -4.8346 7.9546 -0.608 0.544
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.955 on 502 degrees of freedom
Multiple R-squared: 0.1498, Adjusted R-squared: 0.1448
F-statistic: 29.49 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d12)
Call:
lm(formula = crim ~ poly(lstat, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-15.234 -2.151 -0.486 0.066 83.353
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.6135 0.3392 10.654 <2e-16 ***
poly(lstat, 3)1 88.0697 7.6294 11.543 <2e-16 ***
poly(lstat, 3)2 15.8882 7.6294 2.082 0.0378 *
poly(lstat, 3)3 -11.5740 7.6294 -1.517 0.1299
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.629 on 502 degrees of freedom
Multiple R-squared: 0.2179, Adjusted R-squared: 0.2133
F-statistic: 46.63 on 3 and 502 DF, p-value: < 2.2e-16
summary(f15d13)
Call:
lm(formula = crim ~ poly(medv, 3), data = Boston)
Residuals:
Min 1Q Median 3Q Max
-24.427 -1.976 -0.437 0.439 73.655
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.614 0.292 12.374 < 2e-16 ***
poly(medv, 3)1 -75.058 6.569 -11.426 < 2e-16 ***
poly(medv, 3)2 88.086 6.569 13.409 < 2e-16 ***
poly(medv, 3)3 -48.033 6.569 -7.312 1.05e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.569 on 502 degrees of freedom
Multiple R-squared: 0.4202, Adjusted R-squared: 0.4167
F-statistic: 121.3 on 3 and 502 DF, p-value: < 2.2e-16
So the black predictor shows no evidence of a non-linear relationship with crim (its quadratic and cubic terms are not significant). lstat shows only weak evidence of non-linearity (the quadratic term is marginally significant, the cubic is not). chas is a binary dummy variable with no linear relationship to crim (see 15a), so a polynomial fit is not meaningful. The other predictors show statistically significant non-linear relationships with crim, of varying strength.
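The visual impression of non-linearity can also be backed up with an F-test comparing nested fits, for example the linear and cubic models for medv:
anova(f15a13, f15d13)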