Conceptual

Question 1

Describe the null hypotheses to which the p-values given in Table 3.4 correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model

fitq1 <- lm(sales ~ TV + radio + newspaper, Advertising)
summary(fitq1)
## 
## Call:
## lm(formula = sales ~ TV + radio + newspaper, data = Advertising)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8277 -0.8908  0.2418  1.1893  2.8292 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.938889   0.311908   9.422   <2e-16 ***
## TV           0.045765   0.001395  32.809   <2e-16 ***
## radio        0.188530   0.008611  21.893   <2e-16 ***
## newspaper   -0.001037   0.005871  -0.177     0.86    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.686 on 196 degrees of freedom
## Multiple R-squared:  0.8972, Adjusted R-squared:  0.8956 
## F-statistic: 570.3 on 3 and 196 DF,  p-value: < 2.2e-16

Each p-value in Table 3.4 corresponds to a test of the null hypothesis that the associated coefficient is zero, i.e. that the corresponding medium has no association with sales once the other two are held fixed (and, for the intercept, that expected sales are zero when all three budgets are zero):

\[H_0: \beta_j = 0 \qquad H_a: \beta_j \neq 0\]

The F-statistic is well above 1 and its p-value is essentially zero, so there is strong evidence that at least one of the advertising media is associated with sales.

For TV and radio, the probability of observing t-statistics as extreme as those above under the null hypotheses \(\beta_{TV} = 0\) and \(\beta_{radio} = 0\) is essentially zero. We therefore reject \(H_0\) in favour of \(H_a\) at a significance level of \(\alpha = 0.05\): there is strong evidence that spending on TV and radio advertising is associated with sales.

For newspaper, however, the p-value under the null hypothesis \(\beta_{newspaper} = 0\) is 0.86, so \(H_0\) cannot be rejected at \(\alpha = 0.05\): there is no evidence that newspaper spending is associated with sales once TV and radio spending are accounted for.

Question 2

Carefully explain the differences between the KNN classifier and KNN regression methods.

The K-nearest neighbors (KNN) classifier is used when the response is qualitative. Given a test point \(x_0\), it identifies the \(K\) training observations closest to \(x_0\) (denoted \(N_0\)) and assigns \(x_0\) to the class \(j\) with the largest estimated conditional probability,

\[Pr(Y = j \mid X = x_0) = \frac{1}{K} \sum_{i \in N_0} I(y_i = j)\]

KNN regression is a non-parametric method analogous to the KNN classifier, but for quantitative responses. Given the \(K\) training observations closest to \(x_0\), again denoted \(N_0\), it estimates \(f(x_0)\) as the average of their responses,

\[\hat f(x_0) = \frac{1}{K} \sum_{x_i \in N_0} y_i\]

In short, the classifier outputs a class label (the majority class among the neighbours), while KNN regression outputs a numeric prediction (the local average of the neighbours' responses).
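As an illustration (not part of the original answer), a minimal base-R sketch of one-dimensional KNN regression; knn_reg_predict and the simulated data below are hypothetical:

knn_reg_predict <- function(x_train, y_train, x0, K = 5) {
        neighbors <- order(abs(x_train - x0))[1:K]   # indices of the K closest training points
        mean(y_train[neighbors])                     # average of their responses
}

set.seed(42)
x_train <- runif(50)
y_train <- sin(2 * pi * x_train) + rnorm(50, sd = 0.1)
knn_reg_predict(x_train, y_train, x0 = 0.5, K = 5)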

Question 3

Suppose we have a data set with five predictors, X1 = GPA, X2 = IQ, X3 = Gender (1 for Female and 0 for Male), X4 = Interaction between GPA and IQ, and X5 = Interaction between GPA and Gender. The response is starting salary after graduation (in thousands of dollars). Suppose we use least squares to fit the model, and get \(\hat\beta_0 = 50, \hat\beta_1 = 20, \hat\beta_2 = 0.07, \hat\beta_3 = 35, \hat\beta_4 = 0.01, \hat\beta_5 = −10\).

  (a) Which answer is correct, and why?

GPA ranges from 0 to 4. Holding IQ and GPA fixed:

  • Male: \(X_3 \hat\beta_3 = 0\) and \(X_5 \hat\beta_5 = 0\)

  • Female: \(X_3 \hat\beta_3 = 35\) and \(X_5 \hat\beta_5 = -10 \times GPA \in [-40, 0]\), so the two terms together contribute between \(-5\) and \(35\)

The difference between a female's and a male's expected salary is therefore \(35 - 10 \times GPA\), so for the same GPA and IQ males earn more on average provided the GPA is high enough (GPA > 3.5). Answer (iii) is correct.

  (b) Predict the salary of a female with IQ of 110 and a GPA of 4.0.

The predicted starting salary is 137.1 thousand dollars.
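The arithmetic behind this figure (Gender = 1 for a female):

50 + 20 * 4 + 0.07 * 110 + 35 * 1 + 0.01 * (4 * 110) - 10 * (4 * 1)   # = 137.1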

  (c) True or false: Since the coefficient for the GPA/IQ interaction term is very small, there is very little evidence of an interaction effect. Justify your answer.

False. The magnitude of a coefficient does not by itself reveal whether an interaction effect exists; the GPA × IQ product is on a large scale, so even a small coefficient can correspond to a meaningful effect. To make that claim one would need to examine the p-value (or standard error) associated with the coefficient.

Question 4

I collect a set of data (n = 100 observations) containing a single predictor and a quantitative response. I then fit a linear regression model to the data, as well as a separate cubic regression, i.e. \(Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3X^3 + \epsilon\).

  (a) Suppose that the true relationship between X and Y is linear, i.e. \(Y = \beta_0 + \beta_1X + \epsilon\). Consider the training residual sum of squares (RSS) for the linear regression, and also the training RSS for the cubic regression. Would we expect one to be lower than the other, would we expect them to be the same, or is there not enough information to tell? Justify your answer.

The training RSS of the cubic regression will never be higher than that of the linear regression (and will generally be lower), because the linear model is nested within the cubic one: adding predictors can only reduce the RSS on the training data.

fakedata <- data.frame(x = seq(from = 0.1, to = 10, by = .1),
                       y = 5 + 2*seq(from = 0.1, to = 10, by = .1) + rnorm(n = 100, mean = 0, sd = 2))

plot(y ~ x, fakedata)

fakeTest <- sample_n(fakedata, size = 40)         # random subsample of 40 points (sample_n() is from dplyr)
fitFakeTestPoly <- lm(y ~ poly(x, 3), fakeTest)   # cubic fit on the subsample
fitFakeTestSing <- lm(y ~ x, fakeTest)            # linear fit on the subsample
sum(fitFakeTestPoly$residuals^2)
## [1] 203.9416
sum(fitFakeTestSing$residuals^2)
## [1] 211.6811
  (b) Answer (a) using test rather than training RSS.

On test data we would expect the cubic model's RSS to be higher than the linear model's, since the true relationship is linear and the cubic model has partly fit the noise in the training data.
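A sketch of that comparison with the simulated data above, using the rows not drawn by sample_n() as a test set (exact numbers depend on the random draw; output not shown):

fakeHoldout <- dplyr::anti_join(fakedata, fakeTest, by = c("x", "y"))     # the 60 held-out rows
sum((fakeHoldout$y - predict(fitFakeTestSing, newdata = fakeHoldout))^2)  # linear test RSS
sum((fakeHoldout$y - predict(fitFakeTestPoly, newdata = fakeHoldout))^2)  # cubic test RSS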

  (c) Suppose that the true relationship between X and Y is not linear, but we don’t know how far it is from linear. Consider the training RSS for the linear regression, and also the training RSS for the cubic regression. Would we expect one to be lower than the other, would we expect them to be the same, or is there not enough information to tell? Justify your answer.

The training RSS of the cubic model will be lower than (or at worst equal to) that of the linear model, because adding terms can only reduce the training RSS, regardless of the true relationship.

  (d) Answer (c) using test rather than training RSS.

There is not enough information to tell: it depends on how far from linear the true relationship is (and on the noise level). If it is close to linear, the linear model may have the lower test RSS; if it is markedly non-linear, the cubic model may.

Question 8

This question involves the use of simple linear regression on the Auto data set.

  (a) Use the lm() function to perform a simple linear regression with mpg as the response and horsepower as the predictor. Use the summary() function to print the results. Comment on the output. For example:
fitq8 <- lm(mpg ~ horsepower, Auto)
summary(fitq8)
## 
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.5710  -3.2592  -0.3435   2.7630  16.9240 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 39.935861   0.717499   55.66   <2e-16 ***
## horsepower  -0.157845   0.006446  -24.49   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.906 on 390 degrees of freedom
## Multiple R-squared:  0.6059, Adjusted R-squared:  0.6049 
## F-statistic: 599.7 on 1 and 390 DF,  p-value: < 2.2e-16
  i. Is there a relationship between the predictor and the response?

Yes. An increase of 1 horsepower is associated with an average decrease of about 0.16 mpg. The p-value associated with \(H_0: \beta_1 = 0\) is extremely small (\(<2\times 10^{-16}\)), suggesting a relationship between the predictor and the response; the F-statistic points to the same conclusion.

  ii. How strong is the relationship between the predictor and the response?

The model's RSE (residual standard error) is about 4.91, which corresponds to a percentage error of roughly 21% (the ratio of the RSE to the mean of the response). In addition, the model explains about 61% of the variance in mpg, according to \(R^2\).

  iii. Is the relationship between the predictor and the response positive or negative?

The relationship between the predictor and the response is negative (\(\hat\beta_1 \approx -0.16\)).

  iv. What is the predicted mpg associated with a horsepower of 98? What are the associated 95% confidence and prediction intervals?

The predicted mpg for horsepower = 98 is shown below. Note that the prediction interval is wider than the confidence interval because it also accounts for the uncertainty due to the irreducible error \(\epsilon\).

predict(fitq8, data.frame(horsepower = 98), interval = "prediction")
##        fit     lwr      upr
## 1 24.46708 14.8094 34.12476
predict(fitq8, data.frame(horsepower = 98), interval = "confidence" )
##        fit      lwr      upr
## 1 24.46708 23.97308 24.96108
  (b) Plot the response and the predictor. Use the abline() function to display the least squares regression line.
par(mfrow = c(1 ,1))
plot(mpg ~ horsepower, Auto, pch = 19, col = "darkgray")
abline(fitq8, lwd = 2)

  (c) Use the plot() function to produce diagnostic plots of the least squares regression fit. Comment on any problems you see with the fit.

The quadratic (U-shaped) pattern in the residuals indicates a non-linear relationship between the predictor and the response. The residuals also show heteroscedasticity (non-constant variance). Observations 7, 9, 14, 96 and 117 have high leverage, since their \(h_i\) values exceed five times the average \(h_i\) over all observations. Among these, observation 117 also has a standardized residual above 2, which deserves a closer look.

par(mfrow = c(2 ,2))
plot(fitq8)

Auto[hatvalues(fitq8) > 5 * mean(hatvalues(fitq8)),]
##     mpg cylinders displacement horsepower weight acceleration year origin
## 7    14         8          454        220   4354          9.0   70      1
## 9    14         8          455        225   4425         10.0   70      1
## 14   14         8          455        225   3086         10.0   70      1
## 96   12         8          455        225   4951         11.0   73      1
## 117  16         8          400        230   4278          9.5   73      1
##                         name
## 7           chevrolet impala
## 9           pontiac catalina
## 14   buick estate wagon (sw)
## 96  buick electra 225 custom
## 117       pontiac grand prix
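As a complementary sketch (the thresholds are the usual rules of thumb, not from the original answer), the flagged points can be cross-checked against the standardized residuals; per the discussion above this should single out observation 117 (output not shown):

high_lev  <- hatvalues(fitq8) > 5 * mean(hatvalues(fitq8))   # high-leverage observations
big_resid <- abs(rstandard(fitq8)) > 2                       # large standardized residuals
which(high_lev & big_resid)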

Question 9

This question involves the use of multiple linear regression on the Auto data set.

  (a) Produce a scatterplot matrix which includes all of the variables in the data set.
par(mfrow = c(1,1))
pairs(Auto, col = "blue", pch = 20)

  (b) Compute the matrix of correlations between the variables using the function cor(). You will need to exclude the name variable, which is qualitative.
library(tidyverse)
library(knitr)        # kable()
library(kableExtra)   # kable_styling()

cor(Auto[, -ncol(Auto)]) %>%        # drop the last column (name), which is qualitative
        kable(booktabs = T) %>%
        kable_styling(position = "center")
                    mpg   cylinders  displacement  horsepower      weight  acceleration        year      origin
mpg           1.0000000  -0.7776175    -0.8051269  -0.7784268  -0.8322442     0.4233285   0.5805410   0.5652088
cylinders    -0.7776175   1.0000000     0.9508233   0.8429834   0.8975273    -0.5046834  -0.3456474  -0.5689316
displacement -0.8051269   0.9508233     1.0000000   0.8972570   0.9329944    -0.5438005  -0.3698552  -0.6145351
horsepower   -0.7784268   0.8429834     0.8972570   1.0000000   0.8645377    -0.6891955  -0.4163615  -0.4551715
weight       -0.8322442   0.8975273     0.9329944   0.8645377   1.0000000    -0.4168392  -0.3091199  -0.5850054
acceleration  0.4233285  -0.5046834    -0.5438005  -0.6891955  -0.4168392     1.0000000   0.2903161   0.2127458
year          0.5805410  -0.3456474    -0.3698552  -0.4163615  -0.3091199     0.2903161   1.0000000   0.1815277
origin        0.5652088  -0.5689316    -0.6145351  -0.4551715  -0.5850054     0.2127458   0.1815277   1.0000000
  (c) Use the lm() function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors. Use the summary() function to print the results. Comment on the output. For instance:
fitq9 <- lm(mpg ~ .-name, Auto)
summary(fitq9)
## 
## Call:
## lm(formula = mpg ~ . - name, data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5903 -2.1565 -0.1169  1.8690 13.0604 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -17.218435   4.644294  -3.707  0.00024 ***
## cylinders     -0.493376   0.323282  -1.526  0.12780    
## displacement   0.019896   0.007515   2.647  0.00844 ** 
## horsepower    -0.016951   0.013787  -1.230  0.21963    
## weight        -0.006474   0.000652  -9.929  < 2e-16 ***
## acceleration   0.080576   0.098845   0.815  0.41548    
## year           0.750773   0.050973  14.729  < 2e-16 ***
## origin         1.426141   0.278136   5.127 4.67e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.328 on 384 degrees of freedom
## Multiple R-squared:  0.8215, Adjusted R-squared:  0.8182 
## F-statistic: 252.4 on 7 and 384 DF,  p-value: < 2.2e-16
  i. Is there a relationship between the predictors and the response?

Yes: the F-statistic is well above 1 and its p-value is essentially zero, suggesting that at least one of the predictors is related to the response.

  ii. Which predictors appear to have a statistically significant relationship to the response?

At a significance level of \(\alpha = 0.05\), the predictors displacement, weight, year and origin are statistically significant.

  iii. What does the coefficient for the year variable suggest?

The coefficient for year (the model year of the vehicle) suggests that, holding the other variables constant, cars gain on average about 0.75 miles per gallon (mpg) per model year.

  (d) Use the plot() function to produce diagnostic plots of the linear regression fit. Comment on any problems you see with the fit. Do the residual plots suggest any unusually large outliers? Does the leverage plot identify any observations with unusually high leverage?
par(mfrow = c(2,2))
plot(fitq9)

Auto[hatvalues(fitq9) > 5 * mean(hatvalues(fitq9)),]
##    mpg cylinders displacement horsepower weight acceleration year origin
## 14  14         8          455        225   3086           10   70      1
##                       name
## 14 buick estate wagon (sw)

The quadratic (U-shaped) pattern in the residuals indicates non-linearity in the relationship between the predictors and the response, and the residuals also show heteroscedasticity (non-constant variance). Observations 323, 326 and 327 have large standardized residuals. The leverage plot shows that observation 14 has unusually high leverage (a hat value about 9.3 times the average).

  (e) Use the * and : symbols to fit linear regression models with interaction effects. Do any interactions appear to be statistically significant?
fitq9Inter <- lm(mpg ~ .-name + acceleration:displacement +  weight:horsepower , Auto)

summary(fitq9Inter)
## 
## Call:
## lm(formula = mpg ~ . - name + acceleration:displacement + weight:horsepower, 
##     data = Auto)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.038 -1.625 -0.076  1.340 11.721 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               -6.088e+00  5.275e+00  -1.154  0.24921    
## cylinders                  1.321e-01  2.893e-01   0.457  0.64811    
## displacement               3.145e-02  1.043e-02   3.014  0.00275 ** 
## horsepower                -2.172e-01  2.377e-02  -9.138  < 2e-16 ***
## weight                    -9.455e-03  9.082e-04 -10.411  < 2e-16 ***
## acceleration               2.483e-01  1.378e-01   1.801  0.07246 .  
## year                       7.766e-01  4.447e-02  17.465  < 2e-16 ***
## origin                     7.507e-01  2.498e-01   3.006  0.00282 ** 
## displacement:acceleration -2.260e-03  7.112e-04  -3.178  0.00160 ** 
## horsepower:weight          4.706e-05  5.778e-06   8.145 5.46e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.897 on 382 degrees of freedom
## Multiple R-squared:  0.8654, Adjusted R-squared:  0.8622 
## F-statistic: 272.9 on 9 and 382 DF,  p-value: < 2.2e-16

The interaction terms acceleration:displacement and horsepower:weight both have extremely small p-values, i.e. strong evidence for \(H_a: \beta \neq 0\), suggesting that the true relationship is not purely additive. The \(R^2\) of the main-effects-only model is 0.821, while the model with these two interaction terms has \(R^2 = 0.865\); in other words, about 25% of the variability in mpg that was left unexplained by the additive model is explained by the interaction terms.
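As a sketch (output not shown), the same comparison can be made formally with a partial F-test between the two nested models; it should agree with the t-tests reported above:

anova(fitq9, fitq9Inter)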

  (f) Try a few different transformations of the variables, such as \(\log(X), \sqrt{X}, X^2\). Comment on your findings.
fitq9Transform <- lm(mpg ~ poly(displacement, 2) + horsepower*weight +  poly(year, 2) , Auto)

summary(fitq9Transform)
## 
## Call:
## lm(formula = mpg ~ poly(displacement, 2) + horsepower * weight + 
##     poly(year, 2), data = Auto)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.5090 -1.5577  0.0065  1.3915 12.2838 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             5.400e+01  3.034e+00  17.798  < 2e-16 ***
## poly(displacement, 2)1 -4.390e+00  9.178e+00  -0.478 0.632653    
## poly(displacement, 2)2  1.703e+01  5.019e+00   3.393 0.000763 ***
## horsepower             -1.783e-01  2.472e-02  -7.210 2.99e-12 ***
## weight                 -8.285e-03  1.055e-03  -7.849 4.22e-14 ***
## poly(year, 2)1          5.516e+01  3.158e+00  17.468  < 2e-16 ***
## poly(year, 2)2          1.615e+01  2.966e+00   5.446 9.22e-08 ***
## horsepower:weight       3.755e-05  7.314e-06   5.135 4.51e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.829 on 384 degrees of freedom
## Multiple R-squared:  0.871,  Adjusted R-squared:  0.8686 
## F-statistic: 370.3 on 7 and 384 DF,  p-value: < 2.2e-16

Using quadratic terms for the predictors displacement (engine displacement) and year (model year) partly removes the U-shaped pattern seen in the residual diagnostic plot.
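As a further sketch (the model formulas below are illustrative choices, not part of the original answer), the log and square-root transformations mentioned in the exercise can also be applied to horsepower and compared via \(R^2\):

fitq9Log  <- lm(mpg ~ log(horsepower) + weight + year + origin, Auto)    # log transform
fitq9Sqrt <- lm(mpg ~ sqrt(horsepower) + weight + year + origin, Auto)   # square-root transform
summary(fitq9Log)$r.squared
summary(fitq9Sqrt)$r.squared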

Question 10

This question should be answered using the Carseats data set.

  (a) Fit a multiple regression model to predict Sales using Price, Urban, and US.
fitq10 <- lm(Sales ~ Price + Urban + US, Carseats)
  (b) Provide an interpretation of each coefficient in the model. Be careful: some of the variables in the model are qualitative!
summary(fitq10)
## 
## Call:
## lm(formula = Sales ~ Price + Urban + US, data = Carseats)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9206 -1.6220 -0.0564  1.5786  7.0581 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.043469   0.651012  20.036  < 2e-16 ***
## Price       -0.054459   0.005242 -10.389  < 2e-16 ***
## UrbanYes    -0.021916   0.271650  -0.081    0.936    
## USYes        1.200573   0.259042   4.635 4.86e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.472 on 396 degrees of freedom
## Multiple R-squared:  0.2393, Adjusted R-squared:  0.2335 
## F-statistic: 41.52 on 3 and 396 DF,  p-value: < 2.2e-16

Holding Urban and US fixed, a one-unit increase in the price charged for child car seats (Price) is associated with an average decrease of about 54 units sold (Sales is measured in thousands of units).

If the store is in an urban location (Urban = Yes), holding Price and US fixed, sales are on average about 22 units lower; however, the p-value is 0.936, so there is essentially no evidence of this effect.

If the store is in the United States (US = Yes), holding Price and Urban fixed, sales are on average about 1,201 units higher.

  (c) Write out the model in equation form, being careful to handle the qualitative variables properly.

\[Sales = \beta_0 + \beta_1 \times Price + \beta_2 \times Urban + \beta_3 \times US + \epsilon\]

where, for the \(i\)th observation, \(Urban = 1\) if the store is in an urban location and \(Urban = 0\) otherwise, and \(US = 1\) if the store is in the United States and \(US = 0\) otherwise.
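Substituting the least squares estimates from (b), the fitted model is approximately

\[\widehat{Sales} = 13.043 - 0.054 \times Price - 0.022 \times Urban + 1.201 \times US\]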

  (d) For which of the predictors can you reject the null hypothesis \(H_0: \beta_j = 0\)?

For the predictors Price and US.

  (e) On the basis of your response to the previous question, fit a smaller model that only uses the predictors for which there is evidence of association with the outcome.
fitq10smaller <- lm(Sales ~ Price + US, Carseats)
  (f) How well do the models in (a) and (e) fit the data?

The model from (a) has \(RSE = 2.47\), while the mean of Sales is about 7.5 (thousand units), i.e. a percentage error of roughly 33%. Its \(R^2\) is 0.239, so it explains about 24% of the variance in Sales.

The model from (e) has \(RSE = 2.47\) as well, again roughly a 33% percentage error, and \(R^2 = 0.239\), also explaining about 24% of the variance in Sales. Dropping Urban therefore leaves the fit essentially unchanged.
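A sketch of a formal comparison of the two nested models (output not shown); since Urban contributes essentially nothing, the partial F-test should not reject the smaller model:

anova(fitq10smaller, fitq10)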

  (g) Using the model from (e), obtain 95% confidence intervals for the coefficient(s).
confint(fitq10smaller) %>%
        kable(booktabs = T) %>%
        kable_styling(position = "center")
                 2.5 %      97.5 %
(Intercept) 11.7903202  14.2712653
Price       -0.0647598  -0.0441954
USYes        0.6915196   1.7077663
  (h) Is there evidence of outliers or high leverage observations in the model from (e)?

All observations have standardized residuals below 3 in absolute value, while the largest leverage corresponds to a hat value of about 5.78 times the average hat value.

par(mfrow = c(2,2))
plot(fitq10smaller)
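A sketch of the numeric checks behind the statement above (output not shown):

max(abs(rstandard(fitq10smaller)))                              # largest standardized residual (absolute value)
max(hatvalues(fitq10smaller)) / mean(hatvalues(fitq10smaller))  # largest leverage relative to the average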

Question 11

In this problem we will investigate the t-statistic for the null hypothesis H0: β = 0 in simple linear regression without an intercept. To begin, we generate a predictor x and a response y as follows.

set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
  (a) Perform a simple linear regression of y onto x, without an intercept. Report the coefficient estimate \(\hat\beta\), the standard error of this coefficient estimate, and the t-statistic and p-value associated with the null hypothesis \(H_0: \beta = 0\). Comment on these results. (You can perform regression without an intercept using the command lm(y∼x+0).)
fitq11_a <- lm(y ~ x + 0)
summary(fitq11_a)
## 
## Call:
## lm(formula = y ~ x + 0)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.9154 -0.6472 -0.1771  0.5056  2.3109 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## x   1.9939     0.1065   18.73   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9586 on 99 degrees of freedom
## Multiple R-squared:  0.7798, Adjusted R-squared:  0.7776 
## F-statistic: 350.7 on 1 and 99 DF,  p-value: < 2.2e-16

On average, y increases by about 1.994 for each one-unit increase in x. The p-value associated with the null hypothesis \(H_0: \beta = 0\) is essentially zero, so the null can be rejected in favour of the alternative.

  (b) Now perform a simple linear regression of x onto y without an intercept, and report the coefficient estimate, its standard error, and the corresponding t-statistic and p-values associated with the null hypothesis \(H_0: \beta = 0\). Comment on these results.
fitq11_b <- lm(x ~ y + 0)
summary(fitq11_b)
## 
## Call:
## lm(formula = x ~ y + 0)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.8699 -0.2368  0.1030  0.2858  0.8938 
## 
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)    
## y  0.39111    0.02089   18.73   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4246 on 99 degrees of freedom
## Multiple R-squared:  0.7798, Adjusted R-squared:  0.7776 
## F-statistic: 350.7 on 1 and 99 DF,  p-value: < 2.2e-16

On average, x increases by about 0.391 for each one-unit increase in y. The p-value associated with the null hypothesis \(H_0: \beta = 0\) is essentially zero, so the null can be rejected in favour of the alternative.

  (c) What is the relationship between the results obtained in (a) and (b)?

The t-statistic (and hence the p-value) is the same for the two regressions, as is the \(R^2\); only the coefficient estimates differ.

  (d) For the regression of Y onto X without an intercept, the t-statistic for \(H_0: \beta = 0\) takes the form \(\hat\beta /SE(\hat\beta)\), where \(\hat\beta\) is given by (3.38), and where \[SE(\hat\beta) = \sqrt\frac{\sum_{i=1}^n(y_i - x_i\hat\beta)^2}{(n-1)\sum_{i'=1}^n x^2_{i'}}\] (These formulas are slightly different from those given in Sections 3.1.1 and 3.1.2, since here we are performing regression without an intercept.) Show algebraically, and confirm numerically in R, that the t-statistic can be written as \[\frac{(\sqrt{n-1})\sum_{i=1}^nx_iy_i}{\sqrt{(\sum_{i=1}^nx_i^2)(\sum_{i'=1}^ny_{i'}^2) - (\sum_{i'=1}^nx_{i'}y_{i'})^2}}\]

\[\hat\beta = \frac{\sum_{i=1}^nx_iy_i}{\sum_{i=1}^nx_i^2}\]

\[t = \frac{\hat\beta}{SE(\hat\beta)} = \frac{\frac{\sum_{i=1}^nx_iy_i}{\sum_{i=1}^nx_i^2}}{\sqrt{\frac{\sum_{i=1}^n(y_i - x_i\hat\beta)^2}{(n-1)\sum_{i'=1}^n x^2_{i'}}}} = \frac{\sqrt{n-1}\,\sum_{i=1}^nx_iy_i}{\sqrt{\sum_{i=1}^nx_i^2}\,\sqrt{\sum_{i=1}^n(y_i - x_i\hat\beta)^2}}\]

Expanding the residual sum of squares and substituting \(\hat\beta\),

\[\sum_{i=1}^n(y_i - x_i\hat\beta)^2 = \sum_{i=1}^ny_i^2 - 2\hat\beta\sum_{i=1}^nx_iy_i + \hat\beta^2\sum_{i=1}^nx_i^2 = \sum_{i=1}^ny_i^2 - \frac{(\sum_{i=1}^nx_iy_i)^2}{\sum_{i=1}^nx_i^2}\]

so that

\[t = \frac{(\sqrt{n-1})\sum_{i=1}^nx_iy_i}{\sqrt{(\sum_{i=1}^nx_i^2)(\sum_{i'=1}^ny_{i'}^2) - (\sum_{i'=1}^nx_{i'}y_{i'})^2}}\]

(sqrt(length(x) - 1) * sum(x * y)) / sqrt(sum(x^2) * sum(y^2) - sum(x * y)^2) # compare with the t value in summary(fitq11_a)
## [1] 18.72593
  (e) Using the results from (d), argue that the t-statistic for the regression of y onto x is the same as the t-statistic for the regression of x onto y.

The expression obtained in (d) is symmetric in x and y: swapping the roles of \(x_i\) and \(y_i\) leaves it unchanged, so the t-statistic is the same for both regressions.

  (f) In R, show that when regression is performed with an intercept, the t-statistic for \(H_0: \beta_1 = 0\) is the same for the regression of y onto x as it is for the regression of x onto y.
summary(lm(y ~ x))
## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8768 -0.6138 -0.1395  0.5394  2.3462 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.03769    0.09699  -0.389    0.698    
## x            1.99894    0.10773  18.556   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9628 on 98 degrees of freedom
## Multiple R-squared:  0.7784, Adjusted R-squared:  0.7762 
## F-statistic: 344.3 on 1 and 98 DF,  p-value: < 2.2e-16
summary(lm(x ~ y))
## 
## Call:
## lm(formula = x ~ y)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.90848 -0.28101  0.06274  0.24570  0.85736 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.03880    0.04266    0.91    0.365    
## y            0.38942    0.02099   18.56   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4249 on 98 degrees of freedom
## Multiple R-squared:  0.7784, Adjusted R-squared:  0.7762 
## F-statistic: 344.3 on 1 and 98 DF,  p-value: < 2.2e-16

Question 12

  (a) Recall that the coefficient estimate \(\hat\beta\) for the linear regression of Y onto X without an intercept is given by (3.38). Under what circumstance is the coefficient estimate for the regression of X onto Y the same as the coefficient estimate for the regression of Y onto X?

From (3.38), the slope of Y onto X is \(\sum_i x_iy_i / \sum_i x_i^2\), while the slope of X onto Y is \(\sum_i x_iy_i / \sum_i y_i^2\); the two estimates are therefore equal exactly when \(\sum_{i=1}^n x_i^2 = \sum_{i=1}^n y_i^2\).

  (b) Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is different from the coefficient estimate for the regression of Y onto X.
x <- rnorm(100)
y <- rnorm(100, mean = 1)

lm(y ~ x + 0)$coef
##         x 
## 0.1355718
lm(x ~ y + 0)$coef
##          y 
## 0.06910041
  (c) Generate an example in R with n = 100 observations in which the coefficient estimate for the regression of X onto Y is the same as the coefficient estimate for the regression of Y onto X.
x <- rnorm(100)
y <- x

lm(y ~ x + 0)$coef
## x 
## 1
lm(x ~ y + 0)$coef
## y 
## 1
#or
y <- -1 * x
lm(y ~ x + 0)$coef
##  x 
## -1
lm(x ~ y + 0)$coef
##  y 
## -1
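A less trivial sketch: any permutation of x has the same sum of squares as x, so the two no-intercept slope estimates still agree (their common value depends on the permutation drawn; output not shown).

y <- sample(x)       # a random permutation of x, so sum(y^2) equals sum(x^2)
lm(y ~ x + 0)$coef
lm(x ~ y + 0)$coef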

Question 13

In this exercise you will create some simulated data and will fit simple linear regression models to it. Make sure to use set.seed(1) prior to starting part (a) to ensure consistent results.

  (a) Using the rnorm() function, create a vector, x, containing 100 observations drawn from a N(0, 1) distribution. This represents a feature, X.
set.seed(1)
x <- rnorm(100, 0, 1)
  (b) Using the rnorm() function, create a vector, eps, containing 100 observations drawn from a N(0, 0.25) distribution i.e. a normal distribution with mean zero and variance 0.25.
eps <- rnorm(n = 100, mean = 0, sd = 0.25)  # note: a variance of 0.25 corresponds to sd = sqrt(0.25) = 0.5; sd = 0.25 gives variance 0.0625
  (c) Using x and eps, generate a vector y according to the model

\[Y = -1 + 0.5X +\epsilon\]

What is the length of the vector y? What are the values of \(\beta_0\) and \(\beta_1\) in this linear model?

y <- -1 + 0.5 * x + eps

length(y)
## [1] 100
beta0 <- -1
beta1 <- 0.5
  (d) Create a scatterplot displaying the relationship between x and y. Comment on what you observe.
plot(y ~ x)

The scatterplot suggests a linear relationship between x and y, with some noise around the line.

  (e) Fit a least squares linear model to predict y using x. Comment on the model obtained. How do \(\hat{\beta_0}\) and \(\hat{\beta_1}\) compare to \(\beta_0\) and \(\beta_1\)?
fite <- lm(y ~ x)

fite$coefficients
## (Intercept)           x 
##  -1.0094232   0.4997349

The estimate \(\hat\beta_0\) (-1.009) is close to the true \(\beta_0\) (-1), and \(\hat\beta_1\) (0.500) is likewise close to the true \(\beta_1\) (0.5).

  (f) Display the least squares line on the scatterplot obtained in (d). Draw the population regression line on the plot, in a different color. Use the legend() command to create an appropriate legend.
plot(y ~ x)
abline(fite, col = "black")              # least squares line
abline(a = -1, b = 0.5, col = "red")     # population regression line (beta0 = -1, beta1 = 0.5)
legend(x = "bottomright",
       legend = c("Least squares line", "Population line"),
       col = c("black", "red"),
       lty = 1)

  (g) Now fit a polynomial regression model that predicts y using x and \(x^2\). Is there evidence that the quadratic term improves the model fit? Explain your answer.
fitg <- lm(y ~ poly(x, 2))


anova(fite, fitg)
## Analysis of Variance Table
## 
## Model 1: y ~ x
## Model 2: y ~ poly(x, 2)
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     98 5.6772                           
## 2     97 5.5643  1   0.11291 1.9682 0.1638

At a significance level of \(\alpha = 0.05\), we cannot reject the null hypothesis that the two models fit the data equally well, so there is no evidence that the quadratic term improves the fit.

  (h) Repeat (a)–(f) after modifying the data generation process in such a way that there is less noise in the data. The model (3.39) should remain the same. You can do this by decreasing the variance of the normal distribution used to generate the error term \(\epsilon\) in (b). Describe your results.
set.seed(1)
x <- rnorm(100, 0, 1)
eps <- rnorm(100, 0, 0.01)
y <- -1 + 0.5 * x + eps
length(y)
## [1] 100
fite_h <- lm(y ~ x)

plot(y ~ x)
abline(fite_h)
legend(x="bottomright",
       c("Less Noise"),
       lty = 1)

fite_h$coefficients
## (Intercept)           x 
##  -1.0003769   0.4999894
fitg_h <- lm(y ~ poly(x, 2))
anova(fite_h, fitg_h)
## Analysis of Variance Table
## 
## Model 1: y ~ x
## Model 2: y ~ poly(x, 2)
##   Res.Df       RSS Df  Sum of Sq      F Pr(>F)
## 1     98 0.0090836                            
## 2     97 0.0089029  1 0.00018065 1.9682 0.1638

The estimate \(\hat\beta_0\) (-1.000) is very close to the true \(\beta_0\) (-1), as is \(\hat\beta_1\) (0.500) to \(\beta_1\) (0.5); with less noise the estimates are even closer to the true values than before.

As before, the analysis of variance comparing the two models gives no indication that the quadratic term improves the model.

  (i) Repeat (a)–(f) after modifying the data generation process in such a way that there is more noise in the data. The model (3.39) should remain the same. You can do this by increasing the variance of the normal distribution used to generate the error term \(\epsilon\) in (b). Describe your results.
set.seed(1)
x <- rnorm(100, 0, 1)
eps <- rnorm(100, 0, sqrt(0.50))
y <- -1 + 0.5 * x + eps
length(y)
## [1] 100
fite_i <- lm(y ~ x)

plot(y ~ x)
abline(fite_i)
legend(x="bottomright",
       c("More noise"),
       lty = 1)

fite_i$coefficients
## (Intercept)           x 
##  -1.0266527   0.4992502
fitg_i <- lm(y ~ poly(x, 2))
anova(fite_i, fitg_i)
## Analysis of Variance Table
## 
## Model 1: y ~ x
## Model 2: y ~ poly(x, 2)
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     98 45.418                           
## 2     97 44.515  1   0.90325 1.9682 0.1638

The estimate \(\hat\beta_0\) (-1.027) is still close to the true \(\beta_0\) (-1), as is \(\hat\beta_1\) (0.499) to \(\beta_1\) (0.5), although the estimates are less precise than before.

Comparing the two models via analysis of variance, the null hypothesis cannot be rejected at \(\alpha = 0.05\), so again there is no evidence that the quadratic term improves the model.

  (j) What are the confidence intervals for \(\beta_0\) and \(\beta_1\) based on the original data set, the noisier data set, and the less noisy data set? Comment on your results.
betas1 <- data.frame(type = c("original", "less_noisy", "noisier"),
           rbind(confint(fite)[2, ],
                 confint(fite_h)[2, ],
                 confint(fite_i)[2, ]))

names(betas1)[-1] <- c("inf", "sup")
betas1 <- mutate(betas1, range_interval = sup - inf)

betas1 %>%
 kable(booktabs = T) %>%
         kable_styling(position = "center")
type          inf        sup        range_interval
original      0.4462897  0.5531801  0.1068904
less_noisy    0.4978516  0.5021272  0.0042756
noisier       0.3480843  0.6504160  0.3023317

The smaller the noise, the narrower the confidence interval for \(\beta_1\) (the same pattern holds for \(\beta_0\); see the sketch below).
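A sketch of the corresponding comparison for the intercept \(\beta_0\) (output not shown):

rbind(original   = confint(fite)[1, ],
      less_noisy = confint(fite_h)[1, ],
      noisier    = confint(fite_i)[1, ])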

Question 14

This problem focuses on the collinearity problem.

  (a) Perform the following commands in R:
set.seed(1)
x1 <- runif (100)
x2 <- 0.5*x1+rnorm (100)/10
y  <- 2+2*x1+0.3*x2+rnorm (100) 

The last line corresponds to creating a linear model in which y is a function of x1 and x2. Write out the form of the linear model. What are the regression coefficients?

The model is \(Y = 2 + 2X_1 + 0.3X_2 + \epsilon\), i.e. \(\beta_0 = 2\), \(\beta_1 = 2\) and \(\beta_2 = 0.3\); the lm() call below shows the corresponding least squares estimates.

lm(y ~ x1 + x2)
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Coefficients:
## (Intercept)           x1           x2  
##        2.13         1.44         1.01
  (b) What is the correlation between x1 and x2? Create a scatterplot displaying the relationship between the variables.
cor(x1, x2)
## [1] 0.8351212
plot(x2 ~ x1)

  (c) Using this data, fit a least squares regression to predict y using x1 and x2. Describe the results obtained. What are \(\hat\beta_0\), \(\hat\beta_1\), and \(\hat\beta_2\)? How do these relate to the true \(\beta_0\), \(\beta_1\), and \(\beta_2\)? Can you reject the null hypothesis \(H_0: \beta_1 = 0\)? How about the null hypothesis \(H_0: \beta_2 = 0\)?
fit14_x1x2 <- lm(y ~ x1 + x2)

summary(fit14_x1x2)
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8311 -0.7273 -0.0537  0.6338  2.3359 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.1305     0.2319   9.188 7.61e-15 ***
## x1            1.4396     0.7212   1.996   0.0487 *  
## x2            1.0097     1.1337   0.891   0.3754    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.056 on 97 degrees of freedom
## Multiple R-squared:  0.2088, Adjusted R-squared:  0.1925 
## F-statistic:  12.8 on 2 and 97 DF,  p-value: 1.164e-05
  • \(\hat\beta_0\) = 2.13, whereas \(\beta_0 = 2\) from (a);

  • \(\hat\beta_1\) = 1.44, whereas \(\beta_1 = 2\). At a significance level of \(\alpha = 0.05\), the null hypothesis \(H_0: \beta_1 = 0\) can be rejected (p = 0.049);

  • \(\hat\beta_2\) = 1.01, whereas \(\beta_2 = 0.3\). The null hypothesis \(H_0: \beta_2 = 0\) cannot be rejected at \(\alpha = 0.05\) (p = 0.375).

  (d) Now fit a least squares regression to predict y using only x1. Comment on your results. Can you reject the null hypothesis \(H_0: \beta_1 = 0\)?
fit14_x1 <- lm(y ~ x1)

summary(fit14_x1)
## 
## Call:
## lm(formula = y ~ x1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.89495 -0.66874 -0.07785  0.59221  2.45560 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.1124     0.2307   9.155 8.27e-15 ***
## x1            1.9759     0.3963   4.986 2.66e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.055 on 98 degrees of freedom
## Multiple R-squared:  0.2024, Adjusted R-squared:  0.1942 
## F-statistic: 24.86 on 1 and 98 DF,  p-value: 2.661e-06
  • \(\hat\beta_0\) = 2.11;

  • \(\hat\beta_1\) = 1.98. At a significance level of \(\alpha = 0.05\), the null hypothesis \(H_0: \beta_1 = 0\) can be rejected.

  (e) Now fit a least squares regression to predict y using only x2. Comment on your results. Can you reject the null hypothesis \(H_0: \beta_1 = 0\)?
fit14_x2 <- lm(y ~ x2)

summary(fit14_x2)
## 
## Call:
## lm(formula = y ~ x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.62687 -0.75156 -0.03598  0.72383  2.44890 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.3899     0.1949   12.26  < 2e-16 ***
## x2            2.8996     0.6330    4.58 1.37e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.072 on 98 degrees of freedom
## Multiple R-squared:  0.1763, Adjusted R-squared:  0.1679 
## F-statistic: 20.98 on 1 and 98 DF,  p-value: 1.366e-05
  • \(\hat\beta_0\) = 2.39;

  • \(\hat\beta_1\) = 2.90. At a significance level of \(\alpha = 0.05\), the null hypothesis \(H_0: \beta_1 = 0\) can be rejected.

  (f) Do the results obtained in (c)–(e) contradict each other? Explain your answer.

No. In the model containing both x1 and x2 (c), x2 does not appear to have a statistically significant relationship with the response y, whereas when considered on its own (e) it does. As seen in (b), x1 and x2 are collinear, and collinearity inflates the standard errors of the coefficient estimates, reducing the power of the hypothesis test (the probability of correctly detecting a non-zero coefficient).
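As a sketch (assuming the car package is available; output not shown), the variance inflation factor is one way to quantify this collinearity:

library(car)     # for vif()
vif(fit14_x1x2)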

  (g) Now suppose we obtain one additional observation, which was unfortunately mismeasured.
x1 <- c(x1, 0.1)
x2 <- c(x2, 0.8)
y <- c(y, 6)

Re-fit the linear models from (c) to (e) using this new data. What effect does this new observation have on each of the models? In each model, is this observation an outlier? A high-leverage point? Both? Explain your answers.

fit14_x1x2_g <- lm(y ~ x1 + x2)

summary(fit14_x1x2_g)
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.73348 -0.69318 -0.05263  0.66385  2.30619 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.2267     0.2314   9.624 7.91e-16 ***
## x1            0.5394     0.5922   0.911  0.36458    
## x2            2.5146     0.8977   2.801  0.00614 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.075 on 98 degrees of freedom
## Multiple R-squared:  0.2188, Adjusted R-squared:  0.2029 
## F-statistic: 13.72 on 2 and 98 DF,  p-value: 5.564e-06
par(mfrow = c(2,2))
plot(fit14_x1x2_g)

For the model containing both x1 and x2:

  • \(\hat\beta_1\) = 0.54 (recall \(\beta_1 = 2\)). With the new observation added, the null hypothesis \(H_0: \beta_1 = 0\) can no longer be rejected at \(\alpha = 0.05\), whereas it was rejected before.

  • \(\hat\beta_2\) = 2.51 (\(\beta_2 = 0.3\)). After adding the new observation, \(H_0: \beta_2 = 0\) is rejected at \(\alpha = 0.05\), whereas before it could not be rejected.

The new observation has a high hat value (high leverage) and a standardized residual close to 2.
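A sketch of the numeric check for the added observation, which is row 101 after the appends above (output not shown):

hatvalues(fit14_x1x2_g)[101]   # leverage of the mismeasured observation
rstandard(fit14_x1x2_g)[101]   # its standardized residual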

fit14_x1_g <- lm(y ~ x1)

summary(fit14_x1_g)
## 
## Call:
## lm(formula = y ~ x1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8897 -0.6556 -0.0909  0.5682  3.5665 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.2569     0.2390   9.445 1.78e-15 ***
## x1            1.7657     0.4124   4.282 4.29e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.111 on 99 degrees of freedom
## Multiple R-squared:  0.1562, Adjusted R-squared:  0.1477 
## F-statistic: 18.33 on 1 and 99 DF,  p-value: 4.295e-05
par(mfrow = c(2,2))
plot(fit14_x1_g)

  • \(\hat\beta_1\) = 1.77. At \(\alpha = 0.05\), the null hypothesis \(H_0: \beta_1 = 0\) can still be rejected, so adding the new observation does not change the outcome of the hypothesis test.

In this model the new observation stands out as an outlier, with a standardized residual above 3, while its hat value does not differ much from the other observations.

fit14_x2_g <- lm(y ~ x2)

summary(fit14_x2_g)
## 
## Call:
## lm(formula = y ~ x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.64729 -0.71021 -0.06899  0.72699  2.38074 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.3451     0.1912  12.264  < 2e-16 ***
## x2            3.1190     0.6040   5.164 1.25e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.074 on 99 degrees of freedom
## Multiple R-squared:  0.2122, Adjusted R-squared:  0.2042 
## F-statistic: 26.66 on 1 and 99 DF,  p-value: 1.253e-06
par(mfrow = c(2,2))
plot(fit14_x2_g)

  • \(\hat\beta_1\) = 3.12. At \(\alpha = 0.05\), the null hypothesis \(H_0: \beta_1 = 0\) can still be rejected, so adding the new observation does not change the outcome of the hypothesis test.

Here the new observation stands out for its high hat value (high leverage) compared with the other observations, while its standardized residual is not unusual.

Question 15

This problem involves the Boston data set, which we saw in the lab for this chapter. We will now try to predict per capita crime rate using the other variables in this data set. In other words, per capita crime rate is the response, and the other variables are the predictors.

  (a) For each predictor, fit a simple linear regression model to predict the response. Describe your results. In which of the models is there a statistically significant association between the predictor and the response? Create some plots to back up your assertions.
par(mfrow = c(1, 1))

valid.names <- names(Boston)[-1]   # every column except crim, the response

# Fit crim on each predictor in turn, plotting the relationship and printing the
# summary (print(plot(...)) returns NULL, hence the "NULL" lines in the output below).
for(i in 1:length(valid.names)){
        fit <- summary(lm(as.formula(paste("crim ~", valid.names[i])), Boston))
        print(plot(as.formula(paste("crim ~", valid.names[i])), Boston))
        print(fit)
        }

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.429 -4.222 -2.620  1.250 84.523 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.45369    0.41722  10.675  < 2e-16 ***
## zn          -0.07393    0.01609  -4.594 5.51e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.435 on 504 degrees of freedom
## Multiple R-squared:  0.04019,    Adjusted R-squared:  0.03828 
## F-statistic:  21.1 on 1 and 504 DF,  p-value: 5.506e-06

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.972  -2.698  -0.736   0.712  81.813 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.06374    0.66723  -3.093  0.00209 ** 
## indus        0.50978    0.05102   9.991  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.866 on 504 degrees of freedom
## Multiple R-squared:  0.1653, Adjusted R-squared:  0.1637 
## F-statistic: 99.82 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.738 -3.661 -3.435  0.018 85.232 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.7444     0.3961   9.453   <2e-16 ***
## chas         -1.8928     1.5061  -1.257    0.209    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.597 on 504 degrees of freedom
## Multiple R-squared:  0.003124,   Adjusted R-squared:  0.001146 
## F-statistic: 1.579 on 1 and 504 DF,  p-value: 0.2094

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.371  -2.738  -0.974   0.559  81.728 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -13.720      1.699  -8.073 5.08e-15 ***
## nox           31.249      2.999  10.419  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.81 on 504 degrees of freedom
## Multiple R-squared:  0.1772, Adjusted R-squared:  0.1756 
## F-statistic: 108.6 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.604 -3.952 -2.654  0.989 87.197 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   20.482      3.365   6.088 2.27e-09 ***
## rm            -2.684      0.532  -5.045 6.35e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.401 on 504 degrees of freedom
## Multiple R-squared:  0.04807,    Adjusted R-squared:  0.04618 
## F-statistic: 25.45 on 1 and 504 DF,  p-value: 6.347e-07

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.789 -4.257 -1.230  1.527 82.849 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.77791    0.94398  -4.002 7.22e-05 ***
## age          0.10779    0.01274   8.463 2.85e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.057 on 504 degrees of freedom
## Multiple R-squared:  0.1244, Adjusted R-squared:  0.1227 
## F-statistic: 71.62 on 1 and 504 DF,  p-value: 2.855e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.708 -4.134 -1.527  1.516 81.674 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.4993     0.7304  13.006   <2e-16 ***
## dis          -1.5509     0.1683  -9.213   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.965 on 504 degrees of freedom
## Multiple R-squared:  0.1441, Adjusted R-squared:  0.1425 
## F-statistic: 84.89 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.164  -1.381  -0.141   0.660  76.433 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.28716    0.44348  -5.157 3.61e-07 ***
## rad          0.61791    0.03433  17.998  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.718 on 504 degrees of freedom
## Multiple R-squared:  0.3913, Adjusted R-squared:   0.39 
## F-statistic: 323.9 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.513  -2.738  -0.194   1.065  77.696 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -8.528369   0.815809  -10.45   <2e-16 ***
## tax          0.029742   0.001847   16.10   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.997 on 504 degrees of freedom
## Multiple R-squared:  0.3396, Adjusted R-squared:  0.3383 
## F-statistic: 259.2 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -7.654 -3.985 -1.912  1.825 83.353 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.6469     3.1473  -5.607 3.40e-08 ***
## ptratio       1.1520     0.1694   6.801 2.94e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.24 on 504 degrees of freedom
## Multiple R-squared:  0.08407,    Adjusted R-squared:  0.08225 
## F-statistic: 46.26 on 1 and 504 DF,  p-value: 2.943e-11

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -13.756  -2.299  -2.095  -1.296  86.822 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 16.553529   1.425903  11.609   <2e-16 ***
## black       -0.036280   0.003873  -9.367   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.946 on 504 degrees of freedom
## Multiple R-squared:  0.1483, Adjusted R-squared:  0.1466 
## F-statistic: 87.74 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -13.925  -2.822  -0.664   1.079  82.862 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.33054    0.69376  -4.801 2.09e-06 ***
## lstat        0.54880    0.04776  11.491  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.664 on 504 degrees of freedom
## Multiple R-squared:  0.2076, Adjusted R-squared:  0.206 
## F-statistic:   132 on 1 and 504 DF,  p-value: < 2.2e-16

## NULL
## 
## Call:
## lm(formula = as.formula(paste("crim ~", valid.names[i])), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.071 -4.022 -2.343  1.298 80.957 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 11.79654    0.93419   12.63   <2e-16 ***
## medv        -0.36316    0.03839   -9.46   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.934 on 504 degrees of freedom
## Multiple R-squared:  0.1508, Adjusted R-squared:  0.1491 
## F-statistic: 89.49 on 1 and 504 DF,  p-value: < 2.2e-16

In the simple linear regressions, every predictor except chas (i.e. zn, indus, nox, rm, age, dis, rad, tax, ptratio, black, lstat and medv) appears to have a statistically significant relationship with the response.
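A sketch that collects the slope p-values from all the simple regressions above in one place (output not shown):

pvals <- sapply(valid.names, function(v) {
        coef(summary(lm(as.formula(paste("crim ~", v)), Boston)))[2, 4]
})
sort(pvals)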

  (b) Fit a multiple regression model to predict the response using all of the predictors. Describe your results. For which predictors can we reject the null hypothesis \(H_0: \beta_j = 0\)?
fitall <- lm(crim ~ ., Boston)
summary(fitall)
## 
## Call:
## lm(formula = crim ~ ., data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.924 -2.120 -0.353  1.019 75.051 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  17.033228   7.234903   2.354 0.018949 *  
## zn            0.044855   0.018734   2.394 0.017025 *  
## indus        -0.063855   0.083407  -0.766 0.444294    
## chas         -0.749134   1.180147  -0.635 0.525867    
## nox         -10.313535   5.275536  -1.955 0.051152 .  
## rm            0.430131   0.612830   0.702 0.483089    
## age           0.001452   0.017925   0.081 0.935488    
## dis          -0.987176   0.281817  -3.503 0.000502 ***
## rad           0.588209   0.088049   6.680 6.46e-11 ***
## tax          -0.003780   0.005156  -0.733 0.463793    
## ptratio      -0.271081   0.186450  -1.454 0.146611    
## black        -0.007538   0.003673  -2.052 0.040702 *  
## lstat         0.126211   0.075725   1.667 0.096208 .  
## medv         -0.198887   0.060516  -3.287 0.001087 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.439 on 492 degrees of freedom
## Multiple R-squared:  0.454,  Adjusted R-squared:  0.4396 
## F-statistic: 31.47 on 13 and 492 DF,  p-value: < 2.2e-16

At a significance level of \(\alpha = 0.05\), we can reject the null hypothesis \(H_0: \beta_j = 0\) for the predictors zn, dis, rad, black and medv.

  (c) How do your results from (a) compare to your results from (b)? Create a plot displaying the univariate regression coefficients from (a) on the x-axis, and the multiple regression coefficients from (b) on the y-axis. That is, each predictor is displayed as a single point in the plot. Its coefficient in a simple linear regression model is shown on the x-axis, and its coefficient estimate in the multiple linear regression model is shown on the y-axis.

Far fewer predictors show a statistically significant association with the response in the multiple regression model (5) than in the simple regressions (12), and the coefficient estimates themselves can change considerably; nox is the most striking example, with a coefficient of about 31 in the simple regression but about -10 in the multiple regression.

UnivariateCoef <- numeric()

for(i in 1:length(valid.names)){
UnivariateCoef[i] <- lm(as.formula(paste("crim ~", valid.names[i])), Boston)$coef[2]
    }

UniMulti <- data.frame(predictors = valid.names,
                       uni = UnivariateCoef,
                       multi = fitall$coefficients[-1])

library(ggrepel)
## Warning: package 'ggrepel' was built under R version 3.6.3
ggplot(UniMulti) +
        geom_point(aes(x = uni,
                       y = multi,
                       colour = ifelse(predictors %in%
                                               c("dis", "rad", "medv"),"sig", "not")),
                   show.legend = FALSE) +
        geom_text_repel(aes(x = uni, y = multi, label = predictors)) +
        theme_minimal() +
        labs(x = "Univariate regression coefficients",
             y = "Multiple regression coefficients")

  (d) Is there evidence of non-linear association between any of the predictors and the response? To answer this question, for each predictor X, fit a model of the form

\[Y = \beta_0 + \beta_1X + \beta_2X^2 + \beta_3X^3 + \epsilon\]

The predictor chas (the Charles River dummy variable) cannot be analysed with a cubic regression, since it takes only two values. The predictors for which the coefficients \(\hat\beta_1\), \(\hat\beta_2\) and \(\hat\beta_3\) all show a statistically significant relationship with the response are indus, nox, age, dis, ptratio and medv, which is evidence of non-linear association for these variables.

# poly() requires the degree to be smaller than the number of unique points, so chas (a 0/1 dummy) is excluded below.
for(i in (1:length(valid.names))[-3]){
        fitq15 <- lm(as.formula(paste("crim ~", "poly(", valid.names[i], ", 3)")), Boston)
        print(summary(fitq15))
       }
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.821 -4.614 -1.294  0.473 84.130 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    3.6135     0.3722   9.709  < 2e-16 ***
## poly(zn, 3)1 -38.7498     8.3722  -4.628  4.7e-06 ***
## poly(zn, 3)2  23.9398     8.3722   2.859  0.00442 ** 
## poly(zn, 3)3 -10.0719     8.3722  -1.203  0.22954    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.372 on 502 degrees of freedom
## Multiple R-squared:  0.05824,    Adjusted R-squared:  0.05261 
## F-statistic: 10.35 on 3 and 502 DF,  p-value: 1.281e-06
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.278 -2.514  0.054  0.764 79.713 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        3.614      0.330  10.950  < 2e-16 ***
## poly(indus, 3)1   78.591      7.423  10.587  < 2e-16 ***
## poly(indus, 3)2  -24.395      7.423  -3.286  0.00109 ** 
## poly(indus, 3)3  -54.130      7.423  -7.292  1.2e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.423 on 502 degrees of freedom
## Multiple R-squared:  0.2597, Adjusted R-squared:  0.2552 
## F-statistic: 58.69 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.110 -2.068 -0.255  0.739 78.302 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     3.6135     0.3216  11.237  < 2e-16 ***
## poly(nox, 3)1  81.3720     7.2336  11.249  < 2e-16 ***
## poly(nox, 3)2 -28.8286     7.2336  -3.985 7.74e-05 ***
## poly(nox, 3)3 -60.3619     7.2336  -8.345 6.96e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.234 on 502 degrees of freedom
## Multiple R-squared:  0.297,  Adjusted R-squared:  0.2928 
## F-statistic: 70.69 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -18.485  -3.468  -2.221  -0.015  87.219 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    3.6135     0.3703   9.758  < 2e-16 ***
## poly(rm, 3)1 -42.3794     8.3297  -5.088 5.13e-07 ***
## poly(rm, 3)2  26.5768     8.3297   3.191  0.00151 ** 
## poly(rm, 3)3  -5.5103     8.3297  -0.662  0.50858    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.33 on 502 degrees of freedom
## Multiple R-squared:  0.06779,    Adjusted R-squared:  0.06222 
## F-statistic: 12.17 on 3 and 502 DF,  p-value: 1.067e-07
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.762 -2.673 -0.516  0.019 82.842 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     3.6135     0.3485  10.368  < 2e-16 ***
## poly(age, 3)1  68.1820     7.8397   8.697  < 2e-16 ***
## poly(age, 3)2  37.4845     7.8397   4.781 2.29e-06 ***
## poly(age, 3)3  21.3532     7.8397   2.724  0.00668 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.84 on 502 degrees of freedom
## Multiple R-squared:  0.1742, Adjusted R-squared:  0.1693 
## F-statistic: 35.31 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.757  -2.588   0.031   1.267  76.378 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     3.6135     0.3259  11.087  < 2e-16 ***
## poly(dis, 3)1 -73.3886     7.3315 -10.010  < 2e-16 ***
## poly(dis, 3)2  56.3730     7.3315   7.689 7.87e-14 ***
## poly(dis, 3)3 -42.6219     7.3315  -5.814 1.09e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.331 on 502 degrees of freedom
## Multiple R-squared:  0.2778, Adjusted R-squared:  0.2735 
## F-statistic: 64.37 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.381  -0.412  -0.269   0.179  76.217 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     3.6135     0.2971  12.164  < 2e-16 ***
## poly(rad, 3)1 120.9074     6.6824  18.093  < 2e-16 ***
## poly(rad, 3)2  17.4923     6.6824   2.618  0.00912 ** 
## poly(rad, 3)3   4.6985     6.6824   0.703  0.48231    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.682 on 502 degrees of freedom
## Multiple R-squared:    0.4,  Adjusted R-squared:  0.3965 
## F-statistic: 111.6 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -13.273  -1.389   0.046   0.536  76.950 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     3.6135     0.3047  11.860  < 2e-16 ***
## poly(tax, 3)1 112.6458     6.8537  16.436  < 2e-16 ***
## poly(tax, 3)2  32.0873     6.8537   4.682 3.67e-06 ***
## poly(tax, 3)3  -7.9968     6.8537  -1.167    0.244    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.854 on 502 degrees of freedom
## Multiple R-squared:  0.3689, Adjusted R-squared:  0.3651 
## F-statistic:  97.8 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.833 -4.146 -1.655  1.408 82.697 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          3.614      0.361  10.008  < 2e-16 ***
## poly(ptratio, 3)1   56.045      8.122   6.901 1.57e-11 ***
## poly(ptratio, 3)2   24.775      8.122   3.050  0.00241 ** 
## poly(ptratio, 3)3  -22.280      8.122  -2.743  0.00630 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.122 on 502 degrees of freedom
## Multiple R-squared:  0.1138, Adjusted R-squared:  0.1085 
## F-statistic: 21.48 on 3 and 502 DF,  p-value: 4.171e-13
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -13.096  -2.343  -2.128  -1.439  86.790 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3.6135     0.3536  10.218   <2e-16 ***
## poly(black, 3)1 -74.4312     7.9546  -9.357   <2e-16 ***
## poly(black, 3)2   5.9264     7.9546   0.745    0.457    
## poly(black, 3)3  -4.8346     7.9546  -0.608    0.544    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.955 on 502 degrees of freedom
## Multiple R-squared:  0.1498, Adjusted R-squared:  0.1448 
## F-statistic: 29.49 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.234  -2.151  -0.486   0.066  83.353 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3.6135     0.3392  10.654   <2e-16 ***
## poly(lstat, 3)1  88.0697     7.6294  11.543   <2e-16 ***
## poly(lstat, 3)2  15.8882     7.6294   2.082   0.0378 *  
## poly(lstat, 3)3 -11.5740     7.6294  -1.517   0.1299    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.629 on 502 degrees of freedom
## Multiple R-squared:  0.2179, Adjusted R-squared:  0.2133 
## F-statistic: 46.63 on 3 and 502 DF,  p-value: < 2.2e-16
## 
## 
## Call:
## lm(formula = as.formula(paste("crim ~", "poly(", valid.names[i], 
##     ", 3)")), data = Boston)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.427  -1.976  -0.437   0.439  73.655 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3.614      0.292  12.374  < 2e-16 ***
## poly(medv, 3)1  -75.058      6.569 -11.426  < 2e-16 ***
## poly(medv, 3)2   88.086      6.569  13.409  < 2e-16 ***
## poly(medv, 3)3  -48.033      6.569  -7.312 1.05e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.569 on 502 degrees of freedom
## Multiple R-squared:  0.4202, Adjusted R-squared:  0.4167 
## F-statistic: 121.3 on 3 and 502 DF,  p-value: < 2.2e-16
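
As an optional cross-check (a sketch, not part of the answer above), the evidence of non-linearity for any single predictor can also be assessed with a partial F-test comparing the linear and cubic fits; medv is used here as an example:

# Compare nested models: linear vs. cubic in medv
fit.lin <- lm(crim ~ medv, data = Boston)
fit.cub <- lm(crim ~ poly(medv, 3), data = Boston)
anova(fit.lin, fit.cub)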