Home Work 5 - Alex Matteson
Problem 2
a.)
i.
library(tidyverse)
Auto <- read.table("http://faculty.marshall.usc.edu/gareth-james/ISL/Auto.data",
                   header = TRUE,
                   na.strings = "?")
mod <- lm(mpg ~ horsepower, Auto)
summary(mod)
Call:
lm(formula = mpg ~ horsepower, data = Auto)
Residuals:
     Min       1Q   Median       3Q      Max
-13.5710  -3.2592  -0.3435   2.7630  16.9240
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.935861   0.717499   55.66   <2e-16 ***
horsepower  -0.157845   0.006446  -24.49   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.906 on 390 degrees of freedom
(5 observations deleted due to missingness)
Multiple R-squared: 0.6059, Adjusted R-squared: 0.6049
F-statistic: 599.7 on 1 and 390 DF, p-value: < 2.2e-16
plot(Auto$horsepower, Auto$mpg)
abline(mod)

It appears that the response and predictor variables are related: the horsepower coefficient is significant (p-value < 2e-16). The relationship may not be linear, though.
ii.
There seems to be a fairly strong negative relationship between the two variables. The R^2 value is 0.6059, so horsepower explains about 61% of the variation in mpg.
iii.
The relationship is negative: the slope estimate is -0.1578, and in the plot mpg falls as horsepower rises.
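As a quick check on how R^2 and the slope relate: in a regression with a single predictor, R^2 equals the squared sample correlation between the predictor and the response, while the direction of the relationship comes from the sign of the slope. The sketch below assumes a small simulated data frame called `toy` (a hypothetical stand-in for Auto, so it runs without a download); the numbers it produces are not the Auto results.

```r
# Simulated stand-in for the Auto data (hypothetical values, not the real data set)
set.seed(1)
toy <- data.frame(horsepower = runif(100, 50, 200))
toy$mpg <- 40 - 0.15 * toy$horsepower + rnorm(100, sd = 4)

toy_mod <- lm(mpg ~ horsepower, data = toy)

# R^2 as reported by summary()
r2 <- summary(toy_mod)$r.squared

# Squared sample correlation between predictor and response
r2_from_cor <- cor(toy$horsepower, toy$mpg)^2

# In simple linear regression the two agree up to floating-point error
all.equal(r2, r2_from_cor)

# R^2 carries no sign; the direction of the relationship is the sign of the slope
coef(toy_mod)["horsepower"]
```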
iv.
newdata <- data.frame(horsepower = 98)
predict(mod, newdata)
1
24.46708
confint(mod)
2.5 % 97.5 %
(Intercept) 38.525212 41.3465103
horsepower -0.170517 -0.1451725
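Note that confint(mod) gives intervals for the coefficients, not for the predicted mpg. For intervals around the prediction at horsepower = 98, predict() takes an interval argument. A minimal sketch, again assuming the simulated `toy` data frame (hypothetical, standing in for Auto), so the printed numbers will differ from the Auto fit:

```r
# Simulated stand-in for the Auto data (hypothetical values, not the real data set)
set.seed(1)
toy <- data.frame(horsepower = runif(100, 50, 200))
toy$mpg <- 40 - 0.15 * toy$horsepower + rnorm(100, sd = 4)
toy_mod <- lm(mpg ~ horsepower, data = toy)

new_car <- data.frame(horsepower = 98)

# 95% confidence interval: uncertainty about the MEAN mpg at horsepower = 98
conf_int <- predict(toy_mod, new_car, interval = "confidence", level = 0.95)

# 95% prediction interval: uncertainty about ONE new car at horsepower = 98
pred_int <- predict(toy_mod, new_car, interval = "prediction", level = 0.95)

# Both are one-row matrices with columns fit, lwr, upr
conf_int
pred_int
```

Both intervals are centred on the same fitted value. The prediction interval is wider because it also includes the residual variance of an individual observation.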
confBand <- predict(mod, interval = "confidence")
predBand <- predict(mod, interval = "prediction")
predictions on current data refer to _future_ responses
nrow(confBand)
[1] 392
nrow(predBand)
[1] 392
nrow(Auto)
[1] 397
colnames(predBand)<-c("fit2", "lwr2", "upr2")
newDF<-cbind(Auto, confBand, predBand)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 397, 392
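The predicted mpg at horsepower = 98 is 24.467. When I try to combine the confidence and prediction bands with the data, cbind fails because the bands have 392 rows while the Auto set has 397. The summary output reports "5 observations deleted due to missingness", which accounts for the gap: lm dropped the 5 rows with missing values before fitting, so the bands only cover the 392 rows that were actually used.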
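One way around the row mismatch is to drop the incomplete rows first, so the data frame and the bands describe the same observations. The sketch below is only an illustration under assumptions: `toy` is a simulated stand-in for Auto with a few missing horsepower values injected by hand, and the names `toy_cc`, `conf_band`, `pred_band`, and `new_df` are hypothetical.

```r
# Simulated stand-in for the Auto data (hypothetical values, not the real data set)
set.seed(1)
toy <- data.frame(horsepower = runif(100, 50, 200))
toy$mpg <- 40 - 0.15 * toy$horsepower + rnorm(100, sd = 4)
toy$horsepower[c(3, 27, 60)] <- NA  # inject missing predictors, like the "?" entries

# Keep only the rows lm() can actually use
toy_cc <- toy[complete.cases(toy[, c("mpg", "horsepower")]), ]

toy_mod <- lm(mpg ~ horsepower, data = toy_cc)

# Bands now have one row per row of toy_cc
conf_band <- predict(toy_mod, interval = "confidence")
# This call warns that predictions on current data refer to future responses
pred_band <- predict(toy_mod, interval = "prediction")
colnames(pred_band) <- c("fit2", "lwr2", "upr2")

# Row counts match, so cbind() succeeds
new_df <- cbind(toy_cc, conf_band, pred_band)
nrow(toy_cc)
nrow(new_df)
```

With the real data, the same idea would mean subsetting Auto to its complete rows before fitting and binding.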
b.)
plot(Auto$horsepower, Auto$mpg)
abline(mod)

c.)
plot(mod)




qqnorm(mod$residuals)
qqline(mod$residuals)

It looks like the tails are fairly heavy on the normal Q-Q plots. The residuals vs fitted plot also seems to show a pattern, which suggests there is structure in the relationship that our model fails to take into account. This is consistent with the earlier remark that the relationship may not be linear.
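A common way to probe a pattern like this is to add a squared term and see whether the fit improves. This is a sketch under assumptions, not a result for the Auto data: `toy` is simulated with a deliberately curved relationship (hypothetical coefficients), and `lin_mod` and `quad_mod` are illustrative names.

```r
# Simulated data with built-in curvature (hypothetical values, not the real data set)
set.seed(1)
toy <- data.frame(horsepower = runif(200, 50, 200))
toy$mpg <- 60 - 0.5 * toy$horsepower + 0.0013 * toy$horsepower^2 +
  rnorm(200, sd = 4)

# Straight-line fit versus a fit with an added squared term
lin_mod  <- lm(mpg ~ horsepower, data = toy)
quad_mod <- lm(mpg ~ horsepower + I(horsepower^2), data = toy)

# Compare R^2; a clear gain suggests the line was missing curvature
summary(lin_mod)$r.squared
summary(quad_mod)$r.squared

# F-test for whether the squared term improves on the linear model
cmp <- anova(lin_mod, quad_mod)
cmp

# Residuals vs fitted for the quadratic model, to check for remaining pattern
plot(fitted(quad_mod), resid(quad_mod))
abline(h = 0, lty = 2)
```

If the residuals vs fitted plot for the larger model still shows structure, a transformation of the response or predictor might be worth trying instead.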