p=pt(9.464259928, 48,lower.tail=FALSE)
2*p
## [1] 1.488495e-12
intercept t value: -2.60107; speed t value: 9.464259928; speed Pr(>|t|): 1.488e-12
Multiple R-squared: 0.65107
F statistic: 89.565 on 1 and 48 df, p-value: 1.490127e-12
degrees of freedom: 48
ANOVA: speed: MS = 21186, F = 89.566, p = 1.490127e-12; residual: MS = 236.541
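The t-test and F-test p-values above agree because, in simple linear regression with a single predictor, the F statistic equals the square of the slope's t statistic. A minimal check, using only the t value and degrees of freedom reported above (the variable names are illustrative, not part of the original work):

```r
# Slope t statistic and residual degrees of freedom, as reported above
t_speed <- 9.464259928
df_resid <- 48

# Two-sided p-value from the t distribution
p_t <- 2 * pt(abs(t_speed), df_resid, lower.tail = FALSE)

# Same test via the F distribution: F = t^2 on (1, df_resid) degrees of freedom
f_stat <- t_speed^2
p_f <- pf(f_stat, 1, df_resid, lower.tail = FALSE)

# The two p-values should match up to floating-point error
all.equal(p_t, p_f)
```

The small differences between the values listed above (89.565 vs. 89.566, 1.488e-12 vs. 1.490e-12) are small enough to be plausibly explained by rounding of intermediate values.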
auto=read.csv("auto.csv", header = TRUE, na.strings = "?")
auto=na.omit(auto)
mod=lm(auto$mpg~auto$horsepower)
summary(mod)
##
## Call:
## lm(formula = auto$mpg ~ auto$horsepower)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.5710 -3.2592 -0.3435 2.7630 16.9240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.935861 0.717499 55.66 <2e-16 ***
## auto$horsepower -0.157845 0.006446 -24.49 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.906 on 390 degrees of freedom
## Multiple R-squared: 0.6059, Adjusted R-squared: 0.6049
## F-statistic: 599.7 on 1 and 390 DF, p-value: < 2.2e-16
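ii) The relationship is moderately strong: R^2 is about 0.61, so horsepower explains roughly 61% of the variance in mpg.
iii) The relationship is negative: the slope estimate for horsepower is -0.157845.
iv) Predicted mpg at horsepower = 98: mpg = -0.157845(98) + 39.935861 = 24.467051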
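Answers ii) to iv) can be checked from numbers already printed in this document. A minimal sketch, assuming only the coefficient estimates from the summary(mod) output and the mpg-horsepower correlation from the cor(auto) output in part 3b (the variable names are illustrative):

```r
# Coefficient estimates copied from the summary(mod) output
b0 <- 39.935861   # intercept
b1 <- -0.157845   # slope for horsepower

# Sample correlation between mpg and horsepower, copied from cor(auto)
r <- -0.7784268

# In simple linear regression, R-squared is the squared correlation
r_squared <- r^2
round(r_squared, 4)

# The sign of the slope (and of r) gives the direction of the relationship
sign(b1)

# Point prediction of mpg at horsepower = 98
mpg_hat <- b0 + b1 * 98
mpg_hat
```

If the model were instead fitted as lm(mpg ~ horsepower, data = auto), the same prediction could likely be obtained with predict(mod, data.frame(horsepower = 98)); the auto$ form of the formula used above does not combine well with the newdata argument.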
plot(auto$horsepower, auto$mpg)
abline(a = 39.935861, b = -0.157845, col = "blue")
#2c The relationship between horsepower and mpg does not appear to be completely linear, so using the SLR fit to predict values would produce error.
plot(mod)
#3a
auto<-auto[,-c(8:9)]
attach(auto)
pairs(auto)
#b
cor(auto)
## mpg cylinders displacement horsepower weight
## mpg 1.0000000 -0.7776175 -0.8051269 -0.7784268 -0.8322442
## cylinders -0.7776175 1.0000000 0.9508233 0.8429834 0.8975273
## displacement -0.8051269 0.9508233 1.0000000 0.8972570 0.9329944
## horsepower -0.7784268 0.8429834 0.8972570 1.0000000 0.8645377
## weight -0.8322442 0.8975273 0.9329944 0.8645377 1.0000000
## acceleration 0.4233285 -0.5046834 -0.5438005 -0.6891955 -0.4168392
## year 0.5805410 -0.3456474 -0.3698552 -0.4163615 -0.3091199
## acceleration year
## mpg 0.4233285 0.5805410
## cylinders -0.5046834 -0.3456474
## displacement -0.5438005 -0.3698552
## horsepower -0.6891955 -0.4163615
## weight -0.4168392 -0.3091199
## acceleration 1.0000000 0.2903161
## year 0.2903161 1.0000000
#c
mlr_mod=lm(mpg ~ cylinders+displacement+horsepower+weight+acceleration+year, data=auto)
summary(mlr_mod)
##
## Call:
## lm(formula = mpg ~ cylinders + displacement + horsepower + weight +
## acceleration + year, data = auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.6927 -2.3864 -0.0801 2.0291 14.3607
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.454e+01 4.764e+00 -3.051 0.00244 **
## cylinders -3.299e-01 3.321e-01 -0.993 0.32122
## displacement 7.678e-03 7.358e-03 1.044 0.29733
## horsepower -3.914e-04 1.384e-02 -0.028 0.97745
## weight -6.795e-03 6.700e-04 -10.141 < 2e-16 ***
## acceleration 8.527e-02 1.020e-01 0.836 0.40383
## year 7.534e-01 5.262e-02 14.318 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.435 on 385 degrees of freedom
## Multiple R-squared: 0.8093, Adjusted R-squared: 0.8063
## F-statistic: 272.2 on 6 and 385 DF, p-value: < 2.2e-16
#d
Y<-as.matrix(mpg)
head(Y)
## [,1]
## [1,] 18
## [2,] 15
## [3,] 18
## [4,] 16
## [5,] 17
## [6,] 15
dim(Y)
## [1] 392 1
n=dim(Y)[1]
X<-matrix(c(rep(1, n),
cylinders,
displacement,horsepower,weight,acceleration,year),
ncol=7,
byrow=FALSE)
head(X)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 8 307 130 3504 12.0 70
## [2,] 1 8 350 165 3693 11.5 70
## [3,] 1 8 318 150 3436 11.0 70
## [4,] 1 8 304 150 3433 12.0 70
## [5,] 1 8 302 140 3449 10.5 70
## [6,] 1 8 429 198 4341 10.0 70
dim(X)
## [1] 392 7
betaHat<-solve(t(X)%*%X)%*%t(X)%*%Y
betaHat
## [,1]
## [1,] -1.453525e+01
## [2,] -3.298591e-01
## [3,] 7.678430e-03
## [4,] -3.913556e-04
## [5,] -6.794618e-03
## [6,] 8.527325e-02
## [7,] 7.533672e-01
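The matrix computation in part d is the closed-form least-squares solution betaHat = (X'X)^(-1) X'Y, which is why its entries match the coefficient estimates from summary(mlr_mod). A minimal self-contained sketch of the same check, assuming R's built-in mtcars data as a stand-in because auto.csv is not bundled here; the choice of predictors and the names fit and beta_hat are illustrative:

```r
# Stand-in data: built-in mtcars (an assumption; the document uses auto.csv)
fit <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)

# Design matrix with an intercept column, built from the same formula
X <- model.matrix(mpg ~ cyl + disp + hp + wt, data = mtcars)
Y <- as.matrix(mtcars$mpg)

# Solve the normal equations (X'X) beta = X'Y without forming the inverse
beta_hat <- solve(crossprod(X), crossprod(X, Y))

# Compare against lm(); the largest absolute difference should be near zero
max(abs(as.vector(beta_hat) - unname(coef(fit))))
```

Calling solve(A, b) is generally a little more stable numerically than computing solve(A) %*% b, although both give the same answer for well-conditioned problems like this one.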
#e
plot(mlr_mod)
In the diagnostic plots, some residuals become quite large as the fitted values approach 35, which suggests that these observations have a substantial influence on the fitted relationship.