pf(89.56564089, 1, 48, lower.tail = FALSE)
## [1] 1.490221e-12
#####            t value    Pr(>|t|)
##### Intercept  -2.60107   0.0123
##### speed      9.46425    1.490221e-12
#####
##### Residual standard error: 15.38 on 48 degrees of freedom
##### Multiple R-squared: 0.6509555706, Adjusted R-squared: 0.6438
##### F-statistic: 89.56564089 on 1 and 48 DF, p-value: 1.490221e-12
#####
#####            Mean Sq    F value      Pr(>F)
##### speed      21186      89.56564089  1.490221e-12
##### Residuals  236.5416   N/A          N/A
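The p-value from pf() is the same as the p-value for the speed coefficient because, in simple linear regression, the F-statistic equals the square of the slope's t-statistic. The sketch below checks this. It assumes the figures above come from regressing dist on speed in R's built-in cars data set; the original model call is not shown, so this is a guess based on the variable name and the 48 degrees of freedom.

```r
# Assumption: the tables above summarise lm(dist ~ speed, data = cars).
fit <- lm(dist ~ speed, data = cars)
s <- summary(fit)

# F-statistic and its degrees of freedom as stored by summary.lm()
f_stat <- s$fstatistic
p_from_f <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
               lower.tail = FALSE)

# Two-sided p-value for the slope from its t-statistic
t_speed <- coef(s)["speed", "t value"]
p_from_t <- 2 * pt(abs(t_speed), df = fit$df.residual, lower.tail = FALSE)

# Both comparisons should hold up to floating-point tolerance
print(all.equal(f_stat[["value"]], t_speed^2))
print(all.equal(p_from_f, p_from_t))
```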
library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.2.1 ✓ purrr 0.3.3
## ✓ tibble 2.1.3 ✓ dplyr 0.8.3
## ✓ tidyr 1.0.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggplot2)
library(readxl)
Auto <- read_excel("Auto.xlsx")
str(Auto)
## Classes 'tbl_df', 'tbl' and 'data.frame': 397 obs. of 9 variables:
## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : num 8 8 8 8 8 8 8 8 8 8 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : num 130 165 150 150 140 198 220 215 225 190 ...
## $ weight : num 3504 3693 3436 3433 3449 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ year : num 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : num 1 1 1 1 1 1 1 1 1 1 ...
## $ name : chr "chevrolet chevelle malibu" "buick skylark 320" "plymouth satellite" "amc rebel sst" ...
ggplot(Auto, aes(x = horsepower, y = mpg)) +
  geom_jitter()
lmauto <- lm(mpg~horsepower, data = Auto)
summary(lmauto)
##
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.6299 -3.2719 -0.3331 2.7113 16.8586
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.061876 0.716607 55.91 <2e-16 ***
## horsepower -0.158777 0.006455 -24.60 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.924 on 395 degrees of freedom
## Multiple R-squared: 0.605, Adjusted R-squared: 0.604
## F-statistic: 605.1 on 1 and 395 DF, p-value: < 2.2e-16
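There is a relationship between mpg and horsepower: the slope is significantly different from zero (p < 2e-16).
The relationship is moderately strong: R-squared is 0.605, so horsepower explains about 60% of the variance in mpg.
The relationship is negative: the estimated slope is about -0.159 mpg per unit of horsepower.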
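In a simple linear regression, the sample correlation between predictor and response has magnitude sqrt(R-squared) and takes the sign of the slope. A minimal sketch using the rounded values reported in the summary output; the variable names here are illustrative and not part of the original analysis.

```r
# Values copied from the summary(lmauto) output (rounded as printed)
r_squared <- 0.605
slope <- -0.158777

# Correlation implied by a one-predictor regression
r <- sign(slope) * sqrt(r_squared)
print(round(r, 3))
```

If the Auto data frame is loaded, cor(Auto$mpg, Auto$horsepower) should give approximately the same value.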
newdata <- data.frame(horsepower=c(98))
predict(lmauto, newdata, interval = "confidence")
## fit lwr upr
## 1 24.50173 24.00949 24.99397
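The fitted value is the regression line evaluated at horsepower = 98. The sketch below reproduces it by hand from the coefficients printed in the summary. The interval above is a confidence interval for the mean mpg at that horsepower. predict() also accepts interval = "prediction" for a single new car, which is wider. That call is guarded here because it needs the lmauto object from the session.

```r
# Coefficients as printed by summary(lmauto)
b0 <- 40.061876   # intercept
b1 <- -0.158777   # slope for horsepower
hp <- 98

# Point prediction: intercept + slope * horsepower
fit_by_hand <- b0 + b1 * hp
print(round(fit_by_hand, 5))

# Only runs if the fitted model from the notebook is available
if (exists("lmauto")) {
  print(predict(lmauto, data.frame(horsepower = hp), interval = "prediction"))
}
```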
ggplot(Auto, aes(x = horsepower, y = mpg)) +
  geom_jitter() +
  geom_abline(slope = lmauto$coefficients[2], intercept = lmauto$coefficients[1],
              color = "blue", lty = 2, lwd = 1) +
  theme_bw()
plot(lmauto)
The first plot (residuals vs. fitted values) indicates a non-linear relationship between the predictor and the response, because the residuals follow a U-shaped pattern. However, the normal Q-Q plot suggests that the residuals are approximately normally distributed.
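The two diagnostics discussed above can be drawn on their own with the which argument of plot() for lm objects. This is a sketch only: it uses the built-in mtcars data (mpg regressed on hp) as a stand-in so that it runs without Auto.xlsx, and its plots will not match the Auto results.

```r
# Assumption: mtcars stands in for the Auto data, which is read from a local file.
fit <- lm(mpg ~ hp, data = mtcars)

# Show the two plots side by side, then restore the previous layout
op <- par(mfrow = c(1, 2))
plot(fit, which = 1)  # residuals vs. fitted: look for curvature
plot(fit, which = 2)  # normal Q-Q: look for departures from the line
par(op)
```

With the notebook's own model, replacing fit with lmauto gives the same two panels for the Auto data.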