Part II Problem 1

pf(89.56564089, 1, 48, lower.tail = FALSE)
## [1] 1.490221e-12
#####             t value     Pr(>|t|)
##### Intercept   -2.60107    0.0123
##### speed        9.46425    1.490221e-12

##### Residual standard error: 15.38 on 48 degrees of freedom
##### Multiple R-squared: 0.6509555706, Adjusted R-squared: 0.6438
##### F-statistic: 89.56564089 on 1 and 48 DF, p-value: 1.490221e-12



#####             Mean Sq      F value        Pr(>F)
##### speed       21186        89.56564089    1.490221e-12
##### Residuals   236.5416     N/A            N/A
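
The entries in the tables above are tied together by a few identities for simple linear regression: the F statistic is the ratio of the two mean squares, the residual standard error is the square root of the residual mean square, and the two-sided p-value for a t statistic comes from `pt`. A minimal sketch that rechecks them from the reported values (the variable names are illustrative, and small discrepancies are expected because the inputs are rounded):

```r
# Values copied from the tables above; the names are illustrative only.
df_resid <- 48
ms_speed <- 21186
ms_resid <- 236.5416
t_int    <- -2.60107
f_stat   <- 89.56564089

# F statistic = Mean Sq (speed) / Mean Sq (Residuals)
f_from_ms <- ms_speed / ms_resid

# Residual standard error = sqrt(Mean Sq of the residuals)
rse <- sqrt(ms_resid)

# Two-sided p-value for the intercept's t statistic
p_int <- 2 * pt(-abs(t_int), df = df_resid)

# Upper-tail p-value for the overall F test (same call as in the source)
p_f <- pf(f_stat, 1, df_resid, lower.tail = FALSE)

print(c(F = f_from_ms, RSE = rse, p_intercept = p_int, p_F = p_f))
```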

Problem 2

library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.2.1     ✓ purrr   0.3.3
## ✓ tibble  2.1.3     ✓ dplyr   0.8.3
## ✓ tidyr   1.0.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.4.0
## ── Conflicts ─────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggplot2)
library(readxl)

Auto <- read_excel("Auto.xlsx")
str(Auto)
## Classes 'tbl_df', 'tbl' and 'data.frame':    397 obs. of  9 variables:
##  $ mpg         : num  18 15 18 16 17 15 14 14 14 15 ...
##  $ cylinders   : num  8 8 8 8 8 8 8 8 8 8 ...
##  $ displacement: num  307 350 318 304 302 429 454 440 455 390 ...
##  $ horsepower  : num  130 165 150 150 140 198 220 215 225 190 ...
##  $ weight      : num  3504 3693 3436 3433 3449 ...
##  $ acceleration: num  12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
##  $ year        : num  70 70 70 70 70 70 70 70 70 70 ...
##  $ origin      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ name        : chr  "chevrolet chevelle malibu" "buick skylark 320" "plymouth satellite" "amc rebel sst" ...
ggplot(Auto, aes(x = horsepower, y = mpg)) +
  geom_jitter()

lmauto <- lm(mpg~horsepower, data = Auto)
summary(lmauto)
## 
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.6299  -3.2719  -0.3331   2.7113  16.8586 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 40.061876   0.716607   55.91   <2e-16 ***
## horsepower  -0.158777   0.006455  -24.60   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.924 on 395 degrees of freedom
## Multiple R-squared:  0.605,  Adjusted R-squared:  0.604 
## F-statistic: 605.1 on 1 and 395 DF,  p-value: < 2.2e-16
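  1. There is a negative correlation between mpg and horsepower; the slope is highly significant (p < 2e-16).

  2. The relationship is moderately strong: horsepower explains about 60.5% of the variance in mpg (R-squared = 0.605).

  3. The relationship is negative: each additional unit of horsepower is associated with a decrease of about 0.159 mpg.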
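
The strength and direction stated in these answers can be read off the regression summary: in simple linear regression the sample correlation equals the square root of R-squared, carrying the sign of the slope. A small sketch using the values printed by `summary(lmauto)`, hard-coded here so that it runs without the data file:

```r
r_squared <- 0.605        # Multiple R-squared from the summary
slope     <- -0.158777    # horsepower coefficient from the summary

# Correlation between mpg and horsepower implied by the fit
r <- sign(slope) * sqrt(r_squared)
print(round(r, 3))
```

With the data loaded, `cor(Auto$mpg, Auto$horsepower)` should give the same value up to rounding.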

newdata <- data.frame(horsepower=c(98))
predict(lmauto, newdata, interval = "confidence")
##        fit      lwr      upr
## 1 24.50173 24.00949 24.99397
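
The fitted value is the regression line evaluated at horsepower = 98. The sketch below reproduces it from the printed coefficients and backs out the standard error of the fitted mean from the width of the interval; the variable names are illustrative.

```r
# Coefficients and residual degrees of freedom from summary(lmauto)
b0 <- 40.061876
b1 <- -0.158777
df_resid <- 395

# Point estimate at horsepower = 98
fit <- b0 + b1 * 98

# Interval bounds reported by predict(); half-width = t quantile * SE
lwr <- 24.00949
upr <- 24.99397
half_width <- (upr - lwr) / 2
se_fit <- half_width / qt(0.975, df_resid)

print(c(fit = fit, se_fit = se_fit))
```

This interval is for the mean mpg of all cars with 98 horsepower. For a single car, `interval = "prediction"` in `predict()` gives the wider prediction interval.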
ggplot(Auto, aes(x = horsepower, y = mpg)) +
  geom_jitter() +
  geom_abline(slope = coef(lmauto)[2], intercept = coef(lmauto)[1],
              color = "blue", lty = 2, lwd = 1) +
  theme_bw()

plot(lmauto)

The first diagnostic plot (Residuals vs Fitted) indicates a non-linear relationship between the predictor and the response, because the residuals follow a U-shaped pattern. However, the residuals are approximately normally distributed.
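
One common response to a U-shaped residual pattern is to add a quadratic term and compare the two fits. The sketch below uses R's built-in `mtcars` data (columns `mpg` and `hp`) as a stand-in, so that it runs without `Auto.xlsx`; that substitution and the model names are assumptions, not part of the original analysis.

```r
# Stand-in data: built-in mtcars (mpg, hp); NOT the Auto data set.
fit_lin  <- lm(mpg ~ hp, data = mtcars)
fit_quad <- lm(mpg ~ hp + I(hp^2), data = mtcars)

# Nested-model F test: does the squared term improve the fit?
print(anova(fit_lin, fit_quad))

# R-squared of each model
print(c(linear    = summary(fit_lin)$r.squared,
        quadratic = summary(fit_quad)$r.squared))

# Formal check of residual normality to complement the Q-Q plot
print(shapiro.test(resid(fit_quad)))
```

For the analysis here, the analogous call would be `lm(mpg ~ horsepower + I(horsepower^2), data = Auto)`, followed by `plot()` on the new fit to see whether the curvature in the residuals disappears.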