In collaboration with Sam and Bryce

Problem 1

Coefficients

# t-value of the intercept
-17.5791/6.7584
## [1] -2.601074
# t-value of speed
3.9324/.4155
## [1] 9.46426
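# p-value of the intercept
pt(-17.5791/6.7584, df=48, lower.tail =  TRUE)*2
## [1] 0.01231831
# p-value of speed (df = 48, the residual degrees of freedom, not 1)
pt(3.9324/.4155, df=48, lower.tail = FALSE)*2
# evaluates to roughly 1.49e-12, far below 0.05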
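
The two p-values above follow the same recipe: divide the estimate by its standard error, then double the tail area of a t distribution with the residual degrees of freedom. A minimal sketch of that recipe is below; `two_sided_p` is a hypothetical helper name, not something from the assignment. Using `abs()` means the same call works for negative and positive t-values.

```r
# Hypothetical helper: two-sided p-value for a regression coefficient
two_sided_p <- function(estimate, std_error, df) {
  t_value <- estimate / std_error
  # abs() lets us always use the upper tail, whatever the sign of t
  2 * pt(abs(t_value), df = df, lower.tail = FALSE)
}

p_intercept <- two_sided_p(-17.5791, 6.7584, df = 48)
p_speed     <- two_sided_p(3.9324, 0.4155, df = 48)
c(intercept = p_intercept, speed = p_speed)
```

The intercept value should agree with the 0.01231831 computed by hand above, since the t distribution is symmetric.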
# R^2 = SS_reg / (SS_reg + SS_res)
21186/(21186 + 11354)
## [1] 0.6510756

Residual standard error: 15.38 on 48 degrees of freedom

Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438

F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12

Analysis of Variance Table

# Mean square, regression (SS_reg / 1)
21186/1
## [1] 21186
# Mean square, residual (SS_res / residual df)
11354/48
## [1] 236.5417
# F-statistic = MS_reg / MS_res = 21186/236.5417, about 89.57
# P-value
pf(21186/(11354/48), 1, 48, lower.tail = FALSE)
# evaluates to roughly 1.49e-12
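
One way to check the hand calculations in Problem 1 is to rebuild them from a fitted model. The sketch below assumes the numbers come from R's built-in `cars` data with `dist ~ speed`; the coefficients and degrees of freedom appear to match, but that is an assumption, not something stated in the problem.

```r
# Assumption: Problem 1 is based on the built-in `cars` data (dist ~ speed)
fit <- lm(dist ~ speed, data = cars)
aov_tab <- anova(fit)

ss_reg <- aov_tab["speed", "Sum Sq"]       # regression sum of squares
ss_res <- aov_tab["Residuals", "Sum Sq"]   # residual sum of squares
df_res <- aov_tab["Residuals", "Df"]       # residual degrees of freedom

r_squared <- ss_reg / (ss_reg + ss_res)    # share of total variation explained
f_stat    <- (ss_reg / 1) / (ss_res / df_res)
p_value   <- pf(f_stat, 1, df_res, lower.tail = FALSE)

c(r_squared = r_squared, f_stat = f_stat, p_value = p_value)
```

Comparing these against `summary(fit)` shows whether each entry in the table was computed from the right sum of squares and degrees of freedom.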

Problem 2

Auto <- read.table("http://faculty.marshall.usc.edu/gareth-james/ISL/Auto.data", 
                   header=TRUE,
                   na.strings = "?")
library(tidyverse)

Part A

mod<- lm(mpg~horsepower, data=Auto)
summary(mod)
## 
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.5710  -3.2592  -0.3435   2.7630  16.9240 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 39.935861   0.717499   55.66   <2e-16 ***
## horsepower  -0.157845   0.006446  -24.49   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.906 on 390 degrees of freedom
##   (5 observations deleted due to missingness)
## Multiple R-squared:  0.6059, Adjusted R-squared:  0.6049 
## F-statistic: 599.7 on 1 and 390 DF,  p-value: < 2.2e-16

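There is a moderately strong negative relationship between horsepower and mpg. Each additional unit of horsepower is associated with a decrease of about 0.158 mpg, and horsepower explains about 61% of the variation in mpg (R-squared = 0.6059).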
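
To put a number on the strength of the relationship: in a regression with one predictor, the correlation is the square root of R-squared with the sign of the slope. A quick sketch using the values printed in the summary (the variable names are just for illustration):

```r
# Values copied from the summary(mod) output above
r_squared <- 0.6059
slope     <- -0.157845

# Correlation between horsepower and mpg: sqrt(R^2), signed like the slope
r <- sign(slope) * sqrt(r_squared)
round(r, 3)
```

This gives a correlation of about -0.78.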

predict(mod , data.frame(horsepower=98), interval= "confidence")
##        fit      lwr      upr
## 1 24.46708 23.97308 24.96108
predict(mod, data.frame(horsepower=98), interval= "prediction")
##        fit     lwr      upr
## 1 24.46708 14.8094 34.12476
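
Both intervals are centered on the same fitted value, but the prediction interval is much wider because it also includes the scatter of an individual car around the line. The sketch below rebuilds both intervals from the textbook formulas. It uses the built-in `mtcars` data (`mpg ~ hp`) as a stand-in so that it runs without downloading anything, so the numbers will not match the Auto output above.

```r
# Stand-in data: built-in mtcars (mpg ~ hp), NOT the Auto data used above
fit <- lm(mpg ~ hp, data = mtcars)
x  <- mtcars$hp
x0 <- 98                                   # same horsepower value as in Part A
n  <- length(x)

y_hat  <- sum(coef(fit) * c(1, x0))        # fitted value at x0
s      <- summary(fit)$sigma               # residual standard error
t_crit <- qt(0.975, df = n - 2)            # 95% two-sided critical value
lev    <- 1 / n + (x0 - mean(x))^2 / sum((x - mean(x))^2)

conf_int <- y_hat + c(-1, 1) * t_crit * s * sqrt(lev)      # mean response
pred_int <- y_hat + c(-1, 1) * t_crit * s * sqrt(1 + lev)  # one new car

rbind(confidence = conf_int, prediction = pred_int)
```

The only difference between the two lines is the extra `1 +` under the square root, which is the variance of a single new observation.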

Part B

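ggplot(Auto, aes(x=horsepower, y=mpg))+
  geom_point()+
  # regression line: slope and intercept taken from the fitted model
  geom_abline(slope=coef(mod)[["horsepower"]], intercept=coef(mod)[["(Intercept)"]],
              linetype=2, color="blue")+
  theme_bw()
## Warning: Removed 5 rows containing missing values (geom_point).

The abline needs horsepower on the x-axis, because the model is mpg ~ horsepower. With the axes swapped, the fitted slope and intercept do not line up with the points. The slope and intercept come from coef(mod).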

Sam showed me how she did it using plot rather than ggplot, so I went ahead and tried that as well.

plot(Auto$horsepower, Auto$mpg, xlab= "Horsepower", ylab= "MPG", col= "blue")
# abline() is a separate call; base graphics are not chained with +
abline(mod, col="green")

Part C

plot(mod)
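
Calling `plot()` on a fitted model draws four diagnostic plots one after another. A small sketch of showing them together on one page is below, again with the built-in `mtcars` data (`mpg ~ hp`) as a stand-in for Auto, plus a look at the leverage values behind the fourth plot. The "twice the average leverage" cutoff is a common rule of thumb, not something from the assignment.

```r
# Stand-in data: built-in mtcars (mpg ~ hp), NOT the Auto data used above
fit <- lm(mpg ~ hp, data = mtcars)

op <- par(mfrow = c(2, 2))   # 2 x 2 grid so all four plots fit on one page
plot(fit)
par(op)                      # restore the previous layout

# Leverage values; flag points above twice the average leverage
lev <- hatvalues(fit)
high_leverage <- names(lev)[lev > 2 * mean(lev)]
high_leverage
```

A curved pattern in the residuals vs fitted plot would suggest that a straight line does not fully describe the relationship.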