Practice Problem 1
a. Intercept t-value: -17.5791/6.7584 = -2.601
b. speed t-value: 3.9324/0.4155 = 9.464
c. Because the coefficient is flagged with *** and the t-value is large, we can conclude that the p-value for the t-statistic is nearly zero (below 0.001).
e. Multiple R-squared: 1 - (11354/32540) = 0.6510756
F-statistic = 85.833715
The F-statistic is flagged with *** and is large, which leads us to believe that its p-value is nearly zero.
h. speed Mean Sq: 21186
F-value: 21186/246.826087 = 85.833715
The F-statistic is flagged with *** and is large, which leads us to believe that its p-value is nearly zero.
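The hand calculations in Problem 1 can be reproduced with a few lines of R. This is a minimal sketch that uses only the numbers quoted above. The residual degrees of freedom are not stated in this write-up, so `df_resid` is an assumption recovered by dividing the residual sum of squares by the residual Mean Sq.

```r
# Values quoted in Problem 1
est <- c(intercept = -17.5791, speed = 3.9324)  # coefficient estimates
se  <- c(intercept = 6.7584,  speed = 0.4155)   # standard errors
rss <- 11354           # residual sum of squares
tss <- 32540           # total sum of squares
ms_resid <- 246.826087 # residual Mean Sq

# ASSUMPTION: residual degrees of freedom are inferred, not given in the text
df_resid <- round(rss / ms_resid)

# t-value = estimate / standard error, with a two-sided p-value
t_val <- est / se
p_t <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

# R-squared = 1 - RSS/TSS
r2 <- 1 - rss / tss

# F = (regression Mean Sq) / (residual Mean Sq); one predictor, so df1 = 1
ss_reg <- tss - rss
f_stat <- ss_reg / ms_resid
p_f <- pf(f_stat, df1 = 1, df2 = df_resid, lower.tail = FALSE)

print(round(t_val, 3))
print(signif(p_t, 3))
print(r2)
print(f_stat)
print(signif(p_f, 3))
```

Both p-values come out far below 0.001, which is what the *** flag indicates.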
Problem 2 a
Auto <- read.table("http://faculty.marshall.usc.edu/gareth-james/ISL/Auto.data",
header=TRUE,
na.strings = "?",
stringsAsFactors = FALSE)
lim.fit <- lm(mpg ~ horsepower, data = Auto)
summary(lim.fit)
##
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.5710 -3.2592 -0.3435 2.7630 16.9240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.935861 0.717499 55.66 <2e-16 ***
## horsepower -0.157845 0.006446 -24.49 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.906 on 390 degrees of freedom
## (5 observations deleted due to missingness)
## Multiple R-squared: 0.6059, Adjusted R-squared: 0.6049
## F-statistic: 599.7 on 1 and 390 DF, p-value: < 2.2e-16
The p-values are much lower than 0.05, which is strong evidence of statistical significance. This suggests that there is a relationship between the predictor and the response.
There seems to be a moderately strong relationship between the predictor and the response because the Multiple R-squared value is 0.6059 (adjusted 0.6049), so horsepower explains about 61% of the variation in mpg.
The regression coefficient for the predictor, horsepower, is negative (-0.157845). This leads us to believe that the relationship is negative: each additional unit of horsepower is associated with a decrease of about 0.158 in mpg.
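As a rough cross-check, the statistics in the summary output can be recomputed from one another. This is a minimal sketch that uses only the rounded values printed by `summary()`, so small rounding differences are expected. The sample size `n` is inferred from the 390 residual degrees of freedom plus the two estimated coefficients.

```r
# Rounded values copied from the summary() output
est_hp   <- -0.157845  # horsepower estimate
se_hp    <- 0.006446   # horsepower standard error
f_stat   <- 599.7      # reported F-statistic
df_resid <- 390        # residual degrees of freedom
n <- df_resid + 2      # intercept + slope were estimated

# t value = estimate / standard error
t_hp <- est_hp / se_hp

# With a single predictor, the F-statistic equals the squared t value
f_from_t <- t_hp^2

# With a single predictor, R-squared = F / (F + residual df)
r2 <- f_stat / (f_stat + df_resid)

# Adjusted R-squared penalizes for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Two-sided p-value for the slope
p_hp <- 2 * pt(abs(t_hp), df = df_resid, lower.tail = FALSE)

print(c(t = t_hp, F_from_t = f_from_t, R2 = r2, adj_R2 = adj_r2))
print(p_hp)
```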
confidence interval
predict(lim.fit, data.frame(horsepower = c(85)), interval = "confidence")
## fit lwr upr
## 1 26.51906 25.973 27.06512
prediction interval
predict(lim.fit, data.frame(horsepower = c(85)), interval = "prediction")
## fit lwr upr
## 1 26.51906 16.85857 36.17954
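The 95% prediction interval is wider than the 95% confidence interval because there is additional uncertainty when a new individual observation must be accounted for: the confidence interval only reflects uncertainty in the estimated mean mpg at horsepower = 85, while the prediction interval also includes the residual variation around that mean.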
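The difference between the two intervals can be sketched numerically from the output above. The standard error of the fitted mean is backed out of the confidence interval, and the residual variance is then added to recover the prediction interval. This is an illustrative reconstruction from rounded printed values, not a replacement for `predict()`.

```r
# Values copied from the predict() and summary() output
fit_85   <- 26.51906  # fitted mpg at horsepower = 85
ci_upr   <- 27.06512  # upper limit of the confidence interval
pi_upr   <- 36.17954  # upper limit of the prediction interval
sigma    <- 4.906     # residual standard error
df_resid <- 390       # residual degrees of freedom

# Critical t value for a two-sided 95% interval
t_crit <- qt(0.975, df = df_resid)

# Confidence interval half-width = t_crit * se(fitted mean)
ci_half <- ci_upr - fit_85
se_fit  <- ci_half / t_crit

# Prediction interval adds the residual variance of a new observation
pi_half <- t_crit * sqrt(se_fit^2 + sigma^2)

print(c(ci_half = ci_half, pi_half = pi_half, reported_pi_half = pi_upr - fit_85))
```

The reconstructed half-width agrees with the reported prediction interval up to rounding, and almost all of it comes from the residual standard error rather than from uncertainty in the fitted line.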
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.2.1 ✓ purrr 0.3.3
## ✓ tibble 2.1.3 ✓ dplyr 0.8.3
## ✓ tidyr 1.0.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.4.0
## ── Conflicts ───────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
plot(Auto$horsepower, Auto$mpg, xlab = "horsepower", ylab = "mpg")
abline(lim.fit, col = "red")
par(mfrow=c(2,2))
plot(lim.fit)
In the Residuals vs. Leverage plot we see some points with relatively high leverage, which suggests that there may be problems with the fit. The residual plot (Residuals vs. Fitted) also shows a strong curve, which suggests that the relationship may not be linear.
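The two diagnostics above can also be checked numerically. This is a hedged sketch, not the analysis from this write-up: it uses R's built-in `mtcars` data (`mpg ~ hp`) as a stand-in so that it runs without downloading `Auto`, and the `2 * p / n` leverage cutoff is a common rule of thumb rather than something taken from the problem.

```r
# ASSUMPTION: mtcars is a stand-in for Auto so the example is self-contained
fit_lin <- lm(mpg ~ hp, data = mtcars)

# Leverage: diagonal of the hat matrix, one value per observation
h <- hatvalues(fit_lin)
p <- length(coef(fit_lin))  # number of estimated coefficients
n <- nobs(fit_lin)          # number of observations used in the fit

# Rule-of-thumb cutoff (an assumption, not from the source): 2 * p / n
cutoff <- 2 * p / n
high_leverage <- names(h)[h > cutoff]
print(high_leverage)

# Curvature: compare the linear fit with one that adds a squared term
fit_quad <- lm(mpg ~ hp + I(hp^2), data = mtcars)
cmp <- anova(fit_lin, fit_quad)
print(cmp)
```

A small p-value in the `anova()` comparison would indicate that the squared term captures structure the straight line misses, which is the same signal as a curved residual plot.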