In this learning log I will be working with R's built-in mtcars dataset.
data("mtcars")
attach(mtcars)
I will fit a multiple linear regression model for quarter mile time (qsec, in seconds). The predictors are gross horsepower (hp) and weight (wt, in thousands of pounds).
quartMile <- lm(qsec ~ hp + wt)
quartMile
##
## Call:
## lm(formula = qsec ~ hp + wt)
##
## Coefficients:
## (Intercept)           hp           wt
##    18.82559     -0.02731      0.94153
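Written out as an equation, the coefficients above give the following fitted model (rounded, with hp in gross horsepower and wt in thousands of pounds):

$$\widehat{\text{qsec}} = 18.83 - 0.02731\,\text{hp} + 0.9415\,\text{wt}$$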
I will be conducting a hypothesis test on weight at a 0.05 level of significance.
Null hypothesis: After accounting for gross horsepower, there is no linear relationship between quarter mile time and weight.
Alternative hypothesis: After accounting for gross horsepower, there is a linear relationship between quarter mile time and weight.
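In symbols, if we write the model as $\text{qsec} = \beta_0 + \beta_1\,\text{hp} + \beta_2\,\text{wt} + \varepsilon$ (the $\beta$ notation is introduced here for illustration and does not appear in the R output), the two hypotheses are:

$$H_0: \beta_2 = 0 \qquad \text{versus} \qquad H_a: \beta_2 \neq 0$$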
summary(quartMile)
##
## Call:
## lm(formula = qsec ~ hp + wt)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.8283 -0.4055 -0.1464  0.3519  3.7030
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.825585   0.671867  28.020  < 2e-16 ***
## hp          -0.027310   0.003795  -7.197 6.36e-08 ***
## wt           0.941532   0.265897   3.541  0.00137 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.09 on 29 degrees of freedom
## Multiple R-squared: 0.652, Adjusted R-squared: 0.628
## F-statistic: 27.17 on 2 and 29 DF, p-value: 2.251e-07
From the summary of the model, the test statistic for weight is t = 3.541. Under the null hypothesis it follows a t-distribution with 29 degrees of freedom: there are 32 observations in the dataset (n = 32) and the model has two predictors (k = 2), so the degrees of freedom are n - k - 1 = 32 - 2 - 1 = 29. The two-sided p-value is 0.00137.
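As a check on the numbers read off the summary, the test statistic and p-value can be reproduced by hand. This is a minimal sketch: the names `fit`, `wt_row`, `t_stat`, `df_resid`, and `p_value` are introduced here for illustration, and the model is refit with `data = mtcars` so the block runs without `attach()`.

```r
# Refit the same model; data = mtcars replaces the earlier attach() call
fit <- lm(qsec ~ hp + wt, data = mtcars)

# Row of the coefficient table for weight
wt_row <- coef(summary(fit))["wt", ]

# t statistic = estimate / standard error
t_stat <- unname(wt_row["Estimate"] / wt_row["Std. Error"])

# Residual degrees of freedom: n - k - 1
df_resid <- nrow(mtcars) - 2 - 1

# Two-sided p-value from the t-distribution
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

round(c(t = t_stat, df = df_resid, p = p_value), 5)
```

The result should agree with the `wt` row of the summary output to the printed precision.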
Since our p-value of 0.00137 is less than our 0.05 level of significance, we reject the null hypothesis in favor of the alternative. There is strong evidence to suggest that after we account for gross horsepower, there is a linear relationship between quarter mile time and weight.
weightInt <- confint(quartMile, level = 0.95)
weightInt
##                   2.5 %      97.5 %
## (Intercept) 17.45146289 20.19970760
## hp          -0.03507046 -0.01954879
## wt           0.39771199  1.48535274
The output gives the lower and upper bounds of a 95% confidence interval for each coefficient. The interval for weight is (0.39771199, 1.48535274).
We are 95% confident that, for any given gross horsepower rating, each additional 1000 pounds of weight is associated with an increase in mean quarter mile time of between 0.3977 and 1.4854 seconds.
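The interval for weight can also be rebuilt from the estimate, its standard error, and the t critical value, which shows where the bounds come from. This is a sketch under the same assumptions as the model above; `fit`, `est`, `se`, `t_crit`, and `wt_ci` are illustrative names.

```r
# Refit the same model without relying on attach()
fit <- lm(qsec ~ hp + wt, data = mtcars)

# Estimate and standard error for the weight coefficient
est <- coef(summary(fit))["wt", "Estimate"]
se  <- coef(summary(fit))["wt", "Std. Error"]

# Critical value for a 95% interval with 29 residual degrees of freedom
t_crit <- qt(0.975, df = df.residual(fit))

# estimate +/- critical value * standard error
wt_ci <- c(lower = est - t_crit * se, upper = est + t_crit * se)
wt_ci
```

The two bounds should match the `wt` row returned by `confint()`.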
Let’s look at one specific vehicle in this dataset, the Mercedes-Benz 450SE. This vehicle is the 12th observation in the dataset.
mtcars[12, ]
##             mpg cyl  disp  hp drat   wt qsec vs am gear carb
## Merc 450SE 16.4   8 275.8 180 3.07 4.07 17.4  0  0    3    3
The gross horsepower for the 450SE is 180, and its weight is 4070 pounds (wt = 4.07, since weight is recorded in thousands of pounds). We then create a new data frame with these values to use for our prediction interval.
newdata <- data.frame(hp = 180, wt = 4.07)
Creating the prediction interval:
PI <- predict(quartMile, newdata, interval = "prediction")
PI
##        fit      lwr      upr
## 1 17.74189 15.45114 20.03264
Our predicted quarter mile time given the observed gross horsepower and weight of the 450SE is 17.7419 seconds. This is the center of the prediction interval. The 95% prediction interval is (15.45114, 20.03264).
We are 95% confident that the quarter mile time for an individual car weighing 4070 pounds and producing 180 gross horsepower will be between 15.45114 seconds and 20.03264 seconds.
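The fitted value at the center of the interval is just the regression equation evaluated at hp = 180 and wt = 4.07. A minimal sketch of that arithmetic follows; `fit`, `b`, `manual_fit`, and `observed` are illustrative names.

```r
# Refit the same model without relying on attach()
fit <- lm(qsec ~ hp + wt, data = mtcars)
b <- coef(fit)

# Intercept + hp coefficient * 180 + wt coefficient * 4.07
manual_fit <- unname(b["(Intercept)"] + b["hp"] * 180 + b["wt"] * 4.07)

# The observed quarter mile time for this car, for comparison
observed <- mtcars["Merc 450SE", "qsec"]

c(predicted = manual_fit, observed = observed)
```

The observed time for the 450SE is 17.4 seconds, which falls inside the prediction interval.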
Creating the confidence interval:
CI <- predict(quartMile, newdata, interval = "confidence")
CI
##        fit     lwr      upr
## 1 17.74189 17.2135 18.27028
The 95% confidence interval is (17.2135, 18.27028).
We are 95% confident that the mean quarter mile time for cars weighing 4070 pounds and producing 180 gross horsepower is between 17.2135 seconds and 18.27028 seconds.
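The confidence interval is much narrower than the prediction interval because it reflects only the uncertainty in the estimated mean, while the prediction interval also includes the residual variation of an individual car around that mean. The sketch below rebuilds both half-widths from the pieces returned by `predict()` with `se.fit = TRUE`; the names `fit`, `pred`, `t_crit`, `ci_half`, and `pi_half` are illustrative.

```r
# Refit the same model without relying on attach()
fit <- lm(qsec ~ hp + wt, data = mtcars)
newdata <- data.frame(hp = 180, wt = 4.07)

# se.fit = TRUE also returns the standard error of the fitted mean,
# the residual degrees of freedom, and the residual standard error
pred <- predict(fit, newdata, se.fit = TRUE)
t_crit <- qt(0.975, df = pred$df)

# Confidence interval: uncertainty in the mean only
ci_half <- unname(t_crit * pred$se.fit)

# Prediction interval: uncertainty in the mean plus residual variance
pi_half <- unname(t_crit * sqrt(pred$se.fit^2 + pred$residual.scale^2))

c(ci_half_width = ci_half, pi_half_width = pi_half)
```

Adding and subtracting each half-width from the fitted value should reproduce the two intervals shown above.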