In this report I take a closer look at the mtcars data set to see whether any of its variables are related to miles per gallon (MPG). The data were extracted from the 1974 Motor Trend US magazine and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles. The exploratory analysis focuses on how transmission type (automatic versus manual) affects MPG. On average, cars with a manual transmission get about 7 MPG more than cars with an automatic transmission. Several regression models are fitted and the best-performing one is selected. In that model, with the other predictors held constant, a manual car gets on average 14.079 + (-4.141)*weight more MPG than an automatic car of the same weight (weight in 1000 lb). The manual advantage is therefore largest for light cars and shrinks as weight increases, reaching zero at about 3,400 lb. Lighter cars (which may also have smaller engines) with a manual transmission have higher MPG than heavier cars (which may also have larger engines and heavier transmissions).
library(ggplot2)
data(mtcars)
str(mtcars)
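# Weight vs. MPG, colored by transmission (0 = automatic, 1 = manual)
ggplot(mtcars, aes(wt, mpg, color = factor(am))) + geom_point()
# Weight vs. number of cylinders, same coloring
ggplot(mtcars, aes(wt, cyl, color = factor(am))) + geom_point()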
A pattern is clearly visible: MPG falls as weight rises, and the manual cars sit mostly at the lighter end of the weight range.
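As a numeric companion to the scatter plots, the group means can be tabulated by transmission type. This is only a sketch; the object name `by_am` is my own and does not appear elsewhere in the report.

```r
data(mtcars)

# Mean weight, MPG and cylinder count by transmission
# (am: 0 = automatic, 1 = manual)
by_am <- aggregate(cbind(wt, mpg, cyl) ~ am, data = mtcars, FUN = mean)
by_am
```

If the pattern in the plots holds, the manual row should show a lower mean weight and a higher mean MPG than the automatic row.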
test <- t.test(mpg ~ am, mtcars)
test$p.value
## [1] 0.001373638
test$estimate
## mean in group 0 mean in group 1
## 17.14737 24.39231
The p-value (about 0.0014) is well below 0.05, so the null hypothesis of equal mean MPG is rejected. Manual cars average 24.39 MPG against 17.15 MPG for automatic cars, a difference of about 7.2 MPG.
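The size of the difference can be pulled straight from the test object rather than read off the printed means. The sketch below refits the same Welch t-test; `mpg_diff` is a name introduced here for illustration.

```r
data(mtcars)

# Welch two-sample t-test of MPG by transmission (0 = automatic, 1 = manual)
test <- t.test(mpg ~ am, data = mtcars)

# Difference in group means: manual minus automatic
mpg_diff <- unname(test$estimate[2] - test$estimate[1])
mpg_diff

# 95% confidence interval as reported by t.test (automatic minus manual)
test$conf.int
```

Note that `t.test` reports the interval for group 0 minus group 1, so its sign is the opposite of `mpg_diff`.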
model01 <- lm(mpg ~ ., data=mtcars)
summary(model01)
In the full model the wt estimate is -3.71530 and the am estimate is 2.52023. The adjusted R-squared is 0.8066.
model02 <- step(model01, k=log(nrow(mtcars)))
summary(model02)
For the model chosen by stepwise selection (k = log(n), i.e. the BIC penalty), the residual standard error is 2.459 on 28 degrees of freedom and the adjusted R-squared is 0.8336.
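To see which terms the stepwise search keeps, and to compare the two models on the criterion it uses, something like the following can be run. It is a sketch: `full` and `reduced` are illustrative names, and `trace = 0` is added only to silence the step-by-step log.

```r
data(mtcars)

full <- lm(mpg ~ ., data = mtcars)

# Backward stepwise selection with the BIC penalty, k = log(n)
reduced <- step(full, k = log(nrow(mtcars)), trace = 0)

# Terms kept by the search
formula(reduced)

# BIC of the full and the reduced model (lower is better)
c(full = BIC(full), reduced = BIC(reduced))
```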
model03 <- lm(mpg ~ wt + qsec + am + wt:am, data=mtcars)
summary(model03)
model04 <- lm(mpg ~ wt + am + wt:am, data=mtcars)
summary(model04)
model05 <- lm(mpg ~ am, data=mtcars)
summary(model05)
model06 <- lm(mpg ~ wt, data=mtcars)
summary(model06)
Dropping either wt or am does not give the whole picture: neither single-predictor model accounts for both effects.
anova(model01, model02, model03, model04)
confint(model03)
In the ANOVA table, “mpg ~ wt + qsec + am + wt:am” (model03) has Sum of Sq = 52.010, the highest value in the table, so this model is selected.
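The Sum of Sq figure for model03 comes from comparing it with the model listed just before it. A minimal sketch of that single nested comparison is given below, assuming the stepwise model is mpg ~ wt + qsec + am (consistent with its 28 residual degrees of freedom); `model02_assumed` and `cmp` are names made up for this example.

```r
data(mtcars)

# Assumption: the model selected by step() is mpg ~ wt + qsec + am
model02_assumed <- lm(mpg ~ wt + qsec + am, data = mtcars)
model03 <- lm(mpg ~ wt + qsec + am + wt:am, data = mtcars)

# F-test for adding the wt:am interaction to the smaller model
cmp <- anova(model02_assumed, model03)
cmp
```

Because the two models differ by one term, this F-test is equivalent to the t-test on the wt:am coefficient in the model03 summary.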
summary(model03)$coef
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  9.723053  5.8990407  1.648243 0.1108925394
## wt          -2.936531  0.6660253 -4.409038 0.0001488947
## qsec         1.016974  0.2520152  4.035366 0.0004030165
## am          14.079428  3.4352512  4.098515 0.0003408693
## wt:am       -4.141376  1.1968119 -3.460340 0.0018085763
Holding “qsec” (1/4 mile time) constant, a car with manual transmission gets, on average, 14.079 + (-4.141)*wt more MPG (miles per gallon) than a car with automatic transmission of the same weight, where “wt” is the weight in 1000 lb.
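Because the transmission effect depends on weight, it helps to evaluate it at a few weights. The sketch below uses the am and wt:am estimates from the coefficient table; `manual_effect` and `crossover_wt` are helper names introduced here.

```r
# Coefficients copied from the summary(model03)$coef table
b_am    <- 14.079428   # am main effect
b_wt_am <- -4.141376   # wt:am interaction

# Estimated manual-minus-automatic MPG difference at weight wt (1000 lb),
# holding qsec constant
manual_effect <- function(wt) b_am + b_wt_am * wt

# Effect for cars of 2000, 3000 and 4000 lb
manual_effect(c(2, 3, 4))

# Weight at which the estimated difference is zero
crossover_wt <- -b_am / b_wt_am
crossover_wt
```

The estimated manual advantage is about 5.8 MPG at 2000 lb, about 1.7 MPG at 3000 lb, and turns negative above roughly 3400 lb.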
See the diagnostic plots below.
1. The Residuals vs. Fitted plot shows no distinct pattern.
2. The Normal Q-Q plot is consistent with normally distributed residuals.
3. The Scale-Location plot shows a reasonable distribution.
4. The Residuals vs. Leverage plot does not show any outliers.
par(mfrow = c(2,2))
plot(model03)
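The visual checks can be backed up with a few numbers. This is one possible sketch, not part of the original analysis: a Shapiro-Wilk test on the residuals, plus the largest leverage and Cook's distance values.

```r
data(mtcars)
model03 <- lm(mpg ~ wt + qsec + am + wt:am, data = mtcars)

# Shapiro-Wilk test on the residuals (numeric companion to the Q-Q plot)
shapiro.test(residuals(model03))

# Three largest leverage values (hat values)
lev <- hatvalues(model03)
head(sort(lev, decreasing = TRUE), 3)

# Three largest Cook's distances
head(sort(cooks.distance(model03), decreasing = TRUE), 3)
```

Cars that appear near the top of both lists would be the ones to examine first when looking for influential points.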