We explore the mtcars dataset to examine the effect of transmission type on miles per gallon (MPG). Using a t-test and a linear model, we find that cars with manual transmission get higher MPG than cars with automatic transmission: about 7.2 MPG higher unadjusted, and about 2.9 MPG higher after adjusting for weight and quarter-mile time.
In this project, we explore the mtcars dataset to examine two questions: which transmission type, automatic or manual, gives better MPG, and how large the difference is.
For the exploratory analysis that follows, we convert the categorical variables to factors.
# Convert categorical variables to factors
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
mtcars$am <- factor(mtcars$am,labels=c("Automatic","Manual"))
Visually, cars with manual transmission appear to achieve higher MPG.
boxplot(mpg ~ am, data = mtcars)
A Welch two-sample t-test indicates that the difference is statistically significant (p = 0.0014).
automatic <- mtcars$mpg[mtcars$am == "Automatic"]
manual <- mtcars$mpg[mtcars$am == "Manual"]
t.test(automatic,manual)
##
## Welch Two Sample t-test
##
## data: automatic and manual
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.280194 -3.209684
## sample estimates:
## mean of x mean of y
## 17.14737 24.39231
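The same test can also be written with the formula interface, which avoids splitting the data by hand and makes the individual results easy to pull out. This is a minimal sketch: `cars` and `welch` are illustrative names not used elsewhere in this report, and the copy is made so the `mtcars` object above is left untouched.

```r
# Work on a copy of the built-in data (the name `cars` is illustrative)
cars <- datasets::mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Formula interface: the same Welch test as t.test(automatic, manual)
welch <- t.test(mpg ~ am, data = cars)

welch$p.value    # two-sided p-value
welch$conf.int   # 95% CI for mean(Automatic) - mean(Manual)
welch$estimate   # the two group means
```

Because the interval is for automatic minus manual, both of its bounds are negative when manual cars have the higher mean.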
To quantify the difference, we calculate means of mpg per group:
aggregate(mpg~am, data = mtcars, mean)
## am mpg
## 1 Automatic 17.14737
## 2 Manual 24.39231
Manual cars average 7.24 MPG more than automatic cars. This comparison, however, does not control for other variables. To isolate the effect of transmission type, we fit linear models.
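As a bridge between the two approaches, a regression with transmission type as its only predictor reproduces the raw difference in group means exactly. The sketch below checks this on a copy of the data; the names `cars`, `raw_diff`, and `slope` are illustrative only.

```r
# Copy of the built-in data with am as a labelled factor
cars <- datasets::mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Raw difference in mean MPG: Manual minus Automatic
group_means <- tapply(cars$mpg, cars$am, mean)
raw_diff <- unname(group_means["Manual"] - group_means["Automatic"])

# With a single two-level factor, the slope is that same difference
simple <- lm(mpg ~ am, data = cars)
slope <- unname(coef(simple)["amManual"])

all.equal(raw_diff, slope)
round(raw_diff, 2)
```

Any change in the transmission coefficient once more predictors are added therefore reflects adjustment for those predictors, not a different way of measuring the gap.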
We fit three candidate models:

* Simple model with one predictor.
* Model with all variables as predictors.
* Model with predictors selected by the step function.
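# Note: data(mtcars) reloads the original dataset and discards the factor
# conversions above, so every predictor (including am) enters as numeric
data(mtcars)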
simple_fit <- lm(mpg ~ am, data = mtcars)
init_fit <- lm(mpg ~ ., data = mtcars)
best_fit <- step(init_fit, direction = "both", trace = FALSE)
We compare the three models using anova.
anova(simple_fit,best_fit,init_fit)
## Analysis of Variance Table
##
## Model 1: mpg ~ am
## Model 2: mpg ~ wt + qsec + am
## Model 3: mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 30 720.90
## 2 28 169.29 2 551.61 39.2687 8.025e-08 ***
## 3 21 147.49 7 21.79 0.4432 0.8636
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
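The model selected by the step function fits significantly better than the simple model (p = 8.0e-08), while the full model gives no significant improvement over it (p = 0.86). We therefore keep the model selected by the step function.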
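Since `step` ranks models by AIC by default while `anova` uses F-tests on nested models, it can help to see both kinds of summary side by side. The following is a sketch under the assumption that the three formulas match the ANOVA table above; `cars`, `fits`, and `comparison` are illustrative names.

```r
# Refit the three candidates on a fresh copy of the data
cars <- datasets::mtcars
fits <- list(
  simple   = lm(mpg ~ am, data = cars),
  stepwise = lm(mpg ~ wt + qsec + am, data = cars),
  full     = lm(mpg ~ ., data = cars)
)

# AIC (lower is better) and adjusted R-squared (higher is better)
comparison <- data.frame(
  AIC    = sapply(fits, AIC),
  adj_r2 = sapply(fits, function(f) summary(f)$adj.r.squared),
  row.names = names(fits)
)

comparison[order(comparison$AIC), ]
```

Both criteria penalise extra predictors, which is why the full model can have the smallest residual sum of squares and still rank below the three-predictor model.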
summary(best_fit)
##
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4811 -1.5555 -0.7257 1.4110 4.6610
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.6178 6.9596 1.382 0.177915
## wt -3.9165 0.7112 -5.507 6.95e-06 ***
## qsec 1.2259 0.2887 4.247 0.000216 ***
## am 2.9358 1.4109 2.081 0.046716 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared: 0.8497, Adjusted R-squared: 0.8336
## F-statistic: 52.75 on 3 and 28 DF, p-value: 1.21e-11
The model reads as follows:
Cars with manual transmission get about 2.94 more MPG than cars with automatic transmission, adjusted for weight and quarter-mile time (p = 0.047).
MPG decreases with the weight of the car, by about 3.92 for every 1000 lb increase.
MPG increases by about 1.23 for every additional second of quarter-mile time (qsec).
Together, the three predictors explain about 85% of the variance in MPG (R-squared 0.8497).
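The transmission coefficient in the summary output above (2.94, p = 0.047) sits close to the conventional 0.05 threshold, so a confidence interval gives a better sense of the uncertainty than the point estimate alone. A minimal sketch, assuming the same formula as the selected model; `cars`, `fit`, and `ci` are illustrative names.

```r
# Refit the selected model on a fresh copy of the data
cars <- datasets::mtcars
fit <- lm(mpg ~ wt + qsec + am, data = cars)

# 95% confidence interval for the transmission coefficient only
ci <- confint(fit, "am", level = 0.95)
ci
```

A p-value below 0.05 implies that the 95% interval excludes zero, but with a standard error of about 1.41 the interval is wide, so the adjusted effect is estimated with limited precision.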
# Residual diagnostics: the four standard lm plots in a 2x2 grid
par(mfrow = c(2, 2))
plot(best_fit)
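The diagnostic plots can be complemented with a few numeric checks. The sketch below is one possible set, not part of the original analysis: it lists high-leverage cars using the common rule of thumb of twice the average leverage, shows the largest Cook's distances, and runs a Shapiro-Wilk test on the residuals. The names `cars`, `fit`, `lev`, and `cook` are illustrative.

```r
# Refit the selected model on a fresh copy of the data
cars <- datasets::mtcars
fit <- lm(mpg ~ wt + qsec + am, data = cars)

lev  <- hatvalues(fit)        # leverage of each observation
cook <- cooks.distance(fit)   # influence of each observation on the fit

# Rule-of-thumb cutoff (an assumption, not a strict test): 2 * p / n
p <- length(coef(fit))
n <- nrow(cars)
high_leverage <- names(lev)[lev > 2 * p / n]

high_leverage                            # cars above the leverage cutoff
head(sort(cook, decreasing = TRUE), 3)   # three most influential cars
shapiro.test(residuals(fit))             # normality check on residuals
```

These numbers are only a guide: a car flagged here deserves a closer look in the plots, not automatic removal.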