In this report, we explore the relationship between a set of variables and miles per gallon (MPG) in a data set covering a collection of cars. In particular, we ask whether manual or automatic transmission is better for MPG, and how large the difference between the two is.
The data was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).
In total, there are 32 observations of 11 variables. The first few rows look like this:
##                mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
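As a rough sketch of how this overview can be reproduced, the snippet below assumes the data is the `mtcars` data set that ships with base R, which is the name used in the model call later in the report.

```r
# mtcars ships with base R (datasets package), so nothing needs downloading.
data(mtcars)

dim(mtcars)        # number of observations (cars) and variables
head(mtcars, 3)    # the first three rows, as printed above
str(mtcars)        # every column is stored as a numeric vector
```

Note that `am` and `cyl` are stored as numbers, which is why the models in this report wrap them in `factor()`.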
Based on the boxplot below, manual transmission cars appear to be more fuel efficient than automatic transmission cars.
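A boxplot of this kind could be drawn along the following lines. This is only a sketch using base graphics; the original plotting code is not shown in the report, and `transmission` and `group_medians` are helper names introduced here for illustration.

```r
# Label the 0/1 transmission code so the axis ticks are readable.
# `transmission` is a helper name introduced here, not from the report.
transmission <- factor(mtcars$am, levels = c(0, 1),
                       labels = c("Automatic", "Manual"))

boxplot(mtcars$mpg ~ transmission,
        xlab = "Transmission", ylab = "Miles per gallon (mpg)",
        main = "MPG by transmission type")

# Group medians, i.e. the centre lines of the two boxes.
group_medians <- tapply(mtcars$mpg, transmission, median)
group_medians
```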
Next, we model the relationship between the dependent variable, mpg, and the regressor am using a simple linear regression.
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 17.147368   1.124603 15.247492 1.133983e-15
## factor(am)1  7.244939   1.764422  4.106127 2.850207e-04
The intercept, 17.147, is the mean mpg for automatic cars, while the coefficient of 7.245 is the estimated increase in mean mpg for manual cars relative to automatic cars.
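The coefficient table can be reproduced with a fit like the one below. The object names `fit1` and `group_means` are assumptions made for this sketch. It also checks the interpretation given above: with a single two-level factor as regressor, the intercept is the mean of the reference group and the slope is the difference between the two group means.

```r
# Simple linear regression of mpg on transmission type (0 = automatic, 1 = manual).
fit1 <- lm(mpg ~ factor(am), data = mtcars)
summary(fit1)$coefficients

# Compare the coefficients with the raw group means.
group_means <- tapply(mtcars$mpg, mtcars$am, mean)
intercept   <- unname(coef(fit1)["(Intercept)"])
am_effect   <- unname(coef(fit1)["factor(am)1"])

all.equal(intercept, unname(group_means["0"]))              # automatic mean
all.equal(intercept + am_effect, unname(group_means["1"]))  # manual mean
```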
We then run a t-test to check whether this difference is significant.
##
## Welch Two Sample t-test
##
## data: mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.280194 -3.209684
## sample estimates:
## mean in group 0 mean in group 1
##        17.14737        24.39231
Since the p-value of 0.001374 is below 0.05, we reject the null hypothesis: there is a significant difference in mean MPG between automatic and manual transmission cars. Therefore, in this unadjusted comparison, manual transmission cars are better for mpg.
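The test output above corresponds to a call such as the following. `t.test()` performs Welch's unequal-variance test by default; the name `tt` is an assumption made for this sketch.

```r
# Welch two-sample t-test of mpg between automatic (am = 0) and manual (am = 1) cars.
# var.equal defaults to FALSE, which gives the Welch version of the test.
tt <- t.test(mpg ~ am, data = mtcars)

tt$statistic   # t value
tt$parameter   # Welch-adjusted degrees of freedom
tt$p.value     # compared against the 0.05 threshold
tt$conf.int    # 95% interval for (automatic mean - manual mean)
```

The interval is negative because the difference is taken as group 0 minus group 1, that is, automatic minus manual.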
##                2.5 %   97.5 %
## (Intercept) 14.85062 19.44411
## factor(am)1  3.64151 10.84837
And, we are 95% confident that, on average, manual transmission cars get between 3.64 and 10.85 more miles per gallon than automatic transmission cars.
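For reference, the interval can be obtained with `confint()` or rebuilt by hand as estimate ± t quantile × standard error. The sketch below does both; `fit1`, `ci` and `manual_ci` are names assumed for illustration.

```r
fit1 <- lm(mpg ~ factor(am), data = mtcars)

# 95% confidence intervals for both coefficients.
ci <- confint(fit1, level = 0.95)
ci

# The same interval for the transmission effect, computed by hand.
coefs     <- summary(fit1)$coefficients
est       <- coefs["factor(am)1", "Estimate"]
se        <- coefs["factor(am)1", "Std. Error"]
t_crit    <- qt(0.975, df = df.residual(fit1))   # 30 residual degrees of freedom
manual_ci <- est + c(-1, 1) * t_crit * se
manual_ci
```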
This is a scatterplot matrix across all 11 variables with regression lines and correlation values, grouped by transmission (red = automatic, green = manual).
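A simplified version of such a plot, together with the correlations against mpg, might look like the sketch below. It uses base `pairs()` and omits the per-panel regression lines and correlation labels of the original figure, whose code is not shown; `point_cols` and `ranked` are names assumed here.

```r
# Colour points by transmission: red = automatic (am = 0), green = manual (am = 1).
point_cols <- c("red", "green")[mtcars$am + 1]
pairs(mtcars, col = point_cols, pch = 19, cex = 0.6)

# Correlation of every other variable with mpg, strongest (in absolute value) first.
mpg_cor <- cor(mtcars)[, "mpg"]
mpg_cor <- mpg_cor[names(mpg_cor) != "mpg"]
ranked  <- mpg_cor[order(abs(mpg_cor), decreasing = TRUE)]
round(ranked, 3)
```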
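From the pairs plot above, besides am, the other variables that are highly correlated with mpg are cyl, disp, hp and wt. Hence, we will try adding some of these variables to the model and compare the resulting nested models with an analysis of variance.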
## Analysis of Variance Table
##
## Model 1: mpg ~ factor(am)
## Model 2: mpg ~ factor(am) + factor(cyl)
## Model 3: mpg ~ factor(am) + factor(cyl) + disp
## Model 4: mpg ~ factor(am) + factor(cyl) + disp + wt
##   Res.Df    RSS Df Sum of Sq       F    Pr(>F)
## 1     30 720.90
## 2     28 264.50  2    456.40 32.4451 8.589e-08 ***
## 3     27 230.46  1     34.04  4.8391   0.03691 *
## 4     26 182.87  1     47.59  6.7663   0.01513 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Each added term significantly improves the fit at the 0.05 level. In particular, the p-value of 0.01513 for adding wt in the last step is less than 0.05, so we can claim that model 4 is better than our initial model 1.
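The nested comparison can be sketched as follows. The names `fit1` to `fit4`, `tab` and `direct` are assumptions made for illustration. Each row of the sequential table compares a model with the one immediately before it, so the final call also compares model 1 with model 4 directly.

```r
# Build the four nested models by adding one term at a time.
fit1 <- lm(mpg ~ factor(am), data = mtcars)
fit2 <- update(fit1, . ~ . + factor(cyl))
fit3 <- update(fit2, . ~ . + disp)
fit4 <- update(fit3, . ~ . + wt)

# Sequential F-tests: row k compares model k with model k - 1.
tab <- anova(fit1, fit2, fit3, fit4)
tab

# Direct comparison of the initial model with the final one.
direct <- anova(fit1, fit4)
direct
```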
We construct residual plots to check for any systematic patterns.
The residual plots show no systematic pattern, suggesting that the linear model is a reasonable fit.
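One common way to produce such diagnostics is the default `plot()` method for `lm` objects, sketched below. The report does not show which plots were actually used, so the 2 x 2 layout is an assumption, as are the names `fit4`, `old_par` and `res`.

```r
fit4 <- lm(mpg ~ factor(am) + factor(cyl) + disp + wt, data = mtcars)

# Four standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage.
old_par <- par(mfrow = c(2, 2))
plot(fit4)
par(old_par)   # restore the previous plotting layout

# Numeric summary of the residuals, matching the "Residuals" block of summary(fit4).
res <- resid(fit4)
summary(res)
```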
##
## Call:
## lm(formula = mpg ~ factor(am) + factor(cyl) + disp + wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.5029 -1.2829 -0.4825  1.4954  5.7889
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  33.816067   2.914272  11.604 8.79e-12 ***
## factor(am)1   0.141212   1.326751   0.106  0.91605
## factor(cyl)6 -4.304782   1.492355  -2.885  0.00777 **
## factor(cyl)8 -6.318406   2.647658  -2.386  0.02458 *
## disp          0.001632   0.013757   0.119  0.90647
## wt           -3.249176   1.249098  -2.601  0.01513 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.652 on 26 degrees of freedom
## Multiple R-squared: 0.8376, Adjusted R-squared: 0.8064
## F-statistic: 26.82 on 5 and 26 DF, p-value: 1.73e-09
This model explains 83.76% of the variance in mpg. However, the p-value of 0.91605 for factor(am)1 shows that transmission type has no statistically significant effect on fuel efficiency once the number of cylinders, displacement and weight are taken into account.