Fuel consumption is an important attribute of a car. Does the transmission type affect it? And if so, which is better in terms of consumption: automatic or manual? Several techniques are used to answer these questions. First, the data are explored to understand them better, and variables are plotted against each other to reveal the relationships between them. A t-test is used to check whether mean consumption differs between the two transmission types. Finally, several linear models are constructed and compared to find the one with the best fit, and the final regression model is examined by analyzing its residuals. The results show that, once weight is accounted for, the two transmission types do not differ significantly in consumption.
The data were extracted from the 1974 Motor Trend US magazine. The recorded aspects of the vehicles are fuel consumption, number of cylinders, displacement, horsepower, rear axle ratio, weight, quarter-mile time, engine type (V or straight), transmission (automatic or manual), number of forward gears and number of carburetors. The sample consists of 32 automobiles (1973–74 models).
The average consumption of the cars is 20.09 miles/gallon with a standard deviation of 6.03 miles/gallon, and the average displacement is 230.72 cu.in. with a standard deviation of 123.94 cu.in. The sample has an average of 146.69 horsepower with a standard deviation of 68.56. The rear axle ratio averages 3.6 with a standard deviation of 0.53. The average weight of the cars is 3217 lb with a standard deviation of 978 lb. The average quarter-mile time is 17.85 seconds with a standard deviation of 1.79 seconds. Histograms for all the variables can be seen in Figure 1 of the Appendix. Figure 2 shows pie charts of the categorical data. The normal Q-Q plots of the non-categorical data are shown in Figure 3.
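The summary figures above could be reproduced along the following lines. This is a minimal sketch that assumes the data are R's built-in `mtcars` data frame (an assumption based on the variable names in the output later in this report); in that data frame `wt` is stored in units of 1000 lb.

```r
# Assumption: the report's data are the built-in mtcars data frame.
# Continuous variables only; wt is recorded in 1000 lb.
vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")

# Mean and standard deviation of each continuous variable
summary_stats <- data.frame(
  mean = sapply(mtcars[vars], mean),
  sd   = sapply(mtcars[vars], sd)
)

print(round(summary_stats, 2))
```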
Because we are interested in the relationship between consumption and transmission type, it is worth examining whether mean consumption differs between the two transmission types. The mean consumption of the cars with automatic transmission is 17.15 miles/gallon with a standard deviation of 3.83 miles/gallon. The mean consumption of the cars with manual transmission is 24.39 miles/gallon with a standard deviation of 6.17 miles/gallon. A t-test shows that this difference in the means is significant (p-value = 0.0013), which means that the cars with automatic transmission have a greater fuel consumption, that is, fewer miles per gallon. The box plot in Figure 4 shows the difference.
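A sketch of the comparison of group means, again assuming the built-in `mtcars` data (where `am` is 0 for automatic and 1 for manual). It uses R's default `t.test`, which is the Welch unequal-variance form; the report does not state which variant was used, but the Welch result appears consistent with the p-value quoted above.

```r
# Assumption: built-in mtcars; am = 0 (automatic), am = 1 (manual).
# t.test() defaults to the Welch test (var.equal = FALSE).
welch <- t.test(mpg ~ am, data = mtcars)

print(welch$estimate)  # mean mpg in each transmission group
print(welch$p.value)   # two-sided p-value for the difference in means
```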
The scatter plots in Figure 5 show strong correlations among several of the variables. There is a clear link between displacement and consumption: cars with bigger displacement achieve fewer miles per gallon. There is also a clear, almost linear, negative correlation between miles per gallon and weight: the heavier the car, the fewer the miles per gallon. A clear linear correlation exists between weight and displacement. Additionally, we can see that the lighter cars, with the greater mileage, tend to have manual transmission (blue dots), and the heavier cars, with the lower mileage, tend to have automatic transmission (green dots). The same holds for displacement.
This section tries to find a model that explains the variability in consumption. Transmission type seems to have an impact on consumption. However, the lighter cars and those with smaller displacement mostly have manual transmissions, while the heavier cars with bigger displacement mostly have automatic transmissions, so weight and displacement may confound the comparison. First, a simple regression model is constructed with transmission as the predictor and consumption as the outcome.
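The relationships read off the scatter plots can also be checked numerically. The sketch below, assuming the built-in `mtcars` data, prints the pairwise correlations between mileage, weight and displacement, and the average weight and displacement within each transmission group.

```r
# Assumption: built-in mtcars; am = 0 (automatic), am = 1 (manual).
# Pairwise Pearson correlations between mileage, weight and displacement
cor_mat <- cor(mtcars[, c("mpg", "wt", "disp")])
print(round(cor_mat, 2))

# Average weight (1000 lb) and displacement (cu.in.) per transmission type
wt_by_am   <- tapply(mtcars$wt,   mtcars$am, mean)
disp_by_am <- tapply(mtcars$disp, mtcars$am, mean)
print(wt_by_am)
print(disp_by_am)
```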
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 17.147368   1.124603 15.247492 1.133983e-15
## factor(am)1  7.244939   1.764422  4.106127 2.850207e-04
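A coefficient table like the one above could be obtained as sketched here, assuming the built-in `mtcars` data. With a single two-level factor as predictor, the intercept equals the mean of the reference group (automatic) and the `factor(am)1` coefficient equals the difference between the two group means, which the last lines check directly.

```r
# Assumption: built-in mtcars; am = 0 (automatic) is the reference level.
fit1 <- lm(mpg ~ factor(am), data = mtcars)
print(summary(fit1)$coefficients)

# Group means of mpg by transmission type
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# Intercept = automatic mean; intercept + coefficient = manual mean
print(c(automatic = unname(coef(fit1)[1]),
        manual    = unname(sum(coef(fit1)))))
print(group_means)
```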
Both coefficients are significant. The intercept, 17.15, is the mean mileage of the cars with automatic transmission; the coefficient of 7.24 is the additional mileage of the manual cars, whose mean is therefore 17.15 + 7.24 = 24.39 miles per gallon. The R-squared is 0.3598, which means that about 36% of the variation in consumption is explained by transmission alone. As explained above, weight and displacement are also related to consumption, so we add weight to the model first, then displacement, and compare the resulting models.
##                Estimate Std. Error     t value     Pr(>|t|)
## (Intercept) 37.32155131  3.0546385 12.21799285 5.843477e-13
## factor(am)1 -0.02361522  1.5456453 -0.01527855 9.879146e-01
## wt          -5.35281145  0.7882438 -6.79080719 1.867415e-07
The results show that the variation is now much better explained: the R-squared rises to 0.7528. The coefficients show that, once weight is in the model, there is no significant difference between automatic and manual transmission (p-value = 0.99). Additionally, holding transmission type fixed, an increase of 1000 pounds in weight decreases the mileage by 5.35 miles per gallon.
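The model with weight added could be fitted as in the sketch below, assuming the built-in `mtcars` data (with `wt` in units of 1000 lb). It prints both the plain and the adjusted R-squared so that the two can be told apart.

```r
# Assumption: built-in mtcars; wt is in 1000 lb, so the wt coefficient
# is the change in mpg per additional 1000 lb, holding transmission fixed.
fit2 <- lm(mpg ~ factor(am) + wt, data = mtcars)
s2 <- summary(fit2)

print(s2$coefficients)
print(c(r.squared     = s2$r.squared,
        adj.r.squared = s2$adj.r.squared))
```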
##                Estimate Std. Error    t value     Pr(>|t|)
## (Intercept) 34.67591088 3.24060891 10.7004306 2.115200e-11
## factor(am)1  0.17772414 1.48431586  0.1197347 9.055483e-01
## wt          -3.27904388 1.32750927 -2.4700723 1.986658e-02
## disp        -0.01780491 0.00937465 -1.8992613 6.787740e-02
The coefficients change somewhat with the addition of displacement: the weight coefficient remains negative, and the transmission coefficient stays close to zero and non-significant. The following table shows that including displacement adds little to the explanation of the variance in consumption (p-value = 0.068), whereas the second model explains much more than the first (p-value ≈ 1e-07). So we keep the second model and run some diagnostics on it.
## Analysis of Variance Table
##
## Model 1: mpg ~ factor(am)
## Model 2: mpg ~ factor(am) + wt
## Model 3: mpg ~ factor(am) + wt + disp
##   Res.Df    RSS Df Sum of Sq       F    Pr(>F)
## 1     30 720.90
## 2     29 278.32  1    442.58 50.2610 1.032e-07 ***
## 3     28 246.56  1     31.76  3.6072   0.06788 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
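A nested-model comparison of this kind could be run as sketched below, assuming the built-in `mtcars` data. Each row of the `anova` output tests whether the term added relative to the previous model reduces the residual sum of squares by more than chance would suggest.

```r
# Assumption: built-in mtcars. The three models are nested, each adding
# one predictor to the previous one.
fit1 <- lm(mpg ~ factor(am), data = mtcars)
fit2 <- lm(mpg ~ factor(am) + wt, data = mtcars)
fit3 <- lm(mpg ~ factor(am) + wt + disp, data = mtcars)

# Sequential F-tests: model 2 vs model 1, then model 3 vs model 2
cmp <- anova(fit1, fit2, fit3)
print(cmp)
```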
The 95% confidence intervals of the coefficients are shown in the following table. The interval for the manual transmission coefficient runs from -3.18 to 3.14 and therefore includes zero, so we cannot say whether manual transmission is better or worse than automatic.
##                 2.5 %    97.5 %
## (Intercept) 31.074114 43.568989
## factor(am)1 -3.184815  3.137584
## wt          -6.964951 -3.740672
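Intervals like those in the table above can be obtained with `confint`. The sketch below assumes the built-in `mtcars` data and also checks explicitly whether the interval for the transmission coefficient straddles zero.

```r
# Assumption: built-in mtcars; same model as the second one in this report.
fit2 <- lm(mpg ~ factor(am) + wt, data = mtcars)

# 95% confidence intervals for all coefficients
ci <- confint(fit2, level = 0.95)
print(ci)

# TRUE when the interval for the manual-transmission effect includes zero
contains_zero <- ci["factor(am)1", 1] < 0 && ci["factor(am)1", 2] > 0
print(contains_zero)
```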
The plot of the residuals against the fitted values seems to show a non-linear pattern, which probably means that the model needs some kind of transformation. The residuals seem to be approximately normally distributed (see Figure 6 of the Appendix).
Although the t-test shows a significant difference between the consumption of automatic and manual transmission cars, a more elaborate analysis shows that this difference cannot be attributed to the transmission itself but rather to other characteristics of the cars, such as weight.
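Diagnostic plots of this kind could be drawn as in the sketch below, assuming the built-in `mtcars` data. It is not necessarily how Figure 6 was produced; it simply plots the residuals against the fitted values and adds a normal Q-Q plot of the residuals.

```r
# Assumption: built-in mtcars; same model as the second one in this report.
fit2 <- lm(mpg ~ factor(am) + wt, data = mtcars)
res  <- resid(fit2)

op <- par(mfrow = c(1, 2))  # two panels side by side

# Residuals vs fitted values: curvature suggests a missed non-linearity
plot(fitted(fit2), res,
     xlab = "Fitted mpg", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points near the line suggest roughly normal residuals
qqnorm(res)
qqline(res)

par(op)  # restore the previous plotting layout
```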
Figure 1. Variable histograms
Figure 2. Pie charts
Figure 3. Normal Q-Q plots
Figure 6. Residual plots