Analysis of Transmission Effects on Fuel Efficiency

Executive Summary

You work for Motor Trend, a magazine about the automobile industry. Looking at a data set of a collection of cars, the editors are interested in exploring the relationship between a set of variables and miles per gallon (MPG), the outcome. They are particularly interested in two questions: whether an automatic or a manual transmission is better for MPG, and how large the MPG difference between the two transmission types is.

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

Dataset details

The dataset contains 32 rows and 11 variables, listed below:

  1. mpg: Miles per US gallon
  2. cyl: Number of cylinders
  3. disp: Displacement (cu. in.)
  4. hp: Gross horsepower
  5. drat: Rear axle ratio
  6. wt: Weight (lb / 1000)
  7. qsec: 1/4 mile time in sec
  8. vs: Engine shape (0 = V-shaped, 1 = straight)
  9. am: Transmission (0 = automatic, 1 = manual)
  10. gear: Number of gears
  11. carb: Number of carburetors
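
As a quick check of the description above, the following sketch inspects the dataset's shape and labels the transmission codes. The copy named cars and the labels "automatic" and "manual" are illustrative choices, not part of the original analysis.

```r
# mtcars ships with base R (package 'datasets'), so no download is needed.
data(mtcars)

dim(mtcars)   # number of rows and columns
str(mtcars)   # every column is stored as numeric

# Hypothetical helper copy: label the transmission codes for readability.
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("automatic", "manual"))
table(cars$am)   # number of cars per transmission type
```

Working on a labelled copy leaves mtcars itself untouched, so the commands in the rest of the report behave as shown.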

Detailed records of the dataset

unique(mtcars)
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Research Questions

Summary for all cars

summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

Summary for automatic type cars

summary(mtcars[mtcars$am==0,])
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   :120.1   Min.   : 62.0  
##  1st Qu.:14.95   1st Qu.:6.000   1st Qu.:196.3   1st Qu.:116.5  
##  Median :17.30   Median :8.000   Median :275.8   Median :175.0  
##  Mean   :17.15   Mean   :6.947   Mean   :290.4   Mean   :160.3  
##  3rd Qu.:19.20   3rd Qu.:8.000   3rd Qu.:360.0   3rd Qu.:192.5  
##  Max.   :24.40   Max.   :8.000   Max.   :472.0   Max.   :245.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :2.465   Min.   :15.41   Min.   :0.0000  
##  1st Qu.:3.070   1st Qu.:3.438   1st Qu.:17.18   1st Qu.:0.0000  
##  Median :3.150   Median :3.520   Median :17.82   Median :0.0000  
##  Mean   :3.286   Mean   :3.769   Mean   :18.18   Mean   :0.3684  
##  3rd Qu.:3.695   3rd Qu.:3.842   3rd Qu.:19.17   3rd Qu.:1.0000  
##  Max.   :3.920   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am         gear            carb      
##  Min.   :0   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0   Median :3.000   Median :3.000  
##  Mean   :0   Mean   :3.211   Mean   :2.737  
##  3rd Qu.:0   3rd Qu.:3.000   3rd Qu.:4.000  
##  Max.   :0   Max.   :4.000   Max.   :4.000

Summary for manual type cars

summary(mtcars[mtcars$am==1,])
##       mpg             cyl             disp             hp       
##  Min.   :15.00   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:21.00   1st Qu.:4.000   1st Qu.: 79.0   1st Qu.: 66.0  
##  Median :22.80   Median :4.000   Median :120.3   Median :109.0  
##  Mean   :24.39   Mean   :5.077   Mean   :143.5   Mean   :126.8  
##  3rd Qu.:30.40   3rd Qu.:6.000   3rd Qu.:160.0   3rd Qu.:113.0  
##  Max.   :33.90   Max.   :8.000   Max.   :351.0   Max.   :335.0  
##       drat            wt             qsec             vs        
##  Min.   :3.54   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.85   1st Qu.:1.935   1st Qu.:16.46   1st Qu.:0.0000  
##  Median :4.08   Median :2.320   Median :17.02   Median :1.0000  
##  Mean   :4.05   Mean   :2.411   Mean   :17.36   Mean   :0.5385  
##  3rd Qu.:4.22   3rd Qu.:2.780   3rd Qu.:18.61   3rd Qu.:1.0000  
##  Max.   :4.93   Max.   :3.570   Max.   :19.90   Max.   :1.0000  
##        am         gear            carb      
##  Min.   :1   Min.   :4.000   Min.   :1.000  
##  1st Qu.:1   1st Qu.:4.000   1st Qu.:1.000  
##  Median :1   Median :4.000   Median :2.000  
##  Mean   :1   Mean   :4.385   Mean   :2.923  
##  3rd Qu.:1   3rd Qu.:5.000   3rd Qu.:4.000  
##  Max.   :1   Max.   :5.000   Max.   :8.000

The mean MPG for manual cars is higher than for automatic cars (24.39 vs. 17.15). Further investigation is needed to verify this difference.

A box plot shows the two distributions side by side:

boxplot(mpg ~ am, data = mtcars, col = c("gold", "darkgreen"),
        names = c("Automatic", "Manual"),
        xlab = "Transmission Type", ylab = "MPG",
        main = "MPG vs. Transmission Type")

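The plot suggests that manual cars (am = 1) tend to have higher MPG than automatic cars (am = 0). A box plot shows medians and quartiles rather than means, so it does not by itself prove a difference in means.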

A hypothesis test is needed to confirm this. First, the group means:

aggregate(mpg~am, data = mtcars, mean)
##   am      mpg
## 1  0 17.14737
## 2  1 24.39231

The mean MPG for manual cars is 24.39, which is 7.24 mpg higher than the 17.15 mpg for automatic cars.
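
The difference quoted above can be reproduced directly from the group means. This is a minimal sketch; the names group_means and mean_diff are introduced here for illustration only.

```r
# Mean mpg per transmission code; tapply names the results "0" and "1".
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# Manual (1) minus automatic (0).
mean_diff <- unname(group_means["1"] - group_means["0"])

round(group_means, 2)
round(mean_diff, 2)
```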

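We want to test whether this difference is statistically significant. A two-sample t-test can be used for this. By default, R's t.test runs Welch's version of the test, which does not assume equal variances in the two groups.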

a <- mtcars[mtcars$am == 0,]
m <- mtcars[mtcars$am == 1,]
t.test(a$mpg, m$mpg)
## 
##  Welch Two Sample t-test
## 
## data:  a$mpg and m$mpg
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean of x mean of y 
##  17.14737  24.39231

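With a p-value of 0.001374, we reject the null hypothesis at the 5% level: there is a statistically significant difference in mean MPG between automatic and manual transmissions. The 95% confidence interval for the difference (automatic minus manual) runs from -11.28 to -3.21 mpg.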
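
The same test can be written with the formula interface, which avoids the two temporary subsets and makes it easy to pull out individual results. This is a sketch, and the object name tt is an assumption made for illustration.

```r
# Welch two-sample t-test; var.equal = FALSE is the default for t.test.
tt <- t.test(mpg ~ am, data = mtcars)

tt$p.value          # p-value of the test
tt$conf.int         # 95% CI for mean(am = 0) - mean(am = 1)
tt$p.value < 0.05   # TRUE means the null is rejected at the 5% level
```

Because the groups are ordered by the value of am, the estimated difference is automatic minus manual, matching the sign of the interval in the output above.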

Next, we fit a simple linear model of mpg on am to see how much of the variance in MPG is explained by transmission type alone.

# m previously held the manual-car subset; it is reused here for the model
m <- lm(mpg ~ am, data = mtcars)
summary(m)
## 
## Call:
## lm(formula = mpg ~ am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.3923 -3.0923 -0.2974  3.2439  9.5077 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   17.147      1.125  15.247 1.13e-15 ***
## am             7.245      1.764   4.106 0.000285 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.902 on 30 degrees of freedom
## Multiple R-squared:  0.3598, Adjusted R-squared:  0.3385 
## F-statistic: 16.86 on 1 and 30 DF,  p-value: 0.000285

The R-squared is 0.3598, so transmission type alone explains only about 36% of the variance in MPG.
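
To make the meaning of R-squared concrete, the sketch below recomputes it as one minus the ratio of the residual sum of squares to the total sum of squares. The names fit_am, rss, tss and r2 are introduced here for illustration.

```r
fit_am <- lm(mpg ~ am, data = mtcars)

# Residual sum of squares: variation left over after fitting the model.
rss <- sum(residuals(fit_am)^2)

# Total sum of squares: variation of mpg around its overall mean.
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# R-squared = 1 - RSS / TSS, the share of variance the model explains.
r2 <- 1 - rss / tss
round(r2, 4)
```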

Other variables may confound this relationship, so we fit a larger model that adds weight, horsepower and number of cylinders, and compare it with the simple model using an ANOVA F-test for nested models.

model <- lm(mpg~am + wt + hp + cyl, data = mtcars)
anova(m,model)
## Analysis of Variance Table
## 
## Model 1: mpg ~ am
## Model 2: mpg ~ am + wt + hp + cyl
##   Res.Df   RSS Df Sum of Sq      F    Pr(>F)    
## 1     30 720.9                                  
## 2     27 170.0  3     550.9 29.166 1.274e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
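
The F statistic in this table can be rebuilt from the residual sums of squares and degrees of freedom of the two models. The following is a sketch of that arithmetic; fit_small, fit_large and the other names are illustrative and not part of the original analysis.

```r
fit_small <- lm(mpg ~ am, data = mtcars)
fit_large <- lm(mpg ~ am + wt + hp + cyl, data = mtcars)

# For lm objects, deviance() returns the residual sum of squares.
rss_small <- deviance(fit_small)
rss_large <- deviance(fit_large)
df_small  <- df.residual(fit_small)
df_large  <- df.residual(fit_large)

# F = (drop in RSS per added term) / (residual variance of the larger model)
f_stat <- ((rss_small - rss_large) / (df_small - df_large)) /
  (rss_large / df_large)

# Upper-tail probability of the F distribution gives the p-value.
p_val <- pf(f_stat, df_small - df_large, df_large, lower.tail = FALSE)

round(f_stat, 3)
p_val
```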

The F-test p-value (1.274e-08) shows that Model 2 fits significantly better than Model 1. Model 2 is summarized below:

summary(model)
## 
## Call:
## lm(formula = mpg ~ am + wt + hp + cyl, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4765 -1.8471 -0.5544  1.2758  5.6608 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 36.14654    3.10478  11.642 4.94e-12 ***
## am           1.47805    1.44115   1.026   0.3142    
## wt          -2.60648    0.91984  -2.834   0.0086 ** 
## hp          -0.02495    0.01365  -1.828   0.0786 .  
## cyl         -0.74516    0.58279  -1.279   0.2119    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.509 on 27 degrees of freedom
## Multiple R-squared:  0.849,  Adjusted R-squared:  0.8267 
## F-statistic: 37.96 on 4 and 27 DF,  p-value: 1.025e-10

Conclusion

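The final model explains 84.9% of the variance in MPG. Holding weight, horsepower and number of cylinders constant, manual cars achieve an estimated 1.48 mpg more than automatic cars on average. However, this coefficient is not statistically significant (p = 0.3142), so once these variables are accounted for the data do not show a clear effect of transmission type on MPG.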
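
Since the summary above reports a p-value of 0.3142 for the am coefficient, a confidence interval helps to quantify how uncertain the 1.48 mpg estimate is. The sketch below refits the same model under the illustrative name fit_full.

```r
fit_full <- lm(mpg ~ am + wt + hp + cyl, data = mtcars)

# 95% confidence interval for the adjusted manual-vs-automatic difference.
ci_am <- confint(fit_full, "am", level = 0.95)
ci_am

# TRUE if the interval includes zero, i.e. no clear effect at the 5% level.
ci_am[1, 1] < 0 && ci_am[1, 2] > 0
```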

The residual diagnostic plots in the appendix suggest that the residuals are approximately normally distributed.
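
One possible numeric check of the normality statement is a Shapiro-Wilk test on the model residuals. This test is not part of the original analysis; it is shown only as a sketch, with fit_full and sw as illustrative names.

```r
fit_full <- lm(mpg ~ am + wt + hp + cyl, data = mtcars)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
sw <- shapiro.test(residuals(fit_full))
sw$statistic
sw$p.value   # a small p-value would be evidence against normality
```

With only 32 observations the test has limited power, so it is best read together with the Q-Q panel produced by plot(model).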

Appendix

par(mfrow = c(2, 2))  # show the four diagnostic plots in one panel
plot(model)