Executive Summary

The mtcars dataset (Motor Trend Car Road Tests) was explored with regression techniques to answer two questions: whether fuel economy (MPG) differs between cars with automatic and manual transmissions, and by how much. The analysis, supported by data visualization, found a significant difference: in the selected regression model, cars with manual transmissions outperformed those with automatic transmissions by 1.81 MPG. Limitations of the analysis are also considered.

Exploratory Data Analysis

  1. Load the {mtcars} dataset.
  2. Generate a summary of the variables and inspect them for further investigation.
  3. Conduct statistical analysis:
## [1] 32 11
## The following object is masked from package:ggplot2:
## 
##     mpg
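
The loading and inspection steps above can be sketched as follows. This is a minimal illustration, not the report's original code; it assumes the copy of mtcars that ships with base R in the datasets package, so nothing has to be downloaded.

```r
# mtcars is part of base R's datasets package.
data(mtcars)

# Rows and columns of the data frame (32 cars, 11 variables).
dim(mtcars)

# Variable types and a numeric summary of the response variable.
str(mtcars)
summary(mtcars$mpg)
```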

Statistical Analysis of Coefficients and Test for Model Fit

Statistical Inference

T-Test transmission type and MPG

## [1] 0.001373638

The t-test rejects the null hypothesis that the difference in mean MPG between transmission types is 0 (p = 0.0014).

## mean in group 0 mean in group 1 
##        17.14737        24.39231

The estimated difference between the two transmission types is 7.24494 MPG in favor of manual transmission.
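
A sketch of how the p-value and group means above might be obtained. It assumes a Welch two-sample t-test (R's default for `t.test`) and the mtcars coding of `am` as 0 = automatic, 1 = manual; the variable names `tt` and `diff_mpg` are illustrative only.

```r
data(mtcars)

# Welch two-sample t-test of MPG by transmission type (am: 0 = automatic, 1 = manual).
tt <- t.test(mpg ~ am, data = mtcars)

tt$p.value    # p-value of the test
tt$estimate   # mean MPG in each group

# Difference in group means: manual minus automatic.
diff_mpg <- unname(tt$estimate[2] - tt$estimate[1])
diff_mpg
```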

Regression Analysis

Fit the full model of the data

Since none of the coefficients in the full model has a p-value below 0.05, the full model alone does not show which variables are significant predictors of MPG.
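
The full-model check can be illustrated with the sketch below. It assumes all ten predictors are entered as numeric variables, which may differ from how the original analysis coded them; `full_fit` and `p_vals` are illustrative names.

```r
data(mtcars)

# Regress MPG on every other variable in the dataset.
full_fit <- lm(mpg ~ ., data = mtcars)

# Extract the p-values of the predictors (dropping the intercept row).
coefs <- summary(full_fit)$coefficients
p_vals <- coefs[-1, "Pr(>|t|)"]

sort(p_vals)          # smallest p-values first
any(p_vals < 0.05)    # is any predictor significant at the 0.05 level?
```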

Backward selection was used to identify the most significant variables; the resulting regression model has 4 predictors (cylinders, horsepower, weight, transmission). Its R-squared value of 0.8659 indicates that the model explains about 87% of the variance in MPG.
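
One way backward selection could be run is sketched below with base R's `step()`. The choice of which columns to treat as categorical is an assumption made for this example, and the selected formula depends on that choice, so the output is not claimed to match the report exactly.

```r
data(mtcars)
cars <- mtcars

# Assumption: these columns are treated as categorical rather than numeric.
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
cars[factor_cols] <- lapply(cars[factor_cols], factor)

# Start from the full model and drop terms while AIC improves.
full_fit <- lm(mpg ~ ., data = cars)
selected <- step(full_fit, direction = "backward", trace = 0)

formula(selected)              # predictors retained by the procedure
summary(selected)$r.squared    # share of MPG variance explained
```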

The retained coefficients are treated as statistically significant, with p-values below 0.05. Holding the other variables constant, the coefficients indicate that increasing the number of cylinders from 4 to 6 decreases MPG by 3.03, and the 8-cylinder coefficient corresponds to a decrease of 2.16 (assuming R's default factor coding, each cylinder level is measured against the 4-cylinder baseline rather than against the previous level). Increasing horsepower decreases MPG by 3.21 for every 100 horsepower. Weight decreases MPG by 2.5 for each 1000 lbs increase. A manual transmission improves MPG by 1.81.
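
The coefficient readings above can be reproduced with a sketch like the following. It assumes cylinders and transmission enter the model as factors with R's default treatment contrasts; the `effects` vector and its labels are illustrative, not part of the original analysis.

```r
data(mtcars)

# Assumption: cyl and am are coded as factors (baseline: 4 cylinders, automatic).
cars <- transform(mtcars, cyl = factor(cyl), am = factor(am))
fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)
b <- coef(fit)

# Rescale the coefficients to the units used in the text.
effects <- c(
  cyl_4_to_6          = unname(b["cyl6"]),
  cyl_4_to_8          = unname(b["cyl8"]),
  per_100_hp          = unname(b["hp"]) * 100,  # hp coefficient is per 1 hp
  per_1000_lbs        = unname(b["wt"]),        # wt is recorded in 1000 lbs
  manual_vs_automatic = unname(b["am1"])
)
round(effects, 2)

summary(fit)$r.squared
```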

A stepwise regression was used to determine the best-fit regression model. Variables were iteratively added and removed based on the Akaike Information Criterion (AIC), chosen because of the relatively small sample size. This stepwise process aimed to balance model

Several regression models were considered during the analysis. Polynomial regression was explored in addition to the linear regression model to capture potential non-linear relationships. Additionally, interaction terms were included for interdependent effects between predictor variables.
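
As an illustration of these alternatives, the sketch below fits a quadratic term in weight and a weight-by-transmission interaction. The quadratic model is an assumed example of polynomial regression; the interaction formula is taken from the coefficient table in the Appendix, and the object names are illustrative.

```r
data(mtcars)
cars <- transform(mtcars, am = factor(am))

# Polynomial regression: does a squared weight term improve on a straight line?
linear_fit <- lm(mpg ~ wt, data = cars)
quad_fit   <- lm(mpg ~ wt + I(wt^2), data = cars)
anova(linear_fit, quad_fit)   # F-test comparing the nested models

# Interaction model: the effect of weight is allowed to differ by transmission.
inter_fit <- lm(mpg ~ wt + qsec + am + wt:am, data = cars)
round(summary(inter_fit)$coefficients, 4)
```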

Alternative regression models, such as ridge or lasso regression, were considered to address multicollinearity and prevent overfitting. These regularization techniques proved beneficial for this dataset, which contains multiple correlated predictors.
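
A minimal ridge regression sketch is shown below, using `lm.ridge` from the MASS package that is distributed with R. The lambda grid and the use of generalized cross-validation (GCV) to pick the penalty are assumptions made for this example; lasso is not shown because it would need an add-on package such as glmnet.

```r
library(MASS)
data(mtcars)

# Assumed grid of ridge penalties; lambda = 0 corresponds to ordinary least squares.
lambdas <- seq(0, 10, by = 0.1)
ridge_fit <- lm.ridge(mpg ~ ., data = mtcars, lambda = lambdas)

# Pick the penalty with the smallest generalized cross-validation score.
best_idx <- which.min(ridge_fit$GCV)
best_lambda <- lambdas[best_idx]
best_lambda

# Coefficients (on the original scale) at the selected penalty.
coef(ridge_fit)[best_idx, ]
```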

Findings

The results of the exploratory data analysis and the linear regression models address the two questions posed in this report:

Is an automatic or manual transmission better for MPG?

The comparison of “Manual” and “Automatic” transmission types with respect to miles per gallon (MPG) relied on the significance of the coefficients. The coefficients for transmission type differed significantly from zero, and their associated p-values were below the chosen significance level (e.g., 0.05). In short, manual transmission demonstrated significantly better fuel economy, measured as MPG.

What is the quantifiable difference of MPG between automatic and manual transmissions?

Further analysis of the “mtcars” dataset revealed a significant MPG difference between automatic and manual transmissions. The unadjusted difference in group means is 7.24 MPG in favor of manual transmission; after adjusting for cylinders, horsepower, and weight, the regression coefficient puts the difference at 1.81 MPG. The result is statistically justified, with p-values confirming the significance of the transmission type in explaining the observed MPG variations.

Discussion

The consideration of various regression models, both linear and non-linear, allowed for a comprehensive evaluation of the dataset, ensuring the chosen model provided the best balance of explanatory power and interpretability. However, the dataset is small (n = 32), and a visual review of the car model types shows that, with the Merc and Toyota entries removed, the remaining models make up close to 72% of the dataset. The interpretation of the regression analysis and its results should therefore be regarded with caution.
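
The share quoted above can be checked with a short sketch. It assumes one reading of the 72% figure, namely the fraction of rows that are neither Merc nor Toyota, and it takes the first word of each row name as the make; both are assumptions for illustration.

```r
data(mtcars)

# Assumption: the make is the first word of each row name (e.g. "Merc 240D" -> "Merc").
make <- sub(" .*", "", rownames(mtcars))

# Number of cars per make, most frequent first.
sort(table(make), decreasing = TRUE)

# Share of the dataset that is neither Merc nor Toyota.
share_other <- mean(!make %in% c("Merc", "Toyota"))
round(share_other, 2)
```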

Appendix

## [1] 0.001373638
## mean in group 0 mean in group 1 
##        17.14737        24.39231
##             Length Class  Mode     
## statistic   1      -none- numeric  
## parameter   1      -none- numeric  
## p.value     1      -none- numeric  
## conf.int    2      -none- numeric  
## estimate    2      -none- numeric  
## null.value  1      -none- numeric  
## stderr      1      -none- numeric  
## alternative 1      -none- character
## method      1      -none- character
## data.name   1      -none- character
## Start:  AIC=226.88
## hp ~ mpg + wt + drat + qsec
## 
##        Df Sum of Sq   RSS    AIC
## - drat  1      94.9 28183 224.98
## - mpg   1    1519.4 29608 226.56
## <none>              28088 226.88
## - wt    1    3861.9 31950 229.00
## - qsec  1   28102.2 56190 247.06
## 
## Step:  AIC=224.98
## hp ~ mpg + wt + qsec
## 
##        Df Sum of Sq   RSS    AIC
## - mpg   1    1424.5 29608 224.56
## <none>              28183 224.98
## + drat  1      94.9 28088 226.88
## - wt    1    3797.9 31981 227.03
## - qsec  1   29625.1 57808 245.97
## 
## Step:  AIC=224.56
## hp ~ wt + qsec
## 
##        Df Sum of Sq   RSS    AIC
## <none>              29608 224.56
## + mpg   1      1425 28183 224.98
## + drat  1         0 29608 226.56
## - wt    1     43026 72633 251.28
## - qsec  1     52881 82489 255.35
## 
## Call:
## lm(formula = hp ~ wt + qsec, data = mtcars)
## 
## Coefficients:
## (Intercept)           wt         qsec  
##      441.26        38.67       -23.47
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  9.723053  5.8990407  1.648243 0.1108925394
## wt          -2.936531  0.6660253 -4.409038 0.0001488947
## qsec         1.016974  0.2520152  4.035366 0.0004030165
## am1         14.079428  3.4352512  4.098515 0.0003408693
## wt:am1      -4.141376  1.1968119 -3.460340 0.0018085763
## [1] 0