Motivation

Modern cars chase performance: more horsepower, faster acceleration, higher top speeds. But performance is never free.

From an engineering and economic perspective, the price of performance is energy efficiency, and fuel consumption is where that trade-off becomes measurable.

Core statistical question:

As engine power increases, does fuel efficiency systematically decline?

This question is ideal for statistical analysis because it involves:

  • a continuous, measurable response (fuel efficiency in miles per gallon),
  • a quantitative predictor tied directly to performance (engine horsepower),
  • and natural real-world variability driven by design, weight, and drivetrain differences.

Rather than relying on intuition, we use data to quantify this trade-off.

The Data

To investigate this relationship, we analyze the built-in mtcars dataset, which contains technical specifications for 32 production automobiles.

This dataset is well-suited for modeling efficiency–performance trade-offs because it includes both engine characteristics and vehicle design factors.

Key variables:

  • mpg — miles per gallon, measuring fuel efficiency
  • hp — horsepower, representing engine power and performance potential
  • wt — vehicle weight (in thousands of pounds), a critical confounding factor

Together, these variables allow us to move beyond simple comparisons and examine how power and mass jointly influence efficiency, using visualization and regression techniques.
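As a quick orientation, the three variables can be pulled out and summarized before any plotting. This is a minimal sketch; the name `car_data` is chosen here for illustration and does not come from the original analysis.

```r
# mtcars ships with base R (datasets package), so nothing needs to be downloaded.
# car_data is an illustrative name for the three columns used in this analysis.
car_data <- mtcars[, c("mpg", "hp", "wt")]

dim(car_data)      # number of cars and number of selected variables
head(car_data)     # first few rows, labelled by car model
summary(car_data)  # range and quartiles of each variable
```

The summary shows the observed range of each variable, which matters later when deciding how far the fitted model can reasonably be extrapolated.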

Exploratory Visualization

Before modeling, we pressure-test the relationship visually:
Is there a clear efficiency penalty as horsepower rises, or is the pattern mostly noise?

We also want to spot outliers (cars that are unusually efficient or inefficient for their power).

What the plot suggests (visually):

  • The relationship is strongly negative: higher horsepower is generally associated with lower mpg.
  • Heavier cars (larger bubbles) cluster in the high-hp / low-mpg region, hinting that weight may be a confounder.
  • A few models behave like exceptions — which is exactly why we move next to regression.

Plot: mpg versus hp, with bubble size proportional to vehicle weight
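A bubble chart of this kind can be approximated in base R as follows. This is a sketch under assumptions: the original figure may have been drawn with a different plotting library, and the bubble scaling factor and colours here are arbitrary choices.

```r
# Scatter plot of mpg against hp; point size is scaled from weight (wt).
# The divisor 2 is an arbitrary scaling choice made for readability.
plot(mtcars$hp, mtcars$mpg,
     cex = mtcars$wt / 2,
     pch = 21, bg = "lightblue",
     xlab = "Horsepower (hp)",
     ylab = "Miles per gallon (mpg)",
     main = "Fuel efficiency vs. engine power")

# Pearson correlation as a numeric companion to the visual impression
r_hp_mpg <- cor(mtcars$hp, mtcars$mpg)
r_hp_mpg
```

A correlation well below zero is consistent with the downward pattern in the plot, but it says nothing yet about the role of weight.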

Adding a Regression Line

Visual patterns are informative, but regression allows us to quantify the relationship between performance and efficiency.

By fitting a linear model, we estimate the average change in fuel efficiency associated with a one-unit increase in horsepower, while smoothing out random variation in individual vehicles.

Interpretation:
The fitted regression line has a clear negative slope, indicating that, on average, vehicles with higher horsepower achieve lower fuel efficiency.

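The narrow confidence band suggests that the negative slope is estimated with reasonable precision across the observed horsepower range. The band alone cannot show whether a few extreme observations drive the trend; that question calls for residual and influence diagnostics.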

Plot: mpg versus hp with the fitted regression line and its confidence band
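One way to draw a fitted line with a confidence band in base R is sketched below. The names `fit`, `hp_grid`, and `band` are illustrative, and the 95% level is an assumption, since the level used in the original figure is not stated.

```r
fit <- lm(mpg ~ hp, data = mtcars)

# Evaluate the fitted line and its confidence band on a grid of hp values
# spanning the observed range (no extrapolation).
hp_grid <- data.frame(hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 100))
band <- predict(fit, newdata = hp_grid, interval = "confidence", level = 0.95)

plot(mtcars$hp, mtcars$mpg, pch = 19,
     xlab = "Horsepower (hp)", ylab = "Miles per gallon (mpg)")
lines(hp_grid$hp, band[, "fit"], lwd = 2)   # fitted regression line
lines(hp_grid$hp, band[, "lwr"], lty = 2)   # lower confidence limit
lines(hp_grid$hp, band[, "upr"], lty = 2)   # upper confidence limit
```

The band describes uncertainty about the mean mpg at each horsepower value, not the spread of individual cars; a prediction interval (`interval = "prediction"`) would be wider.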

Statistical Model

To quantify the relationship between performance and efficiency, we use a simple linear regression model:

\[ \text{mpg}_i = \beta_0 + \beta_1\,\text{hp}_i + \varepsilon_i \] where:

  • \(\beta_0\) is the intercept, the expected fuel efficiency when horsepower is 0
  • \(\beta_1\) is the slope, the average change in fuel efficiency (mpg) for a 1-unit increase in horsepower
  • \(\varepsilon_i\) is a random error term, capturing unobserved factors and natural variability across vehicles

Note

The sign of \(\beta_1\) determines the direction of the relationship.
If \(\beta_1 < 0\), increases in engine power are associated with systematic decreases in fuel efficiency.

This model assumes a linear relationship, constant error variance, and independent errors. These assumptions cannot be verified directly; they are assessed through residual diagnostics.
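A minimal sketch of such diagnostics, using the standard plots that R provides for `lm` objects, might look like this. The object name `fit` is illustrative, and the choice of which plots to inspect is an assumption rather than the original author's procedure.

```r
fit <- lm(mpg ~ hp, data = mtcars)

# Residuals vs fitted values: curvature hints at a nonlinear relationship.
plot(fit, which = 1)

# Scale-location plot: an upward or downward trend hints at non-constant variance.
plot(fit, which = 3)

# With an intercept in the model, least squares residuals sum to zero
# up to floating-point error.
res <- resid(fit)
sum(res)
```

Independence of errors cannot be read off these plots; it depends mainly on how the data were collected.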

Model Estimation in R

The regression parameters are estimated using ordinary least squares (OLS), which selects coefficients that minimize the total squared prediction error.

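# Fit mpg as a linear function of hp by ordinary least squares
model <- lm(mpg ~ hp, data = mtcars)
# Extract the estimated intercept and slope
coef(model)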
## (Intercept)          hp 
## 30.09886054 -0.06822828

The fitted model takes the form:

\[ \widehat{\text{mpg}} = \hat{\beta}_0 + \hat{\beta}_1\,\text{hp} \]

Interpretation of the estimates:

  • The estimated slope \(\hat{\beta}_1 \approx -0.068\) is negative: higher horsepower is associated with lower fuel efficiency.
  • Numerically, each additional unit of horsepower is associated with about 0.068 fewer miles per gallon on average, or roughly 6.8 mpg per 100 hp.
  • This provides a quantitative measure of the efficiency cost associated with higher engine power.

The intercept \(\hat{\beta}_0 \approx 30.1\) anchors the regression line. Its literal interpretation, the expected mpg of a car with zero horsepower, is not physically meaningful, because hp = 0 lies far outside the observed data range.
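To make the estimates concrete, the slope can be rescaled and the model used for prediction inside the observed range. This is a sketch; `fit`, `slope`, and `preds` are illustrative names, and the horsepower values 100 and 200 are arbitrary examples.

```r
fit <- lm(mpg ~ hp, data = mtcars)
slope <- coef(fit)[["hp"]]

# Expected change in mpg associated with 100 additional horsepower
slope * 100

# 95% confidence interval for the slope
confint(fit, "hp", level = 0.95)

# Predicted mean mpg at two horsepower values inside the observed range
preds <- predict(fit, newdata = data.frame(hp = c(100, 200)))
preds
```

The difference between the two predictions equals 100 times the slope, which is simply the linear model restated.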

Beyond Two Dimensions

Fuel efficiency is inherently multivariate. While horsepower captures engine power, vehicle weight strongly mediates how that power translates into fuel consumption.

To examine these effects simultaneously, we use an interactive three-dimensional visualization.

Plot: interactive 3D scatter of mpg against hp and wt

Key multivariate insights:

  • At similar horsepower levels, heavier vehicles tend to achieve lower fuel efficiency.
  • The plot hints that the effect of weight may differ across horsepower levels, though a visual impression alone cannot establish the size or direction of that difference.
  • The least efficient cars combine high horsepower and high weight, which raises the question of an interaction between power and mass.

This visualization motivates extending the model beyond simple regression to include additional predictors.

Together, horsepower and weight explain substantially more variation in fuel efficiency than either variable alone.
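The comparison can be checked by fitting the candidate models and reading off their R-squared values. This is a sketch with illustrative object names; the interaction model at the end is an assumption about how one might follow up on the power-and-mass question, not a step taken in the original analysis.

```r
fit_hp   <- lm(mpg ~ hp,      data = mtcars)  # horsepower only
fit_wt   <- lm(mpg ~ wt,      data = mtcars)  # weight only
fit_both <- lm(mpg ~ hp + wt, data = mtcars)  # both predictors, additive

# Proportion of variance in mpg explained by each model
r2 <- c(hp   = summary(fit_hp)$r.squared,
        wt   = summary(fit_wt)$r.squared,
        both = summary(fit_both)$r.squared)
round(r2, 3)

# Coefficients of the two-predictor model: each slope is now
# interpreted holding the other predictor fixed.
coef(fit_both)

# Optional follow-up: allow the effect of weight to vary with horsepower.
fit_int <- lm(mpg ~ hp * wt, data = mtcars)
coef(fit_int)[["hp:wt"]]
```

Because R-squared never decreases when a predictor is added, adjusted R-squared (`summary(fit_both)$adj.r.squared`) is a fairer basis for comparing models of different size.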

Least Squares Principle

To estimate the regression line, we choose the coefficients that minimize the total squared prediction error:

\[ \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \]

Here, each term represents the squared vertical distance between an observed fuel efficiency value and the value predicted by the model.

Why squared error?

  • Squaring penalizes large prediction mistakes more heavily.
  • Positive and negative errors do not cancel out.
  • The optimization problem has a unique closed-form solution, provided the predictor is not constant across observations.

In the context of cars, this means selecting the line that best balances under- and over-prediction of fuel efficiency across all vehicles.

Solving this minimization problem yields the least squares estimates
\(\hat{\beta}_0\) (intercept) and \(\hat{\beta}_1\) (slope).
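For simple linear regression the minimizers have the well-known closed form \(\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). The sketch below computes them by hand and compares the result with `lm()`; the variable names `x`, `y`, `b0`, `b1`, and `rss` are illustrative.

```r
x <- mtcars$hp   # predictor
y <- mtcars$mpg  # response

# Closed-form least squares estimates for simple linear regression
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
c(intercept = b0, slope = b1)

# Agreement with lm() up to floating-point tolerance
same_as_lm <- all.equal(unname(coef(lm(mpg ~ hp, data = mtcars))), c(b0, b1))
same_as_lm

# Residual sum of squares at the minimum: the quantity being minimized
rss <- sum((y - (b0 + b1 * x))^2)
rss
```

Any other choice of intercept and slope gives a larger residual sum of squares than `rss`, which is what makes these estimates the least squares solution.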

Interpretation

From the statistical analysis, we conclude that:

  • Fuel efficiency decreases systematically as engine horsepower increases.
  • The regression model quantifies this trade-off, providing an average efficiency loss per additional unit of power.
  • Visualizations reinforce the model results by revealing structure, trends, and multivariate effects.

Together, these tools show how statistical modeling transforms raw automotive data into interpretable insight.

Conclusion

Does speed kill efficiency?

From a statistical perspective, the answer is yes, on average.

Across the data, vehicles with greater engine power tend to achieve lower fuel efficiency. Simple and multivariate visualizations reveal a clear downward trend, while regression modeling quantifies the trade-off between horsepower and miles per gallon.

Most importantly, this analysis demonstrates how statistical tools transform real-world data into evidence-based insight. By combining visualization, modeling, and interpretation, we can move beyond intuition and rigorously evaluate questions about performance, efficiency, and design.