SIMPLE LINEAR REGRESSION

Author: William Mwita
Affiliation: Computational Statistics

Abstract

This analysis investigates the relationship between vehicle weight and fuel efficiency using data from the Motor Trend US magazine dataset. A linear regression model was employed to quantify the impact of vehicle weight (in 1000 lbs) on miles per gallon (Mpg). The findings reveal a significant negative correlation, indicating that as vehicle weight increases, Mpg decreases. Practical recommendations are provided for car manufacturers to optimize vehicle design strategies aimed at improving fuel efficiency while maintaining safety and performance standards.


Introduction

For this project, I will exclusively utilize the R programming language. The dataset selected for analysis is the built-in mtcars dataset in R. Below, I outline the problem statement that will be addressed using the Simple Linear Regression algorithm.

Problem Statement

The objective of this project is to quantify the relationship between fuel efficiency and vehicle weight in the sample from the 1974 Motor Trend US magazine dataset. This dataset includes measurements of fuel consumption and 10 aspects of automobile design and performance for 32 vehicles (1973-74 models). One of the critical features considered by the automotive industry is miles per gallon (Mpg), which indicates how far a vehicle can travel on one gallon of fuel. Understanding how Mpg relates to vehicle weight (in 1000 lbs) is crucial for manufacturers in designing models that cater to a broad customer base. As a statistician, my focus is on exploring the association between Mpg and vehicle weight in this context.

Simple Linear Regression Model between Miles per Gallon and Vehicle Weight

Using R, I conducted a statistical analysis and fitted a model named milliageModel. Below are the model formula and the estimated coefficients that describe the relationship.
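A minimal sketch of how such a fit could be reproduced. It assumes that mtCars_data is simply a copy of R's built-in mtcars data frame, which the report does not state explicitly.

```r
# Assumption: mtCars_data is a plain copy of the built-in mtcars data frame.
mtCars_data <- mtcars

# Regress Mpg on weight (wt, in 1000 lbs). method = "qr" is lm()'s default
# and is written out only to match the Call shown in the output.
milliageModel <- lm(mpg ~ wt, data = mtCars_data, method = "qr")

print(milliageModel)   # the Call and the rounded coefficients
coef(milliageModel)    # the same coefficients with more digits
```

Printing the model object gives the Call and rounded coefficients, while coef() returns the named coefficient vector with more decimal places.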


Call:
lm(formula = mpg ~ wt, data = mtCars_data, method = "qr")

Coefficients:
(Intercept)           wt  
     37.285       -5.344  

Regression Model with Coefficient Values

(Intercept)          wt 
  37.285126   -5.344472 

The regression equation (the dependent variable Y is Mpg and the independent variable X is Weight in 1000 lbs):

\[ \text{Mpg} = 37.2851 - 5.3445 \times \text{Weight} \]
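As a worked example of the equation, the sketch below predicts Mpg for a hypothetical vehicle weighing 3000 lbs (wt = 3), once by hand from the printed coefficients and once with predict(). The weight value is chosen purely for illustration, and mtCars_data is assumed to be a copy of mtcars.

```r
# Assumption: mtCars_data is a plain copy of the built-in mtcars data frame.
mtCars_data <- mtcars
milliageModel <- lm(mpg ~ wt, data = mtCars_data)

# Hypothetical vehicle weighing 3000 lbs (wt is measured in 1000 lbs).
new_vehicle <- data.frame(wt = 3)

# Prediction by hand from the printed coefficients.
by_hand <- 37.285126 - 5.344472 * 3

# The same prediction from the fitted model object.
by_predict <- unname(predict(milliageModel, newdata = new_vehicle))

round(c(by_hand = by_hand, by_predict = by_predict), 2)
```

Both values come to about 21.25 Mpg, since 37.285126 - 5.344472 × 3 = 21.25171.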

Summary of the Model


Call:
lm(formula = mpg ~ wt, data = mtCars_data, method = "qr")

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5432 -2.3647 -0.1252  1.4096  6.8727 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
wt           -5.3445     0.5591  -9.559 1.29e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared:  0.7528,    Adjusted R-squared:  0.7446 
F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
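The individual quantities in this summary can also be pulled out of the summary object for later use. A brief sketch, again assuming mtCars_data is a copy of mtcars; the name model_summary is introduced here for illustration.

```r
# Assumption: mtCars_data is a plain copy of the built-in mtcars data frame.
mtCars_data <- mtcars
milliageModel <- lm(mpg ~ wt, data = mtCars_data)

# model_summary is an illustrative name, not one used in the report.
model_summary <- summary(milliageModel)

model_summary$coefficients    # estimates, standard errors, t values, p-values
model_summary$r.squared       # multiple R-squared
model_summary$adj.r.squared   # adjusted R-squared
model_summary$sigma           # residual standard error
model_summary$fstatistic      # F value with numerator and denominator df
```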

Model Interpretation

From the above output of the linear regression analysis, my interpretation is as follows:

  1. Interpretation of Coefficients:
    • The intercept (37.2851) represents the estimated Mpg when the weight (wt) of the vehicle is zero. However, this is not practically meaningful since vehicle weight cannot be zero. It serves as a baseline reference point.
    • The coefficient for wt (-5.3445) indicates that for each additional unit of weight (1000 lbs), the predicted Mpg decreases by approximately 5.34 miles per gallon.
  2. Significance of Coefficients:
    • Both the intercept and wt coefficients are statistically significant (denoted by the ‘***’ next to the coefficients). This suggests that there is strong evidence to reject the null hypothesis that these coefficients are zero.
    • The p-values associated with both coefficients are extremely small (< 0.001), so estimates this far from zero would be very unlikely if the true coefficients were zero.
  3. Fit of the Model:
    • The multiple R-squared value is 0.7528, which means that approximately 75.28% of the variability in Mpg can be explained by the linear regression model with weight (wt) as the predictor.
    • The adjusted R-squared (0.7446) is slightly lower but adjusts for the number of predictors in the model, providing a more conservative estimate of the model’s goodness of fit.
  4. Residual Analysis:
    • The residual standard error (3.046) gives an estimate of the standard deviation of the residuals, which represents the variability in Mpg that is not explained by the model.
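    • The summary of residuals (Min, 1Q, Median, 3Q, Max) shows a median close to zero (-0.1252), although the largest positive residual (6.8727) lies farther from zero than the largest negative one (-4.5432), which hints at mild right skew. A five-number summary alone cannot confirm normality or rule out patterns in the residuals; a residual plot or Q-Q plot is needed for that.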
  5. Overall Model Assessment:
    • The F-statistic (91.38) with a very low p-value (1.294e-10) indicates that the overall regression model is statistically significant. This suggests that the model as a whole provides a better fit to the data than a model with no predictors.
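The residual points above can be checked more directly than the five-number summary allows. The following is a sketch of some common diagnostics, not necessarily the checks used in this analysis; mtCars_data is assumed to be a copy of mtcars, and the names res and shapiro_result are illustrative.

```r
# Assumption: mtCars_data is a plain copy of the built-in mtcars data frame.
mtCars_data <- mtcars
milliageModel <- lm(mpg ~ wt, data = mtCars_data)

res <- residuals(milliageModel)

# For least squares with an intercept, the residual mean is zero by
# construction (up to floating-point error).
mean(res)

# Five-number summary, as reported in the summary() output.
quantile(res)

# Shapiro-Wilk test: a small p-value would suggest non-normal residuals.
shapiro_result <- shapiro.test(res)
shapiro_result

# Residuals against fitted values: look for curvature or a funnel shape.
plot(fitted(milliageModel), res,
     xlab = "Fitted Mpg", ylab = "Residual")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points near the line are consistent with normality.
qqnorm(res)
qqline(res)
```

The plots are the more informative part here; with only 32 observations, a formal normality test has limited power and is best read alongside them.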

The regression analysis suggests a strong and statistically significant negative relationship between vehicle weight and Mpg. Approximately 75% of the variability in Mpg can be explained by the linear relationship with weight. The model fits the sample well, as shown by its significant coefficients, high R-squared value, and residual standard error of about 3 Mpg; its validity also depends on the residual assumptions holding.
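Two quick checks support this reading. In simple linear regression, the multiple R-squared equals the squared Pearson correlation between the two variables, and a confidence interval for the slope shows how precisely the weight effect is estimated. This is a sketch, assuming mtCars_data is a copy of mtcars; the names r and ci are illustrative.

```r
# Assumption: mtCars_data is a plain copy of the built-in mtcars data frame.
mtCars_data <- mtcars
milliageModel <- lm(mpg ~ wt, data = mtCars_data)

# Pearson correlation between Mpg and weight; its square matches the
# multiple R-squared of the one-predictor model.
r <- cor(mtCars_data$mpg, mtCars_data$wt)
r^2
summary(milliageModel)$r.squared

# 95% confidence intervals for the intercept and the slope.
ci <- confint(milliageModel, level = 0.95)
ci
```

If the interval for wt lies entirely below zero, that agrees with the significant negative slope reported in the summary.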

Model Visualization

[Figure: model plot produced with ggplot2; image not included in this text.]
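A plot of this kind could be regenerated along the following lines. This is a sketch only: the original figure is not available here, so the choice of a scatter plot with a fitted line, the labels, and the object name p are all assumptions, as is mtCars_data being a copy of mtcars.

```r
library(ggplot2)

# Assumption: mtCars_data is a plain copy of the built-in mtcars data frame.
mtCars_data <- mtcars

# Scatter plot of Mpg against weight with the least-squares line and its
# 95% confidence band. p is an illustrative object name.
p <- ggplot(mtCars_data, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon (Mpg)",
    title = "Mpg against vehicle weight with fitted regression line"
  )

print(p)
```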

Conclusion

Based on the statistics from the linear regression model, here is my advice to a Product Manager in the car industry.

The analysis shows a statistically significant negative relationship between vehicle weight and miles per gallon (Mpg). Specifically, for every 1000 lbs increase in vehicle weight, the predicted Mpg decreases by approximately 5.34 units.

Vehicle weight plays a crucial role in fuel efficiency in this sample, with each 1000 lbs increase associated with a decrease of about 5.34 miles per gallon (Mpg). This underscores the need to prioritize weight management strategies in vehicle design. Lightweight materials such as aluminum, carbon fiber, and high-strength steel offer an opportunity to reduce weight while maintaining safety and performance standards. By optimizing the design of vehicle components, the industry can further enhance fuel economy without compromising durability.

These efforts align well with consumer expectations, where fuel efficiency is increasingly valued as a key factor in vehicle purchasing decisions. It is essential to leverage these insights to differentiate your products in the market. Highlighting improved fuel efficiency through targeted marketing campaigns and consumer education initiatives will resonate with an environmentally conscious customer base.

Exploring advancements in hybrid and electric vehicle technologies also holds promise, providing sustainable solutions that meet both regulatory requirements and consumer preferences for eco-friendly options. Continuous innovation and a proactive approach to competitive analysis will be pivotal in maintaining a leadership position in the dynamic automotive landscape.


The analysis was conducted by W. Mwita, a Statistics major with expertise in computational statistics, econometrics, and research analysis. This study aims to assist organizations in uncovering hidden insights from data, particularly in optimizing vehicle design strategies for enhanced fuel efficiency.