Linear Regression Project

Dr. Roberto Chang López /rchang@unah.edu.hn
2021-09-07

JuveYell

Source: DataSource.ai

Click here to see the MIT Machine Learning certificate: https://www.credential.net/4dd365ea-ea5a-46a2-a72e-539e70545c6e

Click here to see the Columbia Python for Managers certificate: https://certificates.emeritus.org/0a2e1de7-add2-4710-ad49-417d1dadfb61#gs.4a92hv

Contact:

Some of the dashboards I have built:

For the stock exchange: https://rchang.shinyapps.io/rchang-stock-exchange/

For weather conditions: https://rchang.shinyapps.io/rchang-app_clima_ho/

For machine learning: https://rchang.shinyapps.io/rchang-app/

For business and industrial applications: https://rchang.shinyapps.io/rchang-app_final_emp/

For dashboards with login: https://rchang.shinyapps.io/clase_3-shiny-2/_w_ae4e775f/_w_f249a9a1/?page=sign_in

and for Geographic Information Systems

Executive Summary

Motor Trend is a magazine about the automobile industry. It is interested in exploring the relationship between a set of variables and miles per gallon (MPG) (outcome), particularly:

“Is an automatic or manual transmission better for MPG?”

“Quantify the MPG difference between automatic and manual transmissions.”

Using a data set from Motor Trend Magazine along with linear regression and hypothesis testing, it can be concluded that there is a significant difference between the MPG of automatic and manual transmission cars.

To quantify the MPG difference between automatic and manual transmission cars, a linear regression model that accounts for weight (wt), transmission type (am) and quarter-mile time (qsec) was used. Holding these factors constant, manual transmission cars achieve 2.94 MPG more than automatic transmission cars.

Data Processing

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

Coefficients

From the simple linear regression of mpg on am (fitted below), manual transmission cars average 7.24 MPG more than automatic transmission cars. The R^2 value of this model is 0.3598, meaning that transmission type alone explains only 35.98% of the variance in mpg.

library(datasets)
data(mtcars)

dim(mtcars)
[1] 32 11
summary(mtcars)
      mpg             cyl             disp             hp       
 Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
 Median :19.20   Median :6.000   Median :196.3   Median :123.0  
 Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
      drat             wt             qsec             vs        
 Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
 1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
 Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
 Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
 3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
 Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
       am              gear            carb      
 Min.   :0.0000   Min.   :3.000   Min.   :1.000  
 1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
 Median :0.0000   Median :4.000   Median :2.000  
 Mean   :0.4062   Mean   :3.688   Mean   :2.812  
 3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :1.0000   Max.   :5.000   Max.   :8.000  
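
The summary above treats am as a number, which is why it reports a mean of 0.4062 for a 0/1 indicator. A minimal sketch of recoding it as a labelled factor, so that later output is easier to read. The copy name mt and the labels "automatic" and "manual" are assumptions made for this illustration, following the 0 = automatic, 1 = manual coding used in this report:

```r
library(datasets)
data(mtcars)

# Work on a copy so the built-in data frame is left untouched
# ('mt' is a hypothetical name used only in this sketch)
mt <- mtcars

# Recode the 0/1 indicator as a labelled factor
mt$am <- factor(mt$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Number of cars in each transmission group
counts <- table(mt$am)
print(counts)
```

The counts also show that the two groups are unequal in size, which is worth keeping in mind when comparing their means.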

Is an automatic or manual transmission better for MPG? For automatic:

summary(mtcars[mtcars$am==0,])
      mpg             cyl             disp             hp       
 Min.   :10.40   Min.   :4.000   Min.   :120.1   Min.   : 62.0  
 1st Qu.:14.95   1st Qu.:6.000   1st Qu.:196.3   1st Qu.:116.5  
 Median :17.30   Median :8.000   Median :275.8   Median :175.0  
 Mean   :17.15   Mean   :6.947   Mean   :290.4   Mean   :160.3  
 3rd Qu.:19.20   3rd Qu.:8.000   3rd Qu.:360.0   3rd Qu.:192.5  
 Max.   :24.40   Max.   :8.000   Max.   :472.0   Max.   :245.0  
      drat             wt             qsec             vs        
 Min.   :2.760   Min.   :2.465   Min.   :15.41   Min.   :0.0000  
 1st Qu.:3.070   1st Qu.:3.438   1st Qu.:17.18   1st Qu.:0.0000  
 Median :3.150   Median :3.520   Median :17.82   Median :0.0000  
 Mean   :3.286   Mean   :3.769   Mean   :18.18   Mean   :0.3684  
 3rd Qu.:3.695   3rd Qu.:3.842   3rd Qu.:19.17   3rd Qu.:1.0000  
 Max.   :3.920   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
       am         gear            carb      
 Min.   :0   Min.   :3.000   Min.   :1.000  
 1st Qu.:0   1st Qu.:3.000   1st Qu.:2.000  
 Median :0   Median :3.000   Median :3.000  
 Mean   :0   Mean   :3.211   Mean   :2.737  
 3rd Qu.:0   3rd Qu.:3.000   3rd Qu.:4.000  
 Max.   :0   Max.   :4.000   Max.   :4.000  
For manual:
summary(mtcars[mtcars$am==1,])
      mpg             cyl             disp             hp       
 Min.   :15.00   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
 1st Qu.:21.00   1st Qu.:4.000   1st Qu.: 79.0   1st Qu.: 66.0  
 Median :22.80   Median :4.000   Median :120.3   Median :109.0  
 Mean   :24.39   Mean   :5.077   Mean   :143.5   Mean   :126.8  
 3rd Qu.:30.40   3rd Qu.:6.000   3rd Qu.:160.0   3rd Qu.:113.0  
 Max.   :33.90   Max.   :8.000   Max.   :351.0   Max.   :335.0  
      drat            wt             qsec             vs        
 Min.   :3.54   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
 1st Qu.:3.85   1st Qu.:1.935   1st Qu.:16.46   1st Qu.:0.0000  
 Median :4.08   Median :2.320   Median :17.02   Median :1.0000  
 Mean   :4.05   Mean   :2.411   Mean   :17.36   Mean   :0.5385  
 3rd Qu.:4.22   3rd Qu.:2.780   3rd Qu.:18.61   3rd Qu.:1.0000  
 Max.   :4.93   Max.   :3.570   Max.   :19.90   Max.   :1.0000  
       am         gear            carb      
 Min.   :1   Min.   :4.000   Min.   :1.000  
 1st Qu.:1   1st Qu.:4.000   1st Qu.:1.000  
 Median :1   Median :4.000   Median :2.000  
 Mean   :1   Mean   :4.385   Mean   :2.923  
 3rd Qu.:1   3rd Qu.:5.000   3rd Qu.:4.000  
 Max.   :1   Max.   :5.000   Max.   :8.000  

Hence, the mean mpg is higher for manual cars (24.4) than for automatic cars (17.1).

Quantify the MPG difference between automatic and manual transmissions.
boxplot(mpg ~ am, data = mtcars, xlab = "Transmission", ylab = "Miles per gallon", main="Miles per gallon by Transmission Type")

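In the boxplot, manual (represented by 1) has a higher median mpg than automatic (represented by 0): 22.8 versus 17.3. The boxplot displays medians; the group means are compared in the next section.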
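
The numbers behind the boxplot can be read off without drawing it. A short sketch, assuming only base R: boxplot() returns its five-number summaries when called with plot = FALSE, and the third row of the stats matrix holds the group medians. The names bp and medians are chosen for this illustration:

```r
data(mtcars)

# Compute the boxplot statistics without drawing the plot
bp <- boxplot(mpg ~ am, data = mtcars, plot = FALSE)

# Row 3 of bp$stats is the median of each group; columns follow bp$names
medians <- setNames(bp$stats[3, ], bp$names)
print(medians)
```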

Hypothesis Testing

aggregate(mpg~am, data = mtcars, mean)
  am      mpg
1  0 17.14737
2  1 24.39231
The mean mpg for manual transmissions is 7.24 mpg higher than for automatic. Let alpha = 0.05.
auto <- mtcars[mtcars$am == 0,]
manual <- mtcars[mtcars$am == 1,]
t.test(auto$mpg, manual$mpg)

    Welch Two Sample t-test

data:  auto$mpg and manual$mpg
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.280194  -3.209684
sample estimates:
mean of x mean of y 
 17.14737  24.39231 

Since the p-value of 0.001374 is below alpha = 0.05, we reject the null hypothesis of equal means. There is a statistically significant difference between the mpg of manual and automatic transmissions.

In other words, the 95% confidence interval for the difference in means (-11.28 to -3.21) excludes zero. Note that the t-test compares the two groups directly and does not control for any other variable, so this is an unadjusted difference between the mean MPG of automatic and manual cars.
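
The figures reported by t.test() can be reproduced by hand, which makes explicit what the Welch test does. The sketch below uses the standard Welch formulas: the standard error is built from the two group variances without pooling them, and the degrees of freedom come from the Welch-Satterthwaite approximation. Variable names are chosen for this illustration:

```r
data(mtcars)
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

# Squared standard error of each group mean (variances are not pooled)
se2_auto   <- var(auto) / length(auto)
se2_manual <- var(manual) / length(manual)

# Welch t statistic for the difference in means (automatic minus manual)
t_stat <- (mean(auto) - mean(manual)) / sqrt(se2_auto + se2_manual)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (se2_auto + se2_manual)^2 /
  (se2_auto^2 / (length(auto) - 1) + se2_manual^2 / (length(manual) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

print(c(t = t_stat, df = df, p = p_value))
```

The fractional degrees of freedom (18.332 in the output above) are a consequence of this approximation, not a rounding artifact.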

m<-lm(mpg~am,data=mtcars)
summary(m)

Call:
lm(formula = mpg ~ am, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-9.3923 -3.0923 -0.2974  3.2439  9.5077 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   17.147      1.125  15.247 1.13e-15 ***
am             7.245      1.764   4.106 0.000285 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.902 on 30 degrees of freedom
Multiple R-squared:  0.3598,    Adjusted R-squared:  0.3385 
F-statistic: 16.86 on 1 and 30 DF,  p-value: 0.000285

From the above, we may conclude that automatic cars average 17.15 mpg (the intercept), while manual cars average 7.24 mpg more (the am coefficient).

Also, R^2 is 0.36, hence the model accounts for only 36% of the variance in mpg.
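
With a single 0/1 predictor, the regression is a restatement of the two group means. A minimal sketch that checks this and recomputes R^2 from the residual and total sums of squares. The names diff_means, rss and tss are chosen for this illustration:

```r
data(mtcars)
m <- lm(mpg ~ am, data = mtcars)

# Group means of mpg by transmission type
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# The am coefficient equals the difference between the two group means
diff_means <- unname(group_means["1"] - group_means["0"])

# R^2 = 1 - RSS / TSS
rss <- sum(residuals(m)^2)
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r_squared <- 1 - rss / tss

print(c(slope = coef(m)[["am"]], diff = diff_means, r2 = r_squared))
```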

Performing multiple linear regression, adding weight (wt), horsepower (hp) and number of cylinders (cyl) as predictors:

model <- lm(mpg~am + wt + hp + cyl, data = mtcars)
anova(m,model)
Analysis of Variance Table

Model 1: mpg ~ am
Model 2: mpg ~ am + wt + hp + cyl
  Res.Df   RSS Df Sum of Sq      F    Pr(>F)    
1     30 720.9                                  
2     27 170.0  3     550.9 29.166 1.274e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
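
The F statistic in the table compares the drop in residual sum of squares against the residual variance of the larger model. A short sketch of that arithmetic, using the standard nested-model F test. The names rss_small, rss_big and df_diff are chosen for this illustration:

```r
data(mtcars)
m     <- lm(mpg ~ am, data = mtcars)
model <- lm(mpg ~ am + wt + hp + cyl, data = mtcars)

# Residual sums of squares of the small and the large model
rss_small <- sum(residuals(m)^2)
rss_big   <- sum(residuals(model)^2)

# Number of extra parameters in the larger model
df_diff <- df.residual(m) - df.residual(model)

# F = (reduction in RSS per added parameter) / (residual variance of larger model)
f_stat  <- ((rss_small - rss_big) / df_diff) / (rss_big / df.residual(model))
p_value <- pf(f_stat, df_diff, df.residual(model), lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```

The small p-value says that wt, hp and cyl jointly improve the fit; it says nothing about the am coefficient on its own.
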
The final model is below:
summary(model)

Call:
lm(formula = mpg ~ am + wt + hp + cyl, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4765 -1.8471 -0.5544  1.2758  5.6608 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 36.14654    3.10478  11.642 4.94e-12 ***
am           1.47805    1.44115   1.026   0.3142    
wt          -2.60648    0.91984  -2.834   0.0086 ** 
hp          -0.02495    0.01365  -1.828   0.0786 .  
cyl         -0.74516    0.58279  -1.279   0.2119    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.509 on 27 degrees of freedom
Multiple R-squared:  0.849, Adjusted R-squared:  0.8267 
F-statistic: 37.96 on 4 and 27 DF,  p-value: 1.025e-10

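Apart from the intercept, the only coefficient significant at the 0.05 level is wt (p = 0.0086): each additional 1000 lbs of weight is associated with 2.61 fewer mpg, holding the other predictors fixed. The intercept is also significant, but it is not interpretable in this model because it corresponds to an automatic car with zero weight, zero horsepower and zero cylinders. The model could therefore be reformulated as a simple regression with mpg = y and wt = x: lm(formula = mpg ~ wt - 1, data = mtcars). The - 1 term removes the intercept and forces the fitted line through the origin.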
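
A minimal sketch of the simple regression of mpg on wt, fitted both through the origin (the -1 form) and with an intercept. The comparison with the intercept version is added here for illustration and is not part of the original analysis. The names fit_origin and fit_intercept are hypothetical:

```r
data(mtcars)

# Regression through the origin: mpg = b * wt
fit_origin <- lm(mpg ~ wt - 1, data = mtcars)

# Ordinary simple regression: mpg = a + b * wt
fit_intercept <- lm(mpg ~ wt, data = mtcars)

# Without an intercept the slope has the closed form sum(x * y) / sum(x^2)
slope_origin <- sum(mtcars$wt * mtcars$mpg) / sum(mtcars$wt^2)

print(coef(fit_origin))
print(coef(fit_intercept))
```

Because mpg and wt are both positive, the slope through the origin is necessarily positive, whereas the slope with an intercept is negative. The two fits therefore describe the effect of weight very differently, and the R^2 values printed by summary() are not comparable between them, since R uses an uncentered total sum of squares for models without an intercept.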

Conclusion

The four-predictor model explains 84.9% of the variance in mpg. In it, manual transmissions are estimated to have 1.478 more mpg than automatic on average, although this coefficient is not statistically significant (p = 0.3142).

Holding only the weight and quarter-mile time (qsec) of the car constant, a different specification from the model fitted above, manual transmission cars offer 2.94 MPG better fuel efficiency.
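
The output for the specification with weight and qsec is not shown in this report. A minimal sketch of fitting it, assuming the intended formula is mpg ~ wt + qsec + am. The name alt_model is hypothetical:

```r
data(mtcars)

# Model with weight, quarter-mile time and transmission type
alt_model <- lm(mpg ~ wt + qsec + am, data = mtcars)

# Estimated MPG difference, manual minus automatic, holding wt and qsec fixed
am_estimate <- coef(alt_model)[["am"]]

# 95% confidence interval for that difference
am_ci <- confint(alt_model, "am", level = 0.95)

print(round(am_estimate, 2))
print(am_ci)
```

Comparing this estimate with the 1.478 obtained from the model with wt, hp and cyl shows how strongly the adjusted difference depends on which covariates are held constant.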

APPENDIX Model Residuals
plot(model)
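
Called on an lm object, plot() draws several diagnostic plots one after another. A short sketch that places the four default plots on a single page. The 2 x 2 layout is a presentation choice assumed here, not something the original report specifies:

```r
data(mtcars)
model <- lm(mpg ~ am + wt + hp + cyl, data = mtcars)

# Arrange the four default diagnostic plots in a 2 x 2 grid,
# remembering the previous graphics settings
old_par <- par(mfrow = c(2, 2))
plot(model)

# Restore the previous layout
par(old_par)

# Sanity check: least-squares residuals sum to zero when an intercept is present
residual_sum <- sum(residuals(model))
```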