Question 1

## Warning in eval(substitute(list(...)), `_data`, parent.frame()): NAs introduced
## by coercion
##   mpg cylinders displacement horsepower weight acceleration year
## 1  18         8          307        130   3504         12.0   70
## 2  15         8          350        165   3693         11.5   70
## 3  18         8          318        150   3436         11.0   70
## 4  16         8          304        150   3433         12.0   70
## 5  17         8          302        140   3449         10.5   70
## 6  15         8          429        198   4341         10.0   70

Question 2

## `geom_smooth()` using formula 'y ~ x'

Question 3

## 
## Call:
## lm(formula = mpg ~ cylinders, data = cars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.2607  -3.3841  -0.6478   2.5538  17.9022 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  42.9493     0.8330   51.56   <2e-16 ***
## cylinders    -3.5629     0.1458  -24.43   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.942 on 396 degrees of freedom
## Multiple R-squared:  0.6012, Adjusted R-squared:  0.6002 
## F-statistic: 597.1 on 1 and 396 DF,  p-value: < 2.2e-16

Each additional cylinder is associated with a decrease in fuel efficiency of approximately 3.56 miles per gallon (mpg).

The coefficient on cylinders is consistent with the graph from Question 2: the coefficient is negative (-3.56), and the fitted line in the graph slopes downward.

Question 4

## 
## Call:
## lm(formula = mpg ~ cylinders + weight + year, data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.9727 -2.3180 -0.0755  2.0138 14.3505 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -13.925603   4.037305  -3.449 0.000623 ***
## cylinders    -0.087402   0.232075  -0.377 0.706665    
## weight       -0.006511   0.000459 -14.185  < 2e-16 ***
## year          0.753286   0.049802  15.126  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.438 on 394 degrees of freedom
## Multiple R-squared:  0.8079, Adjusted R-squared:  0.8065 
## F-statistic: 552.4 on 3 and 394 DF,  p-value: < 2.2e-16

Holding weight and year constant, each additional cylinder is associated with a decrease in fuel efficiency of approximately 0.09 mpg. Holding cylinders and year constant, each additional pound of weight is associated with a decrease of approximately 0.0065 mpg.

Holding cylinders and weight constant, each one-year increase in model year is associated with an increase in fuel efficiency of approximately 0.75 mpg.
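
As a rough illustration of how these coefficients combine, the sketch below plugs the first row of the Question 1 table (8 cylinders, weight 3504, model year 70, observed mpg 18) into the printed Question 4 estimates. The variable names are my own, and the coefficients are the rounded values from the printout, so the result is approximate.

```r
# Coefficients copied from the printed Question 4 summary (rounded values).
b_intercept <- -13.925603
b_cylinders <-  -0.087402
b_weight    <-  -0.006511
b_year      <-   0.753286

# First row of the data shown under Question 1.
cylinders    <- 8
weight       <- 3504
year         <- 70
observed_mpg <- 18

# Fitted value: intercept plus each coefficient times its variable.
predicted_mpg <- b_intercept + b_cylinders * cylinders +
  b_weight * weight + b_year * year

# Residual: observed minus fitted.
residual <- observed_mpg - predicted_mpg

round(c(predicted = predicted_mpg, residual = residual), 2)
```

The fitted value comes out a few mpg below the observed 18, which is well within the range of residuals reported in the summary.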

The coefficient on cylinders is not statistically significant, even at the 10% level: its p-value is approximately 0.71.
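
The t value and p-value for cylinders can be reproduced by hand from the printed estimate and standard error. This is only a sketch using the rounded numbers from the Question 4 summary; the variable names are my own.

```r
# Values copied from the printed Question 4 summary.
estimate  <- -0.087402  # coefficient on cylinders
std_error <-  0.232075  # its standard error
df_resid  <-  394       # residual degrees of freedom

# t statistic: estimate divided by its standard error.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(c(t = t_value, p = p_value), 3)
```

Because the p-value is far above 0.10, we cannot reject the hypothesis that the coefficient on cylinders is zero once weight and year are controlled for.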

Question 5

The difference between the results in Questions 3 and 4 is that once more independent variables (year and weight of the vehicle) are added as controls, the estimated effect of cylinders on the dependent variable (mpg) is less biased. For the Question 3 estimate to suffer from omitted variable bias, the omitted variables must both affect fuel efficiency and be correlated with cylinders. When year and weight are added, the coefficient on cylinders changes from approx -3.56 to approx -0.09 and is no longer statistically significant, which indicates that we would have misinterpreted the impact of cylinders on mpg if we had relied on the Question 3 model alone.
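
The omitted variable bias mechanism can be checked numerically. The sketch below uses R's built-in `mtcars` data as a stand-in, which is an assumption on my part: it is not the `cars` data used in this report, and its columns are named `cyl` and `wt`. It shows that the slope from the short regression equals the slope from the long regression plus the effect of the omitted variable times how strongly that variable moves with cylinders.

```r
# Stand-in data: mtcars ships with base R; it is NOT the `cars` data above.
long  <- lm(mpg ~ cyl + wt, data = mtcars)  # controls for weight
short <- lm(mpg ~ cyl,      data = mtcars)  # omits weight
aux   <- lm(wt  ~ cyl,      data = mtcars)  # how weight moves with cylinders

# Omitted variable bias identity:
# short slope = long slope + (effect of wt on mpg) * (slope of wt on cyl)
implied_short <- coef(long)[["cyl"]] +
  coef(long)[["wt"]] * coef(aux)[["cyl"]]

c(short = coef(short)[["cyl"]], implied = implied_short)
```

The two numbers agree, and the short-regression slope is more negative than the long-regression slope because heavier cars tend to have more cylinders and weight lowers mpg. The same logic applies to the change from -3.56 to -0.09 above.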

Question 6

## 
## Call:
## lm(formula = mpg ~ year, data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.024  -5.451  -0.390   4.947  18.200 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -69.55560    6.58911  -10.56   <2e-16 ***
## year          1.22445    0.08659   14.14   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.379 on 396 degrees of freedom
## Multiple R-squared:  0.3356, Adjusted R-squared:  0.3339 
## F-statistic:   200 on 1 and 396 DF,  p-value: < 2.2e-16
## `geom_smooth()` using formula 'y ~ x'

## 
## Call:
## lm(formula = mpg ~ acceleration, data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -18.007  -5.636  -1.242   4.758  23.192 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    4.9698     2.0432   2.432   0.0154 *  
## acceleration   1.1912     0.1292   9.217   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.101 on 396 degrees of freedom
## Multiple R-squared:  0.1766, Adjusted R-squared:  0.1746 
## F-statistic: 84.96 on 1 and 396 DF,  p-value: < 2.2e-16
## `geom_smooth()` using formula 'y ~ x'

## 
## Call:
## lm(formula = mpg ~ acceleration + year, data = cars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.9929  -5.0302  -0.5953   4.7848  18.2562 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -68.95578    6.24080 -11.049  < 2e-16 ***
## acceleration   0.78317    0.11482   6.821 3.41e-11 ***
## year           1.05615    0.08563  12.334  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.041 on 395 degrees of freedom
## Multiple R-squared:  0.4056, Adjusted R-squared:  0.4026 
## F-statistic: 134.7 on 2 and 395 DF,  p-value: < 2.2e-16

As the first graph shows, fuel efficiency in miles per gallon increases with the model year of the car. The regression supports this: the coefficient on year is positive (approx 1.22), so each one-year increase in model year is associated with an improvement in fuel efficiency of about 1.22 mpg.

I also ran regressions with acceleration and year as independent variables, because I believe fuel efficiency in mpg should improve as the time to accelerate goes up by a second. I also believe that acceleration should change over the years as manufacturers try to improve fuel efficiency. In the regression with both variables, the coefficients on acceleration (approx 0.78) and year (approx 1.06) are both positive and statistically significant. This is consistent with what I believe, and it supports the idea that car manufacturers have been improving fuel efficiency over the years.
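
The link between acceleration and year can be backed out from the printed coefficients. This is a sketch that assumes all three Question 6 models were fit on the same rows (the degrees of freedom suggest they were); under that assumption, the drop in the year coefficient when acceleration is added equals the acceleration coefficient times the slope from regressing acceleration on year. The variable names are my own.

```r
# Coefficients copied from the printed Question 6 summaries.
year_short <- 1.22445  # from mpg ~ year
year_long  <- 1.05615  # from mpg ~ acceleration + year
accel_long <- 0.78317  # from mpg ~ acceleration + year

# Implied slope of acceleration on year, from the omitted variable identity:
# year_short = year_long + accel_long * implied_delta
implied_delta <- (year_short - year_long) / accel_long

round(implied_delta, 3)
```

A positive value here means the acceleration variable tends to rise with model year in this data, which is why part of the year effect in the simple regression is absorbed by acceleration once both are included.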