library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.3     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data(mtcars)
head(mtcars, 10)
##                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4

Module Two Problem Set: Interaction Terms and Qualitative Predictors

In this notebook, you have been given a set of steps that will show you how to create multiple regression models that include interaction terms and qualitative predictors in R. It is very important to run the steps in order. Some steps depend on the outputs of earlier steps. Once you have run all the steps, you will be asked to create your own regression models to help you answer the questions in the Module Two Problem Set. You are expected to write the R script yourself to answer these questions.

Reminder: If you have not already reviewed the Problem Set Report template for your Module Two Problem Set, be sure to do so now. That will give you an idea of the questions you will need to answer with the outputs of this script. You should use the code you are given as reference when writing your own scripts.

Step 1: Loading the Data Set

You are an analyst working for a car maker. You have access to a set of data that can be used to study the fuel economy of a car. Car makers are interested in studying factors that are associated with better fuel economy. This data set includes several important variables that are associated with fuel economy. You will use this data set to create models to predict fuel economy.

This block of R code loads the data set from the mtcars.csv file. The variables contained in the data set are the columns shown in the head(mtcars, 10) output above.

Reference: R data sets. (1974). Motor trend car road tests [Data file]. Retrieved from https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/mtcars

Click the code section below and hit the Run button above.

Converting appropriate variables to factors

mtcars2 <- within(mtcars, {
   vs <- factor(vs)
   am <- factor(am)
   cyl  <- factor(cyl)
   gear <- factor(gear)
   carb <- factor(carb)
})

Create the multiple regression model and print summary statistics. Note that this model includes the interaction term.

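myvars <- c("mpg","wt","drat","am")   # mtcars_subset must exist before the model is fit
mtcars_subset <- mtcars2[myvars]
model1 <- lm(mpg ~ wt + drat + wt:drat, data=mtcars_subset)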
summary(model1)
## 
## Call:
## lm(formula = mpg ~ wt + drat + wt:drat, data = mtcars_subset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8913 -1.8634 -0.3398  1.3247  6.4730 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)    5.550     12.631   0.439   0.6637  
## wt             3.884      3.798   1.023   0.3153  
## drat           8.494      3.321   2.557   0.0162 *
## wt:drat       -2.543      1.093  -2.327   0.0274 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.839 on 28 degrees of freedom
## Multiple R-squared:  0.7996, Adjusted R-squared:  0.7782 
## F-statistic: 37.25 on 3 and 28 DF,  p-value: 6.567e-10

Interpretation of Beta Estimates

From the output in the previous step, the prediction model equation is:

\[\begin{equation*} \large \hat{y} = 5.5500\ +\ 3.8840\ {x}_1\ +\ 8.4940\ {x}_2\ -\ 2.5430\ {x}_1\ {x}_2 \end{equation*}\]

\[\begin{equation*} \text{where } \hat{y} \text{ is the predicted fuel economy,}\ {x}_1\ \text{is weight, and}\ {x}_2\ \text{is rear axle ratio} \end{equation*}\]


Interpret the estimated coefficient of the rear axle ratio variable.

  • Let us assume there was no interaction present and the multiple regression model was 5.55 + 3.884 x1 + 8.494 x2. Then, the fuel economy of a car would increase on average by 8.494 for each unit increase in rear axle ratio (since 8.494 is the estimated coefficient for a rear axle ratio variable).

  • However, weight and rear axle ratio interact in the model in step 3; therefore, the rate of change of average fuel economy with rear axle ratio depends on the weight of the car. Collecting the terms that involve x2 gives:

8.4940 x2 - 2.5430 (x1)(x2) = ( 8.4940 - 2.5430 x1 ) x2

The estimated coefficient for x2 is ( 8.494 - 2.543 x1 )

  • Suppose we have a car with weight 2.50 (that is, 2,500 lbs, since wt is recorded in units of 1,000 lbs). The estimated coefficient for rear axle ratio can be calculated as:

( 8.494 - 2.543 x1 ) = 8.494 - 2.543 (2.50) = +2.1365

In other words, we estimate that the fuel economy of a car weighing 2,500 lbs will increase by 2.1365 mpg, on average, for each unit increase in rear axle ratio. Note that this is different from the 8.494 in the first bullet, because the multiple regression model has an interaction term between weight and rear axle ratio.

  • Suppose we now have a car with weight 2.32. If there were no interaction term included in the model, then the fuel economy of this car would increase on average by 8.494 for each unit increase in rear axle ratio. However, since we now have the interaction term in the model, the fuel economy of the car will increase by 8.494 - 2.543 (2.32) = 2.59 units for each unit increase in rear axle ratio.

  • Therefore, the rate of increase in fuel economy now varies due to the presence of an interaction term.
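
The weight-dependent slope worked out in the bullets above can also be computed directly from the fitted coefficients. The following is a minimal sketch that refits the same formula on the built-in mtcars data; the object name `fit_int` and the helper `drat_slope_at` are illustrative and are not part of the original notebook.

```r
# Refit the interaction model from step 3 on the built-in mtcars data.
fit_int <- lm(mpg ~ wt + drat + wt:drat, data = mtcars)
b <- coef(fit_int)

# Hypothetical helper: slope of mpg with respect to drat at a given weight,
# i.e. (coefficient of drat) + (coefficient of wt:drat) * wt.
drat_slope_at <- function(wt) {
  unname(b["drat"] + b["wt:drat"] * wt)
}

# Slopes at the two weights used in the text (wt is in units of 1,000 lbs).
slopes <- drat_slope_at(c(2.50, 2.32))
print(slopes)
```

Because the sketch uses the unrounded coefficients, its results may differ from the hand calculations in the last decimal places.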

Step 4: Adding in a Qualitative Predictor

In this step, you will add a qualitative predictor to the multiple regression model from step 3. The transmission variable am is a qualitative variable with two levels: manual and automatic. Since this variable has two levels, one dummy variable is enough to represent it. R will create the dummy variable automatically and will also label it appropriately.

The general form of this regression model is:

\[\begin{equation*} \large E(y) = {\beta}_0\ +\ {\beta}_1\ {x}_1\ +\ {\beta}_2\ {x}_2\ +\ {\beta}_3\ {x}_1\ {x}_2\ +\ {\beta}_4\ {x}_3 \end{equation*}\]

The prediction regression model is:

\[\begin{equation*} \large \hat{y} = \hat{{\beta}_0}\ +\ \hat{{\beta}_1}\ {x}_1\ +\ \hat{{\beta}_2}\ {x}_2\ +\ \hat{{\beta}_3}\ {x}_1\ {x}_2\ +\ \hat{{\beta}_4}\ {x}_3 \end{equation*}\]

\[\begin{equation*} \text{In the model above, } \hat{y} \text{ is the predicted fuel economy (mpg), x1 is weight (wt), x2 is rear axle ratio (drat), and x3 is the dummy variable for transmission (am).} \end{equation*}\]

\[\begin{equation*} \text{Note that am='1' for a manual transmission and am='0' for an automatic transmission.} \end{equation*}\]

\[\begin{equation*} \hat{{\beta}_0} \text{,} \hspace{0.25cm} \hat{{\beta}_1} \text{, } \hspace{0.25cm} \hat{{\beta}_2} \text{,} \hspace{0.25cm} \hat{{\beta}_3} \text{,} \hspace{0.25cm} \hat{{\beta}_4} \text{ } \text{ are estimates of} \text{ } {\beta}_0\ \text{,} \hspace{0.25cm} {\beta}_1\ \text{,} \hspace{0.25cm} {\beta}_2\ \text{,} \hspace{0.25cm} {\beta}_3\ \text{,} \hspace{0.25cm} {\beta}_4\ \text{ respectively }\\ \end{equation*}\]

Click the block of code below and hit the Run button above.

Subsetting data to only include the variables that are needed

myvars <- c("mpg","wt","drat","am")
mtcars_subset <- mtcars2[myvars]

Create the model

model2 <- lm(mpg ~ wt + drat + wt:drat + am, data=mtcars_subset)
summary(model2)
## 
## Call:
## lm(formula = mpg ~ wt + drat + wt:drat + am, data = mtcars_subset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.6907 -1.4711 -0.2512  0.9344  6.7453 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)    3.247     12.914   0.251   0.8034  
## wt             4.168      3.822   1.091   0.2851  
## drat           9.562      3.529   2.710   0.0116 *
## am1           -1.464      1.597  -0.917   0.3674  
## wt:drat       -2.708      1.111  -2.438   0.0216 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.847 on 27 degrees of freedom
## Multiple R-squared:  0.8057, Adjusted R-squared:  0.7769 
## F-statistic: 27.99 on 4 and 27 DF,  p-value: 2.948e-09
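
The am1 row in the output above is the dummy variable that R created for the factor am. As a rough illustration of how that coding can be inspected, the sketch below rebuilds the factor on the built-in mtcars data and looks at the contrasts and the design matrix; the object names `cars` and `X` are assumptions made for this example.

```r
# Convert am to a factor, as the notebook does when it builds mtcars2.
cars <- transform(mtcars, am = factor(am))

# Default treatment contrasts: level "0" is the baseline, level "1" gets the dummy.
print(contrasts(cars$am))

# The design matrix contains the automatically created dummy column am1.
X <- model.matrix(mpg ~ wt + drat + wt:drat + am, data = cars)
print(colnames(X))
print(head(X[, "am1"]))
```

With this coding, the am1 coefficient is the estimated shift in predicted mpg for a manual transmission (am = 1) relative to an automatic (am = 0), holding weight and rear axle ratio fixed.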

Step 5: Fitted Values

In this step, you will obtain the fitted values of the data set using the model from step 4. Recall that the fitted value is just the predicted value of the dependent variable (miles per gallon) for data points from the data set.

Click the block of code below and hit the Run button above.

predicted values

print("fitted")
## [1] "fitted"
fitted_values <- fitted.values(model2) 
fitted_values
##           Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive 
##           22.320207           20.689507           24.074750           19.278594 
##   Hornet Sportabout             Valiant          Duster 360           Merc 240D 
##           18.356600           18.194854           17.782915           19.945503 
##            Merc 230            Merc 280           Merc 280C          Merc 450SE 
##           20.415578           18.545349           18.545349           15.724427 
##          Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental 
##           17.134382           16.927036           11.483047           10.468474 
##   Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla 
##            9.650796           25.654704           34.090685           28.809681 
##       Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28 
##           24.198310           17.996415           18.378418           16.124986 
##    Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa 
##           16.648968           27.478543           27.385767           28.689013 
##      Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E 
##           19.115459           20.784240           16.283559           21.723885

Step 6: Residuals

In this step, you will obtain the residuals using the model in step 4. Recall that the residual is the difference between the actual value and the predicted value of the dependent variable (miles per gallon).

Click the block of code below and hit the Run button above.

residuals

print("residuals")
## [1] "residuals"
residuals <- residuals(model2)
residuals
##           Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive 
##         -1.32020710          0.31049250         -1.27475025          2.12140638 
##   Hornet Sportabout             Valiant          Duster 360           Merc 240D 
##          0.34339960         -0.09485431         -3.48291480          4.45449689 
##            Merc 230            Merc 280           Merc 280C          Merc 450SE 
##          2.38442164          0.65465150         -0.74534850          0.67557281 
##          Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental 
##          0.16561816         -1.72703557         -1.08304682         -0.06847434 
##   Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla 
##          5.04920376          6.74529611         -3.69068489          5.09031922 
##       Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28 
##         -2.69830968         -2.49641509         -3.17841839         -2.82498557 
##    Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa 
##          2.55103235         -0.17854335         -1.38576690          1.71098689 
##      Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E 
##         -3.31545871         -1.08423985         -1.28355913         -0.32388453
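
The definition of a residual given in this step (actual value minus predicted value) can be checked numerically. This is a small sketch under the assumption that the step 4 model is refit on the built-in mtcars data with `factor(am)` written inside the formula; the names `fit_check` and `manual_resid` are illustrative.

```r
# Refit the step 4 model; factor(am) creates the transmission dummy in place.
fit_check <- lm(mpg ~ wt + drat + wt:drat + factor(am), data = mtcars)

# Residual = observed mpg minus fitted mpg.
manual_resid <- mtcars$mpg - fitted(fit_check)

# Compare with the residuals extracted by R (names are ignored in the comparison).
print(all.equal(unname(manual_resid), unname(residuals(fit_check))))

# For a least-squares fit with an intercept, the residuals sum to (numerically) zero.
print(sum(residuals(fit_check)))
```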

Step 7: Diagnostic Plots — Residuals against Fitted Values

In this step, you will generate a plot of residuals against fitted values to test the assumption of homoscedasticity.

Click the block of code below and hit the Run button above.
NOTE: If the plot is not created, click the code section and hit the Run button again.

plot(fitted_values, residuals, 
     main = "Residuals against Fitted Values",
     xlab = "Fitted Values", ylab = "Residuals",
     col="red", 
     pch = 19, frame = FALSE)

Step 8: Diagnostic Plots — Q-Q Plot

In this step, you will generate a Q-Q plot to test assumptions of normality of the residuals.

Click the block of code below and hit the Run button above.
NOTE: If the plot is not created, click the code section and hit the Run button again.

qqnorm(residuals, pch = 19, col="red", frame = FALSE)
qqline(residuals, col = "blue", lwd = 2)
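
A Q-Q plot is a visual check, so it can help to pair it with a numerical one. The sketch below uses the Shapiro-Wilk test from base R as one possible complement; choosing this test is an assumption of the example, not something the notebook prescribes, and `fit_norm` and `sw` are illustrative names.

```r
# Refit the step 4 model on the built-in mtcars data.
fit_norm <- lm(mpg ~ wt + drat + wt:drat + factor(am), data = mtcars)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normally distributed.
sw <- shapiro.test(residuals(fit_norm))
print(sw)
```

A small p-value would suggest a departure from normality; with only 32 observations the test has limited power, so it is best read alongside the plot rather than instead of it.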

Step 9: Confidence Interval for Parameter Estimates

In this step, you will use the confint function to create 90% confidence intervals for the beta parameters.

Click the block of code below and hit the Run button above.

confidence intervals for model parameters

print("confint")
## [1] "confint"
conf_90_int <- confint(model2, level=0.90) 
round(conf_90_int, 4)
##                  5 %    95 %
## (Intercept) -18.7488 25.2427
## wt           -2.3414 10.6771
## drat          3.5516 15.5725
## am1          -4.1845  1.2564
## wt:drat      -4.6004 -0.8164
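
For a linear model, each of these intervals is the estimate plus or minus a t critical value times the standard error. The sketch below reproduces that arithmetic by hand on a refit of the step 4 model; the names `fit_ci`, `t_crit`, and `manual_ci` are illustrative.

```r
# Refit the step 4 model on the built-in mtcars data.
fit_ci <- lm(mpg ~ wt + drat + wt:drat + factor(am), data = mtcars)

coef_tab <- coef(summary(fit_ci))
est <- coef_tab[, "Estimate"]
se  <- coef_tab[, "Std. Error"]

# A 90% two-sided interval leaves 5% in each tail.
t_crit <- qt(0.95, df = df.residual(fit_ci))

manual_ci <- cbind(lower = est - t_crit * se,
                   upper = est + t_crit * se)
print(round(manual_ci, 4))
```

An interval that contains zero, such as the ones for wt and am1 in the output above, indicates that the corresponding coefficient is not distinguishable from zero at the 10% level.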

Step 10: Predictions, Prediction Interval, and Confidence Interval

In this step, you will predict the fuel economy for a car that has a weight of 3.88, a rear axle ratio of 3.05, and has a manual transmission. You will also obtain a 90% prediction interval and confidence interval for this prediction. Note that this observation is not from the dataset that was used to create this model.

Click the block of code below and hit the Run button above.

newdata <- data.frame(wt=3.88, drat=3.05, am='1')

print("prediction interval")
## [1] "prediction interval"
prediction_pred_int <- predict(model2, newdata, interval="predict", level=0.90) 
round(prediction_pred_int, 4)
##       fit    lwr     upr
## 1 15.0672 9.4501 20.6844
print("confidence interval")
## [1] "confidence interval"
prediction_conf_int <- predict(model2, newdata, interval="confidence", level=0.90) 
round(prediction_conf_int, 4)
##       fit     lwr     upr
## 1 15.0672 12.2316 17.9029
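
Both intervals above are centred on the same fitted value, but the prediction interval is wider because it also accounts for the variability of an individual car around the mean. The following sketch compares the two widths on a refit of the same model; `cars`, `fit_pred`, `new_car`, and `widths` are names assumed for this example.

```r
# Refit the step 4 model with am as a factor.
cars <- transform(mtcars, am = factor(am))
fit_pred <- lm(mpg ~ wt + drat + wt:drat + am, data = cars)

# The new car from this step: wt = 3.88, drat = 3.05, manual transmission.
new_car <- data.frame(wt = 3.88, drat = 3.05,
                      am = factor("1", levels = levels(cars$am)))

pi90 <- predict(fit_pred, new_car, interval = "prediction", level = 0.90)
ci90 <- predict(fit_pred, new_car, interval = "confidence", level = 0.90)

# Width of each interval (upper limit minus lower limit).
widths <- c(prediction = unname(pi90[1, "upr"] - pi90[1, "lwr"]),
            confidence = unname(ci90[1, "upr"] - ci90[1, "lwr"]))
print(round(widths, 4))
```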

Your Code

You have been asked to create regression models in the Module Two Problem Set. Review the Problem Set Report template to see the questions you will be answering about your models.

Use the empty blocks below to write the R code for your models and get outputs. Then use the outputs to answer the questions in your problem set report.

Note: Use the + (plus) button to add new code blocks or the scissor icon to remove empty code blocks, if needed.

Pearson correlation coefficients

cor_mpg_hp <- cor(mtcars$mpg, mtcars$hp)
cor_mpg_hp_qsec_drat <- cor(mtcars$mpg, mtcars$hp+mtcars$qsec+mtcars$drat)
cor_hp_qsec <- cor(mtcars$hp, mtcars$qsec)
cor_mpg_qsec <- cor(mtcars$mpg, mtcars$qsec)
cor_mpg_drat <- cor(mtcars$mpg, mtcars$drat)

print(cor_mpg_hp)
## [1] -0.7761684
print(cor_mpg_hp_qsec_drat)
## [1] -0.7768859
print(cor_hp_qsec)
## [1] -0.7082234
print(cor_mpg_qsec)
## [1] 0.418684
print(cor_mpg_drat)
## [1] 0.6811719

More details

cor.test(mtcars$mpg, mtcars$hp)
## 
##  Pearson's product-moment correlation
## 
## data:  mtcars$mpg and mtcars$hp
## t = -6.7424, df = 30, p-value = 1.788e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8852686 -0.5860994
## sample estimates:
##        cor 
## -0.7761684
cor.test(mtcars$mpg, mtcars$hp+mtcars$qsec+mtcars$drat)
## 
##  Pearson's product-moment correlation
## 
## data:  mtcars$mpg and mtcars$hp + mtcars$qsec + mtcars$drat
## t = -6.7581, df = 30, p-value = 1.713e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8856589 -0.5872846
## sample estimates:
##        cor 
## -0.7768859
cor.test(mtcars$hp, mtcars$qsec)
## 
##  Pearson's product-moment correlation
## 
## data:  mtcars$hp and mtcars$qsec
## t = -5.4946, df = 30, p-value = 5.766e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8475998 -0.4774331
## sample estimates:
##        cor 
## -0.7082234
cor.test(mtcars$mpg, mtcars$qsec)
## 
##  Pearson's product-moment correlation
## 
## data:  mtcars$mpg and mtcars$qsec
## t = 2.5252, df = 30, p-value = 0.01708
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.08195487 0.66961864
## sample estimates:
##      cor 
## 0.418684
cor.test(mtcars$mpg, mtcars$drat)
## 
##  Pearson's product-moment correlation
## 
## data:  mtcars$mpg and mtcars$drat
## t = 5.096, df = 30, p-value = 1.776e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4360484 0.8322010
## sample estimates:
##       cor 
## 0.6811719
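
The correlation between mpg and the sum hp + qsec + drat is almost identical to the correlation between mpg and hp alone, most likely because hp is on a much larger scale than the other two variables and dominates the sum. A pairwise correlation matrix may be a more informative summary; the sketch below is one way to produce it, with `vars` and `cor_matrix` as illustrative names.

```r
# Pairwise Pearson correlations among the response and the three predictors.
vars <- c("mpg", "hp", "qsec", "drat")
cor_matrix <- cor(mtcars[, vars])
print(round(cor_matrix, 4))
```

The off-diagonal entries among hp, qsec, and drat also show how strongly the predictors are related to each other, which matters when they enter the same regression model.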

Linear Regression Models

“Write the general form and the prediction equation of the regression model for fuel economy using horsepower, quarter mile time, and rear axle ratio as predictors. Include interaction terms for horsepower and quarter mile time; horsepower and rear axle ratio. Create the regression model for fuel economy using horsepower, quarter mile time, and rear axle ratio as predictors. Include interaction terms for horsepower and quarter mile time; horsepower and rear axle ratio. Write the prediction model equation using outputs obtained from your R script.”
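
The model described in the prompt has the general form E(y) = β0 + β1 HP + β2 QSEC + β3 DRAT + β4 HP·QSEC + β5 HP·DRAT. As a hedged sketch of how such a model could be fit, the code below uses R's `:` operator for the two interaction terms; the object name `fit_prompt` is an assumption for this example.

```r
# Main effects for hp, qsec, and drat, plus the hp:qsec and hp:drat interactions.
fit_prompt <- lm(mpg ~ hp + qsec + drat + hp:qsec + hp:drat, data = mtcars)

# The coefficient names correspond to beta_0 through beta_5 in the general form.
print(names(coef(fit_prompt)))
summary(fit_prompt)
```

The prediction equation is then obtained by substituting the values in the Estimate column of the summary for β0 through β5.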

Linear Regression Model for MPG and HP

“MPG = β0 + β1 * HP + ϵ”

  • MPG: the dependent variable that I aim to predict

  • HP: the independent variable used as a predictor

  • β0: the intercept of the regression line, representing the predicted MPG when HP is 0

  • β1: the slope coefficient, representing the change in MPG for each unit increase in HP

  • ϵ: the error term, representing the difference between the observed MPG values and the values predicted by the model

model <- lm(mpg ~ hp, data = mtcars)

summary(model)
## 
## Call:
## lm(formula = mpg ~ hp, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.7121 -2.1122 -0.8854  1.5819  8.2360 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 30.09886    1.63392  18.421  < 2e-16 ***
## hp          -0.06823    0.01012  -6.742 1.79e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.863 on 30 degrees of freedom
## Multiple R-squared:  0.6024, Adjusted R-squared:  0.5892 
## F-statistic: 45.46 on 1 and 30 DF,  p-value: 1.788e-07
cat("R-squared:", summary(model)$r.squared, "\n")
## R-squared: 0.6024373
cat("Adjusted R-squared:", summary(model)$adj.r.squared)
## Adjusted R-squared: 0.5891853
ggplot(data = mtcars, aes(x= hp, y = mpg)) +
  geom_point() + 
  geom_smooth(method = "lm", col = "red") + 
  labs(title = "Linear Regression of MPG on HP",
       subtitle = "SNHU Module Two MTCARS",
       caption = "The red line is the linear regression line.
    Points closer to the regression line indicate a stronger linear relationship
    between the two variables. Points farther from the line have larger residuals
    and could be considered potential outliers.",
       x = "Horsepower (HP)",
       y = "Miles Per Gallon (MPG)") + 
  theme_minimal()+
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(face = "italic", hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'

Multiple Linear Regression Model for MPG on QSEC, DRAT and HP

“MPG = β0 + β1 * HP + β2 * QSEC + β3 * DRAT + ϵ”

  • MPG: the dependent variable (fuel economy) that I aim to predict

  • HP: predictor variable for horsepower

  • QSEC: predictor variable for quarter-mile time

  • DRAT: predictor variable for rear axle ratio

  • β0: the intercept of the regression model, representing the predicted MPG when all predictors are zero

  • β1, β2, β3: coefficients representing the change in MPG for a one-unit increase in HP, QSEC, and DRAT, respectively, holding the other predictors constant

  • ϵ: the error term, representing the difference between observed and predicted MPG values

model1 <- lm(mpg ~ hp + qsec + drat, data = mtcars)

summary(model1)
## 
## Call:
## lm(formula = mpg ~ hp + qsec + drat, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.7977 -2.4804 -0.4937  1.1381  7.3188 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 17.73662   13.01979   1.362 0.183968    
## hp          -0.05797    0.01421  -4.080 0.000339 ***
## qsec        -0.28407    0.48923  -0.581 0.566116    
## drat         4.42875    1.29169   3.429 0.001897 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.207 on 28 degrees of freedom
## Multiple R-squared:  0.7443, Adjusted R-squared:  0.7168 
## F-statistic: 27.16 on 3 and 28 DF,  p-value: 1.937e-08
cat("R-squared:", summary(model1)$r.squared, "\n")
## R-squared: 0.7442512
cat("Adjusted R-squared:", summary(model1)$adj.r.squared)
## Adjusted R-squared: 0.7168495
ggplot(data = mtcars, aes(x= hp + qsec + drat, y = mpg)) +
  geom_point() + 
  geom_smooth(method = "lm", col = "blue") + 
  labs(title = "Multiple Linear Regression of MPG on QSEC, DRAT and HP",
       subtitle = "SNHU Module Two MTCARS",
       caption = "The blue line is a simple linear fit of MPG on the sum HP + QSEC + DRAT;
    it is not the fitted multiple regression model itself.
    Points farther from the line have larger residuals
    and could be considered potential outliers.",
       x = "HP + QSEC + DRAT (sum)",
       y = "Miles Per Gallon (MPG)") + 
  theme_minimal()+
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(face = "italic", hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
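
A multiple regression with three predictors cannot be drawn as a single line against one x-axis. One common alternative, sketched here as a suggestion rather than as the notebook's method, is to plot the observed values against the model's fitted values; the names `fit_multi` and `plot_df` are illustrative, and base graphics are used to keep the example self-contained.

```r
# Refit the three-predictor model on the built-in mtcars data.
fit_multi <- lm(mpg ~ hp + qsec + drat, data = mtcars)

# One row per car: fitted mpg from the model and observed mpg.
plot_df <- data.frame(fitted = fitted(fit_multi), observed = mtcars$mpg)

# Points on the dashed 45-degree line are predicted exactly.
plot(plot_df$fitted, plot_df$observed,
     main = "Observed vs Fitted MPG",
     xlab = "Fitted MPG", ylab = "Observed MPG", pch = 19)
abline(a = 0, b = 1, lty = 2)
```

The vertical distance of each point from the dashed line is that car's residual, so this plot shows the fit of the full model rather than of a single combined predictor.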

Linear Regression Model for HP and QSEC

“HP = β0 + β1 * QSEC + ϵ”

  • HP: the dependent variable that I aim to predict

  • QSEC: the independent variable used as a predictor

  • β0: the intercept of the regression line, representing the predicted HP when QSEC is 0

  • β1: the slope coefficient, representing the change in HP for each unit increase in QSEC

  • ϵ: the error term, representing the difference between the observed HP values and the values predicted by the model

model2 <- lm(hp ~ qsec, data = mtcars)

summary(model2)
## 
## Call:
## lm(formula = hp ~ qsec, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -86.903 -33.629   5.336  27.925 100.032 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  631.704     88.700   7.122 6.38e-08 ***
## qsec         -27.174      4.946  -5.495 5.77e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 49.2 on 30 degrees of freedom
## Multiple R-squared:  0.5016, Adjusted R-squared:  0.485 
## F-statistic: 30.19 on 1 and 30 DF,  p-value: 5.766e-06
cat("R-squared:", summary(model2)$r.squared, "\n")
## R-squared: 0.5015804
cat("Adjusted R-squared:", summary(model2)$adj.r.squared)
## Adjusted R-squared: 0.4849664
ggplot(data = mtcars, aes(x= qsec, y = hp)) +
  geom_point() + 
  geom_smooth(method = "lm", col = "purple") + 
  labs(title = "Linear Regression of HP on QSEC",
       subtitle = "SNHU Module Two MTCARS",
       caption = "The purple line is the linear regression line.
    Points closer to the regression line indicate a stronger linear relationship
    between the two variables. Points farther from the line have larger residuals
    and could be considered potential outliers.",
       x = "Quarter Mile Time (QSEC)",
       y = "Horsepower (HP)") + 
  theme_minimal()+
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(face = "italic", hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'

Linear Regression Model for HP and DRAT

“HP = β0 + β1 * DRAT + ϵ”

  • HP: the dependent variable that I aim to predict

  • DRAT: the independent variable used as a predictor

  • β0: the intercept of the regression line, representing the predicted HP when DRAT is 0

  • β1: the slope coefficient, representing the change in HP for each unit increase in DRAT

  • ϵ: the error term, representing the difference between the observed HP values and the values predicted by the model

model3 <- lm(hp ~ drat, data = mtcars)

summary(model3)
## 
## Call:
## lm(formula = hp ~ drat, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -89.828 -40.261  -7.934   7.247 185.058 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   353.65      76.05    4.65 6.24e-05 ***
## drat          -57.55      20.92   -2.75  0.00999 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 62.28 on 30 degrees of freedom
## Multiple R-squared:  0.2014, Adjusted R-squared:  0.1748 
## F-statistic: 7.565 on 1 and 30 DF,  p-value: 0.009989
cat("R-squared:", summary(model3)$r.squared, "\n")
## R-squared: 0.2013847
cat("Adjusted R-squared:", summary(model3)$adj.r.squared)
## Adjusted R-squared: 0.1747642
ggplot(data = mtcars, aes(x= drat, y = hp)) +   # x must be drat to match model3 and the axis label
  geom_point() + 
  geom_smooth(method = "lm", col = "orange") + 
  labs(title = "Linear Regression of HP on DRAT",
       subtitle = "SNHU Module Two MTCARS",
       caption = "The orange line is the linear regression line.
    Points closer to the regression line indicate a stronger linear relationship
    between the two variables. Points farther from the line have larger residuals
    and could be considered potential outliers.",
       x = "Rear Axle Ratio (DRAT)",
       y = "Horsepower (HP)") + 
  theme_minimal()+
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(face = "italic", hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Multiple Linear Regression Model for MPG on HP and QSEC: Effect of a One-Unit Change in QSEC at 160 HP

“MPG = β0 + β1 * HP + β2 * QSEC”

  • MPG: the dependent variable (fuel economy) that I aim to predict

  • HP: predictor variable for horsepower

  • QSEC: predictor variable for quarter-mile time

  • β0: the intercept, representing the predicted value of MPG when both HP and QSEC are zero

  • β1: the coefficient for HP, representing the change in MPG for a one-unit increase in HP while holding QSEC constant

  • β2: the coefficient for QSEC, representing the change in MPG for a one-unit increase in QSEC while holding HP constant

model4 <- lm(mpg ~ hp + qsec, data = mtcars)

summary(model4)
## 
## Call:
## lm(formula = mpg ~ hp + qsec, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.1782 -2.6030 -0.5098  1.2866  8.7178 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 48.32371   11.10331   4.352 0.000153 ***
## hp          -0.08459    0.01393  -6.071 1.31e-06 ***
## qsec        -0.88658    0.53459  -1.658 0.108007    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.755 on 29 degrees of freedom
## Multiple R-squared:  0.6369, Adjusted R-squared:  0.6118 
## F-statistic: 25.43 on 2 and 29 DF,  p-value: 4.176e-07

To see the change in predicted MPG for a one-unit increase in QSEC at a fixed HP of 160, I predict MPG at two QSEC values that are one unit apart and take the difference.

new_data <- data.frame(hp = 160, qsec = c(1, 2))
prediction_model <- predict(model4, newdata = new_data)
prediction_model
##        1        2 
## 33.90224 33.01566
difference <- prediction_model[1] - prediction_model[2]
difference
##         1 
## 0.8865796
ggplot(data = mtcars, aes(x= qsec, y = mpg + hp)) +
  geom_point() + 
  geom_smooth(method = "lm", col = "green") + 
  labs(title = "Multiple Linear Regression of MPG & HP on QSEC",
       subtitle = "SNHU Module Two MTCARS",
       caption = "The green line is a simple linear fit of the sum MPG + HP on QSEC.
    Points closer to the line indicate a stronger linear relationship between
    QSEC and MPG combined with HP. Points farther from the line have larger residuals
    and could be considered potential outliers.",
       x = "QSEC",
       y = "MPG & HP") + 
  theme_minimal()+
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(face = "italic", hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
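
The difference of 0.8866 computed above is the QSEC coefficient with its sign flipped, because the first prediction (QSEC = 1) was subtracted from nothing but the second (QSEC = 2) in reverse order. QSEC values of 1 and 2 also lie far below the quarter-mile times in the data. The sketch below repeats the calculation with two values chosen, as an assumption of this example, to fall inside the observed range that the code prints; `fit_hq`, `in_range`, and `change` are illustrative names.

```r
# Refit the two-predictor model on the built-in mtcars data.
fit_hq <- lm(mpg ~ hp + qsec, data = mtcars)

# Observed range of quarter-mile times, to avoid extrapolating.
print(range(mtcars$qsec))

# Two cars with 160 HP whose QSEC values differ by one unit.
in_range <- data.frame(hp = 160, qsec = c(17, 18))
pred <- predict(fit_hq, newdata = in_range)

# Change in predicted MPG when QSEC increases by one unit.
change <- unname(pred[2] - pred[1])
print(change)

# In a model without interactions this equals the QSEC coefficient.
print(unname(coef(fit_hq)["qsec"]))
```

Without an interaction term the change is the same at every HP value, so fixing HP at 160 does not affect the result.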

Display the summary to get the F-test results

model4_summary <- summary(model4)
cat("Overall F-statistic (value, numdf, dendf):", model4_summary$fstatistic, "\n")
## Overall F-statistic (value, numdf, dendf): 25.43136 2 29
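
The fstatistic element of a model summary holds the F value and its two degrees of freedom, not the p-value. The p-value can be recovered from the upper tail of the F distribution, as in this minimal sketch; `fit_f`, `f`, and `p_value` are names assumed for the example.

```r
# Refit the two-predictor model on the built-in mtcars data.
fit_f <- lm(mpg ~ hp + qsec, data = mtcars)

# Named vector with elements "value", "numdf", and "dendf".
f <- summary(fit_f)$fstatistic

# Upper-tail probability of the F distribution gives the overall p-value.
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
cat("Overall F-test p-value:", signif(p_value, 4), "\n")
```

This should agree with the p-value printed on the last line of summary(model4).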

T-Test

Extract coefficients table for p-values

coeff_table <- model4_summary$coefficients
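
The coefficients table is a numeric matrix, so the individual t-test p-values can be pulled out by column name. The sketch below shows one way to do that; the 0.05 cutoff is an assumption of this example, and `fit_t`, `p_vals`, and `significant` are illustrative names.

```r
# Refit the two-predictor model on the built-in mtcars data.
fit_t <- lm(mpg ~ hp + qsec, data = mtcars)
coef_tab <- summary(fit_t)$coefficients

# Column "Pr(>|t|)" holds the two-sided p-value of each individual t-test.
p_vals <- coef_tab[, "Pr(>|t|)"]
print(signif(p_vals, 3))

# Terms whose p-value falls below the assumed 5% significance level.
significant <- names(p_vals)[p_vals < 0.05]
print(significant)
```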

Collect the fitted values and residuals in data frames (the residuals are used for the Normal Q-Q plot below)

fitted_values <- fitted(model4)
fitted_df <- data.frame(Car = names(fitted_values), Fitted = fitted_values)
residuals <- resid(model4)
residuals_df <- data.frame(Car = names(residuals), Residuals = residuals)

View the data frames

print(fitted_df)
##                                     Car    Fitted
## Mazda RX4                     Mazda RX4 24.425370
## Mazda RX4 Wag             Mazda RX4 Wag 23.928885
## Datsun 710                   Datsun 710 23.957305
## Hornet 4 Drive           Hornet 4 Drive 21.783362
## Hornet Sportabout     Hornet Sportabout 18.430337
## Valiant                         Valiant 21.514796
## Duster 360                   Duster 360 13.554988
## Merc 240D                     Merc 240D 25.347344
## Merc 230                       Merc 230 19.984693
## Merc 280                       Merc 280 21.694354
## Merc 280C                     Merc 280C 21.162406
## Merc 450SE                   Merc 450SE 17.670472
## Merc 450SL                   Merc 450SL 17.493156
## Merc 450SLC                 Merc 450SLC 17.138524
## Cadillac Fleetwood   Cadillac Fleetwood 15.041430
## Lincoln Continental Lincoln Continental 14.337352
## Chrysler Imperial     Chrysler Imperial 13.423088
## Fiat 128                       Fiat 128 25.478859
## Honda Civic                 Honda Civic 27.505412
## Toyota Corolla           Toyota Corolla 25.182223
## Toyota Corona             Toyota Corona 22.377722
## Dodge Challenger       Dodge Challenger 20.678150
## AMC Javelin                 AMC Javelin 20.296921
## Camaro Z28                   Camaro Z28 13.936217
## Pontiac Firebird       Pontiac Firebird 18.403740
## Fiat X1-9                     Fiat X1-9 25.984209
## Porsche 914-2             Porsche 914-2 25.819858
## Lotus Europa               Lotus Europa 23.781496
## Ford Pantera L           Ford Pantera L 13.135737
## Ferrari Dino               Ferrari Dino 19.777938
## Maserati Bora             Maserati Bora  7.040973
## Volvo 142E                   Volvo 142E 22.612682
print(residuals_df)
##                                     Car   Residuals
## Mazda RX4                     Mazda RX4 -3.42536974
## Mazda RX4 Wag             Mazda RX4 Wag -2.92888515
## Datsun 710                   Datsun 710 -1.15730529
## Hornet 4 Drive           Hornet 4 Drive -0.38336246
## Hornet Sportabout     Hornet Sportabout  0.26966269
## Valiant                         Valiant -3.41479557
## Duster 360                   Duster 360  0.74501179
## Merc 240D                     Merc 240D -0.94734397
## Merc 230                       Merc 230  2.81530738
## Merc 280                       Merc 280 -2.49435367
## Merc 280C                     Merc 280C -3.36240589
## Merc 450SE                   Merc 450SE -1.27047184
## Merc 450SL                   Merc 450SL -0.19315591
## Merc 450SLC                 Merc 450SLC -1.93852406
## Cadillac Fleetwood   Cadillac Fleetwood -4.64142957
## Lincoln Continental Lincoln Continental -3.93735187
## Chrysler Imperial     Chrysler Imperial  1.27691194
## Fiat 128                       Fiat 128  6.92114100
## Honda Civic                 Honda Civic  2.89458775
## Toyota Corolla           Toyota Corolla  8.71777720
## Toyota Corona             Toyota Corona -0.87772164
## Dodge Challenger       Dodge Challenger -5.17815035
## AMC Javelin                 AMC Javelin -5.09692111
## Camaro Z28                   Camaro Z28 -0.63621745
## Pontiac Firebird       Pontiac Firebird  0.79626007
## Fiat X1-9                     Fiat X1-9  1.31579062
## Porsche 914-2             Porsche 914-2  0.18014154
## Lotus Europa               Lotus Europa  6.61850442
## Ford Pantera L           Ford Pantera L  2.66426292
## Ferrari Dino               Ferrari Dino -0.07793834
## Maserati Bora             Maserati Bora  7.95902698
## Volvo 142E                   Volvo 142E -1.21268239
qqnorm(residuals, main = "Normal Q-Q Plot")
qqline(residuals, col = "red", lwd = 2)

plot(fitted_values, residuals, 
     main = "Residuals vs Fitted Values\nMPG model with HP and QSEC", 
     xlab = "Fitted Values", 
     ylab = "Residuals",
     pch = 19, col = "darkgreen")
abline(h = 0, lty = 2, col = "red")
grid()

Multiple Linear Regression Model for MPG at 160 HP for Each Unit Change in DRAT

General form: MPG = β0 + β1⋅HP + β2⋅DRAT + ϵ

MPG: the dependent variable, fuel economy, that the model predicts.

HP: predictor variable for horsepower.

DRAT: predictor variable for rear axle ratio.

β0: the intercept, the predicted MPG when both HP and DRAT are zero.

β1: the change in MPG for a one-unit increase in HP, holding DRAT constant.

β2: the change in MPG for a one-unit increase in DRAT, holding HP constant.

model5 <- lm(mpg ~ hp + drat, data = mtcars)

summary(model5)
## 
## Call:
## lm(formula = mpg ~ hp + drat, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.0369 -2.3487 -0.6034  1.1897  7.7500 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 10.789861   5.077752   2.125 0.042238 *  
## hp          -0.051787   0.009293  -5.573 5.17e-06 ***
## drat         4.698158   1.191633   3.943 0.000467 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.17 on 29 degrees of freedom
## Multiple R-squared:  0.7412, Adjusted R-squared:  0.7233 
## F-statistic: 41.52 on 2 and 29 DF,  p-value: 3.081e-09

To see the actual change, predict MPG at 160 HP for two DRAT values that are one unit apart.

new_data <- data.frame(hp = 160, drat = c(1, 2))
prediction_model2 <- predict(model5, newdata = new_data)
prediction_model2
##         1         2 
##  7.202155 11.900313
# Change in predicted MPG for a one-unit increase in DRAT (drat = 2 minus drat = 1)
difference2 <- prediction_model2[2] - prediction_model2[1]
difference2
##        2 
## 4.698158
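Because this model is linear with no interaction term, the gap between the two predictions is the DRAT coefficient itself, whatever value HP is held at. A small check of that, refitting the same model so the block runs on its own (the names `fit`, `grid`, `pred`, and `gap` are illustrative and not part of the notebook):

```r
# Refit the model from this section so the example is self-contained.
fit <- lm(mpg ~ hp + drat, data = mtcars)

# Two hypothetical cars: same horsepower, DRAT one unit apart.
grid <- data.frame(hp = 160, drat = c(1, 2))
pred <- predict(fit, newdata = grid)

# Change in predicted MPG for a one-unit increase in DRAT.
gap <- unname(pred[2] - pred[1])

# The gap should match the estimated drat coefficient.
all.equal(gap, unname(coef(fit)["drat"]))
```

Changing `hp = 160` to any other value should leave `gap` unchanged, which is what "holding HP constant" means in the coefficient interpretation.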
ggplot(data = mtcars, aes(x = drat, y = mpg)) +
  geom_point(aes(color = hp)) + 
  geom_smooth(method = "lm", col = "brown") + 
  labs(title = "MPG vs. DRAT, Points Colored by HP",
       subtitle = "SNHU Module Two MTCARS",
       caption = "The brown line is the simple linear regression of MPG on DRAT. 
    Points close to the line have small residuals. Points far from the line have 
    larger residuals and could be considered potential outliers.",
       x = "DRAT",
       y = "MPG",
       color = "HP") + 
  theme_minimal()+
  theme(
    plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(face = "italic", hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'

Display the summary to get the overall F-test results

model5_summary <- summary(model5)
cat("Overall F-test (F, numdf, dendf):", model5_summary$fstatistic, "\n")
## Overall F-test (F, numdf, dendf): 41.52167 2 29
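The `fstatistic` element of a model summary holds the F value and its two degrees of freedom, not the p-value. One way to recover the overall p-value is the upper tail of the F distribution via `pf()`. This is a sketch that refits the model so it runs on its own (`fit`, `f`, and `p_value` are illustrative names):

```r
# Refit the model so the block is self-contained.
fit <- lm(mpg ~ hp + drat, data = mtcars)

# Named vector with elements "value", "numdf", "dendf".
f <- summary(fit)$fstatistic

# Upper-tail probability of the F distribution is the overall p-value.
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
p_value
```

The result should agree with the p-value printed on the last line of `summary(model5)`, which is 3.081e-09.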

T-Test

Extract coefficients table for p-values

coeff_table <- model5_summary$coefficients
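The coefficients table is a numeric matrix with one row per term, and its `Pr(>|t|)` column holds the p-value of each individual t-test. A short sketch of pulling those p-values out and comparing them with a 5% significance level (the level and the names `fit`, `coef_tab`, `p_vals`, and `signif_5pct` are assumptions for illustration):

```r
# Refit the model so the block is self-contained.
fit <- lm(mpg ~ hp + drat, data = mtcars)
coef_tab <- summary(fit)$coefficients

# Two-sided p-value of the t-test for each term.
p_vals <- coef_tab[, "Pr(>|t|)"]

# TRUE where the term is significant at the assumed 5% level.
signif_5pct <- p_vals < 0.05
signif_5pct
```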

Create a Normal Q-Q plot of residuals

fitted_values1 <- fitted(model5)
fitted_df1 <- data.frame(Car = names(fitted_values1), Fitted = fitted_values1)
residuals1 <- resid(model5)
residuals_df1 <- data.frame(Car = names(residuals1), Residuals = residuals1)

qqnorm(residuals1, main = "Normal Q-Q Plot")
qqline(residuals1, col = "red", lwd = 2)

plot(fitted_values1, residuals1, 
     main = "Residuals vs Fitted Values\nMPG model with HP and DRAT", 
     xlab = "Fitted Values", 
     ylab = "Residuals",
     pch = 19, col = "purple")
abline(h = 0, lty = 2, col = "blue")
grid()

Model with Interaction Term and Qualitative Predictor

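General form: MPG = β0 + β1⋅HP + β2⋅QSEC + β3⋅(HP⋅QSEC) + β4⋅CYL6 + β5⋅CYL8 + ϵ

CYL6 and CYL8 are indicator variables equal to 1 for 6-cylinder and 8-cylinder cars, respectively; 4-cylinder cars are the baseline.

Prediction equation (estimates from the model summary, no error term): MPG-hat = 24.5056 + 0.1419⋅HP + 0.5316⋅QSEC - 0.0125⋅(HP⋅QSEC) - 4.4084⋅CYL6 - 4.5808⋅CYL8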

model6 <- lm(mpg ~ hp + qsec + qsec:hp + factor(cyl), data = mtcars)
summary(model6)
## 
## Call:
## lm(formula = mpg ~ hp + qsec + qsec:hp + factor(cyl), data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0004 -1.6264 -0.2424  1.3322  5.7974 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  24.505565  13.186080   1.858   0.0745 .
## hp            0.141850   0.079164   1.792   0.0848 .
## qsec          0.531630   0.746717   0.712   0.4828  
## factor(cyl)6 -4.408372   1.627676  -2.708   0.0118 *
## factor(cyl)8 -4.580823   2.555742  -1.792   0.0847 .
## hp:qsec      -0.012526   0.005251  -2.386   0.0246 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.692 on 26 degrees of freedom
## Multiple R-squared:  0.8327, Adjusted R-squared:  0.8005 
## F-statistic: 25.88 on 5 and 26 DF,  p-value: 2.526e-09
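With the interaction term in the model, the effect of QSEC on MPG is no longer a single number: it is the QSEC coefficient plus the interaction coefficient times HP. A minimal sketch of that calculation, refitting the same model so the block runs on its own (`fit`, `b`, and `qsec_slope` are illustrative names, and the three HP values are arbitrary):

```r
# Refit the interaction model so the block is self-contained.
fit <- lm(mpg ~ hp + qsec + hp:qsec + factor(cyl), data = mtcars)
b <- coef(fit)

# Change in predicted MPG per one-unit increase in QSEC at a given HP:
# slope = b_qsec + b_hp:qsec * HP
qsec_slope <- function(hp) unname(b["qsec"] + b["hp:qsec"] * hp)

qsec_slope(c(100, 175, 250))
```

Because the interaction estimate is negative, the QSEC slope becomes more negative as HP increases.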

Display the summary to get the F-test results

model6_summary <- summary(model6)
cat("Overall F-test (F, numdf, dendf):", model6_summary$fstatistic, "\n")
## Overall F-test (F, numdf, dendf): 25.88205 5 26

T-Test

Extract coefficients table for p-values

coeff_table <- model6_summary$coefficients

Create a Normal Q-Q plot of residuals

fitted_values2 <- fitted(model6)
fitted_df2 <- data.frame(Car = names(fitted_values2), Fitted = fitted_values2)
residuals2 <- resid(model6)
residuals_df2 <- data.frame(Car = names(residuals2), Residuals = residuals2)

qqnorm(residuals2, main = "Normal Q-Q Plot")
qqline(residuals2, col = "green", lwd = 2)

plot(fitted_values2, residuals2, 
     main = "Residuals vs Fitted Values\nMPG model with HP, QSEC, HP:QSEC, and cylinders", 
     xlab = "Fitted Values", 
     ylab = "Residuals",
     pch = 19, col = "brown")
abline(h = 0, lty = 2, col = "blue")
grid()

Prediction Models

Predicted fuel economy for a car with 175 horsepower, a 14.2-second quarter mile time, and a 3.91 rear axle ratio

new_data1 <- data.frame(hp = 175, qsec = 14.2, drat = 3.91)
# The argument must be spelled `newdata`; with any other name predict()
# falls back to the fitted values of the training data.
prediction_model3 <- predict(model1, newdata = new_data1)
prediction_model3
prediction_model3_df <- data.frame(Predict = prediction_model3)
prediction_model3_df

Predict with a 95% Prediction Interval

prediction_with_interval <- predict(model1, newdata = new_data1, interval = "prediction", level = 0.95)

Convert the result to a data frame for easy viewing

prediction_with_interval_df <- data.frame(prediction_with_interval)
prediction_with_interval_df

Prediction Models

Predicted fuel economy for a car with 175 horsepower, a 14.2-second quarter mile time, and 6 cylinders

new_data2 <- data.frame(hp = 175, qsec = 14.2, cyl = "6")
prediction_model3 <- predict(model6, newdata = new_data2)
prediction_model3
##        1 
## 21.34244
prediction_model3_df <- data.frame(Car = names(prediction_model3), Predict = prediction_model3)
prediction_model3_df
##   Car  Predict
## 1   1 21.34244
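The same prediction can be reproduced by hand from the coefficients, which shows how the indicator variables enter: for a 6-cylinder car CYL6 = 1 and CYL8 = 0, so only the 6-cylinder offset is added. A sketch that refits the model so it runs on its own (`fit`, `b`, `by_hand`, and `via_predict` are illustrative names):

```r
# Refit the interaction model so the block is self-contained.
fit <- lm(mpg ~ hp + qsec + hp:qsec + factor(cyl), data = mtcars)
b <- coef(fit)

hp <- 175
qsec <- 14.2

# Intercept + main effects + interaction + 6-cylinder offset.
by_hand <- unname(b["(Intercept)"] + b["hp"] * hp + b["qsec"] * qsec +
                  b["hp:qsec"] * hp * qsec + b["factor(cyl)6"])

# The same prediction through predict().
via_predict <- unname(predict(fit, newdata = data.frame(hp = hp, qsec = qsec, cyl = 6)))

all.equal(by_hand, via_predict)
```

Both values should match the 21.34244 MPG printed above.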

Predict with a 95% Prediction Interval

prediction_with_interval1 <- predict(model6, newdata = new_data2, interval = "prediction", level = 0.95)

Convert the result to a data frame for easy viewing

prediction_with_interval_df1 <- data.frame(prediction_with_interval1)
prediction_with_interval_df1

End of Module Two Jupyter Notebook

Attach the HTML output along with your problem set report for the Module Two Problem Set. The HTML output can be downloaded by clicking File, then Download as, then HTML. Be sure to answer all of the questions in your problem set report.