A multiple linear regression model is a generalization of the simple linear regression model. It has \(k\) input variables and takes the form: \[ \hat{y}=a_0+a_1 x_1+a_2 x_2+\cdots+a_k x_k, \] where the \(x_i\) values are the inputs to the system, the \(a_i\) coefficients are the model parameters computed from the measured data, and \(\hat{y}\) is the output value predicted by the model.
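As a minimal sketch of how the \(a_i\) coefficients can be computed from measured data, the example below assumes ordinary least squares (the method `lm()` uses) and solves the normal equations directly. The predictors `wt` and `hp` from `mtcars` are chosen only for illustration.

```r
# Illustrative sketch: least-squares estimates of a_0, a_1, a_2 for
# mpg = a_0 + a_1 * wt + a_2 * hp (predictors chosen only as an example)
data(mtcars)

# Design matrix: a column of ones for the intercept a_0, then the inputs x_1, x_2
X <- cbind(intercept = 1, wt = mtcars$wt, hp = mtcars$hp)
y <- mtcars$mpg

# Solve the normal equations (X'X) a = X'y for the coefficient vector a
a_hat <- solve(crossprod(X), crossprod(X, y))

# Predicted outputs y_hat = a_0 + a_1 * x_1 + a_2 * x_2
y_hat <- X %*% a_hat

# lm() should give the same coefficients up to numerical precision
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(cbind(normal_equations = as.numeric(a_hat), lm = unname(coef(fit))))
```

In practice `lm()` is preferable because it uses a numerically more stable QR decomposition; the normal equations are shown only to make the formula concrete.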
# Load the mtcars dataset
data(mtcars)
# Add the required terms to the dataset
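mtcars$x1_squared <- mtcars$mpg^2 # Quadratic term: the square of mpg, which is also the response in the model below
mtcars$dichotomous <- as.factor(ifelse(mtcars$cyl > median(mtcars$cyl), 1, 0)) # Dichotomous term: 1 if cyl is above its median, else 0
mtcars$x2 <- rnorm(nrow(mtcars)) # Quantitative variable for the interaction: random noise, so results vary between runs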
# Build a multiple regression model
model <- lm(mpg ~ wt + x1_squared + dichotomous + x2 + dichotomous:x2, data = mtcars)
# Display the model summary
summary(model)
##
## Call:
## lm(formula = mpg ~ wt + x1_squared + dichotomous + x2 + dichotomous:x2,
## data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.3967 -0.3000 -0.1028  0.3119  1.4706
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)     14.9279364  1.0479404  14.245 8.58e-14 ***
## wt              -0.7708947  0.2442802  -3.156  0.00402 **
## x1_squared       0.0183835  0.0008503  21.619  < 2e-16 ***
## dichotomous1    -1.0192270  0.3720813  -2.739  0.01097 *
## x2               0.1286721  0.1796677   0.716  0.48027
## dichotomous1:x2 -0.3536109  0.2638667  -1.340  0.19180
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.691 on 26 degrees of freedom
## Multiple R-squared: 0.989, Adjusted R-squared: 0.9869
## F-statistic: 466.4 on 5 and 26 DF, p-value: < 2.2e-16
# Extract residuals from the fitted multiple regression model
residual <- residuals(model)
# Create a histogram of residuals
hist(residual, breaks = 20, col = "skyblue", border = "black",
     main = "Histogram of mtcars Residuals", xlab = "Residuals", ylab = "Frequency")
The histogram is roughly bell shaped, which suggests that the residuals are approximately normally distributed. This is consistent with the normality assumption of multiple linear regression, so it might be said that the model fits the data set reasonably well.
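A histogram can be hard to judge with only 32 observations, so a normal Q-Q plot and a Shapiro-Wilk test are common complements. The sketch below refits the same model so that it runs on its own. The `set.seed(1)` call is an assumption added here for reproducibility, so its numbers will not match the summary output shown earlier.

```r
# Rebuild the model so this block is self-contained
data(mtcars)
set.seed(1)  # assumed seed, not part of the original analysis
mtcars$x1_squared <- mtcars$mpg^2
mtcars$dichotomous <- as.factor(ifelse(mtcars$cyl > median(mtcars$cyl), 1, 0))
mtcars$x2 <- rnorm(nrow(mtcars))
model <- lm(mpg ~ wt + x1_squared + dichotomous + x2 + dichotomous:x2, data = mtcars)
res <- residuals(model)

# Normal Q-Q plot: points close to the reference line suggest approximate normality
qqnorm(res, main = "Normal Q-Q Plot of mtcars Residuals")
qqline(res, col = "red")

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(res)
print(sw)
```

With such a small sample the test has limited power, so a non-significant p-value is weak support for normality rather than proof of it.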