1. The datasets package in R contains a small data set called mtcars, with n = 32 observations of the characteristics of different automobiles. Create a new data frame from part of this data set using this command: myCars <- data.frame(mtcars[, 1:6]).
# Load the datasets package (if not already loaded)
library(datasets)

# Create the new data frame myCars
myCars <- data.frame(mtcars[, 1:6])

# Display the first few rows of myCars
head(myCars)
##                    mpg cyl disp  hp drat    wt
## Mazda RX4         21.0   6  160 110 3.90 2.620
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875
## Datsun 710        22.8   4  108  93 3.85 2.320
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215
## Hornet Sportabout 18.7   8  360 175 3.15 3.440
## Valiant           18.1   6  225 105 2.76 3.460
Create and interpret a bivariate correlation matrix using cor(myCars), keeping in mind that you will be trying to predict the mpg variable. Which other variable might be the single best predictor of mpg?
# Calculate the correlation matrix
cor_matrix <- cor(myCars)

# Display the correlation matrix
print(round(cor_matrix, 2))  # Round to 2 decimal places for readability
##        mpg   cyl  disp    hp  drat    wt
## mpg   1.00 -0.85 -0.85 -0.78  0.68 -0.87
## cyl  -0.85  1.00  0.90  0.83 -0.70  0.78
## disp -0.85  0.90  1.00  0.79 -0.71  0.89
## hp   -0.78  0.83  0.79  1.00 -0.45  0.66
## drat  0.68 -0.70 -0.71 -0.45  1.00 -0.71
## wt   -0.87  0.78  0.89  0.66 -0.71  1.00
# 'wt' (weight) has the strongest correlation with 'mpg' (r = -0.87), a negative
# relationship indicating that heavier cars tend to get lower gas mileage.
# 'wt' therefore appears to be the best single predictor of mpg in this data set.
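# As a quick programmatic check (a minimal sketch using only the cor_matrix
# computed above), the candidate predictors can be ranked by the absolute
# size of their correlation with mpg:
sort(abs(cor_matrix["mpg", -1]), decreasing = TRUE)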
2. Run a multiple regression analysis on the myCars data with lm(), using mpg as the dependent variable and wt (weight) and hp (horsepower) as the predictors. Make sure to say whether or not the overall R-squared was significant. If it was significant, report the value and say in your own words whether it seems like a strong result or not. Review the significance tests on the coefficients (B-weights). For each one that was significant, report its value and say in your own words whether it seems like a strong result or not.
# Run the multiple regression
model <- lm(mpg ~ wt + hp, data = myCars)

# View the summary of the model
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt + hp, data = myCars)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.941 -1.600 -0.182  1.050  5.854 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
## wt          -3.87783    0.63273  -6.129 1.12e-06 ***
## hp          -0.03177    0.00903  -3.519  0.00145 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared:  0.8268, Adjusted R-squared:  0.8148 
## F-statistic: 69.21 on 2 and 29 DF,  p-value: 9.109e-12
# The overall model was significant, F(2, 29) = 69.21, p = 9.109e-12, so the
# R-squared of 0.8268 is a reliable result: weight and horsepower together
# account for roughly 83% of the variance in mpg, which seems like a strong result.
# Weight significantly predicts mpg, with heavier cars getting lower mileage (b = -3.88, p < .001).
# Horsepower is also a significant predictor of mpg (b = -0.032, p = .00145).

# Weight (b = -3.88, p < .001): the significantly negative coefficient indicates a strong inverse relationship between weight and MPG. Because wt is measured in thousands of pounds, every 1,000-pound increase in weight is associated with a decrease of approximately 3.88 MPG, likely because heavier vehicles require more energy to move.

# Horsepower (b = -0.032, p = .00145): the negative coefficient likewise shows a statistically significant inverse relationship with MPG: each additional horsepower is associated with a drop of about 0.032 MPG, likely due to the increased energy demands of more powerful engines. The raw B-weights are not directly comparable, since the predictors are on different scales (thousands of pounds vs. single horsepower), but horsepower is clearly a significant factor in a vehicle's fuel efficiency.

# Multiple regression analysis reveals weight and horsepower significantly impact MPG, with heavier and more powerful cars exhibiting lower fuel efficiency. The model effectively predicts MPG, explaining 82.68% of its variance.
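# To gauge how strong these results are, one can also inspect 95% confidence
# intervals for the coefficients (a small sketch reusing the fitted model object):
confint(model, level = 0.95)  # intervals excluding zero match the significant p-values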
3. Using the results of the analysis from Exercise 2, construct a prediction equation for mpg using all three of the coefficients from the analysis (the intercept along with the two B-weights). Pretend that an automobile designer has asked you to predict the mpg for a car with 110 horsepower and a weight of 3 tons. Show your calculation and the resulting value of mpg.
# Prediction equation for MPG:
# mpg = 37.22727 - 3.87783 * wt - 0.03177 * hp

# Predicting MPG for a car with 110 horsepower and 3 tons (6000 pounds) weight

# Regression coefficients from the model
intercept <- 37.22727
wt_coef <- -3.87783
hp_coef <- -0.03177

# Car specifications
hp <- 110
weight_tons <- 3
weight_pounds <- weight_tons * 2000
wt <- weight_pounds / 1000  # Convert to thousands of pounds

# Predict MPG using the regression equation
predicted_mpg <- intercept + wt_coef * wt + hp_coef * hp

# Print the predicted MPG
cat("Predicted MPG for a car with 110 horsepower and 3 tons weight:", round(predicted_mpg, 2), "\n")
## Predicted MPG for a car with 110 horsepower and 3 tons weight: 10.47
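# The hand calculation can be cross-checked with R's predict() function, which
# applies the same fitted coefficients (a sketch reusing the model object from
# Exercise 2; wt is expressed in thousands of pounds, so 3 tons = 6):
predict(model, newdata = data.frame(wt = 6, hp = 110))

# Note: a 6,000-pound car is heavier than any car in mtcars (max wt = 5.424,
# i.e., 5,424 pounds), so this prediction extrapolates beyond the observed data.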
4. Run a multiple regression analysis on the myCars data with lmBF(), using mpg as the dependent variable and wt (weight) and hp (horsepower) as the predictors. Interpret the resulting Bayes factor in terms of the odds in favor of the alternative hypothesis. If you did Exercise 2, do these results strengthen or weaken your conclusions?
# Load the BayesFactor package
if (!requireNamespace("BayesFactor", quietly = TRUE)) {
  install.packages("BayesFactor")
}
library(BayesFactor)
## Loading required package: coda
## Loading required package: Matrix
## ************
## Welcome to BayesFactor 0.9.12-4.7. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).
## 
## Type BFManual() to open the manual.
## ************
# Run the Bayesian multiple regression
bf_model <- lmBF(mpg ~ wt + hp, data = myCars)

# Display the Bayes factor
bf_model
## Bayes factor analysis
## --------------
## [1] wt + hp : 788547604 ±0%
## 
## Against denominator:
##   Intercept only 
## ---
## Bayes factor type: BFlinearModel, JZS
# The Bayes factor is about 7.9e8, meaning the observed data are roughly 788
# million times more likely under the model with weight and horsepower than
# under the intercept-only model. These are overwhelming odds in favor of the
# alternative hypothesis, strongly reinforcing the conclusions from Exercise 2.
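# As a further check (a sketch, not required by the exercise), the two-predictor
# model can be compared against a weight-only model by dividing their Bayes
# factors; the ratio gives the odds in favor of adding hp over wt alone:
bf_wt <- lmBF(mpg ~ wt, data = myCars)
bf_model / bf_wt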
5. Run lmBF() with the same model as for Exercise 4, but with the options posterior=TRUE and iterations=10000. Interpret the resulting information about the coefficients.
# Run the Bayesian multiple regression with posterior sampling
posterior_samples <- lmBF(mpg ~ wt + hp, data = myCars, posterior = TRUE, iterations = 10000)

# Summarize the posterior samples
summary(posterior_samples)
## 
## Iterations = 1:10000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean       SD  Naive SE Time-series SE
## mu   20.0926  0.48623 0.0048623      0.0042706
## wt   -3.7763  0.66764 0.0066764      0.0067732
## hp   -0.0311  0.00942 0.0000942      0.0000942
## sig2  7.4880  2.34441 0.0234441      0.0296373
## g     4.1584 25.94942 0.2594942      0.2594942
## 
## 2. Quantiles for each variable:
## 
##          2.5%      25%      50%      75%    97.5%
## mu   19.14835 19.77881 20.09153 20.40774 21.05005
## wt   -5.09266 -4.21790 -3.78483 -3.33441 -2.44517
## hp   -0.04968 -0.03731 -0.03097 -0.02493 -0.01305
## sig2  4.36765  5.93274  7.11170  8.57267 12.80524
## g     0.35435  0.95359  1.72876  3.41698 18.49216
# Plot the posterior distributions for the coefficients
plot(posterior_samples)

# The posterior distributions for wt and hp are concentrated below zero: the 95%
# credible intervals (wt: -5.09 to -2.45; hp: -0.0497 to -0.0131) exclude zero,
# and the posterior means (wt = -3.78, hp = -0.031) closely match the lm()
# estimates, confirming the negative relationships found in the earlier analyses.
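# Those credible intervals can also be pulled directly from the posterior
# samples (a minimal sketch; this reproduces the 2.5% and 97.5% quantiles
# shown in the summary above):
quantile(posterior_samples[, "wt"], probs = c(0.025, 0.975))
quantile(posterior_samples[, "hp"], probs = c(0.025, 0.975))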