# Load mtcars dataset
data <- mtcars
# Converting transmission (am) to a factor (0 = automatic, 1 = manual)
data$am <- factor(data$am, levels = c(0,1), labels = c("Automatic", "Manual"))
# Fit a logistic regression model predicting manual transmission based on horsepower (hp), weight (wt), and cylinders (cyl)
model <- glm(am ~ hp + wt + cyl, data = data, family = binomial)
summary(model)
##
## Call:
## glm(formula = am ~ hp + wt + cyl, family = binomial, data = data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 19.70288 8.11637 2.428 0.0152 *
## hp 0.03259 0.01886 1.728 0.0840 .
## wt -9.14947 4.15332 -2.203 0.0276 *
## cyl 0.48760 1.07162 0.455 0.6491
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 43.2297 on 31 degrees of freedom
## Residual deviance: 9.8415 on 28 degrees of freedom
## AIC: 17.841
##
## Number of Fisher Scoring iterations: 8
# Load clarify, which provides sim(), sim_ame(), and sim_setx()
library(clarify)
# Simulating model parameters (n = 100 simulations to reduce runtime)
set.seed(123)
sim_model <- sim(model, n = 100)
# Computing Average Marginal Effects (AMEs) for horsepower (hp)
ames <- sim_ame(sim_model, var = "hp")
print(ames)
## A `clarify_est` object (from `sim_ame()`)
## - Average marginal effect of `hp`
## - 100 simulated values
## - 1 quantity estimated:
## E[dY/d(hp)] 0.00143393
# Defining prediction scenarios: Low horsepower (100) vs. High horsepower (200)
x_pred <- data.frame(hp = 100, wt = mean(data$wt), cyl = median(data$cyl))
x1_pred <- data.frame(hp = 200, wt = mean(data$wt), cyl = median(data$cyl))
# Computing predicted probabilities and first difference using sim_setx()
predictions <- sim_setx(sim_model, x = x_pred, x1 = x1_pred)
# Extracting estimates
prob_low_hp <- predictions[1] # Probability at hp = 100
prob_high_hp <- predictions[2] # Probability at hp = 200
The mtcars dataset is used to build a logistic regression model that predicts the probability of a car having a manual transmission (am) based on horsepower (hp), weight (wt), and the number of cylinders (cyl). Since am is coded as 0/1, it is converted into a factor with two levels, “Automatic” (0) and “Manual” (1). With a factor response, glm() treats the first level as the reference outcome, so the model estimates the probability of “Manual”. The glm() function with family = binomial fits the model using the default logit link. The summary(model) function reports the coefficient estimates with their standard errors, z values, and p-values, which indicate the direction of each predictor's association with a manual transmission and the strength of the statistical evidence for it.
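The factor coding described above can be checked directly. The following is a minimal sketch, not part of the original analysis: it refits the same model and compares it with a fit on an explicit 0/1 outcome. The names `model_01` and `fitted_prob` are introduced here purely for illustration.

```r
# Recreate the factor coding used in the text
data <- mtcars
data$am <- factor(data$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# For a binomial glm() with a factor response, the first level is the
# reference ("failure") and the second is the modeled outcome ("success")
levels(data$am)
table(data$am)

model <- glm(am ~ hp + wt + cyl, data = data, family = binomial)

# Fitted values are therefore probabilities of "Manual"
fitted_prob <- fitted(model)
range(fitted_prob)

# Illustrative check (model_01 is a hypothetical name): fitting the same
# formula to an explicit 0/1 indicator should give the same coefficients
data$am01 <- as.integer(data$am == "Manual")
model_01 <- glm(am01 ~ hp + wt + cyl, data = data, family = binomial)
all.equal(unname(coef(model)), unname(coef(model_01)))
```

If the final comparison returns TRUE, the factor version and the 0/1 version describe the same model, which confirms that the reported coefficients refer to the log-odds of a manual transmission.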
The model output shows a large intercept (19.70, p = 0.0152): when hp, wt, and cyl are all zero, the log-odds of a car being manual are large, though this scenario is unrealistic and the intercept has no practical interpretation. The coefficient for horsepower (0.0326, p = 0.0840) suggests that each additional unit of horsepower raises the log-odds of a manual transmission by about 0.03, holding weight and cylinders constant, but this effect is not statistically significant at the 5% level. The coefficient for weight (-9.15, p = 0.0276) is statistically significant, indicating that heavier cars are much more likely to have an automatic transmission; since wt is recorded in units of 1,000 lbs, each additional 1,000 lbs lowers the log-odds of a manual transmission by about 9.15. Meanwhile, the coefficient for cylinders (0.4876, p = 0.6491) is not statistically significant, so the data provide little evidence that the number of cylinders is associated with transmission type once horsepower and weight are accounted for. The residual deviance of 9.84 on 28 degrees of freedom, compared to a null deviance of 43.23 on 31, is a reduction of about 33.39 for three added parameters, suggesting that the model explains a substantial share of the variation in transmission type. The Akaike Information Criterion (AIC) of 17.84 is not informative on its own; it becomes useful when compared against the AIC of alternative models fitted to the same data. Since hp and cyl are not statistically significant, a simpler model without them may be worth considering for interpretability, although with only 32 observations non-significance is weak evidence that a predictor is irrelevant.
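The quantities discussed in this paragraph can be derived from the fitted object with base R. The sketch below is an illustration rather than part of the original analysis: the reduced model `model_wt` (weight only) is a hypothetical comparison chosen here, and McFadden's pseudo R-squared is one of several possible summary measures.

```r
# Refit the model from the text so this block is self-contained
data <- mtcars
data$am <- factor(data$am, levels = c(0, 1), labels = c("Automatic", "Manual"))
model <- glm(am ~ hp + wt + cyl, data = data, family = binomial)

# Coefficients are on the log-odds scale; exponentiating gives odds ratios
odds_ratios <- exp(coef(model))
odds_ratios

# Deviance reduction relative to the intercept-only model,
# with a likelihood-ratio test on the difference in degrees of freedom
dev_drop <- model$null.deviance - model$deviance
df_drop <- model$df.null - model$df.residual
lr_p_value <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)

# McFadden's pseudo R-squared; for ungrouped binary data the deviance
# equals -2 * log-likelihood, so this ratio form is equivalent
pseudo_r2 <- 1 - model$deviance / model$null.deviance

# Hypothetical reduced model (assumption: weight only) for comparison
model_wt <- glm(am ~ wt, data = data, family = binomial)
AIC(model, model_wt)
anova(model_wt, model, test = "Chisq")
```

Comparing AIC values across the two fits, together with the likelihood-ratio test from anova(), gives a more direct basis for deciding whether hp and cyl can be dropped than the individual p-values alone.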
To further explore the model’s implications, post-estimation analysis is conducted using the clarify package. sim() draws 100 sets of coefficients from their estimated sampling distribution to represent uncertainty in the coefficient estimates. Average Marginal Effects (AMEs) are computed with sim_ame() for hp, revealing that a one-unit increase in horsepower increases the probability of a manual transmission by about 0.0014, or roughly 0.14 percentage points, on average. This small per-unit effect, together with the non-significant coefficient, suggests that horsepower alone does not meaningfully determine transmission type. Further predictions are generated using sim_setx(), comparing the probability of a car having a manual transmission at low (hp = 100) and high (hp = 200) horsepower levels while holding weight at its mean and cylinder count at its median. Because the estimated hp coefficient is positive, the predicted probability of a manual transmission is higher at hp = 200 than at hp = 100; the first difference from sim_setx() quantifies that increase and, through the simulated draws, its uncertainty. These results are consistent with the conclusion that weight has the most substantial impact on transmission type, while horsepower and cylinder count contribute comparatively little to predicting whether a car is manual or automatic.
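As a cross-check on the simulation-based quantities, the point estimates can be approximated with base R alone. This is a minimal sketch under stated assumptions, not the clarify implementation: it uses the analytic derivative of the logistic function for the AME and predict() for the two scenarios, and it yields point estimates only, without the simulated uncertainty. The names `ame_hp`, `x_low`, `x_high`, and `first_diff` are introduced here for illustration.

```r
# Refit the model from the text so this block is self-contained
data <- mtcars
data$am <- factor(data$am, levels = c(0, 1), labels = c("Automatic", "Manual"))
model <- glm(am ~ hp + wt + cyl, data = data, family = binomial)

# For a logit model, dP/d(hp) = beta_hp * p * (1 - p) for each observation;
# averaging over the sample gives a point estimate of the AME
p_hat <- fitted(model)
ame_hp <- mean(coef(model)["hp"] * p_hat * (1 - p_hat))
ame_hp

# The two scenarios from the text: hp = 100 vs. hp = 200,
# with weight at its mean and cylinders at its median
x_low <- data.frame(hp = 100, wt = mean(data$wt), cyl = median(data$cyl))
x_high <- data.frame(hp = 200, wt = mean(data$wt), cyl = median(data$cyl))

# Predicted probabilities of "Manual" and their first difference
prob_low <- predict(model, newdata = x_low, type = "response")
prob_high <- predict(model, newdata = x_high, type = "response")
first_diff <- prob_high - prob_low
c(low = unname(prob_low), high = unname(prob_high), diff = unname(first_diff))
```

The first difference over a 100-unit change in horsepower is on a different scale from the per-unit AME, so the two numbers are not directly comparable; the AME describes an average local slope, while the first difference describes one specific contrast.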