Assignment 5
Introduction
This assignment extends Assignment 4 by applying simulation-based post-estimation methods from the clarify package to interpret our logistic regression model, which predicts transmission type (am) in the mtcars dataset.
We will:
Simulate model coefficients to assess uncertainty.
Compute Average Marginal Effects (AME) to quantify the effects of predictors.
Estimate Predicted Probabilities at different mpg values for deeper insight.
Evaluate Model Fit using Likelihood Ratio Test, AIC, and BIC.
Data Preparation
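# Load packages and the mtcars dataset
library(dplyr)   # provides %>% and mutate()
library(clarify) # simulation-based post-estimation functions
mtcars_data <- mtcars
# Convert 'am' (Transmission Type) into a labeled factor: 0 = Automatic, 1 = Manual
mtcars_data <- mtcars_data %>%
  mutate(am = factor(am, labels = c("Automatic", "Manual")),
         hp_group = ifelse(hp > median(hp), "High", "Low")) # Dichotomize horsepower at the median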
# Check dataset structure
str(mtcars_data)
## 'data.frame': 32 obs. of 12 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp : num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat : num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec : num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : Factor w/ 2 levels "Automatic","Manual": 2 2 2 1 1 1 1 1 1 1 ...
## $ gear : num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb : num 4 4 1 1 2 1 4 2 2 4 ...
## $ hp_group: chr "Low" "Low" "Low" "Low" ...
Logistic Regression Model
# Model: Predicting Transmission Type
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)
summary(m1)
##
## Call:
## glm(formula = am ~ mpg + hp_group + wt, family = binomial, data = mtcars_data)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 24.8793 13.0045 1.913 0.0557 .
## mpg -0.1562 0.3220 -0.485 0.6277
## hp_groupLow -2.3475 2.2507 -1.043 0.2970
## wt -6.7465 2.7408 -2.461 0.0138 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 43.230 on 31 degrees of freedom
## Residual deviance: 15.814 on 28 degrees of freedom
## AIC: 23.814
##
## Number of Fisher Scoring iterations: 7
Key Findings:
mpg is not statistically significant (p = 0.6277): once weight and horsepower group are in the model, fuel efficiency adds little information about transmission type.
Weight (wt) is statistically significant (p = 0.0138). Its negative coefficient on the log-odds of a manual transmission indicates that heavier cars are more likely to be automatic.
hp_group is not statistically significant (p = 0.2970), suggesting that the high vs. low horsepower split does not help explain transmission type after accounting for mpg and weight.
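Because the coefficients above are on the log-odds scale, it can help to view them as odds ratios. The following is a minimal sketch, not part of the original analysis; it rebuilds the model with base R so that it runs on its own, and it assumes Wald intervals from `confint.default()` are acceptable for illustration.

```r
# Rebuild the data and model from the report so the sketch is self-contained
mtcars_data <- transform(
  mtcars,
  am = factor(am, labels = c("Automatic", "Manual")),
  hp_group = ifelse(hp > median(hp), "High", "Low")
)
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)

# Exponentiate coefficients and Wald 95% limits to get odds ratios
or_table <- exp(cbind(OR = coef(m1), confint.default(m1)))
round(or_table, 4)
```

An odds ratio below 1 for wt corresponds to the negative coefficient in the summary: each additional unit of weight multiplies the odds of a manual transmission by a factor smaller than 1.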
Post-Estimation with clarify
1. Simulating Model Coefficients
## Length Class Mode
## sim.coefs 4000 -none- numeric
## coefs 4 -none- numeric
## vcov 16 -none- numeric
## fit 30 glm list
Significance of This Approach:
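Rather than relying on a single point estimate, clarify draws many plausible coefficient vectors from the model's estimated sampling distribution (here, 4000 simulated values for 4 coefficients, i.e. 1000 draws).
Every later quantity (marginal effects, predicted probabilities) is computed on each draw, so its interval reflects the uncertainty in all coefficients at once.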
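The chunk that produced the output above is not shown in the rendered report. With clarify it was presumably a call along the lines of `sim_coefs <- sim(m1, n = 1000)`. The sketch below illustrates the underlying idea in base R plus MASS, assuming the draws come from a multivariate normal distribution centered on the estimated coefficients; the seed and the object name `draws` are illustrative choices, not taken from the original.

```r
library(MASS)  # mvrnorm(); ships with standard R installations

# Rebuild the data and model from the report so the sketch is self-contained
mtcars_data <- transform(
  mtcars,
  am = factor(am, labels = c("Automatic", "Manual")),
  hp_group = ifelse(hp > median(hp), "High", "Low")
)
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)

set.seed(123)  # arbitrary seed, only for reproducibility

# Draw 1000 coefficient vectors from N(beta_hat, vcov(beta_hat))
draws <- mvrnorm(n = 1000, mu = coef(m1), Sigma = vcov(m1))

dim(draws)  # 1000 rows (simulations) by 4 columns (coefficients)

# Simulation-based 95% intervals for each coefficient
t(apply(draws, 2, quantile, probs = c(0.025, 0.975)))
```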
2. Average Marginal Effects (AME)
sim_ame_results <- sim_ame(sim_coefs, var = "mpg", contrast = "rd", verbose = FALSE)
summary(sim_ame_results)
## Estimate 2.5 % 97.5 %
## E[dY/d(mpg)] -0.0116 -0.0735 0.0454
Findings: Average Marginal Effects (AME)
The AME of mpg (-0.0116) means that, averaged over the cars in the data, a one-unit increase in mpg changes the probability of a manual transmission by about -1.2 percentage points.
The 95% interval (-0.0735 to 0.0454) includes zero, which is consistent with the non-significant mpg coefficient; transmission type is more closely associated with other variables such as weight.
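To make the AME less of a black box, the point estimate can be approximated by hand. This is a minimal sketch, assuming the analytic derivative of the logistic function (the coefficient times p(1 - p)) averaged over the observed cars; clarify computes its estimate by its own procedure, so small numerical differences from the -0.0116 reported above are possible.

```r
# Rebuild the data and model from the report so the sketch is self-contained
mtcars_data <- transform(
  mtcars,
  am = factor(am, labels = c("Automatic", "Manual")),
  hp_group = ifelse(hp > median(hp), "High", "Low")
)
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)

# Fitted probability of "Manual" for each car
p_hat <- fitted(m1)

# For a logit model, dP/d(mpg) = beta_mpg * p * (1 - p) for each observation;
# the AME is the average of these per-car slopes
ame_mpg <- mean(coef(m1)[["mpg"]] * p_hat * (1 - p_hat))
ame_mpg
```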
3. Predictions at Set Values
# Define mpg values for prediction
sim_pred <- sim_setx(sim_coefs, x = list(mpg = c(15, 25)))
summary(sim_pred)
## Estimate 2.5 % 97.5 %
## mpg = 15 0.18011 0.00351 0.95022
## mpg = 25 0.04406 0.00108 0.69067
# Plot predicted probabilities
plot(sim_pred, main = "Predicted Probability of Transmission Type at Different MPG Values")
Findings: Predicted Probabilities
At 15 mpg, the predicted probability of a manual transmission is 18% (95% interval: 0.4% to 95%), with the other predictors held at typical values.
At 25 mpg, it is 4.4% (95% interval: 0.1% to 69%). The point estimates fall as mpg rises, but the intervals are very wide and overlap almost completely, so the data do not establish a real difference.
This is consistent with other factors (like weight) having a stronger association with transmission type than mpg.
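The point estimates above can be cross-checked with `predict()`. This sketch assumes the unspecified predictors are held at the mean weight and the more common horsepower group ("Low"); that assumption is an inference from the reported numbers, not something stated in the original.

```r
# Rebuild the data and model from the report so the sketch is self-contained
mtcars_data <- transform(
  mtcars,
  am = factor(am, labels = c("Automatic", "Manual")),
  hp_group = ifelse(hp > median(hp), "High", "Low")
)
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)

# Two hypothetical cars that differ only in mpg
new_cars <- data.frame(
  mpg = c(15, 25),
  hp_group = "Low",            # assumed typical value (most frequent group)
  wt = mean(mtcars_data$wt)    # assumed typical value (sample mean)
)

# Predicted probability of a manual transmission for each car
p_manual <- predict(m1, newdata = new_cars, type = "response")
round(p_manual, 3)
```

Unlike the simulation-based summary, this gives point estimates only, so it says nothing about the width of the intervals.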
Model Selection and Fit
## Likelihood ratio test
##
## Model 1: am ~ mpg + hp_group + wt
## Model 2: am ~ 1
## #Df LogLik Df Chisq Pr(>Chisq)
## 1 4 -7.9071
## 2 1 -21.6149 -3 27.416 4.817e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## [1] 23.81423
## [1] 29.67717
Findings: Model Selection and Fit
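The Likelihood Ratio Test (chi-squared = 27.42 on 3 degrees of freedom, p < 0.001) shows that the model fits significantly better than the intercept-only model, meaning the predictors (mpg, hp_group, wt) jointly help explain transmission type.
AIC (23.81) and BIC (29.68) are only meaningful relative to competing models: both are well below the intercept-only model's values (AIC 45.23, BIC 46.70, computed from its log-likelihood of -21.6149), so the added predictors improve fit by more than enough to offset the complexity penalty.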
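The code behind the output above is not shown in the rendered report; the table layout resembles what `lmtest::lrtest()` prints. The same quantities can be reproduced in base R, as in this minimal sketch (the object names `m0`, `lr_stat`, and `lr_p` are illustrative):

```r
# Rebuild the data and model from the report so the sketch is self-contained
mtcars_data <- transform(
  mtcars,
  am = factor(am, labels = c("Automatic", "Manual")),
  hp_group = ifelse(hp > median(hp), "High", "Low")
)
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)
m0 <- glm(am ~ 1, data = mtcars_data, family = binomial)  # intercept-only model

# Likelihood ratio statistic: twice the gain in log-likelihood
lr_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
lr_df <- attr(logLik(m1), "df") - attr(logLik(m0), "df")
lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
c(statistic = lr_stat, df = lr_df, p_value = lr_p)

# Information criteria for both models, side by side
fit_table <- data.frame(
  model = c("full", "intercept-only"),
  AIC = c(AIC(m1), AIC(m0)),
  BIC = c(BIC(m1), BIC(m0))
)
fit_table
```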
Key Findings
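MPG has a small, non-significant effect on transmission type once weight and horsepower group are in the model.
Weight (wt) is the only statistically significant predictor: heavier cars are more likely to be automatic.
clarify supports these findings:
The average marginal effect of mpg is close to zero, with an interval that includes zero.
Predicted probabilities of a manual transmission fall from 18% at 15 mpg to 4.4% at 25 mpg, but the intervals are too wide to treat this as a reliable trend.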
Conclusion
This assignment demonstrated how clarify enhances logistic regression analysis by:
Estimating uncertainty through simulations.
Providing meaningful marginal effects instead of raw log-odds.
Visualizing predicted probabilities to communicate findings effectively.