Assignment 5

Introduction

This assignment extends Assignment 4 by applying post-estimation methods from the clarify package to the logistic regression model that predicts transmission type (am) in the mtcars dataset.

We will:

  • Simulate model coefficients to assess uncertainty.

  • Compute Average Marginal Effects (AME) to quantify the effects of predictors.

  • Estimate Predicted Probabilities at different mpg values for deeper insight.

  • Evaluate Model Fit using Likelihood Ratio Test, AIC, and BIC.

Data Preparation

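# Load the packages used throughout the analysis
library(dplyr)    # %>% and mutate()
library(clarify)  # sim(), sim_ame(), sim_setx()
library(lmtest)   # lrtest()

# Load mtcars dataset
mtcars_data <- mtcars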

# Convert 'am' (Transmission Type) into a binary factor
mtcars_data <- mtcars_data %>%
  mutate(am = factor(am, labels = c("Automatic", "Manual")),
         hp_group = ifelse(hp > median(hp), "High", "Low"))  # Dichotomize horsepower

# Check dataset structure
str(mtcars_data)
## 'data.frame':    32 obs. of  12 variables:
##  $ mpg     : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl     : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp    : num  160 160 108 258 360 ...
##  $ hp      : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat    : num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt      : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec    : num  16.5 17 18.6 19.4 17 ...
##  $ vs      : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am      : Factor w/ 2 levels "Automatic","Manual": 2 2 2 1 1 1 1 1 1 1 ...
##  $ gear    : num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb    : num  4 4 1 1 2 1 4 2 2 4 ...
##  $ hp_group: chr  "Low" "Low" "Low" "Low" ...

Logistic Regression Model

# Model: Predicting Transmission Type
m1 <- glm(am ~ mpg + hp_group + wt, data = mtcars_data, family = binomial)
summary(m1)
## 
## Call:
## glm(formula = am ~ mpg + hp_group + wt, family = binomial, data = mtcars_data)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  24.8793    13.0045   1.913   0.0557 .
## mpg          -0.1562     0.3220  -0.485   0.6277  
## hp_groupLow  -2.3475     2.2507  -1.043   0.2970  
## wt           -6.7465     2.7408  -2.461   0.0138 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 43.230  on 31  degrees of freedom
## Residual deviance: 15.814  on 28  degrees of freedom
## AIC: 23.814
## 
## Number of Fisher Scoring iterations: 7

Key Findings:

  • mpg is not statistically significant (p = 0.6277): once weight and horsepower group are held constant, fuel efficiency adds little information about transmission type.

  • Weight (wt) is statistically significant (p = 0.0138). Its negative coefficient (-6.7465 on the log-odds of a manual transmission) indicates that heavier cars are more likely to be automatic.

  • hp_group is not statistically significant (p = 0.2970), suggesting that horsepower (high vs. low) adds little explanatory power once mpg and weight are in the model.
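
The coefficients above are on the log-odds scale, which is hard to read directly. As a minimal sketch (not part of the original analysis), exponentiating them gives odds ratios with Wald 95% intervals. The names d, fit and or_table are illustrative, and the model is refit with base R only so the snippet runs on its own.

```r
# Rebuild the analysis data with base R (same recoding as in Data Preparation)
d <- mtcars
d$am <- factor(d$am, labels = c("Automatic", "Manual"))
d$hp_group <- ifelse(d$hp > median(d$hp), "High", "Low")

# Same model formula as m1
fit <- glm(am ~ mpg + hp_group + wt, data = d, family = binomial)

# Odds ratios: exp(coefficient); Wald intervals from the normal approximation
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(signif(or_table, 3))
```

An odds ratio below 1 for wt corresponds to the negative log-odds coefficient: each additional 1,000 lbs multiplies the odds of a manual transmission by that factor. With only 32 cars the Wald intervals are very wide, so they should be read as rough guides.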

Post-Estimation with clarify

1. Simulating Model Coefficients

set.seed(123)
sim_coefs <- sim(m1, n = 1000)
summary(sim_coefs)
##           Length Class  Mode   
## sim.coefs 4000   -none- numeric
## coefs        4   -none- numeric
## vcov        16   -none- numeric
## fit         30   glm    list

Significance of This Approach:

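  • sim() draws 1,000 sets of coefficients from their estimated sampling distribution (a multivariate normal centered on the estimates, using the model's covariance matrix), so coefficient uncertainty carries through to any quantity derived from them.

  • The summary() output above only lists the components of the object: sim.coefs holds the 1,000 simulated values for each of the 4 coefficients.

  • This avoids over-reliance on single point estimates.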
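
To make the simulation step concrete, here is a rough sketch of the same idea done by hand. It assumes that the draws come from a multivariate normal distribution with mean coef(m1) and covariance vcov(m1); it uses MASS::mvrnorm rather than clarify, so it is not expected to reproduce the exact draws in sim_coefs. The names d, fit, draws and sim_ci are illustrative.

```r
library(MASS)  # mvrnorm(); MASS ships with standard R installations

# Rebuild the data and model with base R so the snippet runs on its own
d <- mtcars
d$am <- factor(d$am, labels = c("Automatic", "Manual"))
d$hp_group <- ifelse(d$hp > median(d$hp), "High", "Low")
fit <- glm(am ~ mpg + hp_group + wt, data = d, family = binomial)

# Draw 1,000 coefficient vectors around the point estimates
set.seed(123)
draws <- MASS::mvrnorm(n = 1000, mu = coef(fit), Sigma = vcov(fit))

# Percentile summaries of the simulated coefficients (one row per coefficient)
sim_ci <- t(apply(draws, 2, quantile, probs = c(0.025, 0.5, 0.975)))
print(round(sim_ci, 3))
```

Each row of draws is one plausible coefficient vector; any quantity computed row by row (a prediction, a marginal effect) inherits a simulated distribution of its own.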

2. Average Marginal Effects (AME)

sim_ame_results <- sim_ame(sim_coefs, var = "mpg", contrast = "rd", verbose = FALSE)
summary(sim_ame_results)
##              Estimate   2.5 %  97.5 %
## E[dY/d(mpg)]  -0.0116 -0.0735  0.0454
# Plot AME
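# clarify plots are ggplot objects, so the title is added with labs() rather than main
plot(sim_ame_results) +
  ggplot2::labs(title = "Average Marginal Effect of MPG on Transmission")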

Findings: Average Marginal Effects (AME)

  • The AME of mpg (-0.0116) means that a one-unit increase in mpg changes the probability of a manual transmission by about -1.2 percentage points on average, a very small effect.

  • The 95% simulation interval (-0.0735 to 0.0454) includes zero, which is consistent with the non-significant mpg coefficient; transmission type is likely driven by other variables such as weight.
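
For a logistic model, the marginal effect of a continuous predictor on the probability is the coefficient times p(1 - p), so the AME point estimate can be checked by averaging that quantity over the 32 cars. The sketch below does this by hand under that standard formula; it does not reproduce the simulation interval, and the names d, fit, p_hat and ame_mpg are illustrative.

```r
# Rebuild the data and model with base R so the snippet runs on its own
d <- mtcars
d$am <- factor(d$am, labels = c("Automatic", "Manual"))
d$hp_group <- ifelse(d$hp > median(d$hp), "High", "Low")
fit <- glm(am ~ mpg + hp_group + wt, data = d, family = binomial)

# Fitted probability of a manual transmission for each car
p_hat <- fitted(fit)

# Derivative of the probability with respect to mpg, averaged over all cars
b_mpg <- unname(coef(fit)["mpg"])
ame_mpg <- mean(b_mpg * p_hat * (1 - p_hat))
print(ame_mpg)
```

Because most fitted probabilities in this model sit close to 0 or 1, the p(1 - p) factor is small for most cars, which is one reason the average effect on the probability scale is so much smaller than the log-odds coefficient.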

3. Predictions at Set Values

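# Predict at mpg = 15 and mpg = 25; the unspecified predictors (wt, hp_group)
# are held at typical values (mean for numeric, mode for categorical)
sim_pred <- sim_setx(sim_coefs, x = list(mpg = c(15, 25)))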
summary(sim_pred)
##          Estimate   2.5 %  97.5 %
## mpg = 15  0.18011 0.00351 0.95022
## mpg = 25  0.04406 0.00108 0.69067
# Plot predicted probabilities
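# clarify plots are ggplot objects, so the title is added with labs() rather than main
plot(sim_pred) +
  ggplot2::labs(title = "Predicted Probability of Transmission Type at Different MPG Values")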

Findings: Predicted Probabilities

  • At 15 mpg, the predicted probability of a manual transmission is 18% (95% interval: 0.4% to 95%).

  • At 25 mpg, it drops to 4.4% (95% interval: 0.1% to 69%). With weight and horsepower group held fixed, higher mpg is associated with a lower predicted probability of a manual transmission, but the two intervals overlap almost entirely, so the difference is highly uncertain.

  • This suggests that other factors (like weight) have a stronger influence on transmission type than mpg.
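
The point estimates above can be checked with predict(). The sketch below assumes that the predictors not named in sim_setx() are held at the mean of wt and the most frequent hp_group level; under that assumption it should give values close to the 0.18 and 0.044 reported above. The names d, fit, typical and pred are illustrative.

```r
# Rebuild the data and model with base R so the snippet runs on its own
d <- mtcars
d$am <- factor(d$am, labels = c("Automatic", "Manual"))
d$hp_group <- ifelse(d$hp > median(d$hp), "High", "Low")
fit <- glm(am ~ mpg + hp_group + wt, data = d, family = binomial)

# Two hypothetical cars that differ only in mpg
typical <- data.frame(
  mpg      = c(15, 25),
  wt       = mean(d$wt),                             # mean weight
  hp_group = names(which.max(table(d$hp_group)))     # most frequent group
)

# Predicted probability of a manual transmission for each row
pred <- predict(fit, newdata = typical, type = "response")
print(round(pred, 4))
```

This only recovers the point estimates; the wide intervals in the sim_setx() summary come from repeating the same calculation for every simulated coefficient vector.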

Model Selection and Fit

lrtest(m1)   # Likelihood Ratio Test
## Likelihood ratio test
## 
## Model 1: am ~ mpg + hp_group + wt
## Model 2: am ~ 1
##   #Df   LogLik Df  Chisq Pr(>Chisq)    
## 1   4  -7.9071                         
## 2   1 -21.6149 -3 27.416  4.817e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
AIC(m1)      # Akaike Information Criterion
## [1] 23.81423
BIC(m1)      # Bayesian Information Criterion
## [1] 29.67717

Findings: Model Selection and Fit

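  • The Likelihood Ratio Test (chi-squared = 27.416, df = 3, p < 0.001) shows that the model fits significantly better than the intercept-only model, meaning the predictors (mpg, hp_group, wt) jointly help explain transmission type.

  • AIC (23.81) and BIC (29.68) have no absolute interpretation; they are useful only for comparing candidate models fit to the same data, where lower values indicate a better balance between fit and complexity.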
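
The three fit statistics can be reproduced from the log-likelihoods, which shows how they relate to one another. This is a minimal sketch using base R only; the names d, fit, null_fit, lr_stat, aic_manual and bic_manual are illustrative.

```r
# Rebuild the data and model with base R so the snippet runs on its own
d <- mtcars
d$am <- factor(d$am, labels = c("Automatic", "Manual"))
d$hp_group <- ifelse(d$hp > median(d$hp), "High", "Low")
fit      <- glm(am ~ mpg + hp_group + wt, data = d, family = binomial)
null_fit <- glm(am ~ 1, data = d, family = binomial)  # intercept-only model

ll_full <- as.numeric(logLik(fit))
ll_null <- as.numeric(logLik(null_fit))
k <- length(coef(fit))  # number of estimated parameters
n <- nobs(fit)          # number of observations

# Likelihood ratio test against the intercept-only model
lr_stat <- 2 * (ll_full - ll_null)
lr_df   <- k - length(coef(null_fit))
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Information criteria: penalized versions of -2 * log-likelihood
aic_manual <- -2 * ll_full + 2 * k
bic_manual <- -2 * ll_full + k * log(n)

print(c(LR = lr_stat, df = lr_df, p = lr_p, AIC = aic_manual, BIC = bic_manual))
```

BIC penalizes each parameter by log(32), roughly 3.47, rather than 2, which is why it is larger than AIC here.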

Key Findings

  1. MPG has a small, non-significant effect on transmission type.

  2. Weight (wt) is a strong predictor - heavier cars are more likely automatic.

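  3. The clarify results are consistent with these findings:

  • The average marginal effect of mpg is close to zero (-0.0116), and its 95% interval includes zero.

  • Predicted probabilities of a manual transmission fall from 18% at 15 mpg to 4.4% at 25 mpg with the other predictors held at typical values, an unexpected direction, although the wide simulation intervals leave the difference highly uncertain.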

Conclusion

This assignment demonstrated how clarify enhances logistic regression analysis by:

  • Estimating uncertainty through simulations.

  • Providing meaningful marginal effects instead of raw log-odds.

  • Visualizing predicted probabilities to communicate findings effectively.