Assignment 4

Some of the things that can go wrong with interaction terms with logit models are that unlike linear regression,logit models are nonlinear. This means that the effect of an interaction term isn’t simply the product of individual coefficients. This makes it possible for misinterpretation if you directly look at the coefficient of the interaction term without considering marginal effects. Additionally,standard errors and significance tests might be misleading if the results aren’t properly interpreted. However, using a simulation-based approach can help by allowing us to compute and visualize marginal effects and predicted probabilities,which makes interpretation more intuitive, rather than just looking at raw coefficients. It can also improve accuracy of conclusions drawn by the model by accounting for nonlinearities and uncertainty, as well as generating confidence intervals around predictions; which reduces the risk of misleading interpretations.

rm = (list = ls())
gc()

##          used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 538114 28.8    1194238 63.8   686460 36.7
## Vcells 980225  7.5    8388608 64.0  1876069 14.4

library(dplyr)

## Warning: package 'dplyr' was built under R version 4.4.3

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 4.4.3

library(broom)

## Warning: package 'broom' was built under R version 4.4.3

data(mtcars)

# Convert the outcome variable (am: automatic vs manual) into a factor
mtcars$am <- as.factor(mtcars$am)

# Fit a simple logit model with one predictor
model1 <- glm(am ~ mpg, data = mtcars, family = binomial)

# Fit a moderate complexity model with additional predictors
model2 <- glm(am ~ mpg + hp + wt, data = mtcars, family = binomial)

# Fit a complex model with an interaction term
model3 <- glm(am ~ mpg * hp + wt, data = mtcars, family = binomial)

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

# Compare models using AIC and BIC
aic_values <- AIC(model1, model2, model3)
bic_values <- BIC(model1, model2, model3)

# Check if models exist
print(model1)

## 
## Call:  glm(formula = am ~ mpg, family = binomial, data = mtcars)
## 
## Coefficients:
## (Intercept)          mpg  
##      -6.604        0.307  
## 
## Degrees of Freedom: 31 Total (i.e. Null);  30 Residual
## Null Deviance:       43.23 
## Residual Deviance: 29.68     AIC: 33.68

print(model2)

## 
## Call:  glm(formula = am ~ mpg + hp + wt, family = binomial, data = mtcars)
## 
## Coefficients:
## (Intercept)          mpg           hp           wt  
##   -15.72137      1.22930      0.08389     -6.95492  
## 
## Degrees of Freedom: 31 Total (i.e. Null);  28 Residual
## Null Deviance:       43.23 
## Residual Deviance: 8.766     AIC: 16.77

print(model3)

## 
## Call:  glm(formula = am ~ mpg * hp + wt, family = binomial, data = mtcars)
## 
## Coefficients:
## (Intercept)          mpg           hp           wt       mpg:hp  
##   -8.565472     0.684792    -0.051142    -7.927624     0.009486  
## 
## Degrees of Freedom: 31 Total (i.e. Null);  27 Residual
## Null Deviance:       43.23 
## Residual Deviance: 8.106     AIC: 18.11

# Print AIC and BIC values

print(aic_values)

##        df      AIC
## model1  2 33.67517
## model2  4 16.76611
## model3  5 18.10561

print(bic_values)

##        df      BIC
## model1  2 36.60664
## model2  4 22.62905
## model3  5 25.43429

# Likelihood Ratio Test (Comparing nested models)
lr_test_12 <- anova(model1, model2, test = "Chisq")
lr_test_23 <- anova(model2, model3, test = "Chisq")

print(lr_test_12)

## Analysis of Deviance Table
## 
## Model 1: am ~ mpg
## Model 2: am ~ mpg + hp + wt
##   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
## 1        30    29.6752                          
## 2        28     8.7661  2   20.909 2.882e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

print(lr_test_23)

## Analysis of Deviance Table
## 
## Model 1: am ~ mpg + hp + wt
## Model 2: am ~ mpg * hp + wt
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1        28     8.7661                     
## 2        27     8.1056  1   0.6605   0.4164

# Summary of the best model
summary(model3)

## 
## Call:
## glm(formula = am ~ mpg * hp + wt, family = binomial, data = mtcars)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.565472  53.427170  -0.160    0.873
## mpg          0.684792   2.089230   0.328    0.743
## hp          -0.051142   0.212399  -0.241    0.810
## wt          -7.927624   5.001046  -1.585    0.113
## mpg:hp       0.009486   0.017710   0.536    0.592
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 43.2297  on 31  degrees of freedom
## Residual deviance:  8.1056  on 27  degrees of freedom
## AIC: 18.106
## 
## Number of Fisher Scoring iterations: 11

# Interpretation: Extracting odds ratios and confidence intervals
exp(coef(model3))  # Odds ratios

##  (Intercept)          mpg           hp           wt       mpg:hp 
## 0.0001905736 1.9833592213 0.9501435185 0.0003606423 1.0095311201

exp(confint(model3))  # Confidence intervals

## Waiting for profiling to be done...

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

##                    2.5 %       97.5 %
## (Intercept) 1.220012e-73 4.257175e+25
## mpg         1.433487e-01 9.948836e+02
## hp          5.185495e-01 1.520176e+00
## wt          5.086622e-12 1.007085e-01
## mpg:hp      9.910798e-01 1.072975e+00

Assignment 4

Donna Parker

2025-03-11