1 Required libraries

library(stargazer)
library(sjPlot)
library(ggplot2)
library(ggthemes)
library(dplyr)
library(tidyr)
library(viridis)
library(gt)
library(extrafont)
library(effects)    # For marginal effects and predictions
library(explore)
library(DescTools)  # for PseudoR2
library(stats)      # For AIC, BIC, logLik
library(afcommon)

Set the defaults theme for ggplot2 in order to match Latex and the requiremens of political science magazines

loadfonts(device = "win", quiet=TRUE)

Set the ggplot2 theme

theme_set(theme_bw() +
            theme(text = element_text(family = "LM Roman 10", size = 12)))

2 Create Dataset

The mtcars data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).

# Create data frame with the data dictionary information
car_variables <- data.frame(
  variable = c("mpg", "cylinder", "disp", "hp", "drat", 
               "wt", "qsec", "vs", "am", "gear", "carb"),
  description = c(
    "Miles/(US) gallon", 
    "Number of cylinders", 
    "Displacement (cu.in.)", 
    "Gross horsepower", 
    "Rear axle ratio", 
    "Weight (lb/1000)", 
    "1/4 mile time", 
    "V Engine / S Engine", 
    "Transmission (0 = automatic, 1 = manual)", 
    "Number of forward gears", 
    "Number of carburetors"
  )
)

# Create GT table
gt::gt(car_variables)

variable	description
mpg	Miles/(US) gallon
cylinder	Number of cylinders
disp	Displacement (cu.in.)
hp	Gross horsepower
drat	Rear axle ratio
wt	Weight (lb/1000)
qsec	1/4 mile time
vs	V Engine / S Engine
am	Transmission (0 = automatic, 1 = manual)
gear	Number of forward gears
carb	Number of carburetors

3 Running the ordinal regression

We first run the logistic regression and print the summary of the fitted model

model <- glm(
  vs ~ wt + disp,
  data = mtcars,
  family = binomial(link = "logit")
)

summary(model)

Call:
glm(formula = vs ~ wt + disp, family = binomial(link = "logit"), 
    data = mtcars)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)  1.60859    2.43903   0.660    0.510  
wt           1.62635    1.49068   1.091    0.275  
disp        -0.03443    0.01536  -2.241    0.025 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 43.86  on 31  degrees of freedom
Residual deviance: 21.40  on 29  degrees of freedom
AIC: 27.4

Number of Fisher Scoring iterations: 6

gt_coeftest(model) # with p-values

Variable	Coefficient	Std. Error	t-statistic	p-value
Regression Coefficient Results
(Intercept)	1.609	2.439	0.660	0.510
wt	1.626	1.491	1.091	0.275
disp	−0.034	0.015	−2.241	0.025

Extract model fit information

logLik <- stats::logLik(model)
aic <- stats::AIC(model)
pseudo_r2 <- DescTools::PseudoR2(model)

Use Stargazer to output a regression table that can be used with LaTex.

stargazer(
  model,
  type = "html",
  omit = c("Constant"),
  star.cutoffs = c(0.05, 0.01, 0.001),
  title = "V/S Engine - Logistic Regression",
  label = "tab:vs_engine_reg",
  keep.stat = c("n"),
  add.lines = list(c("Log-Likelihood", round(logLik,3)),
                   c("AIC", round(aic,3)),
                   c("Pseudo-R2", round(pseudo_r2,3))),
  style = "apsr",
  out = "vs_engine.tex"
)

**V/S Engine - Logistic Regression**

	vs

wt	1.626
	(1.491)
disp	-0.034^*
	(0.015)
Log-Likelihood	-10.7
AIC	27.4
Pseudo-R2	0.512
N	32

p < .05; p < .01; p < .001

Results Interpretation

dep_var <- "vs"
ind_vars <- c("wt", "disp")
interpretation <- 
  af_reg_interpret(model, mtcars, y_name = dep_var, x_list = ind_vars, reg_type= "logistic")

To examine the correlation of wt and disp with vs, we estimated a Logistic regression model (n=32). The dependent variable vs has two levels: and . The model yielded a significant result [χ²(2) = 22.5, p < 1.33e-05], with a log-likelihood value of -10.7 and pseudo R² = 0.512.
The disp variable has statistically significant correlation. An increase of 1 unit in disp decreases the odds of vs being by 3.385%, assuming all other variables stay constant.
Variable wt (odds ratio = 5.09, p_val = 0.275) is not statistically significant.

4 Plot model coeeficients

Note: the plotted coefficients are exponentiated

p <- plot_model(
  model,
  transform = "exp", # NULL
  title = "",
  vline.color = "grey70",
  show.values = TRUE,
  show.p = TRUE,
  show.intercept = FALSE,
  digits = 3,
  sort.est = TRUE
)
af_ggsave("vs_engine_model_coef.png", plot = p)

5 Plot effects

The below plots show the marginal effects of the “wt” and “disp” variables on the outcome variable “vs”, controlling for the other variables in the model.The plot visualizes this marginal effect, showing how the predicted probability or cumulative probability changes as “wt” or “disp” varies. The y-axis shows the predicted probability of the outcome variable (e.g. 1) as the value of the “wt”/“disp” variable changes. This probability ranges from 0 to 1.

plot(effect("wt", model))


# Save in a png file
png("wt_effect_on_mpg_cat", width = 800, height = 600)
plot(effect("wt", model))
dev.off()
png 
  2

Interpretation:

As the vehicle is heavier (wt is bigger), the probability of the vehicle to have a Straight engine (vs == 1) increases.

plot(effect("disp", model))


# Save in a png file
png("disp_effect_on_mpg_cat.png", width = 800, height = 600)
plot(effect("disp", model))
dev.off()
png 
  2

Interpretation:

As the vehicle displacement is lower, the probability of the vehicle to have a Straight engine (vs == 1) decreases thus the probability of the vbehicle to have a V-engine increases.

Logistic Regression

Demonstrate how to perform Ordinal regression using the MASS package - POLR function.

Amir Freund

2025-02-05

1 Required libraries

2 Create Dataset

3 Running the ordinal regression

4 Plot model coeeficients

5 Plot effects