library(stargazer)
library(sjPlot)
library(ggplot2)
library(ggthemes)
library(dplyr)
library(tidyr)
library(viridis)
library(gt)
library(extrafont)
library(effects) # For marginal effects and predictions
library(explore)
library(DescTools) # for PseudoR2
library(stats) # For AIC, BIC, logLik
library(afcommon)
Set the defaults theme for ggplot2 in order to match Latex and the requiremens of political science magazines
Set the ggplot2 theme
The mtcars data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).
# Create data frame with the data dictionary information
car_variables <- data.frame(
variable = c("mpg", "cylinder", "disp", "hp", "drat",
"wt", "qsec", "vs", "am", "gear", "carb"),
description = c(
"Miles/(US) gallon",
"Number of cylinders",
"Displacement (cu.in.)",
"Gross horsepower",
"Rear axle ratio",
"Weight (lb/1000)",
"1/4 mile time",
"V Engine / S Engine",
"Transmission (0 = automatic, 1 = manual)",
"Number of forward gears",
"Number of carburetors"
)
)
# Create GT table
gt::gt(car_variables)
variable | description |
---|---|
mpg | Miles/(US) gallon |
cylinder | Number of cylinders |
disp | Displacement (cu.in.) |
hp | Gross horsepower |
drat | Rear axle ratio |
wt | Weight (lb/1000) |
qsec | 1/4 mile time |
vs | V Engine / S Engine |
am | Transmission (0 = automatic, 1 = manual) |
gear | Number of forward gears |
carb | Number of carburetors |
We first run the logistic regression and print the summary of the fitted model
model <- glm(
vs ~ wt + disp,
data = mtcars,
family = binomial(link = "logit")
)
summary(model)
Call:
glm(formula = vs ~ wt + disp, family = binomial(link = "logit"),
data = mtcars)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.60859 2.43903 0.660 0.510
wt 1.62635 1.49068 1.091 0.275
disp -0.03443 0.01536 -2.241 0.025 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 43.86 on 31 degrees of freedom
Residual deviance: 21.40 on 29 degrees of freedom
AIC: 27.4
Number of Fisher Scoring iterations: 6
Regression Coefficient Results | ||||
Variable | Coefficient | Std. Error | t-statistic | p-value |
---|---|---|---|---|
(Intercept) | 1.609 | 2.439 | 0.660 | 0.510 |
wt | 1.626 | 1.491 | 1.091 | 0.275 |
disp | −0.034 | 0.015 | −2.241 | 0.025 |
Extract model fit information
Use Stargazer to output a regression table that can be used with LaTex.
stargazer(
model,
type = "html",
omit = c("Constant"),
star.cutoffs = c(0.05, 0.01, 0.001),
title = "V/S Engine - Logistic Regression",
label = "tab:vs_engine_reg",
keep.stat = c("n"),
add.lines = list(c("Log-Likelihood", round(logLik,3)),
c("AIC", round(aic,3)),
c("Pseudo-R2", round(pseudo_r2,3))),
style = "apsr",
out = "vs_engine.tex"
)
vs | |
wt | 1.626 |
(1.491) | |
disp | -0.034* |
(0.015) | |
Log-Likelihood | -10.7 |
AIC | 27.4 |
Pseudo-R2 | 0.512 |
N | 32 |
p < .05; p < .01; p < .001 |
Results Interpretation
dep_var <- "vs"
ind_vars <- c("wt", "disp")
interpretation <-
af_reg_interpret(model, mtcars, y_name = dep_var, x_list = ind_vars, reg_type= "logistic")
To examine the correlation of wt and disp with vs, we estimated a
Logistic regression model (n=32). The dependent variable vs has two
levels: and . The model yielded a significant result [χ²(2) = 22.5, p
< 1.33e-05], with a log-likelihood value of -10.7 and pseudo R² =
0.512.
The disp variable has statistically significant correlation.
An increase of 1 unit in disp decreases the odds of vs being
Variable wt
(odds ratio = 5.09, p_val = 0.275) is not statistically significant.
Note: the plotted coefficients are exponentiated
p <- plot_model(
model,
transform = "exp", # NULL
title = "",
vline.color = "grey70",
show.values = TRUE,
show.p = TRUE,
show.intercept = FALSE,
digits = 3,
sort.est = TRUE
)
af_ggsave("vs_engine_model_coef.png", plot = p)
The below plots show the marginal effects of the “wt” and “disp” variables on the outcome variable “vs”, controlling for the other variables in the model.The plot visualizes this marginal effect, showing how the predicted probability or cumulative probability changes as “wt” or “disp” varies. The y-axis shows the predicted probability of the outcome variable (e.g. 1) as the value of the “wt”/“disp” variable changes. This probability ranges from 0 to 1.
# Save in a png file
png("wt_effect_on_mpg_cat", width = 800, height = 600)
plot(effect("wt", model))
dev.off()
png
2
Interpretation:
As the vehicle is heavier (wt is bigger), the probability of the vehicle to have a Straight engine (vs == 1) increases.
# Save in a png file
png("disp_effect_on_mpg_cat.png", width = 800, height = 600)
plot(effect("disp", model))
dev.off()
png
2
Interpretation:
As the vehicle displacement is lower, the probability of the vehicle to have a Straight engine (vs == 1) decreases thus the probability of the vbehicle to have a V-engine increases.