library(socsci)
library(car)
library(jtools)
library(showtext)
library(fst)
source("D://theme.R")
cces18 <- read.fst("C://cces18.fst")

Cleaning Variables

I need a dichotomous DV, so let’s use Republican party affiliation. I will quickly spin up some controls: a white dummy, age, a male dummy, and education.

cces18 <- cces18 %>% 
  mutate(rep = car::recode(pid3, "2=1; else =0")) %>% 
  mutate(male = car::recode(gender, "1=1; else =0")) %>% 
  mutate(white = car::recode(race, "1=1; else =0")) %>% 
  mutate(age = 2018 - birthyr) 
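It is worth checking a recode before modeling with it. Here is a minimal sketch of that check on a made-up data frame (`toy` and its values are hypothetical, not the CCES data). It uses base R `ifelse()` to mirror the mapping above for non-missing values, then cross-tabulates the original against the recoded variable.

```r
# Toy stand-in for cces18; these values are made up for illustration only.
toy <- data.frame(
  pid3    = c(1, 2, 3, 2, 5),
  birthyr = c(1950, 1985, 2000, 1972, 1964)
)

# Same mapping as the recodes above (for non-missing values):
# pid3 == 2 becomes 1, everything else becomes 0.
toy$rep <- ifelse(toy$pid3 == 2, 1, 0)
toy$age <- 2018 - toy$birthyr

# Cross-tabulate original vs. recoded values to confirm the mapping.
rep_check <- table(pid3 = toy$pid3, rep = toy$rep)
print(rep_check)

# Ages should fall in a plausible adult range.
print(range(toy$age))
```

Running the same `table()` call on the real `pid3` and `rep` columns should show every respondent with `pid3` equal to 2 in the `rep = 1` column and nobody else.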

Run the logit

Make sure to specify family = "binomial".

reg1 <- glm(rep ~ male + white + age + educ, data = cces18, family = "binomial")

Coefficient Table

Let’s take a look at our output

summ(reg1)
## MODEL INFO:
## Observations: 60000
## Dependent Variable: rep
## Type: Generalized linear model
##   Family: binomial 
##   Link function: logit 
## 
## MODEL FIT:
## χ²(4) = 3327.60, p = 0.00
## Pseudo-R² (Cragg-Uhler) = 0.08
## Pseudo-R² (McFadden) = 0.05
## AIC = 66272.22, BIC = 66317.24 
## 
## Standard errors: MLE
## ------------------------------------------------
##                      Est.   S.E.   z val.      p
## ----------------- ------- ------ -------- ------
## (Intercept)         -2.21   0.04   -54.45   0.00
## male                 0.12   0.02     6.39   0.00
## white                1.10   0.03    39.76   0.00
## age                  0.01   0.00    22.99   0.00
## educ                -0.10   0.01   -16.14   0.00
## ------------------------------------------------

All variables are statistically significant.
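The estimates in the table are on the log-odds scale, which is hard to read directly. A quick sketch of turning them into odds ratios, using the rounded values copied from the table above (so the results are approximate):

```r
# Log-odds estimates copied (rounded) from the summ(reg1) table above.
est <- c(male = 0.12, white = 1.10, age = 0.01, educ = -0.10)

# Exponentiating a logit coefficient gives an odds ratio:
# the multiplicative change in the odds of rep = 1 for a one-unit increase.
odds_ratios <- exp(est)
print(round(odds_ratios, 2))
```

An odds ratio above 1 means higher odds of being a Republican and below 1 means lower odds. With the real model object, `exp(coef(reg1))` does the same thing at full precision, and `summ(reg1, exp = TRUE)` should report exponentiated estimates directly.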

Let’s visualize our coefficients

You should use the scale = TRUE option here because age has many values, as does education.

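If you don’t scale, it’s going to make the dichotomous variables look huge and the continuous ones look small. You can see it in the table. A one-year increase in age changes the log-odds by .01, but going from non-white to white changes them by 1.10. That comparison isn’t fair, because age can move across dozens of years (from 18 to 89 or whatever) while a dummy can only move one unit. Scaling puts the continuous variables on a common footing: they are mean-centered and divided by their standard deviation, so each coefficient is the change for a one-standard-deviation increase instead of a one-unit increase.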

plot_summs(reg1, scale = TRUE)

Everything to the right of zero means more likely to be a Republican; everything to the left of zero means less likely. Being white is the biggest factor. Male and age have pretty similar effects. More education means less likely to be a Republican.
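To see what scaling does to a coefficient, here is a small sketch on simulated data (the data and the "true" coefficients are made up; this is not the CCES). It assumes scaling means mean-centering a continuous predictor and dividing by one standard deviation, which is my understanding of the jtools default. The standardized age coefficient should equal the raw coefficient times the standard deviation of age, while the dummy’s coefficient is unchanged.

```r
set.seed(42)
n <- 5000

# Simulated predictors: age in years and a 0/1 dummy.
age   <- sample(18:89, n, replace = TRUE)
white <- rbinom(n, 1, 0.7)

# Simulated binary outcome; these coefficients are arbitrary.
y <- rbinom(n, 1, plogis(-2 + 0.01 * age + 1 * white))

# Fit once with raw age, once with standardized age (mean 0, SD 1).
raw_fit <- glm(y ~ age + white, family = "binomial")
age_z   <- as.numeric(scale(age))
std_fit <- glm(y ~ age_z + white, family = "binomial")

raw_age <- coef(raw_fit)[["age"]]
std_age <- coef(std_fit)[["age_z"]]

# The standardized coefficient is the raw one multiplied by sd(age).
print(c(raw = raw_age, raw_times_sd = raw_age * sd(age), standardized = std_age))
```

The per-year effect looks tiny, but a one-standard-deviation shift in age is a much larger move, which is why the scaled coefficient is easier to compare with a dummy.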