Analysis of Intentionality Cues in US Immigration Discourse

Introduction

Immigration has become a major political fault line in many Western societies (Card et al., 2022; Dancygier & Margalit, 2020). Why do people hold such polarized views? Research on immigration attitudes has approached this question through the lens of costs and benefits, particularly by emphasizing perceived threats (Lutz & Bitschnau, 2023). Specifically, people are more likely to reject immigration when they see it as a source of unemployment, crime, or social conflict. These explanations share a key feature: they tend to focus on the outcomes of immigration, that is, its (perceived) tangible effects on the host society. Yet social evaluations are not based solely on actions and consequences; they also depend on the perceived mental states behind those actions: intentions, motivations, and reasons. Humans possess a cognitive ability for mind-reading, allowing them to attribute mental states to others (Ho, Saxe & Cushman, 2022). Crucially, mind perception plays a central role in moral judgment and emotional responses to actions (Barrett & Saxe, 2021; Gray et al., 2012; Sell et al., 2017).

A key dimension of mind perception that strongly influences moral judgment and social evaluation is intentionality (Barrett et al., 2016; Barrett & Saxe, 2021; Cushman, 2015; Fincher et al., 2018; Gray et al., 2012). People judge an action more harshly when they perceive it as intentional and, conversely, more leniently when they see it as unintentional. When assessing intentionality, individuals rely on various mental concepts, including goals, motivations, attitudes, and character traits. This mechanism is evident in attitudes toward redistribution: people are more likely to support welfare policies when they believe recipients are hardworking and not responsible for their situation (Aarøe & Petersen, 2014; Petersen, 2012; van Oorschot, 2000, 2006). Similarly, the perceived intent behind an aggression can dramatically alter its social evaluation—aggressions perceived as driven by harmful intent are judged far more negatively and generate more negative emotional responses (Sell et al., 2017). Accordingly, intentionality cues also shape public perceptions of immigrants: their perceived motivation to work, attitude toward the host society, and reasons for migrating all influence attitudes toward them independently of the costs and benefits that they generate for the host society (Kootstra, 2016; Naumann et al., 2024; Reeskens & van der Meer, 2019).

Unraveling the psychological underpinnings of attitudes towards immigration is essential to explaining major trends in politics. Politicians and other political actors can be seen as strategic agents who seek to mobilize voters through rhetorical strategies and policy stances, but their effectiveness depends on aligning these efforts with the psychology of their audience. To make an issue more salient, political entrepreneurs must frame it in a psychologically compelling way, one that effectively engages cognitive mechanisms to capture attention, evoke emotions, and generate support. In this sense, political rhetoric can be understood as a form of “cultural technology,” intuitively designed by self-interested actors to exploit psychological predispositions (Dubourg & Baumard, 2021; Fitouchi et al., 2021; Fitouchi & Singh, 2022; Sijilmassi et al., 2024). This is particularly evident in political discourse on immigration: while immigration was a low-salience issue in many Western countries during the 1950s and 1960s, it has become one of the most politically charged topics since the 2000s (Card et al., 2022; Dancygier & Margalit, 2020; Simonsen & Widmann, 2023). This shift in salience is, in part, the result of sustained political narratives that have framed immigration as a problem in mass media and political speeches (Dancygier & Margalit, 2020; Eberl et al., 2018).

Given the central role of intentionality in moral judgment and social evaluation, highlighting intentionality cues should be a particularly effective rhetorical strategy for increasing the salience of immigration and mobilizing voters on this issue. As immigration becomes more politically salient and polarized, we expect a growing emphasis on intentionality in political discourse. Anti-immigration parties are likely to underscore perceived negative intentions of immigrants to elicit hostility and moral outrage among their supporters, whereas pro-immigration parties will emphasize perceived positive intentions to foster empathy and support.

To test our hypotheses, we will use large language models (LLMs) to annotate extensive corpora of immigration-related texts, including parliamentary speeches and political manifestos. As a first step, we will quantify intentionality cues in a dataset of approximately 250,000 excerpts from US congressional speeches on immigration since 1880. This script tests our core hypotheses regarding the presence of intentionality cues in U.S. immigration discourse, considering time trends, tone (positive/negative), salience, and polarization.

Setup:

Main Analyses

H1: There should be an increase in intentionality cues in political discourse over time

Model:
#Main model: 
summary(glmer(gpt4_label_binary2 ~ year + party + chamber + (1 | state), data = sampled_data_annotated_arranged, family = binomial ))
boundary (singular) fit: see help('isSingular')
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ year + party + chamber + (1 | state)
   Data: sampled_data_annotated_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    407.8     437.0    -196.9     393.8       472 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.6351 -0.4340 -0.3809 -0.3428  3.0799 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0        0       
Number of obs: 479, groups:  state, 53

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  -9.314927   7.595201  -1.226    0.220
year          0.004093   0.003805   1.076    0.282
partyR       -0.139420   0.301168  -0.463    0.643
partyUnknown  0.188366   0.580518   0.324    0.746
chamberH     -0.398994   0.665278  -0.600    0.549
chamberS     -0.805301   0.640040  -1.258    0.208

Correlation of Fixed Effects:
            (Intr) year   partyR prtyUn chmbrH
year        -0.996                            
partyR      -0.036  0.021                     
partyUnknwn -0.124  0.048  0.196              
chamberH    -0.167  0.085 -0.018  0.806       
chamberS    -0.120  0.039  0.010  0.780  0.901
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Plot:
Warning: The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.
`geom_smooth()` using formula = 'y ~ x'

Robustness check:
#Plot: 
plot_robustness_distribution(
  results_df = sim_output$results,
  term_of_interest = "year",
  ci_bounds = sim_output$ci
)
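
The H1 output above shows a fixed-effect correlation of about -0.996 between the intercept and `year`, and a variance of 0 for the `state` intercept. This suggests that the raw calendar-year scale may be contributing to the fitting warnings. The following is a minimal sketch of one possible check, run on simulated data rather than the real corpus: centre `year`, express it in decades, and compare the two fits. It uses plain `glm()` on the assumption that a zero random-intercept variance leaves little for the mixed model to add. The data frame `sim`, the column `year_c10`, and the simulated effect size are all hypothetical.

```r
# Simulated stand-in for the annotated sample (hypothetical data, not the real corpus)
set.seed(1)
n <- 479
sim <- data.frame(year = sample(1880:2020, n, replace = TRUE))
sim$label <- rbinom(n, 1, plogis(-2 + 0.004 * (sim$year - 1950)))

# Centre year on its sample mean and express it in decades
sim$year_c10 <- (sim$year - mean(sim$year)) / 10

fit_raw    <- glm(label ~ year,     data = sim, family = binomial)
fit_scaled <- glm(label ~ year_c10, data = sim, family = binomial)

# The slope per decade is ten times the slope per year; the fit itself is unchanged
slope_raw    <- unname(coef(fit_raw)["year"])
slope_scaled <- unname(coef(fit_scaled)["year_c10"])

# Correlation between the intercept and slope estimates, before and after centring
cor_raw    <- cov2cor(vcov(fit_raw))[1, 2]
cor_scaled <- cov2cor(vcov(fit_scaled))[1, 2]
```

Rescaling changes only the units of the slope and the meaning of the intercept, which becomes the log-odds at the mean year. It leaves the likelihood unchanged, so the substantive conclusion about the time trend should not depend on it.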

H2a&b: Intentionality cues should become more prevalent in both positive (H2a) and negative claims (H2b) over time

Model:
#H2a: 
summary(glmer(
  gpt4_label_binary2 ~ year + party + chamber + (1 | state),
  data = sampled_data_annotated_arranged[sampled_data_annotated_arranged$tone2 == "Positive",],
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.530319 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ year + party + chamber + (1 | state)
   Data: 
sampled_data_annotated_arranged[sampled_data_annotated_arranged$tone2 ==  
    "Positive", ]

      AIC       BIC    logLik -2*log(L)  df.resid 
    232.7     255.7    -109.4     218.7       189 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.8583 -0.6199 -0.4342  1.1301  1.9561 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.6999   0.8366  
Number of obs: 196, groups:  state, 46

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  -6.489307  19.853230  -0.327    0.744
year          0.002737   0.009882   0.277    0.782
partyR       -0.159385   0.501115  -0.318    0.750
partyUnknown  0.316862   1.158192   0.274    0.784
chamberH     -0.045019   0.837487  -0.054    0.957
chamberS     -0.297689   0.804293  -0.370    0.711

Correlation of Fixed Effects:
            (Intr) year   partyR prtyUn chmbrH
year        -0.999                            
partyR      -0.314  0.307                     
partyUnknwn -0.239  0.214  0.194              
chamberH    -0.269  0.233  0.070  0.551       
chamberS    -0.233  0.197  0.061  0.532  0.862
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.530319 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
#H2b: 
summary(glmer(
  gpt4_label_binary2 ~ year + party + chamber + (1 | state),
  data = sampled_data_annotated_arranged[sampled_data_annotated_arranged$tone2 == "Negative",],
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ year + party + chamber + (1 | state)
   Data: 
sampled_data_annotated_arranged[sampled_data_annotated_arranged$tone2 ==  
    "Negative", ]

      AIC       BIC    logLik -2*log(L)  df.resid 
    161.0     186.5     -73.5     147.0       276 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.4510 -0.2889 -0.2514 -0.2115  4.5112 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.3411   0.5841  
Number of obs: 283, groups:  state, 48

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  -9.543005  12.791589  -0.746    0.456
year          0.003434   0.006423   0.535    0.593
partyR        0.433093   0.515032   0.841    0.400
partyUnknown  0.259075   1.283596   0.202    0.840
chamberH      0.042139   1.554772   0.027    0.978
chamberS     -0.397690   1.511957  -0.263    0.793

Correlation of Fixed Effects:
            (Intr) year   partyR prtyUn chmbrH
year        -0.992                            
partyR       0.070 -0.092                     
partyUnknwn -0.063 -0.017  0.214              
chamberH    -0.167  0.051 -0.030  0.589       
chamberS    -0.113 -0.004  0.010  0.570  0.942
optimizer (Nelder_Mead) convergence code: 0 (OK)
unable to evaluate scaled gradient
Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

Robustness check:

Robustness check plot for “Positive”:

plot_robustness_distribution(
  results_df = sim_output_positive$results,
  term_of_interest = "year",
  ci_bounds = sim_output_positive$ci
)

Robustness check plot for “Negative”:

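# NB: assumes the negative-tone simulation is stored as sim_output_negative
# (name not shown elsewhere in this report); the original call reused sim_output_positive
plot_robustness_distribution(
  results_df = sim_output_negative$results,
  term_of_interest = "year",
  ci_bounds = sim_output_negative$ci
)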

H3: Intentionality cues should be more prevalent in congressional periods when immigration is more salient

NB: Salience is measured as the percentage of immigration-related tokens among all tokens in a given congressional period.
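
As a rough illustration of this measure, the sketch below computes salience from per-period token counts. The data frame `tokens`, its column names `immigration_tokens` and `total_tokens`, and the counts themselves are hypothetical; only the output column name `salience_text` comes from the model formula.

```r
# Hypothetical token counts per congressional period (illustrative values only)
tokens <- data.frame(
  congress           = c(100, 101, 102),
  immigration_tokens = c(1200, 3000, 4500),
  total_tokens       = c(400000, 500000, 450000)
)

# Salience: immigration-related tokens as a percentage of all tokens in the period
tokens$salience_text <- 100 * tokens$immigration_tokens / tokens$total_tokens
```

Because the measure is defined per congressional period, every speech from the same period shares the same `salience_text` value once it is merged into the speech-level data.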

Model:
summary(glmer(
  gpt4_label_binary2 ~ salience_text + year + party + chamber + (1 | state),
  data = sampled_data_annotated_arranged,
  family = binomial
))
boundary (singular) fit: see help('isSingular')
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ salience_text + year + party + chamber +  
    (1 | state)
   Data: sampled_data_annotated_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    409.8     443.1    -196.9     393.8       471 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.6319 -0.4371 -0.3795 -0.3448  3.1023 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0        0       
Number of obs: 479, groups:  state, 53

Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   -9.929307   8.310771  -1.195    0.232
salience_text -0.021327   0.114993  -0.185    0.853
year           0.004426   0.004219   1.049    0.294
partyR        -0.136885   0.301522  -0.454    0.650
partyUnknown   0.185532   0.580713   0.319    0.749
chamberH      -0.390215   0.666999  -0.585    0.559
chamberS      -0.794067   0.642738  -1.235    0.217

Correlation of Fixed Effects:
            (Intr) slnc_t year   partyR prtyUn chmbrH
salienc_txt  0.397                                   
year        -0.996 -0.424                            
partyR      -0.050 -0.045  0.037                     
partyUnknwn -0.104  0.026  0.033  0.194              
chamberH    -0.180 -0.070  0.106 -0.015  0.802       
chamberS    -0.146 -0.093  0.075  0.015  0.774  0.901
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Plot:

Robustness check:
plot_robustness_distribution(
  results_df = sim_output$results,
  term_of_interest = "salience_text",
  ci_bounds = sim_output$ci
)

H4: Intentionality cues should be more prevalent in congressional periods when immigration is more polarized

NB: Polarization is measured as the average difference in the mean tone of speeches towards immigration (from positive to negative) between Republicans and Democrats, in a given congressional period.
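
One way to read this definition is sketched below, under two assumptions that the report does not confirm: tone is coded numerically from -1 (negative) to +1 (positive), and the score is the absolute gap between the two party means. The data frame `speeches` and its values are hypothetical.

```r
# Hypothetical speech-level data: tone coded from -1 (negative) to +1 (positive)
speeches <- data.frame(
  congress = c(100, 100, 100, 100, 101, 101, 101, 101),
  party    = c("D", "D", "R", "R", "D", "D", "R", "R"),
  tone     = c(1, 0, 0, -1, 1, 1, -1, -1)
)

# Mean tone per party within each congressional period (rows: congress, columns: party)
mean_tone <- tapply(speeches$tone, list(speeches$congress, speeches$party), mean)

# Polarization: absolute gap between the two party means (assumed definition)
polarization <- data.frame(
  congress           = as.integer(rownames(mean_tone)),
  polarization_score = unname(abs(mean_tone[, "D"] - mean_tone[, "R"]))
)
```

If the signed difference is used instead of the absolute gap, the score also encodes which party is more positive, which changes how the `polarization_score` coefficient should be read.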

Model:
summary(glmer(
  gpt4_label_binary2 ~ polarization_score + year + party + chamber + (1 | state),
  data = sampled_data_annotated_arranged,
  family = binomial
))
boundary (singular) fit: see help('isSingular')
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ polarization_score + year + party + chamber +  
    (1 | state)
   Data: sampled_data_annotated_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    407.6     440.9    -195.8     391.6       471 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.6937 -0.4256 -0.3727 -0.3242  3.5238 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0        0       
Number of obs: 479, groups:  state, 53

Fixed effects:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)         5.944368  12.410058   0.479    0.632
polarization_score  1.039835   0.692981   1.501    0.133
year               -0.003767   0.006336  -0.594    0.552
partyR             -0.177788   0.302903  -0.587    0.557
partyUnknown        0.206908   0.582475   0.355    0.722
chamberH           -0.517164   0.671978  -0.770    0.442
chamberS           -0.926444   0.647712  -1.430    0.153

Correlation of Fixed Effects:
            (Intr) plrzt_ year   partyR prtyUn chmbrH
polrztn_scr  0.815                                   
year        -0.998 -0.823                            
partyR      -0.096 -0.088  0.087                     
partyUnknwn -0.047  0.026 -0.001  0.194              
chamberH    -0.194 -0.123  0.145 -0.004  0.797       
chamberS    -0.177 -0.132  0.128  0.020  0.769  0.903
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

For reference, the evolution of polarization by year:

`geom_smooth()` using formula = 'y ~ x'

Robustness check:
plot_robustness_distribution(
  results_df = sim_output$results,
  term_of_interest = "polarization_score",
  ci_bounds = sim_output$ci
)

H5: Intentionality cues in political discourse about immigration should be associated with more moralization

Model:
summary(glmer(
  gpt4_label_binary2 ~ morality_binary + year + tone2 + party + chamber + (1 | state),
  data = moralization_annotated_output_arranged,
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.357793 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ morality_binary + year + tone2 + party +  
    chamber + (1 | state)
   Data: moralization_annotated_output_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    379.6     417.2    -180.8     361.6       470 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.8235 -0.4313 -0.2981 -0.1894  5.5602 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.1967   0.4435  
Number of obs: 479, groups:  state, 53

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -6.514521   3.039365  -2.143  0.03208 *  
morality_binary  0.952510   0.354493   2.687  0.00721 ** 
year             0.001585   0.001441   1.100  0.27149    
tone2Positive    1.342181   0.313485   4.281 1.86e-05 ***
partyR           0.111575   0.337997   0.330  0.74132    
partyUnknown     0.543719   0.786283   0.692  0.48925    
chamberH         0.170543   0.710475   0.240  0.81030    
chamberS        -0.153662   0.684901  -0.224  0.82248    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) mrlty_ year   tn2Pst partyR prtyUn chmbrH
morlty_bnry -0.127                                          
year        -0.957 -0.006                                   
tone2Positv -0.146 -0.024  0.051                            
partyR      -0.155  0.023  0.085  0.239                     
partyUnknwn -0.248  0.164  0.056  0.035  0.199              
chamberH    -0.289  0.140  0.051  0.074  0.016  0.654       
chamberS    -0.258  0.130  0.024  0.076  0.027  0.630  0.899
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.357793 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Plot:
#Plot:
# Prepare summary data
intent_by_moralization <- moralization_annotated_output_arranged %>%
  group_by(morality_binary) %>%
  summarise(
    mean = mean(gpt4_label_binary2, na.rm = TRUE),
    n = n(),
    sd = sd(gpt4_label_binary2, na.rm = TRUE),
    se = sd / sqrt(n),
    ci_low = mean - 1.96 * se,
    ci_high = mean + 1.96 * se
  )

# Run t-test
t_test_result <- t.test(
  gpt4_label_binary2 ~ morality_binary,
  data = moralization_annotated_output_arranged
)

t_stat <- round(t_test_result$statistic, 2)
df <- round(t_test_result$parameter, 0)
p_val <- t_test_result$p.value

# Format significance level text
sig_text <- case_when(
  p_val < 0.001 ~ "p < 0.001",
  p_val < 0.01  ~ "p < 0.01",
  p_val < 0.05  ~ "p < 0.05",
  TRUE          ~ paste0("p = ", signif(p_val, 2))
)

# Build the full subtitle string
subtitle_text <- paste0(
  "The difference is significant at the ", sig_text, " level ",
  "(t(", df, ") = ", t_stat, ", p = ", signif(p_val, 2), ")"
)

# Format p-value
p_value <- t_test_result$p.value
p_text <- ifelse(p_value < 0.001, "***", 
                 ifelse(p_value < 0.01, "**", 
                        ifelse(p_value < 0.05, "*", "ns")))

moralization_colors_balanced <- c(
  "Non-Moralizing" = "#56B4E9",  # soft sky blue
  "Moralizing" = "#E79E83"   # softened vermillion
) 

intent_by_moralization$morality_binary <- factor(
  intent_by_moralization$morality_binary, 
  levels = c(0, 1),
  labels = c("Non-Moralizing", "Moralizing")
)

# Plot with error bars
ggplot(intent_by_moralization, aes(x = morality_binary, y = mean, fill = morality_binary)) +
  geom_col(width = 0.6, alpha = 0.9) +
  geom_errorbar(aes(ymin = ci_low, ymax = ci_high), width = 0.2, linewidth = 0.9) +
  scale_fill_manual(values = moralization_colors_balanced) +
  scale_y_continuous(limits = c(0, 1)) +
  labs(
    title = "Intentionality Cues by Moralization Score",
    subtitle = subtitle_text,
    x = "Moralization Score",
    y = "Proportion with Intentionality Cue",
    fill = "Moralization Score"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 13, margin = margin(b = 10)),
    axis.title = element_text(face = "bold", size = 13),
    axis.text = element_text(size = 11),
    legend.position = "none",
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank()
  )
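
The `morality_binary` coefficient in the H5 model output is on the log-odds scale, which is hard to read directly. As a small worked example, the sketch below converts the reported estimate and standard error into an odds ratio with a 95% Wald interval. Since that model reported convergence warnings, the result should be treated as indicative only. The object names `beta`, `se`, `odds_ratio`, and `wald_ci` are illustrative.

```r
# Estimate and standard error copied from the H5 output (morality_binary row)
beta <- 0.952510
se   <- 0.354493

# Odds ratio and 95% Wald interval, obtained by exponentiating the log-odds estimate
odds_ratio <- exp(beta)
wald_ci    <- exp(beta + c(-1, 1) * qnorm(0.975) * se)
```

An odds ratio above 1 with an interval that excludes 1 matches the positive, significant coefficient in the table. A profile-likelihood interval from a converged fit would be preferable for reporting.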

Robustness check:

H6: Politicians adapt their use of intentionality cues in reaction to their adversaries (post-1960 only)

Preliminary tests:
#LAG ANALYSIS: which time lag maximizes the model fit while remaining parsimonious?

lag_selection <- VARselect(var_input, lag.max = 7, type = "const")
lag_selection$selection #Choose time lag selected by AIC
AIC(n)  HQ(n)  SC(n) FPE(n) 
     1      1      1      1 
#STATIONARY CHECK
#NB: If the time series is not stationary, we might need to difference again

adf_results <- lapply(var_input, function(x) {
  adf.test(x, alternative = "stationary")
})
Warning in adf.test(x, alternative = "stationary"): p-value smaller than
printed p-value
Warning in adf.test(x, alternative = "stationary"): p-value smaller than
printed p-value
Warning in adf.test(x, alternative = "stationary"): p-value smaller than
printed p-value
Warning in adf.test(x, alternative = "stationary"): p-value smaller than
printed p-value
Warning in adf.test(x, alternative = "stationary"): p-value smaller than
printed p-value
Warning in adf.test(x, alternative = "stationary"): p-value smaller than
printed p-value
# Print summary of results
for (var in names(adf_results)) {
  cat("\n--- ADF Test for:", var, "---\n")
  print(adf_results[[var]])
}

--- ADF Test for: intentionality_D ---

    Augmented Dickey-Fuller Test

data:  x
Dickey-Fuller = -9.325, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary


--- ADF Test for: intentionality_R ---

    Augmented Dickey-Fuller Test

data:  x
Dickey-Fuller = -8.3793, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary


--- ADF Test for: tone_D ---

    Augmented Dickey-Fuller Test

data:  x
Dickey-Fuller = -8.097, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary


--- ADF Test for: tone_R ---

    Augmented Dickey-Fuller Test

data:  x
Dickey-Fuller = -8.0311, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary


--- ADF Test for: intentionality_neg ---

    Augmented Dickey-Fuller Test

data:  x
Dickey-Fuller = -8.2385, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary


--- ADF Test for: intentionality_pos ---

    Augmented Dickey-Fuller Test

data:  x
Dickey-Fuller = -10.085, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary

H6a: An increase in the use of intentionality cues by Republican speakers is associated with a subsequent rise in intentionality cues used by Democratic speakers in the following months

Model:
var <- VAR(var_input, p = 1, type = "both") #change p depending on the analysis above
#NB: type = "both" is used here because we suspect deterministic trends in the time series (namely, intentionality and tone increasing over time in a predictable way)


# Prepare model list from your VAR object
model_list <- list(
  "intentionality_D" = var$varresult$intentionality_D,
  "intentionality_R" = var$varresult$intentionality_R,
  "tone_D" = var$varresult$tone_D,
  "tone_R" = var$varresult$tone_R,
  "intentionality_neg" = var$varresult$intentionality_neg,
  "intentionality_pos" = var$varresult$intentionality_pos
)

modelsummary(
  model_list,
  output = "kableExtra",          # Pretty HTML/Word table
  statistic = "({std.error})",    # Show standard errors in parentheses
  stars = TRUE,                   # Show significance stars
  title = "VAR Results Summary"
)
VAR Results Summary
                       intentionality_D  intentionality_R  tone_D    tone_R    intentionality_neg  intentionality_pos
intentionality_D.l1    0.273*            −0.009            −0.441    0.513*    0.215*              0.030
                       (0.117)           (0.089)           (0.293)   (0.231)   (0.088)             (0.130)
intentionality_R.l1    0.307*            −0.018            −0.360    0.401+    0.288**             −0.012
                       (0.120)           (0.092)           (0.301)   (0.237)   (0.090)             (0.134)
tone_D.l1              −0.011            −0.003            0.038     −0.013    −0.014              0.017
                       (0.016)           (0.012)           (0.039)   (0.031)   (0.012)             (0.017)
tone_R.l1              −0.029            0.004             0.022     0.077*    −0.014              −0.010
                       (0.019)           (0.015)           (0.048)   (0.038)   (0.014)             (0.021)
intentionality_neg.l1  −0.301*           −0.007            0.349     −0.418+   −0.247**            −0.044
                       (0.124)           (0.094)           (0.309)   (0.243)   (0.093)             (0.137)
intentionality_pos.l1  −0.234*           −0.013            0.297     −0.600**  −0.214**            −0.030
                       (0.107)           (0.081)           (0.267)   (0.210)   (0.080)             (0.118)
const                  0.019             −0.004            0.021     0.010     0.009               0.012
                       (0.012)           (0.009)           (0.031)   (0.024)   (0.009)             (0.014)
trend                  0.000             0.000**           −0.000    −0.000*   0.000               0.000*
                       (0.000)           (0.000)           (0.000)   (0.000)   (0.000)             (0.000)
Num.Obs.               731               731               731       731       731                 731
R2                     0.015             0.011             0.005     0.030     0.020               0.008
R2 Adj.                0.005             0.002             −0.004    0.021     0.010               −0.001
AIC                    −551.7            −948.6            787.3     436.4     −969.9              −400.0
BIC                    −510.4            −907.3            828.7     477.7     −928.5              −358.7
Log.Lik.               284.864           483.300           −384.665  −209.178  493.932             209.001
RMSE                   0.16              0.12              0.41      0.32      0.12                0.18
Standard errors in parentheses. + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
irf1 = compute_irf(var, impulse = "intentionality_R", response = "intentionality_D")
display_irf_table(irf1)
Impulse Response of intentionality_D to a Shock in intentionality_R
Month Response Lower Bound (95%) Upper Bound (95%)
0 0.00e+00 0.00e+00 0.00e+00
1 6.04e-03 -1.66e-03 2.11e-02
2 -5.77e-05 -2.49e-03 2.58e-03
3 2.66e-06 -2.76e-04 4.01e-04
4 4.83e-07 -5.73e-05 5.76e-05
5 -5.15e-07 -7.22e-06 1.20e-05
6 -5.53e-08 -1.53e-06 2.18e-06
7 -8.11e-09 -2.20e-07 4.31e-07
Plot:
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
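
The rapid decay in the impulse-response table above is what a VAR(1) with small coefficients would produce: the response at horizon h is the h-th power of the coefficient matrix applied to the initial shock. The sketch below illustrates this with a hypothetical 2 x 2 coefficient matrix and a simple, non-orthogonalized unit shock. It is not the `compute_irf()` helper used in this report, whose definition is not shown and which may rely on orthogonalized shocks and bootstrapped bands. The function `irf_var1` and the matrix `A` are assumptions made for illustration.

```r
# Hypothetical 2 x 2 VAR(1) coefficient matrix (rows: response, columns: lagged predictor)
A <- matrix(c(0.3, 0.3,
              0.0, 0.1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("D", "R"), c("D", "R")))

# Response of every series at horizons 0..n_ahead to a one-unit shock in `impulse`
irf_var1 <- function(A, impulse, n_ahead = 7) {
  state <- as.numeric(colnames(A) == impulse)   # unit shock at horizon 0
  out <- matrix(NA_real_, nrow = n_ahead + 1, ncol = nrow(A),
                dimnames = list(as.character(0:n_ahead), rownames(A)))
  for (h in 0:n_ahead) {
    out[h + 1, ] <- state
    state <- as.numeric(A %*% state)            # propagate one period ahead
  }
  out
}

irf_R <- irf_var1(A, impulse = "R")
```

In this toy setting, the `D` column of `irf_R` is zero at horizon 0 and then shrinks geometrically, mirroring the pattern in the reported table, where responses beyond the first month or two are negligible.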

H6b: An increase in the use of intentionality cues by Democratic speakers is associated with a subsequent rise in intentionality cues used by Republican speakers in the following months

Model:
irf2 = compute_irf(var, impulse = "intentionality_D", response = "intentionality_R")
display_irf_table(irf2)
Impulse Response of intentionality_R to a Shock in intentionality_D
Month Response Lower Bound (95%) Upper Bound (95%)
0 5.86e-04 -4.84e-03 9.04e-03
1 -3.79e-03 -9.03e-03 7.60e-03
2 4.00e-05 -1.20e-03 1.14e-03
3 -1.07e-05 -1.94e-04 1.75e-04
4 2.91e-08 -3.21e-05 2.71e-05
5 2.71e-07 -4.91e-06 6.00e-06
6 3.58e-08 -9.61e-07 8.98e-07
7 5.55e-09 -1.42e-07 1.88e-07
Plot:

H6c: An increase in the use of intentionality cues in negative statements about immigration is associated with a subsequent rise in intentionality cues used in positive statements in the following months

Model:
irf3 = compute_irf(var, impulse = "intentionality_neg", response = "intentionality_pos")
display_irf_table(irf3)
Impulse Response of intentionality_pos to a Shock in intentionality_neg
Month Response Lower Bound (95%) Upper Bound (95%)
0 -8.34e-02 -1.01e-01 -5.66e-02
1 -1.32e-03 -1.28e-02 1.32e-02
2 -4.28e-05 -2.11e-03 1.41e-03
3 2.25e-05 -2.68e-04 2.85e-04
4 4.13e-06 -5.62e-05 4.70e-05
5 7.10e-07 -8.54e-06 9.55e-06
6 8.20e-08 -1.98e-06 1.33e-06
7 7.30e-09 -3.27e-07 2.56e-07
Plot:

H6d: An increase in the use of intentionality cues in positive statements about immigration is associated with a subsequent rise in intentionality cues used in negative statements in the following months

Model:
irf4 = compute_irf(var, impulse = "intentionality_pos", response = "intentionality_neg")
display_irf_table(irf4)
Impulse Response of intentionality_neg to a Shock in intentionality_pos
Month Response Lower Bound (95%) Upper Bound (95%)
0 0.00e+00 0.00e+00 0.00e+00
1 -1.22e-02 -2.12e-02 1.24e-04
2 5.50e-04 -1.78e-03 2.64e-03
3 -4.07e-05 -4.82e-04 2.96e-04
4 4.38e-06 -5.54e-05 7.24e-05
5 1.10e-06 -1.53e-05 9.37e-06
6 1.24e-07 -1.87e-06 2.12e-06
7 1.73e-08 -5.36e-07 2.57e-07
Plot:

H6e: An increase in the negative tone of Republican immigration rhetoric predicts higher levels of intentionality cues in Democratic discourse in the following months

Model:
irf5 = compute_irf(var, impulse = "tone_R", response = "intentionality_D")
display_irf_table(irf5)
Impulse Response of intentionality_D to a Shock in tone_R
Month Response Lower Bound (95%) Upper Bound (95%)
0 0.00e+00 0.00e+00 0.00e+00
1 -8.32e-03 -2.05e-02 3.20e-03
2 -6.75e-04 -2.88e-03 1.25e-03
3 -1.02e-04 -5.59e-04 1.81e-04
4 -8.72e-06 -8.74e-05 4.57e-05
5 -2.42e-07 -1.75e-05 7.23e-06
6 6.28e-08 -2.86e-06 2.06e-06
7 1.95e-08 -5.26e-07 3.63e-07
Plot:

H6f: An increase in the negative tone of Democratic immigration rhetoric predicts higher levels of intentionality cues in Republican discourse in the following months

Model:
irf6 = compute_irf(var, impulse = "tone_D", response = "intentionality_R")
display_irf_table(irf6)
Impulse Response of intentionality_R to a Shock in tone_D
Month Response Lower Bound (95%) Upper Bound (95%)
0 0.00e+00 0.00e+00 0.00e+00
1 -1.04e-03 -1.05e-02 8.28e-03
2 -6.63e-05 -1.40e-03 1.74e-03
3 -4.63e-06 -3.44e-04 2.58e-04
4 -5.47e-06 -4.74e-05 3.80e-05
5 -3.82e-07 -8.93e-06 5.30e-06
6 -5.32e-08 -1.31e-06 1.25e-06
7 -3.61e-09 -2.86e-07 1.90e-07
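
Note on reading the IRF tables: in every table above the response is exactly zero at month 0 and then shrinks by roughly an order of magnitude per month. The sketch below reproduces that qualitative pattern for a bivariate VAR(1) computed by hand in base R. It is only an illustration under stated assumptions: the series `x` and `y`, the matrix `A_true`, and the helper `irf_by_hand()` are simulated or hypothetical, and the sketch computes non-orthogonalised responses, which may differ from what `compute_irf()` does internally.

```r
# Minimal sketch: impulse responses of a bivariate VAR(1) computed by hand.
# All names (x, y, A_true, irf_by_hand) are illustrative placeholders.
set.seed(1)
n <- 300
A_true <- matrix(c(0.10, 0.05,
                   0.08, 0.10), nrow = 2, byrow = TRUE)

# Simulate the two series from the assumed VAR(1)
Y <- matrix(0, nrow = n, ncol = 2, dimnames = list(NULL, c("x", "y")))
for (t in 2:n) {
  Y[t, ] <- A_true %*% Y[t - 1, ] + rnorm(2, sd = 0.01)
}

# Estimate the coefficient matrix by equation-wise OLS (no intercept)
lagged  <- Y[-n, , drop = FALSE]
current <- Y[-1, , drop = FALSE]
A_hat   <- t(coef(lm(current ~ lagged - 1)))  # rows = response, cols = impulse

# Response of series `response` to a unit shock in series `impulse`
irf_by_hand <- function(A, impulse, response, n_ahead = 7) {
  Phi <- diag(nrow(A))                 # horizon 0: identity matrix
  out <- numeric(n_ahead + 1)
  for (h in 0:n_ahead) {
    out[h + 1] <- Phi[response, impulse]
    Phi <- A %*% Phi                   # Phi at horizon h equals A^h
  }
  data.frame(Month = 0:n_ahead, Response = out)
}

irf_xy <- irf_by_hand(A_hat, impulse = 1, response = 2)
print(irf_xy)
```

Because the horizon-h response is an entry of the h-th power of the coefficient matrix, small lag coefficients imply responses that die out within a few months, which is one way to read the near-zero values from month 3 onward.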
Plot:

#Build plot!

# Named list of existing IRF objects
irf_list <- list(
  "H6a: Rep Intentionality → Dem Intentionality" = irf1,
  "H6b: Dem Intentionality → Rep Intentionality" = irf2,
  "H6c: Neg Intentionality → Pos Intentionality" = irf3,
  "H6d: Pos Intentionality → Neg Intentionality" = irf4,
  "H6e: Rep Tone → Dem Intentionality"           = irf5,
  "H6f: Dem Tone → Rep Intentionality"           = irf6
)

# Build unified dataframe
irf_df_all <- imap_dfr(irf_list, ~{
  tibble(
    Month = 0:(length(.x$irf[[1]]) - 1),
    Response = as.numeric(.x$irf[[1]]),
    Lower = as.numeric(.x$Lower[[1]]),
    Upper = as.numeric(.x$Upper[[1]]),
    Label = .y
  )
})
Summarizing plots:
ggplot(irf_df_all, aes(x = Month, y = Response)) +
  geom_ribbon(aes(ymin = Lower, ymax = Upper), fill = "#A6CEE3", alpha = 0.4) +
  geom_line(color = "#1F78B4", linewidth = 1.2) +
  geom_point(color = "#1F78B4", size = 2) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey40") +
  facet_wrap(~ Label, ncol = 3, scales = "free_y") +
  labs(
    title = "Impulse Response Functions (IRF)",
    subtitle = "Shocks in One Series and Their Effects on Another",
    x = "Months After Shock",
    y = "Response"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    strip.text = element_text(face = "bold", size = 13),
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 13),
    panel.grid.minor = element_blank()
  )

# Desired order
hypothesis_order <- names(irf_list)

# Pick the max absolute coefficient (excluding month 0) for each IRF
irf_summary <- imap_dfr(irf_list, ~{
  n_ahead <- length(.x$irf[[1]]) - 1
  irf_data <- tibble(
    Month = 0:n_ahead,
    Response = as.numeric(.x$irf[[1]]),
    Lower = as.numeric(.x$Lower[[1]]),
    Upper = as.numeric(.x$Upper[[1]])
  ) %>%
    filter(Month > 0) %>%
    slice_max(order_by = abs(Response), n = 1)

  irf_data$Hypothesis <- .y
  irf_data
})

# Set the factor levels to control the order in the plot
irf_summary$Hypothesis <- factor(irf_summary$Hypothesis, levels = rev(hypothesis_order))

# Plot
ggplot(irf_summary, aes(x = Response, y = Hypothesis)) +
  geom_point(color = "#1F78B4", size = 3) +
  geom_errorbarh(aes(xmin = Lower, xmax = Upper), height = 0.2, color = "#1F78B4", linewidth = 1.2) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "grey50") +
  labs(
    title = "Summary of Key IRF Effects",
    subtitle = "Most impactful response coefficient per hypothesis (with 95% CI)",
    x = "Impulse Response Estimate",
    y = NULL
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    axis.text.y = element_text(size = 12),
    panel.grid.minor = element_blank()
  )

H7: The effect of Republicans' negative speech about immigration on Democrats' use of intentionality cues is causal

Model:
# --- STEP 3: First Stage (predict tone_R from post_9_11) ---
first_stage <- lm(tone_R ~ post_9_11, data = df_window)
summary(first_stage)

Call:
lm(formula = tone_R ~ post_9_11, data = df_window)

Residuals:
   Min     1Q Median     3Q    Max 
 -0.01  -0.01   0.00   0.00   0.99 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.010000   0.007053   1.418    0.158
post_9_11   -0.010000   0.009950  -1.005    0.316

Residual standard error: 0.07053 on 199 degrees of freedom
Multiple R-squared:  0.00505,   Adjusted R-squared:  5.025e-05 
F-statistic:  1.01 on 1 and 199 DF,  p-value: 0.3161
# --- STEP 4: FIRST STAGE PREDICTION & SECOND STAGE (MAIN CAUSAL TEST) ---
# Get predicted values of tone_R from the intervention
df_window$predicted_tone_R <- predict(first_stage)

#Model:

second_stage = lm(formula = intentionality_D ~ predicted_tone_R + time_index, 
    data = df_window)
summary(second_stage)

Call:
lm(formula = intentionality_D ~ predicted_tone_R + time_index, 
    data = df_window)

Residuals:
   Min     1Q Median     3Q    Max 
     0      0      0      0      0 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)             0          0     NaN      NaN
predicted_tone_R        0          0     NaN      NaN
time_index              0          0     NaN      NaN

Residual standard error: 0 on 198 degrees of freedom
Multiple R-squared:    NaN, Adjusted R-squared:    NaN 
F-statistic:   NaN on 2 and 198 DF,  p-value: NA
#Model:
second_stage_bis <- lm(intentionality_pos ~ predicted_tone_R + time_index, data = df_window)
summary(second_stage_bis)

Call:
lm(formula = intentionality_pos ~ predicted_tone_R + time_index, 
    data = df_window)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.01278 -0.00992 -0.00275  0.00006  0.98947 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)      -8.423e-03  2.672e-02  -0.315    0.753
predicted_tone_R  1.564e+00  1.995e+00   0.784    0.434
time_index        5.615e-05  1.719e-04   0.327    0.744

Residual standard error: 0.07069 on 198 degrees of freedom
Multiple R-squared:  0.005586,  Adjusted R-squared:  -0.004458 
F-statistic: 0.5561 on 2 and 198 DF,  p-value: 0.5743
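
Note on the two-stage procedure: the two `lm()` calls above implement two-stage least squares by hand. Two points are worth keeping in mind when reading the output. First, in textbook 2SLS the first stage includes every exogenous control that appears in the second stage (here that would be `time_index`), and the standard errors printed by the second-stage `lm()` are not the 2SLS standard errors, because they are built from residuals that use the predicted rather than the observed regressor. Second, the first-stage F-statistic reported above (1.01) is far below the conventional rule-of-thumb value of 10 for instrument strength. The all-zero, `NaN` table for `intentionality_D` is also consistent with an outcome that does not vary within the window. The sketch below illustrates the mechanics on simulated data; `z`, `x`, `y`, and `ctrl` are placeholders (with `ctrl` standing in for a control such as `time_index`), not the variables in `df_window`.

```r
# Minimal sketch on simulated data; z, x, y, and ctrl are placeholders.
set.seed(42)
n    <- 200
ctrl <- rnorm(n)                      # exogenous control
z    <- rbinom(n, 1, 0.5)             # binary instrument
u    <- rnorm(n)                      # unobserved confounder
x    <- 1.0 * z + 0.5 * u + rnorm(n)  # endogenous regressor
y    <- 1.5 * x + 0.3 * ctrl + u + rnorm(n)

# First stage: instrument AND every exogenous control
first <- lm(x ~ z + ctrl)
x_hat <- fitted(first)

# Instrument relevance: F-test for the excluded instrument
first_restricted <- lm(x ~ ctrl)
f_stat <- anova(first_restricted, first)$F[2]

# Second stage: point estimates only
second <- lm(y ~ x_hat + ctrl)
beta   <- coef(second)

# 2SLS standard errors: residuals use the observed x, not x_hat
X          <- cbind(1, x, ctrl)
X_hat      <- cbind(1, x_hat, ctrl)
resid_2sls <- y - X %*% beta
sigma2     <- sum(resid_2sls^2) / (n - ncol(X))
se_2sls    <- sqrt(diag(sigma2 * solve(crossprod(X_hat))))
se_naive   <- coef(summary(second))[, "Std. Error"]

print(f_stat)
print(cbind(estimate = beta, se_2sls = se_2sls, se_naive = se_naive))
```

Packaged estimators such as `ivreg()` from the AER package perform both stages and the standard-error correction in one call, using a formula of the form `y ~ x + ctrl | z + ctrl`.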
Plot:
# Set your tone colors for matching
tone_colors_balanced <- c(
  "Negative" = "#56B4E9",
  "Neutral"  = "#B0B0B0",
  "Positive" = "#E79E83"
)

# Reformat to long for plotting
df_long <- df_window %>%
  dplyr::select(date, intentionality_D, intentionality_pos) %>%
  pivot_longer(cols = c(intentionality_D, intentionality_pos),
               names_to = "Measure", values_to = "Score") %>%
  mutate(
    ToneLabel = case_when(
      Measure == "intentionality_D" ~ "Negative",
      Measure == "intentionality_pos" ~ "Positive"
    )
  )

# Plot raw data and smoothed curves
ggplot(df_long, aes(x = date, y = Score, group = ToneLabel, color = ToneLabel)) +
  geom_point(alpha = 0.3, size = 1.5) +                                # raw points
  geom_smooth(method = "loess", se = TRUE, span = 0.3, linewidth = 1.2) +  # smooth line
  geom_vline(xintercept = as.Date("2001-09-11"), linetype = "dashed", color = "red", linewidth = 1) +
  scale_color_manual(
    values = tone_colors_balanced,
    labels = c("Negative" = "Democrat Intentionality",
               "Positive" = "Positive Intentionality")
  ) +
  labs(
    title = "Daily Democratic Intentionality Around 9/11",
    subtitle = "Raw daily values (transparent) with LOESS smoothing; red dashed line = 9/11",
    x = "Date",
    y = "Intentionality Score",
    color = "Measure"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16),
    legend.position = "top",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
`geom_smooth()` using formula = 'y ~ x'

Additional research questions

ARQ1a): Is there an asymmetry in the prevalence of intentionality cues between positive vs. negative discourse about immigration?

Model:
summary(glmer(
  gpt4_label_binary2 ~ tone2 + year + party + chamber + (1 | state),
  data = sampled_data_annotated_arranged,
  family = binomial
))
boundary (singular) fit: see help('isSingular')
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ tone2 + year + party + chamber + (1 | state)
   Data: sampled_data_annotated_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    385.7     419.1    -184.9     369.7       471 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.7067 -0.5168 -0.2882 -0.2595  3.9845 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0        0       
Number of obs: 479, groups:  state, 53

Fixed effects:
               Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -7.870293   8.014611  -0.982    0.326    
tone2Positive  1.373218   0.290842   4.722 2.34e-06 ***
year           0.002795   0.004010   0.697    0.486    
partyR         0.085255   0.313828   0.272    0.786    
partyUnknown   0.192810   0.597904   0.322    0.747    
chamberH      -0.092061   0.684005  -0.135    0.893    
chamberS      -0.387008   0.662951  -0.584    0.559    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) tn2Pst year   partyR prtyUn chmbrH
tone2Positv  0.019                                   
year        -0.996 -0.051                            
partyR      -0.101  0.150  0.082                     
partyUnknwn -0.156  0.004  0.082  0.203              
chamberH    -0.173  0.077  0.091  0.007  0.806       
chamberS    -0.143  0.107  0.061  0.029  0.778  0.902
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
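
Note on effect size: the `tone2Positive` coefficient is on the log-odds scale. The short calculation below converts it to an odds ratio with a Wald 95% interval, using only the estimate and standard error printed in the table above. Because the state-level variance is estimated at zero (singular fit), the fixed effects here behave like those of an ordinary logistic regression; the Wald interval is an approximation, and the variable names `est`, `se`, and `ci` are illustrative.

```r
# Estimate and standard error copied from the tone2Positive row above
est <- 1.373218
se  <- 0.290842

z_crit     <- qnorm(0.975)                    # two-sided 95% Wald interval
odds_ratio <- exp(est)
ci         <- exp(est + c(-1, 1) * z_crit * se)

round(c(OR = odds_ratio, lower = ci[1], upper = ci[2]), 2)
```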
Plot:

Robustness check:

ARQ1b): Does intentionality rhetoric converge across tones when immigration becomes more salient?

Model:
summary(glmer(
  gpt4_label_binary2 ~ tone2 + salience_text + year + party + chamber + tone2 * salience_text + (1 | state),
  data = sampled_data_annotated_arranged,
  family = binomial
))
boundary (singular) fit: see help('isSingular')
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ tone2 + salience_text + year + party + chamber +  
    tone2 * salience_text + (1 | state)
   Data: sampled_data_annotated_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    389.6     431.4    -184.8     369.6       469 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.7355 -0.5121 -0.2882 -0.2607  3.9609 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0        0       
Number of obs: 479, groups:  state, 53

Fixed effects:
                             Estimate Std. Error z value Pr(>|z|)  
(Intercept)                 -7.395531   8.639065  -0.856   0.3920  
tone2Positive                1.244351   0.629888   1.976   0.0482 *
salience_text               -0.015600   0.193194  -0.081   0.9356  
year                         0.002576   0.004396   0.586   0.5579  
partyR                       0.088955   0.315056   0.282   0.7777  
partyUnknown                 0.203201   0.598830   0.339   0.7344  
chamberH                    -0.098306   0.684630  -0.144   0.8858  
chamberS                    -0.390749   0.664589  -0.588   0.5566  
tone2Positive:salience_text  0.053034   0.224209   0.237   0.8130  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) tn2Pst slnc_t year   partyR prtyUn chmbrH chmbrS
tone2Positv  0.070                                                 
salienc_txt  0.268  0.726                                          
year        -0.995 -0.122 -0.319                                   
partyR      -0.112  0.000 -0.081  0.097                            
partyUnknwn -0.134 -0.038 -0.015  0.065  0.205                     
chamberH    -0.179  0.034 -0.030  0.104  0.009  0.802              
chamberS    -0.160  0.023 -0.062  0.086  0.033  0.773  0.902       
tn2Pstv:sl_ -0.054 -0.887 -0.802  0.097  0.076  0.046  0.000  0.027
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

Robustness check:

ARQ1c): Does intentionality rhetoric converge across tones when immigration becomes more polarized?

Model:
summary(glmer(
  gpt4_label_binary2 ~ tone2 + polarization_score + year + party + chamber + tone2 * polarization_score + (1 | state),
  data = sampled_data_annotated_arranged,
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.814656 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ tone2 + polarization_score + year + party +  
    chamber + tone2 * polarization_score + (1 | state)
   Data: sampled_data_annotated_arranged

      AIC       BIC    logLik -2*log(L)  df.resid 
    386.0     427.7    -183.0     366.0       469 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.8267 -0.4684 -0.2895 -0.2279  4.8382 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.1094   0.3308  
Number of obs: 479, groups:  state, 53

Fixed effects:
                                  Estimate Std. Error z value Pr(>|z|)   
(Intercept)                      12.429449   5.536897   2.245  0.02478 * 
tone2Positive                     1.449502   0.463088   3.130  0.00175 **
polarization_score                1.397647   0.684067   2.043  0.04104 * 
year                             -0.007708   0.002682  -2.874  0.00405 **
partyR                            0.058237   0.349422   0.167  0.86763   
partyUnknown                      0.227425   0.718465   0.317  0.75159   
chamberH                         -0.233376   0.696829  -0.335  0.73769   
chamberS                         -0.557985   0.669486  -0.833  0.40459   
tone2Positive:polarization_score  0.021149   0.843566   0.025  0.98000   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) tn2Pst plrzt_ year   partyR prtyUn chmbrH chmbrS
tone2Positv -0.238                                                 
polrztn_scr  0.218  0.509                                          
year        -0.989  0.173 -0.271                                   
partyR      -0.322  0.159 -0.092  0.287                            
partyUnknwn -0.286  0.056 -0.023  0.186  0.265                     
chamberH    -0.255  0.105 -0.014  0.134  0.060  0.702              
chamberS    -0.169  0.066 -0.044  0.052  0.033  0.655  0.892       
tn2Pstv:pl_  0.024 -0.711 -0.742  0.012  0.071  0.034 -0.019  0.011
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.814656 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

Robustness check:

ARQ2a): Is there an asymmetry between parties in the evolution of intentionality cues in discourse about immigration?

Model:
sampled_data_annotated_arranged_bipartisan = sampled_data_annotated_arranged[sampled_data_annotated_arranged$party %in% c("R", "D"),]

summary(glmer(
  gpt4_label_binary2 ~ party + year + chamber + (1 | state),
  data = sampled_data_annotated_arranged_bipartisan,
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.377749 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ party + year + chamber + (1 | state)
   Data: sampled_data_annotated_arranged_bipartisan

      AIC       BIC    logLik -2*log(L)  df.resid 
    322.9     342.9    -156.4     312.9       398 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.5875 -0.4148 -0.3548 -0.3064  3.3445 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.1936   0.4401  
Number of obs: 403, groups:  state, 52

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -13.981376   2.501749  -5.589 2.29e-08 ***
partyR       -0.112481   0.309721  -0.363    0.716    
year          0.006231   0.001259   4.951 7.39e-07 ***
chamberS     -0.498686   0.315022  -1.583    0.113    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
         (Intr) partyR year  
partyR   -0.086              
year     -0.994  0.030       
chamberS  0.023  0.053 -0.075
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.377749 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
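
Note on the convergence warnings: the "Rescale variables?" hint and the correlation of about -0.99 between the intercept and `year` both point to `year` being entered on its raw calendar scale. The sketch below shows, on simulated data and with a plain `glm()` for simplicity, how centring and rescaling the year variable removes most of that correlation without changing the slope. Everything here is an assumption made for illustration: the data, the year range, and the names `cue` and `year_c` are hypothetical, and whether the same transformation resolves the `glmer()` warnings above would need to be checked on the real data.

```r
# Minimal sketch on simulated data; cue, year, and year_c are placeholders.
set.seed(7)
n    <- 480
year <- sample(1950:2020, n, replace = TRUE)       # hypothetical year range
cue  <- rbinom(n, 1, plogis(-2 + 0.01 * (year - 1985)))

# Raw calendar year versus centred year measured in decades
fit_raw      <- glm(cue ~ year, family = binomial)
year_c       <- (year - mean(year)) / 10
fit_centered <- glm(cue ~ year_c, family = binomial)

# Correlation between the intercept and slope estimates
cor_raw      <- cov2cor(vcov(fit_raw))[1, 2]
cor_centered <- cov2cor(vcov(fit_centered))[1, 2]

# The slope is unchanged once the factor of 10 is undone
slope_raw      <- unname(coef(fit_raw)["year"])
slope_centered <- unname(coef(fit_centered)["year_c"]) / 10

print(c(cor_raw = cor_raw, cor_centered = cor_centered))
print(c(slope_raw = slope_raw, slope_centered = slope_centered))
```

The same transformed variable could be passed to `glmer()` in place of `year`; lme4 also allows a different optimizer through `glmerControl(optimizer = "bobyqa")`, which is a common second step when gradient warnings persist.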
Plot:

Robustness check:

ARQ2b): Does intentionality rhetoric converge across parties when immigration becomes more salient?

Model:
summary(glmer(
  gpt4_label_binary2 ~ party + salience_text + year + chamber + party * salience_text + (1 | state),
  data = sampled_data_annotated_arranged_bipartisan,
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.31791 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ party + salience_text + year + chamber +  
    party * salience_text + (1 | state)
   Data: sampled_data_annotated_arranged_bipartisan

      AIC       BIC    logLik -2*log(L)  df.resid 
    325.0     353.0    -155.5     311.0       396 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.6559 -0.4049 -0.3540 -0.2846  3.7195 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.2489   0.4989  
Number of obs: 403, groups:  state, 52

Fixed effects:
                       Estimate Std. Error z value Pr(>|z|)    
(Intercept)          -14.818041   2.279400  -6.501 7.99e-11 ***
partyR                -0.948414   0.708382  -1.339   0.1806    
salience_text         -0.153135   0.152459  -1.004   0.3152    
year                   0.006842   0.001153   5.935 2.94e-09 ***
chamberS              -0.534158   0.319265  -1.673   0.0943 .  
partyR:salience_text   0.321919   0.241107   1.335   0.1818    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) partyR slnc_t year   chmbrS
partyR      -0.108                            
salienc_txt -0.059  0.521                     
year        -0.981  0.000 -0.095              
chamberS    -0.006  0.104 -0.024 -0.048       
prtyR:slnc_  0.086 -0.895 -0.637  0.013 -0.089
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.31791 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

Robustness check:

ARQ2c): Does intentionality rhetoric converge across parties when immigration becomes more polarized?

Model:
summary(glmer(
  gpt4_label_binary2 ~ party + polarization_score + year + chamber + party * polarization_score + (1 | state),
  data = sampled_data_annotated_arranged_bipartisan,
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.316707 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ party + polarization_score + year + chamber +  
    party * polarization_score + (1 | state)
   Data: sampled_data_annotated_arranged_bipartisan

      AIC       BIC    logLik -2*log(L)  df.resid 
    323.9     351.9    -154.9     309.9       396 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.7424 -0.4046 -0.3468 -0.2847  4.4696 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.2534   0.5034  
Number of obs: 403, groups:  state, 52

Fixed effects:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)               -2.4077024  2.2847561  -1.054    0.292
partyR                    -0.6735570  0.5004156  -1.346    0.178
polarization_score         0.2743066  0.5958298   0.460    0.645
year                       0.0003169  0.0011680   0.271    0.786
chamberS                  -0.5247962  0.3199621  -1.640    0.101
partyR:polarization_score  1.2903336  0.9312815   1.386    0.166

Correlation of Fixed Effects:
            (Intr) partyR plrzt_ year   chmbrS
partyR      -0.081                            
polrztn_scr  0.086  0.420                     
year        -0.989  0.009 -0.173              
chamberS    -0.025  0.069 -0.079 -0.025       
prtyR:plrz_  0.042 -0.774 -0.625  0.008 -0.049
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.316707 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 4 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

Robustness check:

ARQ3: Is the evolution of intentionality cues in positive and negative sentences about immigration the same in both parties?

Model:
summary(glmer(
  gpt4_label_binary2 ~ tone2 + year + party + chamber + tone2*year + tone2*party + year*party + tone2*year*party + (1 | state),
  data = sampled_data_annotated_arranged_bipartisan,
  family = binomial
))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
not positive definite or contains NA values: falling back to var-cov estimated from RX
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: gpt4_label_binary2 ~ tone2 + year + party + chamber + tone2 *  
    year + tone2 * party + year * party + tone2 * year * party +  
    (1 | state)
   Data: sampled_data_annotated_arranged_bipartisan

      AIC       BIC    logLik -2*log(L)  df.resid 
    311.9     351.9    -145.9     291.9       393 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-0.7962 -0.4106 -0.2762 -0.2166  4.9603 

Random effects:
 Groups Name        Variance Std.Dev.
 state  (Intercept) 0.3251   0.5702  
Number of obs: 403, groups:  state, 52

Fixed effects:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)               -1.009e+01  1.948e+01  -0.518    0.604
tone2Positive              9.477e+00  2.477e+01   0.383    0.702
year                       3.730e-03  9.886e-03   0.377    0.706
partyR                    -1.457e-01  2.548e+01  -0.006    0.995
chamberS                  -3.713e-01  3.356e-01  -1.106    0.269
tone2Positive:year        -3.939e-03  1.254e-02  -0.314    0.753
tone2Positive:partyR      -2.507e+01  3.647e+01  -0.687    0.492
year:partyR                3.224e-04  1.291e-02   0.025    0.980
tone2Positive:year:partyR  1.238e-02  1.845e-02   0.671    0.502

Correlation of Fixed Effects:
            (Intr) tn2Pst year   partyR chmbrS tn2Ps: tn2P:R yr:prR
tone2Positv -0.772                                                 
year        -1.000  0.771                                          
partyR      -0.767  0.593  0.766                                   
chamberS     0.101 -0.063 -0.110 -0.030                            
ton2Pstv:yr  0.773 -1.000 -0.773 -0.594  0.065                     
tn2Pstv:prR  0.523 -0.678 -0.523 -0.678  0.007  0.678              
year:partyR  0.767 -0.593 -0.767 -1.000  0.031  0.594  0.677       
tn2Pstv:y:R -0.524  0.678  0.524  0.678 -0.008 -0.678 -1.000 -0.679
optimizer (Nelder_Mead) convergence code: 0 (OK)
unable to evaluate scaled gradient
Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
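
Note on the model formula: in R's formula syntax `a * b * c` already expands to all main effects, all two-way interactions, and the three-way interaction, so the fixed-effect part of the formula above can be written more compactly. The check below compares the expanded terms of the two spellings (random effect omitted, since only the fixed-effect terms are being compared). It confirms that the specification is the same; it does not address the convergence failure, which with correlations of -1.000 between several estimates looks more like a scaling or data-sparsity issue.

```r
# Fixed-effect part of the formula as written above
long_form <- gpt4_label_binary2 ~ tone2 + year + party + chamber +
  tone2 * year + tone2 * party + year * party + tone2 * year * party

# Compact spelling using the three-way crossing operator
short_form <- gpt4_label_binary2 ~ tone2 * year * party + chamber

labels_long  <- attr(terms(long_form), "term.labels")
labels_short <- attr(terms(short_form), "term.labels")

print(labels_short)
print(setequal(labels_long, labels_short))
```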
Plot:
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 5 rows containing missing values or values outside the scale range
(`geom_smooth()`).

Robustness check: