Intro

The following is a re-analysis of the data reported here, specifically Study 2 from that data set. The research question in this re-analysis is: Why do people have negative opinions on the COVID-19 vaccine, but generally believe that mRNA vaccines are safe? What do people think about mRNA vaccines?

The code block below imports and cleans the data. It also has a few custom-made functions for recoding specific questions in these data.

dat <- read.csv("~/Desktop/Junk/Google Drive/Research/--Research -- Vax ranking/wave2/rank 2_November 9, 2024_11.32.csv")

# Rows 1 through 10 are preview, junk, and labbies
dat=dat[-c(1:10),]

recode.likert=function(vec){
  vec=ifelse(vec=="Strongly disagree",1,vec)
  vec=ifelse(vec=="Somewhat disagree",2,vec)
  vec=ifelse(vec=="Neither agree nor disagree",3,vec)
  vec=ifelse(vec=="Somewhat agree",4,vec)
  vec=ifelse(vec=="Strongly agree",5,vec)
  return(as.numeric(vec))
}
for (i in 59:74) {
  dat[,i]=as.numeric(recode.likert(dat[,i]))
}



dat$age=as.numeric(dat$age)

dat$is.male=ifelse(dat$gender=="Male",1,0)

dat$is.white=ifelse(dat$race=="White or European (e.g., German, Irish, English, Italian, Polish, French, etc.)",1,0)


dat$education=ifelse(dat$education=="Some high school",1,dat$education)
dat$education=ifelse(dat$education=="High school diploma, or equivalent",2,dat$education)
dat$education=ifelse(dat$education=="Some college",3,dat$education)
dat$education=ifelse(dat$education=="Associate degree",4,dat$education)
dat$education=ifelse(dat$education=="Bachelor's Degree",5,dat$education)
dat$education=ifelse(dat$education=="Master's Degree",6,dat$education)
dat$education=ifelse(dat$education=="Doctorate Degree or other advanced degree (e.g., M.D., J.D.)",7,dat$education)
dat$education=as.numeric(dat$education)

dat$income=ifelse(dat$income=="$0 - $29,999",1,dat$income)
dat$income=ifelse(dat$income=="$30,000 - $59,999",2,dat$income)
dat$income=ifelse(dat$income=="$60,000 - $89,999",3,dat$income)
dat$income=ifelse(dat$income=="$90,000 - $119,999",4,dat$income)
dat$income=ifelse(dat$income=="$120,000 - $149,999",5,dat$income)
dat$income=ifelse(dat$income=="$150,000+",6,dat$income)

dat$income=as.numeric(dat$income)

recode.pols=function(vec){
  vec=ifelse(vec=="Very liberal",1,vec)
  vec=ifelse(vec=="Somewhat liberal",2,vec)
  vec=ifelse(vec=="Neutral, independent",3,vec)
  vec=ifelse(vec=="Somewhat conservative",4,vec)
  vec=ifelse(vec=="Very conservative",5,vec)
  return(as.numeric(vec))
}

dat$politics=recode.pols(dat$politics)
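
The chained ifelse() calls used for recoding could also be written as a named lookup vector. This is a minimal sketch, assuming the same five Likert labels; `likert_key` and `recode_lookup` are hypothetical names that do not appear in the original script.

```r
# Hypothetical alternative to recode.likert(): map labels to numbers
# with a named vector instead of chained ifelse() calls.
likert_key <- c("Strongly disagree" = 1,
                "Somewhat disagree" = 2,
                "Neither agree nor disagree" = 3,
                "Somewhat agree" = 4,
                "Strongly agree" = 5)

recode_lookup <- function(vec, key) {
  # Labels that are not in the key (including "" and NA) become NA
  unname(key[as.character(vec)])
}

toy <- c("Strongly agree", "Somewhat disagree", "", NA)
recode_lookup(toy, likert_key)
```

One difference from recode.likert(): a value that is already a number stored as text (e.g., "3") would come back as NA here, so this version is only safe on columns that have not been recoded yet.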

Negative opinions about the COVID-19 vaccine?

First, let’s examine people’s attitudes towards the COVID-19 vaccines.

# recode all the relevant likert phrases (e.g., "strongly agree") into numbers
dat$covid.rate.1=recode.likert(dat$covid.rate.1)
dat$covid.rate.2=recode.likert(dat$covid.rate.2)
dat$covid.rate3=recode.likert(dat$covid.rate3)
dat$covid.rate4=recode.likert(dat$covid.rate4)
dat$covid.rate.5=recode.likert(dat$covid.rate.5)
dat$covid.rate.6=recode.likert(dat$covid.rate.6)

# create a new column/variable in the data that is the mean of all these COVID-19 attitude questions
dat$covid.att=rowMeans(dat[19:24])

# Examine histogram of COVID attitudes
hist(dat$covid.att, ylim=c(0,85),
     main="",xlab="Negative COVID vaccine attitudes")

Beliefs about mRNA technologies?

Some of the questions from the survey can be viewed as indirectly about mRNA concerns. Here, I erred on the side of analyzing only items that directly referenced mRNA.

“mRNA vaccines are unsafe”

likert.labels=c("Strongly disagree","Somewhat disagree","Neither...         ","Somewhat agree","Strongly agree")
barplot(table(dat$mRNA.unsafe),col="red4",names.arg=likert.labels,las=2,cex.names=.55)

“mRNA-based vaccines can alter people’s DNA”

barplot(table(dat$alter.DNA),col="red4",names.arg=likert.labels,las=2,cex.names=.55)

Are these two beliefs related to each other?

The distributions of Likert endorsements across these two items look very similar. If the items are similar in content and also highly correlated in how people respond to them, this could be justification for averaging them together.

plot(jitter(dat$mRNA.unsafe)~jitter(dat$alter.DNA),
     ylab="mRNA unsafe",xlab="can alter DNA")

cor.test(dat$mRNA.unsafe,dat$alter.DNA)

## 
##  Pearson's product-moment correlation
## 
## data:  dat$mRNA.unsafe and dat$alter.DNA
## t = 16.813, df = 250, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6647279 0.7816713
## sample estimates:
##       cor 
## 0.7284631

Since they are so strongly correlated with each other, I’ll go ahead and average them together into one variable, thus simplifying our analyses.
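
One common way to gauge whether a two-item average is defensible is the Spearman-Brown formula, which converts the inter-item correlation into an estimated reliability for the composite. This check is not part of the original analysis; the sketch below only plugs in the correlation reported by cor.test() above.

```r
# Spearman-Brown reliability for a two-item composite,
# using the Pearson correlation reported above.
r <- 0.7284631
spearman_brown <- (2 * r) / (1 + r)
round(spearman_brown, 3)
```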

Some exploratory analyses

There are a number of ways to examine whether mRNA beliefs are associated with attitudes towards the COVID-19 vaccines.

mRNA.att=rowMeans(cbind(dat$mRNA.unsafe,dat$alter.DNA))
summary(lm(dat$covid.att~mRNA.att))

## 
## Call:
## lm(formula = dat$covid.att ~ mRNA.att)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.30130 -0.37011  0.08255  0.48152  1.98152 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.12706    0.14751  14.420  < 2e-16 ***
## mRNA.att     0.39141    0.04691   8.345 5.25e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7094 on 245 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.2213, Adjusted R-squared:  0.2181 
## F-statistic: 69.63 on 1 and 245 DF,  p-value: 5.255e-15

Apparently mRNA concerns positively correlate with negative COVID-19 vaccine attitudes. This finding is a bit “on the nose”. So let’s delve deeper. Does this association hold after adjusting for various demographic variables? And, most notably, is it moderated by level of education?

summary(lm(dat$covid.att~mRNA.att+dat$education+dat$age+dat$is.male+dat$is.white+dat$income))

## 
## Call:
## lm(formula = dat$covid.att ~ mRNA.att + dat$education + dat$age + 
##     dat$is.male + dat$is.white + dat$income)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.53625 -0.37403  0.05869  0.43132  1.81216 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    2.635048   0.260931  10.099  < 2e-16 ***
## mRNA.att       0.359948   0.047709   7.545 9.79e-13 ***
## dat$education -0.032415   0.035083  -0.924   0.3565    
## dat$age       -0.001535   0.003805  -0.403   0.6870    
## dat$is.male   -0.018827   0.094318  -0.200   0.8420    
## dat$is.white  -0.172291   0.098243  -1.754   0.0808 .  
## dat$income    -0.032352   0.032082  -1.008   0.3143    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6977 on 236 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.2455, Adjusted R-squared:  0.2263 
## F-statistic:  12.8 on 6 and 236 DF,  p-value: 1.658e-12

Surprisingly, none of the additional demographic variables has a statistically significant link with COVID-19 attitudes. The model didn’t even account for much extra variance: \(R^2\) went from .22 to .25. Granted, though, each coefficient can only represent the unique relationship between that variable and the DV while adjusting for all other variables in the model. This is also a pretty large model for a relatively limited number of observations (n = 253).
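
The two models above were fit to slightly different sets of rows (6 vs. 10 observations dropped for missingness), so their \(R^2\) values are not strictly comparable. A formal test of the change needs both models fit to the same complete cases. Here is a minimal sketch on simulated data; `toy`, `small`, and `large` are made-up names, and the numbers are not from the survey.

```r
set.seed(1)
n <- 250
toy <- data.frame(x = rnorm(n), z1 = rnorm(n), z2 = rnorm(n))
toy$y <- 2 + 0.4 * toy$x + rnorm(n)
toy$z1[sample(n, 5)] <- NA          # a covariate with some missing values

# Restrict both models to the same complete rows
cc <- toy[complete.cases(toy), ]
small <- lm(y ~ x, data = cc)
large <- lm(y ~ x + z1 + z2, data = cc)

# F-test for whether the added covariates improve fit
anova(small, large)
```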

What about the specific interaction I was curious about? Does education moderate the relationship between mRNA beliefs and negative COVID-19 vaccine attitudes?

summary(lm(dat$covid.att~mRNA.att*dat$education))

## 
## Call:
## lm(formula = dat$covid.att ~ mRNA.att * dat$education)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.43507 -0.32446  0.05135  0.44893  1.83975 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             3.32553    0.44810   7.421 1.98e-12 ***
## mRNA.att                0.07029    0.13674   0.514  0.60768    
## dat$education          -0.25817    0.09428  -2.738  0.00663 ** 
## mRNA.att:dat$education  0.06917    0.02958   2.339  0.02017 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6946 on 241 degrees of freedom
##   (8 observations deleted due to missingness)
## Multiple R-squared:  0.2437, Adjusted R-squared:  0.2343 
## F-statistic: 25.89 on 3 and 241 DF,  p-value: 1.491e-14

It does, and the picture flips relative to the previous model! When education and the education x mRNA attitude interaction are in the same model, the coefficient for mRNA concerns (by itself) is no longer significant. Now, higher levels of education are significantly associated with lower negative vaccine attitudes (i.e., more positive ones).

Let’s look at the interaction coefficient though: To the degree that people are more educated, the relationship between problematic mRNA beliefs and negative COVID-19 vaccine attitudes intensifies. Put another way: To the degree that people have stronger concerns about mRNA vaccines, the negative relationship between education and negative COVID-19 vaccine attitudes shrinks toward zero.
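
To make the interaction concrete, the fitted coefficients can be combined into a separate mRNA slope for each education level: slope = b(mRNA.att) + b(interaction) x education. The sketch below uses the point estimates from the output above and ignores their standard errors, so it is only a rough guide.

```r
# Point estimates copied from the interaction model output above
b_mrna <- 0.07029   # mRNA.att coefficient (slope when education = 0)
b_int  <- 0.06917   # mRNA.att:education coefficient

education_levels <- 1:7   # 1 = some high school ... 7 = doctorate

# Implied slope of mRNA concerns at each education level
mrna_slopes <- b_mrna + b_int * education_levels
data.frame(education = education_levels, slope = round(mrna_slopes, 3))
```

Every education level that actually occurs in the data implies a positive slope, and the slope grows with education.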

(Note: I’d be interested in returning to this with a path analysis because we could plausibly assume that education can cause differences in mRNA concerns, but not the other way around.)

To create a visual examination of this interaction, I lifted code from this site.

library(jtools) # for summ()

covid.att=dat$covid.att
education=dat$education

fiti <- lm(covid.att ~ mRNA.att * education)
summ(fiti)

## MODEL INFO:
## Observations: 245 (8 missing obs. deleted)
## Dependent Variable: covid.att
## Type: OLS linear regression 
## 
## MODEL FIT:
## F(3,241) = 25.89, p = 0.00
## R² = 0.24
## Adj. R² = 0.23 
## 
## Standard errors:OLS
## -------------------------------------------------------
##                             Est.   S.E.   t val.      p
## ------------------------ ------- ------ -------- ------
## (Intercept)                 3.33   0.45     7.42   0.00
## mRNA.att                    0.07   0.14     0.51   0.61
## education                  -0.26   0.09    -2.74   0.01
## mRNA.att:education          0.07   0.03     2.34   0.02
## -------------------------------------------------------

library(ggplot2)
library(interactions)
p=interact_plot(fiti, pred = mRNA.att, modx = education,interval=T)
p+ylab("Negative COVID-19 vaccine attitudes")+xlab("mRNA concerns")

This is why it’s important to visualize interactions. Just looking at the regression output, I was under a misleading impression. In a model with an interaction, the mRNA.att coefficient is the slope of mRNA concerns when education equals 0, a value that falls outside the 1-7 education scale, so it is not an overall effect. By looking at the visualization I can see that, really, mRNA concerns are still a strong indicator of negative vaccine attitudes. Sure, this trend is moderated by levels of education, but it feels misleading to write off the mRNA-attitude association as “non-significant” after looking at the visualization.
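
One way to get a more interpretable lower-order coefficient straight from the regression table is to mean-center the moderator before fitting. This is a sketch on simulated data, not a re-fit of the survey; `x`, `z`, and `y` are stand-ins for mRNA concerns, education, and COVID-19 attitudes.

```r
set.seed(2)
n <- 250
x <- sample(1:5, n, replace = TRUE)   # stand-in for mRNA concerns
z <- sample(1:7, n, replace = TRUE)   # stand-in for education
y <- 3 + 0.1 * x - 0.2 * z + 0.07 * x * z + rnorm(n)

z_c <- z - mean(z)                    # mean-centered moderator

raw      <- lm(y ~ x * z)
centered <- lm(y ~ x * z_c)

coef(raw)["x"]           # slope of x when z = 0 (off the scale)
coef(centered)["x"]      # slope of x at the average value of z
coef(raw)["x:z"]         # interaction term ...
coef(centered)["x:z_c"]  # ... is unchanged by centering
```

Centering changes only what the lower-order coefficients refer to. The interaction estimate and the overall model fit stay the same.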

Things appear to work like this: The more concerns you have about mRNA vaccines, overall, the higher your negative COVID-19 vaccine attitudes will be. This trend is weaker for people with less formal educational attainment.

To what extent is the moderation effect of education driven by a small number of extreme participants?
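
A direct way to probe this question is to flag influential observations with Cook’s distance and re-fit the model without them. The sketch below uses simulated data, and the 4/n cutoff is just a common rule of thumb, not something taken from the original analysis.

```r
set.seed(3)
n <- 250
x <- sample(1:5, n, replace = TRUE)   # stand-in for mRNA concerns
z <- sample(1:7, n, replace = TRUE)   # stand-in for education
y <- 3 + 0.1 * x - 0.2 * z + 0.07 * x * z + rnorm(n)
toy <- data.frame(x, z, y)

fit <- lm(y ~ x * z, data = toy)

# Cook's distance measures how much each row shifts the fitted values
cd <- cooks.distance(fit)
keep <- cd <= 4 / n                   # rule-of-thumb cutoff (an assumption)

refit <- lm(y ~ x * z, data = toy[keep, ])

# Compare coefficients with and without the flagged rows
rbind(all = coef(fit), trimmed = coef(refit))
```

If the interaction coefficient barely moves after trimming, a handful of extreme participants is unlikely to be driving it.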

Delving deeper

edu.labels=c("Some HS","HS/GED","Some college","Assoc.","Bach.","Mast.","Doc.,etc.")
barplot(table(education),names.arg=edu.labels,las=2,cex.names=.55,
        ylim=c(0,100))

There are a lot of people with some college, people who stopped at a bachelor’s, and people with a master’s (or higher). Not sure I want to specifically cluster people into those groups for an analysis though. I’ll start by looking at the finest-grained breakdown of the predictor and the DV by education level.

barplot(
  aggregate(mRNA.att,by=list(education),mean,na.rm=T)$x,
  names.arg=edu.labels,las=2,cex.names=.55,
  ylab="mRNA concerns",
  ylim=c(0,5)
)

So, there’s a small trend where higher education is associated with fewer mRNA concerns.

barplot(
  aggregate(covid.att,by=list(education),mean,na.rm=T)$x,
  names.arg=edu.labels,las=2,cex.names=.55,
  ylab="Negative COVID-19 vax attitudes",
  ylim=c(0,5)
)

There’s apparently no consistent association between education level and negative vaccine attitudes. I don’t know though. These data visualizations don’t take sample size into account. They’re crude.
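
A less crude summary would report the group size and standard error alongside each mean. Here is a minimal base-R sketch on simulated data; the `toy` data frame and its values are made up, and only the column names mirror the real variables.

```r
set.seed(4)
toy <- data.frame(
  education = sample(1:7, 250, replace = TRUE),
  covid.att = runif(250, 1, 5)
)
toy$covid.att[sample(250, 8)] <- NA   # mimic some missing responses

# n, mean, and standard error of the DV within each education level
by_edu <- do.call(rbind, lapply(split(toy$covid.att, toy$education), function(v) {
  v <- v[!is.na(v)]
  data.frame(n = length(v), mean = mean(v), se = sd(v) / sqrt(length(v)))
}))
by_edu
```

Groups with small n will show large standard errors, which flags bars in the plots above that should not be over-interpreted.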

Mixing political affiliation into the data

And it might all be a moot point. According to my very awesome previous research (good job, me!), people’s vaccine attitudes are closely related to their political orientation. In fact, people’s attitudes towards vaccines in general (not just COVID-19 vaccines in particular) became more negative going from pre-pandemic to post-pandemic largely along party lines. In the U.S. at least.

summary(lm(covid.att~mRNA.att+education+dat$politics))

## 
## Call:
## lm(formula = covid.att ~ mRNA.att + education + dat$politics)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.41365 -0.36023  0.07222  0.40341  1.85471 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.21755    0.22842   9.708  < 2e-16 ***
## mRNA.att      0.33162    0.05015   6.613  2.4e-10 ***
## education    -0.05127    0.03097  -1.656   0.0991 .  
## dat$politics  0.10126    0.04531   2.235   0.0264 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6953 on 241 degrees of freedom
##   (8 observations deleted due to missingness)
## Multiple R-squared:  0.2423, Adjusted R-squared:  0.2328 
## F-statistic: 25.69 on 3 and 241 DF,  p-value: 1.877e-14

When we look only at main effects (no interactions or moderation), we see that mRNA concerns are associated with more negative COVID-19 vaccine attitudes while adjusting for education and politics. Likewise, being more conservative is associated with more negative COVID-19 vaccine attitudes while adjusting for mRNA concerns and education. However, education is not significantly associated with COVID-19 vaccine attitudes while adjusting for mRNA concerns and politics.

I can’t really go down the rabbit hole of interactions/moderations between these variables because having all 3 interacting would be under-powered:

summary(lm(covid.att~mRNA.att*education*dat$politics))

## 
## Call:
## lm(formula = covid.att ~ mRNA.att * education * dat$politics)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.44164 -0.33747  0.05536  0.46405  1.81237 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                      2.178732   1.251724   1.741   0.0831 .
## mRNA.att                         0.243599   0.417685   0.583   0.5603  
## education                       -0.160203   0.248419  -0.645   0.5196  
## dat$politics                     0.421970   0.402372   1.049   0.2954  
## mRNA.att:education               0.066215   0.085416   0.775   0.4390  
## mRNA.att:dat$politics           -0.071273   0.124877  -0.571   0.5687  
## education:dat$politics          -0.035380   0.081795  -0.433   0.6657  
## mRNA.att:education:dat$politics  0.002392   0.025465   0.094   0.9252  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6903 on 237 degrees of freedom
##   (8 observations deleted due to missingness)
## Multiple R-squared:  0.2655, Adjusted R-squared:  0.2438 
## F-statistic: 12.24 on 7 and 237 DF,  p-value: 2.406e-13

There are too many coefficients and too little data, I’m guessing. Surely, some of these interactions/moderations do exist, but we’re only working with 253 observations. That’s probably too few observations for so complex a model. We could go into interactions between 2 variables and leave the third one out, but that would feel arbitrary and feel like it’s leaving out key questions, in my view.
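
The reported \(R^2\) values also allow a rough check of how much the four interaction terms add as a set. The sketch below computes the F statistic for the change in \(R^2\) between the main-effects model and the three-way model. It assumes both were fit to the same 245 rows (each output reports 8 deleted observations) and uses the rounded values printed above.

```r
# R-squared values copied from the two model summaries above
r2_main <- 0.2423   # covid.att ~ mRNA.att + education + politics
r2_full <- 0.2655   # covid.att ~ mRNA.att * education * politics

df_extra <- 7 - 3   # four added interaction terms
df_resid <- 237     # residual df of the full model

# F-test for the change in R-squared between nested models
f_change <- ((r2_full - r2_main) / df_extra) / ((1 - r2_full) / df_resid)
p_change <- pf(f_change, df_extra, df_resid, lower.tail = FALSE)

round(c(F = f_change, p = p_change), 3)
```

Running anova() on the two fitted model objects would give the same test without the rounding error.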

APS25

Mark LaCour

12/10/2024