Knight Regression Analysis

The main research objective of this analysis is to determine whether race is a significant predictor of the outcomes of interest above and beyond other potential confounding variables, such as partisanship. Based on the crosstab analysis, the racial categories of most interest are Black, Hispanic and White.

The secondary research objective is to assess whether any of the control variables have predictive power over the outcomes of interest, as a means of informing potential future analyses.

This report will use multivariate regression techniques on 22 survey questions of interest; the specific regression technique will vary depending on the outcome of interest. The models employed here include:

Logistic Regression: This is used when our dependent variable of interest is binary – either a person provided a certain response, or did not. In this analysis, logistic regression is typically used to predict if a person agreed with a specific question, such as Q2A, which asks a person to indicate how much they agreed with the statement “I trust the major technology companies”. The logistic regression focuses on whether a person agreed (strongly or somewhat) compared to providing any other type of response.
Ordinal Regression: This analysis is similar to logistic regression, but allows for more than two types of outcomes. This technique was used for the Q16 series questions, where a person registers their level of concern over certain tech-related issues. [NOTE: THIS ANALYSIS HAS SINCE BEEN EXCLUDED AND WILL NOT BE FEATURED HERE]
Linear regression: The most familiar type of model in the regression family, this was used on questions with a number of different outcomes.
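
The binary coding behind the logistic models can be sketched as follows. This is a minimal illustration only: the response labels and the object names `q2a` and `q2a_agree` are assumptions, not taken from the survey file.

```r
# Hypothetical five-point responses; labels are assumed, not copied from the data.
q2a <- c("Strongly agree", "Somewhat agree", "Neither agree nor disagree",
         "Somewhat disagree", "Strongly disagree", NA)

# 1 = agreed (strongly or somewhat), 0 = any other substantive response.
# Missing responses stay missing rather than being counted as "did not agree".
agree_levels <- c("Strongly agree", "Somewhat agree")
q2a_agree <- ifelse(is.na(q2a), NA_integer_, as.integer(q2a %in% agree_levels))

table(q2a_agree, useNA = "ifany")
```

Keeping "neither agree nor disagree" in the 0 group matches the description above, where agreement is compared with any other type of response.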

Regardless of the type of regression model used, this analysis used the same set of predictor variables, including:

Race/ethnicity: This variable was recoded to the following groups – White (the reference category), Black, Hispanic and other.
Age cohort: This consisted of three age groups, 18-34, 35-54, and 55+.
Political Party: The categories here included Republican (reference category), Independent and Democrat. ‘Leaners’ were allocated to whichever party they leaned towards.
College Graduate: The reference category is the non-graduate.
High Internet User: Non-daily (i.e. non-high) users are the reference category.
News source user: People who do not use their favorite news sources daily are the reference category.
‘Urbanicity’: This is Q32 in the survey. People can describe the area they live as ‘rural’ (reference category), small town, large city or suburb of large city.
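
Reference categories like those listed above are usually set by making the desired group the first factor level. The sketch below assumes a small made-up data frame; `race.var` and `POL_PARTY` mirror names that appear in the regression output, while `urban` and all values are hypothetical.

```r
# Hypothetical respondent-level data; the values are illustrative only.
df <- data.frame(
  race.var  = c("Black", "White", "Hispanic", "Other", "White"),
  POL_PARTY = c("DEMOCRAT", "REPUBLICAN", "INDEPENDENT", "DEMOCRAT", "REPUBLICAN"),
  urban     = c("large city", "rural", "small town or village",
                "suburb of a large city", "rural"),
  stringsAsFactors = FALSE
)

# With R's default treatment contrasts, the first factor level is the reference.
df$race.var  <- relevel(factor(df$race.var), ref = "White")
df$POL_PARTY <- relevel(factor(df$POL_PARTY), ref = "REPUBLICAN")
df$urban     <- relevel(factor(df$urban), ref = "rural")

# Show the reference level of each predictor.
sapply(df, function(x) levels(x)[1])
```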

Part 1. Recodes of items

The file requires some recoding and data cleaning, which this code handles. One key step is coding the ‘-98’ labels as missing.

## 
##  REPUBLICAN    DEMOCRAT INDEPENDENT 
##        3858        5143         769

## 
##    White    Other    Black    Asian Hispanic 
##     7687      236     1126      166      983

## 
##    0    1 
## 5256 4900
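
The recoding code itself is not shown in this report. As a rough sketch of the ‘-98’ step only, assuming a data frame of numeric item codes (the column values here are invented):

```r
# Hypothetical raw items where -98 marks a skipped or refused answer.
raw <- data.frame(Q2A = c(1, 2, -98, 4, 5),
                  Q2B = c(-98, 3, 3, -98, 1))

# Replace the -98 code with NA in every column.
clean <- raw
clean[clean == -98] <- NA

# Count missing values per item after recoding.
colSums(is.na(clean))
```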

Part 2. Logistic Regression Analysis

Logistic regression will be the primary tool for this analysis, including on the following survey questions:

Q2A Trust in Tech companies: The model will predict which respondents said they agreed with this statement (including strongly or somewhat agree). The reference category will be respondents who did not agree – including those who said ‘neither agree nor disagree’.
Q2B Trust information on social media: Again, the model will focus on predicting which respondents agreed with this statement.
Q4D Reports of events by the news media CANNOT be trusted: Outcome of interest is disagree/strongly disagree. Note this is different from the approach followed for most other items. The reason for this break from protocol is explained further below.
Q4F Citizens have equal right to vote: Outcome of interest is disagree/strongly disagree.
Q18F To what extent does social media make the following easier or harder for you…Voting. Here we will predict respondents who said ‘easier’ or ‘much easier’.
Q26A Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…Spend time with family. The model will predict respondents who said ‘more likely’.
Q26F Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…use social media. Again, the model predicts which type of respondent said ‘more likely’.
Q26G Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…follow the news. Again, the model predicts which type of respondent said ‘more likely’.
Q26J Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…vote. Again, the model predicts which type of respondent said ‘more likely’.

For ease of analysis, we first create a function to run the logistic regression.

## 
## Note more likely      More likely 
##             5124             3311
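
The helper function is not reproduced in this report. A minimal sketch of what such a function might look like is given below, assuming the `survey` package, a single weight column, and no clustering or strata. The function name `run_logit`, the weight name `wt`, and the simulated data are all hypothetical. The quasibinomial family and logit link match the model summaries that follow.

```r
library(survey)

# Fit a survey-weighted logistic regression for one binary outcome and
# return the fitted model plus exponentiated coefficients (odds ratios).
run_logit <- function(data, outcome, predictors, weight_var) {
  design <- svydesign(ids = ~1,
                      weights = as.formula(paste("~", weight_var)),
                      data = data)
  fml <- reformulate(predictors, response = outcome)
  fit <- svyglm(fml, design = design, family = quasibinomial())
  list(fit = fit, odds_ratios = exp(coef(fit)))
}

# Simulated data purely to show the call; none of these values come from the survey.
set.seed(1)
n <- 500
sim <- data.frame(
  race.var  = relevel(factor(sample(c("White", "Black", "Hispanic", "Other"),
                                    n, replace = TRUE)), ref = "White"),
  POL_PARTY = relevel(factor(sample(c("REPUBLICAN", "DEMOCRAT", "INDEPENDENT"),
                                    n, replace = TRUE)), ref = "REPUBLICAN"),
  wt        = runif(n, 0.5, 2)
)
sim$ind.var <- rbinom(n, 1, 0.3)

res <- run_logit(sim, "ind.var", c("race.var", "POL_PARTY"), "wt")
round(res$odds_ratios, 2)
```

Wrapping the design and model calls in one function keeps the predictor set identical across questions, so only the outcome variable changes from model to model.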

Q2A: Trust in Tech Companies

We have seen apparent differences by race on this question, as the bar chart below shows. The regression analysis confirms this relationship, with Black being a significant predictor. The regression table and coefficient plot show the results as odds ratios; a value over 1 indicates that a group had higher odds of agreeing with the statement than its reference category. Here, the odds of agreeing are 1.66 times as high for Black respondents as for White respondents (the reference category), and this is statistically significant; only the odds ratio associated with being a Democrat (2.04) is higher.

## MODEL INFO:
## Observations: 9295
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.05
## Pseudo-R² (McFadden) = 0.04
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.09   0.19   -13.07   0.00
## race.varBlack                           1.66   0.10     5.17   0.00
## race.varHispanic                        1.25   0.11     2.10   0.04
## race.varOther                           1.03   0.17     0.15   0.88
## POL_PARTYDEMOCRAT                       2.04   0.09     8.32   0.00
## POL_PARTYINDEPENDENT                    1.05   0.15     0.31   0.76
## genderFemale                            0.95   0.07    -0.77   0.44
## AGE_CAT35 TO 54                         1.01   0.10     0.13   0.89
## AGE_CAT55+                              1.38   0.10     3.23   0.00
## college.grad1                           1.12   0.08     1.43   0.15
## high_internet_use_1high                 0.98   0.13    -0.16   0.88
## high_news_1high                         1.08   0.08     0.93   0.35
## Q32A small town or village              1.18   0.12     1.43   0.15
## Q32A large city                         1.01   0.13     0.07   0.95
## Q32A suburb of a large                  1.22   0.11     1.74   0.08
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1
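
As a worked example of reading the table above, an approximate 95% confidence interval for the Black odds ratio can be rebuilt from the printed values. This sketch assumes the S.E. column is on the log-odds scale (the reported t values are consistent with that) and uses a normal approximation, so it will differ slightly from an interval based on the survey design's degrees of freedom.

```r
# Values read from the Q2A table for race.varBlack.
or <- 1.66   # exp(Est.)
se <- 0.10   # S.E., assumed to be on the log-odds scale

log_or <- log(or)
z <- qnorm(0.975)  # normal approximation to the reference distribution

# Build the interval on the log-odds scale, then exponentiate.
ci <- exp(log_or + c(-1, 1) * z * se)
round(c(lower = ci[1], estimate = or, upper = ci[2]), 2)
```

Because the inputs are rounded to two decimals, the interval is only indicative. Its lower bound staying above 1 is what corresponds to the significant result in the table.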

Q2B: Trust information on social media

In terms of the percentage who agree or strongly agree with this statement, we again see apparent differences by race, as the chart below shows. However, the percentage who agree is very low across all races/ethnicities, even if Black respondents have the highest rate of agreement (7%). Still, the logistic regression finds that race is a significant predictor: Black has the highest odds ratio (2.75) of any characteristic in the model, including Democrat.

## MODEL INFO:
## Observations: 9276
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.06
## Pseudo-R² (McFadden) = 0.05
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.07   0.38    -7.05   0.00
## race.varBlack                           2.75   0.23     4.39   0.00
## race.varHispanic                        1.61   0.24     1.97   0.05
## race.varOther                           2.29   0.34     2.46   0.01
## POL_PARTYDEMOCRAT                       1.39   0.21     1.57   0.12
## POL_PARTYINDEPENDENT                    1.03   0.33     0.09   0.93
## genderFemale                            0.70   0.17    -2.12   0.03
## AGE_CAT35 TO 54                         0.80   0.24    -0.95   0.34
## AGE_CAT55+                              1.07   0.25     0.26   0.79
## college.grad1                           0.73   0.19    -1.66   0.10
## high_internet_use_1high                 0.42   0.25    -3.50   0.00
## high_news_1high                         0.87   0.18    -0.75   0.45
## Q32A small town or village              0.84   0.28    -0.64   0.52
## Q32A large city                         1.02   0.29     0.07   0.95
## Q32A suburb of a large                  0.55   0.28    -2.17   0.03
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.01

Q4D: Reports of events by the news media cannot be trusted

In this analysis, we shift gears somewhat and instead focus on predicting who DISAGREES (strongly or somewhat) with the statement. Why? Recall that the overarching objective of this review is to better understand on which matters Black respondents offer different opinions from other respondents, after controlling for other potentially relevant characteristics, such as political party.

The bar charts below compare the results of this question if we first focus on the “agree” response (shown in the first bar chart) and then the “disagree” response (shown in the second bar chart). As can be seen, Blacks are more likely to “disagree” than the other groups, and so the regression analysis will focus on this outcome.

## # A tibble: 6 x 6
##   ind_var  dep_var   Wording dep_category   pct unweighted_n
##   <fct>    <chr>     <chr>   <fct>        <dbl>        <int>
## 1 White    q4d.agree <NA>    Not agree     49.4         7644
## 2 White    q4d.agree <NA>    Agree         50.6         7644
## 3 Black    q4d.agree <NA>    Not agree     72.8         1106
## 4 Black    q4d.agree <NA>    Agree         27.2         1106
## 5 Hispanic q4d.agree <NA>    Not agree     58.8          978
## 6 Hispanic q4d.agree <NA>    Agree         41.2          978
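
A table with this shape (a percentage per group alongside an unweighted n) can be produced along the following lines. This is a sketch only: the weight column `wt` and every value in `d` are invented, and the report's own cross-tab code is not shown.

```r
# Hypothetical respondent rows with a survey weight `wt`.
d <- data.frame(
  race  = c("White", "White", "Black", "Black", "Hispanic", "Hispanic"),
  agree = c(1, 0, 0, 1, 1, 0),
  wt    = c(1.2, 0.8, 1.5, 0.5, 1.0, 1.0)
)

# Weighted percent agreeing and unweighted n within each group.
by_race <- do.call(rbind, lapply(split(d, d$race), function(g) {
  data.frame(ind_var = g$race[1],
             pct = 100 * weighted.mean(g$agree, g$wt),
             unweighted_n = nrow(g))
}))
by_race
```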

The cross-tabs suggest that Blacks are more likely than any other race or ethnicity to disagree with the statement. However, the regression results, which test this relationship against other salient predictors, do not confirm this apparent relationship. While Black is a significant predictor, the odds ratio is below 1 (0.65), suggesting that Black respondents have somewhat lower odds of disagreeing with this statement than White respondents once all other factors are considered.

Notably, the effect associated with being a Democrat is titanic: the odds that a Democrat disagrees with this statement are about 20 times those of a Republican (the reference category). Given the primary role of political affiliation, this may help explain the puzzling results.

## MODEL INFO:
## Observations: 9293
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.35
## Pseudo-R² (McFadden) = 0.24
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.02   0.22   -18.56   0.00
## race.varBlack                           0.65   0.10    -4.54   0.00
## race.varHispanic                        0.67   0.11    -3.83   0.00
## race.varOther                           0.65   0.17    -2.51   0.01
## POL_PARTYDEMOCRAT                      20.18   0.12    25.99   0.00
## POL_PARTYINDEPENDENT                    4.17   0.17     8.55   0.00
## genderFemale                            0.87   0.07    -2.09   0.04
## AGE_CAT35 TO 54                         0.96   0.09    -0.46   0.65
## AGE_CAT55+                              1.02   0.09     0.23   0.82
## college.grad1                           1.46   0.07     5.13   0.00
## high_internet_use_1high                 1.12   0.15     0.79   0.43
## high_news_1high                         2.18   0.08     9.56   0.00
## Q32A small town or village              1.11   0.12     0.86   0.39
## Q32A large city                         1.45   0.12     3.02   0.00
## Q32A suburb of a large                  1.44   0.12     3.17   0.00
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.02
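
An odds ratio multiplies odds, not probabilities, so an odds ratio near 20 does not imply a probability 20 times larger. The sketch below converts the Democrat odds ratio into an implied probability for a hypothetical reference-group probability of 10%. That baseline is an assumption chosen for illustration, not an estimate from the survey.

```r
# Convert a reference-group probability and an odds ratio into the implied probability.
apply_or <- function(p_ref, or) {
  odds <- p_ref / (1 - p_ref) * or
  odds / (1 + odds)
}

p_ref <- 0.10                    # hypothetical baseline, not a survey estimate
p_new <- apply_or(p_ref, 20.18)  # 20.18 is the POL_PARTYDEMOCRAT odds ratio above

round(c(probability = p_new, risk_ratio = p_new / p_ref), 2)
```

With this baseline the implied probability is about 0.69, roughly seven times the reference probability rather than twenty.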

As a robustness check, this analyst also ran the logistic regression in the alternative direction, predicting agree/strongly agree (which means people are indicating DISTRUST). Here, Black was not a significant predictor. Again, the clear divide was with respect to political affiliation. This is harder to see here because Republican is the reference category, but if you invert the Democrat odds ratio below (1/0.06) so that Republican becomes the non-reference category, Republicans' odds of AGREEING with this statement would be roughly 17 times those of Democrats.

## MODEL INFO:
## Observations: 9293
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.46
## Pseudo-R² (McFadden) = 0.30
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             8.86   0.20    11.18   0.00
## race.varBlack                           1.06   0.11     0.53   0.60
## race.varHispanic                        1.10   0.11     0.84   0.40
## race.varOther                           1.20   0.17     1.07   0.28
## POL_PARTYDEMOCRAT                       0.06   0.08   -37.15   0.00
## POL_PARTYINDEPENDENT                    0.28   0.11   -11.47   0.00
## genderFemale                            0.94   0.07    -0.90   0.37
## AGE_CAT35 TO 54                         1.08   0.10     0.80   0.42
## AGE_CAT55+                              1.18   0.10     1.64   0.10
## college.grad1                           0.72   0.07    -4.51   0.00
## high_internet_use_1high                 1.00   0.14    -0.03   0.98
## high_news_1high                         0.58   0.08    -6.95   0.00
## Q32A small town or village              0.82   0.11    -1.87   0.06
## Q32A large city                         0.60   0.12    -4.27   0.00
## Q32A suburb of a large                  0.71   0.11    -3.29   0.00
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.99
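
Swapping which party serves as the reference category inverts the odds ratio. Because the table rounds to two decimals, the reciprocal is better read as a range than as a single number. A short sketch using the printed Democrat value:

```r
or_dem <- 0.06        # POL_PARTYDEMOCRAT, as printed (rounded to two decimals)
or_rep <- 1 / or_dem  # Republican relative to Democrat

# Any unrounded value between 0.055 and 0.065 would print as 0.06.
range_rep <- 1 / c(0.065, 0.055)
round(c(point = or_rep, low = range_rep[1], high = range_rep[2]), 1)
```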

All in all, we should not consider race/ethnicity a significant predictor for this question.

Q4F: All adult citizens have equal opportunity to vote

This is another question where we flip the script in the analysis, now focusing on the disagree/strongly disagree category. Again, this is driven by the fact that the cross-tabs suggest Blacks are more likely than others to disagree with this statement – about 58% disagree to some extent, compared to 38% of whites, 43% of Hispanics and 46% of other individuals.

However, the logistic regression results do not find a statistically significant relationship after considering all other factors. As with Q4D above, we see political affiliation playing a dominant role here.

## # A tibble: 4 x 6
##   ind_var  dep_var      Wording                  dep_category   pct unweighted_n
##   <fct>    <chr>        <chr>                    <fct>        <dbl>        <int>
## 1 White    q4f.disagree All adult citizens have~ Disagree      38.4         7649
## 2 Black    q4f.disagree All adult citizens have~ Disagree      57.5         1109
## 3 Hispanic q4f.disagree All adult citizens have~ Disagree      43.4          975
## 4 Other    q4f.disagree All adult citizens have~ Disagree      46.1          400

## MODEL INFO:
## Observations: 9299
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.39
## Pseudo-R² (McFadden) = 0.25
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.08   0.19   -13.58   0.00
## race.varBlack                           1.09   0.10     0.85   0.40
## race.varHispanic                        0.84   0.10    -1.73   0.08
## race.varOther                           0.96   0.17    -0.28   0.78
## POL_PARTYDEMOCRAT                      14.48   0.08    32.40   0.00
## POL_PARTYINDEPENDENT                    3.70   0.12    10.56   0.00
## genderFemale                            1.16   0.06     2.31   0.02
## AGE_CAT35 TO 54                         0.73   0.08    -3.65   0.00
## AGE_CAT55+                              0.55   0.09    -6.74   0.00
## college.grad1                           1.39   0.07     4.89   0.00
## high_internet_use_1high                 1.64   0.14     3.45   0.00
## high_news_1high                         1.00   0.08     0.03   0.98
## Q32A small town or village              1.13   0.11     1.14   0.25
## Q32A large city                         1.20   0.12     1.55   0.12
## Q32A suburb of a large                  1.18   0.11     1.54   0.12
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.02

Q18F To what extent does social media make the following easier or harder for you: Voting

Here the outcome of interest is whether respondents said “easier” or “much easier”. In general, we see that Blacks are more likely to offer this response: 30% did so, compared to 15% of whites, 23% of Hispanics and 19% of all other individuals.

And, indeed, the logistic regression results confirm as much: the odds of giving this response are about twice as high for Black respondents as for White respondents (odds ratio 2.01); the effect is slightly stronger than that associated with being a Democrat (1.99).

## # A tibble: 4 x 6
##   ind_var  dep_var     Wording                  dep_category    pct unweighted_n
##   <fct>    <chr>       <chr>                    <fct>         <dbl>        <int>
## 1 White    q18f.easier To what extent do you t~ Easier/Much ~  14.7         6495
## 2 Black    q18f.easier To what extent do you t~ Easier/Much ~  30.0          935
## 3 Hispanic q18f.easier To what extent do you t~ Easier/Much ~  22.8          844
## 4 Other    q18f.easier To what extent do you t~ Easier/Much ~  18.7          351

## MODEL INFO:
## Observations: 7954
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.04
## Pseudo-R² (McFadden) = 0.03
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.17   0.21    -8.50   0.00
## race.varBlack                           2.01   0.11     6.62   0.00
## race.varHispanic                        1.58   0.11     4.21   0.00
## race.varOther                           1.27   0.17     1.41   0.16
## POL_PARTYDEMOCRAT                       1.99   0.09     7.43   0.00
## POL_PARTYINDEPENDENT                    1.05   0.17     0.28   0.78
## genderFemale                            0.99   0.08    -0.07   0.94
## AGE_CAT35 TO 54                         0.64   0.10    -4.49   0.00
## AGE_CAT55+                              0.74   0.10    -2.94   0.00
## college.grad1                           0.80   0.08    -2.72   0.01
## high_internet_use_1high                 0.66   0.15    -2.77   0.01
## high_news_1high                         1.12   0.09     1.33   0.19
## Q32A small town or village              1.30   0.13     1.94   0.05
## Q32A large city                         1.70   0.14     3.86   0.00
## Q32A suburb of a large                  1.29   0.13     1.95   0.05
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.02

Q26A Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…Spend time with family

Here, we focus on the “more likely” category for our regression efforts. Again, the cross-tabs suggest that race/ethnicity might be an important predictor – 25% of Blacks said “more likely” to this question, compared to 17% of Hispanics, 10% of Whites and 8% of all other individuals.

Indeed, we find a significant relationship here, with an odds ratio of 3.5 for Black respondents relative to White respondents. Interestingly, the odds ratio for Democrats is below 1, indicating that Democrats, after controlling for other personal characteristics, are less prone to give this response. This is a rare instance where both of these characteristics are statistically significant but point in opposite directions (at least for the questions examined here).

## MODEL INFO:
## Observations: 7792
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.03
## Pseudo-R² (McFadden) = 0.02
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.28   0.23    -5.57   0.00
## race.varBlack                           3.52   0.12    10.56   0.00
## race.varHispanic                        1.99   0.12     5.58   0.00
## race.varOther                           0.87   0.22    -0.66   0.51
## POL_PARTYDEMOCRAT                       0.65   0.10    -4.35   0.00
## POL_PARTYINDEPENDENT                    0.96   0.16    -0.25   0.80
## genderFemale                            1.04   0.08     0.47   0.64
## AGE_CAT35 TO 54                         1.18   0.13     1.26   0.21
## AGE_CAT55+                              1.31   0.14     2.01   0.04
## college.grad1                           0.75   0.10    -2.98   0.00
## high_internet_use_1high                 0.57   0.16    -3.45   0.00
## high_news_1high                         0.98   0.10    -0.16   0.87
## Q32A small town or village              0.84   0.13    -1.40   0.16
## Q32A large city                         0.70   0.15    -2.45   0.01
## Q32A suburb of a large                  0.76   0.13    -2.08   0.04
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.02

Q26F Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…Use social media

Again, the outcome of interest will be the “more likely” response. The logistic regression shows that Black is a significant predictor, though so too is being Hispanic. Political affiliation is not significant, a rare finding (or perhaps non-finding) in this analysis.

Indeed, the regression results suggest it is difficult to pick up a clear pattern on this question – most characteristics are not statistically significant and, even among those that are, the effect is somewhat small. Given this abundance of noise, we may consider not reporting out on this analysis.

## MODEL INFO:
## Observations: 7797
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = -0.01
## Pseudo-R² (McFadden) = -0.01
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.16   0.23    -8.00   0.00
## race.varBlack                           1.80   0.11     5.17   0.00
## race.varHispanic                        1.63   0.11     4.34   0.00
## race.varOther                           1.12   0.19     0.62   0.53
## POL_PARTYDEMOCRAT                       1.18   0.09     1.84   0.07
## POL_PARTYINDEPENDENT                    0.76   0.16    -1.70   0.09
## genderFemale                            1.08   0.08     1.02   0.31
## AGE_CAT35 TO 54                         0.84   0.11    -1.65   0.10
## AGE_CAT55+                              0.96   0.11    -0.36   0.72
## college.grad1                           0.73   0.09    -3.70   0.00
## high_internet_use_1high                 1.27   0.19     1.27   0.20
## high_news_1high                         1.32   0.09     3.06   0.00
## Q32A small town or village              0.74   0.12    -2.51   0.01
## Q32A large city                         0.87   0.13    -1.14   0.26
## Q32A suburb of a large                  0.68   0.12    -3.35   0.00
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.03

Q26G Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…Follow the news

We focus on the “more likely” response. The regression finds race/ethnicity to be a significant predictor (the odds of saying “more likely” are 1.24 times as high for Blacks as for Whites); however, political affiliation plays a more decisive role (the odds for Democrats are 2.31 times those for Republicans).

## MODEL INFO:
## Observations: 7792
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.05
## Pseudo-R² (McFadden) = 0.03
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.22   0.18    -8.39   0.00
## race.varBlack                           1.24   0.10     2.20   0.03
## race.varHispanic                        1.31   0.09     2.89   0.00
## race.varOther                           1.06   0.15     0.40   0.69
## POL_PARTYDEMOCRAT                       2.31   0.07    12.24   0.00
## POL_PARTYINDEPENDENT                    0.98   0.12    -0.16   0.87
## genderFemale                            1.03   0.06     0.47   0.64
## AGE_CAT35 TO 54                         0.82   0.08    -2.48   0.01
## AGE_CAT55+                              0.97   0.08    -0.32   0.75
## college.grad1                           0.94   0.06    -0.92   0.36
## high_internet_use_1high                 1.16   0.15     0.98   0.33
## high_news_1high                         1.96   0.07     9.56   0.00
## Q32A small town or village              0.94   0.10    -0.63   0.53
## Q32A large city                         1.14   0.10     1.27   0.20
## Q32A suburb of a large                  0.96   0.09    -0.46   0.65
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.02

Q26J Thinking about political and social debates between people on social media, would you say they have made you more or less likely to…Vote

The results here are similar to other items in this battery: race is significant in terms of whether a person will say “more likely” or not; political party has a stronger effect.

## MODEL INFO:
## Observations: 7796
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = -0.01
## Pseudo-R² (McFadden) = -0.01
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.54   0.17    -3.61   0.00
## race.varBlack                           1.41   0.09     3.70   0.00
## race.varHispanic                        1.16   0.09     1.61   0.11
## race.varOther                           1.22   0.14     1.44   0.15
## POL_PARTYDEMOCRAT                       1.56   0.06     6.83   0.00
## POL_PARTYINDEPENDENT                    0.66   0.12    -3.63   0.00
## genderFemale                            1.28   0.06     4.39   0.00
## AGE_CAT35 TO 54                         0.94   0.08    -0.81   0.42
## AGE_CAT55+                              0.97   0.08    -0.35   0.73
## college.grad1                           0.81   0.06    -3.34   0.00
## high_internet_use_1high                 1.20   0.14     1.30   0.19
## high_news_1high                         1.20   0.06     2.78   0.01
## Q32A small town or village              0.96   0.09    -0.47   0.64
## Q32A large city                         0.90   0.10    -1.04   0.30
## Q32A suburb of a large                  1.01   0.09     0.11   0.92
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.03

Q16A How concerned are you about each of the following issues – The size and power of major technology companies?

Here, we focus on the percent who were “very concerned.” In all, 37% of respondents said they were very concerned; another 45% said they were somewhat concerned about the size and power of major technology companies.

If we focus on the “very concerned” category, though, we see notable differences by race/ethnicity – with Blacks less likely than all other groups to give this response.

Turning to logistic regression, we do find this observation holds, even when controlling for other factors. The odds-ratio associated with Black respondents is well below 1, indicating lower likelihood to say this.

## MODEL INFO:
## Observations: 9308
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.11
## Pseudo-R² (McFadden) = 0.06
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             1.08   0.14     0.54   0.59
## race.varBlack                           0.68   0.10    -3.93   0.00
## race.varHispanic                        0.97   0.09    -0.36   0.72
## race.varOther                           1.28   0.13     1.85   0.06
## POL_PARTYDEMOCRAT                       0.35   0.06   -16.96   0.00
## POL_PARTYINDEPENDENT                    0.60   0.11    -4.85   0.00
## genderFemale                            0.73   0.05    -5.73   0.00
## AGE_CAT35 TO 54                         0.81   0.08    -2.73   0.01
## AGE_CAT55+                              0.95   0.08    -0.59   0.56
## college.grad1                           1.00   0.06     0.04   0.97
## high_internet_use_1high                 0.98   0.10    -0.17   0.86
## high_news_1high                         1.33   0.07     4.39   0.00
## Q32A small town or village              0.97   0.09    -0.34   0.73
## Q32A large city                         1.14   0.10     1.39   0.16
## Q32A suburb of a large                  1.09   0.09     1.06   0.29
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1
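
As a reading aid for the table above: the `exp(Est.)` column is an odds ratio, and the `S.E.` column appears to be on the log-odds scale (for Black respondents, log(0.68) / 0.10 is about -3.9, in line with the reported t value of -3.93). Under that assumption, a minimal R sketch converts a reported odds ratio and standard error into a percent change in odds and an approximate 95% interval. The helper name `odds_ratio_summary` is hypothetical, and the normal-approximation interval is an illustration rather than output from the original model.

```r
# Hypothetical helper: summarise a reported odds ratio, assuming the
# standard error is on the log-odds scale (an assumption, see above).
odds_ratio_summary <- function(odds_ratio, se_log, z = 1.96) {
  log_or <- log(odds_ratio)
  c(
    pct_change_in_odds = (odds_ratio - 1) * 100,
    lower = exp(log_or - z * se_log),
    upper = exp(log_or + z * se_log)
  )
}

# Values copied from the race.varBlack row of the Q16A table
black_q16a <- odds_ratio_summary(odds_ratio = 0.68, se_log = 0.10)
print(round(black_q16a, 2))
```

If the whole interval sits below 1, that agrees with the reading that Black respondents had lower odds of being “very concerned” than white respondents, other factors held constant.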

Q16D How concerned are you about each of the following issues – Hate speech and other abusive or threatening language online?

For this question, we again focus on the “very concerned” category. At the crosstab level, we find that 70% of Blacks are ‘very concerned’ about this issue, compared to 52% of whites, 59% of Hispanics and 48% of other individuals.

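Despite these clear differences, the coefficient for Black respondents was not significant in the logistic regression (odds ratio 1.15, p = 0.19); among the race terms, only the Hispanic coefficient reached the conventional 0.05 threshold (odds ratio 1.21, p = 0.04). Political affiliation was significant and was the dominant effect tested.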

## MODEL INFO:
## Observations: 9303
## Dependent Variable: ind.var
## Type: Analysis of complex survey design 
##  Family: quasibinomial 
##  Link function: logit 
## 
## MODEL FIT:
## Pseudo-R² (Cragg-Uhler) = 0.24
## Pseudo-R² (McFadden) = 0.14
## AIC =  NA 
## 
## -------------------------------------------------------------------
##                                    exp(Est.)   S.E.   t val.      p
## -------------------------------- ----------- ------ -------- ------
## (Intercept)                             0.16   0.17   -10.92   0.00
## race.varBlack                           1.15   0.10     1.32   0.19
## race.varHispanic                        1.21   0.09     2.07   0.04
## race.varOther                           0.89   0.14    -0.83   0.41
## POL_PARTYDEMOCRAT                       5.28   0.07    24.44   0.00
## POL_PARTYINDEPENDENT                    1.84   0.10     5.86   0.00
## genderFemale                            1.92   0.06    11.41   0.00
## AGE_CAT35 TO 54                         1.54   0.08     5.46   0.00
## AGE_CAT55+                              3.19   0.09    13.29   0.00
## college.grad1                           0.96   0.06    -0.56   0.58
## high_internet_use_1high                 0.86   0.13    -1.20   0.23
## high_news_1high                         1.28   0.07     3.71   0.00
## Q32A small town or village              1.27   0.09     2.58   0.01
## Q32A large city                         1.21   0.10     1.83   0.07
## Q32A suburb of a large                  1.20   0.09     1.94   0.05
## city                                                               
## -------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1
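
One possible reading of this pattern is that a gap seen in a crosstab can shrink once a correlated control such as party enters the model. The sketch below illustrates that mechanism on simulated data only. All variable names and effect sizes are hypothetical, and base R `glm()` with case weights stands in for the survey-weighted model used in this report, so it says nothing about the actual survey results.

```r
set.seed(42)
n <- 5000

# Simulated, hypothetical respondents (not the survey data)
black <- rbinom(n, 1, 0.15)
# Party identification is made to correlate with race
democrat <- rbinom(n, 1, ifelse(black == 1, 0.8, 0.4))
# The outcome depends on party only, not on race
very_concerned <- rbinom(n, 1, plogis(-0.5 + 1.5 * democrat))
weight <- runif(n, 0.5, 2)
sim <- data.frame(black, democrat, very_concerned, weight)

# Unadjusted model: mirrors the crosstab comparison
unadjusted <- glm(very_concerned ~ black, data = sim,
                  family = quasibinomial(), weights = weight)
# Adjusted model: adds party as a control
adjusted <- glm(very_concerned ~ black + democrat, data = sim,
                family = quasibinomial(), weights = weight)

or_unadjusted <- exp(coef(unadjusted))[["black"]]
or_adjusted <- exp(coef(adjusted))[["black"]]
print(round(c(unadjusted = or_unadjusted, adjusted = or_adjusted), 2))
```

In this simulation the race odds ratio moves toward 1 once party is included, because the outcome was generated from party alone. Whether the same mechanism explains the Q16D result is a question the real data would have to answer.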

LINEAR REGRESSIONS

The remaining survey items all come from the Q25 series of questions, which asks “In the last 12 months how often have you done each of the following – Daily, weekly, a few times a month, rarely, never or not applicable.”

For this analysis, the “Never” and “Not applicable” categories were combined.

Given this 5-point scale, we chose a linear regression model, though this choice would certainly have its critics, since the outcome is ordinal rather than continuous. As we will see, it isn’t clear we’ll want to report on these results anyway: model fit is weak (R² is at most about 0.06 in the first set of models) and, in general, there is a lack of significant terms.

First we recode the variables and write a function to run the linear regressions.
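
The recode and the helper function are not reproduced in this report, so the following is only a sketch of what they might look like. The 0 to 4 coding, the function names `recode_frequency` and `run_weighted_lm`, and the toy data are all assumptions. Base R `lm()` with case weights is used here in place of a full survey design object, so its standard errors would not match the robust, design-based ones shown in the output below.

```r
# Assumed coding (not confirmed by the source): higher = more frequent,
# with "Never" and "Not applicable" combined at 0.
recode_frequency <- function(x) {
  scale_map <- c(
    "Daily" = 4, "Weekly" = 3, "A few times a month" = 2,
    "Rarely" = 1, "Never" = 0, "Not applicable" = 0
  )
  unname(scale_map[as.character(x)])
}

# Hypothetical stand-in for the report's regression helper, using base lm()
# with case weights instead of a survey design object.
run_weighted_lm <- function(dep_var, data, predictors, weight_var = "WEIGHT") {
  data$.w <- data[[weight_var]]
  model_formula <- reformulate(predictors, response = dep_var)
  lm(model_formula, data = data, weights = .w)
}

# Simulated toy data, for illustration only
set.seed(1)
toy <- data.frame(
  q25a = sample(c("Daily", "Weekly", "A few times a month",
                  "Rarely", "Never", "Not applicable"), 200, replace = TRUE),
  race.var = factor(sample(c("White", "Black", "Hispanic", "Other"),
                           200, replace = TRUE),
                    levels = c("White", "Black", "Hispanic", "Other")),
  WEIGHT = runif(200, 0.5, 2)
)
toy$q25a.recode <- recode_frequency(toy$q25a)

fit <- run_weighted_lm("q25a.recode", toy, predictors = "race.var")
print(round(coef(fit), 3))
```

Setting "White" as the first factor level makes it the reference category, which matches how the race terms are labeled in the model output.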

Q25 QUESTIONS

This analysis will first show the descriptive statistics by race/ethnicity for each question and then the model output.

q25a.data.race<-two_tab_func(survey.df, "race.var", "q25a.recode", "WEIGHT")%>%
  mutate(Wording = label_df$Wording[label_df$QTAG == "Q25A"])


ggplot(q25a.data.race, aes(x=dep_category, y=pct))+geom_bar(stat="identity", fill="green")+
  labs(title="Q25A Sent an email or social media post to govt official", x="Frequency") +
  ylim(0,100)+
  geom_text(aes(label=pct), vjust=-0.1, colour="black")+
  facet_wrap(~ind_var, ncol=1)+
  theme_classic()

linear.regression.function.data("q25a.recode", survey.df)

## MODEL INFO:
## Observations: 9309
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.010
## Adj. R² = 0.008 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.447    0.309    0.585    6.366   0.000
## race.varBlack                       0.026   -0.055    0.108    0.628   0.530
## race.varHispanic                    0.067   -0.016    0.151    1.578   0.115
## race.varOther                       0.018   -0.080    0.117    0.365   0.715
## POL_PARTYDEMOCRAT                  -0.028   -0.083    0.027   -1.005   0.315
## POL_PARTYINDEPENDENT               -0.114   -0.201   -0.026   -2.549   0.011
## genderFemale                       -0.025   -0.072    0.023   -1.027   0.304
## AGE_CAT35 TO 54                     0.067    0.000    0.134    1.963   0.050
## AGE_CAT55+                          0.043   -0.028    0.114    1.190   0.234
## college.grad1                       0.102    0.047    0.157    3.666   0.000
## high_internet_use_1high             0.142    0.030    0.254    2.485   0.013
## high_news_1high                     0.097    0.042    0.152    3.456   0.001
## Q32A small town or village         -0.003   -0.079    0.073   -0.086   0.932
## Q32A large city                    -0.007   -0.090    0.077   -0.159   0.874
## Q32A suburb of a large             -0.035   -0.110    0.039   -0.931   0.352
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.84

q25b.data.race<-two_tab_func(survey.df, "race.var", "q25b.recode", "WEIGHT")%>%
  mutate(Wording = label_df$Wording[label_df$QTAG == "Q25B"])


ggplot(q25b.data.race, aes(x=dep_category, y=pct))+geom_bar(stat="identity", fill="green")+
  labs(title="Q25B Donated money online to political candidate or party", x="Frequency") +
  ylim(0,100)+
  geom_text(aes(label=pct), vjust=-0.1, colour="black")+
  facet_wrap(~ind_var, ncol=1)+
  theme_classic()

linear.regression.function.data("q25b.recode", survey.df)

## MODEL INFO:
## Observations: 9305
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.058
## Adj. R² = 0.056 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.090   -0.012    0.191    1.735   0.083
## race.varBlack                       0.002   -0.055    0.058    0.060   0.952
## race.varHispanic                    0.008   -0.052    0.067    0.257   0.797
## race.varOther                       0.039   -0.045    0.123    0.912   0.362
## POL_PARTYDEMOCRAT                   0.170    0.131    0.209    8.596   0.000
## POL_PARTYINDEPENDENT               -0.097   -0.150   -0.044   -3.609   0.000
## genderFemale                       -0.039   -0.072   -0.005   -2.242   0.025
## AGE_CAT35 TO 54                     0.024   -0.023    0.071    1.017   0.309
## AGE_CAT55+                          0.010   -0.038    0.059    0.413   0.679
## college.grad1                       0.116    0.079    0.153    6.126   0.000
## high_internet_use_1high             0.060   -0.024    0.145    1.396   0.163
## high_news_1high                     0.114    0.076    0.152    5.854   0.000
## Q32A small town or village          0.051    0.002    0.100    2.024   0.043
## Q32A large city                     0.152    0.093    0.210    5.052   0.000
## Q32A suburb of a large              0.073    0.023    0.123    2.875   0.004
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.409

q25c.data.race<-two_tab_func(survey.df, "race.var", "q25c.recode", "WEIGHT")%>%
  mutate(Wording = label_df$Wording[label_df$QTAG == "Q25C"])


ggplot(q25c.data.race, aes(x=dep_category, y=pct))+geom_bar(stat="identity", fill="green")+
  labs(title="Q25C Created, signed or shared an online petition", x="Frequency") +
  ylim(0,100)+
  geom_text(aes(label=pct), vjust=-0.1, colour="black")+
  facet_wrap(~ind_var, ncol=1)+
  theme_classic()

linear.regression.function.data("q25c.recode", survey.df)

## MODEL INFO:
## Observations: 9299
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.036
## Adj. R² = 0.035 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.351    0.247    0.455    6.616   0.000
## race.varBlack                       0.003   -0.059    0.065    0.102   0.918
## race.varHispanic                    0.054   -0.010    0.117    1.658   0.097
## race.varOther                       0.123    0.018    0.228    2.301   0.021
## POL_PARTYDEMOCRAT                   0.064    0.019    0.108    2.800   0.005
## POL_PARTYINDEPENDENT               -0.065   -0.138    0.007   -1.766   0.077
## genderFemale                        0.136    0.097    0.174    6.902   0.000
## AGE_CAT35 TO 54                    -0.029   -0.082    0.023   -1.089   0.276
## AGE_CAT55+                         -0.116   -0.173   -0.059   -4.001   0.000
## college.grad1                       0.033   -0.010    0.077    1.505   0.132
## high_internet_use_1high             0.315    0.235    0.395    7.724   0.000
## high_news_1high                     0.035   -0.008    0.078    1.599   0.110
## Q32A small town or village         -0.023   -0.084    0.038   -0.727   0.467
## Q32A large city                     0.015   -0.054    0.084    0.420   0.675
## Q32A suburb of a large             -0.013   -0.074    0.047   -0.428   0.669
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.547

q25d.data.race<-two_tab_func(survey.df, "race.var", "q25d.recode", "WEIGHT")%>%
  mutate(Wording = label_df$Wording[label_df$QTAG == "Q25D"])



ggplot(q25d.data.race, aes(x=dep_category, y=pct))+geom_bar(stat="identity", fill="green")+
  labs(title="Q25D Volunteered to help online with political candidate or cause ", x="Frequency") +
  ylim(0,100)+
  geom_text(aes(label=pct), vjust=-0.1, colour="black")+
  facet_wrap(~ind_var, ncol=1)+
  theme_classic()

# Fit the Q25D model on its own recoded variable
linear.regression.function.data("q25d.recode", survey.df)

q25e.data.race<-two_tab_func(survey.df, "race.var", "q25e.recode", "WEIGHT")%>%
  mutate(Wording = label_df$Wording[label_df$QTAG == "Q25E"])



ggplot(q25e.data.race, aes(x=dep_category, y=pct))+geom_bar(stat="identity", fill="green")+
  labs(title="Q25E Started a political or cause-related group on social media ", x="Frequency") +
  ylim(0,100)+
  geom_text(aes(label=pct), vjust=-0.1, colour="black")+
  facet_wrap(~ind_var, ncol=1)+
  theme_classic()

linear.regression.function.data("q25e.recode", survey.df)

## MODEL INFO:
## Observations: 9292
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.010
## Adj. R² = 0.008 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.192    0.111    0.272    4.668   0.000
## race.varBlack                       0.073    0.028    0.118    3.173   0.002
## race.varHispanic                    0.027   -0.014    0.068    1.290   0.197
## race.varOther                       0.084    0.012    0.156    2.294   0.022
## POL_PARTYDEMOCRAT                   0.001   -0.026    0.029    0.078   0.937
## POL_PARTYINDEPENDENT               -0.022   -0.062    0.019   -1.048   0.295
## genderFemale                        0.009   -0.016    0.033    0.681   0.496
## AGE_CAT35 TO 54                    -0.022   -0.057    0.013   -1.218   0.223
## AGE_CAT55+                         -0.068   -0.105   -0.031   -3.569   0.000
## college.grad1                      -0.002   -0.032    0.027   -0.165   0.869
## high_internet_use_1high            -0.084   -0.155   -0.014   -2.348   0.019
## high_news_1high                     0.023   -0.004    0.050    1.647   0.100
## Q32A small town or village          0.003   -0.033    0.039    0.173   0.863
## Q32A large city                     0.011   -0.031    0.053    0.504   0.615
## Q32A suburb of a large             -0.012   -0.046    0.023   -0.655   0.512
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.207

#Q25F Followed a politician on social media


linear.regression.function.data("q25f.recode", survey.df)

## MODEL INFO:
## Observations: 9295
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.038
## Adj. R² = 0.037 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.402    0.272    0.532    6.045   0.000
## race.varBlack                      -0.197   -0.280   -0.114   -4.656   0.000
## race.varHispanic                   -0.044   -0.130    0.041   -1.016   0.310
## race.varOther                       0.012   -0.119    0.143    0.183   0.855
## POL_PARTYDEMOCRAT                  -0.068   -0.132   -0.004   -2.073   0.038
## POL_PARTYINDEPENDENT               -0.352   -0.453   -0.251   -6.828   0.000
## genderFemale                        0.130    0.075    0.186    4.626   0.000
## AGE_CAT35 TO 54                     0.119    0.045    0.193    3.168   0.002
## AGE_CAT55+                         -0.002   -0.081    0.076   -0.055   0.956
## college.grad1                       0.083    0.024    0.142    2.759   0.006
## high_internet_use_1high             0.372    0.275    0.469    7.530   0.000
## high_news_1high                     0.287    0.227    0.347    9.382   0.000
## Q32A small town or village          0.020   -0.071    0.112    0.438   0.661
## Q32A large city                     0.081   -0.019    0.181    1.592   0.112
## Q32A suburb of a large              0.006   -0.084    0.095    0.124   0.902
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.246

#Q25G Shared your political opinion on social media

linear.regression.function.data("q25g.recode", survey.df)

## MODEL INFO:
## Observations: 9291
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.028
## Adj. R² = 0.026 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.431    0.299    0.564    6.377   0.000
## race.varBlack                      -0.154   -0.232   -0.075   -3.852   0.000
## race.varHispanic                   -0.051   -0.135    0.032   -1.204   0.228
## race.varOther                       0.122   -0.010    0.254    1.812   0.070
## POL_PARTYDEMOCRAT                   0.027   -0.035    0.089    0.845   0.398
## POL_PARTYINDEPENDENT               -0.259   -0.352   -0.166   -5.486   0.000
## genderFemale                        0.003   -0.049    0.056    0.124   0.901
## AGE_CAT35 TO 54                     0.068   -0.004    0.140    1.845   0.065
## AGE_CAT55+                         -0.069   -0.146    0.008   -1.765   0.078
## college.grad1                      -0.040   -0.099    0.019   -1.335   0.182
## high_internet_use_1high             0.442    0.345    0.539    8.892   0.000
## high_news_1high                     0.135    0.077    0.194    4.524   0.000
## Q32A small town or village         -0.012   -0.101    0.077   -0.273   0.785
## Q32A large city                     0.092   -0.007    0.190    1.830   0.067
## Q32A suburb of a large              0.018   -0.069    0.106    0.414   0.679
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.089

#Q25H Shared political information posted by others on social media


linear.regression.function.data("q25h.recode", survey.df)

## MODEL INFO:
## Observations: 9295
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.026
## Adj. R² = 0.024 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.477    0.343    0.612    6.957   0.000
## race.varBlack                      -0.091   -0.171   -0.011   -2.236   0.025
## race.varHispanic                    0.003   -0.085    0.090    0.064   0.949
## race.varOther                       0.080   -0.057    0.217    1.145   0.252
## POL_PARTYDEMOCRAT                  -0.092   -0.156   -0.027   -2.783   0.005
## POL_PARTYINDEPENDENT               -0.301   -0.398   -0.204   -6.061   0.000
## genderFemale                        0.080    0.026    0.133    2.913   0.004
## AGE_CAT35 TO 54                    -0.046   -0.121    0.030   -1.179   0.238
## AGE_CAT55+                         -0.149   -0.230   -0.068   -3.613   0.000
## college.grad1                      -0.047   -0.107    0.014   -1.514   0.130
## high_internet_use_1high             0.404    0.304    0.503    7.956   0.000
## high_news_1high                     0.133    0.073    0.194    4.344   0.000
## Q32A small town or village         -0.006   -0.094    0.082   -0.143   0.887
## Q32A large city                     0.134    0.036    0.233    2.675   0.007
## Q32A suburb of a large             -0.002   -0.089    0.085   -0.047   0.963
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.101

#Q25I Used a political hashtag

linear.regression.function.data("q25i.recode", survey.df)

## MODEL INFO:
## Observations: 9285
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.033
## Adj. R² = 0.032 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.213    0.125    0.301    4.743   0.000
## race.varBlack                       0.102    0.045    0.160    3.481   0.001
## race.varHispanic                    0.052   -0.004    0.109    1.807   0.071
## race.varOther                       0.040   -0.032    0.112    1.084   0.278
## POL_PARTYDEMOCRAT                   0.032   -0.006    0.069    1.662   0.097
## POL_PARTYINDEPENDENT               -0.090   -0.141   -0.039   -3.466   0.001
## genderFemale                        0.026   -0.006    0.059    1.601   0.109
## AGE_CAT35 TO 54                    -0.024   -0.072    0.024   -0.997   0.319
## AGE_CAT55+                         -0.201   -0.249   -0.153   -8.136   0.000
## college.grad1                      -0.022   -0.059    0.015   -1.161   0.246
## high_internet_use_1high             0.022   -0.047    0.091    0.621   0.535
## high_news_1high                     0.076    0.039    0.113    4.070   0.000
## Q32A small town or village         -0.014   -0.064    0.037   -0.539   0.590
## Q32A large city                     0.038   -0.021    0.096    1.260   0.208
## Q32A suburb of a large              0.010   -0.042    0.061    0.372   0.710
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.365

#Q25J Liked a post about politics on social media

linear.regression.function.data("q25j.recode", survey.df)

## MODEL INFO:
## Observations: 9303
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.062
## Adj. R² = 0.061 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.735    0.584    0.886    9.556   0.000
## race.varBlack                      -0.222   -0.318   -0.126   -4.543   0.000
## race.varHispanic                   -0.106   -0.209   -0.002   -2.005   0.045
## race.varOther                       0.134   -0.032    0.300    1.577   0.115
## POL_PARTYDEMOCRAT                   0.006   -0.070    0.081    0.150   0.881
## POL_PARTYINDEPENDENT               -0.334   -0.455   -0.213   -5.417   0.000
## genderFemale                        0.192    0.128    0.256    5.902   0.000
## AGE_CAT35 TO 54                    -0.175   -0.266   -0.084   -3.773   0.000
## AGE_CAT55+                         -0.439   -0.534   -0.344   -9.084   0.000
## college.grad1                      -0.007   -0.078    0.065   -0.182   0.856
## high_internet_use_1high             0.612    0.508    0.716   11.533   0.000
## high_news_1high                     0.242    0.171    0.314    6.636   0.000
## Q32A small town or village         -0.007   -0.108    0.094   -0.136   0.892
## Q32A large city                     0.136    0.025    0.248    2.395   0.017
## Q32A suburb of a large              0.002   -0.098    0.101    0.032   0.974
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.545

Alternatively, one could argue that these regressions should be run only on respondents who did not say the situation was “not applicable/never use it”. For many of these questions, this means losing up to around 10% of respondents; for Q25A, for example, the sample falls from 9,309 to 8,748 observations, or about 6%. The regression results for those items are below, but not much has meaningfully changed.

We first recode the data. Results follow below.
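
The recode itself is not shown in this report. As a sketch under assumptions, one way to build such a variable is to map the “not applicable” response to `NA`, so that those respondents drop out of the model under R's default `na.action`. The function name `recode_frequency_applicable`, the response labels, and the 0 to 4 coding are hypothetical.

```r
# Assumed alternative coding: "Not applicable" becomes NA so those
# respondents are excluded from the model; "Never" stays at 0.
recode_frequency_applicable <- function(x) {
  scale_map <- c(
    "Daily" = 4, "Weekly" = 3, "A few times a month" = 2,
    "Rarely" = 1, "Never" = 0, "Not applicable" = NA
  )
  unname(scale_map[as.character(x)])
}

# Toy responses, for illustration only
answers <- c("Daily", "Rarely", "Not applicable", "Never")
recoded <- recode_frequency_applicable(answers)

# Number of respondents who would remain in the regression
n_used <- sum(!is.na(recoded))
print(recoded)
print(n_used)
```

The smaller observation counts in the outputs that follow are consistent with some respondents being excluded in this way.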

Regression results on new variables.

#Q25A Sent an email or a social media post to a national, state, or local government official

linear.regression.function.data("q25a.recode2", survey.df)

## MODEL INFO:
## Observations: 8748
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.007
## Adj. R² = 0.006 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.572    0.416    0.728    7.182   0.000
## race.varBlack                       0.042   -0.044    0.128    0.949   0.343
## race.varHispanic                    0.090    0.002    0.178    2.000   0.046
## race.varOther                       0.035   -0.068    0.138    0.668   0.504
## POL_PARTYDEMOCRAT                  -0.023   -0.080    0.034   -0.792   0.429
## POL_PARTYINDEPENDENT               -0.107   -0.199   -0.014   -2.254   0.024
## genderFemale                       -0.028   -0.077    0.021   -1.107   0.268
## AGE_CAT35 TO 54                     0.068   -0.002    0.137    1.916   0.055
## AGE_CAT55+                          0.051   -0.022    0.125    1.364   0.173
## college.grad1                       0.081    0.024    0.137    2.781   0.005
## high_internet_use_1high             0.059   -0.073    0.190    0.870   0.384
## high_news_1high                     0.093    0.035    0.150    3.163   0.002
## Q32A small town or village         -0.007   -0.088    0.073   -0.182   0.856
## Q32A large city                    -0.011   -0.099    0.076   -0.255   0.799
## Q32A suburb of a large             -0.041   -0.119    0.037   -1.027   0.304
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.865

#Q25B Donated money online or via text message to a political candidate, party, or issue

linear.regression.function.data("q25b.recode2", survey.df)

## MODEL INFO:
## Observations: 8667
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.056
## Adj. R² = 0.055 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.160    0.042    0.278    2.652   0.008
## race.varBlack                       0.010   -0.050    0.070    0.338   0.735
## race.varHispanic                    0.022   -0.042    0.086    0.683   0.495
## race.varOther                       0.059   -0.030    0.147    1.296   0.195
## POL_PARTYDEMOCRAT                   0.175    0.134    0.216    8.316   0.000
## POL_PARTYINDEPENDENT               -0.104   -0.161   -0.047   -3.593   0.000
## genderFemale                       -0.033   -0.068    0.003   -1.804   0.071
## AGE_CAT35 TO 54                     0.025   -0.024    0.074    0.986   0.324
## AGE_CAT55+                          0.012   -0.039    0.063    0.466   0.641
## college.grad1                       0.111    0.072    0.150    5.607   0.000
## high_internet_use_1high             0.002   -0.101    0.105    0.045   0.964
## high_news_1high                     0.114    0.074    0.155    5.542   0.000
## Q32A small town or village          0.055    0.002    0.107    2.045   0.041
## Q32A large city                     0.157    0.095    0.220    4.929   0.000
## Q32A suburb of a large              0.076    0.023    0.129    2.830   0.005
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.427

#Q25C Created, shared, or signed an online petition

linear.regression.function.data("q25c.recode2", survey.df)

## MODEL INFO:
## Observations: 8869
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.031
## Adj. R² = 0.030 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.429    0.316    0.543    7.396   0.000
## race.varBlack                       0.021   -0.043    0.085    0.642   0.521
## race.varHispanic                    0.072    0.007    0.138    2.176   0.030
## race.varOther                       0.128    0.022    0.235    2.365   0.018
## POL_PARTYDEMOCRAT                   0.063    0.018    0.109    2.716   0.007
## POL_PARTYINDEPENDENT               -0.059   -0.134    0.016   -1.532   0.126
## genderFemale                        0.145    0.105    0.184    7.198   0.000
## AGE_CAT35 TO 54                    -0.030   -0.084    0.023   -1.123   0.262
## AGE_CAT55+                         -0.112   -0.170   -0.054   -3.803   0.000
## college.grad1                       0.018   -0.026    0.063    0.806   0.420
## high_internet_use_1high             0.265    0.173    0.356    5.682   0.000
## high_news_1high                     0.029   -0.015    0.073    1.301   0.193
## Q32A small town or village         -0.023   -0.086    0.040   -0.709   0.479
## Q32A large city                     0.012   -0.060    0.083    0.322   0.747
## Q32A suburb of a large             -0.016   -0.078    0.046   -0.495   0.620
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.551
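
As a reading aid for the tables in this section, the sketch below shows how the columns of a single row relate to one another. It takes the race.varHispanic row of the Q25C output above (Est. = 0.072, t val. = 2.176), backs out the implied robust standard error, and rebuilds an approximate 95% interval. The variable names (est, t_val, se, ci) are illustrative and not part of the original analysis. The normal approximation via qnorm() is an assumption: the reported bounds presumably come from the fitted model itself, so small differences due to rounding are expected.

```r
# Values copied from the race.varHispanic row of the Q25C table
est   <- 0.072   # Est. column
t_val <- 2.176   # t val. column

# The t value is the estimate divided by its standard error,
# so the implied (robust) standard error is est / t_val
se <- est / t_val

# Approximate 95% interval using the normal quantile (an assumption;
# the report's own bounds may use a t reference distribution)
ci <- est + c(-1, 1) * qnorm(0.975) * se

print(round(se, 3))
print(round(ci, 3))
```

The same arithmetic applies to any row. An interval that excludes zero corresponds to p below 0.05, which is why race.varHispanic (0.007 to 0.138) is significant in this model while race.varBlack (-0.043 to 0.085) is not.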

#Q25D Volunteered to help online with a political cause or a candidate’s campaign

linear.regression.function.data("q25d.recode2", survey.df)

## MODEL INFO:
## Observations: 8692
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.033
## Adj. R² = 0.031 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.241    0.122    0.360    3.973   0.000
## race.varBlack                       0.053   -0.002    0.108    1.896   0.058
## race.varHispanic                    0.024   -0.035    0.083    0.791   0.429
## race.varOther                       0.055   -0.014    0.124    1.553   0.120
## POL_PARTYDEMOCRAT                   0.109    0.073    0.145    5.883   0.000
## POL_PARTYINDEPENDENT               -0.030   -0.077    0.016   -1.265   0.206
## genderFemale                       -0.008   -0.040    0.024   -0.503   0.615
## AGE_CAT35 TO 54                    -0.019   -0.064    0.026   -0.844   0.399
## AGE_CAT55+                         -0.067   -0.117   -0.017   -2.633   0.008
## college.grad1                       0.054    0.016    0.092    2.777   0.006
## high_internet_use_1high            -0.131   -0.234   -0.028   -2.491   0.013
## high_news_1high                     0.065    0.030    0.101    3.605   0.000
## Q32A small town or village          0.024   -0.024    0.073    0.973   0.331
## Q32A large city                     0.085    0.024    0.145    2.749   0.006
## Q32A suburb of a large              0.020   -0.030    0.070    0.797   0.425
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.323

#Q25E Started a political or cause-related group on social media

linear.regression.function.data("q25e.recode2", survey.df)

## MODEL INFO:
## Observations: 8453
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.012
## Adj. R² = 0.010 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.244    0.147    0.342    4.909   0.000
## race.varBlack                       0.081    0.032    0.131    3.219   0.001
## race.varHispanic                    0.032   -0.013    0.078    1.386   0.166
## race.varOther                       0.093    0.015    0.172    2.344   0.019
## POL_PARTYDEMOCRAT                   0.001   -0.030    0.031    0.048   0.962
## POL_PARTYINDEPENDENT               -0.021   -0.066    0.024   -0.905   0.366
## genderFemale                        0.011   -0.016    0.038    0.771   0.441
## AGE_CAT35 TO 54                    -0.024   -0.062    0.015   -1.209   0.227
## AGE_CAT55+                         -0.074   -0.114   -0.034   -3.591   0.000
## college.grad1                      -0.007   -0.039    0.025   -0.407   0.684
## high_internet_use_1high            -0.126   -0.214   -0.038   -2.800   0.005
## high_news_1high                     0.023   -0.007    0.053    1.517   0.129
## Q32A small town or village          0.004   -0.036    0.044    0.192   0.847
## Q32A large city                     0.011   -0.035    0.058    0.486   0.627
## Q32A suburb of a large             -0.012   -0.050    0.026   -0.622   0.534
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.225

#Q25F Followed a politician on social media

linear.regression.function.data("q25f.recode2", survey.df)

## MODEL INFO:
## Observations: 8673
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.035
## Adj. R² = 0.033 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.556    0.413    0.700    7.605   0.000
## race.varBlack                      -0.201   -0.288   -0.114   -4.537   0.000
## race.varHispanic                   -0.034   -0.123    0.055   -0.745   0.457
## race.varOther                       0.012   -0.122    0.147    0.180   0.857
## POL_PARTYDEMOCRAT                  -0.071   -0.137   -0.005   -2.097   0.036
## POL_PARTYINDEPENDENT               -0.348   -0.456   -0.240   -6.294   0.000
## genderFemale                        0.130    0.072    0.187    4.432   0.000
## AGE_CAT35 TO 54                     0.117    0.041    0.193    3.027   0.002
## AGE_CAT55+                         -0.012   -0.094    0.070   -0.287   0.774
## college.grad1                       0.060   -0.001    0.121    1.929   0.054
## high_internet_use_1high             0.289    0.178    0.400    5.095   0.000
## high_news_1high                     0.304    0.242    0.366    9.570   0.000
## Q32A small town or village          0.009   -0.087    0.105    0.178   0.859
## Q32A large city                     0.071   -0.033    0.176    1.340   0.180
## Q32A suburb of a large              0.002   -0.092    0.096    0.038   0.970
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.265

#Q25G Shared your political opinion on social media

linear.regression.function.data("q25g.recode2", survey.df)

## MODEL INFO:
## Observations: 8619
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.024
## Adj. R² = 0.023 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.553    0.405    0.700    7.334   0.000
## race.varBlack                      -0.150   -0.232   -0.068   -3.589   0.000
## race.varHispanic                   -0.046   -0.134    0.041   -1.034   0.301
## race.varOther                       0.124   -0.012    0.259    1.790   0.073
## POL_PARTYDEMOCRAT                   0.022   -0.043    0.088    0.674   0.500
## POL_PARTYINDEPENDENT               -0.256   -0.355   -0.158   -5.084   0.000
## genderFemale                       -0.006   -0.061    0.049   -0.208   0.835
## AGE_CAT35 TO 54                     0.062   -0.012    0.137    1.644   0.100
## AGE_CAT55+                         -0.078   -0.157    0.002   -1.907   0.057
## college.grad1                      -0.063   -0.124   -0.002   -2.023   0.043
## high_internet_use_1high             0.395    0.280    0.509    6.781   0.000
## high_news_1high                     0.150    0.089    0.211    4.827   0.000
## Q32A small town or village         -0.027   -0.121    0.067   -0.566   0.571
## Q32A large city                     0.087   -0.016    0.191    1.655   0.098
## Q32A suburb of a large              0.016   -0.076    0.108    0.349   0.727
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.116

#Q25H Shared political information posted by others on social media

linear.regression.function.data("q25h.recode2", survey.df)

## MODEL INFO:
## Observations: 8610
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.023
## Adj. R² = 0.022 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.607    0.455    0.758    7.840   0.000
## race.varBlack                      -0.091   -0.175   -0.007   -2.126   0.034
## race.varHispanic                    0.016   -0.076    0.108    0.346   0.730
## race.varOther                       0.082   -0.060    0.225    1.136   0.256
## POL_PARTYDEMOCRAT                  -0.103   -0.171   -0.035   -2.971   0.003
## POL_PARTYINDEPENDENT               -0.306   -0.410   -0.202   -5.756   0.000
## genderFemale                        0.078    0.022    0.134    2.735   0.006
## AGE_CAT35 TO 54                    -0.058   -0.136    0.021   -1.441   0.150
## AGE_CAT55+                         -0.165   -0.250   -0.081   -3.827   0.000
## college.grad1                      -0.077   -0.140   -0.013   -2.368   0.018
## high_internet_use_1high             0.357    0.239    0.475    5.926   0.000
## high_news_1high                     0.147    0.084    0.210    4.576   0.000
## Q32A small town or village         -0.023   -0.117    0.070   -0.492   0.622
## Q32A large city                     0.130    0.026    0.235    2.449   0.014
## Q32A suburb of a large             -0.008   -0.101    0.084   -0.173   0.863
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.136

#Q25I Used a political hashtag

linear.regression.function.data("q25i.recode2", survey.df)

## MODEL INFO:
## Observations: 8119
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.033
## Adj. R² = 0.031 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.280    0.171    0.389    5.048   0.000
## race.varBlack                       0.112    0.049    0.175    3.467   0.001
## race.varHispanic                    0.057   -0.006    0.120    1.785   0.074
## race.varOther                       0.050   -0.030    0.129    1.231   0.218
## POL_PARTYDEMOCRAT                   0.032   -0.010    0.074    1.477   0.140
## POL_PARTYINDEPENDENT               -0.096   -0.154   -0.037   -3.222   0.001
## genderFemale                        0.031   -0.005    0.067    1.690   0.091
## AGE_CAT35 TO 54                    -0.026   -0.078    0.026   -0.993   0.321
## AGE_CAT55+                         -0.217   -0.270   -0.163   -7.899   0.000
## college.grad1                      -0.038   -0.080    0.003   -1.815   0.070
## high_internet_use_1high            -0.014   -0.106    0.077   -0.311   0.756
## high_news_1high                     0.084    0.043    0.125    4.044   0.000
## Q32A small town or village         -0.017   -0.075    0.040   -0.592   0.554
## Q32A large city                     0.037   -0.029    0.103    1.097   0.272
## Q32A suburb of a large              0.010   -0.049    0.069    0.328   0.743
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 0.409

#Q25J Liked a post about politics on social media

linear.regression.function.data("q25j.recode2", survey.df)

## MODEL INFO:
## Observations: 8595
## Dependent Variable: ind.var
## Type: Survey-weighted linear regression 
## 
## MODEL FIT:
## R² = 0.057
## Adj. R² = 0.055 
## 
## Standard errors: Robust
## ----------------------------------------------------------------------------
##                                      Est.     2.5%    97.5%   t val.       p
## -------------------------------- -------- -------- -------- -------- -------
## (Intercept)                         0.926    0.760    1.091   10.958   0.000
## race.varBlack                      -0.220   -0.320   -0.120   -4.312   0.000
## race.varHispanic                   -0.108   -0.215   -0.001   -1.976   0.048
## race.varOther                       0.112   -0.056    0.280    1.301   0.193
## POL_PARTYDEMOCRAT                  -0.007   -0.085    0.071   -0.175   0.861
## POL_PARTYINDEPENDENT               -0.324   -0.452   -0.196   -4.964   0.000
## genderFemale                        0.185    0.119    0.251    5.496   0.000
## AGE_CAT35 TO 54                    -0.198   -0.291   -0.106   -4.205   0.000
## AGE_CAT55+                         -0.475   -0.572   -0.378   -9.606   0.000
## college.grad1                      -0.049   -0.122    0.024   -1.307   0.191
## high_internet_use_1high             0.543    0.422    0.665    8.759   0.000
## high_news_1high                     0.271    0.198    0.345    7.227   0.000
## Q32A small town or village         -0.019   -0.124    0.086   -0.357   0.721
## Q32A large city                     0.128    0.012    0.244    2.161   0.031
## Q32A suburb of a large              0.001   -0.102    0.105    0.024   0.981
## city                                                                        
## ----------------------------------------------------------------------------
## 
## Estimated dispersion parameter = 1.548

END

Knight Regression Markdown

Andrew Dugan

01/11/2021