Introduction

I am interested in how politically engaged Republicans and Democrats are, and whether the two parties differ. I hypothesize that Democrats have recently been more politically active because they are more motivated to remove the current administration from office. I will investigate this by comparing Republicans and Democrats on voter registration, turnout in the 2018 general election, voting in the 2018 primaries, interest in political news, and feelings towards each party.

Variables and Data Prep

I will determine which political party each voter identifies with using the pid3_2019 variable, keeping only Democrats and Republicans. I will use votereg_2019 to determine whether someone was registered to vote in 2019, turnout18post_2019 to see whether someone voted in the 2018 general election, and tsmart_P2018_party_2019 to see which party's primary someone voted in during the 2018 primaries. I will later recode this last variable to show only whether or not someone voted, so that any difference in primary turnout between the two parties is easier to see. Political interest will be determined using the newsint_2019 variable. Feelings towards Democrats and Republicans will be taken from the Democrats_2019 and Republicans_2019 variables, respectively, with values above 100 treated as missing. Respondents with a missing value on any of these variables are dropped.

Importing Data

library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
voterdata19<-read.csv("/Users/Nazija/Desktop/airdrop/Voter Data 2019.csv")

Recoding and Selecting Variables

data <- (voterdata19)%>%
  mutate(Party = ifelse(pid3_2019 == 1, "Democrat",
                 ifelse(pid3_2019 == 2, "Republican",NA)),
         Reg2019 = ifelse(votereg_2019 == 1, "Registered",
                   ifelse(votereg_2019 == 2, "Not Registered", NA)),
         Voted2018 = ifelse(turnout18post_2019 == 1, "Yes",
                     ifelse(turnout18post_2019 == 2, "No",NA)),
         Prim2018Party = ifelse(tsmart_P2018_party_2019 == 1, "Democratic",
                         ifelse(tsmart_P2018_party_2019 == 2, "Republican",
                         ifelse(tsmart_P2018_party_2019 == 98, "Did not vote",
                         ifelse(tsmart_P2018_party_2019 == 99, "Did not vote", NA)))),
         PoliticalInterest = ifelse(newsint_2019 == 1, "Most of the time",
                             ifelse(newsint_2019 == 2, "Some of the time",
                             ifelse(newsint_2019 == 3, "Only now and then",
                             ifelse(newsint_2019 == 4, "Hardly at all", NA)))),
         ft_Dems = ifelse(Democrats_2019 > 100, NA, Democrats_2019),
         ft_Reps = ifelse(Republicans_2019 > 100, NA, Republicans_2019))%>%
  select(Party, Reg2019, Voted2018, Prim2018Party, PoliticalInterest, ft_Dems, ft_Reps)

# Keep only respondents with no missing value on any selected variable
data <- na.omit(data)
head(data)
##         Party    Reg2019 Voted2018 Prim2018Party PoliticalInterest ft_Dems
## 5  Republican Registered       Yes  Did not vote  Most of the time      31
## 7    Democrat Registered       Yes  Did not vote  Most of the time      53
## 10   Democrat Registered       Yes  Did not vote  Most of the time      97
## 14   Democrat Registered       Yes  Did not vote  Most of the time      95
## 17   Democrat Registered       Yes  Did not vote  Most of the time      73
## 18 Republican Registered       Yes    Republican  Some of the time       0
##    ft_Reps
## 5       74
## 7        5
## 10       0
## 14       0
## 17       1
## 18      52
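
The nested ifelse() calls above map numeric survey codes to labels. As a hedged alternative, the same kind of mapping can be sketched with a named lookup vector in base R. The input vector codes below is made up for illustration, and the code-to-label mapping is copied from the recode above rather than from the survey codebook.

```r
# Hypothetical raw codes standing in for newsint_2019 (not real survey data)
codes <- c(1, 3, 2, 7, 4, NA)

# Mapping copied from the recode above; names are the numeric codes as strings
interest_labels <- c("1" = "Most of the time",
                     "2" = "Some of the time",
                     "3" = "Only now and then",
                     "4" = "Hardly at all")

# Index the lookup by code; codes with no entry (7 here) and NA both become NA
interest <- unname(interest_labels[as.character(codes)])
print(interest)
```

A lookup vector keeps the mapping in one place, which may be easier to check against a codebook than several levels of nested ifelse().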

Analysis

Political Party & Voter Registration

Crosstab

table(data$Party, data$Reg2019)%>%
  prop.table(1)%>%
  round(2)
##             
##              Not Registered Registered
##   Democrat             0.02       0.98
##   Republican           0.01       0.99

Both parties have nearly the same share of registered voters: 98% of Democrats and 99% of Republicans.

Visualization

data%>%
  group_by(Party, Reg2019)%>%
  summarize(n = n())%>%
  mutate(percentage = n/sum(n))%>%
  ggplot()+
  geom_col(aes(x = Party, y = percentage, fill = Reg2019))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)

Analysis

chisq.test(data$Party, data$Reg2019)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  data$Party and data$Reg2019
## X-squared = 0.20857, df = 1, p-value = 0.6479

Registration rates barely differ between the two parties, and the difference is not statistically significant: the p-value (0.65) is well above .05.
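
To make the test more concrete, here is a small sketch of what chisq.test() compares: the observed counts against the counts expected if party and registration were independent. The counts below are invented for illustration and are not the survey counts.

```r
# Hypothetical counts (NOT the survey data): rows are parties, columns are registration status
observed <- matrix(c(20, 980,
                     10, 990),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Party = c("Democrat", "Republican"),
                                   Reg2019 = c("Not Registered", "Registered")))

# Expected count per cell under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

result <- chisq.test(observed)

# chisq.test() stores the same expected counts it tests against
print(expected)
print(result$expected)
print(result$p.value)
```

When one category is as rare as "Not Registered" is here, it is worth looking at result$expected, since the chi-squared approximation is usually considered less reliable when expected counts are small.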

Political Party & Voting in 2018 General Election

Crosstab

table(data$Party, data$Voted2018)%>%
  prop.table(1)%>%
  round(2)
##             
##                No  Yes
##   Democrat   0.06 0.94
##   Republican 0.07 0.93

Nearly the same percentage of Democrats (94%) and Republicans (93%) voted in the 2018 general election.

Visualization

data%>%
  group_by(Party, Voted2018)%>%
  summarize(n = n())%>%
  mutate(percentage = n/sum(n))%>%
  ggplot()+
  geom_col(aes(x = Party, y = percentage, fill = Voted2018))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)

Analysis

chisq.test(data$Party, data$Voted2018)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  data$Party and data$Voted2018
## X-squared = 0.65961, df = 1, p-value = 0.4167

The share of Democrats and Republicans who voted in the 2018 general election differs very little. The p-value (0.42) is above .05, so the difference is not statistically significant.

Political Party & Voting in 2018 Primaries

Crosstab

table(data$Party, data$Prim2018Party)%>%
  prop.table(1)%>%
  round(2)
##             
##              Democratic Did not vote Republican
##   Democrat         0.22         0.77       0.01
##   Republican       0.01         0.77       0.22

Members of both parties voted in their own party's primary far more often than in the other party's. The split is the same on both sides: 22% voted in their own party's primary and 1% in the other party's. The majority of both parties, 77%, did not vote in the 2018 primaries.

Visualization

data%>%
  group_by(Party, Prim2018Party)%>%
  summarize(n = n())%>%
  mutate(percentage = n/sum(n))%>%
  ggplot()+
  geom_col(aes(x = Party, y = percentage, fill = Prim2018Party))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)

Analysis

chisq.test(data$Party, data$Prim2018Party)
## 
##  Pearson's Chi-squared test
## 
## data:  data$Party and data$Prim2018Party
## X-squared = 696.51, df = 2, p-value < 2.2e-16

This shows a statistically significant relationship between one’s party and how one voted in the 2018 primaries, which is expected, since Democrats favored the Democratic primary and Republicans favored the Republican primary. It says nothing about turnout, though. However, if we recode the variable to just show whether or not someone voted:

Voted or Did Not Vote

Primaries18 <- (data)%>%
  mutate(Prim2018 = ifelse(Prim2018Party == "Republican", "Voted",
                         ifelse(Prim2018Party == "Democratic", "Voted",Prim2018Party)))

table(Primaries18$Party, Primaries18$Prim2018)%>%
  prop.table(1)%>%
  round(2)
##             
##              Did not vote Voted
##   Democrat           0.77  0.23
##   Republican         0.77  0.23

Primaries18%>%
  group_by(Party, Prim2018)%>%
  summarize(n = n())%>%
  mutate(percentage = n/sum(n))%>%
  ggplot()+
  geom_col(aes(x = Party, y = percentage, fill = Prim2018))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)

chisq.test(Primaries18$Party, Primaries18$Prim2018)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  Primaries18$Party and Primaries18$Prim2018
## X-squared = 0.10525, df = 1, p-value = 0.7456

Primary turnout is the same in both parties: about 77% did not vote in the primary election and about 23% did. The p-value (0.75) is above .05, so there is no statistically significant difference. Together with the earlier crosstab, this means the same percentage of people in each party voted in the primary of the party they align with. Therefore, people from both parties seem to be politically engaged at the same level when it comes to the 2018 primaries.
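
The two chi-squared results differ because collapsing the categories removes the information about which primary a person voted in and leaves only turnout. A minimal sketch with hypothetical counts (1,000 per party, loosely shaped like the proportions reported above, not the actual survey counts) shows the same pattern:

```r
# Hypothetical counts per party (NOT the survey counts)
primary <- matrix(c(220, 770,  10,
                     10, 770, 220),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("Democrat", "Republican"),
                                  c("Democratic", "Did not vote", "Republican")))

# Three categories: the test picks up which primary each party voted in
p_three <- chisq.test(primary)$p.value

# Collapse both "voted" columns into one, leaving only turnout
turnout <- cbind("Did not vote" = primary[, "Did not vote"],
                 "Voted" = primary[, "Democratic"] + primary[, "Republican"])
p_two <- chisq.test(turnout)$p.value

print(p_three)
print(p_two)
```

In this toy table the first test is driven entirely by the choice of primary, so once that choice is collapsed into a single "Voted" column the association disappears.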

Political Party & Political Interest

Crosstab

table(data$Party, data$PoliticalInterest)%>%
  prop.table(1)%>%
  round(2)
##             
##              Hardly at all Most of the time Only now and then Some of the time
##   Democrat            0.04             0.61              0.10             0.25
##   Republican          0.03             0.64              0.09             0.25

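At each level of political interest, the percentages for the two parties are within three points of each other. The largest gap is in "Most of the time" (61% of Democrats versus 64% of Republicans).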
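
table() orders character values alphabetically, which is why the interest levels above are not listed in order of intensity. One possible fix, sketched here on a made-up vector, is to convert the labels to a factor with explicit levels before tabulating.

```r
# Made-up responses for illustration (not the survey data)
interest <- c("Some of the time", "Most of the time", "Hardly at all",
              "Most of the time", "Only now and then", "Most of the time")

# Levels listed from most to least interested, using the labels from the recode
interest_levels <- c("Most of the time", "Some of the time",
                     "Only now and then", "Hardly at all")
interest_f <- factor(interest, levels = interest_levels)

# Entries now follow the level order instead of alphabetical order
counts <- table(interest_f)
print(counts)
```

The same factor ordering would also carry over to the fill legend in a ggplot2 bar chart.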

Visualization

data%>%
  group_by(Party, PoliticalInterest)%>%
  summarize(n = n())%>%
  mutate(percentage = n/sum(n))%>%
  ggplot()+
  geom_col(aes(x = Party, y = percentage, fill = PoliticalInterest))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)

Analysis

chisq.test(data$Party, data$PoliticalInterest)
## 
##  Pearson's Chi-squared test
## 
## data:  data$Party and data$PoliticalInterest
## X-squared = 3.9349, df = 3, p-value = 0.2686

There seems to be very little difference in how closely people from either party kept up with political news. The p-value (0.27) is higher than .05, so whatever difference exists is not statistically significant.

Political Party & Feelings Towards Democrats

Comparing Means

data%>%
  group_by(Party)%>%
  summarize(avg = mean(ft_Dems))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 2 x 2
##   Party        avg
##   <chr>      <dbl>
## 1 Democrat    79.2
## 2 Republican  13.3

Democrats give Democrats a much higher mean rating (79.2) than Republicans do (13.3).

data%>%
  group_by(Party)%>%
  summarize(avg = mean(ft_Dems))%>%
  ggplot()+
  geom_col(aes(x = Party, y = avg, fill = Party))+
  geom_label(aes(x = Party, y = avg, label = round(avg)))
## `summarise()` ungrouping output (override with `.groups` argument)

Sampling Distribution

Dem_data<-data%>%
  filter(Party == "Democrat")
Rep_data<-data%>%
  filter(Party == "Republican")

Dem_sampling<-replicate(10000,
          sample(Dem_data$ft_Dems, 40)%>%
            mean())%>%
  data.frame()%>%
  rename("mean" = 1)

Rep_sampling<-replicate(10000,
          sample(Rep_data$ft_Dems, 40)%>%
            mean())%>%
  data.frame()%>%
  rename("mean" = 1)

ggplot()+
  geom_histogram(data = Dem_sampling, aes(x = mean), fill = "blue")+ 
  geom_histogram(data = Rep_sampling, aes(x = mean), fill = "red")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

There is no overlap between the two sampling distributions. Both are narrow and about the same width. The Democrats’ sample means sit far to the right of the Republicans’ sample means, meaning they are much higher.
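
The narrow spread of both histograms is roughly what the standard error predicts: means of samples of size 40 vary by about the standard deviation of the underlying ratings divided by sqrt(40). A minimal sketch with a simulated 0 to 100 rating vector (not the survey data) illustrates this. It uses fewer replicates than the analysis above to keep it quick, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in for a thermometer rating column (NOT the survey data)
ratings <- sample(0:100, 2000, replace = TRUE)

# Means of 5,000 samples of size 40, drawn without replacement as in the analysis
sample_means <- replicate(5000, mean(sample(ratings, 40)))

# Spread of the sample means vs. the standard-error approximation sd / sqrt(n)
observed_se <- sd(sample_means)
approx_se <- sd(ratings) / sqrt(40)

print(c(observed = observed_se, approx = approx_se))
```

The two numbers should be close but not identical, partly because of simulation noise and partly because sampling without replacement from a finite vector shrinks the spread slightly.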

T-test

t.test(ft_Dems~Party, data = data)
## 
##  Welch Two Sample t-test
## 
## data:  ft_Dems by Party
## t = 101.12, df = 3427.8, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  64.66106 67.21810
## sample estimates:
##   mean in group Democrat mean in group Republican 
##                 79.24529                 13.30572

There is a statistically significant relationship between one’s party and how one feels towards Democrats (p < 2.2e-16). Democrats rate Democrats far higher than Republicans do: the 95% confidence interval puts the difference in mean ratings between about 64.7 and 67.2 points.
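
For reference, the Welch statistic reported by t.test() is the difference in group means divided by an unpooled standard error, that is, the square root of each group's variance over its sample size, summed. The sketch below checks this on two small invented vectors, which are not survey values.

```r
# Invented ratings for two groups (NOT the survey data)
dem_ratings <- c(80, 95, 70, 88, 60, 92)
rep_ratings <- c(10, 25, 5, 30, 15)

# Welch t: difference in means over the unpooled standard error
se <- sqrt(var(dem_ratings) / length(dem_ratings) +
           var(rep_ratings) / length(rep_ratings))
t_manual <- (mean(dem_ratings) - mean(rep_ratings)) / se

# t.test() defaults to the Welch (unequal variance) version
t_builtin <- unname(t.test(dem_ratings, rep_ratings)$statistic)

print(c(manual = t_manual, builtin = t_builtin))
```

Because the standard error shrinks as the groups get larger, a large sample combined with a large gap in means produces a very large t value, as in the output above.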

Political Party & Feelings Towards Republicans

Comparing Means

data%>%
  group_by(Party)%>%
  summarize(avg = mean(ft_Reps))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 2 x 2
##   Party        avg
##   <chr>      <dbl>
## 1 Democrat    15.1
## 2 Republican  73.7

Democrats give Republicans a much lower mean rating (15.1) than Republicans do (73.7).

data%>%
  group_by(Party)%>%
  summarize(avg = mean(ft_Reps))%>%
  ggplot()+
  geom_col(aes(x = Party, y = avg, fill = Party))+
  geom_label(aes(x = Party, y = avg, label = round(avg)))
## `summarise()` ungrouping output (override with `.groups` argument)

Sampling Distribution

Dem_data<-data%>%
  filter(Party == "Democrat")
Rep_data<-data%>%
  filter(Party == "Republican")

Dem_sampling<-replicate(10000,
          sample(Dem_data$ft_Reps, 40)%>%
            mean())%>%
  data.frame()%>%
  rename("mean" = 1)

Rep_sampling<-replicate(10000,
          sample(Rep_data$ft_Reps, 40)%>%
            mean())%>%
  data.frame()%>%
  rename("mean" = 1)

ggplot()+
  geom_histogram(data = Dem_sampling, aes(x = mean), fill = "blue")+ 
  geom_histogram(data = Rep_sampling, aes(x = mean), fill = "red")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

There is no overlap between the two sampling distributions. The plot resembles the ft_Dems sampling distributions mirrored: here the Republicans’ sample means sit far to the right of the Democrats’.
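
Because the resampling code is repeated for each rating and each party, it could be wrapped in a small helper. This is only a sketch: the function name sampling_means and its arguments are hypothetical, and the example call uses a made-up vector in place of a column such as Dem_data$ft_Reps.

```r
# Hypothetical helper: means of `reps` random samples of `size` values from x,
# returned as a one-column data frame named "mean" like the objects above
sampling_means <- function(x, size = 40, reps = 10000) {
  data.frame(mean = replicate(reps, mean(sample(x, size))))
}

set.seed(42)  # arbitrary seed for reproducibility
toy_ratings <- sample(0:100, 500, replace = TRUE)  # made-up stand-in column

toy_sampling <- sampling_means(toy_ratings, size = 40, reps = 1000)
print(dim(toy_sampling))
```

With a helper like this, each of the four sampling distributions would become a single call, which makes it harder to sample the wrong column by accident.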

T-test

t.test(ft_Reps~Party, data = data)
## 
##  Welch Two Sample t-test
## 
## data:  ft_Reps by Party
## t = -82.759, df = 3237.5, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -59.98484 -57.20836
## sample estimates:
##   mean in group Democrat mean in group Republican 
##                 15.12723                 73.72383

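There is a statistically significant relationship between one’s party and feelings towards Republicans (p < 2.2e-16). Republicans on average rate Republicans much higher than Democrats do: the 95% confidence interval puts the difference in mean ratings between about 57.2 and 60.0 points.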

Conclusions

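There was no statistically significant relationship between political party and any of the engagement measures: voter registration, turnout in the 2018 general election, turnout in the 2018 primaries, or interest in political news. The only significant relationships were with which party’s primary one voted in, how one feels towards Democrats, and how one feels towards Republicans. The hypothesis that Democrats are more politically active is therefore not supported by this data. Both parties seem to be equally engaged in voting and staying informed on political news, and also seem similarly polarized in their feelings towards each other.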