I am interested in how politically engaged Republicans and Democrats are, and whether the two groups differ. I hypothesize that Democrats may have been more politically active recently because they are more motivated to remove the current administration from office. I will investigate this by comparing Republicans and Democrats on the following: voter registration, turnout in the 2018 general election, participation in the 2018 primaries, interest in political news, and feelings towards each party.
I will determine which political party each voter identifies with using the pid3_2019 variable, filtered to only include Democrats and Republicans. I will use votereg_2019 to determine whether or not someone was registered to vote in 2019, turnout18post_2019 to see if someone voted in the 2018 general election, and tsmart_P2018_party_2019 to see which party's primary someone voted in during the 2018 primaries. I will later recode this last variable to show only whether or not someone voted, which makes it easier to tell whether primary turnout differed between the two parties. Political interest will be determined using the newsint_2019 variable. Feelings towards Democrats and Republicans will be taken from the Democrats_2019 and Republicans_2019 variables, respectively.
library(readr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
voterdata19<-read.csv("/Users/Nazija/Desktop/airdrop/Voter Data 2019.csv")
data <- voterdata19 %>%
  mutate(
    # Party identification: 1 = Democrat, 2 = Republican, everything else set to NA
    Party = ifelse(pid3_2019 == 1, "Democrat",
            ifelse(pid3_2019 == 2, "Republican", NA)),
    # Voter registration status
    Reg2019 = ifelse(votereg_2019 == 1, "Registered",
              ifelse(votereg_2019 == 2, "Not Registered", NA)),
    # Turnout in the 2018 general election
    Voted2018 = ifelse(turnout18post_2019 == 1, "Yes",
                ifelse(turnout18post_2019 == 2, "No", NA)),
    # Party primary voted in (2018); codes 98 and 99 are treated as not voting
    Prim2018Party = ifelse(tsmart_P2018_party_2019 == 1, "Democratic",
                    ifelse(tsmart_P2018_party_2019 == 2, "Republican",
                    ifelse(tsmart_P2018_party_2019 == 98, "Did not vote",
                    ifelse(tsmart_P2018_party_2019 == 99, "Did not vote", NA)))),
    # Interest in political news
    PoliticalInterest = ifelse(newsint_2019 == 1, "Most of the time",
                        ifelse(newsint_2019 == 2, "Some of the time",
                        ifelse(newsint_2019 == 3, "Only now and then",
                        ifelse(newsint_2019 == 4, "Hardly at all", NA)))),
    # Feeling thermometers: values above 100 are treated as missing
    ft_Dems = ifelse(Democrats_2019 > 100, NA, Democrats_2019),
    ft_Reps = ifelse(Republicans_2019 > 100, NA, Republicans_2019)
  ) %>%
  select(Party, Reg2019, Voted2018, Prim2018Party, PoliticalInterest, ft_Dems, ft_Reps)

# Keep only respondents with complete data on every selected variable
data <- na.omit(data)
head(data)
## Party Reg2019 Voted2018 Prim2018Party PoliticalInterest ft_Dems
## 5 Republican Registered Yes Did not vote Most of the time 31
## 7 Democrat Registered Yes Did not vote Most of the time 53
## 10 Democrat Registered Yes Did not vote Most of the time 97
## 14 Democrat Registered Yes Did not vote Most of the time 95
## 17 Democrat Registered Yes Did not vote Most of the time 73
## 18 Republican Registered Yes Republican Some of the time 0
## ft_Reps
## 5 74
## 7 5
## 10 0
## 14 0
## 17 1
## 18 52
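The nested ifelse() calls used for the recoding can get hard to read. As a minimal sketch of an alternative, assuming dplyr's case_when() is acceptable here, the same kind of recode can be written as below. The toy data frame is hypothetical and only mimics the coding of pid3_2019 and Democrats_2019; it is not taken from the survey.

```r
library(dplyr)

# Hypothetical toy data that mimics the survey coding (not real responses)
toy <- data.frame(
  pid3_2019      = c(1, 2, 3, NA, 1),
  Democrats_2019 = c(90, 10, 50, 997, 75)
)

toy_recoded <- toy %>%
  mutate(
    # Conditions are checked in order; anything unmatched (including NA) becomes NA
    Party = case_when(
      pid3_2019 == 1 ~ "Democrat",
      pid3_2019 == 2 ~ "Republican",
      TRUE           ~ NA_character_
    ),
    # Same rule as above: thermometer values over 100 are treated as missing
    ft_Dems = ifelse(Democrats_2019 > 100, NA, Democrats_2019)
  )

toy_recoded
```

Each condition sits on its own line, so adding or removing a category does not require rebalancing parentheses.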
table(data$Party, data$Reg2019)%>%
prop.table(1)%>%
round(2)
##
## Not Registered Registered
## Democrat 0.02 0.98
## Republican 0.01 0.99
Both parties have nearly the same share of registered voters: 98% of Democrats and 99% of Republicans.
data%>%
group_by(Party, Reg2019)%>%
summarize(n = n())%>%
mutate(percentage = n/sum(n))%>%
ggplot()+
geom_col(aes(x = Party, y = percentage, fill = Reg2019))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)
chisq.test(data$Party, data$Reg2019)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: data$Party and data$Reg2019
## X-squared = 0.20857, df = 1, p-value = 0.6479
There is little to no difference between the two parties in voter registration (98% vs. 99%). The chi-squared test gives a p-value of 0.65, which is greater than .05, so the difference is not statistically significant.
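To make clearer what the chi-squared test is doing, here is a small sketch that computes the statistic by hand and compares it with chisq.test(). The counts are hypothetical, chosen only for illustration, and are not the survey counts. The by-hand version omits the Yates continuity correction, which chisq.test() applies by default to 2x2 tables (as in the output above), so correct = FALSE is set for the comparison.

```r
# Hypothetical 2x2 table of counts (illustration only, not the survey data)
counts <- matrix(c(980, 20,
                   990, 10),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Party = c("Democrat", "Republican"),
                                 Reg   = c("Registered", "Not Registered")))

# Expected counts if party and registration were independent
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson's chi-squared statistic and its p-value (df = (2 - 1) * (2 - 1) = 1)
x2 <- sum((counts - expected)^2 / expected)
p  <- pchisq(x2, df = 1, lower.tail = FALSE)

# Built-in test without the continuity correction, for comparison
builtin <- chisq.test(counts, correct = FALSE)

c(by_hand = x2, built_in = unname(builtin$statistic))
c(by_hand = p,  built_in = builtin$p.value)
```

The statistic grows as the observed counts move away from the counts expected under independence, which is why nearly identical row percentages give a small statistic and a large p-value.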
table(data$Party, data$Voted2018)%>%
prop.table(1)%>%
round(2)
##
## No Yes
## Democrat 0.06 0.94
## Republican 0.07 0.93
Nearly the same percentage of Democrats and Republicans voted in the 2018 general election.
data%>%
group_by(Party, Voted2018)%>%
summarize(n = n())%>%
mutate(percentage = n/sum(n))%>%
ggplot()+
geom_col(aes(x = Party, y = percentage, fill = Voted2018))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)
chisq.test(data$Party, data$Voted2018)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: data$Party and data$Voted2018
## X-squared = 0.65961, df = 1, p-value = 0.4167
There is little difference in the percentage of Democrats and Republicans who voted in the 2018 general election (94% vs. 93%). The p-value (0.42) is above .05, so the difference is not statistically significant.
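The stacked bar charts rely on a group_by(), summarize(), mutate() chain in which the percentages are computed within each party. A minimal sketch of the same idea on hypothetical toy data is shown below; it uses count() followed by an explicit group_by(Party), which is one way (an assumption about style, not a required change) to make the grouping visible and avoid the regrouping message.

```r
library(dplyr)

# Hypothetical toy data (not the survey data)
toy <- data.frame(
  Party     = c("Democrat", "Democrat", "Democrat", "Democrat",
                "Republican", "Republican"),
  Voted2018 = c("Yes", "Yes", "Yes", "No", "Yes", "No")
)

shares <- toy %>%
  count(Party, Voted2018) %>%          # one row per Party x Voted2018 combination
  group_by(Party) %>%                  # percentages are computed within each party
  mutate(percentage = n / sum(n)) %>%
  ungroup()

shares
```

Because sum(n) is evaluated inside each party, the percentages add up to 1 per party, which is what makes the two bars directly comparable even when the parties have different numbers of respondents.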
table(data$Party, data$Prim2018Party)%>%
prop.table(1)%>%
round(2)
##
## Democratic Did not vote Republican
## Democrat 0.22 0.77 0.01
## Republican 0.01 0.77 0.22
Members of both parties voted in their own party's primary far more often than in the other party's: 22% of each party voted in their own party's primary and only 1% voted in the other's. The majority of both parties, 77%, did not vote in the 2018 primaries.
data%>%
group_by(Party, Prim2018Party)%>%
summarize(n = n())%>%
mutate(percentage = n/sum(n))%>%
ggplot()+
geom_col(aes(x = Party, y = percentage, fill = Prim2018Party))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)
chisq.test(data$Party, data$Prim2018Party)
##
## Pearson's Chi-squared test
##
## data: data$Party and data$Prim2018Party
## X-squared = 696.51, df = 2, p-value < 2.2e-16
This shows a statistically significant relationship between one’s party and which primary one voted in during 2018, since Democrats favored the Democratic primary and Republicans favored the Republican primary. However, the picture changes if we recode the variable to show only whether or not someone voted:
Primaries18 <- (data)%>%
mutate(Prim2018 = ifelse(Prim2018Party == "Republican", "Voted",
ifelse(Prim2018Party == "Democratic", "Voted",Prim2018Party)))
table(Primaries18$Party, Primaries18$Prim2018)%>%
prop.table(1)%>%
round(2)
##
## Did not vote Voted
## Democrat 0.77 0.23
## Republican 0.77 0.23
Primaries18%>%
group_by(Party, Prim2018)%>%
summarize(n = n())%>%
mutate(percentage = n/sum(n))%>%
ggplot()+
geom_col(aes(x = Party, y = percentage, fill = Prim2018))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)
chisq.test(Primaries18$Party, Primaries18$Prim2018)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: Primaries18$Party and Primaries18$Prim2018
## X-squared = 0.10525, df = 1, p-value = 0.7456
The distribution is the same for both parties: about 77% did not vote in a 2018 primary and about 23% did. The p-value (0.75) is well above .05, so there is no statistically significant difference in primary turnout. Members of both parties therefore seem to have been engaged at the same level when it comes to the 2018 primaries.
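The recode into voted versus did not vote can also be written with %in%, which states the set of "voted" categories in one place. This is only a sketch on hypothetical values that use the same labels as Prim2018Party; it is not the code that produced the results above.

```r
# Hypothetical values using the same labels as Prim2018Party
prim <- c("Democratic", "Did not vote", "Republican", "Did not vote", NA)

# Either party's primary counts as "Voted"; missing values stay missing
voted <- ifelse(prim %in% c("Democratic", "Republican"), "Voted",
         ifelse(prim == "Did not vote", "Did not vote", NA))

voted
```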
table(data$Party, data$PoliticalInterest)%>%
prop.table(1)%>%
round(2)
##
## Hardly at all Most of the time Only now and then Some of the time
## Democrat 0.04 0.61 0.10 0.25
## Republican 0.03 0.64 0.09 0.25
The share of each party at each level of political interest is very close.
data%>%
group_by(Party, PoliticalInterest)%>%
summarize(n = n())%>%
mutate(percentage = n/sum(n))%>%
ggplot()+
geom_col(aes(x = Party, y = percentage, fill = PoliticalInterest))
## `summarise()` regrouping output by 'Party' (override with `.groups` argument)
chisq.test(data$Party, data$PoliticalInterest)
##
## Pearson's Chi-squared test
##
## data: data$Party and data$PoliticalInterest
## X-squared = 3.9349, df = 3, p-value = 0.2686
There seems to be very little difference in how closely members of either party followed political news. The p-value (0.27) is higher than .05, so whatever difference exists is not statistically significant.
data%>%
group_by(Party)%>%
summarize(avg = mean(ft_Dems))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 2 x 2
## Party avg
## <chr> <dbl>
## 1 Democrat 79.2
## 2 Republican 13.3
Democrats’ mean rating of Democrats (79.2) is much higher than Republicans’ mean rating of Democrats (13.3).
data%>%
group_by(Party)%>%
summarize(avg = mean(ft_Dems))%>%
ggplot()+
geom_col(aes(x = Party, y = avg, fill = Party))+
geom_label(aes(x = Party, y = avg, label = round(avg)))
## `summarise()` ungrouping output (override with `.groups` argument)
# Split the data by party
Dem_data <- data %>%
  filter(Party == "Democrat")
Rep_data <- data %>%
  filter(Party == "Republican")

# Sampling distribution of the mean: 10,000 samples of 40 respondents each
Dem_sampling <- replicate(10000,
                          sample(Dem_data$ft_Dems, 40) %>%
                            mean()) %>%
  data.frame() %>%
  rename("mean" = 1)
Rep_sampling <- replicate(10000,
                          sample(Rep_data$ft_Dems, 40) %>%
                            mean()) %>%
  data.frame() %>%
  rename("mean" = 1)

# Overlay the two sampling distributions
ggplot() +
  geom_histogram(data = Dem_sampling, aes(x = mean), fill = "blue") +
  geom_histogram(data = Rep_sampling, aes(x = mean), fill = "red")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
There is no overlap between the two sampling distributions. Both have pretty narrow ranges that are about the same size. The Democrats’ sample means are a lot higher (and to the right) than the Republicans’ sample means.
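The narrow range of each sampling distribution follows from the standard error of the mean, which is roughly the standard deviation of the ratings divided by the square root of the sample size. The sketch below illustrates this on simulated, hypothetical ratings (not the survey data); the seed value is arbitrary and is set only so the simulation is reproducible.

```r
set.seed(2019)  # arbitrary seed, for reproducibility only

# Hypothetical population of 0-100 ratings (simulated, not the survey data)
ratings <- sample(0:100, 2000, replace = TRUE)

# Sampling distribution of the mean: 10,000 samples of size 40
sample_means <- replicate(10000, mean(sample(ratings, 40)))

# Theoretical standard error of the mean for samples of size 40
se_theory <- sd(ratings) / sqrt(40)

c(observed_sd = sd(sample_means), theoretical_se = se_theory)
c(mean_of_means = mean(sample_means), population_mean = mean(ratings))
```

The spread of the sample means is much smaller than the spread of the individual ratings, and their center sits close to the population mean, which is why two groups with very different means produce histograms that do not overlap.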
t.test(ft_Dems~Party, data = data)
##
## Welch Two Sample t-test
##
## data: ft_Dems by Party
## t = 101.12, df = 3427.8, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 64.66106 67.21810
## sample estimates:
## mean in group Democrat mean in group Republican
## 79.24529 13.30572
There is a statistically significant relationship between one’s party and how one feels towards Democrats. On average, Democrats rate Democrats about 66 points higher than Republicans do (95% confidence interval: 64.7 to 67.2).
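To show where the t statistic, degrees of freedom, and confidence interval in the Welch output come from, the sketch below recomputes them by hand and checks them against t.test(). The two vectors are simulated, hypothetical ratings with an arbitrary seed; they are not the survey responses.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical ratings for two groups (simulated, not the survey data)
a <- rnorm(50, mean = 79, sd = 20)
b <- rnorm(60, mean = 13, sd = 18)

# Per-group variance of the mean, and standard error of the difference
va <- var(a) / length(a)
vb <- var(b) / length(b)
se <- sqrt(va + vb)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(a) - mean(b)) / se
df     <- (va + vb)^2 / (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

# 95% confidence interval for the difference in means (a minus b)
ci <- (mean(a) - mean(b)) + c(-1, 1) * qt(0.975, df) * se

# Built-in Welch test, for comparison
builtin <- t.test(a, b)

c(t_by_hand = t_stat, t_built_in = unname(builtin$statistic))
rbind(by_hand = ci, built_in = as.numeric(builtin$conf.int))
```

The interval describes the first group's mean minus the second group's mean, so in the output above a positive interval means Democrats gave the higher ratings.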
data%>%
group_by(Party)%>%
summarize(avg = mean(ft_Reps))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 2 x 2
## Party avg
## <chr> <dbl>
## 1 Democrat 15.1
## 2 Republican 73.7
Democrats’ mean rating of Republicans (15.1) is much lower than Republicans’ mean rating of Republicans (73.7).
data%>%
group_by(Party)%>%
summarize(avg = mean(ft_Reps))%>%
ggplot()+
geom_col(aes(x = Party, y = avg, fill = Party))+
geom_label(aes(x = Party, y = avg, label = round(avg)))
## `summarise()` ungrouping output (override with `.groups` argument)
# Split the data by party
Dem_data <- data %>%
  filter(Party == "Democrat")
Rep_data <- data %>%
  filter(Party == "Republican")

# Sampling distribution of the mean: 10,000 samples of 40 respondents each
Dem_sampling <- replicate(10000,
                          sample(Dem_data$ft_Reps, 40) %>%
                            mean()) %>%
  data.frame() %>%
  rename("mean" = 1)
Rep_sampling <- replicate(10000,
                          sample(Rep_data$ft_Reps, 40) %>%
                            mean()) %>%
  data.frame() %>%
  rename("mean" = 1)

# Overlay the two sampling distributions
ggplot() +
  geom_histogram(data = Dem_sampling, aes(x = mean), fill = "blue") +
  geom_histogram(data = Rep_sampling, aes(x = mean), fill = "red")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
There is no overlap between the two sampling distributions. The plot resembles the ft_Dems sampling distributions, mirrored: here the Republicans’ sample means are the higher ones.
t.test(ft_Reps~Party, data = data)
##
## Welch Two Sample t-test
##
## data: ft_Reps by Party
## t = -82.759, df = 3237.5, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -59.98484 -57.20836
## sample estimates:
## mean in group Democrat mean in group Republican
## 15.12723 73.72383
There is a statistically significant relationship between one’s party and feelings towards Republicans. On average, Republicans rate Republicans about 59 points higher than Democrats do (95% confidence interval: 57.2 to 60.0).
Apart from which party’s primary people voted in, the only statistically significant relationships were between political party and feelings towards Democrats and towards Republicans. Both parties seem to be equally engaged in voting and in staying informed on political news, so the data do not support my hypothesis that Democrats are more politically active. The two parties also seem similarly polarized in their feelings towards each other.