Preparing data for analysis.

First of all, we run all the libraries necessary for the analysis.

library(readr)
library(ggplot2)
library(dplyr)
library(sjlabelled)
library(sjmisc) 
library(sjstats) 
library(ggeffects)  
library(sjPlot)
library(knitr)
library(psych)
library(foreign)

Next, we load out dataset. It is a combination of data connected with both politics and personal information.

politics_gender <- read_spss("politics_gender.sav")

## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |                                                                 |   1%
  |                                                                       
  |=                                                                |   1%
  |                                                                       
  |=                                                                |   2%
  |                                                                       
  |==                                                               |   2%
  |                                                                       
  |==                                                               |   3%
  |                                                                       
  |==                                                               |   4%
  |                                                                       
  |===                                                              |   4%
  |                                                                       
  |===                                                              |   5%
  |                                                                       
  |====                                                             |   6%
  |                                                                       
  |====                                                             |   7%
  |                                                                       
  |=====                                                            |   7%
  |                                                                       
  |=====                                                            |   8%
  |                                                                       
  |======                                                           |   8%
  |                                                                       
  |======                                                           |   9%
  |                                                                       
  |======                                                           |  10%
  |                                                                       
  |=======                                                          |  10%
  |                                                                       
  |=======                                                          |  11%
  |                                                                       
  |========                                                         |  12%
  |                                                                       
  |========                                                         |  13%
  |                                                                       
  |=========                                                        |  13%
  |                                                                       
  |=========                                                        |  14%
  |                                                                       
  |==========                                                       |  15%
  |                                                                       
  |==========                                                       |  16%
  |                                                                       
  |===========                                                      |  16%
  |                                                                       
  |===========                                                      |  17%
  |                                                                       
  |===========                                                      |  18%
  |                                                                       
  |============                                                     |  18%
  |                                                                       
  |============                                                     |  19%
  |                                                                       
  |=============                                                    |  19%
  |                                                                       
  |=============                                                    |  20%
  |                                                                       
  |=============                                                    |  21%
  |                                                                       
  |==============                                                   |  21%
  |                                                                       
  |==============                                                   |  22%
  |                                                                       
  |===============                                                  |  22%
  |                                                                       
  |===============                                                  |  23%
  |                                                                       
  |===============                                                  |  24%
  |                                                                       
  |================                                                 |  24%
  |                                                                       
  |================                                                 |  25%
  |                                                                       
  |=================                                                |  25%
  |                                                                       
  |=================                                                |  26%
  |                                                                       
  |=================                                                |  27%
  |                                                                       
  |==================                                               |  27%
  |                                                                       
  |==================                                               |  28%
  |                                                                       
  |===================                                              |  29%
  |                                                                       
  |===================                                              |  30%
  |                                                                       
  |====================                                             |  30%
  |                                                                       
  |====================                                             |  31%
  |                                                                       
  |=====================                                            |  32%
  |                                                                       
  |=====================                                            |  33%
  |                                                                       
  |======================                                           |  33%
  |                                                                       
  |======================                                           |  34%
  |                                                                       
  |======================                                           |  35%
  |                                                                       
  |=======================                                          |  35%
  |                                                                       
  |=======================                                          |  36%
  |                                                                       
  |========================                                         |  36%
  |                                                                       
  |========================                                         |  37%
  |                                                                       
  |========================                                         |  38%
  |                                                                       
  |=========================                                        |  38%
  |                                                                       
  |=========================                                        |  39%
  |                                                                       
  |==========================                                       |  39%
  |                                                                       
  |==========================                                       |  40%
  |                                                                       
  |==========================                                       |  41%
  |                                                                       
  |===========================                                      |  41%
  |                                                                       
  |===========================                                      |  42%
  |                                                                       
  |============================                                     |  42%
  |                                                                       
  |============================                                     |  43%
  |                                                                       
  |============================                                     |  44%
  |                                                                       
  |=============================                                    |  44%
  |                                                                       
  |=============================                                    |  45%
  |                                                                       
  |==============================                                   |  45%
  |                                                                       
  |==============================                                   |  46%
  |                                                                       
  |==============================                                   |  47%
  |                                                                       
  |===============================                                  |  47%
  |                                                                       
  |===============================                                  |  48%
  |                                                                       
  |================================                                 |  49%
  |                                                                       
  |================================                                 |  50%
  |                                                                       
  |=================================                                |  50%
  |                                                                       
  |=================================                                |  51%
  |                                                                       
  |==================================                               |  52%
  |                                                                       
  |==================================                               |  53%
  |                                                                       
  |===================================                              |  53%
  |                                                                       
  |===================================                              |  54%
  |                                                                       
  |===================================                              |  55%
  |                                                                       
  |====================================                             |  55%
  |                                                                       
  |====================================                             |  56%
  |                                                                       
  |=====================================                            |  56%
  |                                                                       
  |=====================================                            |  57%
  |                                                                       
  |=====================================                            |  58%
  |                                                                       
  |======================================                           |  58%
  |                                                                       
  |======================================                           |  59%
  |                                                                       
  |=======================================                          |  59%
  |                                                                       
  |=======================================                          |  60%
  |                                                                       
  |=======================================                          |  61%
  |                                                                       
  |========================================                         |  61%
  |                                                                       
  |========================================                         |  62%
  |                                                                       
  |=========================================                        |  62%
  |                                                                       
  |=========================================                        |  63%
  |                                                                       
  |=========================================                        |  64%
  |                                                                       
  |==========================================                       |  64%
  |                                                                       
  |==========================================                       |  65%
  |                                                                       
  |===========================================                      |  65%
  |                                                                       
  |===========================================                      |  66%
  |                                                                       
  |===========================================                      |  67%
  |                                                                       
  |============================================                     |  67%
  |                                                                       
  |============================================                     |  68%
  |                                                                       
  |=============================================                    |  69%
  |                                                                       
  |=============================================                    |  70%
  |                                                                       
  |==============================================                   |  70%
  |                                                                       
  |==============================================                   |  71%
  |                                                                       
  |===============================================                  |  72%
  |                                                                       
  |===============================================                  |  73%
  |                                                                       
  |================================================                 |  73%
  |                                                                       
  |================================================                 |  74%
  |                                                                       
  |================================================                 |  75%
  |                                                                       
  |=================================================                |  75%
  |                                                                       
  |=================================================                |  76%
  |                                                                       
  |==================================================               |  76%
  |                                                                       
  |==================================================               |  77%
  |                                                                       
  |==================================================               |  78%
  |                                                                       
  |===================================================              |  78%
  |                                                                       
  |===================================================              |  79%
  |                                                                       
  |====================================================             |  79%
  |                                                                       
  |====================================================             |  80%
  |                                                                       
  |====================================================             |  81%
  |                                                                       
  |=====================================================            |  81%
  |                                                                       
  |=====================================================            |  82%
  |                                                                       
  |======================================================           |  82%
  |                                                                       
  |======================================================           |  83%
  |                                                                       
  |======================================================           |  84%
  |                                                                       
  |=======================================================          |  84%
  |                                                                       
  |=======================================================          |  85%
  |                                                                       
  |========================================================         |  86%
  |                                                                       
  |========================================================         |  87%
  |                                                                       
  |=========================================================        |  87%
  |                                                                       
  |=========================================================        |  88%
  |                                                                       
  |==========================================================       |  89%
  |                                                                       
  |==========================================================       |  90%
  |                                                                       
  |===========================================================      |  90%
  |                                                                       
  |===========================================================      |  91%
  |                                                                       
  |===========================================================      |  92%
  |                                                                       
  |============================================================     |  92%
  |                                                                       
  |============================================================     |  93%
  |                                                                       
  |=============================================================    |  93%
  |                                                                       
  |=============================================================    |  94%
  |                                                                       
  |==============================================================   |  95%
  |                                                                       
  |==============================================================   |  96%
  |                                                                       
  |===============================================================  |  96%
  |                                                                       
  |===============================================================  |  97%
  |                                                                       
  |===============================================================  |  98%
  |                                                                       
  |================================================================ |  98%
  |                                                                       
  |================================================================ |  99%
  |                                                                       
  |=================================================================|  99%
  |                                                                       
  |=================================================================| 100%

Here we filter our data in order to delete all the observation useless for the analysis.

politics <- politics_gender %>% 
  select(agea, lrscale, sgnptit, vote)

politics <- politics[!is.na(politics$agea),]
politics <- politics[!is.na(politics$lrscale),]
politics <- politics[!is.na(politics$sgnptit),]
politics <- politics[!is.na(politics$vote),]

politics13 <- politics %>% 
  filter(lrscale != 77) %>% 
  filter(lrscale != 88) %>% 
  filter(lrscale != 99 ) 

politics13 <- politics13 %>%
  filter(sgnptit != 7) %>%
  filter(sgnptit != 8) %>%
  filter(sgnptit != 9) 

politics13 <- politics13 %>% 
  filter(agea != 999)

Describing variables

First, we modify one of the variables to make it comfortable for manipulations. Then, we update our dataset.

politics13$lr <- ifelse(politics13$lrscale <= 3, "Left",
                    ifelse(politics13$lrscale >= 7, "Right", "Middle"))
politics13 <- politics13 %>% 
  select(- lrscale)

Now, let`s look at the number of observations and the number of variables.

dim(politics13)

## [1] 2231    4

Well, that`s all we need for conducting tests: 4 variables and enough number of observations.

Then, there is a description of chosen variables presented.

Label <- c("`sgnptit`", "`lr`", "`agea`", "`vote`" ) 
Meaning <- c("Signed petition last 12 months", "Placement on left right scale", "Age", "Voted last national election")
Level_Of_Measurement <- c("Nominal", "Nominal", "Ratio", "Nominal")
Test <- c("Chi-squared test", "Chi-squared test", "T-test for independet variables", "T-test for independet variables")
df <- data.frame(Label, Meaning, Level_Of_Measurement,Test, stringsAsFactors = FALSE)
kable(df)

Label	Meaning	Level_Of_Measurement	Test
`sgnptit`	Signed petition last 12 months	Nominal	Chi-squared test
`lr`	Placement on left right scale	Nominal	Chi-squared test
`agea`	Age	Ratio	T-test for independet variables
`vote`	Voted last national election	Nominal	T-test for independet variables

Exploratory analysis of categories

Firstly, we select variables necessary for chi-square test. Next, there is a contigency table presented.

politics_chi <- politics13 %>% 
  select(lr, sgnptit)
politics_chi$sgnptit <- factor(politics_chi$sgnptit, labels = c("Yes", "No"), ordered= F,exclude = NA)
ContigencyTable <- table(politics_chi$lr, politics_chi$sgnptit)
kable(ContigencyTable)

	Yes	No
Left	141	188
Middle	281	1036
Right	95	490

In order to check whether our categories are successful to run chi-square test, we are going to create stacked barplots and analyze them.

Stacked barplot №1

ggplot() +
  geom_bar(data = politics_chi, aes(x = lr, fill = sgnptit), position = "fill")+
  coord_flip()+
  xlab("Party affiliation") + 
  ylab("Percentage of people") +
  ggtitle("Participation in signing petitions due to party affiliation")

Stacked barplot №2

sjp.xtab(politics_chi$lr, politics_chi$sgnptit, type = "bar", margin ="row",
  bar.pos = "stack", title = "Participation in signing petitions due to party affiliation", title.wtd.suffix = NULL,
  axis.titles = NULL, axis.labels = NULL, legend.title = NULL,
  legend.labels = NULL, weight.by = NULL, rev.order = FALSE,
  show.values = TRUE, show.n = TRUE, show.prc = TRUE, show.total = TRUE,
  show.legend = TRUE, show.summary = TRUE, summary.pos = "r",
  string.total = "Total", wrap.title = 50, wrap.labels = 15,
  wrap.legend.title = 20, wrap.legend.labels = 20, geom.size = 0.7,
  geom.spacing = 0.1, geom.colors = "Paired", dot.size = 3,
  smooth.lines = FALSE, grid.breaks = 0.2, expand.grid = FALSE,
  ylim = NULL, vjust = "bottom", hjust = "left", y.offset = NULL,
  coord.flip = TRUE)

After building two plots, we were convinced that, one way or another, the adherents of each category of political preferences signed the petitions. However, it can be noted that the tendency to take part in signing petitions is not very common in Ireland; The largest group supporting this trend is the Liberals, with 42.5% of those who signed any petitions. The “middle” and “right”, respectively, have in their ranks 21.1% and 15.3% respectively.
Each observation is independent of all the others (i.e., one observation per subject)and no more than 20% of the expected counts are less than 5. (none of them, actually). Therefore, the data is appropriate to conduct a reliable chi-square test.

Chi-square test

With the introduction of the Internet, signing petitions has become available online. That is, the study of the largerst online petitions` platform (Change.org) has shown that this source is strongly biased toward liberal causes. Graph

In Ireland, the half of the political parties presented are right(conservatism) or centre(social democracy, liberal conservatism, populism) and another half is left(socialism, respublicanism), so citizens have a wide spectrum of different views to share. Accordingly, in order to find out whether this distribution of the petition signatories due to their political preferences is random, we decided to build a Chi- square test.

The following hypotheses were approved for this:

H0 - there is no relation between the political preferences of respondents and their signing petition or not signing behavior
H1 - there is a relation between the political preferences of respondents and their signing petition or not signing behavior

So, let`s run chi-square test.

colnames(ContigencyTable) <- c("Petition +", "Petition -")
rownames(ContigencyTable) <- c("L", "R", "C")
chi.test <- chisq.test(ContigencyTable)
chi.test

## 
##  Pearson's Chi-squared test
## 
## data:  ContigencyTable
## X-squared = 89.895, df = 2, p-value < 2.2e-16

After carrying out the Chi-square test, we found out that its p-value is extremely small, which means that we do not have strong enough evidence to assert that there is no relation between these two variables. In this way, our H0 should be rejected and political preferences of respondents and their signing petition or not signing behavior are likely to be related.

Next, we have to look at residuals.

kable(chi.test$stdres) #for residuals

	Petition +	Petition -
L	9.164265	-9.164265
R	-2.468595	2.468595
C	-4.627590	4.627590

assocplot(t(ContigencyTable), main="Residuals and number of observations" )

On the plot of residuals, we can see the confirmation of our conclusion on the Chi-test: the difference in the number of petitioners who belong to different political parties is too big to say that the variables are independent of each other. Especially distinguished are the liberals, in whose ranks the number of signatories of the petition for indicator 10 is greater than expected, if these variables were independent; as well as the вЂњrightвЂќ ones, where the indicator 6 is less than the expected number of people who signed the petitions, if these variables were independent.
Thus, we were convinced that, apparently, since the Chi-square test and the difference in the residuals indicate a lack of evidence in favor of the independence of these data, we can assert that the political preferences of the respondents and their desire to sign or not to sign petitions of any kinds are related.

Exploring data for T-test

Here we start with filtering data to delete values useless for our test.

politics_ttest <- politics13 %>% 
  select(agea, vote) %>% 
  filter(vote != 3) %>% 
  filter(vote != 7) %>% 
  filter(vote != 8)

Next, LetвЂ™s compare mean values with the help of boxplot.

politics_ttest$vote <- factor(politics_ttest$vote, labels = c("Yes", "No"), ordered= F,exclude = NA)
ggplot() +
  geom_boxplot(data = politics_ttest, aes(x = vote, y = agea), fill="#A44200", col="#A44200", alpha = 0.5) +
  scale_y_continuous(limits = c(0,100)) +
  xlab("Voted last national election") + 
  ylab("Age") +
  ggtitle("Participation in the election due to age")

The median age of voters is higher than the median age of those who did not vote. The first box plot is taller than the second, so we can say that there is a greater variety of ages in the group of voters than in the group of those who refused to vote. The whiskers are pretty the same on both of the graphs. However, the graph of non-voters shows some outliers.

Cheking normality of distribution

There is the first way to check normality presented.

describeBy(politics_ttest, politics_ttest$vote)

## 
##  Descriptive statistics by group 
## group: Yes
##       vars    n  mean    sd median trimmed   mad min max range  skew
## agea     1 1689 54.93 16.41     55   54.98 19.27  19  92    73 -0.03
## vote*    2 1689  1.00  0.00      1    1.00  0.00   1   1     0   NaN
##       kurtosis  se
## agea     -0.77 0.4
## vote*      NaN 0.0
## -------------------------------------------------------- 
## group: No
##       vars   n  mean    sd median trimmed   mad min max range skew
## agea     1 408 40.69 15.87     38   39.19 14.83  16  96    80 0.87
## vote*    2 408  2.00  0.00      2    2.00  0.00   2   2     0  NaN
##       kurtosis   se
## agea      0.44 0.79
## vote*      NaN 0.00

Skewness is a measure of the symmetry in a distribution. The normal distribution is symmetrical, so skew should be equal to 0 in normal distribution. In the group of voters skew equals to 7.49, and in the group of non-voters the skew is higher, 8.71. The distribution of age is more symmetrical in the group of voters, but still it is far away from normal. However, both of skews are greater than 1, so both of the groups have a high positive skewness (right).
Kurtosis tells us, whether the distribution is peaked or plain. The kurtosis of the age in voters group equals to 55.29, and in the non-voters group kurtosis equals to 75.76. That means that the distribution of the first group (voters) is less sharp than the distribution of the second group.

Next, we check normality with the help of histogram.

library(ggplot2)
ggplot(politics_ttest, aes(x = agea, fill = vote)) +
      geom_histogram(aes(y=..density..), position = "identity", alpha = 0.7, binwidth = 3) +
  geom_density(col = "yellow", fill = "white", alpha = 0.1) +
  geom_vline(aes(xintercept = mean(politics_ttest$agea), color = 'mean'), linetype="dashed", size=1) +
  geom_vline(aes(xintercept = median(politics_ttest$agea), color = 'median'), linetype="longdash", size=1) +
  scale_color_manual(name = "Measurement", values = c(median = "#cb3f68", mean = "#824acd")) +
  xlab("Age") + 
  ylab("Density") +
  ggtitle("Age distribution of voters and non-voters")

Now we can see the distribution and assume that both of the groups are not very close to normal distribution, but the group of voters is slightly closer to it.

Finally, we check normality with the help of Q-Q Plot.

#creating subgroups based on voting / non-voting
voteplus <- subset(politics_ttest[politics_ttest$vote == "Yes",]) 
voteminus <- subset(politics_ttest[politics_ttest$vote == "No",])
par(mfrow = c(1,2))
# y is limited from 18 because it is age at which the Irish are allowed to vote
qqnorm(voteplus$agea, ylim = c(18, 100), main = "Normal Q-Q Plot for vote+"); qqline(voteplus$agea,ylim = c(18 ,100), col= 2)
qqnorm(voteminus$agea, ylim = c(18 ,100), main = "Normal Q-Q Plot for vote-"); qqline(voteminus$agea, col= 2, ylim = c(18 ,100))

Both Q-Q plots show the distributions that are skewed to the right (some data higher than the line) and sharp peaks (the shape of the line is not the same as the normality line). However, the first plot shows a slightly plainer peak.
Thus, we can conclude that the first group (voters) is more normally distributed than the second group (non-voters). However, during a series of tests on the normality of the sample, it was proved that the resulting sample has an abnormal distribution.

Conducting T-test

Age is sometimes mentioned as one of the factor which can influence voting behaviour, but it seems that every country should be studied as a unique case.

As in the case with political preferences and signed petitions, we would like to find out whether voting behavior in Ireland is related to the respondent’s age, so, we should conduct a T-test. The following hypotheses were approved for this:

H0: the mean age of people who voted and did not vote does not differ.
H1: the mean age does differ and, thus, there is a relation between age and voting behavior

Now we are going to run T-test.

t.test(politics_ttest$agea ~ politics_ttest$vote)

## 
##  Welch Two Sample t-test
## 
## data:  politics_ttest$agea by politics_ttest$vote
## t = 16.147, df = 634.29, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  12.50096 15.96259
## sample estimates:
## mean in group Yes  mean in group No 
##          54.92540          40.69363

Statistical conclusion: at the 5% significance level on the available data the null hypothesis should be rejected in favor of the alternative one (p-value <О±).
Substantive conclusion : the average age of people is significantly different among those who voted, and those who refused to vote.

Chi-squared and t-test

Indentifying topic and describing individual contribution