Let’s select the variables we need from the ESS data

library(dplyr)  # provides select()

ESS1 <- select(ESS, c("nwspol", "prtvtcil", "psppipla", "ppltrst", "polintr", "stfgov", "pbldmn", "vote", "contplt", "gndr", "icpart1"))

Constructing the chi-square test

For the chi-square test we chose gender (var. gndr) and participation in public demonstrations (var. pbldmn). Both variables are nominal.

  • H0 - there is no relationship between respondents' gender and participation in public demonstrations

  • H1 - there is a relationship between respondents' gender and participation in public demonstrations

table(ESS1$gndr, ESS1$pbldmn)
##         
##           Yes   No
##   Male    151 1074
##   Female   89 1239
Table <- matrix(c(151, 89, 1074, 1239), nrow=2)

row.names(Table) <- c("Male","Female")

colnames(Table) <- c("Yes", "No")

Table
##        Yes   No
## Male   151 1074
## Female  89 1239
chisq.test(Table)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  Table
## X-squared = 23.014, df = 1, p-value = 1.608e-06
knitr::kable(Table)
         Yes    No
Male     151  1074
Female    89  1239

Under H0 (no association between the variables in the population), the probability of obtaining a chi-square statistic of 23.01 or larger is 1.608e-06. Since this p-value is below the 0.05 significance level, we reject H0 and conclude that respondents' gender and their participation in demonstrations are related.
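To make the test less of a black box, the statistic can be recomputed by hand. This is only a sketch: the counts are copied from the contingency table above, and the names observed, expected, statistic and p_value are introduced here for illustration.

```r
# Observed counts, copied from the contingency table above
observed <- matrix(c(151, 89, 1074, 1239), nrow = 2,
                   dimnames = list(c("Male", "Female"), c("Yes", "No")))

n <- sum(observed)

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(observed), colSums(observed)) / n

# Yates' continuity correction subtracts 0.5 from each |O - E|
# (R caps the correction at |O - E| itself; the cap is not reached here)
statistic <- sum((abs(observed - expected) - 0.5)^2 / expected)

# Upper-tail probability of a chi-square distribution with 1 degree of freedom
p_value <- pchisq(statistic, df = 1, lower.tail = FALSE)

round(expected, 1)
statistic
p_value
```

The resulting statistic and p-value should agree with the chisq.test() output (X-squared = 23.014, p-value = 1.608e-06).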

set_theme(
  geom.outline.size = 0.3, 
  geom.label.size = 4,
  geom.label.color = "black",
  axis.angle.x = 45, 
  base = theme_bw()
)

sjp.xtab(ESS1$gndr, ESS1$pbldmn, margin = "row", bar.pos = "stack", show.summary = TRUE, coord.flip = TRUE, geom.colors = c("#CD423F", "#F5C6AC"))

Specifically, most Israeli respondents do not participate in demonstrations, but men participate more often than women: 12.3% of men took part, compared with only 6.7% of women. Correspondingly, 93.3% of women did not participate, against 87.7% of men.
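The row percentages shown in the plot can also be checked directly with prop.table(). A minimal sketch, assuming the counts from the contingency table above (observed and row_pct are illustrative names):

```r
# Observed counts, copied from the contingency table above
observed <- matrix(c(151, 89, 1074, 1239), nrow = 2,
                   dimnames = list(c("Male", "Female"), c("Yes", "No")))

# margin = 1 divides each cell by its row total, so every row sums to 100%
row_pct <- round(100 * prop.table(observed, margin = 1), 1)

row_pct
```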

firstchi <- chisq.test(Table)

knitr::kable(firstchi$stdres)
               Yes         No
Male      4.865195  -4.865195
Female   -4.865195   4.865195

A standardized residual below -2 means that:

    • the cell contains fewer observations than would be expected if the variables were independent.

A standardized residual above 2 means that:

    • the cell contains more observations than would be expected if the variables were independent.

More men participate in demonstrations than independence would predict (std. res. = 4.87), and more women do not participate than independence would predict (std. res. = 4.87).
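For reference, the standardized residuals can be reproduced from the observed and expected counts. The sketch below assumes the counts from the contingency table above; the variable names are illustrative, and the formula divides each raw residual by its standard deviation under independence.

```r
# Observed counts, copied from the contingency table above
observed <- matrix(c(151, 89, 1074, 1239), nrow = 2,
                   dimnames = list(c("Male", "Female"), c("Yes", "No")))

n       <- sum(observed)
row_tot <- rowSums(observed)
col_tot <- colSums(observed)

# Expected counts under independence
expected <- outer(row_tot, col_tot) / n

# Variance of (observed - expected) under independence:
# E * (1 - row share) * (1 - column share)
variance <- expected * outer(1 - row_tot / n, 1 - col_tot / n)

std_res <- (observed - expected) / sqrt(variance)

round(std_res, 3)
```

In a 2x2 table all four residuals have the same absolute value, which is why the table above shows only 4.865195 with alternating signs.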

Now we need to plot the results:

assocplot(t(Table), main="Residuals and number of observations")

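The assocplot() function draws a Cohen-Friendly association plot: the height of each bar is proportional to the Pearson residual of the cell, and its width to the square root of the expected count.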

corrplot(firstchi$stdres, is.cor = FALSE)

  • Positive residuals are in blue
  • Negative residuals are in red

Let’s draw a stacked barplot with two variables!

counts <- table(ESS1$pbldmn, ESS1$gndr)

barplot(counts, col=brewer.pal(n = 3, name = "PuRd"), legend = rownames(counts), las = 2, main = "Do you participate in demonstrations?")

We can conclude that gender and participation in demonstrations are related. Respondents in Israel tend not to attend demonstrations, but men do so more often than women.