2BK team: Bakhareva, Borisenko, Kireeva, Kuzmicheva
15/03/2019
Hello. We are 2BK. Our topic is “Politics”. The country we have chosen to study is Ireland (round 8). The team members are Bakhareva Anastasia, Borisenko Iana, Kireeva Irina, and Kuzmicheva Daria. We have focused on the survey results connected with politics in Ireland.
While discussing Ireland as the object of our research, we found out that it is a fairly prosperous country. It ranks sixth on the Human Development Index, which is a very high position. However, this index, which is composed of life expectancy, education, and per capita income indicators, does not include the political aspects of a country. Since our research topic is Politics, we were concerned with this fact and decided to explore how involvement in politics is related to life satisfaction in Ireland. We expected that the people most interested in politics would have the highest level of life satisfaction compared to people who are less interested in political processes.
Individual contributions were as follows:
Anastasia Bakhareva: boxplot construction, graph analysis, barplot construction, ANOVA-test
Iana Borisenko: Post hoc + non-parametric tests description and analysis, Levene test
Irina Kireeva: Post hoc + non-parametric tests
Daria Kuzmicheva: histogram construction, graph analysis, variables description and analysis
Our research question is “Do Irish people who are interested in politics to different extents have the same level of life satisfaction?”
To explore this question, we took the following data from the all-countries file:
library(dplyr)
library(ggplot2)
library(tidyverse)
library(psych)
library(magrittr)
library(knitr)
library(kableExtra)
library(readr)
library(foreign)
library(haven)

politics_media <- read_sav("ESS1-8e01.sav")

politics = politics_media %>%
  select(stflife, polintr)

# 77, 88 and 99 are missing-value codes for stflife
politics = politics %>%
  filter(stflife != 77) %>%
  filter(stflife != 88) %>%
  filter(stflife != 99)

# 7, 8 and 9 are missing-value codes for polintr
politics.1 = politics %>%
  select(stflife, polintr) %>%
  filter(polintr != 7) %>%
  filter(polintr != 8) %>%
  filter(polintr != 9)

Next, we describe the chosen variables.
Label <- c("`polintr`", "`stflife`")
Meaning <- c("How interested in politics", "How satisfied with life as a whole")
Level_Of_Measurement <- c("Ordinal", "Interval")
Measurement <- c("Very - Quite - Hardly - Not at all", "0 - 10")
df <- data.frame(Label, Meaning, Level_Of_Measurement, Measurement, stringsAsFactors = FALSE)
kable(df) %>%
  kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)

| Label | Meaning | Level_Of_Measurement | Measurement |
|---|---|---|---|
| `polintr` | How interested in politics | Ordinal | Very - Quite - Hardly - Not at all |
| `stflife` | How satisfied with life as a whole | Interval | 0 - 10 |
politics.2 = politics.1 %>%
select(polintr, stflife)
politics.2$polintr <- ifelse (politics.2$polintr == 1, "Very interested",
ifelse(politics.2$polintr == 2, "Quite interested",
ifelse(politics.2$polintr == 3, "Hardly interested", "Not interested")))
politics.2$stflife <- as.numeric(as.character(politics.2$stflife))
politics.2$polintr <- as.factor(politics.2$polintr)
politics.3 <- data.frame(politics.2$polintr,politics.2$stflife)
str(politics.3)

## 'data.frame': 2749 obs. of 2 variables:
## $ politics.2.polintr: Factor w/ 4 levels "Hardly interested",..: 1 1 3 1 1 1 3 2 1 3 ...
## $ politics.2.stflife: num 4 6 6 4 6 5 7 4 5 7 ...
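The nested `ifelse()` recoding above can also be written with `factor()` and explicit labels, which sets the level order at the same time. This is only a minimal sketch on a made-up vector `codes` (not the ESS data), assuming the coding 1 = very interested through 4 = not interested that the `ifelse()` calls imply:

```r
# Hypothetical polintr codes (1 = very interested ... 4 = not interested)
codes <- c(1, 2, 3, 4, 2, 3)

# factor() maps codes to labels; listing the levels from 4 to 1 orders the
# factor from least to most interested in a single step
polintr_f <- factor(codes,
                    levels = c(4, 3, 2, 1),
                    labels = c("Not interested", "Hardly interested",
                               "Quite interested", "Very interested"))

levels(polintr_f)
table(polintr_f)
```

With this approach no separate re-levelling step is needed before plotting or building tables.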
politics.2$polintr <- factor(politics.2$polintr, c("Not interested", "Hardly interested", "Quite interested", "Very interested"))

# The missing-value codes were already removed above; the filter is kept as a safeguard
politics.11 = politics.2 %>%
  filter(stflife != 88) %>%
  filter(stflife != 77) %>%
  filter(stflife != 99)
describeBy(politics.11$stflife, politics.11$polintr, mat = TRUE) %>% #create dataframe
select(polintr = group1, N=n, Mean=mean, SD=sd, Median=median, Min=min, Max=max,
Skew=skew, Kurtosis=kurtosis, st.error = se) %>%
kable(align=c("lrrrrrrrr"), digits=2, row.names = FALSE,
caption="Satisfaction with life by political preferences") %>%
  kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)

| polintr | N | Mean | SD | Median | Min | Max | Skew | Kurtosis | st.error |
|---|---|---|---|---|---|---|---|---|---|
| Not interested | 744 | 7.12 | 2.00 | 7 | 0 | 10 | -0.87 | 1.05 | 0.07 |
| Hardly interested | 733 | 7.23 | 1.90 | 8 | 0 | 10 | -0.97 | 1.42 | 0.07 |
| Quite interested | 964 | 7.29 | 1.80 | 7 | 0 | 10 | -1.02 | 2.00 | 0.06 |
| Very interested | 308 | 7.54 | 1.87 | 8 | 0 | 10 | -0.99 | 1.35 | 0.11 |
The table shows that all four groups are large (from 308 to 964 respondents), although the "Very interested" group is noticeably smaller than the other three.
Next, we plot the group sizes as shares of the sample to check that each group is large enough to compare.
par(mar = c(3,10,0,3))
barplot(table(politics.11$polintr)/nrow(politics.11)*100, horiz = T, xlim = c(0,60), las = 2)

The barplot gives the same picture: the three less interested groups are of comparable size, while the "Very interested" group is the smallest.
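The shares shown in the barplot can also be computed directly from the N column of the descriptive table. The sketch below simply re-enters those four counts by hand; the object names `n`, `shares` and `size_ratio` are introduced here for illustration only:

```r
# Group sizes copied from the N column of the descriptive table
n <- c("Not interested" = 744, "Hardly interested" = 733,
       "Quite interested" = 964, "Very interested" = 308)

# Share of each group in the analysed sample, in percent
shares <- round(100 * prop.table(n), 1)
shares

# Ratio of the largest to the smallest group
size_ratio <- max(n) / min(n)
size_ratio
```

The ratio gives a quick sense of how unbalanced the design is, which matters for how robust ANOVA is to unequal variances.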
ggplot()+
geom_boxplot(data = politics.2, aes(x = polintr, y = stflife), fill="pink", col="purple", alpha = 0.5) +
ylim(c(0,10)) +
xlab("How interested in politics") +
ylab("Level of Life satisfaction") +
ggtitle("Life satisfaction by the level of interest in politics")

Conclusion: From the boxplot we can see that life satisfaction is distributed roughly normally within the groups, although there are several outliers at the low end of the scale. Note that a boxplot shows medians rather than means: according to the descriptive table, the "Hardly interested" and "Very interested" groups have a median of 8, while the other two groups have a median of 7.
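For reference, the points a boxplot draws as outliers are those lying more than 1.5 times the box height beyond the box. The following sketch uses made-up scores on the 0-10 scale (not the ESS data) and the base R helper `boxplot.stats()` to list them:

```r
# Made-up life satisfaction scores on the 0-10 scale (not ESS data)
scores <- c(0, 1, 6, 7, 7, 8, 8, 8, 9, 10)

# boxplot.stats() returns the five numbers used to draw the box and whiskers
# and the points that fall more than 1.5 hinge-spreads beyond the box
bs <- boxplot.stats(scores)
bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # values drawn as individual outlier points
```

Running the same call per group on the real data would show which responses the boxplot flags.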
The next step is to check the assumptions of the ANOVA test. First, let's look at the homogeneity of variances with the help of Levene's test.
library(car)
leveneTest(politics.11$stflife ~ politics.11$polintr)

## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value Pr(>F)
## group    3  1.7685  0.151
##       2745
Conclusion: From the results of Levene's test it can be seen that the p-value (0.151) is higher than the significance level of 0.05. This means that there is no evidence that the variances differ between the groups. Therefore, we can assume homogeneity of variances across the groups of political interest.
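To make clear what the test with `center = median` does, here is a minimal base R sketch on simulated data (the vectors `y` and `g` are hypothetical, not the ESS variables): it runs a one-way ANOVA on the absolute deviations of each observation from its group median.

```r
set.seed(1)

# Simulated data: three groups of 50 with the same variance (not ESS data)
g <- factor(rep(c("a", "b", "c"), each = 50))
y <- rnorm(150, mean = 7, sd = 2)

# Absolute deviation of every observation from its own group median
dev <- abs(y - ave(y, g, FUN = median))

# One-way ANOVA on those deviations; the F test on the group term is the
# median-centred Levene test
lev <- anova(lm(dev ~ g))
lev_p <- lev[["Pr(>F)"]][1]
lev
```

A large p-value here, as in the report, means the spread of the groups does not differ detectably.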
oneway.test(politics.11$stflife ~ politics.11$polintr, var.equal = T)

##
## One-way analysis of means
##
## data: politics.11$stflife and politics.11$polintr
## F = 3.8028, num df = 3, denom df = 2745, p-value = 0.009808
aov.out <- aov(politics.11$stflife ~ politics.11$polintr)
summary(aov.out)

##                       Df Sum Sq Mean Sq F value  Pr(>F)
## politics.11$polintr    3     41  13.562   3.803 0.00981 **
## Residuals           2745   9790   3.566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: As the p-value is less than the chosen significance level of 0.05, we can conclude that the mean level of life satisfaction is not equal across the groups of political interest.
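The F value and p-value in the ANOVA table follow directly from the mean squares and degrees of freedom printed there. A short sketch, with the numbers re-entered by hand from the output above (so small rounding differences are expected):

```r
# Values copied from the summary(aov.out) table
ms_between <- 13.562   # Mean Sq for politics.11$polintr
ms_within  <- 3.566    # Mean Sq for Residuals
df_between <- 3
df_within  <- 2745

# F is the ratio of between-group to within-group mean squares
f_value <- ms_between / ms_within

# Upper-tail probability of the F distribution gives the p-value
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

c(F = f_value, p = p_value)
```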
layout(matrix(1:4, 2, 2))
plot(aov.out)

Conclusion: We can see that on the upper two graphs the red line is pretty straight. The line on the Q-Q plot is not as straight. However, on the basis of these graphs, we can conclude that the distribution of residuals is close to normal.
anova.res <- residuals(object = aov.out)
describe(anova.res)

##    vars    n mean   sd median trimmed  mad   min  max range  skew kurtosis
## X1    1 2749    0 1.89  -0.12    0.14 1.48 -7.54 2.88 10.42 -0.96      1.5
##      se
## X1 0.04
Conclusion: The absolute values of skew and kurtosis are below 2, so by this rule of thumb the distribution of residuals is approximately normal.
shapiro.test(x = anova.res)

##
## Shapiro-Wilk normality test
##
## data: anova.res
## W = 0.93435, p-value < 2.2e-16
Conclusion: The p-value is extremely small, which indicates a non-normal distribution of residuals (!)
hist(anova.res, main = "Distribution of residuals", xlab = "Residuals", col = "pink", border = "#BC6B97")

Conclusion: By looking at the histogram we can conclude that the residuals are roughly normally distributed.
Overall conclusion: All the checks except the Shapiro-Wilk test suggest that the distribution of residuals is close to normal. With a sample of 2749 observations, the Shapiro-Wilk test detects even small departures from normality, so we rely on the other checks and treat the assumption of normality of residuals as approximately met.
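The disagreement between the Shapiro-Wilk test and the other checks is typical for large samples measured on a coarse scale. The sketch below illustrates this on simulated data (not the ESS data): a normal variable is rounded to integers and clipped to 0-10, with the sample size matching the report; the mean and standard deviation are illustrative assumptions.

```r
set.seed(2019)

# Simulated 0-10 scores: a normal variable rounded to integers and clipped
# to the scale (not ESS data; mean and sd are illustrative assumptions)
n <- 2749
scores <- pmin(pmax(round(rnorm(n, mean = 7.3, sd = 1.9)), 0), 10)

# Moment-based skewness, computed by hand
m <- mean(scores)
s <- sd(scores)
skew <- mean((scores - m)^3) / s^3

# Shapiro-Wilk test on the same data
sw <- shapiro.test(scores)

skew
sw$p.value
```

Even though the skewness stays small, the test rejects normality because the integer scale and the large sample make tiny deviations detectable.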
In the ANOVA test, a significant p-value indicates that the means of some groups differ, but it does not show which pairs of groups these are. To find this out, a post hoc test can be conducted to determine whether the mean differences between specific pairs of groups are statistically significant.
As the variances across the groups are practically equal, we chose the Tukey HSD test for that.
par(mar = c(5, 15, 3, 1))
Tukey <- TukeyHSD(aov.out)
plot(Tukey, las = 2, col = "red")

Conclusion: The test results show that only the difference between the "Very interested" and "Not interested" groups is significant, since the confidence interval for the difference between the means of these two groups is the only one that does not cross the “0” line.
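The rule for reading the Tukey plot can be checked numerically: a pair differs significantly at the 5% level exactly when its 95% interval excludes zero. A minimal sketch on simulated data (the groups `low`, `mid`, `high` and their means are made up for illustration):

```r
set.seed(42)

# Simulated data: three groups of 100, one with a clearly higher mean
g <- factor(rep(c("low", "mid", "high"), each = 100),
            levels = c("low", "mid", "high"))
y <- rnorm(300, mean = rep(c(7.0, 7.1, 8.0), each = 100), sd = 1)

# TukeyHSD() returns, per pair: diff, lwr, upr (95% interval) and p adj
tk <- TukeyHSD(aov(y ~ g))$g

# An interval excludes zero when both of its ends have the same sign
excludes_zero <- tk[, "lwr"] > 0 | tk[, "upr"] < 0

cbind(round(tk, 3), excludes_zero)
```

The same check can be applied to the `Tukey` object in the report by inspecting its `lwr` and `upr` columns instead of reading the plot.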
As could be seen from the boxplot, there are some outliers. Therefore, we want to double-check our results using a non-parametric test.
kruskal.test(politics.11$stflife ~ politics.11$polintr)

##
## Kruskal-Wallis rank sum test
##
## data: politics.11$stflife by politics.11$polintr
## Kruskal-Wallis chi-squared = 12.764, df = 3, p-value = 0.005176
Conclusion: At the 5% significance level, the test confirms the result of the ANOVA test, since the p-value here is less than 0.05.
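The Kruskal-Wallis p-value comes from the upper tail of a chi-squared distribution. As a quick check, using the statistic and degrees of freedom re-entered by hand from the output above:

```r
# Values copied from the kruskal.test() output
kw_stat <- 12.764
kw_df   <- 3   # number of groups minus one

# Upper-tail probability of the chi-squared distribution
kw_p <- pchisq(kw_stat, df = kw_df, lower.tail = FALSE)
kw_p
```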
library(DescTools)
DunnTest(politics.11$stflife ~ politics.11$polintr)

##
##  Dunn's test of multiple comparisons using rank sums : holm
##
##                                     mean.rank.diff   pval
## Hardly interested-Not interested         51.233292 0.4132
## Quite interested-Not interested          57.533139 0.3912
## Very interested-Not interested          188.485145 0.0022 **
## Quite interested-Hardly interested        6.299848 0.8690
## Very interested-Hardly interested       137.251854 0.0476 *
## Very interested-Quite interested        130.952006 0.0476 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: The results of the Dunn test show that, besides the pair of people who are very interested in politics and people who are not interested in politics at all, there are two more pairs of groups whose differences in mean ranks are statistically significant. These are:
Very interested in politics and Hardly interested in politics
Very interested in politics and Quite interested in politics
However, the differences between these pairs are significant only at the 5% significance level (adjusted p = 0.0476), while the difference between the Very interested in politics and Not interested in politics groups is significant at the 1% level (adjusted p = 0.0022). The remaining pairs of groups with different levels of political interest do not differ significantly.
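The significance level reached by each pair can be read off the Holm-adjusted p-values in the Dunn output by comparing them with the thresholds from the "Signif. codes" legend. A small sketch, with the p-values re-entered by hand and shortened pair names introduced for illustration:

```r
# Holm-adjusted p-values copied from the DunnTest() output
p_holm <- c("Hardly-Not" = 0.4132, "Quite-Not" = 0.3912, "Very-Not" = 0.0022,
            "Quite-Hardly" = 0.8690, "Very-Hardly" = 0.0476,
            "Very-Quite" = 0.0476)

# Classify each p-value by the strictest conventional level it passes
level <- cut(p_holm,
             breaks = c(0, 0.001, 0.01, 0.05, 1),
             labels = c("0.1%", "1%", "5%", "not significant"))

data.frame(p_holm, level)
```

One star in the output corresponds to the 5% level and two stars to the 1% level.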
So, answering our research question, we can argue that some groups of Irish people with different levels of interest in politics have different average levels of life satisfaction. To be more precise, according to the Dunn test the following groups differ significantly:
Very interested in politics and Hardly interested in politics people
Very interested in politics and Quite interested in politics people
The difference between people who are very interested in politics and people who are not interested in politics at all is the largest and the most significant one; it is also the only pair flagged by the Tukey test.
People who are quite interested in politics and people who are hardly interested in politics do not have statistically significant differences in life satisfaction. The same goes for these pairs of groups:
Quite interested in politics and Not interested in politics people
Hardly interested in politics and Not interested in politics people
After all these tests and analyses, we can conclude that Irish people who are interested in politics to different extents indeed do not have the same level of life satisfaction. Moreover, our expectations are met: people with the highest political interest are the most satisfied with life.