Authors and individual contribution

Maria Starodubtseva - combining, new regressive models with interpretations, conclusion to the project

Eva Kirilova - combining, correcting previous mistakes with tests, standardized coefficients and a plot, their interpretations

Alexandra Martynova - correcting the previous parts with mistakes, descriptive statistics part, small conclusions

Introduction

In the fourth project we continue to explore the social trust variable and how it can be predicted. Here we aim to test new models that can predict trust in people.

Research topic and questions

The research question remains:

The general topic of our analysis is media consumption and social trust among the population of Austria. In particular, our project group is interested in a) the association between media consumption and levels of social trust among the population of Austria, and b) gender differences and patterns in the usage of the Internet and news, as well as in the level of social trust.

How may consumption of news and the Internet be related to the level of social trust? Do usage of mass media and trust differ between gender groups? Who tends to consume more? And who is more trusting: active media consumers, men or women?

Then we want to check whether there are statistically significant associations between a) the gender of the respondent and how often she/he uses the Internet, b) gender and consumption of news about politics and current affairs in minutes among the population of Austria, c) patterns in usage of the Internet and the level of social trust.

After establishing statistical significance between variables we want to find the influence of different factors on social trust. Therefore, our research questions are “What variables are correlated with the level of trust?” and “What factors can predict the level of trust?”.

To determine which variables are more likely to correlate with social trust, we turned to the theories of other researchers. In the study “Media Effects on Political and Social Trust” Patricia Moy studied how media consumption influences political and social trust. The study found that only social trust depends on media consumption. Another study is “Trusting the State, Trusting Each Other? The Effect of Institutional Trust on Social Trust” (Kim Mannemar Sønderskov and Peter Thisted Dinesen). The researchers studied the dependence between social and political trust and found that the more people trust the government, the more they trust each other. One more theoretical example, “‘Connecting’ and ‘Disconnecting’ With Civic Life: Patterns of Internet Use and the Production of Social Capital” (Dhavan V. Shah), is about the interdependence between Internet use and social capital. Having read these works, we decided to study how a person's social trust may be predicted by such variables as watching news about politics, Internet use, trust in Parliament and interest in politics (project 3). Based on these articles and our knowledge, we think that these variables can be related to social trust to different extents. We also expect trust in Parliament to be the strongest predictor of social trust.

Lastly, in the fourth project we construct new models to predict social trust, again with the help of existing theories and research. The theories are presented in the last section of the combined project.

Hence, our general hypotheses are the following:

  1. Media consumption varies between gender groups in Austria, and it has a positive association with people’s level of trust.

  2. Social trust has a positive correlation with people’s trust in Parliament, and trust in Parliament can predict social trust, taking into account how interested in politics people are.

  3. The predictive power of trust to Parliament will be higher, if people are less educated.

  4. Religion of people in the country can also predict level of trust of people: people of main religion are expected to have higher level of social trust than other religions.

Data and selecting variables

First, we loaded the data, ESS9 (2018) for Austria [1], as well as the packages needed to work with it.

library(foreign)
library(dplyr)
library(ggplot2)

data <- read.spss("ESS9AT.sav", 
                use.value.labels = T, to.data.frame = T)
data <- data %>% select(gndr, netusoft, nwspol, netustm, trstprl, ppltrst, polintr, eduyrs, rlgdnm)

There are 9 variables necessary for our topic and research questions; we identify their types straight away:

Variable gndr: Gender - nominal

Variable nwspol: News about politics and current affairs, watching, reading or listening, in minutes (a day) - ratio

Variable netusoft: Internet use, how often (5 categories) - ordinal

Variable netustm: Internet use, how much time on typical day, in minutes (a day) - ratio

Variable ppltrst: Most people can be trusted or you can’t be too careful (scale from 0 to 10, where 0 - you can’t be too careful and 10 - most people can be trusted) - interval

Variable polintr: Interest in politics - ordinal (4 categories: Very interested, Quite interested, Hardly interested, Not at all interested)

Variable trstprl: How much a respondent trusts the country's Parliament (scale 0-10) - interval

Variable eduyrs: Years of full-time education completed - ratio

Variable rlgdnm: Religion or denomination belonging to at present (6 categories) - nominal

Initially, R reads all of them as factor variables, but some of them need to be converted to numeric. We also convert netusoft to an ordered factor. Polintr, rlgdnm and gndr stay as factors.

data$nwspol <- as.numeric(as.character(data$nwspol))
data$netustm <- as.numeric(as.character(data$netustm))
data$ppltrst <- as.numeric(data$ppltrst) - 1 
data$trstprl <- as.numeric(data$trstprl)- 1

data$netusoft <- ordered(data$netusoft, levels = c("Never", "Only occasionally", "A few times a week", "Most days", "Every day"))

data$eduyrs <- as.numeric(as.character(data$eduyrs))
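
The two conversion patterns used above behave differently, which a small toy example may make clearer. This is only a sketch: the factor values and the endpoint labels below are our assumptions for illustration, not the actual ESS level labels.

```r
# Toy factor whose levels are numbers stored as text (as for nwspol or eduyrs)
f_minutes <- factor(c("30", "120", "60", "120"))
codes  <- as.numeric(f_minutes)                 # internal level codes, not minutes
values <- as.numeric(as.character(f_minutes))   # the actual minutes

# Toy 0-10 scale whose endpoints carry text labels (assumed labels)
trust_levels <- c("You can't be too careful", as.character(1:9),
                  "Most people can be trusted")
f_trust <- factor(c("You can't be too careful", "5", "Most people can be trusted"),
                  levels = trust_levels)
# Level codes run from 1 to 11 in scale order, so subtracting 1 gives 0-10
trust_scores <- as.numeric(f_trust) - 1

print(codes)
print(values)
print(trust_scores)
```

The `- 1` shortcut is only safe when the factor levels are stored in scale order, so it is worth checking `levels()` before relying on it.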

Descriptive part

Let’s look at the variables and some descriptive statistics about them.

summary(data)
##      gndr                    netusoft        nwspol          netustm     
##  Male  :1153   Never             : 453   Min.   :  0.00   Min.   :  5.0  
##  Female:1346   Only occasionally : 112   1st Qu.: 15.00   1st Qu.: 60.0  
##                A few times a week: 182   Median : 30.00   Median :120.0  
##                Most days         : 224   Mean   : 45.79   Mean   :147.6  
##                Every day         :1528   3rd Qu.: 60.00   3rd Qu.:180.0  
##                                          Max.   :600.00   Max.   :600.0  
##                                          NA's   :13       NA's   :761    
##     trstprl         ppltrst                        polintr        eduyrs    
##  Min.   : 0.00   Min.   : 0.000   Very interested      :380   Min.   : 1.0  
##  1st Qu.: 4.00   1st Qu.: 4.000   Quite interested     :919   1st Qu.:11.0  
##  Median : 5.00   Median : 6.000   Hardly interested    :885   Median :12.0  
##  Mean   : 5.42   Mean   : 5.543   Not at all interested:315   Mean   :12.6  
##  3rd Qu.: 7.00   3rd Qu.: 7.000                               3rd Qu.:14.0  
##  Max.   :10.00   Max.   :10.000                               Max.   :32.0  
##  NA's   :51      NA's   :2                                    NA's   :35    
##                           rlgdnm    
##  Roman Catholic              :1546  
##  Protestant                  : 112  
##  Islam                       :  87  
##  Eastern Orthodox            :  40  
##  Other Christian denomination:  21  
##  (Other)                     :  16  
##  NA's                        : 677

For news in minutes the interquartile range is from 15 to 60, for internet use in minutes - 60 - 180, and for level of trust - 4 - 7. Other values of descriptive statistics we will discuss while talking about graphs with distributions.

Also we can find number of observations of variables like gndr, netusoft and polintr: 1153 males and 1346 females, 453 never use the internet, 112 only occasionally, 182 a few times a week, 224 most days and 1528 every day. Finally, 380 are very interested in politics, 919 quite interested, 885 hardly interested and 315 are not interested at all.

Calculating the mode requires a separate function.

# Return the most frequent value of a vector (NA counts as a value)
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}

getmode(data$gndr)
getmode(data$nwspol)
getmode(data$netusoft)
getmode(na.omit(data$netustm))
getmode(data$ppltrst)
getmode(data$polintr)
getmode(data$trstprl)

The modes for categorical variables are obvious just by looking at the number of observations for each category: female, every day and quite interested.

News is most often consumed for 60 minutes a day. For Internet use in minutes the most frequent value is NA (respondents who did not reply or skipped the question), but without NA the mode is 120. For social trust the most frequent answer is 7, and for political trust it is 5.
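
As an alternative, a mode helper can be built on table(), which drops missing values by default and makes ties visible. The helper name get_mode and the toy vector are hypothetical, and this is a sketch rather than the function we used above.

```r
# Hypothetical helper: most frequent non-missing value(s), returned as text
get_mode <- function(v) {
  counts <- table(v)                      # table() ignores NA by default
  names(counts)[counts == max(counts)]    # keeps every tied value
}

minutes <- c(60, 120, 120, 30, NA, NA, NA)  # toy data with many NAs
mode_minutes <- get_mode(minutes)
mode_ties <- get_mode(c("a", "b", "a", "b"))

print(mode_minutes)
print(mode_ties)
```

Because the result comes from the names of the table, it is a character vector and needs as.numeric() if a number is required.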

Let us look at the distributions of our continuous variables:

library(psych)
describe(data)
##           vars    n   mean     sd median trimmed   mad min max range  skew
## gndr*        1 2499   1.54   0.50      2    1.55  0.00   1   2     1 -0.15
## netusoft*    2 2499   3.91   1.58      5    4.13  0.00   1   5     4 -1.02
## nwspol       3 2486  45.79  45.48     30   38.58 29.65   0 600   600  3.15
## netustm      4 1738 147.57 118.76    120  128.65 88.96   5 600   595  1.62
## trstprl      5 2448   5.42   2.28      5    5.51  2.97   0  10    10 -0.32
## ppltrst      6 2497   5.54   2.37      6    5.66  2.97   0  10    10 -0.42
## polintr*     7 2499   2.45   0.90      2    2.44  1.48   1   4     3  0.03
## eduyrs       8 2464  12.60   3.25     12   12.28  1.48   1  32    31  1.18
## rlgdnm*      9 1822   1.43   1.25      1    1.06  0.00   1   8     7  3.19
##           kurtosis   se
## gndr*        -1.98 0.01
## netusoft*    -0.67 0.03
## nwspol       19.57 0.91
## netustm       2.78 2.85
## trstprl      -0.22 0.05
## ppltrst      -0.42 0.05
## polintr*     -0.76 0.02
## eduyrs        2.80 0.07
## rlgdnm*       9.17 0.03

Looking at the descriptive statistics, we can say that the nwspol and netustm variables are highly right-skewed because their skewness values are higher than +1: nwspol (3.15) and netustm (1.62) (Bulmer). Their kurtosis values, nwspol (19.57) and netustm (2.78), show that both distributions are too peaked (Hair). The median and mean show that people tend to spend about 30 (median) to 45.79 (mean) minutes a day on news and 120 (median) to 147.57 (mean) minutes on the Internet. The variables ppltrst and trstprl are closer to a normal distribution because their skewness values are between 0 and -0.5. Their means and medians are approximately equal and lie at the center of the scale: mean 5.42 and median 5 for trstprl, mean 5.54 and median 6 for ppltrst. For polintr, the most frequent category is “Quite interested”.
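
To show what the skew and kurtosis columns measure, here is a minimal base-R sketch using plain moment estimators. The function name moment_shape and the simulated data are our assumptions; psych applies a small-sample correction by default, so its values can differ slightly from these.

```r
# Hypothetical helper: moment-based skewness and excess kurtosis
moment_shape <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  m2 <- mean(d^2)   # second central moment (variance with divisor n)
  m3 <- mean(d^3)   # third central moment
  m4 <- mean(d^4)   # fourth central moment
  c(skew = m3 / m2^1.5, excess_kurtosis = m4 / m2^2 - 3)
}

symmetric <- moment_shape(1:5)           # symmetric and flat toy data
set.seed(1)
right_tail <- moment_shape(rexp(1000))   # simulated data with a long right tail

print(symmetric)
print(right_tail)
```

A positive skew value corresponds to a longer right tail, and a negative excess kurtosis corresponds to a flatter distribution than the normal one.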

Now let’s look at the variables more closely.

Firstly, we want to have a look at the distribution of time devoted to news to identify general trend of the population.

ggplot(data = data) + 
  geom_histogram(aes(x = nwspol), fill = "pink") +
  xlab("Time, in minutes (a day)")  + scale_x_continuous(breaks = seq(0, 600, 50)) +
  ylab("Count") +
  ggtitle("Time devoted to watching, reading or listening\nto news about politics and current affairs") +
    geom_vline(xintercept = 45.79, color = "blue") +  
  theme_minimal() 

The histogram was employed here because we have a continuous variable. On the graph we can see how the time devoted to news is distributed. We can notice from the summary table that the average time devoted to watching, reading or listening to news is approximately 46 minutes. To show that on the graph we used a geom_vline layer. The distribution is highly right-skewed: the mean is larger than the median (median = 30.00), and the longer right tail is visible. We may notice that there are some outliers, people who spend far more time than the mean, median and mode values for the population. As the kurtosis is far larger than +1, the distribution is too peaked, i.e. leptokurtic (Hair et al., 2017).

Now we will take a look at the frequency of the Internet usage among population.

ggplot(data = data) + 
  geom_bar(aes(x = netusoft), fill = "pink", color = "black") + 
  xlab("") +
  ylab("Count") +
  ggtitle("Frequency of the Internet usage") +
  theme_minimal() 

The barplot was used here because we have an ordinal variable with 5 categories. On the graph we can see how frequently people use the Internet (from Never to Every day). It is noticeable that the “every day” category is the largest by count, which means that it is the mode for the population. The smallest is “only occasionally”. The median for the variable is 5, because the “every day” category contains more than half of the observations (1528 of 2499).

Next we look at the distribution of the level of social trust among the population of Austria.

ggplot(data = data) +
geom_histogram(aes(x = ppltrst), fill = "pink", binwidth = 1) +
xlab("Level of trust") + scale_x_continuous(breaks = seq(0, 10, 1)) +
ggtitle("The level of social trust of Austria") +
  geom_vline(xintercept = 5.543, color = "blue") + 
theme_minimal()

The histogram was applied here because we have an interval variable and want to look at its distribution. On the graph we can see how trustful the people of Austria are. We can notice from the summary table that the average level of trust is approximately 5.5. To show that on the graph we used a geom_vline layer. It is also noticeable that the mode is 7. The median is 6, which is above the mean, so the distribution is slightly left-skewed. The kurtosis is about -0.4, meaning that the distribution is not peaked but platykurtic.

skew(data$ppltrst) #skew
## [1] -0.4174875
kurtosi(data$ppltrst) #kurtosis
## [1] -0.4155111

Hence, the level of trust of the population is moderate with some tendency to higher level of trust.

As we looked at the general trend for the population, now we would like to see whether the frequency of the usage of the Internet is different within gender groups.

ggplot(data = data) + 
  geom_bar(aes(x = netusoft, fill = gndr)) + 
  xlab("") +
  ylab("Count") +
  ggtitle("Frequency of the Internet usage by gender") +
  labs(fill = "Gender") +
  coord_flip() +
  theme_minimal() 

For such a purpose we use a stacked barplot, having two categorical variables. From the graph it is seen that there is no striking difference by gender. The share of women and men in the categories is almost 50/50. So from the graph alone we cannot conclude that one gender is using the Internet more frequently than another.

As we saw the distribution of the Internet usage for the whole population, now we want to see if it differs between genders.

ggplot(data = data) +
geom_boxplot(aes(x = netustm, y = gndr), fill = "pink") +
xlab("Time, in minutes (a day)") + scale_x_continuous(breaks = seq(0, 600, 50)) +
ylab(" ") +
ggtitle("Time devoted to Internet usage on typical day by gender") +
theme_minimal()

We know from the table that the median of the Internet use in minutes is 120, and it is equal for both genders. But at the same time, men spend more time on internet usage, in general. The interquartile range for men is much higher than for women.

And now we would like to see whether amount of news consumption differs between genders.

ggplot(data = data) +
geom_boxplot(aes(x = nwspol, y = gndr), fill = "pink") +
xlab("Time, in minutes (a day)") + scale_x_continuous(breaks = seq(0, 600, 50)) +
ylab(" ") +
ggtitle("Time devoted to watching, reading or listening\nto news about politics and current affairs by gender") +
theme_minimal()

The median of both genders is the same again: 30. But the distribution of news consumption differs. Women's news consumption is a little smaller than men's because the maximum points (outliers) for men are higher, and the range between the 1st and 3rd quartile is narrower for women.

Lastly, we want to see the association of level of trust and news consumption. Plus how gender matters here or not.

ggplot(data = data) + 
  geom_point(aes(x = ppltrst, y = nwspol, color = gndr)) + 
  xlab("Level of trust (0-10)") +  scale_x_continuous(breaks = seq(0, 10, 1)) +
  ylab("Time, in minutes (a day)") +
    labs(color = "Gender") +
  ggtitle("Level of social trust and time devoted to news consumption") +
  theme_minimal() 

The scatterplot shows, firstly, that there is no clear association between the level of trust and news consumption. However, there are some points, people for whom the correlation is noticeable, who consume a lot of news and have a higher trust level.

Also, the level of trust and news consumption have no visible association with gender, as the scatterplot is varicolored, and there are no clusters of one color in any particular place on the graph.

Conclusion: descriptive part

All in all, after this analysis, we may conclude that the level of trust is moderate within the population of Austria. The time spent on the Internet does not imply there are more trustful people among the population. Although the majority use the Internet every day, the level of trust is moderate.

Moreover, the patterns of the media consumption seem not to differ much between gender groups. Although, it looks like males use the Internet a bit more. What is more, there are some outliers among people who watch, listen or read the news more than the general population and also they tend to trust others more, so there is some association for them but not for the general population.


Statistical tests

Chi-squared Test

First of all, we would like to learn how gender and frequency of internet consumption might be actually associated. For this we conduct a chi-squared test, having one binary variable - gender and categorical ordinal - Internet consumption frequency.

Our hypothesis from previous descriptive part is that males in Austria use the Internet more frequently compared to females.

We will look at the data using a stacked barplot first.

ggplot(data, 
       aes(x = netusoft, 
           fill =  gndr)) +
  geom_bar(position = "fill") +
  labs(title="",
       x="How often do respondent use the internet",
       y="Shares of observations",
       fill="Gender") +
  ggtitle("Frequency of the Internet usage by gender") +
  coord_flip()+
    theme_bw()

It is visible that females prevail in some categories, like “Never”, “A few times a week”.

Before conducting a chi-squared test we need to check its assumptions.

1) Independence of observations

Respondents answered independently of each other so the independence assumption is met.

2) At least 5 observations per each category

For this we will look at the contingency table of each category of Internet consumption by gender.

table(data$gndr, data$netusoft)
##         
##          Never Only occasionally A few times a week Most days Every day
##   Male     184                53                 74       112       730
##   Female   269                59                108       112       798

The categories have no less than 5 observations - the assumption is met.

The statistical hypothesis for the chi-squared:

H0: there is independence between gender and internet usage

H1: there is dependence between gender and internet usage

chisq.test(data$gndr, data$netusoft)
## 
##  Pearson's Chi-squared test
## 
## data:  data$gndr and data$netusoft
## X-squared = 10.807, df = 4, p-value = 0.02882

There is a statistically significant association between gender and amount of internet usage with X-squared(4) = 10.807 and p-value = 0.02882 (lower than the threshold of 0.05). We reject the null hypothesis, so the gender and internet usage are dependent.

chi <- chisq.test(data$gndr, data$netusoft)

Another assumption is also met: 3) expected counts of 5 or more in at least 80% of the cells

round(chi$expected, 2)
##          data$netusoft
## data$gndr  Never Only occasionally A few times a week Most days Every day
##    Male   209.01             51.68              83.97    103.35       705
##    Female 243.99             60.32              98.03    120.65       823
library(corrplot)
corrplot(chisq.test(data$gndr, data$netusoft)$stdres, is.cor = FALSE)

round(chi$stdres, 2)
##          data$netusoft
## data$gndr Never Only occasionally A few times a week Most days Every day
##    Male   -2.60              0.26              -1.54      1.22      2.06
##    Female  2.60             -0.26               1.54     -1.22     -2.06

According to the tables and the plot of residuals we can say that there are fewer males who never use the Internet than expected, whereas there are more females who never use the Internet than expected.

In addition, there are more males who use the Internet every day than expected and fewer females who use the Internet every day than expected.

As a conclusion, we may say that males tend to use the Internet more, whereas females are more likely not to use it at all.
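
The expected counts, the test statistic and the standardized residuals reported above can be reproduced by hand from the contingency table. The sketch below types in the observed counts from our table; the object names are ours, and the residual formula is the usual standardized (adjusted) Pearson residual.

```r
# Observed counts copied from the contingency table above
observed <- matrix(
  c(184, 53,  74, 112, 730,
    269, 59, 108, 112, 798),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    gndr = c("Male", "Female"),
    netusoft = c("Never", "Only occasionally", "A few times a week",
                 "Most days", "Every day")
  )
)

n <- sum(observed)
# Expected count under independence: row total * column total / N
expected <- outer(rowSums(observed), colSums(observed)) / n

chi_sq  <- sum((observed - expected)^2 / expected)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

# Standardized residuals: (O - E) / sqrt(E * (1 - row share) * (1 - column share))
row_share <- rowSums(observed) / n
col_share <- colSums(observed) / n
std_res <- (observed - expected) /
  sqrt(expected * outer(1 - row_share, 1 - col_share))

print(round(expected, 2))
print(c(chi_sq = chi_sq, df = df, p_value = p_value))
print(round(std_res, 2))
```

Residuals beyond roughly ±2 mark the cells that contribute most to the association, which here are the “Never” and “Every day” categories.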

T-Test

Next we would like to analyze whether consumption of news about politics and current affairs in minutes a day varies between gender groups. For this we conduct an independent samples t-test, having one binary variable (gender) and one continuous variable (news consumption in minutes).

Our hypothesis from the descriptive part is that news consumption will not differ drastically between gender groups, but males will still have a slightly higher amount of news consumption than females.

Firstly, let’s have a look at the distributions of the variable in two gender groups.

ggplot(data = data) +
geom_boxplot(aes(x = nwspol, y = gndr), fill = "pink") +
xlab("Time, in minutes (a day)") +
ylab(" ") +
ggtitle("Time devoted to watching, reading or listening\nto news about politics and current affairs by gender") +
theme_minimal()

It is visible that the boxes are pretty similar, however, male participants have outliers with higher values than females. So we may assume that there will be a difference in means. Let’s check it with the t-test.

Before conducting a test we will check for assumptions first.

1) Independence of observations

Each estimate is independent and belongs to a particular participant of the survey. The answers are independent of each other.

2) Normality of distributions

Already from the boxplot it is obvious that the distributions are right-skewed but we will check it with two other ways.

Skew & Kurtosis:

library(kableExtra)
describeBy(data$nwspol, data$gndr, mat = T) %>%
  select(
    Gender = group1,
    N = n,
    Mean = mean,
    SD = sd,
    Median = median,
    Min = min,
    Max = max,
    Skew = skew,
    Kurtosis = kurtosis,
    st.error = se
  ) %>%
  kable(
    align = c("lrrrrrrrrr"),
    digits = 2,
    row.names = FALSE,
    caption = "News consumption (in minutes) by Gender"
  ) %>%
  kable_styling(bootstrap_options = c("bordered", "responsive", "striped"), full_width = FALSE)
News consumption (in minutes) by Gender
Gender     N   Mean     SD  Median  Min  Max  Skew  Kurtosis  st.error
Male    1148  49.34  49.26      30    0  600  3.58     24.03      1.45
Female  1338  42.74  41.75      30    0  360  2.44     10.04      1.14

The values of skew are high, more than 0.5 (3.58 - male, 2.44 - female); they tell us about highly right-skewed distributions in two gender groups.

The kurtosis estimates are also very high, much larger than 1 (24.03 - male, 10.04 - female); they tell us about highly peaked distributions of the news variable in two groups.

Q-Q plot:

We may also see it on the plot.

ggplot(data, aes(sample = nwspol)) +
  stat_qq() +
  stat_qq_line() +
  facet_grid(~gndr)

The skewness to the right is very visible.

So the assumption for normality is not met. However, we have more than 1000 observations per gender group, so the assumption is not so strict.

3) Equality of variances

library(car)
leveneTest(data$nwspol ~ data$gndr)
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value  Pr(>F)  
## group    1  3.8935 0.04858 *
##       2484                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

With the p-value of 0.04858 (lower than the threshold of 0.05) we may reject the null hypothesis of equal variances. So the assumption of equal variances is also not met. This means that Welch’s t-test is relevant for this case.

The statistical hypothesis for the t-test:

H0: Mean values of news consumption in two gender groups are equal.

H1: Mean values of news consumption in two gender groups are different.

Welch’s T-test

t.test(data$nwspol ~ data$gndr)
## 
##  Welch Two Sample t-test
## 
## data:  data$nwspol by data$gndr
## t = 3.5682, df = 2260.2, p-value = 0.0003669
## alternative hypothesis: true difference in means between group Male and group Female is not equal to 0
## 95 percent confidence interval:
##   2.970739 10.220172
## sample estimates:
##   mean in group Male mean in group Female 
##             49.33537             42.73991

There is a statistically significant association between gender and the amount of news consumption with t(2260.2) = 3.5682 and p-value = 0.0003669 (lower than the threshold of 0.05). The mean values in two gender groups are different.
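
The Welch statistic can also be checked from the group summaries alone. This sketch plugs in the means from the test output and the standard deviations and group sizes from the descriptive table; since the SDs are rounded to two decimals, the result matches the output only approximately.

```r
# Group summaries taken from the tables above (SDs are rounded)
m_male   <- 49.33537; s_male   <- 49.26; n_male   <- 1148
m_female <- 42.73991; s_female <- 41.75; n_female <- 1338

# Squared standard error of each group mean
v_male   <- s_male^2 / n_male
v_female <- s_female^2 / n_female

# Welch t statistic: no pooling of variances
t_welch <- (m_male - m_female) / sqrt(v_male + v_female)

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch <- (v_male + v_female)^2 /
  (v_male^2 / (n_male - 1) + v_female^2 / (n_female - 1))

p_welch <- 2 * pt(abs(t_welch), df_welch, lower.tail = FALSE)

print(c(t = t_welch, df = df_welch, p = p_welch))
```

The non-integer degrees of freedom in the test output come from this approximation.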

Non-parametric t-test

We will double check it with a non-parametric test.

library(coin)
wilcox_test(data$nwspol ~ data$gndr)
## 
##  Asymptotic Wilcoxon-Mann-Whitney Test
## 
## data:  data$nwspol by data$gndr (Male, Female)
## Z = 4.2444, p-value = 2.192e-05
## alternative hypothesis: true mu is not equal to 0

Again there is a statistically significant association between gender and the amount of news consumption; with p-value = 2.192e-05 (lower than the threshold of 0.05) we may reject the null hypothesis that news consumption is distributed identically in the two gender groups. Note that this test compares ranks rather than means, so it supports the parametric test’s result without testing exactly the same hypothesis.

Effect size

Finally, we are interested in the effect size.

library(lsr)
cohensD(nwspol ~ gndr, data = data)
## [1] 0.1453594

The value of the Cohen’s d is equal to 0.1453594, meaning a small effect size. This means that the influence of gender on the news consumption variable is present but small.
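
For reference, Cohen’s d with a pooled standard deviation can be recomputed from the descriptive table. This is a sketch under the assumption that the pooled-SD version is the one reported above; the inputs are the rounded SDs, so small differences are possible.

```r
# Group summaries taken from the tables above (SDs are rounded)
m_male   <- 49.33537; s_male   <- 49.26; n_male   <- 1148
m_female <- 42.73991; s_female <- 41.75; n_female <- 1338

# Pooled standard deviation across the two groups
sd_pooled <- sqrt(((n_male - 1) * s_male^2 + (n_female - 1) * s_female^2) /
                    (n_male + n_female - 2))

# Cohen's d: mean difference in units of the pooled SD
cohens_d <- (m_male - m_female) / sd_pooled

print(cohens_d)
```

In other words, the gender gap of about 6.6 minutes is roughly 0.15 of a standard deviation.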

ANOVA

Lastly, we’d like to see whether groups by interest in politics (4 categories) differ from each other in their trust in Parliament. For this purpose we will apply ANOVA with one continuous variable, “trstprl”, and one ordinal variable (4 categories), “polintr”. So, we will try to answer the question of whether people with different levels of interest in politics trust Parliament equally.

P.S. The results of this test will be later used for constructing regression models

**Again our hypothesis is that

Let us take a look at the distribution of trust to Parliament by interest levels:

data$polintr = as.factor(data$polintr)
ggplot(data = data) +
  geom_boxplot(aes(x = polintr, y = trstprl), fill = "pink") + 
      xlab("Level of interest in politics") + 
      ylab("Trust to Parliament") +
      ggtitle("Distribution of trust to Parliament by interest levels") +
   scale_y_continuous(breaks = seq(0, 10, 1)) +
  theme_minimal()

From the boxplot we can say that median value of trust in Parliament of every category is in the range between 5-6 (mostly the same). Category of “Quite interested” has some number of outliers whereas other categories don’t have outliers at all.

Before conducting the test, we’d like to check its assumptions:

1) Independence of observations

Each observation is independent and belongs to a particular participant. The answers are independent of each other.

2) Homogeneity of variances

leveneTest(data$trstprl ~ data$polintr)
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value    Pr(>F)    
## group    3  10.507 7.261e-07 ***
##       2444                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the Levene’s test on the equality of variances we can conclude that they are not equal (p-value is less than the threshold of 0.05). So, we reject the null hypothesis of the equal variances - the assumption is not met.

3) Normality

It was visible from the boxplot that distributions are a bit skewed in some groups but we will also check it formally:

Skew & Kurtosis:

library(kableExtra)
describeBy(data$trstprl, data$polintr, mat = T) %>%
  select(
    Interest = group1,
    N = n,
    Mean = mean,
    SD = sd,
    Median = median,
    Min = min,
    Max = max,
    Skew = skew,
    Kurtosis = kurtosis,
    st.error = se
  ) %>%
  kable(
    align = c("lrrrrrrrrr"),
    digits = 2,
    row.names = FALSE,
    caption = "Trust in Parliament (0-10) by interest in politics"
  ) %>%
  kable_styling(bootstrap_options = c("bordered", "responsive", "striped"), full_width = FALSE)
Trust in Parliament (0-10) by interest in politics
Interest                 N  Mean    SD  Median  Min  Max   Skew  Kurtosis  st.error
Very interested        378  5.94  2.36       6    0   10  -0.32     -0.35      0.12
Quite interested       911  5.86  2.13       6    0   10  -0.37     -0.06      0.07
Hardly interested      860  5.07  2.12       5    0   10  -0.30     -0.23      0.07
Not at all interested  299  4.45  2.60       5    0   10  -0.10     -0.70      0.15

In each category skewness is between -0.5 and 0.5, which says that the distribution is approximately symmetrical (Bulmer 1979). The kurtosis of each category is less than zero. It means that the distributions are somewhat flat (platykurtic) (Hair et al., 2017). The kurtosis of the “Not at all interested” category is -0.7, the closest one to -1, which means it has the flattest distribution.

The statistical hypothesis for the ANOVA test:

H0: Mean values of trust to Parliament are equal in different groups by interest in politics.

H1: At least one group’s mean is different.

F-test

As we have different variances, we set the var.equal parameter to FALSE.

library(stats)
oneway.test(data$trstprl ~ data$polintr, var.equal = F)
## 
##  One-way analysis of means (not assuming equal variances)
## 
## data:  data$trstprl and data$polintr
## F = 40.754, num df = 3.00, denom df = 888.85, p-value < 2.2e-16

We see from the test that the result is statistically significant, therefore, we can reject the null hypothesis (p-value < 0.05). So, at least one group’s mean is different.

Post-hoc test for parametric ANOVA

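We will use pairwise t-tests with non-pooled standard deviations (pool.sd = F) because we have unequal variances. We intended to apply the Bonferroni correction, but the output below reports the Holm adjustment: pairwise.t.test() takes the correction method through its p.adjust.method argument, so adjust = "bonferroni" had no effect and the default (Holm) was used.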

pairwise.t.test(data$trstprl, data$polintr, adjust = "bonferroni", pool.sd = F)
## 
##  Pairwise comparisons using t tests with non-pooled SD 
## 
## data:  data$trstprl and data$polintr 
## 
##                       Very interested Quite interested Hardly interested
## Quite interested      0.58332         -                -                
## Hardly interested     3.6e-09         3.3e-14          -                
## Not at all interested 2.2e-13         1.9e-15          0.00049          
## 
## P value adjustment method: holm

We see that all but one pair of groups have statistically significant differences between the means. The one with the insignificant difference is the pair of “Quite interested” and “Very interested” levels.
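
Since the output above reports the Holm adjustment, the sketch below shows how the correction method is selected in pairwise.t.test(). It runs on simulated toy data (the data frame toy and its group means are our assumptions, not ESS values), so only the mechanics carry over, not the numbers.

```r
set.seed(42)
# Simulated trust scores for four interest groups (toy data)
toy <- data.frame(
  trust = c(rnorm(40, 6, 2), rnorm(40, 6, 2), rnorm(40, 5, 2), rnorm(40, 4.5, 2.6)),
  interest = factor(rep(c("Very", "Quite", "Hardly", "Not at all"), each = 40),
                    levels = c("Very", "Quite", "Hardly", "Not at all"))
)

# Without p.adjust.method the function falls back to the Holm correction
holm_fit <- pairwise.t.test(toy$trust, toy$interest, pool.sd = FALSE)

# The Bonferroni correction has to be requested through p.adjust.method
bonf_fit <- pairwise.t.test(toy$trust, toy$interest,
                            p.adjust.method = "bonferroni", pool.sd = FALSE)

print(holm_fit$p.adjust.method)
print(bonf_fit$p.adjust.method)
print(round(bonf_fit$p.value, 4))
```

Bonferroni-adjusted p-values are never smaller than Holm-adjusted ones, so rerunning our test this way could only make the comparisons more conservative.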

Now let us double check the results using a non-parametric ANOVA

Non-parametric ANOVA

Before the test, we have written out hypotheses:

The statistical hypothesis for non-parametric ANOVA:

H0: mean ranks for trust in Parliament are the same for different political interest groups.

H1: mean ranks for trust in Parliament are not the same for different political interest groups.

kruskal.test(data$ppltrst ~ data$polintr)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$ppltrst by data$polintr
## Kruskal-Wallis chi-squared = 43.958, df = 3, p-value = 1.54e-09

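The result of the non-parametric test is statistically significant (p-value < 0.05), so we reject the null hypothesis of equal mean ranks in groups with different levels of interest in politics. Note that the call above was run on ppltrst (social trust) rather than trstprl, so it does not directly replicate the parametric ANOVA on trust in Parliament; it shows that social trust also differs between the interest groups.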

We can also do the post-hoc test.

Post-hoc test for non-parametric ANOVA

We will use Games-Howell post-hoc test because we have unequal variances:

library(rstatix)
games_howell_test(ppltrst ~ polintr, data = data)
## # A tibble: 6 x 8
##   .y.    group1     group2     estimate conf.low conf.high    p.adj p.adj.signif
## * <chr>  <chr>      <chr>         <dbl>    <dbl>     <dbl>    <dbl> <chr>       
## 1 ppltr… Very inte… Quite int…  -0.0295   -0.413   0.354    9.97e-1 ns          
## 2 ppltr… Very inte… Hardly in…  -0.494    -0.879  -0.109    5   e-3 **          
## 3 ppltr… Very inte… Not at al…  -0.914    -1.42   -0.410    2.19e-5 ****        
## 4 ppltr… Quite int… Hardly in…  -0.465    -0.739  -0.190    8.13e-5 ****        
## 5 ppltr… Quite int… Not at al…  -0.885    -1.31   -0.458    8.37e-7 ****        
## 6 ppltr… Hardly in… Not at al…  -0.420    -0.848   0.00782  5.7 e-2 ns

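From the post-hoc test we see that all but two pairs of groups have statistically significant differences in means. This result is in line with what we saw in the parametric ANOVA post-hoc test, although here one more pair (“Hardly interested” and “Not at all interested”, adjusted p-value = 0.057) does not differ significantly.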

Effect size

We can look at effect sizes of both parametric and non-parametric tests:

library(effectsize)
omega_squared(oneway.test(data$trstprl ~ data$polintr, var.equal = F))
## Omega2 |       90% CI
## ---------------------
## 0.12   | [0.09, 0.15]
  • The effect size of parametric ANOVA omega squared = 0.12. It is a medium effect size (between 0.06 and 0.14).
library(rstatix)
kruskal_effsize(data = data, trstprl ~ polintr)
## # A tibble: 1 x 5
##   .y.         n effsize method  magnitude
## * <chr>   <int>   <dbl> <chr>   <ord>    
## 1 trstprl  2499  0.0454 eta2[H] small
  • The effect size of non-parametric ANOVA = ~ 0.045, which is small (between 0.01 and 0.06).
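As a rough check, the eta-squared based on the H statistic can be computed by hand. To our understanding, kruskal_effsize() uses the formula (H - k + 1) / (n - k), where k is the number of groups and n the number of observations. The sketch below plugs in the H reported by kruskal.test() above, which was run on ppltrst; the helper name eta2_h and the choice of n = 2497 (the non-missing ppltrst values) are our assumptions.

```r
# Eta-squared based on the Kruskal-Wallis H statistic:
# eta2[H] = (H - k + 1) / (n - k)
eta2_h <- function(H, k, n) {
  (H - k + 1) / (n - k)
}

# H from the kruskal.test() output above; k = df + 1 = 4 groups;
# n = 2497 non-missing ppltrst values (assumed to be the n used)
eta_ppltrst <- eta2_h(H = 43.958, k = 4, n = 2497)
round(eta_ppltrst, 3)
```

The value is about 0.016, smaller than the 0.045 shown above, which is expected because the effect size call uses trstprl while the Kruskal-Wallis call uses ppltrst.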

Conclusion: statistical tests

To conclude, gender has some association with how frequently people use the Internet and how much news they consume: women are a bit more likely to never use the Internet, whereas men tend to use it more often. As for news, male participants consume slightly more (the difference in means was approximately 7 minutes).

In addition, people with different levels of interest in politics generally trust the Parliament differently, except for the “Quite interested” and “Very interested” groups, whose difference is not significant.


Correlation and regression models

After a little more research into articles about the correlation of different variables with social trust, we have chosen certain variables for our analysis.

Correlation

We have chosen 3 continuous variables (netustm, nwspol, trstprl) to see how they are correlated with Trust to people variable. For this purpose we used a correlation matrix:

Spearman’s method is used because from the descriptive statistics we saw that the distributions of variables are not normal.

library(sjPlot)
tab_corr(data[, 3:6], corr.method = "spearman")
         nwspol    netustm   trstprl   ppltrst
nwspol             -0.016    0.131***  0.046
netustm  -0.016              0.029     -0.041
trstprl  0.131***  0.029               0.304***
ppltrst  0.046     -0.041    0.304***
Computed correlation used spearman-method with listwise-deletion.

From the matrix we see that, all in all, 2 correlations are statistically significant: between political news consumption and trust to Parliament, and between trust to Parliament and trust to people. Both correlations are positive, with small-to-medium magnitudes.

So the largest correlation that we received for ‘trust to people’ variable is with ‘trust to parliament’ variable.
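To illustrate what Spearman’s method does, the sketch below uses simulated 0-10 scores (not the ESS data) and shows that Spearman’s coefficient equals Pearson’s coefficient computed on the ranks. The variable names x and y are placeholders.

```r
set.seed(1)
# Simulated 0-10 scores standing in for two trust items (not the ESS data)
x <- sample(0:10, 200, replace = TRUE)
y <- pmin(10, pmax(0, round(0.3 * x + rnorm(200, mean = 4, sd = 2))))

rho_direct <- cor(x, y, method = "spearman")
# rank() assigns average ranks to ties, which is what Spearman's method uses
rho_ranks <- cor(rank(x), rank(y), method = "pearson")

c(direct = rho_direct, via_ranks = rho_ranks)

# cor.test() adds a significance test; exact = FALSE avoids the ties warning
cor.test(x, y, method = "spearman", exact = FALSE)$p.value
```

Because only the ranks enter the calculation, the coefficient is not affected by skewed distributions or outliers in the raw scores.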

Additionally, we will visualize the relationship with the strongest correlation coefficient:

library(ggplot2)
# geom_jitter() already draws the points, so a separate geom_point() is not needed
ggplot(data, aes(x = trstprl, y = ppltrst)) +
  geom_jitter() +
  geom_smooth(method = "lm") +
  labs(title = "Correlation between Trust to the Parliament and Trust to people", x = "Level of trust to Parliament", y = "Level of trust to people") +
  theme_minimal()

The scatter plot of the two trust variables shows a linear relationship: the more a respondent trusts the Parliament, the more s/he trusts the people around her/him.

Regression

In addition to political trust, we assume that interest in politics itself may have an association with social trust, as well.

Let us build a boxplot between Interest in politics (categorical) and Trust to people variables (ordinal):

ggplot(data, aes(x = polintr, y = ppltrst)) +
geom_boxplot(fill = "pink") +
  labs(title = "Relationship between Interest in Politics and Trust to people", x = "", y = "Level of Trust") +
theme_minimal()

We see a slight trend for negative correlation between these two variables. Though the medians seem to be equal for first three categories, their interquartile ranges are different, and observations are concentrated in a descending way starting from the second category.

For our regression model with outcome variable ‘trust to people’ we will use one continuous (trust to parliament) and one categorical (interest in politics) variables.

From the correlation analysis we figured out a very small correlation between news consumption (and even smaller for internet use) and people’s trust. We assume no improvement in model if using nwspol variable. But we can check it.

Models

library(dplyr)
# Keep only rows with complete values on the outcome and the continuous predictors
data1 <- data %>%
  filter(!is.na(ppltrst), !is.na(trstprl), !is.na(nwspol))

Firstly, we will estimate the model 1.

m1 <- lm(ppltrst ~ trstprl, data = data1)
summary(m1)
## 
## Call:
## lm(formula = ppltrst ~ trstprl, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9883 -1.4219  0.2648  1.5781  6.1445 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.85548    0.11809   32.65   <2e-16 ***
## trstprl      0.31328    0.02008   15.60   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.259 on 2432 degrees of freedom
## Multiple R-squared:  0.09099,    Adjusted R-squared:  0.09061 
## F-statistic: 243.4 on 1 and 2432 DF,  p-value: < 2.2e-16

For the first model, the F-statistic is 243.4 on 1 and 2432 DF with p-value < 2.2e-16. The intercept is 3.85548: when the level of trust to parliament is 0, the outcome variable (people’s trust) is predicted to be equal to 3.85548. An increase of the predictor (trust in parliament) by 1 point is associated with an increase of 0.31328 in the outcome variable on average. The adjusted R-squared is 0.09061, so about 9% of the variance in people’s trust is explained by the model. This is very close to the level we treat as satisfactory predictive power (0.1).
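The interpretation of the intercept and the slope can be made concrete with a small worked example. The coefficients are copied from summary(m1); the helper predict_m1 and the chosen trust levels are only illustrative.

```r
# Coefficients copied from summary(m1) above
b0 <- 3.85548  # intercept
b1 <- 0.31328  # slope for trstprl

# Predicted trust to people for a given trust to parliament
predict_m1 <- function(trstprl) b0 + b1 * trstprl

trust_levels <- c(0, 5, 10)
predictions <- predict_m1(trust_levels)
names(predictions) <- paste0("trstprl=", trust_levels)
round(predictions, 2)
```

Across the whole 0-10 range of trust to parliament, the predicted trust to people moves from about 3.9 to about 7.0.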

So next we will build all three models hierarchically.

m2 <- lm(ppltrst ~ trstprl + polintr, data = data1)

m3 <- lm(ppltrst ~ trstprl + nwspol + polintr, data = data1)

Model fit

Let’s compare the models and decide which is the best.

anova(m1, m2, m3)
## Analysis of Variance Table
## 
## Model 1: ppltrst ~ trstprl
## Model 2: ppltrst ~ trstprl + polintr
## Model 3: ppltrst ~ trstprl + nwspol + polintr
##   Res.Df   RSS Df Sum of Sq      F   Pr(>F)   
## 1   2432 12411                                
## 2   2429 12344  3    67.515 4.4275 0.004128 **
## 3   2428 12342  1     2.150 0.4229 0.515545   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model 2 fits the best: it improves significantly on model 1 (F = 4.4275, p-value = 0.004128), whereas model 3 does not improve on model 2 (F = 0.4229, p-value = 0.515545). As we thought, the ‘consumption of news’ variable does not improve the model significantly.
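The F statistics in the table can be reproduced by hand from the residual sums of squares. When several nested models are passed to anova(), the denominator is the residual mean square of the largest model (here m3). The numbers are copied from the table above, so small rounding differences are expected.

```r
# Values copied from the anova(m1, m2, m3) table above
rss_full <- 12342   # residual sum of squares of the largest model (m3)
df_full  <- 2428    # residual degrees of freedom of m3
mse_full <- rss_full / df_full

# F = (drop in RSS / number of added parameters) / residual mean square of m3
f_m2_vs_m1 <- (67.515 / 3) / mse_full
f_m3_vs_m2 <- (2.150 / 1) / mse_full

round(c(m2_vs_m1 = f_m2_vs_m1, m3_vs_m2 = f_m3_vs_m2), 2)

# p-value of the first comparison from the F distribution
pf(f_m2_vs_m1, df1 = 3, df2 = df_full, lower.tail = FALSE)
```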

The best model

tab_model(m2, show.ci = F)
                                   ppltrst
Predictors                         Estimates   p
(Intercept)                        4.07        <0.001
trstprl                            0.30        <0.001
polintr [Quite interested]         0.02        0.887
polintr [Hardly interested]        -0.21       0.142
polintr [Not at all interested]    -0.50       0.005
Observations                       2434
R2 / R2 adjusted                   0.096 / 0.094

Trust to people = 4.07 + 0.30 * trstprl + 0.02 * Quite interested - 0.21 * Hardly interested - 0.50 * Not at all interested

Reference level - Very interested

When the level of trust to parliament is 0 and a person is very interested in politics, the outcome variable (people’s trust) is predicted to be equal to 4.07.

If other variables are constant, an increase of trust in parliament by 1 point is associated with an increase of 0.30 in social trust on average.

If other variables are constant, people who are quite interested in politics score 0.02 higher in ppltrst than very interested people on average (the difference is not significant).

If other variables are constant, people who are hardly interested in politics score 0.21 lower in ppltrst than very interested people on average (the difference is not significant).

If other variables are constant, people who are not at all interested in politics score 0.50 lower in ppltrst than very interested people on average (the difference is significant).
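How the reference level enters the model can be sketched with the design matrix R builds for a factor. The coefficients are the rounded estimates from the m2 table; the data frame newdata is a hypothetical set of four respondents with trust to parliament equal to 0, one per interest category.

```r
# Factor with "Very interested" as the first (reference) level, as in m2
lv <- c("Very interested", "Quite interested",
        "Hardly interested", "Not at all interested")
newdata <- data.frame(trstprl = 0, polintr = factor(lv, levels = lv))

# Design matrix: intercept, trstprl, and three 0/1 dummies for polintr
X <- model.matrix(~ trstprl + polintr, data = newdata)

# Rounded coefficients copied from the tab_model(m2) table
beta <- c(4.07, 0.30, 0.02, -0.21, -0.50)

# Predicted trust to people for each interest category
pred <- drop(X %*% beta)
names(pred) <- lv
pred
```

The reference category gets no dummy variable, so its prediction is the intercept alone, and every other category is shifted by its own coefficient.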

Conclusion: correlation + regression

As a result of our analysis, people’s trust correlates the most with people’s trust in their parliament. In addition, we assumed that interest in politics itself can also affect the outcome variable. The relationship between political and social trust was described in the articles that we cite, so the regression model that we obtained is logical and theoretically backed up.

To conclude, it appeared that social trust is associated more with political trust, than with consumption of media, such as News or the Internet in Austria. So the regression model was built with two variables as predictors - trust in parliament and interest in politics.

Regression with moderation

We continue to explore people’s trust variable, and how it can be predicted. In this project we aim to check new models that can predict trust to people. The research question remains: “What factors can predict the level of trust the best?”

From the above results we would like also to check if moderation of the effect of political trust on people’s trust by interest in politics is better than just additive model (m3 from previous section). The assumption is that the moderation can take place, because if people are not aware of political affairs, and they aren’t interested in politics, there might be no effect of trust in parliament on trust to people. But if they are interested in politics, the effect will be stronger. In addition, the amount of news consumed by people will have an effect on trust, as well. So we include all three predictors in a new model.

Another suggestion is that education of people might moderate the effect of trust to parliament on trust to people, taking into account people’s religious denomination. Our hypothesis is that people who trust to the Parliament will tend to trust other people less if they are educated. We assume that educated people can distinguish between these two kinds of trust.

In the article “Still a new democracy? Individual-level effects of social trust on political trust in South Korea”(Juheon Lee, 2018) we can find theoretical background for our research hypothesis. It’s about a relationship between social and political trust with an education as a moderator. Researchers found that with the increase in the level of education social trust is increasing, too, whereas when education is included, it has a reverse effect on the relationship: so, with the increase in political trust, social trust decreases.

Also, to find how religiosity affects social trust we found the theory from the study “Does Religiosity Promote or Discourage Social Trust? Evidence from Cross-Country and Cross State Comparisons”by Niclas Berggren and Christian Bjørnskov. The researchers wanted to study if religiosity has a positive influence on social trust. But as a result they found only negative influence of religiosity on social trust - it means if religiosity is lower the trust level would be higher.

Let’s check the above hypotheses.

Descriptive statistics:

table(data$rlgdnm)
## 
##                Roman Catholic                    Protestant 
##                          1546                           112 
##              Eastern Orthodox  Other Christian denomination 
##                            40                            21 
##                        Jewish                         Islam 
##                             3                            87 
##             Eastern religions Other Non-Christian religions 
##                            11                             2
describe(data)
##           vars    n   mean     sd median trimmed   mad min max range  skew
## gndr*        1 2499   1.54   0.50      2    1.55  0.00   1   2     1 -0.15
## netusoft*    2 2499   3.91   1.58      5    4.13  0.00   1   5     4 -1.02
## nwspol       3 2486  45.79  45.48     30   38.58 29.65   0 600   600  3.15
## netustm      4 1738 147.57 118.76    120  128.65 88.96   5 600   595  1.62
## trstprl      5 2448   5.42   2.28      5    5.51  2.97   0  10    10 -0.32
## ppltrst      6 2497   5.54   2.37      6    5.66  2.97   0  10    10 -0.42
## polintr*     7 2499   2.45   0.90      2    2.44  1.48   1   4     3  0.03
## eduyrs       8 2464  12.60   3.25     12   12.28  1.48   1  32    31  1.18
## rlgdnm*      9 1822   1.43   1.25      1    1.06  0.00   1   8     7  3.19
##           kurtosis   se
## gndr*        -1.98 0.01
## netusoft*    -0.67 0.03
## nwspol       19.57 0.91
## netustm       2.78 2.85
## trstprl      -0.22 0.05
## ppltrst      -0.42 0.05
## polintr*     -0.76 0.02
## eduyrs        2.80 0.07
## rlgdnm*       9.17 0.03

First, we look at the rlgdnm variable, which includes 8 categories: Roman Catholic, Protestant, Eastern Orthodox, Other Christian denomination, Jewish, Islam, Eastern religions and Other Non-Christian religions. Roman Catholic has by far the largest number of observations (1546), because this denomination prevails in Austria, whereas Jewish (3) and Other Non-Christian religions (2) have the fewest. The number of observations differs considerably across groups.

Next, we look at ppltrst, whose mean (5.54) and median (6) are close to each other. The skewness (-0.42) is between -0.5 and 0.5, which indicates that the distribution is approximately symmetrical. The kurtosis is -0.42, which means that the distribution is slightly flatter than a normal one.

Next is trstprl, with mean (5.42) and median (5). Its kurtosis (-0.22) indicates a slightly flat distribution, and its skewness (-0.32) is close to that of a normal distribution.

The distribution of eduyrs, with mean (12.60) and median (12), is clearly skewed to the right (1.18) and at the same time peaked (kurtosis 2.80).

So, the distributions of the two trust variables are close to normal, whereas that of years of education is not.

Next let us center all the continuous predictors (nwspol, trstprl, eduyrs) for better interpretation of the intercept and the whole models.

data$nwspol_c <- as.numeric(scale(data$nwspol, center = T, scale = F))
data$trstprl_c <- as.numeric(scale(data$trstprl, center = T, scale = F))
data$eduyrs_c <- as.numeric(scale(data$eduyrs, center = T, scale = F))

summary(data)
##      gndr                    netusoft        nwspol          netustm     
##  Male  :1153   Never             : 453   Min.   :  0.00   Min.   :  5.0  
##  Female:1346   Only occasionally : 112   1st Qu.: 15.00   1st Qu.: 60.0  
##                A few times a week: 182   Median : 30.00   Median :120.0  
##                Most days         : 224   Mean   : 45.79   Mean   :147.6  
##                Every day         :1528   3rd Qu.: 60.00   3rd Qu.:180.0  
##                                          Max.   :600.00   Max.   :600.0  
##                                          NA's   :13       NA's   :761    
##     trstprl         ppltrst                        polintr        eduyrs    
##  Min.   : 0.00   Min.   : 0.000   Very interested      :380   Min.   : 1.0  
##  1st Qu.: 4.00   1st Qu.: 4.000   Quite interested     :919   1st Qu.:11.0  
##  Median : 5.00   Median : 6.000   Hardly interested    :885   Median :12.0  
##  Mean   : 5.42   Mean   : 5.543   Not at all interested:315   Mean   :12.6  
##  3rd Qu.: 7.00   3rd Qu.: 7.000                               3rd Qu.:14.0  
##  Max.   :10.00   Max.   :10.000                               Max.   :32.0  
##  NA's   :51      NA's   :2                                    NA's   :35    
##                           rlgdnm        nwspol_c        trstprl_c      
##  Roman Catholic              :1546   Min.   :-45.79   Min.   :-5.4199  
##  Protestant                  : 112   1st Qu.:-30.79   1st Qu.:-1.4199  
##  Islam                       :  87   Median :-15.79   Median :-0.4199  
##  Eastern Orthodox            :  40   Mean   :  0.00   Mean   : 0.0000  
##  Other Christian denomination:  21   3rd Qu.: 14.21   3rd Qu.: 1.5801  
##  (Other)                     :  16   Max.   :554.21   Max.   : 4.5801  
##  NA's                        : 677   NA's   :13       NA's   :51       
##     eduyrs_c       
##  Min.   :-11.5966  
##  1st Qu.: -1.5966  
##  Median : -0.5966  
##  Mean   :  0.0000  
##  3rd Qu.:  1.4034  
##  Max.   : 19.4034  
##  NA's   :35

Now, when trust to Parliament = 0, it means an “average trust to Parliament”, and when trust to Parliament = -1/1, it means trust is “one point below or above average”.

When time watching political news = 0, it means “average time spent watching political news”. When this time = -1/1, it means time spent watching news is “one minute below or above average”.

When years of education = 0, it means the “average number of years of completed education”. When this number is -1/1, it means it is “one year below/above the average number of completed years of education”.
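What centering changes, and what it leaves untouched, can be seen in a small sketch on simulated data (not the ESS data). The variable names x, y and x_c are placeholders.

```r
set.seed(42)
# Simulated data standing in for a 0-10 predictor and outcome (not the ESS data)
x <- runif(300, min = 0, max = 10)
y <- 4 + 0.3 * x + rnorm(300, sd = 2)

# Centering subtracts the mean, so 0 now stands for the average value of x
x_c <- as.numeric(scale(x, center = TRUE, scale = FALSE))

fit_raw      <- lm(y ~ x)
fit_centered <- lm(y ~ x_c)

# The slope is identical; only the intercept changes its meaning
coef(fit_raw)
coef(fit_centered)

# Intercept of the centered model = predicted y at the mean of x
# (in a simple regression this equals mean(y))
c(intercept_centered = unname(coef(fit_centered)[1]), mean_y = mean(y))
```

This is why centering makes the intercept interpretable, and why it matters even more in models with an interaction, where each main effect is evaluated at the value 0 of the other predictor.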

Interaction models

We have filtered the data to remove NA’s, so that the models are fitted on the same observations and can be compared.

# Keep rows with complete values on the outcome and both centered predictors
data2 <- data %>%
  filter(!is.na(ppltrst), !is.na(trstprl_c), !is.na(nwspol_c))

summary(data2)
##      gndr                    netusoft        nwspol         netustm     
##  Male  :1138   Never             : 438   Min.   :  0.0   Min.   :  5.0  
##  Female:1296   Only occasionally : 108   1st Qu.: 15.0   1st Qu.: 60.0  
##                A few times a week: 176   Median : 30.0   Median :120.0  
##                Most days         : 221   Mean   : 46.1   Mean   :147.8  
##                Every day         :1491   3rd Qu.: 60.0   3rd Qu.:180.0  
##                                          Max.   :600.0   Max.   :600.0  
##                                                          NA's   :735    
##     trstprl          ppltrst                        polintr        eduyrs     
##  Min.   : 0.000   Min.   : 0.000   Very interested      :376   Min.   : 1.00  
##  1st Qu.: 4.000   1st Qu.: 4.000   Quite interested     :909   1st Qu.:11.00  
##  Median : 5.000   Median : 6.000   Hardly interested    :855   Median :12.00  
##  Mean   : 5.421   Mean   : 5.554   Not at all interested:294   Mean   :12.63  
##  3rd Qu.: 7.000   3rd Qu.: 7.000                               3rd Qu.:14.00  
##  Max.   :10.000   Max.   :10.000                               Max.   :32.00  
##                                                                NA's   :33     
##                           rlgdnm        nwspol_c          trstprl_c        
##  Roman Catholic              :1511   Min.   :-45.7856   Min.   :-5.419935  
##  Protestant                  : 109   1st Qu.:-30.7856   1st Qu.:-1.419935  
##  Islam                       :  81   Median :-15.7856   Median :-0.419935  
##  Eastern Orthodox            :  38   Mean   :  0.3155   Mean   : 0.001183  
##  Other Christian denomination:  20   3rd Qu.: 14.2144   3rd Qu.: 1.580065  
##  (Other)                     :  16   Max.   :554.2144   Max.   : 4.580065  
##  NA's                        : 659                                         
##     eduyrs_c        
##  Min.   :-11.59659  
##  1st Qu.: -1.59659  
##  Median : -0.59659  
##  Mean   :  0.03398  
##  3rd Qu.:  1.40341  
##  Max.   : 19.40341  
##  NA's   :33

Next, we have built additive and interaction models and compared them by model fit.

int_m1 <- lm(ppltrst ~ trstprl_c, data = data2)

int_m2 <- lm(ppltrst ~ nwspol_c + trstprl_c + polintr, data = data2)

int_m3 <- lm(ppltrst ~ nwspol_c + trstprl_c * polintr, data = data2)

The second model is additive one, and the third is with a moderation.

Compare model fit:

anova(int_m1, int_m2, int_m3)
## Analysis of Variance Table
## 
## Model 1: ppltrst ~ trstprl_c
## Model 2: ppltrst ~ nwspol_c + trstprl_c + polintr
## Model 3: ppltrst ~ nwspol_c + trstprl_c * polintr
##   Res.Df   RSS Df Sum of Sq      F   Pr(>F)   
## 1   2432 12411                                
## 2   2428 12342  4    69.664 3.4261 0.008421 **
## 3   2425 12327  3    14.265 0.9354 0.422645   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

As a result, model 3 is not better than model 2, the p-value is higher than the threshold, so the moderation effect of interest in politics is insignificant. Besides, as in the case with an additive regression model “nwspol” variable had no significant effect on people’s trust.

Hence, we will turn to the second hypothesis and change “nwspol” and “polintr” variables. We assume that the moderation by education years of the effect of trust to parliament on trust to people will be significant. We also think that religious denomination will be a significant predictor. So, we will use two continuous predictors, which are centered, and one categorical.

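# data3 keeps complete cases on all variables of the second hypothesis
data3 <- data %>%
  filter(!is.na(ppltrst), !is.na(trstprl_c), !is.na(eduyrs_c), !is.na(rlgdnm))
# The reported models are fitted on data2; lm() itself drops rows with missing rlgdnm or eduyrs_c
int_m4 <- lm(ppltrst ~ rlgdnm + trstprl_c + eduyrs_c, data = data2)
int_m5 <- lm(ppltrst ~ rlgdnm + trstprl_c * eduyrs_c, data = data2)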

summary(int_m5)
## 
## Call:
## lm(formula = ppltrst ~ rlgdnm + trstprl_c * eduyrs_c, data = data2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9085 -1.4532  0.2565  1.5625  6.6364 
## 
## Coefficients:
##                                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                          5.617550   0.057344  97.962  < 2e-16 ***
## rlgdnmProtestant                     0.140873   0.219881   0.641   0.5218    
## rlgdnmEastern Orthodox              -0.802087   0.365649  -2.194   0.0284 *  
## rlgdnmOther Christian denomination  -0.954986   0.520831  -1.834   0.0669 .  
## rlgdnmJewish                         0.217497   1.271704   0.171   0.8642    
## rlgdnmIslam                         -0.519851   0.254264  -2.045   0.0411 *  
## rlgdnmEastern religions             -0.067705   0.664967  -0.102   0.9189    
## rlgdnmOther Non-Christian religions  1.277231   1.556592   0.821   0.4120    
## trstprl_c                            0.280848   0.024071  11.668  < 2e-16 ***
## eduyrs_c                             0.071162   0.017658   4.030 5.82e-05 ***
## trstprl_c:eduyrs_c                  -0.015759   0.007266  -2.169   0.0302 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.196 on 1741 degrees of freedom
##   (682 observations deleted due to missingness)
## Multiple R-squared:  0.09678,    Adjusted R-squared:  0.09159 
## F-statistic: 18.65 on 10 and 1741 DF,  p-value: < 2.2e-16

From the summary we see that the p-value of the interaction term (0.0302) is lower than the threshold (0.05), hence the moderation is significant in the model. Trust to parliament and years of education are also significant (p-value < 0.05). In the case of religious denomination, however, only two categories (Eastern Orthodox and Islam) differ significantly from the reference category.

Model fit

anova(int_m4, int_m5)
## Analysis of Variance Table
## 
## Model 1: ppltrst ~ rlgdnm + trstprl_c + eduyrs_c
## Model 2: ppltrst ~ rlgdnm + trstprl_c * eduyrs_c
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1   1742 8420.0                              
## 2   1741 8397.3  1    22.691 4.7044 0.03022 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now let us look at additive and interaction models presented next to each other:

library(sjPlot)
tab_model(int_m4, int_m5, show.ci = F)
                                         ppltrst               ppltrst
Predictors                               Estimates  p          Estimates  p
(Intercept)                              5.61       <0.001     5.62       <0.001
rlgdnm [Protestant]                      0.14       0.523      0.14       0.522
rlgdnm [Eastern Orthodox]                -0.81      0.027      -0.80      0.028
rlgdnm [Other Christian denomination]    -0.95      0.069      -0.95      0.067
rlgdnm [Jewish]                          0.34       0.792      0.22       0.864
rlgdnm [Islam]                           -0.53      0.036      -0.52      0.041
rlgdnm [Eastern religions]               -0.10      0.876      -0.07      0.919
rlgdnm [Other Non-Christian religions]   1.32       0.398      1.28       0.412
trstprl_c                                0.29       <0.001     0.28       <0.001
eduyrs_c                                 0.06       <0.001     0.07       <0.001
trstprl_c * eduyrs_c                                           -0.02      0.030
Observations                             1752                  1752
R2 / R2 adjusted                         0.094 / 0.090         0.097 / 0.092

Comparing the additive model and the moderation model, F = 4.7044 and the p-value is 0.03022, lower than the threshold, making the difference significant and the moderation model better.

Also, the additive model explains 9.0% of the variance (adjusted R-squared is 0.090), and the moderation model 9.2% (adjusted R-squared is 0.092). Both values are slightly below the level of satisfactory predictive power (0.1), but for the moderation model the value is higher. This supports the conclusion that the moderation model is better than the additive one.
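The adjusted R-squared can be checked by hand from the multiple R-squared, the number of observations n and the number of estimated coefficients p (excluding the intercept). The helper adj_r2 is ours; the inputs are copied from summary(int_m5) and summary(m1).

```r
# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# int_m5: R2 = 0.09678, 1752 observations, 10 coefficients besides the intercept
adj_int_m5 <- adj_r2(r2 = 0.09678, n = 1752, p = 10)

# m1: R2 = 0.09099, 2434 observations, 1 predictor
adj_m1 <- adj_r2(r2 = 0.09099, n = 2434, p = 1)

round(c(int_m5 = adj_int_m5, m1 = adj_m1), 4)
```

Both results agree with the R output up to rounding, and both stay below 0.1.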

So we will use model “int_m5” for our further analysis.

Interpretation of the model

Trust to people = 5.62 + 0.14 * Protestant - 0.80 * Eastern Orthodox - 0.95 * Other Christian denomination + 0.22 * Jewish - 0.52 * Islam - 0.07 * Eastern religions + 1.28 * Other Non-Christian religions + 0.28 * Trust to parliament + 0.07 * Years of education - 0.02 * Trust to parliament * Years of education

Reference level - Roman Catholic

Significant: Eastern Orthodox, Islam, Trust to parliament, Years of education, Trust to parliament: Years of education

The regression table shows that, in the moderation model, a Roman Catholic respondent with mean trust to parliament and mean years of education has a predicted trust to people of 5.62.

With mean years of education and mean trust to parliament, Eastern Orthodox respondents score on average 0.80 lower in trust to people than Roman Catholics.

With mean years of education and mean trust to parliament, Muslim respondents score on average 0.52 lower in trust to people than Roman Catholics.

With mean years of education and religious denomination held constant, a 1-point increase in trust to parliament is associated with a 0.28-point increase in trust to people. For each additional year of education, this slope decreases by about 0.02, so the effect of trust to parliament becomes weaker as education increases.

With mean trust to parliament and religious denomination held constant, each additional year of education is associated with a 0.07-point increase in trust to people on average.
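The moderation can be made concrete by computing the slope of trust to parliament at several education levels. The coefficients are copied from summary(int_m5); the helper slope_at and the chosen education values are only illustrative.

```r
# Coefficients copied from summary(int_m5)
b_trust <- 0.280848    # slope of trstprl_c at mean education
b_inter <- -0.015759   # change in that slope per extra year of education

# Slope of trust to parliament at a given (centered) education value
slope_at <- function(eduyrs_c) b_trust + b_inter * eduyrs_c

edu <- c(-5, 0, 5)  # 5 years below the mean, the mean, 5 years above the mean
slopes <- slope_at(edu)
names(slopes) <- paste0("eduyrs_c=", edu)
round(slopes, 3)

# Centered education value at which the slope would reach zero
zero_point <- -b_trust / b_inter
round(zero_point, 1)
```

The slope stays positive over most of the observed range and would reach zero only at roughly 17.8 years above the mean, close to the maximum centered value in the data summary (19.4).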

Standardised coefficients

We will calculate and compare the strength of the effect of education years and trust to Parliament to the outcome (trust to people).

library(lm.beta)
lm.beta(int_m5)
## 
## Call:
## lm(formula = ppltrst ~ rlgdnm + trstprl_c * eduyrs_c, data = data2)
## 
## Standardized Coefficients::
##                         (Intercept)                    rlgdnmProtestant 
##                         0.000000000                         0.014707863 
##              rlgdnmEastern Orthodox  rlgdnmOther Christian denomination 
##                        -0.050062679                        -0.041803940 
##                        rlgdnmJewish                         rlgdnmIslam 
##                         0.003903630                        -0.046827427 
##             rlgdnmEastern religions rlgdnmOther Non-Christian religions 
##                        -0.002321528                         0.018722462 
##                           trstprl_c                            eduyrs_c 
##                         0.269327315                         0.094862328 
##                  trstprl_c:eduyrs_c 
##                        -0.050768091

In the output we see that the standardized coefficient of trust to parliament (centered) is larger in absolute value than that of education years (centered), and both are positive. Therefore, trust to Parliament has a stronger effect on trust to people.

The model shows that with every increase of one standard deviation in trust to Parliament, a person’s trust to people rises by ~ 0.27 standard deviations, assuming the other variable (education years) is held constant. With an increase of one standard deviation in education years, trust to people rises by ~ 0.09 standard deviations, assuming trust to Parliament is held constant.
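A standardized coefficient is the raw coefficient rescaled by the standard deviations of the predictor and the outcome. The sketch below shows this on simulated data (not the ESS data) for a model without an interaction term; all variable names are placeholders.

```r
set.seed(7)
# Simulated predictors and outcome (not the ESS data)
x1 <- rnorm(500, mean = 5, sd = 2)
x2 <- rnorm(500, mean = 12, sd = 3)
y  <- 3 + 0.3 * x1 + 0.07 * x2 + rnorm(500, sd = 2)

fit <- lm(y ~ x1 + x2)

# Standardized coefficient = raw coefficient * sd(predictor) / sd(outcome)
beta_manual <- coef(fit)[-1] * c(sd(x1), sd(x2)) / sd(y)

# The same values come from a model fitted on z-scores
z <- data.frame(y  = as.numeric(scale(y)),
                x1 = as.numeric(scale(x1)),
                x2 = as.numeric(scale(x2)))
beta_z <- coef(lm(y ~ x1 + x2, data = z))[-1]

round(rbind(manual = beta_manual, z_scores = beta_z), 3)
```

Because both predictors end up on the same scale, their standardized coefficients can be compared directly, which is not possible with the raw ones.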

The interaction plot

plot_model(int_m5,
           type = "int",
           title = "Predicted values of Trust to People",
           legend.title = "Education years (centered)")

This interaction plot shows the predicted values of ppltrst given trust to Parliament at the minimal and maximal values of education years.

  • At the minimal observed education years, the higher the respondent’s trust to the Parliament, the higher his or her trust to people.

  • At the maximal observed education years, the higher the trust to the Parliament, the higher the trust to people (although the trend is very small).

  • The two lines can be told apart only as long as their confidence intervals do not overlap. So, at higher levels of trust to Parliament (~ 1.8 points above the average value), there is no distinguishable difference in predicted trust to people by education years.

Conclusions

As a result, the moderation model has three predictors of social trust: trust to Parliament, years of education and religious denomination. The strongest predictor of social trust is people’s trust to their Parliament. For people with few years of education this relationship is strong, whereas for those with many years of education it almost disappears. Religious denomination also matters: respondents of some denominations other than the main one in Austria (Roman Catholic) trust people less, and the differences are statistically significant for Islam and Eastern Orthodox.

Conclusions for the project

To sum up, we have analyzed the media consumption in Austria and the level of social trust in the country. We revealed that male respondents tend to use the Internet and consume News more than females. We tried to use media consumption as a predictor for the level of social trust, but eventually, after checking the correlation coefficients and comparing various regression models, we discovered that the level of social trust correlates more with political trust of people, in particular trust to their Parliament. The theory helped us understand this phenomenon. Finally, the best models contain 1)trust to Parliament and interest in politics, 2)religion and trust to Parliament moderated by years of completed education. From the models we found out that:

References:

  1. European Social Survey Round 9 Data (2018). Data file edition 3.1. NSD - Norwegian Centre for Research Data, Norway - Data Archive and distributor of ESS data for ESS ERIC
  2. “Media Effects on Political and Social Trust” (Patricia Moy, 2000)
  3. “Trusting the State, Trusting Each Other? The Effect of Institutional Trust on Social Trust” (Kim Mannemar Sønderskov and Peter Thisted Dinesen, 2016)
  4. ““Connecting” and “Disconnecting” With Civic Life: Patterns of Internet Use and the Production of Social Capital” (Dhavan V. Shah, 2010)
  5. “Still a new democracy? Individual-level effects of social trust on political trust in South Korea” (Juheon Lee, 2018)
  6. “Does Religiosity Promote or Discourage Social Trust? Evidence from Cross-Country and Cross-State Comparisons” (Niclas Berggren and Christian Bjørnskov, 2009)