Our team chose the theme “Media and Social Trust in Russia”. We are interested in this topic because many people in modern Russia, especially young people, are becoming less and less confident in the truthfulness of the information they receive from media such as the Internet and TV. We suppose that the political situation, both inside the country and in its international relations, only exacerbates this. Analysing data from 2016 gives us reasonably recent information. Statistical analysis matters for studying media and social trust because drawing valid conclusions requires data on the whole country, and such conclusions can be useful in politics, the economy, the social sphere and, of course, science. Our personal motivation is simple: we live in Russia and want to know to what extent the people around us trust other people’s words, the media, and the news.
Team: Bochkareva Ksenia, Vlasova Anastasia, Smorzh Alisa, Shestokova Anna; BSC 172
library(dplyr)
library(readr)
library(ggplot2)
library(wesanderson)
library(psych)
library(sjPlot)
library(foreign)
library(scales)
library(formattable)
We have chosen 8 variables for our analysis.
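# Load the ESS round 8 file for Russia; value labels are imported as factor levels
ESS8RU <- read.spss("ESS8RU.sav", use.value.labels = TRUE, to.data.frame = TRUE)
# Keep only the eight variables used in this report
rus = ESS8RU %>% dplyr::select(nwspol, netusoft, netustm, ppltrst, pplfair, pplhlp, gndr, agea)
# Copy with incomplete rows dropped (made before the type conversions below)
rus_noNA = rus %>% na.omit()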
rus$nwspol = as.numeric(rus$nwspol)
rus$netusoft = as.factor(rus$netusoft)
rus$netustm = as.numeric(rus$netustm)
rus$ppltrst = as.factor(rus$ppltrst)
rus$pplfair = as.factor(rus$pplfair)
rus$pplhlp = as.factor(rus$pplhlp)
rus$gndr = as.factor(rus$gndr)
rus$agea = as.numeric(rus$agea)
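One detail of these conversions is worth checking. The sketch below uses a made-up toy vector (not the ESS data) to show that `as.numeric()` applied to a factor returns the internal level codes rather than the recorded values. Whether this affects `nwspol`, `netustm` or `agea` depends on how `read.spss()` imported them, so it may help to inspect `class(rus$nwspol)` before converting.

```r
# Hypothetical toy column: minutes stored as a factor, as can happen
# when a labelled SPSS variable is imported with use.value.labels = TRUE.
minutes <- factor(c("30", "60", "120", "60"))

# as.numeric() on a factor returns the internal level codes, not the minutes.
codes <- as.numeric(minutes)

# Converting through as.character() first recovers the recorded values.
values <- as.numeric(as.character(minutes))

print(codes)
print(values)
```

If a column also contains text labels, the second conversion turns those entries into `NA` with a warning, which makes them easy to spot.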
Here the variables and their types are identified:
Variables <- c("nwspol: News about politics and current affairs, watching, reading or listening","netustm: Internet use, how much time on typical day, in minutes","netusoft: Internet use, how often", "ppltrst: Most people can be trusted or you can't be too careful", "pplfair: Most people try to take advantage of you, or try to be fair", "pplhlp: Most of the time people helpful or mostly looking out for themselves", "gndr: Gender", "agea: Age")
type <- c("ratio", "ratio", "ordinal", "interval", "interval", "interval", "nominal", "ratio")
# "neither" marks categorical variables, which are neither continuous nor discrete
cont_disc <- c("continuous", "continuous", "neither", "continuous", "continuous", "continuous", "neither", "continuous")
num_cat <- c("numeric", "numeric", "categorical","numeric", "numeric", "numeric", "categorical", "numeric" )
var_table <- data.frame(Variables, type, cont_disc, num_cat, check.rows = TRUE)
# The formatter must name an existing column of var_table
formattable(var_table,
align =c("l","c","c","c"),
list(`Variables` = formatter(
"span", style = ~ style(color = "grey",font.weight = "bold"))))
Variables | type | cont_disc | num_cat |
---|---|---|---|
nwspol: News about politics and current affairs, watching, reading or listening | ratio | continuous | numeric |
netustm: Internet use, how much time on typical day, in minutes | ratio | continuous | numeric |
netusoft: Internet use, how often | ordinal | neither | categorical |
ppltrst: Most people can be trusted or you can’t be too careful | interval | continuous | numeric |
pplfair: Most people try to take advantage of you, or try to be fair | interval | continuous | numeric |
pplhlp: Most of the time people helpful or mostly looking out for themselves | interval | continuous | numeric |
gndr: Gender | nominal | neither | categorical |
agea: Age | ratio | continuous | numeric |
This histogram represents the distribution of the respondents’ age:
ggplot(data = rus)+
geom_histogram(aes(x = agea), binwidth = 1.5, color = "black", fill = wes_palette("Rushmore", 1), alpha = 0.5) +
scale_x_continuous(breaks = 0:20*5) +
# na.rm = TRUE keeps the reference lines defined if some ages are missing
geom_vline(xintercept = mean(rus$agea, na.rm = TRUE), color = "red", size = 1) +
geom_vline(xintercept = median(rus$agea, na.rm = TRUE), color = "royalblue", size = 1) +
annotate(x= 60, y= 350, label=paste("Mean age (red) = ",
round(mean(rus$agea, na.rm = TRUE),1)),geom="text", size= 5)+
annotate(x= 15, y= 350, label=paste("Median age (blue) = ",
round(median(rus$agea, na.rm = TRUE),1)),geom="text", size= 5)+
labs(title = "The distribution of respondents' age", x = "Age", y = "Frequency") +
theme_bw()
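The two reference lines in the histogram rely on the mean and median of age. A minimal sketch, using made-up ages rather than the survey data, shows how a single missing value affects both statistics unless `na.rm = TRUE` is passed:

```r
# Made-up ages with one missing value (illustration only).
ages <- c(23, 35, NA, 41, 58, 67)

# Without na.rm the result is NA, so a reference line would not be drawn.
mean_with_na <- mean(ages)

# With na.rm = TRUE the missing value is skipped.
mean_age <- mean(ages, na.rm = TRUE)
median_age <- median(ages, na.rm = TRUE)

print(mean_with_na)
print(mean_age)
print(median_age)
```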
nwspol
– News about politics and current affairs, watching, reading or listening, in minutes. Q1: On a typical day, about how much time do you spend watching, reading or listening to news about politics and current affairs?
ggplot(data = rus)+
geom_histogram(aes(x = nwspol), color = "black", binwidth = 2, fill = wes_palette("Rushmore1", 1), alpha = 0.5)+
scale_x_continuous(breaks = 0:60*10)+
labs(title = "The amount of time people spend on news daily", x = "Minutes per day", y = "Number of respondents") +
theme_bw()
This graph shows how much time people spend on the news every day. Most respondents spend no more than 15 minutes per day on the news. However, around 250 people report around 22-23 minutes.
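A reading such as "most respondents spend no more than 15 minutes" can be backed by a number. The sketch below assumes a numeric vector of minutes (here a made-up `news_minutes`, not the ESS column) and computes the share of answers at or below 15 minutes together with the median:

```r
# Made-up daily news minutes with one missing answer (illustration only).
news_minutes <- c(0, 5, 10, 10, 15, 15, 30, 45, 60, NA)

# The mean of a logical vector is the share of TRUE values;
# na.rm = TRUE leaves the missing answer out of the denominator.
share_15 <- mean(news_minutes <= 15, na.rm = TRUE)

# The median summarises the typical answer.
median_minutes <- median(news_minutes, na.rm = TRUE)

print(round(share_15, 3))
print(median_minutes)
```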
netusoft
– Internet use, how often. Q2: People can use the internet on different devices such as computers, tablets and smartphones. How often do you use the internet on these or any other devices, whether for work or personal use?
rus %>% dplyr::select(netusoft) %>% na.omit() %>%
ggplot() +
geom_bar(aes(x = netusoft), color = "black", fill = wes_palette("Rushmore", 5), alpha = 0.5) +
scale_y_continuous(breaks = 0:20*100) +
labs(title = "How do people estimate the frequency of using the Internet", x = "Category of answer", y = "Number of respondents") +
theme_bw()
This third graph presents how often people say they use the Internet. More than 1000 respondents use the Internet every day (the most frequent response), while “Never” is the second most frequent answer.
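The ranking of answer categories can be read from a frequency table as well as from the bars. This is a small sketch on made-up answers; the category labels are assumed to match the survey's wording and should be checked against `levels(rus$netusoft)`.

```r
# Assumed category labels, ordered from least to most frequent use.
freq_levels <- c("Never", "Only occasionally", "A few times a week",
                 "Most days", "Every day")

# Made-up answers (illustration only).
answers <- factor(
  c("Every day", "Never", "Every day", "Most days",
    "Never", "Every day", "A few times a week"),
  levels = freq_levels
)

# Count each category, then rank the categories by count.
counts <- table(answers)
ranked <- sort(counts, decreasing = TRUE)

# The two most frequent categories.
top_two <- names(ranked)[1:2]
print(top_two)
```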
netustm
– Internet use, how much time on typical day, in minutes. Q3: On a typical day, about how much time do you spend using the internet on a computer, tablet, smartphone or other device, whether for work or personal use?
# Here we drop rows where netustm or gndr is missing
ggplot(data = na.omit(subset(rus, select = c(netustm, gndr))))+
geom_boxplot(aes(x = gndr, y = netustm), color = "black", fill = wes_palette("Rushmore", 1), alpha = 0.5)+
scale_y_continuous(breaks = 0:45*5)+
labs(title = "Is there a difference between males and females in time spent on the Internet?", x = "Gender", y = "Time, minutes")+
theme_bw()
This boxplot compares the time spent on the Internet by gender. The graph gives no clear evidence of a difference between males and females.
The mean and median time for both genders are almost the same:
rus_stat = rus %>% na.omit() %>% group_by(gndr) %>% summarise(mean_time = mean(netustm), median_time = median(netustm))
knitr::kable(rus_stat)
gndr | mean_time | median_time |
---|---|---|
Male | 21.21165 | 20 |
Female | 21.00000 | 20 |
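Similar means and medians suggest no difference, but they are not a formal test. One possible check, which is our assumption and not part of the original analysis, is a Wilcoxon rank-sum test, since time spent online is usually skewed. The data frame `toy` below is made up for illustration.

```r
# Made-up minutes per day for two groups of six people (illustration only).
toy <- data.frame(
  gndr = factor(rep(c("Male", "Female"), each = 6)),
  netustm = c(30, 60, 90, 120, 180, 240,
              20, 60, 100, 120, 150, 300)
)

# Rank-based comparison of the two groups; exact = FALSE requests the
# normal approximation, which is needed because the data contain ties.
test <- wilcox.test(netustm ~ gndr, data = toy, exact = FALSE)

# A large p-value means the data give no evidence of a difference.
print(test$p.value)
```

On the real data the same call would use `rus` (after removing missing values) in place of `toy`.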
ppltrst
– Most people can be trusted or you can’t be too careful. Q4: Using this card, generally speaking, would you say that most people can be trusted, or that you can’t be too careful in dealing with people? Please tell me on a score of 0 to 10, where 0 means you can’t be too careful and 10 means that most people can be trusted.
ggplot(data = na.omit(subset(rus, select = c(ppltrst, gndr))))+
geom_bar(aes(x = ppltrst),color = "black", fill = wes_palette("Rushmore", 1), alpha = 0.5)+
facet_grid(gndr~.)+
labs(title = "The level of trust expressed towards other people", x = "Trust, 0-10", y = "Number of respondents")+
theme_bw()
These two graphs show to what extent males and females express trust towards others. There are no visible differences between genders (males trust slightly less). All in all, most people are in the middle between absolute trust and total distrust.
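Because the two panels contain different numbers of respondents, comparing bar heights by eye can mislead. A group-wise median is one simple complement. The sketch assumes the trust scores are available as numbers from 0 to 10 (in the report they are stored as a factor) and uses made-up values.

```r
# Made-up trust scores on the 0-10 scale (illustration only).
toy <- data.frame(
  gndr = c("Male", "Male", "Male", "Female", "Female", "Female"),
  ppltrst = c(3, 5, 4, 5, 6, 4)
)

# Median trust score within each gender.
medians <- tapply(toy$ppltrst, toy$gndr, median)
print(medians)
```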
pplfair
– Most people try to take advantage of you, or try to be fair. Q5: Using this card, do you think that most people would try to take advantage of you if they got the chance, or would they try to be fair?
ggplot(data = na.omit(subset(rus, select = c(pplfair, gndr))))+
geom_bar(aes(x = pplfair), color = "black", fill = wes_palette("Rushmore", 1), alpha = 0.5)+
facet_grid(gndr~.)+
labs(title = "Fairness attitudes", x = "Fairness, 0-10", y = "Number of respondents")+
theme_bw()
Here again most respondents are in the middle of the scale. However, the answers lean slightly towards believing that other people are fair.
pplhlp
– Most of the time people helpful or mostly looking out for themselves. Q6: Would you say that most of the time people try to be helpful or that they are mostly looking out for themselves?
# sjp.xtab() was renamed plot_xtab() in later sjPlot releases
plot_xtab(rus$pplhlp, rus$gndr, margin = "row", bar.pos = "stack",
show.summary = FALSE, coord.flip = TRUE,
axis.titles = "The perception of other people's egoism", legend.title = "Gender",
geom.colors = c("lightblue", "#f3c6f3"))
The graph shows that the majority of respondents think people are about equally likely to try to be helpful and to look out for themselves.
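The stacked bars with `margin = "row"` show, for each answer, how it splits between genders. The same row percentages can be computed in base R. The sketch below uses made-up answers and is only meant to show the calculation.

```r
# Made-up helpfulness scores and genders (illustration only).
toy <- data.frame(
  pplhlp = c(5, 5, 3, 5, 7, 5, 3, 7),
  gndr = c("Male", "Female", "Male", "Female",
           "Male", "Female", "Female", "Male")
)

# Cross-tabulate answers (rows) by gender (columns).
counts <- table(toy$pplhlp, toy$gndr)

# margin = 1 divides each cell by its row total, so every row sums to 100%.
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
print(row_pct)
```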
ggplot(data = rus,aes(x = agea, y = netustm))+
geom_jitter (alpha = 0.5)+
geom_smooth(method = "lm" , se = TRUE) + # plot the linear regression line with its confidence band
scale_x_continuous(breaks = 0:70*10) +
labs(title = "Relationship between age and time spent daily on the Internet", x = "Age", y = "Time on the Internet, min") +
theme_bw()
This scatterplot shows a negative relationship between age and time spent on the Internet: the younger the respondent, the more time they spend surfing the Internet.
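The direction of the relationship between age and Internet time can also be checked numerically. The slope of a simple linear regression equals the covariance of the two variables divided by the variance of the predictor, so its sign matches the sign of the covariance and of the correlation. The sketch uses made-up values, not the survey data.

```r
# Made-up ages and daily Internet minutes (illustration only).
toy <- data.frame(
  agea = c(18, 25, 32, 40, 47, 55, 63, 70),
  netustm = c(300, 240, 200, 180, 120, 90, 60, 30)
)

# Pearson correlation (linear) and Spearman correlation (rank-based).
r <- cor(toy$agea, toy$netustm)
rho <- cor(toy$agea, toy$netustm, method = "spearman")

# Slope of the fitted line, and the same value from cov(x, y) / var(x).
fit <- lm(netustm ~ agea, data = toy)
slope <- coef(fit)[["agea"]]
slope_by_hand <- cov(toy$agea, toy$netustm) / var(toy$agea)

print(c(r = r, rho = rho, slope = slope))
```

On the real data, rows with missing `agea` or `netustm` would need to be removed first, for example with `use = "complete.obs"` in `cor()`.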
Thanks for your attention!