Team: Tolstokoraya Darya
Baturina Elina
Sorokina Darya
Suetina Anna
We chose Switzerland as a country for our analysis.
Topic: “Digital and social contacts within family and workplace and its relation to subjective well-being, happiness and social exclusion”
Why Switzerland as a country? - Switzerland is one of the wealthiest countries in the world, with a GDP per capita of $92,463 (Richest Countries in the world 2024., n.d.). National wealth shapes how people's work is organized and, through high living standards, may affect how happy people are and how much they enjoy their lives. This made us curious about what the ESS portal can tell us about happiness, work and family life in Switzerland, and which factors are related to the happiness level of the Swiss population. SOURCE:Richest Countries in the world 2024. (n.d.).
Used packages and functions
library(dplyr)
library (kableExtra)
library(ggplot2)
library(foreign)
library(sjlabelled)
library(sjPlot)
library(ggpubr)
library(psych)
library(readr)
library(rstatix)
library(DescTools)
library(sjstats)
library(corrplot)
library(effsize)
library(coin)
library(RGraphics)
library(rcompanion)
library(car)
Mode = function(x){
ta = table(x)
tam = max(ta)
if (all(ta == tam))
mod = NA
else
if(is.numeric(x))
mod = as.numeric(names(ta)[ta == tam])
else
mod = names(ta)[ta == tam]
return(mod)
}
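To illustrate how the Mode function above behaves, here is a minimal sketch on made-up vectors (the inputs are toy values of our own, not ESS data). It repeats the function definition so that it can be run on its own.

```r
# Same logic as the Mode function above, repeated so the sketch is self-contained
Mode <- function(x) {
  ta <- table(x)
  tam <- max(ta)
  if (all(ta == tam)) {
    mod <- NA                                   # every value is equally frequent
  } else if (is.numeric(x)) {
    mod <- as.numeric(names(ta)[ta == tam])
  } else {
    mod <- names(ta)[ta == tam]
  }
  mod
}

Mode(c(1, 2, 2, 3))     # one clear mode: 2
Mode(c(1, 1, 2, 2, 3))  # a tie returns every tied value: 1 2
Mode(c(1, 2, 3))        # all values equally frequent: NA
Mode(c("a", "b", "b"))  # works for character input as well: "b"
```

Note that a tie returns more than one value, and a vector in which all values are equally frequent returns NA.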
Research question: How are digital and social contacts within family and workplace related to subjective well-being and social exclusion?
SOURCE: Kavetsos, G., & Koutroumpis, P. (2011). Technological affluence and subjective well-being. Journal of Economic Psychology, 32(5), 742–753.
SOURCE: Taylor, S. H., & Bazarova, N. N. (2021). Always Available, Always attached: A relational perspective on the effects of mobile phones and social media on Subjective Well-Being. Journal of Computer-Mediated Communication, 26(4), 187–206.
SOURCE: Hommerich, C., & Tiefenbach, T. (2017). Analyzing the relationship between Social capital and Subjective Well-Being: The mediating role of social affiliation. Journal of Happiness Studies, 19(4), 1091–1114. https://doi.org/10.1007/s10902-017-9859-9
Downloading the data from ESS round 10
ESS1 <- read_csv(file = '/Users/admin/Downloads/ESS10/ESS10.csv')
Selecting the country & needed variables from dataset
ESS101 <- ESS1 %>%
filter(cntry == "CH") %>%
select(idno, acchome, sclact, closepnt, teamfeel, happy, ttminpnt)
ESS10_11 <- ESS1 %>%
filter(cntry == "CH") %>%
select(idno, acchome, sclact, closepnt, teamfeel, happy, ttminpnt)
Nominal variable: “acchome”
This variable represents the respondent's ability to access the internet from home
#R represents this variable as numeric, so we assign the factor variable type
ESS101$acchome <- factor(ESS101$acchome, labels = c("Don't have an access", "Have an access"), ordered= F)
class(ESS101$acchome)
## [1] "factor"
summary(ESS101$acchome)
## Don't have an access Have an access
## 104 1419
Plot 1: How many people have access to the internet at home?
ggplot() +
geom_bar(data = ESS101, aes(x = acchome), fill="#00FFFF", col="#0000FF", alpha = 0.5) +
xlab("Having an ability to access the internet: Home") +
ylab("Number of people") +
ggtitle("The level of people's access to the Internet at home")
This is a nominal (binary) variable, so we cannot check the normality of its distribution or describe its shape.
Conclusion 1: In Switzerland, far more people have access to the internet at home (1419 respondents) than do not (104 respondents).
Ordinal variable 1: “sclact”
This variable represents answers of respondents to the question “Compared to other people of your age, how often would you say you take part in social activities?”.
ESS10_sclact <- ESS10_11 %>%
filter(sclact != 8 & sclact != 7)
#Setting answers which are not needed for the analysis to NA: Refusal* & Don't know*
ESS101$sclact[ESS101$sclact == 8 | ESS101$sclact == 7] <- NA
#R represents this variable as numeric, so we assign the ordered factor variable type
ESS101$sclact <- factor(ESS101$sclact, labels = c("Much less than most", "Less than most", "About the same", "More than most", "Much more than most"), ordered= T)
class(ESS101$sclact)
## [1] "ordered" "factor"
summary(ESS101$sclact)
## Much less than most Less than most About the same More than most
## 116 455 691 206
## Much more than most NA's
## 31 24
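The recoding step above can be sketched on a small made-up vector. The values below are hypothetical and only follow the coding described in this section (1 to 5 are valid answers, 7, 8 and 9 are missing codes). Passing `levels = 1:5` explicitly is our own suggestion: it keeps the labels aligned even if one answer category happens to be absent from the data.

```r
# Hypothetical raw answers: 1-5 are valid, 7/8/9 are missing codes
sclact_raw <- c(3, 2, 8, 5, 1, 7, 3, 4, 9, 3)

# Replace the missing codes with NA
sclact_clean <- ifelse(sclact_raw %in% c(7, 8, 9), NA, sclact_raw)

# Build an ordered factor; explicit levels keep labels matched to codes
sclact_fct <- factor(
  sclact_clean,
  levels = 1:5,
  labels = c("Much less than most", "Less than most", "About the same",
             "More than most", "Much more than most"),
  ordered = TRUE
)

# Frequency table including the NA count
table(sclact_fct, useNA = "ifany")
```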
Plot 2: How often do people participate in social activities (compared to others of same age)?
#Keeping only respondents with a valid answer (the missing codes were already set to NA above)
ESS101 <- ESS101 %>%
filter(!is.na(sclact))
#Converting to an unordered factor for plotting
ESS101$sclact <- factor(ESS101$sclact, labels = c("Much less than most", "Less than most", "About the same", "More than most", "Much more than most"), ordered= F)
ggplot(ESS101 %>%
filter(!is.na(sclact))) +
geom_bar(aes(x = sclact), fill="#99FF66", col="#990033", alpha = 0.5) +
xlab("Frequency of participation in social activities") +
ylab("Number of people") +
ggtitle("The degree of participation in social activities (compared to others of same age)")
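Because the variable is ordinal, normality cannot be assessed in the strict sense, but the bar chart is roughly bell-shaped around the middle category and slightly right-skewed.
Conclusion 2: The most common view among people in Switzerland is that they take part in social activities about as often as others of their age (691 respondents). Fewer people think they participate less than others (571), and fewer still think they participate more (237).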
Ordinal variable 2: “closepnt”
This variable represents answers of respondents to the question “Taking everything into consideration, how close do you feel to him/her?”.
ESS10_closepnt <- ESS10_11 %>%
select(idno, closepnt) %>%
filter(closepnt < 6)
#Setting answers which are not needed for the analysis to NA: Not applicable* & Refusal* & Don't know* & No answer*
ESS101$closepnt[ESS101$closepnt == 6 | ESS101$closepnt == 7 | ESS101$closepnt == 8 | ESS101$closepnt == 9] <- NA
#R represents this variable as numeric, so we assign the ordered factor variable type
ESS101$closepnt <- factor(ESS101$closepnt, labels = c("Extremely close", "Very close", "Quite close", "Not very close", "Not at all close"), ordered= T)
class(ESS101$closepnt)
## [1] "ordered" "factor"
summary(ESS101$closepnt)
## Extremely close Very close Quite close Not very close
## 209 432 237 67
## Not at all close NA's
## 23 531
Plot 3: How close are people to their parents?
ggplot(ESS101 %>%
filter(closepnt != "NA")) +
geom_bar(aes(x = closepnt), fill="#CCCCFF", col="#FF7F50", alpha = 0.5) +
xlab("How close to parents") +
ylab("Number of people") +
ggtitle("The degree of closeness to parents")
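As with the previous ordinal variable, normality cannot be assessed in the strict sense; the bar chart peaks at “Very close” and is right-skewed, with a thin tail toward “Not at all close”.
Conclusion 3: We see that people in Switzerland tend to be close to their parents; only a few report distant relationships within the family.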
Interval variable 1: “teamfeel”
This variable represents answers of respondents to the question “If you work in a team, how much do you feel like part of your team?”
ESS10_teamfeel <- ESS101 %>%
select(teamfeel) %>%
filter(teamfeel <= 10)
class(ESS10_teamfeel$teamfeel)
## [1] "numeric"
summary(ESS10_teamfeel$teamfeel)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 8.000 9.000 8.487 10.000 10.000
Plot 4: Do people in Switzerland feel part of their working team?
ggplot(ESS10_teamfeel)+
geom_histogram( aes(x = teamfeel), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("How much people feel like a part of their working team") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=2))+
geom_vline(aes(xintercept = mean(teamfeel), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(teamfeel), color = 'median'), linetype="solid", linewidth = 1)+
geom_vline(aes(xintercept = Mode(teamfeel), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("The level of feeling like a part of working team")
ggplot(ESS10_teamfeel)+
geom_density( aes(x = teamfeel), fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("How much people feel like a part of their working team") +
ylab("Density") +
scale_x_continuous(breaks= seq(0, 10, by=2))+
geom_vline(aes(xintercept = mean(teamfeel), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(teamfeel), color = 'median'), linetype="solid", linewidth = 1)+
geom_vline(aes(xintercept = Mode(teamfeel), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("The level of feeling like a part of working team")
The distribution is not normal; it is strongly left-skewed.
Conclusion 4: As can be seen from the graph, people in Switzerland mostly feel like a part of their working team: the histogram is left-skewed, and all measures of central tendency are higher than 8, which represents a high level of feeling like a part of a working team.
Interval variable 2: “happy”
This variable represents answers of respondents to the question “Taking all things together, how happy would you say you are?”
ESS10_happy <- ESS101 %>%
select(happy) %>%
filter(happy <= 10)
class(ESS10_happy$happy)
## [1] "numeric"
summary(ESS10_happy$happy)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 8.000 8.000 8.087 9.000 10.000
Plot 5: Do people in Switzerland feel happy?
ggplot(ESS10_happy)+
geom_histogram( aes(x = happy), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("How much people feel happy") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=2))+
geom_vline(aes(xintercept = mean(happy), color = 'mean'), linetype="solid",linewidth = 1) +
geom_vline(aes(xintercept = median(happy), color = 'median'), linetype="solid", linewidth = 2)+
geom_vline(aes(xintercept = Mode(happy), color = 'mode'), linetype="solid",linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Happiness in Switzerland")
The distribution is not normal; it is strongly left-skewed.
Conclusion 5: The graph illustrates that, on average, people in Switzerland feel happy: the histogram is left-skewed, and the mean, median and mode all lie at about 8 points, well above the midpoint of the scale (5).
Ratio variable: “ttminpnt”
This variable represents answers of respondents to the question “About how long would it take you to get to where your parents live, on average? Think of the way you would travel and of the time it would take door to door.”
ESS10_ttminpnt <- ESS101 %>%
select(idno, ttminpnt) %>%
filter(ttminpnt != 6666) %>%
filter (ttminpnt != 7777) %>%
filter(ttminpnt != 8888) %>%
filter (ttminpnt != 9999)
class(ESS10_ttminpnt$ttminpnt)
## [1] "numeric"
summary(ESS10_ttminpnt$ttminpnt)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 10.0 30.0 179.6 180.0 4320.0
Plot 6: How long does it take for people to get to their parents?
ggplot(ESS10_ttminpnt)+
geom_histogram(aes(x = ttminpnt), fill="gray", col="#FF6347", alpha = 0.5) +
xlab("How long does it take to get to parents, min") +
ylab("Number of people") +
geom_vline(aes(xintercept = mean(ttminpnt), color = 'mean'), linetype="solid",linewidth = 1) +
geom_vline(aes(xintercept = median(ttminpnt), color = 'median'), linetype="solid", linewidth = 1)+
geom_vline(aes(xintercept = Mode(ttminpnt), color = 'mode'), linetype="solid",linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Time to parents")+
xlim(0, 1500)+
ylim(0, 175)
The distribution is not normal, and very right-skewed.
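Conclusion 6: The mean time for people in Switzerland to get to their parents is about 3 hours (179.6 minutes), but it is pulled upward by a few very long journeys; the median is only 30 minutes. The large majority of people need less than 250 minutes to get to their parents.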
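The gap between the mean and the median in a right-skewed variable can be illustrated with a few made-up travel times (hypothetical values in minutes, not ESS data). One very long journey is enough to move the mean far away from the typical value.

```r
# Hypothetical travel times in minutes; the last value is an extreme journey
travel_time <- c(5, 10, 10, 20, 30, 45, 60, 180, 4320)

mean_time <- mean(travel_time)      # 520: dominated by the single extreme value
median_time <- median(travel_time)  # 30: the middle of the ordered values

c(mean = mean_time, median = median_time)
```

For a variable like this, the median is usually the more informative summary of a typical respondent.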
v.acchome <- c(NA, Mode(ESS10_11$acchome), NA)
names(v.acchome) <- c("mean", "mode", "median")
v.sclact <- c(NA, Mode(ESS10_sclact$sclact), median(ESS10_sclact$sclact))
names(v.sclact) <- c("mean", "mode", "median")
v.closepnt <- c(NA, Mode(ESS10_closepnt$closepnt), median(ESS10_closepnt$closepnt))
names(v.closepnt) <- c("mean", "mode", "median")
ESS10_teamfeel$teamfeel = as.numeric(as.character(ESS10_teamfeel$teamfeel))
v.teamfeel <- c(mean(ESS10_teamfeel$teamfeel), Mode(ESS10_teamfeel$teamfeel), median(ESS10_teamfeel$teamfeel))
names(v.teamfeel) <- c("mean", "mode", "median")
ESS10_happy$happy = as.numeric(as.character(ESS10_happy$happy))
v.happy <- c(mean(ESS10_happy$happy), Mode(ESS10_happy$happy), median(ESS10_happy$happy))
names(v.happy) <- c("mean", "mode", "median")
v.ttminpnt <- c(mean(ESS10_ttminpnt$ttminpnt), Mode(ESS10_ttminpnt$ttminpnt), median(ESS10_ttminpnt$ttminpnt))
names(v.ttminpnt) <- c("mean", "mode", "median")
tendencymeasures = data.frame(v.acchome, v.sclact, v.closepnt, v.teamfeel, v.happy, v.ttminpnt, stringsAsFactors = FALSE)
kable(tendencymeasures) %>%
kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| | v.acchome | v.sclact | v.closepnt | v.teamfeel | v.happy | v.ttminpnt |
|---|---|---|---|---|---|---|
| mean | NA | NA | NA | 8.486766 | 8.086725 | 179.5907 |
| mode | 1 | 3 | 2 | 10.000000 | 8.000000 | 10.0000 |
| median | NA | 3 | 2 | 9.000000 | 8.000000 | 30.0000 |
Do people who feel like part of their working team feel happier than those who do not feel like part of their team?
ggplot(ESS101 %>%
filter(happy <=10 & teamfeel <= 10 ), aes(x = teamfeel, y = happy)) +
geom_point( )+
ylab("How happy are you") +
xlab("How much you feel like a team") +
ggtitle("Feeling happy due to feeling like a part of a working team") +
scale_x_continuous(breaks= seq(0, 10, by=2))+
scale_y_continuous(breaks= seq(0, 10, by=2))+
theme_minimal()+
geom_count()
Answer: Looking at the graph, we can see a slight connection between the feeling of belonging to the working team and happiness: the largest points (the most frequent combinations) lie where high values on one scale meet high values on the other. However, there are still cases where high values of happiness correspond to low values of feeling like a part of a team, and vice versa.
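One possible way to put a number on such a visual impression is a rank correlation. The sketch below uses made-up scores on the two 0-10 scales (the vectors are hypothetical, not the ESS answers) and Spearman's rho, which only relies on the ordering of the answers. This is an illustration of the idea, not a result of our analysis.

```r
# Hypothetical answers of eight respondents on the two 0-10 scales
teamfeel_toy <- c(3, 5, 7, 8, 9, 10, 10, 6)
happy_toy <- c(4, 6, 7, 8, 8, 9, 10, 7)

# Spearman's rank correlation: values near 1 mean that higher team feeling
# goes together with higher happiness, values near 0 mean no monotone relation
rho <- cor(teamfeel_toy, happy_toy, method = "spearman")
rho
```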
Is there a relation between how close a parent and child live and the closeness of their relationship (assessed by the child)?
ESS_boxplot <- full_join(ESS10_ttminpnt, ESS10_closepnt, by = "idno")
ggplot(ESS101 %>%
filter(closepnt != "NA") %>%
filter(ttminpnt != 6666) %>%
filter (ttminpnt != 7777) %>%
filter(ttminpnt != 8888) %>%
filter (ttminpnt != 9999), aes(x=closepnt, y=ttminpnt))+
geom_boxplot(aes(fill = closepnt))+
stat_summary(fun = mean, geom = "point", size = 2, col = "orange")+
#Note: ylim() drops travel times above 500 minutes before the boxes and means are computed
ylim(0, 500)+
xlab("Closeness to parents")+
ylab("Time to parents")+
ggtitle("The relation of closeness within family and time between parents and child")
Answer: We see that the average travel time to parents increases as the relationship becomes less close. This suggests that children who live farther from their parents tend to report less close relationships with them, although the plot alone does not show which of the two influences the other.
Are people who have access to the internet at home more involved in social activities (compared to others of the same age)?
ggplot(ESS101 %>%
filter(sclact != "NA"), aes(x = sclact, fill = acchome)) +
geom_bar(position="fill")+
coord_flip()+
xlab("The degree of participation in social activities") +
ylab("Share of respondents (by ability to access the internet at home)") +
ggtitle("Participation in social activities due to access to the Internet")
Answer: Looking at the graph, we can see that in Switzerland about 80% of the people who think that they participate in social activities much less than others of their age have internet access at home. Among the people who think that they participate in social activities much more than others, the share with internet access at home is about 90%.
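A bar chart with position = "fill" shows shares within each category on the axis, so the direction of the percentages matters. The sketch below uses a made-up 2 x 2 table (the counts are hypothetical and chosen only to give round numbers) to show how the two possible readings differ.

```r
# Hypothetical counts: rows are participation groups, columns are internet access
tab <- matrix(
  c(20, 80,
    10, 90),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    sclact = c("Much less than most", "Much more than most"),
    acchome = c("No access", "Access")
  )
)

# Shares within each participation group (what the filled bars display)
within_sclact <- prop.table(tab, margin = 1)

# Shares within each access group (a different question)
within_acchome <- prop.table(tab, margin = 2)

within_sclact
within_acchome
```

In the first table each row sums to 1, in the second each column does; only the first corresponds to a plot with sclact on the axis and acchome as fill.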
As a result of considering six variables that are related to the topic of “Digital and social contacts within family and workplace and its relation to subjective well-being and social exclusion”, we can see that, overall, people who are less engaged in social and digital interactions tend to report lower subjective well-being. For example, the more a person feels like a part of a team, the happier they tend to be. What is more, families who live close to each other tend to have closer relationships than families who live farther apart. Finally, the share of people with internet access at home is somewhat higher among those who consider themselves much more socially active than others, although these descriptive plots do not test whether the differences are statistically significant.
Downloading the data from ESS round 10
ESS2 <- read_csv(file = '/Users/admin/Downloads/ESS10/ESS10.csv')
Selecting the country & needed variables from dataset
ESS10 <- ESS2 %>%
filter(cntry == "CH") %>%
select(idno, acchome, domicil, gndr, ttminpnt, speakpnt)
ESS10_1 <- ESS2 %>%
filter(cntry == "CH") %>%
select(idno, acchome, domicil, gndr, ttminpnt, speakpnt)
Label = c("acchome", "domicil", "gndr", "ttminpnt", "speakpnt")
Meaning = c("Ability to access the Internet from home", "Area of living type", "Gender", "Time to parents in minutes", "Frequency of speaking to parents")
Level_Of_Measurement <- c("Nominal, binary", "Nominal", "Nominal, binary", "Ratio", "Ordinal")
df <- data.frame(Label, Meaning, Level_Of_Measurement, stringsAsFactors = FALSE)
kable(df) %>%
kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| Label | Meaning | Level_Of_Measurement |
|---|---|---|
| acchome | Ability to access the Internet from home | Nominal, binary |
| domicil | Area of living type | Nominal |
| gndr | Gender | Nominal, binary |
| ttminpnt | Time to parents in minutes | Ratio |
| speakpnt | Frequency of speaking to parents | Ordinal |
Research question: Is there a relation between the respondent’s description of the type of the area of living and their ability to access the internet from home?
Internet usage has recently become a well-studied topic in Switzerland. The most comprehensive research on it was conducted within the World Internet Project. The authors collected statistics on internet use in the country across many parameters (e.g. age, purposes, fears, types of usage). The research showed that Switzerland is one of the world's leading countries in internet usage: around 92% of the population use the internet. Based on this fact, we hypothesized that people all over the country have equal access to the internet, so that access does not differ between types of area.
Reference: Latzer, Michael and Büchi, Moritz and Festic, Noemi, Internet Use in Switzerland 2011—2019: Trends, Attitudes and Effects. Summary Report from the World Internet Project – Switzerland (2020). Zurich, Switzerland: University of Zurich, 2020
Variables
We use two categorical variables for this test.
First variable “acchome” – nominal, binary. This variable represents the respondent's ability to access the internet from home: “Imagine you wanted to access the Internet. At which of these locations would you be able to do it?” (respondents either marked or did not mark home as a location for accessing the internet).
ESS10$acchome <- factor(ESS10$acchome, labels = c("Don't have an access", "Have an access"), ordered= F)
class(ESS10$acchome)
## [1] "factor"
summary(ESS10$acchome)
## Don't have an access Have an access
## 104 1419
Second variable “domicil” (domicile, respondent's description) – nominal. This variable represents the respondent’s description of the type of area where they live. For this variable we first remove the observations that are not needed for the analysis: 7 = “Refusal”, 8 = “Don’t know” and 9 = “No answer”.
ESS10$domicil[ESS10$domicil == 7 | ESS10$domicil == 8 | ESS10$domicil == 9] <- NA
ESS10$domicil <- factor(ESS10$domicil, labels = c("A big city", "Suburbs or outskirts of big city", "Town or small town", "Country village", "Farm or home in countryside"), ordered= F)
class(ESS10$domicil)
## [1] "factor"
summary(ESS10$domicil)
## A big city Suburbs or outskirts of big city
## 112 164
## Town or small town Country village
## 386 800
## Farm or home in countryside NA's
## 60 1
Descriptive plot
The descriptive plot below shows the number of people living in each type of area.
ggplot(ESS10)+
geom_bar(aes(x=domicil, fill=acchome), position="stack", na.rm = TRUE)+
scale_x_discrete(na.translate = FALSE)+
ggtitle("The relationship between the description of the area type and ability to access the internet from home")+
xlab("Description of the area of living")+
ylab("Number of respondents")+
labs(caption = "ESS10, Switzerland")+
theme(axis.text.x = element_text(angle=65, vjust = 0.5))
We see that most respondents live in a country village, whereas the other types of area have far fewer residents. Because the groups differ so much in size, the proportions are hard to read from the stacked bar plot, and it would be difficult to derive valid conclusions from it. To solve this issue we build plot_xtab to look at the proportions.
library(sjPlot)
plot_xtab (ESS10$domicil, ESS10$acchome, margin = "row", bar.pos = "stack",
show.summary = TRUE)
Interpretation: The proportions are approximately equal across the types of area, so it is hard to tell by eye whether the small differences between them are significant. That is why we run a chi-squared test.
Checking assumptions
Assumptions:
The data are independent, and the categories are mutually exclusive
At least 5 observations per cell
table(ESS10$acchome, ESS10$domicil)
##
## A big city Suburbs or outskirts of big city
## Don't have an access 8 10
## Have an access 104 154
##
## Town or small town Country village
## Don't have an access 17 60
## Have an access 369 740
##
## Farm or home in countryside
## Don't have an access 9
## Have an access 51
exp<-chisq.test(ESS10$acchome, ESS10$domicil)
exp$expected
## ESS10$domicil
## ESS10$acchome A big city Suburbs or outskirts of big city
## Don't have an access 7.653088 11.20631
## Have an access 104.346912 152.79369
## ESS10$domicil
## ESS10$acchome Town or small town Country village
## Don't have an access 26.37582 54.66491
## Have an access 359.62418 745.33509
## ESS10$domicil
## ESS10$acchome Farm or home in countryside
## Don't have an access 4.099869
## Have an access 55.900131
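All observed counts are at least 5. One of the ten expected counts (4.099869, for “Farm or home in countryside” without access) is slightly below 5. Under the common rule of thumb that no more than 20% of the expected counts may be below 5 and none below 1, the assumption is still acceptably met, but the result should be read with some caution.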
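The check of the expected counts can be reproduced from the cross-table printed above. The sketch below re-enters the observed counts by hand and computes the expected counts directly; the Monte Carlo version of the test at the end is an optional cross-check that we add as an assumption of our own, since it does not rely on the large-sample approximation.

```r
# Observed counts copied from the cross-table above
observed <- matrix(
  c(8, 10, 17, 60, 9,
    104, 154, 369, 740, 51),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    acchome = c("Don't have an access", "Have an access"),
    domicil = c("A big city", "Suburbs or outskirts of big city",
                "Town or small town", "Country village",
                "Farm or home in countryside")
  )
)

# Expected count of a cell = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

smallest_expected <- min(expected)   # about 4.1
share_below_5 <- mean(expected < 5)  # 1 of 10 cells, i.e. 0.1

# Optional cross-check: p-value simulated from random tables with the same margins
set.seed(1)
mc_test <- chisq.test(observed, simulate.p.value = TRUE, B = 2000)
mc_test$p.value
```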
Chi-square Test
H0: There is no association between the type of the area of living and ability to access the Internet from home
HA: There is association between the type of the area of living and ability to access the Internet from home
chisq.test(ESS10$acchome, ESS10$domicil)
##
## Pearson's Chi-squared test
##
## data: ESS10$acchome and ESS10$domicil
## X-squared = 10.579, df = 4, p-value = 0.03173
Our p-value is 0.03173, which is below 0.05, so we reject the null hypothesis: the two categorical variables are not independently distributed, i.e. there is an association between the type of the area of living and the ability to access the Internet from home. In other words, the ability to access the Internet from home differs between the types of area people live in.
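A significant chi-squared test does not say how strong the association is. One common measure is Cramér's V, which we did not report in the analysis above; the sketch below computes it by hand from the observed counts of the cross-table, purely as an illustration.

```r
# Observed counts copied from the cross-table above
observed <- matrix(
  c(8, 10, 17, 60, 9,
    104, 154, 369, 740, 51),
  nrow = 2, byrow = TRUE
)

# Pearson chi-squared statistic computed from observed and expected counts
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
chi_sq <- sum((observed - expected)^2 / expected)  # about 10.58, as in the test output

# Cramér's V = sqrt(chi-squared / (n * (min(rows, columns) - 1)))
n <- sum(observed)
cramers_v <- sqrt(chi_sq / (n * (min(dim(observed)) - 1)))
cramers_v  # about 0.08
```

A value this close to 0 points to a weak association, even though the test is significant at the 0.05 level.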
Post-Hoc test
The analysis of the standardized residuals:
res <- chisq.test(ESS10$acchome, ESS10$domicil)
res$stdres
## ESS10$domicil
## ESS10$acchome A big city Suburbs or outskirts of big city
## Don't have an access 0.1349795 -0.3952332
## Have an access -0.1349795 0.3952332
## ESS10$domicil
## ESS10$acchome Town or small town Country village
## Don't have an access -2.1892416 1.0854129
## Have an access 2.1892416 -1.0854129
## ESS10$domicil
## ESS10$acchome Farm or home in countryside
## Don't have an access 2.5581477
## Have an access -2.5581477
Describe residuals: The residuals of 2.5581477 and -2.5581477 in the “Farm or home in countryside” category (for “Don't have an access” and “Have an access”, respectively) indicate substantial deviations between the observed and expected values. There is a positive association between living on a farm or in a home in the countryside and not having access to the Internet from home.
In the “Town or small town” category, the residuals (-2.1892416 and 2.1892416) also lie outside the range from -2 to 2. So there is a positive association between living in a town or small town and having access to the Internet from home.
The other values are in the range from -2 to 2, meaning that the observed counts in these cells do not differ notably from the expected ones.
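The cells that stand out can also be picked programmatically. The sketch below re-enters the observed counts of the cross-table and lists every cell whose standardized residual lies outside the range from -2 to 2 (the threshold of 2 is the rule of thumb used in this section).

```r
# Observed counts copied from the cross-table above
observed <- matrix(
  c(8, 10, 17, 60, 9,
    104, 154, 369, 740, 51),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    acchome = c("Don't have an access", "Have an access"),
    domicil = c("A big city", "Suburbs or outskirts of big city",
                "Town or small town", "Country village",
                "Farm or home in countryside")
  )
)

# Standardized residuals (the warning about a small expected count is silenced)
std_res <- suppressWarnings(chisq.test(observed))$stdres

# Row and column positions of the cells with |residual| > 2
flagged <- which(abs(std_res) > 2, arr.ind = TRUE)
flagged
round(std_res, 2)
```

Only the “Town or small town” and “Farm or home in countryside” columns are flagged, which matches the description above.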
Visualize residuals:
corrplot(chisq.test(ESS10$acchome, ESS10$domicil)$stdres, is.corr = FALSE, method = "number")
Conclusions:
After conducting the chi-squared test, we can conclude that there is a relation between the respondent’s description of the type of the area of living and their ability to access the Internet from home (i.e. the categorical variables “domicil” and “acchome” are not independently distributed). The residuals analysis shows which categories have the most influence on the test result. In our sample, fewer people from a town or small town lack access to the Internet at home than expected (17 observed against about 26 expected). On the other hand, more people who live on a farm or in a home in the countryside lack access to the Internet at home than expected (9 observed against about 4 expected).
Thus, our original hypothesis is not supported: people in Switzerland who live in different types of area have different levels of access to the internet from home.
Research question: Do Swiss people of different gender (female, male) differ in the mean time in minutes they need to get to their parents' place of living?
The research of Kolk (2016) examined the geographical distance between children and their parents by gender in Sweden. Unfortunately, the study does not provide data on children of different ages; however, it reports that mothers tend to live closer to their children than fathers do. We applied this logic to our data examination and hypothesized that female children tend to live closer to their parents in Switzerland.
Reference: Kolk, Martin (2016). A Life-Course Analysis of Geographical Distance to Siblings, Parents, and Grandparents in Sweden. Population, Space and Place
Data inspection
We are going to do independent samples t-test, where: Categorical variable: gndr - Gender of respondents
ESS10_ttminpnt <- ESS10 %>%
select(gndr, ttminpnt) %>%
filter(ttminpnt != 6666) %>%
filter (ttminpnt != 7777) %>%
filter(ttminpnt != 8888) %>%
filter (ttminpnt != 9999)
ESS10_ttminpnt$gndr <- factor(ESS10_ttminpnt$gndr, labels = c("Male", "Female"), ordered= F)
class (ESS10_ttminpnt$gndr)
## [1] "factor"
summary(ESS10_ttminpnt$gndr)
## Male Female
## 391 397
Description of variables: The “gndr” variable is categorical and binary, since it has 2 categories (according to the summary, there are 391 males and 397 females). R identified the class of the variable as “numeric”, so we converted it into a “factor”, which corresponds to the categorical type of data.
Continuous variable: ttminpnt - Travel time to parent, in minutes
ESS10_ttminpnt$ttminpnt <- as.numeric(ESS10_ttminpnt$ttminpnt)
class(ESS10_ttminpnt$ttminpnt)
## [1] "numeric"
summary(ESS10_ttminpnt$ttminpnt)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 10.0 30.0 188.1 180.0 4320.0
The variable “ttminpnt” is a continuous variable. We converted it with as.numeric() to make sure that R stores it as “numeric”. According to the central tendency measures of “ttminpnt”, the mean time of getting to the parents is 188.1 minutes and the median is 30.0 minutes; the minimum is 0 and the maximum is 4320.
Descriptive plot
Boxplot can help to visualize our data:
library (ggplot2)
ggplot(ESS10_ttminpnt)+
geom_boxplot(aes(x=gndr, y=ttminpnt), fill="#FFDDFF", col="#221100",alpha = 0.5)+
ylim(0, 500)+
ggtitle("Minutes spent on getting to the parents by Gender of Respondent")+
xlab("Gender of respondents")+
ylab("Duration of time for getting to the parents in minutes")
Interpretation: based on the plot, females need slightly more time to get to their parents (median of approximately 25 minutes) than males (approximately 20 minutes). This suggests that females typically live somewhat farther from their parents than males do. Note that ylim(0, 500) removes journeys longer than 500 minutes before the boxes are drawn, so these medians are lower than the full-sample medians reported below (35 and 30 minutes).
Summary about data inspection:
there are more than 300 observations in both groups
judging by the medians, females need slightly more time to get to their parents (i.e. they typically live somewhat farther from them)
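The boxplot above is drawn with ylim(0, 500), which removes values outside the limits before the box statistics are computed. The effect on the median can be sketched without any plotting, using a few made-up travel times (hypothetical values in minutes, not ESS data).

```r
# Hypothetical travel times in minutes, including three very long journeys
travel_time <- c(5, 10, 20, 30, 45, 60, 120, 600, 1440, 4320)

# Median of all values
median_all <- median(travel_time)  # 52.5

# Median after dropping everything above 500, as ylim(0, 500) does before
# the boxplot statistics are computed
median_truncated <- median(travel_time[travel_time <= 500])  # 30

c(all = median_all, truncated = median_truncated)
```

If the aim is only to zoom in, ggplot2's coord_cartesian(ylim = c(0, 500)) limits the visible range without dropping observations, so the boxes would still describe the full data.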
Checking assumptions
Checking the normality assumption for the t-test
Here we are going to check normality of distribution of our continuous variable (time to get to parents) by Gender.
ggplot(ESS10_ttminpnt, aes(x = ttminpnt, color = gndr, fill = gndr)) +
geom_density(alpha = 0.5) +
labs(title = "Minutes spent on getting to parents by Gender", x = "Duration of time to get to the parents in Minutes", y = "Density") +
theme_classic()
Interpretation: this density plot shows that the distributions are skewed to the right (i.e. the right tail is stretched).
#install.packages("psych")
library(psych)
describeBy(ESS10_ttminpnt, group = ESS10_ttminpnt$gndr)
##
## Descriptive statistics by group
## group: Male
## vars n mean sd median trimmed mad min max range skew
## gndr 1 391 1.00 0.00 1 1.00 0.00 1 1 0 NaN
## ttminpnt 2 391 195.61 382.47 30 99.64 37.06 0 3000 3000 3.3
## kurtosis se
## gndr NaN 0.00
## ttminpnt 14.71 19.34
## ------------------------------------------------------------
## group: Female
## vars n mean sd median trimmed mad min max range skew
## gndr 1 397 2.00 0.00 2 2.00 0.00 2 2 0 NaN
## ttminpnt 2 397 180.73 352.11 35 103.69 44.48 1 4320 4319 5.34
## kurtosis se
## gndr NaN 0.00
## ttminpnt 49.2 17.67
Interpretation:
Males: skew (3.3) is not normal (more than 0.5). And kurtosis (14.71) is not normal (more than 1), as the graph above tells us (very sharp top and long tail).
Females: skew (5.34) is not normal (more than 0.5). And kurtosis (49.2) is not normal (more than 1), as the graph above tells us (very sharp top and long tail).
In both groups distribution is skewed and not normal.
Here we also check the normality of the variable with a Q-Q plot:
qqnorm(ESS10_ttminpnt$ttminpnt)
qqline(ESS10_ttminpnt$ttminpnt)
Interpretation: the Q-Q plot does not look normal (heavy right tail and U-shaped line): the points on the plot do not follow the straight line.
Here we also check the normality of our data with the help of a test.
shapiro.test(ESS10_ttminpnt$ttminpnt)
##
## Shapiro-Wilk normality test
##
## data: ESS10_ttminpnt$ttminpnt
## W = 0.54001, p-value < 2.2e-16
Interpretation: according to the Shapiro-Wilk test, we reject the null hypothesis of normality (p-value < 0.05), so the distribution is not normal.
Homogeneity of variances assumption
Here is visualization of comparison of the variances in the groups (males and females) with the help of boxplots:
ggplot(ESS10_ttminpnt, aes(x = gndr, y = ttminpnt)) +
ylim(0, 500)+
geom_boxplot() +
stat_summary(fun = mean, geom = "point", shape = 4, size = 4) +
theme_classic() +
ggtitle("Minutes spent on getting to parents by Gender of Respondent")
Interpretation: On the plot, women show a wider spread than men. The median is slightly higher in the female group than in the male group, while the group means are almost the same. Both distributions are skewed: the mean points are displaced far from the medians, toward the longer tail of the distribution. There are also many outliers (points on the plot).
Here we use a test to check the results of our visualization.
H0: Variances are equal.
HA: Variances are not equal.
bartlett.test(ESS10_ttminpnt$ttminpnt ~ ESS10_ttminpnt$gndr)
##
## Bartlett test of homogeneity of variances
##
## data: ESS10_ttminpnt$ttminpnt by ESS10_ttminpnt$gndr
## Bartlett's K-squared = 2.683, df = 1, p-value = 0.1014
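Interpretation: according to Bartlett’s test we fail to reject the null hypothesis (p-value > 0.05), so we have no evidence that the group variances differ. Bartlett’s test is itself sensitive to departures from normality, so this result should be read with caution.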
T-Test
The distributions of the continuous variable are not normal, but the number of observations in both groups is large enough, so we can try to run the t-test first (and leave the non-parametric test for later).
H0: The mean time to get to the parents is equal for males and females.
HA: The mean time to get to the parents is not equal for males and females.
Note: Bartlett’s test did not reject the equality of variances, so the pooled-variance t-test would also be admissible. We still apply Welch’s correction (var.equal = F), which does not rely on this assumption and is R’s default.
t.test(ESS10_ttminpnt$ttminpnt ~ ESS10_ttminpnt$gndr, var.equal = F)
##
## Welch Two Sample t-test
##
## data: ESS10_ttminpnt$ttminpnt by ESS10_ttminpnt$gndr
## t = 0.56798, df = 778.57, p-value = 0.5702
## alternative hypothesis: true difference in means between group Male and group Female is not equal to 0
## 95 percent confidence interval:
## -36.54920 66.31075
## sample estimates:
## mean in group Male mean in group Female
## 195.6113 180.7305
Interpretation: according to the Welch two-sample t-test we fail to reject the null hypothesis (p-value > 0.05), so there is no statistically significant difference in the mean time to get to the parents between males and females. The 95% confidence interval for the difference (-36.5 to 66.3 minutes) includes zero.
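The difference between the pooled-variance and the Welch version of the t-test can be sketched on simulated data. The groups `g1` and `g2` below are hypothetical and only serve to show how the degrees of freedom change when the variances differ.

```r
set.seed(2)
g1 <- rnorm(60, mean = 100, sd = 20)   # hypothetical group 1
g2 <- rnorm(40, mean = 100, sd = 60)   # hypothetical group 2, larger variance

pooled <- t.test(g1, g2, var.equal = TRUE)   # Student's t-test, pooled variance
welch  <- t.test(g1, g2, var.equal = FALSE)  # Welch's t-test (R's default)

unname(pooled$parameter)  # degrees of freedom: n1 + n2 - 2 = 98
unname(welch$parameter)   # Welch-Satterthwaite df, never larger than 98
```

When the variances really are equal the two versions give almost the same answer, so using Welch's test by default costs very little.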
Effect size (t-test)
cohen.d(ESS10_ttminpnt$ttminpnt ~ ESS10_ttminpnt$gndr, na.rm = T)
##
## Cohen's d
##
## d estimate: 0.04049353 (negligible)
## 95 percent confidence interval:
## lower upper
## -0.09938187 0.18036893
Interpretation: the Cohen’s d estimate is 0.04, which indicates a negligible effect size: there is very little difference between the mean values of the two groups, and the confidence interval (-0.10 to 0.18) includes zero. This is consistent with the result of the t-test.
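For reference, Cohen's d with a pooled standard deviation can be written out by hand. This is a sketch of the textbook formula, assuming it matches the default of `cohen.d()`; the function name `cohens_d` and the toy vectors are hypothetical.

```r
# Cohen's d with a pooled standard deviation, written out in base R
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

a <- c(1, 2, 3, 4, 5)   # toy data for illustration only
b <- c(2, 3, 4, 5, 6)
d <- cohens_d(a, b)     # mean difference of -1 divided by the pooled SD sqrt(2.5)
```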
Non-parametric alternative to the t-test
Since our data are not normally distributed, the t-test is not fully reliable in this case. So we also run its non-parametric alternative (the Wilcoxon rank sum test) to double-check the results.
H0: The distribution of the time to get to the parents in minutes is the same for males and females (the location shift is 0).
HA: The distribution of the time to get to the parents in minutes is shifted between males and females (the location shift is not 0).
wilcox.test(ESS10_ttminpnt$ttminpnt ~ ESS10_ttminpnt$gndr)
##
## Wilcoxon rank sum test with continuity correction
##
## data: ESS10_ttminpnt$ttminpnt by ESS10_ttminpnt$gndr
## W = 72632, p-value = 0.1183
## alternative hypothesis: true location shift is not equal to 0
Interpretation: according to the Wilcoxon rank sum test the p-value is 0.1183, which is greater than 0.05. So we fail to reject the null hypothesis: there is no significant difference in the distribution (location) of the time to get to parents between women and men.
wilcox_effsize(ttminpnt ~ gndr, data = ESS10_ttminpnt, na.rm = T)
## # A tibble: 1 × 7
## .y. group1 group2 effsize n1 n2 magnitude
## * <chr> <chr> <chr> <dbl> <int> <int> <ord>
## 1 ttminpnt Male Female 0.0556 391 397 small
Interpretation: the effect size is r = 0.056, which is labelled as a small effect (in practice it is close to zero). This is consistent with the result of the non-parametric test: there is very little difference in the time to get to parents between males and females.
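The effect size can be roughly reproduced by hand from the numbers reported above, using the common definition r = |Z| / sqrt(N). The sketch below uses the normal approximation without a tie correction, which is an assumption on our side, so the last digit may differ slightly from the `wilcox_effsize()` output.

```r
W  <- 72632   # Wilcoxon rank sum statistic reported above
n1 <- 391     # males
n2 <- 397     # females

mu_W    <- n1 * n2 / 2                          # expected W under H0
sigma_W <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # SD of W under H0, no tie correction
z <- (W - mu_W) / sigma_W
r <- abs(z) / sqrt(n1 + n2)
round(r, 3)   # about 0.056
```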
Conclusions and answer to the RQ: The visualizations and tests show that the data are not normally distributed. According to both the t-test and the Wilcoxon test, there is no statistically significant difference between females and males in the time in minutes needed to get to the parents. Thus, there is not enough evidence to state that Swiss men and women differ in the time they spend getting to their parents.
Research question: Is there a relation between the amount of time people spend getting to their parents and how often they speak with them in person in Switzerland?
The study conducted by Schwarz, Trommsdorff, Albert and Mayer examined the quality of parent-child relationships. One of the measures they used in the analysis was “residential distance”. They found that residential distance correlates negatively with emotional and instrumental support, especially in mother-child relationships. Therefore we hypothesized that there is a relation between the distance between parent and child and the frequency of their communication.
Reference: Schwarz, B., Trommsdorff, G., Albert, I., & Mayer, B. (2005). Adult Parent–Child Relationships: Relationship Quality, Support, and Reciprocity. Applied Psychology, 54(3), 396–417. doi:10.1111/j.1464-0597.2005.00217.x
Data inspection
ESS10_anova <- ESS2 %>%
filter(cntry == "CH" & speakpnt <= 7 & ttminpnt != 6666) %>%
select(idno, ttminpnt, speakpnt)
First variable speakpnt – this variable answers the question “How often do you speak with them in person? Please only include occasions where you are physically in the same location.” It indicates the frequency of speaking to parents in person.
ESS10_anova$speakpnt <- factor(ESS10_anova$speakpnt, labels = c('Several times a day', 'Once a day', 'Several times a week', 'Several times a month',
'Once a month', 'Less often', 'Never' ), ordered = T)
class(ESS10_anova$speakpnt)
## [1] "ordered" "factor"
summary(ESS10_anova$speakpnt)
## Several times a day Once a day Several times a week
## 24 34 177
## Several times a month Once a month Less often
## 234 86 210
## Never
## 36
The second variable ttminpnt was described previously - Travel time to parent, in minutes
ESS10_anova$ttminpnt <- as.numeric(ESS10_anova$ttminpnt)
class(ESS10_anova$ttminpnt)
## [1] "numeric"
summary(ESS10_anova$ttminpnt)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 10.0 30.0 335.6 239.0 8888.0
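The variable “ttminpnt” is a continuous variable. Note that the maximum of 8888 is most likely not a real travel time: the filter above only removes the code 6666, so other ESS missing-value codes (such as 8888) can remain in the data and inflate the mean (335.6 against a median of 30).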
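A minimal sketch of how such codes can be turned into missing values before any statistics are computed. The vector `tt` is hypothetical, and the list of codes (6666, 7777, 8888, 9999) is an assumption that should be checked against the ESS codebook for this variable.

```r
# Hypothetical travel times with ESS-style missing codes mixed in
tt <- c(0, 10, 30, 240, 6666, 7777, 8888, 9999)

missing_codes <- c(6666, 7777, 8888, 9999)   # assumed codes, check the codebook
tt_clean <- ifelse(tt %in% missing_codes, NA, tt)

summary(tt_clean)              # the maximum is now a real travel time
mean(tt_clean, na.rm = TRUE)   # 70
```

The same effect can be obtained directly in the filter step with a condition such as `ttminpnt < 6666`.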
Descriptive plot
Now let’s make a box plot to inspect our data
ggplot(ESS10_anova)+
geom_boxplot(aes(x=speakpnt, y=ttminpnt), fill="#367588", col="#6a5acd", alpha = 0.5)+
scale_x_discrete(na.translate = FALSE)+
ggtitle("Relationship between the frequency of live communication with parents and time of people getting to parents")+
xlab("How often speak")+
ylab("Time to parent")+
theme(axis.text = element_text(size = 7, angle=90))
Interpretation: some groups differ visibly, while others do not. The difference is hard to judge because the many outliers compress the boxes.
Let’s group the categories by approximate frequency
ESS10_anova$speak <- rep(NA, length(ESS10_anova$speakpnt)) #new variable with grouped data from speakpnt
ESS10_anova$speak [ESS10_anova$speakpnt == "Several times a day"|
ESS10_anova$speakpnt == "Once a day"] <- "Daily"
ESS10_anova$speak [ESS10_anova$speakpnt == "Several times a week" ] <- "Weekly"
ESS10_anova$speak [ESS10_anova$speakpnt == "Several times a month"|
ESS10_anova$speakpnt == "Once a month" ] <- "Monthly"
ESS10_anova$speak [ESS10_anova$speakpnt == "Less often"] <- "Less often"
ESS10_anova$speak [ESS10_anova$speakpnt == "Never" ] <- "Never"
ESS10_anova$speak <- as.factor(ESS10_anova$speak)
ESS10_anova$speak <- factor(ESS10_anova$speak, levels = c("Daily", "Weekly", "Monthly", "Less often", "Never"))
table(ESS10_anova$speak)
##
## Daily Weekly Monthly Less often Never
## 58 177 320 210 36
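The same grouping can be written more compactly in base R by assigning a named list to `levels()`. This is only an alternative sketch on a toy factor (the object names are hypothetical), not the code we ran above.

```r
# Toy factor with the seven original answer categories
speakpnt <- factor(c("Several times a day", "Once a day", "Several times a week",
                     "Several times a month", "Once a month", "Less often", "Never"))

speak <- speakpnt
# Assigning a named list to levels() merges old levels and fixes the new order
levels(speak) <- list(
  Daily        = c("Several times a day", "Once a day"),
  Weekly       = "Several times a week",
  Monthly      = c("Several times a month", "Once a month"),
  `Less often` = "Less often",
  Never        = "Never"
)
table(speak)
```

Because the new level order comes from the list itself, no separate `factor(..., levels = ...)` call is needed afterwards.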
And make a box plot for the new groups
ggplot(ESS10_anova)+
geom_boxplot(aes(x=speak, y=ttminpnt), fill="#367588", col="#6a5acd", alpha = 0.5)+
scale_x_discrete(na.translate = FALSE)+
ggtitle("Relationship between the frequency of live communication with parents and time of people getting to parents")+
xlab("How often speak")+
ylab("Time to parent")+
theme(axis.text = element_text(size = 7, angle=90))
Interpretation: we still see some difference between the groups with different frequency of live speaking with parents, but it is hard to judge the significance of this difference only visually.
Checking assumptions for ANOVA test
Homogeneity of variances
H0: Variances are equal.
H1: Variances are not equal.
leveneTest(ESS10_anova$ttminpnt ~ ESS10_anova$speak)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 4 19.414 2.956e-15 ***
## 796
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The variances are not equal, as the p-value is less than 0.05. Thus, we will use var.equal = F in our ANOVA test later.
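To make clear what the median-centred Levene test measures, it can be sketched in base R as a one-way ANOVA on the absolute deviations from the group medians. The data below are simulated and the object names are hypothetical.

```r
set.seed(3)
# Hypothetical data: three groups with clearly different spreads
y <- c(rnorm(100, sd = 1), rnorm(100, sd = 5), rnorm(100, sd = 10))
g <- factor(rep(c("A", "B", "C"), each = 100))

# Absolute deviation of every observation from its own group median
dev <- abs(y - ave(y, g, FUN = median))

# A one-way ANOVA on these deviations gives the median-centred Levene statistic
lev <- anova(lm(dev ~ g))
lev[["Pr(>F)"]][1] < 0.05   # TRUE: the variances differ
```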
Performing F-test
oneway.test(ESS10_anova$ttminpnt ~ ESS10_anova$speak, var.equal = F)
##
## One-way analysis of means (not assuming equal variances)
##
## data: ESS10_anova$ttminpnt and ESS10_anova$speak
## F = 11.486, num df = 4.00, denom df = 178.95, p-value = 2.541e-08
str(oneway.test(ESS10_anova$ttminpnt ~ ESS10_anova$speak, var.equal = F))
## List of 5
## $ statistic: Named num 11.5
## ..- attr(*, "names")= chr "F"
## $ parameter: Named num [1:2] 4 179
## ..- attr(*, "names")= chr [1:2] "num df" "denom df"
## $ p.value : num 2.54e-08
## $ method : chr "One-way analysis of means (not assuming equal variances)"
## $ data.name: chr "ESS10_anova$ttminpnt and ESS10_anova$speak"
## - attr(*, "class")= chr "htest"
Checking the residuals
one.way.anova <- aov(ESS10_anova$ttminpnt ~ ESS10_anova$speak)
summary(one.way.anova)
## Df Sum Sq Mean Sq F value Pr(>F)
## ESS10_anova$speak 4 118573377 29643344 24.19 <2e-16 ***
## Residuals 796 975437988 1225425
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As the p-value is less than 0.05, the difference in the time needed to get to parents across the frequency groups is statistically significant.
Now let’s check the second assumption for ANOVA, the normality of residuals
plot(one.way.anova, 2)
We see that the points do not lie along the diagonal line, so the distribution of the residuals is far from normal.
Now let’s check the normality of residuals using descriptive statistics and a test
anova_residuals <- residuals(one.way.anova)
describe(anova_residuals)
## vars n mean sd median trimmed mad min max range skew
## X1 1 801 0 1104.22 -84.38 -117.09 71.57 -1597.25 8750.35 10347.6 6.14
## kurtosis se
## X1 41.1 39.02
The skew (6.14) and kurtosis (41.1) are much greater than 2, so we again see that the residuals are not normal.
shapiro.test(x = anova_residuals)
##
## Shapiro-Wilk normality test
##
## data: anova_residuals
## W = 0.33506, p-value < 2.2e-16
The residuals are clearly not normal, as the p-value is far below 0.05.
hist(anova_residuals)
Visually we also see that the residuals are not normal: the histogram is skewed to the right and has many outliers.
As not all the assumptions for ANOVA are met (namely, our residuals are not normally distributed), we will use the non-parametric alternative to ANOVA, the Kruskal-Wallis test.
kruskal.test(ESS10_anova$ttminpnt ~ ESS10_anova$speakpnt)
##
## Kruskal-Wallis rank sum test
##
## data: ESS10_anova$ttminpnt by ESS10_anova$speakpnt
## Kruskal-Wallis chi-squared = 391.25, df = 6, p-value < 2.2e-16
The p-value is less than 0.05, so there is a significant difference between the mean ranks of the frequency groups.
Post-Hoc for non parametric test
DunnTest(ESS10$ttminpnt ~ ESS10$speakpnt)
##
## Dunn's test of multiple comparisons using rank sums : holm
##
## mean.rank.diff pval
## 2-1 -542.97148 8.4e-14 ***
## 3-1 -717.23890 < 2e-16 ***
## 4-1 -657.65260 < 2e-16 ***
## 5-1 -559.38737 < 2e-16 ***
## 6-1 -361.76776 1.1e-15 ***
## 7-1 -295.86223 0.0027 **
## 66-1 133.62426 0.0076 **
## 77-1 -98.62574 1.0000
## 88-1 502.62426 1.0000
## 3-2 -174.26741 0.2120
## 4-2 -114.68111 1.0000
## 5-2 -16.41588 1.0000
## 6-2 181.20373 0.1575
## 7-2 247.10926 0.1575
## 66-2 676.59574 < 2e-16 ***
## 77-2 444.34574 1.0000
## 88-2 1045.59574 0.2439
## 4-3 59.58630 1.0000
## 5-3 157.85153 0.0893 .
## 6-3 355.47114 3.9e-16 ***
## 7-3 421.37667 5.5e-07 ***
## 66-3 850.86316 < 2e-16 ***
## 77-3 618.61316 0.6543
## 88-3 1219.86316 0.0893 .
## 5-4 98.26523 0.9297
## 6-4 295.88484 1.2e-12 ***
## 7-4 361.79037 2.6e-05 ***
## 66-4 791.27686 < 2e-16 ***
## 77-4 559.02686 0.9297
## 88-4 1160.27686 0.1286
## 6-5 197.61961 0.0058 **
## 7-5 263.52514 0.0342 *
## 66-5 693.01163 < 2e-16 ***
## 77-5 460.76163 1.0000
## 88-5 1062.01163 0.2221
## 7-6 65.90553 1.0000
## 66-6 495.39202 < 2e-16 ***
## 77-6 263.14202 1.0000
## 88-6 864.39202 0.6543
## 66-7 429.48649 4.1e-08 ***
## 77-7 197.23649 1.0000
## 88-7 798.48649 0.9297
## 77-66 -232.25000 1.0000
## 88-66 369.00000 1.0000
## 88-77 601.25000 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
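Note that the call above uses the full ESS10 object rather than the filtered ESS10_anova, so the printed table also contains the missing-value codes 66, 77 and 88. The printed p-values therefore do not correspond to the filtered sample, and the test should be rerun on ESS10_anova. In the filtered data we see significant differences between the following groups: Once a month-Several times a day, Less often-Several times a day, Never-Several times a day, Several times a month-Once a day, Once a month-Once a day, Less often-Once a day, Never-Once a day, Several times a month-Several times a week, Once a month-Several times a week, Less often-Several times a week, Never-Several times a week, Once a month-Several times a month, Less often-Several times a month, Never-Several times a month, Less often-Once a month, Never-Once a month.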
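As a cross-check that needs no extra packages, pairwise Wilcoxon rank sum tests with Holm-adjusted p-values can be run in base R. This is a different procedure from Dunn's test, so the p-values will not be identical. The data below are simulated and the names are hypothetical.

```r
set.seed(4)
# Hypothetical travel times for three contact-frequency groups
time  <- c(rexp(50, 1 / 10), rexp(50, 1 / 60), rexp(50, 1 / 300))
group <- factor(rep(c("Daily", "Monthly", "Never"), each = 50),
                levels = c("Daily", "Monthly", "Never"))

# Pairwise rank sum tests with Holm-adjusted p-values
pw <- pairwise.wilcox.test(time, group, p.adjust.method = "holm")
pw$p.value   # lower-triangular matrix: rows Monthly/Never, columns Daily/Monthly
```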
Effect size
epsilonSquared(x = ESS10$ttminpnt, g = ESS10$speakpnt)
## epsilon.squared
## 0.71
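The value printed above (0.71) was computed on the full ESS10 object. For the filtered sample used in the Kruskal-Wallis test we obtain epsilon-squared = 0.489, which represents a large effect: the frequency of speaking to parents in person is strongly related to the ranks of travel time.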
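The value 0.489 can be reproduced from the Kruskal-Wallis statistic reported above, using the commonly cited formula epsilon-squared = H / ((n^2 - 1) / (n + 1)), which simplifies to H / (n - 1). That `epsilonSquared()` uses exactly this formula is our assumption; the sample size n = 801 is taken from the Levene test output (4 + 796 + 1).

```r
H <- 391.25   # Kruskal-Wallis chi-squared reported above
n <- 801      # observations in ESS10_anova

eps2 <- H / ((n^2 - 1) / (n + 1))   # same as H / (n - 1)
round(eps2, 3)                      # 0.489
```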
Conclusions and answer to the RQ: In conclusion, we see a relation between residential distance and the frequency of in-person communication between parents and children: the larger the distance between parent and child, the less frequently they communicate in person. This gives statistical support to our research hypothesis.
As we mentioned in our previous projects, we chose Switzerland as the country for our analysis. Switzerland is known for its rich economy and stable political system, and it is the world’s ninth-happiest country according to the World Happiness Report 2024. We wondered how the social interactions of people in this country affect their sense of general well-being and happiness.
The overall topic of our research is: ‘Digital and social contacts within family and workplace and its relation to subjective well-being and happiness’. However, for this part of our research we decided to focus on feelings of happiness and to look at potential social factors that may influence happiness levels.
Thus, our research question is: What factors connected to social contacts are related to people’s level of happiness?
Happiness is currently considered one of the most important individual goals in human life. We decided to focus on people’s happiness because we believe that happiness is the most general indicator of a person’s emotional state and well-being. It is known that there are statistically significant factors that influence the level of people’s happiness (for example: health, earnings, etc.). We wondered whether other, less known factors related to social interactions can influence people’s happiness levels. For example, the book by Prilleltensky [1] is devoted to a qualitative analysis of the influence of social factors of belonging to different groups on people’s happiness. From the theories of social psychology it is known that belonging to certain communities positively affects the general mental state of a person. Communities and quality social interactions provide a supportive and positive environment.
A related study using data from the Netherlands found correlations between the frequency and quality of people’s social connections and their overall sense of happiness [2]. A number of other studies have also found correlations between the quality of social relationships in the family and happiness. For example, a study by Tammisalo, Danielsbacka, Tanskanen and Arpino shows how contact with different family members is related to levels of happiness [3]. In addition to family contact, research points to the importance of work relationships for an individual’s happiness [4].
Thus, we propose the following research hypotheses:
The more social contacts a person has and the more often they participate in social activities, the higher their level of happiness.
The more a person feels that they belong to a community of colleagues, the higher their level of happiness.
The better a person rates the closeness of their relationship with their parents, the higher their level of happiness.
The more work interferes with relationships with family, the lower a person’s level of happiness.
References:
[1] Prilleltensky, I., & Prilleltensky, O. (2021). How people matter: Why it affects health, happiness, love, work, and society. Cambridge University Press.
[2] Arampatzi, E., Burger, M. J., & Novik, N. (2018). Social network sites, individual social capital and happiness. Journal of Happiness Studies, 19, 99-122.
[3] Tammisalo, K., Danielsbacka, M., Tanskanen, A. O., & Arpino, B. (2024). Social media contact with family members and happiness in younger and older adults. Computers in Human Behavior, 153, 108103.
[4] Haar, J., Schmitz, A., Di Fabio, A., & Daellenbach, U. (2019). The role of relationships at work and happiness: A moderated moderated mediation study of New Zealand managers. Sustainability, 11(12), 3443.
Upload data
ESS <- read.csv('/Users/admin/Downloads/ESS10/ESS10.csv', header = T)
Filtering the data
ESS <- ESS %>%
filter(cntry == "CH") %>%
select(idno, sclact, sclmeet, closepnt, teamfeel, happy, hhlipnt, colprop, jbprtfp)
Label = c("idno", "sclact", "sclmeet", "closepnt", "teamfeel", "happy", "jbprtfp")
Meaning = c("Respondent's identification number", "Taking part in social activities", "How often socially meet with friends, relatives or colleagues", "How close a person feels to parent", "Feeling like part of your work team", "How happy the person is", "Job prevents you from giving time to partner/family, how often")
Level_Of_Measurement <- c("Nominal", "Quasi interval", "Quasi interval", "Quasi interval", "Quasi interval", "Quasi interval", "Ordinal") # idno is only an identifier, so it is nominal
df <- data.frame(Label, Meaning, Level_Of_Measurement, stringsAsFactors = FALSE)
kable(df) %>%
kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| Label | Meaning | Level_Of_Measurement |
|---|---|---|
| idno | Respondent’s identification number | Nominal |
| sclact | Taking part in social activities | Quasi interval |
| sclmeet | How often socially meet with friends, relatives or colleagues | Quasi interval |
| closepnt | How close a person feels to parent | Quasi interval |
| teamfeel | Feeling like part of your work team | Quasi interval |
| happy | How happy the person is | Quasi interval |
| jbprtfp | Job prevents you from giving time to partner/family, how often | Ordinal |
table(ESS$happy)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 2 4 7 12 17 51 62 220 537 380 231
ESS$happy <- as.numeric(ESS$happy)
ESS_happy<- ESS %>%
select (happy)
table(ESS_happy$happy)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 2 4 7 12 17 51 62 220 537 380 231
summary(ESS_happy$happy)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 8.000 8.000 8.086 9.000 10.000
class(ESS_happy$happy)
## [1] "numeric"
ggplot(ESS_happy)+
geom_histogram( aes(x = happy), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("How happy people feel") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(happy), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(happy), color = 'median'), linetype="solid", linewidth = 2.5)+
geom_vline(aes(xintercept = Mode(happy), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Feeling of happiness")
The data are not normally distributed: there is a long left tail, so the distribution is skewed to the left. The mean (8.09), median (8) and mode (8) practically coincide at a score of about 8, meaning that Swiss people have a high level of subjective well-being.
describeBy(ESS_happy$happy, group = ESS_happy$happy >0)
##
## Descriptive statistics by group
## group: FALSE
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 2 0 0 0 0 0 0 0 0 NaN NaN 0
## ------------------------------------------------------------
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 1521 8.1 1.46 8 8.26 1.48 1 10 9 -1.33 3.12 0.04
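Interpretation: the skew (-1.33) is outside the normal range (below -0.5) and so is the kurtosis (3.12, above 1), in line with the graph above (a sharp peak and a long left tail). So the distribution is not normal. Note that grouping by happy > 0 puts the two respondents who answered 0 into a separate group, so the statistics for the TRUE group exclude them.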
table(ESS$sclact)
##
## 1 2 3 4 5 7 8
## 116 455 691 206 31 1 23
ESS_sclact <- ESS %>%
select( sclact) %>%
filter(sclact < 6)
table(ESS_sclact$sclact)
##
## 1 2 3 4 5
## 116 455 691 206 31
class(ESS_sclact$sclact)
## [1] "integer"
summary(ESS_sclact$sclact)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 2.00 3.00 2.72 3.00 5.00
ESS_sclact$sclact <- as.numeric(ESS_sclact$sclact)
ggplot(ESS_sclact)+
geom_histogram( aes(x = sclact), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("Taking part in social activities") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(sclact), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(sclact), color = 'median'), linetype="solid", linewidth = 3)+
geom_vline(aes(xintercept = Mode(sclact), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("The degree of participation in social activities (compared to others of same age)")
The distribution is fairly close to normal, although slightly right-skewed. Most people in Switzerland think that they take part in social activities about as much as others of their age (“3” stands for “About the same”). Fewer people think they participate less than others (“1” and “2”, “Much less than most” and “Less than most” respectively), and the smallest number believe they participate much more than their peers (“5” stands for “Much more than most”).
describeBy(ESS_sclact, group = ESS_sclact$sclact >0)
##
## Descriptive statistics by group
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis se
## sclact 1 1499 2.72 0.87 3 2.72 1.48 1 5 4 0.05 -0.04 0.02
Interpretation: Skew (0.05) is normal (less than 0.5). And kurtosis (-0.04) is normal (within +-1), as the graph above tells us (relatively normal histogram without very sharp top and long tails). So distribution is rather normal according to these results.
table(ESS$sclmeet)
##
## 1 2 3 4 5 6 7 88
## 7 64 137 321 333 491 169 1
ESS$sclmeet <- as.numeric(ESS$sclmeet)
ESS_sclmeet<- ESS %>%
select (sclmeet) %>%
filter(sclmeet != 88)
table(ESS_sclmeet$sclmeet)
##
## 1 2 3 4 5 6 7
## 7 64 137 321 333 491 169
summary(ESS_sclmeet$sclmeet)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 4.000 5.000 5.009 6.000 7.000
ESS_sclmeet$sclmeet <- as.numeric(ESS_sclmeet$sclmeet)
ggplot(ESS_sclmeet)+
geom_histogram( aes(x = sclmeet), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("How often socially meet with friends, relatives or colleagues") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(sclmeet), color = 'mean'), linetype="solid", linewidth = 2.5) +
geom_vline(aes(xintercept = median(sclmeet), color = 'median'), linetype="solid", linewidth = 1)+
geom_vline(aes(xintercept = Mode(sclmeet), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Social meetings")
The data are slightly skewed to the left; the mean and median practically coincide at about 5. As the mode shows, the most frequent response of Swiss people is that they meet with friends, relatives or colleagues several times a week (“6” stands for “Several times a week”). Far fewer people report that they never meet them or do so less than once a month (“1” and “2” respectively).
describeBy(ESS_sclmeet, group = ESS_sclmeet$sclmeet >0)
##
## Descriptive statistics by group
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis
## sclmeet 1 1522 5.01 1.34 5 5.08 1.48 1 7 6 -0.5 -0.39
## se
## sclmeet 0.03
Interpretation: Skew (-0.5) is normal (within +-0.5). And kurtosis (-0.39) is normal (within +-1), as the graph above tells us (relatively normal histogram without a very sharp top but with a little left tail). So distribution is rather normal according to these results.
table(ESS$closepnt)
##
## 1 2 3 4 5 6 7 8
## 211 441 241 69 23 536 1 1
# Filter the observations
ESS_closepnt <- ESS %>%
select(closepnt) %>%
filter(closepnt < 6)
table(ESS_closepnt$closepnt)
##
## 1 2 3 4 5
## 211 441 241 69 23
# We need to invert the scale first, as in the initial scale 1 stands for "Extremely close" and 5 is for "Not at all close"
ESS_closepnt$closepnt <- as.numeric (6 - ESS_closepnt$closepnt)
ggplot(ESS_closepnt)+
geom_histogram( aes(x = closepnt), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("How close a respondent feels to parent") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(closepnt), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(closepnt), color = 'median'), linetype="solid", linewidth = 3)+
geom_vline(aes(xintercept = Mode(closepnt), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("The degree of closeness to parents")
Our graph is left-skewed, so the data are not quite normally distributed. Most respondents say that they are very close to their parents (“4” after we recode the variable), while only a small minority say they are not close at all. So people in Switzerland tend to be close to their parents, and fewer have distant relationships within the family.
describeBy(ESS_closepnt, group = ESS_closepnt$closepnt > 0)
##
## Descriptive statistics by group
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis
## closepnt 1 985 3.76 0.94 4 3.85 1.48 1 5 4 -0.67 0.29
## se
## closepnt 0.03
Interpretation: the skew (-0.67) is slightly outside the normal range (below -0.5), while the kurtosis (0.29) is within it (within +-1), in line with the graph above (no very sharp peak, but a small left tail). So the distribution is close to normal, but not perfectly normal.
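The scale inversion used for closepnt (6 minus the original value) is a special case of the general rule new = (min + max) - old. A small sketch, where the helper `reverse_code()` and the toy vector are hypothetical:

```r
# Reverse-code a Likert-type item: new = (min + max) - old
reverse_code <- function(x, lo = min(x, na.rm = TRUE), hi = max(x, na.rm = TRUE)) {
  (lo + hi) - x
}

closeness <- c(1, 2, 2, 3, 5, NA)           # toy answers on the original 1-5 scale
rev_close <- reverse_code(closeness, lo = 1, hi = 5)
rev_close                                   # 5 4 4 3 1 NA
```

Passing `lo` and `hi` explicitly is safer than relying on the observed minimum and maximum, because an extreme category may be absent from a small sample.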
table(ESS$teamfeel)
##
## 0 1 2 3 4 5 6 7 8 9 10 55 66 77 88
## 7 4 6 13 7 17 23 90 212 187 316 100 530 4 7
ESS$teamfeel <- as.numeric(ESS$teamfeel)
# Filter the observations
ESS_teamfeel <- ESS %>%
select (teamfeel) %>%
filter (teamfeel <=10)
table(ESS_teamfeel$teamfeel)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 7 4 6 13 7 17 23 90 212 187 316
summary(ESS_teamfeel$teamfeel)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 8.000 9.000 8.475 10.000 10.000
ggplot(ESS_teamfeel)+
geom_histogram( aes(x = teamfeel), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlab("Feeling like part of your working team") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(teamfeel), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(teamfeel), color = 'median'), linetype="solid", linewidth = 1)+
geom_vline(aes(xintercept = Mode(teamfeel), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Feeling like a part of your work team")
The graph is skewed to the left, and our data are not normally distributed. The most frequent response of Swiss people is that they completely feel like part of a team (“10”). The mean response is about 8.5 and the median is 9. All measures of central tendency are higher than 8, meaning that individuals mostly feel that they belong to their team and feel comfortable in it.
Our variable “teamfeel” is measured on a scale from 0 to 10. Because we group by teamfeel > 0 in describeBy(), respondents who answered 0 would fall into a separate group, so we shift the scale by one (to run from 1 to 11). The shift keeps the meaning of all values and does not change the skew or kurtosis, which do not depend on the location of the scale.
ESS_teamfeel$teamfeel <- (ESS_teamfeel$teamfeel + 1)
table (ESS_teamfeel$teamfeel)
##
## 1 2 3 4 5 6 7 8 9 10 11
## 7 4 6 13 7 17 23 90 212 187 316
library(psych)
describeBy(ESS_teamfeel, group = ESS_teamfeel$teamfeel > 0)
##
## Descriptive statistics by group
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis
## teamfeel 1 882 9.48 1.81 10 9.8 1.48 1 11 10 -2.02 5.37
## se
## teamfeel 0.06
Interpretation: Skew (-2.02) is not normal (less than - 0.5). And kurtosis (5.37) is not normal (more than 1), as the graph above tells us (not normally distributed with a sharp top and long left tail). So distribution is not normal.
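Adding a constant to a variable does not change its skew or kurtosis, which can be checked directly. The sketch below rebuilds the teamfeel answers from the frequency table above and uses simple moment-based helpers (hypothetical names, without the small-sample correction that `describeBy()` applies, so the exact values may differ slightly from the output above).

```r
skewness <- function(v) {
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}
kurtosis_excess <- function(v) {
  m <- mean(v)
  mean((v - m)^4) / mean((v - m)^2)^2 - 3
}

# Rebuild the teamfeel answers (0-10) from the frequency table above
teamfeel <- rep(0:10, times = c(7, 4, 6, 13, 7, 17, 23, 90, 212, 187, 316))

all.equal(skewness(teamfeel), skewness(teamfeel + 1))                 # TRUE
all.equal(kurtosis_excess(teamfeel), kurtosis_excess(teamfeel + 1))   # TRUE
```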
table(ESS$jbprtfp)
##
## 1 2 3 4 5 6 66 77 88
## 126 277 348 177 18 35 530 4 8
# Filtering observations
ESS_jbprtfp <- ESS %>%
select(idno, jbprtfp) %>%
filter(jbprtfp < 6)
table(ESS_jbprtfp$jbprtfp)
##
## 1 2 3 4 5
## 126 277 348 177 18
class(ESS_jbprtfp$jbprtfp)
## [1] "integer"
#Recode into 3 categories
ESS_jbprtfp$jbprtfp <- dplyr::recode(ESS_jbprtfp$jbprtfp,
"1"= "Never/hardly ever",
"2"= "Never/hardly ever",
"3"= "Sometimes",
"4"= "Often/always",
"5"= "Often/always")
# After recoding, the variable is a character vector, so we convert it to an ordered factor
ESS_jbprtfp$jbprtfp <- factor(ESS_jbprtfp$jbprtfp, levels = c("Never/hardly ever", "Sometimes", "Often/always"), ordered= T)
ggplot(ESS_jbprtfp %>%
filter(!is.na(jbprtfp))) + # drop missing values explicitly instead of comparing with the string "NA"
geom_bar(aes(x = jbprtfp), fill="#CCCCFF", col="#FF7F50", alpha = 0.5) +
xlab("The frequency of job preventing from giving time to partner/family") +
ylab("Number of people") +
ggtitle("The frequency of job preventing from giving time to partner/family")
Most Swiss respondents report that their job never or hardly ever prevents them from devoting time to those close to them. The fewest respondents say that their job often or always prevents them from giving time to their partner or family. An intermediate number report that their job sometimes keeps them from giving time to the people close to them.
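The three-category recoding can also be sketched with `cut()`, which returns an ordered factor in one step. The vector below is rebuilt from the frequency table of jbprtfp shown above; the object names are hypothetical.

```r
# Rebuild the jbprtfp answers (1-5) from the frequency table above
jbprtfp <- rep(1:5, times = c(126, 277, 348, 177, 18))

# cut() collapses the codes into three ordered categories in one step
jb3 <- cut(jbprtfp,
           breaks = c(0, 2, 3, 5),   # intervals (0,2], (2,3], (3,5]
           labels = c("Never/hardly ever", "Sometimes", "Often/always"),
           ordered_result = TRUE)

table(jb3)                             # 403, 348, 195
levels(jb3)[median(as.integer(jb3))]   # "Sometimes"
```

The last line shows one way to obtain the median category of an ordered factor from its integer codes instead of typing it in by hand.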
v.sclact <- c(round(mean(ESS_sclact$sclact), 2), Mode(ESS_sclact$sclact), median(ESS_sclact$sclact))
names(v.sclact) <- c("mean", "mode", "median")
v.sclmeet <- c(round(mean(ESS_sclmeet$sclmeet), 2), Mode(ESS_sclmeet$sclmeet), median(ESS_sclmeet$sclmeet))
names(v.sclmeet) <- c("mean", "mode", "median")
v.closepnt <- c(round(mean(ESS_closepnt$closepnt), 2), Mode(ESS_closepnt$closepnt), median(ESS_closepnt$closepnt))
names(v.closepnt) <- c("mean", "mode", "median")
v.teamfeel <- c(round(mean(ESS_teamfeel$teamfeel), 2), Mode(ESS_teamfeel$teamfeel), median(ESS_teamfeel$teamfeel))
names(v.teamfeel) <- c("mean", "mode", "median")
v.happy <- c(round(mean(ESS_happy$happy), 2), Mode(ESS_happy$happy), median(ESS_happy$happy))
names(v.happy) <- c("mean", "mode", "median")
v.jbprtfp <- c(NA, Mode(ESS_jbprtfp$jbprtfp), "Sometimes")
names(v.jbprtfp) <- c("mean", "mode", "median")
tendencymeasures = data.frame(v.sclact, v.sclmeet, v.closepnt, v.teamfeel, v.happy, v.jbprtfp, stringsAsFactors = FALSE)
kable(tendencymeasures) %>%
kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| | v.sclact | v.sclmeet | v.closepnt | v.teamfeel | v.happy | v.jbprtfp |
|---|---|---|---|---|---|---|
| mean | 2.72 | 5.01 | 3.76 | 9.48 | 8.09 | NA |
| mode | 3.00 | 6.00 | 4.00 | 11.00 | 8.00 | Never/hardly ever |
| median | 3.00 | 5.00 | 4.00 | 10.00 | 8.00 | Sometimes |
For the correlation analysis we chose a continuous outcome, happy (how happy respondents report they are), and examined its correlation with four variables:
1) sclact - Taking part in social activities
2) closepnt - How close a person feels to parent
3) teamfeel - Feeling like part of your team
4) sclmeet - How often socially meet with friends, relatives or colleagues
Filtering the data
ESS_cor1 <- ESS %>%
select (sclact, happy, idno)%>%
filter(sclact != 7 & sclact != 8) %>%
filter (happy < 11)
ESS_cor2 <- ESS %>%
select (sclmeet, happy, idno)%>%
filter(sclmeet <=7) %>%
filter (happy < 11)
ESS_cor3 <- ESS %>%
select (closepnt, happy, idno)%>%
filter(closepnt <=5) %>%
filter (happy < 11)
#Recoding the initial scale of closepnt variable
ESS_cor3$closepnt <- (6 - ESS_cor3$closepnt)
table(ESS_cor3$closepnt)
##
## 1 2 3 4 5
## 23 69 241 441 211
ESS_cor4 <- ESS %>%
select (teamfeel, happy, idno)%>%
filter(teamfeel <= 10) %>%
filter (happy < 11)
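The four filtering steps above follow one pattern: keep only valid answers and drop the special ESS codes. The sketch below illustrates that pattern on made-up data. The helper `keep_valid` and the data frame `toy` are hypothetical names introduced only for this illustration and are not part of the original analysis.

```r
# Hypothetical helper: keep rows whose value in `var` lies inside the valid range.
keep_valid <- function(data, var, lo, hi) {
  x <- data[[var]]
  data[!is.na(x) & x >= lo & x <= hi, , drop = FALSE]
}

# Toy data (invented): 66, 77 and 88 stand in for special codes such as
# "not applicable", "refusal" and "don't know".
toy <- data.frame(
  idno     = 1:6,
  teamfeel = c(10, 8, 66, 0, 88, 7),
  happy    = c(9, 8, 7, 77, 6, 10)
)

# Apply the same rule to both variables: valid answers are 0 to 10.
toy_valid <- keep_valid(keep_valid(toy, "teamfeel", 0, 10), "happy", 0, 10)
print(toy_valid)
```

Stating the valid range explicitly (rather than listing the codes to drop) makes it harder to miss a code by accident, though which codes exist for each variable should still be checked in the ESS codebook.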
Checking the class of the variables used in the correlations and converting them where needed
class(ESS$happy)
## [1] "numeric"
class(ESS$sclact)
## [1] "integer"
class(ESS$sclmeet)
## [1] "numeric"
class(ESS$closepnt)
## [1] "integer"
class(ESS$teamfeel)
## [1] "numeric"
# Changing the variable type to numeric for correlation
ESS$happy <- as.numeric(ESS$happy)
ESS$sclact <- as.numeric(ESS$sclact)
ESS$sclmeet <- as.numeric(ESS$sclmeet)
ESS$closepnt <- as.numeric(ESS$closepnt)
ESS$teamfeel <- as.numeric(ESS$teamfeel)
Shapiro-Wilk test to check whether the variables are normally distributed and to decide which correlation test to use
options(scipen = 999)
shapiro.test(ESS$happy)
##
## Shapiro-Wilk normality test
##
## data: ESS$happy
## W = 0.8598, p-value < 0.00000000000000022
shapiro.test(ESS$sclact)
##
## Shapiro-Wilk normality test
##
## data: ESS$sclact
## W = 0.80976, p-value < 0.00000000000000022
shapiro.test(ESS$sclmeet)
##
## Shapiro-Wilk normality test
##
## data: ESS$sclmeet
## W = 0.34364, p-value < 0.00000000000000022
shapiro.test(ESS$closepnt)
##
## Shapiro-Wilk normality test
##
## data: ESS$closepnt
## W = 0.80974, p-value < 0.00000000000000022
shapiro.test(ESS$teamfeel)
##
## Shapiro-Wilk normality test
##
## data: ESS$teamfeel
## W = 0.69648, p-value < 0.00000000000000022
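The Shapiro-Wilk tests reject normality for every variable (all p-values < 0.05), so we use Spearman's rank correlation for the correlation analysis. Note that these tests were run on the unfiltered ESS columns, some of which still contain the special codes that the filtering step removes (for example 55, 66, 77 and 88 in teamfeel).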
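The Shapiro-Wilk tests above use the columns of ESS directly, while table(ESS$teamfeel) and table(ESS$closepnt) elsewhere in this report show that those columns still hold special codes (such as 55, 66, 77, 88). The sketch below uses invented data, not the ESS file, to show how such codes can pull the W statistic down. It suggests running the test on the filtered data frames (for example ESS_cor4$teamfeel) as a cross-check.

```r
set.seed(1)

# Invented 0-10 answers, skewed towards high values (not the ESS data).
valid <- sample(0:10, size = 400, replace = TRUE,
                prob = c(1, 1, 1, 2, 2, 5, 6, 20, 45, 30, 20))

# The same answers plus 100 rows carrying a special code such as 66.
with_codes <- c(valid, rep(66, 100))

w_valid <- unname(shapiro.test(valid)$statistic)
w_codes <- unname(shapiro.test(with_codes)$statistic)

# W closer to 1 means closer to normal; the special codes lower W further.
print(c(valid_only = w_valid, with_codes = w_codes))
```

In this toy case both versions are non-normal, so the choice of Spearman's correlation would not change, but the size of W is only meaningful for the filtered data.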
Statistical hypotheses for the correlation:
H0: There is no association between the feeling of closeness to a parent and feeling happy
HA: There is an association between the feeling of closeness to a parent and feeling happy
cor.test(ESS_cor3$closepnt, ESS_cor3$happy, method = "spearman")
##
## Spearman's rank correlation rho
##
## data: ESS_cor3$closepnt and ESS_cor3$happy
## S = 135515083, p-value = 0.000002566
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.1491938
The p-value (0.000002566) is below 0.05, so we reject the null hypothesis. closepnt and happy are significantly correlated, that is, there is a monotonic relationship between them. The correlation coefficient is 0.15 (rho = 0.1491938), which indicates a positive but small correlation between the feeling of closeness to a parent and feeling happy.
library(ggpubr)
ggscatter(ESS_cor3, x = "happy", y = "closepnt",
add = "reg.line",
cor.coef = TRUE,
cor.method = "spearman",
xlab = "feeling happy",
ylab = "The feeling of closeness to a parent")+
geom_jitter(width = 0.45, height = 0.45, alpha = 0.5)
Interpretation: The regression line rises to the right, so the trend is positive. However, the points are widely scattered and lie close to the line only on the right-hand side, which means that the association is not very strong.
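To make clear what the Spearman coefficient measures, the sketch below shows on invented data (not the ESS file) that Spearman's rho equals the Pearson correlation computed on the ranks of the two variables. The variable names `closeness` and `happiness` are illustrative assumptions.

```r
set.seed(2)

# Invented ordinal-style answers (not the ESS data).
closeness <- sample(1:5, 200, replace = TRUE)
happiness <- pmin(10, pmax(0, round(6 + 0.4 * closeness + rnorm(200, sd = 1.5))))

# Spearman's rho computed directly ...
rho_spearman <- cor(closeness, happiness, method = "spearman")

# ... and as a Pearson correlation of the (tie-averaged) ranks.
rho_by_ranks <- cor(rank(closeness), rank(happiness), method = "pearson")

print(c(spearman = rho_spearman, pearson_on_ranks = rho_by_ranks))
```

Because only the ranks are used, the coefficient describes a monotonic relationship and does not require normally distributed variables.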
Statistical hypotheses for the correlation:
H0: There is no association between feeling like a part of a team and feeling happy
HA: There is an association between feeling like a part of a team and feeling happy
cor.test(ESS_cor4$teamfeel, ESS_cor4$happy, method = "spearman")
##
## Spearman's rank correlation rho
##
## data: ESS_cor4$teamfeel and ESS_cor4$happy
## S = 91192793, p-value = 0.000000001281
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.2025443
The p-value (0.000000001281) is below 0.05, so we reject the null hypothesis. teamfeel and happy are significantly correlated, that is, there is a monotonic relationship between them. The correlation coefficient is 0.2 (rho = 0.2025443), which indicates a positive but small correlation between feeling like a part of a team and feeling happy.
ggscatter(ESS_cor4, x = "happy", y = "teamfeel",
add = "reg.line",
cor.coef = TRUE,
cor.method = "spearman",
xlab = "Feeling happy",
ylab = "Feeling like a part of a team") +
geom_jitter(width = 0.45, height = 0.45, alpha = 0.5)
Interpretation: The regression line rises to the right: as the feeling of being part of a team increases, the feeling of happiness increases too. The association is not very strong, however, and most points cluster in the right corner of the plot.
ESS_corr1 <- merge(ESS_cor1, ESS_cor2, all = TRUE)
ESS_corr2 <- merge(ESS_cor3, ESS_cor4, all = TRUE)
ESS_corr <- merge(ESS_corr1, ESS_corr2, all = TRUE)
ESS_corr <- ESS_corr %>% select(-idno)
tab_corr(ESS_corr[, 1:5],
corr.method = "spearman", wrap.labels = 70)
| | happy | sclact | sclmeet | closepnt | teamfeel |
|---|---|---|---|---|---|
| happy | | 0.120** | 0.074* | 0.124*** | 0.214*** |
| sclact | 0.120** | | 0.249*** | 0.147*** | 0.091* |
| sclmeet | 0.074* | 0.249*** | | 0.166*** | 0.089* |
| closepnt | 0.124*** | 0.147*** | 0.166*** | | 0.129*** |
| teamfeel | 0.214*** | 0.091* | 0.089* | 0.129*** | |
Computed correlation used spearman-method with listwise-deletion.
A graphical table of the correlation:
sjp.corr(ESS_corr[, 1:5], wrap.labels = 100, decimals = 3, corr.method = "spearman")
ESS_reg <- ESS %>%
filter(happy < 77 & teamfeel <= 10 & jbprtfp < 6)
# Convert the predictor itself to numeric (the original line overwrote closepnt with teamfeel)
ESS_reg$teamfeel <- as.numeric(ESS_reg$teamfeel)
ESS_reg$jbprtfp <- as.factor(ESS_reg$jbprtfp)
ESS_reg$happy <- as.numeric(ESS_reg$happy)
ESS_reg$jbprtfp <- dplyr::recode(ESS_reg$jbprtfp,
"1"= "Never/hardly ever",
"2"="Never/hardly ever",
"3"="Sometimes",
"4"="Often/always",
"5"="Often/always")
table(ESS_reg$jbprtfp)
##
## Never/hardly ever Sometimes Often/always
## 353 313 178
m1 <- lm(happy ~ teamfeel, data = ESS_reg)
m2<- lm(happy ~ teamfeel + jbprtfp, data = ESS_reg)
sjPlot::tab_model(m1, m2, show.ci = F)
| | happy (m1) | | happy (m2) | |
|---|---|---|---|---|
| Predictors | Estimates | p | Estimates | p |
| (Intercept) | 6.75 | <0.001 | 7.07 | <0.001 |
| teamfeel | 0.16 | <0.001 | 0.15 | <0.001 |
| jbprtfp [Sometimes] | | | -0.25 | 0.020 |
| jbprtfp [Often/always] | | | -0.61 | <0.001 |
| Observations | 844 | | 844 | |
| R2 / R2 adjusted | 0.044 / 0.042 | | 0.070 / 0.067 | |
Model 1
H0: There is no significant relation between feeling happy (outcome) and the team feeling (continuous predictor)
HA: There is a significant relation between feeling happy (outcome) and the team feeling (continuous predictor)
Model 2
H0: There is no significant relation between feeling happy (outcome) and the frequency with which the job prevents giving time to partner/family (categorical predictor)
HA: There is a significant relation between feeling happy (outcome) and the frequency with which the job prevents giving time to partner/family (categorical predictor)
First we looked at the relation between feeling happy and team feeling and built the first model. The coefficient is significant (p-value < 0.001) and the adjusted R-sq equals 0.042, i.e. the model explains about 4.2% of the variance in the dependent variable. We then added the variable for the job preventing time with partner/family. Both of its categories differ significantly from the reference category (p-value = 0.020 for “Sometimes” and p-value < 0.001 for “Often/always”), so there is a relationship between feeling happy and how often the job prevents giving time to partner/family. The adjusted R-sq of this model is 0.067, meaning that it explains about 6.7% of the variance in the dependent variable. The R-sq is still relatively small, but the second model explains feeling happy better. Let’s compare the models formally using an ANOVA test.
anova(m1,m2)
## Analysis of Variance Table
##
## Model 1: happy ~ teamfeel
## Model 2: happy ~ teamfeel + jbprtfp
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 842 1581.4
## 2 840 1537.3 2 44.139 12.059 0.000006864 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
According to the ANOVA results, the second model has a smaller RSS (1537.3) than the first (1581.4), i.e. it leaves less variation unexplained, so it fits the data better. The improvement is statistically significant (F = 12.059, p-value = 6.864e-06), so adding jbprtfp helps to explain the outcome variable.
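The F statistic in the ANOVA table can be reproduced by hand from the printed values, which shows what the test compares: the drop in RSS per added parameter against the residual variance of the larger model. The sketch below copies the numbers from the anova(m1, m2) output above. The variable names are illustrative.

```r
# Values copied from the anova(m1, m2) output above.
ss_diff <- 44.139   # "Sum of Sq": RSS of m1 minus RSS of m2
df_diff <- 2        # number of parameters added in m2
rss_m2  <- 1537.3   # residual sum of squares of m2
df_m2   <- 840      # residual degrees of freedom of m2

# F = (improvement per added parameter) / (residual variance of the larger model)
f_stat <- (ss_diff / df_diff) / (rss_m2 / df_m2)
p_val  <- pf(f_stat, df1 = df_diff, df2 = df_m2, lower.tail = FALSE)

print(round(f_stat, 3))
print(p_val)
```

The result agrees with the table up to rounding of the printed inputs.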
sjPlot::tab_model(m2, show.ci = F)
| | happy | |
|---|---|---|
| Predictors | Estimates | p |
| (Intercept) | 7.07 | <0.001 |
| teamfeel | 0.15 | <0.001 |
| jbprtfp [Sometimes] | -0.25 | 0.020 |
| jbprtfp [Often/always] | -0.61 | <0.001 |
| Observations | 844 | |
| R2 / R2 adjusted | 0.070 / 0.067 | |
sjPlot::plot_models(m2)
Interpretation: The model shows a significant relation between feeling
happy and both team feeling and the frequency with which the job
prevents giving time to partner/family. For each one-point increase in
feeling part of a team, happiness increases by 0.15 points on average.
The reference jbprtfp category in our model is a job that never/hardly
ever prevents giving time to partner or family. The intercept (7.07) is
the predicted happiness for this category when teamfeel equals 0, and
at any given level of team feeling this category has the highest
predicted happiness. Holding team feeling constant, people whose job
sometimes prevents them from giving time to partner or family are 0.25
points less happy than the reference group (-0.25 estimate
coefficient), and people whose job often/always does so are 0.61 points
less happy than the reference group (-0.61 estimate coefficient).
To summarise, people are happier when they have a job that does not keep them from spending time with family and friends, and the feeling of happiness increases when one feels part of a team.
Constructing regression model equation
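The general equation looks like this: E(Y) = β0 + β1X1 + β2I2 + β3I3, where X1 is the continuous predictor and I2 and I3 are indicator (dummy) variables for the two non-reference categories.
All coefficients in our model are significant, so the equation for the whole model looks like this:
feeling happy = 7.07 + 0.15 * team feeling - 0.25 * (job sometimes prevents giving time to partner/family) - 0.61 * (job often/always prevents giving time to partner/family), where each job term equals 1 if the respondent is in that category and 0 otherwise.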
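As a worked example of the equation above, the sketch below plugs values into it using the rounded coefficients from the m2 table. The function `predict_happy` is a hypothetical helper written for this illustration. With the fitted model object available, `predict(m2, newdata = ...)` would give the unrounded equivalent.

```r
# Coefficients copied from the m2 table above (rounded to two decimals).
b0 <- 7.07
b_team <- 0.15
b_sometimes <- -0.25
b_often <- -0.61

# Hypothetical helper: predicted happiness for a team-feeling score and a job category.
predict_happy <- function(teamfeel,
                          jbprtfp = c("Never/hardly ever", "Sometimes", "Often/always")) {
  jbprtfp <- match.arg(jbprtfp)
  b0 + b_team * teamfeel +
    b_sometimes * (jbprtfp == "Sometimes") +
    b_often * (jbprtfp == "Often/always")
}

# Same team feeling (8), different job categories.
pred_never <- predict_happy(8, "Never/hardly ever")  # 7.07 + 0.15 * 8
pred_often <- predict_happy(8, "Often/always")       # 7.07 + 0.15 * 8 - 0.61
print(c(never = pred_never, often = pred_often))
```

For a respondent with a team feeling of 8, the predicted happiness is 8.27 in the reference category and 7.66 when the job often/always prevents giving time to partner or family. The difference is exactly the -0.61 coefficient.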
Let’s now go back to our research hypotheses and give a general overview of our results based on the previous analysis:
1) The more social contacts a person has and the more often he/she participates in social activities, the higher his/her level of happiness.
The hypothesis was supported. People who meet other people and take part in social activities more frequently are happier.
2) The more a person feels that he/she belongs to a community of colleagues, the higher his/her level of happiness.
The hypothesis was supported. People who feel more like a part of their working team are happier.
3) The better a person rates the closeness of their relationship with their parents, the higher their level of happiness.
The hypothesis was supported. People who have closer relationships with their parents are happier.
4) The more work interferes with relationships with family, the lower a person’s level of happiness.
The hypothesis was supported. People whose job prevents them from giving time to their family are less happy.
Going back to our discussion of explaining happiness, we see that our analysis does not describe it fully. The share of variance in happiness explained by our predictors (the social connection factors) is fairly low. Therefore, we conclude that there are other factors (such as income and health) which probably explain happiness better.
This study is a continuation of our study number 3. Here we aim to expand our understanding of the relationship between social contacts and happiness.
As a reminder, the overall topic of our research is: ‘Digital and social contacts within family and workplace and its relation to subjective well-being and happiness’. However, for this part of our research we decided to focus on feelings of happiness and to look at potential social factors that may influence happiness levels.
Thus, our research question is: What factors influence the relation between connection to social groups and feelings of happiness?
A number of other studies have also found a relationship between the quality of social relationships in the family and happiness. For example, Tammisalo, Danielsbacka, Tanskanen, and Arpino show how relationships with different family members are related to the level of happiness. [1] In addition to family contact, research points to the importance of work relationships for an individual’s happiness. [2]
In our project, we focused on two types of communities: co-workers and family. These groups were chosen because they differ qualitatively from each other. Among co-workers, people interact in a more formal setting, while a family is based on personal relationships between its members.
References:
[1] Tammisalo, K., Danielsbacka, M., Tanskanen, A. O., & Arpino, B. (2024). Social media contact with family members and happiness in younger and older adults. Computers in Human Behavior, 153, 108103.
[2] Haar, J., Schmitz, A., Di Fabio, A., & Daellenbach, U. (2019). The role of relationships at work and happiness: A moderated moderated mediation study of New Zealand managers. Sustainability, 11(12), 3443.
For this investigation we chose the variables that showed statistically significant correlations in the previous project.
Label = c("idno", "closepnt", "teamfeel", "happy", "hhlipnt", "colprop")
Meaning = c("Respondent's identification number", "How close a person feels to parent", "Feeling like part of your work team", "How happy the person is", "Parent lives in same household with a respondent", "Proportion of colleagues based at the same location")
# idno is only an identifier, so it is nominal rather than ratio
Level_Of_Measurement <- c("Nominal (identifier)", "Quasi interval", "Quasi interval", "Quasi interval", "Nominal, binary", "Ordinal")
df <- data.frame(Label, Meaning, Level_Of_Measurement, stringsAsFactors = FALSE)
kable(df) %>%
kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| Label | Meaning | Level_Of_Measurement |
|---|---|---|
| idno | Respondent’s identification number | Nominal (identifier) |
| closepnt | How close a person feels to parent | Quasi interval |
| teamfeel | Feeling like part of your work team | Quasi interval |
| happy | How happy the person is | Quasi interval |
| hhlipnt | Parent lives in same household with a respondent | Nominal, binary |
| colprop | Proportion of colleagues based at the same location | Ordinal |
table(ESS$happy)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 2 4 7 12 17 51 62 220 537 380 231
ESS$happy <- as.numeric(ESS$happy)
ESS_happy<- ESS %>%
select (happy)
table(ESS_happy$happy)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 2 4 7 12 17 51 62 220 537 380 231
summary(ESS_happy$happy)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 8.000 8.000 8.086 9.000 10.000
class(ESS_happy$happy)
## [1] "numeric"
ggplot(ESS_happy)+
geom_histogram( aes(x = happy), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlim(c(0, 10))+
xlab("How happy people feel") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(happy), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(happy), color = 'median'), linetype="solid", linewidth = 2.5)+
geom_vline(aes(xintercept = Mode(happy), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Feeling of happiness")
The data are not normally distributed: there is a long left tail, so the
distribution is skewed to the left. The median and mode both equal 8
and the mean is about 8.09, meaning that Swiss respondents report a
high level of subjective well-being.
describeBy(ESS_happy$happy, group = ESS_happy$happy >0)
##
## Descriptive statistics by group
## group: FALSE
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 2 0 0 0 0 0 0 0 0 NaN NaN 0
## ------------------------------------------------------------
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 1521 8.1 1.46 8 8.26 1.48 1 10 9 -1.33 3.12 0.04
Interpretation: The skew (-1.33) and the kurtosis (3.12) are both outside the ranges expected for a normal distribution (below -0.5 and above 1, respectively). This agrees with the histogram above, which has a sharp peak and a long left tail. So the distribution is not normal.
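The describeBy() call above groups by happy > 0, so the two respondents who answered 0 land in a separate group and are left out of the skew figure. As a cross-check, the sketch below rebuilds the full variable from the frequency table printed above and computes a simple moment-based skewness on all answers, zeros included. `moment_skew` is a hypothetical helper, and its definition may differ slightly from the one psych uses, so the exact value need not match the output above.

```r
# Frequencies copied from the table(ESS$happy) output above (answers 0 to 10).
counts <- c(2, 4, 7, 12, 17, 51, 62, 220, 537, 380, 231)
happy_all <- rep(0:10, times = counts)

# Hypothetical helper: third central moment divided by the cubed
# (population) standard deviation.
moment_skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

print(length(happy_all))          # number of respondents in the table
print(round(mean(happy_all), 3))  # matches the mean reported by summary()
print(median(happy_all))
print(moment_skew(happy_all))     # negative value = left-skewed
```

A value of 0 on the scale poses no problem for the calculation, so psych::describe() could also be applied to the whole variable without a grouping condition.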
table(ESS$closepnt)
##
## 1 2 3 4 5 6 7 8
## 211 441 241 69 23 536 1 1
# Filter the observations
ESS_closepnt <- ESS %>%
select(closepnt) %>%
filter(closepnt < 6)
table(ESS_closepnt$closepnt)
##
## 1 2 3 4 5
## 211 441 241 69 23
# We need to invert the scale first, as in the initial scale 1 stands for "Extremely close" and 5 is for "Not at all close"
ESS_closepnt$closepnt <- as.numeric (6 - ESS_closepnt$closepnt)
ggplot(ESS_closepnt)+
geom_histogram( aes(x = closepnt), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlim(c(0, 10))+
xlab("How close a respondent feels to parent") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(closepnt), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(closepnt), color = 'median'), linetype="solid", linewidth = 3)+
geom_vline(aes(xintercept = Mode(closepnt), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("The degree of closeness to parents")
The histogram is left-skewed, so the data deviate slightly from a normal distribution. The most common answer is that respondents feel very close to their parents (“4” after recoding), while only a small minority feel not close at all. So people in Switzerland tend to be close to their parents, and fewer have distant relationships within the family.
describeBy(ESS_closepnt, group = ESS_closepnt$closepnt > 0)
##
## Descriptive statistics by group
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis
## closepnt 1 985 3.76 0.94 4 3.85 1.48 1 5 4 -0.67 0.29
## se
## closepnt 0.03
Interpretation: The skew (-0.67) is slightly outside the range expected for a normal distribution (below -0.5), while the kurtosis (0.29) is within it (within +-1). This matches the histogram above, which has no very sharp peak but a small left tail. So the distribution is close to normal, but not perfectly normal.
table(ESS$teamfeel)
##
## 0 1 2 3 4 5 6 7 8 9 10 55 66 77 88
## 7 4 6 13 7 17 23 90 212 187 316 100 530 4 7
ESS$teamfeel <- as.numeric(ESS$teamfeel)
# Filter the observations
ESS_teamfeel <- ESS %>%
select (teamfeel) %>%
filter (teamfeel <=10)
table(ESS_teamfeel$teamfeel)
##
## 0 1 2 3 4 5 6 7 8 9 10
## 7 4 6 13 7 17 23 90 212 187 316
summary(ESS_teamfeel$teamfeel)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 8.000 9.000 8.475 10.000 10.000
ggplot(ESS_teamfeel)+
geom_histogram( aes(x = teamfeel), binwidth = 1, fill="#367588", col="#6a5acd", alpha = 0.5) +
xlim(c(0, 10))+
xlab("Feeling like part of your working team") +
ylab("Number of people") +
scale_x_continuous(breaks= seq(0, 10, by=1))+
geom_vline(aes(xintercept = mean(teamfeel), color = 'mean'), linetype="solid", linewidth = 1) +
geom_vline(aes(xintercept = median(teamfeel), color = 'median'), linetype="solid", linewidth = 1)+
geom_vline(aes(xintercept = Mode(teamfeel), color = 'mode'), linetype="solid", linewidth = 1) +
scale_color_manual(name = "Measurement", values = c(median = "#1e90ff", mean = "#ffb6c1", mode = "#C9F76F"))+
ggtitle("Feeling like a part of your work team")
The graph is skewed to the left, so the data are not normally distributed. The most frequent response is that respondents completely feel like part of their team (“10”). The median response is 9 and the mean is about 8.5. All measures of central tendency are above 8, meaning that respondents mostly feel that they belong to their team and feel comfortable in their working teams.
Our variable “teamfeel” is measured on a scale from 0 to 10. Because describeBy() is called below with the grouping condition teamfeel > 0, answers of 0 would be split into a separate group, so we shift the scale by one point to 1-11 before calculating skew and kurtosis. The meaning of the values stays the same.
ESS_teamfeel$teamfeel <- (ESS_teamfeel$teamfeel + 1)
table (ESS_teamfeel$teamfeel)
##
## 1 2 3 4 5 6 7 8 9 10 11
## 7 4 6 13 7 17 23 90 212 187 316
library(psych)
describeBy(ESS_teamfeel, group = ESS_teamfeel$teamfeel > 0)
##
## Descriptive statistics by group
## group: TRUE
## vars n mean sd median trimmed mad min max range skew kurtosis
## teamfeel 1 882 9.48 1.81 10 9.8 1.48 1 11 10 -2.02 5.37
## se
## teamfeel 0.06
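Interpretation: The skew (-2.02) and the kurtosis (5.37) are both outside the ranges expected for a normal distribution (below -0.5 and above 1, respectively), in line with the histogram above, which has a sharp peak and a long left tail. So the distribution is not normal. Note that the mean (9.48) and median (10) in this output are on the shifted 1-11 scale; on the original 0-10 scale they are 8.48 and 9.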
table (ESS$hhlipnt)
##
## 1 2 6 7
## 183 803 536 1
#Filtering the observations
ESS_hhlipnt <- ESS %>%
select(hhlipnt) %>%
filter(hhlipnt <=2)
table(ESS_hhlipnt$hhlipnt)
##
## 1 2
## 183 803
class(ESS_hhlipnt$hhlipnt)
## [1] "integer"
# R stores this variable as an integer, so we convert it to a factor (the variable is nominal, so no ordering is needed)
ESS_hhlipnt$hhlipnt <- factor(ESS_hhlipnt$hhlipnt, labels = c("Yes", "No"))
ggplot(ESS_hhlipnt)+
geom_bar(aes(x = hhlipnt), fill="#367588", col="#6a5acd", alpha = 0.5)+
xlab("Living with parents: yes or no") +
ylab("Number of people") +
ggtitle("Living with parents")
The majority of Swiss people report they do not live in the same household as their parents.
table(ESS$colprop)
##
## 1 2 3 4 5 6 7 55 66 77 88
## 175 185 82 131 93 154 78 78 530 5 12
# Filtering observations
ESS_colprop <- ESS %>%
select(idno, colprop) %>%
filter(colprop <= 7)
table(ESS_colprop$colprop)
##
## 1 2 3 4 5 6 7
## 175 185 82 131 93 154 78
class(ESS_colprop$colprop)
## [1] "integer"
# recode in 3 categories
ESS_colprop$colprop <- dplyr::recode(ESS_colprop$colprop,
"7"= "Small or none",
"6"= "Small or none",
"5"="A half",
"4"="A half",
"3"="A half",
"2"="Very large",
"1"="Very large")
# After recoding, the variable is a character vector, so we convert it to an ordered factor
ESS_colprop$colprop <- factor(ESS_colprop$colprop, levels = c("Small or none", "A half", "Very large"), ordered= T)
ggplot(ESS_colprop %>%
filter(!is.na(colprop))) +
geom_bar(aes(x = colprop), fill="#CCCCFF", col="#FF7F50", alpha = 0.5) +
xlab("The shares of colleagues at the same location") +
ylab("Number of people") +
ggtitle("Proportion of colleagues based at the same location on a normal working day")
From the graph we see that the largest group of respondents report that a very large share of their colleagues is based at their work location on a normal working day (“All” and “Very large” on the initial scale). The smallest group state that none or only a small number of their colleagues are with them during work. An intermediate number report that about half of their colleagues are based at the same location.
table(ESS_colprop$colprop)
##
## Small or none A half Very large
## 232 306 360
table(ESS_jbprtfp$jbprtfp)
##
## Never/hardly ever Sometimes Often/always
## 403 348 195
v.closepnt <- c(round(mean(ESS_closepnt$closepnt), 2), Mode(ESS_closepnt$closepnt), median(ESS_closepnt$closepnt))
names(v.closepnt) <- c("mean", "mode", "median")
v.teamfeel <- c(round(mean(ESS_teamfeel$teamfeel), 2), Mode(ESS_teamfeel$teamfeel), median(ESS_teamfeel$teamfeel))
names(v.teamfeel) <- c("mean", "mode", "median")
v.happy <- c(round(mean(ESS_happy$happy), 2), Mode(ESS_happy$happy), median(ESS_happy$happy))
names(v.happy) <- c("mean", "mode", "median")
v.hhlipnt <- c(NA, Mode(ESS_hhlipnt$hhlipnt), NA)
names(v.hhlipnt) <- c("mean", "mode", "median")
v.colprop <- c(NA, Mode(ESS_colprop$colprop), "A half")
names(v.colprop) <- c("mean", "mode", "median")
tendencymeasures = data.frame( v.closepnt, v.teamfeel, v.happy, v.hhlipnt, v.colprop, stringsAsFactors = FALSE)
kable(tendencymeasures) %>%
kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| | v.closepnt | v.teamfeel | v.happy | v.hhlipnt | v.colprop |
|---|---|---|---|---|---|
| mean | 3.76 | 9.48 | 8.09 | NA | NA |
| mode | 4.00 | 11.00 | 8.00 | No | Very large |
| median | 4.00 | 10.00 | 8.00 | NA | A half |
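In the code above, the medians of the ordered categorical variables are typed in by hand ("A half", and "Sometimes" in the earlier table). The sketch below shows one way to derive them from the category counts printed above: the median category is the first one at which the cumulative share of respondents reaches 50%. `median_category` is a hypothetical helper introduced for this illustration.

```r
# Counts copied from the table() outputs above, in the order of the factor levels.
colprop_counts <- c("Small or none" = 232, "A half" = 306, "Very large" = 360)
jbprtfp_counts <- c("Never/hardly ever" = 403, "Sometimes" = 348, "Often/always" = 195)

# Hypothetical helper: first category whose cumulative share reaches 50%.
median_category <- function(counts) {
  cum_share <- cumsum(counts) / sum(counts)
  names(counts)[which(cum_share >= 0.5)[1]]
}

med_colprop <- median_category(colprop_counts)
med_jbprtfp <- median_category(jbprtfp_counts)
print(c(colprop = med_colprop, jbprtfp = med_jbprtfp))
```

Both results agree with the hand-entered values, "A half" and "Sometimes". The counts must be supplied in the order of the scale for this to be meaningful.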
We start by selecting and filtering the variables, and we also recode the categorical variables.
ESS_regr <- ESS %>%
filter(happy < 77 & closepnt < 6 & hhlipnt < 6 & teamfeel <= 10 & colprop < 55)
table(ESS_regr$hhlipnt)
##
## 1 2
## 96 620
ESS_regr$closepnt <- as.numeric(6 - ESS_regr$closepnt)
ESS_regr$hhlipnt <- as.factor(ESS_regr$hhlipnt)
ESS_regr$happy <- as.numeric(ESS_regr$happy)
ESS_regr$colprop <- as.factor(ESS_regr$colprop)
ESS_regr$colprop <- dplyr::recode(ESS_regr$colprop,
"1"= "Very large",
"2"="Very large",
"3"="A half",
"4"="A half",
"5"="A half",
"6"="Small or none",
"7"="Small or none")
ESS_regr$hhlipnt <- dplyr::recode(ESS_regr$hhlipnt,
"1"= "Yes",
"2"="No")
m3 <- lm(happy ~ closepnt + hhlipnt + teamfeel + colprop, data = ESS_regr)
sjPlot::tab_model(m3, show.ci = F)
| | happy | |
|---|---|---|
| Predictors | Estimates | p |
| (Intercept) | 6.08 | <0.001 |
| closepnt | 0.16 | 0.006 |
| hhlipnt [No] | 0.18 | 0.239 |
| teamfeel | 0.16 | <0.001 |
| colprop [A half] | -0.04 | 0.707 |
| colprop [Small or none] | -0.24 | 0.073 |
| Observations | 716 | |
| R2 / R2 adjusted | 0.069 / 0.063 | |
Interpretation: The adjusted R^2 of this model is 0.063, so it explains about 6.3% of the variance in happiness. Two predictors are significant for the happiness of individuals in Switzerland: closeness to parents (p = 0.006) and feeling like a part of a working team (p < 0.001). Interestingly, the estimates of both predictors are the same. When closeness to a parent increases by 1 point, happiness increases by 0.16 points, holding the other predictors constant. The same applies to feeling like a part of a team: the more connected a person feels to the working team, the happier he/she is (a one-point increase in feeling part of a team goes with a 0.16-point increase in happiness). Living in the same household as a parent and the proportion of colleagues at the same location are not significant in this model.
m4 <- lm(happy ~ closepnt * hhlipnt + teamfeel * colprop, data = ESS_regr)
sjPlot::tab_model( m4, show.ci = F)
| | happy | |
|---|---|---|
| Predictors | Estimates | p |
| (Intercept) | 3.17 | <0.001 |
| closepnt | 0.53 | 0.002 |
| hhlipnt [No] | 1.93 | 0.012 |
| teamfeel | 0.32 | <0.001 |
| colprop [A half] | 1.99 | 0.006 |
| colprop [Small or none] | 1.36 | 0.030 |
| closepnt × hhlipnt [No] | -0.43 | 0.018 |
| teamfeel × colprop [A half] | -0.23 | 0.005 |
| teamfeel × colprop [Small or none] | -0.18 | 0.010 |
| Observations | 716 | |
| R2 / R2 adjusted | 0.090 / 0.079 | |
Interpretation: In the model with interactions, all of the chosen predictors and both interaction terms are significantly associated with happiness. Because the model contains interactions, each main effect is conditional: it describes the effect when the other variable in the interaction is at its reference category (for a factor) or equal to 0 (for a numeric variable).
Starting with closeness to parents, for people who live with a parent (the reference category) each one-unit increase in closeness is associated with a 0.53-point increase in happiness. A positive effect is also found for feeling a part of the working team: for people whose colleagues are largely based at the same location (the reference category “Very large”), a one-point increase in this feeling goes with a 0.32-point increase in happiness.
Turning to the categorical predictors, the coefficient for not living with a parent (1.93) is the estimated difference from those who do live with a parent when closepnt equals 0. This value lies outside the observed 1 to 5 range, so the coefficient should not be read as an average difference between the two groups. In the same way, the coefficients for colprop, 1.99 for “A half” and 1.36 for “Small or none”, are the estimated differences from the “Very large” group when teamfeel equals 0.
As for the interaction effects, living separately from a parent reduces the effect of closeness on happiness by 0.43 points. This means that happiness is more strongly related to closeness to a parent among people who live in the same household as the parent than among those who live separately. Similarly, having only about half, or few or none, of one’s colleagues at the same location reduces the effect of feeling a part of a team on happiness compared with the group where a very large share of colleagues are at the same location. Notably, the reduction is larger when about half of the colleagues are at the same location (0.23) than when few or none of them are (0.18).
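The interaction coefficients are easiest to read as group-specific slopes: the slope in a non-reference group is the main effect plus the corresponding interaction term. The sketch below does this arithmetic with the rounded coefficients from the m4 table above. The vector names are illustrative.

```r
# Coefficients copied from the m4 table above (rounded to two decimals).
b_close <- 0.53          # closepnt slope in the reference group (lives with parent)
b_close_x_no <- -0.43    # closepnt x hhlipnt [No]
b_team <- 0.32           # teamfeel slope in the reference group ("Very large")
b_team_x_half <- -0.23   # teamfeel x colprop [A half]
b_team_x_small <- -0.18  # teamfeel x colprop [Small or none]

# Slope in each group = main effect + interaction term for that group.
slopes <- c(
  closepnt_lives_with_parent = b_close,
  closepnt_lives_apart       = b_close + b_close_x_no,
  teamfeel_very_large        = b_team,
  teamfeel_a_half            = b_team + b_team_x_half,
  teamfeel_small_or_none     = b_team + b_team_x_small
)
print(round(slopes, 2))
```

The slope of closeness is 0.53 for respondents living with a parent and 0.10 for those living apart. The slope of team feeling is 0.32 in the "Very large" group, 0.09 in the "A half" group and 0.14 in the "Small or none" group.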
sjPlot::tab_model(m3, m4, show.ci = F)
| | happy (m3) | | happy (m4) | |
|---|---|---|---|---|
| Predictors | Estimates | p | Estimates | p |
| (Intercept) | 6.08 | <0.001 | 3.17 | <0.001 |
| closepnt | 0.16 | 0.006 | 0.53 | 0.002 |
| hhlipnt [No] | 0.18 | 0.239 | 1.93 | 0.012 |
| teamfeel | 0.16 | <0.001 | 0.32 | <0.001 |
| colprop [A half] | -0.04 | 0.707 | 1.99 | 0.006 |
| colprop [Small or none] | -0.24 | 0.073 | 1.36 | 0.030 |
| closepnt × hhlipnt [No] | | | -0.43 | 0.018 |
| teamfeel × colprop [A half] | | | -0.23 | 0.005 |
| teamfeel × colprop [Small or none] | | | -0.18 | 0.010 |
| Observations | 716 | | 716 | |
| R2 / R2 adjusted | 0.069 / 0.063 | | 0.090 / 0.079 | |
anova(m3,m4)
## Analysis of Variance Table
##
## Model 1: happy ~ closepnt + hhlipnt + teamfeel + colprop
## Model 2: happy ~ closepnt * hhlipnt + teamfeel * colprop
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 710 1337.2
## 2 707 1307.8 3 29.392 5.2964 0.001291 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
According to the ANOVA results, the second model (with interactions) fits significantly better (F = 5.2964, p-value = 0.001291) and has a smaller RSS (1307.8 against 1337.2). It also has a larger adjusted R^2 (0.079 against 0.063).
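Since the second model has three more coefficients, the adjusted R^2 is the fairer figure to compare, because it penalises extra predictors. The sketch below applies the usual adjustment formula to the rounded R^2 values from the tables above. `adj_r2` is a hypothetical helper, and the numbers of slope coefficients (5 and 8) are read off the coefficient tables.

```r
# Hypothetical helper: usual adjusted R-squared formula,
# with n observations and p slope coefficients (intercept not counted).
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# R2 and number of observations copied from the tables above.
adj_m3 <- adj_r2(r2 = 0.069, n = 716, p = 5)
adj_m4 <- adj_r2(r2 = 0.090, n = 716, p = 8)

print(round(c(m3 = adj_m3, m4 = adj_m4), 3))
```

Both values agree with the reported 0.063 and 0.079 up to the rounding of the R^2 inputs, and the interaction model keeps its advantage after the penalty.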
library(ggplot2)
# geom_jitter() already draws the points, so geom_point() is dropped to avoid plotting them twice
ggplot(ESS_regr, aes(x=closepnt, y=happy, color=hhlipnt)) +
geom_jitter(width = 0.45, height = 0.45, alpha = 0.4) +
geom_smooth(method=lm, se=FALSE, fullrange=TRUE)
Comment: here we see visual confirmation of the interaction between closeness and living with a parent. There is a positive relationship between closeness to parents and happiness in both groups, though this relationship is stronger in the “Yes” group, which is colored pink. We draw this conclusion from the slope of the line, which is steeper than the blue one (the “No” group).
# geom_jitter() already draws the points, so geom_point() is dropped to avoid plotting them twice
ggplot(ESS_regr, aes(x=teamfeel, y=happy, color=colprop)) +
geom_jitter(width = 0.45, height = 0.45, alpha = 0.4) +
geom_smooth(method=lm, se=FALSE, fullrange=TRUE)
Comment: we see that the effect is strongest in the group where a very large share of colleagues work in one physical place, as its line is the steepest of the three. For the categories “A half” and “Small or none”, the effect of feeling a part of a working team on happiness is approximately the same, as the slopes of their lines do not differ much, and both lines are flatter than the line for “Very large”.
Some factors moderate the relationships between happiness and its predictors. In the family environment, happiness is related to closeness to parents, but this relationship is moderated by whether the respondent lives in the same household as the parent or separately. For people who live in the same household as their parents, closeness to them is more strongly related to their level of happiness. In the working environment, we have shown that the feeling of belonging to the team is related to happiness, and that this relationship is stronger for people most of whose colleagues are based at the same location.