Introduction

Hello! Today I’d like to explore happiness a bit (again). The beauty of happiness is in its relevance, I guess. Happiness is always needed, it’s always relevant, and I don’t really think it will ever be fully explored.

My idea is to look at the factors that describe how people live. The idea of “being myself” and “living for myself” is very popular today: have an exciting life, don’t depend on anyone, and don’t live the way anyone expects. Inspiring and beautiful, isn’t it? But how happy does it really make people? And what about people who are willing to behave as expected, as is “right”: how do they feel? How happy are they?

Yes, that’s my idea. Let’s move on to the data.

Data description

  • Country: Russia.
  • N of observations: 10781.

Variables and hypotheses

As for the variables, I’ve used:

  • index of happiness, constructed from both how happy the person is and how satisfied they are with their life;

  • freedom: how much freedom of choice and control a person feels they have over the way their life turns out. It’s a numeric variable (a rating from 1 to 10). I expect that the more freedom people feel, the happier they are.

After freedom come the factors that, in my opinion, show a person’s inner freedom: how they live, and whether they want to live their own way or behave as expected. The following 4 binary variables show the two sides of this:

  • seeking to be themselves;
  • deciding goals in life by themselves;
  • making parents proud as one of the life goals;
  • making an effort to live up to what their friends expect.

My assumption here is that people who live as they are, without trying to live as expected (or seeking to make their parents proud), will be happier. In contrast, wanting to live as others expect will make people less happy.

The next 2 variables are closely related to the previous idea, and I hope they will support it. They capture what is important for people:

  • to have an exciting life with adventure and taking risks;
  • to behave properly and avoid doing anything people would say is wrong.

These two variables are categorical (ordinal), with 6 levels of how important that is for a person.

Descriptive statistics

  • Index of happiness

The distribution is more or less normal; the mean is about 4.3 (median 4) on an index whose observed values run from 1 to 10.

#class(wvs.ext$happy)
#levels(wvs.ext$happy)
#class(wvs.ext$satisf)
wvs.ext$happy0 <-   ifelse(wvs.ext$happy=="Not at all happy",1,
                            ifelse(wvs.ext$happy=="Not very happy",2,
                                   ifelse(wvs.ext$happy=="Quite happy",3,
                                          ifelse(wvs.ext$happy=="Very happy",4, NA))))
#table(wvs.ext$happy, wvs.ext$happy0)
wvs.ext$satisf1 <-   ifelse(wvs.ext$satisf=="Satisfied",10,
                             ifelse(wvs.ext$satisf=="Dissatisfied",1,
                                    wvs.ext$satisf))
#table(wvs.ext$satisf, wvs.ext$satisf1)
#wvs.ext$satisf <- as.numeric(wvs.ext$satisf)
# Index = row mean of the two recoded items (happy0 is on 1-4, satisf1 on 1-10)
wvs.ext$happyIND <- rowMeans(wvs.ext[c('happy0','satisf1')], na.rm = TRUE)
wvs.ext$happyINDscale <- scale(wvs.ext$happyIND)
# happiness (na.omit() keeps only the rows that are complete on every column of wvs.ext)
wvs.ext1 <- na.omit(wvs.ext)
hist(wvs.ext1$happyINDscale, 
     main="Histogram for happiness index", 
     xlab="Level of happiness", 
     prob = TRUE)
lines(density(wvs.ext1$happyINDscale))

summary(wvs.ext$happyIND)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   3.000   4.000   4.262   5.500  10.000      60
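
A side note on how the index is built: happy0 runs from 1 to 4 while satisf1 runs from 1 to 10, so a plain row mean lets satisfaction weigh more. Below is a minimal sketch of one possible alternative, standardising each item before averaging. The data frame `toy` and its values are made up for illustration (this is not the WVS data and not what was done above).

```r
# Hypothetical toy data standing in for the two recoded survey items.
toy <- data.frame(
  happy0  = c(1, 2, 3, 4, 3, NA),   # happiness, coded 1-4
  satisf1 = c(2, 4, 7, 10, 6, 5)    # life satisfaction, coded 1-10
)

# Plain row mean, as in the report: the wider 1-10 item dominates.
toy$raw_index <- rowMeans(toy[c("happy0", "satisf1")], na.rm = TRUE)

# Alternative: z-score each item first, then average the z-scores.
z <- scale(toy[c("happy0", "satisf1")])
toy$z_index <- rowMeans(z, na.rm = TRUE)

print(toy)
```

With the z-scored version both items contribute on the same footing; whether that is preferable depends on how much weight satisfaction is supposed to carry.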
  • Freedom level

Judging by the graph, the mass of the distribution sits toward the right (higher) end of the scale, so the longer tail is on the low side, and there is a slightly negative kurtosis: the higher the level of freedom, the more respondents choose it. The mean level of freedom is about 6.2 out of 10.

# freedom
wvs.ext1 <- na.omit(wvs.ext)
hist(wvs.ext1$freedom1, 
     main="Histogram for freedom", 
     xlab="Level of freedom", 
     prob = TRUE)
lines(density(wvs.ext1$freedom1))

summary(wvs.ext$freedom1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   5.000   6.000   6.156   8.000  10.000    2058
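
Skew and kurtosis can also be checked with numbers rather than by eye. A minimal sketch in base R, using the moment-based definitions; the vector `x` is made up for illustration and is not the freedom1 variable.

```r
# Hypothetical ratings on a 1-10 scale, piled up toward the high end.
x <- c(1, 3, 5, 6, 6, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10)

m <- mean(x)
s <- sqrt(mean((x - m)^2))                     # population standard deviation

# Negative skewness means the longer tail is on the low side.
skewness <- mean((x - m)^3) / s^3
# Negative excess kurtosis means a flatter shape than the normal curve.
excess_kurtosis <- mean((x - m)^4) / s^4 - 3

c(skewness = skewness, excess_kurtosis = excess_kurtosis)
```

The same two lines could be applied to the real variable after dropping missing values.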

Then I created 4 binary variables (warning: lots of missing values). These variables are for:

  • seek to be themselves (be_myself_1);
  • decide goals in life by themselves (goals_myself_1);
  • making parents proud as one of the life-goals (par_proud_1);
  • making effort to live up to what my friends expect (live_as_expected_1).
wvs.ext2 <- wvs.ext %>% select("be_myself_1", "goals_myself_1", "par_proud_1", "live_as_expected_1")
summary(wvs.ext2)
##  be_myself_1 goals_myself_1 par_proud_1 live_as_expected_1
##  0   :  90   0   : 149      0   : 470   0   :2373         
##  1   :1875   1   :1817      1   :4256   1   :2276         
##  NA's:8816   NA's:8815      NA's:6055   NA's:6132

Now I run t-tests to check the differences in mean happiness between the groups and their significance. As can be seen below, in most cases the difference is not really big but still significant: all p-values are < 0.05. In general, the happier groups are those who seek to be themselves, who set their goals themselves, who aim to make their parents proud, and who live as their friends expect.

t.test(wvs.ext$happyIND ~ wvs.ext$be_myself_1) #p-value = 0.0008403
## 
##  Welch Two Sample t-test
## 
## data:  wvs.ext$happyIND by wvs.ext$be_myself_1
## t = -3.4472, df = 96.584, p-value = 0.0008403
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.8985872 -0.2419158
## sample estimates:
## mean in group 0 mean in group 1 
##        3.966667        4.536918
boxplot(wvs.ext$happyIND ~ wvs.ext$be_myself_1, 
main = "Seek to be themselves")

t.test(wvs.ext$happyIND ~ wvs.ext$goals_myself_1) #p-value = 2.178e-10
## 
##  Welch Two Sample t-test
## 
## data:  wvs.ext$happyIND by wvs.ext$goals_myself_1
## t = -6.7584, df = 168.54, p-value = 2.178e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.154585 -0.632555
## sample estimates:
## mean in group 0 mean in group 1 
##        3.684564        4.578134
boxplot(wvs.ext$happyIND ~ wvs.ext$goals_myself_1, 
main = "Decide goals by themselves")

t.test(wvs.ext$happyIND ~ wvs.ext$par_proud_1) #p-value = 1.715e-06
## 
##  Welch Two Sample t-test
## 
## data:  wvs.ext$happyIND by wvs.ext$par_proud_1
## t = -4.8354, df = 565.04, p-value = 1.715e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.4605990 -0.1944947
## sample estimates:
## mean in group 0 mean in group 1 
##        4.242521        4.570068
boxplot(wvs.ext$happyIND ~ wvs.ext$par_proud_1, 
main = "Goal is to make parents proud")

t.test(wvs.ext$happyIND ~ wvs.ext$live_as_expected_1) #p-value = 2.611e-05 
## 
##  Welch Two Sample t-test
## 
## data:  wvs.ext$happyIND by wvs.ext$live_as_expected_1
## t = -4.2093, df = 4632.3, p-value = 2.611e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.24373415 -0.08883771
## sample estimates:
## mean in group 0 mean in group 1 
##        4.457999        4.624285
boxplot(wvs.ext$happyIND ~ wvs.ext$live_as_expected_1, 
main = "Prefer living as expected by friends")
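
The four tests above repeat the same call, so they can be run in one loop that also reports the group sizes (worth watching here, since some groups are small: only 90 respondents are in group 0 of be_myself_1). A minimal sketch on made-up data; `toy`, `group_a` and `group_b` are hypothetical names, not the WVS variables.

```r
set.seed(1)  # reproducible made-up data, not the WVS sample
n <- 200
toy <- data.frame(
  happy   = rnorm(n, mean = 4.5, sd = 1.5),
  group_a = factor(rbinom(n, 1, 0.5)),
  group_b = factor(rbinom(n, 1, 0.3))
)

# Run the same Welch test for each grouping variable and collect the key numbers.
results <- t(sapply(c("group_a", "group_b"), function(v) {
  in_0 <- toy$happy[toy[[v]] == "0"]
  in_1 <- toy$happy[toy[[v]] == "1"]
  tt <- t.test(in_0, in_1)   # Welch test: t.test() defaults to var.equal = FALSE
  c(n_0 = length(in_0), n_1 = length(in_1),
    mean_diff = mean(in_1) - mean(in_0), p_value = tt$p.value)
}))

round(results, 3)
```

One row per grouping variable makes it easy to compare the size of the differences next to the p-values.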

Then I worked a bit with the other two variables, about what is important in life: adventure and risk-taking (imp_adventure) versus proper behaviour (imp_proper). Both variables have 6 levels (people answered how closely the statement describes them, from “Not at all like me” to “Very much like me”). Summaries with the N of observations for each answer are the following:

wvs.ext2 <- wvs.ext %>% select("imp_adventure", "imp_proper")
summary(wvs.ext2)
##             imp_adventure               imp_proper  
##  Very much like me : 116   Very much like me : 350  
##  Like me           : 213   Like me           : 565  
##  Somewhat like me  : 321   Somewhat like me  : 465  
##  A little like me  : 292   A little like me  : 324  
##  Not like me       : 542   Not like me       : 204  
##  Not at all like me: 486   Not at all like me:  47  
##  NA's              :8811   NA's              :8826

I’ve also run Kruskal-Wallis tests to see whether the differences in happiness levels across the answers are significant: they are.

kruskal.test(wvs.ext$happyIND ~ wvs.ext$imp_adv1)   #p-value = 6.253e-15
## 
##  Kruskal-Wallis rank sum test
## 
## data:  wvs.ext$happyIND by wvs.ext$imp_adv1
## Kruskal-Wallis chi-squared = 75.827, df = 5, p-value = 6.253e-15
kruskal.test(wvs.ext$happyIND ~ wvs.ext$imp_prop1)  # p-value = 0.02066
## 
##  Kruskal-Wallis rank sum test
## 
## data:  wvs.ext$happyIND by wvs.ext$imp_prop1
## Kruskal-Wallis chi-squared = 13.307, df = 5, p-value = 0.02066
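
Kruskal-Wallis only says that at least one group differs; it does not say which ones. A possible follow-up, sketched here on made-up data, is a set of pairwise Wilcoxon tests with a Holm correction via `pairwise.wilcox.test()` from base R's stats package. The `level` and `score` vectors are hypothetical.

```r
set.seed(2)  # made-up data with three ordered answer levels
level <- factor(rep(1:3, each = 40))
score <- rnorm(120, mean = c(4, 4.2, 5)[as.integer(level)], sd = 1)

kruskal.test(score ~ level)    # omnibus test, same call shape as in the report

# Pairwise comparisons between levels, p-values adjusted with Holm's method.
posthoc <- pairwise.wilcox.test(score, level, p.adjust.method = "holm")
posthoc$p.value
```

The result is a lower-triangular matrix of adjusted p-values, one per pair of levels.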

Analysis

Well, now we are ready to move into the modeling!

“Be yourself and live your life”

First, I’d like to look at the relation between the “be yourself” model of behaviour and a person’s happiness. The idea is that people who seek to be themselves rather than follow others are happier. I also guess that deciding their goals in life by themselves adds happiness too. Then I’d like to add how important it is for them to live an exciting life with adventures and, maybe, how much freedom they have.

wvs.ext1 <- wvs.ext %>% select(happyINDscale, be_myself_1, goals_myself_1, imp_adv1, freedom1) %>% na.omit()

model1_1 <- lm(happyINDscale ~ be_myself_1, data = wvs.ext1)
summary(model1_1) # adjusted R^2 = 0.007, p-value = 0.0001781
## 
## Call:
## lm(formula = happyINDscale ~ be_myself_1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3148 -0.6906 -0.0410  0.6087  3.5321 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.18925    0.09998  -1.893 0.058541 .  
## be_myself_11  0.38469    0.10242   3.756 0.000178 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9272 on 1828 degrees of freedom
## Multiple R-squared:  0.007658,   Adjusted R-squared:  0.007115 
## F-statistic: 14.11 on 1 and 1828 DF,  p-value: 0.0001781
model1_2 <- lm(happyINDscale ~ be_myself_1 + goals_myself_1, data = wvs.ext1)
summary(model1_2) # adjusted R^2 = 0.028, p-value = 3.196e-12
## 
## Call:
## lm(formula = happyINDscale ~ be_myself_1 + goals_myself_1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3477 -0.7235  0.1799  0.5758  3.4992 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -0.54722    0.11424  -4.790  1.8e-06 ***
## be_myself_11     0.25378    0.10349   2.452   0.0143 *  
## goals_myself_11  0.52178    0.08322   6.270  4.5e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9177 on 1827 degrees of freedom
## Multiple R-squared:  0.02856,    Adjusted R-squared:  0.0275 
## F-statistic: 26.86 on 2 and 1827 DF,  p-value: 3.196e-12
model1_3 <- lm(happyINDscale ~ be_myself_1 + goals_myself_1 + imp_adv1, data = wvs.ext1)
summary(model1_3) # R^2 = 0.06 , p-value < 2.2e-16
## 
## Call:
## lm(formula = happyINDscale ~ be_myself_1 + goals_myself_1 + imp_adv1, 
##     data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6863 -0.6058  0.1063  0.6935  3.2807 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -0.68218    0.11654  -5.853 5.70e-09 ***
## be_myself_11     0.24009    0.10228   2.347 0.019013 *  
## goals_myself_11  0.49021    0.08227   5.959 3.04e-09 ***
## imp_adv12        0.06246    0.05949   1.050 0.293862    
## imp_adv13        0.27631    0.06942   3.980 7.15e-05 ***
## imp_adv14        0.25161    0.06827   3.686 0.000235 ***
## imp_adv15        0.39877    0.07655   5.209 2.11e-07 ***
## imp_adv16        0.51881    0.09560   5.427 6.50e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9045 on 1822 degrees of freedom
## Multiple R-squared:  0.05877,    Adjusted R-squared:  0.05516 
## F-statistic: 16.25 on 7 and 1822 DF,  p-value: < 2.2e-16
model1_4 <- lm(happyINDscale ~ be_myself_1 + goals_myself_1 + imp_adv1 + freedom1, data = wvs.ext1)
summary(model1_4) # R^2 = 0.15 , p-value < 2.2e-16
## 
## Call:
## lm(formula = happyINDscale ~ be_myself_1 + goals_myself_1 + imp_adv1 + 
##     freedom1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9286 -0.5666  0.0747  0.5968  4.1294 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -1.129896   0.114910  -9.833  < 2e-16 ***
## be_myself_11     0.096582   0.097539   0.990 0.322215    
## goals_myself_11  0.261880   0.079656   3.288 0.001030 ** 
## imp_adv12        0.016013   0.056522   0.283 0.776981    
## imp_adv13        0.161886   0.066336   2.440 0.014766 *  
## imp_adv14        0.158340   0.065086   2.433 0.015080 *  
## imp_adv15        0.249197   0.073363   3.397 0.000697 ***
## imp_adv16        0.376722   0.091228   4.129  3.8e-05 ***
## freedom1         0.120401   0.008431  14.280  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.858 on 1821 degrees of freedom
## Multiple R-squared:  0.1536, Adjusted R-squared:  0.1498 
## F-statistic:  41.3 on 8 and 1821 DF,  p-value: < 2.2e-16
plot(model1_4)

model1_5 <- lm(happyINDscale ~ goals_myself_1 + imp_adv1 + freedom1, data = wvs.ext1)
summary(model1_5) # R^2 =0.15  , p-value < 2.2e-16
## 
## Call:
## lm(formula = happyINDscale ~ goals_myself_1 + imp_adv1 + freedom1, 
##     data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9261 -0.5701  0.0791  0.5981  4.1372 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -1.057758   0.088863 -11.903  < 2e-16 ***
## goals_myself_11  0.275490   0.078461   3.511 0.000457 ***
## imp_adv12        0.018190   0.056479   0.322 0.747445    
## imp_adv13        0.161029   0.066330   2.428 0.015291 *  
## imp_adv14        0.161034   0.065029   2.476 0.013364 *  
## imp_adv15        0.251323   0.073331   3.427 0.000623 ***
## imp_adv16        0.376420   0.091227   4.126 3.85e-05 ***
## freedom1         0.121261   0.008386  14.459  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.858 on 1822 degrees of freedom
## Multiple R-squared:  0.1531, Adjusted R-squared:  0.1499 
## F-statistic: 47.06 on 7 and 1822 DF,  p-value: < 2.2e-16
#plot(model1_5)

# the lower the better
#AIC(model1_1) # 4920.745
#AIC(model1_2) # 4883.789
#AIC(model1_3) # 4835.972
#AIC(model1_4) # 4643.722
#AIC(model1_5) # 4642.707

Model <- c("model1_1", "model1_2", "model1_3", "model1_4", "model1_5")
AIC_score <- c("4920.745", "4883.789", "4835.972", "4643.722", "4642.707")
AIC_scores <- data.frame(Model, AIC_score)
kable(AIC_scores) %>%
  kable_styling(c("bordered"))
Model AIC_score
model1_1 4920.745
model1_2 4883.789
model1_3 4835.972
model1_4 4643.722
model1_5 4642.707
stats::anova(model1_1, model1_2, model1_3, model1_4,  model1_5, test="Chisq")

First of all, the model improved with every additional variable: adjusted R-squared rose from almost 0 to 0.06, which means that the 3rd model (with all three variables) explains about 6% of the variability of happiness. That’s not much and the model is weak, but it is still significant: all models have small p-values (< 0.05). I then added freedom as an additional factor, which raised the explained variability of happiness to 15%. After that I removed the insignificant predictor (seeking to be themselves). After the anova analysis, I take the 4th model as the best one, since the last one, with one predictor removed, is not better. AIC agrees that the 4th model beats models 1 to 3 (its value is clearly lower); the 5th model’s AIC is lower still, but only by about 1 point (4642.7 vs 4643.7), so the two are practically tied. As for outliers and leverage points: there are no high-leverage points and a few outliers (observations 7913, 9087 and 8979).
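
The AIC table above was typed in by hand; `AIC()` accepts several fitted models at once and returns the table directly. A minimal sketch on made-up data (the variables `x1`, `x2`, `noise` and the three toy models are hypothetical, not the models from this report):

```r
set.seed(3)  # made-up data for illustration
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
toy$y <- 0.5 * toy$x1 + 0.3 * toy$x2 + rnorm(n)

m_small <- lm(y ~ x1, data = toy)
m_mid   <- lm(y ~ x1 + x2, data = toy)
m_big   <- lm(y ~ x1 + x2 + noise, data = toy)

# One call gives a data frame with the number of parameters (df) and the AIC.
aic_table <- AIC(m_small, m_mid, m_big)
aic_table[order(aic_table$AIC), ]

# F-tests between successive models; each model must be nested in the next one.
anova(m_small, m_mid, m_big)
```

Keeping the AIC values numeric (rather than as strings) also makes it possible to sort the table or take differences.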

As for that model, I can say that:

  • if a person decides their goals by themselves, their level of happiness is higher by 0.26 (happiness is standardised here, so all effects are in standard deviations); feeling of freedom also increases the level of happiness (by 0.12 with every point); and if the statement “living an adventurous life is important” is close to the person, their level of happiness is higher too, compared with the lowest answer (and the stronger the agreement, the bigger the increase: by 0.16 for “a little like me” and “somewhat like me”, by 0.25 for “like me” and by 0.38 for “very much like me”).

In other words, deciding their goals in life by themselves and feeling free while making decisions make people happier, as does the importance of an adventurous and exciting life.
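
Because the outcome was passed through `scale()`, the coefficients are in standard deviations of the index. To get back to raw index points, a standardised coefficient can be multiplied by the standard deviation of the raw outcome. A minimal sketch on made-up data (the data frame `toy` and its columns are hypothetical):

```r
set.seed(4)  # made-up data, one binary predictor
toy <- data.frame(g = factor(rbinom(250, 1, 0.5)))
toy$y <- 4 + 0.6 * (toy$g == "1") + rnorm(250, sd = 1.5)
toy$y_scaled <- as.numeric(scale(toy$y))   # same transformation as happyINDscale

b_raw    <- coef(lm(y ~ g, data = toy))["g1"]          # effect in raw units
b_scaled <- coef(lm(y_scaled ~ g, data = toy))["g1"]   # effect in SD units

c(raw = unname(b_raw),
  scaled = unname(b_scaled),
  scaled_times_sd = unname(b_scaled * sd(toy$y)))      # matches the raw effect
```

For the real data the multiplier would be `sd(wvs.ext$happyIND, na.rm = TRUE)`, whose value is not shown in the report.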

Now, #fun_fact: “being yourself” is significant in the first three models but loses its significance in the 4th one, once freedom is added. Well, maybe it’s time to change your motto to something more extraordinary :)

Now let’s try adding the freedom factor as an interaction term: how much freedom of choice and control people feel they have over the way their life turns out.

model2_1 <- lm(happyINDscale ~ imp_adv1 + freedom1 * goals_myself_1 , data= wvs.ext1)
summary(model2_1) # R^2 = 0.15 , p-value < 2.2e-16, AIC = 4634.73
## 
## Call:
## lm(formula = happyINDscale ~ imp_adv1 + freedom1 * goals_myself_1, 
##     data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9050 -0.5602  0.0661  0.6018  4.0827 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              -1.46725    0.15717  -9.336  < 2e-16 ***
## imp_adv12                 0.01071    0.05639   0.190  0.84939    
## imp_adv13                 0.15505    0.06619   2.342  0.01927 *  
## imp_adv14                 0.15319    0.06492   2.360  0.01839 *  
## imp_adv15                 0.24784    0.07316   3.388  0.00072 ***
## imp_adv16                 0.37246    0.09101   4.092 4.46e-05 ***
## freedom1                  0.20136    0.02673   7.533 7.74e-14 ***
## goals_myself_11           0.75131    0.16991   4.422 1.04e-05 ***
## freedom1:goals_myself_11 -0.08844    0.02803  -3.155  0.00163 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8559 on 1821 degrees of freedom
## Multiple R-squared:  0.1577, Adjusted R-squared:  0.154 
## F-statistic: 42.62 on 8 and 1821 DF,  p-value: < 2.2e-16
# AIC(model2_1)
library(sjPlot)
## Learn more about sjPlot with 'browseVignettes("sjPlot")'.
plot_model(model2_1, type = "int")

anova(model1_4, model2_1) #model2_1 IS NOT better

Well, here we see a slightly higher adjusted R-squared (0.154), so the model explains roughly the same 15% of the variability of happiness. As for the significant relations here: feeling of freedom is positively related to people’s happiness, as is the ability to decide goals themselves. The importance of living an exciting life also keeps its positive relation to happiness.

As for the interaction effect: freedom of choice really does increase the level of happiness, as does the ability to set goals by themselves. When respondents set their goals by themselves, their level of happiness is higher than if they did not. However, their happiness then grows more slowly with increasing freedom (about 0.11 per point instead of 0.20). And, what is interesting to notice, at the highest levels of freedom (above roughly 8.5 on the 10-point scale, where the two lines cross) the level of happiness is higher without the ability to set goals themselves (maybe that means the harder the way, the sweeter the reward? idk :c ).
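
The two freedom slopes and the point where the lines cross follow from simple arithmetic on the coefficients in the model2_1 summary. A minimal sketch of that calculation (the variable names are mine; the numbers are copied from the output above):

```r
# Coefficients copied from the model2_1 summary.
b_freedom     <- 0.20136    # freedom slope when goals_myself_1 = 0
b_goals       <- 0.75131    # head start for goals_myself_1 = 1 (at freedom = 0)
b_interaction <- -0.08844   # change in the freedom slope when goals_myself_1 = 1

slope_goals0 <- b_freedom
slope_goals1 <- b_freedom + b_interaction

# The lines meet where the head start is used up by the flatter slope.
crossover <- -b_goals / b_interaction

round(c(slope_goals0 = slope_goals0,
        slope_goals1 = slope_goals1,
        crossover    = crossover), 3)
```

Note that freedom = 0 is outside the 1 to 10 scale, so the 0.75 head start is an extrapolation; the crossing point near 8.5 is inside the observed range.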

As for the comparison with the previous models: both adjusted R-squared (0.154 vs 0.150) and AIC (4634.73 vs 4642.71) are close to those of the last additive models, slightly in favour of the interaction model. The anova above does not show this model to be better than the best additive one. One caveat: model1_4 and model2_1 have the same residual degrees of freedom (1821) and neither is nested in the other, so this anova is not a proper F-test; model1_5 would be the nested comparison.

As for the interpretation of the coefficients, it’s roughly the same as before:

  • if the statement “living an adventurous life is important” is close to the person, their level of happiness is higher (and the stronger the agreement, the bigger the increase: by 0.15 for “a little like me” and “somewhat like me”, by 0.25 for “like me” and by 0.37 for “very much like me”). If a person decides their goals by themselves, their level of happiness is higher at the start (by 0.75) and grows with every additional point of felt freedom of choice, but more slowly than if they did not set their goals themselves.

“Who’s a good boy here?”

Now, I’m curious about factors such as how important it is for a person to behave properly, to live as their friends expect and to make their parents proud. My assumption is that trying to live properly or as expected makes people less happy. Or maybe a lack of freedom plays its role here. I also guess these people are a kind of contrast to the previous group, and if someone combines traits from these different groups, that will make them unhappy. Let’s try to explore it. (upd: I’m really interested to see the interaction between feeling of freedom and the importance of living properly.)

wvs.ext1 <- wvs.ext %>% select(happyINDscale, live_as_expected_1, par_proud_1, imp_prop1, freedom1) %>% na.omit()

model3_1 <- lm(happyINDscale ~ par_proud_1, data= wvs.ext1)
summary(model3_1) # R^2 =  0.01, p-value = 3.013e-05
## 
## Call:
## lm(formula = happyINDscale ~ par_proud_1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3660 -0.4890  0.2326  0.5574  3.4809 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.006207   0.055041  -0.113     0.91    
## par_proud_11  0.252892   0.060438   4.184 3.01e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9194 on 1633 degrees of freedom
## Multiple R-squared:  0.01061,    Adjusted R-squared:   0.01 
## F-statistic: 17.51 on 1 and 1633 DF,  p-value: 3.013e-05
model3_2 <- lm(happyINDscale ~ live_as_expected_1 + par_proud_1, data= wvs.ext1)
summary(model3_2) # R^2 =  0.01, p-value = 0.0001645, live_as_expected_1 is insignificant
## 
## Call:
## lm(formula = happyINDscale ~ live_as_expected_1 + par_proud_1, 
##     data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3707 -0.4882  0.2279  0.5599  3.4833 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -0.007031   0.055360  -0.127    0.899    
## live_as_expected_11  0.007185   0.050378   0.143    0.887    
## par_proud_11         0.251263   0.061526   4.084 4.64e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9196 on 1632 degrees of freedom
## Multiple R-squared:  0.01062,    Adjusted R-squared:  0.009408 
## F-statistic: 8.759 on 2 and 1632 DF,  p-value: 0.0001645
model3_3 <- lm(happyINDscale ~ live_as_expected_1 + par_proud_1 + imp_prop1 , data= wvs.ext1)
summary(model3_3) # R^2 =  0.014, p-value = 8.437e-05, live_as_expected_1 and imp_prop1 are insignificant
## 
## Call:
## lm(formula = happyINDscale ~ live_as_expected_1 + par_proud_1 + 
##     imp_prop1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5495 -0.6006  0.0491  0.6343  3.5644 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          0.02611    0.15123   0.173    0.863    
## live_as_expected_11  0.01679    0.05043   0.333    0.739    
## par_proud_11         0.27049    0.06188   4.371 1.31e-05 ***
## imp_prop12           0.13360    0.16372   0.816    0.415    
## imp_prop13          -0.02220    0.15740  -0.141    0.888    
## imp_prop14          -0.01159    0.15486  -0.075    0.940    
## imp_prop15          -0.13340    0.15414  -0.865    0.387    
## imp_prop16          -0.11560    0.15735  -0.735    0.463    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9174 on 1627 degrees of freedom
## Multiple R-squared:  0.01841,    Adjusted R-squared:  0.01419 
## F-statistic: 4.359 on 7 and 1627 DF,  p-value: 8.437e-05
model3_4 <- lm(happyINDscale ~ live_as_expected_1 + par_proud_1 + imp_prop1 + freedom1 , data= wvs.ext1)
summary(model3_4) # R^2 =  0.12, p-value < 2.2e-16, live_as_expected_11 and imp_prop1 are insignificant
## 
## Call:
## lm(formula = happyINDscale ~ live_as_expected_1 + par_proud_1 + 
##     imp_prop1 + freedom1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6642 -0.5421  0.1011  0.5822  4.3289 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -0.867888   0.155578  -5.578 2.84e-08 ***
## live_as_expected_11  0.012495   0.047531   0.263   0.7927    
## par_proud_11         0.172812   0.058718   2.943   0.0033 ** 
## imp_prop12           0.160036   0.154312   1.037   0.2998    
## imp_prop13           0.037137   0.148410   0.250   0.8024    
## imp_prop14           0.041844   0.146004   0.287   0.7745    
## imp_prop15          -0.030907   0.145450  -0.212   0.8317    
## imp_prop16          -0.006109   0.148496  -0.041   0.9672    
## freedom1             0.124610   0.008691  14.338  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8647 on 1626 degrees of freedom
## Multiple R-squared:  0.1286, Adjusted R-squared:  0.1243 
## F-statistic: 29.99 on 8 and 1626 DF,  p-value: < 2.2e-16
plot(model3_4)

model3_5 <- lm(happyINDscale ~ par_proud_1 + freedom1 , data= wvs.ext1)
summary(model3_5) # R^2 =  0.12, p-value < 2.2e-16
## 
## Call:
## lm(formula = happyINDscale ~ par_proud_1 + freedom1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6999 -0.5609  0.0981  0.5489  4.2845 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.846076   0.077355 -10.938  < 2e-16 ***
## par_proud_11  0.162747   0.057185   2.846  0.00448 ** 
## freedom1      0.126388   0.008649  14.613  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8648 on 1632 degrees of freedom
## Multiple R-squared:  0.1251, Adjusted R-squared:  0.124 
## F-statistic: 116.7 on 2 and 1632 DF,  p-value: < 2.2e-16
anova(model3_1, model3_2, model3_3, model3_4, model3_5) # 4th is the best
# the lower the better
#AIC(model3_1) # 4368.993
#AIC(model3_2) # 4370.973
#AIC(model3_3) # 4368.048
#AIC(model3_4) # 4175.385
#AIC(model3_5) # 4169.945
#AIC(model3_6) # 4172.472

Model_1 <- c("model3_1", "model3_2", "model3_3", "model3_4", "model3_5")
AIC_score_1 <- c("4368.99", "4370.97", "4368.05", "4175.39", "4169.95")
AIC_scores_1 <- data.frame(Model_1, AIC_score_1)
kable(AIC_scores_1) %>%
  kable_styling
Model_1 AIC_score_1
model3_1 4368.99
model3_2 4370.97
model3_3 4368.05
model3_4 4175.39
model3_5 4169.95
wvs.ext1$imp_prop1 <-   ifelse(wvs.ext1$imp_prop1=="1",0,
                            ifelse(wvs.ext1$imp_prop1=="2",0,
                               ifelse(wvs.ext1$imp_prop1=="3",0.5,
                                          ifelse(wvs.ext1$imp_prop1=="4",0.5, 
                                                 ifelse(wvs.ext1$imp_prop1=="5",1, 
                                                        ifelse(wvs.ext1$imp_prop1=="6",1, NA))))))
wvs.ext1$imp_prop1 <- as.factor(wvs.ext1$imp_prop1)
#summary(wvs.ext1$imp_prop1)
model3_6 <- lm(happyINDscale ~ live_as_expected_1 + par_proud_1 + freedom1 *  imp_prop1, data= wvs.ext1)
summary(model3_6) # R^2 =  0.13, p-value < 2.2e-16
## 
## Call:
## lm(formula = happyINDscale ~ live_as_expected_1 + par_proud_1 + 
##     freedom1 * imp_prop1, data = wvs.ext1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6482 -0.5342  0.1038  0.5805  4.2798 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -1.28378    0.20990  -6.116 1.20e-09 ***
## live_as_expected_11    0.01170    0.04734   0.247  0.80484    
## par_proud_11           0.18114    0.05865   3.089  0.00204 ** 
## freedom1               0.19563    0.02600   7.525 8.67e-14 ***
## imp_prop10.5           0.54374    0.23432   2.320  0.02044 *  
## imp_prop11             0.43214    0.22493   1.921  0.05488 .  
## freedom1:imp_prop10.5 -0.08384    0.02971  -2.822  0.00482 ** 
## freedom1:imp_prop11   -0.07738    0.02856  -2.710  0.00680 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8625 on 1627 degrees of freedom
## Multiple R-squared:  0.1325, Adjusted R-squared:  0.1288 
## F-statistic:  35.5 on 7 and 1627 DF,  p-value: < 2.2e-16
plot_model(model3_6, type = "int")

anova(model3_4, model3_6) # interactive model is not better
#model3_7 <- lm(happyINDscale ~ par_proud_1 + freedom1 *  imp_prop1, data= wvs.ext1)
#summary(model3_7) # R^2 =  0.13, p-value < 2.2e-16
#plot_model(model3_7)
#anova(model3_6, model3_7) # last model IS NOT better

Here I have constructed 6 models: the first 5 are additive and the last one (the 6th) has an interaction. I did it the same way as in the previous part: first, I added predictors one by one and noticed an overall improvement (adjusted R-squared increased from 0.01 to 0.12, meaning the 4th model explains 12% of the variability of happiness). Then I tried removing the insignificant predictors just like before; that left adjusted R-squared unchanged (0.124) while AIC dropped slightly (4169.95 vs 4175.39), so the fit did not get better, the model only got simpler. The interaction was added in the last model (13% of the variability of happiness); according to anova, the model with interaction is not better than the 4th additive model (the two differ in how imp_prop1 is coded and are not strictly nested, so this comparison is only indicative).

4th (additive) model: in fact I don’t really like it: half of the predictors are insignificant, and removing them does not make the model better. As for the significant parts: if people want their parents to be proud of them, their level of happiness is higher by 0.17, and with every additional point of felt freedom of choice it increases by 0.12. In other words, people who want their parents to be proud of them and who feel free in their choices are happier.

Being a bit annoyed by the 6 levels of this factor, I collapsed it into 3 levels (which at least gave me a better interaction graph later). Now 0 means no relation to proper behaviour at all (answers “Not at all like me” and “Not like me”), 0.5 is a slight relation (answers “A little like me” and “Somewhat like me”), and 1 is the respondent’s confident relation to proper behaviour (answers “Like me” and “Very much like me”).
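
The collapsing was done above with six nested `ifelse()` calls. The same mapping can be written as a named lookup vector, which is shorter and easier to check. A minimal sketch, assuming the answers are coded "1" to "6" as they are in the recode above; the `answers` vector is made up.

```r
# Hypothetical 6-level answers, coded "1".."6", with one missing value.
answers <- factor(c("1", "2", "3", "4", "5", "6", NA, "4"))

# Named lookup: old code -> new collapsed level.
lookup <- c("1" = "0", "2" = "0",
            "3" = "0.5", "4" = "0.5",
            "5" = "1", "6" = "1")

# Index the lookup by the old codes; missing answers stay missing.
collapsed <- factor(unname(lookup[as.character(answers)]),
                    levels = c("0", "0.5", "1"))

table(answers, collapsed, useNA = "ifany")
```

The cross-table at the end is a quick check that every old level landed in the intended new one.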

6th (interactive) model gives roughly the same results: if people want their parents to be proud of them, their level of happiness is higher by 0.18; with every additional point of felt freedom of choice, their level of happiness increases by 0.20 (for those with no relation to proper behaviour); and if the idea that behaving properly is important is slightly close to them (answers “somewhat like me” or “a little like me”), their level of happiness is higher by 0.54. As for the interactions, there are three levels of how close the respondent is to the idea of proper behaviour. The slight and the confident relation give a similar effect. The idea here is that when it’s important for respondents to behave properly, freedom to make decisions still increases their level of happiness, but to a lesser degree (about 0.11 to 0.12 per point instead of 0.20). And finally, at higher levels of freedom in making decisions and setting goals (above roughly 5.6 to 6.5 points, where the lines cross), respondents are happier if behaving properly is not important to them.
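
The per-group freedom slopes and the crossing points can be worked out directly from the model3_6 coefficients. A minimal sketch of that arithmetic (the variable and group names are mine; the numbers are copied from the summary above):

```r
# Coefficients copied from the model3_6 summary.
b_freedom <- 0.19563                                     # slope when imp_prop1 = 0
main_eff  <- c(slight = 0.54374, confident = 0.43214)    # imp_prop1 = 0.5 and 1
inter     <- c(slight = -0.08384, confident = -0.07738)  # freedom1:imp_prop1 terms

# Freedom slope within each group of imp_prop1.
slopes <- c(none = b_freedom, b_freedom + inter)

# Freedom level at which each group's line meets the "none" line.
crossover <- -main_eff / inter

round(slopes, 3)
round(crossover, 2)
```

Keep in mind that the main effect for the confident group is only marginally significant (p = 0.055), so its crossing point is the less certain of the two.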

In general it seems that if you’re a good guy, you suffer a bit from that (and freedom of choice does not work at full strength). However, you still tend to be happier than those without that feeling of freedom.

Conclusion

The first thing I’ve noticed is the interaction of the “freedom effect”, in both cases: with the live-your-life lovers and with the be-good lovers. The idea is that feeling free in making decisions by itself increases the level of happiness, and it is truly one of the strongest predictors in my models. However, when it interacts with other variables, such as setting your goals by yourself or remembering to behave properly, the interaction term is negative: happiness grows more slowly with freedom. In my opinion, that’s the cruelty of freedom: by itself it’s a great thing. However, as soon as that freedom faces the harsh reality of taking responsibility for your choices, the effect changes. Maybe I am wrong, but I’ve found it really interesting.

Thank you for your attention ~