Anna Gorobtsova
Nadezda Bykova
Artem Kulikov
Anastasia Vlasenko
This paper rests on the assumption that some predictors of happiness are universal and operate irrespective of country, while specific cultural features also contribute to the level of happiness. Two candidate universal predictors are citizens' state of health and their satisfaction with their financial situation. There are reasons to expect both to hold in most countries. First, a person's general state of health is closely connected with his or her mental health, so people who often experience stress, anxiety or depression are likely to be less satisfied with their lives. Second, dissatisfaction with one's financial situation indicates that a person lacks the resources to satisfy his or her needs, so people who are dissatisfied with their finances are more likely to be unhappy. These variables should therefore matter in almost any country, since people everywhere want to be healthy and financially stable.
However, some variables matter only in some cultures and not in others. For example, some countries value hard work, while others have a more hedonic attitude towards life. Moreover, some cultures value freedom of choice and the ability to make decisions independently. Therefore, the extent to which people are free in their decisions might be an important predictor of happiness.
Therefore, the general research question is as follows: do culture-specific features contribute substantially to the overall level of happiness in the chosen country? To answer it, data from the World Values Survey for the years 2005-2006 are used.
The United States of America was chosen for the analysis. Out of 347922 observations in the whole WVS dataset, 8155 are from the USA; after omitting all NAs, 3931 complete cases remain.
The whole WVS dataset contains 289 variables. However, as stated in the introduction, only four of them are used as predictors in model building:
V11 - measures state of health
V46 - measures freedom of choice
V120 - measures the extent to which a person values hard work
V68 - measures one’s satisfaction with financial situation
Additionally, two variables were used to create the happiness index, which is the dependent variable (the exact formula for the index is given during the analysis):
V10 - measures happiness level of a person
V22 - measures one’s life satisfaction
Finally, gender and age are also taken into account in some cases:
V235 - gender
V237 - age
The main hypothesis is that universal predictors, such as health and financial situation, explain more variance than culture-specific ones, such as the value of hard work and freedom of choice.
library(foreign)
library(ggplot2)
library(car)
library(lmtest)
library(sjPlot)
library(ggcorrplot)
wvs <- read.spss("wvs.sav", to.data.frame = TRUE, use.value.labels = TRUE)
# Collapse the 10-point life satisfaction scale (V22) into four levels
wvs$sat <- NA_character_
wvs$sat[wvs$V22 %in% c("Dissatisfied", "2")] <- "1"
wvs$sat[wvs$V22 %in% c("3", "4", "5")] <- "2"
wvs$sat[wvs$V22 %in% c("6", "7", "8")] <- "3"
wvs$sat[wvs$V22 %in% c("9", "Satisfied")] <- "4"
# Recode happiness (V10) so that higher values mean happier
wvs$V10 <- ifelse(wvs$V10 == "Not at all happy", 1,
           ifelse(wvs$V10 == "Not very happy", 2,
           ifelse(wvs$V10 == "Quite happy", 3,
           ifelse(wvs$V10 == "Very happy", 4, NA))))
To create the happiness index we sum the life satisfaction variable (V22) and the happiness variable (V10). Before doing so, we recoded the satisfaction variable into fewer levels to put both variables on the same four-point scale:
wvs$hapIND <- as.numeric(wvs$V10) + as.numeric(wvs$sat)
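The recode and the index can be illustrated on a few made-up scores. This is only a sketch: the vectors `life_sat` and `happy` are hypothetical values, not WVS data, and `cut()` is used here as a compact alternative to the step-by-step recode, with the same bins (1-2, 3-5, 6-8, 9-10).

```r
# Hypothetical life satisfaction scores on the original 1-10 scale
life_sat <- c(1, 2, 3, 5, 6, 8, 9, 10)
# Hypothetical happiness scores on the 1-4 scale
happy <- c(1, 2, 2, 3, 3, 4, 4, 4)

# Same bins as in the recode: 1-2 -> 1, 3-5 -> 2, 6-8 -> 3, 9-10 -> 4
sat4 <- cut(life_sat, breaks = c(0, 2, 5, 8, 10), labels = FALSE)

# The index is the plain sum of the two 4-point items, so it ranges from 2 to 8
hap_index <- happy + sat4
print(data.frame(life_sat, sat4, happy, hap_index))
```

Because both items are on a four-point scale before summing, each contributes equally to the index.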
wvsUSA <- subset(wvs, V2 == "USA")
# Financial satisfaction: 1 = completely dissatisfied ... 10 = completely satisfied
wvsUSA$V68 <- ifelse(wvsUSA$V68 == "Completely satisfied", 10,
              ifelse(wvsUSA$V68 == "Completely dissatisfied", 1,
                     wvsUSA$V68))
# Health: 1 = poor ... 4 = very good
wvsUSA$V11 <- ifelse(wvsUSA$V11 == "Very good", 4,
              ifelse(wvsUSA$V11 == "Good", 3,
              ifelse(wvsUSA$V11 == "Fair", 2,
              ifelse(wvsUSA$V11 == "Poor", 1, NA))))
# Freedom of choice: 1 = none at all ... 10 = a great deal
wvsUSA$V46 <- ifelse(wvsUSA$V46 == "None at all", 1,
              ifelse(wvsUSA$V46 == "A great deal", 10,
                     wvsUSA$V46))
# Hard work: 1 = hard work brings a better life ... 10 = success is luck and connections
wvsUSA$V120 <- ifelse(wvsUSA$V120 == "Hard work doesn't generally bring success - it's more a matter of luck and connections", 10,
               ifelse(wvsUSA$V120 == "In the long run, hard work usually brings a better life", 1,
                      wvsUSA$V120))
# Columns kept for the analysis
keep_vars <- c("V10", "V11", "V2", "V68", "V120", "V46", "V22", "V4", "hapIND", "V237", "V239", "V235")
data1 <- wvsUSA[keep_vars]
data1 <- na.omit(data1)
wvsUSA1 <- data1
wvsUSA1$V237 <- as.numeric(as.character(wvsUSA1$V237))
wvsUSA1$V120 <- as.numeric(as.character(wvsUSA1$V120))
wvsUSA1$V46 <- as.numeric(as.character(wvsUSA1$V46))
wvsUSA1$V68 <- as.numeric(as.character(wvsUSA1$V68))
wvsUSA1$V11 <- as.factor(wvsUSA1$V11)
wvsUSA1$V239 <- as.numeric(as.character(wvsUSA1$V239))
ggplot(wvsUSA1, aes(x = hapIND)) +
geom_bar(col = "navy", fill = "cornflowerblue") +
xlab("Happiness level, from low to high") +
ylab("Number of observations") +
ggtitle("Happiness level for USA") +
geom_vline(aes(xintercept = mean(wvsUSA1$hapIND), colour="Mean"), lwd=1.1 )
The bar plot shows that people in the USA generally consider themselves fairly happy.
ggplot(wvsUSA1, aes(x = V11, y = hapIND)) +
geom_boxplot(col = "navy", fill = "cornflowerblue") +
theme(axis.text.x=element_text(vjust = 0.5)) +
xlab("Level of health") + ggtitle("Health boxplot") + ylab("Subjective happiness level") +
scale_x_discrete(labels=c("Poor", "Fair", "Good", "Very good"))
The boxplot above shows that people with very good health have the highest median happiness level.
ggplot(wvsUSA1, aes(x=V46)) +
geom_histogram(binwidth=0.5, col = "navy", fill = "cornflowerblue")+
labs(title="Freedom of choice histogram", x="Self-perceived freedom of choice", y = "Number of people")+
geom_vline(aes(xintercept = mean(V46), colour="Mean"), lwd=1.1 )
In the histogram above, 1 means that people feel they have no freedom of choice at all and 10 means that they have a great deal of choice. Therefore, we can see that people do not feel completely free in their choices, but they still report some degree of freedom.
ggplot(wvsUSA1, aes(x=V120)) +
geom_histogram(binwidth=0.5,col = "navy", fill = "cornflowerblue")+
labs(title="Hard work histogram", x="Does hard work bring a better life", y = "Number of people")+
geom_vline(aes(xintercept = mean(V120), colour="Mean"), lwd=1.1 )
In the histogram above, 1 means that people value hard work and believe that it brings a better life, while 10 means that they think hard work does not generally bring success. The histogram shows that people in America mostly value hard work and believe that it can bring success and a better life.
ggplot(wvsUSA1, aes(x=V68)) +
geom_histogram(binwidth=0.5, col = "navy", fill = "cornflowerblue")+
labs(title="Financial situation histogram", x="Financial satisfaction level", y = "Number of people")+
geom_vline(aes(xintercept = mean(V68), colour="Mean"), lwd=1.1 )
In the histogram above, 1 means that respondents are completely dissatisfied with the financial situation of their household and 10 means that they are completely satisfied with it. The histogram shows that people are generally quite satisfied with their financial situation.
corrsubset <- wvsUSA1[c("hapIND", "V46", "V120", "V68")]
names(corrsubset) <- c ("Happiness", "Choice", "Hard work",
"Financial satisfaction")
cornum <- cor(corrsubset, use="complete.obs")
ggcorrplot(cornum, lab = TRUE, type = "lower")
The correlation matrix shows that the dependent variable has its highest correlation coefficients with the choice variable and the financial satisfaction variable (0.40 and 0.43), indicating moderate positive correlations: the higher the perceived freedom of choice, the higher the level of happiness, and the higher the financial satisfaction, the higher the happiness level. The remaining correlations, both with the Happiness variable and among the predictors, are weaker.
thap <- t.test(hapIND ~ V235, wvsUSA1)
thap
##
## Welch Two Sample t-test
##
## data: hapIND by V235
## t = -0.147, df = 3923.2, p-value = 0.8831
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.07572420 0.06516071
## sample estimates:
## mean in group Male mean in group Female
## 6.504343 6.509625
The p-value equals 0.8831, which is far above any conventional significance level. Therefore, we fail to reject the null hypothesis: there is no evidence of a difference in mean happiness between men and women.
aov.out <- aov(wvsUSA1$hapIND ~ wvsUSA1$V11)
summary(aov.out)
## Df Sum Sq Mean Sq F value Pr(>F)
## wvsUSA1$V11 3 415 138.40 118.9 <2e-16 ***
## Residuals 3927 4571 1.16
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is small, which means that the difference in mean happiness level between health groups is statistically significant.
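A significant ANOVA only says that at least one group mean differs from the others. A post-hoc comparison such as Tukey's HSD shows which pairs differ. The sketch below uses simulated stand-in data (the group means and sample sizes are assumptions chosen for illustration), not the WVS sample.

```r
set.seed(1)
# Simulated stand-in data: four health groups with different mean happiness
health <- factor(rep(c("Poor", "Fair", "Good", "Very good"), each = 50),
                 levels = c("Poor", "Fair", "Good", "Very good"))
happiness <- rnorm(200, mean = rep(c(5.7, 6.1, 6.4, 6.9), each = 50), sd = 1)

fit <- aov(happiness ~ health)
# Tukey's HSD compares every pair of groups with family-wise error control
posthoc <- TukeyHSD(fit)
print(posthoc)
```

With four groups there are six pairwise comparisons, each reported with a difference in means, a confidence interval and an adjusted p-value.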
Assumptions of linear regression to be checked:
Linearity
Homoscedasticity
No multicollinearity
Normality of residuals
model1<-lm(hapIND ~ V11, data = wvsUSA1)
summary(model1)
##
## Call:
## lm(formula = hapIND ~ V11, data = wvsUSA1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.8758 -0.7255 0.1242 0.9056 2.2745
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.72549 0.08723 65.640 < 2e-16 ***
## V112 0.36894 0.09701 3.803 0.000145 ***
## V113 0.69123 0.09112 7.586 4.11e-14 ***
## V114 1.15028 0.09169 12.545 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.079 on 3927 degrees of freedom
## Multiple R-squared: 0.08326, Adjusted R-squared: 0.08256
## F-statistic: 118.9 on 3 and 3927 DF, p-value: < 2.2e-16
This model explains only about 8% of the variance in the dependent variable. All coefficients are statistically significant.
Interpretation of coefficients (reference group: poor health):
Compared with people in poor health, those in fair health score 0.36894 higher on the happiness index
Compared with people in poor health, those in good health score 0.69123 higher on the happiness index
Compared with people in poor health, those in very good health score 1.15028 higher on the happiness index
So, generally, the better a person's health, the happier he or she is
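Because health is a dummy-coded factor, the intercept is the mean index for the reference group (poor health) and each coefficient is an offset from it. The following sketch recovers the four group means from the coefficients printed in the model1 summary; the level labels follow the recode used earlier (2 = fair, 3 = good, 4 = very good).

```r
# Coefficients copied from the model1 summary
intercept <- 5.72549  # mean happiness index for "Poor" health (reference group)
offsets <- c(Poor = 0, Fair = 0.36894, Good = 0.69123, `Very good` = 1.15028)

# Group mean = intercept + offset of that group
group_means <- intercept + offsets
print(round(group_means, 2))
```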
model2<-lm(hapIND ~ V11 + V68, data = wvsUSA1)
summary(model2)
##
## Call:
## lm(formula = hapIND ~ V11 + V68, data = wvsUSA1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.0612 -0.6955 0.0243 0.7294 2.9710
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.667153 0.088570 52.694 < 2e-16 ***
## V112 0.336978 0.088797 3.795 0.00015 ***
## V113 0.584855 0.083492 7.005 2.9e-12 ***
## V114 0.942783 0.084260 11.189 < 2e-16 ***
## V68 0.180923 0.006556 27.597 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9875 on 3926 degrees of freedom
## Multiple R-squared: 0.2322, Adjusted R-squared: 0.2314
## F-statistic: 296.8 on 4 and 3926 DF, p-value: < 2.2e-16
This model includes only the predictors assumed to be universal. It explains about 23% of the variance. All coefficients are statistically significant.
Interpretation of coefficients (reference group: poor health, holding financial satisfaction constant):
Compared with people in poor health, those in fair health score 0.336978 higher on the happiness index
Compared with people in poor health, those in good health score 0.584855 higher on the happiness index
Compared with people in poor health, those in very good health score 0.942783 higher on the happiness index
So, generally, the better a person's health, the happier he or she is
With a one-unit increase in the financial satisfaction variable, the level of happiness increases by 0.180923. This means that people who are satisfied with the financial situation of their household are generally happier
model3<-lm(hapIND ~ V11 + V68 + V46, data = wvsUSA1)
summary(model3)
##
## Call:
## lm(formula = hapIND ~ V11 + V68 + V46, data = wvsUSA1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5167 -0.6033 0.0048 0.6599 3.1753
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.750342 0.096469 38.876 < 2e-16 ***
## V112 0.269057 0.084788 3.173 0.00152 **
## V113 0.469508 0.079872 5.878 4.49e-09 ***
## V114 0.785450 0.080786 9.723 < 2e-16 ***
## V68 0.147451 0.006482 22.750 < 2e-16 ***
## V46 0.164472 0.008349 19.701 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9422 on 3925 degrees of freedom
## Multiple R-squared: 0.3013, Adjusted R-squared: 0.3004
## F-statistic: 338.5 on 5 and 3925 DF, p-value: < 2.2e-16
Here we start adding culture-specific predictors. This model contains only one, freedom of choice, and it raises the explained variance by approximately 7 percentage points. All coefficients are statistically significant.
Interpretation of coefficients (reference group: poor health, holding the other predictors constant):
Compared with people in poor health, those in fair health score 0.269057 higher on the happiness index
Compared with people in poor health, those in good health score 0.469508 higher on the happiness index
Compared with people in poor health, those in very good health score 0.785450 higher on the happiness index
So, generally, the better a person's health, the happier he or she is
With a one-unit increase in the financial satisfaction variable, the level of happiness increases by 0.147451. This means that people who are satisfied with the financial situation of their household are generally happier
With a one-unit increase in the choice variable, the level of happiness increases by 0.164472. This means that the more freedom of choice a person has, the happier he or she is
model4<-lm(hapIND ~ V11 + V68 + V46 + V120, data = wvsUSA1)
summary(model4)
##
## Call:
## lm(formula = hapIND ~ V11 + V68 + V46 + V120, data = wvsUSA1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5059 -0.6169 0.0219 0.6547 3.3359
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.897671 0.101400 38.438 < 2e-16 ***
## V112 0.277721 0.084592 3.283 0.00104 **
## V113 0.481076 0.079707 6.036 1.73e-09 ***
## V114 0.792382 0.080593 9.832 < 2e-16 ***
## V68 0.144328 0.006500 22.203 < 2e-16 ***
## V46 0.160632 0.008369 19.194 < 2e-16 ***
## V120 -0.029814 0.006474 -4.605 4.25e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9398 on 3924 degrees of freedom
## Multiple R-squared: 0.305, Adjusted R-squared: 0.304
## F-statistic: 287.1 on 6 and 3924 DF, p-value: < 2.2e-16
This model contains both universal and culture-specific predictors. The culture-specific variables together add only about 7 percentage points of explained variance to the model with the two universal predictors. Therefore, the hypothesis stated at the beginning is supported: universal predictors such as health and financial satisfaction explain a larger share of variance than the culture-specific ones. All coefficients are statistically significant.
Interpretation of coefficients (reference group: poor health, holding the other predictors constant):
Compared with people in poor health, those in fair health score 0.277721 higher on the happiness index
Compared with people in poor health, those in good health score 0.481076 higher on the happiness index
Compared with people in poor health, those in very good health score 0.792382 higher on the happiness index
So, generally, the better a person's health, the happier he or she is
With a one-unit increase in the financial satisfaction variable, the level of happiness increases by 0.144328. This means that people who are satisfied with the financial situation of their household are generally happier
With a one-unit increase in the choice variable, the level of happiness increases by 0.160632. This means that the more freedom of choice a person has, the happier he or she is
With a one-unit increase in the hard work variable, the level of happiness decreases by 0.029814. Because higher values of V120 mean a stronger belief that success is a matter of luck and connections rather than hard work, this means that the less a person believes that hard work pays off, the less happy he or she is.
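The comparison between universal and culture-specific predictors can be made explicit by taking differences of the R-squared values reported in the four summaries. This is a sketch of that arithmetic only; note that such increments depend on the order in which predictors are entered, so they describe this particular sequence of models.

```r
# Multiple R-squared values copied from the summaries of model1 to model4
r2 <- c(model1 = 0.08326, model2 = 0.2322, model3 = 0.3013, model4 = 0.3050)

# Variance explained that each step adds over the previous model
added <- diff(c(0, r2))
names(added) <- c("V11 (health)", "V68 (finances)", "V46 (choice)", "V120 (hard work)")
print(round(added, 4))

universal <- sum(added[1:2])  # health + financial satisfaction
cultural <- sum(added[3:4])   # freedom of choice + hard work
print(c(universal = universal, cultural = cultural))
```

In this sequence the two universal predictors account for about 0.23 of the variance, and the two culture-specific ones add about 0.07.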
anova(model1,model2)
## Analysis of Variance Table
##
## Model 1: hapIND ~ V11
## Model 2: hapIND ~ V11 + V68
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 3927 4571.4
## 2 3926 3828.7 1 742.69 761.57 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(model2,model3)
## Analysis of Variance Table
##
## Model 1: hapIND ~ V11 + V68
## Model 2: hapIND ~ V11 + V68 + V46
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 3926 3828.7
## 2 3925 3484.2 1 344.52 388.11 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(model3,model4)
## Analysis of Variance Table
##
## Model 1: hapIND ~ V11 + V68 + V46
## Model 2: hapIND ~ V11 + V68 + V46 + V120
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 3925 3484.2
## 2 3924 3465.4 1 18.731 21.21 4.246e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In each model comparison the p-value is significant, which means that every added predictor significantly improves the fit over the previous model. Therefore, model 4 (which contains both universal and culture-specific predictors) is the best of the four.
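The F statistic in these tables can be reproduced by hand from the printed sums of squares, which makes clear what the test measures: the drop in residual sum of squares per added parameter, relative to the residual variance of the larger model. The numbers below are copied from the comparison of model3 and model4.

```r
# Values copied from the anova(model3, model4) table
ss_added <- 18.731      # reduction in RSS from adding V120
df_added <- 1           # number of added parameters
rss_big <- 3465.4       # RSS of the larger model (model4)
df_resid_big <- 3924    # residual degrees of freedom of model4

f_stat <- (ss_added / df_added) / (rss_big / df_resid_big)
p_val <- pf(f_stat, df_added, df_resid_big, lower.tail = FALSE)
print(c(F = f_stat, p = p_val))
```

The result agrees with the F value of 21.21 in the table, up to rounding of the printed inputs.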
vif(model4)
## GVIF Df GVIF^(1/(2*Df))
## V11 1.043999 3 1.007202
## V68 1.112580 1 1.054789
## V46 1.114957 1 1.055915
## V120 1.031463 1 1.015610
All values are well below 5. Therefore, we conclude that multicollinearity is not a problem.
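For a factor with several levels, such as V11, the generalized VIF is adjusted for its degrees of freedom so that it is comparable across terms. The third column of the output is simply GVIF raised to the power 1/(2*Df), as this short check with the printed values shows.

```r
# GVIF and Df copied from the vif(model4) output
gvif <- c(V11 = 1.043999, V68 = 1.112580, V46 = 1.114957, V120 = 1.031463)
df <- c(V11 = 3, V68 = 1, V46 = 1, V120 = 1)

# Adjusted value: GVIF^(1 / (2 * Df)); for Df = 1 this is the square root of the VIF
adjusted <- gvif^(1 / (2 * df))
print(round(adjusted, 6))
```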
par(mfrow = c(2,2))
plot(model4)
Residuals vs Fitted: the points are not quite evenly dispersed around zero, which suggests heteroscedasticity
The normal Q-Q plot shows that the residuals are approximately normally distributed
There are also no high-leverage or influential cases, as the Cook's distance contour lines do not appear within the plotted range of the last plot
Checking for heteroscedasticity again:
bptest(model4)
##
## studentized Breusch-Pagan test
##
## data: model4
## BP = 89.518, df = 6, p-value < 2.2e-16
ncvTest(model4)
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 80.54891, Df = 1, p = < 2.22e-16
These two tests also produce significant p-values, which confirms that heteroscedasticity is present. The other linear regression assumptions appear to be met. Heteroscedasticity does not bias the OLS coefficient estimates, so the estimates themselves remain unbiased, but it makes the usual standard errors and p-values unreliable, so the significance levels reported above should be read with caution. Heteroscedasticity is common in cross-sectional studies like this one: in the USA, for example, the spread of values may differ considerably between states, especially given how diverse the country is.
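One common remedy, not applied in this analysis, is to keep the OLS coefficients and replace the standard errors with heteroscedasticity-robust ones. The sketch below computes White's HC0 estimator by hand in base R on simulated data whose error variance grows with the predictor; the data and variable names are assumptions for illustration only. In practice, `sandwich::vcovHC()` together with `lmtest::coeftest()` is the usual packaged route.

```r
set.seed(42)
n <- 500
x <- runif(n, 1, 10)
# Error standard deviation grows with x, so the data are heteroscedastic
y <- 2 + 0.5 * x + rnorm(n, sd = 0.3 * x)
fit <- lm(y ~ x)

X <- model.matrix(fit)       # design matrix (intercept and x)
e <- residuals(fit)          # OLS residuals
bread <- solve(crossprod(X)) # (X'X)^-1
meat <- crossprod(X * e)     # sum over i of e_i^2 * x_i x_i'
vcov_hc0 <- bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc0))
classic_se <- sqrt(diag(vcov(fit)))
print(cbind(estimate = coef(fit), classic_se, robust_se))
```

The coefficient estimates are identical in both columns of inference; only the standard errors, and therefore the t statistics and p-values, change.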
model6 <-lm(hapIND ~ V11 + V68 + V46 + V120 + V4 + poly(V237, 3, raw = TRUE), data = wvsUSA1)
summary(model6)
##
## Call:
## lm(formula = hapIND ~ V11 + V68 + V46 + V120 + V4 + poly(V237,
## 3, raw = TRUE), data = wvsUSA1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.4204 -0.6053 0.0203 0.6549 3.2831
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.502e+00 2.834e-01 15.886 < 2e-16 ***
## V112 3.204e-01 8.416e-02 3.807 0.000143 ***
## V113 5.502e-01 8.004e-02 6.875 7.20e-12 ***
## V114 8.867e-01 8.184e-02 10.835 < 2e-16 ***
## V68 1.354e-01 6.671e-03 20.299 < 2e-16 ***
## V46 1.581e-01 8.326e-03 18.991 < 2e-16 ***
## V120 -2.721e-02 6.439e-03 -4.225 2.44e-05 ***
## V4Rather important -4.400e-01 6.793e-02 -6.478 1.04e-10 ***
## V4Not very important -2.188e-01 1.662e-01 -1.317 0.187974
## V4Not at all important -5.642e-02 3.312e-01 -0.170 0.864746
## poly(V237, 3, raw = TRUE)1 -4.770e-02 1.758e-02 -2.713 0.006702 **
## poly(V237, 3, raw = TRUE)2 1.034e-03 3.617e-04 2.859 0.004279 **
## poly(V237, 3, raw = TRUE)3 -6.343e-06 2.317e-06 -2.737 0.006221 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9322 on 3918 degrees of freedom
## Multiple R-squared: 0.3172, Adjusted R-squared: 0.3151
## F-statistic: 151.7 on 12 and 3918 DF, p-value: < 2.2e-16
The summary table shows that the three age terms have alternating signs (-4.770e-02 for age, 1.034e-03 for age squared and -6.343e-06 for age cubed). The coefficients of a raw polynomial cannot be interpreted one at a time; together they describe a curve in which happiness first decreases with age, then increases, and finally decreases again.
Plotting the model with a non-linear effect:
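The ages at which the fitted age curve changes direction follow from setting the derivative of the cubic to zero. The sketch below does this with the rounded coefficients printed in the model6 summary, holding all other predictors fixed, so the results are approximate.

```r
# Age coefficients copied from the model6 summary
b1 <- -4.770e-02  # age
b2 <- 1.034e-03   # age^2
b3 <- -6.343e-06  # age^3

# Derivative of b1*age + b2*age^2 + b3*age^3 is b1 + 2*b2*age + 3*b3*age^2;
# polyroot() takes coefficients in increasing order of power
turning <- sort(Re(polyroot(c(b1, 2 * b2, 3 * b3))))
print(round(turning, 1))

# Sign of the second derivative: positive = local minimum, negative = local maximum
curvature <- 2 * b2 + 6 * b3 * turning
print(sign(curvature))
```

Under these assumptions the curve has a minimum at about age 33 and a maximum at about age 75.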
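# Cubic trend of happiness in age (bivariate fit; unlike model6, no other predictors are controlled for)
ggplot(wvsUSA1, aes(V237, hapIND)) +
geom_point() +
stat_smooth(method = "lm", formula = y ~ poly(x, 3, raw = TRUE))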
The plot above shows the trend just described. People aged about 35-40 are less happy. This is probably because growing older brings many responsibilities, which make people somewhat more anxious, so the level of happiness drops. After about 40, when children are more or less grown up and people are more stable financially and emotionally, they start to enjoy life more because they have more time for themselves. When people reach approximately 80 years old, their level of happiness starts to decrease again, probably because they begin to experience more health problems at this age.
model7 <- lm(hapIND ~ V11 + V68 + V120 + V4 + V46*V235, data = wvsUSA1)
summary(model7)
##
## Call:
## lm(formula = hapIND ~ V11 + V68 + V120 + V4 + V46 * V235, data = wvsUSA1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.4973 -0.6183 0.0192 0.6524 3.2311
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.059443 0.118631 34.219 < 2e-16 ***
## V112 0.298394 0.084258 3.541 0.000403 ***
## V113 0.493619 0.079418 6.215 5.65e-10 ***
## V114 0.802530 0.080301 9.994 < 2e-16 ***
## V68 0.144087 0.006476 22.251 < 2e-16 ***
## V120 -0.029121 0.006446 -4.518 6.44e-06 ***
## V4Rather important -0.436179 0.068120 -6.403 1.70e-10 ***
## V4Not very important -0.193597 0.166839 -1.160 0.245966
## V4Not at all important -0.004553 0.331725 -0.014 0.989051
## V46 0.139543 0.011429 12.210 < 2e-16 ***
## V235Female -0.259634 0.124122 -2.092 0.036523 *
## V46:V235Female 0.036358 0.015781 2.304 0.021283 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9347 on 3919 degrees of freedom
## Multiple R-squared: 0.3134, Adjusted R-squared: 0.3115
## F-statistic: 162.7 on 11 and 3919 DF, p-value: < 2.2e-16
plot_model(model7, type = "int")
Here we added an interaction between the freedom of choice variable and the gender variable
Interpretation of the result
The effect of freedom of choice on happiness is not the same for men and women, which means that there is an interaction effect. The slope for women (0.139543 + 0.036358) is steeper than the slope for men (0.139543) over the whole range of the scale. Women start from a lower level (the coefficient for Female is -0.259634), so the plot shows their predicted happiness catching up as freedom of choice grows and exceeding that of men at a value of roughly 7 to 7.5.
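The point at which the two lines cross can be computed directly from the model7 coefficients: the predicted difference between women and men is the Female coefficient plus the interaction coefficient times freedom of choice, and it is zero at the crossover. A short sketch with the printed values, with men as the reference category:

```r
# Coefficients copied from the model7 summary
b_choice <- 0.139543   # slope of V46 for men (reference category)
b_female <- -0.259634  # difference for women when V46 = 0
b_inter <- 0.036358    # extra slope of V46 for women

slope_men <- b_choice
slope_women <- b_choice + b_inter

# Women minus men = b_female + b_inter * V46, which equals zero at the crossover
crossover <- -b_female / b_inter
print(c(slope_men = slope_men, slope_women = slope_women, crossover = crossover))
```

The crossover falls at a freedom of choice score of about 7.1, close to what the interaction plot suggests.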
The analysis answers the research question. Culture-specific predictors, namely the extent to which a person values hard work and the extent to which he or she feels free in making choices and decisions, add relatively little to the model with only universal predictors. However, they still increase the variance explained. Therefore, relying on universal predictors alone is not sufficient; it is still important to take culture-specific features of the country into account.