# RQ: Does separation from parents affect household financial satisfaction in Latin American and African countries?

## Hypotheses

1. People who rate their standard of living as higher than their parents’ will report higher financial satisfaction.
2. People for whom making their parents proud is one of their main goals in life will report lower financial satisfaction.
3. A larger household and living with parents will each be associated with lower household financial satisfaction.

## Variables

### 1st level

To measure separation from parents, I will use three main first-level variables:

Q56 - Standard of living compared with your parents

Q27 - One of my main goals in life has been to make my parents proud

Q271 - Do you live with your parents

An additional first-level variable is Q270 - number of people in household

### Control

The control variables will be:

Q260 - sex

Q262 - age

Q273 - marital status

Q288 - scale of income

### 2nd level

The second-level (country-level) variables are:

GDPpercap1 - GDP per capita, PPP (current international $)

incomeWB - Country income group (low, lower-middle, upper-middle, high)

regionWB - Geographic region (7 groups)
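
The two levels listed above suggest respondents nested in countries. The text does not state a model, so the following is only a sketch under the assumption of a random-intercept specification. The symbols $i$ (respondent), $j$ (country), $\mathbf{c}_{ij}$ (control variables), $\mathbf{z}_j$ (second-level variables), $u_j$ and $\varepsilon_{ij}$ are introduced here for illustration, and categorical predictors such as Q56 would enter as dummy variables:

```latex
\begin{aligned}
\text{Q50}_{ij} &= \beta_0
  + \beta_1\,\text{Q56}_{ij}
  + \beta_2\,\text{Q27}_{ij}
  + \beta_3\,\text{Q271}_{ij}
  + \beta_4\,\text{Q270}_{ij}
  + \boldsymbol{\gamma}^{\top}\mathbf{c}_{ij}
  + \boldsymbol{\delta}^{\top}\mathbf{z}_{j}
  + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \tau^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Under this reading, hypotheses 1 to 3 correspond to the signs of $\beta_1$ to $\beta_4$, while $u_j$ absorbs country differences not explained by the second-level variables.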

## Countries

Countries that will be included in the analysis: Puerto Rico, Brazil, Peru, Chile, Argentina, Uruguay, Mexico, Venezuela, Guatemala, Nicaragua, Colombia, Ecuador, Bolivia, Macau SAR, Nigeria, Kenya, Ethiopia, Zimbabwe, Morocco, Tunisia, Libya

library(foreign)
library(missForest)
library(dplyr)
library(ggplot2)
library(sjPlot)
library(table1)

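# Load WVS Wave 7; with to.data.frame = TRUE, labelled answers are returned as factors
wvs <- read.spss("WVS_Cross-National_Wave_7_spss_v6_0.sav", to.data.frame = TRUE)

# Keep the country identifier, the survey items, and the country-level indicators
wvs1 <- select(wvs, c(C_COW_NUM, Q50, Q260, Q262, Q270, Q273, Q274, Q275R, Q279, Q280, Q288, Q288R, Q56, Q27, Q271, GDPpercap1, giniWB, btimarket, incomeWB, regionWB))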

## Data preparation

table1(~ Q50 + Q260 + Q262 + Q270 + Q273 + Q288 + Q56 + Q27 + Q271 + GDPpercap1 + incomeWB + regionWB, data = wvs1)

Problems in the data:

  1. Q50 (financial satisfaction) - the lowest and highest levels are text labels, so the variable is not numeric
  2. Q262 - not numeric
  3. Q270 - not numeric
  4. Q288 - not numeric
  5. Q271 - has two small categories
  6. Q273 - has small categories
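
The "not numeric" problems arise because `read.spss(..., to.data.frame = TRUE)` stores labelled answers as factors. The short illustration below uses invented values, not WVS data, to show why the conversions of Q262 and Q270 go through `as.character()`: calling `as.numeric()` directly on a factor returns the internal level codes rather than the printed numbers.

```r
# Toy factor standing in for an age column read from SPSS (values are invented)
age_f <- factor(c("16", "29", "103", NA))

# Direct conversion returns level positions, not ages
codes <- as.numeric(age_f)

# Converting to character first recovers the printed numbers
ages <- as.numeric(as.character(age_f))

print(codes)
print(ages)  # 16 29 103 NA
```

Direct conversion is only appropriate when the level order itself is the quantity of interest, for example an ordered scale whose levels are text labels.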
wvs1$Q262_n <- as.numeric(as.character(wvs1$Q262))
summary(wvs1$Q262_n)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   16.00   29.00   41.00   43.18   55.00  103.00     511
wvs1$Q270_n <- as.numeric(as.character(wvs1$Q270))
summary(wvs1$Q270_n)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   2.000   4.000   4.025   5.000  63.000     985
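# Unlike Q262 and Q270, this takes the factor's integer level codes;
# that is valid only if the levels are ordered from the lowest to the highest income step
wvs1$Q288_n <- as.numeric(wvs1$Q288)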
summary(wvs1$Q288_n)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    1.00    4.00    5.00    4.91    6.00   10.00    2961
# Collapse the three "Yes, ..." answers into a single "Yes"; "No" and NA are kept as they are
wvs1$Q271_r <- ifelse(wvs1$Q271 == "Yes, own parent(s)" | wvs1$Q271 == "Yes, parent(s) in law" | 
                         wvs1$Q271 == "Yes, both own parent(s) and parent(s) in law", "Yes", as.character(wvs1$Q271)) %>% as.factor()
summary(wvs1$Q271_r)
##    No   Yes  NA's 
## 67098 27858  2264
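# Married or cohabiting -> "Partnered"; "Single" stays; every other answer -> "Previously partnered"
wvs1$Q273_r <- ifelse(wvs1$Q273 == "Married" | wvs1$Q273 == "Living together as married", "Partnered", 
                      ifelse(wvs1$Q273 == "Single", "Single", "Previously partnered")) %>% as.factor()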
summary(wvs1$Q273_r)
##            Partnered Previously partnered               Single 
##                61456                11925                23250 
##                 NA's 
##                  589
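
The nested `ifelse()` calls above collapse small categories. As an alternative sketch on invented answers, base R can merge factor levels by assigning a named list to `levels()`. The labels "Divorced" and "Widowed" are assumptions about how the remaining Q273 answers are labelled, not taken from the data shown here.

```r
# Invented answers; "Divorced" and "Widowed" are assumed labels
marital <- factor(c("Married", "Living together as married", "Single",
                    "Divorced", "Widowed", NA))

# Each list name is a new level; its value lists the old levels merged into it
marital_r <- marital
levels(marital_r) <- list(
  "Partnered"            = c("Married", "Living together as married"),
  "Previously partnered" = c("Divorced", "Widowed"),
  "Single"               = "Single"
)

print(table(marital_r, useNA = "ifany"))
```

One difference in behaviour: any old level not named in the list becomes `NA`, whereas the nested `ifelse()` sends every remaining answer to "Previously partnered".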
# Replace the two labelled endpoints with their scale values so the 1-10 scale can be made numeric
wvs1$Q50_r <- ifelse(wvs1$Q50=="Satisfied", 10, 
                     ifelse(wvs1$Q50=="Dissatisfied", 1, as.character(wvs1$Q50))) %>% as.numeric()
summary(wvs1$Q50_r)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   5.000   6.000   6.209   8.000  10.000     625
table1(~ Q50_r + Q260 + Q262_n + Q270_n + Q273_r + Q288_n + Q56 + Q27 + Q271_r + GDPpercap1 + incomeWB + regionWB, data = wvs1)

# Keep the recoded variables and give them short, readable names
wvs2 <- select(wvs1, c(C_COW_NUM, Q50_r, Q260, Q262_n, Q270_n, Q273_r, Q288_n, Q56, Q27, Q271_r, GDPpercap1, incomeWB, regionWB))
# lookup maps new_name = "old_name", the form that rename(all_of()) expects
lookup <- c(cntry = "C_COW_NUM", finsat="Q50_r", sex="Q260", age="Q262_n", people_house="Q270_n", marit_stat="Q273_r", income="Q288_n", status_comp="Q56", parent_proud="Q27", with_parent="Q271_r", GDPpc="GDPpercap1", country_income = "incomeWB", region = "regionWB")
wvs2 <- rename(wvs2, all_of(lookup))

summary(wvs2)
##                       cntry           finsat           sex       
##  Canada                  : 4018   Min.   : 1.000   Male  :45995  
##  Indonesia               : 3200   1st Qu.: 5.000   Female:51130  
##  China                   : 3036   Median : 6.000   NA's  :   95  
##  Great Britain           : 2609   Mean   : 6.209                 
##  United States of America: 2596   3rd Qu.: 8.000                 
##  Turkey                  : 2415   Max.   :10.000                 
##  (Other)                 :79346   NA's   :625                    
##       age          people_house                   marit_stat        income     
##  Min.   : 16.00   Min.   : 1.000   Partnered           :61456   Min.   : 1.00  
##  1st Qu.: 29.00   1st Qu.: 2.000   Previously partnered:11925   1st Qu.: 4.00  
##  Median : 41.00   Median : 4.000   Single              :23250   Median : 5.00  
##  Mean   : 43.18   Mean   : 4.025   NA's                :  589   Mean   : 4.91  
##  3rd Qu.: 55.00   3rd Qu.: 5.000                                3rd Qu.: 6.00  
##  Max.   :103.00   Max.   :63.000                                Max.   :10.00  
##  NA's   :511      NA's   :985                                   NA's   :2961   
##             status_comp               parent_proud   with_parent 
##  Better off       :52478   Agree strongly   :47027   No  :67098  
##  Worse off        :16698   Agree            :36660   Yes :27858  
##  Or about the same:26284   Disagree         : 9949   NA's: 2264  
##  NA's             : 1760   Strongly disagree: 1908               
##                            NA's             : 1676               
##                                                                  
##                                                                  
##      GDPpc                    country_income 
##  Min.   :  2312   Low income         : 2430  
##  1st Qu.: 10317   Lower middle income:24457  
##  Median : 16785   Upper middle income:35201  
##  Mean   : 27428   High income        :34685  
##  3rd Qu.: 43029   NA's               :  447  
##  Max.   :129103                              
##  NA's   :5363                                
##                           region     
##  East Asia and Pacific       :26088  
##  Europe and Central Asia     :25852  
##  Latin America and Caribbean :17439  
##  Middle East and North Africa: 9906  
##  North America               : 6614  
##  (Other)                     :10874  
##  NA's                        :  447

## Imputation by country

# Take one country at a time and recode cntry so the factor keeps only that country's level
wvs_pr <- subset(wvs2, cntry == 'Puerto Rico')
wvs_pr$cntry <- ifelse(wvs_pr$cntry=="Puerto Rico", "Puerto Rico", "0") %>% as.factor()

set.seed(123)
imp_pr = missForest(wvs_pr, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0004657031 0.2453948 
##     difference(s): 1.345912e-10 0.001996451 
##     time: 2.536 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0004549618 0.2326775 
##     difference(s): 4.266594e-12 0.0003327418 
##     time: 2.706 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.000458273 0.2303552 
##     difference(s): 2.446672e-12 0.0003327418 
##     time: 2.504 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0004613947 0.2372319 
##     difference(s): 1.91408e-12 0.0002218279 
##     time: 2.595 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.0004565319 0.2333803 
##     difference(s): 1.658793e-12 0.0003327418 
##     time: 2.938 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.000464642 0.2350178 
##     difference(s): 9.354495e-12 0.0002218279 
##     time: 2.907 seconds
## 
##   missForest iteration 7 in progress...done!
##     estimated error(s): 0.0004588342 0.2304481 
##     difference(s): 2.869853e-12 0.0004436557 
##     time: 2.72 seconds
## 
##   missForest iteration 8 in progress...done!
##     estimated error(s): 0.0004601552 0.2335738 
##     difference(s): 3.461541e-12 0.0003327418 
##     time: 2.68 seconds
## 
##   missForest iteration 9 in progress...done!
##     estimated error(s): 0.0004604003 0.2298985 
##     difference(s): 2.970667e-12 0.0002218279 
##     time: 2.519 seconds
## 
##   missForest iteration 10 in progress...done!
##     estimated error(s): 0.0004583402 0.2316926 
##     difference(s): 1.73601e-12 0.0002218279 
##     time: 2.911 seconds
imp_pr$OOBerror
##        NRMSE          PFC 
## 0.0004583402 0.2316926344
pr <-imp_pr$ximp
summary(pr)
##          cntry          finsat           sex           age      
##  Puerto Rico:1127   Min.   : 1.000   Male  :443   Min.   :18.0  
##                     1st Qu.: 5.000   Female:684   1st Qu.:34.0  
##                     Median : 8.000                Median :51.0  
##                     Mean   : 7.161                Mean   :49.8  
##                     3rd Qu.: 9.000                3rd Qu.:65.0  
##                     Max.   :10.000                Max.   :94.0  
##                                                                 
##   people_house                   marit_stat      income      
##  Min.   : 1.000   Partnered           :582   Min.   : 1.000  
##  1st Qu.: 2.000   Previously partnered:237   1st Qu.: 4.000  
##  Median : 3.000   Single              :308   Median : 5.000  
##  Mean   : 2.982                              Mean   : 5.136  
##  3rd Qu.: 4.000                              3rd Qu.: 7.000  
##  Max.   :12.000                              Max.   :10.000  
##                                                              
##             status_comp             parent_proud with_parent     GDPpc      
##  Better off       :555   Agree strongly   :808   No :909     Min.   :35948  
##  Worse off        :131   Agree            :263   Yes:218     1st Qu.:35948  
##  Or about the same:441   Disagree         : 43               Median :35948  
##                          Strongly disagree: 13               Mean   :35948  
##                                                              3rd Qu.:35948  
##                                                              Max.   :35948  
##                                                                             
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:   0   South Asia                  :   0  
##  Upper middle income:   0   North America               :   0  
##  High income        :1127   Middle East and North Africa:   0  
##                             Latin America and Caribbean :1127  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :   0
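
The same subset, recode, seed, and impute steps are repeated for every country. A minimal sketch of how they could be wrapped in one function is shown below. `impute_by_country` and its `impute` argument are hypothetical names introduced here, not part of the original analysis. The default imputer calls `missForest()` as above and is only evaluated when used, so the function can be tried with any other imputer.

```r
# Hypothetical helper: run the per-country imputation steps for several countries
impute_by_country <- function(data, countries,
                              impute = function(d) missForest::missForest(d)$ximp,
                              seed = 123) {
  out <- lapply(countries, function(ctry) {
    # Same rows as subset(data, cntry == ctry): rows with a missing country are dropped
    d <- data[!is.na(data$cntry) & data$cntry == ctry, , drop = FALSE]
    # Keep only the one country level that is still in use
    d$cntry <- factor(as.character(d$cntry))
    # Reset the seed before each country, as in the per-country code
    set.seed(seed)
    impute(d)
  })
  names(out) <- countries
  out
}

# Intended use with the data prepared above (not run here):
# imputed <- impute_by_country(wvs2, c("Puerto Rico", "Brazil", "Peru"))
# pr <- imputed[["Puerto Rico"]]
```

If the imputed data frames are stacked afterwards, note that a country whose GDPpc is entirely missing loses that column during imputation, so a row-binding function that fills absent columns with `NA`, such as `dplyr::bind_rows()`, would be needed rather than `rbind()`.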
wvs_br <- subset(wvs2, cntry == 'Brazil')
wvs_br$cntry <- ifelse(wvs_br$cntry=="Brazil", "Brazil", "0") %>% as.factor()

set.seed(123)
imp_br = missForest(wvs_br, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.001040227 0.1355264 
##     difference(s): 1.600914e-10 0.0026958 
##     time: 3.32 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.001039747 0.1351398 
##     difference(s): 1.351129e-11 0.001206016 
##     time: 3.179 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.001040098 0.1379352 
##     difference(s): 1.389505e-11 0.001206016 
##     time: 3.281 seconds
imp_br$OOBerror
##       NRMSE         PFC 
## 0.001039747 0.135139759
br <-imp_br$ximp
summary(br)
##     cntry          finsat           sex           age         people_house   
##  Brazil:1762   Min.   : 1.000   Male  :800   Min.   :17.00   Min.   : 1.000  
##                1st Qu.: 5.000   Female:962   1st Qu.:29.00   1st Qu.: 2.000  
##                Median : 6.000                Median :43.00   Median : 3.000  
##                Mean   : 6.075                Mean   :43.56   Mean   : 3.244  
##                3rd Qu.: 8.000                3rd Qu.:57.00   3rd Qu.: 4.000  
##                Max.   :10.000                Max.   :91.00   Max.   :29.000  
##                                                                              
##                 marit_stat      income                  status_comp  
##  Partnered           :896   Min.   : 1.000   Better off       :1215  
##  Previously partnered:339   1st Qu.: 2.000   Worse off        : 252  
##  Single              :527   Median : 4.000   Or about the same: 295  
##                             Mean   : 4.021                           
##                             3rd Qu.: 5.000                           
##                             Max.   :10.000                           
##                                                                      
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :809   No :1356    Min.   :15259   Low income         :   0  
##  Agree            :809   Yes: 406    1st Qu.:15259   Lower middle income:   0  
##  Disagree         :123               Median :15259   Upper middle income:1762  
##  Strongly disagree: 21               Mean   :15259   High income        :   0  
##                                      3rd Qu.:15259                             
##                                      Max.   :15259                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1762  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_per <- subset(wvs2, cntry == 'Peru')
wvs_per$cntry <- ifelse(wvs_per$cntry=="Peru", "Peru", "0") %>% as.factor()

set.seed(123)
imp_per = missForest(wvs_per, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0002267714 0.1404239 
##     difference(s): 5.13615e-11 0.002946429 
##     time: 1.866 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0002258453 0.141958 
##     difference(s): 3.672163e-12 0.0005357143 
##     time: 1.852 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0002247073 0.1416838 
##     difference(s): 2.057437e-12 0.000625 
##     time: 1.778 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.00022642 0.1440654 
##     difference(s): 1.908553e-12 0.0004464286 
##     time: 1.878 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.0002257124 0.138975 
##     difference(s): 2.578582e-12 0.0003571429 
##     time: 1.786 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.0002256639 0.1399696 
##     difference(s): 2.476688e-12 0.0003571429 
##     time: 1.953 seconds
## 
##   missForest iteration 7 in progress...done!
##     estimated error(s): 0.00022596 0.1409665 
##     difference(s): 1.960845e-12 0.0004464286 
##     time: 1.905 seconds
## 
##   missForest iteration 8 in progress...done!
##     estimated error(s): 0.0002262008 0.1389732 
##     difference(s): 3.645834e-12 0.0004464286 
##     time: 1.929 seconds
imp_per$OOBerror
##      NRMSE        PFC 
## 0.00022596 0.14096654
per <-imp_per$ximp
summary(per)
##   cntry          finsat          sex           age         people_house   
##  Peru:1400   Min.   : 1.00   Male  :702   Min.   :18.00   Min.   : 1.000  
##              1st Qu.: 5.00   Female:698   1st Qu.:27.00   1st Qu.: 4.000  
##              Median : 6.00                Median :38.00   Median : 5.000  
##              Mean   : 6.35                Mean   :40.16   Mean   : 4.781  
##              3rd Qu.: 8.00                3rd Qu.:51.00   3rd Qu.: 6.000  
##              Max.   :10.00                Max.   :86.00   Max.   :15.000  
##                                                                           
##                 marit_stat      income                  status_comp 
##  Partnered           :898   Min.   : 1.000   Better off       :824  
##  Previously partnered:131   1st Qu.: 4.000   Worse off        : 92  
##  Single              :371   Median : 5.000   Or about the same:484  
##                             Mean   : 4.951                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :688   No :892     Min.   :13380   Low income         :   0  
##  Agree            :673   Yes:508     1st Qu.:13380   Lower middle income:   0  
##  Disagree         : 38               Median :13380   Upper middle income:1400  
##  Strongly disagree:  1               Mean   :13380   High income        :   0  
##                                      3rd Qu.:13380                             
##                                      Max.   :13380                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1400  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_ch <- subset(wvs2, cntry == 'Chile')
wvs_ch$cntry <- ifelse(wvs_ch$cntry == "Chile", "Chile", "0") %>% as.factor()

set.seed(123)
imp_ch = missForest(wvs_ch, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0001084571 0.1666865 
##     difference(s): 2.307897e-11 0.014125 
##     time: 1.193 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0001071028 0.16375 
##     difference(s): 5.751611e-12 0.00325 
##     time: 1.185 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0001073493 0.1685869 
##     difference(s): 3.734321e-12 0.002375 
##     time: 1.43 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0001076745 0.1606491 
##     difference(s): 2.041616e-12 0.002 
##     time: 1.14 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.0001074148 0.1611458 
##     difference(s): 1.89915e-12 0.002125 
##     time: 1.127 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.0001073218 0.1632764 
##     difference(s): 2.451009e-12 0.002 
##     time: 1.217 seconds
## 
##   missForest iteration 7 in progress...done!
##     estimated error(s): 0.0001072016 0.1659902 
##     difference(s): 1.757565e-12 0.002 
##     time: 1.188 seconds
## 
##   missForest iteration 8 in progress...done!
##     estimated error(s): 0.0001071255 0.1596667 
##     difference(s): 1.773204e-12 0.00125 
##     time: 1.148 seconds
## 
##   missForest iteration 9 in progress...done!
##     estimated error(s): 0.0001069342 0.1657922 
##     difference(s): 2.552364e-12 0.0015 
##     time: 1.125 seconds
imp_ch$OOBerror
##        NRMSE          PFC 
## 0.0001071255 0.1596666931
ch <-imp_ch$ximp
summary(ch)
##    cntry          finsat           sex           age         people_house  
##  Chile:1000   Min.   : 1.000   Male  :474   Min.   :18.00   Min.   :1.000  
##               1st Qu.: 5.000   Female:526   1st Qu.:33.00   1st Qu.:2.000  
##               Median : 6.000                Median :44.00   Median :3.000  
##               Mean   : 6.148                Mean   :45.27   Mean   :3.266  
##               3rd Qu.: 8.000                3rd Qu.:56.25   3rd Qu.:4.000  
##               Max.   :10.000                Max.   :91.00   Max.   :9.000  
##                                                                            
##                 marit_stat      income                  status_comp 
##  Partnered           :624   Min.   : 1.000   Better off       :600  
##  Previously partnered:208   1st Qu.: 4.000   Worse off        :136  
##  Single              :168   Median : 5.000   Or about the same:264  
##                             Mean   : 4.673                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :389   No :781     Min.   :25155   Low income         :   0  
##  Agree            :415   Yes:219     1st Qu.:25155   Lower middle income:   0  
##  Disagree         :138               Median :25155   Upper middle income:   0  
##  Strongly disagree: 58               Mean   :25155   High income        :1000  
##                                      3rd Qu.:25155                             
##                                      Max.   :25155                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1000  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_arg <- subset(wvs2, cntry == 'Argentina')
wvs_arg$cntry <- ifelse(wvs_arg$cntry == 'Argentina', "Argentina", "0") %>% as.factor()

set.seed(123)
imp_arg = missForest(wvs_arg, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0001534338 0.1572026 
##     difference(s): 3.488262e-11 0.002866401 
##     time: 1.691 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0001519348 0.1554638 
##     difference(s): 2.876709e-12 0.0007477567 
##     time: 1.505 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0001525242 0.1559435 
##     difference(s): 4.052913e-12 0.001495513 
##     time: 1.674 seconds
imp_arg$OOBerror
##        NRMSE          PFC 
## 0.0001519348 0.1554637638
arg <-imp_arg$ximp
summary(arg)
##        cntry          finsat           sex           age       
##  Argentina:1003   Min.   : 1.000   Male  :485   Min.   :18.00  
##                   1st Qu.: 5.000   Female:518   1st Qu.:27.00  
##                   Median : 6.000                Median :40.00  
##                   Mean   : 5.987                Mean   :42.55  
##                   3rd Qu.: 8.000                3rd Qu.:57.00  
##                   Max.   :10.000                Max.   :93.00  
##                                                                
##   people_house                   marit_stat      income      
##  Min.   : 1.000   Partnered           :502   Min.   : 1.000  
##  1st Qu.: 2.000   Previously partnered:217   1st Qu.: 4.000  
##  Median : 4.000   Single              :284   Median : 5.000  
##  Mean   : 3.626                              Mean   : 5.085  
##  3rd Qu.: 5.000                              3rd Qu.: 6.000  
##  Max.   :13.000                              Max.   :10.000  
##                                                              
##             status_comp             parent_proud with_parent     GDPpc      
##  Better off       :243   Agree strongly   :324   No :758     Min.   :22947  
##  Worse off        :218   Agree            :509   Yes:245     1st Qu.:22947  
##  Or about the same:542   Disagree         :137               Median :22947  
##                          Strongly disagree: 33               Mean   :22947  
##                                                              3rd Qu.:22947  
##                                                              Max.   :22947  
##                                                                             
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:   0   South Asia                  :   0  
##  Upper middle income:1003   North America               :   0  
##  High income        :   0   Middle East and North Africa:   0  
##                             Latin America and Caribbean :1003  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :   0
wvs_urg <- subset(wvs2, cntry == 'Uruguay')
wvs_urg$cntry <- ifelse(wvs_urg$cntry == 'Uruguay', "Uruguay", "0") %>% as.factor()

set.seed(123)
imp_urg = missForest(wvs_urg, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0001480635 0.1818762 
##     difference(s): 6.577302e-11 0.012375 
##     time: 1.195 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0001457735 0.1799962 
##     difference(s): 5.001718e-12 0.004625 
##     time: 1.196 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0001464527 0.1835796 
##     difference(s): 5.157268e-12 0.005375 
##     time: 1.106 seconds
imp_urg$OOBerror
##        NRMSE          PFC 
## 0.0001457735 0.1799962246
urg <-imp_urg$ximp
summary(urg)
##      cntry          finsat           sex           age         people_house  
##  Uruguay:1000   Min.   : 1.000   Male  :316   Min.   :18.00   Min.   : 4.00  
##                 1st Qu.: 5.000   Female:684   1st Qu.:34.00   1st Qu.: 8.00  
##                 Median : 7.000                Median :50.00   Median :17.00  
##                 Mean   : 6.901                Mean   :49.83   Mean   :14.96  
##                 3rd Qu.: 8.071                3rd Qu.:65.00   3rd Qu.:23.00  
##                 Max.   :10.000                Max.   :90.00   Max.   :23.00  
##                                                                              
##                 marit_stat      income                  status_comp 
##  Partnered           :477   Min.   : 1.000   Better off       :422  
##  Previously partnered:291   1st Qu.: 4.000   Worse off        :125  
##  Single              :232   Median : 5.000   Or about the same:453  
##                             Mean   : 5.006                          
##                             3rd Qu.: 6.038                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :365   No :887     Min.   :22455   Low income         :   0  
##  Agree            :493   Yes:113     1st Qu.:22455   Lower middle income:   0  
##  Disagree         :115               Median :22455   Upper middle income:   0  
##  Strongly disagree: 27               Mean   :22455   High income        :1000  
##                                      3rd Qu.:22455                             
##                                      Max.   :22455                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1000  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_mx <- subset(wvs2, cntry == 'Mexico')
wvs_mx$cntry <- ifelse(wvs_mx$cntry == 'Mexico', "Mexico", "0") %>% as.factor()

set.seed(123)
imp_mx = missForest(wvs_mx, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0007860064 0.1993169 
##     difference(s): 1.949538e-11 0.0002871913 
##     time: 4.559 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0007849077 0.1995288 
##     difference(s): 6.794877e-12 7.179782e-05 
##     time: 4.677 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0007906234 0.1989591 
##     difference(s): 2.270198e-12 7.179782e-05 
##     time: 4.57 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.000791721 0.1973008 
##     difference(s): 3.923289e-12 7.179782e-05 
##     time: 4.506 seconds
imp_mx$OOBerror
##        NRMSE          PFC 
## 0.0007906234 0.1989590613
mx <-imp_mx$ximp
summary(mx)
##     cntry          finsat           sex           age         people_house   
##  Mexico:1741   Min.   : 1.000   Male  :874   Min.   :18.00   Min.   : 1.000  
##                1st Qu.: 5.000   Female:867   1st Qu.:29.00   1st Qu.: 3.000  
##                Median : 7.000                Median :42.00   Median : 4.000  
##                Mean   : 6.811                Mean   :43.36   Mean   : 4.474  
##                3rd Qu.: 9.000                3rd Qu.:56.00   3rd Qu.: 5.000  
##                Max.   :10.000                Max.   :90.00   Max.   :20.000  
##                                                                              
##                 marit_stat       income                  status_comp  
##  Partnered           :1191   Min.   : 1.000   Better off       :1003  
##  Previously partnered: 252   1st Qu.: 2.000   Worse off        : 253  
##  Single              : 298   Median : 4.000   Or about the same: 485  
##                              Mean   : 4.228                           
##                              3rd Qu.: 6.000                           
##                              Max.   :10.000                           
##                                                                       
##             parent_proud  with_parent     GDPpc      
##  Agree strongly   :1038   No :1277    Min.   :20411  
##  Agree            : 587   Yes: 464    1st Qu.:20411  
##  Disagree         :  83               Median :20411  
##  Strongly disagree:  33               Mean   :20411  
##                                       3rd Qu.:20411  
##                                       Max.   :20411  
##                                                      
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:   0   South Asia                  :   0  
##  Upper middle income:1741   North America               :   0  
##  High income        :   0   Middle East and North Africa:   0  
##                             Latin America and Caribbean :1741  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :   0
wvs_ven <- subset(wvs2, cntry == 'Venezuela')
wvs_ven$cntry <- ifelse(wvs_ven$cntry == 'Venezuela', "Venezuela", "0") %>% as.factor()

set.seed(123)
imp_ven = missForest(wvs_ven, verbose = TRUE)
##   removed variable(s) 11 due to the missingness of all entries
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0 0 
##     difference(s): 0 0 
##     time: 0.006 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0 0 
##     difference(s): 0 0 
##     time: 0.006 seconds
imp_ven$OOBerror
## NRMSE   PFC 
##     0     0
ven <-imp_ven$ximp
summary(ven)
##        cntry          finsat           sex           age       
##  Venezuela:1190   Min.   : 1.000   Male  :571   Min.   :18.00  
##                   1st Qu.: 3.000   Female:619   1st Qu.:25.00  
##                   Median : 5.000                Median :35.00  
##                   Mean   : 4.887                Mean   :38.31  
##                   3rd Qu.: 6.000                3rd Qu.:49.00  
##                   Max.   :10.000                Max.   :85.00  
##                                                                
##   people_house                   marit_stat      income      
##  Min.   : 1.000   Partnered           :631   Min.   : 1.000  
##  1st Qu.: 3.000   Previously partnered:162   1st Qu.: 3.000  
##  Median : 4.000   Single              :397   Median : 5.000  
##  Mean   : 4.546                              Mean   : 4.479  
##  3rd Qu.: 6.000                              3rd Qu.: 6.000  
##  Max.   :15.000                              Max.   :10.000  
##                                                              
##             status_comp             parent_proud with_parent
##  Better off       :147   Agree strongly   :666   No :731    
##  Worse off        :511   Agree            :438   Yes:459    
##  Or about the same:532   Disagree         : 72              
##                          Strongly disagree: 14              
##                                                             
##                                                             
##                                                             
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:   0   South Asia                  :   0  
##  Upper middle income:1190   North America               :   0  
##  High income        :   0   Middle East and North Africa:   0  
##                             Latin America and Caribbean :1190  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :   0
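The Venezuela log reports that variable 11 was removed because all of its entries are missing, and the summary above has no GDPpc column, so GDPpc appears to be the variable that was dropped. A minimal sketch of checking for such columns before imputing, using a toy data frame; `all_na_cols` and `toy` are hypothetical names, not part of the original analysis:

```r
# Hypothetical helper: names of columns in which every entry is NA.
all_na_cols <- function(df) {
  names(df)[colSums(!is.na(df)) == 0]
}

# Toy stand-in for a single-country subset (values are made up).
toy <- data.frame(
  finsat = c(5, NA, 7),
  income = c(3, 4, NA),
  GDPpc  = c(NA_real_, NA_real_, NA_real_)
)

dropped <- all_na_cols(toy)
print(dropped)
```

Running a check like this on each country subset before calling missForest would flag frames that will come back with fewer columns, which matters if the imputed frames are later stacked into one data set.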
# Guatemala: subset, reduce cntry to one level, then impute with a fixed seed
wvs_gut <- subset(wvs2, cntry == 'Guatemala')
wvs_gut$cntry <- ifelse(wvs_gut$cntry == 'Guatemala', "Guatemala", "0") %>% as.factor()

set.seed(123)
imp_gut <- missForest(wvs_gut, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0003325686 0.1139921 
##     difference(s): 2.300062e-10 0.001423922 
##     time: 1.41 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.000333925 0.1171769 
##     difference(s): 1.646983e-11 0.0003051261 
##     time: 1.455 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.000333946 0.1162608 
##     difference(s): 2.038269e-11 0 
##     time: 1.425 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0003339189 0.1204789 
##     difference(s): 1.777699e-11 0.0001017087 
##     time: 1.514 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.000333264 0.115533 
##     difference(s): 2.178245e-11 0.0003051261 
##     time: 1.415 seconds
imp_gut$OOBerror
##        NRMSE          PFC 
## 0.0003339189 0.1204788827
gut <- imp_gut$ximp
summary(gut)
##        cntry          finsat           sex           age        people_house   
##  Guatemala:1229   Min.   : 1.000   Male  :578   Min.   :18.0   Min.   : 1.000  
##                   1st Qu.: 5.000   Female:651   1st Qu.:21.0   1st Qu.: 3.000  
##                   Median : 7.000                Median :30.0   Median : 4.060  
##                   Mean   : 6.965                Mean   :33.5   Mean   : 4.595  
##                   3rd Qu.: 9.000                3rd Qu.:42.0   3rd Qu.: 5.000  
##                   Max.   :10.000                Max.   :89.0   Max.   :10.000  
##                                                                                
##                 marit_stat      income                  status_comp 
##  Partnered           :485   Min.   : 1.000   Better off       :760  
##  Previously partnered:134   1st Qu.: 5.000   Worse off        :132  
##  Single              :610   Median : 6.000   Or about the same:337  
##                             Mean   : 5.965                          
##                             3rd Qu.: 7.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                  country_income
##  Agree strongly   :791   No :567     Min.   :8996   Low income         :   0  
##  Agree            :332   Yes:662     1st Qu.:8996   Lower middle income:   0  
##  Disagree         : 92               Median :8996   Upper middle income:1229  
##  Strongly disagree: 14               Mean   :8996   High income        :   0  
##                                      3rd Qu.:8996                             
##                                      Max.   :8996                             
##                                                                               
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1229  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
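The subset-and-impute steps are repeated verbatim for every country. A minimal sketch of how the per-country split could be done in one pass with base R; the toy data and the names `toy`, `by_country`, and `impute_fun` are assumptions for illustration, and `impute_fun` is only a placeholder marking where a call such as `missForest(d)$ximp` would go:

```r
# Toy stand-in for wvs2 (values are made up).
toy <- data.frame(
  cntry  = factor(c("Guatemala", "Guatemala", "Nicaragua", "Nicaragua")),
  finsat = c(7, NA, 6, 8)
)

# Placeholder imputer: fills numeric NAs with the column mean.
# In the real analysis this is where missForest(d)$ximp would be called.
impute_fun <- function(d) {
  num <- vapply(d, is.numeric, logical(1))
  d[num] <- lapply(d[num], function(x) {
    x[is.na(x)] <- mean(x, na.rm = TRUE)
    x
  })
  d
}

# Split by country, drop unused factor levels in each piece, then impute.
by_country <- lapply(split(toy, toy$cntry), function(d) {
  set.seed(123)  # same seed for every country, as in the original code
  impute_fun(droplevels(d))
})

print(levels(by_country$Guatemala$cntry))
print(by_country$Guatemala$finsat)
```

Here `droplevels()` plays the role of the `ifelse(...) %>% as.factor()` line, leaving each piece with a single-level `cntry` factor.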
# Nicaragua: subset, reduce cntry to one level, then impute with a fixed seed
wvs_nic <- subset(wvs2, cntry == 'Nicaragua')
wvs_nic$cntry <- ifelse(wvs_nic$cntry == 'Nicaragua', "Nicaragua", "0") %>% as.factor()

set.seed(123)
imp_nic <- missForest(wvs_nic, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0004795084 0 
##     difference(s): 6.318253e-11 0 
##     time: 0.491 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0004777818 0 
##     difference(s): 8.324922e-12 0 
##     time: 0.474 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.000482197 0 
##     difference(s): 1.619466e-12 0 
##     time: 0.487 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0004794779 0 
##     difference(s): 1.382154e-12 0 
##     time: 0.497 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.0004802284 0 
##     difference(s): 1.318186e-12 0 
##     time: 0.559 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.0004773569 0 
##     difference(s): 1.076047e-12 0 
##     time: 0.482 seconds
## 
##   missForest iteration 7 in progress...done!
##     estimated error(s): 0.0004805845 0 
##     difference(s): 6.838752e-13 0 
##     time: 0.513 seconds
## 
##   missForest iteration 8 in progress...done!
##     estimated error(s): 0.0004827909 0 
##     difference(s): 1.967888e-12 0 
##     time: 0.488 seconds
imp_nic$OOBerror
##        NRMSE          PFC 
## 0.0004805845 0.0000000000
nic <- imp_nic$ximp
summary(nic)
##        cntry          finsat           sex           age       
##  Nicaragua:1200   Min.   : 1.000   Male  :589   Min.   :16.00  
##                   1st Qu.: 5.000   Female:611   1st Qu.:23.75  
##                   Median : 7.000                Median :33.00  
##                   Mean   : 6.716                Mean   :35.13  
##                   3rd Qu.: 9.000                3rd Qu.:44.00  
##                   Max.   :10.000                Max.   :81.00  
##                                                                
##   people_house                   marit_stat      income      
##  Min.   : 1.000   Partnered           :688   Min.   : 1.000  
##  1st Qu.: 4.000   Previously partnered:122   1st Qu.: 3.000  
##  Median : 5.000   Single              :390   Median : 5.000  
##  Mean   : 5.334                              Mean   : 4.579  
##  3rd Qu.: 6.000                              3rd Qu.: 6.000  
##  Max.   :17.000                              Max.   :10.000  
##                                                              
##             status_comp             parent_proud with_parent     GDPpc     
##  Better off       :517   Agree strongly   :522   No :675     Min.   :5631  
##  Worse off        :151   Agree            :658   Yes:525     1st Qu.:5631  
##  Or about the same:532   Disagree         : 16               Median :5631  
##                          Strongly disagree:  4               Mean   :5631  
##                                                              3rd Qu.:5631  
##                                                              Max.   :5631  
##                                                                            
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:1200   South Asia                  :   0  
##  Upper middle income:   0   North America               :   0  
##  High income        :   0   Middle East and North Africa:   0  
##                             Latin America and Caribbean :1200  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :   0
# Colombia: subset, reduce cntry to one level, then impute with a fixed seed
wvs_col <- subset(wvs2, cntry == 'Colombia')
wvs_col$cntry <- ifelse(wvs_col$cntry == 'Colombia', "Colombia", "0") %>% as.factor()

set.seed(123)
imp_col <- missForest(wvs_col, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0 0 
##     difference(s): 0 0 
##     time: 0.005 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0 0 
##     difference(s): 0 0 
##     time: 0.005 seconds
imp_col$OOBerror
## NRMSE   PFC 
##     0     0
col <- imp_col$ximp
summary(col)
##       cntry          finsat           sex           age         people_house   
##  Colombia:1520   Min.   : 1.000   Male  :760   Min.   :18.00   Min.   : 1.000  
##                  1st Qu.: 5.000   Female:760   1st Qu.:25.00   1st Qu.: 3.000  
##                  Median : 7.000                Median :36.00   Median : 4.000  
##                  Mean   : 6.615                Mean   :38.85   Mean   : 4.131  
##                  3rd Qu.: 9.000                3rd Qu.:52.00   3rd Qu.: 5.000  
##                  Max.   :10.000                Max.   :89.00   Max.   :18.000  
##                                                                                
##                 marit_stat      income                  status_comp 
##  Partnered           :790   Min.   : 1.000   Better off       :566  
##  Previously partnered:207   1st Qu.: 2.000   Worse off        :177  
##  Single              :523   Median : 5.000   Or about the same:777  
##                             Mean   : 4.443                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :732   No :988     Min.   :15644   Low income         :   0  
##  Agree            :736   Yes:532     1st Qu.:15644   Lower middle income:   0  
##  Disagree         : 46               Median :15644   Upper middle income:1520  
##  Strongly disagree:  6               Mean   :15644   High income        :   0  
##                                      3rd Qu.:15644                             
##                                      Max.   :15644                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1520  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
# Ecuador: subset, reduce cntry to one level, then impute with a fixed seed
wvs_ecu <- subset(wvs2, cntry == 'Ecuador')
wvs_ecu$cntry <- ifelse(wvs_ecu$cntry == 'Ecuador', "Ecuador", "0") %>% as.factor()

set.seed(123)
imp_ecu <- missForest(wvs_ecu, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0003122292 0.1253165 
##     difference(s): 3.382572e-11 0.002604167 
##     time: 1.452 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0003095908 0.1276035 
##     difference(s): 2.15582e-12 0.0008333333 
##     time: 1.424 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0003109061 0.1302942 
##     difference(s): 3.210066e-12 0.001145833 
##     time: 1.334 seconds
imp_ecu$OOBerror
##        NRMSE          PFC 
## 0.0003095908 0.1276034866
ecu <- imp_ecu$ximp
summary(ecu)
##      cntry          finsat          sex           age         people_house   
##  Ecuador:1200   Min.   : 1.00   Male  :573   Min.   :18.00   Min.   : 1.000  
##                 1st Qu.: 5.00   Female:627   1st Qu.:26.00   1st Qu.: 3.000  
##                 Median : 7.00                Median :37.00   Median : 4.000  
##                 Mean   : 6.34                Mean   :39.49   Mean   : 4.487  
##                 3rd Qu.: 8.00                3rd Qu.:52.00   3rd Qu.: 6.000  
##                 Max.   :10.00                Max.   :80.00   Max.   :20.000  
##                                                                              
##                 marit_stat      income                  status_comp 
##  Partnered           :663   Min.   : 1.000   Better off       :303  
##  Previously partnered:172   1st Qu.: 3.000   Worse off        :118  
##  Single              :365   Median : 5.000   Or about the same:779  
##                             Mean   : 4.758                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :857   No :779     Min.   :11847   Low income         :   0  
##  Agree            :309   Yes:421     1st Qu.:11847   Lower middle income:   0  
##  Disagree         : 26               Median :11847   Upper middle income:1200  
##  Strongly disagree:  8               Mean   :11847   High income        :   0  
##                                      3rd Qu.:11847                             
##                                      Max.   :11847                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :1200  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
# Bolivia: subset, reduce cntry to one level, then impute with a fixed seed
wvs_bol <- subset(wvs2, cntry == 'Bolivia')
wvs_bol$cntry <- ifelse(wvs_bol$cntry == 'Bolivia', "Bolivia", "0") %>% as.factor()

set.seed(123)
imp_bol <- missForest(wvs_bol, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0004648883 0.2038974 
##     difference(s): 2.898543e-10 0.002539913 
##     time: 4.941 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.000464985 0.1953894 
##     difference(s): 1.370608e-11 0.0006652153 
##     time: 4.796 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0004643176 0.1992363 
##     difference(s): 9.10762e-12 0.0006047412 
##     time: 5.626 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0004637441 0.2035822 
##     difference(s): 1.514099e-11 0.0007256894 
##     time: 4.895 seconds
imp_bol$OOBerror
##        NRMSE          PFC 
## 0.0004643176 0.1992362975
bol <- imp_bol$ximp
summary(bol)
##      cntry          finsat           sex            age         people_house 
##  Bolivia:2067   Min.   : 1.000   Male  :1024   Min.   :18.00   Min.   : 1.0  
##                 1st Qu.: 5.000   Female:1043   1st Qu.:25.00   1st Qu.: 4.0  
##                 Median : 7.000                 Median :35.00   Median : 5.0  
##                 Mean   : 6.453                 Mean   :38.33   Mean   : 5.2  
##                 3rd Qu.: 8.000                 3rd Qu.:50.00   3rd Qu.: 6.0  
##                 Max.   :10.000                 Max.   :85.00   Max.   :30.0  
##                                                                              
##                 marit_stat       income                  status_comp  
##  Partnered           :1185   Min.   : 1.000   Better off       : 805  
##  Previously partnered: 253   1st Qu.: 4.000   Worse off        : 186  
##  Single              : 629   Median : 5.000   Or about the same:1076  
##                              Mean   : 5.003                           
##                              3rd Qu.: 6.000                           
##                              Max.   :10.000                           
##                                                                       
##             parent_proud  with_parent     GDPpc                  country_income
##  Agree strongly   :1154   No :1252    Min.   :9086   Low income         :   0  
##  Agree            : 825   Yes: 815    1st Qu.:9086   Lower middle income:2067  
##  Disagree         :  74               Median :9086   Upper middle income:   0  
##  Strongly disagree:  14               Mean   :9086   High income        :   0  
##                                       3rd Qu.:9086                             
##                                       Max.   :9086                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :2067  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
# Macau SAR: subset, reduce cntry to one level, then impute with a fixed seed
wvs_mac <- subset(wvs2, cntry == 'Macau SAR')
wvs_mac$cntry <- ifelse(wvs_mac$cntry == 'Macau SAR', "Macau SAR", "0") %>% as.factor()

set.seed(123)
imp_mac <- missForest(wvs_mac, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 9.535232e-05 0.1701235 
##     difference(s): 1.410182e-09 0 
##     time: 1.838 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 9.503666e-05 0.1641233 
##     difference(s): 2.261257e-11 0 
##     time: 1.741 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 9.439277e-05 0.1635139 
##     difference(s): 1.837271e-11 0 
##     time: 1.797 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 9.465273e-05 0.1600876 
##     difference(s): 1.886795e-11 0 
##     time: 1.732 seconds
imp_mac$OOBerror
##        NRMSE          PFC 
## 9.439277e-05 1.635139e-01
mac <- imp_mac$ximp
summary(mac)
##        cntry          finsat           sex           age       
##  Macau SAR:1023   Min.   : 1.000   Male  :450   Min.   :18.00  
##                   1st Qu.: 5.000   Female:573   1st Qu.:26.00  
##                   Median : 6.000                Median :39.00  
##                   Mean   : 6.202                Mean   :40.82  
##                   3rd Qu.: 7.000                3rd Qu.:53.00  
##                   Max.   :10.000                Max.   :82.00  
##                                                                
##   people_house                   marit_stat      income      
##  Min.   : 1.000   Partnered           :570   Min.   : 1.000  
##  1st Qu.: 3.000   Previously partnered: 50   1st Qu.: 4.000  
##  Median : 3.912   Single              :403   Median : 5.000  
##  Mean   : 3.685                              Mean   : 5.023  
##  3rd Qu.: 4.000                              3rd Qu.: 6.000  
##  Max.   :20.000                              Max.   :10.000  
##                                                              
##             status_comp             parent_proud with_parent     GDPpc       
##  Better off       :839   Agree strongly   :114   No :625     Min.   :129103  
##  Worse off        : 60   Agree            :541   Yes:398     1st Qu.:129103  
##  Or about the same:124   Disagree         :333               Median :129103  
##                          Strongly disagree: 35               Mean   :129103  
##                                                              3rd Qu.:129103  
##                                                              Max.   :129103  
##                                                                              
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:   0   South Asia                  :   0  
##  Upper middle income:   0   North America               :   0  
##  High income        :1023   Middle East and North Africa:   0  
##                             Latin America and Caribbean :   0  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :1023
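The imputed people_house for Macau SAR has a median of 3.912, which is not a whole number. missForest fills numeric gaps with random-forest predictions, which are averages over trees, so imputed counts can come out fractional. If whole-number household sizes are wanted, one option is to round after imputation; a minimal sketch on made-up values, with `people_house_imp` and `people_house_int` as hypothetical names:

```r
# Made-up household sizes after imputation: observed values are whole,
# imputed ones (3.912, 4.06) are not.
people_house_imp <- c(3, 3.912, 4, 4.06, 1)

# Round to the nearest whole person and keep at least one household member.
people_house_int <- pmax(1, round(people_house_imp))
print(people_house_int)
```

Whether to round at all is a modelling choice; leaving the fractional values in place is also defensible when the variable is treated as continuous.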
# Nigeria: subset, reduce cntry to one level, then impute with a fixed seed
wvs_nig <- subset(wvs2, cntry == 'Nigeria')
wvs_nig$cntry <- ifelse(wvs_nig$cntry == 'Nigeria', "Nigeria", "0") %>% as.factor()

set.seed(123)
imp_nig <- missForest(wvs_nig, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0006894611 0.1521317 
##     difference(s): 9.630554e-11 0.002425222 
##     time: 1.624 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0006879526 0.1529117 
##     difference(s): 1.463198e-11 0.0006063056 
##     time: 1.579 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0006886714 0.1538614 
##     difference(s): 1.030557e-11 0.0007073565 
##     time: 1.61 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0006897513 0.1528423 
##     difference(s): 8.676889e-12 0.0007073565 
##     time: 1.603 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.0006896226 0.1541757 
##     difference(s): 8.288235e-12 0.0005052546 
##     time: 1.572 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.0006920485 0.1536433 
##     difference(s): 1.285152e-11 0.0007073565 
##     time: 1.625 seconds
imp_nig$OOBerror
##        NRMSE          PFC 
## 0.0006896226 0.1541757420
nig <- imp_nig$ximp
summary(nig)
##      cntry          finsat           sex           age          people_house   
##  Nigeria:1237   Min.   : 1.000   Male  :633   Min.   : 18.00   Min.   : 1.000  
##                 1st Qu.: 3.000   Female:604   1st Qu.: 24.00   1st Qu.: 4.000  
##                 Median : 5.000                Median : 30.00   Median : 5.000  
##                 Mean   : 4.802                Mean   : 32.56   Mean   : 6.284  
##                 3rd Qu.: 7.000                3rd Qu.: 38.00   3rd Qu.: 8.000  
##                 Max.   :10.000                Max.   :100.00   Max.   :63.000  
##                                                                                
##                 marit_stat      income                  status_comp 
##  Partnered           :695   Min.   : 1.000   Better off       :631  
##  Previously partnered: 50   1st Qu.: 3.000   Worse off        :500  
##  Single              :492   Median : 4.000   Or about the same:106  
##                             Mean   : 4.411                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                  country_income
##  Agree strongly   :969   No :796     Min.   :5348   Low income         :   0  
##  Agree            :248   Yes:441     1st Qu.:5348   Lower middle income:1237  
##  Disagree         : 17               Median :5348   Upper middle income:   0  
##  Strongly disagree:  3               Mean   :5348   High income        :   0  
##                                      3rd Qu.:5348                             
##                                      Max.   :5348                             
##                                                                               
##                           region    
##  Sub-Saharan Africa          :1237  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :   0  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
# Kenya: subset, reduce cntry to one level, then impute with a fixed seed
wvs_ken <- subset(wvs2, cntry == 'Kenya')
wvs_ken$cntry <- ifelse(wvs_ken$cntry == 'Kenya', "Kenya", "0") %>% as.factor()

set.seed(123)
imp_ken <- missForest(wvs_ken, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.002251408 0.232495 
##     difference(s): 2.789829e-09 0.002665877 
##     time: 2.282 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.00222875 0.234547 
##     difference(s): 1.268227e-10 0.0007898894 
##     time: 2.317 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.002247218 0.2355213 
##     difference(s): 3.853873e-10 0.0008886256 
##     time: 2.267 seconds
imp_ken$OOBerror
##      NRMSE        PFC 
## 0.00222875 0.23454696
ken <- imp_ken$ximp
summary(ken)
##    cntry          finsat           sex           age         people_house   
##  Kenya:1266   Min.   : 1.000   Male  :643   Min.   :18.00   Min.   : 1.000  
##               1st Qu.: 3.000   Female:623   1st Qu.:24.00   1st Qu.: 2.000  
##               Median : 5.000                Median :28.00   Median : 3.000  
##               Mean   : 4.878                Mean   :30.73   Mean   : 3.833  
##               3rd Qu.: 7.000                3rd Qu.:36.00   3rd Qu.: 5.000  
##               Max.   :10.000                Max.   :84.00   Max.   :30.000  
##                                                                             
##                 marit_stat      income                  status_comp 
##  Partnered           :611   Min.   : 1.000   Better off       :763  
##  Previously partnered:101   1st Qu.: 3.000   Worse off        :310  
##  Single              :554   Median : 5.000   Or about the same:193  
##                             Mean   : 4.602                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                  country_income
##  Agree strongly   :848   No :922     Min.   :4509   Low income         :   0  
##  Agree            :353   Yes:344     1st Qu.:4509   Lower middle income:1266  
##  Disagree         : 55               Median :4509   Upper middle income:   0  
##  Strongly disagree: 10               Mean   :4509   High income        :   0  
##                                      3rd Qu.:4509                             
##                                      Max.   :4509                             
##                                                                               
##                           region    
##  Sub-Saharan Africa          :1266  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :   0  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
# Ethiopia: subset, reduce cntry to one level, then impute with a fixed seed
wvs_eth <- subset(wvs2, cntry == 'Ethiopia')
wvs_eth$cntry <- ifelse(wvs_eth$cntry == 'Ethiopia', "Ethiopia", "0") %>% as.factor()

set.seed(123)
imp_eth <- missForest(wvs_eth, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.001684979 0.1326647 
##     difference(s): 7.542113e-10 0.00101626 
##     time: 1.625 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.001677411 0.1321494 
##     difference(s): 1.831599e-10 0.000101626 
##     time: 1.719 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.001677687 0.1322462 
##     difference(s): 8.722801e-11 0.0004065041 
##     time: 2.042 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.001684294 0.1408357 
##     difference(s): 6.047123e-11 0.0004065041 
##     time: 1.781 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.00168345 0.1314646 
##     difference(s): 2.831966e-11 0.0004065041 
##     time: 1.441 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.001685903 0.1343298 
##     difference(s): 2.020996e-10 0.000203252 
##     time: 1.712 seconds
## 
##   missForest iteration 7 in progress...done!
##     estimated error(s): 0.001689213 0.1293018 
##     difference(s): 7.701341e-11 0.000203252 
##     time: 1.861 seconds
## 
##   missForest iteration 8 in progress...done!
##     estimated error(s): 0.001684337 0.1343261 
##     difference(s): 1.470513e-11 0.000203252 
##     time: 1.607 seconds
## 
##   missForest iteration 9 in progress...done!
##     estimated error(s): 0.001684053 0.1305308 
##     difference(s): 1.392172e-11 0 
##     time: 1.618 seconds
## 
##   missForest iteration 10 in progress...done!
##     estimated error(s): 0.001680672 0.129068 
##     difference(s): 6.079987e-11 0.000203252 
##     time: 1.57 seconds
imp_eth$OOBerror
##       NRMSE         PFC 
## 0.001680672 0.129067969
eth <- imp_eth$ximp
summary(eth)
##       cntry          finsat           sex           age         people_house   
##  Ethiopia:1230   Min.   : 1.000   Male  :622   Min.   :18.00   Min.   : 1.000  
##                  1st Qu.: 3.000   Female:608   1st Qu.:23.00   1st Qu.: 3.000  
##                  Median : 5.000                Median :29.00   Median : 5.000  
##                  Mean   : 5.216                Mean   :31.93   Mean   : 5.055  
##                  3rd Qu.: 7.000                3rd Qu.:38.00   3rd Qu.: 6.000  
##                  Max.   :10.000                Max.   :79.00   Max.   :18.000  
##                                                                                
##                 marit_stat      income                 status_comp 
##  Partnered           :821   Min.   : 1.00   Better off       :779  
##  Previously partnered: 89   1st Qu.: 3.00   Worse off        :257  
##  Single              :320   Median : 5.00   Or about the same:194  
##                             Mean   : 4.38                          
##                             3rd Qu.: 6.00                          
##                             Max.   :10.00                          
##                                                                    
##             parent_proud with_parent     GDPpc                  country_income
##  Agree strongly   :996   No :898     Min.   :2312   Low income         :1230  
##  Agree            :214   Yes:332     1st Qu.:2312   Lower middle income:   0  
##  Disagree         : 15               Median :2312   Upper middle income:   0  
##  Strongly disagree:  5               Mean   :2312   High income        :   0  
##                                      3rd Qu.:2312                             
##                                      Max.   :2312                             
##                                                                               
##                           region    
##  Sub-Saharan Africa          :1230  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :   0  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_zim <- subset(wvs2, cntry == 'Zimbabwe')
wvs_zim$cntry <- ifelse(wvs_zim$cntry == 'Zimbabwe', "Zimbabwe", "0") %>% as.factor()

set.seed(123)
imp_zim = missForest(wvs_zim, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.005208871 0.130114 
##     difference(s): 3.188649e-08 0.0005144033 
##     time: 2.092 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.005216296 0.1315656 
##     difference(s): 5.880808e-10 0 
##     time: 2.011 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.00519394 0.1356921 
##     difference(s): 1.198285e-09 0 
##     time: 2.007 seconds
imp_zim$OOBerror
##       NRMSE         PFC 
## 0.005216296 0.131565580
zim <-imp_zim$ximp
summary(zim)
##       cntry          finsat           sex           age         people_house   
##  Zimbabwe:1215   Min.   : 1.000   Male  :600   Min.   :18.00   Min.   : 1.000  
##                  1st Qu.: 1.000   Female:615   1st Qu.:25.00   1st Qu.: 3.500  
##                  Median : 3.000                Median :36.00   Median : 5.000  
##                  Mean   : 3.644                Mean   :39.16   Mean   : 5.185  
##                  3rd Qu.: 5.000                3rd Qu.:50.00   3rd Qu.: 6.000  
##                  Max.   :10.000                Max.   :99.00   Max.   :25.000  
##                                                                                
##                 marit_stat      income                  status_comp 
##  Partnered           :760   Min.   : 1.000   Better off       :372  
##  Previously partnered:188   1st Qu.: 1.000   Worse off        :663  
##  Single              :267   Median : 3.000   Or about the same:180  
##                             Mean   : 3.463                          
##                             3rd Qu.: 5.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                  country_income
##  Agree strongly   :937   No :920     Min.   :2953   Low income         :   0  
##  Agree            :246   Yes:295     1st Qu.:2953   Lower middle income:1215  
##  Disagree         : 21               Median :2953   Upper middle income:   0  
##  Strongly disagree: 11               Mean   :2953   High income        :   0  
##                                      3rd Qu.:2953                             
##                                      Max.   :2953                             
##                                                                               
##                           region    
##  Sub-Saharan Africa          :1215  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:   0  
##  Latin America and Caribbean :   0  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_lib <- subset(wvs2, cntry == 'Libya')
wvs_lib$cntry <- ifelse(wvs_lib$cntry == 'Libya', "Libya", "0") %>% as.factor()

set.seed(123)
imp_lib = missForest(wvs_lib, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.0007405244 0.1456742 
##     difference(s): 2.683191e-09 0.004807692 
##     time: 1.807 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.0007291783 0.1412055 
##     difference(s): 6.724643e-11 0.0009406355 
##     time: 1.911 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.0007285703 0.1441481 
##     difference(s): 3.895165e-11 0.0007316054 
##     time: 1.904 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.0007308028 0.1421588 
##     difference(s): 6.901109e-11 0.001149666 
##     time: 1.737 seconds
imp_lib$OOBerror
##        NRMSE          PFC 
## 0.0007285703 0.1441480950
lib <-imp_lib$ximp
summary(lib)
##    cntry          finsat           sex           age         people_house   
##  Libya:1196   Min.   : 1.000   Male  :619   Min.   :18.00   Min.   : 1.000  
##               1st Qu.: 5.000   Female:577   1st Qu.:30.00   1st Qu.: 4.000  
##               Median : 6.000                Median :41.00   Median : 6.000  
##               Mean   : 6.499                Mean   :40.22   Mean   : 5.983  
##               3rd Qu.: 8.000                3rd Qu.:49.00   3rd Qu.: 7.000  
##               Max.   :10.000                Max.   :85.00   Max.   :41.000  
##                                                                             
##                 marit_stat      income                  status_comp 
##  Partnered           :749   Min.   : 1.000   Better off       :606  
##  Previously partnered: 60   1st Qu.: 4.000   Worse off        :203  
##  Single              :387   Median : 5.000   Or about the same:387  
##                             Mean   : 5.424                          
##                             3rd Qu.: 6.127                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud  with_parent     GDPpc      
##  Agree strongly   :1110   No :544     Min.   :15803  
##  Agree            :  81   Yes:652     1st Qu.:15803  
##  Disagree         :   2               Median :15803  
##  Strongly disagree:   3               Mean   :15803  
##                                       3rd Qu.:15803  
##                                       Max.   :15803  
##                                                      
##              country_income                          region    
##  Low income         :   0   Sub-Saharan Africa          :   0  
##  Lower middle income:   0   South Asia                  :   0  
##  Upper middle income:1196   North America               :   0  
##  High income        :   0   Middle East and North Africa:1196  
##                             Latin America and Caribbean :   0  
##                             Europe and Central Asia     :   0  
##                             East Asia and Pacific       :   0
wvs_tun <- subset(wvs2, cntry == 'Tunisia') 
wvs_tun$cntry <- ifelse(wvs_tun$cntry == 'Tunisia', "Tunisia", "0") %>% as.factor()

set.seed(123)
imp_tun = missForest(wvs_tun, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0.001273197 0.2130999 
##     difference(s): 2.594939e-10 0.003621689 
##     time: 2.707 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0.001235917 0.2126609 
##     difference(s): 8.175183e-10 0.0004139073 
##     time: 2.628 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0.001245533 0.2131933 
##     difference(s): 2.68914e-10 0.0003104305 
##     time: 2.648 seconds
## 
##   missForest iteration 4 in progress...done!
##     estimated error(s): 0.001243975 0.2116334 
##     difference(s): 1.362753e-10 0.0001034768 
##     time: 2.652 seconds
## 
##   missForest iteration 5 in progress...done!
##     estimated error(s): 0.001242606 0.2131704 
##     difference(s): 4.686635e-12 0.0003104305 
##     time: 2.699 seconds
## 
##   missForest iteration 6 in progress...done!
##     estimated error(s): 0.00124193 0.2127527 
##     difference(s): 8.487324e-12 0.0004139073 
##     time: 2.675 seconds
imp_tun$OOBerror
##       NRMSE         PFC 
## 0.001242606 0.213170370
tun <-imp_tun$ximp
summary(tun)
##      cntry          finsat           sex           age         people_house   
##  Tunisia:1208   Min.   : 1.000   Male  :558   Min.   :18.00   Min.   : 1.000  
##                 1st Qu.: 3.000   Female:650   1st Qu.:30.00   1st Qu.: 2.000  
##                 Median : 5.000                Median :42.00   Median : 4.000  
##                 Mean   : 4.591                Mean   :43.19   Mean   : 3.658  
##                 3rd Qu.: 6.000                3rd Qu.:54.00   3rd Qu.: 5.000  
##                 Max.   :10.000                Max.   :94.00   Max.   :10.000  
##                                                                               
##                 marit_stat      income                  status_comp 
##  Partnered           :728   Min.   : 1.000   Better off       :652  
##  Previously partnered:140   1st Qu.: 3.000   Worse off        :375  
##  Single              :340   Median : 5.000   Or about the same:181  
##                             Mean   : 4.707                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                   country_income
##  Agree strongly   :936   No :776     Min.   :11201   Low income         :   0  
##  Agree            :248   Yes:432     1st Qu.:11201   Lower middle income:1208  
##  Disagree         : 21               Median :11201   Upper middle income:   0  
##  Strongly disagree:  3               Mean   :11201   High income        :   0  
##                                      3rd Qu.:11201                             
##                                      Max.   :11201                             
##                                                                                
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:1208  
##  Latin America and Caribbean :   0  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
wvs_mor <- subset(wvs2, cntry == 'Morocco')
wvs_mor$cntry <- ifelse(wvs_mor$cntry == 'Morocco', "Morocco", "0") %>% as.factor()

set.seed(123)
imp_mor = missForest(wvs_mor, verbose = TRUE)
##   missForest iteration 1 in progress...done!
##     estimated error(s): 0 0.02368531 
##     difference(s): 0 0.0001041667 
##     time: 0.144 seconds
## 
##   missForest iteration 2 in progress...done!
##     estimated error(s): 0 0.02389399 
##     difference(s): 0 0 
##     time: 0.135 seconds
## 
##   missForest iteration 3 in progress...done!
##     estimated error(s): 0 0.02389399 
##     difference(s): 0 0 
##     time: 0.172 seconds
imp_mor$OOBerror
##      NRMSE        PFC 
## 0.00000000 0.02389399
mor <-imp_mor$ximp
summary(mor)
##      cntry          finsat           sex           age         people_house   
##  Morocco:1200   Min.   : 1.000   Male  :600   Min.   :18.00   Min.   : 1.000  
##                 1st Qu.: 5.000   Female:600   1st Qu.:26.00   1st Qu.: 4.000  
##                 Median : 6.000                Median :34.50   Median : 5.000  
##                 Mean   : 6.238                Mean   :37.22   Mean   : 4.903  
##                 3rd Qu.: 8.000                3rd Qu.:46.00   3rd Qu.: 6.000  
##                 Max.   :10.000                Max.   :82.00   Max.   :11.000  
##                                                                               
##                 marit_stat      income                  status_comp 
##  Partnered           :666   Min.   : 1.000   Better off       :621  
##  Previously partnered: 94   1st Qu.: 4.000   Worse off        :234  
##  Single              :440   Median : 5.000   Or about the same:345  
##                             Mean   : 5.228                          
##                             3rd Qu.: 6.000                          
##                             Max.   :10.000                          
##                                                                     
##             parent_proud with_parent     GDPpc                  country_income
##  Agree strongly   :804   No :517     Min.   :7826   Low income         :   0  
##  Agree            :317   Yes:683     1st Qu.:7826   Lower middle income:1200  
##  Disagree         : 64               Median :7826   Upper middle income:   0  
##  Strongly disagree: 15               Mean   :7826   High income        :   0  
##                                      3rd Qu.:7826                             
##                                      Max.   :7826                             
##                                                                               
##                           region    
##  Sub-Saharan Africa          :   0  
##  South Asia                  :   0  
##  North America               :   0  
##  Middle East and North Africa:1200  
##  Latin America and Caribbean :   0  
##  Europe and Central Asia     :   0  
##  East Asia and Pacific       :   0
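
The imputation blocks above repeat the same steps for every country. A minimal sketch of how they could be collapsed into one helper is given below. The names impute_by_country and impute_fn are hypothetical and not part of the original script, and the toy median imputer only stands in for missForest so that the example runs without extra packages.

```r
# Hypothetical helper: split the data by country, keep only the observed
# country level in each piece, and run the same imputation on every piece.
impute_by_country <- function(data, impute_fn, seed = 123) {
  parts <- split(data, factor(data$cntry))   # factor() drops unused levels
  lapply(parts, function(d) {
    d$cntry <- factor(d$cntry)               # single-level factor, as in the blocks above
    set.seed(seed)                           # same seed for every country
    impute_fn(d)
  })
}

# Toy stand-in for missForest: fill numeric NAs with the column median.
median_impute <- function(d) {
  for (v in names(d)) {
    if (is.numeric(d[[v]])) d[[v]][is.na(d[[v]])] <- median(d[[v]], na.rm = TRUE)
  }
  d
}

# Toy data (hypothetical, not the WVS data).
toy <- data.frame(
  cntry  = factor(c("Libya", "Libya", "Libya", "Morocco", "Morocco", "Morocco")),
  finsat = c(6, NA, 8, 5, 7, NA)
)

imputed <- impute_by_country(toy, median_impute)
toy_fin <- do.call(rbind, imputed)   # same role as bind_rows() in the script
print(toy_fin)
```

With the real data the call would presumably look like impute_by_country(wvs2, function(d) missForest(d)$ximp), which returns one imputed data frame per country.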

Missings description

P.S. The structure here is a bit out of order: I imputed the data first and only then describe the missings and the data, although logically the description should come first. Sorry about that.

wvs_na <- bind_rows(wvs_pr, wvs_br, wvs_per, wvs_ch, wvs_arg, wvs_urg, wvs_mx, wvs_gut, wvs_nic, wvs_col, wvs_ecu, wvs_bol, wvs_mac, wvs_nig, wvs_ken, wvs_eth, wvs_zim, wvs_lib, wvs_tun, wvs_mor)

library(misty)
library(naniar)
na.test(wvs_na) #not MCAR
##  Little's MCAR Test
## 
##       n nIncomp nPattern    chi2  df  pval 
##   25824    1845       57 7817.35 573 0.000
gg_miss_upset(wvs_na)

Missings: According to Little’s test, the data are not MCAR (the p-value is reported as 0.000, i.e. below 0.05), so the missingness is either MAR or MNAR. The visualization shows that income has the largest number of missings, but few missings co-occur across variables: 95 cases are missing both people in the household and age, 33 are missing both income and status of living compared to parents, and 27 are missing both status of living compared to parents and the statement that one of the main goals is to make parents proud. I would suggest that the type of missingness in the data is MAR.
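
The co-occurrence counts read off the upset plot can also be tabulated directly. A minimal base R sketch on toy data is given below; the data frame toy is hypothetical, and on the real data the same lines would be applied to wvs_na.

```r
# Toy data with missing values (hypothetical, not the WVS data).
toy <- data.frame(
  income       = c(5, NA, 3, NA, 7, 4),
  age          = c(30, 41, NA, 25, 52, NA),
  people_house = c(4, 3, NA, 2, 6, NA)
)

# Number of missing values per variable.
n_missing <- colSums(is.na(toy))

# Missingness pattern per respondent: the names of the variables that are
# missing, pasted into one string so that table() can count identical patterns.
pattern <- apply(is.na(toy), 1, function(r) paste(names(toy)[r], collapse = " + "))
pattern[pattern == ""] <- "complete"
pattern_counts <- sort(table(pattern), decreasing = TRUE)

print(n_missing)
print(pattern_counts)
```

Each row of pattern_counts corresponds to one bar of the upset plot, so the two can be compared one to one.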

wvs_fin <- bind_rows(pr, br, per, ch, arg, urg, mx, gut, nic, col, ecu, bol, mac, nig, ken, eth, zim, lib, tun, mor)

wvs100 <- select(wvs_fin, c(cntry, finsat, sex, age, people_house, marit_stat, income, status_comp, parent_proud, with_parent, GDPpc, country_income, region))

Data description and visualization

table1(~ cntry + finsat + sex + age + people_house + marit_stat + income + status_comp + parent_proud + with_parent + GDPpc + country_income + region, data = wvs100)
Overall
(N=25824)
cntry
Puerto Rico 1127 (4.4%)
Brazil 1762 (6.8%)
Peru 1400 (5.4%)
Chile 1000 (3.9%)
Argentina 1003 (3.9%)
Uruguay 1000 (3.9%)
Mexico 1741 (6.7%)
Guatemala 1229 (4.8%)
Nicaragua 1200 (4.6%)
Colombia 1520 (5.9%)
Ecuador 1200 (4.6%)
Bolivia 2067 (8.0%)
Macau SAR 1023 (4.0%)
Nigeria 1237 (4.8%)
Kenya 1266 (4.9%)
Ethiopia 1230 (4.8%)
Zimbabwe 1215 (4.7%)
Libya 1196 (4.6%)
Tunisia 1208 (4.7%)
Morocco 1200 (4.6%)
finsat
Mean (SD) 6.05 (2.60)
Median [Min, Max] 6.00 [1.00, 10.0]
sex
Male 12343 (47.8%)
Female 13481 (52.2%)
age
Mean (SD) 39.7 (16.1)
Median [Min, Max] 37.0 [16.0, 100]
people_house
Mean (SD) 4.89 (3.40)
Median [Min, Max] 4.00 [1.00, 63.0]
marit_stat
Partnered 14581 (56.5%)
Previously partnered 3335 (12.9%)
Single 7908 (30.6%)
income
Mean (SD) 4.73 (2.15)
Median [Min, Max] 5.00 [1.00, 10.0]
status_comp
Better off 13076 (50.6%)
Worse off 4573 (17.7%)
Or about the same 8175 (31.7%)
parent_proud
Agree strongly 15191 (58.8%)
Agree 8857 (34.3%)
Disagree 1459 (5.6%)
Strongly disagree 317 (1.2%)
with_parent
No 17119 (66.3%)
Yes 8705 (33.7%)
GDPpc
Mean (SD) 17800 (24000)
Median [Min, Max] 11800 [2310, 129000]
country_income
Low income 1230 (4.8%)
Lower middle income 9393 (36.4%)
Upper middle income 11051 (42.8%)
High income 4150 (16.1%)
region
Sub-Saharan Africa 4948 (19.2%)
South Asia 0 (0%)
North America 0 (0%)
Middle East and North Africa 3604 (14.0%)
Latin America and Caribbean 16249 (62.9%)
Europe and Central Asia 0 (0%)
East Asia and Pacific 1023 (4.0%)

Data description

The observations are spread fairly evenly across countries: no country accounts for more than 8% of the total. Financial satisfaction has a mean of 6.05 and a median of 6.00. Sex is almost equally distributed, with 48% males and 52% females. The mean age of the sample is 39.7, with a minimum of 16 and a maximum of 100. The number of people in the household has a mean of 4.89, with a minimum of 1 and a maximum of 63. For marital status, the largest category is “partnered” (56.5% of the sample), followed by “single” (30.6%) and “previously partnered” (12.9%). Mean income is 4.73 (SD = 2.15). For status compared with parents, most people report being better off (50.6% of observations), while 17.7% report being worse off. Making parents proud is strongly agreed to be one of the main goals by 58.8% and agreed by 34.3%; the “strongly disagree” category is too small (1.2%), so I will merge it with “disagree”. Living with parents is reported by 33.7% of the sample; this includes both parents and parents-in-law.

summary(wvs100)
##       cntry           finsat           sex             age        
##  Bolivia : 2067   Min.   : 1.000   Male  :12343   Min.   : 16.00  
##  Brazil  : 1762   1st Qu.: 5.000   Female:13481   1st Qu.: 26.00  
##  Mexico  : 1741   Median : 6.000                  Median : 37.00  
##  Colombia: 1520   Mean   : 6.053                  Mean   : 39.66  
##  Peru    : 1400   3rd Qu.: 8.000                  3rd Qu.: 51.00  
##  Kenya   : 1266   Max.   :10.000                  Max.   :100.00  
##  (Other) :16068                                                   
##   people_house                   marit_stat        income      
##  Min.   : 1.000   Partnered           :14581   Min.   : 1.000  
##  1st Qu.: 3.000   Previously partnered: 3335   1st Qu.: 3.000  
##  Median : 4.000   Single              : 7908   Median : 5.000  
##  Mean   : 4.889                                Mean   : 4.726  
##  3rd Qu.: 6.000                                3rd Qu.: 6.000  
##  Max.   :63.000                                Max.   :10.000  
##                                                                
##             status_comp               parent_proud   with_parent
##  Better off       :13076   Agree strongly   :15191   No :17119  
##  Worse off        : 4573   Agree            : 8857   Yes: 8705  
##  Or about the same: 8175   Disagree         : 1459              
##                            Strongly disagree:  317              
##                                                                 
##                                                                 
##                                                                 
##      GDPpc                    country_income 
##  Min.   :  2312   Low income         : 1230  
##  1st Qu.:  7826   Lower middle income: 9393  
##  Median : 11847   Upper middle income:11051  
##  Mean   : 17795   High income        : 4150  
##  3rd Qu.: 20411                              
##  Max.   :129103                              
##                                              
##                           region     
##  Sub-Saharan Africa          : 4948  
##  South Asia                  :    0  
##  North America               :    0  
##  Middle East and North Africa: 3604  
##  Latin America and Caribbean :16249  
##  Europe and Central Asia     :    0  
##  East Asia and Pacific       : 1023
library(lattice)

Numeric variables:

densityplot(~wvs100$finsat | wvs100$cntry)

  1. Financial satisfaction is distributed differently across countries. In Puerto Rico, Colombia, Nicaragua, and Brazil there are quite a lot of observations with high financial satisfaction, while in Zimbabwe there are a lot of observations with low financial satisfaction. In Libya, Tunisia, Morocco, Ecuador, etc. most of the observations are in the middle. In Nigeria there is a decreasing trend, with the highest density at the bottom of the scale and the lowest near 10; Tunisia is similar, but with a peak in the middle.
densityplot(~wvs100$income | wvs100$cntry)

  1. Income is distributed more similarly across countries. Libya, Morocco, Bolivia, Macau, Uruguay, Peru, Chile, and Argentina have roughly bell-shaped distributions, though they are of course not truly normal. In Mexico it seems that there are many observations with low income and few with higher income.
densityplot(~wvs100$people_house | wvs100$cntry)

  1. The number of people living in the household is distributed in mostly the same way in each country, with Libya and Nigeria having some outliers around 40 and 60.

Categorical variables:

ggplot(wvs100, aes(x=status_comp)) + geom_bar() + facet_wrap(vars(cntry))

  1. “Better off” than their parents is the largest category in almost all countries, but in some cases there are more “about the same” observations, for instance in Colombia and Bolivia. “Worse off” is usually the smallest category, but in some countries it has almost as many observations as “better off” (Argentina) or as “about the same” (Brazil, Ethiopia, and Morocco), and in Zimbabwe “worse off” is the biggest category.
ggplot(wvs100, aes(x=parent_proud)) + geom_bar() + facet_wrap(vars(cntry))

  1. Most people in all the countries agree or strongly agree that “making parents proud” is one of their main goals in life, while disagreement is not that frequent. In Macau there are quite a lot of observations in the “disagree” category.
ggplot(wvs100, aes(x=with_parent)) + geom_bar() + facet_wrap(vars(cntry))

  1. Living with parents is not that common in most of the countries, but in Guatemala, Nicaragua, Libya, and Morocco the “yes” and “no” categories are almost equal.
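
The visual reading can be cross-checked with within-country shares. The sketch below uses only the four countries whose with_parent counts appear in the per-country summaries above; on the full data the same table would presumably come from table(wvs100$cntry, wvs100$with_parent).

```r
# Counts of with_parent taken from the per-country summaries above.
with_parent <- rbind(
  Zimbabwe = c(No = 920, Yes = 295),
  Libya    = c(No = 544, Yes = 652),
  Tunisia  = c(No = 776, Yes = 432),
  Morocco  = c(No = 517, Yes = 683)
)

# Row-wise proportions: share of "No" and "Yes" within each country.
shares <- round(prop.table(with_parent, margin = 1), 3)
print(shares)
```

In Libya and Morocco slightly more than half of the respondents live with their parents, which agrees with the “almost equal” bars, while Zimbabwe and Tunisia are clearly below one half.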

Bivariate tests

wvs100$parent_proud1 <- ifelse(wvs100$parent_proud == "Disagree" | wvs100$parent_proud == "Strongly disagree", "Disagree", as.character(wvs100$parent_proud))
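
Note that ifelse() returns a character vector, so parent_proud1 is later treated as a factor with alphabetical level order (reference category “Agree”). A sketch of an alternative that keeps an explicit level order is shown below on toy data. The name parent_proud_f is hypothetical, and using such a variable in the models would change the reference category to “Agree strongly”.

```r
# Toy factor with the four original answer categories (hypothetical data).
parent_proud <- factor(
  c("Agree strongly", "Agree", "Disagree", "Strongly disagree", "Agree"),
  levels = c("Agree strongly", "Agree", "Disagree", "Strongly disagree")
)

# Merge the two disagreement categories by renaming the levels;
# assigning the same label to two levels collapses them into one.
parent_proud_f <- parent_proud
levels(parent_proud_f) <- c("Agree strongly", "Agree", "Disagree", "Disagree")

print(table(parent_proud_f))
```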

t-tests: binary + numeric (finsat)

#sex 
ks.test(wvs100$finsat, pnorm)
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  wvs100$finsat
## D = 0.89384, p-value < 2.2e-16
## alternative hypothesis: two-sided
wilcox.test(wvs100$finsat ~ wvs100$sex) #no significant difference
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  wvs100$finsat by wvs100$sex
## W = 83384212, p-value = 0.7539
## alternative hypothesis: true location shift is not equal to 0
boxplot(wvs100$finsat ~ wvs100$sex)

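Interpretation: The Kolmogorov-Smirnov test rejects normality of financial satisfaction (p-value < 2.2e-16); note that ks.test with pnorm and no further parameters compares the data with a standard normal distribution (mean 0, SD 1). A non-parametric test is therefore used. The Wilcoxon test does not show a significant difference in financial satisfaction between sexes (p-value 0.7539). The visualization aligns with the test.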
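
The Kolmogorov-Smirnov call above passes pnorm without parameters, so the data are compared with N(0, 1). A minimal sketch of the difference between that and a test against a normal distribution with the sample mean and SD is given below. The ratings are toy data, and estimating the parameters from the same sample makes the p-value only approximate.

```r
# Toy 1-10 ratings (hypothetical counts, not the WVS data).
x <- rep(1:10, times = c(300, 50, 50, 50, 100, 100, 50, 50, 50, 200))

# Against the standard normal N(0, 1): every value is >= 1, so D is large
# regardless of the shape of the data.
ks_std <- suppressWarnings(ks.test(x, "pnorm"))

# Against a normal with the sample mean and SD: this checks the shape.
# suppressWarnings() hides the warning about ties in discrete data.
ks_fit <- suppressWarnings(ks.test(x, "pnorm", mean = mean(x), sd = sd(x)))

print(ks_std$statistic)
print(ks_fit$statistic)
print(ks_fit$p.value)
```

For a discrete 1-10 scale with this many observations, normality is rejected either way, so the choice of a non-parametric test is not affected.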

#with_parent
wilcox.test(wvs100$finsat ~ wvs100$with_parent) #significant difference
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  wvs100$finsat by wvs100$with_parent
## W = 70770255, p-value = 2.867e-11
## alternative hypothesis: true location shift is not equal to 0
boxplot(wvs100$finsat ~ wvs100$with_parent)

Interpretation: The Wilcoxon test indicates a significant difference in financial satisfaction between people who live with their parents and those who do not (p-value 2.867e-11). However, the boxplots show no clear difference between the two groups; the significance is probably due to the large number of observations, which makes even very small differences detectable.
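
How small the difference is can be sketched with an effect size computed from the reported W statistic and the group sizes in the descriptive table. The sketch assumes that “No” is the first factor level (as in the summaries), so that W counts the pairs in which a “No” respondent scores higher; the variable names are illustrative.

```r
# Values reported above: W from wilcox.test and the with_parent group sizes.
W     <- 70770255
n_no  <- 17119   # respondents not living with parents
n_yes <- 8705    # respondents living with parents

# W / (n_no * n_yes): estimated probability that a random "No" respondent
# reports higher financial satisfaction than a random "Yes" respondent
# (ties count as one half).
p_superiority <- W / (n_no * n_yes)

# Rank-biserial correlation, between -1 and 1; 0 means no difference.
rank_biserial <- 2 * p_superiority - 1

print(round(c(p_superiority = p_superiority, rank_biserial = rank_biserial), 3))
```

A probability of about 0.475 and a rank-biserial correlation of about -0.05 point to a very small difference, which is consistent with the nearly identical boxplots.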

corr: num + num (finsat)

#age
cor1 <- cor.test(wvs100$finsat, wvs100$age, method = "spearman")
cor1
## 
##  Spearman's rank correlation rho
## 
## data:  wvs100$finsat and wvs100$age
## S = 2.9018e+12, p-value = 0.07771
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##         rho 
## -0.01097819
#income
cor2 <- cor.test(wvs100$finsat, wvs100$income, method = "spearman")
cor2
## 
##  Spearman's rank correlation rho
## 
## data:  wvs100$finsat and wvs100$income
## S = 2.0655e+12, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##       rho 
## 0.2803708
#people_house
cor3 <- cor.test(wvs100$finsat, wvs100$people_house, method = "spearman")
cor3
## 
##  Spearman's rank correlation rho
## 
## data:  wvs100$finsat and wvs100$people_house
## S = 2.8771e+12, p-value = 0.7021
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##          rho 
## -0.002380482

Interpretation: Financial satisfaction is significantly and positively correlated with income (rho = 0.28, p-value < 2.2e-16). The number of people in the household (rho = -0.002, p-value 0.70) and age (rho = -0.01, p-value 0.078) are not significantly correlated with financial satisfaction.

anova: cat + num (finsat)

#marit_stat
car::leveneTest(wvs100$finsat~wvs100$marit_stat) #variances are not equal across groups
## Levene's Test for Homogeneity of Variance (center = median)
##          Df F value    Pr(>F)    
## group     2   18.69 7.741e-09 ***
##       25821                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
kruskal.test(wvs100$finsat~wvs100$marit_stat) #difference is statistically significant
## 
##  Kruskal-Wallis rank sum test
## 
## data:  wvs100$finsat by wvs100$marit_stat
## Kruskal-Wallis chi-squared = 36.229, df = 2, p-value = 1.358e-08
#status_comp 
car::leveneTest(wvs100$finsat~wvs100$status_comp) #variances are not equal across groups
## Levene's Test for Homogeneity of Variance (center = median)
##          Df F value    Pr(>F)    
## group     2  41.614 < 2.2e-16 ***
##       25821                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
kruskal.test(wvs100$finsat~wvs100$status_comp) #difference is statistically significant
## 
##  Kruskal-Wallis rank sum test
## 
## data:  wvs100$finsat by wvs100$status_comp
## Kruskal-Wallis chi-squared = 1412.3, df = 2, p-value < 2.2e-16
#parent_proud1
car::leveneTest(wvs100$finsat~wvs100$parent_proud1) #variances are not equal across groups
## Levene's Test for Homogeneity of Variance (center = median)
##          Df F value    Pr(>F)    
## group     2  60.078 < 2.2e-16 ***
##       25821                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
kruskal.test(wvs100$finsat~wvs100$parent_proud1) #difference is statistically significant
## 
##  Kruskal-Wallis rank sum test
## 
## data:  wvs100$finsat by wvs100$parent_proud1
## Kruskal-Wallis chi-squared = 16.063, df = 2, p-value = 0.0003251

Interpretation: For all three variables, Levene’s test indicates that the variance of financial satisfaction differs between the categories of marital status, status of living compared to parents, and making parents proud as one of the main goals. The Kruskal-Wallis test is therefore used instead of a classic ANOVA:

  1. Financial satisfaction is not equal across different marital status groups (p-value 1.358e-08)
  2. Financial satisfaction is not equal across different statuses of living compared to parents (p-value < 2.2e-16)
  3. Financial satisfaction is not equal across different levels of agreement that making parents proud is one of the main goals (p-value 0.0003251)
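
With almost 26,000 observations all three tests are significant, so an effect size helps to compare them. The sketch below computes one common choice, eta-squared based on the H statistic, from the Kruskal-Wallis values reported above. The choice of this particular effect size is an assumption, not part of the original analysis.

```r
# H statistics reported by kruskal.test above; every variable has k = 3 groups.
H <- c(marit_stat = 36.229, status_comp = 1412.3, parent_proud1 = 16.063)
n <- 25824
k <- 3

# Eta-squared based on H: (H - k + 1) / (n - k).
eta2_H <- (H - k + 1) / (n - k)

print(round(eta2_H, 4))
```

Status compared to parents accounts for roughly 5% of the rank variance in financial satisfaction, while marital status and making parents proud account for well under 1% each.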

visualization

anova_comparisons1 <- list( c("Partnered", "Previously partnered"), c("Partnered", "Single"), c("Single", "Previously partnered"))
ggplot(wvs100, aes(x=marit_stat, y=finsat)) + 
  geom_boxplot()+
  theme_minimal(base_size = 20)+
  ylab('Fin satisfaction')+xlab('Marital status')+
  ggpubr::stat_compare_means(comparisons = anova_comparisons1,label = "p.signif")+
  ggpubr::stat_compare_means(method = 'anova', label.y = 14) # label just above the 1-10 scale; 81 would stretch the y-axis

#status_comp
table(wvs100$status_comp)
## 
##        Better off         Worse off Or about the same 
##             13076              4573              8175
anova_comparisons2 <- list(c("Better off", "Worse off"), c("Better off", "Or about the same"), c("Or about the same", "Worse off"))
ggplot(wvs100, aes(x=status_comp, y=finsat)) + 
  geom_boxplot()+
  theme_minimal(base_size = 20)+
  ylab('Fin satisfaction')+xlab('Status compared to parents')+
  ggpubr::stat_compare_means(comparisons = anova_comparisons2,label = "p.signif")+
  ggpubr::stat_compare_means(method = 'anova', label.y = 14) # label just above the 1-10 scale; 81 would stretch the y-axis

#parent_proud1
anova_comparisons3 <- list( c("Agree", "Agree strongly"), c("Agree", "Disagree"), c("Agree strongly", "Disagree"))
ggplot(wvs100, aes(x=parent_proud1, y=finsat)) + 
  geom_boxplot()+
  theme_minimal(base_size = 20)+
  ylab('Fin satisfaction')+xlab('Make parents proud')+
  ggpubr::stat_compare_means(comparisons = anova_comparisons3,label = "p.signif")+
  ggpubr::stat_compare_means(method = 'anova', label.y = 14) # label just above the 1-10 scale; 81 would stretch the y-axis

Interpretation: The plots show that the distribution of financial satisfaction indeed differs between groups for all three variables.

Multilevel regression

2nd level justification

library(lme4)

nullmodel1 <- lm(finsat ~ 1, data = wvs100)
summary(nullmodel1)
## 
## Call:
## lm(formula = finsat ~ 1, data = wvs100)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.0528 -1.0528 -0.0528  1.9472  3.9472 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.05278    0.01619   373.8   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.602 on 25823 degrees of freedom
nullmodel2 <- lmer(finsat ~ (1 | cntry), data = wvs100, REML = FALSE)  
tab_model(nullmodel2) 
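Dependent variable: finsat

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 6.03 | 5.63 – 6.43 | <0.001 |
| Random Effects | | | |
| σ2 | 5.99 | | |
| τ00 cntry | 0.81 | | |
| ICC | 0.12 | | |
| N cntry | 20 | | |
| Observations | 25824 | | |
| Marginal R2 / Conditional R2 | 0.000 / 0.119 | | |
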
anova(nullmodel2, nullmodel1)
## Data: wvs100
## Models:
## nullmodel1: finsat ~ 1
## nullmodel2: finsat ~ (1 | cntry)
##            npar    AIC    BIC logLik deviance Chisq Df Pr(>Chisq)    
## nullmodel1    2 122677 122694 -61337   122673                        
## nullmodel2    3 119635 119660 -59815   119629  3044  1  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

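The addition of the second level is justified: the p-value is small (< 2.2e-16), indicating that the improvement is statistically significant, and the log-likelihood of the model with the second level is closer to 0 (-59815 versus -61337), so model 2 is better. The ICC of 0.12 means that about 12% of the variance in financial satisfaction lies between countries.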
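
The ICC and the likelihood-ratio statistic can be reproduced by hand from the numbers reported above. The sketch below uses the rounded values from the tables, so small rounding differences are possible; the variable names are illustrative.

```r
# Variance components reported by tab_model(nullmodel2).
sigma2 <- 5.99   # residual (within-country) variance
tau00  <- 0.81   # between-country intercept variance

# ICC: share of the total variance that lies between countries.
icc <- tau00 / (tau00 + sigma2)

# Likelihood-ratio statistic from the (rounded) log-likelihoods in anova().
loglik_single <- -61337   # nullmodel1, no country level
loglik_multi  <- -59815   # nullmodel2, random country intercept
lrt <- 2 * (loglik_multi - loglik_single)
p_value <- pchisq(lrt, df = 1, lower.tail = FALSE)

print(round(icc, 3))
print(lrt)
```

This gives an ICC of about 0.119 and a statistic of 3044, matching the outputs. Because a variance of zero lies on the boundary of the parameter space, the chi-square p-value is, if anything, conservative here.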

Model with control variables

mdl1 <- lmer(finsat ~ sex + age + income + marit_stat  + (1 | cntry), data = wvs100, REML = FALSE) 

Models with 1st level variables

mdl2 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + (1 | cntry), data = wvs100, REML = FALSE) 

mdl3 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + (1 | cntry), data = wvs100, REML = FALSE) 

anova(mdl2, mdl3)
## Data: wvs100
## Models:
## mdl2: finsat ~ sex + age + income + marit_stat + status_comp + (1 | cntry)
## mdl3: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + (1 | cntry)
##      npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)    
## mdl2   10 117116 117197 -58548   117096                         
## mdl3   12 117083 117181 -58529   117059 36.794  2  1.024e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mdl4 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + (1 | cntry), data = wvs100, REML = FALSE) 
 # with people_house: conditional R2 = 0.18, ICC = 0.10

anova(mdl3, mdl4)
## Data: wvs100
## Models:
## mdl3: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + (1 | cntry)
## mdl4: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + (1 | cntry)
##      npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)  
## mdl3   12 117083 117181 -58529   117059                       
## mdl4   13 117080 117186 -58527   117054 4.8039  1    0.02839 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mdl4.1 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + with_parent + (1 | cntry), data = wvs100, REML = FALSE) 


anova(mdl4.1, mdl4)
## Data: wvs100
## Models:
## mdl4: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + (1 | cntry)
## mdl4.1: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + with_parent + (1 | cntry)
##        npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)
## mdl4     13 117080 117186 -58527   117054                     
## mdl4.1   14 117082 117196 -58527   117054 0.1008  1     0.7509
tab_model(mdl1, mdl2, mdl3, mdl4)
Cells show Estimate (CI), p. Empty cells mean the predictor is not in that model.

| Predictors | finsat (mdl1) | finsat (mdl2) | finsat (mdl3) | finsat (mdl4) |
|---|---|---|---|---|
| (Intercept) | 4.52 (4.14 – 4.90), <0.001 | 4.95 (4.59 – 5.30), <0.001 | 4.90 (4.53 – 5.26), <0.001 | 4.96 (4.59 – 5.33), <0.001 |
| sex [Female] | 0.00 (-0.06 – 0.06), 0.999 | -0.00 (-0.06 – 0.05), 0.866 | -0.00 (-0.06 – 0.05), 0.892 | -0.00 (-0.06 – 0.05), 0.897 |
| age | 0.00 (-0.00 – 0.00), 0.338 | 0.00 (-0.00 – 0.00), 0.401 | 0.00 (-0.00 – 0.00), 0.396 | 0.00 (-0.00 – 0.00), 0.406 |
| income | 0.31 (0.29 – 0.32), <0.001 | 0.28 (0.27 – 0.30), <0.001 | 0.28 (0.27 – 0.30), <0.001 | 0.28 (0.27 – 0.30), <0.001 |
| marit stat [Previously partnered] | -0.12 (-0.22 – -0.03), 0.009 | -0.08 (-0.17 – 0.01), 0.094 | -0.08 (-0.17 – 0.01), 0.101 | -0.08 (-0.17 – 0.01), 0.084 |
| marit stat [Single] | 0.10 (0.03 – 0.18), 0.005 | 0.09 (0.02 – 0.16), 0.017 | 0.09 (0.01 – 0.16), 0.018 | 0.09 (0.01 – 0.16), 0.019 |
| status comp [Worse off] | | -1.10 (-1.18 – -1.01), <0.001 | -1.09 (-1.17 – -1.00), <0.001 | -1.09 (-1.17 – -1.00), <0.001 |
| status comp [Or about the same] | | -0.37 (-0.44 – -0.31), <0.001 | -0.37 (-0.44 – -0.30), <0.001 | -0.37 (-0.44 – -0.30), <0.001 |
| parent proud1 [Agree strongly] | | | 0.12 (0.06 – 0.19), <0.001 | 0.12 (0.06 – 0.19), <0.001 |
| parent proud1 [Disagree] | | | -0.23 (-0.35 – -0.11), <0.001 | -0.23 (-0.35 – -0.11), <0.001 |
| people house | | | | -0.01 (-0.02 – -0.00), 0.028 |
| Random Effects | | | | |
| σ2 | 5.58 | 5.43 | 5.43 | 5.43 |
| τ00 cntry | 0.66 | 0.56 | 0.58 | 0.59 |
| ICC | 0.11 | 0.09 | 0.10 | 0.10 |
| N cntry | 20 | 20 | 20 | 20 |
| Observations | 25824 | 25824 | 25824 | 25824 |
| Marginal R2 / Conditional R2 | 0.066 / 0.165 | 0.092 / 0.177 | 0.093 / 0.180 | 0.093 / 0.182 |

Interpretation: Adding the “living with parents” variable does not improve the model (p-value is 0.7509), while the other variables do (for example, p-value 0.02839 for people_house), so I will not add the “with_parent” variable to the model.
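The p-values in the anova() outputs above come from a likelihood-ratio test: the Chisq statistic is compared with a chi-square distribution whose degrees of freedom equal the number of added parameters. A minimal base-R sketch, assuming only the rounded Chisq and Df values printed above (the data frame `lrt` and its column names are made up for illustration), so the results may differ from the printed p-values in the last digit:

```r
# Chi-square statistics and degrees of freedom copied from the anova() outputs above
lrt <- data.frame(
  comparison = c("mdl2 vs mdl3", "mdl3 vs mdl4", "mdl4 vs mdl4.1"),
  chisq      = c(36.794, 4.8039, 0.1008),
  df         = c(2, 1, 1)
)

# Upper-tail chi-square probability = p-value of the likelihood-ratio test
lrt$p_value <- pchisq(lrt$chisq, df = lrt$df, lower.tail = FALSE)

# Flag the comparisons where the added predictor improves fit at the 5% level
lrt$improves_fit <- lrt$p_value < 0.05

print(lrt)
```

Only the last comparison (adding with_parent) fails the 5% threshold, which is the basis for leaving that variable out.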

Models with 2nd level variables

mdl5 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc + (1 | cntry), data = wvs100, REML = FALSE) 


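# standardize GDP per capita (mean 0, SD 1); as.numeric() drops the matrix attributes that scale() returns
wvs100$GDPpc_sc <- as.numeric(scale(wvs100$GDPpc))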
mdl5.1 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + (1 | cntry), data = wvs100, REML = FALSE) 


mdl6 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + (1 | cntry), data = wvs100, REML = FALSE) 
 #baseline: Low income

mdl7 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 | cntry), data = wvs100, REML = FALSE) 
 #baseline: Sub-Saharan Africa 

anova(mdl6, mdl7)
## Data: wvs100
## Models:
## mdl6: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + (1 | cntry)
## mdl7: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 | cntry)
##      npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)   
## mdl6   17 117076 117215 -58521   117042                        
## mdl7   20 117066 117229 -58513   117026 15.924  3   0.001176 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tab_model(mdl4, mdl5.1, mdl6, mdl7) # model 7: conditional R2 = 0.194 (19.4%), ICC = 0.03
Cells show Estimate (CI), p. Empty cells mean the predictor is not in that model.

| Predictors | finsat (mdl4) | finsat (mdl5.1) | finsat (mdl6) | finsat (mdl7) |
|---|---|---|---|---|
| (Intercept) | 4.96 (4.59 – 5.33), <0.001 | 4.95 (4.59 – 5.32), <0.001 | 4.10 (2.96 – 5.25), <0.001 | 4.13 (3.08 – 5.18), <0.001 |
| sex [Female] | -0.00 (-0.06 – 0.05), 0.897 | -0.00 (-0.06 – 0.05), 0.894 | -0.00 (-0.06 – 0.05), 0.888 | -0.00 (-0.06 – 0.05), 0.886 |
| age | 0.00 (-0.00 – 0.00), 0.406 | 0.00 (-0.00 – 0.00), 0.411 | 0.00 (-0.00 – 0.00), 0.438 | 0.00 (-0.00 – 0.00), 0.458 |
| income | 0.28 (0.27 – 0.30), <0.001 | 0.28 (0.27 – 0.30), <0.001 | 0.28 (0.27 – 0.30), <0.001 | 0.28 (0.27 – 0.30), <0.001 |
| marit stat [Previously partnered] | -0.08 (-0.17 – 0.01), 0.084 | -0.08 (-0.17 – 0.01), 0.085 | -0.08 (-0.17 – 0.01), 0.082 | -0.08 (-0.17 – 0.01), 0.079 |
| marit stat [Single] | 0.09 (0.01 – 0.16), 0.019 | 0.09 (0.01 – 0.16), 0.019 | 0.09 (0.01 – 0.16), 0.020 | 0.09 (0.01 – 0.16), 0.020 |
| status comp [Worse off] | -1.09 (-1.17 – -1.00), <0.001 | -1.08 (-1.17 – -1.00), <0.001 | -1.09 (-1.17 – -1.00), <0.001 | -1.08 (-1.17 – -1.00), <0.001 |
| status comp [Or about the same] | -0.37 (-0.44 – -0.30), <0.001 | -0.37 (-0.44 – -0.30), <0.001 | -0.37 (-0.44 – -0.30), <0.001 | -0.37 (-0.44 – -0.30), <0.001 |
| parent proud1 [Agree strongly] | 0.12 (0.06 – 0.19), <0.001 | 0.12 (0.06 – 0.19), <0.001 | 0.12 (0.06 – 0.19), <0.001 | 0.13 (0.06 – 0.19), <0.001 |
| parent proud1 [Disagree] | -0.23 (-0.35 – -0.11), <0.001 | -0.23 (-0.35 – -0.11), <0.001 | -0.23 (-0.35 – -0.11), <0.001 | -0.23 (-0.35 – -0.11), <0.001 |
| people house | -0.01 (-0.02 – -0.00), 0.028 | -0.01 (-0.02 – -0.00), 0.029 | -0.01 (-0.02 – -0.00), 0.029 | -0.01 (-0.02 – -0.00), 0.039 |
| GDPpc sc | | 0.13 (-0.17 – 0.43), 0.398 | -0.14 (-0.44 – 0.16), 0.365 | -0.09 (-1.22 – 1.04), 0.875 |
| country income [Lower middle income] | | | 0.21 (-0.99 – 1.41), 0.736 | -0.51 (-1.38 – 0.37), 0.256 |
| country income [Upper middle income] | | | 1.21 (0.01 – 2.41), 0.049 | -0.32 (-1.41 – 0.78), 0.574 |
| country income [High income] | | | 1.57 (0.16 – 2.98), 0.029 | -0.05 (-1.54 – 1.45), 0.952 |
| region [Middle East and North Africa] | | | | 0.87 (0.20 – 1.55), 0.011 |
| region [Latin America and Caribbean] | | | | 1.59 (0.93 – 2.25), <0.001 |
| region [East Asia and Pacific] | | | | 1.28 (-3.75 – 6.32), 0.617 |
| Random Effects | | | | |
| σ2 | 5.43 | 5.43 | 5.43 | 5.43 |
| τ00 cntry | 0.59 | 0.57 | 0.32 | 0.14 |
| ICC | 0.10 | 0.09 | 0.06 | 0.03 |
| N cntry | 20 | 20 | 20 | 20 |
| Observations | 25824 | 25824 | 25824 | 25824 |
| Marginal R2 / Conditional R2 | 0.093 / 0.182 | 0.097 / 0.183 | 0.141 / 0.189 | 0.173 / 0.194 |

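Interpretation: The 7th model fits better than the 6th (p-value 0.001176, log-likelihood closer to 0: -58513 vs -58521). Its conditional R2 is 0.194, so fixed and random effects together explain 19.4% of the variance. The ICC falls to 0.03, which means that little unexplained between-country variance remains once region is included.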
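For a random-intercept model the ICC is the between-country variance divided by the total unexplained variance, τ00 / (τ00 + σ2). A minimal base-R sketch, assuming only the rounded variance components printed by tab_model() above (the data frame `vc` is made up for illustration); because the inputs are rounded to two decimals, the results can differ slightly from the printed ICC values:

```r
# Variance components copied from the tab_model() output above (rounded to 2 decimals)
vc <- data.frame(
  model  = c("mdl4", "mdl5.1", "mdl6", "mdl7"),
  tau00  = c(0.59, 0.57, 0.32, 0.14),  # between-country (random intercept) variance
  sigma2 = c(5.43, 5.43, 5.43, 5.43)   # residual (within-country) variance
)

# ICC = share of the unexplained variance that lies between countries
vc$icc <- vc$tau00 / (vc$tau00 + vc$sigma2)

print(round(vc$icc, 3))
```

The residual variance stays at 5.43 across these four models, so the drop in ICC comes entirely from the country-level predictors absorbing between-country variance.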

Random effects

mdl8 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 + parent_proud1 | cntry), data = wvs100, REML = FALSE) # model 8: conditional R2 = 0.179 (17.9%), ICC = 0.04

anova(mdl8, mdl7)
## Data: wvs100
## Models:
## mdl7: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 | cntry)
## mdl8: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 + parent_proud1 | cntry)
##      npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)    
## mdl7   20 117066 117229 -58513   117026                         
## mdl8   25 117041 117245 -58496   116991 34.791  5  1.656e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mdl9 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 * region + people_house + GDPpc_sc + country_income + (1 + parent_proud1 | cntry), data = wvs100, REML = FALSE) 

anova(mdl8, mdl9) # no significant improvement
## Data: wvs100
## Models:
## mdl8: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 + parent_proud1 | cntry)
## mdl9: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 * region + people_house + GDPpc_sc + country_income + (1 + parent_proud1 | cntry)
##      npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)
## mdl8   25 117041 117245 -58496   116991                     
## mdl9   31 117043 117296 -58491   116981 10.372  6     0.1098

Interpretation: Adding a random slope for parent_proud1 significantly improves the model (p-value 1.656e-06). The 8th model explains 17.9% of the variance (conditional R2), with ICC being 0.04. Adding the cross-level interaction between parent_proud1 and region does not improve the model significantly (p-value is 0.1098), so I will keep the model with the random slope and without the cross-level interaction.

tab_model(mdl8)
Dependent variable: finsat

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 3.87 | 2.90 – 4.83 | <0.001 |
| sex [Female] | -0.01 | -0.06 – 0.05 | 0.851 |
| age | 0.00 | -0.00 – 0.00 | 0.503 |
| income | 0.28 | 0.27 – 0.29 | <0.001 |
| marit stat [Previously partnered] | -0.08 | -0.17 – 0.01 | 0.078 |
| marit stat [Single] | 0.08 | 0.01 – 0.15 | 0.026 |
| status comp [Worse off] | -1.07 | -1.16 – -0.99 | <0.001 |
| status comp [Or about the same] | -0.37 | -0.44 – -0.30 | <0.001 |
| parent proud1 [Agree strongly] | 0.13 | 0.01 – 0.24 | 0.026 |
| parent proud1 [Disagree] | -0.15 | -0.46 – 0.16 | 0.339 |
| people house | -0.01 | -0.02 – -0.00 | 0.043 |
| GDPpc sc | -0.21 | -1.18 – 0.75 | 0.664 |
| country income [Lower middle income] | 0.03 | -0.79 – 0.85 | 0.945 |
| country income [Upper middle income] | 0.14 | -0.88 – 1.15 | 0.793 |
| country income [High income] | 0.47 | -0.85 – 1.79 | 0.488 |
| region [Middle East and North Africa] | 0.70 | 0.09 – 1.31 | 0.025 |
| region [Latin America and Caribbean] | 1.28 | 0.68 – 1.88 | <0.001 |
| region [East Asia and Pacific] | 1.52 | -2.80 – 5.84 | 0.490 |

| Random Effects | |
|---|---|
| σ2 | 5.41 |
| τ00 cntry | 0.24 |
| τ11 cntry.parent_proud1Agree strongly | 0.04 |
| τ11 cntry.parent_proud1Disagree | 0.37 |
| ρ01 | -0.49, -0.75 |
| ICC | 0.04 |
| N cntry | 20 |
| Observations | 25824 |
| Marginal R2 / Conditional R2 | 0.148 / 0.179 |

Interpretation: Sex, age, being previously partnered (compared with partnered), disagreement with the statement that one of the main goals in life is to make parents proud (compared to “agree”), GDP per capita, country income, and the East Asia and Pacific region (compared to Sub-Saharan Africa) are not significant.

A one-unit increase in income is associated with a 0.28 increase in financial satisfaction. Being single rather than partnered increases financial satisfaction by 0.08. Rating one’s standard of living as worse off than one’s parents’ (compared with better off) decreases fin sat by 1.07, while rating it about the same decreases fin sat by 0.37. Strong agreement that one’s main goal is to make parents proud increases fin sat by 0.13. The number of people in the household affects fin sat only marginally: each additional person decreases it by 0.01. Compared to Sub-Saharan Africa, financial satisfaction is higher by 0.70 in the Middle East and North Africa region and by 1.28 in Latin America and Caribbean.
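To see how these coefficients combine, the expected difference between two respondents can be computed by multiplying each estimate by the difference in the corresponding predictor. A minimal sketch, assuming the rounded fixed effects from tab_model(mdl8) and two invented respondent profiles from the same country (`profile_a` and `profile_b` are hypothetical and not taken from the data); country-specific deviations of the parent_proud1 slope are ignored:

```r
# Fixed-effect estimates copied from tab_model(mdl8), rounded to two decimals
b <- c(income = 0.28, single = 0.08, worse_off = -1.07,
       about_same = -0.37, agree_strongly = 0.13, people_house = -0.01)

# Two hypothetical respondents from the same country (illustrative values only)
profile_a <- c(income = 7, single = 0, worse_off = 0,
               about_same = 0, agree_strongly = 1, people_house = 3)
profile_b <- c(income = 3, single = 0, worse_off = 1,
               about_same = 0, agree_strongly = 0, people_house = 6)

# Shared terms (intercept, country-level predictors) cancel in the difference
diff_ab <- sum(b * (profile_a - profile_b))

print(diff_ab)
```

In this made-up comparison most of the gap comes from income (4 × 0.28) and from the “worse off” rating (1.07), while the three extra household members contribute only 0.03.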

Here is the visualization of what was described above:

plot_model(mdl8, type = "pred", terms = "income")

plot_model(mdl8, type = "pred", terms = "marit_stat")

plot_model(mdl8, type = "pred", terms = "status_comp")

plot_model(mdl8, type = "pred", terms = "parent_proud1")

plot_model(mdl8, type = "pred", terms = "people_house")

plot_model(mdl8, type = "pred", terms = "region")

library(lattice) # provides dotplot()
dotplot(ranef(mdl8))
## $cntry

The dot plot shows the country-level random effects: both the random intercepts and the random slopes of parent_proud1 differ between countries. For instance, in Tunisia strong agreement is associated with higher financial satisfaction.

plot_model(mdl8,type="pred",
           terms=c("parent_proud1", "cntry"),pred.type="re", grid=T)

The plot shows predicted financial satisfaction at each level of agreement that one of the main goals in life is to make one’s parents proud, separately for each country.

Model Diagnostics

library(performance)
model_performance(mdl7)
## # Indices of model performance
## 
## AIC       |      AICc |       BIC | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma
## -----------------------------------------------------------------------------------
## 1.171e+05 | 1.171e+05 | 1.173e+05 |      0.194 |      0.173 | 0.026 | 2.328 | 2.329
model_performance(mdl8)
## # Indices of model performance
## 
## AIC       |      AICc |       BIC | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma
## -----------------------------------------------------------------------------------
## 1.171e+05 | 1.171e+05 | 1.173e+05 |      0.179 |      0.148 | 0.036 | 2.324 | 2.326

The ICC indicates that 3.6% of the variance in model 8 is attributable to between-country differences, and the conditional R2 of 0.179 means that fixed and random effects together explain 17.9% of the variance. Comparing the models with and without the random slope, model 7 has the higher conditional R2 (19.4% vs 17.9%) and marginal R2 (17.3% vs 14.8%), while the ICC is slightly higher in model 8 (0.036 vs 0.026), so more of the unexplained variance sits between countries once the effect of parent_proud1 is allowed to vary. AIC and BIC only look identical here because model_performance() prints them to four significant digits. The anova(mdl8, mdl7) output above shows AIC 117066 vs 117041 (favoring model 8) and BIC 117229 vs 117245 (favoring model 7).
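Because model_performance() prints AIC and BIC in scientific notation with four significant digits, small differences between models are easier to see when the criteria are recomputed from the deviance. A minimal base-R sketch, assuming the deviance and npar values printed by anova(mdl8, mdl7) and the 25824 observations reported by tab_model() (the data frame `fit` is made up for illustration); the results should match the anova table up to rounding:

```r
# Deviance (-2 * log-likelihood) and parameter counts copied from anova(mdl8, mdl7)
fit <- data.frame(
  model    = c("mdl7", "mdl8"),
  deviance = c(117026, 116991),
  npar     = c(20, 25)
)
n_obs <- 25824  # number of observations reported by tab_model()

# Both criteria add a penalty for the number of parameters to the deviance
fit$aic <- fit$deviance + 2 * fit$npar
fit$bic <- fit$deviance + log(n_obs) * fit$npar

print(fit)
```

The heavier BIC penalty, log(25824) per parameter instead of 2, is why the two criteria disagree about the five extra random-effect parameters in mdl8.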

library(DHARMa)
testDispersion(mdl8)

## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.99726, p-value = 0.944
## alternative hypothesis: two.sided
simulationOutput <- simulateResiduals(fittedModel = mdl8, plot = F)
plot(simulationOutput)

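The p-value of the KS test indicates a significant deviation of the scaled residuals from their expected distribution (DHARMa compares them with a uniform distribution, not a normal one). The dispersion test shows no significant over- or underdispersion (dispersion = 0.997, p-value = 0.944). There are significant outliers.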

Conclusions

Hypotheses

  1. People who estimate their standard of living higher compared to their parents’ status will have higher financial satisfaction
  2. People for whom one of the main goals in life is to make their parents proud will have a lower level of financial satisfaction
  3. A higher number of people in the household and living with parents will decrease the financial satisfaction of the household

Hypotheses confirmation

The first hypothesis is confirmed: people who rate their standard of living as “worse off” or “about the same” compared with their parents have lower financial satisfaction than those who rate it as “better off”.

The second hypothesis is not confirmed: people who strongly agree with the statement “One of main goals in life has been to make my parents proud” have higher, not lower, financial satisfaction.

The third hypothesis is not confirmed either. The number of people in the household has a statistically significant but very small effect (-0.01 per additional person, p-value 0.043), which for a dataset of 25824 observations is too weak to count as confirmation. Living with parents was not significant, and adding it did not improve the model at all.

The effect of separation from parents on financial satisfaction should be studied further, preferably on a different dataset.