#RQ: Does separation from parents affect household financial satisfaction in South American and African countries?
##Hypotheses:
1. People who rate their standard of living as higher than their parents' will report higher financial satisfaction.
2. People for whom making their parents proud is one of their main goals in life will report lower financial satisfaction.
3. A larger household and living with parents will both be associated with lower household financial satisfaction.
To measure separation from parents, I will use three main first-level (respondent-level) variables:
Q56 - Standard of living compared with your parents
Q27 - One of my main goals in life has been to make my parents proud
Q271 - Do you live with your parents
I will also use Q270 - Number of people in household
The control variables will be:
Q260 - sex
Q262 - age
Q273 - marital status
Q288 - scale of income
The second-level (country-level) variables are:
GDPpercap1 - GDP per capita, PPP (current international $)
incomeWB - Country income group (low, lower-middle, upper-middle, high)
regionWB - Geographic region (7 groups)
Countries included in the analysis: Puerto Rico, Brazil, Peru, Chile, Argentina, Uruguay, Mexico, Venezuela, Guatemala, Nicaragua, Colombia, Ecuador, Bolivia, Macau SAR, Nigeria, Kenya, Ethiopia, Zimbabwe, Morocco, Tunisia, Libya
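The two-level design described above (respondents nested in countries) suggests a random-intercept model. The script does not show the model itself, so the following is only a minimal sketch under assumptions: it uses `nlme::lme` (not among the packages loaded in this script), simulated toy data named `toy`, and the renamed variables from the `lookup` vector defined later. Entering GDP per capita as `log(GDPpc)` is likewise an assumption, not something the source states.

```r
library(nlme)  # assumption: nlme is used here only because it ships with standard R

set.seed(1)
n_cntry <- 8    # hypothetical number of countries in the toy data
n_per   <- 40   # hypothetical respondents per country
n       <- n_cntry * n_per

# Simulated stand-in for the recoded survey data (not real WVS values)
toy <- data.frame(
  cntry        = factor(rep(paste0("C", seq_len(n_cntry)), each = n_per)),
  status_comp  = factor(sample(c("Better off", "Worse off", "Or about the same"),
                               n, replace = TRUE)),
  parent_proud = factor(sample(c("Agree strongly", "Agree", "Disagree",
                                 "Strongly disagree"), n, replace = TRUE)),
  with_parent  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  people_house = sample(1:8, n, replace = TRUE),
  income       = sample(1:10, n, replace = TRUE),
  GDPpc        = rep(round(runif(n_cntry, 5000, 36000)), each = n_per)
)

# Toy outcome: a country-specific shift plus individual-level noise
cntry_effect <- rnorm(n_cntry, sd = 0.5)
toy$finsat <- 5 + 0.2 * toy$income - 0.1 * toy$people_house +
  cntry_effect[as.integer(toy$cntry)] + rnorm(n)

# Random intercept per country; first-level predictors plus one
# second-level predictor (GDP per capita, constant within a country)
fit <- lme(
  finsat ~ status_comp + parent_proud + with_parent + people_house +
    income + log(GDPpc),
  random = ~ 1 | cntry,
  data = toy
)
summary(fit)
```

The control variables (sex, age, marital status) would enter the fixed part of the formula in the same way as `income`.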
library(foreign)
library(missForest)
library(dplyr)
library(ggplot2)
library(sjPlot)
library(table1)
wvs <- read.spss("WVS_Cross-National_Wave_7_spss_v6_0.sav", to.data.frame = TRUE)
wvs1 <- select(wvs, c(C_COW_NUM, Q50, Q260, Q262, Q270, Q273, Q274, Q275R, Q279, Q280, Q288, Q288R, Q56, Q27, Q271, GDPpercap1, giniWB, btimarket, incomeWB, regionWB))
table1(~ Q50 + Q260 + Q262 + Q270 + Q273 + Q288 + Q56 + Q27 + Q271 + GDPpercap1 + incomeWB + regionWB, data = wvs1)
wvs1$Q262_n <- as.numeric(as.character(wvs1$Q262))
summary(wvs1$Q262_n)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 16.00 29.00 41.00 43.18 55.00 103.00 511
wvs1$Q270_n <- as.numeric(as.character(wvs1$Q270))
summary(wvs1$Q270_n)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.000 2.000 4.000 4.025 5.000 63.000 985
wvs1$Q288_n <- as.numeric(wvs1$Q288)
summary(wvs1$Q288_n)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.00 4.00 5.00 4.91 6.00 10.00 2961
wvs1$Q271_r <- ifelse(wvs1$Q271 == "Yes, own parent(s)" | wvs1$Q271 == "Yes, parent(s) in law" |
wvs1$Q271 == "Yes, both own parent(s) and parent(s) in law", "Yes", as.character(wvs1$Q271)) %>% as.factor()
summary(wvs1$Q271_r)
## No Yes NA's
## 67098 27858 2264
wvs1$Q273_r <- ifelse(wvs1$Q273 == "Married" | wvs1$Q273 == "Living together as married", "Partnered",
ifelse(wvs1$Q273 == "Single", "Single", "Previously partnered")) %>% as.factor()
summary(wvs1$Q273_r)
## Partnered Previously partnered Single
## 61456 11925 23250
## NA's
## 589
wvs1$Q50_r <- ifelse(wvs1$Q50=="Satisfied", 10,
ifelse(wvs1$Q50=="Dissatisfied", 1, as.character(wvs1$Q50))) %>% as.numeric()
summary(wvs1$Q50_r)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.000 5.000 6.000 6.209 8.000 10.000 625
table1(~ Q50_r + Q260 + Q262_n + Q270_n + Q273_r + Q288_n + Q56 + Q27 + Q271_r + GDPpercap1 + incomeWB + regionWB, data = wvs1)
wvs2 <- select(wvs1, c(C_COW_NUM, Q50_r, Q260, Q262_n, Q270_n, Q273_r, Q288_n, Q56, Q27, Q271_r, GDPpercap1, incomeWB, regionWB))
lookup <- c(cntry = "C_COW_NUM", finsat="Q50_r", sex="Q260", age="Q262_n", people_house="Q270_n", marit_stat="Q273_r", income="Q288_n", status_comp="Q56", parent_proud="Q27", with_parent="Q271_r", GDPpc="GDPpercap1", country_income = "incomeWB", region = "regionWB")
wvs2<-rename(wvs2, all_of(lookup))
summary(wvs2)
## cntry finsat sex
## Canada : 4018 Min. : 1.000 Male :45995
## Indonesia : 3200 1st Qu.: 5.000 Female:51130
## China : 3036 Median : 6.000 NA's : 95
## Great Britain : 2609 Mean : 6.209
## United States of America: 2596 3rd Qu.: 8.000
## Turkey : 2415 Max. :10.000
## (Other) :79346 NA's :625
## age people_house marit_stat income
## Min. : 16.00 Min. : 1.000 Partnered :61456 Min. : 1.00
## 1st Qu.: 29.00 1st Qu.: 2.000 Previously partnered:11925 1st Qu.: 4.00
## Median : 41.00 Median : 4.000 Single :23250 Median : 5.00
## Mean : 43.18 Mean : 4.025 NA's : 589 Mean : 4.91
## 3rd Qu.: 55.00 3rd Qu.: 5.000 3rd Qu.: 6.00
## Max. :103.00 Max. :63.000 Max. :10.00
## NA's :511 NA's :985 NA's :2961
## status_comp parent_proud with_parent
## Better off :52478 Agree strongly :47027 No :67098
## Worse off :16698 Agree :36660 Yes :27858
## Or about the same:26284 Disagree : 9949 NA's: 2264
## NA's : 1760 Strongly disagree: 1908
## NA's : 1676
##
##
## GDPpc country_income
## Min. : 2312 Low income : 2430
## 1st Qu.: 10317 Lower middle income:24457
## Median : 16785 Upper middle income:35201
## Mean : 27428 High income :34685
## 3rd Qu.: 43029 NA's : 447
## Max. :129103
## NA's :5363
## region
## East Asia and Pacific :26088
## Europe and Central Asia :25852
## Latin America and Caribbean :17439
## Middle East and North Africa: 9906
## North America : 6614
## (Other) :10874
## NA's : 447
wvs_pr <- subset(wvs2, cntry == 'Puerto Rico')
wvs_pr$cntry <- ifelse(wvs_pr$cntry=="Puerto Rico", "Puerto Rico", "0") %>% as.factor()
set.seed(123)
imp_pr = missForest(wvs_pr, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0004657031 0.2453948
## difference(s): 1.345912e-10 0.001996451
## time: 2.536 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0004549618 0.2326775
## difference(s): 4.266594e-12 0.0003327418
## time: 2.706 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.000458273 0.2303552
## difference(s): 2.446672e-12 0.0003327418
## time: 2.504 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0004613947 0.2372319
## difference(s): 1.91408e-12 0.0002218279
## time: 2.595 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.0004565319 0.2333803
## difference(s): 1.658793e-12 0.0003327418
## time: 2.938 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.000464642 0.2350178
## difference(s): 9.354495e-12 0.0002218279
## time: 2.907 seconds
##
## missForest iteration 7 in progress...done!
## estimated error(s): 0.0004588342 0.2304481
## difference(s): 2.869853e-12 0.0004436557
## time: 2.72 seconds
##
## missForest iteration 8 in progress...done!
## estimated error(s): 0.0004601552 0.2335738
## difference(s): 3.461541e-12 0.0003327418
## time: 2.68 seconds
##
## missForest iteration 9 in progress...done!
## estimated error(s): 0.0004604003 0.2298985
## difference(s): 2.970667e-12 0.0002218279
## time: 2.519 seconds
##
## missForest iteration 10 in progress...done!
## estimated error(s): 0.0004583402 0.2316926
## difference(s): 1.73601e-12 0.0002218279
## time: 2.911 seconds
imp_pr$OOBerror
## NRMSE PFC
## 0.0004583402 0.2316926344
pr <-imp_pr$ximp
summary(pr)
## cntry finsat sex age
## Puerto Rico:1127 Min. : 1.000 Male :443 Min. :18.0
## 1st Qu.: 5.000 Female:684 1st Qu.:34.0
## Median : 8.000 Median :51.0
## Mean : 7.161 Mean :49.8
## 3rd Qu.: 9.000 3rd Qu.:65.0
## Max. :10.000 Max. :94.0
##
## people_house marit_stat income
## Min. : 1.000 Partnered :582 Min. : 1.000
## 1st Qu.: 2.000 Previously partnered:237 1st Qu.: 4.000
## Median : 3.000 Single :308 Median : 5.000
## Mean : 2.982 Mean : 5.136
## 3rd Qu.: 4.000 3rd Qu.: 7.000
## Max. :12.000 Max. :10.000
##
## status_comp parent_proud with_parent GDPpc
## Better off :555 Agree strongly :808 No :909 Min. :35948
## Worse off :131 Agree :263 Yes:218 1st Qu.:35948
## Or about the same:441 Disagree : 43 Median :35948
## Strongly disagree: 13 Mean :35948
## 3rd Qu.:35948
## Max. :35948
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income: 0 South Asia : 0
## Upper middle income: 0 North America : 0
## High income :1127 Middle East and North Africa: 0
## Latin America and Caribbean :1127
## Europe and Central Asia : 0
## East Asia and Pacific : 0
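The same steps (subset one country, drop the unused `cntry` levels, set the seed, impute) are repeated for every country in this script. As a hedged sketch, they could be wrapped in one helper. The names `impute_country`, `impute_fun`, and `median_impute` are hypothetical and do not appear in the original analysis; `median_impute` is only a stand-in so that the example runs without extra packages.

```r
# Hypothetical helper: subset one country, tidy it, and impute.
# `impute_fun` is any function that takes a data frame and returns
# a completed data frame.
impute_country <- function(data, country, impute_fun, seed = 123) {
  keep <- !is.na(data$cntry) & data$cntry == country
  sub <- data[keep, , drop = FALSE]
  sub$cntry <- factor(sub$cntry)  # keeps only the level that is present

  # Columns that are entirely missing cannot be imputed; set them aside
  all_na <- vapply(sub, function(x) all(is.na(x)), logical(1))

  set.seed(seed)
  out <- impute_fun(sub[, !all_na, drop = FALSE])
  attr(out, "dropped") <- names(sub)[all_na]
  out
}

# Stand-in imputer (assumption): fills numeric gaps with the column median
median_impute <- function(d) {
  for (v in names(d)) {
    if (is.numeric(d[[v]])) {
      d[[v]][is.na(d[[v]])] <- median(d[[v]], na.rm = TRUE)
    }
  }
  d
}

# Toy data, not real WVS values
toy <- data.frame(
  cntry  = factor(c("Brazil", "Brazil", "Brazil", "Peru", "Peru")),
  finsat = c(5, NA, 7, 6, 8),
  GDPpc  = c(NA, NA, NA, 13380, 13380)
)

br_toy <- impute_country(toy, "Brazil", median_impute)
br_toy
attr(br_toy, "dropped")
```

With the real data, the imputer could be `function(d) missForest(d)$ximp`, for example `impute_country(wvs2, "Brazil", function(d) missForest(d)$ximp)`. Recording the dropped columns makes cases such as an entirely missing `GDPpc` visible in the result instead of only in the console log.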
wvs_br <- subset(wvs2, cntry == 'Brazil')
wvs_br$cntry <- ifelse(wvs_br$cntry=="Brazil", "Brazil", "0") %>% as.factor()
set.seed(123)
imp_br = missForest(wvs_br, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.001040227 0.1355264
## difference(s): 1.600914e-10 0.0026958
## time: 3.32 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.001039747 0.1351398
## difference(s): 1.351129e-11 0.001206016
## time: 3.179 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.001040098 0.1379352
## difference(s): 1.389505e-11 0.001206016
## time: 3.281 seconds
imp_br$OOBerror
## NRMSE PFC
## 0.001039747 0.135139759
br <-imp_br$ximp
summary(br)
## cntry finsat sex age people_house
## Brazil:1762 Min. : 1.000 Male :800 Min. :17.00 Min. : 1.000
## 1st Qu.: 5.000 Female:962 1st Qu.:29.00 1st Qu.: 2.000
## Median : 6.000 Median :43.00 Median : 3.000
## Mean : 6.075 Mean :43.56 Mean : 3.244
## 3rd Qu.: 8.000 3rd Qu.:57.00 3rd Qu.: 4.000
## Max. :10.000 Max. :91.00 Max. :29.000
##
## marit_stat income status_comp
## Partnered :896 Min. : 1.000 Better off :1215
## Previously partnered:339 1st Qu.: 2.000 Worse off : 252
## Single :527 Median : 4.000 Or about the same: 295
## Mean : 4.021
## 3rd Qu.: 5.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :809 No :1356 Min. :15259 Low income : 0
## Agree :809 Yes: 406 1st Qu.:15259 Lower middle income: 0
## Disagree :123 Median :15259 Upper middle income:1762
## Strongly disagree: 21 Mean :15259 High income : 0
## 3rd Qu.:15259
## Max. :15259
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1762
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_per <- subset(wvs2, cntry == 'Peru')
wvs_per$cntry <- ifelse(wvs_per$cntry=="Peru", "Peru", "0") %>% as.factor()
set.seed(123)
imp_per = missForest(wvs_per, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0002267714 0.1404239
## difference(s): 5.13615e-11 0.002946429
## time: 1.866 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0002258453 0.141958
## difference(s): 3.672163e-12 0.0005357143
## time: 1.852 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0002247073 0.1416838
## difference(s): 2.057437e-12 0.000625
## time: 1.778 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.00022642 0.1440654
## difference(s): 1.908553e-12 0.0004464286
## time: 1.878 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.0002257124 0.138975
## difference(s): 2.578582e-12 0.0003571429
## time: 1.786 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.0002256639 0.1399696
## difference(s): 2.476688e-12 0.0003571429
## time: 1.953 seconds
##
## missForest iteration 7 in progress...done!
## estimated error(s): 0.00022596 0.1409665
## difference(s): 1.960845e-12 0.0004464286
## time: 1.905 seconds
##
## missForest iteration 8 in progress...done!
## estimated error(s): 0.0002262008 0.1389732
## difference(s): 3.645834e-12 0.0004464286
## time: 1.929 seconds
imp_per$OOBerror
## NRMSE PFC
## 0.00022596 0.14096654
per <-imp_per$ximp
summary(per)
## cntry finsat sex age people_house
## Peru:1400 Min. : 1.00 Male :702 Min. :18.00 Min. : 1.000
## 1st Qu.: 5.00 Female:698 1st Qu.:27.00 1st Qu.: 4.000
## Median : 6.00 Median :38.00 Median : 5.000
## Mean : 6.35 Mean :40.16 Mean : 4.781
## 3rd Qu.: 8.00 3rd Qu.:51.00 3rd Qu.: 6.000
## Max. :10.00 Max. :86.00 Max. :15.000
##
## marit_stat income status_comp
## Partnered :898 Min. : 1.000 Better off :824
## Previously partnered:131 1st Qu.: 4.000 Worse off : 92
## Single :371 Median : 5.000 Or about the same:484
## Mean : 4.951
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :688 No :892 Min. :13380 Low income : 0
## Agree :673 Yes:508 1st Qu.:13380 Lower middle income: 0
## Disagree : 38 Median :13380 Upper middle income:1400
## Strongly disagree: 1 Mean :13380 High income : 0
## 3rd Qu.:13380
## Max. :13380
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1400
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_ch <- subset(wvs2, cntry == 'Chile')
wvs_ch$cntry <- ifelse(wvs_ch$cntry == "Chile", "Chile", "0") %>% as.factor()
set.seed(123)
imp_ch = missForest(wvs_ch, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0001084571 0.1666865
## difference(s): 2.307897e-11 0.014125
## time: 1.193 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0001071028 0.16375
## difference(s): 5.751611e-12 0.00325
## time: 1.185 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0001073493 0.1685869
## difference(s): 3.734321e-12 0.002375
## time: 1.43 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0001076745 0.1606491
## difference(s): 2.041616e-12 0.002
## time: 1.14 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.0001074148 0.1611458
## difference(s): 1.89915e-12 0.002125
## time: 1.127 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.0001073218 0.1632764
## difference(s): 2.451009e-12 0.002
## time: 1.217 seconds
##
## missForest iteration 7 in progress...done!
## estimated error(s): 0.0001072016 0.1659902
## difference(s): 1.757565e-12 0.002
## time: 1.188 seconds
##
## missForest iteration 8 in progress...done!
## estimated error(s): 0.0001071255 0.1596667
## difference(s): 1.773204e-12 0.00125
## time: 1.148 seconds
##
## missForest iteration 9 in progress...done!
## estimated error(s): 0.0001069342 0.1657922
## difference(s): 2.552364e-12 0.0015
## time: 1.125 seconds
imp_ch$OOBerror
## NRMSE PFC
## 0.0001071255 0.1596666931
ch <-imp_ch$ximp
summary(ch)
## cntry finsat sex age people_house
## Chile:1000 Min. : 1.000 Male :474 Min. :18.00 Min. :1.000
## 1st Qu.: 5.000 Female:526 1st Qu.:33.00 1st Qu.:2.000
## Median : 6.000 Median :44.00 Median :3.000
## Mean : 6.148 Mean :45.27 Mean :3.266
## 3rd Qu.: 8.000 3rd Qu.:56.25 3rd Qu.:4.000
## Max. :10.000 Max. :91.00 Max. :9.000
##
## marit_stat income status_comp
## Partnered :624 Min. : 1.000 Better off :600
## Previously partnered:208 1st Qu.: 4.000 Worse off :136
## Single :168 Median : 5.000 Or about the same:264
## Mean : 4.673
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :389 No :781 Min. :25155 Low income : 0
## Agree :415 Yes:219 1st Qu.:25155 Lower middle income: 0
## Disagree :138 Median :25155 Upper middle income: 0
## Strongly disagree: 58 Mean :25155 High income :1000
## 3rd Qu.:25155
## Max. :25155
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1000
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_arg <- subset(wvs2, cntry == 'Argentina')
wvs_arg$cntry <- ifelse(wvs_arg$cntry == 'Argentina', "Argentina", "0") %>% as.factor()
set.seed(123)
imp_arg = missForest(wvs_arg, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0001534338 0.1572026
## difference(s): 3.488262e-11 0.002866401
## time: 1.691 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0001519348 0.1554638
## difference(s): 2.876709e-12 0.0007477567
## time: 1.505 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0001525242 0.1559435
## difference(s): 4.052913e-12 0.001495513
## time: 1.674 seconds
imp_arg$OOBerror
## NRMSE PFC
## 0.0001519348 0.1554637638
arg <-imp_arg$ximp
summary(arg)
## cntry finsat sex age
## Argentina:1003 Min. : 1.000 Male :485 Min. :18.00
## 1st Qu.: 5.000 Female:518 1st Qu.:27.00
## Median : 6.000 Median :40.00
## Mean : 5.987 Mean :42.55
## 3rd Qu.: 8.000 3rd Qu.:57.00
## Max. :10.000 Max. :93.00
##
## people_house marit_stat income
## Min. : 1.000 Partnered :502 Min. : 1.000
## 1st Qu.: 2.000 Previously partnered:217 1st Qu.: 4.000
## Median : 4.000 Single :284 Median : 5.000
## Mean : 3.626 Mean : 5.085
## 3rd Qu.: 5.000 3rd Qu.: 6.000
## Max. :13.000 Max. :10.000
##
## status_comp parent_proud with_parent GDPpc
## Better off :243 Agree strongly :324 No :758 Min. :22947
## Worse off :218 Agree :509 Yes:245 1st Qu.:22947
## Or about the same:542 Disagree :137 Median :22947
## Strongly disagree: 33 Mean :22947
## 3rd Qu.:22947
## Max. :22947
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income: 0 South Asia : 0
## Upper middle income:1003 North America : 0
## High income : 0 Middle East and North Africa: 0
## Latin America and Caribbean :1003
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_urg <- subset(wvs2, cntry == 'Uruguay')
wvs_urg$cntry <- ifelse(wvs_urg$cntry == 'Uruguay', "Uruguay", "0") %>% as.factor()
set.seed(123)
imp_urg = missForest(wvs_urg, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0001480635 0.1818762
## difference(s): 6.577302e-11 0.012375
## time: 1.195 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0001457735 0.1799962
## difference(s): 5.001718e-12 0.004625
## time: 1.196 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0001464527 0.1835796
## difference(s): 5.157268e-12 0.005375
## time: 1.106 seconds
imp_urg$OOBerror
## NRMSE PFC
## 0.0001457735 0.1799962246
urg <-imp_urg$ximp
summary(urg)
## cntry finsat sex age people_house
## Uruguay:1000 Min. : 1.000 Male :316 Min. :18.00 Min. : 4.00
## 1st Qu.: 5.000 Female:684 1st Qu.:34.00 1st Qu.: 8.00
## Median : 7.000 Median :50.00 Median :17.00
## Mean : 6.901 Mean :49.83 Mean :14.96
## 3rd Qu.: 8.071 3rd Qu.:65.00 3rd Qu.:23.00
## Max. :10.000 Max. :90.00 Max. :23.00
##
## marit_stat income status_comp
## Partnered :477 Min. : 1.000 Better off :422
## Previously partnered:291 1st Qu.: 4.000 Worse off :125
## Single :232 Median : 5.000 Or about the same:453
## Mean : 5.006
## 3rd Qu.: 6.038
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :365 No :887 Min. :22455 Low income : 0
## Agree :493 Yes:113 1st Qu.:22455 Lower middle income: 0
## Disagree :115 Median :22455 Upper middle income: 0
## Strongly disagree: 27 Mean :22455 High income :1000
## 3rd Qu.:22455
## Max. :22455
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1000
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_mx <- subset(wvs2, cntry == 'Mexico')
wvs_mx$cntry <- ifelse(wvs_mx$cntry == 'Mexico', "Mexico", "0") %>% as.factor()
set.seed(123)
imp_mx = missForest(wvs_mx, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0007860064 0.1993169
## difference(s): 1.949538e-11 0.0002871913
## time: 4.559 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0007849077 0.1995288
## difference(s): 6.794877e-12 7.179782e-05
## time: 4.677 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0007906234 0.1989591
## difference(s): 2.270198e-12 7.179782e-05
## time: 4.57 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.000791721 0.1973008
## difference(s): 3.923289e-12 7.179782e-05
## time: 4.506 seconds
imp_mx$OOBerror
## NRMSE PFC
## 0.0007906234 0.1989590613
mx <-imp_mx$ximp
summary(mx)
## cntry finsat sex age people_house
## Mexico:1741 Min. : 1.000 Male :874 Min. :18.00 Min. : 1.000
## 1st Qu.: 5.000 Female:867 1st Qu.:29.00 1st Qu.: 3.000
## Median : 7.000 Median :42.00 Median : 4.000
## Mean : 6.811 Mean :43.36 Mean : 4.474
## 3rd Qu.: 9.000 3rd Qu.:56.00 3rd Qu.: 5.000
## Max. :10.000 Max. :90.00 Max. :20.000
##
## marit_stat income status_comp
## Partnered :1191 Min. : 1.000 Better off :1003
## Previously partnered: 252 1st Qu.: 2.000 Worse off : 253
## Single : 298 Median : 4.000 Or about the same: 485
## Mean : 4.228
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc
## Agree strongly :1038 No :1277 Min. :20411
## Agree : 587 Yes: 464 1st Qu.:20411
## Disagree : 83 Median :20411
## Strongly disagree: 33 Mean :20411
## 3rd Qu.:20411
## Max. :20411
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income: 0 South Asia : 0
## Upper middle income:1741 North America : 0
## High income : 0 Middle East and North Africa: 0
## Latin America and Caribbean :1741
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_ven <- subset(wvs2, cntry == 'Venezuela')
wvs_ven$cntry <- ifelse(wvs_ven$cntry == 'Venezuela', "Venezuela", "0") %>% as.factor()
set.seed(123)
imp_ven = missForest(wvs_ven, verbose = TRUE)
## removed variable(s) 11 due to the missingness of all entries
## missForest iteration 1 in progress...done!
## estimated error(s): 0 0
## difference(s): 0 0
## time: 0.006 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0 0
## difference(s): 0 0
## time: 0.006 seconds
imp_ven$OOBerror
## NRMSE PFC
## 0 0
ven <-imp_ven$ximp
summary(ven)
## cntry finsat sex age
## Venezuela:1190 Min. : 1.000 Male :571 Min. :18.00
## 1st Qu.: 3.000 Female:619 1st Qu.:25.00
## Median : 5.000 Median :35.00
## Mean : 4.887 Mean :38.31
## 3rd Qu.: 6.000 3rd Qu.:49.00
## Max. :10.000 Max. :85.00
##
## people_house marit_stat income
## Min. : 1.000 Partnered :631 Min. : 1.000
## 1st Qu.: 3.000 Previously partnered:162 1st Qu.: 3.000
## Median : 4.000 Single :397 Median : 5.000
## Mean : 4.546 Mean : 4.479
## 3rd Qu.: 6.000 3rd Qu.: 6.000
## Max. :15.000 Max. :10.000
##
## status_comp parent_proud with_parent
## Better off :147 Agree strongly :666 No :731
## Worse off :511 Agree :438 Yes:459
## Or about the same:532 Disagree : 72
## Strongly disagree: 14
##
##
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income: 0 South Asia : 0
## Upper middle income:1190 North America : 0
## High income : 0 Middle East and North Africa: 0
## Latin America and Caribbean :1190
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_gut <- subset(wvs2, cntry == 'Guatemala')
wvs_gut$cntry <- ifelse(wvs_gut$cntry == 'Guatemala', "Guatemala", "0") %>% as.factor()
set.seed(123)
imp_gut = missForest(wvs_gut, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0003325686 0.1139921
## difference(s): 2.300062e-10 0.001423922
## time: 1.41 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.000333925 0.1171769
## difference(s): 1.646983e-11 0.0003051261
## time: 1.455 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.000333946 0.1162608
## difference(s): 2.038269e-11 0
## time: 1.425 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0003339189 0.1204789
## difference(s): 1.777699e-11 0.0001017087
## time: 1.514 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.000333264 0.115533
## difference(s): 2.178245e-11 0.0003051261
## time: 1.415 seconds
imp_gut$OOBerror
## NRMSE PFC
## 0.0003339189 0.1204788827
gut <-imp_gut$ximp
summary(gut)
## cntry finsat sex age people_house
## Guatemala:1229 Min. : 1.000 Male :578 Min. :18.0 Min. : 1.000
## 1st Qu.: 5.000 Female:651 1st Qu.:21.0 1st Qu.: 3.000
## Median : 7.000 Median :30.0 Median : 4.060
## Mean : 6.965 Mean :33.5 Mean : 4.595
## 3rd Qu.: 9.000 3rd Qu.:42.0 3rd Qu.: 5.000
## Max. :10.000 Max. :89.0 Max. :10.000
##
## marit_stat income status_comp
## Partnered :485 Min. : 1.000 Better off :760
## Previously partnered:134 1st Qu.: 5.000 Worse off :132
## Single :610 Median : 6.000 Or about the same:337
## Mean : 5.965
## 3rd Qu.: 7.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :791 No :567 Min. :8996 Low income : 0
## Agree :332 Yes:662 1st Qu.:8996 Lower middle income: 0
## Disagree : 92 Median :8996 Upper middle income:1229
## Strongly disagree: 14 Mean :8996 High income : 0
## 3rd Qu.:8996
## Max. :8996
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1229
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_nic <- subset(wvs2, cntry == 'Nicaragua')
wvs_nic$cntry <- ifelse(wvs_nic$cntry == 'Nicaragua', "Nicaragua", "0") %>% as.factor()
set.seed(123)
imp_nic = missForest(wvs_nic, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0004795084 0
## difference(s): 6.318253e-11 0
## time: 0.491 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0004777818 0
## difference(s): 8.324922e-12 0
## time: 0.474 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.000482197 0
## difference(s): 1.619466e-12 0
## time: 0.487 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0004794779 0
## difference(s): 1.382154e-12 0
## time: 0.497 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.0004802284 0
## difference(s): 1.318186e-12 0
## time: 0.559 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.0004773569 0
## difference(s): 1.076047e-12 0
## time: 0.482 seconds
##
## missForest iteration 7 in progress...done!
## estimated error(s): 0.0004805845 0
## difference(s): 6.838752e-13 0
## time: 0.513 seconds
##
## missForest iteration 8 in progress...done!
## estimated error(s): 0.0004827909 0
## difference(s): 1.967888e-12 0
## time: 0.488 seconds
imp_nic$OOBerror
## NRMSE PFC
## 0.0004805845 0.0000000000
nic <-imp_nic$ximp
summary(nic)
## cntry finsat sex age
## Nicaragua:1200 Min. : 1.000 Male :589 Min. :16.00
## 1st Qu.: 5.000 Female:611 1st Qu.:23.75
## Median : 7.000 Median :33.00
## Mean : 6.716 Mean :35.13
## 3rd Qu.: 9.000 3rd Qu.:44.00
## Max. :10.000 Max. :81.00
##
## people_house marit_stat income
## Min. : 1.000 Partnered :688 Min. : 1.000
## 1st Qu.: 4.000 Previously partnered:122 1st Qu.: 3.000
## Median : 5.000 Single :390 Median : 5.000
## Mean : 5.334 Mean : 4.579
## 3rd Qu.: 6.000 3rd Qu.: 6.000
## Max. :17.000 Max. :10.000
##
## status_comp parent_proud with_parent GDPpc
## Better off :517 Agree strongly :522 No :675 Min. :5631
## Worse off :151 Agree :658 Yes:525 1st Qu.:5631
## Or about the same:532 Disagree : 16 Median :5631
## Strongly disagree: 4 Mean :5631
## 3rd Qu.:5631
## Max. :5631
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income:1200 South Asia : 0
## Upper middle income: 0 North America : 0
## High income : 0 Middle East and North Africa: 0
## Latin America and Caribbean :1200
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_col <- subset(wvs2, cntry == 'Colombia')
wvs_col$cntry <- ifelse(wvs_col$cntry == 'Colombia', "Colombia", "0") %>% as.factor()
set.seed(123)
imp_col = missForest(wvs_col, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0 0
## difference(s): 0 0
## time: 0.005 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0 0
## difference(s): 0 0
## time: 0.005 seconds
imp_col$OOBerror
## NRMSE PFC
## 0 0
col <-imp_col$ximp
summary(col)
## cntry finsat sex age people_house
## Colombia:1520 Min. : 1.000 Male :760 Min. :18.00 Min. : 1.000
## 1st Qu.: 5.000 Female:760 1st Qu.:25.00 1st Qu.: 3.000
## Median : 7.000 Median :36.00 Median : 4.000
## Mean : 6.615 Mean :38.85 Mean : 4.131
## 3rd Qu.: 9.000 3rd Qu.:52.00 3rd Qu.: 5.000
## Max. :10.000 Max. :89.00 Max. :18.000
##
## marit_stat income status_comp
## Partnered :790 Min. : 1.000 Better off :566
## Previously partnered:207 1st Qu.: 2.000 Worse off :177
## Single :523 Median : 5.000 Or about the same:777
## Mean : 4.443
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :732 No :988 Min. :15644 Low income : 0
## Agree :736 Yes:532 1st Qu.:15644 Lower middle income: 0
## Disagree : 46 Median :15644 Upper middle income:1520
## Strongly disagree: 6 Mean :15644 High income : 0
## 3rd Qu.:15644
## Max. :15644
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1520
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_ecu <- subset(wvs2, cntry == 'Ecuador')
wvs_ecu$cntry <- ifelse(wvs_ecu$cntry == 'Ecuador', "Ecuador", "0") %>% as.factor()
set.seed(123)
imp_ecu = missForest(wvs_ecu, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0003122292 0.1253165
## difference(s): 3.382572e-11 0.002604167
## time: 1.452 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0003095908 0.1276035
## difference(s): 2.15582e-12 0.0008333333
## time: 1.424 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0003109061 0.1302942
## difference(s): 3.210066e-12 0.001145833
## time: 1.334 seconds
imp_ecu$OOBerror
## NRMSE PFC
## 0.0003095908 0.1276034866
ecu <-imp_ecu$ximp
summary(ecu)
## cntry finsat sex age people_house
## Ecuador:1200 Min. : 1.00 Male :573 Min. :18.00 Min. : 1.000
## 1st Qu.: 5.00 Female:627 1st Qu.:26.00 1st Qu.: 3.000
## Median : 7.00 Median :37.00 Median : 4.000
## Mean : 6.34 Mean :39.49 Mean : 4.487
## 3rd Qu.: 8.00 3rd Qu.:52.00 3rd Qu.: 6.000
## Max. :10.00 Max. :80.00 Max. :20.000
##
## marit_stat income status_comp
## Partnered :663 Min. : 1.000 Better off :303
## Previously partnered:172 1st Qu.: 3.000 Worse off :118
## Single :365 Median : 5.000 Or about the same:779
## Mean : 4.758
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :857 No :779 Min. :11847 Low income : 0
## Agree :309 Yes:421 1st Qu.:11847 Lower middle income: 0
## Disagree : 26 Median :11847 Upper middle income:1200
## Strongly disagree: 8 Mean :11847 High income : 0
## 3rd Qu.:11847
## Max. :11847
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :1200
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_bol <- subset(wvs2, cntry == 'Bolivia')
wvs_bol$cntry <- ifelse(wvs_bol$cntry == 'Bolivia', "Bolivia", "0") %>% as.factor()
set.seed(123)
imp_bol = missForest(wvs_bol, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0004648883 0.2038974
## difference(s): 2.898543e-10 0.002539913
## time: 4.941 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.000464985 0.1953894
## difference(s): 1.370608e-11 0.0006652153
## time: 4.796 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0004643176 0.1992363
## difference(s): 9.10762e-12 0.0006047412
## time: 5.626 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0004637441 0.2035822
## difference(s): 1.514099e-11 0.0007256894
## time: 4.895 seconds
imp_bol$OOBerror
## NRMSE PFC
## 0.0004643176 0.1992362975
bol <-imp_bol$ximp
summary(bol)
## cntry finsat sex age people_house
## Bolivia:2067 Min. : 1.000 Male :1024 Min. :18.00 Min. : 1.0
## 1st Qu.: 5.000 Female:1043 1st Qu.:25.00 1st Qu.: 4.0
## Median : 7.000 Median :35.00 Median : 5.0
## Mean : 6.453 Mean :38.33 Mean : 5.2
## 3rd Qu.: 8.000 3rd Qu.:50.00 3rd Qu.: 6.0
## Max. :10.000 Max. :85.00 Max. :30.0
##
## marit_stat income status_comp
## Partnered :1185 Min. : 1.000 Better off : 805
## Previously partnered: 253 1st Qu.: 4.000 Worse off : 186
## Single : 629 Median : 5.000 Or about the same:1076
## Mean : 5.003
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :1154 No :1252 Min. :9086 Low income : 0
## Agree : 825 Yes: 815 1st Qu.:9086 Lower middle income:2067
## Disagree : 74 Median :9086 Upper middle income: 0
## Strongly disagree: 14 Mean :9086 High income : 0
## 3rd Qu.:9086
## Max. :9086
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean :2067
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_mac <- subset(wvs2, cntry == 'Macau SAR')
wvs_mac$cntry <- ifelse(wvs_mac$cntry == 'Macau SAR', "Macau SAR", "0") %>% as.factor()
set.seed(123)
imp_mac = missForest(wvs_mac, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 9.535232e-05 0.1701235
## difference(s): 1.410182e-09 0
## time: 1.838 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 9.503666e-05 0.1641233
## difference(s): 2.261257e-11 0
## time: 1.741 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 9.439277e-05 0.1635139
## difference(s): 1.837271e-11 0
## time: 1.797 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 9.465273e-05 0.1600876
## difference(s): 1.886795e-11 0
## time: 1.732 seconds
imp_mac$OOBerror
## NRMSE PFC
## 9.439277e-05 1.635139e-01
mac <-imp_mac$ximp
summary(mac)
## cntry finsat sex age
## Macau SAR:1023 Min. : 1.000 Male :450 Min. :18.00
## 1st Qu.: 5.000 Female:573 1st Qu.:26.00
## Median : 6.000 Median :39.00
## Mean : 6.202 Mean :40.82
## 3rd Qu.: 7.000 3rd Qu.:53.00
## Max. :10.000 Max. :82.00
##
## people_house marit_stat income
## Min. : 1.000 Partnered :570 Min. : 1.000
## 1st Qu.: 3.000 Previously partnered: 50 1st Qu.: 4.000
## Median : 3.912 Single :403 Median : 5.000
## Mean : 3.685 Mean : 5.023
## 3rd Qu.: 4.000 3rd Qu.: 6.000
## Max. :20.000 Max. :10.000
##
## status_comp parent_proud with_parent GDPpc
## Better off :839 Agree strongly :114 No :625 Min. :129103
## Worse off : 60 Agree :541 Yes:398 1st Qu.:129103
## Or about the same:124 Disagree :333 Median :129103
## Strongly disagree: 35 Mean :129103
## 3rd Qu.:129103
## Max. :129103
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income: 0 South Asia : 0
## Upper middle income: 0 North America : 0
## High income :1023 Middle East and North Africa: 0
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific :1023
wvs_nig <- subset(wvs2, cntry == 'Nigeria')
wvs_nig$cntry <- ifelse(wvs_nig$cntry == 'Nigeria', "Nigeria", "0") %>% as.factor()
set.seed(123)
imp_nig = missForest(wvs_nig, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0006894611 0.1521317
## difference(s): 9.630554e-11 0.002425222
## time: 1.624 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0006879526 0.1529117
## difference(s): 1.463198e-11 0.0006063056
## time: 1.579 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0006886714 0.1538614
## difference(s): 1.030557e-11 0.0007073565
## time: 1.61 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0006897513 0.1528423
## difference(s): 8.676889e-12 0.0007073565
## time: 1.603 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.0006896226 0.1541757
## difference(s): 8.288235e-12 0.0005052546
## time: 1.572 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.0006920485 0.1536433
## difference(s): 1.285152e-11 0.0007073565
## time: 1.625 seconds
imp_nig$OOBerror
## NRMSE PFC
## 0.0006896226 0.1541757420
nig <-imp_nig$ximp
summary(nig)
## cntry finsat sex age people_house
## Nigeria:1237 Min. : 1.000 Male :633 Min. : 18.00 Min. : 1.000
## 1st Qu.: 3.000 Female:604 1st Qu.: 24.00 1st Qu.: 4.000
## Median : 5.000 Median : 30.00 Median : 5.000
## Mean : 4.802 Mean : 32.56 Mean : 6.284
## 3rd Qu.: 7.000 3rd Qu.: 38.00 3rd Qu.: 8.000
## Max. :10.000 Max. :100.00 Max. :63.000
##
## marit_stat income status_comp
## Partnered :695 Min. : 1.000 Better off :631
## Previously partnered: 50 1st Qu.: 3.000 Worse off :500
## Single :492 Median : 4.000 Or about the same:106
## Mean : 4.411
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :969 No :796 Min. :5348 Low income : 0
## Agree :248 Yes:441 1st Qu.:5348 Lower middle income:1237
## Disagree : 17 Median :5348 Upper middle income: 0
## Strongly disagree: 3 Mean :5348 High income : 0
## 3rd Qu.:5348
## Max. :5348
##
## region
## Sub-Saharan Africa :1237
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_ken <- subset(wvs2, cntry == 'Kenya')
wvs_ken$cntry <- ifelse(wvs_ken$cntry == 'Kenya', "Kenya", "0") %>% as.factor()
set.seed(123)
imp_ken = missForest(wvs_ken, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.002251408 0.232495
## difference(s): 2.789829e-09 0.002665877
## time: 2.282 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.00222875 0.234547
## difference(s): 1.268227e-10 0.0007898894
## time: 2.317 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.002247218 0.2355213
## difference(s): 3.853873e-10 0.0008886256
## time: 2.267 seconds
imp_ken$OOBerror
## NRMSE PFC
## 0.00222875 0.23454696
ken <-imp_ken$ximp
summary(ken)
## cntry finsat sex age people_house
## Kenya:1266 Min. : 1.000 Male :643 Min. :18.00 Min. : 1.000
## 1st Qu.: 3.000 Female:623 1st Qu.:24.00 1st Qu.: 2.000
## Median : 5.000 Median :28.00 Median : 3.000
## Mean : 4.878 Mean :30.73 Mean : 3.833
## 3rd Qu.: 7.000 3rd Qu.:36.00 3rd Qu.: 5.000
## Max. :10.000 Max. :84.00 Max. :30.000
##
## marit_stat income status_comp
## Partnered :611 Min. : 1.000 Better off :763
## Previously partnered:101 1st Qu.: 3.000 Worse off :310
## Single :554 Median : 5.000 Or about the same:193
## Mean : 4.602
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :848 No :922 Min. :4509 Low income : 0
## Agree :353 Yes:344 1st Qu.:4509 Lower middle income:1266
## Disagree : 55 Median :4509 Upper middle income: 0
## Strongly disagree: 10 Mean :4509 High income : 0
## 3rd Qu.:4509
## Max. :4509
##
## region
## Sub-Saharan Africa :1266
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_eth <- subset(wvs2, cntry == 'Ethiopia')
wvs_eth$cntry <- ifelse(wvs_eth$cntry == 'Ethiopia', "Ethiopia", "0") %>% as.factor()
set.seed(123)
imp_eth = missForest(wvs_eth, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.001684979 0.1326647
## difference(s): 7.542113e-10 0.00101626
## time: 1.625 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.001677411 0.1321494
## difference(s): 1.831599e-10 0.000101626
## time: 1.719 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.001677687 0.1322462
## difference(s): 8.722801e-11 0.0004065041
## time: 2.042 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.001684294 0.1408357
## difference(s): 6.047123e-11 0.0004065041
## time: 1.781 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.00168345 0.1314646
## difference(s): 2.831966e-11 0.0004065041
## time: 1.441 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.001685903 0.1343298
## difference(s): 2.020996e-10 0.000203252
## time: 1.712 seconds
##
## missForest iteration 7 in progress...done!
## estimated error(s): 0.001689213 0.1293018
## difference(s): 7.701341e-11 0.000203252
## time: 1.861 seconds
##
## missForest iteration 8 in progress...done!
## estimated error(s): 0.001684337 0.1343261
## difference(s): 1.470513e-11 0.000203252
## time: 1.607 seconds
##
## missForest iteration 9 in progress...done!
## estimated error(s): 0.001684053 0.1305308
## difference(s): 1.392172e-11 0
## time: 1.618 seconds
##
## missForest iteration 10 in progress...done!
## estimated error(s): 0.001680672 0.129068
## difference(s): 6.079987e-11 0.000203252
## time: 1.57 seconds
imp_eth$OOBerror
## NRMSE PFC
## 0.001680672 0.129067969
eth <-imp_eth$ximp
summary(eth)
## cntry finsat sex age people_house
## Ethiopia:1230 Min. : 1.000 Male :622 Min. :18.00 Min. : 1.000
## 1st Qu.: 3.000 Female:608 1st Qu.:23.00 1st Qu.: 3.000
## Median : 5.000 Median :29.00 Median : 5.000
## Mean : 5.216 Mean :31.93 Mean : 5.055
## 3rd Qu.: 7.000 3rd Qu.:38.00 3rd Qu.: 6.000
## Max. :10.000 Max. :79.00 Max. :18.000
##
## marit_stat income status_comp
## Partnered :821 Min. : 1.00 Better off :779
## Previously partnered: 89 1st Qu.: 3.00 Worse off :257
## Single :320 Median : 5.00 Or about the same:194
## Mean : 4.38
## 3rd Qu.: 6.00
## Max. :10.00
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :996 No :898 Min. :2312 Low income :1230
## Agree :214 Yes:332 1st Qu.:2312 Lower middle income: 0
## Disagree : 15 Median :2312 Upper middle income: 0
## Strongly disagree: 5 Mean :2312 High income : 0
## 3rd Qu.:2312
## Max. :2312
##
## region
## Sub-Saharan Africa :1230
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_zim <- subset(wvs2, cntry == 'Zimbabwe')
wvs_zim$cntry <- ifelse(wvs_zim$cntry == 'Zimbabwe', "Zimbabwe", "0") %>% as.factor()
set.seed(123)
imp_zim = missForest(wvs_zim, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.005208871 0.130114
## difference(s): 3.188649e-08 0.0005144033
## time: 2.092 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.005216296 0.1315656
## difference(s): 5.880808e-10 0
## time: 2.011 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.00519394 0.1356921
## difference(s): 1.198285e-09 0
## time: 2.007 seconds
imp_zim$OOBerror
## NRMSE PFC
## 0.005216296 0.131565580
zim <-imp_zim$ximp
summary(zim)
## cntry finsat sex age people_house
## Zimbabwe:1215 Min. : 1.000 Male :600 Min. :18.00 Min. : 1.000
## 1st Qu.: 1.000 Female:615 1st Qu.:25.00 1st Qu.: 3.500
## Median : 3.000 Median :36.00 Median : 5.000
## Mean : 3.644 Mean :39.16 Mean : 5.185
## 3rd Qu.: 5.000 3rd Qu.:50.00 3rd Qu.: 6.000
## Max. :10.000 Max. :99.00 Max. :25.000
##
## marit_stat income status_comp
## Partnered :760 Min. : 1.000 Better off :372
## Previously partnered:188 1st Qu.: 1.000 Worse off :663
## Single :267 Median : 3.000 Or about the same:180
## Mean : 3.463
## 3rd Qu.: 5.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :937 No :920 Min. :2953 Low income : 0
## Agree :246 Yes:295 1st Qu.:2953 Lower middle income:1215
## Disagree : 21 Median :2953 Upper middle income: 0
## Strongly disagree: 11 Mean :2953 High income : 0
## 3rd Qu.:2953
## Max. :2953
##
## region
## Sub-Saharan Africa :1215
## South Asia : 0
## North America : 0
## Middle East and North Africa: 0
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_lib <- subset(wvs2, cntry == 'Libya')
wvs_lib$cntry <- ifelse(wvs_lib$cntry == 'Libya', "Libya", "0") %>% as.factor()
set.seed(123)
imp_lib = missForest(wvs_lib, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.0007405244 0.1456742
## difference(s): 2.683191e-09 0.004807692
## time: 1.807 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.0007291783 0.1412055
## difference(s): 6.724643e-11 0.0009406355
## time: 1.911 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.0007285703 0.1441481
## difference(s): 3.895165e-11 0.0007316054
## time: 1.904 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.0007308028 0.1421588
## difference(s): 6.901109e-11 0.001149666
## time: 1.737 seconds
imp_lib$OOBerror
## NRMSE PFC
## 0.0007285703 0.1441480950
lib <-imp_lib$ximp
summary(lib)
## cntry finsat sex age people_house
## Libya:1196 Min. : 1.000 Male :619 Min. :18.00 Min. : 1.000
## 1st Qu.: 5.000 Female:577 1st Qu.:30.00 1st Qu.: 4.000
## Median : 6.000 Median :41.00 Median : 6.000
## Mean : 6.499 Mean :40.22 Mean : 5.983
## 3rd Qu.: 8.000 3rd Qu.:49.00 3rd Qu.: 7.000
## Max. :10.000 Max. :85.00 Max. :41.000
##
## marit_stat income status_comp
## Partnered :749 Min. : 1.000 Better off :606
## Previously partnered: 60 1st Qu.: 4.000 Worse off :203
## Single :387 Median : 5.000 Or about the same:387
## Mean : 5.424
## 3rd Qu.: 6.127
## Max. :10.000
##
## parent_proud with_parent GDPpc
## Agree strongly :1110 No :544 Min. :15803
## Agree : 81 Yes:652 1st Qu.:15803
## Disagree : 2 Median :15803
## Strongly disagree: 3 Mean :15803
## 3rd Qu.:15803
## Max. :15803
##
## country_income region
## Low income : 0 Sub-Saharan Africa : 0
## Lower middle income: 0 South Asia : 0
## Upper middle income:1196 North America : 0
## High income : 0 Middle East and North Africa:1196
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_tun <- subset(wvs2, cntry == 'Tunisia')
wvs_tun$cntry <- ifelse(wvs_tun$cntry == 'Tunisia', "Tunisia", "0") %>% as.factor()
set.seed(123)
imp_tun = missForest(wvs_tun, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0.001273197 0.2130999
## difference(s): 2.594939e-10 0.003621689
## time: 2.707 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0.001235917 0.2126609
## difference(s): 8.175183e-10 0.0004139073
## time: 2.628 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0.001245533 0.2131933
## difference(s): 2.68914e-10 0.0003104305
## time: 2.648 seconds
##
## missForest iteration 4 in progress...done!
## estimated error(s): 0.001243975 0.2116334
## difference(s): 1.362753e-10 0.0001034768
## time: 2.652 seconds
##
## missForest iteration 5 in progress...done!
## estimated error(s): 0.001242606 0.2131704
## difference(s): 4.686635e-12 0.0003104305
## time: 2.699 seconds
##
## missForest iteration 6 in progress...done!
## estimated error(s): 0.00124193 0.2127527
## difference(s): 8.487324e-12 0.0004139073
## time: 2.675 seconds
imp_tun$OOBerror
## NRMSE PFC
## 0.001242606 0.213170370
tun <-imp_tun$ximp
summary(tun)
## cntry finsat sex age people_house
## Tunisia:1208 Min. : 1.000 Male :558 Min. :18.00 Min. : 1.000
## 1st Qu.: 3.000 Female:650 1st Qu.:30.00 1st Qu.: 2.000
## Median : 5.000 Median :42.00 Median : 4.000
## Mean : 4.591 Mean :43.19 Mean : 3.658
## 3rd Qu.: 6.000 3rd Qu.:54.00 3rd Qu.: 5.000
## Max. :10.000 Max. :94.00 Max. :10.000
##
## marit_stat income status_comp
## Partnered :728 Min. : 1.000 Better off :652
## Previously partnered:140 1st Qu.: 3.000 Worse off :375
## Single :340 Median : 5.000 Or about the same:181
## Mean : 4.707
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :936 No :776 Min. :11201 Low income : 0
## Agree :248 Yes:432 1st Qu.:11201 Lower middle income:1208
## Disagree : 21 Median :11201 Upper middle income: 0
## Strongly disagree: 3 Mean :11201 High income : 0
## 3rd Qu.:11201
## Max. :11201
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa:1208
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
wvs_mor <- subset(wvs2, cntry == 'Morocco')
wvs_mor$cntry <- ifelse(wvs_mor$cntry == 'Morocco', "Morocco", "0") %>% as.factor()
set.seed(123)
imp_mor = missForest(wvs_mor, verbose = TRUE)
## missForest iteration 1 in progress...done!
## estimated error(s): 0 0.02368531
## difference(s): 0 0.0001041667
## time: 0.144 seconds
##
## missForest iteration 2 in progress...done!
## estimated error(s): 0 0.02389399
## difference(s): 0 0
## time: 0.135 seconds
##
## missForest iteration 3 in progress...done!
## estimated error(s): 0 0.02389399
## difference(s): 0 0
## time: 0.172 seconds
imp_mor$OOBerror
## NRMSE PFC
## 0.00000000 0.02389399
mor <-imp_mor$ximp
summary(mor)
## cntry finsat sex age people_house
## Morocco:1200 Min. : 1.000 Male :600 Min. :18.00 Min. : 1.000
## 1st Qu.: 5.000 Female:600 1st Qu.:26.00 1st Qu.: 4.000
## Median : 6.000 Median :34.50 Median : 5.000
## Mean : 6.238 Mean :37.22 Mean : 4.903
## 3rd Qu.: 8.000 3rd Qu.:46.00 3rd Qu.: 6.000
## Max. :10.000 Max. :82.00 Max. :11.000
##
## marit_stat income status_comp
## Partnered :666 Min. : 1.000 Better off :621
## Previously partnered: 94 1st Qu.: 4.000 Worse off :234
## Single :440 Median : 5.000 Or about the same:345
## Mean : 5.228
## 3rd Qu.: 6.000
## Max. :10.000
##
## parent_proud with_parent GDPpc country_income
## Agree strongly :804 No :517 Min. :7826 Low income : 0
## Agree :317 Yes:683 1st Qu.:7826 Lower middle income:1200
## Disagree : 64 Median :7826 Upper middle income: 0
## Strongly disagree: 15 Mean :7826 High income : 0
## 3rd Qu.:7826
## Max. :7826
##
## region
## Sub-Saharan Africa : 0
## South Asia : 0
## North America : 0
## Middle East and North Africa:1200
## Latin America and Caribbean : 0
## Europe and Central Asia : 0
## East Asia and Pacific : 0
P.S. The structure here is a bit out of order: the imputation comes first, and the description of the missings and the data follows, sorry. In practice I did it the other way around.
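The per-country steps above (subset, reset the country factor, set the seed, run `missForest`, keep `ximp`) are the same for every country. Below is a minimal sketch of that routine written as a loop. It runs on a toy data set rather than on `wvs2`; the helper name `impute_by_group` and the use of `iris` are assumptions made for illustration and are not part of the original analysis.

```r
library(missForest)

# Hypothetical helper: impute each group separately with missForest,
# resetting the seed for every group as in the country-by-country code.
impute_by_group <- function(data, group_col, seed = 123) {
  groups <- split(data, data[[group_col]], drop = TRUE)
  lapply(groups, function(d) {
    d <- droplevels(d)                      # keep only the levels present in this group
    set.seed(seed)
    fit <- missForest(d, verbose = FALSE)
    list(ximp = fit$ximp, oob = fit$OOBerror)
  })
}

# Toy data: iris with 5% of the numeric cells set to missing
set.seed(1)
toy <- iris
toy[1:4] <- prodNA(iris[1:4], noNA = 0.05)

res <- impute_by_group(toy, "Species")
toy_imp <- do.call(rbind, lapply(res, `[[`, "ximp"))  # stack the imputed groups
sum(is.na(toy_imp))                                   # remaining missings
```

Stacking the `ximp` elements plays the role of the later `bind_rows()` call, and the `oob` elements hold the per-group error estimates that `imp_*$OOBerror` prints above.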
wvs_na <- bind_rows(wvs_pr, wvs_br, wvs_per, wvs_ch, wvs_arg, wvs_urg, wvs_mx, wvs_gut, wvs_nic, wvs_col, wvs_ecu, wvs_bol, wvs_mac, wvs_nig, wvs_ken, wvs_eth, wvs_zim, wvs_lib, wvs_tun, wvs_mor)
library(misty)
library(naniar)
na.test(wvs_na) #not MCAR
## Little's MCAR Test
##
## n nIncomp nPattern chi2 df pval
## 25824 1845 57 7817.35 573 0.000
gg_miss_upset(wvs_na)
Missings: Little’s test rejects the hypothesis that the missings are
MCAR (chi2 = 7817.35, df = 573, p-value < 0.001), meaning that they can
be MAR or MNAR; the test cannot tell these two apart. In the
visualization, the largest number of missings is in income, and few
missings occur jointly in several variables: 95 cases miss both the
number of people in the household and age, 33 miss both income and
status of living compared to parents’, and 27 miss both status of living
compared to parents’ and the statement that one of the main goals is to
make parents proud. I would suggest that the type of missingness in the
data is MAR.
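The counts read from the upset plot can also be tabulated directly. This is a minimal base R sketch on a made-up data frame (the values in `toy` are invented for illustration and only borrow three column names from `wvs_na`). It shows the number and share of missings per variable and how often two variables are missing in the same row.

```r
# Toy data frame standing in for wvs_na (values are made up for illustration)
toy <- data.frame(
  income       = c(5, NA, 3, NA, 7, 4),
  age          = c(30, 41, NA, 25, NA, 52),
  people_house = c(4, 3, NA, 6, NA, 2)
)

miss <- is.na(toy) * 1L             # 1 where a value is missing, 0 otherwise

colSums(miss)                       # number of missings per variable
round(100 * colMeans(miss), 1)      # share of missings per variable, in percent

# Co-missingness: entry [i, j] counts rows where variables i and j are both missing
co_miss <- crossprod(miss)
co_miss
```

The diagonal of `co_miss` repeats the per-variable counts. The off-diagonal entries correspond to the pairwise intersections shown in the upset plot.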
wvs_fin <- bind_rows(pr, br, per, ch, arg, urg, mx, gut, nic, col, ecu, bol, mac, nig, ken, eth, zim, lib, tun, mor)
wvs100 <- select(wvs_fin, c(cntry, finsat, sex, age, people_house, marit_stat, income, status_comp, parent_proud, with_parent, GDPpc, country_income, region))
table1(~ cntry + finsat + sex + age + people_house + marit_stat + income + status_comp + parent_proud + with_parent + GDPpc + country_income + region, data = wvs100)
Variable | Overall (N=25824) |
---|---|
cntry | |
Puerto Rico | 1127 (4.4%) |
Brazil | 1762 (6.8%) |
Peru | 1400 (5.4%) |
Chile | 1000 (3.9%) |
Argentina | 1003 (3.9%) |
Uruguay | 1000 (3.9%) |
Mexico | 1741 (6.7%) |
Guatemala | 1229 (4.8%) |
Nicaragua | 1200 (4.6%) |
Colombia | 1520 (5.9%) |
Ecuador | 1200 (4.6%) |
Bolivia | 2067 (8.0%) |
Macau SAR | 1023 (4.0%) |
Nigeria | 1237 (4.8%) |
Kenya | 1266 (4.9%) |
Ethiopia | 1230 (4.8%) |
Zimbabwe | 1215 (4.7%) |
Libya | 1196 (4.6%) |
Tunisia | 1208 (4.7%) |
Morocco | 1200 (4.6%) |
finsat | |
Mean (SD) | 6.05 (2.60) |
Median [Min, Max] | 6.00 [1.00, 10.0] |
sex | |
Male | 12343 (47.8%) |
Female | 13481 (52.2%) |
age | |
Mean (SD) | 39.7 (16.1) |
Median [Min, Max] | 37.0 [16.0, 100] |
people_house | |
Mean (SD) | 4.89 (3.40) |
Median [Min, Max] | 4.00 [1.00, 63.0] |
marit_stat | |
Partnered | 14581 (56.5%) |
Previously partnered | 3335 (12.9%) |
Single | 7908 (30.6%) |
income | |
Mean (SD) | 4.73 (2.15) |
Median [Min, Max] | 5.00 [1.00, 10.0] |
status_comp | |
Better off | 13076 (50.6%) |
Worse off | 4573 (17.7%) |
Or about the same | 8175 (31.7%) |
parent_proud | |
Agree strongly | 15191 (58.8%) |
Agree | 8857 (34.3%) |
Disagree | 1459 (5.6%) |
Strongly disagree | 317 (1.2%) |
with_parent | |
No | 17119 (66.3%) |
Yes | 8705 (33.7%) |
GDPpc | |
Mean (SD) | 17800 (24000) |
Median [Min, Max] | 11800 [2310, 129000] |
country_income | |
Low income | 1230 (4.8%) |
Lower middle income | 9393 (36.4%) |
Upper middle income | 11051 (42.8%) |
High income | 4150 (16.1%) |
region | |
Sub-Saharan Africa | 4948 (19.2%) |
South Asia | 0 (0%) |
North America | 0 (0%) |
Middle East and North Africa | 3604 (14.0%) |
Latin America and Caribbean | 16249 (62.9%) |
Europe and Central Asia | 0 (0%) |
East Asia and Pacific | 1023 (4.0%) |
The distribution of observations across countries is fairly even: each country contributes between 3.9% and 8.0% of the total observations, and no country has more than 8.0%. Financial satisfaction has a mean of 6.05 and a median of 6.00. Sex is almost equally distributed, with 47.8% males and 52.2% females. The mean age of the sample is 39.7, with min 16 and max 100. The number of people in the household has a mean of 4.89, with min 1 and max 63. For marital status, the biggest category is “partnered” with 56.5% of the sample, followed by “single” with 30.6% and “previously partnered” with 12.9%. Mean income is 4.73 (SD = 2.15). In the status comparison with parents, most people indicate being better off (50.6% of observations), 31.7% about the same, and 17.7% worse off. For 58.8% of the sample, making parents proud is strongly one of the main goals, and for 34.3% it is just one of the main goals; too few respondents strongly disagree (1.2%), so I will further merge this category with plain disagreement (5.6%). Living with parents is indicated by 33.7% of the sample; this includes both parents and parents-in-law.
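The merge of the two disagreement categories mentioned above can be sketched in base R by giving both levels the same label. The toy factor below is made up for illustration. Unlike an `ifelse()` on the character values, this keeps the result as a factor with the original level order.

```r
# Toy factor with the four answer categories of parent_proud
parent_proud <- factor(
  c("Agree strongly", "Agree", "Disagree", "Strongly disagree", "Agree"),
  levels = c("Agree strongly", "Agree", "Disagree", "Strongly disagree")
)

# Assigning an existing label to another level merges the two levels
parent_proud1 <- parent_proud
levels(parent_proud1)[levels(parent_proud1) == "Strongly disagree"] <- "Disagree"

table(parent_proud1)
```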
summary(wvs100)
## cntry finsat sex age
## Bolivia : 2067 Min. : 1.000 Male :12343 Min. : 16.00
## Brazil : 1762 1st Qu.: 5.000 Female:13481 1st Qu.: 26.00
## Mexico : 1741 Median : 6.000 Median : 37.00
## Colombia: 1520 Mean : 6.053 Mean : 39.66
## Peru : 1400 3rd Qu.: 8.000 3rd Qu.: 51.00
## Kenya : 1266 Max. :10.000 Max. :100.00
## (Other) :16068
## people_house marit_stat income
## Min. : 1.000 Partnered :14581 Min. : 1.000
## 1st Qu.: 3.000 Previously partnered: 3335 1st Qu.: 3.000
## Median : 4.000 Single : 7908 Median : 5.000
## Mean : 4.889 Mean : 4.726
## 3rd Qu.: 6.000 3rd Qu.: 6.000
## Max. :63.000 Max. :10.000
##
## status_comp parent_proud with_parent
## Better off :13076 Agree strongly :15191 No :17119
## Worse off : 4573 Agree : 8857 Yes: 8705
## Or about the same: 8175 Disagree : 1459
## Strongly disagree: 317
##
##
##
## GDPpc country_income
## Min. : 2312 Low income : 1230
## 1st Qu.: 7826 Lower middle income: 9393
## Median : 11847 Upper middle income:11051
## Mean : 17795 High income : 4150
## 3rd Qu.: 20411
## Max. :129103
##
## region
## Sub-Saharan Africa : 4948
## South Asia : 0
## North America : 0
## Middle East and North Africa: 3604
## Latin America and Caribbean :16249
## Europe and Central Asia : 0
## East Asia and Pacific : 1023
library(lattice)
Numeric variables:
densityplot(~wvs100$finsat | wvs100$cntry)
densityplot(~wvs100$income | wvs100$cntry)
densityplot(~wvs100$people_house | wvs100$cntry)
Categorical variables:
ggplot(wvs100, aes(x=status_comp)) + geom_bar() + facet_wrap(vars(cntry))
ggplot(wvs100, aes(x=parent_proud)) + geom_bar() + facet_wrap(vars(cntry))
ggplot(wvs100, aes(x=with_parent)) + geom_bar() + facet_wrap(vars(cntry))
wvs100$parent_proud1 <- ifelse(wvs100$parent_proud == "Disagree" | wvs100$parent_proud == "Strongly disagree", "Disagree", as.character(wvs100$parent_proud))
#sex
ks.test(wvs100$finsat, pnorm)
##
## Asymptotic one-sample Kolmogorov-Smirnov test
##
## data: wvs100$finsat
## D = 0.89384, p-value < 2.2e-16
## alternative hypothesis: two-sided
wilcox.test(wvs100$finsat ~ wvs100$sex) #no significant difference
##
## Wilcoxon rank sum test with continuity correction
##
## data: wvs100$finsat by wvs100$sex
## W = 83384212, p-value = 0.7539
## alternative hypothesis: true location shift is not equal to 0
boxplot(wvs100$finsat ~ wvs100$sex)
Interpretation: Financial satisfaction is not normally distributed (Kolmogorov-Smirnov test, p-value < 2.2e-16). The Wilcoxon test does not show a significant difference in financial satisfaction between sexes (p-value = 0.7539). The visualization agrees with the test.
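One detail about the normality check: `ks.test(x, pnorm)` compares the data with a standard normal distribution (mean 0, sd 1) unless other parameters are passed, so part of a large D can come from location and scale alone. Below is a minimal sketch on simulated 1 to 10 scores (not the WVS data) that passes the sample mean and standard deviation instead. With parameters estimated from the data and many ties, the p-value is only approximate, so this is a rough check rather than a formal test.

```r
set.seed(42)
x <- sample(1:10, 500, replace = TRUE)   # simulated stand-in for finsat

# Default: the reference distribution is N(0, 1)
ks_std <- suppressWarnings(ks.test(x, "pnorm"))

# Reference distribution with the sample mean and sd
ks_fit <- suppressWarnings(ks.test(x, "pnorm", mean = mean(x), sd = sd(x)))

c(D_standard = unname(ks_std$statistic), D_fitted = unname(ks_fit$statistic))
```

`suppressWarnings()` only hides the warning about ties that a discrete 1 to 10 scale triggers.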
#with_parent
wilcox.test(wvs100$finsat ~ wvs100$with_parent) #significant difference
##
## Wilcoxon rank sum test with continuity correction
##
## data: wvs100$finsat by wvs100$with_parent
## W = 70770255, p-value = 2.867e-11
## alternative hypothesis: true location shift is not equal to 0
boxplot(wvs100$finsat ~ wvs100$with_parent)
Interpretation: The Wilcoxon test indicates a significant difference in
financial satisfaction between people who live with their parents and
those who do not (p-value = 2.867e-11). However, the visualization shows
no clear difference between these two groups, so the significance may be
due to the high number of observations in the dataset rather than to a
large difference.
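Since the test is significant while the boxplots look alike, an effect size helps to judge how large the difference is. Below is a minimal sketch that computes the rank-biserial correlation from the numbers reported above: the W statistic and the group sizes of `with_parent`. The choice of this particular effect size is mine, not part of the original analysis.

```r
# Numbers reported above: W from wilcox.test(), group sizes from with_parent
W     <- 70770255
n_no  <- 17119   # do not live with parents
n_yes <- 8705    # live with parents

# Share of (No, Yes) pairs in which the "No" respondent has the higher finsat
# (ties counted as one half)
p_superiority <- W / (n_no * n_yes)

# Rank-biserial correlation: ranges from -1 to 1, 0 means no difference
r_rb <- 2 * p_superiority - 1

round(c(p_superiority = p_superiority, r_rb = r_rb), 3)
```

The result is close to zero (roughly -0.05), which fits the reading that the difference is statistically detectable but small.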
#age
cor1 <- cor.test(wvs100$finsat, wvs100$age, method = "spearman")
cor1
##
## Spearman's rank correlation rho
##
## data: wvs100$finsat and wvs100$age
## S = 2.9018e+12, p-value = 0.07771
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.01097819
#income
cor2 <- cor.test(wvs100$finsat, wvs100$income, method = "spearman")
cor2
##
## Spearman's rank correlation rho
##
## data: wvs100$finsat and wvs100$income
## S = 2.0655e+12, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.2803708
#people_house
cor3 <- cor.test(wvs100$finsat, wvs100$people_house, method = "spearman")
cor3
##
## Spearman's rank correlation rho
##
## data: wvs100$finsat and wvs100$people_house
## S = 2.8771e+12, p-value = 0.7021
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.002380482
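Interpretation: There is a significant positive correlation between financial satisfaction and income (rho = 0.28, p-value < 2.2e-16). The number of people in the household (rho = -0.002, p-value = 0.7021) and age (rho = -0.011, p-value = 0.07771) are not significantly correlated with financial satisfaction.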
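The three Spearman tests above can be collected into one small table, which makes the rho values easier to compare. This is a minimal sketch on simulated data; the data frame `toy` and its values are made up and only borrow the variable names.

```r
# Simulated stand-in for wvs100 (values are made up for illustration)
set.seed(1)
n <- 200
toy <- data.frame(income = sample(1:10, n, replace = TRUE),
                  age    = sample(18:80, n, replace = TRUE))
toy$finsat <- pmin(10, pmax(1, round(0.5 * toy$income + rnorm(n, 3, 2))))

# Run the same Spearman test for several predictors and collect rho and p
predictors <- c("income", "age")
spearman_tab <- do.call(rbind, lapply(predictors, function(v) {
  ct <- cor.test(toy$finsat, toy[[v]], method = "spearman", exact = FALSE)
  data.frame(variable = v, rho = unname(ct$estimate), p_value = ct$p.value)
}))
spearman_tab
```

`exact = FALSE` requests the asymptotic p-value, which avoids the warning about ties on discrete scales.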
##anova: cat + num (finsat)
#marit_stat
car::leveneTest(wvs100$finsat~wvs100$marit_stat) #variances are not equal across groups
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 18.69 7.741e-09 ***
## 25821
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
kruskal.test(wvs100$finsat~wvs100$marit_stat) #difference is statistically significant
##
## Kruskal-Wallis rank sum test
##
## data: wvs100$finsat by wvs100$marit_stat
## Kruskal-Wallis chi-squared = 36.229, df = 2, p-value = 1.358e-08
#status_comp
car::leveneTest(wvs100$finsat~wvs100$status_comp) #variances are not equal across groups
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 41.614 < 2.2e-16 ***
## 25821
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
kruskal.test(wvs100$finsat~wvs100$status_comp) #difference is statistically significant
##
## Kruskal-Wallis rank sum test
##
## data: wvs100$finsat by wvs100$status_comp
## Kruskal-Wallis chi-squared = 1412.3, df = 2, p-value < 2.2e-16
#parent_proud1
car::leveneTest(wvs100$finsat~wvs100$parent_proud1) #variances are not equal across groups
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 60.078 < 2.2e-16 ***
## 25821
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
kruskal.test(wvs100$finsat~wvs100$parent_proud1) #difference is statistically significant
##
## Kruskal-Wallis rank sum test
##
## data: wvs100$finsat by wvs100$parent_proud1
## Kruskal-Wallis chi-squared = 16.063, df = 2, p-value = 0.0003251
Interpretation: For all three variables, Levene's test indicates that the variance of financial satisfaction differs between the categories of marital status, status of living compared to parents, and making parents proud as one of the main goals, so the groups are compared with the Kruskal-Wallis test.
1. Financial satisfaction is not equal across different marital status groups (p-value 1.358e-08)
2. Financial satisfaction is not equal across different statuses of living compared to parents (p-value < 2.2e-16)
3. Financial satisfaction is not equal across different levels of agreement that making parents proud is one of the main goals (p-value 0.0003251)
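With almost 26,000 observations even small differences become significant, so it is useful to put the three Kruskal-Wallis statistics on a common scale. Below is a minimal sketch that computes epsilon-squared, one common effect size for this test, from the H values and the sample size reported above. The choice of epsilon-squared is an assumption of this sketch, not part of the original analysis.

```r
# H statistics reported by kruskal.test() above; n is the sample size
n <- 25824
H <- c(marit_stat = 36.229, status_comp = 1412.3, parent_proud1 = 16.063)

# Epsilon-squared for Kruskal-Wallis: H / (n - 1), between 0 and 1
eps_sq <- H / (n - 1)
round(eps_sq, 4)
```

On this scale the status comparison with parents stands out (about 0.05), while the values for marital status and making parents proud stay below 0.002.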
anova_comparisons1 <- list( c("Partnered", "Previously partnered"), c("Partnered", "Single"), c("Single", "Previously partnered"))
ggplot(wvs100, aes(x=marit_stat, y=finsat)) +
geom_boxplot()+
theme_minimal(base_size = 20)+
ylab('Fin satisfaction')+xlab('Marital status')+
ggpubr::stat_compare_means(comparisons = anova_comparisons1,label = "p.signif")+
ggpubr::stat_compare_means(method = 'kruskal.test', label.y = 15) #global Kruskal-Wallis test, as in kruskal.test() above; label placed above the 1-10 scale
#status_comp
table(wvs100$status_comp)
##
## Better off Worse off Or about the same
## 13076 4573 8175
anova_comparisons2 <- list(c("Better off", "Worse off"), c("Better off", "Or about the same"), c("Or about the same", "Worse off"))
ggplot(wvs100, aes(x=status_comp, y=finsat)) +
geom_boxplot()+
theme_minimal(base_size = 20)+
ylab('Fin satisfaction')+xlab('Status compared to parents')+
ggpubr::stat_compare_means(comparisons = anova_comparisons2,label = "p.signif")+
ggpubr::stat_compare_means(method = 'kruskal.test', label.y = 15) #global Kruskal-Wallis test, as in kruskal.test() above; label placed above the 1-10 scale
#parent_proud1
anova_comparisons3 <- list( c("Agree", "Agree strongly"), c("Agree", "Disagree"), c("Agree strongly", "Disagree"))
ggplot(wvs100, aes(x=parent_proud1, y=finsat)) +
geom_boxplot()+
theme_minimal(base_size = 20)+
ylab('Fin satisfaction')+xlab('Make parents proud')+
ggpubr::stat_compare_means(comparisons = anova_comparisons3,label = "p.signif")+
ggpubr::stat_compare_means(method = 'kruskal.test', label.y = 15) #global Kruskal-Wallis test, as in kruskal.test() above; label placed above the 1-10 scale
Interpretation: The plots show that the distribution of financial satisfaction indeed differs between groups for all three variables.
library(lme4)
nullmodel1 <- lm(finsat ~ 1, data = wvs100)
summary(nullmodel1)
##
## Call:
## lm(formula = finsat ~ 1, data = wvs100)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.0528 -1.0528 -0.0528 1.9472 3.9472
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.05278 0.01619 373.8 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.602 on 25823 degrees of freedom
nullmodel2 <- lmer(finsat ~ (1 | cntry), data = wvs100, REML = FALSE)
tab_model(nullmodel2)
Model | finsat (nullmodel2) | | |
---|---|---|---|
Predictors | Estimates | CI | p |
(Intercept) | 6.03 | 5.63 – 6.43 | <0.001 |
Random Effects | |||
σ2 | 5.99 | ||
τ00 cntry | 0.81 | ||
ICC | 0.12 | ||
N cntry | 20 | ||
Observations | 25824 | ||
Marginal R2 / Conditional R2 | 0.000 / 0.119 |
anova(nullmodel2, nullmodel1)
## Data: wvs100
## Models:
## nullmodel1: finsat ~ 1
## nullmodel2: finsat ~ (1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## nullmodel1 2 122677 122694 -61337 122673
## nullmodel2 3 119635 119660 -59815 119629 3044 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
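The addition of the second level is justified: the p-value of the likelihood ratio test is small (< 2.2e-16), indicating that the improvement is statistically significant, the log-likelihood of the model with the second level is closer to 0 (-59815 versus -61337), and its AIC is lower (119635 versus 122677). The ICC of 0.12 means that about 12% of the variance in financial satisfaction lies between countries. Thus nullmodel2 is better.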
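The two quantities behind this decision can be recomputed by hand from the output above. This is a minimal sketch using the rounded values printed by `tab_model()` and `anova()`, so the results match the tables only up to rounding.

```r
# Variance components reported by tab_model(nullmodel2)
sigma2 <- 5.99   # residual (within-country) variance
tau00  <- 0.81   # between-country variance of the intercept

# ICC: share of the total variance that lies between countries
icc <- tau00 / (tau00 + sigma2)
round(icc, 2)

# Likelihood ratio test from the log-likelihoods in the anova() output
ll_null1 <- -61337   # lm, no country level
ll_null2 <- -59815   # lmer, random intercept for country
lrt <- 2 * (ll_null2 - ll_null1)
p_value <- pchisq(lrt, df = 1, lower.tail = FALSE)
c(lrt = lrt, p_value = p_value)
```

The likelihood ratio statistic reproduces the Chisq value of 3044 in the `anova()` table, with one degree of freedom for the added country-level variance.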
mdl1 <- lmer(finsat ~ sex + age + income + marit_stat + (1 | cntry), data = wvs100, REML = FALSE)
mdl2 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + (1 | cntry), data = wvs100, REML = FALSE)
mdl3 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + (1 | cntry), data = wvs100, REML = FALSE)
anova(mdl2, mdl3)
## Data: wvs100
## Models:
## mdl2: finsat ~ sex + age + income + marit_stat + status_comp + (1 | cntry)
## mdl3: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + (1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mdl2 10 117116 117197 -58548 117096
## mdl3 12 117083 117181 -58529 117059 36.794 2 1.024e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mdl4 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + (1 | cntry), data = wvs100, REML = FALSE)
# mdl4 adds people_house: conditional R2 = 0.182 (about 18%), ICC = 0.10
anova(mdl3, mdl4)
## Data: wvs100
## Models:
## mdl3: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + (1 | cntry)
## mdl4: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + (1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mdl3 12 117083 117181 -58529 117059
## mdl4 13 117080 117186 -58527 117054 4.8039 1 0.02839 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mdl4.1 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + with_parent + (1 | cntry), data = wvs100, REML = FALSE)
anova(mdl4.1, mdl4)
## Data: wvs100
## Models:
## mdl4: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + (1 | cntry)
## mdl4.1: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + with_parent + (1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mdl4 13 117080 117186 -58527 117054
## mdl4.1 14 117082 117196 -58527 117054 0.1008 1 0.7509
tab_model(mdl1, mdl2, mdl3, mdl4)
Dependent variable: finsat | mdl1 | | | mdl2 | | | mdl3 | | | mdl4 | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Predictors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
(Intercept) | 4.52 | 4.14 – 4.90 | <0.001 | 4.95 | 4.59 – 5.30 | <0.001 | 4.90 | 4.53 – 5.26 | <0.001 | 4.96 | 4.59 – 5.33 | <0.001 |
sex [Female] | 0.00 | -0.06 – 0.06 | 0.999 | -0.00 | -0.06 – 0.05 | 0.866 | -0.00 | -0.06 – 0.05 | 0.892 | -0.00 | -0.06 – 0.05 | 0.897 |
age | 0.00 | -0.00 – 0.00 | 0.338 | 0.00 | -0.00 – 0.00 | 0.401 | 0.00 | -0.00 – 0.00 | 0.396 | 0.00 | -0.00 – 0.00 | 0.406 |
income | 0.31 | 0.29 – 0.32 | <0.001 | 0.28 | 0.27 – 0.30 | <0.001 | 0.28 | 0.27 – 0.30 | <0.001 | 0.28 | 0.27 – 0.30 | <0.001 |
marit stat [Previously partnered] | -0.12 | -0.22 – -0.03 | 0.009 | -0.08 | -0.17 – 0.01 | 0.094 | -0.08 | -0.17 – 0.01 | 0.101 | -0.08 | -0.17 – 0.01 | 0.084 |
marit stat [Single] | 0.10 | 0.03 – 0.18 | 0.005 | 0.09 | 0.02 – 0.16 | 0.017 | 0.09 | 0.01 – 0.16 | 0.018 | 0.09 | 0.01 – 0.16 | 0.019 |
status comp [Worse off] | | | | -1.10 | -1.18 – -1.01 | <0.001 | -1.09 | -1.17 – -1.00 | <0.001 | -1.09 | -1.17 – -1.00 | <0.001 |
status comp [Or about the same] | | | | -0.37 | -0.44 – -0.31 | <0.001 | -0.37 | -0.44 – -0.30 | <0.001 | -0.37 | -0.44 – -0.30 | <0.001 |
parent proud1 [Agree strongly] | | | | | | | 0.12 | 0.06 – 0.19 | <0.001 | 0.12 | 0.06 – 0.19 | <0.001 |
parent proud1 [Disagree] | | | | | | | -0.23 | -0.35 – -0.11 | <0.001 | -0.23 | -0.35 – -0.11 | <0.001 |
people house | | | | | | | | | | -0.01 | -0.02 – -0.00 | 0.028 |
Random Effects | | | | | | | | | | | | |
σ2 | 5.58 | | | 5.43 | | | 5.43 | | | 5.43 | | |
τ00 cntry | 0.66 | | | 0.56 | | | 0.58 | | | 0.59 | | |
ICC | 0.11 | | | 0.09 | | | 0.10 | | | 0.10 | | |
N cntry | 20 | | | 20 | | | 20 | | | 20 | | |
Observations | 25824 | | | 25824 | | | 25824 | | | 25824 | | |
Marginal R2 / Conditional R2 | 0.066 / 0.165 | | | 0.092 / 0.177 | | | 0.093 / 0.180 | | | 0.093 / 0.182 | | |
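Interpretation: Adding the “living with parents” variable does not improve the model (likelihood-ratio test p-value = 0.7509), whereas the other tested additions, parent_proud1 (p-value 1.024e-08) and people_house (p-value 0.02839), do. I will therefore not include “with_parent” in the model.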
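The p-values and AIC values in the anova tables above follow directly from the reported chi-square statistics, deviances, and parameter counts. A minimal base-R sketch, assuming the rounded values printed above (the data frame and variable names are illustrative only):

```r
# Chi-square statistics and degrees of freedom copied from the anova() output
lrt <- data.frame(
  added = c("parent_proud1", "people_house", "with_parent"),
  chisq = c(36.794, 4.8039, 0.1008),
  df    = c(2, 1, 1)
)

# Upper-tail chi-square probability = p-value of the likelihood-ratio test
lrt$p_value <- pchisq(lrt$chisq, df = lrt$df, lower.tail = FALSE)
print(lrt)

# AIC = deviance + 2 * number of estimated parameters
aic_mdl4   <- 117054 + 2 * 13
aic_mdl4.1 <- 117054 + 2 * 14
print(c(mdl4 = aic_mdl4, mdl4.1 = aic_mdl4.1))
```

The deviance is identical for mdl4 and mdl4.1 at the printed precision, so the extra parameter for with_parent only adds the AIC penalty of 2.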
mdl5 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc + (1 | cntry), data = wvs100, REML = FALSE)
wvs100$GDPpc_sc <- scale(wvs100$GDPpc)
mdl5.1 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + (1 | cntry), data = wvs100, REML = FALSE)
mdl6 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + (1 | cntry), data = wvs100, REML = FALSE)
#baseline: Low income
mdl7 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 | cntry), data = wvs100, REML = FALSE)
#baseline: Sub-Saharan Africa
anova(mdl6, mdl7)
## Data: wvs100
## Models:
## mdl6: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + (1 | cntry)
## mdl7: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mdl6 17 117076 117215 -58521 117042
## mdl7 20 117066 117229 -58513 117026 15.924 3 0.001176 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
tab_model(mdl4, mdl5.1, mdl6, mdl7) # 7th model: conditional R2 = 0.194 (19.4%), ICC = 0.03
Dependent variable: finsat | mdl4 | | | mdl5.1 | | | mdl6 | | | mdl7 | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Predictors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
(Intercept) | 4.96 | 4.59 – 5.33 | <0.001 | 4.95 | 4.59 – 5.32 | <0.001 | 4.10 | 2.96 – 5.25 | <0.001 | 4.13 | 3.08 – 5.18 | <0.001 |
sex [Female] | -0.00 | -0.06 – 0.05 | 0.897 | -0.00 | -0.06 – 0.05 | 0.894 | -0.00 | -0.06 – 0.05 | 0.888 | -0.00 | -0.06 – 0.05 | 0.886 |
age | 0.00 | -0.00 – 0.00 | 0.406 | 0.00 | -0.00 – 0.00 | 0.411 | 0.00 | -0.00 – 0.00 | 0.438 | 0.00 | -0.00 – 0.00 | 0.458 |
income | 0.28 | 0.27 – 0.30 | <0.001 | 0.28 | 0.27 – 0.30 | <0.001 | 0.28 | 0.27 – 0.30 | <0.001 | 0.28 | 0.27 – 0.30 | <0.001 |
marit stat [Previously partnered] | -0.08 | -0.17 – 0.01 | 0.084 | -0.08 | -0.17 – 0.01 | 0.085 | -0.08 | -0.17 – 0.01 | 0.082 | -0.08 | -0.17 – 0.01 | 0.079 |
marit stat [Single] | 0.09 | 0.01 – 0.16 | 0.019 | 0.09 | 0.01 – 0.16 | 0.019 | 0.09 | 0.01 – 0.16 | 0.020 | 0.09 | 0.01 – 0.16 | 0.020 |
status comp [Worse off] | -1.09 | -1.17 – -1.00 | <0.001 | -1.08 | -1.17 – -1.00 | <0.001 | -1.09 | -1.17 – -1.00 | <0.001 | -1.08 | -1.17 – -1.00 | <0.001 |
status comp [Or about the same] | -0.37 | -0.44 – -0.30 | <0.001 | -0.37 | -0.44 – -0.30 | <0.001 | -0.37 | -0.44 – -0.30 | <0.001 | -0.37 | -0.44 – -0.30 | <0.001 |
parent proud1 [Agree strongly] | 0.12 | 0.06 – 0.19 | <0.001 | 0.12 | 0.06 – 0.19 | <0.001 | 0.12 | 0.06 – 0.19 | <0.001 | 0.13 | 0.06 – 0.19 | <0.001 |
parent proud1 [Disagree] | -0.23 | -0.35 – -0.11 | <0.001 | -0.23 | -0.35 – -0.11 | <0.001 | -0.23 | -0.35 – -0.11 | <0.001 | -0.23 | -0.35 – -0.11 | <0.001 |
people house | -0.01 | -0.02 – -0.00 | 0.028 | -0.01 | -0.02 – -0.00 | 0.029 | -0.01 | -0.02 – -0.00 | 0.029 | -0.01 | -0.02 – -0.00 | 0.039 |
GDPpc sc | | | | 0.13 | -0.17 – 0.43 | 0.398 | -0.14 | -0.44 – 0.16 | 0.365 | -0.09 | -1.22 – 1.04 | 0.875 |
country income [Lower middle income] | | | | | | | 0.21 | -0.99 – 1.41 | 0.736 | -0.51 | -1.38 – 0.37 | 0.256 |
country income [Upper middle income] | | | | | | | 1.21 | 0.01 – 2.41 | 0.049 | -0.32 | -1.41 – 0.78 | 0.574 |
country income [High income] | | | | | | | 1.57 | 0.16 – 2.98 | 0.029 | -0.05 | -1.54 – 1.45 | 0.952 |
region [Middle East and North Africa] | | | | | | | | | | 0.87 | 0.20 – 1.55 | 0.011 |
region [Latin America and Caribbean] | | | | | | | | | | 1.59 | 0.93 – 2.25 | <0.001 |
region [East Asia and Pacific] | | | | | | | | | | 1.28 | -3.75 – 6.32 | 0.617 |
Random Effects | | | | | | | | | | | | |
σ2 | 5.43 | | | 5.43 | | | 5.43 | | | 5.43 | | |
τ00 cntry | 0.59 | | | 0.57 | | | 0.32 | | | 0.14 | | |
ICC | 0.10 | | | 0.09 | | | 0.06 | | | 0.03 | | |
N cntry | 20 | | | 20 | | | 20 | | | 20 | | |
Observations | 25824 | | | 25824 | | | 25824 | | | 25824 | | |
Marginal R2 / Conditional R2 | 0.093 / 0.182 | | | 0.097 / 0.183 | | | 0.141 / 0.189 | | | 0.173 / 0.194 | | |
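Interpretation: The 7th model improves on the 6th: the likelihood-ratio test is significant (p-value 0.001176), the log-likelihood is closer to 0 (-58513 vs -58521), and AIC is lower, although BIC is slightly higher (117229 vs 117215). Its conditional R2 is 0.194, so fixed and random effects together explain 19.4% of the variance, and the ICC drops to 0.03, meaning that little unexplained between-country variance remains once region is included.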
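One way to summarize how much of the between-country variance the predictors absorb is the proportional reduction in τ00 relative to the null model. This measure is not computed in the analysis above; the sketch below is a base-R illustration using the rounded τ00 values from the tab_model tables, with hypothetical variable names. All models are fitted to the same 25824 observations, so the variances are comparable.

```r
# Country-level intercept variance (tau00) copied from the tables above
tau00 <- c(null = 0.81, mdl4 = 0.59, mdl5.1 = 0.57, mdl6 = 0.32, mdl7 = 0.14)

# Share of the null model's between-country variance accounted for by each model
reduction <- 1 - tau00 / tau00["null"]
print(round(reduction, 2))
```

Most of the drop occurs when the country-level predictors (country_income and region) enter in models 6 and 7, which is consistent with the falling ICC.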
mdl8 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 + parent_proud1 | cntry), data = wvs100, REML = FALSE) # 8th model: conditional R2 = 0.179 (17.9%), ICC = 0.04
anova(mdl8, mdl7)
## Data: wvs100
## Models:
## mdl7: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 | cntry)
## mdl8: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 + parent_proud1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mdl7 20 117066 117229 -58513 117026
## mdl8 25 117041 117245 -58496 116991 34.791 5 1.656e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mdl9 <- lmer(finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 * region + people_house + GDPpc_sc + country_income + (1 + parent_proud1 | cntry), data = wvs100, REML = FALSE)
anova(mdl8, mdl9) # no significant improvement
## Data: wvs100
## Models:
## mdl8: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 + people_house + GDPpc_sc + country_income + region + (1 + parent_proud1 | cntry)
## mdl9: finsat ~ sex + age + income + marit_stat + status_comp + parent_proud1 * region + people_house + GDPpc_sc + country_income + (1 + parent_proud1 | cntry)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## mdl8 25 117041 117245 -58496 116991
## mdl9 31 117043 117296 -58491 116981 10.372 6 0.1098
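Interpretation: The 8th model, which adds a random slope for parent_proud1, has a conditional R2 of 0.179 (17.9% of variance explained) and an ICC of 0.04, and it fits significantly better than the 7th model (p-value 1.656e-06). Adding the cross-level interaction between parent_proud1 and region does not improve the model significantly (p-value 0.1098). I will therefore keep the random-slope model without the cross-level interaction.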
tab_model(mdl8)
Dependent variable: finsat | mdl8 | | |
---|---|---|---|
Predictors | Estimates | CI | p |
(Intercept) | 3.87 | 2.90 – 4.83 | <0.001 |
sex [Female] | -0.01 | -0.06 – 0.05 | 0.851 |
age | 0.00 | -0.00 – 0.00 | 0.503 |
income | 0.28 | 0.27 – 0.29 | <0.001 |
marit stat [Previously partnered] | -0.08 | -0.17 – 0.01 | 0.078 |
marit stat [Single] | 0.08 | 0.01 – 0.15 | 0.026 |
status comp [Worse off] | -1.07 | -1.16 – -0.99 | <0.001 |
status comp [Or about the same] | -0.37 | -0.44 – -0.30 | <0.001 |
parent proud1 [Agree strongly] | 0.13 | 0.01 – 0.24 | 0.026 |
parent proud1 [Disagree] | -0.15 | -0.46 – 0.16 | 0.339 |
people house | -0.01 | -0.02 – -0.00 | 0.043 |
GDPpc sc | -0.21 | -1.18 – 0.75 | 0.664 |
country income [Lower middle income] | 0.03 | -0.79 – 0.85 | 0.945 |
country income [Upper middle income] | 0.14 | -0.88 – 1.15 | 0.793 |
country income [High income] | 0.47 | -0.85 – 1.79 | 0.488 |
region [Middle East and North Africa] | 0.70 | 0.09 – 1.31 | 0.025 |
region [Latin America and Caribbean] | 1.28 | 0.68 – 1.88 | <0.001 |
region [East Asia and Pacific] | 1.52 | -2.80 – 5.84 | 0.490 |
Random Effects | | | |
σ2 | 5.41 | | |
τ00 cntry | 0.24 | | |
τ11 cntry.parent_proud1Agree strongly | 0.04 | | |
τ11 cntry.parent_proud1Disagree | 0.37 | | |
ρ01 | -0.49 | | |
 | -0.75 | | |
ICC | 0.04 | | |
N cntry | 20 | | |
Observations | 25824 | | |
Marginal R2 / Conditional R2 | 0.148 / 0.179 | | |
Interpretation: The following are not significant: sex, age, being previously partnered (compared with partnered), disagreeing that one of the main goals in life is to make one’s parents proud (compared with “agree”), GDP per capita, country income group, and the East Asia and Pacific region (compared with Sub-Saharan Africa).
The significant effects are as follows. A one-step increase on the income scale is associated with a 0.28 increase in financial satisfaction. Being single rather than partnered increases financial satisfaction by 0.08. Rating one’s standard of living as worse off than one’s parents’ (compared with better off) decreases financial satisfaction by 1.07, and rating it about the same decreases it by 0.37. Strong agreement that one of the main goals in life is to make one’s parents proud increases financial satisfaction by 0.13. The number of people in the household has only a marginal effect: each additional person decreases financial satisfaction by 0.01. Compared with Sub-Saharan Africa, financial satisfaction is higher by 0.70 in the Middle East and North Africa and by 1.28 in Latin America and Caribbean.
Here is a visualization of the effects described above:
plot_model(mdl8, type = "pred", terms = "income")
plot_model(mdl8, type = "pred", terms = "marit_stat")
plot_model(mdl8, type = "pred", terms = "status_comp")
plot_model(mdl8, type = "pred", terms = "parent_proud1")
plot_model(mdl8, type = "pred", terms = "people_house")
plot_model(mdl8, type = "pred", terms = "region")
library(lattice) # provides the dotplot() generic used for the random effects below
dotplot(ranef(mdl8))
## $cntry
Here we can see that the country-level deviations in financial satisfaction differ across countries; for instance, in Tunisia strong agreement is associated with higher financial satisfaction.
plot_model(mdl8,type="pred",
terms=c("parent_proud1", "cntry"),pred.type="re", grid=T)
The plot shows the predicted financial satisfaction in each country for the different levels of agreement that one of the main goals is to make one’s parents proud.
library(performance)
model_performance(mdl7)
## # Indices of model performance
##
## AIC | AICc | BIC | R2 (cond.) | R2 (marg.) | ICC | RMSE | Sigma
## -----------------------------------------------------------------------------------
## 1.171e+05 | 1.171e+05 | 1.173e+05 | 0.194 | 0.173 | 0.026 | 2.328 | 2.329
model_performance(mdl8)
## # Indices of model performance
##
## AIC | AICc | BIC | R2 (cond.) | R2 (marg.) | ICC | RMSE | Sigma
## -----------------------------------------------------------------------------------
## 1.171e+05 | 1.171e+05 | 1.173e+05 | 0.179 | 0.148 | 0.036 | 2.324 | 2.326
For model 8, the ICC indicates that 3.6% of the variance is attributable to country-level differences, and the conditional R2 of 0.179 means that the model explains 17.9% of the variance. Comparing the models without and with the random slope, model 7 has the higher conditional R2 (19.4% vs 17.9% for model 8), but the ICC is higher in the 8th model, suggesting that more between-country variation is captured there. AIC and BIC look the same for both models only because model_performance prints them to four significant digits; in the anova output for these two models, AIC favors model 8 (117041 vs 117066), while BIC favors model 7 (117229 vs 117245).
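model_performance prints AIC and BIC to four significant digits (1.171e+05 and 1.173e+05), which hides differences that are visible in the anova table for mdl7 and mdl8. The base-R sketch below recomputes both criteria from the rounded deviances and parameter counts reported there; the data frame and its column names are illustrative only.

```r
n_obs <- 25824   # number of observations used in both models

# Parameter counts and deviances copied from anova(mdl8, mdl7)
fits <- data.frame(
  model    = c("mdl7", "mdl8"),
  npar     = c(20, 25),
  deviance = c(117026, 116991)
)

# AIC penalizes each parameter by 2, BIC by log(n)
fits$AIC <- fits$deviance + 2 * fits$npar
fits$BIC <- round(fits$deviance + log(n_obs) * fits$npar)
print(fits)

diff(fits$AIC)   # negative: AIC is lower for mdl8
diff(fits$BIC)   # positive: BIC is lower for mdl7
```

The two criteria disagree because BIC charges roughly 10 units per extra parameter at this sample size, so the five additional random-effect parameters in mdl8 cost more than the deviance they save.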
library(DHARMa)
testDispersion(mdl8)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.99726, p-value = 0.944
## alternative hypothesis: two.sided
simulationOutput <- simulateResiduals(fittedModel = mdl8, plot = F)
plot(simulationOutput)
The KS test indicates a significant deviation of the scaled residuals from the uniform distribution expected under a correctly specified model. The dispersion test shows no significant over- or underdispersion (dispersion = 0.997, p-value = 0.944), and the outlier test is significant.
The first hypothesis is confirmed: people who rate their standard of living as “worse off” or “about the same” compared with their parents have lower financial satisfaction than those who rate it as “better off”.
The second hypothesis is not confirmed: people who strongly agree with the statement “One of main goals in life has been to make my parents proud” have higher, not lower, financial satisfaction.
The third hypothesis is not confirmed either. Although the number of people in the household has a statistically significant effect (p-value 0.043), the estimate is very small (-0.01), and with a dataset this large (25824 observations) it is too weak to count as support. Living with parents was not significant, and adding it did not improve the model.
The effect of separation from parents on financial satisfaction should be studied further, preferably on a different dataset.