There don’t appear to be major red flags in the correlations. Some variables are strongly correlated with each other, but not to a degree that should cause problems.
## Sample size: 881
## Number of trees: 1000
## Forest terminal node size: 5
## Average no. of terminal nodes: 117.814
## No. of variables tried at each split: 15
## Total no. of variables: 43
## Resampling used to grow trees: swor
## Resample size used to grow trees: 557
## Analysis: RF-R
## Family: regr
## Splitting rule: mse *random*
## Number of random split points: 10
## (OOB) R squared: 0.22781441
## (OOB) Requested performance error: 1.55686503
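On the training data, the model had an OOB R-squared of about 0.23, meaning it explained roughly 23% of the variance in life satisfaction for out-of-bag cases.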
## Sample size of test (predict) data: 369
## Number of grow trees: 1000
## Average no. of grow terminal nodes: 117.814
## Total no. of grow variables: 43
## Resampling used to grow trees: swor
## Resample size used to grow trees: 557
## Analysis: RF-R
## Family: regr
## R squared: 0.19801046
## Requested performance error: 1.78474045
The test data showed a similar R-squared of about 0.20, only slightly below the OOB estimate from the training data.
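As a rough consistency check on the two model summaries above, the reported R-squared and error values can be related to each other. The sketch below assumes that R-squared is computed as 1 minus the mean squared error divided by the variance of the outcome, which makes it possible to back out the outcome variance implied by each summary. The helper name `implied_var` is hypothetical and only used for illustration.

```r
# Values copied from the training (OOB) and test summaries above.
oob_r2   <- 0.22781441
oob_mse  <- 1.55686503
test_r2  <- 0.19801046
test_mse <- 1.78474045

# Assumption: R squared = 1 - MSE / Var(y), so Var(y) = MSE / (1 - R squared).
# implied_var is a hypothetical helper, not part of any package.
implied_var <- function(mse, r2) mse / (1 - r2)

train_var <- implied_var(oob_mse, oob_r2)
test_var  <- implied_var(test_mse, test_r2)

# Outcome variance implied by each summary.
print(round(c(train = train_var, test = test_var), 3))

# Drop in R squared from OOB to test, in percentage points.
r2_drop <- 100 * (oob_r2 - test_r2)
print(round(r2_drop, 1))
```

If that assumption holds, part of the higher test error would reflect a somewhat larger outcome variance in the test set rather than a worse fit alone, which is consistent with the R-squared falling by only about three percentage points.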
As predicted, 2018 life satisfaction is by far the most important predictor of 2022 life satisfaction. Interestingly, none of the demographic predictors has meaningful importance for predicting life satisfaction.