Exploring data patterns

Determinants of psychological distress

Let’s look at differences in psychological distress between ages, genders, positions, years in conservation, education, social support, personal security, and national/non-national status.

Age

Variable	Estimate	p-value
age_year	-0.8847	`***`

Gender

The reference level is ‘female’.

Variable	Estimate (standardised)	p-value
gender_Male	-0.3003	`***`
gender_Other	1.3107	`***`
gender_Prefernottosay	0.2182	`.`
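Each estimate in a table like this is a contrast against the reference level. Below is a minimal sketch of that coding in R. The data frame `toy`, the `distress` score and all values are invented for illustration, and a plain linear model stands in for whichever model produced the table above.

```r
set.seed(1)

# Simulated stand-in data; not the survey data
toy <- data.frame(
  gender = factor(rep(c("Female", "Male", "Other"), times = c(150, 135, 15)))
)
toy$distress <- rnorm(300, mean = ifelse(toy$gender == "Male", -0.3, 0))

# Make 'Female' the reference level explicitly
toy$gender <- relevel(toy$gender, ref = "Female")

# Standardise the outcome so coefficients are in SD units of distress
toy$distress_z <- as.numeric(scale(toy$distress))

# One coefficient per non-reference level, each relative to 'Female'
fit_gender <- lm(distress_z ~ gender, data = toy)
coef(fit_gender)
```

Changing `ref` changes what every coefficient means, which is why each table states its reference level.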

Position

The reference level is ‘researcher’.

Variable	Estimate (standardised)	p-value
position_Administration	0.0805
position_Bachelorsstudent	0.4112	`***`
position_Consultantself_employed	-0.2275	`***`
position_Fieldworker	0.1181	`.`
position_Graduatestudent	0.2655	`***`
position_Intern	0.2908	`.`
position_Manager	0.0000
position_Other	0.0782	`.`
position_Policymaker	-0.4579	`***`
position_Ranger	0.0883
position_Unknown	0.1033

Desk/non-desk

This will only be used if the previous variable is not used.

The reference level is desk-based.

Variable	Estimate (standardised)	p-value
position_simple_nondesk	0.0722
position_simple_unknownother	0.0662

Years in conservation

Variable	Estimate (standardised)	p-value
years_cons	-0.3294	`***`

Education

The reference level is ‘college’.

Variable	Estimate (standardised)	p-value
education_University	-0.0008
education_Primary	0.6423	`*`
education_Secondary	0.2371
education_Unknown	0.0624

University/non-university

This will only be used if the previous variable is not used.

The reference level is non-University.

Variable	Estimate (standardised)	p-value
education_simple	-0.0942	`*`

Social support

Here we’re making the assumption that social support is equal interval, which may not be correct.

Variable	Estimate (standardised)	p-value
SS1	-0.1728	`***`
SS2	-0.1369	`***`
SS3	-0.1373	`***`

Personal security

Again, we’re making the assumption that personal security is equal interval.

Variable	Estimate (standardised)	p-value
PS_1	0.0010
PS_2	0.0701	`***`
PS_3	0.2179	`***`

National/non-national

The reference level is ‘national’.

Variable	Estimate (standardised)	p-value
national.nnational	-0.0346

By work country

Subset to those countries that have more than 20 observations (an arbitrary threshold). The reference level is “UK”.

Variable	Estimate (standardised)	p-value
W_coun_Australia	-0.0308
W_coun_Brazil	0.4150	`*`
W_coun_Canada	-0.0395
W_coun_Colombia	0.1221
W_coun_France	-0.0212
W_coun_Germany	0.3099	`*`
W_coun_India	0.3988	`***`
W_coun_Indonesia	0.2929	`***`
W_coun_Kenya	-0.0419
W_coun_NewZealand	-0.2650	`*`
W_coun_NULL	0.1170
W_coun_SouthAfrica	0.1411	`.`
W_coun_UnitedStatesofAmerica	0.0122
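The subsetting step described above (keep countries with more than 20 observations, then use the UK as the reference level) could look something like the following sketch. The data are simulated, and the column name `W_coun` and the label "United Kingdom" are assumptions based on the variable names in the tables.

```r
set.seed(2)

# Simulated stand-in data: four work countries with different sample sizes
toy <- data.frame(
  W_coun = rep(c("United Kingdom", "India", "Brazil", "Fiji"),
               times = c(60, 40, 25, 5)),
  distress = rnorm(130)
)

# Count respondents per country and keep countries with more than 20
n_by_country <- table(toy$W_coun)
keep <- names(n_by_country)[n_by_country > 20]
toy_sub <- toy[toy$W_coun %in% keep, ]

# Rebuild the factor (dropping unused levels) and set the reference level
toy_sub$W_coun <- relevel(factor(toy_sub$W_coun), ref = "United Kingdom")

levels(toy_sub$W_coun)
```

Rebuilding the factor after subsetting matters: otherwise the dropped countries remain as empty levels and show up as inestimable coefficients.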

By nationality

Again, subset to those countries that have more than 20 observations (an arbitrary threshold). The reference level is “UK”. (There is also a “conservation context country” variable, but this is probably going to be very similar to the above.)

Variable	Estimate (standardised)	p-value
Nation_Australia	-0.0876
Nation_Brazil	0.3278	`*`
Nation_Canada	0.0075
Nation_Colombia	0.1431
Nation_France	0.0258
Nation_Germany	-0.0798
Nation_India	0.3748	`***`
Nation_Indonesia	0.3035	`***`
Nation_Italy	0.0516
Nation_Kenya	-0.0916
Nation_NewZealand	-0.1550
Nation_NULL	0.0842
Nation_SouthAfrica	0.1278
Nation_Spain	-0.1901	`.`
Nation_UnitedStatesofAmerica	-0.0001

Was COVID-19 associated with distress?

Variable	Estimate (standardised)	p-value
CV	-0.056

Determinants of ERI

We’re going to assume that ERI is a continuous variable. This isn’t really true, because it’s bounded, but I think it’s acceptable. A value greater than one suggests that efforts outweigh rewards.
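This report does not show how the ERI score was computed. As a rough illustration only, the sketch below uses the common effort-reward ratio, effort divided by reward times a correction factor for the unequal number of items. The item counts (3 effort, 7 reward) and all scores are hypothetical.

```r
# Hypothetical item scores for three respondents
effort <- matrix(c(4, 4, 3,
                   2, 2, 2,
                   3, 4, 4), nrow = 3, byrow = TRUE)
reward <- matrix(c(2, 2, 3, 2, 2, 3, 2,
                   4, 4, 3, 4, 4, 4, 3,
                   3, 3, 3, 3, 3, 3, 3), nrow = 3, byrow = TRUE)

# Correction factor for the unequal number of effort and reward items
c_factor <- ncol(effort) / ncol(reward)

# Ratio above 1: effort outweighs reward; below 1: reward outweighs effort
eri <- rowSums(effort) / (rowSums(reward) * c_factor)
round(eri, 2)
```

Because the ratio cannot go below zero and is capped by the item ranges, treating it as continuous is an approximation, as noted above.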

Age

	Estimate	p-value
(Intercept)	0.1526	`*`
age_year	-0.1509	`*`

Gender

	Estimate	p-value
(Intercept)	0.0615	`*`
genderMale	-0.1501	`***`
genderOther	0.3594
genderPrefer not to say	-0.0277

Position

	Estimate	p-value
(Intercept)	-0.2246	`.`
positionBachelors student	-0.098
positionConsultant/self-employed	0.1023
positionFieldworker	0.3845	`*`
positionGraduate student	0.0927
positionIntern	0.0061
positionManager	0.2708	`.`
positionOther	0.1945
positionPolicymaker	-0.1287
positionRanger	0.2641
positionResearcher	0.2894	`*`
positionUnknown	0.3165	`.`

Desk/non-desk

	Estimate	p-value
(Intercept)	-0.0156
position_simplenondesk	0.1561	`.`
position_simpleunknownother	0.1075

Years

	Estimate	p-value
(Intercept)	0.0619	`.`
years_cons	-0.0669	`*`

Education

	Estimate	p-value
(Intercept)	-0.1799	`.`
educationUnknown	0.3033	`*`
educationPrimary	-0.0994
educationSecondary	-0.1817
educationUniversity	0.19	`.`

National/non-national

	Estimate	p-value
(Intercept)	0.0466
national.nnational	-0.0598

Country

	Estimate	p-value
(Intercept)	-0.0336
W_counBrazil	0.3384
W_counCanada	0.0044
W_counColombia	-0.1493
W_counFrance	0.1403
W_counGermany	0.2706
W_counIndia	0.224	`.`
W_counIndonesia	-0.0169
W_counKenya	0.3156	`.`
W_counMalaysia	0.4246	`.`
W_counMexico	0.3142
W_counNew Zealand	-0.3307	`.`
W_counNULL	0.1181
W_counSouth Africa	0.2038
W_counUnited Kingdom	-0.0893
W_counUnited Republic of Tanzania	-0.1043
W_counUnited States of America	-0.0034

Determinants of conservation optimism

Let’s look at differences in national and general CO.

Let’s compare general and national CO. For now, let’s assume the data are numeric (rather than ordinal). The mean of the per-respondent means across the ten general CO questions is 2.42. The mean national CO is 2.24. Although there is a statistically significant difference between the two (paired two-sided t-test), the actual difference is very small.
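A minimal sketch of this kind of comparison, using simulated responses rather than the survey data. The object names and the 1 to 4 response scale are assumptions made for illustration.

```r
set.seed(3)
n <- 200

# Simulated stand-ins: ten general CO items and one national CO item per respondent
general_items <- matrix(sample(1:4, n * 10, replace = TRUE), nrow = n)
national_co <- sample(1:4, n, replace = TRUE)

# Per-respondent mean across the ten general CO items
general_mean <- rowMeans(general_items)

# Paired two-sided t-test: each respondent contributes one difference
tt <- t.test(general_mean, national_co, paired = TRUE, alternative = "two.sided")
tt$estimate  # mean of the within-respondent differences
tt$conf.int  # 95% confidence interval for that mean difference
```

Reporting the mean difference and its interval alongside the p-value makes it easier to judge whether a significant difference is also a meaningful one.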

We can extend this, to look at the difference in optimism within countries (only showing those countries with >30 responses).

Associations between general and national CO

So we’ve looked at differences in the average level of general and national CO, but what about the correlation between the two?

All the CO questions are Likert-scaled, so are ordinal. Furthermore, some previous exploratory factor analysis (see here) suggested the following relationship between the items and general CO.

Let’s explore the correlation between national CO (ordinal) and general CO (latent variable). We can’t do this in a single step (as far as I’m aware), so we’ll split it into two.

First, we perform confirmatory factor analysis (with the custom correlated error structure described previously). We then extract the factor scores (which represent the estimated position of each respondent along the range of a latent variable), which we think represent the latent variable general CO.

# Extract the factor scores
Factor_main <- lavaan::lavPredict(fit_SO_simple, newdata = DF2)
colnames(Factor_main) <- "General.CO"

# And combine them with the original data
DF2 <- cbind(DF2, Factor_main)

Second, we regress national CO on the factor scores (representing general CO), using an ordinal regression. First, we run the ordinal regression.


	OR	2.5% CI	97.5% CI
General.CO	6.55	5.53	7.78
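For reference, a table like this can be produced along the following lines. This is a sketch on simulated data: it assumes a proportional-odds model fitted with `MASS::polr`, and the response labels and the name `SO_11` for the national CO item are illustrative guesses.

```r
library(MASS)
set.seed(4)
n <- 500

# Simulated predictor (stand-in for the factor scores) and a four-level ordinal response
General.CO <- rnorm(n)
latent <- 1.5 * General.CO + rlogis(n)
SO_11 <- cut(latent, breaks = c(-Inf, -2, 0, 2, Inf),
             labels = c("Definitely won't", "Probably won't",
                        "Probably will", "Definitely will"),
             ordered_result = TRUE)
toy <- data.frame(General.CO, SO_11)

# Proportional-odds ordinal regression
ordinal_reg <- polr(SO_11 ~ General.CO, data = toy, Hess = TRUE)

# Odds ratio with a profile-likelihood 95% confidence interval
or_table <- exp(c(OR = unname(coef(ordinal_reg)), confint(ordinal_reg)))
round(or_table, 2)
```

An odds ratio above 1 means that higher general CO is associated with higher odds of giving a more optimistic answer to the national CO question.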

Then we want to check that the equal interval assumption is met (this is a really useful walkthrough). There are more diagnostics that we can do, but the most informative is the plot shown below. Put simply, we’re looking to see that, for each predictor variable (in our case, only general CO), the distance between the levels of the dependent variable (national CO) remains similar. This doesn’t mean that symbols of the same type have to be vertically aligned. It simply means that the symbols on each horizontal line should be similarly spaced, which in our case they are (nicely).
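One simple way to probe this assumption numerically, sketched here on simulated data, is to fit a separate binary logistic regression at each cut-point of the ordinal response and compare the slopes. This is an illustrative alternative to the graphical check, not necessarily the diagnostic used for this report, and all names and values are invented.

```r
set.seed(5)
n <- 1000

# Simulated predictor and a four-level ordinal response stored as integer codes 1..4
General.CO <- rnorm(n)
latent <- 1.5 * General.CO + rlogis(n)
national <- cut(latent, breaks = c(-Inf, -2, 0, 2, Inf), labels = FALSE)

# One binary logistic regression per cut-point: P(national >= k)
slopes <- sapply(2:4, function(k) {
  fit_k <- glm(as.integer(national >= k) ~ General.CO, family = binomial)
  coef(fit_k)[["General.CO"]]
})
names(slopes) <- paste0(">=", 2:4)
round(slopes, 2)
```

If the assumption holds, the slopes should be roughly equal across cut-points. Large differences would suggest that the effect of the predictor depends on where the response scale is split.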

OK, now we’ve done the diagnostics, we can visualise the relationship between general and national CO.

# Create a dummy data frame which we use to predict the levels of SO_11 that we can plot
newdat <- data.frame(General.CO = seq(min(DF2$General.CO), max(DF2$General.CO), length.out = 100))

# Predict outcomes
newdat <- cbind(newdat, predict(ordinal_reg, newdat, type = "probs"))

# Put into a new DF (long format); variable.name and value.name are reshape2::melt arguments
lnewdat <- reshape2::melt(newdat, id.vars = "General.CO",
  variable.name = "variable", value.name = "value")

# Rename columns
colnames(lnewdat) <- c("General CO", "Level", "Probability")

The latent variable general CO is presented in standardised units, i.e. 1 on the x-axis of the figure below means 1 standard deviation of the latent, unmeasured construct. So if we look at -2 on the x-axis, this is someone who has very low general CO. If we look up from there, we can see that they have a high probability of saying “probably won’t” when asked if the conservation goals for their country will be met. Similarly, those who have high general CO are likely to answer “probably will” or “definitely will” for the same national CO question. Neat!

Predictors of general CO

Age

Variable	Estimate (standardised)	p-value
age_year	-0.5021	`***`

Gender

There does not appear to be a significant difference by gender.

Variable	Estimate (standardised)	p-value
gender_Male	0.0517	`.`
gender_Other	-0.0306

Position

The reference level is ‘researcher’.

Variable	Estimate (standardised)	p-value
position_Administration	0.1308
position_Bachelorsstudent	0.2766	`**`
position_Consultantself_employed	-0.0471
position_Fieldworker	0.0679
position_Graduatestudent	0.0554
position_Intern	0.8485	`***`
position_Manager	0.0606
position_Other	0.1430	`***`
position_Policymaker	0.1767
position_Ranger	0.5237	`***`
position_Unknown	0.1456	`.`

Desk/non-desk

The reference level is ‘desk’.

Variable	Estimate (standardised)	p-value
position_simple_nondesk	0.0779
position_simple_unknownother	0.0858

Years in conservation

Variable	Estimate (standardised)	p-value
years_cons	-0.1857	`***`

Education

The reference level is ‘college’.

Variable	Estimate (standardised)	p-value
education_University	-0.0411
education_Unknown	0.0508
education_Primary	-0.0495
education_Secondary	-0.1820

University/non-university

The reference level is ‘college’.

Variable	Estimate (standardised)	p-value
education_simple	-0.0125

Was COVID-19 associated with conservation optimism?

Variable	Estimate (standardised)	p-value
CV	0.1231	`***`

Predictors of national CO

OK, let’s do the same but looking at predictors of national CO, using ordinal regression (but not bothering to check the model assumptions, and doing it as one multivariable analysis). The continuous variables, like age, have been scaled. Reference levels are:

Gender = female
Position = administration
Education = college

	OR	2.5% CI	97.5% CI	p-value
age_year	0.79	0.46	1.37
genderMale	1.20	1.00	1.44	`.`
genderOther	0.34	0.07	1.65
genderPrefer not to say	1.17	0.42	3.31
positionBachelors student	1.80	0.75	4.31
positionConsultant/self-employed	0.72	0.39	1.33
positionFieldworker	0.95	0.51	1.74
positionGraduate student	0.59	0.33	1.05	`.`
positionIntern	1.70	0.66	4.37
positionManager	0.71	0.41	1.24
positionOther	0.77	0.43	1.38
positionPolicymaker	0.60	0.25	1.44
positionRanger	0.95	0.38	2.40
positionResearcher	0.64	0.38	1.09	`.`
years_cons	0.66	0.52	0.85	`**`
educationPrimary	0.78	0.14	4.32
educationSecondary	1.37	0.51	3.74
educationUniversity	1.16	0.73	1.85

Predictors of goal progress

As we mentioned before, we’d like to construct a composite variable for goal progress. However, until I work that out, we can stick with simply taking the mean of the progress scores for those goals that are endorsed. Let’s look at what characteristics predict perceived goal progress. Reference levels are:

Gender = female
Position = administration
Education = college

	Estimate	2.5% CI	97.5% CI	p-value
(Intercept)	20.99	15.84	27.82	`***`
age_year	0.94	0.78	1.13
genderMale	1.08	1.02	1.15	`*`
genderOther	0.87	0.49	1.54
genderPrefer not to say	0.90	0.59	1.38
positionBachelors student	1.22	0.91	1.63
positionConsultant/self-employed	0.90	0.72	1.12
positionFieldworker	0.92	0.74	1.15
positionGraduate student	0.77	0.63	0.95	`*`
positionIntern	1.02	0.73	1.43
positionManager	0.94	0.77	1.14
positionOther	0.86	0.70	1.06
positionPolicymaker	1.01	0.75	1.36
positionRanger	1.15	0.84	1.59
positionResearcher	0.81	0.67	0.98	`*`
years_cons	1.03	0.95	1.12
educationPrimary	0.43	0.20	0.92	`*`
educationSecondary	0.81	0.54	1.22
educationUniversity	0.89	0.76	1.04
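The interim goal progress score used here (the mean of the progress scores over endorsed goals only) can be sketched as follows. The column names `GP_1` to `GP_3` and the values are hypothetical, and the sketch assumes that non-endorsed goals are stored as `NA`.

```r
# Hypothetical progress scores for three goals;
# NA marks a goal that the respondent did not endorse
progress <- data.frame(
  GP_1 = c(3, NA, 5),
  GP_2 = c(4, 2, NA),
  GP_3 = c(NA, NA, 1)
)

# Mean progress across endorsed goals only
GP_total <- rowMeans(progress, na.rm = TRUE)
GP_total
```

A respondent who endorsed no goals would get `NaN` under this approach, so such rows would need to be handled before modelling.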

Conservation optimism, dispositional optimism, and goal progress

Now let’s explore the multivariate associations between conservation and dispositional optimism and goal progress. First, let’s look at the association between conservation optimism and dispositional optimism.

# The model
model_SO_DO <- '
### Conservation optimism
SO =~ SO_1 + SO_2 + SO_3 +SO_4 + SO_5+ SO_6 + SO_7 + SO_9

# Correlated error terms 
SO_1 ~~ SO_2
SO_3 ~~ SO_4 
SO_5 ~~ SO_6 
SO_7 ~~ SO_9

###  Dispositional optimism 
DO =~ LOTR_1 + LOTR_2 + LOTR_3 +  LOTR_4 + LOTR_5 + LOTR_6

# The method effect 
method =~  LOTR_1 + LOTR_3 + LOTR_6

# Setting the covariances of the method effect with DO and with SO to 0
DO ~~ 0*method
SO ~~ 0*method

### Regressing SO on DO
SO ~ DO
'

# Now let’s run the SEM
fit_SO_DO <- lavaan::sem(
  model_SO_DO,
  data = DF2,
  estimator = "WLSMVS",
  ordered = c("SO_1", "SO_2", "SO_3", "SO_4", "SO_5", "SO_6", "SO_7", "SO_9",
              "LOTR_1", "LOTR_2", "LOTR_3", "LOTR_4", "LOTR_5", "LOTR_6")
)

The RMSEA of this model is 0.095 (95% CI 0.09 - 0.1), indicating that the model does not fit particularly well. This suggests there may be a lot of variation in situational conservation optimism that is not explained by dispositional optimism.
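For reference, the RMSEA and its confidence bounds can be pulled out of a fitted lavaan object with `fitMeasures()`. The sketch below uses the `HolzingerSwineford1939` example data that ships with lavaan and a toy two-factor model, not the models in this report.

```r
library(lavaan)

# Toy two-factor CFA on lavaan's built-in example data; not the survey data
toy_model <- '
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
'
toy_fit <- cfa(toy_model, data = HolzingerSwineford1939)

# RMSEA point estimate with its lower and upper confidence bounds
rmsea <- fitMeasures(toy_fit, c("rmsea", "rmsea.ci.lower", "rmsea.ci.upper"))
round(rmsea, 3)
```

Two things to keep in mind when applying this to the models here. With ordered indicators and a scaled estimator, lavaan also reports scaled versions (for example `rmsea.scaled`), and those may be the more appropriate values to quote. Also, lavaan's default confidence level for the RMSEA is 90%, so it is worth checking which level is being reported.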

What about if we now include goal progress?

# The model
model_SO_DO_GP <- '
### Conservation optimism
SO =~ SO_1 + SO_2 + SO_3 +SO_4 + SO_5+ SO_6 + SO_7 + SO_9

# Correlated error terms 
SO_1 ~~ SO_2
SO_3 ~~ SO_4 
SO_5 ~~ SO_6 
SO_7 ~~ SO_9

###  Dispositional optimism 
DO =~ LOTR_1 + LOTR_2 + LOTR_3 +  LOTR_4 + LOTR_5 + LOTR_6

# The method effect 
method =~  LOTR_1 + LOTR_3 + LOTR_6

### Regressing SO on DO
SO ~ DO

### Regressing goal progress on SO and DO
GP_total ~ SO + DO

# Setting the covariances of the method effect with DO and with SO to 0
DO ~~ 0*method
SO ~~ 0*method
'

# Now let’s run the SEM
fit_SO_DO_GP <- lavaan::sem(
  model_SO_DO_GP,
  data = DF2,
  estimator = "WLSMVS",
  ordered = c("SO_1", "SO_2", "SO_3", "SO_4", "SO_5", "SO_6", "SO_7", "SO_9",
              "LOTR_1", "LOTR_2", "LOTR_3", "LOTR_4", "LOTR_5", "LOTR_6")
)

The RMSEA of this model is 0.094 (95% CI 0.09 - 0.1).

Optimism, goal progress, and psychological distress

So now we also think that goal progress and dispositional optimism predict psychological distress. Let’s see if that is supported by the data. (Sorry for the annoying overlapping text - this will be resolved in later models.)

# The model
model_OP_PD <- '
### Conservation optimism
SO =~ SO_1 + SO_2 + SO_3 +SO_4 + SO_5+ SO_6 + SO_7 + SO_9

# Correlated error terms 
SO_1 ~~ SO_2
SO_3 ~~ SO_4 
SO_5 ~~ SO_6 
SO_7 ~~ SO_9

###  Dispositional optimism 
DO =~ LOTR_1 + LOTR_2 + LOTR_3 +  LOTR_4 + LOTR_5 + LOTR_6

# The method effect 
method =~  LOTR_1 + LOTR_3 + LOTR_6

### Regressing SO on DO
SO ~ DO

### Regressing goal progress on DO
GP_total ~ DO

### PD 
PD =~ K10_1 + K10_2 + K10_3 + K10_4 + K10_5 + K10_6 + K10_7 + K10_8 + K10_9 + K10_10

# Correlated error terms 
K10_2 ~~ K10_3
K10_5 ~~ K10_6
K10_7 ~~ K10_8

### Regressing PD on goal progress, DO and SO
PD ~ GP_total + DO + SO

# Setting the covariances of the method effect with DO and with SO to 0
DO ~~ 0*method
SO ~~ 0*method
'

# Now let’s run the SEM
fit_OP_PD <- lavaan::sem(
  model_OP_PD,
  data = DF2,
  estimator = "WLSMVS",
  ordered = c("SO_1", "SO_2", "SO_3", "SO_4", "SO_5", "SO_6", "SO_7", "SO_9",
              "LOTR_1", "LOTR_2", "LOTR_3", "LOTR_4", "LOTR_5", "LOTR_6",
              "K10_1", "K10_2", "K10_3", "K10_4", "K10_5", "K10_6", "K10_7",
              "K10_8", "K10_9", "K10_10")
)

The RMSEA of this model is 0.073 (95% CI 0.07 - 0.08).

Next steps

So far we’ve just been using the complete cases. However, we know some variables have missing data because some people didn’t complete all the questions. The next step is to use multiple imputation to account for this, described here.