Hypothesis
Below are some possible hypotheses that we could test:
Overview of sections
The following sections illustrate:
The raw dataset contains 915 observations. The next plot shows the pattern of missing data.
##
##             Agree          Disagree    Not applicable    Strongly agree Strongly disagree
##               213               295               136               106                61
##
##             Agree          Disagree    Strongly agree Strongly disagree
##               213               295               106                61
The dataset including missing observations (and responses where respondents indicated a question was ‘not applicable’) contains 915 observations. Removing these missing observations leaves 493 complete observations. We could use multiple imputation methods to impute synthetic values for the genuinely ‘missing’ data, but this is not suitable for imputing the ‘not applicable’ observations. Alternatively, instead of using a single SEM, we could repeat the analysis for each explanatory variable in turn (using the complete observations for that variable), which might yield slightly larger datasets for each analysis. However, we would then lose the information that exists across patterns of responses (discussed below).
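As a minimal sketch of the complete-case count (assuming a hypothetical raw data frame, here called ranger_raw, in which both genuine non-response and ‘not applicable’ answers have already been recoded to NA; the actual object names and recoding are not shown in this section):
# Sketch only: 'ranger_raw' is a hypothetical name for the raw 915-row dataset,
# with 'not applicable' answers assumed already recoded to NA (recoding not shown).
nrow(ranger_raw)                # 915 observations in the raw data
sum(complete.cases(ranger_raw)) # fully observed rows (493 reported above)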
Now, let’s look at some summaries of each variable.
Observations by site
The table below shows the number of sites with a given number of observations. For instance, there are 360 sites with only one observation, but one site with 11 observations. This suggests there is probably no need for a random effect for site.
# Number of sites with a given number of observations
table(table(ranger_3$Q1PA_Name))
##
## 1 2 3 4 5 7 9 10 11
## 360 24 10 1 1 1 2 1 1
Observations by country
First, which countries are represented in the sample? There are 53 unique countries represented in the dataset.
levels(factor(ranger_3$Q2Country_ENG))
## [1] "Argentina" "Australia"
## [3] "Bhutan" "Brazil"
## [5] "Canada" "Chile"
## [7] "China" "Colombia"
## [9] "Costa Rica" "Côte d'Ivoire"
## [11] "Democratic Republic of the Congo" "Ecuador"
## [13] "France" "Germany"
## [15] "Ghana" "Guinea"
## [17] "Iceland" "India"
## [19] "Indonesia" "Israel"
## [21] "Kenya" "Liberia"
## [23] "Malaysia" "Mexico"
## [25] "Mozambique" "Namibia"
## [27] "Nepal" "Netherlands"
## [29] "New Zealand" "Nigeria"
## [31] "Pakistan" "Paraguay"
## [33] "Peru" "Philippines"
## [35] "Portugal" "Romania"
## [37] "Solomon Islands" "South Africa"
## [39] "Spain" "Sri Lanka"
## [41] "Switzerland" "Tanzania"
## [43] "Thailand" "The Bahamas"
## [45] "UAE" "Uganda"
## [47] "United Kingdom" "Uruguay"
## [49] "USA" "Venezuela"
## [51] "Vietnam" "Zambia"
## [53] "Zimbabwe"
We can also look at the distribution of observations between countries. There is one country with 97 observations, but otherwise a reasonable spread of observations across the other countries.
# Number of countries with a given number of observations
table(table(ranger_3$Q2Country_ENG))
##
## 1 2 3 4 5 6 9 10 11 12 13 15 16 18 19 22 24 25 69 97
## 15 5 7 2 4 2 2 1 1 2 3 1 1 1 1 1 1 1 1 1
Observations by region
We can also look at the observations in each region, which show a reasonably good spread between regions.
# Plot as table
table(ranger_3$RegionCode)
##
## South America North America
## 74 70
## Asia Africa
## 201 60
## Europe Australia and Oceania
## 56 19
## Central America and Caribbean
## 13
Age
summary(ranger_3$Q3Age_Number)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 22.00 31.00 37.00 39.06 45.00 68.00
Gender
# Inspect
summary(as.factor(ranger_3$Q4Sex_ENG))
## Female Male
## 78 415
Years experience
summary(ranger_3$Q5Experince_Number)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 4.00 8.00 10.85 15.00 40.00
Living and working in remote locations without healthcare facility
# Recode to factor
ranger_3$Q6__Living_Working_Remotelocation_without_healthcareFacility <- as.factor(ranger_3$Q6__Living_Working_Remotelocation_without_healthcareFacility)
# Table
table(ranger_3$Q6__Living_Working_Remotelocation_without_healthcareFacility)
##
## No Yes
## 339 154
Red List Index score
I’ve included this additional variable, which is the Red List Index (RLI) score. The Red List Index “shows trends in the status of groups of species based only on genuine improvements or deteriorations in status of sufficient magnitude to qualify species for listing in more threatened or less threatened Red List Categories.” I’ve used the RLI aggregated across five taxonomic groups: birds, mammals, amphibians, cycads and warm-water reef-forming corals. This score is disaggregated by country, weighted by the fraction of each species’ distribution occurring within a particular country. The RLI ranges between 1 (no threatened species) and 0 (all highly threatened species).
summary(ranger_3$rli)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.5706 0.7736 0.8210 0.8034 0.8489 0.9836
Perceived negative impacts of COVID-19 on conservation
This can probably be better displayed as a divergent barplot, but it’ll do for now. We can see that respondents overwhelmingly agree or strongly agree with the statements. We can’t read too much into this in absolute terms, because all items are positively worded and so there is likely to be acquiescence bias.
Note that we’ve removed the ‘not applicable’ category here. When we present the results, it might be worth including the ‘not applicable’ responses in the descriptive statistics, since this information is still useful, and then subsetting the data to exclude the ‘not applicable’ responses in the statistical analysis.
# Subset the labels and data for impacts
labels.imp <- labels_list[9:14]
data.imp<- ranger_3[9:14]
# Recode variable name
names(data.imp)<- labels.imp
# Melt into DF
out <- melt(apply(data.imp, 2, table))
# Add paragraph breaks
out$X2 <- as.factor(sapply( strwrap(out$X2, 55, simplify=FALSE), paste, collapse="\n" ))
labels.imp <- as.factor(sapply( strwrap(labels.imp, 55, simplify=FALSE), paste, collapse="\n" ))
# Remove Question numbers
out$X2 <- trimws(gsub("Q7 |Q8|Q9 |Q10|Q11|Q12", "", out$X2))
labels.imp <- trimws(gsub("Q7 |Q8|Q9 |Q10|Q11|Q12", "", labels.imp))
# Response order
level_order2 <- c("Strongly agree", "Agree", "Disagree","Strongly disagree")
# The plot
ggplot(data = out, aes(x = factor(X2, level = rev(labels.imp)), y = value,
                       fill = factor(X1, level = level_order2))) +
  geom_bar(stat = "identity", position = "fill", width = 0.85) +
  coord_flip() +
  theme(legend.position = c(.1, -.18), legend.direction = "horizontal",
        legend.title = element_blank(), axis.text.y = element_text(size = 11),
        plot.margin = margin(t = 0.05, r = 0.05, b = 1.5, l = 0.05, "cm")) +
  guides(fill = guide_legend(reverse = TRUE)) +
  ylab("Percent") + xlab("Statement") +
  scale_y_continuous(breaks = c(0, .25, 0.5, .75, 1),
                     labels = c("0%", "25%", "50%", "75%", "100%"))
Perceived threat changes since COVID-19
Again, this is probably better displayed divergently. However, it is interesting to see that respondents are more ambivalent about changes in threats. Again, see the comments above about the ‘not applicable’ responses.
# Subset the labels and data for threat changes
labels.imp <- labels_list[15:19]
data.imp<- ranger_3[15:19]
# Recode variable name
names(data.imp)<- labels.imp
# Melt into DF
out <- melt(apply(data.imp, 2, table))
# Add paragraph breaks
out$X2 <- as.factor(sapply( strwrap(out$X2, 55, simplify=FALSE), paste, collapse="\n" ))
labels.imp <- as.factor(sapply( strwrap(labels.imp, 55, simplify=FALSE), paste, collapse="\n" ))
# Remove Question numbers
out$X2 <- trimws(gsub("Q44|Q45|Q46|Q47|Q48", "", out$X2))
labels.imp <- trimws(gsub("Q44|Q45|Q46|Q47|Q48", "", labels.imp))
# Response order
level_order2 <- c("Strongly agree", "Agree", "Disagree","Strongly disagree")
# The plot
ggplot(data = out, aes(x = factor(X2, level = rev(labels.imp)), y = value,
                       fill = factor(X1, level = level_order2))) +
  geom_bar(stat = "identity", position = "fill", width = 0.85) +
  coord_flip() +
  theme(legend.position = c(.1, -.18), legend.direction = "horizontal",
        legend.title = element_blank(), axis.text.y = element_text(size = 11),
        plot.margin = margin(t = 0.05, r = 0.05, b = 1.5, l = 0.05, "cm")) +
  guides(fill = guide_legend(reverse = TRUE)) +
  ylab("Percent") + xlab("Statement") +
  scale_y_continuous(breaks = c(0, .25, 0.5, .75, 1),
                     labels = c("0%", "25%", "50%", "75%", "100%"))
My main point of interest here is how impacts of COVID-19 on conservation activities and resultant changes to threats to biodiversity vary between regions and remote vs non-remote areas (and by experience). For starters, I’m just going to graphically display the predicted probabilities of saying agree or strongly agree to each set of items, within each region.
Impacts of COVID-19 and regions
I’ve not bothered trying to visualise the uncertainty around the estimates, since this is a bit of a headache. Nevertheless, you can see the predicted probabilities of saying agree or strongly agree to each of the impact statements (numbered 1-6, corresponding to the order presented above and in the survey). I’ve added the lines so we can see trends for the same impact across regions.
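The model behind these predicted probabilities is not shown in this section. As a sketch of one way such probabilities could be generated (an assumption: a proportional-odds model from MASS fitted to a single illustrative item, with the item name inferred from the ‘_num’ variable names used later; the actual model may differ):
# Sketch only: proportional-odds (ordinal logistic) model for one illustrative
# impact item by region; not necessarily the model used to produce the plots.
library(MASS)
d <- data.frame(
  item = factor(ranger_3$Q7_NegativeImpacts_LawEnforcementActivity,
                levels = c("Strongly disagree", "Disagree", "Agree", "Strongly agree"),
                ordered = TRUE),          # 'Not applicable' becomes NA and is dropped
  RegionCode = factor(ranger_3$RegionCode))
mod_region <- polr(item ~ RegionCode, data = d, Hess = TRUE)
# Predicted probability of each response level within each region
newdat <- data.frame(RegionCode = levels(d$RegionCode))
cbind(newdat, round(predict(mod_region, newdata = newdat, type = "probs"), 2))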
A couple of take-homes:
In summary, there appear to be important and interesting regional differences. Trying to understand the source of these differences might suggest interesting further analysis. For instance, factors that might influence the perceived impact could include:
Perceived threat changes by regions
Similarly, we can also look at perceived threat changes between regions. Once again, there appears to be a couple of interesting trends:
Once again, asking why these differences exist might reveal some interesting ways to use this data.
Impacts of COVID-19 and remote vs. non-remote
To interpret these, look to see whether there appear to be differences between pairs of response levels between “yes” and “no”; i.e., does “agree” in the “yes” column appear significantly different from “agree” in the “no” column? For the most part, there do not seem to be significant differences in agreement based on remoteness.
Perceived threat changes and remote vs. non-remote
As above, to interpret these, look to see whether there appear to be differences between pairs of response levels between “yes” and “no”. Here, there do appear to be significant differences in perceived threat change between remote and non-remote locations. Those in remote locations are significantly more likely to agree with the statements, and significantly less likely to disagree with them. In other words, those in remote locations perceive greater threat changes than those not in remote locations.
Impacts of COVID-19 and experience
Here we can also examine the relationship between experience and agreement with each impact statement. These plots (accompanied by an inspection of p-values) suggest that the likelihood of agreement doesn’t really change with experience.
Impacts of COVID-19 and RLI
We can also examine the association between the impacts of COVID-19 and the Red List Index score - i.e., are the impacts on management perceived to be worse in places that are highly threatened? For the most part, there do not appear to be clear associations.
Perceived threat changes and RLI
Similarly, we can look at the associations between perceived threat change and RLI. Here, for the most part, it appears that those in countries with a high RLI (meaning few threatened species) are more likely to disagree with the statements saying threats have increased. Conversely, those in countries with a low RLI (meaning a high proportion of threatened species) are more likely to say the threats have increased. In other words, perceived threat change appears to be greatest in areas that are already most threatened.
Hypothesis 1 simply involved looking at the patterns of agreement and disagreement with each statement. However, hypotheses 2 and 3 explore sources of variation in agreement among respondents and between locations, explored through multivariate analysis. Within this multivariate analysis, the response variable could be perceived threat to biodiversity. The explanatory variables could include Red List Index scores, region, remote vs non-remote location, and perceived impact of COVID-19 on management activities. Before fitting the multivariate analysis, we need to decide how to model the response variable.
Perceived threat to biodiversity
We have a number of options to model the perceived threat to biodiversity.
Among these three options, the second is perhaps the easiest to rule in or out as an approach. Doing so involves:
Before proceeding through these steps, let’s create numeric duplicates of the ordinal data.
# Function to convert from ordinal to numeric
ord_num_threat_imp <- function(x){
ifelse(x == "Strongly disagree", 1,
ifelse(x == "Disagree", 2,
ifelse(x == "Agree", 3,
ifelse(x == "Strongly agree", 4, 999))))
}
# Recode threats as numeric
threat_recode <- data.frame(apply(ranger_3[,threat_resp_list ],2,ord_num_threat_imp ))
impact_recode <- data.frame(apply(ranger_3[,impact_resp_list ],2,ord_num_threat_imp ))
# Numeric classifier
names(threat_recode) <- paste0(names(threat_recode), "_num")
names(impact_recode) <- paste0(names(impact_recode), "_num")
# Bind to original dataset
ranger_3 <- cbind(ranger_3, threat_recode)
ranger_3 <- cbind(ranger_3, impact_recode)
# List of variables
threat_resp_list_num <- names(threat_recode)
impact_resp_list_num <- names(impact_recode)
1. Breaking the data into a training and test dataset
We’re splitting the data into training and test data, with a randomly selected 70% of observations used as the training dataset and the remainder used as the test data.
# Choosing the sample size (the seed has been set above, so is reproducible)
smp_size <- floor(.70*nrow(ranger_3))
## Training observations
train_obs <- sample(seq_len(nrow(ranger_3)), size = smp_size)
# Subset to test and training
ranger_train <- ranger_3[train_obs, ]
ranger_test <- ranger_3[-train_obs, ]
2. Exploring the covariance between the threat statements
The figure below describes the polychoric correlation between the threat statements using the training data. There seems to be strong positive correlation between all threats, which is promising.
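The correlation figure and the poly_cor object used below are produced in a chunk not shown here; a minimal sketch, assuming the psych package:
# Sketch: polychoric correlations between the numerically coded threat items in the
# training data; poly_cor$rho is the correlation matrix used for the ordinal alpha below.
library(psych)
poly_cor <- polychoric(ranger_train[, threat_resp_list_num])
round(poly_cor$rho, 2)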
The Ordinal Alpha shows good internal consistency between items.
# Ordinal Alpha
round(alpha(poly_cor$rho)$total,2)
## raw_alpha std.alpha G6(smc) average_r S/N median_r
## 0.92 0.92 0.92 0.7 11.84 0.7
3. Conducting a parallel analysis to assess the number of factors to extract
Parallel analysis using polychoric correlation and a WLS estimator suggests the extraction of 1 factor.
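A sketch of the call that would produce this result (assuming psych::fa.parallel on the training data, with polychoric correlations and weighted least squares factoring):
# Sketch: parallel analysis with polychoric correlations and WLS factoring.
fa.parallel(ranger_train[, threat_resp_list_num], cor = "poly", fm = "wls", fa = "fa")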
## Parallel analysis suggests that the number of factors = 1 and the number of components = NA
4. Conducting an exploratory factor analysis (with the training data)
Next, we conduct the factor analysis, extracting 1 factor and using a WLSMVS estimator with polychoric correlation. We can see all items are heavily and significantly loaded onto a single factor.
# Factor analysis
fa_mod1 <- efaUnrotate(ranger_train[,threat_resp_list_num], nf = 1, estimator = "WLSMVS", ordered = c("Q44S_IllegalHunting_ForSubsistance_Increased_num", "Q45_IllegalHunting_ForProfit_Increased_num" , "Q46_Illegal_Logging_Increased_num","Q47_IllegalEnchroachment_Increased_num", "Q48_Increase_OtherPressures_NTFP_OnPA_num"))
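# 'fa_mod1_out' is created in a chunk not shown above. A plausible reconstruction
# (an assumption; the column positions used in the indexing below may differ) is a
# data frame of parameter estimates from the unrotated EFA, with columns prefixed 'PE.':
fa_mod1_out <- data.frame(PE = parameterEstimates(fa_mod1, standardized = TRUE))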
fa_mod1_out[1:5,c(3,6,9)]
## PE.rhs PE.est PE.pvalue
## 1 Q44S_IllegalHunting_ForSubsistance_Increased_num 0.9537914 0
## 2 Q45_IllegalHunting_ForProfit_Increased_num 0.9397049 0
## 3 Q46_Illegal_Logging_Increased_num 0.8651787 0
## 4 Q47_IllegalEnchroachment_Increased_num 0.8186173 0
## 5 Q48_Increase_OtherPressures_NTFP_OnPA_num 0.7061899 0
The root mean square error of approximation (RMSEA) of this model is 0.128 (95% CI 0.09 - 0.17), the Comparative Fit Index (CFI) is 0.998, and the Tucker–Lewis index (TLI) is 0.996. Unfortunately, the RMSEA indicates poor model fit (despite the high CFI and TLI), which is somewhat unexpected given the above appeared to be going well.
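These statistics come from the model’s fit indices (the extraction call is not shown); for reference, a sketch using lavaan::fitMeasures(), noting that the scaled/robust versions of these indices may be the ones reported for this estimator:
# Sketch: extract the fit indices reported above (the original may use the
# scaled versions, e.g. "rmsea.scaled", for a WLSMV-type estimator).
round(fitMeasures(fa_mod1, c("rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "cfi", "tli")), 3)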
5. Conducting a confirmatory analysis with the test data
Now, let’s do the final step and perform the CFA with the test data (using a WLSMVS estimator with polychoric correlation), using the structure identified above, and inspect the fit statistics.
# The model
model_threat_1 <- '
threat =~ Q44S_IllegalHunting_ForSubsistance_Increased_num + Q45_IllegalHunting_ForProfit_Increased_num + Q46_Illegal_Logging_Increased_num + Q47_IllegalEnchroachment_Increased_num + Q48_Increase_OtherPressures_NTFP_OnPA_num
'
# Now lets run the SEM with the same
fit_model_threat_1 <- lavaan::cfa(model = model_threat_1, estimator = "WLSMVS", data=ranger_test[,threat_resp_list_num], ordered = c("Q44S_IllegalHunting_ForSubsistance_Increased_num", "Q45_IllegalHunting_ForProfit_Increased_num" , "Q46_Illegal_Logging_Increased_num","Q47_IllegalEnchroachment_Increased_num", "Q48_Increase_OtherPressures_NTFP_OnPA_num"))
As above, we can check the fit statistics. The RMSEA of this model is 0.073 (95% CI 0 - 0.15), the CFI is 0.998, and the TLI is 0.996. Again, the RMSEA suggests relatively poor model fit. There might be any number of reasons for this. However, we have seen that region is an important variable associated with threat, so we can see whether the model fits better with its inclusion.
6. Fitting a SEM with the core explanatory variables
# The model
model_threat_2 <- '
# Estimate threat
threat =~ Q44S_IllegalHunting_ForSubsistance_Increased_num + Q45_IllegalHunting_ForProfit_Increased_num + Q46_Illegal_Logging_Increased_num + Q47_IllegalEnchroachment_Increased_num + Q48_Increase_OtherPressures_NTFP_OnPA_num
### Regression part
threat ~
# Region
RegionCode_South_America + RegionCode_Asia + RegionCode_Africa + RegionCode_Europe + RegionCode_Australia_and_Oceania + RegionCode_Central_America_and_Caribbean
'
# Now lets run the SEM with the same
fit_model_threat_2 <- lavaan::sem(model = model_threat_2, estimator = "WLSMVS", data=ranger_train, ordered = c("Q44S_IllegalHunting_ForSubsistance_Increased_num", "Q45_IllegalHunting_ForProfit_Increased_num" , "Q46_Illegal_Logging_Increased_num","Q47_IllegalEnchroachment_Increased_num", "Q48_Increase_OtherPressures_NTFP_OnPA_num"))
This is a massive improvement, and comes well within the range of acceptable fit measures. We can also see if the fit improves through the inclusion of the other variables we believe to be important correlates.
# The model
model_threat_3 <- '
# Estimate threat
threat =~ Q44S_IllegalHunting_ForSubsistance_Increased_num + Q45_IllegalHunting_ForProfit_Increased_num + Q46_Illegal_Logging_Increased_num + Q47_IllegalEnchroachment_Increased_num + Q48_Increase_OtherPressures_NTFP_OnPA_num
### Regression part
threat ~
# Region
RegionCode_South_America + RegionCode_Asia + RegionCode_Africa + RegionCode_Europe + RegionCode_Australia_and_Oceania + RegionCode_Central_America_and_Caribbean +
# Experience
Q5Experince_Number +
# Remote location
Q6__Living_Working_Remotelocation_without_healthcareFacility_Yes +
# RLI
rli
'
# Now lets run the SEM with the same
fit_model_threat_3 <- lavaan::sem(model = model_threat_3, estimator = "WLSMVS", data=ranger_train, ordered = c("Q44S_IllegalHunting_ForSubsistance_Increased_num", "Q45_IllegalHunting_ForProfit_Increased_num" , "Q46_Illegal_Logging_Increased_num","Q47_IllegalEnchroachment_Increased_num", "Q48_Increase_OtherPressures_NTFP_OnPA_num"))
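# 'P_table' is created in a chunk not shown above. A plausible reconstruction (an
# assumption; the column positions used in the c(3,11,8) indexing below may differ)
# is a data frame of parameter estimates with standardised values, columns prefixed
# 'PE.'. The same construction presumably applies to the later models
# (fit_model_threat_4, fit_model_full_1) as well.
P_table <- data.frame(PE = parameterEstimates(fit_model_threat_3, standardized = TRUE))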
# return the coefficients derived from regression
P_table_regre <- P_table[ which(P_table$PE.op == '~'), ]
P_table_regre[,c(3,11,8)]
## PE.rhs
## 6 RegionCode_South_America
## 7 RegionCode_Asia
## 8 RegionCode_Africa
## 9 RegionCode_Europe
## 10 RegionCode_Australia_and_Oceania
## 11 RegionCode_Central_America_and_Caribbean
## 12 Q5Experince_Number
## 13 Q6__Living_Working_Remotelocation_without_healthcareFacility_Yes
## 14 rli
## PE.std.lv PE.pvalue
## 6 1.091619066 0.0000012879604
## 7 0.317924393 0.0626130062680
## 8 0.974085927 0.0000006147344
## 9 -0.462300101 0.0207343759743
## 10 0.280401927 0.2923386771809
## 11 0.321644256 0.3445781633430
## 12 -0.008627366 0.1609070534137
## 13 0.271672667 0.0234664214321
## 14 0.004593891 0.9957724203832
The fit measures are even better. Although we do expect them to change as the model becomes more complex, this is still a relatively simple model. Inspecting the results, it is somewhat surprising that RLI is not associated with the latent threat variable, as was suggested by the bivariate analysis. This might indicate a degree of masking. Let’s repeat the analysis, but without region.
# The model
model_threat_4 <- '
# Estimate threat
threat =~ Q44S_IllegalHunting_ForSubsistance_Increased_num + Q45_IllegalHunting_ForProfit_Increased_num + Q46_Illegal_Logging_Increased_num + Q47_IllegalEnchroachment_Increased_num + Q48_Increase_OtherPressures_NTFP_OnPA_num
### Regression part
threat ~
# Experience
Q5Experince_Number +
# Remote location
Q6__Living_Working_Remotelocation_without_healthcareFacility_Yes +
# RLI
rli
'
# Now lets run the SEM with the same
fit_model_threat_4 <- lavaan::sem(model = model_threat_4, estimator = "WLSMVS", data=ranger_train, ordered = c("Q44S_IllegalHunting_ForSubsistance_Increased_num", "Q45_IllegalHunting_ForProfit_Increased_num" , "Q46_Illegal_Logging_Increased_num","Q47_IllegalEnchroachment_Increased_num", "Q48_Increase_OtherPressures_NTFP_OnPA_num"))
# Return the coefficients derived from regression
P_table_regre <- P_table[ which(P_table$PE.op == '~'), ]
P_table_regre[,c(3,11,8)]
## PE.rhs PE.std.lv
## 6 Q5Experince_Number -0.009440558
## 7 Q6__Living_Working_Remotelocation_without_healthcareFacility_Yes 0.531696919
## 8 rli -2.046765435
## PE.pvalue
## 6 0.14596460600
## 7 0.00001191381
## 8 0.00895001886
This suggests that the inclusion of the region variable did mask the association with RLI. I think we should consider other relevant variables that vary between locations, rather than including region itself (since region is a blunt variable that is likely to obscure the more specific location-level effects, such as RLI, that we might be interested in).
Perceived impacts of COVID-19 on conservation effort
Another key explanatory variable or set of variables is the perceived impact of COVID-19 on conservation efforts. We could include each statement as a variable in its own right. This might offer the advantage of showing which impacts of COVID-19 appear to be more closely associated with changes in threat status. However, these variables might be highly correlated with each other, potentially suggesting they have a common driving latent variable - disruption from COVID-19. Let’s repeat some of the steps above, exploring covariance between the impact variables.
1. Exploring the covariance between the impact statements
This suggests relatively strong covariance between the impact statements. As a result, I’d be inclined not to include each as a separate variable in the full SEM.
2. Conducting a parallel analysis to assess the number of factors to extract
Parallel analysis using polychoric correlation and a WLS estimator again suggests the extraction of 1 factor.
## Parallel analysis suggests that the number of factors = 1 and the number of components = NA
3. Conducting an exploratory factor analysis (with the training data)
Now, let’s do the factor analysis, extracting 1 factor and using a WLSMVS estimator with polychoric correlation. All items are heavily and significantly loaded onto a single factor.
# Factor analysis
fa_mod1 <- efaUnrotate(ranger_train[,impact_resp_list_num], nf = 1, estimator = "WLSMVS", ordered = c("Q7_NegativeImpacts_LawEnforcementActivity_num","Q8_NegativelyImpacted_HabitatManagement_num", "Q9_NegativelyImpacted_WildlifeMonitoringCensus_num","Q10_Negativelyimpacted_CommunityEducationConservation_num","Q11_NegativelyImpacted_HumanWildlifeConflict_num","Q12_NegativeImpacted_Training_CapacityBuilding_num"))
fa_mod1_out[1:5,c(3,6,9)]
## PE.rhs PE.est PE.pvalue
## 1 Q7_NegativeImpacts_LawEnforcementActivity_num 0.8264260 0
## 2 Q8_NegativelyImpacted_HabitatManagement_num 0.8261414 0
## 3 Q9_NegativelyImpacted_WildlifeMonitoringCensus_num 0.8141798 0
## 4 Q10_Negativelyimpacted_CommunityEducationConservation_num 0.7437867 0
## 5 Q11_NegativelyImpacted_HumanWildlifeConflict_num 0.7129135 0
The root mean square error of approximation is not great - it’s higher than we’d ideally like, but not outside the realms of what would be considered an adequate model.
4. Conducting a confirmatory analysis with the test data
Now, let’s perform the CFA with the test data (using a WLSMVS estimator with polychoric correlation).
# The model
model_impact_1 <- '
impact =~ Q7_NegativeImpacts_LawEnforcementActivity_num + Q8_NegativelyImpacted_HabitatManagement_num + Q9_NegativelyImpacted_WildlifeMonitoringCensus_num + Q10_Negativelyimpacted_CommunityEducationConservation_num + Q11_NegativelyImpacted_HumanWildlifeConflict_num + Q12_NegativeImpacted_Training_CapacityBuilding_num
'
# Now lets run the SEM with the same
fit_model_impact_1 <- lavaan::cfa(model = model_impact_1, estimator = "WLSMVS", data=ranger_test[,impact_resp_list_num], ordered = c("Q7_NegativeImpacts_LawEnforcementActivity_num","Q8_NegativelyImpacted_HabitatManagement_num", "Q9_NegativelyImpacted_WildlifeMonitoringCensus_num","Q10_Negativelyimpacted_CommunityEducationConservation_num","Q11_NegativelyImpacted_HumanWildlifeConflict_num","Q12_NegativeImpacted_Training_CapacityBuilding_num"))
The fit measures appear adequate using the test data.
Full SEM
Let’s combine what we’ve done so far, with the latent perceived change in threat as the response variable, and experience, remote vs non-remote, RLI and latent impact of COVID-19 on conservation as explanatory variables. Additionally, I’ve scaled and centred the continuous variables (experience and RLI) so they’re more easily comparable.
# Scale and centre continuous variables
ranger_3$Q5Experince_Number <- scale(ranger_3$Q5Experince_Number, scale = T, center = T)
ranger_3$rli <- scale(ranger_3$rli, scale = T, center = T)
# The model
model_threat_full_1 <- '
### Latent part ###
# Estimate threat
threat =~ Q44S_IllegalHunting_ForSubsistance_Increased_num + Q45_IllegalHunting_ForProfit_Increased_num + Q46_Illegal_Logging_Increased_num + Q47_IllegalEnchroachment_Increased_num + Q48_Increase_OtherPressures_NTFP_OnPA_num
# Estimate impact
impact =~ Q7_NegativeImpacts_LawEnforcementActivity_num + Q8_NegativelyImpacted_HabitatManagement_num + Q9_NegativelyImpacted_WildlifeMonitoringCensus_num + Q10_Negativelyimpacted_CommunityEducationConservation_num + Q11_NegativelyImpacted_HumanWildlifeConflict_num + Q12_NegativeImpacted_Training_CapacityBuilding_num
### Regression part
threat ~
# Experience
Q5Experince_Number +
# Remote location
Q6__Living_Working_Remotelocation_without_healthcareFacility_Yes +
# RLI
rli +
# Impact
impact
'
# Now lets run the SEM with the same
fit_model_full_1 <- lavaan::sem(model = model_threat_full_1, estimator = "WLSMVS", data=ranger_3, ordered = c("Q44S_IllegalHunting_ForSubsistance_Increased_num", "Q45_IllegalHunting_ForProfit_Increased_num" , "Q46_Illegal_Logging_Increased_num","Q47_IllegalEnchroachment_Increased_num", "Q48_Increase_OtherPressures_NTFP_OnPA_num" ))
The fit measures indicate good model fit. Let’s look at the results, focusing just on the regression part. These results conform to our expectations, as discussed below.
# return the coefficients derived from regression
P_table_regre <- P_table[ which(P_table$PE.op == '~'), ]
P_table_regre[,c(3,11,8)]
## PE.rhs PE.std.lv
## 12 Q5Experince_Number -0.1184627
## 13 Q6__Living_Working_Remotelocation_without_healthcareFacility_Yes 0.4713914
## 14 rli -0.1698734
## 15 impact 0.3490911
## PE.pvalue
## 12 0.0129160850631611090
## 13 0.0000021477376701551
## 14 0.0004542515939762737
## 15 0.0000000000003670397
We can also visualise the SEM, as shown below (this can be neatened up for publication, with more informative labels etc.).
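One way to draw such a path diagram (a sketch, assuming the semPlot package; the actual plotting code is not shown here):
# Sketch: path diagram of the full SEM with standardised estimates.
library(semPlot)
semPaths(fit_model_full_1, what = "std", layout = "tree", edge.label.cex = 0.8)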
Based on this provisional analysis, we can explore the results in relation to our original hypotheses.
1. Threats to biodiversity are perceived to have increased since the start of COVID-19
The figure under the section ‘Perceived threat changes since COVID-19’ suggests a rough balance between those who agreed and those who disagreed with statements about increased risk of illegal activities. However, it appears a higher proportion believed there was an increase in other pressures. Furthermore, there is likely to be variation in these perceptions within the conservation community, as discussed next.
2. The perceived change in threats to biodiversity are not homogeneous, and vary between individuals and locations
Those with more experience and those living in countries with higher RLI scores appear more likely to report lower perceived changes in threat. Conversely, those in remote areas and those who report higher perceived impacts of COVID-19 on management activities are more likely to report greater perceived threat changes. Region was excluded in a post hoc model adjustment, because it appeared to covary with RLI score.
3. Threats to biodiversity are perceived to have increased in places where COVID-19 is perceived to have disrupted management activities
Those who report higher perceived impacts of COVID-19 on management activities also appear more likely to report greater perceived threat changes.