Repeated the analysis at http://rpubs.com/Chopin/412588, but without spatial position information (lon and lat).
Without lon and lat, P and %N met the requirement of 5% relative influence in the BRT models.
Overall, the models had similar or marginally worse predictive power than the models with spatial information.
The major difference was that fewer covariates were significant in these models.
Redid the variable selection analysis from http://rpubs.com/Chopin/412588, but with lon and lat excluded. P and perc_N (N_perc in the code below) had a relative influence of at least 5% in at least one of the BRT models, so both were added. The final variables used are below.

    library(dplyr)

    # Keep the first 10 columns plus the selected environmental covariates
    fin_occ_df <- dat_occ %>%
      select(1:10,
             ph_et_al,
             salt,
             carbon,
             Ca,
             Q_cover,
             elevation,
             percent_over1,
             percent_over2,
             P,
             N_perc,
             aspect,
             texture,
             corr_dN,
             drainage)

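The 5% relative-influence rule used to pick the covariates can be sketched as below. This is only an illustration: the tables and numbers are made up, the names `rel_inf` and `keep_influential` are hypothetical, and it assumes each BRT fit yields a table with columns `var` and `rel.inf` (the layout that, for example, `summary()` on a gbm fit returns).

```r
# Hypothetical relative-influence tables, one per BRT model.
# The values are invented for illustration only.
rel_inf <- list(
  species_a = data.frame(var = c("carbon", "P", "N_perc", "aspect"),
                         rel.inf = c(60, 30, 6, 4)),
  species_b = data.frame(var = c("carbon", "P", "N_perc", "aspect"),
                         rel.inf = c(85, 4, 8, 3))
)

# Keep every variable whose relative influence reaches the threshold
# in at least one of the models.
keep_influential <- function(tables, threshold = 5) {
  all_rows <- do.call(rbind, tables)
  max_inf <- tapply(all_rows$rel.inf, all_rows$var, max)
  names(max_inf)[max_inf >= threshold]
}

selected <- keep_influential(rel_inf)
print(selected)
```

Here `aspect` is dropped because it never reaches 5% in either table, while `P` is kept even though it falls below 5% in one of them.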
The model fit looked good with scaled covariates.
The above model was used to predict onto the same data it was fitted to. The AUC values were all close to one, suggesting very good predictive power, although this is an in-sample measure and so likely to be optimistic.
| Species | AUC |
|---|---|
| R_burtoniae | 1.000 |
| R_comptonii | 1.000 |
| D_diversifolium | 0.999 |
| A_delaetii | 1.000 |
| A_fissum | 1.000 |
| A_framesii | 0.999 |
| C_spissum | 1.000 |
| C_staminodiosum | 0.988 |
| Dicrocaulon_sp | 1.000 |
| Oophytum_sp | 0.997 |
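For reference, AUC values like those in the table can be computed from ranks alone. The function below is a minimal base R sketch, not necessarily the routine used in this analysis; `auc`, `obs` and `pred` are hypothetical names and the example data are made up. It returns `NA` when the test data contain only presences or only absences, since AUC is undefined in that case.

```r
# Rank-based AUC, equivalent to the Mann-Whitney U statistic scaled to [0, 1].
# observed: 0/1 presence-absence; predicted: predicted probability of presence.
auc <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  pos <- observed == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)  # undefined without both classes
  r <- rank(predicted)                            # ties get average ranks
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up example: 3 of the 4 presence/absence pairs are ranked correctly.
obs  <- c(0, 0, 1, 1)
pred <- c(0.1, 0.4, 0.35, 0.8)
print(auc(obs, pred))
```

An AUC of 0.5 corresponds to random ranking, and values below 0.5 mean presences tend to receive lower predicted probabilities than absences.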
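Let’s use sites 2 and 3 to predict site 1. Some species are predicted really well (AUC above 0.9 for R_burtoniae, A_delaetii and Oophytum_sp), others very badly (AUC below 0.5, i.e. worse than random, for D_diversifolium, A_fissum, C_spissum and Dicrocaulon_sp).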
| Species | AUC |
|---|---|
| R_burtoniae | 0.911 |
| R_comptonii | 0.707 |
| D_diversifolium | 0.357 |
| A_delaetii | 0.910 |
| A_fissum | 0.415 |
| A_framesii | 0.665 |
| C_spissum | 0.357 |
| C_staminodiosum | 0.729 |
| Dicrocaulon_sp | 0.359 |
| Oophytum_sp | 0.917 |
Now sites 1 and 2 on 3. Predictions are much worse than they were. A_framesii is not included as it didn’t occur in site 3. It’s a bit strange that the AUC values are exactly the same as they were for the last analysis, but I can’t find an error in the code.
| Species | AUC |
|---|---|
| R_burtoniae | 0.851 |
| R_comptonii | 0.518 |
| D_diversifolium | 0.426 |
| A_delaetii | 0.575 |
| A_fissum | 0.417 |
| C_spissum | 0.492 |
| C_staminodiosum | 0.725 |
| Dicrocaulon_sp | 0.486 |
| Oophytum_sp | 0.660 |
Now sites 1 and 3 on 2. Predictions are much worse than they were. R_comptonii and C_staminodiosum are not included as they didn’t occur in site 2.
| Species | AUC |
|---|---|
| R_burtoniae | 0.943 |
| D_diversifolium | 0.323 |
| A_delaetii | 0.682 |
| A_fissum | 0.078 |
| A_framesii | 0.526 |
| C_spissum | 0.402 |
| Dicrocaulon_sp | 0.635 |
| Oophytum_sp | 0.650 |
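The site-wise hold-out used in the three tables above can be sketched as a loop over sites. This is an illustration under stated assumptions rather than the code behind the tables: the data are simulated, the layout of 50 plots per site is assumed, a single hypothetical covariate stands in for the full set, and a plain binomial `glm` is swapped in for the actual model so the example runs with base R only.

```r
set.seed(1)

# Simulated stand-in data: 3 sites of 50 plots (assumed layout), one covariate,
# and one species whose presence depends on that covariate.
n <- 150
dat <- data.frame(site = rep(1:3, each = 50), carbon = rnorm(n))
dat$present <- rbinom(n, 1, plogis(1.5 * dat$carbon))

# Rank-based AUC; NA if the test data lack presences or absences.
auc <- function(observed, predicted) {
  pos <- observed == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  if (n_pos == 0 || n_neg == 0) return(NA_real_)
  (sum(rank(predicted)[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hold out each site in turn, fit on the other two, predict the held-out site.
sites <- sort(unique(dat$site))
site_auc <- sapply(sites, function(s) {
  train <- dat[dat$site != s, ]
  test  <- dat[dat$site == s, ]
  fit   <- glm(present ~ carbon, family = binomial, data = train)  # stand-in model
  pred  <- predict(fit, newdata = test, type = "response")
  auc(test$present, pred)
})
names(site_auc) <- paste0("held_out_site_", sites)
print(round(site_auc, 3))
```

The `NA` branch in `auc` mirrors the reason some species are missing from the tables: a species that does not occur in the held-out site has no presences to rank.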
Let’s try randomly selecting 100 plots and predicting onto the remaining 50. For most species this predicts a lot better than the site-by-site approach.
| Species | AUC |
|---|---|
| R_burtoniae | 0.870 |
| R_comptonii | 0.960 |
| D_diversifolium | 0.778 |
| A_delaetii | 0.836 |
| A_fissum | 0.435 |
| A_framesii | 0.833 |
| C_spissum | 0.619 |
| C_staminodiosum | 0.780 |
| Dicrocaulon_sp | 0.841 |
| Oophytum_sp | 0.951 |
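The random 100/50 split can be sketched as follows. The plot table is a made-up placeholder and the 50-plots-per-site layout is an assumption; only the split sizes come from the text above.

```r
set.seed(42)

# Placeholder plot table: 150 plots, assumed to be 50 per site.
n_plots <- 150
plots <- data.frame(plot_id = seq_len(n_plots),
                    site = rep(1:3, each = 50))

# Draw 100 plots at random for fitting; the remaining 50 are used for prediction.
train_idx <- sample(n_plots, 100)
train <- plots[train_idx, ]
test  <- plots[-train_idx, ]

# Site composition of each part of the split.
print(table(train$site))
print(table(test$site))
```

Unlike the site-wise hold-out, a random split like this generally puts plots from every site into both the training and the test set, which is worth keeping in mind when comparing the two sets of AUC values.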
Overall then, there is some indication that species distributions are deterministic, i.e. at least partly predictable from the environmental covariates.
Boral can partition variance into that explained by the environment and that explained by latent variables. However, this seems quite sensitive to the data used. The full model is shown on the left, and the model fitted to the randomly chosen plots on the right.
Using the full model. Correlation due to environment on the left, correlation due to latent variables on the right.
Using the random model.
Full model