I used the call population WUG5 as the main population for experimentation, because the dadi and Tajima's D results clearly show a strong expansion for it, so I expect ABC to detect it as well. Later I had a quick look at whether the different algorithms perform similarly in the other populations. The main task was model selection, so I focused on that; parameter estimation will follow later with the best set of methods. I also start here with the unmasked data, so no missing data were introduced into the simulated data and no filtering was applied to it.
To get an idea of how well the real (target) data and the simulated data match in their summary statistics, I first do a PCA of all the summary statistics.
It shows that the summary statistics of the target are not completely
off from the simulated data. So in general, we simulated scenarios that
probably have at least some similarity to the underlying demographic
process. Also, as expected, the target falls within the summary
statistics simulated from the expansion scenario. On a less positive
note, there is a large overlap in summary statistics between the three
scenarios expansion, contraction and neutral. The loadings also
indicate little variety between the summary statistics, or more
precisely, many of them are highly collinear. This means they largely
describe the same aspects of the data, which is not ideal. There are
also a lot of outliers in the contraction simulations, which make up a
lot of the variation in the data set. This issue is addressed later by
running fewer simulations with extreme prior values.
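A PCA like the one described above can be sketched in base R as follows. This is a minimal sketch on toy data, not the code behind the figure: the object names (`sumstat`, `model`, `target`) and the four placeholder statistics are assumptions. It fits the PCA on the simulations only and then projects the target into the same space, so that it can be compared with the three scenarios.

```r
set.seed(1)

# Toy stand-in for the simulations: 3 scenarios x 100 simulations,
# 4 placeholder summary statistics (hypothetical names s1..s4).
model <- rep(c("con", "exp", "neu"), each = 100)
shift <- c(con = -1, exp = 1, neu = 0)[model]
sumstat <- cbind(
  s1 = rnorm(300, mean = shift),
  s2 = rnorm(300, mean = shift),
  s3 = rnorm(300),
  s4 = rnorm(300)
)
target <- c(s1 = 1.2, s2 = 0.9, s3 = 0.1, s4 = -0.2)

# PCA on the simulations, centred and scaled so that no statistic
# dominates just because of its units.
pca <- prcomp(sumstat, center = TRUE, scale. = TRUE)

# Project the target into the PC space of the simulations.
target_pc <- predict(pca, newdata = t(target))

# Loadings: one row per summary statistic, one column per component.
round(pca$rotation, 2)

# Score plot with the target on top (only drawn in an interactive session).
if (interactive()) {
  cols <- c(con = "firebrick", exp = "steelblue", neu = "grey50")
  plot(pca$x[, 1:2], col = cols[model], pch = 16, cex = 0.6)
  points(target_pc[, 1], target_pc[, 2], pch = 8, cex = 2)
}
```

Statistics with nearly identical loadings on the first components carry largely the same information, which is the collinearity problem described above.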
I first went with the ABC random forest algorithm that I have used for all the bat experiments so far. It has the advantage of being fairly robust to the choice of summary statistics. In theory, the high collinearity of the summary statistics shouldn’t be a large issue for this method.
First, I do cross validation, which treats randomly selected simulations as the target to see how well the model selection works in general.
## Warning in lda.default(x, grouping, ...): variables are collinear
## con exp neu class.error
## con 415 225 295 0.5561497
## exp 188 410 337 0.5614973
## neu 221 262 452 0.5165775
A classification error of around 0.5 is obviously not very good (random guessing among three equally sized models would give about 0.67), but at least the correct model is the most frequent prediction for each of them, and no two models are completely indistinguishable. So I would say the resolution is poor and many cases can’t be distinguished, but there is some signal. This is likely due to the many simulations that overlap in their summary statistics, as shown in the PCA.
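The `class.error` column can be recomputed from the counts in the confusion matrix, which also gives an overall error rate. The sketch below copies the counts from the output above and assumes that rows are the true models and columns the predicted ones.

```r
# Counts copied from the random forest confusion matrix above.
# Assumption: rows = true model, columns = predicted model.
models <- c("con", "exp", "neu")
conf <- matrix(
  c(415, 225, 295,
    188, 410, 337,
    221, 262, 452),
  nrow = 3, byrow = TRUE,
  dimnames = list(true = models, predicted = models)
)

# Per-model error: share of simulations of a model assigned to another model.
class_error <- 1 - diag(conf) / rowSums(conf)
round(class_error, 4)

# Overall error across all simulations.
overall_error <- 1 - sum(diag(conf)) / sum(conf)
round(overall_error, 4)
```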
Next, I attempt the model selection with the real target using the random forest.
## selected model votes model1 votes model2 votes model3 post.proba
## 1 neu 111 98 791 0.7423667
The neutral model was selected with a posterior probability of ~74 % (and 791 of the 1000 votes). The other two models, contraction and expansion, received a very similar number of votes. This is likely not correct, and doing the same with other populations and species shows that the random forest algorithm always tends to favor the neutral model. Looking again at the PCA, the neutral model seems to sit in the middle between the two other models. My hypothesis is therefore that the random forest is bad at differentiating a target from the “middle” of the summary statistics, even if it is clearly different from it, like the target of call WUG5 here. If it can’t differentiate the target from this “middle”, it will always pick neutral, because the neutral model is in the majority there.
The collinearity of the summary statistics may also play a role, but reducing or transforming them wasn’t enough to get a better result. I will explain later how I chose the summary statistics, but even with a more suitable choice, the random forest still picked the neutral model, now with a low posterior probability of ~30 %. So at least it figured out by itself that it’s wrong.
Due to this observation and some further testing, I decided to try the more classical ABC methods, which always include a rejection step. This means these algorithms first exclude all simulations whose summary statistics are too far away from the target. My hope is that this rejection step removes the “middle” simulations from the prediction and then produces an accurate model selection for sufficiently extreme targets. In the most classical ABC, the best model is the one that occurs most often among the accepted (not rejected) simulations. The R package used for this also implements two extensions of this baseline algorithm: one where a multinomial logistic regression predicts the best model from the accepted simulations, and one where a neural network does this. I tried all three and found that the logistic regression variant gives the most accurate and clearest results in the cross validation and in the model selection for WUG5.
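The baseline rejection step can be written in a few lines of base R. This is only a sketch of the idea on toy data, not the package's implementation: the function name `reject_model_choice`, the scaling by the median absolute deviation and the Euclidean distance are assumptions made for illustration.

```r
set.seed(2)

# Toy simulations: 500 per model, two placeholder summary statistics.
model <- rep(c("con", "exp", "neu"), each = 500)
centre <- c(con = -1, exp = 1, neu = 0)[model]
sumstat <- cbind(s1 = rnorm(1500, centre), s2 = rnorm(1500, centre / 2))
target <- c(s1 = 1.5, s2 = 0.8)

# Hypothetical helper: rejection-only ABC model choice.
reject_model_choice <- function(target, model, sumstat, tol) {
  # Put all statistics on a comparable scale (assumed: MAD scaling).
  scale_by <- apply(sumstat, 2, mad)
  z_sim <- sweep(sumstat, 2, scale_by, "/")
  z_tar <- target / scale_by

  # Euclidean distance of every simulation to the target.
  dist <- sqrt(rowSums(sweep(z_sim, 2, z_tar, "-")^2))

  # Keep the closest tol * n simulations and reject the rest.
  n_acc <- ceiling(tol * length(dist))
  accepted <- order(dist)[seq_len(n_acc)]

  # Posterior model probability = share of each model among the accepted.
  table(factor(model[accepted], levels = unique(model))) / n_acc
}

post <- reject_model_choice(target, model, sumstat, tol = 0.05)
post
names(which.max(post))
```

The regression and neural network variants start from the same accepted set and then model how the model label depends on the summary statistics within it.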
Before I show the results, there are two caveats with the classical ABC. The first is that it is highly dependent on the choice of summary statistics. Given their high collinearity, I had to reduce them to a set that still represents the different scenarios well. I used the PCA and pairwise correlation tests to find a set that doesn’t contain too little information but reduces the collinearity to a minimum. Here is a PCA of the chosen set to show the effect of the reduction.
The chosen set is the number of sites fixed for the alternative allele
(fs), the number of sites fixed for the reference allele (sfs_0), the
rate of heterozygosity (he), and Tajima’s D (TD).
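One simple way to reduce collinearity is a greedy filter on pairwise correlations. The sketch below is an illustration on toy data and not the exact procedure used here (the actual choice also relied on the PCA); the helper name `prune_collinear` and the cutoff of 0.9 are assumptions.

```r
set.seed(3)

# Toy statistics: a, b and d are almost the same signal, c and e are independent.
n <- 500
base <- rnorm(n)
sumstat <- cbind(
  a = base + rnorm(n, sd = 0.05),
  b = base + rnorm(n, sd = 0.05),   # near-duplicate of a
  c = rnorm(n),
  d = -base + rnorm(n, sd = 0.05),  # strongly anti-correlated with a
  e = rnorm(n)
)

# Hypothetical helper: walk through the statistics in order and keep one
# only if its absolute correlation with every kept statistic is below the cutoff.
prune_collinear <- function(x, cutoff = 0.9) {
  r <- abs(cor(x))
  keep <- character(0)
  for (nm in colnames(x)) {
    if (all(r[nm, keep] < cutoff)) keep <- c(keep, nm)
  }
  keep
}

kept <- prune_collinear(sumstat)
kept
```

The result depends on the column order, so the statistics one would rather keep for biological reasons should come first.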
With these, I again did cross validation for this algorithm, but with different tolerances. The tolerance is the proportion of simulations closest to the target that the algorithm accepts. I used 1, 2, and 5 %.
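For orientation, this is roughly what the cross validation and model selection calls look like with the `abc` package, following the `Call:` lines printed in this report. It is a hedged sketch on toy data: the toy simulations, `nval = 10` and the tolerances are assumptions, and the block only runs the package code if `abc` is installed.

```r
set.seed(4)

# Toy simulations: 300 per model, three placeholder summary statistics.
model <- rep(c("con", "exp", "neu"), each = 300)
centre <- c(con = -1, exp = 1, neu = 0)[model]
sumstat <- cbind(
  s1 = rnorm(900, centre),
  s2 = rnorm(900, centre / 2),
  s3 = rnorm(900)
)
tar <- c(s1 = 1.0, s2 = 0.5, s3 = 0)

fit <- NULL
if (requireNamespace("abc", quietly = TRUE)) {
  suppressPackageStartupMessages(library(abc))

  # Leave-one-out style cross validation: nval simulations per model are
  # treated as pseudo-targets for each tolerance.
  cv <- cv4postpr(index = model, sumstat = sumstat, nval = 10,
                  tols = c(0.05, 0.1), method = "mnlogistic")
  summary(cv)

  # Model selection for the target: rejection step plus multinomial
  # logistic regression on the accepted simulations.
  fit <- postpr(target = tar, index = model, sumstat = sumstat,
                tol = 0.1, method = "mnlogistic")
  summary(fit)
}
```

Replacing `method = "mnlogistic"` with `"rejection"` or `"neuralnet"` selects the other two variants mentioned above.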
## Warning: There are 3 models but only 1 for which simulations have been accepted.
## No regression is performed, method is set to rejection.
## Consider increasing the tolerance rate.TRUE
## Warning: There are 3 models but only 2 for which simulations have been accepted.
## No regression is performed, method is set to rejection.
## Consider increasing the tolerance rate.TRUE
## Warning in matrix(pred[pred != 0], nmod, nmod, byrow = T): data length [2] is
## not a sub-multiple or multiple of the number of rows [3]
## Warning in matrix(pred[pred != 0], nmod, nmod, byrow = F): data length [2] is
## not a sub-multiple or multiple of the number of rows [3]
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 39 31 30
## exp 23 46 31
## neu 25 34 41
##
## $tol0.02
## con exp neu
## con 43 27 30
## exp 16 39 45
## neu 24 27 49
##
## $tol0.05
## con exp neu
## con 33 29 38
## exp 22 39 39
## neu 21 31 48
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4432 0.2774 0.2793
## exp 0.2495 0.4338 0.3167
## neu 0.2788 0.3415 0.3797
##
## $tol0.02
## con exp neu
## con 0.4369 0.2848 0.2783
## exp 0.2692 0.4000 0.3308
## neu 0.2930 0.3303 0.3767
##
## $tol0.05
## con exp neu
## con 0.4300 0.2809 0.2892
## exp 0.2798 0.3920 0.3282
## neu 0.3008 0.3291 0.3701
The results are similarly bad to the random forest results, even a bit worse, to be honest. An interesting observation, though, is that the cross validation seems to get slightly better with smaller tolerances. However, it also gives a lot of warnings to increase the tolerance, because too few simulations are accepted (often from only one model) to do the second step of the algorithm, the regression. This shows the second caveat of the classical ABC: it depends much more on the number of simulations. We later added more simulations to address this and to be able to choose a lower tolerance.
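To compare the cross validation results on one number, the overall accuracy can be computed from the confusion matrices. The sketch below copies the counts from the output above; the assumption is again that rows are the true models and columns the predicted ones.

```r
models <- c("con", "exp", "neu")
as_cm <- function(x) {
  matrix(x, nrow = 3, byrow = TRUE, dimnames = list(true = models, predicted = models))
}

# ABC (mnlogistic) cross validation, counts copied from the output above.
cm_abc <- list(
  tol0.01 = as_cm(c(39, 31, 30,  23, 46, 31,  25, 34, 41)),
  tol0.02 = as_cm(c(43, 27, 30,  16, 39, 45,  24, 27, 49)),
  tol0.05 = as_cm(c(33, 29, 38,  22, 39, 39,  21, 31, 48))
)

# Random forest confusion matrix from the earlier section.
cm_rf <- as_cm(c(415, 225, 295,  188, 410, 337,  221, 262, 452))

# Accuracy = correctly classified pseudo-targets / all pseudo-targets.
accuracy <- function(m) sum(diag(m)) / sum(m)
acc_abc <- sapply(cm_abc, accuracy)
acc_rf <- accuracy(cm_rf)

round(c(acc_abc, random_forest = acc_rf), 3)
```

This gives 0.42, 0.437 and 0.40 for the three tolerances against 0.455 for the random forest, so all of them are only a little better than the 0.33 expected from guessing.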
With that, next comes the actual model selection for the real target, call WUG5. Here I used a tolerance of 5 %, because this was the lowest I could go with the roughly 1000 simulations per model used initially.
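To see why small tolerances run out of accepted simulations, it helps to translate the tolerance into a count. The sketch below assumes 935 simulations per model (the row sums of the random forest confusion matrix above) and assumes that the number of accepted simulations is the tolerance times the total, rounded up. Both are assumptions for illustration.

```r
n_per_model <- 935          # assumed: row sums of the random forest confusion matrix
n_total <- 3 * n_per_model  # three models: con, exp, neu
tols <- c(0.01, 0.02, 0.05)

# Assumed rule: accept the closest ceiling(tol * n_total) simulations.
n_accepted <- ceiling(tols * n_total)
names(n_accepted) <- tols
n_accepted
```

With 5 % this gives 141, which matches the 141 posterior samples reported for WUG5. With 1 % fewer than 30 simulations are left to be shared between three models, so it is not surprising that often only one model survives the rejection step.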
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.05,
## method = "mnlogistic")
## Data:
## postpr.out$values (141 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3617 0.3262 0.3121
##
## Bayes factors:
## con exp neu
## con 1.0000 1.1087 1.1591
## exp 0.9020 1.0000 1.0455
## neu 0.8627 0.9565 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.2454 0.4026 0.3519
##
## Bayes factors:
## con exp neu
## con 1.0000 0.6096 0.6974
## exp 1.6404 1.0000 1.1440
## neu 1.4339 0.8741 1.0000
The result shows that the first step of the algorithm, the rejection, accepted roughly the same number of simulations from each model, with slightly more from the contraction model. Overall, 141 simulations were accepted. The second step was then able to correctly predict the expansion model for this target, but not with a lot of confidence. What I like about the results from this package is that it also calculates Bayes factors, which are very helpful for interpreting the results. They show that the expansion model is 1.6 times more likely to be correct for this target than the contraction model, but only 1.1 times more likely than the neutral model.
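The Bayes factors in the output are the pairwise ratios of the posterior model probabilities. That ratio equals the Bayes factor only when the models have equal prior probability, which is assumed here because each model has the same number of simulations. The sketch below recomputes the table from the rounded posterior probabilities for WUG5, so it agrees with the printed values only up to rounding.

```r
# Posterior model probabilities (mnlogistic) for WUG5, copied from the output above.
post <- c(con = 0.2454, exp = 0.4026, neu = 0.3519)

# Entry [i, j] is the support for model i relative to model j.
bf <- outer(post, post, "/")
round(bf, 4)

# Support for expansion over the two alternatives.
round(bf["exp", c("con", "neu")], 2)
```

As a rough guide, values this close to 1 are usually read as weak evidence.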
There were initially two ways to create simulations that are comparable to the target data. The first was to simulate data sets with roughly the same number of SNPs as the filtered target data set (no_mask). The second was to simulate data sets with roughly the same number of SNPs as the unfiltered (?) target data set, introduce artificial missingness by randomly (?) masking genotypes, and then filter for different proportions of missingness (?) (mask). The question is which approach is more realistic and therefore hopefully better at accurate model selection.
A good way to see how well the simulated summary statistics fit the target summary statistics is to look at a PCA biplot. Below are two of them side by side for each of the call populations, one from the mask approach and one from the no_mask approach. It’s a lot of plots to look at, so I collapsed them.
Looking at them, it seems the masking always pushes the target further in the direction of contraction. Another indication I was curious about is the size of the data sets: which of the approaches leads to data sets more similar in size to the real data? Here is an overview for the call populations again, showing the number of SNPs in the targets and the mean number of SNPs in the simulations of all three demographic scenarios.
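The mask idea can be illustrated on a toy genotype matrix. This is a hypothetical sketch, not the pipeline used for the simulations: it assumes genotypes coded 0/1/2 with individuals in rows and SNPs in columns, masking that is completely random, and a per-site missingness filter. The helper names and the rates are made up for illustration.

```r
set.seed(5)

# Toy genotype matrix: 20 individuals x 1000 SNPs, coded as 0/1/2.
n_ind <- 20
n_snp <- 1000
geno <- matrix(rbinom(n_ind * n_snp, size = 2, prob = 0.3), nrow = n_ind)

# Hypothetical helper: set each genotype to NA with probability miss_rate.
mask_genotypes <- function(geno, miss_rate) {
  masked <- geno
  masked[runif(length(geno)) < miss_rate] <- NA
  masked
}

# Hypothetical helper: keep only sites with at most max_missing missing genotypes.
filter_sites <- function(geno, max_missing) {
  site_missing <- colMeans(is.na(geno))
  geno[, site_missing <= max_missing, drop = FALSE]
}

masked <- mask_genotypes(geno, miss_rate = 0.2)
filtered <- filter_sites(masked, max_missing = 0.25)

# Number of SNPs before and after masking + filtering.
c(before = ncol(geno), after = ncol(filtered))
```

Because filtering removes sites, the number of SNPs to simulate before masking has to be larger than the number wanted afterwards.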
| | GH | WUG1 | WUG2 | WUG3 | WUG4 | WUG5 |
|---|---|---|---|---|---|---|
| target | 1549.000 | 7236.000 | 8732.00 | 6336.000 | 6018.000 | 12344.00 |
| mask_con | 13426.177 | 31453.085 | 25704.25 | 8796.583 | 22028.267 | 14929.48 |
| mask_exp | 8577.369 | 17799.025 | 15152.86 | 6446.554 | 13926.624 | 11491.44 |
| mask_neu | 8822.756 | 19822.839 | 16360.98 | 6888.942 | 13848.016 | 11521.73 |
| no_mask_con | 11995.624 | 17883.867 | 18885.73 | 10434.312 | 16057.820 | 15550.39 |
| no_mask_exp | 6990.708 | 9387.704 | 10537.31 | 7126.544 | 9193.808 | 10128.25 |
| no_mask_neu | 7499.442 | 10364.788 | 11247.07 | 7592.429 | 9571.133 | 10769.53 |
The target has fewer SNPs than the simulations in almost all cases (the exceptions are the expansion and neutral simulations for WUG5), and mask in most cases has higher numbers of SNPs than no_mask. This is still a bit inconclusive, so I generated results from both methods. Here is an overview of the model selection for pall and call together with some more details about the data sets. The last six columns show the posterior probabilities of the model selection.
From these results, and the fact that the number of SNPs is often closer to the target, the no_mask simulations seemed to be the better fit for the analysis.
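The comparison of data set sizes can be made explicit by checking, for every population and scenario, which approach has the smaller gap to the target. The sketch below uses the numbers from the table above; the absolute difference as the measure of closeness is an assumption.

```r
pops <- c("GH", "WUG1", "WUG2", "WUG3", "WUG4", "WUG5")
target <- c(1549, 7236, 8732, 6336, 6018, 12344)

# Mean number of SNPs per scenario, copied from the table above.
mask <- rbind(
  con = c(13426.177, 31453.085, 25704.25, 8796.583, 22028.267, 14929.48),
  exp = c(8577.369, 17799.025, 15152.86, 6446.554, 13926.624, 11491.44),
  neu = c(8822.756, 19822.839, 16360.98, 6888.942, 13848.016, 11521.73)
)
no_mask <- rbind(
  con = c(11995.624, 17883.867, 18885.73, 10434.312, 16057.820, 15550.39),
  exp = c(6990.708, 9387.704, 10537.31, 7126.544, 9193.808, 10128.25),
  neu = c(7499.442, 10364.788, 11247.07, 7592.429, 9571.133, 10769.53)
)
colnames(mask) <- colnames(no_mask) <- pops

# Absolute gap between simulated mean and target, per scenario and population.
gap_mask <- abs(sweep(mask, 2, target, "-"))
gap_no_mask <- abs(sweep(no_mask, 2, target, "-"))

# TRUE where no_mask is closer to the target than mask.
no_mask_closer <- gap_no_mask < gap_mask
no_mask_closer
colSums(no_mask_closer)
```

On these numbers no_mask is closer to the target in four of the six populations (all but WUG3 and WUG5), and this holds for all three scenarios.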
After choosing the ABC rejection + regression algorithm and a set of summary statistics, generating more simulations for no_mask, and tweaking the priors a little, I started doing the model selection for all of the populations and species.
Here are all the results and how they were generated in detail, including cross validation and other things. Just expand what you are interested in. At the end is a summary table.
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 42 26 32
## exp 17 53 30
## neu 32 27 41
##
## $tol0.02
## con exp neu
## con 40 27 33
## exp 15 48 37
## neu 31 30 39
##
## $tol0.05
## con exp neu
## con 40 27 33
## exp 17 47 36
## neu 27 35 38
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4192 0.2713 0.3095
## exp 0.2042 0.4733 0.3225
## neu 0.3146 0.2855 0.3999
##
## $tol0.02
## con exp neu
## con 0.4449 0.2697 0.2853
## exp 0.2094 0.4593 0.3312
## neu 0.3027 0.3192 0.3782
##
## $tol0.05
## con exp neu
## con 0.4495 0.2572 0.2933
## exp 0.2347 0.4356 0.3297
## neu 0.2902 0.3416 0.3682
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (113 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3097 0.3717 0.3186
##
## Bayes factors:
## con exp neu
## con 1.0000 0.8333 0.9722
## exp 1.2000 1.0000 1.1667
## neu 1.0286 0.8571 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.4461 0.2709 0.2830
##
## Bayes factors:
## con exp neu
## con 1.0000 1.6467 1.5763
## exp 0.6073 1.0000 0.9573
## neu 0.6344 1.0446 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 64 14 22
## exp 24 44 32
## neu 20 30 50
##
## $tol0.02
## con exp neu
## con 53 18 29
## exp 23 49 28
## neu 25 22 53
##
## $tol0.05
## con exp neu
## con 57 18 25
## exp 24 38 38
## neu 24 19 57
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.6321 0.1414 0.2265
## exp 0.2197 0.4500 0.3303
## neu 0.2101 0.3220 0.4679
##
## $tol0.02
## con exp neu
## con 0.5355 0.2041 0.2604
## exp 0.2575 0.4560 0.2866
## neu 0.2881 0.2496 0.4623
##
## $tol0.05
## con exp neu
## con 0.5331 0.2163 0.2506
## exp 0.2569 0.4120 0.3312
## neu 0.2678 0.2849 0.4473
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (115 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.2000 0.3217 0.4783
##
## Bayes factors:
## con exp neu
## con 1.0000 0.6216 0.4182
## exp 1.6087 1.0000 0.6727
## neu 2.3913 1.4865 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.1343 0.0983 0.7675
##
## Bayes factors:
## con exp neu
## con 1.0000 1.3660 0.1749
## exp 0.7320 1.0000 0.1281
## neu 5.7163 7.8087 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 49 21 30
## exp 14 57 29
## neu 15 28 57
##
## $tol0.02
## con exp neu
## con 43 22 35
## exp 16 65 19
## neu 20 27 53
##
## $tol0.05
## con exp neu
## con 36 29 35
## exp 18 55 27
## neu 25 34 41
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4882 0.2151 0.2967
## exp 0.1535 0.5572 0.2894
## neu 0.1779 0.2909 0.5312
##
## $tol0.02
## con exp neu
## con 0.4467 0.2472 0.3061
## exp 0.1928 0.5691 0.2381
## neu 0.2042 0.3077 0.4881
##
## $tol0.05
## con exp neu
## con 0.4321 0.2745 0.2934
## exp 0.2119 0.5048 0.2833
## neu 0.2615 0.3428 0.3956
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (116 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3362 0.2759 0.3879
##
## Bayes factors:
## con exp neu
## con 1.0000 1.2188 0.8667
## exp 0.8205 1.0000 0.7111
## neu 1.1538 1.4062 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.4123 0.0391 0.5486
##
## Bayes factors:
## con exp neu
## con 1.0000 10.5445 0.7516
## exp 0.0948 1.0000 0.0713
## neu 1.3305 14.0291 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 52 28 20
## exp 31 39 30
## neu 24 29 47
##
## $tol0.02
## con exp neu
## con 45 34 21
## exp 22 44 34
## neu 24 38 38
##
## $tol0.05
## con exp neu
## con 37 37 26
## exp 22 47 31
## neu 18 40 42
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4712 0.3189 0.2098
## exp 0.2882 0.3957 0.3161
## neu 0.2680 0.3193 0.4127
##
## $tol0.02
## con exp neu
## con 0.4446 0.3153 0.2401
## exp 0.2512 0.4042 0.3446
## neu 0.2763 0.3595 0.3642
##
## $tol0.05
## con exp neu
## con 0.4266 0.3080 0.2654
## exp 0.2575 0.4224 0.3201
## neu 0.2747 0.3568 0.3684
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (117 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3162 0.4188 0.2650
##
## Bayes factors:
## con exp neu
## con 1.0000 0.7551 1.1935
## exp 1.3243 1.0000 1.5806
## neu 0.8378 0.6327 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.2978 0.2631 0.4391
##
## Bayes factors:
## con exp neu
## con 1.0000 1.1320 0.6784
## exp 0.8834 1.0000 0.5993
## neu 1.4742 1.6687 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 44 32 24
## exp 14 54 32
## neu 25 40 35
##
## $tol0.02
## con exp neu
## con 49 28 23
## exp 18 56 26
## neu 26 35 39
##
## $tol0.05
## con exp neu
## con 46 23 31
## exp 20 46 34
## neu 28 36 36
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4362 0.3045 0.2593
## exp 0.1480 0.5364 0.3156
## neu 0.2627 0.3596 0.3777
##
## $tol0.02
## con exp neu
## con 0.4840 0.2665 0.2495
## exp 0.2074 0.5024 0.2902
## neu 0.2903 0.3563 0.3534
##
## $tol0.05
## con exp neu
## con 0.4813 0.2339 0.2848
## exp 0.2289 0.4775 0.2936
## neu 0.2849 0.3553 0.3598
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (117 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.2051 0.2906 0.5043
##
## Bayes factors:
## con exp neu
## con 1.0000 0.7059 0.4068
## exp 1.4167 1.0000 0.5763
## neu 2.4583 1.7353 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.1974 0.4497 0.3529
##
## Bayes factors:
## con exp neu
## con 1.0000 0.4390 0.5594
## exp 2.2780 1.0000 1.2743
## neu 1.7877 0.7848 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 40 27 33
## exp 27 48 25
## neu 27 30 43
##
## $tol0.02
## con exp neu
## con 33 27 40
## exp 25 50 25
## neu 22 36 42
##
## $tol0.05
## con exp neu
## con 37 30 33
## exp 23 53 24
## neu 21 39 40
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4086 0.2764 0.3150
## exp 0.2622 0.4625 0.2754
## neu 0.2758 0.3012 0.4230
##
## $tol0.02
## con exp neu
## con 0.3685 0.2900 0.3415
## exp 0.2576 0.4893 0.2531
## neu 0.2648 0.3323 0.4029
##
## $tol0.05
## con exp neu
## con 0.3893 0.3062 0.3046
## exp 0.2666 0.4527 0.2807
## neu 0.2790 0.3677 0.3533
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (113 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3186 0.3009 0.3805
##
## Bayes factors:
## con exp neu
## con 1.0000 1.0588 0.8372
## exp 0.9444 1.0000 0.7907
## neu 1.1944 1.2647 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.1144 0.2473 0.6382
##
## Bayes factors:
## con exp neu
## con 1.0000 0.4627 0.1793
## exp 2.1614 1.0000 0.3876
## neu 5.5771 2.5803 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 49 21 30
## exp 15 49 36
## neu 16 33 51
##
## $tol0.02
## con exp neu
## con 46 18 36
## exp 13 51 36
## neu 8 34 58
##
## $tol0.05
## con exp neu
## con 43 17 40
## exp 15 46 39
## neu 15 29 56
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4753 0.2165 0.3081
## exp 0.2019 0.4670 0.3311
## neu 0.2213 0.3209 0.4578
##
## $tol0.02
## con exp neu
## con 0.4658 0.2226 0.3116
## exp 0.2019 0.4813 0.3168
## neu 0.2189 0.3341 0.4470
##
## $tol0.05
## con exp neu
## con 0.4770 0.2327 0.2903
## exp 0.2239 0.4580 0.3181
## neu 0.2491 0.3457 0.4052
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (116 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3879 0.3103 0.3017
##
## Bayes factors:
## con exp neu
## con 1.0000 1.2500 1.2857
## exp 0.8000 1.0000 1.0286
## neu 0.7778 0.9722 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.3818 0.3061 0.3121
##
## Bayes factors:
## con exp neu
## con 1.0000 1.2474 1.2231
## exp 0.8017 1.0000 0.9806
## neu 0.8176 1.0198 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 51 23 26
## exp 26 49 25
## neu 24 27 49
##
## $tol0.02
## con exp neu
## con 49 31 20
## exp 17 58 25
## neu 22 31 47
##
## $tol0.05
## con exp neu
## con 48 31 21
## exp 17 56 27
## neu 17 37 46
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.5288 0.2194 0.2518
## exp 0.2493 0.4901 0.2606
## neu 0.2608 0.3059 0.4333
##
## $tol0.02
## con exp neu
## con 0.4989 0.2621 0.2391
## exp 0.2150 0.5157 0.2693
## neu 0.2749 0.3257 0.3994
##
## $tol0.05
## con exp neu
## con 0.4866 0.2614 0.2520
## exp 0.2115 0.5074 0.2811
## neu 0.2683 0.3481 0.3836
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (115 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.2522 0.4087 0.3391
##
## Bayes factors:
## con exp neu
## con 1.0000 0.6170 0.7436
## exp 1.6207 1.0000 1.2051
## neu 1.3448 0.8298 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.0420 0.6673 0.2906
##
## Bayes factors:
## con exp neu
## con 1.0000 0.0630 0.1447
## exp 15.8720 1.0000 2.2964
## neu 6.9117 0.4355 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 53 24 23
## exp 21 51 28
## neu 19 38 43
##
## $tol0.02
## con exp neu
## con 55 21 24
## exp 19 46 35
## neu 16 32 52
##
## $tol0.05
## con exp neu
## con 49 27 24
## exp 21 44 35
## neu 16 42 42
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.5238 0.2482 0.2280
## exp 0.2117 0.5097 0.2785
## neu 0.2056 0.3668 0.4275
##
## $tol0.02
## con exp neu
## con 0.5317 0.2368 0.2315
## exp 0.2356 0.4430 0.3214
## neu 0.2218 0.3222 0.4560
##
## $tol0.05
## con exp neu
## con 0.5291 0.2462 0.2248
## exp 0.2458 0.4293 0.3249
## neu 0.2551 0.3575 0.3874
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (117 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.3248 0.2479 0.4274
##
## Bayes factors:
## con exp neu
## con 1.0000 1.3103 0.7600
## exp 0.7632 1.0000 0.5800
## neu 1.3158 1.7241 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.0876 0.1467 0.7656
##
## Bayes factors:
## con exp neu
## con 1.0000 0.5972 0.1145
## exp 1.6743 1.0000 0.1917
## neu 8.7357 5.2174 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 47 23 30
## exp 25 40 35
## neu 13 35 52
##
## $tol0.02
## con exp neu
## con 46 20 34
## exp 24 41 35
## neu 20 30 50
##
## $tol0.05
## con exp neu
## con 49 17 34
## exp 21 42 37
## neu 21 26 53
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.4881 0.2211 0.2908
## exp 0.2687 0.4279 0.3034
## neu 0.2263 0.3528 0.4209
##
## $tol0.02
## con exp neu
## con 0.4830 0.2379 0.2791
## exp 0.2917 0.4094 0.2989
## neu 0.2705 0.3284 0.4012
##
## $tol0.05
## con exp neu
## con 0.4890 0.2344 0.2766
## exp 0.2789 0.4196 0.3015
## neu 0.2898 0.3151 0.3952
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (116 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.1983 0.3879 0.4138
##
## Bayes factors:
## con exp neu
## con 1.0000 0.5111 0.4792
## exp 1.9565 1.0000 0.9375
## neu 2.0870 1.0667 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.2015 0.3269 0.4715
##
## Bayes factors:
## con exp neu
## con 1.0000 0.6164 0.4274
## exp 1.6223 1.0000 0.6933
## neu 2.3400 1.4423 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 52 20 28
## exp 35 36 29
## neu 27 30 43
##
## $tol0.02
## con exp neu
## con 44 21 35
## exp 37 27 36
## neu 33 23 44
##
## $tol0.05
## con exp neu
## con 45 14 41
## exp 35 24 41
## neu 30 14 56
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.5067 0.2198 0.2735
## exp 0.3456 0.3510 0.3034
## neu 0.2764 0.2964 0.4272
##
## $tol0.02
## con exp neu
## con 0.4667 0.2295 0.3038
## exp 0.3155 0.3356 0.3490
## neu 0.2967 0.2786 0.4246
##
## $tol0.05
## con exp neu
## con 0.4637 0.2133 0.3230
## exp 0.3347 0.3181 0.3472
## neu 0.3271 0.2681 0.4048
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (115 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.2783 0.3565 0.3652
##
## Bayes factors:
## con exp neu
## con 1.0000 0.7805 0.7619
## exp 1.2812 1.0000 0.9762
## neu 1.3125 1.0244 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.2238 0.0683 0.7079
##
## Bayes factors:
## con exp neu
## con 1.0000 3.2761 0.3162
## exp 0.3052 1.0000 0.0965
## neu 3.1625 10.3609 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 73 9 18
## exp 7 68 25
## neu 17 20 63
##
## $tol0.02
## con exp neu
## con 73 9 18
## exp 9 66 25
## neu 20 21 59
##
## $tol0.05
## con exp neu
## con 73 11 16
## exp 6 72 22
## neu 16 19 65
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.7298 0.1024 0.1678
## exp 0.0882 0.6547 0.2570
## neu 0.1724 0.2681 0.5595
##
## $tol0.02
## con exp neu
## con 0.7282 0.1227 0.1490
## exp 0.1101 0.6614 0.2285
## neu 0.1984 0.2640 0.5376
##
## $tol0.05
## con exp neu
## con 0.7247 0.1374 0.1379
## exp 0.1165 0.6639 0.2196
## neu 0.1955 0.2661 0.5384
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (115 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.0174 0.0957 0.8870
##
## Bayes factors:
## con exp neu
## con 1.0000 0.1818 0.0196
## exp 5.5000 1.0000 0.1078
## neu 51.0000 9.2727 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.0172 0.0033 0.9794
##
## Bayes factors:
## con exp neu
## con 1.0000 5.1997 0.0176
## exp 0.1923 1.0000 0.0034
## neu 56.7857 295.2708 1.0000
**PCA:**
**Cross validation:**
## Confusion matrix based on 100 samples for each model.
##
## $tol0.01
## con exp neu
## con 78 10 12
## exp 10 68 22
## neu 19 22 59
##
## $tol0.02
## con exp neu
## con 80 9 11
## exp 9 63 28
## neu 21 16 63
##
## $tol0.05
## con exp neu
## con 74 10 16
## exp 13 68 19
## neu 18 19 63
##
##
## Mean model posterior probabilities (mnlogistic)
##
## $tol0.01
## con exp neu
## con 0.7741 0.1181 0.1078
## exp 0.0989 0.6660 0.2351
## neu 0.2133 0.2293 0.5574
##
## $tol0.02
## con exp neu
## con 0.7697 0.1172 0.1131
## exp 0.1016 0.6674 0.2311
## neu 0.2251 0.2090 0.5659
##
## $tol0.05
## con exp neu
## con 0.7430 0.1217 0.1353
## exp 0.1085 0.6672 0.2243
## neu 0.2320 0.2457 0.5223
**Model Selection:**
## Call:
## postpr(target = tar, index = model, sumstat = sumstat, tol = 0.02,
## method = "mnlogistic")
## Data:
## postpr.out$values (117 posterior samples)
## Models a priori:
## con, exp, neu
## Models a posteriori:
## con, exp, neu
##
## Proportion of accepted simulations (rejection):
## con exp neu
## 0.1197 0.2051 0.6752
##
## Bayes factors:
## con exp neu
## con 1.0000 0.5833 0.1772
## exp 1.7143 1.0000 0.3038
## neu 5.6429 3.2917 1.0000
##
##
## Posterior model probabilities (mnlogistic):
## con exp neu
## 0.0000 0.2091 0.7909
##
## Bayes factors:
## con exp neu
## con 1.0000 0.0001 0.0000
## exp 14011.9688 1.0000 0.2644
## neu 52995.1978 3.7821 1.0000
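To compare how well model selection works across the four cross-validation runs above, the diagonals of the tol = 0.02 confusion matrices can be reduced to one accuracy per run. This is a rough sketch in Python, not part of the original analysis. The data set labels are an assumption, inferred by matching each run's mnlogistic posterior probabilities to the rows of the table below; `overall_accuracy` is a hypothetical helper.

```python
# Diagonals (correct assignments for con, exp, neu) of the tol = 0.02 confusion
# matrices of the four runs above, in order of appearance.
# Assumption: labels inferred by matching posterior probabilities to the table rows.
correct_by_run = {
    "plib_GH": (46, 41, 50),
    "plib_WUG1": (44, 27, 44),
    "ppli_GH": (73, 66, 59),
    "ppli_WUG1": (80, 63, 63),
}
SAMPLES_PER_MODEL = 100  # "Confusion matrix based on 100 samples for each model"


def overall_accuracy(diagonal, per_model=SAMPLES_PER_MODEL):
    """Correct assignments divided by all cross-validation samples."""
    return sum(diagonal) / (per_model * len(diagonal))


chance = 1 / 3  # expected accuracy when guessing among three models
for run, diagonal in correct_by_run.items():
    acc = overall_accuracy(diagonal)
    print(f"{run}: accuracy={acc:.3f} ({acc / chance:.2f}x chance)")
```

A single number per run hides which models get confused with each other, so it is only a first screen before looking at the full matrices.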
| | con | exp | neu |
|---|---|---|---|
| call_GH_pred | 0.4460928 | 0.2709064 | 0.2830008 |
| call_WUG1_pred | 0.1342582 | 0.0982820 | 0.7674598 |
| call_WUG2_pred | 0.4123209 | 0.0391028 | 0.5485763 |
| call_WUG3_pred | 0.2978334 | 0.2631150 | 0.4390517 |
| call_WUG4_pred | 0.1974067 | 0.4496898 | 0.3529035 |
| call_WUG5_pred | 0.1144357 | 0.2473456 | 0.6382187 |
| pall_GH_pred | 0.3817845 | 0.3060761 | 0.3121394 |
| pall_WUG1_pred | 0.0420456 | 0.6673467 | 0.2906078 |
| pall_WUG2_pred | 0.0876423 | 0.1467432 | 0.7656145 |
| plib_GH_pred | 0.2015201 | 0.3269322 | 0.4715478 |
| plib_WUG1_pred | 0.2238246 | 0.0683200 | 0.7078554 |
| ppli_GH_pred | 0.0172479 | 0.0033171 | 0.9794350 |
| ppli_WUG1_pred | 0.0000149 | 0.2091084 | 0.7908767 |
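The table can be condensed to the best-supported model per data set, together with how far ahead it is of the runner-up. The sketch below is in Python, not the R code behind the report, and uses the values copied from the table. The ratio to the runner-up equals a Bayes factor only if the models had equal priors, which is assumed here; `best_model` is a hypothetical helper.

```python
MODELS = ("con", "exp", "neu")

# Posterior model probabilities (con, exp, neu) copied from the table above.
predictions = {
    "call_GH": (0.4460928, 0.2709064, 0.2830008),
    "call_WUG1": (0.1342582, 0.0982820, 0.7674598),
    "call_WUG2": (0.4123209, 0.0391028, 0.5485763),
    "call_WUG3": (0.2978334, 0.2631150, 0.4390517),
    "call_WUG4": (0.1974067, 0.4496898, 0.3529035),
    "call_WUG5": (0.1144357, 0.2473456, 0.6382187),
    "pall_GH": (0.3817845, 0.3060761, 0.3121394),
    "pall_WUG1": (0.0420456, 0.6673467, 0.2906078),
    "pall_WUG2": (0.0876423, 0.1467432, 0.7656145),
    "plib_GH": (0.2015201, 0.3269322, 0.4715478),
    "plib_WUG1": (0.2238246, 0.0683200, 0.7078554),
    "ppli_GH": (0.0172479, 0.0033171, 0.9794350),
    "ppli_WUG1": (0.0000149, 0.2091084, 0.7908767),
}


def best_model(probs):
    """Return (best model, its probability, ratio to the runner-up)."""
    ranked = sorted(zip(probs, MODELS), reverse=True)
    (p_best, m_best), (p_second, _) = ranked[0], ranked[1]
    return m_best, p_best, p_best / p_second


for name, probs in predictions.items():
    model, p, ratio = best_model(probs)
    print(f"{name}: {model} (p={p:.3f}, {ratio:.2f}x the runner-up)")
```

A ratio close to 1 means the top two models are practically tied, so the choice for that data set should not be over-interpreted.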