Suppose that each of \(N\) participants is asked to rank \(J_i\) items (where \(i=1,\ldots,N\) indexes the participant) out of a total of \(J_0\) items that could be ranked. Let \(r_{ij}\) (\(j=1,\ldots,J_0\), \(r_{ij}\in\{1,2,\ldots,J_i\}\)) be the rank given by the \(i\)th participant to the \(j\)th item. Note that if a participant does not see an item, it is not ranked and the corresponding \(r_{ij}\) is missing.
In the case at hand, \(J_0=17\) and \(J_i=16\) for all \(i\). The data look something like this:
Example rank data (\(r_{ij}\)):

| | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Item 6 | Item 7 | Item 8 | Item 9 | Item 10 | Item 11 | Item 12 | Item 13 | Item 14 | Item 15 | Item 16 | Item 17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Participant 1 | 1 | 2 | 6 | 3 | 4 | NA | 8 | 5 | 7 | 9 | 11 | 10 | 12 | 14 | 13 | 16 | 15 |
| Participant 2 | 1 | 4 | 3 | 2 | 6 | 5 | 7 | 9 | 8 | 12 | 11 | 10 | 15 | 14 | 13 | NA | 16 |
| Participant 3 | 1 | 2 | 4 | 3 | 5 | 6 | 7 | 9 | 8 | 13 | NA | 10 | 11 | 12 | 15 | 14 | 16 |
| Participant 4 | 1 | 2 | 4 | NA | 5 | 3 | 6 | 7 | 8 | 10 | 9 | 12 | 11 | 13 | 14 | 15 | 16 |
| Participant 5 | 3 | 6 | 2 | 1 | 7 | 4 | 5 | 10 | 9 | 8 | 11 | NA | 12 | 14 | 13 | 16 | 15 |
We construct an ordered probit model for the ranks.
\[ Pr(r_{ij}<r_{ik}) = Pr(y_{ij}<y_{ik}) \] where \[ \boldsymbol{y}_i \stackrel{indep.}{\sim} \mbox{MvtNorm}_{J_0}\left(\boldsymbol{X_i\beta}, \boldsymbol{\Sigma}\right) \] and \(\boldsymbol{X}_i\) is a \(J_0\times p\) design matrix mapping the \(p\) \(\boldsymbol{\beta}\) parameters to the ranked items for the \(i\)th participant.
For now, we assume that \[ \begin{eqnarray*} \boldsymbol{X}_i&=&\boldsymbol{I}_{J_0},\forall i\\ \boldsymbol{\Sigma}&=&\boldsymbol{I}_{J_0} \end{eqnarray*} \] and that one of the \(\boldsymbol{\beta}\) parameters is set to 0 and treated as a reference. Eventually, we would like to introduce at least some covariances into \(\boldsymbol{\Sigma}\), but some constraint is required to ensure the model is identifiable.
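Under these simplifying assumptions the pairwise probability has a closed form: since \(y_{ik}-y_{ij}\sim N(\beta_k-\beta_j, 2)\), we get \(Pr(y_{ij}<y_{ik})=\Phi\left((\beta_k-\beta_j)/\sqrt{2}\right)\). The following minimal Python sketch checks this against simulation. The function names `pairwise_prob` and `simulate_pairwise` are illustrative and not part of the original analysis.

```python
import random
from statistics import NormalDist


def pairwise_prob(beta_j, beta_k):
    """Pr(y_j < y_k) for independent y_j ~ N(beta_j, 1) and y_k ~ N(beta_k, 1)."""
    # The difference y_k - y_j is N(beta_k - beta_j, 2), so standardise by sqrt(2).
    return NormalDist().cdf((beta_k - beta_j) / 2 ** 0.5)


def simulate_pairwise(beta_j, beta_k, n_draws=100_000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = sum(
        rng.gauss(beta_j, 1.0) < rng.gauss(beta_k, 1.0) for _ in range(n_draws)
    )
    return hits / n_draws


if __name__ == "__main__":
    print(pairwise_prob(0.0, 1.0))      # closed form
    print(simulate_pairwise(0.0, 1.0))  # simulation, should be close
```

The closed form only holds for \(\boldsymbol{\Sigma}=\boldsymbol{I}_{J_0}\); once covariances are introduced, the variance of the difference changes accordingly.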
We additionally assume that each participant has a latent classification \(z_i\in\{1,2,3\}\). If \(z_i=1\), the model above holds without modification for participant \(i\) (they are “typical”). If \(z_i=2\), participant \(i\) is a “randomizer” and all \(J_i!\) rank orderings are equally probable. If \(z_i=3\), participant \(i\) is a “reverser” whose rank ordering is reported in reverse, so that \[ Pr(r_{ij}<r_{ik}) = Pr(y_{ij}>y_{ik}) \] Of course, some prior constraint is needed to distinguish between two otherwise equivalent solutions: swapping the “typical” and “reverser” labels while flipping the sign of every \(\boldsymbol{\beta}\) parameter leaves the likelihood unchanged.
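To make the three classes concrete, here is a small Python sketch that simulates one participant's ranking under each value of \(z_i\), assuming \(\boldsymbol{X}_i=\boldsymbol{I}_{J_0}\) and \(\boldsymbol{\Sigma}=\boldsymbol{I}_{J_0}\) as above. The function `simulate_ranking` and its arguments are hypothetical names introduced for illustration. Items are indexed from 0, and unseen items are returned as `None` (the `NA` entries in the table).

```python
import random


def simulate_ranking(beta, z, seen, rng):
    """Simulate ranks for one participant.

    beta: list of J0 latent means; z: class in {1, 2, 3};
    seen: indices of the J_i items shown to the participant.
    """
    if z == 2:
        # Randomizer: every one of the J_i! orderings is equally likely.
        order = list(seen)
        rng.shuffle(order)
    else:
        # Independent unit-variance latent values (Sigma = I).
        y = {j: rng.gauss(beta[j], 1.0) for j in seen}
        # Typical: smaller y gets the smaller rank.
        # Reverser: larger y gets the smaller rank.
        order = sorted(seen, key=lambda j: y[j], reverse=(z == 3))
    ranks = [None] * len(beta)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


if __name__ == "__main__":
    rng = random.Random(42)
    beta = [0.0, 0.5, 1.0, 1.5, 2.0]
    for z in (1, 2, 3):
        print(z, simulate_ranking(beta, z, seen=[0, 1, 3, 4], rng=rng))
```

Simulated data of this kind is also a convenient way to check whether a sampler recovers known values of \(\boldsymbol{\beta}\) and \(\boldsymbol{p}_z\).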
The latent \(z\) classifications are assumed to follow a categorical distribution, \[ z_i\stackrel{indep.}{\sim}\mbox{categorical}(\boldsymbol{p}_z) \]
with priors
\[ \begin{eqnarray*} \boldsymbol{\beta} &\sim& \mbox{MvtNorm}_p(\boldsymbol{\mu}_\beta, \boldsymbol{\Sigma}_\beta),\\ \boldsymbol{p}_z &\sim& \mbox{Dirichlet}(\boldsymbol{\alpha}_z) \end{eqnarray*} \]
(with one \(\boldsymbol{\beta}\) parameter set to 0 for identifiability).
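Because the Dirichlet distribution is conjugate to the categorical distribution, the full conditional of \(\boldsymbol{p}_z\) given the classifications is \(\mbox{Dirichlet}(\boldsymbol{\alpha}_z+\boldsymbol{n})\), where \(n_c\) is the number of participants with \(z_i=c\). A minimal Python sketch of that draw follows, assuming a sampler that conditions on \(\boldsymbol{z}\). The names `class_counts` and `sample_p_z` are illustrative.

```python
import random


def class_counts(z):
    """Number of participants in each of the classes 1, 2, 3."""
    return [sum(1 for zi in z if zi == c) for c in (1, 2, 3)]


def sample_p_z(z, alpha, rng):
    """Draw p_z from Dirichlet(alpha + counts) using normalised gamma variates."""
    counts = class_counts(z)
    gammas = [rng.gammavariate(a + n, 1.0) for a, n in zip(alpha, counts)]
    total = sum(gammas)
    return [g / total for g in gammas]


if __name__ == "__main__":
    rng = random.Random(7)
    z = [1, 1, 1, 2, 3, 1, 1, 3]
    print(sample_p_z(z, alpha=[1.0, 1.0, 1.0], rng=rng))
```

An informative \(\boldsymbol{\alpha}_z\) that favours the "typical" class over the "reverser" class is one possible way to supply the prior constraint discussed earlier, though the choice of constraint is left open here.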
Step 1 uses tmvmixnorm::rtmvn, with differences of adjacent ranked values constrained to be positive. Step 3 uses mvtnorm::pmvnorm (but is slow).

This is extremely slow due to tmvmixnorm::rtmvn (Step 1), mvtnorm::pmvnorm (Step 3), and the autocorrelation resulting from the data augmentation strategy (Albert & Chib).
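For comparison, when \(\boldsymbol{\Sigma}=\boldsymbol{I}_{J_0}\) the order-constrained latent values can also be updated one coordinate at a time: each \(y_{ij}\) is drawn from a univariate normal truncated to lie between the latent values of its rank neighbours. This is a different technique from the joint tmvmixnorm::rtmvn draw and is not the sampler described above. It is sketched here in Python for a "typical" participant only, with hypothetical function names and a numerically naive inverse-CDF draw.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_normal(mean, lo, hi, rng):
    """Draw from N(mean, 1) restricted to (lo, hi) by inverting the CDF."""
    p_lo = STD_NORMAL.cdf(lo - mean)
    p_hi = STD_NORMAL.cdf(hi - mean)
    u = rng.uniform(p_lo, p_hi)
    # inv_cdf is undefined at 0 and 1, so keep u strictly inside (0, 1).
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    x = mean + STD_NORMAL.inv_cdf(u)
    # Guard against rounding pushing the draw outside the interval.
    return min(max(x, lo), hi)


def update_latent(y, ranks, beta, rng):
    """One sweep over the seen items of a typical participant (z = 1).

    y must already be ordered consistently with ranks; unseen items
    (rank None) are left untouched.
    """
    seen = sorted(
        (j for j, r in enumerate(ranks) if r is not None),
        key=lambda j: ranks[j],
    )
    for pos, j in enumerate(seen):
        lo = y[seen[pos - 1]] if pos > 0 else float("-inf")
        hi = y[seen[pos + 1]] if pos + 1 < len(seen) else float("inf")
        y[j] = truncated_normal(beta[j], lo, hi, rng)
    return y


if __name__ == "__main__":
    rng = random.Random(3)
    ranks = [2, None, 1, 3]
    y = [0.0, None, -1.0, 1.0]  # initial values that respect the ranks
    print(update_latent(y, ranks, beta=[0.0, 0.0, 0.0, 0.0], rng=rng))
```

Coordinate-wise updates avoid the multivariate truncated draw, but with 16 ranked items each value is tightly bracketed by its neighbours, so this approach can mix slowly as well.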