Base model

Suppose that each of \(N\) participants is asked to rank \(J_i\) items (where \(i=1,\ldots,N\) indexes the participant) out of a total of \(J_0\) items that could be ranked. Let \(r_{ij}\) (\(j=1,\ldots,J_0\), \(r_{ij}\in\{1,2,\ldots,J_i\}\)) be the rank given by the \(i\)th participant to the \(j\)th item. Note that if a participant does not see an item, it is not ranked and the corresponding \(r_{ij}\) is missing.

In the case at hand, \(J_0=17\) and \(J_i=16\) for all \(i\). The data look something like this:

Example rank data (\(r_{ij}\)); NA indicates an item the participant did not see

               Item:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Participant 1         1  2  6  3  4 NA  8  5  7  9 11 10 12 14 13 16 15
Participant 2         1  4  3  2  6  5  7  9  8 12 11 10 15 14 13 NA 16
Participant 3         1  2  4  3  5  6  7  9  8 13 NA 10 11 12 15 14 16
Participant 4         1  2  4 NA  5  3  6  7  8 10  9 12 11 13 14 15 16
Participant 5         3  6  2  1  7  4  5 10  9  8 11 NA 12 14 13 16 15


We construct an ordered probit model for the ranks.

\[ Pr(r_{ij}<r_{ik}) = Pr(y_{ij}<y_{ik}) \] where \[ \boldsymbol{y}_i \stackrel{indep.}{\sim} \mbox{MvtNorm}_{J_0}\left(\boldsymbol{X}_i\boldsymbol{\beta}, \boldsymbol{\Sigma}\right) \] and \(\boldsymbol{X}_i\) is a \(J_0\times p\) design matrix mapping the \(p\) \(\boldsymbol{\beta}\) parameters to the ranked items for the \(i\)th participant.

For now, we assume that \[ \begin{eqnarray*} \boldsymbol{X}_i&=&\boldsymbol{I}_{J_0},\forall i\\ \boldsymbol{\Sigma}&=&\boldsymbol{I}_{J_0} \end{eqnarray*} \] and that one of the \(\boldsymbol{\beta}\) parameters is set to 0 and treated as a reference. Eventually, we would like to introduce at least some covariances into \(\boldsymbol{\Sigma}\), but some constraint is required to ensure the model is identifiable (e.g., restricting \(\boldsymbol{\Sigma}\) to a correlation matrix, since the ranks are invariant to rescaling of \(\boldsymbol{y}_i\)).
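As a concreteness check, here is a minimal generative sketch of the base model in R under the assumptions above. The nonzero \(\boldsymbol{\beta}\) values are hypothetical, chosen only for illustration, and simulate_ranks is an illustrative helper, not part of any package.

```r
library(mvtnorm)

# Generative sketch with X_i = I and Sigma = I; beta[1] = 0 is the reference.
J0    <- 17
beta  <- c(0, seq(0.25, 4, length.out = J0 - 1))  # hypothetical item effects
Sigma <- diag(J0)

simulate_ranks <- function(beta, Sigma, J_i = 16) {
  J0   <- length(beta)
  y    <- drop(rmvnorm(1, mean = beta, sigma = Sigma))  # latent y_i
  seen <- sort(sample(J0, J_i))     # the J_i items shown to this participant
  r    <- rep(NA_integer_, J0)
  r[seen] <- rank(y[seen])          # smaller latent y => smaller rank
  r
}

simulate_ranks(beta, Sigma)
```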

Latent classes

We additionally assume that each participant has a classification \(z_i\in\{1,2,3\}\); if \(z_i=1\), then the model above holds without modification for participant \(i\) (they are “typical”). If \(z_i=2\), participant \(i\) is a “randomizer” and all \(J_i!\) rank orderings are equally probable. If \(z_i=3\), participant \(i\) is a “reverser”: their rank ordering is flipped, so that \[ Pr(r_{ij}<r_{ik}) = Pr(y_{ij}>y_{ik}) \] Of course, some prior constraint is needed to differentiate between the two equivalent solutions in which “reversers” and “typical” participants swap labels and all the \(\boldsymbol{\beta}\) parameters flip sign.
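The class definitions can be made concrete by extending the generative sketch above (simulate_class_ranks is again an illustrative helper, reusing simulate_ranks from the earlier sketch):

```r
# Sketch: apply a latent class z to a simulated participant's ranks.
simulate_class_ranks <- function(z, beta, Sigma, J_i = 16) {
  r    <- simulate_ranks(beta, Sigma, J_i)   # class 1: "typical"
  seen <- which(!is.na(r))
  if (z == 2) r[seen] <- sample(r[seen])     # class 2: uniform over orderings
  if (z == 3) r[seen] <- J_i + 1 - r[seen]   # class 3: reversed ordering
  r
}
```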

The latent \(z\) classifications are assumed to be distributed as a categorical variable, \[ z_i\stackrel{indep.}{\sim}\mbox{categorical}(\boldsymbol{p}_z) \]
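For instance, classifications for all \(N\) participants could be drawn as follows (the value of \(\boldsymbol{p}_z\) is hypothetical):

```r
N   <- 100
p_z <- c(0.8, 0.1, 0.1)                    # hypothetical class probabilities
z   <- sample(3, N, replace = TRUE, prob = p_z)
table(z)
```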

Priors

\[ \begin{eqnarray*} \boldsymbol{\beta} &\sim& \mbox{MvtNorm}_p(\boldsymbol{\mu}_\beta, \boldsymbol{\Sigma}_\beta),\\ \boldsymbol{p}_z &\sim& \mbox{Dirichlet}(\boldsymbol{\alpha}_z) \end{eqnarray*} \]

(with one \(\boldsymbol{\beta}\) parameter set to 0 for identifiability)
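A sketch of drawing from these priors, assuming diffuse hyperparameters (the specific values are illustrative); the Dirichlet draw uses the standard normalized-gamma construction:

```r
library(mvtnorm)

p        <- 17                       # with X_i = I, p = J0
mu_beta  <- rep(0, p)
Sig_beta <- diag(100, p)             # hypothetical, diffuse
alpha_z  <- c(1, 1, 1)               # hypothetical, uniform over the simplex

beta    <- drop(rmvnorm(1, mean = mu_beta, sigma = Sig_beta))
beta[1] <- 0                         # reference parameter for identifiability

rdirichlet1 <- function(alpha) {     # Dirichlet via normalized gammas
  g <- rgamma(length(alpha), shape = alpha)
  g / sum(g)
}
p_z <- rdirichlet1(alpha_z)
```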

Gibbs strategy

  1. Sample the latent continuous variables \(\boldsymbol y\mid\boldsymbol r, \boldsymbol\beta, \boldsymbol\Sigma, \boldsymbol z\) (à la Albert & Chib, 1993), constrained to respect the observed rank order. The constraint can be imposed via tmvmixnorm::rtmvn, with the differences of adjacent ranked values constrained to be positive. (Sketches of all four steps follow this list.)
  2. Sample \(\boldsymbol\beta \mid \boldsymbol y,\boldsymbol\Sigma, \boldsymbol z, \boldsymbol\mu_\beta, \boldsymbol\Sigma_\beta\) with appropriate constraints on \(\beta\) for identifiability.
  3. Sample classifications \(z_i\mid \boldsymbol r_i,\boldsymbol\beta,\boldsymbol\Sigma,\boldsymbol{p}_z\) by first computing the probability of the rank ordering for participant \(i\) under each potential classification. This can be done via mvtnorm::pmvnorm (but is slow).
  4. Sample \(\boldsymbol p_z\mid\boldsymbol z, \boldsymbol\alpha_z\) from its conjugate Dirichlet full conditional.
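Sketches of the four conditional draws follow, for a single participant or iteration. All of them assume \(\boldsymbol X_i=\boldsymbol I\) and \(\boldsymbol\Sigma=\boldsymbol I\); Step 2 further simplifies the \(\boldsymbol\beta\) prior to independent normals, and the rtmvn call assumes that function's lower \(\le\) D y \(\le\) upper linear-constraint parameterization. These are illustrations of the strategy, not a tuned implementation.

```r
library(tmvmixnorm)
library(mvtnorm)

## Step 1 sketch: latent y for one "typical" participant, given ranks r
## (NA = unseen item); adjacent differences in rank order must be positive.
sample_y_i <- function(r, beta) {
  J0   <- length(beta)
  seen <- which(!is.na(r))
  ord  <- seen[order(r[seen])]            # seen items, from rank 1 upward
  J    <- length(ord)
  D    <- diff(diag(J))                   # (J-1) x J adjacent-difference matrix
  u    <- rtmvn(n = 1, Mean = beta[ord], Sigma = diag(J),
                lower = rep(0, J - 1), upper = rep(Inf, J - 1),
                D = D, int = seq_len(J))  # int: a feasible starting point
  y      <- rnorm(J0, mean = beta)        # unseen item: unconstrained draw
  y[ord] <- u
  y
}

## Step 2 sketch: conjugate Gaussian update for beta from the latent y of
## class-1 participants (Y is an N1 x J0 matrix of current draws).
sample_beta <- function(Y, mu_beta = 0, sigma2_beta = 100) {
  n    <- nrow(Y)
  prec <- n + 1 / sigma2_beta
  m    <- (colSums(Y) + mu_beta / sigma2_beta) / prec
  beta <- rnorm(ncol(Y), mean = m, sd = sqrt(1 / prec))
  beta[1] <- 0                            # reference parameter held at 0
  beta
}

## Step 3 sketch: sample z_i from the probability of the observed ordering
## under each class, weighted by p_z.
sample_z_i <- function(r, beta, p_z) {
  seen <- which(!is.na(r))
  ord  <- seen[order(r[seen])]
  J    <- length(ord)
  D    <- diff(diag(J))
  pr_order <- function(o)                 # Pr(y[o[1]] < ... < y[o[J]])
    pmvnorm(lower = rep(0, J - 1),
            mean  = as.vector(D %*% beta[o]),
            sigma = D %*% t(D))
  lik <- c(pr_order(ord),                 # typical
           1 / factorial(J),              # randomizer: 1 / J_i!
           pr_order(rev(ord)))            # reverser
  w <- p_z * lik
  sample(3, 1, prob = w / sum(w))
}

## Step 4 sketch: conjugate Dirichlet update for p_z.
sample_p_z <- function(z, alpha_z = c(1, 1, 1)) {
  a <- alpha_z + tabulate(z, nbins = 3)   # posterior Dirichlet parameters
  g <- rgamma(3, shape = a)
  g / sum(g)
}
```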

This is extremely slow due to tmvmixnorm::rtmvn (Step 1), mvtnorm::pmvnorm (Step 3), and the autocorrelation resulting from the data-augmentation strategy (Albert & Chib).