Probit ranking model

Base model

Suppose that each of \(N\) participants is asked to rank \(J_i\) items (where \(i=1,\ldots,N\) indexes the participant) out of a total of \(J_0\) items that could be ranked. Let \(r_{ij}\) (\(j=1,\ldots,J_0\), \(r_{ij}\in\{1,2,\ldots,J_i\}\)) be the rank given by the \(i\)th participant to the \(j\)th item. If a participant does not see an item, it is not ranked and the corresponding \(r_{ij}\) is missing.

In the case at hand, \(J_0=17\) and \(J_i=16\) for all \(i\). The data look something like this:

Example rank data (r_ij)
	Item 1	Item 2	Item 3	Item 4	Item 5	Item 6	Item 7	Item 8	Item 9	Item 10	Item 11	Item 12	Item 13	Item 14	Item 15	Item 16	Item 17
Participant 1	1	2	6	3	4	NA	8	5	7	9	11	10	12	14	13	16	15
Participant 2	1	4	3	2	6	5	7	9	8	12	11	10	15	14	13	NA	16
Participant 3	1	2	4	3	5	6	7	9	8	13	NA	10	11	12	15	14	16
Participant 4	1	2	4	NA	5	3	6	7	8	10	9	12	11	13	14	15	16
Participant 5	3	6	2	1	7	4	5	10	9	8	11	NA	12	14	13	16	15

We construct an ordered probit model for the ranks.

\[ Pr(r_{ij}<r_{ik}) = Pr(y_{ij}<y_{ik}) \] where \[ \boldsymbol{y}_i \stackrel{indep.}{\sim} \mbox{MvtNorm}_{J_0}\left(\boldsymbol{X_i\beta}, \boldsymbol{\Sigma}\right) \] and \(\boldsymbol{X}_i\) is a \(J_0\times p\) design matrix mapping the \(p\) \(\boldsymbol{\beta}\) parameters to the ranked items for the \(i\)th participant.

For now, we assume that \[ \begin{eqnarray*} \boldsymbol{X}_i&=&\boldsymbol{I}_{J_0},\forall i\\ \boldsymbol{\Sigma}&=&\boldsymbol{I}_{J_0} \end{eqnarray*} \] and that one of the \(\boldsymbol{\beta}\) parameters is set to 0 and treated as a reference. Eventually, we would like to introduce at least some covariances into \(\boldsymbol{\Sigma}\), but some constraint is required to ensure the model is identifiable.
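As a rough illustration of the base model under the assumptions above (\(\boldsymbol{X}_i=\boldsymbol{I}_{J_0}\), \(\boldsymbol{\Sigma}=\boldsymbol{I}_{J_0}\)), the following Python sketch simulates rank data with one missing entry per participant. The function name `simulate_ranks`, the \(\boldsymbol\beta\) values, and the rule that unseen items are chosen uniformly at random are assumptions made for the example, not part of the model specification.

```python
import random

def simulate_ranks(beta, n_participants, n_ranked, seed=0):
    """Simulate rank data from the base model with X_i = I and Sigma = I.

    beta: list of J_0 item means (hypothetical values; one is fixed at 0).
    n_ranked: number of items each participant sees and ranks (J_i).
    Returns one list per participant; unseen items are marked None.
    """
    rng = random.Random(seed)
    n_items = len(beta)
    data = []
    for _ in range(n_participants):
        # Assumption: the items a participant sees are a uniform random subset.
        seen = rng.sample(range(n_items), n_ranked)
        # Independent unit-variance latent values centred on beta.
        y = {j: rng.gauss(beta[j], 1.0) for j in seen}
        # Smaller latent value gives smaller rank: Pr(r_ij < r_ik) = Pr(y_ij < y_ik).
        order = sorted(seen, key=lambda j: y[j])
        ranks = [None] * n_items
        for position, j in enumerate(order, start=1):
            ranks[j] = position
        data.append(ranks)
    return data

# Hypothetical parameter values: item 1 is the reference with beta = 0.
beta = [0.0] + [0.5 * j for j in range(1, 17)]
for row in simulate_ranks(beta, n_participants=5, n_ranked=16):
    print(row)
```

Each printed row has 17 entries, 16 ranks and one `None` for the item that participant did not see.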

Latent classes

We additionally assume that each participant has a classification \(z_i\in\{1,2,3\}\). If \(z_i=1\), the model above holds without modification for participant \(i\) (they are “typical”). If \(z_i=2\), participant \(i\) is a “randomizer” and all \(J_i!\) rank orderings are equally probable. If \(z_i=3\), participant \(i\) is a “reverser” whose rank ordering is reported in reverse order, so that \[ Pr(r_{ij}<r_{ik}) = Pr(y_{ij}>y_{ik}). \] Some prior constraint is needed to distinguish between two otherwise equivalent solutions: relabelling “reversers” as “typical” (and vice versa) while flipping the sign of every \(\boldsymbol{\beta}\) parameter describes the data equally well.

The latent \(z\) classifications are assumed to be distributed as a categorical variable, \[ z_i\stackrel{indep.}{\sim}\mbox{categorical}(\boldsymbol{p}_z) \]
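The three classes can be read as three ways of turning a latent vector into a ranking. The Python sketch below is one way to express that generative step; the constant names, the function `ranks_given_class`, and the values of \(\boldsymbol{p}_z\) and \(\boldsymbol\beta\) are hypothetical, and the vector `y` is assumed to hold only the items the participant actually saw.

```python
import random

TYPICAL, RANDOMIZER, REVERSER = 1, 2, 3

def ranks_given_class(y, z, rng):
    """Turn latent values y into ranks according to the latent class z."""
    items = list(range(len(y)))
    if z == TYPICAL:
        # Smallest latent value receives rank 1.
        order = sorted(items, key=lambda j: y[j])
    elif z == REVERSER:
        # Largest latent value receives rank 1.
        order = sorted(items, key=lambda j: y[j], reverse=True)
    elif z == RANDOMIZER:
        # Every ordering is equally likely; y is ignored.
        order = items[:]
        rng.shuffle(order)
    else:
        raise ValueError("z must be 1, 2, or 3")
    ranks = [0] * len(y)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks

rng = random.Random(1)
p_z = [0.8, 0.1, 0.1]             # hypothetical class probabilities
beta = [0.0, 1.0, 2.0, 3.0]       # hypothetical item means
z = rng.choices([TYPICAL, RANDOMIZER, REVERSER], weights=p_z)[0]
y = [rng.gauss(b, 1.0) for b in beta]
print(z, ranks_given_class(y, z, rng))
```

For a given `y` without ties, the typical and reverser rankings of each item sum to the number of items plus one, which is the symmetry behind the sign-flip ambiguity.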

Priors

\[ \begin{eqnarray*} \boldsymbol{\beta} &\sim& \mbox{MvtNorm}_p(\boldsymbol{\mu}_\beta, \boldsymbol{\Sigma}_\beta),\\ \boldsymbol{p}_z &\sim& \mbox{Dirichlet}(\boldsymbol{\alpha}_z) \end{eqnarray*} \]

(with one \(\boldsymbol{\beta}\) parameter set to 0 for identifiability)

Gibbs strategy

1. Sample latent continuous variables \(\boldsymbol y\mid\boldsymbol r, \boldsymbol\beta, \boldsymbol\Sigma, \boldsymbol z\) (à la Albert & Chib, 1992), constrained to be in the proper order. The constraint can be accomplished via tmvmixnorm::rtmvn with differences of adjacent ranked values being constrained to be positive.
2. Sample \(\boldsymbol\beta \mid \boldsymbol y,\boldsymbol\Sigma, \boldsymbol z, \boldsymbol\mu_\beta, \boldsymbol\Sigma_\beta\) with appropriate constraints on \(\boldsymbol\beta\) for identifiability.
3. Sample classifications \(z_i\mid \boldsymbol r_i,\boldsymbol\beta,\boldsymbol\Sigma,\boldsymbol{p}_z\) by first computing the probability of the rank ordering for participant \(i\) under each potential classification. This can be done via mvtnorm::pmvnorm (but is slow).
4. Sample \(\boldsymbol p_z\mid\boldsymbol z, \boldsymbol\alpha_z\).
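
Written out, the classification and \(\boldsymbol p_z\) updates take the following form. The notation here is introduced for illustration and does not appear in the model statement: \((m)\) denotes the item that participant \(i\) ranked \(m\)th, \(L_{ik}\) is the probability of participant \(i\)'s observed ordering under class \(k\), \(p_{z,k}\) is the \(k\)th element of \(\boldsymbol p_z\), and \(n_k\) is the number of participants currently assigned to class \(k\). \[ \begin{eqnarray*} L_{i1} &=& Pr\left(y_{i(1)}<y_{i(2)}<\cdots<y_{i(J_i)}\right),\\ L_{i2} &=& \frac{1}{J_i!},\\ L_{i3} &=& Pr\left(y_{i(1)}>y_{i(2)}>\cdots>y_{i(J_i)}\right),\\ Pr(z_i=k\mid \boldsymbol r_i,\boldsymbol\beta,\boldsymbol\Sigma,\boldsymbol p_z) &=& \frac{p_{z,k}L_{ik}}{\sum_{l=1}^{3}p_{z,l}L_{il}},\\ \boldsymbol p_z\mid\boldsymbol z,\boldsymbol\alpha_z &\sim& \mbox{Dirichlet}(\boldsymbol\alpha_z+\boldsymbol n),\qquad n_k=\sum_{i=1}^{N}1(z_i=k) \end{eqnarray*} \] The probabilities \(L_{i1}\) and \(L_{i3}\) are taken with respect to \(\boldsymbol y_i\sim\mbox{MvtNorm}_{J_0}(\boldsymbol X_i\boldsymbol\beta,\boldsymbol\Sigma)\); they are the quantities for which mvtnorm::pmvnorm is needed. The Dirichlet update is the standard conjugate result for a categorical likelihood.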

This sampler is extremely slow because of tmvmixnorm::rtmvn (Step 1), mvtnorm::pmvnorm (Step 3), and the autocorrelation induced by the data augmentation strategy (Albert & Chib).
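
One possible way to reduce the cost of the ordering probabilities, valid only while \(\boldsymbol\Sigma\) is diagonal (here \(\boldsymbol I_{J_0}\)), is to exploit independence: the probability that independent normals fall in a given order can be written as a sequence of nested one-dimensional integrals. The Python sketch below evaluates those integrals on a grid with the trapezoid rule. This is a swapped-in technique, not the mvtnorm::pmvnorm approach described above, and it does not extend to a \(\boldsymbol\Sigma\) with covariances; the function names, grid step, and padding are assumptions chosen for the example.

```python
import math

def normal_pdf(x, mean):
    """Density of a unit-variance normal with the given mean."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def ordering_probability(beta_in_rank_order, step=0.01, pad=8.0):
    """Approximate Pr(y_1 < y_2 < ... < y_n) for independent y_m ~ N(beta_m, 1).

    beta_in_rank_order lists the means of the ranked items, first-ranked first.
    """
    lo = min(beta_in_rank_order) - pad
    hi = max(beta_in_rank_order) + pad
    n_points = int(round((hi - lo) / step)) + 1
    grid = [lo + g * step for g in range(n_points)]
    # inner[g] holds Pr(y_1 < ... < y_m < grid[g]) for the items handled so far.
    inner = [1.0] * n_points
    for mean in beta_in_rank_order:
        integrand = [normal_pdf(t, mean) * f for t, f in zip(grid, inner)]
        cumulative = [0.0] * n_points
        for g in range(1, n_points):
            # Running trapezoid-rule integral from the lower grid edge to grid[g].
            cumulative[g] = cumulative[g - 1] + 0.5 * step * (integrand[g - 1] + integrand[g])
        inner = cumulative
    return inner[-1]

def class_weights(beta_in_rank_order, p_z, step=0.01):
    """Posterior probabilities of (typical, randomizer, reverser) for one participant."""
    n = len(beta_in_rank_order)
    likelihoods = [
        ordering_probability(beta_in_rank_order, step),        # typical
        1.0 / math.factorial(n),                               # randomizer
        ordering_probability(beta_in_rank_order[::-1], step),  # reverser
    ]
    unnormalised = [p * l for p, l in zip(p_z, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Hypothetical means for three ranked items and a uniform p_z.
print(class_weights([0.0, 1.0, 2.0], [1 / 3, 1 / 3, 1 / 3]))
```

The cost grows linearly in the number of ranked items and in the number of grid points. The sketch addresses only the probability calculation in Step 3; it does nothing about the autocorrelation from data augmentation.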

Richard D. Morey

5/17/2021
