the model

A determinantal point process (DPP) over a set \(\mathbb{M}\) is a distribution over all possible subsets of \(\mathbb{M}\). The probability of selecting any particular subset \(Y \subseteq \mathbb{M}\) is given by

\[\begin{equation} p(Y) \propto \mathrm{det}(\mathbf{L}_Y) \tag{1} \end{equation}\]

where \(\mathbf{L}\) is a \(\vert\mathbb{M}\vert \times \vert\mathbb{M}\vert\) positive semi-definite matrix indexed by the items in \(\mathbb{M}\), and \(\mathbf{L}_Y\) is the submatrix of \(\mathbf{L}\) containing only the rows and columns indexed by the items in \(Y\). If \(\mathbf{L}\) is interpretable as a similarity matrix, where \(L_{i,j}\) is the similarity between distinct items \(i, j \in \mathbb{M}\), then (1) implies that similar items repel each other and that subsets \(Y\) with more diverse elements are more probable. Finally, a probability distribution you can take to a restaurant without incident! (citation) proposes representing \(\mathbf{L}\) with the decomposition

\[\begin{equation} \mathbf{L} = \mathbf{Q} \mathbf{\Phi}^\intercal \mathbf{\Phi} \mathbf{Q} \tag{2} \end{equation}\]

where \(\mathbf{Q}\) is a diagonal matrix whose nonzero elements \(\mathbf{Q}_{i,i} = q_i\) are the qualities of the items in \(\mathbb{M}\), and \(\mathbf{\Phi}\) is a matrix whose columns \(\phi_i\) are feature vectors; their pairwise dot products, \(S_{i,j} \equiv \phi_i^\intercal \phi_j\), encode the similarities between pairs of items.
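To make (1) and (2) concrete, here is a minimal NumPy sketch; the function names and arguments are illustrative rather than taken from any particular DPP library.

```python
import numpy as np

def build_L(q, Phi):
    """Assemble L = Q Phi^T Phi Q from qualities q (length M) and feature columns Phi (d x M)."""
    Q = np.diag(q)              # diagonal quality matrix
    S = Phi.T @ Phi             # similarity matrix of pairwise dot products
    return Q @ S @ Q

def unnormalized_prob(L, Y):
    """Unnormalized p(Y) from (1): determinant of the submatrix of L indexed by the items in Y."""
    L_Y = L[np.ix_(Y, Y)]       # keep only the rows and columns indexed by Y
    return np.linalg.det(L_Y)   # proportional to p(Y); the normalizer is det(L + I)
```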

Consider a DPP over the set of all possible edges in an undirected network with nodes labeled \(\{1,\ldots,N\}\). In this case, there are \(E=N(N-1)/2\) possible edges and accordingly \(\mathbf{L}\) is an \(E \times E\) matrix. Let \(\{i,j\}\) denote an edge connecting nodes \(i\) and \(j, \ j\neq i\). For convenience we can order the set of all possible edges as \(\{\{1,2\},\{1,3\},\ldots,\{1,N\},\{2,3\},\ldots,\{N-1,N\}\}\) and enumerate them as \(\{e_1, e_2,\ldots,e_E\}\), with edge \(e_1 \equiv\{1,2\}\), edge \(e_2 \equiv \{1,3\}\), and so on.
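Incidentally, this is exactly the order produced by lexicographic enumeration of node pairs; in Python, for instance (the variable names are arbitrary):

```python
from itertools import combinations

N = 5                                            # number of nodes; any value works here
edges = list(combinations(range(1, N + 1), 2))   # [(1,2), (1,3), ..., (1,N), (2,3), ..., (N-1,N)]
E = len(edges)                                   # E = N*(N-1)/2 possible edges
```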

A natural, if somewhat crude, notion of similarity between edges is simply whether or not they share a node. Under the quality and similarity decomposition described above, this can be achieved by defining, for each edge \(e_i\), a feature vector \(\phi_i\) whose \(k\)th element is given by:

\[\begin{equation} \phi_i(k) = \begin{cases} \sqrt{0.5} & \text{if}\quad k \in e_i \\ 0 & \text{otherwise} \end{cases}, \quad k \in \{1,\ldots,N\}. \tag{3} \end{equation}\]

This meets the criterion that \(\| \phi_i \| = 1\), which is necessary for the decomposition (2) to be identifiable. From (3) we can see that the similarity \(S_{i,j}\) takes on the values \(1\), \(0.5\), and \(0\) when \(e_i\) and \(e_j\) share two, one, or zero nodes, respectively (the first of these obtains only when \(i = j\)).
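Both observations are easy to verify numerically by building \(\mathbf{\Phi}\) directly from (3) and inspecting the resulting Gram matrix; a small self-contained sketch:

```python
import numpy as np
from itertools import combinations

N = 5
edges = list(combinations(range(1, N + 1), 2))          # edges ordered as above
E = len(edges)

Phi = np.zeros((N, E))                                  # column i is the feature vector of edge e_i
for idx, (i, j) in enumerate(edges):
    Phi[i - 1, idx] = np.sqrt(0.5)                      # row k-1 corresponds to node k
    Phi[j - 1, idx] = np.sqrt(0.5)

S = Phi.T @ Phi
print(np.allclose(np.linalg.norm(Phi, axis=0), 1.0))    # True: every column has unit norm
print(np.unique(np.round(S, 6)))                        # [0. 0.5 1.]: the only similarity values
```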

This formulation gives us no control over the strength of the repulsion between overlapping edges. However, the simple structure of the similarity matrix immediately suggests the following generalization:

\[\begin{equation} S_{i,j} = \begin{cases} 1 & \text{if}\quad | e_i \cap e_j | = 2 \\ \rho & \text{if}\quad | e_i \cap e_j | = 1 \\ 0 & \text{if}\quad | e_i \cap e_j | = 0 \end{cases}, \quad 0 \leq \rho \leq 1/2. \tag{4} \end{equation}\]

Here, the parameter \(\rho\) dictates the strength of repulsion between edges which share a node. For \(0 \leq \rho \leq 1/2\), the matrix in (4) equals \((1-2\rho)\) times the identity plus \(2\rho\) times the Gram matrix from (3); as a convex combination of positive semidefinite matrices it is itself positive semidefinite, so \(\mathbf{L}\) is still guaranteed to be a valid DPP kernel. Setting \(\rho = 1/2\) recovers (3), while \(\rho = 0\) makes the presence of each edge independent of the others.
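As a sanity check on the positive semidefiniteness claim, one can construct (4) directly and look at its smallest eigenvalue; a rough sketch (the function name is made up for illustration):

```python
import numpy as np
from itertools import combinations

def edge_similarity(N, rho):
    """Similarity matrix (4) over all possible edges of an N-node network."""
    edges = list(combinations(range(1, N + 1), 2))
    E = len(edges)
    S = np.zeros((E, E))
    for a in range(E):
        for b in range(E):
            shared = len(set(edges[a]) & set(edges[b]))   # 2, 1, or 0 shared nodes
            S[a, b] = {2: 1.0, 1: rho, 0: 0.0}[shared]
    return S

print(np.linalg.eigvalsh(edge_similarity(6, 0.5)).min())   # ~0: still positive semidefinite
print(np.linalg.eigvalsh(edge_similarity(6, 0.6)).min())   # < 0: no longer a valid kernel
```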

Finally, we will assume that each edge in the network is of the same quality, \(q_i = \nu\).
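In that case \(\mathbf{Q} = \nu \mathbf{I}\) and (2) collapses to \(\mathbf{L} = \nu^2 \mathbf{S}\), leaving the model with just two free parameters: \(\nu\), which controls the overall propensity to include edges, and \(\rho\), which controls the repulsion between edges that share a node.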

properties of a small network

Let’s attempt to gain some intuition about this model using a small network. In a network of three nodes, any two edges necessarily share a node, so the smallest network in which similarity can be disentangled from quality has \(4\) nodes and \(6\) possible edges.
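Since there are only \(2^6 = 64\) possible edge subsets, every probability can be computed exactly by brute force. Here is a rough sketch under the assumptions above (the function names and the particular values of \(\nu\) and \(\rho\) are just for illustration):

```python
import numpy as np
from itertools import combinations

N = 4
edges = list(combinations(range(1, N + 1), 2))     # the 6 possible edges
E = len(edges)

def L_matrix(nu, rho):
    """L = nu^2 * S for the 4-node network, with S as in (4)."""
    S = np.zeros((E, E))
    for a in range(E):
        for b in range(E):
            shared = len(set(edges[a]) & set(edges[b]))
            S[a, b] = {2: 1.0, 1: rho, 0: 0.0}[shared]
    return nu**2 * S

def subset_probs(nu, rho):
    """Exact probability of each of the 2^6 = 64 edge subsets."""
    L = L_matrix(nu, rho)
    Z = np.linalg.det(L + np.eye(E))               # normalizer of an L-ensemble DPP
    probs = {}
    for r in range(E + 1):
        for Y in combinations(range(E), r):
            if Y:
                probs[Y] = np.linalg.det(L[np.ix_(Y, Y)]) / Z
            else:
                probs[Y] = 1.0 / Z                 # determinant of the empty submatrix is 1
    return probs

# Two edges sharing a node, e.g. {1,2} and {1,3}, are less probable together
# than two disjoint edges, e.g. {1,2} and {3,4}:
p = subset_probs(nu=1.0, rho=0.4)
print(p[(0, 1)], p[(0, 5)])
```

For a pair of adjacent edges the relevant determinant is \(\nu^4(1-\rho^2)\), versus \(\nu^4\) for a disjoint pair, so as \(\rho\) grows, two edges that share a node become increasingly unlikely to appear together.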

to be continued…