In this weeks project we will be summarizing a talk from the 10th ACM Conference on Recommender Systems (RecSys’16). We have selected chosen to select the presentation of Bayesian Low-Rank Determinantal Point Processes because this topic is brand new and we are curious about the application of Bayesian methods in Recommender Systems. The original paper can be found on the ACM Digital Library however the full paper requires a membership to access. The domain for this talk is the application of the system to an online shopping basket and making recommendations of products. This method can also applied to a number of different recommenders and, according to the authors, performs better than a number of ‘state of the art’ methods.
The determinational point process (DPP), measures the probability of all possible point processes or subsets of a ground base set. For example it would measure the probability of all possible shopping baskets over a ground base sets of all possible items in the store. According to the presentation the DPP measures the probability of getting a certain amount of diversity in a set across all of the possible subsets.
DPP was developed in the statistical physics community by Macchi in 1975 and is also used in combinatorics and random matrix theory. DPP’s are also being studied for their application to machine learning problems. The application to recommender systems is based on the following:
For a discrete base set (item catalog) \(\mathcal{Y} = \{x_{1}, x_{2}, \dots, x_{M} \}\). A DPP defined by a \(M \times M\) positive semi-definite kernel matrix \(L\) is a probability measure on the \(2^M\) possible subsets \(Y\) in \(\mathcal{Y}\):
\[
P_{L}(Y) = \frac{det(L_Y)}{det(L+I)}
\]
The traditional DPP model has a number of issues that make it concerning for large data sets:
In the presentation the method of lowering the rank of the matrix is proposed to be as follows: \[ L = VV^T \] Where \(V\) is the item trait matrix where each row is an item trait vector and the number of item traits selected i a hyper-parameter that can be manipulated.
In the presentation they note that the model is very sensitive to the weighted values of each of the vectors. To improve the ability of finding these weights the authors proposed a Bayesian methodology for finding these weights:
The real world application that the authors describe is the idea of having a shopping basket filled with items A, B, and C and we would like to know a good item D to show at checkout that the user may also want to add to the cart. The recommender was tested on the Amazon Baby Registry Data Set, a Microsoft Store data set consisting of 243,000 checkout baskets and finally the Belgian Retail supermarket data set consisting of 88,000 shopping baskets and 16,000 items. The performance of the DPP method was compared to the following systems:
Finally the system was compared to these systems using a couple of measures. The first is mean percentile rank (MPR) where a 100% indicates perfect predictions and the system performed equal to or better then the rest of the systems. The other metric used was precision and this also was competitive with the above mentioned systems.