getwd()
## [1] "C:/Users/graci/Documents"
setwd('C:\\Users\\graci\\Documents')
# setwd ( 'Documents')
# setwd()

Scaling Recommendation Engine: 15,000 to 130M Users in 24 Months

This is an interesting article that details the timeline of building, testing, and deploying a recommendation system in a real-world setting. The initial task was to provide product recommendations for 15,000 customers tracked by an e-commerce client. Metrics such as click rate and redeem rate were to be compared against a baseline the client was already using. Source: https://www.retentionscience.com/blog/scalingrecommendations/

Here’s the chronology of events:

Month 1: Cold Start on a winter night

They started with a simple rule: annotate the top K items (per user attribute) based on the number of interactions and purchases. This served two important purposes: (1) no user was left without a recommendation, and (2) a connection could be drawn from a user to the item through some prior modeling.
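A minimal sketch of this top-K rule in base R follows. The `events` table, its `segment` and `item` columns, and the function name `top_k_per_segment` are illustrative assumptions; the article does not publish its schema or code. Interactions and purchases are treated here as equally weighted events.

```r
# Hypothetical event log: one row per interaction or purchase.
# `segment` stands in for a user attribute; all names are assumptions.
events <- data.frame(
  segment = c("new", "new", "new", "new", "loyal", "loyal", "loyal"),
  item    = c("A", "A", "B", "C", "B", "B", "C"),
  stringsAsFactors = FALSE
)

top_k_per_segment <- function(events, k) {
  # Count events per (segment, item) pair.
  counts <- aggregate(n ~ segment + item,
                      data = transform(events, n = 1L), FUN = sum)
  # Sort by segment, then descending count, then item name to break ties.
  counts <- counts[order(counts$segment, -counts$n, counts$item), ]
  # Number the rows within each segment and keep the first k.
  rank_in_segment <- ave(counts$n, counts$segment, FUN = seq_along)
  out <- counts[rank_in_segment <= k, ]
  rownames(out) <- NULL
  out
}

top_items <- top_k_per_segment(events, k = 2)
print(top_items)
```

Because every user has at least one attribute value, a lookup into `top_items` always returns something, which is the "no user left without a recommendation" property described above.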

Month 2: Something better than a Cold Start?

In month 2, the developers moved beyond the cold-start rule and, reasoning by analogy, arrived at a new one: “Select the top items from the category that the user has already bought.”
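The category rule could be sketched as below. The `purchases` and `catalog` tables, the `popularity` column, and the choice to exclude items the user already owns are assumptions made for illustration, not details taken from the article.

```r
# Hypothetical data; column names and values are illustrative only.
purchases <- data.frame(
  user = c("u1", "u1", "u2"),
  item = c("apple", "pear", "hammer"),
  stringsAsFactors = FALSE
)
catalog <- data.frame(
  item       = c("apple", "pear", "banana", "hammer", "drill", "saw"),
  category   = c("fruit", "fruit", "fruit", "tools", "tools", "tools"),
  popularity = c(50, 30, 90, 20, 40, 10),
  stringsAsFactors = FALSE
)

recommend_from_bought_categories <- function(user_id, purchases, catalog, k) {
  # Items this user has already bought.
  bought <- purchases$item[purchases$user == user_id]
  # Categories those items belong to.
  bought_categories <- unique(catalog$category[catalog$item %in% bought])
  # Candidates: same categories, minus items already bought (an assumption).
  keep <- catalog$category %in% bought_categories & !(catalog$item %in% bought)
  candidates <- catalog[keep, ]
  # Rank candidates by popularity, highest first, and return the top k.
  candidates <- candidates[order(-candidates$popularity), ]
  head(candidates$item, k)
}

rec_u1 <- recommend_from_bought_categories("u1", purchases, catalog, k = 2)
rec_u2 <- recommend_from_bought_categories("u2", purchases, catalog, k = 1)
```

A user with no purchase history gets an empty result from this function, so in practice it would sit on top of the month 1 cold-start rule rather than replace it.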

Month 3: Manual Checks before send

After two months of “successful” implementation of some legacy SQL code, manual exploration uncovered a rule that would later be corroborated by their use of classical discriminative learning: the users’ purchase categories turned out to be a very strong latent factor.

Month 5: De-duplication!

Not every user record corresponds to a unique person, and purchase records are not unique either, which warranted an extensive de-duplication effort.
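The article does not describe how the de-duplication was done. As one simple possibility, the sketch below collapses user records that share a normalized email address and drops exact duplicate purchase rows. The tables, the column names, and the choice of email as the matching key are all hypothetical.

```r
# Hypothetical user table in which two records describe the same person.
users <- data.frame(
  user_id = c(1, 2, 3),
  email   = c("Ann@Example.com", " ann@example.com", "bob@example.com"),
  stringsAsFactors = FALSE
)

# Normalize the matching key: trim whitespace and lower-case it.
normalize_email <- function(x) tolower(trimws(x))
users$key <- normalize_email(users$email)

# Keep the first record seen for each normalized key.
deduped_users <- users[!duplicated(users$key), ]

# Hypothetical purchase log containing one exact duplicate row.
purchase_log <- data.frame(
  key      = c("ann@example.com", "ann@example.com", "bob@example.com"),
  order_id = c("o1", "o1", "o2"),
  item     = c("apple", "apple", "drill"),
  stringsAsFactors = FALSE
)

# unique() on a data frame removes rows that are identical in every column.
deduped_purchases <- unique(purchase_log)
```

Real identity resolution usually needs fuzzier matching than an exact key, so this only illustrates the shape of the problem.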

Month 6: Visualization (see figure below)

Month 7: Feedback

The impact of the algorithms on the business was measured.

Month 8: Feature Engineering

Data was split into behavioral and transactional data.

Month 9: The Banana problem

“All of our users are getting bananas,” which was exactly what the models had picked up. Customers were going to buy bananas anyway, so that purchase did not need a recommendation. A week later, this was fixed by another rule: “Remove all items above 99 percentile in the affinity score.” They were pioneers in recognizing a problem common to many large e-commerce companies: not letting users explore enough of the inventory.
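The percentile rule quoted above could look roughly like this in base R. The `scores` table and the function name `drop_top_percentile` are hypothetical, and `quantile()` is used with its default interpolation method, which may differ from however the original team computed the cutoff.

```r
# Hypothetical affinity scores: 100 ordinary items plus one runaway favorite.
scores <- data.frame(
  item     = c(sprintf("item_%d", 1:100), "banana"),
  affinity = c(1:100, 1000),
  stringsAsFactors = FALSE
)

drop_top_percentile <- function(scores, p = 0.99) {
  # Affinity value at the p-th quantile (default quantile type).
  cutoff <- quantile(scores$affinity, probs = p, names = FALSE)
  # Keep only items at or below the cutoff.
  scores[scores$affinity <= cutoff, , drop = FALSE]
}

filtered <- drop_top_percentile(scores)
```

Here the filter removes only the banana row, leaving the remaining items eligible for recommendation.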

Month 11: Exploring “off the shelf” solutions

Month 12: ‘Robust-ification’

Figure

Figure