Recommendation System with Association Rules

1 Introduction
2 Data description
3 Key metrics in Association Rules
4 Analysis
5 App
6 Summary

1 Introduction

Association rules play a key role in market basket analysis. Whether you manage a brick-and-mortar shop or own an on-line shop, this method can come in very handy. For example, a well-planned layout of groceries reduces the customer’s shopping time and boosts cross-selling, as the customer is reminded of related items he or she might be interested in. The same, naturally, goes for on-line shopping.

Creating a modern on-line shopping experience for customers involves implementing a recommendation system. In general, there are two types of recommendation systems: personalised and non-personalised. Personalised recommender systems rely on user preferences and item descriptions and therefore face many challenges, for example the cold-start problem. That is where non-personalised systems (based on methods such as association rules) come in handy, as they are independent of user interests and do not need item ratings.

In this paper, I will use association rules on a transactional dataset to build a recommendation system for a small on-line shop in the form of a simple web app. I will also try to capture the associations in customers’ baskets that matter most to the business. In my analysis I will use the R packages arules and arulesViz.

2 Data description

The company analysed in this paper sells children’s clothing and has 2 brick-and-mortar outlets geared towards wholesale. The company uses Allegro, Poland’s biggest e-commerce platform, as its only on-line sales channel. This channel is not fully developed yet, so the total number of transactions is small. The transactional dataset was collected from 18.06.2017 until 11.12.2018 (1.5 years), giving 1313 observations.

The nature of clothing stores is that the products on the shelves change very fast: a new collection is usually released on a quarterly basis. If we used raw transactional data collected over 1.5 years, many of the items would probably be sold out already. Therefore raw transactional data cannot be used directly and needs to be transformed first. In this paper items were grouped by type of clothing and main theme, e.g. a blouse with a horse print ({blouse_horse}) or shorts designed for physical education lessons ({shorts_physical education}). The complete list of groups can be found in Section 4. The frequency of items in transactions is visualised below.

itemFrequencyPlot(trans1, topN=15, type="relative", main="Item Frequency")
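For readers who want to reproduce this kind of plot, the sketch below shows one way a long-format order table could be turned into the `transactions` object that `itemFrequencyPlot` expects. The toy data frame `orders` and its rows are made up for illustration; only the column names `OrderId` and `MerchCat` are taken from the code used later in this paper, and the actual preparation of `trans1` may differ.

```r
library(arules)

# Hypothetical long-format data: one row per (order, item category) pair.
orders <- data.frame(
  OrderId  = c(1, 1, 2, 3, 3, 3),
  MerchCat = c("blouse_horse", "blouse_", "blouse_horse",
               "blouse_horse", "blouse_elegant", "blouse_horse"),
  stringsAsFactors = FALSE
)

# Drop repeated categories within an order: arules refuses to coerce
# a list in which one transaction contains the same item twice.
orders <- unique(orders)

# One character vector of categories per order, then coerce.
baskets   <- split(orders$MerchCat, orders$OrderId)
toy_trans <- as(baskets, "transactions")

# Relative item frequency = share of transactions containing the item.
item_freq <- itemFrequency(toy_trans, type = "relative")
print(sort(item_freq, decreasing = TRUE))
```

The relative frequencies printed here are the same quantities that `itemFrequencyPlot(..., type="relative")` draws as bars.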

3 Key metrics in Association Rules

In order to explain the measures used in association rules, I limited the original transactional dataset to just 3 products: a blouse with no special theme, a blouse with a horse theme and an elegant blouse. I then plotted a Venn diagram for this reduced dataset, as it is useful for showing how the sets of transactions containing particular items overlap.

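# Assumed packages: dcast() from reshape2, draw.triple.venn() from VennDiagram,
# grid.newpage() and grid.draw() from grid
library(reshape2)
library(VennDiagram)
library(grid)

# Keep only the three blouse categories and flag each (order, category) pair
dataToPlot <- dplyr::filter(trans1.csv, MerchCat %in% c("blouse_", "blouse_horse", "blouse_elegant"))
dataToPlot$freq <- 1
# One row per order, one 0/1 column per category
pivotTable <- dcast(unique(dataToPlot), OrderId ~ MerchCat, value.var = "freq")
pivotTable[is.na(pivotTable)] <- 0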
grid.newpage()
venn.plot <- draw.triple.venn(
    area1 = sum(pivotTable$blouse_), 
    area2 = sum(pivotTable$blouse_horse), 
    area3 = sum(pivotTable$blouse_elegant), 
    n12 = sum(pivotTable$blouse_[pivotTable$blouse_ & pivotTable$blouse_horse]), 
    n23 = sum(pivotTable$blouse_horse[pivotTable$blouse_horse & pivotTable$blouse_elegant]), 
    n13 = sum(pivotTable$blouse_[pivotTable$blouse_ & pivotTable$blouse_elegant]),  
    n123 = sum(pivotTable$blouse_[pivotTable$blouse_ & pivotTable$blouse_horse & pivotTable$blouse_elegant]), 
    category = c("blouse_", "blouse_horse", "blouse_elegant"), lty = "blank", 
    fill = c("skyblue", "pink1", "mediumorchid"))
grid.draw(venn.plot)

3.1 Support

\[Support(\{X\} \rightarrow \{Y\}) = \frac{Number\ of\ transactions\ with\ both\ X\ and\ Y}{Total\ number\ of\ transactions} = P(X \cap Y)\]

Support measures how frequently an itemset occurs across all transactions. In our case, the blouse with horse print has a higher support than the elegant blouse (0.35 vs 0.30).

Support of \(\{blouse\_horse\} \rightarrow \{blouse\_elegant\}\) will be significantly lower, at approximately 0.01 as there are only 7 transactions containing both products.
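As a minimal sketch of the definition, support can be computed directly from a transaction-by-item incidence matrix. The matrix `baskets` below is a made-up five-transaction example, not the shop's data, and the helper `support()` is a hypothetical name introduced only for illustration.

```r
# Hypothetical incidence matrix: rows are transactions, columns are items.
baskets <- matrix(
  c(TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  FALSE,
    FALSE, TRUE,  FALSE,
    TRUE,  FALSE, TRUE,
    FALSE, FALSE, TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("blouse_horse", "blouse_", "blouse_elegant"))
)

# Support of an itemset: share of transactions that contain every item in it.
support <- function(incidence, items) {
  mean(rowSums(incidence[, items, drop = FALSE]) == length(items))
}

support(baskets, "blouse_horse")                       # 3 of 5 transactions
support(baskets, c("blouse_horse", "blouse_elegant"))  # 1 of 5 transactions
```

Adding an item to an itemset can never raise its support, which is why the support of a rule is at most the support of its least frequent item.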

3.2 Confidence

\[Confidence(\{X\} \rightarrow \{Y\}) = \frac{Number\ of\ transactions\ with\ both\ X\ and\ Y}{Total\ number\ of\ transactions\ with\ X} = \frac{P(X \cap Y)}{P(X)}\]

Confidence answers the question: of all transactions containing {blouse_horse}, how many also contained {blouse_elegant}? The answer is 7/246 = 0.03. Confidence for \(\{blouse\_horse\} \rightarrow \{blouse\_\}\) is much higher, at 0.12. It is worth noting that confidence can be misleading, as a high confidence value does not necessarily mean a strong association between items. If the consequent (the item on the right-hand side, {Y}) is very frequent, the confidence will be high regardless of what the antecedent (the left-hand side, {X}) is.
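The caveat about frequent consequents can be illustrated with a small numerical sketch. The counts below are invented and deliberately chosen so that X and Y are statistically independent; they do not come from the shop's data.

```r
# Hypothetical counts, chosen so that X and Y are statistically independent.
n_total <- 1000   # all transactions
n_x     <- 100    # transactions containing X
n_y     <- 900    # transactions containing Y (a very popular item)
n_xy    <- 90     # transactions containing both X and Y

confidence_x_y <- n_xy / n_x       # share of X-baskets that also hold Y
baseline_y     <- n_y / n_total    # share of all baskets that hold Y

cat("Confidence(X -> Y):", confidence_x_y, "\n")
cat("Baseline P(Y):     ", baseline_y, "\n")
```

Both values are equal here: the rule looks strong by confidence alone, yet knowing that X is in the basket says nothing new about Y.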

3.3 Lift

\[Lift(\{X\} \rightarrow \{Y\}) = \frac{(No.\ of\ transactions\ with\ both\ X\ and\ Y)/(No.\ of\ transactions\ with\ X)}{Fraction\ of\ transactions\ with\ Y} = \frac{P(X \cap Y)}{P(X) \cdot P(Y)}\]

Lift was introduced to overcome the weakness of confidence mentioned above. It is the ratio of the probability of having {Y} in a transaction, given that {X} is in the basket, to the probability of having {Y} in a transaction without any knowledge about the presence of {X}. A lift greater than 1 indicates a positive association between {X} and {Y}: the customer is more likely to buy {Y} if {X} is already in the basket. A lift close to 1 suggests that the items are independent, and a lift below 1 points to a negative association. In our case, \(Lift(\{blouse\_horse\} \rightarrow \{blouse\_\})\) equals (29/246)/(254/667) = 0.31, so having a blouse with horse print in the cart does not increase the probability of a regular blouse being added; within this reduced dataset it actually lowers it.
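The three measures can be checked against the counts quoted in this section. The sketch below uses only numbers that appear in the text; the total of 667 transactions is read off the lift calculation above, and `rule_metrics()` is a hypothetical helper name.

```r
# Counts quoted in the text for the reduced three-item dataset.
n_total         <- 667  # transactions in the reduced dataset
n_horse         <- 246  # contain blouse_horse
n_plain         <- 254  # contain blouse_
n_horse_plain   <- 29   # contain blouse_horse and blouse_
n_horse_elegant <- 7    # contain blouse_horse and blouse_elegant

# Support, confidence and lift of X -> Y from raw counts.
rule_metrics <- function(n_xy, n_x, n_y, n) {
  support    <- n_xy / n
  confidence <- n_xy / n_x
  lift       <- confidence / (n_y / n)
  c(support = support, confidence = confidence, lift = lift)
}

# {blouse_horse} -> {blouse_}
horse_plain <- rule_metrics(n_horse_plain, n_horse, n_plain, n_total)
print(round(horse_plain, 2))

# {blouse_horse} -> {blouse_elegant}: support and confidence only,
# because the exact count of elegant blouses is not quoted in the text.
print(round(c(support    = n_horse_elegant / n_total,
              confidence = n_horse_elegant / n_horse), 2))
```

Rounded to two decimals, this reproduces the confidence of 0.12 and the lift of 0.31 reported for {blouse_horse} → {blouse_}.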

4 Analysis

To get a better understanding of the data, I will first mine frequent itemsets with the Eclat algorithm.

freq.items <- eclat(trans1, parameter=list(supp=0.005)) # basic eclat
freq.items <- sort(freq.items, by="support", decreasing = TRUE)

inspectDT(freq.items)

Unfortunately, most of the transactions contain just 1 item, so the dataset might not be big enough to mine significant rules.
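One way to quantify how many baskets hold a single item is to tabulate transaction sizes with `size()` from arules. The sketch below uses a made-up `toy_trans` object in place of `trans1`; the item label `trousers_warm` is a hypothetical name.

```r
library(arules)

# Hypothetical baskets standing in for trans1.
toy_trans <- as(list(
  c("blouse_horse"),
  c("blouse_"),
  c("blouse_horse", "blouse_cat"),
  c("trousers_warm")
), "transactions")

basket_sizes <- size(toy_trans)            # number of items per transaction
print(table(basket_sizes))                 # how many baskets of each size
single_share <- mean(basket_sizes == 1)    # share of single-item baskets
print(single_share)
```

On the real data the same check would be `table(size(trans1))`. Single-item baskets cannot support any rule with a non-empty antecedent, so a high share of them limits what can be mined.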

4.1 Grouped matrix-based visualization

Grouped matrix-based visualization comes in very handy while mining interesting patterns in transactions.

Firstly, I will render a balloon plot with antecedent groups as columns and consequents as rows. The size and color of the balloons show the aggregated support and the aggregated lift respectively. Columns and rows in the plot are ordered such that lift value is decreasing from top down and from left to right, so the maximum value is in the top left corner.

rules <- ruleInduction(freq.items, trans1, confidence=0.1)
plot(rules, method = "grouped")

We can clearly see that having {blouse_dog, blouse_} significantly increases the chances of adding {blouse_cat}. The number of such transactions (support) is not very big though, as there were not many cat print blouse models in the analysed period. There is also a very strong association between {blouse_physical education, blouse_} and {shorts_physical education}, as well as between {shorts_physical education, blouse_} and {blouse_physical education}.

4.2 Graph-based visualizations

In graph-based visualizations, items (annotated with item labels) are connected to itemsets or rules by arrows. Rules are represented by vertices, whose color and size show the lift and support values respectively. Arrows pointing from items to a rule indicate left-hand-side items, and arrows pointing from a rule to items indicate right-hand-side items. Graphs are a very clear visualization of rules, but they become messy for a big set of rules. Therefore I will limit the set to the 15 rules with the highest lift.

plot(head(rules, n = 15, by = "lift"), method="graph")

We can clearly see that shorts for physical education are associated with blouses for physical education. In general, having bottoms in the basket increases the probability of adding another type of bottoms, and having one kind of blouse increases the probability of buying another top.

4.3 Rules inspection

Firstly, I will inspect only the significant rules, using a significance level of 5%. To find them, I will use the is.significant function, which uses Fisher’s exact test.

inspectDT(rules[is.significant(rules, trans1, alpha=0.05)])

There are just 7 significant rules. There is a strong link between warm trousers and trousers, between shorts for PE and blouse for PE, as well as between owl print blouse and blouse with sequins. Interestingly, having a blouse with a dog print increases the chances of buying a blouse with a cat, but not the other way around. The lack of significance of the other rules does not mean that they will not be used: we need to build a recommendation system, so a rule for each combination in the basket is essential, even if it is not statistically significant.
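To make the significance check more concrete, the sketch below runs Fisher’s exact test by hand for a single rule with base R’s `fisher.test`. It assumes that the test behind is.significant is one-sided, looking for a positive association; that assumption should be checked against the arules documentation. The counts are the ones quoted in Section 3 for {blouse_horse} → {blouse_} in the reduced dataset, so the result is an illustration and not one of the 7 rules above.

```r
# Counts for {blouse_horse} -> {blouse_} as quoted in Section 3.
n_total <- 667   # transactions in the reduced dataset
n_x     <- 246   # contain blouse_horse (antecedent X)
n_y     <- 254   # contain blouse_ (consequent Y)
n_xy    <- 29    # contain both

# 2x2 contingency table: rows = X present/absent, columns = Y present/absent.
contingency <- matrix(
  c(n_xy,       n_x - n_xy,
    n_y - n_xy, n_total - n_x - n_y + n_xy),
  nrow = 2, byrow = TRUE,
  dimnames = list(X = c("yes", "no"), Y = c("yes", "no"))
)

# One-sided test (assumption): is Y more likely when X is present?
test <- fisher.test(contingency, alternative = "greater")
print(test$p.value)
print(test$p.value < 0.05)   # TRUE would mean "significant at 5%"
```

Because the lift of this rule is well below 1, a one-sided test for positive association does not flag it as significant.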

To dig deeper into the relationships between physical education clothing items, let’s inspect all rules involving them.

rules.shorts_pe <- apriori(trans1, parameter = list(support=0.003, confidence = 0.1), appearance = list(rhs = "shorts_physical education"))
rules.shorts_pe <- sort(rules.shorts_pe, by="lift", decreasing = TRUE)

rules.blouse_pe <-  apriori(trans1, parameter = list(support=0.003, confidence = 0.1), appearance = list(rhs = "blouse_physical education"))
rules.blouse_pe <- sort(rules.blouse_pe, by="lift", decreasing = TRUE)

inspectDT(rules.shorts_pe)

inspectDT(rules.blouse_pe)

There is a strong link between blouse for PE and shorts for PE. More surprisingly, if the client also has a regular blouse in the basket, the lift and confidence values are even higher.

This company sells quite a large number of blouses with horse print (they appear in almost 1/3 of all transactions), so I will inspect what customers buy when they have a blouse with a horse in the basket and see whether the company can use this for cross-selling.

rules.horse <- apriori(trans1, parameter = list(support=0.005, confidence = 0.02), appearance = list(lhs = "blouse_horse"))
rules.horse <- sort(rules.horse, by="lift", decreasing = TRUE)

inspectDT(rules.horse)

We can see that the lift value is rather high for cat print blouse and owl print blouse. This suggests developing collections with other animal prints, as customers who have a blouse with a horse in the basket are more likely to buy a blouse with another animal. Unfortunately, the confidence values are not that high.

5 App

To show how a recommender system based on association rules would behave, I created a web app with Shiny, hosted on shinyapps.io.

As the user “adds” items to the basket from the list, the recommendations are displayed below, together with the values of support, confidence and lift. The recommendations are ordered in such a way that rules with exactly the same items on the LHS as in the basket are displayed first. Rules are then ordered by significance. The user can alter the support and confidence levels.

If no rules are found for the specific basket, rules are displayed as if the basket was empty.
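The matching logic described above can be sketched in a few lines of base R. This is an illustrative reconstruction under stated assumptions, not the code of the app: the rule table `rules_tbl`, its values, the `p_value` column used for ordering by significance, the label `trousers_` and the function `recommend()` are all hypothetical.

```r
# Hypothetical rule table; in the app the rules would come from arules.
rules_tbl <- data.frame(
  rhs     = c("blouse_cat", "shorts_physical education",
              "blouse_horse", "blouse_cat"),
  lift    = c(3.0, 5.0, 1.2, 2.0),
  p_value = c(0.01, 0.001, 0.20, 0.04),
  stringsAsFactors = FALSE
)
rules_tbl$lhs <- list(
  c("blouse_dog", "blouse_"),
  c("blouse_physical education"),
  character(0),                 # empty LHS: used when the basket has no match
  c("blouse_dog")
)

recommend <- function(basket, rules) {
  # A rule applies when it has an LHS and every LHS item is in the basket.
  applies <- vapply(rules$lhs,
                    function(l) length(l) > 0 && all(l %in% basket),
                    logical(1))
  if (!any(applies)) {
    # No match: behave as if the basket were empty.
    applies <- lengths(rules$lhs) == 0
  }
  hits <- rules[applies, , drop = FALSE]
  # Exact LHS matches first, then the most significant (smallest p-value).
  exact <- vapply(hits$lhs, function(l) setequal(l, basket), logical(1))
  hits  <- hits[order(!exact, hits$p_value), , drop = FALSE]
  # Recommend each item once and skip what is already in the basket.
  unique(hits$rhs[!hits$rhs %in% basket])
}

print(recommend(c("blouse_dog", "blouse_"), rules_tbl))
print(recommend(c("trousers_"), rules_tbl))
```

The first call matches two rules and returns their shared consequent once; the second finds no matching rule and falls back to the rule with an empty LHS.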

https://kkrynska.shinyapps.io/associationrules/

6 Summary

In this article, I used association rule methods to discover rules between items. I came to the conclusion that customers in general choose similar items, e.g. if they buy warm trousers, they are also more likely to buy regular trousers. For this shop it might be crucial to provide a variety of items, so that customers who buy a blouse with a dog print might also add a blouse with a cat print.

I created a simple web app in R Shiny that serves interactive association rules for this shop. If this engine were to be used in an on-line shop, it could be developed in numerous ways. Firstly, in children’s clothing there are often separate collections for small children (small sizes), pre-teenagers and teenagers, so the ideal model should also consider the size of items. Also, there should be a boost for items that are currently trending (best-sellers), as trends are critical in fashion.