Market Basket Analysis for Product Recommendation Systems, Fast Mover Analysis, Store Layout and Promotional Activities

Association mining, or market basket analysis, is a way to understand product purchase patterns. The patterns it finds can inform decisions such as promotional schemes, product recommenders on a website, product placement in a store layout, and fast-mover and slow-mover analysis.

Let’s load the data.

The dataset is available at: https://www.kaggle.com/gorkhachatryan01/purchase-behaviour

Then draw an item frequency plot of the 25 most frequent items.
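The loading and plotting code is not shown in the post. A minimal sketch of how this step might look with the `arules` package follows. The temporary file and its four baskets are hypothetical stand-ins for the Kaggle download, and the comma separator is an assumption about the file layout. The rule tables further down contain labels such as `loaves,` and `sauce,` with trailing commas, which suggests the original run may have split items on whitespace, so that multi-word products like "sandwich bags" ended up as separate tokens; reading with `sep = ","` would avoid that if the file is indeed comma-separated.

```r
library(arules)

# Hypothetical stand-in for the downloaded file: basket format,
# one transaction per line, items separated by commas.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "sandwich bags,spaghetti sauce,soap",
  "spaghetti sauce,paper towels",
  "sandwich bags,paper towels,soap",
  "soap,spaghetti sauce"
), path)

# sep = "," keeps multi-word products such as "sandwich bags" as one item.
mydata <- read.transactions(path, format = "basket", sep = ",")
summary(mydata)

# Bar chart of the most frequent items (25, or fewer if fewer items exist).
n_top <- min(25, length(itemLabels(mydata)))
itemFrequencyPlot(mydata, topN = n_top, type = "absolute")
```

With the real data, `path` would point to the downloaded file instead of a temporary one.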

We will follow the standard process and first run the algorithm on the data without setting any parameters, so that it returns rules under its default thresholds.

rules<-apriori(mydata)

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.8    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target   ext
##      10  rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 149 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[95 item(s), 1499 transaction(s)] done [0.00s].
## sorting and recoding items ... [50 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [86 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].

The log shows the default parameters (minimum support 0.1, minimum confidence 0.8). Let’s see how many rules the algorithm was able to find.

## set of 86 rules
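The "Absolute minimum support count" in the log follows from the relative support threshold and the number of transactions. The arithmetic below is a small check that uses only numbers from the logs in this post; that the count is obtained by rounding down is an inference from the logged values, not something the post states.

```r
n_transactions <- 1499   # from "1499 transaction(s)" in the log

# Default threshold: support = 0.1
default_count <- floor(0.1 * n_transactions)
default_count            # compare with "Absolute minimum support count: 149"

# Relaxed threshold used later in the post: support = 0.001
relaxed_count <- floor(0.001 * n_transactions)
relaxed_count            # compare with "Absolute minimum support count: 1"
```

Under the defaults an itemset therefore has to occur in at least 149 baskets, which is consistent with only 50 of the 95 items remaining after the "sorting and recoding items" step.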

Now let’s plot the default rules.

Having had a first look at the data, we can set parameters and start mining rules, adjusting the support and confidence thresholds and ranking the results by lift.
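For readers new to the three measures, here is a minimal base-R sketch that computes them by hand for one rule, {spaghetti} => {sauce}. The five baskets are made up for illustration and are not taken from the dataset.

```r
# Hypothetical toy baskets (not from the Kaggle data).
baskets <- list(
  c("spaghetti", "sauce", "bread"),
  c("spaghetti", "sauce"),
  c("bread", "milk"),
  c("spaghetti", "sauce", "milk"),
  c("bread", "sauce")
)

# Support of an itemset: share of baskets that contain every item in it.
support <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "spaghetti"
rhs <- "sauce"

supp_rule  <- support(c(lhs, rhs))       # share of baskets with lhs and rhs
confidence <- supp_rule / support(lhs)   # share of lhs baskets that also hold rhs
lift       <- confidence / support(rhs)  # confidence relative to rhs frequency

c(support = supp_rule, confidence = confidence, lift = lift)
```

In this toy example the support is 0.6, the confidence is 1 and the lift is 1.25: every basket with spaghetti also contains sauce, and that is 1.25 times the overall share of baskets with sauce. A lift above 1 indicates that the two sides occur together more often than their individual frequencies would suggest.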

We can choose an individual product and mine its associations. As an example, suppose we want to sell bags and target the customers who buy other products along with bags. We then look for rules in the transactions that have bags on the right-hand side.

bags_rules<-apriori(mydata,appearance = list(default="lhs",rhs="bags"))

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.8    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target   ext
##      10  rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 149 
## 
## set item appearances ...[1 item(s)] done [0.00s].
## set transactions ...[95 item(s), 1499 transaction(s)] done [0.00s].
## sorting and recoding items ... [50 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [0 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].

Let’s find the number of rules.

## set of 0 rules

No rules are found with the default thresholds, so we relax the parameters (the log below shows a minimum support of 0.001 and a minimum confidence of 0.9) and look for associations again.

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.9    0.1    1 none FALSE            TRUE       5   0.001      1
##  maxlen target   ext
##      10  rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 1 
## 
## set item appearances ...[1 item(s)] done [0.00s].
## set transactions ...[95 item(s), 1499 transaction(s)] done [0.00s].
## sorting and recoding items ... [95 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 6 7 8 9 10

## Warning in apriori(mydata, appearance = list(default = "lhs", rhs =
## "bags"), : Mining stopped (maxlen reached). Only patterns up to a length of
## 10 returned!

##  done [4.38s].
## writing ... [4803 rule(s)] done [0.44s].
## creating S4 object  ... done [0.14s].

Let’s check the number of rules.

## set of 4803 rules
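The call that produced this run is not shown in the post. A reconstruction consistent with the logged parameters (support 0.001, confidence 0.9) and with the call echoed in the warning might look like the sketch below. The toy baskets are hypothetical and only make the example runnable on its own; the argument values are read off the log, but the exact original call is an assumption.

```r
library(arules)

# Hypothetical toy baskets standing in for `mydata`; repeated 300 times so
# that a support of 0.001 still corresponds to at least one transaction.
baskets <- rep(list(
  c("bags", "sandwich", "soap"),
  c("bags", "sandwich"),
  c("sandwich", "soap"),
  c("bags", "sandwich", "milk"),
  c("milk", "soap")
), times = 300)
toy <- as(baskets, "transactions")

# Thresholds taken from the log: support 0.001, confidence 0.9.
bags_rules <- apriori(
  toy,
  appearance = list(default = "lhs", rhs = "bags"),
  parameter  = list(support = 0.001, confidence = 0.9),
  control    = list(verbose = FALSE)   # suppress the progress log
)

bags_rules            # prints the size of the rule set
inspect(bags_rules)   # lists each rule with its quality measures
```

With the post's data, replacing `toy` by `mydata` should give a comparable run. The "maxlen reached" warning in the log could be addressed by setting `maxlen` in `parameter`, at the cost of a longer search.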

By decreasing the support I was able to find many more associations. Now let’s plot them.

## Warning: plot: Too many rules supplied. Only plotting the best 100 rules
## using 'lift' (change control parameter max if needed)
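The warning says that only the 100 rules with the highest lift were drawn. One way to control this explicitly is to select the top rules before plotting. The sketch below assumes the plots in this post come from the `arulesViz` package; the plotting code is not shown, so that is an assumption. The toy baskets and the cutoff of 20 are illustrative choices.

```r
library(arules)

# Hypothetical toy baskets standing in for `mydata`.
toy <- as(list(
  c("bags", "sandwich", "soap"),
  c("bags", "sandwich"),
  c("sandwich", "soap"),
  c("bags", "sandwich", "milk"),
  c("milk", "soap")
), "transactions")

rules <- apriori(
  toy,
  parameter = list(support = 0.2, confidence = 0.5, minlen = 2),
  control   = list(verbose = FALSE)
)

# Keep at most the 20 rules with the highest lift.
top_rules <- head(sort(rules, by = "lift"), n = 20)

# The plot method for rules comes from arulesViz; skip if it is not installed.
if (requireNamespace("arulesViz", quietly = TRUE)) {
  library(arulesViz)
  plot(top_rules, method = "graph")
}
```

Selecting the rules first makes it clear which ones end up in the figure, instead of relying on the plot function's own cutoff.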

Similarly, we can find associations with sandwich.

sand_rules<-apriori(mydata,appearance = list(default="lhs",rhs="sandwich"))

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.8    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target   ext
##      10  rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 149 
## 
## set item appearances ...[1 item(s)] done [0.00s].
## set transactions ...[95 item(s), 1499 transaction(s)] done [0.01s].
## sorting and recoding items ... [50 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [6 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].

Let’s check the number of rules.

## set of 6 rules

Let’s plot the rules.

To inspect individual rules, sort the default rule set by confidence and look at the first ten:

rules<-sort(rules,by="confidence",decreasing = TRUE)
inspect(rules[1:10])

##      lhs                    rhs           support   confidence lift    
## [1]  {loaves,}           => {sandwich}    0.2368245 1          2.320433
## [2]  {bags,}             => {sandwich}    0.2414943 1          2.320433
## [3]  {rolls,}            => {dinner}      0.2414943 1          3.863402
## [4]  {sauce,}            => {spaghetti}   0.2434957 1          3.934383
## [5]  {towels,}           => {paper}       0.2481654 1          3.785354
## [6]  {detergent,}        => {laundry}     0.2501668 1          3.785354
## [7]  {purpose,}          => {all-}        0.2508339 1          3.794937
## [8]  {foil,}             => {aluminum}    0.2521681 1          3.785354
## [9]  {meals,}            => {individual}  0.2568379 1          3.683047
## [10] {liquid/detergent,} => {dishwashing} 0.2541694 1          3.728856

Similarly, we can sort by lift:

rules<-sort(rules,by="lift",decreasing = TRUE)
inspect(rules[1:10])

##      lhs                        rhs         support   confidence lift    
## [1]  {sandwich,spaghetti}    => {sauce,}    0.1060707 0.9636364  3.957509
## [2]  {sauce,}                => {spaghetti} 0.2434957 1.0000000  3.934383
## [3]  {sauce,,soap,}          => {spaghetti} 0.1014009 1.0000000  3.934383
## [4]  {sandwich,sauce,}       => {spaghetti} 0.1060707 1.0000000  3.934383
## [5]  {sauce,,vegetables,}    => {spaghetti} 0.1334223 1.0000000  3.934383
## [6]  {spaghetti}             => {sauce,}    0.2434957 0.9580052  3.934383
## [7]  {spaghetti,vegetables,} => {sauce,}    0.1334223 0.9569378  3.929999
## [8]  {soap,,spaghetti}       => {sauce,}    0.1014009 0.9500000  3.901507
## [9]  {all-,sandwich}         => {purpose,}  0.1107405 0.9707602  3.870132
## [10] {dinner}                => {rolls,}    0.2414943 0.9329897  3.863402
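In the lift table, rules [3] to [5] add an extra item to the left-hand side of rule [2] without changing its confidence or lift, so they carry little extra information. A sketch of how such rules could be filtered with `is.redundant()` from `arules` is shown below on made-up baskets. This pruning step is a suggestion, not something the post performs.

```r
library(arules)

# Hypothetical toy baskets (not from the Kaggle data).
toy <- as(list(
  c("sauce", "spaghetti", "soap"),
  c("sauce", "spaghetti"),
  c("sauce", "spaghetti", "soap"),
  c("soap"),
  c("spaghetti")
), "transactions")

rules <- apriori(
  toy,
  parameter  = list(support = 0.2, confidence = 0.8, minlen = 2),
  appearance = list(default = "lhs", rhs = "spaghetti"),
  control    = list(verbose = FALSE)
)

# A rule is flagged when a more general rule (a subset of its left-hand side,
# same right-hand side) already reaches the same or a higher confidence.
pruned <- rules[!is.redundant(rules)]

inspect(sort(rules,  by = "lift"))   # all mined rules
inspect(sort(pruned, by = "lift"))   # rules left after pruning
```

On the post's rule set the same idea would be applied as `rules[!is.redundant(rules)]` before sorting and inspecting.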

Sangamesh K S

December 30, 2017