Unsupervised Learning - Association Rules Project

Adrianna Łazuga

418397

Project Introduction

The aim of this project is to analyze transactions from a supermarket. Using association rules, I will try to discover frequent patterns among the products that customers buy together. This analysis could help in planning future product placement in the store to maximize profit.

Libraries

library(arules)
## Warning: package 'arules' was built under R version 4.4.2
## Loading required package: Matrix
## 
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
## 
##     abbreviate, write
library(arulesViz)
## Warning: package 'arulesViz' was built under R version 4.4.2

Dataset

transactions <- read.transactions("Market_Basket_Optimisation.csv", header = FALSE, sep = ",")
## Warning in asMethod(object): removing duplicated items in transactions
summary(transactions)
## transactions as itemMatrix in sparse format with
##  7501 rows (elements/itemsets/transactions) and
##  119 columns (items) and a density of 0.03288973 
## 
## most frequent items:
## mineral water          eggs     spaghetti  french fries     chocolate 
##          1788          1348          1306          1282          1229 
##       (Other) 
##         22405 
## 
## element (itemset/transaction) length distribution:
## sizes
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
## 1754 1358 1044  816  667  493  391  324  259  139  102   67   40   22   17    4 
##   18   19   20 
##    1    2    1 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   2.000   3.000   3.914   5.000  20.000 
## 
## includes extended item information - examples:
##              labels
## 1           almonds
## 2 antioxydant juice
## 3         asparagus

The dataset contains 7501 transactions (rows) and 119 distinct items (columns). The most frequently bought items are mineral water, eggs, spaghetti, french fries and chocolate. The bar plot below shows the 10 most frequent items appearing in transactions.

itemFrequencyPlot(transactions, topN = 10, col = "lightblue")
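As a quick sanity check, the density and mean basket size reported by summary(transactions) can be recomputed from the item counts in the same output. This is a minimal base R sketch; the variable names are my own, and the counts are copied by hand from the summary rather than read from the transactions object.

```r
# Figures copied from the summary(transactions) output above.
n_transactions <- 7501
n_items <- 119
top_counts <- c(`mineral water` = 1788, eggs = 1348, spaghetti = 1306,
                `french fries` = 1282, chocolate = 1229)
other_count <- 22405

# Total number of (transaction, item) pairs, i.e. filled cells in the matrix.
total_cells <- sum(top_counts) + other_count

# Density: share of filled cells in the 7501 x 119 transaction matrix.
density <- total_cells / (n_transactions * n_items)

# Mean basket size: average number of distinct items per transaction.
mean_basket <- total_cells / n_transactions

# Relative frequency (support) of each of the five most frequent items.
relative_freq <- top_counts / n_transactions

print(round(density, 8))
print(round(mean_basket, 3))
print(round(relative_freq, 4))
```

The recomputed density and mean agree with the reported 0.03288973 and 3.914, and the relative frequency of mineral water (about 0.238) is the value that appears later as the coverage of rules with mineral water as the antecedent.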

The first 10 transactions, showing the purchased items:

inspect(transactions[1:10])
##      items               
## [1]  {almonds,           
##       antioxydant juice, 
##       avocado,           
##       cottage cheese,    
##       energy drink,      
##       frozen smoothie,   
##       green grapes,      
##       green tea,         
##       honey,             
##       low fat yogurt,    
##       mineral water,     
##       olive oil,         
##       salad,             
##       salmon,            
##       shrimp,            
##       spinach,           
##       tomato juice,      
##       vegetables mix,    
##       whole weat flour,  
##       yams}              
## [2]  {burgers,           
##       eggs,              
##       meatballs}         
## [3]  {chutney}           
## [4]  {avocado,           
##       turkey}            
## [5]  {energy bar,        
##       green tea,         
##       milk,              
##       mineral water,     
##       whole wheat rice}  
## [6]  {low fat yogurt}    
## [7]  {french fries,      
##       whole wheat pasta} 
## [8]  {light cream,       
##       shallot,           
##       soup}              
## [9]  {frozen vegetables, 
##       green tea,         
##       spaghetti}         
## [10] {french fries}

Frequency

Here we can see how many transactions each item appears in, and the spread of products across 20 randomly sampled transactions.

itemFrequency(transactions, type="absolute")
##              almonds    antioxydant juice            asparagus 
##                  153                   67                   36 
##              avocado          babies food                bacon 
##                  250                   34                   65 
##       barbecue sauce            black tea          blueberries 
##                   81                  107                   69 
##           body spray              bramble             brownies 
##                   86                   14                  253 
##            bug spray         burger sauce              burgers 
##                   65                   44                  654 
##               butter                 cake           candy bars 
##                  226                  608                   73 
##              carrots          cauliflower              cereals 
##                  115                   36                  193 
##            champagne              chicken                chili 
##                  351                  450                   46 
##            chocolate      chocolate bread              chutney 
##                 1229                   32                   31 
##                cider  clothes accessories              cookies 
##                   79                   63                  603 
##          cooking oil                 corn       cottage cheese 
##                  383                   36                  239 
##                cream         dessert wine             eggplant 
##                    7                   33                   99 
##                 eggs           energy bar         energy drink 
##                 1348                  203                  200 
##             escalope extra dark chocolate            flax seed 
##                  595                   90                   68 
##         french fries          french wine          fresh bread 
##                 1282                  169                  323 
##           fresh tuna        fromage blanc      frozen smoothie 
##                  167                  102                  475 
##    frozen vegetables      gluten free bar        grated cheese 
##                  715                   52                  393 
##          green beans         green grapes            green tea 
##                   65                   68                  991 
##          ground beef                 gums                  ham 
##                  737                  101                  199 
##     hand protein bar        herb & pepper                honey 
##                   39                  371                  356 
##             hot dogs              ketchup          light cream 
##                  243                   33                  117 
##           light mayo       low fat yogurt            magazines 
##                  204                  574                   82 
##        mashed potato           mayonnaise            meatballs 
##                   31                   46                  157 
##               melons                 milk        mineral water 
##                   90                  972                 1788 
##                 mint       mint green tea              muffins 
##                  131                   42                  181 
## mushroom cream sauce              napkins          nonfat milk 
##                  143                    5                   78 
##              oatmeal                  oil            olive oil 
##                   33                  173                  494 
##             pancakes      parmesan cheese                pasta 
##                  713                  149                  118 
##               pepper             pet food              pickles 
##                  199                   49                   45 
##          protein bar             red wine                 rice 
##                  139                  211                  141 
##                salad               salmon                 salt 
##                   37                  319                   69 
##             sandwich              shallot              shampoo 
##                   34                   58                   37 
##               shrimp                 soda                 soup 
##                  536                   47                  379 
##            spaghetti      sparkling water              spinach 
##                 1306                   47                   53 
##         strawberries        strong cheese                  tea 
##                  160                   58                   29 
##         tomato juice         tomato sauce             tomatoes 
##                  228                  106                  513 
##           toothpaste               turkey       vegetables mix 
##                   61                  469                  193 
##          water spray           white wine     whole weat flour 
##                    3                  124                   70 
##    whole wheat pasta     whole wheat rice                 yams 
##                  221                  439                   86 
##          yogurt cake             zucchini 
##                  205                   71
image(sample(transactions, 20))

Association Rules

The aim of applying the Apriori algorithm is to find association rules between the items in the transactions. The parameters used for the algorithm are minimum support = 0.01, minimum confidence = 0.2 and minimum rule length = 2.

as_rules <- apriori(transactions, parameter = list(support = 0.01, confidence = 0.2, minlen = 2))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.2    0.1    1 none FALSE            TRUE       5    0.01      2
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 75 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[119 item(s), 7501 transaction(s)] done [0.00s].
## sorting and recoding items ... [75 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.01s].
## writing ... [163 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].

Using those parameters the algorithm found 163 rules consisting of 2 or 3 items.
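To make the support threshold concrete, the sketch below translates it into a minimum number of transactions and rechecks the rule length figures reported by summary(as_rules). It is plain arithmetic on numbers copied from the output, not a call into arules, and the variable names are my own.

```r
n_transactions <- 7501
min_support <- 0.01

# support = count / n_transactions, so support >= min_support
# requires count >= min_support * n_transactions (75.01 here).
min_count_needed <- ceiling(min_support * n_transactions)

# Rule length distribution as reported by summary(as_rules).
rule_lengths <- c(`2` = 116, `3` = 47)
n_rules <- sum(rule_lengths)
mean_length <- sum(as.numeric(names(rule_lengths)) * rule_lengths) / n_rules

print(min_count_needed)
print(n_rules)
print(round(mean_length, 3))
```

A rule therefore needs at least 76 supporting transactions, which is consistent with the smallest count (76) in the summary of quality measures; the log line above prints the threshold as 75.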

summary(as_rules)
## set of 163 rules
## 
## rule length distribution (lhs + rhs):sizes
##   2   3 
## 116  47 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.000   2.000   2.000   2.288   3.000   3.000 
## 
## summary of quality measures:
##     support          confidence        coverage            lift       
##  Min.   :0.01013   Min.   :0.2000   Min.   :0.02000   Min.   :0.9025  
##  1st Qu.:0.01280   1st Qu.:0.2283   1st Qu.:0.04773   1st Qu.:1.3512  
##  Median :0.01640   Median :0.2656   Median :0.05999   Median :1.5718  
##  Mean   :0.02010   Mean   :0.2881   Mean   :0.07545   Mean   :1.6372  
##  3rd Qu.:0.02300   3rd Qu.:0.3318   3rd Qu.:0.09112   3rd Qu.:1.8494  
##  Max.   :0.05973   Max.   :0.5067   Max.   :0.23837   Max.   :3.2920  
##      count      
##  Min.   : 76.0  
##  1st Qu.: 96.0  
##  Median :123.0  
##  Mean   :150.8  
##  3rd Qu.:172.5  
##  Max.   :448.0  
## 
## mining info:
##          data ntransactions support confidence
##  transactions          7501    0.01        0.2
##                                                                                          call
##  apriori(data = transactions, parameter = list(support = 0.01, confidence = 0.2, minlen = 2))
inspect(as_rules[1:10])
##      lhs              rhs             support    confidence coverage   lift    
## [1]  {cereals}     => {mineral water} 0.01026530 0.3989637  0.02572990 1.673729
## [2]  {red wine}    => {spaghetti}     0.01026530 0.3649289  0.02812958 2.095966
## [3]  {red wine}    => {mineral water} 0.01093188 0.3886256  0.02812958 1.630358
## [4]  {champagne}   => {chocolate}     0.01159845 0.2478632  0.04679376 1.512793
## [5]  {avocado}     => {mineral water} 0.01159845 0.3480000  0.03332889 1.459926
## [6]  {fresh bread} => {mineral water} 0.01333156 0.3095975  0.04306093 1.298820
## [7]  {salmon}      => {chocolate}     0.01066524 0.2507837  0.04252766 1.530617
## [8]  {salmon}      => {spaghetti}     0.01346487 0.3166144  0.04252766 1.818472
## [9]  {salmon}      => {mineral water} 0.01706439 0.4012539  0.04252766 1.683336
## [10] {honey}       => {spaghetti}     0.01186508 0.2500000  0.04746034 1.435873
##      count
## [1]   77  
## [2]   77  
## [3]   82  
## [4]   87  
## [5]   87  
## [6]  100  
## [7]   80  
## [8]  101  
## [9]  128  
## [10]  89

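The 10 top-ranked rules under each quality measure (support, confidence, coverage, lift and count) are shown below. Since count is simply support multiplied by the number of transactions, the rankings by support and by count are identical.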

inspect(sort(as_rules, by = "support")[1:10])
##      lhs                rhs             support    confidence coverage  
## [1]  {spaghetti}     => {mineral water} 0.05972537 0.3430322  0.17411012
## [2]  {mineral water} => {spaghetti}     0.05972537 0.2505593  0.23836822
## [3]  {chocolate}     => {mineral water} 0.05265965 0.3213995  0.16384482
## [4]  {mineral water} => {chocolate}     0.05265965 0.2209172  0.23836822
## [5]  {eggs}          => {mineral water} 0.05092654 0.2833828  0.17970937
## [6]  {mineral water} => {eggs}          0.05092654 0.2136465  0.23836822
## [7]  {milk}          => {mineral water} 0.04799360 0.3703704  0.12958272
## [8]  {mineral water} => {milk}          0.04799360 0.2013423  0.23836822
## [9]  {ground beef}   => {mineral water} 0.04092788 0.4165536  0.09825357
## [10] {ground beef}   => {spaghetti}     0.03919477 0.3989145  0.09825357
##      lift     count
## [1]  1.439085 448  
## [2]  1.439085 448  
## [3]  1.348332 395  
## [4]  1.348332 395  
## [5]  1.188845 382  
## [6]  1.188845 382  
## [7]  1.553774 360  
## [8]  1.553774 360  
## [9]  1.747522 307  
## [10] 2.291162 294
inspect(sort(as_rules, by = "confidence")[1:10])
##      lhs                          rhs             support    confidence
## [1]  {eggs, ground beef}       => {mineral water} 0.01013198 0.5066667 
## [2]  {ground beef, milk}       => {mineral water} 0.01106519 0.5030303 
## [3]  {chocolate, ground beef}  => {mineral water} 0.01093188 0.4739884 
## [4]  {frozen vegetables, milk} => {mineral water} 0.01106519 0.4689266 
## [5]  {soup}                    => {mineral water} 0.02306359 0.4564644 
## [6]  {pancakes, spaghetti}     => {mineral water} 0.01146514 0.4550265 
## [7]  {olive oil, spaghetti}    => {mineral water} 0.01026530 0.4476744 
## [8]  {milk, spaghetti}         => {mineral water} 0.01573124 0.4436090 
## [9]  {chocolate, milk}         => {mineral water} 0.01399813 0.4356846 
## [10] {ground beef, spaghetti}  => {mineral water} 0.01706439 0.4353741 
##      coverage   lift     count
## [1]  0.01999733 2.125563  76  
## [2]  0.02199707 2.110308  83  
## [3]  0.02306359 1.988472  82  
## [4]  0.02359685 1.967236  83  
## [5]  0.05052660 1.914955 173  
## [6]  0.02519664 1.908923  86  
## [7]  0.02293028 1.878079  77  
## [8]  0.03546194 1.861024 118  
## [9]  0.03212905 1.827780 105  
## [10] 0.03919477 1.826477 128
inspect(sort(as_rules, by = "coverage")[1:10])
##      lhs                rhs             support    confidence coverage 
## [1]  {mineral water} => {milk}          0.04799360 0.2013423  0.2383682
## [2]  {mineral water} => {chocolate}     0.05265965 0.2209172  0.2383682
## [3]  {mineral water} => {eggs}          0.05092654 0.2136465  0.2383682
## [4]  {mineral water} => {spaghetti}     0.05972537 0.2505593  0.2383682
## [5]  {eggs}          => {french fries}  0.03639515 0.2025223  0.1797094
## [6]  {eggs}          => {spaghetti}     0.03652846 0.2032641  0.1797094
## [7]  {eggs}          => {mineral water} 0.05092654 0.2833828  0.1797094
## [8]  {spaghetti}     => {ground beef}   0.03919477 0.2251149  0.1741101
## [9]  {spaghetti}     => {milk}          0.03546194 0.2036753  0.1741101
## [10] {spaghetti}     => {chocolate}     0.03919477 0.2251149  0.1741101
##      lift     count
## [1]  1.553774 360  
## [2]  1.348332 395  
## [3]  1.188845 382  
## [4]  1.439085 448  
## [5]  1.184961 273  
## [6]  1.167446 274  
## [7]  1.188845 382  
## [8]  2.291162 294  
## [9]  1.571779 266  
## [10] 1.373952 294
inspect(sort(as_rules, by = "lift")[1:10])
##      lhs                                   rhs                 support   
## [1]  {herb & pepper}                    => {ground beef}       0.01599787
## [2]  {mineral water, spaghetti}         => {ground beef}       0.01706439
## [3]  {tomatoes}                         => {frozen vegetables} 0.01613118
## [4]  {shrimp}                           => {frozen vegetables} 0.01666444
## [5]  {milk, mineral water}              => {frozen vegetables} 0.01106519
## [6]  {ground beef, mineral water}       => {spaghetti}         0.01706439
## [7]  {frozen vegetables, mineral water} => {milk}              0.01106519
## [8]  {milk, mineral water}              => {ground beef}       0.01106519
## [9]  {soup}                             => {milk}              0.01519797
## [10] {ground beef}                      => {spaghetti}         0.03919477
##      confidence coverage   lift     count
## [1]  0.3234501  0.04946007 3.291994 120  
## [2]  0.2857143  0.05972537 2.907928 128  
## [3]  0.2358674  0.06839088 2.474464 121  
## [4]  0.2332090  0.07145714 2.446574 125  
## [5]  0.2305556  0.04799360 2.418737  83  
## [6]  0.4169381  0.04092788 2.394681 128  
## [7]  0.3097015  0.03572857 2.389991  83  
## [8]  0.2305556  0.04799360 2.346536  83  
## [9]  0.3007916  0.05052660 2.321232 114  
## [10] 0.3989145  0.09825357 2.291162 294
inspect(sort(as_rules, by = "count")[1:10])
##      lhs                rhs             support    confidence coverage  
## [1]  {spaghetti}     => {mineral water} 0.05972537 0.3430322  0.17411012
## [2]  {mineral water} => {spaghetti}     0.05972537 0.2505593  0.23836822
## [3]  {chocolate}     => {mineral water} 0.05265965 0.3213995  0.16384482
## [4]  {mineral water} => {chocolate}     0.05265965 0.2209172  0.23836822
## [5]  {eggs}          => {mineral water} 0.05092654 0.2833828  0.17970937
## [6]  {mineral water} => {eggs}          0.05092654 0.2136465  0.23836822
## [7]  {milk}          => {mineral water} 0.04799360 0.3703704  0.12958272
## [8]  {mineral water} => {milk}          0.04799360 0.2013423  0.23836822
## [9]  {ground beef}   => {mineral water} 0.04092788 0.4165536  0.09825357
## [10] {ground beef}   => {spaghetti}     0.03919477 0.3989145  0.09825357
##      lift     count
## [1]  1.439085 448  
## [2]  1.439085 448  
## [3]  1.348332 395  
## [4]  1.348332 395  
## [5]  1.188845 382  
## [6]  1.188845 382  
## [7]  1.553774 360  
## [8]  1.553774 360  
## [9]  1.747522 307  
## [10] 2.291162 294
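To show where these numbers come from, the sketch below recomputes the quality measures of the top rule by support, {spaghetti} => {mineral water}, from raw counts. The counts are copied from the itemFrequency() output and the rule's count column; the variable names are my own.

```r
n <- 7501                # number of transactions
count_spaghetti <- 1306  # transactions containing spaghetti (lhs)
count_water <- 1788      # transactions containing mineral water (rhs)
count_both <- 448        # transactions containing both items

# support: share of all transactions that contain lhs and rhs together
support <- count_both / n

# confidence: share of lhs transactions that also contain rhs
confidence <- count_both / count_spaghetti

# coverage: share of all transactions that contain lhs
coverage <- count_spaghetti / n

# lift: confidence relative to the baseline frequency of rhs
lift <- confidence / (count_water / n)

print(round(c(support = support, confidence = confidence,
              coverage = coverage, lift = lift), 6))
```

A lift above 1 means that mineral water appears in baskets with spaghetti more often than it does in baskets overall.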

As an example, below are the rules involving “frozen vegetables” in the antecedent, on the left-hand side of the rule. They show which other items people buy when they are buying frozen vegetables.

inspect(subset(as_rules, lhs %in% "frozen vegetables"))
##      lhs                                   rhs             support   
## [1]  {frozen vegetables}                => {milk}          0.02359685
## [2]  {frozen vegetables}                => {french fries}  0.01906412
## [3]  {frozen vegetables}                => {chocolate}     0.02293028
## [4]  {frozen vegetables}                => {eggs}          0.02173044
## [5]  {frozen vegetables}                => {spaghetti}     0.02786295
## [6]  {frozen vegetables}                => {mineral water} 0.03572857
## [7]  {frozen vegetables, milk}          => {mineral water} 0.01106519
## [8]  {frozen vegetables, mineral water} => {milk}          0.01106519
## [9]  {frozen vegetables, spaghetti}     => {mineral water} 0.01199840
## [10] {frozen vegetables, mineral water} => {spaghetti}     0.01199840
##      confidence coverage   lift     count
## [1]  0.2475524  0.09532062 1.910382 177  
## [2]  0.2000000  0.09532062 1.170203 143  
## [3]  0.2405594  0.09532062 1.468215 172  
## [4]  0.2279720  0.09532062 1.268559 163  
## [5]  0.2923077  0.09532062 1.678867 209  
## [6]  0.3748252  0.09532062 1.572463 268  
## [7]  0.4689266  0.02359685 1.967236  83  
## [8]  0.3097015  0.03572857 2.389991  83  
## [9]  0.4306220  0.02786295 1.806541  90  
## [10] 0.3358209  0.03572857 1.928784  90
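The coverage column in this table is the support of the antecedent alone. The sketch below checks this for rules [1] and [7], using counts copied from the output above; it is plain arithmetic with my own variable names, not an arules call.

```r
n <- 7501
count_fv <- 715            # frozen vegetables, from itemFrequency()
count_fv_milk <- 177       # count of rule [1]: {frozen vegetables} => {milk}
count_fv_milk_water <- 83  # count of rule [7]: {frozen vegetables, milk} => {mineral water}

# Rule [1] has the single item as antecedent, so its coverage is that item's frequency.
coverage_rule1 <- count_fv / n

# Rule [7] has {frozen vegetables, milk} as antecedent, so its coverage
# equals the support of rule [1], which counts exactly that pair.
coverage_rule7 <- count_fv_milk / n

# Confidence of rule [7]: of the baskets with both antecedent items,
# the share that also contains mineral water.
confidence_rule7 <- count_fv_milk_water / count_fv_milk

print(round(c(coverage_rule1, coverage_rule7, confidence_rule7), 7))
```

This also explains why the two-item antecedent reaches a higher confidence (about 0.47) on far fewer transactions: the denominator shrinks from 715 baskets to 177.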

Here are the rules involving “eggs” as the consequent, on the right-hand side of the rule. These rules show which products, once already picked, tend to lead people to buy eggs as well.

inspect(subset(as_rules, rhs %in% "eggs"))
##      lhs                             rhs    support    confidence coverage  
## [1]  {herb & pepper}              => {eggs} 0.01253166 0.2533693  0.04946007
## [2]  {cooking oil}                => {eggs} 0.01173177 0.2297650  0.05105986
## [3]  {turkey}                     => {eggs} 0.01946407 0.3113006  0.06252500
## [4]  {chicken}                    => {eggs} 0.01439808 0.2400000  0.05999200
## [5]  {low fat yogurt}             => {eggs} 0.01679776 0.2195122  0.07652313
## [6]  {cake}                       => {eggs} 0.01906412 0.2351974  0.08105586
## [7]  {burgers}                    => {eggs} 0.02879616 0.3302752  0.08718837
## [8]  {pancakes}                   => {eggs} 0.02173044 0.2286115  0.09505399
## [9]  {frozen vegetables}          => {eggs} 0.02173044 0.2279720  0.09532062
## [10] {ground beef}                => {eggs} 0.01999733 0.2035278  0.09825357
## [11] {milk}                       => {eggs} 0.03079589 0.2376543  0.12958272
## [12] {french fries}               => {eggs} 0.03639515 0.2129485  0.17091055
## [13] {chocolate}                  => {eggs} 0.03319557 0.2026037  0.16384482
## [14] {spaghetti}                  => {eggs} 0.03652846 0.2098009  0.17411012
## [15] {mineral water}              => {eggs} 0.05092654 0.2136465  0.23836822
## [16] {ground beef, mineral water} => {eggs} 0.01013198 0.2475570  0.04092788
## [17] {milk, mineral water}        => {eggs} 0.01306492 0.2722222  0.04799360
## [18] {chocolate, spaghetti}       => {eggs} 0.01053193 0.2687075  0.03919477
## [19] {chocolate, mineral water}   => {eggs} 0.01346487 0.2556962  0.05265965
## [20] {mineral water, spaghetti}   => {eggs} 0.01426476 0.2388393  0.05972537
##      lift     count
## [1]  1.409883  94  
## [2]  1.278537  88  
## [3]  1.732245 146  
## [4]  1.335490 108  
## [5]  1.221484 126  
## [6]  1.308765 143  
## [7]  1.837830 216  
## [8]  1.272118 163  
## [9]  1.268559 163  
## [10] 1.132539 150  
## [11] 1.322437 231  
## [12] 1.184961 273  
## [13] 1.127397 249  
## [14] 1.167446 274  
## [15] 1.188845 382  
## [16] 1.377541  76  
## [17] 1.514791  98  
## [18] 1.495234  79  
## [19] 1.422832 101  
## [20] 1.329031 107
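Because all of these rules share the same consequent, their lift is just the confidence divided by the overall frequency of eggs, so ranking by lift and ranking by confidence give the same order. A small base R sketch with four rules copied from the table above (the data frame and its column names are my own):

```r
# Baseline frequency of eggs: 1348 of 7501 transactions.
support_eggs <- 1348 / 7501

# A few rules with eggs as the consequent, confidence copied from the table.
rules <- data.frame(
  lhs = c("burgers", "turkey", "milk, mineral water", "chocolate"),
  confidence = c(0.3302752, 0.3113006, 0.2722222, 0.2026037)
)

# With a fixed consequent, lift = confidence / support(consequent).
rules$lift <- rules$confidence / support_eggs

# Strongest associations first.
rules <- rules[order(rules$lift, decreasing = TRUE), ]
print(rules)
```

Among these four, burgers is the antecedent most strongly associated with eggs, while chocolate is barely above the baseline.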

The plots below show the rules as a graph linking the items, and the distribution of the rules across their quality measures.

plot(as_rules, method="graph")
## Warning: Too many rules supplied. Only plotting the best 100 using 'lift'
## (change control parameter max if needed).

plot(as_rules)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.

Eclat algorithm

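Next, the Eclat algorithm is applied to the same transactions. Eclat mines frequent itemsets (here with support of at least 0.01 and at least 2 items); rules are then derived from those itemsets with ruleInduction() at a minimum confidence of 0.2.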

ec_freq <- eclat(transactions, parameter=list(supp=0.01, minlen=2))
## Eclat
## 
## parameter specification:
##  tidLists support minlen maxlen            target  ext
##     FALSE    0.01      2     10 frequent itemsets TRUE
## 
## algorithmic control:
##  sparse sort verbose
##       7   -2    TRUE
## 
## Absolute minimum support count: 75 
## 
## create itemset ... 
## set transactions ...[119 item(s), 7501 transaction(s)] done [0.00s].
## sorting and recoding items ... [75 item(s)] done [0.00s].
## creating sparse bit matrix ... [75 row(s), 7501 column(s)] done [0.00s].
## writing  ... [182 set(s)] done [0.01s].
## Creating S4 object  ... done [0.00s].
inspect(ec_freq[1:10])
##      items                        support    count
## [1]  {cereals, mineral water}     0.01026530  77  
## [2]  {mineral water, red wine}    0.01093188  82  
## [3]  {red wine, spaghetti}        0.01026530  77  
## [4]  {champagne, chocolate}       0.01159845  87  
## [5]  {avocado, mineral water}     0.01159845  87  
## [6]  {cookies, eggs}              0.01053193  79  
## [7]  {chocolate, cookies}         0.01039861  78  
## [8]  {cookies, french fries}      0.01333156 100  
## [9]  {cookies, green tea}         0.01199840  90  
## [10] {fresh bread, mineral water} 0.01333156 100
ec_rules <- ruleInduction(ec_freq, transactions, confidence = 0.2)
inspect(ec_rules[1:10])
##      lhs              rhs             support    confidence lift     itemset
## [1]  {cereals}     => {mineral water} 0.01026530 0.3989637  1.673729  1     
## [2]  {red wine}    => {mineral water} 0.01093188 0.3886256  1.630358  2     
## [3]  {red wine}    => {spaghetti}     0.01026530 0.3649289  2.095966  3     
## [4]  {champagne}   => {chocolate}     0.01159845 0.2478632  1.512793  4     
## [5]  {avocado}     => {mineral water} 0.01159845 0.3480000  1.459926  5     
## [6]  {fresh bread} => {mineral water} 0.01333156 0.3095975  1.298820 10     
## [7]  {salmon}      => {mineral water} 0.01706439 0.4012539  1.683336 11     
## [8]  {salmon}      => {spaghetti}     0.01346487 0.3166144  1.818472 12     
## [9]  {salmon}      => {chocolate}     0.01066524 0.2507837  1.530617 13     
## [10] {honey}       => {mineral water} 0.01506466 0.3174157  1.331619 14
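The first itemset, {cereals, mineral water}, produces only one rule. The sketch below illustrates why, under the simplifying assumption that rule induction tries each item of the set as the consequent and keeps the candidates that reach the confidence threshold. It is plain arithmetic on counts copied from the output, not the arules implementation.

```r
count_itemset <- 77    # transactions containing {cereals, mineral water}
count_cereals <- 193   # from itemFrequency()
count_water <- 1788    # from itemFrequency()
min_confidence <- 0.2

# Each direction divides the same itemset count by a different antecedent count.
confidence <- c(
  "{cereals} => {mineral water}" = count_itemset / count_cereals,
  "{mineral water} => {cereals}" = count_itemset / count_water
)

# Only candidates reaching the confidence threshold are kept as rules.
kept <- confidence >= min_confidence
print(round(confidence, 7))
print(kept)
```

The direction starting from the rarer item passes the threshold, while the direction starting from mineral water does not, because only a small share of mineral water baskets contain cereals.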

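Rule induction on the Eclat itemsets also yields 163 rules, the same number as Apriori. The 10 top-ranked rules by support, confidence and lift are shown below. The last table sorts by the itemset column, which is only the index of the frequent itemset each rule was derived from, not a quality measure.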

inspect(sort(ec_rules, by = "support")[1:10])
##      lhs                rhs             support    confidence lift     itemset
## [1]  {spaghetti}     => {mineral water} 0.05972537 0.3430322  1.439085 182    
## [2]  {mineral water} => {spaghetti}     0.05972537 0.2505593  1.439085 182    
## [3]  {mineral water} => {chocolate}     0.05265965 0.2209172  1.348332 176    
## [4]  {chocolate}     => {mineral water} 0.05265965 0.3213995  1.348332 176    
## [5]  {mineral water} => {eggs}          0.05092654 0.2136465  1.188845 180    
## [6]  {eggs}          => {mineral water} 0.05092654 0.2833828  1.188845 180    
## [7]  {mineral water} => {milk}          0.04799360 0.2013423  1.553774 163    
## [8]  {milk}          => {mineral water} 0.04799360 0.3703704  1.553774 163    
## [9]  {ground beef}   => {mineral water} 0.04092788 0.4165536  1.747522 146    
## [10] {spaghetti}     => {ground beef}   0.03919477 0.2251149  2.291162 147
inspect(sort(ec_rules, by = "confidence")[1:10])
##      lhs                          rhs             support    confidence
## [1]  {eggs, ground beef}       => {mineral water} 0.01013198 0.5066667 
## [2]  {ground beef, milk}       => {mineral water} 0.01106519 0.5030303 
## [3]  {chocolate, ground beef}  => {mineral water} 0.01093188 0.4739884 
## [4]  {frozen vegetables, milk} => {mineral water} 0.01106519 0.4689266 
## [5]  {soup}                    => {mineral water} 0.02306359 0.4564644 
## [6]  {pancakes, spaghetti}     => {mineral water} 0.01146514 0.4550265 
## [7]  {olive oil, spaghetti}    => {mineral water} 0.01026530 0.4476744 
## [8]  {milk, spaghetti}         => {mineral water} 0.01573124 0.4436090 
## [9]  {chocolate, milk}         => {mineral water} 0.01399813 0.4356846 
## [10] {ground beef, spaghetti}  => {mineral water} 0.01706439 0.4353741 
##      lift     itemset
## [1]  2.125563 144    
## [2]  2.110308 142    
## [3]  1.988472 143    
## [4]  1.967236 132    
## [5]  1.914955  25    
## [6]  1.908923 122    
## [7]  1.878079  84    
## [8]  1.861024 162    
## [9]  1.827780 159    
## [10] 1.826477 145
inspect(sort(ec_rules, by = "lift")[1:10])
##      lhs                                   rhs                 support   
## [1]  {herb & pepper}                    => {ground beef}       0.01599787
## [2]  {mineral water, spaghetti}         => {ground beef}       0.01706439
## [3]  {tomatoes}                         => {frozen vegetables} 0.01613118
## [4]  {shrimp}                           => {frozen vegetables} 0.01666444
## [5]  {milk, mineral water}              => {frozen vegetables} 0.01106519
## [6]  {ground beef, mineral water}       => {spaghetti}         0.01706439
## [7]  {frozen vegetables, mineral water} => {milk}              0.01106519
## [8]  {milk, mineral water}              => {ground beef}       0.01106519
## [9]  {soup}                             => {milk}              0.01519797
## [10] {spaghetti}                        => {ground beef}       0.03919477
##      confidence lift     itemset
## [1]  0.3234501  3.291994  19    
## [2]  0.2857143  2.907928 145    
## [3]  0.2358674  2.474464  82    
## [4]  0.2332090  2.446574 100    
## [5]  0.2305556  2.418737 132    
## [6]  0.4169381  2.394681 145    
## [7]  0.3097015  2.389991 132    
## [8]  0.2305556  2.346536 142    
## [9]  0.3007916  2.321232  28    
## [10] 0.2251149  2.291162 147
inspect(sort(ec_rules, by = "itemset")[1:10])
##      lhs                           rhs             support    confidence
## [1]  {spaghetti}                => {mineral water} 0.05972537 0.3430322 
## [2]  {mineral water}            => {spaghetti}     0.05972537 0.2505593 
## [3]  {spaghetti}                => {eggs}          0.03652846 0.2098009 
## [4]  {eggs}                     => {spaghetti}     0.03652846 0.2032641 
## [5]  {mineral water}            => {eggs}          0.05092654 0.2136465 
## [6]  {eggs}                     => {mineral water} 0.05092654 0.2833828 
## [7]  {mineral water, spaghetti} => {eggs}          0.01426476 0.2388393 
## [8]  {eggs, spaghetti}          => {mineral water} 0.01426476 0.3905109 
## [9]  {eggs, mineral water}      => {spaghetti}     0.01426476 0.2801047 
## [10] {chocolate}                => {eggs}          0.03319557 0.2026037 
##      lift     itemset
## [1]  1.439085 182    
## [2]  1.439085 182    
## [3]  1.167446 181    
## [4]  1.167446 181    
## [5]  1.188845 180    
## [6]  1.188845 180    
## [7]  1.329031 179    
## [8]  1.638268 179    
## [9]  1.608779 179    
## [10] 1.127397 178
plot(ec_rules, method="graph", col = "purple")
## Warning: Too many rules supplied. Only plotting the best 100 using 'lift'
## (change control parameter max if needed).

plot(ec_rules, col = "purple")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
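The top-10 tables by lift from Apriori and from Eclat agree except in row 10, where one lists {ground beef} => {spaghetti} and the other {spaghetti} => {ground beef}. The sketch below, using counts copied from the outputs above and my own variable names, shows that this is a tie rather than a disagreement: lift is symmetric in the two items, while confidence is not.

```r
n <- 7501
count_beef <- 737        # ground beef, from itemFrequency()
count_spaghetti <- 1306  # spaghetti, from itemFrequency()
count_both <- 294        # count of either rule

# Confidence depends on which item is the antecedent.
conf_beef_to_spaghetti <- count_both / count_beef
conf_spaghetti_to_beef <- count_both / count_spaghetti

# Lift divides by the consequent's frequency, which cancels the asymmetry.
lift_beef_to_spaghetti <- conf_beef_to_spaghetti / (count_spaghetti / n)
lift_spaghetti_to_beef <- conf_spaghetti_to_beef / (count_beef / n)

print(round(c(conf_beef_to_spaghetti, conf_spaghetti_to_beef), 7))
print(round(c(lift_beef_to_spaghetti, lift_spaghetti_to_beef), 6))
```

Both directions have the same lift, so which of the two lands in tenth place depends only on how the sort breaks ties.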

Conclusion

The results of this analysis could be helpful when placing products in the different sections of the shop. Looking at the patterns in past transactions can help to understand which products should be put close to each other to increase the chance that they are bought together, even when only one of them was initially on the customer’s shopping list.

Data source: https://www.kaggle.com/datasets/d4rklucif3r/market-basket-optimisation