Adrianna Łazuga
418397
The aim of this project is to analyze supermarket transactions. Using association rules, I will try to discover frequent patterns among the products that customers buy together. This analysis could help in planning future product placement in the store to maximize profit.
library(arules)
## Warning: package 'arules' was built under R version 4.4.2
## Loading required package: Matrix
##
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
##
## abbreviate, write
library(arulesViz)
## Warning: package 'arulesViz' was built under R version 4.4.2
transactions <- read.transactions("Market_Basket_Optimisation.csv", header = FALSE, sep = ",")
## Warning in asMethod(object): removing duplicated items in transactions
summary(transactions)
## transactions as itemMatrix in sparse format with
## 7501 rows (elements/itemsets/transactions) and
## 119 columns (items) and a density of 0.03288973
##
## most frequent items:
## mineral water          eggs     spaghetti  french fries     chocolate       (Other)
##          1788          1348          1306          1282          1229         22405
##
## element (itemset/transaction) length distribution:
## sizes
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   18   19   20
## 1754 1358 1044  816  667  493  391  324  259  139  102   67   40   22   17    4    1    2    1
##
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   1.000   2.000   3.000   3.914   5.000  20.000
##
## includes extended item information - examples:
## labels
## 1 almonds
## 2 antioxydant juice
## 3 asparagus
The dataset contains 7501 transactions (rows) and 119 distinct items (columns). The most frequently bought items are mineral water, eggs, spaghetti, french fries and chocolate. The bar chart below shows the 10 most frequent items appearing in transactions.
itemFrequencyPlot(transactions, topN = 10, col = "lightblue")
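The density and mean basket size reported by `summary(transactions)` can be checked by hand from the other figures in the same output: density is the share of filled cells in the 7501 × 119 item matrix. A minimal sketch in base R, with the counts copied from the summary above (the variable names are illustrative):

```r
# Counts copied from the summary(transactions) output
n_transactions <- 7501
n_items <- 119

# Top-5 item counts plus the "(Other)" bucket give the total number of item occurrences
total_occurrences <- sum(c(1788, 1348, 1306, 1282, 1229, 22405))

# Density = filled cells / all cells of the transactions x items matrix
density <- total_occurrences / (n_transactions * n_items)

# Mean basket size = item occurrences per transaction
mean_size <- total_occurrences / n_transactions

print(round(density, 8))
print(round(mean_size, 3))
```

Both values agree with the summary (density 0.03288973, mean size 3.914), which confirms that an average basket holds only about 3% of the available items.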
The first 10 transactions, showing the purchased items:
inspect(transactions[1:10])
## items
## [1] {almonds,
## antioxydant juice,
## avocado,
## cottage cheese,
## energy drink,
## frozen smoothie,
## green grapes,
## green tea,
## honey,
## low fat yogurt,
## mineral water,
## olive oil,
## salad,
## salmon,
## shrimp,
## spinach,
## tomato juice,
## vegetables mix,
## whole weat flour,
## yams}
## [2] {burgers,
## eggs,
## meatballs}
## [3] {chutney}
## [4] {avocado,
## turkey}
## [5] {energy bar,
## green tea,
## milk,
## mineral water,
## whole wheat rice}
## [6] {low fat yogurt}
## [7] {french fries,
## whole wheat pasta}
## [8] {light cream,
## shallot,
## soup}
## [9] {frozen vegetables,
## green tea,
## spaghetti}
## [10] {french fries}
Here we can see how many transactions each item appears in, followed by the spread of products across 20 randomly sampled transactions.
itemFrequency(transactions, type="absolute")
## almonds antioxydant juice asparagus
## 153 67 36
## avocado babies food bacon
## 250 34 65
## barbecue sauce black tea blueberries
## 81 107 69
## body spray bramble brownies
## 86 14 253
## bug spray burger sauce burgers
## 65 44 654
## butter cake candy bars
## 226 608 73
## carrots cauliflower cereals
## 115 36 193
## champagne chicken chili
## 351 450 46
## chocolate chocolate bread chutney
## 1229 32 31
## cider clothes accessories cookies
## 79 63 603
## cooking oil corn cottage cheese
## 383 36 239
## cream dessert wine eggplant
## 7 33 99
## eggs energy bar energy drink
## 1348 203 200
## escalope extra dark chocolate flax seed
## 595 90 68
## french fries french wine fresh bread
## 1282 169 323
## fresh tuna fromage blanc frozen smoothie
## 167 102 475
## frozen vegetables gluten free bar grated cheese
## 715 52 393
## green beans green grapes green tea
## 65 68 991
## ground beef gums ham
## 737 101 199
## hand protein bar herb & pepper honey
## 39 371 356
## hot dogs ketchup light cream
## 243 33 117
## light mayo low fat yogurt magazines
## 204 574 82
## mashed potato mayonnaise meatballs
## 31 46 157
## melons milk mineral water
## 90 972 1788
## mint mint green tea muffins
## 131 42 181
## mushroom cream sauce napkins nonfat milk
## 143 5 78
## oatmeal oil olive oil
## 33 173 494
## pancakes parmesan cheese pasta
## 713 149 118
## pepper pet food pickles
## 199 49 45
## protein bar red wine rice
## 139 211 141
## salad salmon salt
## 37 319 69
## sandwich shallot shampoo
## 34 58 37
## shrimp soda soup
## 536 47 379
## spaghetti sparkling water spinach
## 1306 47 53
## strawberries strong cheese tea
## 160 58 29
## tomato juice tomato sauce tomatoes
## 228 106 513
## toothpaste turkey vegetables mix
## 61 469 193
## water spray white wine whole weat flour
## 3 124 70
## whole wheat pasta whole wheat rice yams
## 221 439 86
## yogurt cake zucchini
## 205 71
image(sample(transactions, 20))
The aim of applying the Apriori algorithm is to find association rules between the items in the transactions. The parameters used for the algorithm are minimum support = 0.01, minimum confidence = 0.2 and minimum rule length = 2.
as_rules <- apriori(transactions, parameter = list(support = 0.01, confidence = 0.2, minlen = 2))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.2 0.1 1 none FALSE TRUE 5 0.01 2
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 75
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[119 item(s), 7501 transaction(s)] done [0.00s].
## sorting and recoding items ... [75 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.01s].
## writing ... [163 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
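The log line `sorting and recoding items ... [75 item(s)]` indicates that only 75 of the 119 items are frequent enough to take part in any rule. A small sketch of that filter in base R, using a handful of counts copied from the `itemFrequency` table above (the selection of items is arbitrary and the variable names are illustrative):

```r
n_transactions <- 7501
min_support <- 0.01

# A few absolute counts copied from the itemFrequency() output
counts <- c("mineral water" = 1788, "cider" = 79, "nonfat milk" = 78,
            "candy bars" = 73, "zucchini" = 71, "water spray" = 3)

# Relative support = share of transactions containing the item
support <- counts / n_transactions

# Items below the threshold cannot appear in any rule
frequent <- names(support)[support >= min_support]

print(round(support, 4))
print(frequent)
```

In this selection, only mineral water, cider and nonfat milk reach a support of 0.01; candy bars, with 73 occurrences, falls just short.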
Using those parameters, the algorithm found 163 rules consisting of 2 or 3 items (116 rules with two items and 47 with three).
summary(as_rules)
## set of 163 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3
## 116 47
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 2.000 2.288 3.000 3.000
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.01013 Min. :0.2000 Min. :0.02000 Min. :0.9025
## 1st Qu.:0.01280 1st Qu.:0.2283 1st Qu.:0.04773 1st Qu.:1.3512
## Median :0.01640 Median :0.2656 Median :0.05999 Median :1.5718
## Mean :0.02010 Mean :0.2881 Mean :0.07545 Mean :1.6372
## 3rd Qu.:0.02300 3rd Qu.:0.3318 3rd Qu.:0.09112 3rd Qu.:1.8494
## Max. :0.05973 Max. :0.5067 Max. :0.23837 Max. :3.2920
## count
## Min. : 76.0
## 1st Qu.: 96.0
## Median :123.0
## Mean :150.8
## 3rd Qu.:172.5
## Max. :448.0
##
## mining info:
## data ntransactions support confidence
## transactions 7501 0.01 0.2
## call
## apriori(data = transactions, parameter = list(support = 0.01, confidence = 0.2, minlen = 2))
inspect(as_rules[1:10])
## lhs rhs support confidence coverage lift
## [1] {cereals} => {mineral water} 0.01026530 0.3989637 0.02572990 1.673729
## [2] {red wine} => {spaghetti} 0.01026530 0.3649289 0.02812958 2.095966
## [3] {red wine} => {mineral water} 0.01093188 0.3886256 0.02812958 1.630358
## [4] {champagne} => {chocolate} 0.01159845 0.2478632 0.04679376 1.512793
## [5] {avocado} => {mineral water} 0.01159845 0.3480000 0.03332889 1.459926
## [6] {fresh bread} => {mineral water} 0.01333156 0.3095975 0.04306093 1.298820
## [7] {salmon} => {chocolate} 0.01066524 0.2507837 0.04252766 1.530617
## [8] {salmon} => {spaghetti} 0.01346487 0.3166144 0.04252766 1.818472
## [9] {salmon} => {mineral water} 0.01706439 0.4012539 0.04252766 1.683336
## [10] {honey} => {spaghetti} 0.01186508 0.2500000 0.04746034 1.435873
## count
## [1] 77
## [2] 77
## [3] 82
## [4] 87
## [5] 87
## [6] 100
## [7] 80
## [8] 101
## [9] 128
## [10] 89
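The 10 top-ranked rules for each quality measure in turn: support, confidence, coverage, lift and count. Because count is simply support multiplied by the number of transactions, sorting by count gives the same ordering as sorting by support.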
inspect(sort(as_rules, by = "support")[1:10])
## lhs rhs support confidence coverage
## [1] {spaghetti} => {mineral water} 0.05972537 0.3430322 0.17411012
## [2] {mineral water} => {spaghetti} 0.05972537 0.2505593 0.23836822
## [3] {chocolate} => {mineral water} 0.05265965 0.3213995 0.16384482
## [4] {mineral water} => {chocolate} 0.05265965 0.2209172 0.23836822
## [5] {eggs} => {mineral water} 0.05092654 0.2833828 0.17970937
## [6] {mineral water} => {eggs} 0.05092654 0.2136465 0.23836822
## [7] {milk} => {mineral water} 0.04799360 0.3703704 0.12958272
## [8] {mineral water} => {milk} 0.04799360 0.2013423 0.23836822
## [9] {ground beef} => {mineral water} 0.04092788 0.4165536 0.09825357
## [10] {ground beef} => {spaghetti} 0.03919477 0.3989145 0.09825357
## lift count
## [1] 1.439085 448
## [2] 1.439085 448
## [3] 1.348332 395
## [4] 1.348332 395
## [5] 1.188845 382
## [6] 1.188845 382
## [7] 1.553774 360
## [8] 1.553774 360
## [9] 1.747522 307
## [10] 2.291162 294
inspect(sort(as_rules, by = "confidence")[1:10])
## lhs rhs support confidence
## [1] {eggs, ground beef} => {mineral water} 0.01013198 0.5066667
## [2] {ground beef, milk} => {mineral water} 0.01106519 0.5030303
## [3] {chocolate, ground beef} => {mineral water} 0.01093188 0.4739884
## [4] {frozen vegetables, milk} => {mineral water} 0.01106519 0.4689266
## [5] {soup} => {mineral water} 0.02306359 0.4564644
## [6] {pancakes, spaghetti} => {mineral water} 0.01146514 0.4550265
## [7] {olive oil, spaghetti} => {mineral water} 0.01026530 0.4476744
## [8] {milk, spaghetti} => {mineral water} 0.01573124 0.4436090
## [9] {chocolate, milk} => {mineral water} 0.01399813 0.4356846
## [10] {ground beef, spaghetti} => {mineral water} 0.01706439 0.4353741
## coverage lift count
## [1] 0.01999733 2.125563 76
## [2] 0.02199707 2.110308 83
## [3] 0.02306359 1.988472 82
## [4] 0.02359685 1.967236 83
## [5] 0.05052660 1.914955 173
## [6] 0.02519664 1.908923 86
## [7] 0.02293028 1.878079 77
## [8] 0.03546194 1.861024 118
## [9] 0.03212905 1.827780 105
## [10] 0.03919477 1.826477 128
inspect(sort(as_rules, by = "coverage")[1:10])
## lhs rhs support confidence coverage
## [1] {mineral water} => {milk} 0.04799360 0.2013423 0.2383682
## [2] {mineral water} => {chocolate} 0.05265965 0.2209172 0.2383682
## [3] {mineral water} => {eggs} 0.05092654 0.2136465 0.2383682
## [4] {mineral water} => {spaghetti} 0.05972537 0.2505593 0.2383682
## [5] {eggs} => {french fries} 0.03639515 0.2025223 0.1797094
## [6] {eggs} => {spaghetti} 0.03652846 0.2032641 0.1797094
## [7] {eggs} => {mineral water} 0.05092654 0.2833828 0.1797094
## [8] {spaghetti} => {ground beef} 0.03919477 0.2251149 0.1741101
## [9] {spaghetti} => {milk} 0.03546194 0.2036753 0.1741101
## [10] {spaghetti} => {chocolate} 0.03919477 0.2251149 0.1741101
## lift count
## [1] 1.553774 360
## [2] 1.348332 395
## [3] 1.188845 382
## [4] 1.439085 448
## [5] 1.184961 273
## [6] 1.167446 274
## [7] 1.188845 382
## [8] 2.291162 294
## [9] 1.571779 266
## [10] 1.373952 294
inspect(sort(as_rules, by = "lift")[1:10])
## lhs rhs support
## [1] {herb & pepper} => {ground beef} 0.01599787
## [2] {mineral water, spaghetti} => {ground beef} 0.01706439
## [3] {tomatoes} => {frozen vegetables} 0.01613118
## [4] {shrimp} => {frozen vegetables} 0.01666444
## [5] {milk, mineral water} => {frozen vegetables} 0.01106519
## [6] {ground beef, mineral water} => {spaghetti} 0.01706439
## [7] {frozen vegetables, mineral water} => {milk} 0.01106519
## [8] {milk, mineral water} => {ground beef} 0.01106519
## [9] {soup} => {milk} 0.01519797
## [10] {ground beef} => {spaghetti} 0.03919477
## confidence coverage lift count
## [1] 0.3234501 0.04946007 3.291994 120
## [2] 0.2857143 0.05972537 2.907928 128
## [3] 0.2358674 0.06839088 2.474464 121
## [4] 0.2332090 0.07145714 2.446574 125
## [5] 0.2305556 0.04799360 2.418737 83
## [6] 0.4169381 0.04092788 2.394681 128
## [7] 0.3097015 0.03572857 2.389991 83
## [8] 0.2305556 0.04799360 2.346536 83
## [9] 0.3007916 0.05052660 2.321232 114
## [10] 0.3989145 0.09825357 2.291162 294
inspect(sort(as_rules, by = "count")[1:10])
## lhs rhs support confidence coverage
## [1] {spaghetti} => {mineral water} 0.05972537 0.3430322 0.17411012
## [2] {mineral water} => {spaghetti} 0.05972537 0.2505593 0.23836822
## [3] {chocolate} => {mineral water} 0.05265965 0.3213995 0.16384482
## [4] {mineral water} => {chocolate} 0.05265965 0.2209172 0.23836822
## [5] {eggs} => {mineral water} 0.05092654 0.2833828 0.17970937
## [6] {mineral water} => {eggs} 0.05092654 0.2136465 0.23836822
## [7] {milk} => {mineral water} 0.04799360 0.3703704 0.12958272
## [8] {mineral water} => {milk} 0.04799360 0.2013423 0.23836822
## [9] {ground beef} => {mineral water} 0.04092788 0.4165536 0.09825357
## [10] {ground beef} => {spaghetti} 0.03919477 0.3989145 0.09825357
## lift count
## [1] 1.439085 448
## [2] 1.439085 448
## [3] 1.348332 395
## [4] 1.348332 395
## [5] 1.188845 382
## [6] 1.188845 382
## [7] 1.553774 360
## [8] 1.553774 360
## [9] 1.747522 307
## [10] 2.291162 294
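The quality measures in these tables can be reproduced by hand from three counts. As a worked example, take the top rule by support, {spaghetti} => {mineral water}: the rule count (448) comes from the table above and the item counts from the `itemFrequency` output. A minimal sketch in base R (variable names are illustrative):

```r
n <- 7501       # number of transactions
n_lhs <- 1306   # transactions containing spaghetti
n_rhs <- 1788   # transactions containing mineral water
n_both <- 448   # transactions containing both (the rule's count)

support <- n_both / n             # share of all transactions with both items
coverage <- n_lhs / n             # support of the left-hand side alone
confidence <- n_both / n_lhs      # share of spaghetti baskets that also hold mineral water
lift <- confidence / (n_rhs / n)  # confidence relative to the baseline frequency of the rhs

print(round(c(support = support, coverage = coverage,
              confidence = confidence, lift = lift), 6))
```

A lift above 1 means the two items appear together more often than would be expected if they were bought independently. Swapping the two sides of a rule changes confidence and coverage but leaves support and lift unchanged, which is why rules [1] and [2] share the same lift.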
As an example, below are the rules with “frozen vegetables” in the antecedent, i.e. on the left-hand side of the rule. They show which other items people tend to buy when they buy frozen vegetables.
inspect(subset(as_rules, lhs %in% "frozen vegetables"))
## lhs rhs support
## [1] {frozen vegetables} => {milk} 0.02359685
## [2] {frozen vegetables} => {french fries} 0.01906412
## [3] {frozen vegetables} => {chocolate} 0.02293028
## [4] {frozen vegetables} => {eggs} 0.02173044
## [5] {frozen vegetables} => {spaghetti} 0.02786295
## [6] {frozen vegetables} => {mineral water} 0.03572857
## [7] {frozen vegetables, milk} => {mineral water} 0.01106519
## [8] {frozen vegetables, mineral water} => {milk} 0.01106519
## [9] {frozen vegetables, spaghetti} => {mineral water} 0.01199840
## [10] {frozen vegetables, mineral water} => {spaghetti} 0.01199840
## confidence coverage lift count
## [1] 0.2475524 0.09532062 1.910382 177
## [2] 0.2000000 0.09532062 1.170203 143
## [3] 0.2405594 0.09532062 1.468215 172
## [4] 0.2279720 0.09532062 1.268559 163
## [5] 0.2923077 0.09532062 1.678867 209
## [6] 0.3748252 0.09532062 1.572463 268
## [7] 0.4689266 0.02359685 1.967236 83
## [8] 0.3097015 0.03572857 2.389991 83
## [9] 0.4306220 0.02786295 1.806541 90
## [10] 0.3358209 0.03572857 1.928784 90
Here are the rules with “eggs” as the consequent, i.e. on the right-hand side of the rule. They show which products already in the basket tend to be accompanied by eggs.
inspect(subset(as_rules, rhs %in% "eggs"))
## lhs rhs support confidence coverage
## [1] {herb & pepper} => {eggs} 0.01253166 0.2533693 0.04946007
## [2] {cooking oil} => {eggs} 0.01173177 0.2297650 0.05105986
## [3] {turkey} => {eggs} 0.01946407 0.3113006 0.06252500
## [4] {chicken} => {eggs} 0.01439808 0.2400000 0.05999200
## [5] {low fat yogurt} => {eggs} 0.01679776 0.2195122 0.07652313
## [6] {cake} => {eggs} 0.01906412 0.2351974 0.08105586
## [7] {burgers} => {eggs} 0.02879616 0.3302752 0.08718837
## [8] {pancakes} => {eggs} 0.02173044 0.2286115 0.09505399
## [9] {frozen vegetables} => {eggs} 0.02173044 0.2279720 0.09532062
## [10] {ground beef} => {eggs} 0.01999733 0.2035278 0.09825357
## [11] {milk} => {eggs} 0.03079589 0.2376543 0.12958272
## [12] {french fries} => {eggs} 0.03639515 0.2129485 0.17091055
## [13] {chocolate} => {eggs} 0.03319557 0.2026037 0.16384482
## [14] {spaghetti} => {eggs} 0.03652846 0.2098009 0.17411012
## [15] {mineral water} => {eggs} 0.05092654 0.2136465 0.23836822
## [16] {ground beef, mineral water} => {eggs} 0.01013198 0.2475570 0.04092788
## [17] {milk, mineral water} => {eggs} 0.01306492 0.2722222 0.04799360
## [18] {chocolate, spaghetti} => {eggs} 0.01053193 0.2687075 0.03919477
## [19] {chocolate, mineral water} => {eggs} 0.01346487 0.2556962 0.05265965
## [20] {mineral water, spaghetti} => {eggs} 0.01426476 0.2388393 0.05972537
## lift count
## [1] 1.409883 94
## [2] 1.278537 88
## [3] 1.732245 146
## [4] 1.335490 108
## [5] 1.221484 126
## [6] 1.308765 143
## [7] 1.837830 216
## [8] 1.272118 163
## [9] 1.268559 163
## [10] 1.132539 150
## [11] 1.322437 231
## [12] 1.184961 273
## [13] 1.127397 249
## [14] 1.167446 274
## [15] 1.188845 382
## [16] 1.377541 76
## [17] 1.514791 98
## [18] 1.495234 79
## [19] 1.422832 101
## [20] 1.329031 107
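The plots below visualize the rules. The first is a graph of the 100 rules with the highest lift, showing which items are linked by rules; the second is a scatter plot showing how all rules are distributed across support, confidence and lift.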
plot(as_rules, method="graph")
## Warning: Too many rules supplied. Only plotting the best 100 using 'lift'
## (change control parameter max if needed).
plot(as_rules)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
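Next, the Eclat algorithm is applied to the same transactions, with the same minimum support (0.01) and minimum itemset length (2). Eclat mines frequent itemsets rather than rules, so rules are induced from the itemsets afterwards with a minimum confidence of 0.2.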
ec_freq <- eclat(transactions, parameter=list(supp=0.01, minlen=2))
## Eclat
##
## parameter specification:
## tidLists support minlen maxlen target ext
## FALSE 0.01 2 10 frequent itemsets TRUE
##
## algorithmic control:
## sparse sort verbose
## 7 -2 TRUE
##
## Absolute minimum support count: 75
##
## create itemset ...
## set transactions ...[119 item(s), 7501 transaction(s)] done [0.00s].
## sorting and recoding items ... [75 item(s)] done [0.00s].
## creating sparse bit matrix ... [75 row(s), 7501 column(s)] done [0.00s].
## writing ... [182 set(s)] done [0.01s].
## Creating S4 object ... done [0.00s].
inspect(ec_freq[1:10])
## items support count
## [1] {cereals, mineral water} 0.01026530 77
## [2] {mineral water, red wine} 0.01093188 82
## [3] {red wine, spaghetti} 0.01026530 77
## [4] {champagne, chocolate} 0.01159845 87
## [5] {avocado, mineral water} 0.01159845 87
## [6] {cookies, eggs} 0.01053193 79
## [7] {chocolate, cookies} 0.01039861 78
## [8] {cookies, french fries} 0.01333156 100
## [9] {cookies, green tea} 0.01199840 90
## [10] {fresh bread, mineral water} 0.01333156 100
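Eclat is usually described as working on a vertical data layout: each item is stored with the list of IDs of the transactions that contain it, and the support of an itemset is obtained by intersecting those lists. The sketch below illustrates the idea in base R on six made-up baskets; the data and helper names are hypothetical, and this is not meant to mirror how `arules` implements the algorithm internally.

```r
# Six made-up baskets (hypothetical data, for illustration only)
baskets <- list(
  c("eggs", "milk", "mineral water"),
  c("eggs", "spaghetti"),
  c("milk", "mineral water", "spaghetti"),
  c("mineral water"),
  c("eggs", "milk", "mineral water", "spaghetti"),
  c("milk", "spaghetti")
)

# Vertical layout: for each item, the IDs of the transactions containing it
items <- sort(unique(unlist(baskets)))
tid_lists <- lapply(items, function(item) {
  which(vapply(baskets, function(b) item %in% b, logical(1)))
})
names(tid_lists) <- items

# Support of an itemset = size of the intersection of its items' tid-lists,
# divided by the number of transactions
itemset_support <- function(itemset) {
  tids <- Reduce(intersect, tid_lists[itemset])
  length(tids) / length(baskets)
}

print(tid_lists[["milk"]])
print(itemset_support(c("milk", "mineral water")))
```

Here milk appears in baskets 1, 3, 5 and 6, and mineral water in baskets 1, 3, 4 and 5, so the pair is supported by baskets 1, 3 and 5, i.e. half of the toy transactions.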
ec_rules <- ruleInduction(ec_freq, transactions, confidence = 0.2)
inspect(ec_rules[1:10])
## lhs rhs support confidence lift itemset
## [1] {cereals} => {mineral water} 0.01026530 0.3989637 1.673729 1
## [2] {red wine} => {mineral water} 0.01093188 0.3886256 1.630358 2
## [3] {red wine} => {spaghetti} 0.01026530 0.3649289 2.095966 3
## [4] {champagne} => {chocolate} 0.01159845 0.2478632 1.512793 4
## [5] {avocado} => {mineral water} 0.01159845 0.3480000 1.459926 5
## [6] {fresh bread} => {mineral water} 0.01333156 0.3095975 1.298820 10
## [7] {salmon} => {mineral water} 0.01706439 0.4012539 1.683336 11
## [8] {salmon} => {spaghetti} 0.01346487 0.3166144 1.818472 12
## [9] {salmon} => {chocolate} 0.01066524 0.2507837 1.530617 13
## [10] {honey} => {mineral water} 0.01506466 0.3174157 1.331619 14
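`ruleInduction()` derives candidate rules from each frequent itemset and keeps those that reach the confidence threshold. For the first itemset, {cereals, mineral water} with count 77, the two candidate directions can be checked by hand using the item counts from the `itemFrequency` output (a minimal sketch; variable names are illustrative):

```r
n_both <- 77             # count of {cereals, mineral water} from the Eclat output
n_cereals <- 193         # from the itemFrequency() output
n_mineral_water <- 1788  # from the itemFrequency() output
min_confidence <- 0.2

# Confidence of each direction = joint count / count of the left-hand side
conf <- c(
  "{cereals} => {mineral water}" = n_both / n_cereals,
  "{mineral water} => {cereals}" = n_both / n_mineral_water
)

print(round(conf, 7))

# Only directions that reach the threshold survive as rules
kept <- names(conf)[conf >= min_confidence]
print(kept)
```

Only the first direction reaches 0.2, which matches rule [1] in the output above. This is also why the number of rules need not equal the number of itemsets: an itemset may yield several rules, one rule, or none at all.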
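After rule induction, the Eclat approach also yields 163 rules from those transactions. The 10 top-ranked rules by support, confidence and lift are shown below, followed by the rules ordered by descending index of the frequent itemset they were derived from: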
inspect(sort(ec_rules, by = "support")[1:10])
## lhs rhs support confidence lift itemset
## [1] {spaghetti} => {mineral water} 0.05972537 0.3430322 1.439085 182
## [2] {mineral water} => {spaghetti} 0.05972537 0.2505593 1.439085 182
## [3] {mineral water} => {chocolate} 0.05265965 0.2209172 1.348332 176
## [4] {chocolate} => {mineral water} 0.05265965 0.3213995 1.348332 176
## [5] {mineral water} => {eggs} 0.05092654 0.2136465 1.188845 180
## [6] {eggs} => {mineral water} 0.05092654 0.2833828 1.188845 180
## [7] {mineral water} => {milk} 0.04799360 0.2013423 1.553774 163
## [8] {milk} => {mineral water} 0.04799360 0.3703704 1.553774 163
## [9] {ground beef} => {mineral water} 0.04092788 0.4165536 1.747522 146
## [10] {spaghetti} => {ground beef} 0.03919477 0.2251149 2.291162 147
inspect(sort(ec_rules, by = "confidence")[1:10])
## lhs rhs support confidence
## [1] {eggs, ground beef} => {mineral water} 0.01013198 0.5066667
## [2] {ground beef, milk} => {mineral water} 0.01106519 0.5030303
## [3] {chocolate, ground beef} => {mineral water} 0.01093188 0.4739884
## [4] {frozen vegetables, milk} => {mineral water} 0.01106519 0.4689266
## [5] {soup} => {mineral water} 0.02306359 0.4564644
## [6] {pancakes, spaghetti} => {mineral water} 0.01146514 0.4550265
## [7] {olive oil, spaghetti} => {mineral water} 0.01026530 0.4476744
## [8] {milk, spaghetti} => {mineral water} 0.01573124 0.4436090
## [9] {chocolate, milk} => {mineral water} 0.01399813 0.4356846
## [10] {ground beef, spaghetti} => {mineral water} 0.01706439 0.4353741
## lift itemset
## [1] 2.125563 144
## [2] 2.110308 142
## [3] 1.988472 143
## [4] 1.967236 132
## [5] 1.914955 25
## [6] 1.908923 122
## [7] 1.878079 84
## [8] 1.861024 162
## [9] 1.827780 159
## [10] 1.826477 145
inspect(sort(ec_rules, by = "lift")[1:10])
## lhs rhs support
## [1] {herb & pepper} => {ground beef} 0.01599787
## [2] {mineral water, spaghetti} => {ground beef} 0.01706439
## [3] {tomatoes} => {frozen vegetables} 0.01613118
## [4] {shrimp} => {frozen vegetables} 0.01666444
## [5] {milk, mineral water} => {frozen vegetables} 0.01106519
## [6] {ground beef, mineral water} => {spaghetti} 0.01706439
## [7] {frozen vegetables, mineral water} => {milk} 0.01106519
## [8] {milk, mineral water} => {ground beef} 0.01106519
## [9] {soup} => {milk} 0.01519797
## [10] {spaghetti} => {ground beef} 0.03919477
## confidence lift itemset
## [1] 0.3234501 3.291994 19
## [2] 0.2857143 2.907928 145
## [3] 0.2358674 2.474464 82
## [4] 0.2332090 2.446574 100
## [5] 0.2305556 2.418737 132
## [6] 0.4169381 2.394681 145
## [7] 0.3097015 2.389991 132
## [8] 0.2305556 2.346536 142
## [9] 0.3007916 2.321232 28
## [10] 0.2251149 2.291162 147
inspect(sort(ec_rules, by = "itemset")[1:10])
## lhs rhs support confidence
## [1] {spaghetti} => {mineral water} 0.05972537 0.3430322
## [2] {mineral water} => {spaghetti} 0.05972537 0.2505593
## [3] {spaghetti} => {eggs} 0.03652846 0.2098009
## [4] {eggs} => {spaghetti} 0.03652846 0.2032641
## [5] {mineral water} => {eggs} 0.05092654 0.2136465
## [6] {eggs} => {mineral water} 0.05092654 0.2833828
## [7] {mineral water, spaghetti} => {eggs} 0.01426476 0.2388393
## [8] {eggs, spaghetti} => {mineral water} 0.01426476 0.3905109
## [9] {eggs, mineral water} => {spaghetti} 0.01426476 0.2801047
## [10] {chocolate} => {eggs} 0.03319557 0.2026037
## lift itemset
## [1] 1.439085 182
## [2] 1.439085 182
## [3] 1.167446 181
## [4] 1.167446 181
## [5] 1.188845 180
## [6] 1.188845 180
## [7] 1.329031 179
## [8] 1.638268 179
## [9] 1.608779 179
## [10] 1.127397 178
plot(ec_rules, method="graph", col = "purple")
## Warning: Too many rules supplied. Only plotting the best 100 using 'lift'
## (change control parameter max if needed).
plot(ec_rules, col = "purple")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
The results of this analysis could help in deciding how to place products in the different sections of the shop. Patterns in past transactions show which products tend to be bought together; placing such products close to each other may increase the chance that both are bought, even if only one of them was originally on the customer’s shopping list.
Data source: https://www.kaggle.com/datasets/d4rklucif3r/market-basket-optimisation