Market Basket Analysis is a fundamental data mining technique for uncovering relationships between items that are frequently purchased together. This project applies Association Rule Mining to the ‘Groceries’ dataset that ships with the arules package for R, using the Apriori algorithm to identify meaningful associations between grocery items. The results of this analysis can help businesses optimize inventory management, improve product placement, and strengthen cross-selling strategies.
The Groceries dataset contains 9835 transactions from a grocery store, recorded over a span of 30 days. The purchased items were grouped into 169 categories to make the analysis easier.
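library(arules)     # transactions class, apriori(), crossTable()
library(arulesViz)  # plot methods for association rules
data('Groceries')
# Groceries is already a transactions object, so this coercion leaves it unchanged
transactions <- as(Groceries, "transactions")
# Pairwise support of items, with rows and columns sorted by item frequency
stab <- crossTable(transactions, measure = "support", sort = TRUE)
head(round(stab, 3))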
## whole milk other vegetables rolls/buns soda yogurt
## whole milk 0.256 0.075 0.057 0.040 0.056
## other vegetables 0.075 0.193 0.043 0.033 0.043
## rolls/buns 0.057 0.043 0.184 0.038 0.034
## soda 0.040 0.033 0.038 0.174 0.027
## yogurt 0.056 0.043 0.034 0.027 0.140
## bottled water 0.034 0.025 0.024 0.029 0.023
## bottled water root vegetables tropical fruit shopping bags
## whole milk 0.034 0.049 0.042 0.025
## other vegetables 0.025 0.047 0.036 0.023
## rolls/buns 0.024 0.024 0.025 0.020
## soda 0.029 0.019 0.021 0.025
## yogurt 0.023 0.026 0.029 0.015
## bottled water 0.111 0.016 0.019 0.011
## sausage pastry citrus fruit bottled beer newspapers
## whole milk 0.030 0.033 0.031 0.020 0.027
## other vegetables 0.027 0.023 0.029 0.016 0.019
## rolls/buns 0.031 0.021 0.017 0.014 0.020
## soda 0.024 0.021 0.013 0.017 0.015
## yogurt 0.020 0.018 0.022 0.009 0.015
## bottled water 0.012 0.009 0.014 0.016 0.011
## canned beer pip fruit fruit/vegetable juice whipped/sour cream
## whole milk 0.009 0.030 0.027 0.032
## other vegetables 0.009 0.026 0.021 0.029
## rolls/buns 0.011 0.014 0.015 0.015
## soda 0.014 0.013 0.018 0.012
## yogurt 0.005 0.018 0.019 0.021
## bottled water 0.008 0.011 0.014 0.009
## brown bread domestic eggs frankfurter margarine coffee pork
## whole milk 0.025 0.030 0.021 0.024 0.019 0.022
## other vegetables 0.019 0.022 0.016 0.020 0.013 0.022
## rolls/buns 0.013 0.016 0.019 0.015 0.011 0.011
## soda 0.013 0.012 0.011 0.010 0.010 0.012
## yogurt 0.015 0.014 0.011 0.014 0.010 0.010
## bottled water 0.008 0.009 0.007 0.010 0.007 0.007
## butter curd beef napkins chocolate frozen vegetables chicken
## whole milk 0.028 0.026 0.021 0.020 0.017 0.020 0.018
## other vegetables 0.020 0.017 0.020 0.014 0.013 0.018 0.018
## rolls/buns 0.013 0.010 0.014 0.012 0.012 0.010 0.010
## soda 0.009 0.008 0.008 0.012 0.014 0.009 0.008
## yogurt 0.015 0.017 0.012 0.012 0.009 0.012 0.008
## bottled water 0.009 0.006 0.006 0.009 0.006 0.006 0.005
## white bread cream cheese waffles salty snack
## whole milk 0.017 0.016 0.013 0.011
## other vegetables 0.014 0.014 0.010 0.011
## rolls/buns 0.007 0.010 0.009 0.005
## soda 0.010 0.007 0.010 0.009
## yogurt 0.009 0.012 0.008 0.006
## bottled water 0.004 0.006 0.004 0.004
## long life bakery product dessert sugar UHT-milk hamburger meat
## whole milk 0.014 0.014 0.015 0.004 0.015
## other vegetables 0.011 0.012 0.011 0.008 0.014
## rolls/buns 0.008 0.007 0.007 0.006 0.009
## soda 0.008 0.010 0.007 0.008 0.006
## yogurt 0.009 0.010 0.007 0.007 0.007
## bottled water 0.004 0.005 0.005 0.007 0.003
## berries hygiene articles onions specialty chocolate candy
## whole milk 0.012 0.013 0.012 0.008 0.008
## other vegetables 0.010 0.010 0.014 0.006 0.007
## rolls/buns 0.007 0.006 0.007 0.006 0.007
## soda 0.007 0.007 0.005 0.006 0.009
## yogurt 0.011 0.007 0.007 0.005 0.005
## bottled water 0.004 0.006 0.006 0.004 0.004
## frozen meals misc. beverages oil butter milk specialty bar
## whole milk 0.010 0.007 0.011 0.012 0.007
## other vegetables 0.008 0.006 0.010 0.010 0.006
## rolls/buns 0.005 0.005 0.005 0.008 0.006
## soda 0.006 0.007 0.005 0.005 0.007
## yogurt 0.006 0.004 0.005 0.009 0.004
## bottled water 0.003 0.005 0.004 0.004 0.003
## ham beverages meat ice cream sliced cheese hard cheese
## whole milk 0.011 0.007 0.010 0.006 0.011 0.010
## other vegetables 0.009 0.005 0.010 0.005 0.009 0.009
## rolls/buns 0.007 0.005 0.007 0.003 0.008 0.006
## soda 0.005 0.005 0.005 0.006 0.005 0.004
## yogurt 0.007 0.005 0.005 0.004 0.008 0.006
## bottled water 0.003 0.003 0.003 0.003 0.004 0.004
## cat food grapes chewing gum red/blush wine detergent
## whole milk 0.009 0.007 0.005 0.004 0.009
## other vegetables 0.007 0.009 0.005 0.005 0.006
## rolls/buns 0.004 0.005 0.004 0.004 0.003
## soda 0.005 0.004 0.005 0.005 0.003
## yogurt 0.006 0.005 0.002 0.002 0.004
## bottled water 0.004 0.004 0.003 0.003 0.002
## white wine pickled vegetables semi-finished bread
## whole milk 0.003 0.007 0.007
## other vegetables 0.002 0.006 0.005
## rolls/buns 0.003 0.004 0.003
## soda 0.004 0.004 0.004
## yogurt 0.002 0.004 0.004
## bottled water 0.004 0.003 0.003
## baking powder dishes flour pot plants soft cheese
## whole milk 0.009 0.005 0.008 0.007 0.008
## other vegetables 0.007 0.006 0.006 0.004 0.007
## rolls/buns 0.004 0.003 0.004 0.003 0.005
## soda 0.002 0.002 0.003 0.003 0.003
## yogurt 0.005 0.004 0.005 0.004 0.006
## bottled water 0.003 0.002 0.003 0.003 0.002
## processed cheese herbs pasta canned fish seasonal products
## whole milk 0.007 0.008 0.006 0.005 0.004
## other vegetables 0.005 0.008 0.004 0.005 0.004
## rolls/buns 0.005 0.003 0.002 0.004 0.003
## soda 0.005 0.002 0.004 0.003 0.003
## yogurt 0.003 0.004 0.003 0.003 0.003
## bottled water 0.002 0.003 0.002 0.002 0.001
## cake bar packaged fruit/vegetables mustard frozen fish
## whole milk 0.006 0.004 0.005 0.005
## other vegetables 0.004 0.003 0.003 0.005
## rolls/buns 0.003 0.003 0.004 0.002
## soda 0.004 0.002 0.002 0.001
## yogurt 0.002 0.003 0.002 0.003
## bottled water 0.002 0.003 0.002 0.001
## cling film/bags spread cheese liquor frozen dessert salt
## whole milk 0.004 0.003 0.001 0.004 0.004
## other vegetables 0.003 0.003 0.001 0.004 0.004
## rolls/buns 0.002 0.004 0.001 0.003 0.002
## soda 0.002 0.003 0.002 0.003 0.002
## yogurt 0.003 0.004 0.001 0.002 0.002
## bottled water 0.002 0.002 0.001 0.002 0.002
## canned vegetables dish cleaner flower (seeds) condensed milk
## whole milk 0.004 0.003 0.004 0.002
## other vegetables 0.005 0.002 0.004 0.003
## rolls/buns 0.002 0.002 0.002 0.002
## soda 0.003 0.002 0.002 0.002
## yogurt 0.003 0.002 0.001 0.002
## bottled water 0.001 0.002 0.002 0.001
## roll products pet care photo/film mayonnaise sweet spreads
## whole milk 0.005 0.003 0.002 0.003 0.004
## other vegetables 0.005 0.002 0.001 0.004 0.002
## rolls/buns 0.002 0.001 0.001 0.003 0.002
## soda 0.002 0.002 0.001 0.002 0.003
## yogurt 0.002 0.001 0.001 0.003 0.002
## bottled water 0.001 0.001 0.001 0.001 0.000
## chocolate marshmallow candles specialty cheese dog food
## whole milk 0.003 0.003 0.004 0.003
## other vegetables 0.002 0.002 0.004 0.002
## rolls/buns 0.002 0.001 0.001 0.001
## soda 0.002 0.001 0.001 0.002
## yogurt 0.002 0.001 0.003 0.002
## bottled water 0.001 0.001 0.002 0.001
## frozen potato products house keeping products turkey
## whole milk 0.003 0.004 0.004
## other vegetables 0.003 0.003 0.004
## rolls/buns 0.002 0.002 0.002
## soda 0.003 0.002 0.001
## yogurt 0.002 0.002 0.002
## bottled water 0.001 0.001 0.002
## Instant food products liquor (appetizer) rice instant coffee
## whole milk 0.003 0.002 0.005 0.002
## other vegetables 0.003 0.001 0.004 0.002
## rolls/buns 0.002 0.001 0.002 0.002
## soda 0.002 0.002 0.001 0.002
## yogurt 0.001 0.001 0.002 0.002
## bottled water 0.001 0.001 0.001 0.001
## popcorn zwieback soups finished products vinegar
## whole milk 0.003 0.002 0.003 0.001 0.003
## other vegetables 0.002 0.002 0.003 0.002 0.002
## rolls/buns 0.001 0.001 0.001 0.001 0.002
## soda 0.002 0.001 0.001 0.002 0.001
## yogurt 0.001 0.001 0.001 0.001 0.002
## bottled water 0.001 0.001 0.001 0.001 0.001
## female sanitary products kitchen towels dental care cereals
## whole milk 0.002 0.003 0.002 0.004
## other vegetables 0.001 0.002 0.002 0.002
## rolls/buns 0.002 0.001 0.001 0.001
## soda 0.001 0.001 0.001 0.001
## yogurt 0.002 0.001 0.001 0.002
## bottled water 0.001 0.000 0.001 0.001
## sparkling wine sauces softener jam spices liver loaf
## whole milk 0.001 0.002 0.002 0.003 0.001 0.002
## other vegetables 0.002 0.002 0.002 0.002 0.002 0.002
## rolls/buns 0.001 0.001 0.001 0.001 0.001 0.002
## soda 0.001 0.001 0.001 0.001 0.001 0.001
## yogurt 0.001 0.001 0.001 0.001 0.001 0.002
## bottled water 0.001 0.001 0.001 0.001 0.001 0.000
## curd cheese cleaner male cosmetics rum ketchup meat spreads
## whole milk 0.002 0.002 0.001 0.002 0.002 0.001
## other vegetables 0.002 0.002 0.001 0.002 0.002 0.001
## rolls/buns 0.002 0.001 0.001 0.001 0.001 0.001
## soda 0.000 0.001 0.001 0.001 0.001 0.001
## yogurt 0.001 0.002 0.001 0.000 0.001 0.002
## bottled water 0.001 0.001 0.001 0.001 0.000 0.001
## brandy light bulbs tea specialty fat abrasive cleaner
## whole milk 0.001 0.001 0.002 0.001 0.002
## other vegetables 0.001 0.001 0.002 0.001 0.002
## rolls/buns 0.001 0.001 0.001 0.001 0.001
## soda 0.001 0.001 0.000 0.000 0.001
## yogurt 0.001 0.001 0.001 0.001 0.001
## bottled water 0.001 0.000 0.001 0.001 0.000
## skin care nuts/prunes artif. sweetener canned fruit syrup
## whole milk 0.002 0.001 0.001 0.001 0.001
## other vegetables 0.001 0.001 0.001 0.001 0.001
## rolls/buns 0.001 0.001 0.001 0.001 0.001
## soda 0.001 0.001 0.001 0.001 0.001
## yogurt 0.001 0.000 0.001 0.000 0.001
## bottled water 0.001 0.000 0.001 0.001 0.001
## nut snack snack products fish potato products
## whole milk 0.001 0.001 0.001 0.001
## other vegetables 0.001 0.001 0.001 0.001
## rolls/buns 0.001 0.001 0.000 0.001
## soda 0.001 0.001 0.000 0.001
## yogurt 0.001 0.000 0.001 0.001
## bottled water 0.000 0.000 0.000 0.001
## bathroom cleaner cookware soap cooking chocolate tidbits
## whole milk 0.001 0.000 0.001 0.001 0.001
## other vegetables 0.001 0.001 0.001 0.001 0.000
## rolls/buns 0.001 0.001 0.001 0.001 0.001
## soda 0.001 0.001 0.001 0.000 0.001
## yogurt 0.001 0.000 0.000 0.000 0.001
## bottled water 0.000 0.000 0.000 0.001 0.000
## pudding powder organic sausage cocoa drinks prosecco
## whole milk 0.001 0.001 0.001 0.001
## other vegetables 0.001 0.001 0.001 0.000
## rolls/buns 0.000 0.001 0.001 0.001
## soda 0.000 0.000 0.000 0.001
## yogurt 0.001 0.001 0.000 0.000
## bottled water 0.000 0.000 0.000 0.000
## flower soil/fertilizer ready soups specialty vegetables
## whole milk 0 0.001 0.000
## other vegetables 0 0.001 0.001
## rolls/buns 0 0.001 0.000
## soda 0 0.001 0.001
## yogurt 0 0.000 0.001
## bottled water 0 0.001 0.000
## organic products honey decalcifier cream frozen fruits
## whole milk 0.000 0.001 0.001 0.000 0.000
## other vegetables 0.001 0.000 0.001 0.001 0.001
## rolls/buns 0.000 0.000 0.000 0.000 0.000
## soda 0.001 0.000 0.000 0.000 0.000
## yogurt 0.000 0.000 0.000 0.000 0.000
## bottled water 0.000 0.000 0.000 0.000 0.000
## hair spray rubbing alcohol liqueur salad dressing whisky
## whole milk 0 0.001 0 0.000 0
## other vegetables 0 0.000 0 0.001 0
## rolls/buns 0 0.000 0 0.000 0
## soda 0 0.000 0 0.000 0
## yogurt 0 0.000 0 0.000 0
## bottled water 0 0.000 0 0.000 0
## make up remover toilet cleaner frozen chicken baby cosmetics
## whole milk 0 0 0 0
## other vegetables 0 0 0 0
## rolls/buns 0 0 0 0
## soda 0 0 0 0
## yogurt 0 0 0 0
## bottled water 0 0 0 0
## kitchen utensil bags preservation products baby food
## whole milk 0 0 0 0
## other vegetables 0 0 0 0
## rolls/buns 0 0 0 0
## soda 0 0 0 0
## yogurt 0 0 0 0
## bottled water 0 0 0 0
## sound storage medium
## whole milk 0
## other vegetables 0
## rolls/buns 0
## soda 0
## yogurt 0
## bottled water 0
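To make the cross table above easier to read: each diagonal cell is the support of a single item, and each off-diagonal cell is the support of a pair of items. The following base R sketch reproduces that layout on a made-up toy example. The five baskets and the item names are assumptions for illustration only, not data from Groceries.

```r
# Toy data (hypothetical): 5 baskets over 3 items, 1 = item is in the basket.
baskets <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    1, 1, 1,
    0, 1, 0,
    1, 1, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("milk", "bread", "soda"))
)

# crossprod(baskets) is t(baskets) %*% baskets: it counts how often each
# pair of items appears together (and, on the diagonal, how often each
# item appears at all). Dividing by the number of baskets gives support.
pair_support <- crossprod(baskets) / nrow(baskets)
print(round(pair_support, 3))
```

Read this way, the 0.256 in the whole milk row and column of the real table is the share of transactions that contain whole milk, and the 0.075 next to it is the share that contain both whole milk and other vegetables.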
To keep only the most relevant rules, I set the minimum support to 0.01 and the minimum confidence to 0.5.
rules.transactions<-apriori(transactions, parameter=list(supp=0.01, conf=0.5))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.5 0.1 1 none FALSE TRUE 5 0.01 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 98
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
## sorting and recoding items ... [88 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.00s].
## writing ... [15 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
These thresholds give us 15 rules.
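The "Absolute minimum support count: 98" line in the log can be related to the relative threshold with a little arithmetic. This is only a sketch of that arithmetic using the numbers printed above. It does not claim to show how arules applies the threshold internally.

```r
n_transactions <- 9835   # from the apriori log
min_support <- 0.01      # relative support threshold passed to apriori()

# Relative support expressed as a number of transactions.
threshold <- n_transactions * min_support
print(threshold)

# Truncating gives the value shown in the log.
logged_count <- floor(threshold)
print(logged_count)

# Smallest whole number of transactions whose share is at least 0.01.
smallest_qualifying <- ceiling(threshold)
print(smallest_qualifying / n_transactions >= min_support)
```

In other words, an itemset has to appear in roughly 1 out of every 100 baskets to be considered at all.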
rules_trans_conf <- sort(rules.transactions, by = 'confidence', decreasing = TRUE)
inspect(head(rules_trans_conf))
## lhs rhs support
## [1] {citrus fruit, root vegetables} => {other vegetables} 0.01037112
## [2] {tropical fruit, root vegetables} => {other vegetables} 0.01230300
## [3] {curd, yogurt} => {whole milk} 0.01006609
## [4] {other vegetables, butter} => {whole milk} 0.01148958
## [5] {tropical fruit, root vegetables} => {whole milk} 0.01199797
## [6] {root vegetables, yogurt} => {whole milk} 0.01453991
## confidence coverage lift count
## [1] 0.5862069 0.01769192 3.029608 102
## [2] 0.5845411 0.02104728 3.020999 121
## [3] 0.5823529 0.01728521 2.279125 99
## [4] 0.5736041 0.02003050 2.244885 113
## [5] 0.5700483 0.02104728 2.230969 118
## [6] 0.5629921 0.02582613 2.203354 143
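The columns in the table above are related by simple formulas: support = count / number of transactions, confidence = support / coverage, and lift = confidence / support of the RHS. As a sanity check, the sketch below recomputes them for rule [1] from the printed values. Because the printed values are rounded, the results should agree only approximately.

```r
# Values copied from the output above for
# {citrus fruit, root vegetables} => {other vegetables}.
n_transactions <- 9835
count_rule <- 102                 # transactions containing LHS and RHS
coverage <- 0.01769192            # support of the LHS alone
confidence_reported <- 0.5862069
lift_reported <- 3.029608

# support = count / number of transactions
support <- count_rule / n_transactions
print(round(support, 8))

# confidence = support(LHS and RHS) / support(LHS)
confidence <- support / coverage
print(round(confidence, 4))

# lift = confidence / support(RHS), so the RHS support implied by the
# reported numbers is confidence / lift.
rhs_support <- confidence_reported / lift_reported
print(round(rhs_support, 3))
```

The implied RHS support can be compared with the diagonal entry for other vegetables in the cross table shown earlier.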
rules_trans_lift <- sort(rules.transactions, by = 'lift', decreasing = TRUE)
inspect(head(rules_trans_lift))
## lhs rhs support
## [1] {citrus fruit, root vegetables} => {other vegetables} 0.01037112
## [2] {tropical fruit, root vegetables} => {other vegetables} 0.01230300
## [3] {root vegetables, rolls/buns} => {other vegetables} 0.01220132
## [4] {root vegetables, yogurt} => {other vegetables} 0.01291307
## [5] {curd, yogurt} => {whole milk} 0.01006609
## [6] {other vegetables, butter} => {whole milk} 0.01148958
## confidence coverage lift count
## [1] 0.5862069 0.01769192 3.029608 102
## [2] 0.5845411 0.02104728 3.020999 121
## [3] 0.5020921 0.02430097 2.594890 120
## [4] 0.5000000 0.02582613 2.584078 127
## [5] 0.5823529 0.01728521 2.279125 99
## [6] 0.5736041 0.02003050 2.244885 113
rules_trans_count <- sort(rules.transactions, by = 'count', decreasing = TRUE)
inspect(head(rules_trans_count))
## lhs rhs support
## [1] {other vegetables, yogurt} => {whole milk} 0.02226741
## [2] {tropical fruit, yogurt} => {whole milk} 0.01514997
## [3] {other vegetables, whipped/sour cream} => {whole milk} 0.01464159
## [4] {root vegetables, yogurt} => {whole milk} 0.01453991
## [5] {pip fruit, other vegetables} => {whole milk} 0.01352313
## [6] {root vegetables, yogurt} => {other vegetables} 0.01291307
## confidence coverage lift count
## [1] 0.5128806 0.04341637 2.007235 219
## [2] 0.5173611 0.02928317 2.024770 149
## [3] 0.5070423 0.02887646 1.984385 144
## [4] 0.5629921 0.02582613 2.203354 143
## [5] 0.5175097 0.02613116 2.025351 133
## [6] 0.5000000 0.02582613 2.584078 127
rules_trans_supp <- sort(rules.transactions, by = 'support', decreasing = TRUE)
inspect(head(rules_trans_supp))
## lhs rhs support
## [1] {other vegetables, yogurt} => {whole milk} 0.02226741
## [2] {tropical fruit, yogurt} => {whole milk} 0.01514997
## [3] {other vegetables, whipped/sour cream} => {whole milk} 0.01464159
## [4] {root vegetables, yogurt} => {whole milk} 0.01453991
## [5] {pip fruit, other vegetables} => {whole milk} 0.01352313
## [6] {root vegetables, yogurt} => {other vegetables} 0.01291307
## confidence coverage lift count
## [1] 0.5128806 0.04341637 2.007235 219
## [2] 0.5173611 0.02928317 2.024770 149
## [3] 0.5070423 0.02887646 1.984385 144
## [4] 0.5629921 0.02582613 2.203354 143
## [5] 0.5175097 0.02613116 2.025351 133
## [6] 0.5000000 0.02582613 2.584078 127
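Sorting by count and sorting by support return the same six rules in the same order. That is expected, because count is just support multiplied by the number of transactions. A quick check using the values printed above:

```r
n_transactions <- 9835

# support and count columns copied from the two outputs above
support <- c(0.02226741, 0.01514997, 0.01464159,
             0.01453991, 0.01352313, 0.01291307)
count <- c(219, 149, 144, 143, 133, 127)

# Recover the counts from the (rounded) supports.
recovered <- round(support * n_transactions)
print(recovered)

# Both measures induce the same ordering of the rules.
same_order <- identical(order(support, decreasing = TRUE),
                        order(count, decreasing = TRUE))
print(same_order)
```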
plot(rules.transactions, measure=c("support","lift"), shading="confidence")
We can see that lift is around 2.10 for most rules, apart from four
rules with lift of about 2.60 and 3.00. Support shows a similar pattern:
the majority of rules have support between 0.01 and 0.016, apart from
one rule with support of nearly 0.0225.
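The rules that stand out in the scatter plot can also be pulled out by filtering on lift. The sketch below re-mines the rules with the same thresholds so that it runs on its own. The object names and the cutoff of 2.5 are assumptions, chosen only to separate the two clusters described above.

```r
library(arules)
data("Groceries")

# Same thresholds as in the analysis; verbose = FALSE only hides the log.
rules <- apriori(Groceries,
                 parameter = list(supp = 0.01, conf = 0.5),
                 control = list(verbose = FALSE))

# Keep the rules whose lift lies above the main cluster.
# The 2.5 cutoff is an illustrative assumption.
high_lift <- subset(rules, subset = lift > 2.5)
inspect(sort(high_lift, by = "lift"))
```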
plot(rules.transactions, method = 'grouped')
Here we can see how the two RHS items, other vegetables and whole milk,
are split across the LHS groups. Whole milk is bought with the majority
of items, regardless of which products are in the LHS group.
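The split between the two RHS items can also be tabulated directly instead of being read off the plot. This is a minimal sketch that re-mines the rules so it is self-contained. The names rules and rhs_counts are assumptions for illustration.

```r
library(arules)
data("Groceries")

# Same thresholds as in the analysis; verbose = FALSE only hides the log.
rules <- apriori(Groceries,
                 parameter = list(supp = 0.01, conf = 0.5),
                 control = list(verbose = FALSE))

# labels() turns each RHS itemset into a string such as "{whole milk}",
# and table() counts how many rules share each RHS.
rhs_counts <- table(labels(rhs(rules)))
print(rhs_counts)
```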
plot(rules.transactions, method = 'graph', engine = 'htmlwidget')
This is a more engaging, interactive plot that shows exactly how the rules are connected to each other and how they relate to the RHS items.