The aim of this study is to apply association rules to identify patterns and dependencies in consumer purchasing behavior based on their market baskets in a grocery store. The dataset used for this analysis was sourced from Kaggle (https://www.kaggle.com/datasets/heeraldedhia/groceries-dataset) and contains transactional data about customer purchases.
As a bonus, at the end we compare the computation times of Eclat and Apriori to see which algorithm is faster on this dataset.
library(arules)
library(arulesViz)
library(dplyr)
The dataset “Groceries_dataset.csv” was preprocessed in R to structure the transactional data in a format suitable for association rules analysis. The data was grouped by customer (Member_number), and all purchased items were concatenated into a single row per customer. Then, using arules, the dataset was converted into a transaction object, where each row represents one customer's market basket.
data <- read.csv("Groceries_dataset.csv")
head(data)
## Member_number Date itemDescription
## 1 1808 21-07-2015 tropical fruit
## 2 2552 05-01-2015 whole milk
## 3 2300 19-09-2015 pip fruit
## 4 1187 12-12-2015 other vegetables
## 5 3037 01-02-2015 whole milk
## 6 4941 14-02-2015 rolls/buns
summary(data)
## Member_number Date itemDescription
## Min. :1000 Length:38765 Length:38765
## 1st Qu.:2002 Class :character Class :character
## Median :3005 Mode :character Mode :character
## Mean :3004
## 3rd Qu.:4007
## Max. :5000
The current format is not suitable for further analysis. We need to group the products by Member_number so that each customer's purchases form one transaction.
data_grouped <- data %>%
group_by(Member_number) %>%
summarise(Items = paste(itemDescription, collapse = ", "))
data_grouped <- data_grouped[,2]
write.table(data_grouped, file = "data_grouped.csv", sep = ",", row.names = FALSE, col.names = FALSE, quote = FALSE)
transactions<-read.transactions("data_grouped.csv", format="basket",
sep=",", skip=0, quote="", rm.duplicates = FALSE)
## Warning in asMethod(object): removing duplicated items in transactions
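The grouping step can also be sketched without the intermediate CSV file. The example below uses a tiny made-up data frame (the rows are illustrative, not taken from the dataset) and base R's split() to collect each member's items into one basket. Repeated items are dropped with unique(), which mirrors the de-duplication reported in the warning above. With arules loaded, such a list can usually be coerced with as(baskets, "transactions"); that step is left as a comment because it is an assumption about the workflow, not what the original code does.

```r
# Toy stand-in for the raw data (hypothetical rows, same column names)
toy <- data.frame(
  Member_number   = c(1000, 1000, 1001, 1000, 1001),
  itemDescription = c("whole milk", "soda", "yogurt", "whole milk", "soda"),
  stringsAsFactors = FALSE
)

# One character vector of items per member
baskets <- split(toy$itemDescription, toy$Member_number)

# Drop repeated items within each basket
baskets <- lapply(baskets, unique)

print(baskets)

# With arules loaded, the list could then be coerced (assumption, not run here):
# transactions <- as(baskets, "transactions")
```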
summary(transactions)
## transactions as itemMatrix in sparse format with
## 3898 rows (elements/itemsets/transactions) and
## 167 columns (items) and a density of 0.05340678
##
## most frequent items:
## whole milk other vegetables rolls/buns soda
## 1786 1468 1363 1222
## yogurt (Other)
## 1103 27824
##
## element (itemset/transaction) length distribution:
## sizes
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
## 6 248 87 331 261 381 303 332 340 296 276 238 181 179 123 97 66 46 39 28
## 21 22 23 24 25 26
## 15 13 3 5 2 2
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 6.000 8.500 8.919 12.000 26.000
##
## includes extended item information - examples:
## labels
## 1 abrasive cleaner
## 2 artif. sweetener
## 3 baby cosmetics
The dataset consists of 3898 transactions and 167 unique items. The most frequently purchased items were: whole milk, other vegetables, rolls/buns, soda and yogurt. The transaction size varies from 1 to 26 items, with a median of 8.5 items and an average of 8.9 items per transaction.
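The density and the mean basket size in the summary are linked by simple arithmetic, which can be checked from the printed figures alone. The sketch below adds up the item occurrences shown in the summary (the five most frequent items plus "(Other)") and divides by the number of transactions and by the size of the transaction-by-item matrix. The variable names are illustrative.

```r
n_transactions <- 3898
n_items        <- 167

# Item occurrences from the summary: top five items plus "(Other)"
total_occurrences <- sum(c(1786, 1468, 1363, 1222, 1103, 27824))

# Average number of distinct items per basket
mean_basket_size <- total_occurrences / n_transactions

# Share of filled cells in the 3898 x 167 transaction matrix
density <- total_occurrences / (n_transactions * n_items)

print(round(mean_basket_size, 3))
print(round(density, 5))
```

Both values agree with the summary output (a mean of 8.919 and a density of about 0.0534).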
We can also inspect a few baskets.
inspect(transactions[1:10])
## items
## [1] {canned beer,
## hygiene articles,
## misc. beverages,
## pastry,
## pickled vegetables,
## salty snack,
## sausage,
## semi-finished bread,
## soda,
## whole milk,
## yogurt}
## [2] {beef,
## curd,
## frankfurter,
## rolls/buns,
## sausage,
## soda,
## whipped/sour cream,
## white bread,
## whole milk}
## [3] {butter,
## butter milk,
## frozen vegetables,
## other vegetables,
## specialty chocolate,
## sugar,
## tropical fruit,
## whole milk}
## [4] {dental care,
## detergent,
## frozen meals,
## rolls/buns,
## root vegetables,
## sausage}
## [5] {canned beer,
## chocolate,
## cling film/bags,
## dish cleaner,
## frozen fish,
## hygiene articles,
## other vegetables,
## packaged fruit/vegetables,
## pastry,
## pip fruit,
## red/blush wine,
## rolls/buns,
## root vegetables,
## shopping bags,
## tropical fruit,
## whole milk}
## [6] {margarine,
## rolls/buns,
## whipped/sour cream}
## [7] {bottled beer,
## bottled water,
## chicken,
## chocolate,
## flour,
## frankfurter,
## rice,
## rolls/buns,
## shopping bags,
## skin care,
## softener,
## whole milk}
## [8] {dessert,
## domestic eggs,
## hamburger meat,
## liquor (appetizer),
## liver loaf,
## photo/film,
## root vegetables,
## soda,
## tropical fruit,
## white wine,
## yogurt}
## [9] {canned fish,
## cocoa drinks,
## herbs,
## ketchup,
## newspapers,
## pastry,
## tropical fruit,
## yogurt}
## [10] {bottled water,
## candles,
## coffee,
## frankfurter,
## kitchen towels,
## pip fruit,
## rolls/buns,
## sliced cheese,
## specialty bar,
## UHT-milk}
The charts below show the 25 most popular products in both relative and absolute terms. The tables that follow list the 25 most and the 25 least frequently purchased items.
itemFrequencyPlot(transactions, topN=25, type="relative", main="Item Frequency Plot - relative", ylim=c(0, 0.5), col='orange')
itemFrequencyPlot(transactions, topN=25, type="absolute", main="Item Frequency Plot - absolute", ylim=c(0, 2000.0), col='yellow')
## Item Occurrences
## 1 whole milk 1786
## 2 other vegetables 1468
## 3 rolls/buns 1363
## 4 soda 1222
## 5 yogurt 1103
## 6 tropical fruit 911
## 7 root vegetables 899
## 8 bottled water 833
## 9 sausage 803
## 10 citrus fruit 723
## 11 pastry 692
## 12 pip fruit 665
## 13 shopping bags 656
## 14 canned beer 644
## 15 bottled beer 619
## 16 whipped/sour cream 603
## 17 newspapers 545
## 18 frankfurter 536
## 19 brown bread 530
## 20 domestic eggs 519
## 21 pork 516
## 22 butter 493
## 23 fruit/vegetable juice 487
## 24 curd 471
## 25 beef 466
## Item Occurrences
## 1 kitchen utensil 1
## 2 preservation products 1
## 3 baby cosmetics 3
## 4 bags 4
## 5 frozen chicken 5
## 6 make up remover 5
## 7 rubbing alcohol 5
## 8 toilet cleaner 5
## 9 salad dressing 6
## 10 whisky 8
## 11 decalcifier 9
## 12 hair spray 9
## 13 liqueur 9
## 14 organic products 10
## 15 frozen fruits 11
## 16 specialty vegetables 11
## 17 cream 12
## 18 honey 13
## 19 cooking chocolate 15
## 20 ready soups 15
## 21 cocoa drinks 16
## 22 flower soil/fertilizer 16
## 23 pudding powder 16
## 24 bathroom cleaner 17
## 25 cookware 17
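The two tables above list the 25 most and the 25 least frequent items, but the code that produced them is not shown. One possible way to build such tables, sketched here in base R on a made-up item vector, is to tabulate the occurrences and sort them in both directions. In arules, itemFrequency(transactions, type = "absolute") returns comparable counts for a transaction object.

```r
# Hypothetical item occurrences (one entry per basket-item pair)
items <- c("whole milk", "soda", "whole milk", "yogurt",
           "soda", "whole milk", "honey")

counts <- table(items)
freq <- data.frame(
  Item        = names(counts),
  Occurrences = as.integer(counts),
  stringsAsFactors = FALSE
)

# Most frequent first, then least frequent first (ties broken by name)
most_frequent  <- freq[order(-freq$Occurrences, freq$Item), ]
least_frequent <- freq[order(freq$Occurrences, freq$Item), ]

print(head(most_frequent, 2))
print(head(least_frequent, 2))
```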
To analyze purchasing behavior, we will now explore association rules, which reveal relationships between the occurrence of two or more products. These rules will help us identify patterns in transactions.
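Association rules are characterized by three key metrics:
Support - the share of all transactions that contain every item in the rule.
Confidence - the share of transactions containing the antecedent that also contain the consequent.
Lift - the confidence divided by the overall share of transactions containing the consequent; values above 1 indicate a positive association.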
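As a small illustration of these three metrics (support, confidence and lift), the sketch below computes them by hand for the rule {yogurt} => {whole milk} on five made-up baskets. The baskets and the helper function support() are hypothetical and serve only to show the arithmetic; arules computes the same quantities internally.

```r
# Five hypothetical baskets
baskets <- list(
  c("whole milk", "yogurt"),
  c("whole milk", "soda"),
  c("yogurt", "soda"),
  c("whole milk", "yogurt", "soda"),
  c("whole milk")
)

# Share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

lhs <- "yogurt"
rhs <- "whole milk"

supp_rule  <- support(c(lhs, rhs), baskets)       # share with both items
confidence <- supp_rule / support(lhs, baskets)   # share of lhs baskets that also hold rhs
lift       <- confidence / support(rhs, baskets)  # confidence relative to the rhs share

print(c(support = supp_rule, confidence = confidence, lift = lift))
```

In this toy example the lift is below 1, meaning yogurt buyers are slightly less likely than average to buy whole milk.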
We will apply the Apriori algorithm, which helps filter out less significant rules by setting a minimum support and confidence threshold. This algorithm follows the principle that if a combination of items is frequent, its subsets must also be frequent, and if an item is infrequent, any set containing it will be infrequent as well.
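The pruning principle described above can be sketched in a few lines of base R. This is a simplified two-level illustration on made-up baskets, not the implementation used by arules: infrequent single items are discarded first, and candidate pairs are built only from the items that survive. The threshold and all names are assumptions chosen for the example.

```r
# Hypothetical baskets and threshold
baskets <- list(
  c("milk", "bread", "butter"),
  c("milk", "bread"),
  c("milk", "soda"),
  c("bread", "butter"),
  c("milk", "bread", "soda")
)
min_support <- 0.5

# Share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Level 1: keep only frequent single items
all_items      <- sort(unique(unlist(baskets)))
item_supp      <- vapply(all_items, support, numeric(1), baskets = baskets)
frequent_items <- all_items[item_supp >= min_support]

# Level 2: candidate pairs are built only from frequent single items,
# because a pair containing an infrequent item cannot be frequent
candidates     <- combn(frequent_items, 2, simplify = FALSE)
pair_supp      <- vapply(candidates, support, numeric(1), baskets = baskets)
frequent_pairs <- candidates[pair_supp >= min_support]

print(frequent_items)
print(frequent_pairs)
```

Here butter and soda fall below the threshold, so only the single candidate pair {bread, milk} needs to be counted instead of all six possible pairs.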
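For our analysis, we extract rules that meet a minimum support of 4% and a minimum confidence of 43%. This means the items of each rule occur together in at least 4% of all transactions, and the consequent appears in at least 43% of the transactions that contain the antecedent. Because minlen is left at its default of 1, the result also includes one rule with an empty antecedent, {} => {whole milk}, which simply restates how often whole milk is bought.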
rules<-apriori(transactions, parameter=list(supp=0.04, conf=0.43))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.43 0.1 1 none FALSE TRUE 5 0.04 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 155
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[167 item(s), 3898 transaction(s)] done [0.00s].
## sorting and recoding items ... [60 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 done [0.00s].
## writing ... [93 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
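The relative support threshold can be translated into an absolute number of baskets. The short calculation below shows that 4% of 3898 transactions is 155.92, so an itemset must occur in at least 156 baskets, which matches the smallest count among the mined rules. The value 155 printed in the log is consistent with that product being truncated to an integer, although this is an assumption about how the log line is produced.

```r
n_transactions <- 3898
min_support    <- 0.04

# Relative threshold expressed as a number of baskets
threshold <- min_support * n_transactions

# Smallest whole count that meets the threshold
min_count <- ceiling(threshold)

print(threshold)
print(min_count)
```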
Apriori returned 93 rules, all of which are listed below.
inspect(rules)
## lhs rhs support
## [1] {} => {whole milk} 0.45818368
## [2] {hamburger meat} => {whole milk} 0.04540790
## [3] {UHT-milk} => {whole milk} 0.04053361
## [4] {napkins} => {whole milk} 0.04309903
## [5] {dessert} => {whole milk} 0.04079015
## [6] {cream cheese} => {whole milk} 0.04771678
## [7] {chocolate} => {whole milk} 0.04797332
## [8] {white bread} => {whole milk} 0.04771678
## [9] {chicken} => {whole milk} 0.05002565
## [10] {frozen vegetables} => {other vegetables} 0.04566444
## [11] {frozen vegetables} => {whole milk} 0.05515649
## [12] {coffee} => {whole milk} 0.05515649
## [13] {margarine} => {whole milk} 0.05951770
## [14] {beef} => {whole milk} 0.06413545
## [15] {curd} => {whole milk} 0.06362237
## [16] {fruit/vegetable juice} => {whole milk} 0.06233966
## [17] {butter} => {other vegetables} 0.05720883
## [18] {butter} => {whole milk} 0.06618779
## [19] {pork} => {whole milk} 0.06695741
## [20] {domestic eggs} => {whole milk} 0.07029246
## [21] {brown bread} => {other vegetables} 0.05977424
## [22] {brown bread} => {whole milk} 0.06977937
## [23] {newspapers} => {whole milk} 0.07234479
## [24] {frankfurter} => {other vegetables} 0.06105695
## [25] {frankfurter} => {whole milk} 0.06798358
## [26] {whipped/sour cream} => {other vegetables} 0.06695741
## [27] {whipped/sour cream} => {whole milk} 0.07978450
## [28] {bottled beer} => {other vegetables} 0.06849666
## [29] {bottled beer} => {whole milk} 0.08542842
## [30] {shopping bags} => {other vegetables} 0.07311442
## [31] {shopping bags} => {whole milk} 0.09132889
## [32] {canned beer} => {whole milk} 0.08722422
## [33] {pip fruit} => {whole milk} 0.08696768
## [34] {pastry} => {whole milk} 0.09107234
## [35] {citrus fruit} => {whole milk} 0.09235505
## [36] {sausage} => {other vegetables} 0.09286814
## [37] {sausage} => {whole milk} 0.10697794
## [38] {bottled water} => {other vegetables} 0.09389430
## [39] {bottled water} => {whole milk} 0.11236532
## [40] {tropical fruit} => {whole milk} 0.11646998
## [41] {root vegetables} => {whole milk} 0.11313494
## [42] {yogurt} => {whole milk} 0.15059005
## [43] {soda} => {whole milk} 0.15110313
## [44] {rolls/buns} => {whole milk} 0.17855310
## [45] {other vegetables} => {whole milk} 0.19138019
## [46] {rolls/buns, shopping bags} => {whole milk} 0.04130323
## [47] {shopping bags, whole milk} => {rolls/buns} 0.04130323
## [48] {other vegetables, shopping bags} => {whole milk} 0.04284248
## [49] {shopping bags, whole milk} => {other vegetables} 0.04284248
## [50] {pastry, rolls/buns} => {whole milk} 0.04027707
## [51] {pastry, whole milk} => {rolls/buns} 0.04027707
## [52] {other vegetables, pastry} => {whole milk} 0.04181632
## [53] {pastry, whole milk} => {other vegetables} 0.04181632
## [54] {citrus fruit, other vegetables} => {whole milk} 0.04258594
## [55] {citrus fruit, whole milk} => {other vegetables} 0.04258594
## [56] {sausage, yogurt} => {whole milk} 0.04489482
## [57] {sausage, soda} => {whole milk} 0.04002052
## [58] {rolls/buns, sausage} => {other vegetables} 0.04181632
## [59] {other vegetables, sausage} => {rolls/buns} 0.04181632
## [60] {rolls/buns, sausage} => {whole milk} 0.04874295
## [61] {sausage, whole milk} => {rolls/buns} 0.04874295
## [62] {other vegetables, sausage} => {whole milk} 0.05028220
## [63] {sausage, whole milk} => {other vegetables} 0.05028220
## [64] {bottled water, yogurt} => {whole milk} 0.04027707
## [65] {bottled water, soda} => {whole milk} 0.04002052
## [66] {bottled water, rolls/buns} => {whole milk} 0.04515136
## [67] {bottled water, other vegetables} => {whole milk} 0.05618266
## [68] {bottled water, whole milk} => {other vegetables} 0.05618266
## [69] {tropical fruit, yogurt} => {whole milk} 0.04232940
## [70] {rolls/buns, tropical fruit} => {whole milk} 0.04643407
## [71] {other vegetables, tropical fruit} => {whole milk} 0.05053874
## [72] {tropical fruit, whole milk} => {other vegetables} 0.05053874
## [73] {root vegetables, soda} => {whole milk} 0.04309903
## [74] {rolls/buns, root vegetables} => {other vegetables} 0.04130323
## [75] {other vegetables, root vegetables} => {rolls/buns} 0.04130323
## [76] {rolls/buns, root vegetables} => {whole milk} 0.04797332
## [77] {other vegetables, root vegetables} => {whole milk} 0.04540790
## [78] {soda, yogurt} => {rolls/buns} 0.04232940
## [79] {soda, yogurt} => {other vegetables} 0.04309903
## [80] {soda, yogurt} => {whole milk} 0.05438687
## [81] {rolls/buns, yogurt} => {other vegetables} 0.05233453
## [82] {other vegetables, yogurt} => {rolls/buns} 0.05233453
## [83] {rolls/buns, yogurt} => {whole milk} 0.06593125
## [84] {whole milk, yogurt} => {rolls/buns} 0.06593125
## [85] {other vegetables, yogurt} => {whole milk} 0.07183171
## [86] {whole milk, yogurt} => {other vegetables} 0.07183171
## [87] {rolls/buns, soda} => {other vegetables} 0.05259107
## [88] {rolls/buns, soda} => {whole milk} 0.06516162
## [89] {soda, whole milk} => {rolls/buns} 0.06516162
## [90] {other vegetables, soda} => {whole milk} 0.06926629
## [91] {soda, whole milk} => {other vegetables} 0.06926629
## [92] {other vegetables, rolls/buns} => {whole milk} 0.08209338
## [93] {rolls/buns, whole milk} => {other vegetables} 0.08209338
## confidence coverage lift count
## [1] 0.4581837 1.00000000 1.000000 1786
## [2] 0.5654952 0.08029759 1.234211 177
## [3] 0.5163399 0.07850180 1.126928 158
## [4] 0.5299685 0.08132376 1.156672 168
## [5] 0.4718101 0.08645459 1.029740 159
## [6] 0.5391304 0.08850693 1.176669 186
## [7] 0.5548961 0.08645459 1.211078 187
## [8] 0.5375723 0.08876347 1.173268 186
## [9] 0.4974490 0.10056439 1.085698 195
## [10] 0.4450000 0.10261673 1.181614 178
## [11] 0.5375000 0.10261673 1.173110 215
## [12] 0.4799107 0.11493073 1.047420 215
## [13] 0.5087719 0.11698307 1.110410 232
## [14] 0.5364807 0.11954849 1.170886 250
## [15] 0.5265393 0.12083120 1.149188 248
## [16] 0.4989733 0.12493586 1.089025 243
## [17] 0.4523327 0.12647512 1.201085 223
## [18] 0.5233266 0.12647512 1.142176 258
## [19] 0.5058140 0.13237558 1.103955 261
## [20] 0.5279383 0.13314520 1.152242 274
## [21] 0.4396226 0.13596716 1.167336 233
## [22] 0.5132075 0.13596716 1.120091 272
## [23] 0.5174312 0.13981529 1.129310 282
## [24] 0.4440299 0.13750641 1.179038 238
## [25] 0.4944030 0.13750641 1.079050 265
## [26] 0.4328358 0.15469472 1.149315 261
## [27] 0.5157546 0.15469472 1.125650 311
## [28] 0.4313409 0.15879938 1.145345 267
## [29] 0.5379645 0.15879938 1.174124 333
## [30] 0.4344512 0.16829143 1.153604 285
## [31] 0.5426829 0.16829143 1.184422 356
## [32] 0.5279503 0.16521293 1.152268 340
## [33] 0.5097744 0.17060031 1.112598 339
## [34] 0.5130058 0.17752694 1.119651 355
## [35] 0.4979253 0.18547973 1.086737 360
## [36] 0.4508095 0.20600308 1.197040 362
## [37] 0.5193026 0.20600308 1.133394 417
## [38] 0.4393758 0.21369933 1.166680 366
## [39] 0.5258103 0.21369933 1.147597 438
## [40] 0.4983535 0.23370959 1.087672 454
## [41] 0.4905451 0.23063109 1.070630 441
## [42] 0.5321850 0.28296562 1.161510 587
## [43] 0.4819967 0.31349410 1.051973 589
## [44] 0.5106383 0.34966650 1.114484 696
## [45] 0.5081744 0.37660339 1.109106 746
## [46] 0.6007463 0.06875321 1.311147 161
## [47] 0.4522472 0.09132889 1.293367 161
## [48] 0.5859649 0.07311442 1.278886 167
## [49] 0.4691011 0.09132889 1.245610 167
## [50] 0.5793358 0.06952283 1.264418 157
## [51] 0.4422535 0.09107234 1.264787 157
## [52] 0.5842294 0.07157517 1.275099 163
## [53] 0.4591549 0.09107234 1.219200 163
## [54] 0.5496689 0.07747563 1.199669 166
## [55] 0.4611111 0.09235505 1.224394 166
## [56] 0.5952381 0.07542329 1.299125 175
## [57] 0.5182724 0.07721909 1.131146 156
## [58] 0.5077882 0.08234992 1.348337 163
## [59] 0.4502762 0.09286814 1.287731 163
## [60] 0.5919003 0.08234992 1.291841 190
## [61] 0.4556355 0.10697794 1.303057 190
## [62] 0.5414365 0.09286814 1.181702 196
## [63] 0.4700240 0.10697794 1.248061 196
## [64] 0.6061776 0.06644433 1.323001 157
## [65] 0.5252525 0.07619292 1.146380 156
## [66] 0.5695793 0.07927142 1.243124 176
## [67] 0.5983607 0.09389430 1.305941 219
## [68] 0.5000000 0.11236532 1.327657 219
## [69] 0.5593220 0.07567984 1.220738 165
## [70] 0.5261628 0.08825038 1.148366 181
## [71] 0.5533708 0.09132889 1.207749 197
## [72] 0.4339207 0.11646998 1.152195 197
## [73] 0.5299685 0.08132376 1.156672 168
## [74] 0.4548023 0.09081580 1.207643 161
## [75] 0.4386921 0.09415085 1.254601 161
## [76] 0.5282486 0.09081580 1.152919 187
## [77] 0.4822888 0.09415085 1.052610 177
## [78] 0.4342105 0.09748589 1.241785 165
## [79] 0.4421053 0.09748589 1.173928 168
## [80] 0.5578947 0.09748589 1.217622 212
## [81] 0.4700461 0.11133915 1.248120 204
## [82] 0.4349680 0.12031811 1.243951 204
## [83] 0.5921659 0.11133915 1.292420 257
## [84] 0.4378194 0.15059005 1.252106 257
## [85] 0.5970149 0.12031811 1.303003 280
## [86] 0.4770017 0.15059005 1.266589 280
## [87] 0.4389722 0.11980503 1.165609 205
## [88] 0.5438972 0.11980503 1.187072 254
## [89] 0.4312394 0.15110313 1.233288 254
## [90] 0.5578512 0.12416624 1.217528 270
## [91] 0.4584041 0.15110313 1.217206 270
## [92] 0.5594406 0.14674192 1.220996 320
## [93] 0.4597701 0.17855310 1.220834 320
inspect(sort(rules, by = "lift")[1:5])
## lhs rhs support
## [1] {rolls/buns, sausage} => {other vegetables} 0.04181632
## [2] {bottled water, whole milk} => {other vegetables} 0.05618266
## [3] {bottled water, yogurt} => {whole milk} 0.04027707
## [4] {rolls/buns, shopping bags} => {whole milk} 0.04130323
## [5] {bottled water, other vegetables} => {whole milk} 0.05618266
## confidence coverage lift count
## [1] 0.5077882 0.08234992 1.348337 163
## [2] 0.5000000 0.11236532 1.327657 219
## [3] 0.6061776 0.06644433 1.323001 157
## [4] 0.6007463 0.06875321 1.311147 161
## [5] 0.5983607 0.09389430 1.305941 219
inspect(sort(rules, by = "confidence")[1:5])
## lhs rhs support confidence
## [1] {bottled water, yogurt} => {whole milk} 0.04027707 0.6061776
## [2] {rolls/buns, shopping bags} => {whole milk} 0.04130323 0.6007463
## [3] {bottled water, other vegetables} => {whole milk} 0.05618266 0.5983607
## [4] {other vegetables, yogurt} => {whole milk} 0.07183171 0.5970149
## [5] {sausage, yogurt} => {whole milk} 0.04489482 0.5952381
## coverage lift count
## [1] 0.06644433 1.323001 157
## [2] 0.06875321 1.311147 161
## [3] 0.09389430 1.305941 219
## [4] 0.12031811 1.303003 280
## [5] 0.07542329 1.299125 175
inspect(sort(rules, by = "support")[1:5])
## lhs rhs support confidence coverage lift
## [1] {} => {whole milk} 0.4581837 0.4581837 1.0000000 1.000000
## [2] {other vegetables} => {whole milk} 0.1913802 0.5081744 0.3766034 1.109106
## [3] {rolls/buns} => {whole milk} 0.1785531 0.5106383 0.3496665 1.114484
## [4] {soda} => {whole milk} 0.1511031 0.4819967 0.3134941 1.051973
## [5] {yogurt} => {whole milk} 0.1505900 0.5321850 0.2829656 1.161510
## count
## [1] 1786
## [2] 746
## [3] 696
## [4] 589
## [5] 587
plot(rules, measure=c("support","lift"), shading="confidence", engine = "ggplot2")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
summary(rules)
## set of 93 rules
##
## rule length distribution (lhs + rhs):sizes
## 1 2 3
## 1 44 48
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 3.000 2.505 3.000 3.000
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.04002 Min. :0.4312 Min. :0.06644 Min. :1.000
## 1st Qu.:0.04310 1st Qu.:0.4584 1st Qu.:0.09082 1st Qu.:1.142
## Median :0.05516 Median :0.5098 Median :0.11493 Median :1.177
## Mean :0.06866 Mean :0.5058 Mean :0.13786 Mean :1.186
## 3rd Qu.:0.07183 3rd Qu.:0.5380 3rd Qu.:0.15110 3rd Qu.:1.243
## Max. :0.45818 Max. :0.6062 Max. :1.00000 Max. :1.348
## count
## Min. : 156.0
## 1st Qu.: 168.0
## Median : 215.0
## Mean : 267.6
## 3rd Qu.: 280.0
## Max. :1786.0
##
## mining info:
## data ntransactions support confidence
## transactions 3898 0.04 0.43
## call
## apriori(data = transactions, parameter = list(supp = 0.04, conf = 0.43))
The generated rules have support values ranging from 4% to 45.8%. The maximum belongs to the trivial rule {} => {whole milk}; among rules with a non-empty antecedent the highest support is 19.1% ({other vegetables} => {whole milk}). Confidence varies between 43.1% and 60.6%, and its median of 51% shows that in about half of the rules the consequent appears in more than half of the transactions containing the antecedent. Lift values range from 1.000 (the trivial rule) to 1.348, so every non-trivial rule shows a positive but fairly modest association.
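One way to read a lift value is as the ratio of observed to expected co-occurrences, where "expected" assumes the antecedent and the consequent are bought independently. The sketch below applies this to the rule with the highest lift, {rolls/buns, sausage} => {other vegetables}, using only figures printed in the output above. The antecedent count is recovered from the reported coverage, which is an assumption about rounding.

```r
n <- 3898

# Rule with the highest lift: {rolls/buns, sausage} => {other vegetables}
lhs_count  <- round(0.08234992 * n)  # coverage * n: baskets with rolls/buns and sausage
rhs_count  <- 1468                   # baskets with other vegetables
both_count <- 163                    # baskets matching the whole rule

# Co-occurrences expected if antecedent and consequent were independent
expected <- lhs_count * rhs_count / n

# Observed versus expected
lift <- both_count / expected

print(round(expected, 1))
print(round(lift, 3))
```

Roughly 121 joint purchases would be expected under independence, against 163 observed, which reproduces the reported lift of about 1.35.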
plot(rules, method="graph", measure="support", shading="lift", engine="html")
Rule 77: {other vegetables, root vegetables} => {whole milk}
support = 0.0454 - 4.54% of all transactions contain other vegetables, root vegetables and whole milk together.
confidence = 0.482 - There is a 48.2% chance that a customer who buys other vegetables and root vegetables will also buy whole milk.
lift = 1.05 - This indicates a very weak positive relationship: customers who buy other vegetables and root vegetables are only slightly more likely to purchase whole milk than customers in general.
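The three values for rule 77 can be reproduced from the counts reported in the output. The sketch below uses the rule count (177), the whole milk count (1786) and the antecedent count recovered from the rule's coverage; the variable names are illustrative.

```r
n          <- 3898
rule_count <- 177                    # baskets with all three items
lhs_count  <- round(0.09415085 * n)  # coverage * n: baskets with both vegetable items
milk_count <- 1786                   # baskets with whole milk

support    <- rule_count / n                 # share of all baskets
confidence <- rule_count / lhs_count         # share of antecedent baskets
lift       <- confidence / (milk_count / n)  # relative to the overall milk share

print(round(c(support = support, confidence = confidence, lift = lift), 4))
```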
In this study, the Apriori algorithm was applied to a grocery store transaction dataset to uncover association rules. The analysis with thresholds of support = 0.04 and confidence = 0.43 generated 93 different rules. Based on them we could identify key purchasing patterns, revealing how certain products are commonly bought together. While these results provide valuable insights into consumer behavior, further research could refine the analysis by adjusting the support and confidence thresholds to uncover more specific and potentially stronger relationships.
Given the same minimum support, Eclat and Apriori find the same frequent itemsets, so the key difference lies in their efficiency. To test this, we compare their computation times using identical parameters (support = 0.001, confidence = 0.1, maxlen = 20) to see which algorithm mines association rules faster on our dataset. Because eclat() returns frequent itemsets only, its timing also includes the ruleInduction() step needed to obtain rules.
start_timeECLAT <- Sys.time()
computationECLAT<-eclat(transactions, parameter=list(supp=0.001, maxlen=20))
computationECLAT2<-ruleInduction(computationECLAT, transactions, confidence=0.1)
end_timeECLAT <- Sys.time()
start_timeAPRIORI <- Sys.time()
computationAPRIORI<-apriori(transactions, parameter=list(supp=0.001, conf=0.1, maxlen=20))
end_timeAPRIORI <- Sys.time()
total_timeECLAT <- as.numeric(difftime(end_timeECLAT, start_timeECLAT, units = "secs"))  # force seconds
print(paste("Total computation time in seconds ECLAT:", total_timeECLAT))
## [1] "Total computation time in seconds ECLAT: 2.36619806289673"
total_timeAPRIORI <- as.numeric(difftime(end_timeAPRIORI, start_timeAPRIORI, units = "secs"))  # force seconds
print(paste("Total computation time in seconds APRIORI:", total_timeAPRIORI))
## [1] "Total computation time in seconds APRIORI: 0.982537984848022"
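In this run, Apriori (about 0.98 s) was faster than Eclat followed by rule induction (about 2.37 s), making it the more efficient choice for this dataset and parameter set. Each algorithm was timed only once, so the difference should be read as indicative rather than as a precise benchmark.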
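A single timing can be affected by caching and background load, so a more robust comparison would repeat each measurement and summarize the results. The sketch below shows one way to do that with base R's system.time(). The helper median_elapsed() and the stand-in workload dummy_task() are hypothetical; in practice the workload would be replaced by the Eclat and Apriori calls used above.

```r
# Median elapsed time (seconds) over `reps` runs of a zero-argument function
median_elapsed <- function(f, reps = 5) {
  times <- vapply(
    seq_len(reps),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
  median(times)
}

# Hypothetical stand-in workload; replace with a mining call such as
# function() apriori(transactions,
#                    parameter = list(supp = 0.001, conf = 0.1, maxlen = 20))
dummy_task <- function() sum(sqrt(seq_len(1e5)))

elapsed <- median_elapsed(dummy_task, reps = 3)
print(elapsed)
```

Using the median rather than the mean keeps one unusually slow run from dominating the comparison.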