The purpose of this paper is to check how helpful association rule algorithms such as Eclat and Apriori can be in analysing restaurant sales data. Machine learning is still rarely used in gastronomy: apart from large chains, most restaurants do not use algorithms to analyse sales or customer behaviour. One reason is that many of them are small businesses without the funds to hire data scientists or data engineers. Another is that such analysis requires large, detailed data sets, and many restaurants do not store this information. In my opinion, a restaurant owner who has the opportunity and the skills to run a detailed analysis with machine learning can only benefit from it and gain a better understanding of customers' habits. In this paper I use real-life data from one of the biggest restaurants in Warsaw. The data set consists of receipts from a one-month period.
The packages used are listed below. The data was transformed behind the scenes because it contains confidential restaurant information. The required transformations were: removing rows with missing data, translating variables to English, and deleting columns irrelevant to the analysis. The variable used in the analysis is the product category. The goal is to find which categories are often bought together.
library(arules)
## Loading required package: Matrix
##
## Attaching package: 'arules'
## The following objects are masked from 'package:base':
##
## abbreviate, write
library(arulesViz)
library(arulesCBA)
##
## Attaching package: 'arulesCBA'
## The following object is masked from 'package:arules':
##
## rules
library(readxl)
unique(Restaurant$KATEGORIA_SPRZEDAŻY)
## [1] "Lunch" "Cold drinks" "Red Wine" "Hot drinks"
## [5] "Vodka" "White Wine" "Champagne" "Beer"
## [9] "Juice" "Starters" "Salad" "Bread"
## [13] "Additions" "Pizza" "Seasonal" "Pasta"
## [17] "Bourbon" "Cigarettes" "Breakfast" "Mocktail"
## [21] "Lemonade" "Burger" "Desert" "Coctail"
## [25] "Aperitif" "Whisky" "Liqeur" "Gin"
## [29] "Rum" "Tequila" "Breakfast Drinks" "Kids Menu"
## [33] "Cognac"
keep <- c("ID_RACHUNKU", "KATEGORIA_SPRZEDAŻY")
R1 <- Restaurant[keep]
head(R1)
## # A tibble: 6 × 2
## ID_RACHUNKU KATEGORIA_SPRZEDAŻY
## <dbl> <chr>
## 1 296542 Lunch
## 2 296550 Cold drinks
## 3 296550 Cold drinks
## 4 296550 Cold drinks
## 5 296550 Cold drinks
## 6 296550 Cold drinks
As we can see, the restaurant serves 33 different categories of food and beverages. The two retained columns are the receipt ID (ID_RACHUNKU) and the sales category (KATEGORIA_SPRZEDAŻY).
In the next step the data is written to a CSV file so that it can be read back as separate transactions.
# Use one path for writing and reading so both calls point at the same file
trans_path <- "/Users/konradwronski/Downloads/Transactions.csv"
write.csv(R1, trans_path, row.names = FALSE)
R1_trans <- read.transactions(trans_path, format = "single", sep = ",", cols = c("ID_RACHUNKU", "KATEGORIA_SPRZEDAŻY"), header = TRUE)
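With format = "single", every row of the file is one (receipt, category) pair, and rows that share a receipt ID end up in the same transaction. Because a transactions object is a binary incidence matrix, a category counts once per receipt however many times it was ordered. The grouping can be sketched in base R on the six rows printed by head(R1). The column names receipt_id and category are ASCII stand-ins chosen for this illustration, not the names used in the data set.

```r
# Toy copy of the six rows printed by head(R1); the column names are
# illustrative stand-ins for ID_RACHUNKU and KATEGORIA_SPRZEDAŻY.
demo <- data.frame(
  receipt_id = c(296542, 296550, 296550, 296550, 296550, 296550),
  category   = c("Lunch", rep("Cold drinks", 5)),
  stringsAsFactors = FALSE
)

# Group the categories by receipt, then keep each category once per receipt.
baskets <- lapply(split(demo$category, demo$receipt_id), unique)

# Number of distinct categories on each receipt.
basket_sizes <- lengths(baskets)
print(baskets)
```

The five cold drinks on receipt 296550 collapse into a single item, so the analysis measures which categories co-occur rather than the quantities ordered.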
summary(R1_trans)
## transactions as itemMatrix in sparse format with
## 12346 rows (elements/itemsets/transactions) and
## 33 columns (items) and a density of 0.08531533
##
## most frequent items:
## Hot drinks Breakfast Burger Pizza Beer (Other)
## 4139 3039 2878 2617 2567 19519
##
## element (itemset/transaction) length distribution:
## sizes
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 17
## 2510 4346 2293 1475 795 400 227 135 64 44 18 16 7 8 6 1
## 18
## 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 2.000 2.815 4.000 18.000
##
## includes extended item information - examples:
## labels
## 1 Additions
## 2 Aperitif
## 3 Beer
##
## includes extended transaction information - examples:
## transactionID
## 1 296542
## 2 296550
## 3 296560
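The density and the mean transaction length in this summary follow directly from the item counts. A short check in base R, using only figures copied from the output above, might look like this:

```r
# Figures copied from the summary(R1_trans) output.
n_transactions <- 12346
n_items        <- 33
item_counts    <- c(4139, 3039, 2878, 2617, 2567, 19519)  # top five + "(Other)"

# Total number of filled cells in the receipt-by-category matrix.
total_entries <- sum(item_counts)

# Average number of distinct categories per receipt.
mean_basket <- total_entries / n_transactions

# Share of filled cells among all possible (receipt, category) cells.
density <- total_entries / (n_transactions * n_items)

print(round(c(mean_basket = mean_basket, density = density), 4))
```

Both values agree with the summary: an average of about 2.8 categories per receipt, which corresponds to roughly 8.5% of the 33 possible categories.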
#frequency of specific products
itemFrequencyPlot(R1_trans, topN=8, type="relative", main="Item Frequency")
According to the frequency plot, the most popular category is Hot drinks, which is quite surprising for a regular restaurant rather than a café. The rest of the top five are Breakfast, Burger, Pizza and Beer.
The Eclat algorithm builds frequent item sets for further analysis. The researcher specifies a minimum support threshold that every item set must reach; here supp = 0.05. Support is the share of transactions in which a given item set appears. A maximum length can also be specified to exclude larger item sets; here maxlen is set to 10.
#creating sets
sets_eclat <- eclat(R1_trans, parameter = list(supp=0.05, maxlen=10))
## Eclat
##
## parameter specification:
## tidLists support minlen maxlen target ext
## FALSE 0.05 1 10 frequent itemsets TRUE
##
## algorithmic control:
## sparse sort verbose
## 7 -2 TRUE
##
## Absolute minimum support count: 617
##
## create itemset ...
## set transactions ...[33 item(s), 12346 transaction(s)] done [0.00s].
## sorting and recoding items ... [17 item(s)] done [0.00s].
## creating bit matrix ... [17 row(s), 12346 column(s)] done [0.00s].
## writing ... [31 set(s)] done [0.00s].
## Creating S4 object ... done [0.00s].
inspect(sets_eclat)
## items support count
## [1] {Breakfast, Hot drinks} 0.18921108 2336
## [2] {Burger, Lemonade} 0.06706626 828
## [3] {Burger, Pasta} 0.05200065 642
## [4] {Burger, Seasonal} 0.06034343 745
## [5] {Cold drinks, Seasonal} 0.05621254 694
## [6] {Beer, Seasonal} 0.05345861 660
## [7] {Beer, Burger} 0.09752146 1204
## [8] {Beer, Cold drinks} 0.05078568 627
## [9] {Beer, Pizza} 0.07022517 867
## [10] {Burger, Pizza} 0.06115341 755
## [11] {Cold drinks, Pizza} 0.06366434 786
## [12] {Cold drinks, Hot drinks} 0.05329661 658
## [13] {Burger, Cold drinks} 0.08180787 1010
## [14] {Burger, Hot drinks} 0.05305362 655
## [15] {Hot drinks} 0.33525028 4139
## [16] {Burger} 0.23311194 2878
## [17] {Cold drinks} 0.20184675 2492
## [18] {Pizza} 0.21197149 2617
## [19] {Beer} 0.20792159 2567
## [20] {Seasonal} 0.15997084 1975
## [21] {Pasta} 0.13591447 1678
## [22] {Lemonade} 0.13186457 1628
## [23] {Starters} 0.10772720 1330
## [24] {Breakfast} 0.24615260 3039
## [25] {Coctail} 0.11193909 1382
## [26] {Salad} 0.07621902 941
## [27] {Desert} 0.06617528 817
## [28] {Additions} 0.06747125 833
## [29] {Lunch} 0.16240078 2005
## [30] {White Wine} 0.05451158 673
## [31] {Juice} 0.05046169 623
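The support column in this listing is simply the receipt count divided by the total number of receipts. A minimal sketch in base R, using counts copied from the listing plus one made-up count to show a set that would be dropped:

```r
n_transactions <- 12346   # receipts in R1_trans
min_support    <- 0.05    # threshold passed to eclat()

# Receipt counts copied from the inspect(sets_eclat) listing.
counts <- c("{Breakfast, Hot drinks}" = 2336,
            "{Beer, Burger}"          = 1204,
            "{Beer, Cold drinks}"     = 627,
            "{Juice}"                 = 623)

# Support = share of receipts that contain the whole item set.
support <- counts / n_transactions
print(round(support, 4))

# An item set is kept only when its support reaches the threshold.
frequent <- support >= min_support

# Hypothetical count (not from the data): a set seen on 600 receipts
# would fall below the 5% threshold and would not be listed.
hypothetical_support <- 600 / n_transactions
print(hypothetical_support >= min_support)
```

All four real item sets pass the threshold, while the hypothetical one at about 4.9% does not.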
With the item sets created, the next step is to derive rules from them with the ruleInduction() function. The confidence threshold was set quite low (0.4), because associations between products in restaurant data are much weaker than, for example, in supermarket baskets. Confidence is the share of transactions containing the left-hand side of a rule that also contain its right-hand side.
#getting rules
rules_eclat <- ruleInduction(sets_eclat, R1_trans, confidence = 0.4)
rules_eclat
## set of 6 rules
inspect(rules_eclat)
## lhs rhs support confidence lift itemset
## [1] {Hot drinks} => {Breakfast} 0.18921108 0.5643875 2.292836 1
## [2] {Breakfast} => {Hot drinks} 0.18921108 0.7686739 2.292836 1
## [3] {Lemonade} => {Burger} 0.06706626 0.5085995 2.181782 2
## [4] {Burger} => {Beer} 0.09752146 0.4183461 2.012038 7
## [5] {Beer} => {Burger} 0.09752146 0.4690300 2.012038 7
## [6] {Cold drinks} => {Burger} 0.08180787 0.4052970 1.738637 13
We obtained 6 rules, which are read as follows: if a customer buys the item on the left-hand side (lhs), then they also buy the item on the right-hand side (rhs). Support tells us how often the items appear together on the same receipt, confidence is the probability that the rhs item is on a receipt that already contains the lhs item, and lift is how many times more often the combination appears than would be expected if the items were bought independently.
In this case we can clearly see that when someone orders breakfast, there is a 77% probability that they also order a hot drink. This likely reflects how strongly the promotional offer in this restaurant (a lower price for a breakfast plus hot drink set) influences customer behaviour. For the next pair on the list, Lemonade and Burger appear together on about 7% of receipts. The probability that a receipt with Lemonade also contains a Burger is approximately 51%. The lift shows that these products appear together 2.18 times more often than would be expected if they were independent.
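The three statistics for the Breakfast and Hot drinks pair can be reproduced by hand from the receipt counts in the Eclat listing. The sketch below uses base R only. It also shows why the two directions of the rule share the same support and lift but differ in confidence.

```r
n            <- 12346   # all receipts
n_breakfast  <- 3039    # receipts with Breakfast
n_hot_drinks <- 4139    # receipts with Hot drinks
n_both       <- 2336    # receipts with both categories

# Support: share of receipts containing both categories.
support_both <- n_both / n

# Confidence depends on the direction of the rule.
conf_breakfast_to_hot <- n_both / n_breakfast    # {Breakfast} => {Hot drinks}
conf_hot_to_breakfast <- n_both / n_hot_drinks   # {Hot drinks} => {Breakfast}

# Lift: observed joint share divided by the share expected if the two
# categories were bought independently. It is the same in both directions.
lift <- support_both / ((n_breakfast / n) * (n_hot_drinks / n))

print(round(c(support = support_both,
              conf_breakfast_to_hot = conf_breakfast_to_hot,
              conf_hot_to_breakfast = conf_hot_to_breakfast,
              lift = lift), 4))
```

Confidence is higher from Breakfast to Hot drinks because Breakfast is the rarer category, so the same 2336 shared receipts make up a larger part of it.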
The Apriori algorithm also creates item sets and derives rules from them, here in a single call. As with Eclat, a minimum support must be specified, together with a minimum confidence level for the rules. With the same thresholds it returns the same six rules.
rules_apriori<-apriori(R1_trans, parameter=list(supp=0.05, conf=0.4))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.4 0.1 1 none FALSE TRUE 5 0.05 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 617
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[33 item(s), 12346 transaction(s)] done [0.00s].
## sorting and recoding items ... [17 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [6 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
rules_apriori_conf<-sort(rules_apriori, by="confidence", decreasing=TRUE)
inspect(head(rules_apriori_conf))
## lhs rhs support confidence coverage lift
## [1] {Breakfast} => {Hot drinks} 0.18921108 0.7686739 0.2461526 2.292836
## [2] {Hot drinks} => {Breakfast} 0.18921108 0.5643875 0.3352503 2.292836
## [3] {Lemonade} => {Burger} 0.06706626 0.5085995 0.1318646 2.181782
## [4] {Beer} => {Burger} 0.09752146 0.4690300 0.2079216 2.012038
## [5] {Burger} => {Beer} 0.09752146 0.4183461 0.2331119 2.012038
## [6] {Cold drinks} => {Burger} 0.08180787 0.4052970 0.2018468 1.738637
## count
## [1] 2336
## [2] 2336
## [3] 828
## [4] 1204
## [5] 1204
## [6] 1010
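Apriori relies on the fact that an item set can only be frequent if all of its subsets are frequent, so candidate pairs need to be built only from categories that are frequent on their own. The sketch below gives a rough idea of how much this narrows the search at the pair level, using the figures from the log above (33 categories, 17 of which pass the support threshold). It illustrates the principle only and is not the internal implementation of the arules package.

```r
n_categories <- 33   # all categories in R1_trans
n_frequent   <- 17   # categories that pass supp = 0.05 (from the log)

# Number of category pairs to count with and without the pruning step.
pairs_unpruned <- choose(n_categories, 2)
pairs_pruned   <- choose(n_frequent, 2)
print(c(unpruned = pairs_unpruned, pruned = pairs_pruned))

# Candidate pairs are formed only from frequent single categories;
# three of the 17 are used here to keep the output short.
frequent_items <- c("Beer", "Burger", "Pizza")
candidates <- combn(frequent_items, 2, FUN = paste, collapse = ", ")
print(candidates)
```

Only 136 of the 528 possible pairs have to be counted, and the same argument applies again when moving from pairs to triples.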
Pizza is the fourth most frequently bought category in this restaurant, so let's check what else appears on receipts that include pizza.
Pizza <- apriori(data = R1_trans, parameter = list(supp = 0.05, conf = 0.2),
                 appearance = list(default = "lhs", rhs = "Pizza"), control = list(verbose = FALSE))
inspect(Pizza)
## lhs rhs support confidence coverage lift count
## [1] {} => {Pizza} 0.21197149 0.2119715 1.0000000 1.000000 2617
## [2] {Beer} => {Pizza} 0.07022517 0.3377483 0.2079216 1.593367 867
## [3] {Cold drinks} => {Pizza} 0.06366434 0.3154093 0.2018468 1.487980 786
## [4] {Burger} => {Pizza} 0.06115341 0.2623350 0.2331119 1.237595 755
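Receipts containing both beer and pizza make up about 7% of all receipts, and a receipt with beer has about a 34% probability of also containing pizza. The first rule, with an empty left-hand side, is only the baseline: pizza appears on about 21% of all receipts.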
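The Pizza rules can be compared side by side by dividing each conditional share of pizza receipts by the overall share. A minimal base R sketch, with all counts copied from the outputs above:

```r
n       <- 12346   # all receipts
n_pizza <- 2617    # receipts with Pizza

# Receipts with each left-hand-side category, and with that category plus Pizza.
lhs_counts  <- c(Beer = 2567, "Cold drinks" = 2492, Burger = 2878)
both_counts <- c(Beer = 867,  "Cold drinks" = 786,  Burger = 755)

# Baseline: share of all receipts with Pizza (the rule with an empty lhs).
baseline <- n_pizza / n

# Confidence: share of receipts with the lhs category that also have Pizza.
confidence <- both_counts / lhs_counts

# Lift: how much the lhs category raises the Pizza share over the baseline.
lift <- confidence / baseline

print(round(cbind(confidence, lift), 3))
```

Beer raises the share of pizza receipts the most, while Burger adds comparatively little over the baseline.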
This graph shows antecedents (lhs) on the x axis and consequents (rhs) on the y axis. The colour shows the value of the lift statistic. As we can see, two rules stand out with the highest lift (about 2.29), and both involve Breakfast and Hot drinks.
plot(rules_apriori, method="matrix", measure="lift")
## Itemsets in Antecedent (LHS)
## [1] "{Breakfast}" "{Hot drinks}" "{Lemonade}" "{Beer}"
## [5] "{Burger}" "{Cold drinks}"
## Itemsets in Consequent (RHS)
## [1] "{Burger}" "{Beer}" "{Breakfast}" "{Hot drinks}"
The graphs below combine all three statistics: support, confidence and lift.
plot(rules_apriori)
plot(rules_apriori, measure=c("support","lift"), shading="confidence")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
As shown on the graph, rules with higher lift tend to have higher confidence. All six rules have support below 0.2.
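One of the statistics for assessing how closely items are related is the Jaccard index, which measures how often two products appear on the same receipt relative to how often either of them appears. The dissimilarity() function reports the Jaccard distance (one minus the index), so values close to 1 mean that two categories rarely appear together.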
dissim_1 <- R1_trans[,itemFrequency(R1_trans)>0.2]
d.jac.i<-dissimilarity(dissim_1, which="items")
round(d.jac.i,3)
##              Beer Breakfast Burger Cold drinks Hot drinks
## Breakfast   0.993
## Burger      0.716     0.999
## Cold drinks 0.859     0.979  0.768
## Hot drinks  0.933     0.518  0.897       0.890
## Pizza       0.799     0.998  0.841       0.818      0.913
As the results show, Breakfast rarely shares a receipt with Burger (0.999), Pizza (0.998) or Beer (0.993), while Breakfast and Hot drinks are the most closely related pair (0.518).
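Individual entries of the dissimilarity matrix can be checked by hand from the receipt counts reported earlier. The sketch below uses base R; jaccard_dissimilarity() is a helper written for this illustration, not a function from arules.

```r
# Illustrative helper (not part of arules): Jaccard distance from counts.
# n_a, n_b : receipts containing category A, category B
# n_both   : receipts containing both
jaccard_dissimilarity <- function(n_a, n_b, n_both) {
  # Receipts with A or B = n_a + n_b - n_both (shared receipts counted once).
  1 - n_both / (n_a + n_b - n_both)
}

# Counts copied from the item set listings above.
d_breakfast_hot <- jaccard_dissimilarity(3039, 4139, 2336)  # Breakfast, Hot drinks
d_beer_burger   <- jaccard_dissimilarity(2567, 2878, 1204)  # Beer, Burger
d_beer_pizza    <- jaccard_dissimilarity(2567, 2617, 867)   # Beer, Pizza

print(round(c(breakfast_hot = d_breakfast_hot,
              beer_burger   = d_beer_burger,
              beer_pizza    = d_beer_pizza), 3))
```

The results match the corresponding cells of the matrix (0.518, 0.716 and 0.799), which confirms that smaller values indicate categories that share more receipts.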
There is a lot to discover in restaurant data, and I would recommend that any larger restaurant adopt such algorithms in its business. They help in shaping the offer and making it more flexible. One of the main concerns of restaurant marketing teams is deciding which products to promote, and how, to make a promotion as effective as possible. Studies like this one help to achieve such goals.