Association Rules for Recipes

Introduction

The main aim of this report is to analyze relationships between ingredients in various recipes. Association rule mining is a data mining method for identifying associations and correlations between attribute values. Association rules are used in data science to discover correlations and co-occurrences in data sets. They are best suited for explaining patterns in data from seemingly unrelated information repositories, such as relational and transactional databases. The process of finding such rules is usually referred to as “association rule mining.”

[https://www.sciencedirect.com/topics/computer-science/association-rules]

Dataset

I decided to explore almost 50,000 recipes in order to investigate association rules between ingredients. The data comes from the Cookteau Database, which consists of 45 772 recipes and 1033 ingredients. First, I transformed the database downloaded from the website so that each row contains all the ingredients needed for a given recipe, with each ingredient in a separate column.

[https://cookteau.com/en/home-2/]

Reading data

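library(arules)      # read.transactions(), apriori(), inspect()
library(arulesViz)   # plot() methods for rules
library(magrittr)    # %>% pipe
library(kableExtra)  # kbl(), kable_styling()
trans <- read.csv("recipes.csv", sep=";", header=TRUE)  # already a data.frame
trans2 <- trans[sample(nrow(trans), 10), ]  # 10 random recipes for a preview
trans2 %>%
  kbl() %>%
  kable_styling()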

	Item1	Item2	Item3	Item4	Item5	Item6	Item7	Item8	Item9	Item10	Item11	Item11.1	Item12	Item13	Item14
13272	chicken	cinnamon	cornstarch	cumin	garlic	olive	onion	pepper bell	pineapple	salt	tomato	vegetable oil	water
1548	buttermilk	cayenne	chickpea	egg	ginger garlic paste	pomegranate	potato	spinach	sunflower	turmeric
40338	egg	eggplant	garlic	nutmeg	olive	pepper	pepper	ricotta cheese	salt	salt	tomato	water
5994	bay leaf	celery	chicken	chicken	garlic	onion	oregano	pepper bell	rice	salt	sausage	sausage	shrimp	tomato	turmeric
36563	butter	chive	horseradish	olive	steak
4910	bamboo shoot	cake	cake	carrot	chicken	daikon	salt	shiitake	shrimp	soy sauce	spinach	water
15663	asparagu	basil	butter	cheese parmesan	garlic	olive	oregano	pepper	pepper	pepper bell	shallot	shrimp
15696	chicken	garlic	ginger	onion	soy sauce	sugar
42938	garlic	peanut oil	pepper	salt	string bean	turkey
18504	celery	celery	egg	garlic	mayonnaise	mustard	onion	pepper	potato	relish

nrow(trans)

[1] 45749

ncol(trans)

[1] 63

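# Note: with skip=0 the CSV header line is probably read as a transaction too (45750 rows here vs. 45749 above); skip=1 would drop it.
trans1 <- read.transactions("recipes.csv", format="basket", sep=";", skip=0)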

Summary Table

The most frequent items are listed below. As expected, the most common ingredients are seasonings: salt and pepper. Onion, garlic and olive are also very popular. The longest recipe contains 60 ingredients, while only 278 out of 45750 recipes have a single ingredient. The mean and median are similar, at about 9 ingredients per recipe (mean 9.07, median 9). Looking at the quartiles, 75% of recipes have at most 11 ingredients. The density is 0.009802723, which is the proportion of non-zero cells in the transaction-by-item matrix.

summary(trans1)

## transactions as itemMatrix in sparse format with
##  45750 rows (elements/itemsets/transactions) and
##  925 columns (items) and a density of 0.009802723 
## 
## most frequent items:
##    salt  pepper  garlic   onion   olive (Other) 
##   18745   16065   15995   13877   13476  336681 
## 
## element (itemset/transaction) length distribution:
## sizes
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
##  278  814 1565 2439 3438 4299 4699 4893 4657 4060 3513 2808 2225 1618 1276  947 
##   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32 
##  656  501  337  213  177  102   92   45   26   20    8   10   12    3    6    6 
##   33   34   37   38   60 
##    3    1    1    1    1 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   6.000   9.000   9.068  11.000  60.000 
## 
## includes extended item information - examples:
##        labels
## 1     abalone
## 2       adobo
## 3 adobo sauce

Inspecting Data

The R printout below shows ten recipes (transactions 2 to 11) and the ingredients they contain. Most recipes contain many ingredients, but the number of items varies. For example, the first basket shown has 4 ingredients: capsicum, pepper bell, soy sauce, sunflower, while the seventh has 8 ingredients: asafoetida, cayenne, chickpea, fennel, fenugreek, mustard oil, nigella seed, turmeric.

inspect(trans1[2:11])

##      items                 
## [1]  {capsicum,            
##       pepper bell,         
##       soy sauce,           
##       sunflower}           
## [2]  {buttermilk,          
##       cumin,               
##       fenugreek,           
##       ginger garlic paste, 
##       mustard oil,         
##       nigella seed,        
##       pepper bell,         
##       potato,              
##       sunflower}           
## [3]  {asafoetida,          
##       cayenne,             
##       fenugreek,           
##       ginger garlic paste, 
##       mustard oil,         
##       sesame,              
##       sunflower,           
##       turmeric}            
## [4]  {butter,              
##       cardamom,            
##       cashew,              
##       cayenne,             
##       cinnamon,            
##       clove,               
##       coriander,           
##       corn grit,           
##       cumin,               
##       potato,              
##       raisin,              
##       sunflower,           
##       tomato,              
##       turmeric}            
## [5]  {curry leaf,          
##       lemon,               
##       sunflower}           
## [6]  {coriander,           
##       cumin,               
##       mint,                
##       mustard oil,         
##       pepper bell,         
##       potato,              
##       sunflower,           
##       turmeric}            
## [7]  {asafoetida,          
##       cayenne,             
##       chickpea,            
##       fennel,              
##       fenugreek,           
##       mustard oil,         
##       nigella seed,        
##       turmeric}            
## [8]  {anise,               
##       asafoetida,          
##       cayenne,             
##       fenugreek,           
##       mango,               
##       mustard oil,         
##       nigella seed,        
##       sunflower,           
##       turmeric}            
## [9]  {cayenne,             
##       coriander,           
##       fennel,              
##       mango,               
##       nigella seed,        
##       sunflower,           
##       turmeric}            
## [10] {buttermilk,          
##       mint}

# itemFrequencyPlot(trans1, support = 0.1)

We can check which products are most common in the recipes dataset. In addition to the ingredients already mentioned, sugar, butter, egg, water, flour, tomato, chicken, cream, parsley and vegetable oil are also often present. It is worth noting that all of these items have a support of at least 10%.

sugar
butter
egg
water
flour
tomato
chicken
cream
parsley
vegetable oil

itemFrequencyPlot(trans1, topN=15, type="absolute", main="Item Frequency")

itemFrequencyPlot(trans1, topN=15, type="relative", main="Item Frequency")

The Apriori algorithm

The Apriori algorithm proceeds by identifying the frequent individual items in the database and extending them to larger and larger item sets as long as those item sets appear sufficiently often in the database. [https://en.wikipedia.org/wiki/Apriori_algorithm]
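The level-wise idea can be sketched in a few lines of base R. This is a simplified illustration on made-up transactions, not the arules implementation: candidates are built by extending each frequent itemset with another frequent item, and the additional subset-pruning step of the full algorithm is omitted. All names and data below are hypothetical.

```r
# Toy transactions (hypothetical, not taken from the recipe data).
transactions <- list(
  c("garlic", "onion", "salt"),
  c("garlic", "salt"),
  c("garlic", "onion"),
  c("flour", "salt"),
  c("garlic", "onion", "salt", "flour")
)
min_count <- 3  # an itemset is frequent if it occurs in at least 3 of 5 transactions

# Number of transactions that contain every item of `itemset`.
count_itemset <- function(itemset) {
  sum(vapply(transactions, function(tr) all(itemset %in% tr), logical(1)))
}
is_frequent <- function(itemset) count_itemset(itemset) >= min_count

# Level 1: frequent single items.
items <- sort(unique(unlist(transactions)))
frequent <- Filter(is_frequent, as.list(items))
all_frequent <- frequent

# Level k + 1: extend only the itemsets that were frequent at level k.
while (length(frequent) > 0) {
  survivors <- sort(unique(unlist(frequent)))
  candidates <- list()
  for (set in frequent) {
    # Append only items that sort after the last one, so no set is built twice.
    for (item in survivors[survivors > max(set)]) {
      candidates[[length(candidates) + 1]] <- c(set, item)
    }
  }
  frequent <- Filter(is_frequent, candidates)
  all_frequent <- c(all_frequent, frequent)
}

for (s in all_frequent) {
  cat("{", paste(s, collapse = ", "), "} count =", count_itemset(s), "\n")
}
```

On this toy data the loop stops at level 3, because {garlic, onion, salt} occurs in only two transactions and is therefore not extended further.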

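support - the number of times, or percentage of transactions, in which the products co-occur
confidence - the share of transactions containing the left-hand side that also contain the right-hand side, i.e. the conditional probability of the right-hand side given the left-hand side
lift - the strength of association: the confidence of the rule divided by the support of the right-hand side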

Support

The first step is to compute the support of each individual item and then to decide on the support threshold. The support of an association rule is the percentage of groups (here, recipes) that contain all of the items listed in the rule, that is, both the rule body and the rule head. It is calculated as the number of groups containing all the items that appear in the rule divided by the total number of groups considered.

[https://www.ibm.com/docs/en/db2/9.7?topic=associations-support-in-association-rule]
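As a small worked example, the support of the pair olive and garlic can be recomputed by hand. The sketch assumes the counts reported elsewhere in this report (45750 transactions, 8142 recipes containing both items); the variable names are illustrative.

```r
# Figures copied from the output in this report.
n_transactions <- 45750   # number of transactions in trans1
count_rule     <- 8142    # recipes containing both olive and garlic
min_support    <- 0.10    # support threshold used in this report

support_rule <- count_rule / n_transactions   # share of recipes with both items
min_count    <- min_support * n_transactions  # threshold as an absolute count

print(round(support_rule, 7))
print(min_count)
```

The second value corresponds to the "Absolute minimum support count" of 4575 that apriori() prints in its log.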

Confidence

An additional measure is confidence. The confidence of a rule is the percentage of cases containing the left-hand side in which the rule holds. A confidence of 100% means that the association always occurs; 50%, for example, means that the rule holds only half of the time.

[https://towardsdatascience.com/the-apriori-algorithm-5da3db9aea95]
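The confidence of {olive} => {garlic} can be checked in the same way. This sketch assumes the item count for olive from summary(trans1) and the rule count reported for this pair; the names are my own.

```r
# Counts copied from the output in this report.
count_lhs  <- 13476  # recipes containing olive
count_both <- 8142   # recipes containing olive and garlic

# Confidence: share of olive recipes that also contain garlic.
confidence <- count_both / count_lhs
print(round(confidence, 7))
```

The result matches the confidence of about 0.604 listed for this rule in the tables.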

Lift

Once the rules have been obtained, the last step is to compute the lift of each rule. Lift is a performance metric that indicates the strength of the association between the products in the rule. It compares the confidence of the rule with the overall frequency of the right-hand side in the dataset: a lift above 1 means that the items occur together more often than would be expected if they were independent.

[https://towardsdatascience.com/the-apriori-algorithm-5da3db9aea95]
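To make the definition concrete, the lift of {olive} => {garlic} can be derived from three counts. This is a sketch under the assumption that the item counts from summary(trans1) and the rule count reported in this document are used; the variable names are illustrative.

```r
# Counts copied from the output in this report.
n_transactions <- 45750
count_olive    <- 13476
count_garlic   <- 15995
count_both     <- 8142

confidence     <- count_both / count_olive       # P(garlic | olive)
support_garlic <- count_garlic / n_transactions  # P(garlic)

# Lift: how much more often garlic appears with olive than overall.
lift <- confidence / support_garlic
print(round(lift, 6))
```

A value of about 1.73 means that garlic appears roughly 1.7 times more often in recipes with olive than in recipes overall.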

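The cited source describes the Apriori implementation in Weka. It starts with a minimum support of 100% of the data items and decreases this in steps of 5% until there are at least 10 rules with the required minimum confidence of 0.9 or until the support has reached a lower bound of 10%, whichever occurs first. (These default values can be changed.) The apriori() function from the arules package used in this report applies fixed thresholds instead (by default a minimum support of 0.1 and a minimum confidence of 0.8), which are set explicitly below.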

[https://www.sciencedirect.com/topics/computer-science/minimum-confidence]

Support and Confidence threshold

Support = 0.1, Confidence = 0.60

Rules have to refer to at least two products and meet the minimum support and confidence levels. First, the Apriori algorithm was run with minimum support = 0.1 and minimum confidence = 0.6. For these values, only 8 rules are detected. I analyze this case as well as the association rules for confidence = 0.55.

rules1b<-apriori(trans1, parameter=list(supp=0.10, conf=0.60))

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.6    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 4575 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[925 item(s), 45750 transaction(s)] done [0.07s].
## sorting and recoding items ... [17 item(s)] done [0.00s].
## creating transaction tree ... done [0.03s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [8 rule(s)] done [0.00s].
## creating S4 object  ... done [0.01s].

plot(rules1b, measure=c("support","lift"), shading="confidence", main="Recipe ingredient rules")

plot(rules1b)

plot(rules1b, method="grouped")

plot(rules1b, method="graph", measure="support", shading="lift", main="Graph for 8 rules")

## Available control parameters (with default values):
## layout    =  stress
## circular  =  FALSE
## ggraphdots    =  NULL
## edges     =  <environment>
## nodes     =  <environment>
## nodetext  =  <environment>
## colors    =  c("#EE0000FF", "#EEEEEEFF")
## engine    =  ggplot2
## max   =  100
## verbose   =  FALSE

Analysis of rules by support, confidence and lift levels

The rules can also be sorted by support, confidence and lift. The highest support is almost 18%: the pair {olive -> garlic} occurs in 17.8% of all recipes (8142 times). The highest confidence is 67% for {flour -> salt}: if a recipe uses flour, it also uses salt with a probability of 67%. The second highest confidence, also about 67%, indicates that a recipe containing both garlic and salt is likely to contain pepper as well. Lastly, the highest lift, 2.22, is recorded for the rule {flour -> butter}. Comparing the second and fourth rules in the lift ranking, garlic is more likely to be used together with onion and pepper than with olive and pepper.

rules.by.supp<-sort(rules1b, by="support", decreasing=TRUE) 
inspect(rules.by.supp)

##     lhs                rhs      support   confidence coverage  lift     count
## [1] {olive}         => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [2] {flour}         => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [3] {flour}         => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [4] {garlic, salt}  => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [5] {olive, pepper} => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [6] {garlic, olive} => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [7] {tomato}        => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [8] {onion, pepper} => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775

rules.by.conf<-sort(rules1b, by="confidence", decreasing=TRUE) 
inspect(rules.by.conf)

##     lhs                rhs      support   confidence coverage  lift     count
## [1] {flour}         => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [2] {garlic, salt}  => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [3] {onion, pepper} => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [4] {olive, pepper} => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [5] {tomato}        => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [6] {garlic, olive} => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [7] {flour}         => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [8] {olive}         => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142

rules.by.lift<-sort(rules1b, by="lift", decreasing=TRUE) 
inspect(rules.by.lift)

##     lhs                rhs      support   confidence coverage  lift     count
## [1] {flour}         => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [2] {onion, pepper} => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [3] {garlic, salt}  => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [4] {olive, pepper} => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [5] {tomato}        => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [6] {garlic, olive} => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [7] {olive}         => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [8] {flour}         => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811

plot(rules1b, method="paracoord", control=list(reorder=TRUE))

Support = 0.1, Confidence = 0.55

For support = 0.1 and confidence = 0.55, 19 rules are detected, which I consider a satisfactory result.

rules1a<-apriori(trans1, parameter=list(supp=0.10, conf=0.55))

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##        0.55    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 4575 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[925 item(s), 45750 transaction(s)] done [0.09s].
## sorting and recoding items ... [17 item(s)] done [0.01s].
## creating transaction tree ... done [0.03s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [19 rule(s)] done [0.00s].
## creating S4 object  ... done [0.01s].

plot_rules1a <- plot(rules1a, measure=c("support","lift"), shading="confidence", main="Recipe ingredient rules")
plot(rules1a)

plot(rules1a, method="grouped")

plot(rules1a, method="graph", measure="support", shading="lift", main="Graph for 19 rules")

## Available control parameters (with default values):
## layout    =  stress
## circular  =  FALSE
## ggraphdots    =  NULL
## edges     =  <environment>
## nodes     =  <environment>
## nodetext  =  <environment>
## colors    =  c("#EE0000FF", "#EEEEEEFF")
## engine    =  ggplot2
## max   =  100
## verbose   =  FALSE

Analysis of rules by support, confidence and lift levels

The rules can again be sorted by support, confidence and lift. The highest support is 19.78%: the pair {pepper -> garlic} occurs in 19.78% of all recipes (9049 times). The highest confidence is 67% for {flour -> salt}: if a recipe uses flour, it also uses salt with a probability of 67%. The second highest confidence, also about 67%, indicates that a recipe containing both garlic and salt is likely to contain pepper as well. Lastly, the highest lift, 2.56, is recorded for the rule {flour -> egg}. Comparing the third and seventh rules in the lift ranking, garlic is more likely to be used together with onion and pepper than with olive and pepper.

rules.by.supp<-sort(rules1a, by="support", decreasing=TRUE) 
inspect(rules.by.supp)

##      lhs                 rhs      support   confidence coverage  lift     count
## [1]  {pepper}         => {garlic} 0.1977923 0.5632742  0.3511475 1.611116 9049 
## [2]  {garlic}         => {pepper} 0.1977923 0.5657393  0.3496175 1.611116 9049 
## [3]  {olive}          => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [4]  {onion}          => {garlic} 0.1773552 0.5847085  0.3033224 1.672424 8114 
## [5]  {olive}          => {pepper} 0.1673880 0.5682695  0.2945574 1.618321 7658 
## [6]  {egg}            => {salt}   0.1307978 0.5688754  0.2299235 1.388426 5984 
## [7]  {flour}          => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [8]  {flour}          => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [9]  {flour}          => {egg}    0.1113880 0.5880452  0.1894208 2.557569 5096 
## [10] {garlic, pepper} => {salt}   0.1108415 0.5603934  0.1977923 1.367725 5071 
## [11] {pepper, salt}   => {garlic} 0.1108415 0.5852279  0.1893989 1.673909 5071 
## [12] {garlic, salt}   => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [13] {olive, pepper}  => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [14] {garlic, olive}  => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [15] {garlic, pepper} => {olive}  0.1094645 0.5534313  0.1977923 1.878857 5008 
## [16] {flour}          => {sugar}  0.1060109 0.5596584  0.1894208 1.906506 4850 
## [17] {tomato}         => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [18] {onion, pepper}  => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [19] {garlic, onion}  => {pepper} 0.1043716 0.5884890  0.1773552 1.675902 4775

rules.by.conf<-sort(rules1a, by="confidence", decreasing=TRUE) 
inspect(rules.by.conf)

##      lhs                 rhs      support   confidence coverage  lift     count
## [1]  {flour}          => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [2]  {garlic, salt}   => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [3]  {onion, pepper}  => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [4]  {olive, pepper}  => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [5]  {tomato}         => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [6]  {garlic, olive}  => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [7]  {flour}          => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [8]  {olive}          => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [9]  {garlic, onion}  => {pepper} 0.1043716 0.5884890  0.1773552 1.675902 4775 
## [10] {flour}          => {egg}    0.1113880 0.5880452  0.1894208 2.557569 5096 
## [11] {pepper, salt}   => {garlic} 0.1108415 0.5852279  0.1893989 1.673909 5071 
## [12] {onion}          => {garlic} 0.1773552 0.5847085  0.3033224 1.672424 8114 
## [13] {egg}            => {salt}   0.1307978 0.5688754  0.2299235 1.388426 5984 
## [14] {olive}          => {pepper} 0.1673880 0.5682695  0.2945574 1.618321 7658 
## [15] {garlic}         => {pepper} 0.1977923 0.5657393  0.3496175 1.611116 9049 
## [16] {pepper}         => {garlic} 0.1977923 0.5632742  0.3511475 1.611116 9049 
## [17] {garlic, pepper} => {salt}   0.1108415 0.5603934  0.1977923 1.367725 5071 
## [18] {flour}          => {sugar}  0.1060109 0.5596584  0.1894208 1.906506 4850 
## [19] {garlic, pepper} => {olive}  0.1094645 0.5534313  0.1977923 1.878857 5008

rules.by.lift<-sort(rules1a, by="lift", decreasing=TRUE) 
inspect(rules.by.lift)

##      lhs                 rhs      support   confidence coverage  lift     count
## [1]  {flour}          => {egg}    0.1113880 0.5880452  0.1894208 2.557569 5096 
## [2]  {flour}          => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [3]  {onion, pepper}  => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [4]  {garlic, salt}   => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [5]  {flour}          => {sugar}  0.1060109 0.5596584  0.1894208 1.906506 4850 
## [6]  {garlic, pepper} => {olive}  0.1094645 0.5534313  0.1977923 1.878857 5008 
## [7]  {olive, pepper}  => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [8]  {tomato}         => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [9]  {garlic, olive}  => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [10] {olive}          => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [11] {garlic, onion}  => {pepper} 0.1043716 0.5884890  0.1773552 1.675902 4775 
## [12] {pepper, salt}   => {garlic} 0.1108415 0.5852279  0.1893989 1.673909 5071 
## [13] {onion}          => {garlic} 0.1773552 0.5847085  0.3033224 1.672424 8114 
## [14] {flour}          => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [15] {olive}          => {pepper} 0.1673880 0.5682695  0.2945574 1.618321 7658 
## [16] {garlic}         => {pepper} 0.1977923 0.5657393  0.3496175 1.611116 9049 
## [17] {pepper}         => {garlic} 0.1977923 0.5632742  0.3511475 1.611116 9049 
## [18] {egg}            => {salt}   0.1307978 0.5688754  0.2299235 1.388426 5984 
## [19] {garlic, pepper} => {salt}   0.1108415 0.5603934  0.1977923 1.367725 5071

plot(rules1a, method="paracoord", control=list(reorder=TRUE))

Support = 0.1, Confidence = 0.50

In order to find the most interesting association rules, I decided to analyze a larger set of rules at a lower confidence level of 0.50.

rules1c<-apriori(trans1, parameter=list(supp=0.10, conf=0.50))

## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.5    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 4575 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[925 item(s), 45750 transaction(s)] done [0.07s].
## sorting and recoding items ... [17 item(s)] done [0.00s].
## creating transaction tree ... done [0.03s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [28 rule(s)] done [0.00s].
## creating S4 object  ... done [0.01s].

plot(rules1c, measure=c("support","lift"), shading="confidence", main="Recipe ingredient rules")

plot(rules1c)

plot(rules1c, method="grouped")

plot(rules1c, method="graph", measure="support", shading="lift", main="Graph for 28 rules")

## Available control parameters (with default values):
## layout    =  stress
## circular  =  FALSE
## ggraphdots    =  NULL
## edges     =  <environment>
## nodes     =  <environment>
## nodetext  =  <environment>
## colors    =  c("#EE0000FF", "#EEEEEEFF")
## engine    =  ggplot2
## max   =  100
## verbose   =  FALSE

Analysis of rules by support, confidence and lift levels

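Although the number of rules increases to 28, the highest support, confidence and lift values are the same as in the previous version with confidence = 0.55. The additional rules all have a confidence between 0.50 and 0.55, and none of them exceeds the earlier maxima for support or lift.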
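Because the support threshold is the same in all three runs, the rule sets are nested: lowering the confidence threshold only adds rules. The sketch below illustrates this by counting how many of the 28 confidence values reported for rules1c in this section pass each threshold. The vector is copied by hand from that output and the variable names are illustrative.

```r
# Confidence values of the 28 rules in rules1c, copied from the output.
conf <- c(0.6705516, 0.6697926, 0.6675521, 0.6539566, 0.6330455, 0.6150823,
          0.6137780, 0.6041852, 0.5884890, 0.5880452, 0.5852279, 0.5847085,
          0.5688754, 0.5682695, 0.5657393, 0.5632742, 0.5603934, 0.5596584,
          0.5534313, 0.5486263, 0.5396480, 0.5393713, 0.5276826, 0.5154572,
          0.5095309, 0.5090341, 0.5072835, 0.5030815)

# Number of rules that survive each minimum confidence threshold.
thresholds <- c(0.60, 0.55, 0.50)
n_rules <- vapply(thresholds, function(th) sum(conf >= th), integer(1))
names(n_rules) <- thresholds
print(n_rules)
```

The counts of 8, 19 and 28 agree with the number of rules that apriori() reports for confidence = 0.60, 0.55 and 0.50.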

rules.by.supp<-sort(rules1c, by="support", decreasing=TRUE) 
inspect(rules.by.supp)

##      lhs                 rhs      support   confidence coverage  lift     count
## [1]  {pepper}         => {garlic} 0.1977923 0.5632742  0.3511475 1.611116 9049 
## [2]  {garlic}         => {pepper} 0.1977923 0.5657393  0.3496175 1.611116 9049 
## [3]  {pepper}         => {salt}   0.1893989 0.5393713  0.3511475 1.316417 8665 
## [4]  {olive}          => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [5]  {garlic}         => {olive}  0.1779672 0.5090341  0.3496175 1.728132 8142 
## [6]  {onion}          => {garlic} 0.1773552 0.5847085  0.3033224 1.672424 8114 
## [7]  {garlic}         => {onion}  0.1773552 0.5072835  0.3496175 1.672424 8114 
## [8]  {olive}          => {pepper} 0.1673880 0.5682695  0.2945574 1.618321 7658 
## [9]  {onion}          => {pepper} 0.1563497 0.5154572  0.3033224 1.467922 7153 
## [10] {sugar}          => {salt}   0.1495738 0.5095309  0.2935519 1.243587 6843 
## [11] {butter}         => {salt}   0.1391694 0.5030815  0.2766339 1.227846 6367 
## [12] {egg}            => {salt}   0.1307978 0.5688754  0.2299235 1.388426 5984 
## [13] {flour}          => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [14] {egg}            => {sugar}  0.1261421 0.5486263  0.2299235 1.868924 5771 
## [15] {flour}          => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [16] {water}          => {salt}   0.1139454 0.5396480  0.2111475 1.317092 5213 
## [17] {flour}          => {egg}    0.1113880 0.5880452  0.1894208 2.557569 5096 
## [18] {garlic, pepper} => {salt}   0.1108415 0.5603934  0.1977923 1.367725 5071 
## [19] {pepper, salt}   => {garlic} 0.1108415 0.5852279  0.1893989 1.673909 5071 
## [20] {garlic, salt}   => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [21] {olive, pepper}  => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [22] {garlic, olive}  => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [23] {garlic, pepper} => {olive}  0.1094645 0.5534313  0.1977923 1.878857 5008 
## [24] {flour}          => {sugar}  0.1060109 0.5596584  0.1894208 1.906506 4850 
## [25] {tomato}         => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [26] {onion, pepper}  => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [27] {garlic, onion}  => {pepper} 0.1043716 0.5884890  0.1773552 1.675902 4775 
## [28] {garlic, pepper} => {onion}  0.1043716 0.5276826  0.1977923 1.739676 4775

rules.by.conf<-sort(rules1c, by="confidence", decreasing=TRUE) 
inspect(rules.by.conf)

##      lhs                 rhs      support   confidence coverage  lift     count
## [1]  {flour}          => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [2]  {garlic, salt}   => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [3]  {onion, pepper}  => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [4]  {olive, pepper}  => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [5]  {tomato}         => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [6]  {garlic, olive}  => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [7]  {flour}          => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [8]  {olive}          => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [9]  {garlic, onion}  => {pepper} 0.1043716 0.5884890  0.1773552 1.675902 4775 
## [10] {flour}          => {egg}    0.1113880 0.5880452  0.1894208 2.557569 5096 
## [11] {pepper, salt}   => {garlic} 0.1108415 0.5852279  0.1893989 1.673909 5071 
## [12] {onion}          => {garlic} 0.1773552 0.5847085  0.3033224 1.672424 8114 
## [13] {egg}            => {salt}   0.1307978 0.5688754  0.2299235 1.388426 5984 
## [14] {olive}          => {pepper} 0.1673880 0.5682695  0.2945574 1.618321 7658 
## [15] {garlic}         => {pepper} 0.1977923 0.5657393  0.3496175 1.611116 9049 
## [16] {pepper}         => {garlic} 0.1977923 0.5632742  0.3511475 1.611116 9049 
## [17] {garlic, pepper} => {salt}   0.1108415 0.5603934  0.1977923 1.367725 5071 
## [18] {flour}          => {sugar}  0.1060109 0.5596584  0.1894208 1.906506 4850 
## [19] {garlic, pepper} => {olive}  0.1094645 0.5534313  0.1977923 1.878857 5008 
## [20] {egg}            => {sugar}  0.1261421 0.5486263  0.2299235 1.868924 5771 
## [21] {water}          => {salt}   0.1139454 0.5396480  0.2111475 1.317092 5213 
## [22] {pepper}         => {salt}   0.1893989 0.5393713  0.3511475 1.316417 8665 
## [23] {garlic, pepper} => {onion}  0.1043716 0.5276826  0.1977923 1.739676 4775 
## [24] {onion}          => {pepper} 0.1563497 0.5154572  0.3033224 1.467922 7153 
## [25] {sugar}          => {salt}   0.1495738 0.5095309  0.2935519 1.243587 6843 
## [26] {garlic}         => {olive}  0.1779672 0.5090341  0.3496175 1.728132 8142 
## [27] {garlic}         => {onion}  0.1773552 0.5072835  0.3496175 1.672424 8114 
## [28] {butter}         => {salt}   0.1391694 0.5030815  0.2766339 1.227846 6367

rules.by.lift<-sort(rules1c, by="lift", decreasing=TRUE) 
inspect(rules.by.lift)

##      lhs                 rhs      support   confidence coverage  lift     count
## [1]  {flour}          => {egg}    0.1113880 0.5880452  0.1894208 2.557569 5096 
## [2]  {flour}          => {butter} 0.1162623 0.6137780  0.1894208 2.218738 5319 
## [3]  {onion, pepper}  => {garlic} 0.1043716 0.6675521  0.1563497 1.909378 4775 
## [4]  {garlic, salt}   => {pepper} 0.1108415 0.6697926  0.1654863 1.907439 5071 
## [5]  {flour}          => {sugar}  0.1060109 0.5596584  0.1894208 1.906506 4850 
## [6]  {garlic, pepper} => {olive}  0.1094645 0.5534313  0.1977923 1.878857 5008 
## [7]  {olive, pepper}  => {garlic} 0.1094645 0.6539566  0.1673880 1.870492 5008 
## [8]  {egg}            => {sugar}  0.1261421 0.5486263  0.2299235 1.868924 5771 
## [9]  {tomato}         => {garlic} 0.1057705 0.6330455  0.1670820 1.810680 4839 
## [10] {garlic, olive}  => {pepper} 0.1094645 0.6150823  0.1779672 1.751635 5008 
## [11] {garlic, pepper} => {onion}  0.1043716 0.5276826  0.1977923 1.739676 4775 
## [12] {garlic}         => {olive}  0.1779672 0.5090341  0.3496175 1.728132 8142 
## [13] {olive}          => {garlic} 0.1779672 0.6041852  0.2945574 1.728132 8142 
## [14] {garlic, onion}  => {pepper} 0.1043716 0.5884890  0.1773552 1.675902 4775 
## [15] {pepper, salt}   => {garlic} 0.1108415 0.5852279  0.1893989 1.673909 5071 
## [16] {onion}          => {garlic} 0.1773552 0.5847085  0.3033224 1.672424 8114 
## [17] {garlic}         => {onion}  0.1773552 0.5072835  0.3496175 1.672424 8114 
## [18] {flour}          => {salt}   0.1270164 0.6705516  0.1894208 1.636582 5811 
## [19] {olive}          => {pepper} 0.1673880 0.5682695  0.2945574 1.618321 7658 
## [20] {garlic}         => {pepper} 0.1977923 0.5657393  0.3496175 1.611116 9049 
## [21] {pepper}         => {garlic} 0.1977923 0.5632742  0.3511475 1.611116 9049 
## [22] {onion}          => {pepper} 0.1563497 0.5154572  0.3033224 1.467922 7153 
## [23] {egg}            => {salt}   0.1307978 0.5688754  0.2299235 1.388426 5984 
## [24] {garlic, pepper} => {salt}   0.1108415 0.5603934  0.1977923 1.367725 5071 
## [25] {water}          => {salt}   0.1139454 0.5396480  0.2111475 1.317092 5213 
## [26] {pepper}         => {salt}   0.1893989 0.5393713  0.3511475 1.316417 8665 
## [27] {sugar}          => {salt}   0.1495738 0.5095309  0.2935519 1.243587 6843 
## [28] {butter}         => {salt}   0.1391694 0.5030815  0.2766339 1.227846 6367

plot(rules1c, method="paracoord", control=list(reorder=TRUE))

Association Rules analysis for selected ingredients

Because the database contains roughly 1000 ingredients and more than 45 thousand recipes, some interesting combinations are not detected at a support threshold of 0.1. Therefore, I lowered the support and confidence levels and searched for rules involving selected products. For products like chocolate, which do not occur often (support of only about 1%), the confidence level is high: {chocolate} => {sugar}. The highest confidence levels are recorded for the rules {chocolate, egg, flour, vanilla} => {sugar}, {cream, flour, vanilla} => {sugar} and {baking powder, butter, salt, vanilla} => {sugar}. Their support is about 0.01, whereas their confidence is over 99%.

Sugar

# Rules with sugar as the antecedent (lhs); show the top 3 by confidence
rules_sugar <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "sugar"), control = list(verbose = FALSE))
rules_sugar_byconf <- sort(rules_sugar, by = "confidence", decreasing = TRUE)
inspect(rules_sugar_byconf[1:3], linebreak = FALSE)

##     lhs        rhs      support   confidence coverage  lift     count
## [1] {sugar} => {salt}   0.1495738 0.5095309  0.2935519 1.243587 6843 
## [2] {sugar} => {egg}    0.1261421 0.4297096  0.2935519 1.868924 5771 
## [3] {sugar} => {butter} 0.1197377 0.4078928  0.2935519 1.474486 5478

# Rules with sugar as the consequent (rhs); show the top 3 by confidence
rules_sugar_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "sugar"), control = list(verbose = FALSE))
rules_sugar_byconf_1 <- sort(rules_sugar_1, by = "confidence", decreasing = TRUE)
inspect(rules_sugar_byconf_1[1:3], linebreak = FALSE)

##     lhs                                       rhs     support    confidence
## [1] {chocolate, egg, flour, vanilla}       => {sugar} 0.01022951 0.9957447 
## [2] {cream, flour, vanilla}                => {sugar} 0.01300546 0.9949833 
## [3] {baking powder, butter, salt, vanilla} => {sugar} 0.01241530 0.9947461 
##     coverage   lift     count
## [1] 0.01027322 3.392057 468  
## [2] 0.01307104 3.389463 595  
## [3] 0.01248087 3.388655 568

Egg

# Rules with egg as the antecedent (lhs); show the top 3 by confidence
rules_egg <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "egg"), control = list(verbose = FALSE))
rules_egg_byconf <- sort(rules_egg, by = "confidence", decreasing = TRUE)
inspect(rules_egg_byconf[1:3], linebreak = FALSE)

##     lhs      rhs     support   confidence coverage  lift     count
## [1] {egg} => {salt}  0.1307978 0.5688754  0.2299235 1.388426 5984 
## [2] {egg} => {sugar} 0.1261421 0.5486263  0.2299235 1.868924 5771 
## [3] {egg} => {flour} 0.1113880 0.4844567  0.2299235 2.557569 5096

# Rules with egg as the consequent (rhs); show the top 3 by confidence
rules_egg_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "egg"), control = list(verbose = FALSE))
rules_egg_byconf_1 <- sort(rules_egg_1, by = "confidence", decreasing = TRUE)
inspect(rules_egg_byconf_1[1:3], linebreak = FALSE)

##     lhs                                           rhs   support    confidence
## [1] {baking soda, flour, salt, sugar, vanilla} => {egg} 0.01029508 0.9515152 
## [2] {baking soda, flour, salt, vanilla}        => {egg} 0.01036066 0.9480000 
## [3] {baking soda, flour, sugar, vanilla}       => {egg} 0.01208743 0.9469178 
##     coverage   lift     count
## [1] 0.01081967 4.138399 471  
## [2] 0.01092896 4.123111 474  
## [3] 0.01276503 4.118404 553

Milk

# Rules with milk as the antecedent (lhs); show the top 3 by confidence
rules_milk <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "milk"), control = list(verbose = FALSE))
rules_milk_byconf <- sort(rules_milk, by = "confidence", decreasing = TRUE)
inspect(rules_milk_byconf[1:3], linebreak = FALSE)

##     lhs       rhs      support    confidence coverage  lift     count
## [1] {milk} => {egg}    0.05855738 0.5642376  0.1037814 2.454023 2679 
## [2] {milk} => {butter} 0.05407650 0.5210615  0.1037814 1.883578 2474 
## [3] {milk} => {salt}   0.05355191 0.5160067  0.1037814 1.259392 2450

# Rules with milk as the consequent (rhs); show the top 3 by confidence
rules_milk_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "milk"), control = list(verbose = FALSE))
rules_milk_byconf_1 <- sort(rules_milk_1, by = "confidence", decreasing = TRUE)
inspect(rules_milk_byconf_1[1:3], linebreak = FALSE)

##     lhs                                  rhs    support    confidence
## [1] {baking powder, egg, flour, salt} => {milk} 0.01084153 0.3737754 
## [2] {baking powder, butter, flour}    => {milk} 0.01147541 0.3733997 
## [3] {baking powder, butter, egg}      => {milk} 0.01005464 0.3724696 
##     coverage   lift     count
## [1] 0.02900546 3.601564 496  
## [2] 0.03073224 3.597944 525  
## [3] 0.02699454 3.588982 460

Bread

# Rules with bread as the antecedent (lhs); show the top 3 by confidence
rules_bread <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "bread"), control = list(verbose = FALSE))
rules_bread_byconf <- sort(rules_bread, by = "confidence", decreasing = TRUE)
inspect(rules_bread_byconf[1:3], linebreak = FALSE)

##     lhs        rhs      support    confidence coverage   lift     count
## [1] {bread} => {pepper} 0.02824044 0.4877312  0.05790164 1.388964 1292 
## [2] {bread} => {garlic} 0.02681967 0.4631937  0.05790164 1.324858 1227 
## [3] {bread} => {olive}  0.02581421 0.4458286  0.05790164 1.513554 1181

# Rules with bread as the consequent (rhs); show the top 3 by confidence
rules_bread_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "bread"), control = list(verbose = FALSE))
rules_bread_byconf_1 <- sort(rules_bread_1, by = "confidence", decreasing = TRUE)
inspect(rules_bread_byconf_1[1:3], linebreak = FALSE)

##     lhs                  rhs     support    confidence coverage   lift    
## [1] {egg, pepper}     => {bread} 0.01156284 0.1916667  0.06032787 3.310211
## [2] {garlic, parsley} => {bread} 0.01101639 0.1504029  0.07324590 2.597558
## [3] {parsley, pepper} => {bread} 0.01044809 0.1432424  0.07293989 2.473893
##     count
## [1] 529  
## [2] 504  
## [3] 478

Cinnamon

# Rules with cinnamon as the antecedent (lhs); show the top 3 by confidence
rules_cinnamon <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "cinnamon"), control = list(verbose = FALSE))
rules_cinnamon_byconf <- sort(rules_cinnamon, by = "confidence", decreasing = TRUE)
inspect(rules_cinnamon_byconf[1:3], linebreak = FALSE)

##     lhs           rhs      support    confidence coverage   lift     count
## [1] {cinnamon} => {sugar}  0.05016393 0.6521739  0.07691803 2.221665 2295 
## [2] {cinnamon} => {salt}   0.03849180 0.5004263  0.07691803 1.221366 1761 
## [3] {cinnamon} => {butter} 0.03289617 0.4276783  0.07691803 1.546008 1505

# Rules with cinnamon as the consequent (rhs); show the top 3 by confidence
rules_cinnamon_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "cinnamon"), control = list(verbose = FALSE))
rules_cinnamon_byconf_1 <- sort(rules_cinnamon_1, by = "confidence", decreasing = TRUE)
inspect(rules_cinnamon_byconf_1[1:3], linebreak = FALSE)

##     lhs                rhs        support    confidence coverage   lift     
## [1] {clove, sugar}  => {cinnamon} 0.01110383 0.7851623  0.01414208 10.207779
## [2] {clove, salt}   => {cinnamon} 0.01016393 0.6981982  0.01455738  9.077172
## [3] {nutmeg, sugar} => {cinnamon} 0.01193443 0.6876574  0.01735519  8.940133
##     count
## [1] 508  
## [2] 465  
## [3] 546

Chocolate

# Rules with chocolate as the antecedent (lhs); show the top 3 by confidence
rules_chocolate <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "chocolate"), control = list(verbose = FALSE))
rules_chocolate_byconf <- sort(rules_chocolate, by = "confidence", decreasing = TRUE)
inspect(rules_chocolate_byconf[1:3], linebreak = FALSE)

##     lhs            rhs      support    confidence coverage   lift     count
## [1] {chocolate} => {sugar}  0.03143169 0.7564440  0.04155191 2.576866 1438 
## [2] {chocolate} => {egg}    0.02330055 0.5607575  0.04155191 2.438887 1066 
## [3] {chocolate} => {butter} 0.02229508 0.5365597  0.04155191 1.939602 1020

# Rules with chocolate as the consequent (rhs); show the top 3 by confidence
rules_chocolate_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "chocolate"), control = list(verbose = FALSE))
rules_chocolate_byconf_1 <- sort(rules_chocolate_1, by = "confidence", decreasing = TRUE)
inspect(rules_chocolate_byconf_1[1:3], linebreak = FALSE)

##     lhs                              rhs         support    confidence
## [1] {cream, egg, sugar}           => {chocolate} 0.01014208 0.2783443 
## [2] {butter, egg, sugar, vanilla} => {chocolate} 0.01090710 0.2668449 
## [3] {butter, egg, vanilla}        => {chocolate} 0.01103825 0.2656497 
##     coverage   lift     count
## [1] 0.03643716 6.698713 464  
## [2] 0.04087432 6.421965 499  
## [3] 0.04155191 6.393199 505

Red Wine

# Rules with red wine as the antecedent (lhs); show the top 3 by confidence
rules_wine <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "wine red"), control = list(verbose = FALSE))
rules_wine_byconf <- sort(rules_wine, by = "confidence", decreasing = TRUE)
inspect(rules_wine_byconf[1:3], linebreak = FALSE)

##     lhs           rhs      support    confidence coverage   lift     count
## [1] {wine red} => {garlic} 0.01108197 0.6007109  0.01844809 1.718195 507  
## [2] {wine red} => {olive}  0.01060109 0.5746445  0.01844809 1.950875 485  
## [3] {wine red} => {onion}  0.01014208 0.5497630  0.01844809 1.812471 464

# Rules with red wine as the consequent (rhs); show the top 3 by confidence
rules_wine_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "wine red"), control = list(verbose = FALSE))
rules_wine_byconf_1 <- sort(rules_wine_1, by = "confidence", decreasing = TRUE)
inspect(rules_wine_byconf_1[1:3], linebreak = FALSE)

##     lhs         rhs        support    confidence coverage  lift     count
## [1] {olive}  => {wine red} 0.01060109 0.03598991 0.2945574 1.950875 485  
## [2] {onion}  => {wine red} 0.01014208 0.03343662 0.3033224 1.812471 464  
## [3] {garlic} => {wine red} 0.01108197 0.03169741 0.3496175 1.718195 507
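The two red wine tables list the same item pairs in opposite directions, with very different confidence but identical lift. The short sketch below recomputes both directions for the {wine red, garlic} pair to show why. The inputs are copied from the outputs above, while the variable names are illustrative assumptions of mine.

```r
# Figures copied from the two inspect() outputs above (rounded as printed).
supp_pair   <- 0.01108197  # support of {wine red, garlic}
supp_wine   <- 0.01844809  # support of {wine red} (coverage of {wine red} => {garlic})
supp_garlic <- 0.3496175   # support of {garlic} (coverage of {garlic} => {wine red})

# Confidence depends on the direction of the rule: the denominator is
# the support of the antecedent.
conf_wine_to_garlic <- supp_pair / supp_wine
conf_garlic_to_wine <- supp_pair / supp_garlic

# Lift divides by both marginal supports, so it is the same either way.
lift_either_way <- supp_pair / (supp_wine * supp_garlic)

print(round(c(wine_to_garlic = conf_wine_to_garlic,
              garlic_to_wine = conf_garlic_to_wine,
              lift = lift_either_way), 4))
```

Because red wine is rare and garlic is common, most red wine recipes contain garlic, but only a small share of garlic recipes contain red wine. Lift is the more informative measure for the reversed rules.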

Chicken

# Rules with chicken as the antecedent (lhs); show the top 3 by confidence
rules_chicken <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "chicken"), control = list(verbose = FALSE))
rules_chicken_byconf <- sort(rules_chicken, by = "confidence", decreasing = TRUE)
inspect(rules_chicken_byconf[1:3], linebreak = FALSE)

##     lhs          rhs      support    confidence coverage  lift     count
## [1] {chicken} => {garlic} 0.08756284 0.5683076  0.1540765 1.625512 4006 
## [2] {chicken} => {onion}  0.08212022 0.5329834  0.1540765 1.757151 3757 
## [3] {chicken} => {pepper} 0.07746448 0.5027663  0.1540765 1.431781 3544

# Rules with chicken as the consequent (rhs); show the top 3 by confidence
rules_chicken_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "chicken"), control = list(verbose = FALSE))
rules_chicken_byconf_1 <- sort(rules_chicken_1, by = "confidence", decreasing = TRUE)
inspect(rules_chicken_byconf_1[1:3], linebreak = FALSE)

##     lhs                 rhs       support    confidence coverage   lift    
## [1] {onion, rice}    => {chicken} 0.01130055 0.4751838  0.02378142 3.084077
## [2] {garlic, rice}   => {chicken} 0.01018579 0.4735772  0.02150820 3.073650
## [3] {carrot, celery} => {chicken} 0.01084153 0.4126456  0.02627322 2.678186
##     count
## [1] 517  
## [2] 466  
## [3] 496

Tomato

# Rules with tomato as the antecedent (lhs); show the top 3 by confidence
rules_tomato <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "rhs", lhs = "tomato"), control = list(verbose = FALSE))
rules_tomato_byconf <- sort(rules_tomato, by = "confidence", decreasing = TRUE)
inspect(rules_tomato_byconf[1:3], linebreak = FALSE)

##     lhs         rhs      support    confidence coverage lift     count
## [1] {tomato} => {garlic} 0.10577049 0.6330455  0.167082 1.810680 4839 
## [2] {tomato} => {onion}  0.09702732 0.5807169  0.167082 1.914520 4439 
## [3] {tomato} => {olive}  0.09106011 0.5450026  0.167082 1.850243 4166

# Rules with tomato as the consequent (rhs); show the top 3 by confidence
rules_tomato_1 <- apriori(data = trans1, parameter = list(supp = 0.01, conf = 0.005, minlen = 2), appearance = list(default = "lhs", rhs = "tomato"), control = list(verbose = FALSE))
rules_tomato_byconf_1 <- sort(rules_tomato_1, by = "confidence", decreasing = TRUE)
inspect(rules_tomato_byconf_1[1:3], linebreak = FALSE)

##     lhs                              rhs      support    confidence coverage  
## [1] {basil, garlic, olive, onion} => {tomato} 0.01027322 0.7378336  0.01392350
## [2] {basil, olive, onion}         => {tomato} 0.01165027 0.7202703  0.01617486
## [3] {basil, garlic, onion}        => {tomato} 0.01366120 0.6565126  0.02080874
##     lift     count
## [1] 4.415998 470  
## [2] 4.310880 533  
## [3] 3.929285 625

Conclusions

The goal of association rule mining, given a set of transactions, is to find rules that predict the occurrence of a specific item from the occurrence of other items in the same transaction. Association rules provided many insights into the analyzed recipe dataset. The Apriori algorithm was run with a support threshold of 0.10 and confidence thresholds of 0.50, 0.55 and 0.60, and then with lower thresholds for selected ingredients. It found patterns that are also encountered in real life. As I expected, the most common items and rules consist of spices and basic foodstuffs such as butter, olive or flour. Additionally, several association rules revealed interesting co-occurrences: the use of flour is tied to the use of eggs ({egg} => {flour} has a lift of about 2.56), which makes sense, as the database contains many baking recipes (e.g. cakes).

Association Rules for Recipes

Pola Parol

2023-02-07
