Data 624 HW Week 14

groceriesTran<- read.transactions("GroceryDataSet.csv",format="basket", header=F, sep=",")
summary(groceriesTran)
## transactions as itemMatrix in sparse format with
##  9835 rows (elements/itemsets/transactions) and
##  169 columns (items) and a density of 0.02609146 
## 
## most frequent items:
##       whole milk other vegetables       rolls/buns             soda 
##             2513             1903             1809             1715 
##           yogurt          (Other) 
##             1372            34055 
## 
## element (itemset/transaction) length distribution:
## sizes
##    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
## 2159 1643 1299 1005  855  645  545  438  350  246  182  117   78   77   55   46 
##   17   18   19   20   21   22   23   24   26   27   28   29   32 
##   29   14   14    9   11    4    6    1    1    1    1    3    1 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   2.000   3.000   4.409   6.000  32.000 
## 
## includes extended item information - examples:
##             labels
## 1 abrasive cleaner
## 2 artif. sweetener
## 3   baby cosmetics
itemFrequencyPlot(groceriesTran,topN=15, col = brewer.pal(8,'Accent'), type="absolute")

association.rules <- apriori(groceriesTran, parameter = list(supp=0.001, conf=0.8,maxlen=10))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.8    0.1    1 none FALSE            TRUE       5   0.001      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 9 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[169 item(s), 9835 transaction(s)] done [0.01s].
## sorting and recoding items ... [157 item(s)] done [0.00s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2 3 4 5 6 done [0.04s].
## writing ... [410 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
inspect(head(association.rules, n=10, by="lift"))
##      lhs                         rhs                   support confidence    coverage      lift count
## [1]  {liquor,                                                                                        
##       red/blush wine}         => {bottled beer}    0.001931876  0.9047619 0.002135231 11.235269    19
## [2]  {citrus fruit,                                                                                  
##       fruit/vegetable juice,                                                                         
##       other vegetables,                                                                              
##       soda}                   => {root vegetables} 0.001016777  0.9090909 0.001118454  8.340400    10
## [3]  {oil,                                                                                           
##       other vegetables,                                                                              
##       tropical fruit,                                                                                
##       whole milk,                                                                                    
##       yogurt}                 => {root vegetables} 0.001016777  0.9090909 0.001118454  8.340400    10
## [4]  {citrus fruit,                                                                                  
##       fruit/vegetable juice,                                                                         
##       grapes}                 => {tropical fruit}  0.001118454  0.8461538 0.001321810  8.063879    11
## [5]  {other vegetables,                                                                              
##       rice,                                                                                          
##       whole milk,                                                                                    
##       yogurt}                 => {root vegetables} 0.001321810  0.8666667 0.001525165  7.951182    13
## [6]  {oil,                                                                                           
##       other vegetables,                                                                              
##       tropical fruit,                                                                                
##       whole milk}             => {root vegetables} 0.001321810  0.8666667 0.001525165  7.951182    13
## [7]  {ham,                                                                                           
##       other vegetables,                                                                              
##       pip fruit,                                                                                     
##       yogurt}                 => {tropical fruit}  0.001016777  0.8333333 0.001220132  7.941699    10
## [8]  {beef,                                                                                          
##       citrus fruit,                                                                                  
##       other vegetables,                                                                              
##       tropical fruit}         => {root vegetables} 0.001016777  0.8333333 0.001220132  7.645367    10
## [9]  {butter,                                                                                        
##       cream cheese,                                                                                  
##       root vegetables}        => {yogurt}          0.001016777  0.9090909 0.001118454  6.516698    10
## [10] {butter,                                                                                        
##       sliced cheese,                                                                                 
##       tropical fruit,                                                                                
##       whole milk}             => {yogurt}          0.001016777  0.9090909 0.001118454  6.516698    10
summary(association.rules)
## set of 410 rules
## 
## rule length distribution (lhs + rhs):sizes
##   3   4   5   6 
##  29 229 140  12 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.000   4.000   4.000   4.329   5.000   6.000 
## 
## summary of quality measures:
##     support           confidence        coverage             lift       
##  Min.   :0.001017   Min.   :0.8000   Min.   :0.001017   Min.   : 3.131  
##  1st Qu.:0.001017   1st Qu.:0.8333   1st Qu.:0.001220   1st Qu.: 3.312  
##  Median :0.001220   Median :0.8462   Median :0.001322   Median : 3.588  
##  Mean   :0.001247   Mean   :0.8663   Mean   :0.001449   Mean   : 3.951  
##  3rd Qu.:0.001322   3rd Qu.:0.9091   3rd Qu.:0.001627   3rd Qu.: 4.341  
##  Max.   :0.003152   Max.   :1.0000   Max.   :0.003559   Max.   :11.235  
##      count      
##  Min.   :10.00  
##  1st Qu.:10.00  
##  Median :12.00  
##  Mean   :12.27  
##  3rd Qu.:13.00  
##  Max.   :31.00  
## 
## mining info:
##           data ntransactions support confidence
##  groceriesTran          9835   0.001        0.8
##                                                                                    call
##  apriori(data = groceriesTran, parameter = list(supp = 0.001, conf = 0.8, maxlen = 10))
subRules<-association.rules[quality(association.rules)$confidence>0.8]
plot(subRules)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.

plot(head(association.rules, n=10, by="lift"), method="paracoord")

plot(head(association.rules, n=20, by="lift"), method = "graph",  engine = "htmlwidget")

Dairy products tend to flock together, and root vegetables are situationally popular, though there is a lovely party hearty linkage of liquor, red/blush wine leading to bottled beer.

References

– Market Basket Analysis using R https://www.datacamp.com/community/tutorials/market-basket-analysis-r