Data

Imagine 10,000 receipts sitting on your table. Each receipt represents a transaction and lists the items that were purchased. In other words, a receipt records what went into a customer’s basket, hence the name ‘Market Basket Analysis’. That is exactly what the Groceries Data Set contains: a collection of receipts, with each line representing one receipt and the items purchased on it. Each line is called a transaction, and each column in a row represents an item. The data set is attached.

Your assignment is to use R to mine the data for association rules. Report the support, confidence, and lift of the rules, along with your top 10 rules by lift.
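To make the three reported measures concrete, here is a minimal base R sketch that computes them by hand. The five baskets and the helper names `support` and `rule_metrics` are hypothetical and are not part of the Groceries data or of any package. It assumes the usual definitions: support is the share of transactions containing every item in the rule, confidence is that share divided by the support of the left-hand side, and lift is confidence divided by the support of the right-hand side.

```r
# Hypothetical toy baskets (not taken from the Groceries data set)
baskets <- list(
  c("milk", "bread"),
  c("milk", "bread", "butter"),
  c("bread", "butter"),
  c("milk", "butter"),
  c("milk", "bread", "butter")
)

# Share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Support, confidence and lift for the rule lhs => rhs
rule_metrics <- function(lhs, rhs, baskets) {
  supp_rule <- support(c(lhs, rhs), baskets)
  conf <- supp_rule / support(lhs, baskets)
  c(support = supp_rule,
    confidence = conf,
    lift = conf / support(rhs, baskets))
}

m <- rule_metrics("bread", "butter", baskets)
print(m)
```

A lift above 1 suggests the right-hand side shows up more often alongside the left-hand side than it does overall; a lift below 1 suggests the opposite.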

library(plyr)
library(tidyr)
library(readxl)
library(knitr)
library(ggplot2)
library(lubridate)
library(arules)
library(arulesViz)

Load Data

df <- read.transactions("https://raw.githubusercontent.com/mkivenson/Predictive-Analytics/master/Market%20Basket%20Analysis/GroceryDataSet.csv", header = FALSE, sep = ",")
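Before mining, it can help to confirm that the file was parsed as one basket per line. The sketch below does this on a hypothetical three-receipt file written to a temporary path, so it runs without a download; the file contents and the name `tx` are made up for illustration, while `read.transactions`, `size`, and `itemFrequency` are functions from arules.

```r
library(arules)

# Hypothetical three-receipt file in the same layout: one receipt per line,
# items separated by commas, no header row
tmp <- tempfile(fileext = ".csv")
writeLines(c("milk,bread", "milk,bread,butter", "butter"), tmp)

tx <- read.transactions(tmp, format = "basket", sep = ",", header = FALSE)

n_receipts <- length(tx)        # number of transactions read
basket_sizes <- size(tx)        # number of items on each receipt
item_share <- itemFrequency(tx) # share of receipts containing each item

print(n_receipts)
print(basket_sizes)
print(sort(item_share, decreasing = TRUE))
```

The same calls should work on the `df` object loaded above, and `summary(df)` gives a condensed view of the same information.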

Use arules to mine rules and inspect the first 10

rules <- apriori(df, parameter = list(supp = 0.001, conf = 0.8))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.8    0.1    1 none FALSE            TRUE       5   0.001      1
##  maxlen target   ext
##      10  rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 9 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
## sorting and recoding items ... [157 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 6 done [0.01s].
## writing ... [410 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
# Show the first 10 rules in the order apriori returned them (not sorted by lift)
inspect(rules[1:10])
##      lhs                         rhs                    support confidence      lift count
## [1]  {liquor,                                                                             
##       red/blush wine}         => {bottled beer}     0.001931876  0.9047619 11.235269    19
## [2]  {cereals,                                                                            
##       curd}                   => {whole milk}       0.001016777  0.9090909  3.557863    10
## [3]  {cereals,                                                                            
##       yogurt}                 => {whole milk}       0.001728521  0.8095238  3.168192    17
## [4]  {butter,                                                                             
##       jam}                    => {whole milk}       0.001016777  0.8333333  3.261374    10
## [5]  {bottled beer,                                                                       
##       soups}                  => {whole milk}       0.001118454  0.9166667  3.587512    11
## [6]  {house keeping products,                                                             
##       napkins}                => {whole milk}       0.001321810  0.8125000  3.179840    13
## [7]  {house keeping products,                                                             
##       whipped/sour cream}     => {whole milk}       0.001220132  0.9230769  3.612599    12
## [8]  {pastry,                                                                             
##       sweet spreads}          => {whole milk}       0.001016777  0.9090909  3.557863    10
## [9]  {curd,                                                                               
##       turkey}                 => {other vegetables} 0.001220132  0.8000000  4.134524    12
## [10] {rice,                                                                               
##       sugar}                  => {whole milk}       0.001220132  1.0000000  3.913649    12
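The lift column in the listing above is not in descending order (11.24 is followed by 3.56 and 3.17), so these are the first ten rules as returned, not the ten with the highest lift. A minimal sketch of ranking by lift follows. The helper name `top_rules_by_lift` and the toy baskets are hypothetical and exist only so the example runs without downloading anything; `sort`, `head`, `quality`, and `inspect` are the arules methods for rule objects.

```r
library(arules)

# Hypothetical helper: the n rules with the highest lift, in descending order
top_rules_by_lift <- function(rules, n = 10) {
  head(sort(rules, by = "lift", decreasing = TRUE), n)
}

# Toy baskets (not the Groceries data) so the sketch is self-contained
baskets <- list(
  c("milk", "bread"),
  c("milk", "bread", "butter"),
  c("bread", "butter"),
  c("milk", "butter"),
  c("milk", "bread", "butter")
)
tx <- as(baskets, "transactions")

# Thresholds chosen only to suit the toy data; verbose = FALSE hides the log
toy_rules <- apriori(
  tx,
  parameter = list(supp = 0.4, conf = 0.6, minlen = 2),
  control = list(verbose = FALSE)
)

top <- top_rules_by_lift(toy_rules, n = 3)
inspect(top)
```

Applied to the `rules` object mined above, `inspect(top_rules_by_lift(rules))` would list the ten highest-lift rules for the Groceries data; that output is not reproduced here.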