Background

In this exercise we build association rules from a dataset taken from the Weka datasets. The dataset is available at data/supermarket.csv. supermarket.csv lists the items purchased in each transaction.

Data Preparation

## 'data.frame':    79626 obs. of  2 variables:
##  $ TID : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ name: chr  "baby needs " "bread and cake " "baking needs " "juice sat cord ms" ...
##       TID           name          
##  Min.   :   1   Length:79626      
##  1st Qu.:1162   Class :character  
##  Median :2324   Mode  :character  
##  Mean   :2317                     
##  3rd Qu.:3476                     
##  Max.   :4627

The data has two columns, TID and name. TID is the transaction ID and name describes the product purchased. Summary statistics can be obtained with summary().
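The structure and summary output above could be produced along the following lines. This is a minimal sketch: the path data/supermarket.csv and the column names TID and name come from the text, while the object name `supermarket` and the small stand-in data frame (used only when the file is not present) are assumptions made so the snippet runs on its own.

```r
# Read the CSV if it is available; otherwise fall back to a tiny stand-in
# with the same two columns (TID, name) so the sketch still runs.
path <- "data/supermarket.csv"
if (file.exists(path)) {
  supermarket <- read.csv(path, stringsAsFactors = FALSE)
} else {
  supermarket <- data.frame(
    TID  = c(1L, 1L, 1L, 2L, 2L, 3L),
    name = c("bread and cake ", "milk cream", "biscuits",
             "milk cream", "fruit", "bread and cake "),
    stringsAsFactors = FALSE
  )
}

str(supermarket)      # column types and a preview of the values
summary(supermarket)  # numeric summary for TID, type information for name
```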

Exploratory Analysis

##   [1] "baby needs "         "bread and cake "     "baking needs "      
##   [4] "juice sat cord ms"   "biscuits"            "canned vegetables " 
##   [7] "cleaners polishers"  "coffee"              "sauces gravy pkle"  
##  [10] "confectionary"       "dishcloths scour"    "frozen foods "      
##  [13] "razor blades "       "party snack foods "  "tissues paper prd " 
##  [16] "wrapping"            "mens toiletries "    "cheese"             
##  [19] "milk cream"          "margarine"           "small goods "       
##  [22] "fruit"               "vegetables"          "750ml white nz "    
##  [25] "canned fish meat "   "canned fruit "       "deod disinfectant"  
##  [28] "pet foods "          "laundry needs "      "deodorants soap"    
##  [31] "haircare"            "puddings deserts"    "health food other " 
##  [34] "dairy foods "        "beef"                "lamb"               
##  [37] "stationary"          "breakfast food "     "jams spreads"       
##  [40] "prepared meals "     "tea"                 "dental needs "      
##  [43] "meat misc "          "poultry"             "potatoes"           
##  [46] "condiments"          "small goods2 "       "spices"             
##  [49] "bake off products "  "soft drinks "        "sanitary pads "     
##  [52] "electrical"          "750ml red imp "      "sparkling nz "      
##  [55] "pork"                "beverages hot "      "lotions creams"     
##  [58] "cough cold pain"     "pet food "           "grocery misc "      
##  [61] "cigs tobacco pkts "  "pkt canned soup "    "produce misc "      
##  [64] "deli gourmet "       "variety misc "       "kitchen"            
##  [67] "imported cheese "    "fuels garden aids "  "cold meats"         
##  [70] "casks white wine "   "cooking oils "       "offal"              
##  [73] "manchester"          "plasticware"         "insecticides"       
##  [76] "brushware"           "non host support "   "haberdashery"       
##  [79] "port and sherry "    "delicatessen misc "  "health beauty misc "
##  [82] "dried vegetables "   "medicines"           "trim pork "         
##  [85] "cigarette cartons "  "plants"              "preserving needs "  
##  [88] "750ml white imp "    "hogget"              "fruit drinks "      
##  [91] "750ml red nz "       "casks red wine "     "trim lamb "         
##  [94] "mutton"              "sparkling imp "      "pantyhose"          
##  [97] "salads"              "chickens"            "gourmet meat "      
## [100] "veal"

The most frequently purchased item is bread and cake, with a frequency of 3330.
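A frequency like this can be obtained by tabulating the name column. The sketch below uses a toy data frame in the same long format (one row per item per transaction); the object names `purchases` and `item_freq` are assumptions, not taken from the original analysis.

```r
# Toy purchases in the same long format as the dataset (TID, name).
purchases <- data.frame(
  TID  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L),
  name = c("bread and cake ", "milk cream", "biscuits",
           "bread and cake ", "fruit",
           "bread and cake ", "milk cream"),
  stringsAsFactors = FALSE
)

unique(purchases$name)   # distinct item names, in order of first appearance

# Count how often each item occurs and put the most frequent first.
item_freq <- sort(table(purchases$name), decreasing = TRUE)
head(item_freq, 3)
```

Note that several item names in the real data carry a trailing space (for example "bread and cake "), so exact string comparisons need to include it.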

We can also count how many items were purchased in each transaction.
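Because the data has one row per item per transaction, the number of items in a transaction is simply the number of rows that share its TID. A minimal sketch on toy data (the names `purchases` and `items_per_tid` are assumptions):

```r
# Toy purchases in the same long format as the dataset (TID, name).
purchases <- data.frame(
  TID  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L),
  name = c("bread and cake ", "milk cream", "biscuits",
           "bread and cake ", "fruit",
           "bread and cake ", "milk cream"),
  stringsAsFactors = FALSE
)

# Number of rows (items) recorded for each transaction ID.
items_per_tid <- table(purchases$TID)
items_per_tid

# Distribution of basket sizes across transactions.
summary(as.vector(items_per_tid))
```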

Build Rules

## $`1`
##  [1] "baby needs "        "bread and cake "    "baking needs "     
##  [4] "juice sat cord ms"  "biscuits"           "canned vegetables "
##  [7] "cleaners polishers" "coffee"             "sauces gravy pkle" 
## [10] "confectionary"      "dishcloths scour"   "frozen foods "     
## [13] "razor blades "      "party snack foods " "tissues paper prd "
## [16] "wrapping"           "mens toiletries "   "cheese"            
## [19] "milk cream"         "margarine"          "small goods "      
## [22] "fruit"              "vegetables"         "750ml white nz "   
## 
## $`2`
##  [1] "canned fish meat "  "canned fruit "      "canned vegetables "
##  [4] "sauces gravy pkle"  "deod disinfectant"  "frozen foods "     
##  [7] "pet foods "         "laundry needs "     "tissues paper prd "
## [10] "deodorants soap"    "haircare"           "milk cream"        
## [13] "fruit"              "vegetables"        
## 
## $`3`
##  [1] "bread and cake "    "baking needs "      "juice sat cord ms" 
##  [4] "biscuits"           "canned fruit "      "sauces gravy pkle" 
##  [7] "puddings deserts"   "wrapping"           "health food other "
## [10] "small goods "       "dairy foods "       "beef"              
## [13] "lamb"               "fruit"              "vegetables"        
## [16] "stationary"
##     items                transactionID
## [1] {750ml white nz ,                 
##      baby needs ,                     
##      baking needs ,                   
##      biscuits,                        
##      bread and cake ,                 
##      canned vegetables ,              
##      cheese,                          
##      cleaners polishers,              
##      coffee,                          
##      confectionary,                   
##      dishcloths scour,                
##      frozen foods ,                   
##      fruit,                           
##      juice sat cord ms,               
##      margarine,                       
##      mens toiletries ,                
##      milk cream,                      
##      party snack foods ,              
##      razor blades ,                   
##      sauces gravy pkle,               
##      small goods ,                    
##      tissues paper prd ,              
##      vegetables,                      
##      wrapping}                       1
## [2] {canned fish meat ,               
##      canned fruit ,                   
##      canned vegetables ,              
##      deod disinfectant,               
##      deodorants soap,                 
##      frozen foods ,                   
##      fruit,                           
##      haircare,                        
##      laundry needs ,                  
##      milk cream,                      
##      pet foods ,                      
##      sauces gravy pkle,               
##      tissues paper prd ,              
##      vegetables}                     2
## [1] "ngCMatrix"
## attr(,"package")
## [1] "Matrix"
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.7    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 460 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[100 item(s), 4601 transaction(s)] done [0.01s].
## sorting and recoding items ... [47 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 6 7 done [0.06s].
## writing ... [15513 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
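The output above (per-transaction item lists, the inspected transactions, the ngCMatrix class, and the Apriori log) is consistent with a pipeline like the one sketched here using the arules package. The thresholds 0.1 and 0.7 and the name `supermarket_transactions` appear in the output; the toy data and the names `purchases` and `baskets` are assumptions, and the original code may have differed.

```r
library(arules)  # assumed to be installed

# Toy stand-in for the long-format data (TID, name).
purchases <- data.frame(
  TID  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 4L),
  name = c("bread", "milk", "biscuits",
           "bread", "milk",
           "bread", "biscuits", "milk",
           "milk", "tea"),
  stringsAsFactors = FALSE
)

# One character vector of items per transaction ...
baskets <- split(purchases$name, purchases$TID)

# ... coerced to the sparse 'transactions' class that apriori() expects.
supermarket_transactions <- as(baskets, "transactions")
inspect(supermarket_transactions[1:2])
class(supermarket_transactions@data)   # sparse item matrix from the Matrix package

# Mine rules with the thresholds shown in the parameter specification.
rules <- apriori(supermarket_transactions,
                 parameter = list(support = 0.1, confidence = 0.7))
summary(rules)
```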

Note: the computer could not handle supp 0.01, so the rest of the analysis uses supp 0.1.
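To see why the lower threshold is so much more demanding, it helps to turn each relative support into a transaction count. The sketch below uses the 4601 transactions reported in the log; the variable names are assumptions.

```r
n_transactions <- 4601           # from the Apriori log
min_support    <- c(0.1, 0.01)   # threshold used, and the one that was too heavy

# Smallest whole number of transactions with support >= threshold.
min_count <- ceiling(n_transactions * min_support)
setNames(min_count, paste0("supp_", min_support))
```

At supp 0.1 an itemset must occur in at least 461 transactions, which matches the minimum count in the rule summary (the log reports 460, apparently the truncated value of 460.1). At supp 0.01 only 47 transactions are needed, so far more itemsets qualify and have to be enumerated.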

## set of 15513 rules
## 
## rule length distribution (lhs + rhs):sizes
##    1    2    3    4    5    6    7 
##    1   92 1889 6093 5794 1570   74 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   4.000   4.000   4.456   5.000   7.000 
## 
## summary of quality measures:
##     support         confidence        coverage           lift      
##  Min.   :0.1002   Min.   :0.7000   Min.   :0.1098   Min.   :1.000  
##  1st Qu.:0.1074   1st Qu.:0.7364   1st Qu.:0.1378   1st Qu.:1.174  
##  Median :0.1178   Median :0.7689   Median :0.1528   Median :1.235  
##  Mean   :0.1291   Mean   :0.7788   Mean   :0.1668   Mean   :1.247  
##  3rd Qu.:0.1378   3rd Qu.:0.8164   3rd Qu.:0.1782   3rd Qu.:1.307  
##  Max.   :0.7238   Max.   :0.9205   Max.   :1.0000   Max.   :1.594  
##      count     
##  Min.   : 461  
##  1st Qu.: 494  
##  Median : 542  
##  Mean   : 594  
##  3rd Qu.: 634  
##  Max.   :3330  
## 
## mining info:
##                      data ntransactions support confidence
##  supermarket_transactions          4601     0.1        0.7

Rules Interpretation

##     lhs                     rhs                 support confidence  coverage     lift count
## [1] {biscuits,                                                                             
##      frozen foods ,                                                                        
##      milk cream,                                                                           
##      pet foods ,                                                                           
##      vegetables}         => {bread and cake } 0.1032384  0.9205426 0.1121495 1.271897   475
## [2] {baking needs ,                                                                        
##      biscuits,                                                                             
##      fruit,                                                                                
##      margarine,                                                                            
##      milk cream,                                                                           
##      vegetables}         => {bread and cake } 0.1008476  0.9188119 0.1097587 1.269506   464
## [3] {biscuits,                                                                             
##      frozen foods ,                                                                        
##      margarine,                                                                            
##      milk cream,                                                                           
##      vegetables}         => {bread and cake } 0.1167138  0.9179487 0.1271463 1.268313   537
## [4] {biscuits,                                                                             
##      canned vegetables ,                                                                   
##      frozen foods ,                                                                        
##      fruit,                                                                                
##      vegetables}         => {bread and cake } 0.1069333  0.9179104 0.1164964 1.268260   492
## [5] {baking needs ,                                                                        
##      frozen foods ,                                                                        
##      fruit,                                                                                
##      margarine,                                                                            
##      milk cream,                                                                           
##      vegetables}         => {bread and cake } 0.1030211  0.9168279 0.1123669 1.266764   474

A high Confidence value indicates how likely a customer is to buy another item given that they have already bought a particular set of items. The rule with the highest Confidence is {biscuits, frozen foods, milk cream, pet foods, vegetables} => {bread and cake}.

This means that when a customer buys those 5 items, they will very likely also buy bread and cake: of all transactions that contain the 5 antecedent items, a proportion of 0.9205 (92.05%) also contain bread and cake.
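The confidence figure can be checked by hand from the other columns of the same rule. In this sketch the count (475), coverage (0.1121495), and number of transactions (4601) are taken from the output; the antecedent count is derived from them, and the variable names are assumptions.

```r
n          <- 4601        # transactions, from the mining info
rule_count <- 475         # transactions containing antecedent AND consequent
coverage   <- 0.1121495   # share of transactions containing the antecedent

# Transactions that contain all 5 antecedent items.
lhs_count <- round(coverage * n)

# Confidence = P(consequent | antecedent).
confidence <- rule_count / lhs_count
c(lhs_count = lhs_count, confidence = confidence)
```

So 475 of the 516 transactions that contain the five antecedent items also contain bread and cake.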

The weakness of ranking rules by Confidence is that Confidence only looks at the transactions that contain the antecedent and ignores how often the consequent occurs in all other transactions. We therefore also need to ask how much the antecedent raises the probability that a customer buys the consequent when we know they have bought that set of items, compared with when we do not know it. For that reason, let us examine the 10 rules with the highest Lift.
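Lift makes this comparison explicit by dividing the Confidence of a rule by the overall support of its consequent. As a worked check for the top rule by Confidence, the sketch below uses the frequency of bread and cake (3330) and the number of transactions (4601) reported earlier; the variable names are assumptions.

```r
n           <- 4601         # transactions
confidence  <- 0.9205426    # top rule by Confidence
rhs_count   <- 3330         # transactions containing bread and cake

# Baseline probability of the consequent, ignoring the antecedent.
rhs_support <- rhs_count / n

# Lift > 1 means the antecedent raises the probability of the consequent.
lift <- confidence / rhs_support
round(c(rhs_support = rhs_support, lift = lift), 4)
```

Bread and cake already appears in about 72% of all transactions, which is why a Confidence of 92% translates into a Lift of only about 1.27.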

##      lhs                     rhs                    support confidence  coverage     lift count
## [1]  {baking needs ,                                                                           
##       biscuits,                                                                                
##       bread and cake ,                                                                         
##       juice sat cord ms,                                                                       
##       sauces gravy pkle}  => {party snack foods } 0.1010650  0.8072917 0.1251902 1.594141   465
## [2]  {laundry needs ,                                                                          
##       wrapping}           => {tissues paper prd } 0.1038905  0.7697262 0.1349707 1.576106   478
## [3]  {biscuits,                                                                                
##       bread and cake ,                                                                         
##       frozen foods ,                                                                           
##       juice sat cord ms,                                                                       
##       sauces gravy pkle}  => {party snack foods } 0.1056292  0.7928222 0.1332319 1.565569   486
## [4]  {frozen foods ,                                                                           
##       party snack foods ,                                                                      
##       prepared meals }    => {sauces gravy pkle}  0.1049772  0.7442219 0.1410563 1.555731   483
## [5]  {biscuits,                                                                                
##       margarine,                                                                               
##       wrapping}           => {tissues paper prd } 0.1006303  0.7565359 0.1330146 1.549097   463
## [6]  {baking needs ,                                                                           
##       biscuits,                                                                                
##       cheese,                                                                                  
##       tissues paper prd } => {margarine}          0.1019344  0.7701149 0.1323625 1.548645   469
## [7]  {biscuits,                                                                                
##       juice sat cord ms,                                                                       
##       sauces gravy pkle,                                                                       
##       tissues paper prd } => {party snack foods } 0.1025864  0.7840532 0.1308411 1.548253   472
## [8]  {baking needs ,                                                                           
##       bread and cake ,                                                                         
##       frozen foods ,                                                                           
##       juice sat cord ms,                                                                       
##       sauces gravy pkle}  => {party snack foods } 0.1028037  0.7831126 0.1312758 1.546395   473
## [9]  {baking needs ,                                                                           
##       biscuits,                                                                                
##       bread and cake ,                                                                         
##       laundry needs }     => {tissues paper prd } 0.1049772  0.7523364 0.1395349 1.540498   483
## [10] {biscuits,                                                                                
##       frozen foods ,                                                                           
##       juice sat cord ms,                                                                       
##       sauces gravy pkle}  => {party snack foods } 0.1225820  0.7800830 0.1571398 1.540413   564

Based on the results above, the rule {baking needs, biscuits, bread and cake, juice sat cord ms, sauces gravy pkle} => {party snack foods} has the largest Lift, 1.594141. Because this Lift is greater than 1, buying the antecedent items {baking needs, biscuits, bread and cake, juice sat cord ms, sauces gravy pkle} does increase the probability that the customer buys party snack foods. By comparison, the rule with the highest Confidence, {biscuits, frozen foods, milk cream, pet foods, vegetables} => {bread and cake}, has a Lift of only 1.271897. Although buying those five items raises the probability of buying bread and cake, the effect is smaller than for the rules with the highest Lift.

Among the generated rules, only a few have both high Confidence and high Lift; most are high on only one of the two. The top rule by Lift, for example, combines a Confidence of 0.807 with a Lift of 1.594. No rule has a Lift below 1: the minimum is 1.000, the value that the single length-1 rule (empty antecedent) takes by definition, and at least three quarters of the rules have a Lift of 1.174 or higher, so nearly all of the rules increase the probability of buying their consequent. All Confidence values are at least 0.7, which follows from the minimum confidence threshold used to generate the rules, and they reach 0.9205 at most.

The relationships among the generated rules can also be viewed as a graph or network, in which each circle (node) is a rule and the arrows connect the rule to its items.
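One way to draw such a graph is the plot method of the arulesViz package with method = "graph". The sketch below is an assumption about how this could be done, not the original code: it mines rules from a toy set of baskets, keeps the rules with the highest Lift, and plots them. The toy data, thresholds, and object names are all hypothetical.

```r
library(arules)     # assumed to be installed
library(arulesViz)  # assumed to be installed

# Toy baskets standing in for the supermarket transactions.
baskets <- list(
  c("bread", "milk", "biscuits"),
  c("bread", "milk"),
  c("bread", "biscuits", "milk"),
  c("milk", "tea")
)
trans <- as(baskets, "transactions")

# minlen = 2 drops rules with an empty antecedent, which add nothing to a graph.
rules <- apriori(trans,
                 parameter = list(support = 0.5, confidence = 0.7, minlen = 2),
                 control   = list(verbose = FALSE))

# A graph of thousands of rules is unreadable, so plot only the top rules by Lift.
top_rules <- head(sort(rules, by = "lift"), 10)
plot(top_rules, method = "graph")
```

With the 15513 rules from the real data, restricting the plot to a small subset in this way keeps the network legible.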