In this analysis, I want to share how to use Apriori and exploratory data analysis (EDA) for Market Basket Analysis. The data analyzed here was generated by me in Microsoft Excel and describes one day of transactions at a bakery shop. I generated two data sets: the transaction data and the customer data. The transaction data records each item that a customer bought and has 3 variables. The customer data describes each customer who visited the store and made a transaction, and has 9 variables.

The bakery shop is located in South Jakarta and also sells beverages. I want to understand how customers buy at this bakery shop using the Apriori method and EDA.

The first step is to load the packages needed for this project and import the data.

library(tidyverse)
library(lubridate)
library(viridis)
library(ggthemes)
library(gridExtra)
library(ggridges)
library(arules)
library(arulesViz)
library(dplyr)
library(ggplot2)
x1 <- read.csv("C:/Users/user/Documents/SEM 5/BISNIS ANALITIK/ETS BA/dp.csv",header=TRUE,sep=";") %>% mutate(Jam_Transaksi=hms(Jam_Transaksi))
x2 <- read.csv("C:/Users/user/Documents/SEM 5/BISNIS ANALITIK/ETS BA/dt.csv",header=TRUE,sep=";")

In the code above, I name the transaction data x1 and the customer data x2. The transaction data has a time variable that records when each item was bought. I parse it with hms() so that I can extract only the hour later, which makes the analysis easier. Before that, we must check whether the data has missing values.

print(sum(is.null(x2)))
## [1] 0
print(sum(is.null(x1)))
## [1] 0
x1[x1==0] <- NA
x2[x2==0] <- NA
print(sum(is.na(x2)))
## [1] 18
print(sum(is.na(x1)))
## [1] 0
missing_value <- subset(x2, is.na(x2$Banyak_Produk))
missing_value
##     Transaksi Jam_Transaksi Lama_Antrian Jenis_Kelamin Pembayaran Daerah_Rumah
## 31         31      10:31:46      0:01:23        FEMALE       CARD  Jkt_Selatan
## 35         35      10:49:53      0:00:18        FEMALE       CASH    Jkt_Pusat
## 38         38      10:51:16      0:00:01          MALE   E-WALLET   Tanggerang
## 47         47      11:13:09      0:00:21        FEMALE       CARD    Jkt_Barat
## 55         55      11:33:35      0:00:31        FEMALE   E-WALLET       Bekasi
## 63         63      11:58:29      0:06:31          MALE   E-WALLET   Tanggerang
## 68         68      12:02:33      0:04:36        FEMALE       CARD  Jkt_Selatan
## 76         76      12:21:37      0:02:50        FEMALE       CARD    Jkt_Barat
## 86         86      12:47:29      0:00:02        FEMALE       CARD  Jkt_Selatan
## 87         87      12:48:29      0:01:43        FEMALE       CARD    Jkt_Pusat
## 101       101      13:01:53      0:00:40        FEMALE       CARD    Jkt_Pusat
## 113       113      13:22:22      0:05:13          MALE   E-WALLET   Tanggerang
## 115       115      13:24:06      0:00:24        FEMALE       CARD  Jkt_Selatan
## 120       120      13:33:23      0:04:47        FEMALE   E-WALLET    Jkt_Barat
## 122       122      13:36:51      0:02:26        FEMALE       CARD    Jkt_Timur
## 177       177      15:44:11      0:10:20          MALE   E-WALLET   Tanggerang
## 181       181      15:50:45      0:08:55        FEMALE       CARD  Jkt_Selatan
## 190       190      18:16:18      0:01:51        FEMALE       CASH   Jkt _Utara
##     Type_Pelanggan Type_Dine Banyak_Produk Total_Harga
## 31      MEMBERSHIP    ONLINE            NA        Rp0 
## 35      MEMBERSHIP    ONLINE            NA        Rp0 
## 38          NORMAL TAKE_AWAY            NA        Rp0 
## 47          NORMAL    ONLINE            NA        Rp0 
## 55          NORMAL TAKE_AWAY            NA        Rp0 
## 63          NORMAL TAKE_AWAY            NA        Rp0 
## 68      MEMBERSHIP    ONLINE            NA        Rp0 
## 76          NORMAL    ONLINE            NA        Rp0 
## 86          NORMAL    ONLINE            NA        Rp0 
## 87      MEMBERSHIP    ONLINE            NA        Rp0 
## 101     MEMBERSHIP    ONLINE            NA        Rp0 
## 113         NORMAL TAKE_AWAY            NA        Rp0 
## 115     MEMBERSHIP    ONLINE            NA        Rp0 
## 120         NORMAL    ONLINE            NA        Rp0 
## 122         NORMAL    ONLINE            NA        Rp0 
## 177         NORMAL TAKE_AWAY            NA        Rp0 
## 181     MEMBERSHIP    ONLINE            NA        Rp0 
## 190     MEMBERSHIP    ONLINE            NA        Rp0

In the result above, the first check returns 0 because is.null() only tests whether the whole object is NULL, so it cannot detect missing values or zeros inside the data. I then used another syntax that recodes every 0 as NA, treating a zero as a missing value. After that, is.na() shows 18 missing values in x2, all in Banyak_Produk. The solution is to delete those rows from x2, because those customers did not make any transaction. However, x1 and x2 are connected, so every transaction removed from the customer data must also be removed from the transaction data.
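To make the difference between the two checks concrete, here is a minimal sketch in base R. The `toy` data frame and its values are made up for illustration and are not the bakery data; only the column names mirror x2.

```r
# Toy customer table (hypothetical values, not the bakery data)
toy <- data.frame(
  Transaksi     = 1:4,
  Banyak_Produk = c(3, 0, 2, 0)
)

# is.null() asks "is the whole object NULL?", so it yields a single FALSE
null_count <- sum(is.null(toy))

# Recode zero item counts as NA, then count missing cells
toy$Banyak_Produk[toy$Banyak_Produk == 0] <- NA
na_count      <- sum(is.na(toy))       # total missing cells
na_per_column <- colSums(is.na(toy))   # missing cells per column

print(null_count)
print(na_count)
print(na_per_column)
```

Counting per column with colSums() is a small design choice: it shows where the missing values sit, not only how many there are.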

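x2 <- drop_na(x2)
# drop() does not remove rows, so filter x1 by the transaction IDs kept in x2
# (assumes the ID column in x1 is spelled Transkasi, as in the code further down)
x1 <- x1[x1$Transkasi %in% x2$Transaksi, ]
print(x2) # to check whether any missing values remain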
##     Transaksi Jam_Transaksi Lama_Antrian Jenis_Kelamin Pembayaran Daerah_Rumah
## 1           1       8:05:27      0:00:00        FEMALE       CASH   Jkt _Utara
## 2           2       8:12:37      0:00:56        FEMALE   E-WALLET       Bekasi
## 3           3       8:22:52      0:11:04        FEMALE       CARD  Jkt_Selatan
## 4           4       8:35:17      0:00:38          MALE   E-WALLET   Tanggerang
## 5           5       8:40:39      0:00:53        FEMALE   E-WALLET       Bekasi
## 6           6       8:41:18      0:04:22        FEMALE   E-WALLET    Jkt_Barat
## 7           7       8:41:56      0:04:53        FEMALE       CARD  Jkt_Selatan
## 8           8       8:47:50      0:03:16          MALE   E-WALLET   Tanggerang
## 9           9       8:50:20      0:02:16          MALE   E-WALLET   Tanggerang
## 10         10       8:53:16      0:02:22          MALE   E-WALLET   Tanggerang
## 11         11       9:00:37      0:03:44        FEMALE       CARD  Jkt_Selatan
## 12         12       9:14:11      0:00:13          MALE   E-WALLET   Tanggerang
## 13         13       9:19:23      0:00:35          MALE   E-WALLET   Tanggerang
## 14         14       9:33:56      0:09:44        FEMALE   E-WALLET       Bekasi
## 15         15       9:35:00      0:08:38        FEMALE       CARD  Jkt_Selatan
## 16         16       9:35:45      0:08:30        FEMALE       CASH   Jkt _Utara
## 17         17       9:41:24      0:04:04        FEMALE       CASH    Jkt_Pusat
## 18         18       9:42:40      0:01:09        FEMALE       CARD  Jkt_Selatan
## 19         19       9:55:12      0:00:05        FEMALE       CARD    Jkt_Pusat
## 20         20      10:02:39      0:05:33        FEMALE       CARD  Jkt_Selatan
## 21         21      10:08:26      0:00:57          MALE   E-WALLET        Depok
## 22         22      10:09:29      0:03:48        FEMALE       CARD    Jkt_Barat
## 23         23      10:12:06      0:08:53        FEMALE       CARD    Jkt_Pusat
## 24         24      10:12:15      0:08:32        FEMALE       CASH   Jkt _Utara
## 25         25      10:14:11      0:02:56        FEMALE       CASH   Jkt _Utara
## 26         26      10:16:30      0:00:19        FEMALE       CARD  Jkt_Selatan
## 27         27      10:22:11      0:01:05        FEMALE   E-WALLET       Bekasi
## 28         28      10:23:26      0:08:05        FEMALE   E-WALLET       Bekasi
## 29         29      10:23:33      0:00:44        FEMALE   E-WALLET       Bekasi
## 30         30      10:26:59      0:03:36        FEMALE   E-WALLET       Bekasi
## 31         32      10:44:59      0:06:11          MALE   E-WALLET   Tanggerang
## 32         33      10:46:15      0:01:36        FEMALE       CARD    Jkt_Barat
## 33         34      10:46:42      0:03:21        FEMALE       CARD  Jkt_Selatan
## 34         36      10:49:58      0:06:25        FEMALE   E-WALLET       Bekasi
## 35         37      10:50:19      0:09:33          MALE   E-WALLET   Tanggerang
## 36         39      10:54:02      0:02:59        FEMALE   E-WALLET       Bekasi
## 37         40      10:57:33      0:01:30        FEMALE       CARD  Jkt_Selatan
## 38         41      10:58:54      0:05:00          MALE   E-WALLET   Tanggerang
## 39         42      11:01:26      0:00:54          MALE   E-WALLET   Tanggerang
## 40         43      11:03:36      0:02:44          MALE   E-WALLET        Depok
## 41         44      11:06:46      0:09:06          MALE   E-WALLET        Depok
## 42         45      11:09:16      0:00:28        FEMALE   E-WALLET       Bekasi
## 43         46      11:10:51      0:00:12        FEMALE       CARD    Jkt_Pusat
## 44         48      11:13:41      0:01:00        FEMALE       CARD  Jkt_Selatan
## 45         49      11:15:29      0:01:21          MALE   E-WALLET   Tanggerang
## 46         50      11:15:59      0:01:57        FEMALE       CARD  Jkt_Selatan
## 47         51      11:17:23      0:02:27          MALE   E-WALLET   Tanggerang
## 48         52      11:22:55      0:00:03          MALE   E-WALLET   Tanggerang
## 49         53      11:25:11      0:00:50          MALE   E-WALLET        Depok
## 50         54      11:29:49      0:02:30          MALE   E-WALLET        Depok
## 51         56      11:38:57      0:01:58        FEMALE       CARD    Jkt_Timur
## 52         57      11:39:22      0:01:01        FEMALE       CARD  Jkt_Selatan
## 53         58      11:41:26      0:04:21        FEMALE       CARD  Jkt_Selatan
## 54         59      11:48:37      0:05:42          MALE   E-WALLET   Tanggerang
## 55         60      11:50:19      0:03:50        FEMALE       CASH    Jkt_Pusat
## 56         61      11:51:37      0:02:33        FEMALE       CARD  Jkt_Selatan
## 57         62      11:56:28      0:05:56        FEMALE       CASH    Jkt_Pusat
## 58         64      11:59:14      0:04:11        FEMALE       CASH    Jkt_Pusat
## 59         65      11:59:54      0:03:51        FEMALE       CASH   Jkt _Utara
## 60         66      12:00:05      0:02:16          MALE   E-WALLET        Depok
## 61         67      12:00:38      0:02:01        FEMALE       CARD  Jkt_Selatan
## 62         69      12:02:42      0:06:38        FEMALE   E-WALLET       Bekasi
## 63         70      12:03:29      0:00:07          MALE   E-WALLET   Tanggerang
## 64         71      12:07:16      0:04:03          MALE   E-WALLET   Tanggerang
## 65         72      12:07:24      0:03:37          MALE   E-WALLET   Tanggerang
## 66         73      12:13:22      0:01:21        FEMALE   E-WALLET    Jkt_Barat
## 67         74      12:14:11      0:04:25        FEMALE       CARD  Jkt_Selatan
## 68         75      12:20:21      0:17:16          MALE   E-WALLET   Tanggerang
## 69         77      12:22:11      0:01:27        FEMALE       CARD    Jkt_Barat
## 70         78      12:24:41      0:10:57          MALE   E-WALLET        Depok
## 71         79      12:25:36      0:00:32          MALE   E-WALLET   Tanggerang
## 72         80      12:27:51      0:01:12        FEMALE       CASH    Jkt_Pusat
## 73         81      12:28:36      0:01:38        FEMALE   E-WALLET       Bekasi
## 74         82      12:38:09      0:01:50        FEMALE       CARD    Jkt_Timur
## 75         83      12:42:25      0:02:36          MALE   E-WALLET   Tanggerang
## 76         84      12:46:06      0:03:05        FEMALE   E-WALLET       Bekasi
## 77         85      12:46:10      0:01:43        FEMALE   E-WALLET       Bekasi
## 78         88      12:49:52      0:04:12        FEMALE       CARD  Jkt_Selatan
## 79         89      12:50:59      0:03:39        FEMALE       CARD    Jkt_Timur
## 80         90      12:51:08      0:02:25          MALE   E-WALLET   Tanggerang
## 81         91      12:51:38      0:03:48        FEMALE   E-WALLET       Bekasi
## 82         92      12:52:56      0:15:52          MALE   E-WALLET        Depok
## 83         93      12:54:46      0:02:28          MALE   E-WALLET   Tanggerang
## 84         94      12:55:32      0:02:04        FEMALE       CARD  Jkt_Selatan
## 85         95      12:56:07      0:06:55        FEMALE       CARD  Jkt_Selatan
## 86         96      12:57:45      0:03:32          MALE   E-WALLET   Tanggerang
## 87         97      12:58:17      0:01:46        FEMALE   E-WALLET    Jkt_Barat
## 88         98      12:59:28      0:03:37        FEMALE       CASH    Jkt_Pusat
## 89         99      13:00:10      0:06:33          MALE   E-WALLET   Tanggerang
## 90        100      13:01:51      0:00:48        FEMALE   E-WALLET    Jkt_Barat
## 91        102      13:02:22      0:00:52          MALE   E-WALLET        Depok
## 92        103      13:05:15      0:00:12        FEMALE       CARD    Jkt_Pusat
## 93        104      13:06:32      0:04:39          MALE   E-WALLET        Depok
## 94        105      13:08:29      0:02:32          MALE   E-WALLET   Tanggerang
## 95        106      13:10:43      0:00:16        FEMALE       CARD  Jkt_Selatan
## 96        107      13:11:08      0:02:51        FEMALE       CARD    Jkt_Timur
## 97        108      13:11:20      0:00:37        FEMALE       CASH    Jkt_Pusat
## 98        109      13:13:41      0:02:00          MALE   E-WALLET        Depok
## 99        110      13:15:29      0:04:36          MALE   E-WALLET   Tanggerang
## 100       111      13:15:43      0:01:28        FEMALE       CARD    Jkt_Pusat
## 101       112      13:20:44      0:01:17          MALE   E-WALLET   Tanggerang
## 102       114      13:23:53      0:00:23        FEMALE   E-WALLET       Bekasi
## 103       116      13:26:00      0:03:50        FEMALE       CARD  Jkt_Selatan
## 104       117      13:31:22      0:03:11          MALE   E-WALLET   Tanggerang
## 105       118      13:32:19      0:03:08        FEMALE   E-WALLET    Jkt_Barat
## 106       119      13:32:26      0:02:15        FEMALE       CARD    Jkt_Pusat
## 107       121      13:35:57      0:01:17          MALE   E-WALLET   Tanggerang
## 108       123      13:38:08      0:01:42        FEMALE       CARD  Jkt_Selatan
## 109       124      13:38:50      0:01:47        FEMALE   E-WALLET       Bekasi
## 110       125      13:40:49      0:00:16          MALE   E-WALLET   Tanggerang
## 111       126      13:40:49      0:01:49          MALE   E-WALLET   Tanggerang
## 112       127      13:42:55      0:12:02          MALE   E-WALLET   Tanggerang
## 113       128      13:43:15      0:03:28        FEMALE   E-WALLET    Jkt_Barat
## 114       129      13:49:35      0:03:26        FEMALE       CARD    Jkt_Pusat
## 115       130      13:49:47      0:07:26        FEMALE       CARD  Jkt_Selatan
## 116       131      13:50:14      0:07:38          MALE   E-WALLET   Tanggerang
## 117       132      13:53:05      0:01:34        FEMALE       CASH    Jkt_Pusat
## 118       133      13:53:58      0:00:18        FEMALE   E-WALLET       Bekasi
## 119       134      13:55:34      0:02:21          MALE   E-WALLET   Tanggerang
## 120       135      13:56:49      0:00:07        FEMALE       CARD    Jkt_Timur
## 121       136      13:57:26      0:05:58        FEMALE       CASH    Jkt_Pusat
## 122       137      13:58:31      0:00:20          MALE   E-WALLET        Depok
## 123       138      13:58:38      0:02:35          MALE   E-WALLET        Depok
## 124       139      13:59:56      0:00:48        FEMALE   E-WALLET       Bekasi
## 125       140      14:01:21      0:07:53        FEMALE       CARD  Jkt_Selatan
## 126       141      14:01:34      0:02:44        FEMALE   E-WALLET    Jkt_Barat
## 127       142      14:03:01      0:01:36        FEMALE       CARD  Jkt_Selatan
## 128       143      14:04:55      0:00:29          MALE   E-WALLET   Tanggerang
## 129       144      14:10:24      0:00:40        FEMALE       CARD  Jkt_Selatan
## 130       145      14:10:39      0:01:46        FEMALE       CARD    Jkt_Barat
## 131       146      14:20:16      0:02:42        FEMALE       CARD  Jkt_Selatan
## 132       147      14:20:48      0:01:12          MALE   E-WALLET        Depok
## 133       148      14:28:19      0:03:40        FEMALE       CARD  Jkt_Selatan
## 134       149      14:33:03      0:08:24          MALE   E-WALLET   Tanggerang
## 135       150      14:37:35      0:00:08        FEMALE   E-WALLET       Bekasi
## 136       151      14:39:41      0:01:38        FEMALE       CARD  Jkt_Selatan
## 137       152      14:41:26      0:03:57          MALE   E-WALLET   Tanggerang
## 138       153      14:42:40      0:12:04          MALE   E-WALLET   Tanggerang
## 139       154      14:48:13      0:04:05        FEMALE       CARD    Jkt_Timur
## 140       155      14:51:34      0:02:23        FEMALE   E-WALLET    Jkt_Barat
## 141       156      14:51:55      0:01:00          MALE   E-WALLET   Tanggerang
## 142       157      14:52:21      0:02:50        FEMALE       CASH    Jkt_Pusat
## 143       158      14:57:40      0:01:43          MALE   E-WALLET        Depok
## 144       159      14:57:52      0:03:12        FEMALE   E-WALLET       Bekasi
## 145       160      14:57:54      0:01:03        FEMALE       CASH    Jkt_Pusat
## 146       161      14:58:24      0:00:04        FEMALE   E-WALLET       Bekasi
## 147       162      15:01:49      0:14:08        FEMALE       CARD  Jkt_Selatan
## 148       163      15:07:52      0:03:18          MALE   E-WALLET   Tanggerang
## 149       164      15:11:30      0:05:12        FEMALE   E-WALLET       Bekasi
## 150       165      15:16:15      0:01:20          MALE   E-WALLET   Tanggerang
## 151       166      15:16:45      0:00:18        FEMALE       CARD  Jkt_Selatan
## 152       167      15:17:24      0:03:40          MALE   E-WALLET   Tanggerang
## 153       168      15:19:37      0:00:43        FEMALE       CARD    Jkt_Pusat
## 154       169      15:30:22      0:01:37        FEMALE       CASH    Jkt_Pusat
## 155       170      15:34:24      0:03:03          MALE   E-WALLET   Tanggerang
## 156       171      15:34:28      0:04:19        FEMALE       CARD  Jkt_Selatan
## 157       172      15:36:46      0:00:57          MALE   E-WALLET        Depok
## 158       173      15:37:03      0:03:48        FEMALE       CARD  Jkt_Selatan
## 159       174      15:38:04      0:03:53        FEMALE       CARD  Jkt_Selatan
## 160       175      15:42:26      0:04:26        FEMALE       CARD    Jkt_Pusat
## 161       176      15:42:36      0:02:30          MALE   E-WALLET   Tanggerang
## 162       178      15:44:39      0:01:52          MALE   E-WALLET   Tanggerang
## 163       179      15:44:48      0:08:59        FEMALE   E-WALLET       Bekasi
## 164       180      15:50:38      0:07:47          MALE   E-WALLET   Tanggerang
## 165       182      15:53:03      0:00:53        FEMALE       CARD    Jkt_Pusat
## 166       183      16:09:09      0:00:14        FEMALE       CASH   Jkt _Utara
## 167       184      16:18:50      0:01:04        FEMALE       CASH    Jkt_Pusat
## 168       185      17:03:11      0:01:09          MALE   E-WALLET   Tanggerang
## 169       186      17:18:48      0:02:06          MALE   E-WALLET   Tanggerang
## 170       187      17:28:42      0:08:58        FEMALE       CARD  Jkt_Selatan
## 171       188      17:50:06      0:01:04        FEMALE       CASH    Jkt_Pusat
## 172       189      18:12:47      0:00:16        FEMALE   E-WALLET       Bekasi
## 173       191      18:26:02      0:07:49        FEMALE       CASH    Jkt_Pusat
## 174       192      18:28:17      0:05:12        FEMALE       CARD    Jkt_Barat
## 175       193      18:31:33      0:00:05          MALE   E-WALLET   Tanggerang
## 176       194      18:34:43      0:01:21          MALE   E-WALLET   Tanggerang
## 177       195      18:38:03      0:09:54        FEMALE   E-WALLET       Bekasi
## 178       196      18:44:20      0:05:18        FEMALE       CASH    Jkt_Pusat
## 179       197      18:48:53      0:00:08        FEMALE       CARD    Jkt_Barat
## 180       198      18:53:12      0:00:42        FEMALE       CARD  Jkt_Selatan
## 181       199      18:54:34      0:01:20          MALE   E-WALLET   Tanggerang
##     Type_Pelanggan Type_Dine Banyak_Produk Total_Harga
## 1       MEMBERSHIP    ONLINE             5   Rp65,000 
## 2           NORMAL TAKE_AWAY             5   Rp51,000 
## 3       MEMBERSHIP    ONLINE             2   Rp18,000 
## 4           NORMAL TAKE_AWAY             7   Rp87,000 
## 5           NORMAL TAKE_AWAY             1   Rp10,000 
## 6           NORMAL    ONLINE             5   Rp52,000 
## 7           NORMAL    ONLINE             1   Rp10,000 
## 8           NORMAL   DINE_IN             1   Rp20,000 
## 9           NORMAL TAKE_AWAY             1   Rp20,000 
## 10          NORMAL TAKE_AWAY             6   Rp75,000 
## 11      MEMBERSHIP    ONLINE             3   Rp42,000 
## 12          NORMAL   DINE_IN             2   Rp28,000 
## 13          NORMAL TAKE_AWAY             2   Rp18,000 
## 14          NORMAL    ONLINE             2   Rp23,000 
## 15          NORMAL    ONLINE             4   Rp48,000 
## 16      MEMBERSHIP    ONLINE             2   Rp18,000 
## 17      MEMBERSHIP    ONLINE             3   Rp33,000 
## 18          NORMAL    ONLINE             2   Rp16,000 
## 19      MEMBERSHIP    ONLINE             3   Rp34,000 
## 20      MEMBERSHIP    ONLINE             3   Rp30,000 
## 21          NORMAL   DINE_IN             1   Rp10,000 
## 22          NORMAL    ONLINE             2   Rp25,000 
## 23      MEMBERSHIP    ONLINE             3   Rp32,000 
## 24      MEMBERSHIP    ONLINE             2   Rp18,000 
## 25      MEMBERSHIP    ONLINE             2   Rp30,000 
## 26          NORMAL    ONLINE             3   Rp33,000 
## 27          NORMAL    ONLINE             3   Rp35,000 
## 28          NORMAL TAKE_AWAY             1   Rp10,000 
## 29          NORMAL TAKE_AWAY             3   Rp33,000 
## 30          NORMAL TAKE_AWAY             8  Rp100,000 
## 31          NORMAL TAKE_AWAY             3   Rp38,000 
## 32          NORMAL    ONLINE             2   Rp23,000 
## 33          NORMAL    ONLINE             2   Rp20,000 
## 34          NORMAL TAKE_AWAY             3   Rp47,000 
## 35          NORMAL TAKE_AWAY             3   Rp45,000 
## 36          NORMAL    ONLINE             3   Rp26,000 
## 37          NORMAL    ONLINE             2   Rp25,000 
## 38          NORMAL   DINE_IN             1   Rp10,000 
## 39          NORMAL TAKE_AWAY             2   Rp20,000 
## 40          NORMAL   DINE_IN             4   Rp41,000 
## 41          NORMAL   DINE_IN             3   Rp35,000 
## 42          NORMAL TAKE_AWAY             3   Rp28,000 
## 43      MEMBERSHIP    ONLINE             2   Rp35,000 
## 44      MEMBERSHIP    ONLINE             2   Rp19,000 
## 45          NORMAL   DINE_IN             3   Rp50,000 
## 46      MEMBERSHIP    ONLINE             1   Rp15,000 
## 47          NORMAL TAKE_AWAY             2   Rp20,000 
## 48          NORMAL TAKE_AWAY             4   Rp50,000 
## 49          NORMAL   DINE_IN             1   Rp15,000 
## 50          NORMAL   DINE_IN             3   Rp28,000 
## 51          NORMAL    ONLINE             3   Rp38,000 
## 52      MEMBERSHIP    ONLINE             3   Rp28,000 
## 53      MEMBERSHIP    ONLINE             5   Rp47,000 
## 54          NORMAL TAKE_AWAY             3   Rp45,000 
## 55      MEMBERSHIP    ONLINE             2   Rp23,000 
## 56          NORMAL    ONLINE             2   Rp30,000 
## 57      MEMBERSHIP    ONLINE             4   Rp43,000 
## 58      MEMBERSHIP    ONLINE             3   Rp42,000 
## 59      MEMBERSHIP    ONLINE             3   Rp38,000 
## 60          NORMAL   DINE_IN             1   Rp10,000 
## 61      MEMBERSHIP    ONLINE             3   Rp28,000 
## 62          NORMAL TAKE_AWAY             4   Rp43,000 
## 63          NORMAL TAKE_AWAY             4   Rp43,000 
## 64          NORMAL TAKE_AWAY             1   Rp15,000 
## 65          NORMAL TAKE_AWAY             6   Rp74,000 
## 66          NORMAL    ONLINE             2   Rp16,000 
## 67      MEMBERSHIP    ONLINE             3   Rp40,000 
## 68          NORMAL TAKE_AWAY             5   Rp62,000 
## 69          NORMAL    ONLINE             6   Rp77,000 
## 70          NORMAL   DINE_IN             3   Rp35,000 
## 71          NORMAL TAKE_AWAY             2   Rp35,000 
## 72      MEMBERSHIP    ONLINE             1   Rp10,000 
## 73          NORMAL TAKE_AWAY             1   Rp20,000 
## 74          NORMAL    ONLINE             3   Rp31,000 
## 75          NORMAL TAKE_AWAY             4   Rp38,000 
## 76          NORMAL TAKE_AWAY             3   Rp28,000 
## 77          NORMAL TAKE_AWAY             6   Rp63,000 
## 78      MEMBERSHIP    ONLINE             3   Rp35,000 
## 79          NORMAL    ONLINE             2   Rp20,000 
## 80          NORMAL TAKE_AWAY             1   Rp15,000 
## 81          NORMAL    ONLINE             3   Rp35,000 
## 82          NORMAL   DINE_IN             4   Rp46,000 
## 83          NORMAL TAKE_AWAY             1   Rp15,000 
## 84          NORMAL    ONLINE             2   Rp35,000 
## 85      MEMBERSHIP    ONLINE             3   Rp35,000 
## 86          NORMAL TAKE_AWAY             1   Rp20,000 
## 87          NORMAL    ONLINE             1    Rp8,000 
## 88      MEMBERSHIP    ONLINE             3   Rp35,000 
## 89          NORMAL TAKE_AWAY             2   Rp30,000 
## 90          NORMAL    ONLINE             1   Rp10,000 
## 91          NORMAL   DINE_IN             6   Rp92,000 
## 92      MEMBERSHIP    ONLINE             2   Rp18,000 
## 93          NORMAL   DINE_IN             1   Rp15,000 
## 94          NORMAL TAKE_AWAY             5   Rp58,000 
## 95          NORMAL    ONLINE             2   Rp23,000 
## 96          NORMAL    ONLINE             2   Rp30,000 
## 97      MEMBERSHIP    ONLINE             5   Rp46,000 
## 98          NORMAL   DINE_IN             3   Rp35,000 
## 99          NORMAL TAKE_AWAY             2   Rp24,000 
## 100     MEMBERSHIP    ONLINE             1   Rp15,000 
## 101         NORMAL   DINE_IN             3   Rp38,000 
## 102         NORMAL TAKE_AWAY             6   Rp70,000 
## 103     MEMBERSHIP    ONLINE             3   Rp33,000 
## 104         NORMAL TAKE_AWAY             2   Rp16,000 
## 105         NORMAL    ONLINE             4   Rp43,000 
## 106     MEMBERSHIP    ONLINE             2   Rp27,000 
## 107         NORMAL TAKE_AWAY             1   Rp20,000 
## 108     MEMBERSHIP    ONLINE             3   Rp35,000 
## 109         NORMAL TAKE_AWAY             2   Rp23,000 
## 110         NORMAL TAKE_AWAY             4   Rp50,000 
## 111         NORMAL TAKE_AWAY             2   Rp18,000 
## 112         NORMAL TAKE_AWAY             1   Rp20,000 
## 113         NORMAL    ONLINE             2   Rp25,000 
## 114     MEMBERSHIP    ONLINE             3   Rp50,000 
## 115         NORMAL    ONLINE             4   Rp60,000 
## 116         NORMAL TAKE_AWAY             3   Rp45,000 
## 117     MEMBERSHIP    ONLINE             2   Rp27,000 
## 118         NORMAL    ONLINE             3   Rp33,000 
## 119         NORMAL TAKE_AWAY             2   Rp35,000 
## 120         NORMAL    ONLINE             2   Rp30,000 
## 121     MEMBERSHIP    ONLINE             2   Rp35,000 
## 122         NORMAL   DINE_IN             4   Rp43,000 
## 123         NORMAL   DINE_IN             2   Rp20,000 
## 124         NORMAL TAKE_AWAY             4   Rp50,000 
## 125         NORMAL    ONLINE             3   Rp37,000 
## 126         NORMAL    ONLINE             1   Rp10,000 
## 127         NORMAL    ONLINE             4   Rp46,000 
## 128         NORMAL   DINE_IN             1    Rp8,000 
## 129         NORMAL    ONLINE             5   Rp62,000 
## 130         NORMAL    ONLINE             1   Rp12,000 
## 131     MEMBERSHIP    ONLINE             2   Rp25,000 
## 132         NORMAL   DINE_IN             3   Rp30,000 
## 133     MEMBERSHIP    ONLINE             1   Rp15,000 
## 134         NORMAL TAKE_AWAY             6   Rp74,000 
## 135         NORMAL TAKE_AWAY             6   Rp63,000 
## 136     MEMBERSHIP    ONLINE             3   Rp37,000 
## 137         NORMAL TAKE_AWAY             5   Rp60,000 
## 138         NORMAL TAKE_AWAY             5   Rp50,000 
## 139         NORMAL    ONLINE             1   Rp15,000 
## 140         NORMAL    ONLINE             3   Rp40,000 
## 141         NORMAL   DINE_IN             6   Rp66,000 
## 142     MEMBERSHIP    ONLINE             4   Rp52,000 
## 143         NORMAL   DINE_IN             3   Rp29,000 
## 144         NORMAL TAKE_AWAY             1    Rp8,000 
## 145     MEMBERSHIP    ONLINE             5   Rp50,000 
## 146         NORMAL    ONLINE             3   Rp34,000 
## 147     MEMBERSHIP    ONLINE             2   Rp18,000 
## 148         NORMAL TAKE_AWAY             2   Rp18,000 
## 149         NORMAL TAKE_AWAY             2   Rp28,000 
## 150         NORMAL TAKE_AWAY             3   Rp33,000 
## 151     MEMBERSHIP    ONLINE             3   Rp41,000 
## 152         NORMAL TAKE_AWAY             1    Rp8,000 
## 153     MEMBERSHIP    ONLINE             3   Rp35,000 
## 154     MEMBERSHIP    ONLINE             2   Rp25,000 
## 155         NORMAL TAKE_AWAY             2   Rp18,000 
## 156         NORMAL    ONLINE             2   Rp20,000 
## 157         NORMAL   DINE_IN             1    Rp8,000 
## 158     MEMBERSHIP    ONLINE             1   Rp15,000 
## 159         NORMAL    ONLINE             3   Rp40,000 
## 160     MEMBERSHIP    ONLINE             3   Rp32,000 
## 161         NORMAL TAKE_AWAY             4   Rp45,000 
## 162         NORMAL TAKE_AWAY             4   Rp50,000 
## 163         NORMAL TAKE_AWAY             1    Rp8,000 
## 164         NORMAL TAKE_AWAY             4   Rp60,000 
## 165     MEMBERSHIP    ONLINE             2   Rp30,000 
## 166     MEMBERSHIP    ONLINE             1   Rp12,000 
## 167     MEMBERSHIP    ONLINE             2   Rp23,000 
## 168         NORMAL TAKE_AWAY             1   Rp15,000 
## 169         NORMAL   DINE_IN             2   Rp32,000 
## 170     MEMBERSHIP    ONLINE             4   Rp39,000 
## 171     MEMBERSHIP    ONLINE             4   Rp41,000 
## 172         NORMAL TAKE_AWAY             3   Rp35,000 
## 173     MEMBERSHIP    ONLINE             4   Rp48,000 
## 174         NORMAL    ONLINE             3   Rp28,000 
## 175         NORMAL TAKE_AWAY             4   Rp45,000 
## 176         NORMAL TAKE_AWAY             1   Rp15,000 
## 177         NORMAL TAKE_AWAY             1    Rp8,000 
## 178     MEMBERSHIP    ONLINE             2   Rp20,000 
## 179         NORMAL    ONLINE             1   Rp12,000 
## 180     MEMBERSHIP    ONLINE             1   Rp15,000 
## 181         NORMAL TAKE_AWAY             4   Rp38,000

As we can see, there are no more missing values, so we can continue the analysis.
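Because the two tables are linked by the transaction ID, it can help to confirm that they still agree after the cleaning. The sketch below uses two made-up tables (`toy_customers`, `toy_items`) with hypothetical values, and for simplicity spells the ID column `Transaksi` in both. It is one possible check, not necessarily how the real x1 and x2 should be handled.

```r
# Hypothetical customer table: transaction 2 has no item count
toy_customers <- data.frame(
  Transaksi     = c(1, 2, 3),
  Banyak_Produk = c(2, NA, 1)
)

# Hypothetical item table: one row per item bought
toy_items <- data.frame(
  Transaksi   = c(1, 1, 2, 3),
  Nama_Produk = c("Kopi", "Teh", "Kopi", "Teh")
)

# Drop customers with a missing item count
toy_customers <- toy_customers[!is.na(toy_customers$Banyak_Produk), ]

# Keep only the item rows whose transaction still exists in the customer table
toy_items <- toy_items[toy_items$Transaksi %in% toy_customers$Transaksi, ]

# Transactions that appear in the item table but not in the customer table
orphans <- setdiff(toy_items$Transaksi, toy_customers$Transaksi)

print(nrow(toy_items))
print(length(orphans))
```

If `orphans` is empty, every remaining item row belongs to a customer that was kept.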

1. ANALYZE TRANSACTION DATA

1.1 Transaction by Hour

The transaction data contains a transaction time (Jam_Transaksi). This variable can tell us at what hours the bakery is busy.

Grafik1 <- x1 %>%
  mutate(Hour = as.factor(hour(Jam_Transaksi))) %>% 
  group_by(Hour) %>% 
  summarise(Count = n()) %>% 
  ggplot(aes(x=Hour,y=Count,fill=Count))+
  theme_fivethirtyeight()+
  geom_bar(stat="identity")+
  ggtitle("Transaction by Hour")+
  theme(legend.position="none")
Grafik1

From the graph above, we can see that purchases increase from 10:00 until 13:00 and then keep decreasing from 13:00 until 16:00. So I interpret that most purchases happen around lunch time.
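The counting behind this chart can also be sketched in base R without any plotting. The time stamps in `toy_times` are made up for illustration; only the "H:MM:SS" format follows the data.

```r
# Hypothetical transaction times in the same "H:MM:SS" format as the data
toy_times <- c("8:05:27", "10:31:46", "12:02:33",
               "12:47:29", "12:21:37", "13:01:53")

# Keep everything before the first colon, i.e. the hour
hours <- as.integer(sub(":.*", "", toy_times))

# Number of rows per hour, and the hour with the highest count
per_hour  <- table(hours)
peak_hour <- as.integer(names(per_hour)[which.max(per_hour)])

print(per_hour)
print(peak_hour)
```

Note that which.max() returns only the first maximum, so ties between hours would need extra handling.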

1.2 Bakery Signature Product

Grafik2 <- x1 %>% 
  group_by(Nama_Produk) %>% 
  summarise(Count = n()) %>% 
  arrange(desc(Count)) %>%
  ggplot(aes(x=reorder(Nama_Produk,Count),y=Count,fill=Nama_Produk))+
  geom_bar(stat="identity")+
  coord_flip()+
  ggtitle("Bakery Signature Product")+
  labs(y= "", x = "Product")+
  theme(legend.position="none")
Grafik2

From the graph above, the most purchased product is Kopi (“Coffee”), followed by Teh (“Tea”), and so on.
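These product counts are closely related to what Apriori calls the support of an item: the share of transactions that contain it. Here is a minimal sketch on made-up data, where `toy_items` and the product "Roti" are hypothetical. Unlike the chart, which counts every item row, it counts each product at most once per transaction.

```r
# Hypothetical item table: one row per item bought
toy_items <- data.frame(
  Transkasi   = c(1, 1, 2, 3, 3, 4),
  Nama_Produk = c("Kopi", "Teh", "Kopi", "Kopi", "Roti", "Teh")
)

# Number of distinct transactions
n_trans <- length(unique(toy_items$Transkasi))

# Count each product at most once per transaction (duplicate rows removed)
item_counts <- table(unique(toy_items)$Nama_Produk)

# Support = transactions containing the product / all transactions
support <- sort(item_counts / n_trans, decreasing = TRUE)

print(support)
```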

1.3 Unique Transaction by Hour

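The customer data contains the total number of items bought by each customer (“Banyak_Produk”). From this we can see the mean number of products bought per customer at a given hour. The code below derives the same quantity from the transaction data, by dividing the number of item rows by the number of distinct transactions in each hour.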

Grafik3.1 <- x1 %>% 
  mutate(Hour = as.factor(hour(Jam_Transaksi))) %>% 
  group_by(Hour) %>% 
  summarise(Count = n())                      # item lines per hour

Grafik3.2 <- x1 %>% 
  mutate(Hour = as.factor(hour(Jam_Transaksi))) %>% 
  group_by(Hour) %>% 
  summarise(Count = n_distinct(Transkasi))    # distinct transactions per hour

Grafik3.3 <- data.frame(Grafik3.1, # Hour, total item lines
                   Grafik3.2[2], # unique transactions
                   Grafik3.1[2]/Grafik3.2[2])  # items per unique transaction
colnames(Grafik3.3) <- c("Hour","Line","Unique","Items.Trans")

Grafik3 <- 
  ggplot(Grafik3.3,aes(x=Hour,y=Items.Trans,fill=Items.Trans))+
  theme_fivethirtyeight()+
  geom_bar(stat="identity")+
  ggtitle("Unique Transaction by Hour")+
  theme(legend.position="none")+
  geom_text(aes(label=round(Items.Trans,1)), vjust=2)
Grafik3

The graph above shows the average number of items bought per transaction in each hour. The averages are highest at 08:00 and 14:00. At 08:00, customers probably buy more food and drinks than usual because they did not make or eat breakfast at home. At 14:00, people are having lunch and need to eat more, so they buy more than average.
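To make the "items per transaction" figure concrete, here is a minimal base R sketch of the same calculation. The data frame `toy_items` and its columns `trans_id` and `hour` are hypothetical and are not the bakery data.

```r
# Hypothetical toy line items, NOT the bakery data: one row per item sold
toy_items <- data.frame(
  trans_id = c(1, 1, 1, 2, 3, 3, 4),
  hour     = c(8, 8, 8, 8, 9, 9, 9)
)

# Number of item lines in each hour
lines_per_hour <- tapply(toy_items$trans_id, toy_items$hour, length)

# Number of distinct transactions in each hour
unique_per_hour <- tapply(toy_items$trans_id, toy_items$hour,
                          function(id) length(unique(id)))

# Average number of items per transaction in each hour
items_per_trans <- lines_per_hour / unique_per_hour
print(items_per_trans)
```

In this toy example, hour 8 has 4 item lines from 2 transactions (an average of 2), and hour 9 has 3 item lines from 2 transactions (an average of 1.5).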

2. ANALYZE CUSTOMER DATA

The customer data has many variables that we can explore, such as membership, payment type, and so on.

2.1 Membership

From the customer data, we know that the bakery has some member customers. This variable tells us how many members went to the bakery and made a purchase.

member <- x2 %>% 
  group_by(Type_Pelanggan) %>% 
  count() %>% 
  ungroup() %>% 
  mutate(per=`n`/sum(`n`)) %>% 
  arrange(desc(Type_Pelanggan))
member$label <- scales::percent(member$per)
ggplot(member, aes(x="", y=per, fill=Type_Pelanggan))+
  geom_bar(stat="identity", width = 1)+
  coord_polar("y", start=0)+
  theme_void()+
  geom_text(aes(x=1, y = cumsum(per) - per/2, label=label))

The pie chart above shows that 29% of the customers who went to the bakery and bought something were members, and the others were normal (non-member) customers.
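The label positions in the pie chart code come from the running total of the shares minus half of each share, which places every label at the middle of its slice. A minimal sketch with hypothetical shares (not the bakery figures):

```r
# Hypothetical shares, NOT the bakery figures
per <- c(0.5, 0.3, 0.2)

# Upper edge of each stacked slice
upper <- cumsum(per)

# Midpoint of each slice: upper edge minus half of the slice width
mid <- upper - per / 2
print(mid)
```

Here the slices end at 0.5, 0.8 and 1.0, so their midpoints are 0.25, 0.65 and 0.90.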

2.2 Type of payment

Besides the type of customer, the customer data also tells us which type of payment each customer used.

payment <- x2 %>% 
  group_by(Pembayaran) %>% 
  count() %>% 
  ungroup() %>% 
  mutate(per=`n`/sum(`n`)) %>% 
  arrange(desc(Pembayaran))
payment$label <- scales::percent(payment$per)
ggplot(payment, aes(x="", y=per, fill=Pembayaran))+
  geom_bar(stat="identity", width = 1)+
  coord_polar("y", start=0)+
  theme_void()+
  geom_text(aes(x=1, y = cumsum(per) - per/2, label=label))

Now we know that most customers paid with an e-wallet (GoPay, OVO, etc.): 56% of all customers on that day. Another 32% paid with a debit card, and the rest paid with cash.

2.3 Customer Domicile

The customer data also shows each customer's domicile. Now we can see where most of the buyers live.

Grafik6 <- x2 %>% # chart of customers' home area
  group_by(Daerah_Rumah) %>% 
  summarise(Count = n()) %>% 
  arrange(desc(Count)) %>%
  ggplot(aes(x=Daerah_Rumah,y=Count,fill=Daerah_Rumah))+
  geom_bar(stat="identity")+
  ggtitle("Customer Home Area")+
  theme(legend.position="none")
Grafik6

From the bar plot above, most buyers on that day were customers who live in Tangerang.

2.4 Customer Gender

The customer data also records each customer's gender, which can give us some new information.

jk <- x2 %>% 
  group_by(Jenis_Kelamin) %>% 
  count() %>% 
  ungroup() %>% 
  mutate(per=`n`/sum(`n`)) %>% 
  arrange(desc(Jenis_Kelamin))
jk$label <- scales::percent(jk$per)
ggplot(jk, aes(x="", y=per, fill=Jenis_Kelamin))+
  geom_bar(stat="identity", width = 1)+
  coord_polar("y", start=0)+
  theme_void()+
  geom_text(aes(x=1, y = cumsum(per) - per/2, label=label))

As you can see, most of the customers are female. With this insight, the bakery owner can boost sales by adding a promo like “TGIF (Thanks God Is Female)”, where female member customers get a promotion. This type of promo can increase both membership and sales.

2.5 Customer Dine type

The customer data also records each customer's dine type.

dt <- x2 %>% 
  group_by(Type_Dine) %>% 
  count() %>% 
  ungroup() %>% 
  mutate(per=`n`/sum(`n`)) %>% 
  arrange(desc(Type_Dine))
dt$label <- scales::percent(dt$per)
ggplot(dt, aes(x="", y=per, fill=Type_Dine))+
  geom_bar(stat="identity", width = 1)+
  coord_polar("y", start=0)+
  theme_void()+
  geom_text(aes(x=1, y = cumsum(per) - per/2, label=label))

From the pie chart above, most customers ordered online (GoFood and GrabFood).

MARKET BASKET ANALYSIS with APRIORI

First of all, let me explain what Apriori is for. Imagine a grocery shop whose customers include housewives, college students, and so on. Everyone has their own needs when they go grocery shopping: a housewife buys diapers and milk for her child, while a college student tends to buy chips and sweet drinks or food. Knowing these buying patterns can help increase sales in several ways. For example, suppose a pair of items, A and B, are frequently bought together. To boost sales, the store could put A and B on the same shelf so that buyers can easily see both, or it could offer a discount or promotion for buying A and B together.

The Apriori method has three common measures of association:

1.) Support is the share of all transactions that contain a given combination of items. The greater the support, the more often the items are purchased, so it shows how dominant an item or combination is across all transactions.

2.) Confidence is the probability that the products on the right-hand side of a rule are purchased, given that the products on the left-hand side are purchased.

3.) Lift is the confidence of a rule divided by the support of its right-hand side. It tells us whether the items are bought together more often than would be expected if they were independent. A lift of 1 indicates that the items are independent, a lift greater than 1 indicates a positive association, and a lift less than 1 indicates a negative association.
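The three measures can be computed by hand on a tiny example. The sketch below uses base R only; the baskets and the helper function `support()` are hypothetical and written for illustration, and they are not part of the arules package or the bakery data.

```r
# Hypothetical baskets, NOT the bakery data
baskets <- list(
  c("A", "B"),
  c("A", "B", "C"),
  c("A"),
  c("B", "C"),
  c("C")
)

# Hypothetical helper: share of baskets that contain every item in `items`
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

supp_A  <- support("A", baskets)           # 3 of 5 baskets
supp_B  <- support("B", baskets)           # 3 of 5 baskets
supp_AB <- support(c("A", "B"), baskets)   # 2 of 5 baskets

# Measures for the rule {A} => {B}
confidence <- supp_AB / supp_A    # share of A-baskets that also contain B
lift       <- confidence / supp_B # confidence relative to B's overall share

print(c(support = supp_AB, confidence = confidence, lift = lift))
```

For the rule {A} => {B} this gives a support of 0.4, a confidence of about 0.67, and a lift of about 1.11, so in this toy data B is bought slightly more often with A than on its own.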

y <- read.transactions("C:/Users/user/Documents/SEM 5/BISNIS ANALITIK/ETS BA/dp.csv",format="single",cols=c(1,3),sep=";")
rules_a <- apriori(y,parameter=list(support=0.01,confidence=.6,maxlen=3))
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.6    0.1    1 none FALSE            TRUE       5    0.01      1
##  maxlen target  ext
##       3  rules TRUE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 1 
## 
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[9 item(s), 182 transaction(s)] done [0.00s].
## sorting and recoding items ... [8 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.01s].
## writing ... [8 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
rules_a
## set of 8 rules
inspect(head(rules_a,by="support",n=3))
##     lhs                         rhs            support    confidence coverage  
## [1] {Roti Abon,Roti Cokelat} => {Kopi}         0.04395604 0.6153846  0.07142857
## [2] {Roti Abon,Roti Keju}    => {Roti Cokelat} 0.03296703 0.6666667  0.04945055
## [3] {Roti Abon,Roti Keju}    => {Kopi}         0.03296703 0.6666667  0.04945055
##     lift     count
## [1] 1.191489 8    
## [2] 1.733333 6    
## [3] 1.290780 6

The result above shows the 3 rules with the highest support, using a minimum support of 0.01, a minimum confidence of 0.6, and a maxlen (the maximum number of items in a rule) of 3. The first row tells us that customers who bought Roti Abon and Roti Cokelat that day also tended to buy Kopi, and 8 transactions contained this combination. The support is 0.04, the confidence is 0.62, and the lift is 1.2. The support is not very high because there are not many transactions, which is fine; the confidence is still above 60% and the lift is greater than 1. The other rules are interpreted in the same way.
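As a sanity check, the figures for rule [1] can be reproduced with simple arithmetic. The sketch below copies the numbers printed in the output above (182 transactions, count 8, and the reported support, confidence, coverage and lift); the variable names are only illustrative.

```r
# Figures copied from the apriori output above for rule [1]
n_trans    <- 182
count      <- 8
support    <- 0.04395604
confidence <- 0.6153846
coverage   <- 0.07142857
lift       <- 1.191489

# support = count / number of transactions
print(count / n_trans)

# coverage is the support of the left-hand side, so this is the number of
# transactions that contain both Roti Abon and Roti Cokelat
lhs_count <- round(coverage * n_trans)
print(lhs_count)

# confidence = count / number of transactions containing the left-hand side
print(count / lhs_count)

# lift = confidence / support of the right-hand side,
# so the support of Kopi can be backed out from the two
supp_kopi <- confidence / lift
print(supp_kopi)
```

So 13 transactions contain Roti Abon and Roti Cokelat, 8 of them also contain Kopi, and the implied support of Kopi is about 0.52, meaning Kopi appears in roughly half of all transactions.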

If we look carefully at the table, the right-hand side of most rules is either Kopi or Roti Cokelat. So, if the owner of the bakery wants to boost sales, my suggestion is a promotion: buy 2 (choose from Roti Abon, Roti Cokelat, or Roti Keju) and get 1 free item (Kopi). It would be good to offer the promotion to members only, so that customers who are not yet members become interested in joining.

Here are some graphs of the Apriori rules.

# Alternative plots that can also be used:
# plot(rules_a, method="paracoord", control=list(reorder=TRUE))
# plot(rules_a, method="two-key plot")
plot(rules_a, method="graph")

RECOMMENDATION & CONCLUSION

Here are some of my recommendations for the bakery shop:

1. The busy hours are around 12:00 and 13:00, so the bakery's staff should be on standby around those times.

2. To increase customer loyalty, hold a promo for every member who successfully invites friends to join as members.

3. Cooperate with various cashless payment providers to offer attractive discounts for new customers.

4. Establish new member-only promotions, or other promotions at the owner's discretion. My recommendations are:
  - Promotion: buy 2 (choose from Roti Abon, Roti Cokelat, or Roti Keju), get 1 Kopi
  - TGIF (Thanks God is Female): a discount for female members who shop in store