Workflow

Inputs

Association rule mining is an approach to discovering patterns of co-occurrence in a (large) dataset, by identifying entities that frequently appear together in a group. As an unsupervised learning technique, association rule mining can be used to identify novel patterns/relationships amongst entities in a large set of data.

This type of patterns are summarized by association rules, which predicts the occurrence of one or more entities based on the occurrences of other entities in a certain grouping, such as a transaction or an individual.

As usual, we will perform association rule mining on the IBM Telco customer churn dataset, to identify customer characteristics and purchasing behaviours that tend to appear together, which may be helpful in informing marketing and customer retention strategies.

Hope you enjoy your stay! :)

Prepare & inspect data Convert dataset to transaction format

Description

General workflow

There are two things we need to do before the dataset is ready for association rule mining:

There are two continuous variables whose relationship to customer churn we are interested in: MonthlyCharges and Tenure. We will ignore the TotalCharges variable here, as it is a product of these two. To make use of them, we need to discretize them into categorical variables. However, simply dividing up the values into equal sized bins would almost definitely result in information loss. To obtain the most informative binning, we will use a supervised discretization function from the arulesCBA package, which identifies bin breaks that retain the most predictive power with respect to the target variable that we are interested in, Churn.
The arules package requires the input dataset to be in the transaction format. In our case, this means that each row corresponds to a single customer, listing all their personal characteristics and purchasing behaviours as comma-separated items.

Once all of this is done, we can look at the prevalence of each of these characteristics or behaviours is amongst the customers. For example, month-to-month contracts and paperless billing are the most common.

## Import libraries
library(arulesCBA)
library(arules)
library(ggplot2)

## Discretize "MonthlyCharges" with respect to "Churn"/"No Churn" 
df$Binned_MonthlyCharges <- discretizeDF.supervised(Churn ~ ., 
                                                    df[, c('MonthlyCharges', 'Churn')], 
                                                    method='mdlp')$MonthlyCharges

## Discretize "Tenure" with respect to "Churn"/"No Churn" 
df$Binned_Tenure <- discretizeDF.supervised(Churn ~ ., 
                                            df[, c('Tenure', 'Churn')], 
                                            method='mdlp')$Tenure

## Convert dataframe to transaction format
tData <- read.transactions(df, 
                           format = "basket", 
                           sep = ",", 
                           header=TRUE)

## Get item frequency
x <- data.frame(sort(table(unlist(LIST(tData))), 
                     decreasing=TRUE))

## Plot
ggplot(data=x, aes(x=factor(x$Var1), y=x$Freq)) + 
      geom_col()

Generate & filter rules Extract the most informative rules by measures of “interestingness” and statistical significance

	LHS	RHS	support	confidence	lift	count
751	1-1.5m,MultipleLines,Partner	Churn	0.0015643	0.9166667	3.448903	11
7551	1-1.5m,DeviceProtection,Electronic check,Female	Churn	0.0015643	0.9166667	3.448903	11
26875	1.5-5.5m,Bank transfer (automatic),MultipleLines,PaperlessBilling	Churn	0.0015643	0.9166667	3.448903	11
26889	$68.8-107,1.5-5.5m,Bank transfer (automatic),Male	Churn	0.0015643	0.9166667	3.448903	11
7524	1-1.5m,Electronic check,OnlineSecurity,PaperlessBilling	Churn	0.0014221	1.0000000	3.762440	10
7549	1-1.5m,DeviceProtection,Electronic check,StreamingMovies	Churn	0.0014221	0.9090909	3.420400	10
7607	1-1.5m,DeviceProtection,Month-to-month,StreamingTV	Churn	0.0014221	0.9090909	3.420400	10
7613	1-1.5m,DeviceProtection,Month-to-month,StreamingMovies	Churn	0.0014221	0.9090909	3.420400	10
26344	1.5-5.5m,MultipleLines,Partner,SeniorCitizen	Churn	0.0014221	0.9090909	3.420400	10
26741	1.5-5.5m,Credit card (automatic),Fiber optic,MultipleLines	Churn	0.0014221	0.9090909	3.420400	10
26744	$68.8-107,1.5-5.5m,Credit card (automatic),MultipleLines	Churn	0.0014221	0.9090909	3.420400	10
26863	1.5-5.5m,Bank transfer (automatic),DeviceProtection,Fiber optic	Churn	0.0014221	1.0000000	3.762440	10
26868	1.5-5.5m,Bank transfer (automatic),Fiber optic,StreamingMovies	Churn	0.0014221	1.0000000	3.762440	10
26870	$68.8-107,1.5-5.5m,Bank transfer (automatic),StreamingMovies	Churn	0.0014221	0.9090909	3.420400	10
7351	1-1.5m,Fiber optic,Mailed check,Partner	Churn	0.0012799	1.0000000	3.762440	9
7364	$68.8-107,1-1.5m,Mailed check,Partner	Churn	0.0012799	1.0000000	3.762440	9
7547	1-1.5m,DeviceProtection,Electronic check,StreamingTV	Churn	0.0012799	0.9000000	3.386196	9
7605	1-1.5m,DeviceProtection,Fiber optic,StreamingTV	Churn	0.0012799	1.0000000	3.762440	9
7670	1-1.5m,Male,Partner,StreamingTV	Churn	0.0012799	0.9000000	3.386196	9
26333	1.5-5.5m,DeviceProtection,MultipleLines,SeniorCitizen	Churn	0.0012799	1.0000000	3.762440	9

Description

General workflow

The Apriori algorithm is most commonly used for association rule mining. We can set various parameters to limit the number of rules created from the dataset, usually how often the rule is observed (support), how often it is true (confidence) and the minimum/maximum length of the rule. As an example, let’s generate rules that appear in at least 0.1% of customers, holds true 90% of the time, and contain 3-5 “items”.

The number of rules obtained at this step will almost certainly be too numerous to provide insights. We can further filter the rules by their lift, which is a measure of how much more or less likely the items in a given rule appear together as compared to by chance. Therefore, it is a metric of the importance of a rule. In addition, redundant and statistically insignificant rules should also be removed to leave only the most informative ones. Finally, as we are most interested in customer churn, we can extract only the rules that contain “Churn” in the right-hand side.

After all this is done, we can then convert the association rules to a dataframe for easy inspection and take a look. “LHS” and “RHS” refer to items on the left- and right-hand side of each rule, respectively. “Count” gives how many instances, whether it be transactions or customers, in which the rule appears. In our case, the “count” of each rule divided by the total number of customers in the dataset equals its support.

## Import libraries
library(arules)

## Convert dataframe to transaction format
tData <- read.transactions(df, 
                           format = "basket", 
                           sep = ",", 
                           header=TRUE)

## Create rules
rules <- apriori(tData, 
                 parameter = list(supp = 0.001, 
                                  conf=0.9,      
                                  minlen=3,      
                                  maxlen=5))   

## Filter rules by lift
filtered_rules <- subset(rules, subset = lift > 1.5)

## Filter out redundant rules
nonr_rules <- filtered_rules[!is.redundant(filtered_rules)]   

## Filter out statistically insignificant rules
sig_rules <- nonr_rules[!is.significant(nonr_rules, tData, 
                                        method = "fisher", 
                                        adjust = 'bonferroni')]

## Extract rules that have "Churn" as consequent 
churn_rules <- subset(sig_rules, 
                      subset=rhs %pin% 'Churn')

## Convert rules matrix to dataframe
churn_rules_df <- DATAFRAME(churn_rules, 
                            setStart='', setEnd='', 
                            separate = TRUE)

Grouped matrix Overview of the association rules

Description

General workflow

The grouped matrix plot provides a birds eye overview of the most informative rules. The antecedents of each rule are grouped in the columns using clustering. Each group is labeled with the “items” that are most prevalent in its antecedent. Balloons in the matrix are used to represent with what consequent the antecedents are connected.

From this plot, we can see that purchasing fiber optic internet service and having a month-to-month contract are associated with higher likelihood of the customer leaving the company.

However, its utility is limited precisely by how distilled the information is. So, that is where the circular grouped plot comes in, as we will see next.

## Import libraries
library(arules)
library(arulesViz)

## Convert dataframe to transaction format
tData <- read.transactions(df, 
                           format = "basket", 
                           sep = ",", 
                           header=TRUE)

## Create rules
rules <- apriori(tData, 
                 parameter = list(supp = 0.001, 
                                  conf=0.9,      
                                  minlen=3,      
                                  maxlen=5))   

## Plot
plot(rules, method="grouped", 
     control=list(main=NULL))

Circular grouped plot More granular summary of the association rules

Description

General workflow

The circular grouped plot provides a more detailed summary of the rules than the grouped matrix, but obviously at the cost of the number of rules that can be visualized before it gets too cluttered.

Looking at customer characteristics and purchasing behaviours that have the most number of arrows stemming from it, we see that customers with short tenures (1-5.5 months), dependents and purchased fiber optic internet service seem to be the most likely to leave the company. This is consistent with what we saw in the grouped matrix plot previously.

These insights can help to inform new marketing campaigns and retention strategies, such as offering discounted year-long contracts to new customers, create family plans, and make the internet service package more competitive.

## Import libraries
library(arules)
library(arulesViz)

## Convert dataframe to transaction format
tData <- read.transactions(df, 
                           format = "basket", 
                           sep = ",", 
                           header=TRUE)

## Create rules
rules <- apriori(tData, 
                 parameter = list(supp = 0.001, 
                                  conf=0.9,      
                                  minlen=3,      
                                  maxlen=5))   

## Plot
plot(rules, method="graph",
     control=list(main=NULL, 
                  layout=igraph::in_circle()))

Interactive network graph User-friendly filtering and exploration of association rules

Description

General workflow

With a handy dropdown menu, this plot allows interactive exploration of rules that contain specific items in this antecedent.

In addition, the zoom-in/out, drag-and-drop and highlight-upon-hovering capabilities make it easy to zero in on something of interest.

Feel free to play!

## Import libraries
library(arules)
library(arulesViz)

## Convert dataframe to transaction format
tData <- read.transactions(df, 
                           format = "basket", 
                           sep = ",", 
                           header=TRUE)

## Create rules
rules <- apriori(tData, 
                 parameter = list(supp = 0.001, 
                                  conf=0.9,      
                                  minlen=3,      
                                  maxlen=5))   

## Plot
plot(churn_rules, method = "graph", 
     control=list(main=NULL),  
     engine = "htmlwidget")

Shiny app A light-weight all-in-one exploration and recommender tool

A shiny app

Session info

Column

R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14.2

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_IE.UTF-8/en_IE.UTF-8/en_IE.UTF-8/C/en_IE.UTF-8/en_IE.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] arulesViz_1.3-3       ggplot2_3.2.1         arulesCBA_1.1.4      
 [4] discretization_1.0-1  arules_1.6-4          Matrix_1.2-10        
 [7] dplyr_0.8.0.1         plyr_1.8.4            kableExtra_1.1.0     
[10] knitr_1.21            flexdashboard_0.5.1.1

loaded via a namespace (and not attached):
 [1] viridis_0.5.1        httr_1.2.1           tidyr_0.6.3         
 [4] jsonlite_1.6         viridisLite_0.3.0    foreach_1.4.3       
 [7] gtools_3.8.1         assertthat_0.2.0     highr_0.6           
[10] yaml_2.2.0           pillar_1.3.1         lattice_0.20-35     
[13] glue_1.3.0           digest_0.6.18        rvest_0.3.2         
[16] colorspace_1.3-2     htmltools_0.3.6      pkgconfig_2.0.2     
[19] purrr_0.2.2.2        scales_1.0.0         webshot_0.5.1       
[22] gdata_2.18.0         tibble_2.0.1         DT_0.2              
[25] withr_2.1.2          lazyeval_0.2.0       magrittr_1.5        
[28] crayon_1.3.4         evaluate_0.10.1      MASS_7.3-47         
[31] gplots_3.0.1.1       xml2_1.2.0           tools_3.4.1         
[34] registry_0.5-1       data.table_1.10.4    hms_0.3             
[37] stringr_1.2.0        plotly_4.9.0         munsell_0.5.0       
[40] cluster_2.0.6        compiler_3.4.1       vcd_1.4-4           
[43] caTools_1.17.1.1     rlang_0.3.1          iterators_1.0.8     
[46] rstudioapi_0.6       htmlwidgets_1.3      visNetwork_2.0.8    
[49] igraph_1.2.4         bitops_1.0-6         labeling_0.3        
[52] rmarkdown_1.11       testthat_2.2.1       gtable_0.2.0        
[55] codetools_0.2-15     TSP_1.1-7            R6_2.2.2            
[58] seriation_1.2-8      gridExtra_2.2.1      zoo_1.8-0           
[61] KernSmooth_2.23-15   dendextend_1.12.0    readr_1.1.1         
[64] stringi_1.1.5        Rcpp_1.0.0           gclus_1.3.2         
[67] scatterplot3d_0.3-41 tidyselect_0.2.5     xfun_0.4            
[70] lmtest_0.9-35

--- title: "Association rule mining" output: flexdashboard::flex_dashboard: social: menu source_code: embed theme: united --- ```{r setup, include=FALSE} library(flexdashboard) library(knitr) library(kableExtra) knitr::opts_chunk$set(cache = TRUE, warning = FALSE, message = FALSE) ``` Workflow {.storyboard} ========================================= Inputs {.sidebar} ------------------------------------- Association rule mining is an approach to discovering patterns of co-occurrence in a (large) dataset, by identifying entities that frequently appear together in a group. As an unsupervised learning technique, association rule mining can be used to identify novel patterns/relationships amongst entities in a large set of data. This type of patterns are summarized by **association rules**, which predicts the occurrence of one or more entities based on the occurrences of other entities in a certain grouping, such as a transaction or an individual. As usual, we will perform association rule mining on the [IBM Telco customer churn dataset](https://developer.ibm.com/patterns/predict-customer-churn-using-watson-studio-and-jupyter-notebooks/), to identify customer characteristics and purchasing behaviours that tend to appear together, which may be helpful in informing marketing and customer retention strategies. Hope you enjoy your stay! :) ### **Prepare & inspect data** Convert dataset to transaction format {data-commentary-width=550} ```{r} ## Import library library(plyr) library(dplyr) library(arulesCBA) library(arules) library(ggplot2) ## Import data that has been cleaned up df <-read.csv("https://github.com/nchelaru/data-prep/raw/master/telco_cleaned_yes_no.csv") ## Discretize "MonthlyCharges" with respect to "Churn"/"No Churn" label and assign to new column in dataframe df$Binned_MonthlyCharges <- discretizeDF.supervised(Churn ~ ., df[, c('MonthlyCharges', 'Churn')], method='mdlp')$MonthlyCharges ## Rename the levels based on knowledge of min/max monthly charges df$Binned_MonthlyCharges = revalue(df$Binned_MonthlyCharges, c("[-Inf,29.4)"="$0-29.4", "[29.4,56)"="$29.4-56", "[56,68.8)"="$56-68.8", "[68.8,107)"="$68.8-107", "[107, Inf]" = "$107-118.75")) ## Discretize "Tenure" with respect to "Churn"/"No Churn" label and assign to new column in dataframe df$Binned_Tenure <- discretizeDF.supervised(Churn ~ ., df[, c('Tenure', 'Churn')], method='mdlp')$Tenure ## Rename the levels based on knowledge of min/max tenures df$Binned_Tenure = revalue(df$Binned_Tenure, c("[-Inf,1.5)"="1-1.5m", "[1.5,5.5)"="1.5-5.5m", "[5.5,17.5)"="5.5-17.5m", "[17.5,43.5)"="17.5-43.5m", "[43.5,59.5)"="43.5-59.5m", "[59.5,70.5)"="59.5-70.5m", "[70.5, Inf]"="70.5-72m")) ## Replace "No"s with empty values df[df=="No"]<-NA df[] <- lapply(df, function(x) levels(x)[x]) ## Replace "Yes"s with the column name w <- which(df == "Yes", arr.ind = TRUE) df[w] <- names(df)[w[,"col"]] ## Output to CSV write.csv(df, './final_df.csv', row.names=FALSE) ## Convert dataframe to transaction format tData <- read.transactions('./final_df.csv', format = "basket", sep = ",", header=TRUE) ## Get item frequency x <- data.frame(sort(table(unlist(LIST(tData))), decreasing=TRUE)) ## Plot ggplot(data=x, aes(x=factor(x$Var1), y=x$Freq)) + geom_col() + theme(axis.text.x = element_text(angle = 45, hjust=1), panel.grid.major = element_blank(), panel.background = element_blank()) + ylab('Frequency') + xlab('Items') ``` *** Description General workflow There are two things we need to do before the dataset is ready for association rule mining: 1. There are two continuous variables whose relationship to customer churn we are interested in: `MonthlyCharges` and `Tenure`. We will ignore the TotalCharges variable here, as it is a product of these two. To make use of them, we need to **discretize** them into categorical variables. However, simply dividing up the values into equal sized bins would almost definitely result in information loss. To obtain the most informative binning, we will use a **supervised discretization** function from the `arulesCBA` package, which identifies bin breaks that retain the most predictive power with respect to the target variable that we are interested in, `Churn`. 2. The `arules` package requires the input dataset to be in the **transaction** format. In our case, this means that each row corresponds to a single customer, listing all their personal characteristics and purchasing behaviours as comma-separated items. Once all of this is done, we can look at the prevalence of each of these characteristics or behaviours is amongst the customers. For example, month-to-month contracts and paperless billing are the most common. ```{r eval=F, echo=T} ## Import libraries library(arulesCBA) library(arules) library(ggplot2) ## Discretize "MonthlyCharges" with respect to "Churn"/"No Churn" df$Binned_MonthlyCharges <- discretizeDF.supervised(Churn ~ ., df[, c('MonthlyCharges', 'Churn')], method='mdlp')$MonthlyCharges ## Discretize "Tenure" with respect to "Churn"/"No Churn" df$Binned_Tenure <- discretizeDF.supervised(Churn ~ ., df[, c('Tenure', 'Churn')], method='mdlp')$Tenure ## Convert dataframe to transaction format tData <- read.transactions(df, format = "basket", sep = ",", header=TRUE) ## Get item frequency x <- data.frame(sort(table(unlist(LIST(tData))), decreasing=TRUE)) ## Plot ggplot(data=x, aes(x=factor(x$Var1), y=x$Freq)) + geom_col() ``` ### **Generate & filter rules** Extract the most informative rules by measures of "interestingness" and statistical significance {data-commentary-width=500} ```{r message=FALSE, warning=FALSE, include=FALSE} ## Import libraries library(knitr) library(kableExtra) ## Create rules rules <- apriori(tData, parameter = list(supp = 0.001, conf=0.9, minlen=3, maxlen=5)) ## Filter rules by lift filtered_rules <- subset(rules, subset = lift > 1.5) ## Filter out redundant rules nonr_rules <- filtered_rules[!is.redundant(filtered_rules)] ## Filter out statistically insignificant rules sig_rules <- nonr_rules[!is.significant(nonr_rules, tData, method = "fisher", adjust = 'bonferroni')] ## Extract rules that have "Churn" as consequent churn_rules <- subset(sig_rules, subset=rhs %pin% 'Churn') ## Convert rules matrix to dataframe churn_rules_df <- DATAFRAME(churn_rules, setStart='', setEnd='', separate = TRUE) ``` ```{r} ## Sort the rules by how often they are true, which is measured by "support" kable(head(churn_rules_df[order(-churn_rules_df$support),], 20), format='html', escape=F) %>% kable_styling(full_width=T, bootstrap_options = c("striped", "hover")) ``` *** Description General workflow The Apriori algorithm is most commonly used for association rule mining. We can set various parameters to limit the number of rules created from the dataset, usually how often the rule is observed (**support**), how often it is true (**confidence**) and the minimum/maximum length of the rule. As an example, let’s generate rules that appear in at least 0.1% of customers, holds true 90% of the time, and contain 3-5 “items”. The number of rules obtained at this step will almost certainly be too numerous to provide insights. We can further filter the rules by their **lift**, which is a measure of how much more or less likely the items in a given rule appear together as compared to by chance. Therefore, it is a metric of the importance of a rule. In addition, redundant and statistically insignificant rules should also be removed to leave only the most informative ones. Finally, as we are most interested in customer churn, we can extract *only* the rules that contain "Churn" in the right-hand side. After all this is done, we can then convert the association rules to a dataframe for easy inspection and take a look. “LHS” and “RHS” refer to items on the left- and right-hand side of each rule, respectively. “Count” gives how many instances, whether it be transactions or customers, in which the rule appears. In our case, the “count” of each rule divided by the total number of customers in the dataset equals its support. ```{r, eval=F, echo=T} ## Import libraries library(arules) ## Convert dataframe to transaction format tData <- read.transactions(df, format = "basket", sep = ",", header=TRUE) ## Create rules rules <- apriori(tData, parameter = list(supp = 0.001, conf=0.9, minlen=3, maxlen=5)) ## Filter rules by lift filtered_rules <- subset(rules, subset = lift > 1.5) ## Filter out redundant rules nonr_rules <- filtered_rules[!is.redundant(filtered_rules)] ## Filter out statistically insignificant rules sig_rules <- nonr_rules[!is.significant(nonr_rules, tData, method = "fisher", adjust = 'bonferroni')] ## Extract rules that have "Churn" as consequent churn_rules <- subset(sig_rules, subset=rhs %pin% 'Churn') ## Convert rules matrix to dataframe churn_rules_df <- DATAFRAME(churn_rules, setStart='', setEnd='', separate = TRUE) ``` ### **Grouped matrix** Overview of the association rules {data-commentary-width=550} ```{r} ## Import libraries library(ggplot2) library(arulesViz) ## Grouped matrix plot(churn_rules, method="grouped", control=list(main=NULL)) ``` *** Description General workflow The grouped matrix plot provides a birds eye overview of the most informative rules. The antecedents of each rule are grouped in the columns using clustering. Each group is labeled with the "items" that are most prevalent in its antecedent. Balloons in the matrix are used to represent with what consequent the antecedents are connected. From this plot, we can see that purchasing fiber optic internet service and having a month-to-month contract are associated with higher likelihood of the customer leaving the company. However, its utility is limited precisely by how distilled the information is. So, that is where the circular grouped plot comes in, as we will see next. ```{r, eval=F, echo=T} ## Import libraries library(arules) library(arulesViz) ## Convert dataframe to transaction format tData <- read.transactions(df, format = "basket", sep = ",", header=TRUE) ## Create rules rules <- apriori(tData, parameter = list(supp = 0.001, conf=0.9, minlen=3, maxlen=5)) ## Plot plot(rules, method="grouped", control=list(main=NULL)) ``` ### **Circular grouped plot** More granular summary of the association rules {data-commentary-width=450} ```{r, fig.height=15, fig.width=17} plot(churn_rules, method="graph", cex=1.7, control=list(main=NULL, layout=igraph::in_circle())) ``` *** Description General workflow The circular grouped plot provides a more detailed summary of the rules than the grouped matrix, but obviously at the cost of the number of rules that can be visualized before it gets too cluttered. Looking at customer characteristics and purchasing behaviours that have the most number of arrows stemming from it, we see that customers with short tenures (1-5.5 months), dependents and purchased fiber optic internet service seem to be the most likely to leave the company. This is consistent with what we saw in the grouped matrix plot previously. These insights can help to inform new marketing campaigns and retention strategies, such as offering discounted year-long contracts to new customers, create family plans, and make the internet service package more competitive. ```{r, eval=F, echo=T} ## Import libraries library(arules) library(arulesViz) ## Convert dataframe to transaction format tData <- read.transactions(df, format = "basket", sep = ",", header=TRUE) ## Create rules rules <- apriori(tData, parameter = list(supp = 0.001, conf=0.9, minlen=3, maxlen=5)) ## Plot plot(rules, method="graph", control=list(main=NULL, layout=igraph::in_circle())) ``` ### **Interactive network graph** User-friendly filtering and exploration of association rules {data-commentary-width=450} ```{r message=FALSE, warning=FALSE, echo=F, include=FALSE, fig.width=10, fig.height=10} x <- plot(churn_rules, method = "graph", control=list(main=NULL), engine = "htmlwidget") ``` ```{r} x ``` *** Description General workflow With a handy dropdown menu, this plot allows interactive exploration of rules that contain specific items in this antecedent. In addition, the zoom-in/out, drag-and-drop and highlight-upon-hovering capabilities make it easy to zero in on something of interest. Feel free to play! ```{r, eval=F, echo=T} ## Import libraries library(arules) library(arulesViz) ## Convert dataframe to transaction format tData <- read.transactions(df, format = "basket", sep = ",", header=TRUE) ## Create rules rules <- apriori(tData, parameter = list(supp = 0.001, conf=0.9, minlen=3, maxlen=5)) ## Plot plot(churn_rules, method = "graph", control=list(main=NULL), engine = "htmlwidget") ``` ### **Shiny app** A light-weight all-in-one exploration and recommender tool {data-commentary-width=250} ```{r out.width='100%'} knitr::include_app("https://nancy-chelaru-centea.shinyapps.io/arules_app/", height='700px') ``` *** A shiny app Session info ========================================= Column -------------------------------------------- ```{r} sessionInfo() ```