Cross Sell Analysis
Problem Statement
A consumer bank with a range of products including mortgages, would like to cross-sell insurance to its consumer base. Attached is a sample data set from their customer portfolio containing various fields about their product ownership, credit standing, whether they have an outstanding mortgage, and insurance ownership (called as PPI / personal protection insurance), if any, and its type.
The bank would like to adopt analytics driven approach for deciding: + Who should they target for PPI, and + What type of PPI product they should be targeting them with
Approach
The gist of the problem is what to cross sell to, as well as whom to cross sell. The first part is fairly easy via the use of apriori algorithm. However, there are certain assumptions to be made here. Assuming the column Ref denotes a customer id, the dataset is unique at this level. There are no repeated records of purchase of products. Hence we shall be going ahead with Customer_Segment and calculating the products purchased per Customer_Segment.
What to cross-sell (Apriori Algorithm)
Code Walkthrough
# Load required libraries
library(dplyr)
library(arules)
library(plotly)
library(arulesViz)
library(visNetwork)
library(igraph)
library(reshape2)
# Read the dataset
df <- read.csv("../data/Dataset - Mortgage Insurance Cross Sell_Updated.csv",
header = TRUE, stringsAsFactors = FALSE)
# Change the description to lowercase
df$Insurance_Description <- tolower(df$Insurance_Description)
# Specify columns to work with
required_cols <- c("Customer.Segment", "Insurance_Description")
# Filter data and select only required columns. Filtering on PPI=1 as only
# they have product info.
required_df <- df %>% filter(PPI == 1) %>% select(required_cols)
# Wrange data to be usable as transaction data
products <- aggregate(Insurance_Description ~ Customer.Segment, required_df,
c)
transactions <- as(products$Insurance_Description, "transactions")
# Run apriori algorithm to get rules
rules <- apriori(transactions, parameter = list(supp = 0.006, conf = 0.25, minlen = 2))
# Sort the rules according to decreasing confidence
rules <- sort(rules, by = "confidence", decreasing = TRUE)
# Remove the redundant rules
rules <- rules[!is.redundant(rules)]
rules_df <- as(rules, "data.frame")
# Output the rules into a csv file
write.csv(rules_df, "../output/rules.csv")Interpretation
Let us have a look at what some of these rules look like:
rules support confidence lift count
1 {bronze} => {1st cust-lasu/ 2nd c} 0.01666667 1 5.454545 1
2 {bronze} => {life & critical illn} 0.01666667 1 1.935484 1
3 {bronze} => {lasu} 0.01666667 1 1.500000 1
4 {bronze} => {joint} 0.01666667 1 1.333333 1
5 {bronze} => {life & ci} 0.01666667 1 1.153846 1
6 {bronze} => {single} 0.01666667 1 1.034483 1
The interpretation of these rules (going by the second one) is that those who have purchase bronze are 1.935 times more likely to buy life & critical illn.