load("CLV.RData")
head(CLV_data)
## # A tibble: 6 x 4
## ID x t_x T
## <int> <dbl> <dbl> <dbl>
## 1 1 1 3.86 29
## 2 2 2 13.3 27.1
## 3 3 5 29.1 31.3
## 4 4 0 0 38.4
## 5 5 0 0 32
## 6 6 0 0 30.6
The three variables of the data are defined as following:
* \(x\): the frequency of purchases
* \(t_x\): the recency, i.e., the time the most recent purchase
* \(T\): the end time for the observations
The first function “lik_bgnbd” is the log-likelihood function. The A1-A4 terms defined in the function are the logged terms of A1-A4 in Fader et al. (2005), p. 279.
The function requires data “CLV_data”. To use this function, please make sure you have loaded “CLV_data”, a data.frame containing three variables: \(x\), \(t_x\), and \(T\) as defined above.
See the comments within the function for more details.
lik_bgnbd
## function(para) {
## # This is the likelihood function with CLV_data
## # Change CLV_data to the names your datasets; the variables should be the same.
##
## # unpack 4 parameters:
## # r and alpha are the parameters of the gamma distribution of lambda
## # a and b are the parameters of the beta distribution ofr the "death" rate
## r <- para[1]
## alpha <- para[2]
## a <- para[3]
## b <- para[4]
##
## # unpack three variables:
## # x - # of transactions (frequency)
## # t_x - the most recent time of transaction (recency)
## # T - the end time for the Poisson process
## x <- CLV_data$x
## t_x <- CLV_data$t_x
## T <- CLV_data$T
##
## # A1 - A4 corresponds to the log of 4 terms in the likelihood function
## # See p.280 of Fader et al. (2005)
## # They are the log of A1-A4 for the log-likelihood
## A1 <- lgamma(r+x) + r*log(alpha) - lgamma(r)
## A2 <- lgamma(a+b) + lgamma(b+x) - lgamma(b) - lgamma(a+b+x)
## A3 <- -(r+x)*log(alpha+T)
##
## A4 <- rep(0,length(x))
## idx <- x>0
## if (sum(idx)>0) {
## A4[idx] <- log(a/(b+x[idx]-1)) - (r+x[idx])*log(alpha+t_x[idx])
## }
##
## lh <- sum(A1+A2+log(exp(A3)+idx*exp(A4)))
##
## # To return the minus of likelihood for minimization
## # maximize likelihood = minimize negative likelihood
## return(-lh)
##
## }
## <bytecode: 0x0000000014b8d8d0>
The second function “pred_Y” is the function of prediction future # of transactions based on the estimated parameters from “lik_bgnbd”.
This function follows Equation (10) of Fader et al. (2005), p.279.
The function requires a package “hypergeo” for the evaluation of the hypergeometeric function 2F1(.).
As evaluation of 2F1(.) is somewhat costly, this function uses input from only 1 customer. You can extend the function to calculate for many customers by iterations or use lapply(.).
Please see the comments within the function for more details.
pred_Y
## function(para,H,t) {
## # Given estimated parameters, for a customer with history {x,t_x,T}, to predict her # of transactions in t time.
## # Please see Equation (10) of Fader et al. (2005), p.279;
## # This function uses a package "hypergeo" for function evaluation;
## # As evaluation of 2F1(.) is somewhat costly, this function is to accommodate only 1 customer;
## # You can extend the function to calculate for many customers by iterations or use lapply(.);
##
## # The function takes three input:
## # para - a vector of the 4 parameters;
## # H - a data.frame with only 1 row (for 1 customer) and three variables {x,t_x,T};
## # t - how long into the future?
##
## # unpack 4 parameters:
## # r and alpha are the parameters of the gamma distribution of lambda
## # a and b are the parameters of the beta distribution of the "death" rate
## r <- para[1]
## alpha <- para[2]
## a <- para[3]
## b <- para[4]
##
## # unpack three variables:
## # x - # of transactions (frequency)
## # t_x - the most recent time of transaction (recency)
## # T - the end time for the Poisson process
## x <- H$x
## t_x <- H$t_x
## T <- H$T
##
## # the term of the Gaussian hypergeometric function 2F1(.)
## hg <- hypergeo::hypergeo(r+x,b+x,a+b+x-1,t/(alpha+T+t))
## hg <- Re(hg) # hg is a real number in the complex format Re(.) makes it real.
##
## # the numerator term, given hg
## Y1 <- (a+b+x-1)/(a-1)*(1-(((alpha+T)/(alpha+T+t))^(r+x))*hg)
## Y2 <- 1
## if (x>0) {
## Y2 <- Y2 + a/(b+x-1)*(((alpha+T)/(alpha+t_x))^(r+x))
## }
##
## return(Y1/Y2)
## }
For the estimation, we need to maximize the log likelihood (or minimize the negative log likelihood, as we have as the output of the “lik_bgnbd” function). To do so, we are using a non-linear solver built into R called “constrOptim(.)”. The function is in the R Base, and use “?constrOptim” to see more details.
One thing to take note is the all parameters \((r,\alpha)\) for the Gamma distribution and \((a,b)\) for the Beta distribution are positive (\(>0\)). We thus have to specify the constraints in “constrOptim” as “ui” and “ci” with: ui %*% parameters >= ci. For our purpose, we specify “ui” as a \(4\times4\) identity matrix, and “ci” as a length 4 all-zero vector.
\[
\left[ \begin{matrix}
1& 0& 0& 0\\
0& 1& 0& 0\\
0& 0& 1& 0\\
0& 0& 0& 1\\
\end{matrix} \right] \times \left[ \begin{array}{c}
r\\
\alpha\\
a\\
b\\
\end{array} \right] \geqslant \left[ \begin{array}{c}
0\\
0\\
0\\
0\\
\end{array} \right]
\]
# to specify the constraint matrix
ui <- diag(4)
ci <- rep(0,4)
# para: the starting values; a vector with 4 elements corresponding to (r,alpha,a,b).
# you can change this to other values or test different values.
para <- rep(.01,4)
# to run the constrained optimization
results <- constrOptim(para, lik_bgnbd, NULL, ui, ci)
# to get the estimated parameters in the order of (r,alpha,a,b)
para <- results$par
names(para) <- c("r","alpha","a","b")
para
## r alpha a b
## 0.2171882 3.7234535 0.8349768 3.7701876
We use the function “pred_Y” to predict the no. of transactions for a single customer. You have to specify how long into the future you want to predict. Please see comments within the function “pred_Y” for more details.
# suppose we want to predict next year and set t = 52
t <- 52
# let's predict the first customer in the data
H <- CLV_data[1,]
# to obtain the prediction
Yhat <- pred_Y(para,H,t)
as.numeric(Yhat)
## [1] 0.6823722