RFM

* R - Recency of purchase (time in days between this and the previous purchase)
* F - Frequency of orders (total orders for each user within the dataset)
* M - Monetary value. The dataset contains no monetary amount, so we use the total quantity of products across all orders instead.
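The three definitions above can be sketched in base R as follows. This is a minimal illustration, not the code behind the output below: the toy table, its column names (`order_number`, `days_since_prior`, `quantity`), and the helper `get_rfm` are all assumptions made for the example.

```r
# Toy order-level data. The column names are assumptions for illustration,
# not the schema of the real dataset.
orders <- data.frame(
  user_id          = c(1, 1, 1, 2, 2),
  order_number     = c(1, 2, 3, 1, 2),
  days_since_prior = c(NA, 7, 14, NA, 3),  # NA for a user's first order
  quantity         = c(3, 4, 2, 5, 2)      # products in each order
)

# Hypothetical helper: one row of R, F, and M per user.
get_rfm <- function(orders) {
  per_user <- lapply(split(orders, orders$user_id), function(u) {
    last <- u[which.max(u$order_number), ]  # the user's most recent order
    data.frame(
      user_id   = last$user_id,
      Recency   = last$days_since_prior,    # days between last and previous order
      Frequency = nrow(u),                  # number of orders in the dataset
      Monetary  = sum(u$quantity)           # total quantity, standing in for money
    )
  })
  do.call(rbind, per_user)
}

rfm <- get_rfm(orders)
print(rfm)
```

Here Monetary is a plain sum, following the description above; a per-order average would only need `mean()` in place of `sum()`.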

Read 8337362 rows and 15 (of 15) columns from 0.755 GB file in 00:00:14
##   user_id Recency Frequency  Monetary
## 1       5       6         5  9.200000
## 2       8      10         4 16.750000
## 3      10      30         6 24.500000
## 4      17      30        41  7.317073
## 5      27       4        82  9.573171
## 6      42      14        17  8.235294

Independent Score

Calculate an independent score for each of Recency, Frequency, and Monetary.

##       user_id Recency Frequency Monetary R_Score F_Score M_Score Total_Score
## 48198  198768       6        27 35.92593       5       5       5         555
## 47326  195169       3        28 35.75000       5       5       5         555
## 42885  176970       6        51 33.56863       5       5       5         555
## 30349  125536       6        34 33.02941       5       5       5         555
## 10418   43162       6        31 32.70968       5       5       5         555
## 37986  156719       5        44 32.40909       5       5       5         555
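One way to produce independent scores like those above is to rank each dimension on its own and cut the ranks into five equal-sized groups. The sketch below assumes that approach; `score_quintile` and `get_independent_score` are hypothetical names, and the rule `Total_Score = 100 * R + 10 * F + M` is inferred from the 5/5/5 → 555 rows in the output, not confirmed by the source.

```r
# Hypothetical helper: score a numeric vector 1..5 by rank-based quintiles.
score_quintile <- function(x, higher_is_better = TRUE) {
  # Ranking first means ties cannot collapse two groups into one.
  r <- rank(if (higher_is_better) x else -x, ties.method = "first")
  as.integer(ceiling(r * 5 / length(r)))
}

# Hypothetical stand-in for the independent scoring step.
get_independent_score <- function(df) {
  # A small Recency (a recent purchase) should earn a high score.
  df$R_Score <- score_quintile(df$Recency, higher_is_better = FALSE)
  df$F_Score <- score_quintile(df$Frequency)
  df$M_Score <- score_quintile(df$Monetary)
  # Assumed combination rule: concatenate the three digits.
  df$Total_Score <- 100 * df$R_Score + 10 * df$F_Score + df$M_Score
  df[order(-df$Total_Score), ]
}

# Toy data: user 1 is the most recent, most frequent, and largest buyer.
toy <- data.frame(
  user_id   = 1:10,
  Recency   = 1:10,
  Frequency = 10:1,
  Monetary  = seq(100, 10, by = -10)
)

scored <- get_independent_score(toy)
print(head(scored))
```

Because each dimension is ranked separately, a user's F_Score does not depend on which R group they fall into, which is what makes the scores independent.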

Scoring with user specified breaks

Function: getScoreWithBreaks(df, r, f, m)

Description: Scores Recency, Frequency, and Monetary using r, f, and m, each of which is a vector containing a series of breaks.

Arguments:
* df - A data frame returned by getDataFrame
* r - A vector of Recency breaks
* f - A vector of Frequency breaks
* m - A vector of Monetary breaks

Return Value: A new data frame with four additional columns: “R_Score”, “F_Score”, “M_Score”, and “Total_Score”.
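A rough sketch of break-based scoring is shown below. It is not the implementation of getScoreWithBreaks; `score_with_breaks` is a hypothetical stand-in, the break values are made up, and it assumes four ascending breaks per dimension so that scores run from 1 to 5.

```r
# Hypothetical stand-in for getScoreWithBreaks(); assumes r, f, and m are
# sorted ascending and that four breaks define five score bands.
score_with_breaks <- function(df, r, f, m) {
  # findInterval() counts how many breaks are <= the value (0..length(breaks)).
  df$F_Score <- findInterval(df$Frequency, f) + 1L
  df$M_Score <- findInterval(df$Monetary, m) + 1L
  # Recency is reversed: values below the first break get the top score.
  df$R_Score <- length(r) + 1L - findInterval(df$Recency, r)
  # Assumed combination rule: concatenate the three digits.
  df$Total_Score <- 100 * df$R_Score + 10 * df$F_Score + df$M_Score
  df
}

# Toy data and made-up breaks, for illustration only.
toy <- data.frame(
  user_id   = c(1, 2, 3),
  Recency   = c(2, 12, 40),
  Frequency = c(50, 8, 1),
  Monetary  = c(25, 12, 3)
)

scored_breaks <- score_with_breaks(
  toy,
  r = c(5, 10, 20, 30),
  f = c(2, 5, 10, 20),
  m = c(5, 10, 15, 20)
)
print(scored_breaks)
```

A value that sits exactly on a break falls into the higher band for Frequency and Monetary and the lower-scoring band for Recency; whether the original function treats boundaries the same way is not stated in the text.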


##      user_id Recency Frequency Monetary R_Score F_Score M_Score Total_Score
## 931     3830       2        85 27.77647       5       5       5         555
## 2153    8852       1        45 20.11111       5       5       5         555
## 3771   15558       0        56 21.30357       5       5       5         555
## 4920   20263       5        31 20.58065       5       5       5         555
## 5623   23185       5        35 22.00000       5       5       5         555
## 5645   23297       3        92 20.81522       5       5       5         555

Plot Score

Function: drawHistograms(df, r, f, m)

Description: Draws histograms in the R, F, and M dimensions so that we can see the number of customers in each RFM block.

Arguments:
* df - A data frame returned by getIndependent or getScoreWithBreaks
* r - The highest point of Recency
* f - The highest point of Frequency
* m - The highest point of Monetary

Return Value: None.
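As a simplified illustration of the plotting step, the sketch below draws one bar chart of customer counts per score for each dimension. It is an assumption-laden stand-in, not drawHistograms itself: `draw_histograms` is a hypothetical name, it reads "highest point" as the maximum score (5 by default), and it plots the three dimensions side by side instead of one panel per RFM block.

```r
# Hypothetical, simplified stand-in for drawHistograms().
# r, f, m are taken to be the maximum score in each dimension (an assumption).
draw_histograms <- function(df, r = 5, f = 5, m = 5) {
  old <- par(mfrow = c(1, 3))  # three panels side by side
  on.exit(par(old))            # restore the previous layout on exit
  # factor() with explicit levels keeps empty scores visible as zero-height bars.
  barplot(table(factor(df$R_Score, levels = 1:r)),
          main = "Recency", xlab = "R_Score", ylab = "Customers")
  barplot(table(factor(df$F_Score, levels = 1:f)),
          main = "Frequency", xlab = "F_Score", ylab = "Customers")
  barplot(table(factor(df$M_Score, levels = 1:m)),
          main = "Monetary", xlab = "M_Score", ylab = "Customers")
  invisible(NULL)              # called for its side effect only
}

# Toy scored data, for illustration only.
scored_toy <- data.frame(
  R_Score = c(5, 5, 4, 1),
  F_Score = c(5, 3, 3, 1),
  M_Score = c(5, 4, 2, 1)
)
```

Calling `draw_histograms(scored_toy)` draws the three panels on the active graphics device and returns nothing, matching the "no return value" contract described above.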

Output to csv
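A minimal sketch of the export step, assuming the scored data frame is written with base R's `write.csv`. The file name `rfm_scores.csv`, the temporary directory, and the toy rows are placeholders, not values from the analysis.

```r
# Toy scored data; the values are placeholders for illustration.
scored_out <- data.frame(
  user_id     = c(101, 102),
  Recency     = c(6, 10),
  Frequency   = c(5, 4),
  Monetary    = c(9.2, 16.75),
  R_Score     = c(4, 3),
  F_Score     = c(2, 2),
  M_Score     = c(3, 4),
  Total_Score = c(423, 324)
)

# Hypothetical output path; tempdir() keeps the example from touching real files.
out_file <- file.path(tempdir(), "rfm_scores.csv")

# row.names = FALSE avoids writing R's row index as an extra unnamed column.
write.csv(scored_out, out_file, row.names = FALSE)

# Read the file back to confirm the round trip.
check <- read.csv(out_file)
print(check)
```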