Analysis of Business Customers

The first step is a cluster analysis of the business customers based on their transactions.

Specify the Number of Clusters based on their transactions

For each customer we consider the number of transactions, the total transaction amount, the average and median transaction amount, the first and third quartiles of the transaction amount, and the maximum and minimum transaction amount.
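As an illustration of how the eight features listed above could be derived, here is a minimal sketch in Python using only the standard library. The input layout (`customer_id`, `amount`), the function name `customer_features` and the sample values are assumptions made for this example, not the code behind the report. The quartiles use the "inclusive" method, which corresponds to the default quantile definition in R.

```python
from statistics import mean, median, quantiles

def customer_features(amounts):
    """Summarise one customer's transaction amounts into the eight features."""
    if len(amounts) >= 2:
        q1, _, q3 = quantiles(amounts, n=4, method="inclusive")
    else:
        # A single transaction has no spread, so both quartiles equal it.
        q1 = q3 = amounts[0]
    return {
        "N_Transactions": len(amounts),
        "Total_Amount": sum(amounts),
        "Average_Amount": mean(amounts),
        "Median_Amount": median(amounts),
        "Q1": q1,
        "Q3": q3,
        "Max": max(amounts),
        "Min": min(amounts),
    }

# Hypothetical transactions as (customer_id, amount) pairs.
transactions = [
    ("A", 100.0), ("A", 200.0), ("A", 300.0), ("A", 400.0), ("A", 500.0),
    ("B", 50.0),
]

# Group the amounts by customer, then build one feature row per customer.
by_customer = {}
for customer_id, amount in transactions:
    by_customer.setdefault(customer_id, []).append(amount)

features = {cid: customer_features(a) for cid, a in by_customer.items()}
```

Each entry of `features` is one row of the matrix that is passed to the clustering step.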

Represent the 3 clusters

Applying the k-means algorithm together with the elbow rule yields 3 clusters. The table below gives the cluster centers, that is, the mean of each feature within each cluster.

Cluster  N_Transactions  Total_Amount  Average_Amount  Median_Amount         Q1         Q3         Max        Min
1             42.160714    2312271.45       186117.70      128519.42  86116.835  204671.38   662328.11  58492.589
2              6.895669      86604.71        13084.13       11135.44   8643.459   15191.71    28306.13   6748.126
3             45.600000    9650696.00       365222.53      201398.30  82794.000  442236.85  1964096.10   7218.300
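To make the k-means and elbow-rule step concrete, the following is a minimal sketch in Python (standard library only). It is not the code used for the report: the toy points are hypothetical, and the deterministic farthest-first initialisation is an assumption chosen so that the example is reproducible (k-means implementations usually start from random centers). The elbow rule then looks for the value of k after which the within-cluster sum of squares (WCSS) stops dropping sharply.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first(points, k):
    """Assumed initialisation: start at the first point, then repeatedly
    add the point that is farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return [list(c) for c in centers]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centers, labels, wcss)."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are final
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, wcss

# Hypothetical 2-D feature rows forming three well-separated groups.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
]

# Elbow rule: compute the WCSS for a range of k and look for the bend.
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 6)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 2))
```

On this toy data the WCSS falls steeply up to k = 3 and only marginally afterwards, which is the pattern the elbow rule looks for.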

Sizes of 3 clusters

Cluster  Size  Proportion (%)
1          56            9.76
2         508           88.50
3          10            1.74
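The proportions above are each cluster's share of the 574 business customers. A short Python sketch (standard library only; the function name `cluster_sizes` is an illustrative assumption) reproduces them from the cluster sizes:

```python
from collections import Counter

def cluster_sizes(labels):
    """Map each cluster label to (size, share of all customers in percent)."""
    counts = Counter(labels)
    total = len(labels)
    return {c: (n, round(100 * n / total, 2)) for c, n in sorted(counts.items())}

# Labels rebuilt from the reported sizes: 56, 508 and 10 customers.
labels = [1] * 56 + [2] * 508 + [3] * 10
sizes = cluster_sizes(labels)
print(sizes)
```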

Plot the 3 clusters

Histogram of the Transaction Amount by Cluster

The histogram shows the transaction amounts in each cluster. For readability, only amounts up to $5M are shown.

Boxplot of the Transaction Amount by Cluster

The boxplot shows the transaction amounts in each cluster. For readability, only amounts up to $5M are shown.

Scatter Plot of the Transaction Amount by Cluster

Time Series Plot of the Average Transaction Amount by Cluster

This plot shows the average daily transaction amount for each cluster.

Time Series Plot of the Average Maximum Transaction Amount by Cluster

Since we want to detect anomalies in the customers' transactions within each cluster, we consider the average daily maximum transaction amount. More specifically, for every customer we take the maximum transaction amount per day and then average this measure.
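The two-step aggregation described above can be sketched as follows in Python (standard library only). The record layout `(customer_id, cluster, day, amount)` and the sample values are hypothetical, and the sketch assumes that the average is taken over the customers of a cluster on each day.

```python
from collections import defaultdict
from statistics import mean

def average_daily_max(transactions):
    """transactions: iterable of (customer_id, cluster, day, amount).
    Returns {(cluster, day): average of the customers' daily maxima}."""
    # Step 1: maximum amount per customer and day.
    daily_max = {}
    for customer, cluster, day, amount in transactions:
        key = (cluster, day, customer)
        daily_max[key] = max(amount, daily_max.get(key, amount))
    # Step 2: average those maxima over the customers of each cluster and day.
    per_cluster_day = defaultdict(list)
    for (cluster, day, _customer), value in daily_max.items():
        per_cluster_day[(cluster, day)].append(value)
    return {key: mean(values) for key, values in per_cluster_day.items()}

# Hypothetical transactions for two customers in cluster 1.
transactions = [
    ("A", 1, "2020-01-01", 100.0),
    ("A", 1, "2020-01-01", 300.0),
    ("B", 1, "2020-01-01", 500.0),
    ("A", 1, "2020-01-02", 50.0),
]
series = average_daily_max(transactions)
```

On the first day customer A's maximum is 300 and customer B's is 500, so the cluster value for that day is 400.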

Time Series Plot of the Average Maximum Transaction Amount by Cluster with Upper and Lower Bounds

Scatter Plot of the Transactions with the Upper and Lower Bounds Calculated Above
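The report does not state how the upper and lower bounds are computed. As a purely illustrative sketch in Python (standard library only), the example below assumes a common choice: the mean of a cluster's series plus or minus k standard deviations, with the lower bound floored at zero because amounts cannot be negative. The function names, the value of `k` and the sample numbers are all assumptions.

```python
from statistics import mean, stdev

def cluster_bounds(series, k=3.0):
    """Assumed rule: mean +/- k standard deviations, lower bound floored at 0."""
    m, s = mean(series), stdev(series)
    return max(0.0, m - k * s), m + k * s

def outside_bounds(amounts, lower, upper):
    """Return the amounts that fall outside the interval [lower, upper]."""
    return [a for a in amounts if a < lower or a > upper]

# Hypothetical average-daily-maximum series for one cluster.
series = [10.0, 12.0, 11.0, 13.0, 9.0]
lower, upper = cluster_bounds(series, k=2.0)

# Hypothetical individual transactions of the same cluster.
flagged = outside_bounds([8.0, 11.5, 20.0, 14.0], lower, upper)
```

Transactions returned by `outside_bounds` are the points that would lie above or below the band in the scatter plot.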

Anomaly Detection on the Average Maximum Transaction Amount by Cluster with a Maximum Anomalies Threshold of 5%
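The detector behind this step is not named in the report. To illustrate what a maximum anomalies threshold of 5% means, the sketch below (Python, standard library only) scores each observation with a robust z-score based on the median and the median absolute deviation, and then keeps at most 5% of the observations, starting with the most extreme ones. The scoring rule, the cut-off of 3.5 and the sample data are assumptions for this example.

```python
from statistics import median

def capped_anomalies(values, max_fraction=0.05, threshold=3.5):
    """Return the indices of the most extreme observations,
    never more than max_fraction of all observations."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread, so nothing can be scored
    # Robust z-score (assumed scoring rule, not taken from the report).
    scores = [0.6745 * abs(v - med) / mad for v in values]
    limit = int(max_fraction * len(values))
    candidates = sorted(
        (i for i, s in enumerate(scores) if s > threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    return sorted(candidates[:limit])

# Hypothetical series of 40 daily values with three unusually large ones.
values = [100.0 + (i % 5) for i in range(40)]
values[7], values[23], values[31] = 500.0, 450.0, 400.0

flagged = capped_anomalies(values)  # the cap is 5% of 40 = 2 observations
```

Three values exceed the cut-off here, but the 5% cap keeps only the two most extreme ones.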

Analysis of Non Business Customers

The first step is a cluster analysis of the non-business customers based on their transactions.

Specify the Number of Clusters based on their transactions

For each customer we consider the number of transactions, the total transaction amount, the average and median transaction amount, the first and third quartiles of the transaction amount, and the maximum and minimum transaction amount.

Represent the 3 clusters

Applying the k-means algorithm together with the elbow rule yields 3 clusters. The table below gives the cluster centers, that is, the mean of each feature within each cluster.

Cluster  N_Transactions  Total_Amount  Average_Amount  Median_Amount          Q1          Q3          Max        Min
1              6.494981      23132.32         4230.48       3652.892    2808.816    5045.696     8915.395   2157.876
2             13.480000     768943.28       135664.84     117275.360   95084.710  157459.760   284730.680  83690.040
3             19.833333    2891919.83       461702.60     315254.750  180862.042  659529.500  1497927.333  52546.500

Sizes of 3 clusters

Cluster  Size  Proportion (%)
1        1295           97.66
2          25            1.89
3           6            0.45

Plot the 3 clusters

library(cluster)  # provides clusplot()
clusplot(data, clusters$cluster, color = TRUE, shade = TRUE, labels = 1, lines = 0)

Histogram of the Transaction Amount by Cluster

The histogram shows the transaction amounts in each cluster. For readability, only amounts up to $5M are shown.

Boxplot of the Transaction Amount by Cluster

The boxplot shows the transaction amounts in each cluster. For readability, only amounts up to $5M are shown.
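For readers who want the numbers behind such a boxplot, the sketch below (Python, standard library only) first drops amounts above the $5M display limit and then computes the quartiles, the whiskers and the outliers for one cluster. It assumes the common 1.5 x IQR whisker rule and the "inclusive" quartile method; plotting libraries may compute the hinges slightly differently. The function name and the sample amounts are hypothetical.

```python
from statistics import quantiles

def boxplot_stats(amounts, cap=5_000_000):
    """Quartiles, whiskers and outliers of the amounts up to the display cap."""
    shown = sorted(a for a in amounts if a <= cap)
    q1, med, q3 = quantiles(shown, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at the most extreme points within 1.5 * IQR of the box.
    lower_whisker = min(a for a in shown if a >= q1 - 1.5 * iqr)
    upper_whisker = max(a for a in shown if a <= q3 + 1.5 * iqr)
    outliers = [a for a in shown if a < lower_whisker or a > upper_whisker]
    return {
        "q1": q1, "median": med, "q3": q3,
        "lower_whisker": lower_whisker, "upper_whisker": upper_whisker,
        "outliers": outliers,
    }

# Hypothetical amounts of one cluster; the last one exceeds the $5M limit.
amounts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0, 6_000_000.0]
stats = boxplot_stats(amounts)
```

The amount of 6,000,000 is excluded before any statistic is computed, so it does not affect the box, which is the effect of restricting the plot to amounts up to $5M.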

Scatter Plot of the Transaction Amount by Cluster

Time Series Plot of the Average Transaction Amount by Cluster

This plot shows the average daily transaction amount for each cluster.
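The series behind this plot is a mean per cluster and day. A minimal Python sketch (standard library only), assuming hypothetical records of the form `(cluster, day, amount)`:

```python
from collections import defaultdict

def average_daily_amount(transactions):
    """transactions: iterable of (cluster, day, amount).
    Returns {(cluster, day): mean transaction amount}."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum, count] per key
    for cluster, day, amount in transactions:
        entry = totals[(cluster, day)]
        entry[0] += amount
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical transactions in two clusters.
transactions = [
    (1, "2020-01-01", 100.0),
    (1, "2020-01-01", 300.0),
    (1, "2020-01-02", 50.0),
    (2, "2020-01-01", 1000.0),
]
daily_avg = average_daily_amount(transactions)
```

Sorting the keys of `daily_avg` by day within each cluster gives the points of the time series.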

Time Series Plot of the Average Maximum Transaction Amount by Cluster

Since we want to detect anomalies in the customers' transactions within each cluster, we consider the average daily maximum transaction amount. More specifically, for every customer we take the maximum transaction amount per day and then average this measure.

Time Series Plot of the Average Maximum Transaction Amount by Cluster with Upper and Lower Bounds

Scatter Plot of the Transactions with the Upper and Lower Bounds Calculated Above

Anomaly Detection on the Average Maximum Transaction Amount by Cluster with a Maximum Anomalies Threshold of 5%