INTRODUCTION
This dashboard report presents the results of applying k-means, an unsupervised machine learning algorithm, to segment customers by behavior. Because the data set contains very few customer details, the behaviors of interest are the amount of money each customer spent and the number of transactions each customer made in 2011.
The data comes from the ‘Online Retail Data Set’ in the UCI Machine Learning Repository, a real-world transactional e-commerce data set that was donated on 2015-11-06. It is available from the repository.
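For readers unfamiliar with k-means, the following is a minimal plain-Python sketch of the idea (Lloyd's algorithm) applied to two features per customer. It is not the code behind this dashboard. The customer values are made up, the feature scaling step is an assumption (the report does not say whether the features were scaled), and the evenly spaced starting centroids are a simplification chosen only to keep the example reproducible. Real implementations usually start from random or k-means++ centroids.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [
        tuple((v - m) / s for v, m, s in zip(row, centers, spreads))
        for row in rows
    ]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Deterministic start (an assumption made for reproducibility):
    # take k points at evenly spaced positions in the sorted data.
    ordered = sorted(points)
    n = len(ordered)
    centroids = [ordered[(i * (n - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids


# Made-up customers: (money spent in 2011, number of transactions in 2011)
customers = [
    (100.0, 2), (130.0, 3), (80.0, 1),   # many small buyers
    (2500.0, 20), (2700.0, 24),          # a few mid-size buyers
    (40000.0, 150),                      # one very large buyer
]
labels, centroids = kmeans(standardize(customers), k=3)
print(labels)
```

Each customer ends up with a cluster label, and customers with similar spend and activity share a label. The label numbers themselves are arbitrary and carry no ranking.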
A little bit about the data set from UCI itself:
This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers.
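Because the raw data is transactional, it has to be rolled up to one row per customer before any segmentation. The sketch below shows one possible way to do that for 2011. The row layout, the identifiers, and the values are all hypothetical, and counting distinct invoices as "transactions" is an assumption, since the report does not define the term precisely.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (invoice_no, customer_id, invoice_date, quantity, unit_price)
rows = [
    ("INV-001", "C1", datetime(2010, 12, 1, 8, 26), 6, 2.55),   # 2010: excluded
    ("INV-002", "C1", datetime(2011, 1, 5, 10, 0), 10, 1.25),
    ("INV-002", "C1", datetime(2011, 1, 5, 10, 0), 2, 4.00),    # same invoice
    ("INV-003", "C1", datetime(2011, 3, 1, 9, 30), 1, 20.00),
    ("INV-004", "C2", datetime(2011, 2, 14, 12, 0), 3, 5.00),
    ("INV-005", None, datetime(2011, 2, 20, 12, 0), 1, 9.99),   # no customer: skipped
]


def customer_features(rows, year=2011):
    """Return {customer_id: (money_spent, number_of_transactions)} for one year."""
    spend = defaultdict(float)
    invoices = defaultdict(set)
    for invoice_no, customer_id, when, quantity, unit_price in rows:
        if customer_id is None or when.year != year:
            continue
        spend[customer_id] += quantity * unit_price
        invoices[customer_id].add(invoice_no)  # count each invoice once
    return {cid: (round(spend[cid], 2), len(invoices[cid])) for cid in spend}


features = customer_features(rows)
print(features)
```

The resulting pairs of money spent and transaction count are the kind of per-customer input a clustering step would work on.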
CUSTOMER ACTIVITY
The left plot highlights the three different customer segments:
The red cluster (segment 3) contains 93.63% of the customers. This segment generated 55.04% of the company’s 2011 revenue, yet accounts for the smallest share of transactions (5.91%), as can be seen in the “Segment Revenue” and “Segment Transactions” tabs.
The green cluster (segment 2) contains 6.21% of the customers. This segment generated 30.87% of the company’s 2011 revenue and accounts for 33.34% of all transactions, as can be seen in the “Segment Revenue” and “Segment Transactions” tabs.
The blue cluster (segment 1) contains 0.15% of the customers. This segment generated 14.08% of the company’s 2011 revenue and accounts for 60.74% of all transactions, as can be seen in the “Segment Revenue” and “Segment Transactions” tabs.
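Percentages like the ones above can be derived from per-customer segment labels. The sketch below illustrates the calculation on a handful of made-up customers. The tuple layout and all values are hypothetical and do not reproduce the dashboard's figures.

```python
from collections import defaultdict

# Hypothetical per-customer records: (segment, money_spent, transactions)
labelled_customers = [
    (3, 100.0, 2), (3, 130.0, 3), (3, 80.0, 1),
    (2, 2500.0, 20), (2, 2700.0, 24),
    (1, 40000.0, 150),
]


def segment_shares(records):
    """Return {segment: (% of customers, % of revenue, % of transactions)}."""
    totals = defaultdict(lambda: [0, 0.0, 0])
    for segment, spend, transactions in records:
        totals[segment][0] += 1
        totals[segment][1] += spend
        totals[segment][2] += transactions
    n_customers = len(records)
    revenue = sum(spend for _, spend, _ in records)
    n_transactions = sum(tx for _, _, tx in records)
    return {
        segment: (
            round(100 * count / n_customers, 2),
            round(100 * spend / revenue, 2),
            round(100 * tx / n_transactions, 2),
        )
        for segment, (count, spend, tx) in sorted(totals.items())
    }


shares = segment_shares(labelled_customers)
for segment, (pct_customers, pct_revenue, pct_tx) in shares.items():
    print(segment, pct_customers, pct_revenue, pct_tx)
```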
SEGMENT REVENUE
On average, individual members of segment 1 are the most valuable and most active customers of all the segments. The gap is large enough to suggest that these are not typical customers but wholesalers, as UCI’s description mentions. Unfortunately, the data set has no variables that could confirm this assumption. Segment 2 may be the same situation at a significantly smaller scale, or it may consist of special customers. Segment 3, again, contains the most customers, but its members individually spend the least with the company.
The company’s sales force should give segment 1 customers special treatment and promotions, and do the same for segment 2 to a lesser extent, to boost sales and revenue. Further research on segment 3 is needed to understand its needs, in the hope of improving retention and the revenue it generates.
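The "on average" claim can be checked roughly from the percentages reported in the Customer Activity section: dividing a segment's share of revenue (or transactions) by its share of customers gives an index of how much the average member contributes relative to the average customer overall. A value of 1 would mean exactly average. This is only a back-of-the-envelope sketch. The index name is my own, and because the published shares are rounded (0.15% in particular), the results are approximate.

```python
# Shares in percent, as reported in the Customer Activity section.
segments = {
    1: {"customers": 0.15, "revenue": 14.08, "transactions": 60.74},
    2: {"customers": 6.21, "revenue": 30.87, "transactions": 33.34},
    3: {"customers": 93.63, "revenue": 55.04, "transactions": 5.91},
}


def per_customer_index(segments, measure):
    """Share of `measure` divided by share of customers, per segment."""
    return {
        segment: shares[measure] / shares["customers"]
        for segment, shares in segments.items()
    }


revenue_index = per_customer_index(segments, "revenue")
transaction_index = per_customer_index(segments, "transactions")

for segment in sorted(segments):
    print(
        f"segment {segment}: "
        f"revenue index {revenue_index[segment]:.2f}, "
        f"transaction index {transaction_index[segment]:.2f}"
    )
```

By this measure the average segment 1 customer brings in roughly 94 times the revenue of the average customer overall, the average segment 2 customer roughly 5 times, and the average segment 3 customer a little under 0.6 times.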
COUNTRY TRANSACTIONS
The left plot in this tab shows that the clear majority of business is done with customers in the United Kingdom, which makes sense given that the company is based in the UK. Among foreign customers, Germany, France, and EIRE (Ireland) are the top three by activity, as the right plot shows.
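The two views described above, all countries and all countries except the UK, amount to counting transactions per country with and without a filter. Here is a minimal sketch. The invoice numbers and counts are made up purely for illustration, and treating one invoice as one transaction is an assumption.

```python
from collections import Counter

# Hypothetical (invoice_no, country) pairs, one per invoice line.
invoice_lines = [
    ("INV-001", "United Kingdom"), ("INV-001", "United Kingdom"),
    ("INV-002", "United Kingdom"), ("INV-003", "United Kingdom"),
    ("INV-004", "United Kingdom"),
    ("INV-005", "Germany"), ("INV-006", "Germany"), ("INV-007", "Germany"),
    ("INV-008", "France"), ("INV-009", "France"),
    ("INV-010", "EIRE"),
]


def transactions_by_country(lines, exclude=()):
    """Count distinct invoices per country, most active country first."""
    country_of = dict(lines)  # one entry per invoice number
    counts = Counter(c for c in country_of.values() if c not in exclude)
    return counts.most_common()


all_countries = transactions_by_country(invoice_lines)
without_uk = transactions_by_country(invoice_lines, exclude={"United Kingdom"})
print(all_countries)
print(without_uk)
```

Excluding the dominant country keeps the smaller markets readable, which is presumably why the dashboard offers both views.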
Dashboard panels: Customer Activity, Transaction Table, Customer Segments, Transaction Count, All Countries, Excluding The UK