Recommender - Clustering

November 24, 2019

The Organization of Hierarchy

Introduction to Hierarchical Clustering

Decision on Cluster Level

Hierarchical Cluster Analysis: Agglomerative

Hierarchical Cluster Analysis: Divisive

Agglomerative vs Divisive Intuition

Agglomerative vs Divisive

Steps to Perform Hierarchical Clustering (Agglomerative)

1. View each data point as an individual "cluster" with just that one point as a member.

2. Calculate the Euclidean distance between the centroids of all the clusters.

3. Merge the two closest clusters.

4. Repeat Step 2 and Step 3 until you reach a single cluster containing all the data in your set.

5. Plot a dendrogram.

6. Decide on the level at which to cut the dendrogram; this fixes the number of clusters.
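The steps above can be sketched directly in base R. This is a naive illustration, not how hclust is implemented: the function name agglomerate, the toy_points matrix and the stopping parameter k are all made up for this example, and the merge rule uses centroid distances as the steps describe.

```r
# Naive agglomerative clustering on centroids (illustrative sketch only).
# `agglomerate`, `toy_points` and `k` are hypothetical names for this example.
agglomerate <- function(points, k = 1) {
  # Step 1: every point starts as its own cluster (a list of row indices).
  clusters <- as.list(seq_len(nrow(points)))
  heights <- numeric(0)

  # Step 4: keep merging until only k clusters remain (k = 1 builds the full tree).
  while (length(clusters) > k) {
    # Step 2: Euclidean distances between the centroids of all clusters.
    centroids <- t(sapply(clusters, function(idx) {
      colMeans(points[idx, , drop = FALSE])
    }))
    d <- as.matrix(dist(centroids, method = "euclidean"))
    diag(d) <- Inf  # a cluster must not be merged with itself

    # Step 3: merge the two closest clusters.
    pair <- which(d == min(d), arr.ind = TRUE)[1, ]
    i <- min(pair)
    j <- max(pair)
    heights <- c(heights, d[i, j])
    clusters[[i]] <- c(clusters[[i]], clusters[[j]])
    clusters[[j]] <- NULL
  }

  # Turn the list of clusters into one label per point.
  labels <- integer(nrow(points))
  for (cl in seq_along(clusters)) labels[clusters[[cl]]] <- cl
  list(labels = labels, heights = heights)
}

toy_points <- rbind(c(1, 1), c(1.5, 1), c(5, 5), c(5, 5.2), c(9, 1))
result <- agglomerate(toy_points, k = 3)
print(result$labels)
```

The recorded heights are the merge distances that a dendrogram would show on its vertical axis. Each pass recomputes every pairwise distance, so this version only suits small examples.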

Decision on Cluster Level

Hierarchical Cluster Analysis (Agglomerative) in R

Compute distances

distances = dist(data, method = "euclidean")

Apply hierarchical clustering

clusterData = hclust(distances, method = "complete")

Plot the dendrogram

plot(clusterData)

Decision on Cluster Level

clusterGroups = cutree(clusterData, k = 10)
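The snippets above leave `data` unspecified. As a minimal end-to-end sketch, the built-in USArrests data set stands in for it here (an assumption made for illustration, as are the values k = 4 and h = 3). It also shows that cutree can cut either by a number of clusters or by a height read off the dendrogram.

```r
# Stand-in for the slides' unspecified `data`: the built-in USArrests data set,
# standardised so that no single column dominates the Euclidean distance.
data <- scale(USArrests)

distances <- dist(data, method = "euclidean")
clusterData <- hclust(distances, method = "complete")

# Cut the tree by a target number of clusters ...
groupsByK <- cutree(clusterData, k = 4)

# ... or by a height chosen from the dendrogram (3 is an arbitrary example value).
groupsByH <- cutree(clusterData, h = 3)

# Cluster sizes for the k = 4 cut.
print(table(groupsByK))
```

After plot(clusterData), calling rect.hclust(clusterData, k = 4) draws boxes around the chosen clusters on the dendrogram, which helps when judging a cut level by eye.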

Agglomerative vs Divisive

Agglomerative

lower complexity
may be fooled by local neighbors
does not see the larger implications of clusters

Divisive

sees the entire data distribution
more accurate
higher complexity (can decrease stability and increase runtime)
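To compare the two directions on the same data, one option is the diana function from the cluster package, which implements one divisive algorithm (DIANA). The sketch below assumes the cluster package is installed (it normally ships with R) and reuses USArrests and k = 4 purely as example choices.

```r
library(cluster)  # provides diana(); a recommended package in standard R installs

x <- scale(USArrests)

# Agglomerative (bottom-up): start from single points and merge.
agg <- hclust(dist(x, method = "euclidean"), method = "complete")

# Divisive (top-down): start from one cluster and split.
div <- diana(x, metric = "euclidean")

aggGroups <- cutree(agg, k = 4)
divGroups <- cutree(as.hclust(div), k = 4)

# Cross-tabulate the two partitions to see where they agree.
print(table(agglomerative = aggGroups, divisive = divGroups))
```

The cross-table only shows how far the two partitions agree on this data set. It does not by itself say which one is more accurate.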

k-means vs Hierarchical Cluster Analysis

k-means

simplicity
instantiating random centroids and repeatedly finding the closest points is time-consuming

Hierarchical Cluster Analysis

no need to pass in an explicit "k" number of clusters
has more parameters to tweak
clusters can be subjectively chosen through the evaluation of a dendrogram plot
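The contrast can be illustrated with a small sketch on the built-in iris measurements (an example data set chosen here, not one used in the slides). kmeans needs the number of clusters up front, while a single hclust tree can be cut at several levels afterwards.

```r
set.seed(42)  # k-means starts from random centroids, so fix the seed
x <- scale(iris[, 1:4])

# k-means: k must be supplied before fitting.
km <- kmeans(x, centers = 3, nstart = 20)

# Hierarchical: build the tree once, choose the level later.
hc <- hclust(dist(x), method = "complete")
hcGroups3 <- cutree(hc, k = 3)
hcGroups5 <- cutree(hc, k = 5)  # a different cut, no refitting needed

# Compare the two 3-cluster partitions.
print(table(kmeans = km$cluster, hierarchical = hcGroups3))
```

The linkage method ("complete" here) and the distance metric are among the extra parameters the hierarchical approach exposes.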

Conclusions

Clustering Analysis for Recommender Systems:

works at a group level
generates less-personal recommendations
often leads to worse accuracy than nearest neighbor algorithms
works faster
effective in shrinking the selection of relevant neighbors in a collaborative filtering algorithm
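The last point can be sketched as follows: cluster the users first, then search for nearest neighbors only inside the target user's cluster. The ratings matrix, the user names and the helper nearest_in_cluster are all hypothetical and exist only for this illustration.

```r
set.seed(1)

# Hypothetical ratings: 12 users x 6 items, two taste groups plus small noise.
base <- rbind(
  matrix(rep(c(5, 5, 4, 1, 1, 2), 6), nrow = 6, byrow = TRUE),
  matrix(rep(c(1, 2, 1, 5, 4, 5), 6), nrow = 6, byrow = TRUE)
)
ratings <- base + matrix(rnorm(length(base), sd = 0.3), nrow = nrow(base))
rownames(ratings) <- paste0("user", seq_len(nrow(ratings)))

# Group users by the similarity of their rating vectors.
userGroups <- cutree(hclust(dist(ratings), method = "complete"), k = 2)

# Neighbor search restricted to the target user's own cluster.
nearest_in_cluster <- function(target, n = 3) {
  peers <- setdiff(names(userGroups)[userGroups == userGroups[target]], target)
  d <- sapply(peers, function(p) {
    sqrt(sum((ratings[target, ] - ratings[p, ])^2))
  })
  names(sort(d))[seq_len(min(n, length(d)))]
}

neighbors <- nearest_in_cluster("user1")
print(neighbors)

# Group-level view: the average rating per item in user1's cluster.
clusterProfile <- colMeans(ratings[userGroups == userGroups["user1"], , drop = FALSE])
print(round(clusterProfile, 2))
```

Restricting the search to one cluster cuts the number of distance computations per query. The cluster profile is the kind of group-level, less personal signal the list above refers to.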

Sources

Benjamin Johnston, Aaron Jones, Christopher Kruger, "Applied Unsupervised Learning with Python", May 2019

https://www.researchgate.net/publication/303870754_FHCC_A_Soft_Hierarchical_Clustering_Approach_for_Collaborative_Filtering_Recommendation

https://rpubs.com/kismetk/Netflix-recommendation

https://www.displayr.com/what-is-hierarchical-clustering/

https://towardsdatascience.com/understanding-the-concept-of-hierarchical-clustering-technique-c6e8243758ec

https://www.sciencedirect.com/topics/computer-science/hierarchical-cluster-analysis

Image Sources

https://www.bbvadata.com/recommender-systems-marketing-gets-personal/

https://subscription.packtpub.com/book/data/9781785884856/4/ch04lvl1sec25/clustering-techniques

https://towardsdatascience.com/unsupervised-learning-k-means-vs-hierarchical-clustering-5fe2da7c9554

https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781789952292/2/ch02lvl1sec10/the-organization-of-hierarchy