Juande
Dec 20 2014
k-means is a popular clustering method used in data mining. It works by pairing each point or observation with any of the n centroids which represent an individual cluster.
At the initialization part, the k initial centroids are randomly generated within the data domain. At each iteration each observation is paired with its nearest centroid, then for each centroid, the mean is calculated and its position is updated. These 2 steps are repeated until convergence has been reached.
More information available at: k-means clustering
We will present an example to see how the centroids converge after a number of iterations. To show this, we will use 3 centroids. Iteration 1
Iteration 2. Notice how the rightmost centroid updates its position
At the third iteration the algorithm converges because none of the centroids was updated.
Click the following link to see the webapp build in Shiny: k-means clustering in Shiny