Comparing GMM-EM soft clustering with KMeans hard clustering

In this article, the clustering output of GMM-EM soft clustering is compared with that of KMeans hard clustering on an image.

  1. The apples and oranges image shown is used to compare the two clustering techniques.

  2. The two color channels, R and G, are used as the variables for this image data.

  3. Two initial 2-dimensional Gaussian models are used to initialize GMM-EM: the first has the red mean vector (1,0) and the second the green mean vector (0,1), each with a random covariance matrix.

  4. The same two points, (1,0) and (0,1), are also used as the initial cluster centroids for KMeans clustering.

  5. The EM algorithm steps for GMM and the change in the Gaussian contours across iterations (until convergence) are shown in the next animation.

  6. The change in the centroids across iterations (until convergence) for KMeans clustering is shown in the next animation.

  7. Finally, after both algorithms converge, the pixels assigned to one of the resulting clusters are marked in black for each algorithm. The next figure shows that GMM can identify the orange among the apples, but KMeans cannot.
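
The steps above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code used to produce the animations: the data is a synthetic stand-in for the (R, G) pixel values (two made-up blobs in the unit square), the initial covariances are fixed to a scaled identity instead of random matrices so the run is reproducible, and the helper names `gaussian_pdf`, `gmm_em` and `kmeans` are hypothetical. Only the initial means and centroids, (1,0) and (0,1), are taken from the description.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of a multivariate Gaussian evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def gmm_em(X, means, covs, weights, n_iter=200, tol=1e-6):
    """EM for a Gaussian mixture; returns parameters and soft assignments."""
    means, covs, weights = means.copy(), covs.copy(), weights.copy()
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        dens = np.stack(
            [w * gaussian_pdf(X, m, c) for w, m, c in zip(weights, means, covs)],
            axis=1,
        )
        total = dens.sum(axis=1, keepdims=True) + 1e-300
        resp = dens / total
        # M-step: re-estimate weights, means and covariances.
        nk = resp.sum(axis=0)
        weights = nk / len(X)
        means = (resp.T @ X) / nk[:, None]
        for k in range(len(nk)):
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k]
            covs[k] += 1e-6 * np.eye(X.shape[1])  # small ridge for stability
        # Stop when the log-likelihood no longer changes noticeably.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return means, covs, weights, resp

def kmeans(X, centroids, n_iter=200):
    """Lloyd's algorithm; returns centroids and hard assignments."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old centroid if a cluster ends up empty.
        new = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(len(centroids))
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Synthetic stand-in for the (R, G) values of the image pixels (assumption).
rng = np.random.default_rng(0)
blob_a = rng.normal([0.85, 0.25], 0.05, size=(300, 2))
blob_b = rng.normal([0.45, 0.75], 0.05, size=(300, 2))
X = np.clip(np.vstack([blob_a, blob_b]), 0.0, 1.0)

# Initial means / centroids from the article: red (1,0) and green (0,1).
init_means = np.array([[1.0, 0.0], [0.0, 1.0]])
init_covs = np.array([0.1 * np.eye(2), 0.1 * np.eye(2)])  # assumption, not random
init_weights = np.array([0.5, 0.5])

gmm_means, gmm_covs, gmm_weights, resp = gmm_em(X, init_means, init_covs, init_weights)
gmm_labels = resp.argmax(axis=1)       # harden the soft assignments
km_centroids, km_labels = kmeans(X, init_means)
```

For real image data, `X` would instead hold the R and G values of every pixel, reshaped to one row per pixel and scaled to [0, 1]; marking the pixels where `gmm_labels == 0` (or `km_labels == 0`) in black gives a comparison of the kind described in step 7. The key difference the sketch exposes is that `resp` carries a per-pixel probability for each cluster and each component has its own covariance, while KMeans only compares Euclidean distances to the centroids.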