Fuzzy C-Means Clustering

Fuzzy C-Means Clustering with the Seeds Dataset

Brian Bartling
9/11/16

Developing Data Products

About the App

This Shiny app lets the user interactively explore the fuzzy clusters that emerge from the Seeds dataset, which is available from the UCI Machine Learning Repository.

The app is available to view here.

The GitHub repository, which contains the ui.R and server.R files, is available here.

Seeds Dataset

Data Set Information:

The examined group comprised kernels belonging to three different varieties of wheat: Kama, Rosa and Canadian, 70 elements each, randomly selected for the experiment. High quality visualization of the internal kernel structure was detected using a soft X-ray technique. It is non-destructive and considerably cheaper than other more sophisticated imaging techniques like scanning microscopy or laser technology. The images were recorded on 13x18 cm X-ray KODAK plates. Studies were conducted using combine harvested wheat grain originating from experimental fields, explored at the Institute of Agrophysics of the Polish Academy of Sciences in Lublin.

A .txt file of the dataset can be downloaded here.

Fuzzy C-Means Clustering

Fuzzy c-means clustering is similar to k-means clustering, but it allows a data point to belong to more than one cluster.

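For each cluster, every data point is assigned a membership value, a real number in the interval [0,1]. A data point's membership values across all clusters sum to 1.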
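For reference, the membership value can be written out explicitly. The expression below is the standard fuzzy c-means membership formula, not something taken from the app's code. The symbols are our own notation: x_i is a data point, v_j is the center of cluster j, c is the number of clusters, and m > 1 is the fuzziness exponent (cmeans() in e1071 defaults to m = 2). It applies when x_i does not coincide exactly with a cluster center.

```latex
u_{ij} = \left[ \sum_{k=1}^{c}
  \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_k \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1},
\qquad
\sum_{j=1}^{c} u_{ij} = 1 .
```

A point close to v_j and far from every other center gets u_ij near 1; a point equidistant from all centers gets u_ij = 1/c for every cluster.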

A table of the membership value of each data point is shown under the tab “Membership Values of the Data Points [0,1]”.

The cmeans() function from the e1071 package was used to compute the fuzzy c-means clustering.
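A minimal sketch of such a cmeans() call might look like the following. This is not the app's server.R code. The helper fuzzy_two_vars() is a hypothetical name, the file path and column labels in the commented-out read.table() call are assumptions, and the data frame demo is synthetic stand-in data so that the sketch runs without the dataset file.

```r
library(e1071)  # provides cmeans()

# Hypothetical helper (not from the app): fuzzy-cluster two columns of a data frame.
fuzzy_two_vars <- function(data, x_var, y_var, k = 3) {
  xy <- as.matrix(data[, c(x_var, y_var)])
  cmeans(xy, centers = k, m = 2)  # m = 2 is the package's default fuzziness exponent
}

# With a local copy of the UCI file (path and column labels are assumptions):
# seeds <- read.table("seeds_dataset.txt",
#                     col.names = c("area", "perimeter", "compactness",
#                                   "kernel_length", "kernel_width",
#                                   "asymmetry", "groove_length", "variety"))

# Synthetic stand-in data so the sketch runs on its own; NOT the Seeds data.
set.seed(1)
demo <- data.frame(
  area      = c(rnorm(20, 12, 0.5), rnorm(20, 15, 0.5), rnorm(20, 19, 0.5)),
  perimeter = c(rnorm(20, 13, 0.3), rnorm(20, 14.5, 0.3), rnorm(20, 16, 0.3))
)

fit <- fuzzy_two_vars(demo, "area", "perimeter", k = 3)
head(fit$membership, 5)  # one row per data point, one column per cluster
fit$centers              # fitted cluster centers
```

The returned object also carries fit$cluster, the hard assignment of each point to its highest-membership cluster. Cluster numbering depends on the random initialization, so the column order can differ between runs.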

Fuzzy C-Means Clustering

For the area ~ perimeter view with 3 clusters, the membership values in [0,1] of the first five data points are as follows (rows are data points, columns are clusters):

               1           2         3
[1,] 0.007212115 0.006120286 0.9866676
[2,] 0.002359277 0.001234742 0.9964060
[3,] 0.115631025 0.026652900 0.8577161
[4,] 0.304941083 0.038685318 0.6563736
[5,] 0.055347788 0.116889817 0.8277624
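As a quick sanity check, the five rows above can be re-entered by hand to confirm that each row sums to 1 (up to rounding in the printed values) and to read off the cluster with the highest membership. The variable names here are our own; only base R is used.

```r
# The five membership rows shown above, typed in by hand.
u <- matrix(c(
  0.007212115, 0.006120286, 0.9866676,
  0.002359277, 0.001234742, 0.9964060,
  0.115631025, 0.026652900, 0.8577161,
  0.304941083, 0.038685318, 0.6563736,
  0.055347788, 0.116889817, 0.8277624
), ncol = 3, byrow = TRUE, dimnames = list(NULL, c("1", "2", "3")))

row_totals <- rowSums(u)                   # each total should be 1 up to rounding
hard <- max.col(u, ties.method = "first")  # column index of the largest membership per row
strongest <- apply(u, 1, max)              # how strong that top membership is

print(round(row_totals, 6))
print(hard)
print(strongest)
```

All five points fall most strongly in cluster 3, but the fourth does so with a membership of only about 0.66, with about 0.30 going to cluster 1. That kind of split is what a hard k-means assignment would hide.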