Fuzzy C-Means Clustering

Introduction

Clustering is a machine learning method that belongs to unsupervised learning. In unsupervised learning, the data to be analyzed has no target variable, so the focus is on exploring the data, for example by looking for patterns in it. Clustering aims to find observations with similar patterns so that they can be grouped together; the resulting groups are called clusters. A good clustering is one in which the members of each cluster are as similar to each other as possible, while members of different clusters differ significantly. Clustering is widely used in fields such as customer segmentation, product recommendation, and data profiling.

There are two types of clustering: non-fuzzy (hard) clustering and fuzzy (soft) clustering. The basic difference is that in hard clustering the data are divided into several groups and each data point belongs to exactly one cluster, whereas in fuzzy clustering each data point can belong to more than one cluster. Below, I perform fuzzy c-means clustering on sample data covering the years 1931 to 1940.

Import Data

The first and most important step of the analysis is to read the data. As the data are in CSV format, we import them into R with the read.csv() function and then rename the columns.

f <- read.csv("lluvia.csv", header = TRUE, sep = ",")
# Rename the year columns from "X1931" to "Year-1931"
for (i in 2:length(names(f))) {
   names(f)[i] = paste("Year",strsplit(names(f)[i],"X")[[1]][2],sep = "-")
}
head(f)

##            X Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936
## 1  lluvia P1    365.73    795.99    758.85    322.61    354.37    823.71
## 2  lluvia P2    373.66    781.51    730.22    329.57    332.82    825.11
## 3  lluvia P3    381.95    762.00    715.92    298.40    331.25    791.31
## 4  lluvia P4    393.14    742.93    732.87    272.69    318.89    646.91
## 5  lluvia P5    406.01    677.26    740.59    439.46    318.48    661.18
## 6 lluvia P15   1251.11   1389.28   1306.29   1503.02    727.05   1380.97
##   Year-1937 Year-1938 Year-1939 Year-1940
## 1    613.34    400.07    732.71    746.53
## 2    559.48    375.95    716.42    765.16
## 3    556.42    300.40    604.38    690.77
## 4    481.08    317.70    672.19    606.09
## 5    506.17    336.62    696.67    623.15
## 6   1703.69    508.69   1304.27    881.91

Data Preprocessing

To check for missing values, we count the NA entries in each column with the colSums() function.

colSums(is.na(f)) #To check for missing values

##         X Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937 
##         0         0         0         0         0         0         0         0 
## Year-1938 Year-1939 Year-1940 
##         0         0         0

There are no missing values, so no further preprocessing is needed.

Summary of Data

Below is a summary of each year: the minimum, 1st quartile, median, mean, 3rd quartile, and maximum are computed and displayed for each year.

summary(f)

##       X               Year-1931        Year-1932        Year-1933     
##  Length:60          Min.   : 288.4   Min.   : 383.7   Min.   : 328.3  
##  Class :character   1st Qu.: 439.0   1st Qu.: 706.6   1st Qu.: 634.2  
##  Mode  :character   Median : 686.4   Median : 918.4   Median : 856.0  
##                     Mean   : 716.7   Mean   : 976.0   Mean   : 895.4  
##                     3rd Qu.: 936.5   3rd Qu.:1229.6   3rd Qu.:1161.4  
##                     Max.   :1365.5   Max.   :1736.7   Max.   :1671.4  
##    Year-1934        Year-1935        Year-1936        Year-1937     
##  Min.   : 257.7   Min.   : 206.1   Min.   : 445.8   Min.   : 274.4  
##  1st Qu.: 464.1   1st Qu.: 410.6   1st Qu.: 644.4   1st Qu.: 547.6  
##  Median : 633.7   Median : 509.0   Median : 830.1   Median : 784.8  
##  Mean   : 780.3   Mean   : 686.3   Mean   : 969.6   Mean   : 937.4  
##  3rd Qu.:1125.9   3rd Qu.: 978.6   3rd Qu.:1163.4   3rd Qu.:1255.8  
##  Max.   :1576.0   Max.   :1502.5   Max.   :2491.8   Max.   :2684.6  
##    Year-1938        Year-1939        Year-1940     
##  Min.   : 228.6   Min.   : 283.3   Min.   : 132.9  
##  1st Qu.: 375.9   1st Qu.: 520.2   1st Qu.: 492.2  
##  Median : 492.9   Median : 718.5   Median : 690.1  
##  Mean   : 629.2   Mean   : 805.4   Mean   : 744.9  
##  3rd Qu.: 740.0   3rd Qu.: 993.6   3rd Qu.: 900.6  
##  Max.   :1421.9   Max.   :1746.5   Max.   :1895.2

Correlation of the Data

df <- f[,-1]
#Correlation Matrix
res <- cor(df)
round(res, 2)

##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Year-1931      1.00      0.76      0.85      0.90      0.69      0.81      0.85
## Year-1932      0.76      1.00      0.94      0.76      0.76      0.65      0.62
## Year-1933      0.85      0.94      1.00      0.85      0.78      0.77      0.76
## Year-1934      0.90      0.76      0.85      1.00      0.66      0.72      0.83
## Year-1935      0.69      0.76      0.78      0.66      1.00      0.81      0.73
## Year-1936      0.81      0.65      0.77      0.72      0.81      1.00      0.94
## Year-1937      0.85      0.62      0.76      0.83      0.73      0.94      1.00
## Year-1938      0.47      0.76      0.69      0.49      0.87      0.52      0.43
## Year-1939      0.72      0.72      0.77      0.75      0.85      0.80      0.81
## Year-1940      0.66      0.58      0.66      0.66      0.81      0.91      0.89
##           Year-1938 Year-1939 Year-1940
## Year-1931      0.47      0.72      0.66
## Year-1932      0.76      0.72      0.58
## Year-1933      0.69      0.77      0.66
## Year-1934      0.49      0.75      0.66
## Year-1935      0.87      0.85      0.81
## Year-1936      0.52      0.80      0.91
## Year-1937      0.43      0.81      0.89
## Year-1938      1.00      0.73      0.57
## Year-1939      0.73      1.00      0.86
## Year-1940      0.57      0.86      1.00

# Correlation plot
library(corrplot)
corrplot(res, type = "upper", order = "hclust", 
         tl.col = "black", tl.srt = 45)

# Correlation chart
library(PerformanceAnalytics)
chart.Correlation(df, histogram = TRUE, pch = 19)
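The correlation matrix can also be read programmatically. The sketch below rebuilds a 3 x 3 excerpt of the matrix printed above (years 1936 to 1938, values copied from the output) and ranks the year pairs by correlation. The object names `sub_cor` and `pair_tab` are illustrative and not part of the original analysis.

```r
# Excerpt of the rounded correlation matrix shown above (years 1936-1938).
yrs <- c("Year-1936", "Year-1937", "Year-1938")
sub_cor <- matrix(c(1.00, 0.94, 0.52,
                    0.94, 1.00, 0.43,
                    0.52, 0.43, 1.00),
                  nrow = 3, byrow = TRUE, dimnames = list(yrs, yrs))

# Positions of the upper triangle, i.e. each pair of years exactly once.
idx <- which(upper.tri(sub_cor), arr.ind = TRUE)

# One row per pair, sorted from the strongest to the weakest correlation.
pair_tab <- data.frame(year_a = yrs[idx[, "row"]],
                       year_b = yrs[idx[, "col"]],
                       r = sub_cor[upper.tri(sub_cor)])
pair_tab <- pair_tab[order(-pair_tab$r), ]
pair_tab
```

In this excerpt, 1936 and 1937 are the most strongly correlated pair (0.94) and 1937 and 1938 the weakest (0.43). The same code should work on the full matrix `res` if `sub_cor` and `yrs` are replaced accordingly.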

Fuzzy C-Means

Fuzzy c-means (FCM) clustering is closely related to k-means clustering, which is why some call it fuzzy k-means clustering. It is a soft clustering method: when the data are grouped, each data point can belong to more than one cluster. For example, in hard clustering a tomato must be grouped as either red or green, whereas in fuzzy clustering it can belong to both groups. Membership is expressed as a degree between 0 and 1; a tomato that is equally red and green would have a membership of 0.5 in red and 0.5 in green.
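For reference, the standard FCM formulation can be written as below. The notation is an assumption of this note rather than something defined in the text: x_i are the n data objects, v_j the k cluster prototypes, u_ij the membership degree of object i in cluster j, and m > 1 the fuzziness exponent.

```latex
J_m(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{k} u_{ij}^{m} \, \lVert x_i - v_j \rVert^2,
\qquad \text{subject to } \sum_{j=1}^{k} u_{ij} = 1 \text{ for every } i

u_{ij} = \frac{1}{\sum_{l=1}^{k} \left( \frac{\lVert x_i - v_j \rVert}{\lVert x_i - v_l \rVert} \right)^{2/(m-1)}},
\qquad
v_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} \, x_i}{\sum_{i=1}^{n} u_{ij}^{m}}
```

The first line is the objective function that FCM minimizes; the second line gives the two update rules that are applied in turn until the memberships stop changing. As m approaches 1, the memberships approach 0 or 1 and the method behaves like k-means.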

Run FCM

In order to start FCM as well as the other alternating optimization algorithms, an initialization step is required to build the initial cluster prototypes matrix and fuzzy membership degrees matrix. Although this task is usually performed in the initialization step of the clustering algorithm, the initial prototypes and memberships can also be directly input by the user.
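The alternating optimization idea can be sketched in a few lines of base R. This is a simplified illustration, not the ppclust implementation: the function `fcm_sketch`, its arguments, and the choice of evenly spaced rows as initial prototypes are all assumptions made for this example.

```r
# Minimal FCM sketch in base R (illustrative only; not the ppclust code).
fcm_sketch <- function(x, k, m = 2, max_iter = 100, tol = 1e-6) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Initialization: take k evenly spaced rows as initial prototypes
  # (a deliberately simple stand-in for a method such as K-means++).
  v <- x[round(seq(1, n, length.out = k)), , drop = FALSE]
  u <- matrix(0, n, k)
  for (iter in seq_len(max_iter)) {
    # Squared Euclidean distance from every object to every prototype.
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, v[j, ])^2))
    d2 <- pmax(d2, .Machine$double.eps)  # guard against division by zero
    # Membership update: proportional to d^(-2/(m-1)), each row sums to 1.
    w <- d2^(-1 / (m - 1))
    u_new <- w / rowSums(w)
    # Prototype update: means of the objects weighted by u^m.
    um <- u_new^m
    v <- t(um) %*% x / colSums(um)
    # Stop once the memberships no longer change noticeably.
    converged <- max(abs(u_new - u)) < tol
    u <- u_new
    if (converged) break
  }
  list(u = u, v = v, iter = iter)
}

# Toy data: two well-separated groups of 20 points each.
set.seed(42)
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 10), ncol = 2))
fit <- fcm_sketch(toy, k = 2)
round(head(fit$u, 3), 3)
fit$v
```

The returned list mirrors the naming used by the package (`u` for memberships, `v` for prototypes), but only for readability; the real fcm function offers more initialization methods, distance options, and outputs.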

FCM is usually started by giving an integer that specifies the number of clusters. In this case, the prototypes matrix is generated internally by one of the prototype initialization algorithms included in the package inaparc. In the current version of the fcm function in the ppclust package, the default initialization technique is K-means++. In the following code block, FCM runs for three clusters with the default values of the remaining arguments. This is the usual way to run FCM with a predetermined number of clusters.

library(ppclust)
data <- f[,-1]
res.fcm <- fcm(data, centers=3)

Clustering Results

The fuzzy membership degrees matrix is the main output of the function fcm, presented as follows.

df <- as.data.frame(res.fcm$u)
df

##      Cluster 1   Cluster 2   Cluster 3
## 1  0.889617944 0.016525523 0.093856533
## 2  0.896941516 0.015863375 0.087195109
## 3  0.929108559 0.011468127 0.059423314
## 4  0.945001766 0.009186187 0.045812047
## 5  0.957996995 0.006629469 0.035373537
## 6  0.113166207 0.240970092 0.645863701
## 7  0.107550952 0.131687099 0.760761950
## 8  0.101280768 0.065046890 0.833672342
## 9  0.169391949 0.065443902 0.765164149
## 10 0.286015152 0.067036371 0.646948477
## 11 0.091913310 0.062081858 0.846004832
## 12 0.175078747 0.062809820 0.762111433
## 13 0.388424373 0.057144794 0.554430833
## 14 0.687511880 0.037882124 0.274605997
## 15 0.825586753 0.030857897 0.143555350
## 16 0.939863646 0.010632261 0.049504093
## 17 0.947465362 0.009684093 0.042850545
## 18 0.970325586 0.005079270 0.024595143
## 19 0.776101179 0.025039442 0.198859379
## 20 0.073890241 0.076858683 0.849251075
## 21 0.901108200 0.018597168 0.080294632
## 22 0.904346360 0.017784463 0.077869177
## 23 0.908128266 0.016824550 0.075047184
## 24 0.912731768 0.016015000 0.071253233
## 25 0.924279897 0.013800165 0.061919938
## 26 0.859962506 0.021760226 0.118277268
## 27 0.837903316 0.025240188 0.136856496
## 28 0.836789881 0.025122372 0.138087747
## 29 0.853767591 0.021783410 0.124448999
## 30 0.885256259 0.017126546 0.097617195
## 31 0.888373984 0.017958808 0.093667209
## 32 0.888810181 0.021196819 0.089993000
## 33 0.908756025 0.018036184 0.073207792
## 34 0.919193241 0.015923429 0.064883330
## 35 0.923853972 0.014482401 0.061663627
## 36 0.323210365 0.091643226 0.585146409
## 37 0.536921059 0.072403424 0.390675517
## 38 0.917913924 0.014216826 0.067869251
## 39 0.681122743 0.039274013 0.279603244
## 40 0.934687625 0.011839062 0.053473313
## 41 0.828917214 0.021112576 0.149970211
## 42 0.020600510 0.930516760 0.048882731
## 43 0.012811395 0.955484676 0.031703929
## 44 0.003618514 0.986712318 0.009669168
## 45 0.005569341 0.976988274 0.017442385
## 46 0.069334099 0.086479552 0.844186349
## 47 0.062468597 0.065961144 0.871570259
## 48 0.149001798 0.036909330 0.814088872
## 49 0.146514277 0.043883147 0.809602577
## 50 0.175934075 0.066430546 0.757635380
## 51 0.187612049 0.201667760 0.610720191
## 52 0.144419210 0.262978949 0.592601841
## 53 0.125994116 0.337225334 0.536780550
## 54 0.111410019 0.401477420 0.487112561
## 55 0.108997340 0.182800448 0.708202212
## 56 0.183864592 0.105377772 0.710757636
## 57 0.213166438 0.096281850 0.690551712
## 58 0.244101536 0.088968803 0.666929662
## 59 0.300290685 0.077774219 0.621935096
## 60 0.435768093 0.061567190 0.502664717
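To illustrate how the membership matrix is read, the sketch below copies four rows from the output above (objects 1, 6, 42 and 54) and derives a crisp label for each by taking the cluster with the largest membership degree. The object names `u`, `crisp` and `top` are only for illustration.

```r
# Four rows copied from the membership matrix above (objects 1, 6, 42, 54).
u <- matrix(c(0.889617944, 0.016525523, 0.093856533,
              0.113166207, 0.240970092, 0.645863701,
              0.020600510, 0.930516760, 0.048882731,
              0.111410019, 0.401477420, 0.487112561),
            ncol = 3, byrow = TRUE,
            dimnames = list(c("1", "6", "42", "54"),
                            paste("Cluster", 1:3)))

# Each row sums to 1, up to rounding in the printed values.
rowSums(u)

# Crisp label: the cluster with the largest membership degree.
crisp <- apply(u, 1, which.max)
# Strength of that assignment: the largest membership degree itself.
top <- apply(u, 1, max)
data.frame(crisp = crisp, top = round(top, 3))
```

Objects 1 and 42 are assigned clearly (memberships of about 0.89 and 0.93), while object 54 is a borderline case: its largest membership is only about 0.49 in Cluster 3, with about 0.40 in Cluster 2.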

The initial and final cluster prototypes matrices can be obtained as follows:

res.fcm$v0

##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Cluster 1    297.67    383.67    362.62    260.95    438.61    605.48    670.15
## Cluster 2   1347.30   1551.38   1476.87   1467.85   1416.75   2039.39   2245.62
## Cluster 3    709.93   1644.58   1212.05    702.77   1003.72   1001.17    777.46
##           Year-1938 Year-1939 Year-1940
## Cluster 1    345.63    710.95    590.41
## Cluster 2    939.09   1354.72   1546.66
## Cluster 3   1205.99    895.87    820.05

res.fcm$v

##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Cluster 1  475.0462  693.3888  646.7344  477.0544  421.4518  659.2198  549.6685
## Cluster 2 1290.6401 1612.7112 1528.8296 1481.2221 1425.4179 2125.1090 2309.7916
## Cluster 3  922.9729 1189.9605 1085.0147 1019.8172  860.6241 1157.9839 1172.1680
##           Year-1938 Year-1939 Year-1940
## Cluster 1  450.0273  547.4263  476.8767
## Cluster 2 1077.8014 1439.5518 1668.0428
## Cluster 3  723.6102  995.3122  894.5514

Summary

summary(res.fcm)

## Summary for 'res.fcm'
## 
## Number of data objects:  60 
## 
## Number of clusters:  3 
## 
## Crisp clustering vector:
##  [1] 1 1 1 1 1 3 3 3 3 3 3 3 3 1 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1
## [39] 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## 
## Initial cluster prototypes:
##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Cluster 1    297.67    383.67    362.62    260.95    438.61    605.48    670.15
## Cluster 2   1347.30   1551.38   1476.87   1467.85   1416.75   2039.39   2245.62
## Cluster 3    709.93   1644.58   1212.05    702.77   1003.72   1001.17    777.46
##           Year-1938 Year-1939 Year-1940
## Cluster 1    345.63    710.95    590.41
## Cluster 2    939.09   1354.72   1546.66
## Cluster 3   1205.99    895.87    820.05
## 
## Final cluster prototypes:
##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Cluster 1  475.0462  693.3888  646.7344  477.0544  421.4518  659.2198  549.6685
## Cluster 2 1290.6401 1612.7112 1528.8296 1481.2221 1425.4179 2125.1090 2309.7916
## Cluster 3  922.9729 1189.9605 1085.0147 1019.8172  860.6241 1157.9839 1172.1680
##           Year-1938 Year-1939 Year-1940
## Cluster 1  450.0273  547.4263  476.8767
## Cluster 2 1077.8014 1439.5518 1668.0428
## Cluster 3  723.6102  995.3122  894.5514
## 
## Distance between the final cluster prototypes
##           Cluster 1 Cluster 2
## Cluster 2  12160469          
## Cluster 3   2212948   4193363
## 
## Difference between the initial and final cluster prototypes
##           Year-1931  Year-1932  Year-1933 Year-1934   Year-1935 Year-1936
## Cluster 1 177.37621  309.71875  284.11435 216.10437  -17.158230  53.73978
## Cluster 2 -56.65991   61.33119   51.95963  13.37213    8.667944  85.71896
## Cluster 3 213.04289 -454.61946 -127.03532 317.04718 -143.095852 156.81392
##            Year-1937 Year-1938  Year-1939  Year-1940
## Cluster 1 -120.48147  104.3973 -163.52371 -113.53332
## Cluster 2   64.17161  138.7114   84.83183  121.38277
## Cluster 3  394.70796 -482.3798   99.44218   74.50137
## 
## Root Mean Squared Deviations (RMSD): 633.328 
## Mean Absolute Deviation (MAD): 15698.8 
## 
## Membership degrees matrix (top and bottom 5 rows): 
##   Cluster 1   Cluster 2  Cluster 3
## 1 0.8896179 0.016525523 0.09385653
## 2 0.8969415 0.015863375 0.08719511
## 3 0.9291086 0.011468127 0.05942331
## 4 0.9450018 0.009186187 0.04581205
## 5 0.9579970 0.006629469 0.03537354
## ...
##    Cluster 1  Cluster 2 Cluster 3
## 56 0.1838646 0.10537777 0.7107576
## 57 0.2131664 0.09628185 0.6905517
## 58 0.2441015 0.08896880 0.6669297
## 59 0.3002907 0.07777422 0.6219351
## 60 0.4357681 0.06156719 0.5026647
## 
## Descriptive statistics for the membership degrees by clusters
##           Size       Min        Q1      Mean    Median        Q3       Max
## Cluster 1   31 0.5369211 0.8458355 0.8715595 0.9011082 0.9240669 0.9703256
## Cluster 2    4 0.9305168 0.9492427 0.9624255 0.9662365 0.9794193 0.9867123
## Cluster 3   25 0.4871126 0.6107202 0.6989878 0.7082022 0.8096026 0.8715703
## 
## Dunn's Fuzziness Coefficients:
## dunn_coeff normalized 
##  0.7044822  0.5567234 
## 
## Within cluster sum of squares by cluster:
##          1          2          3 
##  8915921.3   383353.1 18044479.9 
## (between_SS / total_SS =  68.94%) 
## 
## Available components: 
##  [1] "u"          "v"          "v0"         "d"          "x"         
##  [6] "cluster"    "csize"      "sumsqrs"    "k"          "m"         
## [11] "iter"       "best.start" "func.val"   "comp.time"  "inpargs"   
## [16] "algorithm"  "call"

All available components of the ppclust object are listed at the end of the summary. They can be accessed as elements of the object with the $ operator. For example, the execution time of the FCM run is accessed as follows:

res.fcm$comp.time

## [1] 0.5

FCM with Multiple Start

To look for a better solution, the function fcm can be started multiple times; the argument nstart is used for this purpose, as in the following code. With multiple starts, we may want to keep either the initial cluster prototypes or the initial membership degrees unchanged between the starts of the algorithm. The arguments fixcent and fixmemb fix the initial cluster prototypes matrix and the initial membership degrees matrix, respectively; here we use fixmemb.

res.fcm <- fcm(data, centers=3, nstart=5, fixmemb=TRUE)

Best Solution

The clustering result contains components that report the objective function value, the number of iterations, and the computing time obtained with each start of the algorithm. These are shown below.

res.fcm$func.val # objective function values

## [1] 19699527 19699527 19699527 19699527 19699527

res.fcm$iter # number of iterations

## [1] 49 62 56 62 62

res.fcm$comp.time # computing time

## [1] 0.37 0.52 0.49 0.54 0.50

Among the outputs from the successive starts of the algorithm, the best solution is the one from the start that gives the minimum value of the objective function. It is stored as the final clustering result of the multiple starts of FCM.

res.fcm$best.start

## [1] 1
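The selection rule described above can be mimicked with which.min(). This is a sketch of the idea using the values printed above, under the assumption that ties are resolved in favor of the first start; it is not taken from the package source, and the printed objective values are rounded, so the underlying values may differ slightly.

```r
# Objective function values as printed above (rounded in the output).
func_val <- c(19699527, 19699527, 19699527, 19699527, 19699527)
# Iteration counts as printed above.
iters <- c(49, 62, 56, 62, 62)

# which.min() returns the first index when several values tie.
best <- which.min(func_val)
best
iters[best]
```

Because all five starts reach the same printed objective value, the first start is selected, which agrees with the best.start value reported above.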

Summary

summary(res.fcm)

## Summary for 'res.fcm'
## 
## Number of data objects:  60 
## 
## Number of clusters:  3 
## 
## Crisp clustering vector:
##  [1] 3 3 3 3 3 1 1 1 1 1 1 1 1 3 3 3 3 3 3 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 3
## [39] 3 3 3 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 
## Initial cluster prototypes:
##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Cluster 1    489.35    728.29    634.99    571.23    427.53    474.13    284.85
## Cluster 2    858.09   1469.51   1295.34   1478.17   1417.83   1217.48   1252.54
## Cluster 3   1131.19   1177.29   1133.31    925.32   1112.10   1450.67   1381.79
##           Year-1938 Year-1939 Year-1940
## Cluster 1    483.91    320.36    150.89
## Cluster 2   1355.85   1520.86   1319.39
## Cluster 3    608.90   1104.03   1028.49
## 
## Final cluster prototypes:
##           Year-1931 Year-1932 Year-1933 Year-1934 Year-1935 Year-1936 Year-1937
## Cluster 1  922.9729 1189.9605 1085.0147 1019.8172  860.6241 1157.9839 1172.1680
## Cluster 2 1290.6401 1612.7112 1528.8296 1481.2221 1425.4179 2125.1090 2309.7916
## Cluster 3  475.0462  693.3888  646.7344  477.0544  421.4518  659.2198  549.6685
##           Year-1938 Year-1939 Year-1940
## Cluster 1  723.6102  995.3122  894.5514
## Cluster 2 1077.8014 1439.5518 1668.0428
## Cluster 3  450.0273  547.4263  476.8767
## 
## Distance between the final cluster prototypes
##           Cluster 1 Cluster 2
## Cluster 2   4193363          
## Cluster 3   2212948  12160469
## 
## Difference between the initial and final cluster prototypes
##           Year-1931 Year-1932 Year-1933   Year-1934   Year-1935 Year-1936
## Cluster 1  433.6229  461.6705  450.0247  448.587182  433.094148  683.8539
## Cluster 2  432.5501  143.2012  233.4896    3.052133    7.587944  907.6290
## Cluster 3 -656.1438 -483.9012 -486.5756 -448.265630 -690.648230 -791.4502
##           Year-1937 Year-1938  Year-1939 Year-1940
## Cluster 1  887.3180  239.7002  674.95218  743.6614
## Cluster 2 1057.2516 -278.0486  -81.30817  348.6528
## Cluster 3 -832.1215 -158.8727 -556.60371 -551.6133
## 
## Root Mean Squared Deviations (RMSD): 1756.919 
## Mean Absolute Deviation (MAD): 48684.84 
## 
## Membership degrees matrix (top and bottom 5 rows): 
##    Cluster 1   Cluster 2 Cluster 3
## 1 0.09385653 0.016525523 0.8896179
## 2 0.08719511 0.015863375 0.8969415
## 3 0.05942331 0.011468127 0.9291086
## 4 0.04581205 0.009186187 0.9450018
## 5 0.03537354 0.006629469 0.9579970
## ...
##    Cluster 1  Cluster 2 Cluster 3
## 56 0.7107576 0.10537777 0.1838646
## 57 0.6905517 0.09628185 0.2131664
## 58 0.6669297 0.08896880 0.2441015
## 59 0.6219351 0.07777422 0.3002907
## 60 0.5026647 0.06156719 0.4357681
## 
## Descriptive statistics for the membership degrees by clusters
##           Size       Min        Q1      Mean    Median        Q3       Max
## Cluster 1   25 0.4871126 0.6107202 0.6989878 0.7082022 0.8096026 0.8715703
## Cluster 2    4 0.9305168 0.9492427 0.9624255 0.9662365 0.9794193 0.9867123
## Cluster 3   31 0.5369211 0.8458355 0.8715595 0.9011082 0.9240669 0.9703256
## 
## Dunn's Fuzziness Coefficients:
## dunn_coeff normalized 
##  0.7044822  0.5567234 
## 
## Within cluster sum of squares by cluster:
##          1          2          3 
## 18044479.9   383353.1  8915921.3 
## (between_SS / total_SS =  68.94%) 
## 
## Available components: 
##  [1] "u"          "v"          "v0"         "d"          "x"         
##  [6] "cluster"    "csize"      "sumsqrs"    "k"          "m"         
## [11] "iter"       "best.start" "func.val"   "comp.time"  "inpargs"   
## [16] "algorithm"  "call"
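Comparing the two summaries, the cluster numbers differ (Cluster 1 and Cluster 3 have swapped), but the grouping itself looks the same. A cross-tabulation of the two crisp clustering vectors makes this easy to check. In the sketch below the vectors are retyped compactly from the two printed outputs; the names `run1`, `run2` and `same_partition` are illustrative.

```r
# Crisp labels from the two summaries above, written compactly with rep().
runs <- c(5, 8, 6, 1, 15, 1, 5, 4, 15)
run1 <- rep(c(1, 3, 1, 3, 1, 3, 1, 2, 3), times = runs)  # single start
run2 <- rep(c(3, 1, 3, 1, 3, 1, 3, 2, 1), times = runs)  # nstart = 5

# Cross-tabulate the two labelings.
tab <- table(run1, run2)
tab

# The partitions are identical if every row and every column of the
# table has exactly one non-zero cell.
same_partition <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
same_partition
```

Cluster numbers in FCM output are arbitrary labels, so results from different runs are best compared through tables like this one rather than by comparing the label vectors element by element.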

Visualization

Pairwise Scatter Plots

There are many ways to represent clustering results visually. One common technique is to display them using pairs of features. The plotcluster() function can be used to plot the clustering results as follows:

plotcluster(res.fcm, cp=1, trans=TRUE)

Cluster Plot

res.fcm2 <- ppclust2(res.fcm, "kmeans")
factoextra::fviz_cluster(res.fcm2, data = data, 
  ellipse.type = "convex",
  palette = "jco",
  repel = TRUE)

res.fcm3 <- ppclust2(res.fcm, "fanny")
cluster::clusplot(scale(data), res.fcm3$cluster,  
  main = "Cluster plot",
  color=TRUE, labels = 2, lines = 2, cex=1)

Validation of Results

Cluster validation is the process of evaluating the goodness of a clustering result. For this purpose, various validity indexes have been proposed in the literature. Since clustering is an unsupervised analysis that does not use any external information, internal indexes are used to validate the clustering results. Many internal indexes were originally proposed for the hard memberships produced by K-means and its variants, and most of them cannot be used for fuzzy clustering results. In R, the Partition Entropy (PE), Partition Coefficient (PC), Modified Partition Coefficient (MPC), and Fuzzy Silhouette Index are available in the fclust package.

library(fclust)
res.fcm4 <- ppclust2(res.fcm, "fclust")
idxsf <- SIL.F(res.fcm4$Xca, res.fcm4$U, alpha=1)
idxpe <- PE(res.fcm4$U)
idxpc <- PC(res.fcm4$U)
idxmpc <- MPC(res.fcm4$U)

cat("Partition Entropy: ", idxpe)

## Partition Entropy:  0.5356576

cat("Partition Coefficient: ", idxpc)

## Partition Coefficient:  0.7044822

cat("Modified Partition Coefficient: ", idxmpc)

## Modified Partition Coefficient:  0.5567234

cat("Fuzzy Silhouette Index: ", idxsf)

## Fuzzy Silhouette Index:  0.7457829
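To make three of these indexes more concrete, the sketch below computes them by hand from a membership matrix using their usual textbook definitions. The helper names `pc_index`, `pe_index` and `mpc_index` are hypothetical, and the use of the natural logarithm for the entropy is an assumption of this sketch.

```r
# Hand-rolled versions of three fuzzy validity indexes (illustrative sketch).
# PC: mean of the squared memberships per object (1 = crisp, 1/k = fuzziest).
pc_index  <- function(u) sum(u^2) / nrow(u)
# PE: mean entropy of the membership rows (0 = crisp); 0 * log(0) counts as 0.
pe_index  <- function(u) -sum(ifelse(u > 0, u * log(u), 0)) / nrow(u)
# MPC: PC rescaled so that it ranges from 0 (fuzziest) to 1 (crisp).
mpc_index <- function(pc, k) 1 - k / (k - 1) * (1 - pc)

# Two extreme 2-cluster cases: a perfectly crisp and a perfectly fuzzy partition.
crisp <- rbind(c(1, 0), c(0, 1))
fuzzy <- rbind(c(0.5, 0.5), c(0.5, 0.5))
c(pc_index(crisp), pe_index(crisp))
c(pc_index(fuzzy), pe_index(fuzzy))

# MPC rescaled from the PC reported above for k = 3 clusters.
mpc_index(0.7044822, k = 3)
```

The last line reproduces the Modified Partition Coefficient reported above from the Partition Coefficient, up to rounding. The PC and MPC values also match the Dunn's fuzziness coefficient and its normalized form shown in the summary output.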

Abdul Samad

June 26, 2021