We deal with clustering in almost every aspect of daily life, and it is the subject of active research in several fields such as statistics, pattern recognition, and machine learning. In data mining, clustering has to handle very large data sets with many attributes of different types, which imposes unique computational requirements on the relevant algorithms. A variety of algorithms have emerged in recent years that meet these requirements and have been successfully applied to real-life problems.
Clustering methods are divided into two basic types: hierarchical and flat clustering. Within each of these types there exists a wealth of subtypes and different algorithms for finding the clusters. The goal of flat clustering is to create clusters that are coherent internally and clearly different from each other: data within a cluster should be as similar as possible, while data in one cluster should be as dissimilar as possible from data in other clusters. Hierarchical clustering builds a cluster hierarchy that can be represented as a tree of clusters, in which each cluster can be a child, a parent and a sibling of other clusters. Even though hierarchical clustering is superior to flat clustering in representing the structure of the data, it has the drawback of being computationally intensive. Among the most popular clustering techniques are k-means, k-medoids (PAM), CLARA and agglomerative hierarchical clustering; all of them assign each object to exactly one cluster, which is known as hard clustering.
In contrast, algorithms have also been developed that do not require binary membership. Such methods are called fuzzy clustering. They allow objects to belong to several clusters simultaneously, with different degrees of membership. In many situations fuzzy clustering is more natural than hard clustering: objects on the boundary between several classes are not forced to fully belong to one of them, but are instead assigned membership degrees between 0 and 1 indicating their partial membership.
To visualize the biggest difference between hard and fuzzy clustering, we may refer to the picture below. Hard clustering assigns a one (1) when an observation belongs to a cluster and a zero (0) when it does not, so the left-hand matrix may represent such an assignment. Fuzzy clustering, on the other hand, assigns to each observation a degree of membership in each of the clusters (the right-hand matrix may represent such an assignment).
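As a minimal illustration (my own example, not the original figure), membership matrices for three observations and two clusters could look as follows, hard clustering first and fuzzy second:
# hard (crisp) membership: each row contains a single 1
hard  <- matrix(c(1, 0,
                  0, 1,
                  1, 0), ncol = 2, byrow = TRUE)
# fuzzy membership: each row sums to 1, entries are degrees of membership
fuzzy <- matrix(c(0.9, 0.1,
                  0.3, 0.7,
                  0.5, 0.5), ncol = 2, byrow = TRUE)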
The best-known fuzzy clustering algorithm is fuzzy c-means (FCM); in this project I use its close relative fanny() (fuzzy analysis clustering) from the R cluster package.
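For reference, classical fuzzy c-means minimizes the objective J = sum over observations i and clusters j of u_ij^m * ||x_i - c_j||^2, where u_ij is the membership of observation i in cluster j (the memberships of each observation sum to 1), c_j is the center of cluster j, and m > 1 is the fuzzifier; the larger m, the softer the memberships. fanny() minimizes a closely related dissimilarity-based objective, with its memb.exp argument playing the role of m.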
The dataset that this project operates on contains data on 721 Pokemon (800 rows in total, since Mega and other alternative forms have their own entries), including their number, name, first and second type, and basic stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed. It was provided by Kaggle (https://www.kaggle.com/abcsds/pokemon).
The main idea is to cluster the Pokemon according to similarities in their statistics, hence we end up with a dataset in the form presented below.
There are Legendary Pokemon in each generation of the games. They have more powerful statistics, so I also wanted to check whether a dataset limited to only those observations clusters well.
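The setup chunk with library calls is not shown in the original; a minimal sketch of the packages the code below appears to rely on (my assumption, inferred from the functions used) is:
library(dplyr)        # %>%, select, filter
library(cluster)      # pam, clara, fanny
library(factoextra)   # get_dist, fviz_dist, fviz_nbclust, fviz_cluster, fviz_silhouette
library(clustertend)  # hopkins
library(fpc)          # calinhara
library(clValid)      # clValid
library(dendextend)   # labels_colors, rect.dendrogram
library(RColorBrewer) # brewer.pal
library(corrplot)     # corrplot
library(gridExtra)    # grid.arrange
library(grid)         # unit
# plus the package providing groupBWplot, used for the cluster-wise boxplots below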
data <- read.csv('Pokemon.csv')
summary(data)
## Numeric variables:
##                Min.   1st Qu.   Median     Mean   3rd Qu.     Max.
## X.              1.0     184.8    364.5    362.8     539.2    721.0
## Total         180.0     330.0    450.0    435.1     515.0    780.0
## HP             1.00     50.00    65.00    69.26     80.00   255.00
## Attack            5        55       75       79       100      190
## Defense        5.00     50.00    70.00    73.84     90.00   230.00
## Sp..Atk       10.00     49.75    65.00    72.82     95.00   194.00
## Sp..Def        20.0      50.0     70.0     71.9      90.0    230.0
## Speed          5.00     45.00    65.00    68.28     90.00   180.00
## Generation    1.000     2.000    3.000    3.324     5.000    6.000
## Factor variables:
## Name:      Abomasnow 1, AbomasnowMega Abomasnow 1, Abra 1, Absol 1, AbsolMega Absol 1, Accelgor 1, (Other) 794
## Type.1:    Water 112, Normal 98, Grass 70, Bug 69, Psychic 57, Fire 52, (Other) 342
## Type.2:    (none) 386, Flying 97, Ground 35, Poison 34, Psychic 33, Fighting 26, (Other) 189
## Legendary: False 735, True 65
head(data)
## X. Name Type.1 Type.2 Total HP Attack Defense Sp..Atk Sp..Def Speed Generation Legendary
## 1 1 Bulbasaur Grass Poison 318 45 49 49 65 65 45 1 False
## 2 2 Ivysaur Grass Poison 405 60 62 63 80 80 60 1 False
## 3 3 Venusaur Grass Poison 525 80 82 83 100 100 80 1 False
## 4 3 VenusaurMega Venusaur Grass Poison 625 80 100 123 122 120 80 1 False
## 5 4 Charmander Fire 309 39 52 43 60 50 65 1 False
## 6 5 Charmeleon Fire 405 58 64 58 80 65 80 1 False
legendary <- data %>% select(-c(X., Name, Type.1, Type.2, Total, Generation)) %>% filter(Legendary == 'True') %>% select(-Legendary)
legendary_g <- data %>% select(c(Generation, Legendary)) %>% filter(Legendary == 'True') %>% select(-Legendary)
data <- data %>% select(-c(X., Name, Type.1, Type.2, Total, Generation, Legendary))
plot(data)        # pairwise scatter plots of the six base stats, all Pokemon
plot(legendary)   # pairwise scatter plots, Legendary Pokemon only
First of all we should check whether the datasets are clusterable at all. To do that I used the Hopkins statistic, a simple measure of the clustering tendency of a data set. Note that the implementation used here (the statistic is read from the $H component, as in the clustertend package) follows the convention in which values well below 0.5 indicate clusterable data, while values around 0.5 indicate random, structure-less data.
data_h <- hopkins(data, nrow(data)-1)
cat("All dataset: Hopkin's statistic is equal to:",data_h$H)
## All dataset: Hopkin's statistic is equal to: 0.1888738
leg_h <- hopkins(data, nrow(legendary)-1)
cat("Legendary dataset: Hopkin's statistic is equal to:",leg_h$H)
## Legendary dataset: Hopkin's statistic is equal to: 0.1842238
The Hopkins statistic is around 0.19 for both datasets, well below 0.5, which under this convention suggests that the data do exhibit some clustering tendency, so let us proceed with the analysis.
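As a quick sanity check of this convention (a sketch, assuming the clustertend implementation of hopkins()), the statistic can be recomputed on uniformly random data spanning the same ranges as the real variables; such data should score close to 0.5:
set.seed(123)
# uniform random data with the same shape and ranges as the real dataset
random_ref <- as.data.frame(lapply(data, function(x) runif(length(x), min(x), max(x))))
hopkins(random_ref, nrow(random_ref) - 1)$H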
Below we can see the Ordered Dissimilarity Matrices, which show distant observations in blue and close ones in red. We can easily notice clusters of data for both of our datasets.
d<-get_dist(data, method = "euclidean")
fviz_dist(d, show_labels = F) + labs(title = "Ordered Dissimilarity Matrix - All data")
d<-get_dist(legendary, method = "euclidean")
fviz_dist(d, show_labels = F) + labs(title = "Ordered Dissimilarity Matrix - Legendaries")
The next step is pre-diagnostics concerning the optimal number of clusters. The most popular technique here uses the silhouette, a statistic that measures how similar an object is to its own cluster (cohesion) compared to other clusters (separation).
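For a single observation i the silhouette is s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the average distance from i to the other members of its own cluster and b(i) is the average distance from i to the members of the nearest other cluster. It ranges from -1 to 1, and values close to 1 mean the observation sits firmly inside its cluster.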
a <- fviz_nbclust(data,FUNcluster=kmeans,method = "s") + labs(title = "K-means")
b <- fviz_nbclust(data,FUNcluster=pam,method = "s") + labs(title = "PAM")
c <- fviz_nbclust(data,FUNcluster=clara,method = "s") + labs(title = "CLARA")
# fanny (fuzzy clustering), assessed with the same silhouette criterion
d <- fviz_nbclust(data,FUNcluster=fanny,method = "s") + labs(title = "Fuzzy clustering")
grid.arrange(a,b,c,d, top = "Optimal number of clusters")
clusnum <- 2
# Legendary Pokemon dataset
a <- fviz_nbclust(legendary,FUNcluster=kmeans,method = "s") + labs(title = "K-means")
b <- fviz_nbclust(legendary,FUNcluster=pam,method = "s") + labs(title = "PAM")
c <- fviz_nbclust(legendary,FUNcluster=clara,method = "s") + labs(title = "CLARA")
d <- fviz_nbclust(legendary,FUNcluster=fanny,method = "s") + labs(title = "Fuzzy clustering")
grid.arrange(a,b,c,d, top = "Optimal number of clusters")
clus_leg <- 6
The greater the silhouette value, the better, so we should choose the number of clusters with the highest average silhouette width. It can easily be seen that the optimal number of clusters equals 2 for each of the flat and fuzzy partitioning methods, and this is the number used from now on.
For the Legendary Pokemon dataset the optimal value varies between methods. That is why I picked the number of clusters myself: 6, because that is the number of generations present in the dataset.
The main part of the analysis begins here. I applied each of the flat partitioning algorithms to the dataset (the fuzzy one comes at the end). The division into two clusters is very similar across the methods, with some differences for outlying observations in the middle of the data cloud. Moreover, to understand the dimensions of the plotted data I ran a small PCA. The first principal component (x axis) loads on all of the analyzed variables, while the second (y axis) mainly contrasts Speed and Special Attack against Defense. The second cluster therefore groups the Pokemon with high values of all attributes.
There are also silhouette diagrams for the PAM and CLARA methods. CLARA appears to do better, creating clusters that are more homogeneous inside and more clearly separated from each other; note, however, that for CLARA the silhouette shown is computed only on the sampled subset of observations, which is why the cluster sizes in its plot are much smaller. Observations close to the center of a cluster have high silhouettes, while those lying far from the center have small or even negative ones.
Even so, the clustering is rather poor overall, because the last observations of the second cluster have negative silhouettes. For k-means I calculated the Calinski-Harabasz statistic (a variance ratio), which will later be compared with the 3-cluster case.
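For k clusters and n observations the Calinski-Harabasz index is CH = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster sum of squares and W the within-cluster sum of squares; higher values indicate more compact and better separated clusters.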
prcomp(data)$rotation[,1:2]
## PC1 PC2
## HP -0.3008079 -0.04221029
## Attack -0.4928918 -0.07654480
## Defense -0.3806345 -0.69521578
## Sp..Atk -0.5089806 0.38331141
## Sp..Def -0.3943698 -0.17389431
## Speed -0.3272626 0.57607928
ckm <- kmeans(data, clusnum)
ckm_p <- fviz_cluster(list(data=data, cluster=ckm$cluster), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "K-means")+theme(legend.position="bottom")
cpam <- pam(data, clusnum)
cpam_p <- fviz_cluster(list(data=data, cluster=cpam$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "PAM")+theme(legend.position="bottom")
claraa <-clara(data, clusnum)
clara_p <- fviz_cluster(list(data=data, cluster=claraa$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "CLARA")+theme(legend.position="bottom")
grid.arrange(ckm_p,cpam_p,clara_p,heights=unit(0.8, "npc") , top = "Clustering all data, clusters = 2")
chara2 <- round(calinhara(data,ckm$cluster),digits=2)  # Calinski-Harabasz index for k = 2, compared with the k = 3 value later
sil_pam <- fviz_silhouette(cpam) + labs(title = "PAM")
## cluster size ave.sil.width
## 1 1 296 0.47
## 2 2 504 0.16
sil_cla <- fviz_silhouette(claraa) + labs(title = "CLARA")
## cluster size ave.sil.width
## 1 1 22 0.49
## 2 2 22 0.13
grid.arrange(sil_pam,sil_cla,top = "Silhouette plots, clusters = 2")
For the Legendary Pokemon the number of clusters is 6, as many as there are generations in the dataset. With only 65 observations available this makes the analysis somewhat unstable. Nevertheless there are noticeable differences between the clusters.
First of all, the first dimension mainly contrasts the offensive attributes (Attack, Special Attack and Speed) with the defensive ones (Defense and Special Defense), while the second dimension loads on all variables except Speed. The main disadvantage of these clusterings is that they differ between the methods, so no single high-quality clustering emerges.
Some of the strongest Pokemon stay grouped together regardless of the method, but the biggest differences concern the outlying observations, which end up in different clusters each time.
The silhouette diagrams show that there is not much difference between the PAM and CLARA clusterings.
prcomp(legendary)$rotation[,1:2]
## PC1 PC2
## HP 0.01346974 0.2874259
## Attack 0.49750799 0.4532307
## Defense -0.44611129 0.3321664
## Sp..Atk 0.54138483 0.4464765
## Sp..Def -0.39580888 0.5163867
## Speed 0.32175594 -0.3682897
ckm <- kmeans(legendary, clus_leg)
ckm_p <- fviz_cluster(list(data=legendary, cluster=ckm$cluster), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "K-means")+theme(legend.position="bottom")
cpam <- pam(legendary, clus_leg)
cpam_p <- fviz_cluster(list(data=legendary, cluster=cpam$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "PAM")+theme(legend.position="bottom")
claraa <-clara(legendary, clus_leg)
clara_p <- fviz_cluster(list(data=legendary, cluster=claraa$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "CLARA")+theme(legend.position="bottom")
grid.arrange(ckm_p,cpam_p,clara_p,heights=unit(0.8, "npc") , top = "Clustering legendary Pokemon, clusters = 6")
sil_pam <- fviz_silhouette(cpam) + labs(title = "PAM")
## cluster size ave.sil.width
## 1 1 18 0.25
## 2 2 13 0.09
## 3 3 4 0.15
## 4 4 6 0.21
## 5 5 13 0.11
## 6 6 11 0.25
sil_cla <- fviz_silhouette(claraa) + labs(title = "CLARA")
## cluster size ave.sil.width
## 1 1 10 0.07
## 2 2 14 0.09
## 3 3 12 0.19
## 4 4 6 0.25
## 5 5 8 0.33
## 6 6 2 0.43
grid.arrange(sil_pam,sil_cla,top = "Silhouette plots, clusters = 6")
What if there were 3 clusters? The clustering now looks less sensible: most of the clusters overlap each other, the silhouette values for PAM and CLARA are noticeably lower than before, and the same holds for the Calinski-Harabasz index. We can therefore conclude that the 3-cluster case is clearly worse than the 2-cluster one, and only the smaller number of clusters should be considered.
ckm <- kmeans(data, 3)
ckm_p <- fviz_cluster(list(data=data, cluster=ckm$cluster), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "K-means")+theme(legend.position="bottom")
cpam <- pam(data, 3)
cpam_p <- fviz_cluster(list(data=data, cluster=cpam$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "PAM")+theme(legend.position="bottom")
claraa <-clara(data, 3)
clara_p <- fviz_cluster(list(data=data, cluster=claraa$clustering), ellipse.type="convex", geom="point",stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "CLARA")+theme(legend.position="bottom")
grid.arrange(ckm_p,cpam_p,clara_p,heights=unit(0.8, "npc") , top = "Clustering all data, clusters = 3")
round(calinhara(data,ckm$cluster),digits=2)
## [1] 273.86
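For the direct comparison promised earlier, the two-cluster value stored in chara2 can be printed next to this one (output not reproduced here):
chara2  # Calinski-Harabasz index for k = 2, computed above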
sil_pam <- fviz_silhouette(cpam) + labs(title = "PAM")
## cluster size ave.sil.width
## 1 1 266 0.33
## 2 2 234 0.13
## 3 3 300 0.06
sil_cla <- fviz_silhouette(claraa) + labs(title = "CLARA")
## cluster size ave.sil.width
## 1 1 10 0.56
## 2 2 16 0.19
## 3 3 20 0.09
grid.arrange(sil_pam,sil_cla,top = "Silhouette plots, clusters = 3")
There is also a branch of clustering called hierarchical clustering, a method of cluster analysis that seeks to build a hierarchy of clusters. With the Ward (ward.D2) linkage used below, at each step the two groups whose merger yields the smallest increase in total within-cluster variance are joined. Below is the dendrogram for the whole dataset with the tree cut into two clusters; it becomes highly unreadable after roughly the sixth level because of the size of the dataset.
d <- dist(data, method = "euclidean")
res <- hclust(d, method = "ward.D2" )
grp_all <- cutree(res, k = 2)
plot(res, cex = 0.6)
rect.hclust(res, k = 2, border = c('red','green'))
That is why I also provide the dendrogram for the Legendary Pokemon dataset, shown below with the tree cut into six groups, which is much more readable. Each observation is denoted by a leaf at the bottom of the tree. Similar observations are joined into groups at successive levels, all the way to the top of the tree where only one group remains. The important thing is to find a cut-off that reliably produces the best clustering.
To check whether there are clusters of Pokemon resembling the generations, the leaf labels are colored by generation. It is easily noticeable (by the 'eye' method) that each cluster contains observations coming from various generations.
d <- dist(legendary, method = "euclidean")
res <- as.dendrogram(hclust(d, method = "ward.D2" ))
cols<- brewer.pal(6,"Dark2")
labels_colors(res) <-cols
grp_leg <- cutree(res, k = 2)
pl<-plot(res, cex = 0.6)
x <- rect.dendrogram(res, k = 6, border = cols)
To statistically pick which clustering method provides the best results, let us use the clValid function, which reports the connectivity index, the Dunn index and the silhouette coefficient together. Connectivity should be as low as possible, whereas the Dunn index and the silhouette should be as high as possible.
The Dunn index identifies sets of clusters that are compact, with a small variance between members of the same cluster, and well separated, where the means of different clusters are sufficiently far apart relative to the within-cluster variance.
Compactness assesses cluster homogeneity, usually by looking at the intra-cluster variance, while separation quantifies the degree of separation between clusters (usually by measuring the distance between cluster centroids).
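Concretely, the Dunn index is the ratio of the smallest distance between observations belonging to different clusters to the largest within-cluster diameter, so larger values are better. Connectivity measures how often observations end up in a different cluster than their nearest neighbours, so it should be as small as possible.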
assess <- clValid(data, nClust = 2:4, clMethods = c("hierarchical", "kmeans", "pam", "clara","fanny"), validation = "internal",maxitems = 1000 )
summary(assess)
##
## Clustering Methods:
## hierarchical kmeans pam clara fanny
##
## Cluster sizes:
## 2 3 4
##
## Validation Measures:
## 2 3 4
##
## hierarchical Connectivity 2.9290 6.8869 31.1976
## Dunn 0.4024 0.3389 0.1521
## Silhouette 0.6322 0.5718 0.3944
## kmeans Connectivity 120.8877 249.0107 247.4766
## Dunn 0.0273 0.0313 0.0393
## Silhouette 0.2883 0.2325 0.2262
## pam Connectivity 84.4429 303.2147 278.5552
## Dunn 0.0347 0.0214 0.0472
## Silhouette 0.2733 0.1696 0.2273
## clara Connectivity 134.9786 259.0056 313.6921
## Dunn 0.0282 0.0314 0.0383
## Silhouette 0.2844 0.2300 0.2025
## fanny Connectivity 117.8472 NA NA
## Dunn 0.0273 NA NA
## Silhouette 0.2878 NA NA
##
## Optimal Scores:
##
## Score Method Clusters
## Connectivity 2.9290 hierarchical 2
## Dunn 0.4024 hierarchical 2
## Silhouette 0.6322 hierarchical 2
assess_leg <- clValid(legendary, nClust = 2:4, clMethods = c("hierarchical", "kmeans", "pam", "clara"), validation = "internal",maxitems = 1000 )
summary(assess_leg)
##
## Clustering Methods:
## hierarchical kmeans pam clara
##
## Cluster sizes:
## 2 3 4
##
## Validation Measures:
## 2 3 4
##
## hierarchical Connectivity 3.8579 12.9012 15.8302
## Dunn 0.2812 0.2332 0.2366
## Silhouette 0.4253 0.3376 0.2980
## kmeans Connectivity 35.2234 14.7782 26.4984
## Dunn 0.0863 0.2019 0.2366
## Silhouette 0.1942 0.3064 0.2360
## pam Connectivity 35.9476 44.6702 45.0056
## Dunn 0.0851 0.0858 0.1410
## Silhouette 0.1933 0.1934 0.1797
## clara Connectivity 30.8187 37.0107 44.7567
## Dunn 0.0537 0.1395 0.1410
## Silhouette 0.1889 0.1812 0.1745
##
## Optimal Scores:
##
## Score Method Clusters
## Connectivity 3.8579 hierarchical 2
## Dunn 0.2812 hierarchical 2
## Silhouette 0.4253 hierarchical 2
For both cases (the whole dataset and the Legendary Pokemon) it seems most reasonable to pick hierarchical clustering and cut the tree into two clusters, because all of the measures are best there: connectivity is the lowest, while the Dunn index and the silhouette are the highest.
Below we can see boxplots for the whole dataset and for the Legendary Pokemon, with one box per cluster for each variable. A boxplot presents the median and the overall spread of a variable.
All Pokemon
As mentioned before, it is easily noticeable that the second cluster consists of much better Pokemon: all of their statistics are higher. Interestingly, the variance within this cluster is also higher, and it is the cluster with the larger number of observations.
Legendary Pokemon
Cutting the dendrogram into two clusters produces one cluster with a large number of observations and another with just five very strong, defensive Legendary Pokemon. The variances in the two clusters are comparable.
groupBWplot(data, as.factor(grp_all), alpha=0.05)
groupBWplot(legendary, as.factor(grp_leg), alpha=0.05)
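To verify the cluster sizes mentioned above, the assignments from the two dendrogram cuts can simply be tabulated (a quick check; output not shown here):
table(grp_all)   # cluster sizes for the full dataset, k = 2
table(grp_leg)   # cluster sizes for the Legendary subset, k = 2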
res.f <- fanny(data, k=2, diss=FALSE, memb.exp = 1.2, metric = "euclidean",
stand = FALSE, maxit = 500)
res.f3 <- fanny(data, k=2, diss=FALSE, memb.exp = 1.3, metric = "euclidean",
stand = FALSE, maxit = 500)
head(res.f$membership)
## [,1] [,2]
## [1,] 0.997187736 0.002812264
## [2,] 0.805354144 0.194645856
## [3,] 0.005398136 0.994601864
## [4,] 0.009690168 0.990309832
## [5,] 0.997989775 0.002010225
## [6,] 0.821683501 0.178316499
res.f2 <- fanny(legendary, k=2, diss=FALSE, memb.exp = 2, metric = "euclidean",
stand = FALSE, maxit = 500)
res.f4 <- fanny(legendary, k=2, diss=FALSE, memb.exp = 2.5, metric = "euclidean",
stand = FALSE, maxit = 500)
head(res.f2$membership)
## [,1] [,2]
## [1,] 0.5 0.5
## [2,] 0.5 0.5
## [3,] 0.5 0.5
## [4,] 0.5 0.5
## [5,] 0.5 0.5
## [6,] 0.5 0.5
We can see that the algorithm returns, for each observation, a degree of membership in each of the two clusters. Note that for the Legendary dataset with memb.exp = 2 the memberships all collapse to 0.5, i.e. the partition becomes completely fuzzy; this is a known behavior of fanny when the fuzzifier is too large relative to the structure in the data.
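One way to quantify how crisp these memberships are is Dunn's partition coefficient reported by fanny() in the coeff component; it equals 1/k for completely fuzzy memberships and 1 for a hard partition:
res.f$coeff    # all data, memb.exp = 1.2
res.f2$coeff   # Legendary subset, memb.exp = 2; with all memberships at 0.5 it sits at its lower bound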
res.f_p <- fviz_cluster(list(data=data, cluster=res.f$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "Fuzzy = 1.2")+theme(legend.position="bottom")
res.f3_p <- fviz_cluster(list(data=data, cluster=res.f3$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "Fuzzy = 1.3")+theme(legend.position="bottom")
grid.arrange(res.f_p,res.f3_p,heights=unit(0.8, "npc") , top = "Fuzzy clustering - all data, clusters = 2")
sil_1 <- fviz_silhouette(res.f) + labs(title = "Fuzzy = 1.2")
## cluster size ave.sil.width
## 1 1 365 0.43
## 2 2 435 0.17
sil_3 <- fviz_silhouette(res.f3) + labs(title = "Fuzzy = 1.3")
## cluster size ave.sil.width
## 1 1 372 0.42
## 2 2 428 0.17
grid.arrange(sil_1,sil_3,top = "Silhouette plots, clusters = 2")
res.f2_p <- fviz_cluster(list(data=legendary, cluster=res.f2$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "Fuzzy = 2")+theme(legend.position="bottom")
res.f4_p <- fviz_cluster(list(data=legendary, cluster=res.f4$clustering), ellipse.type="convex", geom="point", stand=FALSE, palette="Dark2", ggtheme=theme_minimal()) + labs(title = "Fuzzy = 2.5")+theme(legend.position="bottom")
grid.arrange(res.f2_p,res.f4_p,heights=unit(0.8, "npc") , top = "Fuzzy clustering - Legendary Pokemon, clusters = 2")
sil_2 <- fviz_silhouette(res.f2) + labs(title = "Fuzzy = 2")
## cluster size ave.sil.width
## 1 1 32 0.19
## 2 2 33 0.20
sil_4 <- fviz_silhouette(res.f4) + labs(title = "Fuzzy = 2.5")
## cluster size ave.sil.width
## 1 1 31 0.20
## 2 2 34 0.19
grid.arrange(sil_2,sil_4,top = "Silhouette plots, clusters = 2")
Increasing the fuzziness exponent changes the silhouette values only marginally for both analyzed datasets. The resulting silhouette coefficients are comparable to those obtained with PAM and CLARA for two clusters, but clearly below the values achieved by hierarchical clustering.
For the smaller dataset it is also possible to visualize the degrees of membership in the clusters, as shown in the plot below.
corrplot(t(res.f2$membership), is.corr = FALSE, method = "shade")
To sum up, the winning method in both cases was hierarchical clustering, which provided the clusters with the best validation statistics, so dendrograms should be used for clustering datasets of this kind. Moreover, the fanny (fuzzy) clustering that was one of the aims of this analysis showed only mediocre results.
There are a few differences between the applications of flat and hierarchical clustering in information retrieval. In particular, hierarchical clustering is appropriate for search-results clustering or collection clustering (such as the Pokemon collection here). In general, we select flat clustering when efficiency is important and hierarchical clustering when one of the potential problems of flat clustering (not enough structure, a predetermined number of clusters, non-determinism) is a concern. In addition, many researchers believe that hierarchical clustering produces better clusters than flat clustering; the results of this study support that view.
List of Pokemon: https://pokemondb.net/pokedex/national