Introduction

Cluster analysis is a multivariate, unsupervised learning method that partitions observations into subgroups, so that observations with similar traits end up in the same group. The technique can be applied to many kinds of data and has applications such as customer segmentation and the clustering of texts or films. Two of the best-known clustering techniques are K-means and K-medoids. The focus of this paper is to use cluster analysis to classify countries into subgroups on the basis of human development indicators, so that countries with similar human development characteristics are grouped together.

The main aim of this paper is to apply, present, and compare several clustering and dimension reduction methods on human development indicators of selected countries. Comparing the methods shows how each of them behaves on this kind of socio-economic data. The data cover countries from all regions of the world, so considerable variation can be expected, which should improve the quality of the analysis. The data were retrieved from the UN world explorer database. The raw file stores the 189 countries as columns and the five indicators as rows; since the countries are the observations to be clustered and the indicators are the variables, the matrix is transposed before the analysis. The selected human development indicators are as follows:

Human Development Index (HDI) (hd_index, v1)
Expected Years of Schooling (expct_sch, v2)
Life Expectancy at Birth (life_exp, v3)
Mean Years of Schooling (mean_sch, v4)
Gross National Income (GNI) Per Capita (gni_capita, v5)

library(cluster)
library(factoextra)

## Loading required package: ggplot2

## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa

library(flexclust)

## Loading required package: grid

## Loading required package: lattice

## Loading required package: modeltools

## Loading required package: stats4

library(fpc)
library(clustertend)
library(ClusterR)

## Loading required package: gtools

library(readxl)
library(gridExtra)
library(clValid)

## Warning: package 'clValid' was built under R version 4.0.4

## 
## Attaching package: 'clValid'

## The following object is masked from 'package:flexclust':
## 
##     clusters

## The following object is masked from 'package:modeltools':
## 
##     clusters

# Set the working directory
setwd("D:")

# Read the data set
data <- read_excel("D:\\dev.xlsx")

## New names:
## * `` -> ...1

# dim() returns the dimensions of the data set; head() displays its first rows (at most six by default, so all five rows are shown here)
dim(data)

## [1]   5 190

head(data)

## # A tibble: 5 x 190
##   ...1   Norway Switzerland Ireland Germany `Hong Kong, Chi~ Australia Iceland
##   <chr>   <dbl>       <dbl>   <dbl>   <dbl>            <dbl>     <dbl>   <dbl>
## 1 hd_i~ 9.54e-1       0.946 9.42e-1 9.39e-1            0.939     0.938 9.38e-1
## 2 expc~ 1.81e+1      16.2   1.88e+1 1.71e+1           16.5      22.1   1.92e+1
## 3 life~ 8.23e+1      83.6   8.21e+1 8.12e+1           84.7      83.3   8.29e+1
## 4 mean~ 1.26e+1      13.4   1.25e+1 1.41e+1           12.0      12.7   1.25e+1
## 5 gni_~ 6.81e+4   59375.    5.57e+4 4.69e+4        60221.    44097.    4.76e+4
## # ... with 182 more variables: Sweden <dbl>, Singapore <dbl>,
## #   Netherlands <dbl>, Denmark <dbl>, Finland <dbl>, Canada <dbl>, `New
## #   Zealand` <dbl>, `United Kingdom` <dbl>, `United States` <dbl>,
## #   Belgium <dbl>, Liechtenstein <dbl>, Japan <dbl>, Austria <dbl>,
## #   Luxembourg <dbl>, Israel <dbl>, `Korea (Republic of)` <dbl>,
## #   Slovenia <dbl>, Spain <dbl>, Czechia <dbl>, France <dbl>, Malta <dbl>,
## #   Italy <dbl>, Estonia <dbl>, Cyprus <dbl>, Greece <dbl>, Poland <dbl>,
## #   Lithuania <dbl>, `United Arab Emirates` <dbl>, Andorra <dbl>, `Saudi
## #   Arabia` <dbl>, Slovakia <dbl>, Latvia <dbl>, Portugal <dbl>, Qatar <dbl>,
## #   Chile <dbl>, `Brunei Darussalam` <dbl>, Hungary <dbl>, Bahrain <dbl>,
## #   Croatia <dbl>, Oman <dbl>, Argentina <dbl>, `Russian Federation` <dbl>,
## #   Belarus <dbl>, Kazakhstan <dbl>, Bulgaria <dbl>, Montenegro <dbl>,
## #   Romania <dbl>, Palau <dbl>, Barbados <dbl>, Kuwait <dbl>, Uruguay <dbl>,
## #   Turkey <dbl>, Bahamas <dbl>, Malaysia <dbl>, Seychelles <dbl>,
## #   Serbia <dbl>, `Trinidad and Tobago` <dbl>, `Iran (Islamic Republic
## #   of)` <dbl>, Mauritius <dbl>, Panama <dbl>, `Costa Rica` <dbl>,
## #   Albania <dbl>, Georgia <dbl>, `Sri Lanka` <dbl>, Cuba <dbl>, `Saint Kitts
## #   and Nevis` <dbl>, `Antigua and Barbuda` <dbl>, `Bosnia and
## #   Herzegovina` <dbl>, Mexico <dbl>, Thailand <dbl>, Grenada <dbl>,
## #   Brazil <dbl>, Colombia <dbl>, Armenia <dbl>, Algeria <dbl>, `North
## #   Macedonia` <dbl>, Peru <dbl>, China <dbl>, Ecuador <dbl>, Azerbaijan <dbl>,
## #   Ukraine <dbl>, `Dominican Republic` <dbl>, `Saint Lucia` <dbl>,
## #   Tunisia <dbl>, Mongolia <dbl>, Lebanon <dbl>, Botswana <dbl>, `Saint
## #   Vincent and the Grenadines` <dbl>, Jamaica <dbl>, `Venezuela (Bolivarian
## #   Republic of)` <dbl>, Dominica <dbl>, Fiji <dbl>, Paraguay <dbl>,
## #   Suriname <dbl>, Jordan <dbl>, Belize <dbl>, Maldives <dbl>, Tonga <dbl>,
## #   Philippines <dbl>, `Moldova (Republic of)` <dbl>, ...

The data set has 5 rows and 190 columns. The first column only holds the indicator names as character strings, so we drop it and keep the 189 numeric country columns.

working_data <- as.matrix(data[ ,2:190])

# We want to cluster countries, so we transpose the matrix to put one country in each row.

working_data <- t(working_data)

round(working_data,digits=2)

##                                    [,1]  [,2]  [,3]  [,4]      [,5]
## Norway                             0.95 18.06 82.27 12.57  68058.62
## Switzerland                        0.95 16.21 83.63 13.38  59374.73
## Ireland                            0.94 18.79 82.10 12.53  55659.68
## Germany                            0.94 17.10 81.18 14.13  46945.95
## Hong Kong, China (SAR)             0.94 16.51 84.69 12.04  60220.80
## Australia                          0.94 22.10 83.28 12.68  44097.02
## Iceland                            0.94 19.17 82.86 12.54  47566.45
## Sweden                             0.94 18.83 82.65 12.43  47955.45
## Singapore                          0.93 16.33 83.46 11.50  83792.67
## Netherlands                        0.93 18.04 82.14 12.19  50012.59
## Denmark                            0.93 19.07 80.78 12.59  48836.09
## Finland                            0.93 19.32 81.74 12.44  41779.26
## Canada                             0.92 16.09 82.32 13.32  43602.25
## New Zealand                        0.92 18.84 82.14 12.68  35107.50
## United Kingdom                     0.92 17.44 81.24 12.95  39507.29
## United States                      0.92 16.27 78.85 13.41  56140.23
## Belgium                            0.92 19.70 81.47 11.78  43820.84
## Liechtenstein                      0.92 14.72 80.54 12.55  99732.14
## Japan                              0.91 15.23 84.47 12.80  40799.01
## Austria                            0.91 16.29 81.43 12.56  46230.57
## Luxembourg                         0.91 14.23 82.10 12.20  65543.05
## Israel                             0.91 15.99 82.82 12.96  33649.69
## Korea (Republic of)                0.91 16.39 82.85 12.19  36757.02
## Slovenia                           0.90 17.42 81.17 12.27  32143.04
## Spain                              0.89 17.88 83.43  9.82  35041.30
## Czechia                            0.89 16.83 79.22 12.74  31597.07
## France                             0.89 15.49 82.54 11.42  40510.78
## Malta                              0.89 15.90 82.38 11.29  34795.18
## Italy                              0.88 16.25 83.35 10.25  36141.43
## Estonia                            0.88 16.06 78.57 13.03  30378.63
## Cyprus                             0.87 14.67 80.83 12.10  33100.32
## Greece                             0.87 17.34 82.07 10.54  24909.34
## Poland                             0.87 16.43 78.54 12.29  27625.80
## Lithuania                          0.87 16.50 75.74 12.96  29775.26
## United Arab Emirates               0.87 13.64 77.81 10.95  66911.66
## Andorra                            0.86 13.30 81.79 10.16  48640.89
## Saudi Arabia                       0.86 16.98 75.00  9.67  49338.41
## Slovakia                           0.86 14.53 77.39 12.61  30671.87
## Latvia                             0.85 15.98 75.17 12.83  26300.77
## Portugal                           0.85 16.30 81.86  9.19  27935.38
## Qatar                              0.85 12.18 80.10  9.67 110488.74
## Chile                              0.85 16.53 80.04 10.45  21972.28
## Brunei Darussalam                  0.84 14.38 75.72  9.10  76388.54
## Hungary                            0.84 15.12 76.70 11.89  27144.21
## Bahrain                            0.84 15.26 77.16  9.41  40399.12
## Croatia                            0.84 14.96 78.34 11.41  23060.96
## Oman                               0.83 14.66 77.63  9.73  37039.23
## Argentina                          0.83 17.64 76.52 10.56  17611.22
## Russian Federation                 0.82 15.54 72.39 12.02  25036.02
## Belarus                            0.82 15.36 74.59 12.31  17038.53
## Kazakhstan                         0.82 15.27 73.24 11.78  22167.70
## Bulgaria                           0.82 14.81 74.93 11.81  19645.94
## Montenegro                         0.82 15.03 76.77 11.39  17510.71
## Romania                            0.82 14.26 75.92 10.98  23905.77
## Palau                              0.81 15.55 73.68 12.40  16720.01
## Barbados                           0.81 15.16 79.08 10.56  15912.28
## Kuwait                             0.81 13.76 75.40  7.28  71164.22
## Uruguay                            0.81 16.34 77.77  8.73  19434.85
## Turkey                             0.81 16.44 77.44  7.67  24905.38
## Bahamas                            0.81 12.82 73.75 11.53  28395.40
## Malaysia                           0.80 13.47 76.00 10.16  27226.68
## Seychelles                         0.80 15.45 73.33  9.67  25076.87
## Serbia                             0.80 14.77 75.85 11.18  15217.70
## Trinidad and Tobago                0.80 12.96 73.38 11.03  28497.37
## Iran (Islamic Republic of)         0.80 14.73 76.48 10.01  18166.47
## Mauritius                          0.80 14.97 74.86  9.43  22724.23
## Panama                             0.80 12.88 78.33 10.17  20454.87
## Costa Rica                         0.79 15.38 80.10  8.67  14789.93
## Albania                            0.79 15.23 78.46 10.05  12299.80
## Georgia                            0.79 15.43 73.60 12.81   9569.52
## Sri Lanka                          0.78 13.97 76.81 11.05  11610.91
## Cuba                               0.78 14.37 78.73 11.75   7811.36
## Saint Kitts and Nevis              0.78 13.61 74.56  8.50  26770.07
## Antigua and Barbuda                0.78 12.45 76.89  9.26  22201.23
## Bosnia and Herzegovina             0.77 13.79 77.26  9.69  12689.68
## Mexico                             0.77 14.30 74.99  8.60  17628.12
## Thailand                           0.76 14.65 76.93  7.73  16128.55
## Grenada                            0.76 16.60 72.38  8.80  12683.83
## Brazil                             0.76 15.40 75.67  7.84  14068.05
## Colombia                           0.76 14.60 77.11  8.33  12895.59
## Armenia                            0.76 13.17 74.94 11.79   9277.23
## Algeria                            0.76 14.72 76.69  7.99  13639.43
## North Macedonia                    0.76 13.46 75.69  9.68  12873.75
## Peru                               0.76 13.85 76.52  9.22  12322.66
## China                              0.76 13.89 76.70  7.90  16126.57
## Ecuador                            0.76 14.92 76.80  8.99  10141.15
## Azerbaijan                         0.75 12.40 72.86 10.48  15240.14
## Ukraine                            0.75 15.07 71.95 11.34   7994.21
## Dominican Republic                 0.74 14.14 73.89  7.94  15074.26
## Saint Lucia                        0.74 13.87 76.06  8.49  11528.37
## Tunisia                            0.74 15.10 76.50  7.17  10676.96
## Mongolia                           0.73 14.21 69.69 10.17  10783.71
## Lebanon                            0.73 11.29 78.88  8.70  11136.25
## Botswana                           0.73 12.70 69.28  9.33  15951.33
## Saint Vincent and the Grenadines   0.73 13.57 72.42  8.62  11746.45
## Jamaica                            0.73 13.14 74.37  9.80   7931.52
## Venezuela (Bolivarian Republic of) 0.73 12.82 72.13 10.32   9069.70
## Dominica                           0.72 12.97 78.12  7.80   9245.16
## Fiji                               0.72 14.43 67.34 10.88   9110.44
## Paraguay                           0.72 12.69 74.13  8.45  11719.96
## Suriname                           0.72 12.86 71.57  9.13  11932.99
## Jordan                             0.72 11.88 74.40 10.45   8267.81
## Belize                             0.72 13.12 74.50  9.80   7135.97
## Maldives                           0.72 12.12 78.63  6.82  12549.26
## Tonga                              0.72 14.30 70.80 11.21   5782.57
## Philippines                        0.71 12.72 71.10  9.39   9539.70
## Moldova (Republic of)              0.71 11.61 71.81 11.56   6833.11
## Turkmenistan                       0.71 10.89 68.07  9.78  16407.47
## Uzbekistan                         0.71 12.01 71.57 11.52   6461.84
## Libya                              0.71 12.79 72.72  7.56  11684.73
## Indonesia                          0.71 12.92 71.51  7.98  11255.78
## Samoa                              0.71 12.52 73.19 10.59   5884.84
## South Africa                       0.70 13.67 63.86 10.24  11756.30
## Bolivia (Plurinational State of)   0.70 14.01 71.24  9.02   6849.20
## Gabon                              0.70 12.90 66.19  8.32  15794.08
## Egypt                              0.70 13.10 71.83  7.33  10743.81
## Marshall Islands                   0.70 12.39 73.86 10.89   4633.48
## Viet Nam                           0.69 12.69 75.32  8.20   6220.27
## Palestine, State of                0.69 12.84 73.89  9.10   5313.83
## Iraq                               0.69 11.15 70.45  7.32  15364.96
## Morocco                            0.68 13.07 76.45  5.51   7479.59
## Kyrgyzstan                         0.67 13.36 71.32 10.88   3316.79
## Guyana                             0.67 11.47 69.77  8.47   7615.42
## El Salvador                        0.67 12.04 73.10  6.94   6973.46
## Tajikistan                         0.66 11.41 70.88 10.67   3482.38
## Cabo Verde                         0.65 11.87 72.78  6.24   6513.49
## Guatemala                          0.65 10.62 74.06  6.47   7377.92
## Nicaragua                          0.65 12.21 74.28  6.80   4789.84
## India                              0.65 12.35 69.42  6.45   6828.60
## Namibia                            0.65 12.63 63.37  6.94   9682.66
## Timor-Leste                        0.63 12.40 69.26  4.54   7526.66
## Honduras                           0.62 10.21 75.09  6.60   4258.35
## Kiribati                           0.62 11.81 68.12  7.87   3917.43
## Bhutan                             0.62 12.13 71.46  3.14   8609.12
## Bangladesh                         0.61 11.20 72.32  6.06   4057.25
## Micronesia (Federated States of)   0.61 11.55 67.75  7.72   3700.10
## Sao Tome and Principe              0.61 12.69 70.17  6.44   3024.43
## Congo                              0.61 11.60 64.29  6.50   5803.88
## Eswatini (Kingdom of)              0.61 11.38 59.40  6.75   9359.11
## Lao People's Democratic Republic   0.60 11.06 67.61  5.20   6316.52
## Vanuatu                            0.60 11.42 70.32  6.84   2807.86
## Ghana                              0.60 11.52 63.78  7.18   4098.86
## Zambia                             0.59 12.06 63.51  7.10   3581.89
## Equatorial Guinea                  0.59  9.20 58.40  5.55  17795.54
## Myanmar                            0.58 10.32 66.87  4.95   5763.94
## Cambodia                           0.58 11.34 69.57  4.84   3597.40
## Kenya                              0.58 11.06 66.34  6.56   3051.69
## Nepal                              0.58 12.20 70.48  4.86   2748.20
## Angola                             0.57 11.78 60.78  5.13   5554.70
## Cameroon                           0.56 12.75 58.92  6.29   3291.13
## Zimbabwe                           0.56 10.45 61.20  8.34   2661.07
## Pakistan                           0.56  8.46 67.11  5.16   5190.08
## Solomon Islands                    0.56 10.22 72.83  5.54   2026.72
## Syrian Arab Republic               0.55  8.85 71.78  5.10   2725.19
## Papua New Guinea                   0.54 10.00 64.26  4.62   3685.80
## Comoros                            0.54 11.24 64.12  4.91   2426.39
## Rwanda                             0.54 11.17 68.70  4.42   1958.61
## Nigeria                            0.53  9.75 54.33  6.46   5085.54
## Tanzania (United Republic of)      0.53  8.01 65.02  6.01   2805.12
## Uganda                             0.53 11.24 62.97  6.09   1752.21
## Mauritania                         0.53  8.47 64.70  4.61   3746.08
## Madagascar                         0.52 10.41 66.68  6.10   1403.92
## Benin                              0.52 12.61 61.47  3.77   2134.59
## Lesotho                            0.52 10.74 53.70  6.35   3243.84
## Côte d'Ivoire                      0.52  9.63 57.42  5.19   3589.41
## Senegal                            0.51  8.97 67.67  3.07   3255.99
## Togo                               0.51 12.57 60.76  4.95   1592.54
## Sudan                              0.51  7.74 65.10  3.72   3961.62
## Haiti                              0.50  9.50 63.66  5.44   1664.89
## Afghanistan                        0.50 10.14 64.49  3.93   1745.67
## Djibouti                           0.50  6.48 66.58  4.00   3600.71
## Malawi                             0.49 10.95 63.80  4.63   1159.12
## Ethiopia                           0.47  8.71 66.24  2.80   1781.76
## Gambia                             0.47  9.48 61.74  3.67   1489.57
## Guinea                             0.47  9.01 61.19  2.71   2211.00
## Liberia                            0.46  9.58 63.73  4.67   1040.09
## Yemen                              0.46  8.66 66.10  3.20   1433.30
## Guinea-Bissau                      0.46 10.50 58.00  3.30   1593.18
## Congo (Democratic Republic of the) 0.46  9.70 60.37  6.76    800.02
## Mozambique                         0.45  9.75 60.16  3.54   1153.70
## Sierra Leone                       0.44 10.18 54.31  3.60   1381.30
## Burkina Faso                       0.43  8.91 61.17  1.59   1705.49
## Eritrea                            0.43  5.01 65.94  3.90   1707.71
## Mali                               0.43  7.60 58.89  2.35   1965.39
## Burundi                            0.42 11.30 61.25  3.12    659.73
## South Sudan                        0.41  5.00 57.60  4.85   1455.23
## Chad                               0.40  7.47 53.98  2.41   1715.57
## Central African Republic           0.38  7.57 52.80  4.28    776.68
## Niger                              0.38  6.47 62.02  2.03    912.04

# The values are rounded to two decimal places for readability only. The matrix above is the transformed data: countries are kept as row names, but there are no class labels, because clustering is performed on unlabelled data.
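The matrix printed above shows its columns as [,1] to [,5] because the indicator names were dropped together with the first column. A minimal sketch of one way to keep them, assuming a small hand-typed stand-in for the spreadsheet (the `toy` object and the choice of three countries are illustrative; the values are copied from the rounded output above):

```r
# Toy stand-in for the spreadsheet: indicators in rows, countries in columns.
toy <- data.frame(
  indicator   = c("hd_index", "expct_sch", "life_exp", "mean_sch", "gni_capita"),
  Norway      = c(0.95, 18.06, 82.27, 12.57, 68058.62),
  Switzerland = c(0.95, 16.21, 83.63, 13.38, 59374.73),
  Niger       = c(0.38,  6.47, 62.02,  2.03,   912.04)
)

# Drop the label column, transpose so that countries become rows,
# and reuse the dropped labels as column names.
toy_matrix <- t(as.matrix(toy[, -1]))
colnames(toy_matrix) <- toy$indicator

toy_matrix
```

With named columns, later output such as cluster means can be read without looking up which position corresponds to which indicator.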

Before proceeding, we standardise the data and check whether it contains any missing values.

working_data <- scale(working_data)
sum(is.na(working_data))

## [1] 0
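To illustrate why the standardisation step matters for these indicators, the sketch below compares Euclidean distances on raw values for three countries taken from the rounded matrix above. The object names (`raw`, `d_all`, `d_gni`, `gni_share`) are illustrative and not part of the original analysis.

```r
# Three rows copied from the rounded matrix above (illustrative subset).
raw <- rbind(
  Norway = c(0.95, 18.06, 82.27, 12.57,  68058.62),
  Qatar  = c(0.85, 12.18, 80.10,  9.67, 110488.74),
  Niger  = c(0.38,  6.47, 62.02,  2.03,    912.04)
)
colnames(raw) <- c("hd_index", "expct_sch", "life_exp", "mean_sch", "gni_capita")

# Euclidean distance using all five raw indicators versus GNI per capita alone.
d_all <- as.matrix(dist(raw))
d_gni <- as.matrix(dist(raw[, "gni_capita", drop = FALSE]))

# A ratio close to 1 means GNI alone accounts for almost the whole raw distance.
gni_share <- d_gni["Norway", "Niger"] / d_all["Norway", "Niger"]

# After scale(), every column has mean 0 and standard deviation 1,
# so each indicator contributes on a comparable footing.
scaled <- scale(raw)
col_means <- colMeans(scaled)
col_sds <- apply(scaled, 2, sd)
```

On raw values the income variable, measured in tens of thousands, swamps the other four indicators, which is the situation that `scale()` is meant to avoid.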

Prediagnostic Analysis to Check the Clustering Tendency

Since there are no missing values, we can move on. The next step is to check whether the data set has a tendency to cluster at all. To assess clusterability we use the get_clust_tendency function from the factoextra package, which returns the Hopkins statistic. The null hypothesis is that the data are uniformly distributed, that is, that they contain no meaningful clusters. If the statistic exceeds 0.75, we reject this null hypothesis at the 90% confidence level and treat the data as clusterable.
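To make the idea behind the statistic concrete, here is a rough base-R sketch of one common formulation, in which values near 1 point to clustered data and values near 0.5 to uniform data. It is an illustration under stated assumptions, not the factoextra implementation: the function name `hopkins_sketch`, the sample size `m`, and the simulated data are all hypothetical.

```r
# Illustrative Hopkins-type statistic; not the factoextra implementation.
hopkins_sketch <- function(x, m) {
  x <- as.matrix(x)
  # Distance from every row of `pts` to its nearest row of `ref`.
  nearest <- function(pts, ref) {
    apply(pts, 1, function(p) min(sqrt(colSums((t(ref) - p)^2))))
  }
  # w: distances from m sampled data points to their nearest other data point.
  idx <- sample(nrow(x), m)
  w <- sapply(idx, function(i) {
    nearest(x[i, , drop = FALSE], x[-i, , drop = FALSE])
  })
  # u: distances from m uniform points in the bounding box to the nearest data point.
  lo <- apply(x, 2, min)
  hi <- apply(x, 2, max)
  unif <- matrix(runif(m * ncol(x), rep(lo, each = m), rep(hi, each = m)), nrow = m)
  u <- nearest(unif, x)
  sum(u) / (sum(u) + sum(w))
}

set.seed(1)
# Two tight, well-separated blobs versus points spread uniformly (simulated).
clustered <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
                   matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2))
uniform <- matrix(runif(200), ncol = 2)

h_clustered <- hopkins_sketch(clustered, m = 20)
h_uniform <- hopkins_sketch(uniform, m = 20)
```

In clustered data the real points sit close to each other while random points fall into empty space, so the `u` distances dominate and the ratio moves towards 1.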

clusterability <- get_clust_tendency(working_data, n = nrow(working_data)-1, graph = FALSE)
clusterability$hopkins_stat

## [1] 0.8596448

In our case the Hopkins statistic is about 0.86, which is above the 0.75 threshold, so we reject the null hypothesis and conclude that the data have a tendency to cluster. Another way to check the clustering tendency is to examine the distance matrix.

d<-dist(working_data)
fviz_dist(d, show_labels = FALSE)+ labs(title = "working_data")

The plot above is the ordered dissimilarity image of the distance matrix. The clearer the contrast between blocks of low and high dissimilarity, the stronger the cluster structure. Since the clustering tendency results are satisfactory, we turn to the cluster analysis.

Optimal Number of Clusters K Means and PAM (Silhouette Statistic)

In the next step, the optimal number of clusters is determined for each partitioning method (K-means and PAM). Because the data set is small to moderate in size, there is no need to employ CLARA, which is best suited to large data sets. K-means, PAM, and hierarchical clustering will be implemented for the comparative analysis. The silhouette statistic is used to decide the optimal number of clusters.

f1 <- fviz_nbclust(working_data, FUNcluster = kmeans, method = "silhouette") + 
  ggtitle("Optimal number of clusters \n K-means")

f2 <- fviz_nbclust(working_data, FUNcluster = cluster::pam, method = "silhouette") + 
  ggtitle("Optimal number of clusters \n PAM")

grid.arrange(f1, f2, ncol=2)

For both clustering algorithms (K-means and PAM), the optimal number of clusters is 2. The average silhouette width for 2 clusters is the same in both cases, and so is the average silhouette width for 3 clusters.
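For reference, the silhouette width of an observation is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. A minimal base-R sketch on simulated data follows; the data and all object names are illustrative assumptions, not part of the analysis above.

```r
set.seed(42)
# Simulated data: two Gaussian blobs in two dimensions.
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2))
cl <- kmeans(x, centers = 2, nstart = 10)$cluster
d <- as.matrix(dist(x))

sil_width <- sapply(seq_len(nrow(x)), function(i) {
  own <- cl == cl[i]
  own[i] <- FALSE                      # exclude the point itself
  a <- mean(d[i, own])                 # cohesion: mean distance within own cluster
  other <- cl != cl[i]
  # separation: smallest mean distance to any other cluster
  b <- min(tapply(d[i, other], cl[other], mean))
  (b - a) / max(a, b)
})

avg_sil <- mean(sil_width)
```

Values close to 1 indicate observations that sit well inside their cluster. The `silhouette()` function in the cluster package, called as `silhouette(cl, dist(x))`, should return the same widths in its `sil_width` column.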

Total Within-Cluster Sum of Squares

The stability of these results can be checked with an alternative criterion, the total within-cluster sum of squares (WSS).

f3 <- fviz_nbclust(working_data, FUNcluster = kmeans, method = "wss") + 
  ggtitle("Optimal number of clusters \n K-means")

f4 <- fviz_nbclust(working_data, FUNcluster = cluster::pam, method = "wss") + 
  ggtitle("Optimal number of clusters \n PAM")

grid.arrange(f3, f4, ncol=2)

Overall, a partition into 2 clusters seems reasonable for both K-means and PAM. However, given the subject of the analysis and the results above, the case of 3 clusters will also be considered.
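The WSS curve itself can be reproduced by hand, which may help when reading the elbow plots. The sketch below assumes simulated data with three groups and uses only `kmeans()` from base R; the object names and the range of k are illustrative.

```r
set.seed(123)
# Simulated data: three Gaussian blobs in two dimensions (illustrative only).
x <- scale(rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 6), ncol = 2),
  cbind(rnorm(30, mean = 0), rnorm(30, mean = 6))
))

# Total within-cluster sum of squares for k = 1, ..., 6.
k_values <- 1:6
wss <- sapply(k_values, function(k) {
  kmeans(x, centers = k, nstart = 25)$tot.withinss
})

# Reduction in WSS gained by each additional cluster;
# the "elbow" is where these gains flatten out.
wss_drop <- -diff(wss)
```

For standardised data the value at k = 1 equals (n - 1) times the number of variables, which gives a convenient sanity check on the curve.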

K-means Clustering

First, the partition into 2 and 3 clusters is carried out with the K-means algorithm, as follows:

km2 <- eclust(working_data, k = 2, FUNcluster = "kmeans", hc_metric = "euclidean", graph = FALSE)

c2 <- fviz_cluster(km2, data = working_data, ellipse.type = "convex", geom = c("point")) + ggtitle("K-means with 2 clusters")

s2 <- fviz_silhouette(km2)

##   cluster size ave.sil.width
## 1       1  117          0.45
## 2       2   72          0.53

grid.arrange(c2, s2, ncol=2)
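Because the clustering is run on standardised data, the cluster centres that k-means reports are expressed in standard-deviation units rather than in years or dollars. A hedged sketch of converting centres back to the original units, using simulated data with two hypothetical indicators (all names and numbers here are assumptions for illustration):

```r
set.seed(7)
# Simulated raw data: two groups that differ in both indicators.
raw <- cbind(
  life_exp   = c(rnorm(20, mean = 80, sd = 2), rnorm(20, mean = 60, sd = 3)),
  gni_capita = c(rnorm(20, mean = 45000, sd = 8000), rnorm(20, mean = 3000, sd = 800))
)

# scale() stores the column means and standard deviations as attributes.
scaled <- scale(raw)
fit <- kmeans(scaled, centers = 2, nstart = 25)

# Undo the standardisation: multiply by the column sd, then add the column mean.
centers_raw <- sweep(fit$centers, 2, attr(scaled, "scaled:scale"), "*")
centers_raw <- sweep(centers_raw, 2, attr(scaled, "scaled:center"), "+")

# Cross-check against the per-cluster means of the raw data.
cluster_means <- apply(raw, 2, function(col) tapply(col, fit$cluster, mean))
```

The back-transformed centres match the per-cluster means of the raw data, so either route can be used to describe the clusters in interpretable units.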

km2

## K-means clustering with 2 clusters of sizes 117, 72
## 
## Cluster means:
##         [,1]       [,2]       [,3]       [,4]       [,5]
## 1  0.6631316  0.5921902  0.6077436  0.6421912  0.4400564
## 2 -1.0775889 -0.9623091 -0.9875833 -1.0435607 -0.7150916
## 
## Clustering vector:
##                             Norway                        Switzerland 
##                                  1                                  1 
##                            Ireland                            Germany 
##                                  1                                  1 
##             Hong Kong, China (SAR)                          Australia 
##                                  1                                  1 
##                            Iceland                             Sweden 
##                                  1                                  1 
##                          Singapore                        Netherlands 
##                                  1                                  1 
##                            Denmark                            Finland 
##                                  1                                  1 
##                             Canada                        New Zealand 
##                                  1                                  1 
##                     United Kingdom                      United States 
##                                  1                                  1 
##                            Belgium                      Liechtenstein 
##                                  1                                  1 
##                              Japan                            Austria 
##                                  1                                  1 
##                         Luxembourg                             Israel 
##                                  1                                  1 
##                Korea (Republic of)                           Slovenia 
##                                  1                                  1 
##                              Spain                            Czechia 
##                                  1                                  1 
##                             France                              Malta 
##                                  1                                  1 
##                              Italy                            Estonia 
##                                  1                                  1 
##                             Cyprus                             Greece 
##                                  1                                  1 
##                             Poland                          Lithuania 
##                                  1                                  1 
##               United Arab Emirates                            Andorra 
##                                  1                                  1 
##                       Saudi Arabia                           Slovakia 
##                                  1                                  1 
##                             Latvia                           Portugal 
##                                  1                                  1 
##                              Qatar                              Chile 
##                                  1                                  1 
##                  Brunei Darussalam                            Hungary 
##                                  1                                  1 
##                            Bahrain                            Croatia 
##                                  1                                  1 
##                               Oman                          Argentina 
##                                  1                                  1 
##                 Russian Federation                            Belarus 
##                                  1                                  1 
##                         Kazakhstan                           Bulgaria 
##                                  1                                  1 
##                         Montenegro                            Romania 
##                                  1                                  1 
##                              Palau                           Barbados 
##                                  1                                  1 
##                             Kuwait                            Uruguay 
##                                  1                                  1 
##                             Turkey                            Bahamas 
##                                  1                                  1 
##                           Malaysia                         Seychelles 
##                                  1                                  1 
##                             Serbia                Trinidad and Tobago 
##                                  1                                  1 
##         Iran (Islamic Republic of)                          Mauritius 
##                                  1                                  1 
##                             Panama                         Costa Rica 
##                                  1                                  1 
##                            Albania                            Georgia 
##                                  1                                  1 
##                          Sri Lanka                               Cuba 
##                                  1                                  1 
##              Saint Kitts and Nevis                Antigua and Barbuda 
##                                  1                                  1 
##             Bosnia and Herzegovina                             Mexico 
##                                  1                                  1 
##                           Thailand                            Grenada 
##                                  1                                  1 
##                             Brazil                           Colombia 
##                                  1                                  1 
##                            Armenia                            Algeria 
##                                  1                                  1 
##                    North Macedonia                               Peru 
##                                  1                                  1 
##                              China                            Ecuador 
##                                  1                                  1 
##                         Azerbaijan                            Ukraine 
##                                  1                                  1 
##                 Dominican Republic                        Saint Lucia 
##                                  1                                  1 
##                            Tunisia                           Mongolia 
##                                  1                                  1 
##                            Lebanon                           Botswana 
##                                  1                                  1 
##   Saint Vincent and the Grenadines                            Jamaica 
##                                  1                                  1 
## Venezuela (Bolivarian Republic of)                           Dominica 
##                                  1                                  1 
##                               Fiji                           Paraguay 
##                                  1                                  1 
##                           Suriname                             Jordan 
##                                  1                                  1 
##                             Belize                           Maldives 
##                                  1                                  1 
##                              Tonga                        Philippines 
##                                  1                                  1 
##              Moldova (Republic of)                       Turkmenistan 
##                                  1                                  2 
##                         Uzbekistan                              Libya 
##                                  1                                  1 
##                          Indonesia                              Samoa 
##                                  1                                  1 
##                       South Africa   Bolivia (Plurinational State of) 
##                                  1                                  1 
##                              Gabon                              Egypt 
##                                  2                                  2 
##                   Marshall Islands                           Viet Nam 
##                                  1                                  1 
##                Palestine, State of                               Iraq 
##                                  1                                  2 
##                            Morocco                         Kyrgyzstan 
##                                  2                                  1 
##                             Guyana                        El Salvador 
##                                  2                                  2 
##                         Tajikistan                         Cabo Verde 
##                                  2                                  2 
##                          Guatemala                          Nicaragua 
##                                  2                                  2 
##                              India                            Namibia 
##                                  2                                  2 
##                        Timor-Leste                           Honduras 
##                                  2                                  2 
##                           Kiribati                             Bhutan 
##                                  2                                  2 
##                         Bangladesh   Micronesia (Federated States of) 
##                                  2                                  2 
##              Sao Tome and Principe                              Congo 
##                                  2                                  2 
##              Eswatini (Kingdom of)   Lao People's Democratic Republic 
##                                  2                                  2 
##                            Vanuatu                              Ghana 
##                                  2                                  2 
##                             Zambia                  Equatorial Guinea 
##                                  2                                  2 
##                            Myanmar                           Cambodia 
##                                  2                                  2 
##                              Kenya                              Nepal 
##                                  2                                  2 
##                             Angola                           Cameroon 
##                                  2                                  2 
##                           Zimbabwe                           Pakistan 
##                                  2                                  2 
##                    Solomon Islands               Syrian Arab Republic 
##                                  2                                  2 
##                   Papua New Guinea                            Comoros 
##                                  2                                  2 
##                             Rwanda                            Nigeria 
##                                  2                                  2 
##      Tanzania (United Republic of)                             Uganda 
##                                  2                                  2 
##                         Mauritania                         Madagascar 
##                                  2                                  2 
##                              Benin                            Lesotho 
##                                  2                                  2 
##                      Côte d'Ivoire                            Senegal 
##                                  2                                  2 
##                               Togo                              Sudan 
##                                  2                                  2 
##                              Haiti                        Afghanistan 
##                                  2                                  2 
##                           Djibouti                             Malawi 
##                                  2                                  2 
##                           Ethiopia                             Gambia 
##                                  2                                  2 
##                             Guinea                            Liberia 
##                                  2                                  2 
##                              Yemen                      Guinea-Bissau 
##                                  2                                  2 
## Congo (Democratic Republic of the)                         Mozambique 
##                                  2                                  2 
##                       Sierra Leone                       Burkina Faso 
##                                  2                                  2 
##                            Eritrea                               Mali 
##                                  2                                  2 
##                            Burundi                        South Sudan 
##                                  2                                  2 
##                               Chad           Central African Republic 
##                                  2                                  2 
##                              Niger 
##                                  2 
## 
## Within cluster sum of squares by cluster:
## [1] 280.6001 117.0650
##  (between_SS / total_SS =  57.7 %)
## 
## Available components:
## 
##  [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
##  [6] "betweenss"    "size"         "iter"         "ifault"       "silinfo"     
## [11] "nbclust"      "data"

Countries with high GNI per capita, high life expectancy at birth and higher mean years of schooling are grouped in one cluster, separate from countries that score low on the human development indicators. Broadly, the EU-27 countries, the Arab states and a few South-East Asian countries fall in cluster 1, while the least developed economies fall in cluster 2. This can be seen directly in the clustering vector.
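The reading of the clustering vector described above can also be done programmatically. The following is a minimal sketch, not the analysis used in this paper: it assumes the built-in `USArrests` data set as a stand-in for `working_data`, and the names `demo_data`, `km_demo`, `cluster_profile` and `members` are hypothetical. It computes the mean of each standardized variable per cluster and lists how many observations each cluster contains.

```r
# Stand-in data (assumption): USArrests replaces working_data in this sketch.
demo_data <- scale(USArrests)

# k-means starts from random centers, so fix the seed for repeatable output.
set.seed(123)
km_demo <- kmeans(demo_data, centers = 2, nstart = 25)

# Mean of each standardized variable within each cluster.
# This reproduces the "Cluster means" table printed by the fitted object.
cluster_profile <- aggregate(as.data.frame(demo_data),
                             by = list(cluster = km_demo$cluster),
                             FUN = mean)
print(cluster_profile)

# Names of the observations in each cluster, and the cluster sizes.
members <- split(rownames(demo_data), km_demo$cluster)
print(lengths(members))
```

With the country data, `members` would hold the country names per cluster, which is easier to scan than the full clustering vector.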

K-Means with 3 clusters

km3 <- eclust(working_data, k=3, FUNcluster="kmeans", hc_metric="euclidean", graph=FALSE)

c3 <- fviz_cluster(km3, data=working_data, ellipse.type="convex", geom=c("point")) + ggtitle("K-means with 3 clusters")

s3 <- fviz_silhouette(km3)

##   cluster size ave.sil.width
## 1       1   81          0.46
## 2       2   61          0.46
## 3       3   47          0.37

grid.arrange(c3, s3, ncol=2)

km3

## K-means clustering with 3 clusters of sizes 81, 61, 47
## 
## Cluster means:
##         [,1]       [,2]       [,3]       [,4]       [,5]
## 1  0.2263475  0.1568418  0.2281714  0.2943044 -0.2533303
## 2 -1.2256656 -1.0536857 -1.1445775 -1.1737835 -0.7605953
## 3  1.2006692  1.0972477  1.0922839  1.0162157  1.4237462
## 
## Clustering vector:
##                             Norway                        Switzerland 
##                                  3                                  3 
##                            Ireland                            Germany 
##                                  3                                  3 
##             Hong Kong, China (SAR)                          Australia 
##                                  3                                  3 
##                            Iceland                             Sweden 
##                                  3                                  3 
##                          Singapore                        Netherlands 
##                                  3                                  3 
##                            Denmark                            Finland 
##                                  3                                  3 
##                             Canada                        New Zealand 
##                                  3                                  3 
##                     United Kingdom                      United States 
##                                  3                                  3 
##                            Belgium                      Liechtenstein 
##                                  3                                  3 
##                              Japan                            Austria 
##                                  3                                  3 
##                         Luxembourg                             Israel 
##                                  3                                  3 
##                Korea (Republic of)                           Slovenia 
##                                  3                                  3 
##                              Spain                            Czechia 
##                                  3                                  3 
##                             France                              Malta 
##                                  3                                  3 
##                              Italy                            Estonia 
##                                  3                                  3 
##                             Cyprus                             Greece 
##                                  3                                  3 
##                             Poland                          Lithuania 
##                                  3                                  3 
##               United Arab Emirates                            Andorra 
##                                  3                                  3 
##                       Saudi Arabia                           Slovakia 
##                                  3                                  3 
##                             Latvia                           Portugal 
##                                  3                                  3 
##                              Qatar                              Chile 
##                                  3                                  3 
##                  Brunei Darussalam                            Hungary 
##                                  3                                  3 
##                            Bahrain                            Croatia 
##                                  3                                  1 
##                               Oman                          Argentina 
##                                  3                                  1 
##                 Russian Federation                            Belarus 
##                                  1                                  1 
##                         Kazakhstan                           Bulgaria 
##                                  1                                  1 
##                         Montenegro                            Romania 
##                                  1                                  1 
##                              Palau                           Barbados 
##                                  1                                  1 
##                             Kuwait                            Uruguay 
##                                  3                                  1 
##                             Turkey                            Bahamas 
##                                  1                                  1 
##                           Malaysia                         Seychelles 
##                                  1                                  1 
##                             Serbia                Trinidad and Tobago 
##                                  1                                  1 
##         Iran (Islamic Republic of)                          Mauritius 
##                                  1                                  1 
##                             Panama                         Costa Rica 
##                                  1                                  1 
##                            Albania                            Georgia 
##                                  1                                  1 
##                          Sri Lanka                               Cuba 
##                                  1                                  1 
##              Saint Kitts and Nevis                Antigua and Barbuda 
##                                  1                                  1 
##             Bosnia and Herzegovina                             Mexico 
##                                  1                                  1 
##                           Thailand                            Grenada 
##                                  1                                  1 
##                             Brazil                           Colombia 
##                                  1                                  1 
##                            Armenia                            Algeria 
##                                  1                                  1 
##                    North Macedonia                               Peru 
##                                  1                                  1 
##                              China                            Ecuador 
##                                  1                                  1 
##                         Azerbaijan                            Ukraine 
##                                  1                                  1 
##                 Dominican Republic                        Saint Lucia 
##                                  1                                  1 
##                            Tunisia                           Mongolia 
##                                  1                                  1 
##                            Lebanon                           Botswana 
##                                  1                                  1 
##   Saint Vincent and the Grenadines                            Jamaica 
##                                  1                                  1 
## Venezuela (Bolivarian Republic of)                           Dominica 
##                                  1                                  1 
##                               Fiji                           Paraguay 
##                                  1                                  1 
##                           Suriname                             Jordan 
##                                  1                                  1 
##                             Belize                           Maldives 
##                                  1                                  1 
##                              Tonga                        Philippines 
##                                  1                                  1 
##              Moldova (Republic of)                       Turkmenistan 
##                                  1                                  1 
##                         Uzbekistan                              Libya 
##                                  1                                  1 
##                          Indonesia                              Samoa 
##                                  1                                  1 
##                       South Africa   Bolivia (Plurinational State of) 
##                                  1                                  1 
##                              Gabon                              Egypt 
##                                  1                                  1 
##                   Marshall Islands                           Viet Nam 
##                                  1                                  1 
##                Palestine, State of                               Iraq 
##                                  1                                  1 
##                            Morocco                         Kyrgyzstan 
##                                  1                                  1 
##                             Guyana                        El Salvador 
##                                  1                                  1 
##                         Tajikistan                         Cabo Verde 
##                                  1                                  1 
##                          Guatemala                          Nicaragua 
##                                  1                                  1 
##                              India                            Namibia 
##                                  2                                  2 
##                        Timor-Leste                           Honduras 
##                                  2                                  2 
##                           Kiribati                             Bhutan 
##                                  2                                  2 
##                         Bangladesh   Micronesia (Federated States of) 
##                                  2                                  2 
##              Sao Tome and Principe                              Congo 
##                                  2                                  2 
##              Eswatini (Kingdom of)   Lao People's Democratic Republic 
##                                  2                                  2 
##                            Vanuatu                              Ghana 
##                                  2                                  2 
##                             Zambia                  Equatorial Guinea 
##                                  2                                  2 
##                            Myanmar                           Cambodia 
##                                  2                                  2 
##                              Kenya                              Nepal 
##                                  2                                  2 
##                             Angola                           Cameroon 
##                                  2                                  2 
##                           Zimbabwe                           Pakistan 
##                                  2                                  2 
##                    Solomon Islands               Syrian Arab Republic 
##                                  2                                  2 
##                   Papua New Guinea                            Comoros 
##                                  2                                  2 
##                             Rwanda                            Nigeria 
##                                  2                                  2 
##      Tanzania (United Republic of)                             Uganda 
##                                  2                                  2 
##                         Mauritania                         Madagascar 
##                                  2                                  2 
##                              Benin                            Lesotho 
##                                  2                                  2 
##                      Côte d'Ivoire                            Senegal 
##                                  2                                  2 
##                               Togo                              Sudan 
##                                  2                                  2 
##                              Haiti                        Afghanistan 
##                                  2                                  2 
##                           Djibouti                             Malawi 
##                                  2                                  2 
##                           Ethiopia                             Gambia 
##                                  2                                  2 
##                             Guinea                            Liberia 
##                                  2                                  2 
##                              Yemen                      Guinea-Bissau 
##                                  2                                  2 
## Congo (Democratic Republic of the)                         Mozambique 
##                                  2                                  2 
##                       Sierra Leone                       Burkina Faso 
##                                  2                                  2 
##                            Eritrea                               Mali 
##                                  2                                  2 
##                            Burundi                        South Sudan 
##                                  2                                  2 
##                               Chad           Central African Republic 
##                                  2                                  2 
##                              Niger 
##                                  2 
## 
## Within cluster sum of squares by cluster:
## [1] 72.34430 81.80587 80.44265
##  (between_SS / total_SS =  75.0 %)
## 
## Available components:
## 
##  [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
##  [6] "betweenss"    "size"         "iter"         "ifault"       "silinfo"     
## [11] "nbclust"      "data"

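The average silhouette width differs only slightly between the two solutions: 0.48 with 2 clusters versus 0.44 with 3 clusters. K-means with 2 clusters shows no negative silhouette widths, whereas k-means with 3 clusters does, even though the average width of each of its three clusters remains positive (0.46, 0.46 and 0.37).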
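The silhouette comparison can be sketched numerically as follows. This is an illustration under stated assumptions rather than the code behind the figures above: `USArrests` stands in for `working_data`, and `sil_summary`, `result` and `overall_k3` are hypothetical names. The first part checks that the overall width for 3 clusters is the size-weighted mean of the cluster widths printed earlier; the second part computes the average width and the number of negative widths for k = 2 and k = 3.

```r
library(cluster)  # provides silhouette()

# Overall average width = cluster widths weighted by cluster sizes.
# Sizes and widths are taken from the 3-cluster silhouette table above.
overall_k3 <- weighted.mean(c(0.46, 0.46, 0.37), c(81, 61, 47))
print(round(overall_k3, 2))  # 0.44

# Stand-in data (assumption): USArrests replaces working_data in this sketch.
demo_data <- scale(USArrests)
d <- dist(demo_data)  # Euclidean distances between observations

# Average silhouette width and count of negative widths for a given k.
sil_summary <- function(k) {
  set.seed(123)  # fixed seed so the k-means result is repeatable
  km <- kmeans(demo_data, centers = k, nstart = 25)
  sil <- silhouette(km$cluster, d)
  c(k = k,
    avg_width = mean(sil[, "sil_width"]),
    n_negative = sum(sil[, "sil_width"] < 0))
}

result <- t(sapply(2:3, sil_summary))
print(result)
```

A negative silhouette width marks an observation that lies closer, on average, to a neighbouring cluster than to its own, so `n_negative` is a quick count of potentially misassigned observations.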

PAM Clustering

In this part the clustering is based on the PAM (Partitioning Around Medoids) algorithm. It is carried out separately for 2 and 3 clusters.
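Unlike k-means, whose centers are averages, PAM represents each cluster by a medoid, which is an actual observation from the data. The sketch below illustrates this and cross-tabulates PAM against k-means assignments. It is only an illustration: `USArrests` is assumed as a stand-in for `working_data`, and `pam_demo`, `km_demo`, `medoid_names` and `agreement` are hypothetical names.

```r
library(cluster)  # provides pam()

# Stand-in data (assumption): USArrests replaces working_data in this sketch.
demo_data <- scale(USArrests)

# PAM with 2 clusters on Euclidean distances.
pam_demo <- pam(demo_data, k = 2, metric = "euclidean")

# k-means with 2 clusters for comparison (seed fixed for repeatability).
set.seed(123)
km_demo <- kmeans(demo_data, centers = 2, nstart = 25)

# Medoids are rows of the data, so they carry observation names.
medoid_names <- rownames(pam_demo$medoids)
print(medoid_names)

# Cross-tabulation of the two partitions. Cluster labels are arbitrary,
# so agreement shows up as large counts concentrated in few cells.
agreement <- table(pam = pam_demo$clustering, kmeans = km_demo$cluster)
print(agreement)
```

Applied to the country data, the medoid names would be the representative countries of each cluster, as in the `Medoids:` block of the printed PAM object.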

pam2 <- eclust(working_data, k=2, FUNcluster="pam", hc_metric="euclidean", graph=FALSE)

cp2 <- fviz_cluster(pam2, data=working_data, ellipse.type="convex", geom=c("point")) + ggtitle("PAM with 2 clusters")

sp2 <- fviz_silhouette(pam2)

##   cluster size ave.sil.width
## 1       1  124          0.44
## 2       2   65          0.57

grid.arrange(cp2, sp2, ncol=2)

PAM with 3 clusters

pam3 <- eclust(working_data, k=3, FUNcluster="pam", hc_metric="euclidean", graph=FALSE)

cp3 <- fviz_cluster(pam3, data=working_data, ellipse.type="convex", geom=c("point")) + ggtitle("PAM with 3 clusters")

pam3

## Medoids:
##                   ID                                                        
## Austria           20  1.3286246  1.03496723  1.1970915  1.2800465  1.4099650
## North Macedonia   83  0.3048589  0.07936403  0.4274238  0.3461972 -0.2825410
## Papua New Guinea 155 -1.1297791 -1.09450728 -1.1029371 -1.2954771 -0.7487326
## Clustering vector:
##                             Norway                        Switzerland 
##                                  1                                  1 
##                            Ireland                            Germany 
##                                  1                                  1 
##             Hong Kong, China (SAR)                          Australia 
##                                  1                                  1 
##                            Iceland                             Sweden 
##                                  1                                  1 
##                          Singapore                        Netherlands 
##                                  1                                  1 
##                            Denmark                            Finland 
##                                  1                                  1 
##                             Canada                        New Zealand 
##                                  1                                  1 
##                     United Kingdom                      United States 
##                                  1                                  1 
##                            Belgium                      Liechtenstein 
##                                  1                                  1 
##                              Japan                            Austria 
##                                  1                                  1 
##                         Luxembourg                             Israel 
##                                  1                                  1 
##                Korea (Republic of)                           Slovenia 
##                                  1                                  1 
##                              Spain                            Czechia 
##                                  1                                  1 
##                             France                              Malta 
##                                  1                                  1 
##                              Italy                            Estonia 
##                                  1                                  1 
##                             Cyprus                             Greece 
##                                  1                                  1 
##                             Poland                          Lithuania 
##                                  1                                  1 
##               United Arab Emirates                            Andorra 
##                                  1                                  1 
##                       Saudi Arabia                           Slovakia 
##                                  1                                  1 
##                             Latvia                           Portugal 
##                                  1                                  1 
##                              Qatar                              Chile 
##                                  1                                  2 
##                  Brunei Darussalam                            Hungary 
##                                  1                                  2 
##                            Bahrain                            Croatia 
##                                  1                                  2 
##                               Oman                          Argentina 
##                                  1                                  2 
##                 Russian Federation                            Belarus 
##                                  2                                  2 
##                         Kazakhstan                           Bulgaria 
##                                  2                                  2 
##                         Montenegro                            Romania 
##                                  2                                  2 
##                              Palau                           Barbados 
##                                  2                                  2 
##                             Kuwait                            Uruguay 
##                                  1                                  2 
##                             Turkey                            Bahamas 
##                                  2                                  2 
##                           Malaysia                         Seychelles 
##                                  2                                  2 
##                             Serbia                Trinidad and Tobago 
##                                  2                                  2 
##         Iran (Islamic Republic of)                          Mauritius 
##                                  2                                  2 
##                             Panama                         Costa Rica 
##                                  2                                  2 
##                            Albania                            Georgia 
##                                  2                                  2 
##                          Sri Lanka                               Cuba 
##                                  2                                  2 
##              Saint Kitts and Nevis                Antigua and Barbuda 
##                                  2                                  2 
##             Bosnia and Herzegovina                             Mexico 
##                                  2                                  2 
##                           Thailand                            Grenada 
##                                  2                                  2 
##                             Brazil                           Colombia 
##                                  2                                  2 
##                            Armenia                            Algeria 
##                                  2                                  2 
##                    North Macedonia                               Peru 
##                                  2                                  2 
##                              China                            Ecuador 
##                                  2                                  2 
##                         Azerbaijan                            Ukraine 
##                                  2                                  2 
##                 Dominican Republic                        Saint Lucia 
##                                  2                                  2 
##                            Tunisia                           Mongolia 
##                                  2                                  2 
##                            Lebanon                           Botswana 
##                                  2                                  2 
##   Saint Vincent and the Grenadines                            Jamaica 
##                                  2                                  2 
## Venezuela (Bolivarian Republic of)                           Dominica 
##                                  2                                  2 
##                               Fiji                           Paraguay 
##                                  2                                  2 
##                           Suriname                             Jordan 
##                                  2                                  2 
##                             Belize                           Maldives 
##                                  2                                  2 
##                              Tonga                        Philippines 
##                                  2                                  2 
##              Moldova (Republic of)                       Turkmenistan 
##                                  2                                  2 
##                         Uzbekistan                              Libya 
##                                  2                                  2 
##                          Indonesia                              Samoa 
##                                  2                                  2 
##                       South Africa   Bolivia (Plurinational State of) 
##                                  2                                  2 
##                              Gabon                              Egypt 
##                                  2                                  2 
##                   Marshall Islands                           Viet Nam 
##                                  2                                  2 
##                Palestine, State of                               Iraq 
##                                  2                                  2 
##                            Morocco                         Kyrgyzstan 
##                                  2                                  2 
##                             Guyana                        El Salvador 
##                                  2                                  2 
##                         Tajikistan                         Cabo Verde 
##                                  2                                  2 
##                          Guatemala                          Nicaragua 
##                                  2                                  2 
##                              India                            Namibia 
##                                  3                                  3 
##                        Timor-Leste                           Honduras 
##                                  3                                  3 
##                           Kiribati                             Bhutan 
##                                  3                                  3 
##                         Bangladesh   Micronesia (Federated States of) 
##                                  3                                  3 
##              Sao Tome and Principe                              Congo 
##                                  3                                  3 
##              Eswatini (Kingdom of)   Lao People's Democratic Republic 
##                                  3                                  3 
##                            Vanuatu                              Ghana 
##                                  3                                  3 
##                             Zambia                  Equatorial Guinea 
##                                  3                                  3 
##                            Myanmar                           Cambodia 
##                                  3                                  3 
##                              Kenya                              Nepal 
##                                  3                                  3 
##                             Angola                           Cameroon 
##                                  3                                  3 
##                           Zimbabwe                           Pakistan 
##                                  3                                  3 
##                    Solomon Islands               Syrian Arab Republic 
##                                  3                                  3 
##                   Papua New Guinea                            Comoros 
##                                  3                                  3 
##                             Rwanda                            Nigeria 
##                                  3                                  3 
##      Tanzania (United Republic of)                             Uganda 
##                                  3                                  3 
##                         Mauritania                         Madagascar 
##                                  3                                  3 
##                              Benin                            Lesotho 
##                                  3                                  3 
##                      Côte d'Ivoire                            Senegal 
##                                  3                                  3 
##                               Togo                              Sudan 
##                                  3                                  3 
##                              Haiti                        Afghanistan 
##                                  3                                  3 
##                           Djibouti                             Malawi 
##                                  3                                  3 
##                           Ethiopia                             Gambia 
##                                  3                                  3 
##                             Guinea                            Liberia 
##                                  3                                  3 
##                              Yemen                      Guinea-Bissau 
##                                  3                                  3 
## Congo (Democratic Republic of the)                         Mozambique 
##                                  3                                  3 
##                       Sierra Leone                       Burkina Faso 
##                                  3                                  3 
##                            Eritrea                               Mali 
##                                  3                                  3 
##                            Burundi                        South Sudan 
##                                  3                                  3 
##                               Chad           Central African Republic 
##                                  3                                  3 
##                              Niger 
##                                  3 
## Objective function:
##    build     swap 
## 1.054234 1.037102 
## 
## Available components:
##  [1] "medoids"    "id.med"     "clustering" "objective"  "isolation" 
##  [6] "clusinfo"   "silinfo"    "diss"       "call"       "data"      
## [11] "nbclust"

sp3 <- fviz_silhouette(pam3)

##   cluster size ave.sil.width
## 1       1   45          0.38
## 2       2   83          0.45
## 3       3   61          0.47

grid.arrange(cp3, sp3, ncol=2)

For both clustering techniques, K-means and PAM, the silhouette statistics are approximately the same in the 2-cluster and the 3-cluster scenarios. However, with PAM and 2 clusters a couple of observations have negative silhouette values, and with PAM and 3 clusters we observe a single small negative silhouette value. To summarize, both methods produce nearly identical silhouette statistics for 2 and for 3 clusters.
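To make the silhouette comparison concrete, the following is a minimal sketch of how per-cluster average widths and the number of negative widths can be extracted from a PAM fit. The data set `toy` is hypothetical and only stands in for the standardised indicators; it is not the data analysed in this paper.

```r
library(cluster)  # provides pam() and silhouette()

set.seed(1)
# Hypothetical toy data: two separated groups of 20 points in two dimensions
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 4), ncol = 2))

fit <- pam(toy, k = 2)
sil <- silhouette(fit)  # one row per observation: cluster, neighbor, sil_width

# Average silhouette width per cluster
avg_by_cluster <- tapply(sil[, "sil_width"], sil[, "cluster"], mean)

# Observations with a negative width lie closer to a neighbouring cluster
n_negative <- sum(sil[, "sil_width"] < 0)

avg_by_cluster
n_negative
```

A negative width for a single observation does not make the cluster average negative; it only flags that observation as a candidate for the neighbouring cluster.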

Hierarchical clustering

Hierarchical clustering is the last method used. It requires a dissimilarity matrix to be computed and a linkage method to be specified. Other linkage options exist, but I have decided to limit myself to complete linkage.
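The two ingredients named above, a dissimilarity matrix and a linkage method, can be sketched with base R alone. The matrix `toy` is a hypothetical stand-in for the standardised indicators, and average linkage is shown only as one example of the alternatives that are not pursued here.

```r
set.seed(1)
# Hypothetical toy data standing in for the standardised indicators
toy <- scale(matrix(rnorm(60), ncol = 3))

# Step 1: pairwise Euclidean dissimilarities between observations
d <- dist(toy, method = "euclidean")

# Step 2: agglomerate with a chosen linkage
hc_complete <- hclust(d, method = "complete")  # distance between clusters = largest pairwise distance
hc_average  <- hclust(d, method = "average")   # alternative linkage, for comparison only

# Step 3: cut the tree into a fixed number of clusters
groups <- cutree(hc_complete, k = 3)
table(groups)
```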

Complete linkage with 2 clusters

hc2 <- eclust(working_data, k=2, FUNcluster="hclust", hc_metric="euclidean", hc_method = "complete")

plot(hc2, cex=0.6, hang=-1, main = "Dendrogram of HAC")
rect.hclust(hc2, k=2, border='blue')

Complete Linkage with 3 Clusters

hc3 <- eclust(working_data, k=3, FUNcluster="hclust", hc_metric="euclidean", hc_method="complete")

plot(hc3, cex=0.5, hang=-1)
rect.hclust(hc3, k=3, border='blue')

The results are quite similar to those of K-means and PAM. The division of countries does not change much, whichever clustering technique and number of clusters (2 or 3) we use.
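One simple way to check how closely two partitions agree is to cross-tabulate their cluster labels. The sketch below does this for K-means and complete-linkage clustering on hypothetical toy data; `share_agree` is an illustrative, rough agreement score and not a measure used elsewhere in this paper.

```r
set.seed(1)
# Hypothetical toy data: two separated groups of 20 points
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 5), ncol = 2))

km <- kmeans(toy, centers = 2, nstart = 25)$cluster
hc <- cutree(hclust(dist(toy), method = "complete"), k = 2)

# Rows: K-means clusters, columns: hierarchical clusters
agreement <- table(kmeans = km, hierarchical = hc)

# Cluster labels are arbitrary, so each K-means cluster is matched to the
# hierarchical cluster it overlaps most (a rough score, not a formal index)
share_agree <- sum(apply(agreement, 1, max)) / length(km)

agreement
share_agree
```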

Stability comparison

To examine the consistency of the above results, one can use the clValid package. It provides several stability measures, which assess the stability of a technique by comparing the clustering of the full data with the clusterings obtained after removing each column of the data, one at a time. They include:

The average proportion of non-overlap (APN)
The average distance (AD)
The average distance between means (ADM)
The figure of merit (FOM)

The smaller the values of the above measures, the more consistent the clustering results are.
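As an illustration of the leave-one-column-out idea behind these measures, here is a hand-written sketch of the APN for PAM. The function `apn_sketch` and the data `toy` are hypothetical; the sketch follows the usual definition of the measure and is not claimed to reproduce the package's internal implementation.

```r
library(cluster)  # provides pam()

# APN: for every observation, the proportion of its full-data cluster mates
# that are no longer in its cluster once one column is removed, averaged
# over all observations and all removed columns.
apn_sketch <- function(x, k) {
  x <- as.matrix(x)
  n <- nrow(x)
  m <- ncol(x)
  full <- pam(x, k)$clustering
  total <- 0
  for (l in seq_len(m)) {
    reduced <- pam(x[, -l, drop = FALSE], k)$clustering
    for (i in seq_len(n)) {
      same_full <- full == full[i]        # cluster of i on the full data
      same_red  <- reduced == reduced[i]  # cluster of i without column l
      total <- total + (1 - sum(same_full & same_red) / sum(same_full))
    }
  }
  total / (n * m)
}

set.seed(1)
# Hypothetical toy data: 30 observations, 4 standardised variables
toy <- scale(matrix(rnorm(120), ncol = 4))
apn_value <- apn_sketch(toy, k = 3)
apn_value
```

The value lies between 0 and 1, and a value near 0 means that dropping a column hardly changes the cluster memberships.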

clmethods <- c("hierarchical","kmeans","pam")
sty <- clValid(working_data, nClust=2:6, clMethods=clmethods, validation="stability", method="complete")

optimalScores(sty)

##          Score Method Clusters
## APN 0.05616406    pam        3
## AD  1.29399625    pam        6
## ADM 0.14232400    pam        3
## FOM 0.50386420 kmeans        6

plot(sty)

summary(sty)

## 
## Clustering Methods:
##  hierarchical kmeans pam 
## 
## Cluster sizes:
##  2 3 4 5 6 
## 
## Validation Measures:
##                        2      3      4      5      6
##                                                     
## hierarchical APN  0.3508 0.2403 0.2849 0.2898 0.3811
##              AD   2.4245 1.8205 1.6711 1.4618 1.4265
##              ADM  1.5818 0.9299 0.8201 0.6399 0.7040
##              FOM  0.6963 0.6251 0.5880 0.5574 0.5304
## kmeans       APN  0.0781 0.0881 0.1769 0.1723 0.3291
##              AD   1.9073 1.5011 1.4498 1.3151 1.3355
##              ADM  0.2744 0.2265 0.3726 0.3255 0.5384
##              FOM  0.6927 0.5658 0.5488 0.5252 0.5039
## pam          APN  0.0614 0.0562 0.1416 0.2631 0.2871
##              AD   1.8932 1.4744 1.3791 1.3447 1.2940
##              ADM  0.2356 0.1423 0.2615 0.4584 0.5277
##              FOM  0.6843 0.5541 0.5306 0.5131 0.5055
## 
## Optimal Scores:
## 
##     Score  Method Clusters
## APN 0.0562 pam    3       
## AD  1.2940 pam    6       
## ADM 0.1423 pam    3       
## FOM 0.5039 kmeans 6

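Based on the above measures, PAM is the most consistent algorithm in our case: it attains the best APN, AD and ADM scores, while K-means attains the best FOM. There is no consensus on the optimal number of clusters, since APN and ADM favour 3 clusters whereas AD and FOM favour 6. As we considered no more than 3 clusters in the earlier analysis, the choice remains ambiguous.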

Dimension Reduction Techniques

Dimension reduction techniques transform high-dimensional data into a low-dimensional representation that still retains meaningful information from the original data. They make it possible to replace the initial variables with a smaller set of principal variables, which helps to avoid overfitting of a model. To begin, let us look at some basic statistics.

summary(working_data)

##        V1                 V2                 V3                V4         
##  Min.   :-2.23377   Min.   :-2.78585   Min.   :-2.6377   Min.   :-2.2799  
##  1st Qu.:-0.78102   1st Qu.:-0.64077   1st Qu.:-0.6906   1st Qu.:-0.7349  
##  Median : 0.09508   Median :-0.04222   Median : 0.1827   Median : 0.1312  
##  Mean   : 0.00000   Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.77351   3rd Qu.: 0.67741   3rd Qu.: 0.7063   3rd Qu.: 0.8679  
##  Max.   : 1.59308   Max.   : 3.00420   Max.   : 1.6328   Max.   : 1.7906  
##        V5         
##  Min.   :-0.9023  
##  1st Qu.:-0.7347  
##  Median :-0.3466  
##  Mean   : 0.0000  
##  3rd Qu.: 0.4225  
##  Max.   : 4.6704

Before implementing PCA, it is useful to inspect the correlation coefficients among the variables.

cor<-cor(working_data, method="pearson") 
print(cor, digits= 1)

##      [,1] [,2] [,3] [,4] [,5]
## [1,]  1.0  0.9  0.9  0.9  0.8
## [2,]  0.9  1.0  0.8  0.8  0.6
## [3,]  0.9  0.8  1.0  0.8  0.7
## [4,]  0.9  0.8  0.8  1.0  0.6
## [5,]  0.8  0.6  0.7  0.6  1.0

The above results suggest that most of the variables are strongly correlated with each other. To get a better overview, let us plot the correlation matrix with corrplot.

library(corrplot)

## corrplot 0.84 loaded

corrplot(cor)

The correlation plot further supports the claim that there is a strong positive correlation among the variables.

Principal Component Analysis (PCA)

Choosing the optimal number of components

Kaiser’s stopping rule is a method that helps us decide the optimal number of components to retain: components with an eigenvalue greater than 1 should be kept. The rule can be complemented with a scree plot, in which the eigenvalues are plotted on the vertical axis and the components on the horizontal axis, in descending order from the largest to the smallest. The other approach is to consider the percentage of variance explained by each component. Normally, enough components are retained to explain approximately 70%-90% of the variation cumulatively.

pca <- prcomp(working_data, center=TRUE, scale.=TRUE)
eigen(cor(working_data))$values

## [1] 4.12116193 0.44470704 0.22883898 0.18702570 0.01826635

fviz_eig(pca, choice='eigenvalue')

In our case, Kaiser’s rule selects only one component, since only the first eigenvalue is greater than 1. The scree plot leads to the same conclusion.

fviz_eig(pca)

summary(pca)

## Importance of components:
##                           PC1     PC2     PC3     PC4     PC5
## Standard deviation     2.0301 0.66686 0.47837 0.43246 0.13515
## Proportion of Variance 0.8242 0.08894 0.04577 0.03741 0.00365
## Cumulative Proportion  0.8242 0.91317 0.95894 0.99635 1.00000

The results above show that only PC1 has an eigenvalue greater than 1, and it explains approximately 82% of the variation. The remaining components explain little variation, and their eigenvalues are all less than 1.
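Both selection criteria can be checked by hand from the eigenvalues printed earlier. The sketch below copies those five values into a vector; the names `ev`, `n_kaiser` and `n_70` are introduced here purely for illustration, and the 70% threshold is taken as the lower end of the range mentioned above.

```r
# Eigenvalues of the correlation matrix, copied from the output above
ev <- c(4.12116193, 0.44470704, 0.22883898, 0.18702570, 0.01826635)

# Kaiser's rule: count components with an eigenvalue greater than 1
n_kaiser <- sum(ev > 1)

# Variance criterion: proportion and cumulative proportion of variance
prop_var <- ev / sum(ev)
cum_var  <- cumsum(prop_var)
n_70     <- which(cum_var >= 0.70)[1]  # components needed to reach 70%

n_kaiser
n_70
round(prop_var, 4)   # comparable to "Proportion of Variance" in summary(pca)
round(sqrt(ev), 4)   # comparable to "Standard deviation" in summary(pca)
```

Because the variables are standardised, the eigenvalues sum to the number of variables (five), so each proportion is simply the eigenvalue divided by 5.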

Analysis of components

pcavar <- get_pca_var(pca)
fviz_pca_var(pca, col.var="cos2", alpha.var="contrib", gradient.cols = c("blue", "green", "red"), repel = TRUE)

As we can see, all variables except v5 (gni_capita), which is located in the third quadrant, lie in the second quadrant. One might conclude that v1, v2, v3 and v4 are positively related to v5, or in other words to GNI per capita: the better a country performs on the human development indicators, the higher the GNI per capita it tends to enjoy.

fviz_contrib(pca, choice = "var", axes = 1)

fviz_contrib(pca, choice = "var", axes = 2)

fviz_contrib(pca, choice = "var", axes = 3)

fviz_contrib(pca, choice = "var", axes = 4)

fviz_contrib(pca, choice = "var", axes = 5)

The plots above indicate that v1 (hd_index) is the major contributor in all components. We can also plot the individual observations. The graph below shows the observations and their quality of representation on the first two principal components.

fviz_pca_ind(pca, col.ind = "cos2",
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"))
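The quantities behind the contribution plots and the cos2 colouring can also be computed directly from a prcomp object. The sketch below uses the built-in USArrests data purely as a stand-in, and `pca_demo`, `contrib`, `cos2_ind` and `quality_12` are illustrative names. When all components are kept, these values should coincide with what factoextra plots, but that correspondence is stated here as an assumption rather than verified against the package.

```r
# USArrests is a built-in data set, used here only as a stand-in
pca_demo <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Contribution (%) of each variable to each component: squared loadings,
# which sum to 1 within a component
contrib <- 100 * pca_demo$rotation^2

# cos2 of each observation on each component: squared coordinate divided by
# the squared distance of the observation from the origin
cos2_ind <- pca_demo$x^2 / rowSums(pca_demo$x^2)

# Quality of representation on the plane of the first two components
quality_12 <- rowSums(cos2_ind[, 1:2])

round(contrib, 1)
head(round(quality_12, 2))
```

An observation with a quality close to 1 is well represented on the PC1-PC2 plane, while a value close to 0 means that its position on the plot should be read with caution.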

Conclusion

In this paper, I examined the grouping of countries according to the similarities and dissimilarities in their human development indicators. Three clustering algorithms were implemented, each with two scenarios (2 and 3 clusters). All methods produced largely the same division of countries. The stability measures suggest that PAM is the most suitable technique for this type of data, while the optimal number of clusters remains ambiguous. Additionally, PCA was implemented to reduce the dimensions. One might conclude that a country that ranks high on the human development indicators can expect to enjoy a higher GNI per capita.

References

https://www.datanovia.com/en/lessons/choosing-the-best-clustering-algorithms/

https://en.wikipedia.org/wiki/Hierarchical_clustering

http://www.sthda.com/english/wiki/wiki.php?id_contents=7932#compute-clvalid

http://data.un.org/Explorer.aspx

https://en.wikipedia.org/wiki/Principal_component_analysis

https://rpubs.com/sgroszkiewicz/clustering

Comparison of Clustering Techniques and Dimension Reductions on Human Development Indicators

Muhammad Usman

2/23/2021