In an age of constant stress, anxiety, and depression, the World
Happiness survey is an important point of reference. It offers a way to
identify the factors that make one country happier than another. A key
part of the study is to work out which variables lead to greater life
satisfaction and a higher rank in the happiness index.
This dataset is the 2022 survey for the World Happiness Report. It
contains 147 rows (one per country, plus a final placeholder row that is
removed during cleaning) and 12 variables describing the happiness index
of each country. The happiness scores are explained by:
- GDP per capita
- Healthy life expectancy
- Social support
- Freedom to make life choices
- Generosity
- Corruption perception
In this dataset, the variables are as follows:
- RANK: The country's position relative to the other countries.
- Country: Name of the country.
- Happiness score: A measure of happiness based on the answers to the Cantril ladder question in the Gallup World Poll.
- Whisker-high: The upper whisker of the happiness score based on the confidence interval (95% by default).
- Whisker-low: The lower whisker of the happiness score based on the confidence interval (95% by default).
- Dystopia (1.83) + residual: An imaginary country that has the world’s least happy people. It is used as a benchmark against which all other countries can be compared.
- GDP per capita: The country’s economic production divided by its total population.
- Social support: The extent to which social support contributed to the calculation of the happiness score.
- Life expectancy: The average number of years a newborn infant can expect to live.
- Freedom of Life Choices: The extent to which freedom contributed to the calculation of the happiness score.
- Generosity: The extent to which generosity contributed to the calculation of the happiness score.
- Corruption: The extent to which perceptions of corruption contributed to the calculation of the happiness score.
The dataset was taken from Kaggle. The report credits John Helliwell, Richard Layard, Jeffrey D. Sachs, and Jan Emmanuel De Neve as Co-Editors; Lara Aknin, Haifang Huang, and Shun Wang as Associate Editors; and Sharon Paculor as Production Editor.
Libraries Required
library(factoextra)
library(clValid)
library(flexclust)
library(clustertend)
library(cluster)
library(ClusterR)
library(readxl)
library(fpc)
library(gridExtra)
library(corrplot)
Loading the Data
data <- read.csv("2022.csv", header = TRUE, sep = ",", dec = ",")
Checking for missing data and NAs
sapply(data, function(x) sum(is.na(x)))
## RANK Country
## 0 0
## Happiness.score Whisker.high
## 1 1
## Whisker.low Dystopia..1.83....residual
## 1 1
## GDP.per.capita Social.support
## 1 1
## Life.expectancy Freedom.of.Life.Choices
## 1 1
## Generosity Corruption
## 1 1
Every column except RANK and Country has exactly one NA. They all come from the last row, which is dummy data, so that row is removed entirely.
data <- data[-147,]
sapply(data, function(x) sum(is.na(x)))
## RANK Country
## 0 0
## Happiness.score Whisker.high
## 0 0
## Whisker.low Dystopia..1.83....residual
## 0 0
## GDP.per.capita Social.support
## 0 0
## Life.expectancy Freedom.of.Life.Choices
## 0 0
## Generosity Corruption
## 0 0
Now the data is clean and ready for further analysis.
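Removing the row by position works for this file. As a hedged alternative, the sketch below drops incomplete rows without hard-coding a row number. The data frame `toy` and its values are made up for illustration and are not part of the happiness data.

```r
# Toy data frame standing in for the raw file (values are invented).
toy <- data.frame(
  Country = c("A", "B", "xx"),
  Happiness.score = c(7.8, 6.1, NA),
  Generosity = c(0.11, 0.23, NA),
  stringsAsFactors = FALSE
)

# Flag rows that have no missing value in any column.
keep <- complete.cases(toy)

# Record which rows are about to be dropped, so the removal stays visible.
dropped <- toy$Country[!keep]
print(dropped)

# Keep only the complete rows.
toy_clean <- toy[keep, ]
print(nrow(toy_clean))
```

The advantage of this form is that it keeps working if the placeholder row moves or if more incomplete rows appear in a later release of the file.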
Checking for the Data Structure
str(data)
## 'data.frame': 146 obs. of 12 variables:
## $ RANK : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Country : chr "Finland" "Denmark" "Iceland" "Switzerland" ...
## $ Happiness.score : num 7.82 7.64 7.56 7.51 7.42 ...
## $ Whisker.high : num 7.89 7.71 7.65 7.59 7.47 ...
## $ Whisker.low : num 7.76 7.56 7.46 7.44 7.36 ...
## $ Dystopia..1.83....residual: num 2.52 2.23 2.32 2.15 2.14 ...
## $ GDP.per.capita : num 1.89 1.95 1.94 2.03 1.95 ...
## $ Social.support : num 1.26 1.24 1.32 1.23 1.21 ...
## $ Life.expectancy : num 0.775 0.777 0.803 0.822 0.787 0.79 0.803 0.786 0.818 0.752 ...
## $ Freedom.of.Life.Choices : num 0.736 0.719 0.718 0.677 0.651 0.7 0.724 0.728 0.568 0.68 ...
## $ Generosity : num 0.109 0.188 0.27 0.147 0.271 0.12 0.218 0.217 0.155 0.245 ...
## $ Corruption : num 0.534 0.532 0.191 0.461 0.419 0.388 0.512 0.474 0.143 0.483 ...
All the columns are numeric except for the Country column, which is
too important to discard. The country names are therefore stored as row
names, and the Country column itself is removed.
rownames(data) <- data$Country
data <- data[,-2]
str(data)
## 'data.frame': 146 obs. of 11 variables:
## $ RANK : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Happiness.score : num 7.82 7.64 7.56 7.51 7.42 ...
## $ Whisker.high : num 7.89 7.71 7.65 7.59 7.47 ...
## $ Whisker.low : num 7.76 7.56 7.46 7.44 7.36 ...
## $ Dystopia..1.83....residual: num 2.52 2.23 2.32 2.15 2.14 ...
## $ GDP.per.capita : num 1.89 1.95 1.94 2.03 1.95 ...
## $ Social.support : num 1.26 1.24 1.32 1.23 1.21 ...
## $ Life.expectancy : num 0.775 0.777 0.803 0.822 0.787 0.79 0.803 0.786 0.818 0.752 ...
## $ Freedom.of.Life.Choices : num 0.736 0.719 0.718 0.677 0.651 0.7 0.724 0.728 0.568 0.68 ...
## $ Generosity : num 0.109 0.188 0.27 0.147 0.271 0.12 0.218 0.217 0.155 0.245 ...
## $ Corruption : num 0.534 0.532 0.191 0.461 0.419 0.388 0.512 0.474 0.143 0.483 ...
The clustering tendency of a dataset shows how well it can be
clustered. Here it is measured with the Hopkins statistic. A value close
to 1 indicates a strong tendency to form clusters, while a value around
0.5 suggests no more structure than uniformly random data.
hopkins_stat <- get_clust_tendency(data, n = nrow(data)-1, graph = FALSE)$hopkins_stat
hopkins_stat
## [1] 0.7450885
With a Hopkins statistic of 0.7451, the dataset shows a clear
tendency to cluster.
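To make the statistic less of a black box, here is a simplified base-R sketch of the idea behind it. It is an illustration only and is not the factoextra implementation, which may differ in detail. The function name `hopkins_sketch` and the toy matrix `blobs` are assumptions introduced for this example.

```r
# Simplified Hopkins statistic (illustrative; not the factoextra code).
hopkins_sketch <- function(x, m = 10, seed = 1) {
  x <- as.matrix(x)
  set.seed(seed)
  idx <- sample(nrow(x), m)

  # w: distance from each sampled real point to its nearest other real point.
  d_real <- as.matrix(dist(x))
  diag(d_real) <- Inf
  w <- apply(d_real[idx, , drop = FALSE], 1, min)

  # u: distance from each uniform random point, drawn inside the bounding
  # box of the data, to its nearest real point.
  lo <- apply(x, 2, min)
  hi <- apply(x, 2, max)
  u <- numeric(m)
  for (i in seq_len(m)) {
    p <- runif(ncol(x), lo, hi)
    u[i] <- min(sqrt(colSums((t(x) - p)^2)))
  }

  # Close to 1 when random points sit far from the data (clustered data),
  # around 0.5 when the data look uniformly random.
  sum(u) / (sum(u) + sum(w))
}

# Toy data: two well-separated groups of 50 points each.
set.seed(42)
blobs <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2)
)
h <- hopkins_sketch(blobs, m = 20)
print(h)
```

On clearly grouped data such as `blobs`, random points land mostly in the empty space between groups, so the `u` distances dominate and the ratio moves toward 1.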
Using Within-cluster sum of squares (WSS)
clust1 <- fviz_nbclust(data, FUNcluster = kmeans, method = "wss") +
ggtitle("Cluster numbers using the \n wss method \n K-means")
clust2 <- fviz_nbclust(data, FUNcluster = cluster::pam, method = "wss") +
ggtitle("Cluster numbers using the \n wss method \n PAM")
grid.arrange(clust1, clust2, ncol=2)
Looking at the plot, the clearest elbow is at two clusters, although
up to four clusters appear reasonable. Solutions with 2, 3, and 4
clusters will therefore be tried later in this analysis.
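The elbow reading can also be done numerically. The following sketch, which runs on made-up toy data rather than the happiness data, computes the total within-cluster sum of squares for several values of k and the relative drop between consecutive values. The names `toy`, `wss`, and `rel_drop` are introduced here for illustration.

```r
# Toy data: two well-separated groups of 30 points each (invented values).
set.seed(123)
toy <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2)
)

# Total within-cluster sum of squares for k = 1..6.
wss <- sapply(1:6, function(k) {
  kmeans(toy, centers = k, nstart = 10)$tot.withinss
})
print(round(wss, 2))

# Relative drop in WSS when moving from k to k + 1 clusters.
rel_drop <- -diff(wss) / wss[-length(wss)]
print(round(rel_drop, 2))
```

A large relative drop followed by much smaller ones marks the elbow. On real data the pattern is usually less clear-cut, which is why several candidate values of k are kept here.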
The next step is to apply a second method for finding the optimal
number of clusters.
Using Silhouette Statistics
clust3 <- fviz_nbclust(data, FUNcluster = kmeans, method = "silhouette") +
ggtitle("Cluster numbers using the \n silhouette method \n K-means")
clust4 <- fviz_nbclust(data, FUNcluster = cluster::pam, method = "silhouette") +
ggtitle("Cluster numbers using the \n silhouette method \n PAM")
grid.arrange(clust3, clust4, ncol=2)
The silhouette method also points to two clusters as the best choice,
but for a clearer picture, solutions with up to four clusters will be
considered.
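As a rough sketch of what the silhouette method computes, the code below fits K-means for several values of k on toy data and compares the average silhouette widths, using `silhouette()` from the cluster package. The toy matrix and the names `avg_sil` and `best_k` are assumptions made for this example.

```r
library(cluster)

# Toy data: two well-separated groups of 30 points each (invented values).
set.seed(123)
toy <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2)
)

# Average silhouette width for k = 2..5.
avg_sil <- sapply(2:5, function(k) {
  km <- kmeans(toy, centers = k, nstart = 10)
  sil <- silhouette(km$cluster, dist(toy))
  mean(sil[, "sil_width"])
})
names(avg_sil) <- 2:5
print(round(avg_sil, 3))

# The k with the highest average silhouette width is the suggested choice.
best_k <- as.integer(names(avg_sil)[which.max(avg_sil)])
print(best_k)
```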
In this analysis, K-means, PAM, and CLARA are considered as the
clustering techniques. K-means is fast and efficient for small to
medium-sized datasets with a relatively low number of clusters, but its
centroids are sensitive to outliers. PAM uses medoids instead of means,
which makes it more robust than K-means to noise and outliers, at the
cost of increased computational complexity. CLARA is less
computationally expensive than PAM and therefore better suited to larger
datasets, but it may not always produce the most accurate result because
it works on random subsamples of the data.
KMeans Clustering
K-means is a centroid-based algorithm that partitions the data into k clusters, where each data point belongs to the cluster whose mean is closest to it. The algorithm starts by selecting k initial centroids and then alternates between assigning each data point to its closest centroid and recomputing the centroid of each cluster. The process continues until the assignments stop changing.
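The two alternating steps can be sketched in a few lines of base R. This is a minimal illustration of the basic (Lloyd-style) iteration under simple assumptions, not the algorithm that `kmeans()` or `eclust()` runs by default. The function `lloyd_kmeans` and the toy data are hypothetical names introduced here.

```r
# Minimal K-means iteration (illustrative sketch).
lloyd_kmeans <- function(x, k, max_iter = 100, seed = 1) {
  x <- as.matrix(x)
  set.seed(seed)
  # Initial centroids: k randomly chosen data points.
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))

  for (iter in seq_len(max_iter)) {
    # Assignment step: squared distance from every point to every centroid.
    d2 <- sapply(seq_len(k), function(j) {
      rowSums(sweep(x, 2, centers[j, ])^2)
    })
    new_cluster <- max.col(-d2, ties.method = "first")

    # Stop when no point changes cluster.
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster

    # Update step: move each centroid to the mean of its points.
    for (j in seq_len(k)) {
      members <- x[cluster == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  list(cluster = cluster, centers = centers, iter = iter)
}

# Toy data: two well-separated groups of 30 points each (invented values).
set.seed(123)
toy <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.5), ncol = 2)
)
fit <- lloyd_kmeans(toy, k = 2)
print(table(fit$cluster))
print(fit$centers)
```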
cluster_kmeans<-eclust(data, "kmeans", k= 2)
fviz_silhouette(cluster_kmeans)
## cluster size ave.sil.width
## 1 1 73 0.62
## 2 2 73 0.62
summary(cluster_kmeans)
## Length Class Mode
## cluster 146 -none- numeric
## centers 22 -none- numeric
## totss 1 -none- numeric
## withinss 2 -none- numeric
## tot.withinss 1 -none- numeric
## betweenss 1 -none- numeric
## size 2 -none- numeric
## iter 1 -none- numeric
## ifault 1 -none- numeric
## clust_plot 9 gg list
## silinfo 3 -none- list
## nbclust 1 -none- numeric
## data 11 data.frame list
PAM
PAM (Partitioning Around Medoids) is similar to K-means, but instead of
using the mean of each cluster, it selects the most central object in
each cluster as a representative, called a medoid. The algorithm first
selects k initial medoids (the build phase), assigns each object to its
closest medoid, and then iteratively swaps a medoid with a non-medoid
object whenever the swap reduces the total dissimilarity (the swap
phase). The summary output reports the objective function after each of
these two phases.
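The difference between a medoid and a mean is easiest to see on a tiny example. The sketch below uses four invented points, one of them an outlier, and a hypothetical helper `find_medoid` that returns the index of the point with the smallest total distance to all others.

```r
# Index of the point with the smallest total distance to the other points.
find_medoid <- function(x) {
  d <- as.matrix(dist(x))
  unname(which.min(rowSums(d)))
}

# Three nearby points and one outlier (invented values).
pts <- rbind(c(0, 0), c(1, 0), c(0, 2), c(10, 10))

m <- find_medoid(pts)
print(pts[m, ])       # the medoid is always an actual observation
print(colMeans(pts))  # the mean is pulled toward the outlier
```

Because the medoid must be one of the observations, a single extreme point cannot drag the cluster representative away from the bulk of the data, which is the robustness property attributed to PAM above.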
pam <- eclust(data, k=2 , FUNcluster="pam", hc_metric="euclidean")
fviz_silhouette(pam)
## cluster size ave.sil.width
## 1 1 73 0.62
## 2 2 73 0.62
summary(pam)
## Medoids:
## ID RANK Happiness.score Whisker.high Whisker.low
## Panama 37 37 6.309 6.464 6.154
## Ghana 111 111 4.872 4.999 4.745
## Dystopia..1.83....residual GDP.per.capita Social.support Life.expectancy
## Panama 2.086 1.715 1.107 0.709
## Ghana 1.972 1.112 0.595 0.409
## Freedom.of.Life.Choices Generosity Corruption
## Panama 0.592 0.049 0.051
## Ghana 0.500 0.230 0.056
## Clustering vector:
## Finland Denmark Iceland
## 1 1 1
## Switzerland Netherlands Luxembourg*
## 1 1 1
## Sweden Norway Israel
## 1 1 1
## New Zealand Austria Australia
## 1 1 1
## Ireland Germany Canada
## 1 1 1
## United States United Kingdom Czechia
## 1 1 1
## Belgium France Bahrain
## 1 1 1
## Slovenia Costa Rica United Arab Emirates
## 1 1 1
## Saudi Arabia Taiwan Province of China Singapore
## 1 1 1
## Romania Spain Uruguay
## 1 1 1
## Italy Kosovo Malta
## 1 1 1
## Lithuania Slovakia Estonia
## 1 1 1
## Panama Brazil Guatemala*
## 1 1 1
## Kazakhstan Cyprus Latvia
## 1 1 1
## Serbia Chile Nicaragua
## 1 1 1
## Mexico Croatia Poland
## 1 1 1
## El Salvador Kuwait* Hungary
## 1 1 1
## Mauritius Uzbekistan Japan
## 1 1 1
## Honduras Portugal Argentina
## 1 1 1
## Greece South Korea Philippines
## 1 1 1
## Thailand Moldova Jamaica
## 1 1 1
## Kyrgyzstan Belarus* Colombia
## 1 1 1
## Bosnia and Herzegovina Mongolia Dominican Republic
## 1 1 1
## Malaysia Bolivia China
## 1 1 1
## Paraguay Peru Montenegro
## 1 2 2
## Ecuador Vietnam Turkmenistan*
## 2 2 2
## North Cyprus* Russia Hong Kong S.A.R. of China
## 2 2 2
## Armenia Tajikistan Nepal
## 2 2 2
## Bulgaria Libya* Indonesia
## 2 2 2
## Ivory Coast North Macedonia Albania
## 2 2 2
## South Africa Azerbaijan* Gambia*
## 2 2 2
## Bangladesh Laos Algeria
## 2 2 2
## Liberia* Ukraine Congo
## 2 2 2
## Morocco Mozambique Cameroon
## 2 2 2
## Senegal Niger* Georgia
## 2 2 2
## Gabon Iraq Venezuela
## 2 2 2
## Guinea Iran Ghana
## 2 2 2
## Turkey Burkina Faso Cambodia
## 2 2 2
## Benin Comoros* Uganda
## 2 2 2
## Nigeria Kenya Tunisia
## 2 2 2
## Pakistan Palestinian Territories* Mali
## 2 2 2
## Namibia Eswatini, Kingdom of* Myanmar
## 2 2 2
## Sri Lanka Madagascar* Egypt
## 2 2 2
## Chad* Ethiopia Yemen*
## 2 2 2
## Mauritania* Jordan Togo
## 2 2 2
## India Zambia Malawi
## 2 2 2
## Tanzania Sierra Leone Lesotho*
## 2 2 2
## Botswana* Rwanda* Zimbabwe
## 2 2 2
## Lebanon Afghanistan
## 2 2
## Objective function:
## build swap
## 24.20933 18.30461
##
## Numerical information per cluster:
## size max_diss av_diss diameter separation
## [1,] 73 36.10232 18.28096 72.11476 1.100119
## [2,] 73 37.02303 18.32827 72.22129 1.100119
##
## Isolated clusters:
## L-clusters: character(0)
## L*-clusters: character(0)
##
## Silhouette plot information:
## cluster neighbor sil_width
## Spain 1 2 0.76037407
## Romania 1 2 0.76023834
## Uruguay 1 2 0.76000207
## Taiwan Province of China 1 2 0.75950119
## Italy 1 2 0.75921357
## Singapore 1 2 0.75919244
## Saudi Arabia 1 2 0.75859281
## Kosovo 1 2 0.75775238
## United Arab Emirates 1 2 0.75731843
## Malta 1 2 0.75648212
## Costa Rica 1 2 0.75576608
## Lithuania 1 2 0.75470481
## Slovenia 1 2 0.75405210
## Slovakia 1 2 0.75235711
## Bahrain 1 2 0.75198977
## France 1 2 0.74964506
## Estonia 1 2 0.74927061
## Belgium 1 2 0.74704388
## Panama 1 2 0.74632675
## Czechia 1 2 0.74418767
## Brazil 1 2 0.74253607
## United Kingdom 1 2 0.74114648
## Guatemala* 1 2 0.73787563
## United States 1 2 0.73783269
## Canada 1 2 0.73431340
## Kazakhstan 1 2 0.73363056
## Germany 1 2 0.73058201
## Cyprus 1 2 0.72837845
## Ireland 1 2 0.72655483
## Latvia 1 2 0.72258353
## Australia 1 2 0.72245124
## Austria 1 2 0.71811471
## Serbia 1 2 0.71620764
## New Zealand 1 2 0.71353002
## Chile 1 2 0.70917052
## Israel 1 2 0.70858504
## Norway 1 2 0.70383875
## Nicaragua 1 2 0.70116719
## Sweden 1 2 0.69878901
## Luxembourg* 1 2 0.69352690
## Mexico 1 2 0.69307823
## Netherlands 1 2 0.68816014
## Croatia 1 2 0.68406039
## Switzerland 1 2 0.68253990
## Iceland 1 2 0.67678056
## Poland 1 2 0.67419427
## Denmark 1 2 0.67087838
## Finland 1 2 0.66468477
## El Salvador 1 2 0.66328100
## Kuwait* 1 2 0.65217162
## Hungary 1 2 0.64008078
## Mauritius 1 2 0.62699686
## Uzbekistan 1 2 0.61270660
## Japan 1 2 0.59746736
## Honduras 1 2 0.58099467
## Portugal 1 2 0.56431124
## Argentina 1 2 0.54600092
## Greece 1 2 0.52622411
## South Korea 1 2 0.50513537
## Philippines 1 2 0.48271792
## Thailand 1 2 0.45893949
## Moldova 1 2 0.43366645
## Jamaica 1 2 0.40656898
## Kyrgyzstan 1 2 0.37752915
## Belarus* 1 2 0.34690909
## Colombia 1 2 0.31415673
## Bosnia and Herzegovina 1 2 0.27920695
## Mongolia 1 2 0.24188452
## Dominican Republic 1 2 0.20212576
## Malaysia 1 2 0.15949278
## Bolivia 1 2 0.11386307
## China 1 2 0.06547249
## Paraguay 1 2 0.01376756
## Kenya 2 1 0.76017303
## Nigeria 2 1 0.76012570
## Tunisia 2 1 0.75976969
## Uganda 2 1 0.75964649
## Pakistan 2 1 0.75917146
## Comoros* 2 1 0.75871748
## Palestinian Territories* 2 1 0.75818454
## Benin 2 1 0.75727873
## Mali 2 1 0.75664659
## Cambodia 2 1 0.75604808
## Namibia 2 1 0.75546052
## Burkina Faso 2 1 0.75405878
## Eswatini, Kingdom of* 2 1 0.75363075
## Myanmar 2 1 0.75132666
## Turkey 2 1 0.75116390
## Ghana 2 1 0.74907134
## Sri Lanka 2 1 0.74867792
## Madagascar* 2 1 0.74632956
## Iran 2 1 0.74558715
## Egypt 2 1 0.74348898
## Guinea 2 1 0.74177281
## Chad* 2 1 0.74018691
## Ethiopia 2 1 0.73744776
## Venezuela 2 1 0.73651826
## Yemen* 2 1 0.73385192
## Iraq 2 1 0.73314198
## Mauritania* 2 1 0.73016708
## Gabon 2 1 0.72783249
## Jordan 2 1 0.72590423
## Georgia 2 1 0.72183951
## Togo 2 1 0.72175151
## India 2 1 0.71740882
## Niger* 2 1 0.71512932
## Zambia 2 1 0.71309550
## Senegal 2 1 0.70852946
## Malawi 2 1 0.70825668
## Tanzania 2 1 0.70331586
## Cameroon 2 1 0.70084681
## Sierra Leone 2 1 0.69823559
## Lesotho* 2 1 0.69292586
## Mozambique 2 1 0.69211133
## Botswana* 2 1 0.68709068
## Morocco 2 1 0.68328857
## Rwanda* 2 1 0.68159749
## Zimbabwe 2 1 0.67579374
## Congo 2 1 0.67332687
## Lebanon 2 1 0.66971383
## Afghanistan 2 1 0.66309340
## Ukraine 2 1 0.66257843
## Liberia* 2 1 0.65082960
## Algeria 2 1 0.63931889
## Laos 2 1 0.62637563
## Bangladesh 2 1 0.61230964
## Gambia* 2 1 0.59660732
## Azerbaijan* 2 1 0.58045114
## South Africa 2 1 0.56372493
## Albania 2 1 0.54539072
## North Macedonia 2 1 0.52564685
## Ivory Coast 2 1 0.50385845
## Indonesia 2 1 0.48215872
## Libya* 2 1 0.45837260
## Bulgaria 2 1 0.43254389
## Nepal 2 1 0.40552363
## Tajikistan 2 1 0.37684057
## Armenia 2 1 0.34613905
## Hong Kong S.A.R. of China 2 1 0.31223090
## Russia 2 1 0.27817067
## North Cyprus* 2 1 0.24076990
## Turkmenistan* 2 1 0.20105518
## Vietnam 2 1 0.15884173
## Ecuador 2 1 0.11339259
## Montenegro 2 1 0.06493588
## Peru 2 1 0.01316053
## Average silhouette width per cluster:
## [1] 0.6213818 0.6206844
## Average silhouette width of total data set:
## [1] 0.6210331
##
## Available components:
## [1] "medoids" "id.med" "clustering" "objective" "isolation"
## [6] "clusinfo" "silinfo" "diss" "call" "data"
## [11] "clust_plot" "nbclust"
CLARA
CLARA (Clustering Large Applications) is similar to PAM, but it
draws a random sample of the dataset and applies PAM to that sample only.
This process is repeated several times, and the solution whose medoids
give the lowest average dissimilarity over the whole dataset is kept.
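The sampling idea can be sketched as follows. This is a simplified illustration built on `pam()` from the cluster package, not the internals of `clara()`. The function `clara_sketch`, its parameters, and the toy data are assumptions made for this example.

```r
library(cluster)

# Simplified CLARA-style procedure (illustrative sketch).
clara_sketch <- function(x, k, samples = 5, sampsize = 20, seed = 1) {
  x <- as.matrix(x)
  set.seed(seed)
  best <- NULL
  for (s in seq_len(samples)) {
    # Run PAM on a random subsample only.
    idx <- sample(nrow(x), sampsize)
    medoids <- pam(x[idx, , drop = FALSE], k = k)$medoids

    # Distance from every point in the full data to each candidate medoid.
    d <- sapply(seq_len(k), function(j) {
      sqrt(rowSums(sweep(x, 2, medoids[j, ])^2))
    })

    # Score the candidate medoids on the whole dataset.
    cost <- mean(apply(d, 1, min))
    if (is.null(best) || cost < best$cost) {
      best <- list(
        medoids = medoids,
        cluster = max.col(-d, ties.method = "first"),
        cost = cost
      )
    }
  }
  best
}

# Toy data: two well-separated groups of 50 points each (invented values).
set.seed(7)
toy <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(100, mean = 4, sd = 0.5), ncol = 2)
)
fit <- clara_sketch(toy, k = 2)
print(table(fit$cluster))
print(fit$cost)
```

Since only the subsample goes through PAM, the expensive swap search runs on `sampsize` points instead of the full dataset, at the price of depending on which points happen to be sampled.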
clara<-eclust(data, "clara", k=2)
fviz_silhouette(clara)
## cluster size ave.sil.width
## 1 1 80 0.58
## 2 2 66 0.66
summary(clara)
## Object of class 'clara' from call:
## fun_clust(x = x, k = k)
## Medoids:
## RANK Happiness.score Whisker.high Whisker.low
## Nicaragua 45 6.165 6.312 6.017
## Uganda 117 4.603 4.747 4.459
## Dystopia..1.83....residual GDP.per.capita Social.support
## Nicaragua 2.418 1.105 1.029
## Uganda 1.842 0.777 0.875
## Life.expectancy Freedom.of.Life.Choices Generosity Corruption
## Nicaragua 0.617 0.617 0.168 0.212
## Uganda 0.418 0.402 0.222 0.066
## Objective function: 18.69552
## Numerical information per cluster:
## size max_diss av_diss isolation
## [1,] 80 44.10297 20.30215 0.6120757
## [2,] 66 36.07290 16.74809 0.5006316
## Average silhouette width per cluster:
## [1] 0.5751141 0.6641000
## Average silhouette width of best sample: 0.6153406
##
## Best sample:
## [1] Switzerland Australia Czechia
## [4] Slovenia Costa Rica United Arab Emirates
## [7] Taiwan Province of China Romania Italy
## [10] Kosovo Malta Lithuania
## [13] Panama Guatemala* Nicaragua
## [16] Kuwait* Japan Portugal
## [19] South Korea Philippines Moldova
## [22] Kyrgyzstan Mongolia Peru
## [25] Montenegro Ecuador Turkmenistan*
## [28] Russia Libya* North Macedonia
## [31] Gambia* Mozambique Venezuela
## [34] Iran Benin Uganda
## [37] Myanmar Madagascar* Chad*
## [40] Mauritania* Togo India
## [43] Lesotho* Botswana*
## Clustering vector:
## Finland Denmark Iceland
## 1 1 1
## Switzerland Netherlands Luxembourg*
## 1 1 1
## Sweden Norway Israel
## 1 1 1
## New Zealand Austria Australia
## 1 1 1
## Ireland Germany Canada
## 1 1 1
## United States United Kingdom Czechia
## 1 1 1
## Belgium France Bahrain
## 1 1 1
## Slovenia Costa Rica United Arab Emirates
## 1 1 1
## Saudi Arabia Taiwan Province of China Singapore
## 1 1 1
## Romania Spain Uruguay
## 1 1 1
## Italy Kosovo Malta
## 1 1 1
## Lithuania Slovakia Estonia
## 1 1 1
## Panama Brazil Guatemala*
## 1 1 1
## Kazakhstan Cyprus Latvia
## 1 1 1
## Serbia Chile Nicaragua
## 1 1 1
## Mexico Croatia Poland
## 1 1 1
## El Salvador Kuwait* Hungary
## 1 1 1
## Mauritius Uzbekistan Japan
## 1 1 1
## Honduras Portugal Argentina
## 1 1 1
## Greece South Korea Philippines
## 1 1 1
## Thailand Moldova Jamaica
## 1 1 1
## Kyrgyzstan Belarus* Colombia
## 1 1 1
## Bosnia and Herzegovina Mongolia Dominican Republic
## 1 1 1
## Malaysia Bolivia China
## 1 1 1
## Paraguay Peru Montenegro
## 1 1 1
## Ecuador Vietnam Turkmenistan*
## 1 1 1
## North Cyprus* Russia Hong Kong S.A.R. of China
## 1 1 2
## Armenia Tajikistan Nepal
## 2 2 2
## Bulgaria Libya* Indonesia
## 2 2 2
## Ivory Coast North Macedonia Albania
## 2 2 2
## South Africa Azerbaijan* Gambia*
## 2 2 2
## Bangladesh Laos Algeria
## 2 2 2
## Liberia* Ukraine Congo
## 2 2 2
## Morocco Mozambique Cameroon
## 2 2 2
## Senegal Niger* Georgia
## 2 2 2
## Gabon Iraq Venezuela
## 2 2 2
## Guinea Iran Ghana
## 2 2 2
## Turkey Burkina Faso Cambodia
## 2 2 2
## Benin Comoros* Uganda
## 2 2 2
## Nigeria Kenya Tunisia
## 2 2 2
## Pakistan Palestinian Territories* Mali
## 2 2 2
## Namibia Eswatini, Kingdom of* Myanmar
## 2 2 2
## Sri Lanka Madagascar* Egypt
## 2 2 2
## Chad* Ethiopia Yemen*
## 2 2 2
## Mauritania* Jordan Togo
## 2 2 2
## India Zambia Malawi
## 2 2 2
## Tanzania Sierra Leone Lesotho*
## 2 2 2
## Botswana* Rwanda* Zimbabwe
## 2 2 2
## Lebanon Afghanistan
## 2 2
##
## Silhouette plot information for best sample:
## cluster neighbor sil_width
## Uruguay 1 2 0.74057016
## Italy 1 2 0.74045905
## Spain 1 2 0.74032980
## Kosovo 1 2 0.73977001
## Romania 1 2 0.73965165
## Malta 1 2 0.73928766
## Lithuania 1 2 0.73836036
## Singapore 1 2 0.73819888
## Taiwan Province of China 1 2 0.73794347
## Slovakia 1 2 0.73696180
## Saudi Arabia 1 2 0.73663106
## United Arab Emirates 1 2 0.73500988
## Estonia 1 2 0.73494169
## Costa Rica 1 2 0.73315161
## Panama 1 2 0.73305889
## Slovenia 1 2 0.73116831
## Brazil 1 2 0.73045794
## Bahrain 1 2 0.72888329
## Guatemala* 1 2 0.72711612
## France 1 2 0.72636192
## Kazakhstan 1 2 0.72419886
## Belgium 1 2 0.72361692
## Czechia 1 2 0.72065931
## Cyprus 1 2 0.72040564
## United Kingdom 1 2 0.71755048
## Latvia 1 2 0.71617708
## United States 1 2 0.71420531
## Serbia 1 2 0.71146951
## Canada 1 2 0.71068917
## Germany 1 2 0.70699067
## Chile 1 2 0.70622265
## Ireland 1 2 0.70304079
## Nicaragua 1 2 0.70015781
## Australia 1 2 0.69902386
## Austria 1 2 0.69481058
## Mexico 1 2 0.69408437
## New Zealand 1 2 0.69038416
## Croatia 1 2 0.68724433
## Israel 1 2 0.68563608
## Norway 1 2 0.68108150
## Poland 1 2 0.67970757
## Sweden 1 2 0.67625841
## El Salvador 1 2 0.67128109
## Luxembourg* 1 2 0.67125223
## Netherlands 1 2 0.66615810
## Kuwait* 1 2 0.66279369
## Switzerland 1 2 0.66084135
## Iceland 1 2 0.65540662
## Hungary 1 2 0.65348434
## Denmark 1 2 0.64984842
## Finland 1 2 0.64403058
## Mauritius 1 2 0.64337515
## Uzbekistan 1 2 0.63228671
## Japan 1 2 0.62046317
## Honduras 1 2 0.60759229
## Portugal 1 2 0.59472333
## Argentina 1 2 0.58049229
## Greece 1 2 0.56507734
## South Korea 1 2 0.54867334
## Philippines 1 2 0.53119614
## Thailand 1 2 0.51274639
## Moldova 1 2 0.49308627
## Jamaica 1 2 0.47202602
## Kyrgyzstan 1 2 0.44946521
## Belarus* 1 2 0.42577281
## Colombia 1 2 0.40042848
## Bosnia and Herzegovina 1 2 0.37342626
## Mongolia 1 2 0.34459556
## Dominican Republic 1 2 0.31403984
## Malaysia 1 2 0.28127605
## Bolivia 1 2 0.24623268
## China 1 2 0.20936184
## Paraguay 1 2 0.16994882
## Peru 1 2 0.12782337
## Montenegro 1 2 0.08296025
## Ecuador 1 2 0.03487085
## Vietnam 1 2 -0.01637088
## Turkmenistan* 1 2 -0.06668479
## North Cyprus* 1 2 -0.11510669
## Russia 1 2 -0.16167399
## Pakistan 2 1 0.78051012
## Tunisia 2 1 0.78043486
## Kenya 2 1 0.78012613
## Palestinian Territories* 2 1 0.78010237
## Nigeria 2 1 0.77927361
## Mali 2 1 0.77906946
## Namibia 2 1 0.77838781
## Uganda 2 1 0.77790165
## Eswatini, Kingdom of* 2 1 0.77697405
## Comoros* 2 1 0.77599837
## Myanmar 2 1 0.77499256
## Benin 2 1 0.77347944
## Sri Lanka 2 1 0.77260192
## Cambodia 2 1 0.77112762
## Madagascar* 2 1 0.77055459
## Burkina Faso 2 1 0.76792696
## Egypt 2 1 0.76790728
## Chad* 2 1 0.76474710
## Turkey 2 1 0.76359345
## Ethiopia 2 1 0.76218768
## Ghana 2 1 0.76019164
## Yemen* 2 1 0.75865268
## Iran 2 1 0.75512659
## Mauritania* 2 1 0.75499942
## Jordan 2 1 0.75067669
## Guinea 2 1 0.74969785
## Togo 2 1 0.74648060
## Venezuela 2 1 0.74256357
## India 2 1 0.74205677
## Zambia 2 1 0.73764904
## Iraq 2 1 0.73743239
## Malawi 2 1 0.73263686
## Gabon 2 1 0.73009757
## Tanzania 2 1 0.72749419
## Sierra Leone 2 1 0.72219377
## Georgia 2 1 0.72191441
## Lesotho* 2 1 0.71661726
## Niger* 2 1 0.71292811
## Botswana* 2 1 0.71043624
## Rwanda* 2 1 0.70465072
## Senegal 2 1 0.70390461
## Zimbabwe 2 1 0.69850070
## Cameroon 2 1 0.69360038
## Lebanon 2 1 0.69202274
## Afghanistan 2 1 0.68495769
## Mozambique 2 1 0.68204455
## Morocco 2 1 0.67023617
## Congo 2 1 0.65709631
## Ukraine 2 1 0.64282291
## Liberia* 2 1 0.62754942
## Algeria 2 1 0.61216558
## Laos 2 1 0.59515187
## Bangladesh 2 1 0.57673292
## Gambia* 2 1 0.55636294
## Azerbaijan* 2 1 0.53508465
## South Africa 2 1 0.51313495
## Albania 2 1 0.48915006
## North Macedonia 2 1 0.46331900
## Ivory Coast 2 1 0.43518837
## Indonesia 2 1 0.40644357
## Libya* 2 1 0.37523131
## Bulgaria 2 1 0.34140984
## Nepal 2 1 0.30617107
## Tajikistan 2 1 0.26836648
## Armenia 2 1 0.22775673
## Hong Kong S.A.R. of China 2 1 0.18379911
##
## 946 dissimilarities, summarized :
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.071 19.053 42.039 47.701 71.069 138.190
## Metric : euclidean
## Number of objects : 44
##
## Available components:
## [1] "sample" "medoids" "i.med" "clustering" "objective"
## [6] "clusinfo" "diss" "call" "silinfo" "data"
## [11] "clust_plot" "nbclust"
Points to Focus on:
- The two-cluster solution does not provide much insight into the data or into how countries separate in terms of the happiness index.
- The average silhouette width (ASW) is quite high at about 0.62. It is a measure of clustering quality that reflects the degree of separation between clusters.
- CLARA produced some negative silhouette widths (Vietnam, Turkmenistan, North Cyprus, Russia). A negative width means the country is, on average, closer to the neighboring cluster than to its own, so these assignments weaken the clustering.
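To illustrate what a negative silhouette width signals, the sketch below deliberately mislabels one observation in a tiny invented dataset and then looks up which rows have a negative width. The vectors `x` and `labels` are made up for this example.

```r
library(cluster)

# Six one-dimensional observations forming two obvious groups.
x <- c(0, 0.1, 0.2, 5, 5.1, 5.2)

# Deliberately wrong labels: the 4th observation belongs with the second group.
labels <- c(1, 1, 1, 1, 2, 2)

sil <- silhouette(labels, dist(x))

# Rows with a negative width are closer to the neighboring cluster
# than to the cluster they were assigned to.
negative <- which(sil[, "sil_width"] < 0)
print(negative)
print(round(sil[negative, "sil_width"], 3))
```

The same lookup can be applied to the `sil_width` column reported in the summaries above to list the countries with negative widths.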
KMeans Clustering
cluster_kmeans<-eclust(data, "kmeans", k= 3)
fviz_silhouette(cluster_kmeans)
## cluster size ave.sil.width
## 1 1 48 0.62
## 2 2 49 0.50
## 3 3 49 0.62
summary(cluster_kmeans)
## Length Class Mode
## cluster 146 -none- numeric
## centers 33 -none- numeric
## totss 1 -none- numeric
## withinss 3 -none- numeric
## tot.withinss 1 -none- numeric
## betweenss 1 -none- numeric
## size 3 -none- numeric
## iter 1 -none- numeric
## ifault 1 -none- numeric
## clust_plot 9 gg list
## silinfo 3 -none- list
## nbclust 1 -none- numeric
## data 11 data.frame list
PAM
pam <- eclust(data, k=3 , FUNcluster="pam", hc_metric="euclidean")
fviz_silhouette(pam)
## cluster size ave.sil.width
## 1 1 49 0.61
## 2 2 48 0.51
## 3 3 49 0.61
summary(pam)
## Medoids:
## ID RANK Happiness.score Whisker.high Whisker.low
## Saudi Arabia 25 25 6.523 6.637 6.409
## Paraguay 73 73 5.578 5.689 5.467
## Palestinian Territories* 122 122 4.483 4.665 4.300
## Dystopia..1.83....residual GDP.per.capita
## Saudi Arabia 2.075 1.870
## Paraguay 1.555 1.409
## Palestinian Territories* 1.368 1.148
## Social.support Life.expectancy Freedom.of.Life.Choices
## Saudi Arabia 1.092 0.577 0.651
## Paraguay 1.130 0.624 0.629
## Palestinian Territories* 0.957 0.521 0.336
## Generosity Corruption
## Saudi Arabia 0.078 0.180
## Paraguay 0.171 0.059
## Palestinian Territories* 0.073 0.079
## Clustering vector:
## Finland Denmark Iceland
## 1 1 1
## Switzerland Netherlands Luxembourg*
## 1 1 1
## Sweden Norway Israel
## 1 1 1
## New Zealand Austria Australia
## 1 1 1
## Ireland Germany Canada
## 1 1 1
## United States United Kingdom Czechia
## 1 1 1
## Belgium France Bahrain
## 1 1 1
## Slovenia Costa Rica United Arab Emirates
## 1 1 1
## Saudi Arabia Taiwan Province of China Singapore
## 1 1 1
## Romania Spain Uruguay
## 1 1 1
## Italy Kosovo Malta
## 1 1 1
## Lithuania Slovakia Estonia
## 1 1 1
## Panama Brazil Guatemala*
## 1 1 1
## Kazakhstan Cyprus Latvia
## 1 1 1
## Serbia Chile Nicaragua
## 1 1 1
## Mexico Croatia Poland
## 1 1 1
## El Salvador Kuwait* Hungary
## 1 2 2
## Mauritius Uzbekistan Japan
## 2 2 2
## Honduras Portugal Argentina
## 2 2 2
## Greece South Korea Philippines
## 2 2 2
## Thailand Moldova Jamaica
## 2 2 2
## Kyrgyzstan Belarus* Colombia
## 2 2 2
## Bosnia and Herzegovina Mongolia Dominican Republic
## 2 2 2
## Malaysia Bolivia China
## 2 2 2
## Paraguay Peru Montenegro
## 2 2 2
## Ecuador Vietnam Turkmenistan*
## 2 2 2
## North Cyprus* Russia Hong Kong S.A.R. of China
## 2 2 2
## Armenia Tajikistan Nepal
## 2 2 2
## Bulgaria Libya* Indonesia
## 2 2 2
## Ivory Coast North Macedonia Albania
## 2 2 2
## South Africa Azerbaijan* Gambia*
## 2 2 2
## Bangladesh Laos Algeria
## 2 2 2
## Liberia* Ukraine Congo
## 2 3 3
## Morocco Mozambique Cameroon
## 3 3 3
## Senegal Niger* Georgia
## 3 3 3
## Gabon Iraq Venezuela
## 3 3 3
## Guinea Iran Ghana
## 3 3 3
## Turkey Burkina Faso Cambodia
## 3 3 3
## Benin Comoros* Uganda
## 3 3 3
## Nigeria Kenya Tunisia
## 3 3 3
## Pakistan Palestinian Territories* Mali
## 3 3 3
## Namibia Eswatini, Kingdom of* Myanmar
## 3 3 3
## Sri Lanka Madagascar* Egypt
## 3 3 3
## Chad* Ethiopia Yemen*
## 3 3 3
## Mauritania* Jordan Togo
## 3 3 3
## India Zambia Malawi
## 3 3 3
## Tanzania Sierra Leone Lesotho*
## 3 3 3
## Botswana* Rwanda* Zimbabwe
## 3 3 3
## Lebanon Afghanistan
## 3 3
## Objective function:
## build swap
## 12.21856 12.21840
##
## Numerical information per cluster:
## size max_diss av_diss diameter separation
## [1,] 49 24.11340 12.28772 48.09843 1.509574
## [2,] 48 24.06910 12.03177 47.06826 1.509574
## [3,] 49 24.29489 12.33189 48.24386 2.056001
##
## Isolated clusters:
## L-clusters: character(0)
## L*-clusters: character(0)
##
## Silhouette plot information:
## cluster neighbor sil_width
## Belgium 1 2 7.563113e-01
## France 1 2 7.560154e-01
## Czechia 1 2 7.558041e-01
## Bahrain 1 2 7.549558e-01
## United Kingdom 1 2 7.546692e-01
## Slovenia 1 2 7.529955e-01
## United States 1 2 7.527648e-01
## Canada 1 2 7.502451e-01
## Costa Rica 1 2 7.499473e-01
## Germany 1 2 7.471137e-01
## United Arab Emirates 1 2 7.462536e-01
## Ireland 1 2 7.432000e-01
## Saudi Arabia 1 2 7.415423e-01
## Australia 1 2 7.390493e-01
## Taiwan Province of China 1 2 7.356106e-01
## Austria 1 2 7.342502e-01
## New Zealand 1 2 7.288268e-01
## Singapore 1 2 7.263582e-01
## Israel 1 2 7.225910e-01
## Romania 1 2 7.199122e-01
## Norway 1 2 7.166690e-01
## Spain 1 2 7.107954e-01
## Sweden 1 2 7.100309e-01
## Luxembourg* 1 2 7.028844e-01
## Uruguay 1 2 6.998937e-01
## Netherlands 1 2 6.954532e-01
## Switzerland 1 2 6.874753e-01
## Italy 1 2 6.873886e-01
## Iceland 1 2 6.791400e-01
## Kosovo 1 2 6.725210e-01
## Denmark 1 2 6.704949e-01
## Finland 1 2 6.612464e-01
## Malta 1 2 6.573195e-01
## Lithuania 1 2 6.401163e-01
## Slovakia 1 2 6.204466e-01
## Estonia 1 2 5.978342e-01
## Panama 1 2 5.744849e-01
## Brazil 1 2 5.476900e-01
## Guatemala* 1 2 5.172614e-01
## Kazakhstan 1 2 4.858774e-01
## Cyprus 1 2 4.503289e-01
## Latvia 1 2 4.111142e-01
## Serbia 1 2 3.680570e-01
## Chile 1 2 3.205537e-01
## Nicaragua 1 2 2.677259e-01
## Mexico 1 2 2.105711e-01
## Croatia 1 2 1.470120e-01
## Poland 1 2 7.672187e-02
## El Salvador 1 2 -9.721052e-06
## Peru 2 3 7.443094e-01
## Paraguay 2 1 7.442610e-01
## Montenegro 2 3 7.380095e-01
## China 2 1 7.378856e-01
## Ecuador 2 3 7.303857e-01
## Bolivia 2 1 7.301184e-01
## Vietnam 2 3 7.214423e-01
## Malaysia 2 1 7.213150e-01
## Dominican Republic 2 1 7.113706e-01
## Turkmenistan* 2 3 7.107288e-01
## Mongolia 2 1 6.995406e-01
## North Cyprus* 2 3 6.989965e-01
## Bosnia and Herzegovina 2 1 6.865245e-01
## Russia 2 3 6.862371e-01
## Colombia 2 1 6.716097e-01
## Hong Kong S.A.R. of China 2 3 6.692710e-01
## Belarus* 2 1 6.546765e-01
## Armenia 2 3 6.546303e-01
## Kyrgyzstan 2 1 6.358836e-01
## Tajikistan 2 3 6.355714e-01
## Jamaica 2 1 6.153296e-01
## Nepal 2 3 6.144191e-01
## Moldova 2 1 5.923441e-01
## Bulgaria 2 3 5.918353e-01
## Libya* 2 3 5.668161e-01
## Thailand 2 1 5.666507e-01
## Philippines 2 1 5.382281e-01
## Indonesia 2 3 5.380464e-01
## South Korea 2 1 5.071955e-01
## Ivory Coast 2 3 5.051776e-01
## North Macedonia 2 3 4.731191e-01
## Greece 2 1 4.730686e-01
## Argentina 2 1 4.357158e-01
## Albania 2 3 4.355694e-01
## South Africa 2 3 3.941087e-01
## Portugal 2 1 3.938212e-01
## Azerbaijan* 2 3 3.476605e-01
## Honduras 2 1 3.469910e-01
## Japan 2 1 2.975648e-01
## Gambia* 2 3 2.964611e-01
## Bangladesh 2 3 2.428105e-01
## Uzbekistan 2 1 2.425722e-01
## Laos 2 3 1.820653e-01
## Mauritius 2 1 1.817816e-01
## Algeria 2 3 1.148615e-01
## Hungary 2 1 1.145044e-01
## Kuwait* 2 1 4.037035e-02
## Liberia* 2 3 3.976485e-02
## Madagascar* 3 2 7.546982e-01
## Egypt 3 2 7.542194e-01
## Sri Lanka 3 2 7.538719e-01
## Myanmar 3 2 7.535215e-01
## Chad* 3 2 7.525106e-01
## Eswatini, Kingdom of* 3 2 7.521173e-01
## Ethiopia 3 2 7.519113e-01
## Namibia 3 2 7.493267e-01
## Yemen* 3 2 7.492271e-01
## Mauritania* 3 2 7.462119e-01
## Mali 3 2 7.447804e-01
## Jordan 3 2 7.417778e-01
## Palestinian Territories* 3 2 7.406831e-01
## Togo 3 2 7.374764e-01
## Pakistan 3 2 7.349747e-01
## India 3 2 7.328530e-01
## Zambia 3 2 7.280229e-01
## Tunisia 3 2 7.278858e-01
## Malawi 3 2 7.219577e-01
## Kenya 3 2 7.199282e-01
## Tanzania 3 2 7.156530e-01
## Nigeria 3 2 7.103633e-01
## Sierra Leone 3 2 7.089701e-01
## Lesotho* 3 2 7.017342e-01
## Uganda 3 2 6.991833e-01
## Botswana* 3 2 6.933649e-01
## Comoros* 3 2 6.863678e-01
## Rwanda* 3 2 6.857347e-01
## Zimbabwe 3 2 6.774117e-01
## Benin 3 2 6.716036e-01
## Lebanon 3 2 6.684021e-01
## Afghanistan 3 2 6.584481e-01
## Cambodia 3 2 6.563540e-01
## Burkina Faso 3 2 6.386792e-01
## Turkey 3 2 6.174241e-01
## Ghana 3 2 5.973726e-01
## Iran 3 2 5.725327e-01
## Guinea 3 2 5.459548e-01
## Venezuela 3 2 5.142699e-01
## Iraq 3 2 4.846666e-01
## Gabon 3 2 4.489013e-01
## Georgia 3 2 4.091692e-01
## Niger* 3 2 3.660889e-01
## Senegal 3 2 3.193637e-01
## Cameroon 3 2 2.671798e-01
## Mozambique 3 2 2.093791e-01
## Morocco 3 2 1.460629e-01
## Congo 3 2 7.627173e-02
## Ukraine 3 2 -2.042020e-03
## Average silhouette width per cluster:
## [1] 0.6133778 0.5139921 0.6120984
## Average silhouette width of total data set:
## [1] 0.5802737
##
## Available components:
## [1] "medoids" "id.med" "clustering" "objective" "isolation"
## [6] "clusinfo" "silinfo" "diss" "call" "data"
## [11] "clust_plot" "nbclust"
CLARA
clara<-eclust(data, "clara", k=3)
fviz_silhouette(clara)
## cluster size ave.sil.width
## 1 1 47 0.63
## 2 2 50 0.49
## 3 3 49 0.62
summary(clara)
## Object of class 'clara' from call:
## fun_clust(x = x, k = k)
## Medoids:
## RANK Happiness.score Whisker.high Whisker.low
## France 20 6.687 6.758 6.615
## Peru 74 5.559 5.679 5.439
## Tunisia 120 4.516 4.629 4.403
## Dystopia..1.83....residual GDP.per.capita Social.support
## France 1.895 1.863 1.219
## Peru 1.890 1.397 0.865
## Tunisia 1.540 1.350 0.596
## Life.expectancy Freedom.of.Life.Choices Generosity Corruption
## France 0.808 0.567 0.070 0.266
## Peru 0.735 0.545 0.090 0.037
## Tunisia 0.656 0.316 0.029 0.029
## Objective function: 12.37337
## Numerical information per cluster:
## size max_diss av_diss isolation
## [1,] 47 27.01956 12.12679 0.5000006
## [2,] 50 26.02371 12.57209 0.5652624
## [3,] 49 26.27560 12.40712 0.5707338
## Average silhouette width per cluster:
## [1] 0.6326235 0.4914149 0.6219053
## Average silhouette width of best sample: 0.5806672
##
## Best sample:
## [1] Finland Netherlands Luxembourg*
## [4] Israel New Zealand Austria
## [7] France Slovenia United Arab Emirates
## [10] Saudi Arabia Taiwan Province of China Singapore
## [13] Italy Malta Kuwait*
## [16] Mauritius Uzbekistan South Korea
## [19] Jamaica Kyrgyzstan Bosnia and Herzegovina
## [22] Paraguay Peru Ecuador
## [25] Turkmenistan* North Cyprus* Russia
## [28] Bulgaria Indonesia Ivory Coast
## [31] Gambia* Bangladesh Congo
## [34] Cameroon Gabon Venezuela
## [37] Iran Ghana Tunisia
## [40] Mali Namibia Myanmar
## [43] Sri Lanka Ethiopia Mauritania*
## [46] India
## Clustering vector:
## Finland Denmark Iceland
## 1 1 1
## Switzerland Netherlands Luxembourg*
## 1 1 1
## Sweden Norway Israel
## 1 1 1
## New Zealand Austria Australia
## 1 1 1
## Ireland Germany Canada
## 1 1 1
## United States United Kingdom Czechia
## 1 1 1
## Belgium France Bahrain
## 1 1 1
## Slovenia Costa Rica United Arab Emirates
## 1 1 1
## Saudi Arabia Taiwan Province of China Singapore
## 1 1 1
## Romania Spain Uruguay
## 1 1 1
## Italy Kosovo Malta
## 1 1 1
## Lithuania Slovakia Estonia
## 1 1 1
## Panama Brazil Guatemala*
## 1 1 1
## Kazakhstan Cyprus Latvia
## 1 1 1
## Serbia Chile Nicaragua
## 1 1 1
## Mexico Croatia Poland
## 1 1 2
## El Salvador Kuwait* Hungary
## 2 2 2
## Mauritius Uzbekistan Japan
## 2 2 2
## Honduras Portugal Argentina
## 2 2 2
## Greece South Korea Philippines
## 2 2 2
## Thailand Moldova Jamaica
## 2 2 2
## Kyrgyzstan Belarus* Colombia
## 2 2 2
## Bosnia and Herzegovina Mongolia Dominican Republic
## 2 2 2
## Malaysia Bolivia China
## 2 2 2
## Paraguay Peru Montenegro
## 2 2 2
## Ecuador Vietnam Turkmenistan*
## 2 2 2
## North Cyprus* Russia Hong Kong S.A.R. of China
## 2 2 2
## Armenia Tajikistan Nepal
## 2 2 2
## Bulgaria Libya* Indonesia
## 2 2 2
## Ivory Coast North Macedonia Albania
## 2 2 2
## South Africa Azerbaijan* Gambia*
## 2 2 2
## Bangladesh Laos Algeria
## 2 2 2
## Liberia* Ukraine Congo
## 2 3 3
## Morocco Mozambique Cameroon
## 3 3 3
## Senegal Niger* Georgia
## 3 3 3
## Gabon Iraq Venezuela
## 3 3 3
## Guinea Iran Ghana
## 3 3 3
## Turkey Burkina Faso Cambodia
## 3 3 3
## Benin Comoros* Uganda
## 3 3 3
## Nigeria Kenya Tunisia
## 3 3 3
## Pakistan Palestinian Territories* Mali
## 3 3 3
## Namibia Eswatini, Kingdom of* Myanmar
## 3 3 3
## Sri Lanka Madagascar* Egypt
## 3 3 3
## Chad* Ethiopia Yemen*
## 3 3 3
## Mauritania* Jordan Togo
## 3 3 3
## India Zambia Malawi
## 3 3 3
## Tanzania Sierra Leone Lesotho*
## 3 3 3
## Botswana* Rwanda* Zimbabwe
## 3 3 3
## Lebanon Afghanistan
## 3 3
##
## Silhouette plot information for best sample:
## cluster neighbor sil_width
## Belgium 1 2 0.7649436414
## Czechia 1 2 0.7648527222
## France 1 2 0.7641658882
## United Kingdom 1 2 0.7640768039
## Bahrain 1 2 0.7625564366
## United States 1 2 0.7624685976
## Canada 1 2 0.7601953807
## Slovenia 1 2 0.7599733479
## Germany 1 2 0.7572577291
## Costa Rica 1 2 0.7562040608
## Ireland 1 2 0.7534839250
## United Arab Emirates 1 2 0.7517332943
## Australia 1 2 0.7494451159
## Saudi Arabia 1 2 0.7461414634
## Austria 1 2 0.7447079813
## New Zealand 1 2 0.7393028611
## Taiwan Province of China 1 2 0.7392258017
## Israel 1 2 0.7330316515
## Singapore 1 2 0.7287774661
## Norway 1 2 0.7270819398
## Romania 1 2 0.7212022733
## Sweden 1 2 0.7203652106
## Luxembourg* 1 2 0.7131053185
## Spain 1 2 0.7107843420
## Netherlands 1 2 0.7055355142
## Uruguay 1 2 0.6984111789
## Switzerland 1 2 0.6973859841
## Iceland 1 2 0.6888506951
## Italy 1 2 0.6842829871
## Denmark 1 2 0.6799846711
## Finland 1 2 0.6704794057
## Kosovo 1 2 0.6675785048
## Malta 1 2 0.6504554881
## Lithuania 1 2 0.6311095750
## Slovakia 1 2 0.6090549349
## Estonia 1 2 0.5837905204
## Panama 1 2 0.5575808160
## Brazil 1 2 0.5275997128
## Guatemala* 1 2 0.4936349870
## Kazakhstan 1 2 0.4584283951
## Cyprus 1 2 0.4186245027
## Latvia 1 2 0.3746913747
## Serbia 1 2 0.3263937249
## Chile 1 2 0.2730876564
## Nicaragua 1 2 0.2138664244
## Mexico 1 2 0.1494600336
## Croatia 1 2 0.0779340380
## Paraguay 2 1 0.7392901433
## China 2 1 0.7338304004
## Peru 2 3 0.7330667300
## Bolivia 2 1 0.7270984575
## Montenegro 2 3 0.7256945815
## Malaysia 2 1 0.7194135734
## Ecuador 2 3 0.7169978484
## Dominican Republic 2 1 0.7107056538
## Vietnam 2 3 0.7069669442
## Mongolia 2 1 0.7002596402
## Turkmenistan* 2 3 0.6951715086
## Bosnia and Herzegovina 2 1 0.6887455038
## North Cyprus* 2 3 0.6823371579
## Colombia 2 1 0.6755022722
## Russia 2 3 0.6684473878
## Belarus* 2 1 0.6604131846
## Hong Kong S.A.R. of China 2 3 0.6504225118
## Kyrgyzstan 2 1 0.6436477673
## Armenia 2 3 0.6345610864
## Jamaica 2 1 0.6253075392
## Tajikistan 2 3 0.6143562683
## Moldova 2 1 0.6047630401
## Nepal 2 3 0.5920489138
## Thailand 2 1 0.5817536538
## Bulgaria 2 3 0.5682666060
## Philippines 2 1 0.5563233200
## Libya* 2 3 0.5420138375
## South Korea 2 1 0.5285243861
## Indonesia 2 3 0.5120281785
## Greece 2 1 0.4980011985
## Ivory Coast 2 3 0.4779894388
## Argentina 2 1 0.4646020969
## North Macedonia 2 3 0.4445695338
## Portugal 2 1 0.4270973033
## Albania 2 3 0.4057183898
## Honduras 2 1 0.3851929200
## South Africa 2 3 0.3629406295
## Japan 2 1 0.3410376106
## Azerbaijan* 2 3 0.3151931250
## Uzbekistan 2 1 0.2920884978
## Gambia* 2 3 0.2626250868
## Mauritius 2 1 0.2379746148
## Bangladesh 2 3 0.2074699942
## Hungary 2 1 0.1779947458
## Laos 2 3 0.1452585927
## Kuwait* 2 1 0.1117964854
## Algeria 2 3 0.0765681734
## El Salvador 2 1 0.0384506884
## Liberia* 2 3 0.0000324327
## Poland 2 1 -0.0398156134
## Madagascar* 3 2 0.7591157573
## Egypt 3 2 0.7585675906
## Sri Lanka 3 2 0.7583856041
## Myanmar 3 2 0.7581265298
## Eswatini, Kingdom of* 3 2 0.7568371944
## Chad* 3 2 0.7568116040
## Ethiopia 3 2 0.7561507158
## Namibia 3 2 0.7541929037
## Yemen* 3 2 0.7534400658
## Mauritania* 3 2 0.7504052922
## Mali 3 2 0.7498308751
## Jordan 3 2 0.7459746466
## Palestinian Territories* 3 2 0.7459202910
## Togo 3 2 0.7416745353
## Pakistan 3 2 0.7404366635
## India 3 2 0.7370560772
## Tunisia 3 2 0.7336124566
## Zambia 3 2 0.7322361931
## Malawi 3 2 0.7261988931
## Kenya 3 2 0.7259488239
## Tanzania 3 2 0.7199250719
## Nigeria 3 2 0.7167262386
## Sierra Leone 3 2 0.7132772652
## Lesotho* 3 2 0.7060838983
## Uganda 3 2 0.7059400094
## Botswana* 3 2 0.6977712759
## Comoros* 3 2 0.6935717179
## Rwanda* 3 2 0.6901851775
## Zimbabwe 3 2 0.6819148084
## Benin 3 2 0.6793207733
## Lebanon 3 2 0.6729665105
## Cambodia 3 2 0.6646302393
## Afghanistan 3 2 0.6630803471
## Burkina Faso 3 2 0.6475920751
## Turkey 3 2 0.6271023623
## Ghana 3 2 0.6078266892
## Iran 3 2 0.5839281834
## Guinea 3 2 0.5583796356
## Venezuela 3 2 0.5279138414
## Iraq 3 2 0.4995976999
## Gabon 3 2 0.4653438179
## Georgia 3 2 0.4273379594
## Niger* 3 2 0.3861663700
## Senegal 3 2 0.3416483619
## Cameroon 3 2 0.2919813446
## Mozambique 3 2 0.2370353020
## Morocco 3 2 0.1770435445
## Congo 3 2 0.1110125707
## Ukraine 3 2 0.0371361695
##
## 1035 dissimilarities, summarized :
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.022 20.053 44.037 48.313 72.055 135.200
## Metric : euclidean
## Number of objects : 46
##
## Available components:
## [1] "sample" "medoids" "i.med" "clustering" "objective"
## [6] "clusinfo" "diss" "call" "silinfo" "data"
## [11] "clust_plot" "nbclust"
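Points to Focus:
- The ASW decreases to 0.58 with three clusters.
- CLARA shows a negative silhouette width (Poland, about -0.04), a sign that the observation may sit closer to a neighboring cluster than to its own.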
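The two checks above (the average silhouette width and any negative widths) can be read directly from a silhouette object. The sketch below is illustrative only: it assumes the `cluster` package is installed and uses the built-in `USArrests` table as a stand-in for the happiness data, so its numbers will differ from those in this report.

```r
library(cluster)  # assumed to be installed; it ships with most R distributions

# Stand-in data (assumption): the built-in USArrests table, standardized,
# takes the place of the happiness data used in this report.
x <- scale(USArrests)

# Partition into three clusters around medoids.
fit <- pam(x, k = 3)

# One row per observation: cluster, nearest neighbor cluster, sil_width.
sil <- silhouette(fit)

# Average silhouette width (ASW) over all observations.
asw <- mean(sil[, "sil_width"])

# Observations with a negative width are closer, on average, to a
# neighboring cluster than to their own.
negative <- rownames(sil)[sil[, "sil_width"] < 0]

print(round(asw, 2))
print(negative)
```

An empty result for `negative` means no observation is flagged as a likely misassignment.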
KMeans Clustering
cluster_kmeans<-eclust(data, "kmeans", k= 4)
fviz_silhouette(cluster_kmeans)
## cluster size ave.sil.width
## 1 1 36 0.62
## 2 2 37 0.50
## 3 3 37 0.61
## 4 4 36 0.51
summary(cluster_kmeans)
## Length Class Mode
## cluster 146 -none- numeric
## centers 44 -none- numeric
## totss 1 -none- numeric
## withinss 4 -none- numeric
## tot.withinss 1 -none- numeric
## betweenss 1 -none- numeric
## size 4 -none- numeric
## iter 1 -none- numeric
## ifault 1 -none- numeric
## clust_plot 9 gg list
## silinfo 3 -none- list
## nbclust 1 -none- numeric
## data 11 data.frame list
PAM
pam <- eclust(data, k=4 , FUNcluster="pam", hc_metric="euclidean")
fviz_silhouette(pam)
## cluster size ave.sil.width
## 1 1 35 0.62
## 2 2 36 0.50
## 3 3 37 0.50
## 4 4 38 0.61
summary(pam)
## Medoids:
## ID RANK Happiness.score Whisker.high Whisker.low
## Czechia 18 18 6.920 7.029 6.811
## Uzbekistan 53 53 6.063 6.178 5.948
## Albania 90 90 5.199 5.321 5.076
## Madagascar* 128 128 4.339 4.530 4.148
## Dystopia..1.83....residual GDP.per.capita Social.support
## Czechia 2.263 1.815 1.260
## Uzbekistan 1.913 1.219 1.092
## Albania 1.718 1.439 0.646
## Madagascar* 2.148 0.670 0.645
## Life.expectancy Freedom.of.Life.Choices Generosity Corruption
## Czechia 0.715 0.660 0.158 0.048
## Uzbekistan 0.600 0.716 0.283 0.240
## Albania 0.719 0.511 0.138 0.028
## Madagascar* 0.378 0.202 0.143 0.154
## Clustering vector:
## Finland Denmark Iceland
## 1 1 1
## Switzerland Netherlands Luxembourg*
## 1 1 1
## Sweden Norway Israel
## 1 1 1
## New Zealand Austria Australia
## 1 1 1
## Ireland Germany Canada
## 1 1 1
## United States United Kingdom Czechia
## 1 1 1
## Belgium France Bahrain
## 1 1 1
## Slovenia Costa Rica United Arab Emirates
## 1 1 1
## Saudi Arabia Taiwan Province of China Singapore
## 1 1 1
## Romania Spain Uruguay
## 1 1 1
## Italy Kosovo Malta
## 1 1 1
## Lithuania Slovakia Estonia
## 1 1 2
## Panama Brazil Guatemala*
## 2 2 2
## Kazakhstan Cyprus Latvia
## 2 2 2
## Serbia Chile Nicaragua
## 2 2 2
## Mexico Croatia Poland
## 2 2 2
## El Salvador Kuwait* Hungary
## 2 2 2
## Mauritius Uzbekistan Japan
## 2 2 2
## Honduras Portugal Argentina
## 2 2 2
## Greece South Korea Philippines
## 2 2 2
## Thailand Moldova Jamaica
## 2 2 2
## Kyrgyzstan Belarus* Colombia
## 2 2 2
## Bosnia and Herzegovina Mongolia Dominican Republic
## 2 2 2
## Malaysia Bolivia China
## 2 2 3
## Paraguay Peru Montenegro
## 3 3 3
## Ecuador Vietnam Turkmenistan*
## 3 3 3
## North Cyprus* Russia Hong Kong S.A.R. of China
## 3 3 3
## Armenia Tajikistan Nepal
## 3 3 3
## Bulgaria Libya* Indonesia
## 3 3 3
## Ivory Coast North Macedonia Albania
## 3 3 3
## South Africa Azerbaijan* Gambia*
## 3 3 3
## Bangladesh Laos Algeria
## 3 3 3
## Liberia* Ukraine Congo
## 3 3 3
## Morocco Mozambique Cameroon
## 3 3 3
## Senegal Niger* Georgia
## 3 3 3
## Gabon Iraq Venezuela
## 3 3 3
## Guinea Iran Ghana
## 4 4 4
## Turkey Burkina Faso Cambodia
## 4 4 4
## Benin Comoros* Uganda
## 4 4 4
## Nigeria Kenya Tunisia
## 4 4 4
## Pakistan Palestinian Territories* Mali
## 4 4 4
## Namibia Eswatini, Kingdom of* Myanmar
## 4 4 4
## Sri Lanka Madagascar* Egypt
## 4 4 4
## Chad* Ethiopia Yemen*
## 4 4 4
## Mauritania* Jordan Togo
## 4 4 4
## India Zambia Malawi
## 4 4 4
## Tanzania Sierra Leone Lesotho*
## 4 4 4
## Botswana* Rwanda* Zimbabwe
## 4 4 4
## Lebanon Afghanistan
## 4 4
## Objective function:
## build swap
## 10.170760 9.208021
##
## Numerical information per cluster:
## size max_diss av_diss diameter separation
## [1,] 35 17.08093 8.789478 34.09799 1.255182
## [2,] 36 18.02213 9.044807 35.03725 1.205409
## [3,] 37 18.10055 9.302966 36.07471 1.205409
## [4,] 38 19.03006 9.655699 37.27807 1.429613
##
## Isolated clusters:
## L-clusters: character(0)
## L*-clusters: character(0)
##
## Silhouette plot information:
## cluster neighbor sil_width
## Germany 1 2 0.7595133422
## Ireland 1 2 0.7586009668
## Canada 1 2 0.7585917070
## Australia 1 2 0.7569272096
## United States 1 2 0.7560033701
## Austria 1 2 0.7536764810
## United Kingdom 1 2 0.7517634780
## New Zealand 1 2 0.7490269684
## Czechia 1 2 0.7454210153
## Israel 1 2 0.7426539650
## Belgium 1 2 0.7371673190
## Norway 1 2 0.7367100157
## Sweden 1 2 0.7292083078
## France 1 2 0.7265825491
## Luxembourg* 1 2 0.7206307766
## Bahrain 1 2 0.7136447373
## Netherlands 1 2 0.7113808266
## Switzerland 1 2 0.7010827199
## Slovenia 1 2 0.6979737349
## Iceland 1 2 0.6899925092
## Costa Rica 1 2 0.6788332211
## Denmark 1 2 0.6783118514
## Finland 1 2 0.6655067127
## United Arab Emirates 1 2 0.6573388701
## Saudi Arabia 1 2 0.6319990495
## Taiwan Province of China 1 2 0.6024813457
## Singapore 1 2 0.5644558086
## Romania 1 2 0.5288435167
## Spain 1 2 0.4855899980
## Uruguay 1 2 0.4350480751
## Italy 1 2 0.3773225287
## Kosovo 1 2 0.3099657170
## Malta 1 2 0.2362374172
## Lithuania 1 2 0.1514459547
## Slovakia 1 2 0.0533736258
## Japan 2 1 0.7412578037
## Uzbekistan 2 1 0.7345170591
## Honduras 2 3 0.7314534849
## Mauritius 2 1 0.7255516017
## Portugal 2 3 0.7216987368
## Hungary 2 1 0.7135608070
## Argentina 2 3 0.7085689579
## Kuwait* 2 1 0.6987526836
## Greece 2 3 0.6919676491
## El Salvador 2 1 0.6809690663
## South Korea 2 3 0.6724712756
## Poland 2 1 0.6619113221
## Philippines 2 3 0.6502107833
## Croatia 2 1 0.6388195415
## Thailand 2 3 0.6245517764
## Mexico 2 1 0.6117196055
## Moldova 2 3 0.5954307633
## Nicaragua 2 1 0.5796262065
## Jamaica 2 3 0.5614327495
## Chile 2 1 0.5447564440
## Kyrgyzstan 2 3 0.5221777162
## Serbia 2 1 0.5039407917
## Belarus* 2 3 0.4782634347
## Latvia 2 1 0.4569392462
## Colombia 2 3 0.4283688712
## Cyprus 2 1 0.4033306608
## Bosnia and Herzegovina 2 3 0.3713454464
## Kazakhstan 2 1 0.3425036083
## Mongolia 2 3 0.3064135865
## Guatemala* 2 1 0.2721737232
## Dominican Republic 2 3 0.2327071126
## Brazil 2 1 0.1944243938
## Malaysia 2 3 0.1482013456
## Panama 2 1 0.1040777809
## Bolivia 2 3 0.0527087180
## Estonia 2 1 0.0005763932
## Albania 3 2 0.7382868515
## South Africa 3 4 0.7375549855
## North Macedonia 3 2 0.7298976303
## Azerbaijan* 3 4 0.7261971792
## Ivory Coast 3 2 0.7174401100
## Gambia* 3 4 0.7142985447
## Indonesia 3 2 0.7067905682
## Bangladesh 3 4 0.7018317846
## Libya* 3 2 0.6924467628
## Laos 3 4 0.6848750756
## Bulgaria 3 2 0.6737928410
## Algeria 3 4 0.6646439356
## Nepal 3 2 0.6531218138
## Liberia* 3 4 0.6398497650
## Tajikistan 3 2 0.6297393079
## Ukraine 3 4 0.6141339096
## Armenia 3 2 0.6026124350
## Congo 3 4 0.5854049719
## Hong Kong S.A.R. of China 3 2 0.5674326008
## Morocco 3 4 0.5517742584
## Russia 3 2 0.5347467700
## Mozambique 3 4 0.5124546708
## North Cyprus* 3 2 0.4931456605
## Cameroon 3 4 0.4698663684
## Turkmenistan* 3 2 0.4466972355
## Senegal 3 4 0.4201759699
## Vietnam 3 2 0.3946836159
## Niger* 3 4 0.3624589466
## Ecuador 3 2 0.3345678504
## Georgia 3 4 0.2983171489
## Montenegro 3 2 0.2664792411
## Gabon 3 4 0.2270914360
## Peru 3 2 0.1887689761
## Iraq 3 4 0.1451401455
## Paraguay 3 2 0.1006728898
## Venezuela 3 4 0.0530647970
## China 3 2 0.0001635217
## Yemen* 4 3 0.7523925571
## Mauritania* 4 3 0.7521573607
## Ethiopia 4 3 0.7519366797
## Jordan 4 3 0.7493133322
## Chad* 4 3 0.7473525434
## Togo 4 3 0.7461127035
## Egypt 4 3 0.7449973665
## India 4 3 0.7429429302
## Madagascar* 4 3 0.7393590620
## Zambia 4 3 0.7387759210
## Malawi 4 3 0.7323917242
## Sri Lanka 4 3 0.7319472067
## Tanzania 4 3 0.7258098910
## Myanmar 4 3 0.7242559554
## Sierra Leone 4 3 0.7182202532
## Eswatini, Kingdom of* 4 3 0.7140362963
## Lesotho* 4 3 0.7098145422
## Namibia 4 3 0.7010458189
## Botswana* 4 3 0.6996621160
## Rwanda* 4 3 0.6904339249
## Mali 4 3 0.6837724392
## Zimbabwe 4 3 0.6801517250
## Lebanon 4 3 0.6686842716
## Palestinian Territories* 4 3 0.6672064100
## Afghanistan 4 3 0.6556871663
## Pakistan 4 3 0.6461415713
## Tunisia 4 3 0.6219287681
## Kenya 4 3 0.5945059904
## Nigeria 4 3 0.5626716861
## Uganda 4 3 0.5260254409
## Comoros* 4 3 0.4842197463
## Benin 4 3 0.4365731633
## Cambodia 4 3 0.3855606797
## Burkina Faso 4 3 0.3253293628
## Turkey 4 3 0.2573822485
## Ghana 4 3 0.1819818493
## Iran 4 3 0.0961957690
## Guinea 4 3 -0.0031669666
## Average silhouette width per cluster:
## [1] 0.6215230 0.5029828 0.5021789 0.6074687
## Average silhouette width of total data set:
## [1] 0.5583912
##
## Available components:
## [1] "medoids" "id.med" "clustering" "objective" "isolation"
## [6] "clusinfo" "silinfo" "diss" "call" "data"
## [11] "clust_plot" "nbclust"
CLARA
clara<-eclust(data, "clara", k=4)
fviz_silhouette(clara)
## cluster size ave.sil.width
## 1 1 36 0.59
## 2 2 33 0.53
## 3 3 35 0.52
## 4 4 42 0.57
summary(clara)
## Object of class 'clara' from call:
## fun_clust(x = x, k = k)
## Medoids:
## RANK Happiness.score Whisker.high Whisker.low
## France 20 6.687 6.758 6.615
## Uzbekistan 53 6.063 6.178 5.948
## Bulgaria 85 5.371 5.485 5.257
## Namibia 124 4.459 4.593 4.326
## Dystopia..1.83....residual GDP.per.capita Social.support
## France 1.895 1.863 1.219
## Uzbekistan 1.913 1.219 1.092
## Bulgaria 1.235 1.625 1.163
## Namibia 1.414 1.292 0.877
## Life.expectancy Freedom.of.Life.Choices Generosity Corruption
## France 0.808 0.567 0.070 0.266
## Uzbekistan 0.600 0.716 0.283 0.240
## Bulgaria 0.640 0.563 0.123 0.021
## Namibia 0.354 0.384 0.067 0.071
## Objective function: 9.323045
## Numerical information per cluster:
## size max_diss av_diss isolation
## [1,] 36 19.11423 9.102438 0.5787639
## [2,] 33 16.01807 8.289128 0.5000365
## [3,] 35 19.10712 8.947259 0.5964675
## [4,] 42 22.31404 10.637653 0.5716216
## Average silhouette width per cluster:
## [1] 0.5937950 0.5286837 0.5218293 0.5703053
## Average silhouette width of best sample: 0.5550687
##
## Best sample:
## [1] Finland Denmark Netherlands
## [4] Luxembourg* New Zealand Austria
## [7] France Slovenia Saudi Arabia
## [10] Taiwan Province of China Singapore Spain
## [13] Italy Malta Mexico
## [16] Croatia Kuwait* Mauritius
## [19] Uzbekistan South Korea Jamaica
## [22] Kyrgyzstan Bosnia and Herzegovina Paraguay
## [25] Ecuador Turkmenistan* North Cyprus*
## [28] Russia Bulgaria Indonesia
## [31] Ivory Coast Albania Bangladesh
## [34] Congo Cameroon Gabon
## [37] Venezuela Iran Ghana
## [40] Tunisia Mali Namibia
## [43] Myanmar Sri Lanka Ethiopia
## [46] Mauritania* India Rwanda*
## Clustering vector:
## Finland Denmark Iceland
## 1 1 1
## Switzerland Netherlands Luxembourg*
## 1 1 1
## Sweden Norway Israel
## 1 1 1
## New Zealand Austria Australia
## 1 1 1
## Ireland Germany Canada
## 1 1 1
## United States United Kingdom Czechia
## 1 1 1
## Belgium France Bahrain
## 1 1 1
## Slovenia Costa Rica United Arab Emirates
## 1 1 1
## Saudi Arabia Taiwan Province of China Singapore
## 1 1 1
## Romania Spain Uruguay
## 1 1 1
## Italy Kosovo Malta
## 1 1 1
## Lithuania Slovakia Estonia
## 1 1 1
## Panama Brazil Guatemala*
## 2 2 2
## Kazakhstan Cyprus Latvia
## 2 2 2
## Serbia Chile Nicaragua
## 2 2 2
## Mexico Croatia Poland
## 2 2 2
## El Salvador Kuwait* Hungary
## 2 2 2
## Mauritius Uzbekistan Japan
## 2 2 2
## Honduras Portugal Argentina
## 2 2 2
## Greece South Korea Philippines
## 2 2 2
## Thailand Moldova Jamaica
## 2 2 2
## Kyrgyzstan Belarus* Colombia
## 2 2 2
## Bosnia and Herzegovina Mongolia Dominican Republic
## 2 2 2
## Malaysia Bolivia China
## 3 3 3
## Paraguay Peru Montenegro
## 3 3 3
## Ecuador Vietnam Turkmenistan*
## 3 3 3
## North Cyprus* Russia Hong Kong S.A.R. of China
## 3 3 3
## Armenia Tajikistan Nepal
## 3 3 3
## Bulgaria Libya* Indonesia
## 3 3 3
## Ivory Coast North Macedonia Albania
## 3 3 3
## South Africa Azerbaijan* Gambia*
## 3 3 3
## Bangladesh Laos Algeria
## 3 3 3
## Liberia* Ukraine Congo
## 3 3 3
## Morocco Mozambique Cameroon
## 3 3 3
## Senegal Niger* Georgia
## 3 3 4
## Gabon Iraq Venezuela
## 4 4 4
## Guinea Iran Ghana
## 4 4 4
## Turkey Burkina Faso Cambodia
## 4 4 4
## Benin Comoros* Uganda
## 4 4 4
## Nigeria Kenya Tunisia
## 4 4 4
## Pakistan Palestinian Territories* Mali
## 4 4 4
## Namibia Eswatini, Kingdom of* Myanmar
## 4 4 4
## Sri Lanka Madagascar* Egypt
## 4 4 4
## Chad* Ethiopia Yemen*
## 4 4 4
## Mauritania* Jordan Togo
## 4 4 4
## India Zambia Malawi
## 4 4 4
## Tanzania Sierra Leone Lesotho*
## 4 4 4
## Botswana* Rwanda* Zimbabwe
## 4 4 4
## Lebanon Afghanistan
## 4 4
##
## Silhouette plot information for best sample:
## cluster neighbor sil_width
## Germany 1 2 0.74725735
## Canada 1 2 0.74660455
## Ireland 1 2 0.74613453
## United States 1 2 0.74430788
## Australia 1 2 0.74425594
## Austria 1 2 0.74084510
## United Kingdom 1 2 0.74041577
## New Zealand 1 2 0.73607804
## Czechia 1 2 0.73444243
## Israel 1 2 0.72960416
## Belgium 1 2 0.72661138
## Norway 1 2 0.72360277
## France 1 2 0.71651231
## Sweden 1 2 0.71606253
## Luxembourg* 1 2 0.70747740
## Bahrain 1 2 0.70407002
## Netherlands 1 2 0.69824215
## Slovenia 1 2 0.68896107
## Switzerland 1 2 0.68798836
## Iceland 1 2 0.67696491
## Costa Rica 1 2 0.67038643
## Denmark 1 2 0.66537728
## Finland 1 2 0.65268663
## United Arab Emirates 1 2 0.64956284
## Saudi Arabia 1 2 0.62488099
## Taiwan Province of China 1 2 0.59608381
## Singapore 1 2 0.55890886
## Romania 1 2 0.52389589
## Spain 1 2 0.48151972
## Uruguay 1 2 0.43176319
## Italy 1 2 0.37472212
## Kosovo 1 2 0.30781900
## Malta 1 2 0.23524965
## Lithuania 1 2 0.15061001
## Slovakia 1 2 0.05257141
## Estonia 1 2 -0.05585569
## Uzbekistan 2 3 0.74884378
## Mauritius 2 1 0.74478537
## Japan 2 3 0.73959322
## Hungary 2 1 0.73381083
## Honduras 2 3 0.72768960
## Kuwait* 2 1 0.71983599
## Portugal 2 3 0.71556657
## El Salvador 2 1 0.70278052
## Argentina 2 3 0.69939379
## Poland 2 1 0.68430872
## Greece 2 3 0.67895228
## Croatia 2 1 0.66161146
## South Korea 2 3 0.65477095
## Mexico 2 1 0.63465960
## Philippines 2 3 0.62708746
## Nicaragua 2 1 0.60236352
## Thailand 2 3 0.59489808
## Chile 2 1 0.56706801
## Moldova 2 3 0.55823986
## Serbia 2 1 0.52543660
## Jamaica 2 3 0.51517340
## Latvia 2 1 0.47713823
## Kyrgyzstan 2 3 0.46511662
## Cyprus 2 1 0.42181641
## Belarus* 2 3 0.40852551
## Kazakhstan 2 1 0.35857578
## Colombia 2 3 0.34366412
## Guatemala* 2 1 0.28578547
## Bosnia and Herzegovina 2 3 0.26888911
## Brazil 2 1 0.20419915
## Mongolia 2 3 0.18305129
## Panama 2 1 0.10904084
## Dominican Republic 2 3 0.08388944
## North Macedonia 3 2 0.74499755
## Ivory Coast 3 2 0.73784555
## Albania 3 4 0.73767781
## Indonesia 3 2 0.73308345
## Libya* 3 2 0.72479888
## South Africa 3 4 0.72397170
## Bulgaria 3 2 0.71247627
## Azerbaijan* 3 4 0.70613626
## Nepal 3 2 0.69838734
## Gambia* 3 4 0.68710083
## Tajikistan 3 2 0.68211654
## Bangladesh 3 4 0.66712555
## Armenia 3 2 0.66257223
## Laos 3 4 0.64170245
## Hong Kong S.A.R. of China 3 2 0.63513833
## Algeria 3 4 0.61204240
## Russia 3 2 0.61128311
## North Cyprus* 3 2 0.57876161
## Liberia* 3 4 0.57681972
## Turkmenistan* 3 2 0.54220123
## Ukraine 3 4 0.53967567
## Vietnam 3 2 0.50103447
## Congo 3 4 0.49839150
## Ecuador 3 2 0.45262026
## Morocco 3 4 0.45063819
## Montenegro 3 2 0.39727961
## Mozambique 3 4 0.39584458
## Cameroon 3 4 0.33546270
## Peru 3 2 0.33359868
## Senegal 3 4 0.26607642
## Paraguay 3 2 0.26091098
## Niger* 3 4 0.18754994
## China 3 2 0.17740087
## Bolivia 3 2 0.08047560
## Malaysia 3 2 -0.02917339
## Ethiopia 4 3 0.73699988
## Yemen* 4 3 0.73609089
## Mauritania* 4 3 0.73463673
## Chad* 4 3 0.73451677
## Egypt 4 3 0.73390166
## Jordan 4 3 0.73100689
## Madagascar* 4 3 0.73077264
## Togo 4 3 0.72722202
## Sri Lanka 4 3 0.72602775
## India 4 3 0.72334058
## Myanmar 4 3 0.72134370
## Zambia 4 3 0.71877969
## Eswatini, Kingdom of* 4 3 0.71470670
## Malawi 4 3 0.71235570
## Namibia 4 3 0.70581704
## Tanzania 4 3 0.70573066
## Sierra Leone 4 3 0.69833514
## Mali 4 3 0.69358583
## Lesotho* 4 3 0.69022780
## Palestinian Territories* 4 3 0.68205084
## Botswana* 4 3 0.68059500
## Rwanda* 4 3 0.67183267
## Pakistan 4 3 0.66710584
## Zimbabwe 4 3 0.66215975
## Lebanon 4 3 0.65148817
## Tunisia 4 3 0.64972640
## Afghanistan 4 3 0.63948604
## Kenya 4 3 0.63007573
## Nigeria 4 3 0.60713680
## Uganda 4 3 0.58067714
## Comoros* 4 3 0.55045597
## Benin 4 3 0.51591565
## Cambodia 4 3 0.47942775
## Burkina Faso 4 3 0.43651393
## Turkey 4 3 0.38720344
## Ghana 4 3 0.33502475
## Iran 4 3 0.27402595
## Guinea 4 3 0.20496715
## Venezuela 4 3 0.12538222
## Iraq 4 3 0.04218809
## Gabon 4 3 -0.05309814
## Georgia 4 3 -0.14291460
##
## 1128 dissimilarities, summarized :
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.022 21.026 44.045 49.393 74.055 142.240
## Metric : euclidean
## Number of objects : 48
##
## Available components:
## [1] "sample" "medoids" "i.med" "clustering" "objective"
## [6] "clusinfo" "diss" "call" "silinfo" "data"
## [11] "clust_plot" "nbclust"
After comparing all the clustering options, the two-cluster solution
looks strongest by silhouette, with an ASW as high as 0.62. The
four-cluster solution has a lower ASW (about 0.56) and shows some
overlap, which CLARA flags through negative silhouette widths for
Estonia, Malaysia, Gabon and Georgia. Even so, it summarizes the data
well and gives a more detailed overview of how countries separate in
terms of happiness.
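A comparison like this can be made in one pass by collecting the ASW for each candidate number of clusters. This is a minimal sketch, assuming the `cluster` package and the built-in `USArrests` data as a stand-in. The object names `ks`, `asw` and `best_k` are illustrative and do not appear elsewhere in this report.

```r
library(cluster)  # assumed to be installed

# Stand-in data (assumption): standardized USArrests, not the happiness data.
x <- scale(USArrests)

# Candidate numbers of clusters to compare.
ks <- 2:4

# Average silhouette width of a PAM fit for each k.
asw <- sapply(ks, function(k) pam(x, k = k)$silinfo$avg.width)
names(asw) <- paste0("k=", ks)

print(round(asw, 3))

# The k with the highest ASW; interpretability may still favor another k.
best_k <- ks[which.max(asw)]
print(best_k)
```

The highest ASW is only one criterion. As in the discussion above, a solution with a slightly lower ASW can still be preferred when it describes the data more usefully.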
The goal of dimension reduction is to simplify the dataset by
identifying a smaller set of new variables (also called features or
dimensions) that retain most of the important information present in the
original variables. This can be helpful for a number of reasons, such as
visualizing high-dimensional data, improving the performance of machine
learning models, and identifying key variables that explain the most
variation in the data.
summary(data)
## RANK Happiness.score Whisker.high Whisker.low
## Min. : 1.00 Min. :2.404 Min. :2.469 Min. :2.339
## 1st Qu.: 37.25 1st Qu.:4.889 1st Qu.:5.006 1st Qu.:4.755
## Median : 73.50 Median :5.569 Median :5.680 Median :5.453
## Mean : 73.50 Mean :5.554 Mean :5.674 Mean :5.434
## 3rd Qu.:109.75 3rd Qu.:6.305 3rd Qu.:6.449 3rd Qu.:6.190
## Max. :146.00 Max. :7.821 Max. :7.886 Max. :7.756
## Dystopia..1.83....residual GDP.per.capita Social.support Life.expectancy
## Min. :0.187 Min. :0.000 Min. :0.0000 Min. :0.0000
## 1st Qu.:1.555 1st Qu.:1.095 1st Qu.:0.7320 1st Qu.:0.4632
## Median :1.895 Median :1.446 Median :0.9575 Median :0.6215
## Mean :1.832 Mean :1.410 Mean :0.9059 Mean :0.5862
## 3rd Qu.:2.153 3rd Qu.:1.785 3rd Qu.:1.1142 3rd Qu.:0.7198
## Max. :2.844 Max. :2.209 Max. :1.3200 Max. :0.9420
## Freedom.of.Life.Choices Generosity Corruption
## Min. :0.0000 Min. :0.0000 Min. :0.00000
## 1st Qu.:0.4405 1st Qu.:0.0890 1st Qu.:0.06825
## Median :0.5435 Median :0.1325 Median :0.11950
## Mean :0.5172 Mean :0.1474 Mean :0.15478
## 3rd Qu.:0.6260 3rd Qu.:0.1978 3rd Qu.:0.19850
## Max. :0.7400 Max. :0.4680 Max. :0.58700
Checking for Correlation
Checking for correlation helps to identify highly correlated variables,
which are candidates for removal. Such redundant variables add little
new information and can reduce the accuracy of a model on a large
dataset by encouraging overfitting.
correlation <- cor(data, method = 'pearson')
round(correlation, 2)
## RANK Happiness.score Whisker.high Whisker.low
## RANK 1.00 -0.98 -0.98 -0.98
## Happiness.score -0.98 1.00 1.00 1.00
## Whisker.high -0.98 1.00 1.00 1.00
## Whisker.low -0.98 1.00 1.00 1.00
## Dystopia..1.83....residual -0.44 0.50 0.51 0.48
## GDP.per.capita -0.79 0.76 0.75 0.77
## Social.support -0.77 0.78 0.77 0.78
## Life.expectancy -0.75 0.74 0.73 0.75
## Freedom.of.Life.Choices -0.62 0.62 0.62 0.63
## Generosity -0.03 0.06 0.07 0.06
## Corruption -0.40 0.42 0.41 0.42
## Dystopia..1.83....residual GDP.per.capita
## RANK -0.44 -0.79
## Happiness.score 0.50 0.76
## Whisker.high 0.51 0.75
## Whisker.low 0.48 0.77
## Dystopia..1.83....residual 1.00 -0.07
## GDP.per.capita -0.07 1.00
## Social.support 0.08 0.72
## Life.expectancy -0.01 0.82
## Freedom.of.Life.Choices 0.12 0.46
## Generosity 0.07 -0.16
## Corruption -0.05 0.38
## Social.support Life.expectancy
## RANK -0.77 -0.75
## Happiness.score 0.78 0.74
## Whisker.high 0.77 0.73
## Whisker.low 0.78 0.75
## Dystopia..1.83....residual 0.08 -0.01
## GDP.per.capita 0.72 0.82
## Social.support 1.00 0.67
## Life.expectancy 0.67 1.00
## Freedom.of.Life.Choices 0.48 0.43
## Generosity 0.00 -0.10
## Corruption 0.22 0.36
## Freedom.of.Life.Choices Generosity Corruption
## RANK -0.62 -0.03 -0.40
## Happiness.score 0.62 0.06 0.42
## Whisker.high 0.62 0.07 0.41
## Whisker.low 0.63 0.06 0.42
## Dystopia..1.83....residual 0.12 0.07 -0.05
## GDP.per.capita 0.46 -0.16 0.38
## Social.support 0.48 0.00 0.22
## Life.expectancy 0.43 -0.10 0.36
## Freedom.of.Life.Choices 1.00 0.18 0.40
## Generosity 0.18 1.00 0.10
## Corruption 0.40 0.10 1.00
corrplot(correlation, type = 'lower')
This plot shows a great deal of correlation among the variables, which
suggests that dimension reduction is worthwhile for this dataset.
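Besides reading the plot, the strongly correlated pairs can be listed explicitly. The sketch below uses base R only and the built-in `mtcars` data as a stand-in. The cutoff of 0.9 and the names `threshold`, `high` and `pairs` are assumptions made for illustration, not values taken from this analysis.

```r
# Stand-in data (assumption): built-in mtcars instead of the happiness data.
corr <- cor(mtcars, method = "pearson")

# Keep each pair once by blanking the upper triangle and the diagonal.
corr[upper.tri(corr, diag = TRUE)] <- NA

# Illustrative cutoff for "highly correlated"; choose it to suit the data.
threshold <- 0.9

# Row/column positions of the pairs whose absolute correlation exceeds it.
high <- which(abs(corr) > threshold, arr.ind = TRUE)

pairs <- data.frame(
  var1 = rownames(corr)[high[, "row"]],
  var2 = colnames(corr)[high[, "col"]],
  r    = corr[high]
)

print(pairs)
```

Applied to the correlation matrix printed above, the same approach would surface pairs such as `Happiness.score` with `Whisker.high` and `Whisker.low`, whose rounded correlations are 1.00.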
Dimension reduction with Principal Component Analysis (PCA) is used in
this project because the primary goal is to find the underlying
structure of the data by identifying the variables that contribute most
to its variation.
Optimal Number of components
pca <- prcomp(data, center = TRUE, scale. = TRUE)
fviz_eig(pca)
summary(pca)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 2.5844 1.1878 1.0677 0.8630 0.72075 0.54873 0.41899
## Proportion of Variance 0.6072 0.1283 0.1036 0.0677 0.04723 0.02737 0.01596
## Cumulative Proportion 0.6072 0.7355 0.8391 0.9068 0.95404 0.98141 0.99737
## PC8 PC9 PC10 PC11
## Standard deviation 0.16493 0.04155 0.0007337 0.0002421
## Proportion of Variance 0.00247 0.00016 0.0000000 0.0000000
## Cumulative Proportion 0.99984 1.00000 1.0000000 1.0000000
PC1 alone explains 60.72% of the total variance. This is a good sign,
but the eigenvalues should also be checked to decide how many
components to keep.
The eigenvalue of a principal component measures the amount of
variance that the component captures.
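As a worked check using only figures printed in this report: the eigenvalue of a component equals its squared standard deviation, and because the PCA was run on 11 standardized variables, the eigenvalues sum to 11. The symbols $\lambda_1$, $s_1$ and $p$ are notation introduced here for illustration.

```latex
\lambda_1 = s_1^2 = 2.5844^2 \approx 6.679,
\qquad
\frac{\lambda_1}{\sum_{j=1}^{p} \lambda_j} = \frac{6.679}{11} \approx 0.607
```

This reproduces the proportion of variance of 0.6072 reported for PC1 in the summary.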
Eigenvalue Calculation
eigen(cor(data))$values
## [1] 6.679248e+00 1.410970e+00 1.139978e+00 7.447409e-01 5.194834e-01
## [6] 3.011000e-01 1.755521e-01 2.720072e-02 1.726205e-03 5.383330e-07
## [11] 5.861220e-08
fviz_eig(pca, choice='eigenvalue')
As the plot above shows, three principal components have eigenvalues
greater than 1 (about 6.68, 1.41 and 1.14), and together they explain
about 84% of the variance. Following the Kaiser rule, the first three
components are therefore retained for the analysis. Because three
components pass this threshold, PCA is a suitable way to visualize this
data compared with other dimension reduction techniques.
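The Kaiser rule can also be applied in code rather than read from the plot. This is a minimal sketch in base R, using the built-in `mtcars` data as a stand-in. The names `eigenvalues`, `prop` and `keep` are illustrative.

```r
# Stand-in data (assumption): built-in mtcars instead of the happiness data.
x <- mtcars

# PCA on centered, standardized variables.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components.
eigenvalues <- pca$sdev^2

# Share of total variance explained by each component, and running total.
prop <- eigenvalues / sum(eigenvalues)
cumulative <- cumsum(prop)

# Kaiser rule: keep the components whose eigenvalue exceeds 1.
keep <- which(eigenvalues > 1)

print(round(eigenvalues, 3))
print(round(cumulative, 3))
print(keep)
```

The Kaiser rule is a heuristic, so it is worth reading it together with the scree plot and the cumulative proportion of variance.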
Variable Loading Plot
varpca <- get_pca_var(pca)
options(ggrepel.max.overlaps = Inf)
fviz_pca_var(pca, col.var="steelblue", alpha.var="contrib", repel = TRUE)
Dim1 (Principal Component 1, PC1) captures 60.7% of the variation and
Dim2 (Principal Component 2, PC2) captures 12.8%. Most of the variables
with large loadings in the variable loading plot above contribute
negatively to Dim1.
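The signs seen in a loading plot can be inspected numerically through the rotation matrix. The sketch below uses base R and the built-in `mtcars` data as a stand-in. It also illustrates that the overall sign of a principal component is arbitrary, so negative loadings on Dim1 describe direction only, not a weaker contribution.

```r
# Stand-in data (assumption): built-in mtcars instead of the happiness data.
pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Loadings of every variable on the first component, smallest first.
loadings_pc1 <- pca$rotation[, "PC1"]
print(round(sort(loadings_pc1), 3))

# Flipping the sign of the scores leaves the captured variance unchanged.
scores_pc1 <- pca$x[, "PC1"]
flipped_scores <- -scores_pc1

print(var(scores_pc1))
print(var(flipped_scores))
```

What matters for interpretation is the magnitude of each loading and whether variables share a sign or have opposite signs.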
Contribution Variables
fviz_contrib(pca, choice = "var", axes = 1:3)
The plot above shows the contribution of each variable to the first
three principal components. The variables with the largest
contributions are the ones that explain most of the variability in the
dataset. The variables with the lowest correlation with any of the
principal components can be dropped to simplify the analysis. Although
‘Whisker.low’ and ‘Whisker.high’ are part of the dataset, they only
describe the confidence bounds of the happiness score. It can therefore
be concluded that the dataset is described mainly by
‘Happiness.score’, ‘Dystopia’, ‘RANK’ and ‘GDP.per.capita’, which have
the largest contributions to the principal components.
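Contributions of this kind can be computed by hand from the PCA object. The sketch below uses base R with the built-in `mtcars` data as a stand-in. It assumes that a variable's contribution to one component is its squared loading expressed as a percentage, and it combines several components with an eigenvalue-weighted average. That weighting is one common convention and is not confirmed here to match the exact calculation behind the plot.

```r
# Stand-in data (assumption): built-in mtcars instead of the happiness data.
pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)

# Components to combine, mirroring the three retained in the text.
axes <- 1:3

# Eigenvalues (variances) of the selected components.
eig <- pca$sdev[axes]^2

# Contribution (%) of each variable to each component: squared loading x 100.
# Loadings have unit length, so every column sums to 100.
contrib <- 100 * pca$rotation[, axes]^2

# Eigenvalue-weighted average contribution across the selected components.
combined <- as.vector(contrib %*% eig) / sum(eig)
names(combined) <- rownames(contrib)

print(round(sort(combined, decreasing = TRUE), 2))
```

Variables at the bottom of the sorted output are the natural candidates for removal.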
- http://www.sthda.com/english/articles/31-principal-component-methods-in-r-practical-guide/112-pca-principal-component-analysis-essentials/
- https://blogs.sas.com/content/iml/2019/11/04/interpret-graphs-principal-components.html
- https://builtin.com/data-science/step-step-explanation-principal-component-analysis
- https://towardsdatascience.com/clustering-unsupervised-learning-788b215b074b