The primary goal of this study is to identify similarities among European countries in terms of their level of development. Two key variables are used for this purpose. The first, GDP per capita, reflects a country’s economic and financial prosperity. The second, the Human Development Index (HDI), captures the quality of human capital, including factors such as life expectancy and years of education. Together, these variables provide a comprehensive view of a country’s development level. To explore patterns in the data, multiple clustering algorithms will be applied: k-means, PAM (Partitioning Around Medoids) and hierarchical clustering (both agglomerative and divisive approaches). Using different methods will help verify the robustness of the results and uncover meaningful groupings among European countries.
I created the dataset myself using data from 2023. GDP per capita values were obtained from the World Bank, while HDI values were taken from the UNDP Human Development Reports. Some countries were excluded due to missing values: San Marino and Monaco had incomplete data for either GDP or HDI, and Kosovo was excluded because its HDI is not reported and its statehood is not recognized by all countries. Russia was included because the most developed part of the country is located in Europe.
#Loading the libraries
library(dplyr)
library(utils)
library(factoextra)
library(ggplot2)
library(fpc)
library(clustertend)
library(NbClust)
library(readxl)
library(stats)
library(cluster)
library(maps)
library(tidyverse)
library(dendextend)
library(gridExtra)
library(grid)
library(hopkins)
library(googlesheets4)
library(readr)
# Importing the dataset from Google Drive and presenting its basic descriptive statistics:
url <- "https://docs.google.com/spreadsheets/d/1Rs6R63fb-JOXBKLn5sHHjuQBzx32VPTn/export?format=csv"
europe <- read_csv(
url,
locale = locale(decimal_mark = ",")
)
head(europe)
## # A tibble: 6 × 3
## Country hdi GDP_per_capita
## <chr> <dbl> <dbl>
## 1 Albania 0.81 8575.
## 2 Andorra 0.913 46812.
## 3 Austria 0.93 56034.
## 4 Belarus 0.824 7897.
## 5 Belgium 0.951 54690.
## 6 Bosnia and Herzegovina 0.804 8663.
summary(europe)
## Country hdi GDP_per_capita
## Length:42 Min. :0.7790 Min. : 5140
## Class :character 1st Qu.:0.8640 1st Qu.: 19270
## Mode :character Median :0.9130 Median : 32101
## Mean :0.8987 Mean : 43156
## 3rd Qu.:0.9440 3rd Qu.: 54503
## Max. :0.9720 Max. :207974
# Making a plot:
ggplot(europe, aes(x = hdi, y = GDP_per_capita)) +
geom_point(color = "black", size = 1.5) +
labs(x = "HDI", y = "GDP per capita") +
theme_gray(base_size = 14) +
theme(panel.grid.major = element_line(color = "gray80"),
panel.grid.minor = element_blank())
The algorithms applied in this study require the variables to be standardized before the analysis.
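Z-score standardization rescales each variable to have mean 0 and standard deviation 1, so that the two variables, which are measured on very different scales, contribute comparably to the distance calculations:

$$z_i = \frac{x_i - \bar{x}}{s_x}$$

where $\bar{x}$ and $s_x$ denote the sample mean and standard deviation of the variable.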
# Standardizing the variables using Z-score standardization:
data <- europe[, 2:3]
data_z <- as.data.frame(lapply(data, scale))
head(data_z)
## hdi GDP_per_capita
## 1 -1.6355506 -0.88692481
## 2 0.2633737 0.09379253
## 3 0.5767884 0.33029765
## 4 -1.3774444 -0.90432506
## 5 0.9639477 0.29583980
## 6 -1.7461676 -0.88467162
It is important to check whether the data are clusterable at all. For this reason, the Hopkins statistic will be calculated: values close to 1 indicate a strong clustering tendency, while values around 0.5 indicate essentially random, uniformly spread data. The "hopkins" library will be used, as it is actively maintained and implements the statistic in its modern form.
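For reference, the statistic compares, for $m$ sampled locations, the nearest-neighbor distance $u_i$ from a point drawn uniformly over the data space to the data with the nearest-neighbor distance $w_i$ from a randomly chosen data point to the rest of the data. In the form used by the hopkins package (with distances raised to the power of the data dimension $d$), it is roughly:

$$H = \frac{\sum_{i=1}^{m} u_i^{d}}{\sum_{i=1}^{m} u_i^{d} + \sum_{i=1}^{m} w_i^{d}}$$

Values near 0.5 correspond to uniformly spread data, while values near 1 indicate a clear clustering tendency.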
#Checking dataset clusterability using Hopkins statistic
data_matrix <- as.matrix(data_z)
set.seed(123)
hopkins_stat <- round(hopkins(data_matrix), digits=2)
print(hopkins_stat)
## [1] 0.95
The value of the Hopkins statistic is 0.95, which suggests a strong clustering tendency in the data.
The first algorithm used to cluster the data is k-means. It works as follows: k initial centroids are chosen (for example, k randomly selected observations); each observation is assigned to its nearest centroid; the centroids are then recomputed as the means of the observations assigned to them; and the last two steps are repeated until the assignments no longer change.
The ultimate goal of the algorithm is to minimize the total within-cluster sum of squares, i.e. the squared distances between each observation and the centroid of its cluster.
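In formula form, for a partition into clusters $C_1,\dots,C_k$ with centroids $\mu_1,\dots,\mu_k$, k-means seeks to minimize:

$$\text{WSS} = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2$$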
The first step is to determine the optimal number of clusters. For this purpose, both the elbow method and the silhouette method are applied:
# Finding the optimal number of clusters using elbow method and silhouette method.
elb_meth_kmeans<-fviz_nbclust(data_z, kmeans, method = "wss")
sil_meth_kmeans<-fviz_nbclust(data_z, kmeans, method = "silhouette")
grid.arrange(elb_meth_kmeans, sil_meth_kmeans,
nrow=1,
top = textGrob("Optimal number of clusters for k-means",
gp = gpar(fontsize = 16, fontface = "bold")))
Both methods suggest that 4 is the optimal number of clusters. However, k=3 does not appear to be significantly worse, so it will also be considered. Cluster quality measures, such as the average silhouette width and the Calinski-Harabasz index, will be calculated to determine which number of clusters actually fits the data better and will be chosen as the final solution.
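Before looking at k=3 and k=4 in detail, both quality measures can also be computed for a whole range of k in one loop, as a complementary check. This is only an illustrative sketch (the names k_range and quality_check are introduced here), reusing kmeans(), silhouette() and calinhara() from the already loaded packages:
#Illustrative sketch: average silhouette width and Calinski-Harabasz index for k = 2..6
set.seed(123)
k_range <- 2:6
quality_check <- sapply(k_range, function(k) {
  km <- kmeans(data_z, centers = k, nstart = 25)
  sil <- silhouette(km$cluster, dist(data_z))
  c(avg_sil = mean(sil[, "sil_width"]),
    ch_index = calinhara(data_z, km$cluster))
})
colnames(quality_check) <- paste0("k=", k_range)
round(quality_check, 2)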
#K-means algorithm for k=4:
set.seed(123)
k_means_4 <- kmeans(data_z, centers=4, nstart=25)
#Average silhouette width for k=4:
sil_kmeans_4 <- silhouette(k_means_4$cluster, dist(data_z))
fviz_silhouette(sil_kmeans_4)
## cluster size ave.sil.width
## 1 1 11 0.61
## 2 2 12 0.49
## 3 3 17 0.60
## 4 4 2 0.17
#Calinski-Harabasz index for k=4:
round(calinhara(data_z, k_means_4$cluster), digits = 2)
## [1] 93.23
The average silhouette width is equal to 0.55, which means the clusters are reasonably well-separated. The Calinski-Harabasz index is equal to 93.23.
#K-means algorithm for k=3:
set.seed(123)
k_means_3 <- kmeans(data_z, centers=3, nstart=25)
#Average silhouette width for k=3:
sil_kmeans_3 <- silhouette(k_means_3$cluster, dist(data_z))
fviz_silhouette(sil_kmeans_3)
## cluster size ave.sil.width
## 1 1 12 0.63
## 2 2 6 0.32
## 3 3 24 0.54
#Calinski-Harabasz index for k=3:
round(calinhara(data_z, k_means_3$cluster), digits = 2)
## [1] 63.73
Both the average silhouette width and the Calinski-Harabasz index are higher for k=4, suggesting that it produces more distinct and compact clusters compared to k=3. Therefore, k=4 is chosen as the final number of clusters for the K-means algorithm.
# Visualization of the clusters:
plot_kmeans <- fviz_cluster(k_means_4, data = data_z, frame = FALSE, geom = "point", main = "K-means Algorithm Results (k=4)")
plot_kmeans
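To make the clusters easier to interpret in the original units, the mean HDI and GDP per capita per k-means cluster can be summarized with the already loaded dplyr. This is an illustrative sketch (the temporary column name cluster is introduced here):
#Illustrative summary: mean HDI and GDP per capita by k-means cluster
europe %>%
  mutate(cluster = k_means_4$cluster) %>%
  group_by(cluster) %>%
  summarise(mean_hdi = mean(hdi), mean_gdp_per_capita = mean(GDP_per_capita))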
It is now time to assign clusters to each country in the dataset.
# Assigning number of k-means cluster to each observation in "europe" dataset:
europe$cluster_kmeans <- k_means_4$cluster
head(europe)
## # A tibble: 6 × 4
## Country hdi GDP_per_capita cluster_kmeans
## <chr> <dbl> <dbl> <int>
## 1 Albania 0.81 8575. 1
## 2 Andorra 0.913 46812. 3
## 3 Austria 0.93 56034. 2
## 4 Belarus 0.824 7897. 1
## 5 Belgium 0.951 54690. 2
## 6 Bosnia and Herzegovina 0.804 8663. 1
The PAM (Partitioning Around Medoids) algorithm is the next method applied in this study. It works as follows: k observations are selected as initial medoids (the BUILD phase); each remaining observation is assigned to its nearest medoid; medoids are then iteratively swapped with non-medoid observations whenever the swap reduces the total dissimilarity (the SWAP phase), until no further improvement is possible.
The ultimate goal is to minimize the sum of dissimilarities between points and their assigned medoids. PAM is more robust than k-means in the presence of noise and outliers, because medoids are actual observations and are therefore less influenced by extreme values than the means computed in k-means.
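In formula form, PAM chooses k medoids $m_1,\dots,m_k$ from among the observations themselves so as to minimize:

$$\sum_{i=1}^{n} \min_{j=1,\dots,k} d(x_i, m_j), \qquad m_j \in \{x_1,\dots,x_n\}$$

where $d(\cdot,\cdot)$ is the chosen dissimilarity (Euclidean distance in this study).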
The elbow method and the silhouette method are applied to find the optimal number of clusters:
elb_meth_pam<-fviz_nbclust(data_z, pam, method = "wss")
sil_meth_pam<-fviz_nbclust(data_z, pam, method = "silhouette")
grid.arrange(elb_meth_pam, sil_meth_pam,
nrow=1,
top = textGrob("Optimal number of clusters for PAM",
gp = gpar(fontsize = 16, fontface = "bold")))
Both the elbow and silhouette methods again suggest that 4 is the optimal number of clusters. However, as before, k=3 will also be considered, and cluster quality measures will be calculated to determine which k actually fits the data better.
#PAM algorithm for k=4:
set.seed(123)
pam_4 <- pam(data_z, 4)
#Average silhouette width for k=4:
sil_pam_4 <- silhouette(pam_4$clustering, dist(data_z))
fviz_silhouette(sil_pam_4)
## cluster size ave.sil.width
## 1 1 11 0.61
## 2 2 17 0.62
## 3 3 13 0.39
## 4 4 1 0.00
#Calinski-Harabasz index for k=4:
round(calinhara(data_z, pam_4$clustering), digits = 2)
## [1] 85.63
The average silhouette width is 0.53, suggesting that the clusters are reasonably well-separated. However, it is slightly lower than that obtained with K-means (0.55).
#PAM algorithm for k=3:
set.seed(123)
pam_3 <- pam(data_z, 3)
#Average silhouette width for k=3:
sil_pam_3 <- silhouette(pam_3$clustering, dist(data_z))
fviz_silhouette(sil_pam_3)
## cluster size ave.sil.width
## 1 1 11 0.61
## 2 2 17 0.65
## 3 3 14 0.25
#Calinski-Harabasz index for k=3:
round(calinhara(data_z, pam_3$clustering), digits = 2)
## [1] 54.37
Both the average silhouette width and the Calinski-Harabasz index are higher for k=4. Therefore, k=4 is chosen as the final number of clusters for the PAM algorithm.
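Since PAM's cluster centers are actual observations, it can also be informative to check which countries serve as medoids in the final solution. A small illustrative check, relying on the id.med component of the fitted pam object:
#Countries acting as medoids in the final PAM solution (k=4):
europe$Country[pam_4$id.med]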
#Visualization of the clusters:
plot_pam <- fviz_cluster(pam_4, data = data_z, frame = FALSE, geom = "point", main = "PAM Algorithm Results (k=4)")
plot_pam
It can be observed that, compared to K-means, similar clusters have different numbers assigned to them. For example, cluster 2 of K-means corresponds almost exactly to cluster 3 of PAM, and vice versa. The specific cluster numbers therefore carry no meaning in themselves; what matters is the actual grouping, and the results of each algorithm will ultimately be compared on a colored map of Europe.
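A quick way to verify this correspondence is to cross-tabulate the two label vectors; counts concentrated in single cells show that the groupings match even though the labels are permuted:
#Cross-tabulation of k-means and PAM cluster labels:
table(kmeans = k_means_4$cluster, pam = pam_4$clustering)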
It is now time to assign clusters to each country in the dataset.
#Assigning PAM clusters to each observation in the dataset:
europe$cluster_pam <- pam_4$clustering
head(europe)
## # A tibble: 6 × 5
## Country hdi GDP_per_capita cluster_kmeans cluster_pam
## <chr> <dbl> <dbl> <int> <int>
## 1 Albania 0.81 8575. 1 1
## 2 Andorra 0.913 46812. 3 2
## 3 Austria 0.93 56034. 2 3
## 4 Belarus 0.824 7897. 1 1
## 5 Belgium 0.951 54690. 2 3
## 6 Bosnia and Herzegovina 0.804 8663. 1 1
The next algorithm used in this study is hierarchical clustering, which has two approaches: agglomerative and divisive. Unlike the previous algorithms (K-means and PAM), hierarchical clustering does not require specifying the desired number of clusters at the beginning.
The agglomerative (bottom-up) approach works as follows: each observation starts as its own cluster, and at every step the two closest clusters, according to the chosen linkage method, are merged, until all observations end up in a single cluster.
First, let’s compare the agglomerative coefficients (AC) across different linkage methods to choose the best one.
#Comparing different agglomerative coefficients (AC) across different linkage methods:
m <- c( "average", "single", "complete", "ward")
ac <- function(x) {
agnes(data_z, method = x)$ac
}
map_dbl(m, ac)
## [1] 0.9346212 0.9037888 0.9462390 0.9682194
The “Ward” method has the highest agglomerative coefficient (~0.97), suggesting that it is the best linkage method.
#Hierarchical clustering (agglomerative approach) with "Ward" linkage method:
set.seed(123)
hc_agg <- agnes(data_z, method = "ward")
#Calculating agglomerative coefficient
round(hc_agg$ac, digits=2)
## [1] 0.97
The agglomerative coefficient is 0.97. Such a high value indicates a pronounced clustering structure in the data.
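For reference, the agglomerative coefficient reported by agnes() is based on $m(i)$, the dissimilarity at which observation $i$ is first merged with another cluster, divided by the dissimilarity of the final merger (as described in the cluster package documentation):

$$AC = \frac{1}{n}\sum_{i=1}^{n}\bigl(1 - m(i)\bigr)$$

Values close to 1 mean that observations join clusters at dissimilarities much smaller than the final merge, i.e. a strong clustering structure. Note that AC tends to grow with the number of observations, so it is most useful for comparing linkage methods on the same data, as done above.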
#Dendrogram for agglomerative approach:
pltree(hc_agg, cex = 0.6, hang = -1, main = "Hierarchical clustering (agglomerative approach)")
Now, it’s time to cut the dendrogram into three and four clusters and compute cluster quality metrics to assess the most appropriate number of clusters.
#4 clusters:
hc_agg_cut_4 <- cutree(hc_agg, k = 4)
#Average silhouette width for k=4:
sil_agg_4 <- silhouette(hc_agg_cut_4, dist(data_z))
fviz_silhouette(sil_agg_4)
## cluster size ave.sil.width
## 1 1 11 0.62
## 2 2 18 0.58
## 3 3 12 0.41
## 4 4 1 0.00
#Calinski-Harabasz index for k=4:
round(calinhara(data_z, hc_agg_cut_4), digits = 2)
## [1] 84.08
#3 clusters:
hc_agg_cut_3 <- cutree(hc_agg, k = 3)
#Average silhouette width for k=3:
sil_agg_3 <- silhouette(hc_agg_cut_3, dist(data_z))
fviz_silhouette(sil_agg_3)
## cluster size ave.sil.width
## 1 1 11 0.62
## 2 2 18 0.62
## 3 3 13 0.25
#Calinski-Harabasz index for k=3:
round(calinhara(data_z, hc_agg_cut_3), digits = 2)
## [1] 54.76
Both the average silhouette width and the Calinski-Harabasz index are higher for k=4. Therefore, k=4 is chosen as the final number of clusters for the agglomerative hierarchical algorithm.
#Visualization of the clusters:
plot_agg <- fviz_cluster(list(data = data_z, cluster = hc_agg_cut_4), labelsize = 0, main = "Hierarchical Clustering: Agglomerative Approach Results For k=4")
plot_agg
Time to assign the clusters to each country in the dataset.
europe <- europe %>%
mutate(cluster_agg = hc_agg_cut_4)
head(europe)
## # A tibble: 6 × 6
## Country hdi GDP_per_capita cluster_kmeans cluster_pam cluster_agg
## <chr> <dbl> <dbl> <int> <int> <int>
## 1 Albania 0.81 8575. 1 1 1
## 2 Andorra 0.913 46812. 3 2 2
## 3 Austria 0.93 56034. 2 3 2
## 4 Belarus 0.824 7897. 1 1 1
## 5 Belgium 0.951 54690. 2 3 3
## 6 Bosnia and Herzeg… 0.804 8663. 1 1 1
The divisive (top-down) approach works the other way around: all observations start in a single cluster, and at every step the most heterogeneous cluster (in DIANA, the cluster with the largest diameter) is split, until every observation forms its own cluster.
#Divisive approach:
set.seed(123)
hc_div <- diana(data_z)
#Divisive coefficient:
round(hc_div$dc, digits = 2)
## [1] 0.95
The divisive coefficient is 0.95, which, analogously to the agglomerative coefficient, indicates that the divisive clustering captures a strong, meaningful structure.
#Dendrogram of the divisive approach:
pltree(hc_div, cex = 0.6, hang = -1, main = "Hierarchical clustering (divisive approach)")
Now, it is time to cut the dendrogram into three and four clusters and compute cluster quality metrics to assess the most appropriate number of clusters.
#Cutting the tree into 4 and 3 clusters to see which number is more optimal:
hc_div_cut_4 <- cutree(hc_div, k=4)
#Average silhouette width for k=4:
sil_div_4 <- silhouette(hc_div_cut_4, dist(data_z))
fviz_silhouette(sil_div_4)
## cluster size ave.sil.width
## 1 1 11 0.61
## 2 2 17 0.62
## 3 3 13 0.39
## 4 4 1 0.00
#Calinski-Harabasz index for k=4:
round(calinhara(data_z, hc_div_cut_4), digits = 2)
## [1] 85.63
#Cutting the tree into 3 clusters:
hc_div_cut_3 <- cutree(hc_div, k=3)
#Average silhouette width for k=3:
sil_div_3 <- silhouette(hc_div_cut_3, dist(data_z))
fviz_silhouette(sil_div_3)
## cluster size ave.sil.width
## 1 1 28 0.43
## 2 2 13 0.57
## 3 3 1 0.00
#Calinski-Harabasz index for k=3:
round(calinhara(data_z, hc_div_cut_3), digits = 2)
## [1] 37.22
Again, k=4 seems to be the better option. Therefore, k=4 is the final choice.
#Visualization of the clusters:
plot_div <- fviz_cluster(list(data = data_z, cluster = hc_div_cut_4), labelsize = 0, main = "Hierarchical Clustering: Divisive Approach Results For k=4")
plot_div
Assigning the clusters to each country in the dataset:
europe <- europe %>%
mutate(cluster_div = hc_div_cut_4)
head(europe)
## # A tibble: 6 × 7
## Country hdi GDP_per_capita cluster_kmeans cluster_pam cluster_agg
## <chr> <dbl> <dbl> <int> <int> <int>
## 1 Albania 0.81 8575. 1 1 1
## 2 Andorra 0.913 46812. 3 2 2
## 3 Austria 0.93 56034. 2 3 2
## 4 Belarus 0.824 7897. 1 1 1
## 5 Belgium 0.951 54690. 2 3 3
## 6 Bosnia and Herzeg… 0.804 8663. 1 1 1
## # ℹ 1 more variable: cluster_div <int>
All four algorithms have been applied to cluster the data. Now, it is time to compare the obtained results to assess whether they are robust to changes in the method and, therefore, whether the clustering is actually meaningful.
#Comparison of cluster quality measures:
algorithm <- c("K-means", "PAM", "Agglomerative hierarchical", "Divisive hierarchical")
#Average silhouette widths taken from the fviz_silhouette() outputs above:
avg_sil_width <- c(0.55, 0.53, 0.53, 0.53)
cal_har_index <- c(round(calinhara(data_z, k_means_4$cluster), digits = 2),
round(calinhara(data_z, pam_4$clustering), digits = 2),
round(calinhara(data_z, hc_agg_cut_4), digits = 2),
round(calinhara(data_z, hc_div_cut_4), digits = 2))
quality_stat_table <- cbind(algorithm, avg_sil_width, cal_har_index)
quality_stat_table
## algorithm avg_sil_width cal_har_index
## [1,] "K-means" "0.55" "93.23"
## [2,] "PAM" "0.53" "85.63"
## [3,] "Agglomerative hierarchical" "0.53" "84.08"
## [4,] "Divisive hierarchical" "0.53" "85.63"
K-means achieved the highest average silhouette width and Calinski–Harabasz index, indicating the most compact and well-separated clusters. The remaining methods (PAM and both hierarchical approaches) produced similar results, with slightly lower cluster quality.
#Making the plot:
plot1 <- fviz_cluster(k_means_4, data = data_z, frame = FALSE, geom = "point", main = "K-means (k=4)")
plot2 <- fviz_cluster(pam_4, data = data_z, frame = FALSE, geom = "point", main = "PAM (k=4)")
plot3 <- fviz_cluster(list(data = data_z, cluster = hc_agg_cut_4), labelsize = 0, main = "Hierarchical Agglomerative (k=4)")
plot4 <- fviz_cluster(list(data = data_z, cluster = hc_div_cut_4), labelsize = 0, main = "Hierarchical Divisive (k=4)")
grid.arrange(plot1, plot2, plot3, plot4, nrow = 2)
It is clearly visible that the results are very similar and that the differences between them are small. This suggests that the results are robust to changes in the clustering algorithm, which makes them more reliable.
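To quantify this agreement rather than judge it only visually, the adjusted Rand index between partitions can be computed with cluster.stats() from the already loaded fpc package. This is only an illustrative sketch (the object name d_z is introduced here and is not part of the earlier code):
#Adjusted Rand index between K-means and each of the other partitions
#(1 = identical partitions; values near 0 = agreement no better than chance)
d_z <- dist(data_z)
round(c(
  kmeans_vs_pam = cluster.stats(d_z, k_means_4$cluster, pam_4$clustering)$corrected.rand,
  kmeans_vs_agg = cluster.stats(d_z, k_means_4$cluster, hc_agg_cut_4)$corrected.rand,
  kmeans_vs_div = cluster.stats(d_z, k_means_4$cluster, hc_div_cut_4)$corrected.rand
), 2)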
Now, let us take a more detailed look at the four plots. It can be observed that Cluster 1 consistently represents the least developed countries, characterized by the lowest GDP per capita and HDI values.
Cluster 3 in the K-means algorithm and Cluster 2 in the PAM, hierarchical agglomerative, and hierarchical divisive algorithms generally represent moderately developed countries, with moderate values of GDP per capita and HDI.
Cluster 2 in the K-means algorithm and Cluster 3 in the PAM, hierarchical agglomerative, and hierarchical divisive algorithms generally represent highly developed countries, characterized by high GDP per capita and HDI.
Cluster 4 represents one or two outliers. In this case, these countries are Liechtenstein and Luxembourg, which can be described as the most developed countries in the dataset.
Now, let us prepare a colored map of Europe for all four algorithms to visualize which countries belong to which clusters. Since the cluster numbers (and hence the default colors) differ across the algorithms, the same color will be assigned to the same type of cluster, as described above, for consistency. The "maps" library will be used for this visualization.
map_countries <- unique(maps::map("world", plot = FALSE)$names)
setdiff(europe$Country, map_countries)
## [1] "Czechia" "United Kingdom"
Since Czechia and the United Kingdom appear under different names in the map data (Czech Republic and UK), these names need to be modified in the "europe" dataset so that the two datasets can be merged correctly. Only the European part of Russia will be included on the map.
europe$Country[europe$Country == "Czechia"] <- "Czech Republic"
europe$Country[europe$Country == "United Kingdom"] <- "UK"
europe_map <- map_data("world", region = europe$Country)
europe_map <- europe_map %>%
left_join(europe, by = c("region" = "Country"))
#Including only European part of Russia
europe_map <- europe_map %>%
filter(!(region == "Russia" & long > 60))
As mentioned before, since the cluster numbers assigned to the same groups of countries can vary slightly across algorithms, the same color is assigned to the same type of cluster for consistency.
#For k-means:
my_cols_kmeans <- c(
"1" = "#FF0000",
"2" = "#FFA500",
"3" = "#FFFF00",
"4" = "#FF69B4"
)
MAP_KMEANS <- ggplot(europe_map, aes(long, lat, group = group)) +
geom_polygon(aes(fill = factor(cluster_kmeans)), colour = "black") +
scale_fill_manual(values = my_cols_kmeans) +
coord_cartesian(xlim = c(-25, 45), ylim = c(34, 72)) +
theme_minimal()+
labs(fill = "Cluster", title = "K-means")
my_cols_rest <- c(
"1" = "#FF0000",
"2" = "#FFFF00",
"3" = "#FFA500",
"4" = "#FF69B4"
)
#For PAM:
MAP_PAM <- ggplot(europe_map, aes(long, lat, group = group)) +
geom_polygon(aes(fill = factor(cluster_pam)), colour = "black") +
scale_fill_manual(values = my_cols_rest) +
coord_cartesian(xlim = c(-25, 45), ylim = c(34, 72)) +
theme_minimal()+
labs(fill = "Cluster", title = "PAM")
#For hierarchical agglomerative:
MAP_AGG <- ggplot(europe_map, aes(long, lat, group = group)) +
geom_polygon(aes(fill = factor(cluster_agg)), colour = "black") +
scale_fill_manual(values = my_cols_rest) +
coord_cartesian(xlim = c(-25, 45), ylim = c(34, 72)) +
theme_minimal()+
labs(fill = "Cluster", title = " Hierarchical Agglomerative")
#For hierarchical divisive:
MAP_DIV <- ggplot(europe_map, aes(long, lat, group = group)) +
geom_polygon(aes(fill = factor(cluster_div)), colour = "black") +
scale_fill_manual(values = my_cols_rest) +
coord_cartesian(xlim = c(-25, 45), ylim = c(34, 72)) +
theme_minimal()+
labs(fill = "Cluster", title = " Hierarchical Divisive")
grid.arrange(MAP_KMEANS, MAP_PAM, MAP_AGG, MAP_DIV, nrow = 2)
Looking at the maps, it becomes even more apparent that the differences between the algorithms are very small.
What is particularly interesting is the visible geographical pattern of the clusters. The red clusters (least developed) include countries that were part of the former Soviet Union as well as most of the Balkans. The yellow clusters (moderately developed) include the countries of Central Europe and some Western European countries such as France, Portugal, and Spain. The orange clusters (highly developed) consist mainly of Western and Northern European countries. Finally, the pink clusters (most developed) contain the outliers, Liechtenstein and Luxembourg.
All algorithms produced exactly the same red cluster. The countries classified as least developed are:
For the yellow (moderately developed) cluster, the only difference across the algorithms is that the agglomerative hierarchical algorithm assigns Austria to it, while the other algorithms do not. Since the agglomerative hierarchical algorithm has the lowest Calinski-Harabasz index of all the applied algorithms, it is reasonable to conclude that Austria should rather not be classified as belonging to this group.
In general, the countries that can be described as moderately developed are:
There are some very small differences between the results. Austria is classified as a highly developed country by almost all algorithms, except for the agglomerative hierarchical algorithm, which assigns it to the moderately developed group. Similarly, Luxembourg is classified as highly developed by most algorithms, except for K-means, which assigns it to the most developed cluster.
In general, the countries that can be described as highly developed are:
K-means is the only algorithm that assigned both Liechtenstein and Luxembourg to this cluster. The other algorithms created this separate cluster only for Liechtenstein.
This study aimed to identify patterns of similarity among European countries based on their level of development using two indicators: GDP per capita and the Human Development Index (HDI). Four clustering approaches were applied and compared: K-means, Partitioning Around Medoids, Hierarchical Agglomerative Clustering and Hierarchical Divisive Clustering. All algorithms produced similar and consistent results, confirming the robustness of the clustering structure, with k-means achieving slightly higher cluster quality measures than the other methods.
The analysis shows that European countries can be divided into four development groups.
The cluster of least developed countries largely consists of post-socialist countries, former Soviet states, and Balkan countries, which helps explain their similar economic and social development paths.
The moderately developed cluster mainly consists of Central and Southern European countries, as well as some Western European countries, characterized by medium levels of GDP per capita and HDI. The highly developed cluster consists primarily of Western and Northern European countries, known for strong economies, high institutional quality, and advanced welfare systems.
Liechtenstein consistently forms its own cluster due to its exceptionally high GDP per capita, making it a clear outlier among European countries. Luxembourg sometimes joins this cluster depending on the algorithm, but Liechtenstein remains the most extreme case.
These results show clear economic differences between European countries. The geographical distribution of the clusters confirms that some parts of Europe are generally less developed than others.