1 Greetings

Welcome to my Rmd. I created this Rmd to improve my understanding of Unsupervised Machine Learning.

2 Brief explanation about the data

This dataset contains information about the socio-economic and health factors that determine the overall development of a country.

Columns Insight :
1. country : Name of the country, there are 167 countries listed
2. child_mort : Death of children under 5 years of age per 1000 live births
3. exports : Exports of goods and services per capita. Given as %age of the GDP per capita
4. health : Total health spending per capita. Given as %age of GDP per capita
5. imports : Imports of goods and services per capita. Given as %age of the GDP per capita
6. income : Net income per person
7. inflation : The measurement of the annual growth rate of the Total GDP
8. life_expec : The average number of years a new born child would live if the current mortality patterns are to remain the same
9. total_fer : The number of children that would be born to each woman if the current age-fertility rates remain the same
10. gdpp : The GDP per capita. Calculated as the Total GDP divided by the total population.

You may download the data set from kaggle: https://www.kaggle.com/rohan0301/unsupervised-learning-on-country-data

3 Business Questions

HELP International has been able to raise around $ 10 million. Now the CEO of the NGO needs to decide how to use this money strategically and effectively, which means choosing the countries that are in the direst need of aid. Hence, your job as a data scientist is to categorise the countries using the socio-economic and health factors that determine the overall development of a country, and then to suggest the countries that the CEO needs to focus on the most.

4 Data Preparation

1. Import the necessary libraries

library(gridExtra)
library(factoextra)
library(FactoMineR)
library(arsenal)
library(tidyverse)

2. Read the dataset

country <- read.csv("Country-data.csv")
head(country)

5 Exploratory Data Analysis

1. Check missing value

colSums(is.na(country))

##    country child_mort    exports     health    imports     income  inflation 
##          0          0          0          0          0          0          0 
## life_expec  total_fer       gdpp 
##          0          0          0

The inspection with colSums(is.na()) shows that there are no missing values in any column.

2. Check data type

glimpse(country)

## Rows: 167
## Columns: 10
## $ country    <chr> "Afghanistan", "Albania", "Algeria", "Angola", "Antigua and~
## $ child_mort <dbl> 90.2, 16.6, 27.3, 119.0, 10.3, 14.5, 18.1, 4.8, 4.3, 39.2, ~
## $ exports    <dbl> 10.0, 28.0, 38.4, 62.3, 45.5, 18.9, 20.8, 19.8, 51.3, 54.3,~
## $ health     <dbl> 7.58, 6.55, 4.17, 2.85, 6.03, 8.10, 4.40, 8.73, 11.00, 5.88~
## $ imports    <dbl> 44.9, 48.6, 31.4, 42.9, 58.9, 16.0, 45.3, 20.9, 47.8, 20.7,~
## $ income     <int> 1610, 9930, 12900, 5900, 19100, 18700, 6700, 41400, 43200, ~
## $ inflation  <dbl> 9.440, 4.490, 16.100, 22.400, 1.440, 20.900, 7.770, 1.160, ~
## $ life_expec <dbl> 56.2, 76.3, 76.5, 60.1, 76.8, 75.8, 73.3, 82.0, 80.5, 69.1,~
## $ total_fer  <dbl> 5.82, 1.65, 2.89, 6.16, 2.13, 2.37, 1.69, 1.93, 1.44, 1.92,~
## $ gdpp       <int> 553, 4090, 4460, 3530, 12200, 10300, 3220, 51900, 46900, 58~

The data type of each column is already suitable, but the country column can also be converted to a factor if desired. In this case, let’s change the data type of the country column.

country <- country %>% 
  mutate(country = as.factor(country))

glimpse(country$country)

##  Factor w/ 167 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...

3. Check the data distribution

summary(country)

##                 country      child_mort        exports            health      
##  Afghanistan        :  1   Min.   :  2.60   Min.   :  0.109   Min.   : 1.810  
##  Albania            :  1   1st Qu.:  8.25   1st Qu.: 23.800   1st Qu.: 4.920  
##  Algeria            :  1   Median : 19.30   Median : 35.000   Median : 6.320  
##  Angola             :  1   Mean   : 38.27   Mean   : 41.109   Mean   : 6.816  
##  Antigua and Barbuda:  1   3rd Qu.: 62.10   3rd Qu.: 51.350   3rd Qu.: 8.600  
##  Argentina          :  1   Max.   :208.00   Max.   :200.000   Max.   :17.900  
##  (Other)            :161                                                      
##     imports             income         inflation         life_expec   
##  Min.   :  0.0659   Min.   :   609   Min.   : -4.210   Min.   :32.10  
##  1st Qu.: 30.2000   1st Qu.:  3355   1st Qu.:  1.810   1st Qu.:65.30  
##  Median : 43.3000   Median :  9960   Median :  5.390   Median :73.10  
##  Mean   : 46.8902   Mean   : 17145   Mean   :  7.782   Mean   :70.56  
##  3rd Qu.: 58.7500   3rd Qu.: 22800   3rd Qu.: 10.750   3rd Qu.:76.80  
##  Max.   :174.0000   Max.   :125000   Max.   :104.000   Max.   :82.80  
##                                                                       
##    total_fer          gdpp       
##  Min.   :1.150   Min.   :   231  
##  1st Qu.:1.795   1st Qu.:  1330  
##  Median :2.410   Median :  4660  
##  Mean   :2.948   Mean   : 12964  
##  3rd Qu.:3.880   3rd Qu.: 14050  
##  Max.   :7.490   Max.   :105000  
##

From the summary above, it is sad to see that there is a huge gap in welfare between countries.

Furthermore, the summary shows that the columns have very different ranges of values, so the data needs to be scaled to bring every column onto the same standardized scale. Standardization is important because a column measured on a larger scale has a larger variance, so it would dominate both the variance-based PCA and the distance-based K-Means and bias the result.

4. Data Scaling

Since the first column (country) is not numeric, it must be left out when the data is passed to the scale() function. If a non-numeric column is passed to scale(), an error will occur.

country_scale <- country %>%
  # Standardize every column except country; c() drops the matrix attributes returned by scale()
  mutate(across(-country, ~ c(scale(.x))))

summary(country_scale)

##                 country      child_mort         exports       
##  Afghanistan        :  1   Min.   :-0.8845   Min.   :-1.4957  
##  Albania            :  1   1st Qu.:-0.7444   1st Qu.:-0.6314  
##  Algeria            :  1   Median :-0.4704   Median :-0.2229  
##  Angola             :  1   Mean   : 0.0000   Mean   : 0.0000  
##  Antigua and Barbuda:  1   3rd Qu.: 0.5909   3rd Qu.: 0.3736  
##  Argentina          :  1   Max.   : 4.2086   Max.   : 5.7964  
##  (Other)            :161                                      
##      health           imports            income          inflation      
##  Min.   :-1.8223   Min.   :-1.9341   Min.   :-0.8577   Min.   :-1.1344  
##  1st Qu.:-0.6901   1st Qu.:-0.6894   1st Qu.:-0.7153   1st Qu.:-0.5649  
##  Median :-0.1805   Median :-0.1483   Median :-0.3727   Median :-0.2263  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.6496   3rd Qu.: 0.4899   3rd Qu.: 0.2934   3rd Qu.: 0.2808  
##  Max.   : 4.0353   Max.   : 5.2504   Max.   : 5.5947   Max.   : 9.1023  
##                                                                         
##    life_expec        total_fer            gdpp         
##  Min.   :-4.3242   Min.   :-1.1877   Min.   :-0.69471  
##  1st Qu.:-0.5910   1st Qu.:-0.7616   1st Qu.:-0.63475  
##  Median : 0.2861   Median :-0.3554   Median :-0.45307  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.00000  
##  3rd Qu.: 0.7021   3rd Qu.: 0.6157   3rd Qu.: 0.05924  
##  Max.   : 1.3768   Max.   : 3.0003   Max.   : 5.02140  
##

Comparing the summaries of country and country_scale, there is no longer a huge gap between the ranges of the columns in country_scale. Every column now has a mean of 0 and a standard deviation of 1, although the minimum and maximum still differ from column to column. Hopefully this will give a better result during modeling.
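As a small illustration of what standardization does, the sketch below applies scale() to the first five income and life_expec values shown in the glimpse() output. The object names toy, toy_scaled and income_z are made up for this example and are not part of the analysis.

```r
# First five values of two columns, copied from the glimpse() output above
toy <- data.frame(
  income     = c(1610, 9930, 12900, 5900, 19100),
  life_expec = c(56.2, 76.3, 76.5, 60.1, 76.8)
)

# scale() centers each column to mean 0 and rescales it to standard deviation 1
toy_scaled <- as.data.frame(scale(toy))

# Manual z-score for one column, to show what scale() computes
income_z <- (toy$income - mean(toy$income)) / sd(toy$income)

round(colMeans(toy_scaled), 10)   # column means after scaling
apply(toy_scaled, 2, sd)          # column standard deviations after scaling
all.equal(toy_scaled$income, income_z)
```

After scaling, a difference of one unit means "one standard deviation" in every column, which is what makes the columns comparable for PCA and K-Means.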

6 PCA (Principal Component Analysis)

PCA is a dimensionality-reduction method that is often used to reduce the dimensionality of large data sets, by transforming a large set of variables into a smaller one that still contains most of the information in the large set.

Reducing the number of variables of a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity. Smaller data sets are easier to explore and visualize, and machine learning algorithms can analyze them faster when there are no extraneous variables to process.

6.1 PCA Calculation

PCA can be performed in R using the function prcomp().

Unfortunately, prcomp() cannot be applied inside mutate_at(), so the numeric columns have to be separated from the country column before running PCA. The country names are kept as row names.

rownames(country_scale) <- country_scale[,"country"]

country_scale <- country_scale %>%
   select(-country)

head(country_scale)

country_pca <- prcomp(country_scale)
country_pca

## Standard deviations (1, .., p=9):
## [1] 2.0336314 1.2435217 1.0818425 0.9973889 0.8127847 0.4728437 0.3368067
## [8] 0.2971790 0.2586020
## 
## Rotation (n x k) = (9 x 9):
##                   PC1          PC2         PC3          PC4         PC5
## child_mort -0.4195194 -0.192883937  0.02954353 -0.370653262  0.16896968
## exports     0.2838970 -0.613163494 -0.14476069 -0.003091019 -0.05761584
## health      0.1508378  0.243086779  0.59663237 -0.461897497 -0.51800037
## imports     0.1614824 -0.671820644  0.29992674  0.071907461 -0.25537642
## income      0.3984411 -0.022535530 -0.30154750 -0.392159039  0.24714960
## inflation  -0.1931729  0.008404473 -0.64251951 -0.150441762 -0.71486910
## life_expec  0.4258394  0.222706743 -0.11391854  0.203797235 -0.10821980
## total_fer  -0.4037290 -0.155233106 -0.01954925 -0.378303645  0.13526221
## gdpp        0.3926448  0.046022396 -0.12297749 -0.531994575  0.18016662
##                     PC6         PC7         PC8         PC9
## child_mort -0.200628153  0.07948854  0.68274306  0.32754180
## exports     0.059332832  0.70730269  0.01419742 -0.12308207
## health     -0.007276456  0.24983051 -0.07249683  0.11308797
## imports     0.030031537 -0.59218953  0.02894642  0.09903717
## income     -0.160346990 -0.09556237 -0.35262369  0.61298247
## inflation  -0.066285372 -0.10463252  0.01153775 -0.02523614
## life_expec  0.601126516 -0.01848639  0.50466425  0.29403981
## total_fer   0.750688748 -0.02882643 -0.29335267 -0.02633585
## gdpp       -0.016778761 -0.24299776  0.24969636 -0.62564572

How to interpret the PCA result above:
- The first principal component has positive associations with exports, health, imports, income, life_expec and gdpp, and negative associations with child_mort, inflation and total_fer. The first component can therefore be viewed as a measure of how stable a country is, since a stable country generally scores high on exports, imports, income, health, life_expec and gdpp and low on child_mort, inflation and total_fer.
- Another way to interpret PCA is to examine the values of the coefficients. The larger the absolute value of a coefficient, the more important the corresponding variable is in calculating that component. For example, the largest absolute coefficient for exports is found in PC7, so most of the information about the exports column is carried by PC7.
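To make the coefficient reading above concrete, the sketch below squares the PC1 coefficients printed in the rotation matrix. Because the squared coefficients of one component sum to 1, each can be read as a share of that component. The names pc1_loadings and contribution are illustrative only.

```r
# PC1 coefficients copied from the prcomp() output above
pc1_loadings <- c(
  child_mort = -0.4195194, exports   =  0.2838970, health     = 0.1508378,
  imports    =  0.1614824, income    =  0.3984411, inflation  = -0.1931729,
  life_expec =  0.4258394, total_fer = -0.4037290, gdpp       = 0.3926448
)

# Share (in percent) of each variable in PC1, ignoring the sign
contribution <- pc1_loadings^2 / sum(pc1_loadings^2) * 100

# Variables ordered from most to least important for PC1
sort(round(contribution, 1), decreasing = TRUE)
```

The sign of a coefficient only gives the direction of the association, so the ranking uses squared values.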

Another way to interpret the PCA result is to use the function biplot(), which helps to observe:
- The overall data distribution on 2 PCs. The goal is to find similar observations and outliers.
- The correlation between variables and their contribution to each PC.

biplot(country_pca, cex = 0.5)

From the plot above, Malta, Singapore and Luxembourg might look like outliers, but in this case no country is treated as an outlier. Another insight from the plot is that gdpp and income are strongly correlated, as are total_fer and child_mort.

6.2 Determine the number of PCs

As stated above, the main purpose of PCA is dimensionality reduction. To determine the minimum number of principal components that account for most of the variation in the data, use the function summary().

The function summary() provides three pieces of information:
- Standard deviation: the standard deviation (square root of the variance) captured by each PC.
- Proportion of Variance: the share of information captured by each PC.
- Cumulative Proportion: the cumulative share of information captured from PC1 up to each PC.

summary(country_pca)

## Importance of components:
##                           PC1    PC2    PC3    PC4    PC5     PC6    PC7
## Standard deviation     2.0336 1.2435 1.0818 0.9974 0.8128 0.47284 0.3368
## Proportion of Variance 0.4595 0.1718 0.1300 0.1105 0.0734 0.02484 0.0126
## Cumulative Proportion  0.4595 0.6313 0.7614 0.8719 0.9453 0.97015 0.9828
##                            PC8     PC9
## Standard deviation     0.29718 0.25860
## Proportion of Variance 0.00981 0.00743
## Cumulative Proportion  0.99257 1.00000

The number of PCs is chosen according to how much information is required. Let’s say this project requires at least 80% of the information: PC1 to PC3 capture only 76.14%, while PC1 to PC4 capture 87.19%, so PC1 to PC4 are used.
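The same choice can be reproduced by hand from the standard deviations printed by prcomp(). The sketch below is only an illustration: the names sdev, threshold and n_pcs are made up here, and the 80% threshold is the one assumed in the text.

```r
# Standard deviations of PC1..PC9, copied from the prcomp() output above
sdev <- c(2.0336314, 1.2435217, 1.0818425, 0.9973889, 0.8127847,
          0.4728437, 0.3368067, 0.2971790, 0.2586020)

# Variance of a PC is its squared standard deviation
prop_var <- sdev^2 / sum(sdev^2)   # "Proportion of Variance"
cum_var  <- cumsum(prop_var)       # "Cumulative Proportion"

# Smallest number of PCs whose cumulative proportion reaches the threshold
threshold <- 0.80
n_pcs <- which(cum_var >= threshold)[1]

round(cum_var, 4)
n_pcs
```

Changing threshold is enough to see how many PCs a stricter or looser information requirement would keep.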

country_selected_pca <- data.frame(country_pca$x[,1:4])
head(country_selected_pca)

6.3 Combine to initial data

Once the PCs that summarize the required information are selected, they can be combined with the initial data and used for further analysis.

country_pca <- country %>% 
  select_if(purrr::negate(is.numeric)) %>% 
  cbind(country_selected_pca)

glimpse(country_pca)

## Rows: 167
## Columns: 5
## $ country <fct> "Afghanistan", "Albania", "Algeria", "Angola", "Antigua and Ba~
## $ PC1     <dbl> -2.90428986, 0.42862224, -0.28436983, -2.92362976, 1.03047668,~
## $ PC2     <dbl> -0.09533386, 0.58639208, 0.45380957, -1.69047094, -0.13624894,~
## $ PC3     <dbl> 0.7159652, 0.3324855, -1.2178421, -1.5204709, 0.2250441, -0.86~
## $ PC4     <dbl> -1.00224038, 1.15757715, 0.86551146, -0.83710739, 0.84452276, ~

country_final <- country_pca %>% 
  select(-country)

head(country_final)

7 K-Means Clustering

7.1 Clustering With PCA

1. Finding The Best K-Value For Clustering With PCA Value

To determine which countries are in the direst need of aid, a K-Means model might help. K-Means is a machine learning model that groups data based on similar characteristics.

- Elbow Method

One of the best-known methods is the Elbow Method, but how does it reveal the optimum K-Value? The Elbow Method finds the optimum K-Value, or number of groups, from the plot produced by the function fviz_nbclust(), which shows the WSS (Within Sum of Squares) for each K.

WSS can be interpreted as a measure of the variation that exists within each group. A higher WSS indicates a large degree of variability inside the clusters, while a lower WSS indicates that the points do not vary considerably from their cluster mean. It follows that the optimum K-Value is the point after which increasing K no longer results in a considerable decrease of the total within sum of squares.

fviz_nbclust(country_final, kmeans, method = "wss") + 
  labs(subtitle = "Elbow Method With PCA Value")

The plot shows that 3 is the optimum number of K, since after k=3 increasing the number of K does not result in a considerable decrease of the total within sum of squares.

Another way to describe this is to choose the number of clusters at the “bend of an elbow” in the plot. This reading is subjective, however, because where the curve looks like an elbow depends on the opinion of each person, and opinions may differ.
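The quantity behind the elbow plot can also be computed directly with kmeans(). The sketch below uses synthetic data with three well-separated groups rather than the country data, so the names toy, wss and drops and all the numbers it produces are illustrative only.

```r
set.seed(100)

# Three synthetic, well-separated groups in two dimensions (not the country data)
centers <- rbind(c(0, 0), c(10, 0), c(0, 10))
toy <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(30, centers[i, 1]), rnorm(30, centers[i, 2]))
}))

# Total within-cluster sum of squares for k = 1..6
wss <- sapply(1:6, function(k) {
  kmeans(toy, centers = k, nstart = 25)$tot.withinss
})

# Decrease in WSS gained by adding one more cluster
drops <- -diff(wss)

round(wss, 1)
round(drops, 1)
```

The elbow is the K after which the values in drops become small compared with the earlier ones. Looking at the drops as numbers can make the choice a little less dependent on how the curve is drawn.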

- Silhouette Method

The second method is the Silhouette Method, which uses the same function as the Elbow Method. Here fviz_nbclust() visualizes the average silhouette width, a measure of how close each point is to the other points in its own cluster compared with the points in the neighboring cluster, and thus provides a way to assess the number of clusters visually.

The K with the highest average silhouette width, the peak of the plot, is the optimal K-Value, because at that K the clusters are the most compact and the best separated from each other.

fviz_nbclust(country_final, kmeans, method = "silhouette") + 
  labs(subtitle = "Silhouette Method With PCA Value")

The plot shows that 4 is the optimum number of K, since after k=4 the average silhouette width decreases.
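The average silhouette width can also be computed without the plot, using silhouette() from the cluster package. The sketch below is a minimal illustration on synthetic data with three well-separated groups, so it will not reproduce the k=4 found for the country data. The names toy, avg_sil and best_k are made up for the example.

```r
library(cluster)  # recommended package that ships with standard R installations

set.seed(100)

# Three synthetic, well-separated groups in two dimensions (not the country data)
centers <- rbind(c(0, 0), c(10, 0), c(0, 10))
toy <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(30, centers[i, 1]), rnorm(30, centers[i, 2]))
}))

# Pairwise distances are needed once and reused for every k
d <- dist(toy)

# Average silhouette width for k = 2..6 (it is not defined for k = 1)
avg_sil <- sapply(2:6, function(k) {
  cl <- kmeans(toy, centers = k, nstart = 25)$cluster
  mean(silhouette(cl, d)[, "sil_width"])
})
names(avg_sil) <- 2:6

# The k with the highest average silhouette width
best_k <- as.integer(names(which.max(avg_sil)))

round(avg_sil, 3)
best_k
```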

- Gap Statistic

The last method is the Gap Statistic, which can also be visualized with fviz_nbclust(). The gap statistic compares the within-cluster variation observed for each K with the variation expected under a reference distribution that has no obvious clustering. A larger gap means that the clustering structure is further from random, and a common rule is to choose the smallest K whose gap is at least the gap at K + 1 minus one standard error.

Based on the gap statistic plot below, the optimal k is 3.

fviz_nbclust(country_final, kmeans, method = "gap_stat") + 
  labs(subtitle = "Gap Statistic method With PCA Value")
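For readers who want the numbers behind the gap statistic plot, the sketch below uses clusGap() and maxSE() from the cluster package on synthetic data. It is only an illustration: the data, the number of bootstrap samples B = 50 and the "firstSEmax" selection rule are assumptions made for this example, not settings taken from the analysis above.

```r
library(cluster)  # recommended package that ships with standard R installations

set.seed(100)

# Three synthetic, well-separated groups in two dimensions (not the country data)
centers <- rbind(c(0, 0), c(10, 0), c(0, 10))
toy <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(30, centers[i, 1]), rnorm(30, centers[i, 2]))
}))

# Gap statistic for k = 1..6; nstart is passed on to kmeans()
gap <- clusGap(toy, FUNcluster = kmeans, K.max = 6, B = 50,
               nstart = 25, verbose = FALSE)

# Columns of gap$Tab: logW, E.logW, gap and SE.sim
round(gap$Tab, 3)

# Pick k from the gap values and their simulation standard errors
k_gap <- maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"], method = "firstSEmax")
k_gap
```

Because the reference data sets are simulated, the result can change slightly between runs unless a seed is set.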

Two of the three methods above suggest that k = 3 is the optimum number of clusters, so the data will be divided into 3 clusters.

Disclaimer: the number of clusters does not have to be chosen with the 3 methods above. It can also be determined by the business question or by mutual agreement.

2. Clustering With PCA Value

set.seed(100)

km_pca <- kmeans(country_final, centers = 3)

country_final$cluster <- km_pca$cluster

unique(country_final$cluster)

## [1] 3 1 2

# Drop the cluster label so that only PC1-PC4 are used to draw the plot
cluster_pca <- fviz_cluster(km_pca, data = select(country_final, -cluster)) +
   labs(subtitle = "K-Means With PCA & K-Value = 3")

cluster_pca

7.2 Country Cluster Profiling

The purpose of profiling is to understand the characteristics of each cluster, in this case to understand which cluster of countries is in the direst need of aid.

To find the characteristics of each cluster, the values in each column can be averaged per cluster.

country_final %>%
  group_by(cluster) %>% 
  summarise_all(mean)

Instead of profiling with the PCA values, it is better to assign the clusters to the scaled original variables so that the profile is easier to interpret.

#Assign into new object
country_profile <- country_scale

#Assign cluster result into the new object
country_profile$cluster <- km_pca$cluster

country_profile %>%
  group_by(cluster) %>% 
  summarise_all(mean)
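The same per-cluster averaging can be sketched in base R with aggregate(). The example below uses made-up standardized values for two indicators, so the data and the names toy, km and profile are purely illustrative.

```r
set.seed(100)

# Made-up standardized indicators for 20 observations forming two groups
toy <- data.frame(
  child_mort = c(rnorm(10, mean =  2), rnorm(10, mean = -2)),
  income     = c(rnorm(10, mean = -2), rnorm(10, mean =  2))
)

# Cluster the observations, then attach the cluster label
km <- kmeans(toy, centers = 2, nstart = 25)
toy$cluster <- km$cluster

# Cluster profile: mean of every column within each cluster
profile <- aggregate(. ~ cluster, data = toy, FUN = mean)
profile
```

Each row of profile is the average of one cluster. When the profile is built on the same columns that were clustered, it matches the cluster centers stored in km$centers.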

Profiling:

Cluster 1:
- From an economic point of view, the population of cluster 1 countries does not have a good economic level, as can be seen from the negative average values in the income and gdpp columns and the positive average value in the inflation column. Furthermore, cluster 1 countries are in an unfavorable position on the industrial side, since the exports and imports columns have negative average values.
- From a health point of view, the picture is very sad, because cluster 1 countries have a high positive average value in the child_mort column and negative averages in the health and life_expec columns.

Cluster 2:
- From an economic point of view, the population of cluster 2 countries has a good economic level, as can be seen from the positive average values in the income and gdpp columns and the negative average value in the inflation column. In addition, cluster 2 can be described as developed countries, since the exports and imports columns have high positive average values.
- From a health point of view, cluster 2 countries have a strongly negative average in the child_mort column (low child mortality), although the average in the health column is also negative.

Cluster 3:
- From an economic point of view, most of the population of cluster 3 countries does not have a good economic level, as can be seen from the negative average values in the income and gdpp columns. Furthermore, cluster 3 countries are in an unfavorable position on the industrial side, since the exports column has a negative average value, but fortunately the average for imports is still positive.
- From a health point of view, cluster 3 countries have the lowest average value in the health column, but fortunately the averages of the child_mort and life_expec columns show a good result.

8 Country Donation Selection

8.1 Based On Cluster 1 Overall Profile

Based on cluster profiling, the countries in cluster 1 need aid the most compared to the countries in clusters 2 and 3. To determine which countries in cluster 1 should receive donations first, find the countries whose values fall on the filtered side of the cluster 1 profiling averages: higher for child_mort and total_fer and lower for the other columns, as coded in the filter below.

country_profile %>% 
  filter(child_mort > 1.32,
         exports < -0.42,
         health < -0.13,
         imports < -0.15,
         income < -0.68,
         inflation < 0.39,
         life_expec < -1.27,
         total_fer > 1.35,
         gdpp < -0.60)

From the results above, there are 2 countries that pass every threshold taken from the cluster 1 profiling averages.
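The thresholds in the filter above are typed in by hand from the profiling table. As a hedged alternative, the sketch below derives them from the data with colMeans(). The country labels and values are hypothetical and only two indicators are used, so this shows the idea rather than the actual selection.

```r
# Hypothetical standardized values for six countries of the neediest cluster
cluster1 <- data.frame(
  child_mort = c(2.1, 0.9, 1.8, 0.7, 1.5, 1.0),
  income     = c(-0.9, -0.5, -0.8, -0.6, -0.75, -0.55),
  row.names  = paste("Country", LETTERS[1:6])
)

# Cluster averages, used as thresholds instead of hard-coded numbers
avg <- colMeans(cluster1)

# Worse than the cluster average: higher child mortality AND lower income
worse <- cluster1$child_mort > avg["child_mort"] &
  cluster1$income < avg["income"]

rownames(cluster1)[worse]
```

Deriving the thresholds this way keeps the filter in sync with the profile if the clustering is rerun with a different seed or a different K.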

8.2 Based On Economic & Health Parameter Urgency

The distribution of donation funds can also be determined by looking separately at the economic and health segments and finding which countries need aid the most in each. Let’s split the columns into those suitable as assessment parameters for each segment.

- Economic Sector

The columns income, exports, imports and gdpp must all be lower than the cluster 1 profiling averages.

#Parameter filter
economic <- country_profile %>% 
  filter(exports < -0.42,
         imports < -0.15,
         income < -0.68,
         gdpp < -0.60) %>% 
  select(income,exports,imports, gdpp)

economic

From the results above alone, it is quite difficult to sort out which country needs the most aid in the economic sector, even though all 18 countries have values for income, exports, imports and gdpp below the cluster 1 profiling averages. It would be wise to give aid first to the countries with the lowest values.

To help figure out which country should receive aid first, let’s visualize the result.

#Change negative value into positive, for the sake of visualization
economic <- abs(economic)

#Change country section from index into columns, for the sake of visualization
economic <- tibble::rownames_to_column(economic, "country")

head(economic)

Use the function pivot_longer() to stack income, exports, imports and gdpp into a single column, so that the visualization is easier to interpret.

eco_piv_long <- pivot_longer(data = economic, 
                   cols = c("income", "exports", "imports", "gdpp"))
head(eco_piv_long,8)

ggplot(data = eco_piv_long, aes(x = value, y = reorder(country, value))) +
  geom_col(aes(fill = name),position = "dodge") +
  scale_x_continuous(labels = scales::comma,
                     expand = c(0,0),
                     breaks = seq(0, 2.5, 0.25)) +
  labs(title = "Most Needed Aid Country In Economic Segment",
       subtitle = "Comparison between Income, Exports, Imports & GDP",
       x = "Value",
       y = "Country",
       color = "") +
  theme_bw()  +
  theme(legend.position = "bottom",
        legend.title = element_blank())

- Health Sector

The columns health and life_expec must be lower than the cluster 1 profiling averages, and child_mort must be higher.

#Parameter filter
health <- country_profile %>% 
  filter(child_mort > 1.32,
         health < -0.13,
         life_expec < -1.27) %>% 
  select(child_mort, health, life_expec)

health

Let’s visualize the comparison between child_mort, health and life_expec to find out which countries are in the direst need of aid in the health sector.

#Change negative value into positive, for the sake of visualization
health <- abs(health)

#Change country section from index into columns, for the sake of visualization
health <- tibble::rownames_to_column(health, "country")

health

Use the function pivot_longer() to stack child_mort, health and life_expec into a single column, so that the visualization is easier to interpret.

health_piv_long <- pivot_longer(data = health, 
                                cols = c("child_mort", "health", "life_expec"))
head(health_piv_long,9)

ggplot(data = health_piv_long, aes(x = value, y = reorder(country, value))) +
  geom_col(aes(fill = name),position = "dodge") +
  scale_x_continuous(labels = scales::comma,
                     expand = c(0,0),
                     breaks = seq(0, 4, 0.25)) +
  labs(title = "Most Needed Aid Country In Health Segment",
       subtitle = "Comparison between Child Mortality, Health & Life Expectancy",
       x = "Value",
       y = "Country",
       color = "") +
  theme_bw()  +
  theme(legend.position = "bottom",
        legend.title = element_blank())

9 Insight

The K-Value, or the number of groups desired as the final result, must be determined in advance, since K-Means requires that information to group the data based on similar characteristics. There are several ways to find the optimum K-Value, such as the Elbow Method, the Silhouette Method and the Gap Statistic Method. It is important to try several methods, since in the case above the K-Value suggested by one method differed from the others. Despite the difference, the final suggestion is K-Value = 3, since two out of three methods produced the same result.

Once the ideal number of groups is known, the characteristics of the countries in each cluster can be determined by averaging the values in each column, a method called Cluster Profiling. According to the cluster profiling, the countries in cluster 1 are the ones most in need of aid compared to the countries in cluster 2 and cluster 3.

The cluster 1 profile describes countries that really need aid in the economic and health sectors.
- From the economic sector, the averages of the exports, imports and gdpp columns are negative, which can be interpreted as economic growth in cluster 1 countries being poor or even stagnant.
- From the health sector, the results are very sad, because the mortality rate of children under 5 is very high and the low life expectancy indicates that a large part of the population does not live a long life. All of this can happen because of poor health figures.

There are 2 ways to select the countries in cluster 1 to be assisted:
1. Allocate all funds to the countries that fall below the cluster 1 profiling averages on every parameter.
From the filter results above, there are 2 countries that fall below the cluster 1 profiling averages on every parameter, namely Cameroon & Central African Republic.

2. Select the countries based on economic and health urgency parameters.
If the first method is used, only 2 countries can be assisted, while there may still be many countries whose economic indicators are better than the cluster 1 average but whose health indicators are worse, or vice versa. From the computation above, there are a total of 18 countries that need assistance in the economic segment and 7 countries that need assistance in the health segment. Aid can be given first to the country at the top of each plot: for the economic segment it can start with Eritrea, then the Central African Republic, then Sudan and so on, and for the health segment it can start with the Central African Republic, then Chad, then Niger and so on.

Conclusion: of the two methods above, the second is the better one to apply, because more countries will receive aid and the aid can be targeted more precisely according to which segment needs it and how urgently the country needs assistance. The countries in the first to third positions of each segment will be the first to receive aid.

The list of countries that will receive aid first in each segment: