Math 56 Questions

  1. In your own words, define multivariate analysis.

Multivariate analysis is a statistical technique used to examine multiple variables simultaneously to understand relationships, patterns, and trends within complex datasets.

  2. Discuss the occurrence of multivariate data.

Multivariate data occurs when multiple variables are measured or observed simultaneously, often arising in complex real-world scenarios such as economics, biology, and social sciences.

  3. Why is knowledge of measurement scales important to an understanding of multivariate data analysis?

Knowledge of measurement scales is crucial in multivariate data analysis because the scale of each variable (nominal, ordinal, interval, or ratio) determines the appropriate statistical methods, influences data interpretation, and ensures meaningful comparisons between variables.

  4. Discuss the approaches in analyzing multivariate data.

Analyzing multivariate data involves exploring several variables at once. The approach depends on the nature of the data, the objective of the analysis, and the assumptions that can be made. Some approaches are as follows:

PCA (Principal Component Analysis) simplifies data by identifying the directions in which the data vary most. The resulting components represent the original data in a simpler form with minimal loss of information.

SEM (Structural Equation Modeling) analyzes the relationships between observed variables and latent variables. It involves specifying models that include both direct and indirect effects among variables.
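As a minimal sketch of the PCA idea described above, the code below runs a PCA on R's built-in USArrests data (chosen only for illustration; it is not part of this exam) and reports how much of the total variance each component carries.

```r
# Minimal PCA sketch on R's built-in USArrests data (illustration only).
pca <- prcomp(USArrests, scale. = TRUE)   # standardize the variables, then rotate

# Each squared standard deviation is the variance carried by one component.
eigenvalues <- pca$sdev^2
prop_var <- eigenvalues / sum(eigenvalues)

print(round(prop_var, 3))           # proportion of variance per component
print(round(cumsum(prop_var), 3))   # cumulative proportion
```

If the first few cumulative proportions are already close to 1, the data can be summarized by those components with little loss of information.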

5.1 Provide an example of multivariate data in matrix form.

data.frame(Employees= c(1,2,3,4,5),
           Sales = c(120,120,130,150,110),
           Customer_service = c(90,70,80,90,80),
           Teamwork = c(85,95,88,80,75))
  Employees Sales Customer_service Teamwork
1         1   120               90       85
2         2   120               70       95
3         3   130               80       88
4         4   150               90       80
5         5   110               80       75

5.2 Of the given example in 5.1, how many variables are there?

There are three variables: Sales, Customer_service, and Teamwork. The Employees column only identifies each case and is not a measured variable.

5.3 Of the given example in 5.1, how many cases are there?

There are 5 cases, one for each of the 5 employees.
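The example in 5.1 is stored as a data frame. A short sketch of how it could be turned into an actual numeric matrix, with cases as rows and variables as columns, is shown below. The object names `scores` and `X` are our own choices for this illustration.

```r
# Same illustrative values as in 5.1; Employees is used only as a row label.
scores <- data.frame(Employees = c(1, 2, 3, 4, 5),
                     Sales = c(120, 120, 130, 150, 110),
                     Customer_service = c(90, 70, 80, 90, 80),
                     Teamwork = c(85, 95, 88, 80, 75))

# Drop the identifier column and convert to a matrix X (cases x variables).
X <- as.matrix(scores[, -1])
rownames(X) <- scores$Employees

print(X)
print(dim(X))   # number of cases, then number of variables
```

The dimensions of `X` give the answers to 5.2 and 5.3 directly.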

  6. Table 1 below shows six measurements on each of 25 pottery goblets excavated from prehistoric sites in Thailand, with Fig. 1 illustrating the typical shape and the nature of the measurements. The main question of interest for these data concerns similarities and differences between the goblets, with obvious questions being:

  a. Is it possible to display the data graphically to show how the goblets are related and, if so, are there any obvious groupings of similar goblets?

  b. Are there any goblets that are particularly unusual?

Carry out a principal components analysis and see whether the values of the principal components help to answer these questions.

One point that needs consideration with this exercise is the extent to which differences between goblets are due to shape differences rather than size differences. It may well be considered that two goblets that are almost the same shape but have very different sizes are really ‘similar’. The problem of separating size and shape differences has generated a considerable scientific literature that will not be considered here. However, it can be noted that one way to remove the effects of size involves dividing the measurements for a goblet by the total height of the body of the goblet. Alternatively, the measurements of a goblet can be expressed as a proportion of the sum of all measurements on that goblet. These types of standardization of variables will clearly ensure that the data values are similar for two goblets with the same shape but different sizes.
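The second standardization mentioned above (each measurement as a proportion of the goblet's total) can be sketched as follows. Goblet `a` uses the measurements of goblet 1 from Table 1, while goblet `b` is a made-up copy that is exactly twice as large; it is hypothetical and serves only to show that size drops out.

```r
# Goblet a: measurements of goblet 1 in Table 1.
a <- c(X1 = 13, X2 = 21, X3 = 23, X4 = 14, X5 = 7, X6 = 8)
# Goblet b: hypothetical goblet with the same shape but twice the size.
b <- 2 * a
g <- rbind(a, b)

# Express each measurement as a proportion of that goblet's total.
shape <- g / rowSums(g)

print(round(shape, 3))
print(all.equal(shape["a", ], shape["b", ]))   # identical shape profiles
```

After this transformation each row sums to 1, so a PCA on `shape` would compare goblets by shape only.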

Table 1. Measurements (in cm) taken on 25 prehistoric goblets from Thailand. The variables are defined in Fig. 6.3. The data were kindly provided by Professor C.F.W. Higham of the University of Otago.

library(factoextra)
goblets<-read.csv( "C:/Users/USER/Dropbox/PC/Desktop/Second semester 24-25/Multivariate Analysis, Lab/Midterm exam/vase_midterm.csv",header=T, row.names=1)
goblets
   X1 X2 X3 X4 X5 X6
1  13 21 23 14  7  8
2  14 14 24 19  5  9
3  19 23 24 20  6 12
4  17 18 16 16 11  8
5  19 20 16 16 10  7
6  12 20 24 17  6  9
7  12 19 22 16  6 10
8  12 22 25 15  7  7
9  11 15 17 11  6  5
10 11 13 14 11  7  4
11 12 20 25 18  5 12
12 13 21 23 15  9  8
13 12 15 19 12  5  6
14 13 22 26 17  7 10
15 14 22 26 15  7  9
16 14 19 20 17  5 10
17 15 16 15 15  9  7
18 19 21 20 16  9 10
19 12 20 26 16  7 10
20 17 20 27 18  6 14
21 13 20 27 17  6  9
22  9  9 10  7  4  3
23  8  8  7  5  2  2
24  9  9  8  4  2  2
25 12 19 27 18  5 12

Let us perform a Principal Component Analysis on the data.

my_goblets <- prcomp(goblets,
                     scale. = TRUE)
my_goblets
Standard deviations (1, .., p=6):
[1] 2.0668279 1.0450729 0.6202804 0.3773544 0.2555262 0.2088231

Rotation (n x k) = (6 x 6):
         PC1         PC2        PC3         PC4         PC5        PC6
X1 0.3660233  0.48592912 -0.6179335 -0.32436829  0.27835629  0.2556581
X2 0.4515367 -0.03412653  0.3752732 -0.67427405 -0.08391876 -0.4386709
X3 0.4111609 -0.44135161  0.3163501  0.02019451  0.38254463  0.6239630
X4 0.4618586 -0.11457532 -0.1588367  0.54119094  0.38182563 -0.5564635
X5 0.2963653  0.68277080  0.4914536  0.35921044 -0.22136144  0.1625790
X6 0.4381125 -0.29768029 -0.3324080  0.13346207 -0.75785442  0.1295892

The standard deviations of PC1 to PC6 are 2.0668279, 1.0450729, 0.6202804, 0.3773544, 0.2555262, and 0.2088231, respectively. The rotation matrix gives the loading of each original measurement on each component. All six measurements load positively on PC1 with similar magnitudes (0.30 to 0.46), which suggests that PC1 mainly reflects overall size differences among the goblets.

Next, we determine the number of principal components to retain using a scree plot.

fviz_eig(my_goblets,
         addlabels = TRUE,
         choice ="eigenvalue",
         ncp = ncol(goblets)) +
           geom_hline(yintercept = 1,
                      linetype = "dashed",
                      color = "red")

The scree plot shows that only two principal components, PC1 and PC2, have eigenvalues greater than 1, so we retain these two.

summary(my_goblets)
Importance of components:
                         PC1   PC2     PC3     PC4     PC5     PC6
Standard deviation     2.067 1.045 0.62028 0.37735 0.25553 0.20882
Proportion of Variance 0.712 0.182 0.06412 0.02373 0.01088 0.00727
Cumulative Proportion  0.712 0.894 0.95812 0.98185 0.99273 1.00000
num_pcgoblets <- sum(my_goblets$sdev^2 > 1)
print(num_pcgoblets)
[1] 2

This confirms that two principal components are retained. Together they account for 89.4% of the total variance.
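The figures in the summary can be checked by hand from the standard deviations reported by prcomp(). The sketch below copies those six values from the output above; the variable names are our own.

```r
# Standard deviations copied from the prcomp output above.
sdev <- c(2.0668279, 1.0450729, 0.6202804, 0.3773544, 0.2555262, 0.2088231)

eigenvalues <- sdev^2                        # variance of each component
prop_var <- eigenvalues / sum(eigenvalues)   # share of the total variance
n_keep <- sum(eigenvalues > 1)               # components with eigenvalue > 1

print(round(eigenvalues, 3))
print(round(cumsum(prop_var), 3))
print(n_keep)
```

Because the variables were standardized, the eigenvalues add up to the number of variables (6), and an eigenvalue above 1 means the component carries more variance than a single original variable.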

fviz_pca_biplot(my_goblets,
                label = "var",
                col.var = "#353436") +
  labs(x = "PC1",
       y = "PC2")

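In the biplot, all six variables point in the positive direction of PC1, consistent with the positive PC1 loadings found above. X1 and X5 have the largest positive loadings on PC2 (0.49 and 0.68), so their arrows point in a similar direction and they appear positively correlated.

Before performing K-means clustering, we need to determine how many clusters to use.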

my_pca_goblets <- data.frame(my_goblets$x[ , 1:2])

fviz_nbclust(my_pca_goblets,
             FUNcluster = kmeans,
             method = "wss")

The within-cluster sum of squares starts to level off at 3 clusters, so we use 3 as the number of clusters.
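The elbow plot is built from the total within-cluster sum of squares (WSS) for each candidate number of clusters. A rough sketch of that calculation without factoextra is given below. The measurements are transcribed from Table 1; the seed, the range of k, and `nstart = 25` are our own arbitrary choices, so the values need not match the plot exactly.

```r
# Goblet measurements transcribed from Table 1 (rows = goblets 1 to 25).
goblets <- matrix(c(
  13, 21, 23, 14,  7,  8,   14, 14, 24, 19,  5,  9,   19, 23, 24, 20,  6, 12,
  17, 18, 16, 16, 11,  8,   19, 20, 16, 16, 10,  7,   12, 20, 24, 17,  6,  9,
  12, 19, 22, 16,  6, 10,   12, 22, 25, 15,  7,  7,   11, 15, 17, 11,  6,  5,
  11, 13, 14, 11,  7,  4,   12, 20, 25, 18,  5, 12,   13, 21, 23, 15,  9,  8,
  12, 15, 19, 12,  5,  6,   13, 22, 26, 17,  7, 10,   14, 22, 26, 15,  7,  9,
  14, 19, 20, 17,  5, 10,   15, 16, 15, 15,  9,  7,   19, 21, 20, 16,  9, 10,
  12, 20, 26, 16,  7, 10,   17, 20, 27, 18,  6, 14,   13, 20, 27, 17,  6,  9,
   9,  9, 10,  7,  4,  3,    8,  8,  7,  5,  2,  2,    9,  9,  8,  4,  2,  2,
  12, 19, 27, 18,  5, 12),
  ncol = 6, byrow = TRUE,
  dimnames = list(as.character(1:25), paste0("X", 1:6)))

# Scores on the first two principal components of the standardized data.
scores <- prcomp(goblets, scale. = TRUE)$x[, 1:2]

set.seed(1)   # k-means starts from random centers; the seed is arbitrary
wss <- sapply(1:6, function(k) {
  kmeans(scores, centers = k, nstart = 25)$tot.withinss
})

print(round(wss, 2))   # look for the k after which the decrease flattens
```

The number of clusters is chosen where adding another cluster no longer reduces the WSS by much.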

my_goblets_pca_scores <- as.data.frame(my_goblets$x[,1:2])

my_kmeans_goblet <- kmeans(my_goblets_pca_scores,     
                       centers = 3)

fviz_pca_ind(my_goblets,                            
             habillage = my_kmeans_goblet$cluster,      
             repel = TRUE,                                  
             addEllipses = TRUE,                        
             ellipse.type = "convex") +               
  guides(color = guide_legend(override.aes = list(label = ""))) + 
  labs(x = "PC1",                                         
       y = "PC2")

my_kmeans_goblet
K-means clustering with 3 clusters of sizes 6, 15, 4

Cluster means:
         PC1        PC2
1 -3.1948479  0.0508247
2  1.0284111 -0.5523108
3  0.9357302  1.9949284

Clustering vector:
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 
 2  2  2  3  3  2  2  2  1  1  2  2  1  2  2  2  3  3  2  2  2  1  1  1  2 

Within cluster sum of squares by cluster:
[1] 15.186120  9.935051  2.494463
 (between_SS / total_SS =  78.5 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"      

a.) Yes, the data can be displayed graphically by plotting the goblets on the first two principal components, and three groups of similar goblets appear. Cluster 2 consists of goblets 1, 2, 3, 6, 7, 8, 11, 12, 14, 15, 16, 19, 20, 21, and 25. Cluster 3 includes goblets 4, 5, 17, and 18, which have the highest PC2 scores (cluster mean 1.99). Cluster 1 comprises goblets 9, 10, 13, 22, 23, and 24, which have the lowest PC1 scores (cluster mean -3.19) and are therefore the smaller goblets.
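The group memberships can be read off the clustering vector without counting by hand. The sketch below copies the 25 cluster labels from the k-means output above; the names `cluster` and `members` are our own.

```r
# Cluster labels copied from the "Clustering vector" in the k-means output.
cluster <- c(2, 2, 2, 3, 3, 2, 2, 2, 1, 1, 2, 2, 1,
             2, 2, 2, 3, 3, 2, 2, 2, 1, 1, 1, 2)
names(cluster) <- 1:25   # goblet numbers

# List the goblet numbers that fall in each cluster.
members <- split(names(cluster), cluster)

print(members)
print(lengths(members))   # cluster sizes
```

The cluster sizes should agree with the sizes 6, 15, 4 printed at the top of the k-means output.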

b.) The scatterplot below shows each goblet as a point located by its PC1 and PC2 scores. The ellipse is a 95% ellipse based on a multivariate t-distribution. Most goblets fall inside it, and those lying outside or close to its boundary are considered unusual.

goblet_pca_scores <- as.data.frame(my_goblets$x[, 1:2])


ggplot(goblet_pca_scores, aes(x = PC1, y = PC2)) +
  geom_point(size = 2, color = "red") +
  theme_minimal() +
  stat_ellipse(type = "t", level = 0.95)

mahal_dist <- mahalanobis(my_goblets_pca_scores, colMeans(my_goblets_pca_scores), cov(my_goblets_pca_scores))
mahal_dist
         1          2          3          4          5          6          7 
0.05284764 0.54101529 1.54405977 5.32224449 5.78497860 0.60885271 0.44418301 
         8          9         10         11         12         13         14 
0.14228860 0.70161594 1.66856335 2.20666819 0.48010680 0.46027735 0.70557856 
        15         16         17         18         19         20         21 
0.35204402 0.32649269 2.39913589 2.93767708 0.58948610 1.60396641 0.81232165 
        22         23         24         25 
3.81505022 6.18920620 5.70680660 2.60453284 

Using the Mahalanobis distance, we can see that goblets 4, 5, 22, 23, and 24 have squared distances greater than 3 (the mahalanobis() function returns squared distances). These goblets lie farthest from the center of the scatterplot above and are therefore considered unusual.
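The threshold of 3 is an informal choice. As a hedged alternative, the squared distances can be compared with a chi-square quantile, which assumes the two component scores are roughly bivariate normal. The sketch below copies the distances from the output above; the 95% level is our own assumption.

```r
# Squared Mahalanobis distances copied from the output above (goblets 1 to 25).
d2 <- c(0.05284764, 0.54101529, 1.54405977, 5.32224449, 5.78497860,
        0.60885271, 0.44418301, 0.14228860, 0.70161594, 1.66856335,
        2.20666819, 0.48010680, 0.46027735, 0.70557856, 0.35204402,
        0.32649269, 2.39913589, 2.93767708, 0.58948610, 1.60396641,
        0.81232165, 3.81505022, 6.18920620, 5.70680660, 2.60453284)
names(d2) <- 1:25

# Two components were used, so the reference distribution has 2 df.
cutoff <- qchisq(0.95, df = 2)

print(names(d2)[d2 > 3])        # informal threshold used in the text
print(round(cutoff, 2))         # chi-square cutoff, about 5.99
print(names(d2)[d2 > cutoff])   # goblets beyond the stricter cutoff
```

Under the stricter cutoff only goblet 23 is flagged, while goblets 4, 5, and 24 fall just below it, so the two rules agree on which goblets are the most extreme.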

  7. Table 2 shows the estimates of the average protein consumption from different food sources for the inhabitants of 25 European countries as published by Weber (1975). Use principal components analysis to investigate the relationships between the countries on the basis of these variables.

library(factoextra)
Protein<-read.csv( "C:/Users/USER/Dropbox/PC/Desktop/Second semester 24-25/Multivariate Analysis, Lab/Midterm exam/Protein_midterm.csv",header=T, row.names=1)
Protein
               Red.Meat White.Meat Eggs Milk Fish Cereals Starchy.Foods
Albania              10          1    1    9    9      42             1
Austria               9         14    4   20    2      28             4
Belgium              14          9    4   18    5      27             6
Bulgaria              8          6    2    8    1      57             1
Czechoslovakia       10         11    3   13    2      34             5
Denmark              11         11    4   25   10      22             5
E. Germany            8         12    4   11    5      25             7
Finland              10          5    3   34    6      26             5
France               18         10    3   20    6      28             5
Greece               10          3    3   18    6      42             2
Hungary               5         12    3   10    0      40             4
Ireland              14         10    5   26    2      24             6
Italy                 9          5    3   14    3      37             2
Netherlands          10         14    4   23    3      22             4
Norway                9          5    3   23   10      23             5
Poland                7         10    3   19    3      36             6
Portugal              6          4    1    5   14      27             6
Romania               6          6    2   11    1      50             3
Spain                 7          3    3    9    7      29             6
Sweden               10          8    4   25    8      20             4
Switzerland          13         10    3   24    2      26             3
UK                   17          6    5   21    4      24             5
USSR                  9          5    2   17    3      44             6
W. Germany           11         13    4   19    3      19             5
Yugoslavia            4          5    1   10    1      56             3
               Pulses..Nuts..Oilseeds Fruits...Vegetables Total
Albania                             6                   2    72
Austria                             1                   4    86
Belgium                             2                   4    89
Bulgaria                            4                   4    91
Czechoslovakia                      1                   4    83
Denmark                             1                   2    91
E. Germany                          1                   4    77
Finland                             1                   1    91
France                              2                   7    99
Greece                              8                   7    99
Hungary                             5                   4    83
Ireland                             2                   3    92
Italy                               4                   7    84
Netherlands                         2                   4    86
Norway                              2                   3    83
Poland                              2                   7    93
Portugal                            5                   8    76
Romania                             5                   3    87
Spain                               6                   7    77
Sweden                              1                   2    82
Switzerland                         2                   5    88
UK                                  3                   3    88
USSR                                3                   3    92
W. Germany                          2                   4    80
Yugoslavia                          6                   3    89

First, we check for outliers across the data by drawing a scatter plot matrix.

pairs(Protein, cex.labels = 0.6, gap = 0.1, pch = ".")

round(sapply(Protein,var),2)
              Red.Meat             White.Meat                   Eggs 
                 11.58                  13.99                   1.24 
                  Milk                   Fish                Cereals 
                 50.38                  12.07                 121.23 
         Starchy.Foods Pulses..Nuts..Oilseeds    Fruits...Vegetables 
                  2.74                   4.08                   3.67 
                 Total 
                 45.81 

The variances range from 1.24 (Eggs) to 121.23 (Cereals), so variables with large variances would dominate a PCA on the raw data. Next, we standardize the data and calculate the correlation matrix.

Nor_Protein=scale(Protein)
round(cor(Nor_Protein),2)
                       Red.Meat White.Meat  Eggs  Milk  Fish Cereals
Red.Meat                   1.00       0.19  0.58  0.54  0.07   -0.51
White.Meat                 0.19       1.00  0.60  0.30 -0.40   -0.44
Eggs                       0.58       0.60  1.00  0.61 -0.15   -0.70
Milk                       0.54       0.30  0.61  1.00  0.04   -0.59
Fish                       0.07      -0.40 -0.15  0.04  1.00   -0.42
Cereals                   -0.51      -0.44 -0.70 -0.59 -0.42    1.00
Starchy.Foods              0.15       0.33  0.41  0.21  0.22   -0.58
Pulses..Nuts..Oilseeds    -0.41      -0.67 -0.60 -0.62  0.03    0.64
Fruits...Vegetables       -0.06      -0.07 -0.16 -0.40  0.11    0.04
Total                      0.37       0.10  0.19  0.46 -0.32    0.19
                       Starchy.Foods Pulses..Nuts..Oilseeds Fruits...Vegetables
Red.Meat                        0.15                  -0.41               -0.06
White.Meat                      0.33                  -0.67               -0.07
Eggs                            0.41                  -0.60               -0.16
Milk                            0.21                  -0.62               -0.40
Fish                            0.22                   0.03                0.11
Cereals                        -0.58                   0.64                0.04
Starchy.Foods                   1.00                  -0.50                0.07
Pulses..Nuts..Oilseeds         -0.50                   1.00                0.35
Fruits...Vegetables             0.07                   0.35                1.00
Total                          -0.04                  -0.08                0.07
                       Total
Red.Meat                0.37
White.Meat              0.10
Eggs                    0.19
Milk                    0.46
Fish                   -0.32
Cereals                 0.19
Starchy.Foods          -0.04
Pulses..Nuts..Oilseeds -0.08
Fruits...Vegetables     0.07
Total                   1.00
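To see what the standardization does, the sketch below applies scale() to a small excerpt of Table 2 (the Eggs and Cereals values of the first five countries). The excerpt and the object names are chosen only for illustration.

```r
# Eggs and Cereals for the first five countries, transcribed from Table 2.
protein <- data.frame(Eggs = c(1, 4, 4, 2, 3),
                      Cereals = c(42, 28, 27, 57, 34),
                      row.names = c("Albania", "Austria", "Belgium",
                                    "Bulgaria", "Czechoslovakia"))

# Center each column and divide it by its standard deviation.
z <- scale(protein)

print(round(apply(z, 2, var), 2))        # every column now has variance 1
print(all.equal(cor(z), cor(protein)))   # correlations are unchanged
```

Since standardizing does not change correlations, cor(Nor_Protein) and cor(Protein) give the same matrix; the standardization matters for covariance-based methods.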

Calculating the eigenvalues and eigenvectors of the correlation matrix, we have

eigen(cor(Nor_Protein))
eigen() decomposition
$values
 [1] 4.08102042 1.77649203 1.29332073 1.15617590 0.64090232 0.40846280
 [7] 0.35173106 0.17668554 0.11086093 0.00434827

$vectors
              [,1]         [,2]        [,3]        [,4]       [,5]       [,6]
 [1,] -0.323162712  0.085761181 -0.45555126  0.13466528 -0.4149609 -0.3378958
 [2,] -0.327966557  0.167793167  0.52378576  0.16677840 -0.1194565  0.3658457
 [3,] -0.427603002  0.069514798  0.05369812  0.10457088 -0.3052443 -0.3220580
 [4,] -0.391384747  0.151749080 -0.33065384 -0.21545795  0.1829815  0.2987262
 [5,] -0.002912023 -0.620011310 -0.40426232 -0.09156400  0.1109589  0.3017640
 [6,]  0.407352461  0.379047870  0.04161765 -0.03365992  0.2043929 -0.1377447
 [7,] -0.276316909 -0.336781738  0.19817040  0.25489398  0.6218107 -0.5160563
 [8,]  0.423039223  0.006822824 -0.20122839  0.15536750 -0.1800047 -0.2882860
 [9,]  0.132974407 -0.157406379 -0.04272630  0.83950049 -0.1221485  0.2947784
[10,] -0.114182012  0.519897886 -0.39894376  0.30545868  0.4458952  0.1125718
             [,7]         [,8]        [,9]       [,10]
 [1,]  0.55843275  0.002529492 -0.20345101 -0.15145045
 [2,]  0.12212643  0.492031222 -0.32884690 -0.22102950
 [3,] -0.49516831  0.199127568  0.54936105 -0.12234518
 [4,] -0.37188877 -0.359851908 -0.27811906 -0.44740553
 [5,]  0.06768754  0.501579332  0.20518330 -0.20525235
 [6,]  0.20980494  0.099997238  0.30264163 -0.69365849
 [7,]  0.03944011 -0.023409408 -0.18280865 -0.13228566
 [8,] -0.48684034  0.334997648 -0.53173185 -0.09419728
 [9,] -0.04931532 -0.329761355  0.11236684 -0.15918232
[10,] -0.01201355  0.325051449  0.09403944  0.37157289

Next, we extract the principal components. The loadings match the eigenvectors above up to sign; the sign of an eigenvector is arbitrary.

Protein_PCA<-princomp(Nor_Protein,cor = TRUE)
summary(Protein_PCA, loadings = TRUE)
Importance of components:
                         Comp.1    Comp.2    Comp.3    Comp.4     Comp.5
Standard deviation     2.020154 1.3328511 1.1372426 1.0752562 0.80056375
Proportion of Variance 0.408102 0.1776492 0.1293321 0.1156176 0.06409023
Cumulative Proportion  0.408102 0.5857512 0.7150833 0.8307009 0.89479114
                           Comp.6     Comp.7     Comp.8     Comp.9     Comp.10
Standard deviation     0.63911094 0.59306918 0.42033979 0.33295784 0.065941413
Proportion of Variance 0.04084628 0.03517311 0.01766855 0.01108609 0.000434827
Cumulative Proportion  0.93563742 0.97081053 0.98847908 0.99956517 1.000000000

Loadings:
                       Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
Red.Meat                0.323         0.456  0.135  0.415  0.338  0.558       
White.Meat              0.328  0.168 -0.524  0.167  0.119 -0.366  0.122  0.492
Eggs                    0.428                0.105  0.305  0.322 -0.495  0.199
Milk                    0.391  0.152  0.331 -0.215 -0.183 -0.299 -0.372 -0.360
Fish                          -0.620  0.404        -0.111 -0.302         0.502
Cereals                -0.407  0.379               -0.204  0.138  0.210       
Starchy.Foods           0.276 -0.337 -0.198  0.255 -0.622  0.516              
Pulses..Nuts..Oilseeds -0.423         0.201  0.155  0.180  0.288 -0.487  0.335
Fruits...Vegetables    -0.133 -0.157         0.840  0.122 -0.295        -0.330
Total                   0.114  0.520  0.399  0.305 -0.446 -0.113         0.325
                       Comp.9 Comp.10
Red.Meat                0.203  0.151 
White.Meat              0.329  0.221 
Eggs                   -0.549  0.122 
Milk                    0.278  0.447 
Fish                   -0.205  0.205 
Cereals                -0.303  0.694 
Starchy.Foods           0.183  0.132 
Pulses..Nuts..Oilseeds  0.532        
Fruits...Vegetables    -0.112  0.159 
Total                         -0.372 

PC1 explains 40.81% of the variance, PC2 explains 17.77%, PC3 explains 12.93%, and PC4 explains 11.56%; together they account for 83.07% of the total variance. Next, we use a scree plot to decide how many principal components to retain.

fviz_eig(Protein_PCA,
         addlabels = TRUE,
         choice ="eigenvalue",
         ncp = ncol(Protein)) +
           geom_hline(yintercept = 1,
                      linetype = "dashed",
                      color = "red")
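The red dashed line in the scree plot marks an eigenvalue of 1. The same rule can be checked numerically from the eigenvalues printed by eigen() earlier, as sketched below; the values are copied from that output and the variable names are our own.

```r
# Eigenvalues copied from the eigen() output above.
ev <- c(4.08102042, 1.77649203, 1.29332073, 1.15617590, 0.64090232,
        0.40846280, 0.35173106, 0.17668554, 0.11086093, 0.00434827)

# The eigenvalues of a correlation matrix sum to the number of variables (10).
prop_var <- ev / sum(ev)

print(round(100 * prop_var[1:4], 2))         # percent of variance, PC1 to PC4
print(round(100 * cumsum(prop_var)[4], 2))   # cumulative percent for four PCs
print(sum(ev > 1))                           # components with eigenvalue > 1
```

Four eigenvalues exceed 1, so by this rule four principal components would be retained, covering about 83% of the variance.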

Here we calculate the scores of each country on each principal component.

round(Protein_PCA$scores,1)
               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
Albania          -3.5   -1.3    1.0   -2.3    1.5    0.1    0.7    0.1    0.3
Austria           1.5    0.8   -1.4    0.0    0.4   -0.7   -0.1    0.1   -0.2
Belgium           1.7   -0.1    0.3    0.5    0.0    0.8    0.5    0.2   -0.1
Bulgaria         -2.9    2.2   -0.1   -0.3    0.4   -0.3    0.7    0.2   -0.8
Czechoslovakia    0.5    0.2   -1.4    0.0    0.0    0.1    0.9   -0.2   -0.2
Denmark           2.4   -0.5    0.7   -0.9   -0.6   -0.7    0.0    1.1   -0.3
E. Germany        1.2   -1.5   -2.2    0.1   -0.1    0.6    0.2    0.2   -0.4
Finland           1.8    0.3    1.3   -1.9   -1.4   -0.3   -0.4   -0.7    0.2
France            1.6    0.6    1.8    2.2    0.0   -0.3    1.5    0.2    0.3
Greece           -2.2    1.1    2.5    1.5    0.3   -0.2   -1.3    0.6    0.0
Hungary          -1.3    0.9   -2.1    0.2    0.2    0.2   -0.7    0.5    0.4
Ireland           2.8    0.9    0.3    0.2    0.0    1.0   -0.5    0.0    0.1
Italy            -1.6    0.3    0.2    0.8    1.2   -0.5   -0.3   -0.9   -0.6
Netherlands       1.8    0.5   -0.9    0.0    0.6   -0.8   -0.5    0.2    0.4
Norway            0.7   -1.6    0.8   -1.1   -0.6   -0.4   -0.3   -0.1   -0.2
Poland            0.3    0.4   -0.6    1.7   -1.4   -0.6   -0.2   -0.4   -0.2
Portugal         -2.6   -4.0    0.2    1.2   -0.6   -0.6    0.4    0.2    0.2
Romania          -2.5    1.3   -0.6   -0.7   -0.3    0.3   -0.1    0.1    0.0
Spain            -1.8   -2.3   -0.2    1.2    0.1    1.0   -0.9   -0.4    0.0
Sweden            1.8   -0.9    0.3   -1.6    0.3   -0.5   -0.4    0.0   -0.4
Switzerland       1.1    0.9    0.3    0.2    0.8   -0.8    0.3   -0.7    0.5
UK                2.0    0.2    1.3   -0.1    1.0    1.6    0.0    0.0   -0.2
USSR             -0.7    0.7    0.2   -0.3   -1.8    0.8    0.5   -0.3    0.1
W. Germany        1.8   -0.4   -1.2    0.0    0.9    0.0   -0.1   -0.1    0.5
Yugoslavia       -3.7    1.5   -0.5   -0.7   -1.0    0.2   -0.1    0.1    0.4
               Comp.10
Albania            0.2
Austria            0.0
Belgium            0.0
Bulgaria          -0.1
Czechoslovakia     0.0
Denmark            0.0
E. Germany         0.0
Finland            0.0
France             0.0
Greece             0.0
Hungary            0.0
Ireland            0.1
Italy              0.0
Netherlands        0.0
Norway             0.0
Poland             0.2
Portugal          -0.1
Romania           -0.1
Spain              0.0
Sweden             0.0
Switzerland       -0.1
UK                -0.1
USSR               0.0
W. Germany         0.0
Yugoslavia        -0.1
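Each score is a linear combination of a country's standardized values, weighted by the loadings. The sketch below shows this relationship with prcomp() on R's built-in USArrests data, used here only as a stand-in. Note that princomp() standardizes with divisor n rather than n - 1, so its scores would differ from this hand calculation by a constant factor.

```r
# Illustration on R's built-in USArrests data, not the protein data.
pca <- prcomp(USArrests, scale. = TRUE)

# Standardize the data, then multiply by the loadings (rotation matrix).
z <- scale(USArrests)
manual_scores <- z %*% pca$rotation

# The hand-computed scores match the ones stored by prcomp().
print(all.equal(unname(manual_scores), unname(pca$x)))
print(round(head(manual_scores, 3), 2))
```

Reading a score this way helps with interpretation: a large positive score on a component means the observation is high on the variables with large positive loadings for that component.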

Here we use a biplot to show the variables and the countries on PC1 and PC2.

fviz_pca_biplot(Protein_PCA,                                
                label = "var", 
                col.var = "#353436",                    
                palette = c("#3fdf05",
                            "#f25c10",
                            "#1b98e0")) +
  labs(x = "PC1",                                      
       y = "PC2")

In the biplot, Starchy Foods, White Meat, Red Meat, Eggs, and Milk point in the positive direction of PC1 and are positively correlated, while Cereals and Pulses, Nuts, and Oilseeds point in the negative direction. On PC2, the largest loadings belong to Fish (-0.620) and Total (0.520), followed by Cereals (0.379) and Starchy Foods (-0.337).

biplot(Protein_PCA,xlim=c(-0.4,0.5),ylim=c(-0.3,0.3), xlabs=abbreviate(row.names(Protein)))

The biplot suggests four dietary patterns. Albania, Spain, and Portugal appear to be characterized by protein consumption from fruits and vegetables. In Belgium, Norway, West Germany, Sweden, East Germany, and Denmark, starchy foods are the distinguishing factor. Meanwhile, Switzerland, Ireland, Austria, France, Poland, Czechoslovakia, Finland, the Netherlands, and the UK show a preference for white meat, red meat, eggs, and milk. Lastly, Yugoslavia, Romania, Italy, Hungary, the USSR, Bulgaria, and Greece stand out due to their notable intake of cereals, pulses, nuts, and oilseeds.