This paper reviews algorithms and methods used for dimension reduction. Three methods are applied: Multidimensional Scaling (MDS), Principal Component Analysis (PCA) and Weighted Least Squares Factor Analysis. As in the paper on clustering techniques, the analysis is based on the “nutrition” data set, which can be found here: https://www.kaggle.com/trolukovich/nutritional-values-for-common-foods-and-products?select=nutrition.csv. This data set contains nutrition data for almost 9 thousand different food items.
First, the necessary libraries are loaded and the data set is imported.
library(tidyverse)
library(dplyr)
library(lubridate)
library(ggplot2)
library(datasets)
library(readxl)
library(xlsx)
library(cluster)
library(factoextra)
library(flexclust)
library(fpc)
library(clustertend)
library(ClusterR)
library(grid)
library(gridExtra)
library(lattice)
library(ppclust)
library(fclust)
library(wesanderson)
library(corrplot)
library(psych)
library(ggfortify)
library(pca3d)
library(knitr)
library(rgl)
library(smacof)
library(labdsv)
library(vegan)
library(MASS)
library(ape)
library(pls)
nutrition <- read.csv2("nutrition.csv", header=TRUE, sep = ",", stringsAsFactors = TRUE)
Next, the rows with NA values are omitted, irrelevant columns are deleted and the food item names are stored as an additional variable.
nutrition <- na.omit(nutrition)
namesfood <- nutrition[1:175,2]
nutrition <- nutrition[,-1]
nutrition <- nutrition[,-1]
nutrition <- nutrition[,-1]
nutrition <- nutrition[,-3]
dim(nutrition)
## [1] 8789 73
Furthermore, all columns containing character variables are converted to numeric.
for(i in 1:ncol(nutrition)) {
nutrition[,i] <- as.numeric(nutrition[,i])
}
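One point worth checking in the conversion above: because the file is read with stringsAsFactors = TRUE, text columns are stored as factors, and as.numeric() applied to a factor returns the integer level codes rather than the numbers written in the text. The following is a minimal sketch of a unit-aware conversion on toy values. The helper name parse_amount and the sample strings are assumptions made for illustration and are not part of the original analysis.

```r
# Toy column imitating amounts stored as text with a unit suffix (assumed format).
raw <- factor(c("9g", "0.5g", "12g", "9g"))

# as.numeric() on a factor yields the internal level codes, not the amounts.
codes <- as.numeric(raw)

# Hypothetical helper: drop every character except digits, dot and minus sign,
# then convert the remaining text to a number.
parse_amount <- function(x) {
  as.numeric(gsub("[^0-9.-]", "", as.character(x)))
}

amounts <- parse_amount(raw)
print(codes)    # level codes
print(amounts)  # parsed amounts
```

Comparing the two printed vectors shows whether a column was converted by value or by level code.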
Also, in order to obtain more interpretable results, the whole data set is standardised.
nutrition2<-nutrition
nutrition <- scale(nutrition)
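As a small illustration of what scale() does, the sketch below standardises a synthetic matrix (random numbers, not the nutrition data) and compares the result with a manual computation. It also shows that a subset of rows taken after standardisation generally no longer has mean 0, which is why column means of a trimmed matrix can differ from zero.

```r
set.seed(1)
# Synthetic data: 50 observations of 4 variables on an arbitrary scale.
X <- matrix(rnorm(200, mean = 50, sd = 10), nrow = 50, ncol = 4)

# scale() subtracts the column mean and divides by the column standard deviation.
Z <- scale(X)

# The same computation written out by hand.
manual <- sweep(sweep(X, 2, colMeans(X), "-"), 2, apply(X, 2, sd), "/")

print(round(colMeans(Z), 10))        # column means of the standardised data
print(apply(Z, 2, sd))               # column standard deviations
print(round(colMeans(Z[1:10, ]), 3)) # means of a row subset
```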
After the initial data processing we obtain the standardised matrix containing data for 8789 food products, to each of which 73 nutritional values have been assigned. Finally, to keep the analysis manageable, only 175 food items and 15 nutritional characteristics are selected.
nutritiontrim <- as.matrix(nutrition[1:175, 1:15])
nutrition2<-as.matrix(nutrition2[1:175, 1:15])
names_nutr<- as.matrix(colnames(nutritiontrim))
summary(nutritiontrim)
## calories total_fat cholesterol sodium
## Min. :-1.26152 Min. :-1.0124 Min. :-0.89088 Min. :-1.9078
## 1st Qu.:-0.96127 1st Qu.:-0.9158 1st Qu.:-0.89088 1st Qu.:-1.0133
## Median :-0.18417 Median :-0.4971 Median :-0.89088 Median :-0.2183
## Mean :-0.01106 Mean :-0.1256 Mean :-0.36797 Mean :-0.2200
## 3rd Qu.: 0.62236 3rd Qu.: 0.3295 3rd Qu.:-0.06053 3rd Qu.: 0.6415
## Max. : 3.82496 Max. : 2.7234 Max. : 1.63663 Max. : 1.8090
## choline folate folic_acid niacin
## Min. :-0.78718 Min. :-1.17831 Min. :-0.3548 Min. :-1.2115
## 1st Qu.:-0.78718 1st Qu.:-1.09848 1st Qu.:-0.3353 1st Qu.:-1.0396
## Median :-0.48857 Median :-0.22032 Median :-0.3353 Median :-0.5094
## Mean : 0.01346 Mean :-0.03467 Mean :-0.1089 Mean :-0.2006
## 3rd Qu.: 0.74183 3rd Qu.: 0.98858 3rd Qu.:-0.3353 3rd Qu.: 0.5672
## Max. : 2.22285 Max. : 1.58923 Max. : 4.6595 Max. : 1.9358
## pantothenic_acid riboflavin thiamin vitamin_a
## Min. :-0.96531 Min. :-1.061069 Min. :-0.80429 Min. :-0.97765
## 1st Qu.:-0.94700 1st Qu.:-0.854052 1st Qu.:-0.67142 1st Qu.:-0.97535
## Median :-0.36370 Median :-0.289718 Median :-0.36460 Median :-0.03169
## Mean : 0.07939 Mean : 0.003946 Mean : 0.04274 Mean : 0.07498
## 3rd Qu.: 0.85000 3rd Qu.: 0.484469 3rd Qu.: 0.45923 3rd Qu.: 1.03051
## Max. : 2.80656 Max. : 3.379871 Max. : 3.52501 Max. : 2.06049
## vitamin_a_rae carotene_alpha carotene_beta
## Min. :-0.75618 Min. :-0.2591 Min. :-0.4945
## 1st Qu.:-0.75032 1st Qu.:-0.2591 1st Qu.:-0.4945
## Median :-0.74446 Median :-0.2267 Median :-0.4879
## Mean :-0.07657 Mean : 0.1852 Mean : 0.2510
## 3rd Qu.: 0.52687 3rd Qu.:-0.2267 3rd Qu.: 0.8318
## Max. : 2.49539 Max. : 5.9855 Max. : 3.2059
In the next step we will examine the correlation matrix of the analysed data.
#two correlation plots
corr_nutritiontrim = cor(nutritiontrim, method='pearson')
corrplot.mixed(cor(nutritiontrim), bg="white", upper="pie",lower="number", order="hclust", tl.col="black", tl.pos="lt", diag="l", number.font=0.5, tl.cex=1, number.cex=0.55)
corrplot(corr_nutritiontrim, tl.col="black")
## Warning in plot.window(...): 'tl.col' is not a graphical parameter
## Warning in plot.xy(xy, type, ...): 'tl.col' is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): 'tl.col' is not a
## graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): 'tl.col' is not a
## graphical parameter
## Warning in box(...): 'tl.col' is not a graphical parameter
## Warning in title(...): 'tl.col' is not a graphical parameter
Looking at both plots we can clearly see that some variables are correlated, while most pairs of variables have correlations close to 0.
The first method used is Multidimensional Scaling (MDS). This algorithm projects a multidimensional data set onto a two-dimensional plane while trying to preserve the original distances between points. In our case there are 15 nutritional values for each food item. Because the data matrix is transposed before the distances are computed, the points on the plot are the 15 nutritional variables, and their mutual distances are visualized on the 2D plane.
distance<-dist(t(nutritiontrim))
mds<-cmdscale(distance, k=2)
#let's plot the results with labels
plot(mds, type='n')
text(mds, labels=names_nutr, cex=0.6, adj=0.5)
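To make the mechanics of classical MDS more concrete, the sketch below reproduces the coordinates returned by cmdscale() by hand: the squared distance matrix is double-centred and the leading eigenvectors are scaled by the square roots of their eigenvalues. The data are synthetic, and the comparison uses absolute values because the sign of each axis is arbitrary.

```r
set.seed(42)
# Synthetic data: 15 objects described by 6 features.
X <- matrix(rnorm(15 * 6), nrow = 15)
D <- as.matrix(dist(X))
n <- nrow(D)

# Double-centre the squared distances: B = -1/2 * J %*% D^2 %*% J.
J <- diag(n) - matrix(1 / n, n, n)
B <- -0.5 * J %*% (D^2) %*% J

# Coordinates are the leading eigenvectors scaled by sqrt(eigenvalue).
e <- eigen(B, symmetric = TRUE)
coords_manual <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

# Built-in classical MDS for comparison.
coords_builtin <- unname(cmdscale(dist(X), k = 2))

# Largest absolute difference, ignoring the sign of each axis.
print(max(abs(abs(coords_manual) - abs(coords_builtin))))
```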
On the plot we can see all 15 nutritional values and their relative positions on the two-dimensional plane. Although there are no obvious outliers, some variables, such as carotene alpha or pantothenic acid, lie farther away than most points. To limit the influence of such points we will use a slightly different approach: this time the distance matrix is derived from the correlation matrix.
#calculate the correlation matrix
simmilaritymatr<-cor(nutritiontrim)
#calculate the dissimilarity matrix
dissimilaritymatr<-sim2diss(simmilaritymatr, method=1, to.dist=TRUE)
#perform MDS on the calculated matrix
mds2<-mds(dissimilaritymatr, ndim=2, type="ratio") # from smacof::
#plot the results
plot(mds2)
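The conversion from correlations to dissimilarities can be sketched in base R. The example assumes that sim2diss() with a numeric method subtracts each similarity from that number, so that method=1 corresponds to 1 - r. This is my reading of the smacof documentation and should be checked against the installed version. The data are synthetic.

```r
set.seed(7)
# Synthetic data: 5 variables, the first two made strongly correlated.
X <- matrix(rnorm(100 * 5), ncol = 5,
            dimnames = list(NULL, paste0("v", 1:5)))
X[, 2] <- X[, 1] + rnorm(100, sd = 0.3)

R <- cor(X)

# Assumed base R equivalent of sim2diss(R, method = 1, to.dist = TRUE):
# highly correlated variables get a dissimilarity close to 0.
diss <- as.dist(1 - R)
print(round(diss, 2))
```

Since correlations lie between -1 and 1, these dissimilarities are bounded between 0 and 2, so no variable can end up arbitrarily far from the others.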
This time all variables are close to each other and no outliers are visible. We can also look at the contribution of each variable to the STRESS function, which measures how well the distances between variables on the 2D plane reproduce their original dissimilarities.
plot(mds2, pch=21, cex=as.numeric(mds2$spp), bg="red")
We can see that some of the variables have a bigger impact on the STRESS function than others. Finally, we shall examine the quality of the performed MDS.
#theoretical stress function
stressfunction<-randomstress(n=15, ndim=2, nrep=1)
#empirical stress function
empirical<-mds2$stress
#ratio
result <- empirical/mean(stressfunction)
result
## [1] 0.6524064
The ratio of the stress values is approximately 0.65. According to Kruskal (1964), the closer this result is to 1, the poorer the MDS solution, so the quality of the obtained results is poor. Thus, we should apply other methods to extract information from our data set.
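The idea behind the stress ratio can be sketched in base R. The example below uses classical MDS (cmdscale) as a stand-in for the SMACOF algorithm, a hand-written Kruskal stress-1 function (the name stress1 is hypothetical), and synthetic data. The random baseline is averaged over many replications, which is more stable than a single draw.

```r
# Hypothetical helper: Kruskal's stress-1 for a given configuration,
# with the dissimilarities rescaled by a least-squares factor (ratio MDS).
stress1 <- function(delta, conf) {
  d <- as.vector(dist(conf))          # distances on the low-dimensional map
  delta <- as.vector(delta)           # original dissimilarities
  b <- sum(d * delta) / sum(delta^2)  # optimal rescaling of delta
  sqrt(sum((d - b * delta)^2) / sum(d^2))
}

# Random symmetric dissimilarities for n objects.
random_diss <- function(n) {
  m <- matrix(0, n, n)
  m[lower.tri(m)] <- runif(n * (n - 1) / 2)
  as.dist(m)
}

set.seed(3)
# Synthetic data with a clear two-dimensional structure plus small noise.
X <- cbind(matrix(rnorm(15 * 2), nrow = 15),
           matrix(rnorm(15 * 6, sd = 0.1), nrow = 15))
delta <- dist(X)
empirical_stress <- stress1(delta, cmdscale(delta, k = 2))

# Baseline: average stress of MDS fitted to random dissimilarities.
random_stress <- replicate(200, {
  rd <- random_diss(15)
  stress1(rd, cmdscale(rd, k = 2))
})

ratio <- empirical_stress / mean(random_stress)
print(ratio)
```

A ratio well below 1 indicates that the map represents the data much better than a map fitted to structureless dissimilarities.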
The second method used for reducing the analysed data set is Principal Component Analysis (PCA). PCA is used to reduce the number of variables while retaining as much of the original information as possible. We achieve that by finding the directions of highest variance in the data and creating the principal components, which are linear combinations of the original variables. The first principal component explains the highest percentage of the total variance in the data, the second principal component explains the highest percentage of the variance that has not been explained by the first component, etc. This property is associated with the orthogonality of the components; we can think of orthogonality as the generalisation of perpendicularity to higher-dimensional spaces (Górniak 1998).
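These properties can be checked on a small synthetic example (the data below are random numbers, not the nutrition data): the component variances returned by prcomp() equal the eigenvalues of the covariance matrix, they are sorted in decreasing order, and the loading vectors are orthonormal.

```r
set.seed(10)
n <- 200
z <- rnorm(n)
# Synthetic data: two correlated variables and one independent variable.
X <- cbind(a = z + rnorm(n, sd = 0.5),
           b = z + rnorm(n, sd = 0.5),
           c = rnorm(n))

p <- prcomp(X, center = TRUE, scale. = FALSE)
e <- eigen(cov(X), symmetric = TRUE)

# Component variances equal the eigenvalues of the covariance matrix.
print(cbind(prcomp = p$sdev^2, eigen = e$values))

# Loadings are orthonormal: t(V) %*% V is the identity matrix.
print(round(crossprod(p$rotation), 10))

# Proportion of the total variance explained by each component.
print(round(p$sdev^2 / sum(p$sdev^2), 3))
```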
pca<-prcomp(nutritiontrim, center=TRUE, scale.=FALSE)
pca
## Standard deviations (1, .., p=15):
## [1] 1.8457612 1.5338108 1.2021381 1.1280718 1.0732342 1.0289359 0.9820252
## [8] 0.9159694 0.8090402 0.7748769 0.6929745 0.6571189 0.6457515 0.6107240
## [15] 0.5229643
##
## Rotation (n x k) = (15 x 15):
## PC1 PC2 PC3 PC4 PC5
## calories -0.27302813 -0.11156558 0.610080784 -0.10125758 0.11818808
## total_fat -0.23582437 -0.21320868 0.527635012 -0.02897626 -0.19367552
## cholesterol -0.14406565 -0.07095549 -0.083396784 0.11524353 -0.48231343
## sodium -0.07242794 0.05679978 -0.007889995 0.45141426 0.54124791
## choline 0.08599310 -0.25004100 -0.159103902 0.09732101 0.13254513
## folate -0.06495034 -0.14543061 -0.258987933 0.24122284 0.13642134
## folic_acid -0.13134323 -0.08301567 0.076413442 0.06380616 0.26993769
## niacin -0.32906420 -0.17915130 -0.195692392 -0.03069413 -0.05073449
## pantothenic_acid -0.34078549 -0.33158394 -0.318295065 -0.06362482 -0.32547014
## riboflavin -0.33305249 -0.31137478 -0.199353682 0.10339677 0.19715379
## thiamin -0.33561653 -0.23588007 0.072206066 -0.13791248 0.25977458
## vitamin_a 0.16347016 -0.28278460 0.119599878 0.48628498 -0.15285821
## vitamin_a_rae 0.19134693 -0.26902804 0.123719670 0.28297197 -0.15056768
## carotene_alpha 0.39479334 -0.51290330 -0.080669257 -0.55862037 0.22414125
## carotene_beta 0.38444641 -0.36499637 0.153220618 0.19187597 -0.05260858
## PC6 PC7 PC8 PC9 PC10
## calories -0.11018887 0.0015081457 -0.17795499 0.23353763 -0.150158468
## total_fat 0.16360123 -0.3633090618 0.13287123 0.01079130 -0.003831069
## cholesterol 0.25331796 -0.2862792525 -0.14395705 -0.34627570 0.160173404
## sodium 0.61187040 -0.1006024839 -0.12870623 0.06642504 -0.197334750
## choline -0.33757182 -0.2818999891 -0.53782974 -0.21267369 -0.452220572
## folate -0.32417983 -0.5782226879 0.29207545 0.46858222 0.239061362
## folic_acid -0.20444712 -0.1528121378 -0.05524841 -0.43491158 0.305525877
## niacin 0.04544823 -0.0792876673 0.03645436 -0.34154796 -0.015108050
## pantothenic_acid 0.13388771 0.1472416164 -0.19001373 0.42365893 -0.261301142
## riboflavin 0.06813964 0.3736129060 0.08052513 -0.04563109 0.300898060
## thiamin -0.20511282 0.2170974335 0.07617543 -0.07460581 0.045302111
## vitamin_a -0.19365595 0.2226183843 0.51395177 -0.19231392 -0.435943468
## vitamin_a_rae 0.23939104 0.0009782173 -0.02496509 0.03298645 0.254585345
## carotene_alpha 0.29378543 -0.1662290179 0.22669196 -0.06557853 -0.089406364
## carotene_beta -0.11775157 0.2165419950 -0.41217350 0.14461563 0.365908493
## PC11 PC12 PC13 PC14 PC15
## calories 0.39281872 -0.0568560056 0.003348553 -0.46128505 0.15923312
## total_fat -0.50445328 -0.0388447490 0.130045184 0.35883322 0.06309202
## cholesterol 0.14770521 -0.0999371085 0.182675741 -0.39897968 -0.42962401
## sodium -0.02644088 0.1726832757 0.056122630 -0.02759144 -0.12604398
## choline -0.05168794 -0.3377800006 0.064635287 0.15066951 0.04084531
## folate 0.10825041 0.0214699008 0.092987247 -0.08585684 -0.03314396
## folic_acid -0.24623533 0.2237656657 -0.615926459 -0.22635639 0.03284080
## niacin 0.35775766 0.4551513597 0.254036157 0.20296550 0.50036287
## pantothenic_acid -0.17104162 0.2110314131 -0.400628883 -0.05602250 -0.04145774
## riboflavin -0.23608888 -0.5097680503 0.234584878 -0.18694313 0.23217324
## thiamin 0.22600781 0.0677936316 0.056887857 0.37119040 -0.66384344
## vitamin_a -0.02850743 0.0881389911 -0.011192010 -0.17369311 -0.04530423
## vitamin_a_rae 0.45707234 -0.3219781708 -0.429306958 0.36978893 0.12446879
## carotene_alpha 0.03368478 -0.0005076429 -0.008356985 -0.18031789 -0.03033686
## carotene_beta -0.13695630 0.4040161396 0.294216556 -0.02660041 -0.03486261
summary(pca)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 1.8458 1.5338 1.20214 1.12807 1.07323 1.02894 0.98203
## Proportion of Variance 0.2167 0.1496 0.09192 0.08094 0.07326 0.06734 0.06134
## Cumulative Proportion 0.2167 0.3663 0.45826 0.53920 0.61247 0.67981 0.74115
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 0.91597 0.80904 0.77488 0.69297 0.65712 0.64575 0.61072
## Proportion of Variance 0.05337 0.04163 0.03819 0.03055 0.02747 0.02652 0.02372
## Cumulative Proportion 0.79452 0.83615 0.87434 0.90489 0.93236 0.95888 0.98260
## PC15
## Standard deviation 0.5230
## Proportion of Variance 0.0174
## Cumulative Proportion 1.0000
As we can see, the first two principal components explain around 37% of the total variance. Now, we need to determine the proper number of principal components. We will do that by using two different approaches.
The first one is based on the scree plot.
fviz_eig(pca, choice='eigenvalue')
This plot presents the eigenvalue of each principal component. Kaiser’s rule tells us to keep all components with eigenvalues greater than 1 (Rea 2016). Here, the first six components have eigenvalues greater than 1, therefore 6 components should be chosen. The second approach requires the chosen components to explain at least 70% of the variance.
#table for each component
eig.val<-get_eigenvalue(pca)
eig.val
## eigenvalue variance.percent cumulative.variance.percent
## Dim.1 3.4068344 21.669941 21.66994
## Dim.2 2.3525755 14.964089 36.63403
## Dim.3 1.4451359 9.192114 45.82614
## Dim.4 1.2725461 8.094317 53.92046
## Dim.5 1.1518316 7.326485 61.24695
## Dim.6 1.0587091 6.734158 67.98110
## Dim.7 0.9643735 6.134115 74.11522
## Dim.8 0.8389999 5.336648 79.45187
## Dim.9 0.6545461 4.163388 83.61526
## Dim.10 0.6004342 3.819198 87.43445
## Dim.11 0.4802137 3.054508 90.48896
## Dim.12 0.4318052 2.746595 93.23556
## Dim.13 0.4169950 2.652391 95.88795
## Dim.14 0.3729838 2.372448 98.26039
## Dim.15 0.2734916 1.739605 100.00000
We can see that in order to explain at least 70% of the total variance, we should take into account 7 components (74% of variance explained). Thus, I have decided to choose 7 principal components for further analysis.
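Both selection rules can be applied programmatically. The sketch below copies the eigenvalues from the table above into a vector and counts the components chosen by each rule. The variable names are arbitrary.

```r
# Eigenvalues copied from the get_eigenvalue() table.
eig <- c(3.4068344, 2.3525755, 1.4451359, 1.2725461, 1.1518316,
         1.0587091, 0.9643735, 0.8389999, 0.6545461, 0.6004342,
         0.4802137, 0.4318052, 0.4169950, 0.3729838, 0.2734916)

# Kaiser's rule: keep components with eigenvalue greater than 1.
n_kaiser <- sum(eig > 1)

# Variance rule: smallest number of components explaining at least 70%.
cum_share <- cumsum(eig) / sum(eig)
n_70 <- which(cum_share >= 0.70)[1]

print(c(kaiser = n_kaiser, seventy_percent = n_70))
# kaiser = 6, seventy_percent = 7
```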
fviz_pca_var(pca, col.var="contrib", gradient.cols = wes_palette("Rushmore1"))
We can see that carotene alpha, carotene beta and sodium are amongst the most influential variables. The graphs below show the contribution of individual variables to each of the seven principal components.
# contributions of individual variables to PC
var<-get_pca_var(pca)
a<-fviz_contrib(pca, "var", color="navy blue", axes=1, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1", title = NULL)
b<-fviz_contrib(pca, "var", color="navy blue", axes=2, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1")
c<-fviz_contrib(pca, "var", color="navy blue", axes=3, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1")
d<-fviz_contrib(pca, "var", color="navy blue", axes=4, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1")
e<-fviz_contrib(pca, "var", color="navy blue", axes=5, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1")
f<-fviz_contrib(pca, "var", color="navy blue", axes=6, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1")
g<-fviz_contrib(pca, "var", color="navy blue", axes=7, xtickslab.rt=90, ggtheme=theme_classic(), palette="Set1")
grid.arrange(a,b,c,d,e,f,g, top='Contribution to the first seven Principal Components')
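The percentages shown in such contribution plots can be approximated by hand. As far as I understand factoextra, the contribution of a variable to a component is its squared loading expressed as a percentage, and the reference line marks the value expected if all variables contributed equally. The sketch below computes both on synthetic data and is an illustration under that assumption, not a reproduction of the package internals.

```r
set.seed(5)
# Synthetic standardised data with 4 variables.
X <- scale(matrix(rnorm(120 * 4), ncol = 4,
                  dimnames = list(NULL, c("w", "x", "y", "z"))))
p <- prcomp(X)

# Each loading vector has unit length, so squared loadings sum to 1 per component.
contrib <- 100 * p$rotation^2
print(round(contrib, 1))
print(colSums(contrib))   # each column sums to 100

# Contribution expected if all variables contributed equally.
expected_avg <- 100 / ncol(X)
print(expected_avg)
```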
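To draw meaningful conclusions from the performed analysis, it is useful to apply the rotated PCA. Rotation (here varimax) redistributes the loadings so that each variable loads strongly on as few components as possible, which makes it easier to analyse the contribution of the individual variables to the components.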
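The effect of the rotation can be sketched with base R alone, using stats::varimax() on loadings computed from the correlation matrix. This is an illustration on synthetic data with two built-in groups of variables, not a reimplementation of psych::principal().

```r
set.seed(11)
n <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
# Synthetic data: a1, a2 driven by f1 and b1, b2 driven by f2.
X <- cbind(a1 = f1 + rnorm(n, sd = 0.6), a2 = f1 + rnorm(n, sd = 0.6),
           b1 = f2 + rnorm(n, sd = 0.6), b2 = f2 + rnorm(n, sd = 0.6))

# Unrotated component loadings: eigenvectors scaled by sqrt(eigenvalue).
e <- eigen(cor(X), symmetric = TRUE)
k <- 2
L <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(L) <- colnames(X)

# Varimax rotation of the loadings.
rot <- varimax(L)
L_rot <- unclass(rot$loadings)

print(round(L, 2))      # unrotated loadings
print(round(L_rot, 2))  # rotated loadings

# An orthogonal rotation leaves the communalities unchanged.
h2_before <- rowSums(L^2)
h2_after <- rowSums(L_rot^2)
print(round(cbind(h2_before, h2_after), 3))
```

After rotation each variable should load mainly on one component, while the communalities (row sums of squared loadings) stay the same.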
#rotated pca
pca_rotated<-principal(nutritiontrim, nfactors=7, rotate="varimax")
pca_rotated
## Principal Components Analysis
## Call: principal(r = nutritiontrim, nfactors = 7, rotate = "varimax")
## Standardized loadings (pattern matrix) based upon correlation matrix
## RC1 RC2 RC3 RC7 RC6 RC4 RC5 h2 u2 com
## calories 0.15 -0.08 0.83 -0.07 -0.03 -0.10 -0.01 0.74 0.260 1.1
## total_fat 0.11 -0.06 0.78 0.13 0.01 0.32 -0.03 0.74 0.264 1.4
## cholesterol 0.15 -0.08 0.07 -0.02 0.02 0.90 0.00 0.84 0.163 1.1
## sodium 0.04 -0.09 -0.03 -0.01 0.03 -0.01 0.95 0.91 0.088 1.0
## choline 0.04 0.59 -0.08 -0.15 0.57 0.08 -0.04 0.70 0.296 2.2
## folate 0.12 -0.16 -0.17 0.30 0.73 0.08 -0.07 0.71 0.287 1.7
## folic_acid 0.11 0.00 0.39 -0.16 0.60 -0.14 0.19 0.61 0.389 2.3
## niacin 0.70 -0.15 0.11 -0.13 0.19 0.29 0.00 0.66 0.337 1.8
## pantothenic_acid 0.76 0.03 0.00 -0.02 -0.04 0.31 -0.12 0.69 0.309 1.4
## riboflavin 0.82 -0.03 0.08 0.10 0.05 -0.08 0.19 0.74 0.257 1.2
## thiamin 0.71 -0.07 0.42 -0.07 0.13 -0.29 -0.06 0.79 0.215 2.1
## vitamin_a 0.00 0.15 0.00 0.85 0.09 -0.08 -0.06 0.76 0.243 1.1
## vitamin_a_rae -0.10 0.51 0.06 0.55 -0.12 0.26 0.24 0.71 0.292 3.0
## carotene_alpha 0.01 0.75 -0.10 0.05 -0.08 -0.10 -0.11 0.60 0.405 1.1
## carotene_beta -0.19 0.72 -0.02 0.35 -0.02 -0.08 -0.02 0.69 0.314 1.6
##
## RC1 RC2 RC3 RC7 RC6 RC4 RC5
## SS loadings 2.38 1.79 1.70 1.34 1.31 1.31 1.07
## Proportion Var 0.16 0.12 0.11 0.09 0.09 0.09 0.07
## Cumulative Var 0.16 0.28 0.39 0.48 0.57 0.65 0.73
## Proportion Explained 0.22 0.16 0.16 0.12 0.12 0.12 0.10
## Cumulative Proportion 0.22 0.38 0.54 0.66 0.78 0.90 1.00
##
## Mean item complexity = 1.6
## Test of the hypothesis that 7 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0.07
## with the empirical chi square 193.7 with prob < 6.3e-30
##
## Fit based upon off diagonal values = 0.86
summary(pca_rotated)
##
## Factor analysis with Call: principal(r = nutritiontrim, nfactors = 7, rotate = "varimax")
##
## Test of the hypothesis that 7 factors are sufficient.
## The degrees of freedom for the model is 21 and the objective function was 2.11
## The number of observations was 175 with Chi Square = 345.77 with prob < 1.4e-60
##
## The root mean square of the residuals (RMSA) is 0.07
# we can try to look at the data and group nutritional values into groups
#threshold was chosen arbitrarily at the 0.45 level
print(loadings(pca_rotated), digits=3, cutoff=0.45, sort=TRUE)
##
## Loadings:
## RC1 RC2 RC3 RC7 RC6 RC4 RC5
## niacin 0.699
## pantothenic_acid 0.759
## riboflavin 0.824
## thiamin 0.708
## choline 0.588 0.566
## carotene_alpha 0.746
## carotene_beta 0.723
## calories 0.835
## total_fat 0.777
## vitamin_a 0.847
## vitamin_a_rae 0.508 0.549
## folate 0.734
## folic_acid 0.604
## cholesterol 0.895
## sodium 0.949
##
## RC1 RC2 RC3 RC7 RC6 RC4 RC5
## SS loadings 2.375 1.785 1.696 1.337 1.311 1.305 1.072
## Proportion Var 0.158 0.119 0.113 0.089 0.087 0.087 0.071
## Cumulative Var 0.158 0.277 0.390 0.480 0.567 0.654 0.725
My knowledge of nutritional values is limited and the results are hard to assess without expert knowledge. Still, we can see, for example, that the amount of total fat is connected with the amount of calories, and that choline occurs together with folate and folic acid (which means that vitamins B4 and B9 can often be found in the same products).
Finally, we can visualize the results.
#visualisation of pca
# let's choose matching colors - all colors can be displayed by the function colors()
fviz_pca_ind(pca, col.ind="cos2", geom="point", gradient.cols=c("darkslategray1", "skyblue", "blue1", "navy blue" ))
#3D plot of observations (food items)
pca3d(pca, palette=c("darkslategray1", "salmon", "navy blue" ))
## [1] 0.09998575 0.09066615 0.10272537
## Creating new device
rglwidget()
The 3D plot is not very informative on its own, therefore we shall assign the points to the clusters obtained in the paper on clustering. In that paper, the most effective way to group the data was the fuzzy k-means algorithm with 3 clusters.
# 3D plot with points assigned to clusters
fuzzykm <- fcm(nutritiontrim, centers=3, m=1.5)
fuzzykm2 <- ppclust2(fuzzykm, "kmeans")
pca3d(pca, group=fuzzykm2$cluster, palette=c("darkslategray1", "salmon", "navy blue" ))
## [1] 0.09998575 0.09066615 0.10272537
rglwidget()
We can see that the food items fall into three main groups, and that these groups correspond to the previously obtained clusters. Each group is mostly described by a different principal component.
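One way to check numerically whether each cluster is described by a different component is to compare the mean component scores per cluster. The sketch below does this on synthetic data and uses plain k-means from base R as a stand-in for the fuzzy k-means algorithm. Both the data and the substitution are assumptions made for illustration.

```r
set.seed(21)
# Synthetic data: three groups of 50 points, each shifted along a different axis.
centers <- rbind(c(4, 0, 0, 0), c(0, 4, 0, 0), c(0, 0, 4, 0))
X <- do.call(rbind, lapply(1:3, function(i) {
  matrix(rnorm(50 * 4), ncol = 4) + matrix(centers[i, ], 50, 4, byrow = TRUE)
}))

p <- prcomp(X)

# Plain k-means used here instead of fuzzy k-means.
km <- kmeans(X, centers = 3, nstart = 20)

# Mean score on the first three components within each cluster.
cluster_means <- aggregate(p$x[, 1:3], by = list(cluster = km$cluster), FUN = mean)
print(round(cluster_means, 2))
```

Clusters whose mean scores differ strongly on a given component are the ones that component separates.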
The third method used to extract information from the data set and to group the variables is Weighted Least Squares Factor Analysis. In this approach we decompose the correlation matrix and calculate the communalities (for each variable, the sum of its squared loadings). Then weights are assigned to the correlation coefficients (between the original data and the matrix of created factors) in such a way that the most unique variables receive low weights. As a result, variables that share more variance with the others are valued more than highly unique ones (Revelle 2021).
To perform the Weighted Least Squares Factor Analysis I have used the fa() function from the psych package (loaded earlier), which relies on the GPArotation library for its default oblimin rotation. The fa.diagram() function visualizes the assignment of each variable to a factor.
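The relation between loadings, communalities and uniqueness can be illustrated with base R. The sketch below uses factanal(), which fits the model by maximum likelihood rather than weighted least squares, so it is a stand-in for the method described above. The data are synthetic, with two built-in factors.

```r
set.seed(8)
n <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
# Synthetic data: x1-x3 driven by the first factor, x4-x6 by the second.
X <- cbind(x1 = 0.8 * f1 + 0.6 * rnorm(n),
           x2 = 0.8 * f1 + 0.6 * rnorm(n),
           x3 = 0.8 * f1 + 0.6 * rnorm(n),
           x4 = 0.7 * f2 + 0.7 * rnorm(n),
           x5 = 0.7 * f2 + 0.7 * rnorm(n),
           x6 = 0.7 * f2 + 0.7 * rnorm(n))

# Maximum likelihood factor analysis (not WLS) with two factors.
fit <- factanal(X, factors = 2)
L <- unclass(fit$loadings)

# Communality: sum of squared loadings of each variable.
communality <- rowSums(L^2)

# Communality and uniqueness of each variable add up to roughly 1.
print(round(cbind(communality, uniqueness = fit$uniquenesses), 3))
```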
library(GPArotation)
#WLSFA for our data set
f3wtest <- fa(nutritiontrim, 3, n.obs = 175, fm="wls")
#visualization of results
fa.diagram(f3wtest)
On the diagram we can see the three groups that the variables were assigned to, as well as the loading of each variable on its factor. The first and largest group consists of variables such as thiamin, calories and total fat. The second one consists mainly of vitamins. In contrast to the previous method, sodium was assigned to a group of its own, while cholesterol and folate were not assigned to any group.
The analysis of 175 food items and 15 nutritional values was performed. After the initial data preparation, three methods were applied. The first one, Multidimensional Scaling, allowed us to reduce the 15-dimensional data set to two dimensions and plot the results. However, the obtained results were of poor quality. The second one, Principal Component Analysis, allowed us to reduce the number of variables in order to represent the original information with a smaller matrix. Two methods were used to determine the number of components. Moreover, the rotated PCA allowed us to distinguish nutritional elements that commonly occur within the same food items. Finally, we visualized the results on a 3D plot, using the results obtained in the paper on clustering to colour food items from the same clusters. The third method used was Weighted Least Squares Factor Analysis. This approach allowed us to group the variables into 3 categories, which differed from those obtained in the PCA.
Kruskal, J.B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27.
Rea, Alethea & Rea, William. (2016). How Many Components should be Retained from a Multivariate Time Series PCA?
Górniak, Jarosław. (1998). Analiza czynnikowa i analiza głównych składowych (see: https://kb.osu.edu/bitstream/handle/1811/69494/ASK_1998_83_102.pdf).
Revelle, William. (2021). How To: Use the psych package for Factor Analysis and data reduction. Department of Psychology, Northwestern University (see: https://cran.r-project.org/web/packages/psychTools/vignettes/factor.pdf).