The goal of this work is to practice the R language and to present unsupervised learning models. We cheat a little, since we do have the variable that categorizes the data. A second goal is to apply dimension-reduction methods to the data.
The data set contains information on different oils and has the following variables:
- palmitic: percentage of palmitic acid, a fatty acid with a 16-carbon chain. It is produced by a wide range of plants and organisms, generally at low levels.
- stearic: percentage of stearic acid, a saturated fatty acid with an 18-carbon chain. It occurs in many animal and vegetable fats, but is generally higher in animal fat.
- oleic: percentage of oleic acid, a monounsaturated omega-9 fatty acid typical of vegetable oils.
- linoleic: percentage of linoleic acid, an essential polyunsaturated omega-6 fatty acid.
- linolenic: percentage of linolenic acid, an essential polyunsaturated omega-3 fatty acid found in plant sources.
- eicosanoic: percentage of eicosanoic (arachidic) acid, a saturated fatty acid that is a constituent of peanut oil.
- eicosenoic: percentage of eicosenoic (gondoic) acid, a monounsaturated omega-9 fatty acid found in vegetable oils and nuts.
- class: type of oil, one of pumpkin, sunflower, soybean, corn, rapeseed, peanut, or olive.
It has 96 records, with no missing values.
## [1] 96 8
## Rows: 96
## Columns: 8
## $ palmitic <dbl> 9.7, 11.1, 11.5, 10.0, 12.2, 9.8, 10.5, 10.5, 11.5, 10.0, 1…
## $ stearic <dbl> 5.2, 5.0, 5.2, 4.8, 5.0, 4.2, 5.0, 5.0, 5.2, 4.8, 5.0, 4.4,…
## $ oleic <dbl> 31.0, 32.9, 35.0, 30.4, 31.1, 43.0, 31.8, 31.8, 35.0, 30.4,…
## $ linoleic <dbl> 52.7, 49.8, 47.2, 53.5, 50.5, 39.2, 51.3, 51.3, 47.2, 53.5,…
## $ linolenic <dbl> 0.4, 0.3, 0.2, 0.3, 0.3, 2.4, 0.4, 0.4, 0.2, 0.3, 0.3, 2.3,…
## $ eicosanoic <dbl> 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4,…
## $ eicosenoic <dbl> 0.1, 0.1, 0.1, 0.1, 0.1, 0.5, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5,…
## $ class <fct> pumpkin, pumpkin, pumpkin, pumpkin, pumpkin, pumpkin, pumpk…
## # A tibble: 6 × 8
## palmitic stearic oleic linoleic linolenic eicosanoic eicosenoic class
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
## 1 9.7 5.2 31 52.7 0.4 0.4 0.1 pumpkin
## 2 11.1 5 32.9 49.8 0.3 0.4 0.1 pumpkin
## 3 11.5 5.2 35 47.2 0.2 0.4 0.1 pumpkin
## 4 10 4.8 30.4 53.5 0.3 0.4 0.1 pumpkin
## 5 12.2 5 31.1 50.5 0.3 0.4 0.1 pumpkin
## 6 9.8 4.2 43 39.2 2.4 0.4 0.5 pumpkin
## palmitic stearic oleic linoleic
## Min. : 4.50 Min. :1.700 Min. :22.80 Min. : 7.90
## 1st Qu.: 6.20 1st Qu.:3.475 1st Qu.:26.30 1st Qu.:43.10
## Median : 9.85 Median :4.200 Median :30.70 Median :50.80
## Mean : 9.04 Mean :4.200 Mean :36.73 Mean :46.49
## 3rd Qu.:11.12 3rd Qu.:5.000 3rd Qu.:38.62 3rd Qu.:58.08
## Max. :14.90 Max. :6.700 Max. :76.70 Max. :66.10
##
## linolenic eicosanoic eicosenoic class
## Min. :0.100 Min. :0.100 Min. :0.1000 corn : 2
## 1st Qu.:0.375 1st Qu.:0.100 1st Qu.:0.1000 olive : 7
## Median :0.800 Median :0.400 Median :0.1000 peanut : 3
## Mean :2.272 Mean :0.399 Mean :0.3115 pumpkin :37
## 3rd Qu.:2.650 3rd Qu.:0.400 3rd Qu.:0.3000 rapeseed :10
## Max. :9.500 Max. :2.800 Max. :1.8000 soybean :11
## sunflower:26
There are 7 classes. The most common is pumpkin (37 records) and the least common is corn (2 records).
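The class frequencies can be checked directly from the summary output above. A minimal sketch, with the counts copied by hand into a named vector (`class_counts` is an illustrative name, not part of the original analysis):

```r
# Class frequencies copied from the summary() output above.
class_counts <- c(corn = 2, olive = 7, peanut = 3, pumpkin = 37,
                  rapeseed = 10, soybean = 11, sunflower = 26)

sum(class_counts)               # total number of records
names(which.max(class_counts))  # most frequent class
names(which.min(class_counts))  # least frequent class
round(100 * class_counts / sum(class_counts), 1)  # share of each class (%)
```

The strong imbalance between classes is worth keeping in mind when the clusters are compared with the known labels later on.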
## [1] "mean"
## # A tibble: 7 × 8
## class palmitic stearic oleic linoleic linolenic eicosanoic eicosenoic
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 corn 10.4 2.05 33.6 51.3 1.55 0.5 0.4
## 2 olive 11.5 2.79 72.6 10.7 1.16 0.2 0.214
## 3 peanut 9.77 3.33 59 20.8 0.167 1.5 1.43
## 4 pumpkin 11.0 5.37 33.3 48.8 0.873 0.403 0.162
## 5 rapeseed 5.22 1.95 58.2 23.8 8.38 0.42 0.96
## 6 soybean 10.5 3.97 25.7 52.2 6.72 0.327 0.227
## 7 sunflower 6.27 4.14 26.0 61.7 0.631 0.335 0.2
## [1] "min"
## # A tibble: 7 × 8
## class palmitic stearic oleic linoleic linolenic eicosanoic eicosenoic
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 corn 10 1.8 30.2 47.1 0.9 0.5 0.3
## 2 olive 9.3 2.6 65 7.9 0.6 0.1 0.1
## 3 peanut 9.6 3.3 57.7 20.5 0.1 1.5 1.2
## 4 pumpkin 7.6 4.2 25.8 39.2 0.2 0.1 0.1
## 5 rapeseed 4.5 1.7 52.2 18.6 6.8 0.1 0.1
## 6 soybean 9.6 3.5 23.1 49.2 5.5 0.1 0.1
## 7 sunflower 5.6 2.8 22.8 57.7 0.1 0.1 0.1
## [1] "max"
## # A tibble: 7 × 8
## class palmitic stearic oleic linoleic linolenic eicosanoic eicosenoic
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 corn 10.7 2.3 36.9 55.5 2.2 0.5 0.5
## 2 olive 14.9 3 76.7 17 3.9 0.5 0.7
## 3 peanut 10 3.4 60 21.3 0.2 1.5 1.8
## 4 pumpkin 13.1 6.7 43.3 56.1 4.2 1.3 0.7
## 5 rapeseed 6.2 2.3 64.9 29 9.5 0.8 1.6
## 6 soybean 11.9 4.3 30.3 55.1 7.8 0.7 0.9
## 7 sunflower 7.2 4.9 29.9 66.1 1.7 2.8 0.9
## [1] "sd"
## # A tibble: 7 × 8
## class palmitic stearic oleic linoleic linolenic eicosanoic eicosenoic
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 corn 0.495 0.354 4.74 5.94 0.919 0 0.141
## 2 olive 1.73 0.135 4.37 3.28 1.21 0.173 0.227
## 3 peanut 0.208 0.0577 1.18 0.416 0.0577 0 0.321
## 4 pumpkin 1.29 0.618 4.17 4.35 0.912 0.252 0.134
## 5 rapeseed 0.496 0.201 4.19 3.88 0.937 0.286 0.669
## 6 soybean 0.638 0.272 1.85 1.81 0.844 0.200 0.245
## 7 sunflower 0.342 0.421 1.87 2.11 0.468 0.518 0.179
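The four tables above report one statistic per class and variable. One way to build tables of this shape in base R is sketched below; it is not necessarily how the original output was generated. The built-in `iris` data and its `Species` column are stand-ins for the oils data and its `class` column, and `stats_by_class` is a hypothetical helper name.

```r
# Stand-in data: iris replaces the oils data, Species replaces class.
stats_by_class <- function(data, group, fun) {
  # Apply `fun` to every numeric column within each level of `group`.
  numeric_cols <- data[sapply(data, is.numeric)]
  aggregate(numeric_cols, by = setNames(list(data[[group]]), group), FUN = fun)
}

for (fun_name in c("mean", "min", "max", "sd")) {
  print(fun_name)
  print(stats_by_class(iris, "Species", match.fun(fun_name)))
}
```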
In the next section we review dimension-reduction methods.
Several correlations are far from zero, which suggests that principal component analysis will be useful.
## [1] 0.4558454 0.6664997 0.8452196 0.9384298 0.9727448 0.9998312 1.0000000
The output of the prcomp function includes three main components:
- sdev, the standard deviations of the principal components (the square roots of the eigenvalues),
- rotation, the eigenvectors (loadings),
- x, the transformed data (scores).
Outside this function, the eigenvalues can be computed in the usual way from the covariance matrix of the data (or from the correlation matrix if the variables are scaled). To obtain x, we multiply the centered matrix of the original data by the matrix of eigenvectors (that is, centered matrix %*% eigenvectors).
From the cumulative proportions above, we conclude that 3 principal components explain about 84.5% of the variability in the data, which is more than 80%.
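The relationship between the prcomp components can be checked numerically. The sketch below uses the numeric columns of the built-in `iris` data as a stand-in for the oils data, and assumes the PCA is run with `center = TRUE` and `scale. = TRUE` (the settings behind the original output are not shown).

```r
# Stand-in data: four numeric iris columns instead of the seven fatty acids.
X <- as.matrix(iris[, 1:4])
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# sdev holds standard deviations; squaring them gives the eigenvalues
# of the correlation matrix (because the data were scaled).
eigenvalues <- pca$sdev^2
all.equal(eigenvalues, eigen(cor(X))$values)

# Scores: centered and scaled data multiplied by the eigenvector matrix.
scores_manual <- scale(X) %*% pca$rotation
all.equal(scores_manual, pca$x, check.attributes = FALSE)

# Cumulative proportion of variance explained by the first k components.
cum_var <- cumsum(eigenvalues) / sum(eigenvalues)
cum_var
which(cum_var >= 0.80)[1]  # smallest k reaching 80% of the variance
```

Applied to the oils data, `cum_var` would correspond to the vector of cumulative proportions printed above.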
## Warning: No shared levels found between `names(values)` of the manual scale and the
## data's colour values.
We chose these parameters because they produce the most clearly separated groups.
#### AGNES methods
## *** : The Hubert index is a graphical method of determining the number of clusters.
## In the plot of Hubert index, we seek a significant knee that corresponds to a
## significant increase of the value of the measure i.e the significant peak in Hubert
## index second differences plot.
##
## *** : The D index is a graphical method of determining the number of clusters.
## In the plot of D index, we seek a significant knee (the significant peak in Dindex
## second differences plot) that corresponds to a significant increase of the value of
## the measure.
##
## *******************************************************************
## * Among all indices:
## * 1 proposed 2 as the best number of clusters
## * 7 proposed 3 as the best number of clusters
## * 4 proposed 4 as the best number of clusters
## * 3 proposed 5 as the best number of clusters
## * 5 proposed 8 as the best number of clusters
## * 3 proposed 10 as the best number of clusters
##
## ***** Conclusion *****
##
## * According to the majority rule, the best number of clusters is 3
##
##
## *******************************************************************
We therefore choose k = 3.
Next we look at how these groups relate to our original classes.
##
##     corn olive peanut pumpkin rapeseed soybean sunflower
##   1    0     7      3       0       10       0         0
##   2    1     0      0       5        0       9        26
##   3    1     0      0      32        0       2         0
##
##  1  2  3
## 20 41 35
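A cross-tabulation like the one above can be produced by cutting an agglomerative tree at k = 3 and comparing the groups with the known labels. The sketch below swaps in base R's `hclust` with Ward linkage for AGNES and uses the built-in `iris` data as a stand-in, so its numbers will not match the table above.

```r
X <- scale(iris[, 1:4])                      # standardize the numeric variables
tree <- hclust(dist(X), method = "ward.D2")  # agglomerative clustering
groups <- cutree(tree, k = 3)                # cut the dendrogram into 3 groups

table(groups, iris$Species)  # groups (rows) vs. known classes (columns)
table(groups)                # group sizes
```

Rows dominated by a single column indicate groups that line up well with one known class.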
Now we look at the same result using another dimension-reduction method.
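library(dplyr)    # provides %>% and filter()
library(ggplot2)  # provides ggplot() and friends

# Keep only the rows for the chosen parameters
d6 = d %>% filter(min_dist == 0.5 & n_neighbors == 15)
# Attach the cluster labels so the points can be coloured by group
d6$kmeans = datos3d$kmeans
d6 %>%
  ggplot(aes(x = x, y = y, col = kmeans)) +
  geom_point() +
  theme_bw() +
  scale_color_manual(values = c('#740938',
                                '#133E87',
                                '#FFBF61'
  )) +
  labs(title = 'Final choice',
       subtitle = 'min_dist: 0.5 n_neighbors: 15')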
## Warning: Some values were outside the color scale and will be treated as NA
## Some values were outside the color scale and will be treated as NA
##
##         corn olive peanut pumpkin rapeseed soybean sunflower
##   FALSE    0     6      3       0        6       0         1
##   TRUE     2     1      0      37        4      11        25
##
## FALSE  TRUE
##    16    80
##
##      corn     olive    peanut   pumpkin  rapeseed   soybean sunflower
##         2         7         3        37        10        11        26
##
##          1  2  3
##   FALSE 15  1  0
##   TRUE   5 40 35
## 70% 75% 80% 90% 95% 99% 99.5% 99.9%
## 3.520937 4.029888 4.341659 7.804358 10.404326 16.801245 18.609567 20.918081
## 70% 75% 80% 90% 95% 99% 99.5% 99.9%
## 4.860345 5.728001 6.677724 10.745458 14.297523 21.492696 22.703095 24.366512
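The two quantile rows above look like summaries of nearest-neighbour distances, a common way to pick eps for DBSCAN; that reading is an assumption, since the code is not shown. A base R sketch of the idea, with `iris` as stand-in data and `knn_distance` as a hypothetical helper:

```r
# Distance from each point to its k-th nearest neighbour (self excluded).
knn_distance <- function(X, k) {
  D <- as.matrix(dist(X))
  # After sorting, position 1 is the zero distance of the point to itself.
  apply(D, 1, function(row) sort(row)[k + 1])
}

X <- as.matrix(iris[, 1:4])
probs <- c(0.70, 0.75, 0.80, 0.90, 0.95, 0.99, 0.995, 0.999)

# Upper quantiles of the k-NN distance suggest candidate eps values.
quantile(knn_distance(X, k = 8), probs = probs)
quantile(knn_distance(X, k = 14), probs = probs)
```

A larger k gives larger distances, so eps and minPts should be chosen together.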
## DBSCAN clustering for 96 objects.
## Parameters: eps = 6.5, minPts = 14
## Using euclidean distances and borderpoints = TRUE
## The clustering contains 1 cluster(s) and 20 noise points.
##
## 0 1
## 20 76
##
## Available fields: cluster, eps, minPts, metric, borderPoints
## DBSCAN clustering for 96 objects.
## Parameters: eps = 4.4, minPts = 8
## Using euclidean distances and borderpoints = TRUE
## The clustering contains 1 cluster(s) and 22 noise points.
##
## 0 1
## 22 74
##
## Available fields: cluster, eps, minPts, metric, borderPoints
## [1] 19.87461
## [1] 17.70282
## [1] 33.46102
## [1] 23.67667
## [1] 32
## DBSCAN clustering for 96 objects.
## Parameters: eps = 25, minPts = 32
## Using euclidean distances and borderpoints = TRUE
## The clustering contains 1 cluster(s) and 11 noise points.
##
## 0 1
## 11 85
##
## Available fields: cluster, eps, minPts, metric, borderPoints
## DBSCAN clustering for 96 objects.
## Parameters: eps = 25, minPts = 32
## Using euclidean distances and borderpoints = TRUE
## The clustering contains 1 cluster(s) and 0 noise points.
##
## 1
## 96
##
## Available fields: cluster, eps, minPts, metric, borderPoints
## HDBSCAN clustering for 96 objects.
## Parameters: minPts = 32
## The clustering contains 0 cluster(s) and 96 noise points.
##
## 0
## 96
##
## Available fields: cluster, minPts, coredist, cluster_scores,
## membership_prob, outlier_scores, hc
## HDBSCAN clustering for 96 objects.
## Parameters: minPts = 14
## The clustering contains 2 cluster(s) and 43 noise points.
##
## 0 1 2
## 43 25 28
##
## Available fields: cluster, minPts, coredist, cluster_scores,
## membership_prob, outlier_scores, hc
## HDBSCAN clustering for 96 objects.
## Parameters: minPts = 8
## The clustering contains 2 cluster(s) and 5 noise points.
##
## 0 1 2
## 5 15 76
##
## Available fields: cluster, minPts, coredist, cluster_scores,
## membership_prob, outlier_scores, hc
##
##     corn olive peanut pumpkin rapeseed soybean sunflower
##   0    2     7      3       9       10      11         1
##   1    0     0      0       0        0       0        25
##   2    0     0      0      28        0       0         0
##
##     corn olive peanut pumpkin rapeseed soybean sunflower
##   0    0     5      0       0        0       0         0
##   1    0     2      3       0       10       0         0
##   2    2     0      0      37        0      11        26
##
##      0  1  2
##   0  5 15 23
##   1  0  0 25
##   2  0  0 28
##
##     c1 c2 c3
##   0 20 14  9
##   1  0 25  0
##   2  0  2 26
##
##     c1 c2 c3
##   0  5  0  0
##   1 15  0  0
##   2  0 41 35
## # A tibble: 6 × 6
## class PC1 PC2 PC3 PC4 PC5
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 pumpkin -61.7 -7.14 1.74 -0.456 -0.355
## 2 pumpkin -60.7 -3.83 2.86 0.356 0.0972
## 3 pumpkin -60.0 -0.579 3.37 0.498 -0.0818
## 4 pumpkin -62.0 -8.09 1.88 -0.355 0.146
## 5 pumpkin -60.4 -5.64 3.63 1.27 0.412
## 6 pumpkin -58.2 10.7 0.728 0.417 -0.113
## # A tibble: 6 × 5
## PC1 PC2 PC3 PC4 PC5
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 -0.387 -0.394 0.546 -0.210 -0.506
## 2 -0.152 -0.238 0.896 0.111 0.145
## 3 0.0262 -0.0858 1.06 0.167 -0.113
## 4 -0.457 -0.438 0.588 -0.170 0.215
## 5 -0.0642 -0.323 1.14 0.471 0.598
## 6 0.461 0.442 0.228 0.135 -0.158
## OPTICS ordering/clustering for 96 objects.
## Parameters: minPts = 5, eps = 3.88284864127684, eps_cl = NA, xi = 0.09
## The clustering contains 8 cluster(s) and 0 noise points.
##
## Available fields: order, reachdist, coredist, predecessor, minPts, eps,
## eps_cl, xi, clusters_xi, cluster
##
## 1 2 3 4 5 6 7 8
## 26 2 2 12 23 11 10 10
## # A tibble: 160 × 4
## db G1 dunn clusters
## <dbl> <dbl> <dbl> <int>
## 1 1.15 -4.83 0 8
## 2 1.15 -4.83 0 8
## 3 1.15 -5.70 0 8
## 4 1.01 -6.62 0 7
## 5 1.01 -6.62 0 7
## 6 1.01 -6.67 0 7
## 7 1.01 -6.67 0 7
## 8 1.01 -6.67 0 7
## 9 1.01 -6.67 0 7
## 10 0.864 -7.73 0 6
## 11 0.864 -7.73 0 6
## 12 0.864 -7.73 0 6
## 13 0.864 -7.73 0 6
## 14 NaN Inf Inf 2
## 15 NaN Inf Inf 2
## 16 NaN Inf Inf 2
## 17 1.45 -3.03 0 12
## 18 1.45 -3.03 0 12
## 19 1.45 -3.03 0 12
## 20 1.45 -3.03 0 12
## 21 2.37 2.63 0 9
## 22 2.37 2.63 0 9
## 23 2.37 2.63 0 9
## 24 1.30 6.20 0 7
## 25 1.35 13.1 0 6
## 26 1.03 18.7 0 5
## 27 1.02 15.4 0 5
## 28 1.02 15.4 0 5
## 29 1.02 15.4 0 5
## 30 1.02 15.4 0 5
## 31 1.02 15.4 0 5
## 32 1.21 33.3 0 4
## 33 1.53 34.8 0 9
## 34 1.52 35.0 0 9
## 35 1.86 32.1 0 9
## 36 1.86 32.1 0 9
## 37 1.29 41.2 0 7
## 38 1.12 49.4 0.0733 5
## 39 1.12 49.4 0.0733 5
## 40 1.12 49.4 0.0733 5
## 41 0.924 37.4 0 4
## 42 0.924 37.4 0 4
## 43 0.924 37.4 0 4
## 44 0.924 37.4 0 4
## 45 0.924 37.4 0 4
## 46 0.924 37.4 0 4
## 47 0.924 37.4 0 4
## 48 0.924 37.4 0 4
## 49 1.24 -2.29 0 13
## 50 1.15 -2.54 0 12
## 51 1.15 -2.54 0 12
## 52 1.15 -2.54 0 12
## 53 1.15 -2.54 0 12
## 54 1.15 -2.54 0 12
## 55 1.19 -2.71 0 11
## 56 1.19 -2.71 0 11
## 57 0.526 -3.69 0 8
## 58 0.526 -3.69 0 8
## 59 0.526 -3.69 0 8
## 60 0.526 -3.69 0 8
## 61 0.526 -3.69 0 8
## 62 0.552 -4.04 0 8
## 63 0.374 -4.86 0.444 7
## 64 0.362 -4.65 0.444 6
## 65 1.62 41.4 0 10
## 66 1.62 41.1 0 10
## 67 1.72 37.7 0 11
## 68 1.72 37.7 0 11
## 69 1.64 36.2 0 10
## 70 1.64 36.2 0 10
## 71 1.98 23.4 0 9
## 72 1.98 23.4 0 9
## 73 1.98 23.4 0 9
## 74 2.82 26.8 0 8
## 75 2.82 26.8 0 8
## 76 2.82 26.8 0 8
## 77 2.82 26.8 0 8
## 78 2.55 30.2 0.0332 7
## 79 1.33 -3.90 0.140 7
## 80 1.33 -3.90 0.140 7
## 81 1.39 -3.80 0 10
## 82 1.33 -3.87 0 10
## 83 1.41 -5.10 0 9
## 84 1.41 -5.10 0 9
## 85 1.32 -6.00 0 8
## 86 1.32 -6.00 0 8
## 87 0.831 -6.75 0 7
## 88 0.831 -6.75 0 7
## 89 0.831 -6.75 0 7
## 90 0.831 -6.75 0 7
## 91 0.831 -6.75 0 7
## 92 0.831 -6.75 0 7
## 93 0.894 -7.88 0.0524 6
## 94 0.894 -7.88 0.0524 6
## 95 0.885 -7.85 0 6
## 96 0.885 -7.85 0 6
## 97 0.935 71.5 0 8
## 98 0.938 70.3 0 8
## 99 1.17 67.4 0 7
## 100 1.17 67.4 0 7
## # ℹ 60 more rows
## # A tibble: 11 × 2
## clusters n
## <int> <int>
## 1 2 3
## 2 4 17
## 3 5 17
## 4 6 16
## 5 7 31
## 6 8 26
## 7 9 21
## 8 10 15
## 9 11 4
## 10 12 9
## 11 13 1
## # A tibble: 480 × 3
## clusters variable value
## <int> <chr> <dbl>
## 1 8 db 1.15
## 2 8 db 1.15
## 3 8 db 1.15
## 4 7 db 1.01
## 5 7 db 1.01
## 6 7 db 1.01
## 7 7 db 1.01
## 8 7 db 1.01
## 9 7 db 1.01
## 10 6 db 0.864
## # ℹ 470 more rows
## Best BIC values:
## VEV,8 VEV,7 VEV,9
## BIC -641.4895 -648.43081 -683.59914
## BIC diff 0.0000 -6.94128 -42.10961
## ----------------------------------------------------
## Gaussian finite mixture model fitted by EM algorithm
## ----------------------------------------------------
##
## Mclust VEV (ellipsoidal, equal shape) model with 9 components:
##
## log-likelihood n df BIC ICL
## 285.7983 96 275 -683.5991 -683.6065
##
## Clustering table:
## 1 2 3 4 5 6 7 8 9
## 14 15 13 17 8 14 4 5 6
##
## Mixing probabilities:
## 1 2 3 4 5 6 7
## 0.14583310 0.15622485 0.13541667 0.17707819 0.08332876 0.14586842 0.04166667
## 8 9
## 0.05208333 0.06250000
##
## Means:
## [,1] [,2] [,3] [,4] [,5]
## palmitic 0.678001586 0.95660881 0.58084185 -1.09147352 -0.88782551
## stearic 0.711495172 1.18356316 -0.09886782 -0.09450105 0.36148137
## oleic -0.317784336 -0.12933146 -0.70773229 -0.65965900 -0.73148979
## linoleic 0.270891242 0.02685227 0.34170325 0.92746264 0.89973897
## linolenic -0.655717612 -0.43396761 1.36384333 -0.54634376 -0.57895596
## eicosanoic 0.002672754 -0.37366595 -0.15522530 -0.51048909 -0.09351621
## eicosenoic -0.481998285 -0.33766460 -0.21605271 -0.38749963 -0.24189617
## [,6] [,7] [,8] [,9]
## palmitic 0.3272693 -1.4056631 0.8670520 -1.5047547
## stearic -0.2923862 -1.7070150 -1.1085556 -1.8743694
## oleic 0.3765327 1.3582034 2.5795344 1.5085694
## linoleic -0.3870366 -1.3284660 -2.3659284 -1.5063227
## linolenic -0.4509769 1.8775913 -0.5294786 2.2216209
## eicosanoic 1.4500978 -0.5104960 -0.7670804 0.4303134
## eicosenoic 0.7051054 -0.2724654 -0.5169203 2.8239638
##
## Variances:
## [,,1]
## palmitic stearic oleic linoleic
## palmitic 6.494794e-02 -6.843086e-03 -1.229226e-04 -7.982912e-03
## stearic -6.843086e-03 1.484479e-02 6.916348e-03 -7.150058e-03
## oleic -1.229226e-04 6.916348e-03 1.041817e-02 -1.017566e-02
## linoleic -7.982912e-03 -7.150058e-03 -1.017566e-02 1.161607e-02
## linolenic -6.101915e-03 1.985611e-03 1.778975e-04 2.146903e-04
## eicosanoic 3.688522e-35 -1.177498e-35 -4.861202e-36 1.596136e-36
## eicosenoic -8.390667e-03 5.685010e-03 2.057716e-03 -1.586625e-03
## linolenic eicosanoic eicosenoic
## palmitic -6.101915e-03 3.688522e-35 -8.390667e-03
## stearic 1.985611e-03 -1.177498e-35 5.685010e-03
## oleic 1.778975e-04 -4.861202e-36 2.057716e-03
## linoleic 2.146903e-04 1.596136e-36 -1.586625e-03
## linolenic 2.147563e-03 -1.022264e-35 1.896699e-03
## eicosanoic -1.022264e-35 9.801348e-06 -1.293142e-35
## eicosenoic 1.896699e-03 -1.293142e-35 4.697768e-03
## [,,2]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.2379380 0.224655513 -0.16928156 0.14275640 -0.09425830
## stearic 0.2246555 0.348510263 -0.15161113 0.11796119 -0.14531702
## oleic -0.1692816 -0.151611133 0.13983328 -0.11825511 0.06204172
## linoleic 0.1427564 0.117961188 -0.11825511 0.10250578 -0.04833637
## linolenic -0.0942583 -0.145317015 0.06204172 -0.04833637 0.07591222
## eicosanoic -0.1021566 0.003550707 0.05467996 -0.06873647 -0.02814230
## eicosenoic -0.1445860 -0.167104098 0.09829037 -0.08734596 0.07653596
## eicosanoic eicosenoic
## palmitic -0.102156602 -0.14458598
## stearic 0.003550707 -0.16710410
## oleic 0.054679957 0.09829037
## linoleic -0.068736471 -0.08734596
## linolenic -0.028142304 0.07653596
## eicosanoic 0.358993611 0.09114322
## eicosenoic 0.091143220 0.13304574
## [,,3]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.062513456 0.022156999 0.02785286 -0.017769096 -0.12686684
## stearic 0.022156999 0.101292884 0.01034257 0.002358465 -0.11040346
## oleic 0.027852857 0.010342568 0.03924834 -0.027861957 -0.12678997
## linoleic -0.017769096 0.002358465 -0.02786196 0.025656483 0.06748354
## linolenic -0.126866835 -0.110403456 -0.12678997 0.067483539 0.56155261
## eicosanoic 0.002355782 -0.037174710 0.06948293 -0.088455775 -0.12053212
## eicosenoic 0.032404568 -0.135244153 0.08821772 -0.120218538 -0.06029618
## eicosanoic eicosenoic
## palmitic 0.002355782 0.03240457
## stearic -0.037174710 -0.13524415
## oleic 0.069482928 0.08821772
## linoleic -0.088455775 -0.12021854
## linolenic -0.120532121 -0.06029618
## eicosanoic 0.638848393 0.63086465
## eicosenoic 0.630864649 0.91011190
## [,,4]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.0100354658 0.010646932 -0.0006206978 0.0009732605 0.013408155
## stearic 0.0106469324 0.052652080 -0.0078137113 0.0098231233 0.004193706
## oleic -0.0006206978 -0.007813711 0.0385078740 -0.0456710834 0.046746614
## linoleic 0.0009732605 0.009823123 -0.0456710834 0.0574166217 -0.057306193
## linolenic 0.0134081552 0.004193706 0.0467466142 -0.0573061928 0.089682603
## eicosanoic -0.0370081943 -0.064546249 -0.0094036087 -0.0078354085 -0.071928901
## eicosenoic -0.0213444449 -0.067152342 0.0214746636 -0.0358466644 0.003871237
## eicosanoic eicosenoic
## palmitic -0.037008194 -0.021344445
## stearic -0.064546249 -0.067152342
## oleic -0.009403609 0.021474664
## linoleic -0.007835409 -0.035846664
## linolenic -0.071928901 0.003871237
## eicosanoic 0.303585200 0.153320987
## eicosenoic 0.153320987 0.132155425
## [,,5]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.059989510 0.0014756400 0.038414085 -0.0437109412 0.0013983941
## stearic 0.001475640 0.0130844612 0.001276909 -0.0009754307 0.0008428665
## oleic 0.038414085 0.0012769087 0.029421062 -0.0344982202 0.0050401904
## linoleic -0.043710941 -0.0009754307 -0.034498220 0.0424736428 -0.0077560807
## linolenic 0.001398394 0.0008428665 0.005040190 -0.0077560807 0.0070501263
## eicosanoic -0.012919820 -0.0245793570 0.001183505 -0.0116040966 0.0044039134
## eicosenoic -0.037277886 -0.0206364745 -0.010355252 -0.0031469818 0.0142821510
## eicosanoic eicosenoic
## palmitic -0.012919820 -0.037277886
## stearic -0.024579357 -0.020636475
## oleic 0.001183505 -0.010355252
## linoleic -0.011604097 -0.003146982
## linolenic 0.004403913 0.014282151
## eicosanoic 0.132500187 0.123350564
## eicosenoic 0.123350564 0.171453996
## [,,6]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.842755009 -0.17050235 0.6592271 -0.70129241 0.003352692
## stearic -0.170502349 1.34740807 -0.6662530 0.58744476 -0.078370864
## oleic 0.659227140 -0.66625296 1.9885539 -2.13349632 0.093102496
## linoleic -0.701292410 0.58744476 -2.1334963 2.31391601 -0.090931690
## linolenic 0.003352692 -0.07837086 0.0931025 -0.09093169 0.085831115
## eicosanoic -1.007270622 0.89351625 0.8855371 -1.22902567 -0.005768646
## eicosenoic -0.263874676 -0.35767166 1.7663914 -2.00665568 0.002265431
## eicosanoic eicosenoic
## palmitic -1.007270622 -0.263874676
## stearic 0.893516255 -0.357671664
## oleic 0.885537075 1.766391381
## linoleic -1.229025669 -2.006655676
## linolenic -0.005768646 0.002265431
## eicosanoic 7.487630418 4.001442223
## eicosenoic 4.001442223 3.529866165
## [,,7]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.025002204 0.006516459 -0.013716832 0.012766184 0.003972187
## stearic 0.006516459 0.012972375 -0.014471234 0.011113402 -0.009260494
## oleic -0.013716832 -0.014471234 0.031096863 -0.022413114 0.008093337
## linoleic 0.012766184 0.011113402 -0.022413114 0.018814678 -0.007616418
## linolenic 0.003972187 -0.009260494 0.008093337 -0.007616418 0.015142683
## eicosanoic -0.005439128 0.015006920 -0.005896383 0.011315925 -0.026855259
## eicosenoic -0.005449080 0.014155764 -0.006168275 0.010591188 -0.024461254
## eicosanoic eicosenoic
## palmitic -0.005439128 -0.005449080
## stearic 0.015006920 0.014155764
## oleic -0.005896383 -0.006168275
## linoleic 0.011315925 0.010591188
## linolenic -0.026855259 -0.024461254
## eicosanoic 0.063350330 0.059973940
## eicosenoic 0.059973940 0.058027297
## [,,8]
## palmitic stearic oleic linoleic
## palmitic 1.290500e-02 0.0003329687 -2.230888e-03 0.0020041144
## stearic 3.329687e-04 0.0033412449 -2.158170e-03 0.0021210625
## oleic -2.230888e-03 -0.0021581696 3.029151e-03 -0.0015274031
## linoleic 2.004114e-03 0.0021210625 -1.527403e-03 0.0019938988
## linolenic -8.025464e-05 -0.0007451578 -1.875934e-05 -0.0005113572
## eicosanoic 0.000000e+00 0.0000000000 0.000000e+00 0.0000000000
## eicosenoic 0.000000e+00 0.0000000000 0.000000e+00 0.0000000000
## linolenic eicosanoic eicosenoic
## palmitic -8.025464e-05 0.000000e+00 0.000000e+00
## stearic -7.451578e-04 0.000000e+00 0.000000e+00
## oleic -1.875934e-05 0.000000e+00 0.000000e+00
## linoleic -5.113572e-04 0.000000e+00 0.000000e+00
## linolenic 8.264923e-04 0.000000e+00 0.000000e+00
## eicosanoic 0.000000e+00 5.034339e-05 0.000000e+00
## eicosenoic 0.000000e+00 0.000000e+00 1.997399e-06
## [,,9]
## palmitic stearic oleic linoleic linolenic
## palmitic 0.006874564 -0.001287151 -0.009063949 0.005874276 -0.002011160
## stearic -0.001287151 0.004047970 -0.004442296 0.005421042 -0.007620419
## oleic -0.009063949 -0.004442296 0.022894849 -0.019555992 0.018265527
## linoleic 0.005874276 0.005421042 -0.019555992 0.019241713 -0.020152944
## linolenic -0.002011160 -0.007620419 0.018265527 -0.020152944 0.044387654
## eicosanoic 0.007938304 0.007777060 -0.018285124 0.012956265 -0.034900732
## eicosenoic -0.002117548 -0.003028478 0.012110401 -0.013792566 0.020370530
## eicosanoic eicosenoic
## palmitic 0.007938304 -0.002117548
## stearic 0.007777060 -0.003028478
## oleic -0.018285124 0.012110401
## linoleic 0.012956265 -0.013792566
## linolenic -0.034900732 0.020370530
## eicosanoic 0.168779677 0.028985557
## eicosenoic 0.028985557 0.029960841
## -----------------------------------------------------------------
## Dimension reduction for model-based clustering and classification
## -----------------------------------------------------------------
##
## Mixture model type: Mclust (VEV, 9)
##
## Clusters n
## 1 14
## 2 15
## 3 13
## 4 17
## 5 8
## 6 14
## 7 4
## 8 5
## 9 6
##
## Estimated basis vectors:
## Dir1 Dir2 Dir3 Dir4 Dir5 Dir6
## palmitic 0.0398550 0.0274799 -0.042044 -0.158205 0.110356 -0.191597
## stearic 0.0513903 -0.0396812 -0.136271 -0.058337 -0.618839 0.094103
## oleic 0.7035073 -0.7456268 -0.594721 -0.678730 -0.566964 -0.622176
## linoleic 0.6720310 -0.6585070 -0.751601 -0.700154 -0.186876 -0.746142
## linolenic -0.2212336 -0.0871214 -0.104507 -0.106642 -0.243924 -0.078350
## eicosanoic 0.0160890 -0.0045188 0.070768 -0.014710 0.434002 0.041523
## eicosenoic 0.0043908 -0.0216544 -0.212451 -0.095338 -0.025151 -0.052364
## Dir7
## palmitic 0.095723
## stearic 0.047857
## oleic 0.679406
## linoleic 0.709572
## linolenic 0.140340
## eicosanoic 0.057310
## eicosenoic -0.021957
##
## Dir1 Dir2 Dir3 Dir4 Dir5 Dir6 Dir7
## Eigenvalues 1.7729 1.5348 1.3844 1.2009 0.58992 0.036647 2.0756e-04
## Cum. % 27.1933 50.7343 71.9677 90.3865 99.43472 99.996816 1.0000e+02
## Class_GMM
##  1  2  3  4  5  6  7  8  9
## 14 15 13 17  8 14  4  5  6
##
## Class_GMM corn olive peanut pumpkin rapeseed soybean sunflower
##         1    0     0      0      14        0       0         0
##         2    0     0      0      15        0       0         0
##         3    0     0      0       2        0      11         0
##         4    0     0      0       0        0       0        17
##         5    0     0      0       2        0       0         6
##         6    2     2      3       4        0       0         3
##         7    0     0      0       0        4       0         0
##         8    0     5      0       0        0       0         0
##         9    0     0      0       0        6       0         0
## Best BIC values:
## VEI,7 VEI,9 VEI,5
## BIC -559.1728 -561.98725 -563.379201
## BIC diff 0.0000 -2.81443 -4.206377
## ----------------------------------------------------
## Gaussian finite mixture model fitted by EM algorithm
## ----------------------------------------------------
##
## Mclust VEI (diagonal, equal shape) model with 7 components:
##
## log-likelihood n df BIC ICL
## -197.4281 96 36 -559.1728 -564.7159
##
## Clustering table:
## 1 2 3 4 5 6 7
## 29 22 23 6 6 6 4
##
## Mixing probabilities:
## 1 2 3 4 5 6 7
## 0.30691940 0.22028499 0.23959707 0.06193194 0.06747129 0.06214134 0.04165398
##
## Means:
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## PC1 -0.7052117 -0.09433951 -0.5756359 1.7897450 1.2076003 1.155634 2.6651009
## PC2 -0.7695881 0.32359665 1.0339965 0.7456446 -0.4795708 -2.234596 1.0134525
## PC3 -0.2100309 0.14579246 0.2230587 1.2108666 -2.4301516 1.301180 -0.3116263
##
## Variances:
## [,,1]
## PC1 PC2 PC3
## PC1 0.04246937 0.00000000 0.0000000
## PC2 0.00000000 0.07457073 0.0000000
## PC3 0.00000000 0.00000000 0.2123919
## [,,2]
## PC1 PC2 PC3
## PC1 0.07573322 0.0000000 0.0000000
## PC2 0.00000000 0.1329778 0.0000000
## PC3 0.00000000 0.0000000 0.3787465
## [,,3]
## PC1 PC2 PC3
## PC1 0.01172287 0.00000000 0.00000000
## PC2 0.00000000 0.02058385 0.00000000
## PC3 0.00000000 0.00000000 0.05862679
## [,,4]
## PC1 PC2 PC3
## PC1 0.1247447 0.0000000 0.0000000
## PC2 0.0000000 0.2190356 0.0000000
## PC3 0.0000000 0.0000000 0.6238559
## [,,5]
## PC1 PC2 PC3
## PC1 0.4914034 0.0000000 0.000000
## PC2 0.0000000 0.8628411 0.000000
## PC3 0.0000000 0.0000000 2.457539
## [,,6]
## PC1 PC2 PC3
## PC1 0.01596124 0.00000000 0.00000000
## PC2 0.00000000 0.02802589 0.00000000
## PC3 0.00000000 0.00000000 0.07982318
## [,,7]
## PC1 PC2 PC3
## PC1 0.003531958 0.000000000 0.00000000
## PC2 0.000000000 0.006201662 0.00000000
## PC3 0.000000000 0.000000000 0.01766354
## -----------------------------------------------------------------
## Dimension reduction for model-based clustering and classification
## -----------------------------------------------------------------
##
## Mixture model type: Mclust (VEI, 7)
##
## Clusters n
## 1 29
## 2 22
## 3 23
## 4 6
## 5 6
## 6 6
## 7 4
##
## Estimated basis vectors:
## Dir1 Dir2 Dir3
## PC1 -0.938582 0.34482 -0.012735
## PC2 -0.333815 -0.91673 -0.219473
## PC3 -0.087353 -0.20174 0.975535
##
## Dir1 Dir2 Dir3
## Eigenvalues 1.7092 1.5513 0.77493
## Cum. % 42.3546 80.7966 100.00000
## Class_GMM_pca
##  1  2  3  4  5  6  7
## 29 22 23  6  6  6  4
##
## Class_GMM_pca corn olive peanut pumpkin rapeseed soybean sunflower
##             1    0     0      0      29        0       0         0
##             2    2     0      0       7        0      11         2
##             3    0     0      0       0        0       0        23
##             4    0     0      0       0        6       0         0
##             5    0     1      3       1        0       0         1
##             6    0     6      0       0        0       0         0
##             7    0     0      0       0        4       0         0
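The BIC values in the two Mclust summaries above follow the mclust convention, BIC = 2 * logLik - df * log(n), where larger is better. A quick check with the log-likelihoods, degrees of freedom, and n copied from those summaries (`bic_mclust` is an illustrative helper, not an mclust function):

```r
# BIC as reported by mclust: larger values indicate a better model.
bic_mclust <- function(loglik, df, n) 2 * loglik - df * log(n)

bic_mclust(loglik = 285.7983,  df = 275, n = 96)  # VEV, 9 components, original variables
bic_mclust(loglik = -197.4281, df = 36,  n = 96)  # VEI, 7 components, first 3 PCs
```

The first model pays a heavy penalty for its 275 parameters on only 96 observations, which is why the much simpler model on three principal components reaches a higher BIC despite a lower log-likelihood.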