Principal Component Analysis

Pamela Ruiz

1/12/2020

1 Personality of young adults

A survey was conducted to collect information on a range of variables that may characterize the personality of a young adult. The responses evaluate different aspects of the respondents' personality.

Information was collected from 1010 individuals on 159 variables. These variables aim to measure different aspects of personality, such as taste in music and dance, music genres (rock, pop, punk, jazz, etc.), movie genres (horror, comedy, romance, fantasy, documentary, action, history, etc.), age, religion, languages, cars, weight, height, and number of siblings, among others. The goal is to reduce these 159 variables to a smaller set that can still be used to study and characterize the personality of a young adult.

2 A look at the data

Data sample
Music Slow.songs.or.fast.songs Dance Folk Country Classical.music Musical Pop
1 5 3 2 1 2 2 1 5
2 4 4 2 1 1 1 2 3
3 5 5 2 2 3 4 5 3
4 5 3 2 1 1 1 1 2
5 5 3 4 3 2 4 3 5
6 5 3 2 3 2 3 3 2
Data sample
Sci.fi War Fantasy.Fairy.tales Animated Documentary Western Action History Psychology
1 4 1 5 5 3 1 2 1 5
2 4 1 3 5 4 1 4 1 3
3 4 2 5 5 2 2 1 1 2
4 4 3 1 2 5 1 2 4 4
5 3 3 4 4 3 1 4 3 2
6 3 3 4 3 3 2 4 5 3
Data sample
Age Height Weight Number.of.siblings Gender Left...right.handed Education Only.child
1 20 163 48 1 female right handed college/bachelor degree no
2 19 163 58 2 female right handed college/bachelor degree no
3 20 176 67 2 female right handed secondary school no
4 22 172 59 1 female right handed college/bachelor degree yes
5 20 170 59 1 female right handed secondary school no
6 20 186 77 1 male right handed secondary school no

3 Data preprocessing

It is worth noting that this analysis is itself part of data preprocessing, but some adjustments are needed before it can be carried out.

Principal component analysis is performed on numeric variables, so it is important to identify the categorical variables that will not be useful for the analysis. Some categorical variables can be explained by numeric ones: for example, one categorical variable indicates whether the respondent is an only child, while a numeric variable gives the number of siblings. Several variables of this kind can be dropped because a numeric variable already carries the same information.

It is also important to check the data type of each variable and keep only the numeric ones. The data must also be scaled: the variances differ widely between variables, and we do not want those differences to influence the principal component analysis.

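library(dplyr)     # select()
library(magrittr)  # %<>% assignment pipe
# Drop the categorical variables so that only numeric columns remain
encuesta %<>%
  select(-House...block.of.flats,-Village...town,-Only.child,-Education,-Left...right.handed,
         -Gender,-Internet.usage,-Punctuality,-Lying,-Alcohol,-Smoking)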
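
The type check and scaling described in this section could look roughly like the following. This is a minimal sketch in base R on a made-up data frame: `toy` and its values are illustrative and are not the survey data, and dropping incomplete rows with `na.omit()` is an assumption, since the report does not show how missing values were handled.

```r
# Hypothetical toy data standing in for the survey (not the real data).
toy <- data.frame(
  Age    = c(20, 19, 20, 22, 20, 20),
  Height = c(163, 163, 176, 172, 170, 186),
  Weight = c(48, 58, 67, 59, 59, 77),
  Gender = c("female", "female", "female", "female", "female", "male"),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns; categorical ones are left out of the PCA.
is_num      <- vapply(toy, is.numeric, logical(1))
numeric_toy <- toy[, is_num, drop = FALSE]

# Assumption: drop rows with missing values, then center and scale each
# column so that every variable has mean 0 and standard deviation 1.
scaled_toy <- scale(na.omit(numeric_toy))

apply(scaled_toy, 2, sd)  # standard deviation of each scaled column
```

After scaling, a variable such as height no longer outweighs a 1 to 5 rating simply because it is measured in larger units.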

4 Principal Component Analysis

Principal component analysis is now carried out. After the selection above, 139 variables remain, which we reduce with the help of PCA. We set a limit of 80 variables to obtain.

4.1 Singular value decomposition

## Importance of first k=60 (out of 139) components:
##                            PC1     PC2     PC3    PC4     PC5    PC6     PC7
## Standard deviation     3.04124 2.76618 2.63232 2.0625 2.00990 1.8188 1.72636
## Proportion of Variance 0.06654 0.05505 0.04985 0.0306 0.02906 0.0238 0.02144
## Cumulative Proportion  0.06654 0.12159 0.17144 0.2020 0.23111 0.2549 0.27635
##                            PC8     PC9   PC10    PC11   PC12    PC13    PC14
## Standard deviation     1.61741 1.55241 1.5007 1.48319 1.4535 1.40767 1.38593
## Proportion of Variance 0.01882 0.01734 0.0162 0.01583 0.0152 0.01426 0.01382
## Cumulative Proportion  0.29517 0.31251 0.3287 0.34453 0.3597 0.37399 0.38781
##                           PC15    PC16    PC17    PC18    PC19    PC20    PC21
## Standard deviation     1.34383 1.31997 1.30705 1.29942 1.27015 1.26272 1.22407
## Proportion of Variance 0.01299 0.01253 0.01229 0.01215 0.01161 0.01147 0.01078
## Cumulative Proportion  0.40080 0.41333 0.42562 0.43777 0.44938 0.46085 0.47163
##                           PC22    PC23   PC24    PC25    PC26   PC27    PC28
## Standard deviation     1.20339 1.18379 1.1731 1.15938 1.15413 1.1368 1.12690
## Proportion of Variance 0.01042 0.01008 0.0099 0.00967 0.00958 0.0093 0.00914
## Cumulative Proportion  0.48205 0.49213 0.5020 0.51170 0.52128 0.5306 0.53971
##                           PC29    PC30    PC31    PC32    PC33    PC34    PC35
## Standard deviation     1.11934 1.11398 1.11174 1.07504 1.06886 1.06522 1.05945
## Proportion of Variance 0.00901 0.00893 0.00889 0.00831 0.00822 0.00816 0.00808
## Cumulative Proportion  0.54873 0.55766 0.56655 0.57486 0.58308 0.59124 0.59932
##                           PC36    PC37    PC38    PC39    PC40    PC41    PC42
## Standard deviation     1.04866 1.04223 1.02925 1.01912 1.01781 1.01342 0.99740
## Proportion of Variance 0.00791 0.00781 0.00762 0.00747 0.00745 0.00739 0.00716
## Cumulative Proportion  0.60723 0.61505 0.62267 0.63014 0.63759 0.64498 0.65214
##                           PC43    PC44    PC45    PC46    PC47    PC48    PC49
## Standard deviation     0.98076 0.97486 0.96922 0.96600 0.94708 0.94245 0.94107
## Proportion of Variance 0.00692 0.00684 0.00676 0.00671 0.00645 0.00639 0.00637
## Cumulative Proportion  0.65906 0.66589 0.67265 0.67937 0.68582 0.69221 0.69858
##                           PC50    PC51    PC52    PC53    PC54    PC55    PC56
## Standard deviation     0.92899 0.92360 0.91797 0.91500 0.90809 0.90167 0.89283
## Proportion of Variance 0.00621 0.00614 0.00606 0.00602 0.00593 0.00585 0.00573
## Cumulative Proportion  0.70479 0.71093 0.71699 0.72301 0.72894 0.73479 0.74053
##                          PC57    PC58    PC59    PC60
## Standard deviation     0.8821 0.87685 0.87166 0.86777
## Proportion of Variance 0.0056 0.00553 0.00547 0.00542
## Cumulative Proportion  0.7461 0.75166 0.75712 0.76254
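
A summary in this format is what `summary()` prints for a `prcomp` fit with a restricted number of components. The call below is only a sketch under assumptions: the simulated matrix `X` is a placeholder for the numeric survey variables, and `rank. = 3` plays the role of the component limit. The original call is not shown in the report.

```r
set.seed(1)
# Simulated stand-in for the numeric survey data: 50 rows, 6 variables.
X <- matrix(rnorm(50 * 6), nrow = 50, ncol = 6)
colnames(X) <- paste0("V", 1:6)

# prcomp() computes the PCA through a singular value decomposition.
# scale. = TRUE standardizes the variables; rank. caps the number of
# components that are returned.
pca <- prcomp(X, center = TRUE, scale. = TRUE, rank. = 3)

summary(pca)       # standard deviation and (cumulative) proportion of variance
dim(pca$rotation)  # 6 variables by 3 components
```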

4.1.1 Variance explained by the components

From component 10 onward, each additional component adds little explained variance. However, 10 components is not an adequate choice: they explain only 0.3287 of the variance, which is still very low. We therefore keep 37 components, which accumulate 0.61505 of the explained variance and still reduce the number of variables substantially, from 139 to 37.
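
The proportions in the summary table can be recomputed by hand. With 139 standardized variables the total variance is 139, so each proportion is the squared standard deviation divided by 139; the reported figures are consistent with this. The short check below uses the first three standard deviations from the table (the variable names are illustrative).

```r
# Standard deviations of PC1 to PC3 as reported in the summary table.
sdev <- c(3.04124, 2.76618, 2.63232)
p <- 139  # number of scaled variables, so the total variance is 139

prop_var <- sdev^2 / p        # proportion of variance per component
cum_var  <- cumsum(prop_var)  # cumulative proportion

round(prop_var, 5)  # 0.06654 0.05505 0.04985
round(cum_var, 5)   # 0.06654 0.12159 0.17144
```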

## Importance of first k=37 (out of 139) components:
##                            PC1     PC2     PC3    PC4     PC5    PC6     PC7
## Standard deviation     3.04124 2.76618 2.63232 2.0625 2.00990 1.8188 1.72636
## Proportion of Variance 0.06654 0.05505 0.04985 0.0306 0.02906 0.0238 0.02144
## Cumulative Proportion  0.06654 0.12159 0.17144 0.2020 0.23111 0.2549 0.27635
##                            PC8     PC9   PC10    PC11   PC12    PC13    PC14
## Standard deviation     1.61741 1.55241 1.5007 1.48319 1.4535 1.40767 1.38593
## Proportion of Variance 0.01882 0.01734 0.0162 0.01583 0.0152 0.01426 0.01382
## Cumulative Proportion  0.29517 0.31251 0.3287 0.34453 0.3597 0.37399 0.38781
##                           PC15    PC16    PC17    PC18    PC19    PC20    PC21
## Standard deviation     1.34383 1.31997 1.30705 1.29942 1.27015 1.26272 1.22407
## Proportion of Variance 0.01299 0.01253 0.01229 0.01215 0.01161 0.01147 0.01078
## Cumulative Proportion  0.40080 0.41333 0.42562 0.43777 0.44938 0.46085 0.47163
##                           PC22    PC23   PC24    PC25    PC26   PC27    PC28
## Standard deviation     1.20339 1.18379 1.1731 1.15938 1.15413 1.1368 1.12690
## Proportion of Variance 0.01042 0.01008 0.0099 0.00967 0.00958 0.0093 0.00914
## Cumulative Proportion  0.48205 0.49213 0.5020 0.51170 0.52128 0.5306 0.53971
##                           PC29    PC30    PC31    PC32    PC33    PC34    PC35
## Standard deviation     1.11934 1.11398 1.11174 1.07504 1.06886 1.06522 1.05945
## Proportion of Variance 0.00901 0.00893 0.00889 0.00831 0.00822 0.00816 0.00808
## Cumulative Proportion  0.54873 0.55766 0.56655 0.57486 0.58308 0.59124 0.59932
##                           PC36    PC37
## Standard deviation     1.04866 1.04223
## Proportion of Variance 0.00791 0.00781
## Cumulative Proportion  0.60723 0.61505

4.1.2 Rotated variables

A look at the rotation matrix, which contains the loadings for each component. Here we can see the weight that each variable has in the different components. For example, for the musical variables, component 2 is the one that carries the weights for the music genre variables.

##                                   PC1        PC2         PC3          PC4
## Music                    -0.047582405 0.04938435 -0.03767730  0.092080493
## Slow.songs.or.fast.songs  0.064313113 0.01027430 -0.06073387 -0.030376721
## Dance                    -0.050989897 0.04765625 -0.17185018 -0.035187841
## Folk                     -0.073750387 0.12386414  0.09528105 -0.041192585
## Country                  -0.005007755 0.11595332  0.08244409 -0.008189785
## Classical.music          -0.061054130 0.16388349  0.16326957  0.091623269
##                                   PC5         PC6         PC7         PC8
## Music                    -0.066213032  0.09272616 -0.01671909  0.09413288
## Slow.songs.or.fast.songs -0.020183012  0.03662025  0.04870411  0.04415341
## Dance                     0.098555344  0.14960482  0.01952342 -0.13294629
## Folk                     -0.018508268  0.06857354 -0.04970073 -0.19373424
## Country                   0.057907240  0.09877130 -0.09269609 -0.14614308
## Classical.music           0.009994683 -0.02481594 -0.01569838 -0.09929204
##                                   PC9         PC10         PC11        PC12
## Music                    -0.047585529  0.037614277 -0.021478425  0.00327134
## Slow.songs.or.fast.songs  0.014203661 -0.006595415 -0.120183418  0.09067466
## Dance                    -0.048351927 -0.027277526 -0.131863945 -0.01731938
## Folk                      0.042994842 -0.051886501 -0.036804984 -0.07936266
## Country                   0.003475195 -0.103622576 -0.003224945 -0.04902021
## Classical.music          -0.083196649 -0.097619806  0.060019098 -0.03606276
##                                 PC13         PC14         PC15         PC16
## Music                    -0.07002795  0.070544464 -0.054512767  0.100774191
## Slow.songs.or.fast.songs -0.06791649  0.119509804  0.059384830  0.192476959
## Dance                    -0.14969820 -0.004770693  0.148166781  0.024879144
## Folk                     -0.08936829 -0.081762472 -0.076595385 -0.065730774
## Country                   0.04606722 -0.039544808 -0.006979301 -0.111137060
## Classical.music          -0.04389010 -0.145301150 -0.050043599  0.003550101
##                                 PC17        PC18        PC19        PC20
## Music                    -0.03077878  0.10828297 -0.08435355  0.14314527
## Slow.songs.or.fast.songs -0.09038227 -0.08304166 -0.06220677  0.15298635
## Dance                     0.07406015 -0.01239437 -0.07426104  0.03943098
## Folk                      0.06789898  0.01022200 -0.05799981  0.01968971
## Country                   0.17337028 -0.05652503  0.14726351 -0.02885842
## Classical.music           0.03576740  0.12052565 -0.05962899  0.04250060
##                                 PC21        PC22        PC23         PC24
## Music                    -0.05024822  0.13998601 -0.09748485  0.228475645
## Slow.songs.or.fast.songs -0.07661578  0.10846930 -0.08462884  0.058231941
## Dance                    -0.07588067  0.15447675  0.01250064  0.030885279
## Folk                      0.13176087  0.10736805 -0.06216723  0.086242483
## Country                   0.10744040  0.18577174  0.01331886  0.143480799
## Classical.music           0.02190333 -0.03173615 -0.12064399 -0.000917923
##                                  PC25        PC26         PC27          PC28
## Music                    -0.073590324  0.13897119 -0.036416903  0.0348211917
## Slow.songs.or.fast.songs -0.070262488  0.05770227 -0.018477067 -0.0088521645
## Dance                     0.078821325 -0.02069734  0.005871426  0.0305858411
## Folk                     -0.005501371 -0.03376453  0.048016000 -0.0009740571
## Country                   0.053890221  0.03193136 -0.069281620  0.0554160369
## Classical.music          -0.035178002 -0.01206947  0.100322068  0.0502515957
##                                  PC29        PC30        PC31        PC32
## Music                    -0.155785360  0.16434501 -0.06935369 -0.12254974
## Slow.songs.or.fast.songs  0.179139760 -0.07665298 -0.23513782  0.10055025
## Dance                    -0.002067251  0.10047582 -0.01847670  0.08201127
## Folk                      0.180039653  0.00285099 -0.18267916 -0.09340278
## Country                   0.051342416  0.05485036 -0.11078691 -0.04746951
## Classical.music          -0.029498018 -0.03309783  0.01886751 -0.13455546
##                                 PC33        PC34        PC35         PC36
## Music                     0.02186480  0.24276967 -0.07117291  0.015982107
## Slow.songs.or.fast.songs -0.05427405 -0.05927715 -0.07387483 -0.005653228
## Dance                    -0.06446429  0.01279733  0.04701843  0.037679529
## Folk                      0.14489480 -0.02662025  0.06034935  0.061365222
## Country                   0.15467404 -0.04323688  0.13465474  0.064140810
## Classical.music          -0.03859241  0.05504377 -0.03677877  0.047540606
##                                 PC37
## Music                     0.06660021
## Slow.songs.or.fast.songs -0.03610156
## Dance                     0.01398491
## Folk                     -0.02598364
## Country                   0.04678499
## Classical.music           0.05277598
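
To see which variables weigh most on a given component, the loadings can be sorted by absolute value. The sketch below uses simulated data, so the column names are only labels borrowed for illustration and the resulting numbers say nothing about the survey.

```r
set.seed(2)
# Simulated stand-in data; column names are illustrative only.
X <- matrix(rnorm(100 * 5), nrow = 100)
colnames(X) <- c("Folk", "Country", "Classical", "Action", "War")

pca <- prcomp(X, scale. = TRUE)

# Loadings of every variable on the second component.
pc2 <- pca$rotation[, "PC2"]

# Variables ordered by the absolute size of their PC2 loading.
pc2[order(abs(pc2), decreasing = TRUE)]

# Each loading vector has unit length.
sum(pc2^2)
```

Because every loading vector has length 1, loadings around 0.1 to 0.2 across 139 variables, as in the table above, are not unusually small.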

The diagram shows the components with the largest weights for the different variables, including the 37 principal components.

4.1.3 Principal component scores

The value of each principal component for each observation (the principal component scores) is obtained by multiplying the data by the loading vectors. The result is stored in the following matrix:

##          PC1        PC2         PC3        PC4        PC5        PC6        PC7
## 1 -1.0530571 -0.6334789 -1.33637596 -4.5456982 -0.8059460  1.1955210  0.8777088
## 2  2.9015048 -2.0715342  1.46402413  0.6630106 -1.2557060 -1.2537878 -0.1009695
## 3 -1.8848221  2.7063670  2.85136508  1.5208920 -3.3216476  0.8186191  0.8044389
## 5 -0.3302213 -0.2635941 -0.01618298 -2.2163474  0.6670398  0.2047556  1.3600030
## 6  3.1450176  2.1792054  2.22702147  0.5933384 -0.7623117  1.1436428  0.7135727
## 7 -3.1906127  0.9647040 -2.03279992 -4.7222350 -0.6491279 -0.2779432  3.6319297
##          PC8        PC9       PC10       PC11       PC12       PC13        PC14
## 1  2.1698138 -1.9888306 -0.3106233  3.2083750  0.4705492  2.4926455  1.22944986
## 2  3.5297389 -1.7411825 -1.4306605  0.6270017 -2.1422052  0.5999673  2.05666647
## 3 -1.2463240 -1.1531245  2.0365268 -1.4948925 -1.6395180  3.4700757  0.80225779
## 5 -1.0862003 -2.9587908  1.5525601 -0.1645998 -0.2226828 -2.1117461 -0.75277017
## 6  0.7180025  0.7582240  0.4531394 -1.6573571  1.8411153 -0.0610575  0.49119067
## 7  2.2258596  0.3643288 -0.1385240 -2.0809430 -0.8940154 -1.4011170  0.05398411
##          PC15       PC16       PC17        PC18       PC19       PC20
## 1 -0.08378039  0.3400839 -1.7359938 -0.31381989 -0.4627950  0.3028893
## 2  1.44366540 -0.8775515 -0.8365413 -2.47298514 -1.2767190 -0.4407617
## 3  3.03874541 -2.9873473 -0.6992351 -0.70752259 -0.2619535  1.3252956
## 5  0.11267696 -0.2405331  1.1741541  0.38283555 -0.9810110  2.5739937
## 6 -0.83372692 -0.9259021 -0.2685235  0.09426062 -0.6391717  1.1513992
## 7 -0.84860887  0.1917878 -1.5537888 -1.01711848  0.4086109  0.9776245
##           PC21       PC22       PC23       PC24         PC25       PC26
## 1 -0.855400897  0.7566404  0.8683802 -1.9796481  1.241736327  0.8908599
## 2 -2.260638516 -1.4041847 -0.6155612  0.3887553  1.716912088 -0.5445146
## 3 -1.239046465 -0.3997135 -2.0643117  0.6069301  0.306273109 -0.9934903
## 5  1.103970434 -0.5499236 -0.5012331 -2.3168332 -0.003982265  0.0730956
## 6  0.554280269 -0.8974694  1.1908446  0.5199019 -0.880485527 -0.8867896
## 7 -0.006353953 -0.2017059  1.5086426  0.7657306 -0.115633362  0.5971394
##           PC27        PC28        PC29        PC30       PC31       PC32
## 1 -1.995775697 -0.48584284 -1.41778492  0.80445388 -2.4901759 -1.5074082
## 2 -0.205115650 -0.03535099  0.07995225  0.23550960 -0.4829410  2.0286258
## 3 -2.875302750 -0.64502380  0.76566066 -1.48689994 -1.8113357  2.7324074
## 5 -0.006182388  0.42972822  1.09860257 -0.70040111  0.0106755 -0.6183481
## 6  0.193477721 -0.45130528  0.12238776  0.08088869  1.0247205 -1.7930233
## 7  1.098185013  0.16763676  0.27023758  0.86115267 -1.2710940  0.4705214
##           PC33       PC34        PC35        PC36         PC37
## 1 -0.004475272 -0.2931144  0.00645661  0.11918733  2.109312735
## 2 -1.253496529 -1.2830332  0.82233043  0.09958263  0.227989630
## 3  0.146921222  0.3601148 -0.09376475 -0.35066239  0.658878692
## 5  0.359296688  0.9040017 -0.74292049  0.15875317 -0.002450738
## 6  2.107508907  1.4232286 -1.53236935  0.82598706 -1.454308806
## 7 -1.564472644  0.3391209  0.18145575  1.78885351 -0.332875756
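
The relation between data, loadings and scores can be verified directly. The sketch below assumes a `prcomp` fit on simulated data and checks that multiplying the standardized data by the rotation matrix reproduces the scores that `prcomp` stores in its `x` element.

```r
set.seed(3)
X <- matrix(rnorm(40 * 4), nrow = 40)  # simulated stand-in data
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Scores = standardized data multiplied by the loading vectors.
Z <- scale(X, center = pca$center, scale = pca$scale)
scores <- Z %*% pca$rotation

# prcomp() stores the same matrix in the element `x`.
all.equal(unname(scores), unname(pca$x))
```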

4.1.4 Diagram

The diagram used to examine the variables is the following:

4.2 Using eigenvectors/eigenvalues

Here the components are computed from the eigenvectors/eigenvalues of the covariance matrix.

## Importance of components:
##                            Comp.1     Comp.2     Comp.3     Comp.4     Comp.5
## Standard deviation     15.9984176 5.77526107 3.51117834 3.18662002 3.06872118
## Proportion of Variance  0.5229317 0.06814502 0.02518817 0.02074682 0.01924003
## Cumulative Proportion   0.5229317 0.59107670 0.61626488 0.63701169 0.65625172
##                            Comp.6     Comp.7     Comp.8      Comp.9     Comp.10
## Standard deviation     2.81520010 2.47703533 2.32482555 2.148750949 2.131880497
## Proportion of Variance 0.01619233 0.01253589 0.01104261 0.009433287 0.009285742
## Cumulative Proportion  0.67244406 0.68497995 0.69602256 0.705455846 0.714741588
##                            Comp.11     Comp.12    Comp.13     Comp.14
## Standard deviation     2.006079342 1.907771857 1.86382497 1.823116207
## Proportion of Variance 0.008222182 0.007436075 0.00709743 0.006790779
## Cumulative Proportion  0.722963771 0.730399846 0.73749728 0.744288055
##                            Comp.15     Comp.16     Comp.17     Comp.18
## Standard deviation     1.782508594 1.741779187 1.656306778 1.615879982
## Proportion of Variance 0.006491636 0.006198364 0.005604959 0.005334689
## Cumulative Proportion  0.750779691 0.756978055 0.762583015 0.767917704
##                            Comp.19     Comp.20     Comp.21     Comp.22
## Standard deviation     1.602097391 1.586045760 1.552617373 1.536936913
## Proportion of Variance 0.005244073 0.005139517 0.004925154 0.004826174
## Cumulative Proportion  0.773161776 0.778301294 0.783226447 0.788052621
##                           Comp.23     Comp.24     Comp.25     Comp.26
## Standard deviation     1.51927175 1.499266829 1.485604395 1.453886542
## Proportion of Variance 0.00471587 0.004592496 0.004509177 0.004318689
## Cumulative Proportion  0.79276849 0.797360987 0.801870164 0.806188853
##                            Comp.27     Comp.28     Comp.29     Comp.30
## Standard deviation     1.424170664 1.387480174 1.367020576 1.355043492
## Proportion of Variance 0.004143954 0.003933186 0.003818045 0.003751434
## Cumulative Proportion  0.810332807 0.814265993 0.818084037 0.821835472
##                            Comp.31     Comp.32     Comp.33     Comp.34
## Standard deviation     1.342891575 1.336620014 1.318240867 1.310828856
## Proportion of Variance 0.003684451 0.003650117 0.003550426 0.003510612
## Cumulative Proportion  0.825519923 0.829170040 0.832720466 0.836231078
##                            Comp.35     Comp.36     Comp.37     Comp.38
## Standard deviation     1.303439434 1.285405386 1.268814256 1.260574067
## Proportion of Variance 0.003471144 0.003375757 0.003289175 0.003246591
## Cumulative Proportion  0.839702222 0.843077979 0.846367154 0.849613745
##                            Comp.39     Comp.40     Comp.41     Comp.42
## Standard deviation     1.246659002 1.232703104 1.221915134 1.212833000
## Proportion of Variance 0.003175311 0.003104616 0.003050514 0.003005335
## Cumulative Proportion  0.852789056 0.855893672 0.858944186 0.861949521
##                            Comp.43    Comp.44     Comp.45     Comp.46
## Standard deviation     1.201897616 1.18422317 1.173688963 1.169396746
## Proportion of Variance 0.002951385 0.00286522 0.002814472 0.002793924
## Cumulative Proportion  0.864900906 0.86776613 0.870580598 0.873374523
##                            Comp.47     Comp.48     Comp.49     Comp.50
## Standard deviation     1.167845528 1.145279006 1.142156650 1.125029433
## Proportion of Variance 0.002786517 0.002679869 0.002665276 0.002585941
## Cumulative Proportion  0.876161040 0.878840908 0.881506184 0.884092126
##                          Comp.51     Comp.52     Comp.53     Comp.54
## Standard deviation     1.1133657 1.101971192 1.093725776 1.092434327
## Proportion of Variance 0.0025326 0.002481026 0.002444037 0.002438269
## Cumulative Proportion  0.8866247 0.889105752 0.891549789 0.893988058
##                            Comp.55     Comp.56     Comp.57    Comp.58
## Standard deviation     1.083328573 1.069550287 1.061976177 1.05570768
## Proportion of Variance 0.002397791 0.002337186 0.002304201 0.00227708
## Cumulative Proportion  0.896385848 0.898723035 0.901027236 0.90330432
##                            Comp.59     Comp.60     Comp.61     Comp.62
## Standard deviation     1.046179141 1.040988679 1.022819636 1.016754395
## Proportion of Variance 0.002236161 0.002214027 0.002137416 0.002112142
## Cumulative Proportion  0.905540477 0.907754504 0.909891919 0.912004061
##                            Comp.63   Comp.64     Comp.65     Comp.66
## Standard deviation     1.009682933 1.0011969 0.994283520 0.991515624
## Proportion of Variance 0.002082864 0.0020480 0.002019814 0.002008584
## Cumulative Proportion  0.914086925 0.9161349 0.918154739 0.920163323
##                            Comp.67     Comp.68     Comp.69     Comp.70
## Standard deviation     0.971302144 0.965426073 0.961404789 0.944233856
## Proportion of Variance 0.001927523 0.001904272 0.001888441 0.001821587
## Cumulative Proportion  0.922090846 0.923995118 0.925883559 0.927705147
##                            Comp.71     Comp.72     Comp.73     Comp.74
## Standard deviation     0.939813732 0.935310607 0.922600046 0.920290702
## Proportion of Variance 0.001804573 0.001787321 0.001739073 0.001730378
## Cumulative Proportion  0.929509720 0.931297041 0.933036114 0.934766492
##                            Comp.75     Comp.76     Comp.77    Comp.78
## Standard deviation     0.913748572 0.901839801 0.898525443 0.89555798
## Proportion of Variance 0.001705864 0.001661689 0.001649497 0.00163862
## Cumulative Proportion  0.936472356 0.938134044 0.939783542 0.94142216
##                            Comp.79     Comp.80     Comp.81     Comp.82
## Standard deviation     0.885031621 0.879749066 0.872813999 0.860959850
## Proportion of Variance 0.001600326 0.001581279 0.001556447 0.001514456
## Cumulative Proportion  0.943022488 0.944603767 0.946160214 0.947674670
##                            Comp.83     Comp.84     Comp.85     Comp.86
## Standard deviation     0.849740131 0.845509109 0.834077266 0.825199329
## Proportion of Variance 0.001475242 0.001460587 0.001421358 0.001391261
## Cumulative Proportion  0.949149912 0.950610499 0.952031857 0.953423118
##                            Comp.87     Comp.88     Comp.89     Comp.90
## Standard deviation     0.820546761 0.817043718 0.810453399 0.807041063
## Proportion of Variance 0.001375617 0.001363897 0.001341983 0.001330706
## Cumulative Proportion  0.954798735 0.956162632 0.957504615 0.958835321
##                            Comp.91     Comp.92     Comp.93     Comp.94
## Standard deviation     0.802024379 0.796796395 0.791029965 0.780106030
## Proportion of Variance 0.001314214 0.001297136 0.001278429 0.001243364
## Cumulative Proportion  0.960149535 0.961446671 0.962725101 0.963968464
##                            Comp.95     Comp.96     Comp.97     Comp.98
## Standard deviation     0.772439968 0.768573562 0.763096685 0.749727091
## Proportion of Variance 0.001219047 0.001206874 0.001189734 0.001148411
## Cumulative Proportion  0.965187511 0.966394385 0.967584119 0.968732530
##                            Comp.99    Comp.100    Comp.101    Comp.102
## Standard deviation     0.743805898 0.736137620 0.732093748 0.721484743
## Proportion of Variance 0.001130343 0.001107156 0.001095026 0.001063519
## Cumulative Proportion  0.969862873 0.970970029 0.972065054 0.973128573
##                           Comp.103   Comp.104    Comp.105     Comp.106
## Standard deviation     0.715076882 0.70700872 0.701587431 0.6950309458
## Proportion of Variance 0.001044711 0.00102127 0.001005668 0.0009869592
## Cumulative Proportion  0.974173285 0.97519455 0.976200222 0.9771871813
##                            Comp.107     Comp.108     Comp.109     Comp.110
## Standard deviation     0.6920100074 0.6884785721 0.6763785283 0.6680437261
## Proportion of Variance 0.0009783982 0.0009684378 0.0009346963 0.0009118023
## Cumulative Proportion  0.9781655795 0.9791340174 0.9800687137 0.9809805160
##                            Comp.111     Comp.112     Comp.113     Comp.114
## Standard deviation     0.6593264796 0.6586379711 0.6530632641 0.6465293020
## Proportion of Variance 0.0008881615 0.0008863075 0.0008713676 0.0008540186
## Cumulative Proportion  0.9818686774 0.9827549849 0.9836263525 0.9844803711
##                           Comp.115     Comp.116     Comp.117     Comp.118
## Standard deviation     0.635429021 0.6261021475 0.6224620036 0.6196660567
## Proportion of Variance 0.000824945 0.0008009055 0.0007916197 0.0007845241
## Cumulative Proportion  0.985305316 0.9861062216 0.9868978413 0.9876823655
##                            Comp.119     Comp.120     Comp.121     Comp.122
## Standard deviation     0.6127025255 0.6086809479 0.5953769121 0.5931231464
## Proportion of Variance 0.0007669909 0.0007569554 0.0007242273 0.0007187546
## Cumulative Proportion  0.9884493564 0.9892063118 0.9899305391 0.9906492937
##                            Comp.123     Comp.124    Comp.125     Comp.126
## Standard deviation     0.5865289221 0.5817124836 0.578611893 0.5671206237
## Proportion of Variance 0.0007028615 0.0006913654 0.000684015 0.0006571156
## Cumulative Proportion  0.9913521552 0.9920435207 0.992727536 0.9933846512
##                            Comp.127     Comp.128     Comp.129     Comp.130
## Standard deviation     0.5609464867 0.5433238817 0.5417229679 0.5353217521
## Proportion of Variance 0.0006428857 0.0006031266 0.0005995776 0.0005854916
## Cumulative Proportion  0.9940275369 0.9946306635 0.9952302411 0.9958157327
##                            Comp.131     Comp.132     Comp.133     Comp.134
## Standard deviation     0.5237794014 0.5151699249 0.5028088028 0.4932116344
## Proportion of Variance 0.0005605156 0.0005422404 0.0005165313 0.0004970013
## Cumulative Proportion  0.9963762484 0.9969184888 0.9974350201 0.9979320214
##                            Comp.135     Comp.136     Comp.137     Comp.138
## Standard deviation     0.4803267300 0.4626176551 0.4501413654 0.4348469696
## Proportion of Variance 0.0004713727 0.0004372555 0.0004139889 0.0003863348
## Cumulative Proportion  0.9984033941 0.9988406496 0.9992546385 0.9996409733
##                            Comp.139
## Standard deviation     0.4191967621
## Proportion of Variance 0.0003590267
## Cumulative Proportion  1.0000000000
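
The eigenvalue route can be sketched in a few lines of base R. The example assumes simulated data and standardizes it first; it then compares the eigenvalues of the covariance matrix with the standard deviations that the SVD-based `prcomp()` reports for the same data.

```r
set.seed(4)
X  <- matrix(rnorm(60 * 4), nrow = 60)  # simulated stand-in data
Xs <- scale(X)                          # standardized copy

# Eigen decomposition of the covariance matrix of the scaled data.
eig <- eigen(cov(Xs))

# The eigenvalues are the component variances; their square roots match
# the standard deviations reported by prcomp() on the same scaled data.
svd_pca <- prcomp(X, scale. = TRUE)
all.equal(sqrt(eig$values), svd_pca$sdev)

# Proportion of variance explained by each component.
eig$values / sum(eig$values)
```

On identically prepared data the two routes agree, so differences between two summaries usually point to a difference in preparation (for instance, scaled versus unscaled inputs) rather than in the decomposition itself.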

4.2.1 Variance explained by the components

From component 10 onward there is little improvement in explained variance, and in this case 10 components already explain 0.714741588 of the variance, so we keep 10 principal components.

4.2.2 Rotated variables

A look at the rotation matrix, which contains the loadings for each component. Here we can see the weight that each variable has in the different components. For example, for the musical variables, component 2 is the one that carries the weights for the music genre variables.

##                                   PC1        PC2         PC3          PC4
## Music                    -0.047582405 0.04938435 -0.03767730  0.092080493
## Slow.songs.or.fast.songs  0.064313113 0.01027430 -0.06073387 -0.030376721
## Dance                    -0.050989897 0.04765625 -0.17185018 -0.035187841
## Folk                     -0.073750387 0.12386414  0.09528105 -0.041192585
## Country                  -0.005007755 0.11595332  0.08244409 -0.008189785
## Classical.music          -0.061054130 0.16388349  0.16326957  0.091623269
##                                   PC5         PC6         PC7         PC8
## Music                    -0.066213032  0.09272616 -0.01671909  0.09413288
## Slow.songs.or.fast.songs -0.020183012  0.03662025  0.04870411  0.04415341
## Dance                     0.098555344  0.14960482  0.01952342 -0.13294629
## Folk                     -0.018508268  0.06857354 -0.04970073 -0.19373424
## Country                   0.057907240  0.09877130 -0.09269609 -0.14614308
## Classical.music           0.009994683 -0.02481594 -0.01569838 -0.09929204
##                                   PC9         PC10
## Music                    -0.047585529  0.037614277
## Slow.songs.or.fast.songs  0.014203661 -0.006595415
## Dance                    -0.048351927 -0.027277526
## Folk                      0.042994842 -0.051886501
## Country                   0.003475195 -0.103622576
## Classical.music          -0.083196649 -0.097619806

The diagram shows the components with the largest weights for the different variables, including the 10 retained principal components.

4.2.3 Principal component scores

The value of each principal component for each observation (the principal component scores) is obtained by multiplying the data by the loading vectors. The result is stored in the following matrix:

##          PC1        PC2         PC3        PC4        PC5        PC6        PC7
## 1 -1.0530571 -0.6334789 -1.33637596 -4.5456982 -0.8059460  1.1955210  0.8777088
## 2  2.9015048 -2.0715342  1.46402413  0.6630106 -1.2557060 -1.2537878 -0.1009695
## 3 -1.8848221  2.7063670  2.85136508  1.5208920 -3.3216476  0.8186191  0.8044389
## 5 -0.3302213 -0.2635941 -0.01618298 -2.2163474  0.6670398  0.2047556  1.3600030
## 6  3.1450176  2.1792054  2.22702147  0.5933384 -0.7623117  1.1436428  0.7135727
## 7 -3.1906127  0.9647040 -2.03279992 -4.7222350 -0.6491279 -0.2779432  3.6319297
##          PC8        PC9       PC10
## 1  2.1698138 -1.9888306 -0.3106233
## 2  3.5297389 -1.7411825 -1.4306605
## 3 -1.2463240 -1.1531245  2.0365268
## 5 -1.0862003 -2.9587908  1.5525601
## 6  0.7180025  0.7582240  0.4531394
## 7  2.2258596  0.3643288 -0.1385240

4.2.4 Diagram

The difference found between the two methods is that, with eigenvectors/eigenvalues, more variance can be explained with fewer principal components. This can be much more convenient, since the fewer variables there are to work with, the easier the result can be to interpret. Note, however, that both decompositions give the same components when applied to identically scaled data; the very large first component in section 4.2 (standard deviation 15.998, 52% of the variance) suggests that the covariance matrix was computed on unscaled variables, which would account for most of the gap.

For example, the diagram shows that component 2 is one of those that best explains movie tastes. Although this kind of identification is not always possible with principal components, it is one of the observations that this dimensionality reduction offered.
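
Whether a comparison like the one in this section reflects the decomposition method or the scaling of the inputs can be checked on a small example. The sketch below uses simulated data in which one variable is measured on a much larger scale than the others; the variable names, magnitudes and the helper `first_share` are assumptions chosen for illustration.

```r
set.seed(5)
n <- 200
# Simulated data: one variable on a much larger scale than the others.
X <- cbind(
  Height = rnorm(n, mean = 170, sd = 10),
  Item1  = rnorm(n, mean = 3, sd = 1),
  Item2  = rnorm(n, mean = 3, sd = 1),
  Item3  = rnorm(n, mean = 3, sd = 1)
)

# Share of the total variance captured by the first component.
first_share <- function(fit) fit$sdev[1]^2 / sum(fit$sdev^2)

unscaled <- prcomp(X, scale. = FALSE)  # PCA on the covariance matrix
scaled   <- prcomp(X, scale. = TRUE)   # PCA on the correlation matrix

first_share(unscaled)  # dominated by the large-variance variable
first_share(scaled)    # much closer to an even split
```

In the unscaled fit the first component mostly reproduces the large-variance variable, which inflates the explained variance without revealing more structure.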