1 Personality of young adults
A survey was conducted to collect information on different variables that may characterize the personality of a young adult.
Data were collected from 1010 individuals on 159 variables that measure different aspects of personality, such as taste in music and dance, musical genres (rock, pop, punk, jazz, etc.), film genres (horror, comedy, romance, fantasy, documentaries, action, history, etc.), age, religion, languages, cars, weight, height, and number of siblings, among others. The goal is to reduce these 159 variables to a smaller set that is easier to study and that can still characterize the personality of a young adult.
2 A look at the data
| | Music | Slow.songs.or.fast.songs | Dance | Folk | Country | Classical.music | Musical | Pop |
|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 3 | 2 | 1 | 2 | 2 | 1 | 5 |
| 2 | 4 | 4 | 2 | 1 | 1 | 1 | 2 | 3 |
| 3 | 5 | 5 | 2 | 2 | 3 | 4 | 5 | 3 |
| 4 | 5 | 3 | 2 | 1 | 1 | 1 | 1 | 2 |
| 5 | 5 | 3 | 4 | 3 | 2 | 4 | 3 | 5 |
| 6 | 5 | 3 | 2 | 3 | 2 | 3 | 3 | 2 |
| | Sci.fi | War | Fantasy.Fairy.tales | Animated | Documentary | Western | Action | History | Psychology |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 1 | 5 | 5 | 3 | 1 | 2 | 1 | 5 |
| 2 | 4 | 1 | 3 | 5 | 4 | 1 | 4 | 1 | 3 |
| 3 | 4 | 2 | 5 | 5 | 2 | 2 | 1 | 1 | 2 |
| 4 | 4 | 3 | 1 | 2 | 5 | 1 | 2 | 4 | 4 |
| 5 | 3 | 3 | 4 | 4 | 3 | 1 | 4 | 3 | 2 |
| 6 | 3 | 3 | 4 | 3 | 3 | 2 | 4 | 5 | 3 |
| | Age | Height | Weight | Number.of.siblings | Gender | Left…right.handed | Education | Only.child |
|---|---|---|---|---|---|---|---|---|
| 1 | 20 | 163 | 48 | 1 | female | right handed | college/bachelor degree | no |
| 2 | 19 | 163 | 58 | 2 | female | right handed | college/bachelor degree | no |
| 3 | 20 | 176 | 67 | 2 | female | right handed | secondary school | no |
| 4 | 22 | 172 | 59 | 1 | female | right handed | college/bachelor degree | yes |
| 5 | 20 | 170 | 59 | 1 | female | right handed | secondary school | no |
| 6 | 20 | 186 | 77 | 1 | male | right handed | secondary school | no |
3 Data preprocessing
This analysis is itself part of data preprocessing, but a few adjustments are needed before it can be carried out.
Principal component analysis is performed on numeric variables, so it is important to identify the categorical variables that will not be useful for the analysis. Some categorical variables can be explained by numeric ones: for example, one categorical variable indicates whether the respondent is an only child, while a numeric variable gives the number of siblings. Several variables like this one can be dropped because a numeric variable already carries their information.
It is also important to check the data type of each variable and keep only the numeric ones. The data must also be scaled, because the variances differ greatly from one variable to another and we do not want those differences to influence the principal component analysis.
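The two preprocessing steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used for the report: the rows are the first six respondents from the data preview (a subset of the columns), and the helper names `numeric_columns` and `standardize` are hypothetical.

```python
from statistics import mean, stdev

# First six respondents, copied from the data preview (subset of columns).
rows = [
    {"Age": 20, "Height": 163, "Weight": 48, "Gender": "female", "Only.child": "no"},
    {"Age": 19, "Height": 163, "Weight": 58, "Gender": "female", "Only.child": "no"},
    {"Age": 20, "Height": 176, "Weight": 67, "Gender": "female", "Only.child": "no"},
    {"Age": 22, "Height": 172, "Weight": 59, "Gender": "female", "Only.child": "yes"},
    {"Age": 20, "Height": 170, "Weight": 59, "Gender": "female", "Only.child": "no"},
    {"Age": 20, "Height": 186, "Weight": 77, "Gender": "male", "Only.child": "no"},
]


def numeric_columns(records):
    """Names of the columns whose values are all numbers (hypothetical helper)."""
    return [
        name for name in records[0]
        if all(isinstance(r[name], (int, float)) for r in records)
    ]


def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


kept = numeric_columns(rows)
raw = {name: [r[name] for r in rows] for name in kept}
scaled = {name: standardize(values) for name, values in raw.items()}

# Variance of each kept column before and after scaling.
for name in kept:
    print(name, round(stdev(raw[name]) ** 2, 2), round(stdev(scaled[name]) ** 2, 2))
```

Even in this small sample the raw variance of Weight is far larger than that of Age, which is the imbalance that scaling removes.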
4 Principal Component Analysis
Once the variables have been selected, 139 variables remain, which we reduce with the help of PCA. We set a limit of 80 variables to obtain.
4.1 Singular value decomposition
## Importance of first k=60 (out of 139) components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.04124 2.76618 2.63232 2.0625 2.00990 1.8188 1.72636
## Proportion of Variance 0.06654 0.05505 0.04985 0.0306 0.02906 0.0238 0.02144
## Cumulative Proportion 0.06654 0.12159 0.17144 0.2020 0.23111 0.2549 0.27635
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 1.61741 1.55241 1.5007 1.48319 1.4535 1.40767 1.38593
## Proportion of Variance 0.01882 0.01734 0.0162 0.01583 0.0152 0.01426 0.01382
## Cumulative Proportion 0.29517 0.31251 0.3287 0.34453 0.3597 0.37399 0.38781
## PC15 PC16 PC17 PC18 PC19 PC20 PC21
## Standard deviation 1.34383 1.31997 1.30705 1.29942 1.27015 1.26272 1.22407
## Proportion of Variance 0.01299 0.01253 0.01229 0.01215 0.01161 0.01147 0.01078
## Cumulative Proportion 0.40080 0.41333 0.42562 0.43777 0.44938 0.46085 0.47163
## PC22 PC23 PC24 PC25 PC26 PC27 PC28
## Standard deviation 1.20339 1.18379 1.1731 1.15938 1.15413 1.1368 1.12690
## Proportion of Variance 0.01042 0.01008 0.0099 0.00967 0.00958 0.0093 0.00914
## Cumulative Proportion 0.48205 0.49213 0.5020 0.51170 0.52128 0.5306 0.53971
## PC29 PC30 PC31 PC32 PC33 PC34 PC35
## Standard deviation 1.11934 1.11398 1.11174 1.07504 1.06886 1.06522 1.05945
## Proportion of Variance 0.00901 0.00893 0.00889 0.00831 0.00822 0.00816 0.00808
## Cumulative Proportion 0.54873 0.55766 0.56655 0.57486 0.58308 0.59124 0.59932
## PC36 PC37 PC38 PC39 PC40 PC41 PC42
## Standard deviation 1.04866 1.04223 1.02925 1.01912 1.01781 1.01342 0.99740
## Proportion of Variance 0.00791 0.00781 0.00762 0.00747 0.00745 0.00739 0.00716
## Cumulative Proportion 0.60723 0.61505 0.62267 0.63014 0.63759 0.64498 0.65214
## PC43 PC44 PC45 PC46 PC47 PC48 PC49
## Standard deviation 0.98076 0.97486 0.96922 0.96600 0.94708 0.94245 0.94107
## Proportion of Variance 0.00692 0.00684 0.00676 0.00671 0.00645 0.00639 0.00637
## Cumulative Proportion 0.65906 0.66589 0.67265 0.67937 0.68582 0.69221 0.69858
## PC50 PC51 PC52 PC53 PC54 PC55 PC56
## Standard deviation 0.92899 0.92360 0.91797 0.91500 0.90809 0.90167 0.89283
## Proportion of Variance 0.00621 0.00614 0.00606 0.00602 0.00593 0.00585 0.00573
## Cumulative Proportion 0.70479 0.71093 0.71699 0.72301 0.72894 0.73479 0.74053
## PC57 PC58 PC59 PC60
## Standard deviation 0.8821 0.87685 0.87166 0.86777
## Proportion of Variance 0.0056 0.00553 0.00547 0.00542
## Cumulative Proportion 0.7461 0.75166 0.75712 0.76254
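The rows of this summary are linked by simple arithmetic: each proportion of variance is the squared standard deviation of the component divided by the total variance. The sketch below reproduces the first five columns, assuming the total variance equals 139 (one unit per standardized variable); that assumption is not stated in the output but is consistent with its figures.

```python
from itertools import accumulate

# Standard deviations of PC1..PC5, copied from the summary above.
sdev = [3.04124, 2.76618, 2.63232, 2.0625, 2.00990]

# Assumption: 139 standardized variables, so the total variance is 139.
total_variance = 139

proportion = [s ** 2 / total_variance for s in sdev]
cumulative = list(accumulate(proportion))

for i, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{i}: proportion={p:.5f} cumulative={c:.5f}")
```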
4.1.1 Variance explained by the components
From component 10 onward, each additional component adds little explained variance. However, 10 is not an adequate number of components, because 10 components explain only 0.3287 of the variance, which is still very low. We therefore take 37 components, which accumulate 0.61505 of explained variance and still greatly reduce the number of variables from the original 139.
## Importance of first k=37 (out of 139) components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.04124 2.76618 2.63232 2.0625 2.00990 1.8188 1.72636
## Proportion of Variance 0.06654 0.05505 0.04985 0.0306 0.02906 0.0238 0.02144
## Cumulative Proportion 0.06654 0.12159 0.17144 0.2020 0.23111 0.2549 0.27635
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 1.61741 1.55241 1.5007 1.48319 1.4535 1.40767 1.38593
## Proportion of Variance 0.01882 0.01734 0.0162 0.01583 0.0152 0.01426 0.01382
## Cumulative Proportion 0.29517 0.31251 0.3287 0.34453 0.3597 0.37399 0.38781
## PC15 PC16 PC17 PC18 PC19 PC20 PC21
## Standard deviation 1.34383 1.31997 1.30705 1.29942 1.27015 1.26272 1.22407
## Proportion of Variance 0.01299 0.01253 0.01229 0.01215 0.01161 0.01147 0.01078
## Cumulative Proportion 0.40080 0.41333 0.42562 0.43777 0.44938 0.46085 0.47163
## PC22 PC23 PC24 PC25 PC26 PC27 PC28
## Standard deviation 1.20339 1.18379 1.1731 1.15938 1.15413 1.1368 1.12690
## Proportion of Variance 0.01042 0.01008 0.0099 0.00967 0.00958 0.0093 0.00914
## Cumulative Proportion 0.48205 0.49213 0.5020 0.51170 0.52128 0.5306 0.53971
## PC29 PC30 PC31 PC32 PC33 PC34 PC35
## Standard deviation 1.11934 1.11398 1.11174 1.07504 1.06886 1.06522 1.05945
## Proportion of Variance 0.00901 0.00893 0.00889 0.00831 0.00822 0.00816 0.00808
## Cumulative Proportion 0.54873 0.55766 0.56655 0.57486 0.58308 0.59124 0.59932
## PC36 PC37
## Standard deviation 1.04866 1.04223
## Proportion of Variance 0.00791 0.00781
## Cumulative Proportion 0.60723 0.61505
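One way to make the choice of components reproducible is to pick the smallest number whose cumulative proportion reaches a target. The sketch below uses the cumulative proportions printed above; the function name `components_needed` and the targets are illustrative assumptions, since the report does not state a threshold rule.

```python
# Cumulative proportions of PC1..PC37, copied from the summary above.
cumulative = [
    0.06654, 0.12159, 0.17144, 0.2020, 0.23111, 0.2549, 0.27635,
    0.29517, 0.31251, 0.3287, 0.34453, 0.3597, 0.37399, 0.38781,
    0.40080, 0.41333, 0.42562, 0.43777, 0.44938, 0.46085, 0.47163,
    0.48205, 0.49213, 0.5020, 0.51170, 0.52128, 0.5306, 0.53971,
    0.54873, 0.55766, 0.56655, 0.57486, 0.58308, 0.59124, 0.59932,
    0.60723, 0.61505,
]


def components_needed(cumulative, target):
    """Smallest number of components reaching `target`, or None if never reached."""
    for k, value in enumerate(cumulative, start=1):
        if value >= target:
            return k
    return None


# Hypothetical targets, chosen only to illustrate the rule.
for target in (0.30, 0.60, 0.61):
    print(target, components_needed(cumulative, target))
```

With these figures a target of 0.61 corresponds to the 37 components retained here, while a target of 0.60 would already be met by 36.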
4.1.2 Rotated variables
A look at the rotation matrix, which contains the loadings for each component. Here we can see the weight that each variable has in the different components. For example, for the music variables, component 2 is the one that carries the weights for the variables about musical genres.
## PC1 PC2 PC3 PC4
## Music -0.047582405 0.04938435 -0.03767730 0.092080493
## Slow.songs.or.fast.songs 0.064313113 0.01027430 -0.06073387 -0.030376721
## Dance -0.050989897 0.04765625 -0.17185018 -0.035187841
## Folk -0.073750387 0.12386414 0.09528105 -0.041192585
## Country -0.005007755 0.11595332 0.08244409 -0.008189785
## Classical.music -0.061054130 0.16388349 0.16326957 0.091623269
## PC5 PC6 PC7 PC8
## Music -0.066213032 0.09272616 -0.01671909 0.09413288
## Slow.songs.or.fast.songs -0.020183012 0.03662025 0.04870411 0.04415341
## Dance 0.098555344 0.14960482 0.01952342 -0.13294629
## Folk -0.018508268 0.06857354 -0.04970073 -0.19373424
## Country 0.057907240 0.09877130 -0.09269609 -0.14614308
## Classical.music 0.009994683 -0.02481594 -0.01569838 -0.09929204
## PC9 PC10 PC11 PC12
## Music -0.047585529 0.037614277 -0.021478425 0.00327134
## Slow.songs.or.fast.songs 0.014203661 -0.006595415 -0.120183418 0.09067466
## Dance -0.048351927 -0.027277526 -0.131863945 -0.01731938
## Folk 0.042994842 -0.051886501 -0.036804984 -0.07936266
## Country 0.003475195 -0.103622576 -0.003224945 -0.04902021
## Classical.music -0.083196649 -0.097619806 0.060019098 -0.03606276
## PC13 PC14 PC15 PC16
## Music -0.07002795 0.070544464 -0.054512767 0.100774191
## Slow.songs.or.fast.songs -0.06791649 0.119509804 0.059384830 0.192476959
## Dance -0.14969820 -0.004770693 0.148166781 0.024879144
## Folk -0.08936829 -0.081762472 -0.076595385 -0.065730774
## Country 0.04606722 -0.039544808 -0.006979301 -0.111137060
## Classical.music -0.04389010 -0.145301150 -0.050043599 0.003550101
## PC17 PC18 PC19 PC20
## Music -0.03077878 0.10828297 -0.08435355 0.14314527
## Slow.songs.or.fast.songs -0.09038227 -0.08304166 -0.06220677 0.15298635
## Dance 0.07406015 -0.01239437 -0.07426104 0.03943098
## Folk 0.06789898 0.01022200 -0.05799981 0.01968971
## Country 0.17337028 -0.05652503 0.14726351 -0.02885842
## Classical.music 0.03576740 0.12052565 -0.05962899 0.04250060
## PC21 PC22 PC23 PC24
## Music -0.05024822 0.13998601 -0.09748485 0.228475645
## Slow.songs.or.fast.songs -0.07661578 0.10846930 -0.08462884 0.058231941
## Dance -0.07588067 0.15447675 0.01250064 0.030885279
## Folk 0.13176087 0.10736805 -0.06216723 0.086242483
## Country 0.10744040 0.18577174 0.01331886 0.143480799
## Classical.music 0.02190333 -0.03173615 -0.12064399 -0.000917923
## PC25 PC26 PC27 PC28
## Music -0.073590324 0.13897119 -0.036416903 0.0348211917
## Slow.songs.or.fast.songs -0.070262488 0.05770227 -0.018477067 -0.0088521645
## Dance 0.078821325 -0.02069734 0.005871426 0.0305858411
## Folk -0.005501371 -0.03376453 0.048016000 -0.0009740571
## Country 0.053890221 0.03193136 -0.069281620 0.0554160369
## Classical.music -0.035178002 -0.01206947 0.100322068 0.0502515957
## PC29 PC30 PC31 PC32
## Music -0.155785360 0.16434501 -0.06935369 -0.12254974
## Slow.songs.or.fast.songs 0.179139760 -0.07665298 -0.23513782 0.10055025
## Dance -0.002067251 0.10047582 -0.01847670 0.08201127
## Folk 0.180039653 0.00285099 -0.18267916 -0.09340278
## Country 0.051342416 0.05485036 -0.11078691 -0.04746951
## Classical.music -0.029498018 -0.03309783 0.01886751 -0.13455546
## PC33 PC34 PC35 PC36
## Music 0.02186480 0.24276967 -0.07117291 0.015982107
## Slow.songs.or.fast.songs -0.05427405 -0.05927715 -0.07387483 -0.005653228
## Dance -0.06446429 0.01279733 0.04701843 0.037679529
## Folk 0.14489480 -0.02662025 0.06034935 0.061365222
## Country 0.15467404 -0.04323688 0.13465474 0.064140810
## Classical.music -0.03859241 0.05504377 -0.03677877 0.047540606
## PC37
## Music 0.06660021
## Slow.songs.or.fast.songs -0.03610156
## Dance 0.01398491
## Folk -0.02598364
## Country 0.04678499
## Classical.music 0.05277598
The diagram shows the components with the largest weights for the different variables, including the 37 principal components.
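Without the diagram, the same reading can be approximated by ranking variables by the absolute value of their loading on a component. The sketch below does this for PC2 using only the six variables printed in the loadings output, so it says nothing about the other 133 variables.

```python
# PC2 loadings for the six variables shown in the output above.
pc2 = {
    "Music": 0.04938435,
    "Slow.songs.or.fast.songs": 0.01027430,
    "Dance": 0.04765625,
    "Folk": 0.12386414,
    "Country": 0.11595332,
    "Classical.music": 0.16388349,
}

# The sign only gives direction; the magnitude gives the weight.
ranked = sorted(pc2, key=lambda name: abs(pc2[name]), reverse=True)

for name in ranked:
    print(f"{name}: {pc2[name]:+.4f}")
```

Among these six variables, the three genre variables (Classical.music, Folk, Country) have the largest weights on PC2.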
4.1.3 Principal component values
The value of the principal components for each observation (principal component scores) is obtained by multiplying the data by the loading vectors. The result is stored in the matrix:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## 1 -1.0530571 -0.6334789 -1.33637596 -4.5456982 -0.8059460 1.1955210 0.8777088
## 2 2.9015048 -2.0715342 1.46402413 0.6630106 -1.2557060 -1.2537878 -0.1009695
## 3 -1.8848221 2.7063670 2.85136508 1.5208920 -3.3216476 0.8186191 0.8044389
## 5 -0.3302213 -0.2635941 -0.01618298 -2.2163474 0.6670398 0.2047556 1.3600030
## 6 3.1450176 2.1792054 2.22702147 0.5933384 -0.7623117 1.1436428 0.7135727
## 7 -3.1906127 0.9647040 -2.03279992 -4.7222350 -0.6491279 -0.2779432 3.6319297
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## 1 2.1698138 -1.9888306 -0.3106233 3.2083750 0.4705492 2.4926455 1.22944986
## 2 3.5297389 -1.7411825 -1.4306605 0.6270017 -2.1422052 0.5999673 2.05666647
## 3 -1.2463240 -1.1531245 2.0365268 -1.4948925 -1.6395180 3.4700757 0.80225779
## 5 -1.0862003 -2.9587908 1.5525601 -0.1645998 -0.2226828 -2.1117461 -0.75277017
## 6 0.7180025 0.7582240 0.4531394 -1.6573571 1.8411153 -0.0610575 0.49119067
## 7 2.2258596 0.3643288 -0.1385240 -2.0809430 -0.8940154 -1.4011170 0.05398411
## PC15 PC16 PC17 PC18 PC19 PC20
## 1 -0.08378039 0.3400839 -1.7359938 -0.31381989 -0.4627950 0.3028893
## 2 1.44366540 -0.8775515 -0.8365413 -2.47298514 -1.2767190 -0.4407617
## 3 3.03874541 -2.9873473 -0.6992351 -0.70752259 -0.2619535 1.3252956
## 5 0.11267696 -0.2405331 1.1741541 0.38283555 -0.9810110 2.5739937
## 6 -0.83372692 -0.9259021 -0.2685235 0.09426062 -0.6391717 1.1513992
## 7 -0.84860887 0.1917878 -1.5537888 -1.01711848 0.4086109 0.9776245
## PC21 PC22 PC23 PC24 PC25 PC26
## 1 -0.855400897 0.7566404 0.8683802 -1.9796481 1.241736327 0.8908599
## 2 -2.260638516 -1.4041847 -0.6155612 0.3887553 1.716912088 -0.5445146
## 3 -1.239046465 -0.3997135 -2.0643117 0.6069301 0.306273109 -0.9934903
## 5 1.103970434 -0.5499236 -0.5012331 -2.3168332 -0.003982265 0.0730956
## 6 0.554280269 -0.8974694 1.1908446 0.5199019 -0.880485527 -0.8867896
## 7 -0.006353953 -0.2017059 1.5086426 0.7657306 -0.115633362 0.5971394
## PC27 PC28 PC29 PC30 PC31 PC32
## 1 -1.995775697 -0.48584284 -1.41778492 0.80445388 -2.4901759 -1.5074082
## 2 -0.205115650 -0.03535099 0.07995225 0.23550960 -0.4829410 2.0286258
## 3 -2.875302750 -0.64502380 0.76566066 -1.48689994 -1.8113357 2.7324074
## 5 -0.006182388 0.42972822 1.09860257 -0.70040111 0.0106755 -0.6183481
## 6 0.193477721 -0.45130528 0.12238776 0.08088869 1.0247205 -1.7930233
## 7 1.098185013 0.16763676 0.27023758 0.86115267 -1.2710940 0.4705214
## PC33 PC34 PC35 PC36 PC37
## 1 -0.004475272 -0.2931144 0.00645661 0.11918733 2.109312735
## 2 -1.253496529 -1.2830332 0.82233043 0.09958263 0.227989630
## 3 0.146921222 0.3601148 -0.09376475 -0.35066239 0.658878692
## 5 0.359296688 0.9040017 -0.74292049 0.15875317 -0.002450738
## 6 2.107508907 1.4232286 -1.53236935 0.82598706 -1.454308806
## 7 -1.564472644 0.3391209 0.18145575 1.78885351 -0.332875756
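The relation between the decomposition, the loadings and the scores can be sketched with NumPy. This is an illustration on synthetic data (random numbers, not the survey), and it is not the routine that produced the output above; all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the survey: 50 rows, 4 columns with different spreads.
X = rng.normal(size=(50, 4)) * np.array([1.0, 5.0, 0.5, 2.0])

# Standardize each column (mean 0, sample standard deviation 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Thin SVD: Z = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

loadings = Vt.T                        # one column per component
sdev = S / np.sqrt(Z.shape[0] - 1)     # standard deviation of each component
scores = Z @ loadings                  # data multiplied by the loading vectors

print(np.round(sdev, 4))
print(np.round(scores[:3], 4))
```

The scores computed as `Z @ loadings` coincide with `U * S`, which is why the decomposition yields them directly.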
4.1.4 Diagram
The diagram for inspecting the variables is the following:
4.2 Using eigenvectors/eigenvalues
Results obtained when the eigenvectors/eigenvalues are computed on the covariance matrix.
## Importance of components:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
## Standard deviation 15.9984176 5.77526107 3.51117834 3.18662002 3.06872118
## Proportion of Variance 0.5229317 0.06814502 0.02518817 0.02074682 0.01924003
## Cumulative Proportion 0.5229317 0.59107670 0.61626488 0.63701169 0.65625172
## Comp.6 Comp.7 Comp.8 Comp.9 Comp.10
## Standard deviation 2.81520010 2.47703533 2.32482555 2.148750949 2.131880497
## Proportion of Variance 0.01619233 0.01253589 0.01104261 0.009433287 0.009285742
## Cumulative Proportion 0.67244406 0.68497995 0.69602256 0.705455846 0.714741588
## Comp.11 Comp.12 Comp.13 Comp.14
## Standard deviation 2.006079342 1.907771857 1.86382497 1.823116207
## Proportion of Variance 0.008222182 0.007436075 0.00709743 0.006790779
## Cumulative Proportion 0.722963771 0.730399846 0.73749728 0.744288055
## Comp.15 Comp.16 Comp.17 Comp.18
## Standard deviation 1.782508594 1.741779187 1.656306778 1.615879982
## Proportion of Variance 0.006491636 0.006198364 0.005604959 0.005334689
## Cumulative Proportion 0.750779691 0.756978055 0.762583015 0.767917704
## Comp.19 Comp.20 Comp.21 Comp.22
## Standard deviation 1.602097391 1.586045760 1.552617373 1.536936913
## Proportion of Variance 0.005244073 0.005139517 0.004925154 0.004826174
## Cumulative Proportion 0.773161776 0.778301294 0.783226447 0.788052621
## Comp.23 Comp.24 Comp.25 Comp.26
## Standard deviation 1.51927175 1.499266829 1.485604395 1.453886542
## Proportion of Variance 0.00471587 0.004592496 0.004509177 0.004318689
## Cumulative Proportion 0.79276849 0.797360987 0.801870164 0.806188853
## Comp.27 Comp.28 Comp.29 Comp.30
## Standard deviation 1.424170664 1.387480174 1.367020576 1.355043492
## Proportion of Variance 0.004143954 0.003933186 0.003818045 0.003751434
## Cumulative Proportion 0.810332807 0.814265993 0.818084037 0.821835472
## Comp.31 Comp.32 Comp.33 Comp.34
## Standard deviation 1.342891575 1.336620014 1.318240867 1.310828856
## Proportion of Variance 0.003684451 0.003650117 0.003550426 0.003510612
## Cumulative Proportion 0.825519923 0.829170040 0.832720466 0.836231078
## Comp.35 Comp.36 Comp.37 Comp.38
## Standard deviation 1.303439434 1.285405386 1.268814256 1.260574067
## Proportion of Variance 0.003471144 0.003375757 0.003289175 0.003246591
## Cumulative Proportion 0.839702222 0.843077979 0.846367154 0.849613745
## Comp.39 Comp.40 Comp.41 Comp.42
## Standard deviation 1.246659002 1.232703104 1.221915134 1.212833000
## Proportion of Variance 0.003175311 0.003104616 0.003050514 0.003005335
## Cumulative Proportion 0.852789056 0.855893672 0.858944186 0.861949521
## Comp.43 Comp.44 Comp.45 Comp.46
## Standard deviation 1.201897616 1.18422317 1.173688963 1.169396746
## Proportion of Variance 0.002951385 0.00286522 0.002814472 0.002793924
## Cumulative Proportion 0.864900906 0.86776613 0.870580598 0.873374523
## Comp.47 Comp.48 Comp.49 Comp.50
## Standard deviation 1.167845528 1.145279006 1.142156650 1.125029433
## Proportion of Variance 0.002786517 0.002679869 0.002665276 0.002585941
## Cumulative Proportion 0.876161040 0.878840908 0.881506184 0.884092126
## Comp.51 Comp.52 Comp.53 Comp.54
## Standard deviation 1.1133657 1.101971192 1.093725776 1.092434327
## Proportion of Variance 0.0025326 0.002481026 0.002444037 0.002438269
## Cumulative Proportion 0.8866247 0.889105752 0.891549789 0.893988058
## Comp.55 Comp.56 Comp.57 Comp.58
## Standard deviation 1.083328573 1.069550287 1.061976177 1.05570768
## Proportion of Variance 0.002397791 0.002337186 0.002304201 0.00227708
## Cumulative Proportion 0.896385848 0.898723035 0.901027236 0.90330432
## Comp.59 Comp.60 Comp.61 Comp.62
## Standard deviation 1.046179141 1.040988679 1.022819636 1.016754395
## Proportion of Variance 0.002236161 0.002214027 0.002137416 0.002112142
## Cumulative Proportion 0.905540477 0.907754504 0.909891919 0.912004061
## Comp.63 Comp.64 Comp.65 Comp.66
## Standard deviation 1.009682933 1.0011969 0.994283520 0.991515624
## Proportion of Variance 0.002082864 0.0020480 0.002019814 0.002008584
## Cumulative Proportion 0.914086925 0.9161349 0.918154739 0.920163323
## Comp.67 Comp.68 Comp.69 Comp.70
## Standard deviation 0.971302144 0.965426073 0.961404789 0.944233856
## Proportion of Variance 0.001927523 0.001904272 0.001888441 0.001821587
## Cumulative Proportion 0.922090846 0.923995118 0.925883559 0.927705147
## Comp.71 Comp.72 Comp.73 Comp.74
## Standard deviation 0.939813732 0.935310607 0.922600046 0.920290702
## Proportion of Variance 0.001804573 0.001787321 0.001739073 0.001730378
## Cumulative Proportion 0.929509720 0.931297041 0.933036114 0.934766492
## Comp.75 Comp.76 Comp.77 Comp.78
## Standard deviation 0.913748572 0.901839801 0.898525443 0.89555798
## Proportion of Variance 0.001705864 0.001661689 0.001649497 0.00163862
## Cumulative Proportion 0.936472356 0.938134044 0.939783542 0.94142216
## Comp.79 Comp.80 Comp.81 Comp.82
## Standard deviation 0.885031621 0.879749066 0.872813999 0.860959850
## Proportion of Variance 0.001600326 0.001581279 0.001556447 0.001514456
## Cumulative Proportion 0.943022488 0.944603767 0.946160214 0.947674670
## Comp.83 Comp.84 Comp.85 Comp.86
## Standard deviation 0.849740131 0.845509109 0.834077266 0.825199329
## Proportion of Variance 0.001475242 0.001460587 0.001421358 0.001391261
## Cumulative Proportion 0.949149912 0.950610499 0.952031857 0.953423118
## Comp.87 Comp.88 Comp.89 Comp.90
## Standard deviation 0.820546761 0.817043718 0.810453399 0.807041063
## Proportion of Variance 0.001375617 0.001363897 0.001341983 0.001330706
## Cumulative Proportion 0.954798735 0.956162632 0.957504615 0.958835321
## Comp.91 Comp.92 Comp.93 Comp.94
## Standard deviation 0.802024379 0.796796395 0.791029965 0.780106030
## Proportion of Variance 0.001314214 0.001297136 0.001278429 0.001243364
## Cumulative Proportion 0.960149535 0.961446671 0.962725101 0.963968464
## Comp.95 Comp.96 Comp.97 Comp.98
## Standard deviation 0.772439968 0.768573562 0.763096685 0.749727091
## Proportion of Variance 0.001219047 0.001206874 0.001189734 0.001148411
## Cumulative Proportion 0.965187511 0.966394385 0.967584119 0.968732530
## Comp.99 Comp.100 Comp.101 Comp.102
## Standard deviation 0.743805898 0.736137620 0.732093748 0.721484743
## Proportion of Variance 0.001130343 0.001107156 0.001095026 0.001063519
## Cumulative Proportion 0.969862873 0.970970029 0.972065054 0.973128573
## Comp.103 Comp.104 Comp.105 Comp.106
## Standard deviation 0.715076882 0.70700872 0.701587431 0.6950309458
## Proportion of Variance 0.001044711 0.00102127 0.001005668 0.0009869592
## Cumulative Proportion 0.974173285 0.97519455 0.976200222 0.9771871813
## Comp.107 Comp.108 Comp.109 Comp.110
## Standard deviation 0.6920100074 0.6884785721 0.6763785283 0.6680437261
## Proportion of Variance 0.0009783982 0.0009684378 0.0009346963 0.0009118023
## Cumulative Proportion 0.9781655795 0.9791340174 0.9800687137 0.9809805160
## Comp.111 Comp.112 Comp.113 Comp.114
## Standard deviation 0.6593264796 0.6586379711 0.6530632641 0.6465293020
## Proportion of Variance 0.0008881615 0.0008863075 0.0008713676 0.0008540186
## Cumulative Proportion 0.9818686774 0.9827549849 0.9836263525 0.9844803711
## Comp.115 Comp.116 Comp.117 Comp.118
## Standard deviation 0.635429021 0.6261021475 0.6224620036 0.6196660567
## Proportion of Variance 0.000824945 0.0008009055 0.0007916197 0.0007845241
## Cumulative Proportion 0.985305316 0.9861062216 0.9868978413 0.9876823655
## Comp.119 Comp.120 Comp.121 Comp.122
## Standard deviation 0.6127025255 0.6086809479 0.5953769121 0.5931231464
## Proportion of Variance 0.0007669909 0.0007569554 0.0007242273 0.0007187546
## Cumulative Proportion 0.9884493564 0.9892063118 0.9899305391 0.9906492937
## Comp.123 Comp.124 Comp.125 Comp.126
## Standard deviation 0.5865289221 0.5817124836 0.578611893 0.5671206237
## Proportion of Variance 0.0007028615 0.0006913654 0.000684015 0.0006571156
## Cumulative Proportion 0.9913521552 0.9920435207 0.992727536 0.9933846512
## Comp.127 Comp.128 Comp.129 Comp.130
## Standard deviation 0.5609464867 0.5433238817 0.5417229679 0.5353217521
## Proportion of Variance 0.0006428857 0.0006031266 0.0005995776 0.0005854916
## Cumulative Proportion 0.9940275369 0.9946306635 0.9952302411 0.9958157327
## Comp.131 Comp.132 Comp.133 Comp.134
## Standard deviation 0.5237794014 0.5151699249 0.5028088028 0.4932116344
## Proportion of Variance 0.0005605156 0.0005422404 0.0005165313 0.0004970013
## Cumulative Proportion 0.9963762484 0.9969184888 0.9974350201 0.9979320214
## Comp.135 Comp.136 Comp.137 Comp.138
## Standard deviation 0.4803267300 0.4626176551 0.4501413654 0.4348469696
## Proportion of Variance 0.0004713727 0.0004372555 0.0004139889 0.0003863348
## Cumulative Proportion 0.9984033941 0.9988406496 0.9992546385 0.9996409733
## Comp.139
## Standard deviation 0.4191967621
## Proportion of Variance 0.0003590267
## Cumulative Proportion 1.0000000000
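The eigenvector/eigenvalue route can be sketched with NumPy on synthetic data (random numbers, not the survey). The example also compares the covariance matrix with the correlation matrix, because the choice between them changes how much variance the first component absorbs. The helper `eigen_pca` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: four independent columns with very different spreads.
X = rng.normal(size=(200, 4)) * np.array([1.0, 5.0, 0.5, 2.0])
n = X.shape[0]


def eigen_pca(matrix):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    values, vectors = np.linalg.eigh(matrix)
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


cov_values, cov_vectors = eigen_pca(np.cov(X, rowvar=False))
cor_values, cor_vectors = eigen_pca(np.corrcoef(X, rowvar=False))

# Same standard deviations via SVD of the centered (unscaled) data.
Xc = X - X.mean(axis=0)
svd_sdev = np.linalg.svd(Xc, compute_uv=False) / np.sqrt(n - 1)

cov_share = cov_values[0] / cov_values.sum()
cor_share = cor_values[0] / cor_values.sum()
print(round(cov_share, 3), round(cor_share, 3))
```

On the same centered data, the square roots of the covariance eigenvalues match the standard deviations obtained from the SVD, so the two routes agree when they are applied to the same matrix. The first component's share is larger with the covariance matrix here because one column has a much larger variance than the others.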
4.2.1 Variance explained by the components
From component 10 onward, each additional component adds little explained variance. In this case 10 components already explain 0.714741588 of the variance, so we keep 10 principal components.
## Importance of first k=37 (out of 139) components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.04124 2.76618 2.63232 2.0625 2.00990 1.8188 1.72636
## Proportion of Variance 0.06654 0.05505 0.04985 0.0306 0.02906 0.0238 0.02144
## Cumulative Proportion 0.06654 0.12159 0.17144 0.2020 0.23111 0.2549 0.27635
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 1.61741 1.55241 1.5007 1.48319 1.4535 1.40767 1.38593
## Proportion of Variance 0.01882 0.01734 0.0162 0.01583 0.0152 0.01426 0.01382
## Cumulative Proportion 0.29517 0.31251 0.3287 0.34453 0.3597 0.37399 0.38781
## PC15 PC16 PC17 PC18 PC19 PC20 PC21
## Standard deviation 1.34383 1.31997 1.30705 1.29942 1.27015 1.26272 1.22407
## Proportion of Variance 0.01299 0.01253 0.01229 0.01215 0.01161 0.01147 0.01078
## Cumulative Proportion 0.40080 0.41333 0.42562 0.43777 0.44938 0.46085 0.47163
## PC22 PC23 PC24 PC25 PC26 PC27 PC28
## Standard deviation 1.20339 1.18379 1.1731 1.15938 1.15413 1.1368 1.12690
## Proportion of Variance 0.01042 0.01008 0.0099 0.00967 0.00958 0.0093 0.00914
## Cumulative Proportion 0.48205 0.49213 0.5020 0.51170 0.52128 0.5306 0.53971
## PC29 PC30 PC31 PC32 PC33 PC34 PC35
## Standard deviation 1.11934 1.11398 1.11174 1.07504 1.06886 1.06522 1.05945
## Proportion of Variance 0.00901 0.00893 0.00889 0.00831 0.00822 0.00816 0.00808
## Cumulative Proportion 0.54873 0.55766 0.56655 0.57486 0.58308 0.59124 0.59932
## PC36 PC37
## Standard deviation 1.04866 1.04223
## Proportion of Variance 0.00791 0.00781
## Cumulative Proportion 0.60723 0.61505
4.2.2 Rotated variables
A look at the rotation matrix, which contains the loadings for each component. Here we can see the weight that each variable has in the different components. For example, for the music variables, component 2 is the one that carries the weights for the variables about musical genres.
## PC1 PC2 PC3 PC4
## Music -0.047582405 0.04938435 -0.03767730 0.092080493
## Slow.songs.or.fast.songs 0.064313113 0.01027430 -0.06073387 -0.030376721
## Dance -0.050989897 0.04765625 -0.17185018 -0.035187841
## Folk -0.073750387 0.12386414 0.09528105 -0.041192585
## Country -0.005007755 0.11595332 0.08244409 -0.008189785
## Classical.music -0.061054130 0.16388349 0.16326957 0.091623269
## PC5 PC6 PC7 PC8
## Music -0.066213032 0.09272616 -0.01671909 0.09413288
## Slow.songs.or.fast.songs -0.020183012 0.03662025 0.04870411 0.04415341
## Dance 0.098555344 0.14960482 0.01952342 -0.13294629
## Folk -0.018508268 0.06857354 -0.04970073 -0.19373424
## Country 0.057907240 0.09877130 -0.09269609 -0.14614308
## Classical.music 0.009994683 -0.02481594 -0.01569838 -0.09929204
## PC9 PC10
## Music -0.047585529 0.037614277
## Slow.songs.or.fast.songs 0.014203661 -0.006595415
## Dance -0.048351927 -0.027277526
## Folk 0.042994842 -0.051886501
## Country 0.003475195 -0.103622576
## Classical.music -0.083196649 -0.097619806
The diagram shows the components with the largest weights for the different variables, including the retained principal components.
4.2.3 Principal component values
The value of the principal components for each observation (principal component scores) is obtained by multiplying the data by the loading vectors. The result is stored in the matrix:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## 1 -1.0530571 -0.6334789 -1.33637596 -4.5456982 -0.8059460 1.1955210 0.8777088
## 2 2.9015048 -2.0715342 1.46402413 0.6630106 -1.2557060 -1.2537878 -0.1009695
## 3 -1.8848221 2.7063670 2.85136508 1.5208920 -3.3216476 0.8186191 0.8044389
## 5 -0.3302213 -0.2635941 -0.01618298 -2.2163474 0.6670398 0.2047556 1.3600030
## 6 3.1450176 2.1792054 2.22702147 0.5933384 -0.7623117 1.1436428 0.7135727
## 7 -3.1906127 0.9647040 -2.03279992 -4.7222350 -0.6491279 -0.2779432 3.6319297
## PC8 PC9 PC10
## 1 2.1698138 -1.9888306 -0.3106233
## 2 3.5297389 -1.7411825 -1.4306605
## 3 -1.2463240 -1.1531245 2.0365268
## 5 -1.0862003 -2.9587908 1.5525601
## 6 0.7180025 0.7582240 0.4531394
## 7 2.2258596 0.3643288 -0.1385240
4.2.4 Diagram
The difference found between the two methods is that the eigenvector/eigenvalue approach explains more variance with fewer principal components. This can be much more convenient, since a smaller set of variables can be easier to interpret. Note, however, that this run works on the covariance matrix: its first component alone has a standard deviation of 15.998 and explains 0.5229 of the variance, whereas the component variances in section 4.1 are consistent with a total variance of 139, as expected for standardized variables. Part of the difference therefore likely comes from scaling rather than from the decomposition itself.
For example, the diagram shows that component 2 is one of those that best explains film preferences. Such an interpretation is not always possible with principal components, but it is one of the observations that this dimensionality reduction was able to offer.