We will use mtcars to predict mpg with principal component analysis (PCA), and we will look at the eigenvalues that the analysis produces.
glimpse(mtcars)
Observations: 32
Variables: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, 33.9, 21.5, 15.5, 15.2...
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 167.6, 167.6, 275.8, 275.8, 275.8, 472.0, 460.0, 440.0, 78.7, 75.7, 71.1,...
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180, 205, 215, 230, 66, 52, 65, 97, 150, 150, 245, 175, 66, 91, 113, 264, ...
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92, 3.07, 3.07, 3.07, 2.93, 3.00, 3.23, 4.08, 4.93, 4.22, 3.70, 2.76, 3.15...
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440, 3.440, 4.070, 3.730, 3.780, 5.250, 5.424, 5.345, 2.200, 1.615, 1.8...
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18.30, 18.90, 17.40, 17.60, 18.00, 17.98, 17.82, 17.42, 19.47, 18.52, 19....
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 5, 4
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2, 2, 4, 2, 1, 2, 2, 4, 6, 8, 2
We keep every variable except the one we want to predict (mpg).
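The call that builds mtproy is not shown. A minimal sketch in base R, assuming mtproy is simply mtcars with the mpg column dropped (the original may well have used dplyr instead):

```r
# Assumption: mtproy is mtcars without the response column mpg.
data(mtcars)
mtproy <- mtcars[, setdiff(names(mtcars), "mpg")]

dim(mtproy)  # 32 rows, 10 columns
```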
glimpse(mtproy)
Observations: 32
Variables: 10
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 167.6, 167.6, 275.8, 275.8, 275.8, 472.0, 460.0, 440.0, 78.7, 75.7, 71.1,...
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180, 205, 215, 230, 66, 52, 65, 97, 150, 150, 245, 175, 66, 91, 113, 264, ...
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92, 3.07, 3.07, 3.07, 2.93, 3.00, 3.23, 4.08, 4.93, 4.22, 3.70, 2.76, 3.15...
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440, 3.440, 4.070, 3.730, 3.780, 5.250, 5.424, 5.345, 2.200, 1.615, 1.8...
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18.30, 18.90, 17.40, 17.60, 18.00, 17.98, 17.82, 17.42, 19.47, 18.52, 19....
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 5, 4
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2, 2, 4, 2, 1, 2, 2, 4, 6, 8, 2
We build our PCA.
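The fitting call itself is not shown. The printed output has the layout of `prcomp()`, and its variances add up to the number of variables (10), which is what centering and scaling to unit variance would produce. Under those assumptions, a sketch of the fit could look like this:

```r
# Assumption: predictors are mtcars without mpg, centered and scaled.
mtproy <- mtcars[, setdiff(names(mtcars), "mpg")]

# scale. = TRUE runs the PCA on the correlation matrix, so variables
# measured in different units (hp, wt, qsec, ...) carry equal weight.
res.pca <- prcomp(mtproy, center = TRUE, scale. = TRUE)

summary(res.pca)
```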
summary(res.pca)
Importance of components:
                         PC1   PC2     PC3     PC4     PC5     PC6     PC7     PC8     PC9    PC10
Standard deviation     2.400 1.628 0.77280 0.51914 0.47143 0.45839 0.36458 0.28405 0.23163 0.15426
Proportion of Variance 0.576 0.265 0.05972 0.02695 0.02223 0.02101 0.01329 0.00807 0.00537 0.00238
Cumulative Proportion  0.576 0.841 0.90071 0.92766 0.94988 0.97089 0.98419 0.99226 0.99762 1.00000
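The eigenvalues mentioned at the start are the squared standard deviations in this summary. A short sketch, assuming a scaled `prcomp()` fit on mtcars without mpg, that checks them against the eigenvalues of the correlation matrix:

```r
# Assumption: scaled PCA on mtcars without mpg.
mtproy  <- mtcars[, setdiff(names(mtcars), "mpg")]
res.pca <- prcomp(mtproy, scale. = TRUE)

# Eigenvalues two ways: squared component standard deviations, and
# a direct eigendecomposition of the correlation matrix.
eig_from_pca <- res.pca$sdev^2
eig_from_cor <- eigen(cor(mtproy), symmetric = TRUE)$values
all.equal(eig_from_pca, eig_from_cor)

# "Proportion of Variance" is each eigenvalue over their total.
prop_var <- eig_from_pca / sum(eig_from_pca)
round(prop_var, 5)
```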
We look at the PCA rotation matrix, one column per dimension. The values are the coefficients (loadings) of the linear combination of the original variables that defines each principal component.
res.pca
Standard deviations (1, .., p=10):
[1] 2.4000453 1.6277725 0.7727968 0.5191403 0.4714341 0.4583857 0.3645821 0.2840450 0.2316298 0.1542606
Rotation (n x k) = (10 x 10):
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
cyl 0.4029711 -0.03901479 0.13874360 -8.040022e-05 0.06148048 -0.18206407 -0.04257067 0.07041306 -0.863268748 0.1670687388
disp 0.3959243 0.05393117 0.01633491 -2.646304e-01 0.33851109 0.35738419 0.19767431 -0.14361684 -0.020039738 -0.6838300858
hp 0.3543255 -0.24496137 -0.18225874 6.000387e-02 0.52828704 -0.03269674 -0.08503414 0.58708325 0.291428365 0.2462606844
drat -0.3155948 -0.27847781 -0.13057734 -8.528509e-01 0.10299748 -0.23386885 0.03226657 0.04010725 -0.086765162 0.0544414772
wt 0.3668004 0.14675805 -0.38579961 -2.527210e-01 -0.14410292 0.43201764 -0.03368560 -0.36605124 0.075971836 0.5318885631
qsec -0.2198982 0.46066271 -0.40307004 -7.174202e-02 -0.21341845 0.29265169 -0.03797611 0.59621869 -0.244573292 -0.1545795278
vs -0.3333571 0.22751987 -0.41252247 2.119502e-01 0.62369179 -0.11710663 -0.23387904 -0.36246041 -0.182200371 -0.0055443849
am -0.2474991 -0.43201042 0.23493804 3.190779e-02 0.04930286 0.60874338 -0.54631997 0.02588771 -0.154149509 -0.0003995261
gear -0.2214375 -0.46516217 -0.27929375 2.623809e-01 0.02039816 0.24560902 0.69429321 -0.01069942 -0.198369367 0.0741152014
carb 0.2267080 -0.41169300 -0.56172255 1.233534e-01 -0.36576403 -0.25782743 -0.33623769 -0.08067483 0.003086198 -0.3585136181
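To make the "linear combination" reading concrete, the sketch below rebuilds the component scores by multiplying the standardized data by the rotation matrix. It assumes the scaled `prcomp()` fit on mtcars without mpg; `Z` and `scores` are illustrative names.

```r
# Assumption: scaled PCA on mtcars without mpg.
mtproy  <- mtcars[, setdiff(names(mtcars), "mpg")]
res.pca <- prcomp(mtproy, scale. = TRUE)

# Standardize the data, then apply the loadings: each score is a
# weighted sum of the standardized variables.
Z      <- scale(mtproy)
scores <- Z %*% res.pca$rotation
all.equal(unname(scores), unname(res.pca$x))

# PC1 score of the first car, written out as an explicit weighted sum.
pc1_first <- sum(Z[1, ] * res.pca$rotation[, "PC1"])
```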
We visualize how much of the variability each PC contributes. By PC4 the cumulative proportion of variance reaches about 93% (0.92766), and the remaining components add little.
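The plotting call for this step is not shown. A base R sketch of the same idea, assuming the scaled `prcomp()` fit on mtcars without mpg (the 0.90 threshold is only an illustrative choice, and factoextra's `fviz_eig()` would be an alternative for the plot):

```r
# Assumption: scaled PCA on mtcars without mpg.
mtproy  <- mtcars[, setdiff(names(mtcars), "mpg")]
res.pca <- prcomp(mtproy, scale. = TRUE)

# Share of variance per component and its running total.
var_explained <- res.pca$sdev^2 / sum(res.pca$sdev^2)
cum_var       <- cumsum(var_explained)

# Smallest number of components whose cumulative share reaches a
# chosen threshold (0.90 here is an assumption, not a rule).
k <- which(cum_var >= 0.90)[1]

# Scree plot: variance of each component, in order.
screeplot(res.pca, type = "lines", main = "Scree plot")
```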
We plot the observations on the first two principal components and draw an ellipse around each group.
For the variables, we project their vectors onto a circle.
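library(factoextra)  # fviz_pca_ind()
library(ggplot2)     # ggtitle(), theme(), element_text()
# fill.ind needs one value per row of the data (32 cars). The earlier call used
# as.factor(decathlon2$Competition[1:23]), which has 23 values and failed with:
#   Error in fviz(X, element = "ind", axes = axes, geom = geom.ind, habillage = habillage, :
#   The length of fill variableshould be the same as the number of rows in the data.
# Grouping by number of cylinders is one possible choice taken from mtcars itself.
fviz_pca_ind(res.pca, geom.ind = "point", pointshape = 21,
             pointsize = 1,
             fill.ind = as.factor(mtcars$cyl),
             col.ind = "black",
             palette = "jco",
             addEllipses = TRUE,
             label = "var",
             col.var = "black",
             repel = TRUE,
             legend.title = "Cylinders") +
  ggtitle("2D PCA plot of the 10 mtcars predictors") +
  theme(plot.title = element_text(hjust = 0.5))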
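The projection of the variables onto a circle has no code in this section. A base R sketch, assuming the scaled `prcomp()` fit on mtcars without mpg: each variable's coordinate on a component is its loading times that component's standard deviation, which for a scaled PCA equals the correlation between the variable and the component. factoextra's `fviz_pca_var()` draws a similar plot.

```r
# Assumption: scaled PCA on mtcars without mpg.
mtproy  <- mtcars[, setdiff(names(mtcars), "mpg")]
res.pca <- prcomp(mtproy, scale. = TRUE)

# Variable coordinates: loading times the component's standard deviation.
var_coord <- sweep(res.pca$rotation, 2, res.pca$sdev, `*`)

# Unit circle plus one arrow per variable on the first two components.
theta <- seq(0, 2 * pi, length.out = 200)
plot(cos(theta), sin(theta), type = "l", asp = 1,
     xlab = "PC1", ylab = "PC2", main = "Variables on PC1 and PC2")
arrows(0, 0, var_coord[, 1], var_coord[, 2], length = 0.08)
text(var_coord[, 1], var_coord[, 2], labels = rownames(var_coord), pos = 3)
```

Arrows that end close to the circle belong to variables that the first two components represent well. Arrows pointing in similar directions indicate positively correlated variables.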