Principales librerias utilizadas:
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:kableExtra':
##
## group_rows
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## corrplot 0.92 loaded
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
##
## src, summarize
## The following objects are masked from 'package:base':
##
## format.pval, units
## Loading required package: ggplot2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
Esta guia presenta un Ejercio para su realizacion, que es el siguiente: Una empresa especializada en el diseño de automóviles de turismo desea estudiar cuáles son los deseos del público que compra automóviles. Para ello diseña una encuesta con 10 preguntas donde se le pide a cada uno de los 20 encuestados que valore de 1 a 5 si una característica es o no muy importante. Los encuestados deberán contestar con un 5 si la característica es muy importante, un4 si es importante, un 3 si tiene regular importancia, un 2 si es poco importante y un 1 si no es nada importante. Las 10 características (V1 a V10) a valorar son: V1= precio, V2= financiación, V3= consumo, V4= combustible, V5= seguridad, V6= confort, V7= capacidad, V8= prestaciones, V9= modernida V10= aerodinámica.
Su Matriz de información es la siguiente:
load("C:/Users/eazuc/OneDrive/Escritorio/Metodos para el Analisis Economico/Tarea Componentes Principales/6-2.RData")matriz_x <- X6_2
matriz_x %>%
head() %>%
kable(caption = "Matriz de informacion",
align = "c") %>%
row_spec(0,background = "#E0B9EF" ) %>%
kable_material(html_font = "sans-serif")| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 1 | 4 | 3 | 3 | 2 | 4 | 4 | 4 | 4 |
| 5 | 5 | 4 | 4 | 3 | 3 | 4 | 1 | 1 | 3 |
| 2 | 1 | 3 | 1 | 4 | 2 | 1 | 5 | 4 | 5 |
| 1 | 1 | 1 | 1 | 4 | 4 | 2 | 5 | 5 | 4 |
| 1 | 1 | 2 | 1 | 5 | 5 | 4 | 3 | 3 | 2 |
| 5 | 5 | 5 | 5 | 3 | 3 | 4 | 2 | 2 | 1 |
Realizar un análisis de Componentes Principales, una solución adecuada de la cantidad de Componentes a retener y justifique su respuesta.
“Cálculo de V(X) forma manual”
centrado <- function(X){
X - mean(X)
}
Xcentrado <- apply(X = matriz_x, MARGIN = 2, centrado)
n_observaciones <- nrow(matriz_x)
matriz_v <- t(Xcentrado) %*% Xcentrado/(n_observaciones - 1)
matriz_v %>%
kable(caption ="Calculo de V(X) forma manual",
align ="c",
digits =2) %>%
row_spec(0,background = "#E0D7FF" ) %>%
kable_material(html_font = "sans-serif") %>%
kable_styling(bootstrap_options = c("striped", "hover"))| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | |
|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 1.80 | 1.92 | 1.32 | 1.73 | -0.62 | -0.31 | 0.36 | -1.21 | -1.27 | -0.90 |
| V2 | 1.92 | 2.67 | 1.42 | 2.14 | -0.66 | -0.14 | 0.52 | -1.78 | -1.81 | -1.54 |
| V3 | 1.32 | 1.42 | 1.42 | 1.53 | -0.53 | -0.32 | 0.29 | -0.92 | -1.11 | -0.87 |
| V4 | 1.73 | 2.14 | 1.53 | 2.48 | -0.80 | -0.48 | 0.35 | -1.61 | -1.83 | -1.39 |
| V5 | -0.62 | -0.66 | -0.53 | -0.80 | 0.85 | 0.80 | 0.21 | 0.37 | 0.46 | 0.15 |
| V6 | -0.31 | -0.14 | -0.32 | -0.48 | 0.80 | 1.38 | 0.63 | 0.22 | 0.09 | -0.37 |
| V7 | 0.36 | 0.52 | 0.29 | 0.35 | 0.21 | 0.63 | 1.61 | -0.53 | -0.34 | -0.71 |
| V8 | -1.21 | -1.78 | -0.92 | -1.61 | 0.37 | 0.22 | -0.53 | 1.92 | 1.81 | 1.37 |
| V9 | -1.27 | -1.81 | -1.11 | -1.83 | 0.46 | 0.09 | -0.34 | 1.81 | 2.17 | 1.56 |
| V10 | -0.90 | -1.54 | -0.87 | -1.39 | 0.15 | -0.37 | -0.71 | 1.37 | 1.56 | 1.82 |
“Cálculo de V(X) a travéz de R base”
cov(matriz_x) %>%
kable(caption ="Calculo de V(X) a travez de R base",
align ="c",
digits =2) %>%
row_spec(0,background = "#C1F0C0" ) %>%
kable_material(html_font = "sans-serif") %>%
kable_styling(bootstrap_options = c("striped", "hover"))| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | |
|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 1.80 | 1.92 | 1.32 | 1.73 | -0.62 | -0.31 | 0.36 | -1.21 | -1.27 | -0.90 |
| V2 | 1.92 | 2.67 | 1.42 | 2.14 | -0.66 | -0.14 | 0.52 | -1.78 | -1.81 | -1.54 |
| V3 | 1.32 | 1.42 | 1.42 | 1.53 | -0.53 | -0.32 | 0.29 | -0.92 | -1.11 | -0.87 |
| V4 | 1.73 | 2.14 | 1.53 | 2.48 | -0.80 | -0.48 | 0.35 | -1.61 | -1.83 | -1.39 |
| V5 | -0.62 | -0.66 | -0.53 | -0.80 | 0.85 | 0.80 | 0.21 | 0.37 | 0.46 | 0.15 |
| V6 | -0.31 | -0.14 | -0.32 | -0.48 | 0.80 | 1.38 | 0.63 | 0.22 | 0.09 | -0.37 |
| V7 | 0.36 | 0.52 | 0.29 | 0.35 | 0.21 | 0.63 | 1.61 | -0.53 | -0.34 | -0.71 |
| V8 | -1.21 | -1.78 | -0.92 | -1.61 | 0.37 | 0.22 | -0.53 | 1.92 | 1.81 | 1.37 |
| V9 | -1.27 | -1.81 | -1.11 | -1.83 | 0.46 | 0.09 | -0.34 | 1.81 | 2.17 | 1.56 |
| V10 | -0.90 | -1.54 | -0.87 | -1.39 | 0.15 | -0.37 | -0.71 | 1.37 | 1.56 | 1.82 |
“Cálculo de V(X) forma manual”
Zx <- scale(x = matriz_x, center = TRUE)
n_observaciones <- nrow(matriz_x)
matriz_R <- t(Zx) %*% Zx/(n_observaciones -1)
matriz_R %>%
kable(caption ="Calculo de V(X) forma manual",
align ="c",
digits =2) %>%
row_spec(0,background = "#E0B9EF" ) %>%
kable_material(html_font = "sans-serif") %>%
kable_styling(bootstrap_options = c("striped", "hover"))| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | |
|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 1.00 | 0.87 | 0.82 | 0.82 | -0.50 | -0.19 | 0.21 | -0.65 | -0.64 | -0.50 |
| V2 | 0.87 | 1.00 | 0.73 | 0.83 | -0.44 | -0.07 | 0.25 | -0.78 | -0.75 | -0.70 |
| V3 | 0.82 | 0.73 | 1.00 | 0.81 | -0.48 | -0.23 | 0.19 | -0.56 | -0.63 | -0.54 |
| V4 | 0.82 | 0.83 | 0.81 | 1.00 | -0.55 | -0.26 | 0.17 | -0.74 | -0.79 | -0.65 |
| V5 | -0.50 | -0.44 | -0.48 | -0.55 | 1.00 | 0.74 | 0.18 | 0.29 | 0.34 | 0.12 |
| V6 | -0.19 | -0.07 | -0.23 | -0.26 | 0.74 | 1.00 | 0.42 | 0.13 | 0.05 | -0.24 |
| V7 | 0.21 | 0.25 | 0.19 | 0.17 | 0.18 | 0.42 | 1.00 | -0.30 | -0.18 | -0.41 |
| V8 | -0.65 | -0.78 | -0.56 | -0.74 | 0.29 | 0.13 | -0.30 | 1.00 | 0.89 | 0.73 |
| V9 | -0.64 | -0.75 | -0.63 | -0.79 | 0.34 | 0.05 | -0.18 | 0.89 | 1.00 | 0.78 |
| V10 | -0.50 | -0.70 | -0.54 | -0.65 | 0.12 | -0.24 | -0.41 | 0.73 | 0.78 | 1.00 |
“Cálculo de R(X) a travéz de R base”
cor(matriz_x) %>%
kable(caption ="Calculo de R(X) a travez de R base",
align ="c",
digits =2) %>%
row_spec(0,background = "#C1F0C0" ) %>%
kable_material(html_font = "sans-serif") %>%
kable_styling(bootstrap_options = c("striped", "hover"))| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | |
|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 1.00 | 0.87 | 0.82 | 0.82 | -0.50 | -0.19 | 0.21 | -0.65 | -0.64 | -0.50 |
| V2 | 0.87 | 1.00 | 0.73 | 0.83 | -0.44 | -0.07 | 0.25 | -0.78 | -0.75 | -0.70 |
| V3 | 0.82 | 0.73 | 1.00 | 0.81 | -0.48 | -0.23 | 0.19 | -0.56 | -0.63 | -0.54 |
| V4 | 0.82 | 0.83 | 0.81 | 1.00 | -0.55 | -0.26 | 0.17 | -0.74 | -0.79 | -0.65 |
| V5 | -0.50 | -0.44 | -0.48 | -0.55 | 1.00 | 0.74 | 0.18 | 0.29 | 0.34 | 0.12 |
| V6 | -0.19 | -0.07 | -0.23 | -0.26 | 0.74 | 1.00 | 0.42 | 0.13 | 0.05 | -0.24 |
| V7 | 0.21 | 0.25 | 0.19 | 0.17 | 0.18 | 0.42 | 1.00 | -0.30 | -0.18 | -0.41 |
| V8 | -0.65 | -0.78 | -0.56 | -0.74 | 0.29 | 0.13 | -0.30 | 1.00 | 0.89 | 0.73 |
| V9 | -0.64 | -0.75 | -0.63 | -0.79 | 0.34 | 0.05 | -0.18 | 0.89 | 1.00 | 0.78 |
| V10 | -0.50 | -0.70 | -0.54 | -0.65 | 0.12 | -0.24 | -0.41 | 0.73 | 0.78 | 1.00 |
Para conocer los componentes esenciales que se deben retener, debemos realizar una descomposición, tambien llamada Extracción de componentes.
##
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
ava_ave<- eigen(matriz_R$r)
t(ava_ave$values) %>%
kable(caption ="Autovalores de R(X)",
align ="c",
digits =2) %>%
kable_classic_2(html_font = "sans-serif") %>%
kable_styling(bootstrap_options = c("striped", "hover"))| 5.7 | 2.07 | 0.72 | 0.55 | 0.32 | 0.27 | 0.15 | 0.13 | 0.07 | 0.03 |
options(scipen =99999)
Componentes_p <- princomp(x = matriz_x, cor = TRUE, fix_sign = FALSE )
factoextra::get_eig(Componentes_p)%>%
kable(caption ="Resumen de los componentes principales",
align ="c",
digits =2) %>%
kable_classic_2(html_font = "sans-serif") %>%
kable_styling(bootstrap_options = c("hover"))| eigenvalue | variance.percent | cumulative.variance.percent | |
|---|---|---|---|
| Dim.1 | 5.70 | 57.01 | 57.01 |
| Dim.2 | 2.07 | 20.69 | 77.70 |
| Dim.3 | 0.72 | 7.20 | 84.91 |
| Dim.4 | 0.55 | 5.48 | 90.39 |
| Dim.5 | 0.32 | 3.16 | 93.54 |
| Dim.6 | 0.27 | 2.71 | 96.25 |
| Dim.7 | 0.15 | 1.46 | 97.72 |
| Dim.8 | 0.13 | 1.28 | 99.00 |
| Dim.9 | 0.07 | 0.68 | 99.68 |
| Dim.10 | 0.03 | 0.32 | 100.00 |
En base a la información mostrada en la tabla “Resumen de los componentes principales”, considerando el criterio de los 3/4. R// Se deberian retener 2 dimenciones, porque la informacion acumulada en ellas es de 77.70%, y segun este criterio contiene la suficiente información sobre el fenomeno. Esto quiere decir que la empresa especializada en el diseño de automóviles de turismo deberia considerar las caracteristicas precio y financiación como las caracteristicas con mayor importancia para la investigación.
fviz_eig(Componentes_p,
choice = "eigenvalue",
barcolor = "purple4",
barfill = "purple4",
addlabels = TRUE,)+ labs(title = "Grafico de sedimentacion",
subtitle = "Usando princomp, con autovalores")+
xlab(label = "Componentes")+
ylab(label = "Autovalores")+geom_hline(yintercept = 1)Con esta representación gráfica de los autovalores, considerando el criterio de raiz latente. Tambien indica que solo se deberian retener 2 dimenciones. Ya que solo los valores de la dimención 1 y dimención 2 son mayores a 1. Esto confirma nuestra respuesta anterior donde indicamos que solo se deberian retener 2 dimenciones como las mas importantes para la investigación.
fviz_eig(Componentes_p,
choice = "variance",
barcolor = "orange",
barfill = "orange",
addlabels = TRUE,)+ labs(title = "Grafico de sedimentacion",
subtitle = "Usando princomp, con %varianza")+
xlab(label = "Componentes")+
ylab(label = "%varianza")+geom_hline(yintercept = 1)