The data used are the anonymized microdata of the 2018 National Population and Housing Census (Censo Nacional de Población y Vivienda) conducted by DANE.
The Population and Housing Census is the largest and most important statistical operation carried out in any country. It is the backbone of the national statistical information system. Because of its universal coverage, the information it yields supports the planning and formulation of public policy. It is also the tool that allows the population, its households and its dwellings to be characterized, as an input for land-use planning and for monitoring, evaluating and setting new targets for the country's commitments, among them the Sustainable Development Goals (SDGs), the Montevideo Consensus (CDM) and the commitments to the Organisation for Economic Co-operation and Development (OECD) (DANE, 2018). The purpose of the National Population and Housing Census, hereafter CNPV 2018, is to count the population resident in the national territory and to obtain sociodemographic information for planning, management and public-policy decision-making at the national, territorial and local levels.
The information is split into 5 files: Persons (825,364 observations), Dwellings (239,595 observations), Households (227,428 observations), Deceased (6,646 observations) and a georeferencing frame.
Location and variable dictionary: http://microdatos.dane.gov.co/index.php/catalog/643/data_dictionary
First, the data are reviewed and prepared; then some descriptive statistics are obtained.
library(readr)
library(magrittr)
library(tidyverse)
library(kableExtra)
library(MVN)
library(psych)
library(polycor)
library(ggcorrplot)
library(FactoMineR) # provides PCA(), which is called in the PCA step
personas <- read_csv("CENSO/CNPV2018_5PER_A2_44.CSV",
locale = locale(decimal_mark = ",", grouping_mark = "."))
hogares <- read_csv("CENSO/CNPV2018_2HOG_A2_44.csv")
viviendas<- read_csv("CENSO/CNPV2018_1viv_A2_44.csv")
fallecidos<- read_csv("CENSO/CNPV2018_3FALL_A2_44.csv")
head(personas) # First rows
## # A tibble: 6 x 48
## TIPO_REG U_DPTO U_MPIO UA_CLASE COD_ENCUESTAS U_VIVIENDA P_NROHOG P_NRO_PER
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 5 44 001 1 333492 2 1 1
## 2 5 44 001 1 333492 2 1 2
## 3 5 44 001 1 333492 2 1 3
## 4 5 44 001 1 333492 2 1 4
## 5 5 44 001 1 333492 2 1 5
## 6 5 44 001 1 333493 1 1 1
## # ... with 40 more variables: P_SEXO <dbl>, P_EDADR <dbl>, P_PARENTESCOR <dbl>,
## # PA1_GRP_ETNIC <dbl>, PA11_COD_ETNIA <dbl>, PA12_CLAN <dbl>,
## # PA21_COD_VITSA <lgl>, PA22_COD_KUMPA <lgl>, PA_HABLA_LENG <dbl>,
## # PA1_ENTIENDE <dbl>, PB_OTRAS_LENG <dbl>, PB1_QOTRAS_LENG <dbl>,
## # PA_LUG_NAC <dbl>, PA_VIVIA_5ANOS <dbl>, PA_VIVIA_1ANO <dbl>,
## # P_ENFERMO <dbl>, P_QUEHIZO_PPAL <dbl>, PA_LO_ATENDIERON <dbl>,
## # PA1_CALIDAD_SERV <dbl>, CONDICION_FISICA <dbl>, P_ALFABETA <dbl>,
## # PA_ASISTENCIA <dbl>, P_NIVEL_ANOSR <dbl>, P_TRABAJO <dbl>,
## # P_EST_CIVIL <dbl>, PA_HNV <dbl>, PA1_THNV <dbl>, PA2_HNVH <dbl>,
## # PA3_HNVM <dbl>, PA_HNVS <dbl>, PA1_THSV <dbl>, PA2_HSVH <dbl>,
## # PA3_HSVM <dbl>, PA_HFC <dbl>, PA1_THFC <dbl>, PA2_HFCH <dbl>,
## # PA3_HFCM <dbl>, PA_UHNV <dbl>, PA1_MES_UHNV <dbl>, PA2_ANO_UHNV <dbl>
dim(personas)
## [1] 825364 48
head(hogares)
## # A tibble: 6 x 13
## TIPO_REG U_DPTO U_MPIO UA_CLASE COD_ENCUESTAS U_VIVIENDA H_NROHOG
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 2 44 001 1 333492 2 1
## 2 2 44 001 1 333493 1 1
## 3 2 44 001 1 333494 11 1
## 4 2 44 001 1 333495 1 1
## 5 2 44 001 1 333496 1 1
## 6 2 44 001 1 333497 1 1
## # ... with 6 more variables: H_NRO_CUARTOS <dbl>, H_NRO_DORMIT <dbl>,
## # H_DONDE_PREPALIM <dbl>, H_AGUA_COCIN <dbl>, HA_NRO_FALL <dbl>,
## # HA_TOT_PER <dbl>
dim(hogares)
## [1] 227428 13
head(viviendas)
## # A tibble: 6 x 30
## TIPO_REG U_DPTO U_MPIO UA_CLASE U_EDIFICA COD_ENCUESTAS U_VIVIENDA UVA_ESTATER
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <lgl>
## 1 1 44 001 1 1 351875 1 NA
## 2 1 44 001 1 1 351908 3 NA
## 3 1 44 001 1 1 352022 1 NA
## 4 1 44 001 1 1 352981 2 NA
## 5 1 44 001 1 1 379761 1 NA
## 6 1 44 001 1 1 382524 1 NA
## # ... with 22 more variables: UVA1_TIPOTER <lgl>, UVA2_CODTER <lgl>,
## # UVA_ESTA_AREAPROT <dbl>, UVA1_COD_AREAPROT <dbl>, UVA_USO_UNIDAD <dbl>,
## # V_TIPO_VIV <dbl>, V_CON_OCUP <dbl>, V_TOT_HOG <dbl>, V_MAT_PARED <dbl>,
## # V_MAT_PISO <dbl>, VA_EE <dbl>, VA1_ESTRATO <dbl>, VB_ACU <dbl>,
## # VC_ALC <dbl>, VD_GAS <dbl>, VE_RECBAS <dbl>, VE1_QSEM <dbl>,
## # VF_INTERNET <dbl>, V_TIPO_SERSA <dbl>, L_TIPO_INST <lgl>,
## # L_EXISTEHOG <lgl>, L_TOT_PERL <lgl>
dim(viviendas)
## [1] 239595 30
head(fallecidos)
## # A tibble: 6 x 11
## TIPO_REG U_DPTO U_MPIO UA_CLASE COD_ENCUESTAS U_VIVIENDA F_NROHOG FA1_NRO_FALL
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 3 44 001 1 333495 1 1 1
## 2 3 44 001 1 352564 1 1 1
## 3 3 44 001 1 353037 1 1 1
## 4 3 44 001 1 444248 1 1 1
## 5 3 44 001 1 452116 2 2 1
## 6 3 44 001 1 452116 2 3 1
## # ... with 3 more variables: FA2_SEXO_FALL <dbl>, FA3_EDAD_FALL <dbl>,
## # FA4_CERT_DEFUN <dbl>
dim(fallecidos)
## [1] 6646 11
var_na_v<-sapply(viviendas, function(x) (sum(is.na(x))/length(x)*100))
var_na_v[var_na_v>0] # Percentage of NAs (only variables with NAs are listed)
## UVA_ESTATER UVA1_TIPOTER UVA2_CODTER UVA1_COD_AREAPROT
## 68.39124356 68.40084309 100.00000000 97.40937833
## V_TIPO_VIV V_CON_OCUP V_TOT_HOG V_MAT_PARED
## 0.02545963 0.02545963 10.32074960 10.32074960
## V_MAT_PISO VA_EE VA1_ESTRATO VB_ACU
## 10.32074960 10.32074960 45.60195330 10.32074960
## VC_ALC VD_GAS VE_RECBAS VE1_QSEM
## 10.32074960 10.32074960 10.32074960 58.61683257
## VF_INTERNET V_TIPO_SERSA L_TIPO_INST L_EXISTEHOG
## 10.32074960 10.32074960 99.99958263 99.99916526
## L_TOT_PERL
## 99.99499155
var_na_h<-sapply(hogares, function(x) (sum(is.na(x))/length(x)*100))
var_na_h[var_na_h>0] # Percentage of NAs (only variables with NAs are listed)
## H_NROHOG H_NRO_CUARTOS H_NRO_DORMIT H_DONDE_PREPALIM
## 0.02682168 0.02682168 0.02682168 0.02682168
## H_AGUA_COCIN HA_NRO_FALL HA_TOT_PER
## 3.25949311 97.45413933 0.02682168
var_na_p<-sapply(personas, function(x) (sum(is.na(x))/length(x)*100))
var_na_p[var_na_p>0] # Percentage of NAs (only variables with NAs are listed)
## P_NROHOG P_PARENTESCOR PA11_COD_ETNIA PA12_CLAN
## 0.8461721 0.8461721 52.1807348 54.2398263
## PA21_COD_VITSA PA22_COD_KUMPA PA_HABLA_LENG PA1_ENTIENDE
## 100.0000000 99.9996365 52.7322490 94.2744050
## PB_OTRAS_LENG PB1_QOTRAS_LENG PA_VIVIA_5ANOS PA_VIVIA_1ANO
## 52.7322490 90.5154574 0.8461721 0.8461721
## P_ENFERMO P_QUEHIZO_PPAL PA_LO_ATENDIERON PA1_CALIDAD_SERV
## 0.8461721 94.8024144 96.2363272 96.3291348
## CONDICION_FISICA P_ALFABETA PA_ASISTENCIA P_NIVEL_ANOSR
## 0.8461721 11.8086081 12.6519935 11.8086081
## P_TRABAJO P_EST_CIVIL PA_HNV PA1_THNV
## 24.5512283 23.7812650 60.9538337 78.5832675
## PA2_HNVH PA3_HNVM PA_HNVS PA1_THSV
## 78.5832675 78.5832675 78.5832675 79.5018925
## PA2_HSVH PA3_HSVM PA_HFC PA1_THFC
## 79.5018925 79.5018925 78.5832675 93.4972933
## PA2_HFCH PA3_HFCM PA_UHNV PA1_MES_UHNV
## 93.4972933 93.4972933 78.5832675 85.7176955
## PA2_ANO_UHNV
## 85.7176955
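The three blocks above apply the same computation to each table. A minimal sketch of that computation as a reusable helper, run on an invented toy data frame; the name `pct_na` and the toy data are illustrative assumptions, not part of the original analysis:

```r
# Hypothetical helper: percentage of missing values per column,
# keeping only the columns that have at least one NA.
pct_na <- function(df) {
  p <- colMeans(is.na(df)) * 100
  p[p > 0]
}

# Toy data standing in for a census table (values are invented)
toy <- data.frame(
  a = c(1, NA, 3, 4),
  b = c("x", "y", "z", "w"),
  c = c(NA, NA, 1, 2)
)
pct_na(toy)
```

Calling the helper on `viviendas`, `hogares` and `personas` should give the same figures as the `sapply()` calls, since both divide the NA count by the number of rows.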
Because the variables are categorical, no outliers are found.
summary(viviendas)
## TIPO_REG U_DPTO U_MPIO UA_CLASE U_EDIFICA
## Min. :1 Min. :44 Length:239595 Min. :1.000 Min. : 1.00
## 1st Qu.:1 1st Qu.:44 Class :character 1st Qu.:1.000 1st Qu.: 2.00
## Median :1 Median :44 Mode :character Median :1.000 Median : 6.00
## Mean :1 Mean :44 Mean :1.856 Mean : 12.77
## 3rd Qu.:1 3rd Qu.:44 3rd Qu.:3.000 3rd Qu.: 13.00
## Max. :1 Max. :44 Max. :3.000 Max. :620.00
##
## COD_ENCUESTAS U_VIVIENDA UVA_ESTATER UVA1_TIPOTER
## Min. : 333492 Min. : 1.00 Mode:logical Mode:logical
## 1st Qu.: 5984571 1st Qu.: 1.00 TRUE:75733 TRUE:75710
## Median : 11079259 Median : 2.00 NA's:163862 NA's:163885
## Mean : 63801081 Mean : 23.03
## 3rd Qu.: 15009520 3rd Qu.: 7.00
## Max. :950003460 Max. :990.00
##
## UVA2_CODTER UVA_ESTA_AREAPROT UVA1_COD_AREAPROT UVA_USO_UNIDAD
## Mode:logical Min. :1.000 Min. :1106 Min. :1.000
## NA's:239595 1st Qu.:2.000 1st Qu.:1113 1st Qu.:1.000
## Median :2.000 Median :1113 Median :1.000
## Mean :1.974 Mean :2695 Mean :1.022
## 3rd Qu.:2.000 3rd Qu.:5061 3rd Qu.:1.000
## Max. :2.000 Max. :5061 Max. :4.000
## NA's :233388
## V_TIPO_VIV V_CON_OCUP V_TOT_HOG V_MAT_PARED
## Min. :1.000 Min. :1.000 Min. : 1.000 Min. :1.000
## 1st Qu.:1.000 1st Qu.:1.000 1st Qu.: 1.000 1st Qu.:1.000
## Median :1.000 Median :1.000 Median : 1.000 Median :1.000
## Mean :2.159 Mean :1.263 Mean : 1.058 Mean :3.257
## 3rd Qu.:4.000 3rd Qu.:1.000 3rd Qu.: 1.000 3rd Qu.:5.000
## Max. :6.000 Max. :4.000 Max. :11.000 Max. :9.000
## NA's :61 NA's :61 NA's :24728 NA's :24728
## V_MAT_PISO VA_EE VA1_ESTRATO VB_ACU
## Min. :1.000 Min. :1.000 Min. :0.0 Min. :1.000
## 1st Qu.:4.000 1st Qu.:1.000 1st Qu.:1.0 1st Qu.:1.000
## Median :4.000 Median :1.000 Median :1.0 Median :2.000
## Mean :4.485 Mean :1.393 Mean :1.3 Mean :1.534
## 3rd Qu.:6.000 3rd Qu.:2.000 3rd Qu.:2.0 3rd Qu.:2.000
## Max. :6.000 Max. :2.000 Max. :9.0 Max. :2.000
## NA's :24728 NA's :24728 NA's :109260 NA's :24728
## VC_ALC VD_GAS VE_RECBAS VE1_QSEM
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.0
## 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:2.0
## Median :2.000 Median :2.000 Median :2.000 Median :3.0
## Mean :1.581 Mean :1.633 Mean :1.539 Mean :2.5
## 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:3.0
## Max. :2.000 Max. :9.000 Max. :2.000 Max. :9.0
## NA's :24728 NA's :24728 NA's :24728 NA's :140443
## VF_INTERNET V_TIPO_SERSA L_TIPO_INST L_EXISTEHOG L_TOT_PERL
## Min. :1.00 Min. :1.000 Mode:logical Mode:logical Mode:logical
## 1st Qu.:2.00 1st Qu.:1.000 TRUE:1 TRUE:2 TRUE:12
## Median :2.00 Median :2.000 NA's:239594 NA's:239593 NA's:239583
## Mean :1.93 Mean :3.287
## 3rd Qu.:2.00 3rd Qu.:6.000
## Max. :9.00 Max. :9.000
## NA's :24728 NA's :24728
Because the variables are categorical, no outliers are found. The only numeric variables are the number of persons in the household, the number of rooms and the number of bedrooms.
summary(hogares)
## TIPO_REG U_DPTO U_MPIO UA_CLASE COD_ENCUESTAS
## 2:227428 44:227428 001 :50750 1:114085 10705779: 11
## 430 :44260 2: 20116 14230976: 11
## 847 :41173 3: 93227 15197146: 11
## 560 :20015 16167288: 11
## 650 :13380 15457182: 10
## 279 :11308 15845585: 10
## (Other):46542 (Other) :227364
## U_VIVIENDA H_NROHOG H_NRO_CUARTOS H_NRO_DORMIT
## 1 :109451 Min. : 1.000 Min. : 1.0 Min. : 1.000
## 2 : 26986 1st Qu.: 1.000 1st Qu.: 1.0 1st Qu.: 1.000
## 3 : 12323 Median : 1.000 Median : 2.0 Median : 1.000
## 4 : 7443 Mean : 1.077 Mean : 2.3 Mean : 1.694
## 5 : 5269 3rd Qu.: 1.000 3rd Qu.: 3.0 3rd Qu.: 2.000
## 6 : 4101 Max. :11.000 Max. :19.0 Max. :17.000
## (Other): 61855 NA's :61 NA's :61 NA's :61
## H_DONDE_PREPALIM H_AGUA_COCIN HA_NRO_FALL HA_TOT_PER
## 1 :112161 1 :89414 1 : 5263 Min. : 1.000
## 5 : 68323 5 :51814 2 : 366 1st Qu.: 2.000
## 3 : 13484 9 :26186 3 : 91 Median : 3.000
## 2 : 13039 3 :14192 4 : 36 Mean : 3.599
## 4 : 10886 4 :13826 5 : 14 3rd Qu.: 5.000
## (Other): 9474 (Other):24583 (Other): 20 Max. :32.000
## NA's : 61 NA's : 7413 NA's :221638 NA's :61
As in the households data set, most of the variables are categorical, so no outliers are found in them. The numeric variables are those related to the total number of persons in the household and the total number of children.
summary(personas)
## TIPO_REG U_DPTO U_MPIO UA_CLASE COD_ENCUESTAS
## 5:825364 44:825364 001 :177573 1:391901 750000695: 1182
## 847 :160711 2: 72681 750001039: 553
## 430 :159223 3:360782 750000643: 491
## 560 : 74528 750000267: 462
## 650 : 46077 750000863: 404
## 279 : 40852 750000701: 335
## (Other):166400 (Other) :821937
## U_VIVIENDA P_NROHOG P_NRO_PER P_SEXO
## 1 :389124 Min. : 1.000 Min. : 1.000 1:404215
## 2 : 93055 1st Qu.: 1.000 1st Qu.: 1.000 2:421149
## 3 : 43064 Median : 1.000 Median : 3.000
## 4 : 26635 Mean : 1.069 Mean : 4.963
## 5 : 19234 3rd Qu.: 1.000 3rd Qu.: 4.000
## 6 : 15074 Max. :11.000 Max. :1182.000
## (Other):239178 NA's :6984
## P_EDADR P_PARENTESCOR PA1_GRP_ETNIC PA11_COD_ETNIA PA12_CLAN
## 2 : 98818 1 :227367 1:394683 720 :371130 2 : 78117
## 1 : 97464 2 :109903 2: 29 50 : 12855 3 : 61190
## 3 : 90547 3 :388984 3: 108 370 : 6558 13 : 59637
## 4 : 87425 4 : 86108 4: 111 40 : 2031 8 : 51686
## 5 : 77426 5 : 6018 5: 60256 800 : 1307 19 : 27787
## 6 : 66200 NA's: 6984 6:360151 (Other): 802 (Other): 99271
## (Other):307484 9: 10026 NA's :430681 NA's :447676
## PA21_COD_VITSA PA22_COD_KUMPA PA_HABLA_LENG PA1_ENTIENDE PB_OTRAS_LENG
## NA's:825364 TRUE: 3 1 :342187 1 : 17876 1 : 78282
## NA's:825361 2 : 47257 2 : 29193 2 :309992
## 9 : 687 9 : 188 9 : 1857
## NA's:435233 NA's:778107 NA's:435233
##
##
##
## PB1_QOTRAS_LENG PA_LUG_NAC PA_VIVIA_5ANOS PA_VIVIA_1ANO P_ENFERMO
## 1 : 40967 1:590755 1 : 95464 1 : 17784 1 : 42899
## 2 : 36032 2:180094 2 :645047 2 :768321 2 :767765
## 99 : 1013 3: 45739 3 : 25134 3 : 7656 9 : 7716
## 3 : 88 9: 8776 4 : 41803 4 : 15626 NA's: 6984
## 4 : 36 9 : 10932 9 : 8993
## (Other): 146 NA's: 6984 NA's: 6984
## NA's :747082
## P_QUEHIZO_PPAL PA_LO_ATENDIERON PA1_CALIDAD_SERV CONDICION_FISICA
## 1 : 31064 1 : 30298 1 : 4060 1 : 25419
## 7 : 3870 2 : 733 2 : 22128 2 :792961
## 8 : 2346 9 : 33 3 : 3264 NA's: 6984
## 2 : 1955 NA's:794300 4 : 846
## 9 : 1699 NA's:795066
## (Other): 1965
## NA's :782465
## P_ALFABETA PA_ASISTENCIA P_NIVEL_ANOSR P_TRABAJO P_EST_CIVIL
## 1 :600457 1 :260971 2 :240323 1 :181844 7 :266902
## 2 :116909 2 :450005 3 :127660 7 :152618 1 :238804
## 9 : 10534 9 : 9963 4 :119176 6 :141236 2 : 48929
## NA's: 97464 NA's:104425 10 :101501 9 : 52333 4 : 40467
## 8 : 45254 4 : 48180 9 : 14545
## (Other): 93986 (Other): 46516 (Other): 19435
## NA's : 97464 NA's :202637 NA's :196282
## PA_HNV PA1_THNV PA2_HNVH PA3_HNVM PA_HNVS
## 1 :176766 Min. : 1.0 Min. : 1.0 Min. : 1.0 1 :169184
## 2 :133744 1st Qu.: 2.0 1st Qu.: 2.0 1st Qu.: 2.0 2 : 7301
## 9 : 11763 Median : 3.0 Median : 2.0 Median : 2.0 9 : 281
## NA's:503091 Mean : 3.4 Mean : 2.8 Mean : 2.7 NA's:648598
## 3rd Qu.: 4.0 3rd Qu.: 3.0 3rd Qu.: 3.0
## Max. :25.0 Max. :18.0 Max. :19.0
## NA's :648598 NA's :648598 NA's :648598
## PA1_THSV PA2_HSVH PA3_HSVM PA_HFC
## Min. : 1.0 Min. : 1.0 1 : 60717 1 : 53671
## 1st Qu.: 3.0 1st Qu.: 2.0 2 : 36672 2 :121203
## Median : 4.0 Median : 2.0 0 : 34627 9 : 1892
## Mean : 4.5 Mean : 2.8 3 : 18585 NA's:648598
## 3rd Qu.: 5.0 3rd Qu.: 3.0 4 : 8549
## Max. :22.0 Max. :14.0 (Other): 10034
## NA's :656180 NA's :656180 NA's :656180
## PA1_THFC PA2_HFCH PA3_HFCM PA_UHNV
## 0 : 42654 0 : 45613 0 : 45860 1 :117881
## 1 : 4162 1 : 4426 1 : 4356 9 : 58885
## 2 : 2652 2 : 1964 2 : 1909 NA's:648598
## 3 : 1771 3 : 878 3 : 766
## 4 : 947 4 : 331 4 : 325
## (Other): 1485 (Other): 459 (Other): 455
## NA's :771693 NA's :771693 NA's :771693
## PA1_MES_UHNV PA2_ANO_UHNV
## 10 : 11068 2017 : 12300
## 11 : 10770 2016 : 10377
## 9 : 10707 2015 : 8248
## 12 : 10614 2014 : 6702
## 8 : 9937 2013 : 5786
## (Other): 64785 (Other): 74468
## NA's :707483 NA's :707483
As in the households data set, most of the variables are categorical, so no outliers are found in them. The numeric variables are those related to the total number of persons in the household and the total number of children.
summary(fallecidos)
## TIPO_REG U_DPTO U_MPIO UA_CLASE COD_ENCUESTAS
## Min. :3 44:6646 Length:6646 Min. :1.000 Min. : 333495
## 1st Qu.:3 Class :character 1st Qu.:1.000 1st Qu.: 8250271
## Median :3 Mode :character Median :3.000 Median : 12409480
## Mean :3 Mean :2.348 Mean : 44895627
## 3rd Qu.:3 3rd Qu.:3.000 3rd Qu.: 14646787
## Max. :3 Max. :3.000 Max. :902429324
## U_VIVIENDA F_NROHOG FA1_NRO_FALL FA2_SEXO_FALL
## Min. : 1.00 Min. : 1.000 Min. : 1.000 Min. :1.00
## 1st Qu.: 1.00 1st Qu.: 1.000 1st Qu.: 1.000 1st Qu.:1.00
## Median : 2.00 Median : 1.000 Median : 1.000 Median :1.00
## Mean : 32.51 Mean : 1.093 Mean : 1.244 Mean :1.45
## 3rd Qu.: 13.00 3rd Qu.: 1.000 3rd Qu.: 1.000 3rd Qu.:2.00
## Max. :866.00 Max. :10.000 Max. :15.000 Max. :9.00
## FA3_EDAD_FALL FA4_CERT_DEFUN
## Min. : 0.00 1:2169
## 1st Qu.: 4.00 2:3443
## Median : 42.00 9:1034
## Mean : 42.14
## 3rd Qu.: 70.00
## Max. :999.00
The variables selected (so far) are the following:
- Dwelling materials: (piso_in) proportion of dwellings with inadequate floors; (pared_in) proportion of dwellings with inadequate walls.
- Access to public utilities: (sin_elec) proportion of dwellings without access to electricity; (sin_gas) proportion of dwellings without access to natural gas service; (sin_alc) proportion of dwellings without sewerage; (sin_basu) proportion of dwellings without waste collection; (sin_acu) proportion of dwellings without piped water (aqueduct).
- Connectivity: (sin_int) proportion of dwellings without internet access.
- Rurality: (v_rural) proportion of dwellings in rural areas.
- Number of persons: (T_hog) average number of persons per household.
########################### INDICATOR CALCULATION ##############################
################################ MARGINALIZATION ################################
##### Poverty traps #####
# Teenage pregnancy (up to age 19, P_EDADR <= 4)
e_adol<- personas %>% group_by(U_MPIO)%>%
  dplyr::summarise(embar_a=(sum(P_EDADR<=4& P_SEXO==2&PA_HNV==1,na.rm = T))/sum(P_EDADR<=4& P_SEXO==2))
# Illiteracy
analf<-personas %>% group_by(U_MPIO) %>%
  dplyr::summarise(analfa=(sum(P_ALFABETA==2,na.rm = T)/sum(P_EDADR>=2)))
# Young people who neither study nor work
NINI<-personas %>% group_by(U_MPIO) %>%
  dplyr::summarise(ninis=(sum(P_EDADR>=4&P_EDADR<7&(P_TRABAJO==4|P_TRABAJO==7),na.rm = T)/sum(P_EDADR>=4&P_EDADR<7)))
### Housing conditions ####
# Keep only dwellings in residential use
viviendas<-filter(viviendas,UVA_USO_UNIDAD==1|UVA_USO_UNIDAD==2)
# Dwelling materials
# Inadequate floors
pisos<- viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(piso_in=(sum(V_MAT_PISO==6,na.rm = T))/n())
# Inadequate walls
paredes<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(pared_in=(sum(V_MAT_PARED==7|V_MAT_PARED==8,na.rm = T))/n())
# Access to public utilities
# No access to electricity
elec<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_elec=(sum(VA_EE==2,na.rm = T))/n())
# No access to natural gas
gasn<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_gas=(sum(VD_GAS==2,na.rm = T))/n())
# No access to sewerage
alc<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_alc=(sum(VC_ALC==2,na.rm = T))/n())
# No access to waste collection
desec<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_basu=(sum(VE_RECBAS==2,na.rm = T))/n())
# No access to piped water (aqueduct)
acued<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_acu=(sum(VB_ACU==2,na.rm = T))/n())
# Rurality
rural<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(v_rural=(sum(UA_CLASE==2|UA_CLASE==3|UA_CLASE==4,na.rm = T))/n())
# Number of persons per household
hogar<-hogares %>% group_by(U_MPIO)%>%
  dplyr::summarise(T_hog=(sum(HA_TOT_PER,na.rm = T))/n())
#### Vehicles of social mobility ####
# Young people not attending school (P_EDADR groups 3 and 4; ages 10 to 19 if P_EDADR is the 5-year age group)
no_estu<- personas %>% group_by(U_MPIO)%>%
  dplyr::summarise(no_estu=(sum(P_EDADR<=4&P_EDADR>2&PA_ASISTENCIA==2,na.rm = T))/sum(P_EDADR<=4&P_EDADR>2))
# People over 15 without upper secondary education
educacion<- personas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_educm=(sum(P_EDADR>=4&P_NIVEL_ANOSR<4,na.rm = T))/sum(P_EDADR>=4))
# Dwellings without internet access
internet<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(sin_int=(sum(VF_INTERNET==2,na.rm = T))/n())
####### HEALTH ######
# Persons who were ill and did not receive care
atencion<-personas %>% group_by(U_MPIO)%>%
  dplyr::summarise(aten_salud=(sum(P_ENFERMO==1&PA_LO_ATENDIERON==2,na.rm = T))/sum(P_ENFERMO==1,na.rm = T))
# Persons with some physical disability
discapacidad<-personas %>% group_by(U_MPIO)%>%
  dplyr::summarise(disc=(sum(CONDICION_FISICA==1,na.rm = T))/n())
# Proportion of deaths of children under 5 (the filter uses FA3_EDAD_FALL <= 5)
fallec<- fallecidos %>% group_by(U_MPIO)%>%
  dplyr::summarise(fall_men=(sum(FA3_EDAD_FALL<=5,na.rm = T))/n())
#### Ethnic affiliation ####
# Proportion of persons who belong to an ethnic group
etnia<-personas %>% group_by(U_MPIO) %>%
  dplyr::summarise(per_etnia=(sum(PA1_GRP_ETNIC<6,na.rm = T)/n()))
# Proportion of persons who speak native languages
etnia_leng<-personas %>% group_by(U_MPIO) %>%
  dplyr::summarise(leng_etnia=(sum(PA_HABLA_LENG==1,na.rm = T)/sum(PA1_GRP_ETNIC<6)))
# Proportion of ethnic dwellings
viv_etnia<-viviendas %>% group_by(U_MPIO)%>%
  dplyr::summarise(viv_etnia=(sum(UVA_ESTATER==1,na.rm = T))/n())
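Every indicator above is a ratio of two counts per municipality, so the result depends on how missing values enter the numerator and the denominator. The following base R sketch illustrates this on invented rows; the column names are borrowed from the census files, but the values and the two-municipality layout are assumptions made only for illustration:

```r
# Toy person-level records (invented values)
toy <- data.frame(
  U_MPIO           = c("001", "001", "001", "002", "002"),
  P_ENFERMO        = c(1, 1, 2, 1, NA),   # 1 = was ill
  PA_LO_ATENDIERON = c(2, 1, NA, 2, NA)   # 2 = did not receive care
)

# Numerator: ill and not attended; NA conditions are dropped by na.rm
num <- tapply(toy$P_ENFERMO == 1 & toy$PA_LO_ATENDIERON == 2,
              toy$U_MPIO, sum, na.rm = TRUE)

# Denominator: everyone who reported being ill, with the same NA handling
den <- tapply(toy$P_ENFERMO == 1, toy$U_MPIO, sum, na.rm = TRUE)

aten_salud <- num / den
aten_salud
```

Using `n()` (all rows in the group) as the denominator instead would count records where the question does not apply, which lowers the ratio; which denominator is appropriate depends on the population each indicator is meant to describe.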
For each municipality, a total of 21 indicators is obtained:
##### COMBINE ALL VARIABLES #####
base_in<-cbind(e_adol,analf[,2],NINI[,2],pisos[,2],paredes[,2],elec[,2],gasn[,2],alc[,2],desec[,2],acued[,2],rural[,2],hogar[,2],no_estu[,2],educacion[,2],internet[,2],atencion[,2],discapacidad[,2],fallec[,2],etnia[,2],etnia_leng[,2],viv_etnia[,2])
# Data set to be used
base_in %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%", height = "400px")
| U_MPIO | embar_a | analfa | ninis | piso_in | pared_in | sin_elec | sin_gas | sin_alc | sin_basu | sin_acu | v_rural | T_hog | no_estu | sin_educm | sin_int | aten_salud | disc | fall_men | per_etnia | leng_etnia | viv_etnia |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 001 | 0.0444579 | 0.1082951 | 0.2631219 | 0.2423565 | 0.0757550 | 0.1678684 | 0.3587092 | 0.4067222 | 0.2929965 | 0.2809979 | 0.2920835 | 3.484335 | 0.2213295 | 0.3534969 | 0.6808079 | 0.0195488 | 0.0325218 | 0.1877058 | 0.4279310 | 0.5555015 | 0.1118439 |
| 035 | 0.0405340 | 0.1052853 | 0.3059067 | 0.3340153 | 0.0589198 | 0.3342881 | 0.6591653 | 0.4191217 | 0.4742226 | 0.3518822 | 0.4058920 | 3.435546 | 0.1892840 | 0.3781426 | 0.8359247 | 0.0122117 | 0.0331849 | 0.2142857 | 0.4060134 | 0.7842384 | 0.2336334 |
| 078 | 0.0468404 | 0.0726198 | 0.2892999 | 0.2662947 | 0.1093435 | 0.1004393 | 0.3736199 | 0.2676006 | 0.3217381 | 0.3044046 | 0.4237208 | 3.668337 | 0.1906198 | 0.3875774 | 0.7922355 | 0.0158730 | 0.0293881 | 0.1964286 | 0.5500718 | 0.4628757 | 0.1409237 |
| 090 | 0.0489282 | 0.2713166 | 0.3075681 | 0.3757962 | 0.1046257 | 0.3060457 | 0.5455779 | 0.6468623 | 0.5005743 | 0.4282134 | 0.8638405 | 3.930551 | 0.3732006 | 0.3480785 | 0.8418085 | 0.0243682 | 0.0564427 | 0.2786885 | 0.3941319 | 0.6868078 | 0.2242874 |
| 098 | 0.0350211 | 0.0914826 | 0.2564178 | 0.2312746 | 0.0210250 | 0.0522996 | 0.3760841 | 0.2927727 | 0.3056505 | 0.2402102 | 0.5208936 | 3.441290 | 0.1950627 | 0.3364497 | 0.7582129 | 0.0012987 | 0.0449137 | 0.0312500 | 0.2735043 | 0.3872549 | 0.1516426 |
| 110 | 0.0363153 | 0.0778053 | 0.3243243 | 0.1053065 | 0.0012341 | 0.1168244 | 0.2422871 | 0.1846977 | 0.2328260 | 0.1690662 | 0.2081448 | 3.556180 | 0.1996753 | 0.3673743 | 0.7379679 | 0.0000000 | 0.0893293 | 0.0714286 | 0.1173345 | 0.0832313 | 0.0000000 |
| 279 | 0.0436368 | 0.0691039 | 0.2755715 | 0.0979427 | 0.0365470 | 0.0427592 | 0.1943526 | 0.1753126 | 0.1922549 | 0.1827350 | 0.1749092 | 3.611337 | 0.2267846 | 0.3452624 | 0.7449778 | 0.0064915 | 0.0355919 | 0.0409357 | 0.1041809 | 0.1217105 | 0.0033885 |
| 378 | 0.0483956 | 0.0727215 | 0.2765675 | 0.3028786 | 0.0244949 | 0.1503665 | 0.4793492 | 0.3393528 | 0.3302342 | 0.3393528 | 0.3161094 | 3.370589 | 0.2436627 | 0.3671346 | 0.8042196 | 0.0118959 | 0.0316320 | 0.1511628 | 0.4717632 | 0.5695094 | 0.2460218 |
| 420 | 0.0500000 | 0.1086632 | 0.3125000 | 0.1885057 | 0.0241379 | 0.0919540 | 0.3735632 | 0.2517241 | 0.3494253 | 0.2034483 | 0.4275862 | 3.290970 | 0.2000000 | 0.4401751 | 0.8321839 | 0.0187166 | 0.1043360 | 0.0000000 | 0.1355014 | 0.0300000 | 0.0000000 |
| 430 | 0.0352408 | 0.1211946 | 0.3251288 | 0.3370441 | 0.1500166 | 0.2829193 | 0.5192116 | 0.4903007 | 0.4019066 | 0.5122747 | 0.3176871 | 3.594239 | 0.2616792 | 0.4537651 | 0.7901886 | 0.0210464 | 0.0274583 | 0.3323529 | 0.4651652 | 0.8062648 | 0.1842029 |
| 560 | 0.0370243 | 0.2675432 | 0.3212249 | 0.7749878 | 0.3279319 | 0.6981509 | 0.8858881 | 0.8775669 | 0.8454501 | 0.8907056 | 0.8430657 | 3.654859 | 0.1975575 | 0.4461012 | 0.9168856 | 0.0198646 | 0.0093119 | 0.2441628 | 0.9361851 | 0.9081150 | 0.7911436 |
| 650 | 0.0391787 | 0.0782240 | 0.2797546 | 0.1121114 | 0.0139249 | 0.1755829 | 0.3330959 | 0.3630829 | 0.3907383 | 0.2337435 | 0.3674223 | 3.419656 | 0.2089923 | 0.3724574 | 0.7506477 | 0.0163218 | 0.0858129 | 0.1073446 | 0.4471428 | 0.0836286 | 0.0605570 |
| 847 | 0.0238349 | 0.2993056 | 0.4104763 | 0.8071380 | 0.2932331 | 0.8681801 | 0.9025930 | 0.8960382 | 0.8930981 | 0.8992674 | 0.8916763 | 3.841498 | 0.3503425 | 0.4833075 | 0.9104251 | 0.0199813 | 0.0107460 | 0.3666882 | 0.9648313 | 0.9201788 | 0.8821814 |
| 855 | 0.0342950 | 0.0671954 | 0.2826415 | 0.1436407 | 0.0085704 | 0.1021598 | 0.2272883 | 0.1621529 | 0.6732945 | 0.1391841 | 0.1631814 | 3.391145 | 0.1879230 | 0.3185794 | 0.8056222 | 0.0120275 | 0.0501079 | 0.0285714 | 0.0885468 | 0.0841639 | 0.0000000 |
| 874 | 0.0376182 | 0.0749967 | 0.3135647 | 0.0841081 | 0.0139454 | 0.0517141 | 0.1658919 | 0.1140325 | 0.1398896 | 0.0772807 | 0.0797501 | 3.267373 | 0.1793376 | 0.3552896 | 0.7838466 | 0.0081177 | 0.0452472 | 0.0878378 | 0.0325652 | 0.1990172 | 0.0000000 |
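The table above was assembled with `cbind()`, which pastes columns side by side and therefore relies on every indicator table having the same municipalities in the same order. A key-based merge does not depend on that. The sketch below uses two invented indicator tables to show the idea; it is one possible alternative, not the method used in the original code:

```r
# Toy indicator tables; the second lacks municipality "035" and is unordered
a <- data.frame(U_MPIO = c("001", "035", "078"),
                piso_in = c(0.24, 0.33, 0.27))
b <- data.frame(U_MPIO = c("078", "001"),
                fall_men = c(0.20, 0.19))

# Rows are aligned on U_MPIO; a municipality missing from a table becomes NA
joined <- Reduce(function(x, y) merge(x, y, by = "U_MPIO", all = TRUE),
                 list(a, b))
joined
```

With `cbind()` the same two tables would either fail (different row counts) or silently pair values from different municipalities.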
bdf<-base_in[1:15,2:22]
mat_cor <- hetcor(bdf)$correlations # heterogeneous correlation matrix; with all-numeric columns hetcor() computes Pearson correlations
## Warning in hetcor.data.frame(bdf): the correlation matrix has been adjusted to
## make it positive-definite
ggcorrplot(mat_cor,type="lower",hc.order = T)
Bartlett's test
cortest.bartlett(mat_cor,n=NULL)->p_esf
## Warning in cortest.bartlett(mat_cor, n = NULL): n not specified, 100 used
p_esf$p
## [1] 0
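The p-value of the test allows rejecting the null hypothesis that the variables are uncorrelated with one another. As the warning shows, no sample size was passed, so the test assumed n = 100 rather than the 15 municipalities; the result should be read with that in mind.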
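To make the role of the sample size explicit, here is a base R sketch of Bartlett's sphericity statistic, chi-square = -((n - 1) - (2p + 5)/6) * log|R| with p(p - 1)/2 degrees of freedom, applied to simulated data. The function name `bartlett_sphericity` and the toy matrix are illustrative assumptions. Note that with 15 municipalities and 21 indicators the sample correlation matrix cannot have full rank, so the statistic is not well behaved for the actual data.

```r
# Hypothetical helper: Bartlett's test of sphericity from a correlation matrix
bartlett_sphericity <- function(R, n) {
  p     <- ncol(R)
  chisq <- -((n - 1) - (2 * p + 5) / 6) * log(det(R))
  df    <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# Simulated data: 200 rows, 4 variables, two of them made correlated
set.seed(1)
toy <- matrix(rnorm(200 * 4), ncol = 4)
toy[, 2] <- toy[, 1] + rnorm(200, sd = 0.5)

res <- bartlett_sphericity(cor(toy), n = nrow(toy))
res
```

The statistic scales with n, so running the test with n = 100 on 15 observations overstates the evidence against the null hypothesis.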
# Univariate plots
Result <- mvn(data=base_in[,2:22], mvnTest="royston", univariatePlot="box")
## Warning in uniPlot(data, type = univariatePlot): Box-Plots are based on
## standardized values (centered and scaled).
#Result2<- mvn(data=base_in[,2:22], mvnTest="royston", univariatePlot="histogram")
#result3<-mvn(data=base_in[,2:22], mvnTest = "royston", univariatePlot = "qqplot")
The Shapiro-Wilk test is run for each variable; the results show that 11 of the 21 variables are compatible with a normal distribution at the 5% level:
# univariate normality test
# Shapiro-Wilk test
result <- mvn(data = base_in[,2:22], univariateTest = "SW", desc = TRUE)
result$univariateNormality
## Test Variable Statistic p value Normality
## 1 Shapiro-Wilk embar_a 0.9386 0.3652 YES
## 2 Shapiro-Wilk analfa 0.6853 0.0002 NO
## 3 Shapiro-Wilk ninis 0.8491 0.0169 NO
## 4 Shapiro-Wilk piso_in 0.7927 0.0030 NO
## 5 Shapiro-Wilk pared_in 0.7579 0.0011 NO
## 6 Shapiro-Wilk sin_elec 0.7456 0.0008 NO
## 7 Shapiro-Wilk sin_gas 0.8953 0.0806 YES
## 8 Shapiro-Wilk sin_alc 0.8709 0.0348 NO
## 9 Shapiro-Wilk sin_basu 0.8850 0.0563 YES
## 10 Shapiro-Wilk sin_acu 0.8122 0.0053 NO
## 11 Shapiro-Wilk v_rural 0.8784 0.0450 NO
## 12 Shapiro-Wilk T_hog 0.9482 0.4973 YES
## 13 Shapiro-Wilk no_estu 0.7346 0.0006 NO
## 14 Shapiro-Wilk sin_educm 0.8908 0.0691 YES
## 15 Shapiro-Wilk sin_int 0.9616 0.7201 YES
## 16 Shapiro-Wilk aten_salud 0.9347 0.3202 YES
## 17 Shapiro-Wilk disc 0.8966 0.0845 YES
## 18 Shapiro-Wilk fall_men 0.9457 0.4598 YES
## 19 Shapiro-Wilk per_etnia 0.8885 0.0638 YES
## 20 Shapiro-Wilk leng_etnia 0.8933 0.0754 YES
## 21 Shapiro-Wilk viv_etnia 0.7103 0.0003 NO
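The same per-variable check can be sketched with base R's `shapiro.test()`, which makes the decision rule behind the YES/NO column visible (p-value above 0.05). The toy data frame and its column names are invented for illustration:

```r
# Toy data: 15 observations per variable, mimicking the 15 municipalities
set.seed(42)
toy <- data.frame(
  normal  = rnorm(15),
  skewed  = rexp(15)^2,
  uniform = runif(15)
)

# Run Shapiro-Wilk on each column and collect statistic and p-value
m <- t(sapply(toy, function(v) {
  test <- shapiro.test(v)
  c(statistic = unname(test$statistic), p.value = test$p.value)
}))

sw <- data.frame(variable = rownames(m), m,
                 normal_at_5pct = m[, "p.value"] > 0.05,
                 row.names = NULL)
sw
```

With only 15 observations the test has limited power, so a YES means normality was not rejected rather than that it was demonstrated.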
Mardia's test. Multivariate normality does not hold for the data set of 21 indicators.
The assumption of multivariate normality is violated. A robust estimator should be used to obtain the components; which one is still to be decided.
Ask Professor Martín.
############## TESTS ########################
# normality test
# Mardia's test in MVN
prueba1 <- mvn(data = base_in[,2:22], mvnTest = "mardia")
prueba1$multivariateNormality
## Test Statistic p value Result
## 1 Mardia Skewness 4628.15001992833 8.27116713126411e-254 NO
## 2 Mardia Kurtosis -0.28638447720968 0.774583662441416 YES
## 3 MVN <NA> <NA> NO
Even though the normality assumption is not met, the PCA was carried out as an exercise and to have working code in place.
Prepare the data set.
N_MUN<-c('Riohacha', 'Albania', 'Barrancas', 'Dibulla', 'Distracción', 'El Molino', 'Fonseca', 'Hatonuevo', 'La Jagua del Pilar', 'Maicao', 'Manaure', 'San Juan del Cesar', 'Uribia', 'Urumita', 'Villanueva')
base_in<-base_in %>% data.frame %>% set_rownames(N_MUN) # Row names = municipalities
base_in$N_MUN<-NULL # Drop the municipality-name column
x<-base_in[,2:22] # Select the numeric columns
The components are obtained; the first explains 68.7% of the variance.
## pca using R base facilities
pca1 <- prcomp(x, scale = TRUE)
res.pca <- PCA(x, graph = FALSE)
summary(pca1)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 3.7977 1.29572 1.14753 1.03169 0.90839 0.66751 0.63886
## Proportion of Variance 0.6868 0.07995 0.06271 0.05068 0.03929 0.02122 0.01944
## Cumulative Proportion 0.6868 0.76674 0.82945 0.88013 0.91943 0.94064 0.96008
## PC8 PC9 PC10 PC11 PC12 PC13 PC14
## Standard deviation 0.54813 0.41028 0.37550 0.32137 0.26890 0.18504 0.13684
## Proportion of Variance 0.01431 0.00802 0.00671 0.00492 0.00344 0.00163 0.00089
## Cumulative Proportion 0.97439 0.98240 0.98912 0.99403 0.99748 0.99911 1.00000
## PC15
## Standard deviation 3.403e-16
## Proportion of Variance 0.000e+00
## Cumulative Proportion 1.000e+00
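The "Proportion of Variance" row is each component's squared standard deviation divided by the sum of all squared standard deviations. The sketch below reproduces that calculation on simulated data with more variables than rows, which also shows why the last component above has a standard deviation of essentially zero: after centering, n rows can support at most n - 1 components with nonzero variance. The toy matrix is an assumption for illustration only.

```r
# Simulated data: 5 rows, 8 variables (more variables than observations)
set.seed(7)
toy <- matrix(rnorm(5 * 8), nrow = 5)

pca <- prcomp(toy, scale. = TRUE)

# Proportion and cumulative proportion of variance per component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cumsum(var_explained)

# Number of components with non-negligible variance
sum(pca$sdev > 1e-8)
```

For the 15 municipalities this means 14 informative components at most, regardless of the 21 indicators.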
1st component: the variables related to the inhabitants' housing conditions carry the most weight.
2nd component: the most weight falls on socioeconomic conditions related to poverty traps: teenage pregnancy and lack of health care.
3rd component: educational levels and ethnicity, and young people who neither study nor work.
pca1
## Standard deviations (1, .., p=15):
## [1] 3.797721e+00 1.295721e+00 1.147533e+00 1.031690e+00 9.083894e-01
## [6] 6.675062e-01 6.388604e-01 5.481258e-01 4.102803e-01 3.755043e-01
## [11] 3.213679e-01 2.688970e-01 1.850381e-01 1.368449e-01 3.402929e-16
##
## Rotation (n x k) = (21 x 15):
## PC1 PC2 PC3 PC4 PC5
## embar_a -0.1057460 -0.54078550 -0.058952662 -0.482314329 0.037741890
## analfa 0.2396843 -0.14129895 0.187231232 0.085200840 0.198312567
## ninis 0.1867735 0.20571658 0.407555565 0.210693795 -0.354182659
## piso_in 0.2575907 0.07896432 -0.059449242 -0.077161437 0.099948103
## pared_in 0.2493973 0.07514911 -0.086369431 -0.050676270 -0.044723139
## sin_elec 0.2544580 0.13414926 0.060742690 0.022203960 0.014476477
## sin_gas 0.2504166 0.04146080 -0.051068494 -0.167982144 0.063621864
## sin_alc 0.2559412 -0.07863813 -0.004621535 -0.032495706 0.083224032
## sin_basu 0.2191461 0.16234403 0.114319042 -0.082339620 0.305093596
## sin_acu 0.2585183 0.06117896 -0.055756810 -0.048716339 -0.001772725
## v_rural 0.2265844 -0.18324150 0.139924416 -0.056270723 0.360851727
## T_hog 0.1811530 -0.33026522 0.031484802 0.434703758 0.109932778
## no_estu 0.1618470 -0.41489711 0.216576454 0.408061564 -0.052724020
## sin_educm 0.1910144 0.15939797 0.227665933 -0.260583848 -0.515852784
## sin_int 0.2061226 0.12845499 0.232832366 -0.244489953 0.205466499
## aten_salud 0.1633163 -0.38397662 0.075901820 -0.341306294 -0.283637621
## disc -0.1562521 -0.09706058 0.620593329 -0.149086411 0.035384569
## fall_men 0.2215252 -0.17929429 -0.177145241 0.157152095 -0.386974180
## per_etnia 0.2392723 0.01748122 -0.189668697 -0.134033967 -0.016324866
## leng_etnia 0.2241782 -0.09029088 -0.353524028 0.003921204 -0.123535849
## viv_etnia 0.2512391 0.15238871 -0.071607908 -0.023632620 0.131015906
## PC6 PC7 PC8 PC9 PC10
## embar_a -0.298340856 0.25923803 0.152030850 0.02459516 -0.140933493
## analfa -0.008515441 0.01859478 0.005835512 0.27872511 0.500956697
## ninis -0.083452000 0.27661621 0.060726245 -0.04771364 0.009473976
## piso_in -0.086794440 0.03066747 0.006304001 0.13818934 -0.019042811
## pared_in -0.063542990 -0.12818448 0.432734700 0.12648833 0.274718149
## sin_elec 0.053650816 -0.06789086 -0.080957168 0.02846338 0.013443916
## sin_gas -0.121930801 0.05911879 -0.330504085 -0.19484271 -0.034378443
## sin_alc -0.001086608 -0.20750283 -0.173903920 0.09899477 0.163824587
## sin_basu 0.538644052 -0.05163337 0.086558188 -0.24362518 -0.226154745
## sin_acu -0.100424165 -0.11573956 0.056064057 0.08536731 -0.056189068
## v_rural -0.228444136 -0.11916930 -0.124896676 -0.05313595 0.100204009
## T_hog -0.192862488 -0.10124629 0.505091938 -0.42651726 -0.052173889
## no_estu 0.095112035 0.16048604 -0.247542979 0.41585657 -0.420785424
## sin_educm -0.281628287 -0.10957058 0.128042797 0.09673980 -0.093573382
## sin_int 0.049948272 0.63202238 0.125717612 -0.18838059 -0.082632556
## aten_salud 0.580534542 -0.13499023 0.134133646 0.09010096 0.107608616
## disc -0.123706547 -0.34347838 -0.286263205 -0.30409128 0.078591488
## fall_men 0.098685912 0.05103922 -0.172664963 -0.38777883 0.068232831
## per_etnia -0.108962553 -0.34998180 -0.018297384 -0.10534196 -0.488329831
## leng_etnia -0.059069146 0.21865301 -0.366187502 -0.20623909 0.265456654
## viv_etnia -0.129433379 -0.02390110 -0.028750124 0.24900614 -0.184392888
## PC11 PC12 PC13 PC14 PC15
## embar_a -0.366067085 0.20771597 -0.097552073 -0.09124767 0.083438670
## analfa -0.152729249 -0.05746618 0.029890016 -0.11592010 0.479225993
## ninis -0.374573369 -0.15581227 -0.349175125 -0.06376331 -0.256804106
## piso_in -0.027129597 0.08700597 -0.298171806 -0.37429903 0.054342308
## pared_in 0.014905026 0.01706911 0.125853381 -0.17097222 -0.221397183
## sin_elec -0.431169729 0.13856182 0.047884104 0.48559463 0.034888609
## sin_gas 0.007362767 0.25209013 -0.127608462 0.41953352 0.081445310
## sin_alc -0.080953524 0.12566913 0.329732966 0.13551531 -0.187518232
## sin_basu 0.021302789 0.34865246 -0.247940105 -0.24343716 0.187674128
## sin_acu 0.074064196 0.27341928 0.340515934 -0.19080199 -0.168549206
## v_rural 0.359186761 -0.43102433 -0.337880503 0.11907557 -0.011263787
## T_hog 0.044431089 0.16461591 -0.033784677 0.17118562 -0.076973999
## no_estu 0.180024716 0.12821943 0.091052280 -0.05515714 -0.048878193
## sin_educm 0.434514590 0.20929415 -0.073372317 0.09593910 0.294888281
## sin_int 0.179805249 -0.22772213 0.443187389 0.04446350 -0.060087016
## aten_salud 0.041852388 -0.19144821 -0.111957385 0.17193626 -0.234537108
## disc -0.100864390 0.08480817 0.174594622 -0.28914319 -0.169827543
## fall_men -0.107423491 -0.19840341 0.236084309 -0.20745212 0.451463575
## per_etnia -0.169649042 -0.42148836 0.067295798 -0.10549636 0.009253073
## leng_etnia 0.144856246 0.15802033 -0.171754866 -0.21536890 -0.381616885
## viv_etnia -0.212581187 -0.12546336 0.004434784 -0.08703323 -0.080139024
## Eigenvalues
res.pca$eig
## eigenvalue percentage of variance cumulative percentage of variance
## comp 1 14.42268836 68.67946840 68.67947
## comp 2 1.67889321 7.99472959 76.67420
## comp 3 1.31683201 6.27062862 82.94483
## comp 4 1.06438422 5.06849628 88.01332
## comp 5 0.82517127 3.92938700 91.94271
## comp 6 0.44556457 2.12173603 94.06445
## comp 7 0.40814257 1.94353605 96.00798
## comp 8 0.30044188 1.43067562 97.43866
## comp 9 0.16832989 0.80157089 98.24023
## comp 10 0.14100350 0.67144524 98.91167
## comp 11 0.10327731 0.49179671 99.40347
## comp 12 0.07230558 0.34431228 99.74778
## comp 13 0.03423909 0.16304329 99.91083
## comp 14 0.01872654 0.08917399 100.00000
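The two percentage columns in this table follow directly from the eigenvalues. As a minimal sketch, the vector below is copied from the output above, and the names `eig`, `pct` and `cum_pct` are illustrative rather than part of the original script. Each eigenvalue is divided by the total, which for standardized variables should equal the number of variables (21 here):

```r
# Eigenvalues copied from the res.pca$eig output above
eig <- c(14.42268836, 1.67889321, 1.31683201, 1.06438422, 0.82517127,
         0.44556457, 0.40814257, 0.30044188, 0.16832989, 0.14100350,
         0.10327731, 0.07230558, 0.03423909, 0.01872654)

# With standardized variables the eigenvalues add up to the number of variables
total <- sum(eig)

# Share of variance per component and its running total, in percent
pct <- 100 * eig / total
cum_pct <- cumsum(pct)

round(cbind(eigenvalue = eig, pct = pct, cum_pct = cum_pct), 3)
```

The same arithmetic applies to the `prcomp` object, where the eigenvalues are the squared standard deviations (`pca1$sdev^2`).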
## results by variable
res.pca$var
## $coord
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## embar_a -0.4015938 0.70070719 -0.067650126 0.497598861 -0.034284332
## analfa 0.9102540 0.18308403 0.214854019 -0.087900853 -0.180145030
## ninis 0.7093136 -0.26655131 0.467683465 -0.217370678 0.321735767
## piso_in 0.9782577 -0.10231574 -0.068219967 0.079606681 -0.090791796
## pared_in 0.9471416 -0.09737228 -0.099111773 0.052282200 0.040626025
## sin_elec 0.9663607 -0.17382002 0.069704242 -0.022907603 -0.013150278
## sin_gas 0.9510125 -0.05372164 -0.058602782 0.173305495 -0.057793426
## sin_alc 0.9719933 0.10189308 -0.005303364 0.033525494 -0.075599827
## sin_basu 0.8322558 -0.21035258 0.131184874 0.084948962 -0.277143784
## sin_acu 0.9817804 -0.07927087 -0.063982781 0.050260159 0.001610324
## v_rural 0.8605043 0.23742988 0.160567886 0.058053941 -0.327793878
## T_hog 0.6879686 0.42793162 0.036129850 -0.448479512 -0.099861768
## no_estu 0.6146497 0.53759094 0.248528630 -0.420993027 0.047893940
## sin_educm 0.7254196 -0.20653531 0.261254173 0.268841746 0.468595192
## sin_int 0.7827964 -0.16644185 0.267182826 0.252237836 -0.186643586
## aten_salud 0.6202300 0.49752662 0.087099844 0.352122284 0.257653403
## disc -0.5934020 0.12576344 0.712151331 0.153810956 -0.032142967
## fall_men 0.8412911 0.23231539 -0.203280011 -0.162132242 0.351523237
## per_etnia 0.9086895 -0.02265079 -0.217651090 0.138281501 0.014829335
## leng_etnia 0.8513664 0.11699180 -0.405680492 -0.004045467 0.112218654
## viv_etnia 0.9541362 -0.19745327 -0.082172439 0.024381537 -0.119013458
##
## $cor
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## embar_a -0.4015938 0.70070719 -0.067650126 0.497598861 -0.034284332
## analfa 0.9102540 0.18308403 0.214854019 -0.087900853 -0.180145030
## ninis 0.7093136 -0.26655131 0.467683465 -0.217370678 0.321735767
## piso_in 0.9782577 -0.10231574 -0.068219967 0.079606681 -0.090791796
## pared_in 0.9471416 -0.09737228 -0.099111773 0.052282200 0.040626025
## sin_elec 0.9663607 -0.17382002 0.069704242 -0.022907603 -0.013150278
## sin_gas 0.9510125 -0.05372164 -0.058602782 0.173305495 -0.057793426
## sin_alc 0.9719933 0.10189308 -0.005303364 0.033525494 -0.075599827
## sin_basu 0.8322558 -0.21035258 0.131184874 0.084948962 -0.277143784
## sin_acu 0.9817804 -0.07927087 -0.063982781 0.050260159 0.001610324
## v_rural 0.8605043 0.23742988 0.160567886 0.058053941 -0.327793878
## T_hog 0.6879686 0.42793162 0.036129850 -0.448479512 -0.099861768
## no_estu 0.6146497 0.53759094 0.248528630 -0.420993027 0.047893940
## sin_educm 0.7254196 -0.20653531 0.261254173 0.268841746 0.468595192
## sin_int 0.7827964 -0.16644185 0.267182826 0.252237836 -0.186643586
## aten_salud 0.6202300 0.49752662 0.087099844 0.352122284 0.257653403
## disc -0.5934020 0.12576344 0.712151331 0.153810956 -0.032142967
## fall_men 0.8412911 0.23231539 -0.203280011 -0.162132242 0.351523237
## per_etnia 0.9086895 -0.02265079 -0.217651090 0.138281501 0.014829335
## leng_etnia 0.8513664 0.11699180 -0.405680492 -0.004045467 0.112218654
## viv_etnia 0.9541362 -0.19745327 -0.082172439 0.024381537 -0.119013458
##
## $cos2
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## embar_a 0.1612776 0.4909905636 4.576540e-03 0.2476046266 1.175415e-03
## analfa 0.8285624 0.0335197610 4.616225e-02 0.0077265600 3.245223e-02
## ninis 0.5031258 0.0710496029 2.187278e-01 0.0472500115 1.035139e-01
## piso_in 0.9569882 0.0104685109 4.653964e-03 0.0063372237 8.243150e-03
## pared_in 0.8970773 0.0094813616 9.823144e-03 0.0027334284 1.650474e-03
## sin_elec 0.9338530 0.0302134009 4.858681e-03 0.0005247583 1.729298e-04
## sin_gas 0.9044247 0.0028860144 3.434286e-03 0.0300347947 3.340080e-03
## sin_alc 0.9447710 0.0103821999 2.812567e-05 0.0011239587 5.715334e-03
## sin_basu 0.6926497 0.0442482097 1.720947e-02 0.0072163261 7.680868e-02
## sin_acu 0.9638928 0.0062838703 4.093796e-03 0.0025260836 2.593144e-06
## v_rural 0.7404677 0.0563729480 2.578205e-02 0.0033702601 1.074488e-01
## T_hog 0.4733008 0.1831254715 1.305366e-03 0.2011338726 9.972373e-03
## no_estu 0.3777943 0.2890040208 6.176648e-02 0.1772351290 2.293829e-03
## sin_educm 0.5262336 0.0426568345 6.825374e-02 0.0722758842 2.195815e-01
## sin_int 0.6127702 0.0277028888 7.138666e-02 0.0636239257 3.483583e-02
## aten_salud 0.3846852 0.2475327356 7.586383e-03 0.1239901027 6.638528e-02
## disc 0.3521259 0.0158164433 5.071595e-01 0.0236578103 1.033170e-03
## fall_men 0.7077707 0.0539704418 4.132276e-02 0.0262868638 1.235686e-01
## per_etnia 0.8257167 0.0005130581 4.737200e-02 0.0191217736 2.199092e-04
## leng_etnia 0.7248248 0.0136870819 1.645767e-01 0.0000163658 1.259303e-02
## viv_etnia 0.9103760 0.0389877950 6.752310e-03 0.0005944594 1.416420e-02
##
## $contrib
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## embar_a 1.118221 29.24489535 0.347541641 23.262711169 1.424450e-01
## analfa 5.744854 1.99653919 3.505553407 0.725918318 3.932787e+00
## ninis 3.488433 4.23193103 16.610153875 4.439187533 1.254454e+01
## piso_in 6.635297 0.62353644 0.353421234 0.595388735 9.989623e-01
## pared_in 6.219903 0.56473881 0.745967860 0.256808429 2.000159e-01
## sin_elec 6.474889 1.79960230 0.368967437 0.049301585 2.095684e-02
## sin_gas 6.270847 0.17189982 0.260799104 2.821800077 4.047742e-01
## sin_alc 6.550589 0.61839549 0.002135859 0.105597089 6.926240e-01
## sin_basu 4.802501 2.63555831 1.306884332 0.677981309 9.308210e+00
## sin_acu 6.683170 0.37428648 0.310882192 0.237328166 3.142553e-04
## v_rural 5.134048 3.35774470 1.957884218 0.316639427 1.302140e+01
## T_hog 3.281641 10.90751156 0.099129278 18.896735696 1.208522e+00
## no_estu 2.619444 17.21396085 4.690536025 16.651423979 2.779822e-01
## sin_educm 3.648651 2.54077116 5.183177699 6.790394197 2.661041e+01
## sin_int 4.248654 1.65006854 5.421091087 5.977533730 4.221648e+00
## aten_salud 2.667223 14.74380464 0.576108625 11.648998599 8.045030e+00
## disc 2.441472 0.94207560 38.513607980 2.222675783 1.252068e-01
## fall_men 4.907343 3.21464411 3.138043630 2.469678086 1.497490e+01
## per_etnia 5.725123 0.03055931 3.597421447 1.796510436 2.665012e-02
## leng_etnia 5.025587 0.81524434 12.497923816 0.001537584 1.526111e+00
## viv_etnia 6.312110 2.32223197 0.512769255 0.055850072 1.716517e+00
## contribution of variables
res.pca$var$contrib
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5
## embar_a 1.118221 29.24489535 0.347541641 23.262711169 1.424450e-01
## analfa 5.744854 1.99653919 3.505553407 0.725918318 3.932787e+00
## ninis 3.488433 4.23193103 16.610153875 4.439187533 1.254454e+01
## piso_in 6.635297 0.62353644 0.353421234 0.595388735 9.989623e-01
## pared_in 6.219903 0.56473881 0.745967860 0.256808429 2.000159e-01
## sin_elec 6.474889 1.79960230 0.368967437 0.049301585 2.095684e-02
## sin_gas 6.270847 0.17189982 0.260799104 2.821800077 4.047742e-01
## sin_alc 6.550589 0.61839549 0.002135859 0.105597089 6.926240e-01
## sin_basu 4.802501 2.63555831 1.306884332 0.677981309 9.308210e+00
## sin_acu 6.683170 0.37428648 0.310882192 0.237328166 3.142553e-04
## v_rural 5.134048 3.35774470 1.957884218 0.316639427 1.302140e+01
## T_hog 3.281641 10.90751156 0.099129278 18.896735696 1.208522e+00
## no_estu 2.619444 17.21396085 4.690536025 16.651423979 2.779822e-01
## sin_educm 3.648651 2.54077116 5.183177699 6.790394197 2.661041e+01
## sin_int 4.248654 1.65006854 5.421091087 5.977533730 4.221648e+00
## aten_salud 2.667223 14.74380464 0.576108625 11.648998599 8.045030e+00
## disc 2.441472 0.94207560 38.513607980 2.222675783 1.252068e-01
## fall_men 4.907343 3.21464411 3.138043630 2.469678086 1.497490e+01
## per_etnia 5.725123 0.03055931 3.597421447 1.796510436 2.665012e-02
## leng_etnia 5.025587 0.81524434 12.497923816 0.001537584 1.526111e+00
## viv_etnia 6.312110 2.32223197 0.512769255 0.055850072 1.716517e+00
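The `$coord`, `$cos2` and `$contrib` tables are linked by simple arithmetic. `cos2` is the squared coordinate, and `contrib` is `cos2` expressed as a percentage of its column total. For `embar_a` on Dim.1, for example, 0.1612776 / 14.42268836 × 100 ≈ 1.118. The sketch below reproduces that chain with `prcomp`. It uses the built-in `USArrests` data as a stand-in, since the census indicator matrix is not reproduced here, and the object names are illustrative assumptions. Component signs are arbitrary, so coordinates may differ in sign between `prcomp` and `res.pca`.

```r
# Stand-in data: the built-in USArrests set, not the census indicators
pca <- prcomp(USArrests, scale. = TRUE)

# Variable coordinates: eigenvectors scaled by the component standard deviations
coord <- sweep(pca$rotation, 2, pca$sdev, "*")

# Quality of representation: squared coordinates
cos2 <- coord^2

# Contribution: cos2 as a percentage of the column total (the eigenvalue)
contrib <- 100 * sweep(cos2, 2, colSums(cos2), "/")

round(contrib, 2)
```

Each column of `contrib` sums to 100, which gives a quick check when reading the tables above.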
## screeplot
myscree(x, colline = cols[1], ylab = '', xlab = 'Number of components', main = '')
# label the y axis with lambda (the eigenvalues); side = 2 is the left margin
mtext(expression(lambda), side = 2, line = 2.8, las = 1)
Parallel analysis was run with two different commands; both suggest retaining a single component.
# Method 1
corr<-cor(bdf)
fa.parallel(corr,n.obs=15,fa="fa",fm="minres")
## In factor.scores, the correlation matrix is singular, an approximation is used
## In smc, smcs < 0 were set to .0
## In smc, smcs < 0 were set to .0
## Parallel analysis suggests that the number of factors = 1 and the number of components = NA
# Method 2
library(paran)
paran(x, cfa = FALSE, graph = TRUE)
##
## Using eigendecomposition of correlation matrix.
## Computing: 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
##
##
## Results of Horn's Parallel Analysis for component retention
## 630 iterations, using the mean estimate
##
## --------------------------------------------------
## Component Adjusted Unadjusted Estimated
## Eigenvalue Eigenvalue Bias
## --------------------------------------------------
## 1 11.420582 14.422688 3.002106
## --------------------------------------------------
##
## Adjusted eigenvalues > 1 indicate dimensions to retain.
## (1 components retained)
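The adjusted eigenvalue reported above is the observed eigenvalue minus the estimated bias (14.422688 − 3.002106 = 11.420582). A rough base-R sketch of the idea behind Horn's parallel analysis follows. It is not `paran`'s own implementation. It assumes the built-in `USArrests` data as a stand-in for the census indicators, an arbitrary seed, and illustrative object names. The iteration count follows the 30-per-variable rule, which is consistent with the 630 iterations reported for 21 variables.

```r
set.seed(2018)                       # arbitrary seed, for reproducibility only
X <- scale(USArrests)                # stand-in data, not the census indicators
n <- nrow(X)
p <- ncol(X)
iterations <- 30 * p                 # 30 x 21 = 630 in the output above

# Eigenvalues of the observed correlation matrix
observed <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from uncorrelated normal data of the same size (p x iterations matrix)
random <- replicate(iterations, {
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(Z), symmetric = TRUE, only.values = TRUE)$values
})

# Bias: how far the mean random eigenvalue sits above 1
bias <- rowMeans(random) - 1
adjusted <- observed - bias

# Retain leading components while the adjusted eigenvalue stays above 1
retained <- sum(cumprod(adjusted > 1))

data.frame(component = seq_len(p), adjusted, observed, bias)
retained
```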
## cumulative variance
lambdas <- pca1$sdev^2
perc.lambdas <- lambdas/sum(lambdas)
cvar <- 100*cumsum(perc.lambdas)
plot(cvar, type = 'l', col = cols[1], las = 1, ylab = 'Cumulative variance (%)')
points(cvar, type = 'p', pch = 16)
fviz_pca_ind(res.pca,
label = "all", # show individual labels
habillage = "none", # color by groups?
#groups = mygroup, # is there any groups?
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
addEllipses = TRUE # concentration ellipses?
) + ggtitle("PCA - Municipalities of La Guajira") #+ theme(legend.position = "none")
## visualize variables
fviz_pca_var(res.pca, col.var = cols[1])
## visualize biplot
fviz_pca_biplot(res.pca, label = c("ind", "var"),
addEllipses = TRUE,
col.ind = 'gray60',
col.var = cols[1],
ellipse.level = 0.95,
ggtheme = theme_minimal())
rot<-varimax(pca1$rotation)
rot
## $loadings
##
## Loadings:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9
## embar_a -0.965
## analfa 0.135
## ninis
## piso_in 0.337 -0.157 -0.147 -0.125 -0.145 -0.161
## pared_in -0.105 -0.209 0.121 0.119
## sin_elec 0.173
## sin_gas -0.178 -0.221
## sin_alc -0.137 0.174 0.114
## sin_basu 0.919
## sin_acu 0.119 -0.100
## v_rural
## T_hog 0.932
## no_estu 0.959
## sin_educm -0.945
## sin_int 0.976
## aten_salud 0.947
## disc 0.952
## fall_men 0.109 -0.124 0.112 0.186 -0.334
## per_etnia
## leng_etnia -0.850
## viv_etnia -0.127 0.111 -0.144 -0.192
## PC10 PC11 PC12 PC13 PC14 PC15
## embar_a
## analfa 0.168 -0.144 0.817
## ninis -0.918
## piso_in -0.164 -0.158 0.174 -0.123 0.132
## pared_in 0.588 -0.160
## sin_elec -0.152 0.117 0.674
## sin_gas -0.136 -0.153 0.608
## sin_alc 0.163 0.366 0.335
## sin_basu
## sin_acu 0.128 0.142 0.518
## v_rural -0.114 -0.842 -0.127 0.104
## T_hog
## no_estu
## sin_educm
## sin_int
## aten_salud
## disc
## fall_men -0.386 0.393 -0.277 0.503
## per_etnia -0.829 -0.114
## leng_etnia 0.100 0.102
## viv_etnia -0.303 -0.163 -0.100 0.238
##
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
## SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.048 0.048 0.048 0.048 0.048 0.048 0.048 0.048 0.048 0.048
## Cumulative Var 0.048 0.095 0.143 0.190 0.238 0.286 0.333 0.381 0.429 0.476
## PC11 PC12 PC13 PC14 PC15
## SS loadings 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.048 0.048 0.048 0.048 0.048
## Cumulative Var 0.524 0.571 0.619 0.667 0.714
##
## $rotmat
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.27370324 0.09547072 -0.16181671 0.17547216 -0.19383191 0.14058067
## [2,] 0.19536329 0.51576512 -0.12831289 -0.41427366 -0.14040311 -0.40996712
## [3,] 0.09598557 0.05678809 0.62150918 0.22109912 -0.20204345 0.07929606
## [4,] -0.11250637 0.50409955 -0.14451871 0.40558522 0.26650972 -0.32325161
## [5,] 0.33889845 -0.07681143 0.01758315 -0.02916530 0.56061585 -0.32096896
## [6,] 0.47640283 0.33605488 -0.09949013 0.08571702 0.29022734 0.62301382
## [7,] -0.02702007 -0.26956464 -0.36085389 0.15637661 0.08492053 -0.15218124
## [8,] 0.09241062 -0.16560999 -0.33900049 -0.29860318 -0.09433459 0.14979663
## [9,] -0.18157412 -0.07870995 -0.35205742 0.43340274 -0.01953829 0.04439972
## [10,] -0.24305690 0.14752897 0.07768436 -0.44941425 0.10969599 0.14010255
## [11,] 0.02671669 0.35898087 -0.08627652 0.17366240 -0.47307098 0.04031372
## [12,] 0.37465090 -0.22971265 0.09702799 0.15440866 -0.20445713 -0.21740951
## [13,] -0.35431888 0.15106650 0.23243547 0.11403399 0.09023901 -0.07110524
## [14,] -0.34361479 0.11307465 -0.26262308 -0.04818971 -0.07302238 0.19758249
## [15,] 0.19295413 -0.05834052 -0.15815508 -0.05700330 -0.35319320 -0.22367872
## [,7] [,8] [,9] [,10] [,11]
## [1,] 0.20201117 0.153934241 -0.283050899 -0.347683322 -0.20533409
## [2,] 0.10852484 -0.372244674 0.144288411 -0.006114547 -0.24849936
## [3,] 0.22339614 0.017849080 0.416412693 0.253539774 -0.41011338
## [4,] -0.23419788 0.458447085 -0.009822075 0.092737837 -0.21276304
## [5,] 0.17231720 0.028346572 0.216014680 0.089742158 0.32735331
## [6,] 0.06919440 -0.145909319 0.069207935 0.118683590 0.11018113
## [7,] 0.63801321 -0.098402051 -0.244114525 0.330130591 -0.28642422
## [8,] 0.09517924 0.498188725 0.463179731 0.082352427 -0.13146231
## [9,] -0.25041412 -0.540520392 0.338851193 0.154923702 -0.03728616
## [10,] -0.09445018 -0.036541308 -0.254503071 0.531255854 -0.01498029
## [11,] 0.19004615 0.065123590 -0.181296041 0.236842440 0.41757522
## [12,] -0.23748065 0.130818573 -0.130088155 0.499244011 0.17485515
## [13,] 0.47345577 0.005671214 0.150449161 -0.118699204 0.40902673
## [14,] 0.05661185 0.179548590 0.271688050 0.179650687 0.10304853
## [15,] -0.02346079 -0.012482199 0.272314376 -0.101743244 0.28199952
## [,12] [,13] [,14] [,15]
## [1,] -0.21253957 0.436881518 0.386382289 0.32964429
## [2,] 0.12359524 0.125848960 0.110829050 -0.20880031
## [3,] -0.17735789 -0.070497308 0.027664714 0.10956095
## [4,] 0.11801788 -0.091609977 -0.093893429 0.14899010
## [5,] -0.49875382 0.093234059 0.098018668 0.03671929
## [6,] 0.29552704 -0.157612316 -0.024512715 0.02474029
## [7,] 0.12168532 -0.234013708 -0.061008237 0.03020801
## [8,] 0.07514623 0.335746002 -0.334794124 -0.04795946
## [9,] -0.10606666 0.343349478 -0.068443071 0.13339950
## [10,] -0.11044880 0.186887073 0.004644235 0.52426729
## [11,] -0.38211367 0.003847001 -0.345980699 -0.19487626
## [12,] 0.36378525 0.255413966 0.303104288 -0.15761382
## [13,] 0.44490308 0.362025528 0.041462624 0.11165920
## [14,] -0.14114260 -0.231789759 0.695758102 -0.20880933
## [15,] 0.13402092 -0.419590974 0.027337357 0.62993873
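In the output above every rotated column has SS loadings of 1.000. That is what one would expect when `varimax` is applied to the full orthogonal eigenvector matrix, because an orthogonal rotation keeps each column at unit length. A common alternative is to rotate only the retained components, scaled by their standard deviations. The sketch below illustrates this under assumptions: `USArrests` stands in for the census indicators, `k = 2` is chosen purely for illustration, and the object names are hypothetical. With the single component suggested by the parallel analysis there would be nothing to rotate.

```r
pca <- prcomp(USArrests, scale. = TRUE)    # stand-in data, not the census indicators

# Rotating the full orthogonal eigenvector matrix keeps every column at unit length
full <- varimax(pca$rotation)
ss_full <- colSums(unclass(full$loadings)^2)

# Alternative: rotate only the first k components, scaled by their standard deviations
k <- 2                                     # illustrative choice, not taken from the analysis
loadings_k <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
rot_k <- varimax(loadings_k)
rotated <- unclass(rot_k$loadings)

# An orthogonal rotation leaves each variable's communality unchanged
communality_before <- rowSums(loadings_k^2)
communality_after <- rowSums(rotated^2)

round(rotated, 3)
```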