Student: Diego Gabriel Gornati

Replication of the report's analysis

In the conference paper “Análisis exploratorio multivariado. Combinación de procedimientos para clasificar departamentos de cuatro provincias argentinas según el impacto del desarrollo en las condiciones de vida”, the authors construct dimensions relating to the impact of economic development on living conditions (IDECV) and use them to divide the departments of the provinces under study into strata, which are meant to be internally coherent and as distinct from one another as possible. To that end, the paper follows an analytical path that can be summarized in the following points:

1- Factor analysis of 25 indicators, yielding 6 dimensions (factors)
2- Cluster analysis, building 4 strata (clusters)
3- Correlation analysis between the resulting strata and the NBI (unsatisfied basic needs) data, via analysis of variance

Accordingly, this work seeks to reproduce that analytical path, starting from the same data used in the paper and applying the same analytical tools. The work is organized in three stages, each replicating one of the points listed above.

First stage: Factor analysis of 25 indicators

# Load the required libraries
library(haven)
library(psych)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ ggplot2::%+%()   masks psych::%+%()
## ✖ ggplot2::alpha() masks psych::alpha()
## ✖ dplyr::filter()  masks stats::filter()
## ✖ dplyr::lag()     masks stats::lag()
library(ca)
library(cluster)
library(factoextra)
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(PerformanceAnalytics)
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Attaching package: 'xts'
## The following objects are masked from 'package:dplyr':
## 
##     first, last
## 
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
## 
##     legend
# Load the dataset with the paper's data
basejor3 <- read_sav("basejor3.sav")
#View(basejor3)

# Remove from the dataset the results obtained by the research group
base <- basejor3[,-(30:45),drop=FALSE]
base <- base[,-(1:2),drop=FALSE]
base <- base %>% remove_rownames %>% column_to_rownames(var="DEPTO")
base <- base[,-4,drop=FALSE]

Correlation analysis of the variables

To begin, a brief correlation analysis of the selected variables is carried out.

chart.Correlation(base, histogram = FALSE, pch = 19)

Feasibility analysis

To determine whether factoring the proposed variables is feasible, a Kaiser-Meyer-Olkin (KMO) test and Bartlett's test of sphericity are applied:

KMO(base)
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = base)
## Overall MSA =  0.73
## MSA for each item = 
##  PURBANA MIGINTPR MIGLIMIT VSINAGUA VSINELEC  HSINGAS CONDESAG  MORTINF 
##     0.78     0.63     0.74     0.83     0.79     0.84     0.81     0.56 
## MORTNEON NACVIV20 SINCOBSA ESCMEDIO  ANALFAB ASISHTPI PRPERHOG JEFMUJER 
##     0.57     0.73     0.74     0.79     0.75     0.83     0.64     0.68 
##  VIVPREC IRRTENVI TASACTTO TASACTMU TASDESTO TASDESMU ASALPUBL  CTAPROP 
##     0.75     0.77     0.65     0.71     0.73     0.78     0.68     0.57 
## PRECASAL 
##     0.67
cortest.bartlett(base)
## R was not square, finding R from data
## $chisq
## [1] 2049.968
## 
## $p.value
## [1] 8.694932e-258
## 
## $df
## [1] 300

From this we obtain an overall KMO value of 0.73 for this set of variables, which indicates that factoring them is appropriate (the value is comfortably above 0.6), although a few individual items (MORTINF, MORTNEON and CTAPROP) fall slightly below that mark. In addition, Bartlett's test yields a p-value of 8.694932e-258, which rejects the null hypothesis that the variables are uncorrelated.
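
As a rough illustration of where the Bartlett figures above come from, the following sketch computes the sphericity statistic by hand on simulated data. The helper name `bartlett_sphericity` and the simulated matrix `toy` are assumptions made for this example and are not part of the original analysis. The sketch uses the usual formula, chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom.

```r
# Bartlett's test of sphericity computed by hand (illustrative sketch).
# `bartlett_sphericity` is a hypothetical helper, not a psych function.
bartlett_sphericity <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)                      # number of observations
  p <- ncol(x)                      # number of variables
  R <- cor(x)                       # correlation matrix
  # log-determinant of R, computed in a numerically stable way
  log_det <- as.numeric(determinant(R, logarithm = TRUE)$modulus)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log_det
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

set.seed(1)
# Simulated data with the same shape as the report's matrix (71 cases, 25 variables)
toy <- matrix(rnorm(71 * 25), nrow = 71, ncol = 25)
res <- bartlett_sphericity(toy)
res$df       # degrees of freedom depend only on the number of variables
res$p.value
```

With 25 variables the degrees of freedom are 25 * 24 / 2 = 300, which is the `$df` value reported by `cortest.bartlett(base)` above.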

Principal component analysis

Next, a principal component analysis is run to determine the optimal number of factors for this model.

Factores <- prcomp(base)
summary(Factores)
## Importance of components:
##                            PC1     PC2     PC3      PC4     PC5      PC6
## Standard deviation     46.3845 23.4894 19.7017 16.05986 12.9826 10.14293
## Proportion of Variance  0.5553  0.1424  0.1002  0.06656  0.0435  0.02655
## Cumulative Proportion   0.5553  0.6977  0.7978  0.86441  0.9079  0.93446
##                            PC7     PC8     PC9    PC10    PC11    PC12    PC13
## Standard deviation     7.83166 6.37494 5.74480 5.10384 4.69287 4.04959 3.64176
## Proportion of Variance 0.01583 0.01049 0.00852 0.00672 0.00568 0.00423 0.00342
## Cumulative Proportion  0.95029 0.96078 0.96930 0.97602 0.98171 0.98594 0.98936
##                           PC14    PC15    PC16    PC17    PC18   PC19    PC20
## Standard deviation     3.21335 2.75650 2.46133 2.31921 1.90586 1.7614 1.59269
## Proportion of Variance 0.00266 0.00196 0.00156 0.00139 0.00094 0.0008 0.00065
## Cumulative Proportion  0.99203 0.99399 0.99555 0.99694 0.99788 0.9987 0.99933
##                           PC21    PC22    PC23    PC24    PC25
## Standard deviation     1.09021 0.76870 0.70858 0.53292 0.16710
## Proportion of Variance 0.00031 0.00015 0.00013 0.00007 0.00001
## Cumulative Proportion  0.99964 0.99979 0.99992 0.99999 1.00000
plot(Factores)

print(Factores$rotation)
##                    PC1          PC2          PC3          PC4          PC5
## PURBANA  -0.6989933258  0.065198502 -0.563945224  0.364274629 -0.067387674
## MIGINTPR -0.0095109945  0.064331278 -0.051322097 -0.037564791 -0.083890118
## MIGLIMIT -0.0098795014 -0.021327804  0.003867086 -0.031344725 -0.017494660
## VSINAGUA  0.2701279703 -0.394536125 -0.251520366 -0.092511153 -0.219117282
## VSINELEC  0.3120323253  0.142222547 -0.425427205  0.053350435 -0.217658062
## HSINGAS   0.4141296669  0.282873051 -0.299325233  0.175771685 -0.079600675
## CONDESAG -0.2293657605  0.073998167 -0.046119040 -0.609668426 -0.671672938
## MORTINF  -0.0219854656  0.153954382 -0.254086513 -0.407145644  0.445570932
## MORTNEON -0.0363653743  0.111461372 -0.156719879 -0.331489992  0.338976742
## NACVIV20  0.0244459711  0.018933192  0.007112322  0.140919037  0.036163432
## SINCOBSA  0.0656542451 -0.230891078 -0.323364965 -0.022723998  0.149597310
## ESCMEDIO -0.2249451768  0.162786870  0.146163634  0.003418360  0.055588555
## ANALFAB   0.0346121307 -0.051566744 -0.028539030  0.011074198  0.035358048
## ASISHTPI  0.1087817635 -0.197812179 -0.127453904  0.047899072  0.060322727
## PRPERHOG  0.0009011224 -0.002651774  0.003665993  0.008897510  0.002809708
## JEFMUJER  0.0196784628  0.196577213 -0.111551217 -0.008195263 -0.001070991
## VIVPREC   0.1501731895 -0.071751612 -0.202095011  0.066526653 -0.119033699
## IRRTENVI  0.0180294013 -0.189299890  0.053628820  0.104177162 -0.010512447
## TASACTTO  0.0387155733  0.057133037 -0.046553398 -0.149579479  0.126693368
## TASACTMU  0.0433580202  0.257971677 -0.130536286 -0.208573165  0.173342429
## TASDESTO -0.0311903822 -0.031529014 -0.010162251  0.023556715  0.009192653
## TASDESMU -0.0437269617 -0.081257640  0.002273952  0.049016894  0.005091989
## ASALPUBL  0.1104329803  0.480037846  0.120893073  0.245340729 -0.099540344
## CTAPROP   0.0043009401  0.015734682 -0.138442947 -0.087835282  0.050372270
## PRECASAL -0.0684580797 -0.428913780 -0.043722321 -0.017078273  0.131143665
##                   PC6          PC7          PC8           PC9         PC10
## PURBANA   0.067699079 -0.019808495  0.071507909 -0.0907530287 -0.044200861
## MIGINTPR -0.175638303 -0.242855285 -0.281700344 -0.3901387546 -0.330121900
## MIGLIMIT -0.027709183 -0.002480982  0.036382098 -0.0255248250 -0.033564781
## VSINAGUA  0.372969668 -0.372520293  0.141746347  0.1879021678 -0.196739880
## VSINELEC  0.158768164 -0.276538447 -0.045176916  0.0183544378 -0.030499221
## HSINGAS  -0.279026605  0.252209800 -0.370905803  0.0341199257 -0.054954946
## CONDESAG -0.134338230  0.135287573 -0.072473485 -0.0208434605  0.190543765
## MORTINF   0.307531654  0.022539805 -0.276687135 -0.1857398229  0.073401804
## MORTNEON  0.252573061  0.044544565 -0.085386968  0.0837872282  0.126667446
## NACVIV20 -0.060001480  0.109030166 -0.090718427 -0.1085049142  0.239740895
## SINCOBSA -0.396783640 -0.050406151  0.051151885  0.3035502120  0.261315272
## ESCMEDIO -0.014735074 -0.179553603 -0.150086859  0.6396550898 -0.073963958
## ANALFAB  -0.047677539 -0.004433362  0.095841645 -0.0357461048  0.159233849
## ASISHTPI -0.176581746 -0.147328116  0.166526700 -0.2285757344  0.583133776
## PRPERHOG  0.003389791  0.006383955 -0.003610032 -0.0072288331  0.003006073
## JEFMUJER -0.107144588  0.110174763 -0.001432467  0.1261884938 -0.056724928
## VIVPREC   0.237586992  0.632933963  0.181940775  0.0526262952 -0.095604582
## IRRTENVI  0.144427897  0.330805487 -0.001470735  0.0469829170 -0.112584415
## TASACTTO -0.182779495  0.022256165  0.402148019 -0.1164853953 -0.180493466
## TASACTMU -0.259541306 -0.013683035  0.527506184 -0.0097803203 -0.314456458
## TASDESTO  0.085139278  0.049100228 -0.003956820  0.0367235395  0.062974560
## TASDESMU  0.158226374  0.100375198 -0.020154519  0.0694886477  0.130923941
## ASALPUBL  0.137105647 -0.189178222  0.050082859  0.0009658379  0.162829277
## CTAPROP  -0.196627125  0.034978933 -0.077474920  0.3839018179  0.015628564
## PRECASAL -0.266820680  0.013514399 -0.336269955 -0.0732559635 -0.294182000
##                   PC11          PC12         PC13         PC14          PC15
## PURBANA   0.0569490581 -0.1127503797  0.052682795 -0.008909583  0.0472854340
## MIGINTPR -0.0245690302  0.3905290680 -0.037944647 -0.085660803  0.4882871855
## MIGLIMIT  0.0490679818  0.0431907278 -0.049846654  0.039117920  0.0446308589
## VSINAGUA  0.3056527298 -0.2275803058  0.168758710  0.212477074  0.2295640940
## VSINELEC -0.3477006713  0.1821688624 -0.173826690 -0.262190235 -0.4741333016
## HSINGAS   0.1435564530 -0.3001260060  0.324062395 -0.120271435  0.1030199208
## CONDESAG  0.1579701844 -0.0469216409 -0.035046245 -0.050927622 -0.0679203432
## MORTINF   0.1263090630 -0.0482969290 -0.108120048  0.116302852  0.0419330182
## MORTNEON  0.0557012879  0.0757559560  0.123407129 -0.203785577  0.0009745577
## NACVIV20  0.1614302323 -0.1261605915  0.051239677  0.410785609 -0.1438064802
## SINCOBSA  0.1643952952  0.2131976160 -0.416628740  0.103154240  0.0681030135
## ESCMEDIO  0.1825901450  0.3052614949  0.378323824 -0.195724679 -0.0273817303
## ANALFAB   0.0684888771 -0.0018876818  0.017256087 -0.162419468  0.0250828597
## ASISHTPI  0.0642129401  0.1382222302  0.376444123 -0.261056841  0.1191547438
## PRPERHOG  0.0080098779 -0.0047861685 -0.001351834 -0.009700674 -0.0119738384
## JEFMUJER  0.0009213703 -0.0891907882  0.101222787  0.218359838 -0.0614214828
## VIVPREC   0.0523800739  0.5779452739  0.078942312  0.171886752  0.0686798772
## IRRTENVI  0.2284700016 -0.2442714999 -0.284741140 -0.596928062  0.1656715491
## TASACTTO  0.1161285245 -0.0005614221  0.008808335 -0.153528547  0.0082962355
## TASACTMU  0.0834509254 -0.1200855641  0.063135258 -0.031422094 -0.1280328150
## TASDESTO -0.0738269821 -0.0375766168 -0.032669715 -0.118893289  0.0364118023
## TASDESMU -0.1207691402 -0.1218309706 -0.055741102 -0.116059740 -0.1060543659
## ASALPUBL  0.5464634748  0.1089941015 -0.394518619  0.008464938  0.0153494763
## CTAPROP  -0.3494810149 -0.1539091016 -0.275336357  0.077324951  0.3677344895
## PRECASAL  0.3214276806  0.0775114468 -0.053372617 -0.031083738 -0.4705170021
##                   PC16          PC17         PC18         PC19         PC20
## PURBANA  -0.0429630055  0.0428404736 -0.075254181  0.018581096 -0.017728296
## MIGINTPR -0.0146411573 -0.2483293094  0.269606221 -0.098836031 -0.072790436
## MIGLIMIT -0.0371374064  0.0200223959 -0.122306812 -0.029780336  0.055471929
## VSINAGUA -0.0009203602 -0.0431100124  0.021861594 -0.051836387 -0.033978792
## VSINELEC  0.0433235145 -0.2078347022 -0.025897426  0.100099543  0.084223340
## HSINGAS  -0.2220791467  0.1933226898 -0.045542471 -0.071836154  0.052385642
## CONDESAG -0.0166958182 -0.0201394041 -0.007591844 -0.022143607  0.014958263
## MORTINF   0.2200376861  0.0684930171 -0.042295590 -0.060520117  0.463519581
## MORTNEON -0.3046421491 -0.0930785021  0.009789738  0.136079648 -0.653389123
## NACVIV20 -0.1901465871 -0.7670949029  0.066776790  0.072939668  0.058514484
## SINCOBSA -0.2216570518  0.1795185096  0.342949691  0.139025915  0.073994487
## ESCMEDIO  0.0242411718 -0.1829346671  0.027919154 -0.030695511  0.275643355
## ANALFAB  -0.0389225539 -0.1085535182 -0.048918136  0.074910033  0.239943123
## ASISHTPI  0.3602620097 -0.0398111324 -0.126911852 -0.109592149 -0.070938679
## PRPERHOG -0.0483486146 -0.0004479885 -0.010682260 -0.008358756  0.021159423
## JEFMUJER  0.6405231250 -0.0137172896  0.435592512  0.122011965 -0.192837759
## VIVPREC   0.0541828431 -0.0127787455 -0.118218108 -0.092933387  0.008127284
## IRRTENVI  0.1891760354 -0.2600611037  0.136266716  0.263283440  0.050397130
## TASACTTO -0.1900736849 -0.1358329392 -0.150770457 -0.167653746  0.202950856
## TASACTMU  0.0589879621 -0.1068320530  0.102306576 -0.099145967 -0.065475200
## TASDESTO -0.0296193451  0.0023401978  0.138156307 -0.250754655  0.133948854
## TASDESMU -0.0898764878 -0.0551314034  0.410411139 -0.760282788 -0.053981169
## ASALPUBL  0.1083300643  0.0581370279 -0.171360394 -0.204191471 -0.178184037
## CTAPROP   0.2090719047 -0.2608052876 -0.479298654 -0.218717858 -0.140690774
## PRECASAL  0.1770386794 -0.0110323111 -0.240746769 -0.209666265 -0.180906193
##                  PC21          PC22        PC23         PC24          PC25
## PURBANA   0.005867431 -1.134156e-02  0.01052665 -0.016360926  0.0039225525
## MIGINTPR  0.023734462  5.735597e-02  0.01397200 -0.016996552 -0.0050102223
## MIGLIMIT  0.016635215 -1.088048e-01 -0.94589928 -0.236390571 -0.0237520362
## VSINAGUA  0.035417127  4.894796e-03  0.01876785  0.012653685 -0.0066176364
## VSINELEC -0.024375712 -5.844209e-02 -0.03832411 -0.004185618 -0.0038393029
## HSINGAS  -0.015633839  2.647788e-05 -0.02007926 -0.008110555  0.0163439958
## CONDESAG -0.018137238  1.377398e-02  0.02679561  0.017872484 -0.0013495251
## MORTINF  -0.117406180  1.092222e-02  0.03453052 -0.040442921  0.0049437038
## MORTNEON  0.150241903 -3.514028e-02 -0.05856874  0.029629536 -0.0071627275
## NACVIV20 -0.083038226 -1.123695e-02 -0.03703353  0.067789489  0.0071991089
## SINCOBSA -0.073019752 -3.374638e-02  0.01282489  0.012054035  0.0008806107
## ESCMEDIO -0.083475362 -2.302513e-02 -0.01061314 -0.027308717  0.0012621973
## ANALFAB   0.739885113  4.615685e-01  0.04354053 -0.282455904  0.0530475163
## ASISHTPI -0.197143539 -3.558348e-02 -0.02427383  0.032973846 -0.0181902295
## PRPERHOG -0.006085499  6.464059e-02  0.03576878 -0.058940979 -0.9937305184
## JEFMUJER  0.337696280 -2.477064e-01 -0.06481174 -0.010786893 -0.0636753452
## VIVPREC  -0.017055811  7.483505e-02  0.04288729 -0.003027711  0.0047012672
## IRRTENVI -0.176835792 -3.469862e-02 -0.03010542 -0.029674620 -0.0013913671
## TASACTTO  0.256126715 -6.637706e-01  0.14277676  0.042476601 -0.0260492805
## TASACTMU -0.285435938  4.772898e-01 -0.06049177  0.023762085  0.0214632885
## TASDESTO  0.223993852  1.322432e-01 -0.23778468  0.855453320 -0.0460004226
## TASDESMU -0.031126476 -2.602407e-02  0.02466595 -0.332808568  0.0282442501
## ASALPUBL  0.043883152  1.523261e-02  0.03529296  0.035527265 -0.0007608199
## CTAPROP   0.030731738  3.632677e-02  0.04858538 -0.009356010 -0.0125694114
## PRECASAL  0.090174160  5.654873e-02  0.01561857  0.058058327  0.0015789943

The preceding analysis suggests that between 5 and 7 factors are enough to structure this model: the first five components accumulate 90.8% of the variance, the first six 93.4%, and the first seven 95.0%. The authors' choice of 6 factors to continue the analysis therefore appears reasonable.
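
One caveat worth keeping in mind: `prcomp(base)` is called with its default `scale. = FALSE`, so the components above are derived from the covariance matrix and variables with larger variance weigh more. The sketch below, run on the built-in `mtcars` data rather than on the report's data, shows how the same decision could be checked on standardized variables, using the eigenvalues of the correlation matrix and the Kaiser criterion (eigenvalue > 1). It is only an illustration of the technique, not the procedure followed in the paper.

```r
# Illustration on the built-in mtcars data (NOT the report's data).
pca_raw    <- prcomp(mtcars)                  # covariance-based, like prcomp(base)
pca_scaled <- prcomp(mtcars, scale. = TRUE)   # correlation-based

eigenvalues <- pca_scaled$sdev^2              # eigenvalues of the correlation matrix
kaiser_k    <- sum(eigenvalues > 1)           # Kaiser criterion: eigenvalue > 1
cum_var     <- cumsum(eigenvalues) / sum(eigenvalues)

# Proportion of variance of the first three unscaled components
round(summary(pca_raw)$importance[2, 1:3], 3)
# Cumulative proportion of variance of the first three scaled components
round(cum_var[1:3], 3)
kaiser_k
```

Comparing the two fits shows how much the variance split depends on the measurement scale of the variables, which matters when the indicators are not all expressed in the same units.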

Factorization

Following the authors' approach, a six-factor solution is fitted and the factor loadings of this model are extracted:

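## Factors
# Correlation matrix of the indicators (kept for inspection; fa() below receives the raw data)
cor.base <- cor(base,use="complete.obs")
# The output below reports 71 observations, so n.obs=20 appears to be ignored when raw data are supplied
ff <- fa(base,nfactors=6,n.obs=20, use="complete.obs",rotate="none",fm="pa")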
## Warning in fa.stats(r = r, f = f, phi = phi, n.obs = n.obs, np.obs = np.obs, :
## The estimated weights for the factor scores are probably incorrect. Try a
## different factor score estimation method.
# Print the sorted factor loadings whose absolute value exceeds 0.50
print(ff, cut=.50, sort=TRUE)
## Factor Analysis using method =  pa
## Call: fa(r = base, nfactors = 6, n.obs = 20, rotate = "none", fm = "pa", 
##     use = "complete.obs")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          item   PA1   PA2   PA3   PA4   PA5   PA6   h2    u2 com
## HSINGAS     6  0.94                               0.96 0.039 1.2
## VSINELEC    5  0.89                               0.95 0.054 1.4
## ESCMEDIO   12 -0.82                               0.87 0.133 1.6
## PURBANA     1 -0.76                               0.78 0.218 1.7
## TASDESTO   21 -0.70                               0.86 0.140 2.4
## VIVPREC    17  0.66                               0.64 0.356 2.0
## VSINAGUA    4  0.64  0.53                         0.90 0.096 3.0
## ANALFAB    13  0.63  0.52                         0.79 0.210 2.7
## TASDESMU   22 -0.62  0.53                         0.89 0.111 2.9
## ASISHTPI   14  0.61  0.55                         0.79 0.214 2.6
## CONDESAG    7 -0.57                               0.66 0.340 3.2
## TASACTTO   19                                     0.81 0.195 4.7
## IRRTENVI   18        0.79                         0.68 0.323 1.2
## JEFMUJER   16       -0.77                         0.88 0.121 2.0
## TASACTMU   20       -0.77                         0.90 0.104 2.1
## PRECASAL   25        0.67                         0.83 0.170 2.6
## PRPERHOG   15                                     0.53 0.473 3.1
## MIGINTPR    2                                     0.35 0.654 2.8
## SINCOBSA   11              0.66                   0.97 0.034 3.5
## ASALPUBL   23       -0.55 -0.65                   0.89 0.112 2.7
## CTAPROP    24              0.55                   0.55 0.450 2.7
## MORTNEON    9       -0.50  0.54  0.51             0.91 0.090 3.7
## MORTINF     8              0.51                   0.75 0.247 3.3
## NACVIV20   10                                     0.54 0.457 3.9
## MIGLIMIT    3                                     0.59 0.410 3.3
## 
##                        PA1  PA2  PA3  PA4  PA5  PA6
## SS loadings           7.01 5.50 3.15 1.60 1.03 0.95
## Proportion Var        0.28 0.22 0.13 0.06 0.04 0.04
## Cumulative Var        0.28 0.50 0.63 0.69 0.73 0.77
## Proportion Explained  0.36 0.29 0.16 0.08 0.05 0.05
## Cumulative Proportion 0.36 0.65 0.81 0.90 0.95 1.00
## 
## Mean item complexity =  2.6
## Test of the hypothesis that 6 factors are sufficient.
## 
## The degrees of freedom for the null model are  300  and the objective function was  33.7 with Chi Square of  2049.97
## The degrees of freedom for the model are 165  and the objective function was  8.58 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic number of observations is  71 with the empirical chi square  46.06  with prob <  1 
## The total number of observations was  71  with Likelihood Chi Square =  487.83  with prob <  1.2e-33 
## 
## Tucker Lewis Index of factoring reliability =  0.637
## RMSEA index =  0.165  and the 90 % confidence intervals are  0.15 0.184
## BIC =  -215.51
## Fit based upon off diagonal values = 0.99

The results show that some factors (PA5 and PA6 in the output above) have no factor loadings above the chosen cutoff of 0.5. The factorization is therefore repeated with a “varimax” rotation:

ff <- fa(base,nfactors=6,n.obs=20, use="complete.obs",rotate="varimax",fm="pa")
## Warning in fa.stats(r = r, f = f, phi = phi, n.obs = n.obs, np.obs = np.obs, :
## The estimated weights for the factor scores are probably incorrect. Try a
## different factor score estimation method.
print(ff, cut=.50, sort=TRUE)
## Factor Analysis using method =  pa
## Call: fa(r = base, nfactors = 6, n.obs = 20, rotate = "varimax", fm = "pa", 
##     use = "complete.obs")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          item   PA1   PA2   PA3   PA4   PA6   PA5   h2    u2 com
## VSINAGUA    4  0.93                               0.90 0.096 1.1
## ESCMEDIO   12 -0.90                               0.87 0.133 1.1
## VSINELEC    5  0.79                               0.95 0.054 2.0
## VIVPREC    17  0.76                               0.64 0.356 1.2
## ASISHTPI   14  0.71                               0.79 0.214 2.2
## PURBANA     1 -0.71                               0.78 0.218 2.2
## HSINGAS     6  0.68       -0.53                   0.96 0.039 2.9
## ANALFAB    13  0.67                               0.79 0.210 2.4
## TASDESMU   22       -0.88                         0.89 0.111 1.3
## TASDESTO   21       -0.83                         0.86 0.140 1.5
## TASACTTO   19        0.69        0.52             0.81 0.195 2.2
## TASACTMU   20        0.69        0.55             0.90 0.104 2.5
## IRRTENVI   18       -0.51                         0.68 0.323 4.2
## MIGINTPR    2                                     0.35 0.654 3.0
## PRECASAL   25              0.81                   0.83 0.170 1.6
## ASALPUBL   23             -0.80                   0.89 0.112 1.8
## JEFMUJER   16             -0.71                   0.88 0.121 2.5
## MIGLIMIT    3              0.66                   0.59 0.410 1.7
## MORTNEON    9                    0.92             0.91 0.090 1.1
## MORTINF     8                    0.83             0.75 0.247 1.2
## CONDESAG    7                         -0.64       0.66 0.340 2.1
## NACVIV20   10                          0.60       0.54 0.457 2.0
## PRPERHOG   15                          0.57       0.53 0.473 2.3
## SINCOBSA   11  0.52                          0.70 0.97 0.034 2.7
## CTAPROP    24                                0.63 0.55 0.450 1.8
## 
##                        PA1  PA2  PA3  PA4  PA6  PA5
## SS loadings           5.71 3.70 3.65 2.55 1.99 1.64
## Proportion Var        0.23 0.15 0.15 0.10 0.08 0.07
## Cumulative Var        0.23 0.38 0.52 0.62 0.70 0.77
## Proportion Explained  0.30 0.19 0.19 0.13 0.10 0.09
## Cumulative Proportion 0.30 0.49 0.68 0.81 0.91 1.00
## 
## Mean item complexity =  2
## Test of the hypothesis that 6 factors are sufficient.
## 
## The degrees of freedom for the null model are  300  and the objective function was  33.7 with Chi Square of  2049.97
## The degrees of freedom for the model are 165  and the objective function was  8.58 
## 
## The root mean square of the residuals (RMSR) is  0.03 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic number of observations is  71 with the empirical chi square  46.06  with prob <  1 
## The total number of observations was  71  with Likelihood Chi Square =  487.83  with prob <  1.2e-33 
## 
## Tucker Lewis Index of factoring reliability =  0.637
## RMSEA index =  0.165  and the 90 % confidence intervals are  0.15 0.184
## BIC =  -215.51
## Fit based upon off diagonal values = 0.99
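
The two outputs above share the same `h2` column and the same total of SS loadings (19.24 in both cases), because an orthogonal rotation such as varimax redistributes variance among factors without changing how much of each variable is explained. The following sketch illustrates that property with base R's `varimax()`. It uses principal-component loadings of the built-in `mtcars` data as a stand-in, which is an assumption made for brevity and not the principal-axis extraction used by `fa()`.

```r
# Illustration on mtcars with PCA-style loadings (an assumption for this sketch;
# the report itself uses principal-axis factoring via psych::fa).
R <- cor(mtcars)
e <- eigen(R)
k <- 3                                                   # factors to keep
L <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))      # unrotated loadings
rownames(L) <- colnames(mtcars)

rot   <- varimax(L)                 # orthogonal rotation
L_rot <- unclass(rot$loadings)      # rotated loadings as a plain matrix

h2_before <- rowSums(L^2)           # communalities before rotation
h2_after  <- rowSums(L_rot^2)       # communalities after rotation

max(abs(h2_before - h2_after))      # numerically zero
rbind(before = colSums(L^2),        # variance per factor changes...
      after  = colSums(L_rot^2))    # ...but its total does not
```

This is why the rotated solution is easier to interpret without fitting the data any better or worse: the fit statistics in both outputs are identical.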

With this new result, every factor has loadings above 0.5 (in absolute value) on more than one variable. At this step the authors nevertheless decide that one of the factors contributes little to the analysis. Accordingly, one factor is removed, leaving 5 of them:

# Store the factor scores in a new data frame
base_scores <- as.data.frame(ff$scores)
# Drop the sixth score column, keeping five factors
base_scores2 <- base_scores[,-6,drop=FALSE]

Second stage: Cluster analysis

At this stage the authors choose to form four clusters. To that end, an agglomerative hierarchical clustering is performed, based on Euclidean distances and using Ward's method:

# Agglomerative Nesting (Hierarchical Clustering)
base.agnes <- agnes(x = base_scores2,stand = TRUE,
                   metric = "euclidean",
                   method = "ward")

# Get the cluster number of each case, to be added to the data frame later
clust <- cutree(base.agnes, k = 4)

# Plot the dendrogram first, then the scatter plot of the clusters
fviz_dend(base.agnes, cex = 0.6, k = 4)
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

fviz_cluster(list(data = base_scores2, cluster = clust)) 
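
To make the clustering step easier to follow without the original dataset, here is a minimal sketch of the same pipeline (standardize, Ward agglomeration on Euclidean distances, cut into four groups) on simulated scores. The simulated `scores` matrix and its group centers are hypothetical and only serve the example.

```r
library(cluster)  # provides agnes()

set.seed(42)
# Simulated "factor scores": 4 groups of 15 cases in 5 dimensions (hypothetical data)
centers <- matrix(c(0, 0, 0, 0, 0,
                    6, 6, 0, 0, 0,
                    0, 6, 6, 0, 0,
                    6, 0, 0, 6, 6), nrow = 4, byrow = TRUE)
scores <- do.call(rbind, lapply(1:4, function(g) {
  # shift standard-normal noise by the center of group g, column by column
  sweep(matrix(rnorm(15 * 5), ncol = 5), 2, centers[g, ], "+")
}))

# Same settings as in the report: standardized variables, Euclidean metric, Ward's method
tree   <- agnes(scores, stand = TRUE, metric = "euclidean", method = "ward")
groups <- cutree(as.hclust(tree), k = 4)   # cluster label for each case

table(groups)   # cluster sizes
tree$ac         # agglomerative coefficient (closer to 1 = stronger structure)
```

Inspecting the cluster sizes and the agglomerative coefficient is a quick sanity check before interpreting the strata.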

Third stage: Correlation analysis between the resulting strata and the NBI data

Finally, the relationship between the resulting strata and the NBI data is examined by means of an analysis of variance.

## Add the NBI variable and the province name back to the data frame
base_cluster <- cbind(base, clust)
base_cluster2 <- base_cluster %>% rownames_to_column(var="DEPTO")
provincias <- basejor3[, c("DEPTO", "PCIA", "HOGNBI")]
base_cluster3 <- merge(base_cluster2, provincias, by = "DEPTO")

An ANOVA is run between the variables HOGNBI and PCIA:

anova1 <- aov(base_cluster3$HOGNBI ~ base_cluster3$PCIA)
summary(anova1)
##                    Df Sum Sq Mean Sq F value   Pr(>F)    
## base_cluster3$PCIA  3   3236  1078.7   8.734 5.72e-05 ***
## Residuals          67   8275   123.5                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# compute eta squared (between-groups sum of squares over total sum of squares)
determinacion1 <- 3236/(3236+8275)
determinacion1
## [1] 0.2811224

The resulting eta squared is 0.2811224, which matches the value obtained by the authors.
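
The eta squared above is typed in from the printed sums of squares. As a small sketch of how the same quantity could be read directly from the fitted model, the hypothetical helper `eta_squared` below extracts the sums of squares from a one-way `aov` fit. It is shown on the built-in `PlantGrowth` data, not on the report's data.

```r
# `eta_squared` is a hypothetical helper for one-way designs (single grouping term).
eta_squared <- function(fit) {
  tab <- summary(fit)[[1]]      # ANOVA table of the fit
  ss  <- tab[["Sum Sq"]]        # sums of squares: grouping term, then residuals
  ss[1] / sum(ss)               # SS_between / SS_total
}

fit <- aov(weight ~ group, data = PlantGrowth)
eta <- eta_squared(fit)
eta

# For a one-way model, eta squared equals the R-squared of the equivalent linear model
r2 <- summary(lm(weight ~ group, data = PlantGrowth))$r.squared
all.equal(eta, r2)
```

Reading the values from the model avoids rounding, since the printed table shows the sums of squares only to a few significant digits.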

Next, an ANOVA is run between HOGNBI and the stratum to which each department belongs:

anova2 <- aov(base_cluster3$HOGNBI ~ base_cluster3$clust)
summary(anova2)
##                     Df Sum Sq Mean Sq F value   Pr(>F)    
## base_cluster3$clust  1   3021    3021   24.55 4.94e-06 ***
## Residuals           69   8490     123                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
determinacion2 <- 3021/(3021+8490)
determinacion2
## [1] 0.2624446

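The eta squared obtained in this case is 0.2624446, which indicates an internal homogeneity similar to that obtained from the grouping by province, although somewhat lower for the structure based on the composite variable of impact of economic development on living conditions (IDECV) proposed by the authors in the paper. This figure should be read with caution: the ANOVA table shows a single degree of freedom for clust, which means the cluster label entered the model as a numeric variable rather than as a four-level factor. The value is therefore the share of variance explained by a linear trend across cluster numbers, and the eta squared for the four strata treated as groups can only be equal to or higher than 0.2624446.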
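
The effect of coding the grouping variable as a number or as a factor can be seen in the following sketch on simulated data. The vectors `g` and `y` are hypothetical and were chosen so that the group means are not ordered by label. Applying the same idea to the report would mean fitting `HOGNBI ~ factor(clust)`, the result of which is not shown here because it was not part of the original output.

```r
set.seed(7)
# Hypothetical data: 4 groups of 20 cases whose means are NOT ordered by label
g <- rep(1:4, each = 20)
y <- c(10, 30, 15, 25)[g] + rnorm(80, sd = 5)

fit_num <- aov(y ~ g)            # g numeric: one slope, 1 degree of freedom
fit_fac <- aov(y ~ factor(g))    # g as a factor: k - 1 = 3 degrees of freedom

# Share of the total sum of squares attributed to the grouping term
share <- function(fit) {
  ss <- summary(fit)[[1]][["Sum Sq"]]
  ss[1] / sum(ss)
}

df_num <- summary(fit_num)[[1]][["Df"]][1]
df_fac <- summary(fit_fac)[[1]][["Df"]][1]

c(df_numeric = df_num, df_factor = df_fac)
c(numeric = share(fit_num), factor = share(fit_fac))
```

Because the numeric model is nested in the factor model, the share of variance explained with `factor(g)` is never smaller than the one obtained with the numeric coding.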