Principal Components Analysis

I use a Principal Components Analysis (PCA) to reduce a set of crime variables to a single measure of generalized violence at the municipality level. To do so, I rely on a dataset from the Ministry of Public Security that records, for each municipality, the number of investigations opened for several crimes between January 2015 and January 2018. Generalized violence is a latent concept that is frequently invoked but hard to pin down, since violence can be understood as the number of homicides, extortions, robberies, and so on. In this sense, I use the twelve types of events recorded monthly in this dataset over that period. The variables are: sexual abuse, breaking and entering, threats, extortion, intentional homicides, intentional homicides committed with firearms, drug dealing, robberies, kidnapping, child trafficking, human trafficking and gender violence.
library(readr)
pca <- read_csv("C:/Users/gpe637/Desktop/pca.csv", locale = locale(encoding = "WINDOWS-1252"))
## Parsed with column specification:
## cols(
##   mpio = col_character(),
##   Abuso_sex = col_integer(),
##   Allanamiento = col_integer(),
##   Amenazas = col_integer(),
##   Extorsión = col_integer(),
##   Hom_dol = col_integer(),
##   Hom_AF = col_integer(),
##   Narcomenudeo = col_integer(),
##   Robo = col_integer(),
##   Secuestro = col_integer(),
##   Tráfico_menores = col_integer(),
##   Trata_pers = col_integer(),
##   viol_gen = col_integer()
## )
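# PCA on the twelve crime counts; each variable is centered and scaled to unit variance
grl.viol <- prcomp(~ Abuso_sex + Allanamiento + Amenazas + Extorsión + Hom_dol + Hom_AF + Narcomenudeo + Robo + Secuestro + Tráfico_menores + Trata_pers + viol_gen, data = pca, center = TRUE, scale. = TRUE, retx = TRUE)
screeplot(grl.viol, type = "l", main = "Scree Plot")
# Reference line at a variance (eigenvalue) of 1
abline(h = 1)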

The scree plot suggests that I could keep up to three principal components, since three of them have a variance above 1. However, after looking at the eigenvalues and the proportion of variance that each component explains, I will use only the first component as a summary measure of generalized violence, since this component alone accounts for 48.2% of the total variance in the dataset.
summary(grl.viol)
## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5     PC6
## Standard deviation     2.4062 1.1453 1.02572 0.98196 0.86851 0.81772
## Proportion of Variance 0.4825 0.1093 0.08767 0.08035 0.06286 0.05572
## Cumulative Proportion  0.4825 0.5918 0.67947 0.75983 0.82269 0.87841
##                            PC7     PC8     PC9    PC10   PC11    PC12
## Standard deviation     0.72283 0.67786 0.44284 0.41308 0.3304 0.03415
## Proportion of Variance 0.04354 0.03829 0.01634 0.01422 0.0091 0.00010
## Cumulative Proportion  0.92195 0.96024 0.97658 0.99080 0.9999 1.00000
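The proportions in this summary can be recomputed by hand from the standard deviations, which may make the 48.2% figure easier to trace. The sketch below retypes the printed (rounded) standard deviations instead of reading them from `grl.viol`, so it matches the table only up to rounding; `sdev`, `eig` and `prop` are names chosen here for illustration.

```r
# Standard deviations of the 12 components, retyped from the summary above (rounded)
sdev <- c(2.4062, 1.1453, 1.02572, 0.98196, 0.86851, 0.81772,
          0.72283, 0.67786, 0.44284, 0.41308, 0.3304, 0.03415)

# Eigenvalues are the squared standard deviations
eig <- sdev^2

# Share of total variance explained by each component
prop <- eig / sum(eig)

round(prop[1], 3)          # share explained by the first component
round(cumsum(prop)[3], 3)  # cumulative share of the first three
sum(eig > 1)               # number of components with eigenvalue above 1
```

Because the variables were standardized, the eigenvalues add up to roughly 12, the number of variables, so an eigenvalue above 1 means the component carries more variance than a single original variable.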
Looking at how the observed variables relate to the concept of generalized violence, all 12 variables load on the first component in the same direction, with robberies, sexual abuse and the two types of homicide loading more strongly than the others.
grl.viol$rotation
##                        PC1         PC2          PC3         PC4
## Abuso_sex       0.37313297 -0.20073977  0.007540438  0.03613778
## Allanamiento    0.22530818 -0.36738710  0.402895570 -0.10340670
## Amenazas        0.33456051 -0.29342986  0.227558799 -0.04191135
## Extorsión       0.32604558  0.15472373  0.258855986 -0.15443869
## Hom_dol         0.33698695  0.13918358 -0.449506559  0.20227489
## Hom_AF          0.33484021  0.14291004 -0.457759634  0.20178841
## Narcomenudeo    0.29731467 -0.27479053 -0.134040494  0.09162898
## Robo            0.37442778  0.06021909  0.155442856 -0.08955781
## Secuestro       0.23374708  0.45925096  0.049658044 -0.03660956
## Tráfico_menores 0.06522346 -0.18904169 -0.437778928 -0.84944473
## Trata_pers      0.24532590 -0.02187799  0.056627807  0.20864024
## viol_gen        0.13573729  0.59045015  0.260339179 -0.31291488
##                          PC5          PC6         PC7          PC8
## Abuso_sex       -0.092274980 -0.138761004  0.27857867 -0.020075237
## Allanamiento    -0.375458838  0.106733484 -0.65149748 -0.009511494
## Amenazas        -0.059191503  0.156192936  0.35426678 -0.045043840
## Extorsión        0.117548640  0.144890794  0.38923595 -0.392646521
## Hom_dol         -0.207088054  0.018373619 -0.14451855 -0.203219291
## Hom_AF          -0.206265886  0.003168989 -0.13860131 -0.204294587
## Narcomenudeo     0.038413913 -0.445812938  0.14678294  0.637678564
## Robo             0.083982098  0.001672761 -0.09862061 -0.057140316
## Secuestro       -0.009920715  0.600466807 -0.02201083  0.587465015
## Tráfico_menores  0.171037817  0.085189684 -0.08367363  0.005269063
## Trata_pers       0.833400698 -0.049680591 -0.35941089 -0.071650131
## viol_gen        -0.123797796 -0.595464807 -0.11360569  0.009813485
##                          PC9         PC10         PC11          PC12
## Abuso_sex       -0.143023852 -0.007165461 -0.832348472  0.0047963668
## Allanamiento     0.229915759  0.096812377 -0.045527571  0.0061400505
## Amenazas        -0.487675061  0.440708511  0.401177183  0.0008707900
## Extorsión        0.653738881 -0.044563611  0.095076864 -0.0031825170
## Hom_dol          0.010409997  0.078361507  0.092083571 -0.7099024067
## Hom_AF           0.018851284  0.085557146  0.101212361  0.7041127138
## Narcomenudeo     0.337806540 -0.072932086  0.248665827 -0.0077715025
## Robo            -0.355368817 -0.803775850  0.177651967  0.0066894040
## Secuestro        0.005841541  0.087009555 -0.129365706  0.0067806651
## Tráfico_menores -0.004430876  0.050129057 -0.026987226 -0.0011338625
## Trata_pers      -0.031412659  0.233950274 -0.074459690  0.0007408731
## viol_gen        -0.139917571  0.254390526  0.003773067 -0.0062698359
hist(grl.viol$x[, 1], main = "Generalized violence (first component)", xlab = "PC1")
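As a rough cross-check of the first-component loadings, the sketch below retypes the PC1 column and its standard deviation from the output above (rounded values, not read from `grl.viol`). It relies on two standard properties of a PCA on standardized variables: the loadings of a component have unit length, and a loading multiplied by the component's standard deviation gives the Pearson correlation between that variable and the component. The names `pc1`, `sd_pc1` and `implied_cor` are illustrative.

```r
# PC1 loadings and standard deviation, retyped from the printed output (rounded)
pc1 <- c(0.37313297, 0.22530818, 0.33456051, 0.32604558, 0.33698695, 0.33484021,
         0.29731467, 0.37442778, 0.23374708, 0.06522346, 0.24532590, 0.13573729)
names(pc1) <- c("Abuso_sex", "Allanamiento", "Amenazas", "Extorsión", "Hom_dol",
                "Hom_AF", "Narcomenudeo", "Robo", "Secuestro", "Tráfico_menores",
                "Trata_pers", "viol_gen")
sd_pc1 <- 2.4062

# The squared loadings of one component should add up to about 1
sum(pc1^2)

# Implied Pearson correlation of each standardized variable with PC1
implied_cor <- pc1 * sd_pc1
round(sort(implied_cor, decreasing = TRUE), 3)
```

These implied values are Pearson correlations, so they are not expected to coincide with the Spearman correlations reported further down.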

I check that all components are uncorrelated with each other: the off-diagonal entries below are zero up to floating-point error.
cor(grl.viol$x[, 1:12])
##                PC1           PC2           PC3           PC4           PC5
## PC1   1.000000e+00 -8.224271e-15  8.429844e-15 -1.761682e-15  8.590258e-15
## PC2  -8.224271e-15  1.000000e+00 -2.597696e-16 -4.172063e-15  5.561691e-15
## PC3   8.429844e-15 -2.597696e-16  1.000000e+00  1.840330e-15 -3.743650e-15
## PC4  -1.761682e-15 -4.172063e-15  1.840330e-15  1.000000e+00  5.347104e-15
## PC5   8.590258e-15  5.561691e-15 -3.743650e-15  5.347104e-15  1.000000e+00
## PC6  -2.455057e-14  8.301711e-15 -1.269299e-16 -8.450234e-16 -1.003133e-15
## PC7  -1.107187e-14 -2.587854e-15 -1.130055e-14  2.311488e-15  3.480810e-15
## PC8   4.157798e-14  1.476126e-15 -1.251287e-15  6.063587e-16 -2.475625e-15
## PC9  -4.460339e-14 -5.442950e-15  7.793485e-15 -4.658718e-15 -1.401181e-14
## PC10  3.806439e-15 -7.178673e-15  1.595645e-14 -3.412645e-15  6.866519e-15
## PC11 -1.206539e-15  2.385836e-14 -1.129848e-14  1.857051e-15 -4.145119e-15
## PC12  2.381617e-13 -9.995977e-14  3.420013e-15  1.905512e-14 -2.468253e-14
##                PC6           PC7           PC8           PC9          PC10
## PC1  -2.455057e-14 -1.107187e-14  4.157798e-14 -4.460339e-14  3.806439e-15
## PC2   8.301711e-15 -2.587854e-15  1.476126e-15 -5.442950e-15 -7.178673e-15
## PC3  -1.269299e-16 -1.130055e-14 -1.251287e-15  7.793485e-15  1.595645e-14
## PC4  -8.450234e-16  2.311488e-15  6.063587e-16 -4.658718e-15 -3.412645e-15
## PC5  -1.003133e-15  3.480810e-15 -2.475625e-15 -1.401181e-14  6.866519e-15
## PC6   1.000000e+00 -1.363717e-14  3.253300e-15 -6.255113e-16  1.392307e-15
## PC7  -1.363717e-14  1.000000e+00  1.473637e-14 -2.832683e-14 -9.810956e-15
## PC8   3.253300e-15  1.473637e-14  1.000000e+00  2.588709e-14 -9.142792e-15
## PC9  -6.255113e-16 -2.832683e-14  2.588709e-14  1.000000e+00  2.213250e-14
## PC10  1.392307e-15 -9.810956e-15 -9.142792e-15  2.213250e-14  1.000000e+00
## PC11  1.911890e-14  5.188142e-15 -2.338406e-14  2.965451e-14 -1.275821e-15
## PC12 -1.045714e-14  6.729314e-14  4.557733e-15 -6.997221e-14  6.906849e-14
##               PC11          PC12
## PC1  -1.206539e-15  2.381617e-13
## PC2   2.385836e-14 -9.995977e-14
## PC3  -1.129848e-14  3.420013e-15
## PC4   1.857051e-15  1.905512e-14
## PC5  -4.145119e-15 -2.468253e-14
## PC6   1.911890e-14 -1.045714e-14
## PC7   5.188142e-15  6.729314e-14
## PC8  -2.338406e-14  4.557733e-15
## PC9   2.965451e-14 -6.997221e-14
## PC10 -1.275821e-15  6.906849e-14
## PC11  1.000000e+00 -1.287521e-13
## PC12 -1.287521e-13  1.000000e+00
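The same check can be condensed into a single number instead of reading a 12 by 12 matrix. This is a minimal sketch on simulated counts, since the original CSV is not available here; `X`, `fit` and `max_offdiag` are hypothetical names and the tolerance of 1e-10 is an arbitrary choice.

```r
set.seed(42)

# Simulated count data standing in for the crime variables (assumption)
X <- matrix(rpois(300 * 5, lambda = 2), ncol = 5)
fit <- prcomp(X, center = TRUE, scale. = TRUE)

# Pearson correlation matrix of the component scores
R <- cor(fit$x)

# Largest absolute off-diagonal correlation; should be numerically zero
max_offdiag <- max(abs(R[upper.tri(R)]))
max_offdiag < 1e-10
```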
I save the scores of each municipality on every component, although I will only consider the first as the measure of generalized violence. Later I look at how the observed variables correlate with each other and with the components.
# Keep the component scores and use the row names as the merge key
scores <- data.frame(grl.viol$x)
scores$name <- rownames(grl.viol$x)
pca$name <- rownames(pca)
pca <- merge(pca, scores, by = "name", all.x = TRUE)
tail(names(pca), 20)
##  [1] "Hom_dol"         "Hom_AF"          "Narcomenudeo"   
##  [4] "Robo"            "Secuestro"       "Tráfico_menores"
##  [7] "Trata_pers"      "viol_gen"        "PC1"            
## [10] "PC2"             "PC3"             "PC4"            
## [13] "PC5"             "PC6"             "PC7"            
## [16] "PC8"             "PC9"             "PC10"           
## [19] "PC11"            "PC12"
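One detail worth keeping in mind with this merge is that the key is a character vector of row names, so `merge()` typically returns the rows sorted as text ("1", "10", "11", "2", ...) and not in their original order. The toy example below, with hypothetical data frames `toy_pca` and `toy_scores`, illustrates that the scores stay attached to the right rows and shows one way to restore the original order.

```r
# Toy stand-ins for `pca` and `scores`, keyed by row name (hypothetical data)
toy_pca <- data.frame(name = as.character(1:11), crimes = 1:11,
                      stringsAsFactors = FALSE)
toy_scores <- data.frame(name = as.character(1:11), PC1 = (1:11) * 10,
                         stringsAsFactors = FALSE)

merged <- merge(toy_pca, toy_scores, by = "name", all.x = TRUE)

# Each row still carries its own score, whatever order the rows come back in
all(merged$PC1 == merged$crimes * 10)

# Restore the original row order from the numeric value of the key
merged <- merged[order(as.integer(merged$name)), ]
merged$name
```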
round(cor(pca[, c("Abuso_sex","Allanamiento", "Amenazas", "Extorsión", "Hom_dol","Hom_AF", "Narcomenudeo", "Robo", "Secuestro", "Tráfico_menores", "Trata_pers", "viol_gen")], method = "spearman"), 3)
##                 Abuso_sex Allanamiento Amenazas Extorsión Hom_dol Hom_AF
## Abuso_sex           1.000        0.729    0.602     0.619   0.625  0.611
## Allanamiento        0.729        1.000    0.685     0.708   0.680  0.673
## Amenazas            0.602        0.685    1.000     0.554   0.642  0.629
## Extorsión           0.619        0.708    0.554     1.000   0.630  0.624
## Hom_dol             0.625        0.680    0.642     0.630   1.000  0.993
## Hom_AF              0.611        0.673    0.629     0.624   0.993  1.000
## Narcomenudeo        0.678        0.691    0.594     0.629   0.713  0.709
## Robo                0.731        0.826    0.749     0.735   0.816  0.808
## Secuestro           0.436        0.544    0.411     0.580   0.554  0.557
## Tráfico_menores     0.279        0.251    0.240     0.198   0.258  0.256
## Trata_pers          0.378        0.420    0.255     0.409   0.393  0.396
## viol_gen            0.198        0.318    0.042     0.351   0.267  0.275
##                 Narcomenudeo  Robo Secuestro Tráfico_menores Trata_pers
## Abuso_sex              0.678 0.731     0.436           0.279      0.378
## Allanamiento           0.691 0.826     0.544           0.251      0.420
## Amenazas               0.594 0.749     0.411           0.240      0.255
## Extorsión              0.629 0.735     0.580           0.198      0.409
## Hom_dol                0.713 0.816     0.554           0.258      0.393
## Hom_AF                 0.709 0.808     0.557           0.256      0.396
## Narcomenudeo           1.000 0.799     0.457           0.260      0.430
## Robo                   0.799 1.000     0.581           0.257      0.430
## Secuestro              0.457 0.581     1.000           0.167      0.395
## Tráfico_menores        0.260 0.257     0.167           1.000      0.219
## Trata_pers             0.430 0.430     0.395           0.219      1.000
## viol_gen               0.219 0.328     0.364           0.003      0.164
##                 viol_gen
## Abuso_sex          0.198
## Allanamiento       0.318
## Amenazas           0.042
## Extorsión          0.351
## Hom_dol            0.267
## Hom_AF             0.275
## Narcomenudeo       0.219
## Robo               0.328
## Secuestro          0.364
## Tráfico_menores    0.003
## Trata_pers         0.164
## viol_gen           1.000
round(cor(pca[, c("Abuso_sex", "Allanamiento", "Amenazas", "Extorsión", "Hom_dol","Hom_AF", "Narcomenudeo", "Robo", "Secuestro", "Tráfico_menores", "Trata_pers", "viol_gen", "PC1", "PC2","PC3","PC4","PC5","PC6","PC7","PC8","PC9","PC10","PC11", "PC12")], method = "spearman"), 3)
##                 Abuso_sex Allanamiento Amenazas Extorsión Hom_dol Hom_AF
## Abuso_sex           1.000        0.729    0.602     0.619   0.625  0.611
## Allanamiento        0.729        1.000    0.685     0.708   0.680  0.673
## Amenazas            0.602        0.685    1.000     0.554   0.642  0.629
## Extorsión           0.619        0.708    0.554     1.000   0.630  0.624
## Hom_dol             0.625        0.680    0.642     0.630   1.000  0.993
## Hom_AF              0.611        0.673    0.629     0.624   0.993  1.000
## Narcomenudeo        0.678        0.691    0.594     0.629   0.713  0.709
## Robo                0.731        0.826    0.749     0.735   0.816  0.808
## Secuestro           0.436        0.544    0.411     0.580   0.554  0.557
## Tráfico_menores     0.279        0.251    0.240     0.198   0.258  0.256
## Trata_pers          0.378        0.420    0.255     0.409   0.393  0.396
## viol_gen            0.198        0.318    0.042     0.351   0.267  0.275
## PC1                 0.754        0.811    0.758     0.758   0.889  0.883
## PC2                -0.015        0.088   -0.036     0.273   0.381  0.391
## PC3                 0.125        0.201    0.048     0.275  -0.281 -0.287
## PC4                -0.002       -0.091    0.067    -0.189   0.293  0.297
## PC5                -0.389       -0.423   -0.447    -0.264  -0.587 -0.582
## PC6                 0.120        0.274    0.366     0.365   0.268  0.266
## PC7                 0.184        0.029    0.187     0.201  -0.219 -0.231
## PC8                -0.172       -0.107   -0.188    -0.143  -0.242 -0.238
## PC9                -0.037        0.019   -0.168     0.374   0.075  0.074
## PC10                0.159        0.205    0.281     0.159   0.331  0.338
## PC11               -0.304       -0.043    0.101    -0.041   0.137  0.139
## PC12                0.266        0.325    0.303     0.232   0.389  0.433
##                 Narcomenudeo   Robo Secuestro Tráfico_menores Trata_pers
## Abuso_sex              0.678  0.731     0.436           0.279      0.378
## Allanamiento           0.691  0.826     0.544           0.251      0.420
## Amenazas               0.594  0.749     0.411           0.240      0.255
## Extorsión              0.629  0.735     0.580           0.198      0.409
## Hom_dol                0.713  0.816     0.554           0.258      0.393
## Hom_AF                 0.709  0.808     0.557           0.256      0.396
## Narcomenudeo           1.000  0.799     0.457           0.260      0.430
## Robo                   0.799  1.000     0.581           0.257      0.430
## Secuestro              0.457  0.581     1.000           0.167      0.395
## Tráfico_menores        0.260  0.257     0.167           1.000      0.219
## Trata_pers             0.430  0.430     0.395           0.219      1.000
## viol_gen               0.219  0.328     0.364           0.003      0.164
## PC1                    0.793  0.940     0.658           0.263      0.456
## PC2                    0.098  0.226     0.539          -0.168      0.094
## PC3                    0.011  0.095     0.178          -0.130      0.155
## PC4                    0.093 -0.012    -0.140          -0.148      0.156
## PC5                   -0.343 -0.463    -0.277          -0.017      0.273
## PC6                    0.074  0.274     0.579           0.088      0.065
## PC7                    0.035 -0.020    -0.102           0.002     -0.206
## PC8                   -0.105 -0.168     0.416          -0.051     -0.028
## PC9                    0.104 -0.034     0.062          -0.073     -0.007
## PC10                   0.112  0.163     0.361           0.003      0.199
## PC11                   0.027  0.036    -0.191          -0.078     -0.092
## PC12                   0.272  0.403     0.431           0.069      0.260
##                 viol_gen    PC1    PC2    PC3    PC4    PC5    PC6    PC7
## Abuso_sex          0.198  0.754 -0.015  0.125 -0.002 -0.389  0.120  0.184
## Allanamiento       0.318  0.811  0.088  0.201 -0.091 -0.423  0.274  0.029
## Amenazas           0.042  0.758 -0.036  0.048  0.067 -0.447  0.366  0.187
## Extorsión          0.351  0.758  0.273  0.275 -0.189 -0.264  0.365  0.201
## Hom_dol            0.267  0.889  0.381 -0.281  0.293 -0.587  0.268 -0.219
## Hom_AF             0.275  0.883  0.391 -0.287  0.297 -0.582  0.266 -0.231
## Narcomenudeo       0.219  0.793  0.098  0.011  0.093 -0.343  0.074  0.035
## Robo               0.328  0.940  0.226  0.095 -0.012 -0.463  0.274 -0.020
## Secuestro          0.364  0.658  0.539  0.178 -0.140 -0.277  0.579 -0.102
## Tráfico_menores    0.003  0.263 -0.168 -0.130 -0.148 -0.017  0.088  0.002
## Trata_pers         0.164  0.456  0.094  0.155  0.156  0.273  0.065 -0.206
## viol_gen           1.000  0.351  0.381  0.280 -0.366 -0.254 -0.097 -0.137
## PC1                0.351  1.000  0.314  0.034  0.054 -0.520  0.337 -0.077
## PC2                0.381  0.314  1.000 -0.131  0.041 -0.248  0.337 -0.419
## PC3                0.280  0.034 -0.131  1.000 -0.697  0.243  0.072  0.399
## PC4               -0.366  0.054  0.041 -0.697  1.000 -0.112 -0.118 -0.365
## PC5               -0.254 -0.520 -0.248  0.243 -0.112  1.000 -0.144  0.047
## PC6               -0.097  0.337  0.337  0.072 -0.118 -0.144  1.000  0.025
## PC7               -0.137 -0.077 -0.419  0.399 -0.365  0.047  0.025  1.000
## PC8                0.114 -0.158  0.195  0.171 -0.208  0.154  0.176 -0.016
## PC9                0.039  0.036  0.226 -0.024 -0.029  0.075  0.079  0.090
## PC10               0.246  0.351  0.341 -0.073  0.124 -0.239  0.244 -0.314
## PC11              -0.061  0.004  0.095 -0.281  0.187 -0.141 -0.002 -0.136
## PC12               0.071  0.450  0.212 -0.050  0.096 -0.209  0.366 -0.165
##                    PC8    PC9   PC10   PC11   PC12
## Abuso_sex       -0.172 -0.037  0.159 -0.304  0.266
## Allanamiento    -0.107  0.019  0.205 -0.043  0.325
## Amenazas        -0.188 -0.168  0.281  0.101  0.303
## Extorsión       -0.143  0.374  0.159 -0.041  0.232
## Hom_dol         -0.242  0.075  0.331  0.137  0.389
## Hom_AF          -0.238  0.074  0.338  0.139  0.433
## Narcomenudeo    -0.105  0.104  0.112  0.027  0.272
## Robo            -0.168 -0.034  0.163  0.036  0.403
## Secuestro        0.416  0.062  0.361 -0.191  0.431
## Tráfico_menores -0.051 -0.073  0.003 -0.078  0.069
## Trata_pers      -0.028 -0.007  0.199 -0.092  0.260
## viol_gen         0.114  0.039  0.246 -0.061  0.071
## PC1             -0.158  0.036  0.351  0.004  0.450
## PC2              0.195  0.226  0.341  0.095  0.212
## PC3              0.171 -0.024 -0.073 -0.281 -0.050
## PC4             -0.208 -0.029  0.124  0.187  0.096
## PC5              0.154  0.075 -0.239 -0.141 -0.209
## PC6              0.176  0.079  0.244 -0.002  0.366
## PC7             -0.016  0.090 -0.314 -0.136 -0.165
## PC8              1.000 -0.037  0.062 -0.256  0.039
## PC9             -0.037  1.000 -0.045  0.043 -0.121
## PC10             0.062 -0.045  1.000 -0.025  0.251
## PC11            -0.256  0.043 -0.025  1.000 -0.066
## PC12             0.039 -0.121  0.251 -0.066  1.000
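The non-zero correlations among components in the table above do not contradict the earlier check: principal components are uncorrelated in the Pearson sense, while Spearman correlations are computed on ranks and need not vanish, especially with strongly skewed variables. The sketch below illustrates this on simulated skewed data; everything in it (`f`, `X`, `fit`, the log-normal draws) is an assumption made for the illustration, not the original dataset.

```r
set.seed(7)

# Simulated right-skewed variables sharing a common factor (assumption)
f <- rlnorm(500)
X <- cbind(a = f + rlnorm(500),
           b = f^2 + rlnorm(500),
           c = sqrt(f) + rlnorm(500))

fit <- prcomp(X, center = TRUE, scale. = TRUE)

# Pearson: off-diagonal entries are zero up to floating-point error
pearson <- cor(fit$x, method = "pearson")
round(pearson, 3)

# Spearman: computed on ranks, so off-diagonal entries can differ from zero
spearman <- cor(fit$x, method = "spearman")
round(spearman, 3)
```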
Finally, I use the scores to plot the municipalities with the highest values of generalized violence in the period 2015-2018. The top three are all border cities: Tijuana, Juárez and Mexicali.
summary(pca$PC1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.5912 -0.5890 -0.5383  0.0000 -0.3494 47.5913
# Keep only the municipalities with a first-component score above 10
mpios_hv <- subset(pca, subset = PC1 > 10)
plot(mpios_hv$PC1, xlab = "Municipalities", ylab = "Generalized Violence", main = "Municipalities with the Highest Generalized Violence, 2015-18")
text(mpios_hv$PC1, labels = mpios_hv$mpio, cex = 0.6, pos = 4)
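The threshold of 10 selects the municipalities to plot, but the top three can also be listed directly by sorting the scores. A minimal sketch on made-up data follows; `toy`, its municipality labels and its scores are hypothetical, and with the real data the same two lines would be applied to `pca`.

```r
# Made-up scores for five hypothetical municipalities
toy <- data.frame(mpio = c("A", "B", "C", "D", "E"),
                  PC1 = c(-0.5, 12.3, 47.6, 0.2, 20.1),
                  stringsAsFactors = FALSE)

# Sort by first-component score, highest first, and keep the top three
top3 <- head(toy[order(toy$PC1, decreasing = TRUE), ], 3)
top3$mpio
```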