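Singular value decomposition ◦ Decompose a matrix A as follows.
$$A = U D V^{\top}$$
D is a matrix whose entries off the diagonal are zero; its diagonal entries are called the singular values. The column vectors of U and V are called the singular vectors.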
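Eigendecomposition ◦ Decompose a square matrix A as follows.
$$A = P \Lambda P^{-1}$$
The diagonal entries of Λ are called the eigenvalues, and the column vectors of P are called the eigenvectors.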
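As a quick check of the two definitions, the sketch below rebuilds a matrix from its factors. The 2 x 2 matrix `A` is an arbitrary example chosen for illustration and is not part of the original notes.

```r
# Small symmetric matrix used only for illustration (an assumption, not from the notes).
A <- matrix(c(4, 1, 1, 3), nrow = 2)

# Singular value decomposition: A = U D V^T
s <- svd(A)
A_from_svd <- s$u %*% diag(s$d) %*% t(s$v)

# Eigendecomposition: A = P Lambda P^{-1}
e <- eigen(A)
A_from_eigen <- e$vectors %*% diag(e$values) %*% solve(e$vectors)

# Both reconstructions should match A up to floating-point error.
all.equal(A, A_from_svd)
all.equal(A, A_from_eigen)
```

`svd()` also accepts rectangular matrices, whereas `eigen()` requires a square one.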
#library(MASS)
data(USArrests)
head(USArrests)
## Murder Assault UrbanPop Rape
## Alabama 13.2 236 58 21.2
## Alaska 10.0 263 48 44.5
## Arizona 8.1 294 80 31.0
## Arkansas 8.8 190 50 19.5
## California 9.0 276 91 40.6
## Colorado 7.9 204 78 38.7
summary(USArrests)
## Murder Assault UrbanPop Rape
## Min. : 0.800 Min. : 45.0 Min. :32.00 Min. : 7.30
## 1st Qu.: 4.075 1st Qu.:109.0 1st Qu.:54.50 1st Qu.:15.07
## Median : 7.250 Median :159.0 Median :66.00 Median :20.10
## Mean : 7.788 Mean :170.8 Mean :65.54 Mean :21.23
## 3rd Qu.:11.250 3rd Qu.:249.0 3rd Qu.:77.75 3rd Qu.:26.18
## Max. :17.400 Max. :337.0 Max. :91.00 Max. :46.00
str(USArrests)
## 'data.frame': 50 obs. of 4 variables:
## $ Murder : num 13.2 10 8.1 8.8 9 7.9 3.3 5.9 15.4 17.4 ...
## $ Assault : int 236 263 294 190 276 204 110 238 335 211 ...
## $ UrbanPop: int 58 48 80 50 91 78 77 72 80 60 ...
## $ Rape : num 21.2 44.5 31 19.5 40.6 38.7 11.1 15.8 31.9 25.8 ...
#1
svd(var(USArrests))
## $d
## [1] 7011.114851 201.992366 42.112651 6.164246
##
## $u
## [,1] [,2] [,3] [,4]
## [1,] -0.04170432 0.04482166 -0.07989066 -0.99492173
## [2,] -0.99522128 0.05876003 0.06756974 0.03893830
## [3,] -0.04633575 -0.97685748 0.20054629 -0.05816914
## [4,] -0.07515550 -0.20071807 -0.97408059 0.07232502
##
## $v
## [,1] [,2] [,3] [,4]
## [1,] -0.04170432 0.04482166 -0.07989066 -0.99492173
## [2,] -0.99522128 0.05876003 0.06756974 0.03893830
## [3,] -0.04633575 -0.97685748 0.20054629 -0.05816914
## [4,] -0.07515550 -0.20071807 -0.97408059 0.07232502
#2
eigen(var(USArrests))
## $values
## [1] 7011.114851 201.992366 42.112651 6.164246
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] -0.04170432 0.04482166 0.07989066 0.99492173
## [2,] -0.99522128 0.05876003 -0.06756974 -0.03893830
## [3,] -0.04633575 -0.97685748 -0.20054629 0.05816914
## [4,] -0.07515550 -0.20071807 0.97408059 -0.07232502
#3
prcomp(USArrests)
## Standard deviations:
## [1] 83.732400 14.212402 6.489426 2.482790
##
## Rotation:
## PC1 PC2 PC3 PC4
## Murder 0.04170432 -0.04482166 0.07989066 -0.99492173
## Assault 0.99522128 -0.05876003 -0.06756974 0.03893830
## UrbanPop 0.04633575 0.97685748 -0.20054629 -0.05816914
## Rape 0.07515550 0.20071807 0.97408059 0.07232502
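All three give the same result, up to the sign of each vector (eigendecomposition can only handle square matrices). Note that prcomp reports standard deviations, which are the square roots of the eigenvalues. This shows that principal component analysis amounts to a singular value decomposition of the covariance matrix of the data.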
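The agreement between the three outputs can also be checked numerically instead of by eye. The following is a minimal sketch; the variable names `S`, `e`, `s`, and `p` are introduced here for illustration only.

```r
S <- var(USArrests)        # covariance matrix of the data
e <- eigen(S)
s <- svd(S)
p <- prcomp(USArrests)

# Singular values of the covariance matrix equal its eigenvalues.
all.equal(s$d, e$values)

# prcomp reports standard deviations; squaring them gives the eigenvalues.
all.equal(p$sdev^2, e$values)

# Loadings and eigenvectors agree once the arbitrary signs are ignored.
all.equal(abs(unname(p$rotation)), abs(unname(e$vectors)))
```

Comparing absolute values is a simple way to ignore the sign of each vector, which is not determined by the decomposition.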
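In addition, as shown next, principal component analysis of standardized data is a singular value decomposition of the correlation matrix (the vectors also match the right singular vectors from a singular value decomposition of the standardized data itself).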
svd(cor(USArrests))
## $d
## [1] 2.4802416 0.9897652 0.3565632 0.1734301
##
## $u
## [,1] [,2] [,3] [,4]
## [1,] -0.5358995 0.4181809 -0.3412327 0.64922780
## [2,] -0.5831836 0.1879856 -0.2681484 -0.74340748
## [3,] -0.2781909 -0.8728062 -0.3780158 0.13387773
## [4,] -0.5434321 -0.1673186 0.8177779 0.08902432
##
## $v
## [,1] [,2] [,3] [,4]
## [1,] -0.5358995 0.4181809 -0.3412327 0.64922780
## [2,] -0.5831836 0.1879856 -0.2681484 -0.74340748
## [3,] -0.2781909 -0.8728062 -0.3780158 0.13387773
## [4,] -0.5434321 -0.1673186 0.8177779 0.08902432
svd(scale(USArrests))
## $d
## [1] 11.024148 6.964086 4.179904 2.915146
##
## $u
## [,1] [,2] [,3] [,4]
## [1,] -0.088502119 0.16111249 -0.105218608 0.0530665011
## [2,] -0.175119011 0.15255799 0.483145153 -0.1489378244
## [3,] -0.158329049 -0.10603826 0.012974042 -0.2834384049
## [4,] 0.012699298 0.15917987 0.027135114 -0.0620804497
## [5,] -0.226649068 -0.21932910 0.141759482 -0.1161380178
## [6,] -0.136005136 -0.14038162 0.259336498 0.0004974586
## [7,] 0.122004202 -0.15479183 -0.152346210 -0.0402308322
## [8,] -0.004284214 -0.04624999 -0.170197773 -0.2995093258
## [9,] -0.270566006 0.00557636 -0.136613685 -0.0326971796
## [10,] -0.147204794 0.18180252 -0.081106695 0.3656676469
## [11,] 0.081955040 -0.22324195 0.012026954 0.3065826885
## [12,] 0.147251202 0.02998994 0.061530174 -0.1694899353
## [13,] -0.123823808 -0.09692418 -0.160455000 -0.0414370084
## [14,] 0.045389560 -0.02154472 0.054011475 0.1442115222
## [15,] 0.202373535 -0.01479136 0.038974667 0.0059617844
## [16,] 0.071558552 -0.03840409 0.006051930 0.0701237800
## [17,] 0.067425851 0.13624293 -0.006718884 0.2277132300
## [18,] -0.140517959 0.12382100 -0.185555941 0.1544203417
## [19,] 0.215231159 0.05350432 -0.015555919 -0.1122203023
## [20,] -0.158347534 0.06079147 -0.037242408 -0.1898534932
## [21,] 0.043656895 -0.20960067 -0.144350623 -0.0609897144
## [22,] -0.189334383 -0.02208976 0.091150533 0.0347643443
## [23,] 0.151999911 -0.08987636 0.036252509 0.0228600295
## [24,] -0.089483486 0.34027971 -0.175449706 0.0731840095
## [25,] -0.062570302 -0.03743606 0.089392087 0.0766873549
## [26,] 0.106451539 0.07631705 0.058472149 0.0420214180
## [27,] 0.113651981 -0.02757065 0.041582129 0.0053970395
## [28,] -0.258115678 -0.11025209 0.275529769 0.1068057898
## [29,] 0.214071497 -0.00257041 0.008728664 -0.0112530536
## [30,] -0.016304324 -0.20604821 -0.181049718 0.0826499280
## [31,] -0.177802722 0.02030605 0.043504824 -0.1153016522
## [32,] -0.151092550 -0.11701618 -0.152302994 -0.0045791345
## [33,] -0.100877464 0.31671218 -0.204524432 -0.3240968906
## [34,] 0.268696706 0.08516514 0.071353148 -0.0862511360
## [35,] 0.020291306 -0.10550966 -0.007374849 0.1609363198
## [36,] 0.027997563 -0.04091867 -0.003625902 0.0035087357
## [37,] -0.005309061 -0.07696200 0.222585787 -0.0807475502
## [38,] 0.079778211 -0.08118230 -0.094883089 0.1219329726
## [39,] 0.077565243 -0.21208574 -0.324451737 -0.2083610270
## [40,] -0.118598723 0.27483477 -0.071178009 -0.0446445538
## [41,] 0.178498756 0.11703880 0.092198470 -0.0372092940
## [42,] -0.089775081 0.12228530 0.044544714 0.2217051038
## [43,] -0.121689077 -0.05863443 -0.116539361 0.2184216921
## [44,] 0.049439812 -0.20917537 0.069565218 -0.0279528910
## [45,] 0.251561949 0.19933619 0.199240942 -0.0492029259
## [46,] 0.008650709 0.02839251 0.002773945 0.0717790646
## [47,] 0.019477550 -0.13790380 0.147991604 -0.0749973365
## [48,] 0.189347337 0.20254292 0.024814359 0.0447947015
## [49,] 0.186754750 -0.08689225 -0.032888157 0.0625194853
## [50,] 0.056521430 0.04563221 -0.056996643 -0.0565930091
##
## $v
## [,1] [,2] [,3] [,4]
## [1,] -0.5358995 0.4181809 -0.3412327 0.64922780
## [2,] -0.5831836 0.1879856 -0.2681484 -0.74340748
## [3,] -0.2781909 -0.8728062 -0.3780158 0.13387773
## [4,] -0.5434321 -0.1673186 0.8177779 0.08902432
prcomp(USArrests, scale = TRUE)
## Standard deviations:
## [1] 1.5748783 0.9948694 0.5971291 0.4164494
##
## Rotation:
## PC1 PC2 PC3 PC4
## Murder -0.5358995 0.4181809 -0.3412327 0.64922780
## Assault -0.5831836 0.1879856 -0.2681484 -0.74340748
## UrbanPop -0.2781909 -0.8728062 -0.3780158 0.13387773
## Rape -0.5434321 -0.1673186 0.8177779 0.08902432
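The singular values of the standardized data (`$d` from `svd(scale(USArrests))`) look different from the other outputs, but they are related by a factor of n - 1. A small sketch of that relationship follows; the variable names are chosen here for illustration, and `scale.` is the full name of the prcomp argument.

```r
X <- scale(USArrests)      # centered columns with unit variance
n <- nrow(X)
s <- svd(X)
p <- prcomp(USArrests, scale. = TRUE)

# Singular values divided by sqrt(n - 1) give prcomp's standard deviations.
all.equal(s$d / sqrt(n - 1), p$sdev)

# Their squares divided by (n - 1) are the eigenvalues of the correlation matrix.
all.equal(s$d^2 / (n - 1), eigen(cor(USArrests))$values)

# The right singular vectors match the loadings up to sign.
all.equal(abs(s$v), abs(unname(p$rotation)))
```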
summary(prcomp(USArrests, scale = TRUE))
## Importance of components:
## PC1 PC2 PC3 PC4
## Standard deviation 1.5749 0.9949 0.59713 0.41645
## Proportion of Variance 0.6201 0.2474 0.08914 0.04336
## Cumulative Proportion 0.6201 0.8675 0.95664 1.00000
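The "Proportion of Variance" row can be reproduced by hand from the standard deviations, which may make the summary easier to interpret. This is a sketch; `prop_var` and `cum_prop` are names introduced here.

```r
p <- prcomp(USArrests, scale. = TRUE)

# Each component's variance as a share of the total variance.
prop_var <- p$sdev^2 / sum(p$sdev^2)

# Running total, corresponding to the "Cumulative Proportion" row.
cum_prop <- cumsum(prop_var)

round(prop_var, 5)
round(cum_prop, 5)
```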
biplot(prcomp(USArrests, scale = TRUE))
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1,v2,v3,v4,v5,v6)
a <- factanal(m1, factors = 3) # varimax is the default
a$loadings
##
## Loadings:
## Factor1 Factor2 Factor3
## v1 0.944 0.182 0.267
## v2 0.905 0.235 0.159
## v3 0.236 0.210 0.946
## v4 0.180 0.242 0.828
## v5 0.242 0.881 0.286
## v6 0.193 0.959 0.196
##
## Factor1 Factor2 Factor3
## SS loadings 1.893 1.886 1.797
## Proportion Var 0.316 0.314 0.300
## Cumulative Var 0.316 0.630 0.929
factanal(m1, factors = 3, rotation = "promax")
##
## Call:
## factanal(x = m1, factors = 3, rotation = "promax")
##
## Uniquenesses:
## v1 v2 v3 v4 v5 v6
## 0.005 0.101 0.005 0.224 0.084 0.005
##
## Loadings:
## Factor1 Factor2 Factor3
## v1 0.985
## v2 0.951
## v3 1.003
## v4 0.867
## v5 0.910
## v6 1.033
##
## Factor1 Factor2 Factor3
## SS loadings 1.903 1.876 1.772
## Proportion Var 0.317 0.313 0.295
## Cumulative Var 0.317 0.630 0.925
##
## Factor Correlations:
## Factor1 Factor2 Factor3
## Factor1 1.000 0.462 0.460
## Factor2 0.462 1.000 0.501
## Factor3 0.460 0.501 1.000
##
## The degrees of freedom for the model is 0 and the fit was 0.4755
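Factor analysis and principal component analysis summarize the same data differently, largely because the factor solution is rotated (varimax or promax above) while the principal components are not.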
prcomp(m1)
## Standard deviations:
## [1] 3.0368683 1.6313757 1.5818857 0.6344131 0.3190765 0.2649086
##
## Rotation:
## PC1 PC2 PC3 PC4 PC5 PC6
## v1 0.4168038 -0.52292304 0.2354298 -0.2686501 0.5157193 -0.39907358
## v2 0.3885610 -0.50887673 0.2985906 0.3060519 -0.5061522 0.38865228
## v3 0.4182779 0.01521834 -0.5555132 -0.5686880 -0.4308467 -0.08474731
## v4 0.3943646 0.02184360 -0.5986150 0.5922259 0.3558110 0.09124977
## v5 0.4254013 0.47017231 0.2923345 -0.2789775 0.3060409 0.58397162
## v6 0.4047824 0.49580764 0.3209708 0.2866938 -0.2682391 -0.57719858
summary(prcomp(m1))
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6
## Standard deviation 3.0369 1.6314 1.5819 0.6344 0.31908 0.26491
## Proportion of Variance 0.6165 0.1779 0.1673 0.0269 0.00681 0.00469
## Cumulative Proportion 0.6165 0.7943 0.9616 0.9885 0.99531 1.00000
prcomp(m1, scale = TRUE)
## Standard deviations:
## [1] 1.9225064 1.0359124 1.0003870 0.4012524 0.2023886 0.1676783
##
## Rotation:
## PC1 PC2 PC3 PC4 PC5 PC6
## v1 0.4154985 -0.53088297 0.1760717 -0.2791358 0.5317514 -0.39223298
## v2 0.4007058 -0.54223870 0.2485226 0.3048547 -0.5042931 0.36932463
## v3 0.4133938 0.07418871 -0.5496063 -0.5693303 -0.4344463 -0.09302655
## v4 0.3940548 0.08433475 -0.5976225 0.5877130 0.3543977 0.09721936
## v5 0.4206885 0.44028459 0.3342420 -0.2798686 0.2920358 0.59484588
## v6 0.4045287 0.46655507 0.3691854 0.2850910 -0.2516003 -0.58121033
summary(prcomp(m1, scale = TRUE))
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6
## Standard deviation 1.923 1.0359 1.0004 0.40125 0.20239 0.16768
## Proportion of Variance 0.616 0.1789 0.1668 0.02683 0.00683 0.00469
## Cumulative Proportion 0.616 0.7949 0.9617 0.98849 0.99531 1.00000
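To see what rotation does and does not change in the factor solution, one option is to fit the same model with `rotation = "none"` and compare it with the default varimax fit. The sketch below rebuilds `m1` so that it runs on its own; `fa_none`, `fa_varimax`, `h2_none`, and `h2_varimax` are names introduced here for illustration.

```r
# Same toy data as in the notes, rebuilt so the block is self-contained.
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

fa_none    <- factanal(m1, factors = 3, rotation = "none")
fa_varimax <- factanal(m1, factors = 3)   # varimax is the default

# Communality of each variable: sum of squared loadings across the factors.
h2_none    <- rowSums(unclass(loadings(fa_none))^2)
h2_varimax <- rowSums(unclass(loadings(fa_varimax))^2)

# An orthogonal rotation redistributes the loadings among the factors
# but leaves each variable's communality unchanged.
all.equal(h2_none, h2_varimax, tolerance = 1e-6)

# Print both loading matrices to compare how the variables are grouped.
loadings(fa_none)
loadings(fa_varimax)
```

This invariance holds for orthogonal rotations such as varimax; promax is an oblique rotation, so its factors are correlated and the same identity does not apply directly.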