Basic Concept

Factor analysis is a multivariate technique used to reduce a set of items to a smaller number of underlying structures (factors) for more meaningful analysis. Its two main objectives are data summarization and data reduction. Data summarization uses the loadings to describe how each item contributes to the underlying structure. Data reduction, on the other hand, uses the loadings to construct a new factor score that represents the underlying structure.
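As a rough illustration of the two objectives, the sketch below uses six numeric columns of R's built-in mtcars data as a stand-in (an assumption; it is not the stock data analysed later) and base R's prcomp() rather than the functions used in the rest of this tutorial.

```r
# Stand-in data (assumption): six numeric columns of the built-in mtcars set
x <- scale(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])

pca <- prcomp(x)  # principal components of the standardized data

# Data summarization: correlation of each item with the first component
item_loadings <- cor(x, pca$x[, 1])
round(item_loadings, 2)

# Data reduction: one score per observation stands in for the six columns
scores <- pca$x[, 1]
head(round(scores, 2))
```

The loadings are read item by item, while the score vector can replace the original columns in a later analysis.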

Stock Price

Below are data on variables assumed to affect stock price movements on the Indonesia Stock Exchange (IDX) in Jakarta, from 30th August 2001 to 27th September 2001. Based on the data, is it possible to reduce all the variables that affect stock price movement to several factors? If so, how many factors are formed? Which variables belong to which factor?

library(foreign)
# import data from spss file into data.frame
newdata <- read.spss(file = "C:/Users/asus/Google Drive/Ilham Fadhil/Tutor/Advanced Statistics/Archive/Materi/Week 11/Stock Price.sav",to.data.frame = TRUE, use.value.labels = TRUE)
head(newdata)
##         date    ihsg vol_shm  vol_rp frek_tr     as_sell      as_buy
## 1 31-Jul-01  443.194  462.91 403.821  14.977 48560441075 15508027500
## 2 01-Aug-01  436.460  523.37 348.256  13.712 40240610000 15424025000
## 3 02-Aug-01  453.150  352.43 265.913  10.255  8776482500 14388347500
## 4 05-Aug-01  430.810  469.42 245.608  13.214 40476617500  7514912500
## 5 06-Aug-01  432.936  485.12 395.627  15.514 14426367500 21616615000
## 6 07-Aug-01  442.526  713.70 586.387  22.433 25636831500 25850870000
##   ind_agri ind_min ind_bas ind_mis ind_con ind_prop ind_infr ind_fin
## 1  190.554 137.512  51.695  90.195 150.481   33.694  108.824  37.915
## 2  186.278 131.227  51.536  88.869 148.447   31.857  107.218  37.644
## 3  187.016 128.261  51.236  88.629 147.710   31.082  108.240  37.510
## 4  183.073 121.375  50.903  87.993 147.796   30.885  105.775  36.952
## 5  184.481 127.812  51.539  89.065 147.289   31.024  105.983  36.871
## 6  186.149 132.508  51.755  90.239 151.074   31.067  109.355  37.842
##   ind_trad ind_manu    i_dow i_nikkei inflasi   sbi kurs_.  kurs_y    jub
## 1  130.783  103.164 10522.81 11959.33   12.23 17.15   9425 7538.57 713963
## 2  128.911  101.938 10510.01 12339.20   12.23 17.15   9573 7680.29 713963
## 3  128.299  101.463 10551.01 12241.97   12.23 17.15   9658 7802.95 713963
## 4  128.302  101.232 10512.78 12243.90   12.23 17.15   9455 7627.23 713963
## 5  129.672  101.496 10401.31 12319.46   12.23 17.15   9285 7496.75 713963
## 6  131.772  103.446 10458.74 12163.67   12.23 17.13   9295 7523.43 712015
# examine the data.frame 
str(newdata)
## 'data.frame':    65 obs. of  24 variables:
##  $ date    : Factor w/ 65 levels "          ","01-Aug-01 ",..: 65 2 4 11 13 15 17 19 26 28 ...
##  $ ihsg    : num  443 436 453 431 433 ...
##  $ vol_shm : num  463 523 352 469 485 ...
##  $ vol_rp  : num  404 348 266 246 396 ...
##  $ frek_tr : num  15 13.7 10.3 13.2 15.5 ...
##  $ as_sell : num  4.86e+10 4.02e+10 8.78e+09 4.05e+10 1.44e+10 ...
##  $ as_buy  : num  1.55e+10 1.54e+10 1.44e+10 7.51e+09 2.16e+10 ...
##  $ ind_agri: num  191 186 187 183 184 ...
##  $ ind_min : num  138 131 128 121 128 ...
##  $ ind_bas : num  51.7 51.5 51.2 50.9 51.5 ...
##  $ ind_mis : num  90.2 88.9 88.6 88 89.1 ...
##  $ ind_con : num  150 148 148 148 147 ...
##  $ ind_prop: num  33.7 31.9 31.1 30.9 31 ...
##  $ ind_infr: num  109 107 108 106 106 ...
##  $ ind_fin : num  37.9 37.6 37.5 37 36.9 ...
##  $ ind_trad: num  131 129 128 128 130 ...
##  $ ind_manu: num  103 102 101 101 101 ...
##  $ i_dow   : num  10523 10510 10551 10513 10401 ...
##  $ i_nikkei: num  11959 12339 12242 12244 12319 ...
##  $ inflasi : num  12.2 12.2 12.2 12.2 12.2 ...
##  $ sbi     : num  17.1 17.1 17.1 17.1 17.1 ...
##  $ kurs_.  : num  9425 9573 9658 9455 9285 ...
##  $ kurs_y  : num  7539 7680 7803 7627 7497 ...
##  $ jub     : num  713963 713963 713963 713963 713963 ...
##  - attr(*, "variable.labels")= Named chr  "Date" "IHSG gabungan" "Stock Volume (million shares)" "Stock Transaction (billion rupiah)" ...
##   ..- attr(*, "names")= chr  "date" "ihsg" "vol_shm" "vol_rp" ...

The items consist of:

Variable Name   Description
date            Date
ihsg            Composite stock price index (IHSG)
vol_shm         Stock volume (million shares)
vol_rp          Stock transaction (billion rupiah)
frek_tr         Trading frequency (times)
as_sell         Total foreign sell (rupiah)
as_buy          Foreign buy (rupiah)
ind_agri        Agribusiness index
ind_min         Mining index
ind_bas         Basic industry index
ind_mis         Miscellaneous index
ind_con         Consumer index
ind_prop        Property index
ind_infr        Infrastructure index
ind_fin         Financial index
ind_trad        Trading index
ind_manu        Manufacturing index
i_dow           Dow Jones Index (USA)
i_nikkei        Nikkei Index (Japan)
inflasi         Inflation rate (percentage)
sbi             SBI rate (percentage)
kurs_.          Exchange rate, dollar to rupiah
kurs_y          Exchange rate, yen to rupiah
jub             Total money supply (billion rupiah)

Since the variables are measured in different units, we standardize them by transforming each one into z-scores. We also drop the date variable, since it is not used in the analysis.

newdata_z <- as.data.frame(scale(newdata[2:24]))
head(newdata_z)
##        ihsg      vol_shm     vol_rp     frek_tr     as_sell     as_buy
## 1 1.0410846 -0.171579449  0.2084396  0.39703879  0.85444308 -0.4968689
## 2 0.8061833 -0.005019057 -0.1732826  0.05889172  0.51060813 -0.5018657
## 3 1.3883786 -0.475939226 -0.7389653 -0.86519872 -0.78971473 -0.5634720
## 4 0.6090950 -0.153645143 -0.8784572 -0.07422863  0.52036165 -0.9723317
## 5 0.6832560 -0.110393437  0.1521482  0.54058422 -0.55622106 -0.1335055
## 6 1.0177829  0.519318345  1.4626375  2.39010166 -0.09292448  0.1183651
##   ind_agri     ind_min   ind_bas   ind_mis   ind_con ind_prop  ind_infr
## 1 1.558442  1.07172119 0.6908141 1.1157250 0.9578381 2.378684 0.7243257
## 2 1.314510  0.64594489 0.6442168 0.9092797 0.7417340 1.620699 0.5123842
## 3 1.356610  0.44501371 0.5562972 0.8719141 0.6634308 1.300918 0.6472560
## 4 1.131675 -0.02147723 0.4587065 0.7728951 0.6725679 1.219631 0.3219537
## 5 1.211997  0.41459628 0.6450960 0.9397950 0.6187012 1.276986 0.3494031
## 6 1.307151  0.73272603 0.7083980 1.1225754 1.0208419 1.294728 0.7944009
##      ind_fin ind_trad  ind_manu    i_dow i_nikkei   inflasi       sbi
## 1 0.52196438 1.391767 0.9550151 1.435530 1.004070 -1.006213 -1.991139
## 2 0.39779147 1.187841 0.7700337 1.415496 1.270179 -1.006213 -1.991139
## 3 0.33639232 1.121174 0.6983647 1.479666 1.202067 -1.006213 -1.991139
## 4 0.08071526 1.121500 0.6635109 1.419831 1.203419 -1.006213 -1.991139
## 5 0.04360084 1.270741 0.7033438 1.245366 1.256351 -1.006213 -1.991139
## 6 0.48851559 1.499503 0.9975638 1.335252 1.147216 -1.006213 -2.097732
##         kurs_.      kurs_y          jub
## 1 -0.049428736 -0.64072244 -0.007890826
## 2  0.211400784 -0.33825941 -0.007890826
## 3  0.361201522 -0.07647479 -0.007890826
## 4  0.003442113 -0.45150164 -0.007890826
## 5 -0.296159364 -0.72997593 -0.007890826
## 6 -0.278535747 -0.67303468 -0.949292748
# check for missing data
colSums(is.na(newdata_z))
##     ihsg  vol_shm   vol_rp  frek_tr  as_sell   as_buy ind_agri  ind_min 
##        1        1        1        1        1        1        1        1 
##  ind_bas  ind_mis  ind_con ind_prop ind_infr  ind_fin ind_trad ind_manu 
##        1        1        1        1        1        1        1        1 
##    i_dow i_nikkei  inflasi      sbi   kurs_.   kurs_y      jub 
##        1        1        1        1        1        1        1
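Each column reports exactly one missing value, which would be consistent with a single empty row in the file (an assumption; the row itself is not shown). The sketch below uses a small hypothetical data frame, not the stock data, to show what scale() computes and how rows with missing values can be located and set aside.

```r
# Hypothetical toy data with different units and one empty row
toy <- data.frame(price  = c(10, 12, 14, NA),
                  volume = c(1000, 3000, 2000, NA))

# z-score by hand: subtract the column mean, divide by the column sd
z_manual <- (toy$price - mean(toy$price, na.rm = TRUE)) /
  sd(toy$price, na.rm = TRUE)

# scale() does the same for every column, ignoring NA for mean and sd
toy_z <- as.data.frame(scale(toy))
all.equal(z_manual, toy_z$price)

colSums(is.na(toy_z))                        # missing values per column
toy_complete <- toy_z[complete.cases(toy_z), ]  # fully observed rows only
toy_complete
```

In the analysis that follows, the same effect is obtained with use = "complete.obs" and na.omit().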

Factor analysis requires sufficient correlation among the variables and an adequate sample to produce good results. Two tests are used to check these requirements. The Kaiser-Meyer-Olkin (KMO) test yields a single measure of the sampling adequacy of the data. Bartlett’s test of sphericity tests whether the correlations among the variables are significant.

# load the REdaS package (run install.packages("REdaS") first if it is not installed)
library(REdaS)
KMOS(newdata_z, use = "complete.obs")
## 
## Kaiser-Meyer-Olkin Statistics
## 
## Call: KMOS(x = newdata_z, use = "complete.obs")
## 
## Measures of Sampling Adequacy (MSA):
##      ihsg   vol_shm    vol_rp   frek_tr   as_sell    as_buy  ind_agri 
## 0.9670033 0.8817809 0.6001665 0.5467121 0.4866760 0.5788883 0.9306211 
##   ind_min   ind_bas   ind_mis   ind_con  ind_prop  ind_infr   ind_fin 
## 0.8515817 0.7627290 0.7867134 0.7714455 0.9313319 0.9044966 0.8736970 
##  ind_trad  ind_manu     i_dow  i_nikkei   inflasi       sbi    kurs_. 
## 0.8960251 0.7749495 0.8726731 0.7541353 0.6824569 0.6988162 0.7953850 
##    kurs_y       jub 
## 0.8148946 0.5984746 
## 
## KMO-Criterion: 0.8176324

The KMO test returns \(0.8176324\) for the overall KMO criterion, which is considered high. A closer look at the per-variable MSA (measures of sampling adequacy), however, shows that as_sell has an MSA of \(0.4866760\), below the 0.5 threshold. We therefore drop as_sell and repeat the test.
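To make the MSA and KMO values less of a black box, here is a minimal base R sketch of the usual KMO formula, which compares squared correlations with squared partial correlations. It runs on mtcars columns as stand-in data (an assumption), and it is not claimed to reproduce the internals of KMOS().

```r
# Stand-in data (assumption): six numeric columns of the built-in mtcars set
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

R <- cor(x)        # correlation matrix
R_inv <- solve(R)  # inverse correlation matrix

# Partial correlations between each pair, given all other variables
P <- -R_inv / sqrt(outer(diag(R_inv), diag(R_inv)))

# Only off-diagonal elements enter the sums
diag(R) <- 0
diag(P) <- 0

msa <- colSums(R^2) / (colSums(R^2) + colSums(P^2))  # per-variable MSA
kmo <- sum(R^2) / (sum(R^2) + sum(P^2))              # overall KMO criterion

round(msa, 3)
round(kmo, 3)
names(msa)[msa < 0.5]  # variables that would be candidates for removal
```

A variable gets a low MSA when its partial correlations are large relative to its plain correlations, that is, when little of its correlation is shared with the group as a whole.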

# drop variable which have MSA below .5
# examine the difference in KMO value
newdata_z <- newdata_z[c(-5)]
KMOS(newdata_z, use = "complete.obs")
## 
## Kaiser-Meyer-Olkin Statistics
## 
## Call: KMOS(x = newdata_z, use = "complete.obs")
## 
## Measures of Sampling Adequacy (MSA):
##      ihsg   vol_shm    vol_rp   frek_tr    as_buy  ind_agri   ind_min 
## 0.9725269 0.8660000 0.6268534 0.5559223 0.5937948 0.9385210 0.8673001 
##   ind_bas   ind_mis   ind_con  ind_prop  ind_infr   ind_fin  ind_trad 
## 0.7631266 0.7892094 0.7715850 0.9429283 0.9041360 0.8725793 0.8930200 
##  ind_manu     i_dow  i_nikkei   inflasi       sbi    kurs_.    kurs_y 
## 0.7757945 0.8695311 0.7581044 0.6868625 0.8459401 0.7986419 0.8183135 
##       jub 
## 0.6552642 
## 
## KMO-Criterion: 0.830418
# Bartlett's test of sphericity: are the correlations among the variables significant?
bart_spher(newdata_z, use = "complete.obs")
##  Bartlett's Test of Sphericity
## 
## Call: bart_spher(x = newdata_z, use = "complete.obs")
## 
##      X2 = 2740.069
##      df = 231
## p-value < 2.22e-16

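After dropping as_sell, the KMO criterion in the second test increases to \(0.830418\). Bartlett’s test of sphericity is highly significant (p-value < 2.22e-16), so the variables are correlated strongly enough to proceed with the factor analysis.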
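The Bartlett statistic can be sketched in base R with the usual chi-square approximation based on the determinant of the correlation matrix. The data are again mtcars columns as a stand-in (an assumption), and the formula is assumed, not verified, to be the one bart_spher() applies.

```r
# Stand-in data (assumption): six numeric columns of the built-in mtcars set
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

n <- nrow(x)  # number of observations
p <- ncol(x)  # number of variables
R <- cor(x)

# Chi-square approximation: large when R is far from the identity matrix
chi2 <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df <- p * (p - 1) / 2
p_value <- pchisq(chi2, df, lower.tail = FALSE)

c(X2 = chi2, df = df, p_value = p_value)
```

The degrees of freedom count the distinct off-diagonal correlations, so the 22 variables kept in the stock data give 22 * 21 / 2 = 231, matching the output above.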

# decide how many factors to extract
# using a scree plot
fact_stock <- princomp(na.omit(newdata_z), cor = TRUE)
plot(fact_stock, type = "lines")
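A scree plot displays the eigenvalues of the correlation matrix in decreasing order. The sketch below computes them directly on stand-in mtcars columns (an assumption) and applies the common eigenvalue-greater-than-one rule of thumb. The source does not state which rule led to the number of factors used later, so treat the rule here as one option among several.

```r
# Stand-in data (assumption): six numeric columns of the built-in mtcars set
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

ev <- eigen(cor(x))$values  # eigenvalues, largest first
round(ev, 3)

sum(ev)                         # equals the number of variables
sum(ev > 1)                     # components kept by the eigenvalue > 1 rule
round(cumsum(ev) / sum(ev), 3)  # cumulative share of variance

# Scree plot with a reference line at eigenvalue = 1
plot(ev, type = "b", xlab = "Component", ylab = "Eigenvalue")
abline(h = 1, lty = 2)
```

The elbow of the curve and the cumulative share of variance are usually read together with the rule of thumb.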

# unrotated component matrix (princomp reports unit-length eigenvectors, so every "SS loadings" entry below is 1)
fact_stock$loadings
## 
## Loadings:
##          Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## ihsg     -0.279                                                        
## vol_shm                 0.311  0.122 -0.542  0.191  0.625  0.278       
## vol_rp                  0.540         0.222 -0.282  0.149        -0.123
## frek_tr                 0.543         0.138 -0.376 -0.261  0.306       
## as_buy           0.112  0.444 -0.280  0.159  0.355        -0.687  0.128
## ind_agri -0.248 -0.184               -0.177 -0.104 -0.136         0.262
## ind_min  -0.116        -0.167  0.569  0.391 -0.281  0.495 -0.238       
## ind_bas  -0.272  0.164                                     0.107       
## ind_mis  -0.290                                                        
## ind_con  -0.281                                                        
## ind_prop -0.279                                            0.122  0.202
## ind_infr -0.251  0.175         0.163  0.151                      -0.259
## ind_fin  -0.232  0.327                       0.114                0.310
## ind_trad -0.283                                                        
## ind_manu -0.285                                                        
## i_dow    -0.228 -0.289        -0.170                0.118        -0.319
## i_nikkei        -0.344  0.117  0.259  0.343  0.687         0.303 -0.158
## inflasi          0.553         0.204         0.186         0.135  0.277
## sbi       0.126  0.474 -0.103                                    -0.637
## kurs_.    0.262                       0.236                       0.105
## kurs_y    0.275                       0.137        -0.102         0.123
## jub       0.111        -0.184 -0.595  0.417         0.425  0.337  0.182
##          Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 Comp.16 Comp.17
## ihsg     -0.235          -0.162   0.103   0.104   0.680   0.523   0.201 
## vol_shm  -0.252  -0.118                                                 
## vol_rp    0.253   0.345  -0.469  -0.297   0.165                         
## frek_tr          -0.308   0.408   0.250  -0.166                         
## as_buy           -0.189   0.124                                         
## ind_agri  0.147  -0.472  -0.552   0.244          -0.236                 
## ind_min   0.128  -0.229   0.108                                         
## ind_bas                   0.224  -0.329           0.187  -0.299         
## ind_mis                                                          -0.150 
## ind_con          -0.171                                  -0.390   0.422 
## ind_prop                  0.100  -0.163   0.398                  -0.445 
## ind_infr -0.418   0.108  -0.236          -0.572  -0.252                 
## ind_fin           0.126   0.193  -0.247          -0.488   0.455   0.193 
## ind_trad -0.158                          -0.121          -0.216  -0.565 
## ind_manu         -0.115                                  -0.307   0.238 
## i_dow    -0.244   0.324   0.146   0.446   0.394  -0.289           0.116 
## i_nikkei  0.253  -0.114                                                 
## inflasi           0.255  -0.161   0.540   0.122          -0.227  -0.171 
## sbi       0.130  -0.367                   0.267  -0.132   0.128  -0.238 
## kurs_.   -0.534  -0.108  -0.166  -0.121   0.204  -0.159                 
## kurs_y   -0.371  -0.219  -0.121  -0.182   0.264                   0.157 
## jub                               0.110  -0.237           0.104         
##          Comp.18 Comp.19 Comp.20 Comp.21 Comp.22
## ihsg             -0.137                         
## vol_shm                                         
## vol_rp                                          
## frek_tr                                         
## as_buy                                          
## ind_agri -0.219   0.119   0.151                 
## ind_min                                         
## ind_bas  -0.482   0.349   0.429           0.156 
## ind_mis  -0.192   0.109  -0.714  -0.531   0.176 
## ind_con   0.406  -0.291                   0.504 
## ind_prop  0.574   0.315                         
## ind_infr  0.135   0.329                         
## ind_fin  -0.139  -0.289                         
## ind_trad -0.239  -0.567           0.322         
## ind_manu  0.117                          -0.831 
## i_dow    -0.216   0.102                         
## i_nikkei                                        
## inflasi                                         
## sbi                                             
## kurs_.           -0.264   0.265  -0.539         
## kurs_y   -0.159   0.185  -0.429   0.534         
## jub                                             
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.045  0.045  0.045  0.045  0.045  0.045  0.045  0.045
## Cumulative Var  0.045  0.091  0.136  0.182  0.227  0.273  0.318  0.364
##                Comp.9 Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15
## SS loadings     1.000   1.000   1.000   1.000   1.000   1.000   1.000
## Proportion Var  0.045   0.045   0.045   0.045   0.045   0.045   0.045
## Cumulative Var  0.409   0.455   0.500   0.545   0.591   0.636   0.682
##                Comp.16 Comp.17 Comp.18 Comp.19 Comp.20 Comp.21 Comp.22
## SS loadings      1.000   1.000   1.000   1.000   1.000   1.000   1.000
## Proportion Var   0.045   0.045   0.045   0.045   0.045   0.045   0.045
## Cumulative Var   0.727   0.773   0.818   0.864   0.909   0.955   1.000
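Every "SS loadings" entry in this output is 1.000 because princomp() stores unit-length eigenvectors in $loadings. A short sketch on stand-in mtcars columns (an assumption) shows how multiplying each eigenvector by the component's standard deviation gives loadings whose column sums of squares are the eigenvalues.

```r
# Stand-in data (assumption): six numeric columns of the built-in mtcars set
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
pc <- princomp(x, cor = TRUE)

V <- unclass(pc$loadings)  # eigenvectors of the correlation matrix
colSums(V^2)               # all 1, hence "SS loadings" of 1.000

# Scale each eigenvector by the standard deviation of its component
L <- V %*% diag(pc$sdev)
round(colSums(L^2), 3)     # now equal to the eigenvalues
round(pc$sdev^2, 3)
```

The rescaled loadings are the correlations between the variables and the components, which is the scale principal() reports.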

Varimax is an orthogonal rotation of the factor axes that maximizes the variance of the squared loadings within each factor, taken across all the variables in the factor matrix. In other words, varimax looks for a rotation (i.e., a set of linear combinations) of the original factors that pushes each loading toward either a large absolute value or zero, which makes the factors easier to interpret. Here five components are retained (nfactors = 5).
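The effect of the rotation can be sketched with stats::varimax() from base R on two components of stand-in mtcars data (both assumptions; the analysis itself uses psych::principal()). Kaiser normalization is switched off here so that the raw criterion described above is the quantity being maximized.

```r
# Stand-in data (assumption): six numeric columns of the built-in mtcars set
x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
pc <- princomp(x, cor = TRUE)

# Unrotated loadings of the first two components
L <- unclass(pc$loadings)[, 1:2] %*% diag(pc$sdev[1:2])

rot <- varimax(L, normalize = FALSE)
L_rot <- unclass(rot$loadings)

# The rotation matrix is orthogonal: t(T) %*% T is the identity
round(crossprod(rot$rotmat), 6)

# Communalities are unchanged by an orthogonal rotation
round(cbind(before = rowSums(L^2), after = rowSums(L_rot^2)), 3)

# Varimax criterion: variance of squared loadings, summed over factors
crit <- function(m) sum(apply(m^2, 2, var))
c(unrotated = crit(L), rotated = crit(L_rot))
```

Rotation redistributes the explained variance among the factors without changing how much of each variable is explained in total.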

library(psych)
princ_stock <- principal(na.omit(newdata_z), nfactors = 5, rotate = "varimax")
princ_stock
## Principal Components Analysis
## Call: principal(r = na.omit(newdata_z), nfactors = 5, rotate = "varimax")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            RC1   RC2   RC3   RC5   RC4   h2    u2 com
## ihsg      0.94  0.08  0.12  0.08  0.17 0.94 0.056 1.1
## vol_shm   0.21  0.03  0.29  0.68 -0.23 0.64 0.360 1.9
## vol_rp    0.18  0.08  0.88  0.13  0.06 0.84 0.158 1.2
## frek_tr   0.04  0.08  0.83  0.23  0.06 0.75 0.251 1.2
## as_buy    0.06 -0.09  0.75 -0.05 -0.18 0.61 0.390 1.2
## ind_agri  0.77  0.47 -0.02  0.20 -0.07 0.85 0.145 1.8
## ind_min   0.33  0.03 -0.15  0.03  0.77 0.72 0.278 1.5
## ind_bas   0.97 -0.04  0.07  0.03  0.08 0.96 0.040 1.0
## ind_mis   0.96  0.18  0.08  0.15  0.10 0.99 0.008 1.2
## ind_con   0.97  0.07  0.11  0.03  0.07 0.97 0.027 1.0
## ind_prop  0.91  0.32  0.06  0.05  0.09 0.95 0.054 1.3
## ind_infr  0.87 -0.09  0.07  0.08  0.33 0.88 0.119 1.3
## ind_fin   0.86 -0.34  0.15  0.16  0.13 0.91 0.090 1.5
## ind_trad  0.90  0.36  0.08  0.07  0.14 0.98 0.025 1.4
## ind_manu  0.98  0.07  0.10  0.06  0.08 0.99 0.012 1.1
## i_dow     0.68  0.64  0.05 -0.03  0.03 0.87 0.132 2.0
## i_nikkei  0.02  0.58  0.21  0.03  0.44 0.57 0.426 2.2
## inflasi  -0.02 -0.90  0.18  0.02  0.18 0.88 0.123 1.2
## sbi      -0.20 -0.82 -0.12 -0.20 -0.14 0.79 0.210 1.3
## kurs_.   -0.91 -0.03  0.00 -0.25  0.08 0.90 0.105 1.2
## kurs_y   -0.90 -0.25 -0.03 -0.20  0.02 0.92 0.079 1.3
## jub      -0.19 -0.13 -0.07 -0.82 -0.25 0.79 0.210 1.4
## 
##                         RC1  RC2  RC3  RC5  RC4
## SS loadings           10.74 2.96 2.32 1.46 1.21
## Proportion Var         0.49 0.13 0.11 0.07 0.06
## Cumulative Var         0.49 0.62 0.73 0.80 0.85
## Proportion Explained   0.57 0.16 0.12 0.08 0.06
## Cumulative Proportion  0.57 0.73 0.86 0.94 1.00
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 5 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.04 
##  with the empirical chi square  52.84  with prob <  1 
## 
## Fit based upon off diagonal values = 0.99
princ_stock$communality
##      ihsg   vol_shm    vol_rp   frek_tr    as_buy  ind_agri   ind_min 
## 0.9442078 0.6403353 0.8424446 0.7491220 0.6102223 0.8549268 0.7223998 
##   ind_bas   ind_mis   ind_con  ind_prop  ind_infr   ind_fin  ind_trad 
## 0.9604589 0.9919909 0.9726970 0.9462956 0.8810889 0.9095998 0.9753950 
##  ind_manu     i_dow  i_nikkei   inflasi       sbi    kurs_.    kurs_y 
## 0.9883090 0.8682689 0.5742556 0.8768206 0.7898269 0.8954249 0.9206365 
##       jub 
## 0.7897340

Communality of a variable: the extent to which the variability of that variable across subjects is ‘explained’ by the set of factors extracted in the factor analysis. We can also look at the uniqueness of each variable: \(\text{Uniqueness} = 1 - \text{Communality}\), where the communality is the sum of the squared loadings of the variable over all extracted factors. If the factors jointly explain a large percentage of the variance in a given variable, that variable has a high communality (and thus a low uniqueness). In the principal() output, the h2 column is the communality and the u2 column is the uniqueness.
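As a worked check, the sketch below recomputes communality and uniqueness for three variables from the rounded rotated loadings printed in the principal() output above. Because the inputs are rounded to two decimals, the results agree with h2 and u2 only to about two decimals.

```r
# Rounded rotated loadings copied from the principal() output above
L <- rbind(
  ihsg    = c( 0.94,  0.08,  0.12,  0.08,  0.17),
  vol_shm = c( 0.21,  0.03,  0.29,  0.68, -0.23),
  jub     = c(-0.19, -0.13, -0.07, -0.82, -0.25)
)
colnames(L) <- c("RC1", "RC2", "RC3", "RC5", "RC4")

communality <- rowSums(L^2)    # sum of squared loadings per variable
uniqueness <- 1 - communality  # share of variance left unexplained

round(cbind(communality, uniqueness), 2)
```

For ihsg, for example, the sum is 0.8836 + 0.0064 + 0.0144 + 0.0064 + 0.0289 = 0.9397, close to the reported h2 of 0.94.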

For a simpler display, we can suppress loadings whose absolute value is below \(0.5\). Based on this result, we can assign each item to an underlying factor and name each factor according to its corresponding items.

print(princ_stock$loadings, cutoff = 0.5)
## 
## Loadings:
##          RC1    RC2    RC3    RC5    RC4   
## ihsg      0.942                            
## vol_shm                        0.677       
## vol_rp                  0.885              
## frek_tr                 0.828              
## as_buy                  0.751              
## ind_agri  0.767                            
## ind_min                               0.767
## ind_bas   0.972                            
## ind_mis   0.959                            
## ind_con   0.974                            
## ind_prop  0.912                            
## ind_infr  0.867                            
## ind_fin   0.856                            
## ind_trad  0.901                            
## ind_manu  0.981                            
## i_dow     0.676  0.637                     
## i_nikkei         0.581                     
## inflasi         -0.902                     
## sbi             -0.823                     
## kurs_.   -0.908                            
## kurs_y   -0.903                            
## jub                           -0.819       
## 
##                   RC1   RC2   RC3   RC5   RC4
## SS loadings    10.745 2.965 2.319 1.464 1.213
## Proportion Var  0.488 0.135 0.105 0.067 0.055
## Cumulative Var  0.488 0.623 0.729 0.795 0.850
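The assignment of items to factors can also be done programmatically. The sketch below takes a handful of rows from the rounded loadings shown earlier, assigns each variable to the component with the largest absolute loading, and flags variables that load at 0.5 or more on several components. The largest-loading rule is an assumption for illustration, not a step taken in the analysis above.

```r
# A few rows of the rounded rotated loadings from the principal() output
L <- rbind(
  ihsg    = c( 0.94,  0.08,  0.12,  0.08,  0.17),
  vol_rp  = c( 0.18,  0.08,  0.88,  0.13,  0.06),
  ind_min = c( 0.33,  0.03, -0.15,  0.03,  0.77),
  i_dow   = c( 0.68,  0.64,  0.05, -0.03,  0.03),
  inflasi = c(-0.02, -0.90,  0.18,  0.02,  0.18),
  jub     = c(-0.19, -0.13, -0.07, -0.82, -0.25)
)
colnames(L) <- c("RC1", "RC2", "RC3", "RC5", "RC4")

# Component with the largest absolute loading for each variable
dominant <- colnames(L)[apply(abs(L), 1, which.max)]
names(dominant) <- rownames(L)
split(names(dominant), dominant)

# Variables with |loading| >= 0.5 on more than one component
cross <- rownames(L)[rowSums(abs(L) >= 0.5) > 1]
cross
```

In this subset only i_dow loads on two components (RC1 and RC2), which matches the cutoff table above, so its assignment deserves a closer look before the factors are named.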