Factor analysis is a multivariate technique used to reduce a set of items into a smaller number of underlying structures (factors) for more meaningful analysis. Its two main objectives are data summarization and data reduction. Data summarization uses the loadings to describe how each item contributes to the underlying structure. Data reduction, on the other hand, uses the loadings to construct a new factor score that represents the underlying structure.
Below are data on variables assumed to affect stock price movements on the Indonesia Stock Exchange (IDX) in Jakarta, from 30th August 2001 until 27th September 2001. Based on the data, is it possible to reduce all the variables that affect stock price movement into several factors? If so, how many factors are formed? Which variables belong to which factor?
library(foreign)
# import data from spss file into data.frame
newdata <- read.spss(file = "C:/Users/asus/Google Drive/Ilham Fadhil/Tutor/Advanced Statistics/Archive/Materi/Week 11/Stock Price.sav",to.data.frame = TRUE, use.value.labels = TRUE)
head(newdata)
## date ihsg vol_shm vol_rp frek_tr as_sell as_buy
## 1 31-Jul-01 443.194 462.91 403.821 14.977 48560441075 15508027500
## 2 01-Aug-01 436.460 523.37 348.256 13.712 40240610000 15424025000
## 3 02-Aug-01 453.150 352.43 265.913 10.255 8776482500 14388347500
## 4 05-Aug-01 430.810 469.42 245.608 13.214 40476617500 7514912500
## 5 06-Aug-01 432.936 485.12 395.627 15.514 14426367500 21616615000
## 6 07-Aug-01 442.526 713.70 586.387 22.433 25636831500 25850870000
## ind_agri ind_min ind_bas ind_mis ind_con ind_prop ind_infr ind_fin
## 1 190.554 137.512 51.695 90.195 150.481 33.694 108.824 37.915
## 2 186.278 131.227 51.536 88.869 148.447 31.857 107.218 37.644
## 3 187.016 128.261 51.236 88.629 147.710 31.082 108.240 37.510
## 4 183.073 121.375 50.903 87.993 147.796 30.885 105.775 36.952
## 5 184.481 127.812 51.539 89.065 147.289 31.024 105.983 36.871
## 6 186.149 132.508 51.755 90.239 151.074 31.067 109.355 37.842
## ind_trad ind_manu i_dow i_nikkei inflasi sbi kurs_. kurs_y jub
## 1 130.783 103.164 10522.81 11959.33 12.23 17.15 9425 7538.57 713963
## 2 128.911 101.938 10510.01 12339.20 12.23 17.15 9573 7680.29 713963
## 3 128.299 101.463 10551.01 12241.97 12.23 17.15 9658 7802.95 713963
## 4 128.302 101.232 10512.78 12243.90 12.23 17.15 9455 7627.23 713963
## 5 129.672 101.496 10401.31 12319.46 12.23 17.15 9285 7496.75 713963
## 6 131.772 103.446 10458.74 12163.67 12.23 17.13 9295 7523.43 712015
# examine the data.frame
str(newdata)
## 'data.frame': 65 obs. of 24 variables:
## $ date : Factor w/ 65 levels " ","01-Aug-01 ",..: 65 2 4 11 13 15 17 19 26 28 ...
## $ ihsg : num 443 436 453 431 433 ...
## $ vol_shm : num 463 523 352 469 485 ...
## $ vol_rp : num 404 348 266 246 396 ...
## $ frek_tr : num 15 13.7 10.3 13.2 15.5 ...
## $ as_sell : num 4.86e+10 4.02e+10 8.78e+09 4.05e+10 1.44e+10 ...
## $ as_buy : num 1.55e+10 1.54e+10 1.44e+10 7.51e+09 2.16e+10 ...
## $ ind_agri: num 191 186 187 183 184 ...
## $ ind_min : num 138 131 128 121 128 ...
## $ ind_bas : num 51.7 51.5 51.2 50.9 51.5 ...
## $ ind_mis : num 90.2 88.9 88.6 88 89.1 ...
## $ ind_con : num 150 148 148 148 147 ...
## $ ind_prop: num 33.7 31.9 31.1 30.9 31 ...
## $ ind_infr: num 109 107 108 106 106 ...
## $ ind_fin : num 37.9 37.6 37.5 37 36.9 ...
## $ ind_trad: num 131 129 128 128 130 ...
## $ ind_manu: num 103 102 101 101 101 ...
## $ i_dow : num 10523 10510 10551 10513 10401 ...
## $ i_nikkei: num 11959 12339 12242 12244 12319 ...
## $ inflasi : num 12.2 12.2 12.2 12.2 12.2 ...
## $ sbi : num 17.1 17.1 17.1 17.1 17.1 ...
## $ kurs_. : num 9425 9573 9658 9455 9285 ...
## $ kurs_y : num 7539 7680 7803 7627 7497 ...
## $ jub : num 713963 713963 713963 713963 713963 ...
## - attr(*, "variable.labels")= Named chr "Date" "IHSG gabungan" "Stock Volume (million shares)" "Stock Transaction (billion rupiah)" ...
## ..- attr(*, "names")= chr "date" "ihsg" "vol_shm" "vol_rp" ...
The items consist of:
| Variable Name | Description |
|---|---|
| date | Date |
| ihsg | IHSG gabungan |
| vol_shm | Stock volume (million shares) |
| vol_rp | Stock transaction (billion rupiah) |
| frek_tr | Trading frequency (times) |
| as_sell | Total foreign sell (rupiah) |
| as_buy | Foreign buy (rupiah) |
| ind_agri | Agrobusiness index |
| ind_min | Mining index |
| ind_bas | Basic industry index |
| ind_mis | Miscellaneous index |
| ind_con | Consumer index |
| ind_prop | Property index |
| ind_infr | Infrastructure index |
| ind_fin | Financial index |
| ind_trad | Trading index |
| ind_manu | Manufacture index |
| i_dow | Dow Jones Index (USA) |
| i_nikkei | Nikkei Index (Japan) |
| inflasi | Inflation rate (percentage) |
| sbi | SBI rate (percentage) |
| kurs_. | Exchange currency of dollar to rupiah |
| kurs_y | Exchange currency of yen to rupiah |
| jub | Total money supply (billion rupiah) |
Since the variables are measured in different units, we standardize them by transforming each one into a z-score. We also drop the date variable, since it is not used in the analysis.
newdata_z <- as.data.frame(scale(newdata[2:24]))
head(newdata_z)
## ihsg vol_shm vol_rp frek_tr as_sell as_buy
## 1 1.0410846 -0.171579449 0.2084396 0.39703879 0.85444308 -0.4968689
## 2 0.8061833 -0.005019057 -0.1732826 0.05889172 0.51060813 -0.5018657
## 3 1.3883786 -0.475939226 -0.7389653 -0.86519872 -0.78971473 -0.5634720
## 4 0.6090950 -0.153645143 -0.8784572 -0.07422863 0.52036165 -0.9723317
## 5 0.6832560 -0.110393437 0.1521482 0.54058422 -0.55622106 -0.1335055
## 6 1.0177829 0.519318345 1.4626375 2.39010166 -0.09292448 0.1183651
## ind_agri ind_min ind_bas ind_mis ind_con ind_prop ind_infr
## 1 1.558442 1.07172119 0.6908141 1.1157250 0.9578381 2.378684 0.7243257
## 2 1.314510 0.64594489 0.6442168 0.9092797 0.7417340 1.620699 0.5123842
## 3 1.356610 0.44501371 0.5562972 0.8719141 0.6634308 1.300918 0.6472560
## 4 1.131675 -0.02147723 0.4587065 0.7728951 0.6725679 1.219631 0.3219537
## 5 1.211997 0.41459628 0.6450960 0.9397950 0.6187012 1.276986 0.3494031
## 6 1.307151 0.73272603 0.7083980 1.1225754 1.0208419 1.294728 0.7944009
## ind_fin ind_trad ind_manu i_dow i_nikkei inflasi sbi
## 1 0.52196438 1.391767 0.9550151 1.435530 1.004070 -1.006213 -1.991139
## 2 0.39779147 1.187841 0.7700337 1.415496 1.270179 -1.006213 -1.991139
## 3 0.33639232 1.121174 0.6983647 1.479666 1.202067 -1.006213 -1.991139
## 4 0.08071526 1.121500 0.6635109 1.419831 1.203419 -1.006213 -1.991139
## 5 0.04360084 1.270741 0.7033438 1.245366 1.256351 -1.006213 -1.991139
## 6 0.48851559 1.499503 0.9975638 1.335252 1.147216 -1.006213 -2.097732
## kurs_. kurs_y jub
## 1 -0.049428736 -0.64072244 -0.007890826
## 2 0.211400784 -0.33825941 -0.007890826
## 3 0.361201522 -0.07647479 -0.007890826
## 4 0.003442113 -0.45150164 -0.007890826
## 5 -0.296159364 -0.72997593 -0.007890826
## 6 -0.278535747 -0.67303468 -0.949292748
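To see what `scale()` is doing in the step above, the z-score can be computed by hand. The sketch below uses a small made-up data frame (`toy`, not the stock data) and only base R; it assumes the default behaviour of `scale()`, which centres each column on its mean and divides by its sample standard deviation.

```r
# Hypothetical toy data (not the stock data): two columns on very different scales
toy <- data.frame(price  = c(443.2, 436.5, 453.2, 430.8),
                  volume = c(4.86e10, 4.02e10, 8.78e9, 4.05e10))

# z-score by hand: subtract the column mean, divide by the sample standard deviation
z_manual <- as.data.frame(lapply(toy, function(x) (x - mean(x)) / sd(x)))

# scale() applies the same centring and scaling column by column
z_scale <- as.data.frame(scale(toy))

all.equal(z_manual, z_scale)  # compare the two results
colMeans(z_scale)             # column means after standardization
sapply(z_scale, sd)           # column standard deviations after standardization
```

After standardization every column has mean 0 and standard deviation 1, so no variable dominates the analysis only because of its unit.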
# check for missing data
colSums(is.na(newdata_z))
## ihsg vol_shm vol_rp frek_tr as_sell as_buy ind_agri ind_min
## 1 1 1 1 1 1 1 1
## ind_bas ind_mis ind_con ind_prop ind_infr ind_fin ind_trad ind_manu
## 1 1 1 1 1 1 1 1
## i_dow i_nikkei inflasi sbi kurs_. kurs_y jub
## 1 1 1 1 1 1 1
Factor analysis requires some correlation among the variables and an adequate sample to produce good results. Two tests are used to check these two characteristics. The Kaiser-Meyer-Olkin (KMO) test yields a single value that gauges the sampling adequacy of the data. Bartlett’s test of sphericity tests whether the correlations among the variables are significant.
# load the REdaS package (install it first if it is not available)
library(REdaS)
KMOS(newdata_z, use = "complete.obs")
##
## Kaiser-Meyer-Olkin Statistics
##
## Call: KMOS(x = newdata_z, use = "complete.obs")
##
## Measures of Sampling Adequacy (MSA):
## ihsg vol_shm vol_rp frek_tr as_sell as_buy ind_agri
## 0.9670033 0.8817809 0.6001665 0.5467121 0.4866760 0.5788883 0.9306211
## ind_min ind_bas ind_mis ind_con ind_prop ind_infr ind_fin
## 0.8515817 0.7627290 0.7867134 0.7714455 0.9313319 0.9044966 0.8736970
## ind_trad ind_manu i_dow i_nikkei inflasi sbi kurs_.
## 0.8960251 0.7749495 0.8726731 0.7541353 0.6824569 0.6988162 0.7953850
## kurs_y jub
## 0.8148946 0.5984746
##
## KMO-Criterion: 0.8176324
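For readers who want to see where these numbers come from, the sketch below computes the MSA per variable and the overall KMO criterion from the textbook definition: squared correlations compared with squared partial correlations. It uses simulated data (`toy` is hypothetical) and base R only; it is meant as an illustration of the definition, not as a copy of how `KMOS()` is implemented internally.

```r
set.seed(1)

# Hypothetical data: four variables that share one common source plus noise
n <- 200
f <- rnorm(n)
toy <- data.frame(a = f + rnorm(n), b = f + rnorm(n),
                  c = f + rnorm(n), d = f + rnorm(n))

R <- cor(toy)              # correlation matrix
P <- -cov2cor(solve(R))    # partial correlations (off-diagonal entries)

# squared off-diagonal correlations and partial correlations
r2 <- R^2; diag(r2) <- 0
p2 <- P^2; diag(p2) <- 0

# MSA per variable and overall KMO criterion
msa <- rowSums(r2) / (rowSums(r2) + rowSums(p2))
kmo <- sum(r2) / (sum(r2) + sum(p2))

msa
kmo
```

A variable gets a low MSA when its partial correlations are large relative to its plain correlations, that is, when it shares little common variance with the rest of the set.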
The KMO test gives \(0.8176324\) for the KMO criterion, which is considered high. However, if we look more closely at the MSA (measure of sampling adequacy) of each item, as_sell has an MSA of 0.4866760, which is lower than 0.5. So we drop the as_sell item and repeat the test.
# drop the variable whose MSA is below 0.5 (as_sell, column 5)
# and examine the change in the KMO value
newdata_z <- newdata_z[-5]
KMOS(newdata_z, use = "complete.obs")
##
## Kaiser-Meyer-Olkin Statistics
##
## Call: KMOS(x = newdata_z, use = "complete.obs")
##
## Measures of Sampling Adequacy (MSA):
## ihsg vol_shm vol_rp frek_tr as_buy ind_agri ind_min
## 0.9725269 0.8660000 0.6268534 0.5559223 0.5937948 0.9385210 0.8673001
## ind_bas ind_mis ind_con ind_prop ind_infr ind_fin ind_trad
## 0.7631266 0.7892094 0.7715850 0.9429283 0.9041360 0.8725793 0.8930200
## ind_manu i_dow i_nikkei inflasi sbi kurs_. kurs_y
## 0.7757945 0.8695311 0.7581044 0.6868625 0.8459401 0.7986419 0.8183135
## jub
## 0.6552642
##
## KMO-Criterion: 0.830418
# Bartlett's test of sphericity: are the correlations among the variables significant?
bart_spher(newdata_z, use = "complete.obs")
## Bartlett's Test of Sphericity
##
## Call: bart_spher(x = newdata_z, use = "complete.obs")
##
## X2 = 2740.069
## df = 231
## p-value < 2.22e-16
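The statistic reported by Bartlett's test can be sketched by hand from the usual formula \(\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln|R|\) with \(p(p-1)/2\) degrees of freedom, where \(n\) is the number of complete observations, \(p\) the number of variables and \(R\) the correlation matrix. The example below uses simulated data (`toy` is hypothetical) and base R only, and is not taken from the `bart_spher()` source.

```r
set.seed(1)

# Hypothetical data: four correlated variables
n <- 200
f <- rnorm(n)
toy <- data.frame(a = f + rnorm(n), b = f + rnorm(n),
                  c = f + rnorm(n), d = f + rnorm(n))

R <- cor(toy)
p <- ncol(toy)

# Bartlett's chi-square statistic and its degrees of freedom
chi2 <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df   <- p * (p - 1) / 2
p_value <- pchisq(chi2, df = df, lower.tail = FALSE)

c(X2 = chi2, df = df, p_value = p_value)
```

With the 22 variables kept in the analysis above, the same formula gives \(22 \times 21 / 2 = 231\) degrees of freedom, which matches the `df` in the output.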
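As we can see, after dropping the as_sell item, the KMO criterion increases to \(0.830418\). Bartlett’s test is also significant (p-value < 2.22e-16), so the variables are correlated enough for factor analysis.
# decide how many factors to extract
# using a scree plot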
fact_stock <- princomp(na.omit(newdata_z), cor = TRUE)
plot(fact_stock, type = "lines")
# unrotated component matrix
fact_stock$loadings
##
## Loadings:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
## ihsg -0.279
## vol_shm 0.311 0.122 -0.542 0.191 0.625 0.278
## vol_rp 0.540 0.222 -0.282 0.149 -0.123
## frek_tr 0.543 0.138 -0.376 -0.261 0.306
## as_buy 0.112 0.444 -0.280 0.159 0.355 -0.687 0.128
## ind_agri -0.248 -0.184 -0.177 -0.104 -0.136 0.262
## ind_min -0.116 -0.167 0.569 0.391 -0.281 0.495 -0.238
## ind_bas -0.272 0.164 0.107
## ind_mis -0.290
## ind_con -0.281
## ind_prop -0.279 0.122 0.202
## ind_infr -0.251 0.175 0.163 0.151 -0.259
## ind_fin -0.232 0.327 0.114 0.310
## ind_trad -0.283
## ind_manu -0.285
## i_dow -0.228 -0.289 -0.170 0.118 -0.319
## i_nikkei -0.344 0.117 0.259 0.343 0.687 0.303 -0.158
## inflasi 0.553 0.204 0.186 0.135 0.277
## sbi 0.126 0.474 -0.103 -0.637
## kurs_. 0.262 0.236 0.105
## kurs_y 0.275 0.137 -0.102 0.123
## jub 0.111 -0.184 -0.595 0.417 0.425 0.337 0.182
## Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 Comp.16 Comp.17
## ihsg -0.235 -0.162 0.103 0.104 0.680 0.523 0.201
## vol_shm -0.252 -0.118
## vol_rp 0.253 0.345 -0.469 -0.297 0.165
## frek_tr -0.308 0.408 0.250 -0.166
## as_buy -0.189 0.124
## ind_agri 0.147 -0.472 -0.552 0.244 -0.236
## ind_min 0.128 -0.229 0.108
## ind_bas 0.224 -0.329 0.187 -0.299
## ind_mis -0.150
## ind_con -0.171 -0.390 0.422
## ind_prop 0.100 -0.163 0.398 -0.445
## ind_infr -0.418 0.108 -0.236 -0.572 -0.252
## ind_fin 0.126 0.193 -0.247 -0.488 0.455 0.193
## ind_trad -0.158 -0.121 -0.216 -0.565
## ind_manu -0.115 -0.307 0.238
## i_dow -0.244 0.324 0.146 0.446 0.394 -0.289 0.116
## i_nikkei 0.253 -0.114
## inflasi 0.255 -0.161 0.540 0.122 -0.227 -0.171
## sbi 0.130 -0.367 0.267 -0.132 0.128 -0.238
## kurs_. -0.534 -0.108 -0.166 -0.121 0.204 -0.159
## kurs_y -0.371 -0.219 -0.121 -0.182 0.264 0.157
## jub 0.110 -0.237 0.104
## Comp.18 Comp.19 Comp.20 Comp.21 Comp.22
## ihsg -0.137
## vol_shm
## vol_rp
## frek_tr
## as_buy
## ind_agri -0.219 0.119 0.151
## ind_min
## ind_bas -0.482 0.349 0.429 0.156
## ind_mis -0.192 0.109 -0.714 -0.531 0.176
## ind_con 0.406 -0.291 0.504
## ind_prop 0.574 0.315
## ind_infr 0.135 0.329
## ind_fin -0.139 -0.289
## ind_trad -0.239 -0.567 0.322
## ind_manu 0.117 -0.831
## i_dow -0.216 0.102
## i_nikkei
## inflasi
## sbi
## kurs_. -0.264 0.265 -0.539
## kurs_y -0.159 0.185 -0.429 0.534
## jub
##
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.045 0.045 0.045 0.045 0.045 0.045 0.045 0.045
## Cumulative Var 0.045 0.091 0.136 0.182 0.227 0.273 0.318 0.364
## Comp.9 Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15
## SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.045 0.045 0.045 0.045 0.045 0.045 0.045
## Cumulative Var 0.409 0.455 0.500 0.545 0.591 0.636 0.682
## Comp.16 Comp.17 Comp.18 Comp.19 Comp.20 Comp.21 Comp.22
## SS loadings 1.000 1.000 1.000 1.000 1.000 1.000 1.000
## Proportion Var 0.045 0.045 0.045 0.045 0.045 0.045 0.045
## Cumulative Var 0.727 0.773 0.818 0.864 0.909 0.955 1.000
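Note that the `SS loadings` row printed above is 1.000 for every component because the columns of `princomp()` loadings are unit-length eigenvectors, so it says nothing about how much variance each component explains. That information is in the component variances (`sdev^2`), which are also what the scree plot draws. The sketch below shows this on simulated data (`toy` is hypothetical). The eigenvalue-greater-than-one rule used at the end is a common heuristic that the original analysis does not state, so treat it as an assumption.

```r
set.seed(2)

# Hypothetical data: six variables driven by two independent sources
n  <- 100
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- data.frame(x1 = f1 + 0.3 * rnorm(n), x2 = f1 + 0.3 * rnorm(n),
                  x3 = f1 + 0.3 * rnorm(n), y1 = f2 + 0.3 * rnorm(n),
                  y2 = f2 + 0.3 * rnorm(n), y3 = f2 + 0.3 * rnorm(n))

pc <- princomp(toy, cor = TRUE)

eig  <- pc$sdev^2          # eigenvalues of the correlation matrix
prop <- eig / sum(eig)     # proportion of variance per component

round(rbind(eigenvalue = eig, proportion = prop, cumulative = cumsum(prop)), 3)

# squared loadings sum to 1 in every column (unit-length eigenvectors)
colSums(unclass(pc$loadings)^2)

# heuristic (assumption): keep components with eigenvalue above 1
n_keep <- sum(eig > 1)
n_keep
```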
Varimax rotation orthogonally rotates the factor axes with the goal of maximizing the variance of the squared loadings of each factor across all the variables in the factor matrix. In other words, varimax looks for a rotation (i.e., a linear combination) of the original factors that maximizes the variance of the squared loadings, so that each factor ends up with a few large loadings and many loadings close to zero.
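As a small illustration of that idea, the sketch below rotates a made-up loading matrix with `varimax()` from base R's stats package (the analysis itself uses `principal()` from psych). The loadings in `L` are hypothetical, and Kaiser normalization is switched off so that the criterion computed here is the one being maximized.

```r
# Hypothetical unrotated loadings: 6 variables on 2 components
L <- matrix(c(0.7, 0.7, 0.7,  0.6,  0.6,  0.6,
              0.5, 0.5, 0.5, -0.6, -0.6, -0.6), ncol = 2)

vm    <- varimax(L, normalize = FALSE)
L_rot <- unclass(vm$loadings)   # rotated loadings
rot   <- vm$rotmat              # rotation matrix

# varimax criterion: variance of the squared loadings, summed over factors
crit <- function(A) sum(apply(A^2, 2, var))
c(before = crit(L), after = crit(L_rot))

# the rotation is orthogonal: t(rot) %*% rot is the identity matrix
round(crossprod(rot), 6)

# communalities (row sums of squared loadings) are unchanged by the rotation
cbind(before = rowSums(L^2), after = rowSums(L_rot^2))
```

The rotation only redistributes variance among the factors. It does not change how much of each variable is explained in total, which is why the communalities stay the same.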
library(psych)
princ_stock <- principal(na.omit(newdata_z), nfactors = 5, rotate = "varimax")
princ_stock
## Principal Components Analysis
## Call: principal(r = na.omit(newdata_z), nfactors = 5, rotate = "varimax")
## Standardized loadings (pattern matrix) based upon correlation matrix
## RC1 RC2 RC3 RC5 RC4 h2 u2 com
## ihsg 0.94 0.08 0.12 0.08 0.17 0.94 0.056 1.1
## vol_shm 0.21 0.03 0.29 0.68 -0.23 0.64 0.360 1.9
## vol_rp 0.18 0.08 0.88 0.13 0.06 0.84 0.158 1.2
## frek_tr 0.04 0.08 0.83 0.23 0.06 0.75 0.251 1.2
## as_buy 0.06 -0.09 0.75 -0.05 -0.18 0.61 0.390 1.2
## ind_agri 0.77 0.47 -0.02 0.20 -0.07 0.85 0.145 1.8
## ind_min 0.33 0.03 -0.15 0.03 0.77 0.72 0.278 1.5
## ind_bas 0.97 -0.04 0.07 0.03 0.08 0.96 0.040 1.0
## ind_mis 0.96 0.18 0.08 0.15 0.10 0.99 0.008 1.2
## ind_con 0.97 0.07 0.11 0.03 0.07 0.97 0.027 1.0
## ind_prop 0.91 0.32 0.06 0.05 0.09 0.95 0.054 1.3
## ind_infr 0.87 -0.09 0.07 0.08 0.33 0.88 0.119 1.3
## ind_fin 0.86 -0.34 0.15 0.16 0.13 0.91 0.090 1.5
## ind_trad 0.90 0.36 0.08 0.07 0.14 0.98 0.025 1.4
## ind_manu 0.98 0.07 0.10 0.06 0.08 0.99 0.012 1.1
## i_dow 0.68 0.64 0.05 -0.03 0.03 0.87 0.132 2.0
## i_nikkei 0.02 0.58 0.21 0.03 0.44 0.57 0.426 2.2
## inflasi -0.02 -0.90 0.18 0.02 0.18 0.88 0.123 1.2
## sbi -0.20 -0.82 -0.12 -0.20 -0.14 0.79 0.210 1.3
## kurs_. -0.91 -0.03 0.00 -0.25 0.08 0.90 0.105 1.2
## kurs_y -0.90 -0.25 -0.03 -0.20 0.02 0.92 0.079 1.3
## jub -0.19 -0.13 -0.07 -0.82 -0.25 0.79 0.210 1.4
##
## RC1 RC2 RC3 RC5 RC4
## SS loadings 10.74 2.96 2.32 1.46 1.21
## Proportion Var 0.49 0.13 0.11 0.07 0.06
## Cumulative Var 0.49 0.62 0.73 0.80 0.85
## Proportion Explained 0.57 0.16 0.12 0.08 0.06
## Cumulative Proportion 0.57 0.73 0.86 0.94 1.00
##
## Mean item complexity = 1.4
## Test of the hypothesis that 5 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0.04
## with the empirical chi square 52.84 with prob < 1
##
## Fit based upon off diagonal values = 0.99
princ_stock$communality
## ihsg vol_shm vol_rp frek_tr as_buy ind_agri ind_min
## 0.9442078 0.6403353 0.8424446 0.7491220 0.6102223 0.8549268 0.7223998
## ind_bas ind_mis ind_con ind_prop ind_infr ind_fin ind_trad
## 0.9604589 0.9919909 0.9726970 0.9462956 0.8810889 0.9095998 0.9753950
## ind_manu i_dow i_nikkei inflasi sbi kurs_. kurs_y
## 0.9883090 0.8682689 0.5742556 0.8768206 0.7898269 0.8954249 0.9206365
## jub
## 0.7897340
The communality of a variable is the extent to which the variability across subjects in that variable is ‘explained’ by the set of factors extracted in the factor analysis. We can also look at the uniqueness of each variable: \(\text{Uniqueness} = 1 - \text{Communality}\), where the communality is the sum of squared factor loadings across all factors for a given variable. If all the factors jointly explain a large percentage of the variance in a given variable, that variable has high communality (and thus low uniqueness).
For a simpler result, we can suppress loadings whose absolute value is below \(0.5\). Based on this result we can assign the items to the new underlying factors, and name each factor according to its corresponding items.
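The relationship between loadings, communality and uniqueness can be checked with a few rows copied from the `principal()` output above. The loadings below are the two-decimal values printed there, so the results only agree with the `h2` and `u2` columns up to rounding; the object name `rot_load` is just for illustration.

```r
# Rotated loadings for three variables, copied (rounded) from the output above
rot_load <- rbind(
  ihsg    = c( 0.94,  0.08,  0.12,  0.08,  0.17),
  vol_shm = c( 0.21,  0.03,  0.29,  0.68, -0.23),
  jub     = c(-0.19, -0.13, -0.07, -0.82, -0.25)
)
colnames(rot_load) <- c("RC1", "RC2", "RC3", "RC5", "RC4")

# communality: sum of squared loadings across all factors
communality <- rowSums(rot_load^2)

# uniqueness: the share of variance the factors leave unexplained
uniqueness <- 1 - communality

round(cbind(communality, uniqueness), 2)
```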
print(princ_stock$loadings, cutoff = 0.5)
##
## Loadings:
## RC1 RC2 RC3 RC5 RC4
## ihsg 0.942
## vol_shm 0.677
## vol_rp 0.885
## frek_tr 0.828
## as_buy 0.751
## ind_agri 0.767
## ind_min 0.767
## ind_bas 0.972
## ind_mis 0.959
## ind_con 0.974
## ind_prop 0.912
## ind_infr 0.867
## ind_fin 0.856
## ind_trad 0.901
## ind_manu 0.981
## i_dow 0.676 0.637
## i_nikkei 0.581
## inflasi -0.902
## sbi -0.823
## kurs_. -0.908
## kurs_y -0.903
## jub -0.819
##
## RC1 RC2 RC3 RC5 RC4
## SS loadings 10.745 2.965 2.319 1.464 1.213
## Proportion Var 0.488 0.135 0.105 0.067 0.055
## Cumulative Var 0.488 0.623 0.729 0.795 0.850
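One way to turn the table above into a classification is to assign each variable to the factor on which it has its largest absolute loading, and to flag variables that pass the cutoff on more than one factor. The sketch below does this for six rows copied (rounded) from the rotated loadings; the object names are illustrative, and the rule itself is a common convention rather than something `principal()` does for us.

```r
# Six rows of rotated loadings copied (rounded) from the principal() output
rot_load <- matrix(c( 0.94,  0.08,  0.12,  0.08,  0.17,
                      0.18,  0.08,  0.88,  0.13,  0.06,
                      0.33,  0.03, -0.15,  0.03,  0.77,
                      0.68,  0.64,  0.05, -0.03,  0.03,
                     -0.02, -0.90,  0.18,  0.02,  0.18,
                     -0.19, -0.13, -0.07, -0.82, -0.25),
                   ncol = 5, byrow = TRUE,
                   dimnames = list(c("ihsg", "vol_rp", "ind_min",
                                     "i_dow", "inflasi", "jub"),
                                   c("RC1", "RC2", "RC3", "RC5", "RC4")))

cutoff <- 0.5   # same cutoff as in the print() call

# factor with the largest absolute loading for each variable
primary <- colnames(rot_load)[apply(abs(rot_load), 1, which.max)]

# number of factors on which the variable passes the cutoff
# (a value above 1 indicates a cross-loading)
n_salient <- rowSums(abs(rot_load) >= cutoff)

data.frame(variable = rownames(rot_load), primary, n_salient,
           row.names = NULL)
```

In this subset, i_dow passes the cutoff on both RC1 and RC2, so it deserves a closer look before it is assigned to a single factor.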