Dataset

sustainability <- read.csv(url("https://www.dropbox.com/s/nvzib4kkstgv0a7/sustainability-practice.csv?dl=1"),
                           header=TRUE)
data1 <- sustainability[1:30]

This dataset was from an actual research which aims to determine the best fit model of qualitative productivity of small and medium enterprises in Region XI. The data contains 400 observations with 92 variables. These variables correspond to the questions used by the researcher to measure the variables considered in the study.

For this demonstration, I will be using the first thirty questions of the questionnaire which are the first thirty columns of the dataset. These questions aim to measure the sustainability practices of the participants which refers to the method of evaluating whether a small and medium enterprise can maintain existing practices without placing future resources at risk. These will be stored in data1.

Exploratory Factor Analysis

Initial Analysis

Correlation Matrix

cor(data1)

##             SO.1      SO.2      SO.3      SO.4      SO.5      SO.6      SO.7
## SO.1   1.0000000 0.6315712 0.3790817 0.2965772 0.2357391 0.2625446 0.3941005
## SO.2   0.6315712 1.0000000 0.4138961 0.2210042 0.2872451 0.2538355 0.3312963
## SO.3   0.3790817 0.4138961 1.0000000 0.6052334 0.5901324 0.6112921 0.5233781
## SO.4   0.2965772 0.2210042 0.6052334 1.0000000 0.4743486 0.5375134 0.3720154
## SO.5   0.2357391 0.2872451 0.5901324 0.4743486 1.0000000 0.5802702 0.4982759
## SO.6   0.2625446 0.2538355 0.6112921 0.5375134 0.5802702 1.0000000 0.6461334
## SO.7   0.3941005 0.3312963 0.5233781 0.3720154 0.4982759 0.6461334 1.0000000
## SO.8   0.1807797 0.2664885 0.3046017 0.2098966 0.3372045 0.3659635 0.3972928
## PM.1   0.3215602 0.2430358 0.5308076 0.4411201 0.5677841 0.5615930 0.5673235
## PM.2   0.2979573 0.2476473 0.4985504 0.4506427 0.4922629 0.4890420 0.4551187
## PM.3   0.1714981 0.2446794 0.5255356 0.4864463 0.5329021 0.5842153 0.3798788
## PM.4   0.2381309 0.2047825 0.3934646 0.4114495 0.3973107 0.4865922 0.3750900
## PM.5   0.2598198 0.2120894 0.3698620 0.4358193 0.4143380 0.4557790 0.5457836
## PM.6   0.2495176 0.3115838 0.5077867 0.4386151 0.4810776 0.5442122 0.5324265
## PM.7   0.3209050 0.3475961 0.5774903 0.3402683 0.5123425 0.5294526 0.5852776
## PM.8   0.1044868 0.1912737 0.5493662 0.3920557 0.5031479 0.5582762 0.5148927
## SFPS.1 0.2884414 0.1545617 0.5134441 0.4997082 0.3983013 0.5437305 0.4770273
## SFPS.2 0.4089458 0.3731473 0.2556721 0.2850198 0.2711304 0.2929723 0.3101318
## SFPS.3 0.3288393 0.2882688 0.3508675 0.3119934 0.2772150 0.3332310 0.3310851
## SFPS.4 0.3003922 0.2819080 0.3541564 0.3607331 0.2429748 0.2346654 0.2457086
## SFPS.5 0.3300313 0.4100307 0.4941824 0.3325802 0.4374663 0.4370749 0.4714447
## SFPS.6 0.2222653 0.3136680 0.5080672 0.3822412 0.3687162 0.5208503 0.3999261
## SFPS.7 0.2561839 0.2881681 0.5442601 0.5115058 0.3931217 0.5461929 0.4588204
## LO.1   0.3031987 0.2289579 0.3404363 0.4141417 0.2729543 0.3834063 0.3469728
## LO.2   0.2807116 0.2768188 0.3364128 0.3740529 0.3339760 0.4077828 0.4620626
## LO.3   0.2611204 0.2644986 0.2876618 0.3243582 0.3522527 0.3751686 0.2938339
## LO.4   0.2434140 0.2862638 0.4166843 0.3496510 0.3862101 0.4699904 0.4697268
## LO.5   0.2926416 0.3567731 0.3390702 0.4026061 0.2678942 0.4166245 0.3567292
## LO.6   0.1247796 0.2746303 0.3645226 0.3645602 0.3381415 0.4178378 0.3423396
## LO.7   0.2459781 0.2311428 0.2836062 0.2834288 0.2927143 0.3909378 0.4571340
##              SO.8      PM.1      PM.2      PM.3      PM.4      PM.5      PM.6
## SO.1   0.18077968 0.3215602 0.2979573 0.1714981 0.2381309 0.2598198 0.2495176
## SO.2   0.26648854 0.2430358 0.2476473 0.2446794 0.2047825 0.2120894 0.3115838
## SO.3   0.30460169 0.5308076 0.4985504 0.5255356 0.3934646 0.3698620 0.5077867
## SO.4   0.20989662 0.4411201 0.4506427 0.4864463 0.4114495 0.4358193 0.4386151
## SO.5   0.33720452 0.5677841 0.4922629 0.5329021 0.3973107 0.4143380 0.4810776
## SO.6   0.36596346 0.5615930 0.4890420 0.5842153 0.4865922 0.4557790 0.5442122
## SO.7   0.39729281 0.5673235 0.4551187 0.3798788 0.3750900 0.5457836 0.5324265
## SO.8   1.00000000 0.2796751 0.3463834 0.4599439 0.1212626 0.1620266 0.2932955
## PM.1   0.27967513 1.0000000 0.5321683 0.5198244 0.4519694 0.4972739 0.5187687
## PM.2   0.34638336 0.5321683 1.0000000 0.6445620 0.3699464 0.5115970 0.4843028
## PM.3   0.45994394 0.5198244 0.6445620 1.0000000 0.4593094 0.4442291 0.4909920
## PM.4   0.12126257 0.4519694 0.3699464 0.4593094 1.0000000 0.4907171 0.3398160
## PM.5   0.16202657 0.4972739 0.5115970 0.4442291 0.4907171 1.0000000 0.5448936
## PM.6   0.29329548 0.5187687 0.4843028 0.4909920 0.3398160 0.5448936 1.0000000
## PM.7   0.36324779 0.5833262 0.4512267 0.4265951 0.2344860 0.4209132 0.6243853
## PM.8   0.40228620 0.4990042 0.5399056 0.5535045 0.3521556 0.5119299 0.6161623
## SFPS.1 0.09080285 0.4426808 0.4036216 0.3871325 0.5903292 0.4795334 0.4745211
## SFPS.2 0.26912637 0.2229664 0.3453844 0.3403376 0.2747184 0.3371116 0.4241030
## SFPS.3 0.19825276 0.4222256 0.3232486 0.3384897 0.3922519 0.3168752 0.4257423
## SFPS.4 0.09288413 0.2980724 0.3943281 0.3274897 0.2736548 0.3881096 0.3058616
## SFPS.5 0.28157707 0.3867742 0.4055636 0.3821619 0.2638816 0.4901881 0.4970655
## SFPS.6 0.22245885 0.3727905 0.3636085 0.3990942 0.3784876 0.3835169 0.5101764
## SFPS.7 0.21729171 0.4233401 0.3158516 0.3578237 0.3367531 0.3032766 0.4690446
## LO.1   0.21405330 0.4077570 0.3486905 0.2975338 0.3522706 0.3439237 0.3714022
## LO.2   0.25568901 0.4081827 0.3883277 0.2603986 0.3240218 0.4696258 0.5055111
## LO.3   0.24493331 0.3659990 0.3838990 0.3275324 0.2856123 0.3793444 0.3612389
## LO.4   0.36419810 0.4646748 0.4463391 0.4179591 0.3012022 0.3963319 0.5137543
## LO.5   0.28795856 0.2705463 0.4243729 0.4542954 0.3488761 0.3811119 0.3968120
## LO.6   0.27607911 0.3783195 0.4058888 0.4391417 0.2797446 0.3829381 0.4586959
## LO.7   0.35422766 0.3607607 0.3692906 0.3763254 0.2366145 0.2911738 0.4521198
##             PM.7      PM.8     SFPS.1    SFPS.2    SFPS.3     SFPS.4    SFPS.5
## SO.1   0.3209050 0.1044868 0.28844140 0.4089458 0.3288393 0.30039223 0.3300313
## SO.2   0.3475961 0.1912737 0.15456173 0.3731473 0.2882688 0.28190797 0.4100307
## SO.3   0.5774903 0.5493662 0.51344411 0.2556721 0.3508675 0.35415636 0.4941824
## SO.4   0.3402683 0.3920557 0.49970820 0.2850198 0.3119934 0.36073312 0.3325802
## SO.5   0.5123425 0.5031479 0.39830133 0.2711304 0.2772150 0.24297483 0.4374663
## SO.6   0.5294526 0.5582762 0.54373054 0.2929723 0.3332310 0.23466538 0.4370749
## SO.7   0.5852776 0.5148927 0.47702727 0.3101318 0.3310851 0.24570856 0.4714447
## SO.8   0.3632478 0.4022862 0.09080285 0.2691264 0.1982528 0.09288413 0.2815771
## PM.1   0.5833262 0.4990042 0.44268082 0.2229664 0.4222256 0.29807238 0.3867742
## PM.2   0.4512267 0.5399056 0.40362155 0.3453844 0.3232486 0.39432811 0.4055636
## PM.3   0.4265951 0.5535045 0.38713247 0.3403376 0.3384897 0.32748966 0.3821619
## PM.4   0.2344860 0.3521556 0.59032915 0.2747184 0.3922519 0.27365485 0.2638816
## PM.5   0.4209132 0.5119299 0.47953345 0.3371116 0.3168752 0.38810963 0.4901881
## PM.6   0.6243853 0.6161623 0.47452105 0.4241030 0.4257423 0.30586160 0.4970655
## PM.7   1.0000000 0.5797259 0.35867697 0.3613515 0.3818592 0.32895635 0.5522861
## PM.8   0.5797259 1.0000000 0.41054839 0.2490766 0.3807546 0.28150433 0.4767798
## SFPS.1 0.3586770 0.4105484 1.00000000 0.3504265 0.4574639 0.35315290 0.3185669
## SFPS.2 0.3613515 0.2490766 0.35042653 1.0000000 0.5392120 0.46857867 0.4486740
## SFPS.3 0.3818592 0.3807546 0.45746390 0.5392120 1.0000000 0.49577534 0.5171400
## SFPS.4 0.3289564 0.2815043 0.35315290 0.4685787 0.4957753 1.00000000 0.5949558
## SFPS.5 0.5522861 0.4767798 0.31856689 0.4486740 0.5171400 0.59495583 1.0000000
## SFPS.6 0.4671421 0.5343526 0.46673924 0.4346131 0.5066181 0.44038023 0.5594899
## SFPS.7 0.4745088 0.5661656 0.48728956 0.2436269 0.4282009 0.33959164 0.4643649
## LO.1   0.3276255 0.3819507 0.33551885 0.2951315 0.4119873 0.26544034 0.2192280
## LO.2   0.3401427 0.4141034 0.35730501 0.3704356 0.4426714 0.31269018 0.4036177
## LO.3   0.3193983 0.3437535 0.29388268 0.4228620 0.4577726 0.37868287 0.4287014
## LO.4   0.4578769 0.4759301 0.29163660 0.4080287 0.5077200 0.39850161 0.5238008
## LO.5   0.3289526 0.3794629 0.29429946 0.5210820 0.4746364 0.39918827 0.3715045
## LO.6   0.4069949 0.4257089 0.23900281 0.3462870 0.4585234 0.36474692 0.4672360
## LO.7   0.3473964 0.3376220 0.20721683 0.4595433 0.5470055 0.25795801 0.3751128
##           SFPS.6    SFPS.7      LO.1      LO.2      LO.3      LO.4      LO.5
## SO.1   0.2222653 0.2561839 0.3031987 0.2807116 0.2611204 0.2434140 0.2926416
## SO.2   0.3136680 0.2881681 0.2289579 0.2768188 0.2644986 0.2862638 0.3567731
## SO.3   0.5080672 0.5442601 0.3404363 0.3364128 0.2876618 0.4166843 0.3390702
## SO.4   0.3822412 0.5115058 0.4141417 0.3740529 0.3243582 0.3496510 0.4026061
## SO.5   0.3687162 0.3931217 0.2729543 0.3339760 0.3522527 0.3862101 0.2678942
## SO.6   0.5208503 0.5461929 0.3834063 0.4077828 0.3751686 0.4699904 0.4166245
## SO.7   0.3999261 0.4588204 0.3469728 0.4620626 0.2938339 0.4697268 0.3567292
## SO.8   0.2224589 0.2172917 0.2140533 0.2556890 0.2449333 0.3641981 0.2879586
## PM.1   0.3727905 0.4233401 0.4077570 0.4081827 0.3659990 0.4646748 0.2705463
## PM.2   0.3636085 0.3158516 0.3486905 0.3883277 0.3838990 0.4463391 0.4243729
## PM.3   0.3990942 0.3578237 0.2975338 0.2603986 0.3275324 0.4179591 0.4542954
## PM.4   0.3784876 0.3367531 0.3522706 0.3240218 0.2856123 0.3012022 0.3488761
## PM.5   0.3835169 0.3032766 0.3439237 0.4696258 0.3793444 0.3963319 0.3811119
## PM.6   0.5101764 0.4690446 0.3714022 0.5055111 0.3612389 0.5137543 0.3968120
## PM.7   0.4671421 0.4745088 0.3276255 0.3401427 0.3193983 0.4578769 0.3289526
## PM.8   0.5343526 0.5661656 0.3819507 0.4141034 0.3437535 0.4759301 0.3794629
## SFPS.1 0.4667392 0.4872896 0.3355188 0.3573050 0.2938827 0.2916366 0.2942995
## SFPS.2 0.4346131 0.2436269 0.2951315 0.3704356 0.4228620 0.4080287 0.5210820
## SFPS.3 0.5066181 0.4282009 0.4119873 0.4426714 0.4577726 0.5077200 0.4746364
## SFPS.4 0.4403802 0.3395916 0.2654403 0.3126902 0.3786829 0.3985016 0.3991883
## SFPS.5 0.5594899 0.4643649 0.2192280 0.4036177 0.4287014 0.5238008 0.3715045
## SFPS.6 1.0000000 0.5755789 0.3776770 0.4214773 0.4500252 0.5089488 0.4626203
## SFPS.7 0.5755789 1.0000000 0.4275235 0.4676417 0.3723770 0.4489399 0.3929254
## LO.1   0.3776770 0.4275235 1.0000000 0.6886450 0.5396950 0.5096267 0.5499661
## LO.2   0.4214773 0.4676417 0.6886450 1.0000000 0.7024386 0.5996974 0.5845681
## LO.3   0.4500252 0.3723770 0.5396950 0.7024386 1.0000000 0.6133422 0.5254800
## LO.4   0.5089488 0.4489399 0.5096267 0.5996974 0.6133422 1.0000000 0.5769439
## LO.5   0.4626203 0.3929254 0.5499661 0.5845681 0.5254800 0.5769439 1.0000000
## LO.6   0.4193321 0.4597387 0.5727567 0.6797776 0.6020432 0.6093703 0.5905355
## LO.7   0.3688877 0.2680821 0.4684382 0.5200677 0.4610961 0.6173724 0.5561634
##             LO.6      LO.7
## SO.1   0.1247796 0.2459781
## SO.2   0.2746303 0.2311428
## SO.3   0.3645226 0.2836062
## SO.4   0.3645602 0.2834288
## SO.5   0.3381415 0.2927143
## SO.6   0.4178378 0.3909378
## SO.7   0.3423396 0.4571340
## SO.8   0.2760791 0.3542277
## PM.1   0.3783195 0.3607607
## PM.2   0.4058888 0.3692906
## PM.3   0.4391417 0.3763254
## PM.4   0.2797446 0.2366145
## PM.5   0.3829381 0.2911738
## PM.6   0.4586959 0.4521198
## PM.7   0.4069949 0.3473964
## PM.8   0.4257089 0.3376220
## SFPS.1 0.2390028 0.2072168
## SFPS.2 0.3462870 0.4595433
## SFPS.3 0.4585234 0.5470055
## SFPS.4 0.3647469 0.2579580
## SFPS.5 0.4672360 0.3751128
## SFPS.6 0.4193321 0.3688877
## SFPS.7 0.4597387 0.2680821
## LO.1   0.5727567 0.4684382
## LO.2   0.6797776 0.5200677
## LO.3   0.6020432 0.4610961
## LO.4   0.6093703 0.6173724
## LO.5   0.5905355 0.5561634
## LO.6   1.0000000 0.5992233
## LO.7   0.5992233 1.0000000

Correlation matrix was used to measure the linear relationship among the questions under sustainability practices. In general, the correlation among the variables are positive with varying degree of correlation ranges from moderate to strong positive correlation.

Kaiser-Meyer-Olkin Factor Adequacy

library(psych)
KMO(data1)

## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = data1)
## Overall MSA =  0.93
## MSA for each item = 
##   SO.1   SO.2   SO.3   SO.4   SO.5   SO.6   SO.7   SO.8   PM.1   PM.2   PM.3 
##   0.82   0.85   0.94   0.92   0.96   0.96   0.91   0.89   0.94   0.96   0.91 
##   PM.4   PM.5   PM.6   PM.7   PM.8 SFPS.1 SFPS.2 SFPS.3 SFPS.4 SFPS.5 SFPS.6 
##   0.93   0.91   0.94   0.93   0.94   0.91   0.92   0.92   0.92   0.92   0.96 
## SFPS.7   LO.1   LO.2   LO.3   LO.4   LO.5   LO.6   LO.7 
##   0.92   0.93   0.90   0.93   0.97   0.94   0.93   0.90

Kaiser-Meyer-Olkin factor adequacy was used to measure how suited the data is for Factor Analysis. Overall, the adequacy of the data is considered marvelous whereas the individual scores either marvelous or meritorious.

Bartlett’s Test of Sphericity

cortest(data1)

## Tests of correlation matrices 
## Call:cortest(R1 = data1)
##  Chi Square value 34570.69  with df =  435   with probability < 0

Bartlett’s test of sphericity was used to test the hypothesis that the correlation matrix is an identity matrix. The result shows that we need to reject the null hypothesis and accept the alternative since the p-value is less than $\alpha = 0.05$, $\chi^2 = 34570.69$, df$=435$, p$-$value $< 0.001$. This means that the correlation matrix is not an identity matrix and is appropriate for factor analysis.

Principal Component Analysis

pca <- princomp(data1, cor = TRUE)
pca.var <- pca$sdev^2
pca.var # Latent Root Criterion

##     Comp.1     Comp.2     Comp.3     Comp.4     Comp.5     Comp.6     Comp.7 
## 12.6439267  2.2080524  1.5925917  1.4305699  1.1882621  1.0973386  0.9457982 
##     Comp.8     Comp.9    Comp.10    Comp.11    Comp.12    Comp.13    Comp.14 
##  0.8597469  0.6985417  0.6416641  0.5930120  0.5806828  0.5634108  0.4779285 
##    Comp.15    Comp.16    Comp.17    Comp.18    Comp.19    Comp.20    Comp.21 
##  0.4270293  0.4011747  0.3841773  0.3708049  0.3453669  0.3169820  0.2939708 
##    Comp.22    Comp.23    Comp.24    Comp.25    Comp.26    Comp.27    Comp.28 
##  0.2816540  0.2555185  0.2466795  0.2417222  0.2307325  0.1971733  0.1803687 
##    Comp.29    Comp.30 
##  0.1668989  0.1382202

Principal component analysis was performed to identify the number of components needed for the dataset. The eigenvalues of each component are presented in the result. While others suggest to consider components with eigenvalues larger that one, this research (https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1156&context=pare) suggest that it would be better to use scree plot. These results are presented for the sake of presenting the result.

Percentage of Variance

pca.var.per <- round(pca.var/sum(pca.var)*100, 2)
pca.var.per # Percentage of Variance Criterion

##  Comp.1  Comp.2  Comp.3  Comp.4  Comp.5  Comp.6  Comp.7  Comp.8  Comp.9 Comp.10 
##   42.15    7.36    5.31    4.77    3.96    3.66    3.15    2.87    2.33    2.14 
## Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 Comp.16 Comp.17 Comp.18 Comp.19 Comp.20 
##    1.98    1.94    1.88    1.59    1.42    1.34    1.28    1.24    1.15    1.06 
## Comp.21 Comp.22 Comp.23 Comp.24 Comp.25 Comp.26 Comp.27 Comp.28 Comp.29 Comp.30 
##    0.98    0.94    0.85    0.82    0.81    0.77    0.66    0.60    0.56    0.46

The percentage of variance of each component are presented in the result above. It can be seen that the principal component explained 42.15% of the variance present in the dataset, this percentage slowly reduces after the third component. This suggest that we may consider a three factor model.

Cumulative Percentage of Variance

cum.pca.var.per <- cumsum(pca.var.per)
cum.pca.var.per  # Cumulative %

##  Comp.1  Comp.2  Comp.3  Comp.4  Comp.5  Comp.6  Comp.7  Comp.8  Comp.9 Comp.10 
##   42.15   49.51   54.82   59.59   63.55   67.21   70.36   73.23   75.56   77.70 
## Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 Comp.16 Comp.17 Comp.18 Comp.19 Comp.20 
##   79.68   81.62   83.50   85.09   86.51   87.85   89.13   90.37   91.52   92.58 
## Comp.21 Comp.22 Comp.23 Comp.24 Comp.25 Comp.26 Comp.27 Comp.28 Comp.29 Comp.30 
##   93.56   94.50   95.35   96.17   96.98   97.75   98.41   99.01   99.57  100.03

The cumulative percentage of variance of each component are presented in the result above. Similar to the previous conclusion, a three factor model may be considered for this dataset.

Scree Plot

plot(pca, type="l") # Scree Plot

The scree plot visualizes the cumulative percentage of variance of each component. It can be noticed that the scree plot shouldered after the third component. Therefore, consider 3 factors.

Three Factor Model

Kaiser-Meyer-Olkin Factor Adequacy

sample <- data1[,-c(8,1,2,12,17,4,13,23,18,10,11,30,19,20,22)]
KMO(sample)

## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = sample)
## Overall MSA =  0.92
## MSA for each item = 
##   SO.3   SO.5   SO.6   SO.7   PM.1   PM.6   PM.7   PM.8 SFPS.5   LO.1   LO.2 
##   0.94   0.94   0.93   0.90   0.94   0.93   0.92   0.97   0.92   0.90   0.86 
##   LO.3   LO.4   LO.5   LO.6 
##   0.89   0.95   0.94   0.94

A three factor model was performed on the sustainability practices dataset using SO.3, SO.5, SO.6, SO.7, PM.1, PM.6, PM.7, PM.8, LO.1, LO.2, LO.3, LO.4, and LO.6. These variables were who produced better communalities (greater than 50) and better factor loadings (greater than 0.40 in only one factor). In addition, Kaiser-Meyer-Olkin factor adequacy was used to measure how suited this filtered data is for Factor Analysis. Overall, the adequacy of the data is considered marvelous whereas the individual scores either marvelous or meritorious.

Bartlett’s Test of Sphericity

cortest(sample)

## Tests of correlation matrices 
## Call:cortest(R1 = sample)
##  Chi Square value 11794.01  with df =  105   with probability < 0

Bartlett’s test of sphericity was used to test the hypothesis that the correlation matrix is an identity matrix. The result shows that we need to reject the null hypothesis and accept the alternative since the p-value is less than $\alpha = 0.05$, $\chi^2 = 9239.52$, df$=78$, p$-$value $< 0.001$. This means that the correlation matrix is not an identity matrix and is appropriate for factor analysis.

Cumulative Percentage of Variance

pca4 <- princomp(sample, cor = TRUE)
pca.var4 <- pca4$sdev^2
pca.var.per4 <- round(pca.var4/sum(pca.var4)*100, 2)
cum.pca.var.per4 <- cumsum(pca.var.per4)
cum.pca.var.per4

##  Comp.1  Comp.2  Comp.3  Comp.4  Comp.5  Comp.6  Comp.7  Comp.8  Comp.9 Comp.10 
##   50.25   62.32   67.56   71.81   75.72   79.36   82.46   85.28   87.92   90.34 
## Comp.11 Comp.12 Comp.13 Comp.14 Comp.15 
##   92.72   94.97   97.02   98.89  100.00

The cumulative percentage of variance of each component are presented in the result above. The three factor model explains approximately 70% of the variance in the dataset which is considered enough.

Communalities

fa4 <- factanal(sample, factor=3, rotation="none", scores = "regression")
apply(fa4$loadings^2,1,sum)*100 # communality

##     SO.3     SO.5     SO.6     SO.7     PM.1     PM.6     PM.7     PM.8 
## 57.12621 50.88505 61.09177 56.45199 55.98804 55.37212 61.40076 53.54456 
##   SFPS.5     LO.1     LO.2     LO.3     LO.4     LO.5     LO.6 
## 70.12448 65.98055 77.06293 61.75697 60.64407 50.03066 63.50601

Communalities for all the variables are displayed in the result above. It can be seen that all of which are greater than 50.

Factor Loadings

kaiser.fa4 <- kaiser(fa4, rotate = "Promax")

## Loading required namespace: GPArotation

kaiser.fa4$loadings

## 
## Loadings:
##        Factor1 Factor2 Factor3
## SO.3    0.768                 
## SO.5    0.737                 
## SO.6    0.787                 
## SO.7    0.741                 
## PM.1    0.775           0.131 
## PM.6    0.574  -0.137  -0.119 
## PM.7    0.711          -0.196 
## PM.8    0.638                 
## SFPS.5  0.187          -0.695 
## LO.1    0.149  -0.840   0.344 
## LO.2           -0.912         
## LO.3   -0.133  -0.789  -0.140 
## LO.4    0.128  -0.550  -0.222 
## LO.5           -0.677         
## LO.6           -0.730  -0.158 
## 
##                Factor1 Factor2 Factor3
## SS loadings      4.237   3.505   0.788
## Proportion Var   0.282   0.234   0.053
## Cumulative Var   0.282   0.516   0.569

Factor loadings for each variable is presented in the above result. It can be seen that no variable has factor loading of 0.40 or greater on two or more factors. This is a good indicator that the selected factor model is good. However, in the third factor, only PM.6 is included. This suggest that the model needs improvement by reducing the number of factors from three to two.

Two Factor Model

Kaiser-Meyer-Olkin Factor Adequacy

sample2 <- data1[,-c(1,2,8,20,18,12,17,19,4,13,21,22,23,30,10,11,28,27,5,9)]
KMO(sample2)

## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = sample2)
## Overall MSA =  0.88
## MSA for each item = 
## SO.3 SO.6 SO.7 PM.6 PM.7 PM.8 LO.1 LO.2 LO.3 LO.6 
## 0.92 0.88 0.85 0.90 0.87 0.94 0.90 0.78 0.86 0.92

A three factor model was performed on the sustainability practices dataset using SO.3, SO.6, SO.7, PM.6, PM.7, PM.8, LO.1, LO.2,LO.3, and LO.6. These variables were who produced better communalities (greater than 50) and better factor loadings (greater than 0.40 in only one factor). In addition, Kaiser-Meyer-Olkin factor adequacy was used to measure how suited this filtered data is for Factor Analysis. Overall, the adequacy of the data is considered marvelous whereas the individual scores either marvelous or meritorious.

Bartlett’s Test of Sphericity

cortest.bartlett(sample2)

## R was not square, finding R from data

## $chisq
## [1] 2210.262
## 
## $p.value
## [1] 0
## 
## $df
## [1] 45

Bartlett’s test of sphericity was used to test the hypothesis that the correlation matrix is an identity matrix. The result shows that we need to reject the null hypothesis and accept the alternative since the p-value is less than $=$0.05, $\chi^2 = 2210.262$, df$=45$, p−value$ <0.001$. This means that the correlation matrix is not an identity matrix and is appropriate for factor analysis.

Cumulative Percentage of Variance

pca5 <- princomp(sample2, cor = TRUE)
pca.var5 <- pca5$sdev^2
pca.var.per5 <- round(pca.var5/sum(pca.var5)*100, 2)
cum.pca.var.per5 <- cumsum(pca.var.per5)
cum.pca.var.per5

##  Comp.1  Comp.2  Comp.3  Comp.4  Comp.5  Comp.6  Comp.7  Comp.8  Comp.9 Comp.10 
##   52.71   67.54   73.45   78.57   83.26   87.59   91.42   95.05   98.28  100.00

The cumulative percentage of variance of each component are presented in the result above. The two factor model explains approximately 68% of the variance in the dataset which is considered enough.

Communalities

fa5 <- factanal(sample2, factor=2, rotation="none", scores = "regression")
apply(fa5$loadings^2,1,sum)*100 # communality

##     SO.3     SO.6     SO.7     PM.6     PM.7     PM.8     LO.1     LO.2 
## 54.78879 58.57489 54.84477 57.90150 60.81498 56.03807 54.56384 86.54381 
##     LO.3     LO.6 
## 57.25098 56.88877

Communalities for all the variables are displayed in the result above. It can be seen that all of which are greater than 50.

Factor Loadings

kaiser.fa5 <- kaiser(fa5, rotate = "Promax")
kaiser.fa5$loadings

## 
## Loadings:
##      Factor1 Factor2
## SO.3          0.783 
## SO.6          0.749 
## SO.7          0.704 
## PM.6  0.145   0.664 
## PM.7          0.830 
## PM.8          0.722 
## LO.1  0.721         
## LO.2  0.965         
## LO.3  0.784         
## LO.6  0.679   0.114 
## 
##                Factor1 Factor2
## SS loadings      2.567   3.340
## Proportion Var   0.257   0.334
## Cumulative Var   0.257   0.591

Factor loadings for each variable is presented in the above result. It can be seen that no variable has factor loading of 0.40 or greater on two factors. This is a good indicator that the selected factor model is good. Also, the two factors contain sufficient number of variables, factor one contains eight variables whereas factor two contains five variables, which suggest that the model is better. Therefore, a two factor model will be used for the sustainability practice datasets.

Validation of the Two Factor Analysis

Split Test

Spliting the Dataset into Half

sample.size <- floor(0.5*nrow(sample2))
set.seed(24061991)
picked <- sample(seq_len(nrow(sample2)), size=sample.size)
group1 <- sample2[picked,]
group2 <- sample2[-picked,]

Compare Factor Loadings

fa6 <- factanal(group1, factor=2, rotation="none", scores = "regression")
kaiser.fa6 <- kaiser(fa6, rotate = "Promax")
kaiser.fa6$loadings

## 
## Loadings:
##      Factor1 Factor2
## SO.3          0.745 
## SO.6          0.731 
## SO.7          0.718 
## PM.6  0.147   0.679 
## PM.7 -0.153   0.892 
## PM.8  0.131   0.640 
## LO.1  0.710         
## LO.2  0.959         
## LO.3  0.800         
## LO.6  0.686   0.109 
## 
##                Factor1 Factor2
## SS loadings      2.599   3.289
## Proportion Var   0.260   0.329
## Cumulative Var   0.260   0.589

fa7 <- factanal(group2, factor=2, rotation="none", scores = "regression")
kaiser.fa7 <- kaiser(fa7, rotate = "Promax")
kaiser.fa7$loadings

## 
## Loadings:
##      Factor1 Factor2
## SO.3          0.819 
## SO.6          0.772 
## SO.7          0.675 
## PM.6  0.160   0.633 
## PM.7          0.763 
## PM.8          0.801 
## LO.1  0.737         
## LO.2  0.994  -0.109 
## LO.3  0.763         
## LO.6  0.678   0.112 
## 
##                Factor1 Factor2
## SS loadings      2.620   3.374
## Proportion Var   0.262   0.337
## Cumulative Var   0.262   0.599

The two rotated factor matrices for each half produced the same pattern of loadings of variables on factors that we obtained for the analysis on the complete sample. This result validates the factor solution obtained.

Compare Communalities

apply(fa6$loadings^2,1,sum)*100 # communality

##     SO.3     SO.6     SO.7     PM.6     PM.7     PM.8     LO.1     LO.2 
## 51.44319 56.30448 54.67199 60.38985 65.40387 52.83870 50.93515 87.94679 
##     LO.3     LO.6 
## 58.06202 57.23394

apply(fa7$loadings^2,1,sum)*100 # communality

##     SO.3     SO.6     SO.7     PM.6     PM.7     PM.8     LO.1     LO.2 
## 58.35501 61.43947 54.43389 55.13692 57.10449 60.60701 58.85146 86.68624 
##     LO.3     LO.6 
## 56.11073 56.63105

In all cases, communalities are above 0.50 indicating that the factor model is explaining more than half of the variance in all of the original variables.

Checking Outliers

Identify and Remove of Outliers

Scores <- fa5$scores
Outlier.1 = which(Scores[,1] %in% boxplot(Scores)$out)
Outlier.2 = which(Scores[,2] %in% boxplot(Scores)$out)

r.outlier <- sample2[-c(Outlier.1,Outlier.2),]

We identify outliers using boxplot.

Correlation Matrix

cor(r.outlier)

##           SO.3      SO.6      SO.7      PM.6      PM.7      PM.8      LO.1
## SO.3 1.0000000 0.5828032 0.4857082 0.4563528 0.5156638 0.5010971 0.3365875
## SO.6 0.5828032 1.0000000 0.6234690 0.5224786 0.4886988 0.5306362 0.4038830
## SO.7 0.4857082 0.6234690 1.0000000 0.4810590 0.5098019 0.4842705 0.3562976
## PM.6 0.4563528 0.5224786 0.4810590 1.0000000 0.5580913 0.5867132 0.3710734
## PM.7 0.5156638 0.4886988 0.5098019 0.5580913 1.0000000 0.5319550 0.3120460
## PM.8 0.5010971 0.5306362 0.4842705 0.5867132 0.5319550 1.0000000 0.3688989
## LO.1 0.3365875 0.4038830 0.3562976 0.3710734 0.3120460 0.3688989 1.0000000
## LO.2 0.3833507 0.4748845 0.5193547 0.5508421 0.3621809 0.4584001 0.6521510
## LO.3 0.3270076 0.4266623 0.3684317 0.4207248 0.4074096 0.3762445 0.5191733
## LO.6 0.4022594 0.4645490 0.3961275 0.5188967 0.4700127 0.4571050 0.5273461
##           LO.2      LO.3      LO.6
## SO.3 0.3833507 0.3270076 0.4022594
## SO.6 0.4748845 0.4266623 0.4645490
## SO.7 0.5193547 0.3684317 0.3961275
## PM.6 0.5508421 0.4207248 0.5188967
## PM.7 0.3621809 0.4074096 0.4700127
## PM.8 0.4584001 0.3762445 0.4571050
## LO.1 0.6521510 0.5191733 0.5273461
## LO.2 1.0000000 0.7189890 0.6472040
## LO.3 0.7189890 1.0000000 0.5837188
## LO.6 0.6472040 0.5837188 1.0000000

Correlations are the same even after removing the outliers.

Communalities

# Original
apply(fa5$loadings^2,1,sum)*100

##     SO.3     SO.6     SO.7     PM.6     PM.7     PM.8     LO.1     LO.2 
## 54.78879 58.57489 54.84477 57.90150 60.81498 56.03807 54.56384 86.54381 
##     LO.3     LO.6 
## 57.25098 56.88877

#Without Outliers
fa.outlier <- factanal(r.outlier, factor=2, rotation="none", scores = "regression")
apply(fa.outlier$loadings^2,1,sum)*100

##     SO.3     SO.6     SO.7     PM.6     PM.7     PM.8     LO.1     LO.2 
## 50.05064 57.76047 50.30449 53.94678 54.07699 52.66076 47.24624 91.23197 
##     LO.3     LO.6 
## 57.20725 52.80356

Communalities changed and LO.1 is below 50. This needs to be investigated more.

Factor Loadings

#Original
kaiser.fa5$loadings

## 
## Loadings:
##      Factor1 Factor2
## SO.3          0.783 
## SO.6          0.749 
## SO.7          0.704 
## PM.6  0.145   0.664 
## PM.7          0.830 
## PM.8          0.722 
## LO.1  0.721         
## LO.2  0.965         
## LO.3  0.784         
## LO.6  0.679   0.114 
## 
##                Factor1 Factor2
## SS loadings      2.567   3.340
## Proportion Var   0.257   0.334
## Cumulative Var   0.257   0.591

# Without Outliers
kaiser.fa.outlier <- kaiser(fa.outlier, rotate = "Promax")
kaiser.fa.outlier$loadings

## 
## Loadings:
##      Factor1 Factor2
## SO.3          0.757 
## SO.6          0.727 
## SO.7  0.149   0.598 
## PM.6  0.207   0.576 
## PM.7 -0.109   0.807 
## PM.8          0.694 
## LO.1  0.682         
## LO.2  1.035  -0.121 
## LO.3  0.759         
## LO.6  0.542   0.237 
## 
##                Factor1 Factor2
## SS loadings      2.494   2.994
## Proportion Var   0.249   0.299
## Cumulative Var   0.249   0.549

Patterns in the factor loadings are the same overall.

Reliability Analysis

Factor 1

reliability <- psych::alpha(sample2[,1:6])
reliability

## 
## Reliability analysis   
## Call: psych::alpha(x = sample2[, 1:6])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N    ase mean   sd median_r
##       0.89      0.89    0.87      0.57 7.8 0.0088  4.1 0.66     0.56
## 
##  lower alpha upper     95% confidence boundaries
## 0.87 0.89 0.9 
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se  var.r med.r
## SO.3      0.87      0.87    0.85      0.57 6.7    0.010 0.0020  0.57
## SO.6      0.86      0.86    0.84      0.56 6.4    0.011 0.0017  0.56
## SO.7      0.87      0.87    0.85      0.57 6.6    0.010 0.0015  0.57
## PM.6      0.87      0.87    0.85      0.57 6.6    0.011 0.0017  0.57
## PM.7      0.86      0.86    0.84      0.56 6.4    0.011 0.0023  0.55
## PM.8      0.87      0.87    0.85      0.57 6.6    0.011 0.0023  0.56
## 
##  Item statistics 
##        n raw.r std.r r.cor r.drop mean   sd
## SO.3 400  0.78  0.79  0.73   0.68  4.1 0.81
## SO.6 400  0.81  0.81  0.77   0.72  4.1 0.82
## SO.7 400  0.79  0.79  0.74   0.69  4.1 0.79
## PM.6 400  0.80  0.80  0.74   0.70  4.0 0.88
## PM.7 400  0.81  0.81  0.77   0.72  4.2 0.80
## PM.8 400  0.80  0.80  0.74   0.70  4.0 0.84
## 
## Non missing response frequency for each item
##         1    2    3    4    5 miss
## SO.3 0.01 0.01 0.19 0.42 0.37    0
## SO.6 0.01 0.01 0.17 0.44 0.36    0
## SO.7 0.00 0.03 0.18 0.44 0.35    0
## PM.6 0.01 0.02 0.22 0.41 0.34    0
## PM.7 0.00 0.03 0.14 0.42 0.41    0
## PM.8 0.01 0.02 0.22 0.44 0.32    0

The result shows that factor one has a reliability score of 0.89 which is considered very acceptable.

Factor 2

reliability <- psych::alpha(sample2[,7:10])
reliability

## 
## Reliability analysis   
## Call: psych::alpha(x = sample2[, 7:10])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.84      0.63 6.8 0.011  4.3 0.62     0.64
## 
##  lower alpha upper     95% confidence boundaries
## 0.85 0.87 0.89 
## 
##  Reliability if an item is dropped:
##      raw_alpha std.alpha G6(smc) average_r S/N alpha se   var.r med.r
## LO.1      0.85      0.85    0.80      0.66 5.9    0.013 0.00277  0.68
## LO.2      0.80      0.80    0.73      0.57 4.0    0.017 0.00097  0.57
## LO.3      0.84      0.85    0.79      0.65 5.5    0.014 0.00416  0.68
## LO.6      0.84      0.84    0.80      0.64 5.4    0.014 0.00814  0.69
## 
##  Item statistics 
##        n raw.r std.r r.cor r.drop mean   sd
## LO.1 400  0.83  0.82  0.74   0.68  4.3 0.76
## LO.2 400  0.90  0.90  0.88   0.82  4.3 0.72
## LO.3 400  0.82  0.84  0.76   0.70  4.4 0.66
## LO.6 400  0.85  0.84  0.76   0.71  4.1 0.79
## 
## Non missing response frequency for each item
##         1    2    3    4    5 miss
## LO.1 0.01 0.00 0.12 0.40 0.47    0
## LO.2 0.00 0.01 0.10 0.43 0.45    0
## LO.3 0.00 0.00 0.10 0.42 0.48    0
## LO.6 0.00 0.03 0.17 0.47 0.33    0

The result shows that factor one has a reliability score of 0.87 which is considered very acceptable.

Session Info

sessionInfo()

## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Philippines.1252  LC_CTYPE=English_Philippines.1252   
## [3] LC_MONETARY=English_Philippines.1252 LC_NUMERIC=C                        
## [5] LC_TIME=English_Philippines.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] psych_2.0.8      knitr_1.29       kableExtra_1.2.1
## 
## loaded via a namespace (and not attached):
##  [1] rstudioapi_0.11       xml2_1.3.2            magrittr_1.5         
##  [4] rvest_0.3.6           mnormt_2.0.2          munsell_0.5.0        
##  [7] lattice_0.20-41       colorspace_1.4-1      viridisLite_0.3.0    
## [10] R6_2.4.1              rlang_0.4.7           stringr_1.4.0        
## [13] httr_1.4.2            tools_4.0.2           grid_4.0.2           
## [16] parallel_4.0.2        webshot_0.5.2         nlme_3.1-148         
## [19] tmvnsim_1.0-2         xfun_0.17             htmltools_0.5.0      
## [22] yaml_2.2.1            digest_0.6.25         lifecycle_0.2.0      
## [25] GPArotation_2014.11-1 glue_1.4.2            evaluate_0.14        
## [28] rmarkdown_2.3         stringi_1.5.3         compiler_4.0.2       
## [31] scales_1.1.1

Activity 1: Exploratory Factor Analysis

Raffy S. Centeno
Ph.D. in Development Research Administration
DRA 328 - Empirical Analysis 2: Application of Intermediate Statistics

Dataset

Exploratory Factor Analysis

Initial Analysis

Correlation Matrix

Kaiser-Meyer-Olkin Factor Adequacy

Bartlett’s Test of Sphericity

Principal Component Analysis

Percentage of Variance

Cumulative Percentage of Variance

Scree Plot

Three Factor Model

Kaiser-Meyer-Olkin Factor Adequacy

Bartlett’s Test of Sphericity

Cumulative Percentage of Variance

Communalities

Factor Loadings

Two Factor Model

Kaiser-Meyer-Olkin Factor Adequacy

Bartlett’s Test of Sphericity

Cumulative Percentage of Variance

Communalities

Factor Loadings

Validation of the Two Factor Analysis

Split Test

Spliting the Dataset into Half

Compare Factor Loadings

Compare Communalities

Checking Outliers

Identify and Remove of Outliers

Correlation Matrix

Communalities

Factor Loadings

Reliability Analysis

Factor 1

Factor 2

Session Info

Activity 1: Exploratory Factor Analysis

Raffy S. Centeno Ph.D. in Development Research Administration DRA 328 - Empirical Analysis 2: Application of Intermediate Statistics

Dataset

Exploratory Factor Analysis

Initial Analysis

Correlation Matrix

Kaiser-Meyer-Olkin Factor Adequacy

Bartlett’s Test of Sphericity

Principal Component Analysis

Percentage of Variance

Cumulative Percentage of Variance

Scree Plot

Three Factor Model

Kaiser-Meyer-Olkin Factor Adequacy

Bartlett’s Test of Sphericity

Cumulative Percentage of Variance

Communalities

Factor Loadings

Two Factor Model

Kaiser-Meyer-Olkin Factor Adequacy

Bartlett’s Test of Sphericity

Cumulative Percentage of Variance

Communalities

Factor Loadings

Validation of the Two Factor Analysis

Split Test

Spliting the Dataset into Half

Compare Factor Loadings

Compare Communalities

Checking Outliers

Identify and Remove of Outliers

Correlation Matrix

Communalities

Factor Loadings

Reliability Analysis

Factor 1

Factor 2

Session Info

Raffy S. Centeno
Ph.D. in Development Research Administration
DRA 328 - Empirical Analysis 2: Application of Intermediate Statistics