I am going to use R to run a principal component analysis (PCA) on the data, Sandra. I start by reading the data from a CSV file.
lean <- read.csv("matrix1.csv", stringsAsFactors = FALSE, header = TRUE)
lean1 <- lean[1:101, 1:48]  # keep the first 101 rows and the first 48 columns
# str(lean)  # overview of the data frame: classes and levels
# sapply(lean1, class)  # class of each variable
Names of the variables:
names(lean1)
[1] "Lean.culture"
[2] "Values.and.personal.vision"
[3] "Way.of.thinking"
[4] "People.skills"
[5] "Growth.in.the.philosophy"
[6] "Growth.and.development.of.Lean.leaders"
[7] "Promotion.and.continuous.development.of.Lean.Construction"
[8] "Encourage.of.continuous.improvement"
[9] "Growth.and.development.of.the.team.work"
[10] "Team.work.processes"
[11] "Problem.understanding"
[12] "Solution.and.joint.learning"
[13] "Fulfillment.of.the.value.offer"
[14] "Continuous.improvement"
[15] "Control.of.complete.process"
[16] "Development.and.operation.of..planning.and.control.system.of.production"
[17] "Pull.system.development"
[18] "Implementation.of.a.quality.management.and.control.system"
[19] "Knowledge.and.selection.of.lean.construction.tools"
[20] "Level.of.using.Lean.Construction.tools"
[21] "Simplification.of.processes"
[22] "Flexibility"
[23] "Transparency"
[24] "Benchmarking"
[25] "Continuous.flow"
[26] "Waste.reduction"
[27] "Variability.reduction"
[28] "Cycle.time.reduction"
[29] "Standards.development"
[30] "Improvement.and.sustainability.of.standards"
[31] "Engage.with.the.safety.at.the.workplace"
[32] "Learning.and.training.for.safety.at.the.workplace"
[33] "Integration.of.new.developments"
[34] "Knowledge.management"
[35] "Continuous.support.for.the.development.of.a.lean.construction.production.system"
[36] "Definition.and.deployment.of.policy.and.strategy.for..lean.construction.support"
[37] "Focus.on.philosophy"
[38] "Involvement.of.People"
[39] "Interaction.in.the.work.environment"
[40] "Work.environment.construction"
[41] "Business.outcomes"
[42] "Project.support.by.using.the..enterprise.processes"
[43] "Implementation.of.Management.system"
[44] "Contractual.management.process"
[45] "systems.of.Information"
[46] "Flow.of.information"
[47] "Supply.chain.management"
[48] "Logistic.operations"
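The call that creates lean_PCA is not shown in these notes. The summary that follows is consistent with a PCA on standardized variables (for PC1, 4.3273^2 / 48 ≈ 0.3901, which is the reported proportion of variance), so a call like prcomp(lean1, scale. = TRUE) is a plausible reconstruction, but that is an assumption. Below is a minimal sketch on simulated stand-in data (the data frame `demo` is hypothetical and replaces lean1 only so the example runs on its own):

```r
# Simulated stand-in for lean1: 101 observations of 48 numeric variables.
# This is NOT the real data; it only makes the example self-contained.
set.seed(1)
demo <- as.data.frame(matrix(rnorm(101 * 48), nrow = 101, ncol = 48))

# PCA on centered and standardized variables (correlation-matrix PCA).
# With the real data this would presumably be prcomp(lean1, scale. = TRUE).
demo_PCA <- prcomp(demo, center = TRUE, scale. = TRUE)

# With standardized variables, the component variances add up to the
# number of variables (48 here).
total_var <- sum(demo_PCA$sdev^2)
print(total_var)

# Standard deviation, proportion and cumulative proportion for the first 3 PCs.
print(summary(demo_PCA)$importance[, 1:3])
```

Standardizing is the usual choice when the variables are scores on different scales, because otherwise the variables with the largest variance dominate the first components.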
summary(lean_PCA)
Importance of components:
PC1 PC2
Standard deviation 4.3273 2.11606
Proportion of Variance 0.3901 0.09329
Cumulative Proportion 0.3901 0.48339
PC3 PC4
Standard deviation 1.66039 1.48825
Proportion of Variance 0.05744 0.04614
Cumulative Proportion 0.54083 0.58697
PC5 PC6
Standard deviation 1.38007 1.27298
Proportion of Variance 0.03968 0.03376
Cumulative Proportion 0.62665 0.66041
PC7 PC8
Standard deviation 1.17191 1.11677
Proportion of Variance 0.02861 0.02598
Cumulative Proportion 0.68902 0.71501
PC9 PC10
Standard deviation 1.07156 1.02454
Proportion of Variance 0.02392 0.02187
Cumulative Proportion 0.73893 0.76080
PC11 PC12
Standard deviation 0.97299 0.91275
Proportion of Variance 0.01972 0.01736
Cumulative Proportion 0.78052 0.79788
PC13 PC14
Standard deviation 0.86621 0.82283
Proportion of Variance 0.01563 0.01411
Cumulative Proportion 0.81351 0.82761
PC15 PC16
Standard deviation 0.81315 0.78461
Proportion of Variance 0.01378 0.01283
Cumulative Proportion 0.84139 0.85421
PC17 PC18
Standard deviation 0.75423 0.71818
Proportion of Variance 0.01185 0.01075
Cumulative Proportion 0.86607 0.87681
PC19 PC20
Standard deviation 0.70931 0.66475
Proportion of Variance 0.01048 0.00921
Cumulative Proportion 0.88729 0.89650
PC21 PC22
Standard deviation 0.64380 0.61902
Proportion of Variance 0.00864 0.00798
Cumulative Proportion 0.90513 0.91312
PC23 PC24
Standard deviation 0.60426 0.58007
Proportion of Variance 0.00761 0.00701
Cumulative Proportion 0.92072 0.92773
PC25 PC26
Standard deviation 0.56017 0.53505
Proportion of Variance 0.00654 0.00596
Cumulative Proportion 0.93427 0.94023
PC27 PC28
Standard deviation 0.5185 0.51101
Proportion of Variance 0.0056 0.00544
Cumulative Proportion 0.9458 0.95127
PC29 PC30
Standard deviation 0.48689 0.46641
Proportion of Variance 0.00494 0.00453
Cumulative Proportion 0.95621 0.96075
PC31 PC32
Standard deviation 0.43408 0.42391
Proportion of Variance 0.00393 0.00374
Cumulative Proportion 0.96467 0.96842
PC33 PC34
Standard deviation 0.42060 0.40230
Proportion of Variance 0.00369 0.00337
Cumulative Proportion 0.97210 0.97547
PC35 PC36
Standard deviation 0.38346 0.37427
Proportion of Variance 0.00306 0.00292
Cumulative Proportion 0.97854 0.98145
PC37 PC38
Standard deviation 0.34577 0.32569
Proportion of Variance 0.00249 0.00221
Cumulative Proportion 0.98394 0.98615
PC39 PC40
Standard deviation 0.3176 0.31370
Proportion of Variance 0.0021 0.00205
Cumulative Proportion 0.9883 0.99031
PC41 PC42
Standard deviation 0.29461 0.28039
Proportion of Variance 0.00181 0.00164
Cumulative Proportion 0.99211 0.99375
PC43 PC44
Standard deviation 0.25410 0.24814
Proportion of Variance 0.00135 0.00128
Cumulative Proportion 0.99510 0.99638
PC45 PC46
Standard deviation 0.23282 0.2196
Proportion of Variance 0.00113 0.0010
Cumulative Proportion 0.99751 0.9985
PC47 PC48
Standard deviation 0.20887 0.16641
Proportion of Variance 0.00091 0.00058
Cumulative Proportion 0.99942 1.00000
The first 21 principal components explain 90.5% of the variance (the first 20 explain 89.7%). You can also limit the number of components to the smallest number that accounts for a chosen fraction of the total variance. For example, if you are satisfied with 70% of the total variance explained, the first 8 components are enough (cumulative proportion 0.715).
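The number of components needed for a given share of variance can also be computed instead of read off the table. Here is a small sketch that uses the standard deviations copied by hand from the summary output above. With the fitted object available, lean_PCA$sdev would be used instead of the typed vector. The names `sdev`, `n_70` and `n_90` are only illustrative.

```r
# Standard deviations of PC1..PC48, copied from the summary output.
sdev <- c(4.3273, 2.11606, 1.66039, 1.48825, 1.38007, 1.27298,
          1.17191, 1.11677, 1.07156, 1.02454, 0.97299, 0.91275,
          0.86621, 0.82283, 0.81315, 0.78461, 0.75423, 0.71818,
          0.70931, 0.66475, 0.64380, 0.61902, 0.60426, 0.58007,
          0.56017, 0.53505, 0.5185, 0.51101, 0.48689, 0.46641,
          0.43408, 0.42391, 0.42060, 0.40230, 0.38346, 0.37427,
          0.34577, 0.32569, 0.3176, 0.31370, 0.29461, 0.28039,
          0.25410, 0.24814, 0.23282, 0.2196, 0.20887, 0.16641)

# Variance of each component, as a proportion of the total variance.
prop_var <- sdev^2 / sum(sdev^2)
cum_var  <- cumsum(prop_var)

# Smallest number of components reaching at least 70% and 90% of the variance.
n_70 <- which(cum_var >= 0.70)[1]
n_90 <- which(cum_var >= 0.90)[1]
print(c(n_70 = n_70, n_90 = n_90))  # 8 and 21
```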
These are the loadings of the PCA, that is, the coefficients of each principal component.
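library(factoextra)  # provides get_eigenvalue(), fviz_pca_var() and get_pca_var()
# lean_eigen <- get_eigenvalue(lean_PCA)  # eigenvalues, i.e. the variance of each component
lean_loadings <- lean_PCA$rotation  # loadings
# edit(lean_loadings)  # open the loadings in a grid editor
fviz_pca_var(lean_PCA, col.var = "black")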
This is the variable correlation plot, with the first principal component on the X axis and the second on the Y axis. Some variables are negatively correlated with the first component.
Now I am going to extract the results. The function get_pca_var() returns a list of matrices containing all the results for the active variables (coordinates, correlations between variables and axes, squared cosines and contributions).
Below are the coordinates of the first variables on the principal components (head() prints six rows by default). The correlation between a variable and a principal component (PC) is used as the coordinate of the variable on that PC.
var <- get_pca_var(lean_PCA)
head(var$coord)
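To make the link between coordinates and correlations concrete, here is a minimal base-R sketch on the built-in USArrests data (a stand-in, not the lean data). It assumes a PCA on standardized variables and, as far as I understand factoextra, reproduces how the coordinates are obtained for a prcomp object: each loading is multiplied by the standard deviation of its component.

```r
# PCA on standardized variables, using a built-in data set as a stand-in.
pca <- prcomp(USArrests, scale. = TRUE)

# Coordinate of variable j on component k = loading[j, k] * sdev[k].
coord <- sweep(pca$rotation, 2, pca$sdev, `*`)

# Correlation between each original variable and each component score.
cors <- cor(USArrests, pca$x)

# The two matrices agree up to floating-point error.
print(all.equal(unname(coord), unname(cors)))
print(round(coord, 3))
```

Because the coordinates are correlations, they always lie between -1 and 1, which is why the variables fit inside the unit circle of the correlation plot.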
The contributions of the variables to the variability in a given principal component are expressed as percentages. (1) Variables that are correlated with PC1 and PC2 (Dim.1 and Dim.2) are the most important in explaining the variability in the data set. (2) Variables that are not correlated with any PC, or that are correlated only with the last dimensions, contribute little and might be removed to simplify the overall analysis.
head(var$contrib)
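As a rough illustration of where these percentages come from, the sketch below computes contributions by hand on the built-in USArrests data (again a stand-in for the lean data). It assumes the usual definition, which I believe factoextra follows: the contribution of a variable to a component is its squared coordinate (cos2) divided by the sum of the squared coordinates on that component, times 100.

```r
# PCA on standardized variables, using a built-in data set as a stand-in.
pca <- prcomp(USArrests, scale. = TRUE)

# Coordinates (correlations with the components) and their squares.
coord <- sweep(pca$rotation, 2, pca$sdev, `*`)
cos2  <- coord^2

# Contribution (%) of each variable to each component.
contrib <- sweep(cos2, 2, colSums(cos2), `/`) * 100

# Within each component the contributions add up to 100.
print(colSums(contrib))

# Variables ranked by their contribution to the first component.
print(sort(contrib[, "PC1"], decreasing = TRUE))
```

A variable contributing more than 100 / (number of variables) percent to a component contributes more than it would if all variables contributed equally, which is a common rule of thumb for deciding which variables matter on that component.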
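Printing var lists the result matrices that are available. To see all the contributions, print var$contrib. It is a long list, but I think the full table is much better for making a decision about which variables to keep!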
print(var)
Principal Component Analysis Results for variables
===================================================
Name Description
1 "$coord" "Coordinates for the variables"
2 "$cor" "Correlations between variables and dimensions"
3 "$cos2" "Cos2 for the variables"
4 "$contrib" "contributions of the variables"
Good luck, Sandra! Please don't hesitate to contact me if you have any questions! :-)