We first load the necessary packages in R and read in our dataset prince.
library(tidyverse)
library(psych)
library(rela)
library(MASS)
library(parallel)
options(scipen = 999)
prince <- read.table("prince.dat", sep = "", header = FALSE)
Principal Components Analysis (PCA) is a method of data reduction: it seeks a set of linear combinations of the original variates, fewer in number than the variates themselves, that accounts for as much of the variation in the data as possible.
This method has three key assumptions:

- Sphericity: the correlation matrix differs from the identity matrix, i.e., the variables are intercorrelated
- Sampling adequacy: an appropriate number of observations relative to the number of variables being examined
- A positive determinant of the correlation or variance-covariance matrix
We first compute the correlation matrix \(\boldsymbol{\rho}\) and the variance-covariance matrix \(\boldsymbol{\Sigma}\).
princeCor <- cor(prince[,1:4])
princeCov <- cov(prince[,1:4])
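On the correlation scale, PCA amounts to the eigendecomposition of \(\boldsymbol{\rho}\): the eigenvalues are the component variances and the eigenvectors define the linear combinations. A minimal cross-check against the output reported later:
# Eigendecomposition of the correlation matrix: eigenvalues are the
# component variances; eigenvectors define the linear combinations
eig <- eigen(princeCor)
eig$values                      # should match the SS loadings reported below
eig$values / sum(eig$values)    # proportion of variance per component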
We now examine the data to assess whether the assumptions for PCA have been met before proceeding. We use the paf() function from the rela package.
assumptions <- paf(as.matrix(prince[,1:4]), eigcrit = 1, convcrit = .001)
To test for the first assumption, we perform Bartlett’s Test for Sphericity.
The null hypothesis for this test is that the population intercorrelation matrix is the identity matrix, that is, that there is no correlation between the variables; this would render PCA pointless, since it depends on constructing linear combinations of correlated variables.
We use a significance level \(\alpha = .05\).
bartlettTest <- cortest.bartlett(princeCor, n = 98)
bartlettTest
$chisq
[1] 228.97
$p.value
[1] 0.00000000000000000000000000000000000000000000012692
$df
[1] 6
Bartlett’s test statistic follows a \(\chi^{2}\) distribution with \(p(p-1)/2\) degrees of freedom, here \(6\) for \(p = 4\) variables. Our p-value is effectively 0, so we reject the null hypothesis and conclude that the variables are sufficiently intercorrelated for PCA.
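As a sanity check, the statistic can be computed by hand from the determinant of the correlation matrix using the standard formula \(\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert\boldsymbol{R}\rvert\):
# Hand computation of Bartlett's statistic (n = 98 observations, p = 4 variables)
n <- 98; p <- 4
-(n - 1 - (2 * p + 5) / 6) * log(det(princeCor))  # ~228.97 on p * (p - 1) / 2 = 6 df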
We proceed to evaluate the sampling adequacy. Using the assumptions variable calculated earlier, we extract the Kaiser-Meyer-Olkin (KMO) value.
print(assumptions$KMO)
[1] 0.63989
Our KMO value is 0.63989, above the commonly cited minimum of 0.6 but well short of the ideal of 1. However, it is often recommended that a ratio of 20 observations to one variable will yield satisfactory results {Schumacker:2015tn}. In this case we have 98 observations on four variables, and therefore feel reasonably confident that we can proceed with the analysis.
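As an independent check, the psych package provides a KMO() function that reports the same overall measure along with per-variable values:
KMO(princeCor)  # overall MSA should agree with the value above (~0.64)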
We now verify that the determinant of the correlation matrix is positive.
det(princeCor)
[1] 0.089415
Clearly, the determinant of the correlation matrix is positive. We have thus satisfied all the assumptions for PCA and proceed with the analysis.
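Since the determinant of a matrix equals the product of its eigenvalues, a positive determinant here is equivalent to all eigenvalues of the correlation matrix being positive, i.e., there are no exact linear dependencies among the variables:
prod(eigen(princeCor)$values)     # equals det(princeCor)
all(eigen(princeCor)$values > 0)  # TRUE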
We can use a scree plot to aid with the decision of how many components to include:
fa.parallel(prince[, 1:4], n.obs = 98, fm = "pa", fa = "pc")
You specified the number of subjects, implying a correlation matrix, but do not have a correlation matrix, correlations found
A loading greater than abs(1) was detected. Examine the loadings carefully.
The estimated weights for the factor scores are probably incorrect. Try a different factor extraction method.
An ultra-Heywood case was detected. Examine the results carefully
Parallel analysis suggests that the number of factors = NA and the number of components = 1
From this plot, we can see that the first component has by far the largest eigenvalue; the parallel analysis likewise suggests retaining a single component.
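The Kaiser criterion, which retains components whose eigenvalues exceed 1, points to the same choice:
sum(eigen(princeCor)$values > 1)  # 1: only the first eigenvalue (2.46) exceeds 1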
We now construct a PCA model with a single principal component.
pcaModel1 <- principal(princeCor, rotate = "none")
pcaModel1
Principal Components Analysis
Call: principal(r = princeCor, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix
PC1 h2 u2 com
V1 0.41 0.17 0.833 1
V2 0.95 0.91 0.089 1
V3 0.72 0.51 0.487 1
V4 0.94 0.87 0.126 1
PC1
SS loadings 2.46
Proportion Var 0.62
Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.
The root mean square of the residuals (RMSR) is 0.14
Fit based upon off diagonal values = 0.93
The SS loadings row gives the eigenvalue of the single principal component, 2.46. Dividing by the number of variables gives the proportion of variance it explains, \(2.46 / 4 \approx 62\%\), which leaves 38% of the variance unexplained.
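The same figure can be recovered from the eigendecomposition computed earlier:
eig$values[1] / 4  # ~0.62, proportion of variance explained by PC1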
We thus fit a second model, which allows for additional components.
pcaModel2 <- principal(princeCor, nfactors = 3, rotate = "none")
pcaModel2
Principal Components Analysis
Call: principal(r = princeCor, nfactors = 3, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix
PC1 PC2 PC3 h2 u2 com
V1 0.41 0.87 0.26 1.00 0.000061 1.6
V2 0.95 -0.02 -0.22 0.96 0.040456 1.1
V3 0.72 -0.45 0.54 1.00 0.000350 2.6
V4 0.94 -0.02 -0.30 0.96 0.035164 1.2
PC1 PC2 PC3
SS loadings 2.46 0.96 0.49
Proportion Var 0.62 0.24 0.12
Cumulative Var 0.62 0.86 0.98
Proportion Explained 0.63 0.25 0.13
Cumulative Proportion 0.63 0.87 1.00
Mean item complexity = 1.6
Test of the hypothesis that 3 components are sufficient.
The root mean square of the residuals (RMSR) is 0.02
Fit based upon off diagonal values = 1
The eigenvalues of the components are \(\lambda_{1} = 2.46\) (as computed previously), \(\lambda_{2} = 0.96\) and \(\lambda_{3} = 0.49\). Together these components account for 98% of the variation in the dataset.
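As a quick check on the cumulative figure:
(2.46 + 0.96 + 0.49) / 4  # ~0.98 of the total variance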
The first PC is strongly correlated with variables V2 and V4, moderately correlated with V3, and weakly correlated with V1.
The second PC is strongly correlated with V1, moderately negatively correlated with V3, and essentially uncorrelated with V2 and V4.
The third PC has positive loadings on V1 and V3 and negative loadings on V2 and V4.
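If per-observation component scores are wanted, principal() can be given the raw data rather than the correlation matrix; a sketch, with the same column indices as before:
pcaScores <- principal(prince[, 1:4], nfactors = 3, rotate = "none")$scores
head(pcaScores)  # one column of scores per retained component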
We load the additional packages required to perform the Factor Analysis.
library(corpcor)
library(GPArotation)
We convert our dataset to a matrix to facilitate the computations.
factData <- as.matrix(prince[,1:4])
We then compute the correlation and covariance matrices.
factCor <- cor(factData)
factCov <- cov(factData)
The key assumptions of Factor Analysis are the same as those for the PCA performed above:

1. Adequate sample size
2. Sphericity
3. Positive determinant of the correlation matrix

As these assumptions were all tested and satisfied (with some caveats), we proceed with the Factor Analysis.
factorAnalysis <- fa(factCor, nfactors = 3, n.obs = 98, rotate = "none")
print(factorAnalysis)
Factor Analysis using method = minres
Call: fa(r = factCor, nfactors = 3, n.obs = 98, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix
MR1 MR2 MR3 h2 u2 com
V1 0.30 0.38 0.06 0.24 0.7631 1.9
V2 0.99 0.04 0.05 0.99 0.0078 1.0
V3 0.59 -0.35 0.05 0.47 0.5337 1.6
V4 0.93 0.06 -0.10 0.88 0.1220 1.0
MR1 MR2 MR3
SS loadings 2.28 0.27 0.02
Proportion Var 0.57 0.07 0.00
Cumulative Var 0.57 0.64 0.64
Proportion Explained 0.89 0.11 0.01
Cumulative Proportion 0.89 0.99 1.00
Mean item complexity = 1.4
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the null model are 6 and the objective function was 2.41 with Chi Square of 228.97
The degrees of freedom for the model are -3 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is NA
The harmonic number of observations is 98 with the empirical chi square 0 with prob < NA
The total number of observations was 98 with Likelihood Chi Square = 0 with prob < NA
Tucker Lewis Index of factoring reliability = 1.028
Fit based upon off diagonal values = 1
Measures of factor score adequacy
MR1 MR2 MR3
Correlation of (regression) scores with factors 1.00 0.55 0.39
Multiple R square of scores with factors 0.99 0.31 0.15
Minimum correlation of possible factor scores 0.98 -0.39 -0.69
We construct a scree plot to aid with the selection of the number of factors.
plot(factorAnalysis$values, type = "b", xlab = "Factor number", ylab = "Eigenvalue")
From this plot, we see that the eigenvalues drop precipitously after the first factor.
We therefore construct a new analysis with only a single factor.
factorAnalysis2 <- fa(factCor, nfactors = 1, n.obs = 98, rotate = "none", fm = "mle")
print(factorAnalysis2)
Factor Analysis using method = ml
Call: fa(r = factCor, nfactors = 1, n.obs = 98, rotate = "none", fm = "mle")
Standardized loadings (pattern matrix) based upon correlation matrix
ML1 h2 u2 com
V1 0.31 0.097 0.903 1
V2 1.00 0.995 0.005 1
V3 0.57 0.328 0.672 1
V4 0.92 0.853 0.147 1
ML1
SS loadings 2.27
Proportion Var 0.57
Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.
The degrees of freedom for the null model are 6 and the objective function was 2.41 with Chi Square of 228.97
The degrees of freedom for the model are 2 and the objective function was 0.03
The root mean square of the residuals (RMSR) is 0.06
The df corrected root mean square of the residuals is 0.1
The harmonic number of observations is 98 with the empirical chi square 3.6 with prob < 0.17
The total number of observations was 98 with Likelihood Chi Square = 2.98 with prob < 0.23
Tucker Lewis Index of factoring reliability = 0.987
RMSEA index = 0.074 and the 90 % confidence intervals are 0 0.226
BIC = -6.19
Fit based upon off diagonal values = 0.99
Measures of factor score adequacy
ML1
Correlation of (regression) scores with factors 1.00
Multiple R square of scores with factors 1.00
Minimum correlation of possible factor scores 0.99
Using the above output, we examine whether a single factor suffices. Note that the \(\chi^{2}\) value of 228.97 with 6 \(df\) belongs to the null model (no common factors); its p-value, calculated with the pchisq() function, is effectively 0, which confirms only that the variables share common variance.
pchisq(228.97, df = 6, lower.tail = FALSE)
[1] 0.00000000000000000000000000000000000000000000012701
The test of the hypothesis that one factor is sufficient is instead the likelihood \(\chi^{2}\) of 2.98 on 2 \(df\) with \(p \approx 0.23\); we therefore cannot reject the one-factor model.
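The corresponding p-value can be computed the same way:
pchisq(2.98, df = 2, lower.tail = FALSE)  # ~0.23, matching the reported prob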
For comparison, we also fit a model with three factors; as expected, the null-model \(\chi^{2}\) of 228.97 is unchanged.
factorAnalysis3 <- fa(factCov, nfactors = 3, n.obs = 98, rotate = "none", fm = "mle")
print(factorAnalysis3)
Factor Analysis using method = ml
Call: fa(r = factCov, nfactors = 3, n.obs = 98, rotate = "none", fm = "mle")
Standardized loadings (pattern matrix) based upon correlation matrix
ML1 ML3 ML2 h2 u2 com
V1 0.31 -0.35 0.05 0.22 0.778 2.0
V2 0.98 0.00 0.19 0.99 0.005 1.1
V3 0.56 0.38 0.14 0.48 0.524 1.9
V4 0.98 0.00 -0.19 0.99 0.005 1.1
ML1 ML3 ML2
SS loadings 2.32 0.27 0.10
Proportion Var 0.58 0.07 0.02
Cumulative Var 0.58 0.65 0.67
Proportion Explained 0.86 0.10 0.04
Cumulative Proportion 0.86 0.96 1.00
Mean item complexity = 1.5
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the null model are 6 and the objective function was 2.41 with Chi Square of 228.97
The degrees of freedom for the model are -3 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is NA
The harmonic number of observations is 98 with the empirical chi square 0 with prob < NA
The total number of observations was 98 with Likelihood Chi Square = 0 with prob < NA
Tucker Lewis Index of factoring reliability = 1.028
Fit based upon off diagonal values = 1
Measures of factor score adequacy
ML1 ML3 ML2
Correlation of (regression) scores with factors 1.00 0.55 0.97
Multiple R square of scores with factors 1.00 0.31 0.94
Minimum correlation of possible factor scores 0.99 -0.39 0.87
Factor scores estimated using the regression method have correlations of
ML1 ML3 ML2
Correlation of scores with factors 0.06 0.05 0.04
Multiple R square of scores with factors 0.00 0.00 0.00
Minimum correlation of possible factor scores -0.99 -1.00 -1.00
We note that factor ML1 has substantial loadings on all four variables; ML2's loadings, though spread across all four variables, are uniformly weak; and ML3 loads only on variables V1 (negatively) and V3.
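This pattern is easier to see when small loadings are suppressed in the printout (a cutoff of 0.3 is a conventional, if arbitrary, choice):
print(factorAnalysis3$loadings, cutoff = 0.3)  # hide loadings below 0.3 in absolute value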