Factor Analysis (FA) and Principal Component Analysis (PCA) are multivariate statistical techniques used for data reduction, structure detection, and latent variable modeling. Both methods aim to summarize information contained in a large number of correlated variables into a smaller set of uncorrelated or less correlated variables, but their objectives, assumptions, and interpretations differ fundamentally.
PCA is primarily a descriptive technique that transforms observed variables into linear combinations called principal components. FA, on the other hand, is a model-based inferential technique that assumes observed variables are driven by a smaller number of unobserved latent factors.
Let \(X_1, X_2, \ldots, X_p\) denote the observed variables, with covariance matrix \(\boldsymbol{\Sigma}\).
For standardized variables, we work with the correlation matrix \(\mathbf{R}\).
The objective of PCA is to find linear combinations of the observed variables that successively capture the maximum remaining variance and are mutually uncorrelated.
The \(k\)-th principal component is given by
\[ Z_k = a_{k1}X_1 + a_{k2}X_2 + \cdots + a_{kp}X_p \]
subject to
\[ \sum_{j=1}^p a_{kj}^2 = 1, \qquad \operatorname{Cov}(Z_k, Z_l) = 0 \quad \text{for } l < k. \]
PCA is obtained by solving the eigenvalue problem:
\[ | \mathbf{R} - \lambda \mathbf{I} | = 0 \]
where the roots \(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0\) are the eigenvalues of \(\mathbf{R}\), and the coefficient vector \(\mathbf{a}_k = (a_{k1}, \ldots, a_{kp})'\) of the \(k\)-th component is the eigenvector associated with \(\lambda_k\).
The total variance is
\[ \sum_{j=1}^p \lambda_j = p \quad (\text{for standardized variables}) \]
The proportion of variance explained by the \(k\)-th component is
\[ \text{PV}_k = \frac{\lambda_k}{\sum_{j=1}^p \lambda_j} \]
Cumulative proportion:
\[ \text{CPV}_m = \sum_{k=1}^m \text{PV}_k \]
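As a minimal sketch of these steps (assuming NumPy and a raw data matrix `X`, both hypothetical here), the eigenvalues, explained-variance proportions, and component scores can be computed directly from the eigen-decomposition of \(\mathbf{R}\):

```python
import numpy as np

def pca_from_correlation(X):
    """PCA of standardized data via the eigen-decomposition of R.

    X : (n, p) array of raw observations (hypothetical input).
    Returns eigenvalues (descending), eigenvectors, PV, CPV, and scores.
    """
    # Standardize each variable to mean 0, variance 1
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)          # p x p correlation matrix

    # Solve |R - lambda I| = 0; eigh returns ascending order, so reverse
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    pv = eigvals / eigvals.sum()              # proportion of variance PV_k
    cpv = np.cumsum(pv)                       # cumulative proportion CPV_m
    scores = Z @ eigvecs                      # Z_k = a_k1 X_1 + ... + a_kp X_p
    return eigvals, eigvecs, pv, cpv, scores
```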
Consider the correlation matrix
\[ \mathbf{R} = \begin{pmatrix} 1 & 0.8 & 0.6 \\ 0.8 & 1 & 0.7 \\ 0.6 & 0.7 & 1 \end{pmatrix} \]
Eigenvalues:
\[ \lambda_1 \approx 2.40, \quad \lambda_2 \approx 0.41, \quad \lambda_3 \approx 0.18 \]
Thus, the first two components explain over 93% of the total variance, since \((2.40 + 0.41)/3 \approx 0.94\).
Component loadings are given by
\[ \ell_{jk} = \sqrt{\lambda_k} \, a_{jk} \]
They represent correlations between variables and components.
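For the example correlation matrix above, a quick NumPy check (a sketch, not part of the original notes) reproduces the eigenvalues and the loadings \(\ell_{jk} = \sqrt{\lambda_k}\,a_{jk}\):

```python
import numpy as np

R = np.array([[1.0, 0.8, 0.6],
              [0.8, 1.0, 0.7],
              [0.6, 0.7, 1.0]])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals)                              # approx. [2.40, 0.41, 0.18]
print(eigvals[:2].sum() / eigvals.sum())    # approx. 0.94

loadings = eigvecs * np.sqrt(eigvals)       # l_jk = sqrt(lambda_k) * a_jk
print(loadings)
```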
FA seeks to explain correlations among observed variables in terms of a small number of latent factors.
The basic factor model is
\[ X_i - \mu_i = \ell_{i1}F_1 + \ell_{i2}F_2 + \cdots + \ell_{im}F_m + \varepsilon_i \]
where \(F_1, \ldots, F_m\) are the common factors, \(\ell_{ij}\) are the factor loadings, and \(\varepsilon_i\) is the specific (unique) factor of \(X_i\). In matrix form,
\[ \mathbf{X} = \boldsymbol{\mu} + \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon} \]
Assumptions: \(E(\mathbf{F}) = \mathbf{0}\), \(\operatorname{Cov}(\mathbf{F}) = \mathbf{I}\), \(E(\boldsymbol{\varepsilon}) = \mathbf{0}\), \(\operatorname{Cov}(\boldsymbol{\varepsilon}) = \boldsymbol{\Psi}\) (diagonal), and \(\operatorname{Cov}(\mathbf{F}, \boldsymbol{\varepsilon}) = \mathbf{0}\).
Then
\[ \boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi} \]
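Under these assumptions, the decomposition \(\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}\) can be checked by simulation. The sketch below (assuming NumPy, with an arbitrary illustrative \(\mathbf{L}\) and \(\boldsymbol{\Psi}\) chosen for this example) generates data from the factor model and compares the sample covariance with the implied covariance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative loading matrix (p = 4 variables, m = 2 factors) and uniquenesses
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.8],
              [0.2, 0.7]])
Psi = np.diag([0.35, 0.47, 0.35, 0.47])     # diagonal Cov(eps), chosen so variances are 1

n = 200_000
F = rng.standard_normal((n, 2))                            # Cov(F) = I, E(F) = 0
eps = rng.multivariate_normal(np.zeros(4), Psi, size=n)    # Cov(eps) = Psi
X = F @ L.T + eps                                          # X = L F + eps (mu = 0)

Sigma_implied = L @ L.T + Psi
Sigma_sample = np.cov(X, rowvar=False)
print(np.max(np.abs(Sigma_sample - Sigma_implied)))        # small for large n
```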
For variable \(X_i\):
Communality: \[ h_i^2 = \sum_{j=1}^m \ell_{ij}^2 \]
Uniqueness: \[ \psi_i = 1 - h_i^2 \]
Communality represents the proportion of variance explained by common factors.
In principal axis factoring (PAF), communalities are iteratively estimated and substituted into the diagonal of \(\mathbf{R}\) before each eigen-decomposition, as sketched below.
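A minimal sketch of the PAF iteration (assuming NumPy; squared multiple correlations are used here as starting communalities, which is one common choice rather than the only one):

```python
import numpy as np

def principal_axis_factoring(R, m, max_iter=100, tol=1e-6):
    """Iterative principal axis factoring of a correlation matrix R with m factors."""
    p = R.shape[0]
    # Initial communalities: squared multiple correlations, 1 - 1/diag(R^-1)
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        Rr = R.copy()
        np.fill_diagonal(Rr, h2)                 # reduced correlation matrix
        eigvals, eigvecs = np.linalg.eigh(Rr)
        order = np.argsort(eigvals)[::-1][:m]    # keep the m largest roots
        lam, A = eigvals[order], eigvecs[:, order]
        lam = np.clip(lam, 0, None)              # guard against negative roots
        L = A * np.sqrt(lam)                     # loadings
        h2_new = np.sum(L**2, axis=1)            # updated communalities
        if np.max(np.abs(h2_new - h2)) < tol:
            h2 = h2_new
            break
        h2 = h2_new
    return L, h2
```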
Suppose the estimated loading matrix is
\[ \mathbf{L} = \begin{pmatrix} 0.80 & 0.10 \\ 0.75 & 0.20 \\ 0.10 & 0.85 \\ 0.20 & 0.80 \end{pmatrix} \]
Communalities:
\[ h_1^2 = 0.80^2 + 0.10^2 = 0.65 \]
\[ h_3^2 = 0.10^2 + 0.85^2 = 0.73 \]
Thus, variables 1 and 2 load strongly on Factor 1, while 3 and 4 load on Factor 2.
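The communalities and uniquenesses for this loading matrix can be verified in a few lines (a NumPy sketch):

```python
import numpy as np

L = np.array([[0.80, 0.10],
              [0.75, 0.20],
              [0.10, 0.85],
              [0.20, 0.80]])

h2 = np.sum(L**2, axis=1)    # communalities: [0.65, 0.6025, 0.7325, 0.68]
psi = 1.0 - h2               # uniquenesses
print(h2, psi)
```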
Initial factor solutions are often difficult to interpret. Rotation improves simple structure without changing communalities.
Varimax maximizes the variance of squared loadings:
\[ V = \sum_{j=1}^m \left[ \frac{1}{p} \sum_{i=1}^p \ell_{ij}^4 - \left( \frac{1}{p} \sum_{i=1}^p \ell_{ij}^2 \right)^2 \right] \]
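A compact way to maximize this criterion is the standard SVD-based rotation algorithm; the sketch below (assuming NumPy, not taken from the notes) returns the rotated loadings and the orthogonal rotation matrix:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a loading matrix L (p x m) to maximize the varimax criterion."""
    p, m = L.shape
    T = np.eye(m)          # accumulated orthogonal rotation
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # Target matrix for the orthogonal Procrustes step
        B = L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag(np.sum(Lr**2, axis=0)))
        U, s, Vt = np.linalg.svd(B)
        T = U @ Vt
        d = s.sum()
        if d_old != 0 and d < d_old * (1 + tol):
            break
        d_old = d
    return L @ T, T
```

Because the rotation matrix \(T\) is orthogonal, the rotated solution has the same communalities as the unrotated one.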
Oblique rotations (e.g., promax, direct oblimin) allow correlated factors and produce a pattern matrix (loadings), a structure matrix (correlations between variables and factors), and a factor correlation matrix.
| Aspect | PCA | FA |
|---|---|---|
| Nature | Descriptive | Model-based |
| Variance used | Total variance | Common variance |
| Error term | Not explicit | Explicit |
| Objective | Data reduction | Latent structure |
The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is
\[ \text{KMO} = \frac{\sum_{i \ne j} r_{ij}^2}{\sum_{i \ne j} r_{ij}^2 + \sum_{i \ne j} q_{ij}^2} \]
where \(r_{ij}\) are the observed correlations and \(q_{ij}\) are the partial (anti-image) correlations.
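Since the partial correlations can be obtained from the inverse of \(\mathbf{R}\), the overall KMO can be computed as follows (a NumPy sketch; the anti-image formula \(q_{ij} = -s_{ij}/\sqrt{s_{ii} s_{jj}}\) with \(\mathbf{S} = \mathbf{R}^{-1}\) is assumed):

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    S = np.linalg.inv(R)
    d = np.sqrt(np.diag(S))
    Q = -S / np.outer(d, d)              # partial (anti-image) correlations
    np.fill_diagonal(Q, 0.0)             # sums run over i != j only
    R_off = R - np.diag(np.diag(R))      # zero out the diagonal of R
    r2 = np.sum(R_off**2)
    q2 = np.sum(Q**2)
    return r2 / (r2 + q2)
```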
PCA and FA are powerful multivariate tools with distinct purposes. PCA is suitable for summarization and index construction, while FA is appropriate when the goal is to uncover latent constructs. Proper understanding of their mathematical foundations ensures correct application and interpretation in empirical research.
End of Notes