If doing research on a social issue like homelessness in the U.S., we could have a \( 3\times 50 \) matrix of data:
Homelessness in the U.S. is related to many factors: poverty levels, unemployment rates, median rent, median income, social services [1].
Question: What if I want to consider more than \( p=3 \) factors?
Answer: To plot \( n \) data vectors with \( p\geq 4 \) entries, we must select only \( 3 \) entries to view at a time.
Plotting the whole matrix of data to gain the necessary intuition is no longer possible.
Overview of Presentation:
[1] Theory behind Principal Component Analysis
[2] Applying PCA to a small dataset for each of the 50 states
To retain most of the “information” in our data, we want to preserve the majority of its variance.
Goal of PCA is to find orthogonal directions (unit vectors) of maximum variance in our data.
Recall our general \( p\times n \) matrix of data, \[ \small{ W=\begin{bmatrix}\mathbf{x}_1&\mathbf{x}_2&\cdots&\mathbf{x}_n\end{bmatrix} } \]
Column vectors are elements of \( \mathbb{R}^p \), \[ \small{ \begin{aligned} &\mathbf{x}_i=\begin{bmatrix}x_{i1}\\\vdots\\x_{ip}\end{bmatrix},&1\leq i\leq n \end{aligned} } \]
Mean of original data is, \[ \small{ \mathbf{\bar{x}}=\frac{1}{n}\sum_{i=1}^n\mathbf{x}_i } \]
Variance of original data is captured by covariance matrix, \[ \small{ S=\frac{1}{n-1}\sum_{i=1}^n(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})^T } \]
Covariance matrix is necessarily a \( p\times p \) symmetric matrix.
Variances of the \( p \) variates are on the diagonal, covariances on the off-diagonal.
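These quantities are easy to compute directly; a minimal sketch in R, using illustrative random data rather than the homelessness dataset:
# Sketch: sample mean vector and covariance matrix from a p x n data matrix W
# (columns are observations); W here is illustrative random data.
set.seed(1)
p <- 3; n <- 50
W <- matrix(rnorm(p * n), nrow = p, ncol = n)
x_bar <- rowMeans(W)               # mean of the n column vectors
B <- W - x_bar                     # mean-deviation form (x_i - x_bar in each column)
S <- (B %*% t(B)) / (n - 1)        # p x p sample covariance matrix
all.equal(S, cov(t(W)))            # matches R's built-in cov(), which expects rows = observations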
To find the first unit vector \( \mathbf{w}\in\mathbb{R}^p \) pointing in the direction of maximum variance, project the data onto \( \mathbf{w} \): let \( y_i=\mathbf{w}^T\mathbf{x}_i \), so that \( \bar{y}=\mathbf{w}^T\bar{\mathbf{x}} \).
Formula for variance [8]: \[ \small{ \mathrm{Var}(\mathbf{y})=\frac{1}{n-1}\sum_{i=1}^n(y_i-\bar{y})^2 } \]
A nice result: since \( y_i-\bar{y}=\mathbf{w}^T(\mathbf{x}_i-\bar{\mathbf{x}}) \), \[ \small{ \begin{aligned} \mathrm{Var}(\mathbf{y})&=\frac{1}{n-1}\sum_{i=1}^n\mathbf{w}^T(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})^T\mathbf{w}\\ &=\mathbf{w}^T\left(\frac{1}{n-1}\sum_{i=1}^n(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})^T\right)\mathbf{w}\\ &=\mathbf{w}^TS\mathbf{w} \end{aligned} } \]
To find the unit \( \mathbf{w} \) that gives the direction of maximum variance in our data, we maximize, \[ \boxed{ \begin{aligned} &\mathbf{w}^TS\mathbf{w},&\left\lVert\mathbf{w}\right\rVert=1 \end{aligned} } \]
For an \( n\times n \) symmetric matrix \( A \), an important theorem tells us how to find the maximum of the quadratic form \( \mathbf{x}^TA\mathbf{x} \) over unit vectors, \[ \small{ M=\mathrm{Max}\{\mathbf{x}^TA\mathbf{x}\, | \, \left\lVert\mathbf{x}\right\rVert=1\} } \]
Pertinent to economics, physics, signal processing [4].
( Include picture of Quadratic form here! )
Theorem: Maximizing \( \{\mathbf{w}^TS\mathbf{w}\,| \,\left\lVert\mathbf{w}\right\rVert=1 \} \)
Let \( \lambda_1\geq\lambda_2\geq\cdots\geq\lambda_n \) be the eigenvalues of \( S \) in descending order, and let \( \mathbf{u}_1,\mathbf{u}_2,\ldots,\mathbf{u}_n \) be the corresponding orthonormal eigenvectors of \( S \).
Then, \[ \small{ M=\mathrm{Max}\{\mathbf{w}^TS\mathbf{w} \, | \, \left\lVert\mathbf{w}\right\rVert=1\}=\lambda_1 } \]
and the maximum is attained at \( \mathbf{w}=\mathbf{u}_1 \): \[ \small{ \lambda_1=\mathbf{u}_1^TS\mathbf{u}_1 } \]
Conclusion: The direction of maximum variance is given by the unit eigenvector of \( S \) corresponding to the largest eigenvalue of \( S \).
For \( k=2,\ldots,n \) the maximum value of \( \mathbf{w}^TS\mathbf{w} \) where, \[ \small{ \begin{aligned} &\mathbf{w}^T\mathbf{w}=1,&\mathbf{w}^T\mathbf{u}_1=0&,\ldots,&\mathbf{w}^T\mathbf{u}_{k-1}=0 \end{aligned} } \]
is \( \lambda_k \) and occurs when \( \mathbf{w}=\mathbf{u}_k \) [4].
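A quick numerical illustration of the theorem in R (a sketch; \( S \) here is just an arbitrary symmetric positive semi-definite matrix, not the homelessness covariance):
# Sketch: the maximum of w' S w over unit vectors w is the largest eigenvalue,
# attained at the corresponding unit eigenvector.
set.seed(2)
M0 <- matrix(rnorm(9), 3, 3)
S  <- crossprod(M0)                      # arbitrary symmetric PSD matrix, t(M0) %*% M0
e  <- eigen(S, symmetric = TRUE)         # eigenvalues returned in decreasing order
u1 <- e$vectors[, 1]
drop(t(u1) %*% S %*% u1)                 # equals e$values[1]
# random unit vectors never exceed lambda_1:
max(replicate(10000, { w <- rnorm(3); w <- w / sqrt(sum(w^2)); drop(t(w) %*% S %*% w) }))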
Main Idea: The orthonormal eigenvectors of \( S \) are the data's orthogonal directions of maximum variance, i.e., the principal components.
To find Principal Components, use SVD, \[ \small{ \begin{aligned} \boxed{A=U\Sigma V^T} \end{aligned} } \]
Singular values \( \sigma_1,\ldots,\sigma_n \) of \( A \) are the square roots of the eigenvalues of \( A^TA \), \[ \small{ \sigma_i=\sqrt{\lambda_i},\quad i=1,\ldots,n } \] where \( A^TA\mathbf{v}_i=\lambda_i\mathbf{v}_i \).
Left singular vectors of \( A \) are columns of \( U \)
Right singular vectors of \( A \) are the columns of \( V \) and are orthonormal eigenvectors of \( A^TA \),
\[ \small{ \begin{aligned} &\mathbf{v}_1,\ldots,\mathbf{v}_n, & A^TA\mathbf{v}_i=\lambda_i\mathbf{v}_i \end{aligned} } \]
If we consider data \( W \) in mean-deviation form, \[ \small{ B=\begin{bmatrix}\mathbf{x}_1-\bar{\mathbf{x}}&\mathbf{x}_2-\bar{\mathbf{x}}&\cdots&\mathbf{x}_n-\bar{\mathbf{x}}\end{bmatrix} } \]
Apply the SVD to \( A=\frac{1}{\sqrt{n-1}}B^T \), so that we get the eigenvalues and eigenvectors of \[ \small{ \begin{aligned} A^TA&=\left(\frac{1}{\sqrt{n-1}}B^T\right)^T\left(\frac{1}{\sqrt{n-1}}B^T\right)\\ &=\frac{1}{n-1}BB^T\\ &=S \end{aligned} } \]
Thus, applying the SVD to \( \frac{1}{\sqrt{n-1}}B^T \), \[ \small{ \frac{1}{\sqrt{n-1}}B^T=U\Sigma V^T } \] gives the eigenvalues and eigenvectors of \( S \).
Singular values of \( \frac{1}{\sqrt{n-1}}B^T \) are square roots of eigenvalues of \( S \).
Right singular vectors are the columns of \( V \) and are orthonormal eigenvectors of \( S \).
Thus, the right singular vectors (the columns of \( V \)) are the principal components of the data.
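This relationship is easy to verify in R; a sketch using the same kind of illustrative random data as above:
# Sketch: the SVD of (1/sqrt(n-1)) * t(B) recovers the eigenvalues and
# eigenvectors of S (illustrative data, not the homelessness dataset).
set.seed(1)
p <- 3; n <- 50
W <- matrix(rnorm(p * n), nrow = p, ncol = n)
B <- W - rowMeans(W)
S <- (B %*% t(B)) / (n - 1)

sv <- svd(t(B) / sqrt(n - 1))            # A = (1/sqrt(n-1)) * B^T, an n x p matrix
sv$d^2                                   # squared singular values ...
eigen(S, symmetric = TRUE)$values        # ... equal the eigenvalues of S
sv$v                                     # columns of V: orthonormal eigenvectors of S
                                         # (up to sign), i.e. the principal components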
Sources: U.S. Census Bureau and U.S. Department of Housing and Urban Development.
Data: \( n=50 \) vectors of form, \[ \small{ \mathbf{x}_i=\begin{bmatrix}h\\cb\\p\end{bmatrix} } \]
\( h=\mathrm{percent\,of\,state\,population\,that\,is\,homeless} \)
\( cb=\mathrm{percent\,of\,state\,population\,that\,is\,cost\,burdened} \)
\( p=\mathrm{percent\,of\,state\,population\,in\,deep\,poverty} \)
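The printed output below could be produced by a few lines of R; a sketch, where the object W is assumed to hold the \( 3\times 50 \) matrix with rows \( h \), \( cb \), \( p \) (the data-loading steps from the Census/HUD tables are omitted):
# Sketch of the analysis; `W` (3 x 50, rows h, cb, p; one column per state)
# is an assumed object -- reading it from the source files is not shown.
n  <- ncol(W)
B  <- W - rowMeans(W)                         # mean-deviation form
sv <- svd(t(B) / sqrt(n - 1))                 # SVD of (1/sqrt(n-1)) * B^T

print("d=");                   print(sv$d)    # singular values
print("Eigenvalues of S=");    print(sv$d^2)  # eigenvalues of the covariance matrix
print("Total Variance");       print(sum(sv$d^2))
print("First six rows of u="); print(head(sv$u))
print("v=");                   print(sv$v)    # columns are the principal components
for (k in 1:3) {
  print(paste0("% total variance explained by PC", k, "="))
  print(100 * sv$d[k]^2 / sum(sv$d^2))
}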
Using R's built-in svd() function to find the principal components of the data:
[1] "d="
[1] 4.10582391 1.23541736 0.09071855
[1] "Eigenvalues of S="
[1] 16.857790020 1.526256047 0.008229856
[1] "Total Variance"
[1] 18.39228
[1] "First six rows of u="
[,1] [,2] [,3]
[1,] 0.11619489 0.14915260 -0.11649125
[2,] -0.12755129 -0.21413139 0.15892807
[3,] 0.08573686 0.17879062 0.03870884
[4,] -0.04074753 0.16857778 -0.09683053
[5,] 0.21843789 -0.03495111 0.14053457
[6,] 0.04246217 -0.05934145 0.17601419
[1] "v="
[,1] [,2] [,3]
[1,] 0.00948655 -0.008718115 0.999916996
[2,] 0.99333641 -0.114778510 -0.010424853
[3,] 0.11485987 0.993352852 0.007571169
[1] "% total variance explained by PC1="
[1] 91.6569
[1] "% total variance explained by PC2="
[1] 8.298353
[1] "% total variance explained by PC3="
[1] 0.04474626
We can create a homelessness index based on our data's first principal component, \[ \small{ \mathbf{v}_1=\begin{bmatrix}0.00949\\0.99334\\0.11486\end{bmatrix} } \]
The index provides a quick scalar quantification of the “strength” of each state's relationship to the factors related to homelessness.
\[ \small{ h_{\mathrm{index}} = 0.00949\hat{h}+0.99334\hat{cb}+0.11486\hat{p} } \] where \( \hat{h},\hat{cb},\hat{p} \) are the values of our data in mean-deviation form.
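A sketch of how the index could be computed for every state in R, continuing from the objects in the sketch above (W and sv are assumed; state labels are omitted):
# Sketch: project the mean-deviation data onto the first principal component
# to get one index value per state. `W` and `sv` are as in the sketch above.
B       <- W - rowMeans(W)                 # hat(h), hat(cb), hat(p) for each state
v1      <- sv$v[, 1]                       # first principal component
h_index <- drop(t(B) %*% v1)               # length-50 vector of index scores

# States with the largest positive index sit furthest above average along the
# cost-burden / deep-poverty dominated direction of the data.
head(sort(h_index, decreasing = TRUE))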
[1] Chris Glynn, Thomas H. Byrne, Dennis P. Culhane. “Inflection points in community-level homeless rates.” The Annals of Applied Statistics, 15(2) 1037-1053 June 2021. https://doi.org/10.1214/20-AOAS1414
[2] “2007 - 2021 Point-in-Time Estimates by State.” PIT and HIC Data Since 2007, U.S. Department of Housing and Urban Development, 2022.
[3] Jolliffe IT, Cadima J. 2016 Principal component analysis: a review and recent developments. Phil. Trans. R. Soc. A 374: 20150202. http://dx.doi.org/10.1098/rsta.2015.0202
[4] Lay, David C., et al. Linear Algebra and Its Applications. Sixth ed., Pearson, 2022.
[5] “DECENNIALCD1132010.” P1: Census Bureau Table, United States Census Bureau, 2010, https://tinyurl.com/h8nzjpmv. Accessed 27 Nov. 2022.
[6] American Community Survey. “ACSDP1Y2010.” DP04: Census Bureau Table, United States Census Bureau, 2010, https://tinyurl.com/y2xzxt2r. Accessed 27 Nov. 2022.
[7] American Community Survey. “ACSST1Y2010.” S1701: Poverty Status in the Past 12 Months, United States Census Bureau, 2010, https://tinyurl.com/5etsm7xm. Accessed 27 Nov. 2022.
[8] Li, Xiaodong. “STA135 Lecture 2 : Sample Mean Vector and Sample Covariance Matrix.” STA135. Davis, California, https://www.stat.ucdavis.edu/~xdgli/Xiaodong_Li_Teaching_files/135Note2.pdf. Accessed 24 Oct. 2022.
[9] Henderson, Thomas C. “A Geometric Interpretation of the Covariance Matrix.” CS 4640 Fall Semester 2019. https://www.cs.utah.edu/~tch/CS4640F2019/resources/A%20geometric%20interpretation%20of%20the%20covariance%20matrix.pdf. Accessed 30 Nov. 2022.
[10] “What Is ‘Deep Poverty’?” Center for Poverty and Inequality Research, University of California, Davis, https://poverty.ucdavis.edu/faq/what-deep-poverty.
[11] “Rental Burdens: Rethinking Affordability Measures: HUD User.” Rental Burdens: Rethinking Affordability Measures | HUD USER, PD&R Edge, https://www.huduser.gov/portal/pdredge/pdr_edge_featd_article_092214.html.
[12] “Principal Component Analysis from Scratch in Python.” AskPython, AskPython, 19 Oct. 2020, https://www.askpython.com/python/examples/principal-component-analysis.
[13] Newsom, Jason T. “Principal Component Analysis.” Psy 523/623 Structural Equation Modeling, Spring 2020. https://web.pdx.edu/~newsomj/semclass/ho_pca.pdf. Accessed 30 Nov. 2022.
[14] “Homelessness: A State of Emergency.” Homelessness: A State of Emergency, National Alliance to End Homelessness, 18 Oct. 2018, https://endhomelessness.org/resource/homelessness-a-state-of-emergency/.
[15] West, Mary, and Karin Gepp. “Maslow's Hierarchy of Needs Pyramid: Uses and Criticism.” Medical News Today, MediLexicon International, 28 July 2022, https://www.medicalnewstoday.com/articles/maslows-hierarchy-of-needs.
[16] US Census Bureau. “Poverty Thresholds.” Census.gov, United States Census Bureau, 13 Sept. 2022, https://www.census.gov/data/tables/time-series/demo/income-poverty/historical-poverty-thresholds.html.
[17] Chartier, Timothy P. When Life Is Linear: From Computer Graphics to Bracketology. Mathematical Association of America, 2015.
[18] “RGB Image.” DataGenetics, https://datagenetics.com/blog/august32013/index.html. Accessed 2 Dec. 2022.
Theorem: Finding \( M=\mathrm{Max}\{\mathbf{w}^TS\mathbf{w} \, | \, \left\lVert\mathbf{w}\right\rVert=1\} \):
By the Spectral Theorem for Symmetric Matrices [4], we can orthogonally diagonalize \( S \): \[ \begin{aligned} S&=PDP^{-1}\\ &=\begin{bmatrix}\mathbf{u}_1&\cdots&\mathbf{u}_n\end{bmatrix}\begin{bmatrix}\lambda_1&\cdots&0\\\vdots&\ddots&\vdots\\0&\cdots&\lambda_n\end{bmatrix}\begin{bmatrix}\mathbf{u}_1^T\\\vdots\\\mathbf{u}_n^T\end{bmatrix} \end{aligned} \]
\( \lambda_1\geq\lambda_2\geq\ldots\geq\lambda_n \) are eigenvalues of \( S \)
\( \mathbf{u}_1,\ldots,\mathbf{u}_n \) are corresponding orthonormal eigenvectors of \( S \)
\[ \begin{aligned} &S=PDP^{-1} &D=P^{-1}SP \end{aligned} \]
Perform the orthogonal change of variable, \[ \begin{aligned} &\mathbf{w}=P\mathbf{y}, &P^T=P^{-1} \end{aligned} \] Then the quadratic form becomes, \[ \begin{aligned} \mathbf{w}^TS\mathbf{w}&=(P\mathbf{y})^TS(P\mathbf{y})\\ &=\mathbf{y}^T(P^TSP)\mathbf{y}\\ &=\mathbf{y}^TD\mathbf{y} \end{aligned} \] Since \( P \) is orthogonal, \( \left\lVert\mathbf{w}\right\rVert=\left\lVert P\mathbf{y}\right\rVert=\left\lVert\mathbf{y}\right\rVert \), so the constraint \( \left\lVert\mathbf{w}\right\rVert=1 \) becomes \( \left\lVert\mathbf{y}\right\rVert=1 \).
For concreteness, suppose \( S \) is a \( 3\times 3 \) matrix; the same argument gives \( M=\mathrm{Max}\{\mathbf{w}^TS\mathbf{w}\, | \, \left\lVert\mathbf{w}\right\rVert=1\} \) for any \( n \).
If \( S \) is a \( 3\times 3 \) matrix, \[ \begin{aligned} \mathbf{y}^TD\mathbf{y}&=\begin{bmatrix}y_1&y_2&y_3\end{bmatrix}\begin{bmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{bmatrix}\begin{bmatrix}y_1\\y_2\\y_3\end{bmatrix}\\ &=\lambda_1y_1^2+\lambda_2y_2^2+\lambda_3y_3^2 \end{aligned} \]
Since \[ \lambda_1\geq\lambda_2\geq\lambda_3 \]
We have, \[ \begin{aligned} \mathbf{y}^TD\mathbf{y}&=\lambda_1y_1^2+\lambda_2y_2^2+\lambda_3y_3^2\\ &\leq\lambda_1y_1^2+\lambda_1y_2^2+\lambda_1y_3^2\\ &=\lambda_1(y_1^2+y_2^2+y_3^2)\\ &=\lambda_1\left\lVert\mathbf{y}\right\rVert^2\\ &=\lambda_1 \end{aligned} \]
For \( \mathbf{y}=\begin{bmatrix}1\\0\\0\end{bmatrix} \), \[ \begin{aligned} \mathbf{y}^TD\mathbf{y}&=\begin{bmatrix}1&0&0\end{bmatrix}\begin{bmatrix}\lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3\end{bmatrix}\begin{bmatrix}1\\0\\0\end{bmatrix}\\ &=\lambda_1\\ \end{aligned} \]
Thus, \( \mathbf{y}^TD\mathbf{y}=\mathrm{Max}\{\mathbf{y}^TD\mathbf{y} \, | \, \left\lVert\mathbf{y}\right\rVert=1\}=\lambda_1 \) when \( \mathbf{y}=\mathbf{e}_1 \).
For \( \mathbf{y}=\mathbf{e}_1 \), \[ \begin{aligned} \mathbf{w}&=P\mathbf{y}\\ &=\begin{bmatrix}\mathbf{u}_1&\mathbf{u}_2&\mathbf{u}_3\end{bmatrix}\begin{bmatrix}1\\0\\0\end{bmatrix}\\ &=\mathbf{u}_1 \end{aligned} \]
Thus, the unit \( \mathbf{w} \) that maximizes \( \mathbf{w}^TS\mathbf{w} \) is the unit eigenvector of \( S \) corresponding to its largest eigenvalue.
SVD answers the question of how much a matrix “stretches” a unit vector \( \mathbf{x} \).
Note: the same unit vector \( \mathbf{x} \) that maximizes \( \left\lVert A\mathbf{x}\right\rVert \) also maximizes
\[ \begin{aligned} \left\lVert A\mathbf{x}\right\rVert^2&=(A\mathbf{x})^T(A\mathbf{x})\\ &=\mathbf{x}^TA^TA\mathbf{x}\\ &=\mathbf{x}^T(A^TA)\mathbf{x} \end{aligned} \]
We know from the previous theorem that the maximum of \( \mathbf{x}^T(A^TA)\mathbf{x} \) is given by the greatest eigenvalue of \( A^TA \).
Maximum occurs when \( \mathbf{x}=\mathbf{v}_1 \) where \( A^TA\mathbf{v}_1=\lambda_1\mathbf{v}_1 \)
\( A \) stretches \( \mathbf{v}_1 \) the most, by the amount \( \sigma_1=\sqrt{\lambda_1} \), and then stretches in mutually orthogonal directions by the remaining singular values \( \sigma_i=\sqrt{\lambda_i} \), the square roots of the eigenvalues of \( A^TA \) in decreasing order.
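A small numerical check in R (a sketch; \( A \) here is an arbitrary random matrix, not the data):
# Sketch: A stretches the first right singular vector the most, by sigma_1 = sqrt(lambda_1).
set.seed(3)
A  <- matrix(rnorm(12), nrow = 4, ncol = 3)
sv <- svd(A)
sqrt(sum((A %*% sv$v[, 1])^2))                         # ||A v_1|| ...
sv$d[1]                                                # ... equals sigma_1 ...
sqrt(eigen(crossprod(A), symmetric = TRUE)$values[1])  # ... = sqrt of the largest eigenvalue of A^T A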
Detail: verifying that \( \frac{1}{n-1}BB^T=S \): \[ \begin{aligned} \frac{1}{n-1}BB^T&=\frac{1}{n-1}\begin{bmatrix}\mathbf{x}_1-\bar{\mathbf{x}}&\mathbf{x}_2-\bar{\mathbf{x}}&\cdots&\mathbf{x}_n-\bar{\mathbf{x}}\end{bmatrix}\begin{bmatrix}(\mathbf{x}_1-\bar{\mathbf{x}})^T\\(\mathbf{x}_2-\bar{\mathbf{x}})^T\\\vdots\\(\mathbf{x}_n-\bar{\mathbf{x}})^T\end{bmatrix}\\ &=\frac{1}{n-1}\sum_{i=1}^n\begin{bmatrix}x_{1i}-\bar{x}_1\\\vdots\\x_{pi}-\bar{x}_p\end{bmatrix}\begin{bmatrix}x_{1i}-\bar{x}_1&\cdots&x_{pi}-\bar{x}_p\end{bmatrix}\\ &=\frac{1}{n-1}\sum_{i=1}^n(\mathbf{x}_i-\mathbf{\bar{x}})(\mathbf{x}_i-\mathbf{\bar{x}})^T\\ &=S \end{aligned} \]
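The identity can also be checked numerically; a sketch in R comparing the matrix product with the explicit sum of rank-one outer products (illustrative random data):
# Sketch: B %*% t(B) equals the sum of the outer products (x_i - xbar)(x_i - xbar)^T.
set.seed(1)
W <- matrix(rnorm(3 * 50), nrow = 3)
B <- W - rowMeans(W)
S_sum <- Reduce(`+`, lapply(seq_len(ncol(B)), function(i) tcrossprod(B[, i]))) / (ncol(W) - 1)
all.equal(S_sum, (B %*% t(B)) / (ncol(W) - 1))   # TRUE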