This document explains how to plot PCA and clustering results using {ggplot2} and {ggfortify}.


First, install ggfortify from CRAN.
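A minimal sketch of the installation step, assuming a standard R setup with access to a CRAN mirror. The requireNamespace guard is an addition for convenience, so the package is only downloaded when it is not already installed.

```r
# Install ggfortify from CRAN only if it is not already available
if (!requireNamespace("ggfortify", quietly = TRUE)) {
  install.packages("ggfortify")
}
```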


Plotting PCA (Principal Component Analysis)

{ggfortify} lets {ggplot2} know how to interpret PCA objects. After loading {ggfortify}, you can use the ggplot2::autoplot function on stats::prcomp and stats::princomp objects.

library(ggfortify)
df <- iris[c(1, 2, 3, 4)]  # keep only the four numeric measurement columns
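As a hedged sketch of the simplest case, the code below runs PCA on the numeric columns and hands the result straight to autoplot, which draws the first two principal components. The variable names pca and p are illustrative only and do not appear elsewhere in this document.

```r
library(ggfortify)

df <- iris[c(1, 2, 3, 4)]

# Confirm that every remaining column is numeric before running PCA
all_numeric <- all(vapply(df, is.numeric, logical(1)))

pca <- prcomp(df)    # stats::prcomp result, class "prcomp"
p <- autoplot(pca)   # a ggplot object showing PC1 against PC2
print(p)
```

Because the result is an ordinary ggplot object, it can be stored, printed, or extended with further {ggplot2} layers.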

The data passed to PCA should contain only numeric values. To colour points by a non-numeric variable from the original data, pass the original data with the data keyword and name the column with the colour keyword. Use help(autoplot.prcomp) (or help(autoplot.*) for any other object) to check the available options.

autoplot(prcomp(df), data = iris, colour = 'Species')
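For comparison, here is a rough sketch of how a similar plot could be built by hand with {ggplot2} alone. This is not necessarily how autoplot works internally; it only illustrates why the original data is needed to colour by Species. The names scores and p_manual are hypothetical, and the axis values may differ from the autoplot output because autoplot applies its own scaling to the scores by default.

```r
library(ggplot2)

df <- iris[c(1, 2, 3, 4)]
pca <- prcomp(df)

# pca$x holds the principal component scores, one row per observation
scores <- as.data.frame(pca$x)
scores$Species <- iris$Species   # re-attach the non-numeric column

p_manual <- ggplot(scores, aes(x = PC1, y = PC2, colour = Species)) +
  geom_point()
print(p_manual)
```

Passing data = iris to autoplot saves this manual step of joining the scores back to the original columns.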

Passing label = TRUE draws a label for each data point, using the row names.

autoplot(prcomp(df), data = iris, colour = 'Species', label = TRUE, label.size = 3)

Passing shape = FALSE draws the plot without points. In this case, labels are turned on unless otherwise specified.

autoplot(prcomp(df), data = iris, colour = 'Species', shape = FALSE, label.size = 3)