PCA (Principal Components Analysis) is easy in R, but the standard biplot() function is a little clunky. The vegan package can do PCA using the rda() function (normally for redundancy analysis) and has some nice plotting functions. (Note that ggplot is also developing biplot tools).
data("iris")
summary(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width
## Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
## 1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
## Median :5.800 Median :3.000 Median :4.350 Median :1.300
## Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
## 3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
## Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
## Species
## setosa :50
## versicolor:50
## virginica :50
##
##
##
pairs(iris,
lower.panel = NULL,
col = as.numeric(iris$Species))
Note: can also be done in ggplot using GGalley::ggpairs; however, not updated for most recent version of R.
#"[,-5]" drops that last columns
##the species code
iris.pca <- princomp(iris[,-5])
biplot(iris.pca)
There are is also ggplot biplot code somewhere outthere on the internet.
library(vegan)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.4-0
my.rda <- rda(iris[,-5])
#"[,-5]" drops that last columns
##the species code
Note to get help when doing and RDA biplot, call ?biplot.rda; however, use biplot() for the actual plotting.
biplot(my.rda)
Note that vegan’s scaling is different than a normal biplot.
biplot(my.rda,
display = c("sites",
"species"),
type = c("text",
"points"))
#make basic plot
biplot(my.rda,
display = c("sites",
"species"),
type = c("text",
"points"))
#Add "hulls"
ordihull(my.rda,
group = iris$Species)
#make basic plot
biplot(my.rda,
display = c("sites",
"species"),
type = c("text",
"points"))
#get names of species
## levels() extracts fator levels
## from the column
spp.names <- levels(iris$Species)
#Add hulls
ordihull(my.rda,
group = iris$Species,
col = c(1,2,3))
#Add legend
legend("topright",
col = c(1,2,3),
lty = 1,
legend = spp.names)