QUESTION: How do I make a scatterplot matrix that shows the correlations?

A scatterplot matrix is a grid of scatterplots, one for each pair of numeric variables (columns) in a data set.

Data

We’ll use the “palmerpenguins” package (https://allisonhorst.github.io/palmerpenguins/) to address this question. If you have not installed the package before, install it with install.packages("palmerpenguins"). Then call library(palmerpenguins) and load the data with data(penguins).

#install.packages("palmerpenguins")
library(palmerpenguins)
## Warning: package 'palmerpenguins' was built under R version 4.1.2
library(pander)
## Warning: package 'pander' was built under R version 4.1.2
data(penguins)

We’ll subset the numeric measurement columns to set up the scatterplot matrix.

# columns 3 to 6 hold the four body measurements
penguins.numeric <- data.frame(penguins[, 3:6])
# display only the top 10 rows
pander(penguins.numeric[1:10, ])
bill_length_mm   bill_depth_mm   flipper_length_mm   body_mass_g
39.1             18.7            181                 3750
39.5             17.4            186                 3800
40.3             18              195                 3250
NA               NA              NA                  NA
36.7             19.3            193                 3450
39.3             20.6            190                 3650
38.9             17.8            181                 3625
39.2             19.6            195                 4675
34.1             18.1            193                 3475
42               20.2            190                 4250
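
Selecting columns by position works only while the column order stays the same, and the fourth row shown above is entirely NA. As a minimal sketch, the same subset can be taken by column name, with incomplete rows dropped explicitly. The data frame `toy` is a hypothetical stand-in built from the first four rows printed above; its species and year values are illustrative placeholders. With palmerpenguins loaded, `penguins` would take its place.

```r
# Hypothetical stand-in for penguins: measurements are the first four rows
# printed above; species and year are illustrative placeholders.
toy <- data.frame(
  species           = c("Adelie", "Adelie", "Adelie", "Adelie"),
  bill_length_mm    = c(39.1, 39.5, 40.3, NA),
  bill_depth_mm     = c(18.7, 17.4, 18.0, NA),
  flipper_length_mm = c(181, 186, 195, NA),
  body_mass_g       = c(3750, 3800, 3250, NA),
  year              = c(2007, 2007, 2007, 2007)
)

# Select the measurement columns by name instead of by position.
measure.cols <- c("bill_length_mm", "bill_depth_mm",
                  "flipper_length_mm", "body_mass_g")
toy.numeric <- toy[, measure.cols]

# Keep only rows with no missing values in any measurement column.
toy.complete <- toy.numeric[complete.cases(toy.numeric), ]
print(toy.complete)
```

Selecting by name also avoids picking up year, which is numeric but not a body measurement.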

Make scatterplot matrix

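We will use the plot() function to make the scatterplot matrix. When plot() is given a data frame with more than two columns, it calls pairs() and draws a scatterplot for each pair of variables. The panels show the relationship between each pair visually, so you can judge the direction and strength of each correlation by eye; the correlation coefficients themselves are not printed.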

par(mfrow = c(2,2), mar = c(3,1,3,1))
plot(penguins.numeric, main="Penguins")
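
To put actual correlation coefficients next to the scatterplots, one option is to compute them with cor() and to pass a custom panel function to pairs(). The following is a minimal sketch, not the only way to do it. The helper `panel.cor` is a hypothetical name, modeled on the kind of panel function shown in the pairs() help page, and `penguins.sample` is a stand-in built from the ten rows printed earlier so that the block runs on its own. With only ten rows the coefficients are illustrative; with palmerpenguins loaded, substitute the full penguins.numeric.

```r
# Stand-in data: the ten rows printed earlier in this document.
# With palmerpenguins loaded, use the full penguins.numeric instead.
penguins.sample <- data.frame(
  bill_length_mm    = c(39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, 42.0),
  bill_depth_mm     = c(18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, 20.2),
  flipper_length_mm = c(181, 186, 195, NA, 193, 190, 181, 195, 193, 190),
  body_mass_g       = c(3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, 4250)
)

# Pearson correlation matrix; rows containing NA are dropped first.
cor.matrix <- cor(penguins.sample, use = "complete.obs")
print(round(cor.matrix, 2))

# Hypothetical helper: write the correlation coefficient into a panel.
panel.cor <- function(x, y, digits = 2, ...) {
  old.usr <- par("usr")
  on.exit(par(usr = old.usr))       # restore the panel's coordinates on exit
  par(usr = c(0, 1, 0, 1))          # unit coordinates so text can be centered
  r <- cor(x, y, use = "complete.obs")
  text(0.5, 0.5, format(round(r, digits), nsmall = digits))
}

pairs(penguins.sample,
      upper.panel = panel.cor,      # coefficients above the diagonal
      lower.panel = panel.smooth,   # scatterplots with a smoother below
      main = "Penguins")
```

Each coefficient in the upper triangle mirrors the scatterplot in the matching position of the lower triangle, so both can be read together.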

Additional Reading

For more information on this topic, general R resources such as https://www.statmethods.net/index.html, https://r-charts.com/, http://www.r-tutor.com/, and http://www.sthda.com/ are good places to start (http://www.sthda.com/ is run by the author of ggpubr and has lots of resources for it).

Keywords

  • palmerpenguins
  • scatterplot matrix
  • correlation
  • plot