QUESTION: How do you make a scatterplot with color coding in ggpubr?

Here I will demonstrate how to add color coding using both a categorical and a continuous variable.

Data

We’ll use the “palmerpenguins” package (https://allisonhorst.github.io/palmerpenguins/) to address this question. You’ll need to install the package with install.packages("palmerpenguins") if you have not done so before, call library(palmerpenguins), and load the data with data(penguins).

#install.packages("palmerpenguins")
library(palmerpenguins)
data(penguins)
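
Before plotting, it can help to confirm how the two variables used for color coding below are stored. This is a minimal sketch, not part of the original walkthrough; the names sex_levels, n_missing_sex, and mass_class are illustrative only, and the package argument is added just to make sure the palmerpenguins version of the data is the one loaded.

```r
library(palmerpenguins)

# Load the palmerpenguins version of the data explicitly
data(penguins, package = "palmerpenguins")

# sex is stored as a factor, which is how R represents a categorical variable
sex_levels <- levels(penguins$sex)
sex_levels

# Count the rows where sex was not recorded (NA)
n_missing_sex <- sum(is.na(penguins$sex))
n_missing_sex

# body_mass_g is stored as a number, so it is treated as continuous
mass_class <- class(penguins$body_mass_g)
mass_class
```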

Making Scatterplots Using ggpubr

First, you need to install and load the ggpubr package. I commented out the first line in this code chunk since I’ve already installed ggpubr.

#install.packages("ggpubr")
library(ggpubr)
## Loading required package: ggplot2

Change color by a categorical variable

The main way to color code a scatterplot in ggpubr is through the color argument of the ggscatter() function.

A categorical variable is a qualitative variable that takes one of a limited number of distinct categories. Here, I am color coding by sex, whose values are male, female, or NA (missing). The x and y arguments of ggscatter() name the columns plotted on each axis, and the color argument names the variable used for color coding. All three column names are passed as quoted strings. Lastly, the data argument names the data frame that contains these columns.

ggscatter(y = "bill_depth_mm",
          x = "bill_length_mm",
          color = "sex",   
          data = penguins)
## Warning: Removed 2 rows containing missing values (geom_point).
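
If you would rather not plot the penguins whose sex is NA, or want to choose the colors yourself, one possible variation is sketched below. It assumes the same packages as above; penguins_sexed and p_sex are hypothetical names, and the two hex colors are arbitrary choices passed to the palette argument of ggscatter().

```r
library(ggpubr)
library(palmerpenguins)
data(penguins, package = "palmerpenguins")

# Keep only the rows where sex was recorded
penguins_sexed <- subset(penguins, !is.na(sex))

# Same scatterplot as above, with one custom color per category
p_sex <- ggscatter(data = penguins_sexed,
                   x = "bill_length_mm",
                   y = "bill_depth_mm",
                   color = "sex",
                   palette = c("#00AFBB", "#E7B800"))
p_sex
```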

Change color by a continuous variable

This code chunk looks the same as the previous one, except that the variable used for color is body_mass_g instead of sex. body_mass_g is a continuous variable, meaning that it can take any numeric value within a range, so the points are colored along a gradient rather than with one color per category.

ggscatter(y = "bill_depth_mm",
          x = "bill_length_mm",
          color = "body_mass_g",
          data = penguins)
## Warning: Removed 2 rows containing missing values (geom_point).
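
The default gradient can be swapped for one of your own. The sketch below uses ggpubr's gradient_color() function for this; the three colors are arbitrary choices and p_mass is a hypothetical name.

```r
library(ggpubr)
library(palmerpenguins)
data(penguins, package = "palmerpenguins")

# Same scatterplot as above, with a custom low-to-high color gradient
p_mass <- ggscatter(data = penguins,
                    x = "bill_length_mm",
                    y = "bill_depth_mm",
                    color = "body_mass_g") +
  gradient_color(c("blue", "white", "red"))
p_mass
```

Lighter penguins are drawn toward the first color and heavier penguins toward the last.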

Additional Reading

For more information on this topic, see https://rpkgs.datanovia.com/ggpubr/reference/ggscatter.html

Keywords

  1. ggscatter()
  2. scatterplot
  3. categorical variable
  4. continuous variable
  5. palmerpenguins
  6. argument