To do this, we simply add one extra argument to the function call.
We’ll use the palmerpenguins package (https://allisonhorst.github.io/palmerpenguins/) to address this question. If you have not installed it before, run install.packages("palmerpenguins") first. Then call library(palmerpenguins) and load the data with data(penguins).
#install.packages("palmerpenguins")
library(palmerpenguins)
## Warning: package 'palmerpenguins' was built under R version 4.1.2
#install.packages("ggpubr")
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.1.2
## Loading required package: ggplot2
This chunk prints the data and its class so we know what we are working with.
data(penguins)
penguins
## # A tibble: 344 x 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ... with 334 more rows, and 2 more variables: sex <fct>, year <int>
is(penguins)
## [1] "tbl_df" "tbl" "data.frame" "list" "oldClass"
## [6] "vector"
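As a complement to printing the tibble, a compact overview of the data can be sketched with base R. This is only an illustrative addition: str(), nrow() and ncol() are base R functions, and the names n_rows and n_cols are chosen here for the example.

```r
library(palmerpenguins)
data(penguins, package = "palmerpenguins")

# One line per column: its type and the first few values
str(penguins)

# Dimensions of the data set (rows = penguins, columns = variables)
n_rows <- nrow(penguins)
n_cols <- ncol(penguins)
c(rows = n_rows, columns = n_cols)
```

The dimensions should agree with the tibble header printed above (344 x 8).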
First, use names() to list the columns in the penguins data set. This example uses bill_length_mm and bill_depth_mm, but any two numeric columns will work.
names(penguins)
## [1] "species" "island" "bill_length_mm"
## [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
## [7] "sex" "year"
Before this step, make sure ggpubr is installed and loaded. The call below sets up the basic scatter plot.
ggscatter(y = "bill_length_mm",
x = "bill_depth_mm",
data = penguins)
## Warning: Removed 2 rows containing missing values (geom_point).
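The warning means that some rows could not be drawn because one of the plotted values is NA (row 4 in the printout above is one example). A minimal sketch for checking this yourself, using only base R, is given here. The names plot_cols, n_missing and penguins_complete are illustrative and not part of either package.

```r
library(palmerpenguins)
data(penguins, package = "palmerpenguins")

# Columns used in the scatter plot
plot_cols <- c("bill_length_mm", "bill_depth_mm")

# Rows with an NA in at least one of the two plotted columns
n_missing <- sum(!complete.cases(penguins[, plot_cols]))
n_missing

# Optional: keep only complete rows so the plot has nothing to drop
penguins_complete <- penguins[complete.cases(penguins[, plot_cols]), ]
nrow(penguins_complete)
```

n_missing should match the number of rows reported in the warning. Passing penguins_complete as the data argument instead of penguins is one way to avoid the warning, since no rows then need to be dropped.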
Now that the basic scatter plot is set up, we can add to it. To add a correlation coefficient, we pass cor.coef = TRUE.
ggscatter(y = "bill_length_mm",
x = "bill_depth_mm",
data = penguins,
cor.coef = TRUE)
## Warning: Removed 2 rows containing non-finite values (stat_cor).
## Warning: Removed 2 rows containing missing values (geom_point).
For more information on this topic, see
https://r-charts.com/correlation/scatter-plot-regression-line/