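A line of best fit is a straight line drawn through a scatter plot so that it lies as close as possible to all of the points, typically leaving roughly equal numbers of points above and below it. The most common version, the least-squares regression line, minimizes the sum of the squared vertical distances between the points and the line. The direction of the line reflects the sign of the correlation between the two variables: an upward slope indicates a positive correlation and a downward slope a negative one.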
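To make the link between slope and correlation concrete, here is a minimal sketch in base R. The vectors `x` and `y` are hypothetical toy values chosen for illustration, not taken from any data set used in this tutorial, and the fit is assumed to be an ordinary least-squares line as produced by `lm()`.

```r
# Hypothetical toy data, for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Fit the least-squares line y = intercept + slope * x
fit <- lm(y ~ x)
slope <- unname(coef(fit)["x"])

# Pearson correlation between the two variables
r <- cor(x, y)

# The slope and the correlation share the same sign ...
same_sign <- sign(slope) == sign(r)

# ... because, for a single predictor, slope = r * sd(y) / sd(x)
matches_formula <- isTRUE(all.equal(slope, r * sd(y) / sd(x)))

print(c(same_sign = same_sign, matches_formula = matches_formula))
```

Since both standard deviations are positive, the slope can only be negative when the correlation is negative, and vice versa.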
We’ll use the penguins data set from the “palmerpenguins” package for this example.
library(ggpubr)
## Loading required package: ggplot2
library(palmerpenguins)
data(penguins)
We make scatter plots in ggpubr using the ggscatter() function. Because ggscatter() is not a base R function, it has its own interface: we pass the data frame as data and name the x and y variables explicitly, as quoted column names.
ggscatter(y = "flipper_length_mm",
x = "bill_length_mm",
data = penguins)
## Warning: Removed 2 rows containing missing values (geom_point).
To add a line of best fit to the plot, set the add argument to "reg.line".
ggscatter(y = "flipper_length_mm",
x = "bill_length_mm",
add = "reg.line", # add the line of best fit (regression line)
data = penguins)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).
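The plot shows the line but not its equation. As a rough sketch, and assuming the line drawn by add = "reg.line" is an ordinary least-squares fit of y on x (the `geom_smooth()` message above suggests the formula y ~ x), the intercept and slope can be recovered with base R's lm(). The object names `fit` and `r` are chosen here for illustration.

```r
library(palmerpenguins)

# Same two variables as in the scatter plot. By default lm() drops rows
# with missing values, in line with the 2 rows removed in the warnings above.
fit <- lm(flipper_length_mm ~ bill_length_mm, data = penguins)

# Intercept and slope of the fitted line
print(coef(fit))

# Correlation, computed only on rows where both values are present
r <- cor(penguins$bill_length_mm, penguins$flipper_length_mm,
         use = "complete.obs")
print(r)

# The sign of the slope agrees with the sign of the correlation
print(sign(coef(fit)[["bill_length_mm"]]) == sign(r))
```

Comparing the sign of the slope with the sign of `r` is a quick check that the direction of the line on the plot matches the correlation between the two variables.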