Question: “How can I plot a line of best fit to a scatterplot of data when the data are in a dataframe?”

If you have a 2D scatterplot, how do you add a line of best fit to the scatterplot?

Data

We’ll use the “palmerpenguins” package (https://allisonhorst.github.io/palmerpenguins/) to address this question. If you have not done so before, install the package with install.packages("palmerpenguins"). Then call library(palmerpenguins) and load the data with data(penguins).

#install.packages("palmerpenguins")
library(palmerpenguins)
data(penguins)

Let’s put two columns from this dataframe into vectors called X and Y. (Don’t worry if you don’t know how to do this; just run the code.)

X <- penguins$body_mass_g
Y <- penguins$bill_length_mm
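
If the `$` syntax is new to you, a minimal sketch on a tiny hand-built dataframe may help. The names toy, X_toy and X_alt are made up for this illustration; the values are the first three complete rows of the penguins data.

```r
# Toy dataframe (hypothetical name); values copied from the first
# three complete rows of the penguins data used in this post
toy <- data.frame(body_mass_g    = c(3750, 3800, 3250),
                  bill_length_mm = c(39.1, 39.5, 40.3))

X_toy <- toy$body_mass_g        # `$` pulls one column out as a plain vector
X_alt <- toy[["body_mass_g"]]   # `[[ ]]` pulls out the same column by name

identical(X_toy, X_alt)         # both approaches give the same vector
```

Either form works; `$` is just shorter to type when you know the column name.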

Next, let’s create a dataframe and put both vectors in it.

df <- data.frame(body.mass = X, 
                 bill.length = Y)
# show only the first six rows (the full dataframe has 344)
head(df)
##   body.mass bill.length
## 1      3750        39.1
## 2      3800        39.5
## 3      3250        40.3
## 4        NA          NA
## 5      3450        36.7
## 6      3650        39.3

Before we make our scatterplot, let’s get the equation for the line of best fit using the lm() function, whose name stands for ‘linear model’. Inside the parentheses, make sure you use the names of the columns you made in the dataframe (i.e. body.mass and bill.length, not X and Y).
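
For reference, the line that lm() fits here is the ordinary least-squares line. The symbols b_0 (intercept) and b_1 (slope) are chosen for this illustration and do not appear in the code; x_i and y_i stand for the body.mass and bill.length of penguin i.

```latex
\widehat{y}_i = b_0 + b_1 x_i,
\qquad
(b_0, b_1) = \arg\min_{b_0,\, b_1} \sum_i \left( y_i - b_0 - b_1 x_i \right)^2
```

In words: of all possible straight lines, lm() picks the one with the smallest sum of squared vertical distances between the points and the line.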

#          lm(y-axis data ~ x-axis data , data = df)
line.xy <- lm(bill.length ~ body.mass, data = df) 
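
To see what the fitted object holds, here is a small sketch on a ten-row subset. The names df_small and fit_small are made up for this illustration; the values are the first ten rows of the penguins data, including the one row with missing values. By default lm() should simply leave out rows containing NA, which nobs() lets you check.

```r
# First ten rows of the penguins data used in this post
# (row 4 is missing in the source data); df_small is a hypothetical name
df_small <- data.frame(
  body.mass   = c(3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, 4250),
  bill.length = c(39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, 42.0)
)

# Same formula pattern as above: y-axis data ~ x-axis data
fit_small <- lm(bill.length ~ body.mass, data = df_small)

coef(fit_small)   # named vector: "(Intercept)" and "body.mass" (the slope)
nobs(fit_small)   # number of rows actually used in the fit; the NA row is dropped
```

Running coef(line.xy) on the full model works the same way and gives the intercept and slope of the line that abline() will draw.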

Next, draw the scatterplot with plot(), then follow that call with abline(), passing line.xy as its argument. abline() adds the fitted line to the plot that is already open.

plot(bill.length ~ body.mass, data = df)
abline(line.xy)
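
If you want the line to stand out, plot() and abline() accept the usual base-graphics arguments. The sketch below is one possible styling, not the only one. It uses a ten-row subset so that it runs on its own; df_small and fit_small are made-up names, and the axis labels are an assumption based on the original column names (body_mass_g, bill_length_mm).

```r
# First ten rows of the penguins data used in this post (row 4 is missing)
df_small <- data.frame(
  body.mass   = c(3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, 4250),
  bill.length = c(39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, 42.0)
)
fit_small <- lm(bill.length ~ body.mass, data = df_small)

# Scatterplot with readable axis labels
plot(bill.length ~ body.mass, data = df_small,
     xlab = "Body mass (g)", ylab = "Bill length (mm)")

# Add the fitted line in red, drawn twice as thick as the default
abline(fit_small, col = "red", lwd = 2)
```

Since the data already live in a dataframe, the intermediate X, Y and df steps are optional: the same pattern should work directly on penguins with its original column names, for example lm(bill_length_mm ~ body_mass_g, data = penguins).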