Lab 7: Diversity and Extinction in the Paleobiology Database


First up, we have to load our packages.

library(tidyverse)

We’re going to look at all bivalves and all brachiopods through the Phanerozoic, and we’ll use the PBDB diversity download interface to make this a little easier. So this time we won’t be downloading occurrences, but rather diversity data. We’re going to do this at the level of genera as well as families; the example here is for bivalve genera.

bivalve_genera<-read.csv("https://paleobiodb.org/data1.2/occs/diversity.csv?base_name=Bivalvia&count=genera&time_reso=epoch")
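If you'd rather not edit that long URL by hand for each group and taxonomic level, here is one possible sketch. The helper name `pbdb_diversity_url` is made up for this lab (it is not part of any package), and it assumes the PBDB accepts `count=families` in the same way it accepts `count=genera`, so check the result in your browser if the download fails.

```r
# Hypothetical helper: pastes together a PBDB diversity URL
# using the same pattern as the bivalve example above.
pbdb_diversity_url <- function(base_name, count = "genera", time_reso = "epoch") {
  paste0("https://paleobiodb.org/data1.2/occs/diversity.csv",
         "?base_name=", base_name,
         "&count=", count,
         "&time_reso=", time_reso)
}

# Rebuilds the URL used for bivalve genera above
bivalve_genera_url <- pbdb_diversity_url("Bivalvia")

# Same pattern for brachiopod families (assumes count=families is accepted)
brachiopod_families_url <- pbdb_diversity_url("Brachiopoda", count = "families")

# Uncomment to download (needs an internet connection):
# brachiopod_families <- read.csv(brachiopod_families_url)
```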

Now we need to make some new columns. I’ll show you how to make one, and you can do the rest. Here’s the code for making a new column that’s the midpoint between the min and max ages for each time bin in your data.

bivalve_genera<-bivalve_genera %>% mutate(mean_age=((max_ma+min_ma)/2))
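If you want to convince yourself that the `mutate()` call does what you expect, you can try it on a tiny table first. The three bins below are made up for illustration (they are not PBDB data); only the column names `max_ma` and `min_ma` match the real download.

```r
library(dplyr)

# Made-up three-bin table with the same age columns as the PBDB download
toy_bins <- data.frame(
  max_ma = c(100, 66, 23),
  min_ma = c(66, 23, 0)
)

# Same calculation as above: midpoint of each bin
toy_bins <- toy_bins %>% mutate(mean_age = (max_ma + min_ma) / 2)

# Sanity check: every midpoint should fall inside its own bin
all(toy_bins$mean_age <= toy_bins$max_ma & toy_bins$mean_age >= toy_bins$min_ma)
```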

Again, I’ll remind you how to make a figure of these data, then you can use this code to help you make the rest of your figures.

Here, I’ll show you how to plot the variable sampled_in_bin, which is simply the number of genera actually found in each time bin (no range-through is assumed, so a genus is only counted in bins where it has occurrences).

ggplot(bivalve_genera, aes(x=mean_age, y=sampled_in_bin))+geom_point()+scale_x_reverse()+geom_line()

If we want to add in some lines to help us remember where the major extinctions are (here, the end-Permian extinction at about 252 Ma), we can do it like this:

ggplot(bivalve_genera, aes(x=mean_age, y=sampled_in_bin))+geom_point()+scale_x_reverse(limits=c(500,5))+geom_line()+geom_vline(xintercept=252, linewidth=1, alpha=.3)
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_line()`).
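Those "Removed rows" warnings are expected: setting `limits` in `scale_x_reverse()` drops any time bin whose `mean_age` falls outside the range. A quick way to see how many bins that affects is to count them yourself. The ages below are made up for illustration; with the real data you would use `bivalve_genera$mean_age` instead.

```r
# Made-up midpoints (Ma); four of them fall outside the 500 to 5 Ma window
toy_ages <- c(520, 510, 450, 300, 100, 3, 1)

# Count the bins that a limits = c(500, 5) setting would drop
n_dropped <- sum(toy_ages > 500 | toy_ages < 5)
n_dropped
```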

If we want to add in a trend line we can do it like this:

ggplot(bivalve_genera, aes(x=mean_age, y=sampled_in_bin))+geom_point()+geom_smooth(method=lm, se=FALSE)+scale_x_reverse(limits=c(500,5))+geom_line()+geom_vline(xintercept=252, linewidth=1, alpha=.3)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 6 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_line()`).

To add another line, just add another +geom_vline() command. There are ways to draw them all in one go, but honestly it’s easiest to write a handful of geom_vline() commands once and then copy and paste them into all your plots.
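For the curious, one way to draw several lines in a single command is to give `geom_vline()` a vector of ages. This is only a sketch: the data frame is made up so the example runs without a download, the extinction ages are approximate values for the "Big Five" mass extinctions, and `linewidth` assumes ggplot2 3.4.0 or newer.

```r
library(ggplot2)

# Made-up stand-in for bivalve_genera so the example runs offline
toy_div <- data.frame(
  mean_age = c(480, 400, 300, 260, 240, 150, 80, 30),
  sampled_in_bin = c(20, 45, 60, 80, 35, 120, 200, 310)
)

# Approximate ages (Ma): end-Ordovician, Late Devonian, end-Permian,
# end-Triassic, end-Cretaceous
extinction_ages <- c(444, 372, 252, 201, 66)

p <- ggplot(toy_div, aes(x = mean_age, y = sampled_in_bin)) +
  geom_point() +
  geom_line() +
  scale_x_reverse(limits = c(500, 5)) +
  # One geom_vline() call draws one line per value in the vector
  geom_vline(xintercept = extinction_ages, linewidth = 1, alpha = 0.3)
p
```

You can save `extinction_ages` once at the top of your script and reuse it in every plot.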

We can also add a title to our plot – which makes it a lot easier to keep track of your plots!

ggplot(bivalve_genera, aes(x=mean_age, y=sampled_in_bin))+geom_point()+geom_smooth(method=lm, se=FALSE)+scale_x_reverse(limits=c(500,5))+geom_line()+geom_vline(xintercept=252, linewidth=1, alpha=.3)+
  ggtitle("Bivalve genera sampled in bin")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 6 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_line()`).

Want to do more? Use Google! For example, if you wanted to plot two lines of data on the y-axis against the same x-axis, you can search for “tidyverse plot two y variable lines on one x axis” and see what you get.
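One approach you are likely to find is to reshape the data into "long" format with `pivot_longer()` and then colour the lines by variable. Here is a minimal sketch of that idea. The data frame is made up, and `other_count` is a placeholder name for whichever second column you want to plot, not a real PBDB column.

```r
library(tidyverse)

# Made-up data: two y variables measured in the same time bins
toy_div <- data.frame(
  mean_age = c(480, 400, 300, 240, 150, 80, 30),
  sampled_in_bin = c(20, 45, 60, 35, 120, 200, 310),
  other_count = c(25, 50, 70, 50, 130, 215, 310)
)

# Stack the two y columns into one column, with a label saying which is which
toy_long <- toy_div %>%
  pivot_longer(cols = c(sampled_in_bin, other_count),
               names_to = "measure", values_to = "n_genera")

# One line per measure, distinguished by colour
p2 <- ggplot(toy_long, aes(x = mean_age, y = n_genera, colour = measure)) +
  geom_point() +
  geom_line() +
  scale_x_reverse()
p2
```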