iNaturalist is a wonderful tool that helps bridge the gap between citizen scientists and research scientists. When iNaturalist observers take high-quality photographs of taxa, upload them with geodata, and include relevant notes, these data can then be used in a multitude of studies, such as examining biodiversity, monitoring endangered species, understanding species hybridization, and tracking invasive species.
One way to quickly utilize these data is to use the R package rinat to mine iNaturalist’s public database for the taxa, locations, and observations of particular interest to you. For full documentation of rinat download their pdf: here (or you can always load the package and get help with a particular command using ?commandhere, for example ?get_inat_obs).
#only install the packages if necessary; otherwise delete the install commands or leave them commented out with a #
#install.packages("tidyverse")
#install.packages("rinat")
#install.packages("lubridate")
#install.packages("maps") #only for mapping with iNat in R.
#next you need to tell R to load the packages you just installed. You need to do this every time you open R.
library(tidyverse)
library(rinat)
library(lubridate)
library(maps)
R is open source, so the quality of contributed packages varies: some are outdated or error-prone. You can always check the status of a package by looking at its CRAN page, which tells you when the package was last updated, lists its dependencies, and links to the reference manual and to a page where you can submit bug reports. Most packages are on CRAN.
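Alongside the CRAN page, you can check what is installed locally from within R. The snippet below is a minimal sketch using base R's requireNamespace() and packageVersion(); pkg_status() is a hypothetical helper name made up for this example, not part of any package.

```r
# pkg_status() is a hypothetical helper written for this example.
# It reports whether a package is installed and, if so, which version.
pkg_status <- function(pkg) {
  installed <- requireNamespace(pkg, quietly = TRUE)
  data.frame(
    package = pkg,
    installed = installed,
    version = if (installed) as.character(utils::packageVersion(pkg)) else NA_character_,
    stringsAsFactors = FALSE
  )
}

# Check each package used in this tutorial and stack the results into one table.
do.call(rbind, lapply(c("tidyverse", "rinat", "lubridate", "maps"), pkg_status))
```

Comparing the installed version against the one listed on the package's CRAN page tells you whether an update is available.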
rinat is very intuitive as long as you know how to obtain the taxon ID and place ID, and what observation grade means. Let’s look at the Pacific golden chanterelle as an example (gives us a head start to mushroom hunting season).
#use the rinat package to download an iNat dataframe
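One way to find a taxon ID is to read the number in the URL of the taxon's page on iNaturalist. Another is to search the public iNaturalist API. The sketch below assumes the v1 taxa search endpoint and the jsonlite package; taxa_search_url() and find_taxa() are hypothetical helper names, and the columns kept are my assumption about what the response contains, so treat this as a starting point rather than a reference.

```r
# taxa_search_url() and find_taxa() are hypothetical helpers for this sketch.
# Build a search URL for the public iNaturalist API (v1 taxa endpoint, assumed).
taxa_search_url <- function(query) {
  paste0("https://api.inaturalist.org/v1/taxa?q=",
         utils::URLencode(query, reserved = TRUE))
}

# Fetch matching taxa and keep a few identifying columns.
# Needs an internet connection and the jsonlite package.
find_taxa <- function(query) {
  res <- jsonlite::fromJSON(taxa_search_url(query))$results
  if (!is.data.frame(res)) {
    return(data.frame()) #no matches came back
  }
  keep <- intersect(c("id", "name", "rank", "preferred_common_name"), names(res))
  res[, keep, drop = FALSE]
}

taxa_search_url("pacific golden chanterelle") #shows the URL that would be queried
#find_taxa("pacific golden chanterelle") #uncomment to query the API; the id column is the taxon id
```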
chanterelle_inat_oregon <- get_inat_obs(
  taxon_id = 120443,
  place_id = 10, #This is for Oregon
  quality = "research",
  geo = TRUE, #Specifies that we want geocoordinates
  maxresults = 1000, #Limits results... too many and it'll be cumbersome to work with locally
  meta = FALSE
)
#View(chanterelle_inat_oregon) #uncomment to preview your dataframe
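Once the dataframe is downloaded, a quick summary helps confirm it looks sensible. The sketch below counts observations per month with lubridate and dplyr. It assumes the dataframe has an observed_on column holding dates as "YYYY-MM-DD" text; count_by_month() is a hypothetical helper, and the toy data stands in for the real download so the example runs offline.

```r
library(dplyr)
library(lubridate)

# count_by_month() is a hypothetical helper for this example.
# It assumes a column named observed_on with dates formatted as YYYY-MM-DD.
count_by_month <- function(obs) {
  obs %>%
    mutate(obs_month = month(ymd(observed_on), label = TRUE)) %>% #month as a labelled factor
    count(obs_month, name = "n_obs") #one row per month that has observations
}

# Toy stand-in for the downloaded observations (made-up dates).
toy_obs <- data.frame(
  observed_on = c("2021-10-03", "2021-10-17", "2021-11-02"),
  stringsAsFactors = FALSE
)

monthly_counts <- count_by_month(toy_obs)
monthly_counts
#monthly_counts_real <- count_by_month(chanterelle_inat_oregon) #uncomment to use the real data
```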
You can then choose to directly map the observations using their internal mapping feature (inat_map), use it as a layer in ggplot, or export the spreadsheet to Tableau.
#First example is mapping with inat_map. You can find the available map names in the maps package documentation.
chanterelle_map <- inat_map(chanterelle_inat_oregon, plot = TRUE, map = "usa")
Unfortunately, customization options are pretty limited with inat_map, but it’s nice for quick plotting. For more control over the map, we’d want to use ggplot2.
There’s an entire book about ggplot2, so I can’t cover it all without overwhelming you, but you can learn the package fairly quickly by understanding how ggplot works. In general, ggplot works by layering geoms (geometric objects) and specifying each layer’s aesthetics (aes).
There’s a ton of customizability with ggplot, so knowing the components of a plot helps you know what you can customize.
The components of a plot are:
For more info, check out this cheatsheet on ggplot2.
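To make the layering idea concrete, here is a minimal sketch using R's built-in mtcars dataset rather than iNaturalist data. It shows the same pattern used in the map later on: start with data and aesthetics, add one geom per layer, then adjust labels and theme.

```r
library(ggplot2)

layered_plot <-
  ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) + #data and shared aesthetics
  geom_point(aes(color = factor(cyl))) + #layer 1: points colored by cylinder count
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) + #layer 2: one fitted line
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders") +
  theme_bw() #baseline theme

layered_plot #print the plot
```

Because color is set only inside geom_point(), the points are colored by group while the fitted line stays a single line for all the data.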
Back to the scheduled content…
#So, if we wanted to plot our points over a polygon using ggplot instead, we'd change plot to FALSE. We'd also need to create a custom polygon of the map shape.
#use map_data("state"), which relies on the maps package, to make a dataframe of the state polygons
states <- map_data("state")
#View(states) #uncomment to preview your dataframe
#Now we want to extract the Oregon polygon from our states dataframe
oregon <- states %>%
  filter(region %in% c("oregon")) #can change to any state, just make sure to use lowercase.
#View(oregon) #uncomment to preview your dataframe
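Because the filter uses %in%, the same pattern extends to several states at once. The sketch below wraps it in a small function that also lowercases the names for you. select_states() is a hypothetical helper, and the toy dataframe with made-up coordinates stands in for map_data("state") so the example runs on its own.

```r
library(dplyr)

# select_states() is a hypothetical helper for this example.
# It keeps the polygon rows whose region matches any of the requested states.
select_states <- function(polygons, state_names) {
  polygons %>%
    filter(region %in% tolower(state_names)) #region names are stored in lowercase
}

# Toy stand-in for map_data("state"), with made-up coordinates.
toy_states <- data.frame(
  long = c(-120, -121, -119, -117),
  lat = c(44, 45, 47, 37),
  region = c("oregon", "oregon", "washington", "nevada"),
  stringsAsFactors = FALSE
)

pnw <- select_states(toy_states, c("Oregon", "Washington"))
pnw
```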
Now let’s map our iNat data to our custom Oregon polygon using ggplot. Essentially, we’ll be layering a scatterplot (the observation data) onto a custom polygon (the map of Oregon).
chanterelle_ggplot <-
  ggplot(data = oregon) +
  geom_polygon(aes(x = long, #base map
                   y = lat,
                   group = group),
               fill = "white", #background color
               color = "darkgray") + #border color
  coord_quickmap() +
  geom_point(data = chanterelle_inat_oregon, #these are the research grade observation points
             mapping = aes(
               x = longitude,
               y = latitude,
               fill = scientific_name), #changes color of point based on scientific name
             color = "black", #outline of point
             shape = 21, #this is a circle that can be filled
             alpha = 0.7) + #alpha sets transparency (0-1)
  theme_bw() + #just a baseline theme
  theme(
    plot.background = element_blank(), #removes plot background
    panel.background = element_rect(fill = "white"), #sets panel background to white
    panel.grid.major = element_blank(), #removes x/y major gridlines
    panel.grid.minor = element_blank()) #removes x/y minor gridlines
chanterelle_ggplot #this allows you to see the map
If we want to customize the labels on our chart, we can do that by adding another layer to our chanterelle_ggplot object using labs().
chanterelle_ggplot +
labs(title = "Pacific Golden Chanterelle iNaturalist Observations in Oregon")
It’s probably obvious that we’re plotting geodata, so I often like to omit the x/y axis labels on maps. You can do this by defining x/y labs as empty through the theme specifications using axis.title = element_blank(). There’s so much theme customization you can do. Check out here for more modifications under theme().
chanterelle_ggplot +
labs(title = "Pacific Golden Chanterelle iNaturalist Observations in Oregon") +
theme(
axis.title = element_blank()
)
Lastly, let’s fix the legend. Scientific names are always italicized… plus that title is ugly (and unnecessary with only one species). We can remove the legend title by setting theme(legend.title = element_blank()) and italicize our scientific name using theme(legend.text = element_text(face = "italic")).
chanterelle_ggplot +
labs(title = "Pacific Golden Chanterelle iNaturalist Observations in Oregon") +
theme(
axis.title = element_blank(),
legend.title = element_blank(),
legend.text = element_text(face = "italic"))
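Once the map looks right, you may want it as a file to share. The sketch below uses ggplot2's ggsave() on a toy plot so it runs on its own; the file name, the temporary directory, and the 7 x 5 inch size are arbitrary choices for illustration. With the real map you would pass plot = chanterelle_ggplot plus the theme tweaks above.

```r
library(ggplot2)

# Toy plot standing in for the finished chanterelle map.
toy_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary folder here; in practice use a path inside your project.
out_file <- file.path(tempdir(), "toy_plot.pdf")
ggsave(filename = out_file, plot = toy_plot, width = 7, height = 5) #size in inches

file.exists(out_file) #confirms the file was written
```

ggsave() picks the output format from the file extension, so changing ".pdf" to ".png" would write a PNG instead.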
Obviously there’s a lot more you can do with iNaturalist data in R using ggplot, Shiny, etc. R is great for reproducibility because you can literally give your colleague the code and they can run it and produce the same result. On the other hand, tools like Tableau are great because they’re just so much easier to use, but it’s hard to tell someone how to recreate what you did. Tableau is improving, but you’ll see that very few people in the sciences use it… it’s more of a tool in business and marketing.
To export our data for use in Tableau, we’ll use the write.csv command. This will write to your working directory. Don’t know your working directory? Check with getwd().
write.csv(chanterelle_inat_oregon, "chanterelle_inat_oregon.csv", row.names = FALSE)
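Before opening the file in Tableau, it can be worth reading it back into R to confirm the export worked. The sketch below does that round trip on a toy dataframe with made-up coordinates, written to a temporary folder so it runs without the real download.

```r
# Toy stand-in for the downloaded observations (made-up coordinates).
toy_obs <- data.frame(
  scientific_name = c("Cantharellus formosus", "Cantharellus formosus"),
  latitude = c(44.05, 45.52),
  longitude = c(-123.09, -122.68),
  stringsAsFactors = FALSE
)

# Write the CSV, then read it straight back in.
csv_path <- file.path(tempdir(), "toy_obs.csv")
write.csv(toy_obs, csv_path, row.names = FALSE)
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)

# The column names and row count should match what we wrote.
names(round_trip)
nrow(round_trip) == nrow(toy_obs)
```

Setting row.names = FALSE, as in the export above, keeps R from adding an extra unnamed column of row numbers to the file.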