In our observatories, we store the authentic copies of new datasets on the European scientific data repository, Zenodo. If you are new to Zenodo, you should upload at least one or two datasets manually before trying to automate the process. And to avoid live-testing on Zenodo, where everything is supposed to be permanent, set up a practice account on its practice clone, Zenodo Sandbox.

In this example, you will authenticate yourself twice. First, you identify yourself as the creator (author) of a scientific object with ORCID. ORCID provides a persistent digital identifier (an ORCID iD) that you own and control, and that distinguishes you from every other researcher. Second, you authenticate your session to Zenodo or Zenodo Sandbox with a personal access token issued by that site. Beware: your Zenodo credentials do not work on Zenodo Sandbox, which ensures that you cannot accidentally upload practice material to the real Zenodo, where there is no undo button.

Set Up Your Zenodo / Sandbox Account

For practicing, please set up a Zenodo Sandbox account. Zenodo Sandbox is a clone of Zenodo, created for testing applications. Do not practice on Zenodo itself, because whatever you publish on Zenodo is permanent. All examples below work exactly the same on Zenodo, without the sandbox subdomain in the calls.

Important: you will receive verification emails for both your account and your email address. If you do not follow the verification links (check your Spam, Social, and similar folders), the API will seem to work but will not record anything, and the error messages you get will be misleading.

Once you are sure that your Sandbox account is up and running and verified, create your Personal Access Token (PAT). In your user profile, go to Applications and create a new token, ticking the deposit:actions and deposit:write scopes.

In R, the best practice is to store this PAT in a keyring with the keyring package. The following code sets your PAT interactively: if you run it in R, a pop-up window asks you to paste the Zenodo Sandbox PAT from your browser into a text box. Of course, you can use your favorite method of managing secrets, but do not run the risk of accidentally uploading your PAT to GitHub or sending it to somebody in an email. If you store it in a file in your repo, make sure that you exclude that file from syncing in .gitignore and, in a package, in .Rbuildignore too.

library(keyring)
keyring::key_set(service = "Zenodo_Sandbox", username = NULL)  # separately for the sandbox
keyring::key_set(service = "Zenodo", username = NULL)          # and the real service
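If you prefer another way of managing secrets, a small lookup helper keeps the rest of your script unchanged. The following is only a sketch: the function name get_zenodo_token and the environment variable name ZENODO_SANDBOX_PAT are made up for this illustration, and the fallback assumes that you defined the variable outside your repository, for example in your user-level .Renviron file.

```r
# Sketch: look up the PAT in the keyring first, then in an environment variable.
# `get_zenodo_token` and `ZENODO_SANDBOX_PAT` are hypothetical names.
get_zenodo_token <- function(service = "Zenodo_Sandbox",
                             env_var = "ZENODO_SANDBOX_PAT") {
  token <- ""
  if (requireNamespace("keyring", quietly = TRUE)) {
    # key_get() raises an error if nothing is stored under this service name.
    token <- tryCatch(keyring::key_get(service), error = function(e) "")
  }
  if (!nzchar(token)) {
    token <- Sys.getenv(env_var, unset = "")
  }
  if (!nzchar(token)) {
    stop("No token found in the keyring service '", service,
         "' or in the environment variable '", env_var, "'.")
  }
  token
}

# Usage, once a token is stored in either place:
# my_token <- get_zenodo_token()
```

Either way, the token itself never appears in the script that you commit or share.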

Your First Deposition

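Your deposition has three important parts, covered in turn below: an authenticated session with the API, a metadata record, and the data file itself.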

In this tutorial we use zen4R, the R Interface to the Zenodo REST API. zen4R uses R6 objects to prepare your deposition, which, unless you are familiar with object-oriented languages, looks a bit bizarre at first sight. R6 objects are implemented as environments rather than as ordinary R data objects such as lists or data frames, so when you create a metadata record, it will not show up in RStudio’s Environment window as an object, but as a new environment. If all goes well, this should not bother us, but if you experience unexpected behavior, keep in mind that you are debugging an environment, not an ordinary object in your global or function environment.
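The practical consequence of working with environments is reference semantics: a change made inside a function or method alters the object in place instead of returning a modified copy. The base R sketch below illustrates this with a plain environment standing in for an R6 object. The names rec_list, rec_env and rename_record are made up for the illustration and are not part of zen4R.

```r
# A list is copied when modified inside a function; an environment is not.
rec_list <- list(title = "draft")

rec_env <- new.env()
rec_env$title <- "draft"

# Hypothetical helper that changes the title of whatever it receives.
rename_record <- function(x) {
  x$title <- "final"
  invisible(x)
}

rename_record(rec_list)
rename_record(rec_env)

rec_list$title  # still "draft": only a local copy was changed
rec_env$title   # now "final": the environment was changed in place
```

This is why calling a setter method on a zen4R record changes the record without any assignment.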

Start New Session

Using the keyring you set up earlier, you initiate a new session with the API:

require(zen4R)    # for Zenodo API interaction 
## Loading required package: zen4R
## Warning: package 'zen4R' was built under R version 4.0.5
require(keyring)  # you can use any other secure form to store your PAT
## Loading required package: keyring
## Warning: package 'keyring' was built under R version 4.0.5
ZENODO <- ZenodoManager$new(
  url = "https://sandbox.zenodo.org/api",
  token = keyring::key_get("Zenodo_Sandbox"),
  logger = "INFO"
)
## [zen4R][INFO] ZenodoManager - Successfully connected to Zenodo with user token

If you did not store your PAT anywhere, you can also simply write token = 'abc_mytoken_def', where ‘abc_mytoken_def’ is of course your secret token generated on Zenodo Sandbox. If you failed to save it, you can always generate a new one in the web interface.
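Because the sandbox and the real service differ only in the subdomain and in the token, it can help to keep both settings in one place, so that a script never mixes the sandbox URL with the production token. This is a minimal sketch: the helper name zenodo_target is hypothetical, while the service names are the ones used with keyring::key_set() earlier.

```r
# Sketch: pair each API URL with the keyring service that holds its token.
# `zenodo_target` is a hypothetical helper, not part of zen4R.
zenodo_target <- function(sandbox = TRUE) {
  if (sandbox) {
    list(url = "https://sandbox.zenodo.org/api", service = "Zenodo_Sandbox")
  } else {
    list(url = "https://zenodo.org/api", service = "Zenodo")
  }
}

target <- zenodo_target(sandbox = TRUE)
target$url

# Usage with the session code from this section:
# ZENODO <- ZenodoManager$new(
#   url = target$url,
#   token = keyring::key_get(target$service),
#   logger = "INFO"
# )
```

Defaulting to sandbox = TRUE means that you have to ask for the real Zenodo explicitly.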

Create a Record

In this step we create an R6 object called myrec. If you run this code in RStudio, the Environment window will show myrec not as an object, but as an Environment.
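A record of this kind can be sketched as follows. Treat it as an illustration under assumptions: the method names follow the zen4R releases from around the time of this post and may have been renamed or deprecated since, so check the documentation of your installed version. The title, description and affiliation are hypothetical placeholders, the ORCID iD is a dummy, and the DOI is the dummy one discussed in this section. The wrapper function build_jane_doe_record is made up for the example.

```r
# Sketch only: requires the zen4R package when the function is called.
# Method names are assumed from zen4R releases of the same period as this post.
build_jane_doe_record <- function() {
  rec <- zen4R::ZenodoRecord$new()
  rec$setTitle("Iris practice dataset")                      # hypothetical title
  rec$setDescription("A practice deposition for testing.")   # hypothetical text
  rec$setUploadType("dataset")
  rec$addCreator(
    firstname = "Jane",
    lastname = "Doe",
    affiliation = "Example Observatory",   # hypothetical affiliation
    orcid = "0000-0000-0000-0000"          # dummy value, will not validate
  )
  rec$setDOI("http://doi.org/00.0000/zenodo.00000")  # dummy pre-set DOI
  rec
}

# Usage:
# myrec <- build_jane_doe_record()
```

Each setter changes the record in place, which is why none of the calls inside the function is assigned back to rec.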

If you run myrec in your console, you will be able to print out your record, but only if you replaced Jane Doe with your own name, and orcid with your true ORCID iD. Also, if you specify a pre-set DOI (and do not expect to get a new DOI from Zenodo), beware that the API will check the validity of the DOI, the ORCID iD, and the first and last name. So you cannot test the following code with the dummy http://doi.org/00.0000/zenodo.00000 and Jane Doe.

Your session is the R6 object ZENODO, so you call its depositRecord method on the myrec ZenodoRecord object and assign the result back to myrec. With Jane Doe, you will get a validation error, but with your own details it should work fine, provided that you have spelled your name exactly as it appears in your ORCID record.

myrec <- ZENODO$depositRecord(myrec)
## [zen4R][ERROR] ZenodoManager - Error while depositing record: Validation error. 
## [zen4R][ERROR] ZenodoManager - Error: metadata.creators.0.orcid - Not a valid ORCID identifier. 
## [zen4R][ERROR] ZenodoManager - Error: metadata.doi - The provided DOI is invalid - it should look similar  to '10.1234/foo.bar'.
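Both errors can be caught locally before you contact the server. The sketch below checks the ORCID iD against the checksum that ORCID documents for its identifiers (ISO 7064 MOD 11-2) and tests the DOI against a deliberately loose pattern modelled on the '10.1234/foo.bar' hint in the error message. The function names is_valid_orcid and looks_like_doi are made up for this example, and passing these checks does not guarantee that Zenodo will accept the values, since the server may apply stricter rules.

```r
# Sketch: local pre-flight checks; Zenodo's own validation remains authoritative.
is_valid_orcid <- function(orcid) {
  # Accept the bare iD or the full https://orcid.org/ URL form.
  id <- sub("^https?://orcid\\.org/", "", orcid)
  if (!grepl("^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$", id)) {
    return(FALSE)
  }
  chars <- strsplit(gsub("-", "", id), "")[[1]]
  # ISO 7064 MOD 11-2 over the first 15 digits.
  total <- 0
  for (digit in as.integer(chars[1:15])) {
    total <- (total + digit) * 2
  }
  check <- (12 - (total %% 11)) %% 11
  expected <- if (check == 10) "X" else as.character(check)
  identical(chars[16], expected)
}

# Loose shape test only: a "10." prefix, a registrant code, a slash, a suffix.
looks_like_doi <- function(doi) {
  grepl("^10\\.[0-9]{4,9}/[^[:space:]]+$", doi)
}

is_valid_orcid("0000-0000-0000-0000")                  # FALSE: checksum fails
looks_like_doi("http://doi.org/00.0000/zenodo.00000")  # FALSE: not a bare DOI
looks_like_doi("10.1234/foo.bar")                      # TRUE
```

A script can stop early when either check fails instead of waiting for the validation error from the API.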

If you have provided real information and gone through the verification, take a look at your Zenodo Sandbox account’s uploads in your browser: you should see a record, but not yet the data.

Upload the Data

Crucially, at this point your record should have a Zenodo ID, which connects you, as the author, and your ORCID iD with the metadata record. In this blog post example, because we used Jane Doe, you get a NULL id back.

myrec$id
## NULL

If you work programmatically, your script can check before the upload whether you have got here safely with:

is.null(myrec$id)
## [1] TRUE

… which of course in a real-life example must return FALSE. The final step is to hand your ZENODO session (the R6 object) the file that should be uploaded. You can, if you want to, upload rds files to Zenodo, but I would suggest something more system- and language-independent.

In this example, I create a temporary file in your R session, write the famous iris dataset into it in CSV format, and try to upload it through the ZENODO session. In this case, you get an error code, because Jane Doe, with her fictitious ORCID iD and non-existent DOI, was prevented from spamming the server: her record was never deposited, so there is no record id to attach the file to.

my_file_path <- tempfile(fileext = ".csv")  # give the uploaded file a recognizable extension
write.csv(iris, my_file_path, row.names = FALSE)
ZENODO$uploadFile(my_file_path, myrec$id)
## $message
## [1] "The method is not allowed for the requested URL."
## 
## $status
## [1] 405
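The id check and the upload can be combined so that a script fails early with a readable message instead of a 405 response. This is a sketch under assumptions: upload_csv_if_deposited is a hypothetical helper, and it passes the path and the record id to uploadFile positionally, as in the call above, which may differ in other zen4R versions.

```r
# Sketch: refuse to upload when the record has no Zenodo id.
# `upload_csv_if_deposited` is a hypothetical helper, not part of zen4R.
upload_csv_if_deposited <- function(manager, record, data,
                                    file_name = "data.csv") {
  if (is.null(record$id)) {
    stop("The record has no Zenodo id yet; fix the deposit before uploading.")
  }
  # Write to the session's temporary directory under a meaningful file name.
  path <- file.path(tempdir(), file_name)
  write.csv(data, path, row.names = FALSE)
  manager$uploadFile(path, record$id)
}

# Usage with the objects from this tutorial:
# upload_csv_if_deposited(ZENODO, myrec, iris, file_name = "iris.csv")
```

Writing the file under its own name, rather than the random name that tempfile() generates, also makes the uploaded file easier to recognize in the Zenodo record.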

Your Authoritative Copy

You can check what I uploaded here on sandbox.zenodo.org/deposit/818354. This is a copy of one of my datasets that you can find with full description on Zenodo, under the same DOI.