Submitting single records to the warehouse from R

About this example

This is a minimal example, based on code from a Shiny app, showing how to submit a single record from R to the warehouse using a POST request. It is not designed for uploading hundreds of biological records. This does not use the indicia warehouse API. When the code submits a record, the warehouse returns the ID of the new record, which makes it easy to log in to the warehouse and check that your record has been submitted as expected.

This example is adapted from a Shiny application developed by Mark Logie to streamline the transfer of records of invasive species (eg. Asian Hornets) that were submitted by email, to the BRC warehouse. The code for the Shiny app is available here: https://github.com/mlogie/inns

This Shiny application accessed an Outlook inbox to extract the information for a biological record, presented it to the user of the application (e.g. a member of staff at BRC), and made it easy for them to transfer it to the BRC’s indicia warehouse.

The record is submitted as JSON. Images can also be submitted.

By default, the record is not attached to any particular user on a website (e.g. it won’t appear on someone’s ‘my records’ page on iRecord), even if there is a matching email address.

Setting up a test website and dataset on the BRC indicia development warehouse

Before submitting records from R to any real datasets you should test your R code using a test dataset on the development warehouse.

Firstly, you need to contact Biren () for an account on the BRC indicia dev warehouse: https://devwarehouse.indicia.org.uk/ ensuring that this account has the correct privileges to create websites and administer data (observations, samples etc.).

Once you have a login, go to: https://devwarehouse.indicia.org.uk/index.php/login to log in.

Next, we need to create a new website for our test dataset to sit within. To do this go to: https://devwarehouse.indicia.org.uk/index.php/website then click on the ‘New website’ button at the bottom of the page. See: https://indicia-docs.readthedocs.io/en/latest/site-building/warehouse/websites.html for more information about creating websites. Fill out the details with something like this:

Setting up a test website on the dev warehouse

You can generate secure passwords using tools like this: https://my.norton.com/extspa/passwordmanager?path=pwd-gen

The password you enter for the website is needed for authentication. Create a .txt document in the /passwords folder with the filename devwarehousepassword.txt containing this password, so that you can load it into R later.

Because of the .gitignore file in the passwords folder, none of its contents will be committed to GitHub.
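When the password file is read back into R, any trailing newline or space added by a text editor becomes part of the string that is later hashed for authentication. The following is a minimal sketch of a reader that strips surrounding whitespace. `read_password` is a hypothetical helper name, not part of the original app, and the temporary file only stands in for passwords/devwarehousepassword.txt.

```r
# Hypothetical helper (not part of the original app): read the first line of
# a password file and strip any surrounding whitespace or line endings.
read_password <- function(path) {
  trimws(readLines(path, n = 1, warn = FALSE))
}

# Stand-in for passwords/devwarehousepassword.txt so the sketch runs anywhere
tmp <- tempfile(fileext = ".txt")
writeLines("not-a-real-password  ", tmp)
pw <- read_password(tmp)
pw
```

With the real file you would call `read_password("passwords/devwarehousepassword.txt")`.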

Once you’ve created the website, you need to find out which website ID has been assigned to it. This is a number that is unique to each website, and you will need it later for the R code. Go to: https://devwarehouse.indicia.org.uk/index.php/website and search for your newly created website like so:

Finding the ID of your newly created website

You can see that the ID of my newly created website is 132.

Next, you need to create a survey dataset.

You can create it here: https://devwarehouse.indicia.org.uk/index.php/survey See here: https://indicia-docs.readthedocs.io/en/latest/site-building/warehouse/surveys.html for more details about survey datasets. Give it a descriptive title such as “Test survey dataset used by [your name]” and a description. There is no need to enforce any required fields, so leave all those options unticked.

Ensure that you select your newly created website in the drop down: selecting your website when you create a survey dataset

The other piece of information you’ll need is the survey dataset ID. Go to https://devwarehouse.indicia.org.uk/index.php/survey and navigate to the last page (assuming your dataset is the latest dataset to be created), or search using the ‘Filter for’ box by changing the dropdown to ‘Title’ like so: selecting your website when you create a survey dataset

You’ll see that my test dataset has the ID 601. Note this down as you’ll need it for the R code below.

R Code

Now that we’ve created the website and survey dataset, we can submit a record. As a reminder, the important pieces of information needed are: website ID, website password, dataset ID.

Required packages

R packages

library(jsonlite)
library(httr)
library(magrittr) #for the %>%, will also be loaded in with any of the tidyverse packages
library(digest)

Defining our biological record

Default attributes

For this example, we define all the essential information needed to constitute a biological record as plain variables. You might be pulling some or all of this data from another source. Notice that almost everything is defined as a character string, including numbers; the date is the exception.

The actual record consists of two parts, the sample and the occurrence.

Think of the sample as the overarching information about where and when the record was made.

The occurrence describes the species record - what species was it and any other attributes to do with that species record.

In this example, our sample only includes a single occurrence record, but samples can hold multiple occurrences (for example in iRecord when someone submits a list of records).

This page outlines the data model for the indicia database: https://indicia-docs.readthedocs.io/en/latest/developing/data-model/tables.html

#SAMPLE attributes

# Where is the record being submitted to?
website_id <- "132" #the website ID
survey_id <- "601" # the survey ID

# where is the record located?
entered_sref <- "SK1234" #the grid reference of the record / or different coordinate if entered_sref_system is not "OSGB"
entered_sref_system <- "OSGB" # the type of coordinate system that the `entered_sref` is in
location_name <- "My local park"
location_id <- "" # https://devwarehouse.indicia.org.uk/index.php/location

# who submitted the record?
recorder_names <- "Rolph, Simon" # typically in last name, first name format

#When was the record made?
record_date <- as.Date("2021-08-24") #use the as.Date function (it's in base R) to make sure your date is in the right format

# Other default sample attributes
sample_method_id <- "" # eg. "22" for transect, leave as "" if no sample method. 
#To get the ID of a sample method visit https://devwarehouse.indicia.org.uk/index.php/termlist 
#and search for 'sample method' and click 'list terms'
training_samp <- "t" # "t" or "f" as to whether this is a training record (a fake sample for training purposes)
comment_samp <- "This is an example comment"
licence_id <- "2" # look up here: https://devwarehouse.indicia.org.uk/index.php/licence 


#OCCURRENCE attributes

#Species data
taxa_taxon_list_id <- "289248" # the species ID (it is not clear where to look this up for your target species)
#(289248 is the Asian hornet, because that was the taxon used in Mark's example)

zero_abundance <- "f" # is this a record that something WASN'T observed? Majority of records will have 'f' for this


record_status <- "C" # TODO not sure what the options are for this
confidential <- "f"
sensitivity_precision <- "100000" # if record is confidential, how much do we want the record to be blurred by? #unit is meters. Set as "" if not sensitive
release_status <- 'U' # TODO not sure what the options are for this
determiner <- ""
training_occ <- "t"
comment_occ <- "This is an example occurence comment"

Custom attributes

The above attributes are default attributes for any survey dataset created on an indicia warehouse.

If you want to define custom attributes for a survey data set, and be able to submit records from R with these custom attributes, you will need to do some set up in the warehouse website. First go to: https://devwarehouse.indicia.org.uk/index.php/survey and from here find your survey data set by filtering and click on ‘setup attributes’

selecting your website when you create a survey dataset

This will take you to the page for adding existing custom attributes to your survey data set. From the first dropdown menu you can select whether you want to add custom attributes to the sample, occurrence or location. In this example I want to add the custom sample attribute ‘Email’, so I select it from the existing attribute dropdown. If you want to add a new custom attribute you can follow this documentation first: https://indicia-docs.readthedocs.io/en/latest/site-building/warehouse/custom-attributes.html then select your newly created attribute from this dropdown.

Adding a custom attribute

After you click ‘Add existing attribute’ this blue box will appear:

Confirming the custom attribute

Press save to add the custom attribute to the survey data set. This means that if we submit a record with custom sample attribute ID:35, it will be added, rather than just ignored and lost from our submitted record.

Here we store the value for our email attribute in a variable with the human-readable name email, for use in the next code block.

# SAMPLE Custom attributes (not defined by default in )
email <- "simrol@ceh.ac.uk" # user email (typically used for contacting the recorder for verification purposes) 
# ID: 35 (on the dev warehouse)
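Custom sample attributes are submitted under keys of the form smpAttr:[ID], which are not ordinary R names because of the colon. As a minimal sketch, the helper below adds any number of custom sample attributes to a fields list by ID. `add_sample_attributes` and the example email address are hypothetical; only the attribute ID 35 comes from this document.

```r
# Hypothetical helper: add custom sample attributes to a fields list.
# `attrs` is a named list whose names are attribute IDs on the warehouse.
add_sample_attributes <- function(fields, attrs) {
  for (id in names(attrs)) {
    # Keys take the form "smpAttr:<ID>"; because of the colon they must be
    # set with [[ ]] (or written in backticks), not with $ and a bare name.
    fields[[paste0("smpAttr:", id)]] <- list(value = attrs[[id]])
  }
  fields
}

example_fields <- list(survey_id = list(value = "601"))
example_fields <- add_sample_attributes(
  example_fields,
  list("35" = "recorder@example.org")  # 35 is the Email attribute ID on the dev warehouse
)
names(example_fields)
```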

Packaging up all our data into JSON

Next, we need to turn all these individual variables into list objects and then into a JSON object. This means it’s all packaged up in the right format to be submitted to the warehouse. Notice how the email attribute (ID:35) is added to the sample fields list object.

We must use the variable names outlined here: https://indicia-docs.readthedocs.io/en/latest/developing/data-model/tables.html

#This is some handy code that can be used to check that the location is properly formatted
if(!grepl(pattern = '^[A-Z]+[0-9]+$',entered_sref)){
    cat('Location improperly formatted') 
} else {
    cat('Location properly formatted')
}
## Location properly formatted
#Create the sample fields
fields <- list(
  website_id = list(value=website_id),
  survey_id = list(value=survey_id),
  
  entered_sref = list(value = entered_sref),
  entered_sref_system = list(value = entered_sref_system),
  location_name = list(value = location_name),
  location_id = list(value = location_id),
  
  date = list(value = record_date),
  
  recorder_names = list(value = recorder_names),
  comment = list(value = comment_samp),
  
  #sample_method = list(value=sample_method),
  training = list(value=training_samp),
  licence_id = list(value=licence_id), 
  
  
  `smpAttr:35` = list(value = email) # the CUSTOM ATTRIBUTE we added earlier 
  #note the syntax here with the ` quotes around smpAttr:[ID of custom attribute]
)

  
# Create the occurrence fields
occ_fields <- list(
  zero_abundance = list(value = zero_abundance),
  taxa_taxon_list_id = list(value = taxa_taxon_list_id),
  website_id = list(value = website_id),
  record_status = list(value = record_status),
  confidential = list(value = confidential),
  sensitivity_precision = list(value = sensitivity_precision), # no need to define this if not a sensitive record
  release_status = list(value = release_status),
  determiner_id = list(value = determiner),
  comment = list(value = comment_occ)
)

# Make the occurrence list
occurrence <- list(list(fkId = "sample_id",
                        model = list(id = "occurrence",
                                     fields = occ_fields)
                        )
                   )

#put the record list object together
record <- list(id = "sample", 
                       fields = fields, 
                       subModels = occurrence)
  
# convert it into JSON  
submission <- toJSON(record,auto_unbox = TRUE)
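Since a sample can hold several occurrences, the same structure can in principle carry more than one entry in subModels. The sketch below builds such a submission offline. It is untested against the warehouse and assumes that extra entries in subModels are handled in the same way as the single one above. `make_occurrence` is a hypothetical helper and "000000" is a placeholder, not a real taxon ID.

```r
library(jsonlite)

# Hypothetical helper: wrap one set of occurrence fields as a sample submodel
make_occurrence <- function(taxa_taxon_list_id, website_id, comment = "") {
  list(fkId = "sample_id",
       model = list(id = "occurrence",
                    fields = list(
                      taxa_taxon_list_id = list(value = taxa_taxon_list_id),
                      website_id = list(value = website_id),
                      zero_abundance = list(value = "f"),
                      comment = list(value = comment))))
}

# "000000" is a placeholder, not a real taxon ID
taxon_ids <- c("289248", "000000")
occurrences <- lapply(taxon_ids, make_occurrence, website_id = "132")

multi_record <- list(id = "sample",
                     fields = list(survey_id = list(value = "601")), # trimmed for brevity
                     subModels = occurrences)
multi_submission <- toJSON(multi_record, auto_unbox = TRUE)
```

Because `occurrences` is an unnamed list, toJSON writes it as a JSON array, matching the shape of the single-occurrence submission.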

Connecting to the warehouse

Here we’ve got 3 functions that relate to submitting records and images to the warehouse.

Authentication

The first function is used to generate a nonce. A nonce is a number or key that is used once. Indicia uses nonces to authenticate so that only specific users can submit records to the warehouse.

# Function to get security nonce and authentication token
# Takes:
#   URLbase: the base URL of the warehouse, ending in '/'
#   password: the website password
# Uses the website_id variable defined earlier
# Returns: text string to append to the service URL when posting
getnonce <- function(URLbase,password){
  URLnonce <- paste0(URLbase, 'index.php/services/security/get_nonce')
  
  # request a nonce for our website (rather than a hard-coded ID)
  r <- httr::POST(URLnonce,
                  body = list(website_id = website_id))

  nonce <- httr::content(x = r, as = 'text')

  # the auth token is the SHA-1 hash of 'nonce:password'
  key <- paste0(nonce, ':', password)
  authtoken <- digest(key, 'sha1', serialize = FALSE)
  
  URLappend <- paste0('?auth_token=', authtoken,
                      '&nonce=', nonce,
                      '&website_id=', website_id)
  
  return(URLappend)
}
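The token step inside getnonce can be reproduced offline, which is handy when debugging authentication. This is a minimal sketch: `make_auth_token` is a hypothetical helper name, and both the nonce and the password are placeholders (a real nonce comes from the warehouse).

```r
library(digest)

# Hypothetical helper: derive the auth token from a nonce and the website password
make_auth_token <- function(nonce, password) {
  # serialize = FALSE hashes the string itself rather than R's serialised object
  digest(paste0(nonce, ":", password), algo = "sha1", serialize = FALSE)
}

# Both values are placeholders; a real nonce comes from the warehouse
token <- make_auth_token(nonce = "example-nonce",
                         password = "not-a-real-password")
nchar(token) # a SHA-1 hash written in hexadecimal
```

Any stray whitespace in either input changes the hash, so the token would no longer match the one the warehouse computes.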

Submitting records and images

The other two functions are used to submit records and submit images.

Both of them take the authentication string returned by getnonce as the URLauth argument.

postsubmission requires the JSON submission object we created earlier.

postimage requires the file path to an image. This image may be in a local file system, or in a Shiny app’s file system etc.

The workflow is to upload the images first. The postimage function then provides you with the path of the image on the warehouse. When you then submit the record with postsubmission, you include that image path in the submission object. See the second submission below for how to submit a record with images.

# Function to post a json to the data warehouse
# Takes:
#   URLauth: the URL string from function getnonce()
#   submission: the sample, in json format
#   URLbase: the base URL of the warehouse, ending in '/'
# Returns: the content of the return message from the warehouse
postsubmission <- function(URLauth, submission,URLbase){
  URL <- paste0(URLbase,
                'index.php/services/data/save',
                URLauth)
  
  r <- httr::POST(URL,
                  body = list('submission' = I(submission)))
  return(httr::content(x = r, as = 'text'))
}

# Function to post an image to the data warehouse
# Takes:
#   URLauth: the URL string from function getnonce()
#   imgpath: the path to the image
#   URLbase: the base URL of the warehouse, ending in '/'
# Returns: the image path from the server e.g. '123456789image.png'
postimage <- function(URLauth, imgpath,URLbase){
  URLimg <- paste0(URLbase,
                   'index.php/services/data/handle_media',
                   URLauth)
  
  res <- POST(url=URLimg,
              body=list('media_upload'=upload_file(imgpath)))
  return(httr::content(x = res, as = 'text'))
}
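All three functions build their endpoint by pasting the base URL, a service path and (for the two posting functions) the authentication string together, so a base URL without its trailing slash would produce a malformed address. A minimal sketch of a helper that guards against this is shown below. `build_service_url` is a hypothetical name and the auth string is a placeholder in the shape returned by getnonce.

```r
# Hypothetical helper: join the base URL, a service path and the auth string,
# adding the trailing slash to the base URL if it is missing.
build_service_url <- function(URLbase, service, URLauth) {
  if (!endsWith(URLbase, "/")) {
    URLbase <- paste0(URLbase, "/")
  }
  paste0(URLbase, service, URLauth)
}

# Placeholder auth string in the shape returned by getnonce()
auth <- "?auth_token=abc&nonce=def&website_id=132"
url_a <- build_service_url("https://devwarehouse.indicia.org.uk/",
                           "index.php/services/data/save", auth)
url_b <- build_service_url("https://devwarehouse.indicia.org.uk",
                           "index.php/services/data/save", auth)
identical(url_a, url_b)
```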

Submit a record (without images)

Firstly let’s use the getnonce and postsubmission functions to submit a record without images.

As well as sending the record to the warehouse, postsubmission returns the warehouse’s response, which we can use to determine whether the submission was successful. The response is a JSON string, stored in the serverPost object. We then convert it into an R list with fromJSON.

# Set the URL base for the server
URLbase <- "https://devwarehouse.indicia.org.uk/" #the dev warehouse by default
password <- readChar("passwords/devwarehousepassword.txt",nchar = 1000) #your password
#password <- readChar("examples/passwords/devwarehousepassword.txt",nchar = 1000) #your password

#reminder of the R object that contains our record data:
submission
## {"id":"sample","fields":{"website_id":{"value":"132"},"survey_id":{"value":"601"},"entered_sref":{"value":"SK1234"},"entered_sref_system":{"value":"OSGB"},"location_name":{"value":"My local park"},"location_id":{"value":""},"date":{"value":"2021-08-24"},"recorder_names":{"value":"Rolph, Simon"},"comment":{"value":"This is an example comment"},"training":{"value":"t"},"licence_id":{"value":"2"},"smpAttr:35":{"value":"simrol@ceh.ac.uk"}},"subModels":[{"fkId":"sample_id","model":{"id":"occurrence","fields":{"zero_abundance":{"value":"f"},"taxa_taxon_list_id":{"value":"289248"},"website_id":{"value":"132"},"record_status":{"value":"C"},"confidential":{"value":"f"},"sensitivity_precision":{"value":"100000"},"release_status":{"value":"U"},"determiner_id":{"value":""},"comment":{"value":"This is an example occurence comment"}}}}]}
#authenticate using the get nonce function
URLauth <- getnonce(password = password, URLbase = URLbase)

#send the submission
serverPost <- postsubmission(URLauth = URLauth,
                             URLbase = URLbase,
                             submission = submission)

# convert the response from JSON to R so we can see
serverOut <- fromJSON(serverPost)
serverOut
## $success
## [1] "multiple records"
## 
## $outer_table
## [1] "sample"
## 
## $outer_id
## [1] "9474601"
## 
## $struct
## $struct$model
## [1] "sample"
## 
## $struct$id
## [1] "9474601"
## 
## $struct$created_on
## [1] "20210909 15:27:42"
## 
## $struct$updated_on
## [1] "20210909 15:27:42"
## 
## $struct$children
##                    model       id        created_on        updated_on
## 1 sample_attribute_value 35809553 20210909 15:27:42 20210909 15:27:42
## 2             occurrence 15229202 20210909 15:27:42 20210909 15:27:42
#ID of record you have just submitted
serverOut$struct$id
## [1] "9474601"
#open your newly submitted record in the warehouse (presuming you're already logged in on your web browser)
browseURL(paste0("https://devwarehouse.indicia.org.uk/index.php/sample/edit/",serverOut$struct$id))

The browseURL function should then open your record. If not then navigate to https://devwarehouse.indicia.org.uk/index.php/sample and go to the last page, and make sure your record is in there. For example here’s one submitted through this R code:

sample on the warehouse submitted through R

At the bottom of the sample page you can see our custom attribute for email: survey specific attribute

Click on the ‘Occurrences’ tab to see the species records attached to this sample.
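The response can also be checked in code rather than by eye. The sketch below treats a response as successful only if it contains a success element, as in the output shown above. `get_sample_id` is a hypothetical helper, and the failure string is an assumed shape used purely for illustration; the warehouse’s real error format may differ.

```r
library(jsonlite)

# Hypothetical helper: parse the warehouse response and return the new sample
# ID, stopping with the raw response text if no `success` element is present.
get_sample_id <- function(serverPost) {
  out <- fromJSON(serverPost)
  if (is.null(out$success)) {
    stop("Submission does not look successful: ", serverPost)
  }
  out$outer_id
}

# A shortened copy of the successful response shown above
example_response <- '{"success":"multiple records","outer_table":"sample","outer_id":"9474601"}'
new_id <- get_sample_id(example_response)

# An assumed failure shape; the real error format may differ
failed <- tryCatch(get_sample_id('{"error":"example failure"}'),
                   error = function(e) NA_character_)
```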

Submit a record with images

This example shows how to submit a record with images. The process is to have images locally on the file system, upload them to the warehouse as media items, then submit a record to the warehouse linking to these previously uploaded media.

This example uses the same record data as defined before but I will attach 2 images to the record which are stored in /images.

# here we're using the same sample and occurrence object that we made before
fields
## $website_id
## $website_id$value
## [1] "132"
## 
## 
## $survey_id
## $survey_id$value
## [1] "601"
## 
## 
## $entered_sref
## $entered_sref$value
## [1] "SK1234"
## 
## 
## $entered_sref_system
## $entered_sref_system$value
## [1] "OSGB"
## 
## 
## $location_name
## $location_name$value
## [1] "My local park"
## 
## 
## $location_id
## $location_id$value
## [1] ""
## 
## 
## $date
## $date$value
## [1] "2021-08-24"
## 
## 
## $recorder_names
## $recorder_names$value
## [1] "Rolph, Simon"
## 
## 
## $comment
## $comment$value
## [1] "This is an example comment"
## 
## 
## $training
## $training$value
## [1] "t"
## 
## 
## $licence_id
## $licence_id$value
## [1] "2"
## 
## 
## $`smpAttr:35`
## $`smpAttr:35`$value
## [1] "simrol@ceh.ac.uk"
occurrence
## [[1]]
## [[1]]$fkId
## [1] "sample_id"
## 
## [[1]]$model
## [[1]]$model$id
## [1] "occurrence"
## 
## [[1]]$model$fields
## [[1]]$model$fields$zero_abundance
## [[1]]$model$fields$zero_abundance$value
## [1] "f"
## 
## 
## [[1]]$model$fields$taxa_taxon_list_id
## [[1]]$model$fields$taxa_taxon_list_id$value
## [1] "289248"
## 
## 
## [[1]]$model$fields$website_id
## [[1]]$model$fields$website_id$value
## [1] "132"
## 
## 
## [[1]]$model$fields$record_status
## [[1]]$model$fields$record_status$value
## [1] "C"
## 
## 
## [[1]]$model$fields$confidential
## [[1]]$model$fields$confidential$value
## [1] "f"
## 
## 
## [[1]]$model$fields$sensitivity_precision
## [[1]]$model$fields$sensitivity_precision$value
## [1] "100000"
## 
## 
## [[1]]$model$fields$release_status
## [[1]]$model$fields$release_status$value
## [1] "U"
## 
## 
## [[1]]$model$fields$determiner_id
## [[1]]$model$fields$determiner_id$value
## [1] ""
## 
## 
## [[1]]$model$fields$comment
## [[1]]$model$fields$comment$value
## [1] "This is an example occurence comment"
# but we're going to add some images to the warehouse, then get their paths and add them to the occurrence object before we submit the record 

#provide a vector of file paths where the images are located
imagelist <- c("images/record_photo_1.jpg",
               "images/record_photo_2.jpg")

#submit the images to the warehouse using the getnonce and postimage functions we defined earlier. 
#We also need to keep hold of the media paths, so we assign a vector of them to imageStr
imageStr <- lapply(imagelist, FUN = function(img){
  getnonce(password = password, URLbase = URLbase) %>%
    postimage(imgpath = img, URLbase = URLbase)
  }) %>% 
  unlist()

imageStr
## [1] "1631197665record_photo_1.jpg" "1631197676record_photo_2.jpg"
# For every image supplied, create an image instance
if(!is.null(imageStr)){
  media <- lapply(imageStr, FUN = function(imgx){
    med_fields <- list(path = list(value = imgx),
                       caption = list(value = "Enter comment here"))
    list(fkId = "occurrence_id",
         model = list(id = "occurrence_medium",
                      fields = med_fields))
  })
  # Add the images to the occurrence as a submodel
  occurrence[[1]]$model$subModels <- media
}

#you can see information about the images (as paths to images that we have already uploaded to the warehouse) have been added to the occurrence object
occurrence[[1]]$model$subModels
## [[1]]
## [[1]]$fkId
## [1] "occurrence_id"
## 
## [[1]]$model
## [[1]]$model$id
## [1] "occurrence_medium"
## 
## [[1]]$model$fields
## [[1]]$model$fields$path
## [[1]]$model$fields$path$value
## [1] "1631197665record_photo_1.jpg"
## 
## 
## [[1]]$model$fields$caption
## [[1]]$model$fields$caption$value
## [1] "Enter comment here"
## 
## 
## 
## 
## 
## [[2]]
## [[2]]$fkId
## [1] "occurrence_id"
## 
## [[2]]$model
## [[2]]$model$id
## [1] "occurrence_medium"
## 
## [[2]]$model$fields
## [[2]]$model$fields$path
## [[2]]$model$fields$path$value
## [1] "1631197676record_photo_2.jpg"
## 
## 
## [[2]]$model$fields$caption
## [[2]]$model$fields$caption$value
## [1] "Enter comment here"
#The rest of this code is the same as the previous example
#put the record list object together with the sample fields we defined in the first example 
#and the new occurrence object which we've added some photos to
record <- list(id = "sample", 
                       fields = fields, 
                       subModels = occurrence)

# convert it into JSON  
submission <- toJSON(record,auto_unbox = TRUE)

#authenticate using the get nonce function
URLauth <- getnonce(password = password, URLbase = URLbase)

#send the submission
serverPost <- postsubmission(URLauth = URLauth,
                             URLbase = URLbase,
                             submission = submission)

# convert the response from JSON to R so we can see
serverOut <- fromJSON(serverPost)
serverOut
## $success
## [1] "multiple records"
## 
## $outer_table
## [1] "sample"
## 
## $outer_id
## [1] "9474602"
## 
## $struct
## $struct$model
## [1] "sample"
## 
## $struct$id
## [1] "9474602"
## 
## $struct$created_on
## [1] "20210909 15:28:05"
## 
## $struct$updated_on
## [1] "20210909 15:28:05"
## 
## $struct$children
##                    model       id        created_on        updated_on
## 1 sample_attribute_value 35809554 20210909 15:28:05 20210909 15:28:05
## 2             occurrence 15229203 20210909 15:28:05 20210909 15:28:05
##                                                                                                                             children
## 1                                                                                                                               NULL
## 2 occurrence_medium, occurrence_medium, 1812551, 1812552, 20210909 15:28:05, 20210909 15:28:05, 20210909 15:28:05, 20210909 15:28:05
#ID of record you have just submitted
serverOut$struct$id
## [1] "9474602"
#open your newly submitted record in the warehouse (presuming you're already logged in on your web browser)
browseURL(paste0("https://devwarehouse.indicia.org.uk/index.php/sample/edit/",serverOut$struct$id))

Your record should have uploaded successfully. To check that the images have uploaded, view your new sample in the warehouse and click on ‘Occurrences’. Don’t click on ‘Media files’: you won’t see anything there, because we attached the images to the occurrence, NOT the sample.

sample on the warehouse submitted through R

On the occurrence, click on edit, then click on the media files tab and you should be able to see the images uploaded:

occurrence images on the warehouse submitted through R

You can see that the caption for each image is “Enter comment here”. This is defined in the previous code block, and could easily be set to more descriptive text.
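One way to give each image its own caption is to build the media submodels from two parallel vectors. This is a minimal sketch: `make_media` is a hypothetical helper, the paths are the ones returned in the example above, and the captions are made up.

```r
# Hypothetical helper: build occurrence_medium submodels from uploaded image
# paths and a matching vector of captions.
make_media <- function(paths, captions) {
  stopifnot(length(paths) == length(captions))
  # unname() keeps the result an unnamed list, so toJSON writes it as an
  # array; Map() would otherwise name each element after its path.
  unname(Map(function(path, caption) {
    list(fkId = "occurrence_id",
         model = list(id = "occurrence_medium",
                      fields = list(path = list(value = path),
                                    caption = list(value = caption))))
  }, paths, captions))
}

# Paths as returned by postimage() in the example above; captions are made up
media <- make_media(
  paths = c("1631197665record_photo_1.jpg", "1631197676record_photo_2.jpg"),
  captions = c("Whole insect, dorsal view", "Close-up of head")
)
```

The result can be attached in the same way as before, with `occurrence[[1]]$model$subModels <- media`.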

Conclusion

This should give you a basic knowledge of how to upload records to the indicia warehouse using R.

Session info

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.1252 
## [2] LC_CTYPE=English_United Kingdom.1252   
## [3] LC_MONETARY=English_United Kingdom.1252
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] digest_0.6.27  magrittr_2.0.1 httr_1.4.2     jsonlite_1.7.2
## 
## loaded via a namespace (and not attached):
##  [1] mime_0.11         R6_2.5.1          evaluate_0.14     rlang_0.4.11     
##  [5] stringi_1.7.3     curl_4.3.2        rmarkdown_2.10    tools_4.1.0      
##  [9] stringr_1.4.0     xfun_0.25         yaml_2.2.1        compiler_4.1.0   
## [13] htmltools_0.5.1.1 knitr_1.33