July 19, 2016

History of DINA-Web

This slideshow is almost 2 years old but remains relevant for describing the background of the DINA project. It outlines the history of, and rationale behind, the DINA-Web system that is being developed in the DINA project:

Similar slideshows provide more background, history and context for DINA-Web (Natural History Collections for the Web) and can be found here:

The latest news for DINA-Web

This is an open collaboration, and progress and results are published continuously on a blog and a wiki:

  1. There is a blog available at https://blog.dina-web.net with various posts on mostly technical or DevOps-related topics.

  2. There is a project wiki at https://dina-project.net with various assets relating to project management. It holds meeting notes from all meetings, for example: http://dina-project.net/wiki/DINA_TC_Meetings_and_minutes_2016

More external communication channels are in the works.

Principles and guidelines

Strategy

Summary: Micro-services architecture with a Web-API strategy.

The Web-API strategy provides a "Cloud Platform", but without lock-in to a single commercial vendor, so you can run the services yourself and own your own data if you want to.

Progress

Backend layer (APIs)

Currently we have a set of REST APIs for serving data, for example:

  • Collections (which covers a variety of related data types)
  • Classifications
  • Users
  • Media

Available at GitHub and from Docker Hub.

Frontend layer (Clients)

There are also some web-based clients/UIs which consume and use this data, including a beta version of a Collections Manager UI:

Available at GitHub and from Docker Hub.

CLI Tools and "data science clients"

There are also CLI tools for data migrations, along with a Data Wranglers Platform centered around RStudio for the web, which is explained in more detail here: http://mirroreum.eu/

An R client for DINA-Web, working name "dinar", is also in the works but has not been released yet.

We will likely also want to provide a Python client, to enable Python apps to bind to the DINA-Web APIs.
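As a rough illustration of what such a Python binding could look like, here is a minimal sketch using only the standard library. No Python client exists yet, so everything in it is an assumption: the class name `DinaClient`, the helper names, the base URL and the resource paths are hypothetical placeholders, not the actual DINA-Web API.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_url(base_url, resource, **params):
    """Join a base URL and a resource path, appending query parameters if any."""
    url = urljoin(base_url.rstrip("/") + "/", resource.lstrip("/"))
    if params:
        # Sort parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def http_get_json(url):
    """Fetch a URL over HTTP and decode the JSON response body."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


class DinaClient:
    """Hypothetical thin wrapper around a set of REST endpoints."""

    def __init__(self, base_url, fetch=http_get_json):
        # base_url is a placeholder; the real service address is not assumed here.
        self.base_url = base_url
        # The fetch function is injectable so the client can be tested offline.
        self.fetch = fetch

    def get(self, resource, **params):
        """Request a resource (for example a hypothetical 'media' path)."""
        return self.fetch(build_url(self.base_url, resource, **params))
```

With a real deployment, usage might look like `DinaClient("https://api.example.org/v0").get("media", limit=10)`, where both the host and the `media` path stand in for whatever the actual services expose.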

Strengths

It is nice to be able to work with FOSS (free and open source software) components, especially in integration scenarios.

COTS (commercial off-the-shelf) products can prove challenging to integrate due to high integration costs, high complexity, and the use of proprietary non-standard formats. Often the business strategy is to create lock-in, providing various tools and consulting to move data into the system, but making it difficult to get data out.

We recently held an EUBON workshop focused on DINA-Web and approaches to system integration. Some takeaways from that workshop can be found here: http://rpubs.com/mskyttner/dw-integration-approaches

Docker vs monoliths

Sometimes COTS systems use complicated and restrictive licensing and are designed to create lock-in. With COTS you often get turnkey monoliths, i.e. non-portable components that have specific system requirements, require non-free technology stacks to run, and offer all-or-nothing solutions.

In DINA-Web, adopting FOSS and tools such as Docker has significantly simplified both development and operations and has given us considerable forward momentum. It also lets us combine the various DINA-Web modules like Lego bricks.