The famous Iris dataset is distributed almost everywhere as a plain CSV file. In that form it is not FAIR:
Findable? No persistent identifiers
Accessible? No metadata about provenance
Interoperable? No ontology links
Reusable? No license, no data dictionary
The Solution
Build a proof of concept using Airtable (free) and R that transforms the Iris dataset into FAIR-compliant JSON-LD.
Architecture
Three linked tables in Airtable:
Observations - the 150 iris measurements
Taxa - species metadata with NCBI Taxonomy IDs
Property_Mappings - ontology URIs for each measurement (Plant Ontology)
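To make the three-table layout concrete, here is a minimal local sketch that builds stand-ins for the tables from R's built-in `iris` data frame. The column names (`taxon_id`, `observation_id`, `ontology_uri`, and so on) are illustrative assumptions, not the actual Airtable schema, and the identifier columns that must come from NCBI Taxonomy and the Plant Ontology are deliberately left as `NA` to be filled in after lookup.

```r
# Hypothetical local stand-ins for the three Airtable tables.
# All column names here are assumptions made for illustration.

# Taxa: one row per species. NCBI Taxonomy IDs are left as NA on purpose;
# look them up rather than trusting a guessed value.
taxa <- data.frame(
  taxon_id        = c("taxon-setosa", "taxon-versicolor", "taxon-virginica"),
  species         = c("setosa", "versicolor", "virginica"),
  scientific_name = c("Iris setosa", "Iris versicolor", "Iris virginica"),
  ncbi_taxon_id   = NA_character_,
  stringsAsFactors = FALSE
)

# Property_Mappings: one row per measured property. The ontology URI column
# is a placeholder for the Plant Ontology term of the measured plant part.
property_mappings <- data.frame(
  property     = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  plant_part   = c("sepal", "sepal", "petal", "petal"),
  unit         = "cm",
  ontology_uri = NA_character_,
  stringsAsFactors = FALSE
)

# Observations: the 150 measurements, each with a local record ID and a
# link to its row in Taxa (the equivalent of an Airtable linked record).
observations <- data.frame(
  observation_id = sprintf("obs-%03d", seq_len(nrow(iris))),
  taxon_id       = taxa$taxon_id[match(as.character(iris$Species), taxa$species)],
  iris[, property_mappings$property],
  stringsAsFactors = FALSE
)

head(observations, 3)
```

Keeping species and property metadata in their own tables means each NCBI or Plant Ontology identifier is recorded once and referenced by every observation, rather than repeated on all 150 rows.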
An R script pulls the tables via the Airtable API and generates JSON-LD with:
- Persistent identifiers for each record
- Schema.org vocabulary
- Ontology mappings (PO, NCBI)
- Full provenance and licensing
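The shape of the JSON-LD output can be sketched with `jsonlite` alone. This is a minimal illustration under stated assumptions, not the script's actual output: `base_uri` and the license URL are `example.org` placeholders, the helper names `record_to_jsonld()` and `dataset_to_jsonld()` are hypothetical, and the Plant Ontology URI shown for "sepal" should be verified against the ontology before use. It uses the Schema.org `Dataset` and `PropertyValue` types with the `propertyID`, `unitText`, `variableMeasured`, and `license` properties.

```r
library(jsonlite)

# Hypothetical namespace; a real deployment needs a resolvable, persistent one.
base_uri <- "https://example.org/iris/"

# One measurement as a Schema.org PropertyValue with its own identifier.
record_to_jsonld <- function(obs_id, property, value, unit, property_uri) {
  list(
    `@context` = "https://schema.org/",
    `@type`    = "PropertyValue",
    `@id`      = paste0(base_uri, "observation/", obs_id),
    name       = property,
    value      = value,
    unitText   = unit,
    propertyID = property_uri  # ontology term for what was measured
  )
}

# Dataset-level description carrying the license and the measured variables.
dataset_to_jsonld <- function(name, license_uri, variables) {
  list(
    `@context` = "https://schema.org/",
    `@type`    = "Dataset",
    `@id`      = paste0(base_uri, "dataset"),
    name       = name,
    license    = license_uri,
    variableMeasured = lapply(seq_len(nrow(variables)), function(i) {
      list(
        `@type`    = "PropertyValue",
        name       = variables$property[i],
        unitText   = variables$unit[i],
        propertyID = variables$ontology_uri[i]
      )
    })
  )
}

# Assumed Plant Ontology term for "sepal"; verify before relying on it.
variables <- data.frame(
  property     = "Sepal.Length",
  unit         = "cm",
  ontology_uri = "http://purl.obolibrary.org/obo/PO_0009031",
  stringsAsFactors = FALSE
)

record <- record_to_jsonld("obs-001", "Sepal.Length", iris$Sepal.Length[1],
                           "cm", variables$ontology_uri[1])
dataset <- dataset_to_jsonld("Iris measurements",
                             "https://example.org/license",  # placeholder
                             variables)

# auto_unbox = TRUE writes length-one vectors as JSON scalars, not arrays.
json <- toJSON(dataset, auto_unbox = TRUE, pretty = TRUE)
cat(json)
```

Building the document as nested R lists and serialising once at the end keeps the `@`-prefixed JSON-LD keywords as ordinary list names, so the same helpers can be mapped over every row pulled from Airtable.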
Implementation
library(airtabler)
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(tidyr)
library(jsonlite)
library(purrr)
Attaching package: 'purrr'
The following object is masked from 'package:jsonlite':
flatten