What is Provenance?

Provenance can be viewed as the history of a data object: for example, how a sample was created and where it has been stored.

Provenance, as defined in the ISO Provenance Standard ISO 23494 (“Biotechnology – Provenance information model for biological specimen and data – Part 2: Common Provenance Model”), is expressed as a triplet of “Entity - Agent - Activity”: an entity is generated by an activity, which is performed by an agent.
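
The triplet can be illustrated with a minimal Python sketch. This is not the data model from the standard itself; the class names, field names and sample values below are hypothetical and chosen only to show how an entity points to the activity that generated it, and how that activity points to the agent that performed it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Who performed an activity (hypothetical, simplified)."""
    name: str


@dataclass(frozen=True)
class Activity:
    """What was done, and by which agent."""
    name: str
    performed_by: Agent


@dataclass(frozen=True)
class Entity:
    """The data object whose history is recorded."""
    identifier: str
    generated_by: Activity


def describe(entity: Entity) -> str:
    """Spell out the Entity - Activity - Agent chain as a sentence."""
    activity = entity.generated_by
    return (
        f"{entity.identifier} was generated by {activity.name}, "
        f"which was performed by {activity.performed_by.name}"
    )


# Illustrative values only; they do not come from a real system.
technician = Agent(name="lab technician")
collection = Activity(name="blood collection", performed_by=technician)
sample = Entity(identifier="sample-001", generated_by=collection)

print(describe(sample))
```

Following the references from the entity back to the agent is what answers the question of how a sample was created and by whom.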

The Provenance Access Point

The Provenance Access Point (PAP) exposes provenance information about a single data record, such as a biological sample. The information returned by the PAP adheres to the ISO provenance standard 23494 and is machine-readable.

The PAP is still under development (GitHub). In its current state, data is extracted from OpenSpecimen with the OpenSpecimenAPIconnector (GitHub, Docs), transformed by a Jupyter Notebook (GitHub) and loaded into a Neo4j database using the PROV Database Connector (GitHub, Docs).
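
The extract, transform and load steps can be sketched roughly as follows. This is a standalone illustration under stated assumptions and does not call the OpenSpecimenAPIconnector, the notebook or the PROV Database Connector: the record fields, the function names and the in-memory list standing in for the graph database are all hypothetical. The relation names wasGeneratedBy and wasAssociatedWith are borrowed from W3C PROV; whether the PAP uses exactly these is an assumption.

```python
from typing import Dict, List, Tuple

# (subject, relation, object); a stand-in for edges in a graph database.
Triple = Tuple[str, str, str]


def extract() -> List[Dict[str, str]]:
    """Stand-in for the extraction step; returns hypothetical records."""
    return [
        {"sample_id": "sample-001", "event": "collection", "user": "technician-7"},
    ]


def transform(records: List[Dict[str, str]]) -> List[Triple]:
    """Turn flat records into Entity - Activity - Agent relations."""
    triples: List[Triple] = []
    for record in records:
        entity = f"entity:{record['sample_id']}"
        activity = f"activity:{record['event']}:{record['sample_id']}"
        agent = f"agent:{record['user']}"
        triples.append((entity, "wasGeneratedBy", activity))
        triples.append((activity, "wasAssociatedWith", agent))
    return triples


def load(triples: List[Triple], graph: List[Triple]) -> None:
    """Stand-in for the load step; a real pipeline would write to a database."""
    graph.extend(triples)


graph: List[Triple] = []
load(transform(extract()), graph)

for subject, relation, obj in graph:
    print(subject, relation, obj)
```

Keeping the three steps as separate functions mirrors the separation described above, so that any one of them could be swapped for a real connector without touching the others.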