@entryscape/entrysync v0.8.0
EntrySync - library for synchronizing entries with metadata from various sources
The EntryScape platform, with its EntryStore backend, relies on entries that may contain a resource, metadata, and external metadata. At the heart of the library is a mechanism that synchronizes the metadata of entries by means of metadata fingerprinting.
Installation
Dependencies are installed by running yarn.
Synchronization patterns
The core functionality of the library lets you build a custom synchronization mechanism. However, most cases can be covered by the following established synchronization patterns. The patterns are listed below, each together with its corresponding CLI command.
Graph synchronization pattern - src/graph/graphSync.js
This pattern takes a single graph as input and breaks it up into smaller graphs centered around entities and synchronizes them as entries.
The graph is broken up by detecting entities via rdf:type, including all outgoing triples of each entity, and then repeating the procedure for every blank node in object position.
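The splitting step can be illustrated with a minimal sketch. It does not reproduce the code in src/graph/graphSync.js; the triple shape ({ s, p, o } with string terms, blank nodes prefixed with "_:"), the function name splitGraph, and the choice to treat only typed non-blank subjects as entities are all assumptions made for illustration.

```javascript
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';

// Assumed triple shape: { s, p, o } with string terms; blank nodes start with "_:".
const isBlank = (term) => term.startsWith('_:');

// Hypothetical helper: returns a Map from entity URI to the triples of its subgraph.
function splitGraph(triples) {
  // Index outgoing triples by subject for quick lookup.
  const bySubject = new Map();
  for (const t of triples) {
    if (!bySubject.has(t.s)) bySubject.set(t.s, []);
    bySubject.get(t.s).push(t);
  }

  const entities = new Map();
  for (const t of triples) {
    // An entity is assumed to be a non-blank subject that has an rdf:type.
    if (t.p !== RDF_TYPE || isBlank(t.s) || entities.has(t.s)) continue;
    const subgraph = [];
    const visited = new Set();
    const queue = [t.s];
    while (queue.length > 0) {
      const subject = queue.shift();
      if (visited.has(subject)) continue;
      visited.add(subject);
      for (const out of bySubject.get(subject) || []) {
        subgraph.push(out);
        // Repeat the procedure for blank nodes in object position.
        if (isBlank(out.o)) queue.push(out.o);
      }
    }
    entities.set(t.s, subgraph);
  }
  return entities;
}

const triples = [
  { s: 'http://example.com/ds1', p: RDF_TYPE, o: 'http://www.w3.org/ns/dcat#Dataset' },
  { s: 'http://example.com/ds1', p: 'http://purl.org/dc/terms/publisher', o: '_:b1' },
  { s: '_:b1', p: 'http://xmlns.com/foaf/0.1/name', o: '"Example Org"' },
  { s: 'http://example.com/ds2', p: RDF_TYPE, o: 'http://www.w3.org/ns/dcat#Dataset' },
];
const result = splitGraph(triples);
for (const [uri, subgraph] of result) {
  console.log(uri, subgraph.length);
}
```

In this sketch the blank node _:b1 ends up inside the subgraph of ds1 rather than becoming an entry of its own, which mirrors the description above.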
CLI command:
cd cli
node graphSync.js config.json
A configuration file (config.json) has to be provided; see the example in cli/graphSync_exampleConfig.json.
Type based synchronization pattern - src/context/typeSync.js
This pattern synchronizes entries in one context with another context (potentially in another EntryStore instance). Detection of entries is based on one or several classes (rdf:type).
CLI command:
cd cli
node typeSync.js config.json
A configuration file (config.json) has to be provided; see the example in cli/typeSync_exampleConfig.json.
Traversal synchronization pattern - src/context/traverseSync.js
This pattern synchronizes entries in one context with another context (potentially in another EntryStore instance). Detection of entries is based on an initial starting point of one or several entries and includes all entries reachable via a set of properties.
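The detection step of this pattern amounts to a graph traversal, which can be sketched as follows. This is not the code in src/context/traverseSync.js; the in-memory Map of entries, the { p, o } link shape, and the function name collectReachable are assumptions made for illustration, whereas the real library reads entries from an EntryStore instance.

```javascript
// Hypothetical helper: collects the URIs of all entries reachable from the
// starting points by following only the given properties (breadth first).
function collectReachable(entries, startUris, properties) {
  const follow = new Set(properties);
  const reached = new Set();
  const queue = [...startUris];
  while (queue.length > 0) {
    const uri = queue.shift();
    // Skip entries that were already visited or that are not in the context.
    if (reached.has(uri) || !entries.has(uri)) continue;
    reached.add(uri);
    for (const { p, o } of entries.get(uri)) {
      if (follow.has(p)) queue.push(o);
    }
  }
  return reached;
}

const DCAT = 'http://www.w3.org/ns/dcat#';
// Assumed shape: entry URI mapped to its outgoing links.
const entries = new Map([
  ['ex:catalog', [{ p: `${DCAT}dataset`, o: 'ex:ds1' }]],
  ['ex:ds1', [
    { p: `${DCAT}distribution`, o: 'ex:dist1' },
    { p: 'http://purl.org/dc/terms/publisher', o: 'ex:org1' },
  ]],
  ['ex:dist1', []],
  ['ex:org1', []],
]);

const reached = collectReachable(
  entries,
  ['ex:catalog'],
  [`${DCAT}dataset`, `${DCAT}distribution`],
);
console.log([...reached]);
```

Because dcterms:publisher is not in the set of properties to follow, ex:org1 is left out of the result even though ex:ds1 links to it.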
CLI command:
cd cli
node traverseSync.js config.json
A configuration file (config.json) has to be provided; see the example in cli/traverseSync_exampleConfig.json.
Utility functionality in CLI
Context creation
cd cli
node context.js config.json create TYPE [ENTRYID]
You can leave out the ENTRYID parameter and an id will be generated for you. The value of TYPE must be one of:
- catalog - Data catalog typically handled by EntryScape Catalog
- terms - Terminology context typically handled by EntryScape Terms
- workbench - Any kind of project with linked data typically handled by EntryScape Workbench
- model - Modeling project typically handled by EntryScape Models
Context removal
cd cli
node context.js config.json remove ENTRYID
Where ENTRYID has to be the entry id of an existing context.
Context listing
cd cli
node context.js config.json list
Core functionality
The following classes are central to how the synchronization works:
- src/EntrySync.js - This class handles synchronizing metadata as entries in an EntryStore instance; it uses EntityIndex and DuplicateIndex to steer what should be synchronized.
- src/EntityIndex.js - This class maintains an index of synchronized entries with their corresponding metadata fingerprints, which speeds up consecutive synchronizations; the index can be persisted on disk.
- src/DuplicateIndex.js - Keeps track of which entities have already been synchronized and blocks them from being duplicated.
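The idea behind a fingerprint index can be shown with a small sketch. How EntityIndex actually computes and stores fingerprints is not described here, so everything below is an assumption: the class name FingerprintIndex, the check method, the { s, p, o } triple shape, and the use of SHA-256 over sorted triple lines. The sketch also ignores blank node labels, which a real implementation would need to handle in a stable way.

```javascript
const { createHash } = require('node:crypto');

// Assumed fingerprint: SHA-256 over the sorted, serialized triples, so that
// the order in which triples arrive does not affect the result.
function fingerprint(triples) {
  const lines = triples.map((t) => `${t.s} ${t.p} ${t.o}`).sort();
  return createHash('sha256').update(lines.join('\n')).digest('hex');
}

// Hypothetical index: remembers the last fingerprint seen for each entity.
class FingerprintIndex {
  constructor() {
    this.fingerprints = new Map();
  }

  // Returns 'create', 'update' or 'skip' and records the new fingerprint.
  check(uri, triples) {
    const next = fingerprint(triples);
    const previous = this.fingerprints.get(uri);
    this.fingerprints.set(uri, next);
    if (previous === undefined) return 'create';
    return previous === next ? 'skip' : 'update';
  }
}

const index = new FingerprintIndex();
const title = { s: 'http://example.com/ds1', p: 'http://purl.org/dc/terms/title', o: '"First"' };
const desc = { s: 'http://example.com/ds1', p: 'http://purl.org/dc/terms/description', o: '"Text"' };

const first = index.check('http://example.com/ds1', [title, desc]); // entity not seen before
const second = index.check('http://example.com/ds1', [desc, title]); // same triples, other order
const third = index.check('http://example.com/ds1', [title]); // metadata changed
console.log(first, second, third);
```

Comparing fingerprints rather than full metadata is what allows a repeated synchronization to skip unchanged entries cheaply; persisting the index on disk would, under these assumptions, only require serializing the Map of URI and fingerprint pairs.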