@treecg/sds-storage-writer-mongo v0.1.1

An SDS storage writer for MongoDB

Given an SDS stream and its corresponding stream of members, this processor writes everything into a MongoDB instance.

SDS stream updates are stored in MongoDB collections so that the LDES server can find this information when serving requests. If an ldes:timestampPath property is given as part of the dataset metadata, the storage writer automatically sets up a timestamp-based fragmentation using a B+ Tree strategy.
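
To make the B+ Tree idea concrete, the sketch below shows how a timestamp is typically routed through one internal node of such a tree. It is a generic textbook illustration, not the writer's actual implementation; the function name childIndex and the separator values are hypothetical.

```typescript
// Generic B+ Tree routing step (illustrative only, not taken from this package).
// An internal node holds sorted separator keys; with n separators it has n + 1
// children. Child i covers the timestamps between separator i-1 (inclusive)
// and separator i (exclusive).
function childIndex(separators: number[], timestamp: number): number {
  let index = 0;
  for (const sep of separators) {
    if (timestamp < sep) break; // timestamp belongs to the current child
    index++;
  }
  return index;
}

// Hypothetical node with two separators, so three children.
const separators = [
  Date.parse("2024-01-01T00:00:00Z"),
  Date.parse("2024-07-01T00:00:00Z"),
];

// A member timestamped in March 2024 falls into the middle child.
const child = childIndex(separators, Date.parse("2024-03-15T12:00:00Z"));
console.log(child);
```

A node limited to k children holds at most k - 1 separators, which is where the branching parameter described under Usage comes in.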

An example of an SDS data stream with a predefined fragmentation strategy is shown next:

# Member ex:sample1 exists
ex:sample1 a ex:Object;
  ex:x "2";
  ex:y "5".


# <bucketizedStream> contains this member and this member is part of bucket <bucket2>
[] sds:stream <bucketizedStream>;
   sds:payload ex:sample1;
   sds:bucket <bucket2>.

# <bucket1> has a relation to <bucket2>
<bucket1> sds:relation [
  sds:relationType tree:GreaterThanRelation ;
  sds:relationBucket <bucket2> ;
  sds:relationValue 1;
  sds:relationPath ex:x 
] .

With this information, the writer stores the member data in a MongoDB collection and stores the relations between buckets alongside it.
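
As a rough mental model of the example above, the records can be pictured as plain objects. The shapes below (MemberRecord, RelationRecord and their fields) are hypothetical and only mirror the SDS properties in the example; the writer's real MongoDB schema is not described here and may differ.

```typescript
// Hypothetical record shapes; the writer's real MongoDB schema may differ.
interface MemberRecord {
  id: string;                     // member IRI
  stream: string;                 // value of sds:stream
  bucket: string;                 // value of sds:bucket
  values: Record<string, string>; // member properties, keyed by predicate
}

interface RelationRecord {
  from: string;  // bucket that owns the relation
  to: string;    // sds:relationBucket
  type: string;  // sds:relationType
  path: string;  // sds:relationPath
  value: string; // sds:relationValue
}

const members: MemberRecord[] = [
  {
    id: "ex:sample1",
    stream: "bucketizedStream",
    bucket: "bucket2",
    values: { "ex:x": "2", "ex:y": "5" },
  },
];

const relations: RelationRecord[] = [
  {
    from: "bucket1",
    to: "bucket2",
    type: "tree:GreaterThanRelation",
    path: "ex:x",
    value: "1",
  },
];

// Members stored in a given bucket.
function membersOf(bucket: string): MemberRecord[] {
  return members.filter((m) => m.bucket === bucket);
}

// True when the member's value on the relation path satisfies a
// tree:GreaterThanRelation (numeric comparison, for this example only).
function satisfies(rel: RelationRecord, member: MemberRecord): boolean {
  if (rel.type !== "tree:GreaterThanRelation") return false;
  const v = member.values[rel.path];
  return v !== undefined && Number(v) > Number(rel.value);
}
```

Here ex:sample1 has ex:x "2", which is greater than the relation value 1, so following the relation from bucket1 to bucket2 leads to a bucket that can contain it.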

Usage

As a Connector Architecture processor

This repository exposes the Connector Architecture processor js:Ingest, which can be used within data processing pipelines to write SDS streams into a MongoDB instance. The processor can be configured as follows:

@prefix : <https://w3id.org/conn#>.
@prefix js: <https://w3id.org/conn/js#>.
@prefix sh: <http://www.w3.org/ns/shacl#>.

[ ] a js:Ingest;
  js:dataInput <inputDataReader>;
  js:metadataInput <inputMetadataReader>;
  js:database [
    js:url <http://myLDESView.org>;
    js:metadata "METADATA";
    js:data "DATA";
    js:index "INDEX";
  ];
  js:pageSize 500;
  js:branchSize 4.

As a library

The library exposes a single function, ingest, which handles everything.

async function ingest(
  data: Stream<string | Quad[]>, 
  metadata: Stream<string | RDF.Quad[]>, 
  database: DBConfig,
  maxsize: number = 100,
  k: number = 4
) { /* snip */ }

Arguments:

  • data: a stream reader that carries data (as string or Quad[]).
  • metadata: a stream reader that carries SDS metadata (as string or Quad[]).
  • database: connection parameters for a reachable MongoDB instance.
  • maxsize: max number of members per fragment.
  • k: max number of child nodes in the default time-based B+ Tree fragmentation.
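
To get a feel for how maxsize and k interact, the following back-of-the-envelope sketch computes how many members a fully packed, balanced tree could hold. It assumes every fragment is filled to maxsize and every internal node has exactly k children, which the writer does not necessarily guarantee; the helper names maxMembers and minLevels are hypothetical.

```typescript
// Upper bound on the members reachable from the root when there are `levels`
// layers of internal nodes above the leaf fragments (fully packed tree assumed).
function maxMembers(levels: number, maxsize = 100, k = 4): number {
  return maxsize * Math.pow(k, levels);
}

// Smallest number of internal layers needed to hold `members` members.
function minLevels(members: number, maxsize = 100, k = 4): number {
  if (maxsize < 1) throw new Error("maxsize must be at least 1");
  if (k < 2) throw new Error("k must be at least 2");
  let levels = 0;
  while (maxMembers(levels, maxsize, k) < members) levels++;
  return levels;
}

// With the defaults (maxsize = 100, k = 4), two internal layers cover
// up to 100 * 4 * 4 = 1600 members.
console.log(maxMembers(2));
console.log(minLevels(1601));
```

Under these assumptions, raising k makes the tree shallower, while raising maxsize makes each fragment larger.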

Authors and License

Arthur Vercruysse (arthur.vercruysse@ugent.be), Julián Rojas (julianandres.rojasmelendez@ugent.be)

© Ghent University - IMEC. MIT licensed
