daval

experimental document storage w/ messagepack, ajv, redis & elasticsearch

intro & summary

good parts
- redis in-memory storage provides speed
- redis single-threaded execution + transactions provide ACID compliance
- redis disk storage provides persistence
- messagepack allows low memory & disk consumption for redis
- elasticsearch provides powerful search capability
trade-offs / gotchas
- messagepack trades speed for memory & disk capacity
  - but lets you store all JS data types in redis
  - fixstr, str8/16/32
  - positive fixint, negative fixint
  - uint8/16/32/64, int8/16/32/64, float32/64
  - true, false, undefined, NaN, +Infinity, -Infinity
  - arrays, objects, nested arrays, nested objects
  - buffers, arraybuffers, typedarrays
  - dictionary support further reduces size
- messagepack what-the-pack module buffer is 8KB by default
  - you can raise it up to 1GB, which is already overkill
- redis trades durability for speed
  - redis appendfsync is everysec by default
  - you can set appendfsync to always to reverse the trade-off
- elasticsearch trades data consistency for search capability
  - elasticsearch index.refresh_interval is 1s by default
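The redis durability trade-off above can be changed at runtime. A minimal sketch, assuming an ioredis-style client whose `config('SET', ...)` method forwards to the Redis `CONFIG SET` command; the helper name `setAppendFsync` is hypothetical and not part of daval:

```javascript
// Hypothetical helper, not part of daval.
// `redis` is assumed to be an ioredis-style client.
const APPENDFSYNC_MODES = ['always', 'everysec', 'no'];

function setAppendFsync(redis, mode) {
  if (!APPENDFSYNC_MODES.includes(mode)) {
    throw new Error(`unknown appendfsync mode: ${mode}`);
  }
  // 'always' fsyncs on every write (durable, slower);
  // 'everysec' fsyncs once per second (the default mentioned above).
  return redis.config('SET', 'appendfsync', mode);
}
```

Calling `setAppendFsync(redis, 'always')` favors durability over speed; `'everysec'` restores the default behavior.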
ideal use cases
- you want fast ACID-compliant data updates
- you want elasticsearch search capability
- you can tolerate 1s data inconsistency between redis in-memory and disk storage
- you can tolerate 1s data inconsistency between redis and elasticsearch
- you have data updates that need to reflect in search immediately (ok)
- you have data updates that do not need to reflect in search immediately (ok)
- you have data that requires some schema (ok)
- you have data that does not require schema (ok)
- you have data that needs to exist in elasticsearch (ok)
- you have data that does not need to exist in elasticsearch (ok)
todo
- Type
  - make use of ajv schemas optional (ok)
  - provide option to index in elasticsearch (ok)
  - allow Query to check if Type is in elasticsearch (ok)
- Entity & Transactions
  - try-catch retry for redis & elasticsearch calls (planned)
- Logging
  - local file (planned)
  - local db (planned)
  - third-party db (planned)

setup

spin up redis (4.x and up) instance
spin up elasticsearch (6.x and up) instance
add module

$ yarn add daval

create client instance

const Client = require('daval');
const client = new Client(
  // redis config
  {
    host: '127.0.0.1',
    port: 6379,
    password: 'password'
  },
  // elasticsearch config
  {
    host: 'localhost:9200'
  }
);
const { Type, Entity, Transaction, Query } = client;

optional configurations

const Client = require('daval');

// for details:
// https://www.npmjs.com/package/what-the-pack

// 16.8 MB buffer
Client.MessagePack.reallocate(2 ** 24);

// register 'name' word in dictionary
Client.MessagePack.register('name');

// initialize instance here..

Type `class`

constructor (label String, useElastic Boolean)
- label gets transformed into lowercase
- useElastic is either true or false
useSchema (schema Object)
- schema must be a valid ajv json schema
- returns self
example

const User = new Type('User', true)
  .useSchema({
    "properties": {
      "name": {
        "type": "string",
      },
      "age": {
        "type": "number",
        "minimum": 25,
      }
    }
  });
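
To make the effect of the schema above concrete, here is a minimal sketch of which objects it accepts. The `passes` function is a hypothetical stand-in that understands only the `type` and `minimum` keywords used in this example; daval itself relies on ajv, which supports the full JSON Schema vocabulary.

```javascript
const userSchema = {
  properties: {
    name: { type: 'string' },
    age: { type: 'number', minimum: 25 }
  }
};

// Hypothetical stand-in for ajv: checks only `type` and `minimum`.
function passes(schema, data) {
  return Object.entries(schema.properties).every(([key, rule]) => {
    // The schema has no `required` keyword, so absent keys pass.
    if (!(key in data)) return true;
    const value = data[key];
    if (typeof value !== rule.type) return false;
    if (rule.minimum !== undefined && value < rule.minimum) return false;
    return true;
  });
}

console.log(passes(userSchema, { name: 'alice', age: 25 })); // true
console.log(passes(userSchema, { name: 'alice', age: 24 })); // false
console.log(passes(userSchema, { name: 'alice' }));          // true
```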

Entity `class`

constructor (t Type, id String)
- t must be instance of Type
- id must be unique for each Entity
upsert (data Object, refresh Boolean) async
- data must pass this entity's Type schema if schema exists
- absence of this entity's id generates a uuidv4 id
- if refresh === true, client awaits update visibility in search before returning
merge (data Object, refresh Boolean) async
- data must pass this entity's Type schema if schema exists
- absence of this entity's id throws an error
- uses lodash.merge to merge existing and new data
- if refresh === true, client awaits update visibility in search before returning
exists () async Boolean
- absence of this entity's id throws an error
fetch () async Object
- fetched data must pass this entity's Type schema if schema exists
- absence of this entity's id throws an error
delete (refresh Boolean) async
- absence of this entity's id throws an error
- if refresh === true, client awaits update visibility in search before returning
example

const alice = new Entity(User);
await alice.upsert({ name: 'alice' });
console.log('id', alice.id);
// e.g. d14a4dc9-e19c-48bf-94b5-1d820c7566d0
await alice.merge({ age: 25 });
console.log('exists', await alice.exists());
// true
console.log('fetch', await alice.fetch());
// { name: 'alice', age: 25 }
await alice.delete();
console.log('exists', await alice.exists());
// false
console.log('fetch', await alice.fetch());
// undefined
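
Since `merge` relies on lodash.merge, nested objects are combined recursively rather than replaced. The sketch below shows that behavior with a hypothetical `deepMerge` helper that stands in for lodash.merge on plain objects only; the real function also handles arrays and other cases, and the sample data is made up for illustration.

```javascript
const isPlainObject = (v) =>
  v !== null && typeof v === 'object' && !Array.isArray(v);

// Hypothetical stand-in for lodash.merge, limited to plain objects.
// Like lodash.merge, it mutates and returns the target.
function deepMerge(target, source) {
  for (const [key, value] of Object.entries(source)) {
    if (isPlainObject(value) && isPlainObject(target[key])) {
      deepMerge(target[key], value); // recurse into nested objects
    } else {
      target[key] = value; // new keys and scalar values overwrite
    }
  }
  return target;
}

const existing = { name: 'alice', address: { city: 'Manila' } };
const merged = deepMerge(existing, { age: 25, address: { zip: '1000' } });
console.log(merged);
// { name: 'alice', address: { city: 'Manila', zip: '1000' }, age: 25 }
```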

Transaction `class`

with (...entities Entity) Transaction
- entities are instances of Entity class
- returns Transaction, for chaining
run (fn Function, refresh Boolean) async
- fn is a function accepting a single parameter e
- fn can also be an async function
- parameter e is a function (e stands for entity)
- used as e(x) where x is an instance of Entity
- e(x) returns an object containing the entity's data, which can be modified
- if fn gracefully returns (regardless of return value), all modifications to the e(x) objects will be committed
- all data fetched are validated with schema if schema exists
- all data modifications are validated with schema if schema exists
- if refresh === true, client awaits update visibility in search before returning
example

const alice = new Entity(User);
const bob = new Entity(User);
await alice.upsert({ name: 'alice' });
await bob.upsert({ name: 'bob' });
await new Transaction()
  .with(alice, bob)
  .run(async (e) => {
    e(alice).age = 25;
    e(bob).age = 26;
  });
console.log({
  alice: await alice.fetch(),
  bob: await bob.fetch()
});
// {
//   alice: { name: 'alice', age: 25 },
//   bob: { name: 'bob', age: 26 }
// }
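
The commit rule described above (modifications are committed only when fn returns) can be modeled in memory. This is a sketch under stated assumptions, not daval's implementation: `runTransaction`, the `Map`-based store, and the plain `{ id }` entity objects are all hypothetical, and no redis or elasticsearch calls are involved.

```javascript
// Hypothetical in-memory model of the commit/abort rule; not daval's code.
// store: Map of id -> data; entities: objects with an `id` property.
async function runTransaction(store, entities, fn) {
  // Work on deep copies so the store stays untouched until fn returns.
  const drafts = new Map(
    entities.map((x) => [x.id, JSON.parse(JSON.stringify(store.get(x.id)))])
  );
  const e = (x) => drafts.get(x.id);
  await fn(e); // a throw or rejection here skips the commit below
  for (const [id, data] of drafts) store.set(id, data);
}
```

If fn throws, the function rejects before the final loop runs, so none of the drafts reach the store.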

Query `class`

constructor (...t Type)
- t are Type instances to include in search
from (offset Integer)
- offset is the number of records to skip
size (amount Integer)
- amount is the number of records to return
sort (field String, direction String, mode String)
- field is the name of the field to sort by
- direction must be asc or desc
- mode must be min, max, sum, avg, or median
- can be called multiple times to stack multiple sorts
range (field String)
- returns Object with the following methods
- gt(value Number) - greater than
- gte(value Number) - greater than or equal
- lt(value Number) - less than
- lte(value Number) - less than or equal
matchAll ()
- matches all documents
matchNone ()
- matches no documents
scroll (duration DurationString, scrollId String)
- duration, e.g. 30s, 1m, 1h, 1d
- scrollId is used in scroll continuation
sourceFilter (...fields String)
- selects / specifies the field(s) to return
term (field String, value String)
- finds documents that contain the exact term specified
terms (field String, ...values String)
- filters documents that have fields that match any of the provided terms
run () async
- returns an Object with the following properties
  - scrollId
  - ids
  - entities
  - data
  - hitsRetreived
  - hitsTotal
all methods aside from run() allow chaining
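
The `scroll` and `run` methods above can be combined to page through a large result set. A minimal sketch, using only the documented `scroll(duration, scrollId)` call and the `scrollId` and `ids` properties returned by `run()`. The helper `collectAllIds` and its `makeQuery` parameter are hypothetical, and two behaviors are assumptions rather than documented facts: that an undefined scrollId starts a new scroll, and that an empty `ids` array marks the end of the results.

```javascript
// Hypothetical helper, not part of daval. `makeQuery` must return a fresh
// Query-like object on each call, e.g. () => new Query(User).matchAll().
async function collectAllIds(makeQuery, duration = '1m') {
  const ids = [];
  let scrollId; // undefined on the first request (assumed to start a scroll)
  for (;;) {
    const page = await makeQuery().scroll(duration, scrollId).run();
    // Assumed end-of-results signal: a page without ids.
    if (!page.ids || page.ids.length === 0) break;
    ids.push(...page.ids);
    scrollId = page.scrollId;
  }
  return ids;
}
```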

Query `coverage`

No description / code for non-supported parts.
Search API
- Request Body
  - from - sets offset
    - .from(10)
  - size - sets amount of records to return
    - .size(10)
  - sort - sorts by field, ascending or descending
    - .sort('age', 'asc')
    - .sort('age', 'desc')
    - .sort('age', 'desc', 'avg')
  - scroll - retrieve large numbers of results from a search request
    - .scroll('10m', scrollId)
  - source filtering - specifies fields to return
    - .sourceFilter('name', 'age')
- ~~suggesters~~
- ~~count~~
- ~~validate~~
- ~~explain~~
- ~~profiling~~
Query DSL
- Full text queries (partial support)
  - match - filters fields with values
    - .match('name', 'josh')
    - .match('age', 25)
  - ~~match_phrase~~
  - ~~match_phrase_prefix~~
  - ~~multi_match~~
  - ~~common~~
  - ~~query_string~~
  - ~~simple_query_string~~
- Term level queries (partial support)
  - term - finds documents that contain the exact term specified
    - .term('name', 'alice')
    - .term('name', 'bob')
  - terms - filters documents that have fields that match any of the provided terms
    - .terms('tags', 'Horror', 'Comedy')
    - .terms('tags', 'Urgent')
  - terms_set
  - range - greater than / less than filters
    - .range('age').gt(25)
    - .range('age').gte(25)
    - .range('age').lt(25)
    - .range('age').lte(25)
  - ~~exists~~
  - ~~prefix~~
  - ~~wildcard~~
  - ~~regexp~~
  - ~~fuzzy~~
  - ~~type~~
  - ~~ids~~
- Compound queries (no support)
  - ~~constant_score~~
  - ~~bool~~
  - ~~dis_max~~
  - ~~function_score~~
  - ~~boosting~~
- Joining queries (no support)
  - ~~nested~~
  - ~~has_child~~
  - ~~has_parent~~
- Geo queries (no support)
  - ~~geo_shape~~
  - ~~geo_bounding_box~~
  - ~~geo_distance~~
  - ~~geo_polygon~~
- Specialized queries (no support)
  - ~~more_like_this~~
  - ~~script~~
  - ~~percolate~~
  - ~~wrapper~~
- Span queries (no support)
  - ~~span_term~~
  - ~~span_multi~~
  - ~~span_first~~
  - ~~span_near~~
  - ~~span_or~~
  - ~~span_not~~
  - ~~span_containing~~
  - ~~span_within~~
  - ~~field_masking_span~~
- Misc (partial support)
  - match_all - matches all documents
    - .matchAll()
  - match_none - matches no documents
    - .matchNone()
  - ~~minimum_should_match~~
  - ~~multi term query rewrite~~

exposed clients

redis Object
- ioredis client
- https://www.npmjs.com/package/ioredis
elastic Object
- elasticsearch client
- https://www.npmjs.com/package/elasticsearch
clients are exposed for complex calls

notes

on elasticsearch, we use the type's label as both the 'index' and 'type' values, because:
- es 7.x onwards will get rid of mapping types
- it's currently recommended to use same 'index' and 'type' value in latest es 6.x
throwing an error within a transaction aborts it
the Query class covers the basic query functionality of Google Cloud Datastore
- set search offset
- set search size (amount of results to return)
- set search scroll id
- specify which fields to return
- sort items in ascending and descending
- filter items with exact field values
- filter items with greater than, less than, greater than or equal, and less than or equal
- filter items with array fields containing specific values
Difference between match and term
- The match query analyzes the input string and constructs more basic queries from that.
- The term query matches exact terms.
- With the default analyzer, which lowercases text, a document containing "CAT" is indexed as the term "cat". A match query for "CAT" finds it because the input is analyzed the same way; a term query for "CAT" does not because the input is compared as-is.
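
The difference can be modeled in a few lines. This is a hypothetical simulation of an analyzed text field with a lowercasing analyzer, not Elasticsearch code; `analyze`, `match`, and `term` are illustrative names only.

```javascript
// Hypothetical model of an analyzed text field; not Elasticsearch code.
const analyze = (text) => text.toLowerCase().split(/\W+/).filter(Boolean);

// Index time: the document text is analyzed into lowercase terms.
const indexedTerms = new Set(analyze('CAT'));

// match: the query input goes through the same analyzer before lookup.
const match = (input) => analyze(input).some((t) => indexedTerms.has(t));
// term: the query input is looked up exactly as given.
const term = (input) => indexedTerms.has(input);

console.log(match('cat')); // true
console.log(match('CAT')); // true
console.log(term('CAT'));  // false
```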
1,000 documents at 1 KB each is 1 MB, 1,000 documents at 100 KB each is 100 MB
if indices become read-only because disk space ran low, unlock them with:
- curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
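
The same settings change can be sent from JavaScript. A minimal sketch, assuming the exposed `elastic` object is the legacy elasticsearch client and is passed in by the caller; the helper name `unlockIndices` is hypothetical and not part of daval.

```javascript
// Hypothetical helper, not part of daval. `elastic` is assumed to be the
// exposed legacy elasticsearch client.
function unlockIndices(elastic, index = '_all') {
  // Setting the block to null removes it, mirroring the curl call above.
  return elastic.indices.putSettings({
    index,
    body: { 'index.blocks.read_only_allow_delete': null }
  });
}
```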

external references

redis client
- https://www.npmjs.com/package/ioredis
elasticsearch client
- https://www.npmjs.com/package/elasticsearch
messagepack module
- https://www.npmjs.com/package/what-the-pack
redis installation
- https://redis.io/topics/quickstart
elasticsearch installation
- https://www.elastic.co/guide/en/elasticsearch/reference/current/_installation.html

license

MIT | @davalapar

ajv elasticsearch ioredis lodash uuid what-the-pack

1.0.0

7 years ago