1.0.0 • Published 7 years ago
daval v1.0.0
daval
experimental document storage w/ messagepack, ajv, redis & elasticsearch
intro & summary
- good parts
  - redis in-memory storage provides speed
  - redis single-threadedness + transactions provide ACID compliance
  - redis disk storage provides persistence
  - messagepack allows low memory & disk consumption for redis
  - elasticsearch provides powerful search capability
- trade-offs / gotchas
  - messagepack trades speed for memory & disk capacity
    - but lets you store all JS data types in redis
      - fixstr, str8/16/32
      - positive fixint, negative fixint
      - uint8/16/32/64, int8/16/32/64, float32/64
      - true, false, undefined, NaN, +Infinity, -Infinity
      - arrays, objects, nested arrays, nested objects
      - buffers, arraybuffers, typedarrays
    - dictionary support further reduces size
    - messagepack `what-the-pack` module buffer is `8KB` by default
      - you can set it up to `1GB`, which is already overkill
  - redis trades durability for speed
    - redis `appendfsync` is `everysec` by default
      - you can set `appendfsync` to `always` to reverse the trade-off
  - elasticsearch trades data consistency for search capability
    - elasticsearch `index.refresh_interval` is `1s` by default
- ideal use cases
  - you want fast ACID-compliant data updates
  - you want elasticsearch search capability
  - you can tolerate `1s` data inconsistency between redis in-memory and disk storage
  - you can tolerate `1s` data inconsistency between redis and elasticsearch
  - you have data updates that need to reflect in search immediately `ok`
  - you have data updates that do not need to reflect in search immediately `ok`
  - you have data that requires some schema `ok`
  - you have data that does not require schema `ok`
  - you have data that needs to exist in elasticsearch `ok`
  - you have data that does not need to exist in elasticsearch `ok`
- todo
  - `Type`
    - make use of ajv schemas optional `ok`
    - provide option to index in elasticsearch `ok`
    - allow `Query` to check if `Type` is in elasticsearch `ok`
  - `Entity` & `Transactions`
    - redis & elasticsearch calls try-catch retry `planned`
  - `Logging`
    - local file `planned`
    - local db `planned`
    - third-party db `planned`
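The planned try-catch retry for redis & elasticsearch calls could take roughly the shape below. This is only a sketch: `withRetry`, its `attempts` and `delayMs` options, and the linear backoff are hypothetical choices, not part of daval.

```javascript
// Hypothetical helper, not part of daval: retries an async call inside try-catch.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(fn, { attempts = 3, delayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      // fn is any function returning a promise, e.g. a redis or elasticsearch call
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      // wait a little longer after each failed attempt (linear backoff)
      if (attempt < attempts) await sleep(delayMs * attempt);
    }
  }
  // every attempt failed: surface the last error to the caller
  throw lastError;
}
```

A call would then be wrapped as `withRetry(() => someAsyncCall())`; which errors are actually worth retrying is left open here.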
setup
- spin up redis (4.x and up) instance
- spin up elasticsearch (6.x and up) instance
- add module
$ yarn add daval
- create client instance
const Client = require('daval');
const client = new Client(
// redis config
{
host: '127.0.0.1',
port: 6379,
password: 'password'
},
// elasticsearch config
{
host: 'localhost:9200'
}
);
const { Type, Entity, Transaction, Query } = client;
- optional configurations
const Client = require('daval');
// for details:
// https://www.npmjs.com/package/what-the-pack
// 16.8 MB buffer
Client.MessagePack.reallocate(2 ** 24);
// register 'name' word in dictionary
Client.MessagePack.register('name');
// initialize instance here..
Type class
- constructor (label `String`, useElastic `Boolean`)
  - `label` gets transformed into lowercase
  - `useElastic` is either `true` or `false`
- useSchema (schema `Object`)
  - `schema` must be a valid `ajv` json schema
  - returns self
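To make the effect of a schema concrete, here is a hand-rolled check that understands only the `properties`, `type` and `minimum` keywords. daval validates with ajv; `validateSubset` is a hypothetical stand-in used purely for illustration and covers a tiny fraction of JSON Schema.

```javascript
// Hypothetical stand-in for ajv: supports only "properties", "type" and "minimum".
function validateSubset(schema, data) {
  const errors = [];
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    const value = data[key];
    // as in JSON Schema, a property is optional unless listed under "required"
    if (value === undefined) continue;
    if (rule.type && typeof value !== rule.type) {
      errors.push(`${key} should be ${rule.type}`);
    } else if (rule.minimum !== undefined && value < rule.minimum) {
      errors.push(`${key} should be >= ${rule.minimum}`);
    }
  }
  return errors;
}

const schema = {
  properties: {
    name: { type: 'string' },
    age: { type: 'number', minimum: 25 }
  }
};

console.log(validateSubset(schema, { name: 'alice', age: 25 }));
// []
console.log(validateSubset(schema, { name: 'bob', age: 20 }));
// [ 'age should be >= 25' ]
console.log(validateSubset(schema, { age: '30' }));
// [ 'age should be number' ]
```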
- example
const User = new Type('User', true)
.useSchema({
"properties": {
"name": {
"type": "string",
},
"age": {
"type": "number",
"minimum": 25,
}
}
});
Entity class
- constructor (t `Type`, id `String`)
  - `t` must be instance of `Type`
  - `id` must be unique for each `Entity`
- upsert (data `Object`, refresh `Boolean`) `async`
  - `data` must pass this entity's `Type` schema if schema exists
  - absence of this entity's `id` generates a `uuidv4` id
  - if `refresh === true`, client awaits update visibility in search before returning
- merge (data `Object`, refresh `Boolean`) `async`
  - `data` must pass this entity's `Type` schema if schema exists
  - absence of this entity's `id` throws an error
  - uses `lodash.merge` to merge existing and new `data`
  - if `refresh === true`, client awaits update visibility in search before returning
- exists () `async` `Boolean`
  - absence of this entity's `id` throws an error
- fetch () `async` `Object`
  - fetched `data` must pass this entity's `Type` schema if schema exists
  - absence of this entity's `id` throws an error
- delete (refresh `Boolean`) `async`
  - absence of this entity's `id` throws an error
  - if `refresh === true`, client awaits update visibility in search before returning
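Since `merge` relies on `lodash.merge`, nested objects are combined recursively rather than replaced. The sketch below approximates that for plain objects only; `deepMerge` is a hypothetical stand-in, and real `lodash.merge` differs in details (for example, it also merges arrays by index and skips `undefined` source values).

```javascript
// Hypothetical approximation of lodash.merge, limited to plain objects.
const isPlainObject = (value) =>
  value !== null && typeof value === 'object' && !Array.isArray(value);

function deepMerge(target, source) {
  const out = { ...target };
  for (const [key, value] of Object.entries(source)) {
    // recurse only when both sides hold plain objects, otherwise overwrite
    out[key] = isPlainObject(value) && isPlainObject(out[key])
      ? deepMerge(out[key], value)
      : value;
  }
  return out;
}

const stored = { name: 'alice', address: { city: 'Manila', zip: '1000' } };
const incoming = { age: 25, address: { zip: '1001' } };

// shallow copy: the whole address object is replaced, so city is lost
const shallow = { ...stored, ...incoming };
// deep merge: address.city is kept and address.zip is updated
const merged = deepMerge(stored, incoming);

console.log(shallow.address.city); // undefined
console.log(merged.address.city);  // Manila
```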
- example
const alice = new Entity(User);
await alice.upsert({ name: 'alice' });
console.log('id', alice.id);
// ie. d14a4dc9-e19c-48bf-94b5-1d820c7566d0
await alice.merge({ age: 25 });
console.log('exists', await alice.exists());
// true
console.log('fetch', await alice.fetch());
// { name: 'alice', age: 25 }
await alice.delete();
console.log('exists', await alice.exists());
// false
console.log('fetch', await alice.fetch());
// undefined
Transaction class
- with (...entities `Entity`) `Transaction`
  - `entities` are instances of `Entity` class
  - returns `Transaction`, for chaining
- run (fn `Function`, refresh `Boolean`) `async`
  - `fn` is a function accepting single-parameter `e`
  - `fn` can also be an `async` function
  - parameter `e` is a function (`e` stands for entity)
  - used as `e(x)` where `x` is an instance of `Entity`
  - `e(x)` returns an object containing the entity's data, which can be modified
  - if `fn` gracefully returns (regardless of return value), all modifications to the `e(x)` objects will be committed
  - all data fetched are validated with schema if schema exists
  - all data modifications are validated with schema if schema exists
  - if `refresh === true`, client awaits update visibility in search before returning
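The `e(x)` pattern can be pictured with a small in-memory model: `fn` edits working copies, and the copies are committed only if `fn` returns without throwing. `runTransaction` and the `{ data }` entity shape are hypothetical; daval's real implementation works against redis and is not shown here.

```javascript
// Hypothetical in-memory model of the e(x) pattern, not daval's implementation.
async function runTransaction(entities, fn) {
  // working copies keyed by entity, so fn never touches stored data directly
  // (a JSON round-trip copy is enough for the plain data used in this sketch)
  const drafts = new Map(
    entities.map((entity) => [entity, JSON.parse(JSON.stringify(entity.data))])
  );
  const e = (entity) => {
    if (!drafts.has(entity)) throw new Error('entity is not part of this transaction');
    return drafts.get(entity);
  };
  // if fn throws or rejects, the loop below never runs and nothing is committed
  await fn(e);
  for (const [entity, draft] of drafts) entity.data = draft;
}

async function demo() {
  const alice = { data: { name: 'alice' } };
  const bob = { data: { name: 'bob' } };

  await runTransaction([alice, bob], async (e) => {
    e(alice).age = 25;
    e(bob).age = 26;
  });
  console.log(alice.data, bob.data);
  // { name: 'alice', age: 25 } { name: 'bob', age: 26 }

  await runTransaction([alice, bob], (e) => {
    e(alice).age = 99;
    throw new Error('abort');
  }).catch(() => {});
  console.log(alice.data.age); // 25, the aborted change was discarded
}

demo();
```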
- example
const alice = new Entity(User);
const bob = new Entity(User);
await alice.upsert({ name: 'alice' });
await bob.upsert({ name: 'bob' });
await new Transaction()
.with(alice, bob)
.run(async (e) => {
e(alice).age = 25;
e(bob).age = 26;
});
console.log({
alice: await alice.fetch(),
bob: await bob.fetch()
});
// {
// alice: { name: 'alice', age: 25 },
// bob: { name: 'bob', age: 26 }
// }
Query class
- constructor (...t `Type`)
  - `t` are `Type` instances to include in search
- from (offset `Integer`)
  - `offset` is amount of records to offset
- size (amount `Integer`)
  - `amount` is amount of records to return
- sort (field `String`, direction `String`, mode `String`)
  - `field` is name of field to sort
  - `direction` must be `asc` or `desc`
  - `mode` must be `min`, `max`, `sum`, `avg`, or `median`
  - can be called multiple times to stack multiple sorts
- range (field `String`)
  - returns `Object` with the following methods
    - gt(value `Number`) - greater than
    - gte(value `Number`) - greater than or equal
    - lt(value `Number`) - less than
    - lte(value `Number`) - less than or equal
- matchAll ()
  - matches all documents
- matchNone ()
  - matches no documents
- scroll (duration `DurationString`, scrollId `String`)
  - `duration`, ie. `30s`, `1m`, `1h`, `1d`
  - `scrollId` is used in scroll continuation
- sourceFilter (...fields `String`)
  - selects / specifies the field(s) to return
- term (field `String`, value `String`)
  - finds documents that contain the exact term specified
- terms (field `String`, values `String`)
  - filters documents that have fields that match any of the provided terms
- run () `async`
  - returns an `Object` with the following properties
    - `scrollId`
    - `ids`
    - `entities`
    - `data`
    - `hitsRetreived`
    - `hitsTotal`
- all methods aside from `run()` allow chaining
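As a rough picture of how such chained calls could translate into an Elasticsearch request body, consider the sketch below. `QuerySketch` is hypothetical and covers only a few methods; placing filters inside a `bool.filter` clause is an assumption made here, not something this README states about daval's internals.

```javascript
// Hypothetical builder: one way chained calls could map to an Elasticsearch
// request body. Not daval's actual implementation.
class QuerySketch {
  constructor() {
    this.body = { query: { bool: { filter: [] } } };
  }
  from(offset) { this.body.from = offset; return this; }
  size(amount) { this.body.size = amount; return this; }
  sort(field, direction, mode) {
    const rule = mode ? { order: direction, mode } : { order: direction };
    // repeated calls stack additional sort rules
    this.body.sort = this.body.sort || [];
    this.body.sort.push({ [field]: rule });
    return this;
  }
  term(field, value) {
    this.body.query.bool.filter.push({ term: { [field]: value } });
    return this;
  }
  range(field) {
    // each comparison adds a range filter, then hands the builder back for chaining
    const add = (op) => (value) => {
      this.body.query.bool.filter.push({ range: { [field]: { [op]: value } } });
      return this;
    };
    return { gt: add('gt'), gte: add('gte'), lt: add('lt'), lte: add('lte') };
  }
}

const body = new QuerySketch()
  .term('name', 'alice')
  .range('age').gte(25)
  .sort('age', 'desc')
  .from(0)
  .size(10)
  .body;

console.log(JSON.stringify(body, null, 2));
```

Returning `this` from every method except the final accessor is what makes the chaining work; `range` returns a small helper object whose methods return the builder again.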
Query coverage
- No description / code for non-supported parts.
- Search API
  - Request Body
    - from - sets offset
      - `.from(10)`
    - size - sets amount of records to return
      - `.size(10)`
    - sort - sorts by field, ascending or descending
      - `.sort('age', 'asc')`
      - `.sort('age', 'desc')`
      - `.sort('age', 'desc', 'avg')`
    - scroll - retrieve large numbers of results from a search request
      - `.scroll('10m', scrollId)`
    - source filtering - specifies fields to return
      - `.sourceFilter('name', 'age')`
  - suggesters, count, validate, explain, profiling
- Query DSL
  - Full text queries (partial support)
    - match - filters fields with values
      - `.match('name', 'josh')`
      - `.match('age', 25)`
    - match_phrase, match_phrase_prefix, multi_match, common, query_string, simple_query_string
  - Term level queries (partial support)
    - term - finds documents that contain the exact term specified
      - `.term('name', 'alice')`
      - `.term('name', 'bob')`
    - terms - filters documents that have fields that match any of the provided terms
      - `.terms('tags', 'Horror', 'Comedy')`
      - `.terms('tags', 'Urgent')`
    - terms_set
    - range - greater than / less than filters
      - `.range('age').gt(25)`
      - `.range('age').gte(25)`
      - `.range('age').lt(25)`
      - `.range('age').lte(25)`
    - exists, prefix, wildcard, regexp, fuzzy, type, ids
  - Compound queries (no support)
    - constant_score, bool, dis_max, function_score, boosting
  - Joining queries (no support)
    - nested, has_child, has_parent
  - Geo queries (no support)
    - geo_shape, geo_bounding_box, geo_distance, geo_polygon
  - Specialized queries (no support)
    - more_like_this, script, percolate, wrapper
  - Span queries (no support)
    - span_term, span_multi, span_first, span_near, span_or, span_not, span_containing, span_within, field_masking_span
  - Misc (partial support)
    - match_all - matches all documents
      - `.matchAll()`
    - match_none - matches no documents
      - `.matchNone()`
    - minimum_should_match, multi term query rewrite
exposed clients
- redis `Object`
  - `ioredis` client
  - https://www.npmjs.com/package/ioredis
- elastic `Object`
  - `elasticsearch` client
  - https://www.npmjs.com/package/elasticsearch
- clients are exposed for complex calls
notes
- on elasticsearch, we use type's label as 'index' and 'type' value, because:
  - es 7.x onwards will get rid of mapping types
  - it's currently recommended to use same 'index' and 'type' value in latest es 6.x
- throwing errors within transactions effectively aborts it
- the `Query` class covers the basic query functionality of `Google Cloud Datastore`
  - set search offset
  - set search size (amount of results to return)
  - set search scroll id
  - specify which fields to return
  - sort items in ascending and descending
  - filter items with exact field values
  - filter items with `greater than`, `less than`, `greater than or equal`, and `less than or equal`
  - filter items with array fields containing specific values
- Difference between `match` and `term`
  - The match query analyzes the input string and constructs more basic queries from that.
  - The term query matches exact terms.
  - If a document contains "CAT", the default analyzer lowercases it and indexes "cat". A match query for "CAT" still finds it because the input is analyzed the same way; a term query for "CAT" does not, because the term is compared as-is against the indexed "cat".
- 1,000 documents at 1 KB each is 1 MB, 1,000 documents at 100 KB each is 100 MB
- on indices locked by storage, unlock with:
curl -XPUT -H "Content-Type: application/json" http://localhost:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
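The `match` versus `term` difference noted above can be mimicked with a toy model. The `analyze` function here is a deliberate simplification (whitespace split plus lowercasing) of what Elasticsearch's default analyzer does for text fields, and all names in the sketch are hypothetical.

```javascript
// Simplified model: the "analyzer" only splits on whitespace and lowercases.
const analyze = (text) => text.toLowerCase().split(/\s+/).filter(Boolean);

// index time: the document's text is analyzed into tokens
const indexedTokens = new Set(analyze('CAT'));

// match: the query input is analyzed too; any resulting token may hit
const match = (input) => analyze(input).some((token) => indexedTokens.has(token));
// term: the query input is compared as-is against the indexed tokens
const term = (input) => indexedTokens.has(input);

console.log(match('CAT')); // true
console.log(match('cat')); // true
console.log(term('CAT'));  // false
```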
external references
- redis client
- elasticsearch client
- messagepack module
- redis installation
- elasticsearch installation
license
MIT | @davalapar