0.4.38 • Published 25 days ago

njodb v0.4.38

Weekly downloads: 139
License: MIT
Repository: github
Last release: 25 days ago

njodb


njodb is a persistent, partitioned, concurrency-controlled, Node JSON object database. Data is written to the file system and distributed across multiple files that are protected by read and write locks. By default, all methods are asynchronous and use read/write streams to improve performance and reduce memory requirements (this should be particularly useful for large databases).

What makes njodb different from other Node JSON object databases?

Persistence - Data is saved to the file system so that it remains after the application that created it is no longer running, unlike the many existing in-memory solutions. This persistence also allows data to be made available to other applications.

Asynchronous and streaming - By default, all methods are asynchronous and non-blocking, and also use read and write streams, to make data access and manipulation efficient. Synchronous methods are also provided for those cases where they are desired or appropriate.

JSON records, not JSON files - Records are stored as individual lines of JSON objects in a file, so a read stream can be used to retrieve data rapidly, parse it in small chunks, and dispense with it when done. This removes the time and memory overhead required by solutions that store data as a single, monolithic JSON object. They must read all of that data into memory, and then parse all of it, before allowing you to use any of it.

Completely schemaless - While the JSON data itself is schemaless, it is also the case that data is not siloed into tables, or forced into collections, so the entire database, from top to bottom, is schemaless, not just the data. For a user or application, this means that there is no need to know anything about the database structure, only what data is being sought.

Balanced partitions - When inserting data, records are randomly distributed across partitions so that partition sizes are kept roughly equal, making data access times consistent. Manually resizing the database performs a similar distribution, so, as the database grows or shrinks, the partitions remain well-balanced.

Concurrency-controlled - File locks are used during read and write operations, ensuring data integrity can be maintained in a multi-user/multi-application environment. There are few, if any, existing solutions that are designed for such scenarios and that include this sort of data protection.

njodb even has its own command-line interface: check out njodb-cli.
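To illustrate the "JSON records, not JSON files" point, here is a minimal sketch of streaming a file of line-delimited JSON with Node's built-in readline module. It is not njodb's implementation; the helper name selectFromFile and the sample file are assumptions made for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const readline = require("readline");

// Hypothetical helper (not part of njodb): stream a file of line-delimited
// JSON and keep only the records for which `selecter` returns true.
async function selectFromFile(file, selecter) {
    const lines = readline.createInterface({
        input: fs.createReadStream(file, { encoding: "utf8" }),
        crlfDelay: Infinity
    });

    const data = [];
    for await (const line of lines) {
        if (line.trim() === "") continue; // skip blank lines
        const record = JSON.parse(line);  // each line is one complete JSON object
        if (selecter(record)) data.push(record);
    }
    return data;
}

// Usage: write a small sample file to the temp directory, then stream it back.
const sampleFile = path.join(os.tmpdir(), `records-${process.pid}.json`);
fs.writeFileSync(sampleFile, [
    JSON.stringify({ id: 1, name: "James" }),
    JSON.stringify({ id: 2, name: "Steve" })
].join("\n") + "\n");

selectFromFile(sampleFile, record => record.id === 1)
    .then(results => console.log(results)); // [ { id: 1, name: 'James' } ]
```

Because each line is parsed on its own, only one record needs to be held in memory at a time, apart from the matches being collected.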

Install

npm install njodb

Test

npm test

Introduction

Load the module:

const njodb = require("njodb");

Create an instance of an NJODB:

const db = new njodb.Database();

Create some JSON data objects:

const data = [
    {
        id: 1,
        name: "James",
        nickname: "Good Times",
        modified: Date.now()
    },
    {
        id: 2,
        name: "Steve",
        nickname: "Esteban",
        modified: Date.now()
    }
];

Insert them into the database:

db.insert(data).then(results => { /* do something */ });

Select some records from the database by supplying a function to find matches:

db.select(
    record => record.id === 1 || record.name === "Steve"
).then(results => { /* do something */ });

Update some records in the database by supplying a function to find matches and another function to update them:

db.update(
    record => record.name === "James",
    record => { record.nickname = "Bulldog"; return record; }
).then(results => { /* do something */ });

Delete some records from the database by supplying a function to find matches:

db.delete(
    record => record.modified < Date.now()
).then(results => { /* do something */ });

Delete the database:

db.drop().then(results => { /* do something */ });
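The steps above can also be written with async/await. The following is a minimal sketch that uses only the methods and result fields documented in this README; the function name introduction and the shape of the summary it returns are illustrative and not part of njodb.

```javascript
// Assumes `db` is an njodb Database instance and `data` is an array of objects.
async function introduction(db, data) {
    const inserted = await db.insert(data);
    const selected = await db.select(
        record => record.id === 1 || record.name === "Steve"
    );
    const updated = await db.update(
        record => record.name === "James",
        record => { record.nickname = "Bulldog"; return record; }
    );
    const deleted = await db.delete(record => record.modified < Date.now());
    await db.drop();

    // Summarize the counts reported by each call.
    return {
        inserted: inserted.inserted,
        selected: selected.selected,
        updated: updated.updated,
        deleted: deleted.deleted
    };
}

// Usage (requires njodb to be installed):
// const njodb = require("njodb");
// introduction(new njodb.Database(), data).then(console.log).catch(console.error);
```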

Constructor

Creates a new instance of an NJODB Database.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| root | string | Path to the root directory of the Database | process.cwd() |
| properties | object | User-specific properties to set for the Database | {} (see Database properties) |

If an njodb.properties file already exists in the root directory, a connection to the existing Database will be created. If the root directory does not exist it will be created. If no user-specific properties are supplied, an njodb.properties file will be created using default values; otherwise, the user-supplied properties will be merged with the default values (see Database properties below). If the data and temp directories do not exist, they will be created.

Example:

// Created in, or connected to, the current directory
const db = new njodb.Database();

// Created in, or connected to, another location, with user-supplied properties
const otherDb = new njodb.Database("/path/to/some/other/place", { datadir: "mydata", datastores: 2 });

Database properties

An NJODB Database has several properties that control its functioning. These properties can be set explicitly in the njodb.properties file in the root directory; otherwise, default properties will be used. For a newly created Database, an njodb.properties file will be created using default values.

Properties:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| datadir | string | The name of the subdirectory of root where data files will be stored | data |
| dataname | string | The file name that will be used when creating or accessing data files | data |
| datastores | number | The number of data partitions that will be used | 5 |
| tempdir | string | The name of the subdirectory of root where temporary data files will be stored | tmp |
| lockoptions | object | The options that will be used by proper-lockfile to lock data files | {"stale": 5000, "update": 1000, "retries": { "retries": 5000, "minTimeout": 250, "maxTimeout": 5000, "factor": 0.15, "randomize": false } } |
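As a rough illustration of how user-supplied properties could combine with the defaults listed above, the sketch below performs a shallow merge in which user values win. Whether njodb merges shallowly or deeply (for example, inside lockoptions) is not stated here, so treat the helper mergeProperties as an assumption.

```javascript
// Default values as listed in the properties table.
const defaults = {
    datadir: "data",
    dataname: "data",
    datastores: 5,
    tempdir: "tmp",
    lockoptions: {
        stale: 5000,
        update: 1000,
        retries: { retries: 5000, minTimeout: 250, maxTimeout: 5000, factor: 0.15, randomize: false }
    }
};

// Hypothetical helper (not part of njodb): shallow merge, user values win.
function mergeProperties(userProperties = {}) {
    return { ...defaults, ...userProperties };
}

const properties = mergeProperties({ datadir: "mydata", datastores: 2 });
console.log(properties.datadir, properties.datastores, properties.tempdir); // mydata 2 tmp
```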

Database management methods

stats

stats()

Returns statistics about the Database. Resolves with the following information:

| Name | Description |
| --- | --- |
| root | The path of the root directory of the Database |
| data | The path of the data subdirectory of the Database |
| temp | The path of the temp subdirectory of the Database |
| records | The number of records in the Database (the sum of the number of records in each datastore) |
| errors | The number of problematic records in the Database |
| size | The total size of the Database in "human-readable" format (the sum of the sizes of the individual datastores) |
| stores | The total number of datastores in the Database |
| min | The minimum number of records in a datastore |
| max | The maximum number of records in a datastore |
| mean | The mean (i.e., average) number of records in each datastore |
| var | The variance of the number of records across datastores |
| std | The standard deviation of the number of records across datastores |
| start | The timestamp of when the stats call started |
| end | The timestamp of when the stats call finished |
| elapsed | The amount of time in milliseconds required to run the stats call |
| details | An array of detailed stats for each datastore |
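The distribution fields (min, max, mean, var, std) describe how evenly records are spread across datastores. The sketch below shows one way such values could be derived from per-datastore record counts. It assumes population variance (dividing by the number of datastores), which this README does not specify, and the helper name summarize is hypothetical.

```javascript
// Hypothetical helper (not part of njodb): summary statistics over
// per-datastore record counts.
// Assumption: population variance (divide by the number of datastores).
function summarize(counts) {
    const stores = counts.length;
    const records = counts.reduce((total, count) => total + count, 0);
    const mean = records / stores;
    const variance = counts.reduce(
        (total, count) => total + (count - mean) * (count - mean), 0
    ) / stores;

    return {
        stores,
        records,
        min: Math.min(...counts),
        max: Math.max(...counts),
        mean,
        var: variance,
        std: Math.sqrt(variance)
    };
}

// Eight datastores holding 40 records in total: mean 5, var 4, std 2.
console.log(summarize([2, 4, 4, 4, 5, 5, 7, 9]));
```

A small standard deviation relative to the mean suggests well-balanced partitions.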

statsSync

statsSync()

A synchronous version of stats.

grow

grow()

Increases the number of datastores by one and redistributes the data across them.

growSync

growSync()

A synchronous version of grow.

shrink

shrink()

Decreases the number of datastores by one and redistributes the data across them. If the current number of datastores is one, calling shrink() will throw an error.

shrinkSync

shrinkSync()

A synchronous version of shrink.

resize

resize(size)

Changes the number of datastores and redistributes the data across them.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| size | number | The number of datastores (must be greater than zero) |
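To make the idea of redistribution concrete, here is a minimal in-memory sketch that assigns each record to one of size partitions at random, in the spirit of the balanced partitions described earlier. It is not njodb's actual resizing code; the helper name redistribute is an assumption.

```javascript
// Hypothetical helper (not part of njodb): assign each record to one of
// `size` partitions at random.
function redistribute(records, size) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError("size must be an integer greater than zero");
    }
    const partitions = Array.from({ length: size }, () => []);
    for (const record of records) {
        const index = Math.floor(Math.random() * size); // 0 .. size - 1
        partitions[index].push(record);
    }
    return partitions;
}

const records = Array.from({ length: 1000 }, (_, id) => ({ id }));
const partitions = redistribute(records, 4);
console.log(partitions.map(partition => partition.length)); // four counts that sum to 1000
```

With a uniform random choice, partition sizes tend toward equality as the number of records grows.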

resizeSync

resizeSync(size)

A synchronous version of resize.

drop

drop()

Deletes the database, including the data and temp directories, and the properties file.

dropSync

dropSync()

A synchronous version of drop.

getProperties

getProperties()

Returns the properties set for the Database. Will likely be deprecated in a future version.

setProperties

setProperties(properties)

Sets the properties for the Database. Will likely be deprecated in a future version.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| properties | object | The properties to set for the Database | See Database properties |

Data manipulation methods

insert

insert(data)

Inserts data into the Database.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| data | array | An array of JSON objects to insert into the Database |

Resolves with an object containing results from the insert:

| Name | Type | Description |
| --- | --- | --- |
| inserted | number | The number of objects inserted into the Database |
| start | date | The timestamp of when the insertions began |
| end | date | The timestamp of when the insertions finished |
| elapsed | number | The amount of time in milliseconds required to execute the insert |
| details | array | An array of insertion results for each individual datastore |

insertSync

insertSync(data)

A synchronous version of insert.

insertFile

insertFile(file)

Inserts data into the database from a file containing JSON data. The file itself does not need to be a valid JSON object; rather, it should contain a single stringified JSON object per line. Blank lines are ignored, and problematic data is collected in an errors array.

Resolves with an object containing results from the insertFile:

| Name | Type | Description |
| --- | --- | --- |
| inspected | number | The number of lines of the file inspected |
| inserted | number | The number of objects inserted into the Database |
| blanks | number | The number of blank lines in the file |
| errors | array | An array of problematic records in the file |
| start | date | The timestamp of when the insertions began |
| end | date | The timestamp of when the insertions finished |
| elapsed | number | The amount of time in milliseconds required to execute the insert |
| details | array | An array of insertion results for each individual datastore |

An example data file, data.json, is included in the test subdirectory. Among many valid records, it also includes blank lines and a malformed JSON object. To insert its data into the database:

db.insertFile("./test/data.json").then(results => { /* do something */ });
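The way lines are classified (valid record, blank line, or problematic data) can be sketched in a few lines of plain JavaScript. This is an illustration only, not njodb's code: the helper name inspectLines, the extra records property, and the shape of each error object are assumptions.

```javascript
// Hypothetical helper (not part of njodb): classify each line of a
// line-delimited JSON string as a record, a blank, or an error.
function inspectLines(text) {
    const results = { inspected: 0, inserted: 0, blanks: 0, errors: [], records: [] };

    text.split(/\r?\n/).forEach((line, index) => {
        results.inspected++;
        if (line.trim() === "") {
            results.blanks++; // blank lines are counted and otherwise ignored
            return;
        }
        try {
            results.records.push(JSON.parse(line));
            results.inserted++;
        } catch (error) {
            // Assumed error shape: the message, the 1-based line number, the raw text.
            results.errors.push({ error: error.message, line: index + 1, data: line });
        }
    });

    return results;
}

// Sample input: two valid records, one blank line, one malformed object.
const sample = '{"id":1}\n\n{"id":2,"name:"Steve"}\n{"id":3}';
const summary = inspectLines(sample);
console.log(summary.inspected, summary.inserted, summary.blanks, summary.errors.length); // 4 2 1 1
```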

insertFileSync

insertFileSync(file)

A synchronous version of insertFile.

select

select(selecter [, projecter])

Selects data from the Database.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| selecter | function | A function that returns a boolean that will be used to identify the records that should be returned |
| projecter | function | A function that returns an object that identifies the fields that should be returned |

Resolves with an object containing results from the select:

| Name | Type | Description |
| --- | --- | --- |
| data | array | An array of objects selected from the Database |
| selected | number | The number of objects selected from the Database |
| ignored | number | The number of objects that were not selected from the Database |
| errors | array | An array of problematic (i.e., un-parseable) records in the Database |
| start | date | The timestamp of when the selections began |
| end | date | The timestamp of when the selections finished |
| elapsed | number | The amount of time in milliseconds required to execute the select |
| details | array | An array of selection results, including error details, for each individual datastore |

Example with projection that selects all records, returns only the id and modified fields, but also creates a new one called newID:

db.select(
    () => true,
    record => { return {id: record.id, newID: record.id + 1, modified: record.modified }; }
);
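To show how a selecter and a projecter interact, the following in-memory sketch applies both to a plain array and reports the same kind of counts. It is a stand-in for illustration, not njodb's implementation; selectRecords is a hypothetical name.

```javascript
// Hypothetical in-memory stand-in for select(selecter [, projecter]).
function selectRecords(records, selecter, projecter) {
    const data = [];
    let ignored = 0;

    for (const record of records) {
        if (selecter(record)) {
            // Apply the projection only when one was supplied.
            data.push(projecter ? projecter(record) : record);
        } else {
            ignored++;
        }
    }

    return { data, selected: data.length, ignored };
}

const records = [
    { id: 1, name: "James", modified: 1616806973645 },
    { id: 2, name: "Steve", modified: 1616806973646 }
];

const results = selectRecords(
    records,
    record => record.id > 1,
    record => ({ id: record.id, newID: record.id + 1, modified: record.modified })
);
console.log(results.selected, results.ignored, results.data[0].newID); // 1 1 3
```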

selectSync

selectSync(selecter [, projecter])

A synchronous version of select.

update

update(selecter, updater)

Updates data in the Database.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| selecter | function | A function that returns a boolean that will be used to identify the records that should be updated |
| updater | function | A function that applies an update to a selected record and returns it |

Resolves with an object containing results from the update:

| Name | Type | Description |
| --- | --- | --- |
| selected | number | The number of objects selected from the Database for updating |
| updated | number | The number of objects updated in the Database |
| unchanged | number | The number of objects that were not updated in the Database |
| errors | array | An array of problematic (i.e., un-parseable) records in the Database or records that were unable to be updated |
| start | date | The timestamp of when the updates began |
| end | date | The timestamp of when the updates finished |
| elapsed | number | The amount of time in milliseconds required to execute the update |
| details | array | An array of update results, including error details, for each individual datastore |

updateSync

updateSync(selecter, updater)

A synchronous version of update.

delete

delete(selecter)

Deletes data from the Database.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| selecter | function | A function that returns a boolean that will be used to identify the records that should be deleted |

Resolves with an object containing results from the delete:

| Name | Type | Description |
| --- | --- | --- |
| deleted | number | The number of objects deleted from the Database |
| retained | number | The number of objects that were not deleted from the Database |
| errors | array | An array of problematic (i.e., un-parseable) records in the Database or records that were unable to be deleted |
| start | date | The timestamp of when the deletions began |
| end | date | The timestamp of when the deletions finished |
| elapsed | number | The amount of time in milliseconds required to execute the delete |
| details | array | An array of deletion results, including error details, for each individual datastore |

deleteSync

deleteSync(selecter)

A synchronous version of delete.

aggregate

aggregate(selecter, indexer [, projecter])

Aggregates data in the database.

Parameters:

| Name | Type | Description |
| --- | --- | --- |
| selecter | function | A function that returns a boolean that will be used to identify the records that should be aggregated |
| indexer | function | A function that returns an object that creates the index by which data will be grouped |
| projecter | function | A function that returns an object that identifies the fields that should be returned |

Resolves with an object containing results from the aggregate:

| Name | Type | Description |
| --- | --- | --- |
| data | array | An array of index objects selected from the Database |
| indexed | number | The number of records that were indexable (i.e., processable by the indexer function) |
| unindexed | number | The number of records that were un-indexable |
| errors | number | The number of problematic (i.e., un-parseable) records in the Database |
| start | date | The timestamp of when the aggregations began |
| end | date | The timestamp of when the aggregations finished |
| elapsed | number | The amount of time in milliseconds required to execute the aggregate |
| details | array | An array of selection results, including error details, for each individual datastore |

Each index object contains the following:

| Name | Type | Description |
| --- | --- | --- |
| index | any valid type | The value of the index created by the indexer function |
| count | number | The count of records that contained the index |
| aggregates | array | An array of aggregation objects, one for each field of the records returned |

Each aggregation object has a field property naming the field and a data object containing one or more of the following (non-numeric fields do not contain numeric aggregate data):

| Name | Type | Description |
| --- | --- | --- |
| min | any valid type | Minimum value of the field |
| max | any valid type | Maximum value of the field |
| sum | number | The sum of the values of the field (undefined if not a number) |
| mean | number | The mean (i.e., average) of the values of the field (undefined if not a number) |
| varp | number | The population variance of the values of the field (undefined if not a number) |
| vars | number | The sample variance of the values of the field (undefined if not a number) |
| stdp | number | The population standard deviation of the values of the field (undefined if not a number) |
| stds | number | The sample standard deviation of the values of the field (undefined if not a number) |
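For readers unfamiliar with the difference between the population and sample statistics, the sketch below computes all eight values for an array of numbers using the textbook definitions. It is illustrative only; njodb may compute these differently (for example, incrementally while streaming), and aggregateNumbers is a hypothetical name.

```javascript
// Hypothetical helper (not part of njodb): aggregate statistics for numbers.
function aggregateNumbers(values) {
    const count = values.length;
    const sum = values.reduce((total, value) => total + value, 0);
    const mean = sum / count;
    const squares = values.reduce(
        (total, value) => total + (value - mean) * (value - mean), 0
    );
    const varp = squares / count;       // population variance: divide by n
    const vars = squares / (count - 1); // sample variance: divide by n - 1 (needs n > 1)

    return {
        min: Math.min(...values),
        max: Math.max(...values),
        sum,
        mean,
        varp,
        vars,
        stdp: Math.sqrt(varp),
        stds: Math.sqrt(vars)
    };
}

// The integers 0 through 49.
const stats = aggregateNumbers(Array.from({ length: 50 }, (_, i) => i));
console.log(stats.sum, stats.mean, stats.varp.toFixed(2), stats.vars.toFixed(2));
// 1225 24.5 208.25 212.50
```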

An example that generates aggregates for all records and fields, grouped by state and lastName:

db.aggregate(
    () => true,
    record => [record.state, record.lastName]
);

Another example that generates aggregates for records with an ID less than 1000, grouped by state, but for only two fields (note the non-numeric fields do not include numeric aggregate data):

db.aggregate(
    record => record.id < 1000,
    record => record.state,
    record => { return {favoriteNumber: record.favoriteNumber, firstName: record.firstName}; }
);
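The role of the indexer can be illustrated with a small in-memory grouping sketch: records that pass the selecter are grouped by whatever value the indexer returns, and each group keeps a count. This is not njodb's implementation; groupByIndex is a hypothetical name, and serializing the index with JSON.stringify to compare array indexes by value is an assumption of the sketch.

```javascript
// Hypothetical in-memory grouping (not part of njodb): count records per index.
function groupByIndex(records, selecter, indexer) {
    const groups = new Map();

    for (const record of records) {
        if (!selecter(record)) continue;
        const index = indexer(record);
        // Serialize so that array indexes such as [state, lastName] compare by value.
        const key = JSON.stringify(index);
        if (!groups.has(key)) groups.set(key, { index, count: 0 });
        groups.get(key).count++;
    }

    return [...groups.values()];
}

const people = [
    { id: 1, state: "Maryland", lastName: "Davis" },
    { id: 2, state: "Virginia", lastName: "Smith" },
    { id: 3, state: "Maryland", lastName: "Jones" }
];

// Grouped by state: Maryland has a count of 2, Virginia a count of 1.
console.log(groupByIndex(people, () => true, record => record.state));
```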

Example aggregate data array:

[
    {
        index: "Maryland",
        count: 50,
        aggregates: [
            {
                field: "favoriteNumber",
                data: {
                    min: 0,
                    max: 98,
                    sum: 2450,
                    mean: 49,
                    varp: 833,
                    vars: 850,
                    stdp: 28.861739379323623,
                    stds: 29.154759474226502
                }
            },
            {
                field: "firstName",
                data: {
                    min: "Elizabeth",
                    max: "William"
                }
            }
        ]
    },
    {
        index: "Virginia",
        count: 50,
        aggregates: [
            {
                field: "favoriteNumber",
                data: {
                    min: 0,
                    max: 49,
                    sum: 1225,
                    mean: 24.5,
                    varp: 208.25000000000003,
                    vars: 212.50000000000003,
                    stdp: 14.430869689661813,
                    stds: 14.577379737113253
                }
            },
            {
                field: "firstName",
                data: {
                    min: "James",
                    max: "Robert"
                }
            }
        ]
    }
]

aggregateSync

aggregateSync(selecter, indexer [, projecter])

A synchronous version of aggregate.

Finding and fixing problematic data

Many methods return information about problematic records they encounter (e.g., records that cannot be parsed with JSON.parse(), or that could not be updated or deleted). They report both a count of such records and details about them in the details array. Each object in the details array, one for each datastore, contains an errors array of objects describing the problematic records.

For un-parseable records, each error object includes the line of the datastore file where the problematic record was found, as well as a copy of the record itself. To address such a record, load the datastore file in a text editor and either correct the record or remove it. For records that couldn't be deleted or updated, each error object includes a copy of the record itself, which can be used to make another attempt to update or delete the record(s), or otherwise handle the failure.

Here is an example of the details for a datastore that contains an un-parseable record. As you can see, the record is on the tenth line of the file, and the problem is that the lastName key is missing its closing quote. Simply adding the quote fixes the record.

{
  store: '/Users/jamesbontempo/github/njodb/data/data.0.json',
  size: 1512464,
  lines: 8711,
  records: 8709,
  errors: [
    {
      error: 'Unexpected token D in JSON at position 42',
      line: 10,
      data: '{"id":232,"firstName":"Robert","lastName:"Davis","state":"Illinois","birthdate":"1990-10-22","favoriteNumbers":[5,34,1],"favoriteNumber":183,"modified":1616806973645}'
    }
  ],
  blanks: 1,
  created: 2021-03-27T01:20:21.562Z,
  modified: 2021-03-27T01:28:32.686Z,
  start: 1616808517081,
  end: 1616808517124,
  elapsed: 43
}
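When several datastores report problems, it can help to flatten them into one list. The sketch below does that using only the store, errors, error, line, and data fields shown in the example above. The helper name collectErrors and the sample results object are made up for illustration.

```javascript
// Hypothetical helper (not part of njodb): gather every error from a result's
// details array, tagging each with the datastore file it came from.
function collectErrors(results) {
    return results.details.flatMap(detail =>
        (detail.errors || []).map(error => ({ store: detail.store, ...error }))
    );
}

// Sample results object, made up for this example.
const results = {
    details: [
        {
            store: "data/data.0.json",
            errors: [
                {
                    error: "Unexpected token D in JSON at position 42",
                    line: 10,
                    data: '{"id":232,"lastName:"Davis"}'
                }
            ]
        },
        { store: "data/data.1.json", errors: [] }
    ]
};

for (const problem of collectErrors(results)) {
    console.log(`${problem.store}:${problem.line} ${problem.error}`);
}
// data/data.0.json:10 Unexpected token D in JSON at position 42
```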