clowncar

Stream array items out of incoming JSON

Usage

const Clowncar = require('clowncar');
const Wreck = require('wreck');

(async () => {

    const res = await Wreck.request('get', 'https://api.npms.io/v2/search?q=streams&size=100');
    const clowncar = new Clowncar('results');

    res.pipe(clowncar).on('data', (result) => {

        console.log(result.package.name);
    });
})();

API

new Clowncar([options])

Returns a new Transform stream which will receive streaming JSON and output the items of an array within that JSON, where options is either,

  • an array or string specifying pathToArray as described below, or
  • an object of the form,

    • pathToArray - a path in the form of an array or Hoek.reach()-style string, specifying where in the incoming JSON the array will be found. Defaults to [], meaning that the incoming JSON is the array itself.

      For example, ['a', 1, 'b'] and 'a.1.b' both represent the array ["this", "array"] within the following JSON, which is also used in the sketches following this list,

       {
         "a": [
           "junk",
           {
             "b": ["this", "array"]
           },
           "junk"
         ]
       }
    • keepRemainder - a boolean specifying whether the remainder of the incoming JSON (omitting the array items at pathToArray) should be emitted after the stream ends. When true, a 'remainder' event is emitted after the 'end' event with a single argument containing the remainder. Defaults to false.

    • doParse - a boolean specifying whether the outgoing array items and remainder should be parsed (with JSON.parse()), or left as buffers. Defaults to true. When true, the stream is placed into object mode. Both keepRemainder and doParse appear in the second sketch following this list.
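
Putting pathToArray to work on the JSON shown above, the following sketch (which, purely for brevity, assumes the whole document arrives as a single chunk) emits "this" and "array" as separate items,

const Clowncar = require('clowncar');

// 'a.1.b' could equivalently be passed as ['a', 1, 'b'],
// or directly as new Clowncar('a.1.b')
const clowncar = new Clowncar({ pathToArray: 'a.1.b' });

clowncar.on('data', (item) => {

    console.log(item);    // "this", then "array"
});

clowncar.end('{"a":["junk",{"b":["this","array"]},"junk"]}');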

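The remaining two options can be sketched against the same input. With keepRemainder, whatever is left of the document arrives on a 'remainder' event once the stream ends; switching doParse to false would instead hand over both the items and the remainder as raw buffers,

const Clowncar = require('clowncar');

const clowncar = new Clowncar({
    pathToArray: 'a.1.b',
    keepRemainder: true   // emit what's left of the document once the stream ends
    // doParse: false     // uncomment to receive raw buffers rather than parsed values
});

clowncar.on('data', (item) => {

    console.log('item:', item);
});

clowncar.on('remainder', (remainder) => {

    // The incoming JSON with the array items at pathToArray omitted
    console.log('remainder:', remainder);
});

clowncar.end('{"a":["junk",{"b":["this","array"]},"junk"]}');
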
Extras

Approach

A major downside of parsing streaming JSON is that it is much slower than using JSON.parse(). While JSON.parse() is exceptionally fast, parsing large JSON documents does block the event loop and use memory, sometimes in nasty ways. A common way a JSON document becomes large is by containing an array of arbitrary length; this is where clowncar shines. Rather than carefully parsing every bit of JSON as it streams, clowncar just identifies the items within an array, then JSON.parse()s each one of them separately. That is, clowncar parses as little of the JSON as necessary on its own and leaves the heavy lifting to JSON.parse(). This keeps a low memory and event-loop footprint while taking advantage of the speed of JSON.parse(), which no streaming JSON parser can touch today.
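
To make that concrete, here's a minimal sketch with some made-up chunk boundaries: each array item should come out on its own, having been handed to JSON.parse() individually, even though the second item is split across chunks,

const Clowncar = require('clowncar');

// Default options: pathToArray is [], so the incoming JSON is the array itself
const clowncar = new Clowncar();

clowncar.on('data', (item) => {

    console.log(item);    // { id: 1 }, then { id: 2 }, then { id: 3 }
});

// The chunk boundaries are arbitrary; the second item is split across them
clowncar.write('[{"id":1},{"i');
clowncar.write('d":2},{"id":3}]');
clowncar.end();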

Resources

If you're into streaming JSON, you've gotta check out the following,

  • json-depth-stream - used internally by clowncar, this streaming JSON parser runs light and, uniquely, can parse to a specified max depth
  • JSONStream - perhaps the most mature streaming JSON parser
  • json-stream-bench - benchmarking various streaming JSON implementations