streamsummary-stream v0.4.0

License: MIT

streamsummary-stream

Stream-based implementation of the StreamSummary data structure described in this paper.

Pipe in your buffers/strings to get approximate top-K most frequent elements.

var StreamSummary = require('streamsummary-stream');
var ss = new StreamSummary(50);

//...

myDataSource.pipe(ss);

ss.on('finish', function() {
  console.log(ss.frequency('42'));
  console.log(ss.top());
});
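To make "approximate" concrete, here is a minimal sketch of the Space-Saving counting scheme with which the Stream-Summary structure is usually associated. It is an illustration under stated assumptions, not this module's code: the class name SpaceSavingSketch, its add method, and the array returned by top() are all hypothetical, and the linear scan for the minimum stands in for the bucket list that makes the real structure efficient.

```javascript
// Hypothetical illustration only, not the module's implementation.
class SpaceSavingSketch {
  constructor(size) {
    this.size = size;        // maximum number of tracked elements
    this.counts = new Map(); // element -> approximate count
  }

  add(element) {
    const key = String(element);
    if (this.counts.has(key)) {
      // Already tracked: just bump its counter.
      this.counts.set(key, this.counts.get(key) + 1);
    } else if (this.counts.size < this.size) {
      // Free slot: start tracking with a count of 1.
      this.counts.set(key, 1);
    } else {
      // Full: evict the element with the smallest count and let the
      // newcomer inherit that count plus one (hence the overestimate).
      let minKey = null;
      let minCount = Infinity;
      for (const [k, c] of this.counts) {
        if (c < minCount) {
          minKey = k;
          minCount = c;
        }
      }
      this.counts.delete(minKey);
      this.counts.set(key, minCount + 1);
    }
  }

  // Approximate count, or null when the element is not tracked.
  frequency(element) {
    const key = String(element);
    return this.counts.has(key) ? this.counts.get(key) : null;
  }

  // Tracked elements in ascending order of approximate count.
  top() {
    return Array.from(this.counts.entries())
      .sort((a, b) => a[1] - b[1])
      .map((entry) => entry[0]);
  }
}

const sketch = new SpaceSavingSketch(2);
['a', 'a', 'b', 'c', 'a'].forEach((x) => sketch.add(x));
console.log(sketch.frequency('a')); // 3
console.log(sketch.frequency('b')); // null (evicted when 'c' arrived)
console.log(sketch.frequency('c')); // 2, an overestimate of the true count 1
console.log(sketch.top());          // [ 'c', 'a' ]
```

The overestimate for 'c' shows why reported frequencies should be read as upper bounds rather than exact counts in this kind of scheme.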

Requires ES6 Map

This module uses ES6 Maps, so you probably need node.js >= 0.12 or io.js.
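If you are unsure whether a runtime provides Map, a guard along these lines can fail early with a readable message. This is only a sketch: hasNativeMap is a hypothetical helper, not something the module exports.

```javascript
// Hypothetical helper: reports whether the runtime provides a native Map.
function hasNativeMap() {
  return typeof Map === 'function' &&
    typeof Map.prototype.forEach === 'function';
}

if (!hasNativeMap()) {
  throw new Error('streamsummary-stream needs ES6 Map support');
}
```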

API

StreamSummary(size, streamOpts)

Construct a new writable StreamSummary to track the size most frequent elements (extends Stream.Writable).

  • size - the number of elements to track
  • streamOpts - the options to pass to the Stream constructor
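For readers unfamiliar with writable streams, the constructor shape described above can be sketched with a small stand-in class. CountingWritable is hypothetical and is not the module's class: it mirrors the (size, streamOpts) signature and forwards streamOpts to the Writable constructor, but it counts chunks exactly and does not implement the approximate structure.

```javascript
const { Writable } = require('stream');

// Hypothetical stand-in, not the module's class.
class CountingWritable extends Writable {
  constructor(size, streamOpts) {
    super(streamOpts);       // streamOpts goes straight to the Writable constructor
    this.size = size;        // kept only to mirror the documented signature
    this.counts = new Map(); // chunk text -> exact count
  }

  // Called by the stream machinery once per written chunk.
  _write(chunk, encoding, callback) {
    const key = chunk.toString();
    this.counts.set(key, (this.counts.get(key) || 0) + 1);
    callback();
  }
}

const counter = new CountingWritable(50);
counter.on('finish', () => {
  console.log(counter.counts.get('42')); // 2
});
counter.write('42');
counter.write('7');
counter.end('42');
```

Anything that can be piped into a Writable, such as a file or socket read stream, could be piped into a class built this way.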

StreamSummary.frequency(element)

Get the approximate frequency of element. Returns null if the element isn't in the top size elements.

  • element - the value in question

StreamSummary.top()

Get the top size most frequent elements in ascending order of frequency.

StreamSummary.export()

Export the StreamSummary data as an object. Exported object will look like:

{
  size: 42,
  numUsedBuckets: 40,
  trackedElements: {...},
  registers: [...]
}

StreamSummary.import(data)

Import a StreamSummary data object (expects same format as export() returns).

  • data - object containing StreamSummary data
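One use for export() and import() is persisting a summary between runs. The sketch below assumes the exported object is plain, JSON-serializable data, which the README does not state explicitly. saveSummary and loadSummary are hypothetical helpers, and the sample object only reuses the documented keys with placeholder values.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helpers: write an exported summary to disk and read it back.
function saveSummary(data, file) {
  fs.writeFileSync(file, JSON.stringify(data));
}

function loadSummary(file) {
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Placeholder with the documented keys. The real contents of trackedElements
// and registers come from export() and are not reproduced here.
const exported = {
  size: 42,
  numUsedBuckets: 0,
  trackedElements: {},
  registers: []
};

const file = path.join(os.tmpdir(), 'streamsummary-export.json');
saveSummary(exported, file);
const restored = loadSummary(file);
console.log(restored.size); // 42
```

With the module itself, the object passed to saveSummary would come from export(), and the restored object would be handed to import(data) on a fresh instance.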

StreamSummary.merge(ss)

Merge another StreamSummary with this one. Returns a new StreamSummary of size equal to the combined sizes of the two.

  • ss - another StreamSummary instance
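The idea of merging can be sketched on plain count maps: add the counts of elements both sides track, and give the result the combined capacity so that every tracked element from either side still fits. This is an assumption-laden illustration, not the module's merge algorithm. The { size, counts } shape and the mergeCounts function are hypothetical.

```javascript
// Hypothetical summary shape for illustration: { size, counts: Map }.
function mergeCounts(a, b) {
  const counts = new Map(a.counts);
  for (const [element, count] of b.counts) {
    // Elements tracked on both sides get their counts added together.
    counts.set(element, (counts.get(element) || 0) + count);
  }
  // Combined capacity, matching the documented size of the merged result.
  return { size: a.size + b.size, counts: counts };
}

const left = { size: 2, counts: new Map([['a', 3], ['b', 1]]) };
const right = { size: 2, counts: new Map([['a', 2], ['c', 4]]) };
const merged = mergeCounts(left, right);
console.log(merged.size);            // 4
console.log(merged.counts.get('a')); // 5
```

Neither input is modified, which matches the documented behavior of returning a new instance.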