1.0.0 • Published 9 years ago

suudenstats v1.0.0

Weekly downloads
2
License
MIT
Repository
github
Last release
9 years ago

suddenStats

demo: (coming back soon)

minimal config common statistics

intended for high volume streaming data

a statsistics library will typically require a separate lop through an array of data to get each stat. sudenStats is written to reduce the loops through an array to be as fast as possible and get everything in 1 pass.

can use these stats to make very fast decisions in app logic, or display constantly changing values.

Features

  • fast generic numeric statstics
  • get stats on unique values, and unique combinations, and unique values + numeric stats
  • stats trim functions to keep what's most relevant and needed for other decisions
  • after in line filter callback to tie to immediate actions
  • batching, with auto throttling
  • rolling time windows
  • auto aggregation of time windows
  • in line filters, get stats for subsets
  • works in node and browser

QuickStart:

for node:

npm install suddenstats --save

or

browser:

bower install suddenstats --save

look at the tests and the demo. The demo can run in a browser

node include

var SuddenStats = require('suddenstats');

browser include

<script src="bower_components/suddenstats/utils.js"></script>
<script src="bower_components/suddenstats/SuddenStats.js"></script>

Example use from demo

var objStats = new SuddenStats({
    stats:{ 
      ips:{type:"uniq",path:"user",limit:10,padding:5,filter:[{path:"server_name",op:"ne",val:"en.wikipedia.org"},{path:"user",op:"in",val:"."}]}
      ,type:{type:"uniq",path:"type",level:'hour'}
      ,server:{type:"uniq",path:"server_name",keep:"newHigh",limit:20}
      ,size:{type:"numeric",path:"length.new"}
      ,size:{type:"numeric",path:"length.new"}
      ,bot:{type:"uniq",path:"bot"}
    }
  });

API

  • qData(dataObject) : put the data in a small buffer to be batched for efficiency
  • addData(dataObject) : apply data immediately

Configuring Stats

Statistics Types

  • numeric: pure numerical stats
  • uniq: unique value counts, gives each unique value and count
  • compete: unique value + numeric stats, will give you each unique value + numeric statistics (not just counts)
  • co-occurence: 2 value combination counts, same as unique but works for unique combinations. value1_value2

Windows: defined by "level" param

  • minute, keep up to 60 minute windows
  • hour, keep 60 minute windows, automatically aggregate them into hourly windows
  • day, 60 minute windows, 24 hourly windows, daily windows.

Filters:

  • has: val2 many strings mentioned in val found in path an IP has 3 periods. path:'ip',val='.',val2:3
  • eq: exact matches
  • in: substring or item in array match
  • ni: not in, opposite of in
  • gt: greater than
  • lt: less than
  • ne: not equals

Options

  • limit: max number of values to trim to
  • padding: How many items over limit before starting a trim, less frquent trims may be faster
  • keep: method for decising which to keep (newest),"newHigh"