0.0.15 • Published 9 years ago

License
MIT

term-frequency

A simple term frequency library that takes a document vector and computes term frequencies using the scheme of your choice.

First, require the necessary modules:

var sw = require('stopword');
var tf = require('term-frequency');
var tv = require('term-vector');

You can then do:

var vec = tv.getVector(
  sw.removeStopwords(
    'This is a really, really cool vector. I like this VeCTor'
      .toLowerCase()
      .split(/[ ,\.]+/)
  )
);
var freq = tf.getTermFrequency(vec);
// freq is now:
// [ [ [ 'cool' ], 1 ], [ [ 'really' ], 2 ], [ [ 'vector' ], 2 ] ]
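
The result above is a list of pairs, each holding a term (itself an array) and its count. As a rough illustration of that shape, here is a minimal sketch that counts tokens without the library. The `countTerms` helper is hypothetical and not part of term-frequency. It assumes that the default output is a plain occurrence count sorted by term, which is what the example output suggests.

```javascript
// Hypothetical helper, not part of term-frequency: counts how often each
// token occurs and returns [[term], count] pairs sorted by term.
function countTerms(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return Array.from(counts.keys())
    .sort()
    .map((term) => [[term], counts.get(term)]);
}

// Tokens left after lowercasing, splitting, and stopword removal
const tokens = ['really', 'really', 'cool', 'vector', 'vector'];
console.log(JSON.stringify(countTerms(tokens)));
// prints [[["cool"],1],[["really"],2],[["vector"],2]]
```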

Or you can specify a TF scheme like so:

// vec is the stopword-filtered document vector from the previous example
var freq = tf.getTermFrequency(vec, {scheme: tf.logNormalization});
// freq is now:
// [
//   [ [ 'cool' ], 0.6931471805599453 ],
//   [ [ 'really' ], 1.0986122886681098 ],
//   [ [ 'vector' ], 1.0986122886681098 ]
// ]
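
The scores above are consistent with log(1 + f), where f is the raw count of a term: log(2) for 'cool' and log(3) for 'really' and 'vector'. This is inferred from the example output rather than from the library's source. A minimal sketch with a hypothetical `logNormalize` helper:

```javascript
// Hypothetical helper, not part of term-frequency: applies the natural
// logarithm of (1 + count) to each [[term], count] pair.
function logNormalize(rawFreq) {
  return rawFreq.map(([term, count]) => [term, Math.log(1 + count)]);
}

const raw = [[['cool'], 1], [['really'], 2], [['vector'], 2]];
const scaled = logNormalize(raw);
console.log(scaled[0][1] === Math.log(2)); // prints true
console.log(scaled[1][1] === Math.log(3)); // prints true
```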

Currently supported schemes are:

  • raw
  • logNormalization
  • doubleNormalization0point5
  • selfString
  • selfNumeric

See the Wikipedia page on tf-idf for more information about how term frequency schemes are calculated.

You can also weight your calculations. A weight is a numeric value that is added to each calculated score:

var freq = tf.getTermFrequency(vec, {
  scheme: tf.doubleNormalization0point5,
  weight: 5
});
// freq is now:
// [
//   [ [ 'cool' ], 5.7027325540540822 ],
//   [ [ 'really' ], 5.9581453659370776 ],
//   [ [ 'vector' ], 5.9581453659370776 ]
// ]
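
Since the weight is simply added to each score, its effect can be sketched independently of the scheme. The `addWeight` helper below is hypothetical, and the input scores are arbitrary placeholder values, not output from the library.

```javascript
// Hypothetical helper, not part of term-frequency: adds a constant weight
// to the score of every [[term], score] pair.
function addWeight(freq, weight) {
  return freq.map(([term, score]) => [term, score + weight]);
}

// Arbitrary placeholder scores, chosen only for illustration
const scored = [[['cool'], 0.5], [['really'], 1.5]];
console.log(JSON.stringify(addWeight(scored, 5)));
// prints [[["cool"],5.5],[["really"],6.5]]
```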