0.0.15 • Published 8 years ago

term-frequency v0.0.15

License
MIT
Repository
github


term-frequency

A simple term frequency library that takes a document vector and computes term frequencies using the scheme of your choice.

First, require the necessary modules:

var sw = require('stopword');
var tf = require('term-frequency');
var tv = require('term-vector');

You can then do:

var vec = tv.getVector(
  sw.removeStopwords(
    'This is a really, really cool vector. I like this VeCTor'
      .toLowerCase()
      .split(/[ ,\.]+/)
  )
)
var freq = tf.getTermFrequency(vec);
// freq is now:
// [ [ [ 'cool' ], 1 ], [ [ 'really' ], 2 ], [ [ 'vector' ], 2 ] ]
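
To make the tokenization step above concrete, here is a small dependency-free sketch of what the `toLowerCase().split(...)` call produces, followed by a plain raw count. The `countTerms` helper is hypothetical and is not part of term-frequency or term-vector. Stopword removal is skipped here, so words such as 'this' still appear in the counts.

```javascript
// Hypothetical helper, NOT part of term-frequency: count raw occurrences per token.
function countTerms(tokens) {
  var counts = {};
  tokens.forEach(function (t) {
    counts[t] = (counts[t] || 0) + 1;
  });
  return counts;
}

// Same lowercasing and splitting as in the example above.
var tokens = 'This is a really, really cool vector. I like this VeCTor'
  .toLowerCase()
  .split(/[ ,\.]+/);
// tokens: ['this', 'is', 'a', 'really', 'really', 'cool',
//          'vector', 'i', 'like', 'this', 'vector']

var counts = countTerms(tokens);
// counts.really is 2, counts.vector is 2, counts.cool is 1
```

Lowercasing before splitting is what lets 'vector' and 'VeCTor' count as the same term.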

Or you can specify a TF scheme like so:

// vec is the document vector built in the previous example
var freq = tf.getTermFrequency(vec, {scheme: tf.logNormalization});
// freq is now:
// [
//   [ [ 'cool' ], 0.6931471805599453 ],
//   [ [ 'really' ], 1.0986122886681098 ],
//   [ [ 'vector' ], 1.0986122886681098 ]
// ]

Currently supported schemes are:

  • raw
  • logNormalization
  • doubleNormalization0point5
  • selfString
  • selfNumeric

See the Wikipedia page for more information about term frequency calculation.
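
As a rough guide to three of these schemes, the sketch below writes out the textbook definitions of raw, log-normalized, and double-normalized (K = 0.5) term frequency as standalone functions. The function names `rawTf`, `logNormalizedTf`, and `doubleNormalizedTf` are hypothetical and are not the library's exports. These are the standard formulas only; they have not been checked against the library's source, so its actual results may differ, particularly for `doubleNormalization0point5`.

```javascript
// Textbook definitions, written as standalone functions.
// Illustrative only: these are NOT term-frequency's own scheme functions.

// raw: the occurrence count itself
function rawTf(count) {
  return count;
}

// log normalization: natural log of (1 + count)
function logNormalizedTf(count) {
  return Math.log(1 + count);
}

// double normalization 0.5: 0.5 + 0.5 * (count / largest count in the document)
function doubleNormalizedTf(count, maxCount) {
  return 0.5 + 0.5 * (count / maxCount);
}

// Raw counts taken from the first example in this README.
var counts = { cool: 1, really: 2, vector: 2 };
var terms = Object.keys(counts);
var maxCount = Math.max.apply(null, terms.map(function (t) { return counts[t]; }));

var logScores = terms.map(function (t) {
  return [t, logNormalizedTf(counts[t])];
});
var doubleScores = terms.map(function (t) {
  return [t, doubleNormalizedTf(counts[t], maxCount)];
});
```

Log normalization dampens the effect of very frequent terms, while double normalization scales each count against the most frequent term in the same document.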

You can also weight your calculations. A weight is a numeric value that is added to each calculated score:

var freq = tf.getTermFrequency(vec, {
  scheme: tf.doubleNormalization0point5, 
  weight: 5
});
// freq is now
// [
//   [ [ 'cool' ], 5.7027325540540822 ],
//   [ [ 'really' ], 5.9581453659370776 ],
//   [ [ 'vector' ], 5.9581453659370776 ] 
// ]
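
The additive behaviour of `weight` can be pictured with a short sketch. The `addWeight` helper is hypothetical and not part of the library, and the input scores are illustrative values rather than library output. It only mirrors the statement above that the weight is added to each score.

```javascript
// Hypothetical helper, NOT part of term-frequency: add a constant weight
// to every score in a [[terms, score], ...] list, leaving the terms untouched.
function addWeight(freq, weight) {
  return freq.map(function (entry) {
    return [entry[0], entry[1] + weight];
  });
}

// Illustrative scores in the same [[terms, score], ...] shape shown above.
var unweighted = [ [ [ 'cool' ], 0.75 ], [ [ 'really' ], 1 ] ];
var weighted = addWeight(unweighted, 5);
// weighted[0][1] is 5.75 and weighted[1][1] is 6
```

Because the weight is added rather than multiplied, it shifts every score by the same amount and does not change the ordering of terms within a single document.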