0.0.5 • Published 10 years ago
tkn v0.0.5
tkn
Simple word tokeniser that ignores punctuation and returns an array
of words.
Usage
tokenise(text, noStopWords)
Returns an array of terms, without punctuation.
text is the string (text document) to tokenise.
noStopWords defaults to true. Set it to false if you want to include stop words, e.g. words such as "I" and "the".
var tkn = require('tkn');
var str = "you're simply a test, a mere test";
var tokenised = tkn.tokenise(str);
>> ['simply', 'test', 'mere', 'test']
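To make the effect of noStopWords more concrete, here is a minimal sketch of a tokeniser with the same call shape. It is not tkn's actual implementation: the function name sketchTokenise, the regular expression, and the tiny stop-word list are all assumptions made for illustration, and the real package may split words or choose stop words differently.

```javascript
// Hypothetical stop-word list, for illustration only.
// The list used by tkn itself is not shown in this README.
const STOP_WORDS = new Set(['i', 'a', 'the', 'you', "you're", 'and', 'of']);

// Sketch of a tokeniser mirroring the tokenise(text, noStopWords) signature.
function sketchTokenise(text, noStopWords = true) {
  // Lowercase the input, then keep runs of letters, digits and apostrophes.
  // Everything else (spaces, commas, full stops, ...) acts as a separator.
  const words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  // Drop stop words unless the caller asked to keep them.
  return noStopWords ? words.filter((w) => !STOP_WORDS.has(w)) : words;
}

const sample = "you're simply a test, a mere test";
console.log(sketchTokenise(sample));        // stop words removed
console.log(sketchTokenise(sample, false)); // stop words kept
```

With this assumed stop-word list, the first call reproduces the output shown above, while the second keeps every word, including "you're" and "a".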