0.0.3 • Published 12 years ago
stm
Takes in a document and returns a tokenised, stemmed array of terms, using Porter's algorithm.
Usage
stm.stem(text, noStopWords) returns an array of stemmed terms with punctuation removed.
text is the string (text document) to tokenise and stem.
noStopWords defaults to true, which removes stop words. Set it to false if you want to keep stop words, e.g. words such as "I" and "the".
Note: This is basically a wrapper around the stem-porter library by kastor.
var stm = require('stm');
var str = "you're simply a simplistic house, made for housing";
var stemmed = stm.stem(str); // noStopWords defaults to true, so stop words are removed
// => ["simpli", "simplist", "hous", "hous"]
var withStopWords = stm.stem(str, false); // keep stop words in the result
// => ["you", "re", "simpli", "a", "simplist", "hous", "made", "for", "hous"]
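To make the effect of noStopWords more concrete, here is a rough sketch of the kind of pipeline described above: tokenise, optionally drop stop words, then stem. It is not stm's actual source. The names tokenise, stemSketch and STOP_WORDS are hypothetical, the stop-word list is a tiny illustrative one, and the stemmer argument is a placeholder that defaults to the identity function, so the output below is unstemmed unless you pass in a real Porter stemmer.

```javascript
// Hypothetical stop-word list for illustration only; the list stm uses may differ.
const STOP_WORDS = new Set(['i', 'a', 'the', 'you', 're', 'made', 'for']);

// Split text into lowercase word tokens, dropping punctuation and empty strings.
function tokenise(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Sketch of the pipeline: tokenise, optionally remove stop words, then stem.
// `stemmer` is a placeholder (identity by default); a Porter stemmer
// function taking one token and returning its stem could be passed instead.
function stemSketch(text, noStopWords = true, stemmer = (token) => token) {
  let tokens = tokenise(text);
  if (noStopWords) {
    tokens = tokens.filter((token) => !STOP_WORDS.has(token));
  }
  return tokens.map(stemmer);
}
```

With the identity stemmer, stemSketch("you're simply a simplistic house, made for housing") gives ["simply", "simplistic", "house", "housing"], and passing false as the second argument keeps all nine tokens, including "you", "re", "a", "made" and "for". Keeping the stemmer as a parameter is just a convenience for the sketch, since it lets the stop-word behaviour be checked independently of the stemming step.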