tdm-skeeft v0.0.1
tdm-skeeft
tdm-skeeft is a tdm module for term extraction from structured text. It can be used to get the keywords (or a summary) of a document.
Installation
Using npm :
$ npm i -g tdm-skeeft
$ npm i --save tdm-skeeft
Using Node :
/* Require the Skeeft module */
const Skeeft = require('tdm-skeeft');
/* Build a new instance of Matrix */
let matrix = new Skeeft.Matrix();
/* Build a new instance of Indexator */
let indexator = new Skeeft.Indexator();
Launch tests
$ npm run test
Build documentation
$ npm run docs
API Documentation
Classes
Functions
Indexator
Kind: global class
new Indexator(options)
Returns: Indexator - An instance of Indexator
Param | Type | Description |
---|---|---|
options | Object | Options of constructor |
options.filters | Object | Filters options given to title & fulltext extractors |
options.filters.title | Filter | Options given to extractor of title |
options.filters.fulltext | Filter | Options given to extractor of fulltext |
options.stopwords | Object | Stopwords |
options.dictionary | Object | Dictionary |
Example (Example usage of 'constructor' (with parameters))
let options = {
  'filters': {
    'title': customTitleFilter, // Assuming customTitleFilter contains your custom settings
    'fulltext': customFulltextFilter // Assuming customFulltextFilter contains your custom settings
  },
  'dictionary': myDictionary, // Assuming myDictionary contains your custom settings
  'stopwords': myStopwords // Assuming myStopwords contains your custom settings
};
let indexator = new Indexator(options);
// returns an instance of Indexator with custom options
Example (Example usage of 'constructor' (with default values))
let indexator = new Indexator();
// returns an instance of Indexator with default options
indexator.summarize(xmlString, selectors, indexation, delimiter) ⇒ Array
Summarize a fulltext
Kind: instance method of Indexator
Returns: Array - List of extracted sentences (representative summary)
Param | Type | Description |
---|---|---|
xmlString | String | Fulltext (XML formatted string) |
selectors | Object | Used selectors |
selectors.title | String | Selector of the title |
selectors.segments | Array | Selectors of the segments |
indexation | Object | Indexation of xmlString |
delimiter | RegExp | Delimiter used to split text into sentences |
Example (Example usage of 'summarize' function)
let indexator = new Indexator();
indexator.summarize(xmlString, {'title': 'title', 'segments': ['paragraph1', 'paragraph2']}, indexator.index(xmlString)); // returns an ordered Array of Objects [{...}, {...}]
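The `delimiter` parameter is a RegExp used to split text into sentences. The default value used by tdm-skeeft is not stated here, so the pattern below is only an illustrative assumption: a minimal sketch of how such a RegExp splits a plain string, independent of the library.

```javascript
// Hypothetical delimiter: split on whitespace that follows '.', '!' or '?'.
// This is NOT the documented default of tdm-skeeft, only an example pattern.
const delimiter = /(?<=[.!?])\s+/;

const text = 'Term extraction finds keywords. It can also build a summary! Does it work?';

// The lookbehind keeps the punctuation attached to each sentence.
const sentences = text.split(delimiter);
```

A pattern of this kind could then be passed as the fourth argument of `summarize`, provided the library accepts a delimiter of that form.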
indexator.index(xmlString, selectors, criterion) ⇒ Array
Index a fulltext
Kind: instance method of Indexator
Returns: Array - List of extracted keywords
Param | Type | Description |
---|---|---|
xmlString | String | Fulltext (XML formatted string) |
selectors | Object | Used selectors |
criterion | String | Criterion used (sort) |
Example (Example usage of 'index' function)
let indexator = new Indexator();
indexator.index(xmlString, {'title': 'title', 'segments': ['paragraph1', 'paragraph2']}); // returns an ordered Array of Objects [{...}, {...}]
Matrix(options)
Constructor
Kind: global function
Returns: this
Param | Type | Description |
---|---|---|
options | Object | Options of constructor |
Example (Example usage of 'constructor' (with parameters))
let options = {
  'boost': 20
};
let matrix = new Matrix(options);
// returns an instance of Matrix with custom options
Example (Example usage of 'constructor' (with default values))
let matrix = new Matrix();
// returns an instance of Matrix with default options
matrix.init(indexations, selectors)
Initialize each value of this object
Kind: instance method of Matrix
Returns: object - The 'this' reference
Param | Type | Description |
---|---|---|
indexations | Array | Array filled with the indexation of each segment |
selectors | Array | Array filled with each segment's name |
matrix.fill(criterion)
Fill a matrix with the values of the chosen criterion
Kind: instance method of Matrix
Returns: matrix - A mathjs matrix filled with values (and )
Param | Type | Description |
---|---|---|
criterion | string | Key of term object |
matrix.stats(m)
Compute some statistics
Kind: instance method of Matrix
Returns: object - An object with some statistics:
- FR (labeling recall),
- FP (labeling precision),
- FF (labeling F-measure),
- rowsFF (number of terms for each row of the FF matrix),
- colsFF (number of terms for each column of the FF matrix),
- mFF (mean of FF)
Param | Type | Description |
---|---|---|
m | Matrix | Matrix (mathjs.matrix()) |
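The list above names the statistics but not their formulas. As a minimal sketch, assuming the usual feature-labeling definitions (recall = weight of a term in a segment divided by its weight over all segments, precision = the same weight divided by the total weight of the segment, F-measure = their harmonic mean), the three values could be computed as follows. The row/column orientation, the function name `featureStats` and the sample weights are all assumptions for illustration; this is not the library's implementation.

```javascript
// Assumed orientation: rows are terms, columns are segments.
const weights = [
  [2, 0],
  [1, 1],
];

// Hypothetical helper, independent of tdm-skeeft and mathjs.
function featureStats(matrix) {
  const rowSums = matrix.map((row) => row.reduce((sum, v) => sum + v, 0));
  const colSums = matrix[0].map((_, j) => matrix.reduce((sum, row) => sum + row[j], 0));
  return matrix.map((row, i) => row.map((v, j) => {
    const recall = rowSums[i] ? v / rowSums[i] : 0;
    const precision = colSums[j] ? v / colSums[j] : 0;
    // Harmonic mean of recall and precision, 0 when both are 0.
    const f = (recall + precision) ? (2 * recall * precision) / (recall + precision) : 0;
    return { recall, precision, f };
  }));
}

const stats = featureStats(weights);
```

With these sample weights, the first term has recall 1 and precision 2/3 in the first segment, which gives an F-measure of 0.8.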
matrix.select(stats, boost, criterion)
Select terms
Kind: instance method of Matrix
Returns: object - An object with the selected terms
Param | Type | Description |
---|---|---|
stats | Object | Statistics of Text (result of Matrix.stats()) |
boost | Object | List of boosted terms (Object with key = term) |
criterion | String | Criterion used by skeeft (frequency or specificity) |
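The parameter descriptions suggest how the instance methods fit together: `fill` returns a mathjs matrix, `stats` takes that matrix, and `select` takes the result of `stats`. The sketch below chains them in that order. The helper name `selectTerms` is hypothetical, and the shapes of `indexations`, `selectors` and `boost` are whatever the library expects; nothing here is confirmed beyond the signatures documented above.

```javascript
// Hypothetical helper: chains the documented Matrix methods in the order
// implied by their parameter descriptions. `matrix` is expected to be an
// instance of Skeeft.Matrix (or any object exposing the same four methods).
function selectTerms(matrix, indexations, selectors, boost, criterion) {
  matrix.init(indexations, selectors);      // documented to return 'this'
  const filled = matrix.fill(criterion);    // documented to return a mathjs matrix
  const stats = matrix.stats(filled);       // statistics object (FR, FP, FF, ...)
  return matrix.select(stats, boost, criterion);
}
```

Assuming `indexations`, `selectors` and `boost` are already prepared, a call might look like `selectTerms(new Skeeft.Matrix(), indexations, selectors, boost, 'frequency')`.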
Matrix.sort(terms, compare)
Sort all terms with the 'compare' function
Kind: static method of Matrix
Returns: array - The array of sorted terms
Param | Type | Description |
---|---|---|
terms | Object | List of terms |
compare | function | Compare function |
Matrix.compare(a, b)
Compare two elements according to their factor
Kind: static method of Matrix
Returns: integer - 1, -1 or 0
Param | Type | Description |
---|---|---|
a | Object | First object |
b | Object | Second object |
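To illustrate how a compare function and a sort over an object of terms work together, here is a minimal, library-independent sketch. It assumes that each term object carries a numeric `factor` property and that higher factors come first; neither the property name nor the ordering direction is confirmed by the documentation above, and `compareByFactor` and `sortTerms` are hypothetical names.

```javascript
// Hypothetical term objects keyed by term; only 'factor' is used for ordering.
const terms = {
  gene: { term: 'gene', factor: 0.4 },
  protein: { term: 'protein', factor: 0.9 },
  cell: { term: 'cell', factor: 0.1 },
};

// Returns -1, 1 or 0. Descending order is an assumption.
function compareByFactor(a, b) {
  if (a.factor > b.factor) return -1;
  if (a.factor < b.factor) return 1;
  return 0;
}

// Same shape as the documented Matrix.sort: an object of terms in, a sorted array out.
function sortTerms(termsObject, compare) {
  return Object.values(termsObject).sort(compare);
}

const sorted = sortTerms(terms, compareByFactor);
```

Passing the compare function as a parameter, as `Matrix.sort(terms, compare)` does, lets the caller change the ordering without touching the sort itself.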