0.0.1 • Published 5 years ago

tdm-skeeft v0.0.1

License
OpenBSD

tdm-skeeft

tdm-skeeft is a tdm module for term extraction from structured text. It can be used to get the keywords (or a summary) of a document.

Installation

Using npm :

$ npm i -g tdm-skeeft
$ npm i --save tdm-skeeft

Using Node :

/* Require the Skeeft module */
const Skeeft = require('tdm-skeeft');

/* Build a new instance of Matrix */
let matrix = new Skeeft.Matrix();

/* Build a new instance of Indexator */
let indexator = new Skeeft.Indexator();

Launch tests

$ npm run test

Build documentation

$ npm run docs

API Documentation

Classes

  • Indexator

Functions

  • Matrix(options)

Indexator

Kind: global class

new Indexator(options)

Returns: Indexator - An instance of Indexator

Param                     Type    Description
options                   Object  Options of constructor
options.filters           Object  Filter options given to the title & fulltext extractors
options.filters.title     Filter  Options given to the title extractor
options.filters.fulltext  Filter  Options given to the fulltext extractor
options.stopwords         Object  Stopwords
options.dictionary        Object  Dictionary

Example (usage of the constructor with parameters)

let options = {
  'filters': {
    'title': customTitleFilter, // assuming customTitleFilter contains your custom settings
    'fulltext': customFulltextFilter // assuming customFulltextFilter contains your custom settings
  },
  'dictionary': myDictionary, // assuming myDictionary contains your custom settings
  'stopwords': myStopwords // assuming myStopwords contains your custom settings
};
let indexator = new Indexator(options);
// returns an instance of Indexator with custom options

Example (usage of the constructor with default values)

let indexator = new Indexator();
// returns an instance of Indexator with default options

indexator.summarize(xmlString, selectors, indexation, delimiter) ⇒ Array

Summarize a fulltext

Kind: instance method of Indexator
Returns: Array - List of extracted sentences (representative summary)

Param            Type    Description
xmlString        String  Fulltext (XML formatted string)
selectors        Object  Used selectors
selectors.title  String  Used selectors
selectors.title  Object  Used selectors
indexation       Object  Indexation of xmlString
delimiter        RegExp  Delimiter used to split text into sentences

Example (usage of the 'summarize' function)

let indexator = new Indexator();
let selectors = {'title': 'title', 'segments': ['paragraph1', 'paragraph2']};
indexator.summarize(xmlString, selectors, indexator.index(xmlString));
// returns an ordered Array of Objects [{...}, {...}]
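
The delimiter parameter is a RegExp used to split text into sentences. As a rough illustration of what such a delimiter does, the sketch below uses the standard String.prototype.split. It is not the module's internal code: the helper name splitSentences and the delimiter pattern are assumptions made for this example.

```javascript
// Hypothetical helper, not part of tdm-skeeft: splits text into sentences
// with a RegExp delimiter, similar in spirit to the 'delimiter' parameter.
function splitSentences(text, delimiter) {
  return text
    .split(delimiter)
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0);
}

// Assumed delimiter: whitespace that follows '.', '!' or '?'.
const delimiter = /(?<=[.!?])\s+/;
const sentences = splitSentences('First sentence. Second one! A third?', delimiter);
console.log(sentences);
```

A delimiter of this kind keeps the closing punctuation attached to each sentence, because the lookbehind matches only the whitespace.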

indexator.index(xmlString, selectors, criterion) ⇒ Array

Index a fulltext

Kind: instance method of Indexator
Returns: Array - List of extracted keywords

Param      Type    Description
xmlString  String  Fulltext (XML formatted string)
selectors  Object  Used selectors
criterion  String  Criterion used (sort)

Example (usage of the 'index' function)

let indexator = new Indexator();
indexator.index(xmlString, {'title': 'title', 'segments': ['paragraph1', 'paragraph2']});
// returns an ordered Array of Objects [{...}, {...}]

Matrix(options)

Constructor

Kind: global function
Returns: this

Param    Type    Description
options  Object  Options of constructor

Example (usage of the constructor with parameters)

let options = {
  'boost': 20
};
let matrix = new Matrix(options);
// returns an instance of Matrix with custom options

Example (usage of the constructor with default values)

let matrix = new Matrix();
// returns an instance of Matrix with default options

matrix.init(indexations, selectors)

Initialize each value of this object

Kind: instance method of Matrix
Returns: object - The 'this' reference

Param        Type   Description
indexations  Array  Array filled with the indexation of each segment
selectors    Array  Array filled with the name of each segment

matrix.fill(criterion)

Fill a matrix with the values of the chosen criterion

Kind: instance method of Matrix
Returns: matrix - A mathjs matrix filled with values (and )

Param      Type    Description
criterion  string  Key of term object

matrix.stats(m)

Compute some statistics

Kind: instance method of Matrix
Returns: object - An object with some statistics:

  • FR (labeling recall),
  • FP (labeling precision),
  • FF (labeling F-measure),
  • rowsFF (number of terms for each row of the FF matrix),
  • colsFF (number of terms for each column of the FF matrix),
  • mFF (mean of FF)

Param  Type    Description
m      Matrix  Matrix (mathjs.matrix())
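
The FR, FP and FF statistics refer to recall, precision and F-measure. For readers unfamiliar with these terms, here is a minimal sketch of the standard balanced F-measure, the harmonic mean of precision and recall. Whether the module uses exactly this formula is an assumption, and the function name fMeasure is hypothetical.

```javascript
// Standard balanced F-measure: harmonic mean of precision and recall.
// Assumption: shown for illustration only, not taken from the tdm-skeeft source.
function fMeasure(precision, recall) {
  // Avoid dividing by zero when both values are 0.
  if (precision + recall === 0) {
    return 0;
  }
  return (2 * precision * recall) / (precision + recall);
}

const ff = fMeasure(0.5, 1);
console.log(ff);
```

With a precision of 0.5 and a recall of 1, the result is 2/3: the harmonic mean is pulled toward the lower of the two values.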

matrix.select(stats, boost, criterion)

Select terms

Kind: instance method of Matrix
Returns: object - An object with the selected terms

Param      Type    Description
stats      Object  Statistics of the text (result of Matrix.stats())
boost      Object  List of boosted terms (Object with key = term)
criterion  String  Criterion used by skeeft (frequency or specificity)
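
The signatures of init, fill, stats and select suggest that they are meant to be chained, since stats takes a matrix and select takes the result of stats. The sketch below shows one possible ordering under that assumption. The wrapper selectTerms is hypothetical, and the Matrix class is passed in as a parameter so the snippet makes no claim about how the module is loaded.

```javascript
// Hypothetical wrapper, not part of tdm-skeeft: chains the documented
// Matrix methods in the order their parameters suggest.
function selectTerms(Matrix, indexations, segmentNames, criterion, boost) {
  const matrix = new Matrix();
  matrix.init(indexations, segmentNames); // documented: init(indexations, selectors)
  const filled = matrix.fill(criterion); // documented: returns a mathjs matrix
  const stats = matrix.stats(filled); // documented: stats(m)
  return matrix.select(stats, boost, criterion); // documented: select(stats, boost, criterion)
}
```

A call could look like selectTerms(Skeeft.Matrix, indexations, ['paragraph1', 'paragraph2'], 'frequency', {}), where indexations holds one indexation per segment. The argument values here are illustrative only.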

Matrix.sort(terms, compare)

Sort all terms with the 'compare' function

Kind: static method of Matrix
Returns: array - The array of sorted terms

Param    Type      Description
terms    Object    List of terms
compare  function  Compare function

Matrix.compare(a, b)

Compare two elements according to their factor

Kind: static method of Matrix
Returns: integer - 1, -1 or 0

Param  Type    Description
a      Object  First object
b      Object  Second object
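
To illustrate how a compare function that returns 1, -1 or 0 drives a sort, here is a minimal standalone sketch using the standard Array.prototype.sort. It is not the module's implementation: the function name compareByFactor, the numeric 'factor' property and the descending order are all assumptions.

```javascript
// Hypothetical comparator in the spirit of Matrix.compare.
// Assumptions: each term has a numeric 'factor' and higher factors come first.
function compareByFactor(a, b) {
  if (a.factor > b.factor) return -1;
  if (a.factor < b.factor) return 1;
  return 0;
}

// Made-up terms, for illustration only.
const terms = [
  { term: 'alpha', factor: 0.2 },
  { term: 'beta', factor: 0.9 },
  { term: 'gamma', factor: 0.5 }
];

// Copy before sorting so the original array is left untouched.
const sorted = terms.slice().sort(compareByFactor);
console.log(sorted.map((t) => t.term));
```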