
AdaBoost.MH

Implementation of an AdaBoost variant for multi-class/multi-label classification tasks based on the Hamming loss function.
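A standard formulation of this loss from the AdaBoost.MH literature [3] is

    L_H(f) = \frac{1}{nK} \sum_{i=1}^{n} \sum_{\ell=1}^{K} \mathbb{1}\{\operatorname{sign}(f_\ell(x_i)) \neq y_{i,\ell}\}

where n is the number of instances, K the number of labels, and y_{i,\ell} ∈ {-1, +1} marks whether label ℓ is assigned to instance i. This ±1 encoding matches the label matrix used below (filled with -1.0 and set to 1.0 for assigned labels); the notation is quoted for reference only and is not part of the package API.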

Installation

npm i --save @taumechanica/abmh

Usage

Prepare a dataset:

import { Dataset } from '@taumechanica/abmh';

const rows = /* specify the number of rows */;
const cols = /* specify the number of columns */;
const lbls = /* specify the number of classes/labels */;

const inival = 0.0; // used to fill value matrix
const inilbl = -1.0; // used to fill label matrix

const data = new Dataset(rows, cols, lbls, inival, inilbl);
for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
        // src_of_values is the way you choose to
        // get the dataset values by row and column
        data.putValue(row, col, src_of_values(row, col));
    }

    // src_of_labels is the way you choose to
    // get the indices of labels by row
    for (const lblidx of src_of_labels(row)) {
        data.putLabel(row, lblidx, 1.0);
    }
}
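For instance, a tiny hypothetical dataset with four rows, two numeric columns, and two classes (XOR-style toy data, purely illustrative) could be filled like this:

import { Dataset } from '@taumechanica/abmh';

// hypothetical toy values and the single positive class index per row
const toyValues = [[0, 0], [0, 1], [1, 0], [1, 1]];
const toyLabels = [0, 1, 1, 0];

const toy = new Dataset(4, 2, 2, 0.0, -1.0);
for (let row = 0; row < 4; row++) {
    toy.putValue(row, 0, toyValues[row][0]);
    toy.putValue(row, 1, toyValues[row][1]);
    toy.putLabel(row, toyLabels[row], 1.0); // all other labels stay at -1.0
}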

Instantiate a feature extractor:

import { PlainExtractor } from '@taumechanica/abmh';

// specify the volumes of your dataset features
// - for nominals the volume is the number of categories
// - for numerics the volume is 1
const volumes = new Uint32Array(cols).fill(1);

// plain feature extractor retrieves values as they are
const ext = new PlainExtractor(data, volumes);
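If, hypothetically, a dataset mixed nominal and numeric features, the volumes array would reflect that column by column:

// e.g. two nominal features with 4 and 3 categories followed by three numeric ones
const mixedVolumes = new Uint32Array([4, 3, 1, 1, 1]);
const mixedExt = new PlainExtractor(data, mixedVolumes);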

Build a datagrid on top of a dataset using your feature extractor:

import { Datagrid } from '@taumechanica/abmh';

// it will extract feature values and sort
// the data by every numeric feature
const grid = new Datagrid(data, ext);

Instantiate a feature selector:

import { ExhaustSelector } from '@taumechanica/abmh';

// exhaustive selector goes through the entire feature
// space each time a new weak classifier is needed
const sel = new ExhaustSelector(grid);

Train a strong classifier and track progress:

import { AdaBoostMH } from '@taumechanica/abmh';

const iterations = /* specify the number of iterations */;
const treeNodes = /* specify the maximum number of tree nodes */;

// specify tracking function
const track = (iteration: number, scores: Float64Array) => {
    // use classwise scores to calculate accuracy, AUC, LogLoss or whatever you need
};

// create an ensemble
const model = new AdaBoostMH(grid, sel, iterations, treeNodes, track);
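As an illustration, a tracking function could estimate the training Hamming error from the scores. The row-major scores layout and the truth array of ±1 targets below are assumptions made for this sketch, not documented package behavior:

// truth is a hypothetical Float64Array of length rows * lbls holding
// the same -1.0/+1.0 labels that were put into the dataset above
const trackHamming = (iteration: number, scores: Float64Array) => {
    let errors = 0;
    for (let row = 0; row < rows; row++) {
        for (let lbl = 0; lbl < lbls; lbl++) {
            const at = row * lbls + lbl; // assumed row-major indexing
            const predicted = Math.sign(scores[at]) || 1.0; // break ties positively
            if (predicted !== truth[at]) errors++;
        }
    }
    console.log(`iteration ${iteration}: Hamming error ${errors / (rows * lbls)}`);
};

Pass trackHamming in place of track above to log the error at every boosting iteration.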

Also check out the complete example with the MNIST dataset: AdaBoost.MH vs. MNIST.

Development

Cloning

git clone https://github.com/taumechanica/abmh && cd abmh && npm i

Extending

Create your own feature extractors or selectors that better suit your requirements:

import { Datagrid, FeatureExtractor, FeatureSelector, HammingClassifier } from '@taumechanica/abmh';

class PerfectExtractor extends FeatureExtractor {
    public constructor() {
        const count = /* specify the number of features you will extract */;
        const volumes = /* specify the feature volumes */;

        super(count, volumes);
    }

    public encode(
        values: Float64Array, voffset: number,
        result: Float64Array, roffset: number
    ): void {
        // map the source vector to result vector using your extraction strategy
        // - voffset points to the start of the source vector within the values matrix
        // - roffset points to the start of the result vector within the result matrix
    }
}

class GreatSelector extends FeatureSelector {
    public constructor(data: Datagrid) {
        super(data);
    }

    public fit(
        weights: Float64Array,
        subsets: Uint8Array,
        soffset: number
    ): HammingClassifier {
        // select a feature and create weak classifier
        // with respect to weights and subsets matrices
    }
}
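For example, a minimal concrete extractor (hypothetical, shown only to make the encode contract tangible) could pass numeric values through while scaling them by a constant:

import { FeatureExtractor } from '@taumechanica/abmh';

// a hypothetical extractor producing one numeric feature per source column
class ScalingExtractor extends FeatureExtractor {
    public constructor(
        private readonly cols: number,
        private readonly factor: number
    ) {
        // one extracted feature per column, all numeric (volume 1)
        super(cols, new Uint32Array(cols).fill(1));
    }

    public encode(
        values: Float64Array, voffset: number,
        result: Float64Array, roffset: number
    ): void {
        for (let col = 0; col < this.cols; col++) {
            result[roffset + col] = this.factor * values[voffset + col];
        }
    }
}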

Building

npm run build

Inspiration

Build your own machine learning library from scratch in TypeScript

References

  1. B. Kégl and R. Busa-Fekete. Boosting products of base classifiers. In Proceedings of the 26th International Conference on Machine Learning (ICML 2009), pages 497-504, 2009.
  2. R. Busa-Fekete and B. Kégl. Fast boosting using adversarial bandits. In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pages 143-150, 2010.
  3. B. Kégl. The return of AdaBoost.MH: multi-class Hamming trees. arXiv preprint arXiv:1312.6086, 2013.