@taumechanica/abmh v1.0.5
AdaBoost.MH
Implementation of an AdaBoost variant for multi-class/multi-label classification tasks based on the Hamming loss function. It:
- builds an ensemble of Hamming trees, introduced in B. Kégl, 2013;
- handles nominal features with the subset indicator described in B. Kégl and R. Busa-Fekete, 2009;
- includes the fast feature selection strategy proposed in R. Busa-Fekete and B. Kégl, 2010;
- is low-level enough to be easily extended.
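Under the hood, AdaBoost.MH maintains a weight for every (sample, label) pair and multiplicatively boosts the weights of the pairs the current weak classifier gets wrong. A schematic sketch of that update (illustrative only, not the library's internal code; the per-pair margins are assumed to be precomputed):

```typescript
// Schematic AdaBoost.MH weight update: each (sample, label) weight is
// scaled by exp(-margin) and the matrix is renormalized, so pairs with
// a negative margin (misclassified) gain weight. Both arrays are flat
// rows x labels matrices in row-major order.
const updateWeights = (
    weights: Float64Array, // current (sample, label) weights
    margins: Float64Array  // alpha * h_l(x_i) * y_{i,l}, same layout
): void => {
    let z = 0.0;
    for (let i = 0; i < weights.length; i++) {
        weights[i] *= Math.exp(-margins[i]);
        z += weights[i];
    }
    // renormalize so the weights form a distribution again
    for (let i = 0; i < weights.length; i++) weights[i] /= z;
};
```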
Installation
```shell
npm i --save @taumechanica/abmh
```
Usage
Prepare a dataset:
```typescript
import { Dataset } from '@taumechanica/abmh';

const rows = /* specify the number of rows */;
const cols = /* specify the number of columns */;
const lbls = /* specify the number of classes/labels */;

const inival = 0.0; // used to fill the value matrix
const inilbl = -1.0; // used to fill the label matrix

const data = new Dataset(rows, cols, lbls, inival, inilbl);

for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
        // src_of_values is the way you choose to
        // get the dataset values by row and column
        data.putValue(row, col, src_of_values(row, col));
    }
    // src_of_labels is the way you choose to
    // get the indices of labels by row
    for (const lblidx of src_of_labels(row)) {
        data.putLabel(row, lblidx, 1.0);
    }
}
```
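For a quick start, `src_of_values` and `src_of_labels` can be plain accessors over in-memory arrays. A toy sketch (the arrays below are made-up sample data, not part of the library):

```typescript
// Toy 4-sample, 2-feature dataset with 3 classes: feature values in
// nested arrays, labels as lists of label indices per sample. These
// accessors plug into the fill loop above.
const values = [
    [0.1, 2.0],
    [0.4, 1.5],
    [0.9, 0.3],
    [0.7, 0.8]
];
const labels = [[0], [0, 2], [1], [2]];

const src_of_values = (row: number, col: number): number => values[row][col];
const src_of_labels = (row: number): number[] => labels[row];
```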
Instantiate a feature extractor:
```typescript
import { PlainExtractor } from '@taumechanica/abmh';

// specify the volumes of your dataset features:
// - for nominals the volume is the number of categories
// - for numerics the volume is 1
const volumes = new Uint32Array(Array(cols).fill(1));

// the plain feature extractor retrieves values as they are
const ext = new PlainExtractor(data, volumes);
```
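When the dataset mixes nominal and numeric features, the volumes array mixes category counts with ones. An illustrative sketch (the `Feature` type below is hypothetical, not part of the library):

```typescript
// Hypothetical per-feature description: feature 1 is nominal with
// 4 categories, the others are numeric and get volume 1.
type Feature = { nominal: boolean; categories?: number };

const features: Feature[] = [
    { nominal: false },
    { nominal: true, categories: 4 },
    { nominal: false }
];

const volumes = new Uint32Array(
    features.map(f => f.nominal ? (f.categories ?? 1) : 1)
);
```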
Build a datagrid on top of a dataset using your feature extractor:
```typescript
import { Datagrid } from '@taumechanica/abmh';

// extracts the feature values and sorts
// the data by every numeric feature
const grid = new Datagrid(data, ext);
```
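Conceptually, the per-feature sorting that the datagrid precomputes amounts to an index permutation over the samples for each numeric feature. A standalone sketch of that idea (not the library's internal code):

```typescript
// For one numeric feature column, compute the permutation of sample
// indices that sorts the samples by that feature's value; weak
// learners can then scan split thresholds in a single pass.
const argsortByFeature = (column: Float64Array): Uint32Array => {
    const order = new Uint32Array(column.length);
    for (let i = 0; i < order.length; i++) order[i] = i;
    return order.sort((a, b) => column[a] - column[b]);
};
```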
Instantiate a feature selector:
```typescript
import { ExhaustSelector } from '@taumechanica/abmh';

// the exhaustive selector goes through the entire feature
// space each time a new weak classifier is needed
const sel = new ExhaustSelector(grid);
```
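The exhaustive strategy boils down to scoring every feature and keeping the best one. A simplified standalone sketch (here `score` stands in for the weighted-edge computation the library actually performs):

```typescript
// Exhaustive selection in miniature: evaluate a scoring criterion on
// every feature index and return the argmax.
const selectBest = (
    featureCount: number,
    score: (feature: number) => number
): number => {
    let best = 0;
    let bestScore = score(0);
    for (let f = 1; f < featureCount; f++) {
        const s = score(f);
        if (s > bestScore) {
            best = f;
            bestScore = s;
        }
    }
    return best;
};
```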
Train a strong classifier and track progress:
```typescript
import { AdaBoostMH } from '@taumechanica/abmh';

const iterations = /* specify the number of iterations */;
const treeNodes = /* specify the maximum number of tree nodes */;

// specify a tracking function
const track = (iteration: number, scores: Float64Array) => {
    // use the classwise scores to calculate accuracy, AUC, LogLoss or whatever you need
};

// create an ensemble
const model = new AdaBoostMH(grid, sel, iterations, treeNodes, track);
```
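As a concrete example of a tracking function, the callback below logs the mean of the classwise scores; the exact semantics of `scores` depend on the library internals, so treat this as a simple progress heuristic rather than a definitive metric:

```typescript
// Mean of the classwise scores reported after each boosting round;
// assumes one aggregate score per class/label.
const meanScore = (scores: Float64Array): number => {
    let sum = 0.0;
    for (const s of scores) sum += s;
    return scores.length > 0 ? sum / scores.length : 0.0;
};

const trackMean = (iteration: number, scores: Float64Array): void => {
    console.log(`iteration ${iteration}: mean score ${meanScore(scores).toFixed(4)}`);
};
```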
Also check the complete example with the MNIST dataset: AdaBoost.MH vs. MNIST.
Development
Cloning
```shell
git clone https://github.com/taumechanica/abmh && cd abmh && npm i
```
Extending
Create your own feature extractors or selectors that better suit your requirements:
```typescript
import { Datagrid, FeatureExtractor, FeatureSelector, HammingClassifier } from '@taumechanica/abmh';

class PerfectExtractor extends FeatureExtractor {
    public constructor() {
        const count = /* specify the number of features you will extract */;
        const volumes = /* specify the feature volumes */;
        super(count, volumes);
    }

    public encode(
        values: Float64Array, voffset: number,
        result: Float64Array, roffset: number
    ): void {
        // map the source vector to the result vector using your extraction strategy:
        // - voffset points to the start of the source vector within the values matrix
        // - roffset points to the start of the result vector within the result matrix
    }
}

class GreatSelector extends FeatureSelector {
    public constructor(data: Datagrid) {
        super(data);
    }

    public fit(
        weights: Float64Array,
        subsets: Uint8Array,
        soffset: number
    ): HammingClassifier {
        // select a feature and create a weak classifier
        // with respect to the weights and subsets matrices
    }
}
```
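To make the `encode` contract concrete, here is a hypothetical body that copies source values into the result matrix while scaling them; a standalone sketch of the offset convention described above, not tied to the base class:

```typescript
// Hypothetical encode body for a scaling extractor: copy `count`
// source values into the result matrix, multiplying each by `scale`.
// voffset and roffset point to the start of the current sample's
// vector within the flat values and result matrices, respectively.
const encodeScaled = (
    values: Float64Array, voffset: number,
    result: Float64Array, roffset: number,
    count: number, scale: number
): void => {
    for (let i = 0; i < count; i++) {
        result[roffset + i] = values[voffset + i] * scale;
    }
};
```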
Building
```shell
npm run build
```
Inspiration
Build your own machine learning library from scratch in TypeScript
References
- B. Kégl, R. Busa-Fekete. Boosting products of base classifiers. Proceedings of the 26th Annual International Conference on Machine Learning, pages 497–504, 2009.
- R. Busa-Fekete, B. Kégl. Fast boosting using adversarial bandits. 27th International Conference on Machine Learning (ICML 2010), pages 143–150, 2010.
- B. Kégl. The return of AdaBoost.MH: multi-class Hamming trees. arXiv preprint arXiv:1312.6086, 2013.