1.0.2 • Published 5 years ago

macho-ml v1.0.2

License
MIT

MachoML

I'm done with this for now. I've learned all that I set out to learn about implementing certain essential algorithms by hand, and now I'm happy to use a proper library (as you should).

MachoML is a machine learning and statistics library. I created it both to be one of the simplest, lowest-overhead options available and to teach myself the fundamentals of these algorithms.

Macho uses plotly for plotting and exposes CSV functionality through csv.js.

Info

About the errors

All of the models will throw UntrainedErrors if you try to call certain methods on an untrained model (these errors give tips about how to fix them). If proper plotly credentials aren't provided through either ~/.plotly/.credentials or the graph method's argument, a PlotlyUnidentifiedError will be thrown -- see below for more detail on connecting with plotly.
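The trained/untrained guard described above can be sketched as follows. This is a minimal illustration, not MachoML's source: the TinyModel class, its mean-based "training", and the error message are hypothetical stand-ins, and only the UntrainedError name comes from this README.

```javascript
// Hypothetical sketch: these classes are illustrative, not MachoML's source.
class UntrainedError extends Error {
    constructor(method) {
        // The message doubles as the "tip" on how to fix the problem.
        super(`Cannot call ${method}() on an untrained model; call train() first.`);
        this.name = "UntrainedError";
    }
}

class TinyModel {
    constructor() {
        this.points = [];
        this.trained = false;
    }

    add(x, y) {
        this.points.push({ x, y });
        this.trained = false; // new data invalidates any earlier fit
        return this;
    }

    train() {
        // Stand-in for real fitting: remember the mean of the y-values.
        const total = this.points.reduce((sum, p) => sum + p.y, 0);
        this.mean = total / this.points.length;
        this.trained = true;
        return this;
    }

    predict() {
        if (!this.trained) throw new UntrainedError("predict");
        return this.mean;
    }
}

const model = new TinyModel().add(1, 2).add(3, 4);
let caught = null;
try {
    model.predict(); // not trained yet, so this throws
} catch (err) {
    caught = err;
}
console.log(caught.name);             // "UntrainedError"
console.log(model.train().predict()); // 3
```

Resetting the flag inside add is what makes a model untrained again after new data arrives, so a stale fit can never be used silently.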

About the graphing

For the models in MachoML, you can call the graph method to get the URL of a plotly graph of the data. MachoML first tries to read ~/.plotly/.credentials, the standard location for plotly credentials; if that file is unavailable, you must supply the "username" and "apiKey" properties in the parameter object with your plotly username and API key. By default, the method prints the URL of the graph; set the "showUrl" property of the object to false to suppress that. The method also accepts xLabel, yLabel, title, xRange, yRange, mode1, and mode2 (also given as properties of the parameter object) to format the graph. The labels and title must be strings, each range must be a two-element array [min, max], and mode1 and mode2 must be valid plotly graph modes. By default, the predicted line is drawn with "lines+markers" and the actual points with "markers".

Some examples:

model.graph() // defaults
model.graph({
    xLabel: "Reading scores",
    yLabel: "Writing scores",
    title: "Reading vs. Writing Scores of U.S. High School Students"
}); // set axis labels and graph title and leave other defaults
model.graph({
    yRange: [0, 100],
    yLabel: "Percentages maybe"
}); // set the range of graphed y-values and y-axis label, leave other defaults
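
The defaults and type rules described above can be sketched as a small option resolver. This is only an assumption about how such merging might work, not MachoML's code: resolveGraphOptions and GRAPH_DEFAULTS are hypothetical names, and the check does not validate plotly modes or credentials.

```javascript
// Hypothetical helper: illustrates the documented defaults, not MachoML's code.
const GRAPH_DEFAULTS = {
    showUrl: true,
    mode1: "lines+markers", // predicted line
    mode2: "markers"        // actual points
};

function resolveGraphOptions(options = {}) {
    // Caller-supplied properties override the defaults.
    const resolved = { ...GRAPH_DEFAULTS, ...options };

    // Labels and title must be strings when given.
    for (const key of ["xLabel", "yLabel", "title"]) {
        if (resolved[key] !== undefined && typeof resolved[key] !== "string") {
            throw new TypeError(`${key} must be a string`);
        }
    }

    // Ranges must be two-element numeric arrays: [min, max].
    for (const key of ["xRange", "yRange"]) {
        if (resolved[key] !== undefined) {
            const range = resolved[key];
            const valid = Array.isArray(range) && range.length === 2 &&
                range.every((n) => typeof n === "number");
            if (!valid) throw new TypeError(`${key} must be a [min, max] array`);
        }
    }
    return resolved;
}

const opts = resolveGraphOptions({ yRange: [0, 100], showUrl: false });
console.log(opts.mode1);   // "lines+markers"
console.log(opts.showUrl); // false
```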

About the dependencies

Plotly is a well-known and richly-featured graphing library that is perfect for quickly making nice-looking graphs. Csv.js, provided directly as CSV through this package, is a thoroughly tested, very idiomatic, and very simple library for parsing and creating CSV tables.

Linear Regression

The linear regression in MachoML uses least squares fitting on two arrays. The correlation is r², and the standard error is calculated with the mean of least squares method. The findCorrelation, findStandardError, and predict methods each accept one additional argument: the number of decimal places to round to. The train method accepts two additional arguments, giving the rounding places for the correlation and the standard error, respectively. The default value for all of these arguments is 3. Rounding is done the way MDN suggests, to eliminate the risk of floating-point weirdness.
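
The calculation described above can be sketched in plain JavaScript. This is not MachoML's implementation: fitLine and roundTo are hypothetical names, the exponent-shifting rounding is one common recipe and only an assumption about which MDN suggestion is meant, and dividing the squared residuals by n − 2 is likewise an assumption, chosen because it reproduces the 0.894 shown in the example in the next section.

```javascript
// Round to `places` decimals by shifting the decimal point with exponent notation.
// Values whose string form already contains an exponent (e.g. 1e-7) are not handled.
function roundTo(value, places = 3) {
    const shifted = Math.round(Number(`${value}e${places}`));
    return Number(`${shifted}e-${places}`);
}

// Ordinary least squares fit of y = slope * x + intercept.
function fitLine(xs, ys) {
    const n = xs.length;
    const meanX = xs.reduce((a, b) => a + b, 0) / n;
    const meanY = ys.reduce((a, b) => a + b, 0) / n;

    let sxx = 0, syy = 0, sxy = 0;
    for (let i = 0; i < n; i++) {
        const dx = xs[i] - meanX;
        const dy = ys[i] - meanY;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }

    const slope = sxy / sxx;
    const intercept = meanY - slope * meanX;
    const rSquared = (sxy * sxy) / (sxx * syy);

    // Sum of squared residuals, averaged over n - 2 degrees of freedom (assumed).
    let sse = 0;
    for (let i = 0; i < n; i++) {
        const residual = ys[i] - (slope * xs[i] + intercept);
        sse += residual * residual;
    }
    const standardError = Math.sqrt(sse / (n - 2));

    return { slope, intercept, rSquared, standardError };
}

const fit = fitLine([1, 2, 4, 5, 3], [2, 4, 4, 5, 5]);
console.log(roundTo(fit.rSquared));                  // 0.6
console.log(roundTo(fit.standardError));             // 0.894
console.log(roundTo(fit.slope * 8 + fit.intercept)); // 7
```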

What you can do

const { Linear } = require("macho-ml");

var xValues = [1, 2, 4, 5];
var yValues = [2, 4, 4, 5];

var model = new Linear(xValues, yValues);

model.add(3, 5); // x, y -- watch out! this makes the model untrained again

model.point(4); //=> { x: 3, y: 5 }

model.train(false); // the argument is whether to also call the findCorrelation and findStandardError methods after training, usually keep this as default (true)

// all methods that don't return anything are chainable:
model.findCorrelation().findStandardError();

model.correlation; //=> 0.6
model.standardError; //=> 0.894
model.predict(8); //=> 7
model.graph(); // gives quite an unexciting graph -- run examples/reading-vs-writing.js to see a well-correlated and slightly more interesting result

model.line //=> { x: [ 1, 2, 4, 5, 3 ], y: [ 2, 4, 4, 5, 5 ] }
model.predictedLine //=> { x: [ 1, 2, 4, 5, 3 ], y: [ 2.8, 3.4, 4.6, 5.2, 4 ] }

// also, you shouldn't really be reaching into macho's internals too much, but to get the slope and y-intercept of the line, use
model.__slope
model.__yIntercept

Usage

const { Linear } = require("macho-ml");

var model = new Linear([1, 2, 3, 4, 5], [1, 3, 5, 7, 9]);
model.train().predict(4028); //=> 8055