
TNT


Least-squares solver for large, dense matrices. It is based on the TNT paper by J. M. Myre et al.

Supports multiple right-hand sides.

  • Speed. TNT is at its best when:

    • $\large\frac{\mathrm{rows}}{\mathrm{cols}} \geq 1$, and
    • the data has $\geq 10$ columns; but it is worth trying in any case.
  • Accuracy: it is frequently as accurate as QR or the pseudo-inverse, but it can have a larger (normally still acceptable) error on ill-conditioned matrices.

For speed, see the comparison with the pseudo-inverse below.

For calculations with a non-zero intercept, remember to push a $1$ onto each row of the data; the intercept coefficient will then be the last item in XBest, as in the sketch below.
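
For example (a minimal sketch: the data values are invented, and XBest is assumed to be exposed as an array-like of coefficients, as the usage example further below suggests):

import { TNT } from 'fit-tnt';

// two predictor columns plus a constant 1 per row for the intercept
const A = [
  [1.1, 2.0, 1],
  [0.9, 4.1, 1],
  [1.8, 6.2, 1],
  [2.2, 7.9, 1],
];
const b = [10.1, 14.2, 21.0, 24.9];

const { XBest } = new TNT(A, b);
const intercept = XBest[XBest.length - 1]; // the intercept is the last coefficient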

A more thorough webpage to compare speed/accuracy will hopefully be included soon.

Install and Use

npm i fit-tnt

import { TNT } from 'fit-tnt';

const A = [
  [1, 2, 3],
  [4, 5, 6],
]; // 2 x 3 matrix (2 samples, 3 coefficients)
const b = [6, 12]; // target values; a column matrix [[6], [12]] also works

try {
  const { XBest, metadata } = new TNT(A, b);
  console.log(XBest, metadata); // fitted coefficients and solver metadata
} catch (e) {
  console.error(e);
}

The preconditioning uses Ridge regression, which is slightly different from the paper.
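
Reading this together with the steps listed further below, the idea can be sketched as follows; $\lambda$ (written $\epsilon$ in the steps) is a tiny positive number added only when the plain factorization fails, and this is an interpretation for illustration rather than the package's literal implementation:

$M = A^T\,A + \lambda\,I = R\,R^T \ \text{(Cholesky)}, \qquad A^T\,A\,x = A^T\,b \ \text{solved by conjugate gradient, preconditioned with } M^{-1}.$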

Documentation

Comparison: TNT vs Pseudo-Inverse

The larger the rows/columns ratio, the more advantageous TNT becomes. Below is a benchmark on random matrices.

Inverting the shape below (i.e. 200 rows Ɨ 500 columns), TNT runs at roughly $0.9\times$ the speed of the pseudo-inverse, i.e. slightly slower.

  • Matrix shape: 500 Ɨ 200 (rows Ɨ columns)
ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”
│ Method        │       Avg Exec Time │           Avg Error │
ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¼ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¼ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤
│ TNT           │ 0.09470919929999999 │ 0.04945702797110891 │
│ PseudoInverse │ 0.49272041820000007 │ 0.04945702797110894 │
ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”“ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”“ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜
  • Speed-up (pseudo-inverse time / TNT time): 5.202455747083906

Misc.

The linear problem $A\,x = b$ appears in all of science, and methods to solve it fast abound. But $A^{-1}$ rarely exists in practice, so the least-squares approach is used to minimize the squared error in the predictions:

$E(x) = \left\lVert A\,x - b \right\rVert_2^2, \qquad x_{\mathrm{best}} = \mathrm{arg\,min}_x\, E(x)$

We then look for $\nabla_x E(x)=0$; since $\nabla_x E(x) = 2\,A^T (A\,x - b)$, this yields the normal equations $A^T\,A\,x = A^T\,b$.

When computed directly (as done here), $A^T\,A$ has condition number $\kappa(A^T A) = \kappa(A)^2$. We try to reduce this problem with preconditioning; a larger condition number also tends to slow down convergence.
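
In terms of the singular values $\sigma_{\max} \geq \sigma_{\min} \gt 0$ of $A$ (assuming full column rank; this is standard linear algebra rather than anything specific to fit-tnt), the squaring and the effect of a small ridge term $\lambda \gt 0$ can be written as:

$\kappa(A^T A) = \dfrac{\sigma_{\max}^2}{\sigma_{\min}^2} = \kappa(A)^2, \qquad \kappa(A^T A + \lambda\,I) = \dfrac{\sigma_{\max}^2 + \lambda}{\sigma_{\min}^2 + \lambda} \leq \kappa(A)^2.$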

TNT

The Conjugate Gradient for Normal Residual (CGNR) method is popular for solving sparse least-squares problems, where the design matrix has many zeros.

The reason for "large" is that systems with $m \ll n$ can be solved faster and more accurately using the pseudo-inverse. Even though the QR-decomposition method can be more accurate, TNT tends to be faster in overdetermined problems where $m \approx n$ or $m \gt n$.

TNT revives CGNR for large, dense matrices. It uses a modified, preconditioned CGNR that updates $A^T\,A$ so that it has an inverse.

Positive definite means that $x^T M x \gt 0$ for every $x \neq 0$. In our case this is $x^T \,(A^T A)\, x \gt 0$, i.e. $(A\,x)^T (A\,x) \gt 0$.

These quantities are non-zero when the columns of $A$ are linearly independent. If the columns of $A$ are linearly independent, then $A^T A$ is positive definite and therefore invertible/non-singular.
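
Spelling out the quadratic form (standard linear algebra, independent of this package):

$x^T\,(A^T A)\,x = (A\,x)^T(A\,x) = \lVert A\,x \rVert_2^2 \gt 0 \quad \text{for all } x \neq 0, \text{ whenever the columns of } A \text{ are linearly independent.}$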

So we want to precondition $A^T A$ so that it is invertible, and we also want to avoid tiny numbers on the diagonal of the decomposition. The steps are (a code sketch follows the list):

  1. Carry out the product $N = A^T\,A$ ($N$ is symmetric).
  2. Cholesky-decompose it: $R,\ \mathrm{ok} = \mathrm{Cho}(N)$, where $\mathrm{ok}$ flags whether $N$ was positive definite.
  3. If $!\mathrm{ok}$: set $N = N + \epsilon\,I$, with $\epsilon$ a tiny number, and factor again.
  4. Initial residual: $r_0 = b - A\,x_0$.
  5. Gradient per coefficient (up to sign): $g_0 = A^T\,r_0$.
  6. Error in the coefficients: solve $N\,z_0 = g_0$ using the Cholesky factor $R$; the first search direction is $p_0 = z_0$.
  7. Get $\alpha$: $\alpha_i = \langle z_i, g_i \rangle / \langle A\,p_i,\ A\,p_i \rangle$.
  8. Update $x$: $x_{i+1} = x_i + \alpha_i\,p_i$.
  9. Next residual: $r_{i+1} = r_i - \alpha_i\,(A\,p_i)$.
  10. New gradient: $g_{i+1} = A^T\,r_{i+1}$.
  11. New error in the coefficients: solve $N\,z_{i+1} = g_{i+1}$ using $R$.
  12. Get $\beta$: $\beta_i = \langle z_{i+1}, g_{i+1} \rangle / \langle z_i, g_i \rangle$; the next search direction is $p_{i+1} = z_{i+1} + \beta_i\,p_i$.
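
The steps above are a preconditioned conjugate-gradient (CGNR) loop. Below is a minimal, self-contained JavaScript sketch of that loop, written only to make the steps concrete: it is not the fit-tnt source, and every helper name (transpose, matMul, matVec, dot, cholesky, choleskySolve, tntSketch) and every default parameter (maxIterations, tolerance, epsilon) is an assumption for this example.

// Illustrative sketch of preconditioned CGNR following the numbered steps.
// NOT the fit-tnt implementation; all names and defaults are made up here.

function transpose(M) {
  return M[0].map((_, j) => M.map((row) => row[j]));
}

function matMul(M, N) {
  return M.map((row) =>
    N[0].map((_, j) => row.reduce((sum, v, k) => sum + v * N[k][j], 0)),
  );
}

function matVec(M, v) {
  return M.map((row) => row.reduce((sum, x, j) => sum + x * v[j], 0));
}

function dot(u, v) {
  return u.reduce((sum, x, i) => sum + x * v[i], 0);
}

// Cholesky factor (lower triangular here; the steps call it R); returns null
// when N is not positive definite, which triggers the epsilon fix of step 3.
function cholesky(N) {
  const n = N.length;
  const L = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = 0; j <= i; j++) {
      let sum = N[i][j];
      for (let k = 0; k < j; k++) sum -= L[i][k] * L[j][k];
      if (i === j) {
        if (sum <= 0) return null;
        L[i][i] = Math.sqrt(sum);
      } else {
        L[i][j] = sum / L[j][j];
      }
    }
  }
  return L;
}

// Solve (L L^T) z = g by a forward then a backward substitution.
function choleskySolve(L, g) {
  const n = L.length;
  const y = new Array(n).fill(0);
  for (let i = 0; i < n; i++) {
    let sum = g[i];
    for (let k = 0; k < i; k++) sum -= L[i][k] * y[k];
    y[i] = sum / L[i][i];
  }
  const z = new Array(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let sum = y[i];
    for (let k = i + 1; k < n; k++) sum -= L[k][i] * z[k];
    z[i] = sum / L[i][i];
  }
  return z;
}

// b is a single flat right-hand side here; fit-tnt itself also accepts
// column matrices and multiple right-hand sides.
function tntSketch(A, b, { maxIterations = 100, tolerance = 1e-10, epsilon = 1e-8 } = {}) {
  const At = transpose(A);
  let N = matMul(At, A);                      // step 1: N = A^T A
  let L = cholesky(N);                        // step 2
  while (L === null) {                        // step 3: N = N + eps * I
    N = N.map((row, i) => row.map((v, j) => (i === j ? v + epsilon : v)));
    L = cholesky(N);
  }
  let x = new Array(A[0].length).fill(0);
  const Ax = matVec(A, x);
  let r = b.map((bi, i) => bi - Ax[i]);       // step 4: residual
  let g = matVec(At, r);                      // step 5: gradient term
  let z = choleskySolve(L, g);                // step 6: preconditioning solve
  let p = z.slice();                          //         first search direction
  for (let iter = 0; iter < maxIterations; iter++) {
    const w = matVec(A, p);
    if (dot(w, w) === 0) break;               // degenerate direction, stop
    const alpha = dot(z, g) / dot(w, w);      // step 7
    x = x.map((xi, k) => xi + alpha * p[k]);  // step 8
    r = r.map((ri, k) => ri - alpha * w[k]);  // step 9
    const gNew = matVec(At, r);               // step 10
    const zNew = choleskySolve(L, gNew);      // step 11
    const beta = dot(zNew, gNew) / dot(z, g); // step 12
    p = zNew.map((zi, k) => zi + beta * p[k]);//         next search direction
    g = gNew;
    z = zNew;
    if (Math.sqrt(dot(g, g)) < tolerance) break; // stop once the gradient is tiny
  }
  return x; // plays the role of XBest
}

The preconditioning solve uses both triangular factors of $N$ (steps 6 and 11). A production implementation would also track the best iterate seen so far (which the name XBest hints at), support multiple right-hand sides, and use typed arrays or an optimized matrix library; the sketch above skips all of that.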

License

MIT
