
TNT


Least-squares solver for large, dense matrices. It is based on the TNT paper by J. M. Myre et al.

Supports multiple right-hand sides.

  • Speed. Best when these apply:

    • $\frac{\mathrm{rows}}{\mathrm{cols}} \geq 1$.
    • At least $10$ data columns; but it is worth trying in any case.
  • Accuracy: frequently as accurate as QR or the Pseudo-Inverse, but the error can be larger (normally still acceptable) on ill-conditioned matrices.

For speed, see the comparison below.

For calculations with a non-zero intercept, remember to push a $1$ onto each row of $A$; the intercept will then be the last item in XBest.
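
A minimal sketch (the data are illustrative, and XBest is assumed to list the coefficients in the same order as the columns of $A$):

import { TNT } from 'fit-tnt';

// Two predictors plus a column of ones appended to each row for the intercept.
const A = [
  [1, 2, 1],
  [3, 5, 1],
  [4, 7, 1],
];
const b = [8, 17, 22];

const { XBest } = new TNT(A, b);
// The last coefficient in XBest corresponds to the appended column of ones,
// i.e. the intercept.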

A more thorough webpage to compare speed/accuracy will hopefully be included soon.

Install and Use

npm i fit-tnt

import { TNT } from 'fit-tnt';

const A = [
  [1, 2, 3],
  [4, 5, 6],
]; // 2 x 3 design matrix
const b = [6, 12]; // targets; also accepts the column form [[6], [12]]

try {
  // XBest holds the fitted coefficients; metadata holds solver diagnostics.
  const { XBest, metadata } = new TNT(A, b);
  console.log(XBest, metadata);
} catch (e) {
  console.error(e);
}

The preconditioning uses Ridge Regression. This is slightly different from the paper.
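
As a sketch of the idea (with $\lambda \gt 0$ denoting a small regularization parameter; the symbol is an assumption here), ridge preconditioning shifts the normal equations so that the matrix being factored is positive definite:

$\left( A^T\,A + \lambda\,I \right)\,x = A^T\,b$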

Documentation

Comparison: TNT vs Pseudo-Inverse

The larger the rows/columns ratio, the more advantageous TNT becomes. This is a benchmark on random matrices.

If the shape below is transposed (200 × 500), TNT runs at roughly $0.9\times$ the speed of the Pseudo-Inverse, i.e. slightly slower.

  • Matrix Shape: 500 × 200
ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”
│ (index)       │       Avg Exec Time │           Avg Error │
ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¼ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¼ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤
│ TNT           │ 0.09470919929999999 │ 0.04945702797110891 │
│ PseudoInverse │ 0.49272041820000007 │ 0.04945702797110894 │
ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”“ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”“ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜
  • Speed Up: 5.202455747083906
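
For reference, a minimal sketch of how such a timing comparison could be reproduced (the numbers above come from the package's own benchmark; ml-matrix is assumed here only to provide the Pseudo-Inverse baseline):

import { TNT } from 'fit-tnt';
import { Matrix, pseudoInverse } from 'ml-matrix';

// Random over-determined system: 500 rows, 200 columns.
const rows = 500;
const cols = 200;
const A = Array.from({ length: rows }, () =>
  Array.from({ length: cols }, () => Math.random()),
);
const b = Array.from({ length: rows }, () => Math.random());

let t = performance.now();
new TNT(A, b); // fit with TNT
const tntMs = performance.now() - t;

t = performance.now();
// Least-squares via the Moore-Penrose pseudo-inverse: x = pinv(A) * b
pseudoInverse(new Matrix(A)).mmul(Matrix.columnVector(b));
const pinvMs = performance.now() - t;

console.log({ tntMs, pinvMs, speedUp: pinvMs / tntMs });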

Misc.

The linear problem $A\,x = b$ appears throughout science, and fast methods for solving it abound. But $A^{-1}$ rarely exists in practice, so the Least-Squares approach is used to minimize the squared error of the predictions:

$\min_x E(x), \qquad E(x) = \left\lVert A\,x - b \right\rVert_2^2$

We then look for $\nabla_x E(x) = 0$, which gives the normal equations $A^T\,A\,x = A^T\,b$.

When computed directly (as done here), $A^T\,A$ has condition number $\kappa(A^T A) = \kappa(A)^2$. We try to mitigate this with preconditioning. A larger condition number also tends to slow convergence.
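
For context, in terms of the singular values of $A$ this is the standard identity:

$\kappa(A) = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)}, \qquad \kappa(A^T\,A) = \frac{\sigma_{\max}(A)^2}{\sigma_{\min}(A)^2} = \kappa(A)^2$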

TNT

The Conjugate Gradient for Normal Residual (CGNR) is a popular method for solving Sparse Least-Squares problems, where the design matrix has many zeros.

The reason for "Large" is that systems with $m \ll n$ can be solved faster and more accurately using the Pseudo-Inverse. Even though the QR-decomposition method can be more accurate, TNT tends to be faster on overdetermined problems where $m \approx n$ or $m \gt n$.

TNT revives CGNR for large, dense matrices. It uses a modified Preconditioned CGNR (PCGNR) and adjusts $A^T\,A$ so that it is invertible.

Positive definite means that $x^T M x \gt 0$ for all $x \neq 0$. In our case, $x^T\,(A^T A)\,x = (A\,x)^T (A\,x) = \lVert A\,x \rVert_2^2 \geq 0$.

This quantity is strictly positive when the columns of $A$ are linearly independent, since then $A\,x \neq 0$ for every $x \neq 0$; in that case $A^T A$ is non-singular, i.e. invertible.

So we want to precondition $A^T A$ so that it is invertible, and we also want to avoid tiny numbers on the diagonal of the decomposition.

  1. Carry out the product: $N = A^T\,A$ ($N$ is symmetric).
  2. Cholesky-decompose and factor it: $R,\ p = \mathrm{Cho}(N)$, where $R$ is the factor and $p$ flags whether the factorization succeeded.
  3. If it failed (!p): $N = N + \epsilon\,I$, with $\epsilon$ a tiny number, and factor again.
  4. Residual: $r_0 = b - A\,x_0$.
  5. Gradient per coefficient: $g_0 = A^T\,r_0$.
  6. Error in the coefficients (preconditioner solve with the Cholesky factor): $z_0 = (R^T R)^{-1}\,g_0$; this is also the initial search direction, $p_0 = z_0$.
  7. Get $\alpha$: $\alpha_i = (z_i \cdot g_i)\,/\,\lVert A\,p_i \rVert_2^2$.
  8. Update $x$: $x_{i+1} = x_i + \alpha_i\,p_i$.
  9. Next residual: $r_{i+1} = r_i - \alpha_i\,A\,p_i$.
  10. New gradient: $g_{i+1} = A^T\,r_{i+1}$.
  11. New error in the coefficients: $z_{i+1} = (R^T R)^{-1}\,g_{i+1}$.
  12. Get $\beta$: $\beta_i = (z_{i+1} \cdot g_{i+1})\,/\,(z_i \cdot g_i)$, and update the search direction: $p_{i+1} = z_{i+1} + \beta_i\,p_i$.
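
For illustration, here is a minimal, unpreconditioned sketch of this iteration in plain JavaScript (so $z_i = g_i$; the package adds the Cholesky/ridge preconditioning of steps 1-3 and its own stopping criteria, so this is not the library's actual implementation):

// CGNR: conjugate gradient applied to the normal equations A^T A x = A^T b.
function matVec(A, x) {
  return A.map((row) => row.reduce((s, v, j) => s + v * x[j], 0));
}

function matTVec(A, r) {
  const out = new Array(A[0].length).fill(0);
  for (let i = 0; i < A.length; i++) {
    for (let j = 0; j < out.length; j++) out[j] += A[i][j] * r[i];
  }
  return out;
}

const dot = (u, v) => u.reduce((s, ui, i) => s + ui * v[i], 0);

function cgnr(A, b, maxIterations = 100, tolerance = 1e-10) {
  let x = new Array(A[0].length).fill(0);
  let r = b.slice(); // residual r = b - A x, with x starting at 0
  let g = matTVec(A, r); // gradient g = A^T r
  let p = g.slice(); // initial search direction
  let gg = dot(g, g);
  for (let k = 0; k < maxIterations && gg > tolerance; k++) {
    const w = matVec(A, p); // w = A p
    const alpha = gg / dot(w, w); // step size (unpreconditioned: z = g)
    x = x.map((xi, i) => xi + alpha * p[i]);
    r = r.map((ri, i) => ri - alpha * w[i]);
    g = matTVec(A, r);
    const ggNext = dot(g, g);
    p = g.map((gi, i) => gi + (ggNext / gg) * p[i]); // beta = ggNext / gg
    gg = ggNext;
  }
  return x;
}

// A consistent 3 x 2 system: the least-squares solution is exact.
console.log(cgnr([[1, 2], [3, 4], [5, 6]], [5, 11, 17])); // ~ [1, 2]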

License

MIT
