0.1.2 • Published 3 years ago

@vasqo/node-tesseract v0.1.2

@vasqo/node-tesseract

This is a modern re-implementation of Desmond Morris's node-tesseract package.

Features:

  • Fully TypeScript-typed
  • Promise-based
  • Supports Tesseract 4.1.1 (the latest version at the time of writing)
  • 0 dependencies

As this was originally intended as an internal project, it focuses on vasqo's needs. External PRs that don't suit those needs will be declined, and we may publish new versions at any time that completely change the code's behaviour. However, the package is published on the public npm registry, so you can use it as you wish under the terms of the MIT license.

Requirements

The package is tested with Tesseract 4.1.1. Other versions >= 4.0.0 will probably work; earlier versions might, but you will have to try them out. It requires Node.js v14.17.0 or higher (we use the randomUUID function from the crypto module). The TypeScript code is transpiled to ES2021 JavaScript.
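
The Node.js requirement can be checked at startup before any OCR call is made. The following is only a sketch, not part of the package: meetsMinimumNode is a hypothetical helper name, and the 14.17.0 default is taken from the requirement stated above.

```typescript
// Hypothetical helper, NOT exported by @vasqo/node-tesseract.
// Compares a dotted version string (e.g. "14.17.0" or "v18.12.1")
// against a minimum version, component by component.
function meetsMinimumNode(version: string, minimum = '14.17.0'): boolean {
  const parse = (v: string): number[] =>
    v
      .replace(/^v/, '')
      .split('.')
      .map((part) => Number.parseInt(part, 10) || 0);

  const actual = parse(version);
  const required = parse(minimum);

  for (let i = 0; i < 3; i++) {
    const diff = (actual[i] ?? 0) - (required[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return true; // the two versions are equal
}
```

In a Node.js program, the running version is available as process.versions.node, so a guard could read meetsMinimumNode(process.versions.node). A feature check on crypto.randomUUID would be a more direct test of what the package actually needs.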

If you don't have Tesseract installed on your machine, you can follow the installation instructions here. On macOS, if you have Homebrew installed, you can simply run:

brew install tesseract tesseract-lang

tesseract-lang is required if you intend to read text that is not written in English.
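
To confirm that a language pack is actually installed, the output of tesseract --list-langs can be inspected. The sketch below is an illustration under assumptions: parseInstalledLanguages is a hypothetical helper, and it assumes the command prints a header line beginning with "List of available languages" followed by one language code per line, as Tesseract 4.x does.

```typescript
// Hypothetical helper, NOT exported by @vasqo/node-tesseract.
// Extracts language codes from the text printed by `tesseract --list-langs`.
// Assumption: the output starts with a header line such as
// "List of available languages (3):" followed by one code per line.
function parseInstalledLanguages(output: string): string[] {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith('List of'));
}

// Sample output, written by hand for illustration.
const sampleOutput = 'List of available languages (3):\ndeu\neng\nosd\n';
const installedLanguages = parseInstalledLanguages(sampleOutput);
const hasGerman = installedLanguages.includes('deu');
```

In practice the output string would come from running the Tesseract binary, for example through Node's child_process module, rather than from a hard-coded sample.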

Usage

Install the package with npm install @vasqo/node-tesseract. Then use the exported process function with async/await or with .then(), in JS or TS files:

import { process, TesseractOptions } from '@vasqo/node-tesseract';

// Recognize text of any language in any format
let parsedText: string;
try {
  parsedText = await process('/path/to/image.jpg');
} catch (err) {
  console.error(err);
}

// Or equivalently:
let parsedTextNoAwait: string;
process('/path/to/image.jpg')
  .then((text) => {
    parsedTextNoAwait = text;
  })
  .catch((err) => console.error(err));

// Recognize German text in a single uniform block of text and set the binary path
const options: TesseractOptions = {
  language: 'deu',
  psm: 6,
  binary: '/usr/local/bin/tesseract',
};

const germanText = await process('/path/to/image.jpg', options);
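
Because process returns a promise, it composes with ordinary async helpers. The following sketch shows one possible way to run OCR over several images while recording failures per image. recognizeAll, Recognizer and BatchResult are hypothetical names introduced for illustration; the recognizer is passed in as a parameter so the helper does not assume anything about the package beyond the call shape shown above.

```typescript
// Any function with this shape works. In real use, this would wrap the
// `process` function exported by @vasqo/node-tesseract.
type Recognizer = (imagePath: string) => Promise<string>;

interface BatchResult {
  imagePath: string;
  text?: string; // set when recognition succeeded
  error?: unknown; // set when recognition failed
}

// Hypothetical helper, NOT part of the package.
// Processes images one at a time and records a failure for an individual
// image instead of aborting the whole batch.
async function recognizeAll(
  imagePaths: string[],
  recognize: Recognizer,
): Promise<BatchResult[]> {
  const results: BatchResult[] = [];
  for (const imagePath of imagePaths) {
    try {
      results.push({ imagePath, text: await recognize(imagePath) });
    } catch (error) {
      results.push({ imagePath, error });
    }
  }
  return results;
}
```

With the package installed, a call might look like recognizeAll(['/path/to/a.jpg', '/path/to/b.jpg'], (path) => process(path, options)). Running sequentially keeps resource usage predictable; a concurrent variant is possible but is not shown here.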

License

This package is MIT-licensed; you can read more about it here. Many thanks to Desmond Morris for creating the original node-tesseract package.
