0.0.19 • Published 4 months ago

gliner v0.0.19

License
MIT

👑 GLiNER.js: Generalist and Lightweight Named Entity Recognition for JavaScript

GLiNER.js is a TypeScript-based inference engine for running GLiNER (Generalist and Lightweight Named Entity Recognition) models. GLiNER can identify any entity type using a bidirectional transformer encoder, offering a practical alternative to traditional NER models and large language models.

🌟 Key Features

  • Flexible entity recognition without predefined categories
  • Lightweight and fast inference
  • Easy integration with web applications
  • TypeScript support for better developer experience

🚀 Getting Started

Installation

npm install gliner

Basic Usage

import { Gliner } from "gliner";

const gliner = new Gliner({
  tokenizerPath: "onnx-community/gliner_small-v2",
  onnxSettings: {
    modelPath: "public/model.onnx", // Can be a string path or Uint8Array/ArrayBufferLike
    executionProvider: "webgpu", // Optional: "cpu", "wasm", "webgpu", or "webgl"
    wasmPaths: "path/to/wasm", // Optional: path to WASM binaries
    multiThread: true, // Optional: enable multi-threading (for wasm/cpu providers)
    maxThreads: 4, // Optional: specify number of threads (for wasm/cpu providers)
    fetchBinary: true, // Optional: prefetch binary from wasmPaths
  },
  transformersSettings: {
    // Optional
    allowLocalModels: true,
    useBrowserCache: true,
  },
  maxWidth: 12, // Optional
  modelType: "gliner", // Optional
});

await gliner.initialize();

const texts = ["Your input text here"];
const entities = ["city", "country", "person"];
const options = {
  flatNer: false, // Optional
  threshold: 0.1, // Optional
  multiLabel: false, // Optional
};

const results = await gliner.inference({
  texts,
  entities,
  ...options,
});
console.log(results);
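
The executionProvider option above accepts "cpu", "wasm", "webgpu", or "webgl". One way to choose it at runtime is to check for WebGPU and fall back to WASM when it is unavailable. The following is a minimal sketch, not part of the gliner package: pickExecutionProvider and NavigatorLike are hypothetical names, and it assumes that a WebGPU-capable browser exposes navigator.gpu.

```typescript
// Values listed for `executionProvider` in the Basic Usage example.
type ExecutionProvider = "cpu" | "wasm" | "webgpu" | "webgl";

// Minimal shape of the object we probe (hypothetical type).
// In a browser, pass the global `navigator` object.
interface NavigatorLike {
  gpu?: unknown;
}

// Hypothetical helper: prefer WebGPU when it is exposed, otherwise use WASM.
function pickExecutionProvider(nav?: NavigatorLike): ExecutionProvider {
  const hasWebGpu = nav !== undefined && nav.gpu !== undefined;
  return hasWebGpu ? "webgpu" : "wasm";
}

console.log(pickExecutionProvider({ gpu: {} })); // "webgpu"
console.log(pickExecutionProvider({}));          // "wasm"
console.log(pickExecutionProvider(undefined));   // "wasm"
```

The returned value could then be passed as onnxSettings.executionProvider when constructing Gliner.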

Response Format

The inference results will be returned in the following format:

// For a single text input:
[
  {
    spanText: "New York", // The extracted entity text
    start: 10, // Start character position
    end: 18, // End character position
    label: "city", // Entity type
    score: 0.95, // Confidence score
  },
  // ... more entities
];

// For multiple text inputs, you'll get an array of arrays
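
To work with these results in TypeScript, it can help to describe the entity shape as a type and post-process it. The sketch below assumes only the fields shown in the response format above; EntityResult and groupByLabel are hypothetical names that are not exported by the package, and the sample values are illustrative.

```typescript
// Shape of one entity, using the field names from the response format.
interface EntityResult {
  spanText: string;
  start: number;
  end: number;
  label: string;
  score: number;
}

// Hypothetical helper: drop entities below `minScore`, then group by label.
function groupByLabel(
  entities: EntityResult[],
  minScore: number = 0
): Record<string, EntityResult[]> {
  const groups: Record<string, EntityResult[]> = {};
  for (const entity of entities) {
    if (entity.score < minScore) continue;
    const bucket = groups[entity.label] || [];
    bucket.push(entity);
    groups[entity.label] = bucket;
  }
  return groups;
}

// Illustrative data for a single text input (not real model output).
const sample: EntityResult[] = [
  { spanText: "New York", start: 10, end: 18, label: "city", score: 0.95 },
  { spanText: "Ada", start: 27, end: 30, label: "person", score: 0.4 },
];

const grouped = groupByLabel(sample, 0.5);
console.log(Object.keys(grouped)); // [ 'city' ]
```

For multiple text inputs, the same helper could be mapped over the outer array, one call per text.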

🛠 Setup & Model Preparation

To use GLiNER models in a web environment, you need a model in ONNX format. You can:

  1. Search for pre-converted models on HuggingFace
  2. Convert a model yourself using the official Python script

Converting to ONNX Format

Use the convert_to_onnx.py script with the following arguments:

  • model_path: Location of the GLiNER model
  • save_path: Where to save the ONNX file
  • quantize: Set to True for UInt8 quantization (optional)

Example:

python convert_to_onnx.py --model_path /path/to/your/model --save_path /path/to/save/onnx --quantize True

🌟 Use Cases

GLiNER.js offers versatile entity recognition capabilities across various domains:

  1. Enhanced Search Query Understanding
  2. Real-time PII Detection
  3. Intelligent Document Parsing
  4. Content Summarization and Insight Extraction
  5. Automated Content Tagging and Categorization ...
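
As an illustration of the PII detection use case, the start and end offsets in each result can be used to mask the detected spans. This is a minimal sketch under stated assumptions: redact and EntitySpan are hypothetical names, end is treated as exclusive (consistent with start: 10, end: 18 for "New York"), offsets are treated as JavaScript string indices, and spans are assumed not to overlap.

```typescript
// Only the fields needed for redaction, named as in the response format.
interface EntitySpan {
  start: number;
  end: number;
  label: string;
}

// Hypothetical helper: replace each span with a "[LABEL]" placeholder.
// Assumes `end` is exclusive and that spans do not overlap.
function redact(text: string, spans: EntitySpan[]): string {
  // Work from right to left so earlier offsets stay valid after each edit.
  const ordered = spans.slice().sort((a, b) => b.start - a.start);
  let output = text;
  for (const span of ordered) {
    output =
      output.slice(0, span.start) +
      "[" + span.label.toUpperCase() + "]" +
      output.slice(span.end);
  }
  return output;
}

// Illustrative input and spans (not real model output).
const inputText = "I flew to New York and met Ada.";
const redacted = redact(inputText, [
  { start: 10, end: 18, label: "city" },
  { start: 27, end: 30, label: "person" },
]);
console.log(redacted); // "I flew to [CITY] and met [PERSON]."
```

Overlapping spans, which may appear when flatNer is false, would need to be merged or filtered before calling a helper like this.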

🔧 Areas for Improvement

  • Further optimize inference speed
  • Add support for token-based GLiNER architecture
  • Implement bi-encoder GLiNER architecture for better scalability
  • Enable model training capabilities
  • Provide more usage examples

Creating a PR

  • For any change, remember to run pnpm changeset; otherwise there will be no version bump and the PR GitHub Action will fail.

🙏 Acknowledgements

📞 Support

For questions and support, please join our Discord community or open an issue on GitHub.
