@lenml/tokenizer-gpt4
GPT-4 tokenizer for Node.js/browser
GPT-4 tokenizer for Node.js/browser
GPT-4o tokenizer for Node.js/browser
A Node.js library to tokenize strings
A wrapper around the LunaSec CLI that lets it be used as an npm package
Lexer
Simple HTML Tokenizer is a lightweight JavaScript library that can be used to tokenize the kind of HTML normally found in templates.
A JavaScript implementation of a Japanese morphological analyzer
A Lua parser
Enhances comprehension for MindScript-powered models by tokenizing text.
A pure JavaScript implementation of a BPE tokenizer (Encoder/Decoder) for GPT-2 / GPT-3 / GPT-4 / Claude Instant / Claude 2
Promise-based tokenizer for a ReadableStream
Split text into sentence strings or word arrays with Sentence Boundary Detection (SBD). Based on http://github.com/Tessmore/sbd by Fabiën Tesselaar.
Tokenizer for ECMAScript
Flexible and asynchronous string tokenizer
Tokenizers for infix strings. Only for the SE-1222 (Data Structures) class; not for production use.
A simple generic string tokenizer
Tokenizer for C-like languages.
Regex-based tokenizer used in parsers developed by Porifa