0.0.2 • Published 3 months ago

Minbpe

This package is a pure JavaScript port of https://github.com/karpathy/minbpe. It currently supports only server environments; browser support is planned.

Usage

Install minbpe with npm:

npm i minbpe

Use in your JavaScript file:

import { BasicTokenizer } from 'minbpe'

const tokenizer = new BasicTokenizer()
const text = 'The quick brown fox jumps over the lazy dog'

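// optionally train a tokenizer first (in upstream minbpe the second argument
// is the target vocabulary size: 256 byte tokens + 3 merges):
// tokenizer.train(text, 256 + 3)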

const ids = tokenizer.encode(text)
const decoded = tokenizer.decode(ids)
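
To make the train / encode / decode cycle above more concrete, here is a minimal, self-contained sketch of byte-level BPE in plain JavaScript. It is an illustration of the general technique only, not this package's source code: the helper names (countPairs, mergePair, train, encode, decode) and the merge-list data structure are assumptions made for the example, and the real BasicTokenizer may differ in details such as tie-breaking.

```javascript
// Illustrative sketch of byte-level BPE. NOT the minbpe package's implementation;
// every function name below is hypothetical.

// Count how often each adjacent pair of ids occurs.
function countPairs(ids) {
  const counts = new Map()
  for (let i = 0; i < ids.length - 1; i++) {
    const key = `${ids[i]},${ids[i + 1]}`
    counts.set(key, (counts.get(key) || 0) + 1)
  }
  return counts
}

// Replace every occurrence of the pair (a, b) with newId, scanning left to right.
function mergePair(ids, a, b, newId) {
  const out = []
  let i = 0
  while (i < ids.length) {
    if (i < ids.length - 1 && ids[i] === a && ids[i + 1] === b) {
      out.push(newId)
      i += 2
    } else {
      out.push(ids[i])
      i += 1
    }
  }
  return out
}

// Learn up to (vocabSize - 256) merges from the UTF-8 bytes of text.
function train(text, vocabSize) {
  let ids = Array.from(new TextEncoder().encode(text))
  const merges = [] // ordered list of { a, b, id }
  for (let id = 256; id < vocabSize; id++) {
    const counts = countPairs(ids)
    if (counts.size === 0) break
    // Pick the most frequent pair; the first pair seen wins ties.
    let bestKey = null
    let bestCount = 0
    for (const [key, count] of counts) {
      if (count > bestCount) {
        bestKey = key
        bestCount = count
      }
    }
    const [a, b] = bestKey.split(',').map(Number)
    merges.push({ a, b, id })
    ids = mergePair(ids, a, b, id)
  }
  return merges
}

// Encode by applying the learned merges in the order they were learned.
function encode(text, merges) {
  let ids = Array.from(new TextEncoder().encode(text))
  for (const { a, b, id } of merges) ids = mergePair(ids, a, b, id)
  return ids
}

// Decode by expanding merged ids back into bytes, then reading them as UTF-8.
function decode(ids, merges) {
  const table = new Map(merges.map(({ a, b, id }) => [id, [a, b]]))
  const expand = (id) => (id < 256 ? [id] : table.get(id).flatMap(expand))
  return new TextDecoder().decode(Uint8Array.from(ids.flatMap(expand)))
}

const sample = 'aaabdaaabac'
const merges = train(sample, 256 + 3) // 256 byte tokens + 3 merges
const sampleIds = encode(sample, merges)
console.log(sampleIds) // [ 258, 100, 258, 97, 99 ]
console.log(decode(sampleIds, merges) === sample) // true
```

With three merges, the 11 input bytes shrink to 5 tokens, and decoding restores the original string exactly. An untrained tokenizer of this kind has no merges, so encoding simply returns the raw UTF-8 bytes.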

Progress status

  • Base Tokenizer
  • Regex Tokenizer (work in progress)
  • GPT4 Tokenizer (work in progress)

  • Node.js support
  • Deno support
  • Bun support
  • Cloudflare Workers support (work in progress)

License

Apache 2.0
