1.0.0 • Published 5 years ago

@vlr/tokenize v1.0.0

License: MIT
Repository: gitlab

@vlr/tokenize

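A basic tokenize function that splits a content string into known and unknown tokens. Known tokens are the strings you pass in; any text between them is returned as an unknown token. When several known tokens match at the same position, the longer one takes priority.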
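To make the longest-match rule concrete, here is a minimal sketch of how such a splitter could work. It is not the package's actual source: `sketchTokenize`, `sketchTokenizePlain`, and `SketchToken` are hypothetical names, and the left-to-right scanning strategy is an assumption that happens to reproduce the outputs documented in the usage section.

```typescript
// Hypothetical sketch, NOT the actual @vlr/tokenize implementation.
interface SketchToken {
  token: string;
  position: number;
}

function sketchTokenize(content: string, known: string[]): SketchToken[] {
  // Try longer known tokens first so "ten" wins over "t" at the same position.
  // filter() returns a copy, so sorting does not mutate the caller's array.
  const byLength = known
    .filter((t) => t.length > 0)
    .sort((a, b) => b.length - a.length);

  const result: SketchToken[] = [];
  let unknownStart = 0; // start of the pending run of unmatched characters
  let i = 0;

  while (i < content.length) {
    const match = byLength.find((t) => content.startsWith(t, i));
    if (match === undefined) {
      i += 1; // no known token here, so extend the unknown run
      continue;
    }
    if (i > unknownStart) {
      // Flush the unknown text collected before this match.
      result.push({ token: content.slice(unknownStart, i), position: unknownStart });
    }
    result.push({ token: match, position: i });
    i += match.length;
    unknownStart = i;
  }

  // Flush any trailing unknown text.
  if (unknownStart < content.length) {
    result.push({ token: content.slice(unknownStart), position: unknownStart });
  }
  return result;
}

// Plain variant: same split, tokens only.
function sketchTokenizePlain(content: string, known: string[]): string[] {
  return sketchTokenize(content, known).map((t) => t.token);
}
```

Sorting the known tokens by descending length before scanning is one simple way to get the "longer tokens first" behaviour. The real package may implement it differently.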

usage

The tokenize function returns an array of objects, each holding a token and its position in the content string.

import { tokenize } from "@vlr/tokenize";

const result = tokenize("my content", ["con", " "]);
// [
//  {token: "my", position: 0},
//  {token: " ", position: 2},
//  {token: "con", position: 3},
//  {token: "tent", position: 6}
// ]
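Judging by the example output above, the tokens appear to cover the content with no gaps: each position equals the previous position plus the previous token's length. The sketch below checks that on the documented result. The `PositionedToken` interface name is hypothetical, and the array is copied from the example rather than produced by calling the package.

```typescript
// Shape taken from the documented output; the interface name is an assumption.
interface PositionedToken {
  token: string;
  position: number;
}

// Result copied from the tokenize("my content", ["con", " "]) example.
const tokens: PositionedToken[] = [
  { token: "my", position: 0 },
  { token: " ", position: 2 },
  { token: "con", position: 3 },
  { token: "tent", position: 6 },
];

// Joining the tokens in order gives back the original content.
const rebuilt = tokens.map((t) => t.token).join("");

// Each token starts exactly where the previous one ended.
const contiguous = tokens.every(
  (t, i) =>
    i === 0 ||
    tokens[i - 1].position + tokens[i - 1].token.length === t.position
);
```

If that property holds in general, `position` can be used directly as an offset into the original string, for example to report where an unknown token was found.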

The tokenizePlain function returns a plain array of token strings, without positions.

import { tokenizePlain } from "@vlr/tokenize";

const plain = tokenizePlain("some content", ["ten", "t", " "]);
// ["some", " ", "con", "ten", "t"]