1.0.1 • Published 4 years ago

simple-text-tokenizer v1.0.1

License
MIT

Simple Text Tokenizer

Tokenize text into paragraphs, sentences, subsentences, and words.

Installation

Use npm:

npm install simple-text-tokenizer

How to

Import the functions:

import { getParagraphTokens, getSentenceTokens, getSubSentenceTokens } from 'simple-text-tokenizer';

Tokenize text into paragraphs:

getParagraphTokens('this is the text of paragraph1\n\n\n this is the text of paragraph2\n');
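
To make the paragraph level concrete, here is a minimal sketch of one way such a split could work. `splitParagraphs` is a hypothetical helper written for illustration, not the package's implementation, and it assumes paragraphs are separated by one or more newlines.

```javascript
// Hypothetical helper for illustration only; not part of simple-text-tokenizer.
// Assumption: paragraphs are separated by one or more newline characters.
function splitParagraphs(text) {
  return text
    .split(/\n+/)                 // break on runs of newlines
    .map((p) => p.trim())         // drop surrounding whitespace
    .filter((p) => p.length > 0); // discard empty pieces (e.g. after a trailing newline)
}

const paragraphs = splitParagraphs(
  'this is the text of paragraph1\n\n\n this is the text of paragraph2\n'
);
console.log(paragraphs);
// → ['this is the text of paragraph1', 'this is the text of paragraph2']
```

The value returned by `getParagraphTokens` itself may be shaped differently; check the package source for its actual return type.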

Tokenize paragraph to sentences:

getSentenceTokens('this is the text of sentence1. And this is sentence2!');
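
As a rough illustration of sentence-level splitting, the sketch below uses a single regular expression. `splitSentences` is a hypothetical name, and the rule it applies (a sentence ends at `.`, `!`, or `?`) is an assumption that will mis-split abbreviations such as "e.g."; the package may handle these cases differently.

```javascript
// Hypothetical helper for illustration only; not part of simple-text-tokenizer.
// Assumption: a sentence ends at '.', '!' or '?' (naive; abbreviations will be split).
function splitSentences(paragraph) {
  // Each match is a run of non-terminator characters plus any trailing terminators.
  const matches = paragraph.match(/[^.!?]+[.!?]*/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

const sentences = splitSentences('this is the text of sentence1. And this is sentence2!');
console.log(sentences);
// → ['this is the text of sentence1.', 'And this is sentence2!']
```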

Tokenize a sentence into subsentences:

getSubSentenceTokens('this is the text of subsentence1, this is subsentence2; and this is the 3rd one!');
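
A subsentence can be thought of as a clause delimited by internal punctuation. The sketch below assumes commas, semicolons, and colons are the delimiters; both that choice and the helper name `splitSubSentences` are assumptions made for illustration, not the package's documented behavior.

```javascript
// Hypothetical helper for illustration only; not part of simple-text-tokenizer.
// Assumption: subsentences are separated by ',', ';' or ':'.
function splitSubSentences(sentence) {
  return sentence
    .split(/[,;:]/)               // break at clause-level punctuation
    .map((s) => s.trim())         // drop surrounding whitespace
    .filter((s) => s.length > 0); // discard empty pieces
}

const subSentences = splitSubSentences(
  'this is the text of subsentence1, this is subsentence2; and this is the 3rd one!'
);
console.log(subSentences);
// → ['this is the text of subsentence1', 'this is subsentence2', 'and this is the 3rd one!']
```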

Tokenize a sentence into words:

getSubSentenceTokens('this is the text of subsentence1, this is sentence2; and this is the 3rd one!');
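
For the word level, a minimal sketch of what splitting into words might look like is shown below. `splitWords` is a hypothetical helper, not a function exported by the package, and it assumes a word is any run of letters, digits, or apostrophes.

```javascript
// Hypothetical helper for illustration only; not part of simple-text-tokenizer.
// Assumption: a word is a run of Unicode letters, digits, or apostrophes.
function splitWords(sentence) {
  // The `u` flag enables the \p{L} (letter) and \p{N} (number) property escapes.
  return sentence.match(/[\p{L}\p{N}']+/gu) || [];
}

const words = splitWords('and this is the 3rd one!');
console.log(words);
// → ['and', 'this', 'is', 'the', '3rd', 'one']
```

Punctuation is discarded in this sketch; a tokenizer that needs to preserve it would return the separators as tokens as well.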