1.0.1 • Published 8 months ago

string-segmenter v1.0.1

License
ISC

String Segmenter

  • Splits strings into sentences.
  • Supports multiple languages.
  • Respects common abbreviations (Mr., Mrs., etc.) to avoid incorrect sentence splits (currently English and Spanish only).

Installation

npm install string-segmenter

Usage

import { splitBySentence } from "string-segmenter"

const text = "Dr. John Smith, Jr. gave a lecture. It was insightful."
const sentences = []

for (const { segment } of splitBySentence(text)) {
	sentences.push(segment.trim())
}

console.log(sentences)
// Output: ["Dr. John Smith, Jr. gave a lecture.", "It was insightful."]
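To illustrate why abbreviation handling matters, here is a minimal sketch of one way such merging could work on top of Intl.Segmenter. It is not the library's actual implementation: the names `ABBREVIATIONS`, `mergeAbbreviations`, and `rawSentences` are hypothetical, and the abbreviation list is an assumption made for this example.

```javascript
// Hypothetical abbreviation list; the library's real list is not shown in this README.
const ABBREVIATIONS = new Set(["Dr.", "Mr.", "Mrs.", "Jr.", "etc."]);

// Merge any segment whose last word is a known abbreviation into the segment after it.
function mergeAbbreviations(segments) {
  const merged = [];
  let pending = "";
  for (const segment of segments) {
    const candidate = pending + segment;
    const lastWord = candidate.trim().split(/\s+/).pop();
    if (ABBREVIATIONS.has(lastWord)) {
      pending = candidate; // likely a false break: hold and join with the next segment
    } else {
      merged.push(candidate);
      pending = "";
    }
  }
  if (pending) merged.push(pending); // input ended on an abbreviation
  return merged;
}

// Raw sentence segments from the platform segmenter, before any merging.
function rawSentences(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

const text = "Dr. John Smith, Jr. gave a lecture. It was insightful.";
console.log(mergeAbbreviations(rawSentences(text)).map((s) => s.trim()));
```

A simple merge like this also joins a sentence that genuinely ends in an abbreviation (for example "etc.") with the one after it, so a real implementation would likely need more context than the last word alone.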

API

splitBySentence(input: string, locale: Intl.LocalesArgument = "en"): Iterable<{ segment: string, index: number, input: string }>

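Splits the input string into sentences and returns an iterable of segment objects. Each object carries the sentence text (segment), its starting offset in the input (index), and the original string (input).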

  • input: The string to be split.
  • locale: The locale to be used for sentence segmentation. Defaults to "en".
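The returned objects have the same shape that Intl.Segmenter itself produces. The sketch below uses a hypothetical stand-in, `sentenceSegments`, built directly on Intl.Segmenter to show how the locale argument and the index field behave. It does not include the library's abbreviation handling, so its split points may differ from those of splitBySentence.

```javascript
// Stand-in for illustration only: yields the same { segment, index, input } shape
// straight from Intl.Segmenter, without the library's abbreviation handling.
function* sentenceSegments(input, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
  for (const { segment, index, input: source } of segmenter.segment(input)) {
    yield { segment, index, input: source };
  }
}

const spanish = "Hola, mundo. ¿Cómo estás? Muy bien.";
for (const { segment, index } of sentenceSegments(spanish, "es")) {
  // index is the offset (in UTF-16 code units) of the segment within the original string
  console.log(index, JSON.stringify(segment));
}
```

Because each segment keeps its trailing whitespace, concatenating the segments in order reproduces the input exactly, which is why the usage example trims each one.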

clearSegmenterCache(): void

Clears the cache of Intl.Segmenter instances.
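Constructing an Intl.Segmenter has a cost, so caching one instance per locale is a common pattern. A minimal sketch of such a cache follows, assuming a Map keyed by the locale string. The names `segmenterCache`, `getSentenceSegmenter`, and `clearCache` are hypothetical and the library's internal structure may differ.

```javascript
// Hypothetical cache keyed by locale; the library's internal structure may differ.
const segmenterCache = new Map();

// Return the cached sentence segmenter for a locale, creating it on first use.
function getSentenceSegmenter(locale = "en") {
  const key = String(locale);
  let segmenter = segmenterCache.get(key);
  if (!segmenter) {
    segmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
    segmenterCache.set(key, segmenter);
  }
  return segmenter;
}

// Drop every cached instance so the next lookup builds a new one.
function clearCache() {
  segmenterCache.clear();
}

const first = getSentenceSegmenter("en");
console.log(first === getSentenceSegmenter("en")); // true: same cached instance
clearCache();
console.log(first === getSentenceSegmenter("en")); // false: a fresh instance after clearing
```

Clearing is mainly useful in long-running processes that touch many locales, or in tests that need a clean state.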

Development

Building the Project

npm run build

Running Tests

npm test # run the test suite once

npm run dev # run the tests and re-run on file changes

Contributing

Contributions are welcome! Please open an issue or submit a pull request.
