1.0.4 • Published 6 years ago
grapheme-breaker-mjs
A JavaScript (ES Module) implementation for web apps and Node.js of the Unicode 13.0.0 grapheme cluster breaking algorithm (UAX #29)
This is a fork of grapheme-breaker-u10-0, which supports Unicode 10.0 and emoji v5 (by @vaskevich, published by @yumetodo).
The base project is grapheme-breaker by @devongovett.
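To illustrate why cluster-aware splitting matters, the sketch below compares JavaScript's built-in ways of counting "characters" for the sample string used in the examples on this page. It uses only standard language features, not this library, and the variable names are illustrative.

```javascript
// The sample string contains three user-perceived characters.
const sample = '😜🇺🇸👍'

// String.prototype.length counts UTF-16 code units.
const codeUnits = sample.length

// Spreading a string iterates by code point, which still splits
// the flag into its two regional indicator symbols.
const codePoints = [...sample].length

console.log(codeUnits)  // => 8
console.log(codePoints) // => 4
```

Neither count matches the three clusters a reader sees, which is the gap a UAX #29 implementation fills.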
for Web
test page
https://taisukef.github.io/grapheme-breaker-mjs/
import GraphemeBreaker from 'https://taisukef.github.io/grapheme-breaker-mjs/src/GraphemeBreaker.mjs'
console.log(GraphemeBreaker.break('😜🇺🇸👍')) // => [ '😜', '🇺🇸', '👍' ]
Installation
You can install via npm
npm i grapheme-breaker-mjs
Example
import GraphemeBreaker from 'grapheme-breaker-mjs'
// break a string into an array of grapheme clusters
GraphemeBreaker.break('Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞') // => ['Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍', 'A̴̵̜̰͔ͫ͗͢', 'L̠ͨͧͩ͘', 'G̴̻͈͍͔̹̑͗̎̅͛́', 'Ǫ̵̹̻̝̳͂̌̌͘', '!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞']
// or just count the number of grapheme clusters in a string
GraphemeBreaker.countBreaks('Z͑ͫ̓ͪ̂ͫ̽͏̴̙̤̞͉͚̯̞̠͍A̴̵̜̰͔ͫ͗͢L̠ͨͧͩ͘G̴̻͈͍͔̹̑͗̎̅͛́Ǫ̵̹̻̝̳͂̌̌͘!͖̬̰̙̗̿̋ͥͥ̂ͣ̐́́͜͞') // => 6
// use nextBreak and previousBreak to get break points starting
// from anywhere in the string
GraphemeBreaker.nextBreak('😜🇺🇸👍', 3) // => 6
GraphemeBreaker.previousBreak('😜🇺🇸👍', 3) // => 2
Development Notes
You shouldn't need to know any of this to use the library, but the following notes may help if you're interested in contributing or fixing bugs.
- The src/classes.mjs file is generated from GraphemeBreakProperty.txt in the Unicode database by src/generate_data.mjs. You should rarely need to run this, but you may if, for instance, you want to change the Unicode version.
- You can run the tests using npm test. They are written using mocha and generated from GraphemeBreakTest.txt and emoji-test.txt from the Unicode database, which are included in the repository for performance reasons while running them.
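For readers unfamiliar with GraphemeBreakTest.txt, here is a minimal sketch of how one of its data lines could be turned into the clusters a test would expect. This is not the repository's actual test generator: the function name parseBreakTestLine is hypothetical, and the sample line is an illustrative one written in the file's format, where ÷ marks a break, × marks a position with no break, and the other tokens are hexadecimal code points.

```javascript
// Parse one data line of GraphemeBreakTest.txt into the clusters it expects.
// (Hypothetical helper for illustration, not part of this library.)
function parseBreakTestLine(line) {
  // Drop the trailing '#' comment and surrounding whitespace.
  const rule = line.split('#')[0].trim()
  if (rule === '') return null // blank or comment-only line

  const clusters = []
  let current = ''
  for (const token of rule.split(/\s+/)) {
    if (token === '÷') {
      // A break closes the cluster collected so far, if any.
      if (current !== '') clusters.push(current)
      current = ''
    } else if (token !== '×') {
      // Any other token is a hexadecimal code point.
      current += String.fromCodePoint(parseInt(token, 16))
    }
  }
  return clusters
}

// Illustrative line: 'a' followed by a combining diaeresis, then 'b'.
const expected = parseBreakTestLine('÷ 0061 × 0308 ÷ 0062 ÷ # illustrative comment')
console.log(expected.length) // => 2
```

A test could then join the expected clusters into one string, pass it to GraphemeBreaker.break, and compare the result with the expected array.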
License
MIT