
hanzi2reading

This library is distributed with a default database. The license information for that data is available at https://github.com/g0v/moedict-data/blob/master/README.md

Annotation of Chinese characters with Standard Mandarin (國語/普通話) readings
Agnostic to simplified/traditional script and transliteration method
Should work offline, and database format should be as compact as possible - e.g. Protocol Buffers loaded by WebAssembly
Should support word-based disambiguation of characters with multiple readings
Separation of code and data: the dictionary backend should be swappable
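
The last two goals can be illustrated with a minimal sketch. Everything named here (ReadingBackend, MapBackend, annotate) is hypothetical and is not this library's API. The sketch assumes the input is a single word, prefers a whole-word entry when the backend has one, and otherwise falls back to each character's default reading. Pinyin strings are used only for readability; a backend could return any representation.

```typescript
// Hypothetical interface; these names are illustrative, not hanzi2reading's API.
interface ReadingBackend {
  // Readings for a whole word, one per character, or undefined if unknown.
  lookupWord(word: string): string[] | undefined;
  // Default reading for a single character, or undefined if unknown.
  lookupChar(char: string): string | undefined;
}

// A trivial in-memory backend. Any other implementation of ReadingBackend
// (e.g. one reading a binary file) could be swapped in without touching annotate().
class MapBackend implements ReadingBackend {
  private readonly words: Map<string, string[]>;
  private readonly chars: Map<string, string>;

  constructor(words: Map<string, string[]>, chars: Map<string, string>) {
    this.words = words;
    this.chars = chars;
  }

  lookupWord(word: string): string[] | undefined {
    return this.words.get(word);
  }

  lookupChar(char: string): string | undefined {
    return this.chars.get(char);
  }
}

// Annotate one word: use the word-level entry if it exists and its length
// matches, otherwise fall back to per-character defaults ("?" if unknown).
function annotate(input: string, backend: ReadingBackend): string[] {
  const chars = Array.from(input); // split by code point, not UTF-16 unit
  const wordReadings = backend.lookupWord(input);
  if (wordReadings !== undefined && wordReadings.length === chars.length) {
    return wordReadings;
  }
  return chars.map((c) => backend.lookupChar(c) ?? "?");
}

// Toy data: 行 defaults to "xíng" but reads "háng" in 銀行.
const backend = new MapBackend(
  new Map([["銀行", ["yín", "háng"]]]),
  new Map([["銀", "yín"], ["行", "xíng"]]),
);

console.log(annotate("銀行", backend).join(" "));
console.log(annotate("行", backend).join(" "));
```

Because annotate() depends only on the interface, the dictionary data can be replaced without changing the lookup code.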

Word segmentation is a non-goal.
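The target is good performance on non-sentence inputs (isolated words and short phrases) without needing part-of-speech classification, which characters such as 得 would otherwise require because their reading depends on grammatical role.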

Total = 15 bits per syllable. This is less compact than enumerating all standard syllables, but allows dictionaries to have non-standard syllables.
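
As a rough illustration of fitting a syllable into 15 bits, the sketch below packs four fields into one integer. The field names and widths (5 + 2 + 5 + 3) are assumptions chosen only so that they sum to 15; the text above states the total, not this particular split. For comparison, an enumeration of the roughly 1,300 standard tonal syllables would fit in 11 bits.

```typescript
// Hypothetical field widths; only the 15-bit total comes from the text above.
const INITIAL_BITS = 5;
const MEDIAL_BITS = 2;
const FINAL_BITS = 5;
const TONE_BITS = 3;

interface Syllable {
  initial: number;
  medial: number;
  final: number;
  tone: number;
}

// Reject values that do not fit in the given number of bits.
function checkRange(value: number, bits: number, name: string): void {
  if (!Number.isInteger(value) || value < 0 || value >= (1 << bits)) {
    throw new RangeError(`${name} must be an integer in [0, ${(1 << bits) - 1}]`);
  }
}

// Pack the fields into a 15-bit integer, most significant field first.
function pack(s: Syllable): number {
  checkRange(s.initial, INITIAL_BITS, "initial");
  checkRange(s.medial, MEDIAL_BITS, "medial");
  checkRange(s.final, FINAL_BITS, "final");
  checkRange(s.tone, TONE_BITS, "tone");
  let packed = s.initial;
  packed = (packed << MEDIAL_BITS) | s.medial;
  packed = (packed << FINAL_BITS) | s.final;
  packed = (packed << TONE_BITS) | s.tone;
  return packed;
}

// Reverse of pack(): peel fields off from the least significant end.
function unpack(packed: number): Syllable {
  const tone = packed & ((1 << TONE_BITS) - 1);
  packed >>= TONE_BITS;
  const fin = packed & ((1 << FINAL_BITS) - 1);
  packed >>= FINAL_BITS;
  const medial = packed & ((1 << MEDIAL_BITS) - 1);
  packed >>= MEDIAL_BITS;
  const initial = packed & ((1 << INITIAL_BITS) - 1);
  return { initial, medial, final: fin, tone };
}

const example: Syllable = { initial: 1, medial: 2, final: 3, tone: 4 };
const code = pack(example);
console.log(code, unpack(code));
```

Storing each component as an independent field is what leaves room for combinations that never occur in the standard syllable inventory.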
