
pīnyīn (v3)

pinyin: a tool for converting Chinese characters to pinyin.

Web Site: Simplified Chinese | English | Korean

README: Simplified Chinese | English | Korean

Converts Han characters (汉字) to pinyin. Useful for phonetic notation, sorting, and searching.

Note: This module supports both Node and web browsers.
For a Python version, see mozillazg/python-pinyin.

Features

Word segmentation to resolve heteronyms.
Supports Traditional and Simplified Chinese.
Supports multiple pinyin styles.

Install

via npm:

npm install pinyin --save

Usage

For developers:

import pinyin from "pinyin";

console.log(pinyin("中心"));    // [ [ 'zhōng' ], [ 'xīn' ] ]

console.log(pinyin("中心", {
  heteronym: true                // Enable heteronym mode.
}));                            // [ [ 'zhōng', 'zhòng' ], [ 'xīn' ] ]

console.log(pinyin("中心", {
  heteronym: true,              // Enable heteronym mode.
  segment: true                 // Enable Chinese word segmentation; fixes most heteronym problems.
}));                            // [ [ 'zhōng' ], [ 'xīn' ] ]

console.log(pinyin("我喜欢你", {
  segment: true,                // Enable segmentation. Needed for grouping.
  group: true                   // Group pinyin segments
}));                            // [ [ 'wǒ' ], [ 'xǐhuān' ], [ 'nǐ' ] ]

console.log(pinyin("中心", {
  style: pinyin.STYLE_INITIALS, // Setting pinyin style.
  heteronym: true
}));                            // [ [ 'zh' ], [ 'x' ] ]

console.log(pinyin("华夫人", {
  mode: "surname",              // Surname mode.
}));                            // [ ['huà'], ['fū'], ['rén'] ]

For the CLI:

$ pinyin 中心
zhōng xīn
$ pinyin -h

Types

IPinyinOptions

The type of the second argument of the pinyin function.

export interface IPinyinOptions {
  style?: IPinyinStyle; // Output style of pinyin.
  mode?: IPinyinMode; // Mode of pinyin.
  segment?: IPinyinSegment | boolean;
  heteronym?: boolean;
  group?: boolean;
  compact?: boolean;
}

IPinyinStyle

The output style of pinyin.

export type IPinyinStyle =
  "normal" | "tone" | "tone2" | "to3ne" | "initials" | "first_letter" | // Recommended.
  "NORMAL" | "TONE" | "TONE2" | "TO3NE" | "INITIALS" | "FIRST_LETTER" |
  0        | 1      | 2       | 5       | 3          | 4;               // For compatibility.

IPinyinMode

The mode of pinyin.

// - NORMAL: normal mode (the default).
// - SURNAME: surname mode, for Chinese surnames.
export type IPinyinMode =
  "normal" | "surname" |
  "NORMAL" | "SURNAME";

IPinyinSegment

The segment method.

Segmentation is disabled by default (segment: false).
If set to true, "Intl.Segmenter" is used as the default segmenter on both the web and Node.
You can also pass one of the following strings to choose a segmenter (but only "Intl.Segmenter" and "segmentit" are supported on the web):

export type IPinyinSegment = "Intl.Segmenter" | "nodejieba" | "segmentit" | "@node-rs/jieba";
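
As a rough illustration of the web restriction above, the sketch below builds an options object whose segment value falls back to "Intl.Segmenter" when the preferred segmenter is not one of the two web-supported names. The helper chooseSegmenter and the constant WEB_SEGMENTERS are hypothetical names introduced here, not part of the pinyin API, and the browser check via typeof window is only an assumption about how you might detect the environment.

```javascript
// Segmenter names taken from IPinyinSegment; per the note above,
// only these two are supported on the web.
const WEB_SEGMENTERS = ["Intl.Segmenter", "segmentit"];

// Hypothetical helper (not part of pinyin): keep the preferred segmenter
// when the environment supports it, otherwise fall back to "Intl.Segmenter".
function chooseSegmenter(preferred, isBrowser) {
  if (isBrowser && !WEB_SEGMENTERS.includes(preferred)) {
    return "Intl.Segmenter";
  }
  return preferred;
}

// Options object shaped like IPinyinOptions, ready to pass as the
// second argument of pinyin().
const options = {
  heteronym: true,
  segment: chooseSegmenter("nodejieba", typeof window !== "undefined"),
};

console.log(options.segment);
```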

API

`<Array> pinyin(words[, options])`

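Converts Han characters (汉字) to pinyin.

The options argument is optional; it is used to specify the heteronym mode and pinyin style.

Returns an Array<Array<String>>, with one inner array per character. If a character is a heteronym and heteronym mode is enabled, its inner array contains multiple pinyin readings.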
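
To make the nested return shape concrete, here is a small sketch that works on a literal copied from the usage example for pinyin("中心", { heteronym: true }) rather than calling the library. The helpers firstReadings and heteronymIndexes are hypothetical names used only for illustration.

```javascript
// Sample value copied from the usage example: pinyin("中心", { heteronym: true }).
const result = [["zhōng", "zhòng"], ["xīn"]];

// Join the first reading of every character into one space-separated string.
function firstReadings(nested) {
  return nested.map((readings) => readings[0]).join(" ");
}

// Return the indexes of characters that came back with more than one reading.
function heteronymIndexes(nested) {
  return nested.flatMap((readings, i) => (readings.length > 1 ? [i] : []));
}

console.log(firstReadings(result));    // zhōng xīn
console.log(heteronymIndexes(result)); // [ 0 ]
```

Taking the first reading is only a simple default; when accuracy matters, enabling segmentation is the approach this README suggests for resolving heteronyms.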

`Number pinyin.compare(a, b)`

The default comparison function for sorting by pinyin.

Options

`<Boolean> options.segment`

Enables Chinese word segmentation. Segmentation helps resolve heteronyms, but it is slower and uses more CPU and memory.

Default is false.

`<Boolean> options.heteronym`

Enables or disables heteronym mode. Disabled (false) by default.

`<Boolean> options.group`

Groups pinyin by phrase. For example:

我喜欢你
wǒ xǐhuān nǐ

`<Object> options.style`

Specifies the pinyin style. Use the static properties named STYLE_*. The default is .STYLE_TONE. See Static Property for more.

`options.mode`

The pinyin mode. The default is pinyin.MODE_NORMAL. If you know the input is a surname, pinyin.MODE_SURNAME may give better results.

Static Property

`.STYLE_NORMAL`

Normal style, without tones.

Example: pin yin

`.STYLE_TONE`

Tone style, with tone marks. This is the default.

Example: pīn yīn

`.STYLE_TONE2`

Tone style with the tone written as a number (0-4) at the end of each syllable.

Example: pin1 yin1

`.STYLE_TO3NE`

Tone style with the tone written as a number (0-4) directly after the character that would carry the tone mark.

Example: pi1n yi1n
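
The difference between the two numbered styles is only where the tone number is placed. The following sketch shows that on tone-marked input; it is not the library's implementation, and the names ROWS, TONE_TABLE, renumber, toTone2 and toTo3ne are hypothetical. It also assumes that syllables without a tone mark are left unchanged, which is a simplification.

```javascript
// Precomposed tone-marked vowels, one row per base vowel, ordered by tone 1-4.
const ROWS = { a: "āáǎà", e: "ēéěè", i: "īíǐì", o: "ōóǒò", u: "ūúǔù", "ü": "ǖǘǚǜ" };

// Map each marked vowel to its base letter and tone number.
const TONE_TABLE = new Map();
for (const [base, row] of Object.entries(ROWS)) {
  [...row.normalize("NFC")].forEach((ch, i) => {
    TONE_TABLE.set(ch, { base: base.normalize("NFC"), tone: i + 1 });
  });
}

// Rewrite one tone-marked syllable; `place` decides where the number goes:
// "end" appends it to the syllable, "inline" puts it right after the marked vowel.
function renumber(syllable, place) {
  const chars = [...syllable.normalize("NFC")];
  const index = chars.findIndex((ch) => TONE_TABLE.has(ch));
  if (index === -1) return chars.join(""); // no tone mark: left unchanged (simplification)
  const { base, tone } = TONE_TABLE.get(chars[index]);
  chars[index] = base;
  if (place === "end") {
    chars.push(String(tone));
  } else {
    chars.splice(index + 1, 0, String(tone));
  }
  return chars.join("");
}

const toTone2 = (syllable) => renumber(syllable, "end");
const toTo3ne = (syllable) => renumber(syllable, "inline");

console.log(toTone2("pīn"), toTone2("yīn")); // pin1 yin1
console.log(toTo3ne("pīn"), toTo3ne("yīn")); // pi1n yi1n
```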

`.STYLE_INITIALS`

Initial consonant (of a Chinese syllable).

Example: the initials of 中国 are zh g

Note: when a Han character (汉字) has no initial consonant, it is converted to an empty string.

`.STYLE_FIRST_LETTER`

First letter style.

Example: p y

`pinyin.MODE_NORMAL`

Normal mode. This is the default mode.

`pinyin.MODE_SURNAME`

Surname mode. If the Chinese word is a surname, the surname reading is prioritized.

Test

npm test

Q&A

How to sort by pinyin?

This module provides a default comparison function:

import pinyin from "pinyin";

const data = '我要排序'.split('');
const sortedData = data.sort(pinyin.compare);

If you need a different implementation, you can write your own like this:

import pinyin from "pinyin";

const data = '我要排序'.split('');

// Consider persisting the pinyin results so they are not recomputed on every sort.
const pinyinData = data.map(han => ({
  han: han,
  pinyin: pinyin(han)[0][0], // Choose your own options and style.
}));
const sortedData = pinyinData.sort((a, b) => {
  return a.pinyin.localeCompare(b.pinyin);
}).map(d => d.han);
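
The comment above suggests storing pinyin results instead of recomputing them. One possible way to do that in memory is sketched below, with a Map used as a cache of sort keys. To stay self-contained it reads from a small hand-written table, TOY_READINGS, as a hypothetical stand-in for a call such as pinyin(han, { style: "normal" })[0][0]; lookup, keyCache, sortKey and sortByKey are likewise illustrative names, not part of the library.

```javascript
// Hypothetical stand-in for pinyin(han, { style: "normal" })[0][0]; a real
// program would call the library here instead of reading a hand-written table.
const TOY_READINGS = { "我": "wo", "要": "yao", "排": "pai", "序": "xu" };
const lookup = (han) => TOY_READINGS[han] ?? han;

// Cache of sort keys so each distinct string is converted only once.
const keyCache = new Map();
function sortKey(text) {
  if (!keyCache.has(text)) {
    keyCache.set(text, [...text].map(lookup).join(" "));
  }
  return keyCache.get(text);
}

// Sort a copy of the input by the cached keys, using plain string comparison.
function sortByKey(items) {
  return [...items].sort((a, b) => {
    const ka = sortKey(a);
    const kb = sortKey(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
}

console.log(sortByKey("我要排序".split(""))); // [ '排', '我', '序', '要' ]
```

The keys here are toneless ASCII, so comparing them with < and > gives the same order in every environment; localeCompare on tone-marked strings can depend on the runtime's locale data.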