pinyin: a tool for converting Chinese characters to pinyin.
Web Site: 简体中文 | English | 한국어
Converts Han characters (汉字) to pinyin. Useful for phonetic notation, sorting, and searching.
Note: This module supports both Node.js and web browsers.
For a Python version, see mozillazg/python-pinyin.
Via npm:
npm install pinyin --save
For developers:
import pinyin from "pinyin";

console.log(pinyin("中心")); // [ [ 'zhōng' ], [ 'xīn' ] ]

console.log(pinyin("中心", {
  heteronym: true // Enable heteronym mode.
})); // [ [ 'zhōng', 'zhòng' ], [ 'xīn' ] ]

console.log(pinyin("中心", {
  heteronym: true, // Enable heteronym mode.
  segment: true // Enable Chinese word segmentation, which fixes most heteronym problems.
})); // [ [ 'zhōng' ], [ 'xīn' ] ]

console.log(pinyin("我喜欢你", {
  segment: true, // Enable segmentation. Needed for grouping.
  group: true // Group pinyin segments.
})); // [ [ 'wǒ' ], [ 'xǐhuān' ], [ 'nǐ' ] ]

console.log(pinyin("中心", {
  style: pinyin.STYLE_INITIALS, // Set the pinyin style.
  heteronym: true
})); // [ [ 'zh' ], [ 'x' ] ]

console.log(pinyin("华夫人", {
  mode: "surname", // Surname mode.
})); // [ ['huà'], ['fū'], ['rén'] ]
For the CLI:
$ pinyin 中心
zhōng xīn
$ pinyin -h
The type of the second argument of the pinyin method.
export interface IPinyinOptions {
  style?: IPinyinStyle; // Output style of pinyin.
  mode?: IPinyinMode; // Mode of pinyin.
  segment?: IPinyinSegment | boolean;
  heteronym?: boolean;
  group?: boolean;
  compact?: boolean;
}
The output style of pinyin.
export type IPinyinStyle =
  "normal" | "tone" | "tone2" | "to3ne" | "initials" | "first_letter" | // Recommended.
  "NORMAL" | "TONE" | "TONE2" | "TO3NE" | "INITIALS" | "FIRST_LETTER" |
  0 | 1 | 2 | 5 | 3 | 4; // For compatibility.
The mode of pinyin.
// - NORMAL: normal mode, the default.
// - SURNAME: surname mode, for Chinese surnames.
export type IPinyinMode =
  "normal" | "surname" |
  "NORMAL" | "SURNAME";
The segmentation method.
false: segmentation is off.
true: use the "Intl.Segmenter" module by default for segmentation, on both the Web and Node.
You can also specify one of the following strings for segmentation (but only "Intl.Segmenter" and "segmentit" are supported on the Web):
export type IPinyinSegment = "Intl.Segmenter" | "nodejieba" | "segmentit" | "@node-rs/jieba";
Converts Han characters (汉字) to pinyin.
The options argument is optional and specifies the heteronym mode and pinyin style.
Returns an Array<Array<String>>. If a character is a heteronym, its entry may contain multiple pinyin readings.
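To make the return shape concrete, here is a minimal sketch that works on a literal result copied from the heteronym example earlier in this README instead of calling the library. The helper names `firstReadings` and `allReadings` are hypothetical and not part of this module.

```javascript
// A literal result in the documented Array<Array<String>> shape:
// one inner array per character, one entry per candidate reading.
const sampleResult = [["zhōng", "zhòng"], ["xīn"]];

// Take the first candidate of every character and join with spaces.
function firstReadings(pinyinResult) {
  return pinyinResult.map((candidates) => candidates[0]).join(" ");
}

// Expand every combination of candidates (a Cartesian product).
function allReadings(pinyinResult) {
  return pinyinResult
    .reduce(
      (acc, candidates) =>
        acc.flatMap((prefix) => candidates.map((c) => [...prefix, c])),
      [[]]
    )
    .map((syllables) => syllables.join(" "));
}

console.log(firstReadings(sampleResult)); // zhōng xīn
console.log(allReadings(sampleResult)); // [ 'zhōng xīn', 'zhòng xīn' ]
```

Picking the first candidate is the simplest choice; enumerating all combinations can be useful for search indexes, at the cost of a result list that grows quickly with the number of heteronyms.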
The default compare implementation for sorting by pinyin.
Enables Chinese word segmentation. Segmentation helps resolve heteronyms, but it is slower and uses more CPU and memory.
The default is false.
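As a rough illustration of what word segmentation produces, the sketch below calls the standard `Intl.Segmenter` API directly (available in recent Node.js versions and browsers). It does not show how this module uses the segmenter internally, and the exact word boundaries depend on the runtime's ICU data. `segmentWords` is a hypothetical helper name.

```javascript
// Split text into word-like segments with the built-in Intl.Segmenter.
function segmentWords(text, locale = "zh") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  // segment() returns an iterable of objects with a `segment` string each.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

const segmented = segmentWords("我喜欢你");
console.log(segmented); // boundaries vary with the runtime's ICU data
```

Whatever the boundaries are, the segments always concatenate back to the input, so per-character pinyin can be mapped onto them afterwards.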
Enables or disables heteronym mode. The default is false (disabled).
Groups pinyin by phrase. For example:
我喜欢你
wǒ xǐhuān nǐ
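One way to picture grouping is as a merge of per-character pinyin along word boundaries. The sketch below assumes the segmentation is already known and takes only the first candidate of each character; `groupBySegments` is a hypothetical helper, not the module's own implementation.

```javascript
// Merge per-character pinyin into one entry per segmented word.
// `perChar` has one inner array per character; `segments` are the words
// of the same text, in order.
function groupBySegments(perChar, segments) {
  const grouped = [];
  let offset = 0;
  for (const word of segments) {
    const length = Array.from(word).length; // count code points, not UTF-16 units
    const syllables = perChar
      .slice(offset, offset + length)
      .map((candidates) => candidates[0]); // first candidate only
    grouped.push([syllables.join("")]);
    offset += length;
  }
  return grouped;
}

const perCharPinyin = [["wǒ"], ["xǐ"], ["huān"], ["nǐ"]];
const wordSegments = ["我", "喜欢", "你"];
console.log(groupBySegments(perCharPinyin, wordSegments));
// [ [ 'wǒ' ], [ 'xǐhuān' ], [ 'nǐ' ] ]
```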
Specifies the pinyin style. Use the static properties named STYLE_*.
The default is STYLE_TONE. See Static Property for more.
The pinyin mode. The default is pinyin.MODE_NORMAL. If you know the input is a surname, pinyin.MODE_SURNAME may give better results.
Normal style, without tone marks.
Example: pin yin
Tone style. This is the default.
Example: pīn yīn
Tone style with the tone number [0-4] appended to each syllable.
Example: pin1 yin1
Tone style with the tone number [0-4] placed directly after the character that carries the tone mark.
Example: pi1n yi1n
Initial consonant (of a Chinese syllable).
Example: the pinyin of 中国 is zh g
Note: a Han character (汉字) without an initial consonant is converted to an empty string.
First letter style.
Example: p y
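To show how the styles relate to each other, here is a minimal sketch that derives each one from a tone-marked syllable using Unicode normalization. It is not the module's implementation: `parseSyllable`, `toStyle`, and the `INITIALS` list (which leaves out "y" and "w") are assumptions made for illustration, and unmarked syllables get no tone number here, which may differ from the library.

```javascript
// Combining marks for the four tones, mapped to tone numbers.
const TONE_OF_MARK = { "\u0304": 1, "\u0301": 2, "\u030C": 3, "\u0300": 4 };

// Assumed initials list for this sketch; digraphs come first so "zh" wins over "z".
const INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
  "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"];

// Split a syllable into toneless letters, its tone number (0 if unmarked),
// and the index just after the vowel that carried the tone mark.
function parseSyllable(syllable) {
  let letters = "";
  let tone = 0;
  let markEnd = -1;
  for (const ch of syllable.normalize("NFD")) {
    if (ch in TONE_OF_MARK) {
      tone = TONE_OF_MARK[ch];
      markEnd = letters.length;
    } else {
      letters += ch;
    }
  }
  return { letters, tone, markEnd };
}

function toStyle(syllable, style) {
  const { letters, tone, markEnd } = parseSyllable(syllable);
  const plain = letters.normalize("NFC");
  switch (style) {
    case "normal":
      return plain;
    case "tone2": // tone number at the end of the syllable
      return tone ? plain + tone : plain;
    case "to3ne": // tone number right after the marked vowel
      return tone
        ? (letters.slice(0, markEnd) + tone + letters.slice(markEnd)).normalize("NFC")
        : plain;
    case "initials":
      return INITIALS.find((initial) => plain.startsWith(initial)) || "";
    case "first_letter":
      return plain.charAt(0);
    default: // "tone": keep the tone marks as they are
      return syllable;
  }
}

for (const style of ["normal", "tone", "tone2", "to3ne", "initials", "first_letter"]) {
  console.log(style, ["pīn", "yīn"].map((s) => toStyle(s, style)).join(" "));
}
```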
Normal mode. This is the default mode.
Surname mode. If the Chinese word is a surname, the surname reading is prioritized.
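The idea behind surname mode can be sketched as a lookup that consults a surname table before the general one. The tables below are tiny illustrative stand-ins (the real dictionaries are far larger), `surnamePinyin` is a hypothetical helper, and only the leading character is treated as a surname in this sketch.

```javascript
// Tiny illustrative tables; not the module's data.
const DEFAULT_READINGS = { "华": "huá", "夫": "fū", "人": "rén" };
const SURNAME_READINGS = { "华": "huà" };

function surnamePinyin(name) {
  return Array.from(name).map((ch, index) => {
    // Prefer the surname reading for the leading character, if there is one.
    if (index === 0 && ch in SURNAME_READINGS) {
      return [SURNAME_READINGS[ch]];
    }
    // Otherwise fall back to the general reading, or keep the character as is.
    return [DEFAULT_READINGS[ch] ?? ch];
  });
}

console.log(surnamePinyin("华夫人")); // [ [ 'huà' ], [ 'fū' ], [ 'rén' ] ]
```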
npm test
This module provides a default compare implementation:
const pinyin = require('pinyin');
const data = '我要排序'.split('');
const sortedData = data.sort(pinyin.compare);
If you need a different implementation, you can do something like this:
const pinyin = require('pinyin');

const data = '我要排序'.split('');

// Consider persisting the pinyin results instead of recomputing them.
const pinyinData = data.map(han => ({
  han: han,
  pinyin: pinyin(han)[0][0], // Choose your options and styles.
}));
const sortedData = pinyinData.sort((a, b) => {
  return a.pinyin.localeCompare(b.pinyin);
}).map(d => d.han);
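A variation on the custom approach is to cache one sort key per string so the pinyin is computed only once during a sort. The sketch below is self-contained and uses a stand-in lookup table instead of the library; `makePinyinComparator` and `SAMPLE_READINGS` are hypothetical names. With the library, the `toPinyin` argument could be something like `(han) => pinyin(han)[0][0]`.

```javascript
// Build a comparator that sorts by a cached, tone-stripped pinyin key.
// `toPinyin` is any function mapping a string to a pinyin string.
function makePinyinComparator(toPinyin) {
  const cache = new Map();
  const keyOf = (text) => {
    if (!cache.has(text)) {
      const key = toPinyin(text)
        .normalize("NFD")
        .replace(/[\u0300-\u036f]/g, ""); // drop tone marks and other combining marks
      cache.set(text, key);
    }
    return cache.get(text);
  };
  return (a, b) => {
    const keyA = keyOf(a);
    const keyB = keyOf(b);
    return keyA < keyB ? -1 : keyA > keyB ? 1 : 0;
  };
}

// Stand-in lookup so the sketch runs without the library.
const SAMPLE_READINGS = { "我": "wǒ", "要": "yào", "排": "pái", "序": "xù" };
const byPinyin = makePinyinComparator((han) => SAMPLE_READINGS[han] ?? han);

console.log(Array.from("我要排序").sort(byPinyin).join("")); // 排我序要
```

Comparing tone-stripped keys with plain string comparison keeps the order independent of the runtime locale; syllables that differ only by tone compare as equal here.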
If this module is helpful to you, please star this repository.
You can also donate to me via Alipay or WeChat.