0.3.0 • Published 9 years ago

parse-japanese-basic v0.3.0


parse-japanese-basic

A Japanese language parser that produces NLCST nodes. It is a basic building block for retext-japanese.

  • For the semantics of the nodes, see NLCST.

Installation

npm:

npm install parse-japanese-basic

Usage

var inspect = require('unist-util-inspect')
var ParseJapaneseBasic = require('parse-japanese-basic')

var japanese = new ParseJapaneseBasic()

// A title line, a blank line, and one numbered paragraph.
var text = 'タイトル\n' +
           '\n' +
           '1 これは前段です。これは中段(2文の場合は後段。)です。これは後段です。\n'

// Parse the text into an NLCST tree, then print it.
var cst = japanese.parse(text)

console.log(inspect(cst))

/**
* RootNode[3] (1:1-3:39, 0-44)
* ├─ ParagraphNode[2] (1:1-1:6, 0-5)
* │  ├─ TextNode: "タイトル" (1:1-1:5, 0-4)
* │  └─ WhiteSpaceNode: "\n" (1:5-1:6, 4-5)
* ├─ ParagraphNode[1] (2:1-2:2, 5-6)
* │  └─ WhiteSpaceNode: "\n" (2:1-2:2, 5-6)
* └─ ParagraphNode[2] (3:1-3:39, 6-44)
*    ├─ TextNode: "1 これは前段です。これは中段(2文の場合は後段。)です。これは後段です。" (3:1-3:38, 6-43)
*    └─ WhiteSpaceNode: "\n" (3:38-3:39, 43-44)
*/
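
The resulting tree can be walked like any other unist tree. The following is a minimal sketch, assuming nodes carry the `type`, `children`, and `value` fields suggested by the output above. `collectText` is a hypothetical helper, not part of this module, and the tree literal is hand-written (positions omitted, last paragraph shortened) so the snippet runs without installing anything.

```javascript
// Hand-written stand-in for the tree printed above (not produced by the parser).
var cst = {
  type: 'RootNode',
  children: [
    { type: 'ParagraphNode',
      children: [
        { type: 'TextNode', value: 'タイトル' },
        { type: 'WhiteSpaceNode', value: '\n' }
      ] },
    { type: 'ParagraphNode',
      children: [
        { type: 'WhiteSpaceNode', value: '\n' }
      ] },
    { type: 'ParagraphNode',
      children: [
        { type: 'TextNode', value: '1 これは前段です。' },
        { type: 'WhiteSpaceNode', value: '\n' }
      ] }
  ]
}

// Hypothetical helper: depth-first walk that collects every TextNode value.
function collectText (node, out) {
  out = out || []
  if (node.type === 'TextNode') {
    out.push(node.value)
  }
  var children = node.children || []
  for (var i = 0; i < children.length; i++) {
    collectText(children[i], out)
  }
  return out
}

console.log(collectText(cst))
```

In a real program, `cst` would come from `japanese.parse(text)` instead of the literal.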

API

ParseJapaneseBasic(options?)

Exposes the functionality needed to tokenize Japanese natural-language text into a syntax tree.

Parameters:

  • options (Object, optional)

    • position (boolean, default: true) - Whether to add positional information to nodes.
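
The inspect output in the Usage section, for example `(1:1-1:5, 0-4)`, suggests that positional information follows the unist convention: a `position` object whose `start` and `end` points hold `line`, `column`, and `offset`. The sketch below assumes that shape, and also assumes that nodes simply lack the `position` field when the option is disabled. `sourceOf` is a hypothetical helper and the node is hand-written for illustration.

```javascript
var text = 'タイトル\n\n本文です。\n'

// Hand-written node mirroring: TextNode: "タイトル" (1:1-1:5, 0-4)
var node = {
  type: 'TextNode',
  value: 'タイトル',
  position: {
    start: { line: 1, column: 1, offset: 0 },
    end: { line: 1, column: 5, offset: 4 }
  }
}

// Hypothetical helper: recover a node's source text from its offsets.
// Returns null when the node has no positional information
// (assumed to be the case when parsing with { position: false }).
function sourceOf (text, node) {
  if (!node.position) {
    return null
  }
  return text.slice(node.position.start.offset, node.position.end.offset)
}

console.log(sourceOf(text, node))
```

Offsets are convenient for mapping nodes back to the original string, while line and column pairs are better suited to messages shown to a person.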

ParseJapaneseBasic#parse(value)

Tokenizes Japanese natural-language text into an NLCST syntax tree.

Parameters:

  • value (VFile or string) - Text document to parse.
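
As the Usage output shows, a blank line in the input comes back as a ParagraphNode that holds only a WhiteSpaceNode. If such paragraphs are unwanted, they can be filtered out after parsing. This is a minimal sketch under the same node-shape assumptions (`type`, `children`, `value`); `isBlank` and `dropBlankParagraphs` are hypothetical helpers, not part of this module, and the tree is hand-written.

```javascript
// Hypothetical helper: true when a paragraph contains only whitespace nodes.
function isBlank (paragraph) {
  return paragraph.children.every(function (child) {
    return child.type === 'WhiteSpaceNode'
  })
}

// Hypothetical helper: return a new root without whitespace-only paragraphs.
// The input tree is not mutated; the root's own position is not copied.
function dropBlankParagraphs (root) {
  return {
    type: root.type,
    children: root.children.filter(function (paragraph) {
      return !isBlank(paragraph)
    })
  }
}

// Hand-written stand-in for a parse result.
var root = {
  type: 'RootNode',
  children: [
    { type: 'ParagraphNode',
      children: [
        { type: 'TextNode', value: 'タイトル' },
        { type: 'WhiteSpaceNode', value: '\n' }
      ] },
    { type: 'ParagraphNode',
      children: [{ type: 'WhiteSpaceNode', value: '\n' }] },
    { type: 'ParagraphNode',
      children: [
        { type: 'TextNode', value: '本文です。' },
        { type: 'WhiteSpaceNode', value: '\n' }
      ] }
  ]
}

console.log(dropBlankParagraphs(root).children.length)
```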

License

MIT