@rshirohara/kkast v0.1.3
kkast
Kakuyomu novel Abstract Syntax Tree
kkast is a specification for representing the Kakuyomu novel format as a syntax tree. It implements unist.
Introduction
This document defines a format for representing the Kakuyomu novel format as an abstract syntax tree. This specification is written in a Web IDL-like grammar.
Types
If you are using TypeScript, you can use the kkast types by installing them with npm:
npm install @rshirohara/kkast
Nodes
Parent
interface Parent <: UnistParent {
  children: [KkastContent]
}
Parent (UnistParent) represents an abstract interface in kkast containing other nodes (said to be children).
Its content is limited to only other kkast content.
Literal
interface Literal <: UnistLiteral {
  value: string
}
Literal (UnistLiteral) represents an abstract interface in kkast containing a value.
Its value field is a string.
Root
interface Root <: Parent {
  type: "root"
}
Root (Parent) represents a document.
Root can be used as the root of a tree, never as a child.
Paragraph
interface Paragraph <: Parent {
  type: "paragraph"
  children: [PhrasingContent]
}
Paragraph (Parent) represents a unit of discourse dealing with a particular point.
Paragraph can be used where content is expected. Its content model is phrasing content.
ParagraphMargin
interface ParagraphMargin <: Node {
  type: "paragraphMargin"
  size: 1 <= number
}
ParagraphMargin (Node) represents the margin between paragraphs.
ParagraphMargin can only be used between two paragraphs. It has no content model.
A size field must be present.
A value of 1 is said to be the minimum value.
For example, the following text:
これは一段落目。
これは二段落目。
Yields:
{
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [{ type: "text", value: "これは一段落目。" }],
    },
    { type: "paragraphMargin", size: 2 },
    {
      type: "paragraph",
      children: [{ type: "text", value: "これは二段落目。" }],
    },
  ],
}
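The two constraints stated above (a size of at least 1, and placement only between two paragraphs) can be checked mechanically. The following is a minimal sketch, not part of the package: the node shapes are local transcriptions of the interfaces in this document, and `checkParagraphMargins` is a hypothetical helper name.

```typescript
// Local node shapes for this sketch. The "Node" suffix is only there to avoid
// clashing with global DOM types such as Text; it is not part of kkast.
type TextNode = { type: "text"; value: string };
type ParagraphNode = { type: "paragraph"; children: TextNode[] };
type ParagraphMarginNode = { type: "paragraphMargin"; size: number };
type FlowNode = ParagraphNode | ParagraphMarginNode;
type RootNode = { type: "root"; children: FlowNode[] };

// Hypothetical helper: returns a list of problems, empty when every
// paragraphMargin in the tree satisfies both constraints.
function checkParagraphMargins(root: RootNode): string[] {
  const problems: string[] = [];
  root.children.forEach((node, index) => {
    if (node.type !== "paragraphMargin") return;
    // Written as a negation so that NaN is rejected as well.
    if (!(node.size >= 1)) {
      problems.push(`child ${index}: size must be at least 1`);
    }
    const previous = root.children[index - 1];
    const next = root.children[index + 1];
    if (previous?.type !== "paragraph" || next?.type !== "paragraph") {
      problems.push(`child ${index}: paragraphMargin must sit between two paragraphs`);
    }
  });
  return problems;
}
```

Running it on the tree above returns an empty list; a margin at the start or end of `children`, or one with `size: 0`, is reported.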
Text
interface Text <: Literal {
  type: "text"
}
Text (Literal) represents everything that is just text.
Text can be used where phrasing content is expected.
Its content is represented by its value field.
For example, the following text:
たとえば私はこの文章を書く。
Yields:
{ type: "text", value: "たとえば私はこの文章を書く。" }
Ruby
interface Ruby <: Literal {
  type: "ruby"
  ruby: string
}
Ruby (Literal) represents small annotations that are rendered above, below, or next to text.
Ruby can be used where phrasing content is expected.
Its content is represented by its value and ruby fields.
If the node start symbol ("《") is immediately preceded by a vertical bar (half-width "|" or full-width "|"), the node start symbol is treated as plain text rather than as the start of a ruby node.
For example, the following text:
私《わたし》
|etc《えとせとら》
|書き方《でっちあげかた》
|《山括弧
|《これも山括弧
Yields:
{
  type: "paragraph",
  children: [
    { type: "ruby", value: "私", ruby: "わたし" },
    { type: "break" },
    { type: "ruby", value: "etc", ruby: "えとせとら" },
    { type: "break" },
    { type: "ruby", value: "書き方", ruby: "でっちあげかた" },
    { type: "break" },
    { type: "text", value: "《山括弧" },
    { type: "break" },
    { type: "text", value: "《これも山括弧" }
  ]
}
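The bar-prefixed forms in this example can be illustrated with a small single-line scanner. This is a simplified sketch, not the package's parser. It assumes the two bar characters are the half-width and full-width vertical bars. It handles only the explicit bar notation and the escape rule. The bare form (as in 私《わたし》) is left as plain text, because the rule for choosing its base text is not stated here. `parseExplicitRuby` is a hypothetical name.

```typescript
// Local node shapes for this sketch (the "Node" suffix avoids DOM name clashes).
type TextNode = { type: "text"; value: string };
type RubyNode = { type: "ruby"; value: string; ruby: string };

// Assumption: both the half-width and the full-width vertical bar are accepted.
const BARS = ["|", "|"];

// Hypothetical helper: scans one line and returns text and ruby nodes.
function parseExplicitRuby(line: string): Array<TextNode | RubyNode> {
  const nodes: Array<TextNode | RubyNode> = [];
  let buffer = ""; // plain text collected so far
  const flush = (): void => {
    if (buffer !== "") nodes.push({ type: "text", value: buffer });
    buffer = "";
  };
  let i = 0;
  while (i < line.length) {
    const ch = line.charAt(i);
    if (BARS.includes(ch)) {
      if (line.charAt(i + 1) === "《") {
        // Escape rule: the bar is dropped and "《" is kept as plain text.
        buffer += "《";
        i += 2;
        continue;
      }
      const open = line.indexOf("《", i + 1);
      const close = open === -1 ? -1 : line.indexOf("》", open + 1);
      if (close !== -1) {
        // Explicit form: bar, base text, 《, ruby text, 》.
        flush();
        nodes.push({
          type: "ruby",
          value: line.slice(i + 1, open),
          ruby: line.slice(open + 1, close),
        });
        i = close + 1;
        continue;
      }
    }
    // Anything else, including an unmatched bar, is plain text.
    buffer += ch;
    i += 1;
  }
  flush();
  return nodes;
}
```

Under these assumptions, a bar followed by base text and a bracketed reading yields one ruby node, while a bar directly followed by "《" yields a text node starting with "《".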
Emphasis
interface Emphasis <: Literal {
  type: "emphasis"
}
Emphasis (Literal) represents highlighted text.
Emphasis can be used where phrasing content is expected.
Its content is represented by its value field.
For example, the following text:
《《強調された》》テキスト
Yields:
{
  type: "paragraph",
  children: [
    { type: "emphasis", value: "強調された" },
    { type: "text", value: "テキスト" }
  ]
}
Break
interface Break <: Node {
  type: "break"
}
Break (Node) represents a line break.
Break can be used where phrasing content is expected. It has no content model.
For example, the following text:
これは一行目。
これが二行目。
Yields:
{
  type: "paragraph",
  children: [
    { type: "text", value: "これは一行目。" },
    { type: "break" },
    { type: "text", value: "これが二行目。" }
  ]
}
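Going the other way, the phrasing nodes described so far can be turned back into source text. This is a minimal sketch under stated assumptions, not an API of the package. Ruby is always written in the explicit bar form. A "《" inside a text node is escaped with a half-width bar so that it is not read as a node start symbol. `phrasingToSource` and `paragraphToSource` are hypothetical names.

```typescript
// Local node shapes for this sketch (the "Node" suffix avoids DOM name clashes).
type TextNode = { type: "text"; value: string };
type RubyNode = { type: "ruby"; value: string; ruby: string };
type EmphasisNode = { type: "emphasis"; value: string };
type BreakNode = { type: "break" };
type PhrasingNode = TextNode | RubyNode | EmphasisNode | BreakNode;

// Hypothetical helper: serializes a single phrasing node.
function phrasingToSource(node: PhrasingNode): string {
  switch (node.type) {
    case "text":
      // Escape every "《" so it stays plain text when parsed again.
      return node.value.replace(/《/g, "|《");
    case "ruby":
      // Always use the explicit bar form, which marks the base text unambiguously.
      return `|${node.value}《${node.ruby}》`;
    case "emphasis":
      return `《《${node.value}》》`;
    case "break":
      return "\n";
  }
}

// Hypothetical helper: serializes the children of one paragraph.
function paragraphToSource(children: PhrasingNode[]): string {
  return children.map(phrasingToSource).join("");
}
```

Applied to the children of the Break example above, this produces the two original lines separated by a newline.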
Content model
type KkastContent = FlowContent | PhrasingContent;
Each node in kkast falls into one or more categories of Content that group nodes with similar characteristics together.
FlowContent
type FlowContent = Paragraph | ParagraphMargin;
Flow content represents the sections of a document.
PhrasingContent
type PhrasingContent = Text | Ruby | Emphasis | Break;
Phrasing content represents the text in a document, and its markup.
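For readers working in TypeScript, the content model can be written as discriminated unions with type guards. The definitions below are a local sketch transcribed from this document and are not necessarily the types the package exports. The "Node" suffixes and the guard names `isPhrasingContent` and `isFlowContent` are assumptions made for illustration.

```typescript
// Phrasing content: text and its markup.
interface TextNode { type: "text"; value: string }
interface RubyNode { type: "ruby"; value: string; ruby: string }
interface EmphasisNode { type: "emphasis"; value: string }
interface BreakNode { type: "break" }
type PhrasingContent = TextNode | RubyNode | EmphasisNode | BreakNode;

// Flow content: the sections of a document.
interface ParagraphNode { type: "paragraph"; children: PhrasingContent[] }
interface ParagraphMarginNode { type: "paragraphMargin"; size: number }
type FlowContent = ParagraphNode | ParagraphMarginNode;

type KkastContent = FlowContent | PhrasingContent;
interface RootNode { type: "root"; children: KkastContent[] }

const PHRASING_TYPES = new Set<string>(["text", "ruby", "emphasis", "break"]);
const FLOW_TYPES = new Set<string>(["paragraph", "paragraphMargin"]);

// Narrows a node to phrasing content based on its type field.
function isPhrasingContent(node: KkastContent): node is PhrasingContent {
  return PHRASING_TYPES.has(node.type);
}

// Narrows a node to flow content based on its type field.
function isFlowContent(node: KkastContent): node is FlowContent {
  return FLOW_TYPES.has(node.type);
}
```

Because every node carries a literal `type` field, a `switch` on `node.type` also narrows the union without the guards. The guards are convenient when filtering `children` arrays.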