0.4.0 • Published 3 years ago
@vipran/aksharas v0.4.0
Aksharas
Aksharas is a utility for analysing akṣaras and varṇas in Devanagari text.
Installation
npm i @vipran/aksharas
Usage
import Aksharas from "@vipran/aksharas";
// OR for CommonJS:
// const Aksharas = require("@vipran/aksharas").default;
const input = "सर्वे भवन्तु सुखिनः।"
const results = Aksharas.analyse(input);
const aksharas = results.aksharas.map(akshara => akshara.value);
console.log(aksharas); // "स", "र्वे", "भ", "व", "न्तु", "सु", "खि", "नः"
API
Aksharas.analyse()
Accepts a string input and returns a Results object.
const input: string = 'नमः';
const results: Results = Aksharas.analyse(input);
Aksharas.TokenType
It is an enum with the following values:
- TokenType.Akshara
- TokenType.Symbol
- TokenType.Whitespace
- TokenType.Invalid
- TokenType.Unrecognised
These can be used to filter the tokens in the Results object. Example:
import Aksharas from "@vipran/aksharas";
// OR import Aksharas, { TokenType } ...
const input = "हे! हरेऽत्र नागच्छ।";
const results = Aksharas.analyse(input);
const symbols = results.all
.filter((token) => token.type === Aksharas.TokenType.Symbol)
.map((token) => token.value);
console.log(symbols); // "ऽ", "।"
Aksharas.VarnaType
It is an enum with the following values:
- VarnaType.Svara
- VarnaType.Vyanjana
These can be used to filter the varnas in Results.varnas. Example:
import Aksharas from "@vipran/aksharas";
// OR import Aksharas, { VarnaType } ...
const input = "गुरुः";
const results = Aksharas.analyse(input);
const svaras = results.varnas
.filter((varna) => varna.type === Aksharas.VarnaType.Svara)
.map((varna) => varna.value);
console.log(svaras); // "उ", "उः"
Results
The Results object contains the following properties:
- all
  - type: Token[]
  - An array of Token objects containing all the tokens analysed from the input string. It includes Devanagari akṣaras, Devanagari symbols (१, २, ।, ॥, etc.) and non-Devanagari characters (i.e. characters in other scripts, special characters, whitespace characters, etc.).
- aksharas
  - type: Token[]
  - Devanagari syllables such as रा, सी, etc. Halanta consonants such as क्, च्, य्, etc. are also treated as aksharas when they occur at the end of a word.
- varnas
  - type: Varna[]
  - Devanagari consonants and vowels in the input. (Only in v0.4.0 or above.)
- symbols
  - type: Token[]
  - Devanagari symbols such as १, २, ।, ॥, etc.
- whitespaces
  - type: Token[]
  - All whitespace characters: \s, \t, \n, etc.
- invalid
  - type: Token[]
  - All Devanagari characters whose occurrence in the input string does not conform to the definition of an akṣara. For example, a virāma or a vowel mark that is not preceded by a consonant is invalid ("अ्", "गोु", etc.).
- unrecognised
  - type: Token[]
  - Non-Devanagari characters (i.e. characters in other scripts and special characters such as @, #, etc.).
- chars
  - type: string[]
  - All Unicode characters in the input string. Same as String.prototype.split().
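Because invalid and unrecognised are kept as separate arrays, one possible use of the Results object is a quick sanity check on input text. The sketch below is hypothetical: TokenLike, ResultsLike and reportProblems are illustrative names that are not part of the package, and the sample object is hand-written rather than produced by Aksharas.analyse(). It relies only on the value and from fields documented for tokens.

```typescript
// Minimal structural stand-ins for the documented shapes (assumed names,
// not exported by @vipran/aksharas); only the fields used below are declared.
interface TokenLike {
  value: string;
  from: number;
}

interface ResultsLike {
  invalid: TokenLike[];
  unrecognised: TokenLike[];
}

// Build one human-readable message per problematic token.
function reportProblems(results: ResultsLike): string[] {
  const messages: string[] = [];
  for (const token of results.invalid) {
    messages.push(`invalid "${token.value}" at index ${token.from}`);
  }
  for (const token of results.unrecognised) {
    messages.push(`unrecognised "${token.value}" at index ${token.from}`);
  }
  return messages;
}

// Hand-written sample; with the real package you would pass the object
// returned by Aksharas.analyse(input) instead.
const sampleResults: ResultsLike = {
  invalid: [],
  unrecognised: [{ value: "@", from: 3 }],
};

console.log(reportProblems(sampleResults));
```

An empty array from reportProblems would then indicate that every character was classified as an akshara, symbol or whitespace.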
Token
Many of the properties in the Results object are arrays of Token objects. A Token object has the following properties:
- type
  - type: TokenType
  - Type of the token. One of the values of Aksharas.TokenType.
- value
  - type: string
  - Contains an analysed part of the input string.
- from
  - type: number
  - From index: the start position of the token in the input string.
- to
  - type: number
  - To index: the end position of the token in the input string.
- attributes
  - type: Record<string, any>
  - An optional key-value object which may contain other attributes of the token. It is currently used only in Akshara tokens, to store the varnas in that akshara.
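The fields above are enough to print a positional listing of tokens. The following is a minimal sketch rather than library code: SketchToken and describeToken are illustrative names, type is modelled as a plain string because the underlying Aksharas.TokenType values are not shown here, and the sample token is hand-written. This README does not say whether to is inclusive or exclusive, so the sketch only reports the two indices and does not slice the input. The key used inside attributes is also an assumption, so only key names are listed.

```typescript
// Stand-in for the documented Token shape (assumed name, not exported by
// the package). `type` is a plain string as a placeholder for TokenType.
interface SketchToken {
  type: string;
  value: string;
  from: number;
  to: number;
  attributes?: Record<string, unknown>;
}

// Render a token as `<type> "<value>" [from, to]`, appending the names of
// any attribute keys that are present.
function describeToken(token: SketchToken): string {
  const keys = Object.keys(token.attributes || {});
  const base = `${token.type} "${token.value}" [${token.from}, ${token.to}]`;
  return keys.length > 0 ? `${base} attributes: ${keys.join(", ")}` : base;
}

// Hand-written example token; the indices and the "varnas" key are
// assumed for illustration only.
const sampleToken: SketchToken = {
  type: "akshara",
  value: "न",
  from: 0,
  to: 0,
  attributes: { varnas: [] },
};

console.log(describeToken(sampleToken));
```

With the real package, the same function could be mapped over results.all after converting each token's type to a string.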
Varna
Results.varnas is an array of Varna objects. A Varna object has the following properties:
- type
  - type: VarnaType
  - Type of the varna. One of the values of Aksharas.VarnaType.
- value
  - type: string
  - Contains an analysed part of the input string.
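With only type and value on each Varna, a typical task is splitting a varna list into vowels and consonants in a single pass. The sketch below is an illustration under stated assumptions: SketchVarna and splitVarnas are hypothetical names, the "svara" and "vyanjana" strings stand in for Aksharas.VarnaType.Svara and Aksharas.VarnaType.Vyanjana (whose underlying values are not given here), and the sample array is hand-written. In particular, the consonant values are a guess at how the library might represent them.

```typescript
// Stand-in for the documented Varna shape; the string literals are
// placeholders for the real VarnaType enum values.
interface SketchVarna {
  type: "svara" | "vyanjana";
  value: string;
}

// Split a varna list into vowel values and consonant values, preserving order.
function splitVarnas(varnas: SketchVarna[]): { svaras: string[]; vyanjanas: string[] } {
  const svaras: string[] = [];
  const vyanjanas: string[] = [];
  for (const varna of varnas) {
    if (varna.type === "svara") {
      svaras.push(varna.value);
    } else {
      vyanjanas.push(varna.value);
    }
  }
  return { svaras, vyanjanas };
}

// Hand-written sample for "गुरुः"; the vyanjana values are assumed,
// not actual library output.
const sampleVarnas: SketchVarna[] = [
  { type: "vyanjana", value: "ग्" },
  { type: "svara", value: "उ" },
  { type: "vyanjana", value: "र्" },
  { type: "svara", value: "उः" },
];

console.log(splitVarnas(sampleVarnas));
```

With the real package, the comparison would be against Aksharas.VarnaType.Svara and the input would be results.varnas.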
License
MIT © Prasanna Venkatesh T S