0.6.1 • Published 11 months ago

lambda-parser v0.6.1

License
MIT

Intro

Lambda parser is a top-down recursive parser with both lookback and lookahead capabilities. It introduces a new analytic formal grammar named XPG (extensible parsing grammar), which is inspired by PEG. The parser for XPG is itself generated from an XPG grammar.

Technical facts

  • Uses the builtin JavaScript RegExp for tokenization.
  • Is pure JavaScript (no libraries or frameworks) and can be used in all JavaScript runtime environments.
  • Written in TypeScript with the strictest ESNext config.
  • Uses no recursive functions (recursion is implemented with a stack).
  • Returns a JSON object instead of a parse-tree.
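
The "no recursive functions" point can be illustrated with a small sketch. This is not lambda-parser's implementation; the `RuleNode` type and the `ruleNames` function are hypothetical names chosen for the example. It walks a nested structure depth-first with an explicit stack where a recursive function would normally call itself:

```typescript
// Hypothetical nested structure, standing in for nested grammar rules.
type RuleNode = { name: string; children: RuleNode[] }

// Depth-first, pre-order walk that uses an explicit stack instead of
// recursive function calls.
function ruleNames(root: RuleNode): string[] {
  const names: string[] = []
  const stack: RuleNode[] = [root]
  let node: RuleNode | undefined
  while ((node = stack.pop()) !== undefined) {
    names.push(node.name)
    // Push children in reverse so that the leftmost child is visited first.
    for (const child of [...node.children].reverse()) {
      stack.push(child)
    }
  }
  return names
}

const tree: RuleNode = {
  name: 'list',
  children: [
    { name: 'key', children: [{ name: 'KEY', children: [] }] },
    { name: 'value', children: [{ name: 'VALUE', children: [] }] },
    { name: 'END', children: [] }
  ]
}

console.log(ruleNames(tree))
// [ 'list', 'key', 'KEY', 'value', 'VALUE', 'END' ]
```

With this approach the nesting depth is limited by available memory, not by the JavaScript call stack.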

Installation

npm i lambda-parser

Usage

Generating parser from XPG

For Node.js and browser:

npx lambda-parser gen src/grammar.xpg src

For Deno:

npx lambda-parser gen src/grammar.xpg src -r deno

This command generates a TypeScript ES module named parser.ts.

Sample

ENV file parser

.env

HOST=127.0.0.1
PORT=8080

src/grammar.xpg

list : (key(KEY) '=' value(VALUE) END)+
KEY = r'[a-zA-Z0-9_]+'
VALUE = r'.+'
END = r'\n|$'
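
To make the grammar's meaning concrete, here is a hand-written sketch that accepts the same input using the same three patterns. It only illustrates what the `list` rule describes and is not the code the generator emits. The `parseEnv` and `matchAt` names, the error messages, and the use of sticky regular expressions are assumptions made for the example:

```typescript
// The same token patterns as in the grammar, compiled with the sticky flag
// so that each match must start exactly at `lastIndex`.
const KEY = /[a-zA-Z0-9_]+/y
const VALUE = /.+/y
const END = /\n|$/y

type Entry = { key: string; value: string }

// Returns the text matched by `pattern` at `pos`, or undefined.
function matchAt(pattern: RegExp, input: string, pos: number): string | undefined {
  pattern.lastIndex = pos
  const match = pattern.exec(input)
  return match ? match[0] : undefined
}

// list : (key(KEY) '=' value(VALUE) END)+
function parseEnv(input: string): Entry[] {
  const entries: Entry[] = []
  let pos = 0
  while (pos < input.length) {
    const key = matchAt(KEY, input, pos)
    if (key === undefined) throw new SyntaxError(`KEY expected at ${pos}`)
    pos += key.length

    if (input[pos] !== '=') throw new SyntaxError(`'=' expected at ${pos}`)
    pos += 1

    const value = matchAt(VALUE, input, pos)
    if (value === undefined) throw new SyntaxError(`VALUE expected at ${pos}`)
    pos += value.length

    // END matches a newline or, at the end of the input, the empty string.
    const end = matchAt(END, input, pos)
    if (end === undefined) throw new SyntaxError(`END expected at ${pos}`)
    pos += end.length

    entries.push({ key, value })
  }
  // The `+` quantifier requires at least one entry.
  if (entries.length === 0) throw new SyntaxError('at least one entry expected')
  return entries
}

console.log(parseEnv('HOST=127.0.0.1\nPORT=8080'))
// [ { key: 'HOST', value: '127.0.0.1' }, { key: 'PORT', value: '8080' } ]
```

The named groups `key(...)` and `value(...)` in the grammar correspond to the property names of each entry in this sketch.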

src/build

npx lambda-parser gen src/grammar.xpg src

test/index.ts

import { deepEqual } from 'node:assert'
import fs from 'node:fs'
import { join } from 'node:path'

import parser from '../src/parser'

const input = fs.readFileSync(join(process.cwd(), '.env'), 'utf8')
const output = parser.parse(input)

deepEqual(output, [
  { key: 'HOST', value: '127.0.0.1' },
  { key: 'PORT', value: '8080' }
])

See the sample parsers for more examples.

XPG Syntax

SCRIPT

Path to the TS/JS script that will be used for extending the generated parser.

SCRIPT = 'path_to_the_script'

Comment

# A comment starts with a hash sign.

Variable

name = rule

RegExp rule

name : r'regex_pattern'
# or
name : 'exact_string'

Sequence rule

name : subrule_1 subrule_2

Ordered choice rule

name : subrule_1 / subrule_2

Non-capture rule

name : ~subrule

Dynamic rule

name : SCRIPT.subrule

Group

name : group_name(subrule)

Note: group_name can be omitted, in which case the parsed data is assigned to the closest parent rule.

Predicate

name : &and_predicate !not_predicate

Quantifiers

name : (subrule)? # Optional
name : (subrule)* # Zero or more
name : (subrule)+ # One or more
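
Because XPG is described as inspired by PEG, the operators above presumably follow the usual PEG semantics: ordered choice commits to the first alternative that matches, predicates test the input without consuming it, and quantifiers are greedy. The combinators below are a minimal sketch of those semantics under that assumption. All names (`Rule`, `lit`, `seq`, `choice`, `and`, `not`, `optional`, `many`) are hypothetical and unrelated to the internals of the generated parser:

```typescript
// A rule takes the input and a position, and returns the new position on
// success or -1 on failure.
type Rule = (input: string, pos: number) => number

// Exact string: 'abc'
const lit = (s: string): Rule => (input, pos) =>
  input.startsWith(s, pos) ? pos + s.length : -1

// Sequence: a b
const seq = (...rules: Rule[]): Rule => (input, pos) => {
  let current = pos
  for (const rule of rules) {
    current = rule(input, current)
    if (current < 0) return -1
  }
  return current
}

// Ordered choice: a / b (the first alternative that matches wins)
const choice = (...rules: Rule[]): Rule => (input, pos) => {
  for (const rule of rules) {
    const next = rule(input, pos)
    if (next >= 0) return next
  }
  return -1
}

// And-predicate: &a (succeeds if a matches, consumes nothing)
const and = (rule: Rule): Rule => (input, pos) =>
  rule(input, pos) >= 0 ? pos : -1

// Not-predicate: !a (succeeds if a does not match, consumes nothing)
const not = (rule: Rule): Rule => (input, pos) =>
  rule(input, pos) >= 0 ? -1 : pos

// Optional: (a)?
const optional = (rule: Rule): Rule => (input, pos) => {
  const next = rule(input, pos)
  return next >= 0 ? next : pos
}

// Zero or more: many(a, 0); one or more: many(a, 1)
const many = (rule: Rule, min: number): Rule => (input, pos) => {
  let current = pos
  let count = 0
  for (;;) {
    const next = rule(input, current)
    if (next < 0) break
    count++
    if (next === current) break // an empty match would loop forever
    current = next
  }
  return count >= min ? current : -1
}

// 'a' wins, so the longer alternative 'ab' is never tried.
console.log(choice(lit('a'), lit('ab'))('ab', 0)) // 1
// The not-predicate rejects input starting with 'if' without consuming it.
console.log(seq(not(lit('if')), many(lit('x'), 1))('xxx', 0)) // 3
```

One practical consequence of ordered choice is that longer alternatives should be listed before their prefixes, for example `'ab' / 'a'`.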

Extending the parser

Global parser settings

When extending a generated parser, the parser script module must export a variable with the name settings:

export const settings: ParserSettings = {}
  • ParserSettings

    type ParserSettings = Partial<{
      skip: RegExp
      listener: ParserListener
    }>
    • skip

      Specifies the global skip pattern.

    • listener

      Lambda parser raises an event when a rule succeeds or fails. This event can be used to transform the parsed data or to recover from the failure:

      export function listener(this: IParser, event: ParserEvent) {
        if (event.type === ParserEventType.RuleSucceeded) {
          const rule = event.data
          // Convert the matched text of a NUMBER rule to an integer.
          if (rule.name === 'NUMBER') return parseInt(rule.data, 10)
        } else if (event.type === ParserEventType.RuleFailed) {
          // `rule` must be declared again here: the declaration above is scoped to the first branch.
          const rule = event.data
          if (rule.name === 'NUMBER') {
            // Checking if the number is a float number.
            // This is a sample scenario. In a real project, the NUMBER rule should cover both int and float numbers.
            const lexer = this.lexer
            const floatNumber = lexer.lookahead(/[0-9]+\.[0-9]+/)
            if (floatNumber) {
              lexer.position += floatNumber.length
              return parseFloat(floatNumber)
            }
          }
        }
      }
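
Putting the two settings together, a parser script module might look like the sketch below. The types are copied from the API section so that the snippet is self-contained; in a real project they would be imported from the package. The whitespace skip pattern and the `NUMBER` rule name are assumptions for illustration, and the event type is compared against the numeric value 0 that the API section gives for a succeeded rule:

```typescript
// Types as listed in the API section, repeated here to keep the sketch self-contained.
interface ILexer {
  isEof: boolean
  jump: (length: number) => void
  scan(pattern: RegExp): string | undefined
  lookahead(pattern: RegExp): string | undefined
  lookback(pattern: RegExp): string | undefined
}
interface IParser {
  lexer: ILexer
}
type RuleSucceededEvent = { type: 0; data: { name?: string; data: any } }
type RuleFailedEvent = { type: 1; data: { name?: string } }
type ParserEvent = RuleSucceededEvent | RuleFailedEvent
type ParserListener = (this: IParser, event: ParserEvent) => unknown
type ParserSettings = Partial<{
  skip: RegExp
  listener: ParserListener
}>

// Converts the text matched by a (hypothetical) NUMBER rule to an integer.
const listener: ParserListener = function (event) {
  if (event.type === 0 && event.data.name === 'NUMBER') {
    return parseInt(event.data.data, 10)
  }
  // Other events are left unhandled, as in the listener example above.
  return undefined
}

export const settings: ParserSettings = {
  // Assumed skip pattern: spaces and tabs between tokens.
  skip: /[ \t]+/,
  listener
}
```

Defining `listener` as a `function` expression (not an arrow function) keeps `this` bound to the parser when the listener needs access to `this.lexer`.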

Dynamic rule

An XPG rule can be defined programmatically:

src/grammar.xpg

grammar : SCRIPT.text

SCRIPT = '/src/parser_script.ts'

src/parser_script.ts

import { IParser } from 'lambda-parser/src/types'

export function text(this: IParser): string | undefined {
  let text = ''
  const lexer = this.lexer
  while (!lexer.isEof) {
    // Consume exactly one character of any kind, including newlines.
    const char = lexer.scan(/[\s\S]/)
    if (char === "'") {
      // If the quotation mark has been escaped, replace the preceding \ with it.
      if (text.endsWith('\\')) text = text.slice(0, -1) + "'"
      // Otherwise the quotation mark is an enclosing mark and is dropped.
      continue
    }
    text += char
  }
  return text
}

test/parser.ts

import { equal } from 'node:assert'

import parser from '../src/parser'

const input = "'It\\'s Ok!'"
const output = parser.parse(input)

equal(output, "It's Ok!")

API

type ParserListener = (this: IParser, event: ParserEvent) => unknown
type ParserEvent = RuleSucceededEvent | RuleFailedEvent
type RuleSucceededEvent = { type: 0; data: { name?: string; data: any } }
type RuleFailedEvent = { type: 1; data: { name?: string } }

interface IParser {
  lexer: ILexer
}

interface ILexer {
  isEof: boolean
  jump: (length: number) => void
  scan(pattern: RegExp): string | undefined
  lookahead(pattern: RegExp): string | undefined
  lookback(pattern: RegExp): string | undefined
}
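
The interface does not say how the methods behave, so the following is only one plausible reading, written as a self-contained sketch: `scan` consumes a match at the current position, `lookahead` tests without consuming, `lookback` tests the text that ends at the current position, and `jump` moves the position by a relative offset. The `SketchLexer` class and these semantics are assumptions, not the library's implementation:

```typescript
interface ILexer {
  isEof: boolean
  jump: (length: number) => void
  scan(pattern: RegExp): string | undefined
  lookahead(pattern: RegExp): string | undefined
  lookback(pattern: RegExp): string | undefined
}

// Hypothetical lexer built on the builtin RegExp.
class SketchLexer implements ILexer {
  private readonly input: string
  private position = 0

  constructor(input: string) {
    this.input = input
  }

  get isEof(): boolean {
    return this.position >= this.input.length
  }

  // Assumed meaning: move the position by a relative offset.
  jump(length: number): void {
    this.position += length
  }

  // Match at the current position without consuming anything.
  lookahead(pattern: RegExp): string | undefined {
    // The sticky flag restricts the match to start exactly at `lastIndex`.
    const flags = pattern.flags.includes('y') ? pattern.flags : pattern.flags + 'y'
    const sticky = new RegExp(pattern.source, flags)
    sticky.lastIndex = this.position
    const match = sticky.exec(this.input)
    return match ? match[0] : undefined
  }

  // Match at the current position and consume the matched text.
  scan(pattern: RegExp): string | undefined {
    const text = this.lookahead(pattern)
    if (text !== undefined) this.position += text.length
    return text
  }

  // Match text that ends exactly at the current position.
  // Patterns using the multiline flag are not handled by this sketch.
  lookback(pattern: RegExp): string | undefined {
    const flags = pattern.flags.replace(/[gy]/g, '')
    const anchored = new RegExp(`(?:${pattern.source})$`, flags)
    const match = anchored.exec(this.input.slice(0, this.position))
    return match ? match[0] : undefined
  }
}

const lexer = new SketchLexer('12.5kg')
console.log(lexer.lookahead(/[0-9]+/)) // '12' (nothing consumed)
console.log(lexer.scan(/[0-9]+\.[0-9]+/)) // '12.5'
console.log(lexer.lookback(/\.[0-9]+/)) // '.5'
lexer.jump(2)
console.log(lexer.isEof) // true
```

Under these assumptions, the `text` function from the dynamic rule example could be exercised against such an object by calling it with `{ lexer }` as its `this` value.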

License

LGPL v3.0 or later
