@lemons_dev/parsinom v0.0.12
parsiNOM
parsiNOM is a modern, optimized and typesafe parser combinator library, inspired by parsimmon.
parsiNOM has not yet reached a stable release, so breaking changes can still occur in minor versions.
What is a Parser Combinator?
The idea behind parser combinators is to construct your parser out of a bunch of small parsers. This makes building parsers easier and more readable. On top of that, parser combinators make testing your parser easier, as every part of the parser, such as the parser for string literals, can be tested individually.
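To make the idea concrete, here is a minimal, self-contained sketch of a parser built from smaller parsers. It does not use parsiNOM and is not how parsiNOM is implemented; the names MiniParser, literal and pair are hypothetical and exist only for this illustration.

```typescript
// Illustrative only: NOT parsiNOM's implementation or API.
// A mini parser takes the input and a position and either succeeds with
// a value and the next position, or fails with null.
type Result<T> = { value: T; next: number } | null;
type MiniParser<T> = (input: string, pos: number) => Result<T>;

// Small parser: match a literal string at the current position.
const literal = (s: string): MiniParser<string> => (input, pos) =>
  input.slice(pos, pos + s.length) === s ? { value: s, next: pos + s.length } : null;

// Combinator: run two parsers one after the other and pair their values.
const pair = <A, B>(a: MiniParser<A>, b: MiniParser<B>): MiniParser<[A, B]> => (input, pos) => {
  const ra = a(input, pos);
  if (ra === null) return null;
  const rb = b(input, ra.next);
  if (rb === null) return null;
  return { value: [ra.value, rb.value], next: rb.next };
};

const hello = literal('hello');
const world = literal('world');
const helloWorld = pair(hello, world);

// Each small parser can be exercised on its own ...
const helloOnly = hello('hello', 0); // { value: 'hello', next: 5 }
// ... and so can the combined parser.
const both = helloWorld('helloworld', 0); // { value: ['hello', 'world'], next: 10 }
```

Because hello and world are ordinary values, each one can be tested in isolation before it is combined, which is the testing benefit described above.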
Important Terms
- combinator: a function that usually takes in one or more parsers and returns a single combined parser
- matcher: a parser that is not constructed from other parsers
- yield/yields: in this case, yield refers to the value that a parser generates from a specific input string, if it can match. In the code, a parser is generic over the value that it yields, meaning Parser<string[]> will yield an array of strings.
Basic Matchers
parsiNOM provides many matchers, which are documented using doc comments. Here we will look at the two most basic matchers.
String Matcher
P.string(str: string): Parser<string>
P.string returns a parser that matches str and yields str.
const parser = P.string('foo'); // matches the string foo
expect(parser.parse('foo')).toEqual('foo'); // succeeds, yields 'foo'
expect(parser.parse('foobar')).toEqual('foo'); // succeeds, yields 'foo'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('bar')).toThrow(); // fails
To assert that the parser is at the end of input after parsing, you can use .thenEof().
const parser = P.string('foo').thenEof(); // matches the string foo followed by the end of input
expect(parser.parse('foo')).toEqual('foo'); // succeeds, yields 'foo'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('bar')).toThrow(); // fails
expect(() => parser.parse('foobar')).toThrow(); // fails
RegExp
P.regexp(regexp: RegExp, group?: number | undefined): Parser<string>
P.regexp returns a parser that matches the regexp regexp and yields the matched string or optionally a specific capture group.
Most of the time you want to use ^ to only match at the current parser position.
const parser = P.regexp(/^[0-9]+/); // matches multiple digits
expect(parser.parse('1')).toEqual('1'); // succeeds, yields '1'
expect(parser.parse('123')).toEqual('123'); // succeeds, yields '123'
expect(parser.parse('123foo')).toEqual('123'); // succeeds, yields '123'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('foo')).toThrow(); // fails
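The effect of the ^ anchor can be seen with plain JavaScript regular expressions. The sketch below uses no parsiNOM API; it assumes, as the note about ^ suggests, that the regexp is tested against the input remaining at the current position. The variable names are hypothetical. The last lines show how capture groups are numbered, which is presumably what the optional group argument selects.

```typescript
// Plain RegExp behaviour only; no parsiNOM API is used here.
const remaining = 'foo123'; // hypothetical input left at the current position

// Without ^ the engine may skip ahead: it finds '123' at index 3.
const unanchored = /[0-9]+/.exec(remaining);
const unanchoredText = unanchored === null ? null : unanchored[0];
const unanchoredIndex = unanchored === null ? -1 : unanchored.index;

// With ^ the match has to start at index 0, so 'foo123' is rejected.
const anchored = /^[0-9]+/.exec(remaining);

// Capture groups: index 0 is the whole match, index 1 the first group.
const withGroup = /^([0-9]+)px/.exec('12px');
const wholeMatch = withGroup === null ? null : withGroup[0]; // '12px'
const firstGroup = withGroup === null ? null : withGroup[1]; // '12'
```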
Basic Combinators
The most important parser combinators are or, sequence, many and map, but parsiNOM provides many other combinators built on top of these.
Matching Multiple Options
P.or<ParserArr extends readonly Parser<unknown>[]>(...parsers: ParserArr): Parser<TupleToUnion<DeParserArray<ParserArr>>>
P.or accepts any number of parsers as arguments and yields the value of the first parser that succeeds.
Because of that, the order of the parsers is important.
const parser = P.or(P.string('a'), P.string('b')).thenEof(); // matches 'a' or 'b'
expect(parser.parse('a')).toEqual('a'); // succeeds, yields 'a'
expect(parser.parse('b')).toEqual('b'); // succeeds, yields 'b'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('c')).toThrow(); // fails
In the following example the order of parsers matters.
const parser = P.or(P.string('a'), P.string('ab')).thenEof(); // matches only 'a'
expect(parser.parse('a')).toEqual('a'); // succeeds, yields 'a'
expect(() => parser.parse('ab')).toThrow(); // fails, since the parser will try to match 'a' first, succeeds and then expects the end of input
With the longer alternative first, both inputs are matched.
const parser = P.or(P.string('ab'), P.string('a')).thenEof(); // matches 'ab' or 'a'
expect(parser.parse('a')).toEqual('a'); // succeeds, yields 'a', the parser will try to match 'ab' first but fails, then it backtracks and tries to match 'a'
expect(parser.parse('ab')).toEqual('ab'); // succeeds, yields 'ab'
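The ordering behaviour shown in these examples can be modelled with a small standalone function. This is a sketch that reproduces the documented outcomes for string alternatives followed by an end-of-input check; matchFirstThenEof is a hypothetical helper, not part of parsiNOM, and it says nothing about how P.or is implemented.

```typescript
// Hypothetical helper, not part of parsiNOM.
// Tries the alternatives in order; the first one that is a prefix of the
// input wins, and the match only counts if nothing is left over afterwards.
function matchFirstThenEof(alternatives: string[], input: string): string | null {
  for (const alt of alternatives) {
    if (input.slice(0, alt.length) === alt) {
      // First success wins: later alternatives are not tried,
      // even if the end-of-input check fails.
      return alt.length === input.length ? alt : null;
    }
  }
  return null; // no alternative matched at all
}

const shortFirst = matchFirstThenEof(['a', 'ab'], 'ab'); // null: 'a' wins, 'b' is left over
const longFirst = matchFirstThenEof(['ab', 'a'], 'ab'); // 'ab'
const fallback = matchFirstThenEof(['ab', 'a'], 'a'); // 'a': 'ab' fails, then 'a' is tried
```

A common rule of thumb that follows from this: when one alternative is a prefix of another, list the longer one first.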
Matching a Sequence
P.sequence<ParserArr extends readonly Parser<unknown>[]>(...parsers: ParserArr): Parser<DeParserArray<ParserArr>>
P.sequence accepts any number of parsers as arguments and matches them in order, yielding a tuple of all of their results.
const parser = P.sequence(P.string('a'), P.string('b')).thenEof(); // matches 'a' then 'b'
expect(parser.parse('ab')).toEqual(['a', 'b']); // succeeds, yields ['a', 'b']
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('a')).toThrow(); // fails
expect(() => parser.parse('ba')).toThrow(); // fails
expect(() => parser.parse('foo')).toThrow(); // fails
Matching Something Many Times
Parser.many(): Parser<SType[]>
Parser.many makes a parser match itself as many times as it can in a row, yielding an array of the parser's results.
const parser = P.string('a').many().thenEof(); // matches 'a' as many times as it can
expect(parser.parse('')).toEqual([]); // succeeds, yields []
expect(parser.parse('a')).toEqual(['a']); // succeeds, yields ['a']
expect(parser.parse('aaa')).toEqual(['a', 'a', 'a']); // succeeds, yields ['a', 'a', 'a']
expect(() => parser.parse('foo')).toThrow(); // fails
expect(() => parser.parse('aafoo')).toThrow(); // fails
Transforming what a Parser Yields
Parser.map<OtherSType extends STypeBase>(fn: (value: SType) => OtherSType): Parser<OtherSType>
Parser.map allows for the transformation of the yielded value of a parser.
const parser = P.regexp(/^[0-9]+/).map(x => Number(x)); // matches a number, yielding the number as a number, not a string
expect(parser.parse('1')).toEqual(1); // succeeds, yields '1' as a number
expect(parser.parse('123')).toEqual(123); // succeeds, yields '123' as a number
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('foo')).toThrow(); // fails
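Conceptually, mapping changes only the yielded value and its type, not what is matched. The following standalone sketch illustrates that with a hypothetical MiniParser type and mapParser function; it is an assumption-laden model for illustration, not parsiNOM's Parser class.

```typescript
// Hypothetical mini parser type for illustration; not parsiNOM's Parser class.
type MiniParser<T> = (input: string) => { value: T } | null;

// Matches one or more digits at the start of the input, yielding a string.
const digits: MiniParser<string> = input => {
  const m = /^[0-9]+/.exec(input);
  return m === null ? null : { value: m[0] };
};

// Mapping leaves the matching untouched and only converts the yielded value.
const mapParser = <A, B>(p: MiniParser<A>, fn: (value: A) => B): MiniParser<B> => input => {
  const r = p(input);
  return r === null ? null : { value: fn(r.value) };
};

// The type changes from MiniParser<string> to MiniParser<number>.
const asNumber: MiniParser<number> = mapParser(digits, x => Number(x));

const mapped = asNumber('123'); // { value: 123 }
const failed = asNumber('foo'); // null, the mapping function never runs
```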
There are More
parsiNOM has a lot more combinators and matchers, which are documented using doc comments.
parsiNOM also has P_UTILS, which contains a lot of utility parsers, such as parsers for binary expressions, that can simplify building a parser.
Examples
The folder examples contains a few parsers written using parsiNOM, which you can use as a reference. The tests for these parsers are located in tests/examples.
Technical Things
parsiNOM parsers are LL(infinity) parsers, meaning the following.
- the parser works left to right on the input
- the parser applies the leftmost derivation first
- the parser is a top-down parser
- the parser is restricted to context-free languages (this might not hold true, since Parser.chain exists)
- the parser does not directly support left recursion (it will lead to an infinite loop), but there is a workaround using Parser.many and Array.reduce
- the parser supports infinite lookahead
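The left-recursion workaround mentioned in the list can be sketched without any library. A left-recursive rule such as expr := expr '-' number | number is rewritten as expr := number ('-' number)*, the repeated part is collected into an array (the role Parser.many would play), and the array is folded from the left with Array.reduce. The function parseSubtraction below is a hypothetical, hand-written illustration of that idea, not parsiNOM code.

```typescript
// Hypothetical, self-contained sketch; not parsiNOM code.
// Left-recursive grammar:       expr := expr '-' number | number
// Rewritten without recursion:  expr := number ('-' number)*
function parseSubtraction(input: string): number {
  let pos = 0;

  // Reads one or more digits at the current position.
  const readNumber = (): number => {
    const m = /^[0-9]+/.exec(input.slice(pos));
    if (m === null) throw new Error(`expected a number at position ${pos}`);
    pos += m[0].length;
    return Number(m[0]);
  };

  const first = readNumber();

  // The "many" part: collect every following "- number".
  const rest: number[] = [];
  while (input.charAt(pos) === '-') {
    pos += 1;
    rest.push(readNumber());
  }
  if (pos !== input.length) throw new Error(`unexpected input at position ${pos}`);

  // The "reduce" part: fold left to right, so 10-3-2 is (10-3)-2.
  return rest.reduce((acc, n) => acc - n, first);
}

const leftAssociative = parseSubtraction('10-3-2'); // 5, not 9
```

Folding from the left is what restores the left-associative result that the original left-recursive rule would have expressed.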
Development
parsiNOM uses tsc to build and bun to test.
Development setup
- install bun (if you don't have it already)
- run bun install
- run bun run test to test
- run bun run build to build parsiNOM
- run bun run format to reformat the code