@lemons_dev/parsinom v0.0.12
parsiNOM
parsiNOM is a modern, optimized and typesafe parser combinator library, inspired by parsimmon.
parsiNOM has not yet reached a stable release, so breaking changes can still occur in minor versions.
What is a Parser Combinator?
The idea behind parser combinators is to construct your parser out of many small parsers. This makes parsers easier to build and to read. Parser combinators also make testing easier, because every part of the parser, such as the parser for string literals, can be tested individually.
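To make the idea concrete, here is a minimal, library-independent sketch. The names `Result`, `MiniParser`, `literal` and `pair` are hypothetical and chosen only for illustration; they are not parsiNOM APIs, and parsiNOM's internals may look quite different.

```typescript
// Hypothetical sketch: a parser is a function from input to a result.
type Result<T> = { ok: true; value: T; rest: string } | { ok: false };
type MiniParser<T> = (input: string) => Result<T>;

// A small parser: matches one literal string at the start of the input.
const literal = (str: string): MiniParser<string> => input =>
  input.startsWith(str) ? { ok: true, value: str, rest: input.slice(str.length) } : { ok: false };

// A combinator: builds a bigger parser by running two parsers one after the other.
const pair = <A, B>(first: MiniParser<A>, second: MiniParser<B>): MiniParser<[A, B]> => input => {
  const a = first(input);
  if (!a.ok) return { ok: false };
  const b = second(a.rest);
  if (!b.ok) return { ok: false };
  return { ok: true, value: [a.value, b.value], rest: b.rest };
};

// Each piece can be tested on its own, and so can the combination.
const quote = literal('"');
const word = literal('hi');
const quotedStart = pair(quote, word);
```

Because `quote`, `word` and `quotedStart` are ordinary values, each one can be exercised in a separate unit test.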
Important Terms
- combinator: a function that usually takes in one or more parsers and returns a single combined parser
- matcher: a matcher is a parser that is not constructed from other parsers
- yield/yields: in this case yield refers to the value that a parser generates from a specific input string, if it can match. In the code a parser is generic over the value that it yields, meaning Parser<string[]> will yield an array of strings.
Basic Matchers
parsiNOM provides many matchers, which are documented using doc comments. Here we will look at the two most basic matchers.
String Matcher
P.string(str: string): Parser<string>
P.string returns a parser that matches str and yields str.
const parser = P.string('foo'); // matches the string foo
expect(parser.parse('foo')).toEqual('foo'); // succeeds, yields 'foo'
expect(parser.parse('foobar')).toEqual('foo'); // succeeds, yields 'foo'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('bar')).toThrow(); // fails
To assert that the parser is at the end of input after parsing you can use .thenEof().
const parser = P.string('foo').thenEof(); // matches the string foo, then requires the end of input
expect(parser.parse('foo')).toEqual('foo'); // succeeds, yields 'foo'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('bar')).toThrow(); // fails
expect(() => parser.parse('foobar')).toThrow(); // fails
RegExp
P.regexp(regexp: RegExp, group?: number | undefined): Parser<string>
P.regexp returns a parser that matches the regexp regexp and yields the matched string or optionally a specific capture group.
Most of the time you want to use ^ to only match at the current parser position.
const parser = P.regexp(/^[0-9]+/); // matches multiple digits
expect(parser.parse('1')).toEqual('1'); // succeeds, yields '1'
expect(parser.parse('123')).toEqual('123'); // succeeds, yields '123'
expect(parser.parse('123foo')).toEqual('123'); // succeeds, yields '123'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('foo')).toThrow(); // fails
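The effect of ^ can be seen with plain JavaScript regular expressions, independent of parsiNOM. The sketch below assumes the pattern is tested against the input that remains at the current parser position; `remaining` is a hypothetical stand-in for that input. How parsiNOM itself treats an unanchored pattern is not stated here.

```typescript
// Plain RegExp behaviour only; no parsiNOM calls are involved.
const remaining = 'foo123';

const unanchoredMatch = /[0-9]+/.exec(remaining); // finds '123' further along, at index 3
const anchoredMatch = /^[0-9]+/.exec(remaining); // null: there are no digits at the start

// A capture group selects part of the match, comparable to the optional `group` argument.
const groupMatch = /^([0-9]+)px/.exec('12px');
const wholeMatch = groupMatch?.[0]; // '12px'
const firstGroup = groupMatch?.[1]; // '12'
```

Without the anchor, the pattern can match text that does not start where the parser currently is, which is rarely what a parser wants.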
Basic Combinators
The most important parser combinators are or, sequence, many and map, but parsiNOM provides many other combinators built on top of these.
Matching Multiple Options
P.or<ParserArr extends readonly Parser<unknown>[]>(...parsers: ParserArr): Parser<TupleToUnion<DeParserArray<ParserArr>>>
P.or accepts any number of parsers as arguments and yields the value of the first parser that succeeds.
Because of that the order of the parsers is important.
const parser = P.or(P.string('a'), P.string('b')).thenEof(); // matches 'a' or 'b'
expect(parser.parse('a')).toEqual('a'); // succeeds, yields 'a'
expect(parser.parse('b')).toEqual('b'); // succeeds, yields 'b'
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('c')).toThrow(); // fails
In the following example the order of parsers matters.
const parser = P.or(P.string('a'), P.string('ab')).thenEof(); // matches only 'a'
expect(parser.parse('a')).toEqual('a'); // succeeds, yields 'a'
expect(() => parser.parse('ab')).toThrow(); // fails, since the parser tries to match 'a' first, succeeds and then expects the end of input
const parser = P.or(P.string('ab'), P.string('a')).thenEof(); // matches 'ab' or 'a'
expect(parser.parse('a')).toEqual('a'); // succeeds, yields 'a', the parser will try to match 'ab' first but fails, then it backtracks and tries to match 'a'
expect(parser.parse('ab')).toEqual('ab'); // succeeds, yields 'ab'
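The ordering behaviour above follows from how an ordered choice works. Below is a minimal sketch, assuming a parser is modelled as a plain function; `MiniParser`, `literal`, `firstOf` and `atEnd` are hypothetical names for illustration and do not describe how parsiNOM implements P.or or .thenEof().

```typescript
// Hypothetical sketch of ordered choice; not parsiNOM's implementation.
type Result<T> = { ok: true; value: T; rest: string } | { ok: false };
type MiniParser<T> = (input: string) => Result<T>;

const literal = (str: string): MiniParser<string> => input =>
  input.startsWith(str) ? { ok: true, value: str, rest: input.slice(str.length) } : { ok: false };

// Ordered choice: try each alternative from the same starting input, keep the first success.
const firstOf = <T>(...alternatives: MiniParser<T>[]): MiniParser<T> => input => {
  for (const alternative of alternatives) {
    const result = alternative(input); // every attempt restarts at `input`, i.e. it backtracks
    if (result.ok) return result;
  }
  return { ok: false };
};

// Requires that nothing is left over, similar in spirit to .thenEof().
const atEnd = <T>(parser: MiniParser<T>): MiniParser<T> => input => {
  const result = parser(input);
  return result.ok && result.rest === '' ? result : { ok: false };
};

const shortFirst = atEnd(firstOf(literal('a'), literal('ab')));
const longFirst = atEnd(firstOf(literal('ab'), literal('a')));
```

In this sketch `shortFirst('ab')` fails because the choice commits to 'a' and leaves 'b' unconsumed, while `longFirst` accepts both 'a' and 'ab'. A common rule of thumb is to list longer alternatives before their prefixes.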
Matching a Sequence
P.sequence<ParserArr extends readonly Parser<unknown>[]>(...parsers: ParserArr): Parser<DeParserArray<ParserArr>>
P.sequence accepts any number of parsers as arguments and matches them in order, yielding a tuple of all of their results.
const parser = P.sequence(P.string('a'), P.string('b')).thenEof(); // matches 'a' then 'b'
expect(parser.parse('ab')).toEqual(['a', 'b']); // succeeds, yields ['a', 'b']
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('a')).toThrow(); // fails
expect(() => parser.parse('ba')).toThrow(); // fails
expect(() => parser.parse('foo')).toThrow(); // fails
Matching Something Many Times
Parser.many(): Parser<SType[]>
Parser.many makes a parser match itself as many times in a row as it can, yielding an array of the parser's results.
const parser = P.string('a').many().thenEof(); // matches 'a' as many times as it can
expect(parser.parse('')).toEqual([]); // succeeds, yields []
expect(parser.parse('a')).toEqual(['a']); // succeeds, yields ['a']
expect(parser.parse('aaa')).toEqual(['a', 'a', 'a']); // succeeds, yields ['a', 'a', 'a']
expect(() => parser.parse('foo')).toThrow(); // fails
expect(() => parser.parse('aafoo')).toThrow(); // fails
Transforming what a Parser Yields
Parser.map<OtherSType extends STypeBase>(fn: (value: SType) => OtherSType): Parser<OtherSType>
Parser.map allows for the transformation of the yielded value of a parser.
const parser = P.regexp(/^[0-9]+/).map(x => Number(x)); // matches a number, yielding the number as a number, not a string
expect(parser.parse('1')).toEqual(1); // succeeds, yields '1' as a number
expect(parser.parse('123')).toEqual(123); // succeeds, yields '123' as a number
expect(() => parser.parse('')).toThrow(); // fails
expect(() => parser.parse('foo')).toThrow(); // fails
There are More
parsiNOM has a lot more combinators and matchers, which are documented using doc comments.
parsiNOM also has P_UTILS which contains a lot of utility parsers, such as parsers for binary expressions, that can simplify building a parser.
Examples
The folder examples contains a few parsers written using parsiNOM, which you can use as a reference. The tests for these parsers are located in tests/examples.
Technical Things
parsiNOM parsers are LL(infinity) parsers, meaning the following:
- the parser works left to right on the input
- the parser applies the leftmost derivation first
- the parser is a top-down parser
- the parser is restricted to context-free languages (this might not hold true, since Parser.chain exists)
- the parser does not directly support left recursion (it will lead to an infinite loop), but there is a workaround using Parser.many and Array.reduce
- the parser supports infinite lookahead
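The left recursion workaround mentioned above can be sketched as follows. This is a library-independent illustration under stated assumptions: `MiniParser`, `pattern`, `many`, `digits` and `subtraction` are hypothetical names, and the `many` here only imitates the documented behaviour of Parser.many. The left-recursive rule `expr := expr '-' number | number` is rewritten as `expr := number ('-' number)*`, and Array.reduce then folds the repeated tail from the left.

```typescript
// Hypothetical sketch; not parsiNOM code.
type Result<T> = { ok: true; value: T; rest: string } | { ok: false };
type MiniParser<T> = (input: string) => Result<T>;

// Matches an anchored regexp at the start of the remaining input.
const pattern = (re: RegExp): MiniParser<string> => input => {
  const m = re.exec(input);
  if (m === null || m.index !== 0) return { ok: false };
  return { ok: true, value: m[0], rest: input.slice(m[0].length) };
};

// Zero or more repetitions, collected into an array.
const many = <T>(parser: MiniParser<T>): MiniParser<T[]> => input => {
  const values: T[] = [];
  let rest = input;
  for (;;) {
    const r = parser(rest);
    if (!r.ok || r.rest === rest) break; // stop on failure or when no input was consumed
    values.push(r.value);
    rest = r.rest;
  }
  return { ok: true, value: values, rest };
};

const digits = pattern(/^[0-9]+/);

// One "- number" step of the repeated tail, yielding the operand as a number.
const minusOperand: MiniParser<number> = input => {
  const minus = pattern(/^-/)(input);
  if (!minus.ok) return { ok: false };
  const operand = digits(minus.rest);
  return operand.ok ? { ok: true, value: Number(operand.value), rest: operand.rest } : { ok: false };
};

const subtraction: MiniParser<number> = input => {
  const first = digits(input);
  if (!first.ok) return { ok: false };
  const tail = many(minusOperand)(first.rest);
  if (!tail.ok) return { ok: false };
  // Folding from the left restores left associativity: (10 - 3) - 2.
  const value = tail.value.reduce((acc, n) => acc - n, Number(first.value));
  return { ok: true, value, rest: tail.rest };
};
```

With this sketch, `subtraction('10-3-2')` evaluates as (10 - 3) - 2, whereas a right-recursive rule without the fold would naturally group it as 10 - (3 - 2).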
Development
parsiNOM uses tsc to build and bun to test.
Development setup
- install bun (if you don't have it already)
- run bun install
- run bun run test to test
- run bun run build to build parsiNOM
- run bun run format to reformat the code