0.1.10 • Published 9 years ago

taglex v0.1.10

Weekly downloads
4
License
LGPL-3.0
Repository
bitbucket
Last release
9 years ago

TagLex

TagLex is a library of streaming lexers and parsers for processing custom mark-up languages. It makes writing a parser for a Markdown-like language very easy, and it also helps with writing parsers for strict data formats.

Installation

npm install taglex

Usage

Check out the Examples for a complete Markdown-esque language, or see below for mini-language examples.

Simple example

var taglex = require('taglex');

// ignore_case lets "I:" and ":I" match the 'i:' / ':i' alias below.
var ruleset = new taglex.TagRuleSet({ ignore_case: true });
ruleset.add_tag({
    name: 'italic',
    open: '*',
    close: '*',
    parents: ['root'],
    aliases: [['_', '_'], ['i:', ':i']],
    payload: {start: '<i>', finish: '</i>'}
});

var parser = ruleset.new_parser();

// Emit the opening HTML stored in the tag's payload.
parser.on('tag_open', function (payload, token) {
    process.stdout.write(payload.start);
});

// Emit plain text between tags.
parser.on('text_node', function (text) {
    process.stdout.write(text.replace(/</g, "&lt;")); // escape
});

// Emit the closing HTML stored in the tag's payload.
parser.on('tag_close', function (payload, token) {
    process.stdout.write(payload.finish);
});

parser.write("This is an *example* of a small I:regular");
parser.write(" language:I");

// Would output:
// This is an <i>example</i> of a small <i>regular language</i>
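
The text_node handler above escapes only `<`, which is enough for the sample input. For untrusted text you would probably want to escape the other HTML-significant characters too. A minimal sketch follows; `escapeHtml` and `HTML_ESCAPES` are hypothetical names and not part of TagLex:

```javascript
// Hypothetical helper, not part of TagLex: maps each HTML-significant
// character to its entity.
var HTML_ESCAPES = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
};

// Replace every character listed in HTML_ESCAPES with its entity.
function escapeHtml(text) {
    return String(text).replace(/[&<>"']/g, function (ch) {
        return HTML_ESCAPES[ch];
    });
}
```

It can be used in the text_node handler in place of text.replace(/</g, "&lt;").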

Tag hierarchy example

TagLex can also parse context-free languages, where a tag may nest inside itself to any depth:

var ruleset = new taglex.TagRuleSet();
ruleset.add_tag({
    name: 'table',
    open: '{{{', close: '}}}',
    ignore_text: true,
    parents: ['root'],
    payload: {start: '<table>', finish: '</table>'}
});

ruleset.add_tag({
    name: 'row',
    ignore_text: true,
    open: '[', close: ']',
    parents: ['table'],
    payload: {start: '<tr>', finish: '</tr>'}
});

ruleset.add_tag({
    name: 'cell',
    open: '[', close: ']',
    parents: ['row'],
    payload: {start: '<td>', finish: '</td>'}
});

// to make it context-free, a tag that can contain itself:
ruleset.add_tag({
    name: 'i',
    open: '[', close: ']',
    parents: ['i', 'cell'],
    payload: {start: '<i>', finish: '</i>'}
});

/* [... parser set up as before ...] */

parser.write("Outside the table I can freely use [] characters.\n");
parser.write("Here is a table example:\n{{{ (ignored text)");
parser.write("[ [ cell 1 ] [[[ cell 2 ]]] [ 3 ] ]\n");
parser.write("[ [ cell 4 ] [ cell 5 ] [ 6 ] ]");
parser.write("}}}");

// Would output (wrapped):
// Outside the table I can freely use [] characters.
// Here is a table example:
// <table><tr><td> cell 1 </td><td><i><i> cell 2 </i></i></td><td> 3 </td></tr>
// <tr><td> cell 4 </td><td> cell 5 </td><td> 6 </td></tr></table>
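
In this example the same `[` marker opens a row, a cell, or an `i` tag depending on the enclosing tag. The sketch below illustrates that lookup using only the `parents` lists from the rules above. It is an illustration and not TagLex's actual implementation; `resolveOpen` and `openAll` are hypothetical names.

```javascript
// Illustration only, NOT TagLex internals: the open markers and parents
// lists are copied from the rules above.
var tags = [
    { name: 'table', open: '{{{', parents: ['root'] },
    { name: 'row',   open: '[',   parents: ['table'] },
    { name: 'cell',  open: '[',   parents: ['row'] },
    { name: 'i',     open: '[',   parents: ['i', 'cell'] }
];

// Return the name of the tag that `marker` opens inside `context`,
// or null if the marker is plain text in that context.
function resolveOpen(marker, context) {
    for (var idx = 0; idx < tags.length; idx++) {
        var tag = tags[idx];
        if (tag.open === marker && tag.parents.indexOf(context) !== -1) {
            return tag.name;
        }
    }
    return null;
}

// Open a run of markers starting in `context` and return the resulting stack.
function openAll(markers, context) {
    var stack = [context];
    markers.forEach(function (marker) {
        var name = resolveOpen(marker, stack[stack.length - 1]);
        if (name !== null) {
            stack.push(name);
        }
    });
    return stack;
}

console.log(resolveOpen('[', 'root'));
// null ('[' is plain text outside the table)
console.log(openAll(['[', '[', '[', '['], 'table').join(' > '));
// table > row > cell > i > i
```

Because `i` lists itself as a parent, the stack can grow without bound, which is what takes the language beyond what a regular expression alone can match.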

Speed

I haven't benchmarked it or analyzed its complexity carefully, but here is a rough idea of what to expect:

  • The compile step is at least O(n^2) in both memory and CPU time, where n is the number of tags.

  • The render step should be very fast, as it relies on searching the input string with a single regular expression (per context). The slowest feature is the "stack collapse" feature.
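
To make the "single regular expression" idea concrete, here is a minimal sketch of how a set of tag markers could be combined into one alternation and used to split input in a single pass. This is an assumption about the general technique, not TagLex's actual code; `escapeRegExp`, `buildMarkerRegExp`, and `tokenize` are hypothetical names.

```javascript
// Escape regex metacharacters so markers like '*' or '[' match literally.
function escapeRegExp(marker) {
    return marker.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one alternation from all markers. Longer markers go first so that
// a marker such as '{{{' would win over a shorter '{'.
// Assumes `markers` is a non-empty array of non-empty strings.
function buildMarkerRegExp(markers, ignoreCase) {
    var source = markers
        .slice()
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp)
        .join('|');
    return new RegExp(source, ignoreCase ? 'gi' : 'g');
}

// Split the input into text and marker tokens using the combined regex.
function tokenize(input, regexp) {
    var tokens = [];
    var last = 0;
    var match;
    regexp.lastIndex = 0;
    while ((match = regexp.exec(input)) !== null) {
        if (match.index > last) {
            tokens.push({ type: 'text', value: input.slice(last, match.index) });
        }
        tokens.push({ type: 'marker', value: match[0] });
        last = regexp.lastIndex;
    }
    if (last < input.length) {
        tokens.push({ type: 'text', value: input.slice(last) });
    }
    return tokens;
}

var markerRegExp = buildMarkerRegExp(['*', '_', 'i:', ':i'], true);
var sampleTokens = tokenize('a *b* I:c:I', markerRegExp);
```

Here `sampleTokens` alternates between text and marker tokens. A real parser would keep one such expression per context, since the set of markers that are meaningful changes with the enclosing tag.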

Anti-features

  • TagRuleSet aliases are counter-intuitive. Presently, they can be mixed and matched: a tag opened with one alias can be closed with another. Assume that this will change in the future, so that a tag opened with one alias can only be closed with that same alias.

  • The "stack collapse" feature (enabled with the "force_close" option of add_tag) sometimes splits TEXT_NODE emissions. This is typically a harmless bug. The feature in general is needlessly complex and could use a rewrite.

  • Poor documentation: the TagLex documentation could use a lot of work. In the meantime, check out examples and tests.js to see many more examples of what you can do.

  • A very large number of heavily interacting tags (e.g. where tag nesting is a complete graph and sloppy tag closes apply everywhere, as in a fault-tolerant HTML parser) might mean a slow compile step and an unnecessarily large memory footprint (lots of n^2 operations).
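
The alias caveat in the first item can be illustrated with a small sketch that lists which close markers are accepted for a tag, under the current mix-and-match behaviour and under the stricter behaviour that may come later. This is a hypothetical illustration and not TagLex code; `acceptedClosers` and its `strict` flag are invented for the example.

```javascript
// Hypothetical sketch, not TagLex code: list the close markers accepted for
// a tag that was opened with the marker `openedWith`.
function acceptedClosers(tag, openedWith, strict) {
    var pairs = [[tag.open, tag.close]].concat(tag.aliases || []);
    if (!strict) {
        // Behaviour described above: any alias may close the tag.
        return pairs.map(function (pair) { return pair[1]; });
    }
    // Possible future behaviour: only the matching alias closes the tag.
    return pairs
        .filter(function (pair) { return pair[0] === openedWith; })
        .map(function (pair) { return pair[1]; });
}

var italic = { open: '*', close: '*', aliases: [['_', '_'], ['i:', ':i']] };
console.log(acceptedClosers(italic, '_', false)); // [ '*', '_', ':i' ]
console.log(acceptedClosers(italic, '_', true));  // [ '_' ]
```

Input that always closes a tag with the alias it was opened with stays valid under both behaviours.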
