Lex Luthor
This package is a lexical scanner written in JavaScript, based on a talk by Rob Pike ("Lexical Scanning in Go"). In the talk he goes over a method that uses state functions to tokenize input, and uses Go channels to emit the tokens. In Node, I thought it would be possible to use streams and event emitters to achieve the same thing.
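To make the analogy concrete: where Pike's lexer goroutine sends tokens down a channel, a Node lexer can "send" tokens by emitting events. A minimal sketch of that idea (illustrative only, not this package's internals; SketchLexer is a hypothetical name):

// Sketch: a lexer that delivers tokens by emitting events
// instead of writing to a channel.
var EventEmitter = require('events').EventEmitter;
var util = require('util');

function SketchLexer() {
  EventEmitter.call(this);
}
util.inherits(SketchLexer, EventEmitter);

SketchLexer.prototype.emitToken = function(type, value) {
  this.emit('token', { type: type, value: value });
};

// A consumer subscribes to the token stream like any other event
var sketch = new SketchLexer();
sketch.on('token', function(token) {
  console.log(token);
});
sketch.emitToken('comment_start', '/*');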
Example
This basic example finds multi-line comments (/*...*/). The general idea is that every state function returns the next state function to run, or null at the end of the file.
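Conceptually, the lexer keeps calling whichever state function was returned last, stopping when one returns null. A minimal sketch of that loop (illustrative only, not the library's actual code; Lexer and lex are as created in the example below):

// Conceptual run loop: each state function returns the next state,
// and a null return ends the scan.
var state = Lexer.getState('default');
while (state !== null) {
  state = state(lex);
}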
// Setup for the example: require the package and define the character
// constants the state functions use. (The constant values shown here are
// assumptions for this example; in the real project they may live elsewhere.)
var Lexer = require('lex-luthor');

var constants = {
  COMMENT_MULTILINE_START_CHAR: '/',
  COMMENT_MULTILINE_END_CHAR: '*',
  COMMENT_MULTILINE_END: '*/',
  STAR: '*',
  WHITESPACE: ' \t\r\n'
};

// Register the default state function
Lexer.registerState('default', function(lex) {
  // Get the next character from the stream
  var next = lex.next();
  // Check if it is null, meaning the end of the file
  if (next === null) {
    return null;
  }
  // Check for the comment start character ('/')
  if (lex.value == constants.COMMENT_MULTILINE_START_CHAR) {
    // If the next character is a star, start a comment
    if (lex.accepts(constants.STAR)) {
      lex.emitToken('comment_start');
      return Lexer.getState('comment');
    }
  }
  // Ignore whitespace
  lex.acceptsRun(constants.WHITESPACE);
  lex.ignore();
  // If nothing matched, start over
  return Lexer.getState('default');
});
// Register the inside-comment state function
Lexer.registerState('comment', function(lex) {
  // Consume any input that is not the start of an ending token ('*')
  lex.notAcceptsRun(constants.COMMENT_MULTILINE_END_CHAR);
  // Check the next two characters; if they are '*/' then end the comment
  if (lex.lookAhead(2) == constants.COMMENT_MULTILINE_END) {
    // Emit the comment body as a token
    lex.emitToken('comment_content');
    // Advance past the ending token and emit
    lex.next();
    lex.next();
    lex.emitToken('comment_end');
    // Return to the default state
    return Lexer.getState('default');
  }
  // If it was not the end of a comment, advance past the star and continue
  var next = lex.next();
  if (next === null) {
    return null;
  }
  return Lexer.getState('comment');
});
// Create the lexer and give it the input file
var lex = new Lexer().inputFile('./test/files/comments.css');

// Create an array to store the tokens
// In real life you would probably have a parser listen for this event
var tokens = [];
lex.on('token', function(token) {
  tokens.push(token);
  // When the lexer reaches the end of the file, log all the tokens
  if (token.type == Lexer.EOFToken) {
    console.log(tokens);
  }
});

// Run the lexer
lex.run();
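As the comment on the tokens array notes, a real consumer would usually be a parser subscribed to the same event. A minimal sketch of such a consumer, assuming only what the example shows (tokens carry a type field and the lexer is an event emitter):

// Hypothetical downstream consumer: dispatch on token type as tokens
// arrive. For the comment states above, the types arrive in the order
// 'comment_start', 'comment_content', 'comment_end', then the EOF token.
lex.on('token', function(token) {
  switch (token.type) {
    case 'comment_start':
      // a multi-line comment has opened
      break;
    case 'comment_content':
      // the body of the comment (the exact token shape is library-defined)
      break;
    case 'comment_end':
      // the comment has closed; the lexer is back in the default state
      break;
    case Lexer.EOFToken:
      // end of input
      break;
  }
});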
Tests
There are unit tests and end-to-end tests. To run both you can use grunt test, or you can run them individually with npm run-script test-unit and npm run-script test-e2e.
To generate the coverage report you can run grunt test-coverage. This will run both the unit and end-to-end tests, and generate a report for each in the coverage directory. This task also starts a server where you can view the HTML output of the report.