Lex Luthor
This package is a lexical scanner written in JavaScript, based on a talk by Rob Pike ("Lexical Scanning in Go"). In the talk he goes over a method that uses state functions to tokenize input, and uses Go channels to emit the tokens. In Node, I thought it would be possible to use streams and event emitters to achieve the same thing.
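To make the analogy concrete: where Pike's lexer goroutine sends tokens down a channel, a Node lexer can "send" tokens by emitting events. A minimal sketch of that idea (illustrative only, not this package's internals; SketchLexer is a hypothetical name):

// Sketch: a lexer that delivers tokens by emitting events
// instead of writing to a channel.
var EventEmitter = require('events').EventEmitter;
var util = require('util');

function SketchLexer() {
  EventEmitter.call(this);
}
util.inherits(SketchLexer, EventEmitter);

SketchLexer.prototype.emitToken = function(type, value) {
  this.emit('token', { type: type, value: value });
};

// A consumer subscribes to the token stream like any other event
var sketch = new SketchLexer();
sketch.on('token', function(token) {
  console.log(token);
});
sketch.emitToken('comment_start', '/*');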
Example
This basic example finds multi-line comments (/*...*/). The general idea is that every state function returns the next state function to run, or null at the end of the file.
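Conceptually, the lexer keeps calling whichever state function was returned last, stopping when one returns null. A minimal sketch of that loop (illustrative only, not the library's actual code; Lexer and lex are as created in the example below):

// Conceptual run loop: each state function returns the next state,
// and a null return ends the scan.
var state = Lexer.getState('default');
while (state !== null) {
  state = state(lex);
}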
// Setup for the example: require the package and define the character
// constants the state functions use. (The constant values shown here are
// assumptions for this example; in the real project they may live elsewhere.)
var Lexer = require('lex-luthor');

var constants = {
  COMMENT_MULTILINE_START_CHAR: '/',
  COMMENT_MULTILINE_END_CHAR: '*',
  COMMENT_MULTILINE_END: '*/',
  STAR: '*',
  WHITESPACE: ' \t\r\n'
};

// Register the default state function
Lexer.registerState('default', function(lex) {
  // Get the next character from the stream
  var next = lex.next();
  // Check if it is null, meaning the end of the file
  if (next === null) {
    return null;
  }
  // Check for the comment start character ('/')
  if (lex.value == constants.COMMENT_MULTILINE_START_CHAR) {
    // If the next character is a star, start a comment
    if (lex.accepts(constants.STAR)) {
      lex.emitToken('comment_start');
      return Lexer.getState('comment');
    }
  }
  // Ignore whitespace
  lex.acceptsRun(constants.WHITESPACE);
  lex.ignore();
  // If nothing matched, start over
  return Lexer.getState('default');
});
// Register the inside-comment state function
Lexer.registerState('comment', function(lex) {
  // Consume any input that is not the start of an ending token ('*')
  lex.notAcceptsRun(constants.COMMENT_MULTILINE_END_CHAR);
  // Check the next two characters; if they are '*/' then end the comment
  if (lex.lookAhead(2) == constants.COMMENT_MULTILINE_END) {
    // Emit the comment body as a token
    lex.emitToken('comment_content');
    // Advance past the ending token and emit
    lex.next();
    lex.next();
    lex.emitToken('comment_end');
    // Return to the default state
    return Lexer.getState('default');
  }
  // If it was not the end of a comment, advance past the star and continue
  var next = lex.next();
  if (next === null) {
    return null;
  }
  return Lexer.getState('comment');
});
// Create the lexer and give it the input file
var lex = new Lexer().inputFile('./test/files/comments.css');

// Create an array to store the tokens
// In real life you would probably have a parser listen for this event
var tokens = [];
lex.on('token', function(token) {
  tokens.push(token);
  // When the lexer reaches the end of the file, log all the tokens
  if (token.type == Lexer.EOFToken) {
    console.log(tokens);
  }
});

// Run the lexer
lex.run();
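As the comment on the tokens array notes, a real consumer would usually be a parser subscribed to the same event. A minimal sketch of such a consumer, assuming only what the example shows (tokens carry a type field and the lexer is an event emitter):

// Hypothetical downstream consumer: dispatch on token type as tokens
// arrive. For the comment states above, the types arrive in the order
// 'comment_start', 'comment_content', 'comment_end', then the EOF token.
lex.on('token', function(token) {
  switch (token.type) {
    case 'comment_start':
      // a multi-line comment has opened
      break;
    case 'comment_content':
      // the body of the comment (the exact token shape is library-defined)
      break;
    case 'comment_end':
      // the comment has closed; the lexer is back in the default state
      break;
    case Lexer.EOFToken:
      // end of input
      break;
  }
});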
Tests
There are unit tests and end-to-end tests. To run both you can use grunt test, or you can run them individually with npm run-script test-unit and npm run-script test-e2e.
To generate the coverage report you can run grunt test-coverage. This will run both the unit and end-to-end tests, and generate a report for each in the coverage directory. This task also starts a server where you can view the HTML output of the report.