whilejs v2.2.0
:icons: font = While.js
This is primarily a utility library for the link:https://github.com/sonrad10/Whide[Whide editor] to perform operations on programs written in the WHILE language used at the University of Sussex.
Before this library, the only (known) existing program operating on WHILE programs was link:https://github.com/Alexj136/HWhile[HWhile].
For more information about the WHILE language, see the <<_while_language_context_free_grammar, Context Free Grammar>, the link:https://github.com/sonrad10/Whide[Whide editor] (for which this library was produced), or the Haskell WHILE interpreter .
== Features
- Linter
- Interpreter (no debugger yet)
- Convert program to pure WHILE
- Convert to programs-as-data representation
== Installing
To run this program, you will need to have link:https://nodejs.org/en/[Node.js] installed.
The recommended install method is from the NPM global repository:
source,shell
npm install -g whilejs
Alternatively you can install it manually:
source,shell
$ git clone git@github.com:sonrad10/while.js.git
source,shell
$ git clone https://github.com/sonrad10/while.js
Then install globally:
source,shell
$ cd while.js/
$ npm install -g
== Command Line Usage
//Ensure the program is installed on your computer.
Use whilejs --help
for a complete list of the program options.
=== Interpreter
source,shell
whilejs prog.while
Use the -P
or --pureOnly
flag to run the program as pure WHILE (not allowing any extended language features).
Additionally, the output display format of the program can be set using the -o
or --output
flags.
The treeLang syntax is described on link:https://github.com/sonrad10/whide-treeLang[this page].
Use whilejs --help
for a complete list of the program options.
=== Programs-as-data
Use the --data
, -d
, or -u
option to convert WHILE programs to their programs-as-data representation.
source,shell
whilejs --data prog.while
The optional --pure-data
flag removes the leading @
symbols from each command in the data representation.
source,shell
whilejs --data --pure-data prog.while
=== Convert to pure WHILE
Use the --pure
or -p
flags to convert an extended-WHILE program to a semantically equivalent pure WHILE program:
source,shell
whilejs --pure prog.while
== Usage (for developers)
Add while.js to your project as a dependency:
source,shell
npm install --save whilejs
From this, import the required utilities as follows:
source, typescript
import { linter, parseProgram, Interpreter, ProgramManager, MacroManager, toPad, fromPad, displayPad, displayProgram, ErrorType, BinaryTree
} from "whilejs";
=== Interpreter
See the following example for programmatically using the WHILE interpreter.
This program runs a WHILE program with the input +[1,2,3,4]+
(the list of numbers 1 through 4) and
displays the output as a tree whose two children are each lists of numbers.
It also demonstrates error handling for the output of the WHILE parser.
If you want to get binary trees as inputs from the user, I recommend using the link:https://github.com/sonrad10/whide-treeLang[@whide/treeLang] package.
source,typescript
import { BinaryTree, ErrorType, Interpreter } from 'whilejs'; import treeConverter, { ConversionResultType, stringify, treeParser } from "@whide/tree-lang";
//The program to run
let whileprog: string = prog read in {
out := cons in in
} write out
;
//Binary tree to pass as input to the program
//Converted from a string for better readability
let inputTree: BinaryTree = treeParser('1,2,3,4');
//Parse the input program and produce an Interpreter
object
//If success
is true, interpreter will contain a configured Interpreter object.
//If success
is false, interpreter will be undefined, and errors
will contain a list of error messages
let {success, errors, interpreter}: {success:true, interpreter:Interpreter}|{success:false, errors: ErrorType[]} =
Interpreter.parse(whileprog, inputTree);
if (!success) {
//Show errors if parsing failed
console.error(Parse failed with ${errors.length} errors:
);
for (let error of errors) {
console.error(Error (${error.position.row}:${error.position.col}): ${error.message}
);
}
} else {
//The program was parsed without issue
//Run the program and get the produced output
let output: BinaryTree = interpreter.run();
try {
//Convert the outputted tree to a human readable format and display it
let res: ConversionResultType = treeConverter(output, '<int[].int[]>');
console.log(stringify(res.tree))
} catch (e) {
//Error when converting the tree
console.error(e);
return;
}
}
=== Linter
The linter performs syntax analysis on WHILE programs. In order to provide as useful error messages as possible, the linter attempts to work around syntax errors as much as possible. This is particularly useful in, for example, syntax checking in a code editor.
NOTE: If you just want to run WHILE programs, use Interpreter.parse
instead.
This is equivalent to running the linter, then creating an Interpreter
object from the result.
source,typescript
import { ErrorType, linter } from 'whilejs';
//The program to analyse
let whileprog: string = prog read in {
out := cons in in
} write out
;
//Run the linter on the input program let errors: ErrorType[] = linter(whileprog);
//Do something with the list of errors
//...
=== Using the Lexer and Parser
The linter consists of two steps; firstly lexing the program string to produce a list of program tokens, then parsing the list of tokens into an Abstract Syntax Tree (AST) representing the program.
The lexer and parser can be directly accessed using their individual imports. This is useful if you need access to the lexer token list or the program's AST. An example program for this is shown below:
source,typescript
//Lexer imports import lexer from "whilejs/lib/linter/lexer"; import { WHILE_TOKEN } from "whilejs/lib/types/tokens"; import { WHILE_TOKEN_EXTD } from "whilejs/lib/types/extendedTokens"; //Parser imports import parser from "whilejs/lib/linter/parser"; import { AST_PROG,AST_PROG_PARTIAL } from "whilejs/lib/types/ast"; //General imports import { ErrorType } from "whilejs";
//The program to analyse
let whileprog: string = prog read in {
out := cons in in
} write out
;
//Run the program through the lexer to produce a list of program tokens let tokenList, lexerErrors: [(WHILE_TOKEN|WHILE_TOKEN_EXTD)[], ErrorType[]] = lexer(whileprog); //Then run the token list through the parser to produce an AST representing the program let ast, parseErrors: [(AST_PROG | AST_PROG_PARTIAL), ErrorType[]] = parser(tokenList);
//Optionally combine the lexer and parser errors into a single list
let errors: ErrorType[] = ...lexerErrors, ...parseErrors;
When the lexer catches invalid syntax, the offending token(s) are added as an unknown
type to the token list,
and an error message is added to the error list.
When the parser catches invalid syntax, the offending block (and all its parent nodes in the AST) is marked as incomplete,
and any data which cannot be parsed from the code is filled with null.
If the program's root AST node is marked as complete (i.e. node.complete
is true
) then the program was parsed without issue.
Otherwise, at least one node in the tree contains incomplete data.
=== Programs-as-data
source,typescript
import { parseProgram, toPad, fromPad, displayPad, displayProgram } from "whilejs"; import { HWHILE_DISPLAY_FORMAT, ProgDataType } from "whilejs/lib/tools/progAsData"; import { AST_PROG } from "whilejs/lib/ast";
//The program to convert
let prog, err = parseProgram(prog read in {
out := cons in in
} write out
);
//Make sure there weren't any parsing errors
if (!prog.complete) {
let errString = '';
for (let i in err) {
errString += erri.position.row + '.' + erri.position.column;
errString += :
+ erri.message;
if (i < err.length - 1) errString += '\n';
}
throw new Error(Errors while parsing the program:\n${errString}
);
}
//Convert the program to prog-as-data let pad: ProgDataType = toPad(prog); console.log(displayPad(pad, HWHILE_DISPLAY_FORMAT));
//Convert from prog-as-data back to a program let prog1: AST_PROG = fromPad(pad);
console.log(displayProgram(prog1));
=== Program Analysis tools
source,typescript
import { parseProgram, ProgramManager } from "whilejs"; import { AST_PROG } from "whilejs/lib/ast";
//The program to convert
let prog, err = parseProgram(prog read in {
out := cons in in
} write out
);
//Make sure there weren't any parsing errors
if (!prog.complete) {
let errString = '';
for (let i in err) {
errString += erri.position.row + '.' + erri.position.column;
errString += :
+ erri.message;
if (i < err.length - 1) errString += '\n';
}
throw new Error(Errors while parsing the program:\n${errString}
);
}
//Create a program manager
let mgr: ProgramManager = new ProgramManager(prog);
//Replace the macro called 'macroName' with its program code mgr.replaceMacro(macroAst, 'macroName'); //Rename variable 'in' to 'X' mgr.renameVariable('in', 'X'); //Display the program as a string mgr.displayProgram(); //Convert the program from extended WHILE into pure WHILE mgr.toPure(); //Convert the program from extended to programs-as-data representation
mgr.toPad();
=== Testing
You can run the library's tests with the following command:
source,shell
npm run test
Alternatively use the following command to run individual files
source,shell
npm run test-specific -- linter/parser.test.ts utils.test.ts
Or individual directories:
source,shell
npm run test-specific -- linter/*.test.ts
== WHILE language Context-Free Grammar
This grammar depicts the full extended WHILE language supported by while.js.
This is very similar to the language described in Dr. Bernhard Reus' textbook "The Limits Of Computation" with only minor
modifications.
Features available in the extended language which are not available in the pure language have been annotated with an asterisk +*+
.
<name>
represents the program name, and <variable>
accepts any valid variable name.
Variable names must conform to the regular expression /^[a-z_]\w*/i
;
that is, starting with a letter (of any case) or underscore, followed by any number of letters, numbers, or underscores.
source
::= read write
::= {} // Block of commands | { } // Empty block
::= // Single command | ; // List of commands
::= := // Assignment | while // While loop | if // If-then | if // If-then-else // Switch statements
| switch <expression> { <rule-list> }
| switch <expression> { <rule-list> default : <statement-list> }
::= else // Else case
::= // Variable Expression | nil // Atom nil | cons // Construct tree | hd // Left subtree | tl // Right subtree | ( ) // Right subtree
| <expression> = <expression> // Equality expressions
| <number> // All the natural numbers
| true // Booleans
| false
| [] // Empty list constructor
| [<expression-list>] // Non-empty list constuctor // Here '<<...>>' means '...' surrounded by < and >
| << <expression> . <expression> >> // Literal tree constructor
| <<name>> <expression> // Macro calls
::= ...
| <expression> // Single expression list
| <expression>, <expression-list> // Multiple expression list
::= case :
::=
| <rule> <rule-list>
The modifications made to the language are as follows:
- Macro calls may be used in place of any expression, instead of only in assignment statements
- Binary trees may be defined using the syntax
<expr1.expr2>
(whereexpr1`` and ``expr2
are expressions) instead of using thecons
oeprator