oscript-parser v0.2.6
OScript Parser
A parser for the OScript language written in JavaScript. Returns an abstract syntax tree. See also oscript-ast-walker for traversing the AST and oscript-interpreter for its execution.
Synopsis
import { parseText } from 'oscript-parser'
const program = parseText('i = 0', { sourceType: 'script' })
console.log(JSON.stringify(program))
Installation
Use your favourite package manager to install this package locally in your Node.js project:
npm i oscript-parser
pnpm i oscript-parser
yarn add oscript-parser
If you want to use the executables osparse or oslint from PATH, install this package globally instead.
API
The OScript AST resembles the AST for JavaScript, but it includes nodes specific to the OScript syntax. See the language grammar and the AST node declarations.
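Because the AST is a plain tree of objects with a type property, it can be traversed with a few lines of generic code. The following is a minimal sketch, not the API of oscript-ast-walker: the visit helper is hypothetical, and the sample tree is copied from the parser example later in this document.

```javascript
// Sample AST copied from the parser example in this document.
const program = {
  type: 'Program',
  body: [{
    type: 'AssignmentStatement',
    variables: [{ type: 'Identifier', value: 'foo' }],
    init: [{ type: 'StringLiteral', value: 'bar' }]
  }]
}

// Hypothetical helper: call `callback` for every object that carries
// a string `type` property, depth-first, in source order.
function visit (node, callback) {
  if (Array.isArray(node)) {
    for (const item of node) visit(item, callback)
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') callback(node)
    for (const child of Object.values(node)) visit(child, callback)
  }
}

const types = []
visit(program, node => types.push(node.type))
console.log(types)
// [ 'Program', 'AssignmentStatement', 'Identifier', 'StringLiteral' ]
```

The check for a string type skips lexer tokens, whose type is numeric, in case they are included in the output.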
The OScript language is case-insensitive. Values of keywords and identifiers in tokens and AST nodes (value property) are converted to lower case to make comparisons and look-ups more convenient. If you need the original letter case, enable the raw AST node content (raw property) with the rawIdentifiers parser option.
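The lower-cased value makes it easy to group identifiers, while raw keeps the spelling from the source. The sketch below assumes that an identifier node parsed with rawIdentifiers enabled carries the original text in raw. The sample nodes are hand-written for illustration and spellingsByName is a hypothetical helper.

```javascript
// Identifier nodes as they might look with rawIdentifiers enabled.
// The exact node shape is an assumption based on the description above.
const identifiers = [
  { type: 'Identifier', value: 'counter', raw: 'Counter' },
  { type: 'Identifier', value: 'counter', raw: 'COUNTER' },
  { type: 'Identifier', value: 'total', raw: 'total' }
]

// Group the original spellings by the lower-cased name.
function spellingsByName (nodes) {
  const result = new Map()
  for (const { value, raw } of nodes) {
    if (!result.has(value)) result.set(value, new Set())
    // Fall back to the lower-cased value if raw content was not requested.
    result.get(value).add(raw ?? value)
  }
  return result
}

// Report names that are written with more than one letter case.
for (const [name, spellings] of spellingsByName(identifiers)) {
  if (spellings.size > 1) console.log(name, [...spellings])
}
// counter [ 'Counter', 'COUNTER' ]
```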
Parser
The output of the parser is an Abstract Syntax Tree (AST) formatted in JSON. The parser functionality is exposed by parseText() and parseTokens(). parseText() expects an input text. parseTokens() expects an input text with tokens already produced by tokenize().
The available options are:
defines: {}
  Preprocessor named values. For evaluating preprocessor directives.
tokens: false
  Include lexer tokens in the output object. Useful for code formatting or partial analysis in case of errors.
preprocessor: false
  Include tokens of preprocessor directives and the content skipped by the preprocessor. Useful for code formatting.
comments: false
  Include comment tokens in the output of parsing or lexing. Useful for code formatting.
whitespace: false
  Include whitespace tokens in the output of parsing or lexing. Useful for code formatting.
locations: false
  Store location information on each parsed node.
ranges: false
  Store the start and end character locations on each parsed node.
raw: false
  Store the raw original of identifiers and literals.
rawIdentifiers: false
  Store the raw original of identifiers only.
rawLiterals: false
  Store the raw original of literals only.
sourceType: 'script'
  Set the source type to object, script or dump (the old object format).
oldVersion: undefined
  Expect the old version of the OScript language.
sourceFile: 'snippet'
  File name to refer to in source locations.
The default options are also exposed through defaultOptions where
they can be overridden globally.
import { parseText } from 'oscript-parser'
const program = parseText('foo = "bar"', { sourceType: 'script' })
// { type: "Program",
// body:
// [{ type: "AssignmentStatement",
// variables: [{ type: "Identifier", value: "foo" }],
// init: [{ type: "StringLiteral", value: "bar" }]
// }]
// }
Lexer
The lexer can be used independently of the parser. The lexer functionality is exposed by tokenize() and startTokenization(). tokenize() returns an array of tokens. startTokenization() returns a generator that advances to the next token until EOF is reached. The EOF itself is not returned as a token. The options are the same as for parseText(), except for tokens, which is ignored.
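A generator can be consumed with for...of and abandoned early, so only as many tokens are produced as the caller needs. The sketch below illustrates this with a stand-in generator: fakeTokenization and findToken are hypothetical names, and the yielded tokens are copied from the tokenize() example in this document. With the package installed, the result of startTokenization() would be passed to findToken instead.

```javascript
// Hypothetical stand-in for startTokenization(); it yields the tokens
// shown in the tokenize() example and counts how many were requested.
let produced = 0
function * fakeTokenization () {
  const tokens = [
    { type: 8, value: 'foo', line: 1, lineStart: 0, range: [0, 3] },
    { type: 32, value: '=', line: 1, lineStart: 0, range: [4, 5] },
    { type: 2, value: 'bar', line: 1, lineStart: 0, range: [6, 11] }
  ]
  for (const token of tokens) {
    produced++
    yield token
  }
}

// Return the first token that satisfies the predicate and stop
// consuming the iterator right away.
function findToken (tokens, predicate) {
  for (const token of tokens) {
    if (predicate(token)) return token
  }
  return undefined
}

const assignment = findToken(fakeTokenization(), token => token.value === '=')
console.log(assignment.range) // [ 4, 5 ]
console.log(produced) // 2
```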
Each token consists of:
type
  Expressed as an enum flag which can be matched with tokenTypes.
value
line, lineStart
range
  Can be used to slice out the raw token content. For example, foo = "bar" will return a StringLiteral token with the value bar. Slicing out the range on the other hand will return "bar".
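The difference between value and the sliced range can be sketched as follows. The token is copied from the tokenize() example in this document. rawText and column are hypothetical helpers, and column assumes that lineStart is the character offset at which the token's line begins, which this document does not state explicitly.

```javascript
const source = 'foo = "bar"'

// Token copied from the tokenize() example in this document.
const token = { type: 2, value: 'bar', line: 1, lineStart: 0, range: [6, 11] }

// Slice the original text of a token out of the source.
function rawText (text, { range: [start, end] }) {
  return text.slice(start, end)
}

// Zero-based column of a token; assumes lineStart is the offset
// of the first character on the token's line.
function column ({ range, lineStart }) {
  return range[0] - lineStart
}

console.log(token.value) // bar
console.log(rawText(source, token)) // "bar"
console.log(column(token)) // 6
```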
import { tokenize } from 'oscript-parser'
const tokens = tokenize('foo = "bar"', { sourceType: 'script' })
// [{ type: 8, value: "foo", line: 1, lineStart: 0, range: [0, 3] },
// { type: 32, value: "=", line: 1, lineStart: 0, range: [4, 5] },
// { type: 2, value: "bar", line: 1, lineStart: 0, range: [6, 11] }]
Tokens can be consumed incrementally by an iterator:
import { startTokenization } from 'oscript-parser'
const iterator = startTokenization('foo = "bar"', { sourceType: 'script' })
iterator.next() // { value: { type: 8, value: "foo", line: 1, range: [0, 3] } }
iterator.next() // { value: { type: 32, value: "=", line: 1, range: [4, 5]} }
iterator.next() // { value: { type: 2, value: "bar", line: 1, range: [6, 11] } }
iterator.next() // { done: true }
Tools
osparse(1)
The osparse executable can be used from the shell by installing oscript-parser globally using npm:
$ npm i -g oscript-parser
$ osparse -h
Usage: osparse [option...] [file]
Options:
--[no]-tokens include lexer tokens. defaults to false
--[no]-preprocessor include preprocessor directives. defaults to false
--[no]-comments include comments. defaults to false
--[no]-whitespace include whitespace. defaults to false
--[no]-locations store location of parsed nodes. defaults to false
--[no]-ranges store start and end token ranges. defaults to false
--[no]-raw store raw identifiers & literals. defaults to false
--[no]-raw-identifiers store raw identifiers & literals. defaults to false
--[no]-raw-literals store raw identifiers & literals. defaults to false
--[no]-context show near source as error context. defaults to true
--[no]-colors enable colors in the terminal. default is auto
-D|--define <name> define a named value for preprocessor
-S|--source <type> source type is object, script (default) or dump
-O|--old-version expect an old version of OScript. defaults to false
-t|--tokenize print tokens instead of AST
-c|--compact print without indenting and whitespace
-w|--warnings consider warnings as failures too
-s|--silent suppress output
-v|--verbose print error stacktrace
-p|--performance print parsing timing
-V|--version print version number
-h|--help print usage instructions
If no file name is provided, standard input will be read. If no source type
is provided, it will be inferred from the file extension: ".os" -> object,
".e|lxe" -> script, ".osx" -> dump. The source type object will enable the
new OScript language and source type dump the old one by default.
Examples:
echo 'foo = "bar"' | osparse --no-comments -S script
osparse -t foo.os
Example usage:
$ echo "i = 0" | osparse -c -S script
{"type":"Program","body":[{"type":"AssignmentStatement",
"variables":[{"type":"Identifier","value":"i"}],
"init":[{"type":"NumericLiteral","value":0}]}]}
oslint(1)
The oslint executable can be used in the shell by installing oscript-parser globally using npm:
$ npm i -g oscript-parser
$ oslint -h
Usage: oslint [option...] [pattern ...]
Options:
--[no]-context show near source as error context. defaults to true
--[no]-colors enable colors in the terminal. default is auto
-D|--define <name> define a named value for preprocessor
-S|--source <type> source type is object, script (default) or dump
-O|--old-version expect an old version of OScript. defaults to false
-e|--errors-only print only files that failed the check
-w|--warnings consider warnings as failures too
-s|--silent suppress output
-v|--verbose print error stacktrace
-p|--performance print parsing timing
-V|--version print version number
-h|--help print usage instructions
If no file name is provided, standard input will be read. If no source type
is provided, it will be inferred from the file extension: ".os" -> object,
".e|lxe" -> script, ".osx" -> dump. The source type object will enable the
new OScript language and source type dump the old one by default.
Examples:
echo 'foo = "bar"' | oslint -S script
oslint -t foo.os
Example usage:
$ echo "i = 0" | oslint
snippet succeeded
Error Handling
If tokenizing or parsing fails, both osparse and oslint return a non-zero exit code and print the error with extra context on the console. For example, after deleting an equal sign (=) from example.os:
All output of oslint goes to standard output. For osparse, the result AST goes to standard output and error and timing information to standard error.
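A script that calls the tools can rely on the exit code and on the separation of the output streams. The following is a minimal sketch for Node.js. runTool and parseWithCli are hypothetical helpers, parseWithCli assumes that osparse is installed globally and available on PATH, and the flags -c and -S script are taken from the usage text above.

```javascript
import { spawnSync } from 'node:child_process'

// Run a command, pass `input` to its standard input and collect
// the exit status and both output streams as strings.
function runTool (command, args, input) {
  const { status, stdout, stderr, error } = spawnSync(command, args, {
    input,
    encoding: 'utf8'
  })
  if (error) throw error // the command could not be started
  return { ok: status === 0, stdout, stderr }
}

// Assumes osparse is installed globally and available on PATH.
// On success the compact AST is read from standard output, on failure
// the message printed to standard error is re-thrown.
function parseWithCli (sourceText) {
  const { ok, stdout, stderr } = runTool('osparse', ['-c', '-S', 'script'], sourceText)
  if (!ok) throw new Error(stderr)
  return JSON.parse(stdout)
}
```

On Windows, globally installed npm executables are usually command shims, so spawning them may additionally need the shell option of spawnSync.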
License
Copyright (c) 2020-2022 Ferdinand Prantl
Licensed under the MIT license.
