syntax-source v0.4.3
Syntax Source
Generate high-resolution and complete syntax definitions for languages by feeding this API a set of YAML input files that mimic and extend the Sublime Text 3 Syntax Definition Schema.
Examples
API Documentation
Example build script:
#!/usr/bin/env node
const syntax_source = require('syntax-source');
const export_sublime_syntax = syntax_source.export['sublime-syntax'];
(async() => {
// load syntax source from a path string
let k_syntax = await syntax_source.transform({
path: process.argv[2],
// optional custom exstensions
extensions: {
_customKey: (k_context, k_rule, s_version) => {
let s_ext = k_context.syntax.ext;
// remove source rule from context
let i_rule = k_context.drop(k_rule);
// create new rule
k_context.insert(i_rule++, {
match: `(myCustomRegex)(:)`,
captures: [
`keyword.other.word.my-custom-regex.SYNTAX`,
`punctuation.separator.custom.colon.SYNTAX`,
],
pop: true,
});
return i_rule;
},
},
});
// write to stdout
process.stdout.write(export_sublime_syntax(k_syntax, {
// optional post-processing transform
post: g_yaml => ({
...g_yaml,
name: `${g_yaml.name} (MyPackage)`,
}),
}));
})().catch((e_compile) => {
console.error(e_compile.stack);
process.exit(1);
});Documentation for Input Files
The input files closely follow the .sublime-syntax format (an extension of YAML), but the extension should be .syntax-source so the highlighting works and so that Sublime Text 3 does not try loading the source files as syntax definitions.
Primer
This API operates under the assumption that every context intends to match the full range of tokens expected at that state in the grammar. If none of the rules in a given context are matched by the input, an implcitly generated 'catch-all' rule at the end will mark the text invalid (via the invalid.illega.token.expected.CONTEXT.SYNTAX scope) and pop the context from the stack (equivalent to the throw alias)
Top Level Key Extensions
In addition to the regular name, file_extensions, scope, etc., you can also use the following keys at the root of the definition structure:
extends:filename- inherit all the variables and contexts defined in the given.syntax-source, overriding any duplicate keys.
Context Quantifiers
When using the set or push actions, you can essentially quantify a context (which creates a new ad-hoc context) by appending one of the following characters to the target name:
?- existential quantifier: include the given context and then pop if none of its rules matched.*- zero or more quantifier: if the context's lookahead matches, repeatedly push it to the stack until it matches no more; then/otherwise pop.^- exactly one quantifier: will throw if the given context does not match.+- one or more quantifier: essentially same as- goto: [CONTEXT*, CONTEXT^]
Global Substitutions
Anytime the following placeholder text appears in a scope name, it will be automatically substituted:
SYNTAX- the syntax id (without thescope./text./markup.prefix)
Rule Aliases
Instead of providing a mapping value for each item in the content of a context, you can use a rule alias to shortcut common patterns. The value must be a single string given by one of the following values:
alone
Declare this context stands alone, i.e., it does not inherit the prototype context. Equivalent to:
- meta_include_prototype: falsebail
If none of the previous rules match, pop this context from the stack. Equivalent to:
- include: _OTHERWISE_POPcontinue
Match a single character at a time to keep it in this context. Equivalent to:
- match: '.'throw
If none of the previous rules match, mark the text invalid and pop this context from the stack. Equivalent to:
- match: '{{_SOMETHING}}'
scope: invalid.illegal.token.expected.CONTEXT.SYNTAX
pop: trueretry
If none of the previous rules match, mark the text invalid but do not pop this context from the stack. Equivalent to:
- match: '{{_SOMETHING}}'
scope: invalid.illegal.token.expected.CONTEXT.SYNTAXKey Extensions
Instead of, or in addition to, using the regular match, include, push, set, etc., the following keys within the content of a context have the following effects:
goto[.set|push] : context | array<context>
If this rule is reached, it will immediately change state to the given context(s).
Modifiers:
.set- DEFAULT: use thesetaction to change state..push- use thepushaction to change state.
Example:
contexts:
main:
- match: 'hi'
- goto: [root_MORE, root]
root:
- match: 'you'
scope: keyword.you
- match: 'world'
scope: keyword.world
- goto.push: other
root_MORE:
- match: ','
scope: punctuation.separator
set: mainUsing the Sublime Syntax exporter yields:
variables:
_ANYTHING_LOOKAHEAD: '(?=[\S\s])'
contexts:
main:
- match: 'hi'
scope: keyword.hi
- match: '{{_ANYTHING_LOOKAHEAD}}'
set: [root_MORE, root]
root:
- match: 'you'
scope: keyword.you
- match: 'world'
scope: keyword.world
- match: '{{_ANYTHING_LOOKAHEAD}}'
push: other
root_MORE:
- match: ','
scope: punctuation.separator
set: mainswitch[.set|push] : array<context | mappings>
Generate a series of rules where each one attempts to match a lookahead regex and consequently change state to the given context.
Modifiers:
.set- DEFAULT: use thesetaction to change state..push- use thepushaction to change state.
Example:
contexts:
main:
- switch:
- hello
- world
- other: worldUsing the Sublime Syntax exporter yields:
variables:
_ANYTHING_LOOKAHEAD: '(?=[\S\s])'
contexts:
main:
- match: '{{hello_LOOKAHEAD}}'
set: hello
- match: '{{world_LOOKAHEAD}}'
set: world
- match: '{{other_LOOKAHEAD}}'
set: worldword[.CASE] : text | array<text>
words[.CASE] : text | array<text>
Generate a series of rules where each one attempts to match a case-sensitive (or insensitive) variation of the given text.
This rule automatically adds the regex 'WORD{{_WORD_BOUNDARY}}' to the context's lookahead pattern variable, where WORD is the supplied text value.
Keep in mind that these lookaheads will not uselessly bloat your output because any unused variables are automatically removed during compilation, much like dead code removal. You can override the generated lookahead by adding a
- lookahead: 'regex'rule to the context.
CASE (Modifiers):
.auto- DEFAULT: matches the best fit using either.mixedor.cameldepending on the case of the text. If all characters given are lower case, it will use.mixed, otherwise it will use.camel..mixed- attempts matches the following cases in order:lower,upper,properormixed..camel- attempts matches the following cases in order:camel,pascal,lower,upper,properormixed..lower_camel- will only matchcamel..pascal- will only matchpascal..lower- will only matchlower..upper- will only matchupper..exact- enables case-sensitivity.
Scope Substitutions: Scopes attached to this rule may use the following placeholder substitutions:
WORD- the word that was matched in lower case.CASE- will be replaced by one of the following values depending on the text that was matched:lower- e.g., strmatchesupper- e.g., STRMATCHESproper- e.g., Strmatchesmixed- e.g., sTrmAtChEscamel- e.g., strMatchespascal- e.g., StrMatchesexact- only if the.exactmodifier was used
Optional Keys:
The generated rule will add the default scope keyword.operator.word.TYPE.WORD.SYNTAX, where TYPE defaults to empty but can be overriden using the type key (or the whole scope overriden using the scope key -- see below).
The generated rule will also automatically add the supplementary scope meta.case.CASE.SYNTAX to the matched text, where CASE is given by one the values listed above in Scope Substitutions.
The following optional keys can be used:
type:text- provides some scope subname to use forTYPEin the default scope that is applied. Has no effect ifscopeis used.scope:scope- overrides the default scope.boundary:regex- overrides the default boundary lookahead regex (_WORD_BOUNDARY) to match at the end of the keyword.
Example:
contexts:
boolean:
- words:
- 'true'
- 'false'
scope: constant.language.boolean.WORD.SYNTAX
pop: trueUsing the Sublime Syntax exporter yields:
contexts:
bolean:
- match: 'TRUE(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.true.rq meta.case.upper.rq
pop: true
- match: 'true(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.true.rq meta.case.lower.rq
pop: true
- match: 'True(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.true.rq meta.case.proper.rq
pop: true
- match: '(?i)true(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.true.rq meta.case.mixed.rq
pop: true
- match: 'FALSE(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.false.rq meta.case.upper.rq
pop: true
- match: 'false(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.false.rq meta.case.lower.rq
pop: true
- match: 'False(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.false.rq meta.case.proper.rq
pop: true
- match: '(?i)false(?={{_WORD_BOUNDARY}})'
scope: constant.language.boolean.false.rq meta.case.mixed.rq
pop: truemask : scope
Before changing state, push a context to the stack that will apply the given scope to all contexts put on the stack by some action. This is useful when you want to apply different scopes to a token depending on where it is in the grammar while reusing the same context to match the token itself. Can only be used in combination with a rule that has a set or push action.
Example:
contexts:
predicate:
- goto: namedNode
mask: meta.term.role.predicate.SYNTAX
object:
- goto: namedNode
mask: meta.term.role.object.SYNTAXUsing the Sublime Syntax exporter yields:
contexts:
predicate:
- match: '{{_ANYTHING_LOOKAHEAD}}'
set: [meta_term_role_predicate__MASK, namedNode]
meta_term_role_predicate__MASK:
- meta_include_prototype: false
- meta_content_scope: meta.term.role.predicate.rq
- match: '{{_ANYTHING_LOOKAHEAD}}'
pop: true
object:
- match: '{{_ANYTHING_LOOKAHEAD}}'
set: [meta_term_role_object__MASK, namedNode]
meta_term_role_object__MASK:
- meta_include_prototype: false
- meta_content_scope: meta.term.role.object.rq
- match: '{{_ANYTHING_LOOKAHEAD}}'
pop: trueadd[.back|front] : scope
Rather than replacing the generated scope, add the given scope to the output either at the .back (binds tigther to the token) or at the .front (binds looser to the token).
open[.SYMBOL] : subscope
close[.SYMBOL] : subscope
Generates a rule that matches either the opening of closing of the given SYMBOL, and sets the scope according to the following table.
SYMBOL and it's corresponding open/close character, respectively, followed by the scope it applies. SIDE is either begin or end accordingly:
paren-()-punctuation.definition.SUBSCOPE.SIDE.SYNTAXbrace-{}-punctuation.section.SUBSCOPE.SIDE.SYNTAXbracket-[]-punctuation.definition.SUBSCOPE.SIDE.SYNTAXtag-<>-punctuation.definition.SUBSCOPE.SIDE.SYNTAXirk-''-punctuation.definition.string.SIDE.SUBSCOPE.SYNTAXdirk-""-punctuation.definition.string.SIDE.SUBSCOPE.SYNTAX
lookahead[s] : string | array<string>
Override the generated lookahead regular expression.