remark-tree-sitter v1.0.3
remark-tree-sitter
Highlight code in Markdown files using tree-sitter and remark. Powered by tree-sitter-hast.
Installation
npm install remark-tree-sitter
or
yarn add remark-tree-sitter
Usage
This plugin uses the same mechanism and data as Atom for syntax highlighting, so to highlight a particular language you need to either:
- Install the APM (Atom) package for that language and tell remark-tree-sitter to import it, using the grammarPackages option. (See Atom language packages)
- Provide the tree-sitter grammar and scopeMappings manually, using the grammars option.
For more information on how this mechanism works,
check out the documentation for tree-sitter-hast.
Code blocks whose language has no matching grammar are ignored and left unchanged.
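As a rough illustration of that last rule, the sketch below partitions the code nodes of a Markdown syntax tree by whether a grammar is registered for their language. The `splitCodeNodes` helper and the hand-written tree are hypothetical and are not part of remark-tree-sitter; only the `type`, `lang` and `value` fields follow the shape of an mdast `code` node.

```javascript
// Hypothetical helper, not the plugin's real implementation.
// `tree` is an mdast-like root; `languages` is the set of language
// names for which a grammar has been loaded.
// For brevity this only looks at top-level children of the root.
function splitCodeNodes(tree, languages) {
  const highlighted = []
  const ignored = []
  for (const node of tree.children) {
    if (node.type !== 'code') continue
    // Blocks without a language, or with an unknown one, are left untouched.
    if (node.lang && languages.has(node.lang)) {
      highlighted.push(node)
    } else {
      ignored.push(node)
    }
  }
  return { highlighted, ignored }
}

const tree = {
  type: 'root',
  children: [
    { type: 'code', lang: 'typescript', value: 'let foo = 1;' },
    { type: 'code', lang: 'python', value: 'foo = 1' },
    { type: 'code', lang: null, value: 'plain text' }
  ]
}

const result = splitCodeNodes(tree, new Set(['typescript']))
console.log(result.highlighted.map(n => n.lang)) // [ 'typescript' ]
console.log(result.ignored.map(n => n.lang)) // [ 'python', null ]
```

Here only the `typescript` block would be highlighted; the other two would pass through as ordinary code blocks.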
Example
The following example is also in the examples directory
and can be run directly from there.
It uses @atom-languages/language-typescript to provide the TypeScript grammar.
npm install to-vfile vfile-reporter remark remark-tree-sitter remark-html @atom-languages/language-typescript
const vfile = require('to-vfile')
const report = require('vfile-reporter')
const remark = require('remark')
const treeSitter = require('remark-tree-sitter')
const html = require('remark-html')
remark()
  .use(treeSitter, {
    grammarPackages: ['@atom-languages/language-typescript']
  })
  .use(html)
  .process(vfile.readSync('example.md'), (err, file) => {
    console.error(report(err || file))
    console.log(String(file))
  })
Output:
example.md: no issues found
<pre><code class="tree-sitter language-typescript"><span class="source ts"><span class="storage type function">function</span> <span class="entity name function">foo</span><span class="punctuation definition parameters begin bracket round">(</span><span class="punctuation definition parameters end bracket round">)</span> <span class="punctuation definition function body begin bracket curly">{</span>
<span class="keyword control">return</span> <span class="constant numeric">1</span><span class="punctuation terminator statement semicolon">;</span>
<span class="punctuation definition function body end bracket curly">}</span></span></code></pre>
Atom language packages
To use an Atom language package,
you first need to install it, like any other package, using npm install or yarn add.
Unfortunately, most APM packages are not available on NPM,
so I've started to make some of them available under the NPM organization
@atom-languages.
Here is a list of the packages and the languages each provides highlighting for:
- @atom-languages/language-typescript: typescript, tsx (TypeScriptReact), flow
API
remark.use(treeSitter, options)
Note that options is required, and either grammarPackages or grammars must be provided. (Both can be provided, in which case grammars specified in grammars will override those loaded from grammarPackages.)
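The two rules above (at least one option is required, and grammars wins over grammarPackages) can be sketched roughly as follows. `resolveGrammars` and the `loadPackage` callback are hypothetical names used only for illustration, and the grammar values are placeholder strings; the plugin's real validation and loading code may look quite different.

```javascript
// Hypothetical sketch of the rules described above; the real plugin
// may validate and merge its options differently.
function resolveGrammars(options, loadPackage) {
  if (!options || (!options.grammarPackages && !options.grammars)) {
    throw new Error('Either grammarPackages or grammars must be provided')
  }
  const fromPackages = {}
  for (const name of options.grammarPackages || []) {
    // `loadPackage` is an assumed callback returning { language: config }.
    Object.assign(fromPackages, loadPackage(name))
  }
  // Later spreads win, so entries in `grammars` take precedence.
  return { ...fromPackages, ...(options.grammars || {}) }
}

// Stand-in loader with placeholder strings instead of real grammars.
const fakeLoader = () => ({
  typescript: { grammar: 'from-package' },
  tsx: { grammar: 'from-package' }
})

const resolved = resolveGrammars(
  {
    grammarPackages: ['@atom-languages/language-typescript'],
    grammars: { typescript: { grammar: 'manual' } }
  },
  fakeLoader
)
console.log(resolved.typescript.grammar) // manual
console.log(resolved.tsx.grammar) // from-package
```

In this sketch the manually supplied `typescript` entry replaces the one from the package, while `tsx` is still taken from the package.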
options.grammarPackages
An array of all Atom language packages that should be loaded.
Example:
remark().use(treeSitter, {
  grammarPackages: ['@atom-languages/language-typescript']
})
The language names that code blocks must then use
to refer to a language are based on the filenames in the Atom package.
For example the above package
has the files:
tree-sitter-flow.cson, tree-sitter-tsx.cson, tree-sitter-typescript.cson...
so this will make the languages flow, tsx and typescript
available for use within code blocks.
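The filename-to-language mapping described above could be approximated like this. The `languageNameFromFile` helper is hypothetical, and the naming pattern is inferred from the three filenames listed; it is an assumption, not the plugin's actual lookup code.

```javascript
// Hypothetical helper: strips the `tree-sitter-` prefix and `.cson`
// extension, returning null for files that do not match the pattern.
function languageNameFromFile(filename) {
  const match = /^tree-sitter-(.+)\.cson$/.exec(filename)
  return match ? match[1] : null
}

const files = [
  'tree-sitter-flow.cson',
  'tree-sitter-tsx.cson',
  'tree-sitter-typescript.cson',
  'package.json' // not a grammar file, so it is skipped
]
const languages = files.map(languageNameFromFile).filter(Boolean)
console.log(languages) // [ 'flow', 'tsx', 'typescript' ]
```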
If you want to make loaded languages available to use via different names,
you can use options.languageAliases.
options.grammars
An object mapping language keys to objects containing grammar and scopeMappings.
Anything specified here will overwrite the languages loaded by options.grammarPackages.
For more information on scopeMappings, check out the documentation for tree-sitter-hast.
Example:
See a working example at examples/example-grammars.js.
remark().use(treeSitter, {
  grammars: {
    typescript: {
      grammar: typescriptGrammar,
      scopeMappings: typescriptScopeMappings
    },
    'custom-language': {
      grammar: customLanguageGrammar,
      scopeMappings: customLanguageScopeMappings
    }
  }
})
You can then use both the typescript and custom-language languages in code blocks:
```custom-language
some code
```
```typescript
let foo = 'bar';
```
If you want to make loaded languages available to use via different names,
you can use options.languageAliases.
options.classWhitelist
Sometimes including the full list of classes applied by the scope mappings can be too much, and you'd like to only include those that you have stylesheets for.
To do this, you can pass in a whitelist of classes that you actually care about.
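The filtering idea can be sketched in a few lines. `filterClasses` is a hypothetical helper written for illustration and is not the plugin's internal code; it only shows how a scope's class list would shrink under a whitelist.

```javascript
// Hypothetical helper illustrating the filtering; not the plugin's code.
function filterClasses(classNames, whitelist) {
  const allowed = new Set(whitelist)
  // Keep only the classes that appear in the whitelist, preserving order.
  return classNames.filter(name => allowed.has(name))
}

const whitelist = ['storage', 'numeric']
console.log(filterClasses(['storage', 'type', 'function'], whitelist)) // [ 'storage' ]
console.log(filterClasses(['constant', 'numeric'], whitelist)) // [ 'numeric' ]
console.log(filterClasses(['keyword', 'control'], whitelist)) // []
```

A scope whose class list ends up empty, such as `keyword control` here, has nothing left to style.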
Example: The following configuration...
remark().use(treeSitter, {
  grammarPackages: ['@atom-languages/language-typescript'],
  classWhitelist: ['storage', 'numeric']
})
...will convert the following markdown...
```typescript
function foo() {
  return 1;
}
```
...to this:
<pre><code class="tree-sitter language-typescript"><span><span class="storage">function</span> foo() {
return <span class="numeric">1</span>;
}</span></code></pre>
options.languageAliases
TODO: options.languageAliases is not implemented yet
TODO:
- Add unit tests for the grammars option