svelte-simple-markdown v1.0.3
svelte-simple-markdown
This is a fork of Simple-Markdown, modified to target Svelte, however due to separating the parsing and outputting steps, it can be used to target any web framework.
Philosophy
Most markdown-like parsers aim for speed or edge case handling. simple-markdown aims for extensibility and simplicity.
What does this mean? Many websites using markdown-like languages have custom extensions, such as
@mentions or issue number linking. Unfortunately, most markdown-like parsers don't allow extension
without forking, and can be difficult to modify even when forked. svelte-simple-markdown is
designed to allow simple addition of custom extensions without needing to be forked.
At Khan Academy, the original simple-markdown is used to format over half of their math exercises,
because markdown extensions for math text and interactive widgets are necessary.
On davecode.net, I use the svelte version to format custom content within my Input & Output page.
Getting started
This module is only distributed as ES Modules, so you'll want to be using SvelteKit or Vite for your codebase.
The first step is to create a parser and config object. You should do this in a file outside your svelte components, so the markdown configuration is only created once, and can be reused around your codebase.
import { createParser, defaultRules } from 'svelte-simple-markdown';
const parser = createParser(defaultRules);
export const markdownConfig = {
parser
};To add your own extensions, you just have to clone() the default rule list and add your own.
You'll find that defaultRules is a custom RuleList class which has some helper methods. In this
example, we add a basic issue number type.
import { createParser, defaultRules } from 'svelte-simple-markdown';
import IssueLink from '$lib/components/IssueLink.svelte';
const customRules = defaultRules.clone();
customRules.insertBefore('em', {
name: 'issue',
match: inlineRegex(/^#(\d+)/),
parse(capture) {
return {
number: capture[1]
};
}
});
const parser = createParser(customRules);
export const markdownConfig = {
parser,
renderers: {
issue: IssueLink
}
};The "before" of insertBefore refers to the parsing priority.
Custom Renderers are simply Svelte components given a node prop of the ast node. If your node has a content property, it is rendered into your renderer's <slot />.
<script lang="ts">
import type { ASTNode } from '$lib/core';
export let node: ASTNode;
</script>
<a sveltekit:prefetch href="/issues/{node.id}">
#{node.id}
</a>The Markdown svelte component can now display any markdown string.
<script lang="ts">
import { Markdown } from 'svelte-simple-markdown';
import { markdownConfig } from '$lib/markdown';
</script>
<Markdown config={markdownConfig} value="Hello **World**, see #54!" />Other Frameworks
You'll find non-svelte related code (all parsing and default rules) in
svelte-simple-markdown/core. To use this with other frameworks, you'll have to write your own
renderer. If you do that, please get in touch with an issue and we could try and organise this as
a multi-framework markdown project.
Extension Overview
Elements in simple-markdown are generally created from rules.
For parsing, rules must specify match and parse methods.
For output, rules must specify a react or html method
(or both), depending on which outputter you create afterwards.
Here is an example rule, a slightly modified version of what simple-markdown uses for parsing strong (bold) text:
rules.add({
name: 'strong',
match(source, state, lookbehind) {
return /^\*\*([\s\S]+?)\*\*(?!\*)/.exec(source);
},
parse(capture, recurseParse, state) {
return {
content: recurseParse(capture[1], state)
};
}
});Let's look at those methods in more detail.
match(source: string, state: ParserState)
simple-markdown calls your match function to determine whether the
upcoming markdown source matches this rule or not.
source is the upcoming source, beginning at the current position of
parsing (source0 is the next character).
state is a mutable state object to allow for more complicated matching
and parsing. The most common field on state is inline, which all of
the default rules set to true when we are in an inline scope, and false
or undefined when we are in a block scope.
state.lookbehind is the string previously captured at this parsing level, to
allow for lookbehind. For example, lists check that lookbehind ends with
/^$|\n *$/ to ensure that lists only match at the beginning of a new
line.
If this rule matches, match should return an object, array, or
array-like object, which we'll call capture, where capture[0]
is the full matched source, and any other fields can be used in the
rule's parse function. The return value from Regexp.prototype.exec
fits this requirement, and the common use case is to return the result
of someRegex.exec(source).
If this rule does not match, match should return null.
NOTE: If you are using regexes in your match function, your regex
should always begin with ^. Regexes without leading ^s can
cause unexpected output or infinite loops.
parse(capture: Capture, recurseParse: Parser, state: ParserState)
parse takes the output of match and transforms it into a syntax
tree node object, which we'll call node here.
capture is the non-null result returned from match.
recurseParse is a function that can be called on sub-content and
state to recursively parse the sub-content. This always returns an array.
state is the mutable state threading object, which can be examined
or modified, and should be passed as the second argument to any
recurseParse calls.
For example, to parse inline sub-content, you can add inline: true
to state, or inline: false to force block parsing (to leave the
parsing scope alone, you can just pass state with no modifications).
For example:
const innerText = capture[1];
recurseParse(
innerText,
_.defaults(
{
inline: true
},
state
)
);parse should return a node object, which can have custom fields
that will be passed to output, below. The one reserved field is
type, which designates the type of the node, which will be used
for output. If no type is specified, simple-markdown will use the
current rule's type (the common case). If you have multiple ways
to parse a single element, it can be useful to have multiple rules
that all return nodes of the same type.