mdstream v0.1.1
Streaming Markdown
Experiment making a streaming markdown parser à la ChatGPT.
| Version | Size (gzip) |
|---|---|
| Full package | 16kB |
| browser.js | 2.5kB |
INTERNAL FORK
Original: https://github.com/thetarnav/streaming-markdown
Original author and all parsing logic: @thetarnav
Installation
Install the mdstream package from npm.
bun i mdstream
Or use the CDN link. The browser export is a minified (2.5kB gzipped) version of the package, with only the parser methods and the DOM renderer exported.
<script type="module">
import { parse, finish, createParser, createDOMRenderer } from "https://cdn.jsdelivr.net/npm/mdstream/dist/browser.js"
// ...
</script>
The package uses ES module exports, so you need to use type="module" in your script tag. See usage below.
Using the parser
First, create a new markdown Parser by calling the createParser function. Its single argument is a Renderer object, which is an interface for rendering the parsed markdown tokens (for example, to the DOM). The built-in renderers are:
- DOMRenderer, which renders by appending to the DOM from a browser client script;
- HTMLRenderer, which renders to raw HTML;
- ANSIRenderer, which renders to ANSI-styled text using chalk; and
- LogRenderer, which prints the internal parser methods as they're called.
See Examples below.
parse function
Then, you can start streaming markdown to the Parser by calling the parse() function with a chunk of markdown text.
parse(parser, "# Streaming Markdown\n\n")
You can call parse() as many times as you need to stream the markdown.
The parser is optimistic. When it sees the start of an inline code span or a code block, it immediately styles the element accordingly.
E.g. `print("hello wor should be rendered as <code>print("hello wor</code>
While the text is streamed in, the user should be able to select the text that has already been streamed in and copy it.
(The parser is only adding new elements to the DOM, not modifying the existing ones.)
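To make the optimistic, append-only idea concrete, here is a toy sketch. It handles a single construct (inline code delimited by one backtick) and only ever pushes to an output array. The name createOptimisticWriter is hypothetical and this is not how mdstream parses markdown; it only illustrates the behaviour described above.

```javascript
// Toy illustration only: NOT the mdstream implementation.
// `createOptimisticWriter` is a hypothetical name used for this sketch.
function createOptimisticWriter(out) {
  let inCode = false
  return {
    write(chunk) {
      for (const ch of chunk) {
        if (ch === "`") {
          // Open or close the element the moment the delimiter is seen,
          // without waiting for the matching backtick.
          out.push(inCode ? "</code>" : "<code>")
          inCode = !inCode
        } else {
          // Output is append-only: earlier entries are never modified.
          out.push(ch)
        }
      }
    },
    end() {
      // Close anything still open when the stream finishes.
      if (inCode) out.push("</code>")
      inCode = false
    },
  }
}

const out = []
const writer = createOptimisticWriter(out)
writer.write('`print("hello wor')
// out.join("") is now '<code>print("hello wor' (already styled as code)
writer.write('ld")`')
writer.end()
```

Because earlier output is never rewritten, anything already rendered stays stable while more text arrives.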
finish function
Finally, you can end the stream by calling the finish() function. It flushes the remaining markdown and resets the Parser state.
finish(parser)
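The create, parse, finish sequence can be wrapped in a small helper. The sketch below is one possible shape, not part of mdstream: streamMarkdown is a hypothetical name, and parse and finish are passed in as arguments so the block does not assume how the package is imported. Calling finish in a finally block is a design choice made here so the parser is flushed even if the chunk source fails.

```javascript
// Hypothetical helper, not part of mdstream. `parse` and `finish` are
// injected so the sketch stays independent of the import style.
async function streamMarkdown(chunks, parser, { parse, finish }) {
  try {
    // `for await` accepts both async iterables and plain arrays.
    for await (const chunk of chunks) {
      parse(parser, chunk)
    }
  } finally {
    // Flush the remaining markdown even if the chunk source throws midway.
    finish(parser)
  }
}
```

With the real functions in scope, usage would look like `await streamMarkdown(chunks, parser, { parse, finish })`, where chunks is any iterable or async iterable of strings.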
Working with streams: ReadableStream<Uint8Array>
To transform a ReadableStream, use MarkdownStream to create a TransformStream<Uint8Array, Uint8Array>. The built-in renderers already come with renderer transforms (except the DOM renderer, which manipulates the DOM at runtime):
- MarkdownHTMLStream
- MarkdownANSIStream
- MarkdownLogStream
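A byte-to-byte transform has to decode text across chunk boundaries, since a multi-byte UTF-8 character can be split between two chunks. The following sketch shows one way to do that with the standard TextDecoder, TextEncoder, and TransformStream globals. createTextTransform is a hypothetical helper and the string transform passed to it is a stand-in; nothing here is claimed to be how MarkdownStream is implemented.

```javascript
// Hypothetical helper, not part of mdstream: wraps a string -> string
// function in a TransformStream<Uint8Array, Uint8Array>.
function createTextTransform(transformText) {
  const decoder = new TextDecoder()
  const encoder = new TextEncoder()
  return new TransformStream({
    transform(bytes, controller) {
      // { stream: true } holds back an incomplete multi-byte sequence until
      // the next chunk arrives instead of emitting a replacement character.
      const text = decoder.decode(bytes, { stream: true })
      if (text) controller.enqueue(encoder.encode(transformText(text)))
    },
    flush(controller) {
      // Emit whatever the decoder is still holding at end of input.
      const rest = decoder.decode()
      if (rest) controller.enqueue(encoder.encode(transformText(rest)))
    },
  })
}
```

A stream built this way can be used wherever a TransformStream is accepted, for example `readable.pipeThrough(createTextTransform((s) => s.toUpperCase()))`.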
Render to DOM using parser
import { parse, finish, createParser, createDOMRenderer } from "mdstream"

const response = await fetch("readme.md")
const source = await response.text()

const container = document.getElementById("markdown")
const renderer = createDOMRenderer(container)
const parser = createParser(renderer)

// Simulate a stream: feed the parser random-sized chunks at random intervals.
let i = 0
while (i < source.length) {
  const length = Math.floor(Math.random() * 20) + 1
  const delay = Math.floor(Math.random() * 80) + 10
  const chunk = source.slice(i, i += length)
  await new Promise(resolve => setTimeout(resolve, delay))
  parse(parser, chunk)
}

// Flush any remaining markdown and reset the parser.
finish(parser)
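The random chunking in the loop above can also be factored into a generator, which makes it easy to reuse when testing a renderer. This is a minimal sketch; randomChunks and its maxLength parameter are hypothetical names, not part of mdstream.

```javascript
// Hypothetical helper: yields consecutive slices of `source`, each between
// 1 and `maxLength` characters long (the final slice may be shorter).
function* randomChunks(source, maxLength = 20) {
  let i = 0
  while (i < source.length) {
    const length = Math.floor(Math.random() * maxLength) + 1
    yield source.slice(i, (i += length))
  }
}
```

Each yielded chunk can then be passed to parse(parser, chunk), with the delay added by the caller if needed.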
Streams: Rendering to HTML (server-side)
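// Assumption: MarkdownHTMLStream is exported from the package root.
import { MarkdownHTMLStream } from "mdstream"

const readme = Bun.file("readme.md")
const response = new Response(
  readme.stream().pipeThrough(new MarkdownHTMLStream())
)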
Streams: Rendering to DOM (client-side)
const container = document.getElementById("markdown")
const response = await fetch("readme.md")
// Drain the stream into a no-op sink so the pipeline runs to completion.
await response.body
  .pipeThrough(new MarkdownDOMStream(container))
  .pipeTo(new WritableStream())
Extending
MarkdownStream can wrap any renderer. For instance, MarkdownLogStream creates the parser and returns it for its parent to use:
export class MarkdownLogStream extends MarkdownStream {
  constructor() {
    const ENCODER = new TextEncoder()
    super({
      start: (controller) => {
        const renderer = createLogRenderer({
          render: (chunk) => controller.enqueue(ENCODER.encode(chunk)),
        })
        return createParser(renderer)
      }
    })
  }
}
Examples
Render to DOM with DOMRenderer
Displaying this README as a demo with delayed chunks:
<script type="module">
  import { parse, finish, createParser, createDOMRenderer } from "mdstream"

  const response = await fetch("readme.md")
  const source = await response.text()

  const container = document.getElementById("markdown")
  const renderer = createDOMRenderer(container)
  const parser = createParser(renderer)

  let i = 0
  while (i < source.length) {
    const length = Math.floor(Math.random() * 20) + 1
    const delay = Math.floor(Math.random() * 80) + 10
    const chunk = source.slice(i, i += length)
    await new Promise(resolve => setTimeout(resolve, delay))
    parse(parser, chunk)
  }

  finish(parser)
</script>
Testing
Blockquotes can span multiple lines. Another line.
Markdown features
- Paragraphs
- Line breaks
- don't end tokens
- Escaping line breaks
- Trim unnecessary spaces
- Headers
- Alternate syntax (not planned)
- Code Block with indent
- Code Block with triple backticks
- language attr
- with many backticks
- `inline code` with backticks
- with many backticks
- trim spaces ` code `
- italic with single asterisks
- Bold with double asterisks
- italic with underscores
- Bold with double underscores
- Special cases:
- boldbold>em
- bold>embold
- *emem>bold*
- *bold>emem*
- * or _ cannot be surrounded by spaces
- Strikethrough example
- Escape characters (e.g. * or _ with \* or \_)
- [Link](url)
- href attr
- Raw URLs
- http://example.com
- https://example.com
- www.example.com
- example@fake.com
- example@fake.com
- Autolinks
- www.example.com
- http://example.com
- https://example.com
- example@fake.com
- example@fake.com
- Reference-style Links
- Images
- src attr
- Horizontal rules
- With ---
- With ***
- With ___
- With
- Unordered lists
- Ordered lists
- start attr
- Task lists
- Nested lists
- One-line nested lists
- Adding Elements in Lists
- Blockquotes
- Tables
- Subscript
- Superscript
- Emoji Shortcodes
- Html tags (e.g. <div>, <span>, <a>, <img>, etc.)