1.0.1 • Published 8 months ago
markforge v1.0.1
Markforge
A TypeScript library for converting HTML into clean, readable Markdown, built with LLMs and AI agents in mind. Markforge is a fork of the Turndown library, rewritten in modern TypeScript.
Features
- LLM-Optimized Output: Produces clean, consistent Markdown suited to LLMs and AI agents
- Modern TypeScript: Full type safety and ES module support
- DOM-Independent: Works in any Node.js environment without requiring a browser DOM
- Built-in GFM: GitHub Flavored Markdown support included by default
- Zero Runtime DOM: Uses virtual DOM for parsing, keeping your bundle size small
- Tree-Shakeable: Import only what you need
- Functional Core: Built with functional programming patterns
Installation
npm install markforge
# or
yarn add markforge
# or
pnpm add markforge
Usage
Basic Usage
import markforge from 'markforge'
const html = '<h1>Hello, World!</h1>'
const markdown = markforge.toMarkdown(html)
console.log(markdown) // # Hello, World!
LLM/AI Agent Usage
Useful for cleaning up HTML content before sending it to an LLM:
import Markforge from 'markforge'
// Placeholder for whichever LLM client you use; `llm` is not part of Markforge
declare const llm: {
  complete(options: { prompt: string }): Promise<string>
}
// Example: Processing HTML content for an LLM
async function processForLLM(htmlContent: string) {
  const markforge = new Markforge()
  const markdown = markforge.toMarkdown(htmlContent)
  // The output is clean, consistent Markdown that LLMs can easily understand
  const llmResponse = await llm.complete({
    prompt: markdown,
    // ... other options
  })
  return llmResponse
}
Custom Instance
import Markforge from 'markforge'
const service = new Markforge({
  headingStyle: 'atx',
  codeBlockStyle: 'fenced',
  emDelimiter: '*',
  strongDelimiter: '**'
})
const markdown = service.toMarkdown('<h1>Custom Options</h1>')
GFM Features
Built-in support for GitHub Flavored Markdown:
// Tables
const table = `
<table>
  <thead>
    <tr>
      <th>Feature</th>
      <th>Support</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Tables</td>
      <td>✅</td>
    </tr>
  </tbody>
</table>
`
markforge.toMarkdown(table)
// | Feature | Support |
// | ------- | ------- |
// | Tables | ✅ |
// Task Lists
markforge.toMarkdown('<li><input type="checkbox" checked> Task</li>') // - [x] Task
// Strikethrough
markforge.toMarkdown('<del>removed</del>') // ~~removed~~
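To make the table output format concrete, here is a minimal, self-contained sketch of how a header and rows can be laid out as a GFM pipe table. `toGfmTable` is a hypothetical helper written for this illustration; it is not part of Markforge, and the column padding Markforge actually emits may differ.

```typescript
// Hypothetical helper for illustration only; not a Markforge API.
function toGfmTable(header: string[], rows: string[][]): string {
  const all = [header, ...rows]
  // Each column is as wide as its longest cell, with a minimum of 3
  // so the separator row always has at least three dashes.
  const widths = header.map((_, col) =>
    Math.max(3, ...all.map((row) => (row[col] ?? '').length)),
  )
  // Pad every cell to its column width and join with pipes.
  const line = (cells: string[]): string =>
    '| ' + widths.map((w, col) => (cells[col] ?? '').padEnd(w)).join(' | ') + ' |'
  const separator = '| ' + widths.map((w) => '-'.repeat(w)).join(' | ') + ' |'
  return [line(header), separator, ...rows.map(line)].join('\n')
}

console.log(toGfmTable(['Feature', 'Support'], [['Tables', 'yes']]))
// | Feature | Support |
// | ------- | ------- |
// | Tables  | yes     |
```

The separator row of dashes is what marks the first row as a header in GFM; the padding is purely cosmetic.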
Custom Rules
import type { Rule } from 'markforge'
const customRule: Rule = {
  filter: 'span',
  replacement: (content) => `{${content}}`
}
markforge.addRule('customSpan', customRule)
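One way to picture how a rule's filter and replacement fit together is the toy converter below. It assumes, as in Turndown, that `replacement` receives the already-converted content of the element's children. `SketchNode`, `SketchRule`, and `convert` are hypothetical names for this sketch and are not Markforge's internals.

```typescript
// Hypothetical stand-ins for illustration; these are NOT Markforge's real types.
interface SketchNode {
  tag: string // element name, or '#text' for text nodes
  text?: string // only set on text nodes
  children?: SketchNode[]
}

interface SketchRule {
  filter: string // tag name this rule applies to
  replacement: (content: string) => string // receives the converted children
}

// Convert children first, then apply the first rule whose filter
// matches the node's tag. Nodes without a matching rule pass their
// content through unchanged.
function convert(node: SketchNode, rules: SketchRule[]): string {
  if (node.tag === '#text') return node.text ?? ''
  const content = (node.children ?? []).map((child) => convert(child, rules)).join('')
  const rule = rules.find((r) => r.filter === node.tag)
  return rule ? rule.replacement(content) : content
}

const sketchRules: SketchRule[] = [
  { filter: 'span', replacement: (content) => `{${content}}` },
  { filter: 'strong', replacement: (content) => `**${content}**` },
]

// Represents <p>Hello <span><strong>world</strong></span></p>
const sampleTree: SketchNode = {
  tag: 'p',
  children: [
    { tag: '#text', text: 'Hello ' },
    { tag: 'span', children: [{ tag: 'strong', children: [{ tag: '#text', text: 'world' }] }] },
  ],
}

console.log(convert(sampleTree, sketchRules)) // Hello {**world**}
```

Because children are converted before their parent, the `span` rule sees `**world**` rather than raw HTML.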
Why Markforge?
For LLM/AI Applications
- Clean Output: Generates consistent, well-formatted Markdown that LLMs can easily process
- Reliable Parsing: Handles messy HTML input gracefully
- Semantic Preservation: Maintains document structure and meaning
- Lightweight: No heavy DOM dependencies that could bloat your AI application
For Modern Applications
- TypeScript First: Built from the ground up with TypeScript
- Modern Bundle: ES modules with tree-shaking support
- Minimal Dependencies: Only uses domino for HTML parsing
- Framework Agnostic: Works anywhere JavaScript runs
API
MarkforgeOptions
interface MarkforgeOptions {
  headingStyle?: 'setext' | 'atx'
  hr?: string
  bulletListMarker?: '*' | '+' | '-'
  codeBlockStyle?: 'indented' | 'fenced'
  fence?: '```' | '~~~'
  emDelimiter?: '_' | '*'
  strongDelimiter?: '__' | '**'
  linkStyle?: 'inlined' | 'referenced'
  linkReferenceStyle?: 'full' | 'collapsed' | 'shortcut'
  br?: string
  preformattedCode?: boolean
}
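For readers unfamiliar with the Markdown variants these options select, the sketch below shows what the `headingStyle` values and the emphasis delimiters look like in output. `renderHeading` and `wrap` are hypothetical helpers that only illustrate the syntax; they do not reproduce how Markforge renders internally.

```typescript
type HeadingStyle = 'setext' | 'atx'

// Render a heading in the requested style. Setext syntax only exists for
// levels 1 and 2, so deeper levels fall back to ATX in this sketch.
function renderHeading(text: string, level: number, style: HeadingStyle): string {
  if (style === 'setext' && level <= 2) {
    const underline = (level === 1 ? '=' : '-').repeat(text.length)
    return `${text}\n${underline}`
  }
  return `${'#'.repeat(level)} ${text}`
}

// Wrap text in the chosen emphasis or strong delimiter.
function wrap(text: string, delimiter: '_' | '*' | '__' | '**'): string {
  return `${delimiter}${text}${delimiter}`
}

console.log(renderHeading('Title', 1, 'atx')) // # Title
console.log(renderHeading('Title', 2, 'setext')) // Title, then ----- on the next line
console.log(wrap('note', '_')) // _note_
console.log(wrap('note', '**')) // **note**
```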
Methods
toMarkdown(input: string | Node): string
- Convert HTML to Markdown
use(plugin: Plugin | Plugin[]): MarkforgeService
- Use plugins
addRule(key: string, rule: Rule): MarkforgeService
- Add custom rules
keep(filter: string | string[] | Function): MarkforgeService
- Keep elements as HTML
remove(filter: string | string[] | Function): MarkforgeService
- Remove elements
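Since the configuration methods are listed as returning `MarkforgeService`, they can presumably be chained. The sketch below shows that pattern against a local stand-in interface that mirrors the listed signatures, so it runs without the package installed. `ServiceLike`, `convertArticle`, and `createStub` are hypothetical names, and whether the real methods return the same instance is an assumption.

```typescript
// Local stand-in mirroring the method signatures listed above. With the real
// package you would pass the instance created by `new Markforge()` instead.
interface ServiceLike {
  keep(filter: string | string[]): ServiceLike
  remove(filter: string | string[]): ServiceLike
  toMarkdown(input: string): string
}

// Drop script/style elements, keep iframes as raw HTML, then convert.
function convertArticle(service: ServiceLike, html: string): string {
  return service
    .remove(['script', 'style'])
    .keep('iframe')
    .toMarkdown(html)
}

// Recording stub so the call order can be inspected; it performs no conversion.
function createStub(calls: string[]): ServiceLike {
  const stub: ServiceLike = {
    keep(filter) {
      calls.push(`keep:${String(filter)}`)
      return stub
    },
    remove(filter) {
      calls.push(`remove:${String(filter)}`)
      return stub
    },
    toMarkdown(input) {
      calls.push('toMarkdown')
      return `converted(${input.length} chars)`
    },
  }
  return stub
}

const recordedCalls: string[] = []
console.log(convertArticle(createStub(recordedCalls), '<p>Hi</p>')) // converted(9 chars)
console.log(recordedCalls.join(' -> ')) // remove:script,style -> keep:iframe -> toMarkdown
```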
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Credits
This project is a modern TypeScript rewrite of Turndown by Dom Christie, with additional features and optimizations for modern use cases, particularly LLM and AI applications.
License
MIT © Max Schedin