1.4.2 • Published 5 months ago
@kayvan/markdown-tree-parser v1.4.2
markdown-tree-parser
A powerful JavaScript library and CLI tool for parsing and manipulating markdown files as tree structures. Built on top of the battle-tested remark/unified ecosystem.
Features
- Tree-based parsing - Treats markdown as manipulable Abstract Syntax Trees (AST)
- Section extraction - Extract specific sections with automatic boundary detection
- Powerful search - CSS-like selectors and custom search functions
- Batch processing - Process multiple sections at once
- CLI & Library - Use as a command-line tool or JavaScript library
- Document analysis - Get statistics and generate table of contents
- TypeScript ready - Full type definitions included
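To make "automatic boundary detection" concrete, here is a minimal sketch of one common way to delimit a section in a flat markdown AST: start at the matching heading and stop before the next heading of equal or shallower depth. It runs on plain objects shaped like mdast nodes. The helpers `textOf` and `sliceSection` are hypothetical names used for illustration; they are not part of this library, and the library's own matching rules may differ.

```javascript
// Sketch only: plain objects shaped like mdast nodes, not the library's code.

// Hypothetical helper: concatenate the text of a node and its descendants.
function textOf(node) {
  if (typeof node.value === 'string') return node.value;
  return (node.children || []).map(textOf).join('');
}

// Hypothetical helper: return the heading whose text equals `title`, plus
// every following sibling up to (not including) the next heading of equal
// or shallower depth.
function sliceSection(root, title) {
  const nodes = root.children;
  const start = nodes.findIndex(
    (n) => n.type === 'heading' && textOf(n) === title
  );
  if (start === -1) return [];
  const depth = nodes[start].depth;
  let end = start + 1;
  while (
    end < nodes.length &&
    !(nodes[end].type === 'heading' && nodes[end].depth <= depth)
  ) {
    end++;
  }
  return nodes.slice(start, end);
}

// Small builders for the sample tree.
const heading = (depth, value) => ({
  type: 'heading',
  depth,
  children: [{ type: 'text', value }],
});
const paragraph = (value) => ({
  type: 'paragraph',
  children: [{ type: 'text', value }],
});

const doc = {
  type: 'root',
  children: [
    heading(1, 'My Document'),
    paragraph('Some content here.'),
    heading(2, 'Section 1'),
    paragraph('Content for section 1.'),
    heading(3, 'Details'),
    paragraph('Nested content.'),
    heading(2, 'Section 2'),
    paragraph('Content for section 2.'),
  ],
};

// The nested level-3 heading stays inside 'Section 1'; 'Section 2' ends it.
console.log(sliceSection(doc, 'Section 1').map(textOf));
```

Deeper headings stay inside the section, so a level-2 section carries its level-3 subsections with it.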
Installation
Global Installation (for CLI usage)
# Using npm
npm install -g @kayvan/markdown-tree-parser
# Using pnpm (may require approval for build scripts)
pnpm install -g @kayvan/markdown-tree-parser
pnpm approve-builds -g # If prompted
# Using yarn
yarn global add @kayvan/markdown-tree-parser
Local Installation (for library usage)
npm install @kayvan/markdown-tree-parser
CLI Usage
After global installation, use the md-tree command:
List all headings
md-tree list README.md
md-tree list README.md --format json
Extract specific sections
# Extract one section
md-tree extract README.md "Installation"
# Extract to a file
md-tree extract README.md "Installation" --output ./sections
Extract all sections at a level
# Extract all level-2 sections
md-tree extract-all README.md 2
# Extract to separate files
md-tree extract-all README.md 2 --output ./sections
Show document structure
md-tree tree README.md
Search with CSS-like selectors
# Find all level-2 headings
md-tree search README.md "heading[depth=2]"
# Find all links
md-tree search README.md "link"
Document statistics
md-tree stats README.md
Generate table of contents
md-tree toc README.md --max-level 3
Complete CLI options
md-tree help
Library Usage
Basic Usage
import { MarkdownTreeParser } from '@kayvan/markdown-tree-parser';
const parser = new MarkdownTreeParser();
// Parse markdown into AST
const markdown = `
# My Document
Some content here.
## Section 1
Content for section 1.
## Section 2
Content for section 2.
`;
const tree = await parser.parse(markdown);
// Extract a specific section
const section = parser.extractSection(tree, 'Section 1');
const sectionMarkdown = await parser.stringify(section);
console.log(sectionMarkdown);
// Output:
// ## Section 1
// Content for section 1.
Advanced Usage
import { MarkdownTreeParser, createParser, extractSection } from '@kayvan/markdown-tree-parser';
// Create parser with custom options
const parser = createParser({
bullet: '-', // Use '-' for lists
emphasis: '_', // Use '_' for emphasis
strong: '__' // Use '__' for strong
});
// Extract all sections at level 2
const tree = await parser.parse(markdown);
const sections = parser.extractAllSections(tree, 2);
// for...of awaits each stringify call in order; forEach would not wait for async callbacks
for (const [index, section] of sections.entries()) {
const heading = parser.getHeadingText(section.heading);
const content = await parser.stringify(section.tree);
console.log(`Section ${index + 1}: ${heading}`);
console.log(content);
}
// Use convenience functions
const sectionMarkdown = await extractSection(markdown, 'Installation');
Search and Manipulation
// CSS-like selectors
const headings = parser.selectAll(tree, 'heading[depth=2]');
const links = parser.selectAll(tree, 'link');
const codeBlocks = parser.selectAll(tree, 'code');
// Custom search
const customNode = parser.findNode(tree, (node) => {
return node.type === 'heading' &&
parser.getHeadingText(node).includes('API');
});
// Transform content
parser.transform(tree, (node) => {
if (node.type === 'heading' && node.depth === 1) {
node.depth = 2; // Convert h1 to h2
}
});
// Get document statistics
const stats = parser.getStats(tree);
console.log(`Document has ${stats.wordCount} words and ${stats.headings.total} headings`);
// Generate table of contents
const toc = parser.generateTableOfContents(tree, 3);
console.log(toc);
Working with Files
import fs from 'fs/promises';
// Read and process a file
const content = await fs.readFile('README.md', 'utf-8');
const tree = await parser.parse(content);
// Extract all sections and save to files
const sections = parser.extractAllSections(tree, 2);
for (let i = 0; i < sections.length; i++) {
const section = sections[i];
const filename = `section-${i + 1}.md`;
const markdown = await parser.stringify(section.tree);
await fs.writeFile(filename, markdown);
}
Use Cases
- Documentation Management - Split large docs into manageable sections
- Static Site Generation - Process markdown for blogs and websites
- Content Organization - Restructure and reorganize markdown content
- Content Analysis - Analyze document structure and extract insights
- Documentation Tools - Build custom documentation processing tools
- Content Migration - Extract and transform content between formats
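Most of these use cases come down to walking the tree once and collecting something. As a rough illustration of the content-analysis case, the sketch below tallies node types and builds an indented heading outline from plain objects shaped like mdast nodes. The names `walk`, `plainText`, and `analyze` are hypothetical and chosen for this example only; the library's own statistics and table-of-contents output may be structured differently.

```javascript
// Sketch only: a depth-first walk over plain mdast-shaped objects.

// Visit a node, then each of its children in document order.
function walk(node, visit) {
  visit(node);
  for (const child of node.children || []) walk(child, visit);
}

// Collect the text of every 'text' node beneath `node`.
function plainText(node) {
  let out = '';
  walk(node, (n) => {
    if (n.type === 'text') out += n.value;
  });
  return out;
}

// Hypothetical helper: count nodes by type and list headings up to
// `maxLevel`, indented two spaces per heading level below 1.
function analyze(root, maxLevel) {
  const counts = {};
  const outline = [];
  walk(root, (n) => {
    counts[n.type] = (counts[n.type] || 0) + 1;
    if (n.type === 'heading' && n.depth <= maxLevel) {
      outline.push('  '.repeat(n.depth - 1) + '- ' + plainText(n));
    }
  });
  return { counts, outline };
}

const h = (depth, value) => ({
  type: 'heading',
  depth,
  children: [{ type: 'text', value }],
});

const sampleTree = {
  type: 'root',
  children: [
    h(1, 'Guide'),
    { type: 'paragraph', children: [{ type: 'text', value: 'Intro text.' }] },
    h(2, 'Setup'),
    h(3, 'Details'),
    h(2, 'Usage'),
  ],
};

const report = analyze(sampleTree, 2);
console.log(report.counts.heading); // 4 headings in total
console.log(report.outline.join('\n')); // level-3 'Details' is left out
```

Keeping the walk separate from what is collected means the same traversal can serve statistics, outlines, or content migration with a different visitor.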
API Reference
MarkdownTreeParser
Constructor
new MarkdownTreeParser(options = {})
Methods
- parse(markdown) - Parse markdown into AST
- stringify(tree) - Convert AST back to markdown
- extractSection(tree, headingText, level?) - Extract specific section
- extractAllSections(tree, level) - Extract all sections at level
- select(tree, selector) - Find first node matching CSS selector
- selectAll(tree, selector) - Find all nodes matching CSS selector
- findNode(tree, condition) - Find node with custom condition
- getHeadingText(headingNode) - Get text content of heading
- getHeadingsList(tree) - Get all headings with metadata
- getStats(tree) - Get document statistics
- generateTableOfContents(tree, maxLevel) - Generate TOC
- transform(tree, visitor) - Transform tree with visitor function
Convenience Functions
- createParser(options) - Create new parser instance
- extractSection(markdown, sectionName, options) - Quick section extraction
- getHeadings(markdown, options) - Quick heading extraction
- generateTOC(markdown, maxLevel, options) - Quick TOC generation
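The select and selectAll methods take selectors such as `heading[depth=2]` or `link[url*="github"]`. To show what those forms mean, here is a minimal sketch of a matcher for three of them: a bare node type, an exact attribute match, and a substring attribute match. `compileSelector` and `matchAll` are hypothetical names, and this is not how the library implements selection; it only illustrates the matching semantics on plain mdast-shaped objects.

```javascript
// Sketch only: supports `type`, `type[attr=value]` and `type[attr*="value"]`.

// Hypothetical helper: compile a selector string into a node predicate.
function compileSelector(selector) {
  const match = /^(\w+)(?:\[(\w+)(\*?=)"?([^"\]]*)"?\])?$/.exec(selector.trim());
  if (!match) throw new Error(`Unsupported selector: ${selector}`);
  const [, type, attr, op, expected] = match;
  return (node) => {
    if (node.type !== type) return false;
    if (attr === undefined) return true; // bare type selector
    if (!(attr in node)) return false;
    const actual = String(node[attr]); // compare as strings, as selectors are text
    return op === '*=' ? actual.includes(expected) : actual === expected;
  };
}

// Hypothetical helper: depth-first search returning every matching node.
function matchAll(root, selector) {
  const test = compileSelector(selector);
  const found = [];
  (function visit(node) {
    if (test(node)) found.push(node);
    (node.children || []).forEach((child) => visit(child));
  })(root);
  return found;
}

const selectorTree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Title' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Links' }] },
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          url: 'https://github.com/ksylvan/markdown-tree-parser',
          children: [{ type: 'text', value: 'repo' }],
        },
        {
          type: 'link',
          url: 'https://example.com',
          children: [{ type: 'text', value: 'other' }],
        },
      ],
    },
  ],
};

console.log(matchAll(selectorTree, 'heading').length);
console.log(matchAll(selectorTree, 'heading[depth=2]').length);
console.log(matchAll(selectorTree, 'link[url*="github"]').map((n) => n.url));
```

Attribute values are compared as strings, which is why `depth=2` matches a node whose `depth` is the number 2.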
CSS-Like Selectors
The library supports powerful CSS-like selectors for searching:
// Element selectors
parser.selectAll(tree, 'heading') // All headings
parser.selectAll(tree, 'paragraph') // All paragraphs
parser.selectAll(tree, 'link') // All links
// Attribute selectors
parser.selectAll(tree, 'heading[depth=1]') // H1 headings
parser.selectAll(tree, 'heading[depth=2]') // H2 headings
parser.selectAll(tree, 'link[url*="github"]') // Links containing "github"
// Pseudo selectors
parser.selectAll(tree, ':first-child') // First child elements
parser.selectAll(tree, ':last-child') // Last child elements
Testing
# Run tests
npm test
# Test CLI
npm run test:cli
# Run examples
npm run example
Development
Prerequisites
- Node.js 18+
- npm
Setup
# Clone the repository
git clone https://github.com/ksylvan/markdown-tree-parser.git
cd markdown-tree-parser
# Install dependencies
npm install
# Run tests
npm test
# Run linting
npm run lint
# Format code
npm run format
# Test CLI functionality
npm run test:cli
CI/CD
This project uses GitHub Actions for continuous integration. The workflow automatically:
- Tests against Node.js versions 18.x, 20.x, and 22.x
- Runs linting with ESLint
- Executes the full test suite
- Tests CLI functionality
- Verifies the package can be published
The CI badge in the README shows the current build status and links to the Actions page.
Contributing
Contributions are welcome! Please read our Contributing Guide for details.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built on top of the excellent unified ecosystem:
- remark - Markdown processing
- mdast - Markdown AST specification
- unist - Universal syntax tree utilities
Support
- Documentation
- Issue Tracker
- Discussions
Made with ❤️ by Kayvan Sylvan