1.4.2 • Published 3 months ago

llmxml v1.4.2

LLMXML

A library for converting between Markdown and LLM-friendly XML formats, with section extraction capabilities.

Features

  • Bidirectional conversion between Markdown and LLM-XML
  • Fuzzy section matching and extraction
  • Precise heading level control
  • Configurable tag formatting and attribute output
  • Automatic preservation of JSON structures
  • Smart handling of code blocks

Installation

npm install llmxml

Quick Start

import { createLLMXML } from 'llmxml';

const llmxml = createLLMXML();

// Convert Markdown to LLM-XML
const xml = await llmxml.toXML(`
# Title
Content with JSON: {"name":"John","age":30}
## Section
Content
`);
// Result:
// <Title>
//   Content with JSON: {
//     "name": "John",
//     "age": 30
//   }
//   <Section>
//     Content
//   </Section>
// </Title>

// Convert LLM-XML to Markdown
const markdown = await llmxml.toMarkdown(xml);

// Extract sections
const section = await llmxml.getSection(markdown, 'Section');

Section Extraction

getSection extracts a single section and getSections extracts every matching section. Both support fuzzy heading matching:

// Extract a single section with options
const section = await llmxml.getSection(content, 'Setup Instructions', {
  level: 2,               // Only match h2 headings (valid levels: 1-6)
  exact: false,           // Set to true to require an exact title match
  includeNested: true,    // Include subsections
  fuzzyThreshold: 0.8     // Minimum match score (0-1)
});

// Extract multiple matching sections
const sections = await llmxml.getSections(content, 'setup', {
  // Same options as getSection
  fuzzyThreshold: 0.7
});
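
To make the fuzzyThreshold option more concrete, here is a minimal sketch of threshold-based title matching. It does not use llmxml and is not the library's scoring method, which this README does not describe. It assumes a normalized Levenshtein similarity as a stand-in, and the names levenshtein, similarity, and matchTitles are hypothetical.

```typescript
// Illustrative only: normalized Levenshtein similarity as a stand-in for
// whatever scoring llmxml uses internally (an assumption, not the real method).
function levenshtein(a: string, b: string): number {
  // prev[j] is the edit distance between the first i-1 characters of a
  // and the first j characters of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after trimming and case folding.
function similarity(query: string, title: string): number {
  const q = query.trim().toLowerCase();
  const t = title.trim().toLowerCase();
  const longest = Math.max(q.length, t.length);
  return longest === 0 ? 1 : 1 - levenshtein(q, t) / longest;
}

// Keep only the titles whose score meets the threshold, best match first.
function matchTitles(query: string, titles: string[], threshold: number): string[] {
  return titles
    .map(title => ({ title, score: similarity(query, title) }))
    .filter(m => m.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .map(m => m.title);
}
```

Under this stand-in metric, raising the threshold toward 1 admits only near-identical titles, while lowering it lets looser matches through, which is the trade-off the 0-1 range of fuzzyThreshold suggests.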

Configuration

Configure behavior when creating an instance:

const llmxml = createLLMXML({
  // Default threshold for fuzzy matching (0-1)
  defaultFuzzyThreshold: 0.7,
  
  // Warning emission level
  warningLevel: 'all', // 'all' | 'none' | 'ambiguous-only'

  // Control XML attribute output
  includeTitle: false,  // Include title attribute (default: false)
  includeHlevel: false, // Include hlevel attribute (default: false)
  verbose: false,       // Include both title and hlevel (default: false)

  // Tag name formatting (default: 'PascalCase')
  tagFormat: 'PascalCase', // 'snake_case' | 'SCREAMING_SNAKE' | 'camelCase' | 'PascalCase' | 'UPPERCASE'
});

// Examples with different configurations:
const withAttributes = createLLMXML({ verbose: true });
const xml1 = await withAttributes.toXML('# Long Title');
// <LongTitle title="Long Title" hlevel="1">

const snakeCase = createLLMXML({ tagFormat: 'snake_case' });
const xml2 = await snakeCase.toXML('# Long Title');
// <long_title>
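
As a rough illustration of what the tagFormat values mean, the sketch below derives a tag name from a heading title. It is a standalone approximation, not llmxml's implementation: formatTag and its helpers are hypothetical names, the word-splitting rule is an assumption, and 'UPPERCASE' is left out because the examples above do not show how it joins words.

```typescript
type TagFormat = 'snake_case' | 'SCREAMING_SNAKE' | 'camelCase' | 'PascalCase';

// Split a heading title into lowercase alphanumeric words (assumed rule).
function words(title: string): string[] {
  return title.toLowerCase().split(/[^a-z0-9]+/).filter(w => w.length > 0);
}

// Uppercase the first character of a word.
const capitalize = (w: string): string => w.charAt(0).toUpperCase() + w.slice(1);

// Build a tag name from a title in the requested format.
function formatTag(title: string, format: TagFormat): string {
  const parts = words(title);
  switch (format) {
    case 'snake_case':
      return parts.join('_');
    case 'SCREAMING_SNAKE':
      return parts.join('_').toUpperCase();
    case 'camelCase':
      return parts.map((w, i) => (i === 0 ? w : capitalize(w))).join('');
    default:
      // 'PascalCase', the documented default format
      return parts.map(capitalize).join('');
  }
}

console.log(formatTag('Long Title', 'PascalCase'));      // LongTitle
console.log(formatTag('Long Title', 'snake_case'));      // long_title
console.log(formatTag('Long Title', 'SCREAMING_SNAKE')); // LONG_TITLE
console.log(formatTag('Long Title', 'camelCase'));       // longTitle
```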

Round-trip Conversions

Use roundTrip to convert Markdown to LLM-XML and back while preserving document structure:

// Convert markdown to XML and back, preserving all structure
const roundTripped = await llmxml.roundTrip(`
# Title
## Section
Content
`);
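
One way to check that a round trip kept the structure is to compare heading outlines before and after. The helper below is a hypothetical sketch that does not depend on llmxml. It assumes ATX-style headings only, ignores setext headings, and does not strip closing hash sequences.

```typescript
interface Heading {
  level: number;
  title: string;
}

// Collect ATX headings ("# Title" through "###### Title"), skipping any
// lines inside fenced code blocks.
function headingOutline(markdown: string): Heading[] {
  const outline: Heading[] = [];
  let inFence = false;
  for (const line of markdown.split('\n')) {
    if (/^\s*(`{3}|~{3})/.test(line)) {
      inFence = !inFence; // a fence line opens or closes a code block
      continue;
    }
    if (inFence) continue;
    const m = /^(#{1,6})[ \t]+(.+?)[ \t]*$/.exec(line);
    if (m) outline.push({ level: m[1].length, title: m[2] });
  }
  return outline;
}

// True when both documents have the same headings at the same levels, in order.
function sameOutline(before: string, after: string): boolean {
  return JSON.stringify(headingOutline(before)) === JSON.stringify(headingOutline(after));
}
```

Passing the original Markdown and the string returned by roundTrip to sameOutline gives a quick structural check. It says nothing about body content.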

Warning System

The library emits warnings for potentially ambiguous situations, such as a query that matches several sections:

// Register warning handler
llmxml.onWarning(warning => {
  // Warning structure:
  // {
  //   code: string,  // e.g. 'AMBIGUOUS_MATCH', 'UNKNOWN_WARNING'
  //   message: string,
  //   details: {
  //     matches?: Array<{
  //       title: string,
  //       score: {
  //         exactMatch: boolean,
  //         fuzzyScore: number,
  //         contextualScore: number,
  //         level: number,
  //         // ... other scoring details
  //       }
  //     }>,
  //   }
  // }
});
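
A handler can turn that structure into a readable log line. The sketch below is one possible approach. It redeclares the documented fields as local types instead of importing them from llmxml, and describeWarning is a hypothetical helper, not part of the library.

```typescript
// Shapes copied from the warning structure documented above; no fields
// beyond those shown there are assumed.
interface MatchScore {
  exactMatch: boolean;
  fuzzyScore: number;
  contextualScore: number;
  level: number;
}

interface SectionWarning {
  code: string;
  message: string;
  details: { matches?: Array<{ title: string; score: MatchScore }> };
}

// Hypothetical helper: summarize a warning as one line, listing the
// candidate titles of an ambiguous match from best to worst fuzzy score.
function describeWarning(warning: SectionWarning): string {
  const matches = warning.details.matches ?? [];
  if (warning.code !== 'AMBIGUOUS_MATCH' || matches.length === 0) {
    return `[${warning.code}] ${warning.message}`;
  }
  const ranked = [...matches].sort((a, b) => b.score.fuzzyScore - a.score.fuzzyScore);
  const list = ranked.map(m => `${m.title} (${m.score.fuzzyScore.toFixed(2)})`).join(', ');
  return `[${warning.code}] ${warning.message}: ${list}`;
}
```

It could be wired in with llmxml.onWarning(w => console.warn(describeWarning(w))), assuming the delivered object matches the shape above.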

Error Handling

Failures throw errors that carry a code property identifying the condition:

try {
  const section = await llmxml.getSection(content, 'nonexistent');
} catch (error) {
  if (error.code === 'SECTION_NOT_FOUND') {
    console.log('Section not found:', error.message);
  }
  // Other error codes:
  // - PARSE_ERROR: Failed to parse document
  // - INVALID_FORMAT: Document format is invalid
  // - INVALID_LEVEL: Invalid header level
  // - INVALID_SECTION_OPTIONS: Invalid section extraction options
}
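
In strict TypeScript the caught value is typed unknown, so error.code needs narrowing before use. The following is a minimal sketch under that assumption. isCodedError and sectionOrNull are hypothetical helpers, and the code list is copied from the comments above; the library may define more.

```typescript
// Error codes listed in this section.
const KNOWN_CODES = [
  'SECTION_NOT_FOUND',
  'PARSE_ERROR',
  'INVALID_FORMAT',
  'INVALID_LEVEL',
  'INVALID_SECTION_OPTIONS',
] as const;
type KnownCode = (typeof KNOWN_CODES)[number];

// Narrow an unknown caught value to an object carrying a known code and a message.
function isCodedError(error: unknown): error is { code: KnownCode; message: string } {
  if (typeof error !== 'object' || error === null) return false;
  const { code, message } = error as Record<string, unknown>;
  return (
    typeof message === 'string' &&
    typeof code === 'string' &&
    (KNOWN_CODES as readonly string[]).includes(code)
  );
}

// Hypothetical wrapper: map a missing section to null and rethrow anything else.
async function sectionOrNull<T>(extract: () => Promise<T>): Promise<T | null> {
  try {
    return await extract();
  } catch (error) {
    if (isCodedError(error) && error.code === 'SECTION_NOT_FOUND') return null;
    throw error;
  }
}
```

A call such as sectionOrNull(() => llmxml.getSection(content, 'nonexistent')) would then yield null for a missing section while still surfacing parse or option errors.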

License

MIT
