0.4.6 • Published 5 months ago

llm-ast

A lightweight parser for .llm files - a pseudo-XML format designed for LLM-friendly document structuring. Part of the Meld ecosystem.

What is .llm?

The .llm format is a relaxed, pseudo-XML format that provides high-level structure to documents while staying friendly to both humans and LLMs. It's primarily used as the LLM-friendly output format for the Meld prompt scripting language and the oneshot LLM prompt tool.

Unlike strict XML:

  • Only top-level tags need to be valid
  • Inner content is treated flexibly
  • Tags must start at line beginnings
  • Common indentation is stripped from tag content
  • Relative indentation is preserved

An example .llm document:

<System>
  Here's your role and context...
</System>

<Context>
  <CodeBase>
    Current implementation:
    ```typescript
    function example() {
      // This code block is preserved exactly
    }
    ```
  </CodeBase>

  <TestOutput>
    Test Results:
    ✓ 16 tests passed
    ✗ 2 tests failed
    <details>Failed tests...
    (note: unclosed tags inside are fine - treated as text)
  </TestOutput>
</Context>

<Task>
What improvements would you suggest?
</Task>
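
The indentation rule (strip what every line shares, keep what is left over) can be sketched in a few lines. This is an illustrative helper, not llm-ast's actual implementation: the name `stripCommonIndent` is hypothetical, and it assumes indentation is counted per whitespace character and that blank lines are ignored when measuring.

```typescript
// Hypothetical helper for illustration; not part of llm-ast's documented API.
function stripCommonIndent(content: string): string {
  const lines = content.split("\n");

  // Blank lines carry no indentation information, so they are skipped
  // when looking for the smallest shared indent.
  const indents = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => line.search(/\S/)); // index of first non-whitespace char

  const common = indents.length > 0 ? Math.min(...indents) : 0;

  // Remove the shared prefix; whitespace-only lines become empty lines.
  return lines
    .map((line) => (line.trim().length > 0 ? line.slice(common) : ""))
    .join("\n");
}

const tagContent = [
  "    All lines at same indent",
  "      This line has extra indent",
  "    Back to first indent",
].join("\n");

console.log(stripCommonIndent(tagContent));
// All lines at same indent
//   This line has extra indent
// Back to first indent
```

Only the shared four spaces are removed, so the extra two spaces on the middle line survive as relative indentation.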

Features

  • Simple line-based parsing
  • Forgiving of "invalid" inner XML
  • Strips common indentation from tag content
  • Preserves relative indentation
  • Supports code fences (```language)
  • Clean error reporting

Installation

npm install llm-ast

Usage

import { parse } from 'llm-ast';
import type { ParseResult, TagNode } from 'llm-ast'; // TypeScript types are available directly

// Parse a .llm file
const input = `
<Message>
  <System>You are a helpful assistant.</System>
  <User title="Question">What is 2+2?</User>
  <Assistant title="Response">
    The answer is 4.
    
    Here's how to calculate it:
    ```python
    result = 2 + 2
    print(result)  # 4
    ```
  </Assistant>
</Message>
`;

const result: ParseResult = parse(input);
if (result.errors.length > 0) {
  console.error('Parsing errors:', result.errors);
} else {
  const ast: TagNode = result.ast;
  console.log('AST:', ast);
  // AST will include title attributes where specified:
  // {
  //   type: 'Tag',
  //   tagName: 'User',
  //   title: 'Question',
  //   content: [...]
  // }
}
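
Once you have an AST, a small recursive walk is usually enough to pull information out of it. The sketch below collects every tag that carries a title. It relies only on the tag shape shown in the comment above (`type: 'Tag'`, `tagName`, `title`, `content`); the `Sketch*` types, the `isTag` guard, and the hand-written sample tree are stand-ins for illustration, and the `type` values used for non-tag nodes are assumptions. In real code you would import the types from `llm-ast` instead.

```typescript
// Local stand-ins so the sketch is self-contained; the real types
// are exported by 'llm-ast'.
interface SketchTag {
  type: "Tag";
  tagName: string;
  title?: string;
  content: SketchNode[];
}

// Any non-tag node. Only the presence of `type` is assumed here.
interface SketchOther {
  type: string;
}

type SketchNode = SketchTag | SketchOther;

function isTag(node: SketchNode): node is SketchTag {
  return node.type === "Tag" && Array.isArray((node as SketchTag).content);
}

// Depth-first walk that records each titled tag in document order.
function collectTitles(
  node: SketchNode,
  out: Array<{ tagName: string; title: string }> = []
): Array<{ tagName: string; title: string }> {
  if (isTag(node)) {
    if (node.title !== undefined) {
      out.push({ tagName: node.tagName, title: node.title });
    }
    for (const child of node.content) {
      collectTitles(child, out);
    }
  }
  return out;
}

// Hand-written tree mirroring the <Message> example; not real parser output.
const sample: SketchTag = {
  type: "Tag",
  tagName: "Message",
  content: [
    { type: "Tag", tagName: "System", content: [{ type: "Text" }] },
    { type: "Tag", tagName: "User", title: "Question", content: [{ type: "Text" }] },
    {
      type: "Tag",
      tagName: "Assistant",
      title: "Response",
      content: [{ type: "Text" }, { type: "CodeFence" }],
    },
  ],
};

console.log(collectTitles(sample));
```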

TypeScript Support

The package includes full TypeScript type definitions. You can import types directly from the package:

import type {
  ParseResult,    // Result of parsing, includes AST and any errors
  TagNode,        // A tag node in the AST
  TextNode,       // A text node in the AST
  CodeFenceNode,  // A code fence node in the AST
  ErrorNode       // An error node in the AST
} from 'llm-ast';

All types are documented, so editors with TypeScript support can show inline documentation and autocompletion for them.

Format Rules

  1. Tags must:
    • Have a name that starts with a capital letter
    • Start at the beginning of a line
    • Be properly closed at document level

// Valid
<User>Hello</User>

// Also valid - inner "invalid" tags are treated as text
<Message>
  <User>Look at this <link>http://example.com</link></User>
</Message>

// Invalid - tag not at line start
text <User>Hello</User>

  2. Title attributes are supported:

// Valid - with title attribute
<Message title="Original Message">Hello</Message>

// Valid - nested tags with titles
<Section title="Main Content">
  <SubSection title="Details">
    Content here
  </SubSection>
</Section>

// Invalid - quotes are required
<Message title=Hello>Content</Message>

  3. Indentation behavior:

<Message>
    All lines at same indent
      This line has extra indent
    Back to first indent
</Message>

// Becomes:
<Message>
All lines at same indent
  This line has extra indent
Back to first indent
</Message>

  4. Code fences work like in Markdown:

<Message>
  Here's some code:
  ```javascript
  console.log('Preserved exactly');
  ```
</Message>

See FORMAT.md for detailed format rules.
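
As a rough illustration of the tag rules, the check for an opening tag can be approximated with one regular expression. This is a sketch under assumptions, not the parser's real grammar (FORMAT.md is the authority): `OPENING_TAG` and `readOpeningTag` are hypothetical names, tag names are assumed to be alphanumeric, indentation before the tag is assumed to be allowed (as in the nested examples), and `title` is assumed to be the only attribute.

```typescript
// Assumed pattern: optional indentation, "<", a capitalized name,
// an optional double-quoted title, then ">".
const OPENING_TAG = /^[ \t]*<([A-Z][A-Za-z0-9]*)(?:\s+title="([^"]*)")?>/;

// Returns the tag name (and title, if present) when the line opens
// with a valid tag, or null when it does not.
function readOpeningTag(
  line: string
): { tagName: string; title?: string } | null {
  const match = OPENING_TAG.exec(line);
  if (match === null) return null;

  const tagName: string | undefined = match[1];
  const title: string | undefined = match[2];
  if (tagName === undefined) return null;

  return title === undefined ? { tagName } : { tagName, title };
}

console.log(readOpeningTag("<User>Hello</User>"));
console.log(readOpeningTag('<Message title="Original Message">Hello</Message>'));
console.log(readOpeningTag("text <User>Hello</User>")); // null: not at line start
console.log(readOpeningTag("<Message title=Hello>Content</Message>")); // null: unquoted title
```

Anything that fails this kind of check is not recognized as a tag, which is why inner pseudo-tags such as `<link>` end up as plain text.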

Error Handling

The parser provides helpful errors while being forgiving:

// Invalid tag name - error
const result1 = parse('<user>Invalid</user>');
console.log(result1.errors); // [{ message: 'Expected [A-Z]...', location: {...} }]

// Inner "invalid" XML - treated as text
const result2 = parse('<Message>Text with <random>unclosed tag</Message>');
console.log(result2.errors); // []
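
If you would rather fail fast than inspect `errors` at every call site, a thin wrapper can turn a non-empty error list into an exception. The sketch below assumes only what the examples above show: a result with `ast` and `errors`, where each error has a `message`. The names `SketchResult` and `astOrThrow` are hypothetical, and the error messages in the demo are hand-written placeholders rather than real parser output.

```typescript
// Minimal result shape assumed from the examples above.
interface SketchResult<T> {
  ast: T;
  errors: Array<{ message: string }>;
}

// Returns the AST, or throws one Error that joins all parser messages.
function astOrThrow<T>(result: SketchResult<T>): T {
  if (result.errors.length > 0) {
    const summary = result.errors.map((e) => e.message).join("; ");
    throw new Error(`Failed to parse .llm input: ${summary}`);
  }
  return result.ast;
}

// With the real package this would be: astOrThrow(parse(input)).
// Hand-written results are used here so the sketch runs on its own.
console.log(astOrThrow({ ast: { type: "Tag", tagName: "Message" }, errors: [] }));

try {
  astOrThrow({ ast: null, errors: [{ message: "placeholder error" }] });
} catch (err) {
  console.error((err as Error).message);
}
```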

License

MIT
