0.4.6 • Published 5 months ago
llm-ast v0.4.6
A lightweight parser for .llm files, a pseudo-XML format designed for LLM-friendly document structuring. Part of the Meld ecosystem.
What is .llm?
The .llm format is a relaxed, pseudo-XML format that gives documents high-level structure while staying friendly to both humans and LLMs. It is primarily used as the LLM-friendly output format for the Meld prompt scripting language and the oneshot LLM prompt tool.
Unlike strict XML:
- Only top-level tags need to be valid
- Inner content is treated flexibly
- Tags must start at line beginnings
- Common indentation is stripped from tag content
- Relative indentation is preserved
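The last two points can be illustrated with a small sketch. `stripCommonIndent` below is a hypothetical helper written for this README, not a function exported by llm-ast, and the parser's actual implementation may differ; it only shows what "strip common indentation, keep relative indentation" means in practice.

```typescript
// Hypothetical helper, not part of llm-ast: removes the indentation shared by
// all non-blank lines and keeps each line's extra (relative) indentation.
function stripCommonIndent(text: string): string {
  const lines = text.split("\n");
  // Measure leading whitespace on non-blank lines only.
  const indents = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => line.search(/\S/));
  const common = indents.length > 0 ? Math.min(...indents) : 0;
  // Blank lines are emitted as empty strings; others lose the shared prefix.
  return lines
    .map((line) => (line.trim().length > 0 ? line.slice(common) : ""))
    .join("\n");
}

const body = [
  "    All lines at same indent",
  "      This line has extra indent",
  "    Back to first indent",
].join("\n");

console.log(stripCommonIndent(body));
// All lines at same indent
//   This line has extra indent
// Back to first indent
```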
Example of how it looks:
<System>
Here's your role and context...
</System>
<Context>
<CodeBase>
Current implementation:
```typescript
function example() {
// This code block is preserved exactly
}
```
</CodeBase>
<TestOutput>
Test Results:
✓ 16 tests passed
✗ 2 tests failed
<details>Failed tests...
(note: unclosed tags inside are fine - treated as text)
</TestOutput>
</Context>
<Task>
What improvements would you suggest?
</Task>
Features
- Simple line-based parsing
- Forgiving of "invalid" inner XML
- Strips common indentation from tag content
- Preserves relative indentation
- Supports code fences (```language)
- Clean error reporting
Installation
npm install llm-ast
Usage
import { parse } from 'llm-ast';
import type { ParseResult, TagNode } from 'llm-ast'; // TypeScript types are available directly
// Parse a .llm file
// Note: backticks inside a template literal must be escaped,
// otherwise the code fence would terminate the string early.
const input = `
<Message>
<System>You are a helpful assistant.</System>
<User title="Question">What is 2+2?</User>
<Assistant title="Response">
The answer is 4.
Here's how to calculate it:
\`\`\`python
result = 2 + 2
print(result) # 4
\`\`\`
</Assistant>
</Message>
`;
const result: ParseResult = parse(input);
if (result.errors.length > 0) {
  console.error('Parsing errors:', result.errors);
} else {
  const ast: TagNode = result.ast;
  console.log('AST:', ast);
  // AST will include title attributes where specified:
  // {
  //   type: 'Tag',
  //   tagName: 'User',
  //   title: 'Question',
  //   content: [...]
  // }
}
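Once you have an AST, a common next step is to walk it. The sketch below prints an outline of tag names and titles. It defines its own node interfaces (`SketchTagNode`, `SketchTextNode`) modeled on the fields shown in the comment above; these names and the `value` field on text nodes are assumptions made for illustration, so check the package's exported types before relying on them.

```typescript
// Assumed node shapes, based on the fields shown in the usage comment.
// The real llm-ast types may differ (the text node shape is a guess).
interface SketchTextNode {
  type: "Text";
  value: string;
}
interface SketchTagNode {
  type: "Tag";
  tagName: string;
  title?: string;
  content: SketchNode[];
}
type SketchNode = SketchTextNode | SketchTagNode;

// Returns one line per tag, indented by nesting depth.
function outline(node: SketchNode, depth: number = 0): string[] {
  if (node.type !== "Tag") return [];
  const indent = new Array(depth + 1).join("  ");
  const label = node.title ? `${node.tagName} ("${node.title}")` : node.tagName;
  let lines = [indent + label];
  for (const child of node.content) {
    lines = lines.concat(outline(child, depth + 1));
  }
  return lines;
}

const sample: SketchTagNode = {
  type: "Tag",
  tagName: "Message",
  content: [
    { type: "Tag", tagName: "System", content: [{ type: "Text", value: "You are a helpful assistant." }] },
    { type: "Tag", tagName: "User", title: "Question", content: [{ type: "Text", value: "What is 2+2?" }] },
  ],
};

console.log(outline(sample).join("\n"));
// Message
//   System
//   User ("Question")
```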
TypeScript Support
The package includes full TypeScript type definitions. You can import types directly from the package:
import type {
  ParseResult,   // Result of parsing, includes AST and any errors
  TagNode,       // A tag node in the AST
  TextNode,      // A text node in the AST
  CodeFenceNode, // A code fence node in the AST
  ErrorNode      // An error node in the AST
} from 'llm-ast';
All exported types are documented, so TypeScript-aware editors can offer autocompletion and inline hints.
Format Rules
- Tags must:
  - Start with a capital letter
  - Start at the beginning of a line
  - Be properly closed at document level
// Valid
<User>Hello</User>
// Also valid - inner "invalid" tags are treated as text
<Message>
<User>Look at this <link>http://example.com</link></User>
</Message>
// Invalid - tag not at line start
text <User>Hello</User>
- Title attributes are supported:
// Valid - with title attribute
<Message title="Original Message">Hello</Message>
// Valid - nested tags with titles
<Section title="Main Content">
<SubSection title="Details">
Content here
</SubSection>
</Section>
// Invalid - quotes are required
<Message title=Hello>Content</Message>
- Indentation behavior:
<Message>
    All lines at same indent
      This line has extra indent
    Back to first indent
</Message>
// Becomes:
<Message>
All lines at same indent
  This line has extra indent
Back to first indent
</Message>
- Code fences work like in Markdown:
<Message>
Here's some code:
```javascript
console.log('Preserved exactly');
```
</Message>
See FORMAT.md for detailed format rules.
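The tag and title rules above can be approximated with a single regular expression. This is only a sketch: `OPEN_TAG` and `matchOpenTag` are hypothetical names, the allowed tag-name characters (`[A-Za-z0-9]`) are an assumption, and the sketch assumes any leading indentation has already been stripped from the line. FORMAT.md remains the authority on the actual grammar.

```typescript
// Hypothetical matcher, not the parser's actual grammar. An opening tag must
// begin the line, start with a capital letter, and quote its title attribute.
const OPEN_TAG = /^<([A-Z][A-Za-z0-9]*)(?:\s+title="([^"]*)")?>/;

interface OpenTag {
  tagName: string;
  title?: string;
}

function matchOpenTag(line: string): OpenTag | null {
  const m = OPEN_TAG.exec(line);
  if (!m) return null;
  // m[2] is undefined when no title attribute was present.
  return m[2] !== undefined ? { tagName: m[1], title: m[2] } : { tagName: m[1] };
}

console.log(matchOpenTag("<User>Hello</User>"));
// { tagName: 'User' }
console.log(matchOpenTag('<Message title="Original Message">Hello</Message>'));
// { tagName: 'Message', title: 'Original Message' }
console.log(matchOpenTag("text <User>Hello</User>"));
// null (tag not at line start)
console.log(matchOpenTag("<Message title=Hello>Content</Message>"));
// null (title is not quoted)
```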
Error Handling
The parser provides helpful errors while being forgiving:
// Invalid tag name - error
const result1 = parse('<user>Invalid</user>');
console.log(result1.errors); // [{ message: 'Expected [A-Z]...', location: {...} }]
// Inner "invalid" XML - treated as text
const result2 = parse('<Message>Text with <random>unclosed tag</Message>');
console.log(result2.errors); // []
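For display purposes, the errors array can be turned into a short report. The sketch below relies only on the `message` field visible in the logged output above; `SketchParseError` and `summarizeErrors` are hypothetical names, and the `location` field is left untyped because its shape is not shown here.

```typescript
// Assumed error shape, based only on the fields visible in the log above.
interface SketchParseError {
  message: string;
  location?: unknown; // shape not documented here, so it is left opaque
}

// Builds a numbered, human-readable list of error messages.
function summarizeErrors(errors: SketchParseError[]): string {
  if (errors.length === 0) return "no errors";
  return errors.map((err, i) => `${i + 1}. ${err.message}`).join("\n");
}

console.log(summarizeErrors([]));
// no errors
console.log(summarizeErrors([{ message: "Expected [A-Z]..." }]));
// 1. Expected [A-Z]...
```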
License
MIT