3.4.5 • Published 4 months ago

License
MIT

meld-ast

A spec-compliant AST parser for the Meld scripting language, built with Peggy. This parser produces AST nodes that strictly conform to the meld-spec type definitions.

Features

  • Full compliance with meld-spec type definitions
  • Built with Peggy for robust parsing
  • Environment-independent parser loading
  • Comprehensive error handling with location information
  • Validation against meld-spec types
  • Direct access to parser components
  • Support for both ESM and CommonJS
  • Source location tracking for all nodes
  • Comprehensive parsing for all Meld language constructs:
    • Text blocks
    • Code fences with advanced features:
      • Support for 3, 4, and 5 backticks
      • Proper nesting of code fences
      • Optional preservation of fence markers
      • Language identifier support
    • Comments (>> comment)
    • Variables:
      • Text variables ({{var}}) with format options (previously ${var})
      • Data variables ({{data}}) with fields (previously #{data})
      • Array access for data variables ({{array.0}}, {{data.users.0.name}})
      • Path variables ($var)
    • Directives:
      • @run [command]
      • @import [path]
      • @import [https://example.com/file.md] for URL imports
      • @import [var1, var2] from [path.meld] for named imports
      • @import [var1, var2 as alias2] from [path.meld] for named imports with aliases
      • @define name = @directive [...] with metadata fields
      • @data identifier:schema = { ... }
      • @text name = "value"
      • @path name = "path"
      • @path name = "$HOMEPATH/path" with special variables
      • @embed [path] for embedding file content
      • @embed [https://example.com/content.md] for URL content
      • @embed [$path_variable] for path variables
      • @embed [$path_variable/{{variable}}] for paths and regular variables
      • @embed [{{variable}}] for variables within brackets
      • @embed {{variable}} for direct variable embedding
      • @embed [[...]] for multiline content
    • Path directives with special variables
    • Error recovery
    • Extensible AST format
    • Multi-file processing

Installation

npm install meld-ast meld-spec

Note: meld-spec is a peer dependency and must be installed alongside meld-ast.

Usage

Basic Parsing

import { parse } from 'meld-ast';

const input = `
>> This is a comment
Hello world

@run [echo "Hello"]
`;

const { ast } = parse(input);

Advanced Usage with Options

The parser supports several configuration options:

import { parse, ParserOptions, MeldAstError } from 'meld-ast';

const options: ParserOptions = {
  // Stop on first error (default: true)
  failFast: true,
  
  // Track source locations (default: true)
  trackLocations: true,
  
  // Validate nodes against meld-spec (default: true)
  validateNodes: true,
  
  // Preserve code fence markers in content (default: true)
  // When true, includes the opening/closing fence markers and language
  // When false, only includes content between fences
  preserveCodeFences: true,
  
  // Suppress warnings for undefined variables in paths (default: false)
  // When true, no warnings are emitted for undefined variables
  // When false, warnings are emitted for undefined variables
  variable_warning: false,
  
  // Custom error handler
  onError: (error: MeldAstError) => {
    console.warn(`Parse warning: ${error.toString()}`);
  }
};

try {
  const { ast, errors } = parse(input, options);
  
  // If failFast is false, errors array will contain any non-fatal errors
  if (errors) {
    console.warn('Parsing completed with warnings:', errors);
  }
} catch (error) {
  if (error instanceof MeldAstError) {
    console.error(
      `Parse error at line ${error.location?.start.line}, ` +
      `column ${error.location?.start.column}: ${error.message}`
    );
  }
}

Error Handling

The parser provides detailed error information:

import { parse, MeldAstError, ParseErrorCode } from 'meld-ast';

try {
  const { ast } = parse(input);
} catch (error) {
  if (error instanceof MeldAstError) {
    // Location information
    if (error.location) {
      console.error(
        `Error at line ${error.location.start.line}, ` +
        `column ${error.location.start.column}`
      );
    }
    
    // Error details
    console.error(`
      Message: ${error.message}
      Code: ${error.code}
      ${error.cause ? `Cause: ${error.cause.message}` : ''}
    `);
    
    // JSON representation
    console.error('Full error:', JSON.stringify(error.toJSON(), null, 2));
  }
}

Error codes indicate specific failure types:

  • SYNTAX_ERROR: Basic syntax errors
  • VALIDATION_ERROR: Node validation failures
  • INITIALIZATION_ERROR: Parser initialization issues
  • GRAMMAR_ERROR: Grammar-level problems
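
As a rough illustration of how these codes might drive error handling, the sketch below maps each code to a troubleshooting hint. The `ErrorCode` union and the `hintFor` function are hypothetical stand-ins defined locally, and the hint texts are only suggestions; the real `ParseErrorCode` export may be shaped differently (for example, as an enum).

```typescript
// Hypothetical local mirror of the four documented codes; not the library's own type.
type ErrorCode =
  | "SYNTAX_ERROR"
  | "VALIDATION_ERROR"
  | "INITIALIZATION_ERROR"
  | "GRAMMAR_ERROR";

// Map each code to a short troubleshooting hint (illustrative wording only).
function hintFor(code: ErrorCode): string {
  switch (code) {
    case "SYNTAX_ERROR":
      return "Check the Meld syntax near the reported location.";
    case "VALIDATION_ERROR":
      return "Compare the node against the meld-spec type definitions.";
    case "INITIALIZATION_ERROR":
      return "Inspect the debug output for parser loading problems.";
    case "GRAMMAR_ERROR":
      return "Check the grammar file and the generated parser.";
  }
}

console.log(hintFor("SYNTAX_ERROR"));
```

In application code, the value passed to such a function would come from `error.code` on a caught `MeldAstError`.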

Package Exports

The package provides direct access to its components:

// Main parser
import { parse } from 'meld-ast';

// Direct parser access
import { parser } from 'meld-ast/parser';

// Grammar utilities
import { grammar } from 'meld-ast/grammar';

// Error types
import { MeldAstError } from 'meld-ast/errors';

TypeScript Configuration

Configure your tsconfig.json:

{
  "compilerOptions": {
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "esModuleInterop": true
  }
}

Environment-Specific Behavior

The parser automatically detects and adapts to different environments:

Development Environment

When used in development (source code):

// Grammar is loaded from source directory
// src/grammar/meld.pegjs

Production Environment

When installed as a dependency:

// Pre-built parser is loaded from lib directory
// node_modules/meld-ast/lib/grammar/parser.cjs
// node_modules/meld-ast/lib/grammar/parser.js

The parser uses a robust fallback strategy:

  1. Tries the pre-built CJS parser first (most compatible)
  2. Falls back to the pre-built ESM parser
  3. Falls back to grammar compilation if needed
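
The fallback order can be pictured as a chain of loaders tried in sequence. The following is a minimal sketch under stated assumptions, not the package's actual loading code: `Loader`, `loadFirstAvailable`, and the loader names are all hypothetical, and the loaders are simple stand-ins that either throw or return a placeholder string.

```typescript
// A loader either returns a parser-like value or throws.
type Loader<T> = { name: string; load: () => T };

// Try each loader in order and return the first one that succeeds.
function loadFirstAvailable<T>(loaders: Loader<T>[]): { name: string; value: T } {
  const failures: string[] = [];
  for (const { name, load } of loaders) {
    try {
      return { name, value: load() };
    } catch (err) {
      // Remember why this loader failed, then move on to the next one.
      failures.push(`${name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`No parser could be loaded:\n${failures.join("\n")}`);
}

// Stand-in loaders: the first two fail, the third succeeds.
const result = loadFirstAvailable<string>([
  { name: "prebuilt-cjs", load: () => { throw new Error("not found"); } },
  { name: "prebuilt-esm", load: () => { throw new Error("not found"); } },
  { name: "compile-grammar", load: () => "compiled parser" },
]);
console.log(result.name); // compile-grammar
```

Collecting the failure messages keeps the final error informative when every strategy fails, which corresponds to the "Failed to initialize parser" case listed under Troubleshooting.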

Debug Output

Enable detailed debug logging to see path resolution:

DEBUG=meld-ast:* node your-script.js

Example debug output:

Environment: {
  currentDir: '/path/to/node_modules/meld-ast/lib/grammar',
  isDev: false,
  pkgRoot: '/path/to/node_modules/meld-ast'
}
Looking for pre-built parsers...
Found pre-built parser at: /path/to/node_modules/meld-ast/lib/grammar/parser.cjs
Successfully loaded CJS parser

Troubleshooting

Common Issues

  1. Parser Initialization

    Error: Failed to initialize parser
    • Check environment detection in debug output
    • Verify package root resolution
    • Check pre-built parser availability
    • Verify grammar file paths
  2. Syntax Errors

    Error: Parse error: Unexpected token
    • Verify Meld syntax
    • Check directive formatting
    • Ensure proper code fence closure
  3. Validation Errors

    Error: Node validation failed
    • Check meld-spec compliance
    • Verify required fields
    • Check field types

Debug Mode

For verbose output:

DEBUG=meld-ast:* node your-script.js

Contributing

  1. Check current issues
  2. Run tests: npm test
  3. Add tests for new features
  4. Submit a PR

License

ISC

Development Setup

Prerequisites

  • Node.js 16 or higher
  • npm 7 or higher
  • TypeScript 5.3 or higher

Initial Setup

  1. Clone the repository:

    git clone https://github.com/adamavenir/meld-ast.git
    cd meld-ast
  2. Install dependencies:

    npm install
  3. Build the project:

    npm run build

The build process includes:

  • Generating the parser from the PeggyJS grammar
  • Building ESM and CommonJS versions
  • Creating TypeScript declaration files
  • Generating source maps
  • Verifying the build output

Project Structure

meld-ast/
├── src/
│   ├── grammar/          # PeggyJS grammar files
│   │   └── meld.pegjs    # Main grammar definition
│   ├── ast/             # AST type definitions
│   ├── parser/          # Parser implementation
│   └── index.ts         # Main entry point
├── lib/
│   └── grammar/         # Pre-built parser files
│       ├── parser.js    # ESM parser
│       └── parser.cjs   # CommonJS parser
├── dist/                # Build output
├── test/               # Test files
└── scripts/            # Build scripts

Build Output Structure

The build process generates:

  1. ESM Build (dist/):
    • Main entry point: index.js
    • Type definitions: index.d.ts
    • Source maps: *.js.map, *.d.ts.map
    • Generated parser: grammar/parser.js
  2. CJS Build (dist/cjs/):
    • CommonJS entry: index.js
    • Type definitions: index.d.ts
    • Source maps: *.js.map, *.d.ts.map
    • Generated parser: grammar/parser.js

Debugging

  1. Parser Generation

    DEBUG=meld-ast:grammar npm run build:grammar
  2. Build Process

    DEBUG=meld-ast:* npm run build
  3. Tests

    DEBUG=meld-ast:* npm test

Common Development Tasks

  1. Adding New Grammar Rules

    1. Edit src/grammar/meld.pegjs
    2. Add test cases in test/
    3. Run npm run build
    4. Run npm test
  2. Modifying Parser Behavior

    1. Edit relevant files in src/
    2. Update tests as needed
    3. Run npm run build
    4. Run npm test
  3. Updating Types

    1. Ensure compatibility with meld-spec
    2. Update type definitions
    3. Run npm run build
    4. Verify with npm test

Code Fences

Code blocks can be fenced with 3, 4, or 5 backticks. The number of backticks in the closing fence must match the opening fence. This allows for proper nesting of code blocks:

# 3 backticks (basic)
```python
print("hello")
```

# 4 backticks (can contain 3-backtick fences)
````
Here's some code:
```python
print("hello")
```
````

# 5 backticks (can contain 3 and 4-backtick fences)
`````
A complex example:
```python
print("hello")
```
````javascript
console.log("hi");
````
`````

The parser will always capture the outermost fence and treat any inner fences as content.
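
The matching rule (a closing fence must use the same number of backticks as the opening fence) can be sketched in a few lines. This is only an illustration of the rule as described here, not the Peggy grammar the parser actually uses; `findFenceEnd` is a hypothetical helper.

```typescript
// Return the index of the line that closes the fence opened at `start`,
// or -1 if the line is not a fence or the fence is never closed.
function findFenceEnd(lines: string[], start: number): number {
  const open = /^(`{3,5})/.exec(lines[start]);
  if (!open) return -1;
  const marker = open[1];
  for (let i = start + 1; i < lines.length; i++) {
    // Only a line made of exactly the same backtick run closes the fence,
    // so shorter inner fences are skipped over as content.
    if (lines[i].trim() === marker) return i;
  }
  return -1;
}

const doc = [
  "````",
  "Here's some code:",
  "```python",
  'print("hello")',
  "```",
  "````",
];
console.log(findFenceEnd(doc, 0)); // 5
```

Because the inner 3-backtick lines never equal the 4-backtick marker, they are treated as ordinary content of the outer fence.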

Multiline Embed Syntax

The parser supports multiline embed content using the double bracket syntax:

@embed [[
This is a multiline
embed content that can span
multiple lines.
]]

You can use variable interpolation within multiline embeds:

@embed [[
Hello, {{name}}!
This is a multi-line
content for embed.
]]

You can also specify sections within multiline embeds:

@embed [[ #SectionName
This content will be embedded
from the specified section.
]]

The multiline embed syntax provides a cleaner way to include content directly in your Meld document without having to create separate files for small pieces of content.

Embed Directive Syntax

The @embed directive supports several distinct syntax forms, each with specific semantics:

  1. Single Brackets for Paths: @embed [path/to/file.md]

    • Used for embedding content from external files
    • Path can be a relative or absolute file path
  2. Path Variables in Single Brackets: @embed [$path_variable]

    • References a path stored in a variable
    • The variable must be defined elsewhere in the document
  3. Double Brackets for Inline Content: @embed [[content goes here]]

    • Used for embedding content directly in the document
    • Can span multiple lines
    • Variables within double brackets are treated as literal text
  4. Direct Variable Embedding: @embed {{variable}}

    • Embeds the content of a variable directly
    • The variable is resolved at runtime
    • Shorthand alternative to @embed [{{variable}}]
  5. Variables in Brackets: @embed [{{variable}}]

    • Embeds the content from a path stored in a variable
    • The variable is resolved at runtime
    • Path validation is applied to the resolved content

These different syntax forms make the @embed directive highly flexible for various content embedding scenarios.
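
To make the five forms easier to tell apart, here is a small classifier sketch. It works on the raw directive text with regular expressions and is purely illustrative: `EmbedForm` and `classifyEmbed` are hypothetical names, and the real parser distinguishes these forms in its grammar rather than with code like this.

```typescript
type EmbedForm =
  | "path"                // @embed [path/to/file.md]
  | "path-variable"       // @embed [$path_variable]
  | "inline-content"      // @embed [[content goes here]]
  | "direct-variable"     // @embed {{variable}}
  | "bracketed-variable"  // @embed [{{variable}}]
  | "unknown";

// Classify a single-line @embed directive by the shape of its argument.
function classifyEmbed(line: string): EmbedForm {
  const arg = line.replace(/^@embed\s+/, "").trim();
  if (arg.startsWith("[[")) return "inline-content";
  if (/^\{\{.+\}\}$/.test(arg)) return "direct-variable";
  if (/^\[\$[^\]]+\]$/.test(arg)) return "path-variable";
  if (/^\[\{\{.+\}\}\]$/.test(arg)) return "bracketed-variable";
  if (/^\[[^\]]+\]$/.test(arg)) return "path";
  return "unknown";
}

console.log(classifyEmbed("@embed [path/to/file.md]")); // path
console.log(classifyEmbed("@embed {{variable}}"));      // direct-variable
```

The double-bracket check comes first because `[[...]]` would otherwise also match the single-bracket patterns.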

Code Fence Examples

Here are examples of how code fences are parsed with different options:

// Default behavior (preserveCodeFences: true)
const input = '```javascript\nconsole.log("hello");\n```';
const preserved = await parse(input);
console.log((preserved.ast[0] as CodeFenceNode).content);
// Output: ```javascript\nconsole.log("hello");\n```

// Without fence preservation
const stripped = await parse(input, { preserveCodeFences: false });
console.log((stripped.ast[0] as CodeFenceNode).content);
// Output: console.log("hello");

// Nested fences (4 backticks containing 3 backticks)
const nested = '````markdown\n```js\nlet x = 1;\n```\n````';
const nestedResult = await parse(nested);
// nestedResult.ast[0] preserves all fences in its content
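
For callers that parse with the default settings but occasionally need only the inner content, the effect of `preserveCodeFences: false` can be approximated after the fact. The helper below is a sketch under that assumption; `stripOuterFence` is a hypothetical name and not part of the package.

```typescript
// Remove the first and last line when they form a matching fence pair;
// otherwise return the content unchanged.
function stripOuterFence(content: string): string {
  const lines = content.split("\n");
  const open = /^(`{3,5})/.exec(lines[0]);
  const last = lines[lines.length - 1];
  if (lines.length < 2 || !open || last.trim() !== open[1]) return content;
  return lines.slice(1, -1).join("\n");
}

const fenced = '```javascript\nconsole.log("hello");\n```';
console.log(stripOuterFence(fenced)); // console.log("hello");
```

Only the outermost pair is removed, so inner fences in a nested block stay intact.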

Path Directives with Special Variables

Path directives support special variables for commonly used paths:

# Home directory references
@path home_config = "$HOMEPATH/config"
@path home_alt = "$~/config"

# Project root references
@path project_config = "$PROJECTPATH/config"
@path project_alt = "$./config"

These special variables provide consistent ways to reference important paths:

Special Variable   Alias   Description
$HOMEPATH          $~      User's home directory
$PROJECTPATH       $.      Project root directory

The parser correctly sets the path structure with proper base and segments properties:

// For @path config = "$HOMEPATH/config"
{
  type: "PathDirective",
  identifier: "config",
  value: {
    raw: "$HOMEPATH/config",
    structured: {
      base: "$HOMEPATH",
      segments: ["config"]
    }
  }
}

When using these paths in embed or import directives, the runtime system is expected to resolve the special variables to actual filesystem paths.
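
A runtime could resolve such a structured path roughly as sketched here. Everything in this block is an assumption for illustration: the `StructuredPath` interface copies only the two fields shown in the AST example, `resolvePath` is a hypothetical helper, the root directories are made up, and it is assumed that the alias forms (`$~`, `$.`) appear verbatim in `base`.

```typescript
// Only the fields used here; the real meld-spec type may carry more.
interface StructuredPath {
  base: string;
  segments: string[];
}

// Replace the special-variable base with a concrete root and join the segments.
function resolvePath(path: StructuredPath, home: string, project: string): string {
  const roots: Record<string, string> = {
    "$HOMEPATH": home,
    "$~": home,
    "$PROJECTPATH": project,
    "$.": project,
  };
  const root = roots[path.base];
  if (root === undefined) {
    throw new Error(`Unknown base: ${path.base}`);
  }
  return [root, ...path.segments].join("/");
}

console.log(
  resolvePath({ base: "$HOMEPATH", segments: ["config"] }, "/home/alice", "/work/app")
); // /home/alice/config
```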

URL Support in Path Directives

The parser now supports URLs in path directives, allowing you to reference remote content:

# Import from remote URL
@import [https://example.com/docs/file.md]

# Embed content from remote URL
@embed [https://example.com/snippets/code.js]

URLs are detected by checking for the http:// or https:// prefix. The parser performs the following:

  1. Validates that URLs are well-formed
  2. Preserves URLs in their original form during path normalization
  3. Adds a url: true property to the structured path object:
// For @import [https://example.com/file.md]
{
  type: "Directive",
  directive: {
    kind: "import",
    path: {
      raw: "https://example.com/file.md",
      structured: {
        base: ".",
        segments: ["https:", "example.com", "file.md"],
        variables: {},
        url: true
      },
      normalized: "https://example.com/file.md"
    }
  }
}

When using URL paths, the runtime system is expected to retrieve the remote content via HTTP(S) requests.

Note: Paths with slashes must either be URLs (starting with http:// or https://) or use special variables (starting with $). This validation ensures paths are properly structured.
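
The prefix check and the slash rule from the note can be sketched as follows. This is an illustration of the rules as stated, not the parser's own validation code; `isUrlPath` and `checkPath` are hypothetical helpers, and well-formedness is approximated with the standard `URL` constructor.

```typescript
// URLs are recognised by their http:// or https:// prefix.
function isUrlPath(raw: string): boolean {
  return /^https?:\/\//.test(raw);
}

// Return an error message, or null when the path passes these checks.
function checkPath(raw: string): string | null {
  if (isUrlPath(raw)) {
    try {
      new URL(raw); // throws on malformed URLs
      return null;
    } catch {
      return `Malformed URL: ${raw}`;
    }
  }
  // Non-URL paths containing slashes must start with a special variable.
  if (raw.includes("/") && !raw.startsWith("$")) {
    return `Paths with slashes must be URLs or start with "$": ${raw}`;
  }
  return null;
}

console.log(checkPath("https://example.com/file.md")); // null
console.log(checkPath("$PROJECTPATH/config"));         // null
```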

Named Imports

The parser supports selective imports using named import syntax:

# Basic named imports
@import [var1, var2] from [variables.meld]

# Named imports with aliases
@import [var1, var2 as alias2] from [variables.meld]

# Explicit wildcard import
@import [*] from [variables.meld]

# Traditional import (equivalent to wildcard)
@import [variables.meld]

# Empty import list
@import [] from [variables.meld]

# Named imports with variable path
@import [var1, var2] from {{path_variable}}

Named imports allow you to selectively import specific variables from a file instead of importing everything. The AST structure for named imports includes an imports array with each import's name and optional alias:

// For @import [var1, var2 as alias2] from [variables.meld]
{
  type: "Directive",
  directive: {
    kind: "import",
    path: {
      raw: "variables.meld",
      structured: {
        base: ".",
        segments: ["variables.meld"],
        variables: {},
        cwd: true
      },
      normalized: "./variables.meld"
    },
    imports: [
      { name: "var1", alias: null },
      { name: "var2", alias: "alias2" }
    ]
  }
}

The traditional import syntax (@import [path.meld]) is maintained for backward compatibility and is equivalent to a wildcard import (@import [*] from [path.meld]).
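
The shape of the `imports` array can be reproduced with a short sketch that splits the bracketed list by hand. `ImportSpec` and `parseImportList` are hypothetical names used for illustration; the parser itself builds this structure in its grammar.

```typescript
interface ImportSpec {
  name: string;
  alias: string | null;
}

// Parse the text between the brackets of a named import list,
// for example "var1, var2 as alias2".
function parseImportList(list: string): ImportSpec[] {
  return list
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0)
    .map((item) => {
      const parts = item.split(/\s+as\s+/);
      return { name: parts[0] ?? item, alias: parts[1] ?? null };
    });
}

console.log(JSON.stringify(parseImportList("var1, var2 as alias2")));
// [{"name":"var1","alias":null},{"name":"var2","alias":"alias2"}]
```

An empty list yields an empty array, and a wildcard list yields a single entry named `*`; how the parser itself represents those two cases is not shown in this document.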

Data Directives

Data directives allow you to define structured data, which can be referenced elsewhere.

@data config = { 
  "server": "localhost", 
  "port": 8080 
}

@data servers = [
  { "name": "prod", "url": "example.com" },
  { "name": "staging", "url": "staging.example.com" }
]

Data can also be loaded from an external file:

@data config = @embed [config.json]

Variable Syntax

Meld supports different variable types with specific syntaxes:

  1. Text Variables

    Hello, {{name}}!
  2. Data Variables with Field Access

    User: {{user.name}}, Age: {{user.age}}
  3. Data Variables with Array Access

    First user: {{users.0.name}}
    Item by index: {{items.2}}
    Nested arrays: {{matrix.0.1}}
    Dynamic access: {{data[fieldName]}}
  4. Path Variables

    File path: $project_path
  5. Format Options

    Date: {{date>>formatName}}
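
The dotted field and array access syntax can be understood as a walk through nested data, one key at a time. The sketch below shows how a runtime might resolve such a reference; `resolveReference` and the sample data are hypothetical, and the parser itself only produces AST nodes and does not resolve values.

```typescript
// Walk a dotted reference such as "users.0.name" through nested data.
// Numeric keys index into arrays; a missing step yields undefined.
function resolveReference(data: unknown, reference: string): unknown {
  let current: unknown = data;
  for (const key of reference.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const data = {
  users: [{ name: "Ada" }, { name: "Lin" }],
  matrix: [[1, 2], [3, 4]],
};
console.log(resolveReference(data, "users.0.name")); // Ada
console.log(resolveReference(data, "matrix.0.1"));   // 2
```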