@slyracoon23/prompt-spec NPM

Prompt Spec

A framework for testing and benchmarking AI agents with automated optimization capabilities.

Overview

Prompt Spec provides a comprehensive toolkit for defining, testing, and optimizing AI agents. It allows you to:

Define agent specifications in a structured YAML format
Run automated tests to evaluate agent performance
Generate comprehensive test reports with metrics
Optimize agent prompts using AI-powered feedback loops

Features

Declarative Agent Specifications: Define agents, tools, benchmarks, and evaluation criteria in YAML
Automated Testing: Run benchmarks against your agents with comprehensive reporting
AI-Powered Optimization: Automatically improve agent system prompts based on test results
Tool Simulation: Test agents with simulated tools and custom error handling
Flexible Evaluation: Define custom evaluation criteria and scoring systems

Installation

Prerequisites

Node.js 18 or higher
npm or pnpm (recommended)

Install from npm

# Using npm
npm install -g prompt-spec

# Using pnpm (recommended)
pnpm add -g prompt-spec

Set up OpenAI API key

Prompt Spec requires an OpenAI API key to function. You can set it as an environment variable:

# Set for current session
export OPENAI_API_KEY=your_api_key_here

# Or add to your .bashrc/.zshrc for persistence
echo 'export OPENAI_API_KEY=your_api_key_here' >> ~/.bashrc
source ~/.bashrc

Alternatively, you can create a .env file in your project root:

OPENAI_API_KEY=your_api_key_here

Usage

Command Line Interface

Prompt Spec provides a CLI for running tests, transpiling specifications, and optimizing agents:

# Run tests from a YAML specification
prompt-spec test path/to/spec.yaml

# Transpile a YAML specification to a test file
prompt-spec transpile path/to/spec.yaml

# Run tests and optimize the agent's system prompt
prompt-spec optimize path/to/spec.yaml

Programmatic API

import { testSpec, transpileSpec, optimizeSpec } from 'prompt-spec';

// Run tests from a YAML specification
const results = await testSpec('path/to/spec.yaml', {
  watch: false,
  reporter: 'default'
});

// Transpile a YAML specification to a test file
const testFilePath = await transpileSpec('path/to/spec.yaml', {
  output: 'path/to/output.test.ts'
});

// Optimize an agent's system prompt based on test results
const optimizedSpec = await optimizeSpec('path/to/spec.yaml', {
  iterations: 3,
  output: 'path/to/optimized-spec.yaml'
});

Agent Specification Format

Prompt Spec uses a declarative YAML format to define agents, tools, and benchmarks:

metadata:
  name: "Simple Customer Service Agent"
  version: "1.0"
  description: "A basic customer service agent for testing"

agent:
  model: gpt-4o-mini
  systemPrompt: |
    You are a helpful customer service agent. Assist users with their inquiries.
  toolChoice: "auto"
  maxSteps: 3

tools:
  checkUserAccount:
    description: "Check user account details"
    inputSchema:
      type: object
      properties:
        userId:
          type: string
    outputSchema: "UserAccountSchema"
    # Tool implementation details...

benchmarks:
  - name: "Password Reset Test"
    messages:
      - role: "user"
        content: "How do I reset my password?"
    # Benchmark configuration...

CLI Commands

`test`

Run tests from a YAML specification:

prompt-spec test [options] <spec>

Options:

--output, -o: Output path for the generated test file
--keep, -k: Keep the generated test file after running
--watch, -w: Watch for changes and rerun tests
--reporter, -r: Test reporter to use (default, json, verbose)

`transpile`

Transpile a YAML specification to a test file:

prompt-spec transpile [options] <spec>

Options:

--output, -o: Output path for the generated test file

`optimize`

Run tests and optimize the agent's system prompt:

prompt-spec optimize [options] <spec>

Options:

--iterations, -i: Number of optimization iterations (default: 3)
--output, -o: Output path for the optimized specification

Examples

Check out these examples to get started:

# Create a basic agent spec
cat > agent-spec.yaml << EOF
metadata:
  name: "Hello World Agent"
  version: "1.0"
  description: "A simple hello world agent"

agent:
  model: gpt-4o-mini
  systemPrompt: |
    You are a helpful assistant.
  maxSteps: 1

benchmarks:
  - name: "Basic Test"
    messages:
      - role: "user"
        content: "Say hello world"
    expectedOutputs:
      - pattern: "hello world"
        caseSensitive: false
EOF

# Run the test
prompt-spec test agent-spec.yaml

Development

To contribute to Prompt Spec:

# Clone the repository
git clone https://github.com/Slyracoon23/prompt-spec.git
cd prompt-spec

# Install dependencies
pnpm install

# Build the project
pnpm build

# Run tests
pnpm test

# Run examples
pnpm examples

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

ai testing benchmarking agents llm prompt-engineering

@ai-sdk/openai @types/figlet @types/gradient-string ai commander common-tags dedent dotenv figlet gradient-string js-yaml openai vitest vite zod

0.0.2

5 months ago

0.0.1

5 months ago