1.0.1 • Published 7 months ago

simulacra-test v1.0.1

Simulacra Framework

A TypeScript framework for simulating and testing agent-based conversations in Jest. It lets you create, run, and test conversational agents against simulated conversations driven by either deterministic or LLM-powered responses.

Features

  • Simulate conversations with deterministic or LLM-powered responses.
  • Rich assertion API for testing agent behavior.
  • Automated reporting of simulation results.
  • Extends the full power of Jest's test framework.

Roadmap

  • [ ] Support for test parallelization.
  • [ ] Support for cheaper simulation agent models.
  • [ ] LLM-as-judge evaluation of conversation transcripts.
  • [ ] Support for "streamed input", which waits for a "stop" token before generating an agent response.
  • [ ] Anything else! We're open to suggestions!

Set up

  1. Install the framework using npm install simulacra.
  2. Define your test using simulationTest in a file with an agent.test.ts extension.
  3. Add OPENAI_API_KEY to your environment variables (if using LLM-powered responses).
  4. Run your new agent tests using npx jest agent.test.ts.

See the examples for a sample project set up to use the framework.
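As a rough starting point, a Jest configuration for the steps above might look like the sketch below. It is not taken from the framework's examples: it assumes ts-jest is installed as a dev dependency, and the timeout value is an arbitrary illustration. Jest's default per-test timeout is 5 seconds, which LLM-powered simulations may exceed.

```typescript
// jest.config.ts (sketch; assumes ts-jest is installed as a dev dependency)
const config = {
  // Compile TypeScript test files with ts-jest.
  preset: "ts-jest",
  testEnvironment: "node",
  // Pick up agent.test.ts as well as files such as refund.agent.test.ts.
  testMatch: ["**/*agent.test.ts"],
  // Illustrative value only: gives slow LLM-backed simulations room to finish.
  testTimeout: 60000,
};

export default config;
```

Jest typically needs ts-node available to read a TypeScript config file; a plain jest.config.js with the same fields works as well.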

Usage

The framework provides two main ways to generate conversations:

  1. DeterministicConversationGenerator for deterministic tests
  2. LLMConversationGenerator for testing with real language models
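To make the difference concrete, here is a minimal sketch of what a deterministic generator does conceptually: it replays a fixed script, one message per turn. ChatMessage and ScriptedGenerator are hypothetical names used for illustration only. This is not the framework's implementation or API.

```typescript
// Hypothetical message shape, modeled on the { role, content } objects used in this README.
interface ChatMessage {
  role: "assistant" | "user";
  content: string;
}

// Illustrative stand-in for a deterministic generator: replays a fixed script in order.
class ScriptedGenerator {
  private index = 0;
  private readonly script: ChatMessage[];

  constructor(script: ChatMessage[]) {
    this.script = script;
  }

  // Returns the next scripted message, or undefined once the script is exhausted.
  next(): ChatMessage | undefined {
    const scripted = this.script[this.index];
    this.index += 1;
    return scripted;
  }
}

const generator = new ScriptedGenerator([
  { role: "assistant", content: "My printer keeps jamming." },
  { role: "assistant", content: "I removed the paper, but it still jams." },
]);

// Drain the script the way a simulation loop would, one turn at a time.
const transcript: string[] = [];
let nextMessage = generator.next();
while (nextMessage !== undefined) {
  transcript.push(nextMessage.content);
  nextMessage = generator.next();
}
```

Because the script is fixed, every run produces the same transcript, which is what makes this style suitable for repeatable tests. An LLM-backed generator trades that repeatability for more realistic variation.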

Here's an example testing a customer support scenario:

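// Assumes these are named exports of the package installed in Set up.
import { simulationTest, simulationExpect, LLMConversationGenerator } from 'simulacra';

// handleCustomerRequest and mockHandleRefund stand in for your own agent code and Jest mock.
simulationTest(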
  'should handle refund requests appropriately',
  {
    role: 'frustrated customer who recently purchased a faulty laptop',
    task: 'You bought a laptop last week that keeps crashing. You have tried troubleshooting with tech support but nothing works. Now you want to request a refund.',
    conversationGenerator: new LLMConversationGenerator(),
    getAgentResponse: (simulationAgentState) => {
      // Your agent logic here
      return handleCustomerRequest(simulationAgentState.lastResponse?.content);
    },
  },
  async ({ agent }) => {
    // Test that refund handler was called
    simulationExpect(agent.events, async () => {
      expect(mockHandleRefund).toHaveBeenCalled();
    }).eventually();
  }
);
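The example above leaves handleCustomerRequest and mockHandleRefund undefined. The sketch below shows one hypothetical shape they could take, using a plain array instead of a Jest mock so that it runs on its own. The keyword match and the string return type are assumptions: this README does not state what getAgentResponse is expected to return.

```typescript
// Hypothetical refund side effect. In a Jest test this would be a jest.fn() such as mockHandleRefund.
const refundCalls: string[] = [];
function handleRefund(reason: string): void {
  refundCalls.push(reason);
}

// Hypothetical agent logic: takes the simulated customer's last message and returns the agent's reply.
function handleCustomerRequest(customerMessage: string | undefined): string {
  if (customerMessage === undefined) {
    // No customer message yet, so open the conversation.
    return "Hello! How can I help you today?";
  }
  if (/refund/i.test(customerMessage)) {
    // Naive keyword match, for illustration only.
    handleRefund(customerMessage);
    return "I'm sorry about the trouble. I've started a refund for your laptop.";
  }
  return "Could you describe the problem in more detail?";
}

const firstReply = handleCustomerRequest(undefined);
const secondReply = handleCustomerRequest("It still crashes. I want a refund.");
```

With a handler like this, an eventually() assertion on the refund mock passes as soon as the simulated customer asks for a refund at any point in the conversation.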

You can also use DeterministicConversationGenerator to test a conversational agent against a specific, scripted set of messages:

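// Assumes these are named exports of the package installed in Set up.
import { simulationTest, simulationExpect, DeterministicConversationGenerator } from 'simulacra';

// handleTechSupport and mockPrinterReset stand in for your own agent code and Jest mock.
simulationTest(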
  'should handle a specific conversation flow',
  {
    role: 'customer seeking technical support',
    task: 'You need help with your printer that keeps jamming',
    conversationGenerator: new DeterministicConversationGenerator([
      { 
        role: "assistant", 
        content: "I understand you're having issues with a printer jam. Have you tried removing all paper and checking for debris?" 
      },
      { 
        role: "assistant", 
        content: "Let's try resetting the printer. Please turn it off, wait 30 seconds, then turn it back on." 
      },
      { 
        role: "assistant", 
        content: "Great! The printer should now be working correctly. Is there anything else you need help with?" 
      }
    ]),
    getAgentResponse: (simulationAgentState) => handleTechSupport(simulationAgentState.lastResponse?.content)
  },
  async ({ agent }) => {
    simulationExpect(agent.events, async () => {
      expect(mockPrinterReset).toHaveBeenCalled();
    }).eventually();
  }
);

Assertions

simulationExpect takes the agent's events and a callback containing ordinary Jest assertions, followed by a modifier that controls when the callback must pass:

  • eventually(): Asserts a condition is true by the end of the simulation
// Assert something happens by the end of a simulation
simulationExpect(simulationAgent.events, async (simulationAgent) => {
  expect(simulationAgent.lastReceivedMessage?.content).toMatchSnapshot();
}).eventually();
  • always(): Asserts a condition remains true throughout the entire simulation
// Assert something remains true throughout a simulation
simulationExpect(simulationAgent.events, async (simulationAgent) => {
  expect(mockDeleteUserData).not.toBeCalled();
}).always();
  • when(condition): Asserts that the expectation holds when the simulation state satisfies the given condition
// Assert something when a condition is met during a simulation
simulationExpect(simulationAgent.events, async (simulationAgent) => {
  expect(mockDeleteUserData).toBeCalled();
}).when(state => state.lastSimulationAgentResponse?.content === 'Please delete my data.');
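
One way to picture the three modifiers is as checks over the sequence of states a simulation passes through. The sketch below is a simplified model, not the framework's implementation: TurnState and the three functions are hypothetical, and it assumes that eventually() is judged against the final state and that when() applies the check to every state matching the condition.

```typescript
// Hypothetical snapshot of the simulation after each turn.
interface TurnState {
  lastMessage: string;
  deleteCalls: number;
}

type Check = (state: TurnState) => boolean;

// Assumed meaning of eventually(): the check holds at the final state.
function eventually(states: TurnState[], check: Check): boolean {
  const last = states[states.length - 1];
  return last !== undefined && check(last);
}

// Assumed meaning of always(): the check holds at every state.
function always(states: TurnState[], check: Check): boolean {
  return states.every(check);
}

// Assumed meaning of when(): the check holds at every state that satisfies the condition.
function when(states: TurnState[], condition: Check, check: Check): boolean {
  return states.filter(condition).every(check);
}

const states: TurnState[] = [
  { lastMessage: "Hi, I have a question about my account.", deleteCalls: 0 },
  { lastMessage: "Please delete my data.", deleteCalls: 1 },
  { lastMessage: "Thanks, goodbye.", deleteCalls: 1 },
];

const eventuallyDeleted = eventually(states, (s) => s.deleteCalls > 0); // true
const neverDeleted = always(states, (s) => s.deleteCalls === 0); // false
const deletedOnRequest = when(
  states,
  (s) => s.lastMessage === "Please delete my data.",
  (s) => s.deleteCalls > 0,
); // true
```

In this model always() is the strictest modifier, since a single failing turn fails the assertion, while when() ignores every turn that does not match its condition.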

Reporting

License

MIT

1.0.1 • 7 months ago
1.0.0 • 7 months ago