1.0.1 • Published 7 months ago
simulacra-test v1.0.1
Simulacra Framework
A TypeScript framework for simulating and testing agent-based conversations in Jest. This framework allows you to create, run, and test conversational agents with both deterministic and LLM-powered behavior for testing purposes.
Features
- Simulate conversations with deterministic or LLM-powered responses
- Rich assertion API for testing agent behavior.
- Automated reporting of simulation results.
- Extends the full power ofJest's test framework.
Roadmap
- [] Support for test parallelization
- [] Support for (cheaper) simulation agent models.
- [] LLM-as-judge transcript conversation transcript evaluation
- [] Support "streamed input", which waits for an "stop" token before generating an agent response.
- [] Anything else! We're open to suggestions!
Set up
- Install the framework using
npm install simulacra
- Define your test using
simulationTest
in a file with aagent.test.ts
extension - Add OPENAI_API_KEY to your environment variables (if using LLM-powered responses)
- Run your new agent tests using
npx jest agent.test.ts
.
Look at the examples for an example of a project setup to use the framework.
Usage
The framework provides two main ways to generate conversations:
DeterministicConversationGenerator
for deterministic testsLLMConversationGenerator
for testing with real language models
Here's an example testing a customer support scenario:
simulationTest(
'should handle refund requests appropriately',
{
role: 'frustrated customer who recently purchased a faulty laptop',
task: 'You bought a laptop last week that keeps crashing. You have tried troubleshooting with tech support but nothing works. Now you want to request a refund.',
conversationGenerator: new LLMConversationGenerator(),
getAgentResponse: (simulationAgentState) => {
// Your agent logic here
return handleCustomerRequest(simulationAgentState.lastResponse?.content);
},
},
async ({ agent }) => {
// Test that refund handler was called
simulationExpect(agent.events, async () => {
expect(mockHandleRefund).toHaveBeenCalled();
}).eventually();
}
);
You can also use DeterministicConversationGenerator
for to test a conversational agent with a specific set of messages:
simulationTest(
'should handle a specific conversation flow',
{
role: 'customer seeking technical support',
task: 'You need help with your printer that keeps jamming',
conversationGenerator: new DeterministicConversationGenerator([
{
role: "assistant",
content: "I understand you're having issues with a printer jam. Have you tried removing all paper and checking for debris?"
},
{
role: "assistant",
content: "Let's try resetting the printer. Please turn it off, wait 30 seconds, then turn it back on."
},
{
role: "assistant",
content: "Great! The printer should now be working correctly. Is there anything else you need help with?"
}
]),
getAgentResponse: (simulationAgentState) => handleTechSupport(simulationAgentState.lastResponse?.content)
},
async ({ agent }) => {
simulationExpect(agent.events, async () => {
expect(mockPrinterReset).toHaveBeenCalled();
}).eventually();
}
);
Assertions
The framework provides powerful assertion capabilities through simulationExpect
:
eventually()
: Asserts a condition is true by the end of the simulation
// Assert something happens by the end of a simulation
simulationExpect(simulationAgent.events, async (simulationAgent) => {
expect(simulationAgent.lastReceivedMessage?.content).toMatchSnapshot();
}).eventually();
always()
: Asserts a condition remains true throughout the entire simulation
// Assert something remains true throughout a simulation
simulationExpect(simulationAgent.events, async (simulationAgent) => {
expect(mockDeleteUserData.notToBeCalled()).toBe(true);
}).always();
when(condition)
: Asserts a condition when a specific state is met
// Assert something when a condition is met during a simulation
simulationExpect(simulationAgent.events, async (simulationAgent) => {
expect(mockDeleteUserData).toBeCalled();
}).when(state => state.lastSimulationAgentResponse?.content === 'Please delete my data.');
Reporting
License
MIT