Core Concepts
Templates & Environments
Templates are populated database schemas. Environments are isolated, ephemeral copies of templates where your agents operate. Each environment gets its own base API URL that you can proxy to your agents.
Runs & Diffs
A run represents a single test session within an environment. Starting a run takes a snapshot of the environment; ending it takes a second snapshot and returns a diff: the computed difference between the environment's before and after states.
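To make the before/after idea concrete, here is a minimal sketch of how a diff between two snapshots could be computed. It is an illustration only, not the platform's implementation: the snapshot shape (table name, then primary key, then row) and the names compute_diff, before, and after are assumptions made for this example.

```python
from typing import Any

Row = dict[str, Any]
# Assumed snapshot shape: table name -> primary key -> row.
Snapshot = dict[str, dict[Any, Row]]


def compute_diff(before: Snapshot, after: Snapshot) -> dict[str, list]:
    """Classify row-level changes between two snapshots (hypothetical helper)."""
    diff: dict[str, list] = {"inserts": [], "updates": [], "deletes": []}
    for table in sorted(set(before) | set(after)):
        old_rows = before.get(table, {})
        new_rows = after.get(table, {})
        for pk, row in new_rows.items():
            if pk not in old_rows:
                # Row exists only in the "after" snapshot.
                diff["inserts"].append({"table": table, "pk": pk, "row": row})
            elif row != old_rows[pk]:
                old = old_rows[pk]
                # Keep only the columns whose values differ.
                changed = {
                    k: row.get(k)
                    for k in row.keys() | old.keys()
                    if row.get(k) != old.get(k)
                }
                diff["updates"].append({"table": table, "pk": pk, "changed": changed})
        for pk, row in old_rows.items():
            if pk not in new_rows:
                # Row exists only in the "before" snapshot.
                diff["deletes"].append({"table": table, "pk": pk, "row": row})
    return diff


# Example data, invented for illustration.
before = {
    "issues": {
        1: {"title": "Fix login", "state": "open"},
        2: {"title": "Old task", "state": "open"},
    }
}
after = {
    "issues": {
        1: {"title": "Fix login", "state": "done"},
        3: {"title": "Add docs", "state": "open"},
    }
}

diff = compute_diff(before, after)
```

In this example the diff would hold one update (issue 1 moved to "done"), one insert (issue 3), and one delete (issue 2).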
Evaluations
Evaluations let you verify that your AI agent performed the expected actions. You can create your own test suites, which compare the expected state changes against the diff results, or use our example suites for Linear and Slack.
Tests & Assertions
Define test suites with expected outcomes using our assertion DSL. Each test specifies a prompt, environment template, and assertions that verify the agent made the correct database changes (inserts, updates, deletes).
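As a rough illustration of checking assertions against a diff, the sketch below matches expected inserts, updates, and deletes against recorded changes. It does not use the real assertion DSL: the assertion fields (op, table, where, count), the diff shape, and the check_assertions helper are all hypothetical and exist only for this example.

```python
from typing import Any


def matches(values: dict[str, Any], where: dict[str, Any]) -> bool:
    """True if every expected column/value pair is present in the change."""
    return all(values.get(k) == v for k, v in where.items())


def check_assertions(diff: dict[str, list], assertions: list[dict]) -> list[str]:
    """Return a failure message for each assertion the diff does not satisfy."""
    failures = []
    for a in assertions:
        # Changes of the asserted kind ("inserts", "updates", "deletes") on the table.
        candidates = [c for c in diff.get(a["op"], []) if c["table"] == a["table"]]
        hits = [c for c in candidates if matches(c["values"], a.get("where", {}))]
        expected = a.get("count", 1)  # assumed default: exactly one matching change
        if len(hits) != expected:
            failures.append(
                f"{a['op']} on {a['table']}: expected {expected}, found {len(hits)}"
            )
    return failures


# Hypothetical diff produced by a run in which an agent closed an issue
# and left a comment on it.
diff = {
    "inserts": [{"table": "comments", "values": {"issue_id": 1, "body": "Fixed"}}],
    "updates": [{"table": "issues", "values": {"state": "done"}}],
    "deletes": [],
}

assertions = [
    {"op": "inserts", "table": "comments", "where": {"issue_id": 1}},
    {"op": "updates", "table": "issues", "where": {"state": "done"}},
    {"op": "deletes", "table": "issues", "count": 0},  # nothing may be deleted
]

failures = check_assertions(diff, assertions)

# An assertion the diff cannot satisfy: it expects one deleted issue.
unmet = check_assertions(diff, [{"op": "deletes", "table": "issues"}])
```

An empty failures list means every expected change was found; each entry in unmet names the operation, the table, and the expected versus actual match counts.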
Supported APIs
Slack
Web API coverage for conversations, chat, reactions, users, and more
Linear
GraphQL API for issues, teams, projects, comments, and workflow states
More APIs coming soon.
