Runs
A run represents a single test session within an environment. Starting a run captures a “before” snapshot, and ending it captures an “after” snapshot to compute the diff.
Run Lifecycle
1. Start Run: Takes a “before” snapshot of the environment state
2. Agent Execution: Your agent makes API calls that modify the environment
3. Compute Diff: Ending the run takes an “after” snapshot and compares the before/after states to produce a diff
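The lifecycle above can be sketched with a small in-memory model. This is an illustration only, not the Agent Diff SDK: the `Run` class, its `start`/`end` methods, and the dict-based environment are hypothetical. The sketch also stores the snapshots themselves, whereas a real run records snapshot IDs.

```python
import copy
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Hypothetical stand-in for an environment: table name -> {primary key -> row}.
State = Dict[str, Dict[int, Dict[str, Any]]]


@dataclass
class Run:
    """Illustrative model of a run (assumed names, not the real API)."""
    env: State
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "running"
    before_snapshot: Optional[State] = None
    after_snapshot: Optional[State] = None

    def start(self) -> None:
        # Step 1: capture the "before" snapshot of the environment state.
        self.before_snapshot = copy.deepcopy(self.env)
        self.status = "running"

    def end(self) -> None:
        # Step 3: capture the "after" snapshot; the diff is derived from both.
        self.after_snapshot = copy.deepcopy(self.env)
        self.status = "completed"


env: State = {"messages": {1: {"text": "hello"}}}
run = Run(env)
run.start()
# Step 2: the agent's API calls modify the environment.
env["messages"][2] = {"text": "posted by agent"}
run.end()
```

Because `start` deep-copies the state, later writes by the agent do not leak into the “before” snapshot.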
Run Properties
| Property | Description |
|---|---|
| runId | Unique identifier |
| status | running, completed, failed |
| beforeSnapshot | Snapshot ID before agent execution |
| afterSnapshot | Snapshot ID after agent execution |
Diffs
A diff is the computed difference between the before and after states of an environment.
Output Structure
Diff Types
- Inserts: New records created by the agent
- Updates: Existing records modified (shows before/after)
- Deletes: Records removed by the agent
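To make the three diff types concrete, here is a minimal sketch that classifies changes between two snapshots. The `compute_diff` function, the snapshot shape (table name mapped to rows keyed by primary key), and the output field names are assumptions for illustration; they are not Agent Diff's actual output structure.

```python
from typing import Any, Dict, List

Table = Dict[int, Dict[str, Any]]   # primary key -> row (assumed shape)
State = Dict[str, Table]            # table name -> table


def compute_diff(before: State, after: State) -> Dict[str, List[Dict[str, Any]]]:
    """Classify row-level changes between two snapshots."""
    diff: Dict[str, List[Dict[str, Any]]] = {"inserts": [], "updates": [], "deletes": []}
    for table in sorted(set(before) | set(after)):
        old_rows = before.get(table, {})
        new_rows = after.get(table, {})
        for pk in sorted(set(old_rows) | set(new_rows)):
            if pk not in old_rows:
                # Row exists only after the run: an insert.
                diff["inserts"].append({"table": table, "id": pk, "row": new_rows[pk]})
            elif pk not in new_rows:
                # Row existed before but is gone: a delete.
                diff["deletes"].append({"table": table, "id": pk, "row": old_rows[pk]})
            elif old_rows[pk] != new_rows[pk]:
                # Row exists in both but changed: an update with before/after.
                diff["updates"].append(
                    {"table": table, "id": pk, "before": old_rows[pk], "after": new_rows[pk]}
                )
    return diff


before = {"users": {1: {"name": "Ada"}, 2: {"name": "Bob"}}}
after = {"users": {1: {"name": "Ada L."}, 3: {"name": "Cy"}}}
diff = compute_diff(before, after)
```

In this example, user 3 is reported as an insert, user 1 as an update, and user 2 as a delete.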
How Diffs Are Captured
Agent Diff uses PostgreSQL logical replication to capture every change:
- WAL Capture: All database writes are logged to the Write-Ahead Log
- wal2json: Converts WAL entries to JSON format
- Change Journal: Filtered and stored per environment/run
- Diff Computation: Aggregated into inserts/updates/deletes
This approach captures changes at the database level, so it works regardless of which API endpoint the agent used.
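The last two steps of the pipeline can be sketched as follows. The sketch assumes messages in wal2json's format-version 1 shape (a `change` array whose entries carry `kind`, `schema`, `table`, `columnnames`, `columnvalues`, and `oldkeys`). The `aggregate_changes` function, the idea of filtering by schema name, and the output field names are hypothetical; how Agent Diff actually scopes changes to an environment or run is not described here.

```python
import json
from typing import Any, Dict, List


def aggregate_changes(wal_messages: List[str], schema: str) -> Dict[str, List[Dict[str, Any]]]:
    """Group wal2json (format-version 1) changes into inserts/updates/deletes."""
    diff: Dict[str, List[Dict[str, Any]]] = {"inserts": [], "updates": [], "deletes": []}
    for raw in wal_messages:
        for change in json.loads(raw).get("change", []):
            # Assumed filter: keep only changes belonging to one schema.
            if change.get("schema") != schema:
                continue
            table = change["table"]
            kind = change["kind"]
            # "oldkeys" carries the replica identity columns of the old row.
            old = change.get("oldkeys", {})
            key = dict(zip(old.get("keynames", []), old.get("keyvalues", [])))
            if kind == "insert":
                row = dict(zip(change["columnnames"], change["columnvalues"]))
                diff["inserts"].append({"table": table, "row": row})
            elif kind == "update":
                row = dict(zip(change["columnnames"], change["columnvalues"]))
                diff["updates"].append({"table": table, "key": key, "after": row})
            elif kind == "delete":
                diff["deletes"].append({"table": table, "key": key})
    return diff


messages = [
    json.dumps({"change": [
        {"kind": "insert", "schema": "env_a", "table": "issues",
         "columnnames": ["id", "title"], "columnvalues": [7, "New issue"]},
    ]}),
    json.dumps({"change": [
        {"kind": "update", "schema": "env_a", "table": "issues",
         "columnnames": ["id", "title"], "columnvalues": [7, "Renamed"],
         "oldkeys": {"keynames": ["id"], "keyvalues": [7]}},
        {"kind": "delete", "schema": "env_b", "table": "issues",
         "oldkeys": {"keynames": ["id"], "keyvalues": [1]}},
        {"kind": "delete", "schema": "env_a", "table": "issues",
         "oldkeys": {"keynames": ["id"], "keyvalues": [3]}},
    ]}),
]
diff = aggregate_changes(messages, schema="env_a")
```

This sketch keeps every change as a separate entry, so a row that is inserted and then updated within one run appears in both lists; a real aggregation step might collapse such sequences.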
