Runs & Diffs

Runs

A run represents a single test session within an environment. Starting a run captures a “before” snapshot; ending it captures an “after” snapshot, and the two are compared to compute the diff.

Run Lifecycle

Start Run

Takes a “before” snapshot of the environment state

run = client.start_run(envId=env.environmentId)

Agent Execution

Your agent makes API calls that modify the environment

Compute Diff

Compares before/after states to produce a diff

result = client.diff_run(runId=run.runId)
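The three lifecycle steps can be wrapped in one helper. The sketch below is illustrative only: `run_and_diff` and `StubClient` are hypothetical names, and the stub stands in for the real client so the example runs without network access. Only `start_run(envId=...)`, `diff_run(runId=...)`, and the `runId` attribute are taken from this page.

```python
from types import SimpleNamespace
from typing import Any, Callable


def run_and_diff(client: Any, env_id: str, agent: Callable[[], None]) -> Any:
    """Start a run, execute the agent, then compute and return the diff."""
    run = client.start_run(envId=env_id)     # step 1: "before" snapshot
    agent()                                  # step 2: agent modifies the environment
    return client.diff_run(runId=run.runId)  # step 3: compare before/after states


class StubClient:
    """Hypothetical in-memory stand-in, used only to exercise the helper."""

    def __init__(self) -> None:
        self.calls = []

    def start_run(self, envId: str) -> SimpleNamespace:
        self.calls.append(f"start_run:{envId}")
        return SimpleNamespace(runId="run-1")

    def diff_run(self, runId: str) -> dict:
        self.calls.append(f"diff_run:{runId}")
        return {"inserts": [], "updates": [], "deletes": []}


stub = StubClient()
diff = run_and_diff(stub, "env-1", agent=lambda: stub.calls.append("agent"))
print(stub.calls)  # ['start_run:env-1', 'agent', 'diff_run:run-1']
```

Passing the agent as a callable keeps the ordering explicit: nothing the agent does can happen before the "before" snapshot or after the diff is computed.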

Run Properties

Property	Description
`runId`	Unique identifier
`status`	`running`, `completed`, `failed`
`beforeSnapshot`	Snapshot ID before agent execution
`afterSnapshot`	Snapshot ID after agent execution
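For local bookkeeping, the properties above can be mirrored in a small typed record. This is a sketch, not the SDK's own model: `RunInfo` and `is_finished` are hypothetical names, and treating the snapshot IDs as optional while a run is still in progress is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

TERMINAL_STATUSES = {"completed", "failed"}
VALID_STATUSES = {"running"} | TERMINAL_STATUSES


@dataclass(frozen=True)
class RunInfo:
    """Hypothetical local mirror of the run properties listed above."""

    runId: str
    status: str
    beforeSnapshot: Optional[str] = None  # assumption: may be unset early on
    afterSnapshot: Optional[str] = None   # assumption: unset until the run ends

    def __post_init__(self) -> None:
        # Reject any status other than the three documented values.
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown run status: {self.status!r}")

    @property
    def is_finished(self) -> bool:
        """True once the run has reached `completed` or `failed`."""
        return self.status in TERMINAL_STATUSES


active = RunInfo(runId="run-1", status="running", beforeSnapshot="snap-a")
done = RunInfo(runId="run-1", status="completed",
               beforeSnapshot="snap-a", afterSnapshot="snap-b")
print(active.is_finished, done.is_finished)  # False True
```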

Diffs

A diff is the computed difference between the before and after states of an environment.

from pprint import pprint

import requests
from agent_diff import AgentDiff

# Set credentials via environment variables,
# or pass explicitly: AgentDiff(api_key="", base_url="https://api.agentdiff.dev")
client = AgentDiff()

# Create a sandbox and start a run (takes the "before" snapshot)
env = client.init_env(
    templateService="slack",
    templateName="slack_default",
    impersonateUserId="U01AGENBOT9",
)
run = client.start_run(envId=env.environmentId)

# Post a message to #general
response = requests.post(
    f"{client.base_url}/api/env/{env.environmentId}/services/slack/chat.postMessage",
    headers={"Authorization": f"Bearer {client.api_key}"},
    json={"channel": "C01ABCD1234", "text": "Hello!"},
)
response.raise_for_status()  # fail fast if the request was rejected

# Get the diff
diff = client.diff_run(runId=run.runId)
pprint(diff.model_dump())

# Cleanup
client.delete_env(envId=env.environmentId)

Output Structure

{
  "inserts": [
    {
      "__table__": "messages",
      "message_id": "1732645891.000200",
      "channel_id": "C01ABCD1234",
      "user_id": "U01AGENBOT9",
      "message_text": "Hello!"
    }
  ],
  "updates": [
    {
      "__table__": "channels",
      "before": { "last_message_at": null },
      "after": { "last_message_at": "2025-11-26T15:31:31" }
    }
  ],
  "deletes": []
}
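Because the diff is plain JSON-like data, it is easy to inspect in code. The sketch below assumes the diff has already been converted to a dictionary with the shape shown above; `summarize` and `changed_fields` are hypothetical helper names, not part of the SDK.

```python
from collections import Counter


def summarize(diff: dict) -> dict:
    """Count changed rows per table for each diff category."""
    return {
        kind: dict(Counter(row["__table__"] for row in diff.get(kind, [])))
        for kind in ("inserts", "updates", "deletes")
    }


def changed_fields(update: dict) -> list:
    """List the columns whose value differs between `before` and `after`."""
    before, after = update["before"], update["after"]
    return sorted(col for col in after if before.get(col) != after[col])


sample = {
    "inserts": [{"__table__": "messages", "user_id": "U01AGENBOT9"}],
    "updates": [
        {
            "__table__": "channels",
            "before": {"last_message_at": None},
            "after": {"last_message_at": "2025-11-26T15:31:31"},
        }
    ],
    "deletes": [],
}

print(summarize(sample))
# {'inserts': {'messages': 1}, 'updates': {'channels': 1}, 'deletes': {}}
print(changed_fields(sample["updates"][0]))
# ['last_message_at']
```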

Diff Types

Inserts

New records created by the agent

Updates

Existing records modified (shows before/after)

Deletes

Records removed by the agent
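To make the three categories concrete, here is a minimal sketch that classifies rows by comparing two snapshots of a single table. It is a conceptual illustration, not how Agent Diff computes diffs internally. `diff_snapshots` is a hypothetical name, and each snapshot is assumed to be a mapping from primary key to row, with the same columns before and after.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Classify rows of one table as inserts, updates, or deletes."""
    # Rows whose key exists only in the "after" snapshot were created.
    inserts = [row for key, row in after.items() if key not in before]
    # Rows whose key exists only in the "before" snapshot were removed.
    deletes = [row for key, row in before.items() if key not in after]

    updates = []
    for key, old in before.items():
        new = after.get(key)
        if new is None or new == old:
            continue  # deleted (handled above) or unchanged
        # Keep only the columns that actually changed.
        changed = [col for col in new if old.get(col) != new[col]]
        updates.append({
            "before": {col: old.get(col) for col in changed},
            "after": {col: new[col] for col in changed},
        })
    return {"inserts": inserts, "updates": updates, "deletes": deletes}


before = {
    "C1": {"channel_id": "C1", "last_message_at": None},
    "C2": {"channel_id": "C2", "last_message_at": None},
}
after = {
    "C1": {"channel_id": "C1", "last_message_at": "2025-11-26T15:31:31"},
    "C3": {"channel_id": "C3", "last_message_at": None},
}
result = diff_snapshots(before, after)
print(result["inserts"])  # [{'channel_id': 'C3', 'last_message_at': None}]
print(result["deletes"])  # [{'channel_id': 'C2', 'last_message_at': None}]
```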

How Diffs Are Captured

Agent Diff uses PostgreSQL logical replication to capture every change:

WAL Capture: All database writes are logged to the Write-Ahead Log
wal2json: Converts WAL entries to JSON format
Change Journal: Filtered and stored per environment/run
Diff Computation: Aggregated into inserts/updates/deletes

This approach captures changes at the database level, so it works regardless of which API endpoint the agent used.
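The final aggregation step can be sketched as follows. The input uses the change-record layout of wal2json's format version 1 (`kind`, `table`, `columnnames`, `columnvalues`, `oldkeys`). The `aggregate` function itself is a hypothetical, deliberately naive illustration: how Agent Diff actually filters, journals, and coalesces changes is not described here, and the `reactions` table is made up for the example.

```python
def aggregate(changes: list) -> dict:
    """Group wal2json (format version 1) change records into a diff."""
    diff = {"inserts": [], "updates": [], "deletes": []}
    for change in changes:
        table = change["table"]
        # New column values (present for inserts and updates).
        new = dict(zip(change.get("columnnames", []),
                       change.get("columnvalues", [])))
        # Old identifying values (present for updates and deletes).
        old_keys = change.get("oldkeys", {})
        old = dict(zip(old_keys.get("keynames", []),
                       old_keys.get("keyvalues", [])))

        if change["kind"] == "insert":
            diff["inserts"].append({"__table__": table, **new})
        elif change["kind"] == "update":
            diff["updates"].append(
                {"__table__": table, "before": old, "after": new})
        elif change["kind"] == "delete":
            diff["deletes"].append({"__table__": table, **old})
    return diff


changes = [
    {"kind": "insert", "schema": "public", "table": "messages",
     "columnnames": ["message_id", "message_text"],
     "columnvalues": ["1732645891.000200", "Hello!"]},
    {"kind": "update", "schema": "public", "table": "channels",
     "columnnames": ["channel_id", "last_message_at"],
     "columnvalues": ["C01ABCD1234", "2025-11-26T15:31:31"],
     "oldkeys": {"keynames": ["channel_id"], "keyvalues": ["C01ABCD1234"]}},
    {"kind": "delete", "schema": "public", "table": "reactions",
     "oldkeys": {"keynames": ["reaction_id"], "keyvalues": [7]}},
]
wal_diff = aggregate(changes)
print(len(wal_diff["inserts"]), len(wal_diff["updates"]), len(wal_diff["deletes"]))  # 1 1 1
```

With PostgreSQL's default replica identity, `oldkeys` carries only the primary-key columns, so the `before` side of an update is limited to the key unless the table uses `REPLICA IDENTITY FULL`. A production pipeline would also need to merge multiple changes to the same row, which this sketch omits.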

Next Steps

Evaluations

Assertions
