Agent Pipeline

The core pipeline consists of four agents executing in sequence, connected by Cloudflare Queues.

Pipeline Stages

LogTailer ──→ TestGen ──→ CodeTriage ──→ FixAgent
    │              │            │              │
    ▼              ▼            ▼              ▼
[errors-      [triage-     [fix-ready]    [completed]
 detected]     ready]
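
The diagram can also be read as a small lookup table. The sketch below is illustrative only: the stage names and output labels come from the diagram, while the `Stage` type, `STAGE_OUTPUT`, and `nextStage` are hypothetical names introduced here.

```typescript
// Stage names and output labels are copied from the diagram above.
// The types and helper are hypothetical, not part of the project's code.
type Stage = "LogTailer" | "TestGen" | "CodeTriage" | "FixAgent";

const STAGE_ORDER: readonly Stage[] = ["LogTailer", "TestGen", "CodeTriage", "FixAgent"];

// What each stage emits when it finishes its work.
const STAGE_OUTPUT: Record<Stage, string> = {
  LogTailer: "errors-detected",
  TestGen: "triage-ready",
  CodeTriage: "fix-ready",
  FixAgent: "completed",
};

// Returns the stage that runs after `stage`, or null for the last stage.
function nextStage(stage: Stage): Stage | null {
  return STAGE_ORDER[STAGE_ORDER.indexOf(stage) + 1] ?? null;
}
```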

Stage 1: Detection (LogTailer)

The LogTailer agent polls Workers Observability every 30 seconds:

Loads enabled service configs from D1
Queries the Observability API with a 60-second lookback window (15s overlap for consistency)
Classifies each error event (client_4xx, server_5xx, unhandled_exception)
Computes a SHA-256 fingerprint: sha256(service | normalized_message | route | top_3_stack_frames)
Checks the agent-local SQLite fingerprint cache:
- New fingerprint → create D1 incident, publish to errors-detected queue
- Known, resolved more than 24h ago → reopen incident, re-queue
- Known, active → increment occurrence count only
- Known, wontfix → increment count, do not re-queue
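
The fingerprint and cache lookup above might be sketched as follows. This is a minimal sketch under stated assumptions, not the agent's actual code: the normalization rules, the `DetectedError` and `CacheEntry` shapes, and the handling of incidents resolved less than 24 hours ago (counted only) are not specified in this document and are assumed here. It uses `node:crypto` for brevity; a Worker could equally use Web Crypto.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a classified error event.
interface DetectedError {
  service: string;
  message: string;
  route: string;
  stackFrames: string[];
}

// ASSUMPTION: normalization collapses hex ids and digit runs so that messages
// differing only in identifiers share a fingerprint. The real rules may differ.
function normalizeMessage(message: string): string {
  return message
    .replace(/0x[0-9a-f]+/gi, "<hex>")
    .replace(/\d+/g, "<n>")
    .trim()
    .toLowerCase();
}

// sha256(service | normalized_message | route | top_3_stack_frames)
function fingerprint(e: DetectedError): string {
  const parts = [
    e.service,
    normalizeMessage(e.message),
    e.route,
    e.stackFrames.slice(0, 3).join("\n"),
  ];
  return createHash("sha256").update(parts.join("|")).digest("hex");
}

type CachedStatus = "active" | "resolved" | "wontfix";

// Hypothetical cache row; resolvedAt is epoch milliseconds.
interface CacheEntry {
  status: CachedStatus;
  resolvedAt?: number;
}

type Action = "create_and_queue" | "reopen_and_queue" | "count_only";

const REOPEN_AFTER_MS = 24 * 60 * 60 * 1000;

// Maps a cache lookup result to one of the actions listed above.
function decide(entry: CacheEntry | undefined, now: number): Action {
  if (entry === undefined) return "create_and_queue";
  if (
    entry.status === "resolved" &&
    entry.resolvedAt !== undefined &&
    now - entry.resolvedAt > REOPEN_AFTER_MS
  ) {
    return "reopen_and_queue";
  }
  // Active and wontfix incidents only get their occurrence count incremented.
  // ASSUMPTION: incidents resolved within the last 24h are treated the same way.
  return "count_only";
}
```

Only the first three stack frames feed the hash, so two events that differ deeper in the stack, or only in embedded ids, map to the same incident.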

Stage 2: Reproduction (TestGen)

The TestGen agent receives error detection messages and generates reproduction tests:

Fetches the service config to get the repository URL
Builds an LLM prompt from the sample event (route, status code, error message, stack trace)
Calls Workers AI to generate a bun:test file
If a Sandbox is available: clones the repo, installs dependencies, writes the test, and runs it
Reproduction is confirmed when the test fails, which proves the bug exists
Stores artifacts in R2: test-case.ts, test-result.json, sample-event.json
Publishes a triage-ready message

If the generated test still passes after 3 retries (the bug was not reproduced), the incident is marked needs_human.
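
The inverted success condition (a failing test is the good outcome) is easy to get backwards, so here is a minimal sketch of the prompt assembly and retry logic. Everything named here is hypothetical: `SampleEvent`, `buildPrompt`, the prompt wording, and the `generateAndRun` callback, which stands in for the Workers AI call plus the Sandbox run. The sketch is synchronous for brevity, and it assumes "3 retries" means three attempts in total.

```typescript
// Hypothetical shape of the sample event used to build the prompt.
interface SampleEvent {
  route: string;
  statusCode: number;
  errorMessage: string;
  stackTrace: string;
}

// Illustrative prompt text; the real prompt is not shown in this document.
function buildPrompt(e: SampleEvent): string {
  return [
    "Write a bun:test file that reproduces the following error.",
    `Route: ${e.route}`,
    `Status code: ${e.statusCode}`,
    `Error message: ${e.errorMessage}`,
    "Stack trace:",
    e.stackTrace,
  ].join("\n");
}

type TestOutcome = "pass" | "fail";

interface ReproResult {
  status: "reproduced" | "needs_human";
  attempts: number;
}

// Generates and runs a test up to maxAttempts times.
// A FAILING test confirms the bug; a passing test means it was not reproduced.
function reproduce(
  generateAndRun: (attempt: number) => TestOutcome,
  maxAttempts = 3, // ASSUMPTION: three attempts in total
): ReproResult {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (generateAndRun(attempt) === "fail") {
      return { status: "reproduced", attempts: attempt };
    }
  }
  return { status: "needs_human", attempts: maxAttempts };
}
```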

Stage 3: Triage (CodeTriage)

The CodeTriage agent performs LLM-powered root cause analysis:

Clones the repository in a Sandbox
Reads the entry point source code
Sends a structured prompt to Workers AI requesting JSON output:
- rootCauseFile, rootCauseLines, rootCauseExplanation
- fixStrategy, confidence, affectedFiles
If confidence is low, runs a second-opinion validation with a different model
Publishes a fix-ready message, or marks the incident needs_human when confidence is low
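
Because the model's JSON cannot be trusted blindly, the response presumably needs validating before it is routed. The sketch below shows one way to do that. The field names come from the list above; the field types, the `medium` and `high` confidence values, and the helper names are assumptions, and how the second opinion is merged is left out because it is not described here.

```typescript
// Only "low" is named in this document; "medium" and "high" are assumed.
type Confidence = "low" | "medium" | "high";

interface TriageResult {
  rootCauseFile: string;
  rootCauseLines: number[]; // ASSUMPTION: a list of line numbers
  rootCauseExplanation: string;
  fixStrategy: string;
  confidence: Confidence;
  affectedFiles: string[];
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

function isNumberArray(v: unknown): v is number[] {
  return Array.isArray(v) && v.every((x) => typeof x === "number");
}

function isConfidence(v: unknown): v is Confidence {
  return v === "low" || v === "medium" || v === "high";
}

// Parses the raw model output; returns null if it is not valid JSON
// or any field is missing or has the wrong type.
function parseTriage(raw: string): TriageResult | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;

  const file = d["rootCauseFile"];
  const lines = d["rootCauseLines"];
  const explanation = d["rootCauseExplanation"];
  const strategy = d["fixStrategy"];
  const confidence = d["confidence"];
  const affected = d["affectedFiles"];

  if (typeof file !== "string" || typeof explanation !== "string" || typeof strategy !== "string") return null;
  if (!isNumberArray(lines) || !isStringArray(affected) || !isConfidence(confidence)) return null;

  return {
    rootCauseFile: file,
    rootCauseLines: lines,
    rootCauseExplanation: explanation,
    fixStrategy: strategy,
    confidence,
    affectedFiles: affected,
  };
}

// Low confidence triggers the second-opinion pass.
function needsSecondOpinion(result: TriageResult): boolean {
  return result.confidence === "low";
}

// Routes on the final confidence once any second opinion has been taken into account.
function routeTriage(finalConfidence: Confidence): "fix-ready" | "needs_human" {
  return finalConfidence === "low" ? "needs_human" : "fix-ready";
}
```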

Stage 4: Fix (FixAgent)

The FixAgent runs a TDD cycle inside a Sandbox:

Clone and branch — git clone + git checkout -b sentinel/fix-{id}
Place test — write the reproduction test from R2
Baseline — run existing tests (must pass)
Reproduce — run reproduction test (must fail)
Fix loop (up to 5 attempts):
- Generate fix via LLM (including previous failure context)
- Write fix to file
- Run reproduction test — must pass
- Run full regression suite — must also pass
- On failure: revert, try different approach
Lint — bunx oxlint --fix .
Commit and push — authenticated push with token
Create PR — via GitHub REST API with structured description
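
The fix loop is the core of this stage, so a sketch of its control flow may help. This is an illustration under assumptions rather than the FixAgent's real code: the `FixSteps` interface and its method names are hypothetical stand-ins for the LLM call, file writes, and Sandbox test runs, and the sketch is synchronous for brevity.

```typescript
interface RunResult {
  passed: boolean;
  output: string;
}

// Hypothetical callbacks standing in for LLM and Sandbox operations.
interface FixSteps {
  generateFix(previousFailure: string | null): string; // returns a patch
  applyFix(patch: string): void;
  runReproTest(): RunResult;
  runFullSuite(): RunResult;
  revert(): void;
}

interface FixOutcome {
  status: "fixed" | "needs_human";
  attempts: number;
}

// Branch naming follows the sentinel/fix-{id} pattern from step 1.
function branchName(incidentId: string): string {
  return `sentinel/fix-${incidentId}`;
}

function fixLoop(steps: FixSteps, maxAttempts = 5): FixOutcome {
  let previousFailure: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Each attempt sees the output of the previous failed attempt.
    const patch = steps.generateFix(previousFailure);
    steps.applyFix(patch);

    const repro = steps.runReproTest();
    // The regression suite only runs once the reproduction test passes.
    const suite = repro.passed ? steps.runFullSuite() : null;
    if (repro.passed && suite !== null && suite.passed) {
      return { status: "fixed", attempts: attempt };
    }

    previousFailure = suite !== null ? suite.output : repro.output;
    steps.revert(); // undo the failed patch before trying again
  }
  return { status: "needs_human", attempts: maxAttempts };
}
```

Reverting before every retry keeps each attempt independent, so a bad patch never contaminates the next one.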

Escape Hatches

At every stage, the pipeline can exit early:

Condition	Result
Bug cannot be reproduced (test passes)	→ `needs_human`
LLM confidence is `low`	→ `needs_human`
Fix fails after 5 attempts	→ `needs_human`
Manual override via Orchestrator API	→ `wontfix`
Queue message fails all retries	→ Dead Letter Queue

Incident Status Lifecycle

detected → test_generating → test_generated → triaging → triaged
    → fixing → fix_submitted → resolved
                                ↘ wontfix
                                ↘ needs_human
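
The lifecycle can be expressed as a transition check. The sketch below is one possible reading, not the project's actual state machine: it assumes, based on the Escape Hatches section, that `wontfix` and `needs_human` are reachable from any in-progress status, treats `resolved`, `wontfix`, and `needs_human` as terminal, and does not model reopening a resolved incident. `Status`, `HAPPY_PATH`, and `canTransition` are hypothetical names.

```typescript
type Status =
  | "detected"
  | "test_generating"
  | "test_generated"
  | "triaging"
  | "triaged"
  | "fixing"
  | "fix_submitted"
  | "resolved"
  | "wontfix"
  | "needs_human";

// The main path, in the order shown in the diagram.
const HAPPY_PATH: readonly Status[] = [
  "detected",
  "test_generating",
  "test_generated",
  "triaging",
  "triaged",
  "fixing",
  "fix_submitted",
  "resolved",
];

function canTransition(from: Status, to: Status): boolean {
  const i = HAPPY_PATH.indexOf(from);
  // wontfix and needs_human are terminal in this sketch.
  if (i === -1) return false;
  // ASSUMPTION: resolved is terminal here; reopening is not modeled.
  if (from === "resolved") return false;
  // ASSUMPTION: any in-progress status can exit early.
  if (to === "wontfix" || to === "needs_human") return true;
  // Otherwise only the next step on the main path is allowed.
  return HAPPY_PATH[i + 1] === to;
}
```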