Agent Pipeline

The core pipeline consists of four agents executing in sequence, connected by Cloudflare Queues.

LogTailer ──→ TestGen ──→ CodeTriage ──→ FixAgent
    │            │            │             │
    ▼            ▼            ▼             ▼
[errors-     [triage-    [fix-ready]   [completed]
 detected]    ready]
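
The wiring in the diagram can be written down as a small lookup table. This is only a sketch: the queue names come from the diagram, while the type names, the `nextAgent` helper, and the treatment of `completed` as a queue-like output with no downstream consumer are assumptions made for illustration.

```typescript
// Agent and queue names are taken from the diagram; everything else is illustrative.
type Agent = "LogTailer" | "TestGen" | "CodeTriage" | "FixAgent";
type QueueName = "errors-detected" | "triage-ready" | "fix-ready" | "completed";

// Output each agent publishes to when its stage succeeds.
const OUTPUT_QUEUE: Record<Agent, QueueName> = {
  LogTailer: "errors-detected",
  TestGen: "triage-ready",
  CodeTriage: "fix-ready",
  FixAgent: "completed",
};

// Agent that consumes each queue. The diagram shows no consumer for "completed".
const CONSUMER: Partial<Record<QueueName, Agent>> = {
  "errors-detected": "TestGen",
  "triage-ready": "CodeTriage",
  "fix-ready": "FixAgent",
};

// Returns the agent that runs after `current`, or undefined at the end of the pipeline.
function nextAgent(current: Agent): Agent | undefined {
  return CONSUMER[OUTPUT_QUEUE[current]];
}
```

Walking `nextAgent` from `LogTailer` visits the four agents in the order shown above and stops after `FixAgent`.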

The LogTailer agent polls Workers Observability every 30 seconds:

  1. Loads enabled service configs from D1
  2. Queries the Observability API with a 60-second lookback window (15s overlap for consistency)
  3. Classifies each error event (client_4xx, server_5xx, unhandled_exception)
  4. Computes a SHA-256 fingerprint: sha256(service | normalized_message | route | top_3_stack_frames)
  5. Checks the agent-local SQLite fingerprint cache:
    • New fingerprint → create D1 incident, publish to errors-detected queue
    • Known, resolved more than 24h ago → reopen incident, re-queue
    • Known, active → increment occurrence count only
    • Known, wontfix → increment count, do not re-queue
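
Steps 4 and 5 can be sketched as two pure functions. This is a minimal illustration, not the agent's actual code: the `LoggedError` and `CacheEntry` shapes, the `normalizeMessage` rules, the `|` join, and the handling of incidents resolved less than 24 hours ago (counted only) are all assumptions. It uses `createHash` from `node:crypto`, which assumes a Node-compatible runtime.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes; the real event and cache schemas are not given in the text.
interface LoggedError { service: string; message: string; route: string; stack: string[] }
type CacheStatus = "active" | "resolved" | "wontfix";
interface CacheEntry { status: CacheStatus; resolvedAt?: number } // resolvedAt in epoch ms
type Action = "create_and_queue" | "reopen_and_queue" | "increment_only";

// Assumed normalization: mask digit runs and collapse whitespace so that
// messages differing only in IDs or counters share a fingerprint.
function normalizeMessage(message: string): string {
  return message.replace(/\d+/g, "<n>").replace(/\s+/g, " ").trim();
}

// sha256(service | normalized_message | route | top_3_stack_frames), hex-encoded.
function fingerprint(e: LoggedError): string {
  const parts = [e.service, normalizeMessage(e.message), e.route, e.stack.slice(0, 3).join("\n")];
  return createHash("sha256").update(parts.join("|")).digest("hex");
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Maps a cache lookup to the action described in step 5.
function decide(entry: CacheEntry | undefined, now: number): Action {
  if (!entry) return "create_and_queue"; // new fingerprint
  const resolvedLongAgo =
    entry.status === "resolved" &&
    entry.resolvedAt !== undefined &&
    now - entry.resolvedAt > DAY_MS;
  if (resolvedLongAgo) return "reopen_and_queue";
  // active, wontfix, or recently resolved (assumption): count only, do not re-queue
  return "increment_only";
}
```

Keeping `decide` free of I/O makes the four cache branches easy to unit-test separately from the D1 and queue calls.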

The TestGen agent receives error detection messages and generates reproduction tests:

  1. Fetches the service config to get the repository URL
  2. Builds an LLM prompt from the sample event (route, status code, error message, stack trace)
  3. Calls Workers AI to generate a bun:test file
  4. If a Sandbox is available: clones the repo, installs deps, writes the test, runs it
  5. Reproduction confirmed = the test fails (proving the bug exists)
  6. Stores artifacts in R2: test-case.ts, test-result.json, sample-event.json
  7. Publishes a triage-ready message

If the generated test still passes after 3 retries (the bug could not be reproduced), the incident is marked needs_human.
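
The prompt construction and the reproduction check can be illustrated as follows. This is a hedged sketch: the `SampleEvent` field names, the prompt wording, and the reading of "3 retries" as at most three test runs in total are assumptions, not details confirmed by the pipeline.

```typescript
// Hypothetical shape of the sample event; the fields mirror those named in step 2.
interface SampleEvent { route: string; statusCode: number; errorMessage: string; stackTrace: string }

// Builds the LLM prompt from the sample event. The wording is illustrative only.
function buildTestPrompt(e: SampleEvent): string {
  return [
    "Write a bun:test file that reproduces the following production error.",
    "The test must fail while the bug is present.",
    `Route: ${e.route}`,
    `Status code: ${e.statusCode}`,
    `Error message: ${e.errorMessage}`,
    "Stack trace:",
    e.stackTrace,
  ].join("\n");
}

type ReproOutcome = "reproduced" | "needs_human";

// attemptPassed[i] is true when the i-th generated test passed, i.e. did NOT
// reproduce the bug. A single failing run counts as a confirmed reproduction.
function reproductionOutcome(attemptPassed: boolean[], maxAttempts = 3): ReproOutcome {
  const considered = attemptPassed.slice(0, maxAttempts);
  return considered.some((passed) => !passed) ? "reproduced" : "needs_human";
}
```

Note the inverted sense of "success" here: a failing test is the desired result at this stage, because it proves the bug exists before any fix is attempted.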

The CodeTriage agent performs LLM-powered root cause analysis:

  1. Clones the repository in a Sandbox
  2. Reads the entry point source code
  3. Sends a structured prompt to Workers AI requesting JSON output:
    • rootCauseFile, rootCauseLines, rootCauseExplanation
    • fixStrategy, confidence, affectedFiles
  4. If confidence is low: runs a second-opinion validation with a different model
  5. Publishes a fix-ready message (or marks needs_human for low confidence)
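
The structured output and the confidence-based routing might look like the sketch below. The field names come from step 3; the field types, the three-level `Confidence` scale, and the rule that a second opinion rescues a low-confidence result only when it names the same file with non-low confidence are assumptions made for illustration.

```typescript
type Confidence = "high" | "medium" | "low"; // assumed scale

interface TriageResult {
  rootCauseFile: string;
  rootCauseLines: string; // assumed to be a range string such as "42-57"
  rootCauseExplanation: string;
  fixStrategy: string;
  confidence: Confidence;
  affectedFiles: string[];
}

// Parses and validates the model's JSON reply, throwing on any malformed field.
function parseTriage(raw: string): TriageResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("triage output is not a JSON object");
  }
  const o = data as Record<string, unknown>;
  for (const key of ["rootCauseFile", "rootCauseLines", "rootCauseExplanation", "fixStrategy"]) {
    if (typeof o[key] !== "string") throw new Error(`missing or invalid field: ${key}`);
  }
  const confidence = o["confidence"];
  if (confidence !== "high" && confidence !== "medium" && confidence !== "low") {
    throw new Error("missing or invalid field: confidence");
  }
  const files = o["affectedFiles"];
  if (!Array.isArray(files) || !files.every((f) => typeof f === "string")) {
    throw new Error("missing or invalid field: affectedFiles");
  }
  return o as unknown as TriageResult;
}

type TriageRoute = "fix-ready" | "needs_human";

// Decides where the incident goes after triage (steps 4 and 5).
function routeTriage(primary: TriageResult, secondOpinion?: TriageResult): TriageRoute {
  if (primary.confidence !== "low") return "fix-ready";
  const agrees =
    secondOpinion !== undefined &&
    secondOpinion.confidence !== "low" &&
    secondOpinion.rootCauseFile === primary.rootCauseFile;
  return agrees ? "fix-ready" : "needs_human";
}
```

Validating the reply before routing matters because LLM output is not guaranteed to match the requested schema, even when JSON is requested explicitly.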

The FixAgent runs a TDD cycle inside a Sandbox:

  1. Clone and branch: git clone + git checkout -b sentinel/fix-{id}
  2. Place test: write the reproduction test from R2
  3. Baseline: run existing tests (must pass)
  4. Reproduce: run reproduction test (must fail)
  5. Fix loop (up to 5 attempts):
    • Generate fix via LLM (including previous failure context)
    • Write fix to file
    • Run reproduction test (must pass)
    • Run full regression suite (must also pass)
    • On failure: revert, try different approach
  6. Lint: bunx oxlint --fix .
  7. Commit and push: authenticated push with token
  8. Create PR: via GitHub REST API with structured description
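
The fix loop in step 5 can be sketched with the Sandbox and LLM operations injected as callbacks. This is an illustration under stated assumptions: the `FixSteps` interface and `FixOutcome` type are hypothetical, and the callbacks are synchronous here to keep the example short, whereas real Sandbox and Workers AI calls would be asynchronous.

```typescript
interface TestRun { passed: boolean; output: string }

// Hypothetical hooks standing in for the LLM and Sandbox operations.
interface FixSteps {
  generateFix(previousFailures: string[]): string; // returns a patch
  applyFix(patch: string): void;
  runReproductionTest(): TestRun;
  runRegressionSuite(): TestRun;
  revert(): void;
}

type FixOutcome =
  | { status: "fixed"; attempts: number; patch: string }
  | { status: "needs_human"; attempts: number };

function runFixLoop(steps: FixSteps, maxAttempts = 5): FixOutcome {
  const failures: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Pass a copy so the generator sees only the failures recorded so far.
    const patch = steps.generateFix([...failures]);
    steps.applyFix(patch);

    // The reproduction test must now pass; only then is the full suite run.
    const repro = steps.runReproductionTest();
    const regression = repro.passed ? steps.runRegressionSuite() : undefined;
    if (repro.passed && regression?.passed) {
      return { status: "fixed", attempts: attempt, patch };
    }

    // Record why this attempt failed, then restore the working tree.
    failures.push((regression ?? repro).output);
    steps.revert();
  }
  return { status: "needs_human", attempts: maxAttempts };
}
```

Feeding earlier failure output back into `generateFix` is what lets each attempt try a different approach instead of repeating the same patch.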

At every stage, the pipeline can exit early:

Condition                                 Result
Bug cannot be reproduced (test passes)    needs_human
LLM confidence is low                     needs_human
Fix fails after 5 attempts                needs_human
Manual override via Orchestrator API      wontfix
Queue message fails all retries           → Dead Letter Queue

Incident status flow (wontfix and needs_human are the early exits):

detected → test_generating → test_generated → triaging → triaged
  → fixing → fix_submitted → resolved
  ↘ wontfix
  ↘ needs_human
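
The status flow above can be encoded as a transition check. This is a sketch, assuming that the happy path is strictly linear, that `wontfix` and `needs_human` are reachable from any non-terminal status, and that `resolved`, `wontfix`, and `needs_human` are terminal. It deliberately leaves out the reopening of resolved incidents described for LogTailer, since the target status of a reopen is not stated.

```typescript
type IncidentStatus =
  | "detected" | "test_generating" | "test_generated" | "triaging" | "triaged"
  | "fixing" | "fix_submitted" | "resolved" | "wontfix" | "needs_human";

// Linear progression taken from the status flow.
const HAPPY_PATH: IncidentStatus[] = [
  "detected", "test_generating", "test_generated", "triaging",
  "triaged", "fixing", "fix_submitted", "resolved",
];

const EARLY_EXITS: IncidentStatus[] = ["wontfix", "needs_human"];

// True when moving from `from` to `to` is allowed under the assumptions above.
function canTransition(from: IncidentStatus, to: IncidentStatus): boolean {
  const index = HAPPY_PATH.indexOf(from);
  // Early-exit statuses are not on the happy path, and "resolved" ends it.
  if (index === -1 || from === "resolved") return false;
  if (EARLY_EXITS.indexOf(to) !== -1) return true;
  return HAPPY_PATH[index + 1] === to;
}
```

A guard like this could run before each status update so that a late or duplicated queue message cannot move an incident backwards.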