Troubleshooting

Symptom: Incidents are created but never progress to test_generating.

Causes:

  • The sentinel-errors-detected queue consumer is not running
  • The TestGen agent is failing silently

Resolution:

  1. Check the Queues dashboard for a backlog in sentinel-errors-detected
  2. Check the DLQ (sentinel-dlq) for failed messages
  3. Check the Workers logs for errors in the queue consumer
  4. Manually trigger a retry via the Orchestrator API: POST /api/incidents/:id/retry
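
The manual retry in the last step can be scripted. The following is a minimal sketch, not the project's own client: the base URL, the `FetchLike` type, and the function names are assumptions for illustration, and it sends no authentication headers, which your deployment may require.

```typescript
// Hypothetical base URL: replace with the address of your Orchestrator deployment.
const ORCHESTRATOR_URL = "https://orchestrator.example.com";

// Minimal shape of the fetch function we depend on, so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ ok: boolean; status: number }>;

// Builds the retry endpoint for one incident, tolerating trailing slashes on the base URL.
function retryUrl(baseUrl: string, incidentId: string): string {
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/api/incidents/${encodeURIComponent(incidentId)}/retry`;
}

// Sends POST /api/incidents/:id/retry and returns the HTTP status on success.
async function retryIncident(
  incidentId: string,
  baseUrl: string = ORCHESTRATOR_URL,
  fetchImpl: FetchLike = fetch,
): Promise<number> {
  const res = await fetchImpl(retryUrl(baseUrl, incidentId), { method: "POST" });
  if (!res.ok) {
    throw new Error(`Retry of incident ${incidentId} failed with HTTP ${res.status}`);
  }
  return res.status;
}
```

Calling `retryIncident("<incident id>")` uses the global `fetch`; passing a stub as the third argument lets you exercise the helper without network access.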

Symptom: Many incidents are marked needs_human instead of progressing.

Common Causes:

  Agent       Reason
  TestGen     Generated test passes (bug not reproduced); review the test logic
  CodeTriage  LLM confidence is low; check the source code context being sent
  FixAgent    5 fix attempts exhausted; review the LLM prompts and test output
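
If you triage many needs_human incidents, the agent and reason pairs above can be encoded as a small lookup. This is an illustrative sketch only: the agent names and reasons come from the list above, while the `Hint` type, `NEEDS_HUMAN_HINTS`, and `triageHint` are hypothetical names that do not exist in the codebase.

```typescript
// Reason and suggested next step for one agent (hypothetical type).
interface Hint {
  reason: string;
  nextStep: string;
}

// Agent names and reasons copied from the troubleshooting list; a Map avoids
// accidental hits on inherited object properties such as "toString".
const NEEDS_HUMAN_HINTS = new Map<string, Hint>([
  ["TestGen", { reason: "Generated test passes (bug not reproduced)", nextStep: "Review the test logic" }],
  ["CodeTriage", { reason: "LLM confidence is low", nextStep: "Check the source code context being sent" }],
  ["FixAgent", { reason: "5 fix attempts exhausted", nextStep: "Review the LLM prompts and test output" }],
]);

// Returns a one-line triage hint for the agent that flagged the incident.
function triageHint(agent: string): string {
  const hint = NEEDS_HUMAN_HINTS.get(agent);
  if (!hint) {
    return `Unknown agent "${agent}": inspect the incident detail manually`;
  }
  return `${agent}: ${hint.reason}. Next step: ${hint.nextStep}.`;
}
```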

Resolution:

  1. Get the incident detail: GET /api/incidents/:id
  2. Review the R2 artifacts (test case, sample event)
  3. Fix the issue manually and retry: POST /api/incidents/:id/retry

Symptom: Test generation or fix generation produces invalid code.

Causes:

  • Insufficient context in the LLM prompt
  • The source code being read is too large or irrelevant
  • Model limitations for complex codebases

Resolution:

  1. Check the llm_calls table for the prompt and response
  2. Review the source code being sent (currently reads src/index.ts only)
  3. Consider adjusting the maxTokens parameter in the LLM config

Symptom: Messages appear in the DLQ.

Resolution:

  1. Check sentinel-dlq for the full message body
  2. Verify the message matches the expected Zod schema
  3. Check the target agent’s HTTP endpoint for errors
  4. Note that messages are retried automatically (3x for errors-detected, 2x for fix-ready), so anything in the DLQ has already exhausted those retries
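
For a quick first pass over a DLQ message body before comparing it with the real Zod schema, a dependency-free structural check can help. The sketch below is a stand-in, not the project's schema: the field names `incidentId` and `fingerprint` are assumptions, and the Zod schema in the codebase remains authoritative.

```typescript
// Hypothetical message shape; adjust the fields to match the real Zod schema.
interface ErrorDetectedMessage {
  incidentId: string;
  fingerprint: string;
}

type CheckResult =
  | { ok: true; message: ErrorDetectedMessage }
  | { ok: false; issues: string[] };

// Parses a raw DLQ message body and lists every field that is missing or malformed.
function checkDlqBody(rawBody: string): CheckResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, issues: ["body is not valid JSON"] };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, issues: ["body is not a JSON object"] };
  }

  const record = parsed as Record<string, unknown>;
  const issues: string[] = [];
  for (const field of ["incidentId", "fingerprint"] as const) {
    const value = record[field];
    if (typeof value !== "string" || value.length === 0) {
      issues.push(`${field}: expected a non-empty string`);
    }
  }
  if (issues.length > 0) {
    return { ok: false, issues };
  }
  return {
    ok: true,
    message: {
      incidentId: record.incidentId as string,
      fingerprint: record.fingerprint as string,
    },
  };
}
```

Collecting all issues instead of stopping at the first one shows in a single pass how far a dead-lettered message deviates from the expected shape.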

Symptom: Agents log "no sandbox or repo" and mark incidents as needs_human.

Causes:

  • The SANDBOX (containers) binding is commented out in wrangler.jsonc
  • The sandbox container image hasn’t been built and pushed

Resolution:

  1. Build the sandbox image: cd sandbox-image && docker build -t sentinel-sandbox .
  2. Uncomment the containers binding in wrangler.jsonc
  3. Redeploy: bunx wrangler@latest deploy
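
To reset the local D1 database, delete the local state and re-run the migrations:

  rm -rf .wrangler/state/v3/d1/
  bun run db:migrate

To count the rows in the incidents table of sentinel-db:

  bunx wrangler@latest d1 execute sentinel-db --command "SELECT COUNT(*) FROM incidents"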

Agent-local SQLite data (fingerprint cache, poll cursors) is stored in the Durable Object’s storage. It can be inspected via the Durable Objects dashboard in the Cloudflare console.