Troubleshooting

Incidents Stuck in “detected”

Symptom: Incidents are created but never progress to test_generating.

Causes:

The sentinel-errors-detected queue consumer is not running
The TestGen agent is failing silently

Resolution:

Check the Queues dashboard for a backlog in sentinel-errors-detected
Check the DLQ (sentinel-dlq) for failed messages
Check the Workers logs for errors in the queue consumer
Manually trigger a retry via the Orchestrator API: POST /api/incidents/:id/retry
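
The manual retry in the last step can be scripted. The following is a minimal sketch, assuming the Orchestrator is reachable at a base URL you supply (the example.com address in the usage comment is a placeholder) and that the endpoint needs no extra authentication headers. The names retryUrl, retryIncident, and FetchLike are illustrative and not part of the project.

```typescript
// Minimal structural type for the parts of fetch this sketch uses, so a
// stub can be passed in tests. The global `fetch` satisfies it.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ ok: boolean; status: number }>;

// Builds the retry URL, trimming trailing slashes from the base and
// encoding the incident id for safe use as a path segment.
function retryUrl(baseUrl: string, incidentId: string): string {
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/api/incidents/${encodeURIComponent(incidentId)}/retry`;
}

// Sends POST /api/incidents/:id/retry and throws on a non-2xx response.
async function retryIncident(
  fetchFn: FetchLike,
  baseUrl: string,
  incidentId: string,
): Promise<number> {
  const res = await fetchFn(retryUrl(baseUrl, incidentId), { method: "POST" });
  if (!res.ok) {
    throw new Error(`Retry failed for ${incidentId}: HTTP ${res.status}`);
  }
  return res.status;
}

// Usage (placeholder URL and incident id):
// await retryIncident(fetch, "https://orchestrator.example.com", "inc_123");
```

Passing the fetch function in as an argument keeps the helper usable from Bun, Node, or a Worker, and makes it easy to substitute a stub when testing.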

Incidents Stuck in “needs_human”

Symptom: Many incidents are marked needs_human instead of progressing.

Common Causes:

Agent	Reason
TestGen	Generated test passes (bug not reproduced) — review the test logic
CodeTriage	LLM confidence is `low` — check the source code context being sent
FixAgent	5 fix attempts exhausted — review the LLM prompts and test output

Resolution:

Get the incident detail: GET /api/incidents/:id
Review the R2 artifacts (test case, sample event)
Fix the issue manually and retry: POST /api/incidents/:id/retry

LLM Generating Poor Results

Symptom: Test generation or fix generation produces invalid code.

Causes:

Insufficient context in the LLM prompt
The source code being read is too large or irrelevant
Model limitations for complex codebases

Resolution:

Check the llm_calls table for the prompt and response
Review the source code being sent (the agent currently reads only src/index.ts)
Consider adjusting the maxTokens parameter in the LLM config, for example raising it if responses are being truncated

Queue Messages Failing

Symptom: Messages appear in the DLQ.

Resolution:

Check sentinel-dlq for the full message body
Verify the message matches the expected Zod schema
Check the target agent’s HTTP endpoint for errors
Note that messages are retried automatically (3x for errors-detected, 2x for fix-ready) before they reach the DLQ
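
To reason about when a failing message will land in the DLQ, the retry limits above can be encoded as a small lookup. This is a sketch under two assumptions: that "3x" and "2x" count retries after the first delivery, and that the attempt counter is 1-based (as the attempts field on Cloudflare Queues messages is). The short queue keys and the function names are illustrative.

```typescript
// Retry limits as stated in this guide, keyed by the short queue names used above.
const MAX_RETRIES = {
  "errors-detected": 3,
  "fix-ready": 2,
} as const;

type QueueKey = keyof typeof MAX_RETRIES;

// Returns how many retries remain after `attempts` deliveries (1-based).
function retriesRemaining(queue: QueueKey, attempts: number): number {
  const used = Math.max(0, attempts - 1); // the first delivery is not a retry
  return Math.max(0, MAX_RETRIES[queue] - used);
}

// True when a failure on this delivery would exhaust the retry budget,
// meaning the message's next stop is the DLQ.
function isLastAttempt(queue: QueueKey, attempts: number): boolean {
  return retriesRemaining(queue, attempts) === 0;
}

console.log(retriesRemaining("errors-detected", 1)); // 3
console.log(isLastAttempt("fix-ready", 3)); // true
```

A consumer could log a warning when isLastAttempt returns true, so that the final failure is easy to find in the Workers logs before inspecting sentinel-dlq.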

Sandbox Unavailable

Symptom: Agents log "no sandbox or repo" and mark incidents as needs_human.

Causes:

The SANDBOX (containers) binding is commented out in wrangler.jsonc
The sandbox container image hasn’t been built and pushed

Resolution:

Build the sandbox image: cd sandbox-image && docker build -t sentinel-sandbox .
Uncomment the containers binding in wrangler.jsonc
Redeploy: bunx wrangler@latest deploy

Database Issues

Reset Local D1

rm -rf .wrangler/state/v3/d1/
bun run db:migrate

Query D1 Directly

bunx wrangler@latest d1 execute sentinel-db --command "SELECT COUNT(*) FROM incidents"
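
A per-status breakdown is often more useful than a total when looking for stuck incidents, and the same kind of query can be run from Worker code through a D1 binding. The sketch below assumes the incidents table has a status column holding values such as detected and needs_human (the column name is a guess), and that the caller passes in the binding. D1Like mirrors only the prepare(...).all() calls used here, and DB in the usage comment is a hypothetical binding name.

```typescript
// Structural subset of the D1 binding API used below (prepare + all).
interface D1Like {
  prepare(query: string): {
    all<T>(): Promise<{ results: T[] }>;
  };
}

interface StatusCount {
  status: string;
  n: number;
}

// Folds query rows into a { status: count } map.
function toCounts(rows: StatusCount[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of rows) counts[row.status] = row.n;
  return counts;
}

// Counts incidents per status. ASSUMPTION: `incidents.status` exists.
async function countByStatus(db: D1Like): Promise<Record<string, number>> {
  const { results } = await db
    .prepare("SELECT status, COUNT(*) AS n FROM incidents GROUP BY status")
    .all<StatusCount>();
  return toCounts(results);
}

// Usage inside a Worker, with a hypothetical binding named DB:
// const counts = await countByStatus(env.DB);
```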

Check Agent SQLite

Agent-local SQLite data (fingerprint cache, poll cursors) is stored in each agent's Durable Object storage. It can be inspected from the Durable Objects section of the Cloudflare dashboard.