
Merge dev to main#1354

Merged
zbigniewsobiecki merged 2 commits into main from dev
May 12, 2026

Conversation

@zbigniewsobiecki
Member

Routine dev → main promotion. Single feature PR:

Commits

🤖 Generated with Claude Code

zbigniewsobiecki and others added 2 commits May 12, 2026 09:46
MNG-699 (ucho PR #400, 2026-05-12): respond-to-review run b728fa3e
completed its real work at 08:43:22 (commit 547c4c5d pushed, review
reply posted) yet kept its claude-code SDK session alive for ~32 more
minutes until the user manually cancelled at 09:15:08, burning compute
and blocking follow-up dispatches via the work-item lock.

Root cause: Finish DOES terminate the session (it throws
TaskCompletionSignal from llmist, which the framework catches), but
agent prompts don't mandate calling Finish. Without an explicit
instruction the model decides it is done by emitting trailing text
and never invokes the gadget — so the SDK keeps streaming.
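
To make the root cause concrete, here is a minimal TypeScript sketch of the mechanism described above. It is an illustration under assumptions, not the project's code: `TaskCompletionSignal` is re-declared locally as a stand-in for the llmist class, and `Step`, `finish`, `isCompletionSignal`, and `runSession` are hypothetical names.

```typescript
// Local stand-in for the signal named above; the real class lives in llmist
// and its actual shape is not shown in this PR.
class TaskCompletionSignal extends Error {
  constructor(message: string = "task complete") {
    super(message);
    this.name = "TaskCompletionSignal";
  }
}

// Hypothetical model output: either trailing text or a tool (gadget) call.
type Step =
  | { kind: "text"; text: string }
  | { kind: "tool"; name: string };

// Hypothetical Finish gadget: it ends the session by throwing.
function finish(): never {
  throw new TaskCompletionSignal();
}

// Checking the name (rather than instanceof alone) also works when the
// class is transpiled to an older JavaScript target.
function isCompletionSignal(err: unknown): boolean {
  return err instanceof Error && err.name === "TaskCompletionSignal";
}

// Hypothetical framework loop: only a Finish call terminates the session.
function runSession(steps: Step[]): "finished" | "still-streaming" {
  try {
    for (const step of steps) {
      if (step.kind === "tool" && step.name === "Finish") {
        finish();
      }
      // Plain text never ends the session, however final it sounds.
    }
  } catch (err) {
    if (isCompletionSignal(err)) {
      return "finished";
    }
    throw err; // Anything else is a real error.
  }
  return "still-streaming";
}
```

In this sketch a run that ends with only trailing text returns `"still-streaming"`, which corresponds to the session that stayed alive in MNG-699; appending a `Finish` tool step makes it return `"finished"`.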

Two surgical changes:

1) NATIVE_TOOL_EXECUTION_RULES (src/backends/shared/nativeToolPrompts.ts)
   gains a "Termination protocol" section that every agent inherits
   via buildSystemPrompt — explicit Finish-call mandate, do-not-keep-
   working-after-Finish, and guidance for the Finish-rejected retry path.

2) validateFinish (src/gadgets/session/core/finish.ts) routes every
   rejection through rejectFinish(), which emits a structured WARN with
   reason + agent state. Ops can now run
   `docker logs cascade-router | grep -F "[Finish] validation rejected"`
   to find the precondition the agent is looping on: the breadcrumb that
   was missing during MNG-699 triage.
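
The second change could look roughly like the sketch below. Only the names `validateFinish` and `rejectFinish` and the log message `[Finish] validation rejected` come from this commit message; the `Logger` interface, the `AgentState` fields, the two preconditions, and the `FinishResult` return shape are assumptions made for illustration, and the real implementation may differ.

```typescript
// Hypothetical logger interface; the project's real logger is not shown here.
interface Logger {
  warn(message: string, fields: Record<string, unknown>): void;
}

// Hypothetical agent state; the real preconditions are not listed in this PR.
interface AgentState {
  hasUncommittedChanges: boolean;
  reviewReplyPosted: boolean;
}

// Hypothetical result shape; the real code may signal rejection differently.
type FinishResult = { ok: true } | { ok: false; reason: string };

// Single exit point for every rejection, so each one leaves a WARN
// containing the reason and the state the agent was in.
function rejectFinish(logger: Logger, reason: string, state: AgentState): FinishResult {
  logger.warn("[Finish] validation rejected", { reason, state });
  return { ok: false, reason };
}

function validateFinish(logger: Logger, state: AgentState): FinishResult {
  if (state.hasUncommittedChanges) {
    return rejectFinish(logger, "uncommitted changes", state);
  }
  if (!state.reviewReplyPosted) {
    return rejectFinish(logger, "review reply not posted", state);
  }
  return { ok: true };
}
```

Funnelling every rejection through one helper means a new precondition cannot be added without also producing the log line that ops search for.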

TDD-first: 4 new tests pinning the prompt-rule presence and 5 new
tests pinning the WARN log on each rejection path.
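
As a rough picture of what pinning the prompt-rule presence can mean, the sketch below builds a prompt and reports which required phrases are absent. The rule wording, the `TERMINATION_PROTOCOL` and `REQUIRED_PHRASES` constants, the `missingTerminationRules` helper, and the `buildSystemPrompt` signature are all assumptions; the actual text in nativeToolPrompts.ts and the actual tests are not shown on this page.

```typescript
// Illustrative wording only: the real rule text is not shown on this page.
const TERMINATION_PROTOCOL: string = [
  "## Termination protocol",
  "- When the task is complete, call Finish. Trailing text does not end the session.",
  "- After Finish succeeds, do not keep working.",
  "- If Finish is rejected, fix the reported precondition and call Finish again.",
].join("\n");

// Stand-in for the shared rules constant; existing rules are elided.
const NATIVE_TOOL_EXECUTION_RULES: string = [
  "## Tool execution rules",
  "(existing rules elided)",
  TERMINATION_PROTOCOL,
].join("\n");

// Assumed signature: every agent prompt gets the shared rules appended.
function buildSystemPrompt(agentPrompt: string): string {
  return agentPrompt + "\n\n" + NATIVE_TOOL_EXECUTION_RULES;
}

// Hypothetical phrases a presence test might pin.
const REQUIRED_PHRASES: string[] = [
  "Termination protocol",
  "call Finish",
  "do not keep working",
  "Finish is rejected",
];

// Returns the required phrases that the prompt does not contain.
function missingTerminationRules(prompt: string): string[] {
  return REQUIRED_PHRASES.filter((phrase) => prompt.indexOf(phrase) === -1);
}
```

A presence test then reduces to asserting that `missingTerminationRules(buildSystemPrompt(...))` is empty, so an accidental edit that drops the Finish mandate fails fast.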

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rompt-mandate

fix(agents): mandate Finish in prompt + log Finish rejections (MNG-699)
@zbigniewsobiecki zbigniewsobiecki merged commit 3093b33 into main May 12, 2026
14 checks passed
@codecov

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 97.36842% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/gadgets/session/core/finish.ts | 97.36% | 1 Missing ⚠️ |
