Commit 56f491a

🤖 refactor: remove plan subagent auto-handoff
Remove plan-mode subagent auto-handoff after propose_plan, drop the executor-routing settings and backend router, reject plan-like task creation, and update related docs/stories/tests.

---

_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `high` • Cost: `$2.78`_

<!-- mux-attribution: model=openai:gpt-5.4 thinking=high costs=2.78 -->
1 parent 9d16d04 commit 56f491a

19 files changed

Lines changed: 252 additions & 1489 deletions

docs/AGENTS.md

Lines changed: 6 additions & 5 deletions
@@ -59,6 +59,7 @@ description: Agent instructions for AI assistants working on the Mux codebase
 Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.
 
 Core workflow:
+
 1. `agent-browser open <url>` - Navigate to page
 2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
 3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
@@ -68,8 +69,8 @@ Core workflow:
 
 - If a PR has Codex review comments, address + resolve them, then re-request review by commenting `@codex review` on the PR.
 - Prefer `gh` CLI for GitHub interactions over manual web/curl flows.
-- In Orchestrator mode, delegate implementation/verification commands to `exec` or `explore` sub-agents and integrate their patches; do not bypass delegation with direct local edits.
-- In Orchestrator mode, route higher-complexity implementation tasks to `plan` sub-agents so they can research and produce a precise plan before auto-handoff to implementation.
+- When delegation is required by the active mode, use `exec` or `explore` sub-agents as directed and integrate their patches; do not bypass delegation with direct local edits.
+- Keep implementation tasks on `exec` sub-agents; use a top-level plan workspace when you need a separate planning phase before delegation.
 
 - User preference: when work is already on an open PR, push branch updates at the end of each completed change set so the PR stays current.
 - **PR creation gate:** Do **not** open/create a pull request unless the user explicitly asks (e.g., "open a PR", "create PR", "submit this"). By default, complete local validation, commit/push branch updates as requested, and let the user review before deciding whether to open a PR.
@@ -81,11 +82,11 @@ Core workflow:
 When a PR exists, you MUST remain in this loop until the PR is fully ready:
 
 1. Push your latest fixes.
-2. Run local validation (`make static-check` and targeted tests as needed); in Orchestrator mode, delegate command execution to sub-agents.
+2. Run local validation (`make static-check` and targeted tests as needed); delegate command execution to sub-agents when the active mode requires it.
 3. Request review with `@codex review`.
 4. Run `./scripts/wait_pr_ready.sh <pr_number>` (which must execute `./scripts/wait_pr_checks.sh <pr_number> --once` while checks are pending).
-5. If Codex leaves comments, address them (delegate fixes in Orchestrator mode), resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
-6. If checks/mergeability fail, fix issues locally (delegate fixes in Orchestrator mode), push, and repeat.
+5. If Codex leaves comments, address them (delegating fixes when required by the active mode), resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
+6. If checks/mergeability fail, fix issues locally (delegating fixes when required by the active mode), push, and repeat.
 
 The only early-stop exception is when the reviewer is clearly misunderstanding the intended change and further churn would be counterproductive. In that case, leave a clarifying PR comment and pause for human direction.
 

docs/agents/index.mdx

Lines changed: 12 additions & 14 deletions
@@ -436,7 +436,7 @@ When a plan is present (default):
 - Treat the accepted plan as the source of truth. Its file paths, symbols, and structure were validated during planning — do not routinely spawn `explore` to re-confirm them. Exception: if the plan references stale paths or appears to have been authored/edited by the user without planner validation, a single targeted `explore` to sanity-check critical paths is acceptable.
 - Spawning `explore` to gather _additional_ context beyond what the plan provides is encouraged (e.g., checking whether a helper already exists, locating test files not mentioned in the plan, discovering existing patterns to match). This produces better implementation task briefs.
 - Do not spawn `explore` just to verify that a planner-generated plan is correct — that is the planner's job, and the plan was accepted by the user.
-- Convert the plan into concrete implementation subtasks and start delegation (`exec` for low complexity, `plan` for higher complexity).
+- Convert the plan into concrete implementation subtasks and start delegation with `exec` sub-agents.
 
 What you are allowed to do directly in this workspace:
 
@@ -452,8 +452,8 @@ Hard rules (delegate-first):
 - Trust `explore` sub-agent reports as authoritative for repo facts (paths/symbols/callsites). Do not redo the same investigation yourself; only re-check if the report is ambiguous or contradicts other evidence.
 - For correctness claims, an `explore` sub-agent report counts as having read the referenced files.
 - **Do not do broad repo investigation here.** If you need context, spawn an `explore` sub-agent with a narrow prompt (keeps this agent focused on coordination).
-- **Do not implement features/bugfixes directly here.** Spawn `exec` (simple) or `plan` (complex) sub-agents and have them complete the work end-to-end.
-- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec`/`plan` sub-agents instead of implementing changes here.
+- **Do not implement features/bugfixes directly here.** Spawn `exec` sub-agents and have them complete the work end-to-end.
+- **Do not use `bash` for file reads/writes, manual code editing, or broad repo exploration.** `bash` in this workspace is for orchestration-only operations: `git`/`gh` repo management, targeted post-apply verification checks, and waiting for PR review/CI outcomes. If direct checks fail due to code issues, delegate fixes to `exec` sub-agents instead of implementing changes here.
 - **Never read or scan session storage.** This includes `~/.mux/sessions/**` and `~/.mux/sessions/subagent-patches/**`. Treat session storage as an internal implementation detail; do not shell out to locate patch artifacts on disk. Only use `task_apply_git_patch` to access patches.
 
 Delegation guide:
@@ -474,12 +474,10 @@ Delegation guide:
 Trust Explore reports as authoritative; do not re-verify unless ambiguous/contradictory.
 If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
 - Create one or more git commits before `agent_report`.
-- Use `plan` for higher-complexity subtasks that touch multiple files/locations, require non-trivial investigation, or have an unclear implementation approach.
-- Default to `plan` when a subtask needs coordinated updates across multiple locations, unless the edits are mechanical and already fully specified.
-- For higher-complexity implementation work, prefer `plan` over `exec` so the sub-agent can do targeted research and produce a precise plan before implementation begins.
+- Use `exec` for implementation subtasks, including higher-complexity work.
+- For higher-complexity work, do a small amount of parent-side framing first so the `exec` brief includes the goal, constraints, sequencing, and key files.
 - Good fit: multi-file refactors, cross-module behavior changes, unfamiliar subsystems, or work where sequencing/dependencies need discovery.
-- Plan subtasks automatically hand off to implementation after a successful `propose_plan`; expect the usual task completion output once implementation finishes.
-- For `plan` briefs, prioritize goal + constraints + acceptance criteria over file-by-file diff instructions.
+- If the implementation approach is still unclear after targeted exploration, switch to a top-level plan workspace before continuing delegation instead of spawning a plan sub-agent.
 - Use `desktop` for GUI-heavy desktop automation that requires repeated screenshot → act → verify loops (for example, interacting with application windows, clicking through UI flows, or visual verification). The desktop agent enforces a grounding discipline that keeps visual context local.
 
 Recommended Orchestrator → Exec task brief template:
@@ -505,7 +503,7 @@ Recommended Orchestrator → Exec task brief template:
 If starting points + acceptance are already clear, skip initial explore and only explore when blocked.
 - Create one or more git commits before `agent_report`.
 
-Dependency analysis (required before spawning implementation tasks`exec` or `plan`):
+Dependency analysis (required before spawning implementation tasks):
 
 - For each candidate subtask, write:
 - Outputs: files/targets/artifacts introduced/renamed/generated
@@ -526,9 +524,9 @@ Example dependency chain (schema download → generation):
 Patch integration loop (default):
 
 1. Identify a batch of independent subtasks.
-2. Spawn one implementation sub-agent task per subtask with `run_in_background: true` (`exec` for low complexity, `plan` for higher complexity).
+2. Spawn one `exec` implementation sub-agent task per subtask with `run_in_background: true`.
 3. Await the batch via `task_await`.
-4. For each successful implementation task (`exec` directly, or `plan` after auto-handoff to implementation), integrate patches one at a time:
+4. For each successful implementation task, integrate patches one at a time:
 - Treat every successful child task with a `taskId` as pending patch integration, whether the completion arrived inline from `task` or later from `task_await`.
 - Complete each dry-run + real-apply pair before starting the next patch. Applying one patch changes `HEAD`, which can invalidate later dry-run results.
 - Dry-run apply: `task_apply_git_patch` with `dry_run: true`.
@@ -544,11 +542,11 @@ Patch integration loop (default):
 - Run focused verification directly with `bash` when practical (for example: targeted tests or the repo's standard full-validation command), or delegate verification to `explore`/`exec` when investigation/fixes are likely.
 - Use `git`/`gh` directly for PR orchestration when a PR already exists (pushes, review-request comments, replies to review remarks, and CI/check-status waiting loops). Create a new PR only when the user explicitly asks.
 - PASS: summary-only (no long logs).
-- FAIL: include the failing command + key error lines; then delegate a fix to `exec`/`plan` and re-verify.
+- FAIL: include the failing command + key error lines; then delegate a fix to `exec` and re-verify.
 
 Sequential protocol (only for dependency chains):
 
-1. Spawn the prerequisite implementation task (`exec` or `plan`, based on complexity) with `run_in_background: false`.
+1. Spawn the prerequisite implementation task with `agentId: "exec"` and `run_in_background: false`.
 2. If step 1 returns `queued`/`running` without a completed report, call `task_await` with the returned `taskId` before attempting any patch apply. If step 1 returns `status: completed` inline, that same `taskId` still requires patch application.
 3. Dry-run apply its patch (`dry_run: true`); then apply for real (`dry_run: false`). If either step fails, follow the conflict playbook above (including `git am --abort` only when a real apply leaves a git-am session in progress).
 4. Only after the patch is applied, spawn the dependent implementation task.
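
The ordering rule in the patch integration loop and the sequential protocol (finish each dry-run + real-apply pair before starting the next patch) could be sketched roughly as follows. This is only an illustration: `ApplyPatch`, `integratePatches`, and the result shapes are hypothetical names, and the real `task_apply_git_patch` is an agent tool rather than a TypeScript function.

```typescript
// Hypothetical stand-in for a patch-apply call; not the real tool signature.
type ApplyResult = { ok: true } | { ok: false; error: string };
type ApplyPatch = (taskId: string, dryRun: boolean) => Promise<ApplyResult>;

interface IntegrationOutcome {
  applied: string[];
  failed?: { taskId: string; stage: "dry-run" | "apply"; error: string };
}

// Integrate patches strictly one at a time. A real apply moves HEAD, so a
// dry-run for a later patch is only meaningful after earlier applies finish.
async function integratePatches(
  taskIds: readonly string[],
  applyPatch: ApplyPatch
): Promise<IntegrationOutcome> {
  const applied: string[] = [];
  for (const taskId of taskIds) {
    const dry = await applyPatch(taskId, true);
    if (!dry.ok) {
      return { applied, failed: { taskId, stage: "dry-run", error: dry.error } };
    }
    const real = await applyPatch(taskId, false);
    if (!real.ok) {
      return { applied, failed: { taskId, stage: "apply", error: real.error } };
    }
    applied.push(taskId);
  }
  return { applied };
}
```

The sketch stops at the first failure instead of skipping ahead, which leaves room for a conflict playbook to run before any later patch is attempted.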
@@ -579,7 +577,7 @@ description: Create a plan before coding
 ui:
   color: var(--color-plan-mode)
 subagent:
-  runnable: true
+  runnable: false
 tools:
   add:
     # Allow all tools by default (includes MCP tools which have dynamic names)
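
With `subagent.runnable` now `false` for the plan agent, a task-creation guard could reject requests that target it. The backend check itself is not shown in this part of the diff, so the following is a minimal sketch under assumptions: `AgentDefinition`, `checkTaskCreation`, and the error wording are hypothetical, as is the choice to treat a missing flag as not runnable.

```typescript
// Hypothetical, simplified view of an agent definition's frontmatter.
interface AgentDefinition {
  id: string;
  subagent?: { runnable?: boolean };
}

type TaskCreationCheck = { allowed: true } | { allowed: false; reason: string };

// Decide whether a sub-agent task may be created for the requested agent.
function checkTaskCreation(
  agentId: string,
  agents: ReadonlyMap<string, AgentDefinition>
): TaskCreationCheck {
  const agent = agents.get(agentId);
  if (!agent) {
    return { allowed: false, reason: `Unknown agent: ${agentId}` };
  }
  // Assumption: only an explicit `runnable: true` permits sub-agent use.
  if (agent.subagent?.runnable !== true) {
    return {
      allowed: false,
      reason: `Agent ${agentId} cannot run as a sub-agent; use exec or a top-level plan workspace`,
    };
  }
  return { allowed: true };
}
```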

src/browser/components/icons/EmojiIcon/EmojiIcon.tsx

Lines changed: 0 additions & 1 deletion
@@ -48,7 +48,6 @@ const EMOJI_TO_ICON: Record<string, LucideIcon> = {
   "🔗": Link,
   "🔄": RefreshCw,
   "🧪": Beaker,
-  // Used by auto-handoff routing status while selecting the executor.
   "🤔": CircleHelp,
 
   // Directions

src/browser/features/Settings/Sections/TasksSection.tsx

Lines changed: 0 additions & 41 deletions
@@ -34,9 +34,7 @@ import {
 import {
   DEFAULT_TASK_SETTINGS,
   TASK_SETTINGS_LIMITS,
-  isPlanSubagentExecutorRouting,
   normalizeTaskSettings,
-  type PlanSubagentExecutorRouting,
   type TaskSettings,
 } from "@/common/types/tasks";
 import { getThinkingOptionLabel, type ThinkingLevel } from "@/common/types/thinking";
@@ -173,8 +171,6 @@ function areTaskSettingsEqual(a: TaskSettings, b: TaskSettings): boolean {
     a.maxParallelAgentTasks === b.maxParallelAgentTasks &&
     a.maxTaskNestingDepth === b.maxTaskNestingDepth &&
     a.proposePlanImplementReplacesChatHistory === b.proposePlanImplementReplacesChatHistory &&
-    a.planSubagentExecutorRouting === b.planSubagentExecutorRouting &&
-    a.planSubagentDefaultsToOrchestrator === b.planSubagentDefaultsToOrchestrator &&
     a.bashOutputCompactionMinLines === b.bashOutputCompactionMinLines &&
     a.bashOutputCompactionMinTotalBytes === b.bashOutputCompactionMinTotalBytes &&
     a.bashOutputCompactionMaxKeptLines === b.bashOutputCompactionMaxKeptLines &&
@@ -499,25 +495,10 @@ export function TasksSection() {
     );
   };
 
-  const setPlanSubagentExecutorRouting = (value: string) => {
-    if (!isPlanSubagentExecutorRouting(value)) {
-      return;
-    }
-
-    setTaskSettings((prev) =>
-      normalizeTaskSettings({
-        ...prev,
-        planSubagentExecutorRouting: value,
-      })
-    );
-  };
   const setNewWorkspaceDefaultAgentId = (agentId: string) => {
     setGlobalDefaultAgentIdRaw(coerceAgentId(agentId));
   };
 
-  const planSubagentExecutorRouting: PlanSubagentExecutorRouting =
-    taskSettings.planSubagentExecutorRouting ?? "exec";
-
   const setAgentModel = (agentId: string, value: string) => {
     setAgentAiDefaults((prev) =>
       updateAgentDefaultEntry(prev, agentId, (updated) => {
@@ -917,28 +898,6 @@ export function TasksSection() {
               aria-label="Toggle plan Implement replaces conversation with plan"
             />
           </div>
-
-          <div className="flex items-center justify-between gap-4">
-            <div className="flex-1">
-              <div className="text-foreground text-sm">Plan sub-agents: executor routing</div>
-              <div className="text-muted text-xs">
-                Choose how plan sub-agent tasks route after propose_plan.
-              </div>
-            </div>
-            <Select
-              value={planSubagentExecutorRouting}
-              onValueChange={setPlanSubagentExecutorRouting}
-            >
-              <SelectTrigger className="border-border-medium bg-background-secondary h-9 w-44">
-                <SelectValue />
-              </SelectTrigger>
-              <SelectContent>
-                <SelectItem value="exec">Exec</SelectItem>
-                <SelectItem value="orchestrator">Orchestrator</SelectItem>
-                <SelectItem value="auto">Auto (Agent chooses)</SelectItem>
-              </SelectContent>
-            </Select>
-          </div>
         </div>
 
         {saveError ? <div className="text-danger-light mt-4 text-xs">{saveError}</div> : null}

src/browser/features/Tools/ProposePlan/ProposePlanToolCall.stories.tsx

Lines changed: 0 additions & 83 deletions
@@ -6,12 +6,7 @@ import {
   createUserMessage,
   createAssistantMessage,
   createProposePlanTool,
-  createStatusTool,
 } from "@/browser/stories/mockFactory";
-import {
-  PLAN_AUTO_ROUTING_STATUS_EMOJI,
-  PLAN_AUTO_ROUTING_STATUS_MESSAGE,
-} from "@/common/constants/planAutoRoutingStatus";
 
 const meta = { ...appMeta, title: "App/Chat/Tools/ProposePlan" };
 export default meta;
@@ -167,84 +162,6 @@ graph TD
   },
 };
 
-/**
- * Captures the handoff pause after a plan is presented and before the executor stream starts.
- *
- * This reproduces the visual state where the sidebar shows "Deciding execution strategy…"
- * while the proposed plan remains visible in the conversation.
- */
-export const ProposePlanAutoRoutingDecisionGap: AppStory = {
-  render: () => (
-    <AppWithMocks
-      setup={() =>
-        setupSimpleChatStory({
-          workspaceId: "ws-plan-auto-routing-gap",
-          workspaceName: "feature/plan-auto-routing",
-          messages: [
-            createUserMessage(
-              "msg-1",
-              "Plan and implement a safe migration rollout for auth tokens.",
-              {
-                historySequence: 1,
-                timestamp: STABLE_TIMESTAMP - 240000,
-              }
-            ),
-            createAssistantMessage("msg-2", "Here is the implementation plan.", {
-              historySequence: 2,
-              timestamp: STABLE_TIMESTAMP - 230000,
-              toolCalls: [
-                createProposePlanTool(
-                  "call-plan-1",
-                  `# Auth Token Migration Rollout
-
-## Goals
-
-- Migrate token validation to the new signing service.
-- Maintain compatibility during rollout.
-- Keep rollback simple and low risk.
-
-## Steps
-
-1. Add dual-read token validation behind a feature flag.
-2. Ship telemetry for token verification outcomes.
-3. Enable new validator for 10% of traffic.
-4. Ramp to 100% after stability checks.
-5. Remove legacy validator once metrics stay healthy.
-
-## Rollback
-
-- Disable the rollout flag to return to legacy validation immediately.
-- Keep telemetry running to confirm recovery.`
-                ),
-              ],
-            }),
-            createAssistantMessage("msg-3", "Selecting the right executor for this plan.", {
-              historySequence: 3,
-              timestamp: STABLE_TIMESTAMP - 220000,
-              toolCalls: [
-                createStatusTool(
-                  "call-status-1",
-                  PLAN_AUTO_ROUTING_STATUS_EMOJI,
-                  PLAN_AUTO_ROUTING_STATUS_MESSAGE
-                ),
-              ],
-            }),
-          ],
-        })
-      }
-    />
-  ),
-  parameters: {
-    docs: {
-      description: {
-        story:
-          "Chromatic regression story for the plan auto-routing gap: after `propose_plan` succeeds, " +
-          "the sidebar stays in a working state with a 'Deciding execution strategy…' status before executor kickoff.",
-      },
-    },
-  },
-};
-
 /**
  * Mobile viewport version of ProposePlan.
  *

src/common/config/schemas/appConfigOnDisk.ts

Lines changed: 2 additions & 2 deletions
@@ -8,8 +8,8 @@ import { TaskSettingsSchema } from "./taskSettings";
 
 export { RuntimeEnablementOverridesSchema } from "../../schemas/runtimeEnablement";
 export type { RuntimeEnablementOverrides } from "../../schemas/runtimeEnablement";
-export { PlanSubagentExecutorRoutingSchema, TaskSettingsSchema } from "./taskSettings";
-export type { PlanSubagentExecutorRouting, TaskSettings } from "./taskSettings";
+export { TaskSettingsSchema } from "./taskSettings";
+export type { TaskSettings } from "./taskSettings";
 
 export const AgentAiDefaultsEntrySchema = z.object({
   modelString: z.string().optional(),

src/common/config/schemas/taskSettings.ts

Lines changed: 0 additions & 6 deletions
@@ -12,10 +12,6 @@ export const SYSTEM1_BASH_OUTPUT_COMPACTION_LIMITS = {
   bashOutputCompactionTimeoutMs: { min: 1_000, max: 120_000, default: 5_000 },
 } as const;
 
-export const PlanSubagentExecutorRoutingSchema = z.enum(["exec", "orchestrator", "auto"]);
-
-export type PlanSubagentExecutorRouting = z.infer<typeof PlanSubagentExecutorRoutingSchema>;
-
 export const TaskSettingsSchema = z.object({
   maxParallelAgentTasks: z
     .number()
@@ -30,8 +26,6 @@ export const TaskSettingsSchema = z.object({
     .max(TASK_SETTINGS_LIMITS.maxTaskNestingDepth.max)
     .optional(),
   proposePlanImplementReplacesChatHistory: z.boolean().optional(),
-  planSubagentExecutorRouting: PlanSubagentExecutorRoutingSchema.optional(),
-  planSubagentDefaultsToOrchestrator: z.boolean().optional(),
   bashOutputCompactionMinLines: z
     .number()
     .int()
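
Config files written before this commit may still contain the two removed keys. The visible hunks do not show how such files are handled, so the following is only one possible approach, sketched without any dependency on the real schema: `REMOVED_TASK_SETTING_KEYS` and `dropRemovedTaskSettings` are hypothetical names, not helpers from the codebase.

```typescript
// The two keys this commit removes from TaskSettingsSchema.
const REMOVED_TASK_SETTING_KEYS = [
  "planSubagentExecutorRouting",
  "planSubagentDefaultsToOrchestrator",
] as const;

// Hypothetical helper: copy persisted task settings, leaving out removed keys.
// The input object is not mutated.
function dropRemovedTaskSettings(
  raw: Readonly<Record<string, unknown>>
): Record<string, unknown> {
  const removed = new Set<string>(REMOVED_TASK_SETTING_KEYS);
  const cleaned: Record<string, unknown> = {};
  for (const key of Object.keys(raw)) {
    if (!removed.has(key)) {
      cleaned[key] = raw[key];
    }
  }
  return cleaned;
}
```

Running something like this before validation would keep older files loading even if the schema were configured to reject unknown keys.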

src/common/constants/planAutoRoutingStatus.ts

Lines changed: 0 additions & 4 deletions
This file was deleted.
