🤖 fix: prevent transcript flashes and tearing during chat hydration #3152
Conversation
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 82248db82c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Force-pushed 82248db to b567850 (Compare)
@codex review Addressed the integration failure by making the new regression test compatible with the Jest-based integration runner as well as Bun.
Codex Review: Didn't find any major issues. Swish!
Force-pushed b567850 to 6234138 (Compare)
@codex review Restricted the optimistic pending-start flag to auto-navigated creations and added hook coverage for the background-creation path.
💡 Codex Review
Reviewed commit: 6234138ce2
Force-pushed 6234138 to d21a216 (Compare)
@codex review Kept the optimistic pending-start flag through buffered first-turn replay until…
💡 Codex Review
Reviewed commit: d21a21607b
Force-pushed d21a216 to f064605 (Compare)
@codex review Cleared stale optimistic start state when replay/activity confirms no active stream, and added WorkspaceStore coverage for both buffered replay and no-stream catch-up paths.
💡 Codex Review
Reviewed commit: f06460566e
Force-pushed f064605 to d9dcdef (Compare)
@codex review Stopped clearing the optimistic-start flag on recency-only non-streaming activity updates, and added WorkspaceStore coverage for that exact startup gap.
💡 Codex Review
Reviewed commit: d9dcdef0c9
Force-pushed d9dcdef to 6e026b5 (Compare)
@codex review Kept the optimistic-start flag across caught-up-without-first-turn, while still clearing it on buffered/live first-turn observation and on definitive background stop.
💡 Codex Review
Reviewed commit: 6e026b5709
Force-pushed 6e026b5 to c7101d6 (Compare)
@codex review Preserved optimistic initial-send startup across full replay resets and added a direct WorkspaceStore regression test for that reset path.
💡 Codex Review
Reviewed commit: c7101d697a
Force-pushed c7101d6 to 63063c3 (Compare)
@codex review Cleared optimistic pending-start on pre-stream abort as well, with a targeted WorkspaceStore regression test for the interrupted-first-send path.
Codex Review: Didn't find any major issues. Delightful!
@codex review Rewrote the new-chat startup fix so the optimistic pending-start state lives in…
Codex Review: Didn't find any major issues. Can't wait for the next one!
@codex review Follow-up fix for the remaining workspace-switch transcript flash:
Make deferred transcript rendering workspace-aware so chat switches cannot briefly render a stale deferred snapshot from the previous workspace. ChatPane now defers a workspace-scoped transcript snapshot instead of just the message array, and the deferred-message guard immediately falls back to the live snapshot when the deferred rows still belong to another workspace. Adds a regression unit test for the cross-workspace deferred snapshot case and reruns the switch-sensitive UI coverage.

---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$117.90`_
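The fallback rule described above can be reduced to a small pure decision. This is an illustrative sketch, not ChatPane's actual code: the `TranscriptSnapshot` shape and `pickRenderedSnapshot` name are assumptions for the example.

```typescript
// Hypothetical shape for a workspace-scoped transcript snapshot.
interface TranscriptSnapshot {
  workspaceId: string;
  rows: string[];
}

// Choose which snapshot to render: if the deferred snapshot is missing or
// still belongs to another workspace, fall back to the live snapshot
// immediately so stale rows never flash during a chat switch.
function pickRenderedSnapshot(
  live: TranscriptSnapshot,
  deferred: TranscriptSnapshot | null
): TranscriptSnapshot {
  if (deferred === null) return live;
  if (deferred.workspaceId !== live.workspaceId) return live;
  return deferred;
}
```

In a React implementation the deferred value would typically come from `useDeferredValue`, with this guard applied on every render before the rows are handed to the transcript list.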
Add a browser-mode repro script that boots an isolated dev-server, creates two live mock-chat workspaces, switches between them, and captures screenshot/scroll diagnostics. The script exits non-zero when the target transcript keeps shifting after it is already visible, matching the user-reported same-transcript tear.

---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$121.96`_
Persist in-flight mock assistant parts to partial.json so browser workspace switches can reopen a mid-stream mock turn from authoritative partial state instead of an empty placeholder, and tighten the browser repro to wait for a settled completed assistant row before measuring switch-time transcript stability.

Also move optimistic pending-start marking into the state update that actually wins workspace selection, so background-created workspaces do not inherit a stale "starting" barrier.

---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$136.74`_
Extend the hydration keep-mounted path to Electron so opening an unseen workspace does not swap the whole shell back to the loading placeholder and drop the composer.
Add a regression test that exercises the same loading state with window.api present.
---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$263.05`_
Hold the chat composer region at its previous height while a workspace transcript hydrates so workspace-specific decorations above the input do not collapse and re-expand during Electron switches.
Also guard mock stream-error and stream-end partial cleanup against replacement-stream races before deleting shared partial state, and add regression coverage for both hydration height holding and stale cleanup behavior.
---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$263.93`_
Preserve optimistic startup through the first authoritative idle catch-up for new chats, but clear it on a later idle catch-up if no replayed turn ever appears so the workspace cannot stay stuck in starting mode forever.
Add WorkspaceStore regression coverage for the second idle catch-up path.
---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$276.04`_
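The pending-start lifecycle negotiated over the review rounds above (survive the first authoritative idle catch-up, clear on a later one, and clear definitively on first-turn observation or pre-stream abort) can be sketched as a tiny reducer. Event names and the state shape are assumptions for illustration, not WorkspaceStore's actual API.

```typescript
// Events that can end or extend the optimistic "starting" window.
type StartupEvent =
  | { kind: "first-turn-observed" } // replayed or live first turn arrived
  | { kind: "stream-abort" }        // user interrupted before the stream began
  | { kind: "idle-catch-up" };      // authoritative catch-up with no active stream

interface PendingStart {
  active: boolean;
  idleCatchUps: number;
}

function reducePendingStart(state: PendingStart, ev: StartupEvent): PendingStart {
  if (!state.active) return state;
  switch (ev.kind) {
    case "first-turn-observed":
    case "stream-abort":
      // A first turn or a pre-stream abort ends the optimistic window for good.
      return { active: false, idleCatchUps: state.idleCatchUps };
    case "idle-catch-up": {
      // Survive the first idle catch-up (the first send may still be in flight);
      // a second idle catch-up with no replayed turn means nothing is coming,
      // so the workspace cannot stay stuck in starting mode forever.
      const n = state.idleCatchUps + 1;
      return { active: n < 2, idleCatchUps: n };
    }
  }
}
```

The key design point is that only definitive signals clear the flag; recency-only activity updates simply never reach this reducer.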
@codex review
Force-pushed 8248788 to 56be477 (Compare)
💡 Codex Review
Reviewed commit: 56be47759d
@codex review Addressed the stale delayed partial write race in MockAiStreamPlayer and refreshed the PR summary/body to match the full diff. Please take another look.
💡 Codex Review
Reviewed commit: 1011781465
@codex review Addressed the abort-during-replacement-cleanup race in MockAiStreamPlayer and revalidated the focused mock/compaction coverage plus static-check. Please take another look.
💡 Codex Review
Reviewed commit: 883fc936f3
@codex review Addressed the stale delayed-write cleanup race by making partial deletion message-aware, and revalidated the mock stream tests, compaction integration test, and static-check. Please take another look.
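"Message-aware partial deletion" can be illustrated with a small ownership check: a delayed cleanup only removes the partial snapshot if it still belongs to the message that scheduled the cleanup, so a replacement stream's partial is never clobbered. This is a hedged sketch; the `PartialStore` class and method names are invented for the example and are not MockAiStreamPlayer's real code.

```typescript
// One partial snapshot per workspace, tagged with the message that wrote it.
class PartialStore {
  private partials = new Map<string, { messageId: string; text: string }>();

  write(workspaceId: string, messageId: string, text: string): void {
    this.partials.set(workspaceId, { messageId, text });
  }

  // Delete only when the stored partial was written by the same message.
  // A stale delayed delete (scheduled by an earlier, replaced stream)
  // finds a different owner and becomes a no-op.
  deleteIfOwnedBy(workspaceId: string, messageId: string): boolean {
    const current = this.partials.get(workspaceId);
    if (!current || current.messageId !== messageId) return false;
    this.partials.delete(workspaceId);
    return true;
  }
}
```

The same ownership guard covers both races discussed in this thread: stop/abort cleanup racing a replacement stream, and a delayed write's cleanup firing after a new stream has taken over the slot.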
Codex Review: Didn't find any major issues. Chef's kiss.
## Summary

This PR removes the remaining chat-input tear that can still show up after #3152 when the workspace-specific decorations above `ChatInput` hydrate at a different time than the textarea itself. The composer now treats that decoration area as its own stable lane, so the textarea seam stays anchored while TODO/review/background-process/queued-message UI catches up.

## Background

#3152 fixed the largest workspace-open and new-chat tears by keeping `WorkspaceShell`/`ChatPane` mounted through hydration. The remaining artifact was lower in the tree: `HydrationStablePane` reserved the **entire** composer pane, which prevented total collapse but still let the textarea float inside a tall wrapper while the decorations above it disappeared and reappeared. That meant the visible seam the user actually watches (the boundary right above `ChatInput`) could still jump during workspace switches or reopen/hydration, especially in Electron.

## Implementation

- replace `HydrationStablePane` with `ChatInputDecorationStack`, a dedicated decoration-lane wrapper that:
  - measures only the workspace-specific UI above the textarea
  - caches per-workspace decoration heights for hydration reuse
  - bottom-aligns the reserved lane so blank space stays above the decoration stack instead of below the textarea
- restructure `ChatPane` so it builds one ordered decoration lane and renders `ChatInput` as a separate sibling beneath it, rather than wrapping the whole composer in a shared min-height container
- centralize the lane visibility decisions that are already available at the `ChatPane` boundary (pinned TODOs and reviews), while keeping `BackgroundProcessesBanner` self-owned so an open output dialog stays mounted even after the last process exits
- rename and expand the hydration wrapper tests to assert the reserved height belongs to the decoration lane, not the input section itself

## Validation

- `bun test ./src/browser/components/ChatPane/ChatInputDecorationStack.test.tsx`
- `env TEST_INTEGRATION=1 bun x jest tests/ui/chat/bottomLayoutShift.test.ts --runInBand --silent`
- `make static-check`

## Risks

The main regression risk is accidentally hiding a decoration whose visibility is now decided at the `ChatPane` boundary before rendering the leaf component. This change keeps those visibility rules narrow, leaves background-process dialog ownership with the existing banner component, and preserves the existing bottom-layout integration test.

## Pains

The old hydration fix was stabilizing the wrong boundary. Reserving the composer height helped, but it still preserved the wrapper instead of the textarea seam, so the visual tear could survive even though the outer box looked stable in code.

---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$295.73`_
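The "caches per-workspace decoration heights for hydration reuse" step can be sketched as a plain-data helper. This is a minimal illustration under assumed names (`DecorationHeightCache`, `reservedHeight`); the real `ChatInputDecorationStack` measures DOM nodes and integrates with React rendering.

```typescript
// Per-workspace cache of the last measured decoration-lane height.
class DecorationHeightCache {
  private heights = new Map<string, number>();

  // Record a fresh measurement once the lane has rendered for a workspace.
  record(workspaceId: string, measuredPx: number): void {
    this.heights.set(workspaceId, measuredPx);
  }

  // While a workspace transcript hydrates, reserve the last measured lane
  // height so the textarea seam does not jump; a never-seen workspace
  // reserves nothing and the lane simply grows when its decorations mount.
  reservedHeight(workspaceId: string): number {
    return this.heights.get(workspaceId) ?? 0;
  }
}
```

Bottom-aligning the reserved lane (as the PR body describes) then ensures any unused reserved space appears above the decoration stack, never between the decorations and the textarea.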
## Summary

This PR now addresses the remaining transcript hydration tears from both sides: it still surfaces replay-buffered active-stream state early enough for the streaming barrier to render immediately, and it now restructures the volatile transcript/layout seams so tail chrome, inline interruption markers, and composer-adjacent transcript resizing are stabilized by explicit lane owners instead of ad hoc `ChatPane` conditionals.

## Background

#3152 and #3173 removed the biggest startup/workspace-switch tears by keeping `ChatPane` mounted and stabilizing the decoration lane above `ChatInput`. The remaining follow-up that opened this PR was the transcript-tail streaming barrier: reopened streaming workspaces could still hydrate first and only later discover that a stream was already active. Once that state-layer fix landed, a broader audit still found two architectural seams in the transcript itself:

- tail chrome (`RetryBarrier`, `StreamingBarrier`, queued-agent prompt, concurrent-local warning) was still rendered as ad hoc footer JSX in `ChatPane`
- inline interrupted markers were rendered from a second visibility path that `ChatPane` also had to remember in its bottom-pinning dependency list

That kept transcript stability dependent on a hand-maintained list of booleans instead of on the actual layout boundaries that can change height.

## Implementation

- keep the original `WorkspaceStore` hydration fix in place so replay-buffered `stream-start` events drive `canInterrupt`, `currentModel`, and `currentThinkingLevel` before `caught-up`
- add `TranscriptTailStack`, a dedicated tail-lane owner that:
  - renders all volatile transcript-tail chrome from a single item list
  - reserves per-workspace tail height during hydration
  - pins the transcript bottom when a tail item's layout signature changes or its measured height changes
- introduce a small shared `LayoutStackItem`/signature helper so lane owners can describe layout-affecting seams directly instead of teaching `ChatPane` about individual banner booleans
- move concurrent-local warning visibility into a hook + presentational view so `ChatPane` can model it as part of the tail lane, rather than as a self-subscribing child that appears out-of-band
- let `ChatInputPane` own transcript pinning for its decoration lane via one derived layout signature, instead of relying on `ChatPane` to remember which banner/todo states might resize the viewport from below
- derive interrupted-barrier visibility once per render and reuse that same decision for both row rendering and pre-paint transcript pinning, so inline interruption markers can no longer bypass the stabilization path

## Validation

- `bun test src/browser/components/ChatPane/TranscriptTailStack.test.tsx src/browser/components/ChatPane/ChatInputDecorationStack.test.tsx src/browser/stores/WorkspaceStore.test.ts`
- `bun test ./tests/ui/chat/bottomLayoutShift.test.ts ./tests/ui/chat/newChatStreamingFlash.test.ts ./tests/ui/chat/streamInterrupt.test.ts`
- `make static-check`

## Risks

The main regression risk is over-eager bottom pinning when the user has intentionally scrolled away from the tail. The new lane owners all gate their pinning on `autoScroll`, and the retry barrier still stays mounted behind a `visible` layout key so its rollback/auto-retry bookkeeping is preserved even while its rendered height drops to zero.

---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$330.21`_
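The layout-signature idea above can be reduced to a small pure helper: each lane owner derives one string from its visible items, and the transcript bottom is pinned only when that string changes while `autoScroll` is on. The `LayoutStackItem` shape here follows the PR body's naming, but the function bodies are an illustrative sketch, not the shipped helper.

```typescript
// One layout-affecting seam owned by a lane (tail chrome, decoration, etc.).
interface LayoutStackItem {
  key: string;      // stable identity, e.g. "retry-barrier"
  visible: boolean; // hidden items contribute nothing to the signature
  heightPx: number; // last measured height
}

// Collapse the ordered visible items into a single comparable signature,
// so any visibility flip or height change shows up as a string change.
function layoutSignature(items: LayoutStackItem[]): string {
  return items
    .filter((it) => it.visible)
    .map((it) => `${it.key}:${it.heightPx}`)
    .join("|");
}

// Pin the transcript bottom only when the seam actually moved and the user
// has not intentionally scrolled away from the tail.
function shouldPinBottom(prev: string, next: string, autoScroll: boolean): boolean {
  return autoScroll && prev !== next;
}
```

Keeping a hidden item in the list with `visible: false` (rather than unmounting it) matches the retry-barrier design note: its bookkeeping survives while its contribution to the signature drops to nothing.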
## Summary

This PR fixes the remaining chat handoff artifacts around startup, workspace switches, and reopen/hydration. New chats keep their explicit starting barrier instead of flashing generic empty/loading placeholders; the chat pane stays mounted through hydration in both browser and Electron paths; the bottom composer/decorations stack holds its height while workspace-specific banners catch up; and mock streaming now persists/replays partial assistant state without letting stale cleanup or stale delayed writes clobber replacement streams.

## Background

Workspace creation navigates into the new workspace as soon as the workspace exists, but the first `sendMessage` and chat subscription replay can land a moment later. During that gap, brand-new chats could briefly flash `Catching up with the agent...` or `No Messages Yet` before the first turn appeared.

Separately, switching to or reopening an existing workspace could still swap back to a loading shell, remount the transcript viewport, or let workspace-specific footer banners collapse and re-expand during hydration. Those transitions showed up as the remaining vertical tear, especially in Electron (`make dev`).

Mock mode had its own reconnect/resume gaps: partial assistant state needed to stay on disk during streaming so reopened workspaces could recover the in-flight turn, but stale stop/replacement paths could still delete or recreate the wrong partial snapshot.

## Implementation

- keep the optimistic pending-start state in `StreamingMessageAggregator` and clear it from replay/terminal paths instead of a separate transient UI path
- show the explicit starting barrier while `isStreamStarting` is true
- keep `WorkspaceShell` and `ChatPane` mounted during transcript hydration so opening a workspace no longer drops the composer; apply that hydration behavior in Electron as well as browser mode
- keep `ChatPane` stable on workspace switches and make deferred transcript snapshots workspace-aware so stale rows and viewport tears cannot flash between chats
- wrap `HydrationStablePane` around the chat input/decorations stack so workspace-specific warnings, queued banners, and review chips hold their last measured height until hydration completes

## Validation

- `make static-check`
- `bun test ./src/browser/components/WorkspaceShell/WorkspaceShell.test.tsx`
- `bun test ./src/browser/components/ChatPane/HydrationStablePane.test.tsx`
- `bun test ./src/browser/stores/WorkspaceStore.test.ts`
- `bun test ./src/browser/utils/messages/StreamingMessageAggregator.test.ts`
- `bun test ./src/browser/utils/messages/messageUtils.test.ts`
- `bun test ./src/browser/features/ChatInput/useCreationWorkspace.test.tsx`
- `bun test ./src/browser/hooks/useAutoScroll.test.tsx`
- `bun test ./src/node/services/mock/mockAiStreamPlayer.test.ts`
- `TEST_INTEGRATION=1 bun x jest tests/ui/chat/newChatStreamingFlash.test.ts --runInBand --silent`
- `TEST_INTEGRATION=1 bun x jest tests/ui/chat/bottomLayoutShift.test.ts --runInBand --silent`
- `TEST_INTEGRATION=1 bun x jest tests/ui/tasks/awaitVisualization.test.ts --runInBand --silent`
- `bun scripts/reproWorkspaceSwitchTearWeb.ts`

## Risks

The main regression risk is over-preserving old UI state during workspace switches or hydration. Coverage now explicitly exercises new-chat startup, web and Electron shell hydration behavior, bottom composer height stability, deferred snapshot cross-workspace isolation, and mock partial cleanup races around stop, replacement, and delayed writes.

## Pains

The visible tear turned out to be a stack of smaller handoff issues rather than one bug: new-chat startup state, shell-level hydration placeholders, transcript remounting, deferred snapshot lag, and the workspace-specific composer decoration stack each contributed a different artifact. Review also flushed out several mock partial races that were easy to miss without targeted timing tests.

---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$286.28`_