
feat(runtime): /runtimes/* HTTP surface + RuntimeStatusBar/ControlPanel UI #971

Open

Dani Akash (DaniAkash) wants to merge 9 commits into feat/openclaw-runtime from feat/runtime-control-ui

Conversation

@DaniAkash (Contributor)

Summary

Stacked on #970 (feat/openclaw-runtime). Lands the user-visible piece of the AgentRuntime architecture: a uniform /runtimes/<adapter>/* HTTP surface backed by runtime.executeAction(...) through AgentRuntimeRegistry, plus capability-gated UI components that consume it.

Server:

  • GET /runtimes — list all registered runtimes with descriptor + status snapshot + capabilities
  • GET /runtimes/:adapter/status — single runtime status
  • GET /runtimes/:adapter/status/stream — SSE: snapshot on connect + every state transition + 15s heartbeat
  • POST /runtimes/:adapter/actions/:action — capability-gated dispatch through executeAction. Body schema picks up agentId for reset-wipe-agent. 405 if action not in capabilities; 400 on unknown action; 500 on action throw.
  • GET /runtimes/:adapter/logs — container-runtime logs (405 for host-process)
  • All routes use zValidator for path/query/body so the typed RPC client (hc<AppType>) picks up the schemas.
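The dispatch semantics of the actions route can be sketched framework-free. This is a hypothetical reduction of the route's branching, not the actual runtimes.ts code — `RuntimeLike`, `dispatchAction`, and `KNOWN_ACTIONS` are illustrative names:

```typescript
// Illustrative sketch of the capability gate behind
// POST /runtimes/:adapter/actions/:action (names are assumptions).
type RuntimeAction = "start" | "stop" | "restart" | "reset-wipe-agent";

interface RuntimeLike {
  capabilities: RuntimeAction[];
  executeAction(input: { type: RuntimeAction; agentId?: string }): Promise<void>;
}

const KNOWN_ACTIONS: RuntimeAction[] = ["start", "stop", "restart", "reset-wipe-agent"];

// Returns the HTTP status the route would respond with.
async function dispatchAction(
  runtime: RuntimeLike,
  action: string,
  body: { agentId?: string },
): Promise<number> {
  if (!KNOWN_ACTIONS.includes(action as RuntimeAction)) return 400; // unknown action
  if (!runtime.capabilities.includes(action as RuntimeAction)) return 405; // not in capabilities
  if (action === "reset-wipe-agent" && !body.agentId) return 400; // agentId required
  try {
    await runtime.executeAction({ type: action as RuntimeAction, agentId: body.agentId });
    return 200;
  } catch {
    return 500; // action threw
  }
}
```

In the real route the same checks run after zValidator has already validated the path, query, and body against the shared schemas.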

UI:

  • useRuntime(adapter) / useRuntimeAction(adapter) / useRuntimeLogs(adapter) — generic React Query hooks backed by the typed RPC client. 5s default poll; mutations invalidate the status query on success.
  • <RuntimeStatusBar adapter='…'> replaces GatewayStatusBar. Compact one-line bar with state pill + optional Restart. extraPill and extraActions slots let openclaw add its control-plane pill and Open Terminal button without baking gateway specifics into the runtime layer.
  • <RuntimeControlPanel adapter='…'> replaces GatewayStateCards from OpenClawControls. Capability-gated state-appropriate primary CTA: not_installed → Install, stopped → Start, errored → Restart + Reset, installing/starting → spinner, cli_missing/unhealthy → Reinstall CLI, running → optional Stop. extras slot for adapter-specific affordances (e.g. openclaw's provider Setup dialog trigger).
  • AgentsPage rewired to mount the new components. The 'Unavailable' badge in AgentSummaryChips.tsx is deleted (the capabilities-driven UI surfaces the signal more usefully on the new RuntimeControlPanel).
  • GatewayStatusBar.tsx is deleted outright.
  • ControlPlaneAlert / LifecycleAlert / InlineErrorAlert from OpenClawControls remain — they cover gateway-specific concerns the runtime layer doesn't model.

Out of scope (deferred follow-ups):

  • Deleting the legacy /claw/{status,start,stop,restart,logs} lifecycle routes — UI still polls /claw/status for control-plane info that lives outside the runtime registry. Will land once the control-plane surface is moved to the runtime layer (Phase 7+).
  • Slimming useOpenClaw.ts's lifecycle mutations — they're now a fallback, replaced by the new hooks at the call sites that matter.

Test plan

  • bun run typecheck clean across server + UI (pre-existing missing-generated-graphql errors aside)
  • biome check clean on touched files
  • 11 new server-side tests in tests/api/routes/runtimes.test.ts covering list/status/actions (capability gate, unknown action, agentId requirement, throw → 500) / logs (container vs host-process)
  • Full server test sweep — 1042 pass, 0 fail (one pre-existing ContainerCli flake also reproduces on plain origin/dev)
  • End-to-end UI verification by Dani — full openclaw lifecycle via the new RuntimeStatusBar + RuntimeControlPanel before merging this stack

Uniform HTTP surface backed by AgentRuntimeRegistry + runtime.executeAction:
- GET /runtimes — list all registered runtimes (descriptor + status + capabilities)
- GET /runtimes/:adapter/status — single status snapshot
- GET /runtimes/:adapter/status/stream — SSE: snapshot on connect + every state transition
- POST /runtimes/:adapter/actions/:action — capability-gated dispatch through executeAction
- GET /runtimes/:adapter/logs — container-runtime logs (405 for host-process)

Routes use zValidator for path/query/body so the typed RPC client picks
up the schemas; mounted with the same requireTrustedAppOrigin
middleware as /claw/* /terminal /acl-rules /monitoring.
Generic React Query hooks backed by the typed RPC client (hc<AppType>),
keyed by adapter id. useRuntime polls /runtimes/:adapter/status every
5s by default; useRuntimeAction issues a capability-gated POST to
/runtimes/:adapter/actions/:action and invalidates the status query
on success; useRuntimeLogs is opt-in (disabled by default) for
container runtimes.
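The contract those hooks implement — poll status, and re-fetch status immediately after a successful action — can be sketched without React Query. This is a framework-free illustration under assumed names (`RuntimeClient`, `fetchJson`); the real hooks express the same contract via useQuery/useMutation over the typed RPC client:

```typescript
// Framework-free sketch of the useRuntime/useRuntimeAction contract.
// `fetchJson` stands in for the typed Hono RPC client (hc<AppType>).
type Json = unknown;

class RuntimeClient {
  constructor(
    private adapter: string,
    private fetchJson: (method: string, path: string, body?: Json) => Promise<Json>,
  ) {}

  // Shape of the React Query key the real hooks invalidate.
  statusKey(): readonly [string, string] {
    return ["runtime-status", this.adapter] as const;
  }

  // Polled every 5s by useRuntime.
  status(): Promise<Json> {
    return this.fetchJson("GET", `/runtimes/${this.adapter}/status`);
  }

  // useRuntimeAction: POST the action, then invalidate (re-fetch) status.
  async act(action: string, agentId?: string): Promise<Json> {
    const result = await this.fetchJson(
      "POST",
      `/runtimes/${this.adapter}/actions/${action}`,
      agentId ? { agentId } : {},
    );
    // In the real hook: queryClient.invalidateQueries({ queryKey: this.statusKey() })
    await this.status();
    return result;
  }
}
```

The invalidation step is what makes the UI snap to the new state right after a Start/Restart instead of waiting out the 5s poll interval.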
RuntimeStatusBar — compact one-line bar with adapter name + state pill
+ optional Restart action. Reads from useRuntime(adapter); the pill
covers every container and host-process state. extraPill / extraActions
slots let openclaw add its control-plane pill and Open Terminal
button without baking gateway specifics into the runtime layer.

RuntimeControlPanel — capability-gated state-appropriate primary CTA:
not_installed → Install, stopped → Start, errored → Restart + Reset,
installing/starting → spinner, cli_missing/unhealthy → Reinstall CLI,
running → optional Stop. extras slot for adapter-specific affordances
(e.g. openclaw provider Setup dialog trigger).
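The state → CTA mapping above can be sketched as a pure function. State and label names follow the description; `canStop` is an illustrative stand-in for the capability gate, and the `installed` case reflects the follow-up commit that treats it like `stopped`:

```typescript
// Illustrative sketch of RuntimeControlPanel's primary-CTA selection.
type RuntimeState =
  | "not_installed" | "installed" | "stopped" | "errored"
  | "installing" | "starting" | "cli_missing" | "unhealthy" | "running";

type Cta = "Install" | "Start" | "Restart + Reset" | "Spinner" | "Reinstall CLI" | "Stop" | "None";

function primaryCta(state: RuntimeState, canStop = false): Cta {
  switch (state) {
    case "not_installed": return "Install";
    case "installed": // treated like stopped (image pulled / container exists but stopped)
    case "stopped": return "Start";
    case "errored": return "Restart + Reset";
    case "installing":
    case "starting": return "Spinner";
    case "cli_missing":
    case "unhealthy": return "Reinstall CLI";
    case "running": return canStop ? "Stop" : "None"; // Stop only if capability-gated in
  }
}
```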
…ge; drop legacy lifecycle UI

AgentsPage now uses the new runtime-control components for OpenClaw
lifecycle:
- RuntimeControlPanel replaces GatewayStateCards (state-appropriate
  CTAs gated on capabilities). Provider config dialog trigger lives
  in the panel's extras slot.
- RuntimeStatusBar replaces GatewayStatusBar (running pill +
  Restart). Control-plane pill + Open Terminal live in the bar's
  extra slots — gateway specifics stay outside the runtime layer.

GatewayStatusBar.tsx is deleted outright. The 'Unavailable' badge in
AgentSummaryChips.tsx is deleted — capabilities-driven UI surfaces the
same signal more usefully on the new RuntimeControlPanel; the prop
stays for upstream callers but is now a no-op.

ControlPlaneAlert / LifecycleAlert / InlineErrorAlert from
OpenClawControls remain — they're alerts for control-plane and
mid-flight lifecycle states, distinct from the runtime control
surface. They cover gateway-specific concerns the runtime layer
doesn't model. Cleanup deferred to a follow-up.

github-actions Bot commented May 8, 2026

✅ Tests passed — 1224/1228

| Suite | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| agent | 76/76 | 0 | 0 |
| build | 9/9 | 0 | 0 |
| eval | 93/93 | 0 | 0 |
| server-agent | 261/261 | 0 | 0 |
| server-api | 197/197 | 0 | 0 |
| server-browser | 4/4 | 0 | 0 |
| server-integration | 9/10 | 0 | 1 |
| server-lib | 253/253 | 0 | 0 |
| server-root | 60/63 | 0 | 3 |
| server-skills | 31/31 | 0 | 0 |
| server-tools | 231/231 | 0 | 0 |

View workflow run


greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR lands the user-visible runtime layer: a uniform /runtimes/<adapter>/* HTTP surface backed by AgentRuntimeRegistry, plus generic React Query hooks and two new UI components (RuntimeStatusBar, RuntimeControlPanel) that replace the openclaw-specific GatewayStatusBar and GatewayStateCards. The AgentsPage is rewired to consume the new components, and the per-row Unavailable badge is removed in favour of the capability-driven control panel.

  • Server (runtimes.ts): five new routes (list, status, SSE stream, action dispatch, logs) all validated with zValidator; action dispatch is capability-gated with correct 405/400/500 error handling; 11 new unit tests cover the key branches.
  • Client hooks (useRuntime.ts): typed RPC-backed useRuntime / useRuntimeAction / useRuntimeLogs with 5 s default poll and post-action query invalidation.
  • UI components: RuntimeControlPanel maps runtime state to capability-gated CTAs; RuntimeStatusBar renders a compact pill bar — both accept adapter-specific slots (extras, extraPill, extraActions) so openclaw-specific concerns stay out of the generic layer.

Confidence Score: 4/5

Safe to merge after addressing the minor cleanup items — the core server routes, hooks, and UI components are well-structured and covered by tests.

The new routes are capability-gated, validated, and tested. The UI components cleanly replace their predecessors without introducing regressions on the primary openclaw flow. The findings are quality/cleanup items: an unused query key, a dead prop retained for callers that is never read, a label-fidelity regression in the control-plane pill, and a subtle SSE heartbeat leak on silent TCP drops. None affect correctness of the main flow today.

Findings: useRuntime.ts (unused RUNTIME_QUERY_KEYS.list), AgentSummaryChips.tsx (dead adapterHealth prop), AgentsPage.tsx (ControlPlanePill label regression), runtimes.ts (SSE heartbeat cleanup).

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/browseros-agent/apps/server/src/api/routes/runtimes.ts | New /runtimes/* HTTP surface: list, status, SSE stream, action dispatch, logs. Route logic is well-structured and capability-gated. Minor: SSE heartbeat write errors are silently swallowed and won't trigger early cleanup on a dead connection. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/useRuntime.ts | New React Query hooks for runtime status, action dispatch, and logs. RUNTIME_QUERY_KEYS.list is exported but never consumed — dead code that should be removed per project rules. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/runtime-controls/RuntimeControlPanel.tsx | New generic capability-gated control panel. State-to-CTA mapping is clear and exhaustive. extras slot correctly threads adapter-specific affordances without leaking openclaw specifics into the base component. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/runtime-controls/RuntimeStatusBar.tsx | New compact status bar with extensible pill + action slots. State pill mapping is thorough. Separator rendering logic is correct. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx | Rewires AgentsPage to use new runtime components; removes GatewayStatusBar/GatewayStateCards. New ControlPlanePill merges 'reconnecting'/'recovering' into a single "Connecting" label, losing the granularity the old component provided. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/agent-row/AgentSummaryChips.tsx | Removes the 'Unavailable' badge from per-row chips. adapterHealth prop is retained as optional but never used inside the component — dead interface surface. |
| packages/browseros-agent/apps/server/src/api/server.ts | Mounts the new /runtimes router behind requireTrustedAppOrigin(). No issues. |
| packages/browseros-agent/apps/server/tests/api/routes/runtimes.test.ts | 11 focused tests covering the new routes: capability gate, unknown action (400), agentId requirement, action throw (500), container vs host-process logs. Coverage is comprehensive for the happy path and key error branches. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/GatewayStatusBar.tsx | Deleted entirely — replaced by the generic RuntimeStatusBar. Clean removal. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant UI as AgentsPage (React)
    participant Hook as useRuntime / useRuntimeAction
    participant RPC as Hono RPC Client
    participant Server as /runtimes/* routes
    participant Reg as AgentRuntimeRegistry
    participant RT as AgentRuntime (openclaw)

    UI->>Hook: useRuntime("openclaw") [5s poll]
    Hook->>RPC: GET /runtimes/:adapter/status
    RPC->>Server: GET /runtimes/openclaw/status
    Server->>Reg: registry.get("openclaw")
    Reg-->>Server: runtime instance
    Server->>RT: runtime.getStatusSnapshot()
    RT-->>Server: RuntimeStatusSnapshot
    Server-->>RPC: "{ descriptor, status, capabilities }"
    RPC-->>Hook: RuntimeView
    Hook-->>UI: "{ data, isLoading }"

    UI->>Hook: useRuntimeAction("openclaw")
    UI->>Hook: "action.mutate({ action: "restart" })"
    Hook->>RPC: POST /runtimes/openclaw/actions/restart
    RPC->>Server: POST /runtimes/:adapter/actions/:action
    Server->>RT: capabilities.includes("restart")?
    RT-->>Server: true
    Server->>RT: "runtime.executeAction({ type: "restart" })"
    RT-->>Server: void
    Server-->>RPC: "{ status: "ok", state: "starting" }"
    RPC-->>Hook: success
    Hook->>Hook: invalidateQueries(["runtime-status","openclaw"])

    UI->>RPC: GET /runtimes/openclaw/status/stream (SSE)
    RPC->>Server: SSE connect
    Server->>RT: runtime.subscribe(writeSnapshot)
    RT-->>Server: unsubscribe fn
    loop every state change
        RT->>Server: listener(snapshot)
        Server-->>UI: "event: snapshot data: {...}"
    end
    loop every 15s
        Server-->>UI: "event: heartbeat data: {ts:...}"
    end
    UI->>Server: abort
    Server->>RT: unsubscribe()
    Server->>Server: clearInterval(heartbeat)
```

Comments Outside Diff (2)

  1. packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx, line 131-138 (link)

    P2 pillForControlPlane loses distinct "Reconnecting" / "Recovering" labels

    The old GatewayStatusBar.tsx mapped 'reconnecting' → "Reconnecting" and 'recovering' → "Recovering" as separate cases with separate labels. The new implementation folds both under a single "Connecting" label. A user whose gateway is in a slow recovery loop now sees the same text as a fresh connect attempt, making it harder to tell that the situation is degraded. Consider preserving the individual labels to match the previous UX fidelity.


  2. packages/browseros-agent/apps/server/src/api/routes/runtimes.ts, line 1067-1103 (link)

    P2 SSE stream: heartbeat write errors are silently swallowed after abort

    The heartbeat setInterval callback calls s.write(...).catch(() => {}), which suppresses every write error including those that occur while the stream is still considered alive but the underlying connection has silently dropped (e.g., TCP RST before the Hono abort handler fires). In that window, the interval continues firing and accumulating silently-failing writes. The pattern is fine for the snapshot writes (fire-and-forget after subscribe), but the heartbeat would benefit from detecting write failure and resolving the abort promise early to trigger cleanup. Minimal fix: track a closed flag and clearInterval on the first failed heartbeat write.

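The minimal fix the comment suggests could look like the following sketch. Names here are assumptions: `write` stands in for the Hono SSE stream's write method, and `onDead` for resolving the route's abort promise to trigger cleanup:

```typescript
// Illustrative sketch: stop the 15s heartbeat on the first failed write
// instead of swallowing errors forever.
function startHeartbeat(
  write: (chunk: string) => Promise<void>,
  onDead: () => void,
  intervalMs = 15_000,
) {
  let closed = false;
  const timer = setInterval(() => {
    write(`event: heartbeat\ndata: {"ts":${Date.now()}}\n\n`).catch(() => {
      if (closed) return;
      closed = true;
      clearInterval(timer); // stop accumulating silently-failing writes
      onDead(); // resolve the abort promise early → route cleanup runs
    });
  }, intervalMs);
  // Normal cleanup path (Hono abort handler).
  return () => {
    closed = true;
    clearInterval(timer);
  };
}
```

With this shape, a silent TCP drop is detected at the next heartbeat tick rather than leaking the interval and subscription until the process notices on its own.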
Consolidated review issues (4 total; issues 3 and 4 restate the two review comments above):

### Issue 1 of 4
packages/browseros-agent/apps/agent/entrypoints/app/agents/useRuntime.ts:914-918
**Unused `list` key in `RUNTIME_QUERY_KEYS`**

`RUNTIME_QUERY_KEYS.list` is exported but never consumed — no `useRuntimeList` hook exists and the key isn't referenced anywhere in this PR. Leaving it creates a false signal that a list-level invalidation pattern is in use. Per the project's cleanup guidelines, dead code should be removed rather than retained for hypothetical future use.

### Issue 2 of 4
packages/browseros-agent/apps/agent/entrypoints/app/agents/agent-row/AgentSummaryChips.tsx:7-15
**`adapterHealth` prop declared but never used**

`adapterHealth` is kept as an optional prop "for upstream callers" but is never destructured, read, or acted on inside the component. Any caller that still passes it has the value silently discarded. The prop should be removed entirely — callers can be updated in the same pass since the change is mechanical (optional prop → removed). Keeping it as dead interface surface contradicts the project's remove-dead-code rule.


…nder Start CTA for installed state

Two stuck-state bugs in the new RuntimeControlPanel:

1. The runtime's state machine started fresh at not_installed on every
   server boot. tryAutoStart's short-circuit branches (gateway already
   running, auth pass) never drove the state transitions, so the UI
   saw not_installed for a gateway that was actually running. Add a
   syncState() method on OpenClawContainerRuntime that probes the
   actual container via cli.inspectContainer + /readyz and sets state
   accordingly. Wire it into tryAutoStart as the first step so it
   runs regardless of which branch the rest takes.

2. RuntimeControlPanel had no case for state === 'installed', so after
   a successful Install the panel went blank instead of offering the
   next step. Treat installed the same as stopped — show the Start
   CTA with copy that reflects the difference (image is pulled vs
   container exists but stopped).

Optional-chained the syncState call so existing tests with partial
runtime mocks don't crash on the missing method.
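The probe described in point 1 can be sketched as a small pure function. `inspect` and `probeReadyz` are illustrative stand-ins for cli.inspectContainer and an HTTP check against the gateway's /readyz endpoint — not the actual OpenClawContainerRuntime method:

```typescript
// Illustrative sketch of syncState: derive runtime state from the live
// container instead of trusting the persisted state machine.
type SyncedState = "not_installed" | "stopped" | "starting" | "running";

async function syncState(
  inspect: () => Promise<{ running: boolean } | null>,
  probeReadyz: () => Promise<boolean>,
): Promise<SyncedState> {
  const info = await inspect();
  if (!info) return "not_installed"; // no container exists at all
  if (!info.running) return "stopped"; // container exists but is down
  // Container is up — gateway only counts as running once /readyz passes.
  return (await probeReadyz()) ? "running" : "starting";
}
```

Running this as the first step of tryAutoStart means every short-circuit branch afterwards sees a state that matches reality, which is exactly the stuck-state fix the commit describes.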
When a previous server boot wrote runtime-state.json after the gateway
container had already been created with a different hostPort (e.g. 18789
held at allocate-time → container started on 18790), the persisted port
disagrees with the live mapping. The runtime then probes the persisted
port forever and the UI sticks at `starting`.

`syncState` now reads `NetworkSettings.Ports` from inspect-container and
adopts the actual host port for the gateway container's published port
when it differs. The service then re-syncs `hostPort`/`httpClient` and
rewrites runtime-state.json so the next boot starts from a clean slate.

- ContainerInfo gains a flat `ports` array (parsed from
  `NetworkSettings.Ports`)
- OpenClawContainerRuntime.syncState: reconcile hostPort from live
  mapping before probing /readyz
- OpenClawService.tryAutoStart: adopt the runtime's reconciled port and
  persist it via writePersistedGatewayPort
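The flattening and reconciliation steps above can be sketched as follows. The `NetworkSettings.Ports` shape shown matches Docker/nerdctl inspect conventions; the `PortMapping` field names and both function names are illustrative, not the actual ContainerInfo code:

```typescript
// Illustrative sketch: flatten NetworkSettings.Ports from `nerdctl inspect`
// and adopt the live host port when it disagrees with the persisted one.
interface PortMapping { containerPort: number; protocol: string; hostPort: number }

// e.g. { "18789/tcp": [{ HostIp: "0.0.0.0", HostPort: "18790" }] }
type InspectPorts = Record<string, Array<{ HostIp: string; HostPort: string }> | null>;

function flattenPorts(ports: InspectPorts): PortMapping[] {
  const out: PortMapping[] = [];
  for (const [key, bindings] of Object.entries(ports)) {
    const [port, protocol] = key.split("/"); // key is "18789/tcp"
    for (const b of bindings ?? []) {
      out.push({ containerPort: Number(port), protocol, hostPort: Number(b.HostPort) });
    }
  }
  return out;
}

function reconcileHostPort(persisted: number, containerPort: number, ports: PortMapping[]): number {
  const live = ports.find((p) => p.containerPort === containerPort);
  // Adopt the live mapping when it exists and differs; otherwise keep persisted.
  return live && live.hostPort !== persisted ? live.hostPort : persisted;
}
```

In the mismatch scenario from the commit message (18789 persisted, container actually published on 18790), reconciliation returns 18790, so the runtime probes the port the gateway is really listening on.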
…ismatch

When a previous boot leaves a gateway running with a stale token, the
realloc-on-auth-mismatch branch was bumping the persisted port without
actually freeing the old container — ManagedContainer.start() no-ops
when state==='running', so the next start cycle never recreated the
container on the new port. The result: persisted/service/runtime drift
back into mismatch, and history requests 500 with "gateway is not ready"
even while the (stale) gateway keeps serving chat from the old port.

Stop the gateway explicitly when we decide to bump off the port, so the
upcoming start cycle goes through the full remove + create + start path
on the freshly-allocated port. The token-mismatch test still passes;
adds a new test pinning the stop-before-realloc behaviour.
…fresh install

Starting the gateway via the new RuntimeControlPanel "Start" CTA goes
through runtime.executeAction({type:'start'}) directly, bypassing
OpenClawService.tryAutoStart and its ensureStateEnvFile() seeding step.
On a freshly-wiped .browseros-dev, that left nerdctl create failing with
"failed to open env file .../.openclaw/.env: no such file or directory".

Seed the file (empty, mode 0600) inside buildContainerSpec so the
runtime is self-sufficient. Service callers continue to work — their
ensureStateEnvFile is now an idempotent no-op once the file exists.
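The seeding step can be sketched in a few lines, assuming a Node environment; the actual path and the spec-builder it lives in (buildContainerSpec) differ from this standalone helper:

```typescript
// Illustrative sketch of idempotent env-file seeding: create an empty
// file with owner-only permissions if and only if it doesn't exist yet.
import { existsSync, mkdirSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

function ensureStateEnvFile(envPath: string): void {
  if (existsSync(envPath)) return; // idempotent no-op once the file exists
  mkdirSync(dirname(envPath), { recursive: true }); // parent dir may be missing on fresh installs
  writeFileSync(envPath, "", { mode: 0o600 }); // empty file, mode 0600
}
```

Because the check-then-create is a no-op on subsequent calls, both the runtime path and the existing service callers can run it unconditionally.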