feat: add GUI click-model browser mode #910
Conversation
❌ Tests failed — 3/1159 failed
Failed tests
Greptile Summary
This PR replaces the element-ID-based
Confidence Score: 3/5
Not safe to merge as-is: the hardcoded ephemeral RunPod URL will break production click/hover when the pod restarts, and the always-on mode flag silently disables chatMode restrictions.
Two independent P1 issues: an ephemeral infrastructure endpoint with no env-var escape hatch, and a compile-time constant that globally overrides all agent modes with no kill switch.
molmo-point-config.ts (hardcoded endpoint), gui-click-only.ts (hardcoded mode flag), ai-sdk-agent.ts (dead chatMode branch)
Important Files Changed
Prompt To Fix All With AI
Fix the following 5 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 5
packages/browseros-agent/apps/server/src/tools/molmo-point-config.ts:1-5
**Hardcoded ephemeral RunPod endpoint**
`MOLMO_POINT_ENDPOINT` is set to a specific RunPod proxy URL. RunPod proxy hostnames are ephemeral — they become invalid as soon as the pod is stopped or restarted. When that happens every `click` and `hover` call will hang for the full `MOLMO_POINT_TIMEOUT_MS` (60 s) before throwing, making the browser completely unusable. There is no environment variable or config override to change it at runtime.
### Issue 2 of 5
packages/browseros-agent/apps/server/src/tools/molmo-point-config.ts:1-2
Read the endpoint from an environment variable so the RunPod pod can be rotated without a code deploy. The hardcoded URL will become invalid the moment the pod restarts.
```suggestion
export const MOLMO_POINT_ENDPOINT =
  process.env.MOLMO_POINT_ENDPOINT ??
  'https://gseb9k0a2n2vhl-8000.proxy.runpod.net/'
```
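The suggestion above could be hardened by validating the override before use, so that a malformed value fails at startup rather than on the first `click`. The following is a minimal sketch under assumptions: `resolveMolmoEndpoint` and `DEFAULT_MOLMO_POINT_ENDPOINT` are hypothetical names that do not appear in the PR, and the environment is passed in as a parameter so the helper can be tested without touching `process.env`.

```typescript
// Hypothetical helper, not from the PR: resolves the Molmo endpoint from an
// environment-like map, falling back to a default when the variable is unset
// or blank.
const DEFAULT_MOLMO_POINT_ENDPOINT =
  'https://gseb9k0a2n2vhl-8000.proxy.runpod.net/'

function resolveMolmoEndpoint(
  env: Record<string, string | undefined>,
  fallback: string = DEFAULT_MOLMO_POINT_ENDPOINT,
): string {
  const raw = env['MOLMO_POINT_ENDPOINT']?.trim()
  const candidate = raw ? raw : fallback
  // `new URL` throws a TypeError on malformed input, so a bad override
  // surfaces immediately instead of as a request timeout later.
  const url = new URL(candidate)
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(
      `Unsupported protocol for MOLMO_POINT_ENDPOINT: ${url.protocol}`,
    )
  }
  return url.toString()
}
```

Assumed usage would be `export const MOLMO_POINT_ENDPOINT = resolveMolmoEndpoint(process.env)`. This keeps the hardcoded URL as a fallback, so it does not by itself remove the dependency on the ephemeral pod.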
### Issue 3 of 5
packages/browseros-agent/apps/server/src/agent/gui-click-only.ts:1
**`GUI_CLICK_ONLY_MODE` is hardcoded `true` with no runtime kill switch**
`GUI_CLICK_ONLY_MODE = true` unconditionally puts every agent session into GUI-click-only mode. There is no environment variable, config flag, or per-session toggle to disable it. The consequence is that the existing chat-mode tool restriction in `ai-sdk-agent.ts` is now dead code (the `else if (config.resolvedConfig.chatMode)` branch can never execute). The old element-based `click` and `hover` schemas are also permanently gone from the registry, so any caller that depended on `element` IDs will silently get wrong behaviour. A guard like `process.env.GUI_CLICK_ONLY_MODE === 'true'` would give an operational kill switch without a code deploy.
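The guard proposed above can be sketched as a small flag reader. This is an illustration only: `readBooleanFlag` is a hypothetical helper that is not part of the PR, and it takes the environment as a parameter so it can be exercised in a test.

```typescript
// Hypothetical helper, not from the PR: parses a boolean feature flag from an
// environment-like map. Only the exact string 'true' enables the flag, which
// mirrors the `=== 'true'` guard proposed in the review comment. Anything
// else, including an unset variable, leaves the flag off.
function readBooleanFlag(
  env: Record<string, string | undefined>,
  name: string,
): boolean {
  return env[name] === 'true'
}

// Assumed wiring in gui-click-only.ts (shown as a comment because it depends
// on the Node.js `process` global):
// export const GUI_CLICK_ONLY_MODE =
//   readBooleanFlag(process.env, 'GUI_CLICK_ONLY_MODE')
```

With this shape the flag is evaluated once at module load, so turning it off would need a process restart but not a code deploy. A per-session toggle would need the value threaded through the session config instead.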
### Issue 4 of 5
packages/browseros-agent/apps/server/src/agent/ai-sdk-agent.ts:115-133
**Dead `chatMode` tool-filter branch**
Because `GUI_CLICK_ONLY_MODE` is always `true`, the `else if (config.resolvedConfig.chatMode)` branch filtering tools by `CHAT_MODE_ALLOWED_TOOLS` can never be reached. The chat-mode tool restriction silently no longer applies, which may expose write tools in chat sessions. This violates the project rule to remove dead code.
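One possible way to keep the chat-mode restriction live is to apply both filters rather than chaining them with `else if`. The sketch below is an assumption, not the PR's code: `selectBrowserTools` and its option names are hypothetical, and whether the two modes should compose by intersection is a product decision the review does not settle.

```typescript
// Hypothetical sketch, not the PR's code: selects the browser tools for a
// session so that the chat-mode allow-list still applies when GUI-click-only
// mode is on. A tool is kept only if every active mode allows it.
type ToolMap<T> = Record<string, T>

interface ToolSelectionOptions {
  guiClickOnly: boolean
  chatMode: boolean
  isGuiClickOnlyAllowed: (name: string) => boolean
  chatModeAllowed: ReadonlySet<string>
}

function selectBrowserTools<T>(
  allTools: ToolMap<T>,
  options: ToolSelectionOptions,
): ToolMap<T> {
  return Object.fromEntries(
    Object.entries(allTools).filter(([name]) => {
      if (options.guiClickOnly && !options.isGuiClickOnlyAllowed(name)) {
        return false
      }
      if (options.chatMode && !options.chatModeAllowed.has(name)) {
        return false
      }
      return true
    }),
  )
}
```

With both modes off the function returns every tool, which matches the `allBrowserTools` default in the diff.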
### Issue 5 of 5
packages/browseros-agent/apps/server/src/tools/input.ts:68-115
**Excessive debug logging per project rule**
Every `click` and `hover` call emits three `logger.info` calls (`'GUI click dispatching'`, `'GUI click dispatched'`, and the same for hover), each carrying large structured payloads including screenshot dimensions, hit-element properties, and model responses. The project rule explicitly asks to remove excessive logging statements after debugging. This pattern also appears in `molmo-point-client.ts` (`'Molmo point request started'`, `'Molmo point response received'`). The request-started and dispatched/dispatching pairs should be removed or gated behind a `DEBUG`-level guard.
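The `DEBUG`-level gating mentioned above can be illustrated with a small level-aware logger. This is a sketch under assumptions: `createLogger` is a hypothetical stand-in, not the project's `logger`, whose actual API is not shown in the review.

```typescript
// Hypothetical minimal logger, not the project's `logger`: illustrates
// level-gated logging so per-call diagnostics are skipped at `info` level.
type Level = 'debug' | 'info'

const LEVEL_ORDER: Record<Level, number> = { debug: 0, info: 1 }

function createLogger(minLevel: Level, sink: (line: string) => void) {
  const emit = (level: Level, message: string, payload?: () => unknown) => {
    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return
    // The payload is a thunk, so large structures (screenshot dimensions,
    // model responses) are only built when the line is actually emitted.
    const details = payload ? ` ${JSON.stringify(payload())}` : ''
    sink(`[${level}] ${message}${details}`)
  }
  return {
    debug: (message: string, payload?: () => unknown) =>
      emit('debug', message, payload),
    info: (message: string, payload?: () => unknown) =>
      emit('info', message, payload),
  }
}
```

Under this scheme the `'GUI click dispatching'` and `'Molmo point request started'` messages would move to `debug`, leaving at most one `info` line per call.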
Reviews (1): Last reviewed commit: "fix: evals & timeoue"
export const MOLMO_POINT_ENDPOINT =
  'https://gseb9k0a2n2vhl-8000.proxy.runpod.net/'

export const MOLMO_POINT_MAX_NEW_TOKENS = 64
export const MOLMO_POINT_TIMEOUT_MS = 60_000
Hardcoded ephemeral RunPod endpoint
MOLMO_POINT_ENDPOINT is set to a specific RunPod proxy URL. RunPod proxy hostnames are ephemeral — they become invalid as soon as the pod is stopped or restarted. When that happens every click and hover call will hang for the full MOLMO_POINT_TIMEOUT_MS (60 s) before throwing, making the browser completely unusable. There is no environment variable or config override to change it at runtime.
export const MOLMO_POINT_ENDPOINT =
  'https://gseb9k0a2n2vhl-8000.proxy.runpod.net/'
Read the endpoint from an environment variable so the RunPod pod can be rotated without a code deploy. The hardcoded URL will become invalid the moment the pod restarts.
- export const MOLMO_POINT_ENDPOINT =
-   'https://gseb9k0a2n2vhl-8000.proxy.runpod.net/'
+ export const MOLMO_POINT_ENDPOINT =
+   process.env.MOLMO_POINT_ENDPOINT ??
+   'https://gseb9k0a2n2vhl-8000.proxy.runpod.net/'
@@ -0,0 +1,18 @@
export const GUI_CLICK_ONLY_MODE = true
GUI_CLICK_ONLY_MODE is hardcoded true with no runtime kill switch
GUI_CLICK_ONLY_MODE = true unconditionally puts every agent session into GUI-click-only mode. There is no environment variable, config flag, or per-session toggle to disable it. The consequence is that the existing chat-mode tool restriction in ai-sdk-agent.ts is now dead code (the else if (config.resolvedConfig.chatMode) branch can never execute). The old element-based click and hover schemas are also permanently gone from the registry, so any caller that depended on element IDs will silently get wrong behaviour. A guard like process.env.GUI_CLICK_ONLY_MODE === 'true' would give an operational kill switch without a code deploy.
    toolContext,
    config.resolvedConfig.toolApprovalConfig,
  )
- const browserTools = config.resolvedConfig.chatMode
-   ? Object.fromEntries(
-       Object.entries(allBrowserTools).filter(([name]) =>
-         CHAT_MODE_ALLOWED_TOOLS.has(name),
-       ),
-     )
-   : allBrowserTools
+ let browserTools = allBrowserTools
+ if (GUI_CLICK_ONLY_MODE) {
+   browserTools = Object.fromEntries(
+     Object.entries(allBrowserTools).filter(([name]) =>
+       isGuiClickOnlyBrowserToolAllowed(name),
+     ),
+   )
+ } else if (config.resolvedConfig.chatMode) {
+   browserTools = Object.fromEntries(
+     Object.entries(allBrowserTools).filter(([name]) =>
+       CHAT_MODE_ALLOWED_TOOLS.has(name),
+     ),
+   )
+ }
  if (config.resolvedConfig.chatMode) {
    logger.info('Chat mode enabled, restricting to read-only browser tools', {
**Dead `chatMode` tool-filter branch**
Because `GUI_CLICK_ONLY_MODE` is always `true`, the `else if (config.resolvedConfig.chatMode)` branch filtering tools by `CHAT_MODE_ALLOWED_TOOLS` can never be reached. The chat-mode tool restriction silently no longer applies, which may expose write tools in chat sessions. This violates the project rule to remove dead code.
**Rule Used:** Remove unused/dead code rather than leaving it in ... ([source](https://app.greptile.com/review/custom-context?memory=9b045db4-2630-428c-95b7-ccf048d34547))
**Learned From**
[browseros-ai/BrowserOS-agent#126](https://github.com/browseros-ai/BrowserOS-agent/pull/126)
8a21b97 to 4955b48
fix: extend molmo point timeout
feat: modal endpoint
6dcd310 to 283f76e
Summary
Run AGI SDK Eval
Use Fireworks for Kimi:
This runs the full `agisdk-real` suite with 20 workers using `accounts/fireworks/models/kimi-k2p5` on Fireworks.