fix(llm): raise streaming timeout default #1464
🦋 Changeset detected. Latest commit: d09fdd5. The changes in this PR will be included in the next version bump; this PR includes changesets to release 31 packages.
```diff
 export const DEFAULT_SESSION_CONNECT_OPTIONS: ResolvedSessionConnectOptions = {
   sttConnOptions: DEFAULT_API_CONNECT_OPTIONS,
-  llmConnOptions: DEFAULT_API_CONNECT_OPTIONS,
+  llmConnOptions: DEFAULT_LLM_API_CONNECT_OPTIONS,
```
🔴 AgentSession still uses 10s LLM timeout base, defeating the PR's 30s default
The PR's stated intent is to raise the default LLM streaming request timeout to 30 s. While `DEFAULT_SESSION_CONNECT_OPTIONS.llmConnOptions` at `agents/src/types.ts:73` was updated to use `DEFAULT_LLM_API_CONNECT_OPTIONS` (30 s), the actual resolution of session connect options in the `AgentSession` constructor at `agents/src/voice/agent_session.ts:324` still uses `DEFAULT_API_CONNECT_OPTIONS` (10 s) as the base:

```ts
llmConnOptions: { ...DEFAULT_API_CONNECT_OPTIONS, ...connOptions?.llmConnOptions },
```

In the voice pipeline, `agent.ts:428` passes `activity.agentSession.connOptions.llmConnOptions` to `llm.chat()`, which overrides the plugin-level default. So the LLM timeout in the primary voice pipeline path remains 10 s, not 30 s. The plugin-level defaults (`DEFAULT_LLM_API_CONNECT_OPTIONS`) only take effect when `llm.chat()` is called directly without `connOptions`.
Prompt for agents
In `agents/src/voice/agent_session.ts` line 324, the `llmConnOptions` base should use `DEFAULT_LLM_API_CONNECT_OPTIONS` instead of `DEFAULT_API_CONNECT_OPTIONS`. Import `DEFAULT_LLM_API_CONNECT_OPTIONS` from `'../types.js'` and change line 324 from:

```ts
llmConnOptions: { ...DEFAULT_API_CONNECT_OPTIONS, ...connOptions?.llmConnOptions },
```

to:

```ts
llmConnOptions: { ...DEFAULT_LLM_API_CONNECT_OPTIONS, ...connOptions?.llmConnOptions },
```

This ensures the voice pipeline (the primary code path for LLM calls) correctly uses the new 30 s default timeout.
Summary
Fixes #5508.
The Anthropic LLM plugin used `httpx.AsyncClient(timeout=5.0)`, which sets all httpx sub-timeouts, including the per-chunk SSE read timeout, to 5 seconds. Claude's adaptive-thinking phases routinely pause for 10–30 s before emitting the first content chunk, so the 5 s read timeout fires during normal usage and raises `APIConnectionError`, killing voice sessions mid-turn.

Changes

- The default is now `httpx.Timeout(5.0, read=30.0)`: connect stays at 5 s (genuine TCP failures surface fast), while the per-chunk read window is 30 s (covers standard thinking budgets with headroom).
- New `timeout` constructor parameter: `timeout: httpx.Timeout | None = None` lets callers supply a custom timeout without constructing a whole `anthropic.AsyncClient`, e.g. `httpx.Timeout(5.0, read=60.0)` for extended thinking or very large contexts. This aligns with the pattern already used by the OpenAI plugin.
- New tests in `tests/test_plugin_anthropic.py` cover the default split, the tight connect value, custom timeout pass-through, and `client=` precedence over `timeout=`.

Before / after
Usage (new parameter)
Test plan
- `uv run pytest tests/test_plugin_anthropic.py -v`: 6/6 pass
- `uv run ruff check`: no errors
- `uv run ruff format --check`: no changes needed