fix(telemetry): surface FallbackAdapter active model/provider on parent spans #1341
When `llm.FallbackAdapter`, `tts.FallbackAdapter`, or `stt.FallbackAdapter`
wraps multiple providers, parent spans (`llm_node`, `tts_node`, `user_turn`)
were stamped with the wrapper labels (`FallbackAdapter`, `inference.STT`,
`livekit`) instead of the provider that actually handled the request. This
made `gen_ai.request.model` / `gen_ai.provider.name` useless for telemetry
consumers when fallbacks were in play.
Changes:
- llm/fallback_adapter: capture caller span on construction; on first
successful chunk, write `gen_ai.request.model` / `gen_ai.provider.name`
back onto the caller span, the inner `llm_request` span, and the run
span. Wrap each attempt in an `llm_fallback_attempt` span carrying the
attempt index plus model/provider (sketched below, after this list).
- tts/fallback_adapter: same propagation pattern via captured caller span.
- stt/fallback_adapter:
- track `_activeStt`, set when a child stream produces events or
`recognize()` succeeds; expose it via `label` / `model` / `provider`
getters so callers reading the wrapper see the active child.
- wrap each `recognize()` and stream attempt in
`stt_fallback_recognize_attempt` / `stt_fallback_stream_attempt`
spans with attempt index + model/provider.
- voice/agent_activity + audio_recognition: thread the `STT` instance
into AudioRecognition so `user_turn` re-reads the active model/provider
on each STT event. Skip `setAttribute` when nothing changed.
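
A minimal sketch of the caller-span capture and write-back pattern from the first item above, assuming the OpenTelemetry JS API. `ProviderLLM`, `tryGenerate`, and the `lk.fallback.attempt_index` key are illustrative stand-ins, not the real adapter internals, and the real code stamps the caller span on the first successful chunk rather than on full completion:

```typescript
import { SpanStatusCode, trace, type Span } from '@opentelemetry/api';

const tracer = trace.getTracer('livekit-agents');

// Illustrative stand-in for the adapter's view of a wrapped provider.
interface ProviderLLM {
  model: string;
  provider: string;
}

async function generateWithFallback(
  availableLLMs: ProviderLLM[],
  tryGenerate: (llm: ProviderLLM) => Promise<void>,
): Promise<void> {
  // Capture the caller's span (e.g. `llm_node`) while it is still active;
  // by the time a fallback fires, the active context may have moved on.
  const callerSpan: Span | undefined = trace.getActiveSpan();

  for (const [index, llm] of availableLLMs.entries()) {
    // Each attempt gets its own child span carrying the attempt index.
    const ok = await tracer.startActiveSpan('llm_fallback_attempt', async (span) => {
      span.setAttribute('lk.fallback.attempt_index', index); // key assumed
      span.setAttribute('gen_ai.request.model', llm.model);
      span.setAttribute('gen_ai.provider.name', llm.provider);
      try {
        await tryGenerate(llm);
        // Success: write the winning model/provider back onto the caller
        // span so the parent no longer shows the wrapper label.
        callerSpan?.setAttribute('gen_ai.request.model', llm.model);
        callerSpan?.setAttribute('gen_ai.provider.name', llm.provider);
        return true;
      } catch {
        span.setStatus({ code: SpanStatusCode.ERROR });
        return false;
      } finally {
        span.end();
      }
    });
    if (ok) return;
  }
  throw new Error('all providers failed');
}
```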
Cost attribution:
- voice/generation: capture final `ChatChunk.usage` and stamp exact
prompt/completion tokens on the `llm_node` span, classified as
`generation`. This becomes the single billable layer for an LLM turn,
so tracing backends that infer cost from observation type don't fall
back to a local-tokenizer estimate of the prompt text (see the sketch
after this list).
- llm/llm + tts/tts: classify `llm_request` / `tts_request` spans as
`span` (not `generation`) so wrapper + provider layers aren't double-
counted as separate cost centres. Made `_llmRequestSpan` /
`_ttsRequestSpan` `protected` so subclass implementations can write
through to them.
- LiveKit Cloud is unaffected: `gen_ai.usage.*` is still emitted on the
inner `llm_request` / `tts_request` spans for backends that read it
directly.
- telemetry/trace_types: add a new observation-type attribute (matches
the existing naming convention in this file) plus
`ATTR_FALLBACK_ATTEMPT_INDEX`.
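
To make the usage-stamping step concrete, a hedged sketch assuming the OpenTelemetry JS API. The `CompletionUsage` field names and both non-standard attribute keys are assumptions; the real constants live in telemetry/trace_types:

```typescript
import { trace } from '@opentelemetry/api';

// Mirrors the shape of the final `ChatChunk.usage` (field names assumed).
interface CompletionUsage {
  promptTokens: number;
  completionTokens: number;
}

function stampUsageOnLLMNode(usage: CompletionUsage): void {
  // `llm_node` is assumed to be the active span while generation drains.
  const llmNodeSpan = trace.getActiveSpan();
  llmNodeSpan?.setAttributes({
    // Exact provider-reported tokens, not a local-tokenizer estimate.
    'gen_ai.usage.input_tokens': usage.promptTokens,
    'gen_ai.usage.output_tokens': usage.completionTokens,
    // Classified as `generation`: the single billable layer for the turn.
    'lk.observation.type': 'generation', // key assumed
  });
}
```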
Verified end-to-end against a real call — `llm_node` model now reads the
active provider model (was `FallbackAdapter`), `user_turn` model reads
the active STT (was `inference.STT`), per-turn cost matches exact
provider math (was ~3x over).
Brings in 31 upstream commits since the branch diverged. Two real conflicts:

- agents/src/llm/llm.ts: upstream added `#providerRequestIds`; this branch
made `_llmRequestSpan` protected (so FallbackLLMStream can write through).
Kept both — protected `_llmRequestSpan` plus private `#providerRequestIds`.
- agents/src/voice/audio_recognition.ts: upstream added requestId collection
in `onSTTEvent`; this branch added `refreshUserTurnSttAttributes()` at the
same spot for FallbackAdapter live-update. Kept both, refresh first.

Other files (tts.ts, generation.ts, agent_activity.ts, trace_types.ts)
auto-merged cleanly — upstream's `#providerRequestIds` field on tts.ts
coexists with this branch's protected `_ttsRequestSpan` the same way as
llm.ts.

# Conflicts:
# agents/src/llm/llm.ts
# agents/src/voice/audio_recognition.ts
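
For reference, a rough sketch of how the merged llm.ts members coexist. Only `_llmRequestSpan`, `#providerRequestIds`, and `FallbackLLMStream` are named in the notes above; the method and everything else here is illustrative:

```typescript
import type { Span } from '@opentelemetry/api';

abstract class LLMStream {
  // This branch: protected, so fallback subclasses can write through.
  protected _llmRequestSpan?: Span;
  // Upstream: private request-id bookkeeping, invisible to subclasses.
  #providerRequestIds: string[] = [];
}

class FallbackLLMStream extends LLMStream {
  protected markActiveProvider(model: string, provider: string): void {
    // Write-through works because the span field is protected, while
    // `#providerRequestIds` stays an upstream-private detail.
    this._llmRequestSpan?.setAttribute('gen_ai.request.model', model);
    this._llmRequestSpan?.setAttribute('gen_ai.provider.name', provider);
  }
}
```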
Summary
`llm.FallbackAdapter` / `tts.FallbackAdapter` / `stt.FallbackAdapter` currently leave their wrapper labels (`FallbackAdapter`, `inference.STT`, `livekit`) on the parent spans (`llm_node`, `tts_node`, `user_turn`), so when a fallback fires the trace can't tell you which provider actually handled the turn — `gen_ai.request.model` / `gen_ai.provider.name` are stuck on the wrapper.

Separately, `llm_node` and the inner `llm_request` spans both carried `gen_ai.usage.*` and were both shaped like generation spans, so any tracing backend that infers cost from observation type ended up counting the same call 2-3 times across wrapper + provider layers.

This PR fixes both.
Trace propagation
- `llm/fallback_adapter`: capture the caller span on construction. On first successful chunk, write `gen_ai.request.model` / `gen_ai.provider.name` back onto the caller span, the inner `llm_request` span, and the run span. Wrap each attempt in an `llm_fallback_attempt` span with attempt index + model/provider.
- `tts/fallback_adapter`: same caller-span propagation pattern.
- `stt/fallback_adapter`: track `_activeStt` (set when a child stream produces events or `recognize()` succeeds) and expose it via `label` / `model` / `provider` getters so external callers reading the wrapper see the active child. Wrap attempts in `stt_fallback_recognize_attempt` / `stt_fallback_stream_attempt` spans.
- `voice/agent_activity` + `voice/audio_recognition`: thread the `STT` instance into `AudioRecognition` so `user_turn` re-reads model/provider on each STT event (FallbackAdapter only knows its active child after the first event lands). Idempotent — skips `setAttribute` if the value hasn't changed. A sketch of this refresh follows the list.
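
A hedged sketch of the idempotent refresh, assuming the OpenTelemetry JS API. Only `refreshUserTurnSttAttributes` is named in this PR; the class shape, field names, and the `ActiveSTT` interface are illustrative:

```typescript
import type { Span } from '@opentelemetry/api';

// Minimal stand-in for the model/provider getters the wrapper exposes.
interface ActiveSTT {
  model: string;
  provider: string;
}

class AudioRecognition {
  private lastModel?: string;
  private lastProvider?: string;

  constructor(
    private readonly stt: ActiveSTT,
    private readonly userTurnSpan: Span,
  ) {}

  // Called on every STT event: the FallbackAdapter only knows its active
  // child after the first event lands, so re-read each time, but skip the
  // setAttribute call when nothing changed.
  refreshUserTurnSttAttributes(): void {
    if (this.stt.model !== this.lastModel) {
      this.lastModel = this.stt.model;
      this.userTurnSpan.setAttribute('gen_ai.request.model', this.lastModel);
    }
    if (this.stt.provider !== this.lastProvider) {
      this.lastProvider = this.stt.provider;
      this.userTurnSpan.setAttribute('gen_ai.provider.name', this.lastProvider);
    }
  }
}
```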
Cost attribution

- `voice/generation`: capture final `ChatChunk.usage` and stamp exact prompt/completion tokens on the `llm_node` span, classified as `generation`. This becomes the single billable layer per LLM turn — backends that estimate tokens from prompt text no longer diverge from the provider's own billing.
- `llm/llm` + `tts/tts`: classify `llm_request` / `tts_request` spans as `span` (not `generation`) so wrapper + inner-provider layers aren't counted as separate cost centres (see the sketch after this list). LiveKit Cloud is unaffected — `gen_ai.usage.*` is still emitted on the inner spans for backends that read it directly. Made `_llmRequestSpan` / `_ttsRequestSpan` `protected` so FallbackAdapter subclasses can write through.
- `telemetry/trace_types`: add a new observation-type attribute (matches the existing naming convention in this file) plus `ATTR_FALLBACK_ATTEMPT_INDEX`.
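
And the other side of the split, sketched under the same assumptions: the inner `llm_request` span keeps its usage attributes but is classified as a plain `span`, so cost-inferring backends see only one billable layer. The helper name, usage field names, and observation-type key are assumptions:

```typescript
import type { Span } from '@opentelemetry/api';

function finalizeLLMRequestSpan(
  span: Span,
  usage: { promptTokens: number; completionTokens: number },
): void {
  span.setAttributes({
    // Still emitted, so LiveKit Cloud and any backend reading
    // gen_ai.usage.* directly keeps working unchanged.
    'gen_ai.usage.input_tokens': usage.promptTokens,
    'gen_ai.usage.output_tokens': usage.completionTokens,
    // Plain `span`, not `generation`: not a second cost centre.
    'lk.observation.type': 'span', // key assumed, see trace_types
  });
}
```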
Verified

End-to-end against a real call:

- `llm_node` model: was `FallbackAdapter`, now the active provider model
- `user_turn` model: was `inference.STT`, now the active STT (`assemblyai/u3-rt-pro`)
- billable cost centres per turn: was three (`llm_node` + 2x `llm_request`), now one (`llm_node` only)

Test plan
- `pnpm build:agents` clean
- `pnpm test` — all 29 fallback-adapter tests pass (LLM 9 + STT + TTS)
- `pnpm format:check` clean
- `@livekit/agents@1.2.7` before this PR
- `_llmRequestSpan` / `_ttsRequestSpan` exposure — open to a different shape if you'd rather keep them private