implement google live api for live session navid shad #86etkgdqu #34
Merged
navidshad merged 19 commits on May 8, 2026
Replaces the OpenAI Realtime entry point on the bundle page with a Gemini `gemini-3.1-flash-live-preview` flow built on the official `@google/genai` SDK: server-issued ephemeral tokens, browser `ai.live.connect`, AudioWorklet PCM16 mic capture, queued AudioBufferSourceNode playback, auto-resume across the 15-min cap, and a `provider`-aware `cost` virtual so historical OpenAI session records still price correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
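For reviewers unfamiliar with the PCM16 capture step, here is a minimal sketch of the sample conversion an AudioWorklet would typically perform before sending mic audio. The helper name `floatToPcm16` is hypothetical and is not taken from this PR; the actual worklet may clamp or scale differently.

```typescript
// Hypothetical helper (not from this PR): convert Web Audio Float32 samples
// in the nominal range [-1, 1] to 16-bit signed PCM.
function floatToPcm16(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, input[i]));
    // Negative samples scale by 32768, positive by 32767 (the Int16 limits).
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}
```

Clamping before scaling matters because writing an out-of-range value into an `Int16Array` wraps silently, which is audible as a click.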
Adds module/function-level JSDoc to the Gemini store, audio worklet, server
ephemeral-token function, and the Gemini-bound mic toggle; trims
debug-cycle console logs (errors and warnings stay), tightens dialog and
phrase-selection helpers in the practice page, and fixes a precedence bug
in the server-side error message (': ' + msg || String(error)).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
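The precedence bug mentioned above can be illustrated with a small sketch. The function names and the way `msg` is derived are assumptions for illustration; only the expression `': ' + msg || String(error)` comes from the commit message.

```typescript
// Buggy form: `+` binds tighter than `||`, so this parses as
// (': ' + msg) || String(error). The left side is always a non-empty
// string, so the String(error) fallback can never run.
function errorSuffixBuggy(error: unknown): string {
  const msg = error instanceof Error ? error.message : undefined;
  return ': ' + msg || String(error);
}

// Fixed form: parenthesize the fallback so it applies to `msg` alone.
function errorSuffixFixed(error: unknown): string {
  const msg = error instanceof Error ? error.message : undefined;
  return ': ' + (msg || String(error));
}
```

With a non-`Error` value such as the string `"boom"`, the buggy form yields `": undefined"` while the fixed form yields `": boom"`.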
…gemini/ sub-modules
Server: provider-specific config, types, utils, and ephemeral-token
functions move under `live_session/{openai,gemini}/`. The parent
`functions.ts` keeps the shared create/update functions and aggregates the
provider modules into the exported function list. `types.ts` becomes a
barrel re-exporting both providers' types so the frontend's existing
`~/types/live-session.type` import keeps working.
Frontend: `components/liveSession/StartNew.vue` is split into
`liveSession/gemini/StartNew.vue` (current behavior) and
`liveSession/openai/StartNew.vue`. `StartLiveSessionForm` accepts a
`voiceOptions` prop (defaulting to Gemini voices) so each variant can pass
its own provider-specific voice list. `pages/sessions/new.vue` defaults to
the Gemini variant; switching to OpenAI is a single component swap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… Gemini pricing models #86etkgdqu
Removes every frontend runtime path to the OpenAI Realtime backend:
- Deletes the OpenAI practice page (`pages/practice/live-session-[id].vue`),
the OpenAI Pinia store, the OpenAI-bound mic toggle, and the OpenAI
`StartNew` variant.
- Rewires `FreemiumTimer` (the only component still importing the OpenAI
store) to the Gemini store so timer-expiry auto-mute keeps working.
- Trims stale comments that referenced the now-removed files.
Server-side `live_session/openai/` stays in place so historical OpenAI
session records continue to price correctly via the provider-aware `cost`
virtual; only the frontend exposure is gone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
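As a rough illustration of what a provider-aware cost computation can look like, the sketch below selects a rate table by the record's `provider` field. Everything here is hypothetical: the field names, the rule that records without a `provider` are treated as OpenAI, and the rates (placeholders, not real prices) are assumptions, not the PR's actual `cost` virtual.

```typescript
type Provider = "openai" | "gemini";

interface SessionUsage {
  provider?: Provider; // assumed: older records may predate this field
  inputTokens: number;
  outputTokens: number;
}

// Placeholder rates per one million tokens. NOT real pricing.
const PLACEHOLDER_RATES: Record<Provider, { input: number; output: number }> = {
  openai: { input: 10, output: 20 },
  gemini: { input: 1, output: 2 },
};

// Price a session with the rate table that matches its provider.
function sessionCost(usage: SessionUsage): number {
  const rates = PLACEHOLDER_RATES[usage.provider ?? "openai"];
  return (
    (usage.inputTokens / 1_000_000) * rates.input +
    (usage.outputTokens / 1_000_000) * rates.output
  );
}
```

Keeping the OpenAI rate table on the server is what lets historical records keep their original pricing after the frontend path is removed.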
…emini
Re-attaches the OpenAI source files removed in the previous commit so the
logic stays in the codebase, but disconnects them from the runtime:
- Gemini is now the default. `pages/practice/live-session-[id].vue` is
the Gemini practice page (moved out of `live-session-gemini/`); bundle
page and Gemini StartNew route to `/practice/live-session-{id}`.
- OpenAI lives at `pages/practice/live-session-openai/[id].vue` (folder
sub-path on purpose — vue-router can't disambiguate
`live-session-:id` and `live-session-openai-:id` since they score
equally and the alphabetically-earlier route wins, so the slash
separator is required). Its `OPENAI_DISABLED = true` constant
short-circuits `onMounted` and the template renders a disabled
notice, so even direct URL visits make zero API calls.
- OpenAI StartNew variant routes to the new
`/practice/live-session-openai/{id}` URL; not currently rendered by
any page but kept for reference.
- `StartLiveSessionForm` had a `withDefaults` bug where the defaults
factory referenced a local `const`. The Vue compiler hoists `withDefaults`
out of script-setup scope, so this broke compilation. Inlined the
voice list literal.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
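The route ambiguity described above can be sketched with plain regular expressions. This is a simplified stand-in that assumes a dynamic segment matches one or more non-slash characters; it does not reproduce vue-router's actual ranking, only the fact that both flat patterns accept the same URL while the slash-separated variant does not. The route names and the `matchingRoutes` helper are hypothetical.

```typescript
// Simplified route patterns: ":id" is modelled as "([^/]+)".
const flatRoutes: Record<string, RegExp> = {
  gemini: /^\/practice\/live-session-([^/]+)$/,
  openai: /^\/practice\/live-session-openai-([^/]+)$/,
};

const nestedRoutes: Record<string, RegExp> = {
  gemini: /^\/practice\/live-session-([^/]+)$/,
  openai: /^\/practice\/live-session-openai\/([^/]+)$/,
};

// Return the names of every route whose pattern accepts the URL.
function matchingRoutes(url: string, routes: Record<string, RegExp>): string[] {
  return Object.keys(routes).filter((name) => routes[name].test(url));
}
```

With the flat patterns, `/practice/live-session-openai-42` is accepted by both routes because `:id` can absorb `openai-42`. With the nested patterns, `/practice/live-session-openai/42` is accepted only by the OpenAI route, since a parameter cannot span the slash.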
…ialog persistence #86etkgdqu
… dedicated video token metrics #86etkgdqu
…ranscript, and session state tracking #86etkgdqu
… hoverable titles #86etkgdqu
…anslation-masking mode in live sessions #86etkgdqu
…g i18n placeholders #86etkgdqu
…t input for live practice #86etkgdqu
…ection for live sessions #86etkgdqu
…allback #86etkgdqu
…igger actions #86etkgdqu
🏷️ PR Title:
Enhance Gemini Live Sessions with Improved UI, Language Support, and Token Tracking
📋 Summary
This PR introduces comprehensive improvements to the Gemini live session experience, including a redesigned UI with enhanced phrase cards and chat transcript, native language selection, and better session state management. It adds a text-based message composer with microphone toggle fallback and microphone level visualization. The update also includes advanced Gemini token tracking for tools, thoughts, and video metrics, as well as refined message handling and granular dialog persistence. Localization enhancements using i18n placeholders and UI layout fixes are incorporated. Additionally, the OpenAI live-session flow is disabled and detached, with the default live session route set to Gemini. The price calculation logic is centralized and updated to apply cost markup for Gemini pricing models. Obsolete Instagram connection and status pages are removed, and overall code cleanup and documentation improvements are applied.
🔗 Related Tasks
#86etkgdqu - Implement Gemini Live API for live sessions, improve UI and session state, add native language selection, mic level visualization, token tracking, and text input enhancements; disable OpenAI live session flow and clean up related code.
📝 Additional Details
📜 Commit List