fix(session-replay-browser): suppress IDB transaction-timeout log cascade (SR-4356) #1751

Closed
lewgordon-amplitude wants to merge 1 commit into main from lewgordon/sr-4356-sr-sdk-floods-console-with-transaction-timed-out-warnings

lewgordon-amplitude (Collaborator) commented May 12, 2026

Summary

  • A customer page was logging hundreds of repeated Amplitude Logger [Warn]: Failed to store session replay events in IndexedDB: transaction timed out warnings (DevTools collapsed them to 429×). A self-amplifying loop via rrweb-plugin-console-record + Sentry made it worse.
  • Three layered fixes in packages/session-replay-browser/src/events:
    1. tripped flag on SessionReplayEventsIDBStore — public methods short-circuit without opening a tx once tripped, so no more watchdogs are armed
    2. Centralised logFailure helper gated on !tripped — only the first failure logs; in-flight watchdogs / tx.done rejections stay silent
    3. Swap-first in switchToMemoryStore — replace store synchronously before the recovery drain. The drain itself uses a new drainForFallback that bypasses tripped, preserving recovery on write-side wedges
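
Fixes 1 and 2 can be sketched together as below. This is a minimal, synchronous illustration and not the SDK's code: `TrippableStore`, `WarnLogger`, `addEvent`, `onInFlightFailure`, and the `storageIsWedged` switch are hypothetical names, and the counter stands in for opening a real IndexedDB transaction.

```typescript
// Hypothetical, simplified stand-in for SessionReplayEventsIDBStore.
// Names and signatures are illustrative assumptions, not the SDK's API.
type WarnLogger = { warn: (message: string) => void };

class TrippableStore {
  private tripped = false;
  private logger: WarnLogger;
  // Counts simulated transactions so the short-circuit is observable.
  public openedTransactions = 0;

  constructor(logger: WarnLogger) {
    this.logger = logger;
  }

  // Log-once gate: only the failure that flips `tripped` reaches the logger.
  private logFailure(message: string): void {
    if (this.tripped) {
      return;
    }
    this.tripped = true;
    this.logger.warn(message);
  }

  // Public write path: once tripped, return before opening a transaction,
  // so no further watchdog would be armed.
  addEvent(event: string, storageIsWedged: boolean): boolean {
    if (this.tripped) {
      return false;
    }
    this.openedTransactions += 1;
    if (storageIsWedged) {
      this.logFailure(`Failed to store "${event}": transaction timed out`);
      return false;
    }
    return true;
  }

  // Late failures from transactions that were already in flight.
  onInFlightFailure(message: string): void {
    this.logFailure(message);
  }
}

const loggedWarnings: string[] = [];
const trippableStore = new TrippableStore({
  warn: (message) => {
    loggedWarnings.push(message);
  },
});

// A burst of 40 writes against a wedged store, plus one late watchdog.
for (let i = 0; i < 40; i++) {
  trippableStore.addEvent(`event-${i}`, true);
}
trippableStore.onInFlightFailure("transaction timed out");
```

In this sketch only the first write opens a simulated transaction and only one warning is recorded; the other 39 writes and the late watchdog are silent.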

Linear: SR-4356

Test plan

  • 912 jest tests pass with 100% coverage
  • New unit tests cover: short-circuit after trip, log-once across burst, drainForFallback bypasses tripped, swap-first ordering, drain rejection swallowed
  • New e2e test in e2e/idb.spec.ts verifies that under sustained IDB put failure with 40+ rapid events, exactly 1 storage-failure warn is emitted (pre-fix would be dozens)
  • Bugbot pass
  • CI green

🤖 Generated with Claude Code


Note

Medium Risk
Modifies session-replay event persistence and mid-session fallback behavior under IndexedDB failure; bugs here could impact event durability or recovery ordering, but changes are gated to failure paths and are heavily test-covered.

Overview
Prevents session replay from spamming console warnings and piling up stalled transactions when IndexedDB becomes unhealthy (SR-4356).

SessionReplayEventsIDBStore now trips a permanent failure flag that makes public read/write methods short-circuit (no new transactions/watchdogs) and centralizes storage-failure logging so only the first failure logs; a new drainForFallback() bypasses the trip to allow a best-effort final read.

switchToMemoryStore in events-manager is reordered to swap to the in-memory store first, then attempt the best-effort IDB drain (swallowing drain failures), ensuring new captures aren’t blocked by a wedged IDB handle; unit/integration/e2e tests are updated/added to lock in burst-suppression and swap-first behavior.
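
The swap-first ordering might look roughly like the sketch below. It is written under assumptions: `EventsManagerSketch`, `InMemoryStore`, `ReplayEventStore`, `capture`, and the `drain` callback are hypothetical, and how recovered data is merged back is not specified in this PR text, so the sketch only keeps it in a separate field.

```typescript
// Hypothetical shapes; the real events-manager and store interfaces may differ.
interface ReplayEventStore {
  add(event: string): void;
}

class InMemoryStore implements ReplayEventStore {
  events: string[] = [];

  add(event: string): void {
    this.events.push(event);
  }
}

class EventsManagerSketch {
  store: ReplayEventStore;
  recovered: string[] = [];

  constructor(initialStore: ReplayEventStore) {
    this.store = initialStore;
  }

  capture(event: string): void {
    this.store.add(event);
  }

  // `drain` stands in for the IDB store's best-effort final read.
  async switchToMemoryStore(drain: () => Promise<string[]>): Promise<void> {
    // Swap first: this assignment runs synchronously, before the first await.
    this.store = new InMemoryStore();
    try {
      this.recovered = await drain();
    } catch {
      // Drain failures are swallowed; capture already runs on memory.
    }
  }
}

const wedgedIdbStore: ReplayEventStore = {
  add: () => {
    throw new Error("IDB is wedged");
  },
};
const manager = new EventsManagerSketch(wedgedIdbStore);

// The drain never settles, yet captures land in memory right away.
const stalledDrain = () => new Promise<string[]>(() => {});
void manager.switchToMemoryStore(stalledDrain);
manager.capture("click");
```

Because the body of an `async` function runs synchronously up to its first `await`, the store is already replaced when `switchToMemoryStore` returns its promise, which is what keeps a stalled drain from blocking new captures.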

Reviewed by Cursor Bugbot for commit 9e4ca24. Bugbot is set up for automated code reviews on this repo.

…cade (SR-4356)

Customer page was seeing hundreds of repeated 'transaction timed out' warns
when IDB wedged: every fire-and-forget addEventToCurrentSequence opened its
own readwrite tx and armed its own 5s watchdog, so the 5s mark produced a
burst of identical warns (devtools collapsed 429x).  A self-amplifying
loop made it worse — logger.warn went through rrweb-plugin-console-record,
becoming a new rrweb event that fed back into addEventToCurrentSequence.
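
The burst and the feedback loop described above can be reproduced in a small simulation. This is a sketch under assumptions, not the SDK's code: `FakeClock` is a hypothetical deterministic timer, the wedged transaction is modelled as one that never completes, and `warn` models console recording by turning each warning into a new stored event.

```typescript
// Deterministic stand-in for timers so the burst can be replayed without waiting.
class FakeClock {
  private now = 0;
  private timers: { at: number; fire: () => void }[] = [];

  setTimeout(fire: () => void, delayMs: number): void {
    this.timers.push({ at: this.now + delayMs, fire });
  }

  advance(ms: number): void {
    this.now += ms;
    const due = this.timers.filter((t) => t.at <= this.now);
    this.timers = this.timers.filter((t) => t.at > this.now);
    due.forEach((t) => t.fire());
  }
}

const WATCHDOG_MS = 5000;
const clock = new FakeClock();
const burstWarnings: string[] = [];

// Pre-fix behaviour (simplified): every write arms its own watchdog,
// and the wedged transaction never completes to cancel it.
function addEventToCurrentSequencePreFix(): void {
  clock.setTimeout(() => warn("transaction timed out"), WATCHDOG_MS);
}

// Models console recording: each warn becomes a new event to store.
function warn(message: string): void {
  burstWarnings.push(message);
  addEventToCurrentSequencePreFix();
}

for (let i = 0; i < 40; i++) {
  addEventToCurrentSequencePreFix();
}

clock.advance(WATCHDOG_MS); // the 40 original watchdogs fire together
const afterFirstBurst = burstWarnings.length;

clock.advance(WATCHDOG_MS); // the watchdogs armed by those warnings fire next
const afterSecondBurst = burstWarnings.length;
```

In this model 40 rapid writes yield 40 identical warnings at the 5s mark, and each warning re-arms another watchdog, so the burst repeats every 5s instead of dying out.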

Fix in three layered changes in packages/session-replay-browser/src/events:
  1. Add a tripped flag to SessionReplayEventsIDBStore.  Public methods
     short-circuit without opening a tx once tripped.  Kills new txs at
     the source so no new watchdogs are armed.
  2. Route all failure logs through a centralised logFailure helper
     gated on !tripped.  Only the first failure that flips tripped logs;
     later in-flight watchdogs / tx.done rejections stay silent.
  3. Swap-first in switchToMemoryStore: replace store synchronously
     before the recovery drain so new captures land on memory immediately,
     even if the drain stalls.  The drain itself uses a new
     drainForFallback method that bypasses tripped, preserving the
     ability to recover sequences on a write-side wedge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
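
The read-side half of item 3 in the commit message, a recovery read that ignores the trip, could be sketched as follows. `DrainableStoreSketch`, `StoredSequence`, `markTripped`, and `getSequencesToSend` are hypothetical names, and the reads are synchronous here for brevity, whereas real IndexedDB reads are asynchronous.

```typescript
// Hypothetical data shape; the SDK's stored sequence type may differ.
type StoredSequence = { sequenceId: number; events: string[] };

class DrainableStoreSketch {
  private tripped = false;
  private sequences: StoredSequence[];

  constructor(sequences: StoredSequence[]) {
    this.sequences = sequences;
  }

  markTripped(): void {
    this.tripped = true;
  }

  // Public read: short-circuits once tripped, so no new transaction is opened.
  getSequencesToSend(): StoredSequence[] | undefined {
    if (this.tripped) {
      return undefined;
    }
    return this.readAll();
  }

  // Recovery-only read: deliberately ignores `tripped` for one last attempt.
  drainForFallback(): StoredSequence[] {
    return this.readAll();
  }

  // Returns copies so callers cannot mutate the stored sequences.
  private readAll(): StoredSequence[] {
    return this.sequences.map((s) => ({ ...s, events: [...s.events] }));
  }
}

// A write-side wedge: earlier sequences are still readable.
const wedgedOnWrite = new DrainableStoreSketch([
  { sequenceId: 1, events: ["a", "b"] },
]);
wedgedOnWrite.markTripped();

const viaPublicRead = wedgedOnWrite.getSequencesToSend();
const viaDrain = wedgedOnWrite.drainForFallback();
```

Keeping the bypass in a separately named method means the ordinary public paths stay quiet after a trip, while the fallback path can still attempt to recover sequences that were persisted before the wedge.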

linear Bot commented May 12, 2026

SR-4356

lewgordon-amplitude (Collaborator, Author) commented

bugbot run

github-actions commented

Session Replay Browser E2E Results

121 passed

Details

stats  121 tests across 11 suites
duration  2 minutes, 56 seconds
commit  9e4ca24

cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 9e4ca24.

lewgordon-amplitude (Collaborator, Author) commented

Closing as I'm transitioning to Statsig. @stevenchien-amplitude @jpollock-ampl @jxiwang — would any of you want to pick this up?
