feat(spanner): add multiplexed session support #437

Open
JulienEllie wants to merge 1 commit into yoshidan:main from JulienEllie:feature/multiplexed-sessions

JulienEllie commented May 5, 2026

Problem

The Spanner v1 protocol has a multiplexed session shape (Session.multiplexed = true): one long-lived session shared by many concurrent ops, with no per-session pooling or expiry tracking on the client. Spanner Omni deployments only accept multiplexed sessions; vanilla Cloud Spanner accepts them for concurrent workloads. The Rust client previously had no support for this, so any user pointed at Omni couldn't use the crate at all.

This PR adds end-to-end support, with the multiplexed code path auto-enabled when SPANNER_OMNI_ENDPOINT is set in the environment.

Session management

  • SessionManager holds the long-lived multiplexed handle in a parking_lot::RwLock so concurrent get() callers can clone it in parallel; only recreation needs the write lock.
  • NotFound on any clone flips a shared invalid flag; the next get() recreates the master under an async lock that double-checks the flag. A monotonic multiplexed_generation counter, snapshotted onto each clone, prevents stale clones (whose master has already been rotated) from re-flagging a freshly-recreated master and triggering redundant CreateSession RPCs.
  • get() and recreate_multiplexed_session() honour the manager's CancellationToken, returning FailedToCreateSession after close() so a closed manager can't be silently resurrected.
  • close() drops the multiplexed master without DeleteSession — per the v1 proto, multiplexed sessions can't be deleted via the API and are reclaimed by server-side TTL.
  • Acquire / release / latency metrics fire on the multiplexed path the same way they do on the pool path, so OTel dashboards remain accurate after migration.
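
The generation guard described above can be sketched as follows. This is a minimal illustration using only std types: std::sync::RwLock stands in for parking_lot::RwLock, and a caller-supplied name stands in for the CreateSession RPC. The names Shared, SessionClone, Manager, mark_invalid, needs_recreate and recreate are hypothetical and are not the crate's actual API.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// State shared between the manager and every clone it hands out.
#[derive(Default)]
pub struct Shared {
    invalid: AtomicBool,
    generation: AtomicU64,
}

/// A cheap clone of the master, stamped with the generation it came from.
pub struct SessionClone {
    pub name: String,
    generation: u64,
    shared: Arc<Shared>,
}

impl SessionClone {
    /// Called when an RPC on this clone returns NotFound.
    /// Returns true only if the invalid flag was actually set.
    pub fn mark_invalid(&self) -> bool {
        if self.shared.generation.load(Ordering::Acquire) != self.generation {
            // Stale clone: its master was already rotated, so do not re-flag.
            return false;
        }
        self.shared.invalid.store(true, Ordering::Release);
        true
    }
}

pub struct Manager {
    master: RwLock<String>,
    shared: Arc<Shared>,
}

impl Manager {
    pub fn new(name: &str) -> Self {
        Manager {
            master: RwLock::new(name.to_string()),
            shared: Arc::new(Shared::default()),
        }
    }

    /// Clone the master under the read lock so the name and the generation
    /// stamp are taken from the same master.
    pub fn get(&self) -> SessionClone {
        let master = self.master.read().unwrap();
        SessionClone {
            name: master.clone(),
            generation: self.shared.generation.load(Ordering::Acquire),
            shared: Arc::clone(&self.shared),
        }
    }

    pub fn needs_recreate(&self) -> bool {
        self.shared.invalid.load(Ordering::Acquire)
    }

    /// Swap in a new master. `new_name` is a placeholder for the session
    /// returned by CreateSession in the real client.
    pub fn recreate(&self, new_name: &str) {
        let mut master = self.master.write().unwrap();
        // Double-check: another caller may have recreated already.
        if !self.shared.invalid.load(Ordering::Acquire) {
            return;
        }
        *master = new_name.to_string();
        self.shared.generation.fetch_add(1, Ordering::AcqRel);
        self.shared.invalid.store(false, Ordering::Release);
    }
}
```

In this sketch a clone obtained before recreate() carries the old generation, so a late NotFound on it leaves the fresh master untouched, while a clone taken afterwards can still flag it.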

Read-write transactions (inline begin)

  • Multiplexed RW transactions use inline begin: the transaction is started implicitly on the first query/DML, with the server-assigned tx_id captured from response metadata into a slot shared with the parent transaction. PartitionedDml is excluded — it continues to use explicit BeginTransaction, since inline begin with PartitionedDml is rejected by the server.
  • Streaming retries upgrade the cached request's selector to Id(tx_id) once captured (new Reader::update_transaction_selector trait method, with a no-op default body so external impls don't break — SemVer-compatible). Without this, a resumed stream with Begin(...) + resume_token could start a second server-side transaction the client never tracks.
  • RowIterator only pre-fetches the first frame on the very first op of a transaction (when the inline-begin slot is still unpopulated); subsequent ops on the same transaction behave like non-multiplexed iterators.
  • Mutations-only commits go through an explicit BeginTransaction with mutation_key set to a randomly-selected buffered mutation (per proto guidance for partition-lookup load distribution), capturing both tx_id and the response precommit_token. The lifetime transaction_tag is used for the begin RPC.
  • commit() returns a synthetic empty CommitResponse if the transaction reaches commit with no operations and no buffered mutations. This mirrors rollback()'s no-op guard, so an empty multiplexed tx doesn't ship TransactionId("") to the server (which is rejected).
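
The selector upgrade on streaming retries could look roughly like the sketch below. It assumes a simplified Selector enum in place of the proto's TransactionSelector and a hypothetical CachedRequest struct. The method signature shown for update_transaction_selector is a guess for illustration, not the signature in this PR.

```rust
/// Simplified, hypothetical mirror of the proto's TransactionSelector.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Selector {
    /// Inline begin: the server starts the transaction on this request.
    Begin,
    /// Reuse an already-started transaction.
    Id(Vec<u8>),
}

/// Hypothetical cached request that a reader replays when a stream resumes.
pub struct CachedRequest {
    pub selector: Selector,
    pub resume_token: Vec<u8>,
}

pub trait Reader {
    /// The default no-op body keeps existing external impls compiling.
    fn update_transaction_selector(&mut self, _tx_id: &[u8]) {}
}

pub struct StatementReader {
    pub request: CachedRequest,
}

impl Reader for StatementReader {
    fn update_transaction_selector(&mut self, tx_id: &[u8]) {
        // Only upgrade Begin -> Id; an existing Id is left alone, and the
        // resume_token is never touched.
        if self.request.selector == Selector::Begin {
            self.request.selector = Selector::Id(tx_id.to_vec());
        }
    }
}

/// External-style impl that omits the new method and relies on the default.
pub struct ExternalReader;

impl Reader for ExternalReader {}
```

Once the selector is Id(tx_id), a retry that replays the cached request with its resume_token continues the same server-side transaction instead of beginning a new one.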

Precommit token handling

  • Multiplexed-session precommit tokens are kept by highest seq_num per proto contract, not last-write-wins. A shared helper enforces this at every capture site (DML, batch DML, streaming reads, mutations-only BeginTransaction).
  • The captured token is forwarded on CommitRequest.precommit_token.
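
A minimal sketch of the highest-seq_num rule, assuming a std Mutex-protected slot and a hypothetical PrecommitToken struct standing in for MultiplexedSessionPrecommitToken. Keeping the existing token on an equal seq_num is an assumption of this sketch; the PR only states that the equal case is tested.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the proto's MultiplexedSessionPrecommitToken.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PrecommitToken {
    pub precommit_token: Vec<u8>,
    pub seq_num: i32,
}

/// Store `incoming` only if the slot is empty or `incoming` carries a
/// strictly greater seq_num. Returns true when the slot was updated.
pub fn update_precommit_token(
    slot: &Mutex<Option<PrecommitToken>>,
    incoming: PrecommitToken,
) -> bool {
    let mut guard = slot.lock().unwrap();
    let should_replace = match guard.as_ref() {
        // Assumption: an equal seq_num keeps the token already stored.
        Some(current) => incoming.seq_num > current.seq_num,
        None => true,
    };
    if should_replace {
        *guard = Some(incoming);
    }
    should_replace
}
```

Calling the same helper from every capture site means responses that arrive out of order cannot overwrite a newer token with an older one.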

Apply path

  • Client::apply_at_least_once routes through the regular RW transaction path on multiplexed sessions, since SingleUseTransaction commits with mutations are not supported on multiplexed sessions (no place to attach mutation_key / precommit_token; server returns UNIMPLEMENTED).

Tests

Unit tests:

  • update_precommit_token: empty slot, strictly-greater seq_num, lower seq_num, equal seq_num.
  • Reader::update_transaction_selector for both StatementReader and TableReader: Begin(...) replaced with Id(tx_id), resume_token untouched.
  • Compile-time canary: an external-style Reader impl that omits update_transaction_selector still compiles (default body present).
  • SessionManager::get after close returns FailedToCreateSession.
  • Multiplexed master is recreated when the invalid flag is flipped (server-assigned session name changes after invalidation).
  • Stale multiplexed clone (older generation) does NOT re-flag the invalid bit; current-generation clone does.

Integration tests against the Cloud Spanner emulator (recent images participate in the multiplexed protocol — accept multiplexed:true, return precommitToken on BeginTransaction with mutation_key, and reject buggy request shapes with NOT_FOUND / UNIMPLEMENTED):

  • inline-begin: query then commit captures tx_id.
  • mutations-only commit drives begin_for_mutations_only.
  • empty transaction commits cleanly without shipping TransactionId("").
  • multi-op transaction reuses the captured tx_id across ops.
  • apply_at_least_once on a multiplexed session (server rejects the SingleUseTransaction-with-mutations shape used by the non-multiplexed path).
  • partitioned_update on a multiplexed session (server rejects the inline-begin PartitionedDml shape).
  • DML followed by buffered mutations in the same RW transaction.

Not exercised end-to-end (would need either a gRPC mock harness or a stricter Spanner endpoint): streaming-retry selector upgrade in production, strict precommit_token / seq_num enforcement (the emulator participates but doesn't reject mismatches), and the lock-contention contract of the RwLock-protected multiplexed clone.

Test plan

  • CI: full Spanner emulator suite passes (cargo test -p gcloud-spanner --lib --test client_test --test transaction_rw_test --test transaction_ro_test --test multiplexed_test).
  • Manual smoke against a Spanner Omni endpoint with SPANNER_OMNI_ENDPOINT set: verify RW transactions, mutations-only commits, and concurrent ops succeed.

Adds support for multiplexed Spanner sessions, required by Spanner Omni
deployments and supported by Cloud Spanner for concurrent workloads. A
single multiplexed session is created via CreateSession (with
Session.multiplexed = true) and shared across callers; no per-session
pooling or expiry is maintained. Auto-enabled when SPANNER_OMNI_ENDPOINT
is set.

Session management
- SessionManager holds a long-lived multiplexed handle behind an RwLock
  so concurrent get() callers can clone in parallel; only recreation
  takes the write lock.
- NotFound on any clone flips a shared invalid flag; the next get()
  recreates the master under an async lock that double-checks the flag.
- get() and recreate_multiplexed_session() honour the cancellation
  token, returning FailedToCreateSession after close() so a closed
  manager cannot be silently resurrected.
- close() drops the multiplexed master without issuing DeleteSession;
  per the v1 proto, multiplexed sessions cannot be deleted via the API
  and are reclaimed by server-side TTL.

Read-write transactions (inline begin)
- Multiplexed RW transactions use inline begin: the transaction is
  started implicitly on the first query/DML, with the server-assigned
  tx_id captured from response metadata into a slot shared with the
  parent transaction. PartitionedDml is excluded and continues to use
  explicit BeginTransaction.
- Streaming retries upgrade the cached request's selector to Id(tx_id)
  once captured, so a resumed stream cannot accidentally start a second
  server-side transaction.
- RowIterator only pre-fetches the first frame on the very first op
  (when the inline-begin slot is still unpopulated); subsequent ops on
  the same transaction behave like non-multiplexed iterators.
- Mutations-only commits go through an explicit BeginTransaction with
  mutation_key set, capturing both tx_id and the response precommit
  token; the lifetime transaction_tag is used for the begin RPC.
- commit() returns a synthetic empty CommitResponse if the transaction
  reaches commit with no operations and no buffered mutations, mirroring
  rollback()'s no-op guard.

Precommit token handling
- Multiplexed-session precommit tokens are kept by highest seq_num per
  proto contract (MultiplexedSessionPrecommitToken.seq_num); a shared
  helper enforces this at every capture site (DML, batch DML, streaming
  reads, mutations-only BeginTransaction).
- The captured token is forwarded on CommitRequest.precommit_token.

Apply path
- Client::apply_at_least_once routes through the regular RW transaction
  path on multiplexed sessions, since SingleUseTransaction commits with
  mutations are not supported on multiplexed sessions (no place to
  attach mutation_key / precommit_token).

Tests
- Unit tests:
  * update_precommit_token: empty slot, strictly-greater seq_num,
    lower seq_num, equal seq_num.
  * Reader::update_transaction_selector for both StatementReader and
    TableReader: Begin(...) replaced with Id(tx_id), resume_token
    untouched.
  * SessionManager::get after close returns FailedToCreateSession.
  * Multiplexed master is recreated when the invalid flag is flipped
    (server-assigned session name changes after invalidation).
- Integration tests against the Cloud Spanner emulator (which does
  participate in the multiplexed-session protocol on recent images —
  it accepts multiplexed:true sessions, returns precommitToken on
  BeginTransaction with mutation_key, and rejects buggy request shapes
  with NOT_FOUND/UNIMPLEMENTED). Coverage:
  * inline begin: query then commit captures tx_id.
  * mutations-only commit drives begin_for_mutations_only.
  * empty transaction commits cleanly without shipping
    TransactionId(empty) (which the server rejects).
  * multi-op transaction reuses the captured tx_id across ops.
  * apply_at_least_once on a multiplexed session (server rejects the
    SingleUseTransaction-with-mutations shape used by the non-
    multiplexed path).
  * partitioned_update on a multiplexed session (server rejects the
    inline-begin PartitionedDml shape).
  * DML followed by buffered mutations in the same RW transaction.
- Not exercised end-to-end (would need either a gRPC mock or an
  emulator that strictly enforces these): streaming-retry selector
  upgrade, strict precommit_token / seq_num enforcement, and the
  performance contract of the RwLock-protected multiplexed clone.