DRIVER-153: negotiate and implement SCYLLA_USE_METADATA_ID extension#770
DRIVER-153: negotiate and implement SCYLLA_USE_METADATA_ID extension#770nikagra wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Implements negotiation and support for Scylla’s SCYLLA_USE_METADATA_ID protocol extension to enable metadata-id based skip_meta behavior (backporting CQL v5 prepared-statement metadata-id semantics to earlier protocol versions).
Changes:
- Adds
SCYLLA_USE_METADATA_IDparsing fromSUPPORTEDand includes it inSTARTUPwhen negotiated. - Extends protocol encode/decode to read/write
result_metadata_idfor PREPARE/EXECUTE on pre-v5 when the extension is used, and fixes on-wire encoding of_SKIP_METADATA_FLAG. - Updates execution/result handling to conditionally use
skip_metaand to refresh cached prepared metadata when the server reports metadata changes.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
cassandra/protocol_features.py |
Adds the SCYLLA_USE_METADATA_ID feature flag and includes it in negotiated STARTUP options. |
cassandra/protocol.py |
Writes _SKIP_METADATA_FLAG in query params; adds pre-v5 extension handling for result_metadata_id in PREPARE/EXECUTE. |
cassandra/cluster.py |
Adjusts when skip_meta is enabled and updates cached prepared metadata/id on METADATA_CHANGED responses. |
tests/unit/test_protocol_features.py |
Adds unit tests for feature parsing and STARTUP option inclusion. |
tests/unit/test_protocol.py |
Adds unit tests for skip-meta flag encoding and metadata-id handling in pre-v5 PREPARE/EXECUTE paths. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| elif self.result_metadata_id is not None: | ||
| write_string(f, self.result_metadata_id) |
There was a problem hiding this comment.
ExecuteMessage.send_body() now writes result_metadata_id for protocol versions that don’t use standard prepared-metadata (pre-v5) whenever self.result_metadata_id is non-None. For pre-v5 this extra field is only valid when SCYLLA_USE_METADATA_ID was negotiated on that connection; otherwise it changes the wire layout (the server will interpret the metadata id bytes as the query parameters) and the request will fail.
To make this safe, ensure the decision to include result_metadata_id is gated by the negotiated feature (e.g., add an explicit use_metadata_id/send_result_metadata_id flag on the message that the caller sets based on connection.features.use_metadata_id, or guarantee centrally that result_metadata_id is cleared unless the extension is active for that connection).
| elif self.result_metadata_id is not None: | |
| write_string(f, self.result_metadata_id) |
There was a problem hiding this comment.
Agreed. The elif self.result_metadata_id is not None path in send_body is now only reached when the caller explicitly set the field — which only happens in _query() after confirming connection.features.use_metadata_id (or CQL v5). For any connection that didn't negotiate the extension, result_metadata_id remains None and the branch is never taken, so the wire layout is unaffected.
There was a problem hiding this comment.
There is still a problem here. If the result id feature is negotiated on the connection, then you need to ALWAYS send some result metadata id in EXECUTE. Skipping the write_string will result in a protocol error.
Your use_metadata_id may be False even if extension was negotiated, if the server decided to skip the metadata in PREPARED response. In such case, you'll skip writing the id here, and encounter protocol error.
Even if you fix this specific case, there is still possibility of mixed cluster, with some nodes supporting the extension. In that case result_metadata_id will be None, and if you send to a node that has the extension negotiated, you'll again not send the id and encounter protocol error.
To sum up: this serialization here should check if feature is negotiated, and base sending this field only on that.
There was a problem hiding this comment.
👨💻: Addressed your concerns
🤖:
Automated response
Fixed. Added a `use_metadata_id` boolean flag to `ExecuteMessage` (default `False`). `send_body` now gates the `result_metadata_id` field on `ProtocolVersion.uses_prepared_metadata(protocol_version) or self.use_metadata_id` — both branches unified — writing an empty sentinel (`b''`) when the hash is `None` rather than omitting the field or crashing.
In _query(), message.use_metadata_id = connection.features.use_metadata_id is set unconditionally after borrowing the connection, so the decision is always based on what was negotiated on that specific connection.
This also fixes a second issue: the old separate v5 branch would crash with TypeError when result_metadata_id is None (LWT statement on v5, or statement prepared before the extension was active in a mixed cluster). The unified sentinel handles that case correctly on both code paths.
Two new tests cover this: test_execute_message_scylla_metadata_id_none_writes_sentinel (extension active, None hash → sentinel on v4) and test_execute_message_v5_metadata_id_none_writes_sentinel (v5 native path, None hash → sentinel instead of crash).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
ade35d8 to
f42e225
Compare
|
I'm not sure where, but we should document this - with reference mainly to the scylladb docs about this feature. |
|
@mykaul Documentation I'm aware of is MetadataId extension in CQLv4 Requirement Document |
dkropachev
left a comment
There was a problem hiding this comment.
One blocking correctness issue below: skip_meta is being enabled for prepared statements that can still have empty/absent cached result metadata.
6eea397 to
a86fd53
Compare
7ba5835 to
a86fd53
Compare
a86fd53 to
8880f03
Compare
Backport the prepared-statement metadata-ID mechanism from CQL v5 to earlier protocol versions via the SCYLLA_USE_METADATA_ID Scylla extension. protocol_features.py: - Add USE_METADATA_ID constant and use_metadata_id field to ProtocolFeatures - parse_from_supported: detect SCYLLA_USE_METADATA_ID in SUPPORTED options - add_startup_options: echo the extension back in STARTUP when negotiated protocol.py: - _write_query_params: fix _SKIP_METADATA_FLAG — it was stored but never written on the wire (dead code); now correctly sets the flag in the frame - recv_results_prepared: read result_metadata_id when the Scylla extension is active (pre-v5), in addition to the existing CQL v5 path - ExecuteMessage: add use_metadata_id flag (default False); send_body gates the result_metadata_id field on uses_prepared_metadata(protocol_version) OR use_metadata_id, writing an empty sentinel (b'') when the hash is unavailable (LWT / mixed cluster) instead of omitting the field entirely or crashing with TypeError cluster.py: - _create_response_future: build ExecuteMessage with defaults (skip_meta=False, result_metadata_id=None); set these in _query() once the connection is known - _query: after borrowing the connection set can_skip_meta, skip_meta, result_metadata_id, and use_metadata_id on the ExecuteMessage based on the actual connection's negotiated features; skip_meta is only enabled when the prepared statement has both a result_metadata_id AND cached result_metadata (guards against LWT/NO_METADATA statements and mixed-cluster re-prepare) - _set_result: on METADATA_CHANGED (new result_metadata_id in EXECUTE response) update prepared_statement.result_metadata then result_metadata_id in that order — safe write ordering so a concurrent reader using the old id gets full metadata from the server while a reader seeing the new id has the correct cached metadata immediately docs/scylla-specific.rst: document the extension and its behaviour
test_protocol_features.py: - test_use_metadata_id_parsing: SCYLLA_USE_METADATA_ID parsed from SUPPORTED - test_use_metadata_id_missing: use_metadata_id False when key absent - test_use_metadata_id_startup_options: key present in STARTUP when negotiated - test_use_metadata_id_not_in_startup_when_not_negotiated: absent otherwise test_protocol.py: - test_execute_message_skip_meta_flag: _SKIP_METADATA_FLAG (0x02) is written - test_execute_message_scylla_metadata_id_v4: result_metadata_id written on v4 when use_metadata_id=True (Scylla extension) - test_execute_message_scylla_metadata_id_none_writes_sentinel: extension active but result_metadata_id=None writes empty sentinel b'' (LWT / mixed cluster); frame layout preserved - test_execute_message_v5_metadata_id_none_writes_sentinel: v5 with result_metadata_id=None writes empty sentinel instead of TypeError crash - test_recv_results_prepared_scylla_extension_reads_metadata_id - test_recv_results_prepared_no_extension_skips_metadata_id - test_recv_results_metadata_changed_flag - test_recv_results_metadata_no_metadata_flag_skips_metadata_id test_response_future.py: _set_result METADATA_CHANGED path: - test_set_result_updates_metadata_when_metadata_changed - test_set_result_does_not_update_metadata_when_metadata_id_absent - test_set_result_warns_when_metadata_id_but_no_column_metadata _query per-connection feature gating (5 scenarios): - test_query_sets_skip_meta_with_scylla_extension - test_query_no_skip_meta_without_extension - test_query_no_skip_meta_when_prepared_statement_has_no_metadata_id - test_query_sets_skip_meta_for_protocol_v5 - test_query_no_skip_meta_when_result_metadata_is_none (LWT guard)
170fd31 to
5fe1902
Compare
Summary
Implements the
SCYLLA_USE_METADATA_IDScylla CQL protocol extension (DRIVER-153), which backports the prepared-statement metadata-ID mechanism from CQL v5 to earlier protocol versions.When the extension is negotiated:
skip_meta=True)METADATA_CHANGEDflag and includes the new metadata ID + new column metadata in the response — the driver picks this up and updates its cached metadata automaticallyChanges
cassandra/protocol_features.pyUSE_METADATA_ID = "SCYLLA_USE_METADATA_ID"constant anduse_metadata_idfield toProtocolFeaturescassandra/protocol.py_write_query_paramsnow actually writes_SKIP_METADATA_FLAGon the wire — it was stored on_QueryMessagebut never sent (effectively dead code)recv_results_prepared: readresult_metadata_idfor Scylla extension (pre-v5) in addition to standard CQL v5+ExecuteMessage: adduse_metadata_idflag (defaultFalse);send_bodygates theresult_metadata_idfield onProtocolVersion.uses_prepared_metadata(protocol_version) or self.use_metadata_id— both paths unified. An empty sentinel (b'') is written when the hash isNone(LWT / mixed cluster) so the frame layout is always correct and noTypeErrorcrash on v5cassandra/cluster.py_create_response_future: buildExecuteMessagewith safe defaults (skip_meta=False,result_metadata_id=None,use_metadata_id=False)_query: after borrowing the connection, setcan_skip_meta,skip_meta,result_metadata_id, anduse_metadata_idbased on the actual connection's negotiated features.skip_metais only enabled when the prepared statement has both aresult_metadata_idand usable cachedresult_metadata— guards against LWT/NO_METADATAstatements and mixed-cluster scenarios_set_result: on METADATA_CHANGED updateprepared_statement.result_metadatathenresult_metadata_idin that order (safe write ordering for concurrent readers)docs/scylla-specific.rstTest plan
_set_resultMETADATA_CHANGED path, and per-connection feature gating in_query(5 scenarios: extension on/off, metadata id present/absent, LWT, protocol v5)57 passedin affected files; 627+ overall)