[fix](be) Fix RuntimeFilter selectivity sampling_frequency lost during VExprContext recreation#62355
Conversation
…g VExprContext recreation
### What problem does this PR solve?
Issue Number: close #xxx
Problem Summary:
RuntimeFilter selectivity tracking was completely non-functional because
`sampling_frequency` was lost during VExprContext recreation.
In `RuntimeFilterConsumer::_get_push_exprs()`, `sampling_frequency=32` was set
on a temporary `probe_ctx` VExprContext. However, only VRuntimeFilterWrapper
expressions (VExpr) were returned, not the VExprContext. When
`_append_rf_into_conjuncts()` later created a new VExprContext via
`VExprContext::create_shared(expr)`, the new context had default
`_sampling_frequency=-1` (DISABLE_SAMPLING).
With `_sampling_frequency=-1`, the condition `(_judge_counter++) >= -1`
evaluated to `0 >= -1 → true` on every call, causing `reset_judge_selectivity()`
to fire every time. This meant selectivity counters were perpetually reset
and never accumulated, making the runtime filter selectivity optimization
completely ineffective.
The fix stores `sampling_frequency` in VRuntimeFilterWrapper (which survives
VExprContext recreation) and propagates it to VExprContext in
`VRuntimeFilterWrapper::open()`, which is called on both original and cloned
contexts.
### Release note
Fixed a bug where RuntimeFilter selectivity tracking was non-functional due to
sampling_frequency being lost during VExprContext recreation, causing runtime
filters that should be skipped (due to low selectivity) to never be identified.
### Check List (For Author)
- Test: Unit Test
- Added 2 regression tests to runtime_filter_selectivity_test.cpp
- Added 3 new tests in vruntimefilter_wrapper_sampling_test.cpp
- All 22 tests pass
- Behavior changed: No (selectivity tracking was broken before, this makes it work as designed)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
Fixes a BE runtime-filter selectivity regression where sampling_frequency was lost when runtime-filter expressions were re-wrapped into a new VExprContext, causing selectivity counters to reset continuously and making the “always true / can ignore” optimization ineffective.
Changes:
- Persist
sampling_frequencyinsideVRuntimeFilterWrapperand re-apply it to the activeVExprContextduringVRuntimeFilterWrapper::open(). - Update
RuntimeFilterConsumer::_get_push_exprs()to pass the computed sampling frequency into each created wrapper (instead of setting it on a temporary probe context). - Add regression/unit tests covering default sampling behavior and context recreation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| be/src/exprs/vruntimefilter_wrapper.h | Extend wrapper constructor to carry sampling_frequency and store it as a member. |
| be/src/exprs/vruntimefilter_wrapper.cpp | Propagate wrapper-held sampling_frequency into VExprContext during open(). |
| be/src/exec/runtime_filter/runtime_filter_consumer.cpp | Compute sampling frequency once and pass it into wrapper creation paths. |
| be/test/exec/runtime_filter/runtime_filter_selectivity_test.cpp | Add regression tests for disabled vs enabled sampling-frequency behavior. |
| be/test/exec/runtime_filter/vruntimefilter_wrapper_sampling_test.cpp | New tests validating propagation across (re)created VExprContexts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Create a brand new VExprContext with the same VRuntimeFilterWrapper. | ||
| // The new context starts with default sampling_frequency = -1. | ||
| auto context2 = std::make_shared<VExprContext>(wrapper); | ||
| EXPECT_FALSE(context2->get_runtime_filter_selectivity().maybe_always_true_can_ignore()); | ||
|
|
||
| // After open() on the new context, sampling_frequency should be propagated | ||
| ASSERT_TRUE(wrapper->open(_runtime_states[0].get(), context2.get(), | ||
| FunctionContext::THREAD_LOCAL) |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
/review |
There was a problem hiding this comment.
Found 1 issue in the added regression coverage.
be/test/exec/runtime_filter/vruntimefilter_wrapper_sampling_test.cpp:sampling_frequency_survives_context_recreationdoes not currently verify the post-condition it describes, so it can pass even if propagation to the recreatedVExprContextregresses again.
Critical checkpoints
- Goal of the task: The code change does fix the production bug by moving
sampling_frequencyontoVRuntimeFilterWrapperand restoring it inVRuntimeFilterWrapper::open(). The new tests partially cover this, but the recreated-context regression test is ineffective as written. - Change scope: Small and focused; the production fix is narrowly scoped to runtime-filter wrapper/context state propagation.
- Concurrency: Applicable. The changed state remains per-
VExprContext; I did not find a new shared-state or locking regression in the modified path. - Lifecycle / initialization: Applicable. The key lifecycle here is sharing one
VExpracross recreated/clonedVExprContexts; the production code now restores state inopen(), which is the right hook. The added test does not faithfully exercise that lifecycle. - Config changes: None. Existing
runtime_filter_sampling_frequencysemantics are preserved. - Compatibility: No FE/BE protocol, storage-format, or rolling-upgrade compatibility change found.
- Parallel code paths: Checked the direct conjunct path, recreated-context path, clone path, and scan pushdown attachment path; the production fix covers the relevant runtime-wrapper openings.
- Special conditions: The disable path (
DISABLE_SAMPLING) remains explicit and understandable. - Test coverage: Incomplete for the recreated-context regression scenario because one new test lacks an assertion on the propagated state and uses a non-production call order.
- Observability: Existing counters/logging remain sufficient; no additional observability appears required for this fix.
- Transaction / persistence / data-write concerns: Not applicable.
- FE-BE variable propagation: Not applicable.
- Performance: No material performance concern in the production fix; it adds only a trivial per-open assignment.
- Other issues: No additional correctness issues found in the modified production code beyond the ineffective regression test.
| // Open the recreated context with FRAGMENT_LOCAL to match the real | ||
| // _append_rf_into_conjuncts path for a freshly created VExprContext. | ||
| ASSERT_TRUE( | ||
| wrapper->open(_runtime_states[0].get(), context2.get(), FunctionContext::FRAGMENT_LOCAL) |
There was a problem hiding this comment.
sampling_frequency_survives_context_recreation is not asserting the behavior it claims to protect. After context2->prepare()/open(), the test should exercise context2->get_runtime_filter_selectivity() and assert that low-selectivity updates can flip maybe_always_true_can_ignore() back to true; otherwise this still passes if propagation into the recreated context breaks again.
There is also a lifecycle mismatch here: _append_rf_into_conjuncts() drives the recreated context through conjunct->prepare(); conjunct->open();, but this test manually calls wrapper->open(..., context2, ...) first. That extra direct wrapper->open() can mask exactly the regression this test is supposed to catch.
|
run buildall |
f5d4d1e to
f9a109b
Compare
|
run buildall |
…ing test StubVExpr (leaf node with no children) defaulted to is_constant()=true via VExpr::is_constant()'s all_of on empty children. On second open() call (context2), _constant_col was already populated from context1, causing DCHECK(column_wrapper != nullptr) to fail in get_const_col() when called with nullptr from VExpr::open(). Fix: override is_constant() to return false, matching SLOT_REF semantics. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
What problem does this PR solve?
introduced by #59832
Problem Summary:
RuntimeFilter selectivity tracking was completely non-functional because
sampling_frequencywas lost during VExprContext recreation.In
RuntimeFilterConsumer::_get_push_exprs(),sampling_frequency=32was seton a temporary
probe_ctxVExprContext. However, only VRuntimeFilterWrapperexpressions (VExpr) were returned, not the VExprContext. When
_append_rf_into_conjuncts()later created a new VExprContext viaVExprContext::create_shared(expr), the new context had default_sampling_frequency=-1(DISABLE_SAMPLING).With
_sampling_frequency=-1, the condition(_judge_counter++) >= -1evaluated to
0 >= -1 → trueon every call, causingreset_judge_selectivity()to fire every time. This meant selectivity counters were perpetually reset
and never accumulated, making the runtime filter selectivity optimization
completely ineffective.
Fix: Store
sampling_frequencyin VRuntimeFilterWrapper (which survivesVExprContext recreation) and propagate it to VExprContext in
VRuntimeFilterWrapper::open(), which is called on both original and clonedcontexts.
Release note
Fixed a bug where RuntimeFilter selectivity tracking was non-functional due to
sampling_frequency being lost during VExprContext recreation, causing runtime
filters that should be skipped (due to low selectivity) to never be identified.
Check List (For Author)