-
Notifications
You must be signed in to change notification settings - Fork 721
feat(frontend): auto disable snapshot backfill in some cases #24360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
c2adadc to
391b731
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR automatically disables snapshot backfill in cases where it's not supported or could cause issues, specifically for temporal joins and shared sources. The implementation adds an is_shared property to logical source nodes and introduces a visitor pattern to detect when snapshot backfill should be disabled, providing user notifications in these scenarios.
- Adds
forbid_snapshot_backfillmethod that uses a visitor pattern to detect temporal joins and shared sources - Updates
should_use_snapshot_backfillto check and disable snapshot backfill with user notification when forbidden - Introduces
is_shared_sourcehelper method and exposesis_sharedproperty in logical plan output
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/frontend/src/optimizer/plan_node/mod.rs |
Adds forbid_snapshot_backfill method with visitor pattern to detect temporal joins and shared sources; imports required visitor types |
src/frontend/src/optimizer/plan_node/logical_source.rs |
Adds is_shared_source helper method; updates Distill implementation to display is_shared property in plan output |
src/frontend/src/optimizer/plan_node/logical_join.rs |
Adds should_be_temporal_join public helper method to check if join should be treated as temporal |
src/frontend/src/optimizer/mod.rs |
Updates should_use_snapshot_backfill to call forbid_snapshot_backfill and notify users when snapshot backfill is auto-disabled |
src/frontend/planner_test/tests/testdata/output/watermark.yaml |
Updates test output to include is_shared: false property in logical source |
src/frontend/planner_test/tests/testdata/output/tpch_variant.yaml |
Updates all logical source nodes in test output to include is_shared: false property |
src/frontend/planner_test/tests/testdata/output/temporal_join.yaml |
Changes expected output from error to valid stream plan showing arrangement backfill is used when snapshot backfill is disabled |
src/frontend/planner_test/tests/testdata/output/subquery.yaml |
Updates logical source nodes to include is_shared: false property |
src/frontend/planner_test/tests/testdata/output/shared_source.yml |
Adds new test case demonstrating snapshot backfill is disabled with shared sources |
src/frontend/planner_test/tests/testdata/output/share.yaml |
Updates logical source nodes to include is_shared: false property |
src/frontend/planner_test/tests/testdata/output/nexmark_watermark.yaml |
Updates all nexmark logical source nodes to include is_shared: false property |
src/frontend/planner_test/tests/testdata/output/nexmark_source_kafka.yaml |
Updates all nexmark Kafka source nodes to include is_shared: true property |
src/frontend/planner_test/tests/testdata/output/nexmark_source.yaml |
Updates all nexmark source nodes to include is_shared: false property |
src/frontend/planner_test/tests/testdata/output/batch_source.yaml |
Updates logical source node to include is_shared: true property |
src/frontend/planner_test/tests/testdata/input/temporal_join.yaml |
Changes expected output type from stream_error to stream_plan for temporal join test |
src/frontend/planner_test/tests/testdata/input/shared_source.yml |
Adds new test configuration with snapshot backfill enabled to verify auto-disable behavior |
e2e_test/source_inline/kafka/shared_source.slt.serial |
Removes obsolete test case that expected error for snapshot backfill with shared sources |
| statement error Not supported: Snapshot backfill with shared source backfill is not supported | ||
| create materialized view mv_snapshot_backfill_shared_source as (select * from mv_before_produce) union (select * from s_before_produce); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the test case's expect behavior now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this PR, even though snapshot backfill is enabled by session variable, it will choose to use arrangement backfill anyway with a notice to user about the implicit switch.
| struct ForbidSnapshotBackfill { | ||
| warning_msg: Option<String>, | ||
| } | ||
| impl LogicalPlanVisitor for ForbidSnapshotBackfill { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to def a struct inside a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This struct is a helper struct only used in this method to leverage the visitor trait. Defined in the function to limit its visibility.
391b731 to
2492744
Compare
tabVersion
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from source side, not sure if the backfill strategy switch has other effects.
cc streaming experts @chenzl25 @BugenZhao

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This pull request updates planner test outputs and test cases to reflect the addition of an
is_sharedproperty to logical source nodes, and adjusts the expected planner behavior for temporal joins and shared source configurations. The main changes are to test data and expected outputs, ensuring that test cases now properly account for whether a source is shared or not, and that planner output matches the new logical plan structure.Test output updates for
is_sharedproperty:is_sharedfield toLogicalSourcenodes in test output files, marking sources as eithertrueorfalsedepending on whether they are shared. This affects multiple files, includingbatch_source.yaml,nexmark_source.yaml,nexmark_source_kafka.yaml, andnexmark_watermark.yaml. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16]Test case and configuration changes:
shared_source.ymltest input to include a configuration map enablingstreaming_use_shared_source, and added a scenario withstreaming_use_snapshot_backfillenabled.Planner behavior adjustments:
Checklist
Documentation
Release note