@shanicky shanicky commented Jan 3, 2026

Introduce a new mechanism to control backfill parallelism for streaming jobs, allowing separate tuning of resource usage during backfill phases.

  • Add AlterBackfillParallelism RPC and proto definitions in ddl_service.proto
  • Extend CatalogWriter trait and frontend handler to support ALTER ... SET BACKFILL_PARALLELISM
  • Implement server-side handling in Meta service and DdlController for backfill parallelism changes
  • Add rescheduling logic in GlobalStreamManager and ScaleController to update backfill parallelism safely
  • Validate unsupported cases (e.g., Iceberg tables) and support deferred execution mode
  • Add integration test verifying backfill parallelism switches after backfill completion
  • Extend SQL parser and keywords to recognize BACKFILL_PARALLELISM syntax
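The intended behaviour (a separate parallelism during backfill, switching back once backfill completes) can be sketched as a small selection function. This is only an illustration: `JobParallelism`, its fields, and `effective_parallelism` are hypothetical names, not the types used in this PR, and the fallback to the normal parallelism when no backfill value is set is an assumption.

```rust
/// Hypothetical per-job settings; names are illustrative, not RisingWave's.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct JobParallelism {
    /// Parallelism used once the job runs in steady state.
    pub normal: usize,
    /// Optional override applied only while the job is still backfilling.
    pub backfill: Option<usize>,
}

/// Pick the parallelism to schedule with.
/// Assumption: without an override, backfill uses the normal parallelism.
pub fn effective_parallelism(cfg: JobParallelism, backfilling: bool) -> usize {
    match (backfilling, cfg.backfill) {
        // The override applies only during the backfill phase.
        (true, Some(p)) => p,
        // After backfill completes, or with no override, use the normal value.
        _ => cfg.normal,
    }
}
```

Under these assumptions, a job configured with a normal parallelism of 8 and a backfill parallelism of 2 would run with 2 while backfilling and with 8 afterwards.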

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Summary
This PR introduces support for altering the backfill parallelism of streaming jobs through the DDL gRPC API and the corresponding frontend handlers. It enables users to adjust the parallelism used during backfilling operations, including support for deferred execution and resetting the parallelism setting.

Details

  • Added new RPC messages (AlterBackfillParallelismRequest and AlterBackfillParallelismResponse) and the AlterBackfillParallelism method to the DDL gRPC service definition.
  • Extended the CatalogWriter trait and its implementation with an asynchronous method alter_backfill_parallelism to communicate with the meta service.
  • Implemented a new handler function handle_alter_backfill_parallelism to process alter backfill parallelism requests, including extracting parallelism values and validating input.
  • Integrated the new handler into the main dispatch logic. The operation is supported on tables, indexes, sinks, and sources; it returns a not-implemented error for materialized views and is rejected for Iceberg engine tables.
  • Added stubs for the new method in testing utilities (MockCatalogWriter) to facilitate unit testing.
  • Began partial implementation of the meta service handler for backfill parallelism alterations.
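The dispatch rules listed above could look roughly like the following check. This is a minimal sketch under stated assumptions: `TargetKind`, `AlterError`, and `check_backfill_parallelism_target` are hypothetical and do not reflect the actual types or error variants in the frontend handler.

```rust
/// Hypothetical classification of the object named in the ALTER statement.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TargetKind {
    Table { iceberg_engine: bool },
    Index,
    Sink,
    Source,
    MaterializedView,
}

/// Hypothetical error type; the real handler uses its own error variants.
#[derive(Debug, PartialEq, Eq)]
pub enum AlterError {
    NotImplemented(&'static str),
    Unsupported(&'static str),
}

/// Accept or reject a target before any request is sent to the meta service.
pub fn check_backfill_parallelism_target(kind: TargetKind) -> Result<(), AlterError> {
    match kind {
        // Iceberg engine tables are explicitly disallowed.
        TargetKind::Table { iceberg_engine: true } => Err(AlterError::Unsupported(
            "backfill parallelism cannot be altered on Iceberg engine tables",
        )),
        // Materialized views are not implemented yet.
        TargetKind::MaterializedView => Err(AlterError::NotImplemented(
            "backfill parallelism for materialized views",
        )),
        // Regular tables, indexes, sinks, and sources are supported.
        TargetKind::Table { iceberg_engine: false }
        | TargetKind::Index
        | TargetKind::Sink
        | TargetKind::Source => Ok(()),
    }
}
```

Keeping the match exhaustive (no wildcard arm) means a newly added object kind would force an explicit decision at compile time.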


@shanicky force-pushed the peng/alter-backfill-parallelism branch from 8fd8208 to e7ad74e on January 5, 2026 13:41
@shanicky force-pushed the peng/online-scale-backfill branch from 3dccdcb to a317c48 on January 5, 2026 13:41
@shanicky force-pushed the peng/alter-backfill-parallelism branch from e7ad74e to b1336c6 on January 5, 2026 13:42
- Introduce resolve_streaming_job_id_for_alter_parallelism to include
  tables that are still being created when resolving streaming job IDs
- Refactor resolve_streaming_job_id_for_alter to delegate to a new internal
  helper with a flag controlling inclusion of creating tables
- Update ALTER PARALLELISM handler to use the new function, enabling early
  parallelism alteration in streaming jobs' lifecycle
- Ensure catalog lookups differentiate between fully created and creating
  tables based on the alter operation semantics

Signed-off-by: Shanicky Chen <[email protected]>
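
The delegation described in this commit (two public resolvers sharing one internal helper, with a flag that controls whether tables still being created are included) might be shaped like the sketch below. Only the two function names come from the commit message; the `Catalog` and `CreateState` types, the signatures, and the helper name `resolve_inner` are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical creation state of a catalog entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CreateState {
    Creating,
    Created,
}

/// Hypothetical, heavily simplified catalog: name -> (job id, state).
pub struct Catalog {
    tables: HashMap<String, (u32, CreateState)>,
}

impl Catalog {
    pub fn new(entries: &[(&str, u32, CreateState)]) -> Self {
        let tables = entries
            .iter()
            .map(|(name, id, state)| (name.to_string(), (*id, *state)))
            .collect();
        Self { tables }
    }

    /// Generic ALTER path: only fully created tables resolve.
    pub fn resolve_streaming_job_id_for_alter(&self, name: &str) -> Option<u32> {
        self.resolve_inner(name, false)
    }

    /// ALTER PARALLELISM path: tables still being created also resolve.
    pub fn resolve_streaming_job_id_for_alter_parallelism(&self, name: &str) -> Option<u32> {
        self.resolve_inner(name, true)
    }

    /// Shared helper; `include_creating` is the flag described in the commit.
    fn resolve_inner(&self, name: &str, include_creating: bool) -> Option<u32> {
        let (id, state) = self.tables.get(name).copied()?;
        match state {
            CreateState::Created => Some(id),
            CreateState::Creating if include_creating => Some(id),
            CreateState::Creating => None,
        }
    }
}
```

With this shape, a table that is still backfilling is invisible to ordinary ALTER operations but can already be targeted by a parallelism change.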
@shanicky force-pushed the peng/alter-backfill-parallelism branch from b1336c6 to 4740e26 on January 5, 2026 14:41