SharedMergeScheduler using shared thread pool for multi-tenant merge scheduling #14900

N624-debu · 2025-07-05T18:40:41Z

[Draft] SharedMergeScheduler using shared thread pool for multi-tenant merge scheduling

This draft PR introduces a prototype SharedMergeScheduler, which extends MergeScheduler and routes all merge tasks through a shared thread pool across IndexWriter instances.

Motivation

In multi-tenant environments (e.g., Solr, Elasticsearch), many IndexWriters may coexist in the same JVM. The default Lucene behavior assigns each writer its own ConcurrentMergeScheduler and thread pool, which can lead to resource oversubscription and inefficient coordination.

This implementation introduces a new merge scheduler that centralizes merge execution via a shared thread pool. This idea builds on feedback from GitHub issue #13883, where contributors suggested exploring a dedicated scheduler based on Java's executor framework, rather than modifying the existing ConcurrentMergeScheduler.

Implementation Highlights

Adds a new class SharedMergeScheduler in org.apache.lucene.index
Implements the merge(MergeSource, MergeTrigger) method using the public MergeSource API
Uses a static Executors.newFixedThreadPool(4) as a prototype shared executor
Keeps ConcurrentMergeScheduler unchanged for easier evaluation and iteration

Next Steps

Add lifecycle management and graceful shutdown to the executor
Evaluate fairness or throttling strategies across writers
Potentially combine with a centralized merge manager
Seek feedback on architecture fit and maintainability

This PR is opened to propose and evaluate a shared-thread-pool merge scheduler design, and to gather feedback for further development.

github-actions · 2025-07-05T18:41:32Z

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

vigyasharma · 2025-07-06T17:58:19Z

This seems to be in the right direction. Let's make it a singleton? As a next step, should we add support for prioritising small merges over larger ones (by backing the thread pool with a priority blocking queue)?

Also, since this is a draft PR, let's set it as "draft" on GitHub. You can also add a comment anywhere in your PR that says // no commit, which will ensure that the PR is not merged until it's ready (some GitHub checks will fail).
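The priority-queue idea above can be sketched with JDK classes only. This is a minimal illustration, not code from the PR: `SmallMergesFirstSketch`, `SizedMergeTask`, `newPool` and `runOrder` are hypothetical names, and the size is just a number handed in by the caller.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names exist in the PR or in Lucene. */
final class SmallMergesFirstSketch {

  /** A merge task ordered by its estimated size, smallest first. */
  static final class SizedMergeTask implements Runnable, Comparable<SizedMergeTask> {
    final long sizeBytes;
    private final Runnable body;

    SizedMergeTask(long sizeBytes, Runnable body) {
      this.sizeBytes = sizeBytes;
      this.body = body;
    }

    @Override
    public void run() {
      body.run();
    }

    @Override
    public int compareTo(SizedMergeTask other) {
      return Long.compare(sizeBytes, other.sizeBytes);
    }
  }

  /** Fixed-size pool whose queue hands out the smallest waiting task first. */
  static ThreadPoolExecutor newPool(int threads) {
    // The queue is unbounded, so the pool never grows past its core size.
    return new ThreadPoolExecutor(
        threads, threads, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());
  }

  /** Queues one merge per size behind a blocked worker and returns the order they ran in. */
  static List<Long> runOrder(long... sizes) {
    ThreadPoolExecutor pool = newPool(1);
    List<Long> order = new CopyOnWriteArrayList<>();
    CountDownLatch gate = new CountDownLatch(1);
    try {
      // Occupy the only worker so the remaining tasks wait in the priority queue.
      pool.execute(new SizedMergeTask(0, () -> awaitQuietly(gate)));
      for (long size : sizes) {
        // execute(), not submit(): submit() wraps the task in a FutureTask,
        // which is not Comparable and would fail inside the priority queue.
        pool.execute(new SizedMergeTask(size, () -> order.add(size)));
      }
    } finally {
      gate.countDown();
      pool.shutdown();
    }
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return order;
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Ordering only applies to tasks that are waiting in the queue; a large merge that is already running is not preempted, so this alone does not guarantee that small merges start promptly.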

jpountz · 2025-07-06T21:13:56Z

@vigyasharma what would you think of the following:

Instead of making it a singleton, add the executor as a constructor argument so that the same executor can be used for indexing and merging? (plus add javadocs to recommend using a fixed thread pool and sharing it with indexing)
Tasks get wrapped when submitted to the executor, similarly to what TaskExecutor does with TaskExecutor.Task, to allow merges to run in the current thread if they don't get picked up by the executor in a timely fashion.
To mimic CMS, maybe this Task wrapper should keep polling pending merges that are below MIN_BIG_MERGE_MB=50MB without waiting for them to go through the executor queue so that they always run almost immediately? (this requires tracking pending merges in a separate queue)
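One way to read the "run in the current thread if not picked up in a timely fashion" suggestion is a wrapper that whichever thread gets there first can claim. The sketch below uses only JDK classes and is far simpler than Lucene's TaskExecutor; `CallerFallbackSketch`, `ClaimableTask`, `submitOrRun` and the timeout values are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; not taken from the PR or from Lucene's TaskExecutor. */
final class CallerFallbackSketch {

  /** Wraps a merge so that exactly one thread, a pool worker or the submitter, runs it. */
  static final class ClaimableTask implements Runnable {
    private final Runnable body;
    private final AtomicBoolean claimed = new AtomicBoolean();
    final CountDownLatch started = new CountDownLatch(1);

    ClaimableTask(Runnable body) {
      this.body = body;
    }

    @Override
    public void run() {
      if (claimed.compareAndSet(false, true)) { // the loser of this race does nothing
        started.countDown();
        body.run();
      }
    }
  }

  /** Hands the merge to the executor, but runs it here if no worker starts it in time. */
  static void submitOrRun(ExecutorService executor, Runnable merge, long timeoutMillis) {
    ClaimableTask task = new ClaimableTask(merge);
    executor.execute(task);
    try {
      if (!task.started.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
        task.run(); // a no-op if a worker claimed the task in the meantime
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Demo: reports whether the merge ended up running on the submitting thread. */
  static boolean ranOnCallerThread(boolean poolBusy) {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    CountDownLatch gate = new CountDownLatch(1);
    AtomicReference<Thread> runner = new AtomicReference<>();
    try {
      if (poolBusy) {
        pool.execute(() -> awaitQuietly(gate)); // occupy the only worker
      }
      submitOrRun(pool, () -> runner.set(Thread.currentThread()), poolBusy ? 50 : 5_000);
    } finally {
      gate.countDown();
      pool.shutdown();
    }
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return runner.get() == Thread.currentThread();
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

When the caller wins the race, the stale entry stays in the executor queue and turns into a no-op once a worker dequeues it. The third point in the comment (polling small pending merges from a separate queue) is not covered here.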

vigyasharma · 2025-07-07T19:11:59Z

@jpountz I feel the scheduler should have some control over how the executor is created, its backing queue, etc., so that it can prioritize how merges get scheduled, e.g. pick smaller merges before large ones, or have merges smaller than MIN_BIG_MERGE_MB go straight through without waiting. Otherwise, all the "merge scheduling" logic really gets offloaded to how the thread pool manages its task queue. That could be hard to get right, but maybe it's okay for expert users?

We could have the multi-tenant CMS accept a constructor argument for a custom MergeTasksExecutorService defined in Lucene. It will maintain a fixed thread pool, use a fixed size priority blocking queue that prefers small merges over big ones, and add custom logic to directly execute small merges, have bounded wait time on merges, use calling thread when needed etc. Something on the lines of Lucene's TaskExecutor that you mentioned. On similar lines, we can also create an IndexingMergeSharedExecutorService to share threads across indexing and merge tasks. We'll probably need to write a custom executor service; I'm not sure if wrapping an executor will give us the hooks we need.

vigyasharma · 2025-07-07T19:14:26Z

Based on all this, I think we get the following structure:

A merge scheduler that accepts an ExecutorService in its ctor. It will offload most of the scheduling / queueing / throttling logic to the executor service and its backing queue implementation. We will recommend using MergeTasksExecutorService or IndexingMergeSharedExecutorService but users are free to pass in their own executors.
A MergeTasksExecutorService which does CMS like scheduling but for merge tasks only. The fixed thread pool is shared across all submitted merges across all writers. This executor service ~~assumes~~ requires a separate indexing thread pool, and puts back-pressure on indexing when merges accumulate e.g. by making the merge run on the indexing thread which tries to submit it.
An IndexingMergeSharedExecutorService that can effectively handle sharing threads b/w indexing and merge tasks.

FWIW, I'm not really sure if sharing a thread pool b/w indexing and merging would be simpler than having separate thread pools and applying backpressure on indexing. I still need to grok all the details. But if we think backpressure is the way to go, and we'll always only use the MergeTasksExecutorService, then writing this executor as part of the scheduler and making it a singleton might make the code simpler?

OTOH, if we want a shared indexing/merging thread pool, or foresee a need to have different thread pools for sets of writers (shards in Elasticsearch / OpenSearch), then the ctor arg for executor service makes sense.
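For the back-pressure variant described above (a merge runs on the indexing thread that tries to submit it once merges accumulate), the JDK already offers a close analogue: a bounded queue combined with `ThreadPoolExecutor.CallerRunsPolicy`. The sketch below is an illustration under that assumption; `BackPressureSketch` and its methods are hypothetical, and the proposed MergeTasksExecutorService does not exist in Lucene.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of back-pressure through a bounded queue; not code from the PR. */
final class BackPressureSketch {

  /** Fixed-size merge pool: when the bounded queue is full, the submitter runs the task. */
  static ThreadPoolExecutor newMergePool(int threads, int queueCapacity) {
    return new ThreadPoolExecutor(
        threads, threads, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(queueCapacity),
        new ThreadPoolExecutor.CallerRunsPolicy());
  }

  /** Demo: submits slow merges and counts how many the submitting thread had to run itself. */
  static int countRunBySubmitter(int threads, int queueCapacity, int merges) {
    ThreadPoolExecutor pool = newMergePool(threads, queueCapacity);
    CountDownLatch gate = new CountDownLatch(1);
    AtomicInteger bySubmitter = new AtomicInteger();
    Thread submitter = Thread.currentThread();
    try {
      for (int i = 0; i < merges; i++) {
        pool.execute(() -> {
          if (Thread.currentThread() == submitter) {
            bySubmitter.incrementAndGet(); // back-pressure: the submitter pays for the merge
          } else {
            awaitQuietly(gate); // stand-in for a long-running merge on a pool thread
          }
        });
      }
    } finally {
      gate.countDown();
      pool.shutdown();
    }
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return bySubmitter.get();
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With one worker and a queue of two, the first merge occupies the worker, the next two wait in the queue, and every further submission runs on the submitting thread until the pool drains. A plain bounded FIFO queue gives up the small-merges-first ordering, so a real implementation would need to combine both ideas.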

vigyasharma · 2025-07-14T19:56:44Z

lucene/core/src/java/org/apache/lucene/index/MergeTaskWrapper.java

+    public long getMergeSize() {
+        return mergeSize;
+    }
+}


nit: we keep an empty line at the end of file. There might be an editor/formatter setting you want to add that brings it in automatically.

vigyasharma · 2025-07-14T19:58:16Z

lucene/core/src/java/org/apache/lucene/index/MergeTaskWrapper.java

+
+public class MergeTaskWrapper {
+    private final Runnable mergeTask;
+    private final Object sourceWriter; // Can be IndexWriter or its ID


You can directly use IndexWriter or String for ID instead of using Object. Keeping the type as specific as possible helps catch a lot of bugs before runtime.

vigyasharma · 2025-07-14T20:01:35Z

lucene/core/src/java/org/apache/lucene/index/SharedMergeScheduler.java

+   * version, the shared executor should be properly shut down.
+   */
+  @Override
+  public void close() {


This impl. for close() will shut down the entire executor. Since this is a shared merge scheduler, you only want to address merges related to the closing IndexWriter, like cancelling all pending merges and waiting for running merges to complete. (See close() in CMS as well).

vigyasharma · 2025-07-14T20:02:27Z

lucene/core/src/java/org/apache/lucene/index/MergeTaskWrapper.java

@@ -0,0 +1,25 @@
+package org.apache.lucene.index;
+
+public class MergeTaskWrapper {


I'm curious how we intend to use this wrapper, looking forward to the next iteration of this PR.

jpountz · 2025-07-17T15:57:34Z

> FWIW, I'm not really sure if sharing a thread pool b/w indexing and merging would be simpler than having separate thread pools and applying backpressure on indexing

To me the reason for sharing the thread pools is not to make things simpler but rather to make it easier to control the overall number of active indexing+merging threads. For CPU-bound workloads (which indexing tends to be), the best approach is to create a thread pool sized based on the number of cores of the machine. You can't do this if indexing and merging don't use the same thread pool.
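The sizing argument can be illustrated with a single JDK pool that both kinds of work go through, so the number of active threads never exceeds the pool size. This is a sketch only; `SharedPoolSketch` and its methods are hypothetical, and the tasks are placeholders rather than real indexing or merge work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: one pool shared by indexing and merging caps total active threads. */
final class SharedPoolSketch {

  /** One pool for both kinds of work, sized to the number of available cores. */
  static ExecutorService newSharedPool() {
    return Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
  }

  /** Demo: runs both kinds of task on one pool and returns the peak concurrency observed. */
  static int peakConcurrency(int poolSize, int indexingTasks, int mergeTasks) {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    AtomicInteger active = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    Runnable work = () -> {
      int now = active.incrementAndGet();
      peak.accumulateAndGet(now, Math::max);
      try {
        Thread.sleep(5); // placeholder for indexing or merging work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      active.decrementAndGet();
    };
    try {
      for (int i = 0; i < indexingTasks + mergeTasks; i++) {
        pool.execute(work);
      }
    } finally {
      pool.shutdown();
    }
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return peak.get();
  }
}
```

With two separate pools of sizes m and n, the same measurement could reach m + n, which is the oversubscription the comment is concerned about.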

vigyasharma · 2025-07-24T07:33:37Z

lucene/core/src/java/org/apache/lucene/index/SharedMergeScheduler.java

+   */
+  @Override
+  public void merge(MergeSource mergeSource, MergeTrigger trigger) throws IOException {
+    while (true) {


Can we use hasPendingMerges() instead?

vigyasharma · 2025-07-24T07:36:39Z

lucene/core/src/java/org/apache/lucene/index/SharedMergeScheduler.java

+      MergeTaskWrapper wrappedTask = new MergeTaskWrapper(mergeRunnable, (IndexWriter) mergeSource, merge.totalBytesSize());
+
+      // Registering this task under the writer 
+      writerToMerges.computeIfAbsent((IndexWriter) mergeSource, k -> new CopyOnWriteArraySet<>()).add(wrappedTask);


I don't think you can cast mergeSource as an IndexWriter. However, you might not need to. This CMS is no longer a singleton, only the executor is shared. The CMS is actually owned by an indexWriter, which means all calls to merge() come from the same IW.

So instead of a writer -> merges mapping that you'd need in a singleton, all you need here is the set of pending and running merges submitted to this CMS. They are all from the same writer! When writer closes, instead of shutting down the executor, you could update the merge objects in the set (MergeTaskWrappers), and set an aborted flag. And update your runnable to skip the merge if "aborted" flag has been set.
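The per-writer bookkeeping described above might look roughly like the following. It is a JDK-only sketch under stated assumptions: `PerWriterSchedulerSketch` and `TrackedMerge` are hypothetical stand-ins for the scheduler and MergeTaskWrapper, and a three-state field replaces the plain "aborted" flag so that close() can tell pending merges from running ones.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of a per-writer scheduler on top of a shared executor. */
final class PerWriterSchedulerSketch {

  enum State { PENDING, RUNNING, ABORTED }

  /** Stand-in for the PR's MergeTaskWrapper: a merge that can be aborted before it starts. */
  static final class TrackedMerge implements Runnable {
    private final Runnable body;
    private final AtomicReference<State> state = new AtomicReference<>(State.PENDING);
    private final CountDownLatch finished = new CountDownLatch(1);

    TrackedMerge(Runnable body) {
      this.body = body;
    }

    @Override
    public void run() {
      if (!state.compareAndSet(State.PENDING, State.RUNNING)) {
        return; // aborted before a worker picked it up: skip the merge
      }
      try {
        body.run();
      } finally {
        finished.countDown();
      }
    }

    boolean abortIfPending() {
      return state.compareAndSet(State.PENDING, State.ABORTED);
    }
  }

  private final ExecutorService sharedExecutor; // owned by someone else, never shut down here
  private final Set<TrackedMerge> merges = ConcurrentHashMap.newKeySet();

  PerWriterSchedulerSketch(ExecutorService sharedExecutor) {
    this.sharedExecutor = sharedExecutor;
  }

  void submit(Runnable mergeBody) {
    TrackedMerge merge = new TrackedMerge(mergeBody);
    merges.add(merge);
    sharedExecutor.execute(() -> {
      try {
        merge.run();
      } finally {
        merges.remove(merge);
      }
    });
  }

  /** Aborts this writer's pending merges and waits for its running ones. */
  void close() {
    for (TrackedMerge merge : merges) {
      if (!merge.abortIfPending()) {
        awaitQuietly(merge.finished); // already running (or done): wait for it
      }
    }
    merges.clear();
  }

  /** Demo: number of this writer's merges that still ran after close(), or -1 on error. */
  static int demoMergesRunAfterClose() {
    ExecutorService shared = Executors.newFixedThreadPool(1);
    CountDownLatch gate = new CountDownLatch(1);
    AtomicInteger ran = new AtomicInteger();
    shared.execute(() -> awaitQuietly(gate)); // another writer's merge occupies the pool
    PerWriterSchedulerSketch scheduler = new PerWriterSchedulerSketch(shared);
    for (int i = 0; i < 3; i++) {
      scheduler.submit(ran::incrementAndGet);
    }
    scheduler.close();
    boolean executorStillOpen = !shared.isShutdown();
    gate.countDown();
    shared.shutdown(); // only the owner of the shared pool shuts it down
    try {
      shared.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return executorStillOpen ? ran.get() : -1;
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

Aborted entries remain in the executor queue and become no-ops when dequeued, which keeps close() from having to remove tasks from a queue it does not own.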

Add SharedMergeScheduler with shared thread pool for multi-tenant merge scheduling

d9e83b9

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking Jul 5, 2025

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking Jul 5, 2025

github-actions bot added the module:core/index label Jul 5, 2025

Fix formatting using gradlew tidy

04c76cf

Add executor shutdown handling for SharedMergeScheduler

01830ba

Refactor SharedMergeScheduler: pass ExecutorService via constructor instead of static field

aa7e95e

Add MergeTaskWrapper as preparation for executor task wrapping

ca62b83

Integrate MergeTaskWrapper into SharedMergeScheduler for merge source and size tracking

4188bf4

vigyasharma reviewed Jul 14, 2025

Refactor: Pass executor via constructor, update MergeTaskWrapper to use IndexWriter

cf9c939

Integrate MergeTaskWrapper with task tracking and cleanup logic

d40057c

vigyasharma mentioned this pull request Jul 21, 2025

[Draft] Multi-Tenant CMS Manager #14953

Draft

vigyasharma reviewed Jul 24, 2025
