Draft: PIT context relocation on shard relocation #132251


Draft · cbuescher wants to merge 11 commits into main from pit-relocation-poc

Conversation

cbuescher (Member):

WIP work to support PIT context relocation when nodes gracefully shut down and shards are relocated.

@cbuescher added the labels WIP, Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch), and v9.2.0 on Jul 31, 2025
@cbuescher marked this pull request as draft on July 31, 2025 09:18
@elasticsearchmachine added the serverless-linked label (Added by automation, don't add manually) on Jul 31, 2025
@cbuescher force-pushed the pit-relocation-poc branch from 894b79f to 18391a3 on July 31, 2025 21:32
[CI] Auto commit changes from spotless
@cbuescher force-pushed the pit-relocation-poc branch from 4ac74d5 to fccadb1 on August 1, 2025 09:40
if (node != null) {
    targetNodes = Collections.singleton(node);
} else {
    staticLogger.info("---> missing node when closing context: " + contextId.getNode());
cbuescher (Member, Author) commented:

Note: we need to close the contexts after moving them when the "old" PIT id is used, so if the originally encoded node is gone we try all remaining nodes that currently hold that shard here (regardless of whether that node also holds a PIT context).
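A minimal sketch of that fallback, assuming access to the cluster state (PitCloseFallbackSketch and resolveTargetNodes are hypothetical names; the routing-table and node-existence lookups are existing Elasticsearch APIs, but the real change wires this logic into ClearScrollController rather than a free-standing helper):

import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.routing.ShardRouting;
import org.elasticsearch.index.shard.ShardId;

class PitCloseFallbackSketch {
    // Prefer the node that was encoded in the PIT id; if it has left the cluster,
    // fall back to every node that currently holds an active copy of the shard.
    static Set<String> resolveTargetNodes(ClusterState state, ShardId shardId, String encodedNodeId) {
        if (encodedNodeId != null && state.nodes().nodeExists(encodedNodeId)) {
            return Collections.singleton(encodedNodeId);
        }
        Set<String> candidates = new HashSet<>();
        for (ShardRouting shardRouting : state.routingTable().shardRoutingTable(shardId).activeShards()) {
            candidates.add(shardRouting.currentNodeId());
        }
        return candidates;
    }
}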

import java.util.Collections;
import java.util.Map;

public class PITHelper {
cbuescher (Member, Author) commented:

just for debugging atm

}

@Override
protected void doExecute(Task task, ClosePointInTimeRequest request, ActionListener<ClosePointInTimeResponse> listener) {
    final SearchContextId searchContextId = SearchContextId.decode(namedWriteableRegistry, request.getId());
-   final Collection<SearchContextIdForNode> contextIds = searchContextId.shards().values();
+   Map<ShardId, SearchContextIdForNode> shards = searchContextId.shards();
cbuescher (Member, Author) commented:

See the changes above in ClearScrollController. We also need to pass in the shard ids now so we can retry if the original node is gone.

        targetNodes.add(perNode.getNode());
    }
-   if (perNode.getSearchContextId().getSearcherId() != null) {
+   if (perNode.getSearchContextId().getSearcherId() != null || nodeExists == false) {
cbuescher (Member, Author) commented:

This is where we now retry other shard copies when the original PIT node is gone. Trying every node that holds a shard copy might be too much in the long run, but without a cluster-wide service that keeps track of where the PIT contexts live this might be unavoidable. Needs follow-up to rewrite the PIT id once we have found the new node where the PIT context lives.

@@ -360,7 +372,7 @@ public class SearchService extends AbstractLifecycleComponent implements IndexEv

    private final AtomicLong idGenerator = new AtomicLong();

-   private final Map<Long, ReaderContext> activeReaders = ConcurrentCollections.newConcurrentMapWithAggressiveConcurrency();
+   private final Map<ReaderContextId, ReaderContext> activeReaders = ConcurrentCollections.newConcurrentMapWithAggressiveConcurrency();
cbuescher (Member, Author) commented:

Currently we keep track of the readers in this map only by an auto-incremented long counter, which is also encoded in the PIT id to look up the right ReaderContext. We cannot simply move the ReaderContext to another node using only that id, because it may conflict with existing contexts on that node. But we also cannot change it, because it's encoded in the PIT id (via the ShardSearchContextIds). I changed the key to a new joint key that also includes the original sessionId encoded in the PIT, both for retrieval on the new node and to avoid id collisions.
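A minimal sketch of what such a joint key could look like (ReaderContextId is the name from the diff above; the record shape is an assumption):

// Joint key for the activeReaders map: the session id of the node that originally
// created the context plus the auto-incremented long id. Two nodes can hand out
// the same long, but the (sessionId, id) pair stays unique after relocation;
// the record supplies the equals/hashCode needed for use as a map key.
record ReaderContextId(String sessionId, long id) {}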

return this.activeReaders.values()
    .stream()
    .filter(c -> c.singleSession() == false)
    .filter(c -> c.scrollContext() == null)
cbuescher (Member, Author) commented:

Exclude single-session ReaderContexts because they get cleared after each search; also don't move anything that is a scroll context.

    .collect(Collectors.toList());
}

public void reopenPitContexts(ShardId shardId, String segmentsFileName, long keepAlive, String sessionId, long contextId) {
cbuescher (Member, Author) commented:

This whole part is still pretty rough; I mostly wanted something working on the receiving side of the PIT relocation that makes the integration test pass. There are probably many open questions here.
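For context, a standalone Lucene sketch of the first step such a method needs: locating the transferred commit by its segments file name and opening a reader pinned to exactly that commit. openReaderForCommit is a hypothetical helper operating on a bare Directory; the real code would go through the shard's Engine:

import java.io.IOException;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.store.Directory;

class PitReopenSketch {
    // Scan the live commits for the one whose segments file matches the name that
    // was shipped along with the relocated PIT context, and open a reader on it.
    static DirectoryReader openReaderForCommit(Directory dir, String segmentsFileName) throws IOException {
        for (IndexCommit commit : DirectoryReader.listCommits(dir)) {
            if (segmentsFileName.equals(commit.getSegmentsFileName())) {
                return DirectoryReader.open(commit);
            }
        }
        throw new IllegalStateException("commit [" + segmentsFileName + "] not found; it may already have been deleted");
    }
}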

fcofdez (Contributor) commented:

I wonder if we could move part of this logic into the Engine? This would allow us to track the necessary metadata to keep blobs around (and it would throw if we ever try to call this in an Engine that doesn't support transferring PIT contexts).
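A minimal standalone sketch of that shape (not the real org.elasticsearch.index.engine.Engine; the method name and signature are assumptions): engines that cannot transfer PIT contexts would simply inherit a throwing default.

abstract class EngineSketch {
    // Default: transferring PIT contexts is unsupported. Engine implementations
    // that can keep the underlying files/blobs alive would override this and
    // track the metadata needed to retain them for the keep-alive window.
    public void reopenPitContext(String segmentsFileName, long keepAlive, String sessionId, long contextId) {
        throw new UnsupportedOperationException(getClass().getName() + " does not support transferring PIT contexts");
    }
}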

@fcofdez (Contributor) left a comment:

Looks in the right direction, I left a suggestion.
