[Persistent Tasks] Assign based on ProjectId #130391

prwhelan · 2025-07-01T13:52:59Z

Pass the ProjectId from PersistentTaskClusterService through to all PersistentTasksExecutors when creating node assignments.

These PersistentTasksExecutors require the ProjectId during node assignment:

- OpenJobPersistentTasksExecutor
- SnapshotUpgradeTaskExecutor
- StartDatafeedPersistentTasksExecutor
- TransformPersistentTasksExecutor

Implemented TransformPersistentTasksExecutor's getAssignment using the ProjectId by reading the TransformMetadata and RoutingTable from the ProjectMetadata.

elasticsearchmachine · 2025-07-03T17:57:10Z

Pinging @elastic/ml-core (Team:ML)

nielsbauman · 2025-07-07T14:56:46Z

server/src/main/java/org/elasticsearch/health/node/selection/HealthNodeTaskExecutor.java

-        ClusterState clusterState
+        ClusterState clusterState,
+        @Nullable ProjectId projectId


We'll probably want to split this method up into a cluster-scoped version and a project-scoped version. The health node persistent task is cluster-scoped, so it doesn't really make sense to have a project ID here or in other cluster-scoped persistent tasks (even though it's nullable). Let me know if "cluster-scoped vs. project-scoped persistent tasks" sounds unfamiliar to you, then I (or Yang) can explain what they are. But I'll let @ywangd decide whether he agrees or whether he's fine with the nullable project ID like this.

If we decide to split it up, we can have one method without a project ID (for cluster-scoped tasks) and one with a ProjectState (instead of a ClusterState and ProjectId). We created ProjectState to avoid passing cluster states together with project IDs.

I'm happy to rework this, I'm not thrilled about passing nulls around. I'd be happy to instead add a method to the parent, something like:

public Assignment getAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ProjectState projectState) {

and then PersistentTasksClusterService can call the ProjectState or ClusterState API depending on the scope of the persistent task? I'm not sure if that is cleaner for the persistent task framework at the base level but it feels cleaner for the implementations.

99% of this PR was written by IntelliJ's refactor button so we'd only be throwing away minutes of work =)

Yep, something like that is what I had in mind as well (perhaps with a different name for clarity, i.e. getProjectScopedAssignment and getClusterScopedAssignment, but that wouldn't be a blocker for me). Curious to hear what Yang thinks of all this.

Yeah we can go with two separate methods. The PersistentTasksExecutor#scope method can be used to tell the scope of the executor and subsequently call the relevant getXxxAssignment method.
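
A minimal sketch of the scope-based dispatch described above. The types here (`ClusterStateStub`, `ProjectStateStub`, `ExecutorStub`, `AssignmentDispatcher`) are hypothetical stand-ins rather than the real Elasticsearch classes, and the throwing defaults are an assumption made for illustration; only the method names follow the naming suggested in this thread.

```java
import java.util.List;

// Hypothetical stand-ins for the real Elasticsearch types; illustration only.
enum Scope { CLUSTER, PROJECT }

record ClusterStateStub(String name) {
    // Derives a project view from the cluster state, as the thread describes for ProjectState.
    ProjectStateStub projectState(String projectId) {
        return new ProjectStateStub(this, projectId);
    }
}

record ProjectStateStub(ClusterStateStub cluster, String projectId) {}

abstract class ExecutorStub {
    abstract Scope scope();

    // Assumed defaults: throw so that a call to the wrong variant is visible.
    String getClusterScopedAssignment(List<String> candidateNodes, ClusterStateStub state) {
        throw new UnsupportedOperationException("not a cluster-scoped executor");
    }

    String getProjectScopedAssignment(List<String> candidateNodes, ProjectStateStub state) {
        throw new UnsupportedOperationException("not a project-scoped executor");
    }
}

final class HealthLikeExecutor extends ExecutorStub {
    @Override Scope scope() { return Scope.CLUSTER; }

    @Override String getClusterScopedAssignment(List<String> candidateNodes, ClusterStateStub state) {
        return candidateNodes.get(0) + " for cluster " + state.name();
    }
}

final class TransformLikeExecutor extends ExecutorStub {
    @Override Scope scope() { return Scope.PROJECT; }

    @Override String getProjectScopedAssignment(List<String> candidateNodes, ProjectStateStub state) {
        return candidateNodes.get(0) + " for project " + state.projectId();
    }
}

final class AssignmentDispatcher {
    // The service picks the variant from the executor's declared scope.
    static String assign(ExecutorStub executor, List<String> nodes, ClusterStateStub state, String projectId) {
        if (executor.scope() == Scope.PROJECT) {
            return executor.getProjectScopedAssignment(nodes, state.projectState(projectId));
        }
        return executor.getClusterScopedAssignment(nodes, state);
    }
}
```

With this shape the caller, not the executor, decides which method runs, so the executor's `scope()` and its overridden method have to agree.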

Theoretically we can have a single overridden generic method for project- and cluster-scoped task executors, if ProjectState and ClusterState share some interface, e.g. Supplier<ClusterState>. It should help reduce the verbosity of the types. We will still need to check the task executor types and pass either ClusterState or ProjectState to the method accordingly. This might be something worth doing in future since it feels like a better type system. But it is definitely outside of this PR.
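
One way the shared-interface idea could look, purely as a sketch: every type below is a hypothetical stand-in, and the real ClusterState and ProjectState are not claimed to implement `Supplier`. The executor is parameterised on the state type it accepts, so each subclass overrides a single method.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins; the real classes are not claimed to implement Supplier.
record ClusterStateStub(String name) implements Supplier<ClusterStateStub> {
    @Override public ClusterStateStub get() { return this; }
}

record ProjectStateStub(ClusterStateStub cluster, String projectId) implements Supplier<ClusterStateStub> {
    @Override public ClusterStateStub get() { return cluster; }
}

// A single overridable method, typed by the kind of state the executor works on.
abstract class GenericExecutor<S extends Supplier<ClusterStateStub>> {
    abstract String getAssignment(S state);
}

final class HealthLikeExecutor extends GenericExecutor<ClusterStateStub> {
    @Override String getAssignment(ClusterStateStub state) {
        return "cluster " + state.name();
    }
}

final class TransformLikeExecutor extends GenericExecutor<ProjectStateStub> {
    @Override String getAssignment(ProjectStateStub state) {
        return "project " + state.projectId() + " in " + state.get().name();
    }
}
```

As noted above, the caller would still have to inspect the executor and hand it the matching state type.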

ywangd · 2025-07-11T04:16:40Z

I am very sorry to say this. But I changed my mind after seeing the changes for two separate methods. I now prefer the original approach with the nullable ProjectId parameter.

The main issue with two separate methods is that we have no good way to ensure subclasses implement the right method. That is, if a project-scoped task executor implements getClusterScopedAssignment, it is going to be silently skipped, and there is no good way to detect this error. A subclass can still do the wrong thing if we have a single method with a nullable ProjectId. But at least the wrong method will always be exercised, and the error is likely easier to spot. It is not a big concern right now since the methods are called in only a few places. But the usages can expand and so can the concern.
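
The failure mode can be sketched as follows. The classes (`TwoMethodExecutor`, `MisconfiguredExecutor`, `SilentSkipDemo`) and the "unassigned" default are hypothetical and chosen only to make the problem visible; they do not come from the Elasticsearch code base.

```java
// Hypothetical stand-ins; illustration only.
abstract class TwoMethodExecutor {
    static final String NO_NODE = "unassigned";

    abstract boolean isProjectScoped();

    // Both variants have a default, so overriding the wrong one still compiles.
    String getClusterScopedAssignment(String clusterState) { return NO_NODE; }

    String getProjectScopedAssignment(String projectState) { return NO_NODE; }
}

// Declares project scope but overrides the cluster-scoped variant by mistake.
final class MisconfiguredExecutor extends TwoMethodExecutor {
    @Override boolean isProjectScoped() { return true; }

    @Override String getClusterScopedAssignment(String clusterState) { return "node-1"; }
}

final class SilentSkipDemo {
    static String assign(TwoMethodExecutor executor, String clusterState, String projectState) {
        // The caller dispatches on scope, so the mistaken override is never reached.
        return executor.isProjectScoped()
            ? executor.getProjectScopedAssignment(projectState)
            : executor.getClusterScopedAssignment(clusterState);
    }

    public static void main(String[] args) {
        // Prints "unassigned": the override returning "node-1" is never called.
        System.out.println(assign(new MisconfiguredExecutor(), "cluster", "project"));
    }
}
```

Nothing fails at compile time or at run time here; the task simply never gets a node, which is why the mistake is hard to notice.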

The other good thing is that we can add meaningful assertions in the superclass, like the following, since the executor itself knows its scope:

    // final so not overridable and enforces the assertion
    public final Assignment getAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ClusterState clusterState, @Nullable ProjectId projectId) {
        assert (scope() == Scope.PROJECT && projectId != null)
            || (scope() == Scope.CLUSTER && projectId == null);
        return doGetAssignment(params, candidateNodes, clusterState, projectId);
    }

    // Overridable by subclasses
    public Assignment doGetAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ClusterState clusterState, @Nullable ProjectId projectId) {
        ...
    }

In other places, e.g. PersistentTasksClusterService, we did prefer having separate project- and cluster-scoped methods. But those methods are not meant to be overridden, so the service class itself has tight control, and incorrect usage will be detected due to mismatching executor scope and project context. I think the situation here is different: we have a single superclass, but with two separate base methods it pretends to be two separate superclasses when it comes to overriding. So I think it is better to stick with one method. In future, if we introduce different base classes for different executor scopes as commented earlier, that would be a better time to get rid of the nullable ProjectId.

Apologies for the back and forth. I am happy to hear your thoughts. If we do agree to change it back, I hope it is not too much other than reverting this commit (6bceb18). Thank you! 🙏

prwhelan · 2025-07-11T13:36:08Z

Yeah, that makes sense to me. I briefly tried having an intermediary subclass of PersistentTasksExecutor that replaces getAssignment with the project-scoped variant, but I thought that may create too many layers of abstraction. Basically:

class PersistentTasksExecutor {
   // final so not overridable and enforces the assertion
   public final Assignment getAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ClusterState clusterState, @Nullable ProjectId projectId) {
       assert (scope() == Scope.PROJECT && projectId != null)
           || (scope() == Scope.CLUSTER && projectId == null);
       return doGetAssignment(params, candidateNodes, clusterState, projectId);
   }

   // Overridable by subclasses
   public Assignment doGetAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ClusterState clusterState, @Nullable ProjectId projectId) {
       ...
   }
}

// abstract because it declares the abstract ProjectState overload below
abstract class ProjectScopedPersistentTasksExecutor extends PersistentTasksExecutor {
   // final so project-scoped subclasses only implement the ProjectState overload
   @Override
   public final Assignment doGetAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ClusterState clusterState, @Nullable ProjectId projectId) {
       assert (scope() == Scope.PROJECT && projectId != null);
       return doGetAssignment(params, candidateNodes, clusterState.projectState(projectId));
   }

   public abstract Assignment doGetAssignment(Params params, Collection<DiscoveryNode> candidateNodes, ProjectState projectState);
}

ywangd

LGTM

Thanks a lot for the iterations! 👍

ywangd · 2025-07-14T03:23:21Z

server/src/main/java/org/elasticsearch/persistent/PersistentTasksExecutor.java

+     * If {@link #scope()} returns CLUSTER, then {@link ProjectId} will be null.
+     * If {@link #scope()} returns PROJECT, then {@link ProjectId} will not be null.
+     */
+    public Assignment doGetAssignment(


I think we should make this method and all its overridden versions protected. It is meant to be overridden and called by only the subclasses. External callers should use getAssignment which is final and enforces the consistency check.
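
A minimal sketch of the visibility arrangement suggested above, using simplified stand-in types instead of the real PersistentTasksExecutor, ClusterState, and ProjectId. It throws an explicit exception where the PR uses `assert`, purely so the check fires without the `-ea` JVM flag; that substitution is an assumption for illustration. Note that `protected` in Java still permits calls from the same package, so it narrows outside access rather than eliminating it.

```java
// Hypothetical stand-ins; illustration only.
enum Scope { CLUSTER, PROJECT }

abstract class ScopedExecutor {
    protected abstract Scope scope();

    // Public entry point: final, so subclasses cannot bypass the consistency check.
    public final String getAssignment(String clusterState, String projectId) {
        boolean consistent = (scope() == Scope.PROJECT && projectId != null)
            || (scope() == Scope.CLUSTER && projectId == null);
        if (consistent == false) {
            throw new IllegalArgumentException("scope " + scope() + " does not match projectId " + projectId);
        }
        return doGetAssignment(clusterState, projectId);
    }

    // Extension point: protected, meant to be overridden rather than called by outside code.
    protected abstract String doGetAssignment(String clusterState, String projectId);
}

final class ProjectExecutor extends ScopedExecutor {
    @Override protected Scope scope() { return Scope.PROJECT; }

    // The override keeps the protected modifier instead of widening it to public.
    @Override protected String doGetAssignment(String clusterState, String projectId) {
        return "node for " + projectId;
    }
}

final class VisibilityDemo {
    public static void main(String[] args) {
        ScopedExecutor executor = new ProjectExecutor();
        System.out.println(executor.getAssignment("cluster", "project-a"));
        try {
            // A project-scoped executor called without a project ID is rejected.
            executor.getAssignment("cluster", null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Java allows an override to widen `protected` to `public`, which is why each overridden version needs the modifier changed as well as the base method.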

"and all its overridden versions"

I think you still need to update the overridden versions; you only updated the base method.

nielsbauman

I left one comment on an existing thread, other than that LGTM. Thanks a lot for the iterations, @prwhelan!

prwhelan added >refactoring :ml Machine learning Team:ML Meta label for the ML team v9.2.0 labels Jul 1, 2025

prwhelan added 3 commits July 2, 2025 06:58

Merge branch 'main' into mp/1

f8782f3

Merge branch 'main' into mp/1

c16f55d

Decoupling transform changes since serverless will depend on these changes, I don't want to have to revert broken transform stuff and affect serverless

44e3b7c

prwhelan changed the title ~~[Transform] Assign based on ProjectId~~ [Persistent Tasks] Assign based on ProjectId Jul 3, 2025

elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Jul 3, 2025

Merge branch 'main' into mp/1

e37d6f7

prwhelan marked this pull request as ready for review July 3, 2025 17:56

nielsbauman reviewed Jul 7, 2025

Scope methods to Project and Cluster

6bceb18

prwhelan changed the title ~~[Persistent Tasks] Assign based on ProjectId~~ [Persistent Tasks] Assign based on ProjectState Jul 10, 2025

Merge branch 'main' into mp/1

2b0146b

prwhelan added 4 commits July 11, 2025 09:36

Revert "Scope methods to Project and Cluster"

32597e4

This reverts commit 6bceb18.

Assert non-null projectid for project-scoped executors

b38a51b

Fix tests with new assertion

7daaaef

Point to correct super method

8dfe68b

prwhelan changed the title ~~[Persistent Tasks] Assign based on ProjectState~~ [Persistent Tasks] Assign based on ProjectId Jul 11, 2025

Merge branch 'main' into mp/1

6624a94

ywangd approved these changes Jul 14, 2025

prwhelan added 2 commits July 14, 2025 08:02

Update PersistentTasksExecutor.java

d51ad0c

Merge branch 'main' into mp/1

0bd72db

nielsbauman approved these changes Jul 14, 2025
