lkts
Contributor

@lkts lkts commented Jul 28, 2025

This PR adjusts the routing logic for the search, refresh, and flush APIs to account for shards that may not be usable for these operations, depending on the state of the index's resharding metadata. Note that GET is out of scope for this PR (index, update, and delete operations were covered previously).

@elasticsearchmachine elasticsearchmachine added the v9.2.0 and serverless-linked labels Jul 28, 2025
@lkts lkts force-pushed the filter_not_ready_split_shards_from_search branch from 6c1ee55 to 4f9584f Compare July 28, 2025 20:19
for (int i = 0; i < indexRoutingTable.size(); i++) {
    shardIds.add(indexRoutingTable.shard(i).shardId());
}
Iterator<IndexShardRoutingTable> iterator = operationRouting.allWritableShards(projectState, index);
Contributor Author

This handles refresh and flush APIs.

@lkts lkts added the >non-issue, :Distributed Indexing/Distributed, and :Search Foundations/Search labels Jul 30, 2025
@lkts lkts requested review from ankikuma and bcully July 30, 2025 20:48
@lkts lkts marked this pull request as ready for review July 30, 2025 20:48
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations and Team:Distributed Indexing labels Jul 30, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@bcully
Contributor

bcully commented Jul 30, 2025

Would it be possible to unit test these routing changes at the level of IndexRoutingTests or similar? For example, manually create some resharding metadata and verify how it affects the shard ids?
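
For illustration only, a unit-level check of the kind asked for here might look roughly like the sketch below. It does not use the real Elasticsearch classes: `SplitMetadata`, `shardIdsExcludingTargetsBefore`, and the ordering of `TargetShardState` (`CLONE`, `HANDOFF`, `SPLIT`, `DONE`) are assumptions made for this example. The idea is to build resharding metadata by hand for a 2 to 4 split and assert which shard ids remain after filtering.

```java
import java.util.ArrayList;
import java.util.List;

public class ReshardingRoutingCheckSketch {
    // Assumed ordering of split target states (earlier constant = earlier phase).
    enum TargetShardState { CLONE, HANDOFF, SPLIT, DONE }

    /** Hypothetical metadata: shards [0, sourceShards) are sources, followed by one target per state entry. */
    record SplitMetadata(int sourceShards, List<TargetShardState> targetStates) {}

    /** Ids of all shards except split targets whose state is before the given threshold. */
    static List<Integer> shardIdsExcludingTargetsBefore(SplitMetadata split, TargetShardState threshold) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < split.sourceShards(); i++) {
            ids.add(i); // source shards are always included
        }
        for (int t = 0; t < split.targetStates().size(); t++) {
            if (split.targetStates().get(t).compareTo(threshold) >= 0) {
                ids.add(split.sourceShards() + t);
            }
        }
        return ids;
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        // 2 -> 4 split: target shard 2 is still in CLONE, target shard 3 has reached HANDOFF.
        SplitMetadata split = new SplitMetadata(2, List.of(TargetShardState.CLONE, TargetShardState.HANDOFF));

        check(shardIdsExcludingTargetsBefore(split, TargetShardState.HANDOFF).equals(List.of(0, 1, 3)),
            "targets before HANDOFF should be filtered out");
        check(shardIdsExcludingTargetsBefore(split, TargetShardState.SPLIT).equals(List.of(0, 1)),
            "no target has reached SPLIT yet");
        check(shardIdsExcludingTargetsBefore(split, TargetShardState.CLONE).equals(List.of(0, 1, 2, 3)),
            "a CLONE threshold keeps every shard");
        System.out.println("all checks passed");
    }
}
```

A real test would construct the index's actual resharding metadata and go through the routing code instead of this stand-in filter.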

Contributor

@bcully bcully left a comment

Looks pretty good, I'm just wondering if we can add some unit-level tests of the routing changes.

@@ -99,7 +101,8 @@ public void accept(ActionListener<Response> listener) {

final ClusterState clusterState = clusterService.state();
final ProjectMetadata project = projectResolver.getProjectMetadata(clusterState);
Contributor

nit: I don't know how expensive a projectResolver is, but perhaps it's a little cheaper to resolve once and then retrieve ProjectMetadata from projectState.metadata()?

* Returns an iterator of shards of the index that are ready to execute write requests.
* A shard may not be ready to execute these operations during processes like resharding.
*/
private static Iterator<IndexShardRoutingTable> allShardsReadyForWrites(ProjectState projectState, String index) {
Contributor

I'm struggling a bit with naming. I worry that Ready could be confused with a shard that's in recovery. Such a shard might refuse operations but operations should be routed there anyway, because it's the owning shard of a request. Maybe allWriteAddressableShards?

Contributor Author

I like that!
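
To make the naming distinction concrete, here is a minimal sketch with hypothetical stand-in types (not the real routing classes). It assumes the split target states are ordered `CLONE`, `HANDOFF`, `SPLIT`, `DONE`. A shard in recovery stays write addressable because it still owns its requests, while a split target that has not reached `HANDOFF` is left out.

```java
import java.util.List;
import java.util.Optional;

public class WriteAddressableSketch {
    // Assumed ordering of split target states (earlier constant = earlier phase).
    enum TargetShardState { CLONE, HANDOFF, SPLIT, DONE }

    /**
     * Hypothetical shard view: splitState is empty for a shard that is not a split target,
     * and the recovering flag is independent of resharding.
     */
    record Shard(int id, boolean recovering, Optional<TargetShardState> splitState) {}

    /** A shard is write addressable unless it is a split target that has not yet reached HANDOFF. */
    static boolean isWriteAddressable(Shard shard) {
        return shard.splitState()
            .map(state -> state.compareTo(TargetShardState.HANDOFF) >= 0)
            .orElse(true);
    }

    static List<Integer> allWriteAddressableShards(List<Shard> shards) {
        return shards.stream()
            .filter(WriteAddressableSketch::isWriteAddressable)
            .map(Shard::id)
            .toList();
    }

    public static void main(String[] args) {
        List<Shard> shards = List.of(
            new Shard(0, false, Optional.empty()),                       // ordinary shard
            new Shard(1, true, Optional.empty()),                        // recovering, but still the owning shard
            new Shard(2, false, Optional.of(TargetShardState.CLONE)),    // split target before HANDOFF
            new Shard(3, false, Optional.of(TargetShardState.HANDOFF))   // split target at HANDOFF
        );
        System.out.println(allWriteAddressableShards(shards)); // prints [0, 1, 3]
    }
}
```

The recovering flag is deliberately ignored by the predicate: whether a shard can currently accept an operation is a separate question from whether requests should be routed to it.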

@@ -125,7 +132,7 @@ private static Set<IndexShardRoutingTable> computeTargetedShards(
         // we use set here and not list since we might get duplicates
         final Set<IndexShardRoutingTable> set = new HashSet<>();
         if (routing == null || routing.isEmpty()) {
-            collectTargetShardsNoRouting(projectState.routingTable(), concreteIndices, set);
+            collectTargetShardsNoRouting(projectState, concreteIndices, set);
         } else {
             collectTargetShardsWithRouting(projectState, concreteIndices, routing, set);
Contributor

How is this branch resolving target shards to source shards before the split?

Contributor

Is the routing passed in here based on what we got from IndexRouting?

Contributor Author

This is handled inside IndexRouting#collectSearchShards, yes.

Contributor

Ok, but what about this codepath in collectTargetShardsWithRouting? When is it called, and why does it not have to check for resharding?

else {
    for (int i = 0; i < indexRoutingTable.size(); i++) {
        set.add(indexRoutingTable.shard(i));
    }
}

Contributor Author

Yes, it should, thanks for catching that.
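
A rough sketch of what checking for resharding in that fallback could look like, using simplified hypothetical types. It assumes the `else` branch is taken for an index that has no routing values of its own, and it replaces the real lookup through `IndexRouting#collectSearchShards` with a plain map. The `searchAddressable` predicate stands in for whatever check the resharding metadata provides.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntPredicate;

public class RoutingFallbackSketch {
    /**
     * Hypothetical simplification of collecting target shards for one index.
     *
     * @param shardCount        number of shards in the index routing table
     * @param routingValues     routing values for this index, or null when none apply (assumption)
     * @param routingShards     routing value to shard id, standing in for IndexRouting#collectSearchShards
     * @param searchAddressable true for shard ids that may serve searches given the resharding state
     */
    static Set<Integer> collectTargetShardsWithRouting(
        int shardCount,
        Set<String> routingValues,
        Map<String, Integer> routingShards,
        IntPredicate searchAddressable
    ) {
        Set<Integer> targets = new TreeSet<>();
        if (routingValues != null) {
            for (String value : routingValues) {
                Integer shard = routingShards.get(value);
                if (shard != null) {
                    targets.add(shard);
                }
            }
        } else {
            // Fallback: every shard of the index, minus shards that cannot serve searches yet.
            for (int i = 0; i < shardCount; i++) {
                if (searchAddressable.test(i)) {
                    targets.add(i);
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        IntPredicate notYetSplit = id -> id != 3; // pretend shard 3 is a split target that is not ready
        System.out.println(collectTargetShardsWithRouting(4, null, Map.of(), notYetSplit));                // prints [0, 1, 2]
        System.out.println(collectTargetShardsWithRouting(4, Set.of("a"), Map.of("a", 1), notYetSplit));   // prints [1]
    }
}
```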

return allShardsExceptSplitTargetsInStateBefore(projectState, index, IndexReshardingState.Split.TargetShardState.HANDOFF);
}

private static Iterator<IndexShardRoutingTable> allShardsExceptSplitTargetsInStateBefore(
Contributor

Can we add a comment here so the reader knows this is associated with resharding and is not to be confused with the split API?

Contributor Author

I think the class-level documentation on IndexReshardingMetadata can be discovered fairly easily from this code, and it explains what it is.

@ankikuma
Contributor

So this PR does not handle GETs, is that correct? Can you add a description to this PR to explain which codepaths are covered? It looks like you are covering search, flush, and refresh.

@lkts
Contributor Author

lkts commented Jul 31, 2025

@ankikuma yes, GET is not handled

Contributor

@bcully bcully left a comment

LGTM

@ankikuma
Contributor

ankikuma commented Aug 4, 2025

Left one comment about OperationRouting#collectTargetShardsWithRouting. See above.

@lkts lkts merged commit 2721a6b into elastic:main Aug 27, 2025
33 checks passed
@lkts lkts deleted the filter_not_ready_split_shards_from_search branch August 27, 2025 18:27
Labels
:Distributed Indexing/Distributed, >non-issue, :Search Foundations/Search, serverless-linked, Team:Distributed Indexing, Team:Search Foundations, v9.2.0

4 participants