Skip to content

[ES|QL] Fix aggregate_metric_double sorting and mv_expand issues #131658

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

limotova
Copy link
Contributor

This commit fixes a bug when sorting on multiple indices, where one index contains an aggregate_metric_double field, but at least one other index does not contain that field.
The root issue being that the aggregate_metric_double field, when not present in a doc, will be encoded as a NullBlock (1 value), but later on decoded as an AggregateMetricDouble block (which expects to decode 4 values, even if they're all null).

It also adds an implementation for MV_EXPAND so it just returns the existing block instead of erroring out, as multi-values are not supported for AggregateMetricDoubleBlock.

This commit fixes a bug when sorting on multiple indices, where one
index contains an aggregate_metric_double field, but at least one other
index does not contain that field. The root cause being that the
aggregate_metric_double field, when not present in a doc, will be
encoded as a NullBlock (1 value), but later on decoded as an
AggregateMetricDouble block (which expects to decode 4 values, even if
they're all null).

It also adds an implementation for MV_EXPAND so it just returns the
existing block instead of erroring out, as multi-values are not
supported for AggregateMetricDoubleBlock.
@limotova limotova marked this pull request as ready for review July 22, 2025 04:03
@limotova limotova requested a review from dnhatn July 22, 2025 04:03
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 22, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @limotova, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add unit tests for AGGREGATE_METRIC_DOUBLE in ExtractorTests? But the production changes look good. Thanks Larisa!

@limotova limotova added auto-backport Automatically create backport pull requests when merged v9.1.1 v8.19.1 labels Jul 22, 2025
@limotova limotova requested a review from dnhatn July 23, 2025 06:47
if (block.isNull(pos)) {
result = 31 * result - 1;
} else {
result = 31 * result + DoubleBlock.hash(block.minBlock());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow this logic? Can you explain it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, similar to the equals thing Mark pointed out earlier I didn't handle looping properly, I will rewrite it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Can we check if all values are null (or ConstantNullBlock), the use Objects.hash(getPositionCount()); or ConstantNullBlock#hashCode? otherwise, delegate to the hashCode of the array of blocks.

Copy link
Contributor Author

@limotova limotova Jul 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's not present in this PR but I have a separate PR addressing the bug I brought up on slack (where doubleBlock.equals(constantNullBlock) can be true but the reverse is false + hashcodes aren't equal), and I addressed it by changing ConstantNullBlock's hashCode, and then set up AggregateMetricDouble's hashcode here to align with that.
Maybe we should address how to calculate hashcode for all blocks via this PR first (it's ready to be reviewed I just want to add a description)


static final BlockFactory blockFactory = TestBlockFactory.getNonBreakingInstance();

// TODO: Add additional tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am making a list of tests where AggregateMetricDoubleBlock is missing and making a github issue out of it and it is the first thing I will work on after I get this PR and the initial CSV tests PR merged in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test issue: #131951

Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this looks good to me.

@limotova limotova merged commit 11a72f0 into elastic:main Jul 26, 2025
33 checks passed
@limotova limotova deleted the agg-metric-sorting-from-multiple-indices branch July 26, 2025 02:39
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.1 Commit could not be cherrypicked due to conflicts
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 131658

limotova added a commit to limotova/elasticsearch that referenced this pull request Jul 26, 2025
limotova added a commit to limotova/elasticsearch that referenced this pull request Jul 26, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 26, 2025
elasticsearchmachine pushed a commit that referenced this pull request Jul 26, 2025
szybia added a commit to szybia/elasticsearch that referenced this pull request Jul 28, 2025
…-tracking

* upstream/main:
  Fix MergeWithLowDiskSpaceIT testRelocationWhileForceMerging (elastic#131806)
  [ML] Prevent the trained model deployment memory estimation from double-counting allocations. (elastic#131990)
  ES|QL Assert current thread during query planning and execution (elastic#131807)
  Add ElasticsearchIndexDeletionPolicy and EngineConfig policy wrapper (elastic#130442)
  [TEST] Adds tests for ESTestCase randomSubset methods (elastic#131745)
  Simplify esql session (elastic#131925)
  Simplify EsqlExecution info serialization (elastic#131823)
  Add utility to check for project global block (elastic#131927)
  [DOCS] Update ES|QL applies to's (elastic#131805)
  Handle structured log messages (elastic#131027)
  Mute org.elasticsearch.test.rest.yaml.RcsCcsCommonYamlTestSuiteIT test {p0=search/600_flattened_ignore_above/flattened ignore_above multi-value field} elastic#131967
  Mute org.elasticsearch.xpack.remotecluster.CrossClusterEsqlRCS2EnrichUnavailableRemotesIT testEsqlEnrichWithSkipUnavailable elastic#131965
  Mute org.elasticsearch.xpack.restart.FullClusterRestartIT testWatcherWithApiKey {cluster=UPGRADED} elastic#131964
  [ES|QL] Fix aggregate_metric_double sorting and mv_expand issues (elastic#131658)
  Reduce logging levels for meter usage tests (elastic#131935)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged backport pending >bug :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine v8.19.1 v9.1.1 v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants