
adding scripted metric aggs docs #10211

Open · wants to merge 3 commits into base: main

AntonEliatra
Contributor

Description

adding scripted metric aggs docs

Version

all

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.


Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

Signed-off-by: Anton Rubin <[email protected]>
@kolchfa-aws
Collaborator

@sandeshkr419 Could you please review this PR? Thanks!

@sandeshkr419 self-assigned this Jul 15, 2025
Signed-off-by: Anton Rubin <[email protected]>
Member

@sandeshkr419 left a comment

The scripts within the script stages are Painless scripts, right?

Should we mention that somewhere in the doc, for example by stating that these script stages should contain Painless scripts (and then maybe linking to it)?

What I'm looking for are the rules and syntax for defining these scripts. At the same time, explaining Painless scripting itself is probably out of scope for this page.

So a brief mention with a link would be good.

@@ -9,44 +9,220 @@ redirect_from:

# Scripted metric aggregations

The `scripted_metric` metric is a multi-value metric aggregation that returns metrics calculated from a specified script.
The `scripted_metric` aggregation is a multi-value metric aggregation that returns metrics calculated from a specified script. The aggregation goes through up to four script stages during execution. These stages run in order and allow you to accumulate and combine results from your documents.
Member

> The aggregation goes through up to four script stages during execution. These stages run in order and allow you to accumulate and combine results from your documents.

I find this wording confusing: it seems to imply that up to four independent scripts can run.

Something like this might be clearer: A script has four stages: <name the stages>, which are run in order by each aggregation and allow you to combine results from your documents.
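
To make the stage order discussed above concrete, here is a minimal plain-Python model of it. It is a sketch, not OpenSearch code: the `run_scripted_metric` helper, the sample shards, and the document-counting callbacks are hypothetical, and it assumes `init_script` and `combine_script` run once per shard, `map_script` runs once per document, and `reduce_script` runs once over all shard results.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def run_scripted_metric(
    shards: List[List[dict]],
    init: Callable[[State], None],
    map_doc: Callable[[State, dict], None],
    combine: Callable[[State], Any],
    reduce: Callable[[List[Any]], Any],
) -> Any:
    """Hypothetical model of the four stages, run in order."""
    shard_results = []
    for docs in shards:
        state: State = {}
        init(state)                # init stage: once per shard
        for doc in docs:
            map_doc(state, doc)    # map stage: once per document
        shard_results.append(combine(state))  # combine stage: once per shard
    return reduce(shard_results)   # reduce stage: once over all shard results


# Hypothetical data: two shards holding three documents in total.
shards = [[{"id": 1}, {"id": 2}], [{"id": 3}]]

total = run_scripted_metric(
    shards,
    init=lambda state: state.update(count=0),
    map_doc=lambda state, doc: state.update(count=state["count"] + 1),
    combine=lambda state: state["count"],
    reduce=lambda states: sum(states),
)
print(total)  # 3
```

The point of the model is that there is one logical computation split into four stages, rather than four independent scripts.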


- Primitive types: `int`, `long`, `float`, `double`, `boolean`
- String
- Map (with keys and values only of allowed types)
Member

`Map (with keys and values only of allowed types)`: Which allowed types? The String and primitive types defined above?

- Primitive types: `int`, `long`, `float`, `double`, `boolean`
- String
- Map (with keys and values only of allowed types)
- Array (containing only allowed types)

## Example

The following are examples
Member

nit: missing colon at end of sentence?


- The `init_script` sets up an empty list to hold transaction values for each shard.
- The `map_script` for each document adds the document’s amount to the `state.transactions` list, as a positive value if type is `sale`, or as a negative value if type is `cost`. By the end of the `map` phase, each shard’s `state.transactions` will contain a list of numbers representing the income and expenses on that shard.
- The `combine_script` takes that list and sums it up to produce the shard’s total profit and returns that number.
Member

"that list" -> mention the name of the list received from the previous stage (`state.transactions`)

"that number" -> mention the name of the object that is passed on to the reduce stage next
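
The three stages described in the quoted list can be modeled in plain Python as follows. This is an illustrative sketch rather than the Painless scripts from the PR: the sample documents and shard layout are hypothetical, and the final reduce step (summing the per-shard profits) is an assumption, since the quoted list stops at `combine_script`.

```python
def init_script(state):
    # Set up an empty list to hold transaction values for this shard.
    state["transactions"] = []


def map_script(state, doc):
    # Positive amount for a sale, negative amount for a cost.
    # Assumes every document has type "sale" or "cost".
    amount = doc["amount"]
    state["transactions"].append(amount if doc["type"] == "sale" else -amount)


def combine_script(state):
    # Sum state.transactions into this shard's total profit.
    return sum(state["transactions"])


def reduce_script(states):
    # Assumed final step: add up the per-shard profits.
    return sum(states)


# Hypothetical documents spread over two shards.
shards = [
    [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 10}],
    [{"type": "sale", "amount": 130}, {"type": "cost", "amount": 30}],
]

shard_profits = []
for docs in shards:
    state = {}
    init_script(state)
    for doc in docs:
        map_script(state, doc)
    shard_profits.append(combine_script(state))

total_profit = reduce_script(shard_profits)
print(shard_profits)  # [70, 100]
print(total_profit)   # 170
```

Naming the intermediate values this way (`state["transactions"]` going into the combine step, one profit number per shard coming out) is the kind of explicit naming the review comment asks for.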


## Handling empty buckets (no documents scenario)

When using a `scripted_metric` aggregation as a sub-aggregation within a bucket aggregation (such as terms), it is important to account for buckets that contain no documents on certain shards. In such cases, those shards return a `null` value for the aggregation state. During the `reduce_script` phase, the states array may therefore include `null` entries corresponding to these shards. To ensure reliable execution, the `reduce_script` must be designed to handle `null` values gracefully. A common approach is to include a conditional check, such as `if (state != null)`, before accessing or operating on each state. Failure to implement such checks can result in runtime errors when processing empty buckets across shards.
Member

Not related to this documentation, but do you think null handling could be another improvement in the codebase, for example by passing a parameter such as `ignore_null_results`?

I wonder whether that might even make scripted aggregations faster, since the null checks would be part of the code flow via a parameter rather than running as part of the script.
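
The null check described in the "Handling empty buckets" paragraph can be sketched in plain Python. This models the idea only and is not the Painless `reduce_script` from the PR. The `states` values are hypothetical, with `None` standing in for the `null` state returned by a shard that has no documents in the bucket.

```python
def reduce_script(states):
    """Sum per-shard results, skipping shards that returned no state."""
    total = 0
    for state in states:
        if state is not None:  # mirrors `if (state != null)` in the script
            total += state
    return total


# Hypothetical shard results: the second shard had no documents in the bucket.
states = [70, None, 100]
print(reduce_script(states))  # 170

# Without the check, the same reduction fails on the empty shard.
try:
    sum(states)
    naive_failed = False
except TypeError:
    naive_failed = True
print(naive_failed)  # True
```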
