Add blog post about the OTTL context inference feature #6290

Open
wants to merge 8 commits into
base: main
Changes from 6 commits
131 changes: 131 additions & 0 deletions content/en/blog/2025/ottl-contexts-just-got-easier.md
@@ -0,0 +1,131 @@
---
title: OTTL contexts just got easier with context inference
linkTitle: OTTL contexts just got easier
date: 2025-02-17
author: '[Edmo Vamerlatti Costa](https://github.com/edmocosta) (Elastic)'
draft: true # TODO: remove this line once your post is ready to be published
issue: 6289
sig: Collector SIG
cSpell:ignore: OTTL Vamerlatti
---

Selecting the right context for running OTTL statements can be challenging, even
Contributor

We'll want a link for context here that disambiguates it from the "context propagation" context that is predominantly used in the site.

for experienced users. Choosing the correct context impacts both accuracy and
efficiency, as using higher-level contexts can avoid unnecessary iterations
through nested lower-level contexts.

To simplify this process, the OpenTelemetry community is excited to announce
OTTL
[context inference](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md#context-inference)
support for the
[transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor).
This feature removes the need to manually specify contexts, improving statement
processing efficiency by automatically selecting the most appropriate one. This
optimization ensures that data transformations are both accurate and performant.
Contributor

Not strictly necessary to mention this, and we touch on it when talking about the flat configuration style, but from a general UX standpoint, this change also relieves users from needing to know the concept of contexts and instead just think about the data they want to work with.

Member

I think we should highlight this point as it is the primary motivator for this work.

Member

Should we also point out that soon this will make statements portable between components that use OTTL?

Author

@edmocosta Feb 12, 2025

> Should we also point out that soon this will make statements portable between components that use OTTL?

It's kind of mentioned here, but without providing much detail.

Author

> Not strictly necessary to mention this, and we touch on it when talking about the flat configuration style, but from a general UX standpoint, this change also relieves users from needing to know the concept of contexts and instead just think about the data they want to work with.

I've pushed an extra change mentioning that. Could you please take a look again? thanks!


## How does it work?

Starting with version `0.120.0`, the transform processor supports two new
[context-inferred configuration](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md#context-inferred-configurations)
styles. The first one offers a simpler and flatter approach, while the second
closely resembles the existing configuration format.

### Flat configuration

The flat configuration style simplifies configuration by allowing users to list
all statements together, without worrying about contexts or extra configuration
structures. This style supports statements from multiple contexts and does not
require grouping them separately.

To illustrate this, compare the following configuration:

```yaml
metric_statements:
  - context: resource
    statements:
      - keep_keys(attributes, ["host.name"])
  - context: metric
    statements:
      - set(description, "Sum") where type == "Sum"
      - convert_sum_to_gauge() where name == "system.processes.count"
  - context: datapoint
    statements:
      - limit(attributes, 100, ["host.name"])
```

With the new flat configuration style, the same logic is expressed more
concisely as:
Contributor

Suggested change
With the new flat configuration style, the same logic is expressed more
concisely as:
With the new flat configuration style, the same logic is expressed more
concisely by letting you specify the OTTL context as a part of a key:

Author

Suggested change
With the new flat configuration style, the same logic is expressed more
concisely as:
With the new flat configuration style, the same logic is expressed more
concisely by simply providing a list of statements:

I think something like that would better fit here. WDYT?


```yaml
metric_statements:
  - keep_keys(resource.attributes, ["host.name"])
  - set(metric.description, "Sum") where metric.type == "Sum"
  - convert_sum_to_gauge() where metric.name == "system.processes.count"
  - limit(datapoint.attributes, 100, ["host.name"])
```

This streamlined approach enhances readability and makes configuration more
intuitive. Please note that all paths in the statements must be prefixed with
their respective contexts. These prefixes are required for all context-inferred
configurations and serve as hints for selecting the best match. They also make
statements unambiguous and portable between components.
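
As a further illustration, here is a minimal sketch of the flat style applied to
traces and logs inside a complete `transform` processor entry. The attribute
keys, the span and log values, and the `error_mode: ignore` setting are
illustrative assumptions and do not come from the examples above; the point is
only that every path carries its context prefix (`resource.`, `span.`, `log.`).

```yaml
processors:
  transform:
    # Assumed setting for this sketch: errors are logged and processing continues.
    error_mode: ignore
    trace_statements:
      # Each path names its context explicitly, so no `context` key is needed.
      - keep_keys(resource.attributes, ["service.name"])
      - set(span.attributes["env"], "prod") where span.name == "checkout"
    log_statements:
      - set(log.severity_text, "FAIL") where log.body == "request failed"
```

Because the prefixes identify the data being accessed, the same statements
should read the same way wherever they are used, which is what makes them
unambiguous.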

### Structured configuration

The context-inferred structured configuration style closely resembles the
existing format and allows users to leverage the benefits of context inference
while providing granular control over statement configurations, such as
`error_mode` and `conditions`. For example, consider the following
configuration:

<!-- prettier-ignore-start -->
```yaml
metric_statements:
  - context: datapoint
    conditions:
      - resource.attributes["service.name"] == "my.service"
    statements:
      - set(metric.description, "counter") where attributes["my.attr"] == "some"
```
<!-- prettier-ignore-end -->

It can now be written as:

<!-- prettier-ignore-start -->
```yaml
metric_statements:
  - conditions:
      - resource.attributes["service.name"] == "my.service"
    statements:
      - set(metric.description, "counter") where datapoint.attributes["my.attr"] == "some"
```
<!-- prettier-ignore-end -->

In this example, the `context` value is omitted and is automatically inferred as
`datapoint`, as it is the only context present in the statements that supports
parsing both `datapoint` and `metric` data.

Suppose we update the above configuration to remove the `datapoint` usage:

<!-- prettier-ignore-start -->
```yaml
metric_statements:
  - conditions:
      - resource.attributes["service.name"] == "my.service"
    statements:
      - set(metric.description, "counter")
```
<!-- prettier-ignore-end -->

The context inferrer would select the `metric` context instead, since no data
points are accessed. Although it would be possible to run the statements using
the `datapoint` context, `metric` is the most efficient option.
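
The two inference outcomes described above can be placed side by side. This is
only a sketch: it assumes that several statement groups may be listed under
`metric_statements` and that the context is inferred for each group on its own.
The per-group `error_mode` values are illustrative, and the comments restate the
rule from the text rather than actual processor output.

```yaml
metric_statements:
  # Paths use both `metric.` and `datapoint.`, so this group would be
  # inferred as the `datapoint` context.
  - error_mode: ignore
    conditions:
      - resource.attributes["service.name"] == "my.service"
    statements:
      - set(metric.description, "counter") where datapoint.attributes["my.attr"] == "some"
  # No `datapoint.` path appears, so this group would be inferred as `metric`,
  # avoiding an iteration over data points.
  - error_mode: propagate
    statements:
      - set(metric.description, "counter")
```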

Contributor

Missing here is probably a subsection stating when you should use which form. Users will generally want to know which they should generally prefer, and when.

## Try it out

As we wrap up, we encourage users to explore this new functionality and take
advantage of its benefits in their telemetry pipelines!

If you have any questions or suggestions, we’d love to hear from you! Join the
conversation in the `#otel-collector` channel on the
[CNCF Slack workspace](https://slack.cncf.io/).
16 changes: 16 additions & 0 deletions static/refcache.json
@@ -3435,6 +3435,10 @@
    "StatusCode": 206,
    "LastSeen": "2025-02-02T10:41:21.884395-05:00"
  },
  "https://github.com/edmocosta": {
    "StatusCode": 206,
    "LastSeen": "2025-02-11T15:13:03.156135+01:00"
  },
  "https://github.com/edsoncelio": {
    "StatusCode": 200,
    "LastSeen": "2024-11-06T19:17:39.555698Z"
@@ -5051,6 +5055,18 @@
    "StatusCode": 206,
    "LastSeen": "2025-01-16T14:34:46.89984-05:00"
  },
  "https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor": {
    "StatusCode": 206,
    "LastSeen": "2025-02-11T17:52:52.423025+01:00"
  },
  "https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md#context-inference": {
    "StatusCode": 206,
    "LastSeen": "2025-02-11T15:13:04.026653+01:00"
  },
  "https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md#context-inferred-configurations": {
    "StatusCode": 206,
    "LastSeen": "2025-02-11T15:13:06.05203+01:00"
  },
  "https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/hostmetricsreceiver/README.md": {
    "StatusCode": 206,
    "LastSeen": "2025-01-16T14:34:35.525335-05:00"