-
Notifications
You must be signed in to change notification settings - Fork 178
Add otel component manager to the coordinator #8529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
// MergedOtelConfig returns the merged Otel collector configuration, containing both the plain config and the | ||
// component config. | ||
MergedOtelConfig() *confmap.Conf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is temporary, and is only necessary for diagnostics. The plan is to move all the Otel-related diagnostic hooks into the OtelComponentManager. Afterwards, there won't be any reason for the coordinator to track this value.
379032c
to
5275554
Compare
This pull request is now in conflicts. Could you fix it? 🙏
|
efa7f79
to
fb1b5c6
Compare
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
for ctx.Err() == nil { | ||
select { | ||
case <-ctx.Done(): | ||
break | ||
|
||
case collectorCfg := <-m.collectorUpdateChan: | ||
if err := m.handleCollectorUpdate(collectorCfg); err != nil { | ||
m.reportError(ctx, err) | ||
} | ||
|
||
case componentModel := <-m.componentUpdateChan: | ||
if err := m.handleComponentUpdate(componentModel); err != nil { | ||
m.reportError(ctx, err) | ||
} | ||
|
||
case err := <-m.otelManager.Errors(): | ||
m.reportError(ctx, err) | ||
|
||
case otelStatus := <-m.otelManager.Watch(): | ||
componentUpdates, err := m.handleOtelStatusUpdate(otelStatus) | ||
if err != nil { | ||
m.reportError(ctx, err) | ||
} | ||
m.sendCollectorStatusUpdate(ctx) | ||
m.sendComponentStateUpdates(ctx, componentUpdates) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the more I look at the newly introduced otel component manager the more I think that this shouldn't exist and every extra bit of logic here should become part of the existing otel manager. AFAICT every channel or other input you have here is already existing there and the transition of the logic seems straight-forward. I think that by migrating the logic there it will be easier for us to maintain/debug/extend the otel manager as we won't have another concurrent layer of channels to pass through, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could do that, yeah. The current implementation is nice in that it's only concerned with translating to and from the component model, and then the OtelManager itself is only concerned with managing the otel collector, with no dependencies on the component model. But you're right that this separation may not be worth the mental overhead of the additional asynchronous behavior it introduces.
To be honest, one of the main reasons I did it this way was because it let me not interfered with #8248. Would you be ok with merging this PR as-is, and moving all of the logic into the otel manager after the dust settles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll see how much of a lift it is and get back to you. If it's simple enough, I don't mind doing it now rather than later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did what you suggested in #8737. I'm not sure I feel better about it than this PR. OtelManager feels like it does too much with those changes imo. I don't have very strong feelings either way though, so please have a look and let me know what you think.
This pull request is now in conflicts. Could you fix it? 🙏
|
fb1b5c6
to
69d2900
Compare
This pull request is now in conflicts. Could you fix it? 🙏
|
# Conflicts: # internal/pkg/agent/application/application.go # internal/pkg/agent/application/coordinator/coordinator_test.go
69d2900
to
21a6306
Compare
|
💚 Build Succeeded
History
cc @swiatekm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks great to me code-wise, though I think I agree with @pkoutsovasilis that it would be nicer to add the component management to the existing otel manager as in #8737. (But control plane consensus should take precedence.)
That's a second vote for the approach in #8737, so I'm going to move this one into draft and put the other one up for review instead. |
This pull request is now in conflicts. Could you fix it? 🙏
|
#8737 was merged, closing this. |
What does this PR do?
Adds a new manager object for running components (in the sense of the agent's Component Model) in an otel collector.
Doing this involves two basic activities:
Up until now, the logic for these was haphazardly spread across the agent coordinator. This PR moves all of it into a new object - the
OtelComponentManager
- which can run both raw otel collector configurations and agent components in a single otel collector instance.This new manager encapsulates all the logic involved in interfacing between the agent coordinator and the otel collector. In the near future, it will also take on additional responsibilities, like generating diagnostics for components it runs.
The only new logic this PR introduces lives in the new manager's main loop, and has to do with how updates and configurations are moved around. The rest is either existing logic moved to a new location, and new tests for that old logic.
Why is it important?
Checklist
[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration files[ ] I have added an entry in./changelog/fragments
using the changelog tool[ ] I have added an integration test or an E2E testRelated issues