Add component and unit diagnostics for beats receivers #8991

Merged
swiatekm merged 8 commits into main from feat/otel-diagnostics
Jul 17, 2025

Conversation

swiatekm
Contributor

What does this PR do?

Add the ability for the otel manager to output diagnostics for components and units it runs. This PR doesn't add any actual diagnostic content, just the necessary scaffolding and tests.

The filebeat registry and beat metrics will be added in follow-up PRs.
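
As a rough illustration of what "scaffolding without content" can look like, the sketch below models a manager that tracks diagnostic hooks per component or unit. Every name here (`Manager`, `DiagnosticHook`, `Register`, `Collect`) is hypothetical and chosen for the example; none of it is taken from the elastic-agent code base. Registering an owner with no hooks still yields an entry with zero results, which mirrors the empty directories this PR is expected to produce.

```go
package main

import (
	"fmt"
	"sort"
)

// DiagnosticHook is a hypothetical producer of a single diagnostic file.
type DiagnosticHook struct {
	Name     string
	Filename string
	Generate func() ([]byte, error)
}

// Result is the output of one hook, or the error it returned.
type Result struct {
	Owner    string // component or unit ID (illustrative)
	Filename string
	Content  []byte
	Err      error
}

// Manager is a stand-in for a manager that knows which components and
// units it runs. It is NOT the real otel manager.
type Manager struct {
	hooks map[string][]DiagnosticHook
}

func NewManager() *Manager {
	return &Manager{hooks: map[string][]DiagnosticHook{}}
}

// Register records an owner. Passing no hooks still registers the owner,
// so it shows up in Collect with an empty result list.
func (m *Manager) Register(owner string, hooks ...DiagnosticHook) {
	m.hooks[owner] = append(m.hooks[owner], hooks...)
}

// Collect runs every hook and groups the results by owner. A failing hook
// is recorded in its Result instead of aborting the whole collection.
func (m *Manager) Collect() map[string][]Result {
	out := make(map[string][]Result, len(m.hooks))
	for owner, hooks := range m.hooks {
		results := make([]Result, 0, len(hooks))
		for _, h := range hooks {
			content, err := h.Generate()
			results = append(results, Result{
				Owner: owner, Filename: h.Filename, Content: content, Err: err,
			})
		}
		out[owner] = results
	}
	return out
}

func main() {
	m := NewManager()
	m.Register("example-component")        // scaffolding only, no hooks yet
	m.Register("example-component/unit-1") // likewise

	collected := m.Collect()
	owners := make([]string, 0, len(collected))
	for owner := range collected {
		owners = append(owners, owner)
	}
	sort.Strings(owners)
	for _, owner := range owners {
		fmt.Printf("%s: %d diagnostic file(s)\n", owner, len(collected[owner]))
	}
}
```

Follow-up work such as the filebeat registry or beat metrics would, in this sketch, amount to passing real hooks to `Register`.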

Why is it important?

Beats receivers need to be covered by diagnostics so that issues with them can be diagnosed effectively.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

How to test this PR locally

Build the agent, run it with self-monitoring using beats receivers enabled, and collect diagnostics. You should see the expected directories without any files.
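
The "expected directories without any files" check can be automated against the collected archive. The sketch below is one way to do it, assuming the diagnostics are a zip archive and that each component gets its own directory; the path `components/example-component` is a made-up placeholder, not a path confirmed by this PR. It uses only the Go standard library and builds a small in-memory archive to stand in for a real diagnostics bundle.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// dirStatus reports whether dir has any entry in the archive and lists the
// regular files found underneath it.
func dirStatus(r *zip.Reader, dir string) (exists bool, files []string) {
	prefix := strings.TrimSuffix(dir, "/") + "/"
	for _, f := range r.File {
		if !strings.HasPrefix(f.Name, prefix) {
			continue
		}
		exists = true
		if !f.FileInfo().IsDir() {
			files = append(files, f.Name)
		}
	}
	return exists, files
}

// buildSampleArchive creates an in-memory zip containing only directory
// entries. It stands in for a real diagnostics archive in this example.
func buildSampleArchive(dirs ...string) (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, d := range dirs {
		// A trailing slash makes zip.Writer create a directory entry.
		if _, err := w.Create(strings.TrimSuffix(d, "/") + "/"); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	// Hypothetical directory name; substitute the ones your agent produces.
	const dir = "components/example-component"

	r, err := buildSampleArchive(dir)
	if err != nil {
		panic(err)
	}
	exists, files := dirStatus(r, dir)
	fmt.Printf("directory present: %v, files inside: %d\n", exists, len(files))
}
```

To run the same check on a real archive, open it with `zip.OpenReader` and pass the embedded `Reader` to `dirStatus`.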

Related issues

@swiatekm added the enhancement and skip-changelog labels on Jul 14, 2025
Contributor

mergify bot commented Jul 14, 2025

This pull request does not have a backport label. Could you fix it @swiatekm? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label that automatically backports to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

swiatekm added 2 commits July 14, 2025 15:21
# Conflicts:
#	internal/pkg/agent/application/coordinator/coordinator_test.go
#	internal/pkg/otel/manager/manager.go
@swiatekm force-pushed the feat/otel-diagnostics branch from 2fe6d75 to b28331d on July 14, 2025 13:24
@swiatekm force-pushed the feat/otel-diagnostics branch from 1a4c954 to 25657e7 on July 14, 2025 14:57
@swiatekm added the Team:Elastic-Agent-Control-Plane and backport-8.19 labels on Jul 14, 2025
@swiatekm marked this pull request as ready for review on July 14, 2025 19:03
@swiatekm requested a review from a team as a code owner on July 14, 2025 19:03
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pierrehilbert
Contributor

@swiatekm should we also backport this to 9.1?

@swiatekm
Contributor Author

> @swiatekm should we also backport this to 9.1?

We'd also need to backport #8737. I think it might be easier to hold off for now and then make backport decisions once the full milestone 1 of beats receivers is done.

Contributor

@pkoutsovasilis pkoutsovasilis left a comment

nice seeing diagnostics being formed for otel managed 🙂 Left some comments, have a look and tell me if they make sense

@swiatekm swiatekm requested a review from pkoutsovasilis July 15, 2025 16:22
@pierrehilbert added the Team:Elastic-Agent-Data-Plane label on Jul 16, 2025
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

Contributor

@pkoutsovasilis pkoutsovasilis left a comment

code changes LGTM, waiting for the CI to be green 🙂

@pkoutsovasilis pkoutsovasilis self-requested a review July 16, 2025 16:23
@elasticmachine
Collaborator

💛 Build succeeded, but was flaky

cc @swiatekm

Contributor

@pkoutsovasilis pkoutsovasilis left a comment

LGTM, @swiatekm please report the flaky tests 🙏

@swiatekm swiatekm merged commit ce0519f into main Jul 17, 2025
19 checks passed
@swiatekm swiatekm deleted the feat/otel-diagnostics branch July 17, 2025 08:27
mergify bot pushed a commit that referenced this pull request Jul 17, 2025
* Add diagnostics to otel manager

# Conflicts:
#	internal/pkg/agent/application/coordinator/coordinator_test.go
#	internal/pkg/otel/manager/manager.go

* add integration test

* Move helper function to allow diagnostic tests to run on Windows

* Handle diagnostics errors separately

* Centralize locking for config updates

* Make tests more explicit

* Early exit from diagnostics generation function

* Refactor config generation for easier locking

(cherry picked from commit ce0519f)
swiatekm added a commit that referenced this pull request Jul 17, 2025
* Add diagnostics to otel manager

# Conflicts:
#	internal/pkg/agent/application/coordinator/coordinator_test.go
#	internal/pkg/otel/manager/manager.go

* add integration test

* Move helper function to allow diagnostic tests to run on Windows

* Handle diagnostics errors separately

* Centralize locking for config updates

* Make tests more explicit

* Early exit from diagnostics generation function

* Refactor config generation for easier locking

(cherry picked from commit ce0519f)

Co-authored-by: Mikołaj Świątek <[email protected]>
Labels
  • backport-8.19 (Automated backport to the 8.19 branch)
  • enhancement (New feature or request)
  • skip-changelog
  • Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team)
  • Team:Elastic-Agent-Data-Plane (Label for the Agent Data Plane team)
4 participants