[Fleet Server] Added support for the fleet scalability settings as direct toggles in fleet ui #13766

Open · wants to merge 2 commits into main

@philippkahr philippkahr commented May 2, 2025

Agent scalability

As you add more and more agents to your cluster, you should adjust the Fleet Server settings. In ECH this is not possible, because the Fleet policy itself is managed and therefore cannot be changed. I work with a lot of folks who run air-gapped and/or on-premises setups where we add additional Fleet Servers.

The problem starts here: we make it very hard to know what to actually set in Fleet Server. We have the scalability guide, https://www.elastic.co/guide/en/fleet/8.6/fleet-server-scalability.html#recommend-settings-scaling-agents, which in version 8.6 still lists a table of suggested values to add to the Fleet Servers. In any version after 8.6 that table is gone, and this is a problem because I don't know exactly what I should add.

This also means that this PR is based on values that are many versions old and could therefore be outdated. On top of that, one of the main issues is that we simply list the settings without any reference values.

The simplest approach I found was to add a boolean toggle for each sizing and append the needed settings to agent.yml.hbs. All toggles are off by default, in which case we use the default values that ECH, ECE, and ECK set. I do not want to change the defaults.

Two things I would still like to do:

  • allow only one toggle to be active at a time
  • remove the duplication in agent.yml.hbs and simplify it
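
The first follow-up item, allowing only one toggle at a time, could be enforced with a check along these lines. This is a minimal Python sketch, not code from this PR: only `fleet_scalability_5000` appears in the template diff, while the other toggle names, the `SCALABILITY_TOGGLES` tuple, and the `active_toggle` helper are hypothetical.

```python
from typing import Mapping, Optional

# Hypothetical toggle names. Only `fleet_scalability_5000` is taken from the
# template diff in this PR; the others are assumed for illustration.
SCALABILITY_TOGGLES = (
    "fleet_scalability_5000",
    "fleet_scalability_10000",
    "fleet_scalability_30000",
)


def active_toggle(policy_vars: Mapping[str, bool]) -> Optional[str]:
    """Return the single enabled toggle, or None when all toggles are off.

    Raises ValueError when more than one toggle is enabled.
    """
    enabled = [name for name in SCALABILITY_TOGGLES if policy_vars.get(name, False)]
    if len(enabled) > 1:
        raise ValueError(
            "only one scalability toggle may be active, got: " + ", ".join(enabled)
        )
    return enabled[0] if enabled else None
```

With every toggle off the helper returns `None`, which corresponds to keeping the platform defaults untouched.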

This is how I tested it. Since this is nonetheless a fairly major change, I hope we can test it more thoroughly.

  • elastic-package test, which didn't show any errors.
  • elastic-package install; I then checked out the new version, upgraded the integration in the UI, and watched the Fleet Server stay healthy. I took a diagnostic of the Fleet Server (elastic-agent-diagnostics-2025-05-02T16-02-33Z-00.zip) and saw that the config correctly included the settings for 5,000 agents.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • ~~I have verified that any added dashboard complies with Kibana's Dashboard good practices~~

Related issues

Screenshots

[Image: Screenshot 2025-05-02 at 18 03 32]
[Image: Screenshot 2025-05-02 at 17 57 22]

@philippkahr philippkahr added the enhancement New feature or request label May 2, 2025
@philippkahr philippkahr requested a review from a team as a code owner May 2, 2025 16:15
@kpollich kpollich added the Team:Fleet Fleet team [elastic/fleet] label May 2, 2025
@elasticmachine

Pinging @elastic/fleet (Team:Fleet)

@kpollich kpollich added the Integration:fleet_server Fleet Server label May 2, 2025
@kpollich kpollich changed the title Added support for the fleet scalability settings as direct toggles int he fleet ui [Fleet Server] Added support for the fleet scalability settings as direct toggles in fleet ui May 2, 2025
@@ -14,3 +14,96 @@ server:
{{#if custom}}
{{custom}}
{{/if}}

Member

I think we'd still want to support overriding any of these settings via the custom YAML box, so these should all appear above the custom block, assuming the last value in this file takes precedence.
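
To illustrate the precedence this comment assumes, here is a minimal Python sketch in which settings rendered later override settings rendered earlier. Whether the policy renderer really resolves repeated keys this way is the assumption stated in the comment, not something confirmed here. The `merge_settings` helper and the setting names and values are placeholders, not real Fleet Server limits.

```python
from typing import Any, Dict


def merge_settings(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `override` into a copy of `base`; `override` wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Scalars and mismatched types: the later value replaces the earlier one.
            merged[key] = value
    return merged


# Placeholder names and values for illustration only.
preset = {"server": {"limits": {"example_limit": 100, "other_limit": 5}}}
custom = {"server": {"limits": {"example_limit": 250}}}

# The preset is rendered first and the custom block last, so custom wins.
effective = merge_settings(preset, custom)
```

Under this model, a user who enables a preset can still change a single value in the custom box without restating the rest of the preset.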

@@ -14,3 +14,96 @@ server:
{{#if custom}}
{{custom}}
{{/if}}


{{#if fleet_scalability_5000}}
Member

These limits are all encoded into Fleet server, with the max_agents field behaving as a simplified way to configure them. This is meant to work like the Elasticsearch output presets where the preset selector is the number of agents.

The limits for each number of agents are available in https://github.com/elastic/fleet-server/tree/main/internal/pkg/config/defaults which is where Fleet server reads them from when it is compiled.

I think we should allow customizing these, but I don't think we should duplicate them here; they will just go out of date if they exist in two places. We could put the values back into the documentation, though, for people to use as overrides.

The documentation around this probably needs to improve overall.

Contributor Author

OK, so max_agents is absolutely confusing. How should I know that? I think it should be renamed to something like "expected agents communicating with Fleet", which is much more explicit, and then be a dropdown selector like this:

Expected agents connecting to Fleet:
 < 1,000
 < 5,000
 < 10,000
 < 30,000
 < 50,000

I would be way too scared to put something into max_agents, because it sounds like if I put 100 in there and then want to connect a 101st agent, it won't work, and that sends me down a debugging spiral.

Contributor

Not sure why it's confusing. We are asking the user to tell us how many agents they have in their deployment; based on that, we determine the right values to configure for those variables. The user should not need to know which values are being configured. We reserve the right to change the value of any of those variables. As Craig mentions, these are similar to presets.
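
The preset idea discussed in this thread can be sketched as a lookup from an expected agent count to a sizing tier. This is an illustrative Python sketch only, not Fleet Server's implementation: the tier bounds reuse the dropdown values proposed earlier in the thread, `select_tier` is a hypothetical helper, and clamping counts above the largest bound to the largest tier is a choice made for this sketch rather than documented behavior.

```python
from bisect import bisect_right

# Upper bounds (exclusive) taken from the dropdown proposed in this thread.
# The real per-tier limits live in the fleet-server repository and are not
# reproduced here.
TIER_BOUNDS = [1000, 5000, 10000, 30000, 50000]


def select_tier(expected_agents: int) -> int:
    """Return the upper bound of the smallest tier that covers `expected_agents`.

    Counts at or above the largest bound are clamped to the largest tier.
    That clamping is an assumption of this sketch.
    """
    if expected_agents < 0:
        raise ValueError("expected_agents must be non-negative")
    index = bisect_right(TIER_BOUNDS, expected_agents)
    return TIER_BOUNDS[min(index, len(TIER_BOUNDS) - 1)]
```

In this model the number only selects a set of limits; it does not by itself act as a hard cap on how many agents can enroll.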

@elasticmachine

💚 Build Succeeded

Labels
enhancement New feature or request Integration:fleet_server Fleet Server Team:Fleet Fleet team [elastic/fleet]
5 participants