[Fleet Server] Added support for the fleet scalability settings as direct toggles in fleet ui #13766
Pinging @elastic/fleet (Team:Fleet)
@@ -14,3 +14,96 @@ server:
{{#if custom}}
{{custom}}
{{/if}}
I think we'd still want to support overriding any of these settings via the custom YML box, so these should all appear above the `custom` block, assuming the last value in this file takes precedence.
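A minimal sketch of the ordering suggested here, assuming the template keeps its existing `{{#if custom}}` block and that a later key really does override an earlier one (the comment itself only assumes this). The settings inside the toggle are deliberately elided, since the concrete values are the subject of this PR:

```yaml
{{#if fleet_scalability_5000}}
# Settings for the 5,000-agent toggle go here (elided in this sketch).
{{/if}}
{{#if custom}}
{{custom}}
{{/if}}
```

With the toggle block first, anything a user types into the custom YML box is rendered last and so gets the final say under that assumption.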
@@ -14,3 +14,96 @@ server:
{{#if custom}}
{{custom}}
{{/if}}

{{#if fleet_scalability_5000}}
These limits are all encoded into Fleet Server, with the `max_agents` field behaving as a simplified way to configure them. This is meant to work like the Elasticsearch output presets, where the preset selector is the number of agents.
The limits for each number of agents are available in https://github.com/elastic/fleet-server/tree/main/internal/pkg/config/defaults, which is where Fleet Server reads them from when it is compiled.
I think we should allow customizing these, but I don't think we should duplicate them here: they will just go out of date if they exist in two places. We could put the values back into the documentation, though, for people to use as overrides.
The documentation around this probably needs to improve overall.
OK, so `max_agents` is really confusing. How should I know that? I think this should be renamed to something much more descriptive, such as "expected agents communicating with Fleet", and become a dropdown selector like this:
Expected agents connecting to Fleet:
< 1000
< 5000
< 10.000
< 30.000
< 50.000
I would be way too scared to put something into `max_agents`, because it sounds like if I put 100 in there and then want to connect a 101st agent, it won't work, and that sends me down a debugging spiral.
Not sure why it's confusing. We are asking the user to tell us how many agents they have in their deployment; based on that, we determine the right values for those variables. The user should not need to know what values are being configured, and we reserve the right to change the value of any of those variables. As Craig mentions, these are similar to presets.
💚 Build Succeeded
Agent scalability
As you add more and more agents to your cluster, you should change the Fleet settings. In ECH this is not possible, because the Fleet policy is managed and therefore cannot be changed. I work with a lot of people who have air-gapped and/or on-premises setups where we add additional Fleet Servers.
The problem is that we make it very hard to know what to actually set on the Fleet Server. We have the scalability guide: https://www.elastic.co/guide/en/fleet/8.6/fleet-server-scalability.html#recommend-settings-scaling-agents which in version 8.6 still lists a table of suggested values to add to the Fleet Servers. In versions after 8.6 that table is gone, which is a problem because I don't know exactly what I should add.
This also means that this PR is based on values that are many versions old and could therefore be outdated. Another main issue is that we simply list the settings without any reference values.
The simplest approach I found was to add a boolean toggle for each sizing and append the needed settings to `agent.yml.hbs`. All toggles are off by default, so we keep using the default values that ECH, ECE, and ECK set. I do not want to change the defaults.
Two things I would like to do:
- `agent.yml.hbs` and simplify this
The way I tested it is as follows; since this is nonetheless a pretty major change, I hope we can test it even more thoroughly.
- `elastic-package test`, which didn't show any errors.
- `elastic-package install`, then checked out the new version, upgraded the integration in the UI, and watched as the Fleet Server stayed healthy. Took a diagnostic of the Fleet Server (elastic-agent-diagnostics-2025-05-02T16-02-33Z-00.zip) and saw that the config correctly added the settings for 5.000.
Checklist
- `changelog.yml` file.
~~- [ ] I have verified that any added dashboard complies with Kibana's Dashboard good practices~~
Related issues
Screenshots