The check command used by the service runtime does not guarantee that the service is running #8178

@kaanyalti

Description

When the agent starts a service runtime, it uses a check command to validate that the relevant external service is installed. The function simply runs the check command prescribed in the service spec file. For the endpoint service, this command does not appear to account for whether the endpoint is actually running: if the endpoint service is not running, the check command can still return without any error.

For confirmed bugs, please report:

  • Version: Main branch
  • Operating System: Ubuntu 24.04.2 LTS and CentOS Stream 9
    • Mac is most likely affected as well
    • Windows is most likely not affected

Steps to Reproduce:
I ran into this while working on #6394, so the steps below use a tamper-protected agent and endpoint; however, the bug should be reproducible without tamper protection as well.

  • Create an Elasticsearch deployment
  • In the Fleet UI, create a policy, add the endpoint integration, and enable tamper protection
  • Add an agent following deb/rpm instructions
  • Validate that both the agent and endpoint are running and are healthy (sudo elastic-agent status --output full)
  • Stop the endpoint service (systemctl stop ElasticEndpoint)
  • Remove endpoint's vault directory at /opt/Elastic/Endpoint/state/vault
  • Reinstall the same version of the agent with dpkg -i elastic-agent-<SAME VERSION>.deb
  • After waiting a bit, validate that the endpoint service is still not running
    and that the agent is unhealthy.
