security-agent.yaml not found during one-step install #14836

Open
silenzium opened this issue Dec 22, 2022 · 21 comments

@silenzium

I want to install the Datadog Agent on Debian 11 (bullseye). Therefore I use the one-step install script from the docs on the Datadog website: bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh)"

When installing the agent v7 on ubuntu using the one-step script, I get an error during installation saying the /etc/datadog-agent/security-agent.yaml file is missing. There is a security-agent.yaml.example file, but it doesn't get copied.

When using the DD_INSTALL_ONLY=true flag, the installation runs through smoothly. When I start the agent afterwards, I get the same error, but the agent runs: I can see common metric checks continuously appearing in the Datadog logs.

I cannot find anything useful about the security-agent.yaml file, so are you already aware of this?
Can I just copy over the example file as is? At least it seems to work.

@vandonr-amz

I have the same issue, but I'm not entirely sure the problem is linked to that last line of log before the error. I also have a log above saying No datadog.yaml file detected, not starting the agent, which is a bit worrying.
Especially since it's followed a couple of lines later by * Starting the Datadog Agent....

@ndroo

ndroo commented Feb 7, 2023

+1

For now I'm going to just do the rename myself, but this feels like a hack.

@KSerrania
Collaborator

KSerrania commented Feb 14, 2023

Hi,

Thanks for raising these issues!

I have the same issue, but I'm not entirely sure the problem is linked to that last line of log before the error. I also have a log above saying No datadog.yaml file detected, not starting the agent, which is a bit worrying.
Especially since it's followed a couple of lines later by * Starting the Datadog Agent....

This is the normal behavior of the script:

  • the Agent install script first installs the datadog-agent package,
  • the datadog-agent package, as part of its post-install hooks, checks if datadog.yaml already exists; if it does, it starts the Agent, otherwise it logs the No datadog.yaml file detected, not starting the agent message (which doesn't make the install fail, it's only here for informative purposes). Here, we are in the second case,
  • then, the install script sets up the /etc/datadog-agent/datadog.yaml configuration file (by copying the example file provided by the package, and filling the file with info passed on the command line such as DD_API_KEY), and starts the Agent (and logs the * Starting the Datadog Agent... line when it does).
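
The configuration step in the last bullet can be sketched roughly as follows. This is an illustration only, not the actual install script: setup_config is a hypothetical helper, and it assumes the packaged datadog.yaml.example contains an api_key: line.

```shell
# Illustrative sketch only: this is NOT the real install script.
# setup_config is a hypothetical helper mirroring the step described above.
setup_config() {
  local conf_dir="$1"
  local api_key="$2"
  local conf="$conf_dir/datadog.yaml"

  if [ -e "$conf" ]; then
    # An existing configuration is left untouched.
    echo "* Keeping old $conf configuration file"
  else
    # Start from the example file shipped by the package and fill in the
    # value passed on the command line (assumes the key contains no '|' or '&').
    sed "s|^api_key:.*|api_key: ${api_key}|" "$conf_dir/datadog.yaml.example" > "$conf"
    echo "* Created $conf from the example file"
  fi
}
```

Calling setup_config /etc/datadog-agent "$DD_API_KEY" twice would only write the file the first time, which matches the "Keeping old ... configuration file" line seen when the script is re-run.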

We could improve this by stressing the fact that this log line is not an error (e.g. by adding an [INFO] tag before that log line).

When installing the agent v7 on ubuntu using the one-step script, I get an error during installation saying the /etc/datadog-agent/security-agent.yaml file is missing. There is a security-agent.yaml.example file, but it doesn't get copied.

My guess here is that you are seeing a message like /etc/datadog-agent/security-agent.yaml not found. Exiting datadog-agent-security in the output of the installation script. Could you confirm that this is what you are seeing?

The security-agent is an optional component of the Agent, that is enabled by creating the security-agent.yaml configuration file. The install script doesn't enable that component (it only provides the security-agent.yaml.example example file in case you would like to enable this feature).

When the datadog-agent service starts, it tries to start all optional services of the Agent, including the datadog-agent-security service which handles the security-agent component. The datadog-agent-security service only starts if the security-agent.yaml file is found.

That is why you are seeing the /etc/datadog-agent/security-agent.yaml not found log message. That is also why you don't get that message with DD_INSTALL_ONLY=true, as this disables the part of the script which starts the datadog-agent service.

If you want to enable the security-agent components, you have to create the security-agent.yaml file after running the install script; if you don't, nothing needs to be done.
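
As a minimal sketch of that last step: the helpers below check for security-agent.yaml and, if it is missing, create it from the packaged example. The function names are hypothetical, and the copied file would still need to be reviewed before relying on it, since the example may ship with its settings disabled.

```shell
# Succeeds if the datadog-agent-security service would find its config file.
security_agent_configured() {
  [ -f "$1/security-agent.yaml" ]
}

# Hypothetical helper: create security-agent.yaml from the packaged example
# without overwriting an existing file. The example file is kept in place.
enable_security_agent() {
  local dir="$1"
  if security_agent_configured "$dir"; then
    echo "security-agent.yaml already present in $dir"
  else
    cp "$dir/security-agent.yaml.example" "$dir/security-agent.yaml"
    echo "created $dir/security-agent.yaml from the example file"
  fi
}
```

On an installed host this would be run as root with enable_security_agent /etc/datadog-agent, followed by a restart of the Agent service.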

@silenzium
Author

silenzium commented Feb 14, 2023

@KSerrania thanks for the detailed explanation.

My guess here is that you are seeing a message like /etc/datadog-agent/security-agent.yaml not found. Exiting datadog-agent-security in the output of the installation script. Could you confirm that this is what you are seeing?

Yes correct, but the script did also quit and not finish the installation, so I could not start the agent afterwards.
It would definitely be an improvement if the install script finished with a warning instead of exiting with an error.

@KSerrania
Collaborator

Yes correct, but the script did also quit and not finish the installation, so I could not start the agent afterwards.

Thanks for the reply! The fact that the script ends with an error is definitely not expected. Would you be able to provide the full logs of your failed installation attempt (redacted, in case the API key is visible in your command) and details about your environment, so we can try troubleshooting your issue? Thanks!

@silenzium
Author

Sure, I'll try to get into this again tonight or asap and will post the logs here.

@silenzium
Author

Sorry, I can't find the time currently. Could someone else share their logs, please?

@marcustxl

marcustxl commented Mar 1, 2023

Hi @KSerrania
I have the same issue as well. I am able to install it with the DD_INSTALL_ONLY=true flag, but not able to start it with service datadog-agent start.

I also tried running without the flag and this is what I get:

DD_API_KEY=xx DD_SITE="datadoghq.com" bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh)"

* Datadog Agent 7 install script v1.15.0

* Installing apt-transport-https, curl and gnupg

Ign:1 https://apt.datadoghq.com stable InRelease
Hit:2 https://apt.datadoghq.com stable Release
Hit:3 http://archive.ubuntu.com/ubuntu focal InRelease
Get:4 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:6 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Hit:7 http://ppa.launchpad.net/openjdk-r/ppa/ubuntu focal InRelease
Get:8 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Fetched 336 kB in 1s (448 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
curl is already the newest version (7.68.0-1ubuntu2.16).
gnupg is already the newest version (2.2.19-3ubuntu2.2).
apt-transport-https is already the newest version (2.0.9).
0 upgraded, 0 newly installed, 0 to remove and 15 not upgraded.

* Installing APT package sources for Datadog

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

gpg: key 32637D44F14F620E: "Datadog, Inc. Master key (2020-09-08) <[email protected]>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

gpg: key 32637D44F14F620E: "Datadog, Inc. Master key (2020-09-08) <[email protected]>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

gpg: key D3A80E30382E94DE: "Datadog, Inc <[email protected]>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1
* Installing the Datadog Agent package

Ign:1 https://apt.datadoghq.com stable InRelease
Hit:2 https://apt.datadoghq.com stable Release
Reading package lists...
  Installing package(s): datadog-agent datadog-signing-keys

Reading package lists...
Building dependency tree...
Reading state information...
datadog-signing-keys is already the newest version (1:1.2.0-1).
datadog-agent is already the newest version (1:7.43.0-1).
0 upgraded, 0 newly installed, 0 to remove and 15 not upgraded.
W: --force-yes is deprecated, use one of the options starting with --allow instead.

* Keeping old /etc/datadog-agent/datadog.yaml configuration file

* Starting the Datadog Agent...

Restarting Datadog Agent
 * Stopping Datadog Agent datadog-agent
   ...done.
 * Stopping Datadog Process Agent datadog-agent-process
   ...done.
 * Stopping Datadog Trace Agent (APM) datadog-agent-trace
   ...done.
/etc/datadog-agent/security-agent.yaml not found. Exiting datadog-agent-security
 * Starting Datadog Agent datadog-agent
   ...fail!
Error starting Datadog Agent
It looks like you hit an issue when trying to install the Datadog Agent.

Troubleshooting and basic usage information for the Datadog Agent are available at:

    https://docs.datadoghq.com/agent/basic_agent_usage/

@KSerrania
Collaborator

Hi @marcustxl, thanks for replying!

Could you provide the contents of the Agent logs on that system (they should be in /var/log/datadog/agent.log)? This should give information on why the Agent failed to start. Please make sure that no secrets are present there beforehand.
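
One way to prepare such a log for sharing is sketched below. The helper names are hypothetical, and the pattern assumes that secrets look like runs of 32 or more hexadecimal characters (the usual shape of a Datadog API key); anything else sensitive would still have to be removed by hand.

```shell
# Mask anything that looks like an API key before sharing a log.
# Assumption: keys are runs of 32+ hexadecimal characters; adjust if not.
redact_agent_log() {
  sed -E 's/[0-9a-fA-F]{32,}/<redacted>/g' "$1"
}

# Print only the ERROR-level lines of the (redacted) log.
# Relies on the " | ERROR | " field layout seen in agent.log.
agent_log_errors() {
  redact_agent_log "$1" | grep ' | ERROR | '
}
```

For example, agent_log_errors /var/log/datadog/agent.log prints just the error lines, which is often enough to see why the Agent failed to start.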

@marcustxl

Here are the contents:

2023-03-01 08:43:49 UTC | CORE | INFO | (pkg/logs/logs.go:149 in Stop) | Stopping logs-agent
2023-03-01 08:43:49 UTC | CORE | INFO | (pkg/logs/logs.go:158 in Stop) | logs-agent stopped
2023-03-01 08:43:49 UTC | CORE | INFO | (cmd/agent/subcommands/run/command.go:537 in stopAgent) | See ya!
2023-03-01 08:44:54 UTC | CORE | INFO | (pkg/util/log/log.go:590 in func1) | Features detected from environment:
2023-03-01 08:44:54 UTC | CORE | INFO | (pkg/runtime/runtime.go:27 in func1) | runtime: final GOMAXPROCS value is: 2
2023-03-01 08:44:54 UTC | CORE | INFO | (cmd/agent/subcommands/run/command.go:248 in startAgent) | Starting Datadog Agent v7.43.0
2023-03-01 08:44:54 UTC | CORE | INFO | (cmd/agent/subcommands/run/command.go:299 in startAgent) | pid '5647' written to pid file '/opt/datadog-agent/run/agent.pid'
2023-03-01 08:44:54 UTC | CORE | ERROR | (cmd/agent/subcommands/run/command.go:309 in startAgent) | Error while getting hostname, exiting: unable to reliably determine the host name. You can define one in the agent config file or in your hosts file

@KSerrania
Collaborator

KSerrania commented Mar 1, 2023

Thanks, that confirms my initial feeling. As mentioned above, the security-agent message is just an info-level message, and isn't the actual cause of the crash.

The real error is the Error while getting hostname, exiting: unable to reliably determine the host name. You can define one in the agent config file or in your hosts file line, which indicates you are encountering this issue: #14152. I suggest reading the advice here: #14152 (comment).

In this specific case, you may be able to work around this by specifying DD_HOSTNAME=<your host name> in the parameters of the install script.

@KSerrania
Collaborator

Also, out of curiosity, could you give some more information on the system you are running the install script on (OS version, is this a physical host, a VM, a container)?

@marcustxl

Hi, just to confirm, setting DD_HOSTNAME=default works right?

I am installing the datadog agent inside a docker image

@KSerrania
Collaborator

KSerrania commented Mar 1, 2023

Yes. To be precise, here are the options you have:

  • if you are installing the Agent for the first time, set the DD_HOSTNAME variable in the install script command, like this:
    DD_API_KEY=<api key> DD_HOSTNAME=<hostname> DD_SITE="datadoghq.com" bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh)"

  • if you already installed the Agent, edit the /etc/datadog-agent/datadog.yaml file and add hostname: <hostname> in it, then restart the Agent service (with something like service datadog-agent restart)

As for the value of the hostname, this is what is used to identify the host in Datadog (and used for the host tag on all data), so I suggest giving unique and identifiable names (e.g. the container id, returned by the hostname command).
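
For the second option, a small sketch of the edit to datadog.yaml is shown below. set_agent_hostname is a hypothetical helper, and it assumes the file has at most one top-level hostname: line, possibly commented out as "# hostname:".

```shell
# Hypothetical helper: set `hostname:` in a datadog.yaml-style file.
# Replaces an existing (possibly commented-out) top-level hostname line,
# or appends one if none is present.
# Assumes the name contains no '|' or '&' characters.
set_agent_hostname() {
  local conf="$1"
  local name="$2"
  local tmp
  tmp="$(mktemp)"
  if grep -Eq '^#? ?hostname:' "$conf"; then
    sed -E "s|^#? ?hostname:.*|hostname: ${name}|" "$conf" > "$tmp"
  else
    # The leading newline guards against a file without a trailing newline.
    { cat "$conf"; printf '\nhostname: %s\n' "$name"; } > "$tmp"
  fi
  cat "$tmp" > "$conf"   # write back in place, keeping owner and permissions
  rm -f "$tmp"
}
```

On an installed host this could be used as set_agent_hostname /etc/datadog-agent/datadog.yaml "$(hostname)", followed by service datadog-agent restart.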

@marcustxl

Hi @KSerrania, I managed to get it to work once I specified my hostname in the install script.

Thank you so much for the prompt reply and suggestions!

@kayocrd

kayocrd commented Apr 18, 2023

Alternatively, until this is resolved, I used DD_INSTALL_ONLY=true and then mv /etc/datadog-agent/security-agent.yaml.example /etc/datadog-agent/security-agent.yaml, and it worked.

@MichelBitar99

MichelBitar99 commented Sep 28, 2023

Is there a similar functionality on Windows? Similar to DD_INSTALL_ONLY=true ?

@lkoniecz

I am trying to run the agent in a docker image based on python:3.10-slim-bullseye

Installed via
https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh
with
DD_API_KEY="<edited>" DD_SITE="us5.datadoghq.com" DD_APM_INSTRUMENTATION_ENABLED=host DD_INSTALL_ONLY=true

# service datadog-agent start
Starting Datadog Agent: datadog-agent failed!
# cat /var/log/datadog/agent.log
2024-01-25 09:39:20 UTC | CORE | WARN | (pkg/util/log/log.go:666 in func1) | There was an error fetching the namespace from the context, using default
2024-01-25 09:39:20 UTC | CORE | INFO | (pkg/util/log/log.go:626 in func1) | 0 Features detected from environment: 
2024-01-25 09:39:20 UTC | CORE | INFO | (comp/forwarder/defaultforwarder/default_forwarder.go:242 in NewDefaultForwarder) | Retry queue storage on disk is disabled

@danpst

danpst commented Mar 21, 2024

I am trying to run the agent in a docker image based on python:3.10-slim-bullseye

Installed via https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh with DD_API_KEY="<edited>" DD_SITE="us5.datadoghq.com" DD_APM_INSTRUMENTATION_ENABLED=host DD_INSTALL_ONLY=true

# service datadog-agent start
Starting Datadog Agent: datadog-agent failed!
# cat /var/log/datadog/agent.log
2024-01-25 09:39:20 UTC | CORE | WARN | (pkg/util/log/log.go:666 in func1) | There was an error fetching the namespace from the context, using default
2024-01-25 09:39:20 UTC | CORE | INFO | (pkg/util/log/log.go:626 in func1) | 0 Features detected from environment: 
2024-01-25 09:39:20 UTC | CORE | INFO | (comp/forwarder/defaultforwarder/default_forwarder.go:242 in NewDefaultForwarder) | Retry queue storage on disk is disabled

@lkoniecz did you have any luck with this? having the exact same issue with a Debian node image, exact same logs

@lkoniecz

@danhuma

I think I changed install_script_agent7.sh to install_script.sh + export DD_AGENT_MAJOR_VERSION=7

@julio-ultimatejetvacations

I was able to fix this inside an Ubuntu container where I was testing whether I could manually install the Agent. As I don't know the exact hostname in this container environment, I just ran this:

sh -c "sed -i 's/# hostname:.*/hostname: $(hostname)/' /etc/datadog-agent/datadog.yaml"

If you check the datadog.yaml file afterwards, you can see that the hostname is filled in, and after restarting my Datadog Agent it worked.

[Screenshot, 2024-09-17 10:18:55 a.m.]
