Sending traces to Datadog using OpenTelemetry from an AWS Lambda #11889

Open
dglozano opened this issue May 4, 2022 · 22 comments

@dglozano

dglozano commented May 4, 2022

When using OpenTelemetry instrumentation SDK, there are two different ways to send data to Datadog, as described here.

  1. Send traces to the OpenTelemetry collector, and use the Datadog exporter to forward them to Datadog, or

  2. Ingest traces with the Datadog Agent, which collects them for Datadog.

These two alternatives probably work fine when running a "normal" application, but I have come to a dead end when trying to send OpenTelemetry data to Datadog from an AWS Lambda using .NET.

  1. Following the first approach, I added the AWS-managed Lambda layer for the ADOT Collector to my Lambda function. However, when I tried to configure the Datadog exporter I realised that wasn't possible, because this build of the OTel Collector is a stripped-down version that doesn't contain that exporter. (Refs.: https://github.com/open-telemetry/opentelemetry-lambda/blob/main/docs/design_proposal.md and "Support build Lambda Collector extension layer on demand", open-telemetry/opentelemetry-lambda#100.)
  2. Since the first approach (which I would prefer) didn't work, I tried the second one. I added the Datadog Agent via the Datadog Lambda extension and tried to send data to it over OTLP, as described here. However, it seems that the Datadog Agent version included in the Lambda layer doesn't allow OTLP ingestion, although I couldn't confirm this because I couldn't find it anywhere in the documentation.

Is there any possible workaround? Could the datadog agent included in the lambda extension accept ingestion over OTLP?

I only managed to make it work using what would be the flow described at the bottom in this image, which is to use DD SDK instead... but I would prefer to use the OpenTelemetry SDK if possible.

image

@ianwremmel

@dglozano did you come up with any solution? I've got some tracing data because Datadog is pulling from X-Ray, and I've got all the auto-instrumentation coming out of the Datadog Extension, but it's super frustrating that the current state of OTel on Lambda is:

traces: App -> ADOT Otel Layer -> XRay Backend -> Datadog via Polling
metrics: App -> Datadog Extension Layer -> Datadog

I thought I might be able to use opentelemetry-exporter-datadog to send my OTel-instrumented traces directly to the Datadog collector, but (1) it's deprecated and not compatible with the current OTel API, and (2) it was the proverbial straw that pushed my app + layer size over the 250 MB limit (the OTel layer and the Datadog layer are pretty big, apparently).

I thought I might be able to get around this by building my own lambda layer from the opentelemetry repo, but even that appears to be the stripped-down version without built-in Datadog support.

@dglozano
Author

dglozano commented Sep 19, 2022

@ianwremmel unfortunately not, I haven't come up with any solution yet 😞 There was some talk within my team about running our own OTel Collector instance that our Lambdas could send traces/logs to, but we haven't done anything yet.

I will try to remember to update this issue if we come up with any working solution in the future 🙏

@RangelReale

According to this link, OTLP is disabled by default on the agent; you need to enable it in the config file.

@RangelReale

Just tried this, it loaded the config file but the endpoint didn't work, like you said.

@matthias-pichler

matthias-pichler commented Nov 28, 2022

I tried configuring it via the DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=localhost:4318 environment variable instead of the config file, but without success

@aereal

aereal commented Dec 28, 2022

I have encountered the same problem.

Regardless of the config, pkg/otlp is disabled in serverless flavor using the build tag:
https://github.com/DataDog/datadog-agent/blob/e55de399f725d2d4b66cafd52570ec3f221471bb/pkg/otlp/config_serverless_noop.go

This change was introduced by #10068 to reduce the layer size.

I'm considering using an OpenTelemetry Collector distribution with the Datadog exporter instead of the Datadog serverless agent. However, we currently have no safe way to configure DD_API_KEY, such as fetching the API key from Secrets Manager by passing the secret's ARN in an environment variable.

@matthias-pichler

@maxday would it be possible to publish two Lambda layer versions, one with OTLP disabled and one with OTLP enabled (something like arn:aws:lambda:<AWS_REGION>:464622532012:layer:Datadog-Extension-ARM-OTLP:36)? While I understand that the increased bundle size is undesirable for most use cases, adding the ADOT layer separately increases cold start time too (open-telemetry/opentelemetry-lambda#263), and the following command shows

aws lambda get-layer-version-by-arn --arn arn:aws:lambda:eu-west-1:901920570463:layer:aws-otel-nodejs-arm64-ver-1-8-0:1

{
  "Content": {
    "Location": "https://awslambda-eu-west-1-layers.s3.eu-west-1.amazonaws.com/snapshots/901920570463/aws-otel-nodejs-arm64-ver-1-8-0-cc995c8b-3212-4fca-8188-397fa6a2a106?versionId=RDR5zYln3hllivIxXLoY8yMhjYQyHHbB&...",
    "CodeSha256": "Lm8D/057SuTlHonEZCVvPlNUKlj8EF4QkB7iVmNQxQM=",
    "CodeSize": 14208950
  },
  "LayerArn": "arn:aws:lambda:eu-west-1:901920570463:layer:aws-otel-nodejs-arm64-ver-1-8-0",
  "LayerVersionArn": "arn:aws:lambda:eu-west-1:901920570463:layer:aws-otel-nodejs-arm64-ver-1-8-0:1",
  "Description": "",
  "CreatedDate": "2022-12-20T20:23:16.666+0000",
  "Version": 1
}

that the ADOT layer is about 14.2 MB
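
As a quick check of that figure, the sketch below converts the `CodeSize` field (reported in bytes) from the response above into decimal megabytes. The helper name `layer_size_mb` is made up for illustration. `CodeSize` describes the layer archive, so the unzipped footprint that counts toward Lambda's size limit is likely larger.

```python
import json

# Trimmed copy of the `get-layer-version-by-arn` response shown above;
# only the field needed for the calculation is kept.
RESPONSE = """
{
  "Content": {"CodeSize": 14208950},
  "Version": 1
}
"""


def layer_size_mb(response_json: str) -> float:
    """Return the layer archive size in decimal megabytes (1 MB = 10**6 bytes).

    `layer_size_mb` is a hypothetical helper, not part of any AWS tooling.
    """
    code_size_bytes = json.loads(response_json)["Content"]["CodeSize"]
    return code_size_bytes / 1_000_000


if __name__ == "__main__":
    print(f"{layer_size_mb(RESPONSE):.1f} MB")  # prints "14.2 MB"
```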

@purple4reina
Contributor

I'm pleased to announce that we have released v41 of the Datadog AWS Lambda Extension with OpenTelemetry support. You can enable the feature by setting either DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=localhost:4318 or DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT=localhost:4317 as described here. Then tell your function where to send the traces by setting OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 (use the port of the receiver you enabled; 4317 corresponds to the gRPC endpoint above). Traces will then be discoverable in the Datadog UI.
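
One easy mistake with this setup is enabling one receiver and pointing the exporter at the other port. The sketch below is a small sanity check for that, assuming the receiver values always have the `host:port` form shown above. The helper names `enabled_receiver_ports` and `exporter_matches_receiver` are hypothetical and not part of the Datadog extension or the OpenTelemetry SDK.

```python
from urllib.parse import urlparse

# Receiver settings named in the comment above.
RECEIVER_VARS = (
    "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT",
    "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT",
)


def enabled_receiver_ports(env: dict[str, str]) -> set[int]:
    """Collect the port of every OTLP receiver that is set in `env`."""
    ports = set()
    for name in RECEIVER_VARS:
        value = env.get(name)
        if value:
            # Assumes a "host:port" value such as "localhost:4318".
            ports.add(int(value.rsplit(":", 1)[1]))
    return ports


def exporter_matches_receiver(env: dict[str, str]) -> bool:
    """True if OTEL_EXPORTER_OTLP_ENDPOINT targets a port a receiver listens on."""
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    port = urlparse(endpoint).port  # None when the endpoint is missing or has no port
    return port is not None and port in enabled_receiver_ports(env)


# Example environments: gRPC receiver with a matching exporter port,
# and HTTP receiver with an exporter still pointing at the gRPC port.
GRPC_ENV = {
    "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT": "localhost:4317",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
}
MISMATCHED_ENV = {
    "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT": "localhost:4318",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
}

if __name__ == "__main__":
    print(exporter_matches_receiver(GRPC_ENV))        # True
    print(exporter_matches_receiver(MISMATCHED_ENV))  # False
```

In a real function you would pass `dict(os.environ)` instead of the example dictionaries.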

@RangelReale

@purple4reina does this include metrics, or only traces?

@RangelReale

Just tested it, it worked fine for both traces and metrics! 👍

@bviolier

bviolier commented May 2, 2023

@RangelReale Next to the question in the other issue, what is your flushing strategy for OTLP in your own lambda code? Are you force flushing anything, or just keeping the defaults?

@RangelReale

> @RangelReale Next to the question in the other issue, what is your flushing strategy for OTLP in your own lambda code? Are you force flushing anything, or just keeping the defaults?

I think this issue is not settled yet. I had performance problems in a high-usage Lambda, so for traces I've set the sampling to send only 0.1% of traces (which makes it useless). I created an env var to be able to tune this value per service, but this is still ongoing.

The recommended flush-per-request approach made my big Lambda (~120 req/s) about 3x slower, so it is a no-go.
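
For context on the sampling knob: OpenTelemetry SDKs generally let you set a ratio-based sampler through the standard `OTEL_TRACES_SAMPLER=traceidratio` and `OTEL_TRACES_SAMPLER_ARG` environment variables. The sketch below illustrates the idea behind ratio-based sampling in a simplified form. It is not the SDK's implementation, and `should_sample` and `sampled_fraction` are illustrative names. The decision depends only on the trace ID, so every span in a trace gets the same verdict.

```python
import random

# Compare only the low 64 bits of the 128-bit trace ID.
TRACE_ID_MASK = (1 << 64) - 1


def should_sample(trace_id: int, ratio: float) -> bool:
    """Keep a trace if its low 64 bits fall below `ratio` of the 64-bit range."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = round(ratio * (1 << 64))
    return (trace_id & TRACE_ID_MASK) < bound


def sampled_fraction(ratio: float, n: int = 200_000, seed: int = 0) -> float:
    """Estimate the kept fraction over `n` pseudo-random 128-bit trace IDs."""
    rng = random.Random(seed)
    kept = sum(should_sample(rng.getrandbits(128), ratio) for _ in range(n))
    return kept / n


if __name__ == "__main__":
    # Roughly 1% of traces survive a ratio of 0.01.
    print(f"{sampled_fraction(0.01):.4f}")
```

Lowering the ratio reduces export volume, but it does not by itself remove the cost of a synchronous flush on the invocations that are sampled.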

@johnkoehn

I'm struggling to get the datadog agent to export the OTEL traces on my lambda. @RangelReale how did you configure your lambda? Did you use the AWS ADOT collector with the datadog layers, or did you just configure the datadog layers?

@RangelReale

I gave up on sending traces from lambdas, I'm still only sending 1% of traces.

We are moving to a model of sending traces and metrics from the Lambda OTel collector to the k8s OTel collector that we already have. This seems to perform better, so we will soon test enabling more traces.

@johnkoehn

Ack, what a mess. Thanks @RangelReale!

@benbpyle

benbpyle commented Jan 5, 2024

Is this still where things are today? Anyone have any luck pushing traces to the Extension via OTel? The Agent works fine but the extension, not so much.

@benbpyle

In case anyone is looking for this in the future. I have the DD Extension collecting OTel traces in a Lambda. I did this by setting the ENV VAR

DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT: localhost:4317

From there, I made sure that I was flushing my traces correctly. I'm using Rust and Tokio and this worked great.

My SAM template is referencing this layer

Layers:
        - arn:aws:lambda:us-west-2:464622532012:layer:Datadog-Extension-ARM:53

Hope this helps someone
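
The comment above does not show its Rust flushing code, so here is a rough Python sketch of the general "flush before the handler returns" pattern instead. The reasoning is that Lambda may freeze the execution environment once the handler returns, so spans still sitting in a batch buffer might never be exported. The decorator `flush_after_invocation` and the stub `RecordingProvider` are hypothetical. The OpenTelemetry Python SDK's `TracerProvider` exposes a `force_flush(timeout_millis)` method of this shape, but check the signature in your SDK version.

```python
import functools
from typing import Any, Callable, Protocol


class Flushable(Protocol):
    """Anything with a force_flush method, e.g. an SDK tracer provider."""

    def force_flush(self, timeout_millis: int = ...) -> bool: ...


def flush_after_invocation(provider: Flushable, timeout_millis: int = 2000):
    """Wrap a Lambda handler so buffered spans are flushed on every exit path."""

    def decorator(handler: Callable[[Any, Any], Any]) -> Callable[[Any, Any], Any]:
        @functools.wraps(handler)
        def wrapper(event: Any, context: Any) -> Any:
            try:
                return handler(event, context)
            finally:
                # Runs on success and on exceptions alike.
                provider.force_flush(timeout_millis)

        return wrapper

    return decorator


class RecordingProvider:
    """Stand-in for a real tracer provider so the sketch runs without the SDK."""

    def __init__(self) -> None:
        self.flush_calls = 0

    def force_flush(self, timeout_millis: int = 30000) -> bool:
        self.flush_calls += 1
        return True
```

To use it, decorate the handler with `@flush_after_invocation(provider)`, where `provider` is the tracer provider your function configured at startup. As noted earlier in the thread, flushing synchronously on every request adds latency, so the timeout is worth tuning.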

@jh7459-gh

is anything stopping someone from sending protobuf trace data directly to datadog?

aside from the fact it's a non public API & requires a bit of dark magic

@ktruedat

ktruedat commented Aug 9, 2024

> I'm struggling to get the datadog agent to export the OTEL traces on my lambda. @RangelReale how did you configure your lambda? Did you use the AWS ADOT collector with the datadog layers, or did you just configure the datadog layers?

I have the same problem here: the Datadog Exporter, even with all the configuration from above, does not send the OTel traces, only traces collected by the Datadog library, which leaves us vendor-locked. Did you manage to solve this, @johnkoehn? Thanks!

@ktruedat

ktruedat commented Aug 9, 2024

> In case anyone is looking for this in the future. I have the DD Extension collecting OTel traces in a Lambda. I did this by setting the ENV VAR
>
> DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT: localhost:4317
>
> From there, I made sure that I was flushing my traces correctly. I'm using Rust and Tokio and this worked great.
>
> My SAM template is referencing this layer
>
> Layers:
>         - arn:aws:lambda:us-west-2:464622532012:layer:Datadog-Extension-ARM:53
>
> Hope this helps someone

@benbpyle I have tried this but it doesn't send OTel traces, can you please share more info regarding how you flush the traces correctly? Thanks!

@RangelReale

I gave up on sending traces from Lambdas. I am sending only 1% of traces, because having to sync on each and every call makes it too slow.
We are also moving away from Lambdas in favor of k8s.

@benbpyle

benbpyle commented Aug 9, 2024

@ktruedat here is a working sample

https://github.com/benbpyle/lambda-dotnet-datadog-sam.

I also wrote a blog article on it

https://binaryheap.com/rust-and-opentelemetry-with-lambda-datadog/

Hope this helps
