
Improve NodeJS Lambda Layer cold start time #163


Closed — ezhang6811 wants to merge 7 commits from zhaez/cold-start-improvements

Conversation

@ezhang6811 (Contributor) commented on Mar 5, 2025

Description of changes:
Change 1: Using webpack to bundle the autoinstrumentation package improves tree-shaking, which significantly shortens the cold-start time for the lambda layer. Inspired by open-telemetry/opentelemetry-lambda#1679.

To measure the cold-start improvement, a sample Lambda function instrumented with the Lambda layer was deployed, and a script repeatedly invoked it. The average initialization times were then queried from CloudWatch Logs. The process was repeated with the Lambda layer removed from the function, and the difference between the two averages was taken as the Lambda layer's cold-start time.
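The arithmetic behind that measurement can be sketched as follows (the numbers are illustrative, not the PR's actual data):

```javascript
// Layer cold-start cost = mean init duration with the layer attached
// minus mean init duration without it.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function layerColdStartMs(withLayerMs, withoutLayerMs) {
  return mean(withLayerMs) - mean(withoutLayerMs);
}

const withLayer = [800, 700, 750];    // hypothetical init durations (ms)
const withoutLayer = [250, 200, 150]; // same function, layer removed
console.log(layerColdStartMs(withLayer, withoutLayer)); // 550
```

Subtracting the no-layer baseline isolates the layer's contribution from the function's own init cost.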

From my tests, the average cold start time for the Lambda layer improved from ~770 ms to ~350 ms, an almost 55% reduction. This could be improved further by resolving conflicting sub-dependency versions and providing more ESM-compatible dependencies, which would shrink the output bundle even more.

Change 2: Remove the auto-configuration-propagators dependency from the autoinstrumentation package, which eliminates the B3 and Jaeger propagator transitive dependencies. This reduces the size of the zipped layer from 15 MB to 11.6 MB, an approximately 23% decrease. Will update with performance improvements once I finish testing this change.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@ezhang6811 ezhang6811 requested a review from a team as a code owner March 5, 2025 01:23
@ezhang6811 ezhang6811 changed the title Bundle ADOT autoinstrumentation package with webpack Improve NodeJS Lambda Layer cold start time Mar 6, 2025
@ezhang6811 force-pushed the zhaez/cold-start-improvements branch from 8ecd37a to b5c9e6f on March 6, 2025 19:35
@ezhang6811 force-pushed the zhaez/cold-start-improvements branch from b5c9e6f to cab9e49 on March 6, 2025 19:36
@ezhang6811 force-pushed the zhaez/cold-start-improvements branch from 0512cd2 to 6df8880 on March 10, 2025 22:01
@ezhang6811 force-pushed the zhaez/cold-start-improvements branch from 6df8880 to c6b2d64 on March 10, 2025 22:04
@@ -22,7 +22,9 @@
"repository": "aws-observability/aws-otel-js-instrumentation",
"scripts": {
"clean": "rimraf build/*",
"compile": "tsc -p .",
"compile:tsc": "tsc -p .",
Member: Who will run "compile:tsc" if we shift "compile" to "compile:webpack" as the code-compiling entry point?

Contributor (author): It probably won't be run unless someone wants to see the previous package structure.
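For reference, the script split being discussed could look like this package.json fragment (a sketch; the PR's actual scripts may differ — "compile" hands off to webpack while "compile:tsc" is kept for inspecting the plain tsc output):

```json
{
  "scripts": {
    "clean": "rimraf build/*",
    "compile": "webpack --config webpack.config.js",
    "compile:tsc": "tsc -p ."
  }
}
```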

Comment on lines +4 to +7
"module": "es2020",
"target": "es2020",
"moduleResolution": "node",
},
Member: Is this the same target being used by the upstream OTel Lambda layer?

Contributor (author): Yes, see here.

Comment on lines +209 to +211
private getPropagator(): TextMapPropagator {
if (process.env.OTEL_PROPAGATORS == null || process.env.OTEL_PROPAGATORS.trim() === '') {
return new CompositePropagator({
Member: Is there a better option than rewriting this function by copying it from upstream? Could we just patch propagatorMap in the auto-configuration-propagators utils file?

Contributor (author): See #171; we revert to using the upstream dependency.
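For context, the OTEL_PROPAGATORS handling in the copied function boils down to env-var parsing with a fallback. A minimal sketch of that parsing step (the default list here is illustrative, not the PR's actual code):

```javascript
// Hypothetical sketch: return the list of propagator names from
// OTEL_PROPAGATORS, falling back to defaults when unset or blank.
function propagatorNames(env) {
  const raw = env.OTEL_PROPAGATORS;
  if (raw == null || raw.trim() === '') {
    // Illustrative defaults only.
    return ['tracecontext', 'baggage', 'xray'];
  }
  return raw
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name !== '');
}

console.log(propagatorNames({}));
console.log(propagatorNames({ OTEL_PROPAGATORS: 'b3, tracecontext' }));
```

Each name would then be mapped to a propagator instance and wrapped in a CompositePropagator; the point of the review comment is that the name-to-propagator map is the only piece that really needs changing.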

@@ -119,8 +124,8 @@
},
"files": [
"build/src/**/*.js",
"build/src/**/*.js.map",
Contributor: If we keep the compile:tsc command, there's no harm in keeping "build/src/**/*.js.map".

Contributor (author): Fixed in #171.

@@ -190,6 +206,32 @@ export class AwsOpentelemetryConfigurator {
return autoResource;
}

private getPropagator(): TextMapPropagator {
Contributor: nit: This can be a regular function outside of this class.

Contributor (author): Removed in #171.

@ezhang6811 ezhang6811 closed this Mar 18, 2025
@ezhang6811 ezhang6811 deleted the zhaez/cold-start-improvements branch March 18, 2025 17:15
ezhang6811 added a commit that referenced this pull request Mar 22, 2025
*Description of changes:*
This PR is a rehash of #163 with some minor changes reverted after
several autoinstrumentation package dependencies were upgraded in #168.
The changes suggested by the unaddressed comments in the previous PR are
addressed here.

The primary change made to reduce the cold start time was to bundle the
ADOT autoinstrumentation package with webpack instead of using the TS
compiler. Using the same build targets as the upstream (see [this
PR](https://github.com/open-telemetry/opentelemetry-lambda/pull/1679/files)),
the cold start performance of the layer was improved by about 50%, from
~765ms to ~385ms.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.