Skip to content

Conversation

mveroone
Copy link

@mveroone mveroone commented Sep 27, 2025

Which problem is this PR solving?

The current implementation of the Event Loop Utilization passes a delta value to the call instead of an absolute one.
The NodeJS perf_hooks documentation is a little ambiguous but it does say that if calling eventLoopUtilization() with 1 argument, it should be the result of a call to that same function without argument :

utilization1 <Object> The result of a previous call to eventLoopUtilization().

(Emphasis is mine)

The result of this bug is that the value tends to stabilize over time because we pass a diff of a diff of a diff and we tend to just return the value since the start of the process instead of a delta since last execution

Short description of the changes

Replaced the setting of the lastValue internal variable with a call to the argument-less perf_hook.
Given that this only queries internal counters, I believe it's light enough that we can afford to call it twice per tick. The alternative would be to bypass the auto-calculation of the delteas provided as a helper and perform calculation of the ratio ourselves with a couple arithmetic operations.

Note

As a reference, Datadog library fixed the same bug last month, but they chose to disregard nodejs autocalculation of the utilization ratio and just do it themselves
DataDog/dd-trace-js#6344
(line 259 in the new version of the file, search for "elu" if needed)

@mveroone mveroone requested a review from a team as a code owner September 27, 2025 14:25
Copy link

linux-foundation-easycla bot commented Sep 27, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@mveroone mveroone force-pushed the fix/runtime_metrics/elu branch from 3ac9601 to 8d0191d Compare October 2, 2025 15:20
@mveroone
Copy link
Author

mveroone commented Oct 2, 2025

@d4nyll CCLA Signed. Sorry for the delay.

Note : I'm available for discussing it upon need, ideally during Europe business hours, but can arrange otherwise.

@d4nyll
Copy link
Member

d4nyll commented Oct 13, 2025

Hey @mveroone, thank you for raising the issue and apoloigies for taking my time on it. It is indeed a big bug.

In your fix, there are areas of code for which ELU metrics being reported won't take into account:

const elu = eventLoopUtilizationCollector(this._lastValue);
// From here
observableResult.observe(elu.utilization);
this._lastValue = elu;
// To here
this._lastValue = eventLoopUtilizationCollector();

Whilst it's not such a big deal (it's only the timespan of running those two lines), it would be preferable to leave no time gaps with what's being captured.

Calling eventLoopUtilizationCollector() twice this way also calls process.hrtime() twice under the hood.

What do you think about this implementation instead?

const currentELU = eventLoopUtilizationCollector();
const deltaELU = eventLoopUtilizationCollector(currentELU, this._lastValue);
this._lastValue = currentELU;
observableResult.observe(deltaELU.utilization);

It will:

  • ensure there are no time gaps in the ELU measurements
  • only call process.hrtime() once, as the second call (i.e. eventLoopUtilizationCollector(currentELU, this._lastValue)) will only perform a subtraction.

@mveroone
Copy link
Author

Hey @d4nyll ,
Thanks for taking the time to review this. That's a great catch, I had completely missed this. (being really not accustomed to developing in general and TS/JS in particular).

Your solution also has the advantage of being way more self-explaining and should likely confuse future readers less than the previous version.

I took the liberty to commit your suggestion, hope that's fine by you ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants