EW-9366 Provide return event timestamp if span is reported without valid incomingRequest #5282

fhanau · 2025-10-10T23:05:14Z

Edge testing showed that the given error only occurs after the return event timestamp has already been provided. This arises in one edge case: a DO request is aborted and spans are only reported while the IncomingRequest is being destructed. At that point we can no longer look up the current time, but the return event timestamp is still useful.
As indicated in the TODO comment for the case where the IoContext itself is no longer available, we'll want to change span lifetimes at a later time so that they don't outlive the IncomingRequest/IoContext, but for now we'll provide the most reasonable timestamp available to us.
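
The fallback described above can be sketched roughly as follows. This is a minimal illustration and not the workerd implementation: `IncomingRequestStub`, `resolveSpanEndTime`, and the `Timestamp` alias are hypothetical names, and `std::optional` stands in for whatever the real code uses to carry the return event timestamp.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical stand-ins for illustration only; these are not workerd types.
using Timestamp = std::chrono::nanoseconds;

struct IncomingRequestStub {
  Timestamp now;  // the value a timer lookup would return
  Timestamp getTime() const { return now; }
};

// Picks the end timestamp for a span that is being closed.
// - If the request is still valid, ask it for the current time.
// - Otherwise fall back to the return event timestamp recorded earlier.
// - If neither is available, return nullopt so the caller can report an error.
std::optional<Timestamp> resolveSpanEndTime(
    const IncomingRequestStub* request,
    std::optional<Timestamp> returnEventTime) {
  if (request != nullptr) {
    return request->getTime();
  }
  return returnEventTime;
}

// Example: the request is already gone, so the recorded return time is used.
inline Timestamp exampleAbortedRequest() {
  std::optional<Timestamp> result = resolveSpanEndTime(nullptr, Timestamp(1500));
  assert(result.has_value());
  return *result;
}
```

Returning an empty optional when both sources are missing keeps the "this should never happen" case visible to the caller, which matches the idea of treating it as an error instead of a warning.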

Also perform some cleanup while we're at it: promote warnings to errors (we have always had the return event available in this case, so if they still appear that is an actual issue) and avoid superfluous timer lookups.

fhanau · 2025-10-10T23:07:05Z

src/workerd/io/trace.c++

    // TODO(o11y): Once we report the user tracing spanOpen event as soon as a span is created, we
    // should be able to fold this virtual call and just get the timestamp directly.
-    span.emplace(kj::mv(operationName), startTime.orDefault(obs->getTime()));
+    span.emplace(kj::mv(operationName), startTime.orDefault([&obs]() { return obs->getTime(); }));


This saves us a timer lookup whenever we already have a timestamp, which is something we want to get right for performance.
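
To illustrate why passing a callable avoids the lookup, here is a small self-contained sketch. It uses `std::optional` as a stand-in and does not use `kj::Maybe`; `CountingTimer`, `eagerOrDefault`, and `lazyOrDefault` are hypothetical helpers written only for this example.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical timer that counts how often it is queried.
struct CountingTimer {
  int lookups = 0;
  long getTime() { ++lookups; return 42; }
};

// Eager: the caller has to compute the fallback before making the call.
long eagerOrDefault(const std::optional<long>& value, long fallback) {
  return value.has_value() ? *value : fallback;
}

// Lazy: the fallback callable runs only when there is no value.
template <typename F>
long lazyOrDefault(const std::optional<long>& value, F&& makeFallback) {
  return value.has_value() ? *value : std::forward<F>(makeFallback)();
}

// Example: with a start time already present, only the eager call hits the timer.
inline int lookupsWithKnownStartTime() {
  CountingTimer timer;
  std::optional<long> startTime = 7;
  eagerOrDefault(startTime, timer.getTime());                        // one lookup
  lazyOrDefault(startTime, [&timer]() { return timer.getTime(); });  // no lookup
  assert(timer.lookups == 1);
  return timer.lookups;
}
```

In the eager form the argument expression is evaluated before the function is entered, so the timer is queried even when the value is present. The lazy form defers that work to the branch that actually needs it.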

github-actions · 2025-10-10T23:07:11Z

The generated output of @cloudflare/workers-types matches the snapshot in types/generated-snapshot 🎉

fhanau · 2025-10-10T23:11:02Z

As discussed with Mar in a call, it might be possible to work around this edge case by restructuring IoContext::IncomingRequest::~IoContext_IncomingRequest(), but I don't understand that code very well, and it would be difficult to rule out that such a change introduces new bugs.

fhanau · 2025-10-10T23:18:44Z

@jmorrell-cloudflare this addresses a warning encountered in edge testing – in a rare edge case, we are unable to get a timestamp when closing a span following an exception for DO-based requests. This PR proposes just providing the return timestamp in that case (which is already available at that time) without reporting a warning, as it's the best time we have (unless we figure out a different way to manage these span lifetimes later). Does that approach work for you?

mar-cf · 2025-10-15T08:27:52Z

Missing tests or a link to a repro.

If this happens on abort only, is it that concerning that we wouldn't have a timestamp to use?

What would it take to change the spans not to outlive the aborted context and report early?

src/workerd/io/tracer.c++

fhanau · 2025-10-15T13:48:26Z

> Missing tests or a link to a repro.
>
> If this happens on abort only, is it that concerning that we wouldn't have a timestamp to use?
>
> What would it take to change the spans not to outlive the aborted context and report early?

Mar, I thought we already discussed the rationale for this change, including 1) and 2), at the Friday meeting. We also touched on 3): it might be possible, but I'm not sure it can be done safely, as summarized in #5282 (comment). We would want some more senior folks to weigh in, and it would be hard to rule out that it would introduce other bugs. My understanding is that getting span lifetimes right 100% of the time is not in scope for our next milestone.
That being said, these questions are non-trivial and I'm happy to discuss them some more internally.

fhanau requested review from jmorrell-cloudflare and mar-cf October 10, 2025 23:05

fhanau requested review from a team as code owners October 10, 2025 23:05

fhanau force-pushed the felix/101025-stw-time branch from a8de521 to f6e2966 Compare October 10, 2025 23:11

fhanau added the observability label Oct 13, 2025

mar-cf reviewed Oct 15, 2025

fhanau force-pushed the felix/101025-stw-time branch from f6e2966 to b9c7d97 Compare October 15, 2025 13:58
