Replies: 1 comment
-
I agree that the long-lasting traces were a bad choice in the past, and it’s good we’re getting rid of them. To me, it seemed that the long-lasting trace ID was essentially our session ID, which the product doesn’t handle well. Span Links are a step in the right direction. A significant issue with only having span links without a session ID is that building the complete tree of connected traces is typically more complex, as it requires traversing numerous spans of the tree to identify all connections. But even if we add a session ID later, span links will still be helpful in linking specific spans and traces together. Furthermore, OTEL also has SpanLinks. I think we still need a session ID to link traces, spans, logs, etc., all to one user session, but that’s a bigger discussion. I like the idea of |
-
Hey folks, this discussion is a request for comments and feedback on how we envision browser (and, in a broader sense, frontend) tracing. Things discussed in this post might or might not make it into the Sentry SDKs, but this discussion should serve as common ground. We’d love to hear your thoughts and concerns!
Fair warning: This post touches a lot of different parts of tracing in the browser. This is on purpose, because we have a lot of ideas for improvements in this space. Concrete projects or changes within this doc will of course be written down in greater detail in issues or separate RFCs (depending on applicability).
Apologies for the length in advance! If you're only interested in the proposed improvements, skip ahead to Future Plans.
Current Status and Pain Points
Let's take a look at the current state of all things browser tracing and what's cumbersome and bothering us.
Tracing Model
Since v8 of the SDK, we have kept one traceId around for the entire time a user stays on one route. On a navigation to another route, or a hard page load, we start a new trace.
Our develop docs contain more details on the current tracing model.
While today’s browser tracing model works well in most cases, it surfaces some clear pain points all over the Sentry product surface as well as on a more conceptual level:
SSR Traces
Whenever we start a trace in the backend and propagate the trace data via `<meta>` tags to the browser, we continue the trace in the browser as a child of the server trace. This poses a lot of problems, some of which we tried to address by fixing the symptoms:
- […] `http.server` SSR span.
- The `pageload` span sub-tree has a longer duration than its parent span. This violates the trace definition (as much as, or more than, the multiple root spans mentioned above).
- If the page carrying the `<meta>` tags is cached (e.g. by ISR or a CDN between application and users), the sampling decision is still carried over to the client.
Sampling Consistency
Related to the last problem, sampling consistency is a larger problem:
Currently, by default, whenever we start a new trace (i.e. right now on navigation or page load), we make a new sampling decision. This is not always what you want, especially not in the frontend. Some users want a more fine-grained option (within the same page); other users want a consistent sampling decision for the entire user journey.
With the Linked Traces project (more on that later), we also introduced `consistentTraceSampling`, which is a first step towards enabling longer-lived sampling decisions across subsequent traces. This works well but is hard to find, not very configurable, and not always fine-grained enough.
Request Spans
Today, `http.client` request spans for `fetch` and `XMLHttpRequest` requests are started and sent only if a parent span is active. In the default configuration, this means that any request happening while no pageload or navigation span is active is not tracked with a span (tracing headers are still attached). This causes a lot of confusion for users who don’t understand why some requests are not traced with a span while the backend request is traced.
Historically, we couldn’t send `http.client` root spans because their names wouldn’t be low-cardinality enough to fit our transaction name requirements.
Bundle Size and Modularity
All tracing functionality in the browser SDKs is added to the SDK via one integration, `browserTracingIntegration`. This has some advantages:
- […] `browserTracingIntegration` to your SDK. The exception here is meta frameworks, where, for the best OOTB experience, we already include `browserTracingIntegration` by default.
However, there are still some considerable disadvantages with the one-integration approach:
- […] `browserTracingIntegration`. So even if you only want backend and frontend errors connected but no spans, you have to pull in the entire bundle size weight of `browserTracingIntegration`.
Future Plans
Now that I ranted a bunch about the current pain points, let’s address how we plan to fix or at least improve them.
SSR Traces
Right now, we always continue the trace and the sampling decision of the server-side SSR trace when we find `<meta>` tags with Sentry tracing data. This was a semi-conscious decision at the time, but we realized an important thing here: it is “just” SDK behavior and we can change this at any time. Therefore, we propose to do the following instead:
- Keep the `http.server` span tree and the `pageload` trace separate. We no longer continue the trace on the client. This means server and client traces will have separate IDs (continue reading before you protest 🙂).
- The `pageload` span tree is its own trace, meaning it is a root span. Its `browser.request` child span will have a span link pointing to the `http.server` trace root span. Span links are designed for this very purpose: connecting spans (traces) that are causally related but do not fit into the strict trace hierarchy. Since our SDK supports them and we can ingest them (in a basic form at least), let’s make use of them.
⇒ We can take an active stance on how SSR traces should be handled. We do not have to adhere to strict trace continuation, as it clearly doesn’t make sense.
Before:

After:

Span Links instead of Long Traces
As already mentioned, our SDK supports span links and we can ingest and display them (in a basic form) in Sentry. So, let’s make use of them more! We already send a linked list of traces by including a sentry.previous_trace span link on the root span of the currently active trace. This works somewhat well today but is limited by sampling consistency and the capabilities of the product (UI- and data-wise). If we can address all the SDK-external limitations, we can and should send more links instead of traces with multiple root spans.
⇒ The goal is: Make traces more organic, avoid trace violations, augment trace connections via span links
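As a rough illustration of what consumers of such a linked list have to do, the sketch below walks previous-trace links backwards to rebuild the chain of traces. The `RootSpan` shape and `collectTraceChain` are hypothetical and not part of the SDK or product; the point is only that the walk stops as soon as one trace in the chain is missing (for example because it was not sampled), which is why sampling consistency matters here.

```typescript
interface RootSpan {
  traceId: string;
  // Trace ID referenced by this root span's previous-trace link, if any.
  previousTraceId?: string;
}

// Follows previous-trace links backwards and returns trace IDs oldest-first.
// Stops at an unknown trace (e.g. dropped by sampling) or on a cycle.
function collectTraceChain(
  latestTraceId: string,
  rootSpans: Map<string, RootSpan>,
): string[] {
  const chain: string[] = [];
  const seen = new Set<string>();
  let current: string | undefined = latestTraceId;
  while (current !== undefined && !seen.has(current)) {
    const root = rootSpans.get(current);
    if (!root) break;
    seen.add(current);
    chain.unshift(current);
    current = root.previousTraceId;
  }
  return chain;
}

const roots = new Map<string, RootSpan>([
  ["a", { traceId: "a" }],
  ["b", { traceId: "b", previousTraceId: "a" }],
  ["c", { traceId: "c", previousTraceId: "b" }],
]);
console.log(collectTraceChain("c", roots)); // [ 'a', 'b', 'c' ]
```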
To get to this point with good UX, here’s a (probably incomplete) list of things we need to change:
Always send request spans
Today, the Sentry product is in a much better state to deal with child-less, single `http.client` (root) spans. Soon, it will be in an even better state. This means we will be able to lift the restriction of only sending request spans as children of active root spans. Instead, we will send request spans whenever fetch/XHR requests are made. We can introduce additional APIs to configure this behaviour for anyone who has span quota concerns.
⇒ This we should just do; it really is a no-brainer.
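The change in behaviour can be summarised in a few lines. This is only a sketch of the decision logic, not SDK code; in particular `onlyWithParent` is a made-up name standing in for whatever opt-out API we might add for span quota concerns.

```typescript
interface RequestContext {
  hasActiveParentSpan: boolean;
}

interface RequestHandling {
  attachTracingHeaders: boolean;
  createSpan: boolean;
  isRootSpan: boolean;
}

// `onlyWithParent` is a hypothetical opt-out that restores today's behaviour.
function handleRequest(
  ctx: RequestContext,
  options: { onlyWithParent?: boolean } = {},
): RequestHandling {
  const createSpan = ctx.hasActiveParentSpan || !options.onlyWithParent;
  return {
    attachTracingHeaders: true, // headers are attached either way, as today
    createSpan,
    // Without an active parent, the request span becomes its own root span.
    isRootSpan: createSpan && !ctx.hasActiveParentSpan,
  };
}

// Proposed default: a request outside of pageload/navigation still gets a span.
console.log(handleRequest({ hasActiveParentSpan: false }));
// Today's behaviour, kept as an opt-in restriction in this sketch.
console.log(handleRequest({ hasActiveParentSpan: false }, { onlyWithParent: true }));
```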
Choice over Sampling Consistency
We should provide easy alternatives to our current sampling decision semantics: Users should have the choice of how long a sampling decision lasts in browser (frontend) SDKs:
- Have a CDN set up to cache initial page requests? Opt out of the SSR trace sampling decision being continued on the client!
- Decide how long the sampling decision should remain consistent vs. when a new one is made:
  - `consistentTraceSampling: false` - sampling decision is made on a per-trace basis (default today)
  - `consistentTraceSampling: true` - sampling decision remains consistent until the next hard page reload, or even longer than that if users opt into sessionStorage
The API suggestions are not set in stone yet, but they should serve the illustrative purpose: give users the choice for consistent sampling.
⇒ Browser applications are too diverse to provide good defaults for everyone. So let's settle on reasonable defaults for the majority (e.g. SPA apps) and provide enough options and granularity for everyone else.
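To illustrate the difference between a per-trace and a longer-lived sampling decision, here is a small sketch. It is not how `consistentTraceSampling` is implemented; `decideSampling`, `KeyValueStore`, `createMemoryStore` and the storage key are all assumptions made up for this example. The store interface is shaped so that `sessionStorage` would fit, and an in-memory stand-in keeps the sketch runnable outside the browser.

```typescript
// Minimal storage abstraction; `sessionStorage` has this shape in browsers.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "example_sampling_decision"; // hypothetical key

function decideSampling(
  sampleRate: number,
  consistent: boolean,
  store: KeyValueStore,
  random: () => number = Math.random,
): boolean {
  if (consistent) {
    // Reuse an earlier decision instead of rolling the dice again.
    const stored = store.getItem(STORAGE_KEY);
    if (stored !== null) return stored === "1";
  }
  const sampled = random() < sampleRate;
  if (consistent) store.setItem(STORAGE_KEY, sampled ? "1" : "0");
  return sampled;
}

// In-memory store standing in for sessionStorage so the sketch runs anywhere.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}

const store = createMemoryStore();
const firstTrace = decideSampling(0.5, true, store, () => 0.1);
const nextTrace = decideSampling(0.5, true, store, () => 0.9);
console.log(firstTrace, nextTrace); // both follow the first decision
```

With `consistent` set to `false`, every call makes a fresh decision, which corresponds to the per-trace default described above.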
Splitting Up `browserTracingIntegration`
To accommodate bundle-size-conscious users, we should split up `browserTracingIntegration` into its distinct functionalities to give users the option of pulling in only the parts they need.
To be clear: We will continue to ship one `browserTracingIntegration` to maintain ease of setup, but this integration will be made up of sub-integrations that users can optionally pull in individually.
This is how we envision the split:
- `sentry-trace`, `baggage` (and optionally `traceparent`) HTTP headers on XHR/fetch requests
- `browser.*` spans for initial pageload request data
- `fetch` and XHR request spans
- `resource.*`, `longanimationframe`, `mark`, `measure`, etc. spans
Besides bundle-size optimization, another key benefit is separation of concerns. Today, the `browserTracingIntegration` is massive and has a lot of code. Splitting it up into logical units helps clean up the intertwined logic and will make the integrations more maintainable.
For Tracing without spans (performance), only the trace continuation and propagation sub-integration is necessary. Everything else can be stripped away, resulting in minimal bundle size impact.
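Structurally, the split could look something like the sketch below: small integrations that work on their own, plus one convenience bundle composed from them. Every name here (`makeIntegration`, `composeIntegrations`, and the integration names) is invented for illustration; the actual sub-integration names and APIs are not decided.

```typescript
interface Integration {
  name: string;
  setup(): void;
}

// Records which parts were set up, so the effect of each setup is visible.
const enabledParts: string[] = [];

// All integration names below are hypothetical placeholders.
const makeIntegration = (name: string): Integration => ({
  name,
  setup: () => {
    enabledParts.push(name);
  },
});

const tracePropagation = makeIntegration("TracePropagation");
const pageloadRequestSpans = makeIntegration("PageloadRequestSpans");
const requestSpans = makeIntegration("RequestSpans");
const resourceSpans = makeIntegration("ResourceSpans");

// Convenience bundle: same ease of setup as today, built from the parts.
function composeIntegrations(name: string, parts: Integration[]): Integration {
  return { name, setup: () => parts.forEach((part) => part.setup()) };
}

const fullBrowserTracing = composeIntegrations("BrowserTracing", [
  tracePropagation,
  pageloadRequestSpans,
  requestSpans,
  resourceSpans,
]);

// Tracing without spans: only the propagation part is pulled in.
const tracingWithoutSpans: Integration[] = [tracePropagation];
tracingWithoutSpans.forEach((integration) => integration.setup());
console.log(enabledParts); // [ 'TracePropagation' ]
```

In a real bundle, the parts that are never imported could be tree-shaken away, which is where the bundle size win would come from.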
Out of Scope (for now)
Async Context
As of today, async context in the browser is still not implemented, which means that we can’t reliably establish correct parent↔child span relationships within async operations. Consequently, for the foreseeable future, we’ll still default to always attaching child spans directly to the root span. As before, this can be opted out of by setting `parentSpanIsAlwaysRootSpan: false` in `Sentry.init`, but with the caveat that hierarchies in async operations might be incorrect.
Sessions and Session-based Sampling
In the long term, we're strongly advocating for a serious session model and product experience in Sentry. From an SDK perspective, this most importantly includes assigning a session id to all events and data points sent from the SDK to Sentry, and potentially also sending more data within the session envelopes we send today (for example, direct event id mappings to better adjust session health metrics post-ingest).
Related, we're advocating for session-based sampling to more uniformly sample distinct data points (errors, traces, replays, etc) within sessions.
Both of these features are out of scope for this RFC, but it's important to highlight that nothing proposed here stands in the way of introducing session improvements on top of it. Span links have their place, even if traces can be queried by session id. Fine-grained control over trace sampling decision lifetime must be compatible with session sampling controls in the same way that `sampleRate`, `profilesSampleRate`, etc. are.
WDYT?
Do you have strong opinions on frontend tracing? Are we missing key pain points you're currently experiencing in this roadmap? Do you like what you're reading and where we're heading? Please let us know! We appreciate feedback a lot, regardless of 🌶️-level :)