Add long running traces to flare report, allow flare files to be downloaded with JMX #9874
What Does This Do
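This does two main things: it adds long running traces to the flare report, and it allows flare files to be downloaded with JMX.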
Examples:
Motivation
While adding custom instrumentation to a complex, asynchronous application, we found it challenging to validate whether all spans were `end()`ed during tests. `dd.trace.debug=true` and `dd.trace.experimental.long-running.enabled=true` could be used with some post-processing of debug logs, but this didn't work for our needs because the application breaks with that level of logging. When `dd.trace.experimental.long-running.enabled=true` is used, the long running traces are sent to Datadog's backend, but they are not searchable until they are finished, so we didn't have a good way to find them. This change gives us two ways to access the list of long running traces: a flare report or JMX.

I initially started by adding JMX MBeans to retrieve just the pending and long running traces and counters. Once I added the long running traces to the flare report for parity with pending traces, I realized that a more generic mechanism for getting flare details over JMX might be useful. After adding a TracerFlare MBean, this seemed like a far more valuable route, and I removed the code I had added for the pending/long running trace MBeans.
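For readers less familiar with JMX, the general pattern of exposing report data through a standard MBean can be sketched as below. This is not the code in this PR: `FlareJmxSketch`, `TraceReportMBean`, `TraceReport`, the `example.sketch` domain, and the sample trace string are all hypothetical names chosen for illustration, and the real TracerFlare MBean may look quite different. Only the `javax.management` calls are standard JDK API.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class FlareJmxSketch {
  /** Hypothetical management interface; a standard MBean interface is named <Class>MBean. */
  public interface TraceReportMBean {
    int getLongRunningTraceCount(); // exposed as attribute "LongRunningTraceCount"

    String dumpLongRunningTraces(); // exposed as an operation
  }

  /** Hypothetical implementation backed by an in-memory list of trace descriptions. */
  public static class TraceReport implements TraceReportMBean {
    private final List<String> traces = new CopyOnWriteArrayList<>();

    public void track(String description) {
      traces.add(description);
    }

    @Override
    public int getLongRunningTraceCount() {
      return traces.size();
    }

    @Override
    public String dumpLongRunningTraces() {
      return String.join("\n", traces);
    }
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    // Assumed object name, not the one used by the tracer.
    ObjectName name = new ObjectName("example.sketch:type=TraceReport");

    TraceReport report = new TraceReport();
    report.track("trace-1 rootSpan=checkout");
    server.registerMBean(report, name);

    // A JMX client (for example JConsole) would perform the same two calls remotely.
    Object count = server.getAttribute(name, "LongRunningTraceCount");
    Object dump = server.invoke(name, "dumpLongRunningTraces", new Object[0], new String[0]);
    System.out.println(count + " long running trace(s):\n" + dump);
  }
}
```

The appeal of a generic flare MBean over per-report MBeans like this one is that new report sections become reachable over JMX without adding a new interface each time.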
Additional Notes
This PR has a number of commits and I suggest reviewing commit-by-commit, paying special attention to the notes in bold below:
`synchronized` to a few methods (see commit comment for details).
`features.supportsLongRunning()` is false, the traces are kept in the `TRACKED` state, compared to the `NOT_TRACKED` state previously.
`add*` methods as-is, but this could be simplified by refactoring the `add*` methods into Reporter instances (with a new signature that passes a few more arguments to `addReportToFlare`). I think this refactoring would be a good change to make. Let me know and I'll happily do that. I also considered not making the zip file an intermediary, and if you like, I could look at what that change might be as well.
Contributor Checklist
`type:` and (`comp:` or `inst:`) labels in addition to any useful labels
`close`, `fix` or any linking keywords when referencing an issue. Use `solves` instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]
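To make the Reporter refactoring floated under Additional Notes more concrete, here is a minimal sketch of what "one Reporter per report section" could look like. It is an assumption-laden illustration, not the tracer's actual API: the `Reporter` interface shape, the `addReportToFlare(ZipOutputStream)` signature, the `addText` helper, and the entry names are all hypothetical, and the real signature would likely need the extra arguments mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ReporterSketch {
  /** Hypothetical shape: each report section knows how to write itself into the flare zip. */
  public interface Reporter {
    void addReportToFlare(ZipOutputStream zip) throws IOException;
  }

  /** Hypothetical helper that writes one UTF-8 text entry into the zip. */
  public static void addText(ZipOutputStream zip, String entryName, String content)
      throws IOException {
    zip.putNextEntry(new ZipEntry(entryName));
    zip.write(content.getBytes(StandardCharsets.UTF_8));
    zip.closeEntry();
  }

  /** Builds the flare in memory by asking every reporter to add its section. */
  public static byte[] buildFlare(List<Reporter> reporters) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
      for (Reporter reporter : reporters) {
        reporter.addReportToFlare(zip);
      }
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // Entry names and contents are placeholders, not the tracer's real file names.
    Reporter pending = zip -> addText(zip, "pending.txt", "pending trace dump");
    Reporter longRunning = zip -> addText(zip, "long_running.txt", "long running trace dump");
    byte[] flare = buildFlare(Arrays.asList(pending, longRunning));
    System.out.println("flare size in bytes: " + flare.length);
  }
}
```

With this shape, the flare builder no longer needs a dedicated `add*` method per section; adding a section means registering one more Reporter.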