Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cli): Add the flag --monitor-stack-traces #7391

Closed
wants to merge 1 commit into from

Conversation

fviernau
Copy link
Member

If ort executes slowly this can become handy for getting a first hint where the CPU work is happening.

Note: Print the stack traces in reverse order and the main thread's trace at the bottom to have the likely desired information at the most prominent place.

@fviernau fviernau requested a review from a team as a code owner August 18, 2023 10:47
If `ort` executes slowly this can become handy for getting a first hint
where the CPU work is happening.

Note: Print the stack traces in reverse order and the `main` thread's
trace at the bottom to have the likely desired information at the most
prominent place.

Signed-off-by: Frank Viernau <[email protected]>
@fviernau fviernau force-pushed the monitor-stack-traces branch from 508fd0a to 197186b Compare August 18, 2023 10:48
@codecov
Copy link

codecov bot commented Aug 18, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (1ba1071) 61.03% compared to head (508fd0a) 61.03%.

❗ Current head 508fd0a differs from pull request most recent head 197186b. Consider uploading reports for the commit 197186b to get more accurate results

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #7391   +/-   ##
=========================================
  Coverage     61.03%   61.03%           
  Complexity     1968     1968           
=========================================
  Files           336      336           
  Lines         16503    16503           
  Branches       2349     2349           
=========================================
  Hits          10073    10073           
  Misses         5454     5454           
  Partials        976      976           
Flag Coverage Δ
funTest-docker 69.29% <ø> (ø)
funTest-non-docker 30.46% <ø> (ø)
test 36.34% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sschuberth
Copy link
Member

Hmm, I'm unsure whether this is a feature that we should add. It sounds more like something that should be done during debugging with the means of an IDE. Or are there really cases you'd like to inspect e.g. on CI that absolutely cannot be reproduced locally?

Also, this reminds me a bit of #3296. IMO a much nicer (but also more involving) solution for monitoring would be to print better progress information on the CLI, similar to like Gradle does when building with multiple threads in parallel.

@fviernau
Copy link
Member Author

fviernau commented Aug 18, 2023

Hmm, I'm unsure whether this is a feature that we should add. It sounds more like something that should be done during debugging with the means of an IDE.

Can you explain which exact means you are referring to, and why these means should be preferable.
(If you're talking about a profiler, then I'd say the use cases are related but both approaches has pros and cons. If I used
the profiler for profiling my evaluator run, I would have waited ages for it to finish due to the large ORT file size. I'm not saying the profiler is not suitable, but looking at the thread dumps certainly gave me some insight more quickly.)

Or are there really cases you'd like to inspect e.g. on CI that absolutely cannot be reproduced locally?

It's not for CI, it's for local investigations. It could also be local on some other devs machine, so remote to myself.

Also, this reminds me a bit of #3296. IMO a much nicer (but also more involving) solution for monitoring would be to print better progress information on the CLI, similar to like Gradle does when building with multiple threads in parallel.

In my use case the CLI output of ORT just didn't print anything, so I wanted to get a hunch on what it's doing. And this can be a very quick way to do so IMO. If one could anticipate all performance problems and put logging making it visible everywhere, then the performance problem wouldn''t exist in the first place.

I've written this code already a couple of times and I wanted to save that effort for myself and others as well.
So, do you think seeing stack traces cannot be useful for any developer using ORT ?

Would it work for you if this feature was toggled via env var instead via a CI param, or is the feature itself not convincing?

@sschuberth
Copy link
Member

Can you explain which exact means you are referring to

I wasn't thinking about using a profiler, but the threads view of the debugger tab in the IDE. You can pause execution of threads and look at the stacktrace at any time.

and why these means should be preferable.

Since it's you who wants to introduce this change, could you please explain why your approach should be preferred over dedicated debugging tools?

In my use case the CLI output of ORT just didn't print anything

Yes, that's exactly what #3296 is about: Missing information about progress. This should not necessarily be implemented by simply adding more log statement, but ideally by implementing progress interfaces. But that's, as mentioned, a bigger effort.

If one could anticipate all performance problems and put logging making it visible everywhere, then the performance problem wouldn''t exist in the first place.

I don't get that sentence. How does more logging automatically solve performance problems?

So, do you think seeing stack traces cannot be useful for any developer using ORT ?

Obviously, I cannot tell for the rest of @oss-review-toolkit/kotlin-devs, but I myself have never needed it / more than what the IDE debugging offers.

Would it work for you if this feature was toggled via env var instead via a CI param, or is the feature itself not convincing?

To me, the feature itself is not convincing, as it seems to reimplement something that is already available in the IDE, or via some of the tools e.g. mentioned here.

@fviernau
Copy link
Member Author

Since it's you who wants to introduce this change, could you please explain why your approach should be preferred over dedicated debugging tools?

I believe I just did that. It's quicker than doing it in the IDE IMO.

I don't get that sentence. How does more logging automatically solve performance problems?

Your previous message implied to me that one would need to be able to foresee performance issues, in order to do the log statements such that it's visible from the logs where the performance issue is. I just wanted to point out that this assumption does not hold.

To me, the feature itself is not convincing, as it seems to reimplement something that is already available in the IDE, or via some of the tools e.g. mentioned here.

I run ort via orthw report-html which takes care of passing all the necessary parameters to the CLI. If I need to setup a run in IDE which does the same, it's quite a bit of overhead. Moreover, if I run on large scan results in debug mode in IDE, the debug mode slows down additionally.

@@ -104,6 +106,8 @@ class OrtMain : CliktCommand(
"--debug" to Level.DEBUG
).default(Level.WARN)

private val monitorStackTraces by option(help = "Continuously print out the stack traces of all threads.").flag()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice if the option would take in int value to configure the interval.

@fviernau fviernau closed this Aug 18, 2023
@fviernau
Copy link
Member Author

jstack can be used for this purpose, e.g.

@sschuberth sschuberth deleted the monitor-stack-traces branch August 19, 2023 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants