
Fix for Recognizing --threads Option in Benchmarking #2161


Open

sujik18 wants to merge 5 commits into dev

Conversation

@sujik18 (Member) commented Mar 28, 2025

In reference to this issue

  • Updated backend_onnxruntime.py to set ONNX Runtime threading options from the value of --threads, and to set environment variables so that lower-level library threading is aligned for consistency (see the sketch after this list).
  • Modified main.py to propagate the --threads argument to the backend loader; this propagation was previously missing.
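For context, a minimal sketch of the approach described above, assuming illustrative names (load, num_threads); the actual diff in this PR is authoritative:

```python
# Hedged sketch: how an ONNX Runtime backend can honor a --threads value.
# The function signature and num_threads parameter are assumptions for
# illustration, not the exact code from this PR.
import os

import onnxruntime as rt


def load(model_path, inputs=None, outputs=None, num_threads=None):
    opt = rt.SessionOptions()
    if num_threads:
        # Cap parallelism within a single operator and across operators.
        opt.intra_op_num_threads = num_threads
        opt.inter_op_num_threads = num_threads
        # Align lower-level libraries (e.g. OpenMP) with the same cap; such
        # variables generally must be set before the library initializes
        # its thread pool.
        os.environ["OMP_NUM_THREADS"] = str(num_threads)
    return rt.InferenceSession(model_path, opt)
```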

Test Logs:

Attachment: mlc log thread command run.txt

System stats during the execution of the script with --threads=2

[Screenshot 2025-03-28 061424]

@sujik18 sujik18 requested a review from a team as a code owner March 28, 2025 07:21
github-actions bot (Contributor) commented Mar 28, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@@ -26,7 +26,14 @@ def image_format(self):

def load(self, model_path, inputs=None, outputs=None):
Contributor:

num_threads parameter not present?

Member Author:

Yes, that needs to be added.

@sujik18 (Member Author) commented Mar 28, 2025:

Hi @anandhu-eng, since we are using a generic load function call, the num_threads parameter needs to be added to the load function of every individual backend.py file, or the current GitHub Actions will fail. Also, is this a useful feature to add across all backends? It is currently missing from all of them; is there a specific reason it was not included? Most users will obviously want to use their system's full resources to get the best results. The only scenario I can think of where limiting threads helps is benchmarking on a machine with relatively little RAM.

Contributor:

@sujik18, another option would be to use an ENV variable; see the sketch below.
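For reference, a minimal sketch of the environment-variable approach; the variables below are standard threading knobs for common math libraries, not necessarily what this PR sets:

```python
import os

# Limits must be in place before the libraries create their thread pools,
# so set them before importing onnxruntime or numpy.
os.environ["OMP_NUM_THREADS"] = "2"       # OpenMP-backed kernels
os.environ["MKL_NUM_THREADS"] = "2"       # Intel MKL, if present
os.environ["OPENBLAS_NUM_THREADS"] = "2"  # OpenBLAS, if present

import onnxruntime as rt  # imported only after the limits are set
```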

@anandhu-eng (Contributor) commented Mar 28, 2025:

Hi @sujik18, I just skimmed through the code. I see that args.threads is used here, that it maps directly to the QueueRunner class here (taking the Offline scenario), and that the workers (threads) are defined here. Would you say the current implementation's thread usage is sub-optimal, or is this PR primarily focused on improving the performance of internal operations during prediction?
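For readers following along, the worker pattern referenced here looks roughly like the sketch below. This is a simplified illustration of how a harness can turn args.threads into worker threads, not the reference implementation's exact code:

```python
import queue
import threading


def run_queue(model, tasks, num_threads):
    """Process tasks with num_threads worker threads (illustrative only)."""
    q = queue.Queue()

    def worker():
        while True:
            item = q.get()
            if item is None:      # sentinel: stop this worker
                break
            model.predict(item)   # the backend may still use its own threads
            q.task_done()

    workers = [threading.Thread(target=worker) for _ in range(num_threads)]
    for w in workers:
        w.start()
    for t in tasks:
        q.put(t)
    for _ in workers:
        q.put(None)               # one sentinel per worker
    for w in workers:
        w.join()
```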

Member Author:

Hi @anandhu-eng, currently the --threads value is not taken into account when running the script; it always uses all threads available system-wide. Even though args.threads is passed to QueueRunner, the backend appears to ignore it.
With this PR I have primarily tried to configure the backend to respect the threads parameter.

Contributor:

Thank you @sujik18. So I understand that the current change controls the number of threads used by the model backend.

Out of curiosity, is the threads argument meant only to control the number of inputs processed in parallel by the model backend, or is it also intended to control the parallel operations performed within the model backend? I'm considering the case where the user doesn't need to control the number of threads used by the model backend but does want to control the number of inputs processed in parallel.

Member Author:

> is the threads argument meant only to control the number of inputs processed in parallel by the model backend, or is it also intended to control the parallel operations performed within the model backend?

The threads argument is intended to control the number of physical cores/threads that the backend uses for parallel execution of the model. On a different note, I am not sure whether we can control from the script which particular CPUs are used, beyond disabling the ones that should not be used.

> I'm considering the case where the user doesn't need to control the number of threads used by the model backend but does want to control the number of inputs processed in parallel.

I think a user can achieve that by setting the batch_size parameter and leaving the threads parameter unset (see the sketch below).
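To illustrate the distinction, a hedged sketch (the model path and batch shape are placeholders): input-level parallelism can come from batching while the backend's session options stay at their defaults:

```python
import numpy as np
import onnxruntime as rt

# Defaults: the backend keeps its full internal parallelism.
sess = rt.InferenceSession("model.onnx")  # placeholder model path
input_name = sess.get_inputs()[0].name

# batch_size = 8: eight inputs are processed in one call, independent of
# how many threads the backend uses internally per operator.
batch = np.zeros((8, 3, 224, 224), dtype=np.float32)  # placeholder shape
outputs = sess.run(None, {input_name: batch})
```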

@@ -646,7 +646,9 @@ def main():
    model = backend.load(
        args.model,
        inputs=args.inputs,
        outputs=args.outputs)
    # outputs=args.output,
Contributor:

Hi @sujik18, wasn't this originally args.outputs, which is different from args.output?

Member Author:

Oh, right! I must have removed it unknowingly and then didn't notice.

@sujik18 sujik18 marked this pull request as draft March 30, 2025 04:30
@sujik18 sujik18 marked this pull request as ready for review March 30, 2025 05:45