This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Conversation

@robertgshaw2-redhat
Contributor

No description provided.


NUM_THREADS = 2
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100
Member

Should * 100 be * NUM_THREADS if we are taking only sentences[:NUM_THREADS] elements?
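
For scale (a sketch only, using the values already in the snippet): the 2-element list times 100 yields 200 sentences, but the slice below only ever sends 2 of them.

# as written: 2 sentences * 100 -> 200 entries, of which only sentences[:NUM_THREADS] (2) are sent
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100
to_send = sentences[:NUM_THREADS]  # to_send is a hypothetical name, just to show the slice has length 2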

Contributor

@rsnm2 see suggestion below

@@ -0,0 +1,75 @@
# **Step 1: Installation**

Contributor

Best to add an intro paragraph to give users a heads-up on what this example does.

Comment on lines +23 to +27
threads = [threading.Thread(target=tfunc, args=(sentence,)) for sentence in sentences[:NUM_THREADS]]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
Contributor

It looks like this creates only NUM_THREADS threads to make requests; is that intended?
It might make more sense to issue len(sentences) requests and execute NUM_THREADS at a time.

You can do this out of the box with ThreadPoolExecutor with something like:

Suggested change
threads = [threading.Thread(target=tfunc, args=(sentence,)) for sentence in sentences[:NUM_THREADS]]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
from concurrent.futures.thread import ThreadPoolExecutor
threadpool = ThreadPoolExecutor(max_workers=NUM_THREADS)
results = threadpool.map(tfunc, sentences)
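
A quick note on how that behaves (a sketch; it assumes tfunc is changed to return the decoded response rather than printing inside the thread): ThreadPoolExecutor.map submits one task per sentence but runs at most NUM_THREADS of them at a time, and iterating the returned results waits for completion, so no explicit start/join is needed.

# hypothetical consumption of the mapped results, assuming tfunc returns a value
for sentence, result in zip(sentences, results):
    print(sentence, "->", result)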

URL = "http://localhost:8080/v2/models/text-classification-model/infer"
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100

def tfunc(text):
Contributor

Would rename to something more descriptive, like inference_request.

Comment on lines +19 to +20
for output in resp["outputs"]:
    print(output["data"])
Contributor

Executing a print inside the threads may cause a race condition; any reason not to return the value and print in sequence at the end? (I.e., if thread 1 and thread 2 happen to execute at exactly the same time, they will print their lines at the same time and you might not be able to tell which is which.)
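
A minimal sketch of that approach, combined with the ThreadPoolExecutor suggestion above. The payload shape and the input name "text" are assumptions about the V2 request body (not taken from the PR); URL, NUM_THREADS, and sentences are as in the example.

from concurrent.futures.thread import ThreadPoolExecutor
import requests

def inference_request(text):
    # hypothetical V2 payload; the real example builds its own request body
    payload = {"inputs": [{"name": "text", "shape": [1], "datatype": "BYTES", "data": [text]}]}
    resp = requests.post(URL, json=payload).json()
    # return the outputs instead of printing them inside the worker thread
    return [output["data"] for output in resp["outputs"]]

with ThreadPoolExecutor(max_workers=NUM_THREADS) as pool:
    results = list(pool.map(inference_request, sentences))

# print sequentially once everything has finished, so output lines never interleave
for text, outputs in zip(sentences, results):
    print(text, "->", outputs)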


NUM_THREADS = 2
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100
Contributor

@rsnm2 see suggestion below

@@ -0,0 +1,27 @@
import requests, threading
Contributor

Would suggest a few inline comments for self-documentation.
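
For instance, the top of the script could carry comments like these (comment wording is only a suggestion):

import requests, threading

# number of concurrent request threads to launch
NUM_THREADS = 2
# MLServer V2 inference endpoint for the deployed text-classification model
URL = "http://localhost:8080/v2/models/text-classification-model/infer"
# toy workload: two example sentences repeated 100 times
sentences = ["I hate using GPUs for inference", "I love using DeepSparse on CPUs"] * 100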

task = self._settings.parameters.task,
model_path = self._settings.parameters.model_path,
batch_size = self._settings.parameters.batch_size,
sequence_length = self._settings.parameters.sequence_length,
Contributor

Is there a place for generic kwargs in the settings? It would be cool if we could use that instead to pass extra pipeline args, so we get full generic pipeline support out of the box.
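
One possible shape for that, assuming the settings object exposes a generic extra dict on its parameters (an assumption about the MLServer settings API, not something in this PR) and that these kwargs feed DeepSparse's Pipeline.create:

# hypothetical: forward any leftover pipeline args from a generic "extra" dict in the settings
extra_kwargs = getattr(self._settings.parameters, "extra", None) or {}

pipeline = Pipeline.create(
    task=self._settings.parameters.task,
    model_path=self._settings.parameters.model_path,
    batch_size=self._settings.parameters.batch_size,
    sequence_length=self._settings.parameters.sequence_length,
    **extra_kwargs,  # any additional Pipeline arguments configured in the settings
)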

@@ -0,0 +1,19 @@
from mlserver import MLModel
Contributor

This is great, love that it works out of the box. Let's throw in the serving command as a comment just for convenience.
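
For example, something like this near the top of the file (assuming the example ships a model-settings.json alongside it, as MLServer expects):

# Serve with MLServer from the directory containing model-settings.json:
#   mlserver start .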

@jeanniefinks
Member

Per the main README announcement, DeepSparse is being deprecated by June 2, 2025. Closing the PR as work has been suspended; thank you for the inputs and support!

@jeanniefinks deleted the mlserver-example branch June 2, 2025 21:04
