allow custom metrics export via metrics method for predict_raw-based nodes #4863

Open · wants to merge 1 commit into master

Conversation


@tandav tandav commented May 24, 2023

What this PR does / why we need it:
This PR adds support for exporting custom metrics via the def metrics method for predict_raw-based inference nodes, without adding a ['meta']['metrics'] key to the response.

My current project requires an exact response schema, and extra fields like meta are forbidden. Unfortunately, the only way to use custom metrics in the current seldon-core code is via the ['meta']['metrics'] field in the response. The INCLUDE_METRICS_IN_CLIENT_RESPONSE=false environment variable does not help because it only deletes the meta.metrics key but keeps meta.

Another possible solution is to run a second metrics webserver using custom_service. However, custom_service usage is undocumented, and it would require adding two metrics endpoints to the Prometheus scraper instead of one, because the seldon_metrics variable is not accessible globally (from another thread), so you cannot write or merge metrics into the same SeldonMetrics container.

This PR fixes that: it adds a call that collects metrics from the metrics method inside the predict_raw branch. A minimal sketch of the intended use case follows.
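
As an illustration of the use case, here is a minimal sketch of a predict_raw-based model that defines metrics() (the class and metric names are hypothetical, not taken from this PR):

```python
# Hypothetical example: class and metric names are illustrative only.
class MyModel:
    def predict_raw(self, request):
        # The response must match an exact schema: no extra "meta" field allowed.
        return {"prediction": 0.42}

    def metrics(self):
        # Seldon's custom-metrics format: a list of type/key/value dicts.
        return [{"type": "COUNTER", "key": "my_requests_total", "value": 1}]
```

With this change, the metrics() output is recorded in SeldonMetrics even though the response body never carries a meta.metrics field.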

@seldondev
Collaborator

Hi @tandav. Thanks for your PR.

I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.

@tandav tandav changed the title allow custom metrics export via def metrics method for predict_raw-based nodes allow custom metrics export via metrics method for predict_raw-based nodes May 25, 2023
@ukclivecox ukclivecox added the v1 label May 29, 2023
@adriangonz adriangonz self-requested a review June 21, 2023 08:25
Contributor

@adriangonz adriangonz left a comment


Hey @tandav ,

As far as I understand, you can return anything you want from the predict_raw method. Therefore, why would you have a meta field there in the first place?

@@ -102,6 +102,7 @@ def predict(
     if hasattr(user_model, "predict_raw"):
         try:
             response = user_model.predict_raw(request)
+            client_custom_metrics(user_model, seldon_metrics, PREDICT_METRIC_METHOD_TAG, [])
Contributor


As it is, this would double-count metrics, once in client_custom_metrics and again in handle_raw_custom_metrics.

@tandav
Author

tandav commented Jun 28, 2023

@adriangonz, if I understand correctly, there is currently only one way to write custom metrics to Seldon's metrics server when using predict_raw: you have to write them to the response's meta.metrics field. However, this approach may not be suitable if your application requires a custom response format and cannot accommodate extra fields like meta.metrics. A sketch of this workaround is shown below.
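
For contrast, a sketch of the only mechanism that works today with predict_raw (again with hypothetical names): the metrics have to be embedded in the response body itself.

```python
# Hypothetical example: today's workaround embeds metrics in the response.
class MyModel:
    def predict_raw(self, request):
        return {
            "prediction": 0.42,
            # "meta" is exactly the extra field a strict response schema forbids.
            "meta": {
                "metrics": [
                    {"type": "COUNTER", "key": "my_requests_total", "value": 1}
                ]
            },
        }
```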

Here are the two locations in the source code where custom metrics are handled:

As stated in the original PR description, the INCLUDE_METRICS_IN_CLIENT_RESPONSE=false environment variable is available, but it only deletes the meta.metrics key, while the meta key remains.

Considering your comment about the potential issue of double-counting metrics, I would like to propose the following solutions:

  1. Raise an error when hasattr(user_model, "metrics") is true and the model also returns metrics in the meta.metrics field of the response. The error message would state that the user must choose only one mechanism for writing custom metrics (a sketch of this guard follows this list).
  2. Do not use client_custom_metrics for user models that use predict_raw. I could remove the one-line change in the current PR and instead modify the handle_raw_custom_metrics function to delete not only meta.metrics but the entire meta key. Alternatively, I could add a new variable, INCLUDE_META_IN_CLIENT_RESPONSE, which would perform this task.
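
A sketch of what the option-1 guard could look like (the helper's name and placement are assumptions, not code from this PR; the exception type is assumed to be seldon_core.flask_utils.SeldonMicroserviceException):

```python
from seldon_core.flask_utils import SeldonMicroserviceException  # assumed import path

def check_single_metrics_mechanism(user_model, response):
    # Hypothetical guard: reject models that use both metrics mechanisms at once.
    uses_metrics_method = hasattr(user_model, "metrics")
    uses_meta_metrics = (
        isinstance(response, dict) and "metrics" in response.get("meta", {})
    )
    if uses_metrics_method and uses_meta_metrics:
        raise SeldonMicroserviceException(
            "Choose only one custom-metrics mechanism: either implement "
            "metrics() or return meta.metrics in the response, not both."
        )
```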

@adriangonz
Contributor

Hey @tandav ,

Thanks for that clarification - I can see the point of your use case now and why it would make sense.

Based on what you describe, it probably makes sense to stick with option 1. Although, instead of raising an error, I would just combine both in the same way as it's done for the regular predict() path (it could double-count them, but that seems aligned with what would happen for predict()).

To make it clearer, and to be aligned with the non-raw path, I would add a couple of changes to the existing PR:

  • Ensure that the output from metrics() is also included in the meta.metrics field. You can probably use the output of client_custom_metrics, which already takes this into account (a rough sketch follows this list).
  • Extend the change to the other methods (e.g. send_feedback, combine, etc.).
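
For concreteness, a rough sketch of what the predict_raw branch could look like under these suggestions (only the client_custom_metrics call comes from the PR diff; the merging step and the dict check are assumptions):

```python
# Sketch only (fragment): mirrors the hunk shown earlier in this thread.
if hasattr(user_model, "predict_raw"):
    try:
        response = user_model.predict_raw(request)
        # From the PR diff: collect metrics() output and record it in
        # SeldonMetrics, as the non-raw predict() path already does.
        metrics = client_custom_metrics(
            user_model, seldon_metrics, PREDICT_METRIC_METHOD_TAG, []
        )
        # Assumed merging step: also surface those metrics in meta.metrics so
        # raw and non-raw responses stay aligned (double counting by
        # handle_raw_custom_metrics accepted, per the discussion above).
        if isinstance(response, dict):
            response.setdefault("meta", {}).setdefault("metrics", []).extend(metrics)
        # ... rest of the branch (handle_raw_custom_metrics etc.) unchanged
    except Exception:
        raise  # placeholder: the branch's real error handling is elided here
```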

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.
