
refactor: HF API Embedders - use InferenceClient.feature_extraction instead of InferenceClient.post #8794

Merged
merged 9 commits into from
Feb 3, 2025

Conversation

anakin87
Member

@anakin87 anakin87 commented Jan 31, 2025

Related Issues

Our HF API Embedders support 3 different HF APIs using the huggingface_hub client: Serverless Inference API, Text Embeddings Inference (deployed locally or elsewhere), and HF Paid Inference Endpoints.

For several reasons, in the past we opted to use the InferenceClient.post method, but this approach is fragile: the method is subject to change (as happened recently) and will soon be removed (in huggingface_hub 0.31).

This is why I'm migrating to the InferenceClient.feature_extraction method. This solution works but may not be ideal, as explained in the comments.

Proposed Changes:

  • Use feature_extraction instead of post.
  • Add some checks on the shape of the embeddings obtained from the API.
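Taken together, the migration plus the shape checks could look roughly like the following. This is a simplified sketch, not the actual Haystack implementation: `embed_batch` and the loose `client` typing are hypothetical; in the real code, `client` is a `huggingface_hub.InferenceClient`.

```python
import numpy as np


def embed_batch(client, batch: list[str]) -> list[list[float]]:
    """Embed a batch of texts via the client's feature_extraction method.

    `client` is expected to behave like huggingface_hub.InferenceClient;
    it is typed loosely here so the sketch has no hard dependency on the library.
    """
    # feature_extraction does not officially accept a list of strings,
    # but it works as expected in practice (hence the type: ignore in the PR)
    np_embeddings = client.feature_extraction(text=batch)

    # sanity check: we expect a 2D array with one embedding vector per input text
    if not isinstance(np_embeddings, np.ndarray) or np_embeddings.ndim != 2:
        raise ValueError(
            f"Expected a 2D array of embeddings, got {getattr(np_embeddings, 'shape', None)}"
        )
    if np_embeddings.shape[0] != len(batch):
        raise ValueError(
            f"Expected {len(batch)} embeddings, got {np_embeddings.shape[0]}"
        )
    return np_embeddings.tolist()
```

The shape checks guard against backends that return a differently shaped array for this unofficial list-of-strings usage.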

How did you test it?

CI; new tests

In the CI, we only test the Serverless Inference API.

For this reason, I also tested the current solution with:

  • a Text Embeddings Inference container deployed locally
  • a Paid Inference Endpoint (deployed on HF)

Notes for the reviewer

Since the identified solution is not ideal, I also plan to contact a huggingface_hub maintainer and ask for advice.

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels Jan 31, 2025
@coveralls
Collaborator

coveralls commented Jan 31, 2025

Pull Request Test Coverage Report for Build 13116238781

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 7 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.02%) to 91.383%

Files with Coverage Reduction New Missed Lines %
components/preprocessors/document_splitter.py 1 99.5%
utils/callable_serialization.py 1 95.35%
components/embedders/hugging_face_api_text_embedder.py 2 97.14%
components/embedders/hugging_face_api_document_embedder.py 3 96.63%
Totals Coverage Status
Change from base Build 13076295715: 0.02%
Covered Lines: 8898
Relevant Lines: 9737

💛 - Coveralls


np_embeddings = self._client.feature_extraction(
# this method does not officially support list of strings, but works as expected
text=batch, # type: ignore[arg-type]
Member Author


As explained in the inline comment, the method does what we need, but this usage is not officially supported.

Contributor

@Amnah199 Amnah199 Feb 3, 2025


For my own understanding, I looked into the API. From this discussion, am I correct to deduce that both str and List[str] are supported for text? We are unsure because the docs don't officially mention List[str], but the underlying models do expect lists and return correct results.
In that case, it would make sense to introduce this change.

Member Author


I think you are basically right.
I reached out to the huggingface_hub maintainers here: huggingface/huggingface_hub#2824
You can read this message to get a better understanding.

# this method does not officially support list of strings, but works as expected
text=batch, # type: ignore[arg-type]
# Serverless Inference API does not support truncate and normalize, so we pass None in the request
truncate=self.truncate if self.api_type != HFEmbeddingAPIType.SERVERLESS_INFERENCE_API else None,
Member Author

@anakin87 anakin87 Jan 31, 2025


truncate and normalize are not supported in the Serverless Inference API.

With post, these parameters are ignored if the server does not support them.
With feature_extraction, we need to pass None; otherwise we get an error.
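This conditional could be factored into a small helper. A minimal sketch with hypothetical names (`resolve_feature_extraction_params`, the plain-string `SERVERLESS_API` standing in for `HFEmbeddingAPIType.SERVERLESS_INFERENCE_API`), not the actual Haystack code:

```python
from typing import Optional, Tuple

# hypothetical stand-in for HFEmbeddingAPIType.SERVERLESS_INFERENCE_API
SERVERLESS_API = "serverless_inference_api"


def resolve_feature_extraction_params(
    api_type: str,
    truncate: Optional[bool],
    normalize: Optional[bool],
) -> Tuple[Optional[bool], Optional[bool]]:
    """Return the (truncate, normalize) values to pass to feature_extraction.

    The Serverless Inference API rejects these parameters, so both must be
    sent as None; other backends (TEI, paid Inference Endpoints) accept them.
    """
    if api_type == SERVERLESS_API:
        return None, None
    return truncate, normalize
```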

@@ -87,7 +87,7 @@ extra-dependencies = [
  "numba>=0.54.0", # This pin helps uv resolve the dependency tree. See https://github.com/astral-sh/uv/issues/7881

  "transformers[torch,sentencepiece]==4.47.1", # ExtractiveReader, TransformersSimilarityRanker, LocalWhisperTranscriber, HFGenerators...
- "huggingface_hub>=0.27.0, <0.28.0", # Hugging Face API Generators and Embedders
+ "huggingface_hub>=0.27.0", # Hugging Face API Generators and Embedders
Member Author


Based on my local tests, this solution works with both the new and the old version, so we can safely remove the pin.

@anakin87 anakin87 changed the title refactor: HF API Embedders refactoring refactor: HF API Embedders - use InferenceClient.feature_extraction instead of InferenceClient.post Jan 31, 2025
@anakin87 anakin87 marked this pull request as ready for review January 31, 2025 17:12
@anakin87 anakin87 requested review from a team as code owners January 31, 2025 17:12
@anakin87 anakin87 requested review from dfokina and Amnah199 and removed request for a team January 31, 2025 17:12
Comment on lines 202 to 204
# Serverless Inference API does not support truncate and normalize, so we pass None in the request
truncate=self.truncate if self.api_type != HFEmbeddingAPIType.SERVERLESS_INFERENCE_API else None,
normalize=self.normalize if self.api_type != HFEmbeddingAPIType.SERVERLESS_INFERENCE_API else None,
Contributor

@Amnah199 Amnah199 Feb 3, 2025


Should we raise a warning that these params will be ignored if the user explicitly passes truncate or normalize?

Member Author


done in f9479e0
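The warning added in that commit could be shaped roughly like this sketch. The helper name `warn_on_ignored_params` and the plain-string API type are hypothetical, not the actual code from f9479e0:

```python
import warnings

# hypothetical stand-in for HFEmbeddingAPIType.SERVERLESS_INFERENCE_API
SERVERLESS_API = "serverless_inference_api"


def warn_on_ignored_params(api_type: str, truncate, normalize) -> None:
    """Warn the user that truncate/normalize are dropped for the Serverless API."""
    if api_type != SERVERLESS_API:
        return
    for name, value in (("truncate", truncate), ("normalize", normalize)):
        if value is not None:
            warnings.warn(
                f"`{name}` parameter is not supported by the Serverless Inference API. "
                "It will be ignored."
            )
```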

Contributor

@Amnah199 Amnah199 left a comment


LGTM! It's your call if you want to update a test to check whether the warning is raised properly. Otherwise, feel free to merge.

@anakin87
Member Author

anakin87 commented Feb 3, 2025

LGTM! It's your call if you want to update a test to check whether the warning is raised properly. Otherwise, feel free to merge.

Thanks! I added a small assertion to the related tests.

@anakin87 anakin87 enabled auto-merge (squash) February 3, 2025 14:55
@anakin87 anakin87 merged commit 877f826 into main Feb 3, 2025
19 checks passed
@anakin87 anakin87 deleted the fix-hfhub-embedder branch February 3, 2025 15:11

Successfully merging this pull request may close these issues.

HuggingFaceAPIDocumentEmbedder is not compatible with huggingface_hub>=0.28.0