
refactor: HF API Embedders - use InferenceClient.feature_extraction instead of InferenceClient.post #8794

Merged
merged 9 commits into from
Feb 3, 2025

Conversation

anakin87
Member

@anakin87 anakin87 commented Jan 31, 2025

Related Issues

Our HF API Embedders support 3 different HF APIs using the huggingface_hub client: Serverless Inference API, Text Embeddings Inference (deployed locally or elsewhere), and HF Paid Inference Endpoints.

For several reasons, in the past we opted to use the InferenceClient.post method, but this approach is fragile: the method is subject to change (as happened recently) and will soon be removed (in huggingface_hub 0.31).

This is why I'm migrating to the InferenceClient.feature_extraction method. This solution works but may not be ideal, as explained in the comments.

Proposed Changes:

  • Use feature_extraction instead of post.
  • Add some checks on the shape of the embeddings obtained from the API.
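Taken together, the migration plus the shape checks could look roughly like the following. This is a simplified sketch, not the actual Haystack implementation: `embed_batch` and the loose `client` typing are hypothetical; in the real code, `client` is a `huggingface_hub.InferenceClient`.

```python
import numpy as np


def embed_batch(client, batch: list[str]) -> list[list[float]]:
    """Embed a batch of texts via the client's feature_extraction method.

    `client` is expected to behave like huggingface_hub.InferenceClient;
    it is typed loosely here so the sketch has no hard dependency on the library.
    """
    # feature_extraction does not officially accept a list of strings,
    # but it works as expected in practice (hence the type: ignore in the PR)
    np_embeddings = client.feature_extraction(text=batch)

    # sanity check: we expect a 2D array with one embedding vector per input text
    if not isinstance(np_embeddings, np.ndarray) or np_embeddings.ndim != 2:
        raise ValueError(
            f"Expected a 2D array of embeddings, got {getattr(np_embeddings, 'shape', None)}"
        )
    if np_embeddings.shape[0] != len(batch):
        raise ValueError(
            f"Expected {len(batch)} embeddings, got {np_embeddings.shape[0]}"
        )
    return np_embeddings.tolist()
```

The shape checks guard against backends that return a differently shaped array for this unofficial list-of-strings usage.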

How did you test it?

CI; new tests

In the CI, we only test the Serverless Inference API.

For this reason, I also tested the current solution with:

  • a Text Embeddings Inference container deployed locally
  • a Paid Inference Endpoint (deployed on HF)

Notes for the reviewer

Since the identified solution is not ideal, I also plan to contact a huggingface_hub maintainer and ask for advice.

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels Jan 31, 2025
@coveralls
Collaborator

coveralls commented Jan 31, 2025

Pull Request Test Coverage Report for Build 13116238781

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 7 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.02%) to 91.383%

Files with Coverage Reduction New Missed Lines %
components/preprocessors/document_splitter.py 1 99.5%
utils/callable_serialization.py 1 95.35%
components/embedders/hugging_face_api_text_embedder.py 2 97.14%
components/embedders/hugging_face_api_document_embedder.py 3 96.63%
Totals Coverage Status
Change from base Build 13076295715: 0.02%
Covered Lines: 8898
Relevant Lines: 9737

💛 - Coveralls


np_embeddings = self._client.feature_extraction(
# this method does not officially support list of strings, but works as expected
text=batch, # type: ignore[arg-type]
Member Author


As explained in the inline comment, the method does what we need, but this usage is not officially supported.

Contributor

@Amnah199 Amnah199 Feb 3, 2025


For my own understanding, I looked into the API. From this discussion, am I correct to deduce that both str and List[str] are supported for text? We are unsure because the docs don't officially mention List[str], but the underlying models do expect lists and return correct results.
In that case, it would make sense to introduce this change.

Member Author


I think you are basically right.
I reached out to the huggingface_hub maintainers here: huggingface/huggingface_hub#2824
You can read this message to get a better understanding.

# this method does not officially support list of strings, but works as expected
text=batch, # type: ignore[arg-type]
# Serverless Inference API does not support truncate and normalize, so we pass None in the request
truncate=self.truncate if self.api_type != HFEmbeddingAPIType.SERVERLESS_INFERENCE_API else None,
Member Author

@anakin87 anakin87 Jan 31, 2025


truncate and normalize are not supported in the Serverless Inference API.

With post, these parameters are ignored if the server does not support them.
With feature_extraction, we need to pass None; otherwise we get an error.
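This conditional could be factored into a small helper. A minimal sketch with hypothetical names (`resolve_feature_extraction_params`, the plain-string `SERVERLESS_API` standing in for `HFEmbeddingAPIType.SERVERLESS_INFERENCE_API`), not the actual Haystack code:

```python
from typing import Optional, Tuple

# hypothetical stand-in for HFEmbeddingAPIType.SERVERLESS_INFERENCE_API
SERVERLESS_API = "serverless_inference_api"


def resolve_feature_extraction_params(
    api_type: str,
    truncate: Optional[bool],
    normalize: Optional[bool],
) -> Tuple[Optional[bool], Optional[bool]]:
    """Return the (truncate, normalize) values to pass to feature_extraction.

    The Serverless Inference API rejects these parameters, so both must be
    sent as None; other backends (TEI, paid Inference Endpoints) accept them.
    """
    if api_type == SERVERLESS_API:
        return None, None
    return truncate, normalize
```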

@@ -87,7 +87,7 @@ extra-dependencies = [
  "numba>=0.54.0", # This pin helps uv resolve the dependency tree. See https://github.com/astral-sh/uv/issues/7881

  "transformers[torch,sentencepiece]==4.47.1", # ExtractiveReader, TransformersSimilarityRanker, LocalWhisperTranscriber, HFGenerators...
- "huggingface_hub>=0.27.0, <0.28.0", # Hugging Face API Generators and Embedders
+ "huggingface_hub>=0.27.0", # Hugging Face API Generators and Embedders
Member Author


Based on my local tests, this solution works with both the new and the old version, so we can safely remove the pin.

@anakin87 anakin87 changed the title refactor: HF API Embedders refactoring refactor: HF API Embedders - use InferenceClient.feature_extraction instead of InferenceClient.post Jan 31, 2025
@anakin87 anakin87 marked this pull request as ready for review January 31, 2025 17:12
@anakin87 anakin87 requested review from a team as code owners January 31, 2025 17:12
@anakin87 anakin87 requested review from dfokina and Amnah199 and removed request for a team January 31, 2025 17:12
Comment on lines 202 to 204
# Serverless Inference API does not support truncate and normalize, so we pass None in the request
truncate=self.truncate if self.api_type != HFEmbeddingAPIType.SERVERLESS_INFERENCE_API else None,
normalize=self.normalize if self.api_type != HFEmbeddingAPIType.SERVERLESS_INFERENCE_API else None,
Contributor

@Amnah199 Amnah199 Feb 3, 2025


Should we raise a warning that these params will be ignored if the user explicitly passes truncate or normalize?

Member Author


done in f9479e0
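The warning added in that commit could be shaped roughly like this sketch. The helper name `warn_on_ignored_params` and the plain-string API type are hypothetical, not the actual code from f9479e0:

```python
import warnings

# hypothetical stand-in for HFEmbeddingAPIType.SERVERLESS_INFERENCE_API
SERVERLESS_API = "serverless_inference_api"


def warn_on_ignored_params(api_type: str, truncate, normalize) -> None:
    """Warn the user that truncate/normalize are dropped for the Serverless API."""
    if api_type != SERVERLESS_API:
        return
    for name, value in (("truncate", truncate), ("normalize", normalize)):
        if value is not None:
            warnings.warn(
                f"`{name}` parameter is not supported by the Serverless Inference API. "
                "It will be ignored."
            )
```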

Contributor

@Amnah199 Amnah199 left a comment


LGTM! It's your call if you want to update a test to check whether the warning is raised properly. Otherwise, feel free to merge.

@anakin87
Member Author

anakin87 commented Feb 3, 2025

LGTM! It's your call if you want to update a test to check whether the warning is raised properly. Otherwise, feel free to merge.

Thanks! I added a small assertion to the related tests.

@anakin87 anakin87 enabled auto-merge (squash) February 3, 2025 14:55
@anakin87 anakin87 merged commit 877f826 into main Feb 3, 2025
19 checks passed
@anakin87 anakin87 deleted the fix-hfhub-embedder branch February 3, 2025 15:11

Successfully merging this pull request may close these issues.

HuggingFaceAPIDocumentEmbedder is not compatible with huggingface_hub>=0.28.0