
feat: add reranker support to Python SDK#119

Open
staru09 wants to merge 3 commits into
usemoss:mainfrom
staru09:reranker_support

Conversation

@staru09
Contributor

@staru09 staru09 commented Apr 5, 2026

Pull Request Checklist

Please ensure that your PR meets the following requirements:

  • I have read the CONTRIBUTING guide.
  • I have updated the documentation (if applicable).
  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own code.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.

Fixes #84

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update




async def setup_index(client):
    """Delete old index, create fresh one with all FAQ data."""
Collaborator

Is the "delete old index" step missing?

Contributor Author

Darn, I think I removed it for a test and then forgot to add it back. Let me do it real quick.



async def main():
    client = MossClient(os.getenv("MOSS_PROJECT_ID"), os.getenv("MOSS_PROJECT_KEY"))
Collaborator

Can you please create a .env.example?

Contributor Author

done

top_n=5, # return top 5 after reranking
api_key=os.getenv("COHERE_API_KEY"),
),
)
Collaborator

Would the return type of the reranking definition be SearchResult?

Contributor Author

Yes, it would be of this type.

  QueryOptions(filter={"$and": [
      {"field": "city", "condition": {"$eq": "NYC"}},
-     {"field": "price", "condition": {"$lt": "50"}},
+     {"field ": "price", "condition": {"$lt": "50"}},
Collaborator

Please keep the changes minimal; the space here is not necessary.

model: Optional[str] = None
top_n: Optional[int] = None
api_key: Optional[str] = None
options: Dict[str, Any] = field(default_factory=dict)
Collaborator

Can you please use **kwargs? Adding an options field will make the args complex.

Contributor Author

cool

Contributor Author

So I can't use **kwargs with a dataclass by default. I have two options:

  1. use a regular class
  2. use dataclass(init=False) and write __init__ separately

which one do you think should be done here?
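
For reference, the two options could look roughly like this. It is only a sketch, not the PR's code: the class names `RerankOptionsPlain` and `RerankOptionsDC`, the `extra` attribute, and the `some_provider_flag` keyword are hypothetical; only `provider`, `model`, `top_n`, and `api_key` come from the snippets in this thread.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


class RerankOptionsPlain:
    """Option 1: a regular class that collects provider-specific kwargs."""

    def __init__(
        self,
        provider: str,
        model: Optional[str] = None,
        top_n: Optional[int] = None,
        api_key: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        self.provider = provider
        self.model = model
        self.top_n = top_n
        self.api_key = api_key
        self.extra: Dict[str, Any] = dict(kwargs)  # hypothetical attribute name


@dataclass(init=False)
class RerankOptionsDC:
    """Option 2: a dataclass that keeps the generated __repr__/__eq__
    but supplies its own __init__ so it can accept **kwargs."""

    provider: str
    model: Optional[str] = None
    top_n: Optional[int] = None
    api_key: Optional[str] = None
    extra: Dict[str, Any] = field(default_factory=dict)  # hypothetical field

    def __init__(
        self,
        provider: str,
        model: Optional[str] = None,
        top_n: Optional[int] = None,
        api_key: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        self.provider = provider
        self.model = model
        self.top_n = top_n
        self.api_key = api_key
        self.extra = dict(kwargs)


plain = RerankOptionsPlain("cohere", top_n=5, some_provider_flag=True)
dc = RerankOptionsDC("cohere", top_n=5, some_provider_flag=True)
```

Option 2 keeps value equality and a readable repr for free, at the cost of duplicating the field list in `__init__`; option 1 is shorter if neither is needed.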


  if is_loaded:
-     return await self._query_local(name, query, options)
+     result = await self._query_local(name, query, options)
Collaborator

Reranking should be optional, not a necessary path.

Contributor Author

cool

Contributor Author

I am not sure I understand correctly, but the query function already takes rerank as an optional argument that defaults to None:

async def query(
        self,
        name: str,
        query: str,
        options: Optional[QueryOptions] = None,
        *,
        rerank: Optional[RerankOptions] = None,

model: str = "rerank-v3.5",
**kwargs: Any,
):
"""Initialize the Cohere reranker.
Collaborator

Can we please use the SDK instead of the HTTP API?

Contributor Author

fixed this one

@devin-ai-integration
Contributor

Code Review Summary

Overall this is a solid feature addition — the architecture (Protocol-based Reranker, provider registry, clean integration into query()) is well-designed and extensible. Tests are good. A few issues to address:

Unresolved review comments from @yatharthk2

Several items that the author acknowledged have not been addressed in the current diff:

  1. Missing delete_index step in examples/python/reranking_sample.py: the docstring says "Delete old index, create fresh one" but delete_index is never called. (Comment #1; the author said they'd fix it.)
  2. Missing .env.example: it is not present in the diff. (Comment #3; the author said "done" but it's not in the changeset.)
  3. Spurious space in the moss_client.py docstring: "field " should be "field". (Comment #7)
  4. RerankOptions.options field vs **kwargs: still unresolved; the author asked for guidance in Comment #10. I'd suggest option 1 (regular class) for simplicity, since it's just a config container.
  5. Use the Cohere SDK instead of raw HTTP: the author said "fixed" (Comment #14) but cohere.py still uses httpx directly. This hasn't actually been changed.

New issues

  1. __init__.py full-file rewrite: The diff rewrites the entire file when only 3 lines changed (the RerankOptions import, its __all__ entry, and the # Reranking comment). Likely a line-ending or formatting issue — should be a minimal diff.

  2. Misleading inline comment: reranking_sample.py line 73 says # fetch 10 candidates but top_k=20 is used.

  3. Unnecessary restructuring of _query_local: The original code's final return await asyncio.to_thread(...) is moved into an else: block. Semantically equivalent but adds noise — the original flat structure was cleaner.

  4. _apply_rerank instantiates a new reranker on every query call: This means a new object (and in Cohere's case, a new httpx.AsyncClient) is created per rerank. Consider caching or accepting a pre-built reranker instance for repeated queries.

  5. time_taken_ms is misleading after reranking: _apply_rerank copies the original time_taken_ms into the new SearchResult, but the total latency now includes the reranking API call. Should either accumulate the reranking time or document that it only reflects retrieval time.
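
Items 4 and 5 above could be addressed along these lines. This is a minimal sketch under stated assumptions, not the SDK's implementation: `FakeReranker`, `get_cached_reranker`, and `rerank_with_timing` are hypothetical names, and the toy scoring stands in for a real provider call.

```python
import time
from functools import lru_cache
from typing import List, Tuple


class FakeReranker:
    """Hypothetical stand-in for a provider reranker; counts constructions."""

    instances = 0

    def __init__(self, provider: str, api_key: str) -> None:
        FakeReranker.instances += 1
        self.provider = provider
        self.api_key = api_key

    def rerank(self, query: str, docs: List[str], top_n: int) -> List[str]:
        # Toy scoring: shorter documents rank first.
        return sorted(docs, key=len)[:top_n]


@lru_cache(maxsize=8)
def get_cached_reranker(provider: str, api_key: str) -> FakeReranker:
    """Build one reranker per (provider, api_key) pair and reuse it."""
    return FakeReranker(provider, api_key)


def rerank_with_timing(
    query: str,
    docs: List[str],
    retrieval_ms: int,
    provider: str,
    api_key: str,
    top_n: int,
) -> Tuple[List[str], int]:
    """Rerank docs and return them with retrieval + rerank latency in ms."""
    reranker = get_cached_reranker(provider, api_key)
    start = time.perf_counter()
    ranked = reranker.rerank(query, docs, top_n)
    rerank_ms = int((time.perf_counter() - start) * 1000)
    return ranked, retrieval_ms + rerank_ms


docs = ["aaa", "a", "aa"]
first, total_ms = rerank_with_timing("q", docs, 12, "cohere", "key", 2)
second, _ = rerank_with_timing("q", docs, 12, "cohere", "key", 2)
```

Note that `lru_cache` keeps the API key in memory as part of the cache key; accepting a pre-built reranker instance from the caller avoids that and is the other alternative mentioned above.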

from __future__ import annotations
import os
from typing import Any, List, Optional
import httpx
Contributor

This still uses raw httpx — the switch to the Cohere SDK hasn't been made yet.

Comment thread examples/python/reranking_sample.py Outdated
QueryOptions(top_k=20, alpha=0.8), # fetch 10 candidates
rerank=RerankOptions(
provider="cohere",
model="rerank-v3.5",
Contributor

Nit: comment says # fetch 10 candidates but top_k=20. Should be # fetch 20 candidates.

@yatharthk2
Collaborator

can you please resolve the comments

@staru09 staru09 force-pushed the reranker_support branch from 85a603d to 4309523 Compare April 8, 2026 07:51
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@staru09 staru09 force-pushed the reranker_support branch from fa63a1b to 8d4cec3 Compare April 8, 2026 08:14
devin-ai-integration[bot]

This comment was marked as resolved.

@staru09 staru09 force-pushed the reranker_support branch 2 times, most recently from f24c01e to 56b42b3 Compare April 8, 2026 08:36
devin-ai-integration[bot]

This comment was marked as resolved.

@@ -0,0 +1 @@

Collaborator

Why is this file required?

Contributor Author

I guess I added it because some CI tests were failing.

Comment thread examples/python/reranking_sample.py Outdated

from dotenv import load_dotenv
from moss import MossClient, DocumentInfo, QueryOptions, RerankOptions
from moss.rerankers import CohereReranker
Collaborator

This import is not necessary. The user should only need to pass the "cohere" string in the options, and the SDK should internally take care of creating the Cohere reranker object.

Contributor Author

cool thing
let me add the registry again.

Comment thread examples/python/reranking_sample.py Outdated
Comment on lines +56 to +64
print("\nWith Cohere Reranking")
results = await client.query(
INDEX_NAME,
"How to get discount?",
QueryOptions(top_k=20, alpha=0.8),
rerank=RerankOptions(
provider="cohere",
api_key=os.getenv("COHERE_API_KEY"),
top_n=5,
Collaborator

Reranking needs to be part of the query options. Please refer to #84.

Contributor Author

I would have to modify the bindings file to do this, because as of now sdks/python/bindings/src/models.rs only supports:

#[pyclass(name = "QueryOptions")]
pub struct PyQueryOptions {
    pub embedding: Option<Vec<f32>>,
    pub top_k: Option<usize>,
    pub alpha: Option<f32>,
    pub filter: Option<Py<PyAny>>,
}

Collaborator

@yatharthk2 yatharthk2 Apr 19, 2026

No need to go to the bindings; you can do something roughly like the following in moss_client.py:

async def _query_local(
    self,
    name: str,
    query: str,
    options: Optional[QueryOptions],
) -> SearchResult:
    top_k = getattr(options, "top_k", None) or 5
    alpha = getattr(options, "alpha", None) or 0.8
    query_embedding = getattr(options, "embedding", None)
    filter = getattr(options, "filter", None)
    rerank = getattr(options, "rerank", None)

    fetch_k = top_k * 4 if rerank else top_k

    if query_embedding is None:
        try:
            result = await asyncio.to_thread(
                self._manager.query_text, name, query, fetch_k, alpha, filter,
            )
        except RuntimeError as e:
            if "requires explicit query embeddings" in str(e):
                raise ValueError(
                    "This index uses custom embeddings. "
                    "Query embeddings must be provided via QueryOptions.embedding."
                ) from e
            raise
    else:
        result = await asyncio.to_thread(
            self._manager.query, name, query, list(query_embedding), fetch_k, alpha, filter,
        )

    if rerank:
        from inferedge_moss.services.reranker import get_reranker

        api_key = rerank.api_key or self._rerank_api_key
        reranker = get_reranker(rerank.provider, api_key)
        final_n = rerank.top_n or top_k
        result.docs = reranker.rerank(query, result.docs, rerank.model, final_n)

    return result
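
One detail worth watching in a sketch like the one above: if `reranker.rerank` is a synchronous call that does network I/O, calling it directly inside an `async` function blocks the event loop. A minimal, self-contained illustration of the same over-fetch-then-rerank flow with the rerank step offloaded to a thread follows. The functions `retrieve`, `blocking_rerank`, and `query_with_rerank` are hypothetical stand-ins, and the default of 5 and the 4x multiplier are taken from the snippet above.

```python
import asyncio
from typing import List, Optional


def retrieve(query: str, k: int) -> List[str]:
    """Hypothetical stand-in for the index lookup; returns k candidates."""
    return [f"doc-{i}" for i in range(k)]


def blocking_rerank(query: str, docs: List[str], top_n: int) -> List[str]:
    """Hypothetical synchronous reranker; reversing is a toy 'scoring'."""
    return list(reversed(docs))[:top_n]


async def query_with_rerank(
    query: str,
    top_k: Optional[int] = None,
    rerank: bool = False,
    top_n: Optional[int] = None,
) -> List[str]:
    k = 5 if top_k is None else top_k
    # Over-fetch only when reranking, so the reranker has more candidates.
    fetch_k = k * 4 if rerank else k
    docs = await asyncio.to_thread(retrieve, query, fetch_k)
    if not rerank:
        return docs
    # Fall back to the resolved top_k so the over-fetch never leaks out.
    final_n = k if top_n is None else top_n
    # Run the blocking call off the event loop.
    return await asyncio.to_thread(blocking_rerank, query, docs, final_n)


reranked = asyncio.run(query_with_rerank("How to get discount?", rerank=True))
plain = asyncio.run(query_with_rerank("How to get discount?"))
```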

Collaborator

@yatharthk2 yatharthk2 left a comment

Please refer to issue #84.


@yatharthk2
Collaborator

Is this still being worked on?

@staru09
Contributor Author

staru09 commented Apr 23, 2026

this is still being worked upon ?

Yes, I will push a PR today. Sorry for the delay.

@vercel
Contributor

vercel Bot commented Apr 23, 2026

@staru09 is attempting to deploy a commit to the Moss Team Team on Vercel.

A member of the Team first needs to authorize it.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 3 new potential issues.


Comment thread sdks/python/sdk/src/moss/client/moss_client.py
Comment thread sdks/python/sdk/src/moss/client/moss_client.py Outdated
SearchResult,
)

from .client.models import QueryOptions, RerankOptions
Contributor

@devin-ai-integration devin-ai-integration Bot Apr 23, 2026

📝 Info: QueryOptions replaced from Rust class to Python dataclass — backward compatibility

The PR replaces QueryOptions imported from moss_core (a Rust-backed class) with a new Python @dataclass in models.py:39-54. This is a significant type change. However, it is safe because MossClient never passes the QueryOptions object directly to Rust functions — it always extracts individual fields via getattr() (e.g., moss_client.py:230-238). All existing callers (test_hot_reload.py, test_cloud_fallback.py, test_search.py, examples, packages) use from moss import QueryOptions and construct it with keyword arguments like QueryOptions(top_k=5), which works identically with the new dataclass since it's a superset of the old fields.
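
The compatibility argument can be illustrated with a small sketch. The field list below is an assumption pieced together from the Rust struct and the `rerank` attribute quoted earlier in this thread, not a copy of models.py, and `extract_fields` is a hypothetical helper that mimics the `getattr()` extraction described above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class QueryOptions:
    # Fields assumed from the old Rust-backed class.
    embedding: Optional[List[float]] = None
    top_k: Optional[int] = None
    alpha: Optional[float] = None
    filter: Optional[Any] = None
    # New field added for reranking (assumed name).
    rerank: Optional[Any] = None


def extract_fields(options: Optional[QueryOptions]) -> Dict[str, Any]:
    """Read each field with getattr so None options and old callers both work."""
    names = ("embedding", "top_k", "alpha", "filter", "rerank")
    return {name: getattr(options, name, None) for name in names}


old_style = extract_fields(QueryOptions(top_k=5))
no_options = extract_fields(None)
```

Because only individual attributes are read, any object exposing these names behaves the same from the client's point of view, which is why swapping the class type does not affect existing keyword-argument callers.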


Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 3 new potential issues.


Comment on lines +215 to +218
rerank = getattr(options, "rerank", None)
if rerank:
    top_k = getattr(options, "top_k", None)
    result = await self._apply_rerank(query, result, rerank, top_k)
Contributor

🔴 Reranking over-fetch multiplier leaks into final result count when top_k/top_n not explicitly set

When reranking is enabled but the user doesn't explicitly set top_k or top_n, the 4x internal over-fetch multiplier leaks into the final result count. At moss_client.py:217, top_k = getattr(options, "top_k", None) reads the raw user value (None when unset) and passes it to _apply_rerank as default_top_k. Meanwhile, _query_local (lines 230-232) and _query_cloud (lines 300-302) both independently resolve None → 5 and then compute fetch_k = 5 * 4 = 20. In _apply_rerank at line 282, final_n = rerank_opts.top_n or default_top_k evaluates to None or None = None, so the reranker receives top_k=None and returns all 20 over-fetched documents. The user gets 20 results instead of the expected default of 5.

Example triggering the bug
result = await client.query("idx", "search",
    QueryOptions(rerank=RerankOptions(provider="cohere", api_key="..."))
)
len(result.docs)  # Returns 20 instead of 5
Suggested change
- rerank = getattr(options, "rerank", None)
- if rerank:
-     top_k = getattr(options, "top_k", None)
-     result = await self._apply_rerank(query, result, rerank, top_k)
+ rerank = getattr(options, "rerank", None)
+ if rerank:
+     top_k = getattr(options, "top_k", None)
+     if top_k is None:
+         top_k = 5
+     result = await self._apply_rerank(query, result, rerank, top_k)
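
The leak can be reproduced in isolation. The sketch below is not the SDK code: `final_count_buggy`, `final_count_fixed`, and `truncate` are hypothetical helpers, and list slicing is used as an assumed model of a reranker that returns everything when its limit is None.

```python
from typing import List, Optional

DEFAULT_TOP_K = 5


def final_count_buggy(top_n: Optional[int], raw_top_k: Optional[int]) -> Optional[int]:
    """Mirrors `rerank_opts.top_n or default_top_k` with an unresolved top_k."""
    return top_n or raw_top_k


def final_count_fixed(top_n: Optional[int], raw_top_k: Optional[int]) -> int:
    """Resolve the default before it reaches the reranker."""
    if top_n is not None:
        return top_n
    return DEFAULT_TOP_K if raw_top_k is None else raw_top_k


def truncate(docs: List[int], limit: Optional[int]) -> List[int]:
    # docs[:None] is the whole list, which is how the over-fetch leaks out.
    return docs[:limit]


candidates = list(range(DEFAULT_TOP_K * 4))  # 20 over-fetched candidates
leaked = truncate(candidates, final_count_buggy(None, None))
bounded = truncate(candidates, final_count_fixed(None, None))
```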

Comment on lines +300 to +302
top_k = getattr(options, "top_k", None)
if top_k is None:
    top_k = 5
Contributor

🚩 Cloud fallback default top_k changed from 10 to 5

The old _query_cloud used top_k = getattr(options, "top_k", None) or 10, defaulting to 10 results. The new code at moss_client.py:300-302 uses if top_k is None: top_k = 5, changing the default to 5. This aligns with the local path's default of 5 (moss_client.py:231-232), making behavior consistent across both paths. This is likely intentional but is a behavioral change for users who relied on the implicit cloud default of 10 results. Additionally, the old or 10 pattern would treat top_k=0 as falsy (falling back to 10), while the new is None check correctly preserves explicit top_k=0.
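
The difference between the two defaulting patterns is easy to check in isolation. The helper names below are hypothetical; the defaults of 10 and 5 are the ones quoted in the comment above.

```python
from typing import Optional


def resolve_with_or(top_k: Optional[int], default: int = 10) -> int:
    """Old pattern: any falsy value, including 0, falls back to the default."""
    return top_k or default


def resolve_with_is_none(top_k: Optional[int], default: int = 5) -> int:
    """New pattern: only a missing value falls back to the default."""
    return default if top_k is None else top_k
```

With `top_k=0` the first helper returns 10 while the second returns 0; with `top_k=None` they return 10 and 5 respectively, which is the behavioral change flagged above.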


Comment on lines +41 to +46
try:
    from .cohere import CohereReranker

    register_reranker("cohere", CohereReranker)
except ImportError:
    pass
Contributor

🚩 Cohere reranker silently swallowed when package not installed

In rerankers/__init__.py:41-46, the CohereReranker import is wrapped in try/except ImportError: pass. If a user specifies provider="cohere" but doesn't have the cohere package installed, the registration silently skips, and they'll get a ValueError: Unknown reranker provider: 'cohere' at query time rather than a clear missing-dependency message. The error message at line 31-35 lists available providers but doesn't hint that the provider might exist but require an optional dependency. This could confuse users since the SDK documentation suggests cohere is a built-in provider.


Development

Successfully merging this pull request may close these issues.

[Feature]: Support for Reranking Models

2 participants