Feature discussion: Lexical and hybrid search with Milvus 2.5 #1157

stefanwebb · 2025-02-14T22:26:59Z

I'd like to start a discussion on how we could add the new lexical search from Milvus 2.5, either for pure lexical search or as one half of hybrid search. This would mean storing the text directly in the vector database.

@doberst Before I start on a PR, could we please align on the design?

Here's how I would envisage it working in practice:

Lexical search:

LLMWareConfig().set_active_db("milvus")
MilvusConfig().set_config("host", "localhost", "port", 19530)

...

parsing_output = library.add_files(ingestion_folder_path)
query_results = Query(library).text_query(test_query, result_count=10)

and for hybrid search:

LLMWareConfig().set_active_db("milvus")
MilvusConfig().set_config("host", "localhost", "port", 19530)

...

embedding_model = "mini-lm-sbert"
library.add_files(ingestion_folder_path)
library.install_new_embedding(embedding_model_name=embedding_model, vector_db=vector_db, batch_size=100)

# Open question: is dual_pass_query equivalent to hybrid search? It doesn't seem to be documented.
query_results = Query(library).hybrid_query(sample_query, result_count=20)

I think calling LLMWareConfig().set_active_vector_db("milvus") when Milvus is already the active lexical database should emit a warning that the call is unnecessary, rather than throw an exception.
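To make the warn-but-don't-raise behavior concrete, here is a minimal sketch. `ActiveDbConfig` is a hypothetical stand-in written for this discussion, not the real `LLMWareConfig` class, and the warning text is only a suggestion.

```python
import warnings


class ActiveDbConfig:
    """Hypothetical stand-in for LLMWareConfig (assumption, not llmware code)."""

    def __init__(self):
        self.active_db = None   # text / lexical database
        self.vector_db = None   # vector database

    def set_active_db(self, name):
        self.active_db = name

    def set_active_vector_db(self, name):
        # If Milvus already serves as the lexical database, selecting it again
        # as the vector database is redundant: warn, but still accept the call.
        if name == "milvus" and self.active_db == "milvus":
            warnings.warn(
                "Milvus is already the active database; "
                "set_active_vector_db('milvus') is unnecessary.",
                UserWarning,
                stacklevel=2,
            )
        self.vector_db = name
```

Using `warnings.warn` keeps existing scripts working unchanged, while users who want strict behavior can still promote the warning to an error with `warnings.simplefilter("error")`.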

Also, library.add_files() should do the chunking, but no data should be inserted into the database until you call library.install_new_embedding or Query(library).text_query. The reason is that you need the full schema up front and want to insert the text and the embedding in the same operation: you can add fields dynamically in Milvus, but it's very inefficient, and updating existing entities is even more inefficient.
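The deferred-insert idea could be sketched roughly as follows. Everything here is hypothetical and only illustrates the control flow: `DeferredLibrary`, `chunk_text`, and `flush` are names invented for this example, the word-window chunker is a placeholder for the real parser, and `inserted_rows` stands in for the Milvus collection.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def chunk_text(text: str, max_words: int = 50) -> List[str]:
    """Placeholder chunker: split text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


@dataclass
class DeferredLibrary:
    """Hypothetical library that buffers chunks until the schema is known."""

    pending_chunks: List[str] = field(default_factory=list)
    inserted_rows: List[Dict[str, object]] = field(default_factory=list)

    def add_files(self, texts: List[str]) -> None:
        # Chunk only; nothing is written to the database yet.
        for text in texts:
            self.pending_chunks.extend(chunk_text(text))

    def flush(self, embed: Optional[Callable[[str], List[float]]] = None) -> int:
        # Build complete rows (text plus optional embedding) and insert them
        # in one pass, so no entity ever has to be updated afterwards.
        rows: List[Dict[str, object]] = []
        for chunk in self.pending_chunks:
            row: Dict[str, object] = {"text": chunk}
            if embed is not None:
                row["embedding"] = embed(chunk)
            rows.append(row)
        self.inserted_rows.extend(rows)
        self.pending_chunks.clear()
        return len(rows)
```

In this sketch, install_new_embedding would call `flush(embed=...)`, and a first text_query on a lexical-only library would call `flush()` with no embedding function.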

Do you think it would be a better design if the work of install_new_embedding were done inside add_files, with the embedding model configured through MilvusConfig().set_config?
