I'd like to start a discussion on how we could add the new lexical search from Milvus 2.5 (for pure lexical search, or hybrid search). This would mean that the text is stored directly in the vector database.
@doberst Before I start on a PR, could we please align on the design?
Here's how I would envisage it working in practice, for lexical search and for hybrid search:

```python
LLMWareConfig().set_active_db("milvus")
MilvusConfig().set_config("host", "localhost", "port", 19530)
...
embedding_model = "mini-lm-sbert"
library.add_files(ingestion_folder_path)
library.install_new_embedding(embedding_model_name=embedding_model, vector_db=vector_db, batch_size=100)
# is dual_pass_query equivalent to hybrid search? doesn't seem to be documented
query_results = Query(library).hybrid_query(sample_query, result_count=20)
```
I think `LLMWareConfig().set_active_vector_db("milvus")`, when Milvus is already the lexical database, should emit a warning that the call is unnecessary rather than throw an exception.
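To make the warn-but-don't-raise idea concrete, here is a minimal sketch using Python's standard `warnings` module. `SketchConfig` and its `_state` dictionary are hypothetical stand-ins invented for illustration; they are not llmware's actual `LLMWareConfig` implementation.

```python
import warnings


class SketchConfig:
    """Hypothetical stand-in for LLMWareConfig; not the real llmware class."""

    _state = {"active_db": None, "vector_db": None}

    @classmethod
    def set_active_db(cls, name):
        cls._state["active_db"] = name

    @classmethod
    def set_active_vector_db(cls, name):
        # Assumption: if Milvus already holds the text, selecting it again as
        # the vector DB is redundant, so warn instead of raising.
        if cls._state["active_db"] == "milvus" and name == "milvus":
            warnings.warn(
                "Milvus is already the active (lexical) database; "
                "set_active_vector_db('milvus') is unnecessary.",
                UserWarning,
                stacklevel=2,
            )
        # The setting is still applied, so existing scripts keep working.
        cls._state["vector_db"] = name
```

Existing scripts that set both databases would keep running unchanged and would only see the warning.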
Also, `library.add_files()` should do the chunking, but no data should be inserted into the database until you call `library.install_new_embedding` or `Query(library).text_query`. At that point the full schema is known, so the text and embedding can be inserted together. (You can add fields dynamically in Milvus, but it's very inefficient, and updating entities is even more inefficient.)
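The deferred-insert flow could look roughly like the sketch below. Everything here is hypothetical: `DeferredLibrary` is not an llmware class, the `store` list stands in for a Milvus collection, the fixed-width chunking is a placeholder for the real parser, and the substring match in `text_query` is only a stand-in for real lexical search.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class DeferredLibrary:
    """Hypothetical sketch: chunk in add_files, insert once the schema is known."""

    chunk_size: int = 200
    pending: List[str] = field(default_factory=list)  # chunks not yet inserted
    store: List[Dict] = field(default_factory=list)   # stand-in for a collection

    def add_files(self, texts: List[str]) -> None:
        # Chunk only; nothing is written to the database here.
        for text in texts:
            for i in range(0, len(text), self.chunk_size):
                self.pending.append(text[i:i + self.chunk_size])

    def _flush(self, embed: Optional[Callable[[str], List[float]]]) -> int:
        # Build complete rows (text plus optional embedding) and insert them
        # in one batch, so no entity has to be updated afterwards.
        rows = []
        for chunk in self.pending:
            row = {"text": chunk}
            if embed is not None:
                row["embedding"] = embed(chunk)
            rows.append(row)
        self.store.extend(rows)
        self.pending.clear()
        return len(rows)

    def install_new_embedding(self, embed: Callable[[str], List[float]]) -> int:
        # Hybrid case: text and embedding are inserted together.
        return self._flush(embed)

    def text_query(self, term: str) -> List[Dict]:
        # Lexical-only case: the first query triggers a text-only insert.
        if self.pending:
            self._flush(None)
        return [r for r in self.store if term.lower() in r["text"].lower()]
```

One consequence of this design worth discussing: the first call to `text_query` pays the full insertion cost, which may surprise users who expect queries to be read-only.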
Do you think it would be a better design if the work of `install_new_embedding` were done in `add_files`, with the embedding model configured through `MilvusConfig().set_config`?