enhance: update tantivy for removing "doc_id" fast field #41198
Signed-off-by: SpadeA <[email protected]>
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #41198      +/-   ##
==========================================
+ Coverage   80.36%   80.58%   +0.22%
==========================================
  Files        1484     1476       -8
  Lines      211989   209856    -2133
==========================================
- Hits       170359   169113    -1246
+ Misses      35472    34612     -860
+ Partials     6158     6131      -27
/lgtm
Issue: #41210
After zilliztech/tantivy#5, we can pass the Milvus row ID directly to tantivy instead of recording it in the "doc_id" fast field.
Previously, a search returned tantivy doc IDs, and each one then had to be mapped to a Milvus row ID by reading "doc_id". Now the tantivy doc ID returned by a search is itself the Milvus row ID, which eliminates the expensive row ID lookup phase.
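The difference between the two lookup paths can be sketched with a toy model. This is not tantivy's or Milvus's API: the classes `FastFieldIndex` and `DirectIdIndex`, their methods, and the sample data are all hypothetical, and a plain list stands in for the "doc_id" fast field column.

```python
from typing import Dict, List


class FastFieldIndex:
    """Toy model of the old path: the index assigns its own doc IDs and
    stores the caller's row ID in a per-document column (a stand-in for
    the "doc_id" fast field)."""

    def __init__(self) -> None:
        self.postings: Dict[str, List[int]] = {}
        self.doc_id_column: List[int] = []  # internal doc ID -> row ID

    def add(self, term: str, row_id: int) -> None:
        internal_id = len(self.doc_id_column)  # assigned by insertion order
        self.doc_id_column.append(row_id)
        self.postings.setdefault(term, []).append(internal_id)

    def search(self, term: str) -> List[int]:
        internal_hits = self.postings.get(term, [])
        # Second phase: one extra column read per hit to recover the row ID.
        return [self.doc_id_column[d] for d in internal_hits]


class DirectIdIndex:
    """Toy model of the new path: the caller supplies the row ID and the
    index uses it as the doc ID, so no mapping column is needed."""

    def __init__(self) -> None:
        self.postings: Dict[str, List[int]] = {}

    def add(self, term: str, row_id: int) -> None:
        self.postings.setdefault(term, []).append(row_id)

    def search(self, term: str) -> List[int]:
        # Hits are already row IDs; there is no second phase.
        return list(self.postings.get(term, []))


# Hypothetical sample data: (row ID, term) pairs.
rows = [(10, "hello"), (11, "world"), (12, "hello")]
old_index, new_index = FastFieldIndex(), DirectIdIndex()
for row_id, term in rows:
    old_index.add(term, row_id)
    new_index.add(term, row_id)

old_hits = old_index.search("hello")
new_hits = new_index.search("hello")
```

Both searches return the same row IDs, but the first performs an additional per-hit lookup. With 1M matching rows, as in the benchmark described here, that lookup runs once for every hit.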
The following is a simple benchmark that inserts 1M docs, with every row containing "hello". Latency is measured at the segcore level on a 9900K CPU:

The latency ratios are 2.02x and 2.1x, respectively.
Benchmark code: