Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhance: add broadcast for streaming service #39020

Merged

Conversation

chyezh
Copy link
Contributor

@chyezh chyezh commented Jan 6, 2025

issue: #38399

  • Add new rpc for transfer broadcast to streaming coord
  • Add broadcast service at streaming coord to make broadcast message sent automicly

@sre-ci-robot sre-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines. label Jan 6, 2025
@mergify mergify bot added dco-passed DCO check passed. kind/enhancement Issues or changes related to enhancement labels Jan 6, 2025
Copy link
Contributor

mergify bot commented Jan 6, 2025

@chyezh go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link

codecov bot commented Jan 6, 2025

Codecov Report

Attention: Patch coverage is 75.21368% with 116 lines in your changes missing coverage. Please review.

Project coverage is 81.13%. Comparing base (f0dae81) to head (4fe6647).
Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
.../streamingcoord/client/broadcast/broadcast_impl.go 15.62% 27 Missing ⚠️
...eamingcoord/server/broadcaster/broadcaster_impl.go 83.08% 18 Missing and 5 partials ⚠️
pkg/streaming/util/message/builder.go 58.33% 11 Missing and 9 partials ⚠️
...nternal/streamingcoord/server/service/broadcast.go 22.22% 14 Missing ⚠️
pkg/streaming/util/message/message_impl.go 79.48% 4 Missing and 4 partials ⚠️
internal/streamingcoord/server/server.go 65.00% 5 Missing and 2 partials ⚠️
internal/metastore/kv/streamingcoord/kv_catalog.go 78.57% 4 Missing and 2 partials ⚠️
pkg/util/contextutil/context_util.go 55.55% 3 Missing and 1 partial ⚠️
internal/distributed/streaming/append.go 62.50% 2 Missing and 1 partial ⚠️
internal/streamingcoord/client/client_impl.go 0.00% 2 Missing ⚠️
... and 1 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #39020       +/-   ##
===========================================
+ Coverage   69.53%   81.13%   +11.59%     
===========================================
  Files         296     1394     +1098     
  Lines       26607   197032   +170425     
===========================================
+ Hits        18502   159863   +141361     
- Misses       8105    31558    +23453     
- Partials        0     5611     +5611     
Components Coverage Δ
Client 79.53% <ø> (∅)
Core 69.55% <ø> (+0.01%) ⬆️
Go 83.08% <75.21%> (∅)
Files with missing lines Coverage Δ
internal/distributed/streaming/streaming.go 100.00% <ø> (ø)
internal/distributed/streaming/wal.go 64.86% <100.00%> (ø)
internal/metastore/catalog.go 100.00% <ø> (ø)
internal/streamingcoord/client/client.go 96.82% <100.00%> (ø)
...reamingcoord/server/broadcaster/append_operator.go 100.00% <100.00%> (ø)
internal/streamingcoord/server/broadcaster/task.go 100.00% <100.00%> (ø)
internal/streamingcoord/server/builder.go 100.00% <100.00%> (ø)
...nternal/streamingcoord/server/resource/resource.go 95.91% <100.00%> (ø)
...ingnode/server/flusher/flusherimpl/flusher_impl.go 85.91% <ø> (ø)
internal/streamingnode/server/resource/resource.go 88.73% <ø> (ø)
... and 20 more

... and 1069 files with indirect coverage changes

@chyezh chyezh force-pushed the enhance_add_broadcast_for_msg_system branch 2 times, most recently from adc133d to 1292957 Compare January 7, 2025 09:57
Copy link
Contributor

mergify bot commented Jan 7, 2025

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh chyezh force-pushed the enhance_add_broadcast_for_msg_system branch 2 times, most recently from 129b3bd to 9e1bfb5 Compare January 7, 2025 11:18
Copy link
Contributor

mergify bot commented Jan 7, 2025

@chyezh go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 7, 2025

@chyezh cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

@chyezh
Copy link
Contributor Author

chyezh commented Jan 7, 2025

rerun cpp-unit-test

@chyezh
Copy link
Contributor Author

chyezh commented Jan 7, 2025

rerun go-sdk

@mergify mergify bot added the ci-passed label Jan 7, 2025
- Add new rpc for transfer broadcast to streaming coord
- Add broadcast service at streaming coord to make broadcast message
  sent

Signed-off-by: chyezh <[email protected]>
@chyezh chyezh force-pushed the enhance_add_broadcast_for_msg_system branch from 9e1bfb5 to 4fe6647 Compare January 8, 2025 06:33
@mergify mergify bot added ci-passed and removed ci-passed labels Jan 8, 2025

func (c *catalog) SaveBroadcastTask(ctx context.Context, task *streamingpb.BroadcastTask) error {
key := buildBroadcastTaskPath(task.TaskId)
if task.State == streamingpb.BroadcastTaskState_BROADCAST_TASK_STATE_DONE {
Copy link
Contributor

@jaime0815 jaime0815 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the RemoveBroadcastTask method to delete metadata from the catalog?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to keep only state transfer semantic at catalog layer.


// Start n workers to handle the broadcast task.
wg := sync.WaitGroup{}
for i := 0; i < 4; i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the CPU count instead of hardcoding the number 4.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it

case task := <-b.backoffChan:
// task is backoff, push it into backoff queue to make a delay retry.
b.backoffs.Push(task)
case <-nextBackOff:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reinsert the backoff task at the front of the pending queue; otherwise, consuming messages waiting for Barrie will take a long time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it


// Poll polls the task, return nil if the task is done, otherwise not done.
// Poll can be repeated called until the task is done.
func (b *broadcastTask) Poll(ctx context.Context, operator AppendOperator) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method is somewhat confusingly named "poll," as it actually executes the task. so suggest separating the task's state check and its execution into two distinct methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it

if len(newPendings) == 0 {
b.future.Set(&types.BroadcastAppendResult{AppendResults: b.appendResult})
}
b.logger.Info("broadcast task make a new broadcast done", zap.Int("pendingMessages", len(b.pendingMessages)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use "backoff messages " instead of "pendingMessages"

@jaime0815
Copy link
Contributor

Address the issue in the next PR.
/lgtm

@sre-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: chyezh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit 3bcdd92 into milvus-io:master Jan 9, 2025
19 of 20 checks passed
@chyezh chyezh deleted the enhance_add_broadcast_for_msg_system branch January 9, 2025 08:27
sre-ci-robot pushed a commit that referenced this pull request Jan 9, 2025
issue: #38399
pr: #39020

- Add new rpc for transfer broadcast to streaming coord
- Add broadcast service at streaming coord to make broadcast message
sent automicly

also cherry pick the pr #38400

---------

Signed-off-by: chyezh <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved ci-passed dco-passed DCO check passed. kind/enhancement Issues or changes related to enhancement lgtm size/XXL Denotes a PR that changes 1000+ lines.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants