Implement leaderelector.Interface in terms of the new leader elector client #5499
base: main
Signed-off-by: Tom Wieczorek <[email protected]>
This is a useful abstraction that runs a callback whose context gets cancelled as soon as the lead gets lost. Signed-off-by: Tom Wieczorek <[email protected]>
013e254 to ed5071e
leaderelection.RunLeaderTasks(ctx, l.status.Peek, func(ctx context.Context) {
	l.log.Info("acquired leader lease")
	runCallbacks(l.acquiredLeaseCallbacks)
	<-ctx.Done()
	l.log.Infof("lost leader lease (%v)", context.Cause(ctx))
	runCallbacks(l.lostLeaseCallbacks)
})
I don't seem to get this 🤔
Why are we running both the acquired and the lost callbacks in the same round? Is the ctx done when we lose the leadership? If so, maybe consider naming it to reflect that. Currently it looks too much like a "normal" context, which makes the intention hard to figure out.
Is the ctx done when we lose the leadership?
Yes. This is the doc string on the RunLeaderTasks function:
// Runs the provided tasks function when the lead is taken. It continuously
// monitors the leader election status using statusFunc. When the lead is taken,
// the tasks function is called with a context that is cancelled either when the
// lead has been lost or ctx is done. After the tasks function returns, the
// process is repeated until ctx is done.
That function tries to encapsulate a common pattern: "Do something as long as you're holding the lease". When the context is done, this means that you should stop, no matter why. Could be because the lease got lost, could be because the parent context got cancelled for whichever reason (timeout, application shutdown, ...).
If so, maybe consider naming it somehow to reflect that, currently it looks too much like a "normal" context and thus hard to figure out the intention.
This is indeed a normal context, there's nothing special about it. RunLeaderTasks just "adds" another cancellation cause to it (losing the lease), in pretty much the same way that signal.NotifyContext can be used to "add" SIGINT/SIGTERM and friends as cancellation causes to a context.
That being said, do you have any suggestions on the naming, or how to structure it differently? We could add more explanatory comments, of course, and improve the RunLeaderTasks docstring. On the other hand, this exact function is sort of a compatibility layer anyways, as the plan is to completely replace the callbacks in the first place. Maybe that's why it's looking a bit weird. The new way would be to use RunLeaderTasks directly in the components that need it. The applier manager, for example, does this now:
leaderelection.RunLeaderTasks(ctx, m.LeaderElector.CurrentStatus, func(ctx context.Context) {
	wait.UntilWithContext(ctx, m.runWatchers, time.Minute)
})
This is kinda succinct and doesn't look that awkward, I think?
That being said, do you have any suggestions on the naming, or how to structure it differently?
I think the structure is quite ok. It's just a bit hard, at least for me, to grasp the context usage. Since we're essentially using the context to tell that we've lost the lease, maybe something like leaderContext would make it a bit more obvious.
The PR is marked as stale since no activity has been recorded in 30 days
Description
Leaderelector.LeasePool and K0sControllersLeaseCounter now use the new leader elector client internally. Deprecate all old interface methods in favor of CurrentStatus, which reports the current leader status along with a channel that is closed when the status changes. Callers should then call CurrentStatus again to get the new status.
This way, the leader elector doesn't have to keep track of callback methods, callers can't block each other, and there is no need to "deregister" when callers are no longer interested in leader status changes.
The old API is fully emulated via the new API, and can be easily removed once it's no longer being called.
As a PoC, convert the applier manager to the new API as the first component.
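The "status plus expiry channel" idea described above could look roughly like the following. This is only a sketch under assumptions: statusHolder, its set method and the Status values are hypothetical, and the real CurrentStatus signature in k0s may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is a hypothetical stand-in for the leader status.
type Status string

const (
	StatusPending Status = "pending"
	StatusLeading Status = "leading"
)

// statusHolder is a hypothetical holder that publishes status changes by
// closing a channel instead of invoking registered callbacks.
type statusHolder struct {
	mu      sync.Mutex
	status  Status
	changed chan struct{}
}

func newStatusHolder() *statusHolder {
	return &statusHolder{status: StatusPending, changed: make(chan struct{})}
}

// CurrentStatus returns the current status along with a channel that is
// closed as soon as that status is outdated.
func (h *statusHolder) CurrentStatus() (Status, <-chan struct{}) {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.status, h.changed
}

// set stores a new status and wakes up everyone waiting on the old channel.
func (h *statusHolder) set(s Status) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if s == h.status {
		return
	}
	h.status = s
	close(h.changed)
	h.changed = make(chan struct{})
}

func main() {
	h := newStatusHolder()

	status, changed := h.CurrentStatus()
	fmt.Println("status:", status)

	go h.set(StatusLeading)

	<-changed // wait until the observed status is outdated
	status, _ = h.CurrentStatus()
	fmt.Println("status:", status)
}
```

Closing a channel wakes all waiters at once, so the holder keeps no list of callers, a slow caller cannot delay the others, and a caller that loses interest simply stops reading the channel.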