chore: event count throttle for squashed commands #4924

Open: wants to merge 3 commits into base: main

Conversation

kostasrim
Contributor

@kostasrim kostasrim commented Apr 11, 2025

Throttle/preempt flows that use the multi command squasher when crb crosses the limit.

thread_local size_t MultiCommandSquasher::throttle_size_limit_ =
absl::GetFlag(FLAGS_throttle_squashed);

thread_local util::fb2::EventCount MultiCommandSquasher::ec_;

MultiCommandSquasher::MultiCommandSquasher(absl::Span<StoredCmd> cmds, ConnectionContext* cntx,
Contributor Author

This is used not only from the async fiber but also directly from the connection. If we preempt, the connection will also "freeze". I guess this is fine; I'm just mentioning it here for completeness.

There are 3 calls of this and all of them should be ok if we preempt from these flows.

await cl.execute_command("exec")

# With the current approach this will overshoot
# await client.execute_command("multi")
Contributor Author

I wish we handled this case as well.

@@ -94,6 +104,9 @@ class MultiCommandSquasher {

// we increase size in one thread and decrease in another
static atomic_uint64_t current_reply_size_;
static thread_local size_t throttle_size_limit_;
// Used to throttle when memory is tight
static thread_local util::fb2::EventCount ec_;
Contributor Author

We need this to avoid calling ThisFiber::Yield or ThisFiber::SleepFor in a while(true) loop.
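To illustrate the point about blocking instead of polling, here is a minimal sketch. It is not the PR's code: it assumes `std::mutex` and `std::condition_variable` as stand-ins for the fiber-friendly `EventCount`, and the `ReplyThrottle` class with its `Acquire`/`Release` methods is hypothetical. The waiter sleeps until the predicate holds, so no yield or sleep loop is needed.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the throttle discussed above. The std primitives
// play the role that EventCount plays for fibers (assumption for illustration).
class ReplyThrottle {
 public:
  explicit ReplyThrottle(std::size_t limit) : limit_(limit) {}

  // Blocks until the tracked size is at or below the limit, then adds `bytes`.
  void Acquire(std::size_t bytes) {
    std::unique_lock<std::mutex> lk(mu_);
    cv_.wait(lk, [this] { return current_ <= limit_; });  // no busy loop
    current_ += bytes;
  }

  // Subtracts `bytes` and wakes any waiters so they re-check the predicate.
  void Release(std::size_t bytes) {
    {
      std::lock_guard<std::mutex> lk(mu_);
      current_ -= bytes;
    }
    cv_.notify_all();
  }

  std::size_t current() const {
    std::lock_guard<std::mutex> lk(mu_);
    return current_;
  }

 private:
  mutable std::mutex mu_;
  std::condition_variable cv_;
  std::size_t limit_;
  std::size_t current_ = 0;
};
```

A caller that finds the size above the limit simply parks in `Acquire` and is resumed by the next `Release`, rather than repeatedly yielding to re-check the counter.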

Collaborator

Since it's thread local, it's more efficient to use NoOpLock together with CondVarAny.

Contributor Author

@romange nice!

Contributor Author

Actually, I added a bug here:

 static atomic_uint64_t current_reply_size_;  

so current_reply_size_ is not thread local. What can happen is:

Core 0 -> starts multi/exec
Core 1 -> starts multi/exec but needs to throttle, so it goes to sleep waiting on its thread-local condition variable
Core 0 -> is done and notifies its own thread-local condition variable
Core 1 -> the fiber never wakes up, even though we decremented current_reply_size_.

Since current_reply_size_ is global, ec_ should be global as well.

P.S. I'm not very happy with this extra synchronization, but we only pay for it when we are under memory pressure.
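The scenario above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: plain `std::thread`s stand in for the per-core fibers, `std::condition_variable` stands in for `ec_`, and the `sketch` namespace, `kLimit`, `WaitBelowLimit`, `ReleaseReply` and `RunScenario` are hypothetical names. The key point is that the notifier lives at the same scope as the counter, so a release on one thread can wake a waiter on another.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

namespace sketch {

// Global counter, mirroring the non-thread-local current_reply_size_.
std::atomic<std::uint64_t> current_reply_size{0};

// The notifier is global too. If it were thread_local, "core 0" would notify
// its own instance and the waiter on "core 1" would never be woken.
std::mutex mu;
std::condition_variable cv;

constexpr std::uint64_t kLimit = 100;  // hypothetical limit for the sketch

// "Core 1": sleeps until the global counter drops to the limit or below.
void WaitBelowLimit() {
  std::unique_lock<std::mutex> lk(mu);
  cv.wait(lk, [] { return current_reply_size.load() <= kLimit; });
}

// "Core 0": finishes its work, decrements the counter, notifies all waiters.
void ReleaseReply(std::uint64_t bytes) {
  {
    // Decrement under the lock so a waiter cannot miss the update between
    // checking the predicate and going to sleep.
    std::lock_guard<std::mutex> lk(mu);
    current_reply_size.fetch_sub(bytes);
  }
  cv.notify_all();
}

// Runs the two-core scenario; returns true if the waiter woke up.
bool RunScenario() {
  current_reply_size.store(150);  // start above the limit
  std::atomic<bool> woke{false};
  std::thread core1([&woke] {
    WaitBelowLimit();
    woke.store(true);
  });
  std::thread core0([] { ReleaseReply(100); });
  core0.join();
  core1.join();
  return woke.load();
}

}  // namespace sketch
```

The mutex here is the extra synchronization cost mentioned above; in this sketch it is only contended when a thread actually has to wait or release.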

@kostasrim kostasrim changed the title from "[experiment do not review] chore: reject squashed when crb exceeds limit" to "chore: reject squashed when crb exceeds limit" Apr 22, 2025
@kostasrim kostasrim marked this pull request as ready for review April 22, 2025 11:40
@kostasrim kostasrim requested a review from adiholden April 22, 2025 11:40
@kostasrim
Contributor Author

@adiholden pinging for an early discussion here

@kostasrim kostasrim changed the title from "chore: reject squashed when crb exceeds limit" to "chore: event count throttle for squashed commands" Apr 22, 2025
@@ -15,6 +16,8 @@
#include "server/transaction.h"
#include "server/tx_base.h"

ABSL_FLAG(size_t, throttle_squashed, 0, "");
Contributor Author

@adiholden I will adjust as we said f2f. Looking for some early feedback based on our discussion

@@ -63,6 +66,10 @@ size_t Size(const facade::CapturingReplyBuilder::Payload& payload) {
} // namespace

atomic_uint64_t MultiCommandSquasher::current_reply_size_ = 0;
thread_local size_t MultiCommandSquasher::throttle_size_limit_ =
absl::GetFlag(FLAGS_throttle_squashed);
Collaborator

As discussed this morning, multiply by the thread count. The limit should be per thread, and current_reply_size_ is a global counter.

Contributor Author

Yes I know, I even wrote a comment above that I will follow up with this 😄

I wanted to know if you have anything else to add 😄
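As a rough sketch of the follow-up discussed in this thread: if the flag value is a per-thread limit and the counter is global, the comparison would be against the limit multiplied by the thread count. The helper names below are hypothetical, and treating a limit of 0 as "throttling disabled" is an assumption based only on the flag's default value of 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scales a per-thread limit up to a process-wide limit
// so it can be compared against a single global counter.
constexpr std::uint64_t GlobalThrottleLimit(std::uint64_t per_thread_limit,
                                            std::size_t thread_count) {
  return per_thread_limit * thread_count;
}

// Hypothetical check. Assumption: a per-thread limit of 0 disables throttling.
constexpr bool ShouldThrottle(std::uint64_t current_reply_size,
                              std::uint64_t per_thread_limit,
                              std::size_t thread_count) {
  return per_thread_limit != 0 &&
         current_reply_size >
             GlobalThrottleLimit(per_thread_limit, thread_count);
}
```

With a per-thread limit of 1024 bytes and 8 threads, the global counter would need to exceed 8192 bytes before a flow is throttled.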

@kostasrim kostasrim requested review from romange and adiholden April 30, 2025 11:42
3 participants