Token bucket unification #582
base: master
Conversation
…ucket and waku/TokenBucket (compensating) with no interface change
… allow refill just after period boundary elapsed.
…y with small refactoring
LGTM! Thanks for such a masterpiece of a PR! 🙌
I'm just adding a few nits that I hope you find useful.
…upon code review finding
cc: @richard-ramos I would be happy if you would look at it from a libp2p perspective. Thx.
Great to see, though this needs a security review from a nimbus-eth2 perspective since it's on the critical security path there - let's hold off merging until that's done.
Needs a draft eth2 PR including the changes, along with a security review of its impact on that codebase
Doing that! Thanks for the pointer!
is this described in literature / papers somewhere?
No, it's not. If we need such a "paper", I can write a spec or something similar discussing these replenishing algorithms. Please suggest the right place for it.
The docs of the code would probably be a good place to start, or if it's more complex than that, the chronos book. From the description, "better utilization" doesn't really explain what's going on, and "minting tokens based on elapsed time" sounds a bit like it could violate the upper bound condition, in which case it's no longer a token bucket.

Instead, it would be better to formalize the precise behavior of the modes, ie how they behave mathematically - then it's up to the consumer to decide which behavior is "better" ("better utilization" is not something you put in the docs without quantifying what "better" means). Basically, the parameters you're looking to control with a token bucket are the long-term average rate (capacity / fillDuration) and the maximum burst, ie the bucket capacity.
A consequence of the above design is that you can de facto never take out a batch larger than the bucket bound - ie if your bucket holds 100 items and you want to send 101, you're stuck. You could argue that in this case the bucket should wait until it has accumulated space for 101 items, ie it should behave exactly as if someone requested 100 items then 1 item (or conversely 1 item then 100 items). As long as these invariants are maintained - ie as long as, under all conditions, the behavior reduces to the description in the literature if only we split up the requests - we're fine and we don't need "modes". If it doesn't reduce to the above behavior, it's no longer a token bucket. Now, there are some nuances here that one could argue about.
Another way to reframe the above invariant: there should be no period of time over which more tokens can be consumed than the capacity plus the tokens minted during that period. I think it's useful to document the behavior with respect to these invariants - then we can compare the two approaches and find gaps in either where they would be exploitable.
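For reference, that bound can be written out (a sketch of the standard formulation, with $C$ the bucket capacity and $r = C / \text{fillDuration}$ the refill rate):

$$\mathrm{consumed}(t_0, t_1) \;\le\; C + r\,(t_1 - t_0) \qquad \text{for all } t_0 \le t_1$$

Any mode whose behavior stays under this curve for every window $[t_0, t_1]$ is still a token bucket; any mode that exceeds it is not.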
…ment algorithm, extended TokenBucket unit test
@arnetheduck
@arnetheduck a gentle nudge here - please let me know whether the added explanation and docs are satisfactory? Thank you.
chronos/ratelimit.nim (Outdated)

```nim
  if currentTime < bucket.lastUpdate:
    return
  return (bucket.budgetCapacity, currentTime)
```
this will introduce a drift, since the current time likely will not land on a fillDuration boundary .. the net result of the drift will be a lower average throughput, even if updates happen on a regular basis (ie at least once per fill duration).

What you're looking to return here is `max(bucket.lastUpdate + bucket.fillDuration, currentTime - fillDuration)` or something like that, to ensure that all tokens are handed out correctly.
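A minimal sketch of that suggestion, assuming chronos-style `Moment`/`Duration` arithmetic (the helper name is illustrative; the real change would live inside the update logic):

```nim
# Hypothetical helper implementing the suggestion above: never advance the
# last update past what the handed-out tokens justify, so partial-period
# time keeps accruing instead of being silently dropped.
func discreteUpdateTime(bucket: TokenBucket, currentTime: Moment): Moment =
  max(bucket.lastUpdate + bucket.fillDuration,
      currentTime - bucket.fillDuration)
```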
I got your point, but given this calculation is for Discrete mode, I think we need to calculate the distinct replenish periods elapsed since lastUpdate, because the replenish might not happen that often. That way we can be precise with fillDuration boundaries. WDYT?
> precise with fillDuration boundaries
This is only meaningful if the initial "update tick" is set to a specific value. However, the downside of this approach is that you again reduce the effective rate - another way to describe it is that you're rounding the last update up to the next fillDuration boundary, and this rounding is effectively "time during which no tokens are created".
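A worked instance of that loss (illustrative numbers, not from the thread): with fill period $f$ and a last update at time $t$, rounding up to the next boundary discards

$$t_{\mathrm{lost}} = \lceil t/f \rceil\, f - t \;\in\; [0, f)$$

of minting time, ie up to $r \cdot f = C$ tokens per quiet stretch in the worst case.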
…w algo comparison from documentation, fix Discrete mode update calculation to properly calculate correct last update time by period distance calculation.
Thank you for the comments @arnetheduck. I addressed your findings.
```nim
if bucket.budget >= tokens:
  # If bucket is full, consider this point as period start, drop silent periods before
  if bucket.budget == bucket.capacity:
    bucket.lastUpdate = now
```
how does this change the behavior?
```nim
  replenishMode: ReplenishMode
# ...
proc update(bucket: TokenBucket, currentTime: Moment) =
func periodDistance(bucket: TokenBucket, currentTime: Moment): float =
```
a better factoring would be to have an `elapsed(bucket, now)` function that returns a Duration, to be used for both discrete and continuous mode - it would clamp the elapsed time to values >= 0, which is what both modes do.
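A minimal sketch of that factoring, assuming chronos `Moment`/`Duration` arithmetic (the helper comes from the suggestion above, not from the PR diff):

```nim
func elapsed(bucket: TokenBucket, now: Moment): Duration =
  ## Time passed since the last update, clamped to >= 0 so that a clock
  ## moving backwards never yields a negative (token-destroying) interval.
  if now <= bucket.lastUpdate:
    ZeroDuration
  else:
    now - bucket.lastUpdate
```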
```nim
    replenishMode: replenishMode
  )

proc setState*(bucket: TokenBucket, budget: int, lastUpdate: Moment) =
```
this function would have to take into account waiting requests for budget - basically, setting the values here is a special version of a manual replenish and needs to be treated as such to maintain internal consistency. This begs the question: what is the use case that isn't covered by a manual replenish?
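For illustration, a minimal sketch of "treat it as a manual replenish" (hypothetical; assumes module-internal field access and the existing `replenish` proc):

```nim
# Hypothetical sketch: route external state changes through the replenish
# path so that futures parked in `consume` are woken consistently.
proc setState*(bucket: TokenBucket, budget: int, lastUpdate: Moment) =
  bucket.lastUpdate = lastUpdate
  let delta = budget - bucket.budget
  if delta > 0:
    # Adding budget may unblock waiters, exactly like a manual replenish.
    bucket.replenish(delta, lastUpdate)
  else:
    bucket.budget = budget
```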
This PR intends to unify all diverging TokenBucket implementations. See waku-org/nwaku#3543.
With different needs arising from nim-waku for DoS protection of req/resp protocols, a slightly different implementation was added to nim-waku, copied and adapted from nim-chronos.
Now nim-chat-sdk has new requirements.
To be future proof, we decided to go back to the roots and fulfill all these needs in a single place.
libp2p usage was also taken into account, so no interface is broken and the expected token utilization is similar to the original implementation.
What's done:

- Fixed: the "Short replenish" test was failing in CI (and got skipped) when enough time had passed between TokenBucket initialization and the first update. Because the initial period is calculated not from the first consume but from init, tokens could be overused. This bug is fixed here.
- A Strict mode is added, which allows refill only after a full period has passed, not in between.
  > This mode is also needed in nim-chat-sdk.
- The `compensating` mode followed a burst-balancing calculation, but one different from the original implementation, allowing some overuse of tokens after a silent period of time. The unified balanced replenish covers the `compensating` mode requirements as well - thus no need to distinguish it with a third mode.
- Balanced mode is the default for TokenBucket, while the user can decide to go for the Strict replenish mode at creation.
- Documentation and a comparison to the previous implementation.
TokenBucket — Usage Modes (Overview)

TokenBucket provides several usage modes and patterns depending on how you want to rate-limit:

- Continuous mode (default):
  - Tokens are minted continuously at the rate `capacity / fillDuration`, adding only whole tokens.
  - When the bucket is full, `lastUpdate` is set to the current time.
- Discrete mode:
  - Tokens are refilled only when a full `fillDuration` has elapsed (step-like refill behavior).
- Manual-only replenish (`fillDuration = 0`):
  - Tokens are only added via `replenish(tokens)`.
- Synchronous consumption:
  - `tryConsume(tokens, now)` returns `true` on success, `false` otherwise.
  - On a consume from full, `lastUpdate` is set to `now` (prevents idle-at-full credit banking in Continuous mode).
- Asynchronous consumption:
  - `consume(tokens, now) -> Future[void]` completes when enough tokens become available or `replenish()` is called.
- Capacity and timing introspection:
  - `getAvailableCapacity(now)` reports the budget available at `now` without mutating bucket state.
- Manual replenishment:
  - `replenish(tokens, now)`

The sections below illustrate Continuous semantics with concrete timelines and compare them with the older algorithm for context. A short usage sketch follows.
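To make the modes above concrete, a hedged usage sketch against the public chronos API (the PR keeps this interface; the optional replenish-mode argument added at creation is omitted here since its exact spelling may still change):

```nim
import chronos
import chronos/ratelimit

proc demo() {.async.} =
  # Continuous replenish (default): capacity 10, refilled over 1 second.
  let bucket = TokenBucket.new(10, 1.seconds)

  # Synchronous attempt: returns false instead of waiting.
  if bucket.tryConsume(3):
    echo "took 3 tokens"

  # Asynchronous consumption: the future completes once enough tokens
  # have been minted (or someone calls replenish).
  await bucket.consume(10)

  # Manual-only bucket: fillDuration = 0, tokens appear only via replenish.
  let manual = TokenBucket.new(10, 0.seconds)
  manual.replenish(5)

waitFor demo()
```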
TokenBucket Continuous Mode — Scenario 1 Timeline

Assumptions:

- `C = 10`
- `fillDuration = 1s` (per-token time: 100ms)
- At `t = 0ms`: `budget = 10`, `lastUpdate = 0ms`

Legend: entries note where `lastUpdate` changes at that step (reason).

Only request events are listed below (no passive availability checks):
Notes:

- When the bucket reaches capacity during an update, `lastUpdate` is set to the current time; leftover elapsed time is discarded.
- Consuming from a full bucket moves `lastUpdate` to the consume time (prevents idle-at-full credit banking).

Consumption Summary (0–3s window)

Per `fillDuration` period (1s each):

- Period 1 (0–1s): 15 tokens
- Period 2 (1–2s): 11 tokens
- Period 3 (2–3s): 10 tokens

Total consumed over 3 seconds: 15 + 11 + 10 = 36 tokens.
Old Algorithm (Pre-unification) — Scenario 1 Timeline

Reference: the old `TokenBucket` from master (`chronos/ratelimit.nim`) before the unification.

Key behavioral differences vs the current Continuous mode are spelled out in the Diff Notes below.

Assumptions and inputs are identical to the table above:

- `C = 10`, `fillDuration = 1s` (100ms per token)
- At `t = 0ms`: `budget = 10`, `lastUpdate = 0ms`

Consumption Summary (0–3s window, old algorithm)

Total consumed over 3 seconds (old): 36 tokens.
Diff Notes — Continuous vs Old

- Consume from full:
  - Current: moves `lastUpdate` to the consume time before subtracting.
  - Old: does not touch `lastUpdate` on consume.
- When the bucket is full during an elapsed interval:
  - Current: sets `lastUpdate = currentTime` (no hidden time credit).
  - Old: advances `lastUpdate` only by the time used to mint whole tokens, leaving fractional leftover time to carry to the next update.
- Hitting capacity mid-update (overfill path):
  - Current: sets `lastUpdate = currentTime`, discarding leftover elapsed time.
  - Old: sets `lastUpdate = lastUpdate + usedTime` (the time to mint the whole tokens), retaining the leftover fractional part of the elapsed interval.
- Time resolution:
- Practical impact:
Comparison Examples

High-rate single-token requests (Continuous)

Settings:

- `C = 10`, `fillDuration = 10ms` (per-token time: 1ms)
- Window: 0–40ms (4 full periods)

We show how the bucket rejects attempts that exceed the available budget at each instant, ensuring no more than `capacity + minted` tokens are usable in any time frame. Over 0–40ms, at most 10 (initial capacity) + 4 × 10 (minted) = 50 tokens can be consumed.

Request batches and outcomes:

Totals over 0–40ms:

- Consumed: 50 tokens (10 + 4×10)

Why the rejections happen (preventing overuse):

- Tokens are minted at `capacity / fillDuration` = 1 token/ms; the table shows how many become available just before each batch.
- Requests beyond the available budget fail (or wait in `consume()`), enforcing the rate limit.
- The usable total over the window is capped at 10 + (40ms × 1/ms) = 50.
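As a hedged sanity check of that bound, a minimal simulation sketch (the greedy draining loop and timings are illustrative, built on the public API and the Continuous semantics described above):

```nim
import chronos
import chronos/ratelimit

# Replay the 0-40ms window with greedy single-token requests and count how
# many succeed; the token-bucket bound caps the total at 10 + 40 = 50.
proc countConsumed(): int =
  let start = Moment.now()
  let bucket = TokenBucket.new(10, 10.milliseconds)  # mints 1 token per ms
  for step in 0 .. 40:
    let now = start + step.milliseconds
    # Drain whatever budget is available at this instant.
    while bucket.tryConsume(1, now):
      inc result

echo countConsumed()  # expected: at most 50
```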