-
Notifications
You must be signed in to change notification settings - Fork 448
[Feature] 添加LLM Token限流功能 | Add LLM Token Rate Limit #602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds LLM (Large Language Model) Token Rate Limiting functionality to sentinel-golang, implementing two rate limiting strategies: Fixed Window and PETA (Predictive Estimated Token Allowance). The feature enables token-based rate limiting for LLM API calls with support for multiple token counting strategies and Redis-based distributed rate limiting.
Key Changes:
- Implements Fixed Window and PETA token rate limiting strategies with Redis backend
- Adds token encoding support (currently OpenAI) for estimating token usage
- Provides adapters for Eino framework integration
- Includes comprehensive example implementation with Gin middleware
Reviewed Changes
Copilot reviewed 69 out of 102 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
go.mod |
Updates dependencies to support token rate limiting (Redis, tiktoken, copier, etc.) |
pkg/adapters/eino/wrapper.go |
Implements LLM wrapper for Eino framework with token rate limiting |
pkg/adapters/eino/options.go |
Defines configuration options for Eino adapter |
core/llm_token_ratelimit/*.go |
Core implementation of token rate limiting logic, rule management, and strategies |
core/llm_token_ratelimit/script/*.lua |
Redis Lua scripts for atomic rate limiting operations |
example/llm_token_ratelimit/* |
Complete example with Gin server and LLM client integration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the feature functionality is solid with no critical issues (as previously discussed). The rule loading mechanism still needs full alignment with Sentinel's existing datasource architecture. The proposed adjustments can be prioritized in future releases based on your schedule, but could you confirm your plan to proceed or defer them? This implementation is acceptable for the current preview/experimental release phase.
api/slot_chain.go
Outdated
| sc.AddRuleCheckSlot(isolation.DefaultSlot) | ||
| sc.AddRuleCheckSlot(hotspot.DefaultSlot) | ||
| sc.AddRuleCheckSlot(circuitbreaker.DefaultSlot) | ||
| sc.AddRuleCheckSlot(llmtokenratelimit.DefaultSlot) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest placing the slots related to rate limiting before those related to circuit breaking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your reply. I will place the llmtokenratelimit component right after the flow component and before the circuitbreaker component.
core/llm_token_ratelimit/config.go
Outdated
|
|
||
| type Config struct { | ||
| Enabled bool `json:"enabled" yaml:"enabled"` | ||
| Rules []*Rule `json:"rules" yaml:"rules"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Sentinel's design, configurations and rules are typically decoupled. Configurations are usually defined through YAML configuration files, specifying general parameters for features, while rules are dynamically updated via the datasource module, fully defining the behavior of the functionalities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your reply. I will adjust this part of the content and remove the rules from the configuration file.
| // limitations under the License. | ||
|
|
||
| // Package config provides llm token rate limit mechanism. | ||
| package llm_token_ratelimit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The package documentation comment for llm_token_ratelimit needs to be added to doc.go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your reply. I will add the documentation description for the llm_token_ratelimit component.
| values = []string{pattern} | ||
| } | ||
| for _, v := range values { | ||
| if util.RegexMatch(identifier.Value, key) && util.RegexMatch(pattern, v) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems to iterate through all headers, which could become a performance bottleneck. Considering actual requirements, key values likely don't require regex pattern matching. This would allow direct value retrieval from the map , completely avoiding the full iteration process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your reply, but iterating through all headers for regex matching is a necessary step.
Suppose users have configured some key-value pairs with regular expressions for request matching. When a request arrives, Sentinel does not know whether the request matches any rule. Therefore, it must retrieve all headers of the request and perform regex matching against all configured key-value pairs—this is an unavoidable process.
| ) | ||
|
|
||
| var ( | ||
| ruleMap = make(map[string][]*Rule) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rule loading mechanism isn't fully adapted for Sentinel's datasource module - consider referencing existing rule implementations for compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your reply. I will supplement the ext/datasource/helper.go content for the datasource module.
refers to: #596
性能测试报告