Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ratelimit ability to not count ratelimited requests for specific descriptors #10613

Open
nfuden opened this issue Feb 13, 2025 · 12 comments
Open
Assignees
Labels
Prioritized Indicating issue prioritized to be worked on in RFE stream Size: L Type: Enhancement Zendesk

Comments

@nfuden
Copy link
Collaborator

nfuden commented Feb 13, 2025

Migrated from OSS kgateway-dev#8699

Gloo Edge Product

Enterprise

Gloo Edge Version

v1.14.7

Is your feature request related to a problem? Please describe.

Customer would like to define 2 levels of ratelimits, first based on org and package, the second based on org and integrator. For example limit of 5 requests/min for Integrator and 10 requests/min for Org. Currently if Integrator1 does 8 requests he gets 429 after request 5 for the last 3 requests. If in the same minute Integrator 2 tries to do 5 requests he starts getting 429 after 2 requests as also the ratelimited requests are counted for the package level ratelimit.

Example ratelimitconfigs:

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: rl
  namespace: gloo-system
spec:
  raw:
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    setDescriptors:
    - rateLimit:
        requestsPerUnit: 10
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-org
      - key: odyssey-package
        value: basic
---
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: rl2
  namespace: gloo-system
spec:
  raw:
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: odyssey-int
          headerName: x-odc-int
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    setDescriptors:
    - rateLimit:
        requestsPerUnit: 5
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-int
      - key: odyssey-package
        value: basic

Describe the solution you'd like

A flag for specific descriptor that would tell that requests ratelimited based on this descriptor should not increase any counters.

Describe alternatives you've considered

Using envoy ratelimit API instead of the set-style with nested descriptors and weight on the integrator descriptor:

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: rl
  namespace: gloo-system
spec:
  raw:
    rateLimits:
    - actions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    - actions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
      - requestHeaders:
          descriptorKey: odyssey-int
          headerName: x-odc-int
    descriptors:
    - key: odyssey-org
      descriptors:
      - rateLimit:
          requestsPerUnit: 10
          unit: MINUTE
        key: odyssey-package
        value: basic
        descriptors:
        - rateLimit:
            requestsPerUnit: 5
            unit: MINUTE
          weight: 1
          key: odyssey-int

This does not server the purpose as any number of integrators can do 5 requests regardless of their package as only counter for the descriptor with non-zero weight is increased.

Additional Context

No response

@nfuden nfuden added Prioritized Indicating issue prioritized to be worked on in RFE stream Zendesk Type: Enhancement Size: L labels Feb 13, 2025
@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

zf-rl drawio

@sam-heilbron @nfuden updated diagram. Thanks @huzlak

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Whats the use case for not being able to define both the rate limits in the same config? https://docs.solo.io/gloo-edge/latest/guides/security/rate_limiting/set/#rule-priority

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

With regular setup in one config if first(integrator) rule matches, the other(organization) rule counter is not incremented. It should be incremented, but only for the requests that were not ratelimited by integrator's limit as these should count in whole organization limit because requests to upstream happened. The ones ratelimited by integrator limit should not be counted as they were ratelimited already and did not go to the upstream. If we would apply alwaysApply then the organization rule would be always applied even if the first was too and also for ratelimited requests. Customer needs a way to not count requests ratelimited by integrator config. As all the headers come from extauth stage, they can not split into early and regular ratelimitconfigs.

Adding example test results for better clarification:

Without AlwaysApply:
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int2";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int3";done; echo
200 200 200 200 200 429 429 429

With AlwaysApply:

$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int2";done; echo
200 200 429 429 429 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int3";done; echo
429 429 429 429 429 429 429 429

Used ratelimit config:

spec:
  raw:
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
      - requestHeaders:
          descriptorKey: odyssey-int
          headerName: x-odc-int
      - requestHeaders:
          descriptorKey: odyssey-org
          headerName: x-odc-org
      - requestHeaders:
          descriptorKey: odyssey-package
          headerName: x-odc-rb
    setDescriptors:
    - rateLimit:
        requestsPerUnit: 5
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-org
      - key: odyssey-package
      - key: odyssey-int
    - alwaysApply: true
      rateLimit:
        name: group_limit
        requestsPerUnit: 10
        unit: MINUTE
      simpleDescriptors:
      - key: odyssey-org
      - key: odyssey-package

Desired behavior:

$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int2";done; echo
200 200 200 200 200 429 429 429 
$ for call in $(seq 1 8); do curl -qs --output /dev/null $(glooctl proxy url)/get -k -w "%{http_code} " -H "x-odc-org: org1" -H "x-odc-rb: package1" -H "x-odc-int: int3";done; echo
429 429 429 429 429 429 429 429

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

So, if we look at the application policy for RL, we currently have "alwaysApply=true" and "alwaysApply=false". Now, say we have RL1 and RL2 in one config (pseudo code):

rateLimit:
  setDescriptors:
  - RL1-Integrator: 5 requests per minute
  - RL2-Organization: 10 requests per minute

The current behaviour is that either 1 rule is applied (alwaysApply=false) where the integrators are only rate-limited per integrator, and never as a group of integrators belonging to the same organization. Here every integrator has its own RL1 bucket. So that means, even if you want to rate limit on organization, every integrator will always be able to do 5 requests per minute, even if they, as a group, should be blocked at 10.

And the other option (alwaysApply=true) means that both rules always apply. And that means that a request will always be counted in both buckets, RL1 and RL2, even if RL1 has already rate-limited the request. So, if Integrator-1 starts sending hundreds of requests per second, they will be counted in both in Integrator-1's own RL1 bucket, but ALSO in the RL2 bucket for the organization. And this means that all of the other Integrators will be blocked as well by RL2.

So as I see it, the ask is to add an additional mode, say applyIfNotRateLimited=true (I'm very bad with names, and yes, this is a double negation, so you might want to come up with something better), of which the semantics should be that the rule should ONLY be applied if the request is not already rate-limited by a previous rule. And this should prevent requests already rate-limited by RL1 to also be counted in the RL2 bucket.

Something like this:

rateLimit:
  setDescriptors:
  - RL1-Integrator: 5 requests per minute
  - RL2-Organization: 10 requests per minute
     applyIfRateLimited=false

You could argue that it would make sense to introduce something like a mode property that could take a value out of set/enumeration of different modes: "alwaysApply", "applyIfNotRateLimited", instead of introducing another property. Argument for this is that "alwaysApply" and "applyIfNotRateLimited" cannot/should not be configured at the same time.

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Do we forsee the applyIfNotRateLimited use case ever being extended to a larger nested tree?

For example we say we have a place where you have the following rate limits

  1. Org total safety bucket, ie the dont blow up the service
  2. Team fairness bucket, ie dont let one sub team hog all the usage of a service within the org
  3. Sub group fairness, ie the Team itself may have 2 services and want to make sure that one of their 2 services never entirely consumes their team fairness bucket.

For example

  1. ORG: 100 requests per minute
  2. TEAM: 50 requests per minute pulled from header:team
  3. SubGroup: 40 requests per minute pulled from header:subgroup

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Hi @nfuden @huzlak any updates when this will be picked up we (ZF) have been waiting for this for a very long time..

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Zendesk ticket #4472 has been linked to this issue.

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Hi @nfuden @huzlak any updates ????

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Zendesk ticket #4932 has been linked to this issue.

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Hi @huzlak @nfuden any updates on this...

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

Hi @huzlak @nfuden This is now becoming very critical. Lot of customers are using the APIs. Please provide an ETA...

@nfuden
Copy link
Collaborator Author

nfuden commented Feb 13, 2025

@akhilaj Sorry for the slow response here it is my understanding that this is being picked up by solo now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Prioritized Indicating issue prioritized to be worked on in RFE stream Size: L Type: Enhancement Zendesk
Projects
None yet
Development

No branches or pull requests

2 participants