Add proposal for tenant limits API #6818
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
--- | ||
title: "Limits API" | ||
linkTitle: "Limits API" | ||
weight: 1 | ||
slug: limits-api | ||
--- | ||
|
||
- Author: Bogdan Stancu | ||
- Date: June 2025 | ||
- Status: Proposed | ||
|
||
## Overview | ||
|
||
This proposal outlines the design for a new API endpoint that will allow users to modify their current limits in Cortex. Currently, limits can only be changed by administrators modifying the runtime configuration file and waiting for it to be reloaded. | ||
|
||
## Problem | ||
|
||
Currently, when users need limit adjustments, administrators must: | ||
1. Manually edit the runtime configuration file | ||
2. Coordinate with users to verify the changes | ||
3. Potentially repeat this process multiple times to find the right balance | ||
|
||
This manual process is time-consuming, error-prone, and doesn't scale well with a large number of users. By offering a self-service API, users can adjust their own limits within predefined boundaries, reducing the administrative overhead and improving the user experience. | ||
|
||
## Proposed API Design | ||
|
||
### Endpoints | ||
|
||
#### 1. GET /api/v1/user-limits | ||
Returns the current limits configuration for a specific tenant. | ||
Review comment: Currently the limits are loaded periodically at an interval. Would this API read the config directly from storage?
Reply: I think reading from the currently loaded config is good enough for this. Using the API to make changes will also trigger a reload of the loaded limits, so the only issue I see is a manual config change that has not been reloaded yet, which would make the API return a stale answer for at most 10 seconds (assuming the default). Such a change is probably made by an admin who is aware of this implication. I might be wrong on this.
Reply: I think the answer is yes, S3 is our only source of truth, similar to how the Cortex Alertmanager API works. |
||
|
||
Response format: | ||
```json | ||
{ | ||
"ingestion_rate": 10000, | ||
"ingestion_burst_size": 20000, | ||
"max_global_series_per_user": 1000000, | ||
"max_global_series_per_metric": 200000, | ||
... | ||
} | ||
``` | ||
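As an illustration of how a client might call this endpoint, here is a minimal Python sketch. It rests on assumptions not stated in the proposal: the tenant is identified by the `X-Scope-OrgID` header that Cortex uses elsewhere, and `base_url` is a placeholder for wherever the API ends up being served. The helper names are hypothetical.

```python
import json
import urllib.request


def build_get_limits_request(base_url: str, tenant_id: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one tenant's limits."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/user-limits",
        method="GET",
        # Assumption: the tenant is selected with the X-Scope-OrgID header.
        headers={"X-Scope-OrgID": tenant_id, "Accept": "application/json"},
    )


def parse_limits(body: bytes) -> dict:
    """Decode the JSON response body into a dict of limit name -> value."""
    limits = json.loads(body)
    if not isinstance(limits, dict):
        raise ValueError("expected a JSON object of limits")
    return limits
```

The request can be sent with `urllib.request.urlopen(request)` and the response body passed to `parse_limits`.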
|
||
#### 2. PUT /api/v1/user-limits | ||
Review comment: The limits are managed by the runtime-config, which is stored either on a volume backed by a ConfigMap or in an S3/GCS/Azure bucket.
Reply: The end goal is to remove admin intervention for user limits. My initial idea was to write to either the ConfigMap or the S3/GCS/Azure bucket, but I'm not 100% sure of all the implications, other than requiring more access.
Reply: Good question. @bogdan-at-adobe, ConfigMaps are normally read-only. Put it in the spec that the API will not support ConfigMaps, only block storage backends. |
||
Updates limits for a specific tenant. The request body should contain only the limits that need to be updated. | ||
|
||
Request body: | ||
```json | ||
{ | ||
"ingestion_rate": 10000, | ||
"max_series_per_metric": 100000 | ||
} | ||
``` | ||
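The partial-update semantics described above (only the limits present in the request body change) can be sketched as a simple merge. This is an illustrative sketch, not the proposed implementation, and `apply_partial_update` is a hypothetical helper name.

```python
def apply_partial_update(current: dict, patch: dict) -> dict:
    """Return a new limits dict in which only the keys in `patch` are replaced."""
    updated = dict(current)  # copy so the caller's dict is left untouched
    updated.update(patch)    # keys absent from `patch` keep their old values
    return updated


current = {"ingestion_rate": 5000, "ingestion_burst_size": 20000}
patch = {"ingestion_rate": 10000, "max_series_per_metric": 100000}
merged = apply_partial_update(current, patch)
```

Here `ingestion_burst_size` is preserved because the request body does not mention it.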
|
||
#### 3. DELETE /api/v1/user-limits | ||
Removes tenant-specific limits, reverting to default limits. | ||
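
To illustrate what reverting to defaults means, the sketch below models the runtime config as a mapping of tenant ID to overrides layered on top of default limits. The function names and the in-memory layout are assumptions made for the example only.

```python
def effective_limits(defaults: dict, overrides: dict, tenant_id: str) -> dict:
    """Defaults overlaid with the tenant's overrides, if it has any."""
    return {**defaults, **overrides.get(tenant_id, {})}


def delete_tenant_limits(overrides: dict, tenant_id: str) -> dict:
    """Return a copy of `overrides` without the given tenant's entry."""
    return {t: limits for t, limits in overrides.items() if t != tenant_id}


defaults = {"ingestion_rate": 25000, "ingestion_burst_size": 50000}
overrides = {"tenant-a": {"ingestion_rate": 10000}}

before = effective_limits(defaults, overrides, "tenant-a")
after = effective_limits(defaults, delete_tenant_limits(overrides, "tenant-a"), "tenant-a")
```

After the delete, `tenant-a` has no entry of its own, so its effective limits are exactly the defaults.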
|
||
### Implementation Details | ||
|
||
1. The API will be integrated into the cortex-overrides component to: | ||
- Read the current runtime config from the configured storage backend | ||
- Persist changes back to the storage backend | ||
|
||
2. Security: | ||
- Rate limiting will be implemented to prevent abuse | ||
|
||
- Changes will be validated before being applied | ||
|
||
|
||
3. Error Handling: | ||
- Invalid limit values will return 400 Bad Request | ||
- Storage backend errors will return 500 Internal Server Error | ||
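
The read, validate, persist flow and the error mapping listed above can be sketched as follows. This is a rough sketch under stated assumptions: the runtime config is modelled as JSON rather than YAML to stay within the standard library, `InMemoryBackend` stands in for a real object storage client, the validation rule (non-negative numbers) is a placeholder for the predefined boundaries, and a successful update returns 200. All names are hypothetical.

```python
import json


class StorageError(Exception):
    """Raised when the storage backend cannot be read or written."""


class InMemoryBackend:
    """Stand-in for an object storage bucket holding the runtime config."""

    def __init__(self, blob: bytes = b'{"overrides": {}}'):
        self.blob = blob
        self.fail = False  # flip to True to simulate a backend outage

    def read(self) -> bytes:
        if self.fail:
            raise StorageError("read failed")
        return self.blob

    def write(self, data: bytes) -> None:
        if self.fail:
            raise StorageError("write failed")
        self.blob = data


def validate(patch: dict) -> list:
    """Placeholder rule: every limit must be a non-negative number."""
    errors = []
    for name, value in patch.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            errors.append(f"invalid value for {name}")
    return errors


def handle_put(backend: InMemoryBackend, tenant_id: str, body: bytes) -> tuple:
    """Return (status code, message) for a PUT /api/v1/user-limits request."""
    try:
        patch = json.loads(body)
    except ValueError:
        return 400, "request body is not valid JSON"
    if not isinstance(patch, dict):
        return 400, "request body must be a JSON object"
    errors = validate(patch)
    if errors:
        return 400, "; ".join(errors)
    try:
        # Read the current config, merge the tenant's changes, write it back.
        config = json.loads(backend.read())
        tenant = config.setdefault("overrides", {}).setdefault(tenant_id, {})
        tenant.update(patch)
        backend.write(json.dumps(config).encode())
    except StorageError as exc:
        return 500, f"storage backend error: {exc}"
    return 200, "limits updated"
```

Validation runs before any storage access, so a rejected request never touches the stored config.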
|
Review comment: Which component in Cortex will serve this API? Maybe a new admin service in Cortex for this purpose?
Reply: as @friedrichg said
So my guess is that cortex-overrides is a good place. Since GET /runtime_config is exposed on all components, though, I don't see a reason why the limits API wouldn't be the same.
Reply: We only want this in cortex-overrides; there is no need to put it in the other components.