Add proposal for tenant limits API #6818

Open
wants to merge 4 commits into
base: master
70 changes: 70 additions & 0 deletions docs/proposals/limits-api.md
@@ -0,0 +1,70 @@
---
title: "Limits API"
linkTitle: "Limits API"
weight: 1
slug: limits-api
---

- Author: Bogdan Stancu
- Date: June 2025
- Status: Proposed

## Overview

This proposal outlines the design for a new API endpoint that will allow users to modify their current limits in Cortex. Currently, limits can only be changed by administrators modifying the runtime configuration file and waiting for it to be reloaded.

## Problem

Currently, when users need limit adjustments, administrators must:
1. Manually edit the runtime configuration file
2. Coordinate with users to verify the changes
3. Potentially repeat this process multiple times to find the right balance

This manual process is time-consuming, error-prone, and doesn't scale well with a large number of users. A self-service API would let users adjust their own limits within predefined boundaries, reducing administrative overhead and improving the user experience.

## Proposed API Design

### Endpoints

#### 1. GET /api/v1/user-limits

Contributor

Which component in Cortex will serve this API? Maybe a new admin service in Cortex for this purpose?

Author (@bogdan-at-adobe, Jun 23, 2025)

As @friedrichg said:

We already have a component that reads limits, so it's perfect for this use case.

So my guess is that cortex-overrides is a good place. That said, since GET /runtime_config is exposed on all components, I don't see a reason why the limits API couldn't be exposed the same way.

Member

We only want this in cortex-overrides; there is no need to put it in the other components.

Returns the current limits configuration for a specific tenant.

Contributor

Currently the limits are reloaded periodically at a fixed interval. Would this API read the config directly from storage?

Author

I think reading from the currently loaded config is good enough for this. Using the API to make changes will also trigger a reload of the loaded limits, so the only issue I see is when the config is changed manually: until it is reloaded, the API will return a stale answer for at most 10 seconds (assuming the default). Such a change is probably made by an admin who is aware of this implication. I might be wrong on this. The GET /runtime_config endpoint makes the same assumptions.

Member

I think the answer is yes; S3 is our only source of truth. This is similar to how the Cortex Alertmanager API works.


Response format:
```json
{
  "ingestion_rate": 10000,
  "ingestion_burst_size": 20000,
  "max_global_series_per_user": 1000000,
  "max_global_series_per_metric": 200000,
  ...
}
```
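
A client-side call to this endpoint could be sketched as follows. This is only an illustration under stated assumptions: the proposal does not say how the tenant is identified, so the `X-Scope-OrgID` header (used by other Cortex APIs) is assumed here, and the base URL and both function names are hypothetical.

```python
import json
import urllib.request


def build_get_limits_request(base_url, tenant_id):
    """Build (but do not send) a GET request for a tenant's limits."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/user-limits",
        # Assumption: the tenant is selected via X-Scope-OrgID; the proposal
        # does not specify this.
        headers={"X-Scope-OrgID": tenant_id},
        method="GET",
    )


def parse_limits(body):
    """Decode a response body into a dict of limit names to values."""
    limits = json.loads(body)
    if not isinstance(limits, dict):
        raise ValueError("expected a JSON object of limit names to values")
    return limits
```

Sending the request is left to the caller, for example with `urllib.request.urlopen(request)`, with the response body then passed to `parse_limits`.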

#### 2. PUT /api/v1/user-limits

Contributor

The limits are managed by the runtime config, which is stored either on a volume backed by a ConfigMap or in an S3/GCS/Azure bucket.

  • How would this API work in the former case?

Author

The end goal is to remove admin intervention for user limits. My initial idea was to write to either the ConfigMap or the S3/GCS/Azure bucket, but I'm not 100% sure of all the implications, other than that it requires more access.

Member

Good question. @bogdan-at-adobe ConfigMaps are normally read-only. Put it in the spec that the API will not support ConfigMaps, only block storage backends.

Updates limits for a specific tenant. The request body should contain only the limits that need to be updated.

Request body:
```json
{
  "ingestion_rate": 10000,
  "max_series_per_metric": 100000
}
```
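
The partial-update semantics described above (only the limits present in the request body change) might be sketched like this. The function name is hypothetical and the real handler may merge differently; this is just one reading of the text.

```python
def apply_partial_update(current, patch):
    """Return new tenant overrides with only the patched limits changed.

    Limits absent from `patch` keep their current values; the input
    dicts are not mutated.
    """
    updated = dict(current)
    updated.update(patch)
    return updated


current = {"ingestion_rate": 5000, "ingestion_burst_size": 20000}
patch = {"ingestion_rate": 10000, "max_series_per_metric": 100000}
merged = apply_partial_update(current, patch)
```

Here `merged` keeps `ingestion_burst_size` untouched, raises `ingestion_rate`, and gains `max_series_per_metric`.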

#### 3. DELETE /api/v1/user-limits
Removes tenant-specific limits, reverting to default limits.
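
One way to picture "reverting to default limits" is as removing the tenant's entry from a per-tenant overrides map, so that lookups fall back to the defaults. The sketch below assumes such a map keyed by tenant ID; both function names are hypothetical.

```python
def effective_limits(defaults, overrides, tenant_id):
    """Defaults overlaid with the tenant's overrides, if any exist."""
    return {**defaults, **overrides.get(tenant_id, {})}


def delete_tenant_limits(overrides, tenant_id):
    """Return a copy of the overrides map without the tenant's entry."""
    return {t: limits for t, limits in overrides.items() if t != tenant_id}


defaults = {"ingestion_rate": 25000, "ingestion_burst_size": 50000}
overrides = {"tenant-a": {"ingestion_rate": 10000}}

before = effective_limits(defaults, overrides, "tenant-a")
after = effective_limits(defaults, delete_tenant_limits(overrides, "tenant-a"), "tenant-a")
```

After the delete, `after` equals `defaults`, while `before` still carries the tenant-specific `ingestion_rate`.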

### Implementation Details

1. The API will be integrated into the cortex-overrides component to:
   - Read the current runtime config from the configured storage backend
   - Persist changes back to the storage backend

2. Security:
   - Rate limiting will be implemented to prevent abuse
   - Changes will be validated before being applied

3. Error Handling:
   - Invalid limit values will return 400 Bad Request
   - Storage backend errors will return 500 Internal Server Error
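
The validation and error-handling rules above could map onto a handler roughly as sketched below. Everything here is illustrative: the bounds table, `StorageError`, `handle_put`, and the 200 status on success are assumptions, since the proposal specifies only the 400 and 500 cases.

```python
from http import HTTPStatus

# Hypothetical predefined boundaries: limit name -> (min, max).
BOUNDS = {
    "ingestion_rate": (1, 100_000),
    "max_series_per_metric": (0, 500_000),
}


class StorageError(Exception):
    """Stand-in for a failure of the storage backend."""


def validate(patch, bounds):
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    for name, value in patch.items():
        if name not in bounds:
            errors.append(f"unknown limit: {name}")
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name}: expected a number")
        elif not bounds[name][0] <= value <= bounds[name][1]:
            errors.append(f"{name}: {value} is outside {bounds[name]}")
    return errors


def handle_put(patch, bounds, persist):
    """Validate, then persist; map failures to the proposed status codes."""
    errors = validate(patch, bounds)
    if errors:
        return HTTPStatus.BAD_REQUEST, {"errors": errors}
    try:
        persist(patch)
    except StorageError as exc:
        return HTTPStatus.INTERNAL_SERVER_ERROR, {"errors": [str(exc)]}
    return HTTPStatus.OK, {}  # Assumption: success status is not specified.


def unavailable_backend(patch):
    """Demonstration-only persist callback that always fails."""
    raise StorageError("bucket unavailable")
```

Validating before touching storage means an invalid request never causes a write, and a backend failure is reported separately from a client error.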
