Merged
37 commits
2028c27
refactor(core): 🔨 modularize client and usage architecture
Mirrowel Jan 21, 2026
cd744cc
refactor(core): 🔨 finalize modular architecture and preserve legacy i…
Mirrowel Jan 21, 2026
26f2846
feat(core): ✨ add async credential waiting and quota group sync
Mirrowel Jan 21, 2026
1c86c22
feat(core): ✨ parse granular Google quota details and cleanup streaming
Mirrowel Jan 21, 2026
523ea31
feat(usage): ✨ implement granular tracking, cost calculation, and hooks
Mirrowel Jan 21, 2026
01dffe7
feat(core): ✨ integrate advanced logging and dynamic usage configuration
Mirrowel Jan 21, 2026
673ca0e
feat(usage): ✨ implement quota group aggregation and explicit initial…
Mirrowel Jan 21, 2026
3450c73
feat(core): ✨ enhance usage stats aggregation and enforce async clien…
Mirrowel Jan 21, 2026
790c01e
refactor(core): 🔨 transition availability checks to async and expose …
Mirrowel Jan 21, 2026
4f1cceb
feat(usage): ✨ track reasoning and cache write tokens
Mirrowel Jan 22, 2026
2136d98
refactor(core): 🔨 preserve legacy client and usage manager implementa…
Mirrowel Jan 22, 2026
dfd2070
feat(usage): ✨ track internal provider retries and enhance api docume…
Mirrowel Jan 22, 2026
22ef3f6
fix(usage): 🐛 prevent stale api data from overwriting local counts
Mirrowel Jan 22, 2026
7ed4c9c
feat(client): ✨ add Anthropic API compatibility handler
Mirrowel Jan 22, 2026
ba60834
refactor(usage): 🔨 centralize window aggregation and reconcile usage …
Mirrowel Jan 22, 2026
94231e0
refactor(client): 🔨 centralize request execution setup logic
Mirrowel Jan 22, 2026
39731b7
refactor(usage): 🔨 decouple model and group usage statistics
Mirrowel Jan 23, 2026
b56dd6b
feat(usage): ✨ make window limits optional and restore legacy logging
Mirrowel Jan 23, 2026
7635027
fix(usage): 🐛 make group and model quota updates mutually exclusive
Mirrowel Jan 23, 2026
effdd28
refactor(usage): 🔨 standardize window config and add human-readable t…
Mirrowel Jan 23, 2026
3a046f5
refactor(usage): 🔨 improve quota resolution strategy and defaults
Mirrowel Jan 23, 2026
610fe8e
fix(usage): 🐛 respect window limit configuration in tracking engine
Mirrowel Jan 23, 2026
9e75529
refactor(client): 🔨 share provider instances across client components
Mirrowel Jan 23, 2026
aa744f1
refactor(providers): 🔨 enforce singleton pattern for provider instances
Mirrowel Jan 23, 2026
2ce5478
fix(providers): 🐛 refine quota exhaustion logic and fetch contexts
Mirrowel Jan 23, 2026
87d04d1
feat(usage): ✨ implement flexible quota thresholds and advanced durat…
Mirrowel Jan 23, 2026
cdded49
fix(usage): 🐛 enforce independent checks for model and group caps
Mirrowel Jan 23, 2026
06db1e2
feat(usage): ✨ implement historical max request tracking
Mirrowel Jan 23, 2026
b212f50
feat(quota-viewer): ✨ enhance quota viewer with multi-window support …
Mirrowel Jan 24, 2026
03a22ef
fix(usage): 🐛 lazy initialize window start and reset times
Mirrowel Jan 24, 2026
ba35176
refactor(usage): 🔨 update quota sorting logic to prioritize limit size
Mirrowel Jan 24, 2026
731e848
feat(quota-viewer): ✨ implement fair cycle and custom cap visualization
Mirrowel Jan 24, 2026
dc763c0
refactor(ui): 🔨 replace raw emojis with rich markup aliases
Mirrowel Jan 24, 2026
640be5a
refactor(core): 🔨 remove legacy modules and simplify executor logic
Mirrowel Jan 24, 2026
f1c963c
chore(config): 🧹 update usage persistence mounts and documentation
Mirrowel Jan 24, 2026
89f0358
refactor(core): 🔨 extract executor helpers and centralize credential …
Mirrowel Jan 24, 2026
bb6e627
refactor(core): 🔨 extract credential logging and callback helpers
Mirrowel Jan 24, 2026
7 changes: 4 additions & 3 deletions .gitignore
@@ -126,9 +126,10 @@ staged_changes.txt
launcher_config.json
quota_viewer_config.json
cache/antigravity/thought_signatures.json
logs/
cache/
/logs/
/cache/
*.env

oauth_creds/
/oauth_creds/

/usage/
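
The leading slash added in this hunk anchors each directory pattern to the repository root, whereas the old `logs/` and `cache/` forms could match a directory of that name at any depth. A minimal sketch of that difference, assuming plain directory names with no wildcards; `dir_pattern_matches` is a hypothetical helper written for illustration and is not part of Git or of this project:

```python
from pathlib import PurePosixPath


def dir_pattern_matches(pattern: str, rel_path: str) -> bool:
    """Simplified check for directory-only patterns such as 'logs/' or '/logs/'.

    Hypothetical helper: it handles plain names only (no wildcards) and
    expects rel_path to be a file path relative to the repository root.
    """
    name = pattern.strip("/")
    # Every component except the last one is a directory.
    dirs = PurePosixPath(rel_path).parts[:-1]
    if pattern.startswith("/"):
        # Anchored: the directory must sit directly at the repository root.
        return len(dirs) > 0 and dirs[0] == name
    # Unanchored: the directory may appear at any depth.
    return name in dirs


print(dir_pattern_matches("/logs/", "logs/app.log"))
print(dir_pattern_matches("/logs/", "src/logs/app.log"))
print(dir_pattern_matches("logs/", "src/logs/app.log"))
```

Under these assumptions, only the unanchored form would ignore a nested `src/logs/` directory, which is presumably why the patterns were anchored once source packages with similar names were introduced.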
20 changes: 10 additions & 10 deletions DOCUMENTATION.md
@@ -22,9 +22,9 @@ This architecture cleanly separates the API interface from the resilience logic,

This library is the heart of the project, containing all the logic for managing a pool of API keys, tracking their usage, and handling provider interactions to ensure application resilience.

### 2.1. `client.py` - The `RotatingClient`
### 2.1. `client/rotating_client.py` - The `RotatingClient`

The `RotatingClient` is the central class that orchestrates all operations. It is designed as a long-lived, async-native object.
The `RotatingClient` is the central class that orchestrates all operations. It is now a slim facade that delegates to modular components (executor, filters, transforms) while remaining a long-lived, async-native object.

#### Initialization

@@ -35,7 +35,7 @@ client = RotatingClient(
api_keys=api_keys,
oauth_credentials=oauth_credentials,
max_retries=2,
usage_file_path="key_usage.json",
usage_file_path="usage.json",
configure_logging=True,
global_timeout=30,
abort_on_callback_error=True,
@@ -50,7 +50,7 @@ client = RotatingClient(
- `api_keys` (`Optional[Dict[str, List[str]]]`, default: `None`): A dictionary mapping provider names to a list of API keys.
- `oauth_credentials` (`Optional[Dict[str, List[str]]]`, default: `None`): A dictionary mapping provider names to a list of file paths to OAuth credential JSON files.
- `max_retries` (`int`, default: `2`): The number of times to retry a request with the *same key* if a transient server error occurs.
- `usage_file_path` (`str`, default: `"key_usage.json"`): The path to the JSON file where usage statistics are persisted.
- `usage_file_path` (`str`, optional): Base path for usage persistence (defaults to `usage/` in the data directory). The client stores per-provider files under `usage/usage_<provider>.json`.
- `configure_logging` (`bool`, default: `True`): If `True`, configures the library's logger to propagate logs to the root logger.
- `global_timeout` (`int`, default: `30`): A hard time limit (in seconds) for the entire request lifecycle.
- `abort_on_callback_error` (`bool`, default: `True`): If `True`, any exception raised by `pre_request_callback` will abort the request.
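
The updated `usage_file_path` description implies one JSON file per provider under a shared base directory. A minimal sketch of that naming scheme, assuming the `usage/usage_<provider>.json` layout quoted above; `usage_file_for` and the provider name `"gemini"` are hypothetical and only illustrate the layout, not how the library resolves paths internally:

```python
from pathlib import Path


def usage_file_for(provider: str, base_dir: str = "usage") -> Path:
    """Build the per-provider usage file path (hypothetical helper)."""
    # Mirrors the documented pattern: <base_dir>/usage_<provider>.json
    return Path(base_dir) / f"usage_{provider}.json"


# "gemini" is an example provider name chosen for illustration.
print(usage_file_for("gemini").as_posix())
```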
@@ -96,9 +96,9 @@ The `_safe_streaming_wrapper` is a critical component for stability. It:
* **Error Interception**: Detects if a chunk contains an API error (like a quota limit) instead of content, and raises a specific `StreamedAPIError`.
* **Quota Handling**: If a specific "quota exceeded" error is detected mid-stream multiple times, it can terminate the stream gracefully to prevent infinite retry loops on oversized inputs.

### 2.2. `usage_manager.py` - Stateful Concurrency & Usage Management
### 2.2. `usage/manager.py` - Stateful Concurrency & Usage Management

This class is the stateful core of the library, managing concurrency, usage tracking, cooldowns, and quota resets.
This class is the stateful core of the library, managing concurrency, usage tracking, cooldowns, and quota resets. Usage tracking now lives in the `rotator_library/usage/` package with per-provider managers and `usage/usage_<provider>.json` storage.

#### Key Concepts

@@ -419,7 +419,7 @@ The `CooldownManager` handles IP or account-level rate limiting that affects all
- All subsequent `acquire_key()` calls for that provider will wait until the cooldown expires


### 2.10. Credential Prioritization System (`client.py` & `usage_manager.py`)
### 2.10. Credential Prioritization System (`client/rotating_client.py` & `usage/manager.py`)

The library now includes an intelligent credential prioritization system that automatically detects credential tiers and ensures optimal credential selection for each request.

@@ -762,7 +762,7 @@ Acquiring key for model antigravity/claude-opus-4.5. Tried keys: 0/12(17,cd:3,fc
```

**Persistence:**
Cycle state is persisted in `key_usage.json` under the `__fair_cycle__` key.
Cycle state is persisted alongside usage data in `usage/usage_<provider>.json`.

### 2.20. Custom Caps

@@ -1773,7 +1773,7 @@ The system follows a strict hierarchy of survival:

2. **Credential Management (Level 2)**: OAuth tokens are cached in memory first. If credential files are deleted, the proxy continues using cached tokens. If a token refresh succeeds but the file cannot be written, the new token is buffered for retry and saved on shutdown.

3. **Usage Tracking (Level 3)**: Usage statistics (`key_usage.json`) are maintained in memory via `ResilientStateWriter`. If the file is deleted, the system tracks usage internally and attempts to recreate the file on the next save interval. Pending writes are flushed on shutdown.
3. **Usage Tracking (Level 3)**: Usage statistics (`usage/usage_<provider>.json`) are maintained in memory via `ResilientStateWriter`. If the file is deleted, the system tracks usage internally and attempts to recreate the file on the next save interval. Pending writes are flushed on shutdown.

4. **Provider Cache (Level 4)**: The provider cache tracks disk health and continues operating in memory-only mode if disk writes fail. Has its own shutdown mechanism.
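
The usage-tracking level in the list above keeps state in memory and recreates the file on a later save if it disappears. A minimal sketch of that behavior, assuming a simple dirty flag and an atomic temp-file rename; `InMemoryFirstWriter` is a hypothetical stand-in written for illustration and does not reproduce the library's `ResilientStateWriter`:

```python
import json
from pathlib import Path
from typing import Any, Dict


class InMemoryFirstWriter:
    """Hypothetical stand-in; not the library's ResilientStateWriter."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.state: Dict[str, Any] = {}  # authoritative copy lives in memory
        self.dirty = False

    def update(self, key: str, value: Any) -> None:
        self.state[key] = value
        self.dirty = True

    def flush(self) -> bool:
        """Try to persist; on failure keep the state buffered for a retry."""
        if not self.dirty and self.path.exists():
            return True  # nothing new to write and the file is still there
        try:
            # Recreate the directory in case it was deleted at runtime.
            self.path.parent.mkdir(parents=True, exist_ok=True)
            tmp = self.path.with_suffix(".tmp")
            tmp.write_text(json.dumps(self.state))
            tmp.replace(self.path)  # atomic rename on the same filesystem
        except OSError:
            return False  # state stays in memory and remains dirty
        self.dirty = False
        return True
```

In this sketch, a periodic task and the shutdown handler would both call `flush()`, so deleting the file between saves loses nothing that is still held in memory.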

@@ -1813,7 +1813,7 @@ INFO:rotator_library.resilient_io:Shutdown flush: all 2 write(s) succeeded
This architecture supports a robust development workflow:

- **Log Cleanup**: You can safely run `rm -rf logs/` while the proxy is serving traffic. The system will recreate the directory structure on the next request.
- **Config Reset**: Deleting `key_usage.json` resets the persistence layer, but the running instance preserves its current in-memory counts for load balancing consistency.
- **Config Reset**: Deleting `usage/usage_<provider>.json` resets the persistence layer, but the running instance preserves its current in-memory counts for load balancing consistency.
- **File Recovery**: If you delete a critical file, the system attempts directory auto-recreation before every write operation.
- **Safe Exit**: Ctrl+C triggers graceful shutdown with final data flush attempt.

35 changes: 22 additions & 13 deletions README.md
@@ -53,20 +53,21 @@ docker run -d \
-v $(pwd)/.env:/app/.env:ro \
-v $(pwd)/oauth_creds:/app/oauth_creds \
-v $(pwd)/logs:/app/logs \
-v $(pwd)/usage:/app/usage \
-e SKIP_OAUTH_INIT_CHECK=true \
ghcr.io/mirrowel/llm-api-key-proxy:latest
```

**Using Docker Compose:**

```bash
# Create your .env file and key_usage.json first, then:
# Create your .env file and usage directory first, then:
cp .env.example .env
touch key_usage.json
mkdir usage
docker compose up -d
```

> **Important:** You must create both `.env` and `key_usage.json` files before running Docker Compose. If `key_usage.json` doesn't exist, Docker will create it as a directory instead of a file, causing errors.
> **Important:** Create the `usage/` directory before running Docker Compose so usage stats persist on the host.

> **Note:** For OAuth providers, complete authentication locally first using the credential tool, then mount the `oauth_creds/` directory or export credentials to environment variables.

@@ -335,17 +336,20 @@ The proxy includes a powerful text-based UI for configuration and management.
### TUI Features

- **🚀 Run Proxy** — Start the server with saved settings
- **⚙️ Configure Settings** — Host, port, API key, request logging
- **⚙️ Configure Settings** — Host, port, API key, request logging, raw I/O logging
- **🔑 Manage Credentials** — Add/edit API keys and OAuth credentials
- **📊 View Status** — See configured providers and credential counts
- **🔧 Advanced Settings** — Custom providers, model definitions, concurrency
- **📊 View Provider & Advanced Settings** — Inspect providers and launch the settings tool
- **📈 View Quota & Usage Stats (Alpha)** — Usage, quota windows, fair-cycle status
- **🔄 Reload Configuration** — Refresh settings without restarting

### Configuration Files

| File | Contents |
|------|----------|
| `.env` | All credentials and advanced settings |
| `launcher_config.json` | TUI-specific settings (host, port, logging) |
| `quota_viewer_config.json` | Quota viewer remotes + per-provider display toggles |
| `usage/usage_<provider>.json` | Usage persistence per provider |

---

@@ -446,6 +450,7 @@ The proxy includes a powerful text-based UI for configuration and management.
<summary><b>📝 Logging & Debugging</b></summary>

- **Per-request file logging** with `--enable-request-logging`
- **Raw I/O logging** with `--enable-raw-logging` (proxy boundary payloads)
- **Unique request directories** with full transaction details
- **Streaming chunk capture** for debugging
- **Performance metadata** (duration, tokens, model used)
@@ -801,6 +806,7 @@ Options:
--host TEXT Host to bind (default: 0.0.0.0)
--port INTEGER Port to run on (default: 8000)
--enable-request-logging Enable detailed per-request logging
--enable-raw-logging Capture raw proxy I/O payloads
--add-credential Launch interactive credential setup tool
```

@@ -813,6 +819,9 @@ python src/proxy_app/main.py --host 127.0.0.1 --port 9000
# Run with logging
python src/proxy_app/main.py --enable-request-logging

# Run with raw I/O logging
python src/proxy_app/main.py --enable-raw-logging

# Add credentials without starting proxy
python src/proxy_app/main.py --add-credential
```
@@ -850,8 +859,8 @@ The proxy is available as a multi-architecture Docker image (amd64/arm64) from G
cp .env.example .env
nano .env

# 2. Create key_usage.json file (required before first run)
touch key_usage.json
# 2. Create usage directory (usage_*.json files are created automatically)
mkdir usage

# 3. Start the proxy
docker compose up -d
@@ -860,13 +869,13 @@ docker compose up -d
docker compose logs -f
```

> **Important:** You must create `key_usage.json` before running Docker Compose. If this file doesn't exist on the host, Docker will create it as a directory instead of a file, causing the container to fail.
> **Important:** Create the `usage/` directory before running Docker Compose so usage stats persist on the host.

**Manual Docker Run:**

```bash
# Create key_usage.json if it doesn't exist
touch key_usage.json
# Create usage directory if it doesn't exist
mkdir usage

docker run -d \
--name llm-api-proxy \
@@ -875,7 +884,7 @@ docker run -d \
-v $(pwd)/.env:/app/.env:ro \
-v $(pwd)/oauth_creds:/app/oauth_creds \
-v $(pwd)/logs:/app/logs \
-v $(pwd)/key_usage.json:/app/key_usage.json \
-v $(pwd)/usage:/app/usage \
-e SKIP_OAUTH_INIT_CHECK=true \
-e PYTHONUNBUFFERED=1 \
ghcr.io/mirrowel/llm-api-key-proxy:latest
@@ -895,7 +904,7 @@ docker compose -f docker-compose.dev.yml up -d --build
| `.env` | Configuration and API keys (read-only) |
| `oauth_creds/` | OAuth credential files (persistent) |
| `logs/` | Request logs and detailed logging |
| `key_usage.json` | Usage statistics persistence |
| `usage/` | Usage statistics persistence (`usage_*.json`) |

**Image Tags:**

4 changes: 2 additions & 2 deletions docker-compose.dev.yml
@@ -19,8 +19,8 @@ services:
- ./oauth_creds:/app/oauth_creds
# Mount logs directory for persistent logging
- ./logs:/app/logs
# Mount key_usage.json for usage statistics persistence
- ./key_usage.json:/app/key_usage.json
# Mount usage directory for usage statistics persistence
- ./usage:/app/usage
# Optionally mount additional .env files (e.g., combined credential files)
# - ./antigravity_all_combined.env:/app/antigravity_all_combined.env:ro
environment:
4 changes: 2 additions & 2 deletions docker-compose.tls.yml
@@ -36,8 +36,8 @@ services:
- ./oauth_creds:/app/oauth_creds
# Mount logs directory for persistent logging
- ./logs:/app/logs
# Mount key_usage.json for usage statistics persistence
- ./key_usage.json:/app/key_usage.json
# Mount usage directory for usage statistics persistence
- ./usage:/app/usage
# Optionally mount additional .env files (e.g., combined credential files)
# - ./antigravity_all_combined.env:/app/antigravity_all_combined.env:ro
environment:
4 changes: 2 additions & 2 deletions docker-compose.yml
@@ -17,8 +17,8 @@ services:
- ./oauth_creds:/app/oauth_creds
# Mount logs directory for persistent logging
- ./logs:/app/logs
# Mount key_usage.json for usage statistics persistence
- ./key_usage.json:/app/key_usage.json
# Mount usage directory for usage statistics persistence
- ./usage:/app/usage
# Optionally mount additional .env files (e.g., combined credential files)
# - ./antigravity_all_combined.env:/app/antigravity_all_combined.env:ro
environment: