feat(minimax): native M3/M2.7 backend with cost + dashboard integration by axelfleureau · Pull Request #1361 · headroomlabs-ai/headroom

axelfleureau · 2026-06-24T10:25:07Z

Summary

Adds the MiniMax provider as a first-class citizen in headroom's proxy and dashboard, alongside Anthropic / OpenAI / Gemini / Bedrock. MiniMax exposes M3 and M2.x through an Anthropic-compatible wire format, so the proxy can route MiniMax-M* traffic via a thin handler mixin that delegates to AnthropicHandlerMixin while stamping provider="minimax" for cost tracking, savings, and dashboard rendering.

This unblocks anyone running Claude Code, Codex CLI, or the Codex desktop app against MiniMax through headroom, both on Python versions below 3.14 (litellm path) and on Python 3.14 (litellm-skipped path with hardcoded fallback).

What's in the PR

New provider + handler

headroom/providers/minimax.py — MiniMaxProvider with model metadata (context windows up to 1M for M3, 200K for M2.x), pricing, vision flags, token counter.
headroom/proxy/handlers/minimax.py — MiniMaxHandlerMixin that delegates to AnthropicHandlerMixin (wire-compat) and strips the minimax/ prefix from model names.
headroom/proxy/handlers/__init__.py — exports the mixin.
headroom/proxy/server.py — HeadroomProxy mixes in MiniMaxHandlerMixin; _retry_request falls back gracefully when ProxyConfig fields are absent (test doubles).
headroom/proxy/handlers/streaming.py — adds the missing import os (was already broken on Python 3.14).
headroom/proxy/handlers/anthropic.py — binds body before the client-override check (fixes UnboundLocalError on batch endpoints).
headroom/providers/proxy_routes.py — conditional routing: when model matches MiniMax-M*, hand off to MiniMaxHandlerMixin.
headroom/proxy/models.py — ProxyConfig gains minimax_api_url, minimax_api_key, minimax_session_token.
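
The model detection and prefix stripping described above can be sketched as follows. This is a minimal illustration, not the code in headroom/proxy/handlers/minimax.py: the function names, the case-insensitive match, and the (body, provider) return shape are assumptions made for the example.

```python
from typing import Tuple

# Routing prefix named in the PR description.
MINIMAX_ROUTING_PREFIX = "minimax/"


def strip_minimax_prefix(model: str) -> str:
    """Drop the routing prefix so the upstream sees the bare model name."""
    return model.removeprefix(MINIMAX_ROUTING_PREFIX)


def is_minimax_model(model: str) -> bool:
    """True for ids shaped like MiniMax-M*, with or without the prefix.

    Matching case-insensitively is an assumption of this sketch.
    """
    return strip_minimax_prefix(model).lower().startswith("minimax-m")


def route(body: dict) -> Tuple[dict, str]:
    """Hypothetical helper: return the body to forward and the provider stamp."""
    model = body.get("model", "")
    if is_minimax_model(model):
        return {**body, "model": strip_minimax_prefix(model)}, "minimax"
    return body, "anthropic"
```

Because the wire format is shared, only the model name and the provider label differ between the two branches in this sketch.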

Cost tracking (litellm-optional)

The fork of headroom-ai skips litellm on Python 3.14 (litellm>=1.86.2,<2.0; python_version < '3.14'). Without a fallback, the dashboard silently shows $0 for every non-MiniMax model. This PR adds a small hardcoded fallback so the dashboard reports real costs:

_get_cache_prices_fallback() in headroom/proxy/cost.py covers:
- Anthropic Claude 4.x (opus-4, sonnet-4/-4-5, haiku-4-5) and 3.x (3-5-sonnet, 3-opus, 3-haiku), including truncated-datestamp variants (claude-3-5-sonnet-20) that exposed a regex-group bug.
- OpenAI gpt-5/-4o/-4/-3.5 and o1/o3/o4 reasoning models.
Cache economics preserved per provider: Anthropic 90% read discount + 25% write premium; OpenAI 50% read + no write premium; MiniMax 90% read + 25% write premium.
Existing MiniMax fallback in _get_cache_prices is preserved (input-cost dict + savings tracker path).
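
As a rough illustration of the cache economics listed above, the per-provider read discount and write premium can be applied to a base input price like this. The names CACHE_ECONOMICS, CachePrices, and cache_prices are hypothetical and do not come from headroom/proxy/cost.py; only the percentages are taken from this description.

```python
from dataclasses import dataclass

# (read_discount, write_premium) per provider, as stated in this PR description.
CACHE_ECONOMICS = {
    "anthropic": (0.90, 0.25),
    "openai": (0.50, 0.00),
    "minimax": (0.90, 0.25),
}


@dataclass(frozen=True)
class CachePrices:
    read: float   # price for cached-input reads
    write: float  # price for cache writes


def cache_prices(provider: str, input_price: float) -> CachePrices:
    """Derive cache read/write prices from a base input price.

    input_price can be in any unit (for example USD per million tokens);
    the result is in the same unit.
    """
    read_discount, write_premium = CACHE_ECONOMICS[provider]
    return CachePrices(
        read=input_price * (1 - read_discount),
        write=input_price * (1 + write_premium),
    )
```

With a base price of 4.0, for example, the OpenAI row gives a read price of 2.0 and an unchanged write price of 4.0, while the Anthropic and MiniMax rows give roughly 0.4 for reads and 5.0 for writes.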

Dashboard

headroom/dashboard/templates/dashboard.html — Per-Model Token Savings table gets a Provider chip column (Bedrock / Anthropic / OpenAI / Gemini / Mistral / DeepSeek / MiniMax / other) and a Cost (USD) column. Provider classification is done in Alpine.js with conservative substring rules; Bedrock is checked before Anthropic because us.anthropic.claude-* contains claude-.
Dashboard chip contrast bumped for readability.

Tests (53 new cases)

tests/test_provider_minimax.py — 20 tests for MiniMaxProvider (model metadata, capabilities, token counter, parametrized context-window checks).
tests/test_minimax_cost_fallbacks.py — 33 tests for the cost-tracking fallbacks (15 original + 18 new parametrized cases for Anthropic / OpenAI litellm-missing paths).

Test status

Before this PR: 6 pre-existing failures on Python 3.14 (litellm unavailable: 4 in test_models.py; cargo-missing in test_release_workflows.py; environment-dependent in test_cli/test_wrap_copilot.py). All reproduce on a clean origin/main.
After this PR: one new failure (test_retry_request_retries_connect_timeout) appeared and was fixed by switching self.config.minimax_api_key to getattr(...), so SimpleNamespace test doubles don't crash. The 6 pre-existing failures are unchanged and unrelated to this PR.
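
The getattr change is small enough to show in isolation. The sketch below is not the code in _retry_request; the helper name and the stub's field are made up for illustration, and only the minimax_api_key attribute name comes from this PR.

```python
from types import SimpleNamespace
from typing import Any, Optional


def read_minimax_api_key(config: Any) -> Optional[str]:
    """Return the configured key, or None when the config object lacks the field.

    Direct attribute access (config.minimax_api_key) would raise
    AttributeError on a test double that models only part of ProxyConfig.
    """
    return getattr(config, "minimax_api_key", None)


# A test double with a subset of fields (the field shown here is hypothetical).
partial_config = SimpleNamespace(request_timeout=30)
full_config = SimpleNamespace(request_timeout=30, minimax_api_key="example-key")
```

Here read_minimax_api_key(partial_config) returns None instead of raising, and read_minimax_api_key(full_config) returns the configured value.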

6879 passed, 523 skipped, 7 deselected, 5815 warnings in 139.58s

ruff check headroom/proxy/cost.py headroom/proxy/models.py headroom/proxy/server.py tests/test_minimax_cost_fallbacks.py tests/test_provider_minimax.py → All checks passed!

Live verification

MiniMax-M3 200 OK with input_cost_usd=0.000518 reported per-model on the dashboard (vs $0 without the fallback).
M2.7, M3-thinking, M3-tool_use, SSE streaming, 20/20 serial requests all pass.
Codex 8787 (upstream Anthropic) and MiniMax 8788 (fork) coexist on separate ports without collision.

Out of scope

No new dashboard route — Per-Model table is the natural home; no parallel /dashboard/minimax page.
No separate /dashboard/minimax analytics JSON endpoint.
The MiniMax provider reuses Anthropic's request/response wire format intentionally — no schema divergence.

Commits

e9e88109 fix(cost): Anthropic/OpenAI pricing fallback when litellm missing
1f0dbf7e fix(dashboard): bump provider-chip contrast for readability
8a25cae8 feat(dashboard): add Provider column to Per-Model Token Savings
8bfbb184 fix(handlers,models): export MiniMaxHandlerMixin + add minimax_api_url/key to ProxyConfig
a36b772d feat(minimax): native M3/M2.7 backend with cost + dashboard integration

Adds the MiniMax provider as a first-class citizen in headroom's proxy and dashboard. This is a single PR-ready commit on top of upstream/main that combines 21 incremental commits from the axelfleureau/headroom fork. The logic in each file is the post-bugfix version.

New files:
- headroom/providers/minimax.py: MiniMaxProvider class with context limits, max output, pricing, vision/tool/streaming flags for the MiniMax-M3 / M2.7 / M2.5 / M2.1 / M2 model families.
- headroom/proxy/handlers/minimax.py: MiniMaxHandlerMixin that delegates to AnthropicHandlerMixin (wire-compatible) but stamps provider='minimax' on records and strips the minimax/ routing prefix from model names.
- tests/test_minimax_cost_fallbacks.py: 15 unit tests covering MiniMax pricing fallback paths in CostTracker + SavingsTracker. Runs on Python 3.14 where litellm is intentionally skipped.

Modified files:
- headroom/providers/registry.py: resolve MINIMAX_TARGET_API_URL env var and surface the resolved URL through ProviderApiTargets.
- headroom/providers/proxy_routes.py: register MiniMaxHandlerMixin in the proxy routes table; route /v1/messages traffic to the MiniMax handler when the request body names a MiniMax-M* model.
- headroom/proxy/cost.py: CostTracker._get_cache_prices and _get_list_price consult MiniMaxProvider.MODEL_INPUT_COST when litellm doesn't know the model (Python 3.14 case). Cache economics per-provider loop also recognises provider='minimax'.
- headroom/proxy/savings_tracker.py: savings_tracker estimates use MiniMaxProvider pricing as a primary path, litellm as fallback. Both paths work on Python 3.14.
- headroom/proxy/handlers/anthropic.py: bind request body before the client-override check in handle_anthropic_batch_passthrough + handle_anthropic_batch_results (fixes UnboundLocalError on GET).
- headroom/proxy/handlers/streaming.py: add missing 'import os' used by the MiniMax session-token fallback.
- headroom/proxy/server.py: HeadroomProxy mixes in MiniMaxHandlerMixin so model detection runs in the main path.
- headroom/dashboard/templates/dashboard.html: Per-Model Token Savings table now includes a Cost (USD) column sourced from cost.per_model[model].input_cost_usd (populated by the cost.py changes above). Existing MiniMax traffic surfaces automatically in Per-Provider Breakdown + Providers + Recent Requests via the existing provider-bucketing logic.

…l/key to ProxyConfig

The MiniMax integration referenced MiniMaxHandlerMixin from server.py without exporting it from headroom.proxy.handlers, and HeadroomProxy.__init__ reads config.minimax_api_key, so both fields need to exist on ProxyConfig for the proxy to boot when the MiniMax provider is enabled.

These two changes are mechanical: they don't alter runtime behaviour beyond registering the new symbol + accepting the new config keys. Default values keep the proxy compatible with upstream (no API key required, no URL override).

The Per-Model Token Savings table shows models from any upstream the proxy has touched: claude-*, gpt-*, gemini-*, minimax-*, and Bedrock region-prefixed ids like us.anthropic.claude-*. Without a provider column readers can't tell at a glance which upstream billed which row.

Adds two methods to the Alpine component:
- providerFor(model): substring-based classifier with conservative ordering. Bedrock / AWS is checked first because its region-prefix contains 'anthropic.' (so it must precede the claude* match). Recognises anthropic, openai, gemini, minimax, mistral, deepseek; anything else falls into 'other'.
- providerChipClass(provider): tailwind colour tokens per provider. Kept low-saturation so the table reads as a list, not a rainbow.

Verified with 19 unit cases via node -e (no test framework needed since the function is plain JS inside an x-data attribute).
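
The ordering constraint in providerFor is easier to see in code. The real classifier is JavaScript inside the Alpine component; the Python rendition below is only a sketch of the same idea, and the exact substrings in PROVIDER_RULES are assumptions rather than the rules shipped in dashboard.html.

```python
# Ordered rules: the first matching substring wins, so Bedrock ids such as
# "us.anthropic.claude-*" must be tested before the plain "claude" rule.
# The needles are illustrative assumptions, not the dashboard's actual list.
PROVIDER_RULES = [
    ("bedrock", "anthropic."),
    ("anthropic", "claude"),
    ("openai", "gpt-"),
    ("gemini", "gemini"),
    ("minimax", "minimax"),
    ("mistral", "mistral"),
    ("deepseek", "deepseek"),
]


def provider_for(model: str) -> str:
    """Classify a model id by substring; unknown ids fall into 'other'."""
    lowered = model.lower()
    for provider, needle in PROVIDER_RULES:
        if needle in lowered:
            return provider
    return "other"
```

Swapping the first two rules would label every Bedrock-hosted Claude row as anthropic, which is the misclassification the ordering is meant to avoid.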

On Python 3.14 the headroom-ai fork skips litellm (it pins requires-python<3.14), so without a fallback CostTracker silently returns None for every Claude / gpt-* / o-series model and the dashboard shows $0 for non-MiniMax traffic.

Add _get_cache_prices_fallback() with a small hardcoded pricing table covering:
- Anthropic Claude 4.x (opus-4, sonnet-4/-4-5, haiku-4-5)
- Anthropic Claude 3.x (3-5-sonnet, 3-opus, 3-haiku), including the truncated-datestamp variant 'claude-3-5-sonnet-20' that exposed the original regex-group bug
- OpenAI gpt-5/-4o/-4/-3.5 and o1/o3/o4 reasoning models

Cache economics are preserved per provider:
- Anthropic: 90% off cache reads, 25% write premium
- OpenAI: 50% off cache reads, no write premium
- MiniMax: 90% off cache reads, 25% write premium (already in code)

Also fix an AttributeError in _retry_request: self.config.minimax_api_key must use getattr() so SimpleNamespace test doubles (which only model a subset of ProxyConfig fields) don't crash when the direct-MiniMax-API auth branch runs. Add minimax_session_token to ProxyConfig for symmetry with the existing minimax_api_key / minimax_api_url fields.

Tests: 18 new parametrized cases covering 8 Claude + 10 OpenAI variants; all 33 cost-fallback tests + 20 provider tests + the regression-case test_retry_request_retries_connect_timeout now pass.
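
One way to make a hardcoded table tolerate datestamped or truncated model ids is longest-prefix matching, sketched below. This is a swapped-in technique for illustration: the commit describes a regex-based lookup whose code is not shown here, the function and table names are hypothetical, and the prices are placeholders rather than real list prices.

```python
from typing import Optional

# Placeholder values only, NOT real list prices. Keys are model-family prefixes.
FALLBACK_INPUT_PRICE = {
    "claude-3-5-sonnet": 1.0,
    "claude-3-opus": 2.0,
    "gpt-4o": 3.0,
    "gpt-4": 4.0,
}


def fallback_input_price(model: str) -> Optional[float]:
    """Return the price of the longest family prefix that matches the model id.

    Picking the longest match keeps "gpt-4o-mini" from falling into the
    shorter "gpt-4" family, and lets a truncated datestamp such as
    "claude-3-5-sonnet-20" still resolve to its family.
    """
    matches = [family for family in FALLBACK_INPUT_PRICE if model.startswith(family)]
    if not matches:
        return None  # unknown model: the caller decides how to report cost
    return FALLBACK_INPUT_PRICE[max(matches, key=len)]
```

Returning None for unknown ids keeps the "no price known" case distinguishable from a genuine zero cost.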

github-actions · 2026-06-24T10:25:22Z

PR governance

This PR does not yet satisfy the required template fields:

Missing required section Description.
Missing required section Type of Change.
Missing required section Changes Made.
Missing required section Testing.
Missing required section Real Behavior Proof.
Missing required section Review Readiness.
Check I have performed a self-review before requesting human review.
Check This PR is ready for human review or convert the PR back to draft.

Please update the PR body, or move the PR back to draft while it is still in progress.

…idle

Two UX bugs in the dashboard Session/Historical tabs:

1. Historical tab showed 'waiting for saved requests' until the user clicked it. fetchHistoryStats() was only called inside pollDashboard() when viewMode === 'history', and never inside init(). The first navigation to the tab had nothing to render. Now init() also calls fetchHistoryStats() so the tab is populated as soon as the page loads. /stats-history is a small JSON payload (500 most-recent points + 4 daily/weekly/monthly series + lifetime summary) and is cheap to load on boot.
2. Session tab hero metrics collapsed to $0 and '0 requests processed' whenever the proxy had no traffic in the current polling window, even though ~/.headroom/proxy_savings.json had 83M tokens saved across the lifetime of the install. Added three Alpine getters:
- displayedTokensSaved: runtime > display_session (if fresh) > lifetime
- displayedRequests: runtime > display_session (if fresh) > lifetime
- displayedTokensSavedSource: 'runtime' | 'session' | 'lifetime'

The Token Savings hero now uses displayedTokensSaved and exposes the source via :title so it's transparent which tier is being displayed. A session counts as 'fresh' if last_activity_at is within the last 5 minutes; past that the lifetime total is the correct headline number.

No backend changes. No new endpoints. No new tests required (UI-only).
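
The three-tier fallback behind displayedTokensSaved can be sketched in Python as follows. The real getters are Alpine.js; this version assumes that a tier "wins" when its counter is non-zero, which the commit message does not spell out, and the function name and signature are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# A session is 'fresh' if it was active within the last 5 minutes.
FRESH_WINDOW = timedelta(minutes=5)


def displayed_tokens_saved(
    runtime: int,
    session: int,
    lifetime: int,
    last_activity_at: Optional[datetime],
    now: datetime,
) -> Tuple[int, str]:
    """Pick the headline number and report which tier supplied it."""
    if runtime > 0:
        return runtime, "runtime"
    fresh = last_activity_at is not None and now - last_activity_at <= FRESH_WINDOW
    if fresh and session > 0:
        return session, "session"
    return lifetime, "lifetime"


now = datetime(2026, 6, 24, 12, 0, tzinfo=timezone.utc)
idle = displayed_tokens_saved(0, 500, 83_000_000, now - timedelta(minutes=10), now)
```

In the idle case above the session is stale, so the lifetime total (83M tokens in the example from the commit message) becomes the headline and the source is reported as 'lifetime'.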

JerrettDavis

This is a substantial integration and the shape is promising, but I found blockers that need another pass:

The focused test suite fails locally. I ran uv run --extra proxy --with pytest --with pytest-asyncio python -m pytest tests/test_provider_minimax.py tests/test_minimax_cost_fallbacks.py -q and got 7 failures in tests/test_minimax_cost_fallbacks.py. The failures are pricing/cache-economics assertions for claude-haiku, gpt-5, gpt-5-mini/nano, gpt-4-turbo, and gpt-3.5 paths, so the fallback pricing behavior is not internally consistent with the tests in this branch.
The new minimax_api_url plumbing appears unused for the actual /v1/messages route. ProviderApiTargets/ProxyConfig gain minimax fields, but MiniMaxHandlerMixin.handle_minimax_messages intentionally does not pass an upstream_base_url and delegates to AnthropicHandlerMixin so it falls back to self.ANTHROPIC_API_URL. That means setting MINIMAX_TARGET_API_URL or ProxyConfig.minimax_api_url will not route direct MiniMax traffic to the MiniMax endpoint; unless the operator also repoints ANTHROPIC_TARGET_API_URL, MiniMax-looking requests can still go to the Anthropic upstream. Please either wire the MiniMax target into the handler or remove the unused config surface and document the required Anthropic-target setup.

Given the breadth of this PR, please also add a focused routing test that proves a minimax/MiniMax-M3 request forwards to the configured MiniMax upstream with the stripped model name and expected auth header.
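
For reference, the routing behaviour such a test would pin down could look roughly like the sketch below. Everything here is hypothetical: plan_forward is not a function in this branch, the URLs are example values, and the auth header is left out because its name is not stated in this thread.

```python
from typing import Optional, Tuple


def plan_forward(
    model: str,
    minimax_api_url: Optional[str],
    anthropic_api_url: str,
) -> Tuple[str, str]:
    """Return (upstream_base_url, model_name_to_send) for a MiniMax request.

    Prefers the configured MiniMax target and only falls back to the
    Anthropic target when no MiniMax URL is set.
    """
    upstream = minimax_api_url or anthropic_api_url
    return upstream, model.removeprefix("minimax/")


def test_minimax_request_uses_minimax_upstream() -> None:
    upstream, model = plan_forward(
        "minimax/MiniMax-M3",
        minimax_api_url="https://minimax.example",      # example value
        anthropic_api_url="https://anthropic.example",  # example value
    )
    assert upstream == "https://minimax.example"
    assert model == "MiniMax-M3"
```

A real test would drive the handler itself and also assert on the outgoing auth header; the pure function only illustrates the expected URL and model-name outcome.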

FleureauAxel added 5 commits June 24, 2026 08:52

fix(dashboard): bump provider-chip contrast for readability

1f0dbf7

github-actions Bot added the status: needs author action Pull request body or readiness checklist still needs author updates label Jun 24, 2026

JerrettDavis requested changes Jun 24, 2026

axelfleureau wants to merge 6 commits into headroomlabs-ai:main from axelfleureau:pr/minimax-provider
