feat: credit pricing overhaul — per-model rates, margin, max cap#63

Merged
escapeboy merged 1 commit into develop from feat/credit-pricing-overhaul on May 4, 2026

@escapeboy
Owner

Summary

Base companion to escapeboy/agent-fleet#36 (FleetQ-Cloud). Restructures `llm_pricing.php` from flat input/output integers to real per-model rates in USD per million tokens (Mtok), introduces margin, snapshot, and max-cap primitives in `CostCalculator`, and wires max-credits-per-call enforcement into the `BudgetEnforcement` middleware.
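
To make the per-Mtok arithmetic concrete, here is a minimal PHP sketch of how a rate-based pricing entry could be turned into platform credits. It is not the actual `CostCalculator` code. The function name `sketchPlatformCredits`, the `$creditsPerUsd` conversion factor (1000), the round-up rule, the 1-credit minimum floor, and the order in which the floor and cap are applied are all assumptions made for illustration. Only the rate keys and the returned field names come from this PR; the rates used are the Sonnet 4.6 figures ($3 input / $15 output per Mtok).

```php
<?php
declare(strict_types=1);

// Hypothetical pricing entry in the new per-model shape.
// Rates are USD per million tokens (Sonnet 4.6 figures from this PR).
$pricing = [
    'tier' => 'default',
    'input_usd_per_mtok' => 3.00,
    'output_usd_per_mtok' => 15.00,
];

/**
 * Illustrative rate-to-credits conversion, NOT the real CostCalculator.
 * $creditsPerUsd, $minCredits, and the ceil() rounding are assumptions.
 */
function sketchPlatformCredits(
    array $pricing,
    int $inputTokens,
    int $outputTokens,
    float $margin = 1.0,
    ?int $maxCredits = null,
    int $creditsPerUsd = 1000,
    int $minCredits = 1
): array {
    // Raw provider cost: tokens scaled to Mtok, times the per-Mtok rate.
    $rawCostUsd = ($inputTokens / 1_000_000) * $pricing['input_usd_per_mtok']
        + ($outputTokens / 1_000_000) * $pricing['output_usd_per_mtok'];

    // Billable cost applies the margin multiplier on top of the raw cost.
    $billableCostUsd = $rawCostUsd * $margin;

    // round() first so float noise (e.g. 90.00000000000001) does not
    // push ceil() up by one credit; then apply the assumed minimum floor.
    $credits = max($minCredits, (int) ceil(round($billableCostUsd * $creditsPerUsd, 6)));

    // Clamp to the per-call cap when one is supplied.
    if ($maxCredits !== null) {
        $credits = min($credits, $maxCredits);
    }

    return [
        'platform_credits' => $credits,
        'raw_cost_usd' => $rawCostUsd,
        'billable_cost_usd' => $billableCostUsd,
        'margin_applied' => $margin,
    ];
}

$result = sketchPlatformCredits($pricing, 10_000, 2_000, 1.5);
echo $result['platform_credits'], PHP_EOL; // 90 under the assumptions above
```

With these assumed values, 10,000 input tokens and 2,000 output tokens cost $0.06 raw; a 1.5 margin makes that $0.09 billable, which is 90 credits at 1000 credits per USD.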

What changes

  • `config/llm_pricing.php`: new per-model schema with `tier`, `input_usd_per_mtok`, `output_usd_per_mtok`, `cache_read_usd_per_mtok`, `cache_write_5m_usd_per_mtok`, `cache_write_1h_usd_per_mtok`, `context_window`, `last_verified_at`, and `source_url`. Real prices for Anthropic, OpenAI, and Google plus 7 other providers. Adds GPT-5 and GPT-5 nano (previously missing), corrects Opus 4.7 to $5/$25 (was $15/$75), and refreshes Haiku 4.5 and Gemini Flash. Reservation multipliers are tier-based (default 1.5, nano 1.2, heavy 2.0). Non-LLM cost categories (compute, outbound, storage, tool) are scaffolded at 0 cost.
  • `CostCalculator`: adds `calculatePlatformCredits()`, which returns `{platform_credits, raw_cost_usd, billable_cost_usd, margin_applied, model_pricing}` and accepts a `cache_strategy` parameter plus per-call margin and max-cap overrides. Adds `estimatePlatformCredits()`, which applies the reservation multiplier for the model's `tier`. The existing `calculateCost()` and `estimateCost()` signatures are preserved for BudgetEnforcement, PrismAiGateway, SkillCostCalculator, and ContextHealthService.
  • `EnforceMaxCreditsPerCallAction` (new): pre-call enforcement. Throws `InsufficientBudgetException` before the Prism call when the call's estimated credits would exceed the team's per-call cap. Skipped when the team has no cap.
  • `BudgetEnforcement` middleware: runs `EnforceMaxCreditsPerCallAction` ahead of the existing reservation flow.
  • `Team`: adds `effectiveMarginMultiplier()` and `effectiveMaxCreditsPerCall()`, which return config defaults in the community edition. The cloud override returns per-team values from new columns.
  • `TeamSettingsPage`: adds a "Billing & Per-Call Limits" section, gated by `Schema::hasColumn('teams', 'max_credits_per_call')` so the community edition gracefully hides cloud-only fields.
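
The pre-call check in the list above can be sketched roughly as follows. This is a hedged illustration, not the code in `EnforceMaxCreditsPerCallAction`: the function names, the stand-in exception class, the strict `>` comparison, the round-up rule, and the fallback to the default tier for unknown tiers are all assumptions. The tier multipliers (default 1.5, nano 1.2, heavy 2.0) are the ones listed in this PR.

```php
<?php
declare(strict_types=1);

// Stand-in for the project's exception class, defined here only so the
// sketch is self-contained.
final class InsufficientBudgetException extends RuntimeException {}

// Reservation multipliers per tier, as listed in this PR.
const RESERVATION_MULTIPLIERS = ['default' => 1.5, 'nano' => 1.2, 'heavy' => 2.0];

/**
 * Hypothetical estimate: base credits scaled by the tier's reservation
 * multiplier. Unknown tiers fall back to 'default' (an assumption).
 */
function sketchEstimateCredits(int $baseCredits, string $tier): int
{
    $multiplier = RESERVATION_MULTIPLIERS[$tier] ?? RESERVATION_MULTIPLIERS['default'];

    // round() before ceil() guards against float noise in the product.
    return (int) ceil(round($baseCredits * $multiplier, 6));
}

/**
 * Hypothetical pre-call guard: returns silently when the team has no cap,
 * throws when the estimate is above the cap.
 */
function sketchEnforceMaxCreditsPerCall(int $estimatedCredits, ?int $maxCreditsPerCall): void
{
    if ($maxCreditsPerCall === null) {
        return; // No cap configured for this team, so enforcement is skipped.
    }

    if ($estimatedCredits > $maxCreditsPerCall) {
        throw new InsufficientBudgetException(sprintf(
            'Estimated %d credits exceeds the per-call cap of %d.',
            $estimatedCredits,
            $maxCreditsPerCall
        ));
    }
}

// Usage: a heavy-tier call estimated at 200 credits against a cap of 150.
try {
    sketchEnforceMaxCreditsPerCall(sketchEstimateCredits(100, 'heavy'), 150);
} catch (InsufficientBudgetException $e) {
    echo $e->getMessage(), PHP_EOL; // Rejected before any provider call is made.
}
```

Running the guard before the reservation flow means an over-cap request fails without reserving budget or reaching the provider, which matches the ordering described for the `BudgetEnforcement` middleware.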

Test plan

  • 12 new unit tests in tests/Unit/Domain/Budget/CostCalculatorPlatformCreditsTest covering nano/sonnet/opus, cache strategies, margin overrides, max cap clamp, min floor, unknown model fallback, tier-based reservation, back-compat
  • All pre-existing budget tests (26) and infrastructure tests (28) pass
  • PHPStan clean
  • Pint clean

Cloud parent PR

escapeboy/agent-fleet#36

…ax cap

Restructures llm_pricing.php from flat input/output ints to per-model
real rates ($USD/Mtok) with cache_read / cache_write_5m / cache_write_1h
tiers, context_window, last_verified_at, and tier-based reservation
multipliers (default 1.5 / nano 1.2 / heavy 2.0).

Real prices now reflect Jan 2026 reality:
- Opus 4.7 corrected to $5/$25 (was $15/$75 — 3× overpriced)
- Sonnet 4.6 confirmed $3/$15
- Haiku 4.5 corrected to $1/$5 (was $0.80/$4)
- GPT-5 added at $1.25/$10
- GPT-5 nano added at $0.05/$0.40 (was missing entirely)
- Gemini 2.5 Flash refreshed to $0.30/$2.50 (was $0.075/$0.30)

CostCalculator gains calculatePlatformCredits() returning structured
{platform_credits, raw_cost_usd, billable_cost_usd, margin_applied}
with cache_strategy parameter and per-call margin/max-cap overrides.
estimatePlatformCredits() uses model.tier reservation multiplier.

calculateCost() / estimateCost() back-compat preserved (BudgetEnforcement,
PrismAiGateway, SkillCostCalculator, ContextHealthService unchanged).

EnforceMaxCreditsPerCallAction is wired into BudgetEnforcement middleware
and throws InsufficientBudgetException BEFORE the Prism call when the
call's estimated credits would exceed the team's per-call cap. This
protects against runaway Opus calls.

Team gains effectiveMarginMultiplier() + effectiveMaxCreditsPerCall()
with config-default fallback (per-team override lives in cloud).

TeamSettingsPage adds "Billing & Per-Call Limits" section gated on
Schema::hasColumn so community edition gracefully hides cloud-only fields.

Non-LLM cost categories (compute / outbound / storage / tool) scaffolded
in config at usd_per_unit=0 — threaded through but not deducted yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@escapeboy merged commit 955fd5e into develop on May 4, 2026
2 of 3 checks passed
@escapeboy deleted the feat/credit-pricing-overhaul branch on May 4, 2026 12:47