Fix duplicate models across named custom providers#1874
Conversation
…ontend fallback Backend (_build_configured_model_badges): skip badge assignment for synthetic candidate keys like 'custom:provider/model' that aren't real dropdown entries, preventing phantom badge keys that normalize to the same string as bare model IDs from other providers. Frontend (_getConfiguredModelBadge): when a providerId is specified and no provider-specific badge match is found, return null instead of falling back to a badge from another provider. Together these fixes ensure that when 'custom:super-javis' is the active provider with 'gpt-5.4' as the default model, only super-javis's own '@Custom:super-javis:gpt-5.4' carries the PRIMARY badge, while Edith's bare 'gpt-5.4' is correctly left unadorned.
SummaryInvestigated against Code referenceThe core fix in _seen_custom_ids_by_group: dict[str, set[str]] = {
"__unnamed__": {m["id"] for m in auto_detected_models}
}
for _cp in _custom_providers_cfg:
...
_bucket_key = _slug or "__unnamed__"
_bucket_seen = _seen_custom_ids_by_group.setdefault(_bucket_key, set())That's precisely the right scoping — same-bucket dedup (within one named provider, or within the unnamed bucket) is preserved, cross-bucket de-dupe is correctly delegated to the post-process at Concern 1: dedup now treats slash IDs as collidableBefore this PR, # Before
if mid.startswith("@") or "/" in mid:
continueAfter: # After (this PR)
if not mid or mid.startswith("@"):
continueThis is intentional — the new test But this is a silent semantic change for the That probably isn't wrong, but I'd want to confirm it doesn't break sessions with persisted bare slash IDs that were previously matched by the second provider's group. The "first occurrence wins" rule guarantees the first group's bare slash ID stays intact, but if the user's saved session model belongs to the second provider that group's option-id has now changed. Consider documenting this as an intentional behavior change in CHANGELOG / migration notes, even if it's mostly invisible. The frontend rehydration via Concern 2:
|
_deduplicate_model_ids was keeping the alphabetically-first provider's bare IDs. When the active provider was different, selecting a bare model from a non-active provider would silently route through the active provider instead. Fix: the active provider now gets first-occurrence priority for bare model IDs. All overlapping non-active providers get @provider_id: prefixes, so selecting them routes through the correct provider.
Three fixes in this commit: 1. resolve_model_provider(): Replace greedy first-match on custom_providers with a multi-pass collector. When a bare model ID exists in multiple custom providers, prefer the active provider; when there's exactly one match, verify the active provider doesn't also have it before routing. 2. _build_configured_model_badges(): Remove fallback to exact_match and matches[0] when provider_match fails — they may belong to a different provider and would leak PRIMARY badges. 3. Guard the final badges[match_id] assignment with a provider identity check as defense-in-depth. Add _provider_has_model() helper to check whether a provider's catalogue includes a given model ID, supporting both custom and built-in providers.
|
Thanks @hacker1e7 — defer-flagging this PR for the next pass. The per-group dedup bucket fix is the right shape and the per-named-provider seen-set is exactly the right scoping change. Two concrete things to address before merge, plus the rebase. Blocker 1: Behavior change to
|
Closing — superseded by #1947's narrower fixThanks @hacker1e7 — your diagnosis of the Why I'm going with #1947's smaller version:
Closing this PR but the analysis is preserved — the per-group bucket is the canonical fix, your PR description articulates the root cause clearly, and your three-named-provider regression test inspired the test I'm adding to #1947 before merge. Worth noting: if there's a real |
CHANGELOG, ROADMAP, TESTING refresh for v0.51.31 stage release covering 12 contributor PRs: Added (2 PRs): - #1956 JKJameson — persistent composer draft (server-side, cross-client) - #1957 hermes-gimmethebeans — configurable session TTL via env + settings Fixed (10 PRs): - #1939 ai-ag2026 — theme-color + sw cache regression coverage - #1941 ai-ag2026 — preserve chat scroll across final render - #1945 franksong2702 — localize session jump controls (#1938) - #1947 happy5318 — show same model from different custom providers (Co-authored-by hacker1e7 for #1874 close) - #1949 Sanjays2402 — close #1937 endless-scroll vs Start-jump race with generation-token + mutex (Co-authored-by franksong2702 + Michaelyklam) - #1950 franksong2702 — mute stale stopped gateway heartbeat (#1944) - #1951 amlyczz — gate goal hook on goal-related turns (#1932) (Co-authored-by franksong2702 for #1946 close) - #1953 lucky-yonug — skip provider peel for custom host:port slugs - #1960 Michaelyklam — translate hidden-files workspace label (#1841) - #1961 sbe27 — respect image_input_mode (#1959) Closed in favor of canonical: #1942, #1962, #1946, #1874, #1311. Stage-326 hotfixes (per Opus advisor): - CRITICAL #1951 PENDING_GOAL_CONTINUATION race fix (removed finally discard that race-erased the marker before consumer could read it) - #1956 composer-draft input validation (50 KB text / 50 file clamp + type coercion to prevent unbounded session-JSON bloat) - #1957 SESSION_TTL constant preserved as named fallback (existing regression tests pin it; #1957 originally deleted it) Tests: 5006 → 5028 (+51 net new) — 0 regressions, 142.61s runtime.
Summary
Fix
/api/modelsso duplicate model IDs from different named custom providers are preserved instead of being dropped by an overly broad global de-duplication pass.Root cause
While assembling named custom provider groups in
get_available_models(), the API used a single global_seen_custom_idsset seeded from auto-detected models. That meant if two named custom providers exposed the same raw model ID (for example both exposinggpt-5.4), the first provider to be processed "claimed" the ID and later providers silently lost their copy before provider-aware namespacing/deduplication ran.In practice this caused configs like:
edith->gpt-5.4super-javis->gpt-5.4,gpt-5.5...to incorrectly omit
super-javis'sgpt-5.4entry from/api/models.Fix
Replace the single global custom-provider seen set with a per rendered provider-group seen set:
__unnamed__bucket_deduplicate_model_ids()) remains responsible for cross-provider collisions and namespacing behaviorThis keeps de-duplication scoped to the place where it is actually valid: inside the same rendered provider group.
Why this is the correct behavior
Provider identity matters. Two providers can legitimately expose the same upstream model ID while still being different choices (different base URLs, auth, routing, latency, quotas, or semantics). The API should preserve both entries and only disambiguate them in a provider-aware way, not erase one globally just because the bare model ID matches.
Regression coverage
Added
tests/test_custom_provider_duplicate_models.py, which verifies that:gpt-5.4@custom:super-javis:gpt-5.4gpt-5.5are still returned normallyTesting