stage-batch32: v0.51.150 / Release DV — single-PR reasoning-effort agent metadata#3036

Merged
nesquena-hermes merged 3 commits into master from release/stage-batch32 on May 28, 2026
@nesquena-hermes

Collaborator

stage-batch32 — v0.51.150 (Release DV)

Single-PR reasoning-effort agent metadata integration.

PR

Changes resolve_model_reasoning_efforts() in api/config.py to consult Hermes Agent's models.dev metadata (agent.models_dev.get_model_capabilities()) between the exact provider-specific resolvers and the broad heuristic fallback. Closes the gap where native/provider-specific catalogs (e.g. xAI OAuth grok-4.3) were reasoning-capable per agent metadata but reported no efforts by WebUI because no local prefix heuristic matched.

Order of precedence (verified by Opus)

ACP / Copilot subset / OpenAI Codex / LM Studio probe  →  (authoritative)
NEW: models.dev metadata
  • supports_reasoning=True  → full effort list
  • supports_reasoning=False → [] (authoritative — known non-reasoning variants stay hidden)
  • Unknown / None          → fall through
Prefix-based heuristic fallback                        →  (unchanged)

The new metadata layer sits exactly where the old prefix heuristic was the only answer for non-special-cased providers. Existing exact resolvers return BEFORE the metadata lookup, so they cannot be overridden by stale agent metadata.
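
The precedence above can be sketched as a small resolver chain. This is a minimal illustration, not the code in api/config.py: the names `resolve_efforts`, `metadata_efforts`, and `FULL_EFFORTS`, the effort values, and the assumption that the capabilities object exposes a `supports_reasoning` attribute are all hypothetical stand-ins.

```python
from types import SimpleNamespace
from typing import Callable, List, Optional, Sequence

# Hypothetical effort list; the real values live in api/config.py.
FULL_EFFORTS = ["low", "medium", "high"]

Resolver = Callable[[str], Optional[List[str]]]


def metadata_efforts(model: str, get_caps: Callable[[str], object]) -> Optional[List[str]]:
    """Return the full list, [] for a known non-reasoning model, or None if unknown."""
    try:
        caps = get_caps(model)
    except Exception:
        return None  # a failing lookup is treated the same as "unknown"
    supports = getattr(caps, "supports_reasoning", None)
    if supports is True:
        return list(FULL_EFFORTS)
    if supports is False:
        return []  # authoritative: the heuristic is not consulted
    return None


def resolve_efforts(
    model: str,
    exact_resolvers: Sequence[Resolver],
    get_caps: Callable[[str], object],
    heuristic: Callable[[str], List[str]],
) -> List[str]:
    # 1. Exact provider-specific resolvers return before metadata is consulted.
    for resolver in exact_resolvers:
        result = resolver(model)
        if result is not None:
            return result
    # 2. Metadata layer: True and False are final, None falls through.
    from_metadata = metadata_efforts(model, get_caps)
    if from_metadata is not None:
        return from_metadata
    # 3. Prefix-based heuristic fallback.
    return heuristic(model)


# Fake inputs for demonstration only.
fake_caps = {
    "model-a": SimpleNamespace(supports_reasoning=True),
    "reasoner-b": SimpleNamespace(supports_reasoning=False),
}


def fake_get_caps(model: str) -> object:
    return fake_caps.get(model)  # None means "unknown to metadata"


def fake_heuristic(model: str) -> List[str]:
    return ["low"] if model.startswith("reasoner-") else []
```

With these fakes, `reasoner-b` resolves to an empty list even though the heuristic would match it, while an unknown `reasoner-c` still reaches the heuristic.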

Pre-merge gate

  • Targeted-test sweep (41 reasoning-adjacent tests): 41/41 passed
  • Full sequential pytest: 6657 passed, 0 failures, 12 skipped (156s)
  • CI green on PR branch: test 3.11/3.12/3.13 SUCCESS
  • Opus advisor deep review (claude-opus-4-7): SHIP-IN-NEXT-LOW-RISK-BATCH
    • Brick-class assessment: ~2/10 — additive + fallthrough-safe
    • The import is guarded by except Exception, so an older hermes-agent without the models_dev module behaves identically to current master
    • The supports_reasoning=False authoritative path is bounded: at worst a single model becomes inaccessible for reasoning, not a global brick
    • The models.dev call is wrapped in a second try/except Exception; no raise can escape
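
The two guards described in the review notes can be sketched as follows. This is an assumption-laden illustration rather than the PR's code: `load_lookup` and `safe_capabilities` are hypothetical helper names, and the module path and attribute are passed as parameters only so the sketch runs without hermes-agent installed.

```python
import importlib
from typing import Callable, Optional


def load_lookup(
    module_name: str = "agent.models_dev",
    attr: str = "get_model_capabilities",
) -> Optional[Callable[[str], object]]:
    """First guard: a missing or broken module yields None instead of raising."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except Exception:
        return None


def safe_capabilities(lookup: Optional[Callable[[str], object]], model: str) -> object:
    """Second guard: any error raised by the lookup itself is swallowed."""
    if lookup is None:
        return None
    try:
        return lookup(model)
    except Exception:
        return None


def broken_lookup(model: str) -> object:
    # Stand-in for a metadata call that fails at runtime.
    raise RuntimeError("metadata backend unavailable")
```

Either failure mode produces `None`, which a caller can treat as "unknown" and fall through to the prefix heuristic.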

Drive-by fix called out by Opus

The LM Studio branch now passes hinted_model (the provider-stripped value) to lmstudio_model_reasoning_options() instead of the raw model argument. Previously the unstripped @xai-oauth:... style hint was passed through to the LM Studio probe, a real bug that this PR incidentally fixes.
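
For illustration, stripping a provider hint before a lookup might look like the sketch below. The exact hint syntax is not shown in this PR, so the `@provider:model` shape, the helper name `strip_provider_hint`, and the sample value are all assumptions.

```python
def strip_provider_hint(model: str) -> str:
    """Drop a leading '@provider:' hint, if present (assumed format)."""
    if model.startswith("@") and ":" in model:
        # Split on the first colon only, so colons in the model id survive.
        return model.split(":", 1)[1]
    return model


# Hypothetical sample value combining the hint style and model named in this PR.
hinted_model = strip_provider_hint("@xai-oauth:grok-4.3")
```

Passing the stripped value rather than the raw argument keeps downstream probes from seeing a provider prefix they do not understand.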

Test coverage

tests/test_models_dev_reasoning.py (110 LOC, 6 tests) covers:

  • supports_reasoning=True → full effort list
  • supports_reasoning=False → authoritative empty
  • Unknown / None → falls through to prefix heuristic
  • @xai-oauth: hint stripping before lookup
  • supports_reasoning=False overrides a prefix heuristic that would otherwise match
  • get_reasoning_status() hydrates from config defaults

Files

  • api/config.py: +114/-39 (new _models_dev_reasoning_efforts() helper, resolver call-chain update, no-query path hydration)
  • tests/test_models_dev_reasoning.py: NEW, 6 tests
  • CHANGELOG.md: v0.51.150 / Release DV entry

Non-blocking follow-ups (Opus suggestions)

  1. Add a model.reasoning_efforts_force_supported: true config knob so a user can override an authoritative False signal when the models.dev cache is stale. File as sprint-candidate.
  2. Add a direct test for the ImportError branch (mock sys.modules["agent.models_dev"] = None) — the existing "metadata returns None" tests cover the same downstream behavior via a different upstream path.

Both can land in a subsequent PR.
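
A sketch of what follow-up 2 could look like, relying on the documented Python behavior that a `None` entry in `sys.modules` makes the import raise `ModuleNotFoundError`. The helper `import_lookup_or_none` is a hypothetical stand-in for the guarded import in api/config.py, not the actual function.

```python
import sys
from unittest import mock


def import_lookup_or_none():
    """Stand-in for the guarded import: returns None when the module is unavailable."""
    try:
        from agent.models_dev import get_model_capabilities
    except Exception:
        return None
    return get_model_capabilities


def test_import_error_falls_through():
    # A None entry forces the import to fail, whether or not the package is installed.
    # patch.dict restores sys.modules to its previous state on exit.
    with mock.patch.dict(sys.modules, {"agent.models_dev": None}):
        assert import_lookup_or_none() is None
```

In the real suite the assertion would target the resolver's output for a model that only the prefix heuristic recognizes, rather than this stand-in.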

@nesquena-hermes nesquena-hermes merged commit 5bc3cdb into master May 28, 2026
3 checks passed
@nesquena-hermes nesquena-hermes deleted the release/stage-batch32 branch May 28, 2026 02:23

2 participants