fix: invalidate model cache on auth-store drift #1703

Closed
Michaelyklam wants to merge 3 commits into nesquena:master from Michaelyklam:fix/issue-1699-model-cache-drift

@Michaelyklam (Contributor) commented May 5, 2026

Thinking Path

  • /api/models is the source of truth for the WebUI composer model picker and configured PRIMARY badges.
  • The existing cache knew about WebUI version/schema drift and config mtime changes, but not external auth.json changes made by terminal hermes setup.
  • That left both the in-process cache and STATE_DIR/models_cache.json able to serve a previous active provider for up to the 24h TTL.
  • The fix is to stamp and validate the cache against non-secret source fingerprints for both config.yaml and auth.json, then rebuild when either changes outside WebUI.
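The fingerprint idea in the last step could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `source_fingerprint` and the shape of its return value are hypothetical, and the only assumption carried over from the description is that path, mtime, and size are recorded while file contents are never read.

```python
from pathlib import Path


def source_fingerprint(*paths):
    """Return a non-secret fingerprint: path, mtime, and size per file.

    Hypothetical helper for illustration; file contents are never read.
    """
    entries = []
    for p in paths:
        path = Path(p)
        try:
            st = path.stat()
            entries.append(
                {"path": str(path), "mtime_ns": st.st_mtime_ns, "size": st.st_size}
            )
        except OSError:
            # A missing or unreadable file is still a distinct, comparable state.
            entries.append({"path": str(path), "mtime_ns": None, "size": None})
    return entries
```

Calling it with the config.yaml and auth.json paths before and after an external write would yield two different lists whenever the write changes the file's mtime or size, which is the signal used to rebuild the cache.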

What Changed

  • Added a non-secret /api/models cache source fingerprint covering config.yaml and auth.json path, mtime, and size.
  • Stamped the disk cache with _source_fingerprint and bumped the disk cache schema to reject older cache files cleanly.
  • Tracked the same source fingerprint for the in-memory TTL cache so external auth-store writes invalidate without requiring a WebUI-originated provider edit.
  • Reused a shared auth-store path helper in the model discovery cold path.
  • Added regression coverage for:
    • stale in-memory cache bypass after external auth.json active-provider change;
    • stale disk cache bypass after external auth.json active-provider change;
    • fresh disk cache reuse when config/auth sources are unchanged.
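The disk-cache stamping and validation described above might look something like the sketch below. Only the `_source_fingerprint` key comes from this PR's description; `CACHE_SCHEMA_VERSION`, the `_schema_version` key, and both function names are assumptions made for illustration.

```python
import json
from pathlib import Path

CACHE_SCHEMA_VERSION = 2  # hypothetical value; bumping it rejects older files


def save_disk_cache(cache_path, payload, fingerprint):
    """Write the payload with schema and source-fingerprint stamps.

    The fingerprint must be JSON-serializable (lists, not tuples) so that it
    compares equal after a round trip through the file.
    """
    stamped = dict(payload)
    stamped["_schema_version"] = CACHE_SCHEMA_VERSION
    stamped["_source_fingerprint"] = fingerprint
    Path(cache_path).write_text(json.dumps(stamped), encoding="utf-8")


def load_disk_cache(cache_path, current_fingerprint):
    """Return the cached payload, or None when it must be rebuilt."""
    try:
        data = json.loads(Path(cache_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # missing or corrupt cache file
    if not isinstance(data, dict):
        return None
    if data.get("_schema_version") != CACHE_SCHEMA_VERSION:
        return None  # written by an older schema, or never stamped
    if data.get("_source_fingerprint") != current_fingerprint:
        return None  # config.yaml or auth.json changed since the cache was written
    return data
```

Treating every mismatch as a cache miss keeps the failure mode simple: a cache file without the stamp behaves the same as no cache file at all.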

Why It Matters

Users who switch primary providers from the Hermes CLI should see the WebUI picker follow that source of truth after refresh, without clearing browser state or waiting for cache TTL expiry. This keeps the WebUI in sync with terminal hermes setup and prevents stale PRIMARY badges such as OpenRouter/MiniMax lingering after switching to OpenCode Go.

Verification

  • python -m pytest tests/test_issue1699_model_cache_source_fingerprint.py -q — 3 passed (RED first: two stale-cache tests failed before the fix).
  • /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_issue1699_model_cache_source_fingerprint.py tests/test_issue1633_models_cache_version_stamp.py -q — 22 passed after the CI-portability/test-state follow-ups.
  • HERMES_WEBUI_TEST_STATE_DIR=/tmp/hermes-webui-kanban/t_0cdf52fe/full-suite-state-1997a48 HERMES_WEBUI_TEST_PORT=29583 env -u HERMES_CONFIG_PATH -u HERMES_WEBUI_HOST /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/ -q — 4480 passed, 2 skipped, 1 xfailed, 2 xpassed, 1 warning, 8 subtests passed in 435.69s.
  • GitHub Actions on head a358400 — pending immediately after the final test-state follow-up; previous head 94d0b78 was green on Python 3.11, 3.12, and 3.13.
  • git diff --check — passed.
  • Browser QA on an isolated temp WebUI server: started with auth.json active provider openrouter, loaded /api/models, edited auth.json externally to opencode-go, reloaded the same browser tab without clearing cookies/localStorage, and confirmed /api/models/picker data reported active_provider: opencode-go, PRIMARY (OPENCODE-GO), provider group opencode-go, and no stale OpenRouter group.
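The stale in-memory case exercised by the regression tests and the browser QA can be modeled with a small stand-in. The class below is a hypothetical sketch, not the WebUI's cache: `FingerprintedTTLCache`, its constructor arguments, and the injectable clock are all assumptions chosen to make the behavior easy to test.

```python
import time


class FingerprintedTTLCache:
    """Hypothetical in-memory cache that expires on TTL or on source change."""

    def __init__(self, ttl_seconds, fingerprint_fn, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._fingerprint_fn = fingerprint_fn
        self._clock = clock
        self._value = None
        self._stamp = None  # (stored_at, fingerprint) once populated

    def get(self, rebuild):
        """Return the cached value, calling rebuild() if it is stale."""
        now = self._clock()
        current = self._fingerprint_fn()
        if self._stamp is not None:
            stored_at, stored_fp = self._stamp
            if now - stored_at < self._ttl and stored_fp == current:
                return self._value
        # Either nothing is cached, the TTL elapsed, or a source file changed.
        self._value = rebuild()
        self._stamp = (now, current)
        return self._value
```

With a 24h TTL and an unchanged fingerprint, repeated `get` calls reuse the first result; once the fingerprint function reports a different value, the next call rebuilds even though the TTL has not elapsed.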

Evidence / UI media

[Image: Browser QA: model picker cache refreshes after external auth-store change]

Raw media verified HTTP 200 (150269 bytes).

Risks / Follow-ups

  • The source fingerprint intentionally uses only file path, mtime, and size — never auth/config contents — so it is safe for cache metadata but depends on normal filesystem mtime updates for external writes.
  • The disk cache schema bump forces one rebuild after upgrade, which is expected for the new required _source_fingerprint metadata.
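The mtime dependency noted in the first risk can be demonstrated with a short, self-contained sketch. It assumes a fingerprint of path, mtime, and size like the one described in this PR; `stat_fingerprint` and `demo_undetected_write` are hypothetical names, and the timestamp-preserving write is simulated with `os.utime`.

```python
import os
import tempfile


def stat_fingerprint(path):
    """Path, mtime, and size only; contents are never read."""
    st = os.stat(path)
    return (path, st.st_mtime_ns, st.st_size)


def demo_undetected_write():
    """Return fingerprints taken before and after a same-size write
    whose original timestamps are restored afterwards."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "auth.json")
        with open(path, "w") as f:
            f.write('{"p": "aaa"}')
        before = stat_fingerprint(path)
        st = os.stat(path)
        with open(path, "w") as f:
            f.write('{"p": "bbb"}')  # same byte length, different content
        # Simulate a tool that preserves the original timestamps.
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
        after = stat_fingerprint(path)
    return before, after
```

In this contrived case the two fingerprints compare equal even though the contents differ, so such a write would go unnoticed until the TTL expires. Ordinary writes, which update mtime, do not have this problem.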

Model Used

OpenAI Codex gpt-5.5 via Hermes, with terminal/file/browser tooling.

Closes #1699

@Michaelyklam force-pushed the fix/issue-1699-model-cache-drift branch from 1997a48 to 94d0b78 on May 5, 2026 15:46
@nesquena-hermes (Collaborator) commented

Closed by the v0.51.4 release in PR #1707 (merged at 4daa238, deployed to production).

Live on production: https://github.com/nesquena/hermes-webui/releases/tag/v0.51.4

🚀

Michaelyklam pushed a commit to Michaelyklam/hermes-webui that referenced this pull request May 5, 2026
Michaelyklam added a commit to Michaelyklam/hermes-webui that referenced this pull request May 5, 2026
10 PRs (3 surfaces additions, 7 fixes):
- nesquena#1644 model picker chip + group count (@bergeouss, closes nesquena#1425)
- nesquena#1684 update network failures UX (@Michaelyklam, closes nesquena#1321)
- nesquena#1685 Codex spark models (@Michaelyklam, closes nesquena#1680)
- nesquena#1689 normalize profile base homes (@Michaelyklam, refs nesquena#749)
- nesquena#1693 adaptive title refresh deadlock (@ai-ag2026)
- nesquena#1701 normalize update banner URL (@Michaelyklam, closes nesquena#1691)
- nesquena#1702 workspace double-click rename (@Michaelyklam, closes nesquena#1698)
- nesquena#1703 cache invalidation on auth-store drift (@Michaelyklam, closes nesquena#1699)
- nesquena#1704 markdown fence lengths (@Michaelyklam, closes nesquena#1696)
- nesquena#1706 multi-image paste fix (@Michaelyklam, closes nesquena#1697)

Tests: 4477 → 4503 (+26). Opus: SHIP, 7/7 verification clean.

Co-authored-by: Michael Lam <Michaelyklam1@gmail.com>
Co-authored-by: ai-ag2026 <noreply@github.com>
Co-authored-by: bergeouss <noreply@github.com>