Skip to content

stage-350: medium-risk batch — auth trilogy (#2191/2/3) + cancel-status #2151 with conflict resolution + #2178 ollama guard + #2204 provider precedence + #2203 activity animation#2209

Merged
nesquena-hermes merged 36 commits into
masterfrom
stage-350
May 13, 2026

Conversation

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

stage-350 — medium-risk batch (v0.51.57)

Per Nathan's "now around later today, do the next round of PRs that require a little work but otherwise look good" directive. This batch is the medium-risk slice: stacked auth trilogy, semantic conflict resolution, CI fix for a fragile static-source test, plus 3 clean PRs.

Composition (7 PRs)

PR Author LOC Surface Stage work
#2191 lucasrc 378 api/auth.py — login rate-limiter lock + PBKDF2 key separation + migration None (CI green)
#2192 lucasrc 430 api/auth.py + config.py — _invalidate_password_hash_cache() hook from save_settings() (depends on #2191) Contributor fixed CI between sweeps
#2193 lucasrc 464 api/auth.py — full HMAC digest + migration bridge + Secure cookie heuristic restore None (CI green)
#2151 Jordan-SkyLF 599 api/streaming.py — cancel/error classification + race/idempotency guards 18-line semantic conflict resolution with already-shipped #2136 (see below)
#2178 hualong1009 115 static/ui.js — allowOllamaFormat guard against non-ollama @-provider IDs Test assertion updated to match new code shape (existing static-source test was asserting on pre-PR string)
#2204 Michaelyklam 174 api/config.py — resolve_model_provider() prefers configured non-custom provider when it owns a bare model id (closes #1894) None (CI green)
#2203 dobby-d-elf 63 static/style.css + ui.js — "Activity: X tools" shimmer animation None (CI green)

Stage-350 maintainer fixes

  1. fix: clarify cancelled chat turn status #2151 conflict resolution on api/streaming.py:4549 — Both this PR and the shipped fix: guard stale stream writebacks #2136 added stale-stream ownership guards at the same cancel-handler site. Kept fix: guard stale stream writebacks #2136's _stream_writeback_is_current() check as the strictly-stronger condition (it also catches stream-rotated-with-new-pending-message scenarios that fix: clarify cancelled chat turn status #2151's standalone check would miss). Adopted fix: clarify cancelled chat turn status #2151's _emit_cancel_event = False semantic so the terminal cancel SSE event isn't emitted alongside skipping the writeback (otherwise a successful done already delivered to the client would be contradicted). 55/55 tests pass across both PR suites after the resolution.

  2. Opus SHOULD-FIX-pre-merge: _partial_already_present dedup scope tightening — Opus advisor reviewer flagged that fix: clarify cancelled chat turn status #2151's partial-dedup loop used a substring check (_stripped in _existing or _existing in _stripped) against any prior assistant message — too broad. A short prior assistant reply like "OK" would be a substring of many later partial bodies and silently drop the new partial, resurrecting the Bug: Stop_generation button deletes LLM response #893 data-loss bug. Tightened to only dedup against actual prior _partial=True markers with exact whitespace-stripped content match. 3 new regression tests added covering: short non-partial prior reply does NOT dedup, exact-partial-match DOES dedup (re-entry safety), prior assistant with same content but NOT _partial does NOT dedup.

  3. fix(ui): Fix the issue where custom models are not displayed in the m… #2178 test fix — tests/test_ollama_model_chip_label_regression.py — Updated the static-source assertion from the pre-PR string to the post-PR string containing the new allowOllamaFormat && guard prefix. Extended the docstring to explain the bug class the guard fixes (Qwen3.6-35B-A3B-shaped bare custom-provider model IDs had hyphens stripped to spaces + last letter lowercased by the ollama formatter).

What's deferred to tomorrow

PR Why
#2149 (464 LOC) Still CONFLICTING + high-risk cache invalidation
#2174 (37 LOC) Behavior policy: show_cli_sessions=True default flip — needs explicit OK
#2183 (246 LOC) CI red — 2 failures in test_session_lineage_full_transcript that reject the merge-semantics change
#2194 (765 LOC) CI red + CONFLICTING. Large reconciliation surface
#2165 (1757 LOC) Large new pooled Codex quota UI feature
#2195 (744 LOC) New endpoint + daily snapshot persistence
#2099 (811 LOC) Streaming fade — explicit request for human review
Held: #1418, #1721, #1975, #2072, #2082, #2145, #2146 See sweep notes

Verification

  • Targeted pytest: 97/97 pass across all touched-surface suites
  • Post-Opus-fix pytest: 57/57 pass across cancel/data-loss + auth + ollama suites
  • run-browser-tests.sh: 20/20 QA + 11/11 API checks PASSED in 107s
  • Live UI smoke on 8789 (fresh env, zero JS errors): all 7 PR surfaces verified live
  • Phase 5 strict merge-marker check: zero markers in any modified file
  • python -m py_compile clean on all 7 modified .py files
  • node --check clean on all 4 modified .js files
  • Opus advisor: SHIP after _partial_already_present dedup tightening (applied inline). Three lower-priority follow-ups noted (legacy HMAC bridge removal date, X-Forwarded-Proto trust documented in docstring, activity shimmer fallback verified) — all non-blocking.

Stats

21 files changed, 1631 insertions(+), 109 deletions(-)

Closes

Refs #1361, #893, #2154 (via #2151 conflict resolution).

Jordan-SkyLF and others added 30 commits May 12, 2026 13:26
- classify string-only CancelledError payloads as cancelled
- centralize cancel marker substring matching
- add targeted regression coverage
…quest

get_password_hash() computes PBKDF2-SHA256 with 600k iterations to
hash the HERMES_WEBUI_PASSWORD env var.  This is called on nearly every
HTTP request via check_auth -> is_auth_enabled -> get_password_hash.

Before: ~1s of PBKDF2 per request, regardless of how many times the
same env-var value has already been hashed.  A page load hitting 5+
API endpoints would burn 5+ seconds purely on password hashing.

After: compute once on first call, cache the hex result in a module-
level variable.  Subsequent calls are a single global-variable read
(~50ns).  The env var is immutable for the process lifetime, so there
is nothing to invalidate.

Thread-safe: double-checked locking ensures that under a burst of
concurrent requests only one thread computes PBKDF2, while the fast
path (after initialisation) requires zero locks.

Security analysis: zero regression.  The hash is derived from a static
env var and a static signing key — both already readable from process
memory.  Caching does not introduce any new disclosure or replay
vector.  PBKDF2 is still used for the initial computation and for
verify_password() on login.

AI: deepseek/deepseek-v4-flash
…quest

get_password_hash() computes PBKDF2-SHA256 with 600k iterations to
hash the HERMES_WEBUI_PASSWORD env var.  This is called on nearly every
HTTP request via check_auth -> is_auth_enabled -> get_password_hash.

Before: ~1s of PBKDF2 per request, regardless of how many times the
same env-var value has already been hashed.  A page load hitting 5+
API endpoints would burn 5+ seconds purely on password hashing.

After: compute once on first call, cache the hex result in a module-
level variable.  Subsequent calls are a single global-variable read
(~50ns).  The env var is immutable for the process lifetime, so there
is nothing to invalidate.

Thread-safe: double-checked locking ensures that under a burst of
concurrent requests only one thread computes PBKDF2, while the fast
path (after initialisation) requires zero locks.

10 unit tests covering all branches, cache-lifetime semantics, and
concurrent burst safety (8 threads, exactly 1 PBKDF2 call).
Test isolation: reloads only api.auth via importlib.reload, leaving
api.config untouched so test_pytest_state_isolation.py is unaffected.

Security analysis: zero regression.  The hash is derived from a static
env var and a static signing key — both already readable from process
memory.  Caching does not introduce any new disclosure or replay
vector.  PBKDF2 is still used for the initial computation and for
verify_password() on login.

AI: deepseek/deepseek-v4-flash
…odel configuration list

- Fix the issue where custom models are not shown
- Fix the issue where custom models are not ollama but go through the ollama model processing function, causing the hyphen '-' in the model name to be replaced with a space " " and the last letter to be lowercase
… migration path

Concurrent failed logins raced on _login_attempts because no lock guarded
the dict. Add _LOGIN_ATTEMPTS_LOCK and wrap both _check_login_rate() and
_record_login_attempt() with it.

Extract _load_key() to de-duplicate key file I/O. Add _pbkdf2_key() that
loads .pbkdf2_key (separate from .signing_key) so PBKDF2 and HMAC signing
no longer share a key — key reuse across cryptographic primitives is unsafe.

Update _hash_password() to use _pbkdf2_key() as its default salt, with an
optional *salt* kwarg so verify_password() can try the legacy .signing_key
salt during transparent migration. When the old hash matches, save_settings()
re-hashes with _pbkdf2_key() and _invalidate_password_hash_cache() ensures
the next request sees the upgraded hash without a restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ristic

HMAC length: create_session() now emits a full 64-char HMAC-SHA256 hex
digest instead of the truncated 32-char form. verify_session() accepts
both lengths during a transition window so existing sessions survive the
upgrade without a forced global logout. The legacy 32-char branch can be
removed once the default 30-day session TTL has elapsed.

Secure flag: introduce _is_secure_context(handler) to encapsulate the
env-var override and heuristic. Restores the getpeercert / X-Forwarded-Proto
heuristic that was present before this refactor, keeping the env-var
override (HERMES_WEBUI_SECURE) on top for proxy deployments that need
explicit control. The bare `return False` stub that the previous commit
left in place silently broke Secure-cookie delivery for all reverse-proxy
users who never set the env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix opencode-go custom provider overlap routing (Michaelyklam, closes #1894)
Activity highlight animation (dobby-d-elf)
fix(auth) 1/3: thread-safe login rate limiter + PBKDF2 key separation + transparent migration (lucasrc)
fix(auth) 3/3: full HMAC digest with upgrade migration bridge + restore Secure cookie heuristic (lucasrc)
Hermes Agent added 6 commits May 13, 2026 20:42
fix(auth) 2/3: invalidate password hash cache when password changes via Settings panel (lucasrc, depends on #2191)
fix: clarify cancelled chat turn status (Jordan-SkyLF)

Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler
ownership guard). Both this PR and the already-shipped PR #2136 add a
guard at the same site against stale stream writebacks, from different
angles:

  - PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id) — strictly
    dominates by checking the active_stream_id token equality.
  - PR #2151: 'worker won the race' check via (active_stream_id != stream_id
    and not pending_user_message), with _emit_cancel_event = False to suppress
    the terminal cancel event.

Resolution merges both: keep #2136's strictly-stronger condition for skip
detection, and adopt #2151's _emit_cancel_event = False semantic so the
cancel event isn't emitted in addition to skipping the writeback (when
client may have already received the successful done payload).

55/55 tests pass across cancelled-turn-status + stale-stream-writeback +
the four cancel/data-loss sibling test files.
fix(ui): custom models not displayed in model configuration list (hualong1009)
…llowOllamaFormat guard

PR #2178 added an 'allowOllamaFormat' guard (resolves to false for non-ollama
@-provider prefixes like '@Custom:ai_gateway') to stop the ollama label
formatter from reformatting custom-provider model IDs with dashes. The
existing test asserted on the pre-PR code shape and didn't pick up the new
guard.

Updated the assertion to match the actual post-PR code at static/ui.js:2202,
with an extended docstring explaining the bug class the guard fixes (bare
custom-provider model IDs like 'Qwen3.6-35B-A3B' had hyphens stripped to
spaces + last letter lowercased by the formatter).
…edup scope

Opus flagged that PR #2151's cancel-handler partial-dedup loop used a
substring check that was too broad: any short prior assistant reply
('OK', 'Here is the answer:') would dedup a longer new partial containing
it, silently dropping the partial and resurrecting the #893 data-loss bug.

Tightened to only dedup against actual prior _partial=True markers with
exact (whitespace-stripped) content match. Three new regression tests
added (short-non-partial-prefix-does-not-dedup, exact-partial-match-still-
dedups, same-content-non-partial-does-not-dedup).

10/10 partial-cancel tests pass after the fix. Also updated CHANGELOG with
the conflict-resolution notes for #2151 vs #2136 and the #2178 test-fix.
@nesquena-hermes nesquena-hermes merged commit 6aedb7e into master May 13, 2026
3 checks passed
@nesquena-hermes nesquena-hermes deleted the stage-350 branch May 13, 2026 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

401 error via opencode go deepseek model

6 participants