stage-350: medium-risk batch - auth trilogy (#2191/2/3) + cancel-status #2151 with conflict resolution + #2178 ollama guard + #2204 provider precedence + #2203 activity animation #2209
Merged
- classify string-only CancelledError payloads as cancelled
- centralize cancel marker substring matching
- add targeted regression coverage
…quest

get_password_hash() computes PBKDF2-SHA256 with 600k iterations to hash the HERMES_WEBUI_PASSWORD env var. This is called on nearly every HTTP request via check_auth -> is_auth_enabled -> get_password_hash.

Before: ~1s of PBKDF2 per request, regardless of how many times the same env-var value has already been hashed. A page load hitting 5+ API endpoints would burn 5+ seconds purely on password hashing.

After: compute once on first call, cache the hex result in a module-level variable. Subsequent calls are a single global-variable read (~50ns). The env var is immutable for the process lifetime, so there is nothing to invalidate.

Thread-safe: double-checked locking ensures that under a burst of concurrent requests only one thread computes PBKDF2, while the fast path (after initialisation) requires zero locks.

Security analysis: zero regression. The hash is derived from a static env var and a static signing key, both already readable from process memory. Caching does not introduce any new disclosure or replay vector. PBKDF2 is still used for the initial computation and for verify_password() on login.

AI: deepseek/deepseek-v4-flash
…quest

get_password_hash() computes PBKDF2-SHA256 with 600k iterations to hash the HERMES_WEBUI_PASSWORD env var. This is called on nearly every HTTP request via check_auth -> is_auth_enabled -> get_password_hash.

Before: ~1s of PBKDF2 per request, regardless of how many times the same env-var value has already been hashed. A page load hitting 5+ API endpoints would burn 5+ seconds purely on password hashing.

After: compute once on first call, cache the hex result in a module-level variable. Subsequent calls are a single global-variable read (~50ns). The env var is immutable for the process lifetime, so there is nothing to invalidate.

Thread-safe: double-checked locking ensures that under a burst of concurrent requests only one thread computes PBKDF2, while the fast path (after initialisation) requires zero locks.

10 unit tests covering all branches, cache-lifetime semantics, and concurrent burst safety (8 threads, exactly 1 PBKDF2 call). Test isolation: reloads only api.auth via importlib.reload, leaving api.config untouched so test_pytest_state_isolation.py is unaffected.

Security analysis: zero regression. The hash is derived from a static env var and a static signing key, both already readable from process memory. Caching does not introduce any new disclosure or replay vector. PBKDF2 is still used for the initial computation and for verify_password() on login.

AI: deepseek/deepseek-v4-flash
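The compute-once pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in api.auth: the salt constant, the `_cache_ready` flag, and the `_compute_hash` helper are assumptions made for the example, and the real function derives its salt from a key file rather than a literal.

```python
import hashlib
import os
import threading

_PBKDF2_ITERATIONS = 600_000
_ILLUSTRATIVE_SALT = b"illustrative-salt"  # assumption: stand-in for the real key material

_cached_hash = None       # hex digest, or None when no password is configured
_cache_ready = False      # separates "not computed yet" from "computed, no password"
_cache_lock = threading.Lock()


def _compute_hash(password: str, salt: bytes) -> str:
    """Run the expensive PBKDF2-SHA256 derivation and return a hex string."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, _PBKDF2_ITERATIONS)
    return digest.hex()


def get_password_hash(salt: bytes = _ILLUSTRATIVE_SALT):
    """Return the cached hash, computing it at most once per process."""
    global _cached_hash, _cache_ready
    # Fast path: after initialisation this is a plain global read, no lock taken.
    if _cache_ready:
        return _cached_hash
    with _cache_lock:
        # Second check: another thread may have filled the cache while we waited.
        if not _cache_ready:
            password = os.environ.get("HERMES_WEBUI_PASSWORD", "")
            _cached_hash = _compute_hash(password, salt) if password else None
            _cache_ready = True
    return _cached_hash
```

The separate ready flag lets the "no password configured" case be cached as well, so an unset env var does not fall back to the slow path on every request.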
…odel configuration list

- Fix custom models not being shown
- Fix custom models that are not ollama models going through the ollama model processing function, which replaced each hyphen '-' in the model name with a space ' ' and lowercased the last letter
… migration path

Concurrent failed logins raced on _login_attempts because no lock guarded the dict. Add _LOGIN_ATTEMPTS_LOCK and wrap both _check_login_rate() and _record_login_attempt() with it.

Extract _load_key() to de-duplicate key file I/O. Add _pbkdf2_key() that loads .pbkdf2_key (separate from .signing_key) so PBKDF2 and HMAC signing no longer share a key; key reuse across cryptographic primitives is unsafe.

Update _hash_password() to use _pbkdf2_key() as its default salt, with an optional *salt* kwarg so verify_password() can try the legacy .signing_key salt during transparent migration. When the old hash matches, save_settings() re-hashes with _pbkdf2_key() and _invalidate_password_hash_cache() ensures the next request sees the upgraded hash without a restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ristic

HMAC length: create_session() now emits a full 64-char HMAC-SHA256 hex digest instead of the truncated 32-char form. verify_session() accepts both lengths during a transition window so existing sessions survive the upgrade without a forced global logout. The legacy 32-char branch can be removed once the default 30-day session TTL has elapsed.

Secure flag: introduce _is_secure_context(handler) to encapsulate the env-var override and heuristic. Restores the getpeercert / X-Forwarded-Proto heuristic that was present before this refactor, keeping the env-var override (HERMES_WEBUI_SECURE) on top for proxy deployments that need explicit control. The bare `return False` stub that the previous commit left in place silently broke Secure-cookie delivery for all reverse-proxy users who never set the env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
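The dual-length acceptance during the transition window might look something like the sketch below. The function names `sign_session` and `verify_session_signature` and the token format are hypothetical; the actual create_session() and verify_session() signatures are not shown in this PR text.

```python
import hashlib
import hmac

_FULL_LEN = 64     # full HMAC-SHA256 hex digest
_LEGACY_LEN = 32   # truncated form issued before the upgrade


def sign_session(token: str, key: bytes) -> str:
    """Return the full 64-char hex HMAC-SHA256 of the session token."""
    return hmac.new(key, token.encode(), hashlib.sha256).hexdigest()


def verify_session_signature(token: str, signature: str, key: bytes) -> bool:
    """Accept a full digest, or a legacy 32-char prefix during the transition."""
    expected = sign_session(token, key)
    if len(signature) == _FULL_LEN:
        candidate = expected
    elif len(signature) == _LEGACY_LEN:
        # Legacy branch: compare against the same truncation the old code emitted.
        candidate = expected[:_LEGACY_LEN]
    else:
        return False
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), candidate.encode())
```

Once the legacy branch is removed, the length check collapses to a single comparison against the full digest.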
…sion-b # Conflicts: # static/ui.js
Fix opencode-go custom provider overlap routing (Michaelyklam, closes #1894)
Activity highlight animation (dobby-d-elf)
fix(auth) 1/3: thread-safe login rate limiter + PBKDF2 key separation + transparent migration (lucasrc)
fix(auth) 3/3: full HMAC digest with upgrade migration bridge + restore Secure cookie heuristic (lucasrc)
added 6 commits
May 13, 2026 20:42
fix(auth) 2/3: invalidate password hash cache when password changes via Settings panel (lucasrc, depends on #2191)
fix: clarify cancelled chat turn status (Jordan-SkyLF)

Conflict resolution on api/streaming.py:4549-4567 (the cancel-handler ownership guard). Both this PR and the already-shipped PR #2136 add a guard at the same site against stale stream writebacks, from different angles:

- PR #2136 (HEAD): _stream_writeback_is_current(_cs, stream_id), which strictly dominates by checking the active_stream_id token equality.
- PR #2151: 'worker won the race' check via (active_stream_id != stream_id and not pending_user_message), with _emit_cancel_event = False to suppress the terminal cancel event.

Resolution merges both: keep #2136's strictly-stronger condition for skip detection, and adopt #2151's _emit_cancel_event = False semantic so the cancel event isn't emitted in addition to skipping the writeback (when the client may have already received the successful done payload).

55/55 tests pass across cancelled-turn-status + stale-stream-writeback + the four cancel/data-loss sibling test files.
fix(ui): custom models not displayed in model configuration list (hualong1009)
…llowOllamaFormat guard

PR #2178 added an 'allowOllamaFormat' guard (resolves to false for non-ollama @-provider prefixes like '@Custom:ai_gateway') to stop the ollama label formatter from reformatting custom-provider model IDs with dashes. The existing test asserted on the pre-PR code shape and didn't pick up the new guard.

Updated the assertion to match the actual post-PR code at static/ui.js:2202, with an extended docstring explaining the bug class the guard fixes (bare custom-provider model IDs like 'Qwen3.6-35B-A3B' had hyphens stripped to spaces + last letter lowercased by the formatter).
…medium-risk batch
…edup scope

Opus flagged that PR #2151's cancel-handler partial-dedup loop used a substring check that was too broad: any short prior assistant reply ('OK', 'Here is the answer:') would dedup a longer new partial containing it, silently dropping the partial and resurrecting the #893 data-loss bug.

Tightened to only dedup against actual prior _partial=True markers with exact (whitespace-stripped) content match. Three new regression tests added (short-non-partial-prefix-does-not-dedup, exact-partial-match-still-dedups, same-content-non-partial-does-not-dedup). 10/10 partial-cancel tests pass after the fix.

Also updated CHANGELOG with the conflict-resolution notes for #2151 vs #2136 and the #2178 test-fix.
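To make the difference between the two dedup rules concrete, here is a small sketch of both. The message shape (dicts with role, content, and an optional _partial flag) and both function names are assumptions for illustration; the real loop lives inside the cancel handler in api/streaming.py.

```python
def partial_already_present_loose(messages, new_partial):
    """Pre-fix behaviour (as described): substring match against any assistant message."""
    stripped = new_partial.strip()
    for msg in messages:
        if msg.get("role") != "assistant":
            continue
        existing = (msg.get("content") or "").strip()
        if stripped in existing or existing in stripped:
            return True
    return False


def partial_already_present(messages, new_partial):
    """Tightened behaviour: only an exact match against a prior _partial marker counts."""
    stripped = new_partial.strip()
    for msg in messages:
        if msg.get("role") != "assistant":
            continue
        if not msg.get("_partial"):
            continue  # ordinary completed replies never suppress a new partial
        if (msg.get("content") or "").strip() == stripped:
            return True
    return False
```

With a prior reply of "OK" in the history, the loose version reports a duplicate for any new partial that happens to contain "OK", while the tightened version keeps the partial unless an identical _partial entry already exists.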
stage-350 — medium-risk batch (v0.51.57)
Per Nathan's "now around later today, do the next round of PRs that require a little work but otherwise look good" directive. This batch is the medium-risk slice: stacked auth trilogy, semantic conflict resolution, CI fix for a fragile static-source test, plus 3 clean PRs.
Composition (7 PRs)
- _invalidate_password_hash_cache() hook from save_settings() (depends on #2191)
- allowOllamaFormat guard against non-ollama @-provider IDs
- resolve_model_provider() prefers configured non-custom provider when it owns a bare model id (closes #1894)

Stage-350 maintainer fixes

- #2151 (fix: clarify cancelled chat turn status) conflict resolution on api/streaming.py:4549: Both this PR and the shipped #2136 (fix: guard stale stream writebacks) added stale-stream ownership guards at the same cancel-handler site. Kept #2136's _stream_writeback_is_current() check as the strictly-stronger condition (it also catches stream-rotated-with-new-pending-message scenarios that #2151's standalone check would miss). Adopted #2151's _emit_cancel_event = False semantic so the terminal cancel SSE event isn't emitted alongside skipping the writeback (otherwise a successful done already delivered to the client would be contradicted). 55/55 tests pass across both PR suites after the resolution.
- Opus SHOULD-FIX-pre-merge: _partial_already_present dedup scope tightening. The Opus advisor reviewer flagged that #2151's partial-dedup loop used a substring check (_stripped in _existing or _existing in _stripped) against any prior assistant message, which is too broad. A short prior assistant reply like "OK" would be a substring of many later partial bodies and silently drop the new partial, resurrecting the #893 data-loss bug (Bug: Stop_generation button deletes LLM response). Tightened to only dedup against actual prior _partial=True markers with exact whitespace-stripped content match. 3 new regression tests added covering: short non-partial prior reply does NOT dedup, exact-partial-match DOES dedup (re-entry safety), prior assistant with same content but NOT _partial does NOT dedup.
- #2178 (fix(ui): Fix the issue where custom models are not displayed in the m…) test fix: tests/test_ollama_model_chip_label_regression.py. Updated the static-source assertion from the pre-PR string to the post-PR string containing the new allowOllamaFormat && guard prefix. Extended the docstring to explain the bug class the guard fixes (Qwen3.6-35B-A3B-shaped bare custom-provider model IDs had hyphens stripped to spaces + last letter lowercased by the ollama formatter).

What's deferred to tomorrow

- show_cli_sessions=True default flip: needs explicit OK

Verification

- run-browser-tests.sh: 20/20 QA + 11/11 API checks PASSED in 107s
- python -m py_compile clean on all 7 modified .py files
- node --check clean on all 4 modified .js files
- _partial_already_present dedup tightening (applied inline). Three lower-priority follow-ups noted (legacy HMAC bridge removal date, X-Forwarded-Proto trust documented in docstring, activity shimmer fallback verified), all non-blocking.

Stats
Closes
Closes #1894 (via #2204, Fix opencode-go custom provider overlap routing)
Refs #1361, #893, #2154 (via #2151 conflict resolution).