Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
e4d16e9
fix: clarify cancelled chat turn status
Jordan-SkyLF May 12, 2026
112eadc
fix: address cancelled turn review feedback
Jordan-SkyLF May 12, 2026
bc3f4e5
Cache PBKDF2 password hash to eliminate ~1s overhead on every HTTP re…
lucasrc May 13, 2026
7acbb3d
Cache PBKDF2 password hash to eliminate ~1s overhead on every HTTP re…
lucasrc May 13, 2026
a49c0fb
fix(ui): Fix the issue where custom models are not displayed in the m…
hualong1009 May 13, 2026
de3dba3
feat: soften sweep edges and widen band for Activity animation
dobby-d-elf May 13, 2026
7b263ce
save
dobby-d-elf May 13, 2026
e6e91e4
fix(auth): thread-safe login rate limiter, PBKDF2 key separation, and…
lucasrc May 13, 2026
a60c222
Version A: tune Activity sweep animation
dobby-d-elf May 13, 2026
3640cd8
Version B: use gold Activity highlight sweep
dobby-d-elf May 13, 2026
720e69c
fix(auth): cache signing and PBKDF2 keys in memory, remove migration …
lucasrc May 13, 2026
a183378
Refine version B Activity highlight sweep
dobby-d-elf May 13, 2026
f6a5fc2
Widen version B Activity highlight sweep
dobby-d-elf May 13, 2026
8ca2961
fix(auth): tighten except to OSError, add type hints, fix test imports
lucasrc May 13, 2026
978dbc1
fix(auth): correct misleading cache invalidation comment in verify_pa…
lucasrc May 13, 2026
2bcf411
fix(auth): invalidate password hash cache in save_settings() on passw…
lucasrc May 13, 2026
3daa12c
test(auth): add cache invalidation regression tests for save_settings()
lucasrc May 13, 2026
07a5fe0
fix(auth): HMAC length migration bridge and restore Secure cookie heu…
lucasrc May 13, 2026
9921bbb
docs(auth): add X-Forwarded-Proto trust warning to _is_secure_context()
lucasrc May 13, 2026
7e6f737
fix(auth): add type hint to verify_session()
lucasrc May 13, 2026
b734d95
test(auth): add regression tests for HMAC migration bridge (32→64 char)
lucasrc May 13, 2026
2a96fb4
fix(auth): update HMAC sig length assertion to 64 chars and rebase on…
lucasrc May 13, 2026
11d9687
Polish version B Activity highlight sweep
dobby-d-elf May 13, 2026
efce9eb
Merge remote-tracking branch 'origin/master' into tools-animation-ver…
dobby-d-elf May 13, 2026
1e17760
Fix opencode-go provider overlap routing
Michaelyklam May 13, 2026
fe4689e
test(auth): merge invalidation tests into hash cache test file, remov…
lucasrc May 13, 2026
f94314e
Merge pull request #2204 into stage-350
May 13, 2026
73b47ec
Merge pull request #2203 into stage-350
May 13, 2026
ca82f60
Merge pull request #2191 into stage-350
May 13, 2026
5f8b834
Merge pull request #2193 into stage-350
May 13, 2026
df3352e
Merge pull request #2192 into stage-350
May 13, 2026
3f85105
Merge pull request #2151 into stage-350
May 13, 2026
1f9520d
Merge pull request #2178 into stage-350
May 13, 2026
43f86d0
stage-350: fix #2178 CI — update Ollama test assertion to match new a…
May 13, 2026
66ffc7d
docs: CHANGELOG stage-350 — close v0.51.56, open Unreleased for 7-PR …
May 13, 2026
7209e89
stage-350: apply Opus SHOULD-FIX — tighten _partial_already_present d…
May 13, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,34 @@

## [Unreleased]

### Fixed

- **PR #2191** by @lucasrc (auth refactor 1/3) — Thread-safe login rate limiter (new `_LOGIN_ATTEMPTS_LOCK`) + PBKDF2 key separation (new `_pbkdf2_key()` reading `.pbkdf2_key` separately from `_signing_key()` reading `.signing_key` — previously both shared `.signing_key`, a key-reuse anti-pattern across HMAC and PBKDF2 primitives) + transparent migration in `verify_password()` that re-salts legacy hashes with the new key on next successful login. 241-line regression suite covering the lock + migration paths. Split from earlier #2167 per maintainer review request.

- **PR #2192** by @lucasrc (auth refactor 2/3, depends on #2191) — Invalidate password-hash cache when password changes via the Settings panel. The PR #2191 cache lives for the process lifetime, but `save_settings({'_set_password': ...})` could mutate `settings.json.password_hash` without telling the auth module — leaving the cache stale and verifying against the old password until restart. Now `save_settings()` calls `_invalidate_password_hash_cache()` on both `_set_password` and `_clear_password` paths. 52-line regression suite + `verify_password()` simplified to rely on the new hook instead of doing the invalidation itself.

- **PR #2193** by @lucasrc (auth refactor 3/3, independent of #2191/2) — Full 64-char HMAC-SHA256 session signatures with upgrade migration bridge. `create_session()` now emits the full digest instead of the previous `[:32]` truncated form; `verify_session()` accepts both lengths during a transition window so existing sessions survive the upgrade without a forced global logout. Restored the `_is_secure_context(handler)` heuristic (getpeercert + X-Forwarded-Proto) that the original #2167 had dropped — adds an `HERMES_WEBUI_SECURE` env-var override on top of the auto-detect. 42-line regression suite covering both signature lengths + Secure-cookie env-var override.

- **PR #2151** by @Jordan-SkyLF — Cancelled chat turns are no longer reported as provider/no-content failures. Classifies user/client cancellation, interruption/abort, provider-empty/no-content, and provider/rate/quota errors separately in streaming error handling. Persists cancelled turns as `_error` assistant markers with verbose copy and a `Cancellation details` disclosure, so reloads match the live UI. Adds race/idempotency guards so worker finalization and `/api/chat/cancel` do not duplicate cancel markers, late Stop clicks after a completed worker save do not emit contradictory cancel events (`_emit_cancel_event = False` short-circuits the terminal event when the writeback is stale), and partial streamed text/reasoning/tool-call metadata is still preserved on real cancellation. Stage-350 maintainer resolution merged this PR's cancel-handler guard with #2136's `_stream_writeback_is_current()` ownership check — both correct guards now coexist on the cancel path.

- **PR #2178** by @hualong1009 — Custom-provider models now display correctly in the model configuration list, and bare custom-provider model IDs containing dashes (e.g. `Qwen3.6-35B-A3B`) no longer have their hyphens stripped to spaces + last letter lowercased by the Ollama label formatter. Adds an `allowOllamaFormat` guard derived from `atProvider` (the `@<provider>` prefix on the model id, if any): the Ollama formatter only runs when `atProvider` is empty or starts with `ollama`. For `@custom:ai_gateway:Qwen3.6-35B-A3B` and similar non-ollama @-provider model IDs, the formatter is suppressed and the model badge label preserves the original casing/punctuation. Stage-350 maintainer fix updated `tests/test_ollama_model_chip_label_regression.py` to assert on the new `allowOllamaFormat &&` guard prefix (the original test asserted on the pre-PR code shape and was failing CI).

- **PR #2204** by @Michaelyklam (closes #1894) — `resolve_model_provider()` now prefers the configured non-custom provider when it owns a requested bare model id, even when a named custom provider also advertises the same model. Pre-fix, `model="deepseek-v4-pro"` under `provider="opencode-go"` could route to a sibling `custom_providers["opencode-go"]` entry that happened to advertise the same model rather than the canonical opencode-go provider. Custom-provider routing for custom-only models is preserved. 157-line regression suite covering the opencode-go/deepseek-v4-pro overlap and explicit provider/suffix parsing.

### Added

- **PR #2203** by @dobby-d-elf — Animates the "Activity: X tools" composer footer text while the LLM is using tools — subtle shimmer gradient that stops when tool-calling completes. Highlight color follows the active theme. Reduced-motion and mask-support fallbacks render plain muted Activity text unchanged in unsupported or `prefers-reduced-motion` environments. Also fixes a small flickering/unclickable first "Thinking" block when the user clicks it while the model is still streaming reasoning into it (unrelated to the animation but right next to it on screen).

### Stage-350 maintainer fixes

- **`api/streaming.py:_partial_already_present` dedup scope tightening** — Opus SHOULD-FIX-pre-merge on PR #2151. The dedup loop that prevents double-writing a `_partial` marker on `cancel_stream` re-entry used a substring check (`_stripped in _existing or _existing in _stripped`) against any prior assistant message — too broad. Any short prior assistant reply like "OK" or "Here is the answer:" would be a substring of many later partial bodies and could silently drop the new partial, resurrecting the #893 data-loss bug on long sessions. Tightened to: only dedup against actual prior `_partial=True` markers, with exact (whitespace-stripped) content match. Three new regression tests added: (a) short prior non-partial reply does NOT dedup a longer new partial that contains it, (b) exact-content match against a prior `_partial` marker DOES still dedup (re-entry safety), (c) prior assistant message with same content but NOT marked `_partial` does NOT dedup (it's from a completed earlier turn). 10/10 partial-cancel tests pass after the fix.

- **`api/streaming.py` cancel-handler conflict resolution between #2151 and the already-shipped #2136** — Resolved a semantic merge conflict on the cancel handler. Both PRs added stale-stream ownership guards at the same site. Kept #2136's `_stream_writeback_is_current()` check as the strictly-stronger condition (it also catches the case where the stream rotated to a new stream with a new pending_user_message — #2151's standalone check would have let that case fall through). Adopted #2151's `_emit_cancel_event = False` semantic on the same path so the terminal cancel SSE event isn't emitted in addition to skipping the writeback (otherwise a successful done payload already delivered to the client would be contradicted by a late cancel event). 55/55 tests across both PR suites pass after the resolution.

- **`tests/test_ollama_model_chip_label_regression.py` updated to match PR #2178's new `allowOllamaFormat` guard** — The existing static-source test asserted on the pre-PR string and was failing CI. Updated the assertion to require the new `allowOllamaFormat &&` guard prefix, with an extended docstring explaining the bug class (`Qwen3.6-35B-A3B`-shaped bare custom-provider model IDs had hyphens stripped to spaces + last letter lowercased by the ollama formatter pre-fix).

## [v0.51.56] — 2026-05-13 — Release AF (stage-349 — Tier 1 safe slice — reasoning_content whitelist + fork-from-here absolute index + Firefox sidebar scroll + provisional session titles)

### Added

- **PR #2202** by @Jordan-SkyLF — Early session titles on chat start. Pre-fix, new conversations sat as "Untitled" until later title generation completed. Now `/api/chat/start` derives a provisional title from the first user prompt and returns it in the response, so the sidebar and topbar sync immediately. Later SSE title refinements replace the provisional via one guarded helper (only when the current title is still known-default/provisional). Manual/custom user titles are protected via exact-normalized-match detection, so user-renamed prefix titles are never treated as automatic placeholders. 167-line regression suite in `tests/test_early_session_title.py` covering default/eager/manual title behavior, chat-start response shape, JS wiring, and manual-prefix protection.
Expand Down
211 changes: 168 additions & 43 deletions api/auth.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
import os
import secrets
import tempfile
import threading
import time

from api.config import STATE_DIR, load_settings
Expand Down Expand Up @@ -154,91 +155,181 @@ def _save_login_attempts(attempts: dict[str, list[float]]) -> None:


_login_attempts = _load_login_attempts() # ip -> [timestamp, ...]
_LOGIN_ATTEMPTS_LOCK = threading.Lock()


def _check_login_rate(ip: str) -> bool:
"""Return True if the IP is allowed to attempt login."""
now = time.time()
attempts = _login_attempts.get(ip, [])
# Prune old attempts
attempts = [t for t in attempts if now - t < _LOGIN_WINDOW]
if attempts:
_login_attempts[ip] = attempts
else:
_login_attempts.pop(ip, None)
_save_login_attempts(_login_attempts)
return len(attempts) < _LOGIN_MAX_ATTEMPTS
"""Return True if the IP is allowed to attempt login (thread-safe)."""
with _LOGIN_ATTEMPTS_LOCK:
now = time.time()
attempts = _login_attempts.get(ip, [])
# Prune old attempts
attempts = [t for t in attempts if now - t < _LOGIN_WINDOW]
if attempts:
_login_attempts[ip] = attempts
else:
_login_attempts.pop(ip, None)
_save_login_attempts(_login_attempts)
return len(attempts) < _LOGIN_MAX_ATTEMPTS


def _record_login_attempt(ip: str) -> None:
now = time.time()
attempts = _login_attempts.get(ip, [])
attempts.append(now)
_login_attempts[ip] = attempts
_save_login_attempts(_login_attempts)
"""Record a login attempt for rate limiting (thread-safe)."""
with _LOGIN_ATTEMPTS_LOCK:
now = time.time()
attempts = _login_attempts.get(ip, [])
attempts.append(now)
_login_attempts[ip] = attempts
_save_login_attempts(_login_attempts)


def _signing_key():
"""Return a random signing key, generating and persisting one on first call."""
key_file = STATE_DIR / '.signing_key'
def _load_key(filename: str) -> bytes:
"""Load a 32-byte key from STATE_DIR, generating and persisting one if missing."""
key_file = STATE_DIR / filename
try:
if key_file.exists():
raw = key_file.read_bytes()
if len(raw) >= 32:
return raw[:32]
except Exception:
logger.debug("Failed to read or access signing key file, using in-memory key")
# Generate a new random key
except OSError:
logger.debug("Failed to read key %s", filename)
key = secrets.token_bytes(32)
try:
STATE_DIR.mkdir(parents=True, exist_ok=True)
key_file.write_bytes(key)
key_file.chmod(0o600)
except Exception:
logger.debug("Failed to persist signing key, using in-memory key only")
except OSError:
logger.debug("Failed to persist key %s", filename)
return key


def _hash_password(password):
_PBKDF2_KEY_CACHE: bytes | None = None
_SIGNING_KEY_CACHE: bytes | None = None


def _pbkdf2_key() -> bytes:
global _PBKDF2_KEY_CACHE
if _PBKDF2_KEY_CACHE is None:
_PBKDF2_KEY_CACHE = _load_key('.pbkdf2_key')
return _PBKDF2_KEY_CACHE


def _signing_key() -> bytes:
global _SIGNING_KEY_CACHE
if _SIGNING_KEY_CACHE is None:
_SIGNING_KEY_CACHE = _load_key('.signing_key')
return _SIGNING_KEY_CACHE


def _hash_password(password, *, salt: bytes | None = None) -> str:
"""PBKDF2-SHA256 with 600k iterations (OWASP recommendation).
Salt is the persisted random signing key, which is secret and unique per
Salt is the persisted PBKDF2 key, which is secret and unique per
installation. This keeps the stored hash format a plain hex string
(no format change to settings.json) while replacing the predictable
STATE_DIR-derived salt from the original implementation."""
salt = _signing_key()
STATE_DIR-derived salt from the original implementation.

The *salt* parameter exists solely to support transparent migration
of password hashes that were computed with a different key (e.g. the
old `.signing_key`). Normal callers should never pass it.
"""
if salt is None:
salt = _pbkdf2_key()
dk = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 600_000)
return dk.hex()


_AUTH_HASH_LOCK = threading.Lock()
_AUTH_HASH_COMPUTED: bool = False
_AUTH_HASH_CACHE: str | None = None


def _invalidate_password_hash_cache() -> None:
"""Invalidate the in-process password hash cache so the next call to
get_password_hash() re-reads from settings.json or the env var."""
global _AUTH_HASH_COMPUTED, _AUTH_HASH_CACHE
with _AUTH_HASH_LOCK:
_AUTH_HASH_COMPUTED = False
_AUTH_HASH_CACHE = None


def get_password_hash() -> str | None:
"""Return the active password hash, or None if auth is disabled.
Priority: env var > settings.json."""
env_pw = os.getenv('HERMES_WEBUI_PASSWORD', '').strip()
if env_pw:
return _hash_password(env_pw)
settings = load_settings()
return settings.get('password_hash') or None
Priority: env var > settings.json.

The hash is computed once and cached for the lifetime of the process.
PBKDF2-600k takes ~1 s and is called on nearly every HTTP request via
check_auth → is_auth_enabled, so caching avoids wasting a full second
of CPU per request after the first one.

Thread-safe: double-checked locking ensures that under a burst of
concurrent requests only one thread computes PBKDF2, while the fast
path (after initialisation) requires zero locks.
"""
global _AUTH_HASH_COMPUTED, _AUTH_HASH_CACHE

# Fast path — no lock needed once cache is populated.
if _AUTH_HASH_COMPUTED:
return _AUTH_HASH_CACHE

with _AUTH_HASH_LOCK:
# Re-check inside lock — another thread may have populated while
# we were waiting to acquire.
if _AUTH_HASH_COMPUTED:
return _AUTH_HASH_CACHE

env_pw = os.getenv('HERMES_WEBUI_PASSWORD', '').strip()
if env_pw:
result = _hash_password(env_pw)
else:
result = load_settings().get('password_hash') or None

_AUTH_HASH_CACHE = result
_AUTH_HASH_COMPUTED = True
return result


def is_auth_enabled() -> bool:
"""True if a password is configured (env var or settings)."""
return get_password_hash() is not None


def verify_password(plain) -> bool:
"""Verify a plaintext password against the stored hash."""
def verify_password(plain: str) -> bool:
"""Verify a plaintext password against the stored hash.

Supports transparent migration of password hashes that were computed
with the old `.signing_key` salt. When the two keys differ and the
legacy-salted hash matches, the password is transparently re-hashed
with the current `.pbkdf2_key` and persisted to settings.json.
"""
expected = get_password_hash()
if not expected:
return False
return hmac.compare_digest(_hash_password(plain), expected)
# Fast path: current PBKDF2 key
if hmac.compare_digest(_hash_password(plain), expected):
return True
# Migration: some hashes were computed with `.signing_key` before the
# PBKDF2 key was separated. Try the legacy salt; if it matches,
# transparently upgrade so the next login uses the fast path.
legacy_salt = _signing_key()
current_salt = _pbkdf2_key()
if legacy_salt != current_salt:
if hmac.compare_digest(_hash_password(plain, salt=legacy_salt), expected):
from api.config import save_settings

save_settings({'_set_password': plain})
# Password re-hashed and persisted to disk using the current salt.
# Cache invalidation is handled by fix 2/3 (#2192) which adds the
# _invalidate_password_hash_cache() call inside save_settings().
return True
return False


def create_session() -> str:
"""Create a new auth session. Returns signed cookie value."""
token = secrets.token_hex(32)
_sessions[token] = time.time() + _resolve_session_ttl()
_save_sessions(_sessions)
sig = hmac.new(_signing_key(), token.encode(), hashlib.sha256).hexdigest()[:32]
sig = hmac.new(_signing_key(), token.encode(), hashlib.sha256).hexdigest()
return f"{token}.{sig}"


Expand All @@ -252,14 +343,20 @@ def _prune_expired_sessions():
_save_sessions(_sessions)


def verify_session(cookie_value) -> bool:
def verify_session(cookie_value: str) -> bool:
"""Verify a signed session cookie. Returns True if valid and not expired."""
if not cookie_value or '.' not in cookie_value:
return False
_prune_expired_sessions() # lazy cleanup on every verification attempt
token, sig = cookie_value.rsplit('.', 1)
expected_sig = hmac.new(_signing_key(), token.encode(), hashlib.sha256).hexdigest()[:32]
if not hmac.compare_digest(sig, expected_sig):
full_sig = hmac.new(_signing_key(), token.encode(), hashlib.sha256).hexdigest()
# Accept both new (64-char) and legacy (32-char truncated) signatures so
# existing sessions survive the upgrade without a forced global logout.
# The legacy branch can be removed once session TTLs have expired (~30 days).
valid = hmac.compare_digest(sig, full_sig) or (
len(sig) == 32 and hmac.compare_digest(sig, full_sig[:32])
)
if not valid:
return False
expiry = _sessions.get(token)
if not expiry or time.time() > expiry:
Expand Down Expand Up @@ -342,6 +439,35 @@ def check_auth(handler, parsed) -> bool:
return False


def _is_secure_context(handler=None) -> bool:
"""Return True if cookies should carry the Secure flag.

Behaviour is overridable via HERMES_WEBUI_SECURE env var for
reverse-proxy setups where TLS terminates at a frontend proxy
(nginx, Cloudflare, etc.) and Python only sees plain HTTP.
1/true/yes → force Secure on; 0/false/no → force Secure off.
When unset, fall back to heuristics: direct TLS socket (getpeercert)
or X-Forwarded-Proto header from the request.

.. warning::
The ``X-Forwarded-Proto`` header is only trustworthy when a
reverse proxy (nginx, Cloudflare, etc.) is deployed in front
of the application. Without a proxy, any client can forge the
header and cause the Secure flag to be set on plain HTTP.
"""
env = os.getenv('HERMES_WEBUI_SECURE', '').strip().lower()
if env in ('1', 'true', 'yes'):
return True
if env in ('0', 'false', 'no'):
return False
if handler is not None:
if getattr(handler.request, 'getpeercert', None) is not None:
return True
if handler.headers.get('X-Forwarded-Proto', '') == 'https':
return True
return False


def set_auth_cookie(handler, cookie_value) -> None:
"""Set the auth cookie on the response."""
cookie = http.cookies.SimpleCookie()
Expand All @@ -350,8 +476,7 @@ def set_auth_cookie(handler, cookie_value) -> None:
cookie[COOKIE_NAME]['samesite'] = 'Lax'
cookie[COOKIE_NAME]['path'] = '/'
cookie[COOKIE_NAME]['max-age'] = str(_resolve_session_ttl())
# Set Secure flag when connection is HTTPS
if getattr(handler.request, 'getpeercert', None) is not None or handler.headers.get('X-Forwarded-Proto', '') == 'https':
if _is_secure_context(handler):
cookie[COOKIE_NAME]['secure'] = True
handler.send_header('Set-Cookie', cookie[COOKIE_NAME].OutputString())

Expand Down
Loading
Loading