Skip to content

fix(compression): ignore tool output for compaction cards#2803

Closed
simjak wants to merge 1 commit into
nesquena:masterfrom
simjak:fix/compression-marker-origin
Closed

fix(compression): ignore tool output for compaction cards#2803
simjak wants to merge 1 commit into
nesquena:masterfrom
simjak:fix/compression-marker-origin

Conversation

@simjak

@simjak simjak commented May 23, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Expose the shared strict compression-marker predicate as is_context_compression_marker().
  • Make the streaming auto-compression path reuse that shared predicate instead of its local broad substring matcher.
  • Add regressions proving tool output and normal user discussion that merely mention context compression no longer become compression_anchor_summary cards.

Root cause

The WebUI had two different compression-marker detectors:

  1. api.compression_anchor._is_context_compression_marker() was strict: it only accepted known synthetic marker prefixes on non-tool messages.
  2. api.streaming._is_context_compression_marker() was broad: it returned true if the message text contained phrases like context compression or context compaction anywhere.

_compression_summary_from_messages() scans messages in reverse to find the latest compression marker and persists that text into compression_anchor_summary. Because the streaming detector did not check the message role and did substring matching, an ordinary tool result, for example JSON emitted by a skill/tool whose text happened to describe "context compression", could be selected as the compression summary.

Once persisted, the frontend rendered that tool JSON as a Context compaction reference card. To users this looked like WebUI compacted after the current turn, even though the card was seeded from unrelated tool output that merely mentioned compression.

This patch removes that drift by routing streaming through the same strict shared marker predicate used by compression-anchor helpers. Tool messages are ignored, and user messages that discuss context compression in normal prose are not treated as synthetic compaction markers.

Testing

  • python -m py_compile api/compression_anchor.py api/streaming.py
  • /home/agent/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_issue2028_compression_anchor_helpers.py tests/test_auto_compression_card.py -q
    • 47 passed
  • Full-suite local attempts were not clean in this environment:
    • one run had an order/environment-sensitive tests/test_issue1499_keyless_onboarding.py::TestKeylessChatReady::test_lmstudio_keyless_chat_ready_via_full_status failure; that test passed immediately when rerun alone.
    • another run hit many localhost ConnectionRefused failures in server-dependent tests, consistent with the local test server not being available for those cases.

@nesquena-hermes

Copy link
Copy Markdown
Collaborator

Summary

Read the diff at cron-pr-2803 against origin/master. Two-part change: rename _is_context_compression_marker to a public is_context_compression_marker in api/compression_anchor.py:56 (keeping a backward-compatible alias), and have api/streaming.py:2301 delegate to that shared strict predicate instead of using its own broad substring matcher.

The framing in the PR body is accurate. The streaming module's local detector at api/streaming.py:2299-2309 on origin/master was substring-matching on lowercased message text without checking role, so anything containing the literal 'context compression' or 'context compaction' got matched — including tool JSON output describing compression and user prose discussing compaction. Since _compression_summary_from_messages() at line 2344-2353 walks messages in reverse and grabs the first matched message's text, a stray tool result could "win" and seed compression_anchor_summary, then surface as a Context Compaction reference card in the UI.

Code reference

Old streaming-side detector at api/streaming.py:2299-2309 (master):

def _is_context_compression_marker(msg):
    if not isinstance(msg, dict):
        return False
    text = _message_text(msg.get('content', '')).lower()
    return (
        'context compaction' in text
        or 'context compression' in text
        or 'context was auto-compressed' in text
        or 'active task list was preserved across context compression' in text
    )

Shared strict predicate now at api/compression_anchor.py:56-71:

def is_context_compression_marker(message):
    """Return true for synthetic compression/reference cards, not user turns."""
    if not isinstance(message, dict):
        return False
    role = message.get("role")
    if not role or role == "tool":
        return False
    text = _content_text(...).lower().lstrip()
    return (
        text.startswith("[context compaction")
        or text.startswith("context compaction")
        or text.startswith("[your active task list was preserved across context compression]")
    )

The strict version requires (a) non-tool role and (b) prefix match. The synthetic marker source at api/routes.py:10082 emits "[CONTEXT COMPACTION — REFERENCE ONLY] {headline}\n", which lstrips and lowercases into a prefix match. So real markers still resolve.

Behavior change

One semantic delta worth flagging explicitly: the master streaming detector also accepted 'context was auto-compressed' in text as a substring match. The new strict predicate has no equivalent for that phrase. Looking at where that phrase is generated:

  • api/streaming.py:4788 emits a goal_status event with 'message': 'Context auto-compressed to continue the conversation'.
  • static/ui.js:5329 has it as a fallback string for the UI.
  • static/messages.js:1845 reads String(d.message||'Context auto-compressed to continue the conversation') from the goal_status event.

I don't see this string being persisted into the message array as a marker (it's transient goal_status UI text). So dropping it from the matcher likely doesn't lose any real marker detection. But worth confirming: if any code path injects a synthetic message with body text starting with "Context was auto-compressed..." (note the trailing pre-state vs the current "Context auto-compressed..."), the new strict predicate will miss it. A quick grep -rn in the agent repo + plugins for that phrase as a synthetic content body would close that out.

Test plan

The new test_context_compression_marker_detection_is_prefix_and_role_scoped exercises four realistic cases (real marker prefix, preserved-tasks marker prefix, tool noise with the phrase in JSON, user prose discussing compression). All four assertions are the right direction.

test_compression_summary_ignores_tool_output_that_mentions_compression is the regression test for the exact failure mode the PR fixes:

assert _compression_summary_from_messages([marker, skill_tool_output]) == marker["content"]
assert _compression_summary_from_messages([skill_tool_output]) is None

Both correct — the tool noise is no longer eligible to seed the summary, and a tool-only history returns None instead of accidentally matching.

Verdict

LGTM with the one note above about the dropped 'context was auto-compressed' substring branch — please confirm via a grep over the agent codebase that no message-content path generates that text as a synthetic marker. If clean, the strict-prefix shape is the right contract. CHANGELOG entry is present.

@nesquena-hermes

Copy link
Copy Markdown
Collaborator

Shipped in v0.51.121 via release/stage-batch3 (#2813). The strict predicate is now the single source of truth — tool output that mentions compression no longer false-positives. Thanks for the comprehensive new test coverage.

huoli4844 pushed a commit to huoli4844/hermes-webui that referenced this pull request May 24, 2026
…isk batch)

Cherry-picked PRs:
- nesquena#2788 (Carry00) — state.db merge: include id column + per-profile reads
- nesquena#2797 (ai-ag2026) — align messaging session display counts (raw->merged)
- nesquena#2803 (simjak) — compression marker strict predicate (no tool output)
- nesquena#2783 (Koraji95-coder) — native Windows start.ps1 + README community link
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request May 24, 2026
…➔ 0.51.124) (#634)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/nesquena/hermes-webui](https://github.com/nesquena/hermes-webui) | patch | `0.51.108` → `0.51.124` |

---

### Release Notes

<details>
<summary>nesquena/hermes-webui (ghcr.io/nesquena/hermes-webui)</summary>

### [`v0.51.124`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051124--2026-05-24--Release-CV-stage-batch6--3-PR-Windows-only-stack--agent-paths--docs--port-hardening)

[Compare Source](nesquena/hermes-webui@v0.51.123...v0.51.124)

##### Added

- **PR [#&#8203;2805](nesquena/hermes-webui#2805 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — `start.ps1`: expand hermes-agent candidate paths for Windows installers. The launcher now searches `$env:USERPROFILE\.hermes\hermes-agent`, the dev-checkout sibling, and the Windows installer roots (`$env:LOCALAPPDATA\hermes\hermes-agent`, `${env:ProgramW6432}\hermes\hermes-agent`, `${env:ProgramFiles}\hermes\hermes-agent`, `${env:ProgramFiles(x86)}\hermes\hermes-agent`) with `Select-Object -Unique` to collapse WOW64 ProgramFiles redirection collisions on 32-bit PowerShell processes. Adds `-PathType Container` to the `HERMES_WEBUI_AGENT_DIR` guard so a file named `hermes_cli` doesn't false-positive. Null-guards `${env:ProgramFiles(x86)}` for constrained environments where it's missing. Zero impact on Linux/macOS — file is `start.ps1`, never loaded by `start.sh` or `bootstrap.py`.

##### Documentation

- **PR [#&#8203;2806](nesquena/hermes-webui#2806 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — Native Windows venv path corrected in `start.ps1` doc-comment and `README.md`. The previous text suggested "run bootstrap.py inside WSL2 once to create the venv, then this script can use that venv" — but a WSL2-created venv is `venv/bin/python` (ELF) and cannot be invoked by native Windows Python. The corrected guidance is to create a Windows venv natively (`python -m venv venv` from PowerShell), then `start.ps1` auto-discovers `venv\Scripts\python.exe`. WSL2 remains useful as a parallel install for the full `bootstrap.py` + Linux runtime path.

##### Hardened

- **PR [#&#8203;2807](nesquena/hermes-webui#2807 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — `start.ps1`: `HERMES_WEBUI_PORT` env-var parsing uses `[int]::TryParse` + range guard (1-65535) instead of a bare `[int]` cast that threw `InvalidCastException` with no context on typos or accidental shell expansion. Server-process exit code is captured into `$script:serverExitCode` and emitted via `exit` AFTER the `try/finally` cleanup, so `Pop-Location` always runs (avoids leaving the caller stuck at `$RepoRoot` in interactive or dot-sourced sessions). Also drops a non-functional `@args` splat that PowerShell doesn't populate under `[CmdletBinding()]` — the launcher's existing use case is env-var-driven, no pass-through args needed.

### [`v0.51.123`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051123--2026-05-24--Release-CU-stage-batch5--2-PR-low-risk-batch--gzipETag-static-caching--Open-in-VS-Code)

[Compare Source](nesquena/hermes-webui@v0.51.122...v0.51.123)

##### Performance

- **PR [#&#8203;2779](nesquena/hermes-webui#2779 by [@&#8203;v2psv](https://github.com/v2psv) — Static asset serving negotiates gzip, emits ETags, and uses `immutable` cache headers for fingerprinted URLs. `_serve_static()` in `api/routes.py` previously sent every `/static/*` response with `Cache-Control: no-store` and no `Content-Encoding`, so a page reload over a slow link re-downloaded the full \~2.4 MB JS+CSS shell on every visit. The fix layers three changes inside the same function: (1) gzip the body when the client opts in via `Accept-Encoding`, gated to compressible MIME types and files >1 KB; (2) emit a weak ETag derived from `(size, mtime_ns)` and short-circuit conditional GETs to `304 Not Modified`; (3) send `Cache-Control: public, max-age=31536000, immutable` when the URL carries a non-empty `?v=…` fingerprint (the `__WEBUI_VERSION__` token already substituted by the index template and referenced from `static/sw.js`'s `SHELL_ASSETS`), falling back to `public, max-age=300` otherwise. Raw bytes, compressed bytes, and ETags are cached in-process keyed by `(size, mtime_ns)` so a redeploy is picked up without a restart, while missing/random paths never enter the cache and image/font types skip gzip to avoid wasted CPU on already-compressed payloads. Measured against an asyncio TCP proxy that injects RTT + bandwidth caps for representative VPN scenarios: cold loads improve 2.7-3.1× (e.g. 80 ms RTT / 10 Mbps WireGuard goes from 4.0 s to 1.3 s), warm reloads improve 3.3-4.0× via 304 responses, and bytes-on-the-wire drop 74% on cold loads. Loopback (already fast) still benefits 2.4×. Scope is strictly `/static/*`: `/api/*`, `/stream`, `/`, `/index.html`, `/session/*`, and login/auth routes are served by independent handlers and continue to send `no-store` exactly as before — no change to CSRF, session payloads, SSE buffering, or login flows. 11 regression tests pin gzip negotiation, ETag/304 round-trip including `Vary: Accept-Encoding`, fingerprint-driven cache policy including empty `?v=`, image/tiny-file skip rules, redeploy invalidation, and the existing path-traversal sandbox.

##### Added

- **PR [#&#8203;2787](nesquena/hermes-webui#2787 by [@&#8203;munim](https://github.com/munim) — "Open in VS Code" action in workspace file browser (resolves [#&#8203;2735](nesquena/hermes-webui#2735)). Right-clicking any file, folder, or the workspace root now shows an **Open in VS Code** menu item alongside the existing Reveal in File Manager action. The action calls a new `POST /api/file/open-vscode` endpoint which resolves the workspace-relative path via the existing `safe_resolve` traversal guard, then launches VS Code via `subprocess.Popen` (fire-and-forget, consistent with `_handle_file_reveal`). The endpoint resolves the executable via `shutil.which()` first, then falls back to a hardcoded list of common install locations (macOS: `/usr/local/bin/code` and the app-bundle CLI; Linux: `/usr/bin/code`, `/snap/bin/code`; Windows: `%LOCALAPPDATA%\Programs\Microsoft VS Code\bin\code.cmd` and the `%PROGRAMFILES%` variants) so the action works even when the server process inherits a minimal PATH. Configurable via a new optional `vscode` block in `config.yaml`: `command` overrides the default `code` executable; `host_path_prefix` + `container_path_prefix` enable Docker/container host-path translation. If the command cannot be found anywhere, a descriptive error is returned instead of a bare OS error. i18n keys `open_in_vscode` and `open_in_vscode_failed` added with full translations in all 10 locales. 26 new tests in `tests/test_2735_open_in_vscode.py` pin source wiring, command-resolution logic, i18n completeness, translated strings, and live endpoint error paths.

### [`v0.51.122`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051122--2026-05-24--Release-CT-stage-batch4--4-PR-low-risk-batch--stale-cache-tail--inflight-UI--segment-flush--reasoning-accumulator)

[Compare Source](nesquena/hermes-webui@v0.51.121...v0.51.122)

##### Fixed

- **PR [#&#8203;2802](nesquena/hermes-webui#2802 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Drop stale inactive cached user tails when `/api/session` reloads a conversation whose saved sidecar already ends on an assistant answer. Supersedes [#&#8203;2733](nesquena/hermes-webui#2733) (held due to async-compression interaction): the new guard adds a `len(cached_messages) <= len(disk_messages)` filter so it never fires when the cache has genuine new concurrent edits beyond the disk state — only when the cache has an unsaved user row past the saved assistant tail. Adds `api/models._inactive_cache_tail_needs_disk_check()` + `_cache_has_stale_unsaved_user_tail()` helpers and 5 new tests in `tests/test_webui_state_db_reconciliation.py`. Previously-held test `test_session_compress_async_reports_stale_session_guard` now passes (verified). Closes umbrella [#&#8203;2361](nesquena/hermes-webui#2361) partially.

- **PR [#&#8203;2796](nesquena/hermes-webui#2796 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Clear stale inflight UI state before starting a new send so blocked composer busy-state from failed/incomplete prior turns doesn't divert new turns into the invisible queue. Five-commit squashed fix: (1) drop stale optimistic sidebar rows once canonical session data arrives, (2) clear stale busy state before send via `_clearStaleBusyStateBeforeSend()`, (3) preserve server idle rows over stale optimistic local rows, (4) let `/api/chat/start` survive non-fatal pre-start UI errors via `_runOptionalPreStartUiStep()`, (5) keep those warnings console-only instead of throwing. Adds `_shouldKeepLocalOnlyOptimisticSessionRow()` in `static/sessions.js` and 8 new tests in `tests/test_inflight_send_start_race.py`. Closes [#&#8203;2795](nesquena/hermes-webui#2795). Authorship preserved via `--author`.

- **PR [#&#8203;2777](nesquena/hermes-webui#2777 by [@&#8203;b3nw](https://github.com/b3nw) — Flush pending render before segment reset at tool/interim\_assistant boundaries so live tokens that arrived in the 66ms rAF throttle window don't get lost from the DOM when `_resetAssistantSegment()` clears `assistantBody`. New `_flushPendingSegmentRender()` helper writes via `smd`, `renderMd`, or `esc` fallback (same paths as `_doRender`) only when `_renderPending` is true. Completed transcripts were never affected — `renderMessages` rebuilds from the full `assistantText` accumulator on `done`. Adds `tests/test_issue2713_streaming_segment_flush.py`. Closes [#&#8203;2713](nesquena/hermes-webui#2713).

- **PR [#&#8203;2778](nesquena/hermes-webui#2778 by [@&#8203;b3nw](https://github.com/b3nw) — Reset reasoning accumulator per turn and prefer `reasoning_content` over `reasoning` on read. Two related bugs: (1) `reasoningText` was initialized once when the SSE stream opened and never reset between turns, so the `done` event would assign the union of every turn's reasoning to the last assistant message in multi-turn agent sessions; now reset at both turn boundaries (`tool` + `interim_assistant`). (2) `static/ui.js renderMessages` preferred `m.reasoning` (potentially corrupted by bug 1) over `m.reasoning_content` (the clean per-turn backend value); the fallback now reads `m.reasoning_content || m.reasoning`. Updates `tests/test_streaming_race_fix.py` to scope the reconnect-accumulator guard to the `_wireSSE` preamble only (turn-boundary resets inside event listeners are intentional). Adds `tests/test_issue2565_reasoning_accumulation.py`. Closes [#&#8203;2565](nesquena/hermes-webui#2565).

### [`v0.51.121`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051121--2026-05-24--Release-CS-stage-batch3--4-PR-low-risk-batch--statedb-merge--display-counts--compression-marker--Windows-launcher)

[Compare Source](nesquena/hermes-webui@v0.51.120...v0.51.121)

##### Fixed

- **PR [#&#8203;2788](nesquena/hermes-webui#2788 by [@&#8203;Carry00](https://github.com/Carry00) — Prevent `state.db` messages being silently dropped during sidecar merge. Two related bugs were combining to discard historical messages: (1) `get_state_db_session_messages()` was selecting `role, content, timestamp` but NOT `id`, so every row was assigned a `("legacy", ...)` merge key instead of `("message_id", ...)`; (2) when a WebUI-origin session was continued via another Hermes surface (Gateway, CLI), the reader was always hitting the *active* profile's `state.db` rather than the session's own profile. Symptom: a 189-message session showed only 50 in the WebUI. Fix: include `id` in the SELECT when the column exists, and accept an optional `profile=` arg so cross-profile reads use the right database. Both callers in `api/routes.py handle_get` now thread `profile=getattr(s, 'profile', None)` through.

- **PR [#&#8203;2797](nesquena/hermes-webui#2797 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Align messaging session display counts with deduped display messages. The `message_count` returned by `/api/session` is the display coordinate space used for pagination and the header badge. Messaging-thread `state.db` metadata can carry raw duplicate transport rows (blank assistant separators between Discord/Slack thread turns) that `_merged_session_messages_for_display()` intentionally dedupes for rendering. The advertised count was the raw row count, so the frontend expected phantom messages after dedupe — `len(display_msgs) < message_count` triggered "load older" UI states that immediately returned nothing. Fix: `raw["message_count"] = _merged_message_count` for messaging sessions, computed from the same merge that produced the displayed messages. Adds `tests/test_gateway_sync.py::test_messaging_session_message_count_matches_deduped_display_messages` covering the regression.

- **PR [#&#8203;2803](nesquena/hermes-webui#2803 by [@&#8203;simjak](https://github.com/simjak) — Compression-summary cards no longer use ordinary tool output that merely mentions context compression. The streaming auto-compression path was using a local broad substring matcher that fired on any message containing the strings "context compaction" / "context compression" / "context was auto-compressed" / "active task list was preserved across context compression", including skill/tool JSON output and ordinary user discussion about compaction. The strict predicate at `api/compression_anchor._is_context_compression_marker()` was already correctly scoped to synthetic marker prefixes on non-tool messages. Fix: expose the strict predicate as `is_context_compression_marker()` (public name) and route `api/streaming._is_context_compression_marker` through it as a backward-compatible alias. Tool/skill output that mentions compression no longer seeds `compression_anchor_summary` cards.

##### Added

- **PR [#&#8203;2783](nesquena/hermes-webui#2783 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — Native Windows launcher and community-guide README link (squashed from 3 commits). `start.ps1` is a PowerShell equivalent of `start.sh` that bypasses `bootstrap.py`'s `ensure_supported_platform()` refusal and invokes `server.py` directly on native Windows. It mirrors `start.sh`'s discovery (load optional `.env` with the same readonly-var filter for `UID`/`GID`/`EUID`/`EGID`/`PPID`, find Python via `HERMES_WEBUI_PYTHON` env → `python3` → `python` → `py`, validate `HERMES_WEBUI_AGENT_DIR` on disk before use, prefer the agent's `venv\Scripts\python.exe`, set `HERMES_WEBUI_HOST` / `HERMES_WEBUI_PORT` / `HERMES_WEBUI_STATE_DIR` / `HERMES_HOME` defaults). The README adds a community-maintained native Windows setup section pointing to [@&#8203;markwang2658](https://github.com/markwang2658)'s `hermes-windows-native-guide` and `hermes-windows-native` repos with the documented memory delta (\~330 MB native vs \~1080 MB WSL2+Docker). Closes both halves of [#&#8203;1952](nesquena/hermes-webui#1952). Assumes Python + agent venv are already set up — first-time setup still needs WSL2 once to create the venv (`bootstrap.py` still refuses on native Windows).

### [`v0.51.120`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051120--2026-05-24--Release-CR-stage-batch2--3-PR-low-risk-batch--Bedrock-provider--update-check-past-tag--CORS-preflight)

[Compare Source](nesquena/hermes-webui@v0.51.119...v0.51.120)

##### Added

- **PR [#&#8203;2786](nesquena/hermes-webui#2786 by [@&#8203;munim](https://github.com/munim) — Surface AWS Bedrock as a configurable provider in the WebUI model picker. `api/config.py` registers `"bedrock": "AWS Bedrock"` in `PROVIDER_LABELS`, adds 6 default Bedrock model IDs (Claude Opus 4.7 / 4.6 / 4.5, Sonnet 4.6 / 4.5, Haiku 4.5) to `DEFAULT_MODELS["bedrock"]`, and teaches `_build_configured_model_badges()` to detect Bedrock when both `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are present (IAM-style auth, not single-API-key). Static fallback list is overridden at runtime by `hermes_cli.models.provider_model_ids("bedrock")` when the live AWS model list is reachable. Adds `tests/test_issue2720_bedrock_model_picker.py` with 11 test cases covering registry, defaults, env-detection, and runtime override. Resolves [#&#8203;2720](nesquena/hermes-webui#2720).

##### Fixed

- **PR [#&#8203;2789](nesquena/hermes-webui#2789 by [@&#8203;munim](https://github.com/munim) — Update check no longer falsely reports "Up to date" when HEAD has moved hundreds of commits past the latest tag. The hermes-agent repository keeps committing to master between tagged releases, and the old `_check_repo_release()` returned `behind=0` (since `current_tag == latest_tag`) and stopped — so the user saw "Up to date" while the working tree was hundreds of commits behind. The fix: when `behind == 0`, run `git describe --tags --always`; if the result contains the `-N-gSHA` suffix (HEAD past tag), return `None` so `_check_repo_branch()` runs and reports the real commit gap. Adds 8 new test cases in `tests/test_updates.py` covering past-tag detection, equal-tag-and-HEAD pass-through, untagged-repo behavior, and the agent-cadence [#&#8203;2653](nesquena/hermes-webui#2653) scenario. Resolves [#&#8203;2653](nesquena/hermes-webui#2653).

- **PR [#&#8203;2790](nesquena/hermes-webui#2790 by [@&#8203;weidzhou](https://github.com/weidzhou) — Add `do_OPTIONS()` handler in `server.py` so CORS preflight requests return `200 OK` with appropriate `Access-Control-Allow-*` headers instead of `501 Not Implemented`. Browsers sending a preflight OPTIONS for cross-origin API calls previously hit the BaseHTTPRequestHandler default and the entire CORS exchange was blocked. The handler narrowly responds only to OPTIONS — no broader CORS posture change to other endpoints. Resubmit of closed [#&#8203;2750](nesquena/hermes-webui#2750) (which bundled unrelated session-index changes); this PR is the minimal preflight-only split that [@&#8203;nesquena-hermes](https://github.com/nesquena-hermes) and [@&#8203;AJV20](https://github.com/AJV20) requested.

### [`v0.51.119`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051119--2026-05-24--Release-CQ-stage-batch1--3-PR-low-risk-batch--tool-cards--404-recovery--Hepburn-skin)

[Compare Source](nesquena/hermes-webui@v0.51.118...v0.51.119)

##### Fixed

- **PR [#&#8203;2801](nesquena/hermes-webui#2801 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Preserve settled tool cards across stream completion. The streaming `done` handler now derives anchored settled tool cards from message-level tool metadata (`message.tool_calls`, `message._partial_tool_calls`, or `content[].type === 'tool_use'`) when present, instead of unconditionally falling back to session-level `d.session.tool_calls`. The fallback could overwrite the per-message anchors after pagination/windowing because session-level coordinates may not line up with the active message array, causing tool cards to disappear on the final `done` render. Fixes [#&#8203;2613](nesquena/hermes-webui#2613), complements [#&#8203;2777](nesquena/hermes-webui#2777) (which covers pending-segment flushes at tool/interim boundaries). Adds `tests/test_streaming_markdown.py::test_done_handler_prefers_message_tool_metadata_for_settled_render` to lock the precedence.

- **PR [#&#8203;2808](nesquena/hermes-webui#2808 by [@&#8203;chouzz](https://github.com/chouzz) — Recover deterministically from boot-time `/session/{id}` 404s (Option A for [#&#8203;2798](nesquena/hermes-webui#2798)). When `loadSession()` hits a 404 during boot-time restore (`!currentSid`), `static/sessions.js` now always clears `localStorage['hermes-webui-session']`, strips the stale URL with `history.replaceState(null, '', '/')`, and rethrows so boot falls through to empty-state recovery. The previous condition required the stale id to match `localStorage`, so a stale `/session/{id}` URL with empty `localStorage` (post state-reset) could leave the UI stuck on "Session not available in web UI." Fixes [#&#8203;2798](nesquena/hermes-webui#2798).

##### Added

- **PR [#&#8203;2799](nesquena/hermes-webui#2799 by [@&#8203;gavinssr](https://github.com/gavinssr) — Add Hepburn skin (magenta-rose palette derived from the Hepburn TUI theme). Full light + dark palette under `:root[data-skin="hepburn"]` / `:root.dark[data-skin="hepburn"]`, registered in `static/boot.js` `_SKINS` and whitelisted in `static/index.html`'s inline skin gate. As part of this PR `loadSettingsPanel()` in `static/panels.js` now prefers `localStorage.getItem('hermes-skin')` over `settings.skin` when populating the skin picker (DOM truth → settings fallback), so the picker matches what the user actually sees after the inline gate has already resolved legacy aliases.

### [`v0.51.118`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051118--2026-05-22--Release-CP-stage-pr2773--1-PR-hotfix--v051117-brick-fix-chat-input-restored)

[Compare Source](nesquena/hermes-webui@v0.51.117...v0.51.118)

##### Fixed

- **PR [#&#8203;2773](nesquena/hermes-webui#2773 by [@&#8203;nesquena-hermes](https://github.com/nesquena-hermes) — fix(chat): rename `_inflightStateLimits()` in `static/ui.js` to `_getInflightStateLimits()` so it no longer collides with the `window._inflightStateLimits` config object set in `static/boot.js`. Closes [#&#8203;2771](nesquena/hermes-webui#2771). The v0.51.117 in-flight-recovery quota fix ([#&#8203;2766](nesquena/hermes-webui#2766)) declared a top-level helper with the same name as a window-attached config object; because top-level `function foo(){…}` declarations in classic (non-module) scripts attach to `window`, boot.js's `window._inflightStateLimits = {…}` assignment overwrote the function reference before any session could send. Every new chat broke on first `send()` with `TypeError: _inflightStateLimits is not a function`, leaving v0.51.117 effectively unusable. Renamed the function only (the public-ish window key is unchanged) and updated all 4 call sites. \*\*New regression test `tests/test_window_function_collision.py` scans every static JS file for top-level `function NAME()` declarations whose name is also the target of `window.NAME = {…}` / `= <number>`, the exact shape that broke [#&#8203;2715](nesquena/hermes-webui#2715) (`_pinnedSessionsLimit` in v0.51.106) and [#&#8203;2771](nesquena/hermes-webui#2771) (`_inflightStateLimits` in v0.51.117). The test fails loudly with a precise file:name diagnostic if the bug class returns. Verified end-to-end against the live browser before merge: `_getInflightStateLimits()` returns the limits object and `saveInflightState()` persists to localStorage without throwing.

### [`v0.51.117`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051118--2026-05-22--Release-CP-stage-pr2773--1-PR-hotfix--v051117-brick-fix-chat-input-restored)

[Compare Source](nesquena/hermes-webui@v0.51.116...v0.51.117)

##### Fixed

- **PR [#&#8203;2773](nesquena/hermes-webui#2773 by [@&#8203;nesquena-hermes](https://github.com/nesquena-hermes) — fix(chat): rename `_inflightStateLimits()` in `static/ui.js` to `_getInflightStateLimits()` so it no longer collides with the `window._inflightStateLimits` config object set in `static/boot.js`. Closes [#&#8203;2771](nesquena/hermes-webui#2771). The v0.51.117 in-flight-recovery quota fix ([#&#8203;2766](nesquena/hermes-webui#2766)) declared a top-level helper with the same name as a window-attached config object; because top-level `function foo(){…}` declarations in classic (non-module) scripts attach to `window`, boot.js's `window._inflightStateLimits = {…}` assignment overwrote the function reference before any session could send. Every new chat broke on first `send()` with `TypeError: _inflightStateLimits is not a function`, leaving v0.51.117 effectively unusable. Renamed the function only (the public-ish window key is unchanged) and updated all 4 call sites. \*\*New regression test `tests/test_window_function_collision.py` scans every static JS file for top-level `function NAME()` declarations whose name is also the target of `window.NAME = {…}` / `= <number>`, the exact shape that broke [#&#8203;2715](nesquena/hermes-webui#2715) (`_pinnedSessionsLimit` in v0.51.106) and [#&#8203;2771](nesquena/hermes-webui#2771) (`_inflightStateLimits` in v0.51.117). The test fails loudly with a precise file:name diagnostic if the bug class returns. Verified end-to-end against the live browser before merge: `_getInflightStateLimits()` returns the limits object and `saveInflightState()` persists to localStorage without throwing.

### [`v0.51.116`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051116--2026-05-22--Release-CN-stage-pr2676--1-PR--per-skill-enabledisable-toggle-in-Skills-panel-CLI-parity-with-hermes-skills-config)

[Compare Source](nesquena/hermes-webui@v0.51.115...v0.51.116)

##### Added

- **PR [#&#8203;2676](nesquena/hermes-webui#2676 by [@&#8203;lucasrc](https://github.com/lucasrc) — Each skill in the Skills panel now has a toggle pill (enabled/disabled) so users can turn individual skills on or off directly from the WebUI without editing `config.yaml`. Achieves parity with the existing `hermes skills config` CLI subcommand (interactive TUI that toggles `skills.disabled` in config). The disabled state is mirrored through to `skills.platform_disabled.webui` when that key is present. Disabled skills remain visible in the panel (muted via `opacity: .45`) instead of being filtered out, so users can re-enable them later. New endpoint: `POST /api/skills/toggle` validates the skill exists in the filesystem before mutating config, wraps the YAML read-modify-write under the existing `_cfg_lock` for thread safety, and calls `reload_config()` so the change takes effect immediately. Toggle pill uses theme variables (`--accent-bg-strong`, `--accent`, `--border`, `--muted`, `--accent-text`) so it adapts automatically to each skin: gold for default, red for ares, blue for poseidon, purple for sisyphus, grey for mono — verified empirically across light + dark variants. i18n keys (`skill_enabled`, `skill_disabled`, `skill_toggle_failed`) translated across all 10 locales. Default-state safety: fresh installs (no `skills.disabled` key in config) return `disabled: False` for every skill — no regression risk for new users.

### [`v0.51.115`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051115--2026-05-22--Release-CM-stage-pr2731--1-PR--clarify-prompt-collapseexpand-with-chevron-icon-polish)

[Compare Source](nesquena/hermes-webui@v0.51.114...v0.51.115)

##### Added

- **PR [#&#8203;2731](nesquena/hermes-webui#2731 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) — Clarification prompts now include a compact Collapse/Expand control so users can temporarily shrink a blocking decision card and reread the chat context behind it before responding. The toggle uses Lucide chevron icons (chevron-down expanded → click to collapse, chevron-up collapsed → click to expand) and a small circular pill matching the existing composer-button design language. The collapsed card sits cleanly above the composer at every tested viewport (desktop 1920×1080, mobile iPhone 14 390×844) without edge clipping. New clarification prompts still open expanded so users notice them.

### [`v0.51.114`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051114--2026-05-22--Release-CL-stage-407--1-PR--update-check-recovery-from-remote-re-tags)

[Compare Source](nesquena/hermes-webui@v0.51.113...v0.51.114)

##### Fixed

- **PR [#&#8203;2758](nesquena/hermes-webui#2758 by [@&#8203;nesquena-hermes](https://github.com/nesquena-hermes) — fix(updates): pass `--force` to `git fetch --tags` in `api/updates.py` so the WebUI's release-tracking update check can recover from a remote re-tag (e.g. a release tag that was force-pushed to a new commit after a squash-merge). Without `--force`, plain `git fetch origin --tags` returns `! [rejected] vX.Y.Z (would clobber existing tag)` and the entire update path (check, force-apply, normal-apply) jams indefinitely — neither the periodic check nor manual "Check now" nor the Update button can recover. Three fetch call sites were patched (`_check_repo`, `apply_force_update`, `apply_update`) to use `--tags --force`; the WebUI never pushes tags, so deferring to the remote's view is the right contract. Closes [#&#8203;2756](nesquena/hermes-webui#2756).

### [`v0.51.113`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051113--2026-05-22--Release-CK-stage-406--1-PR--composer-model-picker-lag-fix--hard-refresh-recovery)

[Compare Source](nesquena/hermes-webui@v0.51.112...v0.51.113)

##### Fixed

- **PR [#&#8203;2743](nesquena/hermes-webui#2743 by [@&#8203;franksong2702](https://github.com/franksong2702) — Composer model picker now opens immediately from the existing static option list while the dynamic `/api/models` catalog hydrates in the background, instead of blocking the click on the catalog request. A just-selected session model also survives a hard refresh that interrupts the async `/api/session/update` POST: the selection is staged into `sessionStorage` (keyed by session\_id, 10-minute TTL) before the async update flies, and `loadSession()` re-applies the pending pick on next session restore and retries the persistence call. Tests pin the new ordering: visible picker render before `await`, pending-state save before `await api('/api/session/update')`, and pending-state replay before the first `syncTopbar()` projects server metadata.

### [`v0.51.112`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051112--2026-05-22--Release-CJ-stage-405--1-PR--session-model-authoritative-across-restore)

[Compare Source](nesquena/hermes-webui@v0.51.111...v0.51.112)

##### Fixed

- **PR [#&#8203;2737](nesquena/hermes-webui#2737 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Keep the session model authoritative when a restored session is reactivated. Previously, stale browser-cached picker state could override an active conversation's model in four scenarios: (1) on initial boot when `localStorage` had a different model preference than the active session, (2) on hard refresh when `S._bootReady` revealed the composer chip before the live catalog hydrated, (3) when the session's model wasn't in the current provider catalog (the static/default fallback silently rewrote `S.session.model`), (4) when starting a new session whose model wasn't in the static HTML dropdown. The fix: `loadSession()` now requests `resolve_model=1` so backend normalization happens synchronously with metadata; boot model hydration prefers the active session over `localStorage`; hard refresh re-runs the model dropdown hydration before `_bootReady`; a new `_ensureModelOptionInDropdown()` helper injects a `data-custom='1'` option for models not in the catalog instead of silently rewriting `S.session.model` to the default. 100 LOC of new pytest regression coverage pinning each behavior.

### [`v0.51.111`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051111--2026-05-22--Release-CI-stage-404--1-PR--keep-statedb-replays-out-of-sidecar-tail)

[Compare Source](nesquena/hermes-webui@v0.51.110...v0.51.111)

##### Fixed

- **PR [#&#8203;2746](nesquena/hermes-webui#2746 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Prevent replayed state.db rows from being appended after an already-correct sidecar transcript tail. `merge_session_messages_append_only()` previously tried to skip state.db rows replaying the sidecar, but two edge cases leaked through: (1) the final row of a replayed sidecar prefix was not skipped because the replay index had reached the sidecar sequence length, and (2) a replayed middle segment was not considered prefix replay, so old state.db rows could be appended after the saved assistant tail. That made `/api/session` appear to end on an old user prompt even when the saved sidecar already ended on the real assistant answer. The fix tracks per-(role, content) visible-occurrence counts in the sidecar and uses that as a replay budget when comparing state.db rows; legitimate repeated messages from state.db are still preserved. `_has_visible_duplicate()` is kept as a thin wrapper around the new `_matching_visible_duplicate()` for backwards compatibility. Regression test covers both full-replay and middle-segment replay shapes.

### [`v0.51.110`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051110--2026-05-22--Release-CH-stage-403--2-PR-batch--default-personality-from-config--sort-configured-providers-to-top)

[Compare Source](nesquena/hermes-webui@v0.51.109...v0.51.110)

##### Added

- **PR [#&#8203;2747](nesquena/hermes-webui#2747 by [@&#8203;s010mn](https://github.com/s010mn) — `new_session()` now reads `display.personality` from `config.yaml` as the default for new conversations. Previously every new session started with `personality=None` and required an explicit `/personality <name>` slash command. Values `'none'`, `'default'`, `'neutral'`, and empty string are treated as no-personality. Case-insensitive — `personality: Taleb` normalizes to `taleb`. Config-read is wrapped in try/except so malformed config falls back to the prior behavior rather than crashing session creation. The `/personality` slash command still works for per-session overrides.
- **PR [#&#8203;2683](nesquena/hermes-webui#2683 by [@&#8203;jasonjcwu](https://github.com/jasonjcwu) — Sort providers so configured/custom entries appear first in both the model picker dropdown (`api/config.py::get_available_models`) and the Settings providers panel (`api/providers.py::get_providers`). Priority order: (1) the active provider, (2) `custom:*` providers from `custom_providers` config, (3) providers with configured API keys (credential pool or `config.yaml`), (4) all others alphabetical. Eliminates scrolling past 25+ unconfigured providers to find the one in active use.

### [`v0.51.109`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051109--2026-05-22--Release-CG-stage-402--2-PR-batch--sidebar-action-menu-click-stability--chat-panel-sidebar-resync-after-navigation)

[Compare Source](nesquena/hermes-webui@v0.51.108...v0.51.109)

##### Fixed

- **PR [#&#8203;2741](nesquena/hermes-webui#2741 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Keep the sidebar conversation actions menu open while session-list refreshes, stream updates, or panel-resync repairs arrive. Previously the three-dot menu beside chat titles could be torn down before the user finished clicking it because `renderSessionListFromCache()` rebuilt the row DOM (and the fixed-position menu's anchor) without checking whether the menu was open. The new early-return at the top of the refresh keeps the menu stable; destructive menu actions explicitly close the menu before they fire, so dismissal still works as expected.
- **PR [#&#8203;2736](nesquena/hermes-webui#2736 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Resync the chat sidebar after returning from Settings/Logs/other panels. The session list is virtualized, and the browser can clamp the preserved scrollTop during a panel transition; without a render after the chat view is visible again, stale virtual spacer/header DOM remained until the next manual scroll. The new `_resyncChatSidebarAfterPanelSwitch()` helper runs one guarded `requestAnimationFrame` after the panel becomes visible, bails if a rename input or action menu is open, and uses no polling.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19-->

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/634
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants