
fix(chat): #2713 flush pending render before segment reset at tool/interim #2777

Closed
b3nw wants to merge 1 commit into nesquena:master from b3nw:fix/streaming-tool-segment-truncation


b3nw commented May 23, 2026

Contributor

Closes #2713 — live assistant text can truncate at tool-call segment boundaries during streaming.

Before _resetAssistantSegment() in the tool and interim_assistant SSE handlers, synchronously flush any pending rAF render work so tokens that arrived during the 66ms throttle window are written to the DOM before assistantBody is cleared. Without this flush, the pending _doRender callback fires after assistantBody is null and skips the write silently, causing the tail of the pre-tool segment to disappear from the live view.

The fix uses the existing _cancelAnimationFramePendingStreamRender() helper and writes via smd/renderMd/esc (same paths as _doRender) only when _renderPending is true — normal cases where the rAF has already fired are unaffected.

Completed transcripts were never affected (renderMessages rebuilds from the full assistantText accumulator on done).

b3nw changed the title from "fix(chat): #2731 flush pending render before segment reset at tool/interim" to "fix(chat): #2713 flush pending render before segment reset at tool/interim" on May 23, 2026
@nesquena-hermes

Collaborator

Summary

Reading the patch at static/messages.js:1423-1437 and the matching block at 1486-1502 against the rAF throttling logic on origin/master (static/messages.js:1313-1372), this is a targeted fix for a real bug: _scheduleRender() queues a rAF callback that captures assistantBody from closure scope, and _resetAssistantSegment() nulls that out (messages.js:1289-1294) — so the captured _doRender writes get silently dropped because of the if(assistantBody) guard at line 1335. Calling _cancelAnimationFramePendingStreamRender() plus a synchronous flush before the reset is the right shape.

Code reference

Before the fix, the rAF closure body at static/messages.js:1334-1339 was:

if(assistantBody){
  const displayText = segmentStart===0
    ? parsed.displayText
    : _stripXmlToolCalls(assistantText.slice(segmentStart));
  if(_shouldUseStreamFade()){
    ...

When the tool listener fires inside the 66ms throttle window (line 1325-1326), it goes straight to _resetAssistantSegment() which sets assistantBody=null. The queued rAF then sees assistantBody falsy and the partial-segment tail is lost from the live DOM. The completed transcript is unaffected because renderMessages() rebuilds from assistantText, which matches what the PR description claims.
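
A minimal sketch of the race described above, assuming a fake frame queue in place of requestAnimationFrame and a plain object in place of the DOM node. The names (makeStream, runFrame, sealed) are hypothetical and the bodies are heavily simplified; only the shape of the guard, the reset, and the flush follows the discussion, not the real static/messages.js code.

```javascript
// Simplified model of the streaming state. `sealed` records what the live
// view showed at the moment each segment was reset (a modeling device only).
function makeStream({ flushBeforeReset }) {
  const rafQueue = [];                      // stand-in for pending rAF callbacks
  const sealed = [];
  let assistantText = '';
  let segmentStart = 0;
  let assistantBody = { textContent: '' };  // stand-in for the live DOM node
  let renderPending = false;

  function doRender() {
    renderPending = false;
    // Same kind of guard as the real _doRender: a nulled body skips the write.
    if (assistantBody) assistantBody.textContent = assistantText.slice(segmentStart);
  }
  function scheduleRender() {
    if (renderPending) return;              // throttle: one queued frame at a time
    renderPending = true;
    rafQueue.push(doRender);
  }
  function flushPendingSegmentRender() {
    if (!assistantBody || !renderPending) return;
    rafQueue.length = 0;                    // cancel the queued frame
    renderPending = false;
    assistantBody.textContent = assistantText.slice(segmentStart);
  }
  function resetAssistantSegment() {
    if (assistantBody) sealed.push(assistantBody.textContent);
    assistantBody = null;
    segmentStart = assistantText.length;
  }
  return {
    token(t) { assistantText += t; scheduleRender(); },
    tool() {
      if (flushBeforeReset) flushPendingSegmentRender();
      resetAssistantSegment();
    },
    runFrame() { while (rafQueue.length) rafQueue.shift()(); },
    sealed,
  };
}

function run(flushBeforeReset) {
  const s = makeStream({ flushBeforeReset });
  s.token('Hello');
  s.runFrame();        // first frame paints "Hello"
  s.token(', world');  // arrives inside the throttle window; render is only queued
  s.tool();            // tool event resets the segment before the frame fires
  s.runFrame();        // any queued render now sees a null body and skips
  return s.sealed[0];
}

const withoutFlush = run(false);
const withFlush = run(true);
console.log(withoutFlush); // Hello
console.log(withFlush);    // Hello, world
```

In this model the only difference between the two runs is whether the flush happens before the reset, which is the ordering the patch introduces.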

Diagnosis

The fix block in both event handlers reads:

if(assistantBody&&_renderPending){
  _cancelAnimationFramePendingStreamRender();
  const displayText=segmentStart===0
    ? _parseStreamState().displayText
    : _stripXmlToolCalls(assistantText.slice(segmentStart));
  if(_smdParser){
    _smdWrite(displayText);
  } else if(renderMd){
    assistantBody.innerHTML=renderMd(displayText);
  } else {
    assistantBody.innerHTML=esc(displayText);
  }
}

Three things look correct:

  1. The guard assistantBody && _renderPending means the no-pending case skips entirely — zero overhead on the hot path when rAF already fired.
  2. The displayText computation mirrors _doRender at line 1336-1338 exactly, including the segmentStart===0 think-tag-vs-tool-strip distinction. Good — easy place to silently diverge otherwise.
  3. The smd-vs-renderMd-vs-esc cascade matches the _doRender cascade at line 1347-1365.

Concerns

Stream-fade path not exercised. The real _doRender calls _renderStreamingFadeMarkdown(displayText) via _shouldUseStreamFade() at line 1339-1346 to drive the per-token fade animation. The flush path bypasses that and writes directly. That's probably fine because we're about to seal the segment anyway, but worth confirming: when stream-fade is active and a tool fires mid-fade, will the pre-tool tail render without fade animation? If users see that as a visual glitch it's worth gating the flush behind !_shouldUseStreamFade() or routing through _renderStreamingFadeMarkdown with a "force-complete" flag. Easy to defer to a follow-up if no one's complained.

_parseStreamState() is called twice in the interim_assistant path: once in the new flush block (where only its displayText is used) and again at the existing parsed=_parseStreamState() ~30 lines up. Not a correctness issue, just a perf nit: on long sessions the regex-driven think-tag stripping in _stripXmlToolCalls() (messages.js:849-864) isn't free. Cache once.

Code duplication. The two blocks are byte-identical. Worth extracting _flushPendingSegmentRender() (4 lines + 8-line body) and calling it from both listeners. Especially since the tool listener has an additional _freshSegment=true; _smdEndParser(); step that the interim handler doesn't, which makes it easy to drift over time.

Test plan

No automated coverage was added. The PR comment claims "Completed transcripts were never affected" — I agree from reading the code (the persisted message comes from assistantText, not DOM). But the live-render bug itself is hard to assert without a browser or a DOM-level harness. At minimum, please add a tests/test_issue2713_* regression that uses the same read('static/messages.js') AST-grep style as tests/test_streaming_race_fix.py (which already covers _wireSSE invariants) to pin:

  • The tool listener body contains _cancelAnimationFramePendingStreamRender().
  • It contains _renderPending as a guard.
  • Both flush blocks reference both _smdWrite and renderMd/esc as fallback.

That way a future refactor of _scheduleRender() can't silently re-break this.
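
As a rough illustration of the pins listed above, here is a sketch that runs plain substring checks over a handler body. The repository's tests are Python; JavaScript is used here only to match the code under review. `handlerSource` is a hypothetical stand-in for the tool listener text, not the real file content, and `checkFlushInvariants` is an invented name. The ordering check at the end is an extra assumption beyond the three bullets.

```javascript
// Hypothetical handler body; a real test would read static/messages.js and
// slice it down to the tool listener before running these checks.
const handlerSource = `
  if(assistantBody&&_renderPending){
    _cancelAnimationFramePendingStreamRender();
    if(_smdParser){ _smdWrite(displayText); }
    else if(renderMd){ assistantBody.innerHTML=renderMd(displayText); }
    else { assistantBody.innerHTML=esc(displayText); }
  }
  _resetAssistantSegment();
`;

// Returns the list of violated invariants (an empty list means all are pinned).
function checkFlushInvariants(src) {
  const failures = [];
  const required = [
    '_cancelAnimationFramePendingStreamRender()', // rAF cancellation
    '_renderPending',                             // guard
    '_smdWrite(',                                 // primary render path
    'renderMd(',                                  // first fallback
    'esc(',                                       // last-resort fallback
  ];
  for (const needle of required) {
    if (!src.includes(needle)) failures.push(`missing ${needle}`);
  }
  // Extra check: the flush must appear before the segment reset.
  const flushPos = src.indexOf('_cancelAnimationFramePendingStreamRender()');
  const resetPos = src.indexOf('_resetAssistantSegment()');
  if (flushPos === -1 || resetPos === -1 || flushPos > resetPos) {
    failures.push('flush must precede _resetAssistantSegment()');
  }
  return failures;
}

const ok = checkFlushInvariants(handlerSource);
const regressed = checkFlushInvariants('_resetAssistantSegment();');
console.log(ok.length, regressed.length);
```

Substring checks like these are brittle against renames, which is the trade-off of a static-analysis test: it cannot prove the live render is correct, only that the pieces of the fix are still present and in order.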

Otherwise this looks correct. The CHANGELOG entry is present.

…boundaries

Fixes nesquena#2713 — live assistant text can truncate at tool-call segment
boundaries during streaming.

Before _resetAssistantSegment() in the tool and interim_assistant SSE
handlers, synchronously flush any pending rAF render work so tokens that
arrived during the 66ms throttle window are written to the DOM before
assistantBody is cleared. Without this flush, the pending _doRender
callback fires after assistantBody is null and skips the write silently,
causing the tail of the pre-tool segment to disappear from the live view.

Implementation:
- Extract _flushPendingSegmentRender() helper (guarded by assistantBody
  && _renderPending) that cancels the pending rAF and synchronously
  writes via smd/renderMd/esc — same cascade as _doRender.
- Call the helper from both the tool and interim_assistant handlers
  before their respective _resetAssistantSegment() calls.
- Normal cases where the rAF has already fired are unaffected (guard
  skips immediately).

Completed transcripts were never affected (renderMessages rebuilds from
the full assistantText accumulator on done).

Adds tests/test_issue2713_streaming_segment_flush.py with 11 static
analysis regression tests pinning the helper shape and call-site
ordering.

b3nw commented May 23, 2026

Contributor Author

@nesquena-hermes addressed concerns as requested

@nesquena-hermes

Collaborator

Summary

Re-read the latest HEAD (5655d0ff) against origin/master. Both review asks landed:

  1. Helper extraction: _flushPendingSegmentRender() now lives at static/messages.js:1288-1301 as a single function. Both call sites (tool handler at line 1440, interim_assistant at line 1491) reduce to a single one-line invocation. The drift risk between the two byte-identical blocks I flagged in the first review is gone.
  2. Test coverage: tests/test_issue2713_streaming_segment_flush.py (178 lines) uses the same read('static/messages.js') static-analysis pattern as test_streaming_race_fix.py. The class breakdown is exactly the shape I asked for.

Code reference

The extracted helper at static/messages.js:1288-1301:

function _flushPendingSegmentRender(){
  if(!assistantBody||!_renderPending) return;
  _cancelAnimationFramePendingStreamRender();
  const displayText=segmentStart===0
    ? _parseStreamState().displayText
    : _stripXmlToolCalls(assistantText.slice(segmentStart));
  if(_smdParser){
    _smdWrite(displayText);
  } else if(renderMd){
    assistantBody.innerHTML=renderMd(displayText);
  } else {
    assistantBody.innerHTML=esc(displayText);
  }
}

This is a closure over assistantBody, _renderPending, segmentStart, assistantText, and _smdParser from attachLiveStream() scope, so the call sites stay clean and the guard logic is single-sourced. Good.

The test invariants pin the semantically meaningful properties: existence (test_flush_helper_declared), guards (test_flush_helper_guards_on_assistant_body + test_flush_helper_guards_on_render_pending), rAF cancellation (test_flush_helper_cancels_pending_raf), render cascade (_smdWrite/renderMd/esc fallback tests), and, crucially, the ordering invariant in both SSE handler tests:

flush_pos = fn.index("_flushPendingSegmentRender()")
reset_pos = fn.index("_resetAssistantSegment()", flush_pos)
assert flush_pos < reset_pos

If a future refactor moves the flush call below the reset, those two ordering tests fail — which is exactly the regression class the original bug was in.

Notes

The two deferred items from the first review remain:

  • Stream-fade bypass. _flushPendingSegmentRender writes directly via _smdWrite/renderMd/esc rather than routing through _renderStreamingFadeMarkdown(). When _shouldUseStreamFade() returns true and a tool fires mid-fade, the pre-tool tail still won't get a fade animation. This is an aesthetic micro-glitch on a segment that's about to be sealed anyway, so it's fine to defer; just noting it's not addressed in this PR.
  • _parseStreamState() call duplication in the interim_assistant path. Still computed once in the flush and once at the existing parsed=_parseStreamState(). Pure perf nit, defer.

Neither blocks the merge.

Verdict

LGTM. The helper extraction is the right shape, and the tests pin the contract well enough that the rAF/segment-reset ordering can't silently regress. CHANGELOG entry is present at the right severity (live-streaming display only, no transcript impact).

@nesquena-hermes

Collaborator

Shipped in v0.51.122 via release/stage-batch4 (#2815). The _flushPendingSegmentRender helper is in. Closes #2713.

huoli4844 pushed a commit to huoli4844/hermes-webui that referenced this pull request May 24, 2026
…isk batch)

Cherry-picked PRs:
- nesquena#2802 (ai-ag2026) — drop stale cached user tail (supersedes held nesquena#2733)
- nesquena#2796 (ai-ag2026) — clear stale inflight UI state (5-commit squash)
- nesquena#2777 (b3nw) — flush pending render at segment boundaries
- nesquena#2778 (b3nw) — reset reasoning accumulator per turn + prefer reasoning_content
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request May 24, 2026
…➔ 0.51.124) (#634)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/nesquena/hermes-webui](https://github.com/nesquena/hermes-webui) | patch | `0.51.108` → `0.51.124` |

---

### Release Notes

<details>
<summary>nesquena/hermes-webui (ghcr.io/nesquena/hermes-webui)</summary>

### [`v0.51.124`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051124--2026-05-24--Release-CV-stage-batch6--3-PR-Windows-only-stack--agent-paths--docs--port-hardening)

[Compare Source](nesquena/hermes-webui@v0.51.123...v0.51.124)

##### Added

- **PR [#&#8203;2805](nesquena/hermes-webui#2805 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — `start.ps1`: expand hermes-agent candidate paths for Windows installers. The launcher now searches `$env:USERPROFILE\.hermes\hermes-agent`, the dev-checkout sibling, and the Windows installer roots (`$env:LOCALAPPDATA\hermes\hermes-agent`, `${env:ProgramW6432}\hermes\hermes-agent`, `${env:ProgramFiles}\hermes\hermes-agent`, `${env:ProgramFiles(x86)}\hermes\hermes-agent`) with `Select-Object -Unique` to collapse WOW64 ProgramFiles redirection collisions on 32-bit PowerShell processes. Adds `-PathType Container` to the `HERMES_WEBUI_AGENT_DIR` guard so a file named `hermes_cli` doesn't false-positive. Null-guards `${env:ProgramFiles(x86)}` for constrained environments where it's missing. Zero impact on Linux/macOS — file is `start.ps1`, never loaded by `start.sh` or `bootstrap.py`.

##### Documentation

- **PR [#&#8203;2806](nesquena/hermes-webui#2806 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — Native Windows venv path corrected in `start.ps1` doc-comment and `README.md`. The previous text suggested "run bootstrap.py inside WSL2 once to create the venv, then this script can use that venv" — but a WSL2-created venv is `venv/bin/python` (ELF) and cannot be invoked by native Windows Python. The corrected guidance is to create a Windows venv natively (`python -m venv venv` from PowerShell), then `start.ps1` auto-discovers `venv\Scripts\python.exe`. WSL2 remains useful as a parallel install for the full `bootstrap.py` + Linux runtime path.

##### Hardened

- **PR [#&#8203;2807](nesquena/hermes-webui#2807 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — `start.ps1`: `HERMES_WEBUI_PORT` env-var parsing uses `[int]::TryParse` + range guard (1-65535) instead of a bare `[int]` cast that threw `InvalidCastException` with no context on typos or accidental shell expansion. Server-process exit code is captured into `$script:serverExitCode` and emitted via `exit` AFTER the `try/finally` cleanup, so `Pop-Location` always runs (avoids leaving the caller stuck at `$RepoRoot` in interactive or dot-sourced sessions). Also drops a non-functional `@args` splat that PowerShell doesn't populate under `[CmdletBinding()]` — the launcher's existing use case is env-var-driven, no pass-through args needed.

### [`v0.51.123`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051123--2026-05-24--Release-CU-stage-batch5--2-PR-low-risk-batch--gzipETag-static-caching--Open-in-VS-Code)

[Compare Source](nesquena/hermes-webui@v0.51.122...v0.51.123)

##### Performance

- **PR [#&#8203;2779](nesquena/hermes-webui#2779 by [@&#8203;v2psv](https://github.com/v2psv) — Static asset serving negotiates gzip, emits ETags, and uses `immutable` cache headers for fingerprinted URLs. `_serve_static()` in `api/routes.py` previously sent every `/static/*` response with `Cache-Control: no-store` and no `Content-Encoding`, so a page reload over a slow link re-downloaded the full \~2.4 MB JS+CSS shell on every visit. The fix layers three changes inside the same function: (1) gzip the body when the client opts in via `Accept-Encoding`, gated to compressible MIME types and files >1 KB; (2) emit a weak ETag derived from `(size, mtime_ns)` and short-circuit conditional GETs to `304 Not Modified`; (3) send `Cache-Control: public, max-age=31536000, immutable` when the URL carries a non-empty `?v=…` fingerprint (the `__WEBUI_VERSION__` token already substituted by the index template and referenced from `static/sw.js`'s `SHELL_ASSETS`), falling back to `public, max-age=300` otherwise. Raw bytes, compressed bytes, and ETags are cached in-process keyed by `(size, mtime_ns)` so a redeploy is picked up without a restart, while missing/random paths never enter the cache and image/font types skip gzip to avoid wasted CPU on already-compressed payloads. Measured against an asyncio TCP proxy that injects RTT + bandwidth caps for representative VPN scenarios: cold loads improve 2.7-3.1× (e.g. 80 ms RTT / 10 Mbps WireGuard goes from 4.0 s to 1.3 s), warm reloads improve 3.3-4.0× via 304 responses, and bytes-on-the-wire drop 74% on cold loads. Loopback (already fast) still benefits 2.4×. Scope is strictly `/static/*`: `/api/*`, `/stream`, `/`, `/index.html`, `/session/*`, and login/auth routes are served by independent handlers and continue to send `no-store` exactly as before — no change to CSRF, session payloads, SSE buffering, or login flows. 
11 regression tests pin gzip negotiation, ETag/304 round-trip including `Vary: Accept-Encoding`, fingerprint-driven cache policy including empty `?v=`, image/tiny-file skip rules, redeploy invalidation, and the existing path-traversal sandbox.

##### Added

- **PR [#&#8203;2787](nesquena/hermes-webui#2787 by [@&#8203;munim](https://github.com/munim) — "Open in VS Code" action in workspace file browser (resolves [#&#8203;2735](nesquena/hermes-webui#2735)). Right-clicking any file, folder, or the workspace root now shows an **Open in VS Code** menu item alongside the existing Reveal in File Manager action. The action calls a new `POST /api/file/open-vscode` endpoint which resolves the workspace-relative path via the existing `safe_resolve` traversal guard, then launches VS Code via `subprocess.Popen` (fire-and-forget, consistent with `_handle_file_reveal`). The endpoint resolves the executable via `shutil.which()` first, then falls back to a hardcoded list of common install locations (macOS: `/usr/local/bin/code` and the app-bundle CLI; Linux: `/usr/bin/code`, `/snap/bin/code`; Windows: `%LOCALAPPDATA%\Programs\Microsoft VS Code\bin\code.cmd` and the `%PROGRAMFILES%` variants) so the action works even when the server process inherits a minimal PATH. Configurable via a new optional `vscode` block in `config.yaml`: `command` overrides the default `code` executable; `host_path_prefix` + `container_path_prefix` enable Docker/container host-path translation. If the command cannot be found anywhere, a descriptive error is returned instead of a bare OS error. i18n keys `open_in_vscode` and `open_in_vscode_failed` added with full translations in all 10 locales. 26 new tests in `tests/test_2735_open_in_vscode.py` pin source wiring, command-resolution logic, i18n completeness, translated strings, and live endpoint error paths.

### [`v0.51.122`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051122--2026-05-24--Release-CT-stage-batch4--4-PR-low-risk-batch--stale-cache-tail--inflight-UI--segment-flush--reasoning-accumulator)

[Compare Source](nesquena/hermes-webui@v0.51.121...v0.51.122)

##### Fixed

- **PR [#2802](nesquena/hermes-webui#2802)** by [@ai-ag2026](https://github.com/ai-ag2026): Drop stale inactive cached user tails when `/api/session` reloads a conversation whose saved sidecar already ends on an assistant answer. Supersedes [#2733](nesquena/hermes-webui#2733) (held due to async-compression interaction): the new guard adds a `len(cached_messages) <= len(disk_messages)` filter so it never fires when the cache has genuine new concurrent edits beyond the disk state, only when the cache has an unsaved user row past the saved assistant tail. Adds `api/models._inactive_cache_tail_needs_disk_check()` + `_cache_has_stale_unsaved_user_tail()` helpers and 5 new tests in `tests/test_webui_state_db_reconciliation.py`. Previously-held test `test_session_compress_async_reports_stale_session_guard` now passes (verified). Closes umbrella [#2361](nesquena/hermes-webui#2361) partially.

- **PR [#2796](nesquena/hermes-webui#2796)** by [@ai-ag2026](https://github.com/ai-ag2026): Clear stale inflight UI state before starting a new send so blocked composer busy-state from failed/incomplete prior turns doesn't divert new turns into the invisible queue. Five-commit squashed fix: (1) drop stale optimistic sidebar rows once canonical session data arrives, (2) clear stale busy state before send via `_clearStaleBusyStateBeforeSend()`, (3) preserve server idle rows over stale optimistic local rows, (4) let `/api/chat/start` survive non-fatal pre-start UI errors via `_runOptionalPreStartUiStep()`, (5) keep those warnings console-only instead of throwing. Adds `_shouldKeepLocalOnlyOptimisticSessionRow()` in `static/sessions.js` and 8 new tests in `tests/test_inflight_send_start_race.py`. Closes [#2795](nesquena/hermes-webui#2795). Authorship preserved via `--author`.

- **PR [#2777](nesquena/hermes-webui#2777)** by [@b3nw](https://github.com/b3nw): Flush pending render before segment reset at tool/interim\_assistant boundaries so live tokens that arrived in the 66ms rAF throttle window don't get lost from the DOM when `_resetAssistantSegment()` clears `assistantBody`. New `_flushPendingSegmentRender()` helper writes via `smd`, `renderMd`, or `esc` fallback (same paths as `_doRender`) only when `_renderPending` is true. Completed transcripts were never affected: `renderMessages` rebuilds from the full `assistantText` accumulator on `done`. Adds `tests/test_issue2713_streaming_segment_flush.py`. Closes [#2713](nesquena/hermes-webui#2713).

- **PR [#2778](nesquena/hermes-webui#2778)** by [@b3nw](https://github.com/b3nw): Reset reasoning accumulator per turn and prefer `reasoning_content` over `reasoning` on read. Two related bugs: (1) `reasoningText` was initialized once when the SSE stream opened and never reset between turns, so the `done` event would assign the union of every turn's reasoning to the last assistant message in multi-turn agent sessions; now reset at both turn boundaries (`tool` + `interim_assistant`). (2) `static/ui.js renderMessages` preferred `m.reasoning` (potentially corrupted by bug 1) over `m.reasoning_content` (the clean per-turn backend value); the fallback now reads `m.reasoning_content || m.reasoning`. Updates `tests/test_streaming_race_fix.py` to scope the reconnect-accumulator guard to the `_wireSSE` preamble only (turn-boundary resets inside event listeners are intentional). Adds `tests/test_issue2565_reasoning_accumulation.py`. Closes [#2565](nesquena/hermes-webui#2565).

### [`v0.51.121`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051121--2026-05-24--Release-CS-stage-batch3--4-PR-low-risk-batch--statedb-merge--display-counts--compression-marker--Windows-launcher)

[Compare Source](nesquena/hermes-webui@v0.51.120...v0.51.121)

##### Fixed

- **PR [#&#8203;2788](nesquena/hermes-webui#2788 by [@&#8203;Carry00](https://github.com/Carry00) — Prevent `state.db` messages being silently dropped during sidecar merge. Two related bugs were combining to discard historical messages: (1) `get_state_db_session_messages()` was selecting `role, content, timestamp` but NOT `id`, so every row was assigned a `("legacy", ...)` merge key instead of `("message_id", ...)`; (2) when a WebUI-origin session was continued via another Hermes surface (Gateway, CLI), the reader was always hitting the *active* profile's `state.db` rather than the session's own profile. Symptom: a 189-message session showed only 50 in the WebUI. Fix: include `id` in the SELECT when the column exists, and accept an optional `profile=` arg so cross-profile reads use the right database. Both callers in `api/routes.py handle_get` now thread `profile=getattr(s, 'profile', None)` through.

- **PR [#&#8203;2797](nesquena/hermes-webui#2797 by [@&#8203;ai-ag2026](https://github.com/ai-ag2026) — Align messaging session display counts with deduped display messages. The `message_count` returned by `/api/session` is the display coordinate space used for pagination and the header badge. Messaging-thread `state.db` metadata can carry raw duplicate transport rows (blank assistant separators between Discord/Slack thread turns) that `_merged_session_messages_for_display()` intentionally dedupes for rendering. The advertised count was the raw row count, so the frontend expected phantom messages after dedupe — `len(display_msgs) < message_count` triggered "load older" UI states that immediately returned nothing. Fix: `raw["message_count"] = _merged_message_count` for messaging sessions, computed from the same merge that produced the displayed messages. Adds `tests/test_gateway_sync.py::test_messaging_session_message_count_matches_deduped_display_messages` covering the regression.

- **PR [#&#8203;2803](nesquena/hermes-webui#2803 by [@&#8203;simjak](https://github.com/simjak) — Compression-summary cards no longer use ordinary tool output that merely mentions context compression. The streaming auto-compression path was using a local broad substring matcher that fired on any message containing the strings "context compaction" / "context compression" / "context was auto-compressed" / "active task list was preserved across context compression", including skill/tool JSON output and ordinary user discussion about compaction. The strict predicate at `api/compression_anchor._is_context_compression_marker()` was already correctly scoped to synthetic marker prefixes on non-tool messages. Fix: expose the strict predicate as `is_context_compression_marker()` (public name) and route `api/streaming._is_context_compression_marker` through it as a backward-compatible alias. Tool/skill output that mentions compression no longer seeds `compression_anchor_summary` cards.

##### Added

- **PR [#&#8203;2783](nesquena/hermes-webui#2783 by [@&#8203;Koraji95-coder](https://github.com/Koraji95-coder) — Native Windows launcher and community-guide README link (squashed from 3 commits). `start.ps1` is a PowerShell equivalent of `start.sh` that bypasses `bootstrap.py`'s `ensure_supported_platform()` refusal and invokes `server.py` directly on native Windows. It mirrors `start.sh`'s discovery (load optional `.env` with the same readonly-var filter for `UID`/`GID`/`EUID`/`EGID`/`PPID`, find Python via `HERMES_WEBUI_PYTHON` env → `python3` → `python` → `py`, validate `HERMES_WEBUI_AGENT_DIR` on disk before use, prefer the agent's `venv\Scripts\python.exe`, set `HERMES_WEBUI_HOST` / `HERMES_WEBUI_PORT` / `HERMES_WEBUI_STATE_DIR` / `HERMES_HOME` defaults). The README adds a community-maintained native Windows setup section pointing to [@&#8203;markwang2658](https://github.com/markwang2658)'s `hermes-windows-native-guide` and `hermes-windows-native` repos with the documented memory delta (\~330 MB native vs \~1080 MB WSL2+Docker). Closes both halves of [#&#8203;1952](nesquena/hermes-webui#1952). Assumes Python + agent venv are already set up — first-time setup still needs WSL2 once to create the venv (`bootstrap.py` still refuses on native Windows).

### [`v0.51.120`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051120--2026-05-24--Release-CR-stage-batch2--3-PR-low-risk-batch--Bedrock-provider--update-check-past-tag--CORS-preflight)

[Compare Source](nesquena/hermes-webui@v0.51.119...v0.51.120)

##### Added

- **PR [#&#8203;2786](nesquena/hermes-webui#2786 by [@&#8203;munim](https://github.com/munim) — Surface AWS Bedrock as a configurable provider in the WebUI model picker. `api/config.py` registers `"bedrock": "AWS Bedrock"` in `PROVIDER_LABELS`, adds 6 default Bedrock model IDs (Claude Opus 4.7 / 4.6 / 4.5, Sonnet 4.6 / 4.5, Haiku 4.5) to `DEFAULT_MODELS["bedrock"]`, and teaches `_build_configured_model_badges()` to detect Bedrock when both `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are present (IAM-style auth, not single-API-key). Static fallback list is overridden at runtime by `hermes_cli.models.provider_model_ids("bedrock")` when the live AWS model list is reachable. Adds `tests/test_issue2720_bedrock_model_picker.py` with 11 test cases covering registry, defaults, env-detection, and runtime override. Resolves [#&#8203;2720](nesquena/hermes-webui#2720).

##### Fixed

- **PR [#&#8203;2789](nesquena/hermes-webui#2789 by [@&#8203;munim](https://github.com/munim) — Update check no longer falsely reports "Up to date" when HEAD has moved hundreds of commits past the latest tag. The hermes-agent repository keeps committing to master between tagged releases, and the old `_check_repo_release()` returned `behind=0` (since `current_tag == latest_tag`) and stopped — so the user saw "Up to date" while the working tree was hundreds of commits behind. The fix: when `behind == 0`, run `git describe --tags --always`; if the result contains the `-N-gSHA` suffix (HEAD past tag), return `None` so `_check_repo_branch()` runs and reports the real commit gap. Adds 8 new test cases in `tests/test_updates.py` covering past-tag detection, equal-tag-and-HEAD pass-through, untagged-repo behavior, and the agent-cadence [#&#8203;2653](nesquena/hermes-webui#2653) scenario. Resolves [#&#8203;2653](nesquena/hermes-webui#2653).

- **PR [#&#8203;2790](nesquena/hermes-webui#2790 by [@&#8203;weidzhou](https://github.com/weidzhou) — Add `do_OPTIONS()` handler in `server.py` so CORS preflight requests return `200 OK` with appropriate `Access-Control-Allow-*` headers instead of `501 Not Implemented`. Browsers sending a preflight OPTIONS for cross-origin API calls previously hit the BaseHTTPRequestHandler default and the entire CORS exchange was blocked. The handler narrowly responds only to OPTIONS — no broader CORS posture change to other endpoints. Resubmit of closed [#&#8203;2750](nesquena/hermes-webui#2750) (which bundled unrelated session-index changes); this PR is the minimal preflight-only split that [@&#8203;nesquena-hermes](https://github.com/nesquena-hermes) and [@&#8203;AJV20](https://github.com/AJV20) requested.

### [`v0.51.119`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051119--2026-05-24--Release-CQ-stage-batch1--3-PR-low-risk-batch--tool-cards--404-recovery--Hepburn-skin)

[Compare Source](nesquena/hermes-webui@v0.51.118...v0.51.119)

##### Fixed

- **PR [#2801](nesquena/hermes-webui#2801)** by [@ai-ag2026](https://github.com/ai-ag2026): Preserve settled tool cards across stream completion. The streaming `done` handler now derives anchored settled tool cards from message-level tool metadata (`message.tool_calls`, `message._partial_tool_calls`, or `content[].type === 'tool_use'`) when present, instead of unconditionally falling back to session-level `d.session.tool_calls`. The fallback could overwrite the per-message anchors after pagination/windowing because session-level coordinates may not line up with the active message array, causing tool cards to disappear on the final `done` render. Fixes [#2613](nesquena/hermes-webui#2613), complements [#2777](nesquena/hermes-webui#2777) (which covers pending-segment flushes at tool/interim boundaries). Adds `tests/test_streaming_markdown.py::test_done_handler_prefers_message_tool_metadata_for_settled_render` to lock the precedence.

- **PR [#&#8203;2808](nesquena/hermes-webui#2808 by [@&#8203;chouzz](https://github.com/chouzz) — Recover deterministically from boot-time `/session/{id}` 404s (Option A for [#&#8203;2798](nesquena/hermes-webui#2798)). When `loadSession()` hits a 404 during boot-time restore (`!currentSid`), `static/sessions.js` now always clears `localStorage['hermes-webui-session']`, strips the stale URL with `history.replaceState(null, '', '/')`, and rethrows so boot falls through to empty-state recovery. The previous condition required the stale id to match `localStorage`, so a stale `/session/{id}` URL with empty `localStorage` (post state-reset) could leave the UI stuck on "Session not available in web UI." Fixes [#&#8203;2798](nesquena/hermes-webui#2798).

##### Added

- **PR [#&#8203;2799](nesquena/hermes-webui#2799 by [@&#8203;gavinssr](https://github.com/gavinssr) — Add Hepburn skin (magenta-rose palette derived from the Hepburn TUI theme). Full light + dark palette under `:root[data-skin="hepburn"]` / `:root.dark[data-skin="hepburn"]`, registered in `static/boot.js` `_SKINS` and whitelisted in `static/index.html`'s inline skin gate. As part of this PR `loadSettingsPanel()` in `static/panels.js` now prefers `localStorage.getItem('hermes-skin')` over `settings.skin` when populating the skin picker (DOM truth → settings fallback), so the picker matches what the user actually sees after the inline gate has already resolved legacy aliases.

### [`v0.51.118`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051118--2026-05-22--Release-CP-stage-pr2773--1-PR-hotfix--v051117-brick-fix-chat-input-restored)

[Compare Source](nesquena/hermes-webui@v0.51.117...v0.51.118)

##### Fixed

- **PR [#2773](nesquena/hermes-webui#2773)** by [@nesquena-hermes](https://github.com/nesquena-hermes): fix(chat): rename `_inflightStateLimits()` in `static/ui.js` to `_getInflightStateLimits()` so it no longer collides with the `window._inflightStateLimits` config object set in `static/boot.js`. Closes [#2771](nesquena/hermes-webui#2771). The v0.51.117 in-flight-recovery quota fix ([#2766](nesquena/hermes-webui#2766)) declared a top-level helper with the same name as a window-attached config object; because top-level `function foo(){…}` declarations in classic (non-module) scripts attach to `window`, boot.js's `window._inflightStateLimits = {…}` assignment overwrote the function reference before any session could send. Every new chat broke on first `send()` with `TypeError: _inflightStateLimits is not a function`, leaving v0.51.117 effectively unusable. Renamed the function only (the public-ish window key is unchanged) and updated all 4 call sites. New regression test `tests/test_window_function_collision.py` scans every static JS file for top-level `function NAME()` declarations whose name is also the target of `window.NAME = {…}` / `= <number>`, the exact shape that broke [#2715](nesquena/hermes-webui#2715) (`_pinnedSessionsLimit` in v0.51.106) and [#2771](nesquena/hermes-webui#2771) (`_inflightStateLimits` in v0.51.117). The test fails loudly with a precise file:name diagnostic if the bug class returns. Verified end-to-end against the live browser before merge: `_getInflightStateLimits()` returns the limits object and `saveInflightState()` persists to localStorage without throwing.
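
The collision scan described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `tests/test_window_function_collision.py`: the regexes, the `find_collisions` name, and the "column 0 means top-level" heuristic are all assumptions made for the example.

```python
import re

# Assumption: a declaration starting at column 0 is treated as top-level.
FUNC_DECL = re.compile(r"^function\s+([A-Za-z_$][\w$]*)\s*\(", re.MULTILINE)
# Matches `window.NAME = {` or `window.NAME = <digit>`, but not `==` comparisons
# and not `window.NAME = function ...` assignments.
WINDOW_ASSIGN = re.compile(r"\bwindow\.([A-Za-z_$][\w$]*)\s*=\s*(?:\{|\d)")


def find_collisions(sources):
    """Return sorted (decl_file, assign_file, name) tuples for every name that is
    both a top-level function declaration and a non-function window assignment.

    sources: mapping of file name -> JavaScript source text.
    """
    declared = {}
    assigned = {}
    for fname, text in sources.items():
        for match in FUNC_DECL.finditer(text):
            declared.setdefault(match.group(1), fname)
        for match in WINDOW_ASSIGN.finditer(text):
            assigned.setdefault(match.group(1), fname)
    shared = declared.keys() & assigned.keys()
    return sorted((declared[name], assigned[name], name) for name in shared)
```

A real test would read the files under `static/` and fail if the returned list is non-empty, printing each tuple as the file:name diagnostic.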

### [`v0.51.117`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051118--2026-05-22--Release-CP-stage-pr2773--1-PR-hotfix--v051117-brick-fix-chat-input-restored)

[Compare Source](nesquena/hermes-webui@v0.51.116...v0.51.117)

### [`v0.51.116`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051116--2026-05-22--Release-CN-stage-pr2676--1-PR--per-skill-enabledisable-toggle-in-Skills-panel-CLI-parity-with-hermes-skills-config)

[Compare Source](nesquena/hermes-webui@v0.51.115...v0.51.116)

##### Added

- **PR [#2676](nesquena/hermes-webui#2676)** by [@lucasrc](https://github.com/lucasrc): Each skill in the Skills panel now has a toggle pill (enabled/disabled) so users can turn individual skills on or off directly from the WebUI without editing `config.yaml`. Achieves parity with the existing `hermes skills config` CLI subcommand (interactive TUI that toggles `skills.disabled` in config). The disabled state is mirrored through to `skills.platform_disabled.webui` when that key is present. Disabled skills remain visible in the panel (muted via `opacity: .45`) instead of being filtered out, so users can re-enable them later. New endpoint: `POST /api/skills/toggle` validates the skill exists in the filesystem before mutating config, wraps the YAML read-modify-write under the existing `_cfg_lock` for thread safety, and calls `reload_config()` so the change takes effect immediately. Toggle pill uses theme variables (`--accent-bg-strong`, `--accent`, `--border`, `--muted`, `--accent-text`) so it adapts automatically to each skin: gold for default, red for ares, blue for poseidon, purple for sisyphus, grey for mono (verified empirically across light + dark variants). i18n keys (`skill_enabled`, `skill_disabled`, `skill_toggle_failed`) translated across all 10 locales. Default-state safety: fresh installs (no `skills.disabled` key in config) return `disabled: False` for every skill, so there is no regression risk for new users.
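
As a rough sketch of the read-modify-write described above, assuming the config is held as a plain nested dict: the `toggle_skill` function, its arguments, and the list shape of `skills.disabled` are hypothetical stand-ins; the real endpoint reads and writes YAML and calls `reload_config()`, which is omitted here.

```python
import threading

# Stand-in for the existing `_cfg_lock` named in the changelog entry.
_cfg_lock = threading.Lock()


def toggle_skill(config, known_skills, name, enabled):
    """Enable or disable one skill in an in-memory config dict.

    known_skills stands in for the filesystem existence check.
    Returns a small status dict; raises KeyError for unknown skills.
    """
    if name not in known_skills:
        raise KeyError(f"unknown skill: {name}")
    with _cfg_lock:  # serialize the read-modify-write
        skills = config.setdefault("skills", {})
        disabled = [s for s in skills.get("disabled", []) if s != name]
        if not enabled:
            disabled.append(name)
        skills["disabled"] = disabled
        # Mirror only when the platform-specific key already exists.
        platform = skills.get("platform_disabled")
        if isinstance(platform, dict) and "webui" in platform:
            mirrored = [s for s in platform["webui"] if s != name]
            if not enabled:
                mirrored.append(name)
            platform["webui"] = mirrored
    return {"name": name, "disabled": not enabled}
```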

### [`v0.51.115`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051115--2026-05-22--Release-CM-stage-pr2731--1-PR--clarify-prompt-collapseexpand-with-chevron-icon-polish)

[Compare Source](nesquena/hermes-webui@v0.51.114...v0.51.115)

##### Added

- **PR [#2731](nesquena/hermes-webui#2731)** by [@Michaelyklam](https://github.com/Michaelyklam): Clarification prompts now include a compact Collapse/Expand control so users can temporarily shrink a blocking decision card and reread the chat context behind it before responding. The toggle uses Lucide chevron icons (chevron-down expanded → click to collapse, chevron-up collapsed → click to expand) and a small circular pill matching the existing composer-button design language. The collapsed card sits cleanly above the composer at every tested viewport (desktop 1920×1080, mobile iPhone 14 390×844) without edge clipping. New clarification prompts still open expanded so users notice them.

### [`v0.51.114`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051114--2026-05-22--Release-CL-stage-407--1-PR--update-check-recovery-from-remote-re-tags)

[Compare Source](nesquena/hermes-webui@v0.51.113...v0.51.114)

##### Fixed

- **PR [#2758](nesquena/hermes-webui#2758)** by [@nesquena-hermes](https://github.com/nesquena-hermes): fix(updates): pass `--force` to `git fetch --tags` in `api/updates.py` so the WebUI's release-tracking update check can recover from a remote re-tag (e.g. a release tag that was force-pushed to a new commit after a squash-merge). Without `--force`, plain `git fetch origin --tags` returns `! [rejected] vX.Y.Z (would clobber existing tag)` and the entire update path (check, force-apply, normal-apply) jams indefinitely: none of the periodic check, manual "Check now", or the Update button can recover. Three fetch call sites were patched (`_check_repo`, `apply_force_update`, `apply_update`) to use `--tags --force`; the WebUI never pushes tags, so deferring to the remote's view is the right contract. Closes [#2756](nesquena/hermes-webui#2756).
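
A minimal sketch of a forced tag fetch, assuming the call sites shell out to `git` via `subprocess`. The helper names `build_fetch_tags_cmd` and `fetch_tags` are hypothetical and are not taken from `api/updates.py`.

```python
import subprocess


def build_fetch_tags_cmd(remote="origin"):
    """Build the argv for a tag fetch that lets the remote overwrite local tags."""
    return ["git", "fetch", remote, "--tags", "--force"]


def fetch_tags(repo_dir, remote="origin", timeout=60):
    """Run the fetch inside repo_dir and return (ok, stderr_text)."""
    proc = subprocess.run(
        build_fetch_tags_cmd(remote),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # report failures to the caller instead of raising
    )
    return proc.returncode == 0, proc.stderr.strip()
```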

### [`v0.51.113`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051113--2026-05-22--Release-CK-stage-406--1-PR--composer-model-picker-lag-fix--hard-refresh-recovery)

[Compare Source](nesquena/hermes-webui@v0.51.112...v0.51.113)

##### Fixed

- **PR [#2743](nesquena/hermes-webui#2743)** by [@franksong2702](https://github.com/franksong2702): Composer model picker now opens immediately from the existing static option list while the dynamic `/api/models` catalog hydrates in the background, instead of blocking the click on the catalog request. A just-selected session model also survives a hard refresh that interrupts the async `/api/session/update` POST: the selection is staged into `sessionStorage` (keyed by `session_id`, 10-minute TTL) before the async update flies, and `loadSession()` re-applies the pending pick on next session restore and retries the persistence call. Tests pin the new ordering: visible picker render before `await`, pending-state save before `await api('/api/session/update')`, and pending-state replay before the first `syncTopbar()` projects server metadata.

### [`v0.51.112`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051112--2026-05-22--Release-CJ-stage-405--1-PR--session-model-authoritative-across-restore)

[Compare Source](nesquena/hermes-webui@v0.51.111...v0.51.112)

##### Fixed

- **PR [#2737](nesquena/hermes-webui#2737)** by [@ai-ag2026](https://github.com/ai-ag2026): Keep the session model authoritative when a restored session is reactivated. Previously, stale browser-cached picker state could override an active conversation's model in four scenarios: (1) on initial boot when `localStorage` had a different model preference than the active session, (2) on hard refresh when `S._bootReady` revealed the composer chip before the live catalog hydrated, (3) when the session's model wasn't in the current provider catalog (the static/default fallback silently rewrote `S.session.model`), and (4) when starting a new session whose model wasn't in the static HTML dropdown. The fix: `loadSession()` now requests `resolve_model=1` so backend normalization happens synchronously with metadata; boot model hydration prefers the active session over `localStorage`; hard refresh re-runs the model dropdown hydration before `_bootReady`; a new `_ensureModelOptionInDropdown()` helper injects a `data-custom='1'` option for models not in the catalog instead of silently rewriting `S.session.model` to the default. Adds 100 LOC of new pytest regression coverage pinning each behavior.

### [`v0.51.111`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051111--2026-05-22--Release-CI-stage-404--1-PR--keep-statedb-replays-out-of-sidecar-tail)

[Compare Source](nesquena/hermes-webui@v0.51.110...v0.51.111)

##### Fixed

- **PR [#2746](nesquena/hermes-webui#2746)** by [@ai-ag2026](https://github.com/ai-ag2026): Prevent replayed state.db rows from being appended after an already-correct sidecar transcript tail. `merge_session_messages_append_only()` previously tried to skip state.db rows replaying the sidecar, but two edge cases leaked through: (1) the final row of a replayed sidecar prefix was not skipped because the replay index had reached the sidecar sequence length, and (2) a replayed middle segment was not considered prefix replay, so old state.db rows could be appended after the saved assistant tail. That made `/api/session` appear to end on an old user prompt even when the saved sidecar already ended on the real assistant answer. The fix tracks per-(role, content) visible-occurrence counts in the sidecar and uses that as a replay budget when comparing state.db rows; legitimate repeated messages from state.db are still preserved. `_has_visible_duplicate()` is kept as a thin wrapper around the new `_matching_visible_duplicate()` for backwards compatibility. Regression test covers both full-replay and middle-segment replay shapes.
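
The replay-budget idea can be illustrated with a simplified sketch. It assumes messages are dicts with `role` and `content` keys and ignores the "visible" filtering the real code performs; `append_unreplayed_rows` is a hypothetical name, not the signature of `merge_session_messages_append_only()`.

```python
from collections import Counter


def append_unreplayed_rows(sidecar, state_rows):
    """Append only those state.db rows that are not replays of sidecar messages.

    Each (role, content) pair in the sidecar grants one unit of replay budget.
    A state.db row spends budget if any is left (it is a replay and is skipped);
    once the budget is exhausted, further identical rows are kept as real repeats.
    """
    budget = Counter((m["role"], m["content"]) for m in sidecar)
    merged = list(sidecar)
    for row in state_rows:
        key = (row["role"], row["content"])
        if budget[key] > 0:
            budget[key] -= 1  # replay of something the sidecar already holds
            continue
        merged.append(row)
    return merged
```

Because the budget is keyed by content rather than by position, a replayed prefix and a replayed middle segment are handled the same way, which is the property the entry above describes.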

### [`v0.51.110`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051110--2026-05-22--Release-CH-stage-403--2-PR-batch--default-personality-from-config--sort-configured-providers-to-top)

[Compare Source](nesquena/hermes-webui@v0.51.109...v0.51.110)

##### Added

- **PR [#2747](nesquena/hermes-webui#2747)** by [@s010mn](https://github.com/s010mn): `new_session()` now reads `display.personality` from `config.yaml` as the default for new conversations. Previously every new session started with `personality=None` and required an explicit `/personality <name>` slash command. Values `'none'`, `'default'`, `'neutral'`, and empty string are treated as no-personality. Matching is case-insensitive: `personality: Taleb` normalizes to `taleb`. The config read is wrapped in try/except so malformed config falls back to the prior behavior rather than crashing session creation. The `/personality` slash command still works for per-session overrides.
- **PR [#2683](nesquena/hermes-webui#2683)** by [@jasonjcwu](https://github.com/jasonjcwu): Sort providers so configured/custom entries appear first in both the model picker dropdown (`api/config.py::get_available_models`) and the Settings providers panel (`api/providers.py::get_providers`). Priority order: (1) the active provider, (2) `custom:*` providers from `custom_providers` config, (3) providers with configured API keys (credential pool or `config.yaml`), (4) all others alphabetical. Eliminates scrolling past 25+ unconfigured providers to find the one in active use.
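
The provider priority order from #2683 can be expressed as a sort key. This is a sketch only: `provider_sort_key` and `sort_providers` are hypothetical names, providers are assumed to be plain id strings, and the alphabetical tie-break inside the first three tiers is an assumption (the entry specifies alphabetical order only for the last tier).

```python
def provider_sort_key(provider_id, active, configured_ids):
    """Rank a provider id: active first, then custom:*, then configured, then the rest."""
    if provider_id == active:
        rank = 0
    elif provider_id.startswith("custom:"):
        rank = 1
    elif provider_id in configured_ids:
        rank = 2
    else:
        rank = 3
    # Assumption: ties within a tier are broken alphabetically, case-insensitively.
    return (rank, provider_id.lower())


def sort_providers(provider_ids, active, configured_ids):
    """Return provider ids ordered by the four-tier priority."""
    return sorted(
        provider_ids,
        key=lambda pid: provider_sort_key(pid, active, configured_ids),
    )
```

For example, with `mistral` active and only `openai` configured, `sort_providers(["zeta", "openai", "custom:local", "anthropic", "mistral"], "mistral", {"openai"})` puts `mistral` first, then `custom:local`, then `openai`, then the unconfigured providers in alphabetical order.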

### [`v0.51.109`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051109--2026-05-22--Release-CG-stage-402--2-PR-batch--sidebar-action-menu-click-stability--chat-panel-sidebar-resync-after-navigation)

[Compare Source](nesquena/hermes-webui@v0.51.108...v0.51.109)

##### Fixed

- **PR [#2741](nesquena/hermes-webui#2741)** by [@ai-ag2026](https://github.com/ai-ag2026): Keep the sidebar conversation actions menu open while session-list refreshes, stream updates, or panel-resync repairs arrive. Previously the three-dot menu beside chat titles could be torn down before the user finished clicking it because `renderSessionListFromCache()` rebuilt the row DOM (and the fixed-position menu's anchor) without checking whether the menu was open. The new early-return at the top of the refresh keeps the menu stable; destructive menu actions explicitly close the menu before they fire, so dismissal still works as expected.
- **PR [#2736](nesquena/hermes-webui#2736)** by [@ai-ag2026](https://github.com/ai-ag2026): Resync the chat sidebar after returning from Settings/Logs/other panels. The session list is virtualized, and the browser can clamp the preserved scrollTop during a panel transition; without a render after the chat view is visible again, stale virtual spacer/header DOM remained until the next manual scroll. The new `_resyncChatSidebarAfterPanelSwitch()` helper runs one guarded `requestAnimationFrame` after the panel becomes visible, bails if a rename input or action menu is open, and uses no polling.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/634
@b3nw b3nw deleted the fix/streaming-tool-segment-truncation branch June 1, 2026 14:17