
fix: reattach SSE on session-switch return + close leaked stream connections#2925

Closed
wirtsi wants to merge 9 commits into nesquena:master from wirtsi:fix/live-streams-connection-leak

@wirtsi
Contributor

@wirtsi wirtsi commented May 25, 2026

Summary

What was broken

  1. Connection leak (browser beach-ball). LIVE_STREAMS was never written to — the EventSource was kept in a closure variable but not tracked in the dictionary, so closeOtherLiveStreams() and closeLiveStream() were no-ops. Every session switch leaked a live EventSource. After a few switches, browser connection-pool exhaustion produced pending requests and the macOS beach-ball during long agent runs.
  2. Reattach skipped on return. Once the leak fix landed (closeOtherLiveStreams actually closes prior streams), loadSession() returning to a still-streaming session needed to reopen the SSE. The reattach gate is INFLIGHT[sid].reattach && activeStreamId, but reattach=true was only set on the storage-load path. An in-memory INFLIGHT entry stayed unflagged, so no new EventSource was opened on return — the user saw nothing until the final response landed via metadata refresh.
  3. _dirty_suffix silently dropped -dirty. _run_git() substitutes a synthetic "git exited with status N" diagnostic when both stdout/stderr are empty (which is exactly what diff-index --quiet does to signal a dirty tree). The naïve if not out guard always saw a truthy out and dropped the suffix — defeating dev-build cache busting (static/foo.js?v=… stayed identical between clean and dirty checkouts, so browsers kept serving stale assets after a local edit).

What changed

  • static/messages.js: _wireSSE() writes LIVE_STREAMS[activeSid]; closeLiveStream() now also sets INFLIGHT[sid].reattach = true (guarded) after closing, so loadSession's reattach branch fires on return. The reconnect handler bails out via _isSessionActivelyViewed() so an SSE closed intentionally during a session switch doesn't auto-reconnect in the background.
  • static/sessions.js: loadSession() calls closeOtherLiveStreams(sid) before fetching session metadata, so the previous session's EventSource is torn down deterministically (instead of leaking until the next attachLiveStream).
  • api/updates.py: _dirty_suffix() recognises both the empty-out and synthetic-diagnostic shapes as the dirty signal, keeping the _run_git() call path so the existing test mock still works.
  • api/routes.py, tests/test_regressions.py, tests/test_streaming_race_fix.py — small edits that came along with the broader connection-leak hardening already on this branch.
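The leak and its fix can be sketched in isolation. This is a minimal model, not the code in static/messages.js: `FakeEventSource`, `openStream`, the `track` flag, and the stream URL are hypothetical stand-ins, and only the names `LIVE_STREAMS`, `closeLiveStream`, and `closeOtherLiveStreams` come from the PR.

```javascript
// Hypothetical stand-in for the browser's EventSource; only close() matters here.
class FakeEventSource {
  constructor(url) { this.url = url; this.closed = false; }
  close() { this.closed = true; }
}

const LIVE_STREAMS = {}; // sessionId -> { streamId, source }

// Open a stream. `track` toggles the registry write that was missing on master.
function openStream(sessionId, streamId, track) {
  const source = new FakeEventSource(`/stream/${streamId}`); // illustrative URL
  if (track) LIVE_STREAMS[sessionId] = { streamId, source };
  return source;
}

function closeLiveStream(sessionId) {
  const entry = LIVE_STREAMS[sessionId];
  if (!entry) return false; // nothing tracked, nothing to close
  entry.source.close();
  delete LIVE_STREAMS[sessionId];
  return true;
}

function closeOtherLiveStreams(keepSessionId) {
  let closed = 0;
  for (const sid of Object.keys(LIVE_STREAMS)) {
    if (sid !== keepSessionId && closeLiveStream(sid)) closed++;
  }
  return closed;
}

// Without the write, the registry is empty and A's stream survives the switch.
const leaked = openStream("A", "s1", false);
const closedUntracked = closeOtherLiveStreams("B");

// With the write, switching to B closes A's stream.
const tracked = openStream("A", "s2", true);
const closedTracked = closeOtherLiveStreams("B");
```

In the untracked case `closeOtherLiveStreams("B")` iterates an empty object, so `leaked.closed` stays `false`; in the tracked case the same call closes exactly one source.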

Tests

  • tests/test_inflight_stream_reuse.py — 3 new regression tests pin the chain: closeLiveStream marks reattach → closeOtherLiveStreams propagates the mark → loadSession's INFLIGHT branch keeps the gate shape that the mark feeds into.
  • tests/test_parallel_session_switch.py — two brittle substring tests rewritten with resilient regex matches so future inserts in the same loadSession reset block don't break the assertion.
  • tests/test_version_badge.py — exercises the corrected _dirty_suffix via the existing _run_git mock.
  • Full local suite: 6406 passed, 2 unrelated failures (gateway-sync network timeout in a CI-less environment).
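The difference between a brittle substring assertion and a resilient regex can be illustrated as follows. The project's tests are Python; this sketch uses JavaScript only to show the idea, and the source excerpts (`S.session=null;` and friends) are hypothetical, not lines from loadSession().

```javascript
// Hypothetical excerpts of a reset block, before and after an unrelated insert.
const before = "S.session=null;\nS.messages=[];\nS.toolCalls=[];";
const after =
  "S.session=null;\ncloseOtherLiveStreams(sid);\nS.messages=[];\nS.toolCalls=[];";

// Brittle: requires the two statements to be directly adjacent.
const brittle = (src) => src.includes("S.session=null;\nS.messages=[];");

// Resilient: requires the same order but tolerates lines in between.
const resilient = (src) => /S\.session=null;[\s\S]*?S\.messages=\[\];/.test(src);

const results = {
  brittleBefore: brittle(before),
  brittleAfter: brittle(after),
  resilientBefore: resilient(before),
  resilientAfter: resilient(after),
};
```

The substring check fails as soon as a new statement lands between the two it pins, while the regex keeps asserting the ordering that the test actually cares about.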

Test plan

  • Start chat A with a long prompt; click +; send a brief message in B; click back to A; confirm tokens stream live.
  • Repeat with a model that has visible reasoning vs. plain token output to ensure both paths reattach.
  • Confirm static/messages.js?v=… URL gains -dirty on local edits and busts the browser cache.

🤖 Generated with Claude Code

@nesquena-hermes
Collaborator

Summary

Reading the diff against origin/master and the actual chain in static/messages.js:610-628, the four-part fix (track the EventSource, mark for reattach on teardown, restore accumulators on reconnect, suppress reconnect for backgrounded sessions) lines up correctly. The non-obvious part — that the same patch had to land in four places to work end-to-end — is exactly the cluster of regressions #2924 reports, and the regression tests pin the chain so a future refactor can't half-fix it.

Code reference — the missing LIVE_STREAMS write was the headline bug

On origin/master, LIVE_STREAMS is declared but never written to. _wireSSE opens an EventSource and keeps the handle in a closure, but the dictionary stays empty, so closeOtherLiveStreams() and closeLiveStream() iterate over {} and silently no-op. Every session switch leaked a live stream. The branch adds the missing write at static/messages.js:1478:

function _wireSSE(source){
  // Track the EventSource in LIVE_STREAMS so closeOtherLiveStreams() /
  // closeLiveStream() can close it when switching sessions, and so
  // reconnect can detect an already-connected stream.
  LIVE_STREAMS[activeSid]={streamId,source};

Without this, the rest of the chain has nothing to operate on. With it, closeOtherLiveStreams(sid) from static/sessions.js:577 (added in this PR) actually closes the prior session's EventSource on switch.

Code reference — reattach gate

The follow-on bug is the part most likely to be missed in review. Once the leak is sealed, returning to a still-streaming session needs to reopen the SSE. The reattach branch in loadSession() is gated on INFLIGHT[sid].reattach, but that flag was only set on the storage-load path. The branch fixes this at static/messages.js:627:

if(INFLIGHT[sessionId]) INFLIGHT[sessionId].reattach=true;

closeLiveStream() is now the single chokepoint that marks reattach. The teardown call at _clearOwnerInflightState() runs before _closeSource() in the terminal-state path, so INFLIGHT[sessionId] is already gone there and the guarded write is a no-op — exactly the behaviour you want for the clean-finish path. The regression test at tests/test_inflight_stream_reuse.py:104-122 pins this precise shape.
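A minimal sketch of the guarded write and the gate it feeds, under stated assumptions: `shouldReattach` is a hypothetical reduction of the condition in loadSession() to a predicate, `fakeSource` stands in for an EventSource, and the shape of the INFLIGHT entries is invented for illustration.

```javascript
const INFLIGHT = {};     // sessionId -> { messages, reattach? } (assumed shape)
const LIVE_STREAMS = {}; // sessionId -> { streamId, source }

function closeLiveStream(sessionId) {
  const entry = LIVE_STREAMS[sessionId];
  if (entry) {
    entry.source.close();
    delete LIVE_STREAMS[sessionId];
  }
  // Guarded write: only sessions that still have in-flight state get marked.
  if (INFLIGHT[sessionId]) INFLIGHT[sessionId].reattach = true;
}

// Hypothetical predicate form of the gate: INFLIGHT[sid].reattach && activeStreamId.
function shouldReattach(sessionId, activeStreamId) {
  const entry = INFLIGHT[sessionId];
  return Boolean(entry && entry.reattach && activeStreamId);
}

const fakeSource = () => ({ closed: false, close() { this.closed = true; } });

// Session switch: A is still streaming when the user leaves.
INFLIGHT.A = { messages: [] };
LIVE_STREAMS.A = { streamId: "s1", source: fakeSource() };
const beforeClose = shouldReattach("A", "s1"); // false: flag never set in memory
closeLiveStream("A");
const afterClose = shouldReattach("A", "s1");  // true: teardown marked the entry

// Clean finish: in-flight state is cleared first, so the write is skipped.
INFLIGHT.B = { messages: [] };
LIVE_STREAMS.B = { streamId: "s2", source: fakeSource() };
delete INFLIGHT.B;
closeLiveStream("B");
const finishedHasEntry = "B" in INFLIGHT;      // false: nothing was resurrected
```

The existence check is what keeps the terminal-state path safe: an unguarded assignment would throw on the missing entry, or, if it created one, leave a stale reattach request behind for a run that has already finished.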

Code reference — reconnect guard

static/messages.js:2113-2116 short-circuits the error-handler's reconnect when the user has navigated away:

if(!_isSessionActivelyViewed(activeSid)) return;

This is the right place to put it — source.close() runs first, so the EventSource is torn down whether or not we attempt reconnect, but the explicit _isSessionActivelyViewed check prevents _reconnectAttempted from re-firing a background stream the user just walked away from. Combined with LIVE_STREAMS[activeSid]={streamId,source} so closeOtherLiveStreams() can find it, no orphan streams remain after a switch.
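The ordering described above (close first, then check visibility, then decide on reconnect) might look like this in isolation. `makeErrorHandler`, `currentSessionId`, and the `reconnect` callback are hypothetical; the single-retry flag is an assumption inferred from the name `_reconnectAttempted`.

```javascript
let currentSessionId = "A"; // assumption: the session currently shown in the UI

function _isSessionActivelyViewed(sessionId) {
  return sessionId === currentSessionId;
}

// Build an error handler for one stream; `reconnect` is a hypothetical callback.
function makeErrorHandler(activeSid, source, reconnect) {
  let reconnectAttempted = false;
  return function onError() {
    source.close();                                   // always tear down first
    if (!_isSessionActivelyViewed(activeSid)) return; // user walked away: stay closed
    if (reconnectAttempted) return;                   // assumed: one immediate retry
    reconnectAttempted = true;
    reconnect(activeSid);
  };
}

const reconnects = [];
const sourceA = { closed: false, close() { this.closed = true; } };
const onErrorA = makeErrorHandler("A", sourceA, (sid) => reconnects.push(sid));

currentSessionId = "B"; // user switches away from A
onErrorA();             // backgrounded stream: closed, no reconnect
const afterSwitch = reconnects.length;

currentSessionId = "A"; // user returns to A
onErrorA();             // foreground: reconnect once
onErrorA();             // second error: the retry is already spent
const afterReturn = reconnects.length;
```

Placing the visibility check after `source.close()` means a backgrounded stream is still released; only the decision to open a replacement depends on what the user is looking at.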

_dirty_suffix correctness

_run_git() at api/updates.py:104 substitutes "git exited with status N" when both stdout and stderr are empty. git diff-index --quiet HEAD -- is designed to produce exactly that shape on a dirty tree (exit 1, no output). The previous return "-dirty" if not out else "" always saw truthy out and dropped the suffix. The branch correctly handles both:

if not out or out.startswith('git exited with status '):
    return "-dirty"
return ""

This restores cache busting on dirty dev checkouts — without it, static/messages.js?v=… was identical between clean and dirty builds, so browsers kept serving stale assets after a local edit. Real errors (timeout text, "git executable not found") carry distinct diagnostics and correctly fall through to "".

Note — the api/routes.py diff is whitespace-only

The api/routes.py hunk in the diff (84 lines changed) is the indentation undo from commit c0a683bd "fix: suppress hidden-tab polling and narrow CHAT_LOCK scope", where _handle_chat_sync got its CHAT_LOCK scope narrowed — that lives only on this branch, so against master the diff has to re-indent the AIAgent construction block back inside with CHAT_LOCK:. Worth flagging in the PR description so reviewers don't waste time looking for a behavioural change inside it. If you want a smaller blast radius for the SSE fix, splitting that earlier CHAT_LOCK narrowing into a separate PR would make the streaming PR more obviously contained.

Verification

CI is green (3.11/3.12/3.13 SUCCESS), and the repro steps in the PR description match the chain the tests pin — start A long, switch to B and send, return to A → tokens stream live because (1) the EventSource for A was actually closed on switch, (2) INFLIGHT.reattach was marked, (3) loadSession's reattach branch reopens via attachLiveStream with {reconnecting:true}, (4) the new closure restores assistantText from the last live INFLIGHT message so the prefix isn't doubled or dropped.
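Step (4) can be sketched as a small helper that picks the starting accumulators for a new stream closure. The helper name `initialAccumulators` and the message fields `content` and `reasoning` are assumptions for illustration; the PR only states that the values come from the `_live` assistant message in INFLIGHT[activeSid].messages.

```javascript
// Assumed message shape: the streaming assistant message carries `_live: true`.
function initialAccumulators(inflightEntry, reconnecting) {
  // First connect (or no in-flight state): start from empty strings.
  if (!reconnecting || !inflightEntry) {
    return { assistantText: "", reasoningText: "" };
  }
  // Reconnect: find the most recent live assistant message.
  const messages = inflightEntry.messages || [];
  let live;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i] && messages[i]._live) { live = messages[i]; break; }
  }
  return {
    assistantText: (live && live.content) || "",
    reasoningText: (live && live.reasoning) || "",
  };
}

const inflight = {
  messages: [
    { role: "user", content: "Explain SSE." },
    { role: "assistant", content: "Server-Sent Events are", reasoning: "Start simple.", _live: true },
  ],
};

const first = initialAccumulators(inflight, false);
const resumed = initialAccumulators(inflight, true);

// New tokens append to the restored prefix instead of replacing it.
const continued = resumed.assistantText + " a one-way stream.";
```

Starting the reconnected closure from the restored text is what keeps the already-rendered prefix on screen; starting from empty strings would make the next token overwrite it.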

@nesquena-hermes
Collaborator

Holding for @nesquena review: this PR modifies 4 existing test files (test_inflight_stream_reuse, test_parallel_session_switch, test_regressions, test_streaming_race_fix), which signals an intentional contract change in the SSE reattach behavior. Compared to the SSE work we just shipped in #2928 (Release DH, v0.51.136), the surfaces overlap heavily.

What needs to be answered before merge:

  1. Does this PR's behavior agree with or contradict the "single live source per stream" guarantee of #2928 (fix(chat): keep one live SSE source per stream)?
  2. The test assertion changes — are they redefining what we considered a regression contract, or just adjusting to new internal helper names?
  3. Is the "close leaked stream connections" part actually fixing a separate leak, or duplicating what #2928 (fix(chat): keep one live SSE source per stream) already does in _wireSSE?

Flagging for @nesquena. @wirtsi, thank you — just needs a careful conflict-with-shipped-fix check.

@darickmunroec

Reproduced on iOS mobile (Safari PWA). When switching sessions or locking screen, SSE drops and messages won't appear until fully loaded. Makes the mobile experience unusable. Looking forward to this fix! 🙏

@franksong2702
Contributor

franksong2702 commented May 26, 2026

I re-read this against current origin/master, #2347, and the already-merged #2928. My read is that the core of this PR does not contradict #2928's one-live-SSE-source invariant; it completes the missing half of that invariant.

#2928 made LIVE_STREAMS authoritative enough that old EventSources can actually be closed. That is correct, but it creates the state described in #2924: the original session can still have an in-memory INFLIGHT[sid] entry and an active server-side active_stream_id, while its browser EventSource has been closed. The existing loadSession() return path only reopens the SSE when INFLIGHT[sid].reattach && activeStreamId is true, and the in-memory path never sets that flag. So the stream is silently disconnected until final session refresh lands.

That makes the closeLiveStream() -> INFLIGHT[sessionId].reattach = true part the key missing invariant. It also restores the user-facing guarantee from #2347: switching away and back should preserve the live working scene, not just the last DOM snapshot before the stream was cut.

If this PR is being held because the diff is broad, I would support narrowing it rather than dropping it: keep the reattach marking, reconnect accumulator restore, background reconnect guard, and the focused tests for the close/reattach/loadSession chain; split the unrelated _dirty_suffix / cache-busting cleanup and any incidental route/test churn if needed. #2958 looks adjacent but separate: that is live timeline/interim-progress presentation, while #2924/#2925 is the transport reattach break.

@nesquena-hermes
Collaborator

Stale base — naive merge would revert v0.51.137 → v0.51.143 (6 releases)

Re-checked this PR against current master while doing the full hold-bucket reassessment.

Detection:

  • Merge-base with master is 48a2e792 (post-v0.51.137).
  • Since then we shipped DJ → DK → DL → DM → DN → DO on top of DI (6 releases, 35+ PRs).
  • Tip-vs-master diff: 131 files changed, +608/-3385.
  • Tip-vs-merge-base diff: 8 files changed, +208/-59.
  • The 3000+ deletions = 6 releases worth of shipped code that would be reverted by a naive merge.

Specific conflicts already detected:

Path forward:
Rebase onto current origin/master (currently 3f22e547). The PR's actual change is small (8 files, +208/-59 on the merge-base diff), so the rebase should be tractable — just resolve conflicts in:

Once rebased, please reply here and I'll do the deep review + ship it. The underlying SSE-leak + reattach fix is exactly the iOS Safari beach-ball symptom from #2924 that several users have reported — we want this landed.

cc @wirtsi

Florian Krause and others added 9 commits May 27, 2026 07:54
…a#2024)

- New api/agent_subprocess.py: stripped agent worker that runs in a separate
  multiprocessing.Process. All heavy hermes-agent imports happen inside the
  subprocess, keeping the main HTTP process free.

- api/streaming.py: add _run_agent_streaming_subprocess() which:
  1. Creates a multiprocessing.Queue + Event for IPC
  2. Spawns a relay thread to forward events from the MP queue to STREAMS
  3. Starts the agent subprocess via _start_agent_subprocess()
  4. Waits for process exit, then captures the final result
  5. Merges messages back into the session and emits done/error/cancel

- api/routes.py: switch call sites from _run_agent_streaming to
  _run_agent_streaming_subprocess for /api/chat/start, /api/btw, and
  /api/background.

- cancel_stream() now also signals the MP cancel event so the subprocess
  exits early.

Trade-off: agent cache is lost per-turn (fresh AIAgent each time). Session
state is still preserved because sessions are file-backed. The subprocess
incurs ~1-2s cold-start on first use but keeps the HTTP server responsive
during long agent runs.

time.sleep(0) in the put() callback releases the GIL after each token/tool
event, giving HTTP handler threads a chance to serve /api/sessions and other
endpoints during long agent runs.

Two intertwined bugs fixed:

1. LIVE_STREAMS was never written to — the EventSource created by
   attachLiveStream() was stored in a closure variable but never tracked
   in the LIVE_STREAMS dictionary. This meant closeOtherLiveStreams()
   and closeLiveStream() were no-ops (iterating an empty object). Every
   session switch leaked the old SSE connection, which kept pumping token
   events into the orphaned closure, flooding the browser main thread and
   causing the macOS beach ball during long agent runs.

   Fix: store {streamId, source} in LIVE_STREAMS[activeSid] inside
   _wireSSE() so closeOtherLiveStreams() actually closes the previous
   session's EventSource when switching.

2. When switching away from a running chat and back, attachLiveStream()
   with {reconnecting: true} started with empty assistantText and
   reasoningText, losing all progress. The new SSE connection would
   append new tokens to nothing — the already-rendered response vanished.

   Fix: on reconnect, restore assistantText and reasoningText from
   INFLIGHT[activeSid].messages (the _live assistant message) instead of
   starting from empty strings.

Also removes the time.sleep(0) GIL-yield in streaming.py — the stall was
browser-side (connection leak → event flood → main thread freeze), not
Python-side. ThreadingHTTPServer serves requests in separate threads and
run_conversation() runs in a daemon thread; the GIL is not the bottleneck.

The source-level assertions in test_streaming_race_fix and
test_regressions now accept both the original empty-string init
('' for first connect) and the conditional restore from INFLIGHT
(for reconnect).

…onnect

1. sessions.js: Call closeOtherLiveStreams() in loadSession() when
   switching away from a session. This ensures the old session's
   EventSource is closed, stopping token events from flooding the
   main thread. Previously only called inside attachLiveStream(),
   which is not invoked for idle sessions — leaving leaked SSE
   connections that froze the browser.

2. messages.js: Store EventSource in LIVE_STREAMS inside _wireSSE()
   so closeOtherLiveStreams() and closeLiveStream() actually work.
   LIVE_STREAMS was never written to, making both functions no-ops.

3. messages.js: Restore assistantText/reasoningText from INFLIGHT
   on reconnect so the already-rendered content survives the
   session switch. The StreamChannel replays buffered gap events
   which correctly append to the restored state.

4. tests: Update assertions to accept the new conditional init
   pattern for reconnection accumulator restoration.

The root cause of the browser beach ball: switching sessions didn't
close the old session's SSE EventSource, which kept pumping token/
reasoning events through its closure into the browser main thread.

Closing the EventSource triggers its 'error' handler which auto-
reconnects. Added _isSessionActivelyViewed() guard to the error handler
so it won't reconnect when the user has switched to a different session.

Also reverted the syncInflightAssistantMessage reordering: it needs
to run even when backgrounded to keep INFLIGHT data up-to-date for
reconnection.

Two related fixes:

1. messages.js — closeLiveStream() now flags INFLIGHT[sid].reattach=true
   after tearing down the EventSource. Previously this flag was only set by
   the storage-load path in sessions.js loadSession(), so an in-memory
   INFLIGHT entry stayed unflagged through the session switch. When the user
   returned to the still-streaming session, the reattach branch in
   loadSession() was skipped and the SSE was never reopened — the user saw
   no live tokens until the server-side run completed and a metadata
   refresh swapped in the final reply.

   Guarded by an existence check so the terminal-state teardown path
   (_clearOwnerInflightState() runs before _closeSource()) remains a safe
   no-op.

2. api/updates.py — _dirty_suffix() silently dropped the `-dirty` suffix on
   any dirty working tree. The previous implementation routed through
   _run_git(), which packaged a synthetic "git exited with status 1"
   diagnostic into stdout for non-zero exit codes. diff-index --quiet
   uses exit code 1 to *signal* dirty (not an error), so the `if not out`
   guard always saw a non-empty `out` and skipped the suffix. As a result
   the static-asset cache-busting query string (`?v=<WEBUI_VERSION>`) was
   identical for a clean and dirty checkout — browsers kept serving the
   pre-edit JS during local development. Call subprocess directly and
   check for the `returncode == 1, no stdout/stderr` shape that
   diff-index --quiet uses.

Tests:
- 3 new regression tests in test_inflight_stream_reuse.py pin the
  closeLiveStream → reattach chain (fail on master, pass with fix).
- 2 tests in test_parallel_session_switch.py rewritten with a more
  resilient regex match so unrelated inserts in the same loadSession
  reset block (like the closeOtherLiveStreams call added earlier on
  this branch) don't break the assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous attempt bypassed _run_git() and called subprocess directly,
which broke test_dirty_check_appends_suffix_when_fast (the test mocks
_run_git, so a direct subprocess.run() escapes the mock).

Restore the _run_git() call path. The trick is that _run_git() packs a
synthetic "git exited with status N" diagnostic into its return value
when both stdout and stderr are empty — which is exactly what
`diff-index --quiet` does to *signal* a dirty tree (exit 1, no output).
Treat that synthetic shape AND an empty `out` as the dirty signal; real
errors (timeouts, missing git, repo-not-found) come with their own
diagnostic and correctly suppress the suffix so the base version
remains visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@wirtsi wirtsi force-pushed the fix/live-streams-connection-leak branch from 587f2b3 to 07ed25e Compare May 27, 2026 05:55
@wirtsi
Contributor Author

wirtsi commented May 27, 2026

Thanks @nesquena-hermes and @franksong2702 for the detailed reads — the rebase-vs-merge call and the "reattach marking is the missing invariant" framing were both really helpful for keeping the change focused.

Rebased onto current origin/master (now 329debcd, v0.51.145) — clean, no conflicts. The branch is 9 commits ahead / 0 behind, and the only diff vs master is the SSE leak + reattach chain plus its focused tests. Full local suite is green (one unrelated network-timeout flake in test_gateway_sync aside).

Ready for the deep review whenever you have a chance — thanks again for the thorough input.

dso2ng pushed a commit to dso2ng/hermes-webui that referenced this pull request May 28, 2026
…ections (nesquena#2925)

Squash-merged pr-2925 into stage-batch33. Closes nesquena#2924.
dso2ng pushed a commit to dso2ng/hermes-webui that referenced this pull request May 28, 2026
3-PR mid-risk batch: SSE reattach + title-lang + composer cap (nesquena#2925, nesquena#2984, nesquena#2946)
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request May 29, 2026
…➔ 0.51.157) (#699)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/nesquena/hermes-webui](https://github.com/nesquena/hermes-webui) | patch | `0.51.145` → `0.51.157` |

---

### Release Notes

<details>
<summary>nesquena/hermes-webui (ghcr.io/nesquena/hermes-webui)</summary>

### [`v0.51.157`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051157--2026-05-28--Release-EC-stage-batch39--5-PR-mixed-risk-cleanup-gateway-prefill-forward--prefill-budget--compressed-continuation-sidebar--browser-transcript-memory-guidance--reasoning-max-parity)

[Compare Source](nesquena/hermes-webui@v0.51.156...v0.51.157)

##### Added

- The reasoning-effort selector now offers a `max` level, matching the agent's `hermes_constants.VALID_REASONING_EFFORTS`. This restores parity with the underlying set (the WebUI mirror previously stopped at `xhigh`) so providers such as Anthropic that support the `max` thinking level are selectable from the composer dropdown and the `/reasoning` command.

##### Changed

- WebUI's browser-session surface prompt now explicitly tells agents not to dump browser transcripts into external notes or durable memory by default; it limits saving to explicit captures and clearly reusable durable signals such as preferences, decisions, blockers, and runbook-worthy workflows.

##### Fixed

- Gateway-backed WebUI chat now forwards configured prefill/session-recall context and a compact WebUI session-context block into delegated Gateway turns, so browser sessions retain note recall, connected-platform awareness, and delivery hints instead of sending only the latest user message. If the dynamic prefill script fails, WebUI falls back to the configured static router prefill when available.
- Oversized WebUI startup prefill payloads now respect a configurable context budget (`webui_prefill_context_max_chars` / `HERMES_WEBUI_PREFILL_CONTEXT_MAX_CHARS`, default 12,000 chars). When a dynamic prefill script exceeds the budget and a compact static prefill file is configured, WebUI falls back to the compact file; otherwise it injects a small retrieval instruction instead of dumping the full note/body payload into every new chat.
- Sidebar now keeps the newest active continuation visible when it has more recent activity than an older fuller pre-compression snapshot in the same lineage. Adds lineage-aware dedupe for WebUI-origin state-db projections, restores normal context-only turns into the visible transcript after compression while preserving order, and recognizes `[Session Arc Summary]` as a compression marker so it isn't backfilled into the chat transcript.

### [`v0.51.156`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051156--2026-05-28--Release-EB-stage-batch38--2-PR-Tier-B-cleanup-WebUI-requestruntime-hardening--chat-start-provider-fallback)

[Compare Source](nesquena/hermes-webui@v0.51.155...v0.51.156)

##### Fixed

- Hardened WebUI request/session/runtime edges: malformed request body lengths are rejected before reads, session writes reject unsafe IDs, auth session/login-attempt maps avoid unsynchronized mutation, and successful password login clears stale rate-limit failures.
- Hardened frontend startup and navigation fallbacks: early storage access now survives blocked `localStorage`, stale session recovery preserves subpath mounts, session URL generation removes both legacy session query aliases, canceling a stream closes the local EventSource, and the PWA shell precaches same-origin markdown/KaTeX vendor assets.
- Added missing i18n keys used by command, cron, provider, search/default, and session-rename UI paths across supported locales so missing translations fall back to labels instead of raw key names.
- Made workspace Git tests pin their temporary repository branch to `master` so the suite is independent of the host Git default-branch setting.
- Browser chat start and queued-turn payloads now fall back to the selected/persisted provider only when it belongs to the same model being sent, preventing fresh sessions from sending a dropdown-selected model with `model_provider=null`.

### [`v0.51.155`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051155--2026-05-28--Release-EA-stage-batch37--3-PR-very-low-risk-cleanup-passive-timeout-toasts--sidecar-order--subsecond-timestamps)

[Compare Source](nesquena/hermes-webui@v0.51.154...v0.51.155)

##### Fixed

- Passive background refreshes such as sidebar/project polling, health checks, cron-status watches, and client-event logging no longer surface generic timeout toasts; explicit user actions still show timeout errors. (Related to [#&#8203;3024](nesquena/hermes-webui#3024))
- Messaging/session display merges now preserve sidecar transcript order when the sidecar already contains at least as many rows as the mirrored state store, avoiding role/content fallback sorting when timestamp precision collapses.
- Gateway-backed turns and compacted/reconciled message batches now keep subsecond timestamp ordering instead of assigning the same integer-second timestamp to multiple transcript rows.

### [`v0.51.154`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051154--2026-05-28--Release-DZ-stage-batch36--9-PR-medium-risk-cleanup-cron-project-chip--KaTeX-streaming--recovery--env-keys--discoverability-repair--media-MEDIA-tokens--gateway-401--notes-prefill--cron-filter)

[Compare Source](nesquena/hermes-webui@v0.51.153...v0.51.154)

##### Added

- Session discoverability audit now has a default-dry-run `--repair-safe` routine for deterministic cleanup: stale persisted WebUI-as-CLI flags can be cleared from sidecars/index entries, and messageful WebUI rows present only in `state.db` can be materialized into sidecars/index entries when `--apply --backup-dir <dir>` is explicitly provided.

##### Changed

- The third-party notes drawer's "Recently used by AI" list now follows the provider-neutral WebUI-specific `HERMES_WEBUI_PREFILL_MESSAGES_SCRIPT` / `webui_prefill_messages_script` hook when configured, including argv-style hooks such as `[python3, /path/to/recall.py]` and command strings such as `python3 /path/to/recall.py`, before falling back to the legacy generic `prefill_messages_script`. Configured third-party notes sources such as Joplin, Obsidian, Notion, and llm-wiki remain visible even before runtime tool inventory hydrates.

##### Fixed

- Streaming KaTeX render passes now skip parser-owned equation placeholders that may still be receiving text, preventing long equations from being marked rendered before the final parser flush completes. ([#&#8203;2976](nesquena/hermes-webui#2976))
- Cron sessions assigned to the dedicated Cron Jobs project now remain hidden from the default sidebar while still appearing when that project chip is selected.
- Compression parent sessions are no longer repaired as stale interrupted turns when a continuation already exists, preventing false "Response interrupted" markers and hidden continuation rows after auto-compression session rotation. (Refs [#&#8203;2361](nesquena/hermes-webui#2361))
- Empty partial activity rows preserved from cancelled turns no longer define sidebar recency, anchor the initial paginated message window, or get restored after newer completed turns. Long sessions with old activity-only partials after recent replies now stay grouped by their latest real message and open on the recent readable transcript. ([#&#8203;3057](nesquena/hermes-webui#3057))
- Local `MEDIA:` image tokens in chat history now include the current session id and can render exact image paths already present in that session transcript, so agent-generated artifacts outside the active workspace no longer show as broken thumbnails while arbitrary local paths remain blocked.
- Gateway-backed browser chat now turns Gateway API Server 401s into a specific `gateway_auth_error` explaining that `HERMES_WEBUI_GATEWAY_API_KEY` must match `API_SERVER_KEY`, instead of surfacing the Gateway's generic "Invalid API key" body as if the model provider key failed. The browser error renderer recognizes this event type as "Gateway authentication failed" instead of falling back to a generic "Error" heading. `/api/health/agent` also reports redacted gateway-chat configuration status (`enabled`, backend, base URL configured, API key configured) as an operator diagnostic payload; it is not currently rendered as a user-facing health banner.
- New profiles with an API key supplied at create time now write the key to the profile's `.env` under the correct provider-specific variable (e.g. `KIMI_API_KEY`, `DEEPSEEK_API_KEY`) at mode 0o600, instead of writing it to `config.yaml` where Hermes Agent never reads it.

### [`v0.51.153`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051153--2026-05-28--Release-DY-stage-batch35--11-PR-low-risk-cleanup-title-language--clarify-SSE--upload-filename--discoverability--SSE-reconnect--gateway-image--docker-docs)

[Compare Source](nesquena/hermes-webui@v0.51.152...v0.51.153)

##### Changed

- Local fallback title generation no longer has a German-only `Session Bilder` special case; it now uses the same generic topic extraction path as other fallback titles. (Refs [#&#8203;3040](nesquena/hermes-webui#3040))
- Title-generation prompts now use the same language-neutral "match the user language" instruction for every locale instead of adding German-only exemplars. (Refs [#&#8203;3040](nesquena/hermes-webui#3040))
- Session discoverability audit findings for stale persisted WebUI-as-CLI flags now report whether an API-visible lineage representative already covers the hidden snapshot, including the representative session id in JSON and Markdown output.

##### Fixed

- Title-language detection no longer treats common English tech/jargon text such as "session die" or DAS/DER references as German just because of shared tokens. (Refs [#&#8203;3040](nesquena/hermes-webui#3040))
- Clarify prompt SSE fallback polling now preserves its owner session id, matching approval polling behavior so terminal events from another session cannot stop the active clarify fallback poller.
- Duplicate chat uploads now report the actual stored filename in `/api/upload` responses, so suffixed files such as `photo-1.png` do not appear under the original basename in WebUI attachment metadata.
- Visible but unfocused chat windows now still attempt the immediate SSE reconnect for the current session; only a real session switch skips the reconnect path. (Refs [#&#8203;3040](nesquena/hermes-webui#3040))
- Gateway-backed WebUI chat now forwards current-turn image attachments as OpenAI-style multimodal `image_url` parts when native image input is enabled, matching the legacy WebUI runtime's image handoff.
- New chat sessions reset `_messagesTruncated` / `_oldestIdx` so a fresh conversation never displays the stale "Scroll up or click to load older messages" indicator inherited from a previously-paginated session.
- `openai-codex` reasoning-effort resolution now lets the existing `models.dev` metadata pass set the supported levels (including `xhigh`) instead of being silently clipped through the Copilot model heuristic.
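
The duplicate-upload fix above comes down to reporting the name the file was actually stored under, not the name the client sent. A minimal sketch of that idea follows. The function `store_upload` and the `filename` response key are hypothetical names chosen for illustration, not the project's actual `/api/upload` handler.

```python
from pathlib import Path


def store_upload(directory, filename, data):
    """Write `data` under a non-colliding name and report the stored name.

    Hypothetical helper: the suffixing scheme (photo.png -> photo-1.png)
    follows the changelog example; everything else is an assumption.
    """
    directory = Path(directory)
    safe_name = Path(filename).name  # drop any directory components
    stem, suffix = Path(safe_name).stem, Path(safe_name).suffix

    candidate = directory / safe_name
    counter = 0
    while candidate.exists():
        counter += 1
        candidate = directory / f"{stem}-{counter}{suffix}"

    candidate.write_bytes(data)
    # Return the name on disk, which may differ from the requested basename.
    return {"filename": candidate.name, "size": len(data)}
```

The existence check and the write are not atomic here, so a real handler serving concurrent uploads would likely need a lock or an exclusive-create open.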

##### Documentation

- Clarify two Docker onboarding traps: `sudo docker compose` can mount `/root/.hermes` instead of the user's Hermes home on Linux, and Linux Docker Engine users should use a `host-gateway` alias such as `api.local` for host-local model servers instead of configuring `localhost` inside the container. ([#3006](nesquena/hermes-webui#3006), [#3012](nesquena/hermes-webui#3012))

### [`v0.51.152`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051152--2026-05-28--Release-DX-stage-batch34--single-PR-optional-gateway-backed-browser-chat)

[Compare Source](nesquena/hermes-webui@v0.51.151...v0.51.152)

##### Added

- Browser chat can now opt into a default-off `HERMES_WEBUI_CHAT_BACKEND=gateway` bridge that routes new WebUI turns through a running Hermes Gateway API server while preserving the existing WebUI chat start/stream contract. Strict enable: only the literal values `gateway`, `api_server`, or `api-server` activate the bridge; generic truthy strings like `1` or `true` keep the legacy in-process WebUI runtime. Configurable via `HERMES_WEBUI_GATEWAY_BASE_URL` (default `http://127.0.0.1:8642`) and `HERMES_WEBUI_GATEWAY_API_KEY` (falls back to `API_SERVER_KEY`). New `api/gateway_chat.py` module isolates the bridge logic; the existing direct WebUI chat path is unchanged when the env/config is not set. ([#3021](nesquena/hermes-webui#3021))
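
The strict-enable rule described above can be sketched as follows. The environment variable names, accepted values, default URL, and key fallback come from the changelog entry. The function names are hypothetical, and trimming surrounding whitespace is an assumption. This is not the code in `api/gateway_chat.py`.

```python
import os

# Only these literal values opt in to the gateway bridge (per the entry above).
_GATEWAY_BACKEND_VALUES = frozenset({"gateway", "api_server", "api-server"})
_DEFAULT_GATEWAY_BASE_URL = "http://127.0.0.1:8642"


def gateway_chat_enabled(env=None):
    """Return True only for an explicit opt-in value; '1' or 'true' stay off."""
    env = os.environ if env is None else env
    # Assumption: surrounding whitespace is ignored before the literal match.
    value = env.get("HERMES_WEBUI_CHAT_BACKEND", "").strip()
    return value in _GATEWAY_BACKEND_VALUES


def gateway_settings(env=None):
    """Resolve the base URL and API key with the documented fallbacks."""
    env = os.environ if env is None else env
    return {
        "base_url": env.get("HERMES_WEBUI_GATEWAY_BASE_URL") or _DEFAULT_GATEWAY_BASE_URL,
        "api_key": env.get("HERMES_WEBUI_GATEWAY_API_KEY") or env.get("API_SERVER_KEY"),
    }
```

Matching against an allowlist rather than testing truthiness means a stray `HERMES_WEBUI_CHAT_BACKEND=1` left in a shell profile cannot switch chat traffic onto the gateway by accident.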

### [`v0.51.151`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051151--2026-05-28--Release-DW-stage-batch33--3-PR-mid-risk-batch-SSE-reattach--title-lang--composer-cap)

[Compare Source](nesquena/hermes-webui@v0.51.150...v0.51.151)

##### Fixed

- Live SSE stream now reattaches when returning to a session that lost its connection during a session switch, closing the connection-leak window where stale `EventSource`s could accumulate. Also fixes `_dirty_suffix` silently dropping the `-dirty` suffix on dirty checkouts and yields the GIL after every SSE put so the HTTP server stays responsive under burst load. ([#2924](nesquena/hermes-webui#2924), [#2925](nesquena/hermes-webui#2925))
- Generated session titles now stay in the conversation language by adding an explicit title-generation instruction to the auxiliary prompt. Prevents the default prompt from drifting into English for non-English conversations. ([#2984](nesquena/hermes-webui#2984))
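
The `_dirty_suffix` fix in the first entry concerns `git diff-index --quiet`, which prints nothing and reports a dirty tree only through its exit status. Per the PR summary, a guard that inspects output text can therefore be fooled by a synthetic diagnostic string. A minimal sketch of deciding from the exit status instead is shown below. It calls `subprocess` directly rather than the project's `_run_git()`, and both function names are hypothetical.

```python
import subprocess


def suffix_from_returncode(returncode):
    """Map the exit status of `git diff-index --quiet` to a version suffix."""
    # With --quiet, git prints nothing: exit 0 means clean, exit 1 means dirty.
    # Any other status (for example 128 outside a repository) is treated as
    # unknown, and no suffix is added.
    return "-dirty" if returncode == 1 else ""


def dirty_suffix(repo_dir="."):
    """Return "-dirty" when the working tree differs from HEAD, else ""."""
    try:
        proc = subprocess.run(
            ["git", "diff-index", "--quiet", "HEAD", "--"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit is a signal here, not an error
        )
    except OSError:
        return ""  # git is not installed, or repo_dir does not exist
    return suffix_from_returncode(proc.returncode)
```

This covers only the plain `-dirty` decision. The later `-dirty-{digest}` cache-busting variant mentioned further down in the thread is not reproduced here.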

##### Changed

- Composer box max-width is now capped at 1600px on ultrawide viewports (≥1600px) so chips stay anchored against a content-sized boundary instead of stretching across 3440px+ displays. Maintainer-confirmed cap from the [#2856](nesquena/hermes-webui#2856) thread. ([#2946](nesquena/hermes-webui#2946))

### [`v0.51.150`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051150--2026-05-28--Release-DV-stage-batch32--single-PR-reasoning-effort-agent-metadata)

[Compare Source](nesquena/hermes-webui@v0.51.149...v0.51.150)

##### Fixed

- Reasoning-effort capability detection now consults Hermes Agent's `models.dev` metadata before falling back to WebUI-local provider/model prefix heuristics. xAI OAuth Grok models (e.g. `grok-4.3`) and other native/provider-specific catalogs that Hermes Agent already knows about now show the reasoning-effort chip without requiring a per-provider WebUI allowlist update. Existing exact-resolver paths (ACP unsupported, Copilot/GitHub effort subsets, OpenAI Codex, LM Studio live probing) keep their authoritative behavior; the new metadata lookup sits between those resolvers and the broad heuristic fallback. Metadata `supports_reasoning=False` is treated as authoritative so known non-reasoning variants stay hidden. Also: the no-query boot path of `/api/reasoning` now hydrates against the configured default model so the chip can populate before the front-end has session model context. ([#3017](nesquena/hermes-webui#3017))

### [`v0.51.149`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051149--2026-05-28--Release-DU-stage-batch31--hyphenated-session-ids--prefill-role-consistency)

[Compare Source](nesquena/hermes-webui@v0.51.148...v0.51.149)

##### Fixed

- Session IDs containing hyphens (e.g. API-issued `api-*` and gateway-issued `reachy-voice-*`) are now accepted by every filesystem-touching session validator: `Session.load`, `Session.load_metadata_only`, `_repair_stale_pending`, `/api/session/delete`, and `/api/session/worktree/remove`. Previously the load path accepted hyphens but the delete and worktree-remove routes rejected them with HTTP 400, producing a confusing "visible in sidebar but undeletable" UX. Refactors the duplicated character-class check into a shared `api.models.is_safe_session_id` helper with regression coverage at every call site. ([#3023](nesquena/hermes-webui#3023), [#3024](nesquena/hermes-webui#3024))
- Plain-text `webui_prefill_messages_script` output is now wrapped as a `user` prefill message instead of a `system` message, so dynamic recall context from notes/Obsidian/Joplin scripts becomes ordinary turn context rather than an extra system instruction. The JSON message-list escape hatch is unchanged: scripts that emit explicit `[{"role": "system", "content": "..."}]` still produce a system message. Avoids provider-specific multi-system-message footguns (Anthropic concatenation, OpenAI Responses-API divergence). ([#3009](nesquena/hermes-webui#3009))
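
The prefill rule in the last entry (plain text becomes a `user` message, an explicit JSON message list passes through) might look roughly like this. The function name and the exact validation of list items are assumptions for illustration, not the project's implementation.

```python
import json


def prefill_messages_from_script_output(output):
    """Turn raw prefill-script output into a list of chat messages."""
    text = output.strip()
    if not text:
        return []

    try:
        parsed = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        parsed = None

    # Escape hatch: an explicit message list keeps the roles the script chose.
    if isinstance(parsed, list) and all(
        isinstance(m, dict) and "role" in m and "content" in m for m in parsed
    ):
        return parsed

    # Anything else is treated as plain text and wrapped as ordinary turn context.
    return [{"role": "user", "content": text}]
```

Defaulting to `user` keeps a single system message per request unless the script explicitly asks for another one.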

### [`v0.51.148`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051148--2026-05-28--Release-DT-stage-batch30--single-PR-Insights-skill-usage-reader)

[Compare Source](nesquena/hermes-webui@v0.51.147...v0.51.148)

##### Added

- Insights page now shows a Skill Usage card after the LLM Wiki card, displaying per-skill cumulative invocation counts (uses / views / patches / share-%) from the agent-owned `.usage.json`. WebUI reads only; the agent (`tools/skills_tool.py`, `tools/skill_manager_tool.py`) is the single writer with `fcntl` locking, so there is no double-counting or write race. Empty-state shows when no skills have been used yet. Includes i18n keys for the 12 new strings across all supported locales. ([#3008](nesquena/hermes-webui#3008))

### [`v0.51.147`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051147--2026-05-28--Release-DS-stage-batch29--single-PR-streaming-ownership-cleanup-follow-up)

[Compare Source](nesquena/hermes-webui@v0.51.146...v0.51.147)

##### Fixed

- Settled stream cleanup helpers (`_restoreSettledSession`, `_handleStreamError`, `_deferStreamErrorIfPageHidden`, `_reattachOrRestoreAfterDeferredStreamError`) now thread the owning `EventSource` instance through every async deferred path, so a late error or settle callback from an older source can no longer tear down a newer reconnect source. Completes the ownership-aware cleanup pattern introduced by `closeLiveStream(sessionId, streamId, source)`. ([#2930](nesquena/hermes-webui#2930), [#3010](nesquena/hermes-webui#3010))

### [`v0.51.146`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v051146--2026-05-28--Release-DR-stage-batch28--6-PR-low-risk-safetycontrast-batch)

[Compare Source](nesquena/hermes-webui@v0.51.145...v0.51.146)

##### Fixed

- Dark-mode panel header save buttons now use a theme-aware foreground token, keeping workspace and other detail-pane check icons visible on the default gold accent. ([#2998](nesquena/hermes-webui#2998), [#3022](nesquena/hermes-webui#3022))
- Custom provider `/v1/models` discovery now uses a short per-endpoint timeout and gracefully skips slow or unreachable providers, reducing cold `/api/models` cache rebuild latency. ([#3024](nesquena/hermes-webui#3024), [#3025](nesquena/hermes-webui#3025))
- Messaging (Telegram-resumed) sessions: the `?messages=0` metadata fast path now routes through the same display merge as the full-message path, so `message_count` and `last_message_at` match the rendered transcript and stop triggering refresh-loops that reset scroll and close open dropdowns. ([#3003](nesquena/hermes-webui#3003))
- WebUI sessions mirrored into `state.db` for long-history retention now stay in the WebUI sidebar tab instead of being misclassified as CLI rows. Adds regression coverage for both directions of the WebUI vs CLI source-tab invariant. ([#3027](nesquena/hermes-webui#3027))
- Sidebar projection now keeps at least one messageful representative visible per non-background conversation when normal filters would otherwise hide every row, rescuing discoverability for sessions with stale snapshot/lineage metadata. Rescued rows are marked `discoverability_warning: rescued_messageful_hidden_session` for auditability; intentional background/cron sessions stay hidden. ([#3028](nesquena/hermes-webui#3028))

##### Added

- New read-only `api.session_discoverability` audit module cross-checks JSON sidecars, `_index.json`, `state.db`, and the live sidebar response to classify messageful sessions without a visible representative, stale WebUI-as-CLI source flags, missing sidecars, and lineage segments without a visible tip. Diagnostic surface only; does not repair, restart, or mutate any state. ([#3029](nesquena/hermes-webui#3029))

</details>

---


Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/699
@nesquena-hermes
Collaborator

Status note (no action needed from you yet, @wirtsi): this PR overlaps with #3005 (currently a draft, by another contributor) which takes a more comprehensive approach to the same session-switch SSE-reattach problem (#2924) — it also adds live-progress preservation. We're tracking both.

Keeping this on maintainer-review rather than closing it, because #3005 is still a draft and this PR is the review-ready option. Once #3005 either lands or stalls, we'll reconcile — if #3005 ships first and covers this PR's reattach + leaked-connection-close behavior, we'll close this with thanks; if #3005 stays draft, we'll bring this one forward on its own merits after a rebase onto current master (it's several releases stale now). No rush on your end; flagging so the overlap is visible.

@wirtsi
Contributor Author

wirtsi commented Jun 1, 2026

Looks like this already landed — 8408a3dd on master is the squash-merge of this branch (rolled into stage-batch33, merged 2026-05-28), and the changelog/release tags through v0.51.195 are sitting on top of it.

Just rebased the branch onto current upstream/master (1fcd81e3) to double-check: all conflicts resolved in master's favor (e.g. master tightened the SSE reconnect guard from `_isSessionActivelyViewed` to `_isSessionCurrentPane` in c197e0c0, and `_dirty_suffix` already routes through `_run_git` with the `-dirty-{digest}` cache-busting upgrade), and `git diff upstream/master..HEAD` is empty. Net: nothing left to merge.

Closing this PR since the work is already shipped. Thanks again @nesquena-hermes and @franksong2702 for the detailed reviews — much appreciated.


Labels

hold maintainer-review Maintainer fit-assessment needed — may not merge even with fixes

Development

Successfully merging this pull request may close these issues.

Live SSE stream does not reattach after session switch — user sees no live tokens until completion

4 participants