sync: upstream v0.51.295 #51

Merged
Du7chManiac merged 71 commits into master from sync/upstream-v0.51.295
Jun 6, 2026

@Du7chManiac

Catches the fork up from v0.51.232 to upstream's tip v0.51.295 (nesquena/hermes-webui) in a single --no-ff merge.

Conflict resolution

Additive / union (kept both):

  • CHANGELOG.md, README.md (newer compatibility section), api/config.py + static/boot.js + static/style.css (neon restore + new verdigris skin), static/i18n.js (verdigris in cmd_theme, all 12 locales), api/routes.py (anchored-fd TOCTOU helpers, profile-scoped rename event, POSIX as_posix() path fixes), static/sessions.js (_profileMatchesActiveProfile, named-custom fallback), server.py (Windows bind-retry).
  • tests/test_passkey_auth.py: kept the fork's importorskip("cryptography") and upstream's SimpleNamespace import.

.github/workflows/* reset to our master — the fork runs its own CI and the Actions GITHUB_TOKEN can't push workflow changes anyway.

Update banner (fork's standing decision preserved): the fork previously rejected upstream's single-banner / applyUpdates() redesign, and its own tests assert that. So:

  • Kept the multi-row webui/agent banner and applyUpdates(target).
  • Reset test_update_apply_ui.py / test_update_banner_fixes.py to the fork's versions.
  • Dropped upstream-added tests/test_issue3597_update_banner_position.py (it asserts the rejected single-banner position). ⚠️ Maintainer note: upstream's redesign also makes the banner visible from any panel — if you want that improvement ported into the fork's multi-row design, say so and I'll do it as a follow-up.

static/ui.js:

Validation

Full suite green locally apart from environmental failures: 8151 passed, 107 skipped. The only local failures are caused by the sandbox, and each group passes once the sandbox condition is removed:

  • git/worktree tests — fail only under the sandbox's forced commit-signing; pass with signing off (55 passed).
  • permission/workspace tests — fail only because the sandbox runs as root; pass as non-root (10 passed).

Neither condition exists in GitHub CI. Per policy this must be merged with a merge commit (never squash) to preserve the merge-base for the next sync.


Generated by Claude Code

nesquena-hermes and others added 30 commits June 3, 2026 09:34
## Release v0.51.233 — Release HA (stage-q3)

Single high-impact data-integrity fix.

### Fix
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3472 | @Mubashirrrr | Guard `/api/session/truncate` `keep_count` against a **silent persisted transcript wipe** (negative value sliced as `messages[:-N]`, deleting the newest N messages and saving to disk) and an HTTP 500 (non-numeric). Now validates before the destructive slice — non-int → 400, negative → 400 — mirroring the existing `/api/session/branch` guard. `keep_count=0` "clear all" semantics preserved. |
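
A minimal sketch of the validate-before-slice guard described in the table, not the actual handler. The function names are hypothetical, a `ValueError` stands in for the HTTP 400 response, and rejecting `bool` is an assumption the release note does not mention.

```python
def parse_keep_count(raw):
    """Return a validated non-negative int, or raise ValueError (stand-in for HTTP 400)."""
    # Assumption: bool is rejected too, since it is a subclass of int in Python.
    if isinstance(raw, bool) or not isinstance(raw, int):
        raise ValueError("keep_count must be an integer")
    if raw < 0:
        # A negative value would slice as messages[:-N] and drop the newest N messages.
        raise ValueError("keep_count must be non-negative")
    return raw


def truncate_messages(messages, raw_keep_count):
    """Validate first, then perform the destructive slice."""
    keep = parse_keep_count(raw_keep_count)
    return messages[:keep]  # keep == 0 still means "clear all"
```

The point of the ordering is that nothing is sliced or saved until the value has been accepted.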

### Gate results
- **Full pytest suite**: 7450 passed, 7 skipped, 3 xpassed, **0 failed**
- **ruff forward gate**: CLEAN
- **browser-smoke gate**: CLEAN (`/`, `/#settings`, `/#sessions` — zero console errors)
- **Codex (regression)**: SAFE TO SHIP (probed the route directly — negative/non-numeric → 400 without mutation; `keep_count=2`/`0` still work)
- **Opus (correctness)**: SAFE TO SHIP (guard placed before slice + save; both front-end callers compute from a non-negative DOM index)

Co-authored-by: Mubashirrrr <Mubashirrrr@users.noreply.github.com>
## Release v0.51.234 — Release HB (stage-q4)

Two medium-risk backend/infra fixes. All gates green.

### Fixes
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3289 | @rodboev | Refuse server startup when a live instance already serves the port (Windows/macOS silent port-sharing hazard). Live-listener probe (`GET /health`, 2s timeout) + Windows `SO_EXCLUSIVEADDRUSE` — **preserves fast restart** (POSIX keeps `allow_reuse_address=True`; a dying socket in the kernel backlog times out → startup proceeds). |
| nesquena#3486 | @dso2ng | Allow remote/SSH terminal profiles to use target-side workspace paths under `terminal.cwd` without a server-local `stat()`. Local profiles unchanged — bypass only fires for remote backends and only for paths contained within `terminal.cwd`. |
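
The live-listener probe and the platform-specific socket option from nesquena#3289 could look roughly like the sketch below. The function names are hypothetical, and treating only an HTTP 200 as "live" is an assumption; the release note states only that `GET /health` is probed with a 2s timeout.

```python
import http.client
import socket


def live_instance_on(host, port, timeout=2.0):
    """Return True only if something answers GET /health on host:port."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/health")
        return conn.getresponse().status == 200  # assumption: 200 means "live"
    except (OSError, http.client.HTTPException):
        # Refused, timed out, or not speaking HTTP: no live instance, startup may proceed.
        return False
    finally:
        conn.close()


def configure_listen_socket(sock):
    """Exclusive bind where the platform offers it, address reuse elsewhere."""
    exclusive = getattr(socket, "SO_EXCLUSIVEADDRUSE", None)  # Windows-only constant
    if exclusive is not None:
        sock.setsockopt(socket.SOL_SOCKET, exclusive, 1)
    else:
        # POSIX keeps address reuse so a fast restart is not blocked by TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
```

Probing for a responding listener, rather than disabling address reuse globally, is what lets a restart proceed while an old socket is still winding down.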

### History note on nesquena#3289
This PR was **held earlier this sweep** — its original form globally disabled `SO_REUSEADDR`, which a Codex gate flagged as breaking fast restart (TIME_WAIT bricks rebind for ~60s). The contributor reworked it along the suggested lines (live-listener probe instead of the global disable). This release ships the reworked version. Unheld → full pickup → full gate.

### Gate results
- **Full pytest suite**: 7458 passed, 8 skipped, 3 xpassed, **0 failed**
- **ruff forward gate**: CLEAN
- **browser-smoke gate**: CLEAN (real server boots fine with the new startup probe)
- **Codex (regression)**: SAFE TO SHIP (verified fast-rebind preserved + remote bypass gated on backend+containment, local validation unchanged)
- **Opus (correctness + security)**: SAFE TO SHIP (probe false-positive, `_is_within` containment, local-profile bypass all hold up; applied its one minor double-call cleanup note)

Closes nesquena#3289.

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Co-authored-by: dso2ng <dso2ng@users.noreply.github.com>
## Release v0.51.235 — Release HC (stage-q5)

Single fix in the data-sensitive transcript-merge path, with a Codex-found MUST-FIX applied.

### Fix
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3468 | @jasonjcwu | `_find_current_user_turn` returns the **last** matching user turn instead of the first, so post-compression `result_messages` (which carries the full history) no longer replays the entire conversation when the user repeats a similar question (137-msg session → 89 duplicate replays). |

### Pre-release review caught a CORE issue (fixed before ship)
Codex's regression gate found that a naive last-match could be overridden by a **synthetic `role:"user"` continuation prompt** (the agent loop injects "Continue"/empty-recovery nudges — verified at `conversation_loop.py:1763/4183/4356`) that only *substring*-matches the user text — anchoring the merge **past** the real turn and dropping the assistant/tool output in between. Applied the fix: track **strong** (exact `_looks_like_current_user_turn`) and **weak** (substring) matches separately and return `last_strong → last_weak → fallback`. The real turn (strong) always wins over a later synthetic continuation (weak).

Opus reviewed the original and said ship; the applied fix is strictly safer than what it reviewed.
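
The `last_strong → last_weak → fallback` selection can be illustrated with a short sketch. This is not the project's `_find_current_user_turn`: exact string equality stands in for `_looks_like_current_user_turn`, whose real logic is not shown in the release note, and the message shape is assumed.

```python
def find_current_user_turn(messages, user_text, fallback=0):
    """Index to anchor the merge on: last strong match, else last weak, else fallback."""
    needle = user_text.strip()
    last_strong = None
    last_weak = None
    for index, message in enumerate(messages):
        if message.get("role") != "user":
            continue
        content = str(message.get("content") or "").strip()
        if content == needle:
            last_strong = index  # exact match: the real user turn
        elif needle and needle in content:
            last_weak = index  # substring only: possibly a synthetic continuation
    if last_strong is not None:
        return last_strong
    if last_weak is not None:
        return last_weak
    return fallback
```

Because strong and weak matches are tracked separately, a later synthetic prompt that merely contains the user's text cannot move the anchor past the real turn.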

### Gate results
- **Full pytest suite**: 7465 passed, 8 skipped, 3 xpassed, **0 failed**
- **ruff forward gate**: CLEAN
- **browser-smoke gate**: CLEAN
- **Codex (regression)**: SHIP ONLY WITH FIXES → fix applied → re-reviewed **SAFE TO SHIP**
- **Opus (correctness)**: SHIP IT (on the pre-fix code; applied fix is strictly safer)
- **Regression test** (`tests/test_issue3468_duplicate_after_compression.py`, 7 cases): pins the last-match behavior, the strong-beats-later-weak invariant, and the end-to-end no-duplicate-replay invariant — each **verified to fail against the pre-fix logic**.

Co-authored-by: jasonjcwu <jasonjcwu@users.noreply.github.com>
## Release v0.51.236 — Release HD (stage-q7)

First Phase 3 (deep-review) release — picked by the 3-factor framework (contributor × impact × mitigated-risk): high-impact (nesquena#1952 native Windows support), backend-only (no screenshots), well-mitigated risk (POSIX path provably unchanged), from a contributor active this session (@rodboev, nesquena#3446/nesquena#3486 shipped earlier today).

### Added
| PR | Author | Fix |
|----|--------|-----|
| nesquena#1952 | @rodboev | Native Windows support for `bootstrap.py` + the embedded terminal: POSIX-only `fcntl`/`termios`/`select` guarded behind `_TERMINAL_SUPPORTED`; terminal entry points raise `NotImplementedError`/no-op on Windows; bootstrap Windows block → warning; auto-install errors clearly on native Windows (WSL unaffected); foreground uses `Popen`+exit on Windows instead of `os.execv`. **POSIX behavior unchanged on every path.** |

### Absorbed on the way in (fix-it-ourselves, reviewed fresh)
- `subprocess.CREATE_NEW_PROCESS_GROUP` → `getattr(subprocess, ..., 0)` — the constant is Windows-only, so a win32-simulating test `AttributeError`'d on Linux. Mirrors the `SO_EXCLUSIVEADDRUSE` getattr guard.
- Fixed 2 over-reaching tests in `test_windows_native_support.py` — one was launching a **real installer subprocess** via an unstubbed `subprocess.run` (now stubbed; harness 2.8s vs 80s); removed unused imports.
- Updated `test_onboarding_static.py` — it asserted the OLD "Native Windows is not supported" hard-block string this PR intentionally replaces; now asserts the new experimental-warning + auto-install guard.
- Help-text accuracy: `--foreground` help now describes the Windows Popen path (Opus nit).
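
The `getattr` guard from the first bullet can be sketched as follows. The helper name `spawn_detached` is hypothetical; only the guarded constant lookup comes from the release note.

```python
import subprocess
import sys

# CREATE_NEW_PROCESS_GROUP exists only on Windows; 0 means "no extra flags" elsewhere.
CREATE_NEW_PROCESS_GROUP = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0)


def spawn_detached(argv):
    """Start a child process, in its own process group where the platform supports it."""
    # On POSIX, Popen accepts creationflags=0 and rejects only non-zero values.
    return subprocess.Popen(argv, creationflags=CREATE_NEW_PROCESS_GROUP)


if __name__ == "__main__":
    child = spawn_detached([sys.executable, "-c", "pass"])
    child.wait()
```

Looking the constant up at import time keeps the module importable on Linux, which is what a win32-simulating test needs.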

### Gate results
- **Full pytest suite**: 7478 passed, 9 skipped, 3 xpassed, **0 failed**
- **ruff forward gate**: CLEAN
- **browser-smoke gate**: CLEAN (gate hardened mid-release to auto-detect the cached chromium revision)
- **Codex (regression)**: SAFE TO SHIP (simulated `sys.platform=win32`, verified POSIX modules not imported + all terminal guards complete + POSIX foreground still uses execv)
- **Opus (correctness)**: SAFE TO SHIP (POSIX path provably unchanged, all fcntl/termios/select guarded, Popen+exit correct; noted inherent-Windows trade-offs that aren't PR bugs)

Note: the Windows *runtime* path can't be executed on the Linux CI box; it was reviewed statically by both reviewers + the contributor's 209-line test (win32 simulated via monkeypatch). Linux/POSIX no-regression is fully verified.

Closes nesquena#1952.

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
…inst live worker state (nesquena#3492)

## Release v0.51.237 — Release HE (stage-q8)

Phase 3 MEDIUM-ring pick (3-factor framework): a real concurrency/state-consistency fix from a regular contributor (@franksong2702, ★★★ 38 merges). The hardest review of the sweep — the dual gate caught **two** silent data-loss bugs across two review rounds.

### Fixed
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3475 | @franksong2702 | Cancelling a live turn immediately after send now reliably stops the worker and settles the session to a cancelled state (was: spinner over a blank page). `cancel_stream()` falls back to the live active-run registry (`ACTIVE_RUNS`) + session agent cache when `STREAMS` has already detached, so the worker still receives `interrupt("Cancelled by user")`. `/api/session` reports run-journal active state from the live registry instead of trusting a persisted `active_stream_id`. |

### Pre-release dual gate caught TWO silent data-loss bugs (both fixed + regression-tested)
The PR's refactor moved `agent.interrupt()` ahead of the partial-text/reasoning/tool-call snapshot, and that snapshot was no longer under `streams_lock`:
1. **Codex round 1**: the worker's `finally` (which pops `STREAM_PARTIAL_TEXT`/`STREAM_REASONING_TEXT`/`STREAM_LIVE_TOOL_CALLS` under `STREAMS_LOCK`) could clear those buffers the instant `interrupt()` wakes it, so a cancelled turn **silently lost its already-streamed text**. (Opus reviewed the original and said ship because it assumed the snapshot was still lock-protected; a stale comment claimed so, but the code no longer took the lock. Checked against the actual code, Codex was right.)
2. **Codex round 2**: my first fix only snapshotted on the `STREAMS`-present path; the detached `ACTIVE_RUNS`-only path (the case this PR adds) still lost text. Fixed by hoisting the snapshot above the `if stream_present` branch so it runs unconditionally under the lock.

Both fixes have regression tests **verified to fail against the buggy versions** (`test_cancel_preserves_partial_text_when_interrupt_pops_buffers` + `test_cancel_preserves_partial_text_on_detached_active_run_path`).
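
The ordering fix (snapshot under the lock first, interrupt second) can be shown in miniature. This is a sketch under assumptions: the module-level names mirror those in the release note, but the `cancel_stream` signature and the `interrupt` callback are simplified stand-ins.

```python
import threading

STREAMS_LOCK = threading.Lock()
STREAM_PARTIAL_TEXT = {}


def cancel_stream(stream_id, interrupt):
    """Snapshot the streamed text under the lock, then interrupt the worker."""
    with STREAMS_LOCK:
        # Copy first: the worker's cleanup may pop this entry as soon as it wakes.
        partial = STREAM_PARTIAL_TEXT.get(stream_id, "")
    # The lock is released before interrupting, so the worker's cleanup can acquire it.
    interrupt("Cancelled by user")
    return partial
```

If the two steps were swapped, a worker that wakes immediately could empty the buffer before the snapshot reads it.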

### Gate results
- **Full pytest suite**: 7481 passed, 9 skipped, 3 xpassed, **0 failed**
- **ruff forward gate**: CLEAN
- **browser-smoke gate**: CLEAN
- **Codex (regression)**: SHIP ONLY WITH FIXES (×2 rounds) → both MUST-FIXes applied → re-reviewed **SAFE TO SHIP**
- **Opus (correctness)**: reviewed the original (SAFE); the shipped version is strictly safer (adds the under-lock snapshot Opus deemed unnecessary)
- Deadlock concern cleared: no path takes `ACTIVE_RUNS_LOCK` then `STREAMS_LOCK`.

Rebased onto current master (streaming.py/routes.py merged clean — no overlap with the nesquena#3468 dedup change shipped in HC).

Closes nesquena#3475.

Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
## Release v0.51.238 — Release HF (stage-q9)

Phase 3 MEDIUM-ring pick (3-factor: contributor×impact×mitigated-risk) — high-impact perf fix to the most-clicked affordance from a regular contributor (@franksong2702 ★★★), small code surface, CI-green.

### Fixed
| PR | Author | Fix |
|----|--------|-----|
| nesquena#2518 follow-up | @franksong2702 | Clicking **New Conversation** on a cold start no longer hangs 3–4s on a catalog rebuild. `newSession()` fills `model_provider` from `window._activeProvider` (then prev-session) when the dropdown carries none, so `POST /api/session/new` takes the fast path on the first click too. |

### Pre-release dual gate caught a wrong-backend routing bug (fixed + regression-tested)
The server fast path passes `(model, provider)` through **without validating the pair**, so naively attaching the active provider to *any* bare model could silently route to the wrong backend (e.g. bare `claude-opus-4.8` + active `openrouter`). **Codex** flagged this; **Opus** had judged it acceptable ("respect the selection over silent swap"). I took the stricter, empirically-grounded path and added a **family-mismatch guard** mirroring the server's own bare-prefix→provider map (`gpt`→openai, `claude`→anthropic, `gemini`→google): when the model's known family differs from the fallback provider, `model_provider` stays `null` so the server slow-path's family repair runs. This keeps the perf win for the common matching case while closing the mis-route. Backend behavioral tests confirm fast-path-on-match + slow-path-on-mismatch. (Also re-anchored the source-shape test assertions on the real `reqBody.model_provider=` assignment per Codex's 2nd note.)
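
The family-mismatch guard can be sketched as a small pure function. The shipped guard lives in the front-end `newSession()` code, so this Python rendering is illustrative only; the function name is hypothetical, and the prefix map contains just the three families named above.

```python
FAMILY_PROVIDER = {"gpt": "openai", "claude": "anthropic", "gemini": "google"}


def fallback_model_provider(model, fallback_provider):
    """Provider to attach to a bare model id, or None to defer to the server slow path."""
    if not fallback_provider:
        return None
    name = model.lower()
    for prefix, family_provider in FAMILY_PROVIDER.items():
        if name.startswith(prefix) and family_provider != fallback_provider:
            # Known family disagrees with the fallback: let the server repair it.
            return None
    return fallback_provider
```

For example, `fallback_model_provider("claude-opus-4.8", "openrouter")` yields `None`, while a matching pair or an unrecognized model keeps the fallback and therefore the fast path.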

### Gate results
- **Full pytest suite**: 7495 passed, 9 skipped, 3 xpassed, **0 failed**
- **ESLint runtime gate**: CLEAN  ·  **ruff**: CLEAN  ·  **browser-smoke**: CLEAN
- **Codex (regression)**: SHIP ONLY WITH FIXES → guard + test-anchor applied → re-reviewed **SAFE TO SHIP**
- **Opus (correctness)**: reviewed the original (judged acceptable); the shipped version is strictly safer (adds the family guard)

Note: `docs/pr-media/2518/{PR_BODY.md,bench.py}` are the contributor's review aids, included per the tracked `docs/pr-media/` convention (157 files already tracked) — not app code.

Closes nesquena#2518.

Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
## Release v0.51.239 — Release HG (stage-q10)

Phase 3 MEDIUM-ring **salvage** from nesquena#3407. The source PR bundled a universal reliability fix with debug scaffolding + Android-specific work; this release ships only the clean, universal nugget.

### Fixed
| Salvaged from | Author | Fix |
|---|---|---|
| nesquena#3407 | @PatrickNoFilter | `server.py` ignores `SIGPIPE` (`SIG_IGN`) at import time so a client dropping the connection mid-response (tab close mid-stream, network drop, mobile backgrounding, dropped long-poll, `/api/updates/check` timeout) can't silently `Term` the whole process. The broken write now surfaces as a catchable `BrokenPipeError`; the server keeps serving. |

### Why salvage, not merge whole
nesquena#3407 (585L, 11 commits) bundles three groups: (1) the SIGPIPE fix + a 271-line `diag_shim.py` debug module, (2) an Android-cgroup-specific `os.fork`/`setsid` restart rewrite in `updates.py`, (3) personal deploy scripts (`start-webui.sh`/`watchdog-loop.sh`, which the author notes are "user-side infra, not in the server tree"). Only the SIGPIPE fix is universal, low-risk, and ship-ready — the rest is investigation tooling for a now-solved mystery or platform-specific. The source PR is held with a detailed split explanation.

### Added safety over the source PR
The original used a bare `signal.signal(signal.SIGPIPE, ...)` which would `AttributeError` on Windows (no `SIGPIPE`). The salvaged version is `getattr`-guarded so it's a no-op on Windows, preserving the native-Windows support shipped in nesquena#1952 (HD).
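
A minimal sketch of the `getattr`-guarded form, assuming it runs in the main thread (where `signal.signal` is allowed). The function name is hypothetical; the release note says the real call happens at import time in `server.py`.

```python
import signal


def ignore_sigpipe():
    """Ignore SIGPIPE where it exists; return False on platforms without it."""
    sigpipe = getattr(signal, "SIGPIPE", None)  # absent on Windows
    if sigpipe is None:
        return False
    # With SIGPIPE ignored, a write to a closed socket raises BrokenPipeError instead.
    signal.signal(sigpipe, signal.SIG_IGN)
    return True
```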

### Gate results
- **Full pytest suite**: 7498 passed, 9 skipped, 3 xpassed, **0 failed**
- **ruff**: CLEAN · **browser-smoke**: CLEAN
- **Codex (regression)**: SAFE TO SHIP — verified the getattr Windows-guard, that the ignore disposition lands correctly across the `os.execv` self-restart, and that subprocess children use `restore_signals=True` so the ignore doesn't leak to git/shell/editor children.

Regression test `tests/test_issue3407_sigpipe_ignore.py` pins SIG_IGN on POSIX, no-raise import, and the getattr guard.

Co-authored-by: PatrickNoFilter <PatrickNoFilter@users.noreply.github.com>
## Release v0.51.240 — Release HH (stage-q12)

Mobile UX bug-fix — UX-approved via Telegram.

### Fixed
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3470 | @cnogrin | On mobile/touch, you can now **swipe up to stop streaming auto-scroll**. Previously the view snapped to the bottom on every token with no way to read earlier content while a response streamed — `_recordNonMessageScrollIntent()` only detected upward intent on the wheel path (`e.deltaY`), and touch events have no `deltaY`. Now tracks `_touchStartY` on `touchstart` and treats a `touchmove` that drags the finger down >8px (= scroll up into history, `scrollTop` decreases) as upward intent, setting the same `_messageUserUnpinned` flag the wheel path + scroll listener use. |

### Why this is safe for existing installs
- **Only ADDS a touch-unpin path** — wheel + desktop behavior completely untouched, no existing branch modified.
- New `touchstart`/`touchend`/`touchcancel` listeners are **passive + capture-only** (they only write `_touchStartY`), so they can't interfere with existing touch handling.
- Net effect for users: mobile users *gain* the ability to scroll up during streaming (which was simply broken before). No one's working flow changes.

### Absorbed on review (Codex CORE MUST-FIX)
The contributor's gesture sign was inverted (`dy<-8` = finger up = *follow* the stream), which would have unpinned in the wrong direction. Corrected to `dy>8` to match the existing scroll listener's `movedUp` semantics; fixed the comment; strengthened the regression test to pin the gesture direction.

### Gate
- Full pytest suite: **7503 passed, 9 skipped, 3 xpassed, 0 failed**
- ESLint runtime gate: CLEAN · browser-smoke: CLEAN
- Codex (regression): SHIP-ONLY-WITH-FIXES (inverted sign) → fixed → re-reviewed **SAFE TO SHIP**
- Regression test `tests/test_issue3470_touch_unpin_streaming_scroll.py` (verified to fail against master)

UX-approved via Telegram (mobile touch-gesture behavior, no visual/layout delta to screenshot).

Co-authored-by: cnogrin <cnogrin@users.noreply.github.com>
## Release v0.51.241 — Release HI (stage-q13)

UX-flow bug-fix — approved via Telegram.

### Fixed
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3471 (nesquena#3333) | @starGazerK | **New Chat keeps your unsent draft after peeking at history.** Start a New Chat draft → open a previous conversation → click New Chat: the draft is no longer lost. Empty New-Chat sessions are hidden from the sidebar, so there was no way back to the session holding the draft — New Chat just created another fresh empty session. The entrypoint now remembers the candidate empty draft session (one `localStorage` pointer) and, before creating a fresh session, re-validates it via `/api/session`, routing back only if it is still a safe empty draft (zero messages, no active stream, no pending message, not worktree-backed, matching profile, non-empty server-side `composer_draft`). |

### Why it's safe for existing installs
- When there is no remembered draft, it's a **pure no-op fall-through** to the existing `newSession()` path — no behavior change.
- Preserves the "zero-message sessions stay hidden from the sidebar" contract.
- Conservative multi-guard validation; the pointer is cleared on draft-clear (after send) so an emptied draft never traps you on New Chat.

### Verified live (end-to-end on a test server)
- **Positive**: typed a draft → visited a 2-message history session → clicked New Chat → landed back on the draft session with the text restored.
- **Negative**: emptied the draft → New Chat created a fresh session (no accidental trap).

### Absorbed on review (Codex CORE MUST-FIX)
The PR added `await _saveComposerDraftNow(...)` before the session-switch, which opened a rapid-switch race: clicking session B then quickly C could let B's stale continuation blank C's freshly-loaded state. Added `if (_loadingSessionId !== sid) return;` immediately after the awaited save and before the destructive state-clear (mirrors the existing nesquena#1060 stale-guard) + a regression test pinning the guard's position.
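
The stale-continuation guard is a general pattern for any code that awaits before mutating shared state. The sketch below reproduces it with `asyncio`; the real code is front-end JavaScript, so every name here is a hypothetical stand-in.

```python
import asyncio

state = {"loading_session_id": None, "messages": []}


async def save_composer_draft():
    """Stand-in for the awaited draft save; yields control to other tasks."""
    await asyncio.sleep(0.01)


async def load_session(sid):
    """Switch to session `sid`, unless a newer switch started while we awaited."""
    state["loading_session_id"] = sid
    await save_composer_draft()
    if state["loading_session_id"] != sid:
        return False  # superseded: leave the newer session's state alone
    state["messages"] = [f"messages for {sid}"]  # the destructive state change
    return True


async def rapid_switch():
    # Click B, then C, before B's save has finished.
    return await asyncio.gather(load_session("B"), load_session("C"))
```

The guard sits between the `await` and the state change, the only window in which another switch can interleave.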

### Gate
- Full pytest suite: **7510 passed, 9 skipped, 3 xpassed, 0 failed**
- ESLint runtime gate: CLEAN · browser-smoke: CLEAN
- Codex (regression): SHIP-ONLY-WITH-FIXES (rapid-switch race) → fixed → re-reviewed **SAFE TO SHIP**
- `tests/test_issue_new_chat_draft_restore.py` (7 assertions incl. the race-guard; live-verified behavior)

UX-flow change, no visual/layout delta — approved via Telegram.

Co-authored-by: starGazerK <starGazerK@users.noreply.github.com>
## Release v0.51.242 — Release HJ (stage-q14)

UX-approved via Telegram (Nathan — dark/light/mobile screenshots).

### Added
| PR | Author | Feature |
|----|--------|---------|
| nesquena#3440 | @t3chn0pr13st | **Graphite appearance skin** — a quiet, neutral-gray "workbench" alternative to the default gold/cream. Selectable from Settings → Appearance and `/theme skin graphite`. Both light + dark palettes on the existing CSS-variable token system; tightened typography, shadows, active-sidebar spacing, code-block framing. |

### Why it's safe
- **Fully scoped + additive**: every new CSS rule (and every `!important`) is under `[data-skin="graphite"]` — Codex verified zero bleed into the default appearance or other skins. `api/config.py` keeps the default skin as `default` and only *adds* `graphite` to the allowed set. No i18n keys dropped (only the `/theme` help string gains `graphite`).
- Opt-in; a user has to select it. Default experience unchanged.

### Test-robustness fix (absorbed)
The new graphite scoped selectors (e.g. `:root[data-skin="graphite"] .session-item.active .session-time{…}`) appear in `style.css` *before* the canonical unscoped rules, which broke 3 naive first-occurrence CSS-contract tests (`test_issue677` scroll-btn-overlay, `test_issue856_pinned_indicator_layout`, `test_workspace_panel_session_list`). Fixed those 3 to anchor on the canonical **unscoped** rule (start-of-line regex) instead of the first `.selector` match — robust against this and future skins. Verified they still pass on clean master CSS (invariant preserved, not weakened).
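
Anchoring on the unscoped rule can be done with a start-of-line regex, roughly as sketched here. The helper name and the sample CSS are illustrative assumptions, not the project's test code.

```python
import re


def canonical_rule_body(css, selector):
    """Body of the rule whose selector starts a line, skipping skin-scoped variants."""
    pattern = re.compile(
        r"^" + re.escape(selector) + r"\s*\{([^}]*)\}",
        re.MULTILINE,  # make ^ match at the start of every line
    )
    match = pattern.search(css)
    return match.group(1).strip() if match else None


SAMPLE_CSS = (
    ':root[data-skin="graphite"] .session-item.active .session-time{color:gray}\n'
    ".session-item.active .session-time{color:gold}\n"
)
```

A plain substring search for `.session-item.active .session-time` would land inside the graphite-scoped rule on the first line, whereas the anchored pattern only matches the second.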

### Gate
- Full pytest suite: **7517 passed, 9 skipped, 3 xpassed, 0 failed**
- ESLint runtime gate: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): **SAFE TO SHIP** (verified all CSS scoped, default skin unchanged, no i18n key loss)
- Vision-verified dark + light + mobile; UX-approved by Nathan via Telegram

Co-authored-by: t3chn0pr13st <t3chn0pr13st@users.noreply.github.com>
## Release v0.51.243 — Release HK (stage-q15)

UX-approved direction via Telegram (workspace feature you named). Backend security hardened through 3 Codex rounds.

### Added
| PR | Author | Feature |
|----|--------|---------|
| nesquena#3402 / nesquena#3422 | @pamnard | **Drag a file or folder in the workspace tree onto another folder row (or breadcrumb segment) to move it** within the workspace. New `POST /api/file/move`. Drop handlers use `stopPropagation` so the composer `@path` drag (nesquena#1097) and OS-file upload-drop (nesquena#3411) are unchanged. |

### Verified
- Live end-to-end: `notes.txt` → `docs/` confirmed on disk; legit moves return the correct `new_path`.
- 13 tests (incl. folder-into-self/descendant guard, existing-target collision, and 3 security regressions).

### Security hardening absorbed (3 Codex rounds — all fixed + regression-tested)
1. **TOCTOU symlink race on destination**: a path-based `source.rename(dest)` could be raced by swapping `dest_dir` to an external symlink between validation and rename. Now opens both parent dirs via the workspace-anchored `open_anchored_fd` (openat + `O_NOFOLLOW`, same helper as the upload hardening) and uses `os.rename(leaf, leaf, src_dir_fd=…, dst_dir_fd=…)` with an fd-based collision check; path-based fallback only where `dir_fd` is unsupported.
2. **Symlinked workspace root** returned a confusing 400 after a successful move — returned `new_path` now computed against `ws_root.resolve()`.
3. **Symlinked source entry**: `safe_resolve` follows the final symlink, so moving `link.txt` would move its *target* and dangle the link — now rejected via no-follow `lstat` on the lexical path.
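
An fd-relative move along the lines of items 1 and 3 might look like this POSIX-only sketch. It is not the project's `open_anchored_fd` helper: here the directories are opened by path with `O_NOFOLLOW`, which guards only the final path component, so the workspace anchoring is not reproduced. The function name is hypothetical.

```python
import os
import stat


def move_entry(src_dir, dst_dir, leaf):
    """Move `leaf` from src_dir to dst_dir via directory fds, not re-resolved paths."""
    # O_NOFOLLOW makes the open fail if the directory path itself is a symlink.
    flags = os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW
    src_fd = os.open(src_dir, flags)
    try:
        dst_fd = os.open(dst_dir, flags)
        try:
            # lstat-style check: refuse to move an entry that is itself a symlink.
            info = os.stat(leaf, dir_fd=src_fd, follow_symlinks=False)
            if stat.S_ISLNK(info.st_mode):
                raise ValueError("refusing to move a symlink")
            # fd-based collision check against the already-opened destination.
            try:
                os.stat(leaf, dir_fd=dst_fd, follow_symlinks=False)
            except FileNotFoundError:
                pass
            else:
                raise FileExistsError(leaf)
            os.rename(leaf, leaf, src_dir_fd=src_fd, dst_dir_fd=dst_fd)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Once both directories are held as file descriptors, swapping a path for a symlink after validation no longer changes where the rename lands.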

### Gate
- Full pytest suite: **7530 passed, 9 skipped, 3 xpassed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): SHIP-ONLY-WITH-FIXES ×3 → all applied + tested → **SAFE TO SHIP**

Co-authored-by: pamnard <pamnard@users.noreply.github.com>
## Release v0.51.244 — Release HL (stage-q16)

UX-approved direction via Telegram (workspace drag-drop polish you requested). All 4 drag-drop flows verified live in-browser.

### Added
| PR | Author | Feature |
|----|--------|---------|
| nesquena#3402 / nesquena#3424 | @pamnard | **Drop OS files/folders onto a specific workspace folder row or breadcrumb** to upload into that directory (not just the current dir). OS folder drops are traversed (`webkitGetAsEntry`/`readEntries`) preserving nested structure. Uploads via the existing `/api/workspace/upload` (no new backend). |

### Fixed
- **Composer drop-zone jank**: dragging a workspace file (or OS file) over the composer footer rendered a translucent overlay that let the textarea/chips/icons bleed through and collide with the hint text. Now a clean, fully-opaque box with a single centered **context-aware** label — *"Drop to insert workspace reference"* (workspace file → `@path` insert) vs *"Drop files to attach"* (OS file → message attach).
- **Drag-drop handler coexistence (CORE, caught in review)**: nesquena#3424's OS-upload binding assigned `el.ondrop` on folder rows, which **overwrote** the drag-to-move handler from nesquena#3422 (also `el.ondrop`) — silently breaking move-to-folder (the ws-path drop fell through to the composer as an `@path` insert). Fixed by binding the OS-upload handlers via `addEventListener` so they compose; each handler gates on its own drag type.

### Drag-drop matrix — all verified LIVE in-browser (real drag→drop, asserted on disk)
| Flow | Result |
|------|--------|
| OS image → composer footer | ✓ attaches |
| workspace file → composer footer | ✓ inserts `@path` |
| workspace file → workspace folder | ✓ moves on disk (report.md → docs/) |
| OS file → workspace folder | ✓ uploads into target folder |

### Scope note
nesquena#3424's PR branch carried the OLD pre-hardening `_handle_file_move`. Applied **frontend-only** — master's hardened move backend (v0.51.243, TOCTOU/symlink fixes) is untouched (Codex confirmed no `api/routes.py` diff).

### Gate
- Full pytest suite: **7542 passed, 9 skipped, 3 xpassed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): CORE handler-clobber → fixed → **SAFE TO SHIP**

Co-authored-by: pamnard <pamnard@users.noreply.github.com>
## Release v0.51.245 — Release HM (stage-q17)

Small UX bug-fix.

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#3338 | @rodboev (nesquena#3502) | **Messaging sessions (Telegram, Discord, WeChat, …) now show their platform source badge in the chat-pane topbar**, not just the sidebar. The topbar badge was gated on `is_cli_session` (intentionally `false` for messaging sources), so it vanished once the session opened. Gate removed; a recovered native session stamped `source_label:"WebUI"` stays un-badged. Reuses the existing `.topbar-source-badge` styling — no new chrome. |

Picked nesquena#3502 over the duplicate **nesquena#3499** (same issue/files) — nesquena#3502 adds the `WebUI` self-source suppression and a stronger regression test. nesquena#3499 closed as superseded with credit.

### Gate
- Full pytest suite: **7543 passed, 0 failed** (first run hit the known boot-cascade flake — 376 connection-refused across 29 files; clean on re-run, as expected for a JS-only change)
- ESLint: CLEAN · browser-smoke: CLEAN · 8 source-contract tests pin the fix
- Codex (regression): **SAFE TO SHIP** — both renderers (panels.js + ui.js) fixed consistently, WebUI-suppression correct, read-only suffix intact, `textContent` XSS-safe

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.246 — Release HN (stage-q18)

Backend bug-fix.

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#3225 | @rodboev | **WebUI session rename now syncs the new title to the agent's `state.db`**, so the TUI/CLI stop showing the stale name. `/api/session/rename` now calls `_sync_session_title_to_insights(s)` after `s.save()` and before `publish_session_list_changed` — exactly mirroring the sibling `/api/session/title/regenerate` handler. Gated on the `sync_to_insights` setting and exception-contained (a sync failure can't break the rename). |
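
The save, sync, publish ordering with a contained sync failure can be sketched as below. The callbacks and the dict-based session are hypothetical stand-ins for `s.save()`, `_sync_session_title_to_insights(s)` and `publish_session_list_changed`.

```python
def rename_session(session, new_title, sync_title, publish, sync_enabled=True):
    """Save the title, sync it (best effort), then publish the list change."""
    session["title"] = new_title  # stand-in for s.save()
    if sync_enabled:  # stand-in for the sync_to_insights setting
        try:
            sync_title(session)
        except Exception:
            # A sync failure must not break the rename itself.
            pass
    publish()
    return session
```

Publishing last means listeners that re-read the title after the event see the synced value whenever the sync succeeded.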

### Gate
- Full pytest suite: **7544 passed, 0 failed**
- ruff: CLEAN · 1 new regression test (call present + sync-before-publish ordering) + 82 rename/title-sync tests pass
- Codex (regression): **SAFE TO SHIP** — mirrors the regenerate handler (sync after lock release, gated, exception-contained), no deadlock, no stale data

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.247 — Release HO (stage-q19)

Backend correctness fix.

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#3505 | @franksong2702 | **Reasoning effort is coerced to a level the active model/provider actually supports** before each request, instead of being sent verbatim and rejected. `openai-codex` `gpt-5` no longer gets `max` (→ `xhigh`); `o1`/`o3`/`o4` clamp to `low`/`medium`/`high`. Coercion only steps *down* (never escalates); `none`/unset preserved. The capability filter is applied across heuristic / models.dev / Copilot / LM Studio paths. |

This is the narrow, correct fix for the detection gap that nesquena#3431 tried to address by removing the chip-visibility gate (which we shelved). The chip-visibility gate is **untouched** (Codex confirmed) — `get_reasoning_status`/`_applyReasoningChip` still hide the chip for unconfirmed models.

### Review fix absorbed (Codex + self-flagged)
The first cut **dropped** a configured effort for *unrecognized* models, because capability detection returns `[]` for both "known-unsupported" and "simply-unknown" (custom providers, aggregator-rewritten ids, new releases). That is a behavior change vs master (which sent it verbatim) and would silently disable reasoning. Fixed: an **empty** capability set now **preserves** the configured effort (the provider stays the final authority; the worst case is the same rejected request master already produces, i.e. no regression). Known-bad clamps return *non-empty* filtered sets, so they still degrade correctly. Nathan chose this "preserve-for-unknown" behavior, and a regression test pins it.
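
The step-down-only coercion with preserve-for-unknown can be sketched as a pure function. The level ordering is inferred from the clamps listed in this release (`max` → `xhigh` → `high`), and the last branch (nothing lower is supported) is an assumption, since the release note does not cover that case.

```python
EFFORT_ORDER = ["low", "medium", "high", "xhigh", "max"]  # assumed ordering


def coerce_effort(configured, supported):
    """Clamp `configured` down to a supported level; never escalate."""
    if configured in (None, "none"):
        return configured  # none/unset are preserved
    if not supported:
        return configured  # empty capability set: unknown model, preserve as-is
    if configured in supported or configured not in EFFORT_ORDER:
        return configured
    rank = EFFORT_ORDER.index(configured)
    lower = [level for level in EFFORT_ORDER[:rank] if level in supported]
    if lower:
        return lower[-1]  # highest supported level below the configured one
    # Assumption: with nothing lower available, send the value unchanged.
    return configured
```

With these rules, `coerce_effort("max", ["low", "medium", "high"])` gives `"high"`, and `coerce_effort("high", [])` stays `"high"`.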

### Gate
- Full pytest suite: **7548 passed, 0 failed**
- ruff: CLEAN · 48 reasoning tests pass (incl. preserve-for-unknown + codex-clamp + never-escalate)
- Codex (regression): SHIP-ONLY-WITH-FIXES (unknown-model drop) → fixed → **SAFE TO SHIP**
- Verified empirically: gpt-5/codex max→xhigh, o3 max/xhigh→high, unknown high→high (preserved), none/unset preserved

Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
## Release v0.51.248 — Release HP (stage-q20)

Bug-fix.

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#2782 | @rodboev | **A WebUI session whose sidecar was deleted server-side (e.g. `docker compose --force-recreate`) but whose messages remain in `state.db` no longer bricks the chat.** It used to look alive (`GET` 200 from a CLI stub) while every action 404'd (`POST /api/session/draft`, `/api/chat/start`). The GET handler now consults `_index.json`: a deleted **WebUI-origin** session (webui/fork/blank-non-CLI source) returns 404 so the client self-heals (clears saved id, strips the stale `/session/<id>` URL, falls through to the welcome screen). Genuine CLI/imported sessions keep their 200 read-only stub. Client self-heal now also covers mid-session sidecar deletion of the current session. |

### Review fix absorbed (Codex CORE catch)
The first cut collapsed `source_tag or raw_source or session_source or ""`, defaulting a **blank-source** row to WebUI — which would wrongly 404 a **legacy CLI/imported** session that carries `is_cli_session:true` with blank source fields. Now classified **per-field**: any `webui`/`fork` → 404; any explicit non-WebUI source → keep the 200 CLI stub; all-blank → 404 only when NOT `is_cli_session` and NOT `read_only`. + 2 regression tests.
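
The per-field classification can be sketched roughly as follows. This assumes the sidecar is already known to be missing and that the index row is a plain dict; `status_for_deleted_session` is a hypothetical name, while the field names come from the note above.

```python
WEBUI_SOURCES = {"webui", "fork"}
SOURCE_FIELDS = ("source_tag", "raw_source", "session_source")


def status_for_deleted_session(row: dict) -> int:
    """Return the HTTP status for a session whose sidecar no longer exists."""
    fields = [str(row.get(name) or "").strip().lower() for name in SOURCE_FIELDS]
    # Any field naming a WebUI origin: report 404 so the client self-heals.
    if any(value in WEBUI_SOURCES for value in fields):
        return 404
    # Any explicit non-WebUI source: keep the read-only CLI stub.
    if any(fields):
        return 200
    # All fields blank: 404 only when the row is neither CLI nor read-only.
    if not row.get("is_cli_session") and not row.get("read_only"):
        return 404
    return 200
```

Checking each field separately, instead of collapsing them with `or`, is what keeps a legacy row with blank sources and `is_cli_session: true` on the 200 path.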

### Gate
- Full pytest suite: **7555 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN · 10 stale-session-restore tests (incl. 2 Codex-catch regressions)
- Codex (regression): CORE legacy-CLI false-404 → per-field fix → **SAFE TO SHIP**

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.249 — Release HQ (stage-q21)

Small opt-in feature. UX-approved (screenshot of the toggle in Settings → Preferences).

### Added
| Issue | Author | Feature |
|-------|--------|---------|
| nesquena#2974 | @rodboev | **"Auto-expand terminal on output"** preference (Settings → Preferences, **off by default**). When enabled, the collapsed embedded terminal panel expands automatically the first time a running command emits output. Fires once per stream (guarded on open && collapsed — not per chunk), and uses `expandComposerTerminal({focus:false})` so it doesn't steal focus from the composer. Backend-persisted boolean mirroring the `simplified_tool_calling` pattern; default-off = no behavior change on upgrade. |

### Gate
- Full pytest suite: **7557 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN · screenshot vision-verified (toggle renders cleanly under "Compact tool activity", unchecked by default)
- Codex (regression): **SAFE TO SHIP** (clean first pass) — no-arg `expandComposerTerminal` callers unchanged, default-off incl. settings-load-failure path, all 5 plumb sites mirror the existing pattern

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Merging the RFC as the agreed product contract for long-running-session assistant replies. Thank you @franksong2702! 🙏

It's docs-only (no code), well-structured, and gives the project a shared vocabulary for the follow-up implementation slices — in particular the honest terminal-state set (completed / cancelled / interrupted / compression-exhausted / tool-limit-reached / no-response / error, specific-wins-over-generic) and the live → settled → recovery/replay lifecycle. Nathan blessed merging it as the north-star contract.
## Release v0.51.250 — Release HR (stage-q22)

UX-approved (dark + light-fallback screenshots).

### Added
| PR | Author | Feature |
|----|--------|---------|
| nesquena#3328 | @heagandev | **Zeus appearance skin** — OLED-near-black dark surfaces that keep the default **gold accent** (a high-contrast "gold on black" look no existing skin offered). Selectable from Settings → Appearance or `/theme skin zeus`. Dark-focused; falls back to the default light palette in light mode. |

### Notes
- The PR was 2 days / ~19 releases stale and CONFLICTING; re-applied surgically onto current master (CSS palette + `zeus` registered at all 5 sites: config allowlist, boot.js swatch, index.html boot-map, i18n `cmd_theme` ×12 locales, picker) + THEMES.md doc row. The PR's own `test_zeus_skin.py` (6 tests) passes against the re-applied version.
- Fully scoped + additive: Codex verified every new CSS rule is under `:root.dark[data-skin="zeus"]` — no bleed into the default appearance or other skins.

### Gate
- Full pytest suite: **7563 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN · vision-verified dark (OLED+gold) + light (clean fallback)
- Codex (regression): **SAFE TO SHIP**

Co-authored-by: heagandev <heagandev@users.noreply.github.com>
## Release v0.51.251 — Release HS (stage-q23)

UX-verified live (path dropdown opens on `~/`).

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#3433 | @puneetdixit200 | **Composer `~/` path autocomplete** (TUI parity). Typing a `~/` token in the composer opens a home-directory path-suggestion dropdown. Reuses the existing slash-command dropdown (positioning + keyboard nav) and the trusted `/api/workspaces/suggest` endpoint; replaces only the matched token on selection (surrounding text preserved). Slash-command autocomplete still takes precedence for `/`-prefixed input. |

### Gate
- Full pytest suite: **7568 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN · live-verified the dropdown opens on `~/`
- Codex (regression): **SAFE TO SHIP** — slash-precedence preserved, `~/../../etc` → no suggestions (path-escape blocked via root-confined endpoint), bounds-clamped token replacement, esc-escaped, URLSearchParams-encoded

Co-authored-by: puneetdixit200 <puneetdixit200@users.noreply.github.com>
## Release v0.51.252 — Release HT (stage-q24)

Two trivially-safe @rodboev changes (independent).

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#2481 | @rodboev | The floating "selected-text reply" button now has `user-select:none`, so its own label can't get caught in a text selection (no bleed-through). CSS one-liner. |

### Docs
- README **Compatibility** section: upgrade WebUI + hermes-agent together until the stable agent API (nesquena#2491) lands. (@rodboev)

### Dropped from this batch
- **nesquena#2977 `/use` skill command** was staged here but **dropped** — the Codex regression gate found an async stale-directive race (`cmdUse()` awaits `/api/skills` but `send()` doesn't await the handler → a fast next send can miss it, or a stale directive leaks to a later message) plus an over-eager `finally` clear that silently discards the directive on a local slash-command early-return. Held with `changes-requested` + a detailed rework note (tracked pending promise + clear-on-consume). Concept approved; needs lifecycle hardening.

### Gate
- Full pytest suite: **7570 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): **SAFE TO SHIP** — `user-select:none` scoped to the button only, README docs-only, no `/use` code remains

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.253 — Release HU (stage-r1)

Phase-1 low-risk batch — 7 PRs (no intervention beyond apply + one inline MUST-FIX).

### Fixed
| Issue/PR | Author | Fix |
|----------|--------|-----|
| nesquena#3525 | @TomBanksAU | Streaming DOM-replace "follow" window tightened 1200px→120px — a reader who scrolled up mid-stream no longer gets snapped to the bottom on completion. |
| nesquena#3556 | @ai-ag2026 | Topbar count distinguishes a partially-loaded transcript ("loaded of total" via server `message_count`); fully-loaded keeps the tool-row-filtered count. |
| nesquena#3502 follow-up | @rodboev | Sidebar messaging source badges (Telegram/Discord/…) render as chips, not just CLI ones. |
| — | @Karlineal | `.pre-header+pre` margin override scoped under `.msg-body` (removes a 10px gap above code blocks). |

### Tests
- `test_ctl_script.py` kills orphan fake-python trees on Windows; conftest `_discover_python` checks the Windows venv layout (`Scripts/python.exe`). (nesquena#3537, nesquena#3577, @rodboev)

### Docs
- Explicit WebUI–Agent compatibility policy + Docker pinning guidance. (nesquena#3232, @franksong2702)

### Dropped from this batch
- **nesquena#3538** (self-update stash-pop recovery) — the Codex regression gate found a **BRICK-class data-loss**: the recovery path runs `git reset --merge` then `git stash drop`, permanently discarding the user's local modifications while returning `ok:true` + scheduling a restart. Held with `changes-requested` + a repro and the fix (keep the stash, return `ok:false`, no restart). Concept is good; the destructive `stash drop` must go.

### Gate
- Full pytest suite: **7575 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): SHIP-ONLY-WITH-FIXES (BRICK data-loss nesquena#3538 + tool-row count regression nesquena#3556) → nesquena#3538 dropped, nesquena#3556 fixed inline → **SAFE TO SHIP**

Co-authored-by: TomBanksAU <TomBanksAU@users.noreply.github.com>
Co-authored-by: ai-ag2026 <ai-ag2026@users.noreply.github.com>
Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Co-authored-by: Karlineal <Karlineal@users.noreply.github.com>
Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
## Release v0.51.254 — Release HV (stage-r2)

Phase-2 medium wave 1 — 4 PRs (UI/mobile/cancel fixes + an un-held model dedup).

### Fixed
| Issue/PR | Author | Fix |
|----------|--------|-----|
| nesquena#3528 | @franksong2702 | Render partial tool calls after cancel — interrupted turns keep their `_partial_tool_calls` rows in the transcript + fallback tool-cards. (Codex confirmed it stays render-only, not forwarded to the provider API.) |
| nesquena#3550 | @lurebat | Android offline recovery soft-reattaches the live stream instead of hard-reloading the page on a transient background/disconnect. |
| nesquena#3479 | @mvanhorn | iOS Safari no longer snaps the conversation to the top when a handoff/compression card is inserted mid-stream or on `refreshSession()`. |
| nesquena#3478 | @JayC-L | **Un-held:** named custom providers (`@custom:name:model`) dedup against bare model IDs without regressing Ollama multi-colon tags (`qwen2.5:7b-instruct-q4`). Only `@custom:` IDs strip the two-segment prefix. |

### Hold-sweep note
nesquena#3478/nesquena#3489 was held yesterday for an Ollama multi-colon-tag regression risk (a blanket `lastIndexOf` would lose the model). The author pushed a scoped fix (only `@custom:` IDs use `lastIndexOf`); I verified `_normId` in node against the regression cases — Ollama bare tags are preserved. Un-held + shipped.
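
For illustration, a Python sketch of the scoped normalization. The shipped `_normId` is JavaScript and, per the note above, uses `lastIndexOf` for `@custom:` ids; this sketch instead splits on the first two colons, which gives the same result whenever the model part itself contains no colon. `norm_id` is a hypothetical name.

```python
def norm_id(model_id: str) -> str:
    """Strip the '@custom:<name>:' prefix; leave every other id untouched."""
    if model_id.startswith("@custom:"):
        parts = model_id.split(":", 2)  # ['@custom', name, model]
        if len(parts) == 3 and parts[2]:
            return parts[2]
    # Bare ids, including Ollama tags with colons, are returned as-is.
    return model_id
```

Because only ids with the `@custom:` prefix are rewritten, a bare Ollama tag such as `qwen2.5:7b-instruct-q4` passes through unchanged.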

### Gate
- Full pytest suite: **7588 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): **SAFE TO SHIP** — nesquena#3552 partial-tool-calls verified render-only (no `_API_SAFE_MSG_KEYS` leak / no 400-on-strict-provider, the v0.50.251 nesquena#1375 trap); nesquena#3551 no EventSource double-subscribe; nesquena#3541 no regression vs the nesquena#3525 scroll-follow shipped in v0.51.253; nesquena#3489 no over-dedup.

Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
Co-authored-by: lurebat <lurebat@users.noreply.github.com>
Co-authored-by: mvanhorn <mvanhorn@users.noreply.github.com>
Co-authored-by: JayC-L <JayC-L@users.noreply.github.com>
## Release v0.51.255 — Release HW (stage-r3)

Backend hardening — single PR.

### Fixed
| PR | Author | Fix |
|----|--------|-----|
| nesquena#3561 | @rodboev | Turn journal (crash-recovery backbone) writes **pid-scoped shards** (`{sid}~{pid}.jsonl`) instead of one shared `{sid}.jsonl`, so concurrent processes (e.g. a self-restart overlap) can't interleave-corrupt large JSON lines. `read_turn_journal` merges all shards + the legacy file and sorts by `created_at` — recovery unchanged, backward-compatible. |

### Gate
- Full pytest suite: **7593 passed, 0 failed**
- ruff: CLEAN · 18 turn-journal tests pass
- Codex (regression): **SAFE TO SHIP** — verified legacy+shard merge (no data loss on upgrade), `~` separator can't collide with a session id, the cross-shard `created_at` sort doesn't break recovery (it derives state by timestamp; stream lookup keys by unique `stream_id`), and no reader/writer bypasses `_journal_path`.
- *Non-blocking note:* old `{sid}~{oldpid}.jsonl` shards aren't pruned, so the journal dir can grow across restarts — storage hygiene, not a core-flow regression. Worth a follow-up cleanup (e.g. drop shards with no live pid on session delete).
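
A rough sketch of the shard layout and merge described above, assuming one JSON object per line with a numeric `created_at`. The function bodies are illustrative and `append_event` is a hypothetical helper; only the `{sid}~{pid}.jsonl` naming, the legacy `{sid}.jsonl` file, and the sort key come from the note.

```python
import glob
import json
import os
from pathlib import Path
from typing import List, Optional


def journal_path(journal_dir: Path, sid: str, pid: Optional[int] = None) -> Path:
    """Each process appends to its own shard, so writers never interleave."""
    pid = os.getpid() if pid is None else pid
    return journal_dir / f"{sid}~{pid}.jsonl"


def append_event(journal_dir: Path, sid: str, event: dict, pid: Optional[int] = None) -> None:
    with journal_path(journal_dir, sid, pid).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def read_turn_journal(journal_dir: Path, sid: str) -> List[dict]:
    """Merge the legacy file and every pid shard, ordered by created_at."""
    paths = [journal_dir / f"{sid}.jsonl"]
    paths += sorted(journal_dir.glob(f"{glob.escape(sid)}~*.jsonl"))
    events: List[dict] = []
    for path in paths:
        if not path.exists():
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                events.append(json.loads(line))
    # Stable sort: events with equal timestamps keep their file order.
    events.sort(key=lambda event: event.get("created_at", 0))
    return events
```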

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.256 — Release HX (stage-r4)

Performance — bound WebUI memory growth & idle CPU on large installs.

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#3506 | @nesquena-hermes (reported w/ profiling by @djenttleman) | On a large install (~615 sessions / 40k messages / 454 MB state.db) the WebUI process climbed ~100 MB → ~1.5 GB RSS over days and held high idle CPU. Three root causes fixed: (1) `session_lifecycle._sessions` grew unbounded → new `discard_session()` drops the entry at agent-eviction boundaries, only when no in-flight commit / no uncommitted memory work (retry invariant preserved); (2) cache caps now operator-tunable (`HERMES_WEBUI_AGENT_CACHE_MAX` default 50→25, `HERMES_WEBUI_SESSIONS_MAX`); (3) GatewayWatcher computes a cheap fingerprint before the expensive per-session `MAX(messages.timestamp)` projection and only re-projects on change. |

### Rebase + review notes
- Rebased onto current master; the code diff was verified **byte-identical to the nesquena-APPROVED head** at rebase time (only CHANGELOG re-resolved).
- The Codex regression gate then surfaced **two correctness gaps** the approval didn't catch, both fixed here with regression tests:
  1. **Watcher fingerprint missed same-count transcript rewrites.** `/retry`, `/undo`, `/compress` (`SessionDB.replace_messages`) rewrite messages with new timestamps but can leave `message_count` unchanged → stale sidebar `last_activity`. Fixed with a **per-session** grouped message aggregate (`id, count, user_count, MAX(timestamp)`) over the same non-excluded sessions (a global MAX would miss a rewrite of an older, non-newest session); cron/webui stay excluded so idle churn still doesn't re-project.
  2. **LRU agent-cache eviction could close a live worker's agent** (`popitem(last=False)`, liveness-blind — pre-existing, but the lower 50→25 cap made it more likely). Eviction now snapshots `ACTIVE_RUNS` session_ids (before the cache lock — no nested lock) and skips live sessions, deferring (temporarily exceeding cap) rather than closing a live agent.
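
The liveness-aware eviction in item 2 could look roughly like this, assuming the agent cache is an `OrderedDict` keyed by session id and the caller passes in a snapshot of the active session ids taken before the cache lock. `evict_over_cap` is a hypothetical name.

```python
from collections import OrderedDict
from typing import Any, List, Set, Tuple


def evict_over_cap(cache: "OrderedDict[str, Any]", cap: int,
                   active_session_ids: Set[str]) -> List[Tuple[str, Any]]:
    """Evict least-recently-used entries down to cap, skipping live sessions."""
    evicted: List[Tuple[str, Any]] = []
    # Iterate oldest first over a snapshot of keys so pops are safe.
    for sid in list(cache.keys()):
        if len(cache) <= cap:
            break
        if sid in active_session_ids:
            # A live worker still uses this agent: defer and stay over cap.
            continue
        evicted.append((sid, cache.pop(sid)))
    return evicted
```

The caller would close the returned agents after releasing the cache lock. When every entry is live, nothing is evicted and the cache temporarily exceeds the cap, which matches the deferral described above.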

### Gate
- Full pytest suite: **7612 passed, 0 failed** (one boot-cascade flake re-run; clean on re-run)
- ruff: CLEAN · Codex (regression): 4 rounds → both gaps + a stale test fixed → **SAFE TO SHIP**

Co-authored-by: nesquena <nesquena@users.noreply.github.com>
## Release v0.51.257 — Release HY (stage-r5)

Two rebased ★★★ fixes.

### Fixed
| Issue | Author | Fix |
|-------|--------|-----|
| nesquena#3546 | @rodboev | **"Refresh Models" on a provider card no longer returns "Error: Not found".** The button sent `POST /api/models/refresh`, but no route matched it (404). The route now exists and is wired to the existing `invalidate_provider_models_cache(provider_id)`. |
| nesquena#3548 | @franksong2702 | **Credential self-heal no longer writes to a dead `SessionDB` handle.** A credential-refresh evicted/closed the cached agent, but the retry rebuilt a new agent from kwargs still holding the old closed `SessionDB` → persistence silently targeted a dead handle. Per-request `SessionDB` construction centralized + refreshed on the retry. |

### Dropped from this batch
- **nesquena#2660** (session-event SSE profile scoping) — the Codex regression gate found **two SILENT dropped-refresh bugs** the scoping introduced: (1) the `maxsize=1` subscriber-queue coalescing overwrites a pending profile-A event with a profile-B event → A-tabs filter B out and never refresh for the A change; (2) renamed-root/`default` alias mismatch (backend `_profiles_match` treats them equal, the client filter uses a literal `!==`). A dropped refresh (stale sidebar) is worse than the extra refreshes the PR removes. Held with `changes-requested` + repros + the fail-safe fix (coalesce to unscoped on a profile mismatch; normalize root aliases).
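
A sketch of the suggested fail-safe for bug (1), assuming each subscriber has a `queue.Queue(maxsize=1)` and events are dicts with a `profile` key where `None` means unscoped. `publish_coalesced` and the event shape are hypothetical; alias normalization for bug (2) is not shown.

```python
import queue


def publish_coalesced(subscriber: "queue.Queue[dict]", event: dict) -> None:
    """Replace the pending event, widening to unscoped on a profile mismatch."""
    try:
        pending = subscriber.get_nowait()
    except queue.Empty:
        pending = None
    if pending is not None and pending.get("profile") != event.get("profile"):
        # Two profiles changed before the client drained the queue:
        # drop the scope so tabs on either profile still refresh.
        event = {**event, "profile": None}
    # Single-publisher sketch; concurrent publishers would need a lock here.
    subscriber.put_nowait(event)
```

Once an event has been widened to unscoped it stays unscoped until the subscriber drains it, so a later scoped event cannot hide the earlier change.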

### Gate
- Full pytest suite: **7622 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): SHIP-ONLY-WITH-FIXES (nesquena#2660 dropped-refresh bugs) → nesquena#2660 dropped + held → **SAFE TO SHIP**

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
## Release v0.51.258 — Release HZ (stage-r6)

Fresh-arrival low-risk pair (both @rodboev, landed in the last sweep window).

### Fixed
| Issue | Fix |
|-------|-----|
| nesquena#3597 | The "update available" banner now shows from **any panel** (Settings → System "Check now", etc.), not just the Chat view — it was positioned inside the chat surface so it only rendered there. |
| nesquena#3592 | Under Simplified Tool Calling, an assistant turn with **thinking but no tool calls** now renders that thinking inline on settlement instead of burying it in an empty collapsed activity group. |

### Review fix absorbed (Codex)
nesquena#3592's inline-render `continue` skipped the activity-group creation that carried the turn's `data-turn-duration`, but the footer still suppressed the "Done in …" duration for any `assistantThinking` turn → thinking-only turns silently lost their duration display. Fixed: footer duration is now suppressed **only** for turns that actually build an activity group (`toolCallAssistantIdxs.has(mi)`), so thinking-only inline turns keep "Done in …". + regression test.

### Gate
- Full pytest suite: **7631 passed, 0 failed**
- ESLint: CLEAN · browser-smoke: CLEAN
- Codex (regression): SHIP-ONLY-WITH-FIXES (duration-drop) → fixed → **SAFE TO SHIP**

### Sweep note
nesquena#3603 + nesquena#3604 (sidebar CLI-session classification, same author/area) were **not** included — they assert contradictory models for a sidecar-less `source='cli'` recovery row; flagged on both PRs for the author to reconcile.

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.259 — Release IA (stage-r7)

Two ship-ready @rodboev bug-fixes from today (the clean subset of the prioritized 6).

### Fixed
| Issue | Fix |
|-------|-----|
| nesquena#3582 | **Edge-TTS playback no longer has a ~31s delay / playback error** — `_handle_tts` streamed audio without `Content-Length` on an HTTP/1.0 server; audio is now buffered and sent with an exact `Content-Length`. |
| nesquena#3583 | **CLI-bridge message reconstruction strips orphaned `tool_calls`** (assistant `tool_calls` with no matching `tool` response, left by an aborted bridge) so the next request no longer 400s on strict providers. |
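
To illustrate the nesquena#3583 fix, here is a minimal sketch that assumes OpenAI-style message dicts (`role`, `content`, `tool_calls[].id`, `tool_call_id`). `strip_orphaned_tool_calls` is a hypothetical name, and dropping an assistant message left with neither content nor tool calls is an assumption.

```python
from typing import List


def strip_orphaned_tool_calls(messages: List[dict]) -> List[dict]:
    """Remove assistant tool_calls that never received a tool response."""
    answered = {
        m.get("tool_call_id") for m in messages if m.get("role") == "tool"
    }
    cleaned: List[dict] = []
    for message in messages:
        if message.get("role") == "assistant" and message.get("tool_calls"):
            kept = [tc for tc in message["tool_calls"] if tc.get("id") in answered]
            # Copy so the caller's message list is not mutated.
            message = {k: v for k, v in message.items() if k != "tool_calls"}
            if kept:
                message["tool_calls"] = kept
            elif not message.get("content"):
                # Nothing left to send for this turn (assumed behavior).
                continue
        cleaned.append(message)
    return cleaned
```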

### Held back from the 6-PR batch (Codex regression gate caught a real defect in each)
- **nesquena#3586/nesquena#3603** (`is_cli_session_row` reclassification) — CORE: messaging rows become non-CLI, but the sidebar open path only imports when `is_cli_session`, so opening a Discord/Telegram session shows a transient stub and the next send 404s on `/api/chat/start`. Needs a client import-gate fix + live verify. **Held.**
- **nesquena#3585/nesquena#3604** (cron-overflow) — removing `exclude_sources=None` also re-excludes `source='webui'` rows, dropping sidecarless WebUI session recovery from `/api/sessions`. Needs a separate webui recovery pass. **Held.**
- **nesquena#3587** (intermediate reasoning) — `on_interim_assistant` is suppressed upstream for contentless tool-call assistant messages (`run_agent.py:3834`), so advancing the reasoning index there never fires at tool-call boundaries → mis-attribution. **Held.**
- **nesquena#3538** (self-update stash-pop) — BRICK data-loss (`git reset --merge` + `git stash drop` discards user mods), still unaddressed. **Held.**

### Gate
- Full pytest suite: **7645 passed, 0 failed**
- ruff: CLEAN
- Codex (regression): 3 rounds — 4 PRs dropped/held for real regressions → **SAFE TO SHIP** on the clean 2

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
## Release v0.51.260 — Release IB (stage-r8)

Un-held safety fixes (author resolved my earlier hold findings; re-reviewed fresh) + a clean fix batch. 6 PRs.

### Fixed
| Issue/PR | Author | Fix |
|----------|--------|-----|
| nesquena#3535 (nesquena#3538) | @rodboev | **Self-update recovers from a stash-pop conflict without data loss.** Was a BRICK bug (`git reset --merge` + `git stash drop` discarded local mods while reporting success). Now keeps the stash, returns `ok:false` + "preserved in `stash@{0}`", no restart on conflict. *(was held — fix verified)* |
| nesquena#1909 s3 (nesquena#3562) | @rodboev | **Auth `Secure` cookie no longer locks out plain-HTTP LAN/Tailscale users.** Secure now keys only on real TLS evidence (env / TLS socket / opt-in `TRUST_FORWARDED_PROTO`); non-loopback plain-HTTP is no longer force-Secure. SameSite back to `Lax`. *(was held — fix verified)* |
| nesquena#2785 (nesquena#3559) | @franksong2702 | Clearer cron/gateway diagnostics for single-container Docker (gateway configured, no daemon → jobs silently don't fire). |
| nesquena#3555 | @lambyangzhao | Long TTS responses chunked at sentence boundaries (works around the browser's ~32K silent-truncation). |
| nesquena#3340 (nesquena#3342) | @rly09 | Persistent-state toast when a turn has saved memory or created/updated a skill. |
| nesquena#3533 | @franksong2702 | `/reload-mcp` marked `cli_only` so the WebUI doesn't dispatch it as an LLM prompt. |
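
The sentence-boundary chunking from nesquena#3555 might be sketched as below. The shipped code runs in the browser; this Python version only illustrates the idea, and `chunk_for_tts`, the regex, and the 30000-character default are assumptions chosen to stay under the ~32K limit mentioned above.

```python
import re
from typing import List


def chunk_for_tts(text: str, max_chars: int = 30000) -> List[str]:
    """Split text into ordered chunks that end on sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Chunks are emitted in source order, so playing them back to back preserves the original reading order.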

### Gate
- Full pytest suite: **7681 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): **SAFE TO SHIP** — confirmed the stash-conflict path never drops the stash / never restarts on conflict, auth Secure handles LAN-HTTP correctly with no header-forgery hole, `/reload-mcp` allowlisted, state-toast has a real backend writer + active-session guard, diagnostics leak no paths, TTS chunking preserves order.

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
Co-authored-by: lambyangzhao <lambyangzhao@users.noreply.github.com>
Co-authored-by: rly09 <rly09@users.noreply.github.com>
## Release v0.51.261 — Release IC (stage-r11)

Live Todos panel via an explicit `todo_state` SSE contract.

### Fixed
| Issue/PR | Author | Fix |
|----------|--------|-----|
| nesquena#3373 follow-up (nesquena#3454) | @v2psv | The Todos side panel now tracks `todo` tool state **live during an active run** instead of staying stale until settle / rolling back on a mid-stream reload. A dedicated `todo_state` SSE event sends a full, redacted, idempotent snapshot on todo-tool completion (no more truncated `tool_complete.preview`); the same `api.todo_state` parser feeds live + cold-load; live snapshots persist into INFLIGHT so reload/reattach restores the panel; cold-load vs INFLIGHT reconciled by timestamp (incl. the `coldTs===0` compressed-session edge); legacy reverse-scan kept as fallback for old servers. |

### Gate
- Full pytest suite: **7692 passed, 0 failed**
- ESLint: CLEAN · ruff: CLEAN · browser-smoke: CLEAN
- Codex (regression): **SAFE TO SHIP** — verified the new `todo_state` SSE handler composes with existing dispatch (no double-subscribe), INFLIGHT persistence is cleared on terminal/cancel (composes with discard_session + turn-journal), timestamp reconciliation can't let a stale local snapshot win, redaction holds, the legacy reverse-scan fallback still works with no double-render, and the `models.py` change is todo-scoped (no CLI-classification interaction).

Co-authored-by: v2psv <v2psv@users.noreply.github.com>
nesquena-hermes and others added 27 commits June 5, 2026 12:45
nesquena#3661) (nesquena#3679)

* fix(security): reject traversal-shaped job_id in cron output endpoint (nesquena#3661)

Co-authored-by: hinotoi-agent <paperlantern.agent@gmail.com>

* docs(changelog): v0.51.273 — Release IO (stage-p3b)

* test(cron): guard new cron-output tests with @requires_agent_modules (nesquena#3661)

The two new direct-handler tests import cron.jobs, which lives in hermes-agent
and is NOT installed in CI — without the marker they error/hang in the no-agent
CI shard (caught by the shard-0 timeout). Mirrors how the other 30 agent-dependent
tests skip cleanly when hermes-agent modules aren't importable.

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: hinotoi-agent <paperlantern.agent@gmail.com>
…ning nesquena#3630) (nesquena#3680)

* fix(security): harden routes file APIs against symlink swaps (nesquena#3630, nesquena#3450)

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.274 — Release IP (stage-p3c)

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Rod Boev <rod.boev@gmail.com>
…tion nesquena#3575) (nesquena#3681)

* refactor(routes): extract approval SSE state into api/route_approvals.py (nesquena#3575)

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.275 — Release IQ (stage-p3d)

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Rod Boev <rod.boev@gmail.com>
…ession titles nesquena#3542) (nesquena#3682)

* feat(sessions): skip adaptive auto-rename for manually-named sessions (nesquena#3542, nesquena#3230)

Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>

* docs(changelog): v0.51.276 — Release IR (stage-p3e)

* fix(sessions): clear manual_title lock on /api/session/clear (nesquena#3542)

Codex regression-gate follow-up: the clear endpoint reset the title to
Untitled directly, stranding manual_title=True so the reused session never
auto-named again. Route the reset through apply_session_title_rename (which
clears the lock for auto-labels) + add a behavioral and a static-guard test.

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
…n usage indicator nesquena#3663) (nesquena#3683)

* fix(ui): preserve resolved context window in usage indicator (nesquena#3663, nesquena#3185, nesquena#3660)

Co-authored-by: Frank Song <franksong2702@gmail.com>

* docs(changelog): v0.51.277 — Release IS (stage-p3f)

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Frank Song <franksong2702@gmail.com>
…esquena#3652) (nesquena#3684)

* fix(ui): repair inline PDF preview (blob module loader + CSP worker-src) (nesquena#3652, nesquena#3649)

Co-authored-by: sky <example@email.com>

* docs(changelog): v0.51.278 — Release IT (stage-p3g, nesquena#3652 only); widen CSP test window

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: sky <example@email.com>
…ng turn on mid-stream scroll nesquena#3665) (nesquena#3686)

* fix(streaming): preserve Activity + streaming turn when loading earlier messages mid-stream (nesquena#3665, nesquena#3346)

Co-authored-by: mysoul12138 <839465496@qq.com>

* docs(changelog): v0.51.279 — Release IU (stage-p3h)

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: mysoul12138 <839465496@qq.com>
fix nesquena#3647) (nesquena#3687)

* fix(updates): Windows self-update restart via detached Popen + bind-retry (os.execv doesn't replace proc on Windows) (nesquena#3647)

Co-authored-by: jja881 <jja881@users.noreply.github.com>

* docs(changelog): v0.51.280 — Release IV (stage-p3i)

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: jja881 <jja881@users.noreply.github.com>
…ronze skin nesquena#3602)

Adds the Verdigris dark-only appearance skin (emerald/forest-green + bronze-gold),
renamed from the contributor's 'Hermes Agent' to a descriptive material name per
maintainer naming convention. Registered across all 5 skin sites (config allowlist,
boot.js swatch, index.html FOUC map, i18n in 12 locales, scoped CSS palette) + test.
Also fixes the zeus i18n test (zeus is no longer the trailing skin token).

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Release v0.51.281 — Release IW (stage-verdigris — Verdigris emerald/bronze skin nesquena#3602)
Absorbs contributor PR nesquena#3544 (@rodboev, closes nesquena#3340) with two fixes:

1. DETECTION VOCAB (would never fire): the original gated on action names
   {save,create,update,upsert}, which don't match the real agent tool enums —
   memory.action is add|replace|remove, skill_manage.action is
   create|patch|edit|delete|write_file|remove_file. Split into per-tool
   predicates with the correct vocabularies: _isMemorySave gates memory on
   {add,replace}; _isSkillUpdate gates skill_manage on {create,patch,edit,
   write_file}. Deletions excluded so the saved/updated verbs stay accurate;
   running/errored excluded.

2. SNAPSHOT/RESTORE PERSISTENCE (Codex catch): classification lived only on the
   row._tcData JS property, which does NOT survive the outerHTML/innerHTML
   snapshot+restore the live tool-call group uses on session switch/restore —
   a restored memory/skill row would be re-counted as a generic tool and the
   suffix would silently vanish. buildToolCard now also stamps durable
   data-memory-save / data-skill-update attributes, and _syncToolCallGroupSummary
   counts them as a fallback when _tcData is absent. Verified live across a real
   outerHTML round-trip: label identical before/after.
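
The per-tool predicates from fix 1 can be sketched as follows. The shipped `_isMemorySave` / `_isSkillUpdate` are JavaScript; this Python version is an illustration only, the action vocabularies are the ones listed above, and the `status` values are hypothetical.

```python
MEMORY_SAVE_ACTIONS = {"add", "replace"}
SKILL_UPDATE_ACTIONS = {"create", "patch", "edit", "write_file"}
# Hypothetical status labels for calls that must not be counted.
EXCLUDED_STATUSES = {"running", "error"}


def _action(args) -> str:
    """Read args['action'] case-insensitively; tolerate null or odd args."""
    if not isinstance(args, dict):
        return ""
    return str(args.get("action") or "").strip().lower()


def is_memory_save(tool_name: str, args, status: str = "done") -> bool:
    return (
        tool_name == "memory"
        and status not in EXCLUDED_STATUSES
        and _action(args) in MEMORY_SAVE_ACTIONS
    )


def is_skill_update(tool_name: str, args, status: str = "done") -> bool:
    return (
        tool_name == "skill_manage"
        and status not in EXCLUDED_STATUSES
        and _action(args) in SKILL_UPDATE_ACTIONS
    )
```

Deletions (`remove`, `delete`, `remove_file`) fall outside both sets, so the "saved" and "updated" wording in the summary stays accurate.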

Replaces the PR's static source assertions with a node-driven behavioral test
(11 cases) covering the real action vocabularies, exclusions, case-insensitivity,
null-arg safety, and the durable-attribute persistence guard.

Co-authored-by: rodboev <rodboev@users.noreply.github.com>
Release v0.51.282 — Release IX (stage-3544 — surface memory/skill saves in Activity summary)
… auto-compaction nesquena#3512) (nesquena#3690)

* feat(composer): surface that messages queue during auto-compaction (nesquena#3512, nesquena#3079)

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.283 — Release IY (stage-w2)

* fix(composer): restore placeholder on ALL compaction-exit paths, not just clearCompressionUi (nesquena#3512)

Codex+Opus both caught: setCompressionUi(done) and the live-anchored SSE
window._compressionUi=null paths bypassed clearCompressionUi, leaving the
'will queue' placeholder stuck after compaction. Factor restore into
_restoreCompressionPlaceholder() + call from every compaction-exit path.

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Rod Boev <rod.boev@gmail.com>
…on-sessions toggle nesquena#3570 nesquena#3514) (nesquena#3692)

* feat(sidebar): add show_cron_sessions toggle to surface cron sessions (nesquena#3514, nesquena#2841)

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* feat(sidebar): add manual session status labels (nesquena#3570)

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.284 — Release IZ (stage-w4)

* fix(settings): persist show_cron_sessions in the explicit Save Settings path too (nesquena#3514)

Codex regression-gate follow-up: the autosave path (_preferencesPayloadFromUi)
included show_cron_sessions but the explicit saveSettings() button path read/saved
show_cli_sessions and dropped the cron checkbox — clicking Save Settings silently
omitted it. Read settingsShowCronSessions + add body.show_cron_sessions (gated on
CLI sessions, mirroring autosave).

* fix(settings): gate show_cron_sessions identically in BOTH save paths (nesquena#3514)

Codex round-2: my saveSettings() gate exposed that the autosave path
(_preferencesPayloadFromUi) posted the raw cron checkbox state ungated, so
show_cli_sessions=false + show_cron_sessions=true could persist via autosave.
Gate autosave on showCliCb too; update the regression test to assert both
paths gate on settingsShowCliSessions.
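
Sketch of the gating rule, modelled in Python (the shipped code is JavaScript in the settings UI). Routing both save paths through one builder is an assumption made here for illustration; the commit only states that both paths now apply the same gate. `gate_cron_sessions` and `build_preferences_payload` are hypothetical names.

```python
def gate_cron_sessions(show_cli: bool, show_cron_checked: bool) -> bool:
    """Cron sessions may only be shown when CLI sessions are shown."""
    return bool(show_cli and show_cron_checked)


def build_preferences_payload(show_cli: bool, show_cron_checked: bool) -> dict:
    """Hypothetical shared builder so autosave and explicit save cannot drift."""
    return {
        "show_cli_sessions": bool(show_cli),
        "show_cron_sessions": gate_cron_sessions(show_cli, show_cron_checked),
    }


# Both save paths use the same builder, so the invalid combination
# (CLI hidden, cron shown) is never posted by either one.
autosave_body = build_preferences_payload(show_cli=False, show_cron_checked=True)
explicit_body = build_preferences_payload(show_cli=False, show_cron_checked=True)
```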

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Rod Boev <rod.boev@gmail.com>
…tity race fix nesquena#3654) (nesquena#3693)

* Fix update reload readiness race — poll /health server identity before reload (nesquena#3654)

Replaces the raw-uptime comparison (couldn't distinguish a fresh old process
from the restarted one) with a stable server_started_at identity read before
the update POST; reloads only when the identity changes. Both the force-update
and regular apply paths read + pass the baseline. (nesquena#874, nesquena#3654)
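
Sketch of the identity-based readiness check, in Python rather than the frontend's JavaScript. `wait_for_restart` and `read_identity` are hypothetical names; `read_identity` stands in for a GET /health call that returns the server_started_at value, or None while the server is unreachable.

```python
import time
from typing import Callable, Optional


def wait_for_restart(
    read_identity: Callable[[], Optional[str]],
    baseline: Optional[str],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the server reports an identity different from baseline."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        current = read_identity()
        # An unreachable server (None) and the old process (same identity)
        # both mean "not ready yet"; only a changed identity is a restart.
        if current is not None and current != baseline:
            return True
        sleep(interval_s)
    return False
```

Unlike an uptime comparison, a freshly started old process still reports the baseline identity, so it cannot trigger the reload.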

Co-authored-by: Frank Song <franksong2702@gmail.com>

* docs(changelog): v0.51.285 — Release JA (stage-r19)

---------

Co-authored-by: Frank Song <franksong2702@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
…squena#3067) (nesquena#3694)

* feat: allow sidebar tab reordering via drag (nesquena#3067)

Drag-reorder for sidebar tab chips in Settings, persisted via a sanitized
tab_order setting (collapses duplicates, rejects chat/settings, strips
non-strings).
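
A minimal sketch of the sanitization rules listed above, assuming the setting arrives as a JSON list. `sanitize_tab_order` is a hypothetical name, and the tab ids other than chat/settings in the usage line are made up.

```python
RESERVED_TABS = {"chat", "settings"}  # rejected, per the commit message


def sanitize_tab_order(raw) -> list:
    """Keep only unique, non-reserved string tab ids, in first-seen order."""
    if not isinstance(raw, (list, tuple)):
        return []
    seen = set()
    cleaned = []
    for item in raw:
        if not isinstance(item, str):
            continue  # strip non-strings
        if item in RESERVED_TABS or item in seen:
            continue  # reject reserved tabs, collapse duplicates
        seen.add(item)
        cleaned.append(item)
    return cleaned


example = sanitize_tab_order(["tasks", "chat", 3, "tasks", None, "skills"])
```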

Co-authored-by: ai-ag2026 <261867348+ai-ag2026@users.noreply.github.com>

* docs(changelog): v0.51.286 — Release JB (stage-r21)

---------

Co-authored-by: ai-ag2026 <261867348+ai-ag2026@users.noreply.github.com>
Co-authored-by: nesquena-hermes <[email protected]>
…ion nesquena#3653 + worker-profile picker hiding nesquena#3662) (nesquena#3695)

* feat(sessions): classify WeCom gateway sessions as messaging (nesquena#3653)

Co-authored-by: Frank Song <franksong2702@gmail.com>

* feat(profiles): hide worker profiles from chat picker (nesquena#3662)

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.287 — Release JC (stage-r22)

---------

Co-authored-by: Frank Song <franksong2702@gmail.com>
Co-authored-by: Rod Boev <rod.boev@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
…esquena#3515) (nesquena#3697)

* feat(approval): make the approval card collapsible (nesquena#3515)

Adds a collapse toggle to the approval card header so users can shrink it
to a thin header strip and keep the tool-call rationale/transcript above
readable. Full ARIA (aria-expanded/controls/label), chevron swap, and
transcript reflow that preserves near-bottom scroll. Closes nesquena#3007.

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.288 — Release JD (stage-r24)

* fix(approval): clear collapsed state for a distinct queued approval (nesquena#3515)

Codex regression-gate finding: showApprovalCard's sameApproval check didn't
include approval_id and didn't clear .collapsed in the !sameApproval branch, so
a NEW/parallel approval arriving while the card was already collapsed could
render collapsed with its command + action buttons hidden. Add approval_id to
the signature; clear .collapsed for a distinct approval before syncing. +2 regression tests.
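
Python model of the fix (the real card is DOM/JavaScript). The `command` field and the `ApprovalCard` class are assumptions for illustration; only approval_id and the collapsed state come from the commit.

```python
def approval_signature(approval: dict) -> tuple:
    """Identity of an approval; includes approval_id so parallel ones differ."""
    return (approval.get("approval_id"), approval.get("command"))


class ApprovalCard:
    def __init__(self):
        self.signature = None
        self.collapsed = False

    def show(self, approval: dict) -> None:
        sig = approval_signature(approval)
        if sig != self.signature:
            # A distinct approval must never start out collapsed, otherwise
            # its command and action buttons would be hidden.
            self.collapsed = False
            self.signature = sig
        # Same approval re-shown: leave the user's collapsed choice alone.
```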

---------

Co-authored-by: Rod Boev <rod.boev@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
nesquena#3696) + scope-undef prevention gate (nesquena#3698)

* fix(sidebar): hoist _sessionAttentionState to top-level scope (nesquena#3696)

_sessionAttentionState was declared inside renderSessionListFromCache() and
relied on function hoisting, but the top-level function
_sidebarRowHasVisibleMessages (reached via renderSessionListFromCache ->
_partitionSidebarSessionRows) called it bare. Hoisting is scoped to the
enclosing function, so every sidebar cache-render threw 'ReferenceError:
_sessionAttentionState is not defined' and the session list went blank.
Regressed in nesquena#3672 (v0.51.269) when _sidebarRowHasVisibleMessages was
extracted to top level.

Fix: move _sessionAttentionState to top-level scope (it is pure — only uses its
arg plus the i18n global t), so both the visibility predicate and the nested
per-row renderer can reach it.

Prevention (the durable half): add scripts/scope_undef_gate.py — models the
classic-<script> shared global scope (union of all static files' top-level
symbols) and runs ESLint no-undef per file, flagging a function defined nested
but called from a sibling scope. Wired into CI (.github/workflows/tests.yml lint
job) alongside the existing no-const-assign runtime gate, plus an in-suite test
(test_static_js_scope_undef.py) and a focused structural regression test
(test_issue3696_session_attention_scope.py). RED/GREEN-validated against the
broken tree.

* fix(streaming): thread source param into stale-stream bailout; tighten scope gate

Opus review of nesquena#3698 found the new scope_undef_gate's 'source' allowlist entry
was masking a real same-class bug: _bailOutOfTerminalEventsFromStaleStream
(declared inside attachLiveStream, params activeSid/streamId/uploaded/options)
called _closeSource(source) against a 'source' not in its lexical scope. All 5
call sites are inside _wireSSE(source), but JS scope is lexical not dynamic, so
the helper would throw ReferenceError: source is not defined on the stale-stream
terminal-event path (user back in an active session whose old stream finalizes
late).

Fix: thread source as an explicit parameter (declaration + all 5 call sites),
the same make-the-dependency-explicit fix as nesquena#3696, and REMOVE the 'source'
allowlist entry so the gate keeps checking that name (it now passes because
the bug is fixed, not because it's allowlisted). Also documented the
false-negative classes from Opus's review in the gate docstring (name-collision
shadowing, destructuring-regex gap, exposure escape hatches, name-keyed
allowlist) and added a focused regression test.

This is the prevention gate catching a real latent bug on its first outing.
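
Python analogue of the stale-stream bug, for readers who want to see the scoping rule in isolation. Python resolves names lexically just as JavaScript does, so the same mistake raises NameError here where the browser raised ReferenceError. All function names are simplified stand-ins, not the real helpers.

```python
def attach_live_stream_buggy():
    closed = []

    def close_source(src):
        closed.append(src)

    def bail_out(active_sid):
        # BUG: 'source' is not a parameter here and is not defined in any
        # enclosing scope, so this lookup fails at call time.
        close_source(source)

    def wire_sse(source):
        # 'source' is local to wire_sse. Calling bail_out from here does not
        # make it visible inside bail_out: scope is lexical, not dynamic.
        bail_out("sid-1")

    wire_sse("stream-A")
    return closed


def attach_live_stream_fixed():
    closed = []

    def close_source(src):
        closed.append(src)

    def bail_out(source, active_sid):
        # Dependency is now explicit, so no outer lookup is needed.
        close_source(source)

    def wire_sse(source):
        bail_out(source, "sid-1")

    wire_sse("stream-A")
    return closed
```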

---------

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
…hotfix + scope gate) (nesquena#3699)

The v0.51.289 tag ships the nesquena#3696 sidebar-crash hotfix + the scope_undef_gate
(merged in nesquena#3698, commit da5bf69). This stamps the CHANGELOG [Unreleased]
section to the v0.51.289 release header. Docs-only.

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
…olution nesquena#3448 fixes nesquena#3405) (nesquena#3703)

* fix(nesquena#3405): respect profile provider/model in session resolution (nesquena#3448)

Profile-bound sessions now resolve their provider/model from the profile
instead of silently falling back to the global active provider — fixes wrong
credentials/billing and silent context truncation. Repairs stale models under
the profile provider (incl. the openai-codex + openai/ slash-model case) while
preserving native slash IDs on openrouter/custom.

Co-authored-by: Rod Boev <rod.boev@gmail.com>

* docs(changelog): v0.51.290 — Release JF (stage-s1, nesquena#3448 fixes nesquena#3405)

---------

Co-authored-by: Rod Boev <rod.boev@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
… on switch-away nesquena#3668) (nesquena#3704)

* fix(nesquena#3668): snapshot live turn before stream teardown on session switch

The 'stays gone' variant: switching away from a streaming session during a quiet
window (mid tool-exec / silent thinking, between content SSE events) left a
stale/absent live-turn snapshot, so restoreLiveTurnHtmlForSession() failed on
switch-back and loadSession()'s fallback rebuilt with an empty appendThinking(),
permanently losing streamed thinking/tool content (only the elapsed clock
survived). closeLiveStream() now snapshots the live-turn DOM via
snapshotLiveTurnHtmlForSession(sessionId) BEFORE closing the source + tearing
down LIVE_STREAMS, so switch-back always restores the exact state shown at
switch-away. + regression test asserting snapshot precedes teardown.
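
The ordering constraint, modelled in Python with plain dicts standing in for the DOM and for LIVE_STREAMS. The snapshot store and the lower-case function names are hypothetical.

```python
LIVE_STREAMS = {}    # session_id -> live stream state (stand-in for the real map)
LIVE_SNAPSHOTS = {}  # session_id -> last saved live-turn HTML (hypothetical store)


def snapshot_live_turn(session_id: str) -> None:
    stream = LIVE_STREAMS.get(session_id)
    if stream is not None:
        LIVE_SNAPSHOTS[session_id] = stream["html"]


def close_live_stream(session_id: str) -> None:
    # Snapshot FIRST: once the stream is torn down, the live state that
    # the snapshot needs is no longer reachable.
    snapshot_live_turn(session_id)
    LIVE_STREAMS.pop(session_id, None)


def restore_live_turn(session_id: str):
    return LIVE_SNAPSHOTS.get(session_id)
```

Swapping the two lines in close_live_stream reproduces the bug: the snapshot finds no stream and the restore returns nothing.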

* docs(changelog): v0.51.291 — Release JG (stage-s2, nesquena#3668)

---------

Co-authored-by: nesquena-hermes <[email protected]>
…s surface as errors nesquena#3316 fixes nesquena#3315) (nesquena#3705)

* fix(nesquena#3315): surface compression-exhausted/no-final-answer turns as errors (nesquena#3316)

When Hermes Agent exhausts context compression in a long tool-heavy turn, the
streamed result can end on a tool result / assistant(tool_calls) turn with no
final assistant answer. WebUI was finalizing that as a completed response.
Now _session_lacks_final_assistant_answer() + _agent_result_terminal_failure()
classify these as terminal failures and surface an apperror instead. The
compression session-id migration + pre-compression snapshot now run BEFORE the
terminal-failure return (ordering bug from the prior hold) so state stays
consistent when exhaustion fires after the agent rotated session_id.
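
A guess at the kind of check _session_lacks_final_assistant_answer() performs, written as a standalone sketch. The message shape (role / content / tool_calls keys) and the function name `lacks_final_answer` are assumptions, not the shipped implementation.

```python
def lacks_final_answer(messages: list) -> bool:
    """True when the transcript does not end on a visible assistant answer."""
    if not messages:
        return True
    last = messages[-1]
    if last.get("role") != "assistant":
        return True  # e.g. the turn ended on a tool result
    if last.get("tool_calls"):
        return True  # assistant turn that only requests more tool calls
    content = last.get("content")
    # An assistant message with empty or non-text content is not an answer.
    return not (isinstance(content, str) and content.strip())
```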

Co-authored-by: Frank Song <franksong2702@gmail.com>

* docs(changelog): v0.51.292 — Release JH (stage-s4, nesquena#3316 fixes nesquena#3315)

---------

Co-authored-by: Frank Song <franksong2702@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
…nders twice nesquena#3709) (nesquena#3715)

* fix(nesquena#3709): thinking card no longer renders twice (in Activity + below answer)

The nesquena#3592 inline-render branch (v0.51.258) emitted a thinking card for a
thinking-only message even when a sibling tool-message in the same turn already
built an Activity group carrying that turn's thinking — so the card showed twice,
the second one stranded below the answer + 'Done in …' footer (insertAdjacentHTML
'beforeend' on a segment that already had body+footer).

Fix (keeps nesquena#3592, does NOT revert it):
- A1: precompute turnsWithActivityGroup (turns whose segments have tool cards);
  the inline branch only renders when the anchor turn is NOT in that set.
- A2: when it does render inline, insert 'beforebegin' the .msg-body/.msg-foot so
  the card sits above the answer, not orphaned below the footer.
- B: strip thinking against the TURN's combined visible answer
  (_turnVisibleTextByRawIdx), so a trailing thinking-only message that echoes the
  answer gets de-duped even though its own body is empty.

Live-verified in browser: nesquena#3709 repro (tool+trailing-thinking) → exactly 1 card in
Activity, above footer; nesquena#3592 repro (thinking-only) → exactly 1 inline card, not
buried in a collapsed group. + regression test tests/test_issue3709_*.

Supersedes nesquena#3708 (which deleted the inline branch outright, re-breaking nesquena#3592).

* fix(nesquena#3709): merge suppressed sibling thinking into the Activity group (Codex re-gate)

Codex caught a content-loss edge in the first cut: when A1 suppresses a
thinking-only sibling's inline card (its turn has an Activity group), the group
only rendered assistantThinking.get(aIdx) for the TOOL message — so a sibling
with DISTINCT reasoning was neither inline nor in the group → dropped.

Fix: aggregate all of a turn's thinking (turnThinkingParts, de-duped, index
order) and render that merged text once per turn in the Activity group
(_renderedTurnThinking guard). Live-verified: tool-thinking A + distinct
sibling-thinking B → 1 merged node carrying both, no loss. + regression test.
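
Sketch of the merge step in Python (the real code is JavaScript). The input is assumed to map message index to that message's thinking text; joining with a blank line is an illustrative choice, and `merge_turn_thinking` is a hypothetical name.

```python
def merge_turn_thinking(parts_by_index: dict) -> str:
    """Combine a turn's thinking in index order, dropping exact repeats."""
    seen = set()
    merged = []
    for idx in sorted(parts_by_index):
        text = (parts_by_index[idx] or "").strip()
        if not text or text in seen:
            continue  # skip empty parts and duplicates
        seen.add(text)
        merged.append(text)
    return "\n\n".join(merged)


# Tool-message thinking "A" at index 3, a duplicate at 4, distinct sibling "B" at 5.
merged_example = merge_turn_thinking({5: "B", 3: "A", 4: "A"})
```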

* fix(nesquena#3709): shared anchor resolver so inline-suppression & group placement agree (Codex re-gate #2)

Codex caught a fallback-anchor mismatch: turnsWithActivityGroup was populated only
from assistantSegments.get(tcIdx) (direct segment), but the group-render path falls
back to a nearby earlier segment when a tool's assistant_msg_idx has no directly
rendered segment (legacy/rebased). So a fallback-anchored group's turn wasn't in
turnsWithActivityGroup → the sibling rendered inline AND the group rendered → dup
again. Fix: one shared _anchorRowForActivityIdx(aIdx) helper (direct-or-fallback)
used by the precompute, the inline branch, and the group render — they now agree.
Live-verified all three repros still pass.
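
Python model of why one resolver matters. "Nearest earlier rendered segment" is an assumed reading of the fallback; the data shapes and function names are hypothetical.

```python
def anchor_for_activity_idx(a_idx: int, rendered: dict):
    """Direct segment if rendered, else the nearest earlier rendered one."""
    if a_idx in rendered:
        return rendered[a_idx]
    earlier = [i for i in rendered if i < a_idx]
    return rendered[max(earlier)] if earlier else None


def turns_with_activity_group(tool_indices, rendered: dict) -> set:
    """Precompute using the SAME resolver the group render uses."""
    turns = set()
    for idx in tool_indices:
        turn = anchor_for_activity_idx(idx, rendered)
        if turn is not None:
            turns.add(turn)
    return turns


def should_render_inline(thinking_idx: int, tool_indices, rendered: dict) -> bool:
    turn = anchor_for_activity_idx(thinking_idx, rendered)
    return turn not in turns_with_activity_group(tool_indices, rendered)
```

With a direct-only precompute, a tool at an unrendered index would leave its turn out of the set and the sibling would render twice.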

* test(nesquena#3709): update test_compact_activity assertion to mergedThinking var

The brittle source-scan asserted _thinkingActivityNode(thinkingText, false) — the
nesquena#3709 fix renders the turn's MERGED thinking via _thinkingActivityNode(mergedThinking,
false) into the same Activity body. Intent (settled thinking renders inside the
Activity disclosure alongside tools) unchanged; only the source variable. Updated to
assert the new variable, kept all intent assertions.

---------

Co-authored-by: nesquena-hermes <[email protected]>
…ive-to-final redesign nesquena#3401 + 4 deep-review fixes) (nesquena#3741)

* Harden interrupted recovery control filtering

* Redesign live-to-final assistant replies

* Fix live activity anchor test fixture

* Fix CI lint issues for live reply tests

* Strengthen live progress prompt contract

* Recover PR nesquena#3401 refresh on origin/master

* Repair live-to-final refresh regressions

* Fix live worklog refresh regressions

* Show live footer timer on initial stream start

* Restore live stream shell after reload

* Preserve per-frame live SSE replay cursors

* Preserve reasoning as Worklog Thinking cards

* Quiet Worklog Thinking card styling

* Align Worklog Thinking card styling

* Scope live Worklog Thinking cards by segment

* Suppress exact duplicate settled Thinking

* Close nesquena#3401 merge review test gaps

* fix(nesquena#3401): resolve 4 deep-review regressions (inline-think, reconnect-dup, neon skin, busy-gate worklog)

Deep review (Codex diff-vs-master + live-browser drive) of the live-to-final refactor
surfaced 4 regressions vs master that the rewritten suite no longer guarded:

1. Inline <think>…</think>answer reasoning vanished — _assistantReasoningPayloadText
   used $-anchored regexes so a leading think block + visible answer extracted nothing
   and the Thinking card never rendered. Removed the 3 $ anchors to match the
   (non-anchored) display stripper. Live: inline-think thinking-only turn now renders.
2. (CORE) reconnect/reload duplicated the live reply — _rememberRunJournalCursor advanced
   a closure-local seq but never wrote INFLIGHT[activeSid].lastRunJournalSeq, so a reload
   replayed the journal from after_seq=0 over restored lastAssistantText. Now mirrors the
   cursor onto INFLIGHT + schedules a throttled persist.
3. Neon skin silently broke — PR deleted the :root[data-skin="neon"] CSS but left Neon in
   the picker. Restored the neon CSS block from master.
4. Settled tool-worklog rebuild gated purely on !S.busy — dropped every prior settled
   turn's worklog when renderMessages re-ran during an active stream (switch-back to an
   in-progress session). Restored master's !S.busy || (S.toolCalls && S.toolCalls.length).
   Live: busy re-render now preserves tool cards (4→4, was 4→0).
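
Item 1 is easy to reproduce outside the browser. The patterns here are illustrative Python regexes, not the ones in _assistantReasoningPayloadText; they only show how a trailing $ anchor defeats extraction once a visible answer follows the think block.

```python
import re

# Hypothetical patterns: identical except for the end-of-string anchor.
ANCHORED = re.compile(r"<think>(.*?)</think>\s*$", re.DOTALL)
UNANCHORED = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_reasoning(text: str, pattern: re.Pattern):
    match = pattern.search(text)
    return match.group(1) if match else None


with_answer = "<think>check the cache first</think>The cache is stale."
thinking_only = "<think>only reasoning</think>"
```

The anchored pattern still handles a thinking-only message, which is why the regression was only visible when an answer followed the block.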

Live-verified all 4 + confirmed nesquena#3709/nesquena#3592 invariants still hold (1 thinking card, none
below footer; distinct siblings preserved). + tests/test_issue3401_deep_review_fixes.py (7).

* test(nesquena#3401): realign 3 stale source-shape assertions to the deep-review fixes

Fix commit changed two source literals that existing stage tests scanned for:
- test_live_activity_timeline.py (x2): split anchor 'if(!S.busy){' → the restored
  'if(!S.busy || (S.toolCalls&&S.toolCalls.length)){' guard (fix 4).
- test_run_journal_frontend_static.py: 'after_seq=0' not in source — fix 2's comment
  contained that literal; rephrased the comment to 'the zero floor (after_seq of 0)'.
Intent of all three assertions unchanged; only the matched string updated. No code
behavior change.

* docs(changelog): v0.51.294 — Release JJ (stage-3401, nesquena#3401 live-to-final redesign)

---------

Co-authored-by: Frank Song <franksong2702@gmail.com>
Co-authored-by: Nathan-Hermes <nesquena-hermes@users.noreply.github.com>
Co-authored-by: nesquena-hermes <[email protected]>
 + session-status revert nesquena#3742) (nesquena#3743)

* fix: honor explicit model pick, suppress silent revert on cross-family selection (nesquena#3737)

When a user changes the model in the composer dropdown and sends,
_resolve_compatible_session_model_state previously had no way to
distinguish an explicit user pick from stale session state. The
profile-aware branch (v0.51.290, PR nesquena#3448) and the legacy block
both rewrote bare cross-family models to the profile default, and
the client unconditionally applied effective_model — silently
discarding the user's choice.

Backend: accept explicit_model_pick flag (default False) on
_resolve_compatible_session_model_state. Guard both the
profile-aware branch (routes.py:2024) and the legacy block
(routes.py:2124) to skip cross-provider normalization when set.
_handle_chat_start extracts the flag and passes it through.

Frontend: consult _readPendingSessionModel (sessionStorage, 10-min
window) to detect explicit picks and include the flag. Add a toast
as defense-in-depth when the server still returns effective_model.
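
A heavily simplified sketch of the guard, not the real _resolve_compatible_session_model_state. The parameters, the "provider differs" test for cross-family, and the model/provider names in the test are all assumptions; only the explicit_model_pick flag and its default come from the commit.

```python
def resolve_model(
    requested_model: str,
    requested_provider: str,
    profile_provider: str,
    profile_default_model: str,
    explicit_model_pick: bool = False,
) -> str:
    """Return the model the session should actually use."""
    cross_family = requested_provider != profile_provider
    if cross_family and not explicit_model_pick:
        # Stale session state: normalize back to the profile default.
        return profile_default_model
    # Same family, or the user explicitly chose this model: honor it.
    return requested_model
```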

Closes nesquena#3737

* fix: tighten explicit-pick detection and add regression tests (nesquena#3737)

Greptile P2-1: compare model_provider in pending pick detection,
not just model name, to avoid false-positive flag when the
session provider changes between pick and send.

Greptile P2-2: only show the defense-in-depth toast when an
explicit pick was actually overridden — stale-session
normalizations are expected behavior and should be silent.

Add two regression tests for the profile-branch guard:
- explicit_model_pick=True → cross-family model survives
- explicit_model_pick=False → existing normalization preserved

* revert(sidebar): remove manual session status labels (nesquena#3570)

The manual per-session status labels (Todo / In Progress / Done) added in
v0.51.284 (nesquena#3570) stored state only in browser localStorage keyed by session
id, with no server-side backing — so labels did not persist across browsers
or devices (a user who labeled sessions on one machine saw none after moving
to a laptop). They also rendered as three flat top-level entries in the
session context menu, crowding the root menu.

Per maintainer decision, remove the feature entirely for now. It can be
reintroduced later with proper server-side persistence and a less intrusive
menu treatment.

Removes:
- JS state/cycle helpers + SESSION_MANUAL_STATUS_KEY (static/sessions.js)
- context-menu status entries + sidebar status badge render
- .session-manual-status* CSS (static/style.css)
- session_status_* locale strings across all locales (static/i18n.js)

Full suite: 8084 passed, 0 failed. ESLint runtime gate: clean.

reverts nesquena#3570

* fix(nesquena#3737): keep explicit-pick marker until send consumes it (Codex catch)

Codex found the explicit_model_pick flag never engaged in the normal flow: boot.js
modelSelect.onchange cleared the pending-pick marker right after /api/session/update,
so by the time send() ran _readPendingSessionModel returned null, _explicitPick was
false, and the server's profile-provider branch still reverted the cross-family pick
(the exact nesquena#3737 bug). The flag only worked in the rare race where send beat the
session-update round-trip.

Fix (Codex prescription): do NOT clear the marker in onchange; clear it in send()
immediately after reading a matching pending pick, so it's consumed for that send only.
onchange still RECORDS the pick (_rememberPendingSessionModel) — only the premature
clear is removed.
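
Python model of the corrected marker lifecycle (the real store is sessionStorage in the browser). `PendingModelPick`, `on_model_change`, and `send` are hypothetical stand-ins; the 10-minute window and the provider comparison come from the earlier commits in this PR.

```python
import time

PENDING_TTL_S = 10 * 60  # 10-minute window


class PendingModelPick:
    def __init__(self, now=time.time):
        self._now = now
        self._store = {}

    def remember(self, session_id, model, provider):
        self._store[session_id] = {"model": model, "provider": provider, "at": self._now()}

    def read(self, session_id):
        entry = self._store.get(session_id)
        if entry is None or self._now() - entry["at"] > PENDING_TTL_S:
            self._store.pop(session_id, None)  # drop missing or expired marker
            return None
        return entry

    def clear(self, session_id):
        self._store.pop(session_id, None)


def on_model_change(picks, session_id, model, provider):
    picks.remember(session_id, model, provider)  # record only, never clear here


def send(picks, session_id, model, provider):
    pending = picks.read(session_id)
    explicit = bool(pending and pending["model"] == model and pending["provider"] == provider)
    body = {"model": model}
    if explicit:
        picks.clear(session_id)  # consumed for this send only
        body["explicit_model_pick"] = True
    return body
```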

* test(nesquena#3737): lock client clear-timing wiring (onchange records, send consumes)

Static source guards for the Codex clear-timing fix: onchange must record the
pending pick and NOT clear it post-session-update; send() must consume (clear) it
only after reading a matching _explicitPick, and send the flag only when truthy.
Complements the author's resolver-level tests in test_provider_mismatch.py.

* test(nesquena#3737): realign refresh-persistence test to the moved pending-pick clear

The Codex clear-timing fix moved the pending-pick clear out of modelSelect.onchange
into send() (consume-on-send). test_model_selection_records_pending_state_before_async_session_update
asserted the OLD onchange-clears behavior (assert _clearPendingSessionModel in body).
Updated to assert the NEW correct behavior (onchange must NOT clear it — it survives to
send). The test's core refresh-survives invariant (marker recorded before the async
session-update; reapplied on load) is unchanged and still passes; only the stale
clear-location assertion is flipped. This does not bless a regression: the refresh-survives
feature is intact and the marker lifecycle is now more correct.

---------

Co-authored-by: John Doe <johndoe@example.com>
Co-authored-by: nesquena-hermes <[email protected]>
Catch up the fork from v0.51.232 to upstream's tip v0.51.295 in a single
merge. Conflict resolution summary:

- Additive/union: CHANGELOG.md, README.md, api/config.py, server.py (Windows
  bind retry), static/boot.js, static/style.css (neon+verdigris skins),
  static/i18n.js (verdigris in cmd_theme, 12 locales), api/routes.py (anchored
  fd helpers, profile-scoped rename event, POSIX path fixes), static/sessions.js
  (_profileMatchesActiveProfile, named-custom fallback), test_passkey_auth.py
  (kept fork's importorskip + upstream's SimpleNamespace).
- .github/workflows/* reset to our master (fork runs its own CI; GITHUB_TOKEN
  cannot push workflow changes).
- Update banner: kept the fork's multi-row webui/agent banner and
  applyUpdates(target) design (upstream's single-banner redesign was previously
  rejected by the fork and is asserted against by the fork's own tests). Dropped
  upstream-added tests/test_issue3597_update_banner_position.py (asserts the
  rejected single-banner position); reset test_update_apply_ui.py and
  test_update_banner_fixes.py to the fork's versions.
- ui.js: spliced the fork's applyUpdates(target); profile chip now reads
  S.activeProfile inline in both syncTopbar branches (nesquena#3635) while keeping the
  fork's titlebar-chip sync; folded in upstream's topbar message-count helpers;
  ported upstream's update what's-new/summary helper definitions the fork already
  referenced (fixes latent undefined refs now caught by the new scope gate).

Full suite green (8151 passed); remaining local failures are environmental only
(sandbox commit-signing + running as root), verified to pass otherwise.
@Du7chManiac Du7chManiac merged commit 3f0f5b2 into master Jun 6, 2026
3 checks passed
@Du7chManiac Du7chManiac deleted the sync/upstream-v0.51.295 branch June 7, 2026 17:15