sync: upstream v0.51.230 (needs review) #47

Merged
Du7chManiac merged 184 commits into master from sync/upstream-v0.51.230
Jun 3, 2026

github-actions bot commented Jun 3, 2026

Auto-generated by .github/workflows/sync-upstream.yml.

Merging upstream tag v0.51.230 from nesquena/hermes-webui into master.

Needs human review.

  • Conflicts in: CHANGELOG.md, README.md, static/i18n.js, static/panels.js, static/style.css, static/ui.js

See the "Syncing with upstream" section of CLAUDE.md for the conflict playbook.

⚠️ Dropped upstream workflow changes

The default GITHUB_TOKEN cannot push changes under .github/workflows/, and this
fork runs its own CI rather than tracking upstream's. These upstream edits were
reverted to our master version and excluded from this PR:

  • .github/workflows/tests.yml

If any of these changes are wanted, port them in a separate PR.

Validation

  • Merge: conflict
  • Tests: skipped (0 = pass; skipped = not run because merge had conflicts)

Next step

  • Squash-merge is forbidden — keep the merge commit so future syncs see the correct merge-base.
  • After merge, the next workflow run will pick up the tag after v0.51.230.

PINKIIILQWQ and others added 30 commits May 31, 2026 18:26
Shift from backend mtime-based detection to frontend SSE deduplication.

Backend: Revert gateway_watcher.py to original pure hash-based polling.
Remove _get_db_mtime, _detect_gateway_restart, and mtime tracking.
This is a no-op in behavior — the original was already hash-only.

Frontend: Add deduplication at the SSE event handler level.
- _gatewaySessionSnapshotKey(sessions): deterministic key from
  session_id + updated_at + message_count (same fields as backend hash)
- _isGatewaySessionForSnapshot(session): classify non-webui sessions
- _isDuplicateGatewaySessionSnapshot(sessions): compare SSE payload
  against current _allSessions, filtered to gateway subset
- SSE sessions_changed handler wraps renderSessionList() in dedupe:
  identical data → skip refresh

This directly addresses the real root cause: the SSE reconnect snapshot
(routes.py:7735) unconditionally pushes an initial snapshot, and the
frontend always re-renders. After this fix, a reconnect with unchanged
session data is correctly detected and the redundant redraw is skipped.

Previously submitted as nesquena#3259 (backend mtime approach, now closed per
maintainer review).
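
The dedupe described above lives in the frontend JavaScript; the following is only a Python sketch of the same comparison, for illustration. The `source` field used to classify gateway sessions and all function names are assumptions, not the project's actual identifiers.

```python
from typing import Iterable, List, Mapping


def snapshot_key(sessions: Iterable[Mapping]) -> str:
    """Build a deterministic key from session_id, updated_at and message_count."""
    parts = sorted(
        f"{s.get('session_id')}:{s.get('updated_at')}:{s.get('message_count')}"
        for s in sessions
    )
    return "|".join(parts)


def is_gateway_session(session: Mapping) -> bool:
    # Assumption: a 'source' field separates WebUI sessions from gateway ones.
    return session.get("source") != "webui"


def is_duplicate_snapshot(incoming: List[Mapping], current: List[Mapping]) -> bool:
    """True when the pushed snapshot matches the gateway subset already held."""
    current_gateway = [s for s in current if is_gateway_session(s)]
    return snapshot_key(incoming) == snapshot_key(current_gateway)
```

A handler built on this would call the render step only when `is_duplicate_snapshot(...)` returns False, so a reconnect that replays unchanged data causes no redraw.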
…3275)

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
v0.51.189: ruff lint gate + SSE refresh dedupe + tooltip i18n (stage-batch1)
…ured banner on first deploy (nesquena#3194)

Two urgent breaking bugs that had no PR, combined into one hotfix.

nesquena#2905 (data-loss-class): v0.51.134 moved the Windows default Hermes home
from %USERPROFILE%\.hermes to %LOCALAPPDATA%\hermes (PR nesquena#2897) with no
migration, so upgrading Windows users opened the app to empty
sessions/pins/settings (data intact on disk, at an address the new build
no longer read). _platform_default_hermes_home() now prefers the populated
legacy home ONLY when the new location is not yet established —
non-destructive (no file moves) and self-healing on next launch.
profiles._resolve_base_hermes_home() delegates to the same config helper so
the active-profile pointer can never drift from STATE_DIR.

nesquena#3194: GET /api/gateway/status reported 'Gateway not configured' on a fresh
two-container Docker deploy because an alive=None + gateway_stale_running_state
health payload with an empty identity_map fell through to
configured=bool(identity_map)=False. The alive=None branch now treats a
payload carrying gateway metadata (gateway_state detail, or a stale-running/
stale-stopped reason) as configured.

Tests: +17 regression tests (11 for nesquena#2905 incl. full truth table + non-destructive
guard + POSIX no-op; 6 for nesquena#3194 incl. 5 no-regression guards). Full suite
7090 passed, 0 failed.

Closes nesquena#2905
Closes nesquena#3194
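
A minimal sketch of the non-destructive home selection described for nesquena#2905, assuming a "populated" home can be recognised by marker entries. The marker tuple and the function names here are illustrative, not the real `_platform_default_hermes_home()` internals.

```python
from pathlib import Path
from typing import Sequence

# Assumption: a home counts as populated when any marker entry exists in it.
MARKERS: Sequence[str] = ("webui",)


def is_populated(home: Path, markers: Sequence[str] = MARKERS) -> bool:
    return any((home / marker).exists() for marker in markers)


def choose_hermes_home(new_home: Path, legacy_home: Path) -> Path:
    """Prefer the legacy home only while the new location is not established."""
    if is_populated(new_home):
        return new_home          # new location already in use: never divert
    if is_populated(legacy_home):
        return legacy_home       # upgrade case: keep reading the old data
    return new_home              # fresh install: use the new default
```

Nothing is moved or copied, which is what makes the approach safe to re-evaluate on every launch.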
…esquena#1944), narrow nesquena#2905 markers, scope except

Both pre-release reviewers (Opus advisor + Codex regression gate) converged on
the same MUST-FIX:

- nesquena#3194: treating gateway_stale_stopped_state as 'configured' contradicted
  nesquena#1944 (a stopped root gateway should read like 'not configured' so the
  banner doesn't nag). Now ONLY stale-RUNNING metadata
  (reason=gateway_stale_running_state or gateway_state=='running') flips
  configured=True; stale-stopped falls through to bool(identity_map) like the
  genuinely-unconfigured case. Updated the test accordingly + added a
  stale-stopped no-regression test.

Opus follow-ups also applied:
- nesquena#2905: narrowed the populated-home markers to WebUI-only artifacts
  (webui/, webui/sessions, webui/settings.json), dropping config.yaml/auth.json
  so a long-time agent user installing WebUI fresh isn't wrongly diverted to
  the legacy %USERPROFILE%\.hermes (auth.json predates nesquena#2897 there).
- profiles._resolve_base_hermes_home(): narrowed except Exception -> ImportError
  so a real bug in the config helper still surfaces.

Adjacent suites green: nesquena#2840, nesquena#1879, gateway_status_agent_health (66 tests).
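
The narrowed nesquena#3194 rule for the alive=None branch can be sketched as below. The payload keys `reason` and `gateway_state` come from the commit text; the function name and the exact payload shape are assumptions.

```python
from typing import Mapping


def configured_when_alive_unknown(payload: Mapping, identity_map: Mapping) -> bool:
    """Decide 'configured' for a health payload whose alive flag is None."""
    stale_running = (
        payload.get("reason") == "gateway_stale_running_state"
        or payload.get("gateway_state") == "running"
    )
    if stale_running:
        return True
    # Stale-stopped and genuinely unconfigured cases share the same fallback.
    return bool(identity_map)
```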
…x gate MUST-FIX)

Codex regression gate found the exact-name hidden-key set leaked secret-shaped
args (apiKey/access_token/clientSecret/Authorization/cookie/...) into the
always-visible collapsed tool-card header. Replace with a normalized
case-insensitive _toolArgPreviewKeyIsHidden() predicate matching secret-bearing
substrings + camelCase variants. Adds 22 parametrized regression tests pinning
the secret-key denial + a legit-key-still-shown guard. Co-authored-by preserved.
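
For illustration, a normalized substring predicate of the kind described might look like this in Python (the real `_toolArgPreviewKeyIsHidden()` is frontend JavaScript). The fragment list is an assumption chosen to cover the examples named above.

```python
import re

# Assumption: illustrative fragment list, not the project's actual set.
_SECRET_FRAGMENTS = ("apikey", "token", "secret", "password", "authorization", "cookie")


def key_is_hidden(key: str) -> bool:
    """Lower-case the key and drop separators so apiKey, api_key and API-KEY agree."""
    normalized = re.sub(r"[^a-z0-9]", "", key.lower())
    return any(fragment in normalized for fragment in _SECRET_FRAGMENTS)
```

Substring matching errs toward hiding: in this sketch a harmless key such as `max_tokens` is also suppressed, which is the safer failure mode for an always-visible header.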
v0.51.190: Windows upgrade state-stranding hotfix (nesquena#2905) + gateway banner (nesquena#3194) + quiet tool previews (stage-batch2)
Adds .github/FUNDING.yml so GitHub displays a "Sponsor" button on the
repository, linking to GitHub Sponsors for @nesquena.
chore: add GitHub Sponsors funding config
Skill detail and linked markdown files now use the same preview-md
pipeline as Memory/Notes, with code highlighting and KaTeX enhancement.
…dex gate MUST-FIX)

Codex regression gate found the launchd guard blocked ANY ctl.sh start while a
launchd job was live — including a legitimate second instance on a different
port (HERMES_WEBUI_PORT=8788). Now _launchd_webui_pid only treats the launchd
job as a conflict when its PID is actually listening on the requested CTL_PORT
(via a new best-effort _pid_listens_on_port helper using lsof); a different-port
start is allowed. When port ownership can't be determined (no lsof), falls back
to guarding only the default 8787 port so non-default ports are never wrongly
blocked. Adds a different-port-allowed regression test + makes the existing
block test deterministic. Co-authored-by preserved.
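
The guard itself is shell (ctl.sh); the decision it makes can be written out as a small Python sketch. The function name and the three-state `listens` argument are assumptions made for illustration.

```python
from typing import Optional

DEFAULT_PORT = 8787


def launchd_conflicts(requested_port: int,
                      launchd_pid: Optional[int],
                      listens: Optional[bool]) -> bool:
    """Return True when starting on requested_port should be blocked.

    listens is True/False when port ownership was determined, and None when
    it could not be (for example, lsof is unavailable).
    """
    if launchd_pid is None:
        return False                          # no live launchd job
    if listens is None:
        return requested_port == DEFAULT_PORT  # guard only the default port
    return listens                             # block only a real port clash
```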
v0.51.191: skills-detail markdown styling (nesquena#3284) + launchd duplicate-start guard (nesquena#3291) (stage-batch3)
…185 upgrade

Re-applies cb0065e + b34311b (context_length default-only guard) which were
dropped by the upgrade reset to v0.51.185. Fixes 4.7-1m context window showing
as stale global cap (232K) instead of real 1M metadata. Backend-only: touches
api/routes.py + api/streaming.py, zero frontend/render changes.
…shot

_live_usage_snapshot() runs on every metering tick (~10x/sec while streaming).
The nesquena#3256 default-only guard recomputed get_model_context_length() there on
every tick for non-default models, which does a config read + potential
metadata/network probe — freezing claude-opus-4.7-1m streams while the default
model (4.8) stayed fast (guard not triggered for it). Resolve the real cap at
most once per stream via _real_ctx_cache. Backend-only, no frontend changes.
…shot

The default-only guard corrected context_length to the real per-model cap
(e.g. 1M for claude-opus-4.7-1m) but left threshold_tokens pointing at the
ContextCompressor's stale value (computed from the global 232K cap → 197.2k
@ 85%). UI then showed 'auto-compress at 197.2k / 1M' which is misleading.

Rescale threshold_tokens by the real/orig ratio so the displayed trigger
reflects the actual window (e.g. ~850k @ 1M).

NOTE: this only corrects the SSE display payload. The real auto-compress
trigger lives inside ContextCompressor in hermes-agent (agent_init.py:1446
constructs it with the global cap). A full fix requires a parallel change
upstream — tracked separately.
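
The rescale is plain proportional arithmetic. A sketch using the figures quoted above (function name hypothetical):

```python
def rescale_threshold(threshold_tokens: int, orig_cap: int, real_cap: int) -> int:
    """Scale a compression threshold from the stale cap to the real window."""
    if orig_cap <= 0:
        return threshold_tokens  # nothing sensible to scale by
    return round(threshold_tokens * real_cap / orig_cap)


# 197,200 is 85% of the 232,000 global cap; the same 85% of 1,000,000 is 850,000.
display_threshold = rescale_threshold(197_200, 232_000, 1_000_000)
```

As the note above says, this changes only what is displayed; it does not move the point at which compression actually triggers.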
…nly context_length guard regression test

- test_pr1341 distance limit 9000→13000 (the PR legitimately added the
  default-only pre-save guard block; the test is designed to be bumped when a
  new pre-save mutation block lands — this was the only CI-red cause on shard 1).
- new tests/test_issue3256_context_length_default_only_guard.py: verifies the
  global model.context_length cap applies ONLY to model.default (revert-fix-
  verified — fails on master, passes with the fix).

Co-authored-by: allenliang2022 <allenliang2022@users.noreply.github.com>
…ted stale cap + rescale terminal threshold

Codex regression gate (+Opus, both independently) found the default-only guard
dropped the stale compressor cap but two sibling paths stayed inconsistent:
1. Per-turn persistence: fallback resolver only ran when context_length was
   falsy, so a previously-persisted stale 232K survived forever on non-default
   sessions. Now also runs when _skip_cc_cl, and rescales threshold_tokens to
   the recomputed real cap (or clears it).
2. Terminal SSE 'done' payload: re-emitted the stale compressor threshold, so
   messages.js overwrote S.lastUsage and the indicator reverted on stream end.
   Now rescales threshold to the resolved window when the stale cap was dropped.
Added 3 source-structure regression tests pinning both fixes; bumped the brittle
test_pr1341 distance limit 13000→15000 (+ noted it should become structural).

Co-authored-by: allenliang2022 <allenliang2022@users.noreply.github.com>
…UnboundLocalError on no-compressor path) + relax brittle nesquena#1318 source-assertion

The threshold-rescale block runs unconditionally after the fallback and
references _skip_cc_cl/_cc_cl, which were only defined inside 'if _cc_for_save:'.
On the no-compressor path (fresh agent / interrupted stream) that raised
UnboundLocalError (caught by test_issue1857_usage_overwrite). Hoist both inits
above the block (no-op rescale when no compressor). Also widen the nesquena#1318
source-assertion test to accept the widened fallback gate (still asserts the
falsy-check invariant).

Co-authored-by: allenliang2022 <allenliang2022@users.noreply.github.com>
…ate MUST-FIX)

The default-only context_length guard compared model.default to the session
model with exact string equality. But model.default and the session model can
be stored in equivalent-but-different shapes (bare 'claude-opus-4.8',
provider-prefixed 'anthropic/claude-opus-4.8', or '@Anthropic:claude-opus-4.8').
An exact compare wrongly treats the actual default model as non-default and
drops its configured context_length cap for provider-prefixed configs.

Add api/routes._model_matches_configured_default(session_model, cfg_default,
provider) that normalizes all three shapes, and use it at all 6 guard sites
(routes resolver + the 5 api/streaming.py sites: live-usage snapshot, persistence
_skip_cc_cl, persistence fallback _apply_cfg_ctx, SSE-done _dropped_stale_cap_sse,
SSE fallback _apply_cfg_ctx). Imported function-scoped in streaming to avoid the
routes<->streaming module-level circular import. 10 helper unit tests + a
behavioral test that a prefixed default still receives its cap.

Co-authored-by: allenliang2022 <allenliang2022@users.noreply.github.com>
…are-name on different providers (Codex over-match MUST-FIX)

Prior round stripped provider prefixes from both sides and matched bare-only,
which over-matched: openai/gpt-4o would match default openrouter/gpt-4o. Now the
matcher compares bare model ids AND rejects when both sides identify DIFFERENT
providers (from provider/ prefix, @Provider: qualifier, or the explicit provider
arg). Same-provider / unknown-session-provider still match. Added cross-provider
rejection regression tests.

Co-authored-by: allenliang2022 <allenliang2022@users.noreply.github.com>
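
A minimal sketch of the matching rule after both rounds, assuming the three id shapes listed above. The helper names are hypothetical, and treating the explicit `provider` argument as the session's provider is an assumption.

```python
from typing import Optional, Tuple


def split_model(value: str) -> Tuple[Optional[str], str]:
    """Split '@Provider:model' or 'provider/model' into (provider, bare id)."""
    value = value.strip()
    if value.startswith("@") and ":" in value:
        provider, model = value[1:].split(":", 1)
        return provider.lower(), model
    if "/" in value:
        provider, model = value.split("/", 1)
        return provider.lower(), model
    return None, value


def matches_default(session_model: str, cfg_default: str,
                    provider: Optional[str] = None) -> bool:
    s_provider, s_model = split_model(session_model)
    d_provider, d_model = split_model(cfg_default)
    if s_model.lower() != d_model.lower():
        return False
    # Assumption: the explicit provider describes the session when its id is bare.
    s_provider = s_provider or (provider.lower() if provider else None)
    # Reject only when both sides name a provider and the providers differ.
    return not (s_provider and d_provider and s_provider != d_provider)
```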
v0.51.192: per-model context_length default-only guard (nesquena#3263, closes nesquena#3256) (stage-batch4)
README:
- Add a Contents table of contents for navigability (800-line doc).
- Freshen stale snapshots: test count 5303/488 files -> ~7,150/~700 files;
  reframe contributor counts to point at CONTRIBUTORS.md as the live source.
- Rebuild the Architecture section: drop per-file exact LOC (drifts every
  release -> chronic staleness) in favor of a stable backend/frontend role map;
  add pyproject.toml + the ruff/browser/docker CI gates.
- Reorganize the Docs index by purpose (Start here / Using / Deploying /
  Contributing & design / Release history); add missing CONTRIBUTING.md,
  DESIGN.md, docs/workspace-git.md; convert bare paths to working links.

ARCHITECTURE.md: header v0.51.54/5303 -> v0.51.192/~7,150 + note that the
  numbers are a periodic snapshot (authoritative source = git tag + collect-only).

TESTING.md: header + footer test counts refreshed; drop the stale
  'through v0.50.21' framing; note the ruff/browser/docker gates.

ROADMAP.md: refresh the 'Last updated' stamp (v0.51.31/5028 -> v0.51.192/~7,150)
  with recent themes.

Markdown-only; all internal links verified to resolve.
Regenerate CONTRIBUTORS.md + README contributors section from a verified
3-source union: GitHub merged-PR list, CHANGELOG.md attribution lines, and
Co-authored-by trailers on master commits (the canonical signal for a CLOSED
contributor PR whose commits were cherry-picked/absorbed and attributed).

- New tally: 194 contributors / 843 PR credits (was a stale 137 / 646).
- The increase: ~135 releases since the v0.51.58 pin PLUS newly-detected
  absorbed-CLOSED PRs the prior hand-count missed (e.g. franksong2702 148 =
  129 merged + 19 cherry-picked-and-attributed).
- UNION with the existing hand-curated file as a floor: 27 old contributors
  had no machine-readable signal (very old closed PRs) — preserved, ZERO dropped.
- Refreshed special-thanks PR counts to match.
- Generator committed to the maintainer workspace as scripts/regen_contributors.py
  (--merge-existing keeps it safe for all future refreshes).

Verified: every one of the original 137 logins still present (+57 new).
nesquena-hermes and others added 23 commits June 2, 2026 20:36
…#3429 regression from nesquena#3366)

nesquena#3366 changed getModelLabel() to strip only the first /-segment (fixing nesquena#3360
multi-slash proxy IDs). That regressed URI-scheme IDs like Yandex
gpt://${FOLDER}/deepseek-v4-flash/latest — indexOf('/') lands inside the ://
and leaves /${FOLDER}/... path junk in the composer model chip. Detect a
scheme:// id, drop scheme+authority, and take the last meaningful path segment
(skipping ${...} env-var placeholders and bare version tails like latest).
Non-URI multi-slash IDs keep the nesquena#3360 first-segment-strip behavior unchanged.
Node-driven regression test covers the URI case + the nesquena#3360 non-regression.
… model names (Codex MUST-FIX)

Codex gate found two edges in the first cut: (1) the candidate segment list
included the URI authority, so gpt://folder123/v4 and .../latest returned the
folder; (2) _isVersionTail matched any digit-leading segment, dropping a real
model named 2026-model. Fix: build path segments from AFTER the authority only;
tighten the version-tail regex to pure version tokens (latest/stable/v4/1.2),
not mixed names; fall back to last-usable (non-placeholder) path segment so the
authority is never returned. Added edge-case regression tests.
…eholder (Codex MUST-FIX)

Codex re-check: degenerate URIs still leaked — gpt://folder123 returned the
authority, gpt://folder123/${MODEL} returned the placeholder. Removed the _all[0]
authority fallback and guarded the literal-last-path fallback against placeholders;
a URI with no usable model segment now falls back to the raw id. Added regression
cases for gpt://folder123 and gpt://folder123/${MODEL}.
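
The label rules accumulated over these three commits can be sketched in Python as follows (the real `getModelLabel()` is in static/ui.js). The regexes are assumptions that cover the cases named above, not the shipped patterns.

```python
import re

_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")
_PLACEHOLDER = re.compile(r"^\$\{[^}]*\}$")                      # e.g. ${FOLDER}
_VERSION_TAIL = re.compile(r"^(latest|stable|v?\d+(\.\d+)*)$", re.IGNORECASE)


def model_label(model_id: str) -> str:
    match = _SCHEME.match(model_id)
    if not match:
        # Non-URI ids keep the first-segment strip.
        _, sep, tail = model_id.partition("/")
        return tail if sep else model_id
    rest = model_id[match.end():]
    # Skip the authority (first component); only path segments are candidates.
    segments = [s for s in rest.split("/")[1:] if s]
    usable = [s for s in segments if not _PLACEHOLDER.match(s)]
    for segment in reversed(usable):
        if not _VERSION_TAIL.match(segment):
            return segment
    # Only version-like segments remain, or nothing usable at all.
    return usable[-1] if usable else model_id
```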
Release GL — v0.51.218 (fix getModelLabel mangling URI-scheme model IDs, nesquena#3429 regression)
…tion + model-key matching (nesquena#3436, @b3nw)

v0.51.218 fixed getModelLabel() (the visible chip) for URI-scheme model IDs but
left the same first-segment-slash-strip bug in the matching/dedup paths:
api/config.py _norm_model_id + _get_label_for_model, and static/ui.js
_normalizeConfiguredModelKey. For gpt://folder/model/latest those treat the
path slashes as provider delimiters, mis-normalizing the model-identity key
(the nesquena#3360 collision class, for URI ids). Adds a URI-scheme guard to all three
so the scheme is not stripped. Kept v0.51.218's getModelLabel (more thorough
than nesquena#3436's variant — it extracts the model name vs returning the whole id);
took nesquena#3436's backend + _normalizeConfiguredModelKey halves with backend/frontend
parity tests.

Co-authored-by: b3nw <b3nw@duck.com>
Release GM — v0.51.219 (extend URI-scheme model-ID fix to backend normalization, nesquena#3436)
…nesquena#3430, @pamnard)

Manual title regeneration (POST /api/session/title/regenerate) and background
aux title generation failed with 422 / llm_error_aux when
auxiliary.title_generation.model in config.yaml used the WebUI-internal
@Provider:model picker format (e.g. @gemini:gemini-3.1-flash-lite) — the raw
@-qualified id was forwarded to the provider API verbatim. Normalize it through
the canonical _split_webui_provider_model_value() helper before the aux call.

Co-authored-by: pamnard <pamnard@users.noreply.github.com>
Release GN — v0.51.220 (fix aux title generation 422 with @Provider: model ids, nesquena#3430)
nesquena#2931, nesquena#3104, nesquena#3220, nesquena#3223, nesquena#3337)

Six contributor PRs were shipped via the cherry-pick/absorb path but their
absorb commits never carried a `Co-authored-by:` trailer, so the contributors
received zero commit credit on their GitHub contribution graphs. Three of them
(@antoniocarlos97ss, @liuqiangweb-svg, @pix0127) were also missing from
CONTRIBUTORS.md entirely; the authors of the other three PRs (@AJV20, who wrote
two, and @mysoul12138) were already listed via CHANGELOG attribution but still
lacked the graph credit.

This commit:
  - Adds the three missing contributors to CONTRIBUTORS.md (single-PR section).
  - Carries Co-authored-by trailers for all six so each gets a real commit on
    their contribution graph (the non-history-rewrite way to repair this).
  - Bumps the tracked totals (194 -> 197 contributors, 843 -> 846 credits).

The shipped work, by PR:
  nesquena#2622 (@pix0127)            WebUI dashboard plugin system w/ iframe isolation
  nesquena#2931 (@liuqiangweb-svg)    Edge TTS as an alternative speech engine
  nesquena#3104 (@antoniocarlos97ss)  workspace file upload + drag-drop w/ archive extract
  nesquena#3220 (@AJV20)              generated media artifact cards
  nesquena#3223 (@AJV20)              manual session title regeneration
  nesquena#3337 (@mysoul12138)        syntax highlighting in workspace file preview

Co-authored-by: pix0127 <8500500+pix0127@users.noreply.github.com>
Co-authored-by: Andy <281253538+liuqiangweb-svg@users.noreply.github.com>
Co-authored-by: antoniocarlos97ss <101895404+antoniocarlos97ss@users.noreply.github.com>
Co-authored-by: AJV20 <24819659+AJV20@users.noreply.github.com>
Co-authored-by: mysoul12138 <203929894+mysoul12138@users.noreply.github.com>
…bution-backfill

docs: backfill contributor attribution for absorbed PRs (graph credit + CONTRIBUTORS.md)
…nk escapes + portable TOCTOU hardening [security]) (nesquena#3398) (nesquena#3451)

* [security] fix(workspace): block all symlink escapes from the selected workspace (nesquena#3398, @Hinotoi-agent)

Previously safe_resolve_ws allowed a symlink placed inside a workspace to resolve
to an external host path as long as it wasn't a system dir (/etc, /proc, etc).
But the workspace file API is reachable by LLM agent tool calls (read_file_content),
so an in-workspace symlink to ~/.ssh, ~/.hermes/auth.json (credentials), etc. was a
real read path. Now ANY symlink escape is blocked: safe_resolve_ws resolves and
requires the result stay under the workspace root; list_dir hides escaping symlinks
(they could never be opened anyway); internal symlinks resolving back under the
workspace still work. Updated the upload symlink-target test to accept the new
400 'Path traversal blocked' rejection (was 403) — the invariant (nothing lands
outside the workspace) is unchanged.

Co-authored-by: Hinotoi-agent <Hinotoi-agent@users.noreply.github.com>
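
The containment rule (resolve, then require the result to stay under the workspace root) can be sketched as below. This is a simplified stand-in, not the actual `safe_resolve_ws`; the function name and the error type are assumptions.

```python
from pathlib import Path


def resolve_inside(workspace: Path, relative: str) -> Path:
    """Resolve a workspace-relative path and refuse anything that escapes."""
    root = workspace.resolve()
    target = (root / relative).resolve()   # follows symlinks and collapses '..'
    if target != root and root not in target.parents:
        raise ValueError("Path traversal blocked")
    return target
```

A symlink that points back inside the workspace resolves to an in-workspace path and passes; one that points at an external file resolves outside the root and is rejected.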

* docs(changelog): v0.51.221 release header for nesquena#3398 symlink-escape security fix

* [security] harden workspace file API against symlink-swap TOCTOU via portable anchored openat-walk (nesquena#3398 follow-up)

Codex review of nesquena#3398 flagged that safe_resolve_ws() validates a path but
list_dir/read_file_content/upload/extraction then re-open by pathname, leaving a
TOCTOU window: a symlink swapped in AFTER the check could still escape. (This
race pre-existed nesquena#3398; closing it here so the containment is complete.)

A first attempt used /proc/self/fd for the post-open containment check, but that
BRICKS workspace browsing on macOS/Windows (no /proc → every read/list rejected).
This version is portable:

- open_anchored_fd(): opens the (already symlink-resolved) target
  component-by-component from the workspace root via openat (dir_fd) + O_NOFOLLOW.
  Every component must be a real non-symlink entry, so a component swapped to a
  symlink mid-flight is refused. No /proc dependency. Used by read_file_content
  (read from the fd) and list_dir (enumerate via os.scandir(fd), per-entry
  fstatat/readlinkat).
- open_anchored_create_fd(): same anchored walk for writes, creating missing
  intermediate dirs with mkdir(dir_fd=) and the leaf with O_CREAT|O_EXCL|
  O_NOFOLLOW. Used by the workspace upload write AND archive (zip+tar) member
  writes, anchored against the TRUE workspace root (not the mutable extraction
  dest_dir, closing Codex's root-swap finding). fd-leak-safe on rejection.
- Portability: gated on os.supports_dir_fd; platforms without it (Windows, where
  symlink creation needs admin) fall back to a plain O_NOFOLLOW open/exclusive
  create — no new race protection but no regression vs the prior path-based code.

Legit in-workspace symlinks still resolve and read/list fine (safe_resolve_ws
collapses them to a real in-workspace path, which the anchored walk then opens).
Verified: the swap-race leaks external content against the old path-based read
and is blocked here; macOS-class symlinked-root workspaces work; no fd leak over
300 rejected creates. Adds TOCTOU + anchored-create regression tests.
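
A minimal sketch of the anchored walk for reads, assuming the caller passes an already-resolved workspace root and the path split into components. It is not the project's `open_anchored_fd()`, only an illustration of the openat + O_NOFOLLOW technique using `os.open(..., dir_fd=...)`.

```python
import os
from typing import Sequence

_NOFOLLOW = getattr(os, "O_NOFOLLOW", 0)
_DIRECTORY = getattr(os, "O_DIRECTORY", 0)


def open_anchored(root: str, parts: Sequence[str]) -> int:
    """Open root/parts[0]/.../parts[-1] one component at a time.

    Each component is opened relative to the previous directory fd with
    O_NOFOLLOW, so a component swapped to a symlink is refused. Returns a
    read-only file descriptor that the caller must close.
    """
    if os.open not in os.supports_dir_fd:
        raise NotImplementedError("dir_fd is not supported on this platform")
    fd = os.open(root, os.O_RDONLY | _DIRECTORY | _NOFOLLOW)
    try:
        for index, part in enumerate(parts):
            is_leaf = index == len(parts) - 1
            flags = os.O_RDONLY | _NOFOLLOW | (0 if is_leaf else _DIRECTORY)
            next_fd = os.open(part, flags, dir_fd=fd)
            os.close(fd)
            fd = next_fd
        return fd
    except BaseException:
        os.close(fd)   # do not leak the last successfully opened descriptor
        raise
```

Reading from the returned descriptor rather than re-opening by pathname is what closes the check-then-use window.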

* [security] close 3 more nesquena#3398 TOCTOU gaps from Codex r3: root-swap, pre-create mkdir, Windows list_dir fallback

Codex round-3 review found three residual issues in the anchored openat-walk:

1. (CORE) The workspace ROOT itself could be swapped to a symlink after
   resolve() but before the root os.open() — add _O_NOFOLLOW to the root open in
   open_anchored_fd() and open_anchored_create_fd() (and make_anchored_dir()), so
   a raced root symlink is refused. Verified: root-swap race now blocked.

2. (SILENT) Upload/extraction still did pathname Path.mkdir() AFTER the
   containment check, so a raced symlink component could make the server create
   dirs outside the workspace before the anchored file create rejected. Removed
   the redundant member_path.parent.mkdir() calls (open_anchored_create_fd
   already creates intermediates via anchored mkdirat) and replaced the two
   base-dir mkdirs (upload target dir + archive extraction root) with a new
   make_anchored_dir() that walks from the true workspace root via
   openat+O_NOFOLLOW + mkdir(dir_fd=).

3. (CORE) list_dir() unconditionally used os.scandir(fd)/os.stat(dir_fd=)/
   os.readlink(dir_fd=), which would brick workspace browsing on platforms
   without os.supports_dir_fd (Windows). Split list_dir() into a _DIR_FD_OK
   anchored branch and a path-based fallback branch (prior behaviour) sharing one
   _process() entry builder. open_anchored_create_fd()'s Windows fallback now also
   creates parent dirs.

Adds regression tests: no-dir_fd fallback (list+read+create+symlink filtering)
and the root-swap race. All prior TOCTOU + anchored-create tests still green.

* fix(workspace): portable symlink-loop filtering in list_dir via follow-stat ELOOP

CI on Python 3.13 caught test_mutual_symlink_loop_filtered failing: a mutual
symlink loop (a->b->a) was NOT filtered from the listing. Root cause: the new
readlink-based cycle detection relied on (target_resolved / raw_link).resolve()
RAISING on a loop, but Path.resolve() loop handling differs by Python version
(3.11 raises RuntimeError, 3.13 can return a path), so the loop slipped through
on 3.13.

Fix: compute a version-independent 'reachable' flag per symlink via
os.stat(..., follow_symlinks=True) — the syscall reliably returns ELOOP for
mutual/self loops and ENOENT for broken targets on every platform/version. A
symlink whose follow-stat raises can never be opened, so list_dir filters it.
Applied in both the dir_fd-anchored branch (fd-relative stat) and the Windows
path-based fallback branch. Mutual loop now filtered on all versions.
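
The follow-stat check is small enough to show directly. A sketch of the path-based form (the dir_fd branch would pass `dir_fd=` to the same call; the function name is hypothetical):

```python
import os


def symlink_is_reachable(path: str) -> bool:
    """True when following the link reaches a real target.

    os.stat follows symlinks by default, so loops and broken targets surface
    as OSError (ELOOP and ENOENT respectively).
    """
    try:
        os.stat(path)
        return True
    except OSError:
        return False
```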

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: Hinotoi-agent <Hinotoi-agent@users.noreply.github.com>
…e language drift nesquena#3293 + orphaned CLI sidecar prune nesquena#3238 + pin-quota lineage nesquena#3288) (nesquena#3452)

* fix: reject cross-script drifted auto-generated session titles (nesquena#3293)

The title-language mismatch guard only knew two states: German (de) or empty,
and _title_language_mismatch early-returned False whenever the user start
wasn't German. So an English conversation whose LLM-generated title came back
in Chinese / Spanish / Russian sailed through and persisted with
llm_title_generated=true. The German case was the only one covered because
that's the one prior report it was built for.

Generalize from a German-specific binary to a language-agnostic cross-script
check. Add _script_counts() + _dominant_script() (cheap, dependency-free
Unicode-block classification: latin / cjk / cyrillic / arabic / hebrew / greek
/ devanagari). _title_language_mismatch now rejects a title that introduces a
substantial amount (>=35% of alphabetic chars, min 2) of a script different
from the conversation start's dominant script — so short titles that embed a
borrowed Latin technical term still trip, while an English title with a single
foreign place-name does not. The legacy German->English same-script heuristic
is preserved verbatim.

Kept api/streaming.py ASCII-only (the test_title_generation_source_has_no_cjk_
literals guard) — all CJK examples live in the test file, not the source.
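
A rough sketch of the cross-script check, under stated assumptions: it classifies characters by the first word of their Unicode name instead of by block range (a swapped-in technique for brevity), counts all non-dominant scripts together, and uses hypothetical function names. The 35% and minimum-2 thresholds are the ones quoted above.

```python
import unicodedata
from collections import Counter
from typing import Optional

# Assumption: name-prefix lookup standing in for Unicode-block classification.
_PREFIX_TO_SCRIPT = {
    "LATIN": "latin", "CJK": "cjk", "HIRAGANA": "cjk", "KATAKANA": "cjk",
    "HANGUL": "cjk", "CYRILLIC": "cyrillic", "ARABIC": "arabic",
    "HEBREW": "hebrew", "GREEK": "greek", "DEVANAGARI": "devanagari",
}


def script_counts(text: str) -> Counter:
    counts: Counter = Counter()
    for ch in text:
        if not ch.isalpha():
            continue
        prefix = unicodedata.name(ch, "").split(" ", 1)[0]
        script = _PREFIX_TO_SCRIPT.get(prefix)
        if script:
            counts[script] += 1
    return counts


def dominant_script(text: str) -> Optional[str]:
    counts = script_counts(text)
    return counts.most_common(1)[0][0] if counts else None


def title_script_mismatch(start: str, title: str,
                          ratio: float = 0.35, minimum: int = 2) -> bool:
    base = dominant_script(start)
    counts = script_counts(title)
    total = sum(counts.values())
    if base is None or total == 0:
        return False
    foreign = sum(n for script, n in counts.items() if script != base)
    return foreign >= minimum and foreign / total >= ratio
```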

Closes nesquena#3293

Co-authored-by: andrewkangkr <andrewkangkr@users.noreply.github.com>

* fix: prune orphaned imported-CLI sidecars from the WebUI sidebar (nesquena#3238)

When a CLI/agent session is opened in WebUI it gets a WebUI-owned sidecar
(webui/sessions/<id>.json + _index.json row) so it can render and reopen;
all_sessions() then returns it independently of the agent state.db. If the
user later deletes that session from the CLI / local Hermes storage, nothing
pruned the sidecar — the merge loop only overlays CLI metadata when a matching
state.db row exists and otherwise continues, so the stale row lingered in the
sidebar indefinitely (there is no WebUI delete affordance for CLI rows).

Add api.models.agent_session_row_exists(): an exact, uncapped existence probe
against the state.db sessions table. The sidebar merge loop now drops a row
that is_cli_session_row + not WebUI-native + absent from cli_by_id + whose
state.db row is genuinely gone, and calls prune_session_from_index() so
_index.json self-heals.

The state.db probe is deliberate: get_cli_sessions() caps at
CLI_VISIBLE_SESSION_LIMIT (20), so a still-existing session can fall out of
that window and look deleted — pruning on cli_by_id absence alone would delete
live sessions. WebUI-native rows with a CLI ancestor are never pruned, and any
probe error degrades to keep-the-row so a transient failure can't lose data.

Closes nesquena#3238

Co-authored-by: Luxciax <Luxciax@users.noreply.github.com>
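
A sketch of the existence probe and the prune decision, assuming the state.db `sessions` table keys rows by an `id` column and that sidebar rows carry `is_cli_session` and `webui_native` flags. Those names are assumptions; only the overall rule comes from the commit text.

```python
import sqlite3
from contextlib import closing
from typing import Mapping


def agent_session_exists(db_path: str, session_id: str) -> bool:
    """Exact, uncapped existence check; any probe error means 'keep the row'."""
    try:
        with closing(sqlite3.connect(db_path)) as conn:
            row = conn.execute(
                "SELECT 1 FROM sessions WHERE id = ? LIMIT 1", (session_id,)
            ).fetchone()
        return row is not None
    except sqlite3.Error:
        return True   # degrade to keeping the row on transient failures


def should_prune(row: Mapping, cli_by_id: Mapping, db_path: str) -> bool:
    return bool(
        row.get("is_cli_session")
        and not row.get("webui_native")
        and row["session_id"] not in cli_by_id
        and not agent_session_exists(db_path, row["session_id"])
    )
```

The last condition is the important one: absence from the capped `cli_by_id` window alone is not evidence that the session was deleted.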

* fix: count pin quota by visible session lineage

* docs(changelog): v0.51.222 — backend bugfix batch (nesquena#3293 title drift, nesquena#3238 sidecar prune, nesquena#3288 pin lineage)

* fix(pins): forks count as own pin lineage, not collapsed to parent (nesquena#3288 Codex follow-up)

Codex review of the batch found a pin-limit UNDERCOUNT: _session_row_lineage_root_id
followed any parent_session_id to the root, but /api/session/branch creates
independent visible fork sessions that also carry parent_session_id (session_source=
'fork'). Two pinned forks of the same parent collapsed to one quota lineage, letting
a user exceed pinned_sessions_limit with no 400. Fix: a fork returns its own id as
its lineage root (it's a separately-visible session); only compression/continuation
rows still collapse to a shared root. Adds a regression test with two pinned forks +
the parent counting as three distinct lineages, and confirms the existing
pre-compression-snapshot collapse case still passes.
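
The lineage rule after the follow-up can be sketched as a parent walk that stops at forks. The row shape (a dict keyed by session id) and the non-fork `session_source` value used in the example are assumptions.

```python
from typing import Dict, Mapping


def lineage_root_id(session_id: str, rows: Dict[str, Mapping]) -> str:
    """Follow parent_session_id upward, but treat a fork as its own root."""
    seen = set()
    current = session_id
    while True:
        row = rows.get(current, {})
        parent = row.get("parent_session_id")
        if (
            row.get("session_source") == "fork"   # forks are separately visible
            or not parent
            or parent in seen                     # guard against cycles
            or parent not in rows
        ):
            return current
        seen.add(current)
        current = parent


rows = {
    "parent": {},
    "fork1": {"parent_session_id": "parent", "session_source": "fork"},
    "fork2": {"parent_session_id": "parent", "session_source": "fork"},
    "cont": {"parent_session_id": "parent", "session_source": "compression"},
}
pinned_lineages = {lineage_root_id(sid, rows) for sid in ("parent", "fork1", "fork2")}
```

Here `pinned_lineages` holds three distinct ids, so two pinned forks plus their parent count as three against the limit, while `cont` still collapses to `parent`.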

* test(pins): update nesquena#2508/nesquena#2821 source-match tests for nesquena#3288 lineage rename

nesquena#3288 replaced the raw-session-id pin counter (pinned_ids set) with a
visible-lineage counter (pinned_lineage_ids via _visible_pinned_lineage_ids over
persisted_rows/candidate_rows). Two pre-existing source-string-matching tests
asserted the OLD implementation literals (pinned_ids = {, _session_field(existing,
session_id...), len(pinned_ids) >=). Updated both to assert the new mechanism while
preserving the invariants they actually guard: snapshot computed BEFORE LOCK (no
all_sessions()-inside-LOCK deadlock), quota filtering via the shared
_session_counts_toward_pin_quota helper, and the limit/400 guard. Behaviour
unchanged; these were implementation-detail assertions, not behaviour tests.

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: andrewkangkr <andrewkangkr@users.noreply.github.com>
Co-authored-by: Luxciax <Luxciax@users.noreply.github.com>
Co-authored-by: Andy Kang <andrewkang.kr@gmail.com>
…ker provider nesquena#3443 + MiniMax-M3 nesquena#3374) (nesquena#3453)

* feat: upgrade MiniMax default model to M3

Add MiniMax-M3 as the new default and prune deprecated older
versions (M2.5/M2.5-highspeed/M2.1/M2) from the model catalog.
M2.7 (and M2.7-highspeed) is retained as the legacy compatible
option for users who pin to it.

Updated:
  - api/config.py: _FALLBACK_MODELS adds minimax/MiniMax-M3 (placed
    before M2.7 so the dropdown surfaces it first)
  - api/config.py: _PROVIDER_MODELS['minimax'] adds M3 first, removes
    M2.5/M2.5-highspeed/M2.1
  - api/config.py: _PROVIDER_MODELS['minimax-cn'] adds M3 first,
    removes M2.5/M2.1/M2
  - tests/test_minimax_provider.py: updated CN catalog assertions
    to match the new {M3, M2.7} list

API URL and TTS configuration are unchanged.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>

* fix(models): register openai-api as first-class picker provider

* fix(models): detect OPENAI_API_KEY as openai-api, not bare openai (nesquena#3443 Codex follow-up)

Codex review found nesquena#3444 added the openai-api picker entry but the env-detection
side still did detected_providers.add('openai') for OPENAI_API_KEY. The agent
registry has only openai-api and openai-codex (no bare openai), so an env-only
OPENAI_API_KEY setup emitted @openai: picker entries the agent can't resolve on
the send path. Detect openai-api to match the registry. Adds a regression test.

* docs(changelog): v0.51.223 — re-stamp keep-set (nesquena#3443 openai-api + nesquena#3374 MiniMax-M3); dropped nesquena#3289 + nesquena#3264 to hold per Codex

---------

Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Octopus <liyuan851277048@icloud.com>
Co-authored-by: Rod Boev <rod.boev@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
…authoritative on streaming worker nesquena#3294) (nesquena#3456)

* fix: respect profile toolset/skill config on WebUI streaming worker (nesquena#3294)

The streaming agent runs on a detached worker thread that does not inherit
the per-request thread-local profile context (set from the hermes_profile
cookie on the HTTP handler thread). On that worker, the ambient get_config()
resolves via get_active_profile_name() which falls back to the process-global
_active_profile (usually 'default'). A session under a non-default profile
with an empty platform_toolsets.cli therefore loaded the DEFAULT profile's
full toolset list, inflating a tools-disabled profile's prompt from ~400 to
~15K input tokens.

Add api.config.get_config_for_profile_home() — a race-free direct disk read
of an explicit profile home's config.yaml (no shared-cache mutation), which
defers to get_config() when the requested home matches the ambient path so
in-memory test overrides are preserved. The streaming worker now resolves
_cfg from the session's own profile home, fixing toolsets, prefill context,
and fallback chains in one place.

Closes nesquena#3294

Co-authored-by: gottipx <gottipx@users.noreply.github.com>

* docs(changelog): v0.51.224 — nesquena#3294 profile toolset config (dropped nesquena#3405 to hold per Codex stale-model-repair finding)

---------

Co-authored-by: nesquena-hermes <[email protected]>
Co-authored-by: gottipx <gottipx@users.noreply.github.com>
…e resolves gateway_state nesquena#3355) (nesquena#3458)

* fix(health): probe /health/detailed first and unify gateway env vars (nesquena#3355)

* docs(changelog): v0.51.225 — remote gateway health probe gateway_state fix (nesquena#3355)

* fix(health): normalize gateway URL health-suffix + cap remote body read (nesquena#3355 Codex follow-up)

Codex review of nesquena#3355 found two issues:
(1) SILENT — a gateway env var already pointing at a health path (e.g.
    GATEWAY_HEALTH_URL=http://host/health) produced doubled paths like
    /health/health/detailed once probe paths were appended. Now strip a trailing
    /health/detailed, /health, /v1/health, /status suffix before appending
    (mirrors api/updates.py).
(2) CORE — the new resp.read() on a 2xx body was unbounded; a large/trickled
    remote response could hang /api/health/agent or balloon memory. Cap the read
    at _REMOTE_PROBE_BODY_LIMIT_BYTES (64KB)+1 and skip JSON parse when over cap
    (still report the gateway alive, just without parsed gateway_state).

Adds regression tests for both (no doubled /health/health path; oversized body
is capped + does not hang). Also updated _FakeResp.read to accept the size arg.

---------

Co-authored-by: Rod Boev <rod.boev@gmail.com>
Co-authored-by: nesquena-hermes <[email protected]>
…age ring nesquena#3062 + activity-feed default-expand setting nesquena#3080) (nesquena#3459)

* feat(composer): replace mobile config-button sliders icon with context-usage ring (nesquena#3062, @NottheGuy007)

Squashed net diff of nesquena#3062 (the PR branch's tip commits were deletes of files not
present in our tree). Replaces the composerMobileCtxBadge text badge with an SVG
progress ring (ctx-arc + ctx-num) showing real-time context-window usage: ring
fill via stroke-dashoffset, centered percentage, color-coded green<=50%
orange<=85% red>85%. Ring resets to 0%/green on new session.

* feat(activity): add 'expand activity feed by default' appearance setting (nesquena#3080, @AJV20)

Squashed net diff of nesquena#3080. Adds a Settings -> Appearance checkbox
(activity_feed_expanded_default, default off) to expand new Activity disclosures
by default; preserves manual per-turn collapse/expand (explicit toggle still
wins); live 'Waiting on model' rows explain what the agent is doing before/after
tool calls. i18n keys for all locales.

* docs(changelog): v0.51.226 — context-usage ring (nesquena#3062) + activity-feed default-expand setting (nesquena#3080)

* test(mobile): update touch-target test for nesquena#3062 ring (badge removed)

nesquena#3062 replaced the composerMobileCtxBadge text badge with the SVG context-usage
ring (composerMobileCtxRing) but left 3 stale references in tests/test_mobile_layout.py.
The full suite caught test_mobile_composer_primary_controls_keep_touch_friendly_sizing
asserting the removed .composer-mobile-ctx-badge CSS rule + #composerMobileCtxBadge
element. Updated the assertion to the new ring: confirm composerMobileCtxRing exists,
the old badge is fully gone (not dangling), and the ring SVG is aria-hidden so it
stays a decorative overlay that doesn't steal the config button's 44px touch target
(which is still asserted via .composer-mobile-config-btn above). 56/56 mobile-layout
tests pass. Codex: no production JS dereferences the removed badge.

---------

Co-authored-by: nesquena-hermes <[email protected]>
…ble in sidebar nesquena#3408) (nesquena#3461)

* fix(sidebar): keep active New Chat visible before first message (nesquena#3408, @AJV20)

Squashed net diff of nesquena#3408. Injects ONLY the active ephemeral session into the
sidebar render rows (when the server list omits it) so a freshly-created New Chat
stays visible/selected before its first turn; inactive empty sessions stay
filtered as before. New Chat also resets a CLI source-filter back to webui so the
active chat isn't immediately hidden.

* fix(sidebar): gate active-row reinjection to 0-message ephemeral only (nesquena#3408 Codex follow-up)

Codex review found _ensureActiveSessionRowPresent re-injected ANY active session
after search-merge — so an active conversation WITH messages that was correctly
filtered out by the search query would pollute unrelated search results. Gate the
reinjection to Number(activeRow.message_count||0)<=0 so only the freshly-created
0-message ephemeral chat is re-added; an active chat with messages stays filtered
by search as before. Added a regression test asserting the gate.

---------

Co-authored-by: nesquena-hermes <[email protected]>
…ena#3411 + large-markdown preview nesquena#3378) (nesquena#3463)

* Release v0.51.228 (stage-p12): workspace tree-drop nesquena#3411 + large-markdown preview nesquena#3378

nesquena#3411 (@pamnard): stopPropagation on workspace file-tree OS-file dragenter/dragover/drop
so a tree drop uploads to the workspace WITHOUT also attaching to the composer.
nesquena#3378 (@starGazerK): raise md rich-render ceiling 64KB/1500L->256KB/5000L + backend
file read 200KB->400KB, add 'Render as markdown anyway' force button (reuses cached
raw content, no extra fetch).

* fix(workspace): force-render uses fresh path-scoped cache, blocked while dirty (nesquena#3378 Codex follow-up)

Codex review of nesquena#3378 found the markdown force-render path had two SILENT bugs:
(1) saving a md file from the plain-text fallback didn't update _previewRawContent,
so a later force-render showed stale pre-save content; (2) the cache-reuse check
'path===_previewCurrentPath' was tautological (var just assigned), so a force-render
after a file switch could render the previous file's cached content. Fixes: track
_previewRawContentPath (set on fetch AND save), reuse cache only when it matches the
requested path, and block force-render while the editor is dirty/open. +3 regression
tests. (nesquena#3411 was cleared clean by Codex.)

---------

Co-authored-by: nesquena-hermes <[email protected]>
…ps a versioned name to a -tier variant nesquena#3368) (nesquena#3465)

* Release v0.51.229 (stage-p13): /model never silently snaps versioned name to -tier variant (nesquena#3368, @nesquena-hermes)

Agent-authored, nesquena-APPROVED. Rebased onto current master. Both _findModelInDropdown
(ui.js) and _bestModelMatch (commands.js) reject a prefix-snap when the typed target ends
in a version digit and the longer option's extra text is a variant/tier suffix (.pro) rather
than a version continuation (.digit). Adds _nearestModelSuggestion + 'did you mean?' toast.
34 tests pass (14 new nesquena#3368 + 20 regression: nesquena#1188 fuzzy + nesquena#3360 collision).

* fix(commands): /model did-you-mean toast renders suggestion + single quotes (nesquena#3368 review)

Live-render review of the approved nesquena#3437 caught two toast-assembly bugs in cmdModel:
(1) t('model_did_you_mean') was called WITHOUT the suggestion arg — model_did_you_mean
is a (m)=>... template that t() invokes, so it rendered 'did you mean "undefined"?';
fixed to t('model_did_you_mean', suggestion). (2) no_model_match already ends with an
opening quote, so '"${args}"' doubled it ('No model matching ""deepseek-v4""'); fixed
to '${args}"'. +4 source-assertion regression tests. Verified live: toast now reads
'No model matching "deepseek-v4" — did you mean "deepseek/deepseek-v4-pro"?'.

* fix(commands): slash-qualified versioned no-snap falls through to suggestion (nesquena#3368 Codex CORE)

Codex review found a 2nd no-snap layer the version guard missed: for a slash-qualified
versioned query (e.g. 'xiaomi/mimo-v2.5') whose only near catalog entry is a rejected
tier variant ('xiaomi/mimo-v2.5-pro'), cmdModel's cross-provider /api/session/update
fallback would silently persist the invalid model + 'Switched to...'. Now gated on
!versionedNoSnap (_looksLikeVersionedModel(bare) && a near suggestion exists) so it falls
through to the 'did you mean?' toast; genuinely off-catalog providers (no near variant)
still direct-update. Verified live: '/model deepseek/deepseek-v4' no longer switches, shows
suggestion toast. +1 regression test.

---------

Co-authored-by: nesquena-hermes <[email protected]>
…oning nesquena#3455 + LLM Wiki last-writer nesquena#1257) (nesquena#3466)

* Release v0.51.230 (stage-p14): extract <think> to m.reasoning nesquena#3455 + LLM Wiki last-writer (nesquena#1257)

Salvage of nesquena#3455 (@gsurenull): dropped the stale api/config.py bits (MiniMax-M3 +
SCHEMA_VERSION 3->4 — both already on master via nesquena#3374). Kept the two genuine fixes:
(1) _splitThinkFromContent persist-path extraction of inline <think> blocks into
m.reasoning (fixes 30-50% session bloat for reasoning-only providers like MiniMax-M3);
(2) LLM Wiki status Last-writer 3-tier fallback (was always 'Not available' since nesquena#1257).
Added 9 Node-driven think-split regression tests (data-loss guards: content-before/after
preserved, unclosed blocks intact, lookalike tags not extracted).

* fix(nesquena#3455): renderer-matching think extraction + wiki symlink/bounded-read guards (Codex review)

Codex review of stage-p14 found 3 SILENT bugs, all fixed:
(1) DATA-LOSS: _splitThinkFromContent's Pass-2 whole-body scan extracted a CLOSED literal
<think>...</think> from visible prose/code (e.g. inside a fenced code block) into m.reasoning,
emptying it — more aggressive than the renderer (which only strips LEADING blocks). Removed
Pass 2; extraction now matches _streamDisplay semantics (leading-only, loop captures
consecutive leading blocks). +fenced-code regression test.
(2) PRIVACY: _llm_wiki_last_writer followed symlinked .md pages resolving OUTSIDE the wiki
(is_file follows symlinks), leaking external frontmatter. Now requires resolved path under
wiki_root. +symlink-containment regression test.
(3) CONTRACT/PERF: replaced full read_text() with bounded line-by-line reads (frontmatter
block only / capped log-heading scan), never page bodies.

* fix(nesquena#3455): think-split is leading-single (renderer-matching) + fix 2 stale source-match tests

Codex re-review finding #2: looping consecutive leading blocks diverged from the renderer
(_streamDisplay/_parseStreamState strip ONE leading block). Now extracts exactly one leading
block. Also updated 2 tests that asserted pre-split implementation strings:
test_live_stream_tokens_persist (content:assistantText -> content:split.content, invariant
preserved) and the consecutive-blocks test. NOTE: Codex finding #1 (client-only split doesn't
persist server-side) is a separate architectural decision pending Nathan.

* feat(nesquena#3455): split inline <think> server-side before s.save() so persisted file is compacted (Codex #1)

Codex finding #1: the think-split was client-only, so the SAVED session file still
carried inline <think> blocks (bloat) — the fix only compacted the browser copy.
Added _split_thinking_from_content (api/streaming.py), a server-side twin of the JS
helper with identical leading-only/single-block semantics, applied to the final
assistant message before s.save() (extended the existing reasoning-persist block).
Merges with on_reasoning-stream reasoning. +8 backend-parity regression tests covering
the mid-body-code-block data-loss guard, unclosed-intact, single-leading, none-content.

* test: update 3 save-path source-assertion tests for nesquena#3455 server-side think-split

The backend think-split (api/streaming.py reasoning-persist block) changed the literal
code shape + grew the pre-save block, breaking 8 source-assertion tests that anchor on it:
- test_sprint42: assert _rm['reasoning']=_reasoning_text -> now _merged_reasoning/_existing_reasoning
  + _split_thinking_from_content present (intent preserved: reasoning persisted before save).
- test_pr1318 (6) + test_pr1341: re-anchored the locator from the changed 'if _reasoning_text
  and s.messages:' line to the stable 'Persist reasoning trace in the session' comment marker;
  bumped the 1341 byte-distance limit 15000->16000 (the test self-documents bumping on legit
  pre-save growth). All behavioral invariants (reasoning persisted + context fields before save)
  unchanged.

---------

Co-authored-by: nesquena-hermes <[email protected]>
@github-actions github-actions Bot added the sync-upstream PRs from .github/workflows/sync-upstream.yml label Jun 3, 2026
Catch-up merge from nesquena/hermes-webui v0.51.188 → v0.51.230 (~42 tags)
under the new latest-tag sync policy. Conflicts resolved per CLAUDE.md
"Auto-resolve policy" and iterated to a green full suite.

Conflict resolutions:
- CHANGELOG.md / README.md: union. Kept upstream's new release blocks +
  restructured README, re-applied fork-only content (fork notice, GHCR
  /clone URLs → TheCouchCoder-com). Upstream-removed Tailscale/remote
  sections dropped as intended (they were upstream content, not fork).
- static/i18n.js: took upstream's refined zh-Hant wording, kept fork-only
  settings_*_always_new_chat keys (and translated them to zh-Hant so
  upstream's new test_zh_hant_locale.py passes).
- static/style.css / static/panels.js: union of the panel lists — kept
  both our fork 'admin' (RBAC) panel and upstream's new 'plugin' panel.
- static/ui.js: kept the fork _setProfileChipLabel helper (composer +
  titlebar chip + a11y) while adopting upstream's session-profile value
  precedence.

Adapted to upstream's new ruff forward-lint gate: fixed a latent F821
(undefined name in a never-taken branch) in tests/test_issue2_admin_endpoints.py.

Dropped upstream edits we don't track: .github/workflows/* (fork runs its
own CI; token can't push them anyway) and .github/FUNDING.yml (pointed at
upstream's account).

Validation: pytest tests/ -q → 7486 passed, 108 skipped, 0 failed.
Issue-#2 auth regression suites green (45 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Du7chManiac Du7chManiac force-pushed the sync/upstream-v0.51.230 branch from f669d8f to f3947ef Compare June 3, 2026 07:15
@Du7chManiac Du7chManiac marked this pull request as ready for review June 3, 2026 07:15
@Du7chManiac

Copy link
Copy Markdown

Babysitter resolution — pushed, full suite green

Resolved the v0.51.188 → v0.51.230 catch-up merge (~42 tags) per the new CLAUDE.md Auto-resolve policy. The conflict-markers commit was replaced with a clean merge commit (f3947ef2).

6 conflicts resolved:

File Resolution
CHANGELOG.md Union — upstream release blocks above the fork entry
README.md Took upstream's restructured doc; re-applied fork notice + GHCR/clone URLs (TheCouchCoder-com). Upstream-removed Tailscale/remote sections dropped (they were upstream content, not fork).
static/i18n.js Upstream's refined zh-Hant wording + kept fork-only always_new_chat keys (translated to zh-Hant)
static/style.css Union — kept both fork admin panel and upstream plugin panel selectors
static/panels.js Union — both admin + plugin in the panel list
static/ui.js Kept fork _setProfileChipLabel helper (titlebar + a11y) with upstream's session-profile value precedence

Iterate-to-green (2 attempts):

  • Adapted a latent F821 (undefined name in a never-taken branch) in tests/test_issue2_admin_endpoints.py, surfaced by upstream's new ruff forward-lint gate.
  • Translated the 2 fork-only always_new_chat zh-Hant strings so upstream's new test_zh_hant_locale.py passes.

Dropped (not tracked by the fork): .github/workflows/* (own CI; token can't push them) and .github/FUNDING.yml (pointed at upstream's account).

Validation: pytest tests/ -q7486 passed, 108 skipped, 0 failed. Issue-#2 auth regression suites green (45 passed). Merge commit preserves the merge-base (parents: master + v0.51.230 tag).

Waiting on CI; will auto-merge with a merge commit once green, per policy.

@Du7chManiac Du7chManiac merged commit dc0785e into master Jun 3, 2026
7 checks passed
@Du7chManiac Du7chManiac deleted the sync/upstream-v0.51.230 branch June 3, 2026 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sync-upstream PRs from .github/workflows/sync-upstream.yml

Projects

None yet

Development

Successfully merging this pull request may close these issues.