Skip to content

fix: stop replaying reasoning-only history#2536

Merged
1 commit merged into
nesquena:masterfrom
Michaelyklam:fix/issue-2514-activity-duplication
May 18, 2026
Merged

fix: stop replaying reasoning-only history#2536
1 commit merged into
nesquena:masterfrom
Michaelyklam:fix/issue-2514-activity-duplication

Conversation

@Michaelyklam
Copy link
Copy Markdown
Contributor

@Michaelyklam Michaelyklam commented May 18, 2026

Thinking Path

  • Long WebUI sessions rely on persisted transcript/context reconstruction to keep reloads and follow-on turns stable.
  • Issue 思考过程错误显示 #2514 includes a redacted session where old reasoning-only assistant entries have multiplied into thousands of blank assistant turns with stale Thinking metadata.
  • Those entries are useful as display metadata for interrupted/canceled turns, but they are not valid provider-facing conversation history.
  • The narrow fix is to keep visible Thinking cards in the transcript while preventing reasoning-only display entries from being replayed to the model or reinserted into restored context.

What Changed

  • Added a backend guard that identifies reasoning-only assistant display entries and excludes them from _sanitize_messages_for_api() / API-safe context alignment.
  • Stopped _restore_reasoning_metadata() from reinserting prior reasoning-only display entries into model-facing context while still restoring them for the visible transcript when needed.
  • Kept visible assistant replies that also carry reasoning metadata intact.
  • Cleared stale settled .agent-activity-thinking rows before compact Activity rerenders rebuild the transcript, avoiding duplicate Thinking rows on repeated renders/session switches.
  • Added regression coverage for provider-history sanitization, reasoning metadata restoration, and settled Activity rerender cleanup.
  • Added a release-note entry.

Why It Matters

This prevents long sessions from feeding blank assistant Thinking artifacts back into the next turn and compounding stale reasoning/tool metadata. The UI can still show legitimate Thinking entries, but old display-only internals no longer metastasize into conversation history or inflated Activity counts.

Verification

  • env -u HERMES_CONFIG_PATH -u HERMES_WEBUI_HOST /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_sprint49.py tests/test_ui_tool_call_cleanup.py tests/test_issue1361_cancel_data_loss.py tests/test_pr1375_partial_tool_calls_sanitize.py tests/test_orphaned_tool_messages.py -q — 63 passed
  • env -u HERMES_CONFIG_PATH -u HERMES_WEBUI_HOST /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_sprint49.py::test_restore_reasoning_metadata_preserves_existing_timestamps tests/test_sprint42.py::test_streaming_restores_prior_reasoning_metadata_after_followup tests/test_sprint42.py::test_routes_restores_prior_reasoning_metadata_after_followup -q — 3 passed
  • python3 -m py_compile api/streaming.py
  • /home/michael/.hermes/hermes-agent/venv/bin/python -m py_compile api/streaming.py api/routes.py
  • node --check static/ui.js
  • git diff --check
  • Manual redacted-session probe from the issue attachment: _sanitize_messages_for_api() reduced the attached 2,202-message sample to 440 provider-facing messages with 0 blank assistant turns.

UI media: not attached because this is a history/context sanitation and rerender idempotence fix rather than a new visual surface; the visible outcome depends on a large reporter session attachment and is covered by source/regression checks.

Risks / Follow-ups

  • Existing already-bloated saved sessions may still contain old reasoning-only display entries until a future repair/migration cleans persisted session files. This PR prevents them from being replayed into provider history and stops settled rerenders from stacking duplicates.
  • Tool-call duration semantics mentioned in the issue thread may deserve a separate focused follow-up if reporters still see per-tool timing confusion after the history bloat stops.

Closes #2514
Refs #2535

Model Used

AI-assisted change with repository inspection, targeted editing, and shell-based test verification.

@Michaelyklam Michaelyklam force-pushed the fix/issue-2514-activity-duplication branch 2 times, most recently from c39c6a5 to b57ed3b Compare May 18, 2026 16:40
@nesquena-hermes
Copy link
Copy Markdown
Collaborator

CI failure — test (3.11) red on this head

Heads up @Michaelyklam — the most recent push has test (3.11) failing with test (3.12) and test (3.13) cancelled. Won't be able to fold this into a batch release while CI is red.

Could you check tests/test_sprint49.py and tests/test_ui_tool_call_cleanup.py against the new guard? The fix shape itself (transcript display vs provider-replay separation) looks right per #2514, but the regression coverage needs to land green before we can route this through the merge queue.

Will revisit on the next sweep tick once CI clears.

@Michaelyklam Michaelyklam force-pushed the fix/issue-2514-activity-duplication branch from 47d3d21 to e94827f Compare May 18, 2026 17:51
@Michaelyklam
Copy link
Copy Markdown
Contributor Author

Michaelyklam commented May 18, 2026

Rebased this on the latest master after the release batch moved the changelog, keeping the PR note under [Unreleased] and preserving the current release notes.

Verification on the rebased head:

  • pytest tests/test_sprint49.py::test_restore_reasoning_metadata_preserves_existing_timestamps tests/test_sprint42.py::test_streaming_restores_prior_reasoning_metadata_after_followup tests/test_sprint42.py::test_routes_restores_prior_reasoning_metadata_after_followup tests/test_ui_tool_call_cleanup.py -q — 24 passed
  • python -m py_compile api/streaming.py api/routes.py
  • node --check static/ui.js
  • git diff --check origin/master...HEAD
  • git merge-tree --write-tree origin/master HEAD
  • GitHub Actions test (3.11), test (3.12), and test (3.13) — passed on the rebased head

The PR is mergeable again.

@nesquena-hermes nesquena-hermes closed this pull request by merging all changes into nesquena:master in 718a4c7 May 18, 2026
Michaelyklam pushed a commit to Michaelyklam/hermes-webui that referenced this pull request May 18, 2026
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request May 19, 2026
… 0.51.92) (#560)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/nesquena/hermes-webui](https://github.com/nesquena/hermes-webui) | patch | `0.51.90` → `0.51.92` |

---

### Release Notes

<details>
<summary>nesquena/hermes-webui (ghcr.io/nesquena/hermes-webui)</summary>

### [`v0.51.92`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v05192--2026-05-19--Release-BP-stage-385--7-PR-full-sweep-batch--RFC-Slice-3c-clarification--workspace-tree-icon-alignment--project-move-cache-refresh--auto-compression-handoff-metadata--Grok-OAuth-provider-catalog--anonymous-custom-endpoint-picker-fallback--PWA-standalone-reload--pull-to-refresh)

[Compare Source](nesquena/hermes-webui@v0.51.91...v0.51.92)

##### Fixed

- **PR [#&#8203;2563](nesquena/hermes-webui#2563 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2554](nesquena/hermes-webui#2554)) — Align workspace-tree file rows with sibling directory rows by reserving the same expand/collapse toggle slot for files via a new `.file-tree-toggle-placeholder` element. Expanded directories now show child files stepped in at the same icon column as child folders. Directory toggles and file interactions are unchanged; source-level regression coverage and before/after PNGs included.
- **PR [#&#8203;2561](nesquena/hermes-webui#2561 by [@&#8203;nanookclaw](https://github.com/nanookclaw) (closes [#&#8203;2551](nesquena/hermes-webui#2551)) — Refresh the authoritative `_allSessions` cache when the project picker moves a session to/from a project. Previous code mutated only the shallow sidebar row copy, so `renderSessionListFromCache()` re-read the unchanged cache and repainted a stale project dot until the next `/api/sessions` poll healed the UI. Both the "Removed from project" and "Moved to <project>" branches now write the new `project_id` into `_allSessions[idx]` before re-rendering.
- **PR [#&#8203;2567](nesquena/hermes-webui#2567 by [@&#8203;dso2ng](https://github.com/dso2ng) (refs [#&#8203;2477](nesquena/hermes-webui#2477)) — Surface automatic-compression handoff metadata through the `compressed` SSE event so the active browser stream keeps its completion card even after the backend rotates the session id from the origin to a compressed continuation. The event now carries both `old_session_id` and `new_session_id`/`continuation_session_id`; the frontend `compressed` listener accepts either, and the automatic-compression detail line names the compressed continuation session so the done state isn't silently dropped.
- **PR [#&#8203;2568](nesquena/hermes-webui#2568 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2545](nesquena/hermes-webui#2545)) — Add the Hermes Agent `xai-oauth` provider to the WebUI's OAuth provider catalog so Grok OAuth accounts authenticated via the Hermes CLI appear in Settings → Providers and the `/api/models` picker. The provider is treated as CLI-managed OAuth (no WebUI API-key form) and uses the live Hermes CLI model catalog when available with a Grok 4.20 static fallback.
- **PR [#&#8203;2550](nesquena/hermes-webui#2550 by [@&#8203;espokaos-ops](https://github.com/espokaos-ops) (refs [#&#8203;2542](nesquena/hermes-webui#2542)) — Keep anonymous custom OpenAI-compatible endpoints in the model picker even when the configured `/v1/models` probe fails. Lightweight relays and llama-server-style deployments that authenticate `/v1/chat/completions` but not `/v1/models` no longer have their provider group silently dropped from the picker. Users can type a model id manually in the free-form input when no live catalog is available.

##### Added

- **PR [#&#8203;2548](nesquena/hermes-webui#2548 by [@&#8203;espokaos-ops](https://github.com/espokaos-ops) — Add a PWA-standalone reload affordance. A small refresh button appears in the app titlebar (visible only under `@media (display-mode: standalone), (display-mode: fullscreen)`) so users running the WebUI as an installed home-screen PWA can reload without re-launching the app. Adds a complementary pull-to-refresh gesture on the messages container with an 80px threshold and a smooth-scroll-to-top guard so accidental triggers while reading history feel intentional. 4-viewport screenshots (390/1280/1440/1920, light/dark, hover/idle) included under `docs/pr-media/2548/`.

##### Documentation

- **PR [#&#8203;2560](nesquena/hermes-webui#2560 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (refs [#&#8203;1925](nesquena/hermes-webui#1925)) — Clarify the RuntimeAdapter Slice 3c state after [#&#8203;2544](nesquena/hermes-webui#2544) shipped. The RFC now distinguishes shipped `/api/goal` routing through `RuntimeAdapter.update_goal(...)` from the still-staged `queue_message(...)` protocol method, and explicitly warns not to add a new server-side queue endpoint or queue scheduler merely for adapter symmetry while `/queue` remains browser-side queue/drain behavior.

### [`v0.51.91`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v05191--2026-05-18--Release-BO-stage-384--5-PR-full-sweep-batch--reasoning-replay-history-fix--archive-extract-per-session-inbox--fallback-streaming-warnings--sanitized-custom-provider-env-hints--Slice-3c-queuegoal-adapter-routing)

[Compare Source](nesquena/hermes-webui@v0.51.90...v0.51.91)

##### Fixed

- **PR [#&#8203;2536](nesquena/hermes-webui#2536 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2514](nesquena/hermes-webui#2514), refs [#&#8203;2535](nesquena/hermes-webui#2535)) — Stop reasoning-only Thinking entries from being replayed into provider-facing history as blank assistant turns. Long WebUI sessions were accumulating duplicated stale Thinking blocks and inflated Activity/tool metadata on later turns when reasoning-only display entries (from interrupted/canceled turns) got reinserted into the restored conversation history. The fix keeps visible Thinking cards in the transcript while filtering them out of provider-facing replay. Settled compact Activity rerenders now also clear previously inserted Thinking rows before rebuilding the visible transcript.
- **PR [#&#8203;2520](nesquena/hermes-webui#2520 by [@&#8203;OneFat3](https://github.com/OneFat3) (refs [#&#8203;2247](nesquena/hermes-webui#2247)) — Route archive extraction (`/api/upload/extract`) through the per-session attachment inbox (`_session_attachment_dir`) instead of hardcoded `Path(s.workspace)`, matching the single-file upload path. Extracted archives now land at `<attachment_root>/<session_id>/<archive_stem>/` so session deletion cleanup covers them and per-session isolation is preserved when `HERMES_WEBUI_ATTACHMENT_DIR` is configured.
- **PR [#&#8203;2505](nesquena/hermes-webui#2505 by [@&#8203;cyberdyne187](https://github.com/cyberdyne187) — Surface provider fallback and rate-limit lifecycle notices as auto-clearing fallback warnings in the streaming composer status. The new bridge in `_agent_status_callback` matches agent lifecycle messages containing `rate limited` / `switching to fallback` / `falling back` / `fallback activated` / `trying fallback` and emits them as `warning` events with `type=fallback`, so the existing `static/messages.js` warning channel surfaces them with the correct auto-clear contract instead of letting them drop silently.
- **PR [#&#8203;2556](nesquena/hermes-webui#2556 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2541](nesquena/hermes-webui#2541)) — Sanitize auto-generated custom-provider API-key environment variable names so endpoint-derived provider ids such as `custom:gpu.local-8000` use POSIX-safe names like `CUSTOM_GPU_LOCAL_8000_API_KEY`. Runtime custom-provider key resolution now checks the sanitized env var first and falls back to the legacy punctuation-preserving name with a one-shot deprecation warning. Configured literal `api_key` values and explicit `key_env` config are unchanged.

##### Documentation

- **PR [#&#8203;2544](nesquena/hermes-webui#2544 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (refs [#&#8203;1925](nesquena/hermes-webui#1925)) — Implement the first Slice 3c RuntimeAdapter control routing. `RuntimeAdapter` / `LegacyJournalRuntimeAdapter` now expose `queue_message(...)` and `update_goal(...)` as protocol-translator delegates, and the `/api/goal` route uses `update_goal(...)` only when `HERMES_WEBUI_RUNTIME_ADAPTER=legacy-journal` is enabled while preserving the legacy-direct response shape. The change keeps `/queue`'s existing browser-side drain semantics and goal post-turn evaluation in the current agent loop; no runner/sidecar, WebUI-owned queue, goal scheduler, cached-agent table, or execution-survives-restart claim is introduced.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19-->

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/560
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

思考过程错误显示

2 participants