Skip to content

fix(chat): keep visible interim progress in the live timeline#2958

Merged
2 commits merged into
nesquena:masterfrom
franksong2702:franksong2702/fix-2956-visible-interim-progress
May 27, 2026
Merged

fix(chat): keep visible interim progress in the live timeline#2958
2 commits merged into
nesquena:masterfrom
franksong2702:franksong2702/fix-2956-visible-interim-progress

Conversation

@franksong2702
Copy link
Copy Markdown
Contributor

@franksong2702 franksong2702 commented May 26, 2026

Thinking Path

  • Hermes WebUI's Compact tool activity contract is "compact inline activity while preserving the agent timeline."
  • Visible interim assistant progress is user-facing timeline content, not raw tool/debug detail.
  • The current interim_assistant stream path could append progress text, then reset the assistant segment before the live Session timeline had a visible row to render into.
  • Compact Activity should still group tool/debug metadata, but visible progress must not become effectively collapsed-only or disappear from the live timeline.
  • This PR restores that timeline boundary and pins the contract in docs and tests.

Fixes #2956.

What Changed

  • Updated the interim_assistant handler so non-echo visible progress:
    • creates the live assistant row first,
    • force-flushes that segment synchronously,
    • closes the current live Activity burst so later tools append after the visible progress boundary,
    • then resets for the next assistant/tool segment.
  • Extended _flushPendingSegmentRender with a narrow {force:true} option for the interim-only case where there may be no pending rAF but the newly-created segment still needs immediate rendering.
  • Updated docs/UIUX-GUIDE.md and docs/rfcs/webui-run-state-consistency-contract.md to state that visible interim progress remains timeline content and must not be available only inside a collapsed Activity disclosure.
  • Updated regression tests that had encoded the opposite compact-activity behavior.
  • Added an Unreleased changelog entry.

Why It Matters

Long-running Session streams need a readable chronological story: user request, visible agent progress, tools/activity, and final answer. Compact Activity can reduce noise, but it should not hide the concise progress text that tells the user what the agent is doing.

Verification

  • RED before implementation:
    • python3 -m pytest tests/test_issue2713_streaming_segment_flush.py::TestInterimAssistantHandlerFlush tests/test_ui_tool_call_cleanup.py::TestToolCallGroupingStatic::test_live_visible_interim_text_preserves_timeline_boundary -q
    • Failed on missing forced flush / Activity boundary in the interim_assistant path.
  • GREEN after implementation:
    • python3 -m pytest tests/test_issue2713_streaming_segment_flush.py::TestInterimAssistantHandlerFlush tests/test_ui_tool_call_cleanup.py::TestToolCallGroupingStatic::test_live_visible_interim_text_preserves_timeline_boundary -q -> 5 passed
    • python3 -m pytest tests/test_issue2713_streaming_segment_flush.py tests/test_ui_tool_call_cleanup.py tests/test_run_journal_frontend_static.py tests/test_regressions.py::test_messages_js_supports_interim_assistant_events tests/test_regressions.py::test_ui_js_does_not_hide_anchor_segments_that_contain_thinking tests/test_regressions.py::test_messages_js_live_assistant_segment_reuses_live_turn_wrapper -q -> 42 passed
    • node --check static/messages.js
    • node --check static/ui.js
    • git diff --check

Risks / Follow-ups

  • This preserves compact grouping for tool/debug details but intentionally treats visible interim assistant progress as a timeline boundary. If reviewers want a different visual treatment for that boundary, the product contract should be updated explicitly rather than inferred from Activity grouping behavior.
  • No screenshots are attached because this is a narrow live-stream contract fix and source-level regression coverage pins the affected event ordering. A browser reproduction can be added if maintainers want manual visual evidence for the in-flight stream state.

Model Used

OpenAI GPT-5 via Codex app agent. Tool use: local repo inspection, GitHub CLI, pytest, Node syntax checks, git worktree, and git push/PR creation.

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Read the diff against static/messages.js, the existing interim_assistant handler at lines 1493-1521, and closeCurrentLiveActivityGroup() in static/ui.js:5309-5315. The fix is small and the contract argument in the PR description is sound. A few observations on whether the change is sufficient.

The new ordering looks right

Before this PR, the handler did _flushPendingSegmentRender()ensureAssistantRow(true)_resetAssistantSegment(). The flush was a no-op if _renderPending was false, so a freshly-pushed visibleInterimSnippets segment that hadn't yet been scheduled for rAF would be reset away before it ever rendered. The new sequence is:

ensureAssistantRow(true);
_flushPendingSegmentRender({force:true});
if(typeof closeCurrentLiveActivityGroup==='function') closeCurrentLiveActivityGroup();
_resetAssistantSegment();
_scheduleRender();

ensureAssistantRow(true) creates the live row first, the forced flush guarantees the segment text materializes synchronously into a transcript element (not an Activity-collapsed slot), then closeCurrentLiveActivityGroup() marks the in-flight tool group as "no longer current" so the next tool burst opens a new group below the interim text. That maps to the contract you cite (visible progress is a timeline boundary, Activity grouping resumes after it).

_flushPendingSegmentRender({force:true}) change is the load-bearing fix

function _flushPendingSegmentRender(options={}){
    const force=!!(options&&options.force);
    if(!assistantBody||(!force&&!_renderPending)) return;
    if(_renderPending) _cancelAnimationFramePendingStreamRender();
    ...

The if(_renderPending) _cancelAnimationFramePendingStreamRender() guard is correct — calling cancel on a null frame ID was probably harmless but the conditional avoids the unnecessary call when forced. Worth double-checking that _cancelAnimationFramePendingStreamRender is safe to call when _renderPending is false — if it's idempotent and zero-cost, the conditional is just a micro-opt; if it has side effects, you want it. From the look of it, it's just cancelAnimationFrame(id); _renderPending=false; so either way is fine, but worth confirming.

One subtle thing: the existing tool/done handlers also call _flushPendingSegmentRender() without {force:true}. Were any of those reachable in a state where _renderPending===false but segmentStart < assistantText.length? If so they have the same bug and silently drop text. Worth a quick audit — the regression test you added is interim-specific but the underlying flush behavior change applies to anything that calls this function. Either the other callers are provably safe (e.g. they always reach this only after _scheduleRender() has run), or they want {force:true} too.

closeCurrentLiveActivityGroup is new behavior for the interim path

This call wasn't there before. Looking at static/ui.js:5309-5315:

function closeCurrentLiveActivityGroup(){
  const turn=$('liveAssistantTurn');
  if(!turn) return;
  turn.querySelectorAll('.tool-call-group[data-live-tool-call-group="1"][data-live-activity-current="1"]').forEach(group=>{
    group.removeAttribute('data-live-activity-current');
  });
}

That just clears the data-live-activity-current attribute. The downstream effect is that the next tool emission will start a new group (because the "current" marker is gone) rather than appending to the existing one. So in compact mode, before this PR you'd get [Activity: tool1, tool2, interim_text_eaten, tool3] collapsed into one group; after, you get [Activity: tool1, tool2] interim_text [Activity: tool3]. That matches the issue's "preserve chronology" ask.

Worth verifying with a manual reproduction (the PR description notes screenshots aren't attached) that the visual seam between the closed group and the next one isn't jarring — e.g. that the elapsed-timer started by _startActivityElapsedTimer on the old group doesn't keep counting on a stale DOM node. From a quick scan I don't see it being cleared on group-close, only on group-finalize. Probably fine since the timer reads from data-live-activity-started-at and stops when the group is no longer "current"... but worth a check.

Tests pin the static contract, not the runtime behavior

tests/test_issue2713_streaming_segment_flush.py::TestInterimAssistantHandlerFlush and the cleanup test assert the source contains specific strings (ensureAssistantRow(true) before _flushPendingSegmentRender({force:true}), etc.). That's the standard static-pattern test style for static/*.js in this repo, but it means a refactor that preserves behavior but uses different identifiers will trip CI. Acceptable for now; just noting.

A real DOM regression test for this would need jsdom or Playwright, which the repo doesn't run. The static-pattern coverage is the best you can do without that — fine.

Smaller things

  • The CHANGELOG entry and docs/UIUX-GUIDE.md update both say the right thing. The RFC update in docs/rfcs/webui-run-state-consistency-contract.md is the bit that matters long-term — that's where future PRs will (or won't) look before changing this code.
  • tests/test_ui_tool_call_cleanup.py change to test_live_visible_interim_text_preserves_timeline_boundary is the inversion of an older test asserting the opposite. The PR description's "Risks" section flags this; good. Worth checking whether fix: keep compact tool activity grouped #2638 (named as the regression source) has a similar contract test that also needs to be inverted or removed.

Overall: small, targeted, contract-grounded. The forced-flush idea is the right primitive. I'd want a maintainer eye on whether other _flushPendingSegmentRender() callers have the same drop-on-no-pending bug before this lands, and whether the new closeCurrentLiveActivityGroup boundary plays well with the elapsed-timer on the now-closed group.

@franksong2702
Copy link
Copy Markdown
Contributor Author

Follow-up pushed in a9ea5604 for the review notes.

What changed:

  • Audited the remaining _flushPendingSegmentRender() call site in the live tool handler and changed it to _flushPendingSegmentRender({force:true}), so the same no-pending-rAF text-drop condition is covered at tool boundaries too.
  • Tightened the static regression test to require forced flush before the tool handler resets the assistant segment.
  • Updated the live Activity elapsed timer path so a group whose data-live-activity-current marker has been cleared stops being treated as the active elapsed-timer target.
  • Added static coverage that the elapsed timer checks the current Activity marker.

Verification:

  • pytest tests/test_issue2713_streaming_segment_flush.py tests/test_ui_tool_call_cleanup.py tests/test_run_journal_frontend_static.py tests/test_regressions.py::test_messages_js_supports_interim_assistant_events tests/test_regressions.py::test_ui_js_does_not_hide_anchor_segments_that_contain_thinking tests/test_regressions.py::test_messages_js_live_assistant_segment_reuses_live_turn_wrapper -q -> 42 passed
  • node --check static/messages.js
  • node --check static/ui.js
  • git diff --check
  • GitHub Actions is green on Python 3.11, 3.12, and 3.13.

Remaining limitation:

  • This still uses the repo's existing static JS regression style rather than a DOM/browser test. The new coverage locks the affected event ordering and timer guard, but it is not a visual screenshot assertion.

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Holding briefly — interim progress in live timeline. +69/-23 in messages.js + ui.js + new test, plus updates to docs/UIUX-GUIDE.md and an RFC. Surface overlap with the recent #2907 (echo dedupe, shipped in v0.51.136) and #2928 (single SSE source, shipped in v0.51.136).

What I want to verify:

  1. Does this PR's behavior agree with fix(streaming): suppress visible progress echoes #2907's "drop visible-output echoes through reasoning channel" or contradict it?
  2. RFC change (docs/rfcs/webui-run-state-consistency-contract.md) — needs Contract Routing per the contract-routing rule.
  3. The "live timeline" surface is new — does it add new always-visible chrome?

Will revisit next sweep. @franksong2702, thanks.

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Merged in Release DM / v0.51.141 (stage-batch23 — 4-PR second hold-bucket pass with PRs #2506 #2792 #2888 #2958).

Thanks @franksong2702! 🚢

franksong2702 pushed a commit to franksong2702/hermes-webui-fork that referenced this pull request May 27, 2026
# Conflicts:
#	CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regression: Compact tool activity hides visible interim progress from the live Session timeline

2 participants