Redesign live-to-final assistant replies for running agent sessions #3400

@franksong2702

Summary

Hermes WebUI should redesign the live-to-final assistant reply experience for running agent sessions.

This is one of the most important UX surfaces in Hermes WebUI: when an agent is running, the user needs to understand what is happening now, what has already happened, and where the final answer begins. The current experience has repeatedly exposed edge cases where live progress, tool activity, replay/recovery, auto-compression, and final transcript rendering compete for the same visual surface.

The goal is not just to add a Worklog widget. The goal is to make the assistant reply lifecycle coherent:

  • During a running turn, process text should remain the primary timeline.
  • Tool activity should be visible, quiet, and close to the prose that caused it.
  • When the turn settles, implementation detail should collapse into a compact activity summary above the final answer.
  • The final answer must remain readable, complete, and clearly separate from the worklog.
  • Recovery, session switching, and replay should reconstruct the same structure rather than creating a different UI state.

Why this matters

Running agent sessions are the core Hermes WebUI experience. Users spend most of their time watching an agent reason, inspect files, run tools, recover from reconnects, and eventually produce a final answer. If this surface is noisy or unstable, the product feels unreliable even when the backend is working correctly.

Mature agent clients such as Codex and Claude Code generally treat intermediate tool/progress details as supporting activity rather than as the final answer itself. They make running state visible, but they do not force every internal lifecycle marker to remain in the final transcript. Hermes WebUI should follow the same product direction while preserving its own stronger browser affordances: replay, session switching, resumability, and structured tool details.

Historical context

This issue intentionally cross-references prior Hermes WebUI problems, because this UX has been revisited many times from different angles.

These are not separate random bugs. They are symptoms of one product surface: running agent sessions need one consistent live-to-final assistant reply model.

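Product model

The model uses three levels of detail. L1 is the compact Activity summary on a settled turn, L2 is a tool row, and L3 is the full detail revealed on expansion (command, args, output, and long payloads).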

Live phase

A running assistant turn should show:

  • Visible prose/progress as the primary timeline.
  • Quiet L2 tool rows near the prose that triggered them.
  • Collapsed tool rows/groups by default.
  • L3 details only after expansion: full command, args, output, and long payloads.
  • A bottom live status/timer within the active assistant turn, not at the top of the transcript.
  • Running-only lifecycle markers, such as automatic compression, as transient status rows/dividers rather than persistent transcript content.
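The live-phase rules above can be sketched as a small timeline reducer. This is a minimal illustration, not the Hermes WebUI implementation: every type, field, and event name here (`LiveItem`, `LiveEvent`, `applyLiveEvent`, `tool_started`, and so on) is a hypothetical chosen for the example.

```typescript
// Hypothetical model of one running assistant turn. None of these names
// come from the Hermes WebUI codebase.
interface ToolDetail { command: string; args: string[]; output: string }

type LiveItem =
  | { kind: "prose"; text: string }
  // L2 row: short label only; L3 detail stays hidden until expanded.
  | { kind: "tool"; id: string; label: string; detail: ToolDetail; expanded: boolean }
  // Running-only marker (for example automatic compression).
  | { kind: "status"; label: string; transient: true };

interface LiveTurn { items: LiveItem[]; startedAtMs: number }

type LiveEvent =
  | { type: "prose_delta"; text: string }
  | { type: "tool_started"; id: string; label: string; command: string; args: string[] }
  | { type: "tool_output"; id: string; chunk: string }
  | { type: "status"; label: string };

function applyLiveEvent(turn: LiveTurn, ev: LiveEvent): LiveTurn {
  const items = turn.items.slice();
  switch (ev.type) {
    case "prose_delta": {
      // Consecutive prose deltas extend the same prose segment.
      const last = items[items.length - 1];
      if (last && last.kind === "prose") {
        items[items.length - 1] = { kind: "prose", text: last.text + ev.text };
      } else {
        items.push({ kind: "prose", text: ev.text });
      }
      break;
    }
    case "tool_started":
      // The row lands right after the prose that triggered it, collapsed.
      items.push({
        kind: "tool",
        id: ev.id,
        label: ev.label,
        detail: { command: ev.command, args: ev.args, output: "" },
        expanded: false,
      });
      break;
    case "tool_output": {
      const id = ev.id;
      const i = items.findIndex((it) => it.kind === "tool" && it.id === id);
      const row = i >= 0 ? items[i] : undefined;
      if (row && row.kind === "tool") {
        items[i] = { ...row, detail: { ...row.detail, output: row.detail.output + ev.chunk } };
      }
      break;
    }
    case "status":
      items.push({ kind: "status", label: ev.label, transient: true });
      break;
  }
  return { ...turn, items };
}
```

Keeping prose, tool rows, and transient status in one ordered list is what lets tool activity sit next to the prose that caused it. The bottom status/timer is not modeled here; it could be derived from `startedAtMs` at render time.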

Final phase

A settled assistant turn should show:

  • One compact L1 Activity summary at the top of the assistant reply.
  • The L1 summary collapsed by default.
  • Expanded L1 showing prose/tool Worklog details when the user asks for them.
  • The final answer below the L1 summary, as normal assistant prose.
  • No leftover running-only lifecycle markers that do not help the user interpret the final answer.
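One way to read the final-phase rules is as a pure "settle" step over the live timeline. The sketch below assumes, as a simplification that the issue does not state, that the final answer is the trailing prose segment of the turn; a real implementation would more likely rely on an explicit marker from the backend. All names (`settleTurn`, `SettledTurn`, `WorklogItem`) are hypothetical.

```typescript
// Hypothetical types; not taken from the Hermes WebUI codebase.
type WorklogItem =
  | { kind: "prose"; text: string }
  | { kind: "tool"; id: string; label: string };

type LiveItem = WorklogItem | { kind: "status"; label: string };

interface SettledTurn {
  activity: { summary: string; collapsed: boolean; worklog: WorklogItem[] };
  finalAnswer: string;
}

function settleTurn(items: LiveItem[]): SettledTurn {
  // Running-only lifecycle markers never reach the settled turn.
  const durable = items.filter((it): it is WorklogItem => it.kind !== "status");

  // ASSUMPTION: the final answer is the last prose segment, if the turn
  // ends with prose. Otherwise the answer is left empty rather than guessed.
  let finalAnswer = "";
  let worklog = durable;
  const last = durable[durable.length - 1];
  if (last !== undefined && last.kind === "prose") {
    finalAnswer = last.text;
    worklog = durable.slice(0, -1);
  }

  const toolCount = worklog.filter((it) => it.kind === "tool").length;
  const summary = `${toolCount} tool call${toolCount === 1 ? "" : "s"}`;

  // One L1 summary, collapsed by default, kept separate from the answer.
  return { activity: { summary, collapsed: true, worklog }, finalAnswer };
}
```

Because the answer is returned as its own field rather than as the last worklog entry, the Worklog cannot swallow it, whatever the expanded state of the summary.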

Design requirements

  1. Process prose is the primary live timeline.
  2. Tool rows are supporting activity, visually quieter than prose.
  3. Tool groups and individual tool rows are collapsed by default.
  4. Full command/output details are L3-only.
  5. Final Answer must not be swallowed by the Worklog.
  6. Session switching and refresh must rebuild the same live/final structure from durable state.
  7. Auto Compression should be visible while it is happening, but should not persist as final transcript text.
  8. Recovery-control/internal lifecycle messages must not leak into the visible transcript.
  9. Duplicate stream ownership must not create a hidden active stream that steals live/final events.
  10. Terminal edge cases should be classified explicitly instead of producing misleading final UI.
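Requirement 9 could be approached with a per-session ownership registry, sketched below. This is an illustration under assumptions: `StreamRegistry`, `StreamHandle`, and the idea of keying ownership by a session ID plus a stream ID are hypothetical and not taken from the current code.

```typescript
// Hypothetical handle for one live event stream attached to a session.
interface StreamHandle {
  sessionId: string;
  streamId: string;
  close(): void;
}

class StreamRegistry {
  private active = new Map<string, StreamHandle>();

  // Returns the existing stream for the session if there is one; otherwise
  // opens a new stream. A second start for the same session never creates
  // a hidden competing owner.
  acquire(sessionId: string, open: () => StreamHandle): { handle: StreamHandle; reused: boolean } {
    const existing = this.active.get(sessionId);
    if (existing) {
      return { handle: existing, reused: true };
    }
    const handle = open();
    this.active.set(sessionId, handle);
    return { handle, reused: false };
  }

  // Only the current owner may release. A stale caller holding an old
  // stream ID cannot tear down the stream that replaced it.
  release(sessionId: string, streamId: string): boolean {
    const existing = this.active.get(sessionId);
    if (!existing || existing.streamId !== streamId) {
      return false;
    }
    existing.close();
    this.active.delete(sessionId);
    return true;
  }
}
```

Reusing the existing handle is one of the two options named in the acceptance criteria ("prevented or safely reused"); rejecting the second start outright would be an equally valid policy.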

Non-goals for the first PR

The first implementation slice does not need to solve every related edge case:

  • Queue composer behavior during compression can be handled later.
  • A more explicit degraded/rebuild indicator during slow reattach can be handled later.
  • Native SSE Last-Event-ID support can remain a follow-up.
  • Compression-exhausted/no-final-answer and max-tool-call-limit terminal taxonomy can be refined in follow-up issues/PRs.
  • Broader sidebar/session awareness improvements can remain separate.

Acceptance criteria

A first implementation slice is acceptable if it demonstrates:

  • Live prose and tool rows interleave in the main assistant timeline.
  • Live does not prematurely show the final L1 summary.
  • Final shows one L1 Activity summary above the answer.
  • L1 is collapsed by default and expands to prose plus tool rows.
  • L2 tool rows are quieter than prose and use readable short labels.
  • L3 expansion contains full command/args/output.
  • Auto Compression shows only as a running/live status and disappears from final settled content.
  • Switching away and back to a running session can rebuild prose/tool content from replay rather than showing only an empty Running shell.
  • Recovery-control/internal lifecycle messages remain hidden from the visible transcript.
  • Duplicate same-session stream starts are prevented or safely reused.
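The replay and hidden-message criteria can be checked against a rebuild function like the following sketch. It assumes that every event carries a monotonically increasing `seq` number and that internal lifecycle messages are tagged as such; both are assumptions made for illustration, and the names `ReplayEvent`, `VisibleItem`, and `rebuildVisibleItems` are hypothetical.

```typescript
// Hypothetical event shape. The `seq` field and the "internal" tag are
// assumptions for this sketch, not known properties of Hermes events.
type ReplayEvent =
  | { seq: number; type: "prose"; text: string }
  | { seq: number; type: "tool"; id: string; label: string }
  | { seq: number; type: "internal"; note: string };

type VisibleItem =
  | { kind: "prose"; text: string }
  | { kind: "tool"; id: string; label: string; expanded: boolean };

function rebuildVisibleItems(events: ReplayEvent[]): VisibleItem[] {
  // Replay and live delivery can overlap after a reattach, so the same
  // event may arrive twice. Keep the first copy of each sequence number.
  const bySeq = new Map<number, ReplayEvent>();
  for (const ev of events) {
    if (!bySeq.has(ev.seq)) {
      bySeq.set(ev.seq, ev);
    }
  }
  const ordered = Array.from(bySeq.values()).sort((a, b) => a.seq - b.seq);

  const items: VisibleItem[] = [];
  for (const ev of ordered) {
    if (ev.type === "internal") {
      continue; // recovery-control and lifecycle messages stay hidden
    }
    if (ev.type === "prose") {
      items.push({ kind: "prose", text: ev.text });
    } else {
      items.push({ kind: "tool", id: ev.id, label: ev.label, expanded: false });
    }
  }
  return items;
}
```

Since the function depends only on the set of events and not on their arrival order, a session that was watched live and a session rebuilt after switching away should end up with the same visible structure.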

Suggested PR shape

The PR implementing this should stay focused on the live-to-final assistant reply lifecycle. Supporting fixes such as Automatic Compression display and duplicate stream ownership may be included when they are required to make the running-session experience coherent, but they should be framed as supporting edge-case fixes rather than the headline.

Use `Refs #3400` rather than auto-closing keywords such as `Closes` or `Fixes`, unless the PR fully resolves the broader design space.

Labels: enhancement (New feature or request), ux (User experience / visual polish)
