release: v0.50.250#1368
…al reconnect (#1367)

Follow-up to v0.50.249 / PR #1365 absorbing Opus SHOULD-FIX #2. Originally reset out of #1365 because the reviewer flagged it as out-of-scope; brought back per follow-up guidance that correctness-improving changes should ship even when out of scope.

The clarify SSE health timer at static/messages.js:1715 was an unconditional 60s force-reconnect, not the 'no event in 60s' detector its comment claimed. Now actually a stale-detector that tracks lastEventAt on initial+clarify event arrivals; only reconnects when the gap exceeds 60s. Under healthy conditions the timer never fires.

Co-authored-by: nesquena-hermes <nesquena-hermes@users.noreply.github.com>
Without this check, switching browser tabs while a stream is running causes finalizeThinkingCard() to operate on the wrong session's thinking card DOM: the card belongs to the stream that started it, not the session currently displayed in the tab. The guard ensures finalize only runs when the live assistant turn's session matches the current session.

Co-authored-by: Josh <josh@fyul.link>
Bundles 2 PRs:
- #1366 fix: guard finalizeThinkingCard with session ID check (with pre-release fix)
- #1367 fix(clarify-sse): stale-detector health timer (Opus SHOULD-FIX from v0.50.249)

Pre-release fix on #1366: the contributor's guard depends on liveAssistantTurn.dataset.sessionId, but no code in the repo sets that attribute. Without the fix, the guard would always early-return (undefined !== sid is always true), breaking the streaming UI completely: every assistant turn's thinking card would stay open forever. Added per-site stamps at all 3 places that create liveAssistantTurn in static/ui.js, plus a regression test that fails when any future creation site forgets the stamp.
nesquena
left a comment
Review — end-to-end ✅ (clean approve, bot's pre-release fix verified)
Release v0.50.250 — small batch of 2 PRs. The bot caught and fixed a critical pre-release bug in #1366 (Josh's guard depended on a dataset.sessionId attribute that no code set anywhere in the repo). Verified the fix is correct and the regression test catches the missing-stamp invariant.
Squash audit
2566b43 #1367 fix(clarify-sse): stale-detector health timer (already nesquena APPROVED)
d0257e8 #1366 fix: guard finalizeThinkingCard with session ID check @JKJameson
bc10a22 release: v0.50.250 (CHANGELOG + per-site stamps + regression test)
#1366's Author: Josh <josh@fyul.link> preserved. #1367 is the same JS already approved on its standalone PR. ✅
What the bot caught (already fixed) — guard with no data source
Josh's commit d0257e8 adds the guard at static/ui.js:4288-4289:
const _guardTurn = $('liveAssistantTurn');
if(_guardTurn && S.session && _guardTurn.dataset.sessionId !== S.session.session_id) return;
But liveAssistantTurn.dataset.sessionId was never set anywhere in the repo before this PR. So undefined !== "<sid>" is always true, the guard always early-returns whenever liveAssistantTurn exists, and finalizeThinkingCard() would silently no-op for every assistant turn. The thinking card stays as a spinner forever.
CI was green because none of the 29 #1366 contributor tests + 3447 baseline tests exercise streaming end-to-end with the thinking-card lifecycle.
The bot's release commit bc10a22 adds the missing stamps at all three creation sites in static/ui.js:
- Line 3179 in renderMessages(): when re-rendering messages and the assistant message has _live=true.
- Line 3551 in appendLiveToolCard(): when creating a turn for a tool card before the assistant text starts.
- Line 4339 in appendThinking(): when creating the initial thinking spinner.
All three use the same idempotent pattern: if(S.session) turn.dataset.sessionId=S.session.session_id; immediately after turn.id='liveAssistantTurn';.
The regression test at tests/test_pr1366_finalize_thinking_card_guard.py walks static/ui.js with a regex, finds every <var>.id='liveAssistantTurn' assignment, and asserts a matching <var>.dataset.sessionId= appears within 500 chars after. I verified the test catches the missing-stamp regression by simulating removal of one stamp:
Sites: [(currentAssistantTurn, 149507, True), (turn, 167877, False), (turn, 206329, True)]
Unstamped: [(turn, 167877)]
Test correctly catches the missing stamp ✅
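The source scan described above can be sketched roughly as follows. This is an illustration only: the real regression test is a Python file, and the function name `findUnstampedSites`, the regexes, and the sample source string here are all assumptions, written in JavaScript to match the code under review.

```javascript
// Sketch of the "every creation site must stamp" scan. Hypothetical helper;
// the actual test lives in tests/test_pr1366_finalize_thinking_card_guard.py.
function findUnstampedSites(source, windowChars = 500) {
  // Capture the variable name in `<var>.id = 'liveAssistantTurn'`.
  const idAssign = /([A-Za-z_$][\w$]*)\.id\s*=\s*['"]liveAssistantTurn['"]/g;
  const sites = [];
  for (const m of source.matchAll(idAssign)) {
    const name = m[1];
    const start = m.index + m[0].length;
    const tail = source.slice(start, start + windowChars);
    // Require an assignment (`=`, not `==`) to <same var>.dataset.sessionId.
    const escaped = name.replace(/[$]/g, '\\$&');
    const stamp = new RegExp(
      '(?<![\\w$])' + escaped + '\\.dataset\\.sessionId\\s*=(?!=)'
    );
    sites.push({ name, offset: m.index, stamped: stamp.test(tail) });
  }
  return sites.filter((site) => !site.stamped);
}

// Made-up sample: the first site is stamped, the second is not.
const sample = [
  "turn.id='liveAssistantTurn'; if(S.session) turn.dataset.sessionId=S.session.session_id;",
  "other.id='liveAssistantTurn'; other.className='assistant-turn';",
].join('\n');

const unstamped = findUnstampedSites(sample);
console.log(unstamped.map((site) => site.name));
```

A comparison such as `turn.dataset.sessionId == x` is deliberately not counted as a stamp, since only an assignment satisfies the invariant.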
Behavioural harness — guard logic across 6 scenarios
Extracted the guard into Node:
✅ No live turn: CONTINUE
✅ No session: CONTINUE
✅ Stamped A, viewing A (match): CONTINUE
✅ Stamped A, viewing B (mismatch — bug case): EARLY_RETURN
✅ Unstamped (undefined), viewing B (broken-shipped behavior): EARLY_RETURN
✅ Stamped B, viewing B (B has live turn): CONTINUE
The "Unstamped" case demonstrates exactly why Josh's PR was broken without the bot's fix: when no creation site stamps the dataset attribute, EVERY call to finalizeThinkingCard() early-returns once a liveAssistantTurn exists.
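A harness along these lines reproduces the six outcomes listed above. It is a minimal sketch, not the harness that was actually run: `guardDecision` and the `turn` helper are hypothetical names, and plain objects stand in for the DOM element and for `S.session`.

```javascript
// Mirrors the guard expression quoted from static/ui.js. `liveTurn` stands in
// for $('liveAssistantTurn') and `session` for S.session (both assumptions).
function guardDecision(liveTurn, session) {
  if (liveTurn && session && liveTurn.dataset.sessionId !== session.session_id) {
    return 'EARLY_RETURN';
  }
  return 'CONTINUE';
}

// Fake element: an unstamped turn simply has no dataset.sessionId.
const turn = (sessionId) => ({
  dataset: sessionId === undefined ? {} : { sessionId },
});

const scenarios = [
  ['No live turn', null, { session_id: 'A' }],
  ['No session', turn('A'), null],
  ['Stamped A, viewing A', turn('A'), { session_id: 'A' }],
  ['Stamped A, viewing B', turn('A'), { session_id: 'B' }],
  ['Unstamped, viewing B', turn(undefined), { session_id: 'B' }],
  ['Stamped B, viewing B', turn('B'), { session_id: 'B' }],
];

const results = scenarios.map(([label, t, s]) => [label, guardDecision(t, s)]);
for (const [label, outcome] of results) console.log(`${label}: ${outcome}`);
```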
Traced against upstream hermes-agent
Frontend-only changes. No upstream interaction.
End-to-end trace — cross-session DOM protection
- User in session A types a message; the stream starts. appendThinking() at static/ui.js:4334 creates the live turn → turn.id='liveAssistantTurn', turn.dataset.sessionId='A'.
- User switches to session B (via session click or cross-tab storage event from PR #1359). loadSession(B) → renderMessages() rebuilds the DOM for B's content. The old liveAssistantTurn element from A is removed.
- If B has a live in-flight stream too, B's renderMessages at static/ui.js:3174-3179 creates a new live turn stamped with B.
- Late callback from A's stream fires finalizeThinkingCard(): _guardTurn = $('liveAssistantTurn') → returns whatever's in DOM now (B's turn if B has one, else null).
  - Case (a) DOM has stale A turn (renderMessages hadn't yet run): _guardTurn.dataset.sessionId === 'A', S.session.session_id === 'B' → mismatch → early return. ✅ Protects A's stale DOM.
  - Case (b) DOM has B's turn (renderMessages already replaced): B === B → match → continues. Operates on B's DOM. ⚠️ The guard CANNOT distinguish "called for B legitimately" from "called by A's late callback while B is current."
  - Case (c) DOM has nothing: _guardTurn null → guard passes → most of finalizeThinkingCard's body no-ops because thinkingRow etc. don't exist.
Case (b) is the partial-coverage gap — see Minor observations.
Other audit — confirmed correct
- ✅ PR #1367 (clarify-sse stale-detector) — already approved on standalone PR. Same JS, no drift.
- ✅ No new endpoints / config / env vars. Frontend-only.
- ✅ No XSS surface. dataset.sessionId is only read for comparison; never injected as HTML.
- ✅ No memory leak: the stamp is just a string property on a DOM element that's eventually GC'd along with the element.
- ✅ Thread safety: single-threaded JS event loop. S.session reads are atomic.
- ✅ Backwards compat: pre-fix sessions saved without the stamp don't exist persistently; the stamp lives only on transient DOM elements. Every page load creates fresh stamps.
- ✅ CHANGELOG entries are detailed and credit Josh.
Edge-case matrix
| Scenario | Pre-bot-fix (broken) | Post-fix |
|---|---|---|
| Session A streaming, user stays on A | Guard early-returns (undefined !== A); thinking card stays as spinner | Guard passes (A === A); finalize runs ✅ |
| Session A streaming, user switches to B with B-live | Guard early-returns; spinner sticks for both | Case (b) — guard passes for B's turn |
| Session A streaming, user switches to B with no live | Guard early-returns; spinner sticks | Guard passes (no liveAssistantTurn); body no-ops on missing nodes ✅ |
| Stale A turn in DOM, current session is B | Guard early-returns (correct outcome but for wrong reason) | Guard early-returns (A !== B); ✅ correctly protects |
| S.session is null (transient nav) | Guard early-returns | Guard passes (S.session && short-circuits); body checks for nodes ✅ |
| Future creation site forgot to stamp | Test fails ✅ | Test still passes ✅ |
| Three creation sites consolidated to fewer | test_at_least_three_live_turn_sites fails (sanity) | Acceptable to relax if intentional |
Tests
- PR's own: tests/test_pr1366_finalize_thinking_card_guard.py 3/3 pass; test_clarify_sse.py 29/29 pass.
- Full suite: 3398 passed, 54 skipped, 3 xpassed, 0 failed in 16.49s on bc10a22. (PR description claims 3450; counting drift consistent with prior batch releases. Both runs pass all SSE / lifecycle / streaming tests.)
- CI: 3.11/3.12/3.13 all green.
- Sanity check on regression test effectiveness: I verified that mutating one stamp out causes test_live_turn_creation_sites_stamp_session_id to fail with a clear error message naming the unstamped site.
Minor observations (non-blocking)
- Partial coverage of cross-session callback case: when both A and B have live streams and the user switches A→B, the guard at line 4289 doesn't fully prevent A's late callback from finalizing B's thinking card. A complete fix would have finalizeThinkingCard(streamSessionId) accept the stream's session_id as a parameter and compare directly. Out of scope for this release; the current fix is a strict improvement (handles the stale-A-turn-still-in-DOM case correctly).
- No JS unit test: same constraint as PR #1367. The 3 source-level tests + behavioural harness cover what's reasonable without a JS test infrastructure.
- PR description test count off by ~50 between claimed (3450) and local (3398). Consistent with prior batch releases. Not a defect.
- Stamp uses if(S.session) guard: if S.session is null at creation time (transient state during session loading), the stamp is omitted. The DOM stays unstamped. Then the guard at finalize-time will early-return because dataset.sessionId === undefined. This is the safe default: better to skip cleanup than corrupt the wrong session's DOM.
- No coverage of appendLiveToolCard's creation path in any existing test. The regression test asserts the stamp exists at the source; behavioural coverage of that creation path would need a streaming integration test.
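The "complete fix" suggested in the first observation might look something like the sketch below. Nothing here exists in the repo: `shouldFinalize` is a hypothetical helper, and passing the stream's session id in from the callback is exactly the out-of-scope change the observation describes.

```javascript
// Hypothetical decision helper for a finalizeThinkingCard(streamSessionId)
// variant. The caller supplies the session id of the stream that scheduled
// the callback, so the check no longer depends only on what is in the DOM.
function shouldFinalize(streamSessionId, currentSession, liveTurn) {
  // The callback belongs to a session that is no longer displayed.
  if (currentSession && streamSessionId !== currentSession.session_id) {
    return false;
  }
  // The live turn in the DOM was created by a different stream.
  if (liveTurn && liveTurn.dataset.sessionId !== streamSessionId) {
    return false;
  }
  return true;
}

const turnB = { dataset: { sessionId: 'B' } };
const viewingB = { session_id: 'B' };

// Case (b) from the trace: A's late callback arrives while B's turn is live.
const lateCallbackFromA = shouldFinalize('A', viewingB, turnB);
// B's own callback for B's own turn.
const legitimateCallbackFromB = shouldFinalize('B', viewingB, turnB);

console.log({ lateCallbackFromA, legitimateCallbackFromB });
```

Under these assumptions the late callback from A is rejected even though the DOM-only guard would have let it through.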
Recommendation
✅ Approved. The bot's pre-release fix correctly addresses the broken-as-shipped guard in #1366 — without it, every assistant turn's thinking card would stay as a spinner forever. Per-site stamps at all 3 liveAssistantTurn creation sites + a regression test that catches future site additions that forget the stamp. Behavioural harness confirms the guard logic is correct across 6 scenarios. PR #1367's content is identical to what was already approved on its standalone PR.
Parked at approval — ready for the release agent's merge/tag pipeline.
The timer fires every 60s either way; what changed is whether it triggers a reconnect. Under steady clarify traffic the reconnect never happens; on long-idle sessions it still reconnects every 60-120s (the residual idle churn is now a tracked follow-up rather than the original v0.50.249 unconditional per-minute reconnect). Tightens the CHANGELOG language to match observed behavior.
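As a rough illustration of the behaviour described here, a stale-detector can be sketched with an injectable clock. The names `createStaleDetector`, `onEvent`, `tick`, and `STALE_MS` are assumptions, not the code in static/messages.js, and resetting the window on reconnect stands in for the initial event a fresh connection would deliver.

```javascript
// Sketch of a stale-detector health timer (hypothetical names throughout).
const STALE_MS = 60000;

function createStaleDetector(reconnect, now = () => Date.now()) {
  let lastEventAt = now();
  return {
    // Call on every initial/clarify event arrival.
    onEvent() {
      lastEventAt = now();
    },
    // Call from the periodic timer; reconnects only when the stream is stale.
    tick() {
      if (now() - lastEventAt > STALE_MS) {
        reconnect();
        lastEventAt = now(); // assumption: the new connection restarts the window
        return true;
      }
      return false;
    },
  };
}

// Demo with a fake clock so the timings are deterministic.
let clock = 0;
let reconnects = 0;
const detector = createStaleDetector(() => { reconnects += 1; }, () => clock);

clock = 50000; detector.onEvent(); // an event arrives at 50s
clock = 60000; detector.tick();    // gap is 10s: timer fires, no reconnect
clock = 120000; detector.tick();   // gap is 70s: reconnect
console.log(reconnects);
```

In a browser the `tick` method would be driven by something like `setInterval(() => detector.tick(), STALE_MS)`. With no events at all, this sketch reconnects on every second tick, which is one way to arrive at the idle cadence mentioned above.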
Opus pre-release review
Verdict: SHIP. No MUST-FIX or SHOULD-FIX. 3 NITs, 1 of which (CHANGELOG accuracy) I fixed in
Q1: Diagnosis on #1366: VERIFIED CORRECT
Q2: 3-site stamps: VERIFIED CORRECT
All 3 sites that set
Manually ran the 3 regression-test assertions against
Q3: #1367 stale-detector: VERIFIED CORRECT
Server pushes only
NITs (non-blocking)
🟦 NIT-1: guard semantics narrower than the PR title implies. The 4 call sites in
🟦 NIT-2: #1367 still reconnects every ~120s on idle sessions. Because
🟦 NIT-3: defensive
Branch state after Opus fix
@nesquena: ready for re-review on HEAD
Phase 2 of #1003: extend the autosave pattern from the Appearance panel to the Preferences panel so all preference changes are saved automatically without requiring a manual 'Save Settings' click.

Mirrors the Phase 1 (Appearance) pattern exactly:
- 350ms debounce on field changes (500ms additional debounce on the bot_name text input; effective ~850ms latency for typing)
- Inline status feedback (saving / saved / failed + retry button)
- Clears dirty flag and hides unsaved-changes bar after successful save
- Password field excluded: still requires explicit save (security)
- Model selector excluded: still requires explicit save

13 fields now autosaving: send_key, language, show_token_usage, simplified_tool_calling, show_cli_sessions, sync_to_insights, check_for_updates, sound_enabled, notifications_enabled, sidebar_density, auto_title_refresh_every, busy_input_mode, bot_name.

i18n keys (settings_autosave_saving/saved/failed/retry) already exist in all 8 locales from Phase 1.

Co-authored-by: Feco Linhares <feco.linhares@gmail.com>
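The debounce behaviour described in this commit can be sketched as below. This is not the repo's `_schedulePreferencesAutosave`; `createAutosaveScheduler` and the injectable `timers` object are hypothetical, introduced so the example can run without real timers.

```javascript
// Debounced autosave sketch: each new change cancels the pending save and
// starts a fresh delay, so only the latest payload is sent.
function createAutosaveScheduler(
  save,
  delayMs = 350,
  timers = {
    setTimeout: (fn, ms) => setTimeout(fn, ms),
    clearTimeout: (handle) => clearTimeout(handle),
  }
) {
  let handle = null;
  return function schedule(payload) {
    if (handle !== null) timers.clearTimeout(handle); // drop the prior timer
    handle = timers.setTimeout(() => {
      handle = null;
      save(payload);
    }, delayMs);
  };
}

// Fake timers for a deterministic demo: callbacks run only when flushed.
const pending = new Map();
let nextId = 1;
const fakeTimers = {
  setTimeout(fn) { const id = nextId++; pending.set(id, fn); return id; },
  clearTimeout(id) { pending.delete(id); },
};

const saved = [];
const schedule = createAutosaveScheduler((p) => saved.push(p), 350, fakeTimers);
schedule({ language: 'en' });
schedule({ language: 'pt' }); // replaces the pending save for 'en'
for (const fn of [...pending.values()]) fn(); // simulate the delay elapsing
console.log(saved);
```

Two quick changes result in a single save carrying the second payload, which is the property the "debounce clears prior timer" invariant is meant to pin.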
9 source-level invariants covering #1369:
- All 13 preference fields appear in _preferencesPayloadFromUi
- Listeners use _schedulePreferencesAutosave, not _markSettingsDirty
- Password field STILL uses _markSettingsDirty (security invariant)
- _autosavePreferencesSettings clears _settingsDirty + hides unsaved bar on success
- Status div present in static/index.html
- Status function uses shared i18n keys from Phase 1
- Retry function falls back gracefully when no stored payload
- Debounce clears prior timer (350ms, matching Phase 1)
- Phase 1 (Appearance) autosave still intact
…odel pending (Opus SHOULD-FIX Q1)

Pre-release Opus review of v0.50.250 caught a UX regression in PR #1369: _autosavePreferencesSettings unconditionally cleared _settingsDirty=false and hid the unsaved-changes bar on every successful autosave. But password and model are still committed via the explicit 'Save Settings' button (password for security; model goes through /api/default-model).

Race scenario:
1. User opens System pane, types a new password (sets _settingsDirty=true; bar appears on close)
2. User switches to Preferences, toggles any checkbox -> autosave fires -> _settingsDirty=false, bar permanently suppressed
3. User closes panel -> _closeSettingsPanel short-circuits because !_settingsDirty -> typed password silently discarded (loadSettingsPanel blanks pwField.value='' on next open)

Same shape with model selector: pick a new default model, then toggle any preference -> autosave fires -> no warning on close -> model never persists.

Fix: only clear _settingsDirty and hide settingsUnsavedBar when both the password field is empty AND the model selector matches its on-open snapshot. Pinned by an updated regression test asserting the conditional guard exists.
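The race and its fix can be modelled without the DOM. The sketch below is illustrative only: `afterAutosaveSuccess` and the plain `state` object are hypothetical stand-ins for the globals and form fields the commit message names.

```javascript
// Models the post-autosave decision: the dirty flag is cleared only when
// nothing is still waiting on the explicit 'Save Settings' button.
function afterAutosaveSuccess(state) {
  const pwDirty = Boolean(state.passwordValue);
  const modelDirty = (state.modelValue || '') !== (state.modelOnOpen || '');
  if (pwDirty || modelDirty) return state; // keep the unsaved-changes warning
  return { ...state, dirty: false };
}

// Race scenario: a password was typed, then a preference toggle autosaved.
const withPendingPassword = afterAutosaveSuccess({
  dirty: true,
  passwordValue: 'new-password',
  modelValue: 'model-a',
  modelOnOpen: 'model-a',
});

// Same shape with the model selector changed but not yet saved.
const withPendingModel = afterAutosaveSuccess({
  dirty: true,
  passwordValue: '',
  modelValue: 'model-b',
  modelOnOpen: 'model-a',
});

// Nothing pending: the flag can be cleared safely.
const nothingPending = afterAutosaveSuccess({
  dirty: true,
  passwordValue: '',
  modelValue: 'model-a',
  modelOnOpen: 'model-a',
});

console.log(withPendingPassword.dirty, withPendingModel.dirty, nothingPending.dirty);
```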
Update: PR #1369 added to v0.50.250
Per the user's directive, PR #1369 (@fecolinhares, autosave preferences settings, Phase 2 of #1003) has been added to this batch. Per agreement, no second independent-review round is required for the combined release; instead it gets:
My review of #1369: APPROVE with one fix applied
Pattern conformance: Phase 2 cleanly mirrors the Phase 1 (Appearance) pattern. Same 350ms debounce, same i18n keys (already in all 8 locales from Phase 1), same status div + retry button shape. 12 of 13 fields use the standard
Security invariant preserved: Password field still calls
Server-side compat: I read
Added regression suite:
Opus pre-release review of #1369
🟨 SHOULD-FIX Q1: Autosave-clears-dirty regression (FIXED in
Same shape with model selector. Real UX bug. Fix applied: only clear
const pwField=$('settingsPassword');
const pwDirty=!!(pwField&&pwField.value);
const modelSel=$('settingsModel'); // (corrected from initial 'settingsHermesDefaultModel')
const modelDirty=!!(modelSel&&((modelSel.value||'')!==(_settingsHermesDefaultModelOnOpen||'')));
if(!pwDirty&&!modelDirty){
_settingsDirty=false;
const bar=$('settingsUnsavedBar');
if(bar) bar.style.display='none';
}
Regression test in
🟦 NIT (deferred):
🟥 MUST-FIX: None.
Final branch state
Tests
Per the user's directive, no second nesquena review round needed for the combined release. Proceeding to merge once CI re-runs green on
Shipped: v0.50.250 ✅
Merge:
What shipped
Plus pre-release fixes:
Tests
Hygiene complete
Held PRs (NOT in this release, still on hold)
What's tracked as follow-up
Per your directive: PR #1369 was added without requiring a second independent review round; combined release shipped after Opus + my structural review caught and fixed the SHOULD-FIX dirty-flag race.
Release v0.50.250
Small batch — 2 PRs, one with a critical pre-release fix.
Constituent PRs
Pre-release fix on #1366 (critical)
The contributor's guard was broken as shipped:
liveAssistantTurn.dataset.sessionId is never set anywhere in the repo. So undefined !== "<some-id>" is always true, and finalizeThinkingCard() always early-returns when there's a liveAssistantTurn, breaking the streaming UI completely (every assistant turn's thinking card would stay open forever).
29 contributor tests + CI green don't catch this because none of them exercise streaming end-to-end with the thinking-card lifecycle.
Fix applied at commit bc10a22:
if(S.session) <var>.dataset.sessionId=S.session.session_id; at all 3 sites that create liveAssistantTurn in static/ui.js (lines ~3174, ~3550, ~4338)
tests/test_pr1366_finalize_thinking_card_guard.py with 3 source-level invariants:
dataset.sessionId and compare against S.session.session_id
<var>.id='liveAssistantTurn' must also stamp <var>.dataset.sessionId (catches future regressions)
Pre-release gate
Author: + Co-authored-by: trailers
proc_f3a6df1bb1a6)
Plan after approval
--merge, no squash; preserves all 4 commits)
v0.50.250 on the merge commit
webui_qa_agent.sh
/tmp/wt-v0.50.250
Held PRs (NOT in this release)
Compatibility notes
finalizeThinkingCard runs as before).