Title-language detection has German false-positive surface
Follow-up from stage-batch33 / PR #2984 (Release DW, v0.51.151). Opus advisor flagged 5 quality items on the title-language fix.
Item 1 — _detect_title_language German false-positive surface (api/streaming.py:1381-1394)
The German marker set includes very common tech-jargon tokens that also appear in English:
german_markers = {
'warum', 'werden', 'wird', 'wurde', 'hier', 'nicht', 'mehr', 'alte', 'alten',
'bilder', 'angezeigt', 'session', 'prüfe', 'ich', 'die', 'der', 'das', 'den',
'und', 'oder', 'mit', 'für', 'von', 'zu', 'ist', 'sind', 'bitte', 'kannst',
}
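The threshold is 2 marker hits OR any umlaut/ß. The umlaut path is comparatively solid: ß is effectively German-only, and ä/ö/ü are rare in English input, although they also occur in Swedish, Finnish, Turkish, and other languages. The 2-marker path is the hazard: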
- 'session' + 'die' (English verb "to die") → flags as German
- 'session' + 'ich' (rare, but as a typo or in tech-archive contexts) → flags
- 'das' + 'session' (very plausible if someone references a DAS storage system) → flags
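The false positives above can be reproduced with a minimal sketch of the rule as described. This is an assumed reconstruction, not the code at api/streaming.py:1381-1394: the function name `detect_german_current`, the regex tokenizer, and counting distinct markers rather than occurrences are all assumptions made for illustration.

```python
import re

# Marker set copied from the issue text.
GERMAN_MARKERS = {
    'warum', 'werden', 'wird', 'wurde', 'hier', 'nicht', 'mehr', 'alte', 'alten',
    'bilder', 'angezeigt', 'session', 'prüfe', 'ich', 'die', 'der', 'das', 'den',
    'und', 'oder', 'mit', 'für', 'von', 'zu', 'ist', 'sind', 'bitte', 'kannst',
}


def detect_german_current(text: str) -> bool:
    """Assumed shape of the current rule: any umlaut/ß OR at least 2 marker hits."""
    lowered = text.lower()
    if re.search(r"[äöüß]", lowered):
        return True
    # Assumption: tokens are runs of letters; the real tokenizer may differ.
    tokens = set(re.findall(r"[a-zäöüß]+", lowered))
    return len(tokens & GERMAN_MARKERS) >= 2


# English inputs that reach the 2-hit threshold under these assumptions.
print(detect_german_current("Why did my session die after login?"))    # session + die
print(detect_german_current("Mount the DAS volume for this session"))  # das + session
print(detect_german_current("How do I rename a session?"))             # session only
```

Under these assumptions the first two English titles are classified as German and the third is not, which matches the hazard described in the list.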
A safer shape:
- Require ≥3 hits when no umlaut/ß present
- Drop the most-ambiguous tokens ('session', 'die', 'der', 'das'); they're shared with too many languages
- Use a more specific German n-gram (e.g. der die das co-occurrence) instead of single tokens
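One way the three suggestions could combine is sketched below. It is illustrative only: `detect_german_strict`, `STRICT_MARKERS`, `ARTICLES`, and the `min_hits` parameter are hypothetical names, the tokenizer is assumed, and treating two distinct articles as a single extra hit is just one possible reading of "co-occurrence".

```python
import re

# The original marker set minus the four ambiguous tokens named above.
STRICT_MARKERS = {
    'warum', 'werden', 'wird', 'wurde', 'hier', 'nicht', 'mehr', 'alte', 'alten',
    'bilder', 'angezeigt', 'prüfe', 'ich', 'den', 'und', 'oder', 'mit', 'für',
    'von', 'zu', 'ist', 'sind', 'bitte', 'kannst',
}
# Articles only count when at least two distinct ones appear together.
ARTICLES = {'der', 'die', 'das'}


def detect_german_strict(text: str, min_hits: int = 3) -> bool:
    """Sketch: umlaut/ß shortcut, otherwise require min_hits distinct signals."""
    lowered = text.lower()
    if re.search(r"[äöüß]", lowered):
        return True
    tokens = set(re.findall(r"[a-zäöüß]+", lowered))
    hits = len(tokens & STRICT_MARKERS)
    if len(tokens & ARTICLES) >= 2:
        hits += 1  # article co-occurrence contributes one hit, not one per article
    return hits >= min_hits


print(detect_german_strict("Why did my session die after login?"))
print(detect_german_strict("Warum werden die alten Bilder nicht angezeigt"))
print(detect_german_strict("Warum ist das so"))
```

The trade-off in this sketch is recall on short inputs: a brief German title with no umlaut and only two markers (the last call) is no longer detected, so the threshold would need to be checked against real title traffic before adopting it.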
Item 2 — Hardcoded 'Alte Session Bilder' / 'Session Bilder' literal fallback (api/streaming.py:1872-1877)
Branch-specific one-off for a single user's bug-report scenario. Only fires for German + "bilder" + "session" inputs, so blast radius is small. Should be removed once the LLM-side fix is verified in production. Tech debt.
Item 3 — Title-language rule only knows German (api/streaming.py:1397-1407)
Other languages (es, fr, zh, ru per static/i18n.js) get the bare "Match the language" line with no exemplars. The German user gets the structured exemplar treatment; everyone else gets a one-liner. Uneven UX. Could either:
- Drop the German-specific exemplar block and use the generic line everywhere (simpler)
- Add exemplar blocks for the other shipped locales (more consistent UX, larger maintenance surface)
Item 4 — _isSessionActivelyViewed gate is broader than the comment claims (static/messages.js:2155-2158, from PR #2925)
The comment says "if the user has switched to a different session, don't reconnect," but _isSessionActivelyViewed (defined at messages.js:22-26) ALSO returns false when the document isn't focused. The prior _deferStreamErrorIfPageHidden(source) at line 2153 only handles document.hidden — a visible-but-unfocused window slips both gates and skips reconnect.
Fix options:
- Narrow the guard to _isSessionCurrentPane(activeSid) (defined at messages.js:18) which doesn't gate on focus
- Or update the comment to match the broader behavior
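The gap between the two gates can be made concrete with a small language-neutral model, written here in Python rather than the JavaScript of messages.js. Everything in it is an assumption for illustration: `PageState`, `reconnect_outcome`, and the `focus_gated` flag are hypothetical, and the predicates only encode the behavior as described in this item, not the actual function bodies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageState:
    hidden: bool           # models document.hidden
    focused: bool          # models whether the document has focus
    is_current_pane: bool  # models the session still being the one on screen


def reconnect_outcome(state: PageState, focus_gated: bool = True) -> str:
    """Model of the two gates in order: hidden-page deferral, then the view guard."""
    if state.hidden:
        return "deferred"  # first gate: only reacts to a hidden page
    if focus_gated:
        # Guard as described for _isSessionActivelyViewed: current pane AND focused.
        viewed = state.is_current_pane and state.focused
    else:
        # Narrowed guard, as in the first fix option: current pane only.
        viewed = state.is_current_pane
    return "reconnect" if viewed else "skipped"


visible_unfocused = PageState(hidden=False, focused=False, is_current_pane=True)
print(reconnect_outcome(visible_unfocused, focus_gated=True))   # skipped
print(reconnect_outcome(visible_unfocused, focus_gated=False))  # reconnect
```

In this model a visible-but-unfocused window on the current session is neither deferred nor reconnected under the focus-gated guard, while the narrowed guard reconnects it and still skips sessions the user has switched away from.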
Item 5 — Implicit DOM↔INFLIGHT contract on reattach (static/messages.js:661-670)
On reconnect, _lastLiveAssistant is sourced from INFLIGHT[activeSid].messages.findLast(m._live) and seeds assistantText/reasoningText. New SSE tokens append to that seed. If rendered DOM diverges from INFLIGHT (partial restore that updates one but not the other), the user sees doubled or missing tokens after reconnect. The invariant is now load-bearing but isn't asserted anywhere — a regression test that introduces an out-of-sync state and confirms reconnect produces no duplication would pin the contract.
Priority
Item 1 is the most impactful (real users will hit it as English+jargon traffic grows). Items 2–4 are cleanup. Item 5 is regression-prevention.
Suggested labels: enhancement, tech-debt, triage-followup.