feat(i18n): German language support + fix fire-and-forget race in agent_end#406
feat(i18n): German language support + fix fire-and-forget race in agent_end#406Banger455 wants to merge 3 commits intoCortexReach:masterfrom
Conversation
rwmjhb
left a comment
There was a problem hiding this comment.
Thanks for the PR @Banger455 — both the German i18n support and the agent_end race fix address real user pain points (issue #260). A few things need attention before this can merge.
Must Fix
Build failure — test/plugin-manifest-regression.mjs fails because the test expects api.hooks["command:new"] === undefined, but appendSelfImprovementNote already registers on command:new upstream. This is likely a pre-existing main/test mismatch rather than something your diff introduced, but the branch still needs to be green. A rebase onto latest main and re-verification should clarify this.
German regex issues (MEDIUM)
Three related problems with the new German trigger patterns:
-
False positives from substring matching —
/immer|niemals|wichtig/iatindex.ts:1266matches inside compound words:Zimmermann,Schwimmerin,Flimmern,Wichtigkeit. Consider using word boundary assertions (\b) or a more targeted pattern. -
Ungrouped alternation in fact triggers —
/mein\s+\w+\s+ist|heißt|wohne|arbeite/iatindex.ts:1265causesheißt,wohne, andarbeiteto match anywhere, not just aftermein. Sentences likeEr heißt PeterorDie Stadt heißt Berlinwill spuriously trigger capture. Wrap the alternation:/mein\s+\w+\s+ist|mein\s+\w+\s+heißt|ich\s+wohne|ich\s+arbeite/ior similar. -
Inconsistency between explicit-remember detection and capture triggers —
isExplicitRememberCommandatindex.ts:741usesmerke?\s+dirwhileshouldCaptureatindex.ts:1262usesmerk es dir. Result:"merk es dir"bypasses the explicit path but hits the capture path;"vergiss das nicht"does the opposite. These should be aligned.
Nice to have
- Timer cleanup (
index.ts:2895-2904): ThesetTimeoutin thePromise.racekeeps Node alive up to 15s afterbackgroundRunresolves. ConsiderclearTimeoutin the.then()handler. Minor, not blocking. - Missing tests: The new German regex patterns and async hook timeout have no direct test coverage. Given the regex issues above, adding a few positive/negative test cases would catch future regressions.
…eval Add German patterns to three trigger arrays that previously only supported English, Czech, and Chinese: - MEMORY_TRIGGERS: detect "merke dir", "vergiss nicht", "ich bevorzuge", "wir haben entschieden", "ab jetzt", personal info patterns, etc. - AUTO_CAPTURE_EXPLICIT_REMEMBER_RE: match "merke dir" / "vergiss nicht" as explicit remember commands (also adds English "remember this") - FORCE_RETRIEVE_PATTERNS (adaptive-retrieval.ts): match "erinnerst du dich", "weißt du noch", "gestern", "habe ich erwähnt", etc. Without these patterns, German-speaking users' store requests silently fall through shouldCapture() and retrieval skips force-retrieve, causing RECALLED: NONE in new sessions. Fixes CortexReach#393
The agent_end hook used `void backgroundRun` (fire-and-forget), which works in gateway mode (persistent process) but races against process.exit() in `--local` / embedded CLI mode — the capture Promise gets killed before it can flush to LanceDB. Changes: - Make the hook async and await backgroundRun with a 15 s timeout - Widen the AgentEndAutoCaptureHook type to allow Promise<void> The timeout is a safety net only; in gateway mode the await resolves instantly because the process stays alive. Session-lock and channel- delivery timing is unaffected because OpenClaw core already calls runAgentEnd() as fire-and-forget with .catch() (cf. Issue CortexReach#260). Fixes CortexReach#393
…erage Addresses all review feedback from @rwmjhb: Regex precision (MUST FIX): - Add \b word boundaries to /immer|niemals|wichtig/ to prevent false positives on compound words like Zimmermann, Schwimmerin, Wichtigkeit - Fix ungrouped alternation in personal info trigger: heißt/wohne/arbeite now require proper context (mein X heißt / ich wohne / ich arbeite) - Align isExplicitRememberCommand and MEMORY_TRIGGERS to cover all German remember variants consistently (merk dir, merke dir, merk es dir, vergiss nicht, vergiss das nicht, nicht vergessen) Timer cleanup (NICE TO HAVE): - Store setTimeout handle and clearTimeout in finally block to prevent the 15s safety timer from keeping Node alive after backgroundRun resolves Test coverage (NICE TO HAVE): - german-i18n-triggers.test.mjs: 54 test cases covering shouldCapture (24 positive, 9 false-positive prevention, 15 retrieval, 6 consistency) - agent-end-async-capture.test.mjs: 4 test cases covering async hook behavior (Promise return, error swallowing, __lastRun, early return) Note on build failure: plugin-manifest-regression.mjs L155 expects command:new === undefined, but selfImprovement now defaults to enabled (26abb04). This is a pre-existing upstream test mismatch unrelated to this PR.
5572d15 to
e773153
Compare
|
Thanks for the thorough review @rwmjhb! All points addressed in Regex fixes (MUST FIX) ✅
Timer cleanup (NICE TO HAVE) ✅
Test coverage (NICE TO HAVE) ✅Two new test files, 58 test cases total:
Build failure noteThe RebaseBranch rebased onto latest |
AliceLJY
left a comment
There was a problem hiding this comment.
Thanks for the well-structured PR! Two things to confirm before merging:
-
async hook safety: Making
agentEndAutoCaptureHookasync is the right direction for--localembedded mode, but could you confirm the OpenClaw corerunAgentEnd()call path handles unhandled rejections from async hooks correctly? If core has any special handling for sync-only hooks, theasyncconversion could silently swallow errors in edge cases. -
German
immertrigger precision:/\b(immer|niemals|wichtig)\b/i—immeris a very common German adverb ("always", "still") and could produce false-positive captures on ordinary sentences like "Das ist immer so". Would you consider scoping it to explicit memory-intent contexts like/\b(immer\s+(?:wenn|daran|denken)|niemals|wichtig)\b/i, or adding it to a lower-confidence tier? Themerke dir/vergiss nichtpatterns look solid.
Happy to approve once these two points are addressed!
Summary
Fixes #393 — Memory store reports
STORE-OKbut recall in a new session returnsRECALLED: NONEfor German-speaking users.Root cause
Three separate bugs compound into a silent data-loss scenario:
MEMORY_TRIGGERS,AUTO_CAPTURE_EXPLICIT_REMEMBER_RE, andFORCE_RETRIEVE_PATTERNScover English, Czech, and Chinese but have no German entries. Phrases like "Merke dir: X" fall throughshouldCapture()→ nothing is stored.void backgroundRunin theagent_endhook — works in gateway mode (persistent process) but races againstprocess.exit()in--local/ embedded CLI mode. The capture Promise gets killed before it flushes to LanceDB.Changes
Commit 1 —
feat(i18n): add German language triggersMEMORY_TRIGGERSindex.tsAUTO_CAPTURE_EXPLICIT_REMEMBER_REindex.tsFORCE_RETRIEVE_PATTERNSsrc/adaptive-retrieval.tsCommit 2 —
fix(auto-capture): await backgroundRun with timeoutasync;void backgroundRunreplaced byawait Promise.race([backgroundRun, 15s timeout])AgentEndAutoCaptureHooktype widened toPromise<void> | voidrunAgentEnd()fire-and-forget with.catch(), so the newawaitdoes not block session locks or downstream deliveries in gateway mode. The timeout is a safety net for hung API calls only.Note on
extractMinMessagesThe default value of
extractMinMessages(currently 4) means smart extraction never fires for CLI one-shot commands (only 1 eligible text). We worked around this with a config override (extractMinMessages: 1), but the default might be worth revisiting upstream — leaving that decision to the maintainers.Verification (on live OpenClaw instance)
regex fallback found 1 capturable text(s)→auto-captured 1 memoriesopenclaw memory-pro search "Emerald Bear"