Description
Bug: Memory search fails to match words with punctuation
Summary
The extractWords function in memory/inmemory.go uses space-only splitting and doesn't strip punctuation, causing memory search to miss relevant results when text contains punctuation or non-space whitespace.
Reproduction
Scenario: Add a session with content containing punctuation, then search for a word without punctuation.
// Add session with punctuation
session := makeSession("app1", "user1", "sess1", []*session.Event{
	{
		LLMResponse: model.LLMResponse{
			Content: genai.NewContentFromText("The agent works great!", genai.RoleModel),
		},
	},
})
memSvc.AddSession(ctx, session)
// Search for "great" (without punctuation)
resp, _ := memSvc.Search(ctx, &memory.SearchRequest{
	AppName: "app1",
	UserID:  "user1",
	Query:   "great",
})
// Expected: 1 memory found
// Actual: 0 memories (because stored token is "great!" not "great")
Root Cause
File: memory/inmemory.go, line ~155
func extractWords(text string) map[string]struct{} {
	res := make(map[string]struct{})
	for s := range strings.SplitSeq(text, " ") { // ← Only splits on space
		if s == "" {
			continue
		}
		res[strings.ToLower(s)] = struct{}{} // ← Doesn't strip punctuation
	}
	return res
}
Issues:
- Space-only splitting: strings.SplitSeq(text, " ") doesn't handle tabs or newlines, so tokens separated by them stay glued together (runs of spaces are covered only by the empty-string check)
- No punctuation normalization: "great!" is stored as-is and won't match "great"
- Case normalization alone is not enough: lowercasing is applied to the token with its punctuation still attached
Impact
- Search accuracy degraded: Users searching for "error" won't find memories containing "error." or "error," or "error!"
- Common patterns affected:
  - Sentences ending with punctuation (., !, ?)
  - Comma-separated lists
  - Quoted text
  - Multi-line responses with \n or \t
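The patterns above can be checked in isolation with a minimal sketch of the current tokenization. The name extractWordsCurrent and the sample sentences are illustrative assumptions, and strings.Split stands in for strings.SplitSeq so the sketch does not depend on a recent Go release; both produce the same tokens.

```go
package main

import (
	"fmt"
	"strings"
)

// extractWordsCurrent mirrors the behavior described in this issue:
// split on single spaces, skip empty tokens, lowercase, keep punctuation.
// (Hypothetical name; strings.Split replaces strings.SplitSeq here.)
func extractWordsCurrent(text string) map[string]struct{} {
	res := make(map[string]struct{})
	for _, s := range strings.Split(text, " ") {
		if s == "" {
			continue
		}
		res[strings.ToLower(s)] = struct{}{}
	}
	return res
}

func main() {
	// One sample per affected pattern; the sentences are made up for illustration.
	cases := []struct{ text, query string }{
		{"The agent works great!", "great"},     // sentence-ending punctuation
		{"Found an error, retrying.", "error"},  // comma-separated text
		{`She said "hello" twice`, "hello"},     // quoted text
		{"first line\nsecond line", "second"},   // newline instead of a space
	}
	for _, c := range cases {
		_, found := extractWordsCurrent(c.text)[c.query]
		fmt.Printf("query=%q found=%v\n", c.query, found)
	}
}
```

Every lookup reports found=false: the stored tokens are "great!", "error,", "\"hello\"", and "line\nsecond" rather than the bare words.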
Proposed Fix
Replace space-only splitting with proper whitespace tokenization and strip punctuation:
// Requires the "strings" and "unicode" packages.
func extractWords(text string) map[string]struct{} {
	res := make(map[string]struct{})
	for _, word := range strings.Fields(text) { // Splits on all whitespace
		// Strip leading and trailing punctuation
		cleaned := strings.TrimFunc(word, func(r rune) bool {
			return !unicode.IsLetter(r) && !unicode.IsNumber(r)
		})
		if cleaned == "" {
			continue
		}
		res[strings.ToLower(cleaned)] = struct{}{}
	}
	return res
}
Alternative: Use a proper tokenizer/stemmer for production-grade search, but the above fix would resolve the immediate issue.
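To see what the trimming approach does and does not cover, here is a self-contained sketch. extractWordsTrim follows the proposed fix; extractWordsFieldsFunc is a hypothetical variant, not part of the proposal, that treats every non-alphanumeric rune as a separator. The helper names and the sample text are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isWordRune reports whether r should be kept as part of a token.
func isWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsNumber(r)
}

// extractWordsTrim follows the proposed fix: split on whitespace,
// then trim leading and trailing non-alphanumeric runes.
func extractWordsTrim(text string) map[string]struct{} {
	res := make(map[string]struct{})
	for _, word := range strings.Fields(text) {
		cleaned := strings.TrimFunc(word, func(r rune) bool { return !isWordRune(r) })
		if cleaned == "" {
			continue
		}
		res[strings.ToLower(cleaned)] = struct{}{}
	}
	return res
}

// extractWordsFieldsFunc (hypothetical variant) splits on every
// non-alphanumeric rune, so interior punctuation also separates tokens.
func extractWordsFieldsFunc(text string) map[string]struct{} {
	res := make(map[string]struct{})
	notWord := func(r rune) bool { return !isWordRune(r) }
	for _, word := range strings.FieldsFunc(text, notWord) {
		res[strings.ToLower(word)] = struct{}{}
	}
	return res
}

// has reports whether w is present in the token set.
func has(set map[string]struct{}, w string) bool {
	_, ok := set[w]
	return ok
}

func main() {
	text := "Error: connection timeout! Please retry (key=value)."
	trim := extractWordsTrim(text)
	split := extractWordsFieldsFunc(text)

	// Both variants find the words from the test case below.
	fmt.Println(has(trim, "error"), has(trim, "timeout"), has(trim, "retry"))
	// Only the FieldsFunc variant finds a word behind interior punctuation.
	fmt.Println(has(trim, "value"), has(split, "value"))
}
```

The trimming version keeps "key=value" as a single token, while the FieldsFunc variant splits it into "key" and "value". The trade-off is that the variant also breaks contractions such as "don't" into "don" and "t", so which behavior is preferable depends on the kind of text being stored.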
Test Case to Add
{
	name: "match words with punctuation",
	initSessions: []session.Session{
		makeSession(t, "app1", "user1", "sess1", []*session.Event{
			{
				LLMResponse: model.LLMResponse{
					Content: genai.NewContentFromText("Error: connection timeout! Please retry.", genai.RoleModel),
				},
			},
		}),
	},
	req: &memory.SearchRequest{
		AppName: "app1",
		UserID:  "user1",
		Query:   "error timeout retry", // No punctuation
	},
	wantResp: &memory.SearchResponse{
		Memories: []memory.Entry{/* should find the memory */},
	},
},
Environment
- Version: main branch (commit: latest as of 2026-02-16)
- Go version: 1.22+
Additional Context
This is particularly problematic for AI agent memory since LLM responses naturally contain punctuation. The current implementation significantly reduces search recall in real-world usage.