Add a Dedicated Pending Messages Endpoint to Optimize SessionStart Hook Performance #7
Description
Summary
When a session starts, hook_session_start.py calls search_memories (GET /api/v1/memories/search) solely to retrieve pending_messages.
However, this endpoint triggers the full retrieval pipeline (Milvus vector search + ElasticSearch + rerank), even though the query is an empty string and the search results themselves are never used — resulting in unnecessary performance overhead.
The root cause is that the pending_messages retrieval logic is coupled to the retrieve_mem path. The fetch_mem path (GET /api/v1/memories) does not return pending_messages, and no lightweight dedicated endpoint currently exists.
Problem
The current call flow in hook_session_start.py:
1. fetch_recent_memories() → GET /api/v1/memories: fetches recent episodic memories (direct MongoDB query, lightweight)
2. search_memories("", method="hybrid") → GET /api/v1/memories/search: used solely to retrieve pending_messages
Step 2 triggers:
- Milvus vector search
- ElasticSearch keyword search
- Rerank (triggered unconditionally in hybrid mode)
The memories results returned by the search are never used in hook_session_start.py — this is pure waste.
Root Cause
The pending_messages retrieval logic lives inside memory_manager.retrieve_mem() and is only triggered via the search endpoint. The fetch_mem() path does not include this logic, and the FetchMemResponse DTO does not have a pending_messages field.
TODO
Expose a dedicated pending messages endpoint (P1)
- Add GET /api/v1/messages/pending
- Parameters: user_id, group_id, limit
- Directly invoke _get_pending_messages() or MemoryRequestLogService.get_pending_messages(), without triggering any vector or keyword search
- Return format should be consistent with the existing pending_messages field
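A minimal, framework-agnostic sketch of what the handler behind GET /api/v1/messages/pending might look like. It only shows the intended shape: parse the three parameters, call the pending-message lookup directly, and return the result under the existing pending_messages key. The names PendingMessageSource and handle_get_pending, the argument order of get_pending_messages(), and the limit default and cap are all assumptions for illustration; the real service signature and web framework wiring may differ.

```python
from typing import Any, Optional, Protocol


class PendingMessageSource(Protocol):
    """Hypothetical stand-in for MemoryRequestLogService (signature assumed)."""

    def get_pending_messages(
        self, user_id: Optional[str], group_id: Optional[str], limit: int
    ) -> list[dict[str, Any]]: ...


DEFAULT_LIMIT = 20  # assumed default; the issue does not specify one
MAX_LIMIT = 100     # assumed upper bound to keep the call lightweight


def handle_get_pending(
    query: dict[str, str], source: PendingMessageSource
) -> dict[str, Any]:
    """Serve GET /api/v1/messages/pending with no vector or keyword search."""
    # Treat missing or empty query parameters as "not provided".
    user_id = query.get("user_id") or None
    group_id = query.get("group_id") or None

    # Clamp limit into [1, MAX_LIMIT]; non-numeric input raises ValueError.
    limit = min(max(int(query.get("limit", DEFAULT_LIMIT)), 1), MAX_LIMIT)

    # Direct lookup only: no Milvus, ElasticSearch, or rerank on this path.
    messages = source.get_pending_messages(user_id, group_id, limit)

    # Reuse the key name of the existing search response so callers can
    # keep their current parsing logic.
    return {"pending_messages": messages}
```

Keeping the handler a plain function that receives its data source makes it easy to unit-test without starting the retrieval stack; the route registration can then be a thin wrapper in whatever framework the API already uses.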
Update hook_session_start.py (P1)
- Replace search_memories("", method="hybrid") with a call to the new GET /api/v1/messages/pending endpoint
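On the hook side, the replacement could look roughly like the sketch below, which uses only the standard library. BASE_URL, the helper names, the default limit, and the assumption that the response carries a top-level pending_messages key are all hypothetical; the actual hook should reuse whatever HTTP client, base URL, and response envelope it already relies on.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Optional

BASE_URL = "http://localhost:8001"  # assumed; use the hook's existing base URL


def _http_get_json(url: str) -> dict[str, Any]:
    """Perform a GET request and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5.0) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_pending_url(
    base_url: str, user_id: str, group_id: Optional[str] = None, limit: int = 20
) -> str:
    """Build the URL for the proposed pending-messages endpoint."""
    params = {"user_id": user_id, "limit": str(limit)}
    if group_id:
        params["group_id"] = group_id
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/api/v1/messages/pending?{query}"


def fetch_pending_messages(
    user_id: str,
    group_id: Optional[str] = None,
    limit: int = 20,
    get_json: Callable[[str], dict[str, Any]] = _http_get_json,
) -> list[Any]:
    """Fetch pending messages without going through the search endpoint."""
    payload = get_json(build_pending_url(BASE_URL, user_id, group_id, limit))
    # Assumes the response exposes pending_messages at the top level.
    return payload.get("pending_messages", [])
```

In hook_session_start.py, the call to search_memories("", method="hybrid") would then be swapped for fetch_pending_messages(user_id), leaving the fetch_recent_memories() call unchanged.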
Reference
- src/agentic_layer/memory_manager.py: retrieve_mem() vs fetch_mem()
- src/api_specs/dtos/memory.py: FetchMemResponse (no pending_messages) vs RetrieveMemResponse (has pending_messages)
- ~/.claude/skills/evermemos/scripts/hook_session_start.py