Environment
- Hermes Agent: hermes-webui (latest, GitHub: nesquena/hermes-webui)
- OS: WSL2 Ubuntu 26.04 on Windows
- Hardware: Samsung Galaxy Book Flex (i5-1035G4, 12GB RAM, WSL2 memory limit 2GB)
- Model: mimo-v2.5-pro via xiaomi provider
Bug Description
After extended use, the WebUI server process (`server.py`) memory grows to 1GB+ RSS, causing all API endpoints to time out. The frontend then falls back to returning locally cached content — specifically the compression marker — for any user input, making the session appear completely broken.
Steps to Reproduce
- Start WebUI on port 7002
- Have a long conversation (500+ messages, 800KB session file) with context compression triggered
- Continue using the WebUI for ~30+ minutes without restart
- Observe that sending any message immediately returns `[Your active task list was preserved across context compression]` — no API call is made
Evidence
Process memory at time of failure
PID 15005 — started 11:38, observed at ~11:47 (9 minutes uptime)
%CPU 17.1 %MEM 56.3 VIRT 3.4GB RSS 1.07GB
command: python server.py (port 7002)
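The figures above can be reproduced with a `ps` one-liner; the process name and port come from this report, and the `pgrep` pattern is otherwise an assumption about how the server is launched:

```shell
# Snapshot CPU/memory for the WebUI process matching the server.py
# invocation on port 7002 described in this report.
PID=$(pgrep -f "server.py.*7002" | head -n1)
if [ -n "$PID" ]; then
  ps -o pid,etime,%cpu,%mem,vsz,rss,cmd -p "$PID"
else
  echo "server.py not running on port 7002"
fi
```

Note that `rss` is reported in KB, so the 1.07GB figure above corresponds to roughly 1120000 in this output.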
API timeout
$ curl -s http://127.0.0.1:7002/api/health/agent
# Timed out after 60s — no response
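For triage it helps to bound the check explicitly rather than wait on curl's default timeout; a sketch using the endpoint from this report:

```shell
# Fail fast: give the health endpoint 5 seconds instead of the 60s
# hang observed above. URL and port are taken from this report.
if curl -s --max-time 5 http://127.0.0.1:7002/api/health/agent > /dev/null; then
  echo "agent healthy"
else
  echo "agent unhealthy or timed out"
fi
```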
Crash diagnostics (auto-collected by keepalive)
crash-20260514_114711/ — WebUI health check failed, keepalive restarted
Session data (backend is healthy)
- 11 independent sessions all triggered context compression at message index 104
- After compression, the backend continued normally — all sessions show valid tool calls and assistant responses for 400+ messages after compression
- The problem is purely frontend: when server.py is unresponsive, the browser-side code returns cached content without attempting an API call
Affected session
session_20260514_065326_8a3de3.json — 544 messages, 796KB, mimo-v2.5-pro
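A rough way to sanity-check the message count without opening the UI — this assumes each message in the session JSON carries a `"role"` key, which is a guess about the file format, not something confirmed by the repo:

```shell
# Stand-in session file for illustration; substitute the real
# session_20260514_065326_8a3de3.json path.
cat > /tmp/demo_session.json <<'EOF'
{"messages": [{"role": "user", "content": "hi"},
              {"role": "assistant", "content": "hello"}]}
EOF
# Rough message count: occurrences of the "role" key.
grep -o '"role"' /tmp/demo_session.json | wc -l   # → 2
```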
Expected Behavior
- WebUI should not leak memory during normal use
- If the API is unreachable, the frontend should show an error toast/banner, not silently return stale content
- The compression marker `[Your active task list was preserved...]` should never be returned as a "response" to user input
Actual Behavior
- `server.py` RSS grows to 1GB+ (56% of system memory) within ~9 minutes
- All API calls time out
- Frontend returns the compression marker text as if it were the model's response, with no API round-trip
- User sees the same message repeated for every input — appears as if the model is "stuck"
Workaround
Kill the WebUI process and let keepalive restart it:
kill $(pgrep -f "server.py.*7002")
# keepalive auto-restarts within 5s
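Until the leak is fixed, the manual kill can be automated as a small watchdog that restarts the process before RSS reaches the failure point. The 800MB threshold and the `pgrep` pattern are assumptions; keepalive is expected to restart the killed process as above:

```shell
# Restart server.py proactively once RSS passes 800MB, well below the
# 1.07GB at which the API stopped responding in this report.
LIMIT_KB=$((800 * 1024))
PID=$(pgrep -f "server.py.*7002" | head -n1)
if [ -n "$PID" ]; then
  RSS_KB=$(ps -o rss= -p "$PID" | tr -d ' ')
  if [ "$RSS_KB" -gt "$LIMIT_KB" ]; then
    kill "$PID"   # keepalive auto-restarts within ~5s
  fi
fi
```

Run it from cron or a loop with a short sleep; it is a stopgap, not a fix for the underlying leak.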
Additional Notes
- The session file itself is not corrupted — all 544 messages are valid
- The compression itself works correctly — the backend continues normally after compression
- The `_isPreservedCompressionTaskListMessage()` function in `ui.js:4544` detects this specific message pattern, which suggests the frontend has special handling for it — but when the server is unreachable, this content leaks through as a "response"
- This may be related to how the frontend handles SSE stream disconnection when the server is overloaded