feat: add VPS resource health insights #1688
Review
Reading the diff at
🔴 Blocker: syntax error in the regression test file
ROUTES_PY = (REPO_ROOT / "api" / "routes.py").read_text(encoding="utf-8")
AUTH_PY=*** / "api" / "auth.py").read_text(encoding="utf-8")
That looks like an editor/tool artefact from a redacted diff, so the test file simply will not parse.
AUTH_PY = (REPO_ROOT / "api" / "auth.py").read_text(encoding="utf-8")
…which is then asserted against at line 107 (
🟡 Authentication
The endpoint goes through
What concerns me more is the default-no-auth case: a fresh WebUI install with no password set has
Note: the polled
🟢 Implementation looks good
def _cpu_percent() -> float:
    start = _read_proc_stat_cpu()
    time.sleep(_CPU_SAMPLE_SECONDS)  # 0.05s
    end = _read_proc_stat_cpu()
    return _cpu_delta_percent(start, end)
A 50ms blocking sleep on every poll is fine in practice (BaseHTTPServer is threaded), but it is worth noting this will serialize behind the GIL alongside other GETs. At 5s polling per client × small instance, totally negligible.
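For readers unfamiliar with the two-sample approach used by `_cpu_percent`, here is a minimal sketch of how a `/proc/stat` delta can be turned into a percentage. The helper names `parse_cpu_line` and `cpu_delta_percent` are hypothetical and are not taken from the PR's `api/system_health.py`; the field order assumed here is the standard Linux layout of the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

```python
from typing import Tuple


def parse_cpu_line(stat_text: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:]]
            # Count idle + iowait as "not busy" time.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # guest and guest_nice are already folded into user and nice,
            # so only the first eight fields are summed.
            total = sum(values[:8])
            return idle, total
    raise ValueError("no aggregate cpu line found")


def cpu_delta_percent(start: Tuple[int, int], end: Tuple[int, int]) -> float:
    """Busy share of the ticks elapsed between two (idle, total) samples."""
    idle_delta = end[0] - start[0]
    total_delta = end[1] - start[1]
    if total_delta <= 0:
        # No ticks elapsed (or counters went backwards): report zero load.
        return 0.0
    busy = total_delta - idle_delta
    return max(0.0, min(100.0, 100.0 * busy / total_delta))
```

Because both helpers work on plain text and tuples, they can be tested without touching `/proc`, and the only blocking step left in a real collector is the short sleep between the two reads.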
The frontend at
One small nit:
Action items
Once the test file parses I'd be happy with this. Closes #693. The feature itself is well-scoped and matches the shape of the feature request.
Follow-up on the reported test-file syntax blocker:
I left the separate default-no-auth
Correction on the "blocker"
I owe you a correction: the syntax-error claim in my earlier review was wrong, and your follow-up was right. I re-read the actual file bytes at the PR head (
Line 16 is
What happened is the
Net status on the PR
Blocker retracted. Implementation review still stands:
The default-no-auth concern (raised in my prior review §🟡) is unchanged and not a blocker; it is the same posture as the existing
This is mergeable from my side. Will let nesquena take the merge call. Closes #693.
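As an aside on how a syntax-error claim like this can be checked against the real file bytes rather than a rendered or redacted diff, here is a minimal sketch using only the standard library. The helper name `parses_cleanly` is hypothetical and is not part of the PR or its test suite.

```python
import ast


def parses_cleanly(source: bytes, filename: str = "<pr-head>") -> bool:
    """Return True if the given bytes parse as Python source."""
    try:
        ast.parse(source, filename=filename)
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code; ValueError covers inputs such as
        # embedded null bytes on some Python versions.
        return False
    return True
```

Feeding it the bytes fetched at the PR head would distinguish a genuinely broken file from a display artefact; the redacted form of the line quoted in the first review fails this check, while the intact assignment passes.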
I don't think the information is important enough that I'd want to see it always displayed at the top like that; it feels too power-user. I wouldn't mind there being an area in settings or a new icon that gives you a rundown of overall system and Hermes health, including these types of stats, but I think having them always present at the very top is overkill. Will leave open for discussion for 24 hours. @aronprins Thoughts?
It seems this introduces a whole new bar, which is a hard no. Regarding your option, perhaps something above the cog wheel in the left rail, if anything. I wouldn't be a fan of it, but I'm trying to keep an open mind here 😜
I agree, this probably isn't the cleanest implementation. What if it was under Insights? That seems like a better place to put charts on current + historical resource consumption.
Moved the VPS/resource health surface out of the always-visible top chrome and into Insights per the placement feedback.
What changed in follow-up commit
Verification:
UI evidence:
Historical resource charts are intentionally left as a follow-up unless maintainers want this PR to grow a storage/aggregation contract for host metrics.
Looks much better, thanks! Going to move towards review and merge.
Fwiw I think this needs more discussion. @nesquena I'm working in a few branches and might have missed it, but do we already have Insights there? Or are we adding that just for this?
The Insights tab already made it into the project, so this just adds a small part to that tab. For now I consider it an alpha. At some point we can review and decide what stays or goes, or whether we want to do a refresh.
Closed by the v0.51.5 release in PR #1713 (merged at 0ea3dfb, deployed to production). Thanks! Live on production: https://github.com/nesquena/hermes-webui/releases/tag/v0.51.5 🚀
4 PRs (1 surface addition, 3 fixes):
- nesquena#1688 VPS resource health Insights panel (@Michaelyklam, closes nesquena#693)
- nesquena#1709 preserve scroll on stream completion (@Michaelyklam, closes nesquena#1690)
- nesquena#1711 hide rename tooltip on folders (@nesquena-hermes, closes nesquena#1710)
- nesquena#1712 guard localStorage.setItem against QuotaExceededError (@24601)

Tests: 4504 → 4527 (+23). Opus: SHIP, 6/6 verification clean.

Held back: nesquena#1686 (Docker enhance). Opus flagged sibling-repo dep that breaks standalone clones. Left open for follow-up.

Co-authored-by: Michael Lam <Michaelyklam1@gmail.com>
Co-authored-by: 24601 <noreply@github.com>
Thinking Path
What Changed
- api/system_health.py, a dependency-free Linux/stdlib metrics collector for aggregate CPU, memory, and root disk usage.
- GET /api/system/health routing that returns only sanitized aggregate fields plus safe status/error codes.
- System health diagnostics/resource card.
Why It Matters
Self-hosted Hermes users can still inspect basic VPS pressure from the WebUI without SSH, but the diagnostics no longer consume permanent space in every chat. Keeping it inside Insights matches the existing analytics/observability mental model and leaves room for future historical resource charts without overloading the main conversation surface.
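To make the "dependency-free collector returning only sanitized aggregate fields" idea from What Changed concrete, here is a minimal standard-library sketch for the memory and disk side. Everything named here (`parse_meminfo_percent`, `disk_percent`, `health_payload`, and the payload keys) is an assumption for illustration and is not copied from `api/system_health.py`; the actual field names returned by `GET /api/system/health` may differ.

```python
import shutil
from typing import Dict


def parse_meminfo_percent(meminfo_text: str) -> float:
    """Return used-memory percent from /proc/meminfo text (values are in kB)."""
    fields: Dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        tokens = rest.split()
        if tokens and tokens[0].isdigit():
            fields[key.strip()] = int(tokens[0])
    total = fields.get("MemTotal", 0)
    if total <= 0:
        raise ValueError("MemTotal missing from meminfo text")
    # Prefer MemAvailable; fall back to MemFree on kernels that lack it.
    available = fields.get("MemAvailable", fields.get("MemFree", 0))
    return round(100.0 * (total - available) / total, 1)


def disk_percent(path: str = "/") -> float:
    """Return used-disk percent for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    if usage.total <= 0:
        return 0.0
    return round(100.0 * usage.used / usage.total, 1)


def health_payload(cpu: float, memory: float, disk: float) -> Dict[str, object]:
    """Expose only aggregate percentages: no paths, hostnames, or process lists."""
    # Hypothetical response shape; the key names are illustrative only.
    return {
        "status": "ok",
        "cpu_percent": cpu,
        "memory_percent": memory,
        "disk_percent": disk,
    }
```

Keeping the payload to a handful of rounded percentages is one way to limit what an unauthenticated default install would reveal about the host.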
Verification
- /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_issue693_system_health_panel.py -q → 7 passed
- node --check static/ui.js → passed
- node --check static/panels.js → passed
- /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/test_issue693_system_health_panel.py tests/test_insights.py tests/test_issue1257_llm_wiki_status.py -q → 15 passed
- git diff --check → passed
- env -u HERMES_CONFIG_PATH -u HERMES_WEBUI_HOST /home/michael/.hermes/hermes-agent/venv/bin/python -m pytest tests/ -q → 4484 passed, 2 skipped, 3 xpassed, 1 warning, 8 subtests passed
- 127.0.0.1:18788): Insights shows the live System health resource card; Chat no longer has a persistent CPU/RAM/Disk health bar in the top chrome.
UI media:
Risks / Follow-ups
/proc/stat delta sample, so /api/system/health takes roughly 50ms on Linux; that keeps the first poll state-free and dependency-free.
Closes #693
Model Used
AI assisted.
gpt-5.5