Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,29 @@
# Hermes Web UI -- Changelog

## [v0.51.16] — 2026-05-07 — 3-PR contributor batch (anthropic env race close, CLI tool metadata, model picker reset)

### Fixed

- **PR #1768** by @franksong2702 — Serialize Anthropic env fallback reads (closes #1736, the architectural follow-up filed in v0.51.8 sweep). Wraps `_clear_anthropic_env_values()` and the runtime-provider resolver behind `_ENV_LOCK` (the same `threading.Lock` already serializing env save/restore in `streaming.py`). New helper `resolve_runtime_provider_with_anthropic_env_lock()` in `api/oauth.py` is called from 3 sites in `api/routes.py` and 2 in `api/streaming.py`. Opus stage-310 verified: same-lock not a new lock (no ordering risk), nested acquires are sequential not nested (no deadlock), the lock is released before the agent runs (chat throughput unaffected). `api/oauth.py +36`, `api/routes.py +18`, `api/streaming.py +16`, +52 LOC test coverage in `tests/test_issue1362_codex_oauth_onboarding.py`. Race window in `_clear_anthropic_env_values` now closed for the chat hot path; remaining detector-style polls in `api/config.py` are UI-only and never bypass real credentials.
- **PR #1778** by @Michaelyklam — Preserve CLI session tool metadata (closes #1772). The server's CLI session loader was reading only `role`, `content`, `timestamp` from `state.db.messages`, missing tool_calls/tool_results columns. `api/models.py +54` extends the loader to read those columns plus `reasoning_details`, `codex_reasoning_items`, `codex_message_items`, `reasoning_content`, `reasoning` and rehydrate them onto the message dicts. `PRAGMA table_info(messages)` check ensures legacy state.db schemas without the columns don't error. `_is_cli_tool_metadata_enrichment()` correctly rebuilds sidecars when message count is identical but new metadata is present, and uses `save(touch_updated_at=False)` to avoid bumping updated_at on passive enrichment. `api/routes.py +66`, 152 LOC test coverage in `tests/test_cli_session_tool_metadata.py` plus captured API evidence at `docs/pr-media/1772/cli-tool-metadata-api-evidence.json`.
- **PR #1779** by @Michaelyklam — Reset model picker on session switch (closes #1771). Bug: switching sessions silently kept the previous chat's model selected in the composer (could route an inexpensive chat to an expensive model unnoticed — high-impact for users on premium-credit OAuth providers). Fix in `static/ui.js +88/-29`: when session model metadata is missing, `unknown`, or stale, fall back to configured default model/provider, with first-available dropdown option only as last resort. **Auto-fix applied at stage**: Opus stage-310 caught a regression in the new `!hasSessionModel` branch — it dropped the `deferModelCorrection` guard that the parallel else-branch keeps. Without the guard, every fast-path session view of an empty/unknown-model session fired a spurious `/api/session/update` POST that raced `_resolveSessionModelForDisplaySoon` and silently wrote to imported/read-only CLI sessions whose model field reads `"unknown"` (#1778 introduces exactly that surface in this same release). Wrapped the new branch's `_persistSessionModelCorrection` call + state mutation in `if(!deferModelCorrection)` mirroring the else-branch. Added `test_sync_topbar_does_not_persist_correction_while_model_resolution_deferred` regression test that exercises the fast-path interaction with `_modelResolutionDeferred=true` for both empty and `"unknown"` model values; asserts the visible `sel.value` still updates for UX but no POST is issued and no state mutation occurs. 192 LOC of original regression coverage in `tests/test_issue1771_session_model_switch_sync.py` (now 215 LOC with the new test), 7 LOC tweak to `test_provider_mismatch.py` and 1 LOC to `test_session_metadata_fast_path.py` to align existing tests with the new fallback helper.

### Tests

4694 → **4702 collected** (+8 across 2 new test files plus 1 stage auto-fix regression test). 4695 passed, 4 skipped (2 dev-only spawn from v0.51.15 + 2 prong-2 noise), 3 xpassed, 0 failed in 141.29s.

### Pre-release verification

- All 3 PRs CI-green individually.
- File overlap on `api/routes.py` (#1768 + #1778) auto-merged cleanly (different functions: oauth env-lock helpers vs CLI session loader extension).
- `node -c` clean on `static/ui.js`; Python compile clean on all 6 changed .py files.
- pytest: 4695 passed, 0 failed.
- `scripts/run-browser-tests.sh`: all 11 endpoints PASS on isolated port 8789.
- Pre-stamp re-fetch: all 3 PR heads still match local rebases.
- Opus advisor: SHIP #1768 + #1778, #1779 SHOULD-FIX before merge — auto-fix applied at stage with regression test, re-verified clean.

Closes #1736, #1771, #1772.

## [v0.51.15] — 2026-05-07 — 4-PR contributor batch + 1 self-built (cron spawn migration, context menu, codex quota, model prefix)

### Fixed
Expand Down
2 changes: 1 addition & 1 deletion ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

> Web companion to the Hermes Agent CLI. Same workflows, browser-native.
>
> Last updated: v0.51.15 (May 7, 2026) — 4694 tests collected — 4-PR contributor batch + 1 self-built (#1762, #1767, #1769, #1770)
> Last updated: v0.51.16 (May 7, 2026) — 4702 tests collected — 3-PR contributor batch (#1768, #1778, #1779)
> Test source: `pytest tests/ --collect-only -q`
> Per-version detail: see [CHANGELOG.md](./CHANGELOG.md)

Expand Down
4 changes: 2 additions & 2 deletions TESTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -1835,8 +1835,8 @@ Bridged CLI sessions:

---

*Last updated: v0.51.15, May 7, 2026*
*Total automated tests collected: 4694*
*Last updated: v0.51.16, May 7, 2026*
*Total automated tests collected: 4702*
*Regression gate: tests/test_regressions.py*
*Run: pytest tests/ -v --timeout=60*
*Source: <repo>/*
54 changes: 48 additions & 6 deletions api/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -1618,10 +1618,24 @@ def _cron_pid():
return cli_sessions


def _json_loads_if_string(value):
if not isinstance(value, str):
return value
text = value.strip()
if not text:
return None
try:
return json.loads(text)
except Exception:
return value


def get_cli_session_messages(sid) -> list:
"""Read messages for a single CLI/external-agent session.
Returns a list of {role, content, timestamp} dicts.
Returns empty list on any error.

Preserve tool-call/result and reasoning metadata from the agent state.db so
CLI-origin transcripts render with the same tool cards as WebUI-native
sessions. Returns empty list on any error.
"""
import os
if str(sid or '').startswith(f'{CLAUDE_CODE_SOURCE}_'):
Expand All @@ -1644,19 +1658,47 @@ def get_cli_session_messages(sid) -> list:
with closing(sqlite3.connect(str(db_path))) as conn:
conn.row_factory = sqlite3.Row
cur = conn.cursor()
cur.execute("""
SELECT role, content, timestamp
cur.execute("PRAGMA table_info(messages)")
available = {str(row['name']) for row in cur.fetchall()}
required = {'role', 'content', 'timestamp'}
if not required.issubset(available):
return []
optional = [
'tool_call_id',
'tool_calls',
'tool_name',
'reasoning',
'reasoning_details',
'codex_reasoning_items',
'reasoning_content',
'codex_message_items',
]
selected = ['role', 'content', 'timestamp'] + [c for c in optional if c in available]
cur.execute(f"""
SELECT {', '.join(selected)}
FROM messages
WHERE session_id = ?
ORDER BY timestamp ASC
""", (sid,))
msgs = []
for row in cur.fetchall():
msgs.append({
msg = {
'role': row['role'],
'content': row['content'],
'timestamp': row['timestamp'],
})
}
for col in optional:
if col not in row.keys():
continue
value = row[col]
if value in (None, ''):
continue
if col in {'tool_calls', 'reasoning_details', 'codex_reasoning_items', 'codex_message_items'}:
value = _json_loads_if_string(value)
msg[col] = value
if msg.get('role') == 'tool' and msg.get('tool_name') and not msg.get('name'):
msg['name'] = msg['tool_name']
msgs.append(msg)
except Exception:
return []
return msgs
Expand Down
36 changes: 32 additions & 4 deletions api/oauth.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,30 @@

_OAUTH_FLOWS: dict[str, dict[str, Any]] = {}
_OAUTH_FLOWS_LOCK = threading.Lock()
_ANTHROPIC_ENV_KEYS = ("ANTHROPIC_TOKEN", "ANTHROPIC_API_KEY")


def _clear_process_anthropic_env_values() -> None:
"""Clear Anthropic process env fallbacks under the streaming env lock."""
from api.streaming import _ENV_LOCK

with _ENV_LOCK:
for key in _ANTHROPIC_ENV_KEYS:
os.environ.pop(key, None)


def resolve_runtime_provider_with_anthropic_env_lock(resolver, *args, **kwargs):
"""Resolve runtime credentials under the Anthropic onboarding env lock.

Request paths must resolve Anthropic env fallbacks per outbound request,
not cache ANTHROPIC_TOKEN or ANTHROPIC_API_KEY across onboarding. Sharing
the process-env lock prevents a chat stream from observing one stale
Anthropic env value while onboarding has already cleared the other.
"""
from api.streaming import _ENV_LOCK

with _ENV_LOCK:
return resolver(*args, **kwargs)


def _normalize_onboarding_oauth_provider(provider: str) -> str:
Expand Down Expand Up @@ -234,18 +258,22 @@ def _read_claude_code_credentials() -> dict[str, Any] | None:


def _clear_anthropic_env_values(hermes_home: Path) -> None:
"""Clear Anthropic API/setup-token env values in the active profile only."""
"""Clear Anthropic API/setup-token env values in the active profile only.

The .env write path already clears os.environ while holding the streaming
env lock. Keep a locked process-env clear here too so import/write failures
cannot leave or partially clear stale Anthropic fallbacks.
"""
try:
from api.providers import _write_env_file

_write_env_file(
Path(hermes_home) / ".env",
{"ANTHROPIC_TOKEN": None, "ANTHROPIC_API_KEY": None},
{key: None for key in _ANTHROPIC_ENV_KEYS},
)
except Exception as exc:
logger.warning("Failed to clear Anthropic env values: %s", exc)
os.environ.pop("ANTHROPIC_TOKEN", None)
os.environ.pop("ANTHROPIC_API_KEY", None)
_clear_process_anthropic_env_values()


def _link_anthropic_credentials(hermes_home: Path) -> None:
Expand Down
66 changes: 63 additions & 3 deletions api/routes.py
Original file line number Diff line number Diff line change
Expand Up @@ -6261,9 +6261,13 @@ def _handle_chat_sync(handler, body):
# Resolve API key via Hermes runtime provider (matches gateway behaviour)
_api_key = None
try:
from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
from hermes_cli.runtime_provider import resolve_runtime_provider

_rt = resolve_runtime_provider(requested=_provider)
_rt = resolve_runtime_provider_with_anthropic_env_lock(
resolve_runtime_provider,
requested=_provider,
)
_api_key = _rt.get("api_key")
# Also use runtime provider/base_url if the webui config didn't resolve them
if not _provider:
Expand Down Expand Up @@ -7015,6 +7019,7 @@ def _summarize_manual_compression(
)

import api.config as _cfg
from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
import hermes_cli.runtime_provider as _runtime_provider
import run_agent as _run_agent

Expand All @@ -7024,7 +7029,10 @@ def _summarize_manual_compression(

resolved_api_key = None
try:
_rt = _runtime_provider.resolve_runtime_provider(requested=resolved_provider)
_rt = resolve_runtime_provider_with_anthropic_env_lock(
_runtime_provider.resolve_runtime_provider,
requested=resolved_provider,
)
resolved_api_key = _rt.get("api_key")
if not resolved_provider:
resolved_provider = _rt.get("provider")
Expand Down Expand Up @@ -7616,6 +7624,7 @@ def _agent_text_completion(agent, system_prompt, user_text, max_tokens=700):
# Call LLM for summary.
try:
import api.config as _cfg
from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
import hermes_cli.runtime_provider as _runtime_provider
import run_agent as _run_agent

Expand All @@ -7634,7 +7643,10 @@ def _agent_text_completion(agent, system_prompt, user_text, max_tokens=700):

resolved_api_key = None
try:
_rt = _runtime_provider.resolve_runtime_provider(requested=resolved_provider)
_rt = resolve_runtime_provider_with_anthropic_env_lock(
_runtime_provider.resolve_runtime_provider,
requested=resolved_provider,
)
resolved_api_key = _rt.get("api_key")
if not resolved_provider:
resolved_provider = _rt.get("provider")
Expand Down Expand Up @@ -7849,6 +7861,49 @@ def _normalize_message_for_import_refresh(message: object) -> object:
return normalized


def _message_has_cli_tool_metadata(message: object) -> bool:
if not isinstance(message, dict):
return False
if message.get("role") == "assistant" and message.get("tool_calls"):
return True
if message.get("role") == "tool" and (message.get("tool_call_id") or message.get("tool_name") or message.get("name")):
return True
return False


def _strip_cli_tool_metadata_for_refresh(message: object) -> object:
if not isinstance(message, dict):
return _normalize_message_for_import_refresh(message)
normalized = _normalize_message_for_import_refresh(message)
if not isinstance(normalized, dict):
return normalized
for key in ("tool_calls", "tool_call_id", "tool_name", "name"):
normalized.pop(key, None)
return normalized


def _is_cli_tool_metadata_enrichment(existing_messages: list, fresh_messages: list) -> bool:
"""Return True when fresh messages only add CLI tool metadata.

Older imports from get_cli_session_messages() persisted assistant/tool rows
without tool_calls, tool_call_id, or tool_name. After #1772 the refreshed
transcript can have the same length but richer metadata, so re-imports must
rebuild the stored sidecar even without a new row.
"""
if not isinstance(existing_messages, list) or not isinstance(fresh_messages, list):
return False
if len(existing_messages) != len(fresh_messages):
return False
if any(_message_has_cli_tool_metadata(m) for m in existing_messages):
return False
if not any(_message_has_cli_tool_metadata(m) for m in fresh_messages):
return False
for idx, existing_message in enumerate(existing_messages):
if _strip_cli_tool_metadata_for_refresh(existing_message) != _strip_cli_tool_metadata_for_refresh(fresh_messages[idx]):
return False
return True


def _is_messages_refresh_prefix_match(existing_messages: list, fresh_messages: list) -> bool:
"""Return True when existing_messages is a prefix of fresh_messages by value.

Expand Down Expand Up @@ -7893,6 +7948,11 @@ def _handle_session_import_cli(handler, body):
if _is_messages_refresh_prefix_match(existing.messages, fresh_msgs):
existing.messages = fresh_msgs
changed = True
elif fresh_msgs and _is_cli_tool_metadata_enrichment(existing.messages, fresh_msgs):
# Same row count, richer payload: rebuild sidecars imported before
# CLI tool metadata was preserved (#1772).
existing.messages = fresh_msgs
changed = True
if cli_meta:
updates = {
"is_cli_session": True,
Expand Down
16 changes: 13 additions & 3 deletions api/streaming.py
Original file line number Diff line number Diff line change
Expand Up @@ -1741,7 +1741,10 @@ def _attempt_credential_self_heal(
re-invoke ``run_conversation`` with these).
"""
try:
from api.oauth import read_auth_json
from api.oauth import (
read_auth_json,
resolve_runtime_provider_with_anthropic_env_lock,
)
from api.config import (
SESSION_AGENT_CACHE, SESSION_AGENT_CACHE_LOCK,
invalidate_credential_pool_cache,
Expand All @@ -1762,7 +1765,10 @@ def _attempt_credential_self_heal(
invalidate_credential_pool_cache(provider_id)

# 4. Re-resolve runtime provider with fresh credentials
_new_rt = resolve_runtime_provider(requested=provider_id)
_new_rt = resolve_runtime_provider_with_anthropic_env_lock(
resolve_runtime_provider,
requested=provider_id,
)

logger.info(
'[webui] self-heal: credential refresh succeeded for provider=%s session=%s',
Expand Down Expand Up @@ -2170,8 +2176,12 @@ def on_tool(*cb_args, **cb_kwargs):
# Pass the resolved provider so non-default providers get their own credentials.
resolved_api_key = None
try:
from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
from hermes_cli.runtime_provider import resolve_runtime_provider
_rt = resolve_runtime_provider(requested=resolved_provider)
_rt = resolve_runtime_provider_with_anthropic_env_lock(
resolve_runtime_provider,
requested=resolved_provider,
)
resolved_api_key = _rt.get("api_key")
if not resolved_provider:
resolved_provider = _rt.get("provider")
Expand Down
Binary file added docs/pr-media/1771/session-model-fallback.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
25 changes: 25 additions & 0 deletions docs/pr-media/1772/cli-tool-metadata-api-evidence.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
{
"issue": 1772,
"check": "api.models.get_cli_session_messages preserves CLI tool metadata for WebUI rendering",
"session_id": "cli_issue_1772_demo",
"message_count": 2,
"assistant_tool_calls": [
{
"id": "call_1772_demo",
"type": "function",
"function": {
"name": "terminal",
"arguments": "{\"command\": \"printf ok\"}"
}
}
],
"tool_result": {
"role": "tool",
"tool_call_id": "call_1772_demo",
"tool_name": "terminal",
"name": "terminal",
"content": {
"output": "ok"
}
}
}
Loading
Loading