nesquena · nesquena-hermes · May 7, 2026 · May 7, 2026 · May 7, 2026 · May 7, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,29 @@
 # Hermes Web UI -- Changelog
 
+## [v0.51.16] — 2026-05-07 — 3-PR contributor batch (anthropic env race close, CLI tool metadata, model picker reset)
+
+### Fixed
+
+- **PR #1768** by @franksong2702 — Serialize Anthropic env fallback reads (closes #1736, the architectural follow-up filed in v0.51.8 sweep). Wraps `_clear_anthropic_env_values()` and the runtime-provider resolver behind `_ENV_LOCK` (the same `threading.Lock` already serializing env save/restore in `streaming.py`). New helper `resolve_runtime_provider_with_anthropic_env_lock()` in `api/oauth.py` is called from 3 sites in `api/routes.py` and 2 in `api/streaming.py`. Opus stage-310 verified: same-lock not a new lock (no ordering risk), nested acquires are sequential not nested (no deadlock), the lock is released before the agent runs (chat throughput unaffected). `api/oauth.py +36`, `api/routes.py +18`, `api/streaming.py +16`, +52 LOC test coverage in `tests/test_issue1362_codex_oauth_onboarding.py`. Race window in `_clear_anthropic_env_values` now closed for the chat hot path; remaining detector-style polls in `api/config.py` are UI-only and never bypass real credentials.
+- **PR #1778** by @Michaelyklam — Preserve CLI session tool metadata (closes #1772). The server's CLI session loader was reading only `role`, `content`, `timestamp` from `state.db.messages`, missing tool_calls/tool_results columns. `api/models.py +54` extends the loader to read those columns plus `reasoning_details`, `codex_reasoning_items`, `codex_message_items`, `reasoning_content`, `reasoning` and rehydrate them onto the message dicts. `PRAGMA table_info(messages)` check ensures legacy state.db schemas without the columns don't error. `_is_cli_tool_metadata_enrichment()` correctly rebuilds sidecars when message count is identical but new metadata is present, and uses `save(touch_updated_at=False)` to avoid bumping updated_at on passive enrichment. `api/routes.py +66`, 152 LOC test coverage in `tests/test_cli_session_tool_metadata.py` plus captured API evidence at `docs/pr-media/1772/cli-tool-metadata-api-evidence.json`.
+- **PR #1779** by @Michaelyklam — Reset model picker on session switch (closes #1771). Bug: switching sessions silently kept the previous chat's model selected in the composer (could route an inexpensive chat to an expensive model unnoticed — high-impact for users on premium-credit OAuth providers). Fix in `static/ui.js +88/-29`: when session model metadata is missing, `unknown`, or stale, fall back to configured default model/provider, with first-available dropdown option only as last resort. **Auto-fix applied at stage**: Opus stage-310 caught a regression in the new `!hasSessionModel` branch — it dropped the `deferModelCorrection` guard that the parallel else-branch keeps. Without the guard, every fast-path session view of an empty/unknown-model session fired a spurious `/api/session/update` POST that raced `_resolveSessionModelForDisplaySoon` and silently wrote to imported/read-only CLI sessions whose model field reads `"unknown"` (#1778 introduces exactly that surface in this same release). Wrapped the new branch's `_persistSessionModelCorrection` call + state mutation in `if(!deferModelCorrection)` mirroring the else-branch. Added `test_sync_topbar_does_not_persist_correction_while_model_resolution_deferred` regression test that exercises the fast-path interaction with `_modelResolutionDeferred=true` for both empty and `"unknown"` model values; asserts the visible `sel.value` still updates for UX but no POST is issued and no state mutation occurs. 192 LOC of original regression coverage in `tests/test_issue1771_session_model_switch_sync.py` (now 215 LOC with the new test), 7 LOC tweak to `test_provider_mismatch.py` and 1 LOC to `test_session_metadata_fast_path.py` to align existing tests with the new fallback helper.
+
+### Tests
+
+4694 → **4702 collected** (+8 across 2 new test files plus 1 stage auto-fix regression test). 4695 passed, 4 skipped (2 dev-only spawn from v0.51.15 + 2 prong-2 noise), 3 xpassed, 0 failed in 141.29s.
+
+### Pre-release verification
+
+- All 3 PRs CI-green individually.
+- File overlap on `api/routes.py` (#1768 + #1778) auto-merged cleanly (different functions: oauth env-lock helpers vs CLI session loader extension).
+- `node -c` clean on `static/ui.js`; Python compile clean on all 6 changed .py files.
+- pytest: 4695 passed, 0 failed.
+- `scripts/run-browser-tests.sh`: all 11 endpoints PASS on isolated port 8789.
+- Pre-stamp re-fetch: all 3 PR heads still match local rebases.
+- Opus advisor: SHIP #1768 + #1778, #1779 SHOULD-FIX before merge — auto-fix applied at stage with regression test, re-verified clean.
+
+Closes #1736, #1771, #1772.
+
 ## [v0.51.15] — 2026-05-07 — 4-PR contributor batch + 1 self-built (cron spawn migration, context menu, codex quota, model prefix)
 
 ### Fixed

diff --git a/ROADMAP.md b/ROADMAP.md
@@ -2,7 +2,7 @@
 
 > Web companion to the Hermes Agent CLI. Same workflows, browser-native.
 >
-> Last updated: v0.51.15 (May 7, 2026) — 4694 tests collected — 4-PR contributor batch + 1 self-built (#1762, #1767, #1769, #1770)
+> Last updated: v0.51.16 (May 7, 2026) — 4702 tests collected — 3-PR contributor batch (#1768, #1778, #1779)
 > Test source: `pytest tests/ --collect-only -q`
 > Per-version detail: see [CHANGELOG.md](./CHANGELOG.md)
 

diff --git a/TESTING.md b/TESTING.md
@@ -1835,8 +1835,8 @@ Bridged CLI sessions:
 
 ---
 
-*Last updated: v0.51.15, May 7, 2026*
-*Total automated tests collected: 4694*
+*Last updated: v0.51.16, May 7, 2026*
+*Total automated tests collected: 4702*
 *Regression gate: tests/test_regressions.py*
 *Run: pytest tests/ -v --timeout=60*
 *Source: <repo>/*
diff --git a/api/models.py b/api/models.py
@@ -1618,10 +1618,24 @@ def _cron_pid():
     return cli_sessions
 
 
+def _json_loads_if_string(value):
+    if not isinstance(value, str):
+        return value
+    text = value.strip()
+    if not text:
+        return None
+    try:
+        return json.loads(text)
+    except Exception:
+        return value
+
+
 def get_cli_session_messages(sid) -> list:
     """Read messages for a single CLI/external-agent session.
-    Returns a list of {role, content, timestamp} dicts.
-    Returns empty list on any error.
+
+    Preserve tool-call/result and reasoning metadata from the agent state.db so
+    CLI-origin transcripts render with the same tool cards as WebUI-native
+    sessions. Returns empty list on any error.
     """
     import os
     if str(sid or '').startswith(f'{CLAUDE_CODE_SOURCE}_'):
@@ -1644,19 +1658,47 @@ def get_cli_session_messages(sid) -> list:
         with closing(sqlite3.connect(str(db_path))) as conn:
             conn.row_factory = sqlite3.Row
             cur = conn.cursor()
-            cur.execute("""
-                SELECT role, content, timestamp
+            cur.execute("PRAGMA table_info(messages)")
+            available = {str(row['name']) for row in cur.fetchall()}
+            required = {'role', 'content', 'timestamp'}
+            if not required.issubset(available):
+                return []
+            optional = [
+                'tool_call_id',
+                'tool_calls',
+                'tool_name',
+                'reasoning',
+                'reasoning_details',
+                'codex_reasoning_items',
+                'reasoning_content',
+                'codex_message_items',
+            ]
+            selected = ['role', 'content', 'timestamp'] + [c for c in optional if c in available]
+            cur.execute(f"""
+                SELECT {', '.join(selected)}
                 FROM messages
                 WHERE session_id = ?
                 ORDER BY timestamp ASC
             """, (sid,))
             msgs = []
             for row in cur.fetchall():
-                msgs.append({
+                msg = {
                     'role': row['role'],
                     'content': row['content'],
                     'timestamp': row['timestamp'],
-                })
+                }
+                for col in optional:
+                    if col not in row.keys():
+                        continue
+                    value = row[col]
+                    if value in (None, ''):
+                        continue
+                    if col in {'tool_calls', 'reasoning_details', 'codex_reasoning_items', 'codex_message_items'}:
+                        value = _json_loads_if_string(value)
+                    msg[col] = value
+                if msg.get('role') == 'tool' and msg.get('tool_name') and not msg.get('name'):
+                    msg['name'] = msg['tool_name']
+                msgs.append(msg)
     except Exception:
         return []
     return msgs

diff --git a/api/oauth.py b/api/oauth.py
@@ -56,6 +56,30 @@
 
 _OAUTH_FLOWS: dict[str, dict[str, Any]] = {}
 _OAUTH_FLOWS_LOCK = threading.Lock()
+_ANTHROPIC_ENV_KEYS = ("ANTHROPIC_TOKEN", "ANTHROPIC_API_KEY")
+
+
+def _clear_process_anthropic_env_values() -> None:
+    """Clear Anthropic process env fallbacks under the streaming env lock."""
+    from api.streaming import _ENV_LOCK
+
+    with _ENV_LOCK:
+        for key in _ANTHROPIC_ENV_KEYS:
+            os.environ.pop(key, None)
+
+
+def resolve_runtime_provider_with_anthropic_env_lock(resolver, *args, **kwargs):
+    """Resolve runtime credentials under the Anthropic onboarding env lock.
+
+    Request paths must resolve Anthropic env fallbacks per outbound request,
+    not cache ANTHROPIC_TOKEN or ANTHROPIC_API_KEY across onboarding. Sharing
+    the process-env lock prevents a chat stream from observing one stale
+    Anthropic env value while onboarding has already cleared the other.
+    """
+    from api.streaming import _ENV_LOCK
+
+    with _ENV_LOCK:
+        return resolver(*args, **kwargs)
 
 
 def _normalize_onboarding_oauth_provider(provider: str) -> str:
@@ -234,18 +258,22 @@ def _read_claude_code_credentials() -> dict[str, Any] | None:
 
 
 def _clear_anthropic_env_values(hermes_home: Path) -> None:
-    """Clear Anthropic API/setup-token env values in the active profile only."""
+    """Clear Anthropic API/setup-token env values in the active profile only.
+
+    The .env write path already clears os.environ while holding the streaming
+    env lock. Keep a locked process-env clear here too so import/write failures
+    cannot leave or partially clear stale Anthropic fallbacks.
+    """
     try:
         from api.providers import _write_env_file
 
         _write_env_file(
             Path(hermes_home) / ".env",
-            {"ANTHROPIC_TOKEN": None, "ANTHROPIC_API_KEY": None},
+            {key: None for key in _ANTHROPIC_ENV_KEYS},
         )
     except Exception as exc:
         logger.warning("Failed to clear Anthropic env values: %s", exc)
-    os.environ.pop("ANTHROPIC_TOKEN", None)
-    os.environ.pop("ANTHROPIC_API_KEY", None)
+    _clear_process_anthropic_env_values()
 
 
 def _link_anthropic_credentials(hermes_home: Path) -> None:

diff --git a/api/routes.py b/api/routes.py
@@ -6261,9 +6261,13 @@ def _handle_chat_sync(handler, body):
             # Resolve API key via Hermes runtime provider (matches gateway behaviour)
             _api_key = None
             try:
+                from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
                 from hermes_cli.runtime_provider import resolve_runtime_provider
 
-                _rt = resolve_runtime_provider(requested=_provider)
+                _rt = resolve_runtime_provider_with_anthropic_env_lock(
+                    resolve_runtime_provider,
+                    requested=_provider,
+                )
                 _api_key = _rt.get("api_key")
                 # Also use runtime provider/base_url if the webui config didn't resolve them
                 if not _provider:
@@ -7015,6 +7019,7 @@ def _summarize_manual_compression(
                 )
 
         import api.config as _cfg
+        from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
         import hermes_cli.runtime_provider as _runtime_provider
         import run_agent as _run_agent
 
@@ -7024,7 +7029,10 @@ def _summarize_manual_compression(
 
         resolved_api_key = None
         try:
-            _rt = _runtime_provider.resolve_runtime_provider(requested=resolved_provider)
+            _rt = resolve_runtime_provider_with_anthropic_env_lock(
+                _runtime_provider.resolve_runtime_provider,
+                requested=resolved_provider,
+            )
             resolved_api_key = _rt.get("api_key")
             if not resolved_provider:
                 resolved_provider = _rt.get("provider")
@@ -7616,6 +7624,7 @@ def _agent_text_completion(agent, system_prompt, user_text, max_tokens=700):
         # Call LLM for summary.
     try:
         import api.config as _cfg
+        from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
         import hermes_cli.runtime_provider as _runtime_provider
         import run_agent as _run_agent
 
@@ -7634,7 +7643,10 @@ def _agent_text_completion(agent, system_prompt, user_text, max_tokens=700):
 
         resolved_api_key = None
         try:
-            _rt = _runtime_provider.resolve_runtime_provider(requested=resolved_provider)
+            _rt = resolve_runtime_provider_with_anthropic_env_lock(
+                _runtime_provider.resolve_runtime_provider,
+                requested=resolved_provider,
+            )
             resolved_api_key = _rt.get("api_key")
             if not resolved_provider:
                 resolved_provider = _rt.get("provider")
@@ -7849,6 +7861,49 @@ def _normalize_message_for_import_refresh(message: object) -> object:
     return normalized
 
 
+def _message_has_cli_tool_metadata(message: object) -> bool:
+    if not isinstance(message, dict):
+        return False
+    if message.get("role") == "assistant" and message.get("tool_calls"):
+        return True
+    if message.get("role") == "tool" and (message.get("tool_call_id") or message.get("tool_name") or message.get("name")):
+        return True
+    return False
+
+
+def _strip_cli_tool_metadata_for_refresh(message: object) -> object:
+    if not isinstance(message, dict):
+        return _normalize_message_for_import_refresh(message)
+    normalized = _normalize_message_for_import_refresh(message)
+    if not isinstance(normalized, dict):
+        return normalized
+    for key in ("tool_calls", "tool_call_id", "tool_name", "name"):
+        normalized.pop(key, None)
+    return normalized
+
+
+def _is_cli_tool_metadata_enrichment(existing_messages: list, fresh_messages: list) -> bool:
+    """Return True when fresh messages only add CLI tool metadata.
+
+    Older imports from get_cli_session_messages() persisted assistant/tool rows
+    without tool_calls, tool_call_id, or tool_name. After #1772 the refreshed
+    transcript can have the same length but richer metadata, so re-imports must
+    rebuild the stored sidecar even without a new row.
+    """
+    if not isinstance(existing_messages, list) or not isinstance(fresh_messages, list):
+        return False
+    if len(existing_messages) != len(fresh_messages):
+        return False
+    if any(_message_has_cli_tool_metadata(m) for m in existing_messages):
+        return False
+    if not any(_message_has_cli_tool_metadata(m) for m in fresh_messages):
+        return False
+    for idx, existing_message in enumerate(existing_messages):
+        if _strip_cli_tool_metadata_for_refresh(existing_message) != _strip_cli_tool_metadata_for_refresh(fresh_messages[idx]):
+            return False
+    return True
+
+
 def _is_messages_refresh_prefix_match(existing_messages: list, fresh_messages: list) -> bool:
     """Return True when existing_messages is a prefix of fresh_messages by value.
 
@@ -7893,6 +7948,11 @@ def _handle_session_import_cli(handler, body):
             if _is_messages_refresh_prefix_match(existing.messages, fresh_msgs):
                 existing.messages = fresh_msgs
                 changed = True
+        elif fresh_msgs and _is_cli_tool_metadata_enrichment(existing.messages, fresh_msgs):
+            # Same row count, richer payload: rebuild sidecars imported before
+            # CLI tool metadata was preserved (#1772).
+            existing.messages = fresh_msgs
+            changed = True
         if cli_meta:
             updates = {
                 "is_cli_session": True,

diff --git a/api/streaming.py b/api/streaming.py
@@ -1741,7 +1741,10 @@ def _attempt_credential_self_heal(
        re-invoke ``run_conversation`` with these).
     """
     try:
-        from api.oauth import read_auth_json
+        from api.oauth import (
+            read_auth_json,
+            resolve_runtime_provider_with_anthropic_env_lock,
+        )
         from api.config import (
             SESSION_AGENT_CACHE, SESSION_AGENT_CACHE_LOCK,
             invalidate_credential_pool_cache,
@@ -1762,7 +1765,10 @@ def _attempt_credential_self_heal(
         invalidate_credential_pool_cache(provider_id)
 
         # 4. Re-resolve runtime provider with fresh credentials
-        _new_rt = resolve_runtime_provider(requested=provider_id)
+        _new_rt = resolve_runtime_provider_with_anthropic_env_lock(
+            resolve_runtime_provider,
+            requested=provider_id,
+        )
 
         logger.info(
             '[webui] self-heal: credential refresh succeeded for provider=%s session=%s',
@@ -2170,8 +2176,12 @@ def on_tool(*cb_args, **cb_kwargs):
             # Pass the resolved provider so non-default providers get their own credentials.
             resolved_api_key = None
             try:
+                from api.oauth import resolve_runtime_provider_with_anthropic_env_lock
                 from hermes_cli.runtime_provider import resolve_runtime_provider
-                _rt = resolve_runtime_provider(requested=resolved_provider)
+                _rt = resolve_runtime_provider_with_anthropic_env_lock(
+                    resolve_runtime_provider,
+                    requested=resolved_provider,
+                )
                 resolved_api_key = _rt.get("api_key")
                 if not resolved_provider:
                     resolved_provider = _rt.get("provider")

diff --git a/docs/pr-media/1771/session-model-fallback.png b/docs/pr-media/1771/session-model-fallback.png
diff --git a/docs/pr-media/1772/cli-tool-metadata-api-evidence.json b/docs/pr-media/1772/cli-tool-metadata-api-evidence.json
@@ -0,0 +1,25 @@
+{
+  "issue": 1772,
+  "check": "api.models.get_cli_session_messages preserves CLI tool metadata for WebUI rendering",
+  "session_id": "cli_issue_1772_demo",
+  "message_count": 2,
+  "assistant_tool_calls": [
+    {
+      "id": "call_1772_demo",
+      "type": "function",
+      "function": {
+        "name": "terminal",
+        "arguments": "{\"command\": \"printf ok\"}"
+      }
+    }
+  ],
+  "tool_result": {
+    "role": "tool",
+    "tool_call_id": "call_1772_demo",
+    "tool_name": "terminal",
+    "name": "terminal",
+    "content": {
+      "output": "ok"
+    }
+  }
+}