feat(chat): show real context token usage in the frontend and align /usage output with the CLI format, by GoldenFish123321 · Pull Request #1136 · EKKOLearnAI/hermes-studio

GoldenFish123321 · 2026-05-29T17:36:58Z

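Problem

In bridge mode, both the frontend progress bar and the /usage command estimate tokens with local tiktoken instead of using the real consumption returned by the upstream API.
/usage prints only a single, simple line of output, far short of the CLI's detailed format.

Root cause

The return value of hermes agent at conversation_loop.py:4156-4183 already contains the full token data: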

result = {
    "input_tokens": ..., "output_tokens": ...,
    "cache_read_tokens": ..., "cache_write_tokens": ...,
    "reasoning_tokens": ..., "total_tokens": ...,
    "api_calls": ..., "model": ...,
    "estimated_cost_usd": ..., "cost_status": ...,
}

The bridge passes this through unchanged in chunk.result, but the TypeScript side previously re-estimated everything with the local countTokens().

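Result

What changed

Extract real API data: a new extractBridgeUsage() pulls the upstream token consumption out of chunk.result and writes it into BridgeUsageState. The data is sent to the frontend with the run.completed / run.failed events and is also persisted to the sessions table, so it can be restored after a page refresh.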
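The extraction step might look roughly like the sketch below. It assumes the snake_case keys shown in the result dictionary earlier; the field list of BridgeUsageState and the helper names num and str are hypothetical, and the real definitions in types.ts and handle-bridge-run.ts may differ.

```typescript
// Hypothetical shape; the real BridgeUsageState in types.ts may differ.
interface BridgeUsageState {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
  reasoningTokens: number;
  totalTokens: number;
  apiCalls: number;
  model: string | null;
  estimatedCostUsd: number | null;
  costStatus: string | null;
}

// Accept only finite numbers; anything else counts as "not reported".
function num(value: unknown): number | null {
  return typeof value === "number" && Number.isFinite(value) ? value : null;
}

function str(value: unknown): string | null {
  return typeof value === "string" ? value : null;
}

// Returns null when the payload carries no token counts at all, so the
// caller can fall back to the local estimate.
function extractBridgeUsage(result: unknown): BridgeUsageState | null {
  if (typeof result !== "object" || result === null) return null;
  const r = result as Record<string, unknown>;
  const input = num(r["input_tokens"]);
  const output = num(r["output_tokens"]);
  if (input === null && output === null) return null;
  return {
    inputTokens: input ?? 0,
    outputTokens: output ?? 0,
    cacheReadTokens: num(r["cache_read_tokens"]) ?? 0,
    cacheWriteTokens: num(r["cache_write_tokens"]) ?? 0,
    reasoningTokens: num(r["reasoning_tokens"]) ?? 0,
    totalTokens: num(r["total_tokens"]) ?? 0,
    apiCalls: num(r["api_calls"]) ?? 0,
    model: str(r["model"]),
    // A missing cost stays null rather than 0, so "unknown" and "free" stay distinct.
    estimatedCostUsd: num(r["estimated_cost_usd"]),
    costStatus: str(r["cost_status"]),
  };
}
```

Returning null for an empty payload lets the caller distinguish "no upstream data" from "upstream reported zero", which matters when the whole prompt is served from cache.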

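Fix the progress bar: the session_prompt_tokens value returned by hermes agent re-adds the shared context on every round of a multi-round tool call, so the progress bar showed N times the actual occupancy. It now uses last_prompt_tokens (the real prompt consumption of a single call), so the progress bar shows the accurate context occupancy.

Rewrite /usage: the output format fully matches the TUI CLI, including the complete field list, cost estimate, context usage percentage, message count, and compression count. Data is read from the live bridge first, then restored from the DB, and finally falls back to the local tiktoken estimate.

Prevent data contamination: the session_* values obtained through the DB fallback are the previous turn's cumulative totals (possibly millions of tokens) and must not be written straight into the progress bar. A lastPromptTokens != null guard strictly separates the two kinds of data: data with lastPromptTokens is live and safe to use; data without it is DB fallback data, used only for the static /usage display and never allowed to overwrite the live progress bar. The guard covers every injection path: run.completed, the /usage handler, resume on session switch, and resume on WebSocket reconnect.

Cleaned up several edge cases: bridgeUsage is cleared at the start of every run so stale data from the previous run cannot leak into a failed run; updateSession() skips undefined parameters to avoid a SQLite binding error; loadSessions() preserves the in-memory apiUsage on refresh; run.failed also extracts the real API data; duration shows N/A instead of a bogus Date.now()-based value when started_at is missing.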

Files changed

File	Change
`types.ts`	Adds the `BridgeUsageState` interface
`handle-bridge-run.ts`	`extractBridgeUsage()` extracts tokens, `updateSession()` persists them, stale data is cleared on every run
`session-command.ts`	`/usage` rewritten: three-tier live → DB → tiktoken fallback, format aligned with the CLI
`index.ts`	resume payload carries `bridgeUsage`
`schemas.ts`	sessions table gains a `last_prompt_tokens` column (auto-migrated)
`session-store.ts`	`updateSession()` skips `undefined` parameters
`chat.ts` (client)	`apiUsage` field, extracted `applyApiUsage()`, unified restore on the resume/reconnect paths
`chat.ts` (api)	`ResumeSessionPayload` gains a `bridgeUsage` field
`ChatInput.vue`	progress bar reads `apiUsage.lastPromptTokens` first

Compatibility

When no bridge data is available, it automatically falls back to the existing tiktoken estimate.
Progress bar fallback chain: lastPromptTokens → inputTokens+cache → contextTokens → inputTokens+outputTokens
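The fallback chain above can be sketched as a single function. This is an illustration only: the ApiUsage and SessionTokens shapes and the name progressBarTokens are assumptions, not the actual computed property in ChatInput.vue.

```typescript
// Hypothetical shapes for illustration.
interface ApiUsage {
  inputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
  lastPromptTokens: number | null;
}

interface SessionTokens {
  apiUsage: ApiUsage | null;
  contextTokens: number | null; // local estimate of the context size
  inputTokens: number;          // local estimate
  outputTokens: number;         // local estimate
}

function progressBarTokens(s: SessionTokens): number {
  const api = s.apiUsage;
  if (api !== null) {
    // 1. Prompt size of the most recent single API call.
    if (api.lastPromptTokens != null && api.lastPromptTokens > 0) {
      return api.lastPromptTokens;
    }
    // 2. Input plus cache reads/writes; still correct when input is 0
    //    because the whole prompt was served from cache.
    const promptSum = api.inputTokens + api.cacheReadTokens + api.cacheWriteTokens;
    if (promptSum > 0) return promptSum;
  }
  // 3. Locally estimated context size.
  if (s.contextTokens != null && s.contextTokens > 0) return s.contextTokens;
  // 4. Last resort: local input + output estimate.
  return s.inputTokens + s.outputTokens;
}
```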

… command
- Extract real API token consumption from hermes agent's run_conversation() result (conversation_loop.py:4156-4183) instead of local tiktoken estimation
- Add BridgeUsageState interface for typed usage data
- Enhanced /usage command output to match hermes CLI format: model, input/output/cache/reasoning tokens, cost info, context window, messages, compressions
- Frontend now displays actual API input/output tokens when bridge mode apiUsage data is available
- Fallback to local tiktoken estimate when bridge data is not yet available (old sessions, error cases)

When upstream API usage data is available (bridge mode), use apiUsage.inputTokens as contextTokens for the progress bar display instead of the local tiktoken estimate. API input_tokens already includes system prompt + tools + messages — the actual context window consumption. Also guard all mid-run contextTokens updates (usage.updated, compression.completed) against overwriting apiUsage-based values.

Instead of patching contextTokens in 6 event handlers with fragile guards, simply make the progress bar (ChatInput.vue totalTokens) check apiUsage.inputTokens first. When upstream API data is available, it's the authoritative context window consumption — no syncing needed.

Previously the 'Current context' line showed the local tiktoken estimate (e.g. 15,073), which didn't match the API input_tokens displayed above (e.g. 16,928). Now it prefers bu.inputTokens.

- After run.completed, update session store with bridge API token values (input/output/cache/reasoning/cost) so they survive page reloads.
- When /usage falls back to DB because bridgeUsage is null (e.g. after reload), reconstruct BridgeUsageState from the persisted session row. Missing fields (prompt_tokens, completion_tokens, api_calls, cost_source) are omitted from output rather than shown as zero.

- Add BridgeUsageState type annotation on dbBu
- Add Prompt/Completion/API calls/Cost source lines (N/A) to keep output format consistent with live bridge path

Extract resolveBridgeUsageFromDb() and formatCost() helpers. Resolve bu via a single ?? chain: live state → DB fallback. Cut 55 lines of duplicated message construction.
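A `??` chain of this kind might look like the sketch below, extended with the local-estimate tier described in the PR summary. The UsageView type and the resolveUsage name are hypothetical. Passing the DB and estimate steps as functions keeps them lazy, so the DB is only read when no live state exists.

```typescript
// Hypothetical view type for illustration.
interface UsageView {
  source: "live" | "db" | "estimate";
  inputTokens: number;
  outputTokens: number;
}

// Resolve usage in priority order: live bridge state, then the persisted
// session row, then a local estimate. The right-hand side of `??` is only
// evaluated when the left-hand side is null or undefined.
function resolveUsage(
  live: UsageView | null,
  loadFromDb: () => UsageView | null,
  estimateLocally: () => UsageView,
): UsageView {
  return live ?? loadFromDb() ?? estimateLocally();
}
```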

…inding error When estimatedCostUsd or actualCostUsd is undefined (bridge result lacks cost data), updateSession silently pushed undefined into the SQLite parameter array causing 'Provided value cannot be bound'.
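One way to avoid this class of error is to drop undefined values before building the statement. The sketch below is an assumption about the general approach, not the actual updateSession() code; buildUpdate and SqlValue are hypothetical names, and no particular SQLite driver is assumed.

```typescript
type SqlValue = string | number | null;

// Build an UPDATE statement from a partial field map, skipping undefined
// values so they never reach the parameter array. Column names must come
// from trusted code, since they are interpolated into the SQL text.
function buildUpdate(
  table: string,
  id: string,
  fields: Record<string, SqlValue | undefined>,
): { sql: string; params: SqlValue[] } | null {
  const sets: string[] = [];
  const params: SqlValue[] = [];
  for (const [column, value] of Object.entries(fields)) {
    if (value === undefined) continue; // nothing to bind for this column
    sets.push(`${column} = ?`);
    params.push(value);
  }
  if (sets.length === 0) return null; // nothing to update
  params.push(id);
  return { sql: `UPDATE ${table} SET ${sets.join(", ")} WHERE id = ?`, params };
}
```

Note that null is kept and bound as SQL NULL, while undefined means "leave this column alone".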

When all input tokens are cache hits, input_tokens can be 0 but prompt_tokens reflects the real context window consumption including cache reads. Use promptTokens as the primary context display value.

- Fix ||/?? mixing syntax error in session-command.ts
- ChatInput.vue: compute promptTokens from input+cache_read+cache_write instead of checking apiUsage.inputTokens > 0 (fails when all cached)

…ssion
- Drop !row.input_tokens check: input_tokens can legitimately be 0 when the entire prompt is served from cache. cost_status alone is sufficient to detect persisted bridge data.
- Use ?? instead of || for estimatedCostUsd to preserve 0.
- Reuse sessionRow from the ?? chain to avoid duplicate DB read.
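The difference between the two operators is easy to see in isolation. The function names below are made up for the example; only the operator behaviour is the point.

```typescript
// `||` falls back on any falsy value, so a legitimate 0 is lost.
function costWithOr(cost: number | null | undefined): number | string {
  return cost || "n/a";
}

// `??` falls back only on null or undefined, so 0 survives.
function costWithNullish(cost: number | null | undefined): number | string {
  return cost ?? "n/a";
}

// Mixing the two operators without parentheses (a || b ?? c) is a syntax
// error in JavaScript and TypeScript; explicit grouping is required.
function firstUsable(a: number | null, b: number | null, c: number): number {
  return (a || b) ?? c;
}
```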

When a run in the same session fails before any chunk is processed (e.g. bridge connection drops), state.bridgeUsage still holds the previous successful run's data. The run.failed event then includes stale apiUsage, misleading the frontend. Clear bridgeUsage along with the other bridge state in the reset block.

… duration
- Extract applyApiUsage() helper to eliminate duplicated token extraction across 4 run.* handlers (2 completed, 2 failed)
- Add apiUsage handling to both run.failed handlers so failed runs that do have bridge usage data (e.g. terminal error after chunk processing) surface real API token consumption
- Fix session duration showing bogus 0s when started_at is null; now shows 'N/A' instead of computing from Date.now() fallback
- Document resolveBridgeUsageFromDb hardcoded-zero fields and state.bridgeUsage caching behavior
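The duration fix can be illustrated with a small formatter. This is a sketch under assumptions: timestamps are taken to be epoch milliseconds, and the "1m 5s" output style and the function name are invented for the example rather than taken from the PR.

```typescript
// Format a session duration, returning "N/A" when the start time is
// unknown instead of substituting the current time (which would yield 0s).
function formatSessionDuration(startedAtMs: number | null, nowMs: number): string {
  if (startedAtMs == null) return "N/A";
  const totalSeconds = Math.max(0, Math.floor((nowMs - startedAtMs) / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes > 0 ? `${minutes}m ${seconds}s` : `${seconds}s`;
}
```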

RunEvent is an interface without an index signature, incompatible with Record<string,unknown> under strict TypeScript checks. Use any to match the existing pattern used by all event handlers in chat.ts. Also remove redundant (evt as any) casts now that evt is already any.

loadSessions() recreates all Session objects from the API response, only preserving messages and contextTokens from old objects. This causes apiUsage (upstream API token data) to be lost when switching sessions or refreshing the session list, making the progress bar fall back to local tiktoken estimates. Add apiUsage to the runtime preservation map so it survives session list refreshes.
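A preservation map of this kind might look like the following sketch. The Session and SessionSummary shapes and the mergeSessions name are hypothetical; the real loadSessions() in chat.ts is not shown in this PR page.

```typescript
// Hypothetical shapes for illustration.
interface ApiUsage { inputTokens: number; lastPromptTokens: number | null }

interface Session {
  id: string;
  title: string;
  messages: string[];
  contextTokens: number | null;
  apiUsage: ApiUsage | null;
}

interface SessionSummary { id: string; title: string }

// Rebuild the session list from a fresh API response while carrying over
// runtime-only state (messages, contextTokens, apiUsage) from the old objects.
function mergeSessions(old: Session[], fresh: SessionSummary[]): Session[] {
  const runtime = new Map<string, Session>();
  for (const s of old) runtime.set(s.id, s);
  return fresh.map((f) => {
    const prev = runtime.get(f.id);
    return {
      id: f.id,
      title: f.title,
      messages: prev?.messages ?? [],
      contextTokens: prev?.contextTokens ?? null,
      apiUsage: prev?.apiUsage ?? null, // the field that was previously dropped
    };
  });
}
```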

…on client Server's resumeSession() sent inputTokens/outputTokens/contextTokens (local tiktoken estimates) but omitted bridgeUsage (real API data). When the client switched sessions or refreshed the page, apiUsage was lost and the progress bar fell back to local estimates. Now: - Server includes state.bridgeUsage in the resumed payload - Client rebuilds apiUsage from bridgeUsage on resume, keeping the progress bar accurate without needing /usage to repair it

…counting session_prompt_tokens is a cumulative accumulator across all API calls within a turn. When there are multiple tool calls, it sums the shared context N times (e.g. 17k + 17k = 34k instead of 17k). hermes-agent already exposes last_prompt_tokens (line 4179 in conversation_loop.py), the prompt cost of the single most recent API call, which IS the true context window consumption.
- Add lastPromptTokens to BridgeUsageState and client apiUsage
- Extract r.last_prompt_tokens in extractBridgeUsage
- ChatInput.vue: prefer apiUsage.lastPromptTokens > cumulative sum
- Thread through /usage emit and resume handler
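The 17k + 17k example can be worked through in a few lines. The helper names and the 200,000-token window used in the usage note are assumptions chosen for the arithmetic, not values from hermes-agent.

```typescript
// Prompt tokens reported for each API call within one turn. Each call
// resends the shared context, so the per-call values overlap heavily.
function cumulativePromptTokens(perCall: number[]): number {
  return perCall.reduce((sum, n) => sum + n, 0);
}

// Prompt size of the most recent call only, or null when no call was made.
function lastCallPromptTokens(perCall: number[]): number | null {
  const last = perCall[perCall.length - 1];
  return last === undefined ? null : last;
}

// Percentage of the context window in use, clamped to 0..100.
function contextPercent(tokens: number, windowSize: number): number {
  if (windowSize <= 0) return 0;
  return Math.min(100, Math.max(0, (tokens / windowSize) * 100));
}
```

For two calls of 17,000 prompt tokens each, the cumulative value is 34,000 while the last-call value is 17,000, so a bar driven by the cumulative value reports twice the real occupancy. With an assumed 200,000-token window, 50,000 tokens corresponds to 25%.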

…GoldenFish123321/hermes-web-ui into feature/bridge-real-token-usage

Same issue as the progress bar — session_prompt_tokens is cumulative across all API calls within a turn, double-counting shared context. Prefer lastPromptTokens (single-call) for the context display line.

…lback || treats 0 as falsy, so lastPromptTokens=0 would incorrectly fall through to the cumulative promptTokens. Use an explicit != null && > 0 guard so only undefined/null/0 triggers the fallback chain.

- Remove Prompt tokens (total): redundant, = input + cache
- Remove Completion tokens: redundant, = output_tokens
- Remove Cost status / Cost source: merge into Cost line
- Remove Pricing unknown note: Cost line already shows n/a
- Remove separator line before Current context
- Cache read/write tokens only shown when >0 (like CLI)
- Compressions only shown when >0 (like CLI)
- Session duration only shown when valid
- formatCost: drop '(estimated)' suffix, add 'included' status
- Keep Reasoning tokens, Messages, Session duration as useful extras

CLI shows Prompt tokens (total), Completion tokens, Cost status/source, separator lines, and always-displayed cache rows. Restore all of them. Use column-aligned padding matching CLI's wide-column layout.
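Column-aligned output of this sort can be produced with padEnd and padStart. The sketch below is illustrative only: the column widths, the two-space indent, and the formatRow name are assumptions, and the CLI's actual widths are not given in this PR.

```typescript
// Render one "label   value" row with a left-aligned label column and a
// right-aligned, thousands-separated value column.
function formatRow(label: string, value: number, labelWidth = 28, valueWidth = 10): string {
  const shown = value.toLocaleString("en-US");
  return `  ${label.padEnd(labelWidth)}${shown.padStart(valueWidth)}`;
}

// Join rows into one block of text, one row per line.
function formatRows(rows: Array<[string, number]>): string {
  return rows.map(([label, value]) => formatRow(label, value)).join("\n");
}
```

Because every row has the same total width, the numbers line up in a monospaced chat bubble regardless of label length.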

… data DB fallback carries cumulative/inflated session_* counters from the previous turn (e.g. cache_read_tokens=5.3M). When the /usage handler blindly builds apiUsage from these, the progress bar explodes. Use isLive = (lastPromptTokens != null) as the gate. DB fallback does not set lastPromptTokens, so it never overwrites apiUsage. Live bridge data (which has lastPromptTokens) passes through.
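The gate described here might be sketched as follows. BridgeUsage, ClientSession, and applyIfLive are hypothetical names used only for illustration; the PR's own helper may be structured differently.

```typescript
// Hypothetical shapes for illustration.
interface BridgeUsage {
  inputTokens: number;
  cacheReadTokens: number;
  lastPromptTokens: number | null; // null when the data came from the DB fallback
}

interface ClientSession { apiUsage: BridgeUsage | null }

// Apply usage data to the session only when it is live. DB-fallback data
// (no lastPromptTokens) is left for static display and never touches the
// state that drives the progress bar. Returns true when the data was applied.
function applyIfLive(session: ClientSession, bu: BridgeUsage | null): boolean {
  if (bu == null || bu.lastPromptTokens == null) return false;
  session.apiUsage = { ...bu };
  return true;
}
```

A `!= null` check rather than a truthiness check is used so that a reported value of 0 still counts as live data.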

- Reasoning tokens: only shown if >0, labeled '↳ Reasoning (subset):'
- Add 'Note: Pricing unknown for {model}' at bottom when applicable

- Add last_prompt_tokens column to sessions table schema
- Add field to HermesSessionRow interface
- Write lastPromptTokens in updateSession() call
- Read it back in resolveBridgeUsageFromDb() so the progress bar and /usage Current context survive page reload without falling back to cumulative session_* values

…teSession These construct HermesSessionRow objects and were missing the newly added last_prompt_tokens field, causing TS2741 errors.

GoldenFish123321 · 2026-05-30T08:11:41Z

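Fixed a few things

Progress bar fix

The session_prompt_tokens value returned by hermes agent re-adds the shared context on every round of a multi-round tool call, so the progress bar showed N times the actual occupancy. It now takes last_prompt_tokens (the real prompt consumption of the most recent API call), so the progress bar shows the accurate context occupancy. The Current context line of /usage received the same fix.

The DB also gained a last_prompt_tokens column, so the value no longer degrades to the cumulative total after a page refresh.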

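Data source isolation

Live bridge data and DB fallback data must not be mixed. The session_* values obtained through the DB fallback are the previous turn's cumulative totals (possibly millions of tokens); writing them straight into the progress bar blows it up.

Every apiUsage injection point (run.completed, the /usage handler, resume on session switch, resume on WebSocket reconnect) uses lastPromptTokens != null as the gate: if it is present, the data is live and safe to apply; if not, apiUsage is left untouched and the progress bar follows its existing fallback.

DB changes

The sessions table gained a last_prompt_tokens column (auto-migrated by syncTable). updateSession() also had a binding error fixed: the cost fields returned by the bridge can be undefined, and passing them straight to a SQLite prepared statement throws, so the loop now skips any undefined value.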

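Restore paths

All restore paths now handle apiUsage correctly:

Session switch: the resume callback of switchSession rebuilds it from bridgeUsage
WS reconnect: applyReconnectResume does the same (this path was previously missed)
Tab restore: only refreshes messages and does not touch tokens (this was never a problem)
loadSessions refresh: keeps the old session's apiUsage instead of dropping it

The ResumeSessionPayload interface also gained a bridgeUsage field; it was previously read through (data as any).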

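`/usage` format alignment

The output format fully matches the TUI CLI (cli.py:10206-10234), including the complete field list, cost estimate, and context usage percentage. In detail, the Reasoning (subset) row, the cache rows, and the compression row are shown only when they have a value, and a note is appended when pricing is unknown.

Edge-case fixes

bridgeUsage is cleared at the start of every run so stale data from the previous run cannot leak into a failed run
run.failed now also extracts the real API data (previously it used only the local estimate)
loadSessions() preserves apiUsage on refresh (previously the progress bar jumped after a refresh)
duration shows N/A instead of a bogus Date.now()-based value when started_at is missing
/usage multi-line output uses \x0a instead of \n to avoid JSON double-escaping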

GoldenFish123321 · 2026-05-30T08:35:01Z

GoldenFish123321 wants to merge 32 commits into EKKOLearnAI:main from GoldenFish123321:feature/bridge-real-token-usage
