feat(chat): show real context token consumption in the frontend and sync /usage command output with the CLI format #1136
… command
- Extract real API token consumption from hermes agent's run_conversation() result (conversation_loop.py:4156-4183) instead of local tiktoken estimation
- Add BridgeUsageState interface for typed usage data
- Enhance /usage command output to match hermes CLI format: model, input/output/cache/reasoning tokens, cost info, context window, messages, compressions
- Frontend now displays actual API input/output tokens when bridge mode apiUsage data is available
- Fall back to local tiktoken estimate when bridge data is not yet available (old sessions, error cases)
When upstream API usage data is available (bridge mode), use apiUsage.inputTokens as contextTokens for the progress bar display instead of the local tiktoken estimate. API input_tokens already includes system prompt + tools + messages — the actual context window consumption. Also guard all mid-run contextTokens updates (usage.updated, compression.completed) against overwriting apiUsage-based values.
Instead of patching contextTokens in 6 event handlers with fragile guards, simply make the progress bar (ChatInput.vue totalTokens) check apiUsage.inputTokens first. When upstream API data is available, it's the authoritative context window consumption — no syncing needed.
Previously the 'Current context' line showed the local tiktoken estimate (e.g. 15,073), which didn't match the API input_tokens displayed above (e.g. 16,928). Now it prefers bu.inputTokens.
- After run.completed, update session store with bridge API token values (input/output/cache/reasoning/cost) so they survive page reloads.
- When /usage falls back to DB because bridgeUsage is null (e.g. after reload), reconstruct BridgeUsageState from the persisted session row. Missing fields (prompt_tokens, completion_tokens, api_calls, cost_source) are omitted from output rather than shown as zero.

- Add BridgeUsageState type annotation on dbBu
- Add Prompt/Completion/API calls/Cost source lines (N/A) to keep output format consistent with live bridge path
Extract resolveBridgeUsageFromDb() and formatCost() helpers. Resolve bu via a single ?? chain: live state → DB fallback. Cut 55 lines of duplicated message construction.
…inding error When estimatedCostUsd or actualCostUsd is undefined (bridge result lacks cost data), updateSession silently pushed undefined into the SQLite parameter array causing 'Provided value cannot be bound'.
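One way to avoid this class of binding error is to build the parameter array only from fields that are actually defined. The following is a minimal sketch under stated assumptions: `buildSessionUpdate`, the `SessionPatch` shape, and the column names are hypothetical and are not the project's actual `updateSession()`.

```typescript
// Hypothetical patch shape; the real session row has more columns.
interface SessionPatch {
  input_tokens?: number | undefined;
  output_tokens?: number | undefined;
  estimated_cost_usd?: number | undefined;
  actual_cost_usd?: number | undefined;
}

interface UpdateStatement {
  sql: string;
  params: Array<number | string>;
}

// Build an UPDATE that only mentions columns whose value is defined,
// so `undefined` never reaches the SQLite parameter array.
// Column names come from the typed patch keys, never from user input.
function buildSessionUpdate(id: string, patch: SessionPatch): UpdateStatement | null {
  const assignments: string[] = [];
  const params: Array<number | string> = [];
  for (const column of Object.keys(patch) as Array<keyof SessionPatch>) {
    const value = patch[column];
    if (value === undefined) continue; // skip missing cost/token fields
    assignments.push(`${column} = ?`);
    params.push(value);
  }
  if (assignments.length === 0) return null; // nothing to write
  params.push(id);
  return {
    sql: `UPDATE sessions SET ${assignments.join(", ")} WHERE id = ?`,
    params,
  };
}
```

Note that a value of `0` is still written; only `undefined` is skipped.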
When all input tokens are cache hits, input_tokens can be 0 but prompt_tokens reflects the real context window consumption including cache reads. Use promptTokens as the primary context display value.
- Fix ||/?? mixing syntax error in session-command.ts
- ChatInput.vue: compute promptTokens from input+cache_read+cache_write instead of checking apiUsage.inputTokens > 0 (fails when all cached)
…ssion
- Drop !row.input_tokens check: input_tokens can legitimately be 0 when the entire prompt is served from cache. cost_status alone is sufficient to detect persisted bridge data.
- Use ?? instead of || for estimatedCostUsd to preserve 0.
- Reuse sessionRow from the ?? chain to avoid duplicate DB read.
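The difference between `??` and `||` for a legitimate zero can be illustrated with a small sketch. The helper names `pickCostUsd` and `pickCostUsdLossy` are hypothetical and exist only for this comparison.

```typescript
type MaybeNumber = number | null | undefined;

// `??` only falls through on null/undefined, so a real value of 0 is preserved.
function pickCostUsd(actual: MaybeNumber, estimated: MaybeNumber): number | null {
  return actual ?? estimated ?? null;
}

// The `||` variant treats 0 as "missing"; shown here only for contrast.
function pickCostUsdLossy(actual: MaybeNumber, estimated: MaybeNumber): number | null {
  return actual || estimated || null;
}
```

With an actual cost of `0` and an estimate of `1.5`, the first helper keeps `0` while the second silently reports `1.5`.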
When a run in the same session fails before any chunk is processed (e.g. bridge connection drops), state.bridgeUsage still holds the previous successful run's data. The run.failed event then includes stale apiUsage, misleading the frontend. Clear bridgeUsage along with the other bridge state in the reset block.
… duration
- Extract applyApiUsage() helper to eliminate duplicated token extraction across 4 run.* handlers (2 completed, 2 failed)
- Add apiUsage handling to both run.failed handlers so failed runs that do have bridge usage data (e.g. terminal error after chunk processing) surface real API token consumption
- Fix session duration showing bogus 0s when started_at is null; now shows 'N/A' instead of computing from Date.now() fallback
- Document resolveBridgeUsageFromDb hardcoded-zero fields and state.bridgeUsage caching behavior
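The duration fix can be sketched as a formatter that refuses to compute anything when the start time is missing. `formatSessionDuration` and its output format are assumptions for illustration, not the code in this PR.

```typescript
// Hypothetical formatter; `startedAtMs` is null when the session row has no started_at.
function formatSessionDuration(startedAtMs: number | null, nowMs: number): string {
  // Without a start time there is nothing meaningful to subtract from,
  // so report N/A instead of substituting the current time.
  if (startedAtMs == null || nowMs < startedAtMs) return "N/A";
  const totalSeconds = Math.floor((nowMs - startedAtMs) / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes > 0 ? `${minutes}m ${seconds}s` : `${seconds}s`;
}
```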
RunEvent is an interface without an index signature, incompatible with Record<string,unknown> under strict TypeScript checks. Use any to match the existing pattern used by all event handlers in chat.ts. Also remove redundant (evt as any) casts now that evt is already any.
loadSessions() recreates all Session objects from the API response, only preserving messages and contextTokens from old objects. This causes apiUsage (upstream API token data) to be lost when switching sessions or refreshing the session list, making the progress bar fall back to local tiktoken estimates. Add apiUsage to the runtime preservation map so it survives session list refreshes.
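A minimal sketch of such a runtime preservation map is shown below. The `Session` and `ApiUsage` shapes and the `mergeSessions` helper are simplified assumptions; the real store carries more fields.

```typescript
// Simplified, assumed shapes for illustration only.
interface ApiUsage {
  inputTokens: number;
  outputTokens: number;
  lastPromptTokens?: number;
}

interface Session {
  id: string;
  title: string;
  messages: string[];
  contextTokens?: number;
  apiUsage?: ApiUsage;
}

// Rebuild the session list from the API response while carrying over
// runtime-only fields (messages, contextTokens, apiUsage) from the old objects.
function mergeSessions(previous: Session[], fresh: Session[]): Session[] {
  const runtime = new Map<string, Session>();
  for (const s of previous) runtime.set(s.id, s);
  return fresh.map((s) => {
    const old = runtime.get(s.id);
    if (!old) return s; // new session: nothing to preserve
    return {
      ...s,
      messages: old.messages,
      contextTokens: old.contextTokens,
      apiUsage: old.apiUsage,
    };
  });
}
```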
…on client
Server's resumeSession() sent inputTokens/outputTokens/contextTokens (local tiktoken estimates) but omitted bridgeUsage (real API data). When the client switched sessions or refreshed the page, apiUsage was lost and the progress bar fell back to local estimates. Now:
- Server includes state.bridgeUsage in the resumed payload
- Client rebuilds apiUsage from bridgeUsage on resume, keeping the progress bar accurate without needing /usage to repair it
…counting
session_prompt_tokens is a cumulative accumulator across all API calls within a turn. When there are multiple tool calls, it sums the shared context N times (e.g. 17k + 17k = 34k instead of 17k). hermes-agent already exposes last_prompt_tokens (line 4179 in conversation_loop.py), the prompt cost of the single most recent API call, which IS the true context window consumption.
- Add lastPromptTokens to BridgeUsageState and client apiUsage
- Extract r.last_prompt_tokens in extractBridgeUsage
- ChatInput.vue: prefer apiUsage.lastPromptTokens > cumulative sum
- Thread through /usage emit and resume handler
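The double-counting arithmetic can be reproduced with a small sketch. `summarizeTurn` and the `ApiCall` shape are hypothetical; they only model how a cumulative counter and a last-call value diverge within one turn.

```typescript
// One entry per upstream API call within a single turn (illustrative shape).
interface ApiCall {
  promptTokens: number;
}

interface TurnSummary {
  sessionPromptTokens: number; // cumulative across calls
  lastPromptTokens: number | null; // prompt size of the most recent call only
}

function summarizeTurn(calls: ApiCall[]): TurnSummary {
  let sessionPromptTokens = 0;
  for (const call of calls) {
    // Each call re-sends the shared context, so the sum counts it repeatedly.
    sessionPromptTokens += call.promptTokens;
  }
  const lastPromptTokens = calls.length > 0 ? calls[calls.length - 1].promptTokens : null;
  return { sessionPromptTokens, lastPromptTokens };
}
```

For two calls of 17,000 prompt tokens each, the cumulative value is 34,000 while the last-call value stays at 17,000, which is the figure the context display needs.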
…GoldenFish123321/hermes-web-ui into feature/bridge-real-token-usage
Same issue as the progress bar — session_prompt_tokens is cumulative across all API calls within a turn, double-counting shared context. Prefer lastPromptTokens (single-call) for the context display line.
…lback || treats 0 as falsy, so lastPromptTokens=0 would incorrectly fall through to the cumulative promptTokens. Use an explicit != null && > 0 guard so only undefined/null/0 triggers the fallback chain.
- Remove Prompt tokens (total): redundant, = input + cache
- Remove Completion tokens: redundant, = output_tokens
- Remove Cost status / Cost source: merge into Cost line
- Remove Pricing unknown note: Cost line already shows n/a
- Remove separator line before Current context
- Cache read/write tokens only shown when >0 (like CLI)
- Compressions only shown when >0 (like CLI)
- Session duration only shown when valid
- formatCost: drop '(estimated)' suffix, add 'included' status
- Keep Reasoning tokens, Messages, Session duration as useful extras
CLI shows Prompt tokens (total), Completion tokens, Cost status/source, separator lines, and always-displayed cache rows. Restore all of them. Use column-aligned padding matching CLI's wide-column layout.
… data DB fallback carries cumulative/inflated session_* counters from the previous turn (e.g. cache_read_tokens=5.3M). When the /usage handler blindly builds apiUsage from these, the progress bar explodes. Use isLive = (lastPromptTokens != null) as the gate. DB fallback does not set lastPromptTokens, so it never overwrites apiUsage. Live bridge data (which has lastPromptTokens) passes through.
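The gate described in this commit can be sketched as follows. The field lists and the `rebuildApiUsage` helper are assumptions for illustration; only the `lastPromptTokens != null` check is taken from the commit message.

```typescript
// Assumed, simplified shapes; the real BridgeUsageState has more fields.
interface BridgeUsageState {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  lastPromptTokens?: number | null;
}

interface ApiUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  lastPromptTokens: number;
}

// Only live bridge data (which carries lastPromptTokens) may replace apiUsage.
// DB fallback data lacks it and is ignored here, so inflated cumulative
// counters never reach the progress bar.
function rebuildApiUsage(
  current: ApiUsage | undefined,
  bu: BridgeUsageState | null | undefined,
): ApiUsage | undefined {
  if (bu == null || bu.lastPromptTokens == null) return current; // not live
  return {
    inputTokens: bu.inputTokens,
    outputTokens: bu.outputTokens,
    cacheReadTokens: bu.cacheReadTokens,
    lastPromptTokens: bu.lastPromptTokens,
  };
}
```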
- Reasoning tokens: only shown if >0, labeled '↳ Reasoning (subset):'
- Add 'Note: Pricing unknown for {model}' at bottom when applicable
- Add last_prompt_tokens column to sessions table schema
- Add field to HermesSessionRow interface
- Write lastPromptTokens in updateSession() call
- Read it back in resolveBridgeUsageFromDb() so the progress bar and /usage Current context survive page reload without falling back to cumulative session_* values
…teSession These construct HermesSessionRow objects and were missing the newly added last_prompt_tokens field, causing TS2741 errors.
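Fixed a few things:
- Progress bar fix: the value returned by hermes agent …; also added to the DB …
- Data source isolation: live bridge data and DB fallback data must not be mixed. The values obtained on DB fallback …; all DB changes …
- Resume paths: all resume paths are now handled correctly.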
Same issue as the /usage handler — resume payload includes bridgeUsage which may carry stale cumulative session_* values without lastPromptTokens. Only rebuild apiUsage when lastPromptTokens is available (live bridge data).
… resume applyReconnectResume() was missing the bridgeUsage → apiUsage rebuild that switchSession's resume callback has. Add it with the same lastPromptTokens guard to prevent stale cumulative values from polluting the progress bar.
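Problem
The frontend and the /usage command both estimated tokens locally with tiktoken instead of using the real consumption returned by the upstream API. /usage also printed only a single summary line, far short of the CLI's detailed format.

Root cause
The return value of hermes agent's run_conversation() (conversation_loop.py:4156-4183) already contains complete token data. The bridge passes it through unchanged to chunk.result, but the TypeScript side previously only re-estimated it with the local countTokens().

What was done
- Extract real API data: the new extractBridgeUsage() extracts upstream token consumption from chunk.result and writes it into BridgeUsageState. The data is sent to the frontend via run.completed/run.failed events and is also persisted to the sessions table, so it can be restored after a page refresh.
- Fix the progress bar: the session_prompt_tokens returned by hermes agent re-adds the shared context on every round of tool calls, so the progress bar showed N times the real usage. Switching to last_prompt_tokens (the real prompt consumption of a single call) makes the progress bar show the accurate context usage.
- Rewrite /usage: the output format now fully matches the TUI CLI, including the complete field list, cost estimate, context usage percentage, message count, and compression count. Data is read from the live bridge first, then restored from the DB, and finally falls back to the local tiktoken estimate.
- Prevent data pollution: the session_* values obtained on DB fallback are cumulative totals from the previous turn (possibly millions of tokens) and must not be written straight into the progress bar. A lastPromptTokens != null guard strictly separates the two kinds of data: data with lastPromptTokens is live and safe to use; data without it is DB fallback data, used only for the static /usage display, and never overwrites the live progress bar. The guard covers every injection path: run.completed, the /usage handler, the session-switch resume, and the WebSocket reconnect resume.
- Clean up several edge cases: clear bridgeUsage at the start of every run so that stale data from an earlier run cannot leak into a failed one; updateSession() skips undefined parameters to prevent the SQLite binding error; loadSessions() preserves the in-memory apiUsage on refresh; run.failed also extracts real API data; duration shows N/A instead of a fake Date.now()-based value when started_at is missing.

Files involved
| File | Change |
| --- | --- |
| types.ts | BridgeUsageState interface |
| handle-bridge-run.ts | extractBridgeUsage() extracts tokens, updateSession() persists them, stale data is cleared on every run |
| session-command.ts | /usage rewrite: three-tier fallback live → DB → tiktoken, format aligned with the CLI |
| index.ts | bridgeUsage |
| schemas.ts | last_prompt_tokens column (auto-migrated) |
| session-store.ts | updateSession() skips undefined parameters |
| chat.ts (client) | apiUsage field, extracted applyApiUsage(), unified restore on the resume/reconnect paths |
| chat.ts (api) | ResumeSessionPayload gains a bridgeUsage field |
| ChatInput.vue | apiUsage.lastPromptTokens |

Compatibility
lastPromptTokens → inputTokens + cache → contextTokens → inputTokens + outputTokens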
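Read as a priority order for the context display value, the chain above can be sketched like this. This is an interpretation, not the code in ChatInput.vue: `resolveContextTokens`, the `SessionTokens` shape, and the treatment of "cache" as cache read plus cache write are assumptions.

```typescript
// Assumed, simplified shapes for illustration only.
interface ApiUsage {
  inputTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
  lastPromptTokens?: number | null;
}

interface SessionTokens {
  apiUsage?: ApiUsage;
  contextTokens?: number; // local tiktoken estimate of the context
  inputTokens?: number; // local estimates
  outputTokens?: number;
}

// True only for a usable, strictly positive count.
const isPositive = (n: number | null | undefined): n is number => n != null && n > 0;

function resolveContextTokens(s: SessionTokens): number {
  const api = s.apiUsage;
  if (api) {
    // 1. Prompt size of the single most recent API call.
    if (isPositive(api.lastPromptTokens)) return api.lastPromptTokens;
    // 2. Input plus cache tokens (input alone can be 0 when fully cached).
    const prompt = (api.inputTokens ?? 0) + (api.cacheReadTokens ?? 0) + (api.cacheWriteTokens ?? 0);
    if (prompt > 0) return prompt;
  }
  // 3. Local context estimate.
  if (isPositive(s.contextTokens)) return s.contextTokens;
  // 4. Last resort: local input + output estimates.
  return (s.inputTokens ?? 0) + (s.outputTokens ?? 0);
}
```

Sessions created before this change have no apiUsage at all, so they fall through to the local estimates in steps 3 and 4.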