Releases: LeenHawk/gproxy

staging

01 Mar 15:35

staging Pre-release

Automated staging build from 62422faf88a19178ddfa97723abaabcf4d891c75.

v1.0.10

14 Apr 16:45

Two focused fixes from the v1.0.9 fallout: claudecode OAuth refresh was broken against Anthropic's token endpoint and left credentials permanently dead, and the sanitize middleware was leaking anthropic-version through so every upstream request carried a duplicated header.

Fixed

  • claudecode OAuth refresh actually works again. The v1.0.9 gproxy-channel refactor routed refresh_credential's refresh_token path through the generic oauth2_refresh::refresh_oauth2_token helper, which posts grant_type=refresh_token&refresh_token=... (no client_id, no anthropic headers) to https://console.anthropic.com/v1/oauth/token. Anthropic's token endpoint rejects that shape with invalid_request_error: Invalid request format, so any credential with a refresh_token but no cookie fallback was stuck dead forever — the 401 → refresh → retry loop would fail every time. Replaced with exchange_tokens_with_refresh_token in claudecode_cookie.rs, which posts the CLI-matching shape to {api_base}/v1/oauth/token (form body with client_id=9d1c250a-... and headers anthropic-version: 2023-06-01 / anthropic-beta: oauth-2025-04-20 / user-agent: claude-cli/...).
  • Pre-flight credential refresh. Added Channel::needs_refresh as a new trait hook (default false). claudecode overrides it to return true when access_token is empty, expires_at_ms is already past, or expiry is within a 60s skew window. The retry loop now calls refresh_credential up-front for such credentials and proceeds with the fresh token, skipping the otherwise-guaranteed 401 round-trip. Errors from the pre-flight are logged and swallowed — the existing AuthDead path still catches anything that slips through.
  • anthropic-version no longer duplicated on upstream requests. The request sanitize middleware's HEADER_DENYLIST was already stripping authorization / user-agent / content-type / etc. from the downstream request before the channel forwarding loop ran — but anthropic-version was missing from the list. Since http::request::Builder::header appends rather than replaces, the client-forwarded copy ended up alongside the channel's own value, producing anthropic-version: 2023-06-01 twice on the wire. Added to the denylist.
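The pre-flight check described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the real `Channel::needs_refresh` implementation: the `Credential` struct and helper are stand-ins, with only the field names (`access_token`, `expires_at_ms`) and the 60-second skew window taken from the notes.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Simplified stand-in for the credential state described in the notes;
// field names follow the release text, everything else is illustrative.
struct Credential {
    access_token: String,
    expires_at_ms: u64,
}

const REFRESH_SKEW_MS: u64 = 60_000; // the 60s skew window from the notes

fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_millis() as u64
}

// Mirrors the described override: refresh when the token is missing,
// already expired, or expiring within the skew window.
fn needs_refresh(cred: &Credential) -> bool {
    cred.access_token.is_empty() || cred.expires_at_ms <= now_ms() + REFRESH_SKEW_MS
}

fn main() {
    let fresh = Credential { access_token: "tok".into(), expires_at_ms: now_ms() + 3_600_000 };
    let expiring = Credential { access_token: "tok".into(), expires_at_ms: now_ms() + 30_000 };
    let empty = Credential { access_token: String::new(), expires_at_ms: now_ms() + 3_600_000 };
    assert!(!needs_refresh(&fresh));
    assert!(needs_refresh(&expiring)); // inside the 60s skew window
    assert!(needs_refresh(&empty));
}
```

Calling a refresh up-front for any credential matching this predicate is what skips the otherwise-guaranteed 401 round-trip.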

Compatibility

  • Drop-in upgrade from v1.0.9. No DB migration, no HTTP API change, no config change. SDK consumers are unaffected — no public types or module paths moved.


v1.0.9

14 Apr 15:32

The SDK splits into four publishable crates — gproxy-protocol, gproxy-channel, gproxy-engine, gproxy-sdk — with real per-channel feature pruning, a standalone execute_once single-request client for single-provider use, and no DB / API / config changes for binary operators.

Added

  • Four publishable SDK crates — gproxy-protocol (L0 wire types + transforms), gproxy-channel (L1 Channel trait, 14 concrete channels, credentials, execute_once pipeline), gproxy-engine (L2 GproxyEngine, provider store, retry, affinity, routing helpers), and gproxy-sdk (facade re-exporting all three). Every SDK crate now carries complete crates.io metadata (license, readme, keywords, categories) and a per-crate README with a common layering table.
  • execute_once / execute_once_stream in gproxy_channel::executor — a complete single-request pipeline (finalize → sanitize → rewrite → prepare_request → HTTP send → normalize → classify) you can drive with just gproxy-channel as a dependency. Comes with lower-level prepare_for_send / send_attempt / send_attempt_stream helpers for users who want to write their own retry loop.
  • apply_outgoing_rules helper — the single in-tree invocation point for apply_sanitize_rules + apply_rewrite_rules. Engine, API handler, and L1 executor all funnel through one body-mutation helper instead of each re-implementing the JSON round-trip.
  • CommonChannelSettings (#[serde(flatten)]) — every channel now embeds one common struct holding user_agent, max_retries_on_429, sanitize_rules, rewrite_rules instead of each of the 14 channels copy-pasting the same four fields and trait method overrides. TOML / JSON wire format is unchanged.
  • Runtime transform dispatcher as public L0 API — gproxy_protocol::transform::dispatch::{transform_request, transform_response, create_stream_response_transformer, nonstream_to_stream, stream_to_nonstream, convert_error_body_or_raw}. External users who only want protocol conversion can now depend on gproxy-protocol alone and get everything without pulling wreq or tokio.
  • hello_openai example in sdk/gproxy-channel/examples/ — a minimal single-file demo of execute_once that runs against real OpenAI with OPENAI_API_KEY. Compiles under --no-default-features --features openai as a smoke test that single-channel use really only pulls one channel.
  • Integration test for execute_once — spins up a local axum mock server, points OpenAiSettings::base_url at it, runs the full L1 pipeline, and asserts on both request side (Bearer token, body) and response side (status, classification, JSON).
  • Optional label field on provider — free-text display name shown in the console alongside the internal provider name.

Changed

  • TransformError now carries Cow<'static, str> messages so the runtime dispatcher can produce dynamically-built errors (format!("no stream aggregation for protocol: {protocol}")) without allocating a new TransformError variant. Existing TransformError::not_implemented("literal") call sites keep working; new TransformError::new(impl Into<String>) constructor handles the dynamic case.
  • store.rs split — the 1564-line gproxy-engine/src/store.rs is now store/{mod,public_traits,runtime,types}.rs so the main ProviderStore orchestrator, the internal ProviderRuntime trait + ProviderInstance<C> generic implementation, the public traits, and the value types each live in their own file.
  • Lock-step SDK versioning — all four SDK crates follow workspace.package.version; release.sh's cargo set-version bump propagates to every [package] inherit plus the four workspace.dependencies.gproxy-*.version entries at once. The release strategy + manual publish recipe is documented inline in the root Cargo.toml.
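The Cow-based message change above can be illustrated with a self-contained sketch. The real TransformError has more variants and context; only the Cow storage and the two constructor shapes named in the notes are modeled here.

```rust
use std::borrow::Cow;
use std::fmt;

// Illustrative sketch of the Cow<'static, str> message storage described
// above; the real TransformError type is richer than this.
#[derive(Debug)]
struct TransformError {
    message: Cow<'static, str>,
}

impl TransformError {
    // Static literal: no allocation, mirrors not_implemented("literal") call sites.
    fn not_implemented(what: &'static str) -> Self {
        TransformError { message: Cow::Borrowed(what) }
    }
    // Dynamic case: owns the formatted String.
    fn new(msg: impl Into<String>) -> Self {
        TransformError { message: Cow::Owned(msg.into()) }
    }
}

impl fmt::Display for TransformError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

fn main() {
    let fixed = TransformError::not_implemented("stream aggregation");
    let protocol = "gemini";
    let dynamic = TransformError::new(format!("no stream aggregation for protocol: {protocol}"));
    assert_eq!(fixed.to_string(), "stream aggregation");
    assert_eq!(dynamic.to_string(), "no stream aggregation for protocol: gemini");
}
```

Cow keeps the common literal path allocation-free while still permitting dynamically built messages.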

Fixed

  • Per-channel feature flags now actually prune — the openai, anthropic, … channel feature flags on gproxy-channel, gproxy-engine, and gproxy-sdk were declared in v1.0.8 but non-functional. cargo build --no-default-features --features openai compiled all 14 channels anyway, because (a) the upstream gproxy-channel dep didn't opt out of default-features, so the default all-channels came in regardless; (b) gproxy-engine's all-channels feature only forwarded to gproxy-channel/all-channels and didn't enable its own per-channel features, so the #[cfg(feature = "…")] gates would have been false even if they existed; and (c) the gates didn't exist on engine's hardcoded match arms in built_in_model_prices, validate_credential_json, GproxyEngineBuilder::add_provider_json, ProviderStore::add_provider_json, and bootstrap_credential_on_upsert. All three fixed in this release, and cargo build -p gproxy-sdk --no-default-features --features openai now genuinely compiles only the single requested channel.
  • Pricing editor in the console collapses into a single triangle disclosure — the nested editor no longer cascades open by accident.
  • Dispatch template description now clarifies that it describes the upstream protocol, not the downstream-client shape.
  • Claude Code OAuth beta badge drops the misleading "always" suffix; the badge just shows the beta name now.
  • Self-update button and its success toast are now localized.
  • Doc-comment clippy lint (doc_lazy_continuation) on gproxy-engine crate doc no longer fails cargo clippy -- -D warnings.
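The feature-pruning fix boils down to a Cargo manifest pattern. The fragment below is a hypothetical excerpt, not the actual gproxy manifest: crate versions and the channel list are illustrative, but it shows points (a) and (b) from the bullet above, namely that the dependency must opt out of default features and that each per-channel feature must be forwarded explicitly.

```toml
# Hypothetical gproxy-engine manifest excerpt illustrating the pattern.
[dependencies]
# (a) Without default-features = false, the dep's default all-channels
# set comes in regardless of what the caller asked for.
gproxy-channel = { version = "1.0.9", default-features = false }

[features]
default = ["all-channels"]
# (b) Forward per-channel features so #[cfg(feature = "openai")] gates in
# this crate line up with the channel actually compiled downstream.
openai = ["gproxy-channel/openai"]
anthropic = ["gproxy-channel/anthropic"]
all-channels = ["openai", "anthropic"] # ... plus the other channels
```

With both in place, `cargo build --no-default-features --features openai` compiles only the requested channel, provided point (c) holds: the engine's match arms are also behind the matching `#[cfg(feature = "...")]` gates.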

Removed

  • gproxy-provider crate — the old aggregator that mixed single-channel access with the multi-channel engine. Its content is now split between gproxy-channel (L1) and gproxy-engine (L2).
  • gproxy-routing crate — merged into gproxy-engine::routing (classify, permission, rate_limit, provider_prefix, model_alias, model_extraction, headers / former sanitize.rs).
  • Deprecated gproxy_sdk::provider / gproxy_sdk::routing module aliases — use gproxy_sdk::channel::*, gproxy_sdk::engine::*, gproxy_sdk::engine::routing::* instead.
  • Unused ProviderDefinition type — dead code with no consumers.
  • gproxy-engine::transform_dispatch passthrough — engine now calls gproxy_protocol::transform::dispatch::* directly; the 14-line re-export file is gone.

Compatibility

  • Binary / server operators: drop-in upgrade from v1.0.8. No DB migration, no HTTP API change, no admin client change, no config change.
  • SDK library consumers: breaking change. gproxy_sdk::provider::* and gproxy_sdk::routing::* paths no longer exist. Migrate every import site to gproxy_sdk::channel::*, gproxy_sdk::engine::*, gproxy_sdk::engine::routing::* (for the former routing helpers), or gproxy_sdk::protocol::transform::dispatch::* (for the runtime transform dispatcher). All in-tree downstream consumers have already been migrated.
  • Direct gproxy-provider / gproxy-routing dependencies in downstream Cargo.toml must be replaced with gproxy-channel + gproxy-engine, or just gproxy-sdk if you want the facade.
  • 14 channel Settings structs gained a common: CommonChannelSettings field flattened via serde, so existing TOML / JSON configs deserialize unchanged.
  • crates.io publishing: The four SDK crates are metadata-complete and packaged (verified via cargo publish --dry-run on gproxy-protocol and cargo package --list on the downstream three). Actual publish has NOT happened yet — this release is local to the repo. When you publish, the dependency order is gproxy-protocol → gproxy-channel → gproxy-engine → gproxy-sdk with ~30 s between each step for the registry index to catch up.


v1.0.8

14 Apr 03:53

Cross-protocol error bodies finally make it to the client in the
right schema, orphaned tool_result messages stop breaking Claude
requests, and streaming upstream logs now store the actual wire
bytes.
The headline fix: when a Claude/Gemini/OpenAI upstream
returns a non-2xx error body, the engine now converts it to the
client's declared error shape (e.g. Claude {"type":"error",...},
OpenAI {"error":{...}}) instead of handing the raw JSON to an SDK
that can't parse it — with a raw-bytes fallback when the upstream
shape doesn't match any declared schema. Streaming error responses
finally reach the client too, after a buffer-and-convert fast path
replaces the broken SSE transformer that used to swallow the error
body and emit only [DONE]. On the transform side, a new
push_message_block utility centralizes Claude message-building
invariants across every *→Claude converter, fixing an OpenAI
Responses-API bug where previous_response_id + fresh
function_call_output produced orphaned tool_result blocks and
Claude returned a 400. The console picks up a per-channel
max_retries_on_429 field and a one-click TOML download on the
config export page.

Fixed

  • Non-2xx upstream errors reached clients in the wrong protocol
    schema — each provider uses a different error shape (Claude
    {"type":"error","error":{...}}, OpenAI {"error":{...}}, Gemini
    {"error":{"code":N,...}}), and before this release the engine only
    ran transform_response on 2xx bodies. An OpenAI-speaking client
    that hit a Claude 400 got the raw Claude JSON back, which the SDK
    couldn't parse, so dashboards saw a generic "invalid response" on
    what was really a simple upstream 400 (e.g. prompt is too long).
    sdk/gproxy-provider/src/engine.rs and
    sdk/gproxy-provider/src/transform_dispatch.rs now route error
    bodies through the new convert_error_body_or_raw helper, which
    tries the declared error variant via BodyEnvelope::from_body_bytes
    and falls back to raw upstream bytes on schema mismatch (e.g. codex
    returning {"detail":{"code":"deactivated_workspace"}}, which isn't
    any declared error schema). Claude-error-to-OpenAI-error conversion
    is covered by a new integration test.
  • Streaming endpoints swallowed upstream error bodies — on a
    cross-protocol transform route (e.g. client speaks
    OpenAI-chat-completions, upstream is Claude), a non-2xx upstream
    response was fed to the inline per-chunk SSE transformer, which
    couldn't parse the JSON error body as an SSE frame, yielded nothing,
    and emitted only a synthetic [DONE]. The client saw an empty
    success stream instead of the actual 4xx/5xx error. execute_stream
    now detects !is_success upstream early, buffers the full error
    body (which is always a small complete JSON, not a real SSE
    stream), runs it through convert_error_body_or_raw, and returns
    a single-chunk ExecuteBody::Stream with the converted bytes. The
    raw pre-conversion upstream bytes are still captured for the
    upstream log so operators can see what actually came over the wire.
  • Orphaned tool_result blocks caused Claude 400 on OpenAI
    Responses-API requests — Claude's API requires "each tool_result
    block must have a corresponding tool_use block in the previous
    message," but the OpenAI Responses API lets clients send only
    function_call_output items when using previous_response_id
    (the tool_use side lives in the prior turn, which the client is
    referencing by id instead of resending). The legacy *→Claude
    transforms built messages by blindly pushing blocks, so these
    requests ended up with a leading user/tool_result message and
    no matching assistant/tool_use — Claude returned 400 every
    time. The new push_message_block helper (see Added) synthesizes
    a placeholder tool_use block with the matching id whenever it
    detects an orphaned tool_result, so the request now satisfies
    Claude's pairing rule and goes through.
  • Adjacent same-role messages from multi-block transforms — the
    per-transform push_block_message helpers produced two separate
    user messages for two consecutive tool_result pushes (and
    similarly for assistant blocks), which Claude's API rejects as
    malformed. push_message_block now merges consecutive blocks for
    the same role into a single BetaMessageContent::Blocks message,
    so every *→Claude transform produces a well-formed message list
    by construction.
  • Streaming upstream logs stored post-transform bytes instead of
    pre-transform wire bytes — the handler's old
    accumulated_body: Vec<u8> collected chunks as they were yielded
    downstream, so for cross-protocol routes the response_body
    column in upstream_requests held the converted (OpenAI/Gemini/…)
    bytes, not what Claude/OpenAI-upstream actually sent. This diverged
    from the non-stream path, which stores the pre-transform bytes via
    raw_response_body_for_log. A new stream wrapper
    (wrap_upstream_response_stream) now tees upstream bytes into an
    Arc<Mutex<Vec<u8>>> capture buffer before they reach the
    transformer, and the handler reads it after the stream drains.
    Stream and non-stream paths are now byte-for-byte consistent in
    the upstream log.
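The "convert if the schema matches, otherwise pass raw bytes through" policy behind the first two fixes can be sketched without any JSON machinery. In the sketch below a plain closure stands in for the declared-schema parse that the real helper does via BodyEnvelope, and the "conversion" is faked with string checks; only the fallback policy itself is taken from the notes.

```rust
// Minimal sketch of the convert-or-raw policy described above. The closure
// stands in for parsing the upstream protocol's declared error schema.
fn convert_error_body_or_raw(
    body: Vec<u8>,
    try_convert: impl Fn(&[u8]) -> Option<Vec<u8>>,
) -> Vec<u8> {
    match try_convert(&body) {
        Some(converted) => converted,
        // Schema mismatch (e.g. a codex {"detail":...} body): pass the
        // upstream bytes through unchanged so no error information is lost.
        None => body,
    }
}

fn main() {
    // Stand-in "Claude error -> OpenAI error" converter: only accepts bodies
    // that look like Claude's {"type":"error",...} shape.
    let claude_to_openai = |b: &[u8]| {
        let s = std::str::from_utf8(b).ok()?;
        if s.contains(r#""type":"error""#) {
            Some(br#"{"error":{"message":"prompt is too long"}}"#.to_vec())
        } else {
            None
        }
    };

    let claude = br#"{"type":"error","error":{"message":"prompt is too long"}}"#.to_vec();
    let codex = br#"{"detail":{"code":"deactivated_workspace"}}"#.to_vec();

    let converted = convert_error_body_or_raw(claude, claude_to_openai);
    assert!(std::str::from_utf8(&converted).unwrap().starts_with(r#"{"error""#));

    let raw = convert_error_body_or_raw(codex.clone(), claude_to_openai);
    assert_eq!(raw, codex); // unknown shape falls through untouched
}
```

The streaming fast path applies the same function after buffering the small, complete JSON error body.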

Changed

  • Passthrough streaming fast path — when a stream route has no
    transformer, no raw_capture, and no response_model_override, the
    engine now hands response.body through to the client unwrapped
    instead of going through a per-chunk try_stream! loop. This
    reclaims the passthrough latency that was lost when accumulated_body
    was added. The wrapper is only spliced in when at least one of raw
    capture, transform, or alias rewriting is active.
  • rand 0.9.4 / rand_core 0.10.1 — minor dependency bumps.
    Picks up upstream API cleanups; no gproxy code changes required.
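The raw-capture tee that makes stream and non-stream logs byte-for-byte consistent is easy to picture with a synchronous stand-in. The real wrap_upstream_response_stream wraps an async byte stream; here a plain iterator plays that role, and only the tee-before-transform ordering and the Arc<Mutex<Vec<u8>>> buffer come from the notes.

```rust
use std::sync::{Arc, Mutex};

// Dependency-free sketch: upstream chunks are copied into a shared capture
// buffer before any transformer sees them, so the upstream log records
// pre-transform wire bytes. An iterator stands in for the async stream.
fn tee_chunks<I: Iterator<Item = Vec<u8>>>(
    upstream: I,
    capture: Arc<Mutex<Vec<u8>>>,
) -> impl Iterator<Item = Vec<u8>> {
    upstream.map(move |chunk| {
        capture.lock().unwrap().extend_from_slice(&chunk); // tee first
        chunk // then hand the chunk onward (transform / alias / client)
    })
}

fn main() {
    let capture = Arc::new(Mutex::new(Vec::new()));
    let chunks = vec![b"data: a\n\n".to_vec(), b"data: [DONE]\n\n".to_vec()];
    // Downstream consumer sees the chunks unchanged (passthrough mode)...
    let forwarded: Vec<Vec<u8>> = tee_chunks(chunks.into_iter(), capture.clone()).collect();
    assert_eq!(forwarded.len(), 2);
    // ...while the capture buffer holds the full wire copy for the log.
    assert_eq!(capture.lock().unwrap().as_slice(), b"data: a\n\ndata: [DONE]\n\n");
}
```

The handler reads the buffer only after the stream drains, which is why the field is an Arc the result type can hand back.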

Added

  • convert_error_body_or_raw(src_op, src_proto, dst_op, dst_proto, body) in sdk/gproxy-provider/src/transform_dispatch.rs
    converts an upstream non-2xx body from the upstream protocol's
    error schema to the client's expected error schema via
    transform_response, substituting GenerateContent for streaming
    ops (error bodies share the non-stream schema). Passthrough routes
    (same src/dst protocol and op) skip conversion entirely. On schema
    mismatch the helper logs at debug level with the full
    src_op/src_proto/dst_op/dst_proto context and returns the
    raw bytes so no error information is lost. Three unit tests cover
    Claude→OpenAI rewriting, codex-shape fallback, and the passthrough
    case.
  • ExecuteResult.stream_raw_capture: Option<Arc<Mutex<Vec<u8>>>>
    — new field on the SDK result type, populated by
    execute_stream when enable_upstream_log && enable_upstream_log_body and the route actually sees a raw-capture
    tee. The handler reads the buffer after the stream drains and
    copies it into meta.response_body, so
    upstream_requests.response_body contains the pre-transform wire
    bytes that correspond to what the non-stream path already stored.
    None on passthrough-with-logging-off routes and on the error-body
    fast path's re-use (which seeds the buffer with pre-conversion
    bytes itself).
  • wrap_upstream_response_stream in
    sdk/gproxy-provider/src/engine.rs — single stream-combinator that
    applies, in order: raw-byte tee into raw_capture, optional
    per-chunk StreamResponseTransformer, and optional model-alias
    rewriting. Replaces the previous two inlined try_stream! loops
    (one for transform + alias, one for alias-only) with a unified
    helper whose behaviour is covered by two unit tests
    (wrap_response_stream_tees_raw_bytes_in_passthrough_mode,
    wrap_response_stream_pure_passthrough_yields_chunks_unchanged).
  • push_message_block(messages, role, block) in
    sdk/gproxy-protocol/src/transform/claude/utils.rs — central
    utility for building Claude messages lists from any non-Claude
    source. Maintains two invariants:
    1. Consecutive blocks for the same role are merged into one
      BetaMessageContent::Blocks message (no adjacent same-role
      messages).
    2. Whenever a tool_result block is appended to a user message,
      the immediately-preceding assistant message is checked for a
      matching tool_use block; if none exists, a placeholder
      tool_use (named tool_use_placeholder) is synthesized in the
      assistant slot — either by promoting an existing assistant
      message's content into blocks and appending, or by inserting a
      new assistant message before the trailing user one.
      Exported from transform::claude::utils and re-exported from
      transform::utils so non-Claude callers don't need a cross-module
      dependency. Every *→Claude request transform (gemini,
      openai_chat_completions, openai_response, openai_compact,
      openai_count_tokens) is migrated to call it instead of pushing
      messages directly. Covered by 9 unit tests, including the exact
      orphaned-tool_result shape reported in production.
  • Per-channel max_retries_on_429 setting in ConfigTab — every
    channel's structured editor now exposes an optional integer input
    bound to the backend's per-credential 429-without-retry-after
    retry cap (backend default: 3). Empty input is omitted from the
    settings JSON so the backend default still applies. i18n strings
    added in both locales (field.max_retries_on_429).
  • TOML download button on the config export page — ConfigExport
    module grows a neutral Download button alongside the existing
    Export. Clicking it ships the current export as
    gproxy-config-<ISO-timestamp>.toml via a Blob + <a>-click. If
    the user hasn't clicked Export yet, Download fetches the TOML
    first and then triggers the file save. New i18n key:
    common.download.
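Invariant 1 of push_message_block, merging consecutive same-role blocks into one message, can be sketched with simplified types. These are stand-ins for the real Beta* structures; only the merge rule itself is taken from the notes (the placeholder-tool_use synthesis of invariant 2 is omitted to keep the sketch short).

```rust
// Sketch of invariant 1: consecutive blocks pushed for the same role are
// merged into one message instead of producing adjacent same-role messages,
// which Claude's API rejects as malformed.
#[derive(Debug, PartialEq, Clone)]
struct Message {
    role: &'static str,
    blocks: Vec<String>,
}

fn push_message_block(messages: &mut Vec<Message>, role: &'static str, block: String) {
    match messages.last_mut() {
        // Same role as the previous message: append to its block list.
        Some(last) if last.role == role => last.blocks.push(block),
        // Role change (or empty list): start a new message.
        _ => messages.push(Message { role, blocks: vec![block] }),
    }
}

fn main() {
    let mut msgs = Vec::new();
    // Two consecutive tool_result pushes for the user role...
    push_message_block(&mut msgs, "user", "tool_result #1".into());
    push_message_block(&mut msgs, "user", "tool_result #2".into());
    push_message_block(&mut msgs, "assistant", "text".into());
    // ...collapse into a single user message holding two blocks.
    assert_eq!(msgs.len(), 2);
    assert_eq!(msgs[0].blocks.len(), 2);
    assert_eq!(msgs[1].role, "assistant");
}
```

Because every *→Claude transform funnels through one such helper, the message list is well-formed by construction rather than by per-transform discipline.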

Compatibility

  • No DB, API, or config changes. `settings....

v1.0.7

13 Apr 12:02

Self-update is unbroken, failing transforms finally tell you which
request broke them, and the docs site deploys itself.
The headline
fix centralizes wreq client policy in the engine so every HTTP path
(including self-update) follows redirects — GitHub asset downloads
stop failing on their 302 to the CDN. Pre-upstream transform errors
now capture the original downstream request body in the upstream
log, so operators actually see which JSON failed to parse. The
release pipeline grows a Cloudflare Pages deploy job for the docs
site, and the Docker deployment page is rewritten around the
official ghcr.io/leenhawk/gproxy image.

Fixed

  • Self-update failing with download failed: HTTP 302 Found —
    GitHub serves every /releases/download/... asset as a 302 to the
    CDN host, but wreq's default redirect policy is
    redirect::Policy::none(), so wreq::get(url) in
    crates/gproxy-api/src/admin/update.rs returned the redirect
    response verbatim and download_bytes / download_text rejected
    it at the status().is_success() check. The update path never
    touched the engine client either, so the fresh per-call default
    client inherited none of the runtime configuration.
  • Pre-upstream transform failures lost the request body in logs
    — when transform_dispatch::transform_request failed before we
    ever sent anything upstream (e.g. a malformed tools[] entry
    failing to deserialize into ResponseTool), the error bubbled up
    as ExecuteError { meta: None, .. } and
    record_execute_error_logs wrote an upstream-log row with
    request_body = NULL, leaving operators a 500 with no way to see
    which JSON actually failed. GproxyEngine::execute and
    execute_stream now catch the transform error, clone the
    original downstream body beforehand, and synthesize an
    UpstreamRequestMeta via the new build_transform_error helper
    so the offending body lands in the log. URL / headers /
    response fields stay empty because the request never hit the
    wire; enable_upstream_log / enable_upstream_log_body are still
    honored.

Changed

  • Single source of truth for HTTP client policy — new
    default_http_client() helper in
    sdk/gproxy-provider/src/engine.rs centralizes the global wreq
    client policy (redirect::Policy::limited(10)). Every build path
    now routes through it:
    • GproxyEngineBuilder::build() uses it as the default fallback
      (was self.client.unwrap_or_default()), so bare
      GproxyEngine::builder().build() — used by tests and several
      admin-only bootstrap paths — no longer produces a client that
      drops redirects.
    • configure_clients and with_new_clients set .redirect(...)
      on both the normal and spoof-emulation builders, and their
      Err fallbacks route through default_http_client() instead
      of wreq::Client::default().
      This also closes a latent footgun: if configure_clients ever
      failed to build (bad proxy URL, TLS init error), the process used
      to silently fall back to a fully-unconfigured default client.
      The fallback now at least keeps the redirect policy.
  • update.rs reuses the engine's HTTP client — check_update
    and perform_update grab state.engine().client().clone() and
    pass it through to fetch_github_manifest, download_bytes, and
    download_text. The three helpers no longer call wreq::get(url)
    / wreq::Client::new() at all. Practical upshot: self-update
    traffic now inherits the operator's configured upstream proxy,
    TLS settings, and whatever else the engine is built with —
    previously it silently bypassed all of them.
  • Docker deployment guide rewritten around the official image —
    docs/src/content/docs/deployment/docker.md (and the Chinese
    mirror) now leads with docker pull ghcr.io/leenhawk/gproxy:latest
    instead of "build Dockerfile.action locally," and documents the
    full tag matrix (latest / vX.Y.Z / staging × glibc / musl,
    plus per-arch suffixes). The installation pages cross-reference
    the new guidance so new users don't start by building an image
    they don't need to.
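The single-source-of-truth change is a pattern more than an API, and can be sketched without wreq at all. In the sketch below, Client and redirect_limit are stand-ins, not wreq's real types; only the shape of the fix, one helper owning the policy and every fallback funneling through it, comes from the notes.

```rust
// Pattern sketch only: a single helper owns the client policy, and every
// construction path (builder default, error fallback) funnels through it.
#[derive(Clone, Debug)]
struct Client {
    redirect_limit: u32, // stand-in for wreq's redirect policy
}

fn default_http_client() -> Client {
    Client { redirect_limit: 10 } // mirrors redirect::Policy::limited(10)
}

struct EngineBuilder {
    client: Option<Client>,
}

impl EngineBuilder {
    fn build(self) -> Client {
        // Before the fix this was the equivalent of unwrap_or_default(),
        // which produced a client that dropped redirects; now the fallback
        // keeps the shared policy.
        self.client.unwrap_or_else(default_http_client)
    }
}

fn main() {
    let bare = EngineBuilder { client: None }.build();
    assert_eq!(bare.redirect_limit, 10); // bare builds follow redirects
    let custom = EngineBuilder { client: Some(Client { redirect_limit: 0 }) }.build();
    assert_eq!(custom.redirect_limit, 0); // explicit clients win
}
```

The same helper serving as the Err-branch fallback is what closes the footgun where a failed configure_clients silently produced a fully-unconfigured client.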

Added

  • GproxyEngine::client() getter — public accessor exposing
    the shared &wreq::Client, so auxiliary admin code paths can
    reuse the engine's configured client instead of constructing
    their own. The spoof client stays private; the normal client is
    the right choice for anything that is not upstream provider
    traffic.
  • build_transform_error helper in
    sdk/gproxy-provider/src/engine.rs — synthesizes an
    UpstreamRequestMeta for the pre-upstream transform failure path
    so operators get the downstream request body in the upstream log
    even when we never reached a credential or a URL.
  • Cloudflare Pages docs deploy job — the
    .github/workflows/release-binary.yml pipeline gains a
    deploy-docs-cloudflare job that runs on default-branch pushes
    and on releases: pnpm-installs, builds docs/, then ships the
    result to Cloudflare Pages via cloudflare/wrangler-action@v3
    using the cloudflare environment's
    CLOUDFLARE_API_TOKEN / CLOUDFLARE_ACCOUNT_ID /
    CLOUDFLARE_PROJECT_ID secrets. The docs site at
    https://gproxy.leenhawk.com now updates automatically with every
    merge.
  • sea-orm-migration workspace dependency — declared in
    [workspace.dependencies] in preparation for an upcoming
    managed-migration pass. No crate pulls it in yet, so this has no
    runtime effect in v1.0.7.
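A job of the described shape might look like the fragment below. This is a hypothetical sketch, not the actual workflow file: step names, action versions, and the exact wrangler invocation are illustrative; only the job name, trigger conditions, environment, and secret names come from the notes.

```yaml
# Hypothetical shape of the deploy-docs-cloudflare job described above.
deploy-docs-cloudflare:
  if: github.event_name == 'release' || github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  environment: cloudflare # holds the three CLOUDFLARE_* secrets
  steps:
    - uses: actions/checkout@v4
    - uses: pnpm/action-setup@v4
    - run: pnpm install --frozen-lockfile
      working-directory: docs
    - run: pnpm build
      working-directory: docs
    - uses: cloudflare/wrangler-action@v3
      with:
        apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
        accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
        command: pages deploy docs/dist --project-name=${{ secrets.CLOUDFLARE_PROJECT_ID }}
```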

Compatibility

  • No DB, API, or config changes. settings.toml,
    global_settings, and the admin API schema are all untouched.
    This is a drop-in upgrade from v1.0.6 — just swap the binary.
  • Engine builder defaults shift. GproxyEngine::builder().build()
    now yields a client that follows up to 10 redirects, where v1.0.6
    and earlier yielded a client that followed zero. SDK consumers
    that were relying on the old behavior (e.g. intentionally
    capturing 3xx responses as terminal) must explicitly pass their
    own wreq::Client via http_client(...) /
    configure_clients(...).
  • Transform-failure log rows now include request_body where
    they previously had NULL. url / request_headers /
    response_* on those rows are still empty strings / empty
    arrays / NULL — the request never hit the wire, so there's
    nothing real to record. Dashboards that were filtering transform
    failures by url = '' will still work; ones that were filtering
    by request_body IS NULL will need to check for the actual error
    message instead.

简体中文

修复

  • 自更新报 download failed: HTTP 302 Found — GitHub
    /releases/download/... 资源永远是 302 到 CDN 域名的,
    wreq 的默认重定向策略是 redirect::Policy::none(),所以
    crates/gproxy-api/src/admin/update.rswreq::get(url)
    拿到的是 302 本身,download_bytes / download_text
    status().is_success() 这一步就直接拒绝。更新路径根本没
    接触到 engine 的 client,所以每次新建的默认 client 也继承不到
    任何运行时配置。
  • 上游前的 transform 失败在日志里丢了 request body
    transform_dispatch::transform_request 在真正发请求之前
    就失败(例如 tools[] 里有一个字段无法反序列化成
    ResponseTool),错误会以 ExecuteError { meta: None, .. }
    冒上来,record_execute_error_logs 写出的 upstream log 行
    request_body = NULL,运维只能看到一个 500 但看不到到底是
    哪段 JSON 解析不动。GproxyEngine::executeexecute_stream
    现在会捕获这个 transform 错误,提前克隆原始 downstream body,
    再通过新加的 build_transform_error helper 合成一个
    UpstreamRequestMeta,让出问题的 body 能落进日志。URL /
    headers / response 相关字段留空,因为请求根本没发上游;
    enable_upstream_log / enable_upstream_log_body 仍然生效。

变更

  • HTTP client 策略统一到一个入口
    sdk/gproxy-provider/src/engine.rs 新增 default_http_client()
    helper,把全局 wreq client 策略(redirect::Policy::limited(10)
    收敛到一个地方。所有构建路径现在都走它:
    • GproxyEngineBuilder::build() 的默认兜底从
      self.client.unwrap_or_default() 改成
      unwrap_or_else(default_http_client),裸的
      GproxyEngine::builder().build() —— 测试和若干 admin-only
      bootstrap 路径都在用 —— 不会再构造出一个不跟随重定向的 client。
    • configure_clientswith_new_clients 给普通 client 和
      spoof client 的 builder 都加了 .redirect(...),而且它们的
      Err 兜底分支也从 wreq::Client::default() 切到
      default_http_client()
      顺带堵了一个潜在陷阱:如果 configure_clients 构建失败(代理
      URL 有问题、TLS 初始化失败之类),之前会静默退回到一个完全
      未配置的默认 client。现在至少兜底 client 仍然会跟随重定向。
  • update.rs now reuses the engine's HTTP client — check_update and
    perform_update pass state.engine().client().clone() into
    fetch_github_manifest, download_bytes, and download_text; none of
    the three helpers call wreq::get(url) / wreq::Client::new()
    anymore. Net effect: self-update traffic now goes through the
    operator-configured upstream proxy, TLS settings, and everything
    else configured on the engine — all of which it previously
    bypassed silently.
  • Docker deployment docs now center on the official image —
    docs/src/content/docs/deployment/docker.md (and its Chinese
    mirror) now lead with docker pull ghcr.io/leenhawk/gproxy:latest
    instead of "build Dockerfile.action locally", and document the
    full tag matrix (latest / vX.Y.Z / staging × glibc / musl, plus
    the per-arch suffixes for each). The installation page is adjusted
    to match, so new users no longer start by building an image they
    never needed to build.

Added

  • GproxyEngine::client() getter — public accessor exposing the
    shared &wreq::Client so admin helper code paths can reuse the
    engine's already-configured client instead of each building their
    own. The spoof client stays private; non-upstream-provider traffic
    should use this regular client.
  • build_transform_error helper — new in
    sdk/gproxy-provider/src/engine.rs; synthesizes an
    UpstreamRequestMeta specifically for the pre-upstream
    transform-failure path, so operators can still see the original
    downstream body in the upstream log even when no credential was
    ever selected and no URL was resolved.
  • Cloudflare Pages docs deployment job —
    .github/workflows/release-binary.yml adds a deploy-docs-cloudflare
    job: triggered on default-branch pushes and release events, it
    runs pnpm install → builds docs/ → pushes to Cloudflare Pages via
    cloudflare/wrangler-action@v3, using the three secrets
    CLOUDFLARE_API_TOKEN / CLOUDFLARE_ACCOUNT_ID /
    CLOUDFLARE_PROJECT_ID under the cloudflare environment. From now
    on, https://gproxy.leenhawk.com updates automatically on every
    merge.
  • sea-orm-migration workspace dependency — declared in
    [workspace.dependencies] to pave the way for managed migrations
    later. No crate actually references it in v1.0.7 yet, so there is
    no runtime impact.

Compatibility

  • No DB, API, or config changes. settings.toml, global_settings,
    and the admin API schema are all untouched; v1.0.6 can upgrade to
    v1.0.7 by swapping the binary in place.
  • Engine builder default behavior changed. `GproxyEngine::builder(...

v1.0.6

12 Apr 16:52


v1.0.6

Pricing is now fully admin-editable, end to end. Model prices move
out of the compiled-in &'static [ModelPrice] slice into a
pricing_json column on the models table, the provider store holds
an ArcSwap<Vec<ModelPrice>> that bootstrap and every admin mutation
push into, and the console grows a structured editor that covers all
four billing modes. The docs site is rewritten as a full bilingual
Starlight site (25 pages × 2 locales) including a new pricing
reference page.

English

Added

  • models.pricing_json column — nullable TEXT column on the
    models entity holding the full ModelPrice JSON blob: all four
    billing modes (default / flex / scale / priority) in one
    place. Threaded through ModelQueryRow,
    ModelWrite, store_query/admin, and write_sink. MemoryModel now
    carries a single Option<ModelPrice> deserialized from the column on
    load and re-serialized on admin upsert, so the complete pricing shape
    round-trips through the DB.
  • Hot-swappable provider pricing — ProviderInstance.model_pricing
    goes from &'static [ModelPrice] to
    ArcSwap<Vec<ModelPrice>>, and the ProviderRuntime trait gains
    set_model_pricing. Engine::set_model_pricing(provider, prices) is
    exposed for host wiring. AppState::push_pricing_to_engine rebuilds
    a ModelPrice slice from the current MemoryModel snapshot and
    pushes it into the engine; it runs once during bootstrap after
    replace_models and again from every admin mutation handler that
    changes the model set. This fixes a long-standing bug where admin
    edits to price_each_call / price_tiers_json were persisted to the
    DB but the billing engine kept reading the compiled-in slice forever.
  • Structured pricing editor in ModelsTab — the lone
    pricing_json textarea is replaced with a PricingEditor component
    that toggles between "Structured" and "JSON" views. Structured view
    provides: a single price_each_call USD input; an add/remove
    price_tiers table with 7 per-tier fields (input_tokens_up_to
    plus the six per-token unit prices); and collapsible <details>
    sections for flex / scale / priority, each with its own
    price_each_call and tiers table and auto-expanded when the model
    already has pricing in that mode. All numeric fields are held as
    strings in form state so users can type freely.
  • TOML import/export round-trips full ModelPrice — ModelToml
    gains six new fields (flex_price_each_call / flex_price_tiers,
    scale_price_each_call / scale_price_tiers,
    priority_price_each_call / priority_price_tiers). All nine
    pricing fields use #[serde(default, skip_serializing_if = ...)] so
    minimal models still produce compact TOML. Previously the shape only
    carried default-mode tiers, so admin-edited priority pricing was
    silently dropped on export.
  • Bilingual Starlight documentation site — the placeholder docs
    template is replaced with a comprehensive site covering the whole
    gproxy stack. 25 pages per locale (English + 简体中文), all validated
    against the source rather than inferred from READMEs. Sections:
    Introduction, Getting Started (installation, quick start, first
    request for both aggregated /v1 and scoped /{provider}/v1
    routing), Guides (providers & channels, models & aliases, users &
    API keys, permissions / rate limits / quotas, rewrite rules, Claude
    prompt caching, adding a channel, embedded console, observability),
    Reference (env vars, TOML config, dispatch table, database backends,
    graceful shutdown, Rust SDK), and Deployment (release build, Docker).
    Root READMEs rewritten as project overviews pointing at the docs
    site.
  • Pricing reference page — new
    reference/pricing.md in both English and Chinese covers the
    ModelPrice JSON shape, the per-1M-token formula, billing mode
    selection, exact-then-default price matching, and debugging checklist
    for when a price doesn't apply. Linked from guides/models.md and
    from the Starlight sidebar.
  • Unit tests for the new pricing and usage paths — an
    unknown-provider branch assertion on set_model_pricing.
  • Batch delete mode across 5 admin tables — the Users, User Keys,
    My Keys, Models, and Rewrite Rules lists gain a reusable "batch"
    toggle. Activating it swaps per-row delete buttons for checkboxes and
    surfaces a [Select all] [Clear] [Delete N] [Exit] action bar.
    Confirmation goes through window.confirm, matching existing delete
    UX. Four of the five tables reuse existing */batch-delete handlers
    already exposed by crates/gproxy-api/src/admin/mod.rs; the fifth
    (/user/keys/batch-delete) is new — user-scoped with an up-front
    ownership check against keys_for_user to prevent cross-user key
    deletion. Rewrite rules batch delete is purely client-side (filters
    the in-memory rewrite_rules JSON) since that resource has no
    backend CRUD. Implementation is factored into two shared primitives
    in frontend/console/src/components/: a generic useBatchSelection
    hook (selection state, stale-key pruning on row refetch, confirm +
    delete orchestration) and a presentational BatchActionBar.
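As a rough illustration of the per-1M-token tier math the pricing reference above describes — the struct below is a simplified stand-in for the ModelPrice tier shape (the real tiers carry six per-token unit prices), not the actual definition:

```rust
// Simplified stand-in for one pricing tier; two unit prices instead of six.
#[derive(Clone, Copy)]
struct PriceTier {
    input_tokens_up_to: Option<u64>, // None = open-ended top tier
    input_price_per_m: f64,          // USD per 1M input tokens
    output_price_per_m: f64,         // USD per 1M output tokens
}

/// Pick the first tier whose cap covers the request's input size, then
/// charge input and output tokens at that tier's per-1M rates.
fn estimate_cost(tiers: &[PriceTier], input_tokens: u64, output_tokens: u64) -> Option<f64> {
    let tier = tiers.iter().find(|t| match t.input_tokens_up_to {
        Some(cap) => input_tokens <= cap,
        None => true,
    })?;
    Some(
        input_tokens as f64 / 1_000_000.0 * tier.input_price_per_m
            + output_tokens as f64 / 1_000_000.0 * tier.output_price_per_m,
    )
}
```

With a 200k-cap tier at $3 / $15 per 1M tokens, a 100k-input, 10k-output request prices at 0.1 × 3 + 0.01 × 15 ≈ $0.45.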

Changed

  • ModelsTab model-pricing field — replaced price_each_call +
    price_tiers_json text inputs with the new structured
    PricingEditor / JSON textarea toggle. MemoryModelRow and
    ModelWrite TS types now expose pricing_json instead of the two
    legacy fields; the legacy fields remain on ModelWrite as nullable
    for API-schema compatibility but are always written as null by the
    console. i18n strings common.priceEachCall /
    common.priceTiersJson removed.
  • Atomic admin upsert validation — batch_upsert_models now
    pre-validates every item's pricing_json before writing any of
    them, so a malformed entry halfway through a batch no longer leaves
    half of the DB updated.
  • push_pricing_to_engine is best-effort / last-writer-wins —
    documented as such so future readers don't reach for a mutex. Logs
    a warn! when set_model_pricing returns false (i.e. the
    provider is missing from the engine store), so the no-op state
    surfaces instead of being silent.
  • Responsive breakpoints tightened across admin modules — most
    admin pages used xl:grid-cols (1280px) for sidebar+content splits
    and lg:grid-cols-2 (1024px) for forms, so common laptop widths
    collapsed to a single wasteful column. Drop those to lg: / md:
    so the intended two-column layouts appear at 1024px / 768px; add
    sm: fallback to 6-field filter grids; let 8-metric rows shrink to
    1 column on small phones; scope the mobile full-width .btn rule to
    .toolbar-shell so inline table/card buttons stay compact; cap
    toast min-width to the viewport; and give the suffix-dialog modal
    padding so it no longer hugs the screen edge on phones.
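The atomic upsert validation above boils down to a validate-all-then-write pass; parse_pricing below is a toy stand-in for the real pricing_json deserialization, and the Vec stands in for the store:

```rust
/// Toy stand-in for the real pricing_json check (which deserializes into
/// ModelPrice); here we only require a JSON-object-looking string.
fn parse_pricing(raw: &str) -> Result<(), String> {
    if raw.trim_start().starts_with('{') {
        Ok(())
    } else {
        Err(format!("invalid pricing_json: {raw}"))
    }
}

/// Validate every item before mutating anything, so one malformed entry
/// halfway through the batch cannot leave the store half-updated.
fn batch_upsert(store: &mut Vec<String>, items: &[&str]) -> Result<(), String> {
    for item in items {
        parse_pricing(item)?; // reject the whole batch up front
    }
    for item in items {
        store.push((*item).to_string()); // only now write
    }
    Ok(())
}
```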

Fixed

  • UsageModule query button stuck on "querying" — UsageModule
    (admin) and MyUsageModule (user) shared a single queryTokenRef
    between their summary and rows effects. When setActiveQuery fired
    both effects, the rows effect bumped the counter before the summary
    request resolved, so the summary's .finally() check
    (queryTokenRef.current === token) failed and setLoadingMeta(false)
    was never called — pinning the button on "querying" forever. Split
    into summaryTokenRef + rowsTokenRef so the cancellation tokens
    are independent, matching the pattern in useRequestsModuleState.
  • x-title and http-referer headers leaked upstream — added to
    the request-header denylist in both
    gproxy-server/src/middleware/sanitize.rs and
    sdk/gproxy-routing/src/sanitize.rs, so OpenRouter-style client
    metadata stops reaching upstream channels that might reject or log
    it.
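A denylist pass of this kind reduces to a case-insensitive name filter; the sketch below uses a small illustrative subset of headers, not the full HEADER_DENYLIST:

```rust
use std::collections::HashSet;

/// Drop denylisted request headers before forwarding upstream. The list
/// here is an illustrative subset, not gproxy's actual denylist.
fn sanitize_headers(headers: Vec<(String, String)>) -> Vec<(String, String)> {
    let denylist: HashSet<&str> =
        ["authorization", "x-title", "http-referer", "anthropic-version"].into();
    headers
        .into_iter()
        // HTTP header names compare case-insensitively, hence the lowercase.
        .filter(|(name, _)| !denylist.contains(name.to_ascii_lowercase().as_str()))
        .collect()
}
```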

Removed

  • Legacy price_each_call + price_tiers_json columns on models
    — the two columns are removed from the SeaORM entity,
    ModelQueryRow, ModelWrite, store_query/admin, write_sink, and
    write/event. Pricing lives in pricing_json only. The 2.3→2.4
    transition intentionally left the legacy columns on disk temporarily
    to allow a backfill; this release retires them.
  • Update source configuration — update_source TOML field,
    related i18n messages, admin types, and the
    .github/workflows/release-binary.yml internal update server flow
    are removed. The standalone DownloadsPage.astro is gone; docs
    download links now point at GitHub Releases.
  • Orphan frontend ModelsModule — the module was wired into
    app/modules.tsx's activeModule switch as case "models", but
    buildAdminNavItems never emitted a nav item for "models", so it
    was unreachable. Admin model management already lives inside the
    provider workspace's Models tab.
  • PriceTier from gproxy-core — downstream consumers use
    gproxy_sdk::provider::billing::ModelPriceTier instead.

Compatibility

  • DB schema: models.pricing_json is a pure column add, picked up
    by the SeaORM schema-sync step on startup. Existing rows get NULL
    and fall back to whatever ModelPrice the provider compiled in. The
    legacy price_each_call and price_tiers_json columns are
    removed from the entity — if you are upgrading a DB that still
    has data in those columns, migrate them into pricing_json before
    pointing v1.0.6 at the DB. A clean install via TOML seed is not
    affected.
  • Admin clients: upsert payloads now carry pricing_json: string | null. Legacy price_each_call / price_tiers_json fields remain
    on the admin API as nullable for schema compatibility, but the
    backend no longer reads them — clients should stop sending them and
    send pricing_json instead.
  • TOML exports: pricing blocks now include the extra flex / scale
    / priority fields when set. Existing TOML files without those fields
    continue to import cleanly.
  • Self-update source is now hardcode...

v1.0.5

12 Apr 07:21


v1.0.5

Major refactor. Two sibling releases worth of architectural cleanup
condensed into one tag: the suffix system is deleted, the models and
model_aliases DB tables are merged, rewrite-rule/billing ownership
moves from the engine into the handler, and request-time model
resolution finally makes permission → rewrite → alias → execute
the single canonical order. No automated migration is shipped — old
model_aliases rows are re-imported into the unified models table on
startup when a TOML seed is present, otherwise re-enter them from the
console once v1.0.5 is running.

English

Added

  • Model aliases injected into model_list / model_get responses — aliases
    are now first-class entries: they appear in the OpenAI / Claude / Gemini
    model-list responses (both scoped and unscoped) alongside real models,
    GET /v1/models/{alias} resolves to the alias, and non-stream responses
    have their "model" field rewritten to the alias name the client sent
    (streaming chunks are rewritten per chunk in the engine).
  • Suffix-aware alias resolution — an alias like gpt4-fast is resolved
    by trying an exact match first, then stripping any known suffix from the
    tail, looking up the base alias, and re-appending the suffix before
    forwarding to the upstream model. (Subsequently removed along with the
    whole suffix system, but the alias+suffix combo kept working via
    channel-level rewrite rules until then.)
  • Unified model table — model_aliases is merged into models with a
    new alias_of: Option<i64> column. A row with alias_of = NULL is a
    real model; a row with alias_of = Some(id) is an alias pointing at
    another row's id in the same table. The alias lookup structure
    (HashMap<String, ModelAliasTarget>) is unchanged — it is simply
    rebuilt from the unified models snapshot at startup / reload.
  • POST /admin/models/pull — admin endpoint that fetches a provider's
    live model list from upstream and returns the model ids. The console
    uses this to populate the local models table via a new "Pull Models"
    button in the provider workspace's Models tab. Pulled models are
    imported as real entries (alias_of = NULL) with no pricing, which the
    admin can then edit.
  • Model List / Local dispatch for model_list / model_get — the
    *-only dispatch template presets (chat-completions-only, response-only,
    claude-only, gemini-only) default model_list and model_get to the
    Local dispatch implementation. Requests served locally never hit
    upstream; the handler builds the protocol-specific response body
    directly from the models table. GproxyEngine::is_local_dispatch(...)
    lets handlers decide before calling engine.execute.
  • Local merge for non-Local dispatch — for *-like / pass-through
    dispatch, the proxy still calls upstream for model_list, but the
    response is merged with the local models table before being returned:
    local real models that aren't in the upstream response get appended,
    then aliases mirror their target entry. model_get checks the local
    table first and returns the local entry if present, otherwise falls
    through to upstream. This works across OpenAI / Claude / Gemini
    protocols, scoped and unscoped.
  • Alias-level pricing fallback — billing now tries to price a request
    against the alias name first and falls back to the resolved real model
    name if no alias-level pricing exists. Admins can set a custom
    price_each_call / price_tiers_json on an alias row to override the
    real model's pricing for that alias only.
  • Provider workspace: dedicated Rewrite Rules tab — rewrite rules
    moved out of the Config tab's settings JSON editor into their own
    provider-workspace tab (/providers/:name → "Rewrite Rules"). The
    editor is a two-column list + detail layout: the left column shows all
    rules with a scrollbar (max ~10 visible), the right column shows path /
    action / JSON value / filter (model glob + operation / protocol chips)
    for the selected rule. Data still lives in provider.settings_json.
  • Provider workspace: unified "Models" tab — the separate "Models"
    (pricing) and "Model Aliases" tabs are merged into a single "Models"
    tab that lists both real models and aliases in the same scrollable
    list. Alias rows are shown with an "alias" badge and a → target
    indicator, and three filter buttons (All / Real only / Aliases only)
    control what is visible. The edit form has an alias_of dropdown for
    picking an alias target, and the pull-models flow is embedded in the
    same tab.
  • "+ Add Suffix Variant" dialog in the Models tab — when a real
    model is selected, a new button opens a dialog that mirrors the old
    composable suffix system: the user picks one entry per group
    (thinking / reasoning / service tier / effort / verbosity / ...), the
    dialog computes a combined suffix string and a list of rewrite-rule
    actions, and on confirm it atomically creates an alias row
    (alias_of = base.id, model_id = base + suffix) and appends the
    rewrite rules to the provider's settings_json with
    filter.model_pattern scoped to the new alias name. Presets cover
    everything the deleted Rust suffix module supported except the Claude
    header-modifying suffixes (-fast, -non-fast, -1m, -200k),
    which rewrite rules can't express.
  • Rewrite rules editor: typed value input — the "Set" action no
    longer forces admins to hand-write JSON. A Type dropdown
    (string / number / boolean / null / array / object) switches the
    value editor between a plain text input, numeric input, boolean
    dropdown, null placeholder, or JSON textarea (for arrays/objects).
    Switching type resets the value to a sensible default for the new
    type.
  • Rewrite rules editor: model-pattern autocomplete — focusing the
    model_pattern input shows a scrollable dropdown of matching model
    names (real + aliases) for the current provider. Typing filters the
    list by substring without auto-completing the input; clicking an
    entry fills in the pattern exactly.
  • Pricing-by-alias in the billing pipeline — the engine now exposes
    build_billing_context / estimate_billing as public methods, and the
    handler builds the billing context in the handler layer with the
    alias name visible so per-alias pricing takes effect.
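The exact-then-suffix-strip lookup described under "Suffix-aware alias resolution" above can be sketched as follows — the map type and suffix list are simplified stand-ins for the real alias structures:

```rust
use std::collections::HashMap;

// Illustrative suffix list; the real set came from the suffix groups.
const KNOWN_SUFFIXES: &[&str] = &["-fast", "-thinking-high"];

/// Exact alias match first; otherwise strip a known suffix from the tail,
/// resolve the base alias, and re-append the suffix to the upstream name.
fn resolve_alias(aliases: &HashMap<&str, &str>, requested: &str) -> Option<String> {
    if let Some(target) = aliases.get(requested) {
        return Some((*target).to_string());
    }
    for suffix in KNOWN_SUFFIXES {
        if let Some(base) = requested.strip_suffix(suffix) {
            if let Some(target) = aliases.get(base) {
                return Some(format!("{target}{suffix}"));
            }
        }
    }
    None
}
```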

Changed

  • Request pipeline ordering: permission check (original model name) → rewrite_rules (original model name) → alias resolve → engine.execute → billing. Permission is checked against the name the client sent
    (before any alias rewrite), so admins must explicitly whitelist each
    alias — aliases do not silently inherit their target's permissions.
  • Rewrite rules moved out of the engine into the handler layer. The
    engine no longer applies rewrite_rules; instead the handler calls
    state.engine().rewrite_rules(provider) and applies them to the
    request body itself, using the original model name for
    model_pattern matching so patterns like gpt4-fast can match before
    the name is rewritten by alias resolution.
  • Billing moved out of the engine into the handler layer. The engine
    no longer computes cost / billing_context / billing on its
    ExecuteResult; those fields are gone. Handlers now call
    engine.build_billing_context(...) and engine.estimate_billing(...)
    directly after the upstream call returns, which is also what makes
    pricing-by-alias possible.
  • Provider proxy responses rewrite the "model" field to the alias
    name the client sent, using the engine's new response_model_override
    field on ExecuteRequest. The suffix rewrite (when still present) was
    skipped when the alias override was about to overwrite the same field,
    avoiding a redundant JSON parse / serialize per request.
  • model_alias_middleware simplified — the middleware now does a
    single exact alias lookup and drops the ResolvedAlias.suffix field;
    all suffix+alias combo handling has been removed along with the suffix
    system.
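Taken together, the reordering above gives a handler-side flow roughly like this sketch — every function here is a placeholder for the real handler/engine API; only the ordering is the point:

```rust
// Placeholder pipeline: permission → rewrite → alias resolve → execute.
fn check_permission(original_model: &str) -> Result<(), String> {
    // 1. Checked against the name the client sent, so each alias must be
    //    whitelisted explicitly — aliases don't inherit target permissions.
    if original_model.is_empty() {
        Err("model not permitted".to_string())
    } else {
        Ok(())
    }
}

fn apply_rewrite_rules(body: &str, _original_model: &str) -> String {
    // 2. model_pattern matching uses the ORIGINAL name, before alias
    //    resolution rewrites it. (No-op stand-in.)
    body.to_string()
}

fn resolve_alias(original_model: &str) -> String {
    // 3. Map the client-facing alias to the upstream model name.
    original_model.strip_suffix("-alias").unwrap_or(original_model).to_string()
}

fn handle(original_model: &str, body: &str) -> Result<String, String> {
    check_permission(original_model)?;
    let body = apply_rewrite_rules(body, original_model);
    let upstream_model = resolve_alias(original_model);
    // 4. engine.execute with the resolved name; 5. billing runs after,
    //    alias-aware because the handler still holds the original name.
    Ok(format!("execute {upstream_model}: {body}"))
}
```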

Fixed

  • /admin/models/pull returning HTTP 500 — the endpoint was cloning
    the admin request's headers (including Authorization: Bearer <admin-token>, Content-Length, Host) and forwarding them to the
    upstream, which either corrupted the body length or overrode the
    channel-supplied credentials. Pull now passes an empty HeaderMap so
    the channel's finalize_request is the only source of upstream
    headers. Error messages include the first 500 characters of the
    upstream response body so failures are debuggable.
  • Pull-models button was unreachable — the button lived in the
    standalone ModelAliasesModule route, but the sidebar never linked to
    that route. Moved it into the provider-workspace Aliases tab (and
    eventually into the unified Models tab), where it actually renders.
  • Models tab scrolling — the provider-workspace Models tab now has a
    max-h-128 scrollable list so long model lists stay usable.
  • custom channel: mask_table — the mask_table field was
    removed from the backend long ago, but the frontend custom-channel
    form still rendered a dead JSON editor. Removed from
    channel-forms.ts.

Removed

  • Suffix system — the entire sdk/gproxy-provider/src/suffix.rs
    module (801 lines) is deleted, along with the enable_suffix field
    and ChannelSettings::enable_suffix / ProviderRuntime::enable_suffix
    methods on all 14 channels. Response / streaming suffix rewriting,
    suffix-based model-list expansion, the suffix groups, and all
    match_suffix_groups / strip_model_suffix_in_body /
    rewrite_model_suffix_in_body / expand_model_list_with_suffixes /
    rewrite_model_get_suffix_in_body helpers — gone. The same feature
    (gpt4 vs gpt4-fast etc.) is now expressed as separate alias rows
    with channel-level rewrite rules.
  • /admin/model-aliases/* endpoints and model_aliases DB table
    ...

v1.0.4

11 Apr 16:03


v1.0.4

English

Added

  • Channel-level rewrite rules — new rewrite_rules field on all 14
    channel Settings structs allows per-channel request body rewriting before
    the request is finalized. Rules support JSON path targeting with glob
    matching. A dedicated RewriteRulesEditor component with full i18n is
    available in the console.
  • Dispatch template presets for custom channel — the console now offers
    built-in dispatch template presets when configuring custom channels,
    and dispatch templates are shown for all channel types (not just custom).
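Glob matching on a model_pattern reduces to a small recursive matcher; this sketch supports only the * wildcard and is byte-oriented, which may differ from gproxy's actual matcher:

```rust
/// Minimal glob matcher supporting only the `*` wildcard; byte-oriented,
/// which is fine for ASCII model ids.
fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard left: the remainder must match exactly.
        None => pattern == name,
        Some((prefix, rest)) => {
            if !name.starts_with(prefix) {
                return false;
            }
            let tail = &name[prefix.len()..];
            // Let `*` absorb 0..=tail.len() bytes and recurse on the rest.
            (0..=tail.len()).any(|i| glob_match(rest, &tail[i..]))
        }
    }
}
```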

Fixed

  • Request log query button stuck on loading — the query button no longer
    gets permanently stuck in loading state.
  • HTTP client protocol negotiation — removed http1_only restriction and
    enabled proper HTTP/1.1 support for client builders, improving compatibility
    with upstream providers behind HTTP/1.1-only proxies.
  • Sampling parameter stripping — model-aware stripping for
    anthropic/claudecode channels ensures unsupported sampling parameters are
    correctly removed based on the target model.
  • Dispatch template passthrough — *-only dispatch templates now correctly
    use passthrough+transform for model_list / model_get operations.
  • Session-expired toast suppressed — the error toast for expired sessions
    is now suppressed before the page reload, preventing a flash of error UI.
  • Update-available toast color — changed from error-red to green success
    style.
  • Noisy ORM logging — sqlx and sea_orm log levels now default to
    warn, reducing log noise at startup and during normal operation.
  • Dispatch / sanitize rules overflow — both panels now scroll when content
    exceeds the viewport instead of overflowing the layout.
  • Upstream proxy placeholder — the upstream proxy input field now shows a
    placeholder hint.
  • Frontend i18n — alias, enable_suffix, enable_magic_cache labels
    are now properly translated; "模型" renamed to "模型价格表" / "Model Pricing";
    sanitize_rules renamed to "消息重写规则" / "Message Rewrite Rules".


v1.0.3

11 Apr 08:06


v1.0.3

English

Added

  • Suffix system for model-list / model-get — suffix modifiers (e.g. -thinking-high, -fast) are now expanded in model list responses and rewritten in model get responses, so clients can discover available suffix variants.
  • Suffix per-channel toggle — new enable_suffix setting lets operators enable/disable suffix processing per channel.
  • VertexExpress local model catalogue — model list/get requests are served from a static model catalogue embedded at compile time, since Vertex AI Express does not expose a standard model-listing endpoint.
  • Vertex SA token bootstrap on credential upsert — when a Vertex credential with client_email and private_key is added via the admin API, the access token is automatically obtained so the first request has valid auth.

Fixed

  • GeminiCLI / Antigravity model list — both channels now correctly route model list/get through their respective quota/model endpoints (retrieveUserQuota for GeminiCLI, fetchAvailableModels for Antigravity) and normalize responses to standard Gemini format.
  • Vertex model list normalization — Vertex AI returns publisherModels with full resource paths; responses are now converted to standard Gemini models format.
  • Vertex / VertexExpress header filteringanthropic-version and anthropic-beta headers are dropped before forwarding to Google endpoints.
  • Vertex GeminiCLI-style User-Agent — Vertex requests now send proper User-Agent and x-goog-api-client headers matching Gemini CLI traffic.
  • Engine HTTP client proxy — database proxy settings now take effect after bootstrap; previously the engine client was built before DB config was loaded.
  • Engine HTTP/1.1 for standard client — the non-spoof wreq client uses http1_only() for reliable proxy traversal.
  • HTTP client request dispatch — switched from wreq::Request::from() + execute() to client.request().send() to ensure proxy/TLS settings propagate correctly.
  • Frontend: VertexExpress credential — field changed from access_token to api_key.
  • Frontend: Vertex credential — added missing optional fields (private_key_id, client_id, token_uri).
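The publisherModels normalization above amounts to trimming the resource path down to the model id; the path shape assumed below follows Vertex AI's publishers/{publisher}/models/{model} naming and is an illustration, not the actual conversion code:

```rust
/// Trim a Vertex publisherModels resource path down to the plain
/// "models/{id}" name the standard Gemini models API returns.
fn normalize_publisher_model(resource: &str) -> Option<String> {
    // Keep only what follows the last "/models/" segment.
    let (_, id) = resource.rsplit_once("/models/")?;
    if id.is_empty() {
        return None;
    }
    Some(format!("models/{id}"))
}
```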


v1.0.2

10 Apr 17:09


v1.0.2

English

Added

  • WebSocket per-model usage tracking — when the client switches models mid-session (e.g. via response.create), usage is segmented per model and recorded separately instead of attributing all tokens to the last model.
  • WebSocket upstream message logging — WS session end now records an upstream request log containing all client→server and server→client messages as request/response body.
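Per-model segmentation is essentially "attribute each usage delta to the model active at that moment"; a minimal sketch with illustrative names (not the actual WS session types):

```rust
use std::collections::HashMap;

/// Per-model usage segmentation for a WS session: each usage delta is
/// credited to whichever model was active when it arrived.
#[derive(Default)]
struct SessionUsage {
    active_model: Option<String>,
    per_model_tokens: HashMap<String, u64>,
}

impl SessionUsage {
    /// Called when the client switches models (e.g. via response.create).
    fn switch_model(&mut self, model: &str) {
        self.active_model = Some(model.to_string());
    }

    /// Record a usage delta against the currently active model.
    fn record(&mut self, tokens: u64) {
        if let Some(model) = &self.active_model {
            *self.per_model_tokens.entry(model.clone()).or_insert(0) += tokens;
        }
    }
}
```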
