Releases: LeenHawk/gproxy

staging

01 Mar 15:35

staging Pre-release

Automated staging build from 62422faf88a19178ddfa97723abaabcf4d891c75.

v1.0.10

14 Apr 16:45

Two focused fixes from the v1.0.9 fallout: claudecode OAuth refresh was broken against Anthropic's token endpoint and left credentials permanently dead, and the sanitize middleware was leaking anthropic-version through so every upstream request carried a duplicated header.

Fixed

  • claudecode OAuth refresh actually works again. The v1.0.9 gproxy-channel refactor routed refresh_credential's refresh_token path through the generic oauth2_refresh::refresh_oauth2_token helper, which posts grant_type=refresh_token&refresh_token=... (no client_id, no anthropic headers) to https://console.anthropic.com/v1/oauth/token. Anthropic's token endpoint rejects that shape with invalid_request_error: Invalid request format, so any credential with a refresh_token but no cookie fallback was stuck dead forever — the 401 → refresh → retry loop would fail every time. Replaced with exchange_tokens_with_refresh_token in claudecode_cookie.rs, which posts the CLI-matching shape to {api_base}/v1/oauth/token (form body with client_id=9d1c250a-... and headers anthropic-version: 2023-06-01 / anthropic-beta: oauth-2025-04-20 / user-agent: claude-cli/...).
  • Pre-flight credential refresh. Added Channel::needs_refresh as a new trait hook (default false). claudecode overrides it to return true when access_token is empty, expires_at_ms is already past, or expiry is within a 60s skew window. The retry loop now calls refresh_credential up-front for such credentials and proceeds with the fresh token, skipping the otherwise-guaranteed 401 round-trip. Errors from the pre-flight are logged and swallowed — the existing AuthDead path still catches anything that slips through.
  • anthropic-version no longer duplicated on upstream requests. The request sanitize middleware's HEADER_DENYLIST was already stripping authorization / user-agent / content-type / etc. from the downstream request before the channel forwarding loop ran — but anthropic-version was missing from the list. Since http::request::Builder::header appends rather than replaces, the client-forwarded copy ended up alongside the channel's own value, producing anthropic-version: 2023-06-01 twice on the wire. Added to the denylist.
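The pre-flight check described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the real `Channel::needs_refresh` implementation: the `Credential` struct and helper are stand-ins, with only the field names (`access_token`, `expires_at_ms`) and the 60-second skew window taken from the notes.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Simplified stand-in for the credential state described in the notes;
// field names follow the release text, everything else is illustrative.
struct Credential {
    access_token: String,
    expires_at_ms: u64,
}

const REFRESH_SKEW_MS: u64 = 60_000; // the 60s skew window from the notes

fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_millis() as u64
}

// Mirrors the described override: refresh when the token is missing,
// already expired, or expiring within the skew window.
fn needs_refresh(cred: &Credential) -> bool {
    cred.access_token.is_empty() || cred.expires_at_ms <= now_ms() + REFRESH_SKEW_MS
}

fn main() {
    let fresh = Credential { access_token: "tok".into(), expires_at_ms: now_ms() + 3_600_000 };
    let expiring = Credential { access_token: "tok".into(), expires_at_ms: now_ms() + 30_000 };
    let empty = Credential { access_token: String::new(), expires_at_ms: now_ms() + 3_600_000 };
    assert!(!needs_refresh(&fresh));
    assert!(needs_refresh(&expiring)); // inside the 60s skew window
    assert!(needs_refresh(&empty));
}
```

Calling a refresh up-front for any credential matching this predicate is what skips the otherwise-guaranteed 401 round-trip.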

Compatibility

  • Drop-in upgrade from v1.0.9. No DB migration, no HTTP API change, no config change. SDK consumers are unaffected — no public types or module paths moved.


v1.0.9

14 Apr 15:32

The SDK splits into four publishable crates — gproxy-protocol, gproxy-channel, gproxy-engine, gproxy-sdk — with real per-channel feature pruning, a standalone execute_once single-request client for single-provider use, and no DB / API / config changes for binary operators.

Added

  • Four publishable SDK crates — gproxy-protocol (L0 wire types + transforms), gproxy-channel (L1 Channel trait, 14 concrete channels, credentials, execute_once pipeline), gproxy-engine (L2 GproxyEngine, provider store, retry, affinity, routing helpers), and gproxy-sdk (facade re-exporting all three). Every SDK crate now carries complete crates.io metadata (license, readme, keywords, categories) and a per-crate README with a common layering table.
  • execute_once / execute_once_stream in gproxy_channel::executor — a complete single-request pipeline (finalize → sanitize → rewrite → prepare_request → HTTP send → normalize → classify) you can drive with just gproxy-channel as a dependency. Comes with lower-level prepare_for_send / send_attempt / send_attempt_stream helpers for users who want to write their own retry loop.
  • apply_outgoing_rules helper — the single in-tree invocation point for apply_sanitize_rules + apply_rewrite_rules. Engine, API handler, and L1 executor all funnel through one body-mutation helper instead of each re-implementing the JSON round-trip.
  • CommonChannelSettings (#[serde(flatten)]) — every channel now embeds one common struct holding user_agent, max_retries_on_429, sanitize_rules, rewrite_rules instead of each of the 14 channels copy-pasting the same four fields and trait method overrides. TOML / JSON wire format is unchanged.
  • Runtime transform dispatcher as public L0 API — gproxy_protocol::transform::dispatch::{transform_request, transform_response, create_stream_response_transformer, nonstream_to_stream, stream_to_nonstream, convert_error_body_or_raw}. External users who only want protocol conversion can now depend on gproxy-protocol alone and get everything without pulling wreq or tokio.
  • hello_openai example in sdk/gproxy-channel/examples/ — a minimal single-file demo of execute_once that runs against real OpenAI with OPENAI_API_KEY. Compiles under --no-default-features --features openai as a smoke test that single-channel use really only pulls one channel.
  • Integration test for execute_once — spins up a local axum mock server, points OpenAiSettings::base_url at it, runs the full L1 pipeline, and asserts on both request side (Bearer token, body) and response side (status, classification, JSON).
  • Optional label field on provider — free-text display name shown in the console alongside the internal provider name.

Changed

  • TransformError now carries Cow<'static, str> messages so the runtime dispatcher can produce dynamically-built errors (format!("no stream aggregation for protocol: {protocol}")) without allocating a new TransformError variant. Existing TransformError::not_implemented("literal") call sites keep working; new TransformError::new(impl Into<String>) constructor handles the dynamic case.
  • store.rs split — the 1564-line gproxy-engine/src/store.rs is now store/{mod,public_traits,runtime,types}.rs so the main ProviderStore orchestrator, the internal ProviderRuntime trait + ProviderInstance<C> generic implementation, the public traits, and the value types each live in their own file.
  • Lock-step SDK versioning — all four SDK crates follow workspace.package.version; release.sh's cargo set-version bump propagates to every [package] inherit plus the four workspace.dependencies.gproxy-*.version entries at once. The release strategy + manual publish recipe is documented inline in the root Cargo.toml.
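The Cow-based message change above can be illustrated with a self-contained sketch. The real TransformError has more variants and context; only the Cow storage and the two constructor shapes named in the notes are modeled here.

```rust
use std::borrow::Cow;
use std::fmt;

// Illustrative sketch of the Cow<'static, str> message storage described
// above; the real TransformError type is richer than this.
#[derive(Debug)]
struct TransformError {
    message: Cow<'static, str>,
}

impl TransformError {
    // Static literal: no allocation, mirrors not_implemented("literal") call sites.
    fn not_implemented(what: &'static str) -> Self {
        TransformError { message: Cow::Borrowed(what) }
    }
    // Dynamic case: owns the formatted String.
    fn new(msg: impl Into<String>) -> Self {
        TransformError { message: Cow::Owned(msg.into()) }
    }
}

impl fmt::Display for TransformError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

fn main() {
    let fixed = TransformError::not_implemented("stream aggregation");
    let protocol = "gemini";
    let dynamic = TransformError::new(format!("no stream aggregation for protocol: {protocol}"));
    assert_eq!(fixed.to_string(), "stream aggregation");
    assert_eq!(dynamic.to_string(), "no stream aggregation for protocol: gemini");
}
```

Cow keeps the common literal path allocation-free while still permitting dynamically built messages.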

Fixed

  • Per-channel feature flags now actually prune — the openai, anthropic, … channel feature flags on gproxy-channel, gproxy-engine, and gproxy-sdk were declared in v1.0.8 but non-functional. cargo build --no-default-features --features openai compiled all 14 channels anyway, because (a) the upstream gproxy-channel dep didn't opt out of default-features, so the default all-channels came in regardless; (b) gproxy-engine's all-channels feature only forwarded to gproxy-channel/all-channels and didn't enable its own per-channel features, so the #[cfg(feature = "…")] gates would have been false even if they existed; and (c) the gates didn't exist on engine's hardcoded match arms in built_in_model_prices, validate_credential_json, GproxyEngineBuilder::add_provider_json, ProviderStore::add_provider_json, and bootstrap_credential_on_upsert. All three fixed in this release, and cargo build -p gproxy-sdk --no-default-features --features openai now genuinely compiles only the single requested channel.
  • Pricing editor in the console collapses into a single triangle disclosure — the nested editor no longer cascades open by accident.
  • Dispatch template description now clarifies that it describes the upstream protocol, not the downstream-client shape.
  • Claude Code OAuth beta badge drops the misleading "always" suffix; the badge just shows the beta name now.
  • Self-update button and its success toast are now localized.
  • Doc-comment clippy lint (doc_lazy_continuation) on gproxy-engine crate doc no longer fails cargo clippy -- -D warnings.
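The feature-pruning fix boils down to a Cargo manifest pattern. The fragment below is a hypothetical excerpt, not the actual gproxy manifest: crate versions and the channel list are illustrative, but it shows points (a) and (b) from the bullet above, namely that the dependency must opt out of default features and that each per-channel feature must be forwarded explicitly.

```toml
# Hypothetical gproxy-engine manifest excerpt illustrating the pattern.
[dependencies]
# (a) Without default-features = false, the dep's default all-channels
# set comes in regardless of what the caller asked for.
gproxy-channel = { version = "1.0.9", default-features = false }

[features]
default = ["all-channels"]
# (b) Forward per-channel features so #[cfg(feature = "openai")] gates in
# this crate line up with the channel actually compiled downstream.
openai = ["gproxy-channel/openai"]
anthropic = ["gproxy-channel/anthropic"]
all-channels = ["openai", "anthropic"] # ... plus the other channels
```

With both in place, `cargo build --no-default-features --features openai` compiles only the requested channel, provided point (c) holds: the engine's match arms are also behind the matching `#[cfg(feature = "...")]` gates.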

Removed

  • gproxy-provider crate — the old aggregator that mixed single-channel access with the multi-channel engine. Its content is now split between gproxy-channel (L1) and gproxy-engine (L2).
  • gproxy-routing crate — merged into gproxy-engine::routing (classify, permission, rate_limit, provider_prefix, model_alias, model_extraction, headers / former sanitize.rs).
  • Deprecated gproxy_sdk::provider / gproxy_sdk::routing module aliases — use gproxy_sdk::channel::*, gproxy_sdk::engine::*, gproxy_sdk::engine::routing::* instead.
  • Unused ProviderDefinition type — dead code with no consumers.
  • gproxy-engine::transform_dispatch passthrough — engine now calls gproxy_protocol::transform::dispatch::* directly; the 14-line re-export file is gone.

Compatibility

  • Binary / server operators: drop-in upgrade from v1.0.8. No DB migration, no HTTP API change, no admin client change, no config change.
  • SDK library consumers: breaking change. gproxy_sdk::provider::* and gproxy_sdk::routing::* paths no longer exist. Migrate every import site to gproxy_sdk::channel::*, gproxy_sdk::engine::*, gproxy_sdk::engine::routing::* (for the former routing helpers), or gproxy_sdk::protocol::transform::dispatch::* (for the runtime transform dispatcher). All in-tree downstream consumers have already been migrated.
  • Direct gproxy-provider / gproxy-routing dependencies in downstream Cargo.toml must be replaced with gproxy-channel + gproxy-engine, or just gproxy-sdk if you want the facade.
  • 14 channel Settings structs gained a common: CommonChannelSettings field flattened via serde, so existing TOML / JSON configs deserialize unchanged.
  • crates.io publishing: The four SDK crates are metadata-complete and packaged (verified via cargo publish --dry-run on gproxy-protocol and cargo package --list on the downstream three). Actual publish has NOT happened yet — this release is local to the repo. When you publish, the dependency order is gproxy-protocol → gproxy-channel → gproxy-engine → gproxy-sdk with ~30 s between each step for the registry index to catch up.


v1.0.8

14 Apr 03:53

Cross-protocol error bodies finally make it to the client in the
right schema, orphaned tool_result messages stop breaking Claude
requests, and streaming upstream logs now store the actual wire
bytes.
The headline fix: when a Claude/Gemini/OpenAI upstream
returns a non-2xx error body, the engine now converts it to the
client's declared error shape (e.g. Claude {"type":"error",...},
OpenAI {"error":{...}}) instead of handing the raw JSON to an SDK
that can't parse it — with a raw-bytes fallback when the upstream
shape doesn't match any declared schema. Streaming error responses
finally reach the client too, after a buffer-and-convert fast path
replaces the broken SSE transformer that used to swallow the error
body and emit only [DONE]. On the transform side, a new
push_message_block utility centralizes Claude message-building
invariants across every *→Claude converter, fixing an OpenAI
Responses-API bug where previous_response_id + fresh
function_call_output produced orphaned tool_result blocks and
Claude returned a 400. The console picks up a per-channel
max_retries_on_429 field and a one-click TOML download on the
config export page.

Fixed

  • Non-2xx upstream errors reached clients in the wrong protocol
    schema — each provider uses a different error shape (Claude
    {"type":"error","error":{...}}, OpenAI {"error":{...}}, Gemini
    {"error":{"code":N,...}}), and before this release the engine only
    ran transform_response on 2xx bodies. An OpenAI-speaking client
    that hit a Claude 400 got the raw Claude JSON back, which the SDK
    couldn't parse, so dashboards saw a generic "invalid response" on
    what was really a simple upstream 400 (e.g. prompt is too long).
    sdk/gproxy-provider/src/engine.rs and
    sdk/gproxy-provider/src/transform_dispatch.rs now route error
    bodies through the new convert_error_body_or_raw helper, which
    tries the declared error variant via BodyEnvelope::from_body_bytes
    and falls back to raw upstream bytes on schema mismatch (e.g. codex
    returning {"detail":{"code":"deactivated_workspace"}}, which isn't
    any declared error schema). Claude-error-to-OpenAI-error conversion
    is covered by a new integration test.
  • Streaming endpoints swallowed upstream error bodies — on a
    cross-protocol transform route (e.g. client speaks
    OpenAI-chat-completions, upstream is Claude), a non-2xx upstream
    response was fed to the inline per-chunk SSE transformer, which
    couldn't parse the JSON error body as an SSE frame, yielded nothing,
    and emitted only a synthetic [DONE]. The client saw an empty
    success stream instead of the actual 4xx/5xx error. execute_stream
    now detects !is_success upstream early, buffers the full error
    body (which is always a small complete JSON, not a real SSE
    stream), runs it through convert_error_body_or_raw, and returns
    a single-chunk ExecuteBody::Stream with the converted bytes. The
    raw pre-conversion upstream bytes are still captured for the
    upstream log so operators can see what actually came over the wire.
  • Orphaned tool_result blocks caused Claude 400 on OpenAI
    Responses-API requests — Claude's API requires "each tool_result
    block must have a corresponding tool_use block in the previous
    message," but the OpenAI Responses API lets clients send only
    function_call_output items when using previous_response_id
    (the tool_use side lives in the prior turn, which the client is
    referencing by id instead of resending). The legacy *→Claude
    transforms built messages by blindly pushing blocks, so these
    requests ended up with a leading user/tool_result message and
    no matching assistant/tool_use — Claude returned 400 every
    time. The new push_message_block helper (see Added) synthesizes
    a placeholder tool_use block with the matching id whenever it
    detects an orphaned tool_result, so the request now satisfies
    Claude's pairing rule and goes through.
  • Adjacent same-role messages from multi-block transforms — the
    per-transform push_block_message helpers produced two separate
    user messages for two consecutive tool_result pushes (and
    similarly for assistant blocks), which Claude's API rejects as
    malformed. push_message_block now merges consecutive blocks for
    the same role into a single BetaMessageContent::Blocks message,
    so every *→Claude transform produces a well-formed message list
    by construction.
  • Streaming upstream logs stored post-transform bytes instead of
    pre-transform wire bytes — the handler's old
    accumulated_body: Vec<u8> collected chunks as they were yielded
    downstream, so for cross-protocol routes the response_body
    column in upstream_requests held the converted (OpenAI/Gemini/…)
    bytes, not what Claude/OpenAI-upstream actually sent. This diverged
    from the non-stream path, which stores the pre-transform bytes via
    raw_response_body_for_log. A new stream wrapper
    (wrap_upstream_response_stream) now tees upstream bytes into an
    Arc<Mutex<Vec<u8>>> capture buffer before they reach the
    transformer, and the handler reads it after the stream drains.
    Stream and non-stream paths are now byte-for-byte consistent in
    the upstream log.
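The "convert if the schema matches, otherwise pass raw bytes through" policy behind the first two fixes can be sketched without any JSON machinery. In the sketch below a plain closure stands in for the declared-schema parse that the real helper does via BodyEnvelope, and the "conversion" is faked with string checks; only the fallback policy itself is taken from the notes.

```rust
// Minimal sketch of the convert-or-raw policy described above. The closure
// stands in for parsing the upstream protocol's declared error schema.
fn convert_error_body_or_raw(
    body: Vec<u8>,
    try_convert: impl Fn(&[u8]) -> Option<Vec<u8>>,
) -> Vec<u8> {
    match try_convert(&body) {
        Some(converted) => converted,
        // Schema mismatch (e.g. a codex {"detail":...} body): pass the
        // upstream bytes through unchanged so no error information is lost.
        None => body,
    }
}

fn main() {
    // Stand-in "Claude error -> OpenAI error" converter: only accepts bodies
    // that look like Claude's {"type":"error",...} shape.
    let claude_to_openai = |b: &[u8]| {
        let s = std::str::from_utf8(b).ok()?;
        if s.contains(r#""type":"error""#) {
            Some(br#"{"error":{"message":"prompt is too long"}}"#.to_vec())
        } else {
            None
        }
    };

    let claude = br#"{"type":"error","error":{"message":"prompt is too long"}}"#.to_vec();
    let codex = br#"{"detail":{"code":"deactivated_workspace"}}"#.to_vec();

    let converted = convert_error_body_or_raw(claude, claude_to_openai);
    assert!(std::str::from_utf8(&converted).unwrap().starts_with(r#"{"error""#));

    let raw = convert_error_body_or_raw(codex.clone(), claude_to_openai);
    assert_eq!(raw, codex); // unknown shape falls through untouched
}
```

The streaming fast path applies the same function after buffering the small, complete JSON error body.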

Changed

  • Passthrough streaming fast path — when a stream route has no
    transformer, no raw_capture, and no response_model_override, the
    engine now hands response.body through to the client unwrapped
    instead of going through a per-chunk try_stream! loop. This
    reclaims the passthrough latency that was lost when accumulated_body
    was added. The wrapper is only spliced in when at least one of raw
    capture, transform, or alias rewriting is active.
  • rand 0.9.4 / rand_core 0.10.1 — minor dependency bumps.
    Picks up upstream API cleanups; no gproxy code changes required.
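The raw-capture tee that makes stream and non-stream logs byte-for-byte consistent is easy to picture with a synchronous stand-in. The real wrap_upstream_response_stream wraps an async byte stream; here a plain iterator plays that role, and only the tee-before-transform ordering and the Arc<Mutex<Vec<u8>>> buffer come from the notes.

```rust
use std::sync::{Arc, Mutex};

// Dependency-free sketch: upstream chunks are copied into a shared capture
// buffer before any transformer sees them, so the upstream log records
// pre-transform wire bytes. An iterator stands in for the async stream.
fn tee_chunks<I: Iterator<Item = Vec<u8>>>(
    upstream: I,
    capture: Arc<Mutex<Vec<u8>>>,
) -> impl Iterator<Item = Vec<u8>> {
    upstream.map(move |chunk| {
        capture.lock().unwrap().extend_from_slice(&chunk); // tee first
        chunk // then hand the chunk onward (transform / alias / client)
    })
}

fn main() {
    let capture = Arc::new(Mutex::new(Vec::new()));
    let chunks = vec![b"data: a\n\n".to_vec(), b"data: [DONE]\n\n".to_vec()];
    // Downstream consumer sees the chunks unchanged (passthrough mode)...
    let forwarded: Vec<Vec<u8>> = tee_chunks(chunks.into_iter(), capture.clone()).collect();
    assert_eq!(forwarded.len(), 2);
    // ...while the capture buffer holds the full wire copy for the log.
    assert_eq!(capture.lock().unwrap().as_slice(), b"data: a\n\ndata: [DONE]\n\n");
}
```

The handler reads the buffer only after the stream drains, which is why the field is an Arc the result type can hand back.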

Added

  • convert_error_body_or_raw(src_op, src_proto, dst_op, dst_proto, body) in sdk/gproxy-provider/src/transform_dispatch.rs
    converts an upstream non-2xx body from the upstream protocol's
    error schema to the client's expected error schema via
    transform_response, substituting GenerateContent for streaming
    ops (error bodies share the non-stream schema). Passthrough routes
    (same src/dst protocol and op) skip conversion entirely. On schema
    mismatch the helper logs at debug level with the full
    src_op/src_proto/dst_op/dst_proto context and returns the
    raw bytes so no error information is lost. Three unit tests cover
    Claude→OpenAI rewriting, codex-shape fallback, and the passthrough
    case.
  • ExecuteResult.stream_raw_capture: Option<Arc<Mutex<Vec<u8>>>>
    — new field on the SDK result type, populated by
    execute_stream when enable_upstream_log && enable_upstream_log_body and the route actually sees a raw-capture
    tee. The handler reads the buffer after the stream drains and
    copies it into meta.response_body, so
    upstream_requests.response_body contains the pre-transform wire
    bytes that correspond to what the non-stream path already stored.
    None on passthrough-with-logging-off routes and on the error-body
    fast path's re-use (which seeds the buffer with pre-conversion
    bytes itself).
  • wrap_upstream_response_stream in
    sdk/gproxy-provider/src/engine.rs — single stream-combinator that
    applies, in order: raw-byte tee into raw_capture, optional
    per-chunk StreamResponseTransformer, and optional model-alias
    rewriting. Replaces the previous two inlined try_stream! loops
    (one for transform + alias, one for alias-only) with a unified
    helper whose behaviour is covered by two unit tests
    (wrap_response_stream_tees_raw_bytes_in_passthrough_mode,
    wrap_response_stream_pure_passthrough_yields_chunks_unchanged).
  • push_message_block(messages, role, block) in
    sdk/gproxy-protocol/src/transform/claude/utils.rs — central
    utility for building Claude messages lists from any non-Claude
    source. Maintains two invariants:
    1. Consecutive blocks for the same role are merged into one
      BetaMessageContent::Blocks message (no adjacent same-role
      messages).
    2. Whenever a tool_result block is appended to a user message,
      the immediately-preceding assistant message is checked for a
      matching tool_use block; if none exists, a placeholder
      tool_use (named tool_use_placeholder) is synthesized in the
      assistant slot — either by promoting an existing assistant
      message's content into blocks and appending, or by inserting a
      new assistant message before the trailing user one.
      Exported from transform::claude::utils and re-exported from
      transform::utils so non-Claude callers don't need a cross-module
      dependency. Every *→Claude request transform (gemini,
      openai_chat_completions, openai_response, openai_compact,
      openai_count_tokens) is migrated to call it instead of pushing
      messages directly. Covered by 9 unit tests, including the exact
      orphaned-tool_result shape reported in production.
  • Per-channel max_retries_on_429 setting in ConfigTab — every
    channel's structured editor now exposes an optional integer input
    bound to the backend's per-credential 429-without-retry-after
    retry cap (backend default: 3). Empty input is omitted from the
    settings JSON so the backend default still applies. i18n strings
    added in both locales (field.max_retries_on_429).
  • TOML download button on the config export page — ConfigExport
    module grows a neutral Download button alongside the existing
    Export. Clicking it ships the current export as
    gproxy-config-<ISO-timestamp>.toml via a Blob + <a>-click. If
    the user hasn't clicked Export yet, Download fetches the TOML
    first and then triggers the file save. New i18n key:
    common.download.
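Invariant 1 of push_message_block, merging consecutive same-role blocks into one message, can be sketched with simplified types. These are stand-ins for the real Beta* structures; only the merge rule itself is taken from the notes (the placeholder-tool_use synthesis of invariant 2 is omitted to keep the sketch short).

```rust
// Sketch of invariant 1: consecutive blocks pushed for the same role are
// merged into one message instead of producing adjacent same-role messages,
// which Claude's API rejects as malformed.
#[derive(Debug, PartialEq, Clone)]
struct Message {
    role: &'static str,
    blocks: Vec<String>,
}

fn push_message_block(messages: &mut Vec<Message>, role: &'static str, block: String) {
    match messages.last_mut() {
        // Same role as the previous message: append to its block list.
        Some(last) if last.role == role => last.blocks.push(block),
        // Role change (or empty list): start a new message.
        _ => messages.push(Message { role, blocks: vec![block] }),
    }
}

fn main() {
    let mut msgs = Vec::new();
    // Two consecutive tool_result pushes for the user role...
    push_message_block(&mut msgs, "user", "tool_result #1".into());
    push_message_block(&mut msgs, "user", "tool_result #2".into());
    push_message_block(&mut msgs, "assistant", "text".into());
    // ...collapse into a single user message holding two blocks.
    assert_eq!(msgs.len(), 2);
    assert_eq!(msgs[0].blocks.len(), 2);
    assert_eq!(msgs[1].role, "assistant");
}
```

Because every *→Claude transform funnels through one such helper, the message list is well-formed by construction rather than by per-transform discipline.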

Compatibility

  • No DB, API, or config changes. `settings....

v1.0.7

13 Apr 12:02

Self-update is unbroken, failing transforms finally tell you which
request broke them, and the docs site deploys itself.
The headline
fix centralizes wreq client policy in the engine so every HTTP path
(including self-update) follows redirects — GitHub asset downloads
stop failing on their 302 to the CDN. Pre-upstream transform errors
now capture the original downstream request body in the upstream
log, so operators actually see which JSON failed to parse. The
release pipeline grows a Cloudflare Pages deploy job for the docs
site, and the Docker deployment page is rewritten around the
official ghcr.io/leenhawk/gproxy image.

Fixed

  • Self-update failing with download failed: HTTP 302 Found —
    GitHub serves every /releases/download/... asset as a 302 to the
    CDN host, but wreq's default redirect policy is
    redirect::Policy::none(), so wreq::get(url) in
    crates/gproxy-api/src/admin/update.rs returned the redirect
    response verbatim and download_bytes / download_text rejected
    it at the status().is_success() check. The update path never
    touched the engine client either, so the fresh per-call default
    client inherited none of the runtime configuration.
  • Pre-upstream transform failures lost the request body in logs
    — when transform_dispatch::transform_request failed before we
    ever sent anything upstream (e.g. a malformed tools[] entry
    failing to deserialize into ResponseTool), the error bubbled up
    as ExecuteError { meta: None, .. } and
    record_execute_error_logs wrote an upstream-log row with
    request_body = NULL, leaving operators a 500 with no way to see
    which JSON actually failed. GproxyEngine::execute and
    execute_stream now catch the transform error, clone the
    original downstream body beforehand, and synthesize an
    UpstreamRequestMeta via the new build_transform_error helper
    so the offending body lands in the log. URL / headers /
    response fields stay empty because the request never hit the
    wire; enable_upstream_log / enable_upstream_log_body are still
    honored.

Changed

  • Single source of truth for HTTP client policy — new
    default_http_client() helper in
    sdk/gproxy-provider/src/engine.rs centralizes the global wreq
    client policy (redirect::Policy::limited(10)). Every build path
    now routes through it:
    • GproxyEngineBuilder::build() uses it as the default fallback
      (was self.client.unwrap_or_default()), so bare
      GproxyEngine::builder().build() — used by tests and several
      admin-only bootstrap paths — no longer produces a client that
      drops redirects.
    • configure_clients and with_new_clients set .redirect(...)
      on both the normal and spoof-emulation builders, and their
      Err fallbacks route through default_http_client() instead
      of wreq::Client::default().
      This also closes a latent footgun: if configure_clients ever
      failed to build (bad proxy URL, TLS init error), the process used
      to silently fall back to a fully-unconfigured default client.
      The fallback now at least keeps the redirect policy.
  • update.rs reuses the engine's HTTP client — check_update
    and perform_update grab state.engine().client().clone() and
    pass it through to fetch_github_manifest, download_bytes, and
    download_text. The three helpers no longer call wreq::get(url)
    / wreq::Client::new() at all. Practical upshot: self-update
    traffic now inherits the operator's configured upstream proxy,
    TLS settings, and whatever else the engine is built with —
    previously it silently bypassed all of them.
  • Docker deployment guide rewritten around the official image —
    docs/src/content/docs/deployment/docker.md (and the Chinese
    mirror) now leads with docker pull ghcr.io/leenhawk/gproxy:latest
    instead of "build Dockerfile.action locally," and documents the
    full tag matrix (latest / vX.Y.Z / staging × glibc / musl,
    plus per-arch suffixes). The installation pages cross-reference
    the new guidance so new users don't start by building an image
    they don't need to.
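The single-source-of-truth change is a pattern more than an API, and can be sketched without wreq at all. In the sketch below, Client and redirect_limit are stand-ins, not wreq's real types; only the shape of the fix, one helper owning the policy and every fallback funneling through it, comes from the notes.

```rust
// Pattern sketch only: a single helper owns the client policy, and every
// construction path (builder default, error fallback) funnels through it.
#[derive(Clone, Debug)]
struct Client {
    redirect_limit: u32, // stand-in for wreq's redirect policy
}

fn default_http_client() -> Client {
    Client { redirect_limit: 10 } // mirrors redirect::Policy::limited(10)
}

struct EngineBuilder {
    client: Option<Client>,
}

impl EngineBuilder {
    fn build(self) -> Client {
        // Before the fix this was the equivalent of unwrap_or_default(),
        // which produced a client that dropped redirects; now the fallback
        // keeps the shared policy.
        self.client.unwrap_or_else(default_http_client)
    }
}

fn main() {
    let bare = EngineBuilder { client: None }.build();
    assert_eq!(bare.redirect_limit, 10); // bare builds follow redirects
    let custom = EngineBuilder { client: Some(Client { redirect_limit: 0 }) }.build();
    assert_eq!(custom.redirect_limit, 0); // explicit clients win
}
```

The same helper serving as the Err-branch fallback is what closes the footgun where a failed configure_clients silently produced a fully-unconfigured client.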

Added

  • GproxyEngine::client() getter — public accessor exposing
    the shared &wreq::Client, so auxiliary admin code paths can
    reuse the engine's configured client instead of constructing
    their own. The spoof client stays private; the normal client is
    the right choice for anything that is not upstream provider
    traffic.
  • build_transform_error helper in
    sdk/gproxy-provider/src/engine.rs — synthesizes an
    UpstreamRequestMeta for the pre-upstream transform failure path
    so operators get the downstream request body in the upstream log
    even when we never reached a credential or a URL.
  • Cloudflare Pages docs deploy job — the
    .github/workflows/release-binary.yml pipeline gains a
    deploy-docs-cloudflare job that runs on default-branch pushes
    and on releases: pnpm-installs, builds docs/, then ships the
    result to Cloudflare Pages via cloudflare/wrangler-action@v3
    using the cloudflare environment's
    CLOUDFLARE_API_TOKEN / CLOUDFLARE_ACCOUNT_ID /
    CLOUDFLARE_PROJECT_ID secrets. The docs site at
    https://gproxy.leenhawk.com now updates automatically with every
    merge.
  • sea-orm-migration workspace dependency — declared in
    [workspace.dependencies] in preparation for an upcoming
    managed-migration pass. No crate pulls it in yet, so this has no
    runtime effect in v1.0.7.
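A job of the described shape might look like the fragment below. This is a hypothetical sketch, not the actual workflow file: step names, action versions, and the exact wrangler invocation are illustrative; only the job name, trigger conditions, environment, and secret names come from the notes.

```yaml
# Hypothetical shape of the deploy-docs-cloudflare job described above.
deploy-docs-cloudflare:
  if: github.event_name == 'release' || github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  environment: cloudflare # holds the three CLOUDFLARE_* secrets
  steps:
    - uses: actions/checkout@v4
    - uses: pnpm/action-setup@v4
    - run: pnpm install --frozen-lockfile
      working-directory: docs
    - run: pnpm build
      working-directory: docs
    - uses: cloudflare/wrangler-action@v3
      with:
        apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
        accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
        command: pages deploy docs/dist --project-name=${{ secrets.CLOUDFLARE_PROJECT_ID }}
```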

Compatibility

  • No DB, API, or config changes. settings.toml,
    global_settings, and the admin API schema are all untouched.
    This is a drop-in upgrade from v1.0.6 — just swap the binary.
  • Engine builder defaults shift. GproxyEngine::builder().build()
    now yields a client that follows up to 10 redirects, where v1.0.6
    and earlier yielded a client that followed zero. SDK consumers
    that were relying on the old behavior (e.g. intentionally
    capturing 3xx responses as terminal) must explicitly pass their
    own wreq::Client via http_client(...) /
    configure_clients(...).
  • Transform-failure log rows now include request_body where
    they previously had NULL. url / request_headers /
    response_* on those rows are still empty strings / empty
    arrays / NULL — the request never hit the wire, so there's
    nothing real to record. Dashboards that were filtering transform
    failures by url = '' will still work; ones that were filtering
    by request_body IS NULL will need to check for the actual error
    message instead.

简体中文

修复

  • 自更新报 download failed: HTTP 302 Found — GitHub
    /releases/download/... 资源永远是 302 到 CDN 域名的,
    wreq 的默认重定向策略是 redirect::Policy::none(),所以
    crates/gproxy-api/src/admin/update.rswreq::get(url)
    拿到的是 302 本身,download_bytes / download_text
    status().is_success() 这一步就直接拒绝。更新路径根本没
    接触到 engine 的 client,所以每次新建的默认 client 也继承不到
    任何运行时配置。
  • 上游前的 transform 失败在日志里丢了 request body
    transform_dispatch::transform_request 在真正发请求之前
    就失败(例如 tools[] 里有一个字段无法反序列化成
    ResponseTool),错误会以 ExecuteError { meta: None, .. }
    冒上来,record_execute_error_logs 写出的 upstream log 行
    request_body = NULL,运维只能看到一个 500 但看不到到底是
    哪段 JSON 解析不动。GproxyEngine::executeexecute_stream
    现在会捕获这个 transform 错误,提前克隆原始 downstream body,
    再通过新加的 build_transform_error helper 合成一个
    UpstreamRequestMeta,让出问题的 body 能落进日志。URL /
    headers / response 相关字段留空,因为请求根本没发上游;
    enable_upstream_log / enable_upstream_log_body 仍然生效。

变更

  • HTTP client 策略统一到一个入口
    sdk/gproxy-provider/src/engine.rs 新增 default_http_client()
    helper,把全局 wreq client 策略(redirect::Policy::limited(10)
    收敛到一个地方。所有构建路径现在都走它:
    • GproxyEngineBuilder::build() 的默认兜底从
      self.client.unwrap_or_default() 改成
      unwrap_or_else(default_http_client),裸的
      GproxyEngine::builder().build() —— 测试和若干 admin-only
      bootstrap 路径都在用 —— 不会再构造出一个不跟随重定向的 client。
    • configure_clientswith_new_clients 给普通 client 和
      spoof client 的 builder 都加了 .redirect(...),而且它们的
      Err 兜底分支也从 wreq::Client::default() 切到
      default_http_client()
      顺带堵了一个潜在陷阱:如果 configure_clients 构建失败(代理
      URL 有问题、TLS 初始化失败之类),之前会静默退回到一个完全
      未配置的默认 client。现在至少兜底 client 仍然会跟随重定向。
  • update.rs now reuses the engine's HTTP client — check_update and
    perform_update pass state.engine().client().clone() into
    fetch_github_manifest, download_bytes, and download_text; none of
    the three helpers call wreq::get(url) / wreq::Client::new()
    anymore. Net effect: self-update traffic now goes through the
    operator-configured upstream proxy, TLS settings, and everything
    else configured on the engine — all of which it previously
    bypassed silently.
  • Docker deployment docs now center on the official image —
    docs/src/content/docs/deployment/docker.md (and its Chinese
    mirror) now lead with docker pull ghcr.io/leenhawk/gproxy:latest
    instead of "build Dockerfile.action locally", and document the
    full tag matrix (latest / vX.Y.Z / staging × glibc / musl, plus
    the per-arch suffixes for each). The installation page is adjusted
    to match, so new users no longer start by building an image they
    never needed to build.

Added

  • GproxyEngine::client() getter — public accessor exposing the
    shared &wreq::Client so admin helper code paths can reuse the
    engine's already-configured client instead of each building their
    own. The spoof client stays private; non-upstream-provider traffic
    should use this regular client.
  • build_transform_error helper — new in
    sdk/gproxy-provider/src/engine.rs; synthesizes an
    UpstreamRequestMeta specifically for the pre-upstream
    transform-failure path, so operators can still see the original
    downstream body in the upstream log even when no credential was
    ever selected and no URL was resolved.
  • Cloudflare Pages docs deployment job —
    .github/workflows/release-binary.yml adds a deploy-docs-cloudflare
    job: triggered on default-branch pushes and release events, it
    runs pnpm install → builds docs/ → pushes to Cloudflare Pages via
    cloudflare/wrangler-action@v3, using the three secrets
    CLOUDFLARE_API_TOKEN / CLOUDFLARE_ACCOUNT_ID /
    CLOUDFLARE_PROJECT_ID under the cloudflare environment. From now
    on, https://gproxy.leenhawk.com updates automatically on every
    merge.
  • sea-orm-migration workspace dependency — declared in
    [workspace.dependencies] to pave the way for managed migrations
    later. No crate actually references it in v1.0.7 yet, so there is
    no runtime impact.

Compatibility

  • No DB, API, or config changes. settings.toml, global_settings,
    and the admin API schema are all untouched; v1.0.6 can upgrade to
    v1.0.7 by swapping the binary in place.
  • Engine builder default behavior changed. `GproxyEngine::builder(...

v1.0.6

12 Apr 16:52


v1.0.6

Pricing is now fully admin-editable, end to end. Model prices move
out of the compiled-in &'static [ModelPrice] slice into a
pricing_json column on the models table, the provider store holds
an ArcSwap<Vec<ModelPrice>> that bootstrap and every admin mutation
push into, and the console grows a structured editor that covers all
four billing modes. The docs site is rewritten as a full bilingual
Starlight site (25 pages × 2 locales) including a new pricing
reference page.

English

Added

  • models.pricing_json column — nullable TEXT column on the
    models entity holding the full ModelPrice JSON blob: all four
    billing modes (default / flex / scale / priority) in one
    place. Threaded through ModelQueryRow,
    ModelWrite, store_query/admin, and write_sink. MemoryModel now
    carries a single Option<ModelPrice> deserialized from the column on
    load and re-serialized on admin upsert, so the complete pricing shape
    round-trips through the DB.
  • Hot-swappable provider pricing — ProviderInstance.model_pricing
    goes from &'static [ModelPrice] to
    ArcSwap<Vec<ModelPrice>>, and the ProviderRuntime trait gains
    set_model_pricing. Engine::set_model_pricing(provider, prices) is
    exposed for host wiring. AppState::push_pricing_to_engine rebuilds
    a ModelPrice slice from the current MemoryModel snapshot and
    pushes it into the engine; it runs once during bootstrap after
    replace_models and again from every admin mutation handler that
    changes the model set. This fixes a long-standing bug where admin
    edits to price_each_call / price_tiers_json were persisted to the
    DB but the billing engine kept reading the compiled-in slice forever.
  • Structured pricing editor in ModelsTab — the lone
    pricing_json textarea is replaced with a PricingEditor component
    that toggles between "Structured" and "JSON" views. Structured view
    provides: a single price_each_call USD input; an add/remove
    price_tiers table with 7 per-tier fields (input_tokens_up_to
    plus the six per-token unit prices); and collapsible <details>
    sections for flex / scale / priority, each with its own
    price_each_call and tiers table and auto-expanded when the model
    already has pricing in that mode. All numeric fields are held as
    strings in form state so users can type freely.
  • TOML import/export round-trips full ModelPrice — ModelToml
    gains six new fields (flex_price_each_call / flex_price_tiers,
    scale_price_each_call / scale_price_tiers,
    priority_price_each_call / priority_price_tiers). All nine
    pricing fields use #[serde(default, skip_serializing_if = ...)] so
    minimal models still produce compact TOML. Previously the shape only
    carried default-mode tiers, so admin-edited priority pricing was
    silently dropped on export.
  • Bilingual Starlight documentation site — the placeholder docs
    template is replaced with a comprehensive site covering the whole
    gproxy stack. 25 pages per locale (English + 简体中文), all validated
    against the source rather than inferred from READMEs. Sections:
    Introduction, Getting Started (installation, quick start, first
    request for both aggregated /v1 and scoped /{provider}/v1
    routing), Guides (providers & channels, models & aliases, users &
    API keys, permissions / rate limits / quotas, rewrite rules, Claude
    prompt caching, adding a channel, embedded console, observability),
    Reference (env vars, TOML config, dispatch table, database backends,
    graceful shutdown, Rust SDK), and Deployment (release build, Docker).
    Root READMEs rewritten as project overviews pointing at the docs
    site.
  • Pricing reference page — new
    reference/pricing.md in both English and Chinese covers the
    ModelPrice JSON shape, the per-1M-token formula, billing mode
    selection, exact-then-default price matching, and debugging checklist
    for when a price doesn't apply. Linked from guides/models.md and
    from the Starlight sidebar.
  • Unit tests for the new pricing and usage paths — an
    unknown-provider branch assertion on set_model_pricing.
  • Batch delete mode across 5 admin tables — the Users, User Keys,
    My Keys, Models, and Rewrite Rules lists gain a reusable "batch"
    toggle. Activating it swaps per-row delete buttons for checkboxes and
    surfaces a [Select all] [Clear] [Delete N] [Exit] action bar.
    Confirmation goes through window.confirm, matching existing delete
    UX. Four of the five tables reuse existing */batch-delete handlers
    already exposed by crates/gproxy-api/src/admin/mod.rs; the fifth
    (/user/keys/batch-delete) is new — user-scoped with an up-front
    ownership check against keys_for_user to prevent cross-user key
    deletion. Rewrite rules batch delete is purely client-side (filters
    the in-memory rewrite_rules JSON) since that resource has no
    backend CRUD. Implementation is factored into two shared primitives
    in frontend/console/src/components/: a generic useBatchSelection
    hook (selection state, stale-key pruning on row refetch, confirm +
    delete orchestration) and a presentational BatchActionBar.
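As a rough illustration of the per-1M-token tier math the pricing reference above describes — the struct below is a simplified stand-in for the ModelPrice tier shape (the real tiers carry six per-token unit prices), not the actual definition:

```rust
// Simplified stand-in for one pricing tier; two unit prices instead of six.
#[derive(Clone, Copy)]
struct PriceTier {
    input_tokens_up_to: Option<u64>, // None = open-ended top tier
    input_price_per_m: f64,          // USD per 1M input tokens
    output_price_per_m: f64,         // USD per 1M output tokens
}

/// Pick the first tier whose cap covers the request's input size, then
/// charge input and output tokens at that tier's per-1M rates.
fn estimate_cost(tiers: &[PriceTier], input_tokens: u64, output_tokens: u64) -> Option<f64> {
    let tier = tiers.iter().find(|t| match t.input_tokens_up_to {
        Some(cap) => input_tokens <= cap,
        None => true,
    })?;
    Some(
        input_tokens as f64 / 1_000_000.0 * tier.input_price_per_m
            + output_tokens as f64 / 1_000_000.0 * tier.output_price_per_m,
    )
}
```

With a 200k-cap tier at $3 / $15 per 1M tokens, a 100k-input, 10k-output request prices at 0.1 × 3 + 0.01 × 15 ≈ $0.45.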

Changed

  • ModelsTab model-pricing field — replaced price_each_call +
    price_tiers_json text inputs with the new structured
    PricingEditor / JSON textarea toggle. MemoryModelRow and
    ModelWrite TS types now expose pricing_json instead of the two
    legacy fields; the legacy fields remain on ModelWrite as nullable
    for API-schema compatibility but are always written as null by the
    console. i18n strings common.priceEachCall /
    common.priceTiersJson removed.
  • Atomic admin upsert validation — batch_upsert_models now
    pre-validates every item's pricing_json before writing any of
    them, so a malformed entry halfway through a batch no longer leaves
    half of the DB updated.
  • push_pricing_to_engine is best-effort / last-writer-wins —
    documented as such so future readers don't reach for a mutex. Logs
    a warn! when set_model_pricing returns false (i.e. the
    provider is missing from the engine store), so the no-op state
    surfaces instead of being silent.
  • Responsive breakpoints tightened across admin modules — most
    admin pages used xl:grid-cols (1280px) for sidebar+content splits
    and lg:grid-cols-2 (1024px) for forms, so common laptop widths
    collapsed to a single wasteful column. Drop those to lg: / md:
    so the intended two-column layouts appear at 1024px / 768px; add
    sm: fallback to 6-field filter grids; let 8-metric rows shrink to
    1 column on small phones; scope the mobile full-width .btn rule to
    .toolbar-shell so inline table/card buttons stay compact; cap
    toast min-width to the viewport; and give the suffix-dialog modal
    padding so it no longer hugs the screen edge on phones.
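The atomic upsert validation above boils down to a validate-all-then-write pass; parse_pricing below is a toy stand-in for the real pricing_json deserialization, and the Vec stands in for the store:

```rust
/// Toy stand-in for the real pricing_json check (which deserializes into
/// ModelPrice); here we only require a JSON-object-looking string.
fn parse_pricing(raw: &str) -> Result<(), String> {
    if raw.trim_start().starts_with('{') {
        Ok(())
    } else {
        Err(format!("invalid pricing_json: {raw}"))
    }
}

/// Validate every item before mutating anything, so one malformed entry
/// halfway through the batch cannot leave the store half-updated.
fn batch_upsert(store: &mut Vec<String>, items: &[&str]) -> Result<(), String> {
    for item in items {
        parse_pricing(item)?; // reject the whole batch up front
    }
    for item in items {
        store.push((*item).to_string()); // only now write
    }
    Ok(())
}
```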

Fixed

  • UsageModule query button stuck on "querying" — UsageModule
    (admin) and MyUsageModule (user) shared a single queryTokenRef
    between their summary and rows effects. When setActiveQuery fired
    both effects, the rows effect bumped the counter before the summary
    request resolved, so the summary's .finally() check
    (queryTokenRef.current === token) failed and setLoadingMeta(false)
    was never called — pinning the button on "querying" forever. Split
    into summaryTokenRef + rowsTokenRef so the cancellation tokens
    are independent, matching the pattern in useRequestsModuleState.
  • x-title and http-referer headers leaked upstream — added to
    the request-header denylist in both
    gproxy-server/src/middleware/sanitize.rs and
    sdk/gproxy-routing/src/sanitize.rs, so OpenRouter-style client
    metadata stops reaching upstream channels that might reject or log
    it.
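A denylist pass of this kind reduces to a case-insensitive name filter; the sketch below uses a small illustrative subset of headers, not the full HEADER_DENYLIST:

```rust
use std::collections::HashSet;

/// Drop denylisted request headers before forwarding upstream. The list
/// here is an illustrative subset, not gproxy's actual denylist.
fn sanitize_headers(headers: Vec<(String, String)>) -> Vec<(String, String)> {
    let denylist: HashSet<&str> =
        ["authorization", "x-title", "http-referer", "anthropic-version"].into();
    headers
        .into_iter()
        // HTTP header names compare case-insensitively, hence the lowercase.
        .filter(|(name, _)| !denylist.contains(name.to_ascii_lowercase().as_str()))
        .collect()
}
```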

Removed

  • Legacy price_each_call + price_tiers_json columns on models
    — the two columns are removed from the SeaORM entity,
    ModelQueryRow, ModelWrite, store_query/admin, write_sink, and
    write/event. Pricing lives in pricing_json only. The 2.3→2.4
    transition intentionally left the legacy columns on disk temporarily
    to allow a backfill; this release retires them.
  • Update source configuration — update_source TOML field,
    related i18n messages, admin types, and the
    .github/workflows/release-binary.yml internal update server flow
    are removed. The standalone DownloadsPage.astro is gone; docs
    download links now point at GitHub Releases.
  • Orphan frontend ModelsModule — the module was wired into
    app/modules.tsx's activeModule switch as case "models", but
    buildAdminNavItems never emitted a nav item for "models", so it
    was unreachable. Admin model management already lives inside the
    provider workspace's Models tab.
  • PriceTier from gproxy-core — downstream consumers use
    gproxy_sdk::provider::billing::ModelPriceTier instead.

Compatibility

  • DB schema: models.pricing_json is a pure column add, picked up
    by the SeaORM schema-sync step on startup. Existing rows get NULL
    and fall back to whatever ModelPrice the provider compiled in. The
    legacy price_each_call and price_tiers_json columns are
    removed from the entity — if you are upgrading a DB that still
    has data in those columns, migrate them into pricing_json before
    pointing v1.0.6 at the DB. A clean install via TOML seed is not
    affected.
  • Admin clients: upsert payloads now carry pricing_json: string | null. Legacy price_each_call / price_tiers_json fields remain
    on the admin API as nullable for schema compatibility, but the
    backend no longer reads them — clients should stop sending them and
    send pricing_json instead.
  • TOML exports: pricing blocks now include the extra flex / scale
    / priority fields when set. Existing TOML files without those fields
    continue to import cleanly.
  • Self-update source is now hardcode...

v1.0.5

12 Apr 07:21


v1.0.5

Major refactor. Two sibling releases worth of architectural cleanup
condensed into one tag: the suffix system is deleted, the models and
model_aliases DB tables are merged, rewrite-rule/billing ownership
moves from the engine into the handler, and request-time model
resolution finally makes permission → rewrite → alias → execute
the single canonical order. No automated migration is shipped — old
model_aliases rows are re-imported into the unified models table on
startup when a TOML seed is present, otherwise re-enter them from the
console once v1.0.5 is running.

English

Added

  • Model aliases injected into model_list / model_get responses — aliases
    are now first-class entries: they appear in the OpenAI / Claude / Gemini
    model-list responses (both scoped and unscoped) alongside real models,
    GET /v1/models/{alias} resolves to the alias, and non-stream responses
    have their "model" field rewritten to the alias name the client sent
    (streaming chunks are rewritten per chunk in the engine).
  • Suffix-aware alias resolution — an alias like gpt4-fast is resolved
    by trying an exact match first, then stripping any known suffix from the
    tail, looking up the base alias, and re-appending the suffix before
    forwarding to the upstream model. (Subsequently removed along with the
    whole suffix system, but the alias+suffix combo kept working via
    channel-level rewrite rules until then.)
  • Unified model table — model_aliases is merged into models with a
    new alias_of: Option<i64> column. A row with alias_of = NULL is a
    real model; a row with alias_of = Some(id) is an alias pointing at
    another row's id in the same table. The alias lookup structure
    (HashMap<String, ModelAliasTarget>) is unchanged — it is simply
    rebuilt from the unified models snapshot at startup / reload.
  • POST /admin/models/pull — admin endpoint that fetches a provider's
    live model list from upstream and returns the model ids. The console
    uses this to populate the local models table via a new "Pull Models"
    button in the provider workspace's Models tab. Pulled models are
    imported as real entries (alias_of = NULL) with no pricing, which the
    admin can then edit.
  • Model List / Local dispatch for model_list / model_get — the
    *-only dispatch template presets (chat-completions-only, response-only,
    claude-only, gemini-only) default model_list and model_get to the
    Local dispatch implementation. Requests served locally never hit
    upstream; the handler builds the protocol-specific response body
    directly from the models table. GproxyEngine::is_local_dispatch(...)
    lets handlers decide before calling engine.execute.
  • Local merge for non-Local dispatch — for *-like / pass-through
    dispatch, the proxy still calls upstream for model_list, but the
    response is merged with the local models table before being returned:
    local real models that aren't in the upstream response get appended,
    then aliases mirror their target entry. model_get checks the local
    table first and returns the local entry if present, otherwise falls
    through to upstream. This works across OpenAI / Claude / Gemini
    protocols, scoped and unscoped.
  • Alias-level pricing fallback — billing now tries to price a request
    against the alias name first and falls back to the resolved real model
    name if no alias-level pricing exists. Admins can set a custom
    price_each_call / price_tiers_json on an alias row to override the
    real model's pricing for that alias only.
  • Provider workspace: dedicated Rewrite Rules tab — rewrite rules
    moved out of the Config tab's settings JSON editor into their own
    provider-workspace tab (/providers/:name → "Rewrite Rules"). The
    editor is a two-column list + detail layout: the left column shows all
    rules with a scrollbar (max ~10 visible), the right column shows path /
    action / JSON value / filter (model glob + operation / protocol chips)
    for the selected rule. Data still lives in provider.settings_json.
  • Provider workspace: unified "Models" tab — the separate "Models"
    (pricing) and "Model Aliases" tabs are merged into a single "Models"
    tab that lists both real models and aliases in the same scrollable
    list. Alias rows are shown with an "alias" badge and a → target
    indicator, and three filter buttons (All / Real only / Aliases only)
    control what is visible. The edit form has an alias_of dropdown for
    picking an alias target, and the pull-models flow is embedded in the
    same tab.
  • "+ Add Suffix Variant" dialog in the Models tab — when a real
    model is selected, a new button opens a dialog that mirrors the old
    composable suffix system: the user picks one entry per group
    (thinking / reasoning / service tier / effort / verbosity / ...), the
    dialog computes a combined suffix string and a list of rewrite-rule
    actions, and on confirm it atomically creates an alias row
    (alias_of = base.id, model_id = base + suffix) and appends the
    rewrite rules to the provider's settings_json with
    filter.model_pattern scoped to the new alias name. Presets cover
    everything the deleted Rust suffix module supported except the Claude
    header-modifying suffixes (-fast, -non-fast, -1m, -200k),
    which rewrite rules can't express.
  • Rewrite rules editor: typed value input — the "Set" action no
    longer forces admins to hand-write JSON. A Type dropdown
    (string / number / boolean / null / array / object) switches the
    value editor between a plain text input, numeric input, boolean
    dropdown, null placeholder, or JSON textarea (for arrays/objects).
    Switching type resets the value to a sensible default for the new
    type.
  • Rewrite rules editor: model-pattern autocomplete — focusing the
    model_pattern input shows a scrollable dropdown of matching model
    names (real + aliases) for the current provider. Typing filters the
    list by substring without auto-completing the input; clicking an
    entry fills in the pattern exactly.
  • Pricing-by-alias in the billing pipeline — the engine now exposes
    build_billing_context / estimate_billing as public methods, and the
    handler builds the billing context in the handler layer with the
    alias name visible so per-alias pricing takes effect.
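The exact-then-suffix-strip lookup described under "Suffix-aware alias resolution" above can be sketched as follows — the map type and suffix list are simplified stand-ins for the real alias structures:

```rust
use std::collections::HashMap;

// Illustrative suffix list; the real set came from the suffix groups.
const KNOWN_SUFFIXES: &[&str] = &["-fast", "-thinking-high"];

/// Exact alias match first; otherwise strip a known suffix from the tail,
/// resolve the base alias, and re-append the suffix to the upstream name.
fn resolve_alias(aliases: &HashMap<&str, &str>, requested: &str) -> Option<String> {
    if let Some(target) = aliases.get(requested) {
        return Some((*target).to_string());
    }
    for suffix in KNOWN_SUFFIXES {
        if let Some(base) = requested.strip_suffix(suffix) {
            if let Some(target) = aliases.get(base) {
                return Some(format!("{target}{suffix}"));
            }
        }
    }
    None
}
```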

Changed

  • Request pipeline ordering: permission check (original model name) → rewrite_rules (original model name) → alias resolve → engine.execute → billing. Permission is checked against the name the client sent
    (before any alias rewrite), so admins must explicitly whitelist each
    alias — aliases do not silently inherit their target's permissions.
  • Rewrite rules moved out of the engine into the handler layer. The
    engine no longer applies rewrite_rules; instead the handler calls
    state.engine().rewrite_rules(provider) and applies them to the
    request body itself, using the original model name for
    model_pattern matching so patterns like gpt4-fast can match before
    the name is rewritten by alias resolution.
  • Billing moved out of the engine into the handler layer. The engine
    no longer computes cost / billing_context / billing on its
    ExecuteResult; those fields are gone. Handlers now call
    engine.build_billing_context(...) and engine.estimate_billing(...)
    directly after the upstream call returns, which is also what makes
    pricing-by-alias possible.
  • Provider proxy responses rewrite the "model" field to the alias
    name the client sent, using the engine's new response_model_override
    field on ExecuteRequest. The suffix rewrite (when still present) was
    skipped when the alias override was about to overwrite the same field,
    avoiding a redundant JSON parse / serialize per request.
  • model_alias_middleware simplified — the middleware now does a
    single exact alias lookup and drops the ResolvedAlias.suffix field;
    all suffix+alias combo handling has been removed along with the suffix
    system.
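Taken together, the reordering above gives a handler-side flow roughly like this sketch — every function here is a placeholder for the real handler/engine API; only the ordering is the point:

```rust
// Placeholder pipeline: permission → rewrite → alias resolve → execute.
fn check_permission(original_model: &str) -> Result<(), String> {
    // 1. Checked against the name the client sent, so each alias must be
    //    whitelisted explicitly — aliases don't inherit target permissions.
    if original_model.is_empty() {
        Err("model not permitted".to_string())
    } else {
        Ok(())
    }
}

fn apply_rewrite_rules(body: &str, _original_model: &str) -> String {
    // 2. model_pattern matching uses the ORIGINAL name, before alias
    //    resolution rewrites it. (No-op stand-in.)
    body.to_string()
}

fn resolve_alias(original_model: &str) -> String {
    // 3. Map the client-facing alias to the upstream model name.
    original_model.strip_suffix("-alias").unwrap_or(original_model).to_string()
}

fn handle(original_model: &str, body: &str) -> Result<String, String> {
    check_permission(original_model)?;
    let body = apply_rewrite_rules(body, original_model);
    let upstream_model = resolve_alias(original_model);
    // 4. engine.execute with the resolved name; 5. billing runs after,
    //    alias-aware because the handler still holds the original name.
    Ok(format!("execute {upstream_model}: {body}"))
}
```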

Fixed

  • /admin/models/pull returning HTTP 500 — the endpoint was cloning
    the admin request's headers (including Authorization: Bearer <admin-token>, Content-Length, Host) and forwarding them to the
    upstream, which either corrupted the body length or overrode the
    channel-supplied credentials. Pull now passes an empty HeaderMap so
    the channel's finalize_request is the only source of upstream
    headers. Error messages include the first 500 characters of the
    upstream response body so failures are debuggable.
  • Pull-models button was unreachable — the button lived in the
    standalone ModelAliasesModule route, but the sidebar never linked to
    that route. Moved it into the provider-workspace Aliases tab (and
    eventually into the unified Models tab), where it actually renders.
  • Models tab scrolling — the provider-workspace Models tab now has a
    max-h-128 scrollable list so long model lists stay usable.
  • custom channel: mask_table — the mask_table field was
    removed from the backend long ago, but the frontend custom-channel
    form still rendered a dead JSON editor. Removed from
    channel-forms.ts.

Removed

  • Suffix system — the entire sdk/gproxy-provider/src/suffix.rs
    module (801 lines) is deleted, along with the enable_suffix field
    and ChannelSettings::enable_suffix / ProviderRuntime::enable_suffix
    methods on all 14 channels. Response / streaming suffix rewriting,
    suffix-based model-list expansion, the suffix groups, and all
    match_suffix_groups / strip_model_suffix_in_body /
    rewrite_model_suffix_in_body / expand_model_list_with_suffixes /
    rewrite_model_get_suffix_in_body helpers — gone. The same feature
    (gpt4 vs gpt4-fast etc.) is now expressed as separate alias rows
    with channel-level rewrite rules.
  • /admin/model-aliases/* endpoints and model_aliases DB table
    ...

v1.0.4

11 Apr 16:03


v1.0.4

English

Added

  • Channel-level rewrite rules — new rewrite_rules field on all 14
    channel Settings structs allows per-channel request body rewriting before
    the request is finalized. Rules support JSON path targeting with glob
    matching. A dedicated RewriteRulesEditor component with full i18n is
    available in the console.
  • Dispatch template presets for custom channel — the console now offers
    built-in dispatch template presets when configuring custom channels,
    and dispatch templates are shown for all channel types (not just custom).
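Glob matching on a model_pattern reduces to a small recursive matcher; this sketch supports only the * wildcard and is byte-oriented, which may differ from gproxy's actual matcher:

```rust
/// Minimal glob matcher supporting only the `*` wildcard; byte-oriented,
/// which is fine for ASCII model ids.
fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard left: the remainder must match exactly.
        None => pattern == name,
        Some((prefix, rest)) => {
            if !name.starts_with(prefix) {
                return false;
            }
            let tail = &name[prefix.len()..];
            // Let `*` absorb 0..=tail.len() bytes and recurse on the rest.
            (0..=tail.len()).any(|i| glob_match(rest, &tail[i..]))
        }
    }
}
```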

Fixed

  • Request log query button stuck on loading — the query button no longer
    gets permanently stuck in loading state.
  • HTTP client protocol negotiation — removed http1_only restriction and
    enabled proper HTTP/1.1 support for client builders, improving compatibility
    with upstream providers behind HTTP/1.1-only proxies.
  • Sampling parameter stripping — model-aware stripping for
    anthropic/claudecode channels ensures unsupported sampling parameters are
    correctly removed based on the target model.
  • Dispatch template passthrough — *-only dispatch templates now correctly
    use passthrough+transform for model_list / model_get operations.
  • Session-expired toast suppressed — the error toast for expired sessions
    is now suppressed before the page reload, preventing a flash of error UI.
  • Update-available toast color — changed from error-red to green success
    style.
  • Noisy ORM logging — sqlx and sea_orm log levels now default to
    warn, reducing log noise at startup and during normal operation.
  • Dispatch / sanitize rules overflow — both panels now scroll when content
    exceeds the viewport instead of overflowing the layout.
  • Upstream proxy placeholder — the upstream proxy input field now shows a
    placeholder hint.
  • Frontend i18n — alias, enable_suffix, enable_magic_cache labels
    are now properly translated; "模型" renamed to "模型价格表" / "Model Pricing";
    sanitize_rules renamed to "消息重写规则" / "Message Rewrite Rules".


v1.0.3

11 Apr 08:06


v1.0.3

English

Added

  • Suffix system for model-list / model-get — suffix modifiers (e.g. -thinking-high, -fast) are now expanded in model list responses and rewritten in model get responses, so clients can discover available suffix variants.
  • Suffix per-channel toggle — new enable_suffix setting lets operators enable/disable suffix processing per channel.
  • VertexExpress local model catalogue — model list/get requests are served from a static model catalogue embedded at compile time, since Vertex AI Express does not expose a standard model-listing endpoint.
  • Vertex SA token bootstrap on credential upsert — when a Vertex credential with client_email and private_key is added via the admin API, the access token is automatically obtained so the first request has valid auth.

Fixed

  • GeminiCLI / Antigravity model list — both channels now correctly route model list/get through their respective quota/model endpoints (retrieveUserQuota for GeminiCLI, fetchAvailableModels for Antigravity) and normalize responses to standard Gemini format.
  • Vertex model list normalization — Vertex AI returns publisherModels with full resource paths; responses are now converted to standard Gemini models format.
  • Vertex / VertexExpress header filteringanthropic-version and anthropic-beta headers are dropped before forwarding to Google endpoints.
  • Vertex GeminiCLI-style User-Agent — Vertex requests now send proper User-Agent and x-goog-api-client headers matching Gemini CLI traffic.
  • Engine HTTP client proxy — database proxy settings now take effect after bootstrap; previously the engine client was built before DB config was loaded.
  • Engine HTTP/1.1 for standard client — the non-spoof wreq client uses http1_only() for reliable proxy traversal.
  • HTTP client request dispatch — switched from wreq::Request::from() + execute() to client.request().send() to ensure proxy/TLS settings propagate correctly.
  • Frontend: VertexExpress credential — field changed from access_token to api_key.
  • Frontend: Vertex credential — added missing optional fields (private_key_id, client_id, token_uri).
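The publisherModels normalization above amounts to trimming the resource path down to the model id; the path shape assumed below follows Vertex AI's publishers/{publisher}/models/{model} naming and is an illustration, not the actual conversion code:

```rust
/// Trim a Vertex publisherModels resource path down to the plain
/// "models/{id}" name the standard Gemini models API returns.
fn normalize_publisher_model(resource: &str) -> Option<String> {
    // Keep only what follows the last "/models/" segment.
    let (_, id) = resource.rsplit_once("/models/")?;
    if id.is_empty() {
        return None;
    }
    Some(format!("models/{id}"))
}
```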


v1.0.2

10 Apr 17:09


v1.0.2

English

Added

  • WebSocket per-model usage tracking — when the client switches models mid-session (e.g. via response.create), usage is segmented per model and recorded separately instead of attributing all tokens to the last model.
  • WebSocket upstream message logging — WS session end now records an upstream request log containing all client→server and server→client messages as request/response body.
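Per-model segmentation is essentially "attribute each usage delta to the model active at that moment"; a minimal sketch with illustrative names (not the actual WS session types):

```rust
use std::collections::HashMap;

/// Per-model usage segmentation for a WS session: each usage delta is
/// credited to whichever model was active when it arrived.
#[derive(Default)]
struct SessionUsage {
    active_model: Option<String>,
    per_model_tokens: HashMap<String, u64>,
}

impl SessionUsage {
    /// Called when the client switches models (e.g. via response.create).
    fn switch_model(&mut self, model: &str) {
        self.active_model = Some(model.to_string());
    }

    /// Record a usage delta against the currently active model.
    fn record(&mut self, tokens: u64) {
        if let Some(model) = &self.active_model {
            *self.per_model_tokens.entry(model.clone()).or_insert(0) += tokens;
        }
    }
}
```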
