
fix qwen image error #2558

Open
Arronlong wants to merge 3 commits into router-for-me:main from Arronlong:main

Conversation

@Arronlong
Contributor

openclaw+feishu now works about 90% of the time.

Core logic of the change:
// ensureQwenSystemMessage handles the Qwen message format:
// 1. Always insert a default system message at the front of messages
// 2. Existing system messages are automatically converted to the user role
// 3. Overlong text (>10000 characters) is split intelligently, preferring to cut at spaces/punctuation
// 4. Image/file parts are kept as-is and never split

Points 3 and 4 exist for the following reason: when I hard-coded the message body sent by cpa into the Qwen client's cli.js request, it never errored; but going through cpa, it failed roughly 3 times out of 5, and after deleting the long content it succeeded about 90% of the time. So I added splitting logic that cuts one long sentence in content into several shorter ones.
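The splitting behavior described above can be sketched roughly as follows. This is an illustrative Go sketch, not the PR's exact code: the function name matches the PR, but the body and the 200-rune look-back window are assumptions, and the look-back deliberately starts at max-1 so a chunk can never exceed max runes.

```go
package main

import "fmt"

// splitTextChunk: illustrative sketch of the chunking described above.
// It cuts text into chunks of at most max runes, preferring to break
// just after whitespace or punctuation found within an (assumed)
// 200-rune look-back window at the end of each chunk.
func splitTextChunk(text string, max int) []string {
	runes := []rune(text)
	var chunks []string
	for len(runes) > 0 {
		if len(runes) <= max {
			chunks = append(chunks, string(runes))
			break
		}
		safeIdx := max
		lookBackStart := max - 200
		if lookBackStart < 0 {
			lookBackStart = 0
		}
		// Start at max-1 so a chunk never exceeds max runes.
		for i := max - 1; i >= lookBackStart; i-- {
			r := runes[i]
			if r == ' ' || r == '\n' || r == '\t' || r == ',' || r == '.' ||
				r == ';' || r == '!' || r == '?' {
				safeIdx = i + 1 // cut just after the separator
				break
			}
		}
		chunks = append(chunks, string(runes[:safeIdx]))
		runes = runes[safeIdx:]
	}
	return chunks
}

func main() {
	for _, c := range splitTextChunk("hello world, this is a long sentence", 12) {
		fmt.Printf("%q\n", c)
	}
}
```

Each returned chunk stays within the max rune count, and the chunks concatenate back to the original string.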

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors ensureQwenSystemMessage to normalize Qwen message payloads by prepending a default system message, converting existing system roles to user roles, and implementing intelligent text splitting for content exceeding 10,000 characters. Feedback highlights a regression risk where non-text content types (like audio or video) are dropped during array processing, an off-by-one error in the splitTextChunk function that could result in chunks exceeding the size limit, and performance inefficiencies caused by repeated JSON parsing and serialization within the message loop.

Comment on lines +228 to +266
var newParts []string
for _, item := range content.Array() {
itemType := item.Get("type").String()

if itemType == "image_url" || itemType == "image" || itemType == "file" {
newParts = append(newParts, item.Raw)
continue
}

if itemType == "text" {
text := item.Get("text").String()
if len(text) == 0 {
newParts = append(newParts, item.Raw)
continue
}

if len(text) > MAX_LENGTH {
chunks := splitTextChunk(text, MAX_LENGTH)
for _, chunk := range chunks {
part := `{"type":"text","text":` + fmt.Sprintf("%q", chunk) + `}`
newParts = append(newParts, part)
}
} else {
newParts = append(newParts, item.Raw)
}
}
}

var cbuf bytes.Buffer
cbuf.WriteByte('[')
for i, p := range newParts {
if i > 0 {
cbuf.WriteByte(',')
}
cbuf.WriteString(p)
}
cbuf.WriteByte(']')

processedMsg, _ = sjson.SetRaw(processedMsg, "content", cbuf.String())
Contributor


medium

The current logic for processing the content array has two main problems:

  1. Regression risk: it silently drops every content part other than image_url, image, file, and text (for example audio, video, or future extension types).
  2. Metadata loss: when splitting overlong text parts, the newly created objects lose any other fields on the original part (such as cache_control).

Consider iterating over the array and replacing only the parts that need splitting, while keeping every other part (and its metadata) untouched.

				var newParts []string
				for _, item := range content.Array() {
					// Only text parts are candidates for splitting
					if item.Get("type").String() == "text" {
						text := item.Get("text").String()
						if len(text) > MAX_LENGTH {
							chunks := splitTextChunk(text, MAX_LENGTH)
							for _, chunk := range chunks {
								// Note: split chunks lose the original part's metadata, but unsplit parts are preserved
								newParts = append(newParts, "{\"type\":\"text\",\"text\":"+fmt.Sprintf("%q", chunk)+"}")
							}
							continue
						}
					}
					// Keep non-text, short, or unknown-type parts (images, files, etc.) unchanged
					newParts = append(newParts, item.Raw)
				}

				var cbuf bytes.Buffer
				cbuf.WriteByte('[')
				for i, p := range newParts {
					if i > 0 {
						cbuf.WriteByte(',')
					}
					cbuf.WriteString(p)
				}
				cbuf.WriteByte(']')

				processedMsg, _ = sjson.SetRaw(processedMsg, "content", cbuf.String())

Comment on lines 307 to 313
for i := splitLen; i >= lookBackStart; i-- {
r := runes[i]
if r == ' ' || r == '\n' || r == '\t' || r == ',' || r == '.' || r == ';' || r == '!' || r == '?' {
safeIdx = i + 1 // cut just after the punctuation/space
break
}
}
Contributor


medium

There is an off-by-one error in the look-back logic. The loop starts at i := splitLen and sets safeIdx = i + 1. Since splitLen equals max, if the very first position checked is a separator, the resulting chunk contains max + 1 characters, exceeding the intended limit. The loop should start at splitLen - 1 so the chunk size never exceeds max.

Suggested change
for i := splitLen; i >= lookBackStart; i-- {
r := runes[i]
if r == ' ' || r == '\n' || r == '\t' || r == ',' || r == '.' || r == ';' || r == '!' || r == '?' {
safeIdx = i + 1 // cut just after the punctuation/space
break
}
}
for i := splitLen - 1; i >= lookBackStart; i-- {
r := runes[i]
if r == ' ' || r == '\n' || r == '\t' || r == ',' || r == '.' || r == ';' || r == '!' || r == '?' {
safeIdx = i + 1 // cut just after the punctuation/space
break
}
}

Comment on lines 191 to 271
processedMsg := msg.Raw
if strings.EqualFold(msg.Get("role").String(), "system") {
modified, err := sjson.SetRaw(msg.Raw, "role", `"user"`)
if err == nil {
processedMsg = modified
}
}

content := gjson.Get(processedMsg, "content")

if content.Type == gjson.String {
text := content.String()
if len(text) > 0 {
var newContents []string
if len(text) > MAX_LENGTH {
chunks := splitTextChunk(text, MAX_LENGTH)
for _, c := range chunks {
newContents = append(newContents, `{"type":"text","text":`+fmt.Sprintf("%q", c)+`}`)
}
} else {
newContents = append(newContents, `{"type":"text","text":`+fmt.Sprintf("%q", text)+`}`)
}

var cbuf bytes.Buffer
cbuf.WriteByte('[')
for i, s := range newContents {
if i > 0 {
cbuf.WriteByte(',')
}
cbuf.WriteString(s)
}
cbuf.WriteByte(']')
processedMsg, _ = sjson.SetRaw(processedMsg, "content", cbuf.String())
}
}

if content.IsArray() {
var newParts []string
for _, item := range content.Array() {
itemType := item.Get("type").String()

if itemType == "image_url" || itemType == "image" || itemType == "file" {
newParts = append(newParts, item.Raw)
continue
}

if itemType == "text" {
text := item.Get("text").String()
if len(text) == 0 {
newParts = append(newParts, item.Raw)
continue
}

if len(text) > MAX_LENGTH {
chunks := splitTextChunk(text, MAX_LENGTH)
for _, chunk := range chunks {
part := `{"type":"text","text":` + fmt.Sprintf("%q", chunk) + `}`
newParts = append(newParts, part)
}
} else {
newParts = append(newParts, item.Raw)
}
}
}

var cbuf bytes.Buffer
cbuf.WriteByte('[')
for i, p := range newParts {
if i > 0 {
cbuf.WriteByte(',')
}
cbuf.WriteString(p)
}
cbuf.WriteByte(']')

processedMsg, _ = sjson.SetRaw(processedMsg, "content", cbuf.String())
}

buf.WriteByte(',')
buf.WriteString(processedMsg)
Contributor


medium

The code runs sjson.SetRaw and gjson.Get against the same message string multiple times inside the loop. Every sjson call parses and re-serializes the JSON, which is inefficient for large payloads. The errors from sjson.SetRaw are also ignored. Consider deserializing each message once, or building the new message structure directly, for better performance and robustness.
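The "parse once" idea can be sketched like this. Illustrative only: the message struct here is a minimal assumption, not the project's actual type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a minimal assumed shape, not the project's actual type.
type message struct {
	Role    string      `json:"role"`
	Content interface{} `json:"content"` // string or []interface{}
}

// normalize parses the messages array once, mutates it in memory, and
// serializes once at the end, instead of calling sjson.SetRaw per field.
func normalize(raw []byte) ([]byte, error) {
	var msgs []message
	if err := json.Unmarshal(raw, &msgs); err != nil {
		return nil, err
	}
	for i := range msgs {
		if msgs[i].Role == "system" {
			msgs[i].Role = "user" // system messages become user, as in the PR
		}
	}
	return json.Marshal(msgs)
}

func main() {
	out, err := normalize([]byte(`[{"role":"system","content":"hi"}]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

This trades the per-field sjson calls for one Unmarshal and one Marshal per request, and surfaces parse errors instead of discarding them.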


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 18adb6703f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +233 to +237
newParts = append(newParts, item.Raw)
continue
}

if itemType == "text" {


P1 Badge Preserve non-text content blocks during message normalization

When rebuilding content arrays in ensureQwenSystemMessage, the code only appends image_url, image, file, and text parts and has no fallback branch for any other item type. That means valid payload parts like input_audio, tool_result, or other structured blocks are silently dropped, so user input is corrupted before sending to Qwen. This is a functional regression versus preserving non-system messages as-is and will break multimodal/tool-result requests whenever those part types appear.


Comment on lines +307 to +310
for i := splitLen; i >= lookBackStart; i-- {
r := runes[i]
if r == ' ' || r == '\n' || r == '\t' || r == ',' || r == '.' || r == ';' || r == '!' || r == '?' {
safeIdx = i + 1 // cut just after the punctuation/space


P2 Badge Enforce max chunk size at split boundary

The splitter can produce a chunk longer than max: if a delimiter is found at i == splitLen, the code sets safeIdx = i + 1, so the emitted chunk length becomes max+1. For boundary inputs (e.g., length 10001 with punctuation at index 10000), this defeats the purpose of overlength mitigation and can still trigger upstream "content too long" validation failures.


Collaborator

@luispater luispater left a comment


The direction of this fix is right, but it currently introduces new compatibility regressions. Please address the two blocking issues below before merging.

  • Blocking: internal/runtime/executor/qwen_executor.go now keeps only the text, image_url, image, and file content part types; every other type is silently dropped. The repository already has handling for types such as input_text, output_text, input_audio, and tool_result; for example, internal/runtime/executor/helps/token_helpers.go supports them explicitly. With the current implementation, any request containing such parts has them stripped before being forwarded to Qwen, which is a behavioral regression.
  • Blocking: the long-text threshold check uses len(text) (bytes), while splitTextChunk actually operates on []rune. For multibyte text such as Chinese, the two units disagree: the split branch may be entered too early, and chunks exceeding the target byte length can still be produced. The unit of the limit needs to be made consistent; otherwise this fix is unreliable for mixed Chinese/English content.

Test information:

  • go build -o test-output ./cmd/server && rm -f test-output passes
  • go test ./internal/runtime/executor -run TestEnsureQwenSystemMessage -count=1 fails 3 cases on the PR branch; the same command already fails 1 case on the current baseline, so this change further widens the divergence from the existing test contract in this area

Non-blocking: the newly added comments are in Chinese, while the project convention requires English code comments; please unify them while you are at it.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26a20f6e40


Comment on lines +192 to +196
if strings.EqualFold(msg.Get("role").String(), "system") {
modified, err := sjson.SetRaw(msg.Raw, "role", `"user"`)
if err == nil {
processedMsg = modified
}


P1 Badge Normalize object content after role conversion

This block rewrites every system message to user, but the subsequent normalization only handles string and array content, so a payload like { "role":"system", "content": {"type":"text","text":"..."} } is forwarded as a user message with object content. That shape is not accepted by OpenAI-compatible chat-completions (user content must be string or an array of parts), so requests that previously worked via object-to-part normalization can now be rejected.


Comment on lines +299 to +303
for len(runes) > 0 {
splitLen := max
if len(runes) <= splitLen {
chunks = append(chunks, string(runes))
break


P1 Badge Split oversized UTF-8 text with a consistent length unit

The split trigger uses len(text) (bytes), but splitTextChunk enforces max against len(runes); for multibyte input this can produce a first chunk that is still over the byte threshold (e.g., 4,000 Chinese characters are ~12,000 bytes, len(runes)<=10000, so no real split happens). That defeats the overlength mitigation for non-ASCII text and can still hit upstream “content too long” validation.
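The byte/rune unit mismatch is easy to demonstrate with a standalone snippet (illustrative, not project code):

```go
package main

import "fmt"

func main() {
	s := "长文本" // three Chinese characters
	fmt.Println(len(s))         // byte length: 9 (each CJK character is 3 bytes in UTF-8)
	fmt.Println(len([]rune(s))) // rune length: 3
	// A 4000-character Chinese string is about 12000 bytes, so len(text) > 10000
	// triggers the split branch, while len([]rune(text)) <= 10000 means a
	// rune-counting splitter never actually splits it.
}
```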



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a84cefa9cd


Comment on lines +181 to +183
type message struct {
Role string `json:"role"`
Content interface{} `json:"content"` // Can be string or []interface{}


P1 Badge Preserve message fields during Qwen normalization

This struct drops every message property except role and content, and ensureQwenSystemMessage unmarshals each original message into it before writing messages back. As a result, valid chat-completions fields like tool_call_id, tool_calls, name, or function_call are silently removed from non-system messages, which can invalidate tool-call turns and break multi-step assistant/tool workflows.
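One conventional way to keep unknown fields intact is to normalize through a map[string]json.RawMessage instead of a fixed struct. A minimal sketch of the idea, not the PR's code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeMessage rewrites only the "role" key and leaves every other
// field (tool_call_id, tool_calls, name, ...) untouched by working on a
// generic map instead of a fixed struct. Sketch only, not the PR's code.
func normalizeMessage(raw []byte) ([]byte, error) {
	var msg map[string]json.RawMessage
	if err := json.Unmarshal(raw, &msg); err != nil {
		return nil, err
	}
	var role string
	if err := json.Unmarshal(msg["role"], &role); err == nil && role == "system" {
		msg["role"] = json.RawMessage(`"user"`)
	}
	return json.Marshal(msg) // note: map marshaling sorts keys alphabetically
}

func main() {
	out, _ := normalizeMessage([]byte(`{"role":"system","content":"x","name":"n"}`))
	fmt.Println(string(out))
}
```

The trade-off is losing the fixed field order and compile-time field checking of a struct, in exchange for never dropping fields the code does not know about.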


@Arronlong
Contributor Author

The direction of this fix is right, but it currently introduces new compatibility regressions. Please address the two blocking issues below before merging.

  • Blocking: internal/runtime/executor/qwen_executor.go now keeps only the text, image_url, image, and file content part types; every other type is silently dropped. The repository already has handling for types such as input_text, output_text, input_audio, and tool_result; for example, internal/runtime/executor/helps/token_helpers.go supports them explicitly. With the current implementation, any request containing such parts has them stripped before being forwarded to Qwen, which is a behavioral regression.
  • Blocking: the long-text threshold check uses len(text) (bytes), while splitTextChunk actually operates on []rune. For multibyte text such as Chinese, the two units disagree: the split branch may be entered too early, and chunks exceeding the target byte length can still be produced. The unit of the limit needs to be made consistent; otherwise this fix is unreliable for mixed Chinese/English content.

Test information:

  • go build -o test-output ./cmd/server && rm -f test-output passes
  • go test ./internal/runtime/executor -run TestEnsureQwenSystemMessage -count=1 fails 3 cases on the PR branch; the same command already fails 1 case on the current baseline, so this change further widens the divergence from the existing test contract in this area

Non-blocking: the newly added comments are in Chinese, while the project convention requires English code comments; please unify them while you are at it.

The first blocking issue has been fixed.
The second blocking issue does not actually apply: the 10000 limit is a number I picked myself, and I have also tried 20000. Besides, that length is only the trigger condition for splitting; the actual cutting is handled by the split function's internal logic. So there is no problem there.

Also, since the approach changed, the test content was adjusted accordingly, and everything has been retested.

Additionally, following the code review feedback, I made a small performance optimization:
// Lightweight struct representing a chat message. Design goals:
// 1. Avoid repeated JSON parsing and serialization (better performance)
// 2. Provide type safety for message fields (checked at compile time)
// 3. Make the code self-documenting (the Go types express the data structure clearly)
// 4. Simplify handling of the dynamic "content" field (which can be either a string or an array)
