
[0023] Optimize utf8-string-set! performance #787

Merged
da-liii merged 3 commits into main from da/0023/unicode
May 11, 2026

@da-liii da-liii commented May 11, 2026

Summary

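  • Replace the string->utf8 + bytevector-copy + bytevector-append + utf8->string pipeline in utf8-string-set! with a more efficient implementation
  • Equal-length replacement modifies the bytevector directly, with zero extra allocation
  • Unequal-length replacement uses make-bytevector + a manual fill loop, reducing temporary objects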
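
For context, the copy-and-append pipeline that the summary names as the starting point can be sketched in portable R7RS Scheme. This is an illustrative reconstruction, not the code the PR removed: `utf8-replace/baseline`, `utf8-seq-length`, and `utf8-char-offset` are hypothetical names, the input is assumed to be valid UTF-8 with `k` in range, and the sketch returns a new string instead of mutating its argument.

```scheme
(import (scheme base))

;; Hypothetical helper: byte length of the UTF-8 sequence whose lead byte is b.
;; Assumes b really is a lead byte of valid UTF-8.
(define (utf8-seq-length b)
  (cond ((< b #x80) 1)
        ((< b #xE0) 2)
        ((< b #xF0) 3)
        (else 4)))

;; Hypothetical helper: byte offset of the k-th code point (0-based) in bv.
(define (utf8-char-offset bv k)
  (let loop ((i 0) (n 0))
    (if (= n k)
        i
        (loop (+ i (utf8-seq-length (bytevector-u8-ref bv i)))
              (+ n 1)))))

;; Baseline: every call allocates the encoded input, two slices,
;; the encoded replacement, the appended result, and the final string.
(define (utf8-replace/baseline str k ch)
  (let* ((bv (string->utf8 str))
         (start (utf8-char-offset bv k))
         (end (+ start (utf8-seq-length (bytevector-u8-ref bv start)))))
    (utf8->string
     (bytevector-append (bytevector-copy bv 0 start)
                        (string->utf8 (string ch))
                        (bytevector-copy bv end)))))

;; Usage: replace the first character of an ASCII string.
(utf8-replace/baseline "hello" 0 #\J) ; => "Jello"
```

The number of intermediate objects per call is what the remaining bullet points aim to reduce.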

Key changes

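  1. string->byte-vector replaces string->utf8: eliminates the UTF-8 validation loop and a redundant copy
  2. byte-vector->string replaces utf8->string: eliminates the UTF-8 validation loop
  3. Equal-length replacement modifies the original bytevector in place (the most common case)
  4. Unequal-length replacement uses make-bytevector + a manual loop (fewer temporary objects than copy + append)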
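
A minimal sketch of changes 3 and 4, written against R7RS bytevectors only. It is not the merged implementation: `utf8-bytes-replace!` and its helpers are hypothetical names, the input is assumed to be valid UTF-8 with `k` in range, and the string->byte-vector / byte-vector->string conversions from changes 1 and 2 are left out because they are not part of R7RS.

```scheme
(import (scheme base))

;; Hypothetical helper: byte length of the UTF-8 sequence whose lead byte is b.
(define (utf8-seq-length b)
  (cond ((< b #x80) 1)
        ((< b #xE0) 2)
        ((< b #xF0) 3)
        (else 4)))

;; Hypothetical helper: byte offset of the k-th code point (0-based) in bv.
(define (utf8-char-offset bv k)
  (let loop ((i 0) (n 0))
    (if (= n k)
        i
        (loop (+ i (utf8-seq-length (bytevector-u8-ref bv i)))
              (+ n 1)))))

;; Replace the k-th code point of bv with the UTF-8 bytes in new.
;; Equal length: overwrite in place and return bv itself.
;; Different length: allocate one result bytevector and fill it by hand.
(define (utf8-bytes-replace! bv k new)
  (let* ((start (utf8-char-offset bv k))
         (old-len (utf8-seq-length (bytevector-u8-ref bv start)))
         (new-len (bytevector-length new)))
    (if (= old-len new-len)
        (do ((i 0 (+ i 1)))
            ((= i new-len) bv)
          (bytevector-u8-set! bv (+ start i) (bytevector-u8-ref new i)))
        (let* ((total (bytevector-length bv))
               (tail-src (+ start old-len))
               (tail-dst (+ start new-len))
               (out (make-bytevector (+ (- total old-len) new-len) 0)))
          ;; Prefix before the replaced character.
          (do ((i 0 (+ i 1))) ((= i start))
            (bytevector-u8-set! out i (bytevector-u8-ref bv i)))
          ;; The replacement bytes.
          (do ((i 0 (+ i 1))) ((= i new-len))
            (bytevector-u8-set! out (+ start i) (bytevector-u8-ref new i)))
          ;; Suffix after the replaced character.
          (do ((i tail-src (+ i 1))) ((= i total))
            (bytevector-u8-set! out (+ tail-dst (- i tail-src))
                                (bytevector-u8-ref bv i)))
          out))))

;; Usage: the equal-length case returns the same (mutated) bytevector.
(define buf (string->utf8 "hello"))
(utf8->string (utf8-bytes-replace! buf 0 (string->utf8 "J"))) ; => "Jello"
```

In the equal-length branch nothing is allocated; in the other branch the only new object is the result bytevector, which is the trade-off item 4 describes relative to copy + append.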

Performance comparison

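| Scenario | Length | Position | Original | Optimized | Speedup |
|---|---|---|---|---|---|
| Same length, ASCII→ASCII | 10 | start | 0.949s | 0.085s | ~11x |
| Same length, ASCII→ASCII | 10 | middle | 1.213s | 0.319s | ~3.8x |
| Same length, ASCII→ASCII | 10000 | middle | 0.918s | 0.215s | ~4.3x |
| Different length, ASCII→Chinese | 10 | start | 1.128s | 0.276s | ~4.1x |
| Different length, Chinese→ASCII | 10 | middle | 1.932s | 0.604s | ~3.2x |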
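
Timings of this kind can be collected with a small loop around the operation under test. The sketch below uses the R7RS `(scheme time)` library and is not the benchmark script from this PR: `time-iterations` and `speedup` are hypothetical helpers, and the iteration count and the timed operation are arbitrary placeholders.

```scheme
(import (scheme base) (scheme time) (scheme write))

;; Hypothetical helper: call thunk n times and return the elapsed seconds.
(define (time-iterations n thunk)
  (let ((start (current-jiffy)))
    (do ((i 0 (+ i 1))) ((= i n))
      (thunk))
    (/ (- (current-jiffy) start)
       (inexact (jiffies-per-second)))))

;; Hypothetical helper: speedup factor as original time over optimized time.
(define (speedup original optimized)
  (/ original optimized))

;; Placeholder workload; substitute the operation being measured.
(display (time-iterations 100000 (lambda () (string->utf8 "0123456789"))))
(newline)

;; First table row: 0.949s against 0.085s, roughly 11.2, reported as ~11x.
(display (speedup 0.949 0.085))
(newline)
```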

Test plan

  • bin/gf tests/liii/unicode/utf8-string-set-bang-test.scm: all 21 tests pass
  • bin/gf tests/liii/unicode/: all 25 unicode tests pass, zero regressions
  • bin/gf bench/utf8-string-set-bang.scm: the benchmark confirms the performance improvement

🤖 Generated with Claude Code

Da Shen and others added 3 commits May 11, 2026 16:47
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@da-liii da-liii merged commit 23bd93f into main May 11, 2026
4 checks passed
@da-liii da-liii deleted the da/0023/unicode branch May 11, 2026 09:32