@bonega bonega commented Jan 6, 2026

Summary

  • Replace raw pointer manipulation in encoder.rs with the safe Vec::with_capacity + extend + flatten pattern
  • Remove 2 unsafe blocks (reducing the total from 18 to 16)
  • Keep performance equivalent or better

Benchmark Results

The mostly_ascii path is ~40% faster due to better iterator optimization.

Approach

let mut res = Vec::with_capacity(len);
res.extend(
    (0..len)
        .map(|_| self.encode_grapheme(&mut src).or(fallback))
        .flatten(),
);

The key insight is that extend() uses the iterator's size hint to avoid per-element capacity checks when capacity is pre-allocated.
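
For readers who want to try the pattern in isolation, here is a minimal, self-contained sketch. The helper `encode_one` and the wrapper `encode_all` are hypothetical stand-ins invented for illustration (an ASCII-only encoder with an optional fallback byte); they are not the crate's actual `encode_grapheme` or its signature. The shape of the `with_capacity` + `extend` + `flatten` call matches the snippet above.

```rust
/// Hypothetical stand-in for the per-grapheme encoding step: consumes one
/// char from `src` and returns its byte if it is ASCII, otherwise `None`.
fn encode_one(src: &mut std::str::Chars<'_>) -> Option<u8> {
    src.next().filter(|c| c.is_ascii()).map(|c| c as u8)
}

/// Encodes `input`, substituting `fallback` for unencodable chars, or
/// dropping them when `fallback` is `None`. (Illustrative wrapper only.)
fn encode_all(input: &str, fallback: Option<u8>) -> Vec<u8> {
    let len = input.chars().count();
    let mut src = input.chars();

    // Reserve room for the worst case: one output byte per input char.
    let mut res = Vec::with_capacity(len);
    res.extend(
        (0..len)
            // Each step yields Option<u8>; `or` swaps in the fallback.
            .map(|_| encode_one(&mut src).or(fallback))
            // Flattening Option<u8> yields zero or one byte per step.
            .flatten(),
    );
    res
}
```

Because each step produces at most one element, the vector never holds more than `len` items, so the up-front allocation is sufficient and no `unsafe` pointer writes are needed. With `fallback = Some(b'?')`, an input such as "héllo" encodes to the bytes of "h?llo"; with `None`, the unencodable character is skipped.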

Test plan

  • All existing tests pass
  • Benchmarks show no regression

Replace raw pointer manipulation with safe Vec::with_capacity +
extend + flatten pattern. This removes 2 unsafe blocks while
maintaining equivalent performance (within noise margin for large
inputs, ~40% faster for mostly-ASCII inputs).

The key insight is that extend() can use the iterator's size hint
to avoid per-element capacity checks when capacity is pre-allocated.
@bonega bonega enabled auto-merge (squash) January 6, 2026 21:16
@bonega bonega merged commit 1fa8021 into master Jan 6, 2026
4 checks passed
@bonega bonega deleted the remove-encoder-unsafe branch January 7, 2026 22:02