fast-normalize-spaces
$ yarn add @shelf/fast-normalize-spaces
const {normalizeSpaces} = require('@shelf/fast-normalize-spaces');
normalizeSpaces(' hello \n\n\n \n \n \t world ');
// 'hello world'
Benchmarks reuse the same pool of 45 worst-case scenarios that cover multilingual text, surrogate pairs, HTML-like tokens, and the full 2018 Unicode whitespace set.
Scenario | normalize-space-x | @shelf/fast-normalize-spaces | Speedup |
---|---|---|---|
~33 KB | 2,772 ops/s, ±0.22% | 15,880 ops/s, ±0.34% | ~5.7x |
~330 KB | 270 ops/s, ±0.50% | 1,539 ops/s, ±1.49% | ~5.7x |
~3.3 MB | 20 ops/s, ±1.62% | 152 ops/s, ±0.36% | ~7.6x |
~33 MB | 2 ops/s, ±6.05% | 16 ops/s, ±0.76% | ~8.0x |
You can run yarn benchmark:speed to test on your own machine.
Text size (UTF-8) | normalize-space-x peak memory | @shelf/fast-normalize-spaces peak memory | Improvement |
---|---|---|---|
~33 MB (34,603,010 bytes) | 74.69 MB | 25.31 MB | ~2.95x less |
The larger the string, the bigger the gap. Memory usage stays close to the size of the input buffer.
September 2025 improvements were delivered autonomously by the gpt-5-codex model. We treated the normalization routine like any critical-path service and tightened the slowest sections:
- Smarter lookup table – precomputes the Unicode whitespace bitmap by iterating only over the relevant code points, keeping startup cost small and lookups cache-friendly.
- Single-pass whitespace collapse – streams over the text once and writes normalized characters immediately, eliminating the prior buffer-wide fill and cutting per-call writes by roughly half (see the sketch after this list).
- Early return for clean inputs – detects unchanged strings and returns them as-is, removing allocations when input already meets expectations.
- Lean buffer management – trims trailing whitespace in place, which dropped peak RSS from ~44 MB to ~25 MB on the 33 MB payloads.
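The sketch below shows how these pieces can fit together: a precomputed whitespace lookup table, one streaming pass, and an early return for unchanged input. It is a minimal illustration, not the package's actual source; the names WHITESPACE_CODE_POINTS, isWhitespace, and collapseWhitespace are illustrative.

// Illustrative whitespace code points (the library targets the full Unicode whitespace set).
const WHITESPACE_CODE_POINTS = [
  0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20, 0x85, 0xa0,
  0x1680, 0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
  0x2007, 0x2008, 0x2009, 0x200a, 0x2028, 0x2029, 0x202f, 0x205f, 0x3000,
];

// Precompute the lookup table by touching only the relevant code points,
// so startup stays cheap and lookups are O(1).
const isWhitespace = new Uint8Array(0x3000 + 1);
for (const codePoint of WHITESPACE_CODE_POINTS) {
  isWhitespace[codePoint] = 1;
}

function collapseWhitespace(input) {
  let out = '';
  let pendingSpace = false;

  // Single pass: write normalized characters as they are seen.
  for (let i = 0; i < input.length; i++) {
    const code = input.charCodeAt(i);

    if (code <= 0x3000 && isWhitespace[code]) {
      // A run of whitespace becomes at most one pending space;
      // leading whitespace never emits because `out` is still empty.
      pendingSpace = out.length > 0;
    } else {
      if (pendingSpace) {
        out += ' ';
        pendingSpace = false;
      }
      out += input[i];
    }
  }

  // Trailing whitespace stays "pending" and is dropped. If nothing changed,
  // return the original string to avoid handing back an extra copy.
  return out === input ? input : out;
}

console.log(collapseWhitespace(' hello \n\n\n \n \n \t world ')); // 'hello world'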
The result is a jump from ~10k ops/s to 15.8k ops/s on 33 KB payloads and 5.7–8.0× gains over normalize-space-x, with memory use reduced nearly threefold.
Set a custom payload size by exporting TEXT_SIZE (in bytes) when running the memory benchmark:
TEXT_SIZE=$((10 * 1024 * 1024)) yarn benchmark:memory
$ git checkout master
$ yarn version
$ yarn publish
$ git push origin master --tags
MIT © Shelf