@@ -50,12 +50,10 @@ the local system. Specifying the `DESTDIR` environment variable will allow you t
DESTDIR=/my/custom/path make install
```

- You'll need to adjust if you want to optimize with [feature flags](Cargo.toml).
-
## Usage

- Add `crc-fast = version = "1.3"` to your `Cargo.toml` dependencies, which will enable every available optimization for
- the `stable` toolchain. Adjust as necessary for your desired [acceleration targets](#acceleration-targets).
+ Add `crc-fast = version = "1.5"` to your `Cargo.toml` dependencies, which will enable every available optimization for
+ the `stable` toolchain.

### Digest

@@ -312,22 +310,7 @@ but all known public & private implementations agree on the correct value, which
# Acceleration targets

This library has baseline support for accelerating all known `CRC-32` and `CRC-64` variants on `aarch64`, `x86_64`, and
- `x86` internally in pure `Rust`. It's extremely fast (up to dozens of GiB/s) by default if no feature flags are
- used.
-
- ### tl;dr: Just tell me how to turn it up to 11! 🤘
-
- For `aarch64` and older `x86_64` systems, the release build will use the best available acceleration:
-
- ```
- cargo build --release
- ```
-
- For modern `x86_64` systems, you can enable [experimental VPCLMULQDQ support](#experimental-vpclmulqdq-support-in-rust)
- for a ~2X performance boost.
-
- At [Awesome](https://awesome.co/), we use these 👆 at large scale in production at [Flickr](https://flickr.com/) and
- [SmugMug](https://www.smugmug.com/).
+ `x86` internally in pure `Rust`.

### Checking your platform capabilities
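
As a rough illustration of what a capability check can look like, the sketch below uses only the Rust standard library's runtime feature-detection macros. It is not the library's own `arch-check` logic: the function name `detect_clmul_tier` and the returned labels are hypothetical, chosen to loosely mirror the "Target" column in the performance tables, and the AVX-512 tier is deliberately left out.

```rust
/// Returns a coarse label for the carry-less-multiply support detected on
/// the current CPU. The labels are hypothetical and only loosely mirror the
/// "Target" column of the performance tables; this is not the crate's code.
fn detect_clmul_tier() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // PCLMULQDQ provides the 64x64 carry-less multiply used for folding.
        if std::arch::is_x86_feature_detected!("pclmulqdq")
            && std::arch::is_x86_feature_detected!("sse4.1")
        {
            return "sse-pclmulqdq";
        }
    }

    #[cfg(target_arch = "aarch64")]
    {
        // On aarch64, the "aes" feature covers PMULL; "sha3" covers EOR3.
        let has_pmull = std::arch::is_aarch64_feature_detected!("aes");
        let has_eor3 = std::arch::is_aarch64_feature_detected!("sha3");
        if has_pmull && has_eor3 {
            return "neon-eor3-pclmulqdq";
        }
        if has_pmull {
            return "neon-pclmulqdq";
        }
    }

    // No hardware carry-less multiply detected (or an unsupported arch).
    "software-fallback"
}

fn main() {
    println!("detected tier: {}", detect_clmul_tier());
}
```

The detection runs at runtime, so a single binary can report different tiers on different machines.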
@@ -344,41 +327,23 @@ cargo run arch-check
cargo build --release
```

- ### Experimental VPCLMULQDQ support in Rust
-
- This library also supports [VPCLMULQDQ](https://en.wikichip.org/wiki/x86/vpclmulqdq) for accelerating all `CRC-32` and
- `CRC-64` variants on modern `x86_64`
- platforms which support it when using `nightly` builds and the `vpclmulqdq` feature flag.
-
- Typical performance boosts are ~2X, and they apply to CPUs beginning with Intel
- [Ice Lake](https://en.wikipedia.org/wiki/Ice_Lake_%28microprocessor%29) (Sep 2019) and
- AMD [Zen4](https://en.wikipedia.org/wiki/Zen_4) (Sep 2022).
-
- ```
- rustup toolchain install nightly
- cargo +nightly build --release --features=vpclmulqdq
- ```
-
- `AVX512` support with `VPCLMULQDQ` is stabilized on [1.89.0](https://releases.rs/docs/1.89.0/), so once that becomes
- stable in August 2025, this library will be updated to use it by default without needing the `nightly` toolchain.
-
## Performance

Modern systems can exceed 100 GiB/s for calculating `CRC-32/ISCSI`, `CRC-32/ISO-HDLC`,
`CRC-64/NVME`, and all other reflected variants. (Forward variants are slower, due to the extra shuffle-masking, but
are still extremely fast in this library).

- This is a summary of the best [targets](#acceleration-targets) for the most important and popular CRC checksums.
+ This is a summary of the performance for the most important and popular CRC checksums.
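
For readers who want to sanity-check figures like these on their own hardware, here is a minimal sketch of how a GiB/s number can be derived from a timed loop. It is not the benchmark harness behind the tables: the helper names `gib_per_sec` and `measure` are hypothetical, and a plain byte sum stands in for a real CRC function.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Converts a byte count and elapsed time into GiB/s (1 GiB = 2^30 bytes).
fn gib_per_sec(bytes: u64, elapsed: Duration) -> f64 {
    bytes as f64 / (1u64 << 30) as f64 / elapsed.as_secs_f64()
}

/// Runs `f` over `buf` `iterations` times and returns the observed GiB/s.
fn measure<F: FnMut(&[u8]) -> u64>(buf: &[u8], iterations: u32, mut f: F) -> f64 {
    let start = Instant::now();
    let mut sink = 0u64;
    for _ in 0..iterations {
        // black_box keeps the optimizer from hoisting or eliding the work.
        sink ^= f(black_box(buf));
    }
    black_box(sink);
    gib_per_sec(buf.len() as u64 * u64::from(iterations), start.elapsed())
}

fn main() {
    let buf = vec![0xA5u8; 1024 * 1024]; // 1 MiB, as in the "1MiB" column
    // Placeholder workload: swap in a real checksum function here.
    let rate = measure(&buf, 16, |data| data.iter().map(|&b| u64::from(b)).sum());
    println!("{rate:.2} GiB/s");
}
```

A single short run like this is noisy. A dedicated benchmarking tool with warm-up and repeated samples would give steadier numbers.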

### CRC-32/ISCSI (reflected)

AKA ` crc32c ` in many, but not all, implementations.
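
To make "reflected" concrete, the following is a bit-at-a-time reference for CRC-32/ISCSI using its standard published parameters (reflected polynomial `0x82F63B78`, initial value and final XOR of `0xFFFFFFFF`). It is only a slow cross-checking aid with a hypothetical function name, not how this library computes the checksum.

```rust
/// Bit-at-a-time CRC-32/ISCSI (CRC-32C). "Reflected" means bits are
/// processed least-significant first, so the register shifts right and the
/// polynomial is stored bit-reversed.
fn crc32_iscsi_bitwise(data: &[u8]) -> u32 {
    const POLY_REFLECTED: u32 = 0x82F6_3B78;
    let mut crc = 0xFFFF_FFFFu32; // initial value
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (POLY_REFLECTED & mask);
        }
    }
    !crc // final XOR with 0xFFFFFFFF
}

fn main() {
    // Standard check input for CRC catalogues.
    println!("{:#010x}", crc32_iscsi_bitwise(b"123456789"));
}
```

Comparing an accelerated result against a reference like this on the check string `123456789` is a quick way to confirm which variant an implementation calling itself `crc32c` actually computes.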

| Arch | Brand | CPU | System | Target | 1KiB (GiB/s) | 1MiB (GiB/s) |
|:--------|:------|:----------------|:--------------------------|:--------------------|-------------:|-------------:|
- | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq* | ~49 | ~111 |
+ | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq | ~49 | ~111 |
| x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | sse-pclmulqdq | ~18 | ~52 |
- | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq* | ~23 | ~54 |
+ | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq | ~23 | ~54 |
| x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | sse-pclmulqdq | ~11 | ~20 |
| aarch64 | AWS | Graviton4 | EC2 c8g.metal-48xl | neon-eor3-pclmulqdq | ~19 | ~39 |
| aarch64 | AWS | Graviton2 | EC2 c6g.metal | neon-pclmulqdq | ~10 | ~17 |
@@ -391,9 +356,9 @@ AKA `crc32` in many, but not all, implementations.

| Arch | Brand | CPU | System | Target | 1KiB (GiB/s) | 1MiB (GiB/s) |
|:--------|:------|:----------------|:--------------------------|:--------------------|-------------:|-------------:|
- | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-248xl | avx512-vpclmulqdq* | ~24 | ~110 |
+ | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-248xl | avx512-vpclmulqdq | ~24 | ~110 |
| x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-248xl | sse-pclmulqdq | ~21 | ~28 |
- | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq* | ~24 | ~55 |
+ | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq | ~24 | ~55 |
| x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | sse-pclmulqdq | ~12 | ~14 |
| aarch64 | AWS | Graviton4 | EC2 c8g.metal-48xl | neon-eor3-pclmulqdq | ~19 | ~39 |
| aarch64 | AWS | Graviton2 | EC2 c6g.metal | neon-pclmulqdq | ~10 | ~17 |
@@ -406,9 +371,9 @@ AKA `crc32` in many, but not all, implementations.

| Arch | Brand | CPU | System | Target | 1KiB (GiB/s) | 1MiB (GiB/s) |
|:--------|:------|:----------------|:--------------------------|:--------------------|-------------:|-------------:|
- | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq* | ~25 | ~110 |
+ | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq | ~25 | ~110 |
| x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | sse-pclmulqdq | ~21 | ~28 |
- | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq* | ~25 | ~55 |
+ | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq | ~25 | ~55 |
| x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | sse-pclmulqdq | ~11 | ~14 |
| aarch64 | AWS | Graviton4 | EC2 c8g.metal-48xl | neon-eor3-pclmulqdq | ~20 | ~37 |
| aarch64 | AWS | Graviton2 | EC2 c6g.metal | neon-pclmulqdq | ~10 | ~16 |
@@ -419,9 +384,9 @@ AKA `crc32` in many, but not all, implementations.

| Arch | Brand | CPU | System | Target | 1KiB (GiB/s) | 1MiB (GiB/s) |
|:--------|:------|:----------------|:--------------------------|:--------------------|-------------:|-------------:|
- | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq* | ~23 | ~56 |
+ | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq | ~23 | ~56 |
| x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | sse-pclmulqdq | ~19 | ~28 |
- | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq* | ~21 | ~43 |
+ | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq | ~21 | ~43 |
| x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | sse-pclmulqdq | ~11 | ~13 |
| aarch64 | AWS | Graviton4 | EC2 c8g.metal-48xl | neon-eor3-pclmulqdq | ~16 | ~32 |
| aarch64 | AWS | Graviton2 | EC2 c6g.metal | neon-pclmulqdq | ~9 | ~14 |
@@ -432,16 +397,15 @@ AKA `crc32` in many, but not all, implementations.

| Arch | Brand | CPU | System | Target | 1KiB (GiB/s) | 1MiB (GiB/s) |
|:--------|:------|:----------------|:--------------------------|:--------------------|-------------:|-------------:|
- | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq* | ~24 | ~56 |
+ | x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | avx512-vpclmulqdq | ~24 | ~56 |
| x86_64 | Intel | Sapphire Rapids | EC2 c7i.metal-24xl | sse-pclmulqdq | ~19 | ~28 |
- | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq* | ~21 | ~43 |
+ | x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | avx512-vpclmulqdq | ~21 | ~43 |
| x86_64 | AMD | Genoa | EC2 c7a.metal-48xl | sse-pclmulqdq | ~11 | ~13 |
| aarch64 | AWS | Graviton4 | EC2 c8g.metal-48xl | neon-eor3-pclmulqdq | ~18 | ~31 |
| aarch64 | AWS | Graviton2 | EC2 c6g.metal | neon-pclmulqdq | ~9 | ~14 |
| aarch64 | Apple | M3 Ultra | Mac Studio (32 core) | neon-eor3-pclmulqdq | ~40 | ~59 |
| aarch64 | Apple | M4 Max | MacBook Pro 16" (16 core) | neon-eor3-pclmulqdq | ~46 | ~61 |

- \* = [Experimental VPCLMULQDQ support in Rust](#experimental-vpclmulqdq-support-in-rust) is enabled.

## Other CRC widths
