Implement IsZero for (), and optimize `IsZero::is_zero` for arrays #148737

zachs18 · 2025-11-09T09:01:54Z

These are probably not super useful optimizations, but they make `vec![expr; LARGE_LENGTH]` perform better for some values of `expr`, e.g.:

array of length zero in debug mode
tuple containing () and zero-valued integers in debug and release mode
array of () or other zero-sized IsZero type in debug mode

very rough benchmarks

use std::time::Instant;
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

struct NonCopyZst;
static COUNTER: AtomicUsize = AtomicUsize::new(0);

impl Clone for NonCopyZst {
    fn clone(&self) -> Self {
        COUNTER.fetch_add(1, Relaxed);
        Self
    }
}


macro_rules! timeit {
    ($e:expr) => {
        let start = Instant::now();
        _ = $e;
        println!("{:56}: {:?}", stringify!($e), start.elapsed());
    };
}

fn main() {
    timeit!(vec![[String::from("hello"); 0]; 1_000_000_000]); // gets a lot better in debug mode
    timeit!(vec![(0u8, (), 0u16); 1_000_000_000]); // gets a lot better in debug *and* release mode
    timeit!(vec![[[(); 37]; 1_000_000_000]; 1_000_000_000]); // gets a lot better in debug mode
    timeit!(vec![[NonCopyZst; 0]; 1_000_000_000]); // gets a lot better in debug mode
    timeit!(vec![[[1u8; 0]; 1_000_000]; 1_000_000]); // gets a little bit better in debug mode
    timeit!(vec![[[(); 37]; 1_000_000]; 1_000_000]); // gets a little bit better in debug mode
    timeit!(vec![[[1u128; 0]; 1_000_000]; 1_000_000]); // gets a little bit better in debug mode

    // check that we don't regress existing optimizations
    timeit!(vec![(0u8, 0u16); 1_000_000_000]); // about the same time
    timeit!(vec![0u32; 1_000_000_000]); // about the same time

    // check that we still call clone for non-IsZero ZSTs
    timeit!(vec![[const { NonCopyZst }; 2]; 1_000]); // about the same time
    assert_eq!(COUNTER.load(Relaxed), 1998);
    timeit!(vec![NonCopyZst; 10_000]); // about the same time
    assert_eq!(COUNTER.load(Relaxed), 1998 + 9_999);
}

$ cargo +nightly run
// ...
vec![[String::from("hello"); 0]; 1_000_000_000]         : 11.13999724s
vec![(0u8, (), 0u16); 1_000_000_000]                    : 5.254646651s
vec![[[(); 37]; 1_000_000_000]; 1_000_000_000]          : 2.738062531s
vec![[NonCopyZst; 0]; 1_000_000_000]                    : 9.483690922s
vec![[[1u8; 0]; 1_000_000]; 1_000_000]                  : 2.919236ms
vec![[[(); 37]; 1_000_000]; 1_000_000]                  : 2.927755ms
vec![[[1u128; 0]; 1_000_000]; 1_000_000]                : 2.931486ms
vec![(0u8, 0u16); 1_000_000_000]                        : 19.46µs
vec![0u32; 1_000_000_000]                               : 9.34µs
vec![[const { NonCopyZst }; 2]; 1_000]                  : 31.88µs
vec![NonCopyZst; 10_000]                                : 36.519µs

$ cargo +dev run
// ...
vec![[String::from("hello"); 0]; 1_000_000_000]         : 4.12µs
vec![(0u8, (), 0u16); 1_000_000_000]                    : 16.299µs
vec![[[(); 37]; 1_000_000_000]; 1_000_000_000]          : 210ns
vec![[NonCopyZst; 0]; 1_000_000_000]                    : 210ns
vec![[[1u8; 0]; 1_000_000]; 1_000_000]                  : 170ns
vec![[[(); 37]; 1_000_000]; 1_000_000]                  : 110ns
vec![[[1u128; 0]; 1_000_000]; 1_000_000]                : 140ns
vec![(0u8, 0u16); 1_000_000_000]                        : 11.56µs
vec![0u32; 1_000_000_000]                               : 10.71µs
vec![[const { NonCopyZst }; 2]; 1_000]                  : 36.08µs
vec![NonCopyZst; 10_000]                                : 73.21µs

(checking release mode to make sure this doesn't regress perf there)

$ cargo +nightly run --release
// ...
vec![[String::from("hello"); 0]; 1_000_000_000]         : 70ns
vec![(0u8, (), 0u16); 1_000_000_000]                    : 1.269457501s
vec![[[(); 37]; 1_000_000_000]; 1_000_000_000]          : 10ns
vec![[NonCopyZst; 0]; 1_000_000_000]                    : 20ns
vec![[[1u8; 0]; 1_000_000]; 1_000_000]                  : 10ns
vec![[[(); 37]; 1_000_000]; 1_000_000]                  : 20ns
vec![[[1u128; 0]; 1_000_000]; 1_000_000]                : 20ns
vec![(0u8, 0u16); 1_000_000_000]                        : 20ns
vec![0u32; 1_000_000_000]                               : 20ns
vec![[const { NonCopyZst }; 2]; 1_000]                  : 2.66µs
vec![NonCopyZst; 10_000]                                : 13.39µs

$ cargo +dev run --release
vec![[String::from("hello"); 0]; 1_000_000_000]         : 90ns
vec![(0u8, (), 0u16); 1_000_000_000]                    : 30ns
vec![[[(); 37]; 1_000_000_000]; 1_000_000_000]          : 20ns
vec![[NonCopyZst; 0]; 1_000_000_000]                    : 30ns
vec![[[1u8; 0]; 1_000_000]; 1_000_000]                  : 20ns
vec![[[(); 37]; 1_000_000]; 1_000_000]                  : 20ns
vec![[[1u128; 0]; 1_000_000]; 1_000_000]                : 20ns
vec![(0u8, 0u16); 1_000_000_000]                        : 30ns
vec![0u32; 1_000_000_000]                               : 20ns
vec![[const { NonCopyZst }; 2]; 1_000]                  : 3.52µs
vec![NonCopyZst; 10_000]                                : 17.13µs

The specific expression where I ran into the perf issue that this PR addresses is `vec![[(); LARGE]; LARGE]`. I was trying to demonstrate `Vec::into_flattened` panicking on length overflow in the playground, but got a timeout error instead, since `vec![[(); LARGE]; LARGE]` took so long to run in debug mode (it runs fine on the playground in release mode).
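For readers who want to reproduce the overflow mentioned above without hitting the slow path, here is a minimal sketch. It assumes a toolchain where `Vec::into_flattened` is stable, and it uses an outer length of 2 rather than a large one so that only one array clone is needed. The function names are made up for illustration.

```rust
use std::panic;

/// Flattening a small Vec of zero-sized arrays: the lengths simply multiply.
fn flattened_len_small() -> usize {
    let nested: Vec<[(); 3]> = vec![[(); 3]; 4];
    nested.into_flattened().len()
}

/// Flattening two arrays of `usize::MAX` unit values: the resulting length
/// (2 * usize::MAX) cannot be represented, so `into_flattened` panics.
/// Returns `true` if the panic was observed.
fn flatten_overflow_panics() -> bool {
    panic::catch_unwind(|| {
        let nested: Vec<[(); usize::MAX]> = vec![[(); usize::MAX]; 2];
        nested.into_flattened().len()
    })
    .is_err()
}

fn main() {
    println!("small flattened length: {}", flattened_len_small());
    println!("overflow panicked: {}", flatten_overflow_panics());
}
```

The panic message is also printed to stderr by the default panic hook. Keeping the outer length at 2 sidesteps the debug-mode slowness described above, since the cost there comes from repeating the clone a very large number of times.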

rustbot · 2025-11-09T09:01:58Z

r? @joboet

rustbot has assigned @joboet.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

joboet · 2025-11-09T11:06:28Z

library/alloc/src/vec/is_zero.rs

+            // We could probably just return `true` here, since implementing
+            // `IsZero` for a zero-sized type such that `self.is_zero()` returns
+            // `false` would be useless, but to be safe we check anyway.


I thought so too at first, but that would conflict with the above implementation – any [T; N] now implements IsZero, so e.g. [NonTrivialCloneButZST; 5] would hit this code path but mustn't be zero-initialised.

This is in the specialization where T: IsZero, so this would not be run for T = NonTrivialCloneButZST, right?

Oh, I see what you mean, if T = [NonTrivialCloneButZST; 5], it would be wrong to return true unconditionally.

I updated the comment to explain why we can't just return true unconditionally.
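The behaviour this thread is protecting can be observed from user code. The sketch below is not the PR's code: `NonTrivialCloneButZst`, `ELEM`, and `clones_for_vec_of_arrays` are illustrative names, and the clone counts assume that `vec![elem; n]` clones `elem` `n - 1` times, as the PR's own benchmark asserts. It shows that a zero-sized element with a side-effecting `Clone` must still be cloned, which is why the array impl cannot return `true` unconditionally for zero-sized element types.

```rust
use std::cell::Cell;

thread_local! {
    // Per-thread counter so that concurrent callers do not disturb each other.
    static CLONES: Cell<usize> = Cell::new(0);
}

/// Zero-sized type whose `Clone` impl has an observable side effect.
struct NonTrivialCloneButZst;

impl Clone for NonTrivialCloneButZst {
    fn clone(&self) -> Self {
        CLONES.with(|c| c.set(c.get() + 1));
        NonTrivialCloneButZst
    }
}

// A const item lets the array-repeat expression work without `Copy`.
const ELEM: NonTrivialCloneButZst = NonTrivialCloneButZst;

/// Builds `vec![[ELEM; 5]; len]` and returns how many element clones ran.
fn clones_for_vec_of_arrays(len: usize) -> usize {
    CLONES.with(|c| c.set(0));
    let v = vec![[ELEM; 5]; len];
    assert_eq!(v.len(), len);
    CLONES.with(|c| c.get())
}

fn main() {
    // Two array clones of five elements each are expected for len == 3.
    println!("element clones for len 3: {}", clones_for_vec_of_arrays(3));
}
```

If the outer `vec!` were ever treated as all-zero and built from zeroed memory, the counter would stay at 0 and the side effect would be lost.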

@rustbot ready

Implement IsZero for (). Implement default `IsZero` for all arrays, only returning true if the array is empty (making the existing array impl for `IsZero` elements a specialization). Optimize `IsZero::is_zero` for arrays of zero-sized `IsZero` elements.
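As a rough illustration of the commit message above, here is a stable-Rust model of the idea. It is an assumption-laden sketch, not the code in `library/alloc/src/vec/is_zero.rs`: the real `IsZero` is a private `unsafe` trait that relies on specialization, while the trait below is a safe, hypothetical stand-in. Because stable Rust has no specialization, the "default impl for all arrays" part is left out and only the `T: IsZero` case is modelled.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the private `IsZero` trait in `alloc`.
trait IsZero {
    /// Returns true if the value is represented by all-zero bytes.
    fn is_zero(&self) -> bool;
}

macro_rules! impl_is_zero_int {
    ($($t:ty),*) => {$(
        impl IsZero for $t {
            fn is_zero(&self) -> bool { *self == 0 }
        }
    )*};
}
impl_is_zero_int!(u8, u16, u32);

// The unit type has no bytes at all, so it is trivially "zero".
impl IsZero for () {
    fn is_zero(&self) -> bool { true }
}

// A tuple is zero when every field is zero; with `()` covered above,
// `(0u8, (), 0u16)` now qualifies.
impl<A: IsZero, B: IsZero, C: IsZero> IsZero for (A, B, C) {
    fn is_zero(&self) -> bool {
        self.0.is_zero() && self.1.is_zero() && self.2.is_zero()
    }
}

impl<T: IsZero, const N: usize> IsZero for [T; N] {
    fn is_zero(&self) -> bool {
        if size_of::<T>() == 0 {
            // Zero-sized elements carry no data, so inspecting at most one
            // of them is enough; an empty array counts as zero.
            self.first().map_or(true, |x| x.is_zero())
        } else {
            self.iter().all(|x| x.is_zero())
        }
    }
}

fn main() {
    println!("{}", (0u8, (), 0u16).is_zero());
    println!("{}", [[(); 37]; 1000].is_zero());
    println!("{}", [0u32, 7, 0].is_zero());
}
```

The zero-sized branch still consults the element's own `is_zero` rather than returning `true` outright, in line with the review discussion above.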

joboet · 2025-11-10T08:18:50Z

Thanks!
@bors r+ rollup=never

bors · 2025-11-10T08:18:53Z

📌 Commit 0aaa3ae has been approved by joboet

It is now in the queue for this repository.

bors · 2025-11-10T08:18:53Z

🌲 The tree is currently closed for pull requests below priority 100. This pull request will be tested once the tree is reopened.

tamird · 2025-11-10T15:46:37Z

library/alloc/src/vec/is_zero.rs

    };
 }

+impl_is_zero!((), |_: ()| true); // It is needed to impl for arrays and tuples of ().


fwiw this can just be |()| rather than |_: ()|
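A small standalone sketch of the two closure spellings from this suggestion, outside the `impl_is_zero!` macro. The function names are made up for illustration. Both closures accept a unit value and behave identically; the second destructures the argument with the `()` pattern, so no type annotation is needed.

```rust
/// Closure with an ignored, explicitly typed parameter, as in the reviewed line.
fn unit_check_annotated(value: ()) -> bool {
    let is_zero = |_: ()| true;
    is_zero(value)
}

/// Closure that matches the unit value with the `()` pattern instead.
fn unit_check_pattern(value: ()) -> bool {
    let is_zero = |()| true;
    is_zero(value)
}

fn main() {
    println!("{}", unit_check_annotated(()));
    println!("{}", unit_check_pattern(()));
}
```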

bors · 2025-11-11T02:44:41Z

⌛ Testing commit 0aaa3ae with merge c8f22ca...

bors · 2025-11-11T05:56:07Z

☀️ Test successful - checks-actions
Approved by: joboet
Pushing c8f22ca to main...

github-actions · 2025-11-11T05:59:20Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 29a6971 (parent) -> c8f22ca (this PR)

Test differences

4 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard c8f22ca269a1f2653ac962fe2bc21105065fd6cd --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

dist-aarch64-apple: 5794.6s -> 7723.5s (+33.3%)
dist-x86_64-apple: 8583.0s -> 10939.3s (+27.5%)
aarch64-apple: 9396.5s -> 7079.7s (-24.7%)
dist-powerpc-linux: 4875.5s -> 5715.9s (+17.2%)
dist-various-1: 3748.5s -> 4189.4s (+11.8%)
x86_64-rust-for-linux: 3046.0s -> 2760.5s (-9.4%)
x86_64-gnu-nopt: 6864.0s -> 7432.5s (+8.3%)
aarch64-gnu-llvm-20-1: 3862.6s -> 3552.7s (-8.0%)
dist-aarch64-msvc: 5636.3s -> 6075.9s (+7.8%)
x86_64-gnu-miri: 4975.2s -> 4598.3s (-7.6%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-11-11T07:44:48Z

Finished benchmarking commit (c8f22ca): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.9%	[-0.9%, -0.9%]	1
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.4%	[4.4%, 4.4%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.3%	[-4.0%, -0.5%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.1%	[-4.0%, 4.4%]	3

Cycles

Results (secondary -2.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.1%	[-2.1%, -2.1%]	1
All ❌✅ (primary)	-	-	0

Binary size

Results (primary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 0.1%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.1%	[-0.1%, -0.1%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.0%	[-0.1%, 0.1%]	5

Bootstrap: 477.532s -> 476.181s (-0.28%)
Artifact size: 391.38 MiB -> 391.36 MiB (-0.00%)

rustbot assigned joboet Nov 9, 2025

rustbot added the S-waiting-on-review and T-libs labels Nov 9, 2025

joboet requested changes Nov 9, 2025

rustbot added the S-waiting-on-author label and removed the S-waiting-on-review label Nov 9, 2025

zachs18 force-pushed the unit-is-zero branch from fc8c039 to 0aaa3ae November 9, 2025 17:29

rustbot added the S-waiting-on-review label and removed the S-waiting-on-author label Nov 9, 2025

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Nov 10, 2025

tamird reviewed Nov 10, 2025

bors added the merged-by-bors label Nov 11, 2025

bors merged commit c8f22ca into rust-lang:main Nov 11, 2025
12 checks passed

rustbot added this to the 1.93.0 milestone Nov 11, 2025
