
Let rand::rng() use OsRng unless thread_rng is enabled? #1545

Open
dhardy opened this issue Dec 27, 2024 · 7 comments
Labels
E-question Participation: opinions wanted

Comments

@dhardy
Member

dhardy commented Dec 27, 2024

ThreadRng adds some thread-local memory. In case not all users want this, we could add a thread_rng feature flag (probably default-enabled). We have a thread_rng feature in v0.9.

Where the thread_rng feature is not enabled, rand::rngs::ThreadRng would not be available. rand::rng() could instead use OsRng (assuming that getrandom is available), thus the thread_rng feature would mostly be a performance boost (at the cost of extra code and thread-local memory).

@dhardy dhardy added the E-question Participation: opinions wanted label Dec 27, 2024
@newpavlov
Member

Personally, I would be somewhat surprised by such behavior, so I lean slightly against it. Though it's not a strong opinion, so I would be fine with such a change.

@dhardy dhardy mentioned this issue Jan 3, 2025
@pinkforest

pinkforest commented Jan 28, 2025

FWIW, IMO all features should be deterministic and additive: enabling one should not change the behaviour of any API surface under the hood.

This can cause issues if someone expects one specific RNG but is silently given another, since a single transitive dependency can change the behaviour for everyone: features are unified across the dependency graph rather than bifurcated per crate.

We went through quite an exercise on this, weighing features against cfgs.

I personally would like explicit selection and nothing else, or at least deterministic overrides, given that it only takes a single transitive dependency to change the global behaviour (features do not bifurcate between crates).

Also, with something like cfg(rng = "..") the top-level binary can override the default. I use a similar impl override to drive the storage for my io_uring completions: the implementation can be switched at will by the top-level binary, which typically also owns the environment.

@dhardy
Member Author

dhardy commented Jan 28, 2025

In this case, the behaviour change would be mostly unobservable, which is the only reason I would consider it. Nevertheless, it is possibly observable through side-effects:

  • Faster performance
  • Extra state usage (without memory protection)

getrandom has a similar getrandom_backend cfg. We could add a cfg like rand_rng=ThreadRng / rand_rng=OsRng, but the default should likely be ThreadRng (it is quite a lot faster than OsRng, at least on Linux; we could also select the default based on platform).

The only purpose, I think, would be to reduce code and state size in cases where some dependency uses rand::rng() but not heavily enough for the RNG to be a performance issue.
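For reference, getrandom's backend cfg is set by the top-level build via RUSTFLAGS; a hypothetical rand_rng cfg would presumably be selected the same way (the name and value below are illustrative only, not an implemented option):

```sh
# Hypothetical, by analogy with getrandom's getrandom_backend cfg:
RUSTFLAGS='--cfg rand_rng="os"' cargo build
```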

@pinkforest

pinkforest commented Jan 28, 2025

FIPS potentially adds a complication here, given it dictates that a cryptography library must explicitly choose which RNG it uses.

I hit this problem when trying to motivate rustls to adopt CryptoRng via its CryptoProvider API.

On the other hand, composable cryptographic primitives don't care about the environment, unlike monolithic libraries (OpenSSL, rustls, etc.) which take an opinion on which RNG to use.

@tarcieri might have informed take on this.

@tarcieri

tarcieri commented Feb 4, 2025

I think it's fine to fall back to OsRng in the event ThreadRng is unavailable. If there's a desire to use one of those in particular, they can always use it explicitly, right?

@dhardy
Member Author

dhardy commented Feb 15, 2025

API changes

rand::rng is currently defined as:

#[cfg(feature = "thread_rng")]
pub fn rng() -> ThreadRng { .. }

To implement the above proposal, we would need to add:

#[cfg(not(feature = "thread_rng"))]
pub fn rng() -> impl RngCore { .. }

Adding the latter definition of fn rng is not technically a breaking change and still complies with the additive nature of features, although possibly we should change the former to also return impl RngCore in the next breaking release.
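A self-contained sketch of the resulting pattern, using stand-in types (MiniRngCore, StubOsRng and the seed 42 are invented for illustration; the real OsRng queries the operating system rather than running an LCG):

```rust
// Stand-in for rand_core::RngCore, for illustration only.
trait MiniRngCore {
    fn next_u32(&mut self) -> u32;
}

// Stand-in for OsRng, using a deterministic LCG so the example
// is testable; the real OsRng asks the OS for entropy.
struct StubOsRng(u32);

impl MiniRngCore for StubOsRng {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_mul(1664525).wrapping_add(1013904223);
        self.0
    }
}

// With the `thread_rng` feature, rand keeps today's concrete signature.
#[cfg(feature = "thread_rng")]
pub fn rng() -> ThreadRng {
    unimplemented!()
}

// Without it, callers would only see an opaque `impl Trait` type.
#[cfg(not(feature = "thread_rng"))]
pub fn rng() -> impl MiniRngCore {
    StubOsRng(42)
}

fn main() {
    let mut r = rng();
    println!("{}", r.next_u32());
}
```

Note that callers of the cfg(not(...)) version cannot name the returned type, only use it through the trait.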

Performance

On my (x86_64 (Zen 2) Linux) machine, OsRng is much slower than ThreadRng while still being usable for many applications:

random_bytes/small      time:   [172.29 ns 172.87 ns 173.82 ns]
                        thrpt:  [5.4864 GiB/s 5.5166 GiB/s 5.5354 GiB/s]
random_bytes/std        time:   [302.11 ns 303.55 ns 305.51 ns]
                        thrpt:  [3.1216 GiB/s 3.1418 GiB/s 3.1567 GiB/s]
random_bytes/thread     time:   [318.16 ns 318.65 ns 319.22 ns]
                        thrpt:  [2.9875 GiB/s 2.9929 GiB/s 2.9974 GiB/s]
random_bytes/os         time:   [1.4266 µs 1.4312 µs 1.4377 µs]
                        thrpt:  [679.25 MiB/s 682.34 MiB/s 684.54 MiB/s]
random_bytes/CachedOsRng
                        time:   [1.4235 µs 1.4277 µs 1.4346 µs]
                        thrpt:  [680.73 MiB/s 684.02 MiB/s 686.01 MiB/s]

random_u32/small        time:   [641.65 ps 642.85 ps 644.30 ps]
                        thrpt:  [5.7819 GiB/s 5.7950 GiB/s 5.8058 GiB/s]
random_u32/std          time:   [1.2439 ns 1.2457 ns 1.2478 ns]
                        thrpt:  [2.9854 GiB/s 2.9905 GiB/s 2.9948 GiB/s]
random_u32/thread       time:   [1.2437 ns 1.2447 ns 1.2460 ns]
                        thrpt:  [2.9898 GiB/s 2.9928 GiB/s 2.9953 GiB/s]
random_u32/os           time:   [16.425 ns 16.457 ns 16.495 ns]
                        thrpt:  [231.26 MiB/s 231.79 MiB/s 232.25 MiB/s]
random_u32/CachedOsRng  time:   [6.2392 ns 6.2581 ns 6.2924 ns]
                        thrpt:  [606.24 MiB/s 609.56 MiB/s 611.41 MiB/s]

random_u64/small        time:   [651.49 ps 652.42 ps 653.63 ps]
                        thrpt:  [11.399 GiB/s 11.420 GiB/s 11.436 GiB/s]
random_u64/std          time:   [1.9962 ns 1.9981 ns 2.0002 ns]
                        thrpt:  [3.7249 GiB/s 3.7287 GiB/s 3.7324 GiB/s]
random_u64/thread       time:   [2.0273 ns 2.0301 ns 2.0336 ns]
                        thrpt:  [3.6637 GiB/s 3.6701 GiB/s 3.6750 GiB/s]
random_u64/os           time:   [23.311 ns 23.390 ns 23.480 ns]
                        thrpt:  [324.93 MiB/s 326.19 MiB/s 327.28 MiB/s]
random_u64/CachedOsRng  time:   [11.869 ns 11.894 ns 11.940 ns]
                        thrpt:  [638.96 MiB/s 641.45 MiB/s 642.82 MiB/s]

Here, CachedOsRng is a cache over OsRng using BlockRng. It doesn't require a lot of code and improves u32 / u64 output significantly, but it's still extra code and state. The code is here.
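The linked implementation builds on rand_core's BlockRng; purely as a self-contained illustration of the buffering idea, a cache could look like the following (CachedRng, BUF_WORDS and the deterministic refill stub are invented here; a real version would fill the buffer from OsRng):

```rust
const BUF_WORDS: usize = 16; // one "OS" call yields 64 bytes

struct CachedRng {
    buf: [u32; BUF_WORDS],
    index: usize, // next unused word; BUF_WORDS means "empty"
}

impl CachedRng {
    fn new() -> Self {
        CachedRng { buf: [0; BUF_WORDS], index: BUF_WORDS }
    }

    // Stand-in for an OsRng call filling a whole block at once;
    // deterministic here so the example is testable.
    fn refill(&mut self) {
        let mut x = 0x9e3779b9u32;
        for w in self.buf.iter_mut() {
            x = x.wrapping_mul(747796405).wrapping_add(2891336453);
            *w = x;
        }
        self.index = 0;
    }

    // Amortizes the expensive "OS" call over BUF_WORDS outputs.
    fn next_u32(&mut self) -> u32 {
        if self.index == BUF_WORDS {
            self.refill();
        }
        let w = self.buf[self.index];
        self.index += 1;
        w
    }
}

fn main() {
    let mut rng = CachedRng::new();
    // 17 draws cross a refill boundary without extra buffering logic
    // at the call site.
    for _ in 0..17 {
        rng.next_u32();
    }
    println!("ok");
}
```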

I don't know how other platforms compare, but obviously OsRng is a big step down in performance vs ThreadRng; however, it's still "fast enough" for many uses.

@newpavlov
Member

In my opinion impl RngCore would be a bad approach. While technically it would not be a breaking change and may even satisfy the additive crate-feature property, it's still pretty surprising for a return type to change based on enabled features. In some cases it's also pretty annoying not to be able to name the type.

Instead, you could implement ThreadRng via OsRng when thread_rng is not enabled: rename the current ThreadRng to something else, and make ThreadRng an alias to this new type, gated on thread_rng.
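A minimal sketch of that suggestion, with stub types invented for illustration (the stub OsRng returns a fixed value so the example is testable; it is not rand's actual API):

```rust
// Stub standing in for rand's OsRng; a real version would
// fetch entropy from the operating system.
pub struct OsRng;

impl OsRng {
    fn next_u32(&mut self) -> u32 {
        7 // fixed value, for testability only
    }
}

// With `thread_rng` enabled, `ThreadRng` would instead alias the
// renamed thread-local implementation (not shown here). Without it,
// `ThreadRng` remains a nameable type, now backed by OsRng:
#[cfg(not(feature = "thread_rng"))]
pub struct ThreadRng(OsRng);

#[cfg(not(feature = "thread_rng"))]
impl ThreadRng {
    pub fn next_u32(&mut self) -> u32 {
        self.0.next_u32()
    }
}

// Either way, rng() keeps a single, nameable return type.
pub fn rng() -> ThreadRng {
    ThreadRng(OsRng)
}

fn main() {
    println!("{}", rng().next_u32());
}
```

The point of this shape is that downstream code can always write `let r: ThreadRng = rand::rng();` regardless of which features its dependency graph happened to enable.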
