Add guaranteed-reproducible PRNGs to rand? #1588

dhardy · 2025-02-14T17:11:03Z

This question came up recently regarding a possible adoption to libstd (read from here), but I'm not sure we ever really asked the question of rand.

StdRng and SmallRng are deterministic but not reproducible (and in the latter case also not portable). Should we add a PRNG with guaranteed reproducibility as a new item under rand::rngs?

We already have five PRNGs available in rand if you count the ChaCha variants:

ChaCha8Rng, ChaCha12Rng, ChaCha20Rng
Xoshiro128PlusPlus, Xoshiro256PlusPlus

I'm not sure we should ever add a guaranteed-reproducible ChaCha PRNG to rand: if we ever wanted to change the generator behind ThreadRng, keeping ChaCha around would add dependencies. Given how long we've been using ChaCha in this role, this may be less of an issue now.

The Xoshiro variants are more acceptable (if only because they require a lot less code; both are directly implemented in rand), though selecting one of these is likely sufficient, e.g. rand::rngs::Xoshiro256PlusPlus.
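For a sense of how little code is involved, here is a minimal std-only sketch of the xoshiro256++ step function. The type and method names are illustrative, and the SplitMix64-based `seed_from_u64` is an assumption made for this sketch; it is not claimed to be bit-compatible with the seeding in rand or rand_xoshiro.

```rust
/// SplitMix64 step, used here only to expand one u64 seed into 256 bits of state.
/// (Assumed seeding scheme for this sketch, not necessarily what rand does.)
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Illustrative xoshiro256++ generator with a fixed, fully specified algorithm.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Xoshiro256PlusPlus {
    s: [u64; 4],
}

impl Xoshiro256PlusPlus {
    /// Build a generator from an explicit state; the all-zero state is invalid.
    fn from_state(s: [u64; 4]) -> Self {
        assert!(s != [0; 4], "xoshiro state must not be all zero");
        Self { s }
    }

    /// Expand a 64-bit seed into the full state using SplitMix64.
    fn seed_from_u64(seed: u64) -> Self {
        let mut sm = seed;
        let s = [
            splitmix64(&mut sm),
            splitmix64(&mut sm),
            splitmix64(&mut sm),
            splitmix64(&mut sm),
        ];
        Self { s }
    }

    /// One step of xoshiro256++: compute the output, then advance the state.
    fn next_u64(&mut self) -> u64 {
        let result = self.s[0]
            .wrapping_add(self.s[3])
            .rotate_left(23)
            .wrapping_add(self.s[0]);
        let t = self.s[1] << 17;
        self.s[2] ^= self.s[0];
        self.s[3] ^= self.s[1];
        self.s[1] ^= self.s[2];
        self.s[0] ^= self.s[3];
        self.s[2] ^= t;
        self.s[3] = self.s[3].rotate_left(45);
        result
    }
}

fn main() {
    // Two generators built from the same seed yield the same stream.
    let mut a = Xoshiro256PlusPlus::seed_from_u64(42);
    let mut b = Xoshiro256PlusPlus::seed_from_u64(42);
    for _ in 0..4 {
        assert_eq!(a.next_u64(), b.next_u64());
    }
    println!("streams match");
}
```

Because the algorithm uses only wrapping integer arithmetic on fixed-width words, the output does not depend on the platform; what a reproducibility guarantee adds is a promise that the algorithm and the seeding procedure will not change between releases.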

CC @hanna-kruppe @joshtriplett in case of interest


benjamin-lieser · 2025-02-14T17:14:34Z

If you want guaranteed reproducibility can't you just use the named PRNG? Maybe I am misunderstanding the question.

dhardy · 2025-02-14T17:20:16Z

Yes, except that none of those named PRNGs are currently publicly exported from rand.

Motivation is partially convenience and partially to make it more obvious how users may set up a reproducible PRNG (currently another crate must be added as a dependency).

benjamin-lieser · 2025-02-14T17:26:45Z

Ah true, I remember having to do this.

I would say exporting Xoshiro256PlusPlus would be a good idea, also under this name.

newpavlov · 2025-02-14T17:31:07Z

As argued in the linked issue, I don't think we need it and we should recommend use of a concrete PRNG crate (we could reference them in StdRng/SmallRng docs).

hanna-kruppe · 2025-02-14T17:36:25Z

When I'm sufficiently worried about long-term reproducibility that I'd opt for a generator with such a guarantee, I generally wouldn't be satisfied if the guarantee only covered RngCore methods or something like that. What I care about is that my program overall remains reproducible, which means e.g. any sampling Rng methods my program uses (and the trait impls backing them) can't have value-breaking changes either. Even if rand was willing to guarantee that for a larger subset of its APIs, it's very difficult for me as a user to ensure that I'm only using the guaranteed-stable subset. Depending directly on a specific rand_foo PRNG crate only solves this if you can make do with only rand_core::RngCore and avoid depending on rand entirely, but that's rare in my experience.

So I think rand, as a general-purpose crate that has good reasons to make value-breaking changes from time to time, is not in a good position to try and address the need for reproducibility. Offering it only for the simple cases, but not for the other APIs that come along for the ride, will result in just as many people being mistaken about whether their rand-using program will be reproducible with future releases of rand. That's not helping anyone.
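To make the distinction between a stable generator and stable sampling concrete, here is a small std-only sketch. The two function names are hypothetical, and neither is claimed to be what rand uses in any release; both skip bias correction for brevity. They are simply two common ways of mapping a raw 32-bit word to a value in [0, n), and they show that the same raw stream yields different results once the mapping changes.

```rust
/// Map a raw 32-bit word to [0, n) by taking the remainder.
fn sample_modulo(raw: u32, n: u32) -> u32 {
    raw % n
}

/// Map a raw 32-bit word to [0, n) with a widening multiply,
/// keeping the high 32 bits of the 64-bit product raw * n.
fn sample_widening(raw: u32, n: u32) -> u32 {
    ((u64::from(raw) * u64::from(n)) >> 32) as u32
}

fn main() {
    // Stand-in for the output of a generator whose raw stream is guaranteed stable.
    let raw_words: [u32; 3] = [0x0000_0001, 0x8000_0000, 0xFFFF_FFFF];
    let n = 6;

    let old: Vec<u32> = raw_words.iter().map(|&w| sample_modulo(w, n)).collect();
    let new: Vec<u32> = raw_words.iter().map(|&w| sample_widening(w, n)).collect();

    println!("modulo:   {:?}", old);
    println!("widening: {:?}", new);

    // Same raw words, different sampled values: a value-breaking change
    // even though the underlying generator did not change at all.
    assert_ne!(old, new);
}
```

A program that calls a range-sampling helper therefore depends on the helper's algorithm as much as on the generator, which is why a guarantee covering only the raw output is not enough for whole-program reproducibility.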

dhardy · 2025-02-14T17:46:11Z

> So I think rand, as a general-purpose crate that has good reasons to make value-breaking changes from time to time, is not in a good position to try and address the need for reproducibility.

Isn't the same true of any library offering a wide variety of random algorithms? The solution here is simple enough: use a fixed version of rand. We should not make value-breaking changes in patch releases (outside of security concerns, though this has never yet been an issue).

hanna-kruppe · 2025-02-14T18:21:47Z

Using a fixed version is not great because it means I'll effectively be on my own with maintaining that code once upstream (quite reasonably) stops doing so. Whether value-breaking changes are made in patch releases or only in minor releases is immaterial -- eventually I'll have to choose between eating a value-breaking change or sticking with an unmaintained version of the library. This won't be an issue if my code stops being actively developed before upstream moves on, but in many cases I don't want to make assumptions about that. And if I end up having to vendor the library, I'd always prefer one that is as small and simple as possible for my specific use case over a library that does basically everything.

The only way around this is if a library is aligned with my priorities w.r.t. reproducibility: making a credible promise to avoid value-breaking changes, by only adding new APIs without changing the old ones (possibly deprecating them but ideally without the implication that they'll be removed eventually). Of course, that's undesirable for everyone who doesn't need long-term reproducibility and wants to get improvements automatically. But it's not inherently impossible for a maintainer to do that, if that's their priority.

dhardy · 2025-02-14T19:21:47Z

> eventually I'll have to choose between eating a value-breaking change or sticking with an unmaintained version of the library. [...] And if I end up having to vendor the library, I'd always prefer one that is as small and simple as possible for my specific use case over a library that does basically everything.

If you're talking about rand (not rand_distr), then unless you care about nightly features, there isn't much to maintain — about the only thing in rand v0.8 which "broke" is that gen will soon be a reserved keyword. As for bug fixes, v0.9 includes a couple of portability fixes and one single bug fix to IteratorRandom::choose_multiple_weighted for extremely small seeds (a value-breaking change, thus this could not be back-ported).

(I'm assuming you're not talking about maintenance of security — but even here nothing of note happened in the last four years, and if it did I expect that we would release a patch.)

So I don't buy your argument that rand is not a good choice if you care about long-term reproducibility.

hanna-kruppe · 2025-02-14T19:57:09Z

I don't know how easy or hard it would be for me to take over bugfix-only maintenance of a specific rand version. Since I'm not familiar with the code base or its history, determining that for myself would take non-trivial effort. I appreciate you sharing information about this now, but imagine if we weren't having this conversation and I'd just be looking at docs.rs/rand to make my decision. That part is just less daunting with a library that's less than, say, 1K lines of code.

In any case, if I'm happy to use a fixed version of a library then it doesn't matter if the library offers reproducibility guarantees across its releases (of course, consistent results across platforms still matter). If I'll be using rand 0.8.5 forever, then I'm not affected by value-breaking changes in later releases. Conversely, if I want to avoid pinning a specific version and instead keep updating rand, then I need reproducibility guarantees for all APIs that I'm using or might use by accident in the future, not just for the RngCore impls. That's what my first comment was about: to enable meaningful long-term reproducibility without version pinning, rand would have to make a much stronger commitment than just keeping some specific PRNG impls intact. I don't think rand can reasonably do that without unduly compromising on competing priorities.

dhardy added the E-question Participation: opinions wanted label Feb 14, 2025
