API for random number generation #874

ntessore · 2024-12-17T11:13:19Z

I would like to pick up the discussion from #431. The discussion there revolved mostly around the "stateless" functional JAX-like API vs the "stateful" class-based NumPy-like API for random number generation.

A project of ours is adopting the Array API and requires random number generation. I would like to share what I found in my experiments implementing each approach on the respective other backend. Can we map out the space of possible solutions? This would allow us to see if there is a way forward at this point, or if it's better to defer this further.

Edit: Split into individual comments.

jakevdp · 2024-12-17T21:23:51Z

Collaborator

I don't think a stateful solution in JAX relying on side-effects of flattening is something that should be recommended to users. Relying on this kind of implementation detail is likely to fail in corner cases we're not thinking of, and is almost certain to cause issues in the future when that implementation is changed.

1 reply

ntessore Dec 17, 2024
Author

I tend to agree, but I also wanted to show what is possible. In practice, not passing wrapper instances into compiled functions should not be a huge problem: generic array API code won't be compiled, and native JAX code won't expect the wrapper, anyway, but the wrapped key.

ntessore · 2024-12-17T21:51:47Z

Author

A fully Array API-compliant functional random sampling library

Perhaps there's even a more radical option, although I haven't looked into it at all and it may be entirely infeasible: port JAX's random implementation to be Array API compliant, so that a stateless random number solution can be applied with any array backend.

0 replies

ntessore · 2024-12-18T18:50:15Z

Author

Using the NumPy RNG itself as the "key"

Idea: Adopt the stateless/functional API from JAX as a Random API and wrap NumPy's RNGs as they are with a compliant set of functions.

In principle, this looks easy enough. Assume we wanted to implement JAX's normal() for a NumPy rng (and ignoring the precise semantics of shape vs size for a moment):

def normal(rng, shape=(), dtype=np.float64):
    return rng.standard_normal(shape, dtype)

To use the normal() function with the JAX-like API, one would do:

rng, subrng = random.split(rng)
val = random.normal(subrng)

The problem lies in "splitting" a traditional RNG. While there is a method to obtain independent sub-generators, we don't really want to invoke that for every random number generation. So, in the simple use case above, we would probably want to return the same RNG twice:

def split(rng):
    """
    Simple split for standard use case: return the same
    random number generator as a sub-generator.
    """
    return rng, rng

However, there are situations where multiple sub-generators are indeed what is wanted (e.g., parallel execution). So there also needs to be an implementation of split() that does what it says:

def split(rng, num=2):
    """
    Actual split for parallel use case: return independent
    sub-generators.
    """
    return rng.spawn(num)

This is potentially more in line with split() in JAX.

The only real solution I see here is to have a separate function for the "sample some numbers" use case. The API could then work along these lines:

state = ...  # the random state; rng for numpy, key for JAX
# sample a random variate
state, sampler = random.get(state)  # get an object that can do random sampling
val = random.normal(sampler)  # use object to generate some numbers
# do parallel sampling
state, substates = random.spawn(state)  # get child state
sample_in_parallel(substates)

This would need better names for what I call state/sampler/get above, but I'm sure you get the idea.

All in all, I think this is doable. But I am not a huge fan of this API.

1 reply

ntessore Dec 27, 2024
Author

For what it's worth, this solution is slightly more natural if, instead of using the existing spawn() mechanism, the generators created by split() have their initial random state generated by the existing generator, much like it is done in JAX.

Toy implementation

import functools
from collections.abc import Sequence

import numpy as np

class FixedSeed(np.random.SeedSequence):
    def __init__(self, data):
        self.data = data

    def generate_state(self, n_words, dtype=np.uint32):
        assert (n_words,) == self.data.shape and dtype == self.data.dtype
        return self.data

@functools.singledispatch
def random_seed(bg, shape):
    raise NotImplementedError

@random_seed.register
def _(bg: np.random.PCG64, shape):
    return bg.random_raw((*shape, 4))

@random_seed.register
def _(bg: np.random.Philox, shape):
    return bg.random_raw((*shape, 2))

# add more bit generators
...

def _mkgen(seed, cls):
    return np.random.Generator(cls(FixedSeed(seed)))

def split(rng, num=2):
    bg = rng.bit_generator
    shape = tuple(num) if isinstance(num, Sequence) else (num,)
    seed = random_seed(bg, shape)
    return np.apply_along_axis(_mkgen, -1, seed, bg.__class__)

def normal(rng, shape=(), dtype=float):
    return rng.standard_normal(shape, dtype)

# add more random API functions
...

This toy implementation uses a SeedSequence subclass that returns a fixed seed for the child generators. (Rather pleasingly, this ends up being the exact same as the key-based solution for the Philox generator.) A caveat here is that I haven't run this through any statistical tests.

ntessore · 2024-12-18T18:50:58Z

Author

Implementing a stateful/class-based API using JAX

Idea: Adopt the numpy.random.Generator API and wrap JAX's functions into a compliant RNG class.

Again, this looks easy enough in principle. On the JAX side, all it needs is wrapping the key into a class instance and forwarding all method calls to JAX's functional API while advancing the internal state of the RNG instance:

import jax

class JRNG:
    def __init__(self, key):
        self.key = key

    def standard_normal(self, size=(), dtype=float) -> jax.Array:
        # advance the internal state, then sample from the fresh subkey
        self.key, key = jax.random.split(self.key)
        return jax.random.normal(key, size, dtype)

The major problem here is that we cannot easily pass this stateful object into JAX's compiled functions. There is a workaround: treat JRNG as a pytree containing the key, and when the pytree is flattened, advance the random state and pass an independent key for use in the compiled function:

@jax.tree_util.register_pytree_node_class
class JRNG:
    ...

    def tree_flatten(self):
        self.key, key = jax.random.split(self.key)
        return (key,), None

    @classmethod
    def tree_unflatten(cls, aux_data, children):
        key, = children
        rng = object.__new__(cls)
        rng.key = key
        return rng

This workaround has the drawback that passing the RNG into a compiled function produces different output than passing it into the same non-compiled function. (For reasons that aren't entirely clear to me, the pytree is also flattened 4 times for each invocation, advancing the internal state each time. But this could probably be prevented.)

On the whole, I am not particularly worried about this issue. If the idea is to create a Random API partner for the Array API, there will not generally be compiled functions in the pipeline. And everything that is sufficiently low-level to be a compiled JAX function will not accept the rng wrapper, anyway, but instead receive the rng.key directly, so there is no problem there.

I packaged this up as a proof-of-concept wrapper for JAX here: glass-dev/jrng. It is only meant to be an illustration of the concept.

7 replies

ntessore Dec 20, 2024
Author

Could you mock up a quick example of where you see the problem here?

The situation I am imagining is roughly the following:

# low-level library code
# uses backend-specific primitives

@numba.jit
def library_func_numpy(x, rng):
    ...

@jax.jit
def library_func_jax(x, key):
    ...

# high-level library code
# uses the Array API as much as possible
# delegates to backend-specific implementations as necessary

def library_func(x, rng):
    if isinstance(x, np.ndarray):
        return library_func_numpy(x, rng)
    elif isinstance(x, jax.Array):
        key = rng.key()
        return library_func_jax(x, key)
    ...

# user code
# calls high-level library code with a specific backend of the user's choice

# for example: jax
x = jnp.zeros(3)
rng = rng_jax.Generator(42)
library_func(x, rng)

# could be using numpy
x = np.zeros(3)
rng = np.random.default_rng(42)
library_func(x, rng)

As far as I can tell, autodiff etc. would all pass through the high-level/Array API library code. Is that not how you would write array-agnostic libraries with the Array API?

jakevdp Dec 20, 2024
Collaborator

This looks fine, but isn't the need to branch to different code paths for different implementations entirely counter to the stated goal of the array API? If that's the route we're going, why decide on a common API at all? I can do that now without the array API.

ntessore Dec 20, 2024
Author

I was merely trying to demonstrate how a stateful RNG could interact with jitted, pure JAX functions without passing the instance into the function. But the point of this approach is that, in almost all practically relevant cases, I would not have to make a case distinction. Instead, I would call the random number generation in the high-level code:

def my_cool_model(x, y, z, *, rng):
    """
    Implement my cool model for simulating that effect we talked about.
    Works with any array backend.
    """
    a = rng.standard_normal((x.size, y.size))
    return x @ a @ y + z

I was assuming that this should work on any backend for which I can build an rng object. And since it's only forwarding calls to the underlying random implementation, I am under the impression this supports, e.g., autodiff. Is that not correct?

I guess a further implicit assumption here is that the Array API will always operate at the "Python level", and that there will hence be no generic @jit or similar.

jakevdp Dec 20, 2024
Collaborator

OK, that makes sense. But the issue with this code is that if you wrap my_cool_model in jax.jit, jax.vmap, jax.grad, or another transformation, it will fail because rng.standard_normal is impure.

So if this were the unified API chosen for random number generation, the only sensible option for JAX would be to not implement it.

ntessore Dec 20, 2024
Author

Thanks, I'm seeing the problem now. You couldn't write the following kind of generic code:

model = jax.jit(my_cool_model)
run_jax_pipeline(model)

In other words, you couldn't use a stateful RNG to write generic code that can be consumed by JAX library code, because there cannot be a JAX consumer of this API.
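
For contrast, here is a minimal sketch of the same model written against an explicit key, which is the form that JAX transformations can consume. It assumes a reasonably recent JAX (for jax.random.key); the key= keyword and the input values are illustrative choices, not part of any proposed API:

```python
import jax
import jax.numpy as jnp

def my_cool_model(x, y, z, *, key):
    """Pure variant of the model: all randomness comes from the key argument."""
    a = jax.random.normal(key, (x.size, y.size))
    return x @ a @ y + z

x = jnp.ones(3)
y = jnp.ones(2)
z = 0.5

key = jax.random.key(42)
key, subkey = jax.random.split(key)

# the function has no hidden state, so it can be compiled
model = jax.jit(my_cool_model)
compiled_result = model(x, y, z, key=subkey)

# the uncompiled function agrees when given the same subkey
eager_result = my_cool_model(x, y, z, key=subkey)
```

Because the key is an ordinary argument, the compiled and uncompiled calls see the same random input, which is exactly what the stateful wrapper cannot guarantee.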

ntessore · 2024-12-19T22:16:22Z

Author

Building a stateless/functional API on top of NumPy's existing bit generators

NumPy currently implements the Philox counter-based RNG, and more are available in the RandomGen project. Crucially, the NumPy bit generator can be instantiated with a key= argument. This makes it possible to carry the key around just like in JAX, and create bit generator instances (and random number generator instances) on the fly for random number sampling.

>>> from numpy_random_api import random
>>> key = random.key(42)
>>> key
PRNGKeyArray([3444837047, 2669555309, 2046530742, 3581440988],
             dtype=uint32, impl='philox')
>>> key, subkey = random.split(key)
>>> key
PRNGKeyArray([3973757322,  369700608,  604115056,  607984076],
             dtype=uint32, impl='philox')
>>> random.normal(subkey, 4)
array([-0.05883458,  0.6125753 , -1.29899843,  0.12702094])

Equivalent code using NumPy's random interface (where everything is inline so there is no state):

>>> import numpy as np
>>> from numpy.random import Generator, Philox, SeedSequence
>>> key = SeedSequence(42).generate_state(4).view(np.uint32)
>>> key
array([3444837047, 2669555309, 2046530742, 3581440988], dtype=uint32)
>>> key, subkey = Philox(key=key.view(np.uint64)).random_raw(4).reshape(2, 2).view(np.uint32)
>>> key
array([3973757322,  369700608,  604115056,  607984076], dtype=uint32)
>>> Generator(Philox(key=subkey.view(np.uint64))).standard_normal(4)
array([-0.05883458,  0.6125753 , -1.29899843,  0.12702094])

I have written a simple proof-of-concept here: ntessore/numpy_random_api

I have to say, this works very nicely for building a JAX-like API around NumPy's existing random framework. It's not even entirely clear to me that the approach is limited to the one existing counter-based bit generator; as far as I can tell, there is no fundamental reason why the "key" being passed around cannot be used to seed the traditional bit generators. But I also haven't checked very carefully.

All in all this would probably be my favourite solution for a random API, except that I don't like the user experience for array-agnostic libraries built on top of the Array API (which is my line of business). Since the state is carried around explicitly, this approach requires teaching users to manually random.split(key) before every single invocation of a library function that does random sampling.
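
To make the user-experience concern concrete, here is a minimal sketch that wraps the inline NumPy session above into three functions and then shows what happens when the split is forgotten. The function names key, split and normal are stand-ins that mirror the session; this is not the code of the proof-of-concept package:

```python
import numpy as np
from numpy.random import Generator, Philox, SeedSequence

def key(seed):
    # four 32-bit words form one 128-bit Philox key
    return SeedSequence(seed).generate_state(4)

def split(k, num=2):
    # draw 2 * num raw 64-bit words and regroup them into num new keys
    raw = Philox(key=k.view(np.uint64)).random_raw(2 * num)
    return raw.reshape(num, 2).view(np.uint32)

def normal(k, shape=()):
    # build a throwaway generator from the key and sample from it
    return Generator(Philox(key=k.view(np.uint64))).standard_normal(shape)

k = key(42)

# forgetting to split: both calls see the same key and return the same numbers
forgot_a = normal(k, 3)
forgot_b = normal(k, 3)

# splitting before every call gives different samples
k, sub1 = split(k)
k, sub2 = split(k)
ok_a = normal(sub1, 3)
ok_b = normal(sub2, 3)
```

Nothing in the first pair of calls raises an error, which is why a forgotten split is easy to miss in user code.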

4 replies

ntessore Dec 21, 2024
Author

After working with this approach a bit more, everything works well on the library side of things: using the stateless random API is perfectly comfortable (as we know from JAX), and the overhead from creating the NumPy BitGenerator and Generator on the fly hasn't really made itself known at all.¹

But I am still worried about the user experience. Asking the user to manually advance the key before each random call looks like a recipe for bug reports to me. The best I can come up with right now to mitigate that is to wrap the key into a rng-like helper class that does the advancing for the user at the highest level of the code. The neatest syntax I could come up with so far is this:

class RandomKeyHelper:
    def __init__(self, random_namespace, *args, **kwargs):
        self.random = random_namespace
        self.key = self.random.key(*args, **kwargs)

    def __pos__(self):
        self.key, key = self.random.split(self.key, 2)
        return key

import jax
key = RandomKeyHelper(jax.random, 42)
print(jax.random.normal(+key, 3))
# [-0.5675502   0.28439185 -0.9320608 ]
print(jax.random.normal(+key, 3))
# [ 0.67903334 -1.220606    0.94670606]

To be clear, I'm not proposing something like this at all. I'm only trying to say that perhaps there is some mitigation on the user-facing side of things for the need to explicitly carry the random state around at all times. Or maybe I am not giving users enough credit, and they will deal with it just fine?

¹ Fortunately, NumPy doesn't seem to construct a SeedSequence when Philox is instantiated with a key= argument. Unfortunately, the randomgen implementations appear to always create a SeedSequence.

jakevdp Dec 21, 2024
Collaborator

I understand your concerns about the need for manually advancing the random state, but any solution with a non-pure implicit state update is a no-go for JAX, no matter how you define it.

ntessore Dec 21, 2024
Author

Yeah, that would not be at the Array API/backend level at all, but something I could offer our users at the very top level of the code. As soon as anything hits the Array API boundary, everything would be purely stateless, as in JAX.

All in all, I'm pretty convinced that JAX had the only viable recipe all along, particularly when also looking towards other, more restricted Array API implementations than NumPy. Passing the key (or more general random state) around in the form of a (thinly wrapped) plain array should always be possible, no matter what other constraints there are for the array backend.

If we wanted to try and standardise this, I believe the minimum set of requirements is:

1. The random API is functional.
2. Random state is stored and passed explicitly.
3. There is a function to create new random state from an initial seed.
4. There is a function to split the random state into independent substates.
5. There are functions for random sampling from a given random state.
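
As a rough illustration, the requirements listed above could be written down as a typing.Protocol. Everything in this sketch is hypothetical: the protocol name RandomAPI, the method names, and the toy backend, which uses Python's standard library generator with a plain integer as state and makes no claim about statistical quality:

```python
import random as _pyrandom
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class RandomAPI(Protocol):
    """Hypothetical minimal interface: functional, with explicit state."""
    def key(self, seed: int) -> Any: ...
    def split(self, state: Any, num: int = 2) -> list: ...
    def normal(self, state: Any) -> float: ...

class ToyRandom:
    """Toy backend for illustration only: the state is a plain integer."""

    def key(self, seed):
        # create new random state from an initial seed
        return int(seed)

    def split(self, state, num=2):
        # derive num new states from the current one
        gen = _pyrandom.Random(state)
        return [gen.getrandbits(64) for _ in range(num)]

    def normal(self, state):
        # sample from a given state without mutating anything visible
        return _pyrandom.Random(state).gauss(0.0, 1.0)

rnd = ToyRandom()
state = rnd.key(42)
state, sub = rnd.split(state)
value = rnd.normal(sub)
```

Here the state is neither array-like nor a "key" in the JAX sense, which is one way to read the questions that follow.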

Starting from there, I have the following questions:

Must the random state be array-ish?
Is the random state always a "key", or could it be something more generic, like the internal state of a traditional RNG or a NumPy-like seed sequence?
In the latter case, should the function in (3) still be called random.key() or something more generic? Similarly, for the function in (4), is the name random.split() sufficiently generic?

Even if there is no appetite for standardisation at this time, this little exercise has helped me a lot to figure out where we need to take our project. Thanks @jakevdp!

ntessore Dec 27, 2024
Author

Working with this a little more, I came across another practical question:

How do you obtain the random namespace?

If that was going to be simply xp.random.normal(key), say, it would be problematic for NumPy (v2), since np.random.normal() is the existing legacy function. This could be solved with a dispatch mechanism, but it's not exactly a neat solution.

An alternative could be an independent .__random_namespace__() mechanism. This could either be available on generic arrays, or it could be a property of key arguments only. In the former case, this is a rather large change across a number of backends. In the latter case, however, this could be done by third-party libraries, and would hence be largely independent of the array implementations themselves. For example, both the stateless and the stateful NumPy solutions from this discussion could easily be implemented that way. However, it means that a key must always be provided by the user, as there is no generic random.key() facility in the array API anywhere (which I don't think is a big problem, as one probably always wants the user to provide the initial seed somewhere).
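
A minimal sketch of the key-only variant, assuming a third-party key type built on NumPy's Philox bit generator. The names PhiloxKey, philox_random and library_func are hypothetical, and __random_namespace__ is the mechanism floated above, not an existing protocol:

```python
import types

import numpy as np
from numpy.random import Generator, Philox

class PhiloxKey:
    """Hypothetical key type: a 128-bit Philox key that knows its namespace."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.uint64).reshape(2)

    def __random_namespace__(self):
        return philox_random

def _split(key, num=2):
    # derive num child keys from the raw output of the parent key
    raw = Philox(key=key.data).random_raw(2 * num).reshape(num, 2)
    return [PhiloxKey(row) for row in raw]

def _normal(key, shape=()):
    return Generator(Philox(key=key.data)).standard_normal(shape)

# the namespace that keys of this type hand out
philox_random = types.SimpleNamespace(split=_split, normal=_normal)

def library_func(key, n):
    """Array-agnostic library code: looks up the random functions from the key."""
    rnd = key.__random_namespace__()
    key, subkey = rnd.split(key)
    return rnd.normal(subkey, n)

samples = library_func(PhiloxKey([1, 2]), 4)
```

The library function never imports a backend itself; the user chooses the backend by choosing which kind of key to pass in.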
