
shared usage of GPUs #7

@DiTo97


Hi @ExpectationMax,

How difficult would it be to allow shared usage of GPUs given a known memory constraint in advance? This would be similar to the way many job-scheduling systems allocate the correct number of workers.

The utility is nice as-is, but I think such a feature would be very useful for larger GPUs (over 16 GB of memory).

For instance, we could add a memory argument to the available options of each command and keep track of the per-GPU memory usage, instead of treating each GPU as an exclusive flag (in use or free). If memory is not set, we could assume either a default memory allocation request or control of a full GPU device, regardless of capacity. Of course, we would have to retrieve the capacity of each available GPU device and make sure that any given process does not exceed its requested memory allocation.
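A minimal sketch of the bookkeeping described above, assuming the utility tracked per-GPU usage in a pool. All names here (`GpuPool`, `request`, `release`) are illustrative, not part of the existing tool:

```python
class GpuPool:
    """Hypothetical per-GPU memory bookkeeping (sizes in MiB)."""

    def __init__(self, capacities_mib):
        # capacities_mib: total memory per GPU index, e.g. {0: 16384, 1: 16384}
        self.capacities = dict(capacities_mib)
        self.used = {gpu: 0 for gpu in self.capacities}

    def request(self, memory_mib=None):
        """Reserve memory on the first GPU that fits; return its index or None.

        If no memory is given, fall back to claiming a whole idle device,
        mirroring the current exclusive in-use/free behaviour.
        """
        for gpu, total in self.capacities.items():
            if memory_mib is None:
                if self.used[gpu] == 0:      # whole device must be idle
                    self.used[gpu] = total
                    return gpu
                continue
            if self.used[gpu] + memory_mib <= total:
                self.used[gpu] += memory_mib
                return gpu
        return None  # no GPU can satisfy the request

    def release(self, gpu, memory_mib):
        """Give back a previous reservation."""
        self.used[gpu] = max(0, self.used[gpu] - memory_mib)
```

With two 16 GiB GPUs, three 8192 MiB requests would pack two onto GPU 0 and one onto GPU 1, and a subsequent full-device request would be refused until something is released.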

For the latter, the main deep learning frameworks have ways to cap per-process memory, but I do not know how we could enforce it at the device level regardless of the framework.
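Framework-agnostic capacity retrieval could lean on the `nvidia-smi` CLI, which can report per-device totals in machine-readable form. A hedged sketch (the query flags are real `nvidia-smi` options; the function names are illustrative):

```python
import subprocess

# nvidia-smi prints one CSV line per GPU, e.g. "16384, 1024"
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.total,memory.used",
    "--format=csv,noheader,nounits",
]

def parse_memory_report(text):
    """Parse nvidia-smi CSV output into {gpu_index: (total_mib, used_mib)}."""
    report = {}
    for index, line in enumerate(text.strip().splitlines()):
        total, used = (int(field) for field in line.split(","))
        report[index] = (total, used)
    return report

def query_gpus():
    """Run nvidia-smi (requires an NVIDIA driver) and parse its output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_report(result.stdout)
```

This only observes memory, though; it does not enforce a limit on a process, which is the part I am unsure how to do outside the frameworks.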

Alternatively, are there any other utilities you know of that already integrate this feature and that I could use?
