Hi @david-beckham-315, our repository supports context-parallel inference on 1, 2, 4, 6, 8, 12, or 24 GPUs, i.e. factors of the model's attention head count (24).
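The constraint is simple divisibility: head-parallel inference gives each GPU an equal slice of the attention heads. A minimal sketch of the check (illustrative only, not mochi's actual code; `NUM_HEADS` and `heads_per_rank` are made-up names):

```python
# Illustrative sketch only -- not mochi's actual code. Head-parallel
# context-parallel inference shards the attention heads evenly across
# GPUs, so the world size must divide the head count exactly.

NUM_HEADS = 24  # mochi's attention head count, per the comment above

def heads_per_rank(world_size: int) -> int:
    """How many heads each GPU owns; raises if the split would be uneven."""
    if NUM_HEADS % world_size != 0:
        raise ValueError(
            f"world_size={world_size} does not divide {NUM_HEADS} heads"
        )
    return NUM_HEADS // world_size

# 16 GPUs fail the check (24 % 16 == 8), which is why 16 is unsupported.
for ws in (1, 2, 4, 6, 8, 12, 16, 24):
    try:
        print(f"{ws:2d} GPUs -> {heads_per_rank(ws)} heads each")
    except ValueError:
        print(f"{ws:2d} GPUs -> unsupported (uneven head split)")
```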
To run inference on 16 GPUs, you would have to implement a different context-parallel strategy such as Ring-Attention, which shards the sequence rather than the heads. https://github.com/xdit-project/mochi-xdit has a Ring-Attention implementation.
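For intuition, here is a conceptual Ring-Attention sketch on top of `torch.distributed` (an assumption-laden illustration, not the mochi-xdit implementation): each rank keeps its local Q shard and circulates K/V shards around the ring, accumulating attention with a streaming softmax, so the GPU count no longer needs to divide the head count.

```python
# Conceptual Ring-Attention sketch -- NOT the mochi-xdit implementation.
# Each rank holds one sequence shard of q, k, v with shape (seq_local, dim).
# K/V shards circulate around the ring; every rank accumulates attention
# over all blocks with an online (streaming) softmax.

import torch
import torch.distributed as dist

def ring_attention(q, k, v):
    rank, world = dist.get_rank(), dist.get_world_size()
    scale = q.shape[-1] ** -0.5

    # Streaming-softmax state: running max, denominator, weighted sum.
    m = torch.full((q.shape[0], 1), float("-inf"), device=q.device, dtype=q.dtype)
    l = torch.zeros((q.shape[0], 1), device=q.device, dtype=q.dtype)
    o = torch.zeros_like(q)

    k_blk, v_blk = k.clone(), v.clone()
    for step in range(world):
        s = (q @ k_blk.T) * scale                 # scores vs. this K/V block
        m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
        p = torch.exp(s - m_new)                  # re-based probabilities
        correction = torch.exp(m - m_new)         # rescale old accumulators
        l = l * correction + p.sum(dim=-1, keepdim=True)
        o = o * correction + p @ v_blk
        m = m_new

        if step < world - 1:                      # pass K/V to the next rank
            nxt, prv = (rank + 1) % world, (rank - 1) % world
            k_next, v_next = torch.empty_like(k_blk), torch.empty_like(v_blk)
            reqs = [dist.isend(k_blk, nxt), dist.isend(v_blk, nxt)]
            dist.recv(k_next, prv)
            dist.recv(v_next, prv)
            for r in reqs:
                r.wait()
            k_blk, v_blk = k_next, v_next

    return o / l                                  # final softmax normalization
```

Because the ring partitions the sequence dimension, it works for any world size (16 included), at the cost of `world - 1` rounds of K/V communication per attention call.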
Hi,
There are two A100 machines, each with 8 A100 GPUs. Does mochi support running across both machines with all 16 GPUs? If so, how do I run it?
Thanks.