SkyReels Multi-GPU #399
For the wrapper it probably could, but I don't have multiple GPUs to test that stuff and I'm not that interested in cloud inference myself.
If you want to just expose the parameter in the node and that's it, I'd be happy to test it. I'm not that familiar with ComfyUI myself, so I'm not sure where to start to make that adjustment. A complication would be if the bug I mentioned needs to be patched in the Kijai version as well; where would that change have to be mirrored?
It's not that simple, as the pipeline is different; it would be a lot more work than just exposing an option.
Are we talking about the same thing? I mean multiple GPUs to speed up the same video being generated, not dividing the batch across different GPUs. What would be the pipeline change there? I just finished testing the multi-GPU mode in the main Python SkyReels repo with up to 4 GPUs and it's a 1:1 performance improvement (2x for every 2x in GPUs). Would be really good to have.
This is data parallelism, which duplicates the model across multiple GPUs. I think most people struggle to fit the model on their GPUs and are looking for ways to spread the model weights or components across multiple GPUs, so the model can run on less powerful hardware. Your method requires multiple GPUs each powerful enough to process the whole thing individually, and merely duplicates the process on each GPU to speed things up. Your use case is an exception rather than the norm.
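A minimal sketch of the data-parallel scheme described above, in plain Python with no torch. The `split_batch` helper and the round-robin assignment are illustrative assumptions, not SkyReels code:

```python
# Data parallelism: each GPU (rank) holds a full copy of the model and
# processes its own slice of the work, so every rank must fit the whole model.
def split_batch(jobs, world_size):
    """Assign jobs round-robin: rank r handles jobs[r::world_size]."""
    return [jobs[rank::world_size] for rank in range(world_size)]

# Hypothetical example: 8 seeds to render on 4 GPUs -> 2 videos per GPU,
# which is where a near-linear "2x for every 2x in GPUs" speedup comes from.
seeds = list(range(8))
shards = split_batch(seeds, world_size=4)
print(shards)  # each inner list is one GPU's workload
```

Note this only speeds up throughput across many jobs; it does nothing for the latency of a single video, which is the distinction being argued in this thread.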
Maybe - but my POV is that it is already functionality supported by the model and apparently toggled by setting num_gpus to 2, 4, whatever. If it's something that can be done as a node or parameter to the ComfyUI wrapper, then it's a lot of upside. |
I offer again: a rented multi-GPU cloud machine, for the benefit of the community.
Not really. I want multi-GPU support; most of us are using cloud compute with multiple GPUs. xdit has support for the default HunyuanVideo model and more, like Flux. The issue is that you lose quantization and LoRA support, and I have no clue how to add that. I will have a look at SkyReels and see if its multi-GPU implementation is compatible with any HunyuanVideo nodes; it seems to be way better than xdit, which offers only a 1.3x speedup per GPU. What really bothers me is TACO-DiT, a closed-source multi-GPU ComfyUI implementation of xdit, offered for a price.
Edit: I read the SkyReels code:
I've been testing the multi-GPU mode without ComfyUI on the main SkyReels repo and it's super fast: 8 GPUs is just straight up 8x faster. So yeah, definitely a lot better than xdit. Something that took 4 hours to render on 1 GPU takes only 30 min. Is the implementation not in the SkyReels model itself? I was hoping it might be something like just exposing the gpu_num parameter. Mentioning faster decoding/encoding confuses me a bit, since most of the time is spent in the sampler, so we don't really need a faster decoder/encoder. Just want to make sure we're not talking about something besides the multi-GPU inference SkyReels already has implemented. To me, it seems like it should just sit between the model loader and the sampler, set the gpu_num flag, and let the model handle the rest when it's run in the sampler.
I think you can even straight up run the HunyuanVideo base model on it. After reading most of the code: the SkyReels pipeline has the following extras, and in skyreels_video_infer.py:
Then we have single_gpu_run, which I believe is what each GPU's process runs, launched via mp.spawn(single_gpu_run, ...), and we have import torch.distributed, which calls this: I think I'm very close to making the musubi-tuner generate script a multi-GPU one. @kijai can choose to implement this or not, but it should be easy since SkyreelsVideoPipeline is just a wrapper class around HunyuanVideoPipeline. Good luck to all, and thank you.
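The spawn pattern described above, sketched so its shape is clear. The function body is a guess at the structure, not actual SkyReels code; the real GPU-dependent calls (torch.multiprocessing.spawn, torch.distributed.init_process_group) are left as comments, and a sequential stand-in is used so the sketch runs anywhere:

```python
def single_gpu_run(rank, world_size, prompt):
    """What each spawned process would do, one process per GPU."""
    # Real code, roughly (assumed shape):
    #   torch.distributed.init_process_group("nccl", rank=rank, world_size=world_size)
    #   torch.cuda.set_device(rank)
    #   pipe = SkyreelsVideoPipeline.from_pretrained(...)  # full copy per rank
    #   frames = pipe(prompt)  # ranks cooperate via torch.distributed collectives
    return f"rank {rank}/{world_size} working on: {prompt}"

# Real launcher, roughly:
#   import torch.multiprocessing as mp
#   mp.spawn(single_gpu_run, args=(world_size, prompt), nprocs=world_size)
# (mp.spawn passes the rank as the first argument to the target function.)
world_size = 4
results = [single_gpu_run(r, world_size, "a cat") for r in range(world_size)]
print(results)
```

The key point for a ComfyUI port is that this launcher spawns fresh processes, which does not map cleanly onto ComfyUI's single-process execution model; that is likely the "pipeline is different" objection raised earlier in the thread.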
It's not the same, and it complicates the setup a bit. We're talking about the built-in multi-GPU inference of the SkyReels model, so ideally it's just a matter of turning that on when it's used in ComfyUI. I'll try to look more into it later this week as well; I just have other things on the to-do list to test first.
Is the multi-gpu option in SkyReels exposed through the ComfyUI nodes? If not, can this be added?
And as a side note, there seems to be a bug in the official code - I made a quick fix that seemed to address it in case you run into it as well:
SkyworkAI/SkyReels-V1#66