Hi there,
I had several issues running the 17B model, and I realized they might be caused by the parallel GPU config, since it seems to be set up for 8 GPUs (I see sp_size=8 and --nproc_per_node=8).
I got past this by setting dit.sp_size=1 and --nproc_per_node=1, since I only have 1 GPU. That fixed the launch, but I'm still not able to run the model, even with 100 GB of VRAM. This is the command I used:
CUDA_VISIBLE_DEVICES=0 torchrun --node_rank=0 --nproc_per_node=1 --nnodes=1 \
--rdzv_endpoint=127.0.0.1:12345 \
--rdzv_conf=timeout=900,join_timeout=900,read_timeout=900 \
main.py humo/configs/inference/generate.yaml \
dit.sp_size=1 \
generation.frames=97 \
generation.scale_a=5.5 \
generation.scale_t=5.0 \
generation.mode=TIA \
generation.height=720 \
generation.width=1280 \
diffusion.timesteps.sampling.steps=50 \
generation.positive_prompt=./examples/test_case1.json \
generation.output.dir=./output
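For reference, here is a rough back-of-the-envelope estimate of the memory needed for the weights alone. It is only a sketch: the 17B parameter count comes from the model name, and the bytes-per-parameter values are my assumptions, since I have not checked which dtype the checkpoint is actually loaded in. The helper name `weight_memory_gib` is just made up for this example.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the model weights only, in GiB."""
    return n_params * bytes_per_param / 1024**3


N_PARAMS = 17e9  # assumed from the "17B" model name

# Assumed storage precisions; the real checkpoint dtype may differ.
for dtype_name, nbytes in [("fp32", 4), ("bf16/fp16", 2), ("fp8/int8", 1)]:
    print(f"{dtype_name}: {weight_memory_gib(N_PARAMS, nbytes):.1f} GiB")
```

If these assumptions hold, the weights by themselves should fit in 100 GB at any of these precisions, so my guess is that the rest is taken by activations for 97 frames at 720x1280 or by other components loaded on the same GPU, but I have not profiled it.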
I'm trying this because I was able to run the 17B model in ComfyUI, but that version seems to be optimized for low RAM.
Thanks a lot for this amazing work!!!