I am trying to use DistLinkNeighborLoader and assign it to an NCCL process group, but I get the following error:
/opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:658: UserWarning: You are using a Backend <class 'torch.distributed.distributed_c10d.ProcessGroupGloo'> as a ProcessGroup. This usage is deprecated since PyTorch 2.0. Please use a public API of PyTorch Distributed instead.
My implementation is identical to temporal_link_movielens_cpu.py, except that every instance of gloo is replaced with nccl.
Update 1:
It seems like this happens whenever more than one process is spawned by torch.multiprocessing.spawn(). For instance, the following spawns only one process and does not yield the error. However, with nprocs=2, the error appears even when I set the device to cpu and the backend to gloo.
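For anyone who wants to compare the one-process and two-process cases, here is a minimal sketch of such a spawn setup. It is not the snippet from my script: it uses plain torch.distributed with no DistLinkNeighborLoader, the names worker, run and parse_nprocs are made up for illustration, and the port 29500 is an assumption (any free port should do).

```python
import os
import sys

import torch.distributed as dist
import torch.multiprocessing as mp


def parse_nprocs(argv, default=1):
    # Read the process count from the first CLI argument, else use the default.
    if len(argv) > 1 and argv[1].isdigit():
        return int(argv[1])
    return default


def worker(rank, world_size, backend):
    # mp.spawn passes the process index as the first argument.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")  # assumed free port
    dist.init_process_group(backend=backend, rank=rank, world_size=world_size)
    try:
        print(f"rank {rank} of {world_size}, backend={dist.get_backend()}")
        dist.barrier()  # wait until every rank has reached this point
    finally:
        dist.destroy_process_group()


def run(nprocs, backend="gloo"):
    # Start nprocs processes and block until all of them exit.
    mp.spawn(worker, args=(nprocs, backend), nprocs=nprocs, join=True)


if __name__ == "__main__":
    run(parse_nprocs(sys.argv))
```

Running the file with no argument spawns one process; passing 2 as the first argument spawns two, which is the case where I see the message with my own script. I have not checked whether this stripped-down version triggers it too.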