Hi, I am trying to use UCX with RCCL and rccl-rdma-sharp-plugins, but they do not work together. I always get an error like this:
ucp_mm.c:855 Assertion `memh->md_map != 0' failed
I noticed that the function nccl_ucx_regmr in rccl-rdma-sharp-plugins sets only these parameters:
mmap_params.field_mask = UCP_MEM_MAP_PARAM_FIELD_ADDRESS |
UCP_MEM_MAP_PARAM_FIELD_LENGTH;
mmap_params.address = (void*)reg_addr;
mmap_params.length = reg_size;
mh->mem_type = (type == NCCL_PTR_HOST)? UCS_MEMORY_TYPE_HOST: UCS_MEMORY_TYPE_CUDA;
mmap_params.field_mask |= UCP_MEM_MAP_PARAM_FIELD_MEMORY_TYPE;
mmap_params.memory_type = mh->mem_type;
It then calls ucp_mem_map. As no flags are passed in, ucp_mem_map reports this error.
Could someone kindly help me analyze how to use the API correctly?
Thanks a lot!