Skip to content

Add clone functionality to a field #90

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Apr 25, 2025
Merged

Add clone functionality to a field #90

merged 12 commits into from
Apr 25, 2025

Conversation

dareg
Copy link
Collaborator

@dareg dareg commented Mar 20, 2025

This PR adds a way to clone a field into a second field, which will always be a field owner.
We are already using this functionality in IAL, but only on CPU.
We are now proposing to upstream an improved version (for CPU and GPU) of this functionality into field_api.

There would be two new subroutines in the API: FIELD_CLONE_ON_HOST and FIELD_CLONE_ON_DEVICE.

Documentation is updated and several tests are added.

@dareg dareg requested review from awnawab and mlange05 March 20, 2025 10:28
@dareg dareg requested a review from pmarguinaud March 20, 2025 10:28
@dareg
Copy link
Collaborator Author

dareg commented Mar 20, 2025

Some clarifications as Philippe told something was not very clear. We take into account the case where the field to be clone is optional not because we won't give the second argument to the subroutine. But because we might have a chain of subroutines being called with an optional argument being not present from the top and propagated down to the clone subroutine.

@awnawab awnawab added the approved-for-ci Approved to run hpc-ci label Mar 20, 2025
Copy link
Collaborator

@awnawab awnawab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dareg for this contribution. I just have a few minor comments that I think should be addressed before we merge this.

Just out of curiosity, so that I can understand the use case a bit better, is there a benefit to putting the FIELD on device beyond using it as a struct for DEVPTR?

Readme.md Outdated
@@ -262,6 +262,23 @@ field's data must be present on the host. It will not work if the data are on
the device or if the field has not been allocated yet (when using the DELAY
option).

## Cloning fields with FIELD\_CLONE\_ON_

The subroutines FIELD_CLONE_ON_HOST amd FIELD_CLONE_ON_DEVICE let a field be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: 'amd' should be 'and'

ZR => GET_${TARGET}$_DATA_RDONLY (YR)
ZL => GET_${TARGET}$_DATA_RDWR (YL)

$:offload_macros.kernels(present=['ZL','ZR']) if TARGET=="DEVICE" else None
Copy link
Collaborator

@awnawab awnawab Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the kernels construct doesn't map to openmp, so we have two choices here.

If performance is not that important, then we can simply do:

$:offload_macros.serial(present=['ZL','ZR'])
ZL(${ft.shape}$) = ZR(${ft.shape}$)
$:offload_macros.end_serial()

If performance is important, then you can generate a nested loop like I am doing here:

#:for i in range(ft.rank, 0, -1)

CMakeLists.txt Outdated
@@ -100,7 +100,8 @@ list(APPEND prec_srcs
${CMAKE_CURRENT_BINARY_DIR}/src/core/field_module.F90
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_array_module.F90
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_array_util_module.F90
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_util_module.F90)
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_util_module.F90
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_clone_module.F90)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this is a precision dependent source? I don't think it is, and we only need to compile it once and can remove it from this list.

@@ -38,4 +38,5 @@ field_api_add_object_library(
set_source_files_properties( ${CMAKE_CURRENT_BINARY_DIR}/field_array_module.F90
${CMAKE_CURRENT_BINARY_DIR}/field_array_util_module.F90
${CMAKE_CURRENT_BINARY_DIR}/field_util_module.F90
${CMAKE_CURRENT_BINARY_DIR}/field_clone_module.F90
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, we only need to disable compilation for precision dependent sources.

@github-actions github-actions bot removed the approved-for-ci Approved to run hpc-ci label Apr 4, 2025
@dareg
Copy link
Collaborator Author

dareg commented Apr 4, 2025

Hi @awnawab,
I think I've fixed the problems you've mentioned.
But for some reasons the test named test_field_clone_device.F90 is now crashing when using the serial macro, while it was working with the kernels macro.
I need to debug it (but I will be on holidays for 10 days)

./tests/test_field_clone_device.x 
DEV_ALLOC_SIZE = 68719476736
Failing in Thread:1
Accelerator Fatal Error: call to cuMemcpyDtoHAsync returned error 1 (CUDA_ERROR_INVALID_VALUE): Invalid value
File: /home/gmap/mrpm/grassetj/field_api_builder/field_api/build/src/util/field_1rd_clone_module.F90
 Function: field_1rd_clone_on_device:51
 Line: 79

@awnawab awnawab added the approved-for-ci Approved to run hpc-ci label Apr 4, 2025
@awnawab
Copy link
Collaborator

awnawab commented Apr 5, 2025

Hi @dareg,

Thanks for addressing the changes. That is indeed a very strange failure. In that case, please do revert it to kernels and I'll find a suitable workaround for openmp when I rebase #92 once the current PR is merged.

(needless to say this all can wait until you are back from holiday, have a nice time!)

This reverts commit 7f8f3a7.
The serial functionality seems buggy at the moment.
@github-actions github-actions bot removed the approved-for-ci Approved to run hpc-ci label Apr 25, 2025
@dareg
Copy link
Collaborator Author

dareg commented Apr 25, 2025

Reverted to kernels

@awnawab awnawab added the approved-for-ci Approved to run hpc-ci label Apr 25, 2025
Copy link
Collaborator

@awnawab awnawab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for addressing the changes, this is now G2G 👌

@awnawab awnawab merged commit 6053d92 into ecmwf-ifs:main Apr 25, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved-for-ci Approved to run hpc-ci contributor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants