Enable GTIR-DaCe backend #696

Open
wants to merge 357 commits into
base: main

Conversation

edopao
Contributor

@edopao edopao commented Mar 11, 2025

This PR enables testing with the GTIR-DaCe backend.

  • Updates the gt4py baseline to tag icon4py_staging_20250403
  • Adds CI configurations for DaCe tests and benchmarks; the CI is not yet gating (awaiting CI compute nodes)
  • Fixes the selection of the field allocator based on backend in some tests
  • Changes comparison thresholds in several tests (to be reviewed)

Contributor Author

edopao commented Mar 31, 2025

One diffusion test is failing because we need GridTools/gt4py#1939

@edopao edopao force-pushed the update_to_gtir_dace_concat_where branch from 3144465 to 6b4dfff on April 3, 2025 11:47
Contributor Author

edopao commented Apr 3, 2025

cscs-ci run dace

@edopao edopao marked this pull request as ready for review April 3, 2025 20:31
Contributor Author

edopao commented Apr 3, 2025

cscs-ci run default

Contributor Author

edopao commented Apr 4, 2025

cscs-ci run default

@edopao edopao requested review from halungge and havogt April 4, 2025 05:04
@@ -431,6 +431,8 @@ def test_run_diffusion_single_step(
):
    if orchestration and not helpers.is_dace(backend):
        pytest.skip("Orchestration test requires a dace backend.")
    if orchestration and data_alloc.is_cupy_device(backend):
Contributor

Question: is this a general issue, or is it only the orchestration that does not work on GPU?

Contributor Author

Only the DaCe orchestration is affected.
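
For reference, the two guards in the hunk above can be read as a small decision function. This is a minimal sketch, not the code in the PR: `skip_reason` and its boolean parameters are hypothetical stand-ins for `helpers.is_dace(backend)` and `data_alloc.is_cupy_device(backend)`, and the GPU skip message is assumed because the diff does not show it.

```python
from typing import Optional


def skip_reason(orchestration: bool, is_dace: bool, is_gpu: bool) -> Optional[str]:
    """Return why a test should be skipped, or None if it should run.

    Hypothetical helper: is_dace and is_gpu stand in for
    helpers.is_dace(backend) and data_alloc.is_cupy_device(backend).
    """
    if orchestration and not is_dace:
        # Message taken from the diff.
        return "Orchestration test requires a dace backend."
    if orchestration and is_gpu:
        # Assumed wording; the diff does not show this message.
        return "DaCe orchestration does not work on GPU."
    return None
```

In a test body the result would be passed to `pytest.skip(reason)` whenever it is not `None`. Non-orchestrated runs are never skipped by this check, whatever the backend.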

@@ -394,6 +395,7 @@ def test_factory_pg_edgeidx_dsl(grid_savepoint, metrics_savepoint, grid_file, ex
        (dt_utils.R02B04_GLOBAL, dt_utils.GLOBAL_EXPERIMENT),
    ],
)
@pytest.mark.cpu_only
Contributor

Question: do they fail, or are they just horribly slow?

Contributor Author

I am not 100% sure, but I remember seeing a segfault some time ago. I could try again on the latest main, or after your PR #681 is merged.

Contributor Author

These two test cases have very similar names. The first (test_factory_pg_edgeidx_dsl) fails on GPU, with an error that suggests some numpy/cupy issue. The second (test_factory_pg_exdist_dsl) seems to have some tolerance issue in result validation.

See latest CI run:
https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/5125340235196978/2255149825504671/-/jobs/9627731793
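
To judge whether the second failure is only a tolerance issue, it can help to report the largest absolute and relative deviation from the reference before changing any threshold. A minimal sketch using only the standard library; `max_errors`, `all_close`, and the default tolerances are illustrative assumptions, not taken from the icon4py test suite.

```python
import math
from typing import Sequence, Tuple


def max_errors(actual: Sequence[float], expected: Sequence[float]) -> Tuple[float, float]:
    """Return (max absolute error, max relative error) over paired values."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    max_abs = 0.0
    max_rel = 0.0
    for a, e in zip(actual, expected):
        abs_err = abs(a - e)
        max_abs = max(max_abs, abs_err)
        if e != 0.0:
            # Relative error is undefined for a zero reference value.
            max_rel = max(max_rel, abs_err / abs(e))
    return max_abs, max_rel


def all_close(
    actual: Sequence[float],
    expected: Sequence[float],
    rel_tol: float = 1e-12,
    abs_tol: float = 0.0,
) -> bool:
    """Element-wise check with math.isclose, which passes if either tolerance holds."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )
```

Comparing the values from `max_errors` on CPU and GPU runs shows how far a threshold would have to move, which makes a relaxed tolerance easier to review.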

Contributor Author

edopao commented Apr 4, 2025

cscs-ci run dace


github-actions bot commented Apr 4, 2025

Mandatory Tests

Please make sure you run these tests via comment before you merge!

  • cscs-ci run default
  • launch jenkins spack

Optional Tests

To run benchmarks you can use:

  • cscs-ci run benchmark-bencher

To run tests and benchmarks with the DaCe backend you can use:

  • cscs-ci run dace

In case your change might affect downstream icon-exclaim, please consider running

  • launch jenkins icon

For more detailed information please look at CI in the EXCLAIM universe.

Contributor Author

edopao commented Apr 4, 2025

cscs-ci run dace

6 participants