test: add basic system-level test suite #38
Conversation
As discussed yesterday, I have added a workflow for nightly tests that first checks for commits within the past 24h in any of the repos (currently: anemoi-docs, anemoi-datasets, anemoi-core), and only runs the tests (on all mains) if there has been a change in any of them. Workflow run here: https://github.com/ecmwf/anemoi-docs/actions/runs/17433383922 (The monitor action fails here because the workflow already points to main of anemoi-docs, where the test suite isn't added yet. This should be fine once merged.)
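For readers unfamiliar with this kind of gating, a minimal sketch of what such a workflow could look like is below. This is an illustration, not the workflow added in this PR: the cron time, job and step names, and the exact API query are assumptions; only the repo list comes from the comment above.

```yaml
# Hypothetical sketch of the nightly gating logic described above.
# Cron time, job/step names, and the API query are assumptions.
name: nightly-system-tests

on:
  schedule:
    - cron: "0 2 * * *"  # actual schedule is an assumption

jobs:
  check-for-changes:
    runs-on: ubuntu-latest
    outputs:
      changed: ${{ steps.scan.outputs.changed }}
    steps:
      - id: scan
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Ask the GitHub API for commits on main within the past 24h
          # in each watched repo; emit changed=true if any are found.
          changed=false
          since=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)
          for repo in ecmwf/anemoi-docs ecmwf/anemoi-datasets ecmwf/anemoi-core; do
            n=$(gh api "repos/$repo/commits?sha=main&since=$since" --jq 'length')
            [ "$n" -gt 0 ] && changed=true
          done
          echo "changed=$changed" >> "$GITHUB_OUTPUT"

  run-system-tests:
    needs: check-for-changes
    if: needs.check-for-changes.outputs.changed == 'true'
    runs-on: ubuntu-latest
    steps:
      # Placeholder: the real workflow deploys and monitors the ecflow suite.
      - run: echo "deploy and monitor the ecflow test suite here"
```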
Minor optional comments, ok for me to merge as it is!
A couple of minor comments – sorry!
Thank you @VeraChristina for your reply. Could you help me find the git hashes in https://github.com/ecmwf/anemoi-docs/actions/runs/17433383922/job/49499118253? And why has this PR been merged without any ATS tag?
Hey @MeraX, good catch that we need to update the labels of this repo and also add the GitHub action so it has the same PR-labelling workflow as the other repos. I will help with this. In terms of ATS, this PR was discussed at ATS, where we also presented the overall approach. We clarified that the initial focus would be to support testing from main branches and to scale to the main use cases like global, LAM and stretched grid. Since there were no major concerns and overall this was seen as a useful feature, we told Vera it was approved. Note this doesn't mean the work on System Level Tests is done; as @VeraChristina captured in https://github.com/ecmwf/anemoi-docs/issues?q=is%3Aissue%20state%3Aopen%20milestone%3A%22System-level%20tests%22, there are still improvements and features to be added. Nonetheless, having this merged is a great step towards making Anemoi more robust!
Hi @MeraX, sure thing! -- You navigate to the summary, then open the logs for the package whose version you want to check, and scroll down to find the local version tag, which includes a short commit hash. [screenshots of the workflow summary and package logs omitted] Clearly, this is not ideal, since it's very buried in the logs and just a short commit hash. Once we start using the test suite more comprehensively, i.e. outside of just testing all main branches nightly (which will make it much more complicated to know which versions were tested together), we definitely need a better way of showing this. I've added a ticket to the milestone. Feel free to add any details or considerations I've missed!
@MeraX Thanks for pointing this out. What I meant by “across anemoi branches” is more about the end-to-end user workflow: creating a dataset → training → inference, with the various anemoi packages installed from main at each step. That way, we check whether training works when the dataset was created with the current anemoi-datasets (and soon, whether the resulting checkpoint is compatible with current inference). So yes, this is intentional in the sense that it’s closer to the workflow we expect users to follow (installing training from main or PyPI, not mixing versions within the same environment). As you note, we may need to update dependencies, and when things break we probably want to see that, since we don’t do synchronized releases across the pipeline. That said, I agree we should have a way to test whether the pipeline with updated dependencies will work. This setup is not meant to be comprehensive — just a starting point. We’ll likely want to extend what’s tested over time, and it’d be great if others can share/document ideas. If you think the current approach should be changed, could you please open an issue so we can track the discussion there instead of on this closed PR?
Thanks for this clarification, I might have missed that point of view in the discussion. I believe there are myriad ways to set up Anemoi, and it's important to make clear what has been proven to work and what has not.
This PR adds an ecflow suite that can be deployed and monitored by a GitHub workflow. The purpose is to test the anemoi pipeline end-to-end, from dataset creation to inference.
The suite added by this PR consists of
The suite is set up such that new test cases for dataset creation or training can be added as configs (no pyflow knowledge required).
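For illustration, adding a new test case could then be as simple as dropping in a config file along these lines. This is a hypothetical sketch: the actual task_config.yaml schema is defined by the suite and not shown here, so every key below is an assumption.

```yaml
# Hypothetical task_config.yaml sketch -- every key is a placeholder
# assumption, since the real schema is defined by the suite itself.
name: my-new-training-case    # identifier for the test case (hypothetical)
kind: training                # e.g. dataset creation vs. training (hypothetical)
config: configs/training/my-case/config.yaml  # recipe the task runs (hypothetical)
```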
The GitHub workflow
Before merging, we need to
Not part of this PR:
📚 Documentation preview 📚: https://anemoi--38.org.readthedocs.build/en/38/